Once every object in the frame is mapped to 3D coordinates, exaggerating the parallax is trivial (like rendering a video game for 3D glasses / VR). But to your point, figuring out what was behind those objects, to fill in the negative space left after translation, is a guessing game, much like the various smart fill / magic eraser features. Originally they used only context from the same photo; nowadays they also use generative AI. The results still aren't always great.
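To make the reproject-then-fill idea concrete, here's a minimal sketch (my own, not any particular app's pipeline): it assumes you already have an estimated depth map, shifts pixels horizontally in proportion to how near they are, and then uses OpenCV's classic inpainting as a crude stand-in for the smarter fill / generative approaches mentioned above.

```python
import numpy as np
import cv2  # opencv-python


def shift_view(image, depth, max_shift=20):
    """Re-render `image` from a slightly shifted viewpoint.

    image: HxWx3 uint8 frame.
    depth: HxW float array, larger = farther away (in practice this comes
           from a monocular depth-estimation model; assumed given here).
    max_shift: parallax in pixels for the nearest objects.
    """
    h, w = depth.shape

    # Nearer pixels move more: disparity is inversely related to depth.
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-6)
    disparity = ((1.0 - d) * max_shift).astype(np.int32)

    # Target x-coordinate for every pixel after the horizontal shift.
    xs = np.tile(np.arange(w), (h, 1))
    new_x = np.clip(xs + disparity, 0, w - 1)

    shifted = np.zeros_like(image)
    covered = np.zeros((h, w), dtype=np.uint8)

    # Paint far pixels first so nearer ones overwrite them (crude z-ordering).
    order = np.argsort(-d, axis=None)
    yo, xo = np.unravel_index(order, (h, w))
    shifted[yo, new_x[yo, xo]] = image[yo, xo]
    covered[yo, new_x[yo, xo]] = 255

    # Whatever is still uncovered is the "negative space" behind foreground
    # objects; fill it with classic inpainting (the "guessing game" part).
    holes = cv2.bitwise_not(covered)
    return cv2.inpaint(shifted, holes, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
```

The interesting part is really just the last two lines: everything visible gets moved deterministically by the depth map, and then some hallucination step, whether Telea inpainting, content-aware fill, or a generative model, has to invent the pixels that were never captured.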