To exploit temporal redundancy in the video, some of these codecs have a clever technique called motion compensation. The old way is to diff the image of the current frame against the next frame, and record the diff in the output in place of the next frame to save space.
The newer way is to shift the current frame a little bit before doing the diff, hoping that the motion in the video moves the next frame in the same direction as the shifting direction. If the image of the current shifted frame coincides with the next frame, the diff produced should be very small. The output for the next frame then consists of the shifting direction+amount and the diff. The decoder can reverse the process by shifting the current frame by the direction and amount, and applying the diff to get to the next frame.
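That encode/decode round trip can be sketched in a few lines of numpy. This is a minimal illustration assuming a single whole-frame shift (real codecs do this per block) and wrap-around at the edges via `np.roll`, where a real codec would pad or clamp; the function names are just for illustration.

```python
import numpy as np

def shift_frame(frame, dy, dx):
    """Shift a 2D frame by (dy, dx) pixels, wrapping at the edges."""
    return np.roll(np.roll(frame, dy, axis=0), dx, axis=1)

def encode(cur, nxt, dy, dx):
    """Return (motion vector, residual diff) for reconstructing nxt from cur."""
    predicted = shift_frame(cur, dy, dx)
    residual = nxt.astype(np.int16) - predicted.astype(np.int16)
    return (dy, dx), residual

def decode(cur, mv, residual):
    """Reverse the process: shift cur by the motion vector, then add the diff."""
    dy, dx = mv
    predicted = shift_frame(cur, dy, dx).astype(np.int16)
    return (predicted + residual).astype(np.uint8)
```

If the shift matches the motion exactly, the residual is all zeros and compresses to almost nothing; the decoder still recovers the next frame exactly.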
To find the optimal shifted diff, the codec tries shifting by nothing, left, right, up, down, and even diagonally (upper-left, upper-right, etc.), then sees which shift produces the smallest diff and uses that for compression. Encoding is expensive largely because of all this searching for the optimal compression.
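That exhaustive search can be sketched directly: try all nine one-pixel shifts (including "no shift") and keep whichever minimizes the sum of absolute differences (SAD). Again a toy sketch with whole-frame shifts and wrap-around edges, not a real block-based search.

```python
import numpy as np

def shift_frame(frame, dy, dx):
    """Shift a 2D frame by (dy, dx) pixels, wrapping at the edges."""
    return np.roll(np.roll(frame, dy, axis=0), dx, axis=1)

def best_shift(cur, nxt):
    """Try the 9 candidate shifts and return the one with the smallest SAD."""
    candidates = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

    def sad(mv):
        dy, dx = mv
        diff = nxt.astype(np.int16) - shift_frame(cur, dy, dx).astype(np.int16)
        return int(np.abs(diff).sum())

    return min(candidates, key=sad)
```

Real encoders search a much larger window (and use smarter patterns than brute force, e.g. diamond or hexagon searches), which is where most of the encoding time goes.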
Good to know. I didn't know H.261 had it. I know MPEG4 (H.264) has it. MPEG4 came out in the 1990s/2000s, right? It's pretty old. But the idea is the same for all newer codecs; they just have different ways to chop up a frame into blocks and do the shifting at the block level.
It's been there since VCD (MPEG), DVD (MPEG2), and HDDVD/BR (MPEG4 Part 10, or H.264). It's a very old idea. In fact it has also been used to implement a temporal filter to create cleaner content. I know because I worked on all of it.
Modern codecs not only chop the frame into variable size blocks, they also do fractional shifts, in many directions, and predict the shift from past and future and parts of current frame.
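A fractional shift means the predictor samples between pixels, so the encoder has to interpolate. Here's a minimal 1-D half-pel sketch using simple two-tap averaging with wrap-around; real codecs use longer filters (H.264, for instance, uses a 6-tap filter for half-pel positions), and the function name is just for illustration.

```python
def half_pel_shift(row):
    """Shift a 1-D row of samples by half a pixel by averaging
    each sample with its right neighbor (wrapping at the end)."""
    n = len(row)
    return [(row[i] + row[(i + 1) % n]) / 2 for i in range(n)]
```

For smooth gradients this lands almost exactly between the integer-shift predictions, which is why sub-pel motion cuts the residual so much on slow pans.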
From what I've seen, most codecs just brute-force in -1/+1 steps, nothing like doing Lucas-Kanade analysis to get an arbitrary direction. Not sure what the tradeoffs are. Maybe newer codecs will use Lucas-Kanade?
The neat thing about codecs is they're really defining the bitstream and decoding algorithms.
The encoding algorithms are not written in stone, so to speak. If you can optimize motion estimation using Lucas-Kanade, you can just write that into, say, the x264 encoder, and as long as the output is compliant you can expect all existing decoders to play it back just fine.
Decoding of course may not be so flexible for compatibility reasons, and of course software is much more malleable than hardware.
Are there video codecs which are (accidentally) Turing-complete?
An arbitrary direction takes too many bits to encode, compared with a few predicted directions. Remember that you don't have to nail the prediction exactly; you just choose the prediction that gives you the lowest residual error. So you need a way of encoding predictions that's smaller than your coded residual.
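In other words, the encoder minimizes total coded size, vector bits plus residual bits, not residual size alone. A toy sketch of that tradeoff, where the bit costs are made-up stand-ins rather than a real entropy coder:

```python
def vector_bits(mv):
    """Hypothetical vector cost: short codes for common predicted
    directions, a long code for an arbitrary vector."""
    dy, dx = mv
    if (dy, dx) == (0, 0):
        return 2
    if max(abs(dy), abs(dx)) <= 1:
        return 4
    return 12  # arbitrary directions are expensive to signal

def residual_bits(residual):
    """Crude proxy: a few bits per nonzero residual sample."""
    return 3 * sum(1 for v in residual if v != 0)

def best_choice(options):
    """Pick the (mv, residual) candidate with the lowest total bit cost."""
    return min(options, key=lambda o: vector_bits(o[0]) + residual_bits(o[1]))
```

A cheap-to-code vector with a slightly larger residual can beat a perfect but expensive-to-code vector, which is exactly the point above.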