To exploit temporal redundancy in the video, some of these codecs have a clever technique called motion compensation. The old way is to diff the image of the current frame against the next frame, and record the diff in the output in place of the next frame to save space.
The newer way is to shift the current frame a little bit before doing the diff, hoping that the motion in the video moves the next frame in the same direction as the shifting direction. If the image of the current shifted frame coincides with the next frame, the diff produced should be very small. The output for the next frame then consists of the shifting direction+amount and the diff. The decoder can reverse the process by shifting the current frame by the direction and amount, and applying the diff to get to the next frame.
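That encode/decode round trip can be sketched in a few lines of numpy. This is a minimal illustration assuming a single whole-frame shift (real codecs do this per block) and wrap-around at the edges via `np.roll`, where a real codec would pad or clamp; the function names are just for illustration.

```python
import numpy as np

def shift_frame(frame, dy, dx):
    """Shift a 2D frame by (dy, dx) pixels, wrapping at the edges."""
    return np.roll(np.roll(frame, dy, axis=0), dx, axis=1)

def encode(cur, nxt, dy, dx):
    """Return (motion vector, residual diff) for reconstructing nxt from cur."""
    predicted = shift_frame(cur, dy, dx)
    residual = nxt.astype(np.int16) - predicted.astype(np.int16)
    return (dy, dx), residual

def decode(cur, mv, residual):
    """Reverse the process: shift cur by the motion vector, then add the diff."""
    dy, dx = mv
    predicted = shift_frame(cur, dy, dx).astype(np.int16)
    return (predicted + residual).astype(np.uint8)
```

If the shift matches the motion exactly, the residual is all zeros and compresses to almost nothing; the decoder still recovers the next frame exactly.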
To find the optimal shifted diff, the codec tries shifting by nothing, left, right, up, down, and even diagonally (upper-left, upper-right, etc.), then sees which shift produces the smallest diff and uses that for compression. Encoding is expensive largely because of all this searching for the optimal compression.
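That exhaustive search can be sketched directly: try all nine one-pixel shifts (including "no shift") and keep whichever minimizes the sum of absolute differences (SAD). Again a toy sketch with whole-frame shifts and wrap-around edges, not a real block-based search.

```python
import numpy as np

def shift_frame(frame, dy, dx):
    """Shift a 2D frame by (dy, dx) pixels, wrapping at the edges."""
    return np.roll(np.roll(frame, dy, axis=0), dx, axis=1)

def best_shift(cur, nxt):
    """Try the 9 candidate shifts and return the one with the smallest SAD."""
    candidates = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

    def sad(mv):
        dy, dx = mv
        diff = nxt.astype(np.int16) - shift_frame(cur, dy, dx).astype(np.int16)
        return int(np.abs(diff).sum())

    return min(candidates, key=sad)
```

Real encoders search a much larger window (and use smarter patterns than brute force, e.g. diamond or hexagon searches), which is where most of the encoding time goes.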
Good to know. I didn't know H.261 had it. I know MPEG4 (H.264) has it. MPEG4 came out in the 1990s/2000s, right? It's pretty old. But the idea is the same for all newer codecs; they just have different ways to chop up a frame into blocks and do the shifting at the block level.
It's been there since VCD (MPEG), DVD (MPEG2), and HDDVD/BR (MPEG4 Part 10, or H.264). It's a very old idea. In fact it has also been used to implement a temporal filter to create cleaner content. I know because I worked on all of it.
Modern codecs not only chop the frame into variable size blocks, they also do fractional shifts, in many directions, and predict the shift from past and future and parts of current frame.
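A fractional shift means the predictor samples between pixels, so the encoder has to interpolate. Here's a minimal 1-D half-pel sketch using simple two-tap averaging with wrap-around; real codecs use longer filters (H.264, for instance, uses a 6-tap filter for half-pel positions), and the function name is just for illustration.

```python
def half_pel_shift(row):
    """Shift a 1-D row of samples by half a pixel by averaging
    each sample with its right neighbor (wrapping at the end)."""
    n = len(row)
    return [(row[i] + row[(i + 1) % n]) / 2 for i in range(n)]
```

For smooth gradients this lands almost exactly between the integer-shift predictions, which is why sub-pel motion cuts the residual so much on slow pans.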
From what I've seen, most codecs just brute-force in -1/+1 steps, nothing like doing Lucas-Kanade analysis to get an arbitrary direction. Not sure what the tradeoffs are. Maybe newer codecs will use Lucas-Kanade?
The neat thing about codecs is they're really defining the bitstream and decoding algorithms.
The encoding algorithms are not written in stone, so to speak. If you can optimize motion estimation using Lucas-Kanade, you can just write that into, say, the x264 encoder, and as long as the output is compliant you can expect all existing decoders to play it back just fine.
Decoding of course may not be so flexible for compatibility reasons, and of course software is much more malleable than hardware.
Are there video codecs which are (accidentally) Turing-complete?
An arbitrary direction takes too many bits to encode, compared with a few predicted directions. Remember that you don't have to nail the prediction exactly; you just choose the prediction that gives you the lowest residual error. So you need a way of encoding predictions that's smaller than your coded residual.
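In other words, the encoder minimizes total coded size, vector bits plus residual bits, not residual size alone. A toy sketch of that tradeoff, where the bit costs are made-up stand-ins rather than a real entropy coder:

```python
def vector_bits(mv):
    """Hypothetical vector cost: short codes for common predicted
    directions, a long code for an arbitrary vector."""
    dy, dx = mv
    if (dy, dx) == (0, 0):
        return 2
    if max(abs(dy), abs(dx)) <= 1:
        return 4
    return 12  # arbitrary directions are expensive to signal

def residual_bits(residual):
    """Crude proxy: a few bits per nonzero residual sample."""
    return 3 * sum(1 for v in residual if v != 0)

def best_choice(options):
    """Pick the (mv, residual) candidate with the lowest total bit cost."""
    return min(options, key=lambda o: vector_bits(o[0]) + residual_bits(o[1]))
```

A cheap-to-code vector with a slightly larger residual can beat a perfect but expensive-to-code vector, which is exactly the point above.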