It still produces artifacts more often than not. They're a lot subtler now, but they still show up, whether in texture, proportion, lighting, or perspective. Some are easier to fix with second-pass edits, some aren't. I guess that's why they consider image editing to be the next challenge.
The Gemini models, Eleven V3, and whatever internal audio model Sora 2 uses are roughly neck and neck as their performance converges. Each still has some hard-to-explain flavor to it, though. Especially Sora.
It's the skin textures. It's the slightly better lip-syncing. Maybe it will be different once we normal users get access, but so far the demos with Sam don't make him look waxy.
So far the real progress it has made is getting textures right in close-ups. It still fudges how skin looks the further it pans away from the characters.
This whole autonomous-driving-levels framework kinda muddies the waters. Some would argue this isn't even full L4. But it is a self-driving car in the places where it offers its services.
Google usually does a good job with that too. Which makes it somewhat surprising that their last two announcements (the IMO success and Genie 3) were a bit light on details.
Training "high" points in voice inflection has been the priority, we've seen this in the 4o voice outputs and to some degree the Google NotebookLM podcast outputs. I would assume it's because they're trying to make it "act", but now it's a problem of swinging too hard on one end of the spectrum.