> What will replace Internet data for training? Curated synthetic datasets?
My take is that Meta, Google, etc. have had such a surplus of real data relative to everyone else that it has reduced the amount of research into using synthetic data.
For example, when I trained object detectors (quite out of date now), I used Blender 3D models, scripts to adjust parameters, and existing ML models to infer camera calibration and overlay orientation. This works amazingly well for subsequently identifying the real analogue of the object, and I know of people training vehicle detectors in similar ways using game engines.
There were several surprising tactical details to all this that push the accuracy up dramatically and that you don't see widely discussed, like ensuring that things which are not relevant, such as the surface texture of the 3D models, are properly randomized in the training set. (E.g. putting random fractal patterns on the object during training makes the object detector more robust to disturbances in reality.)
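To make the texture randomization concrete, here is a rough sketch of generating a random fractal-ish pattern per training sample. This is not the exact code I used; the function name `fractal_texture`, its parameters, and the octave-summing approach are all just illustrative assumptions, and the Blender step of applying the result to the model is left out entirely.

```python
import random


def fractal_texture(size=64, octaves=4, persistence=0.5, seed=None):
    """Return a size x size grid of grey levels in [0, 1].

    Hypothetical helper: sums several octaves of blocky random noise,
    each finer and weaker than the last, to get a fractal-like pattern.
    Assumes size is a power of two and size >= 2 ** octaves.
    """
    rng = random.Random(seed)
    texture = [[0.0] * size for _ in range(size)]
    amplitude = 1.0
    for octave in range(1, octaves + 1):
        cells = 2 ** octave        # coarse grid gets finer each octave
        block = size // cells      # pixels covered by one coarse cell
        coarse = [[rng.random() for _ in range(cells)] for _ in range(cells)]
        for y in range(size):
            for x in range(size):
                texture[y][x] += amplitude * coarse[y // block][x // block]
        amplitude *= persistence   # finer detail contributes less
    # Rescale to the full [0, 1] range.
    lo = min(map(min, texture))
    hi = max(map(max, texture))
    span = (hi - lo) or 1.0
    return [[(v - lo) / span for v in row] for row in texture]


# One fresh texture per rendered training image, so the detector
# cannot learn to rely on surface appearance.
textures = [fractal_texture(seed=i) for i in range(3)]
```

The point is only that the texture changes on every sample while the object's shape stays fixed, so the shape is the only consistent signal left to learn.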