It seems highly unlikely that they did not use indices; scanning every document for each query would be prohibitively slow. I think it is more likely that the indices were so large that it took hundreds to thousands of machines to hold them in RAM. A parallel scan through those indices, with each machine handling its own portion, seems likely.
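To make the idea concrete, here is a minimal sketch of what I mean by a parallel scan over a sharded in-memory index. It is only an illustration, not a description of Google's actual system: the function names (build_shards, search_shard, search), the document-ID modulo partitioning, and the use of threads in place of separate machines are all my own assumptions.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_shards(docs, num_shards):
    """Partition documents by ID and build one in-memory inverted index per shard.

    Hypothetical helper: each shard stands in for one machine holding
    its slice of the index in RAM.
    """
    shards = [defaultdict(set) for _ in range(num_shards)]
    for doc_id, text in docs.items():
        index = shards[doc_id % num_shards]  # assumed document-partitioned sharding
        for term in text.lower().split():
            index[term].add(doc_id)
    return shards


def search_shard(index, terms):
    """Return the doc IDs in one shard that contain every query term."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search(shards, query):
    """Fan the query out to all shards in parallel and merge the partial results."""
    terms = query.lower().split()
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda index: search_shard(index, terms), shards))
    return sorted(set().union(*partials))


if __name__ == "__main__":
    docs = {
        0: "the quick brown fox",
        1: "the lazy dog",
        2: "quick brown dog",
        3: "brown fox jumps",
    }
    shards = build_shards(docs, num_shards=2)
    print(search(shards, "brown fox"))
```

The point is that no shard ever scans raw documents: each one only intersects its own posting lists, so adding machines shrinks the per-machine index rather than the per-query work on documents.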

The Wikipedia article on Google data centers [1] links to Jeff Dean's keynote at WSDM 2009 [2], which strongly suggests that indices were used.

Then again, I am no expert in this field, so if you can share more details, I'd love to hear them.

[1] https://en.wikipedia.org/wiki/Google_data_centers

[2] https://static.googleusercontent.com/media/research.google.c...