
Guess at root cause:

Some replication process at Google had fallen behind by 6 months (and presumably didn't have monitoring/alerting). Someone noticed, and in trying to fix it, forced that stale replica to take mastership (meaning users now see the 6-month-old data).

Since the two replicas presumably now have conflicting changes, re-merging them is going to require a lot of code to be written to merge the data intelligently, and some users are going to permanently lose data (for example, where they edited an old version of a document and those edits cannot be automatically rebased onto the new version).
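To make the "smart merge" problem concrete, here's a rough sketch of the per-record resolution that kind of cleanup implies. The record shape and the merge_record helper are purely illustrative assumptions, not anything Google actually runs:

    # Hypothetical three-way merge of one record that diverged between replicas:
    # `base` is the last common version, `stale` is the 6-month-old replica,
    # `live` is the replica users were actually writing to.
    def merge_record(base: dict, stale: dict, live: dict) -> tuple[dict, list[str]]:
        merged, conflicts = {}, []
        for key in base.keys() | stale.keys() | live.keys():
            b, s, l = base.get(key), stale.get(key), live.get(key)
            if s == l:                 # both replicas agree
                merged[key] = s
            elif s == b:               # only the live replica changed it
                merged[key] = l
            elif l == b:               # only the stale replica changed it
                merged[key] = s
            else:                      # both changed it differently: a real conflict
                merged[key] = l        # pick one side for now...
                conflicts.append(key)  # ...but flag it for the user to resolve
        return merged, conflicts

Every key that lands in the conflicts list is exactly the case where someone has to lose an edit or make a manual choice.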



Wasn't Google starting to delete old files? My guess would be that something screwed up in that process.


> and some users are going to permanently lose data

Why couldn't you write code to let users compare and choose which version of the data they want to keep?


You could... But writing and deploying a diffing tool for every different gsuite application is probably 6 months of work for a whole team... There are so many corner cases.

What will the users do for 6 months waiting for their data to return?


"let users compare" doesn't require diffing tools. Worst case put the versions next to reach other.


> Worst case put the versions next to each other.

For some files (anything encrypted, for example) this may actually be the best you could do.

Dropbox had this behavior ten years ago, so it's kind of inexcusable for Google today to be just mindlessly overwriting.
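For reference, the Dropbox-style behavior is roughly "never silently overwrite on conflict; keep both copies and rename one so the user can compare." A minimal sketch of that policy, with the function name and suffix format as assumptions rather than Dropbox's actual implementation:

    # Sketch of a "keep both on conflict" sync policy: never silently overwrite;
    # write the incoming copy under a suffixed name so the user can compare.
    from datetime import date
    from pathlib import Path

    def write_or_keep_both(path: Path, incoming: bytes) -> Path:
        if not path.exists() or path.read_bytes() == incoming:
            path.write_bytes(incoming)
            return path
        # Conflict: keep the existing file, save the other copy alongside it.
        suffix = f" (conflicted copy {date.today().isoformat()})"
        conflict_path = path.with_name(f"{path.stem}{suffix}{path.suffix}")
        conflict_path.write_bytes(incoming)
        return conflict_path

Note this needs no knowledge of the file format at all, which is the point being made about encrypted files.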


That's an excellent description of a basic diffing tool.


A tool that cannot parse the files is not a diffing tool.


Ideally Google will resolve it without the customer needing to do the work.


I guess there may be a bevy of different file formats, and a diff for a spreadsheet and an image would each require building tooling, just from those two examples. I was thinking of it as creating an application where the front end is not just a file browser over the stored file types but also the interface to read/edit the content.


You don't even need to go that deep -- just give me a filetype-based icon, file name, and last modified date, and that should be enough for me to choose. And if they want to be simple, include the option to always keep the most recent version of the file (which is going to be the right answer in many cases).
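In other words, show a bit of metadata and default to "newest wins." A toy sketch of that chooser, where FileVersion is a made-up record type just for illustration:

    # Toy version of that chooser: enough metadata to pick by hand, plus a
    # "just keep the newest" default. FileVersion is a made-up record type.
    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class FileVersion:
        name: str
        modified: datetime
        size_bytes: int

    def describe(v: FileVersion) -> str:
        ext = v.name.rsplit(".", 1)[-1] if "." in v.name else "file"
        return f"[{ext}] {v.name}, {v.size_bytes} bytes, modified {v.modified:%Y-%m-%d %H:%M}"

    def keep_most_recent(versions: list[FileVersion]) -> FileVersion:
        return max(versions, key=lambda v: v.modified)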


That seems like one of the more unlikely causes, among the almost infinite variety of causes that are consistent with "several people said they can't find some files".

To me the most likely one by far is that the deletion was actually commanded, for example by a flaw in the program they use to sync, or by malware.


A few users have reported that documents they deleted months ago have reappeared too, and files they moved long ago have moved back to where they used to be.


I am still skeptical that users are correctly describing the behavior of a multi-party synced filesystem, or that Drive for Windows is bug-free. You seem to think it's a flaw on the backend but I think there are many other possible causes.


Or some backend data storage system experienced transient errors and occasionally returned a corrupt pointer to a storage location.

Low-frequency data loss that goes unnoticed until it reaches an alerting threshold of 0.1%.
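That failure mode is plausible precisely because threshold-based alerting is blind below the line. A toy illustration of the check being described, where the 0.1% figure is the assumption from this comment, not anything known about Google's monitoring:

    # Toy illustration of threshold-based alerting: a small per-read error
    # rate stays silent until it crosses the configured threshold.
    ALERT_THRESHOLD = 0.001  # 0.1%, the figure assumed above

    def should_alert(failed_reads: int, total_reads: int) -> bool:
        return total_reads > 0 and failed_reads / total_reads >= ALERT_THRESHOLD

    # 1 corrupt pointer per 10,000 reads is a 0.01% error rate: no alert fires,
    # but users are still losing files.
    assert not should_alert(failed_reads=1, total_reads=10_000)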



