Amazon reveals private Alexa voice data files (heise.de)
66 points by atemerev on Dec 22, 2018 | 23 comments


Mods: The title should likely be changed back to the original ("Amazon reveals private Alexa voice data files").

The story here isn't that Echo stores recordings (they tell you they do, you can even listen to them) and it wasn't done "via GDPR" since that's a law, not a method of communication or platform.


Yes. Submitters: please follow the site guidelines and "use the original title, unless it is misleading or linkbait; don't editorialize." (https://news.ycombinator.com/newsguidelines.html)

(Submitted title was "Amazon Echo stores your voice recordings – sent other user’s records via GDPR".)


If you squint and tilt your head just enough, it's a communication protocol. Send a data request with the right header, data comes back.


Since it seems like quite a lot of people are surprised that Amazon stores these recordings...

PSA: you can go into the Alexa app and look at your Echo history and even listen to recordings of each interaction.


Google is the same - see your recordings here:

https://myactivity.google.com/myactivity?restrict=vaa

I'm not sure if it's codified in law, but big tech companies are moving towards 'if we store non-anonymized user data, it must be possible for the user to log in and see it themselves'. I think they do that so they can argue that storing and playing back the audio is a feature of the product, rather than something ancillary they happen to do which might not be in the user's interest.


This used to be called "search history", and then "web history", and now "my activity". These controls for user data have been a best practice Google has been following for many years.

https://policies.google.com/technologies/retention


I was surprised they store a real recording. I thought they submitted some pre-processed form which saves bandwidth, partly anonymizes the speech and then hands it off for recognition.


Dupe. From the other day with 482 points, 403 comments: https://news.ycombinator.com/item?id=18727020


Perfect example of how regulations designed to serve one purpose can in fact create just as many problems as they solve.

They meant to give consumers insight into their own data. Instead they created a process for getting previously locked up private data out of the company, which is prone to human error and / or intentional abuse. Nice.


But the problem was still there: poor control over stored recordings. It could have surfaced another way, such as a law enforcement request for a suspect's recordings being answered with someone else's recordings.

So the regulation is not at fault here, "they created a process for getting previously locked up private data out of the company, which is prone to human error and / or intentional abuse" is the real problem.


It shows how lax companies are with our data, and that products aren't designed with privacy from day one, but shoddily bolted on after the fact (if at all). Blaming the GDPR/"regulations" here is completely misguided.


Anybody know how this compares with Google's Home product?

I currently have both and need to make a decision on which route I am going to go.


Google Home stores everything after the hotword trigger until the light goes off (you stop speaking).

Each action has a card that explains what triggered it, what device, and what result was given, and you can listen to it. You can also turn recording storage off entirely, or delete by device or date ranges.

Honestly the only bad part is it's buried several layers down in the options and account activity, where most people don't go looking. If you do care though the privacy options and controls are pretty good.


This is the same with Alexa, with the exact same drawback that it's buried in the settings>History section.

It's actually a really cool feature; this entire fiasco could likely have been avoided if Amazon were to embrace the feature (data export and review) rather than treat it as something only nerds are interested in.

If you have data export by default (like Google Takeout), then you don't need to build internal custom systems and manual processes that are only tested on GDPR requests. You've already built them for the default case. Handling GDPR requests is now user self-serve with a link to documentation explaining how to get their data.


With regard to recording and storage, they are identical. Both store everything you say after the keyword until it stops recording.

To make the choice which route to go, you need to figure out which one has more integrations that are useful to you. Try integrating your other stuff with both and see which works better. Also try asking both the same questions and see which gives a better answer.


HomePod doesn’t store anything.


also good way to discover boyfriend of my wife


If you think she's talking about him at home or he's coming over when you're not home, wouldn't it be easier to go the traditional route of a hidden camera, or hiding an audio recorder in the bedroom?

Relying on her to say "Alexa, play sexy music for Johnny" seems unreliable. But if you really want to use the Echo to spy on your wife, you can just review the voice history yourself.


you should tell the telegraph that story


Go on...


Amazon also has people listen to your voice recordings so they can manually label the outcome for future training data.


Everyone does this; training data needs labels.


Obviously I recognize that training data needs labels as that’s why I said they did it. But I bet you ten bucks the average consumer does not expect humans to hear what they say to Alexa.



