What Does Google Have on You?

Facebook is getting the lion’s share of the news right now for its data practices, and rightfully so. However, Google is bigger.

Vast Volumes of Data

We all use Google. It is the most used search engine in the world, and it runs numerous very large data centers around the globe to hold its data. Exactly how much data Google holds is kept secret. However, thanks to Randall Munroe, the very witty and astute NASA scientist turned cartoonist at XKCD, we have a pretty educated guess. Using location data and power data that he collected, he extrapolates that Google has 15 exabytes of data. To give you a sense of scale, here are all of the zeros: 15,000,000,000,000,000,000 bytes. And where did they get this data? Us. We add to it every time we run a Google search, use Google Maps, or use an app that Google owns or whose creators share data with Google. Some of this data is used to direct and customize web searches and your online experience, as well as to target you with advertisements. It is also used to tailor the news and media you take in, which narrows your views and perspectives by limiting your exposure and playing to your biases and interests.

Google Takeout

Google Takeout is a project managed by the Google Data Liberation Front that lets Google account holders view and download the usage data Google has collected on them. You request the archive from the Google Takeout page; Google then compiles your data and offers a list of delivery options (Gmail, Google Drive, etc.). This can take a while, over a day in some cases, depending on how much data you have provided to Google over the years. Once your archive is complete you will receive a set of directories including an index.html file, your bookmarks, calendar, Chrome information, contacts, Drive, etc.: pretty much every Google category you have ever provided any information to. The index.html file provides a summary of the files associated with your Google apps. I was primarily concerned with these three data categories:

  •    Google location history (Maps and Places)
  •    My browsing history
  •    My autofill information

Google Location History

Google has the ability to track and remember every move you have ever made through Google Maps, which is probably the most popular smartphone navigation app in use today. If you enlarge the index image you will see there is no data found under Location History or under Maps (your places) for my Google account. I am not completely surprised, since I am probably one of the few people who use Apple Maps; this just means all my movement is tracked somewhere else. Some people, even tech consultants and journalists, have written about how shocked they were to find Google Maps has been tracking their every move, but this just tells me they weren’t doing their due diligence. When we download these apps we receive a pop-up letting us know the app wants to access our data. I set this to the minimum level required to use the application. Ideally, this gives Google information only when I need information from them. I am sure they are still collecting it in some way, but this is the best I can do. If you have your location settings set to “Always” you can trace everywhere you have been while Google Maps was installed. This can be a cool way to track your travels, or it can freak you out a little bit if you start wondering about other ways it can be used. I personally don’t believe people have an interest in tracking me, but why provide the option?

My Browsing History

In the Chrome directory of my Takeout archive is a BrowserHistory.json file that contains my browsing history. It is a very long text file of pretty much everything I have searched for in the past 18 months. If you are into text analytics, it can be a fun exercise to break this down into your most commonly searched terms to see what Google is mining on you as part of your profile. Google eventually anonymizes this data: the last octet of the IP address is dropped after 9 months, and the cookie data is anonymized after 18 months. However, this data still helps Google provide you focused information, and all of the data is retained indefinitely. Given this, it probably is not a good idea to do a search on “annoying neighbor anarchist cookbook”.
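For anyone who wants to try the text analytics exercise, here is a minimal Python sketch. It assumes the file holds a top-level "Browser History" list whose entries each carry a "url" field, and that Google searches appear as /search URLs with a q= parameter. That layout may differ in your export, so check your own file first. The function names and the stopword list are my own, purely for illustration.

```python
import json
import re
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Small, arbitrary stopword list; extend it to taste.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "for", "and", "on", "is", "how"}


def load_entries(path):
    """Read the history entries from a Takeout BrowserHistory.json file.

    Assumes a top-level "Browser History" list (an assumption about the
    export layout); returns an empty list if that key is missing.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return data.get("Browser History", [])


def search_terms(entries):
    """Yield lowercase words from the q= parameter of Google search URLs."""
    for entry in entries:
        parsed = urlparse(entry.get("url", ""))
        # Skip anything that is not a Google search results page.
        if "google." not in parsed.netloc or parsed.path != "/search":
            continue
        for query in parse_qs(parsed.query).get("q", []):
            for word in re.findall(r"[a-z0-9']+", query.lower()):
                if word not in STOPWORDS:
                    yield word


def top_terms(entries, n=20):
    """Return the n most common search words as (word, count) pairs."""
    return Counter(search_terms(entries)).most_common(n)
```

To use it, call top_terms(load_entries("Takeout/Chrome/BrowserHistory.json")) with the path adjusted to wherever your archive was unpacked, and print the pairs it returns.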

My Autofill Data

Also in the Chrome directory is Autofill.json. This contains all of the information Google is allowed to use to autocomplete website logins and forms. This is a very convenient feature that can be fraught with hazards if one is not careful. People often let this feature store passwords and account information, but sensitive information like this is best left to fully encrypted options such as a quality password manager. Thankfully, my Autofill file only contained basic name, address, and email information, all of which is very easily obtained publicly.
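If you would rather not read the whole file by eye, a rough check like the following can flag entries worth a closer look. It is only a sketch: it makes no assumption about how Autofill.json is laid out and simply walks whatever JSON it finds, reporting non-empty fields whose key names look sensitive. The keyword pattern and the function name are my own guesses, not anything Google documents, and a clean result does not prove the file is harmless.

```python
import json
import re

# Key names that hint at sensitive content; a guess, adjust as needed.
SENSITIVE_KEY = re.compile(
    r"pass(word)?|card|cvv|cvc|ssn|account|iban|routing", re.IGNORECASE
)


def find_sensitive(node, path=""):
    """Return dotted paths of non-empty fields whose key looks sensitive."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if SENSITIVE_KEY.search(key) and value not in ("", None, [], {}):
                hits.append(child)
            # Keep descending so nested structures are checked too.
            hits.extend(find_sensitive(value, child))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            hits.extend(find_sensitive(item, f"{path}[{i}]"))
    return hits


def scan_file(path):
    """Load a JSON file and list its suspicious field paths."""
    with open(path, encoding="utf-8") as f:
        return find_sensitive(json.load(f))
```

Running scan_file on your own Autofill.json prints nothing by itself; it returns a list of paths you can then inspect by hand. It only looks at key names, so a password typed into a field labelled "comment" would slip through.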

Final Thoughts

It is possible that Google has more information than you imagined. For better or worse, this is necessary for us to benefit from the services we want from Google, and we do benefit from using their tools and options. However, it is important to remember that nothing is free: our relationship with Google is completely transactional and weighted in their favor. Still, many of us find that, within reason, the risks and vulnerabilities are worth it.

Google does not frustrate me in the same way that Facebook does. They never tried to hide behind a dubious altruistic reason to collect our data, and in recent years they have focused on transparency. The My Activity page is relatively easy to navigate and makes it easy to review and delete items from your history. They also make it pretty straightforward to delete other information from your account. But remember, the data never completely disappear; they are just anonymized and separated from your account information. If in doubt, opt out and turn the services off unless you need them. Google is convinced they are providing adequate transparency and that data retention is fundamental to delivering value, and security, to their users. As we will see in another post, Google is one of the nice guys and is not the biggest of the data hoarders. Scary.

It is important to remember that all of these digital devices and platforms are data loggers; they are designed to facilitate the flow of information, and in order to do that they have to record and learn things about us.

Comment and let me know your thoughts. Follow me on Twitter if you want to learn more about data literacy and health.

Remember – Nothing is free. We are the product. Be vigilant and data literate.