Just a few days ago, scientists at Princeton revealed that Google was collecting user location data even when permission to do so had been denied. Now the saga gets even more worrisome: a team of researchers led by Professor Douglas Schmidt at Vanderbilt University has published a paper that reveals how Chrome and the Android OS are shipping more data to Google servers than you might expect. The paper also sheds light on how Google can use the seemingly anonymous data points it collects and shares to identify the individuals from whose devices that data originated.
Google collects data about a user in two primary ways: through the user’s interaction with Google products (active data collection) and through its advertiser- and publisher-focused tools (passive data collection) such as Google Analytics, DoubleClick, AdSense, AdWords, and AdMob. According to the paper, these tools have a massive user base: over 1 million mobile apps use AdMob, over 1 million advertisers use AdWords, over 15 million websites use AdSense, and over 30 million websites use Google Analytics.
How the Study Was Set Up
To get an accurate picture of just how much active and passive data collection was going on, the researchers set up a brand-new Gmail account on a phone that had been factory reset. Additionally, they used a brand-new SIM card. These precautions ensured that Google had no prior usage data on the user profile (named Jane). A researcher then went about her day as normal, using this brand-new phone. This allowed the team to study the active data being collected.
To quantify the amount of passive data being collected, the team monitored data sent to Google from mobile phones (both Android and iPhone) within a 24-hour period. During this time, the phones were in use, but with minimal interaction with Google apps and services.
In a third study scenario, both the Android smartphone and the iPhone were placed on a table and left unused for a period of two hours.
Once data had been collected from the three scenarios, the researchers concluded that:
- Google was able to build an interest profile for the user Jane based on just a single day’s use. This was achieved through the use of a Google account used to log into the Play Store and Google Chrome.
- The Android smartphone communicated with Google servers roughly 900 times in 24 hours, whereas the Apple iPhone made only about 100 such requests.
- During this time, the Android smartphone shared its location 14 times per hour, while the iPhone shared no location data with Google.
- The Apple iPhone made roughly 120 data connections with Apple servers. Half the time, the connection was for device uploads such as photo and iOS backups.
The Android smartphone communicated location information 450 times within 24 hours. The total number of communication requests was roughly 2,100 in 24 hours (approx. 90/hour). The team points out that “The number of calls to Google’s advertising domains was similar from both devices - an expected outcome since the usage of 3rd-party web pages and apps was similar on both devices. One notable difference was that the location data sent to Google from an iOS device is practically non-existent. In the absence of Android and Chrome platforms—or the use of any other Google product—Google becomes significantly limited in its ability to track the user location.”
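For readers who want to sanity-check the figures, the per-day and per-hour numbers quoted above can be converted with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the inputs are simply the rounded figures quoted in this article, not raw data from the paper.

```python
HOURS_PER_DAY = 24


def per_hour(daily_count):
    """Convert a count reported over 24 hours into an average hourly rate."""
    return daily_count / HOURS_PER_DAY


def per_day(hourly_count):
    """Convert an average hourly rate into a 24-hour total."""
    return hourly_count * HOURS_PER_DAY


# Rounded figures as quoted in the article (counts per 24 hours).
daily_figures = {
    "Android requests to Google": 900,
    "iPhone requests to Google": 100,
    "Android location communications": 450,
    "Total communication requests": 2100,
}

for label, count in daily_figures.items():
    print(f"{label}: {count} per day is about {per_hour(count):.1f} per hour")

# The article also quotes one rate per hour; scale it up to a full day.
print(f"14 location shares per hour is about {per_day(14)} per day")
```

The 2,100 requests per day work out to 87.5 per hour, which matches the "approx. 90/hour" quoted above once rounded.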
The researchers conclude that the Android platform, combined with the use of Chrome (which tracks all your website visits and clicks), allows Google to collect an immense amount of information. While most of the information packets could be considered anonymous by themselves, the paper explains in painstaking detail how Google can easily link all the data packets collected to the user from whose device they originated. The paper is a scary look at Google’s pervasive reach into our lives. In comparison, Apple's iOS restricts this data collection to a considerable degree, especially if users minimise their use of Google's applications on the platform. If you would like, you can read the whole paper at this link.