There are billions of smartphones around the world and many are close to their owners at all times. For this reason, they have been recruited to help with contact tracing for the COVID-19 pandemic as they are able to record every proximate encounter you have with a friend, acquaintance, or a stranger.
Your phone can detect other nearby phones using wireless technologies like Bluetooth, which means it can be used to work out which pockets and purses holding those phones were probably close to one another at some time. Bluetooth is a short-range radio standard already widely used for things like headphones and video game controllers. To use Bluetooth for contact tracing, every capable phone constantly advertises random codes to whichever phones happen to be nearby. These codes change routinely (every 15 minutes, for example) so that you can’t consistently track someone as they live their life by seeing the same code all over town. Essentially, this randomization makes the codes unlinkable, except when they need to be linked to warn others about possible risk. Who gets to link them is a key design dimension.
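As a rough illustration of the rotation idea, the sketch below hands out a fresh random code for each time window. The class name `RotatingBeacon`, the 16-byte code length, and the use of Unix timestamps are assumptions made for this example; only the 15-minute period comes from the text above.

```python
import secrets

ROTATION_SECONDS = 15 * 60  # rotation period, taken from the "every 15 minutes" example
CODE_BYTES = 16             # assumed code length, chosen for illustration


class RotatingBeacon:
    """Hypothetical helper: returns one random code per rotation interval."""

    def __init__(self):
        self._interval = None
        self._code = None

    def code_at(self, unix_time: int) -> bytes:
        interval = unix_time // ROTATION_SECONDS
        if interval != self._interval:
            # New interval: draw a fresh code that has no relation to earlier ones.
            self._interval = interval
            self._code = secrets.token_bytes(CODE_BYTES)
        return self._code


beacon = RotatingBeacon()
first = beacon.code_at(0)            # start of the first interval
same_interval = beacon.code_at(899)  # still inside the first 15 minutes
next_interval = beacon.code_at(900)  # first second of the next interval
```

An observer who sees `first` in one place and `next_interval` in another has nothing in the bytes themselves that ties the two sightings to the same phone.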
Centralized and Decentralized
There are two main ways this type of contact tracing has been implemented, often called centralized and decentralized. The Google-Apple Exposure Notifications (GAEN) system is decentralized, but various jurisdictions have deployed centralized schemes. In a centralized system, a central authority assigns and distributes the random codes. This means that only the central authority can link a code to an actual person; to everyone else the codes are just random. If someone contracts COVID-19, they send the central authority every code their phone has heard. The central authority can then figure out which phones (i.e., people) were assigned those codes, and possibly warn them about an exposure.
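A minimal sketch of the centralized flow might look like the following. The `CentralAuthority` class, its method names, and the user labels are hypothetical and do not describe any particular jurisdiction's system; the point is only that the code-to-person table lives in one place.

```python
import secrets


class CentralAuthority:
    """Hypothetical authority that issues codes and remembers who got each one."""

    def __init__(self):
        self._owner_of = {}  # code -> user id; only the authority holds this mapping

    def issue_code(self, user_id: str) -> bytes:
        code = secrets.token_bytes(16)
        self._owner_of[code] = user_id
        return code

    def report_infection(self, heard_codes):
        """Map the codes a sick user heard back to the users they were issued to."""
        return {self._owner_of[c] for c in heard_codes if c in self._owner_of}


authority = CentralAuthority()
alice_code = authority.issue_code("alice")
bob_code = authority.issue_code("bob")
carol_code = authority.issue_code("carol")

# Alice falls ill and uploads what her phone heard: Bob's code plus an unknown one.
at_risk = authority.report_infection([bob_code, b"not-an-issued-code"])
```

Here `at_risk` contains only Bob, and the authority now knows that Alice and Bob were near each other, whether or not it decides to warn him.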
The privacy issues are clear: the central authority learns everyone the sick person was near, even if those others never become ill. It is also the authority’s decision whether to warn anyone, and they would presumably strike a balance for false positives and negatives as too many false alarms will undermine public confidence and discourage voluntary adoption.
A decentralized system, proposed by the DP-3T team and later adopted in smartphones as GAEN, claims better privacy for users by removing the central authority. In the GAEN model, every phone records codes emitted by other phones over Bluetooth, but each code is generated using a pseudorandom function based on a secret key. So, if you know someone’s key, you can determine all the codes that they would have broadcast, and when they would have done so. If a user contracts COVID-19, they publish the relevant keys through a health authority. All other users can then recreate the codes that they would have heard from that user, had they been near this person at the time, and compare them to the list of codes they actually received to see if they are at risk.
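The key-to-codes relationship can be sketched as below. HMAC-SHA256 is used here only as a stand-in pseudorandom function and is not claimed to be the construction GAEN or DP-3T specify; the function names, the 16-byte truncation, and the 96 intervals per day (15-minute windows) are likewise assumptions for illustration.

```python
import hashlib
import hmac
import secrets

CODE_BYTES = 16  # assumed code length


def derive_code(key: bytes, interval: int) -> bytes:
    """Stand-in PRF: the code for a given interval is a keyed hash of its index."""
    msg = interval.to_bytes(8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:CODE_BYTES]


def codes_for_key(key: bytes, intervals) -> set:
    """Everything a phone holding `key` would have broadcast over `intervals`."""
    return {derive_code(key, i) for i in intervals}


def is_at_risk(heard_codes, published_keys, intervals) -> bool:
    """Local check: did we hear any code derivable from a published key?"""
    heard = set(heard_codes)
    return any(heard & codes_for_key(k, intervals) for k in published_keys)


intervals = range(96)  # one day of 15-minute intervals (assumed)
alice_key = secrets.token_bytes(32)
carol_key = secrets.token_bytes(32)

# Bob's phone was near Alice's during intervals 40 and 41.
bob_heard = [derive_code(alice_key, 40), derive_code(alice_key, 41)]

exposed_to_alice = is_at_risk(bob_heard, [alice_key], intervals)
exposed_to_carol = is_at_risk(bob_heard, [carol_key], intervals)
```

The comparison happens entirely on Bob's phone, which is the privacy argument for this design. The flip side is that anyone holding a published key can run `codes_for_key` too.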
Attacks on Proximity Tracing
A number of attacks can be mounted against both types of contact tracing systems. An adversary who learns some random codes can mount a replay (or relay) attack, where they rebroadcast the codes somewhere else. A foreign adversary who wanted to cause a fake outbreak, for example, could collect codes at one site, like a COVID-19 testing facility, and rebroadcast them elsewhere, like at a vital industry in the victim country. In a centralized system, this can still happen, though the central authority has more meta-information with which to detect such attacks and avoid issuing false positives.
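To see why a replayed code leads to a false alarm, consider this toy model of the matching step. The function `exposure_match` and the variable names are hypothetical, and published codes are treated as a plain list for simplicity (in a decentralized scheme they would be derived from published keys).

```python
import secrets


def exposure_match(heard_codes, published_codes) -> set:
    """Matching compares code values only; nothing records where a code was first sent."""
    return set(heard_codes) & set(published_codes)


# A phone at a testing site broadcasts a code, and an eavesdropper records it.
patient_code = secrets.token_bytes(16)
recorded_by_adversary = [patient_code]

# The same bytes are later rebroadcast elsewhere and logged by a phone
# whose owner was never near the patient.
victim_heard = [secrets.token_bytes(16)] + recorded_by_adversary

# The patient tests positive and their codes become matchable.
false_alarm = exposure_match(victim_heard, [patient_code])
```

The victim's phone reports a match because the bytes it logged are identical to the ones the patient sent; the model has no field that could tell an original broadcast from a copy.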
An adversary who learns random codes can also note down where these codes were broadcast. By recording codes and the locations in which they were heard, an adversary only needs to wait for someone to publish their key in order to build a location history for them, which would go a long way toward de-anonymizing them. In a centralized system, users lose their privacy to a government authority; in a decentralized one, users risk losing their privacy to anyone capable of collecting this data.
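The linking step can be illustrated with a short sketch. As before, HMAC-SHA256 is only a stand-in pseudorandom function, and `location_history`, the place labels, and the interval numbering are assumptions made for this example rather than details of any deployed system.

```python
import hashlib
import hmac
import secrets


def derive_code(key: bytes, interval: int) -> bytes:
    """Stand-in PRF (assumed): code for an interval is a truncated keyed hash."""
    return hmac.new(key, interval.to_bytes(8, "big"), hashlib.sha256).digest()[:16]


def location_history(observations, published_key, intervals):
    """Link logged (code, place) sightings to one published key.

    Returns (interval, place) pairs sorted by interval.
    """
    known = {derive_code(published_key, i): i for i in intervals}
    return sorted((known[code], place) for code, place in observations if code in known)


patient_key = secrets.token_bytes(32)
bystander_key = secrets.token_bytes(32)

# Sightings logged by listening devices before any key was published.
observations = [
    (derive_code(patient_key, 30), "pharmacy"),
    (derive_code(bystander_key, 10), "cafe"),
    (derive_code(patient_key, 10), "cafe"),
]

# Once the patient's key is published, their sightings become linkable.
history = location_history(observations, patient_key, range(96))
```

Until the key is published, the three sightings are unrelated random bytes. Afterwards, two of them collapse into a time-ordered trail for one person, while the bystander's sighting stays unlinkable.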
The advent of this technology has also paved the way for a new attack: biosurveillance. In the same way that GPS works by having ambient GPS-related data in the air, phones constantly transmitting codes over Bluetooth create an ambient background of health-related data. If an attacker follows another person, they would hear the same things that the person would hear, and so would be able to do a risk calculation for that person even if the surveilled person did not use contact tracing apps and had believed they opted out of the entire affair.
The defenses to these attacks are primarily economic and legal. How would an adversary actually manage to collect all this information? Arsenals of Bluetooth antennas are conspicuous and expensive. High-gain and high-energy antennas may be detected. The attacks can be prosecuted. The costs are enormous with little benefit.
Our discovery, however, is that there is a perfect weapon to mount this attack that is already widely deployed: the smartphone itself. Smartphones can record both Bluetooth beacons and GPS locations, and then upload that data to the attacker’s servers to do de-anonymization and biosurveillance attacks. And, to carry out a replay attack, other phones can be told to rebroadcast previously-collected beacons in other locations. Or the attacker can just sell this information to those wanting to do these attacks. Since there are billions of these devices already deployed and active at all times, the attacker needs only the means to access them. Thanks to the internet and the mobile app ecosystem, they may be able to do so easily, and from anywhere on the planet.
Attack by Surveillance Capitalism
Establishing the necessary infrastructure to carry out an attack like this via mobile phones is still non-trivial, but almost certainly costs less than installing huge antennas everywhere. Yet there is another way to ensure that it is cost-effective, even profitable: surveillance capitalism. The attacker doesn’t need to spend time and money hacking millions of phones to collect Bluetooth data. They can just pay app developers to include code that collects it, or they can provide developers with libraries that offer convenient app features but also collect Bluetooth data as much and as often as possible. This would give the attacker access to mobile phones all over the planet to act as “ears on the ground” on their behalf, listening for the data needed to do these attacks.
One of the many things we do at AppCensus is to keep a finger on the pulse of the myriad of ads and analytics libraries that are surveilling users of mobile devices. By routinely testing as many apps as possible, we can look for unusual behaviours across the ecosystem. While looking for behaviors related to contact tracing, an SDK by X-Mode stood out: it collects not only your location and the names and serial numbers of all the WiFi routers near you, but it also collects similar data from Bluetooth devices and Bluetooth low energy beacons.
As it turns out, X-Mode’s SDK was a bit of a needle in a haystack of apps: we only found it in a dozen or so apps from our entire dataset. Thanks to a volunteer at PersonalData.IO, we were told about another batch of apps, and we list all the ones we found sending Bluetooth scans at the end of this post. Just because it was hard to find, however, doesn’t mean it isn’t prevalent. One of the apps, Video MP3 Converter, had more than 100 million installations as of November 2020, and several others had more than a million, like Just a Compass (Free & No Ads). None of the apps observed had any obvious need for both location and administrative control over the Bluetooth stack, which is what Android requires to scan for Bluetooth signals, and something over which users have limited visibility and control aside from disabling all access to Bluetooth.
What this X-Mode example actually shows is that the cost to an adversary of implementing these attacks has been wildly overestimated. It is a glimpse at the pervasiveness of an existing surveillance infrastructure, operating as a for-profit enterprise, that could be leveraged by an attacker in the right circumstances. If an organization like X-Mode were so inclined, also collecting the random codes from the Bluetooth beacons would be trivial, and sufficient to carry out de-anonymization and biosurveillance attacks at a very large scale, whether they or a third party ultimately use the data to carry out the attack. By contrast, executing a replay attack (i.e., a “false positive” attack) would require a fake broadcast of these codes and a more conspicuous code change in X-Mode’s SDK, but still wouldn’t pose any real technical challenge.
It is only a few lines of code that determine how this information is weaponized. As more legal frameworks and platform policies are realized, it is clear that app developers have a responsibility to account for the behaviors of their apps and the constituent components, including the data they collect and share with third parties. It is also increasingly evident that the ecosystem of third party developers that provide essential features and services (so that app developers don’t need to build their own maps service, for example) needs far greater scrutiny. For that reason, we here at AppCensus will continue to monitor the behaviour of as many apps and SDKs as we can, to shine a brighter light on any hints of such attacks or related behaviour.
Appendix: X-Mode Apps
| App | Installations |
|---|---|
| Video MP3 Converter | 100M+ |
| Fleet Battle – Sea Battle | 10M+ |
| myTuner Radio and Podcasts | 10M+ |
| SPEEDCHECK Internet Speed Test | 10M+ |
| Offline GPS Navigation, Traffic & Maps by Karta | 5M+ |
| Just a Compass (Free & No Ads) | 1M+ |
| What The Forecast?!! | 1M+ |
| Radio Italia: Online Radio Streaming | 500K+ |
| Radio Korea – FM Radio and Podcasts | 500K+ |
| Radio Polska – Radio FM | 500K+ |
| Portable ORG Keyboard | 500K+ |
| Amsterdam Travel Guide | 100K+ |
| Berlin Travel Guide | 100K+ |
| London Travel Guide | 100K+ |
| Radio Australia: Online Radio & FM Radio App | 100K+ |
| Radio Belgium: FM Radio and Internet Radio | 100K+ |
| Radio Canada – Internet Radio App | 100K+ |
| Radio Singapore: FM Radio + Radio Online Singapore | 100K+ |
| The Sun Ephemeris | 50K+ |
| Beijing Travel Guide | 10K+ |