Proximity Tracing in an Ecosystem of Surveillance Capitalism

Published by Joel Reardon


Co-written with Paul-Olivier Dehaye and Bobby Richter

There are billions of smartphones around the world, and many are close to their owners at all times. For this reason, they have been recruited to help with contact tracing for the COVID-19 pandemic, as they are able to record every proximate encounter you have with a friend, an acquaintance, or a stranger.

Your phone can detect other nearby phones using wireless technologies like Bluetooth, which means it can be used to figure out which pockets and purses holding those phones were probably close to one another at some time. Bluetooth is a short-range radio standard already widely used for things like headphones and video game controllers. To use Bluetooth for contact tracing, every capable phone constantly advertises random codes (to those nearby, hopefully). These codes change routinely (every 15 minutes, for example) so that you can’t consistently track someone as they live their life by seeing the same code all over town. Essentially, this randomization makes the codes unlinkable, unless they need to be linked to warn others about possible risk. Who gets to link them is a key dimension.
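To make the rotation idea concrete, here is a minimal sketch in Python. It is not taken from any real contact tracing app: the class name, the 16-byte code length, and the exact 15-minute window are assumptions chosen for illustration.

```python
import secrets

ROTATION_SECONDS = 15 * 60  # assumed rotation period, from the "every 15 minutes" example
CODE_BYTES = 16             # assumed code length, chosen only for illustration


class RotatingBeacon:
    """Hypothetical helper that hands out one random code per time window."""

    def __init__(self):
        self._window = None
        self._code = None

    def code_at(self, unix_time):
        window = unix_time // ROTATION_SECONDS
        if window != self._window:
            # New window: discard the old code and draw an unrelated one,
            # so an observer cannot link the two broadcasts.
            self._window = window
            self._code = secrets.token_bytes(CODE_BYTES)
        return self._code


beacon = RotatingBeacon()
first = beacon.code_at(0)
same_window = beacon.code_at(14 * 60)  # still inside the first 15 minutes
next_window = beacon.code_at(15 * 60)  # the next window has started
```

Within one window an observer sees the same code, but nothing in the code itself ties one window to the next.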

Centralized and Decentralized

There are two main ways this type of contact tracing has been implemented, often called centralized and decentralized. The Google-Apple Exposure Notifications (GAEN) system is decentralized, but various jurisdictions have centralized schemes. In a centralized system, a central authority assigns and distributes the random codes. This means that only the central authority can link a code to an actual person; to everyone else the codes are just random. If someone contracts COVID-19, they inform the central authority of every code they ever heard. Then the central authority can figure out which phones (i.e., people) created those codes, and possibly warn them about an exposure.

The central authority gives every user random codes and the times to broadcast them. It knows who owns each code.
A user with COVID-19 informs the central authority of every code they heard. The authority can translate this into the actual people they encountered and possibly warn them of risk.
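The centralized flow described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the design of any deployed system: the CentralAuthority class, its method names, and the user names are all hypothetical.

```python
import secrets


class CentralAuthority:
    """Toy model of the central authority; every name here is hypothetical."""

    def __init__(self):
        # code -> user id. Only the authority holds this table, which is
        # why only the authority can link a code to a person.
        self._owner_of = {}

    def issue_codes(self, user_id, count):
        codes = [secrets.token_bytes(16) for _ in range(count)]
        for code in codes:
            self._owner_of[code] = user_id
        return codes

    def report_heard(self, heard_codes):
        """Translate the codes a sick user heard into the people who sent them."""
        return {self._owner_of[c] for c in heard_codes if c in self._owner_of}


authority = CentralAuthority()
alice_codes = authority.issue_codes("alice", 4)
bob_codes = authority.issue_codes("bob", 4)

# A sick user reports one of Alice's codes plus a code the authority never issued.
contacts = authority.report_heard([alice_codes[2], b"\x00" * 16])
```

Here contacts contains only "alice". The lookup table is the whole privacy problem in miniature: whoever holds it learns who was near the sick person.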

The privacy issues are clear: the central authority learns everyone the sick person was near, even if those others never become ill. It is also the authority’s decision whether to warn anyone, and it would presumably strike a balance between false positives and false negatives, since too many false alarms will undermine public confidence and discourage voluntary adoption.

A decentralized system, proposed by the DP-3T team and later adopted in smartphones as GAEN, claims to have better privacy for users by removing the central authority. In the GAEN model, every phone records codes emitted by other phones over Bluetooth, but each code is generated using a pseudorandom function based on a secret key. So, if you know someone’s key, you can determine all the codes that they would have broadcast, and when they did so. If a user contracts COVID-19, they publish the relevant keys through a health authority. All other users can then recreate the codes that they would have heard from that user (had they been near this person at the time) and compare them to the list of codes they actually received to see if they are at risk.

In a decentralized version, each user creates their own secret key that generates random-looking codes. Without the key the codes are unlinkable, but with the key they can be linked.
Users with COVID-19 publish their keys through a health authority. Each user downloads those keys and can do an offline calculation of their own risk without revealing encounters.
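As a rough illustration of the decentralized idea, the sketch below derives codes from a secret key and performs the offline match. It uses HMAC-SHA256 as a stand-in pseudorandom function; this is an assumption made for the example and is not the actual GAEN derivation. The function names and interval numbering are likewise hypothetical.

```python
import hashlib
import hmac
import secrets

CODE_BYTES = 16  # assumed code length, for illustration only


def code_for(key, interval):
    """Derive the code for one time interval from a secret key (toy PRF)."""
    msg = interval.to_bytes(8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:CODE_BYTES]


def exposed_intervals(published_key, intervals, heard_codes):
    """Recreate the key holder's codes and return the intervals we actually heard."""
    heard = set(heard_codes)
    return [i for i in intervals if code_for(published_key, i) in heard]


alice_key = secrets.token_bytes(32)
stranger_key = secrets.token_bytes(32)

# Bob's phone logged Alice's codes in intervals 3 and 4, and a stranger's in interval 7.
bob_heard = [
    code_for(alice_key, 3),
    code_for(alice_key, 4),
    code_for(stranger_key, 7),
]

# Alice tests positive and publishes her key; Bob checks locally, revealing nothing.
matches = exposed_intervals(alice_key, range(10), bob_heard)
```

The match happens entirely on Bob's side. The flip side is that anyone who has the published key and a log of heard codes can run the same calculation.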

Attacks on Proximity Tracing

There are a number of attacks that can be mounted against both types of contact tracing systems. An adversary who learns some random codes can mount a replay (or relay) attack, where they rebroadcast the codes somewhere else. A foreign adversary who wanted to cause a fake outbreak, for example, could collect codes at one site, like a COVID-19 testing facility, and rebroadcast them elsewhere, like at a vital industry in the victim country. In a centralized system, this can still happen, though the central authority has more meta-information with which to detect such attacks and avoid issuing false positives.

An adversary who learns random codes can also note down where these codes were broadcast. By recording codes and the locations in which they originated, an adversary only needs to wait for someone to publish their key in order to build a location history for them, which would greatly de-anonymize them. In a centralized system, users lose their privacy to a government authority; in a decentralized one, users risk losing their privacy to anyone capable of collecting this data.
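The linkage step can be shown with toy data. The sketch below is a simulation under stated assumptions, not a description of any real system: it uses the same stand-in HMAC-based derivation as a hypothetical decentralized scheme (not the actual GAEN derivation), and the keys, intervals, and place names are invented.

```python
import hashlib
import hmac


def code_for(key, interval):
    """Toy code derivation (an assumption for this example, not GAEN's)."""
    msg = interval.to_bytes(8, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]


def location_history(published_key, intervals, sightings):
    """Link recorded sightings to one published key.

    sightings is a list of (code, interval, place) tuples that an
    observer logged before any key was published.
    """
    expected = {code_for(published_key, i) for i in intervals}
    return sorted(
        (interval, place)
        for code, interval, place in sightings
        if code in expected
    )


victim_key = b"v" * 32  # made-up keys for the simulation
other_key = b"o" * 32

sightings = [
    (code_for(victim_key, 5), 5, "clinic"),
    (code_for(other_key, 5), 5, "clinic"),
    (code_for(victim_key, 2), 2, "cafe"),
]

history = location_history(victim_key, range(10), sightings)
```

Before the key is published, the three sightings are unrelated random values. Afterwards, two of them collapse into a single person's movements.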

The advent of this technology has also paved the way for a new attack: biosurveillance. In the same way that GPS works by having ambient GPS-related data in the air, phones constantly transmitting codes over Bluetooth create an ambient background of health-related data. If an attacker follows another person, they hear the same codes that the person would hear, and so are able to do a risk calculation for that person even if the surveilled person did not use contact tracing apps and believed they had opted out of the entire affair.

GPS satellites make location information an unavoidable part of the environment.
Contact tracing apps similarly make health data part of the environment.

The defenses to these attacks are primarily economic and legal. How would an adversary actually manage to collect all this information? Arsenals of Bluetooth antennas are conspicuous and expensive. High-gain and high-energy antennas may be detected. The attacks can be prosecuted. The costs are enormous with little benefit.

Our discovery, however, is that there is a perfect weapon to mount this attack that is already widely deployed: the smartphone itself. Smartphones can record both Bluetooth beacons and GPS locations, and then upload that data to the attacker’s servers to do de-anonymization and biosurveillance attacks. And, to carry out a replay attack, other phones can be told to rebroadcast previously-collected beacons in other locations. Or the attacker can just sell this information to those wanting to do these attacks. Since there are billions of these devices already deployed and active at all times, the attacker needs only the means to access them. Thanks to the internet and the mobile app ecosystem, they may be able to do so easily, and from anywhere on the planet.

The phone itself is the perfect weapon to mount these attacks: it collects the data, annotates it with location, and gives it to the attacker.

Attack by Surveillance Capitalism

Establishing the necessary infrastructure to carry out an attack like this via mobile phones is still non-trivial, but almost certainly costs less than installing huge antennas everywhere. Yet there is another way to ensure that it is cost-effective, even profitable: surveillance capitalism. The attacker doesn’t need to spend time and money hacking millions of phones to collect Bluetooth data: they can just pay app developers to include code that collects it, or they can provide developers with libraries that facilitate convenient app features but also collect Bluetooth data as much and as often as possible. This would give the attacker access to mobile phones all over the planet to act as “ears on the ground” on their behalf, listening for the data needed to do these attacks.

One of the many things we do at AppCensus is to keep a finger on the pulse of the myriad ads and analytics libraries that are surveilling users of mobile devices. By routinely testing as many apps as possible, we can look for unusual behaviours across the ecosystem. While looking for behaviours related to contact tracing, an SDK by X-Mode stood out: it collects not only your location and the names and serial numbers of all the WiFi routers near you, but also similar data from Bluetooth devices and Bluetooth Low Energy beacons.

As it turns out, X-Mode’s SDK was a bit of a needle in a haystack of apps: we only found it in a dozen or so apps from our entire dataset. Thanks to a volunteer at PersonalData.IO, we were told about another batch of apps and we list all the ones we found sending Bluetooth scans at the end of this post. Just because it was hard to find, however, doesn’t mean it isn’t prevalent. One of the apps, MP3 converter for videos, had more than 100 million installations as of November 2020, and many others had more than a million, like Just a Compass (Free & No Ads). None of the apps observed had any obvious need for both location and administrative control over the Bluetooth stack—which is what Android requires to scan for Bluetooth signals, and something about which users have limited visibility and control aside from disabling all access to Bluetooth.

We emphasize that there is no evidence to suggest that X-Mode or those apps are mounting the attacks we’ve outlined concerning contact tracing systems and their users. Importantly, we have no evidence that X-Mode’s Bluetooth collection actually included the random codes that are generated by the contact tracing API, which would be necessary to mount an attack. Recent stories from Vice and the Wall Street Journal report that, instead, X-Mode sells its location information to branches of the U.S. military. Further, X-Mode was already collecting Bluetooth data before that data took on this novel health purpose, and only recently updated its privacy policy to add “disease prevention and research, security, anti-crime and law enforcement”.

What this X-Mode example actually shows is that the cost to an adversary to implement these attacks has been wildly overestimated. It is a glimpse at the pervasiveness of an existing surveillance infrastructure operating as a for-profit enterprise that could be leveraged by an attacker in the right circumstances. If an organization like X-Mode were so inclined, also collecting the random codes from the Bluetooth beacons would be trivial and sufficient to carry out de-anonymization and biosurveillance attacks at a very large scale, whether they or a third party are ultimately responsible for using the data to carry out the attack. By contrast, executing a replay attack (i.e., a “false positive” attack) would require a fake broadcast of these codes and would impose a more conspicuous code change in X-Mode’s SDK, but still wouldn’t pose any real technical challenge.

False positive relay attacks can be done extrajurisdictionally. A foreign attacker collects codes from one place likely to later lead to a positive encounter, and rebroadcasts them at a target site that the attacker wants to shut down.

Conclusion

It is only a few lines of code that determine how this information is weaponized. As more legal frameworks and platform policies are realized, it is clear that app developers have a responsibility to account for the behaviours of their apps and the constituent components, including the data they collect and share with third parties. It is also increasingly evident that the ecosystem of third-party developers that provide essential features and services (so that app developers don’t need to build their own maps service, for example) needs far greater scrutiny. For that reason, we here at AppCensus will continue to monitor the behaviour of as many apps and SDKs as we can, to shine a brighter light on any hints of such attacks or related behaviour.

Appendix: X-Mode Apps


App Name | Installations
Video MP3 Converter | 100M+
Fleet Battle – Sea Battle | 10M+
myTuner Radio and Podcasts | 10M+
SPEEDCHECK Internet Speed Test | 10M+
Offline GPS Navigation, Traffic & Maps by Karta | 5M+
Compass | 1M+
Just a Compass (Free & No Ads) | 1M+
Speedcheck | 1M+
What The Forecast?!! | 1M+
Wiseplay | 1M+
Radio Italia: Online Radio Streaming | 500K+
Radio Japan | 500K+
Radio Korea – FM Radio and Podcasts | 500K+
Radio Polska – Radio FM | 500K+
Portable ORG Keyboard | 500K+
Bubble Level | 200K+
Amsterdam Travel Guide | 100K+
Berlin Travel Guide | 100K+
London Travel Guide | 100K+
Radio Australia: Online Radio & FM Radio App | 100K+
Radio Belgium: FM Radio and Internet Radio | 100K+
Radio Canada – Internet Radio App | 100K+
Radio Singapore: FM Radio + Radio Online Singapore | 100K+
Altimeter PRO | 50K+
The Sun Ephemeris | 50K+
Beijing Travel Guide | 10K+
Wildlife Weather | 10K+


Joel Reardon

Dr. Joel Reardon is the Forensics Lead and Co-Founder of AppCensus. Dr. Reardon is also a professor at the University of Calgary and a world-renowned expert in digital forensics and security.