What the Huq?
Huq Industries is a UK company that provides location-based services and mobility data, some of which is collected via an SDK that is integrated into various apps. From Huq’s marketing materials: “The most accurate grade of mobility data is derived from the mobile OS via (A-)GPS using a first-party specialised SDK such as ours.” We had previously seen Huq’s SDK present in a number of apps, collecting precise GPS location, as well as both connected and nearby router MAC addresses and router SSIDs. In the last couple of years, however, we noticed that the SDK no longer seemed to transmit location information. We decided to look into what was happening, especially because we continued to see it access location and router information while not obviously transmitting it.
Given the lack of observed transmissions, we looked at which files were opened and written to, and found a few Huq-related files, one of which was called
huqVisitAwaitingSubmissionStore.xml. You can find it in the per-app local storage, so if Huq’s code was in an app named com.some.app, the data file for Huq would be found in
We looked inside this “awaiting submission” file and found that it was XML-formatted and contained a list of strings, each of which encoded a JSON object storing a network scan. For example, a snippet of it looks like this:
Here’s a better formatted version:
From this, it appears that Huq is queuing up these location data reports to send, but not sending them out right away. Looking at the timing of location tracking, it looks like Huq does a batch upload every nine minutes or so while the phone is on—including when the app containing the Huq SDK is not in use.
At each of these times, there were 10 separate reports sent, each with its own date and time. Thus it appears that what is happening is that Huq waits until there are 10 events to report, and then sends that batch. We found one app that had a telling variable,
VISIT_SUBMISSION_BATCH_SIZE, which was set equal to 10 in a shared preferences file called
The roughly nine-minute interval thus seems to be the time it takes to accumulate 10 events, at which point the batch is sent to the server.
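The queue-then-batch behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not Huq's code: it assumes the queue file follows the standard Android shared-preferences XML layout (a `<map>` holding `<string>` entries), and the key name `pending_visits` and the JSON fields `event` and `time` are hypothetical placeholders. Only the batch size of 10 comes from our observations.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a queue file. The key name and the JSON
# fields are invented for illustration and are not Huq's real schema.
SAMPLE_QUEUE_XML = """<map>
    <set name="pending_visits">
        <string>{"event": "HuqGeoEvent", "time": "2021-01-01T10:00:00Z"}</string>
        <string>{"event": "HuqNetworkJoinEvent", "time": "2021-01-01T10:01:00Z"}</string>
    </set>
</map>
"""

# Value observed in one app's VISIT_SUBMISSION_BATCH_SIZE preference.
BATCH_SIZE = 10


def parse_pending_reports(xml_text):
    """Decode every <string> entry in the XML as one JSON report."""
    root = ET.fromstring(xml_text)
    reports = []
    for node in root.iter("string"):
        if node.text:
            reports.append(json.loads(node.text))
    return reports


def batch_ready(reports, batch_size=BATCH_SIZE):
    """Return True once enough reports are queued to trigger an upload."""
    return len(reports) >= batch_size
```

With the two-entry sample above, `batch_ready(parse_pending_reports(SAMPLE_QUEUE_XML))` is false, which would match a period in which nothing is transmitted even though reports are accumulating on disk.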
These reports are given different event type names, such as
HuqGeoEvent, to track when a user moves;
HuqNetworkJoinEvent, when the user connects to WiFi; and
HuqNetworkChangedEvent, which sometimes also sends a scan of nearby WiFi networks, as in this example when I took our testing phone on a stroll to downtown Bowness, hoping to trigger different events. I found that it uploaded WiFi networks for Bow Cannabis and its adjacent snack shop, Munchies.
We also tested to see if it respects an opt-out mechanism for people who do not want their home router’s data getting scooped up in a broad data collection: appending “_nomap” to the network’s SSID. We combined this with Microsoft’s now defunct opt-out mechanism (for an alternative purpose) of appending “_optout” to the SSID. We tested each individually and both orderings of the two by inserting fake scan results, and found that Huq collected them all:
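For contrast, a collector that honored these opt-out suffixes could filter scan results before queuing them. The sketch below is our own illustration under stated assumptions, not anything we observed in Huq's SDK: the function names are hypothetical, and matching either token anywhere in the SSID, case-insensitively, is a deliberately conservative choice rather than the exact rule of either opt-out scheme.

```python
# Tokens taken from the two opt-out conventions discussed above.
OPT_OUT_TOKENS = ("_nomap", "_optout")


def is_opted_out(ssid):
    """Treat an SSID as opted out if it contains either token.

    Assumption: tokens are matched anywhere in the name and without
    regard to case, which errs on the side of not collecting.
    """
    lowered = ssid.lower()
    return any(token in lowered for token in OPT_OUT_TOKENS)


def filter_scan(ssids):
    """Drop opted-out networks from a list of scanned SSIDs."""
    return [ssid for ssid in ssids if not is_opted_out(ssid)]
```

Applied to a scan containing `Home_nomap`, `Home_optout`, and both orderings of the two suffixes, this filter would retain none of them.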
Each of these batch reports seems to be linked to multiple identifiers, including the advertising ID and a random-looking one called HuqIID. Although we originally believed that the HuqIID was an installation identifier, we found that it persisted across installations (that is, it was the same value after uninstalling and reinstalling the same app). It was also the same for different apps by the same developer. This suggests that it is derived from the Android ID, which is a non-resettable per-developer identifier. We confirmed this by looking at the app’s use of the system’s SHA1 implementation at runtime: the HuqIID is a UUID generated from the SHA1 output of the hex bytes
0650522fafbd415e97b249ab863a0884 concatenated with the ASCII representation of the Android ID. We found that all apps we saw communicate with Huq used this same method (with the same prefix) to compute the HuqIID. We also saw that transmissions included the phone’s “Bluetooth Name”, which is user-configurable—though yours may very well be set to “Your Name’s Phone”.
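The derivation described above can be reproduced as a short sketch. The SHA1 input (the fixed 16-byte prefix followed by the ASCII Android ID) is what we observed. How the 20-byte digest is turned into a UUID is an assumption here: because the prefix is exactly the size of a UUID, one plausible reading is a standard version-5 (name-based, SHA1) UUID with the prefix as namespace. The Android ID used in the usage note is a made-up value.

```python
import hashlib
import uuid

# Fixed prefix observed in every app we saw communicating with Huq.
HUQ_PREFIX = bytes.fromhex("0650522fafbd415e97b249ab863a0884")


def sha1_of_prefixed_android_id(android_id):
    """SHA1 over the prefix bytes followed by the ASCII Android ID."""
    return hashlib.sha1(HUQ_PREFIX + android_id.encode("ascii")).digest()


def candidate_huq_iid(android_id):
    """Assumed mapping from digest to UUID: a version-5 UUID.

    uuid5 hashes namespace bytes + name with SHA1 and keeps the first
    16 bytes (with version and variant bits set), so it consumes the
    same SHA1 input as the function above.
    """
    return uuid.uuid5(uuid.UUID(bytes=HUQ_PREFIX), android_id)
```

For example, `candidate_huq_iid("0123456789abcdef")` returns the same value on every call and for every app that sees that Android ID, which is consistent with the identifier surviving a reinstall.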
The advertising ID can be reset by the user, but linking it to a persistent identifier that remains the same as long as the app is installed means that a motivated entity with access to the data can bridge old and new advertising IDs, thus defeating the purpose of any resettable or ephemeral ID. When not collected with explicit user consent (which none of the apps that we examined requested), this appears to potentially violate Google’s platform policies: “The advertising identifier may only be connected to personally-identifiable information or associated with any persistent device identifier (for example: SSAID, MAC address, IMEI, etc.) with the explicit consent of the user.” We previously wrote a report on this so-called “identifier bridging” jointly with the IDAC.
Many of the apps in which we found Huq’s code included ultimatum-style demands to accept terms of service before the app could be used. Some sent data even when the user did not accept the terms, but instead quit the app. In two cases, the app had options to opt out of data collection, yet in our testing Huq continued to collect and transmit location data despite that status. That is, clicking “NO” in the figure below (left) or ensuring data collection was off (right) did not observably impact Huq’s data collection.
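The identifier bridging described above can be illustrated with a small sketch. It assumes each uploaded report carries both identifiers. The field names `huq_iid` and `ad_id` and the record layout are hypothetical, chosen only to show why resetting the advertising ID does not help once it travels alongside a persistent identifier.

```python
from collections import defaultdict


def bridge_ad_ids(records):
    """Group advertising IDs by the persistent identifier they arrive with.

    Returns only those persistent IDs seen with more than one
    advertising ID, i.e. the devices whose ad-ID reset has been undone.
    """
    linked = defaultdict(set)
    for record in records:
        linked[record["huq_iid"]].add(record["ad_id"])
    return {iid: ads for iid, ads in linked.items() if len(ads) > 1}
```

Given two reports that share one persistent ID but carry different advertising IDs (before and after a reset), the function returns that persistent ID mapped to both advertising IDs, so the two location histories can be joined.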
Thankfully, this aggressive tracking of people’s location and linking it to multiple identifiers does not seem to be prevalent among apps. Looking at tens of thousands of apps, we only found Huq in 17. As mentioned, we didn’t observe any apps sending location data until we discovered this batching technique, and since then, we have tested some apps more exhaustively and found that nearly all are, in fact, sending location data linked to both the advertising ID and an Android-ID-derived HuqIID. Even though the SDK is rare, some of these apps are quite popular, so this may affect many users (e.g., Huq claims to receive over a billion daily events). The table below summarizes the apps in which we found Huq’s SDK. Note that the absence of a “yes” in the “Sent Geolocation” column does not mean that the app doesn’t send geolocation, only that we haven’t done a lengthy manual analysis of that app.