MAC randomization is an increasingly common privacy mechanism used in phones, tablets and some laptops.
Every network interface has a MAC address, which is a unique identifier used as a network address in communications. Your phone, tablet or computer's WiFi interface has a unique MAC address that it uses to identify itself to nearby network access points when it's looking for known WiFi networks to join, and to identify itself when joining a WiFi network.
A MAC address can be used to passively identify a device, whether a phone, tablet or computer, via network access points. Because of this, operating system providers started to introduce MAC randomization to enhance the privacy of the users of those devices. This works by changing the advertised MAC address of the device every few minutes. Although it is trivial to determine whether a given address is real or randomized, it is effectively impossible to know whether a given random MAC address belongs to the same device as the last seen random MAC address.
These changes mean:
- Unless a device is associated with a WiFi network, it is almost impossible to link a device to a user or to identify that a device has visited before
- Unique counts of devices present at a location over a long period are now very difficult to obtain, as a single device could be represented by multiple randomized MAC addresses
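The check for whether an address is real or randomized usually relies on the locally administered bit, which is the second-least-significant bit of the first octet. The following is a minimal Python sketch, assuming addresses arrive as hex strings with colon, hyphen or dot separators; the function name is illustrative and not part of any vendor API.

```python
def is_locally_administered(mac: str) -> bool:
    """Return True if the locally administered bit of the MAC is set.

    Randomized addresses are normally marked this way. The check cannot
    distinguish a randomized address from one set manually by an
    administrator, so treat the result as a strong hint, not proof.
    """
    hex_digits = mac.replace(":", "").replace("-", "").replace(".", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits, got {mac!r}")
    first_octet = int(hex_digits[:2], 16)
    # Bit 0x02 of the first octet is the locally administered flag.
    return bool(first_octet & 0x02)
```

For example, an address beginning with `da` has the bit set, while an address beginning with `00` does not.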
More recently, operating system providers have also started using randomized MAC addresses when connecting to WiFi networks, using a different unique MAC address for each known SSID. In this way, the operating system prevents the same user from being automatically identified as they move from network to network, unless those networks are broadcasting the same SSID. This means that a user can be reliably re-identified and given access to WiFi as they travel from one venue to another at the same company (broadcasting the same WiFi SSID), but they can't be automatically identified if they go to another company (broadcasting a different WiFi SSID).
In addition to saving a MAC address per SSID, randomized MAC addresses may also be regenerated if a user doesn't use a WiFi network for a prolonged time.
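The per-SSID behavior can be pictured with a small Python sketch. This is an illustration only and not the algorithm used by any particular operating system: it assumes a hypothetical per-device secret and derives a stable address for each SSID from a keyed hash, so the same SSID always yields the same address while different SSIDs yield unrelated ones.

```python
import hashlib
import hmac


def per_ssid_mac(device_secret: bytes, ssid: str) -> str:
    """Derive a stable, locally administered MAC for one SSID.

    Hypothetical scheme for illustration; real operating systems use
    their own (often undocumented) derivation and may regenerate the
    address after a period of non-use.
    """
    digest = hmac.new(device_secret, ssid.encode("utf-8"), hashlib.sha256).digest()
    octets = bytearray(digest[:6])
    octets[0] |= 0x02  # set the locally administered bit
    octets[0] &= 0xFE  # clear the multicast bit so the address is unicast
    return ":".join(f"{b:02x}" for b in octets)
```

Two venues broadcasting the same SSID would see the same address from this device, while a venue with a different SSID would see an address that cannot be linked to it.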
MAC randomization was introduced by Apple's iOS in 2014 and by Android in 2017. Each operating system performs this differently, and with Android this can also differ by device. For more information about how MAC randomization is performed, refer to one of the following:
Impact on metrics
In a scenario where MAC randomization is observed, and where the core service provider (the WiFi vendor collecting the data) does not filter randomized addresses, the Presence metric can appear higher than the number of devices physically present. As presence data includes both authenticated and associated data, the effects of this will diminish as the number of associated clients approaches the total number of devices present.
Repeat vs New
Perhaps the statistic most affected by MAC randomization is the repeat/new visitor metric. It is biased towards new visitors in particular, as freshly generated randomized addresses will not have been previously observed and are therefore counted as new visitor observations.
A randomized MAC is extremely unlikely to overlap with a genuine visitor's address, as vendors typically mark randomized addresses with the local bit in the MAC address. The probability of two randomized addresses overlapping is also extremely low, within the constraints of a single venue space and a short visit time window.
Filtering out randomized MAC addresses will for the most part clean this data. It is expected that the trend seen through the proportions of new vs repeat will be genuine and relatively accurate. Both metrics may appear lower than the actual value, by a proportion that depends on the authenticated user set and the dwell time of individuals (the opportunity to observe a genuine MAC increases over time). For example, a convenience store or fast food restaurant would observe more impact than a supermarket or restaurant.
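As a rough illustration of this filtering, the Python sketch below drops locally administered addresses before splitting the remainder into new and repeat visitors. The function names, the input shapes (an iterable of today's MACs and a set of previously seen MACs) and the returned keys are all assumptions made for the example, not a description of any vendor's pipeline.

```python
def is_randomized(mac: str) -> bool:
    """Treat an address with the locally administered bit set as randomized."""
    return bool(int(mac[:2], 16) & 0x02)


def classify_visitors(todays_macs, previously_seen):
    """Count new and repeat visitors after filtering randomized MACs."""
    today = {m.lower() for m in todays_macs}
    known = {m.lower() for m in previously_seen}
    genuine = {m for m in today if not is_randomized(m)}
    return {
        "new": len(genuine - known),
        "repeat": len(genuine & known),
        "filtered_randomized": len(today - genuine),
    }
```

Without the filter, every randomized address in the input would be counted as a new visitor, which is the skew described above.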
The effect of MAC randomization on dwell data is a little less transparent. Overall there is a bias towards lower dwell metrics, because of the short observations caused by new devices. However, where devices still present genuine MAC addresses over longer periods (e.g. randomization only while in sleep mode), the genuine dwell time can still be present and accurate for the device.
Typically, filtering out low dwell times and filtering out randomized addresses can improve the quality of dwell metrics.
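The two filters can be combined in a few lines. The Python sketch below assumes observations arrive as (MAC, timestamp) pairs and uses a two-minute minimum dwell; both the input shape and the threshold are assumptions chosen for illustration and would need tuning per venue.

```python
from datetime import datetime, timedelta


def dwell_times(observations, min_dwell=timedelta(minutes=2)):
    """Return per-device dwell (last seen minus first seen).

    Randomized (locally administered) addresses are skipped, and devices
    whose dwell is below min_dwell are dropped from the result.
    """
    first_last = {}
    for mac, ts in observations:
        mac = mac.lower()
        if int(mac[:2], 16) & 0x02:
            continue  # skip randomized addresses
        first, last = first_last.get(mac, (ts, ts))
        first_last[mac] = (min(first, ts), max(last, ts))
    dwells = {mac: last - first for mac, (first, last) in first_last.items()}
    return {mac: d for mac, d in dwells.items() if d >= min_dwell}
```

A device seen only once has a dwell of zero and is removed by the threshold, which is how the short observations caused by one-off addresses are kept out of the metric.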
Location (x,y) Data
Regardless of any randomization, the measured signal from the devices remains visible, so it is still effective to compute the relative location of an individual device. In general, as the movement of a mobile device is largely relative to the data window, there is little advantage for unassociated devices in knowing a prior location for the individual device. That is, a device could plausibly move many tens of meters within the probe interval.
For connected devices there is far more data throughput and we are guaranteed to have the genuine MAC address, and no randomized MAC will overlap a genuine MAC address, so there is no real impact of randomization for this service.
Improve accuracy of metrics
Avoid the MAC randomization skew
Where we have genuine authenticated users, we will observe genuine accurate data.
It therefore follows that the higher the proportion of authenticated users onboarded at the venue, the higher the accuracy you will see in your data. As we strive for further accuracy in understanding customer behavior within venues, it follows that with more users directly connected to the WiFi service, you will have deeper insight, finer granularity, and a more engaged and participating audience.
Compensate for the MAC randomization skew
As Android / iOS currently have differing strategies for randomization, it is important to assess the proportion of iOS vs Android devices actually present for the interval of the metric. The bias of measurable data to actual data relates to this proportion.
Due to trampling of vendor codes, prevalent in Apple iOS devices, we cannot realistically use the vendor component in the randomized MAC address to identify the proportions of Android and iOS devices. The result would be skewed in favor of non-iOS. It is therefore more reliable to relate the proportion to the ratio observed among actual authenticated users (and attempted authentications during that window). For larger data sets this approach can yield more accuracy, unless there is a known underlying reason why it may be skewed (e.g. a mobile phone showroom, or inaccessibility for a device type to authenticate).
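Estimating the split from authenticated users can be sketched as follows. The Python example assumes each authenticated (or attempted) session already carries a platform label such as "ios" or "android"; how that label is obtained is outside this sketch, and the function name is hypothetical.

```python
from collections import Counter


def platform_split(authenticated_platforms):
    """Return the iOS/Android proportions among authenticated sessions.

    Labels other than "ios" and "android" are ignored. Returns None when
    no usable sessions are present, since no ratio can be estimated.
    """
    counts = Counter(p.lower() for p in authenticated_platforms)
    total = counts["ios"] + counts["android"]
    if total == 0:
        return None
    return {"ios": counts["ios"] / total, "android": counts["android"] / total}
```

The resulting ratio is only as representative as the authenticated population, so the caveats above about showrooms or platform-specific authentication problems apply directly.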
Where the proportions of non-randomized versus randomized MACs are known, and there is a gauge of the split between iOS and Android, we can produce a new estimate of bias. This estimate is based on the quantifiable observational evidence, the device susceptibility to randomization, the measured dwell (opportunity weighting), and some tempered coefficients for the probability of randomization over time, clamped by the observational space and based on the prior device classification.
Where there is no vendor data available on the set (quantity) of randomized data, we must rely on typical bias, measured and computed at alternate venues to give a current gauge as to the proportions of randomization likely in the environment.
Generally, and for the purposes of statistical observation, the traffic shapes and time of day patterns will largely remain the same and project the same ‘shapes’ in overall data.
Finding your MAC address on mobile
On an iPhone or iPad, navigate to your settings screen, then select Wi-Fi.
When connected to a Wi-Fi network, you will see an ‘i’ icon next to your network name.
Press this ‘i’ and you will see ‘Wi-Fi Address’ with a string of 12 hexadecimal characters next to it, usually shown as six pairs separated by colons. This is your MAC address.
On Android, the process can vary slightly depending on which model or make of phone you are using.
In most cases, you will want to navigate to your settings screen, then click into either Connections or Wi-Fi.
Once on the Wi-Fi settings page, select the icon for more information beside your network name – this can be a gear or ‘i’ icon.
The MAC address will either be instantly viewable, or you may need to select ‘View more’ for the information to show.