News recently broke that the Public Health Agency of Canada (PHAC) had been procuring location data from millions of mobile devices to study how COVID-19 lockdowns were working.
Appalled opposition MPs called for an emergency meeting of the ethics committee of the House of Commons, fearing that the pandemic was being used as an excuse to scale up surveillance.
At the same time, our group of interdisciplinary experts from around the world convened at a research retreat on the subject of the ethics of mobility data analysis. Computer scientists, together with philosophers and social scientists, looked at the ethical challenges posed by the uses of mobility data, especially those legitimized by the pandemic.
The collection and use of location data
Telecommunications providers like Telus and Rogers know where cellphones are located by triangulation from the cell towers to which each phone connects. The data is a commodity and they share it, in anonymized form, with others, including academics.
Smartphones can also use the global positioning system (GPS) or their connection to Wi-Fi access points to collect location data and share it with companies to receive customized services, like navigation or recommendations.
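The triangulation described above can be sketched in a few lines. This is a toy illustration with made-up tower coordinates and distances, not how any carrier's system actually works: each known distance to a tower defines a circle around it, and subtracting one circle equation from the others leaves a small linear system for the phone's position.

```python
# Toy trilateration sketch: hypothetical tower positions (x, y in km)
# and measured distances to a phone (in practice derived, noisily,
# from signal timing or strength).
towers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
dists = [2.5, 2.5, 2.5]

# Subtracting circle equation 0 from equations 1 and 2 gives a linear system:
# 2(x_i - x_0)x + 2(y_i - y_0)y = d_0^2 - d_i^2 + (x_i^2 + y_i^2) - (x_0^2 + y_0^2)
(x0, y0), (x1, y1), (x2, y2) = towers
d0, d1, d2 = dists

a11, a12 = 2 * (x1 - x0), 2 * (y1 - y0)
a21, a22 = 2 * (x2 - x0), 2 * (y2 - y0)
b1 = d0**2 - d1**2 + (x1**2 + y1**2) - (x0**2 + y0**2)
b2 = d0**2 - d2**2 + (x2**2 + y2**2) - (x0**2 + y0**2)

# Solve the 2x2 system by Cramer's rule.
det = a11 * a22 - a12 * a21
x = (b1 * a22 - b2 * a12) / det
y = (a11 * b2 - a21 * b1) / det
print(f"estimated phone position: ({x:.2f}, {y:.2f}) km")
```

With three towers and exact distances the estimate is exact; real systems use noisy measurements from more towers and fit a best estimate instead.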
Many companies are interested in gathering location data even when their services don’t require it, as selling that data to other companies is an attractive prospect. For example, Tectonix tweeted in 2020 about a dashboard it had developed with data acquired from X-Mode to track the cellphones of people who partied on a Fort Lauderdale beach during spring break in March.
Not so anonymous
Companies and data brokers may claim to only store or sell anonymized location data, but that’s little comfort when location data itself is so identifiable and revealing. In particular, and contrary to popular belief, it’s almost impossible to make detailed location data truly anonymous. Even if a sequence of locations visited by an individual is stripped of any connection to that person’s name or other identifiers, the possibility of re-identification due to the inherent information contained in this trajectory must be considered.
It is quite simple to look up who lives at a given address. If we assume that the place where a device spends its days is its owner's workplace, and where it spends its nights is their home, that home and work pair alone can often uniquely identify a person.
Similar to records of a person’s online activities, the places visited can also reveal sensitive data such as health (repeated visits to a particular clinic), religion, hobbies and family (where your children go to school). Location data is hard to anonymize and can be used to re-identify a person, and all sorts of other information can be inferred from patterns of movement.
Government agencies like PHAC want to use mobility data to understand trends in the “movement of populations during the COVID‑19 pandemic” to study how the disease spreads, and also to monitor whether measures put in place, such as lockdowns, are being respected by the population.
The fear expressed by Conservative and Bloc Québécois members of Parliament is that the government is using the pandemic to justify a new level of surveillance of Canadians that could continue even after the pandemic is over.
Read more: Health data collected during the coronavirus pandemic needs to be managed responsibly
Best practices
If governments are going to promote contact tracing or collect mobility data for health reasons, such as tracking the transmission of COVID-19, best practices suggest that the scope should be clearly defined, the information gathered kept to a minimum, and the project given an expiry date after which it is reviewed. Some specific practices that government agencies like PHAC might consider include the following:
First, as PHAC is discovering, transparency is key: be transparent about what information is sought, how it will be stored and for how long, who will have access and what outcomes are anticipated.
In particular, due to its sensitivity with respect to privacy but also other ethical issues — such as the risks of stigmatization of particular groups in the population — the collection and analysis of large location datasets by governments should be made public from the beginning in a manner similar to the discussions around contact-tracing applications.
Location data isn’t representative: some groups, like children or the elderly, are less likely to carry smartphones, while more tech-savvy groups are over-represented. Biased data needs to be accounted for, and transparency is one way for the public to audit how such data is used.
Read more: Race-based COVID-19 data may be used to discriminate against racialized communities
Second, any organization gathering or working with data should develop a data-management plan that covers how it will deal with security and privacy implications.
Transparency and accountability around data plans are part of ensuring long-term ethical maintenance. Mobility data can easily be misused by third parties to make inappropriate inferences about citizens, so brokers, governments and researchers need to plan for how they will share data. This is especially true for researchers, who value preservation of research data to enable replication and further research.
Open datasets allow results to be replicated and new research to be imagined by combining past datasets, but openly shared datasets can also be used to re-identify people in inappropriate ways. How data is anonymized before being shared therefore matters greatly, and the method should itself be made public, open to the scrutiny of experts and associations concerned about privacy.
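One common family of safeguards before release is to coarsen locations to grid cells and suppress any cell with too few people in it (a minimum-group-size, or k-anonymity-style, rule). The sketch below uses made-up coordinates and an illustrative threshold; it is a simplification, not a complete anonymization scheme.

```python
from collections import Counter

# Hypothetical raw GPS points (lat, lon), one per user.
records = [
    (45.5081, -73.5542), (45.5083, -73.5539),
    (45.5079, -73.5544), (45.5317, -73.6201),
]

def generalize(lat, lon, precision=2):
    """Coarsen coordinates to a grid cell (roughly 1 km at precision=2)."""
    return (round(lat, precision), round(lon, precision))

cells = Counter(generalize(lat, lon) for lat, lon in records)

K = 3  # illustrative minimum group size before a cell may be published
released = {cell: n for cell, n in cells.items() if n >= K}
suppressed = {cell: n for cell, n in cells.items() if n < K}

print("released:", released)      # cells with at least K records
print("suppressed:", suppressed)  # too few records: re-identification risk
```

Even schemes like this can fail when an attacker links several released datasets together, which is one reason the article argues the chosen method should be published for expert scrutiny rather than trusted on faith.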
Third, civil society organizations need to be engaged in a dialogue around government policies, regulations and bias. Public trust in government surveillance and academic research needs to be developed and maintained — before there are scandals, not after.
We especially need to talk about who is represented and who is excluded, and what the implications are. For example, if the elderly or economically disadvantaged are less likely to have smartphones, then mobility data may under-represent their interests. That’s detrimental if this data is used to guide public policies.
Finally, we all need to contribute to the development of a new consensus around surveillance, whether by governments, companies or researchers. PHAC could lead a conversation around health surveillance.
As the pandemic has shown, public trust in health measures is important to their success. Transparency followed by dialogue could allow appropriate data gathering and use, while still enabling useful research, especially in times of crisis.
Chiara Renso of the Institute of Information Science and Technologies co-authored this article.
Geoffrey M Rockwell receives funding from the Social Science and Humanities Research Council of Canada and holds a leadership role in the Canadian Society for Digital Humanities. He has also received funding from the Alberta Machine Intelligence Institute. He is affiliated with the AI, People, and Society Initiative sponsored by the ATB AI Lab.
Bettina Berendt receives funding from the German Federal Ministry of Education and Research (BMBF) – Nr. 16DII113f.
Florence Chee does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.
Jeanna Matthews is affiliated with and holds leadership roles within the Association for Computing Machinery (ACM) and Institute of Electrical and Electronics Engineers (IEEE).
Sébastien Gambs receives funding from the Natural Sciences and Engineering Research Council (NSERC) of Canada, through the Discovery grant program, as well as from the Canada Research Chair program.
This article was originally published on The Conversation. Read the original article.