WPES '23

Proceedings of the 22nd Workshop on Privacy in the Electronic Society
Last Update: 26 November 2023

SESSION: Full Papers

Zef: Low-latency, Scalable, Private Payments
  • Mathieu Baudet
  • Alberto Sonnino
  • Mahimna Kelkar
  • George Danezis

Zef is the first Byzantine-Fault Tolerant (BFT) protocol to support payments in anonymous digital coins at arbitrary scale. Zef achieves its performance by forgoing the expense of BFT consensus and instead using Byzantine Consistent Broadcast as its core primitive. Zef is asynchronous, low-latency, linearly scalable, and powered by partially-trusted sharded authorities. Zef introduces opaque coins represented as off-chain certificates that are bound to user accounts. To hide the values of coins when a payment operation consumes or creates them, Zef uses cryptographically hiding commitments and NIZK proofs. Coin creation and spending are unlinkable thanks to the Coconut blind and randomizable threshold anonymous credential scheme. To control the storage costs associated with coin replay prevention, Zef allows safe account deletion once an account is deactivated. Our extensive benchmarks on AWS confirm textbook linear scalability and demonstrate a confirmation time under one second at nominal capacity. Compared to existing anonymous payment systems based on a blockchain, this represents a latency speedup of three orders of magnitude, with no theoretical limit on throughput.
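Zef's actual construction relies on Coconut threshold credentials and NIZK proofs; as a self-contained illustration of the hiding-commitment idea alone, the sketch below shows a toy Pedersen-style commitment. The modulus and generators are hypothetical toy parameters, not Zef's scheme: real systems use elliptic-curve groups of cryptographic size.

```python
import secrets

# Toy Pedersen-style commitment over a small prime modulus.
# Illustrative only: P, G, H are hypothetical toy parameters.
P = 2**61 - 1          # a Mersenne prime (toy modulus)
G, H = 3, 7            # toy generators (assumed independent)

def commit(value: int, blinding: int) -> int:
    """C = g^v * h^r mod p: hides `value` behind the random `blinding`."""
    return (pow(G, value, P) * pow(H, blinding, P)) % P

def open_commitment(c: int, value: int, blinding: int) -> bool:
    """Verify that (value, blinding) opens commitment c."""
    return c == commit(value, blinding)

r = secrets.randbelow(P)
c = commit(42, r)
assert open_commitment(c, 42, r)       # correct opening verifies
assert not open_commitment(c, 43, r)   # a wrong value is rejected

# Commitments are additively homomorphic, which is what lets a payment
# consume and create hidden coin values that provably balance:
assert (commit(2, 5) * commit(3, 7)) % P == commit(5, 12)
```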

From Privacy Policies to Privacy Threats: A Case Study in Policy-Based Threat Modeling
  • Yana Dimova
  • Mrunmayee Kode
  • Shirin Kalantari
  • Kim Wuyts
  • Wouter Joosen
  • Jan Tobias Mühlberg

Privacy threat modeling is a systematic approach to assessing potential privacy risks that arise from a given system design. Eliciting privacy threats requires a detailed understanding of system components and the ways in which these components interact. This makes it difficult, if not impossible, for regular users, i.e., parties who interact with the system but have no knowledge of its inner workings, to meaningfully engage in threat modeling and risk assessment. We explore an approach to this problem that relies on information from a system's publicly available privacy policies to derive system models and apply threat-modeling analyses. We chose the WhatsApp instant messaging system as a case study for privacy threat modeling from the perspective of a "regular" user. We apply the LINDDUN GO methodology and evaluate how threats evolved over time in two significant territorial areas, the European Union and India. Our study illustrates the impact of regulations and court cases, and our approach may aid practitioners without inside knowledge in making informed choices regarding privacy risks when adopting third-party services.

UA-Radar: Exploring the Impact of User Agents on the Web
  • Jean Luc Intumwayase
  • Imane Fouad
  • Pierre Laperdrix
  • Romain Rouvoy

In the early days of the web, serving the same web page to different browsers could produce very different results. As the rendering engine behind each browser differed, some elements of a page could break or be positioned in the wrong location. At that time, the User Agent (UA) string was introduced for content negotiation. By knowing the browser used to connect to the server, a developer could provide a web page tailored to that specific browser to remove any usability problems. Over the past three decades, the UA string has remained exposed by browsers, but its current usefulness is debated. Browsers now adopt the exact same standards and use the same languages to display the same content to users, raising the question of whether the content of the UA string is still relevant today or a relic of the past. Moreover, the diversity of means to browse the web has become so large that the UA string is one of the top contributors to tracking users in the field of browser fingerprinting, bringing a sense of urgency to deprecating it. In this paper, our goal is to understand the impact of the UA string on the web and whether this legacy string is still actively used to adapt the content served to users. We introduce UA-Radar, a web page similarity measurement tool that compares two web pages in depth, from their code to their actual rendering, and highlights the similarities it finds. We crawled 270,048 web pages from 11,252 domains using 3 different browsers and 2 different UA strings, and observed that 100% of the web pages were similar before any JavaScript was executed, demonstrating the absence of differential serving. Our experiments also show that only a very small number of websites are affected by the lack of UA information, which can be fixed in most cases by updating code to become browser-agnostic. Our study brings some proof that it may be time to turn the page on the UA string and retire it from current web browsers.

Client-specific Property Inference against Secure Aggregation in Federated Learning
  • Raouf Kerkouche
  • Gergely Ács
  • Mario Fritz

Federated learning has become a widely used paradigm for collaboratively training a common model among different participants with the help of a central server that coordinates the training. Although only the model parameters or other model updates are exchanged during federated training instead of the participants' data, many attacks have shown that it is still possible to infer sensitive information or to reconstruct participant data. Although differential privacy is considered an effective solution to protect against privacy attacks, it is also criticized for its negative effect on utility. Another possible defense is secure aggregation, which allows the server to access only the aggregated update instead of each individual one; it is often more appealing because it does not degrade model quality. However, combining only the aggregated updates, which are generated by a different composition of clients in every round, may still allow the inference of some client-specific information.

In this paper, we show that simple linear models can effectively capture client-specific properties only from the aggregated model updates due to the linearity of aggregation. We formulate an optimization problem across different rounds in order to infer a tested property of every client from the output of the linear models, for example, whether they have a specific sample in their training data (membership inference) or whether they misbehave and attempt to degrade the performance of the common model by poisoning attacks. Our reconstruction technique is completely passive and undetectable. We demonstrate the efficacy of our approach on several scenarios, showing that secure aggregation provides very limited privacy guarantees in practice.
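The key observation, that aggregation is linear, can be illustrated with a toy example. The per-client scores below are hypothetical and this is not the paper's actual attack; it only shows how per-round aggregates over varying client subsets form a solvable linear system in per-client quantities.

```python
# Toy illustration of leakage through secure aggregation: the server sees
# only per-round sums, but the participating subset changes every round.
hidden = {0: 0.9, 1: 0.1, 2: 0.7}    # hypothetical per-client property scores

participation = [{0, 1}, {1, 2}, {0, 2}]                  # clients per round
y0, y1, y2 = (sum(hidden[i] for i in s) for s in participation)

# Three observed sums over three unknowns: solve the system by hand.
recovered = {
    0: (y0 + y2 - y1) / 2,
    1: (y0 + y1 - y2) / 2,
    2: (y1 + y2 - y0) / 2,
}
assert all(abs(recovered[i] - hidden[i]) < 1e-9 for i in hidden)
```

With enough rounds and varying client compositions, the same idea scales to recovering a per-client property score from aggregates alone, which is the intuition behind the paper's optimization across rounds.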

Comparing Privacy Labels of Applications in Android and iOS
  • Rishabh Khandelwal
  • Asmit Nayak
  • Paul Chung
  • Kassem Fawaz

The increasing concern for privacy protection in mobile apps has prompted the development of tools such as privacy labels to assist users in understanding the privacy practices of applications. Both Google and Apple have mandated that developers use privacy labels to increase transparency in data collection and sharing practices. These privacy labels provide detailed information about apps' data practices, including the types of data collected and the purposes associated with each data type. This offers a unique opportunity to understand apps' data practices at scale. In this study, we conduct a large-scale measurement study of privacy labels using apps from the Android Play Store (n=2.4M) and the Apple App Store (n=1.38M). We establish a common mapping between iOS and Android labels, enabling a direct comparison of disclosed practices and data types between the two platforms. By studying over 100K apps, we identify discrepancies and inconsistencies in self-reported privacy practices across platforms. Our findings reveal that at least 60% of all apps have different practices on the two platforms. Additionally, we explore factors contributing to these discrepancies and provide valuable insights for developers, users, and policymakers. Our analysis suggests that while privacy labels have the potential to provide useful information concisely, in their current state, it is not clear whether the information provided is accurate. Without robust consistency checks by the distribution platforms, privacy labels may not be as effective and can even create a false sense of security for users. Our study highlights the need for further research and improved mechanisms to ensure the accuracy and consistency of privacy labels.

Maybenot: A Framework for Traffic Analysis Defenses
  • Tobias Pulls
  • Ethan Witwer

In light of the increasing ubiquity of end-to-end encryption and the use of technologies such as Tor and VPNs, analyzing communications metadata---traffic analysis---is a last resort for network adversaries. Traffic analysis attacks are more effective thanks to improvements in deep learning, raising the importance of deploying defenses. This paper introduces Maybenot, a framework for traffic analysis defenses. Maybenot is an evolution and generalization of the Tor Circuit Padding Framework by Perry and Kadianakis, designed to support a wide range of protocols and use cases. Defenses are probabilistic state machines that trigger padding and blocking actions based on events. A lightweight simulator enables rapid development and testing of defenses. In addition to describing the Maybenot framework, machines, and simulator, we implement and thoroughly evaluate the state-of-the-art website fingerprinting defenses FRONT and RegulaTor as Maybenot machines. Our evaluation identifies challenges associated with state machine-based frameworks as well as possible enhancements that will further improve Maybenot's support for effective defenses moving forward.
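A defense machine of this kind can be sketched as follows. The states, events, and probabilities are hypothetical illustrations of the probabilistic-state-machine idea, not an actual Maybenot, FRONT, or RegulaTor machine.

```python
import random

# Minimal sketch of a padding machine: on each traffic event, the machine
# samples a transition and may trigger an action on entering a state.
MACHINE = {
    # state: {event: [(next_state, probability), ...]}
    "idle":    {"packet_sent": [("padding", 0.3), ("idle", 0.7)]},
    "padding": {"timer_fired": [("idle", 1.0)],
                "packet_sent": [("padding", 1.0)]},
}

ACTIONS = {"padding": "inject_padding"}   # action on entering a state

def step(state, event, rng=random):
    """Sample the next state for `event`; return (next_state, action or None)."""
    transitions = MACHINE.get(state, {}).get(event)
    if not transitions:
        return state, None                # no transition defined: stay put
    r, acc = rng.random(), 0.0
    for next_state, prob in transitions:
        acc += prob
        if r < acc:
            return next_state, ACTIONS.get(next_state)
    return state, None
```

Driving the machine with a stream of events (`step("idle", "packet_sent")`, and so on) yields a randomized schedule of padding actions, which is the shape of defense the framework's simulator is designed to develop and test.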

Unveiling the Impact of User-Agent Reduction and Client Hints: A Measurement Study
  • Asuman Senol
  • Gunes Acar

The user-agent string contains the details of a user's device, browser and platform. Prior work on browser fingerprinting showed that the user-agent string can facilitate covert fingerprinting and tracking of users. In order to address these privacy concerns, browsers including Chrome recently reduced the user-agent string to make it less identifying. Simultaneously, Chrome introduced several highly identifying (or high-entropy) user-agent client hints (UA-CH) to allow access to browser properties that are redacted from the user-agent string. In this empirical study, we attempt to characterize the effects of these major changes through a large-scale web measurement on the top 100K websites. Using an instrumented crawler, we quantify access to high-entropy browser features through UA-CH HTTP headers and the JavaScript API. We measure access delegation to third parties and investigate whether the new client hints are already used by tracking, advertising and browser fingerprinting scripts. Our results show that high-entropy UA-CHs are accessed by one or more scripts on 59.2% of the successfully visited sites, and 93.8% of these calls were made by tracking- and advertising-related scripts, primarily those owned by Google. Overall, we find that scripts from ~9K distinct registrable (eTLD+1) third-party domains take advantage of their unfettered access and retrieve the high-entropy UA-CHs. We find that on 91.6% of the sites where high-entropy client hints are accessed via the JavaScript API, the hints are exfiltrated by a tracker script to a remote server. Turning to high-entropy UA-CHs sent in HTTP headers, which require opt-in or delegation, we found very limited use. Only 1.3% of the websites use the Accept-CH header to receive high-entropy UA-CHs, and an even smaller fraction of websites (0.4%) delegate high-entropy hints to third-party domains.
Overall, our findings indicate that user-agent reduction efforts were effective in minimizing the passive collection of identifying browser features, but third-party tracking and advertising scripts continue to enjoy their unfettered access.
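The header-based opt-in and delegation mechanisms measured here follow the standard UA-CH negotiation. A sketch of the exchange is shown below; the origin `ads.example` and all version values are hypothetical examples, not taken from the paper's dataset.

```http
# First response: the server opts in to high-entropy hints and delegates
# two of them to a (hypothetical) third-party origin.
Accept-CH: Sec-CH-UA-Full-Version-List, Sec-CH-UA-Platform-Version, Sec-CH-UA-Model
Permissions-Policy: ch-ua-full-version-list=("https://ads.example"), ch-ua-platform-version=("https://ads.example")

# Subsequent request: the browser now attaches the requested hints.
Sec-CH-UA-Full-Version-List: "Chromium";v="119.0.6045.105", "Google Chrome";v="119.0.6045.105"
Sec-CH-UA-Platform-Version: "14.0.0"
Sec-CH-UA-Model: ""
```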

Trends in Privacy Dialog Design after the GDPR: The Impact of Industry and Government Actions
  • Logan Warberg
  • Vincent Lefrere
  • Cristobal Cheyre
  • Alessandro Acquisti

Prior research found that a significant portion of EU-based websites responded to the GDPR by implementing privacy dialogs that contained inadequate consent options or dark patterns nudging visitors towards accepting tracking. Less attention, so far, has been devoted to capturing the evolution of those privacy dialogs over time. We study the evolution of privacy dialogs for a period of 18 months after the GDPR became effective using screenshots from the homepages of 911 US and EU news and media websites. We assess the impact of government and third-party actions that provided additional guidance and tools for compliance on privacy dialogs' choice architecture. Over time, we observe an increase in the use of privacy dialogs providing the option to accept or reject tracking, and a reduction of nudges that encourage users to accept tracking. While the debate over the extent to which various stakeholders' responses to the GDPR meaningfully improved EU residents' privacy remains open, our results suggest that exogenous shocks (such as government interventions) may prompt websites to enact changes that bring on-the-ground implementation of the GDPR at least nominally closer to its intended goals (such as making rejecting tracking easier for visitors).

SESSION: Short Papers

A Quantitative Information Flow Analysis of the Topics API
  • Mário S. Alvim
  • Natasha Fernandes
  • Annabelle McIver
  • Gabriel H. Nunes

Third-party cookies have been a privacy concern since cookies were first developed in the mid-1990s, but stricter cookie policies were only introduced by browser vendors in the early 2010s. More recently, due to regulatory changes, browser vendors have started to completely block third-party cookies, with both Firefox and Safari already compliant. The Topics API is being proposed by Google as an additional and less intrusive source of information for interest-based advertising (IBA), following the upcoming deprecation of third-party cookies. Initial results published by Google estimate that the probability of correctly re-identifying a random individual would be below 3% while still supporting IBA. In this paper, we analyze the re-identification risk that the Topics API introduces for individual Internet users from the perspective of Quantitative Information Flow (QIF), an information- and decision-theoretic framework. Our model allows a theoretical analysis of both the privacy and utility aspects of the API and their trade-off, and we show that the Topics API does have better privacy than third-party cookies. We leave the utility analyses for future work.
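As a toy illustration of the QIF quantities involved (uniform prior, deterministic toy channel; not the paper's model of the Topics API), prior and posterior Bayes vulnerability can be computed directly:

```python
from collections import Counter
from fractions import Fraction

# Hypothetical population: 100 users, each deterministically mapped to
# one of 10 observable topic buckets (a toy channel, not the real API).
n = 100
bucket = {u: u % 10 for u in range(n)}

# Prior Bayes vulnerability: the adversary's best blind guess at which
# user is behind a visit succeeds with probability 1/n.
prior_v = Fraction(1, n)

# Posterior Bayes vulnerability: for each observed bucket, guess the most
# likely user in it; under a uniform prior each such guess carries joint
# probability 1/n, so the posterior sums to (#buckets)/n.
posterior_v = sum(Fraction(1, n) for _ in Counter(bucket.values()))

multiplicative_leakage = posterior_v / prior_v
assert prior_v == Fraction(1, 100)
assert posterior_v == Fraction(1, 10)
assert multiplicative_leakage == 10   # observing the bucket multiplies
                                      # guessing odds by the number of buckets
```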

The HandyTech's Coming Between 1 and 4: Privacy Opportunities and Challenges for the IoT Handyperson
  • Denise Anthony
  • Carl A. Gunter
  • Weijia He
  • Mounib Khanafer
  • Susan Landau
  • Ravindra Mangar
  • Nathan Reitinger

Smart homes are gaining popularity due to their convenience and efficiency, both of which come at the expense of increased complexity of Internet of Things (IoT) devices. Due to the number and heterogeneity of IoT devices, technologically inexperienced or time-burdened residents are unlikely to manage the setup and maintenance of IoT apps and devices themselves. We highlight the need for a "HandyTech": a technically skilled contractor who can set up, repair, debug, monitor, and troubleshoot home IoT systems. In this paper, we consider the potential privacy challenges posed by the HandyTech, who has the ability to access IoT devices and private data. We do so in the context of single- and multi-user smart homes, including rental units, condominiums, and homes with temporary guests or workers. We examine the privacy harms that can arise when a HandyTech has legitimate access to information but uses it in unintended ways. By providing insights for the development of privacy control policies and measures in home IoT environments in the presence of the HandyTech, we also capture the privacy concerns raised by other visitors to the home, including temporary residents, part-time workers, etc. This helps lay a foundation for the broad set of privacy concerns raised by home IoT systems.

BAZAAR: Anonymous Resource Sharing
  • Christoph Coijanovic
  • Daniel Schadt
  • Christiane Weis
  • Thorsten Strufe

In areas such as manufacturing or logistics, it is beneficial for everyone to share excess capacity with others: increased efficiency raises profits, lowers prices for consumers, and reduces environmental impact. However, in order to share a resource such as manufacturing capacity, suitable partners must be found. Ideally, a centralized exchange is used to find partners, but this comes with privacy risks. Since participants in the exchange are competitors, they can use information about someone else's capacity to that party's disadvantage, e.g., by undercutting the prices of an already poorly performing competitor to drive it out of business. In this paper, we show that such an exchange can be set up without compromising the privacy of its participants. We formalize privacy goals in the context of resource sharing via an indistinguishability game. We also propose Bazaar, a protocol that allows participants to find suitable matches while satisfying our formal privacy goals.

Extending Browser Extension Fingerprinting to Mobile Devices
  • Brian Hyeongseok Kim
  • Shujaat Mirza
  • Christina Pöpper

Browser extensions are tools that extend basic browser features to enhance the web experience. It has been shown that extensions can be exploited to fingerprint users and even infer personal information about them. However, as browser extensions were previously limited to desktops, no prior work has explored the fingerprintability of extensions on mobile devices, despite the increasing extension support in mobile browsers. This paper aims to fill this gap by extending extension fingerprinting techniques, traditionally performed on desktops, to mobile phones. Out of the 16 chosen extensions, we discover that 6 are uniquely identifiable by their client-side modifications. We present experimental results evaluating the interactions between various browsers, devices, and extension lists, and investigate how shifting attention from the list of installed extensions to the actual modification data can help attackers better discriminate between users.

TIGER: Tor Traffic Generator for Realistic Experiments
  • Daniela Lopes
  • Daniel Castro
  • Diogo Barradas
  • Nuno Santos

Tor is the most widely adopted anonymity network, helping safeguard the privacy of Internet users, including journalists and human rights activists. However, effective attacks aimed at deanonymizing Tor users remain a significant threat. Unfortunately, evaluating the impact of such attacks by collecting realistic Tor traffic without gathering real users' data poses a significant challenge.

This paper introduces TIGER (Tor traffIc GEnerator for Realistic experiments), a novel framework that automates the generation of realistic Tor traffic datasets towards improving our understanding of the robustness of Tor's privacy mechanisms. To this end, TIGER allows researchers to design large-scale testbeds and collect data on the live Tor network while responsibly avoiding the need to collect real users' traffic. We motivate the usefulness of TIGER by collecting a preliminary dataset with applicability to the evaluation of traffic confirmation attacks and defenses.

Legitimate Interest is the New Consent - Large-Scale Measurement and Legal Compliance of IAB Europe TCF Paywalls
  • Victor Morel
  • Cristiana Santos
  • Viktor Fredholm
  • Adam Thunberg

Cookie paywalls allow visitors of a website to access its content only after they choose between paying a fee and accepting tracking. European Data Protection Authorities (DPAs) recently issued guidelines and decisions on the lawfulness of paywalls, but whether websites comply with them is as yet unknown. In this paper, we study the prevalence of cookie paywalls on the top one million websites using an automatic crawler. We identify 431 cookie paywalls, all using the Transparency and Consent Framework (TCF). We then analyse the data these paywalls communicate through the TCF, in particular the legal grounds and the purposes used to collect personal data. We observe that cookie paywalls extensively rely on the legitimate interest legal basis, systematically conflated with consent. We also observe a lack of correlation between the presence of paywalls and legal decisions or guidelines by DPAs.

Impact Analysis of Organizational Structure of Group Companies on Privacy Policies
  • Keika Mori
  • Yuta Takata
  • Daiki Ito
  • Masaki Kamizono
  • Tatsuya Mori

Privacy issues within subsidiary companies can significantly impact the overall trust in a group of companies. It is therefore important to address privacy concerns and establish privacy governance across the entire group, including the parent and subsidiary companies. However, assessing the actual state of privacy governance within a group from an external perspective remains challenging. In this study, we analyzed the publicly disclosed privacy policies of the parent and subsidiary companies within a group and compared the results to investigate the practical implementation of privacy governance. We examined the similarity in legal compliance based on the policies of 901 group companies and identified several influencing factors, such as organizational structure and the number of companies within the group. Specifically, we observed a decrease in similarity with an increase in the number of companies and the complexity of the organizational structure. Moreover, companies with lower similarity tended to belong to industries that handle personal information and to have fewer employees.

Analysis of Google Ads Settings Over Time: Updated, Individualized, Accurate, and Filtered
  • Nathan Reitinger
  • Bruce Wen
  • Michelle L. Mazurek
  • Blase Ur

Advertising companies and data brokers often provide consumers access to a dashboard summarizing attributes they have collected or inferred about that user. These attributes can be used for targeted advertising. Several studies have examined the accuracy of these collected attributes or users' reactions to them. However, little is known about how these dashboards, and the associated attributes, change over time. Here, we report data from a week-long, longitudinal study (n=158) in which participants used a browser extension automatically capturing data from one dashboard, Google Ads Settings, after every fifth website the participant visited. The results show that Ads Settings is frequently updated, includes many attributes unique to only a single participant in our sample, and is approximately 90% accurate when assigning age and gender. We also find evidence that Ads Settings attributes may dynamically impact browsing behavior and may be filtered to remove sensitive interests.

A First Look at Generating Website Fingerprinting Attacks via Neural Architecture Search
  • Prabhjot Singh
  • Shreya Arun Naik
  • Navid Malekghaini
  • Diogo Barradas
  • Noura Limam

An adversary can use website fingerprinting (WF) attacks to breach the privacy of users who access the web through encrypted tunnels like Tor. These attacks have increasingly relied on the use of deep neural networks (DNNs) to build powerful classifiers that can match the traffic of a target user to the specific traffic pattern of a website.

In this paper, we study whether the use of neural architecture search (NAS) techniques can provide adversaries with a systematic way to find improved DNNs to launch WF attacks. Concretely, we study the performance of the prominent AutoKeras NAS tool on the WF scenario, under a limited exploration budget, and analyze the effectiveness and efficiency of the resulting DNNs.

Our evaluation reveals that AutoKeras's DNNs achieve accuracy comparable to that of the state-of-the-art Tik-Tok attack on undefended Tor traffic, and obtain 5--8% accuracy improvements against the FRONT random padding defense, highlighting the potential of NAS techniques to enhance the effectiveness of WF attacks.