PLAS ’18: Proceedings of the 13th Workshop on Programming Languages and Analysis for Security
SESSION: Invited Talk
Differential privacy offers ways to answer statistical queries about sensitive data while providing strong, provable privacy guarantees: the presence or absence of a single individual in the data has a negligible statistical effect on the query’s result. In this talk I will introduce the basics of differential privacy and some of the fundamental mechanisms for building differentially private programs. I will then overview a few different language-based approaches developed to help programmers certify their programs as differentially private and to guarantee that they provide accurate answers.
SESSION: Session on Differential Privacy
The sensitivity of a function is the maximum change of its output for a unit change of its input. In this paper we present a method for determining the sensitivity of SQL queries, seen as functions from databases to datasets, where the change is measured in the number of rows that differ. Given a query, a database schema and a number, our method constructs a formula that is satisfiable only if the sensitivity of the query is greater than this number. Our method is composable, and can also be applied to SQL workflows. Our results can be used to calibrate the amount of noise that has to be added to the output of the query to obtain a certain level of differential privacy.
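The calibration mentioned above is typically done with the standard Laplace mechanism, where the noise scale is the sensitivity divided by the privacy parameter. The following is a minimal sketch of that idea (not the paper's SMT-based sensitivity analysis); the function names are illustrative.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Inverse-CDF sampling of a centered Laplace(scale) variable.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_count(true_count: int, sensitivity: int, epsilon: float) -> float:
    # Laplace mechanism: noise with scale = sensitivity / epsilon
    # makes the released answer epsilon-differentially private.
    return true_count + laplace_noise(sensitivity / epsilon)
```

Here the sensitivity would come from the paper's analysis of the query; a larger bound forces proportionally more noise for the same epsilon.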
Local differential privacy (LDP) is a variant of differential privacy (DP) where the noise is added directly to the individual records, before they are collected. The main advantage with respect to DP is that no trusted third party is needed to collect and sanitise the users' sensitive data. The main disadvantage is that the trade-off between privacy and utility is usually worse than in DP, and typically, to retrieve reasonably good statistics from the locally sanitised data, it is necessary to have access to a huge collection of them. In this paper, we focus on the problem of estimating counting queries on numerical data, and we propose a variant of LDP based on the addition of geometric noise. This noise function is known to have appealing properties in the case of counting queries. In particular, it is universally optimal for DP, i.e., it provides the best utility for a given level of DP, regardless of the side knowledge of the attacker. We explore the properties of geometric noise for counting queries in the LDP setting, and we conjecture an optimality property, similar to the one that holds in the DP setting.
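The two-sided geometric distribution the abstract refers to assigns probability proportional to alpha^|k| with alpha = exp(-epsilon), and can be sampled as the difference of two i.i.d. geometric variables. A minimal sketch of local sanitisation with this noise (illustrative names, not the paper's protocol):

```python
import math
import random

def geometric_sample(p: float) -> int:
    # Number of failures before the first success (support 0, 1, 2, ...),
    # sampled by inverting the geometric CDF.
    u = random.random()
    return int(math.floor(math.log(1.0 - u) / math.log(1.0 - p)))

def two_sided_geometric_noise(epsilon: float) -> int:
    # P(Z = k) is proportional to alpha^|k| with alpha = exp(-epsilon);
    # the difference of two i.i.d. geometric variables has this law.
    alpha = math.exp(-epsilon)
    p = 1.0 - alpha
    return geometric_sample(p) - geometric_sample(p)

def sanitise_locally(contribution: int, epsilon: float) -> int:
    # In LDP each user perturbs their own contribution (here, a count)
    # before sending it to the untrusted aggregator.
    return contribution + two_sided_geometric_noise(epsilon)
```

Because the noise is integer-valued and symmetric around zero, summing many sanitised contributions still yields an unbiased estimate of the true count.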
SESSION: Session on Information Flow
Recent years have seen a proliferation of research on information flow control. While the progress has been tremendous, it has also given birth to a bewildering breed of concepts, policies, conditions, and enforcement mechanisms. Thus, when designing information flow controls for a new application domain, the designer is confronted with two basic questions: (i) What is the right security characterization for a new application domain? and (ii) What is the right enforcement mechanism for a new application domain?
This paper puts forward six informal principles for designing information flow security definitions and enforcement mechanisms: attacker-driven security, trust-aware enforcement, separation of policy annotations and code, language-independence, justified abstraction, and permissiveness. We particularly highlight the core principles of attacker-driven security and trust-aware enforcement, giving us a rationale for deliberating over soundness vs. soundiness. The principles contribute to roadmapping the state of the art in information flow security, weeding out inconsistencies from the folklore, and providing a rationale for designing information flow characterizations and enforcement mechanisms for new application domains.
This paper presents a simple but equally expressive variant on the terminating fragment of the Dependency Core Calculus (DCC) of Abadi et al. DCC is a concise and elegant calculus for tracking dependency. The calculus has applications in, among other areas, information flow control, slicing, and binding time analysis. However, in this paper we show that it is possible to replace a core technical device in DCC with an alternative, simpler formulation. The calculus has a denotational semantics in the same domain as DCC, using which we prove that the two calculi are equivalent. As a proof of concept to show that our calculus provides a simple analysis of dependency, we implement it in Haskell, obtaining a simpler implementation compared to previous work.
A common security recommendation is to reduce the in-memory lifetime of secret values, in order to reduce the risk that an attacker can obtain secret data by probing memory. To mitigate this risk, secret values can be overwritten, at source level, after their last use. The problem we address here is how to ensure that a compiler preserves these mitigation efforts, and thus that secret values are not easier to obtain at assembly level than at source level. We propose a formal definition of Information Flow Preserving program Transformations in which we model the information leak of a program using the notion of Attacker Knowledge. Program transformations are validated by relating the knowledge of the attacker before and after the transformation. We consider two classic compiler passes (Dead Store Elimination and Register Allocation) and show how to validate and, if needed, modify these transformations in order to make them information flow preserving.
SESSION: Invited Talk
Formal verification of computer networks has become a reality in recent years, with the emergence of a large number of domain-specific property-checking tools. Although early tools were limited to stateless data planes, recent work has shown how to handle richer models including stateful middleboxes and distributed control planes. This talk will survey recent progress in the area, highlighting the key technical challenges that remain and discussing opportunities related to security.
SESSION: Session on Analysis of Binary Code
Finding similar procedures in stripped binaries has various use cases in the domains of cyber security and intellectual property. Previous works have addressed this problem with approaches that either trade throughput for accuracy or tackle a more relaxed problem.
In this paper, we present a cross-compiler-and-architecture approach for detecting similarity between binary procedures, which achieves both high accuracy and peerless throughput. For this purpose, we employ machine learning alongside similarity by composition: we decompose the code into smaller comparable fragments, transform these fragments to vectors, and build machine learning-based predictors for detecting similarity between vectors that originate from similar procedures.
We implement our approach in a tool called Zeek and evaluate it by searching for similarities in open-source projects that we crawl from the web. Our results show that we perform 250X faster than state-of-the-art tools without harming accuracy.
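The decompose-vectorize-compare pipeline described above can be sketched in miniature. The following toy stands in for Zeek's actual method: overlapping instruction windows play the role of the decomposed fragments, token-count vectors replace the learned embeddings, and cosine similarity replaces the trained predictor. All names and parameters here are illustrative assumptions.

```python
import math
from collections import Counter

def fragments(instructions, size=3):
    # Decompose a procedure (a list of instruction mnemonics) into
    # overlapping fixed-size fragments - a crude stand-in for
    # similarity-by-composition's decomposition step.
    return [tuple(instructions[i:i + size])
            for i in range(len(instructions) - size + 1)]

def vectorize(fragment):
    # Map a fragment to a sparse token-count vector.
    return Counter(fragment)

def cosine(v1, v2):
    dot = sum(v1[t] * v2[t] for t in v1)
    n1 = math.sqrt(sum(c * c for c in v1.values()))
    n2 = math.sqrt(sum(c * c for c in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def procedure_similarity(proc_a, proc_b):
    # Score procedure A against B by the best match of each fragment
    # of A among the fragments of B, averaged over A's fragments.
    frags_b = [vectorize(f) for f in fragments(proc_b)]
    scores = [max(cosine(vectorize(fa), fb) for fb in frags_b)
              for fa in fragments(proc_a)]
    return sum(scores) / len(scores)
```

A real cross-compiler, cross-architecture tool would of course normalize the instructions semantically and learn the fragment comparison rather than hard-code it; this sketch only shows the compositional structure of the comparison.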
Context-Sensitive Flow Graph and Projective Single Assignment Form for Resolving Context-Dependency of Binary Code
Program analysis on binary code is considered difficult because one has to resolve the destinations of indirect jumps. However, there is another difficulty, context-dependency, that matters when one processes binary programs that are not compiler-generated. In this paper, we propose a novel approach to tackling these difficulties and describe a way to reconstruct control flow from a binary program with no assumptions beyond the operational meaning of machine instructions.