As we think about enhancing software supply chain security, what does the landscape of threats and opportunities look like? What are useful ways for framing the problem, and how does the industry view the challenge? Where do responsibilities lie? Who has the power to make positive changes or to act with malice? And most importantly, what are the roles and responsibilities of industry, academia, government, and the open source community at large? In this keynote, industry veteran Trevor Rosen will offer some answers to these questions, born of his time at the center of the SolarWinds/SUNBURST breach and his experience in standing up a new supply chain integrity practice at GitHub. You can expect to hear some war stories, some strong opinions, and to walk away inspired to join hands with colleagues from all over the technical landscape to solve a huge (but tractable!) problem in information security.
Building reliable software is challenging because today's software supply chains are built and secured by tools and individuals from a broad range of organizations with complex trust relationships. In this setting, tracking the origin of each piece of software and understanding the security and privacy implications of using it is essential. In this work we aim to secure software supply chains by using verifiable policies in which the origin of information and the trust assumptions are first-order concerns and abusive evidence is discoverable. To do so, we propose Policy Transparency, a new paradigm in which policies are based on authorization logic and all claims issued in this policy language are made transparent by inclusion in a transparency log. Achieving this goal in a real-world setting is non-trivial, and to do so we propose a novel software architecture called PolyLog. We find that this combination of authorization logic and transparency logs is mutually beneficial: transparency logs allow authorization logic claims to be widely available, aiding in the discovery of abuse, and making claims interpretable with policies allows misbehavior captured in the transparency logs to be handled proactively.
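To make the idea of logging claims more concrete, the following is a minimal sketch, not the PolyLog architecture itself. It assumes a claim can be modeled as a small record in which a principal "says" a statement, and it uses a simple hash chain in place of the Merkle-tree structures that production transparency logs typically rely on. The names TransparencyLog, append, verify, and claims_by are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _entry_hash(prev_hash, claim):
    # Canonical JSON so the same claim always hashes to the same value.
    payload = json.dumps(claim, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class TransparencyLog:
    """Append-only, hash-chained list of claims (illustrative only)."""

    def __init__(self):
        self.entries = []  # list of (claim, entry_hash) pairs

    def append(self, principal, statement):
        # Hypothetical claim shape: "<principal> says <statement>".
        claim = {"principal": principal, "says": statement}
        prev = self.entries[-1][1] if self.entries else GENESIS
        digest = _entry_hash(prev, claim)
        self.entries.append((claim, digest))
        return digest

    def verify(self):
        # Recompute the chain; any edited or reordered entry breaks it.
        prev = GENESIS
        for claim, digest in self.entries:
            if _entry_hash(prev, claim) != digest:
                return False
            prev = digest
        return True

    def claims_by(self, principal):
        # Discovery: anyone reading the log can list what a principal claimed.
        return [c["says"] for c, _ in self.entries if c["principal"] == principal]
```

Because every claim is chained to its predecessor, a policy evaluator that reads the log can both check integrity with verify() and hold a principal to everything it has publicly claimed via claims_by().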
This paper systematizes knowledge about secure software supply chain patterns. It identifies four stages of a software supply chain attack and proposes three security properties crucial for a secured supply chain: transparency, validity, and separation. The paper describes current security approaches and maps them to the proposed security properties, including research ideas and case studies of supply chains in practice. It discusses the strengths and weaknesses of current approaches relative to known attacks and details the various security frameworks put out to ensure the security of the software supply chain. Finally, the paper highlights potential gaps in actor- and operation-centered supply chain security techniques.
The world is currently strongly connected through both the internet at large and the very supply chains which provide everything from food to infrastructure and technology. The supply chains are themselves vulnerable to adversarial attacks, both in a digital and physical sense, which can disrupt or at worst destroy them. In this paper, we take a look at two examples of such successful attacks to put the idea of Supply Chain Attacks into perspective, and analyse how EU and national law can prevent these attacks or otherwise punish companies which do not try to mitigate them at all possible costs. We find that the current types of national regulation are not technology specific enough, and cannot force or otherwise mandate the correct parties who could play the biggest role in preventing supply chain attacks to do everything in their power to mitigate them. However, current EU law is on the right path, and further development of this may be what is necessary to combat these large threats, as national law may fail at properly regulating companies when it comes to cybersecurity.
Supply chain attacks on open-source projects aim at injecting and spreading malicious code such that it is executed by direct and indirect downstream users. Recent work systematized the knowledge about such attacks and proposed a taxonomy in the form of an attack tree. We propose a visualization tool called Risk Explorer for Software Supply Chains, which allows inspecting the taxonomy of attack vectors, their descriptions, references to real-world incidents and other literature, as well as information about associated safeguards. Being open-source itself, the community can easily reference new attacks, accommodate for entirely new attack vectors or reflect the development of new safeguards.
The demand for quick and reliable DevOps operations pushed distributors of repository platforms to implement workflows. Workflows allow automating code management operations directly on the repository hosting the software. However, this feature also introduces security issues that directly affect the repository, its content, and all the software supply chains in which the hosted code is involved. Hence, an attack exploiting vulnerable workflows can disruptively affect large software ecosystems. To empirically assess the importance of this problem, in this paper, we focus on the de-facto main distributor (i.e., GitHub). We developed a security assessment methodology for GitHub Actions workflows, which are widely adopted in software supply chains. We implemented the methodology in a tool (GHAST) and applied it to 50 open-source projects. The experimental results are worrisome as they allowed identifying a total of 24,905 security issues (all reported to the corresponding stakeholders), thereby indicating that the problem is open and demands further research and investigation.
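The abstract does not enumerate the checks GHAST performs, so the following is only an illustrative sketch of one kind of workflow issue that is widely discussed: referencing a third-party action by a mutable tag or branch instead of a full commit SHA. The function name unpinned_actions and the line-based matching are assumptions made for brevity; a real tool would parse the YAML properly.

```python
import re

# Matches "uses: owner/repo@ref" steps, with or without a leading list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^'\"\s#]+)")
# A full-length Git commit SHA-1 is 40 lowercase hex characters.
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return (line_number, reference) for actions not pinned to a commit SHA."""
    findings = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1)
        # Local actions and container images are out of scope in this sketch.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            findings.append((lineno, ref))
    return findings
```

Running such a check over every file under .github/workflows in a repository gives a rough count of one issue class; the totals reported above come from the authors' own methodology, not from this sketch.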
Development teams are increasingly investing in automating the updating of third-party libraries to limit the patch time of zero-day exploits such as the Equifax breach. GitHub bots such as Dependabot and Renovate build such functionality by leveraging existing test infrastructure in repositories to test and evaluate new library updates. However, two recent studies suggest that test suites in projects lack effectiveness and coverage to reliably find regressions in third-party libraries. Adequate test coverage and effectiveness are critical in discovering new vulnerabilities and weaknesses from third-party libraries. The recent Log4Shell incident exemplifies this, as projects will likely not have adequate tests for logging libraries. This position paper discusses the weaknesses and challenges of current testing practices and techniques from a supply chain security perspective. We highlight two key challenges that researchers and practitioners need to address: (1) the lack of resources and best practices for testing the uses of third-party libraries and (2) enhancing the reliability of automating library updates.
The insertion of trojanised binaries into supply chains is a particularly subtle form of cyber-attack that requires a multi-staged and complex deployment methodology to implement and execute. In the years preceding this research there has been a spike in closed-source software supply chain attacks used to attack downstream clients or users of a company. To detect this attack type, we present an approach to detecting the insertion of malicious functionality in supply chains via differential analysis of binaries. This approach determines whether malicious functionality has been inserted in a particular build by looking for indicators of maliciousness. We accomplish this via automated comparison of a known benign build to successive potentially malicious versions. To substantiate this approach we present a system, Exorcist, that we have designed, developed and evaluated as capable of detecting trojanised binaries in Windows software supply chains. In evaluating this system we analyse 12 samples from high-profile APT attacks conducted via the software supply chain.
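The comparison of a known benign build against a later build can be sketched as follows. This is a simplified illustration, not Exorcist's implementation: it assumes printable strings are the only feature compared, and the INDICATORS list is a hypothetical stand-in for whatever indicators of maliciousness a real system would use.

```python
import re

# Hypothetical indicators; a real system would use a much richer feature set.
INDICATORS = [
    re.compile(rb"https?://"),
    re.compile(rb"cmd\.exe|powershell", re.IGNORECASE),
    re.compile(rb"CreateRemoteThread|VirtualAllocEx"),
]


def printable_strings(blob, min_len=6):
    """Return the set of printable ASCII runs of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode("ascii") + rb",}")
    return set(pattern.findall(blob))


def suspicious_additions(baseline, candidate):
    """Strings present only in the candidate build that match an indicator."""
    added = printable_strings(candidate) - printable_strings(baseline)
    return sorted(s for s in added if any(p.search(s) for p in INDICATORS))
```

Only strings that are new relative to the benign build are scored, which is the core of the differential idea: functionality that already existed in the trusted baseline is not reported again.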
Open-source software supply chain attacks aim at infecting downstream users by poisoning open-source packages. The common way of consuming such artifacts is through package repositories, and the development of vetting strategies to detect such attacks is ongoing research. Despite its popularity, the Java ecosystem is the least explored one in the context of supply chain attacks. In this paper, we present simple-yet-effective indicators of malicious behavior that can be observed statically through the analysis of Java bytecode. Then we evaluate how such indicators and their combinations perform when detecting malicious code injections. We do so by injecting three malicious payloads taken from real-world examples into the Top-10 most popular Java libraries from libraries.io. We found that the analysis of strings in the constant pool and of sensitive APIs in the bytecode instructions aids in the task of detecting malicious Java packages by significantly reducing the information to inspect, thus also making manual triage possible.
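Reading strings out of the constant pool can be sketched with a small class-file parser. This is an assumption-laden illustration rather than the authors' tooling: the SENSITIVE_MARKERS list is hypothetical, and matching class names in the constant pool only approximates the analysis of sensitive APIs in bytecode instructions described above. The entry sizes follow the class file format in the JVM specification.

```python
import struct

# Bytes that follow the tag for each fixed-size constant pool entry.
FIXED_SIZES = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4,
               12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}

# Hypothetical markers for APIs that can run commands or load remote code.
SENSITIVE_MARKERS = ("java/lang/Runtime", "java/lang/ProcessBuilder",
                     "java/net/URLClassLoader", "java/net/Socket")


def constant_pool_utf8(class_bytes):
    """Return every CONSTANT_Utf8 value in a class file's constant pool."""
    magic, _minor, _major, count = struct.unpack_from(">IHHH", class_bytes, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    offset, index, strings = 10, 1, []
    while index < count:
        tag = class_bytes[offset]
        offset += 1
        if tag == 1:  # CONSTANT_Utf8: u2 length followed by the bytes
            (length,) = struct.unpack_from(">H", class_bytes, offset)
            offset += 2
            raw = class_bytes[offset:offset + length]
            strings.append(raw.decode("utf-8", errors="replace"))
            offset += length
        elif tag in FIXED_SIZES:
            offset += FIXED_SIZES[tag]
        else:
            raise ValueError(f"unknown constant pool tag {tag}")
        # Long and Double entries occupy two constant pool slots.
        index += 2 if tag in (5, 6) else 1
    return strings


def flag_indicators(class_bytes):
    """Constant pool strings that mention a sensitive marker."""
    return [s for s in constant_pool_utf8(class_bytes)
            if any(marker in s for marker in SENSITIVE_MARKERS)]
```

Applied to each .class entry of a JAR (for example via the zipfile module), the short list returned by flag_indicators is the kind of reduced output that makes manual triage practical.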
Improper input validation is still one of the most severe problem classes in web application security, although there are concepts with a good problem-solution fit, such as static taint analysis. In practice, however, existing static taint analyzers suffer from both high false positive and false negative rates, making them impractical for effective detection of new vulnerabilities. In this work, we present an approach that aims to systematically specialize existing taint analyzers toward software marketplaces to improve both recall and precision of their analyses. To validate whether our approach is suitable for finding new vulnerabilities in web applications, we applied a specialized taint analyzer to a random sample of 1,000 plugins from the WordPress plugin store. As a result, we were able to disclose ten CVE entries, including two vulnerabilities with a high or even critical CVSS score. Our preliminary results indicate the feasibility in principle of our approach and show that it may be suitable for mass vulnerability detection in software marketplaces, providing a promising foundation for future work in this domain.
While vulnerability research often focuses on technical findings and post-public release industrial response, we provide an analysis of the rest of the story: the coordinated disclosure process from discovery through public release. The industry-wide 'Trojan Source' vulnerability which affected most compilers, interpreters, code editors, and code repositories provided an interesting natural experiment, enabling us to compare responses by firms versus nonprofits and by firms that managed their own response versus firms that outsourced it. We document the interaction with bug bounty programs, government disclosure assistance, academic peer review, and press coverage, among other topics. We compare the response to an attack on source code with the response to a comparable attack on NLP systems employing machine-learning techniques. We conclude with recommendations to improve the global coordinated disclosure system.
Smart home IoT devices are known to be breeding grounds for security and privacy vulnerabilities. Although some IoT vendors deploy updates, the update process is mostly opaque to researchers. It is unclear what software components are on devices, whether and when these components are updated, and how vulnerabilities change alongside the updates. This opaqueness makes it difficult to understand the security of software supply chains of IoT devices. To understand the software update practices on IoT devices, we leverage IoT Inspector's dataset of network traffic from real-world IoT devices. We analyze the User Agent strings from plain-text HTTP connections. We focus on four software components included in User Agents: cURL, Wget, OkHttp, and python-requests. By keeping track of what kinds of devices have which of these components at what versions, we find that many IoT devices potentially used outdated and vulnerable versions of these components (based on the User Agents) even though less vulnerable, more updated versions were available; and that the rollout of updates tends to be slow for some IoT devices.
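The version tracking described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the regular expressions assume the common "name/version" token layout, and the observation tuple (device_id, timestamp, user_agent) and the function names are hypothetical.

```python
import re
from collections import defaultdict

# The four components named in the abstract; the patterns are assumptions
# about typical User Agent tokens such as "curl/7.58.0".
COMPONENT_PATTERNS = {
    "cURL": re.compile(r"\bcurl/(\d+(?:\.\d+)*)", re.IGNORECASE),
    "Wget": re.compile(r"\bwget/(\d+(?:\.\d+)*)", re.IGNORECASE),
    "OkHttp": re.compile(r"\bokhttp/(\d+(?:\.\d+)*)", re.IGNORECASE),
    "python-requests": re.compile(r"\bpython-requests/(\d+(?:\.\d+)*)",
                                  re.IGNORECASE),
}


def extract_components(user_agent):
    """Map component name to a version tuple, e.g. {"cURL": (7, 58, 0)}."""
    found = {}
    for name, pattern in COMPONENT_PATTERNS.items():
        match = pattern.search(user_agent)
        if match:
            found[name] = tuple(int(part) for part in match.group(1).split("."))
    return found


def track_versions(observations):
    """Build a per-device, per-component timeline of version changes.

    observations: iterable of (device_id, timestamp, user_agent) tuples.
    """
    history = defaultdict(list)
    for device_id, timestamp, user_agent in sorted(observations, key=lambda o: o[1]):
        for name, version in extract_components(user_agent).items():
            timeline = history[(device_id, name)]
            # Record only the first sighting of each new version.
            if not timeline or timeline[-1][1] != version:
                timeline.append((timestamp, version))
    return dict(history)


def is_outdated(version, latest):
    # Tuples compare element by element, so (7, 58, 0) < (7, 80, 0).
    return version < latest
```

Comparing the last version in each timeline against the newest release available at that timestamp (data that would have to come from an external source) gives the kind of "outdated at time of observation" signal the study reports.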
Deep neural networks achieve state-of-the-art performance on many tasks, but require increasingly complex architectures and costly training procedures. Engineers can reduce costs by reusing a pre-trained model (PTM) and fine-tuning it for their own tasks. To facilitate software reuse, engineers collaborate around model hubs, collections of PTMs and datasets organized by problem domain. Although model hubs are now comparable in popularity and size to other software ecosystems, the associated PTM supply chain has not yet been examined from a software engineering perspective. We present an empirical study of artifacts and security features in 8 model hubs. We indicate the potential threat models and show that the existing defenses are insufficient for ensuring the security of PTMs. We compare PTM and traditional supply chains, and propose directions for further measurements and tools to increase the reliability of the PTM supply chain.