ACM CCS 2020 - November 9-13, 2020
AISec'20: Proceedings of the 13th ACM Workshop on Artificial Intelligence and SecurityFull Citation in the ACM Digital Library
SESSION: Session 1: Adversarial Machine Learning
This paper aims to provide a thorough study on the effectiveness of the transformation-based ensemble defence for image classification and its reasons. It has been empirically shown that they can enhance the robustness against evasion attacks, while there is little analysis on the reasons. In particular, it is not clear whether the robustness improvement is a result of transformation or ensemble. In this paper, we design two adaptive attacks to better evaluate the transformation-based ensemble defence. We conduct experiments to show that 1) the transferability of adversarial examples exists among the models trained on data records after different reversible transformations; 2) the robustness gained through transformation-based ensemble is limited; 3) this limited robustness is mainly from the irreversible transformations rather than the ensemble of a number of models; and 4) blindly increasing the number of sub-models in a transformation-based ensemble does not bring extra robustness gain.
Convolutional Neural Networks (CNNs) are deployed in more and more classification systems, but adversarial samples can be maliciously crafted to trick them, and are becoming a real threat. There have been various proposals to improve CNNs' adversarial robustness but these all suffer performance penalties or have other limitations. In this paper, we offer a new approach in the form of a certifiable adversarial detection scheme, the Certifiable Taboo Trap (CTT). This system, in theory, can provide certifiable guarantees of detectability of a range of adversarial inputs for certain l-∞ sizes. We develop and evaluate several versions of CTT with different defense capabilities, training overheads and certifiability on adversarial samples. In practice, against adversaries with various l-p norms, CTT outperforms existing defense methods that focus purely on improving network robustness. We show that CTT has small false positive rates on clean test data, minimal compute overheads when deployed, and can support complex security policies.
E-ABS: Extending the Analysis-By-Synthesis Robust Classification Model to More Complex Image Domains
Conditional generative models, such as Schott et al.'s Analysis-by-Synthesis (ABS), have state-of-the-art robustness on MNIST, but fail in more challenging datasets. In this paper, we present E-ABS, an improvement on ABS that achieves state-of-the-art robustness on SVHN. E-ABS gives more reliable class-conditional likelihood estimations on both in-distribution and out-of-distribution samples than ABS. Theoretically, E-ABS preserves ABS's key features for robustness; thus, we show that E-ABS has similar certified robustness as ABS. Empirically, E-ABS outperforms both ABS and adversarial training on SVHN and a traffic sign dataset, achieving state-of-the-art robustness on these two real-world tasks. Our work shows a connection between ABS-like models and some recent advances on generative models, suggesting that ABS-like models are a promising direction for defending adversarial examples.
SCRAP: Synthetically Composed Replay Attacks vs. Adversarial Machine Learning Attacks against Mouse-based Biometric Authentication
Adversarial attacks have gained popularity recently due to their simplicity and impact. Their applicability to diverse security scenarios is however less understood. In particular, in some scenarios, attackers may come up naturally with ad-hoc black-box attack techniques inspired directly on characteristics of the problem space rather than using generic adversarial techniques. In this paper we explore an intuitive attack technique for Mouse-based Behavioral Biometrics and compare its effectiveness against adversarial machine learning attacks. We show that attacks leveraging on domain knowledge have higher transferability when applied to various machine-learning techniques and are also more difficult to defend against. We also propose countermeasures against such attacks and discuss their effectiveness.
SESSION: Session 2: Malware Detection
Machine learning (ML) techniques are being used to detect increasing amounts of malware and variants. Despite successful applications of ML, we hypothesize that the full potential of ML is not realized in malware analysis (MA) due to a semantic gap between the ML and MA communities---as demonstrated in the data that is used. Due in part to the available data, ML has primarily focused on detection whereas MA is also interested in identifying behaviors. We review existing open-source malware datasets used in ML and find a lack of behavioral information that could facilitate stronger impact by ML in MA. As a first step in bridging this gap, we label existing data with behavioral information using open-source MA reports---1) altering the analysis from identifying malware to identifying behaviors, 2)~aligning ML better with MA, and 3)~allowing ML models to generalize to novel malware in a zero/few-shot learning manner. We classify the behavior of a malware family not seen during training using transfer learning from a state-of-the-art model for malware family classification and achieve 57% - 84% accuracy on behavioral identification but fail to outperform the baseline set by a majority class predictor. This highlights opportunities for improvement on this task related to the data representation, the need for malware specific ML techniques, and a larger training set of malware samples labeled with behaviors.
Training classifiers that are robust against adversarially modified examples is becoming increasingly important in practice. In the field of malware detection, adversaries modify malicious binary files to seem benign while preserving their malicious behavior. We report on the results of a recently held robust malware detection challenge. There were two tracks in which teams could participate: the attack track asked for adversarially modified malware samples and the defend track asked for trained neural network classifiers that are robust to such modifications. The teams were unaware of the attacks/defenses they had to detect/evade. Although only 9 teams participated, this unique setting allowed us to make several interesting observations.
We also present the challenge winner: GRAMS, a family of novel techniques to train adversarially robust networks that preserve the intended (malicious) functionality and yield high-quality adversarial samples. These samples are used to iteratively train a robust classifier. We show that our techniques, based on discrete optimization techniques, beat purely gradient-based methods. GRAMS obtained first place in both the attack and defend tracks of the competition.
Yara rules are a ubiquitous tool among cybersecurity practitioners and analysts. Developing high-quality Yara rules to detect a malware family of interest can be labor- and time-intensive, even for expert users. Few tools exist and relatively little work has been done on how to automate the generation of Yara rules for specific families. In this paper, we leverage large n-grams (n ≥ 8) combined with a new biclustering algorithm to construct simple Yara rules more effectively than currently available software. Our method, AutoYara, is fast, allowing for deployment on low-resource equipment for teams that deploy to remote networks. Our results demonstrate that AutoYara can help reduce analyst workload by producing rules with useful true-positive rates while maintaining low false-positive rates, sometimes matching or even outperforming human analysts.In addition, real-world testing by malware analysts indicates AutoYara could reduce analyst time spent constructing Yara rules by 44-86%, allowing them to spend their time on the more advanced malware that current tools can't handle. Code will be made available at https://github.com/NeuromorphicComputationResearchProgram.
State of the art deep learning techniques are known to be vulnerable to evasion attacks where an adversarial sample is generated from a malign sample and misclassified as benign.
Detection of encrypted malware command and control traffic based on TCP/IP flow features can be framed as a learning task and is thus vulnerable to evasion attacks. However, unlike e.g. in image processing where generated adversarial samples can be directly mapped to images, going from flow features to actual TCP/IP packets requires crafting the sequence of packets, with no established approach for such crafting and a limitation on the set of modifiable features that such crafting allows.In this paper we discuss learning and evasion consequences of the gap between generated and crafted adversarial samples. We exemplify with a deep neural network detector trained on a public C2 traffic dataset, white-box adversarial learning, and a proxy-based approach for crafting longer flows. Our results show 1) the high evasion rate obtained by using generated adversarial samples on the detector can be significantly reduced when using crafted adversarial samples; 2) robustness against adversarial samples by model hardening varies according to the crafting approach and corresponding set of modifiable features that the attack allows for; 3) incrementally training hardened models with adversarial samples can produce a level playing field where no detector is best against all attacks and no attack is best against all detectors, in a given set of attacks and detectors. To the best of our knowledge this is the first time that level playing field feature set- and iteration-hardening are analyzed in encrypted C2 malware traffic detection.
SESSION: Session 3: Machine Learning for Security and Privacy
Outsourcing machine learning inference creates a confidentiality dilemma: either the client has to trust the server with potentially sensitive input data, or the server has to share his commercially valuable model. Known remedies include homomorphic encryption, multi-party computation, or placing the entire model in a trusted enclave. None of these are suitable for large models. For two relevant use cases, we show that it is possible to keep all confidential model parameters in the last (dense) layers of deep neural networks. This allows us to split the model such that the confidential parts fit into a trusted enclave on the client side. We present the eNNclave toolchain to cut TensorFlow models at any layer, splitting them into public and enclaved layers. This preserves TensorFlow's performance optimizations and hardware support for public layers, while keeping the parameters of the enclaved layers private. Evaluations on several machine learning tasks spanning multiple domains show that fast inference is possible while keeping the sensitive model parameters confidential. Accuracy results are close to the baseline where all layers carry sensitive information and confirm our approach is practical.
Impersonation attacks against web authentication servers have been increasing in complexity over the last decade. Tunnelling services, such as VPNs or proxies, can be for instance used to faithfully impersonate victims in foreign countries. In this paper we study the detection of user authentication attacks involving network tunnelling geolocation deception. For that purpose we explore different models to profile a user based on network latencies. We design a classical machine learning model and a deep learning model to profile web resource loading times collected on client-side. In order to test our approach we profiled network latencies for 86 real users located around the globe. We show that our proposed novel network profiling is able to detect up to 88.3% of attacks using VPN tunneling schemes
Disabling Backdoor and Identifying Poison Data by using Knowledge Distillation in Backdoor Attacks on Deep Neural Networks
Backdoor attacks are poisoning attacks and serious threats to deep neural networks. When an adversary mixes poison data into a training dataset, the training dataset is called a poison training dataset. A model trained with the poison training dataset becomes a backdoor model and it achieves high stealthiness and attack-feasibility. The backdoor model classifies only a poison image into an adversarial target class and other images into the correct classes. We propose an additional procedure to our previously proposed countermeasure against backdoor attacks by using knowledge distillation. Our procedure removes poison data from a poison training dataset and recovers the accuracy of the distillation model. Our countermeasure differs from previous ones in that it does not require detecting and identifying backdoor models, backdoor neurons, and poison data. A characteristic assumption in our defense scenario is that the defender can collect clean images without labels. A defender distills clean knowledge from a backdoor model (teacher model) to a distillation model (student model) with knowledge distillation. Subsequently, the defender removes poison-data candidates from the poison training dataset by comparing the predictions of the backdoor and distillation models. The defender fine-tunes the distillation model with the detoxified training dataset to improve classification accuracy. We evaluated our countermeasure by using two datasets. The backdoor is disabled by distillation and fine-tuning further improves the classification accuracy of the distillation model. The fine-tuning model achieved comparable accuracy to a baseline model when the number of clean images for a distillation dataset was more than 13% of the training data. Our results indicate that our countermeasure can be applied for general image-classification tasks and that it works well whether the defender's received training dataset is a poison dataset or not.