Publications

The DARPA SEARCHLIGHT Dataset of Application Network Traffic

Abstract

Researchers are in constant need of reliable data to develop and evaluate AI/ML methods for networks and cybersecurity. While Internet measurements can provide realistic data, such datasets lack ground truth about application flows. We present a ~750 GB dataset that includes ~2,000 systematically conducted experiments and the resulting packet captures with video streaming, video teleconferencing, and cloud-based document editing applications. This curated and labeled dataset has bidirectional and encrypted traffic with complete ground truth that can be widely used for assessments and evaluation of AI/ML algorithms.

Metadata

publication
Proceedings of the 15th Workshop on Cyber Security Experimentation and Test …, 2022
year
2022
publication date
2022/8/8
authors
Calvin Ardi, Connor Aubry, Brian Kocoloski, Dave DeAngelis, Alefiya Hussain, Matt Troglia, Stephen Schwab
link
https://dl.acm.org/doi/abs/10.1145/3546096.3546103
resource_link
https://dl.acm.org/doi/pdf/10.1145/3546096.3546103
book
Proceedings of the 15th Workshop on Cyber Security Experimentation and Test
pages
59-64