Paper accepted at IMC ’20.

A Haystack Full of Needles: Scalable Detection of IoT Devices in the Wild.

Said Jawad Saidi (1), Anna Maria Mandalari (2), Roman Kolcun (2), Hamed Haddadi (2), Daniel J. Dubois (3), David Choffnes (3), Georgios Smaragdakis (4,1), Anja Feldmann (1,5)

  1. Max Planck Institute for Informatics
  2. Imperial College London
  3. Northeastern University
  4. TU Berlin
  5. Saarland University

Consumer Internet of Things (IoT) devices are extremely popular, providing users with rich and diverse functionalities, from voice assistants to home appliances. These functionalities often come with significant privacy and security risks, with notable recent large-scale coordinated global attacks disrupting large service providers. Thus, an important first step to address these risks is to know what IoT devices are where in a network. While some limited solutions exist, a key question is whether device discovery can be done by Internet service providers that only see sampled flow statistics. In particular, it is challenging for an ISP to efficiently and effectively track and trace activity from IoT devices deployed by its millions of subscribers—all with sampled network data.

In this paper, we develop and evaluate a scalable methodology to accurately detect and monitor IoT devices at subscriber lines with limited, highly sampled data in-the-wild.
Our findings indicate that millions of IoT devices are detectable and identifiable within hours, at both a major ISP and an IXP, using passive, sparsely sampled network flow headers. Our methodology is able to detect devices from more than 77% of the studied IoT manufacturers, including popular devices such as smart speakers. While our methodology is effective for providing network analytics, it also highlights significant privacy consequences.


About this publication

  • Title: A Haystack Full of Needles: Scalable Detection of IoT Devices in the Wild
  • Authors: Said Jawad Saidi (Max Planck Institute for Informatics), Anna Maria Mandalari (Imperial College London), Roman Kolcun (Imperial College London), Hamed Haddadi (Imperial College London), Daniel J. Dubois (Northeastern University), David Choffnes (Northeastern University), Georgios Smaragdakis (TU Berlin, Max Planck Institute for Informatics), Anja Feldmann (Max Planck Institute for Informatics, Saarland University)
  • Venue: Internet Measurement Conference (IMC) 2020
  • Download Full Text (PDF)
  • Citation:
    @inproceedings{saidi-imc20,
    title={{A Haystack Full of Needles: Scalable Detection of IoT Devices in the Wild}},
    author={Saidi, Said Jawad and Mandalari, Anna Maria and Kolcun, Roman and Haddadi, Hamed and Dubois, Daniel J. and Choffnes, David and Smaragdakis, Georgios and Feldmann, Anja},
    booktitle={Proc. of the Internet Measurement Conference (IMC)},
    year={2020}
    }

Tools and dataset

To generate IoT activity and capture ground-truth IoT data, we used the Mon(IoT)r Testbed, software designed to facilitate, organize, and automate the capture of network traffic for IoT devices deployed on a local network or over a VPN. For more information on our testbed, or to deploy it yourself for your own IoT experiments, visit the dedicated page on this website.

To foster further research in the area of IoT privacy and security, we made all the signatures available in our public GitHub repository: https://github.com/IoTrim/iot_wild.