On the Applicability of Network Digital Twins in Generating Synthetic Data for Heavy Hitter Discrimination

Amit Karamchandani, Javier Nunez, Luis de la Cal, Yenny Moreno, Alberto Mozo, Antonio Pastor

September 2025

Abstract

Differentiating between benign and malicious heavy hitter (HH) flows is a major challenge for telecommunication infrastructure, as these flows cause significant congestion and degrade quality of service. Accurate identification is essential for effective mitigation, but current methods lack the granularity needed to distinguish between legitimate activity and malicious distributed denial-of-service (DDoS) traffic. Applying machine learning (ML) to this problem requires a large labeled dataset, yet obtaining such datasets from live networks is infeasible due to privacy policies and operational constraints. To address these challenges, a network digital twin (NDT) is proposed as an innovative approach to generate synthetic labeled data for ML applications tailored to complex network problems by emulating diverse and realistic network environments and traffic conditions. To demonstrate this approach, Telefonica’s Mouseworld NDT is extended for automated data collection and labeling of benign and malicious HH flows along with normal traffic. Results show that an ML model trained on this NDT-generated data accurately detects benign and malicious HH flows, validating the effectiveness of the proposed approach in creating realistic labeled data for applying ML to complex network management solutions. For reproducibility and further research, the dataset and code are openly available.

Publication

IEEE Communications Magazine

Real-Time Systems; Labeling; Data Models; Adaptation Models; Training; Emulation; Data Collection; Synthetic Data; Prevention and Mitigation; Accuracy

Amit Karamchandani

Predoctoral Researcher

Luis de la Cal

Predoctoral Researcher

Alberto Mozo

Head of the research group
Full professor
