Martino Trevisan - Teaching and Thesis

Teaching (2022/2023)

Opportunities Abroad

Impact of novel network protocols (QUIC, MASQUE and HTTP/3): @École Normale Supérieure de Lyon, more details here
- Other topics include:
  - Performance Evaluation of Admission Policies for Edge Compute Systems
  - Stressing systems security through on-the-fly network traffic generation using generative models
  - Measurements of the impact on performance of Apple Private Relay on mobile users
  - Video streaming quality inference at 100 Gbps using subsets of flows
Towards Scalable and Robust Solutions for Complex Distributed Systems: Machine Learning, Coordination Semantics, and Infrastructure Challenges: @TU Wien, more details here and here
- Other topics include:
  - ML for predictive/proactive scaling in the cloud
  - Edge ML
  - Serverless Computing
  - Security for distributed systems
  - Fairness, accountability and transparency in ML
Novel Algorithms for Full-Stack Observability: @Cisco Systems, more details here
Applications of Large Language Models to solve complex reasoning tasks: @NEC Labs (Heidelberg), more details here and here
- Other topics include:
  - Development of advanced interfaces for cyber threat intelligence leveraging generative AI and advanced machine learning techniques for data visualisation and analysis
  - Application of AI agent technologies to the investigation of cybersecurity incidents and intelligence, or use of AI agents for red team and penetration testing activities
  - Development of algorithmic and optimisation techniques for machine learning frameworks, such as PyTorch

Thesis Proposals

Network Trace Compression for ML-based Traffic Classification

Knowing domain names associated with traffic allows eavesdroppers to profile users without accessing packet payloads. Encrypting domain names transiting the network is, therefore, a key step to increase network confidentiality. The latest efforts include encrypting the TLS Server Name Indication (Encrypted Client Hello extension) and encrypting DNS traffic, with DNS over HTTPS (DoH) representing a prominent proposal.

Nevertheless, recent work shows that, using simple features and off-the-shelf machine learning models, a network administrator or even an attacker can uncover the domain names visited by users relying on ECH or DoH. The biggest challenge, however, is the storage and processing of large-scale network traces, whose volume can easily reach several GB per day even for a mid-sized organization.

The goal of the thesis is to design, implement, and evaluate compression techniques that reduce the size of network traces while preserving good accuracy in the domain classification task. The thesis will leverage operational per-TCP-connection log files, which include rich features such as packet size and timing. Lossy techniques based on quantization will be explored as a first choice, but more advanced approaches based on NN models can also be considered.
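As a rough illustration of the quantization idea, the sketch below maps packet sizes to fixed-width bins and inter-arrival times to logarithmic bins, so that each value can be stored as a small integer. The function names, the 64-byte step, and the base-2 time bins are assumptions made for this example only; they are not taken from the project, and the actual log format and features may differ.

```python
import math

SIZE_STEP = 64  # bytes per bin (assumed value, to be tuned experimentally)

def quantize_sizes(sizes, step=SIZE_STEP):
    """Map each packet size (bytes) to the index of its fixed-width bin."""
    return [s // step for s in sizes]

def dequantize_sizes(bins, step=SIZE_STEP):
    """Reconstruct approximate sizes as the midpoint of each bin."""
    return [b * step + step // 2 for b in bins]

def quantize_gaps(gaps_ms):
    """Map each inter-arrival time (ms) to a base-2 logarithmic bin.

    Gaps below 1 ms share bin 0; bin n >= 1 covers [2**(n-1), 2**n) ms.
    """
    return [0 if g < 1.0 else 1 + int(math.log2(g)) for g in gaps_ms]

# Hypothetical per-connection features: first packet sizes and time gaps.
sizes = [40, 1500, 576, 60]
gaps_ms = [0.2, 3.0, 50.0, 700.0]

size_bins = quantize_sizes(sizes)             # [0, 23, 9, 0]
gap_bins = quantize_gaps(gaps_ms)             # [0, 2, 6, 10]
approx_sizes = dequantize_sizes(size_bins)    # [32, 1504, 608, 32]
```

With these settings every quantized value fits in a single byte, at the cost of a size error of at most half a bin. Whether a classifier trained on the quantized features keeps its accuracy is exactly the question the thesis would need to evaluate.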

Prerequisites:

Python programming
Machine learning essentials
Internet Protocol Stack

References:

Trevisan, M., Soro, F., Mellia, M., Drago, I., & Morla, R. (2020). Does domain name encryption increase users' privacy? ACM SIGCOMM Computer Communication Review, 50(3), 16-22.
Trevisan, M., Soro, F., Mellia, M., Drago, I., & Morla, R. (2023). Attacking DoH and ECH: Does Server Name Encryption Protect Users’ Privacy? ACM Transactions on Internet Technology, 23(1), 1-22.

Project details: part of the COMPACT PRIN project and in collaboration with the PoliMI AntLab Research Group

Clustering webpages for realistic experiments on the Internet

Experimenting with networked systems is fundamental for developing novel techniques, assessing the impact of design choices, and improving users' Quality of Experience. Testing the Web is typically done using lists of popular websites, e.g., the Alexa rank (https://www.alexa.com/topsites), which, however, only offer the homepages of the target websites. This is a strong limitation, as websites are known to have a diverse webpage structure depending, for example, on the subsections in which content is organized. The goal of this thesis is to develop a system able to select a subset of the pages of a website that is representative of the diversity of its internal structure. To this end, it is necessary to leverage Data Science and Machine Learning techniques, clustering above all, to group similar pages together and choose the right representatives, in the right number. Using open datasets, and collecting additional data if needed, the student will apply Machine Learning tools to achieve this goal, using Big Data approaches if the dataset becomes large.
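The selection step could look like the following minimal sketch: pages are described by numeric feature vectors, grouped with a small k-means, and the page closest to each cluster center is kept as the representative. The URLs, the two normalized features, the choice of k-means, and the fixed number of clusters are all assumptions made for illustration; the thesis may well rely on different features, algorithms, or a library implementation.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def init_centers(points, k):
    """Deterministic seeding: first point, then repeatedly the farthest point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j])) for p in points]

def kmeans(points, k, iters=20):
    """Plain k-means: alternate assignment and center update."""
    centers = init_centers(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign(points, centers), centers

def representatives(urls, points, k):
    """Return, for each non-empty cluster, the URL closest to its center."""
    labels, centers = kmeans(points, k)
    reps = {}
    for j in range(k):
        members = [i for i, l in enumerate(labels) if l == j]
        if members:
            reps[j] = urls[min(members, key=lambda i: dist(points[i], centers[j]))]
    return reps

# Hypothetical pages with two normalized features (e.g. image and link counts).
pages = {
    "/": (0.90, 0.80),
    "/news/a": (0.20, 0.30),
    "/news/b": (0.25, 0.35),
    "/news/c": (0.22, 0.28),
    "/shop/x": (0.80, 0.10),
    "/shop/y": (0.85, 0.15),
    "/shop/z": (0.82, 0.12),
}
urls, points = list(pages), list(pages.values())
reps = representatives(urls, points, k=3)
```

On this toy input the three representatives are the homepage, one news page, and one shop page. Choosing the number of clusters automatically (the "right number" of representatives) is left open here and would be part of the thesis work.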

Thesis Fast Track

A Bachelor Fast Track Thesis consists of producing an extended summary of a scientific article published in a high-quality international conference or journal. A description of what a Fast Track Thesis entails can be found here.

The available articles are:

Dong, H., Zhang, Y., Lee, H., Huque, S., & Sun, Y. (2024). Deciphering the Digital Veil: Exploring the Ecosystem of DNS HTTPS Resource Records. In 2024 ACM Internet Measurement Conference.
Lee, J., Mohaisen, D., & Kang, M. S. (2024). Measuring DNS-over-HTTPS Downgrades: Prevalence, Techniques, and Bypass Strategies. Proceedings of the ACM on Networking, 2(CoNEXT4), 1-22.
Ashiq, M. I., Fiebig, T., & Chung, T. (2025). Unraveling the Complexities of MTA-STS Deployment and Management in Securing Email. In ACM Internet Measurement Conference.
