Impact of novel network protocols (QUIC, MASQUE and HTTP/3): @École Normale Supérieure de Lyon, more details here
Video streaming quality inference at 100 Gbps using subsets of flows
Other topics include:
ML for predictive/proactive scaling in the cloud
Security for distributed systems
Fairness, accountability and transparency in ML
Novel Algorithms for Full-Stack Observability: @Cisco Systems, more details here
Other topics include:
Development of advanced interfaces for cyber threat intelligence leveraging generative AI and advanced machine learning techniques for data visualisation and analysis
Application of AI agent technologies to the investigation of cybersecurity incidents and intelligence, or use of AI agents for red-team and penetration-testing activities
Development of algorithmic and optimisation techniques for machine learning frameworks, such as PyTorch
Compressed Representation of Darknet Traffic for Cybersecurity
Darknets are distinctive subnetworks of the Internet that host no devices and act as passive observers, recording every packet they receive. Because this traffic is by definition unsolicited, darknets work like "network telescopes," offering insight into cybersecurity events such as network scans and exploit attempts.
The substantial volume of data collected by darknets makes it challenging to handle. However, the data often exhibits patterns of repetition and similarity, suggesting a potential for effective compression or summarization.
The primary goal of this thesis is to design and implement advanced algorithms capable of extracting compressed representations from darknet traffic. The work will leverage recent machine learning advances, particularly Autoencoders, Generative Adversarial Networks, and Diffusion Models, to obtain compressed yet informative representations. These representations can be used for compression or as generative modules.
The study will focus on diverse datasets obtained from medium-sized operational darknets deployed in different countries. These datasets, spanning multiple years, will be processed using Big Data techniques and infrastructures due to their size.
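As a rough illustration of why repetitive darknet traffic lends itself to compression, the sketch below compares how well a general-purpose compressor (zlib) shrinks synthetic scan-like records versus random bytes. It is not the learned approach targeted by the thesis: the record format, the addresses and the helper names (`synthetic_darknet_records`, `compression_ratio`) are hypothetical and only serve the example.

```python
import random
import zlib


def synthetic_darknet_records(n, seed=0):
    """Build n hypothetical CSV records: a few sources probing a few ports."""
    rng = random.Random(seed)
    sources = [f"198.51.100.{i}" for i in range(1, 6)]  # documentation range
    ports = [22, 23, 80, 443, 445]
    lines = [
        f"{rng.choice(sources)},{rng.choice(ports)},TCP,SYN" for _ in range(n)
    ]
    return "\n".join(lines).encode()


def compression_ratio(data):
    """Return original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data, 9))


repetitive = synthetic_darknet_records(5000)

# Random bytes of the same length act as an incompressible baseline.
rng = random.Random(1)
noise = bytes(rng.getrandbits(8) for _ in range(len(repetitive)))

print(f"scan-like records: {compression_ratio(repetitive):.1f}x")
print(f"random bytes:      {compression_ratio(noise):.2f}x")
```

The scan-like records compress far better than the random baseline, which is the kind of redundancy a learned representation could also exploit.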
Clustering webpages for realistic experiments on the Internet
Experimenting with networked systems is fundamental for developing novel techniques, assessing the impact of design choices and improving users' Quality of Experience. Testing the Web is typically done using lists of popular websites -- e.g., the Alexa rank (https://www.alexa.com/topsites) -- which, however, only list the homepages of the target websites. This is a strong limitation, as websites are known to have a diverse webpage structure depending, for example, on the subsections in which content is organized. The goal of this thesis is to develop a system able to select a subset of the pages of a website that is representative of the diversity of its internal structure. To this end, it is necessary to leverage Data Science and Machine Learning techniques, clustering above all, to group similar pages together and choose the right representatives, and the right number of them. Using open datasets, and collecting additional data if needed, the student will apply Machine Learning tools to achieve this goal, using Big Data approaches if the dataset becomes large.
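The idea of grouping similar pages and picking one representative per group can be sketched as follows. This is a minimal example under stated assumptions, not the method the thesis will adopt: it uses plain k-means with a deterministic farthest-point initialisation, and the page URLs, the two features (number of requests, page size in tens of KB) and the function name `select_representatives` are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_representatives(pages, k, iters=100):
    """Cluster pages with k-means; return the page closest to each centroid."""
    urls = list(pages)
    points = [pages[u] for u in urls]

    # Farthest-point initialisation: deterministic, spreads the centroids out.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )

    for _ in range(iters):
        # Assign every page to its nearest centroid.
        labels = [
            min(range(k), key=lambda i: dist(p, centroids[i])) for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                mean = tuple(sum(v) / len(members) for v in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids

    # The representative of a cluster is its member nearest to the centroid.
    reps = []
    for i in range(k):
        members = [u for u, lab in zip(urls, labels) if lab == i]
        if members:
            reps.append(min(members, key=lambda u: dist(pages[u], centroids[i])))
    return reps


# Hypothetical features per page: (number of requests, size in tens of KB).
pages = {
    "/news/a": (40, 60),
    "/news/b": (42, 65),
    "/news/c": (38, 62),
    "/shop/x": (90, 180),
    "/shop/y": (95, 190),
    "/shop/z": (92, 185),
}
representatives = select_representatives(pages, k=2)
print(representatives)
```

Here `k` is fixed by hand; choosing the right number of representatives, as the topic requires, would need an additional criterion (for example, comparing clustering quality across several values of `k`).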
Analysis and Correlation of Behaviour on Online Social Networks
Online social networks, such as Instagram and Facebook, allow users to interact and debate with each other. In this research, the candidate will collect large quantities of data from social networks and from public repositories such as Wikidata. The data will be organized and analyzed using big data tools (such as PySpark). The student will then characterize the behaviour of different classes of users on the social network (grouped by, e.g., nationality, activity, language or age) and analyze possible biases across these categories and how they change over time. The student may also use machine learning techniques, forecasting methods and graphs.
Thesis Fast Track
A Bachelor Fast Track Thesis consists of producing an extended summary of a scientific article published in a high-quality international conference or journal. A description of what a Fast Track Thesis means can be found here.
The available articles are:
Mike Kosek, Luca Schumann, Robin Marx, Trinh Viet Doan, and Vaibhav Bajpai. 2022. DNS privacy with speed? evaluating DNS over QUIC and its impact on web performance. In Proceedings of the 22nd ACM Internet Measurement Conference (IMC '22). https://doi.org/10.1145/3517745.3561445
Marcin Nawrocki, Pouyan Fotouhi Tehrani, Raphael Hiesgen, Jonas Mücke, Thomas C. Schmidt, and Matthias Wählisch. 2022. On the interplay between TLS certificates and QUIC performance. In Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies (CoNEXT '22). https://doi.org/10.1145/3555050.3569123
Su, Junhua, and Alexandros Kapravelos. "Automatic Discovery of Emerging Browser Fingerprinting Techniques." In Proceedings of the ACM Web Conference 2023. 2023. https://dl.acm.org/doi/abs/10.1145/3543507.3583333
Lee, Dongkeun, Minwoo Joo, and Wonjun Lee. "Net-track: Generic Web Tracking Detection Using Packet Metadata." In Proceedings of the ACM Web Conference 2023. 2023. https://dl.acm.org/doi/abs/10.1145/3543507.3583372
Lim, Sangwon, et al. "ZTLS: A DNS-based Approach to Zero Round Trip Delay in TLS handshake." In Proceedings of the ACM Web Conference 2023. 2023. https://dl.acm.org/doi/abs/10.1145/3543507.3583516
Habib, Rumaisa, et al. "A First Look at Public Service Websites from the Affordability Lens." In Proceedings of the ACM Web Conference 2023. 2023. https://dl.acm.org/doi/abs/10.1145/3543507.3583415