Thesis Proposals
QUIC, HTTP/3 and Proxies: a complicated love triangle
In collaboration with École Normale Supérieure de Lyon. It is possible to work on the thesis in Lyon.
Web protocols are constantly evolving, and the recently standardized HTTP/3 and QUIC promise improvements in performance and security. Many network scenarios require the use of so-called HTTP proxies, middleboxes that act as intermediaries for Web traffic. Proxies serve several purposes, especially in the enterprise context, where security requires special attention: they can filter websites, and they allow workstations that must operate in an isolated environment for security reasons to communicate with the Internet. The new web protocols challenge the operation of proxies because they implement end-to-end encryption. For example, traffic through an HTTP/3 proxy would go through two nested layers of encryption and congestion control, which is known to severely impact performance.
The goal of this thesis is to study the security and performance of different configurations of HTTP proxies and servers using an experimental testing environment. It will investigate the options currently available for operating web proxies and servers, with particular attention to recent proposals, above all Multiplexed Application Substrate over QUIC Encryption (MASQUE), which optimizes the QUIC transport protocol when used to contact a proxy.
Prerequisites:
Networking: TCP, UDP, HTTP, TLS, QUIC
System Programming: Bash, Linux Networking Stack, Docker
Non-exhaustive list of useful tools:
Browsertime (web automation), WebPageReplay (website simulation), Squid (HTTP proxy), nginx (web server)
Clustering webpages for realistic experiments on the Internet
Experimenting with networked systems is fundamental for developing novel techniques, assessing the impact of design choices, and improving users' Quality of Experience. Testing the Web is typically done using lists of popular websites, e.g., the Alexa rank (https://www.alexa.com/topsites), which however only offer the homepages of the target websites. This is a strong limitation, as websites are known to have a diverse webpage structure depending, for example, on the subsections in which content is organized. The goal of this thesis is to develop a system able to select a subset of the pages of a website so that they are representative of the diversity of its internal structure. To this end, it is necessary to leverage Data Science and Machine Learning techniques, clustering in particular, to group together similar pages and choose the right representatives, and the right number of them. Using open datasets, and collecting additional data if needed, the student will apply Machine Learning tools to achieve this goal, using Big Data approaches if the dataset becomes large.
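The idea of grouping similar pages and picking one representative per group can be sketched as follows. This is only a minimal illustration, not the method the thesis will adopt: it assumes each page has already been reduced to a small numeric feature vector (the URLs and the three features below are hypothetical), it fixes the number of clusters by hand, and it uses a simple k-medoids variant so that every representative is an actual page of the site.

```python
from math import dist


def assign(points, medoids):
    """Label each point with the position of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
        for p in points
    ]


def k_medoids(points, k, max_iter=100):
    """Alternating (Voronoi-iteration) k-medoids.

    Returns (medoid indices, cluster label of each point).
    """
    # Deterministic seeding: start from point 0, then repeatedly add the
    # point that is farthest from the medoids chosen so far.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(dist(points[i], points[m]) for m in medoids),
        ))

    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # empty cluster: keep the previous medoid
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member closest, in total, to the others.
            new_medoids.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members),
            ))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Hypothetical pages of one website, described by illustrative features:
# (number of links, number of images, text size in KB).
pages = {
    "/": (120, 30, 4),
    "/news/a": (40, 5, 12),
    "/news/b": (42, 6, 11),
    "/news/c": (38, 4, 13),
    "/shop/x": (80, 25, 2),
    "/shop/y": (82, 27, 2),
}
urls = list(pages)
features = list(pages.values())

medoids, labels = k_medoids(features, k=3)
representatives = [urls[m] for m in medoids]
print(representatives)
```

In this toy example the homepage, one news page and one shop page are selected. Choosing the number of clusters automatically, and deciding which page features are meaningful, are among the open questions the thesis is meant to address.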
Analysis and Correlation of Behaviour on Online Social Networks
Online social networks, such as Instagram and Facebook, allow users to interact and debate with each other. In this research, the candidate will collect large quantities of data from social networks and from public repositories such as Wikidata. The data will be organized and analyzed using big data tools (such as PySpark). Then, the student will characterize the behaviour of different classes of users on the social network, grouped for example by nationality, activity, language, or age. The student will analyze possible bias in these categories and the dynamics of their changes over time. The student may use machine learning techniques, forecasting methods and graphs.
Analysis of Network Traffic of Biomedical Devices Using Machine Learning
With e-Health technologies enabling remote treatments, the security of health devices has become more important than ever. The scenario becomes even more alarming when considering threats to vital settings, as the health crisis has stepped up cyber-attacks on hospitals, healthcare, and medical research facilities. IoT is entering hospitals and healthcare by enabling remote patient assistance and monitoring. Expensive examination equipment and patient monitoring devices are nowadays connected to hospital networks, and possibly left exposed to the Internet. Patients' and doctors' devices are offered WiFi connectivity for both leisure and monitoring, with a BYOD policy that leaves the field open to attackers.
The goal of this thesis is to evaluate in practice the security of connected medical devices by means of network traffic analysis. Network traffic is a rich source of information and a powerful means to detect intrusions into ICT systems. Moreover, network packets may carry sensitive data in unencrypted form if the protocols are not correctly implemented or configured.
The student will analyze large datasets of network traffic from various sources (including traffic from biomedical devices) and use Artificial Intelligence and Machine Learning to identify potential cybersecurity and privacy risks in a real hospital setting.
Thesis Fast Track
A Bachelor Fast Track Thesis consists of producing an extended summary of a scientific article published in a high-quality international conference or journal. A description of what a Fast Track Thesis means can be found here.
The available articles are:
Mike Kosek, Luca Schumann, Robin Marx, Trinh Viet Doan, and Vaibhav Bajpai. 2022. DNS privacy with speed? evaluating DNS over QUIC and its impact on web performance. In Proceedings of the 22nd ACM Internet Measurement Conference (IMC '22). https://doi.org/10.1145/3517745.3561445
Diwen Xue, Benjamin Mixon-Baca, ValdikSS, Anna Ablove, Beau Kujath, Jedidiah R. Crandall, and Roya Ensafi. 2022. TSPU: Russia's decentralized censorship system. In Proceedings of the 22nd ACM Internet Measurement Conference (IMC '22). https://doi.org/10.1145/3517745.3561461
Eric Zeng, Rachel McAmis, Tadayoshi Kohno, and Franziska Roesner. 2022. What factors affect targeting and bids in online advertising? a field measurement study. In Proceedings of the 22nd ACM Internet Measurement Conference (IMC '22). https://doi.org/10.1145/3517745.3561460
Marcin Nawrocki, Pouyan Fotouhi Tehrani, Raphael Hiesgen, Jonas Mücke, Thomas C. Schmidt, and Matthias Wählisch. 2022. On the interplay between TLS certificates and QUIC performance. In Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies (CoNEXT '22). https://doi.org/10.1145/3555050.3569123
Ram Sundara Raman, Mona Wang, Jakub Dalek, Jonathan Mayer, and Roya Ensafi. 2022. Network measurement methods for locating and examining censorship devices. In Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies (CoNEXT '22). https://doi.org/10.1145/3555050.3569133
Junhua Su and Alexandros Kapravelos. 2023. Automatic discovery of emerging browser fingerprinting techniques. In Proceedings of the ACM Web Conference 2023. https://dl.acm.org/doi/abs/10.1145/3543507.3583333
Dongkeun Lee, Minwoo Joo, and Wonjun Lee. 2023. Net-track: generic web tracking detection using packet metadata. In Proceedings of the ACM Web Conference 2023. https://dl.acm.org/doi/abs/10.1145/3543507.3583372
Sangwon Lim, et al. 2023. ZTLS: a DNS-based approach to zero round trip delay in TLS handshake. In Proceedings of the ACM Web Conference 2023. https://dl.acm.org/doi/abs/10.1145/3543507.3583516
Rumaisa Habib, et al. 2023. A first look at public service websites from the affordability lens. In Proceedings of the ACM Web Conference 2023. https://dl.acm.org/doi/abs/10.1145/3543507.3583415