Student Projects


How to apply

To apply, please send your CV and your MSc and BSc transcripts by email to all the contacts indicated below the project description. Do not apply on SiROP. Since Prof. Davide Scaramuzza is affiliated with ETH, there is no organizational overhead for ETH students. Custom projects are occasionally available. If you would like to do a project with us but could not find an advertised project that suits you, please contact Prof. Davide Scaramuzza directly to ask for a tailored project (sdavide at ifi.uzh.ch).


Upon successful completion of a project in our lab, students may also have the opportunity to get an internship at one of our numerous industrial and academic partners worldwide (e.g., NASA/JPL, University of Pennsylvania, UCLA, MIT, Stanford, ...).



Learning Robust Agile Flight via Adaptive Curriculum - Available

Description: Reinforcement learning-based controllers have demonstrated remarkable success in enabling fast and agile flight. Currently, the training process of these reinforcement learning controllers relies on a static, pre-defined curriculum. In this project, our objective is to develop a dynamic and adaptable curriculum to enhance the robustness of the learning-based controllers. This curriculum will continually adapt in an online fashion based on the controller's performance during the training process. By using the adaptive curriculum, we expect the reinforcement learning controllers to enable more diverse, generalizable, and robust performance in unforeseen scenarios. Applicants should have a solid understanding of reinforcement learning, machine learning experience (PyTorch), and programming experience in C++ and Python.
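As a rough illustration of what "adapting the curriculum online based on the controller's performance" could look like, the sketch below raises or lowers a scalar task difficulty from the recent episode success rate. It is only a minimal example under stated assumptions: the class name `AdaptiveCurriculum`, the single `difficulty` value in [0, 1], and the thresholds are all hypothetical and are not taken from the project or from any existing training framework.

```python
from collections import deque


class AdaptiveCurriculum:
    """Hypothetical difficulty scheduler driven by the recent success rate."""

    def __init__(self, window=20, raise_at=0.8, lower_at=0.4, step=0.1):
        self.difficulty = 0.0          # 0.0 = easiest task, 1.0 = hardest task
        self.window = window           # number of episodes per decision
        self.raise_at = raise_at       # success rate that triggers harder tasks
        self.lower_at = lower_at       # success rate that triggers easier tasks
        self.step = step
        self.results = deque(maxlen=window)

    def report(self, success):
        """Record one episode outcome and adapt the difficulty if warranted."""
        self.results.append(bool(success))
        if len(self.results) < self.window:
            return self.difficulty     # not enough evidence yet
        rate = sum(self.results) / len(self.results)
        if rate >= self.raise_at:
            self.difficulty = min(1.0, self.difficulty + self.step)
            self.results.clear()       # re-measure at the new difficulty
        elif rate <= self.lower_at:
            self.difficulty = max(0.0, self.difficulty - self.step)
            self.results.clear()
        return self.difficulty
```

In a training loop, `report` would be called once per episode and the returned value used to configure the next episode (for example, gate spacing or disturbance magnitude, both of which are assumptions here). A static curriculum corresponds to ignoring the outcomes and following a fixed schedule instead.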

Goal: Improve the robustness and generalizability of the training framework and validate the method in different navigation task settings. The approach will be demonstrated and validated in both simulated and real-world settings.

Contact Details: Jiaxu Xing (jixing@ifi.uzh.ch), Ismail Geles (geles@ifi.uzh.ch), Prof. Davide Scaramuzza (sdavide@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Novel Learning Paradigms for Low-latency and Efficient Vision - Available

Description: Event cameras offer remarkable advantages, including ultra-high temporal resolution in the microsecond range, immunity to motion blur, and the ability to capture high-speed phenomena (https://youtu.be/AsRKQRWHbVs). These features make event cameras invaluable for applications like autonomous driving. However, efficiently processing the sparse event streams while maintaining low latency remains a difficult challenge. Previous research has focused on developing sparse update frameworks for event-based neural networks to reduce computational complexity, i.e., FLOPs. This project takes the next step by directly lowering the processing runtime to unlock the full potential of event cameras for real-time applications.

Goal: The focus of the project is to reduce runtime using common hardware (GPUs), which have been highly optimized for parallelization. The project will explore drastically new processing paradigms, which can potentially be transferred to standard frames. This ambitious project requires a strong sense of curiosity, self-motivation, and a principled approach to tackling research challenges. You should have solid Python programming skills and experience with at least one deep learning framework. If you’re excited about exploring cutting-edge techniques to push the boundaries, please feel free to contact us.

Key Requirements:
1. Background in Deep Learning: Proficiency in Python and familiarity with state-of-the-art deep learning frameworks.
2. Problem-Solving Skills: Ability to approach research problems in a principled way.

Contact Details: Interested candidates should send their CV and transcripts (bachelor's and master's) to Roberto Pellerito [rpellerito@ifi.uzh.ch], Nikola Zubic [zubic@ifi.uzh.ch], and Prof. Davide Scaramuzza [sdavide@ifi.uzh.ch].

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Vision-Based Tactile Sensor for Humanoid Hands (in collaboration with Soft Robotics Lab) - Available

Description: Humanoid robots are rapidly advancing, with dexterous hand manipulation emerging as a key research frontier. These systems currently rely primarily on vision-based perception for manipulation. However, such approaches face limitations in scenarios where the line of sight is blocked, or when precise force control is critical for stable manipulation. To enable fine-grained and robust manipulation, tactile sensing at multiple points on the fingertip and palm is fundamental. Several tactile sensing strategies exist, but vision-based tactile sensors stand out due to their compactness, low cost, and high spatial resolution. Their performance, however, is limited by the camera bandwidth and power consumption.

Goal: This project proposes the development of a novel event-based tactile sensor, replacing conventional cameras with event cameras. This approach leverages the asynchronous, high-bandwidth, and low-power properties of event-based vision to provide real-time, high-resolution tactile feedback. The ultimate goal is to integrate these sensors into a human-scale robotic hand and validate their effectiveness in dexterous manipulation tasks. The project will focus on building a method for estimating force from videos of a deformable material.

Key Requirements: We are looking for a highly skilled student with:
1. Background in mechatronics, robotics, or a related field.
2. Strong interest in tactile sensing and perception.
3. Strong experience with Deep Learning, in particular CNNs, RNNs, GNNs, and Transformers.
4. Strong experience with sensor characterization.
5. Basic knowledge of event-based vision and tactile sensors (a plus).

Contact Details: If you are interested in working on cutting-edge tactile sensing technologies and contributing to the future of humanoid robotics, please send your CV, your bachelor's and master's transcripts, and a short motivational statement to Roberto Pellerito (rpellerito@ifi.uzh.ch), Jaehoon Kim (jaehoon.kim@srl.ethz.ch), and Prof. Davide Scaramuzza (sdavide@ifi.uzh.ch).

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Time-continuous Facial Motion Capture Using Event Cameras - Available

Description: Traditional facial motion capture systems often rely on marker-based methods or multi-camera rigs to track facial movements. However, these approaches can be limited in capturing fine details such as subtle wrinkles and micro-expressions. Recent advancements in learning-based techniques have enabled high-fidelity facial tracking using monocular RGB images, but the temporal resolution is constrained by the frame rate of conventional cameras. Event-based cameras offer a promising alternative, providing superior temporal resolution without the need for costly and bulky high-speed RGB cameras. This project aims to leverage the advantages of event-based cameras to achieve unprecedented quality in tracking subtle facial movements.

Goal: Develop a facial motion capture system that utilizes event-based cameras to accurately track fine facial movements, including micro-expressions and subtle wrinkles. The system should overcome the limitations of traditional methods by providing higher temporal resolution and capturing intricate facial details.

Contact Details: Interested candidates should send their CV and transcripts (bachelor's and master's) to Roberto Pellerito (rpellerito@ifi.uzh.ch), Nico Messikommer (nmessi@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).

Thesis Type: Semester Project / Bachelor Thesis / Master Thesis

See project on SiROP

Long-Horizon Learning for Agile Autonomous Drone Racing - Available

Description: Autonomous drone racing is one of the most demanding domains for robot learning. A racing drone must perceive its environment, estimate its state, plan aggressive maneuvers, and execute precise control actions at high speed. Small mistakes can quickly lead to large performance losses or crashes, making this an ideal setting for studying advanced learning-based control. Many learning-based methods focus on short-term observations or reactive policies. However, drone racing naturally involves long-horizon temporal structure: the drone must anticipate future gates, adapt its trajectory, and maintain stability across complex sequences of decisions. This creates an exciting opportunity to study modern sequence models in the context of agile robotics. The goal of this project is to explore whether long-sequence architectures can improve the performance, robustness, and adaptability of Reinforcement Learning systems for drone racing. The student will work on simulation-based development and evaluation, with the possibility of progressing to real-world validation. This is a challenging thesis topic intended for a truly outstanding student with excellent grades, strong technical skills, and prior experience in machine learning, robotics, control, or autonomous systems.

Goal: The main goal is to develop and evaluate a Reinforcement Learning framework for autonomous drone racing that leverages long-horizon temporal information. The student will ideally: study long-sequence modeling methods for decision-making and control, develop an RL framework tailored to agile drone racing, train and evaluate the system in simulation, analyze performance in dynamic racing scenarios, compare against standard RL or control-based baselines, investigate robustness, generalization, and computational efficiency, and optionally validate the approach on real drone-racing hardware.

Contact Details: Interested candidates should send their CVs, transcripts (bachelor's and master's) to Nikola Zubic (zubic@ifi.uzh.ch), Ismail Geles (geles@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).

Thesis Type: Master Thesis

See project on SiROP

Event Cameras for Agile Drone 3D Perception - Available

Description: Drones are becoming faster, smaller, and more agile, but their cameras often struggle exactly when perception matters most. During rapid flight, sharp turns, or low-light conditions, conventional cameras can produce heavily blurred images, making it difficult for an autonomous drone to understand where it is and what the world around it looks like. Event cameras offer a radically different way of seeing: instead of recording full images at fixed frame rates, they react instantly to tiny brightness changes, similar to how biological vision is sensitive to motion. This project explores how event cameras can help drones perceive the world during fast flight. The student will develop machine learning methods that combine conventional camera images, event camera data, and drone motion information to recover a sharp and consistent 3D understanding of the environment. Rather than treating motion blur as a failure case, the project will investigate how high-speed visual signals can be used as a strength: events provide precise timing information that can help reconstruct what happened during the blurred exposure of a normal camera. The project is suitable for students interested in computer vision, machine learning, robotics, and autonomous systems. It offers the opportunity to work on robust 3D reconstruction, motion estimation, sensor fusion, neural rendering, and real-world drone experiments.
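The statement that event cameras "react to tiny brightness changes" is commonly formalized as follows: a pixel emits an event whenever its log intensity has changed by at least a contrast threshold since that pixel's last event. The sketch below simulates this idealized model for a single pixel. It is a simplified illustration rather than the project's method or a model of any specific sensor; the function name `events_from_samples` and the threshold value are assumptions, and real sensors add noise, refractory periods, and per-pixel threshold variation.

```python
import math


def events_from_samples(samples, contrast=0.2):
    """Turn (time, intensity) samples of one pixel into (time, polarity) events.

    An event fires whenever the log intensity has moved by at least
    `contrast` since the last event; polarity is +1 (brighter) or -1 (darker).
    Intensities must be strictly positive.
    """
    events = []
    ref = math.log(samples[0][1])      # log intensity at the last event
    for t, intensity in samples[1:]:
        delta = math.log(intensity) - ref
        while abs(delta) >= contrast:  # a large change emits several events
            polarity = 1 if delta > 0 else -1
            events.append((t, polarity))
            ref += polarity * contrast
            delta = math.log(intensity) - ref
    return events
```

A pixel that sees constant brightness produces no output at all, which is where the sparsity of event data comes from, while a doubling of intensity with `contrast=0.2` produces three positive events because ln 2 is roughly 0.69.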

Goal: The goal of this project is to build a learning-based perception system that allows drones to recover sharp visual and 3D information during high-speed motion. The student will design and implement methods for combining event cameras with conventional images and motion cues, evaluate the system on challenging drone sequences, and analyze when event-based sensing provides clear advantages over standard cameras.

Contact Details: Interested candidates should send their CV, transcripts (bachelor’s and master’s), and desired start date to Rong Zou (zou@ifi.uzh.ch) and Daniel Zhai (dzhai@ifi.uzh.ch).

Thesis Type: Master Thesis

See project on SiROP

Learning Rapid UAV Exploration with Foundation Models - Available

Description: In this project, our objective is to efficiently explore unknown indoor environments using UAVs. Recent research has demonstrated significant success in integrating foundation models with robotic systems. Leveraging these foundation models, the drone will employ semantic relationships learned from large-scale data to actively explore and navigate through unknown environments. While most prior research has focused on ground-based robots, this project aims to investigate the potential of integrating foundation models with aerial robots to introduce more agility and flexibility. Applicants should have a solid understanding of mobile robot navigation, machine learning experience (PyTorch), and programming experience in C++ and Python.

Goal: Develop such a framework in simulation and conduct a comprehensive evaluation and analysis. If feasible, deploy such a model in a real-world environment.

Contact Details: Jiaxu Xing (jixing@ifi.uzh.ch), Daniel Zhai (dzhai@ifi.uzh.ch), Prof. Davide Scaramuzza (sdavide@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Event-based Temporal Segmentation & Tracking - Available

Description: Event cameras are revolutionary sensors that capture pixel-level illumination changes with microsecond latency, providing significant advantages in high-speed and high-dynamic-range scenarios where traditional cameras suffer from motion blur. Recently, large-scale foundational segmentation models have been successfully adapted to the event domain. However, current approaches remain constrained to per-frame analysis, treating continuous event streams as isolated, static snapshots and ignoring temporal consistency. At the same time, existing event-based methods for moving object segmentation can isolate motion but fail to maintain instance identity over time: they can segment moving pixels, but they cannot "track" specific objects. This project aims to bridge the gap between static foundational segmentation and dynamic motion analysis by developing the first comprehensive tracker for event cameras. The objective is to design a system capable of not only segmenting arbitrary objects but also maintaining their identity consistently across long, high-speed sequences. The student will extend current spatial feature adaptation strategies to support temporal identity, effectively transforming a frame-by-frame instance segmenter into a robust Video Object Segmentation (VOS) tracker. Furthermore, to handle severe object occlusions and rapid, erratic motion, the project will explore sparse temporal memory mechanisms that prevent identity switching. Finally, to rigorously test the system's reliability, the student will establish a novel benchmark for dense segmentation in extreme edge cases, such as night driving with severe glare and rapid evasive maneuvers.

Goal: The primary goal of this project is to develop the first temporally consistent, foundational tracker for event cameras capable of long-term identity persistence. The student will design and implement temporal memory mechanisms to advance event-based zero-shot segmentation into continuous Video Object Segmentation. Additionally, the project will culminate in the creation of a new benchmark tailored for dense tracking in challenging, high-speed, and low-visibility domains.

Contact Details: Interested candidates should send their CV and transcripts (bachelor's and master's) to Roberto Pellerito [rpellerito@ifi.uzh.ch], Leonardo Ravaglia [leonardo.ravaglia@thegoodailab.org], Nicoletta Risi [nicolettarisi93@gmail.com], and Davide Scaramuzza [sdavide@ifi.uzh.ch].

Thesis Type: Master Thesis

See project on SiROP

Reinforcement Learning with World Models - Available

Description: Reinforcement learning has shown promising results for robot control, but many approaches require large amounts of interaction data and struggle to generalize to new situations. World models provide a way to learn predictive representations of the environment that capture its dynamics and enable more efficient decision making. By modeling how the environment evolves in response to actions, these approaches can support planning and improve policy learning.

Goal: This project aims to develop and evaluate world models for robot control. The learned model will predict future states of the environment and be integrated with reinforcement learning to improve control performance in simulated and real robotic tasks. Students should have prior experience with reinforcement learning algorithms and deep learning frameworks.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Ismail Geles [geles (at) ifi (dot) uzh (dot) ch], Rudolf Reiter [rreiter (at) ifi (dot) uzh (dot) ch], Daniel Zhai [dzhai (at) ifi (dot) uzh (dot) ch], Prof. Davide Scaramuzza [sdavide (at) ifi (dot) uzh (dot) ch]

Thesis Type: Master Thesis

See project on SiROP

Spiking Architectures for Advanced Event-Based Temporal Reasoning - Available

Description: Biological neural systems excel at processing information with remarkable efficiency and robustness, largely due to their reliance on the precise timing and dynamic interplay of neural activity. In contrast, many conventional deep learning architectures simplify these temporal dynamics, often overlooking the rich information embedded in the precise timing of events. Event-based cameras offer a unique data stream that mirrors this biological principle, capturing asynchronous "spikes" of visual information in response to scene changes. This project aims to develop novel spiking neural network (SNN) architectures that harness these inherent characteristics of event data. We propose an approach that emphasizes neuron-level temporal processing. Furthermore, we will investigate how collective spiking synchronization can serve as a powerful latent representation for understanding dynamic scenes and sequential patterns. This paradigm seeks to strike a balance between biological plausibility and computational efficiency, leveraging the sparsity and high temporal resolution of event data to achieve robust and interpretable performance in complex, dynamic environments.
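To make the role of spike timing concrete, the sketch below simulates a single leaky integrate-and-fire (LIF) neuron in discrete time. LIF is a standard textbook neuron model used here purely for illustration; it is not claimed to be the architecture this project proposes, and the function name `lif_spike_times` and all parameter values are assumptions.

```python
def lif_spike_times(input_events, leak=0.9, threshold=1.0, steps=10):
    """Simulate one leaky integrate-and-fire neuron in discrete time.

    input_events maps a time step to the input current injected at that step.
    Returns the list of time steps at which the neuron spikes.
    """
    v = 0.0                                        # membrane potential
    spikes = []
    for t in range(steps):
        v = leak * v + input_events.get(t, 0.0)    # decay, then integrate
        if v >= threshold:
            spikes.append(t)
            v = 0.0                                # reset after a spike
    return spikes
```

With these parameters, two inputs of 0.6 arriving at steps 0 and 1 make the neuron spike at step 1, whereas the same two inputs arriving at steps 0 and 5 produce no spike because the first input has leaked away. The total input is identical in both cases; only the timing differs, which is the kind of information a frame-style summation would discard.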

Goal: The primary goal is to design, implement, and rigorously evaluate an SNN architecture capable of advanced temporal reasoning on event-based data. This involves developing methods for individual spiking neurons to effectively process their historical event inputs and exploring how emergent synchronization patterns within the network can represent rich contextual information. The project will involve testing the developed models on challenging event-based vision tasks that require sequential understanding, such as gesture recognition, dynamic object tracking, or agile robot navigation. Performance will be assessed in terms of accuracy, computational efficiency, and robustness to noisy or complex event streams, with comparisons to existing event-based learning paradigms. Applicants should possess a strong background in spiking neural networks, deep learning frameworks (PyTorch), computer vision, and programming proficiency in Python. Experience with event cameras or neuromorphic computing is a significant advantage.

Contact Details: Interested candidates should send their CVs and transcripts (bachelor's and master's) to Nikola Zubic (zubic@ifi.uzh.ch), Roberto Pellerito (rpellerito@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).

Thesis Type: Master Thesis

See project on SiROP

Rethinking RNNs for Neuromorphic Computing and Event-based Vision - Available

Description: While more recent sequence modeling architectures have gained prominence, traditional Recurrent Neural Networks (RNNs), such as LSTMs and GRUs, remain highly effective for tasks requiring strong state-tracking capabilities and continuous temporal reasoning, which are qualities crucial for processing dynamic time-series data. Event-based cameras, which produce sparse, asynchronous data streams in response to scene changes, generate precisely this kind of highly temporal information. However, efficiently processing these event streams with traditional RNNs, especially on resource-constrained platforms or future neuromorphic hardware, presents significant challenges due to their strictly sequential nature and inherent inefficiencies in current hardware implementations. This project aims to make RNNs more practical to deploy for event-based vision by developing hardware-aware optimization strategies. We will explore novel parallelization schemes that can process multiple, smaller hidden states concurrently, analogous to how multi-head mechanisms operate, thereby better utilizing modern parallel computing architectures. Furthermore, we will focus on fine-grained kernel optimization, targeting specific hardware characteristics such as internal cache sizes, memory access patterns, and compute handling, to unlock efficiency and throughput for RNNs processing event data. The ultimate goal is to enable RNNs to leverage the advantages of event-based sensors for real-time, low-latency applications.
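One possible reading of "multiple, smaller hidden states processed concurrently" is to split a single large hidden state into several independent heads, each with its own small recurrent matrix. The sketch below illustrates that idea with a plain tanh RNN in pure Python. It is an assumption-laden toy, not the project's design: the function names (`head_step`, `multi_head_step`, `recurrent_weight_count`) are hypothetical, and a practical version would use batched GPU kernels rather than Python loops.

```python
import math


def head_step(w_rec, w_in, h, x):
    """One tanh-RNN step for a single head: h' = tanh(W_rec h + W_in x)."""
    return [
        math.tanh(
            sum(w * hj for w, hj in zip(w_rec[i], h))
            + sum(w * xj for w, xj in zip(w_in[i], x))
        )
        for i in range(len(h))
    ]


def multi_head_step(heads, states, x):
    """Advance every head by one step; heads share the input, not the state.

    heads is a list of (w_rec, w_in) pairs and states a list of hidden vectors.
    No head reads another head's state, so the per-head updates are independent
    and could run in parallel.
    """
    return [head_step(w_rec, w_in, h, x) for (w_rec, w_in), h in zip(heads, states)]


def recurrent_weight_count(hidden_size, num_heads):
    """Recurrent weights when hidden_size is split evenly across num_heads."""
    head_size = hidden_size // num_heads
    return num_heads * head_size * head_size
```

Splitting a hidden state of size 64 into 4 heads of size 16 reduces the recurrent weights from 64 x 64 = 4096 to 4 x 16 x 16 = 1024. The time dimension remains sequential within each head; what this arrangement exposes is parallelism across heads at every step.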

Goal: The primary goal of this project is to design, implement, and rigorously evaluate highly optimized RNN architectures tailored for efficient processing of event-based vision data, with a strong focus on their potential for neuromorphic computing and modern GPU hardware. This involves developing custom kernels and optimization techniques that exploit the sparsity and asynchronous nature of event streams, alongside parallelization strategies that significantly accelerate RNN inference. The student will benchmark the developed solutions on representative event-based vision tasks (e.g., object detection, optical flow, motion estimation) to demonstrate substantial improvements in processing speed and computational efficiency compared to standard implementations. Applicants should possess strong programming skills in Python and C++, expertise in deep learning frameworks (e.g., PyTorch, JAX), and a solid understanding of RNN architectures. Experience with hardware-level optimization (CUDA, Triton) or neuromorphic computing concepts is highly advantageous.

Contact Details: Interested candidates should send their CVs and transcripts (bachelor's and master's) to Nikola Zubic (zubic@ifi.uzh.ch), Roberto Pellerito (rpellerito@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).

Thesis Type: Master Thesis

See project on SiROP

Event Representation Learning for Control with Visual Distractors - Available

Description: Autonomous systems operating in complex, real-world environments often face significant challenges from visual distractors and high-speed dynamics. Traditional frame-based cameras can struggle with motion blur and high dynamic range, leading to unreliable visual representations for control. Event-based cameras, with their microsecond latency and ability to capture only changes in a scene, offer a promising alternative for robust perception in such demanding scenarios. This project aims to investigate event-based representation learning for control tasks, specifically focusing on environments with static and dynamic visual distractors. Drawing inspiration from benchmarks like the DeepMind Control Vision Benchmark (DMC-VB), we will explore how event data's unique properties, namely sparsity and high temporal resolution, can be leveraged to learn more robust and efficient control policies. The goal is to develop representations that are inherently less susceptible to visual noise and rapid environmental changes, thereby improving the performance and reliability of autonomous agents.

Goal: The primary goal of this project is to design, implement, and evaluate novel event-based representation learning methods for control tasks, focusing on scenarios with visual distractors. This includes developing techniques to extract meaningful features from sparse event streams that are invariant to static or dynamic visual clutter. The student will work with simulated environments (adapted to generate event data or using existing event-based simulators) to benchmark the developed methods against traditional frame-based approaches. Experiments will cover various locomotion and navigation tasks, assessing the robustness, sample efficiency, and real-time performance of event-based control policies. Applicants should have a strong background in deep learning, reinforcement learning, computer vision, and programming skills in Python (PyTorch/JAX). Experience with event-based vision or control systems is highly beneficial.

Contact Details: Interested candidates should send their CVs and transcripts (bachelor's and master's) to Nikola Zubic (zubic@ifi.uzh.ch), Rong Zou (zou@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).

Thesis Type: Semester Project / Master Thesis

See project on SiROP