Student Projects
How to apply
To apply, please send your CV and your MSc and BSc transcripts by email to all the contacts indicated below the project description. Do not apply on SiROP. Since Prof. Davide Scaramuzza is affiliated with ETH, there is no organizational overhead for ETH students. Custom projects are occasionally available. If you would like to do a project with us but could not find an advertised project that suits you, please contact Prof. Davide Scaramuzza directly to ask for a tailored project (sdavide at ifi.uzh.ch).
Upon successful completion of a project in our lab, students may also have the opportunity to get an internship at one of our numerous industrial and academic partners worldwide (e.g., NASA/JPL, University of Pennsylvania, UCLA, MIT, Stanford, ...).
Co-Design of Shape and Control: A Study in Autonomous Perching - Available
Description: This project establishes an automated co-design approach, specifically focusing on the development of an optimal control policy and the optimization of 3D shapes, targeting static design criteria such as a desired lift-over-drag ratio for aerodynamic shapes. We are undertaking an ambitious project that aims to extend the concept of shape optimization to the design of a fully autonomous system. We propose to explore this idea via the autonomous glider perching problem described in a paper by MIT [1]. We are looking for a master's student to help us design and build a glider to reproduce the experiments in [1]. In addition to designing and building the glider, the student will implement a controller that successfully executes the perching maneuver. This will require performing system identification, implementing motion planning algorithms, and designing tracking controllers.
Project outcomes:
- Build a physical RC glider for the perching task, made of foam
- Develop a control law for the glider (e.g., optimal control/trajectory optimization and MPC)
Prerequisites: Currently completing a master's degree in computer science or mechanical/electrical engineering, with an affinity for mathematics. Prospective applicants should have been exposed to the following topics through coursework or past projects:
- Optimal control, model predictive control
- Numerical optimization (convex, linear, quadratic programming)
- Familiarity with deep learning and computational fluid dynamics is a plus
- Hands-on robotics experience is also a plus
The project is co-supervised by Prof. Pascal Fua.
References: [1] Moore, J., Cory, R., and Tedrake, R. (2014). Robust post-stall perching with a simple fixed-wing glider using LQR-Trees. Bioinspiration & Biomimetics, 9(2), 025013. doi:10.1088/1748-3182/9/2/025013.
Goal: Can we optimize the design of an autonomous system for a complex dynamic maneuver?
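To make the control component concrete, below is a minimal sketch of an LQR design around a trim point, the building block that LQR-Trees [1] extend to the full post-stall maneuver. The model matrices are illustrative placeholders, not identified glider parameters; the actual project would obtain them via system identification.

```python
# Minimal LQR sketch for a linearized glider model (illustrative only).
# A and B are placeholder matrices, not identified parameters; LQR-Trees [1]
# extend this local design to cover the full post-stall perching maneuver.
import numpy as np
from scipy.linalg import solve_continuous_are

# Toy longitudinal model: state = [airspeed, pitch, pitch rate], input = elevator.
A = np.array([[-0.5, -9.8, 0.0],
              [0.0, 0.0, 1.0],
              [0.1, 0.0, -2.0]])
B = np.array([[0.0], [0.0], [5.0]])

Q = np.diag([1.0, 10.0, 1.0])   # state cost: penalize pitch error most
R = np.array([[0.1]])           # input cost

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)  # u = -K x stabilizes the trim point

x = np.array([1.0, 0.2, 0.0])    # deviation from trim
u = -K @ x
print("elevator command:", u)
```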
Contact Details: Please send your CV, Bachelor's and Master's transcripts to Ming Xu, Mingda.xu@epfl.ch and Rudolf Reiter, rreiter@ifi.uzh.ch
Thesis Type: Master Thesis
Acrobatic Drone Flight Through Moving Gaps with Event Vision - Available
Description: The Micro Aerial Vehicle (MAV) is a highly agile robotic platform capable of performing rapid attitude adjustments and navigating through narrow gaps smaller than its own maximum radius in extremely short timeframes. This exceptional ability to maneuver in challenging environments highlights its potential for advanced navigation tasks. However, existing methods typically rely on motion capture systems or are designed for static targets, which limits the full exploitation of a drone’s maneuverability. Achieving vision-based navigation through a fast-moving narrow gap poses significant challenges for both perception and control in autonomous drones. In terms of perception, drones must cope with motion blur caused by the high-speed motion of both the drone and the target, while detecting and predicting the position of narrow gaps with minimal latency. On the control side, the drone must maximize its agility to assess the feasibility of a gap-crossing task in real time and execute precise maneuvers to complete the traversal. The primary objectives of this project are: 1) to develop a low-latency, event-based detection system for dynamic gap identification, and 2) to design a failure-aware vision-based control policy capable of evaluating the feasibility of gap-crossing tasks, making real-time decisions on whether to attempt traversal, and controlling the drone’s motion to successfully navigate through the moving gaps. The project’s success will be measured by: 1) the accuracy and latency of event-based target detection, benchmarked quantitatively against RGB camera-based methods, and 2) the success rate of dynamic gap-crossing at varying target speeds.
Goal: The purpose of this research is to explore the potential of event-based visual control systems in enabling acrobatic flight tasks, such as navigating through narrow gaps, to push the boundaries of drone maneuverability. This work aims to investigate whether neuromorphic vision technologies can offer unique advantages for drone navigation in dynamic and extreme environments. **Key Requirements**: 1) Proficiency in computer vision, image processing, and deep learning techniques (for event-based image detection). 2) Foundational knowledge of reinforcement learning for developing robust control policies. 3) Understanding of drone dynamics and control systems to ensure precise and agile navigation.
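For illustration, a common way to obtain a low-latency input for a learned event-based detector is to bin the event stream into a voxel grid. The sketch below assumes events arrive as (x, y, t, polarity) arrays and uses an assumed 640x480 sensor; the project may equally operate on raw asynchronous streams.

```python
# Build a voxel-grid representation from an event stream (x, y, t, p).
# One common low-latency input for learned event-based detectors;
# array layout and sensor size are illustrative assumptions.
import numpy as np

def events_to_voxel_grid(x, y, t, p, H=480, W=640, bins=5):
    grid = np.zeros((bins, H, W), dtype=np.float32)
    if t.size == 0:
        return grid
    # Normalize timestamps into [0, bins) and accumulate signed polarity.
    tn = (t - t[0]) / max(t[-1] - t[0], 1e-9) * (bins - 1e-6)
    b = tn.astype(np.int64)
    np.add.at(grid, (b, y, x), np.where(p > 0, 1.0, -1.0))
    return grid

# Usage with a synthetic burst of 1000 events:
rng = np.random.default_rng(0)
n = 1000
grid = events_to_voxel_grid(rng.integers(0, 640, n), rng.integers(0, 480, n),
                            np.sort(rng.random(n)), rng.integers(0, 2, n))
print(grid.shape, grid.sum())
```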
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master) to: Yunfan Ren [yunfan@ifi.uzh.ch], Jiaxu Xing [jixing@ifi.uzh.ch], Prof. Davide Scaramuzza [sdavide@ifi.uzh.ch]
Thesis Type: Semester Project / Master Thesis
Vision-Based Agile Aerial Transportation - Available
Description: Transporting loads with drones is often constrained by traditional control systems that rely on predefined flight paths, GPS, or external motion capture systems. These methods limit a drone's adaptability and responsiveness, particularly in dynamic or cluttered environments. Vision-based control has the potential to revolutionize aerial transportation by enabling drones to perceive and respond to their surroundings in real-time. Imagine a drone that can swiftly navigate through complex environments and deliver payloads with precision using only onboard vision sensors. Applicants are expected to be proficient in Python, C++, and Git.
Goal: This project aims to develop a vision-based control system for drones capable of agile and efficient aerial transportation. The system will leverage real-time visual input to dynamically adapt to environmental conditions, navigate obstacles, and manage load variations with reinforcement or imitation learning.
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Ismail Geles [geles (at) ifi (dot) uzh (dot) ch], Leonard Bauersfeld [bauersfeld (at) ifi (dot) uzh (dot) ch], Angel Romero [roagui (at) ifi (dot) uzh (dot) ch]
Thesis Type: Master Thesis
Event-based Object Segmentation for Vision-Guided Autonomous Systems - Available
Description: Event cameras offer groundbreaking advantages: microsecond latency, high dynamic range, and sparse asynchronous output, which make them ideal for challenging perception tasks where traditional frame cameras falter (motion blur, low light, high contrast). While much of the current event-based research focuses on reconstruction, VO, or representation learning, precise object segmentation remains largely unexplored. Segmenting dynamic objects in complex environments is critical for downstream tasks such as agile navigation, collision avoidance, and robust control. By leveraging the temporal resolution and sparsity of event data, we aim to advance object segmentation methods that can outperform frame-based alternatives, especially under fast motion or harsh lighting. The primary goal of this project is to design, implement, and rigorously evaluate an end-to-end pipeline that performs object segmentation using event camera data. Success will be measured by segmentation accuracy, temporal consistency, latency, and robustness compared to conventional frame-based segmentation.
Goal:
- Dataset curation and preprocessing: Collect or simulate paired event streams and segmentation annotations (using existing datasets such as DSEC, EED, or custom setups). Align event data with ground-truth masks via synchronous frame capture or semi-synthetic generation.
- Method development: Develop or adapt deep learning architectures for segmentation on event data, e.g., spiking networks, graph neural nets, or sparse convolutional networks operating on event voxel grids. Investigate hybrid methods that combine event streams with RGB frames to improve segmentation under challenging conditions.
- Performance evaluation: Quantitatively evaluate segmentation quality using metrics such as IoU, precision, recall, and temporal stability. Benchmark latency and computational efficiency, considering real-time deployment constraints.
- Robustness testing: Test across scenarios with dynamic lighting (HDR), rapid motion, motion blur, and occlusions. Compare performance to frame-based segmentation baselines.
- Optional real-time deployment: Integrate the segmentation pipeline onboard a drone or robot. Demonstrate real-time perception on dynamic tasks like object avoidance, tracking, or mapping.
**Key Requirements**
1. Strong programming skills in Python; proficiency with PyTorch or similar deep learning frameworks.
2. Background in computer vision and deep learning, ideally with some experience in event-based vision.
3. Familiarity with segmentation tasks and performance evaluation. C++ knowledge for real-time/deployment aspects, ROS familiarity, and hardware deployment experience are a plus.
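As a concrete reference for the evaluation criteria above, here is a minimal sketch of IoU and a naive temporal-consistency score between consecutive predicted masks; the no-motion assumption between consecutive masks is a simplification that a motion-compensated variant would drop.

```python
# Sketch of the core segmentation metrics named above: IoU and a simple
# temporal-consistency score between consecutive predicted masks.
import numpy as np

def iou(pred, gt):
    # pred, gt: boolean masks of identical shape.
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union > 0 else 1.0

def temporal_consistency(masks):
    # Mean IoU between consecutive masks; assumes little inter-step motion,
    # which is plausible at event-camera rates. A motion-compensated variant
    # would warp masks before comparison.
    return float(np.mean([iou(a, b) for a, b in zip(masks[:-1], masks[1:])]))

m1 = np.zeros((64, 64), bool); m1[10:30, 10:30] = True
m2 = np.zeros((64, 64), bool); m2[12:32, 10:30] = True
print(iou(m1, m2), temporal_consistency([m1, m2, m2]))
```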
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master) to: Roberto Pellerito [rpellerito@ifi.uzh.ch], Rong Zou [zou@ifi.uzh.ch], Prof. Davide Scaramuzza [sdavide@ifi.uzh.ch]
Thesis Type: Semester Project / Master Thesis
Novel Learning Paradigms for Low-latency and Efficient Vision - Available
Description: Event cameras offer remarkable advantages, including ultra-high temporal resolution in the microsecond range, immunity to motion blur, and the ability to capture high-speed phenomena (https://youtu.be/AsRKQRWHbVs). These features make event cameras invaluable for applications like autonomous driving. However, efficiently processing the sparse event streams while maintaining low latency remains a difficult challenge. Previous research has focused on developing sparse update frameworks for event-based neural networks to reduce computational complexity, i.e., FLOPs. This project takes the next step by directly lowering the processing runtime to unlock the full potential of event cameras for real-time applications.
Goal: The focus of the project is to reduce runtime on commodity hardware (GPUs), which is highly optimized for parallelization. The project will explore fundamentally new processing paradigms, which can potentially be transferred to standard frames. This ambitious project requires a strong sense of curiosity, self-motivation, and a principled approach to tackling research challenges. You should have solid Python programming skills and experience with at least one deep learning framework. If you’re excited about exploring cutting-edge techniques to push the boundaries, please feel free to contact us. **Key Requirements** 1. Background in deep learning: proficiency in Python and familiarity with state-of-the-art deep learning frameworks. 2. Problem-solving skills: ability to approach research problems in a principled way.
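Since the project targets runtime rather than FLOPs, careful GPU timing matters. A minimal benchmarking sketch using CUDA events (the small network below is a stand-in for any event-based model, and a CUDA-capable GPU is assumed) might look as follows:

```python
# Measuring wall-clock runtime (not FLOPs) on GPU, as this project targets.
# torch.cuda.Event timing avoids the pitfall of asynchronous kernel launches;
# the model below is a stand-in for any event-based network. Requires a GPU.
import torch

model = torch.nn.Sequential(torch.nn.Conv2d(5, 32, 3, padding=1),
                            torch.nn.ReLU(),
                            torch.nn.Conv2d(32, 1, 1)).cuda().eval()
x = torch.randn(1, 5, 480, 640, device="cuda")

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
with torch.no_grad():
    for _ in range(10):          # warm-up iterations stabilize clocks/caches
        model(x)
    torch.cuda.synchronize()
    start.record()
    for _ in range(100):
        model(x)
    end.record()
    torch.cuda.synchronize()
print("mean runtime: %.3f ms" % (start.elapsed_time(end) / 100))
```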
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master) to: Roberto Pellerito [rpellerito@ifi.uzh.ch], Nikola Zubic [zubic@ifi.uzh.ch], Prof. Davide Scaramuzza [sdavide@ifi.uzh.ch]
Thesis Type: Semester Project / Master Thesis
Vision-Language-Action Models for Drones - Available
Description: Designing a generative model for drone flight paths that obey high-level natural language descriptions (e.g., “fast 8 shape”) and pass through specific waypoints is a challenging multi-modal problem. The goal is to produce plausible 3D trajectories offline, which can then be used to train reinforcement learning (RL) policies by imitation or as goal demonstrations. Key challenges include: (1) combining textual commands with spatial constraints (control points) in a single model, (2) obtaining training data of trajectories paired with language descriptions (potentially from videos), and (3) ensuring generated paths are physically feasible and capture the qualitative style indicated by the language (e.g., *“fast”* implies higher speed, *“8 shape”* implies a figure-eight loop). In this project we will explore suitable model architectures, data sources, trajectory extraction methods, and relevant research, and propose a system design to tackle this text-to-trajectory generation task.
Goal: The goal of this project is to design a generative model that can translate high-level natural language descriptions and waypoint constraints into physically feasible 3D drone trajectories, which can then be used to train or guide reinforcement learning policies for autonomous flight. **Key Requirements** 1. Strong programming skills in Python; proficiency with PyTorch or similar deep learning frameworks. 2. Strong background in computer vision, deep learning, reinforcement learning, MPC, and classical control. 3. Understanding of drone dynamics and control systems.
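As a structural illustration only, the sketch below shows one possible interface for such a model: a decoder conditioned on a text embedding and waypoint constraints that outputs a position trajectory. All dimensions are assumptions, and the actual architecture (diffusion, autoregressive, etc.) is part of the project.

```python
# Minimal interface sketch for the text-plus-waypoints -> trajectory model.
# The text encoder is abstracted as a fixed-size embedding; in practice it
# could come from a pre-trained language model. All sizes are assumptions.
import torch
import torch.nn as nn

class TrajectoryDecoder(nn.Module):
    def __init__(self, text_dim=512, n_waypoints=4, horizon=100):
        super().__init__()
        self.horizon = horizon
        in_dim = text_dim + n_waypoints * 3
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, horizon * 3),  # T x (x, y, z) positions
        )

    def forward(self, text_emb, waypoints):
        z = torch.cat([text_emb, waypoints.flatten(1)], dim=1)
        return self.net(z).view(-1, self.horizon, 3)

model = TrajectoryDecoder()
traj = model(torch.randn(2, 512), torch.randn(2, 4, 3))
print(traj.shape)  # (2, 100, 3); feasibility/style losses would be added on top
```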
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master) to: Roberto Pellerito [rpellerito@ifi.uzh.ch], Daniel Zhai [dzhai@ifi.uzh.ch], Prof. Davide Scaramuzza [sdavide@ifi.uzh.ch]
Thesis Type: Semester Project / Master Thesis
Event-Based Tactile Sensor for Humanoid Robotic Hands - Available
Description: Humanoid robots are rapidly advancing, with dexterous hand manipulation emerging as a key research frontier. These systems currently rely primarily on vision-based perception for manipulation. However, such approaches face limitations in scenarios where the line of sight is blocked, or when precise force control is critical for stable manipulation. To enable fine-grained and robust manipulation, tactile sensing at multiple points on the fingertip and palm is fundamental. Several tactile sensing strategies exist, but vision-based tactile sensors stand out due to their compactness, low cost, and high spatial resolution. Their performance, however, is limited by the camera bandwidth and power consumption.
Goal: This project proposes the development of a novel event-based tactile sensor, replacing conventional cameras with event cameras. This approach leverages the asynchronous, high-bandwidth, and low-power properties of event-based vision to provide real-time, high-resolution tactile feedback. The ultimate goal is to integrate these sensors into a human-scale robotic hand and validate their effectiveness in dexterous manipulation tasks. The project will focus on building a method for estimating force from videos of a deformable material. **Key Requirements** We are looking for a highly skilled student with: 1. A background in mechatronics, robotics, or a related field. 2. A strong interest in tactile sensing and perception. 3. Strong experience with deep learning, in particular CNNs, RNNs, GNNs, and Transformers. 4. Strong experience with sensor characterization. 5. Basic knowledge of event-based vision and tactile sensors is a plus.
Contact Details: If you are interested in working on cutting-edge tactile sensing technologies and contributing to the future of humanoid robotics, please contact us with your CV, your Bachelor's and Master's transcripts, and a short motivational introduction: Roberto Pellerito (rpellerito@ifi.uzh.ch), Jaehoon Kim (jaehoon.kim@srl.ethz.ch), Prof. Davide Scaramuzza (sdavide@ifi.uzh.ch)
Thesis Type: Semester Project / Master Thesis
Neural Vision for Celestial Landings (in collaboration with European Space Agency) - Available
Description: Event-based cameras offer significant benefits in difficult robotic scenarios characterized by high-dynamic range and rapid motion. These are precisely the challenges faced by spacecraft during landings on celestial bodies like Mars or the Moon, where sudden light changes, fast dynamics relative to the surface, and the need for quick reaction times can overwhelm vision-based navigation systems relying on standard cameras. In this work, we aim to design novel spacecraft navigation methods for the descent and landing phases, exploiting the power efficiency and sparsity of event cameras. Particular effort will be dedicated to developing a lightweight frontend, utilizing asynchronous convolutional and graph neural networks to effectively harness the sparsity of event data, ensuring efficient and reliable processing during these critical phases. The project is in collaboration with the European Space Agency at the European Space Research and Technology Centre (ESTEC) in Noordwijk (NL).
Goal: Investigate the use of asynchronous neural networks (either regular or spiking) for building an efficient frontend capable of processing event-based data in real time. Experiments will be conducted both on pre-recorded datasets and on data collected during the project. **Key Requirements** We are looking for students with strong programming (Python/C++) and computer vision backgrounds. Knowledge of machine learning frameworks (PyTorch, TensorFlow) is required, as well as familiarity with visual odometry, SLAM, and feature tracking. Previous experience with IMUs is a plus.
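As an illustration of the data structure an asynchronous graph network consumes, the sketch below builds a spatiotemporal k-NN graph over events with a KD-tree; the space/time scaling factor is an assumption to be tuned.

```python
# Sketch of building a spatiotemporal k-NN graph from events, the input
# structure a graph neural network frontend would consume. The time-scale
# factor trading off spatial vs. temporal proximity is an assumption.
import numpy as np
from scipy.spatial import cKDTree

def event_knn_graph(x, y, t, k=8, time_scale=1e4):
    # Embed events in (x, y, scaled t) so the KD-tree metric mixes space/time.
    pts = np.stack([x, y, t * time_scale], axis=1).astype(np.float64)
    tree = cKDTree(pts)
    _, idx = tree.query(pts, k=k + 1)      # first neighbor is the point itself
    src = np.repeat(np.arange(len(x)), k)
    dst = idx[:, 1:].reshape(-1)
    return np.stack([src, dst])            # 2 x (N*k) edge index

n = 2000
rng = np.random.default_rng(1)
edges = event_knn_graph(rng.integers(0, 640, n), rng.integers(0, 480, n),
                        np.sort(rng.random(n)))
print(edges.shape)
```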
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master) to: Roberto Pellerito (rpellerito@ifi.uzh.ch), Simone Nascivera (snascivera@ifi.uzh.ch) and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Master Thesis
Learning Perception-Aware Navigation Utilizing MPC Layers - Available
Description: Safe and efficient navigation in challenging environments remains a fundamental problem in robotics. Reinforcement learning (RL) methods can learn complex navigation policies but often lack robustness when faced with perception errors or localization drift. On the other hand, model predictive control (MPC) offers strong guarantees in trajectory optimization but struggles with long-term adaptability in uncertain conditions. This project aims to combine the strengths of both approaches by embedding an MPC layer within an RL policy, while explicitly incorporating uncertainty-aware localization. By leveraging uncertainty estimates from perception modules, the navigation system can reason about when and how to rely on MPC corrections, enabling more reliable decision-making in environments with noise, dynamic obstacles, or degraded sensor data. The outcome is a perception-aware navigation framework that adapts to uncertainty while maintaining efficiency and safety.
Goal: The goal of this project is to design and evaluate an RL-based navigation policy augmented with an MPC layer and guided by uncertainty-aware localization. The system will be benchmarked in simulated and/or real-world environments with challenging perception conditions, focusing on metrics such as navigation success rate, robustness to sensor noise, and safety in obstacle-dense scenarios.
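To illustrate the intended structure, the sketch below embeds the simplest possible MPC layer (a finite-horizon LQR on a toy double integrator) behind a policy-proposed reference, with an assumed uncertainty-dependent weighting; the real project would use a full drone model and a learned policy.

```python
# Structural sketch: an RL policy proposes a reference state; an embedded
# finite-horizon LQR (the simplest MPC layer) computes the tracking input.
# A learned uncertainty estimate scales Q so the system trusts perception
# less when localization degrades. All models and gains are placeholders.
import numpy as np

def finite_horizon_lqr(A, B, Q, R, N=20):
    P, K = Q, None
    for _ in range(N):  # backward Riccati recursion
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K  # first-step gain after N backward steps

A = np.array([[1.0, 0.1], [0.0, 1.0]])     # double-integrator position/velocity
B = np.array([[0.005], [0.1]])
R = np.array([[0.1]])

def mpc_layer(state, reference, position_std):
    # Down-weight tracking accuracy when localization uncertainty is high.
    Q = np.diag([1.0 / (1.0 + position_std**2), 0.1])
    K = finite_horizon_lqr(A, B, Q, R)
    return -K @ (state - reference)

u = mpc_layer(np.array([0.0, 0.0]), np.array([1.0, 0.0]), position_std=0.5)
print(u)
```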
Contact Details: Interested candidates should send their CV and transcripts (bachelor’s and master’s) to Rudolf Reiter (rreiter@ifi.uzh.ch), Simone Nascivera (snascivera@ifi.uzh.ch)
Thesis Type: Master Thesis
Vision-Based World Models for Real-Time Robot Control - Available
Description: This master's project focuses on enabling real-time, vision-based control for quadrotors by distilling large, complex world models into lightweight versions suitable for deployment on resource-constrained platforms. The goal is to achieve fast, efficient inference from camera inputs, supporting agile indoor navigation in previously unseen environments. Large-scale vision models capable of generating and understanding complex scenes are typically too computationally intensive for onboard use. This project addresses that challenge by applying model distillation techniques to transfer knowledge from a pre-trained, high-capacity model to a smaller, faster one. The distilled models will be deployed on quadrotors to evaluate real-world performance, focusing on latency, energy consumption, and navigation success. Beyond standard RGB input, the project will also investigate using additional visual modalities like depth and semantic segmentation to enhance control capabilities. The work will follow a structured timeline, starting with a literature review and dataset setup, moving through distillation and model optimization, and ending with deployment and testing. This project is an excellent fit for students interested in robotics, computer vision, and efficient deep learning, and it offers the chance to contribute to the future of responsive, autonomous robotic systems.
**Applicant Requirements:**
- Proficiency in reinforcement learning, robotics, and computer vision
- Strong programming skills in Python
- First experience with large neural network world models
- Knowledge of simulation software and real-time data processing
- Understanding of drone dynamics and control systems
Goal: Investigate model distillation techniques and their application to vision-based world models for deployment on navigation tasks.
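The core distillation objective is standard and compact; below is a minimal sketch of response-based distillation, where the placeholder logits stand in for the outputs of the large world model (teacher) and its lightweight counterpart (student).

```python
# Core of response-based knowledge distillation: the student matches the
# teacher's softened output distribution. Teacher/student logits below are
# placeholders for the large world model and its lightweight counterpart.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence on temperature-softened distributions; the T^2 factor
    # keeps gradient magnitudes comparable across temperatures.
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

s = torch.randn(8, 128, requires_grad=True)   # student outputs
t = torch.randn(8, 128)                       # frozen teacher outputs
loss = distillation_loss(s, t)
loss.backward()
print(float(loss))
```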
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Rudolf Reiter (rreiter AT ifi DOT uzh DOT ch) and Daniel Zhai (dzhai (at) ifi (dot) uzh (dot) ch).
Thesis Type: Master Thesis
Vision-Based Reinforcement Learning in the Real World - Available
Description: This master's project offers an exciting opportunity to work on real-world vision-based drone flight without relying on simulators. The goal is to develop learning algorithms that enable quadrotors to fly autonomously using visual input, learned directly from real-world experience. By avoiding simulation, this approach opens up new possibilities for the future of robotics. A significant focus of the project is achieving high sample efficiency and designing a robust safety framework that enables effective exploration by leveraging the latest research results on optimization layers within RL policies. The project will begin with state-based learning as an intermediate step, progressing toward complete vision-based learning. It builds on recent research advances and a well-established drone navigation and control software stack. The lab provides access to multiple vision-capable quadrotors ready for immediate use. This project is ideal for outstanding master’s students interested in robotics, learning systems, and real-world deployment. It offers a rare chance to contribute to a high-impact area at the intersection of machine learning, control, and computer vision, with strong potential for further academic or industrial opportunities. Applicants should have proficiency in computer vision, reinforcement learning, and robotics, as well as strong programming skills in Python and C++. Initial experience with large neural network world models is expected, as well as familiarity with simulation software and real-time data processing. A solid understanding of drone dynamics and control systems is also essential.
Goal: The goal is to investigate how the latest optimization-based reinforcement learning advances push the limits of learning real-world tasks such as agile vision-based flight.
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Rudolf Reiter (rreiter AT ifi DOT uzh DOT ch) and Ismail Geles (geles AT ifi DOT uzh DOT ch)
Thesis Type: Master Thesis
Event-Guided 3D Gaussian Splatting for HDR Reconstruction and Relighting - Available
Description: High dynamic range (HDR) reconstruction and photorealistic relighting remain major challenges in computer vision, especially in scenarios with limited lighting, strong contrast, or degraded frame data. Conventional multi-view pipelines struggle in such conditions, often producing artifacts, noisy reconstructions, or incomplete relighting. Event cameras, with their microsecond latency and high dynamic range sensitivity, provide complementary information that can significantly improve robustness in these scenarios. This project aims to integrate asynchronous event streams into the emerging paradigm of 3D Gaussian Splatting, exploring how temporal event cues can enhance scene reconstruction and appearance modeling. By fusing event-driven information with sparse or noisy frame captures, the goal is to generate more faithful HDR reconstructions and enable realistic relighting from novel viewpoints, even under challenging low-light or extreme dynamic range conditions.
Goal: The primary goal of this project is to design a hybrid reconstruction pipeline that leverages event streams for HDR-aware 3D Gaussian Splatting. Evaluation will be conducted on real and synthetic multi-view event datasets, focusing on reconstruction accuracy, visual realism, and robustness to degraded frame data.
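One plausible coupling between events and the renderer is the standard event generation model, which relates log-intensity differences to accumulated event polarities. The sketch below shows a loss of that form; the contrast threshold C is an assumed camera constant, and the rendered intensities are placeholders for the splatting renderer's output.

```python
# The event generation model links rendered log-intensities to accumulated
# polarities: log I(t2) - log I(t1) ~ C * sum(p). A loss of this form could
# supervise an HDR-aware Gaussian-splatting renderer; C is an assumed
# per-camera contrast threshold.
import torch

def event_supervision_loss(rendered_t1, rendered_t2, polarity_sum, C=0.25):
    # rendered_*: per-pixel intensities from the renderer at two timestamps
    # polarity_sum: per-pixel signed event count between t1 and t2
    log_diff = torch.log(rendered_t2 + 1e-6) - torch.log(rendered_t1 + 1e-6)
    return ((log_diff - C * polarity_sum) ** 2).mean()

i1 = torch.rand(480, 640, requires_grad=True)
i2 = torch.rand(480, 640)
ev = torch.randint(-3, 4, (480, 640)).float()
event_supervision_loss(i1, i2, ev).backward()
```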
Contact Details: Interested candidates should send their CV and transcripts (bachelor’s and master’s) to Rong Zou (zou@ifi.uzh.ch), Nikola Zubic (zubic@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Master Thesis
Reflection and Ghosting Removal with Event Streams - Available
Description: Capturing clear, artifact-free images through reflective surfaces such as glass, shop windows, or vehicle windshields is a long-standing challenge in computer vision. Standard frame-based methods for reflection removal often fail in realistic settings, where reflections overlap with scene content and vary dynamically with motion or lighting. Event cameras offer a promising solution: their high temporal resolution and asynchronous nature provide unique cues to disambiguate reflections from true scene signals. This project explores how event streams can be combined with standard frames to suppress reflections and ghosting by exploiting differences in motion between direct scene content and reflection artifacts. The aim is to design a practical reflection-removal pipeline that works robustly in unconstrained real-world environments.
Goal: The primary objective is to develop methods that leverage asynchronous temporal cues from events to disentangle direct scene information from reflections, and to benchmark the approach on both synthetic and real datasets captured through reflective surfaces. The effectiveness will be measured in terms of reflection suppression, preservation of true scene details, and computational efficiency.
Contact Details: Interested candidates should send their CV and transcripts (bachelor’s and master’s) to Rong Zou (zou@ifi.uzh.ch), Nikola Zubic (zubic@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Semester Project / Master Thesis
Event-based Occlusion Suppression for Robust Detection and VO in Adverse Weather - Available
Description: Autonomous systems deployed in outdoor environments must operate reliably in the presence of challenging weather conditions such as rain or snow. While event cameras offer resilience to many visual degradations, dynamic particle occlusions can still severely disrupt event-based perception tasks such as object detection and visual odometry (VO). Unlike frame-based systems, which often blur under such conditions, event cameras generate high-frequency spurious activity that can overwhelm downstream models if not properly handled. This project aims to systematically investigate how dynamic occlusions degrade event-based perception pipelines and to develop practical, low-latency suppression techniques that restore reliable operation in adverse environments.
Goal: The main objective of this project is to quantify the effect of dynamic occlusions on the performance of event-based detection and VO algorithms, and design and evaluate event-based robustness strategies against dynamic particle occlusions. Performance will be assessed on simulated and real-world adverse-weather event datasets in terms of detection accuracy, VO robustness, and latency overhead.
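As a simple baseline for the suppression techniques mentioned above, a spatiotemporal correlation filter keeps an event only if a neighboring pixel fired recently; the sketch below is such a filter with assumed window parameters.

```python
# A simple spatiotemporal correlation filter: keep an event only if some
# pixel in its 3x3 neighborhood fired within the last dt seconds. Filters
# of this kind are a common first line of defense against rain/snow noise;
# dt and the neighborhood size are assumptions to tune.
import numpy as np

def correlation_filter(x, y, t, dt=2e-3, H=480, W=640):
    last = np.full((H + 2, W + 2), -np.inf)   # padded last-activity map
    keep = np.zeros(len(x), dtype=bool)
    for i in range(len(x)):
        xi, yi = x[i] + 1, y[i] + 1
        # supported if any neighbor was active within dt
        if t[i] - last[yi-1:yi+2, xi-1:xi+2].max() < dt:
            keep[i] = True
        last[yi, xi] = t[i]
    return keep

n = 5000
rng = np.random.default_rng(2)
keep = correlation_filter(rng.integers(0, 640, n), rng.integers(0, 480, n),
                          np.sort(rng.random(n)))
print("kept %.1f%% of events" % (100 * keep.mean()))
```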
Contact Details: Interested candidates should send their CV and transcripts (bachelor’s and master’s) to Rong Zou (zou@ifi.uzh.ch), Roberto Pellerito (rpellerito@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Semester Project / Master Thesis
Meta Model-Based RL for Adaptive Flight Control - Available
Description: Drone dynamics can change significantly during flight due to variations in load, battery levels, and environmental factors such as wind conditions. These dynamic changes can adversely affect the drone's performance and stability, making it crucial to develop adaptive control strategies. This research project aims to develop and evaluate a meta model-based reinforcement learning (RL) framework to address these variable dynamics. By integrating dynamic models that account for these variations and employing meta-learning techniques, the proposed method seeks to enhance the adaptability and performance of drones in dynamic environments. The project will involve learning dynamic models for the drone, implementing a meta model-based RL framework, and evaluating its performance in both simulated and real-world scenarios, aiming for improved stability, efficiency, and task performance compared to existing RL approaches and traditional control methods. Successful completion of this project will contribute to the advancement of autonomous drone technology, offering robust and efficient solutions for various applications. Applicants are expected to be proficient in Python, C++, and Git.
Goal: Develop methods for meta (model-based) RL to handle variable drone dynamics.
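For illustration, a first-order meta-learning update (Reptile-style, used here as a simple stand-in for the meta model-based RL framework) on a placeholder dynamics model could look like the sketch below; the task sampler emulating a changing load is purely synthetic.

```python
# First-order meta-learning (Reptile-style) sketch for a dynamics model:
# adapt a copy on one sampled flight condition, then move the meta-
# parameters toward the adapted ones. Model and task sampler are toys.
import copy
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 6))  # (state, action) -> state delta

def sample_task_batch(mass_scale):
    # Stand-in for rollouts under perturbed dynamics (e.g., changed load).
    x = torch.randn(256, 10)
    return x, mass_scale * torch.randn(256, 6)

for step in range(100):
    task_model = copy.deepcopy(model)                # inner-loop copy
    opt = torch.optim.SGD(task_model.parameters(), lr=1e-2)
    mass = 0.5 + torch.rand(1).item()                # sample a task
    for _ in range(5):                               # inner adaptation steps
        x, y = sample_task_batch(mass)
        opt.zero_grad()
        nn.functional.mse_loss(task_model(x), y).backward()
        opt.step()
    with torch.no_grad():                            # Reptile outer update
        for p, q in zip(model.parameters(), task_model.parameters()):
            p += 0.1 * (q - p)
```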
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to: Ismail Geles [geles (at) ifi (dot) uzh (dot) ch], Daniel Zhai [dzhai (at) ifi (dot) uzh (dot) ch], Jiaxu Xing [jixing (at) ifi (dot) uzh (dot) ch]
Thesis Type: Master Thesis
Spiking Architectures for Advanced Event-Based Temporal Reasoning - Available
Description: Biological neural systems excel at processing information with remarkable efficiency and robustness, largely due to their reliance on the precise timing and dynamic interplay of neural activity. In contrast, many conventional deep learning architectures simplify these temporal dynamics, often overlooking the rich information embedded in the precise timing of events. Event-based cameras offer a unique data stream that mirrors this biological principle, capturing asynchronous "spikes" of visual information in response to scene changes. This project aims to develop novel spiking neural network (SNN) architectures that harness these inherent characteristics of event data. We propose an approach that emphasizes neuron-level temporal processing. Furthermore, we will investigate how collective spiking synchronization can serve as a powerful latent representation for understanding dynamic scenes and sequential patterns. This paradigm seeks to strike a balance between biological plausibility and computational efficiency, leveraging the sparsity and high temporal resolution of event data to achieve robust and interpretable performance in complex, dynamic environments.
Goal: The primary goal is to design, implement, and rigorously evaluate an SNN architecture capable of advanced temporal reasoning on event-based data. This involves developing methods for individual spiking neurons to effectively process their historical event inputs and exploring how emergent synchronization patterns within the network can represent rich contextual information. The project will involve testing the developed models on challenging event-based vision tasks that require sequential understanding, such as gesture recognition, dynamic object tracking, or agile robot navigation. Performance will be assessed in terms of accuracy, computational efficiency, and robustness to noisy or complex event streams, with comparisons to existing event-based learning paradigms. Applicants should possess a strong background in spiking neural networks, deep learning frameworks (PyTorch), computer vision, and programming proficiency in Python. Experience with event cameras or neuromorphic computing is a significant advantage.
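As a minimal starting point, the forward dynamics of a leaky integrate-and-fire layer (the basic unit such an architecture would build on) can be simulated as below; training would additionally require surrogate gradients for the threshold, and all constants are illustrative.

```python
# Forward simulation of a leaky integrate-and-fire (LIF) layer. Training
# would need a surrogate gradient for the non-differentiable threshold;
# tau and v_th are illustrative constants.
import torch

def lif_forward(inputs, tau=0.9, v_th=1.0):
    # inputs: (T, batch, n) input currents per timestep (e.g., binned events)
    T, B, N = inputs.shape
    v = torch.zeros(B, N)
    spikes = []
    for t in range(T):
        v = tau * v + inputs[t]          # leaky integration
        s = (v >= v_th).float()          # fire on threshold crossing
        v = v * (1.0 - s)                # hard reset after a spike
        spikes.append(s)
    return torch.stack(spikes)           # (T, batch, n) spike trains

out = lif_forward(torch.rand(50, 4, 32) * 0.3)
print("mean firing rate:", float(out.mean()))
```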
Contact Details: Interested candidates should send their CVs and transcripts (bachelor's and master's) to Nikola Zubic (zubic@ifi.uzh.ch), Roberto Pellerito (rpellerito@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Master Thesis
Integrating Event-based Vision Capabilities into LLMs - Available
Description: Recent advancements in Large Language Models (LLMs) have demonstrated their powerful capabilities in understanding and generating human-like text. Integrating visual information into these models opens up new frontiers for multimodal AI. However, conventional vision systems often struggle with high-dynamic range environments and rapid motion, situations where event-based cameras excel due to their asynchronous, sparse data generation. This project aims to integrate event-based vision capabilities directly into LLMs, transforming them into Event-based Multimodal Large Language Models (EMLLMs). By leveraging event-specific, custom-tailored layers, we seek to enable LLMs to process the unique information stream from event cameras, internalizing dynamic scene understanding without the overhead of external vision modules. This approach promises to improve LLMs' ability to interpret fast-moving, high-contrast scenarios, crucial for applications like autonomous navigation and human-robot interaction in challenging real-world conditions.
Goal: The primary goal is to design, implement, and evaluate a novel framework to inject event-based vision capabilities into pre-trained LLMs. This involves developing event-specific, custom-tailored layers and adapting knowledge distillation techniques to transfer event-based priors from specialized event-based neural networks. The project will investigate methods for efficiently processing sparse, asynchronous event data within the LLM architecture and evaluate its performance on tasks requiring understanding of dynamic visual information, comparing it with traditional vision-based or encoder-based multimodal LLMs. Applicants should have a strong background in deep learning, machine learning frameworks (PyTorch, JAX), and programming skills in Python. Experience with event cameras or large language models is beneficial.
Contact Details: Interested candidates should send their CVs and transcripts (bachelor's and master's) to Nikola Zubic (zubic@ifi.uzh.ch), Rong Zou (zou@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Semester Project / Master Thesis
Rethinking RNNs for Neuromorphic Computing and Event-based Vision - Available
Description: While more recent sequence modeling architectures have gained prominence, traditional Recurrent Neural Networks (RNNs), such as LSTMs and GRUs, remain highly effective for tasks requiring strong state-tracking capabilities and continuous temporal reasoning, which are qualities crucial for processing dynamic time-series data. Event-based cameras, which produce sparse, asynchronous data streams in response to scene changes, generate precisely this kind of highly temporal information. However, efficiently processing these event streams with traditional RNNs, especially on resource-constrained platforms or future neuromorphic hardware, presents significant challenges due to their strictly sequential nature and inherent inefficiencies in current hardware implementations. This project aims to rethink the deployment of RNNs for event-based vision by developing hardware-aware optimization strategies. We will explore novel parallelization schemes that can process multiple, smaller hidden states concurrently, analogous to how multi-head mechanisms operate, thereby better utilizing modern parallel computing architectures. Furthermore, we will focus on fine-grained kernel optimization, targeting specific hardware characteristics such as internal cache sizes, memory access patterns, and compute handling, to unlock efficiency and throughput for RNNs processing event data. The ultimate goal is to enable RNNs to leverage the advantages of event-based sensors for real-time, low-latency applications.
Goal: The primary goal of this project is to design, implement, and rigorously evaluate highly optimized RNN architectures tailored for efficient processing of event-based vision data, with a strong focus on their potential for neuromorphic computing and modern GPU hardware. This involves developing custom kernels and optimization techniques that exploit the sparsity and asynchronous nature of event streams, alongside parallelization strategies that significantly accelerate RNN inference. The student will benchmark the developed solutions on representative event-based vision tasks (e.g., object detection, optical flow, motion estimation) to demonstrate substantial improvements in processing speed and computational efficiency compared to standard implementations. Applicants should possess strong programming skills in Python and C++, expertise in deep learning frameworks (e.g., PyTorch, JAX), and a solid understanding of RNN architectures. Experience with hardware-level optimization (CUDA, Triton) or neuromorphic computing concepts is highly advantageous.
Contact Details: Interested candidates should send their CVs and transcripts (bachelor's and master's) to Nikola Zubic (zubic@ifi.uzh.ch), Roberto Pellerito (rpellerito@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Master Thesis
Event Representation Learning for Control with Visual Distractors - Available
Description: Autonomous systems operating in complex, real-world environments often face significant challenges from visual distractors and high-speed dynamics. Traditional frame-based cameras can struggle with motion blur and high-dynamic range, leading to unreliable visual representations for control. Event-based cameras, with their microsecond latency and ability to capture only changes in a scene, offer a promising alternative for robust perception in such demanding scenarios. This project aims to investigate event-based representation learning for control tasks, specifically focusing on environments with static and dynamic visual distractors. Drawing inspiration from benchmarks like the DeepMind Control Vision Benchmark (DMC-VB), we will explore how event data's unique properties (sparsity and high temporal resolution) can be leveraged to learn more robust and efficient control policies. The goal is to develop representations that are inherently less susceptible to visual noise and rapid environmental changes, thereby improving the performance and reliability of autonomous agents.
Goal: The primary goal of this project is to design, implement, and evaluate novel event-based representation learning methods for control tasks, focusing on scenarios with visual distractors. This includes developing techniques to extract meaningful features from sparse event streams that are invariant to static or dynamic visual clutter. The student will work with simulated environments (adapted to generate event data or using existing event-based simulators) to benchmark the developed methods against traditional frame-based approaches. Experiments will cover various locomotion and navigation tasks, assessing the robustness, sample efficiency, and real-time performance of event-based control policies. Applicants should have a strong background in deep learning, reinforcement learning, computer vision, and programming skills in Python (PyTorch/JAX). Experience with event-based vision or control systems is highly beneficial.
Contact Details: Interested candidates should send their CVs and transcripts (bachelor's and master's) to Nikola Zubic (zubic@ifi.uzh.ch), Rong Zou (zou@ifi.uzh.ch), and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Semester Project / Master Thesis
Better Scaling Laws for Neuromorphic Systems - Available
Description: This project explores and extends the novel "deep state-space models" framework by leveraging their transfer function representations. In contrast to time-domain parameterizations (e.g., S4 layers), transfer function parameterization enables direct computation of the model’s corresponding convolutional kernel via a single Fast Fourier Transform. This approach is state-free and, in theory, maintains constant memory and computational overhead regardless of the state size, thereby offering substantial speed and scalability advantages over existing approaches. Building on these promising theoretical results, this project aims to derive better scaling laws for neuromorphic systems by studying and deploying state-free inference in diverse long-sequence and event-based vision applications.
Goal: Implement the transfer function-based state-space model, then comprehensively benchmark its training speed, memory usage, and performance on neuromorphic and event-based vision tasks. Investigate how state-free inference behaves as model size and sequence length grow, deriving empirical or theoretical scaling relationships. Compare this approach with other state-of-the-art methods (e.g., S4, Transformer-based models) in terms of speed, memory footprint, and model accuracy or task performance. Prerequisites include familiarity with the basics of LTI systems and linear ODEs, as well as the Python programming language.
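The central computation is compact enough to sketch: the length-L convolution kernel of a rational transfer function H(z) = b(z^-1)/a(z^-1) follows from one FFT evaluation at the roots of unity, independent of state size. The coefficients below are random placeholders for learned parameters.

```python
# State-free kernel computation from a rational transfer function:
# evaluate H at the L-th roots of unity, inverse-FFT to get the length-L
# convolution kernel, then apply it by FFT convolution. Coefficients are
# random placeholders for learned parameters.
import numpy as np

def kernel_from_transfer_function(b, a, L):
    # H(z) = sum_i b_i z^{-i} / sum_i a_i z^{-i} at z = exp(2j*pi*k/L)
    z_inv = np.exp(-2j * np.pi * np.arange(L) / L)
    num = sum(bi * z_inv**i for i, bi in enumerate(b))
    den = sum(ai * z_inv**i for i, ai in enumerate(a))
    return np.fft.ifft(num / den).real

rng = np.random.default_rng(0)
order, L = 64, 1024
a = np.concatenate([[1.0], 0.01 * rng.standard_normal(order)])  # near-identity denominator
b = rng.standard_normal(order + 1)

k = kernel_from_transfer_function(b, a, L)
u = rng.standard_normal(L)
y = np.fft.irfft(np.fft.rfft(u) * np.fft.rfft(k), n=L)  # circular convolution
print(k.shape, y.shape)
```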
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master) to Nikola Zubic (zubic@ifi.uzh.ch), Marco Cannici (cannici@ifi.uzh.ch) and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Semester Project / Master Thesis
Leveraging Long Sequence Modeling for Drone Racing - Available
Description: Recent advancements in machine learning have highlighted the potential of Long Sequence Modeling as a powerful approach for handling complex temporal dependencies, positioning it as a compelling alternative to traditional Transformer-based models. In the context of drone racing, where split-second decision-making and precise control are paramount, Long Sequence Modeling can offer significant improvements. These models are adept at capturing intricate state dynamics and handling continuous-time parameters, providing the flexibility to adapt to varying time steps essential for high-speed navigation and obstacle avoidance. This project aims to bridge this gap by investigating the application of Long Sequence Modeling techniques in RL to develop advanced autonomous drone racing systems. The ultimate goal is to improve autonomous drones' performance, reliability, and adaptability in competitive racing scenarios.
Goal: Develop a Reinforcement Learning framework based on Long Sequence Modeling tailored for drone racing. Simulate the framework to evaluate its performance in controlled environments. Conduct a comprehensive analysis of the framework’s effectiveness in handling long sequences and dynamic racing scenarios. Ideally, the optimized model should be deployed in real-world drone racing settings to validate its practical applicability and performance.
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master) to Nikola Zubic (zubic@ifi.uzh.ch), Angel Romero Aguilar (roagui@ifi.uzh.ch) and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Master Thesis
Neural Architecture Knowledge Transfer for Event-based Vision - Available
Description: Processing the sparse and asynchronous data from event-based cameras presents significant challenges. Transformer-based models have achieved remarkable results in sequence modeling tasks, including event-based vision, due to their powerful representation capabilities. Despite their success, their high computational complexity and memory demands make them impractical for deployment on resource-constrained devices typical in real-world applications. Recent advancements in efficient sequence modeling architectures offer promising alternatives that provide competitive performance with significantly reduced computational overhead. Recognizing that Transformers already demonstrate strong performance on event-based vision tasks, we aim to leverage their strengths while addressing efficiency concerns.
Goal: Study techniques for transferring knowledge from complex Transformer models to simpler, more efficient architectures. Test the developed models on benchmark event-based vision tasks such as object recognition, optical flow estimation, and SLAM.
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master) to Nikola Zubic (zubic@ifi.uzh.ch), Giovanni Cioffi (cioffi@ifi.uzh.ch) and Davide Scaramuzza (sdavide@ifi.uzh.ch).
Thesis Type: Master Thesis
Automatic Failure Detection for Drones - Available
Description: Automatic failure detection is an essential topic for aerial robots, as even small failures can lead to catastrophic crashes. Classical methods in fault detection typically use a system model as a reference and check that the observed system dynamics are within a certain error margin. In this project, we want to explore sequence modeling as an alternative approach that feeds all available sensor data into a neural network. The network will be pre-trained on simulation data and fine-tuned on real-world flight data. Such a machine learning-based approach has significant potential because neural networks are very good at picking up patterns in the data that are hidden/invisible to hand-crafted detection algorithms.
Goal: The goal of the project is to develop a method that can automatically detect the health status of a drone from minimal flight data, such as a take-off or a short 'check' maneuver.
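Structurally, the sequence-model approach could be as simple as the sketch below: a GRU over a window of raw sensor channels producing a health probability. The channel count, window length, and labels are assumptions; the project would train on simulated data and fine-tune on real flight data.

```python
# Sketch of the sequence-model approach: a GRU consumes a window of raw
# sensor channels and outputs a health probability. Channel count, window
# length, and the training pipeline are assumptions.
import torch
import torch.nn as nn

class HealthClassifier(nn.Module):
    def __init__(self, n_channels=16, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(n_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):            # x: (batch, time, channels) of IMU/motor data
        _, h = self.rnn(x)
        return torch.sigmoid(self.head(h[-1]))  # P(healthy)

model = HealthClassifier()
window = torch.randn(8, 400, 16)     # e.g., 2 s of sensor data at 200 Hz
print(model(window).squeeze(-1))     # trained with BCE on sim + real flight labels
```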
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Leonard Bauersfeld [bauersfeld (at) ifi (dot) uzh (dot) ch], Ismail Geles [geles (at) ifi (dot) uzh (dot) ch], and Davide Scaramuzza (sdavide (at) ifi (dot) uzh (dot) ch).
Thesis Type: Semester Project / Master Thesis
Learning Robust Agile Flight via Adaptive Curriculum - Available
Description: Reinforcement learning-based controllers have demonstrated remarkable success in enabling fast and agile flight. Currently, the training process of these reinforcement learning controllers relies on a static, pre-defined curriculum. In this project, our objective is to develop a dynamic and adaptable curriculum to enhance the robustness of the learning-based controllers. This curriculum will continually adapt in an online fashion based on the controller's performance during the training process. By using the adaptive curriculum, we expect the reinforcement learning controllers to enable more diverse, generalizable, and robust performance in unforeseen scenarios. Applicants should have a solid understanding of reinforcement learning, machine learning experience (PyTorch), and programming experience in C++ and Python.
Goal: Improve the robustness and generalizability of the training framework and validate the method in different navigation task settings. The approach will be demonstrated and validated both in simulated and real-world settings.
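In its most basic form, an online curriculum update adjusts task difficulty toward a target success rate, as in the sketch below; the gain, target, and the meaning of the difficulty scalar (gate speed, wind strength, mass offset) are illustrative assumptions.

```python
# Minimal form of an online curriculum update: widen the randomization
# range when the controller succeeds, narrow it when it fails. Gain and
# target success rate are illustrative assumptions.
def update_difficulty(difficulty, success_rate, target=0.7, gain=0.05,
                      lo=0.0, hi=1.0):
    """difficulty scales, e.g., gate speed, wind strength, or mass offset."""
    difficulty += gain * (success_rate - target)  # harder when above target
    return min(max(difficulty, lo), hi)

d = 0.2
for rate in [0.9, 0.9, 0.5, 0.8]:    # measured over recent training episodes
    d = update_difficulty(d, rate)
    print(round(d, 3))
```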
Contact Details: Jiaxu Xing (jixing@ifi.uzh.ch), Nico Messikommer (nmessi@ifi.uzh.ch)
Thesis Type: Semester Project / Master Thesis
Vision-based Navigation in Dynamic Environment via Reinforcement Learning - Available
Description: In this project, the goal is to develop a vision-based policy that enables autonomous navigation in complex, cluttered environments. The learned policy should enable the robot to effectively reach a designated target based on visual input while safely avoiding encountered obstacles. Some of the use cases for this approach will be to ensure a safe landing on a moving target in a cluttered environment or to track a moving target in the wild. Applicants should have a solid understanding of reinforcement learning, machine learning experience (PyTorch), and programming experience in C++ and Python.
Goal: Develop such a policy based on an existing reinforcement learning pipeline. Extend the training environment to match the task definition. The approach will be demonstrated and validated both in simulated and real-world settings.
Contact Details: Jiaxu Xing (jixing@ifi.uzh.ch), Leonard Bauersfeld (bauersfeld@ifi.uzh.ch)
Thesis Type: Master Thesis
Learning Rapid UAV Exploration with Foundation Models - Available
Description: In this project, our objective is to efficiently explore unknown indoor environments using UAVs. Recent research has demonstrated significant success in integrating foundation models with robotic systems. Leveraging these foundation models, the drone will employ semantic relationships learned from large-scale real-world data to actively explore and navigate through unknown environments. While most prior research has focused on ground-based robots, this project aims to investigate the potential of integrating foundation models with aerial robots to introduce more agility and flexibility. Applicants should have a solid understanding of mobile robot navigation, machine learning experience (PyTorch), and programming experience in C++ and Python.
Goal: Develop such a framework in simulation and conduct a comprehensive evaluation and analysis. If feasible, deploy such a model in a real-world environment.
Contact Details: Jiaxu Xing (jixing@ifi.uzh.ch), Nico Messikommer (nmessi@ifi.uzh.ch)
Thesis Type: Semester Project / Master Thesis
Energy-Efficient Path Planning for Autonomous Quadrotors in Inspection Tasks - Available
Description: Autonomous quadrotors are increasingly used in inspection tasks, where flight time is often limited by battery capacity. In these operations, reducing energy consumption is essential, especially when quadrotors must navigate complex paths near inspection targets. Traditional path planning methods often overlook energy costs, which limits their effectiveness in real-world applications. This project aims to explore and evaluate state-of-the-art path planning approaches that incorporate energy efficiency into trajectory optimization. Various planning techniques will be tested to identify the most suitable methods for minimizing energy consumption, ensuring smooth navigation, and maximizing inspection coverage within a single battery charge. Strong programming skills in Python/C++ and a background in robotics or autonomous systems are required. Experience in motion planning, machine learning, or energy modeling is beneficial but not essential.
Goal: The goal of this project is to develop, implement, and test an energy-efficient waypoint path planning method that improves quadrotor endurance in inspection tasks, maximizing inspection coverage within a single battery cycle.
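To make the energy-aware planning idea concrete, the sketch below greedily orders waypoints under a deliberately crude power model (hover power plus a speed-dependent term); the project would substitute an identified or learned energy model and a proper optimizer.

```python
# Greedy energy-aware waypoint ordering sketch. The energy model is a
# deliberately crude assumption (constant hover power plus a quadratic
# speed term); a real pipeline would use an identified/learned model
# and a stronger combinatorial optimizer than nearest-neighbor.
import numpy as np

def edge_energy(p, q, speed=3.0, hover_power=150.0, drag_coeff=2.0):
    dist = np.linalg.norm(q - p)
    time = dist / speed
    return (hover_power + drag_coeff * speed**2) * time   # Joules (toy model)

def greedy_tour(waypoints, start):
    order, pos, remaining = [], start, list(range(len(waypoints)))
    while remaining:
        i = min(remaining, key=lambda j: edge_energy(pos, waypoints[j]))
        order.append(i)
        pos = waypoints[i]
        remaining.remove(i)
    return order

wps = np.random.default_rng(3).uniform(0, 20, size=(10, 3))
print(greedy_tour(wps, start=np.zeros(3)))
```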
Contact Details: Leonard Bauersfeld (bauersfeld AT ifi DOT uzh DOT ch), Rudolf Reiter (rreiter AT ifi DOT uzh DOT ch)
Thesis Type: Semester Project / Master Thesis
Event-based Particle Image Velocimetry - Available
Description: When drones are operated in industrial environments, they are often flown in close proximity to large structures, such as bridges, buildings or ballast tanks. In those applications, the interactions of the induced flow produced by the drone’s propellers with the surrounding structures are significant and pose challenges to the stability and control of the vehicle. A common methodology to measure the airflow is particle image velocimetry (PIV). Here, smoke and small particles suspended in the surrounding air are tracked to estimate the flow field. In this project, we aim to leverage the high temporal resolution of event cameras to perform smoke-PIV, overcoming the main limitation of frame-based cameras in PIV setups. Applicants should have knowledge of machine learning and programming experience with Python and C++. Experience in fluid mechanics is beneficial but not a requirement.
Goal: The goal of the project is to develop and successfully demonstrate a PIV method in the real world.
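The core of correlation-based PIV is small enough to sketch: the peak of the cross-correlation between consecutive accumulated event-count windows gives the local flow displacement. The window size and the synthetic shift below are illustrative.

```python
# Core of correlation-based PIV on two accumulated event-count images:
# the cross-correlation peak of an interrogation window between
# consecutive accumulations gives the local particle displacement.
import numpy as np
from scipy.signal import fftconvolve

def window_displacement(w1, w2):
    w1 = w1 - w1.mean()
    w2 = w2 - w2.mean()
    corr = fftconvolve(w2, w1[::-1, ::-1], mode="same")   # cross-correlation
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dy - w1.shape[0] // 2, dx - w1.shape[1] // 2

rng = np.random.default_rng(4)
frame1 = rng.poisson(0.5, (32, 32)).astype(float)         # accumulated events
frame2 = np.roll(frame1, shift=(2, 3), axis=(0, 1))       # particles moved by (2, 3)
print(window_displacement(frame1, frame2))                # expect approx (2, 3)
```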
Contact Details: Leonard Bauersfeld (bauersfeld@ifi.uzh.ch), Koen Muller (kmuller@ethz.ch)
Thesis Type: Semester Project / Master Thesis
Fine-tuning Policies in the Real World with Reinforcement Learning - Available
Description: Training sub-optimal policies is relatively straightforward and provides a solid foundation for reinforcement learning (RL) agents. However, such policies typically cannot improve online in the real world, for example when racing drones with RL. Current methods fall short in enabling drones to adapt and optimize their performance during deployment. Imagine a drone equipped with an initial sub-optimal policy that can navigate a race course, but not with maximum efficiency. As the drone races, it learns to optimize its maneuvers in real time, becoming faster and more agile with each lap. Applicants are expected to be proficient in Python, C++, and Git.
Goal: This project aims to explore online fine-tuning in the real world of sub-optimal policies using RL, allowing racing drones to improve continuously through real-world interactions.
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Ismail Geles [geles (at) ifi (dot) uzh (dot) ch], Rudolf Reiter [rreiter (at) ifi (dot) uzh (dot) ch], Jiaxu Xing [jixing (at) ifi (dot) uzh (dot) ch]
Thesis Type: Semester Project / Master Thesis
Inverse Reinforcement Learning from Expert Pilots - Available
Description: Drone racing demands split-second decisions and precise maneuvers. However, training drones for such races relies heavily on crafted reward functions. These methods require significant human effort in design choices and limit the flexibility of learned behaviors. Inverse Reinforcement Learning (IRL) offers a promising alternative. IRL allows an AI agent to learn a reward function by observing expert demonstrations. Imagine an AI agent analyzing recordings of champion drone pilots navigating challenging race courses. Through IRL, the agent can infer the implicit factors that contribute to success in drone racing, such as speed and agility. Applicants are expected to be proficient in Python, C++, and Git.
Goal: We want to explore the application of Inverse Reinforcement Learning (IRL) for training RL agents performing drone races or FPV freestyle to develop methods that extract valuable knowledge from the actions and implicit understanding of expert pilots. This knowledge will then be translated into a robust reward function suitable for autonomous drone flights.
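As a structural sketch of the IRL idea (in the spirit of feature-matching apprenticeship learning, used here as a simple stand-in for the method to be developed), reward weights can be pushed toward the feature expectations the expert exhibits; the feature definitions and the inner RL step are placeholders.

```python
# Skeleton of feature-matching IRL: assume the reward is linear in
# trajectory features and push the weights toward expert feature
# expectations. Features and the inner RL retraining step are placeholders.
import numpy as np

def trajectory_features(traj):
    # Illustrative features: mean speed, mean acceleration magnitude.
    vel = np.diff(traj, axis=0)
    acc = np.diff(vel, axis=0)
    return np.array([np.linalg.norm(vel, axis=1).mean(),
                     np.linalg.norm(acc, axis=1).mean()])

def irl_step(w, expert_traj, agent_traj, lr=0.1):
    # Move reward weights toward features the expert exhibits more
    # strongly than the current agent (max-margin-style update).
    grad = trajectory_features(expert_traj) - trajectory_features(agent_traj)
    return w + lr * grad

w = np.zeros(2)
expert = np.cumsum(np.random.default_rng(5).normal(0, 1.0, (100, 3)), axis=0)
agent = np.cumsum(np.random.default_rng(6).normal(0, 0.3, (100, 3)), axis=0)
for _ in range(3):
    w = irl_step(w, expert, agent)   # in practice: retrain the RL agent with reward w @ phi
print(w)
```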
Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to: Ismail Geles [geles (at) ifi (dot) uzh (dot) ch], Leonard Bauersfeld [bauersfeld (at) ifi (dot) uzh (dot) ch]
Thesis Type: Semester Project / Master Thesis