Student Projects


How to apply

To apply, please send your CV and your MSc and BSc transcripts by email to all the contacts indicated below the project description. Do not apply on SiROP. Since Prof. Davide Scaramuzza is affiliated with ETH, there is no organizational overhead for ETH students. Custom projects are occasionally available. If you would like to do a project with us but could not find an advertised project that suits you, please contact Prof. Davide Scaramuzza directly to ask for a tailored project (sdavide at ifi.uzh.ch).


Upon successful completion of a project in our lab, students may also have the opportunity to get an internship at one of our numerous industrial and academic partners worldwide (e.g., NASA/JPL, University of Pennsylvania, UCLA, MIT, Stanford, ...).



State-Space Models for Efficient Reinforcement Learning - Available

Description: Machine learning research has recently seen a notable surge of interest in State-Space Models (SSMs) as a viable alternative to the Transformer paradigm. The intrinsic strength of SSMs lies in how they are trained and deployed: they can be trained like Convolutional Neural Networks (CNNs) or with parallel scan techniques, yet they can be deployed like Recurrent Neural Networks (RNNs), with constant-time inference per step. This is an ideal scenario, since inference behaves like an RNN while training remains as efficient as for a CNN. Additionally, SSMs can learn continuous-time parameters, which can be discretized at any chosen time step. While these models have shown encouraging results on simpler sequence modeling datasets and, to a limited extent, on standard computer vision tasks, their application to Reinforcement Learning (RL) remains unexplored. This project investigates the potential of SSMs in RL, with the aim of building efficient RL systems.
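
The train-like-a-CNN, deploy-like-an-RNN property can be illustrated with a minimal sketch. It assumes a scalar (one-dimensional) linear SSM with a zero-order-hold discretization; the function names and parameter values are hypothetical and chosen for illustration only, and they do not describe the model that will be used in the project.

```python
import math

def discretize_zoh(a, b, dt):
    """Zero-order-hold discretization of the scalar system x' = a*x + b*u (requires a != 0)."""
    a_bar = math.exp(a * dt)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar, b_bar

def run_recurrent(a_bar, b_bar, c, inputs):
    """RNN-style inference: one constant-cost state update per input sample."""
    x = 0.0
    outputs = []
    for u in inputs:
        x = a_bar * x + b_bar * u
        outputs.append(c * x)
    return outputs

def run_convolutional(a_bar, b_bar, c, inputs):
    """CNN-style evaluation: causal convolution with kernel K[j] = c * a_bar**j * b_bar."""
    n = len(inputs)
    kernel = [c * (a_bar ** j) * b_bar for j in range(n)]
    return [sum(kernel[j] * inputs[k - j] for j in range(k + 1)) for k in range(n)]

# The same continuous-time parameters (a, b) can be discretized at any step size dt.
inputs = [1.0, 0.5, -0.25, 0.0, 2.0]
a_bar, b_bar = discretize_zoh(a=-1.0, b=1.0, dt=0.1)
y_rec = run_recurrent(a_bar, b_bar, 1.0, inputs)
y_conv = run_convolutional(a_bar, b_bar, 1.0, inputs)
```

Both functions compute the same outputs. The convolutional form exposes the whole sequence at once, which is what makes parallel training possible, while the recurrent form only needs the previous state at inference time.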

Goal: Develop an SSM-based RL framework that works in simulation, and conduct a detailed evaluation and analysis of it. Ideally, deploy the model in a real-world environment.

Contact Details: Nikola Zubic (zubic@ifi.uzh.ch), Angel Romero Aguilar (roagui@ifi.uzh.ch)

Thesis Type: Master Thesis

See project on SiROP

Exploring Event Generation Strategies - Available

Description: The domain of Event-Based Vision, which replicates the human eye's ability to register changes within a scene, offers significant advancements in terms of power efficiency, latency, and dynamic range. However, there is still a lack of datasets compared with RGB vision. Some works bridge this gap by creating events from videos, such as https://github.com/uzh-rpg/rpg_vid2e . It would be interesting to create a more performant and efficient method for generating events from videos, and various directions could be explored.

Goal: The primary objective of this project is to create a more efficient and performant learning-based method for video-to-events generation. The student chooses the directions and techniques to pursue, which is why the project is offered as a Master's thesis.
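
For readers new to the topic, the basic idea behind turning video frames into events can be sketched as follows. This is a simplified, hypothetical illustration that assumes the common contrast-threshold model, where a pixel fires an event each time its log intensity changes by a fixed threshold. It is not the rpg_vid2e implementation, and the function name, parameters, and default values are assumptions.

```python
import math

def frames_to_events(frames, timestamps, threshold=0.2, eps=1e-3):
    """Emit (t, x, y, polarity) tuples whenever a pixel's log intensity moves by `threshold`.

    `frames` is a list of 2D lists with intensities in [0, 1]; events are stamped
    with the time of the frame in which the change is observed (no interpolation).
    """
    # Per-pixel reference log intensity, initialized from the first frame.
    log_ref = [[math.log(v + eps) for v in row] for row in frames[0]]
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        for y, row in enumerate(frame):
            for x, v in enumerate(row):
                diff = math.log(v + eps) - log_ref[y][x]
                polarity = 1 if diff > 0 else -1
                # A large change crosses the threshold several times: one event per crossing.
                while abs(diff) >= threshold:
                    events.append((t, x, y, polarity))
                    log_ref[y][x] += polarity * threshold
                    diff -= polarity * threshold
    return events

# Pixel (0, 0) doubles in brightness; pixel (1, 0) stays constant and stays silent.
events = frames_to_events([[[0.5, 0.5]], [[1.0, 0.5]]], timestamps=[0.0, 0.01])
```

Because all events inherit the frame timestamp, this sketch loses the fine temporal resolution of a real event camera, which is one reason more elaborate generation methods exist.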

Contact Details: Nikola Zubic (zubic@ifi.uzh.ch), Marco Cannici (cannici@ifi.uzh.ch)

Thesis Type: Master Thesis

See project on SiROP

gpuFlightmare: High-Performance GPU-Based Physics Simulation and Image Rendering for Flying Robots - Available

Description: gpuFlightmare is a next-generation GPU-accelerated framework designed to enhance the capabilities of Flightmare, a CPU-based physics simulation tool. By transitioning to GPU processing, this project addresses two main limitations of the existing system: the inability to scale simulations to larger, more complex environments and the slow image rendering speeds that hinder efficient policy training for flying robots.

Goal: The goal of gpuFlightmare is to provide a more efficient and effective platform for developing and testing vision-based navigation policies. By improving simulation and rendering speeds, the project will facilitate faster iterations of policy training and validation, making it a valuable tool for researchers and developers in the field of aerial robotics.

Contact Details: Yunlong Song (song@ifi.uzh.ch), Nico Messikommer (nmessi@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Autonomous Flight Using A Camera - Available

Description: In First-Person View (FPV) drone flying, professional pilots demonstrate remarkable skill, navigating through complex environments with precision and flair. The essence of FPV flight lies not just in efficiency or speed, but in the "cool" factor — the ability to perform dynamic, agile maneuvers that captivate and impress. This project explores the challenge of capturing this "coolness" factor in optimization, enabling the development of an autonomous flight system capable of replicating the nuanced flight patterns of expert human pilots. Our research focuses on formulating these advanced maneuvers and implementing them through a vision-based system, allowing drones to autonomously navigate through cluttered spaces like forests with the same level of skill and style as their human counterparts.

Goal: To create a sophisticated autonomous FPV flight system that integrates advanced computer vision and control algorithms, enabling drones to autonomously execute complex, human-like maneuvers in cluttered and dynamically changing environments.

Contact Details: Yunlong Song (song@ifi.uzh.ch), Nico Messikommer (nmessi@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Visual Representation Learning for Efficient Deep Reinforcement Learning - Available

Description: The study of end-to-end deep learning in computer vision has mainly focused on developing useful object representations for image classification, object detection, or semantic segmentation. Recent work has shown that it is possible to learn temporally and geometrically aligned keypoints from videos alone, and that object keypoints learned in an unsupervised manner can be useful for efficient control and reinforcement learning.

Goal: The goal of this project is to find out whether it is possible to learn useful features or intermediate representations for controlling mobile robots at high speed. For example, can we use the Transporter (a neural network architecture) to find useful features in an autonomous car racing environment? If so, can we use these features to discover an optimal control policy via deep reinforcement learning?

Contact Details: Yunlong Song (song@ifi.uzh.ch), Jiaxu Xing (jixing@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Foundation Model for Drone Navigation in Confined Spaces - Available

Description: This master thesis project centers on the development of a foundation model for drone navigation within confined spaces such as ballast tanks of ships. A specific emphasis is on efficiently fine-tuning the model using data collected from drone flights for fast domain adaptation. Recent advancements have showcased the capability of foundation models to help autonomously navigate ground robots, particularly in planar environments. This project will involve extending a foundation navigation model capable of directly generating drone control commands from input images, focusing on navigating challenging environments with narrow passages and varying lighting conditions. Applicants should have a good understanding of mobile robot navigation, machine learning experience (PyTorch), and programming experience in C++ and Python.

Goal: The primary objective of this master thesis project is to develop and fine-tune a foundation navigation model for drones in confined spaces, leveraging data from real and simulated drone flights. We aim to demonstrate the efficacy of the navigation model through real-world deployment using a commercial drone platform.

Contact Details: Harmish Khambhaita (harmish@ifi.uzh.ch), Christian Sprecher (sprecher@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Autonomously traversing ship manholes using end-to-end vision-based control - Available

Description: Navigating through ship manholes poses a significant challenge for industrial drones, requiring a combination of agility and adaptability to successfully traverse the confined spaces within ship interiors. Given the challenges of limited lighting, constrained spaces, and complex geometries, the proposed solution focuses on leveraging end-to-end learning techniques for drone navigation. This entails the development of end-to-end vision-based algorithms to facilitate agile maneuvers trained on diverse datasets of ship interiors collected from simulation and real-world experiments. Prospective students for this thesis project should possess a strong background in reinforcement learning, deep neural networks, and robot perception. Proficiency in PyTorch is essential, along with programming skills in both C++ and Python.

Goal: Develop an end-to-end learning-based approach for autonomous drone navigation in ship ballast tank manholes, incorporating both real and simulated training data. The project aims to emphasize speed, a high success rate, and safety in flying through the confined spaces of ship interiors. Eventually, the project outcome will demonstrate the efficiency and safety of the developed approach with real-world tests.

Contact Details: Harmish Khambhaita (harmish@ifi.uzh.ch), Ismail Geles (geles@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Drone Racing End-to-end policy learning: from features to commands - Available

Description: This project focuses on using RL to learn quadrotor policies to fly at high speeds in complex tracks, directly from features.

Goal: The primary objective of this thesis is to use the current RL pipeline to explore the possibility of learning directly from a feature map, instead of other representations. The project involves several key stages: collecting a feature map of a real-world environment, transferring these real-world features into a simulator, using RL to train the drone in this simulated environment, and finally deploying the learned policy in the real world with real-time feature detection.

Applicant Requirements:
- Proficiency in machine learning, specifically in Reinforcement Learning.
- Experience in programming with Python and C++.
- Knowledge of simulation software and real-time data processing.
- Understanding of drone dynamics and control systems.
- Background in signal processing and non-linear dynamic systems.
- Additional experience in image processing and embedded systems is advantageous.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Angel Romero (roagui AT ifi DOT uzh DOT ch), Giovanni Cioffi (cioffi AT ifi DOT uzh DOT ch), Ismail Geles (geles AT ifi DOT uzh DOT ch) and Jiaxu Xing (xing AT ifi DOT uzh DOT ch)

Thesis Type: Master Thesis

See project on SiROP

Segmentation and Object Detection in Neural Radiance Fields (NeRFs) for Enhanced 3D Scene Understanding - Available

Description: This master thesis project focuses on advancing 3D scene understanding through the integration of segmentation and object detection techniques within Neural Radiance Fields (NeRFs). NeRFs have demonstrated remarkable capabilities in synthesizing high-fidelity 3D scenes, and this project aims to enhance their functionality by incorporating state-of-the-art methods for accurate segmentation and object detection. The student will explore novel approaches to seamlessly integrate these techniques, enabling NeRFs to not only generate realistic scenes but also identify and categorize objects within them. The project's scope includes experimentation with diverse datasets and validation through quantitative metrics to evaluate the effectiveness of the proposed methodology.

Goal: The primary goal of this master thesis project is to enhance 3D scene understanding within Neural Radiance Fields (NeRFs) by integrating advanced segmentation and object detection techniques. Validation of the proposed approach will be conducted using diverse datasets, with a focus on quantitative metrics to demonstrate the effectiveness of the enhanced NeRF model.

Contact Details: Harmish Khambhaita (harmish@ifi.uzh.ch), Marco Cannici (cannici@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Vision-Based Autonomous Drone Recovery Using Reinforcement Learning - Available

Description: This project is focused on developing a vision-only flight recovery system for autonomous drones. A critical capability for autonomous drones is to recover safely from any unstable state. This project explores the potential of using reinforcement learning to enable a drone to transition from an unstable to a stable state, using only vision sensors. The challenge lies in creating a system that not only stabilizes the drone but also ensures it can safely land in various unforeseen scenarios.

Goal:
- Develop a reinforcement learning policy: create a learning algorithm capable of recovering the drone from any dangerous, unstable state to a stable state, using only vision-based inputs.
- Test the system in simulation and in a hardware-in-the-loop environment.
- Test the system on the real-world platform.

Requirements:
- Strong background in machine learning and reinforcement learning.
- Proficiency in programming in C++ and Python.
- Solid understanding of nonlinear dynamic systems.
- Comfortable in a hands-on, experimental environment.

Contact Details: Please send your CV and transcripts (bachelor and master), and any projects you have worked on that you find interesting to Angel Romero (roagui AT ifi DOT uzh DOT ch), Jiaxu Xing (xing AT ifi DOT uzh DOT ch) and Ismail Geles (geles AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Vision-based Navigation in Dynamic Environment via Reinforcement Learning - Available

Description: In this project, the goal is to develop a vision-based policy that enables autonomous navigation in complex, cluttered environments. The learned policy should enable the robot to effectively reach a designated target based on visual input while safely avoiding encountered obstacles. Some of the use cases for this approach will be to ensure a safe landing on a moving target in a cluttered environment or to track a moving target in the wild. Applicants should have a solid understanding of reinforcement learning, machine learning experience (PyTorch), and programming experience in C++ and Python.

Goal: Develop such a policy based on an existing reinforcement learning pipeline. Extend the training environment and adapt it to the task definition. The approach will be demonstrated and validated in both simulated and real-world settings.

Contact Details: Jiaxu Xing (jixing@ifi.uzh.ch), Leonard Bauersfeld (bauersfeld@ifi.uzh.ch)

Thesis Type: Master Thesis

See project on SiROP

Learning Rapid UAV Exploration with Foundation Models - Available

Description: In this project, our objective is to efficiently explore unknown indoor environments using UAVs. Recent research has demonstrated significant success in integrating foundation models with robotic systems. Leveraging these foundation models, the drone will employ learned semantic relationships from large-world-scale data to actively explore and navigate through unknown environments. While most prior research has focused on ground-based robots, this project aims to investigate the potential of integrating foundation models with aerial robots to introduce more agility and flexibility. Applicants should have a solid understanding of mobile robot navigation, machine learning experience (PyTorch), and programming experience in C++ and Python.

Goal: Develop such a framework in simulation and conduct a comprehensive evaluation and analysis. If feasible, deploy such a model in a real-world environment.

Contact Details: Jiaxu Xing (jixing@ifi.uzh.ch), Nico Messikommer (nmessi@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Autonomous Drone Navigation via Learning from YouTube Videos - Available

Description: The evolving landscape of large vision and language models, paired with the untapped availability of unlabeled internet data, presents new exciting opportunities for training robotic policies. Inspired by how humans learn, this project aims to explore the possibility of learning flight patterns, obstacle avoidance, and navigation strategies by simply watching drone flight videos available on YouTube. State-of-the-art methods for processing and encoding videos, as well as unsupervised training techniques, will be evaluated and designed during the project. Applicants should have a strong background in machine learning, computer vision, and proficiency in Python programming. Familiarity with deep learning frameworks such as PyTorch is desirable.

Goal: Investigate the feasibility and effectiveness of using large vision models along with self-supervised learning techniques to teach drones to navigate autonomously by analyzing YouTube videos. Develop a prototype system capable of learning from online videos and demonstrate its effectiveness in simulated and real-world environments.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Marco Cannici (cannici AT ifi DOT uzh DOT ch) and Angel Romero (roagui AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Gaussian Splatting Visual Odometry - Available

Description: Recent works have shown that Gaussian Splatting (GS) is a compact and accurate map representation. Thanks to these properties, GS maps are appealing for SLAM systems. However, recent works that include GS maps in SLAM struggle with map-to-frame tracking. In this project, we will investigate the potential of GS maps in visual odometry (VO). The goal is to achieve robust map-to-frame tracking. We will benchmark our solution against feature-based and direct tracking baselines. This project will be done in collaboration with Meta.

Goal: The goal is to investigate the use of Gaussian splatting maps in visual-inertial systems. We are looking for students with strong backgrounds in programming (C++ preferred), computer vision (ideally having taken Prof. Scaramuzza's class), and robotics.

Contact Details: Giovanni Cioffi, cioffi (at) ifi (dot) uzh (dot) ch, Manasi Muglikar, muglikar (at) ifi (dot) uzh (dot) ch

Thesis Type: Semester Project / Master Thesis

See project on SiROP

IMU-centric Odometry for Drone Racing and Beyond - Available

Description: Our recent work has shown that it is possible to estimate the state of a racer drone only using a low-grade IMU. This project will build upon our previous work and try to extend its applicability to scenarios beyond racing. To achieve this goal, we will investigate an "unconventional" way of using camera images inside the odometry pipeline. The developed VIO pipeline will be compared to existing state-of-the-art model-based algorithms, with a focus on application in agile flights in the wild, and deployed on embedded platforms (Nvidia Jetson TX2 or Xavier).

Goal: Develop an IMU-centric odometry algorithm and benchmark it against state-of-the-art VIO methods. A successful thesis will lead to the deployment of the proposed odometry algorithm on a real drone platform. We are looking for students with strong backgrounds in programming (C++ preferred), computer vision (ideally having taken Prof. Scaramuzza's class), and robotics. Hardware experience (running code on robotic platforms) is preferred.

Contact Details: Giovanni Cioffi [cioffi (at) ifi (dot) uzh (dot) ch], Jiaxu Xing [jixing (at) ifi (dot) uzh (dot) ch]

Thesis Type: Master Thesis

See project on SiROP

Navigating on Mars - Available

Description: The first-ever Mars helicopter, Ingenuity, flew over texture-poor terrain where RANSAC was not able to find inliers (see https://spectrum.ieee.org/mars-helicopter-ingenuity-end-mission ). Navigating the Martian terrain poses significant challenges due to its unique and often featureless landscape, compounded by factors such as dust storms, lack of distinct textures, and extreme environmental conditions. The absence of prominent landmarks and the homogeneity of the surface can severely disrupt optical navigation systems, leading to decreased accuracy in localization and path planning.

Goal: This project aims to address these challenges by developing a navigation system that is resilient to Mars' sparse features and dust interference, employing advanced computational techniques to enhance environmental perception and autonomy.

Contact Details: Manasi Muglikar muglikar (at) ifi (dot) uzh (dot) ch, Giovanni Cioffi cioffi (at) ifi (dot) uzh (dot) ch

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Data-driven Event Generation from Images - Available

Description: Event cameras represent a significant advancement in imaging technology, capturing scenes based on changes in light intensity rather than at fixed intervals. This project aims to address the challenge of limited event-based datasets by generating synthetic events from traditional frame-based data. By employing data-driven deep learning techniques, we plan to create high-fidelity artificial events that closely mimic real-world occurrences, reducing the gap between simulated and actual event data.

Goal: In this project, the student applies current state-of-the-art deep learning models for image generation to create artificial events from standard frames. In the scope of the project, the student will obtain a deep understanding of event cameras to generate realistic events. Since multiple state-of-the-art deep learning methods will be explored, a good background in deep learning is required. If you are interested, we are happy to provide more details.

Contact Details: Nico Messikommer [nmessi (at) ifi (dot) uzh (dot) ch], Marco Cannici [cannici (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Efficient Neural Scene Reconstruction with Event Cameras - Available

Description: Building upon the success of learning-based methods in scene reconstruction and synthesis, this project aims to advance the field by enhancing the efficiency and speed of existing formulations in the context of event cameras. While learning-based methods have already showcased the potential of event cameras in neural scene reconstruction, they often require extensive training to achieve top-quality results. This project seeks to address this limitation by leveraging the sparse nature of events to accelerate the training of radiance fields.

Goal: The primary objective of this project is to explore innovative strategies for neural scene reconstruction using event cameras, with a focus on optimizing the training and inference speed. Applicants with a background in programming (Python/Matlab), computer vision, and familiarity with machine learning frameworks (PyTorch) are encouraged to apply.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Marco Cannici (cannici AT ifi DOT uzh DOT ch) and Manasi Muglikar (muglikar AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Multimodal Fusion for Enhanced Neural Scene Reconstruction Quality - Available

Description: Recent advancements in neural radiance field training have shown remarkable success by fusing together vision and semantic modalities for improved reconstruction quality. In this project, we build upon this recent trend and investigate how the use of modalities such as depth and event data can improve radiance fields. The project aims to explore how prior 3D information can assist in reconstructing fine details and how the help of high-temporal resolution data can enhance modeling in the case of scene and camera motion. By exploring the fusion of these modalities, we aim to achieve more accurate and detailed representations of complex environments.

Goal: The primary goal of this project is to evaluate the fusion of multiple sensor modalities, including RGB, depth, and event cameras, for enhanced scene reconstruction quality. We aim to leverage the unique strengths of each modality to achieve finer detail reconstruction and effectively handle complex scenes. Applicants with a background in programming (Python/Matlab), experience in computer vision, and familiarity with machine learning frameworks (PyTorch) are encouraged to apply.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Marco Cannici (cannici AT ifi DOT uzh DOT ch) and Manasi Muglikar (muglikar AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Domain Transfer between Events and Frames for Motor Policies - Available

Description: Recent robotics breakthroughs mainly use motor policies trained in simulation to perform impressive maneuvers in the real world. This project seeks to capitalize on the high-temporal resolution of event cameras to enhance the robustness of motor policies by integrating event data as a sensor modality. However, current methods for generating events in simulation are inefficient, requiring the rendering of multiple frames at a high frame rate. The primary goal of this project is to develop a shared embedding space for events and frames, enabling training on simulated frames and deployment on real-world event data. The project offers opportunities to test the proposed approach on various robotic platforms, such as quadrotors and miniature cars, depending on the project's progress.

Goal: Participants will build upon the foundations laid by previous student projects (published at ECCV22) and leverage insights from the domain of Unsupervised Domain Adaptation (UDA) literature to transfer motor policies from frames to events. The project will involve validating the approach in simulation, with potential real-world experiments conducted in our drone arena. Emphasis will be placed on demonstrating the advantages of event cameras in challenging environments, such as low-light conditions and high-dynamic scenes. Given the use of various deep learning methods for task transfer, a strong background in deep learning is essential for prospective participants. If you are interested, we are happy to provide more details.

Contact Details: Nico Messikommer [nmessi (at) ifi (dot) uzh (dot) ch], Jiaxu Xing [jixing (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Data-driven Keypoint Extractor for Event Data - Available

Description: Neuromorphic cameras, characterized by their robustness to High Dynamic Range (HDR) scenes, high-temporal resolution, and low power consumption, have paved the way for innovative applications in camera pose estimation, particularly for fast motions in challenging environments. This project focuses on enhancing camera pose estimation by exploring a data-driven approach to keypoint extraction, leveraging recent advancements in frame-based keypoint extraction techniques. To achieve this, the project aims to integrate a Visual Odometry (VO) pipeline to provide real-time feedback in an online fashion.

Goal: The primary objective of this project is to develop a data-driven keypoint extractor capable of identifying interest points in event data. Building upon insights from a previous student project (submitted to CVPR23), participants will harness neural network architectures to extract keypoints within an event stream. Furthermore, the project will involve adapting existing Visual Odometry (VO) algorithms to work with the developed keypoint extractor and tracker. Prospective students should possess prior programming experience in a deep learning framework and have completed at least one course in computer vision. This project offers an exciting opportunity to contribute to the cutting-edge intersection of neuromorphic imaging and computer vision. If you're ready to delve into the realm of data-driven keypoint extraction and its application in camera pose estimation, we're excited to provide further details.

Contact Details: Nico Messikommer [nmessi (at) ifi (dot) uzh (dot) ch], Giovanni Cioffi [cioffi (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

HDR NeRF: Neural Scene Reconstruction in Low Light - Available

Description: Implicit scene representations, particularly Neural Radiance Fields (NeRF), have significantly advanced scene reconstruction and synthesis, surpassing traditional methods in creating photorealistic renderings from sparse images. However, the potential of integrating these methods with advanced sensor technologies that measure light at the granularity of a photon remains largely unexplored. These sensors, known for their exceptional low-light sensitivity and high dynamic range, could address the limitations of current NeRF implementations in challenging lighting conditions, offering a novel approach to neural-based scene reconstruction.

Goal: This project aims to pioneer the integration of SPAD sensors with neural-based scene reconstruction frameworks, specifically focusing on enhancing Neural Radiance Fields. The primary objective is to investigate how photon-derived data can be utilized to improve scene reconstruction fidelity, depth accuracy, and rendering quality under diverse lighting conditions. By extending NeRF to incorporate event-based data from SPADs, we anticipate a significant leap in the performance of neural scene synthesis methodologies, particularly in challenging environments where traditional sensors falter.

Contact Details: Manasi Muglikar muglikar (at) ifi (dot) uzh (dot) ch, Marco Cannici cannici (at) ifi (dot) uzh (dot) ch

Thesis Type: Master Thesis

See project on SiROP

Low Latency Occlusion-aware Object Tracking - Available

Description: In this project, we will develop a low-latency object tracker that is robust to occlusion. Three main paradigms exist in the literature to perform object tracking: Tracking-by-detection, Tracking-by-regression, and Tracking-by-attention. We will start with an in-depth literature review to evaluate how well current solutions meet our end goal of being fast and robust to occlusion. Starting from the conclusions of this study, we will design a novel tracker that can achieve our goal. In addition to RGB images, we will investigate other sensor modalities such as inertial measurement units and event cameras. This project is done in collaboration with Meta.

Goal: Develop a low-latency object tracker that is robust to occlusions. We are looking for students with a strong computer vision background who are familiar with common software tools used in Deep Learning (for example, PyTorch or TensorFlow).

Contact Details: Giovanni Cioffi [cioffi (at) ifi (dot) uzh (dot) ch], Nico Messikommer [nmessi (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Event-based occlusion removal - Available

Description: Unwanted camera occlusions, such as debris, dust, raindrops, and snow, can severely degrade the performance of computer-vision systems. Dynamic occlusions are particularly challenging because of the continuously changing pattern. This project aims to leverage the unique capabilities of event-based vision sensors to address the challenge of dynamic occlusions. By improving the reliability and accuracy of vision systems, this work could benefit a wide range of applications, from autonomous driving and drone navigation to environmental monitoring and augmented reality.

Goal: The goal of this project is to develop an advanced computational framework capable of identifying and eliminating dynamic occlusions from visual data in real-time, utilizing the high temporal resolution of event-based vision sensors.

Contact Details: Manasi Muglikar muglikar (at) ifi (dot) uzh (dot) ch, Nico Messikommer nmessi (at) ifi (dot) uzh (dot) ch

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Foundation Models for Event-based Segmentation - Available

Description: In the field of event-based vision, the key challenge lies in efficiently processing the asynchronous stream of data generated by event-based sensors. These sensors, inspired by the biological mechanisms of the human retina, capture the dynamics of a scene with high temporal resolution and low latency. The project proposes to work on foundation models for Event-based Segmentation. This approach is aimed at mitigating the challenges posed by the scarcity of labeled data in event-based vision. The project will focus on creating models capable of understanding and segmenting complex visual scenes by using novel learning methodologies. This innovative methodology has the potential to significantly expand the capabilities of event-based vision systems, particularly in dynamic and unstructured environments.

Goal: The primary goal of this project is to design, implement, and validate foundation models (CLIP, SAM) for Event-based Segmentation. The joint use of both foundation models will also be explored. Applicants should have a solid machine learning background, strong programming skills (Python, C++), and experience with frameworks such as PyTorch or JAX.

Contact Details: Nikola Zubic (zubic@ifi.uzh.ch), Manasi Muglikar (muglikar@ifi.uzh.ch)

Thesis Type: Master Thesis

See project on SiROP

What can Large Language Models offer to Event-based Vision? - Available

Description: Event-based vision algorithms process visual changes in an asynchronous manner akin to how biological visual systems function, while large language models (LLMs) specialize in parsing and generating human-like text. This project aims to explore the intersection of Large Language Models (LLMs) and Event-based Vision, leveraging the unique capabilities of each domain to create a symbiotic framework. By marrying the strengths of both technologies, the initiative aims to develop a novel, more robust paradigm that excels in challenging conditions.

Goal: The primary objective is to devise methodologies that synergize the capabilities of LLMs with Event-Based Vision systems. We intend to address identified shortcomings in existing paradigms by leveraging the inferential strengths of LLMs. Rigorous evaluations will be conducted to validate the efficacy of the integrated system under various challenging conditions.

Contact Details: Nikola Zubic (zubic@ifi.uzh.ch), Nico Messikommer (nmessi@ifi.uzh.ch)

Thesis Type: Master Thesis

See project on SiROP

Advancing Augmented Reality Helmets for motorcyclists and racecars: Independence through Self-Localization - Available

Description: Augmented reality (AR) helmets represent a significant advancement in automotive technology for motorcyclists and racecars, offering drivers essential information while maintaining focus on the road. These systems can project 3D navigation arrows anchored on the road or the optimal race-trajectory on the track. Check out this video (https://www.youtube.com/watch?v=JYwFaNrGHrY) to see how the system performs in Zurich. Current systems rely on vehicle data for localization, limiting their flexibility and performance. This project aims to develop a robust state estimation framework that enables AR helmets to localize independently of vehicle data. By leveraging onboard visual and inertial sensors, we seek to enhance the helmet's ability to accurately determine its position relative to the vehicle. This thesis is done in collaboration with Aegis Rider (https://aegisrider.com/).

Goal: Our objective is to design a state-of-the-art state estimation framework that localizes the AR helmet relative to the vehicle without any vehicle data and is suitable for deployment on mobile computing platforms. We prioritize achieving minimal latency while ensuring precise localization using only helmet-mounted visual and inertial sensors. We are looking for students with strong backgrounds in programming (C++ preferred), computer vision (ideally having taken Prof. Scaramuzza's class), and robotics. Hardware experience (running code on robotic platforms) is preferred.

Contact Details: Giovanni Cioffi (cioffi@ifi.uzh.ch), Simon Hecker (simon@aegisrider.com)

Thesis Type: Master Thesis

See project on SiROP

High-speed drone flight with spiking neural networks - Available

Description: Recent works have demonstrated high-speed autonomous flight through challenging outdoor environments by implementing an obstacle avoidance pipeline based on depth sensors via end-to-end deep learning of Artificial Neural Networks (ANNs). Our simulations suggest that more biologically inspired Spiking Neural Networks (SNNs), characterized by simple and efficient dynamics coupled with sparse communication via spikes, can also perform this task well. Following these initial results, this project investigates the deployment and evaluation of SNN models on real drones, identifying and addressing potential sim-to-real gaps stemming from the differences between the simulation and the real world. Applicants should have solid machine learning experience (TensorFlow) and be familiar with the concepts of drone navigation and ROS. The project is a collaboration between RPG and IBM Research Zurich.

Goal: Compare the performance and the simulation-to-reality gap of SNN and ANN models. Reduce the gap by enhancing the SNN architecture and its training setup. Demonstrate high-speed flight with SNN-based navigation in a real drone.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Marco Cannici (cannici AT ifi DOT uzh DOT ch) and Stanislaw Wozniak (stw AT zurich DOT ibm DOT com)

Thesis Type: Master Thesis

See project on SiROP

3D reconstruction with event cameras - Available

Description: Event cameras are bio-inspired sensors that offer several advantages, such as low latency, high speed, and high dynamic range, to tackle challenging scenarios in computer vision. Research on structure from motion and multi-view stereo with images has produced many compelling results, in particular accurate camera tracking and sparse reconstruction. Active sensors with standard cameras, like Kinect, have been used for dense scene reconstructions. Accurate and efficient reconstruction using event-camera setups is still an unexplored topic. This project will focus on solving the problem of 3D reconstruction using active perception with event cameras.

Goal: The goal is to develop a system for accurate mapping of complex and arbitrary scenes using depth acquired by an event camera setup. We seek a highly motivated student with the following minimum qualifications:
- Excellent coding skills in Python and C++
- At least one course in computer vision (multiple view geometry)
- Strong work ethic
- Excellent communication and teamwork skills
Preferred qualifications:
- Experience with machine learning
Contact us for more details.

Contact Details: Manasi Muglikar, muglikar (at) ifi (dot) uzh (dot) ch

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Model-based Reinforcement Learning for Autonomous Drone Racing - Available

Goal: The objective of this project is to convert an existing model-free Reinforcement Learning pipeline for drones into a model-based Reinforcement Learning pipeline. The goal is to investigate potential performance improvements of the reinforcement learning algorithm by incorporating a model of the drone's dynamics, which will allow the algorithm to make more informed decisions. This is expected to result in faster learning and better generalization, leading to better performance in real-world scenarios. To accomplish this goal, the student will need to research and implement various model-based reinforcement learning algorithms and evaluate their performance in a simulation environment for drone navigation. The student will also need to fine-tune the parameters of the algorithm to achieve optimal performance. The final product will be a pipeline that can be used to train a drone to navigate in a variety of environments with improved efficiency and accuracy. Applicants should have a strong background in both model-free and model-based reinforcement learning techniques, programming in C++ and Python, and a good understanding of nonlinear dynamic systems. Additional experience in signal processing and machine learning, as well as being comfortable operating in a hands-on environment, is highly desired.

Contact Details: Please send your CV and transcripts (bachelor and master), and any projects you have worked on that you find interesting to Angel Romero (roagui AT ifi DOT uzh DOT ch) and Yunlong Song (song AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP