The Event-Camera Dataset and Simulator

We present the world's first collection of datasets recorded with an event-based camera for high-speed robotics. The data also include intensity images, inertial measurements, and ground truth from a motion-capture system. An event-based camera is a revolutionary vision sensor with three key advantages: a measurement rate that is almost 1 million times faster than standard cameras, a latency of 1 microsecond, and a high dynamic range of 130 decibels (standard cameras only have 60 dB). These properties enable the design of a new class of algorithms for high-speed robotics, where standard cameras suffer from motion blur and high latency. All the data are released both as text files and as binary (i.e., rosbag) files.

More information on the dataset website.
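As a rough illustration of working with the text release, the sketch below parses event lines in Python and computes an average event rate. The line layout (timestamp, x, y, polarity, separated by spaces) and all names (Event, parse_events, event_rate) are assumptions made for this example; the dataset website is the authority on the actual file format.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Event(NamedTuple):
    t: float       # timestamp in seconds (assumed unit)
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # assumed encoding: 1 = brightness increase, 0 = decrease


def parse_events(lines: Iterable[str]) -> Iterator[Event]:
    """Yield events from text lines of the assumed form 't x y polarity'."""
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        yield Event(float(fields[0]), int(fields[1]), int(fields[2]), int(fields[3]))


def event_rate(events: List[Event]) -> float:
    """Average number of events per second over the time span of the list."""
    if len(events) < 2:
        return 0.0
    duration = events[-1].t - events[0].t
    return len(events) / duration if duration > 0 else 0.0


# Made-up sample lines, only to show the call pattern.
sample = [
    "0.000000 33 39 1",
    "0.000011 158 145 1",
    "0.000050 88 143 0",
    "0.000100 174 154 0",
]
events = list(parse_events(sample))
print(len(events), event_rate(events))
```

To read a real file, pass an open file handle instead of the sample list, since parse_events accepts any iterable of strings.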



E. Mueggler, H. Rebecq, G. Gallego, T. Delbruck, D. Scaramuzza

The Event-Camera Dataset and Simulator: Event-based Data for Pose Estimation, Visual Odometry, and SLAM

PDF (arXiv) YouTube Dataset

Information Gain Based Active Reconstruction Framework

The Information Gain Based Active Reconstruction Framework is a modular, robot-agnostic software package for performing next-best-view planning for volumetric object reconstruction using a range sensor. Our implementation can be easily adapted to any mobile robot equipped with any camera-based range sensor (e.g., stereo camera, structured light sensor) to iteratively observe an object and generate a volumetric map and a point cloud model. The algorithm allows the user to define the information gain metric for choosing the next best view, and many formulations for these metrics are evaluated and compared in our ICRA paper. This framework is released open source as a ROS-compatible package for autonomous 3D reconstruction tasks.
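To make the idea of a user-defined information gain metric concrete, here is a minimal Python sketch of one common choice: summing the occupancy entropy of the voxels a candidate view is expected to see, then picking the view with the largest sum. This is not the framework's API and not necessarily one of the formulations from the paper; the function names and the toy occupancy values are hypothetical.

```python
import math
from typing import Dict, List


def voxel_entropy(p_occupied: float) -> float:
    """Shannon entropy (in nats) of a binary occupancy probability."""
    if p_occupied <= 0.0 or p_occupied >= 1.0:
        return 0.0  # a fully known voxel carries no uncertainty
    q = 1.0 - p_occupied
    return -(p_occupied * math.log(p_occupied) + q * math.log(q))


def view_information_gain(visible_voxels: List[float]) -> float:
    """Sum the entropy of all voxels a candidate view is expected to see."""
    return sum(voxel_entropy(p) for p in visible_voxels)


def next_best_view(candidates: Dict[str, List[float]]) -> str:
    """Return the name of the candidate view with the highest gain."""
    return max(candidates, key=lambda name: view_information_gain(candidates[name]))


# Hypothetical occupancy probabilities of the voxels visible from each view.
candidates = {
    "front": [0.5, 0.5, 0.9],
    "side": [0.05, 0.95, 0.9],
    "top": [0.5, 0.1, 0.99],
}
print(next_best_view(candidates))
```

In this toy setup the view that sees the most unknown voxels (probability near 0.5) wins. A real system would obtain the visible voxels by ray casting through the volumetric map from each candidate pose.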

Download the code from GitHub.

Check out a video of the system in action on YouTube.



S. Isler, R. Sabzevari, J. Delmerico, D. Scaramuzza

An Information Gain Formulation for Active Volumetric 3D Reconstruction

IEEE International Conference on Robotics and Automation (ICRA), Stockholm, 2016.

PDF YouTube Software

Fisheye and Catadioptric Synthetic Datasets for Visual Odometry

We provide two synthetic scenes (a vehicle moving in a city and a flying robot hovering in a confined room). For each scene, three different optics were used (perspective, fisheye, and catadioptric) with the same sensor, keeping the image resolution constant. These datasets were generated in Blender with a custom omnidirectional camera model, which we release as an open-source patch for Blender.

Download the datasets from here.



Z. Zhang, H. Rebecq, C. Forster, D. Scaramuzza

Benefit of Large Field-of-View Cameras for Visual Odometry

IEEE International Conference on Robotics and Automation (ICRA), Stockholm, 2016.

PDF YouTube Research page (datasets and software)

Indoor Dataset of Quadrotor with Down-Looking Camera

This dataset contains the raw images, IMU measurements, and ground-truth poses of a quadrotor flying a circular trajectory in an office-sized environment.

Download dataset

REMODE: Real-time, Probabilistic, Monocular, Dense Reconstruction

REMODE is a novel method to estimate dense and accurate depth maps from a single moving camera. A probabilistic depth measurement is carried out in real time on a per-pixel basis, and the computed uncertainty is used to reject erroneous estimations and provide live feedback on the reconstruction progress. REMODE uses a novel approach to depth map computation that combines Bayesian estimation and recent developments in convex optimization for image processing. In the reference paper below, we demonstrate that our method outperforms state-of-the-art techniques in terms of accuracy, while exhibiting high efficiency in memory usage and computing power. Our CUDA-based implementation runs at 50 Hz on a laptop computer and is released as open-source software.

Download the code from GitHub.
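The per-pixel probabilistic update described above can be illustrated with a deliberately simplified Python sketch: each pixel keeps a Gaussian depth estimate that is fused with every new Gaussian measurement, and the shrinking variance serves as the convergence signal. This is not the REMODE implementation. It ignores outliers and the convex-optimization smoothing step, and the names fuse_depth and has_converged, the threshold, and the sample values are all hypothetical.

```python
from typing import Tuple


def fuse_depth(mean: float, var: float, meas: float, meas_var: float) -> Tuple[float, float]:
    """Bayesian update of a Gaussian depth estimate with a Gaussian measurement."""
    fused_var = (var * meas_var) / (var + meas_var)
    fused_mean = (meas_var * mean + var * meas) / (var + meas_var)
    return fused_mean, fused_var


def has_converged(var: float, threshold: float = 0.005) -> bool:
    """Treat a pixel as converged once its variance drops below a threshold."""
    return var < threshold


# Hypothetical pixel: vague prior at 2 m, then four measurements near 1.5 m.
mean, var = 2.0, 1.0
for z in [1.52, 1.49, 1.50, 1.51]:
    mean, var = fuse_depth(mean, var, z, 0.01)
print(mean, var, has_converged(var))
```

Each fusion step can only reduce the variance, which is what makes it usable as live feedback on reconstruction progress in this simplified model.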



M. Pizzoli, C. Forster, D. Scaramuzza

REMODE: Probabilistic, Monocular Dense Reconstruction in Real Time

IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, 2014.

PDF YouTube Software

SVO: Semi-direct Visual Odometry

SVO is a semi-direct, monocular visual odometry algorithm that is precise, robust, and faster than current state-of-the-art methods. The semi-direct approach eliminates the need for costly feature extraction and robust matching techniques for motion estimation. SVO operates directly on pixel intensities, which results in subpixel precision at high frame rates. A probabilistic mapping method that explicitly models outliers and depth uncertainty is used to estimate 3D points, which results in fewer outliers and more reliable points. Precise and high-frame-rate motion estimation brings increased robustness in scenes of little, repetitive, and high-frequency texture. The algorithm is applied to micro-aerial-vehicle state estimation in GPS-denied environments and runs at 55 frames per second on the onboard embedded computer, at more than 400 frames per second on an i7 consumer laptop, and at more than 70 frames per second on a smartphone computer (e.g., Odroid or Samsung Galaxy phones).

Download the code from GitHub.
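To illustrate why operating directly on pixel intensities yields subpixel precision, the following Python sketch compares a small reference patch against the current image at fractional offsets, using bilinear interpolation, and keeps the offset with the lowest photometric error. It is a toy one-dimensional search under assumed names (bilinear, photometric_error) and synthetic images, not the alignment procedure used in SVO.

```python
import math
from typing import List

Image = List[List[float]]


def bilinear(img: Image, x: float, y: float) -> float:
    """Sample image intensity at a subpixel location (x = column, y = row)."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, len(img[0]) - 1), min(y0 + 1, len(img) - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom


def photometric_error(ref: Image, cur: Image, u: int, v: int,
                      du: float, dv: float, half: int = 1) -> float:
    """Sum of squared intensity differences between the patch around (u, v)
    in ref and the same patch shifted by (du, dv) in cur."""
    err = 0.0
    for j in range(-half, half + 1):
        for i in range(-half, half + 1):
            r = ref[v + j][u + i]
            c = bilinear(cur, u + i + du, v + j + dv)
            err += (r - c) ** 2
    return err


# Synthetic data: a horizontal intensity ramp, shifted by half a pixel in cur.
ref = [[float(x) for x in range(8)] for _ in range(8)]
cur = [[x - 0.5 for x in range(8)] for _ in range(8)]

offsets = [0.0, 0.25, 0.5, 0.75]
best = min(offsets, key=lambda du: photometric_error(ref, cur, 3, 3, du, 0.0))
print(best)
```

The search recovers the half-pixel shift even though no feature was extracted or matched. A real system would minimize the same kind of residual with a gradient-based optimizer rather than a grid of candidate offsets.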



C. Forster, M. Pizzoli, D. Scaramuzza

SVO: Fast Semi-Direct Monocular Visual Odometry

IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, 2014.

PDF YouTube Software

ROS Driver and Calibration Tool for the Dynamic Vision Sensor (DVS)

The RPG DVS ROS Package allows using the Dynamic Vision Sensor (DVS) within the Robot Operating System (ROS). It also contains a calibration tool for intrinsic and stereo calibration using a blinking pattern.

The code with instructions on how to use it is hosted on GitHub.

Authors: Elias Mueggler, Basil Huber, Luca Longinotti, Tobi Delbruck


E. Mueggler, B. Huber, D. Scaramuzza Event-based, 6-DOF Pose Tracking for High-Speed Maneuvers, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Chicago, 2014. [ PDF ]

A. Censi, J. Strubel, C. Brandli, T. Delbruck, D. Scaramuzza Low-latency localization by Active LED Markers tracking using a Dynamic Vision Sensor, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Tokyo, 2013. [ PDF ]

P. Lichtsteiner, C. Posch, T. Delbruck A 128x128 120dB 15us Latency Asynchronous Temporal Contrast Vision Sensor, IEEE Journal of Solid State Circuits, Feb. 2008, 43(2), 566-576. [ PDF ]

A Monocular Pose Estimation System based on Infrared LEDs

Mutual localization is a fundamental component for multi-robot missions. Our monocular pose estimation system consists of multiple infrared LEDs and a camera with an infrared-pass filter. The LEDs are attached to the robot that we want to track, while the observing robot is equipped with the camera.

The code with instructions on how to use it is hosted on GitHub.


Matthias Faessler, Elias Mueggler, Karl Schwabe and Davide Scaramuzza, A Monocular Pose Estimation System based on Infrared LEDs, Proc. IEEE International Conference on Robotics and Automation (ICRA), 2014, Hong Kong. [ PDF ]

Torque Control of a KUKA youBot Arm

Existing control schemes for the KUKA youBot arm, such as directly controlling joint positions or velocities, are not suited for close tracking of end effector trajectories. A torque controller, based on the dynamical model of the youBot arm, was implemented to overcome this limitation. Complementary to the controller, a framework to automatically generate trajectories was developed.

The code with instructions on how to use it is hosted on GitHub. Details are provided in the Master Thesis of Benjamin Keiser.

Authors: Benjamin Keiser, Matthias Faessler, Elias Mueggler


B. Keiser, E. Mueggler, M. Faessler, D. Scaramuzza Torque Control of a KUKA youBot Arm, Master Thesis, University of Zurich, September, 2013. [ PDF ]

Dataset: Air-Ground Matching of Airborne images with Google Street View data

Matching airborne images to ground-level ones is a challenging problem because of the extreme changes in viewpoint and scale between the aerial Micro Aerial Vehicle (MAV) images and the ground-level images. These come in addition to the challenges already present in the ground visual search algorithms used in UGV applications, such as illumination, lens distortions, seasonal variation of the vegetation, and scene changes between the query and the database images.

Our dataset consists of image data captured with a small quadrocopter flying in the streets of Zurich (up to 15 meters from the ground), along a path of 2 km, including: (1) aerial MAV images, (2) ground-level Google Street View images, (3) ground-truth confusion matrix, and (4) GPS data (geotags) for every database image.

Download dataset.

Authors: Andras Majdik and Yves Albers-Schoenberg


A.L. Majdik, D. Verda, Y. Albers-Schoenberg, D. Scaramuzza Air-ground Matching: Appearance-based GPS-denied Urban Localization of Micro Aerial Vehicles Journal of Field Robotics, 2015. [ PDF ]

A. L. Majdik, D. Verda, Y. Albers-Schoenberg, D. Scaramuzza Micro Air Vehicle Localization and Position Tracking from Textured 3D Cadastral Models IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, 2014. [ PDF ]

A. Majdik, Y. Albers-Schoenberg, D. Scaramuzza. MAV Urban Localization from Google Street View Data, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. [ PDF ] [ PPT ]

Perspective 3-Point (P3P) Algorithm

The Perspective-Three-Point (P3P) problem aims at determining the position and orientation of a camera in the world reference frame from three 2D-3D point correspondences. Most solutions attempt to first solve for the position of the points in the camera reference frame, and then compute the point-aligning transformation between the camera and the world frame. In contrast, this work proposes a novel closed-form solution to the P3P problem, which computes the aligning transformation directly in a single stage, without the intermediate derivation of the points in the camera frame. This is made possible by introducing intermediate camera and world reference frames, and expressing their relative position and orientation using only two parameters. The projection of a world point into the parametrized camera pose then leads to two conditions and finally a quartic equation for finding up to four solutions for the parameter pair. A subsequent back-substitution directly leads to the corresponding camera poses with respect to the world reference frame. The superior computational efficiency makes the method particularly suitable for any RANSAC outlier-rejection step, which is always recommended before applying PnP or non-linear optimization of the final solution.
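Because the quartic yields up to four candidate poses, a common follow-up step is to disambiguate them with an additional correspondence. The Python sketch below scores each candidate by the reprojection error of a fourth point and keeps the best one. It does not reproduce the P3P solver or the interface of the released C/C++ code. The pose convention (rotation R and translation t mapping world points into the camera frame), the use of normalized image coordinates, and all names are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]
Pose = Tuple[Mat3, Vec3]  # assumed convention: X_cam = R * X_world + t


def reprojection_error(pose: Pose, point_w: Vec3, observed: Sequence[float]) -> float:
    """Distance between the projected point and its observation,
    both in normalized image coordinates (x/z, y/z)."""
    R, t = pose
    xc, yc, zc = (sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3))
    if zc <= 0.0:
        return math.inf  # point behind the camera: reject this pose
    return math.hypot(xc / zc - observed[0], yc / zc - observed[1])


def select_pose(candidates: List[Pose], point_w: Vec3, observed: Sequence[float]) -> Pose:
    """Pick the candidate pose that best explains the extra correspondence."""
    return min(candidates, key=lambda pose: reprojection_error(pose, point_w, observed))


# Hypothetical candidates, as a P3P solver might return them.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
candidates = [
    (identity, [0.0, 0.0, 0.0]),
    (identity, [0.5, 0.0, 0.0]),
    (identity, [0.0, 0.0, -5.0]),
]
fourth_point = [1.0, 1.0, 4.0]
observation = [0.25, 0.25]
best_pose = select_pose(candidates, fourth_point, observation)
print(candidates.index(best_pose))
```

Inside a RANSAC loop, the same error function can be evaluated over all correspondences to count inliers for each candidate pose.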

Download C/C++ code

Author: Laurent Kneip


L. Kneip, D. Scaramuzza, R. Siegwart. A Novel Parameterization of the Perspective-Three-Point Problem for a Direct Computation of Absolute Camera Position and Orientation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, USA, 2011. [ PDF ]

OCamCalib: Omnidirectional Camera Calibration Toolbox for Matlab

Omnidirectional Camera Calibration Toolbox for Matlab (for Windows, MacOS, and Linux) for catadioptric and fisheye cameras with a field of view of up to 195 degrees.

Code, tutorials, and datasets can be found here.

Author: Davide Scaramuzza


D. Scaramuzza, A. Martinelli, R. Siegwart. A Toolbox for Easily Calibrating Omnidirectional Cameras. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2006), Beijing, China, October 2006. [ PDF ]

D. Scaramuzza, A. Martinelli, R. Siegwart. A Flexible Technique for Accurate Omnidirectional Camera Calibration and Structure from Motion. IEEE International Conference on Computer Vision Systems (ICVS 2006), New York, USA, January 2006. [ PDF ]