UltimateSLAM: using Events, Images, and IMU


UltimateSLAM is a visual-inertial odometry pipeline that combines events, images, and IMU to yield robust and accurate state estimation in HDR and high-speed scenarios. The code is available here.

Event cameras, such as the Dynamic Vision Sensor (DVS), are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames (illustration). They offer significant advantages over standard cameras, namely a very high dynamic range, no motion blur, and a latency on the order of microseconds. In the past few years, we have been investigating the use of such sensors for various vision applications: see our research page on event-based vision.
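To make "pixel-level brightness changes" concrete, here is a minimal sketch of the idealized event generation model commonly used in the event-camera literature: a pixel fires an event each time its log-intensity has changed by a fixed contrast threshold since its last event. This is an illustration only, not code from UltimateSLAM; the `Event` type, the function name `events_from_pixel`, and the threshold value of 0.2 are assumptions chosen for the example.

```python
import math
from typing import List, NamedTuple, Sequence


class Event(NamedTuple):
    t: float       # timestamp in seconds
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 = brighter, -1 = darker


def events_from_pixel(
    times: Sequence[float],
    intensities: Sequence[float],
    x: int,
    y: int,
    contrast_threshold: float = 0.2,  # assumed value, sensor dependent
) -> List[Event]:
    """Emit events for one pixel under the idealized contrast-threshold model."""
    events: List[Event] = []
    reference = math.log(intensities[0])  # log-intensity at the last event
    for t, intensity in zip(times[1:], intensities[1:]):
        diff = math.log(intensity) - reference
        # A large change can trigger several events of the same polarity.
        while abs(diff) >= contrast_threshold:
            polarity = 1 if diff > 0 else -1
            events.append(Event(t, x, y, polarity))
            reference += polarity * contrast_threshold
            diff = math.log(intensity) - reference
    return events


# A pixel that brightens and then darkens produces events of both polarities.
moving = events_from_pixel([0.0, 0.001, 0.002, 0.003], [1.0, 1.7, 1.7, 0.9], x=10, y=20)
print([e.polarity for e in moving])

# A pixel whose brightness does not change produces no events at all.
static = events_from_pixel([0.0, 0.001, 0.002], [1.0, 1.0, 1.0], x=10, y=20)
print(len(static))
```

Note that a constant-brightness pixel stays silent in this model, so a camera that barely moves in a static scene generates very few events.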

UltimateSLAM is the result of several years of research on event cameras in our lab (since 2013). In particular, it builds on our previous work on event-based visual-inertial odometry and adds the option of using images from a standard camera. The images boost accuracy and robustness in situations where standard visual-inertial odometry works best (good lighting, limited motion speed), while the pipeline retains the ability to leverage the strengths of event cameras for tracking in HDR and high-speed scenarios.

It runs in real time on a standard laptop, and even on computationally limited platforms such as smartphone-class computers. In fact, we successfully used it to perform onboard state estimation for closed-loop autonomous flight of our quadrotor, which features a smartphone-class computer (Odroid XU4).

UltimateSLAM can be used in two different modes: with events + images + IMU, or with only events + IMU. This makes it possible to control the trade-off between accuracy and cost. While events + images + IMU give the best results, our pipeline still works with events + IMU alone when a frame-based sensor is not available, or for applications with low-power requirements.

UltimateSLAM used for flying an autonomous drone in the dark

UltimateSLAM tracking high-speed motion using Events + IMU

Event + IMU vs UltimateSLAM (Events, Images and IMU)

We have shown in our research paper that including images yields a mean accuracy improvement of 130% over a pipeline using only events and IMU. As an example, the error plot below shows the translation and rotation error of UltimateSLAM (Fr + E + I) as a function of the travelled distance, compared to using only events and IMU (E + I), or only frames and IMU (Fr + I), on the hdr_boxes dataset from the Event Camera Dataset.

Error plot on hdr_boxes dataset from the Event Camera Dataset
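As a rough illustration of what "translation error as a function of the travelled distance" means, the sketch below pairs the cumulative path length along the ground-truth trajectory with the position error at each sample. It assumes the estimated and ground-truth trajectories are already time-synchronized and expressed in the same frame; it is not the evaluation protocol used in the paper, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def travelled_distance(positions: Sequence[Point]) -> List[float]:
    """Cumulative path length at each sample of a trajectory."""
    distance = [0.0]
    for previous, current in zip(positions, positions[1:]):
        distance.append(distance[-1] + math.dist(previous, current))
    return distance


def translation_error_vs_distance(
    ground_truth: Sequence[Point], estimate: Sequence[Point]
) -> List[Tuple[float, float]]:
    """Return (travelled distance, position error) pairs, one per sample.

    Assumes both trajectories are sampled at the same timestamps and
    are already aligned to a common reference frame.
    """
    if len(ground_truth) != len(estimate):
        raise ValueError("trajectories must have the same number of samples")
    distance = travelled_distance(ground_truth)
    errors = [math.dist(g, e) for g, e in zip(ground_truth, estimate)]
    return list(zip(distance, errors))


# Toy example: the estimate drifts sideways by 0.1 m per metre travelled.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.0, 0.2, 0.0)]
curve = translation_error_vs_distance(gt, est)
for d, err in curve:
    print(f"distance {d:.1f} m, error {err:.2f} m")
```

Plotting the second element of each pair against the first gives a curve of the same kind as the translation panel of the error plot.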

To understand why images help, consider the task of hovering a drone using vision. The video below shows what happens when using i) only events and IMU for onboard state estimation, and ii) UltimateSLAM (images, events, and IMU). In the first case there is very little motion, so the events do not provide reliable visual information, which leads to noticeable drift. By contrast, UltimateSLAM is able to hover the drone without drift by using the images in addition to the events and IMU.



T. Rosinol Vidal, H. Rebecq, T. Horstschaefer, D. Scaramuzza

Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High Speed Scenarios

IEEE Robotics and Automation Letters (RA-L), 2018.

PDF | YouTube | ICRA18 Video Pitch | Results (raw trajectories) | Source Code


H. Rebecq, T. Horstschaefer, D. Scaramuzza

Real-time Visual-Inertial Odometry for Event Cameras using Keyframe-based Nonlinear Optimization

British Machine Vision Conference (BMVC), London, 2017.

Oral Presentation. Acceptance Rate: 5.6%

PDF | PPT | YouTube | Oral Presentation | Results (raw trajectories)