Deep Learning

Deep learning is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data. In our research, we apply deep learning to a range of mobile robot navigation problems, such as depth estimation, end-to-end navigation, and classification.

Towards Domain Independence for Learning-Based Monocular Depth Estimation

Most state-of-the-art learning-based monocular depth estimators do not consider generalization and benchmark their performance on publicly available datasets only after specific fine-tuning. Generalization can be achieved by training on several heterogeneous datasets, but collecting and labeling them is costly. In this work, we propose two deep neural networks (one based on a CNN and one on an LSTM) for monocular depth estimation, which we train on heterogeneous synthetic datasets (forest and urban scenarios) generated using Unreal Engine. We show that, although trained only on synthetic data, the networks generalize well across different, unseen real-world scenarios (KITTI and newly collected datasets from Zurich, Switzerland, and Perugia, Italy) without any fine-tuning, achieving performance comparable to state-of-the-art methods. We also show that the LSTM network estimates the absolute scale well, with low additional computational overhead. We release the Unreal Engine 3D models and all the collected datasets (from Switzerland and Italy) freely to the public.



M. Mancini, G. Costante, P. Valigi, T.A. Ciarfuglia, J. Delmerico, D. Scaramuzza

Towards Domain Independence for Learning-Based Monocular Depth Estimation

IEEE Robotics and Automation Letters (RA-L), 2017.

PDF YouTube Dataset and Unreal-Engine 3D models

A Deep Learning Approach for Automatic Recognition and Following of Forest Trails with Drones

We study the problem of perceiving forest or mountain trails from a single monocular image acquired from the viewpoint of a robot traveling on the trail itself. Previous literature focused on trail segmentation and used low-level features such as image saliency or appearance contrast; we propose a different approach based on a deep neural network used as a supervised image classifier. By operating on the whole image at once, our system outputs the main direction of the trail relative to the viewing direction. Qualitative and quantitative results computed on a large real-world dataset (which we provide for download) show that our approach outperforms alternatives and yields an accuracy comparable to that of humans tested on the same image classification task. Preliminary results on using this information for quadrotor control on unseen trails are reported. To the best of our knowledge, this is the first paper that describes an approach to perceiving forest trails which is demonstrated on a quadrotor micro aerial vehicle.
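To make the classification idea concrete, here is a minimal sketch of how the output of such a classifier could be turned into a trail direction and a steering command. It is an illustration under stated assumptions, not the authors' implementation: the three class labels, the raw score values, and the `yaw_rate` steering rule with its `gain` parameter are all hypothetical and are not specified in the abstract above.

```python
import math

# Hypothetical class labels (an assumption): the trail lies to the left of,
# straight ahead of, or to the right of the camera's viewing direction.
CLASSES = ("left", "straight", "right")


def softmax(scores):
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def trail_direction(scores):
    """Return the most probable class label and the full probability vector."""
    probs = softmax(scores)
    best = max(range(len(CLASSES)), key=lambda i: probs[i])
    return CLASSES[best], probs


def yaw_rate(probs, gain=1.0):
    """Illustrative steering rule (an assumption): positive values turn right,
    negative values turn left, and the magnitude grows with the classifier's
    confidence that the trail lies to one side."""
    p_left, _, p_right = probs
    return gain * (p_right - p_left)


# Made-up scores standing in for the output of a trained network.
label, probs = trail_direction([0.2, 0.5, 2.0])
command = yaw_rate(probs)
```

Using the difference of the two side probabilities rather than the hard class label gives a command that shrinks smoothly toward zero when the classifier is unsure, which is one plausible design choice among several.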



A. Giusti, J. Guzzi, D.C. Ciresan, F. He, J.P. Rodríguez, F. Fontana, M. Faessler, C. Forster, J. Schmidhuber, G. Di Caro, D. Scaramuzza, L.M. Gambardella

A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots

IEEE Robotics and Automation Letters (RA-L), pages 661-667, 2016.

Nominated for AAAI Best Video Award!

PDF Project Webpage and Datasets DOI YouTube

"On-the-spot Training" for Terrain Classification in Autonomous Air-Ground Collaborative Teams

We consider the problem of performing rapid training of a terrain classifier in the context of a collaborative robotic search and rescue system. Our system uses a vision-based flying robot to guide a ground robot through unknown terrain to a goal location by building a map of terrain class and elevation. However, due to the unknown environments present in search and rescue scenarios, our system requires a terrain classifier that can be trained and deployed quickly, based on data collected on the spot. We investigate the effect of training set size and complexity on training time and accuracy, for both feature-based and convolutional neural network classifiers in this scenario. Our goal is to minimize the deployment time of the classifier in our terrain mapping system within acceptable classification accuracy tolerances. We are therefore not concerned with training a classifier that generalizes well, only one that works well for this particular environment. We demonstrate that we can launch our aerial robot, gather data, train a classifier, and begin building a terrain map after only 60 seconds of flight.
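The trade-off between training set size, training time, and accuracy can be explored with a small experiment like the sketch below. Everything in it is an assumption made for illustration: the data are synthetic two-dimensional features for two made-up terrain classes, and a nearest-centroid classifier stands in for the feature-based and convolutional classifiers studied in the paper. The function names (`make_samples`, `train_centroids`, `sweep`) are hypothetical.

```python
import random
import time


def make_samples(n, rng):
    """Generate n synthetic (features, label) pairs for two made-up terrain classes."""
    samples = []
    for _ in range(n):
        label = rng.randrange(2)
        center = 0.0 if label == 0 else 3.0
        features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        samples.append((features, label))
    return samples


def train_centroids(samples):
    """Fit a nearest-centroid classifier: one mean feature vector per class."""
    sums, counts = {}, {}
    for (x, y), label in samples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}


def predict(centroids, features):
    """Return the class whose centroid is closest to the feature vector."""
    x, y = features
    return min(
        centroids,
        key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
    )


def accuracy(centroids, samples):
    """Fraction of samples whose predicted class matches the true label."""
    hits = sum(predict(centroids, f) == label for f, label in samples)
    return hits / len(samples)


def sweep(sizes, seed=0):
    """For each training set size, record (size, training seconds, test accuracy)."""
    rng = random.Random(seed)
    test_set = make_samples(500, rng)  # held-out data from the same environment
    results = []
    for n in sizes:
        train_set = make_samples(n, rng)
        start = time.perf_counter()
        model = train_centroids(train_set)
        elapsed = time.perf_counter() - start
        results.append((n, elapsed, accuracy(model, test_set)))
    return results


results = sweep([20, 200, 2000])
for n, seconds, acc in results:
    print(f"{n:5d} samples: trained in {seconds:.6f} s, accuracy {acc:.3f}")
```

In a real deployment, the held-out set would come from the same flight as the training data, which matches the stated goal of working well in one particular environment rather than generalizing broadly.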



J. Delmerico, A. Giusti, E. Mueggler, L.M. Gambardella, D. Scaramuzza

"On-the-spot Training" for Terrain Classification in Autonomous Air-Ground Collaborative Teams

International Symposium on Experimental Robotics (ISER), Tokyo, 2016.

PDF YouTube