Deep Learning

Deep learning is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data. In our research, we apply deep learning to mobile robot navigation problems such as depth estimation, end-to-end navigation, and classification.

Towards Domain Independence for Learning-Based Monocular Depth Estimation

Modern autonomous mobile robots require a strong understanding of their surroundings in order to operate safely in cluttered and dynamic environments. Monocular depth estimation offers a geometry-independent paradigm to detect free, navigable space with minimum space and power consumption. These are highly desirable features, especially for micro aerial vehicles. To guarantee robust operation in real-world scenarios, the estimator is required to generalize well in diverse environments. Most existing depth estimators do not consider generalization, and only benchmark their performance on publicly available datasets after specific fine-tuning. Generalization can be achieved by training on several heterogeneous datasets, but their collection and labeling are costly. In this work, we propose a Deep Neural Network for scene depth estimation that is trained on synthetic datasets, which allow inexpensive generation of ground truth data. We show how this approach is able to generalize well across different scenarios. In addition, we show how adding Long Short-Term Memory (LSTM) layers to the network helps to alleviate, in sequential image streams, some of the intrinsic limitations of monocular vision, such as global scale estimation, with low computational overhead. We demonstrate that the network is able to generalize well to different real-world environments without any fine-tuning, achieving performance comparable to state-of-the-art methods on the KITTI dataset.



M. Mancini, G. Costante, P. Valigi, T.A. Ciarfuglia, J. Delmerico, D. Scaramuzza

Towards Domain Independence for Learning-Based Monocular Depth Estimation

IEEE Robotics and Automation Letters (RA-L), 2017.


A Deep Learning Approach for Automatic Recognition and Following of Forest Trails with Drones

We study the problem of perceiving forest or mountain trails from a single monocular image acquired from the viewpoint of a robot traveling on the trail itself. Previous literature focused on trail segmentation, and used low-level features such as image saliency or appearance contrast; we propose a different approach based on a Deep Neural Network used as a supervised image classifier. By operating on the whole image at once, our system outputs the main direction of the trail compared to the viewing direction. Qualitative and quantitative results computed on a large real-world dataset (which we provide for download) show that our approach outperforms alternatives, and yields an accuracy comparable to that of humans tested on the same image classification task. Preliminary results on using this information for quadrotor control on unseen trails are reported. To the best of our knowledge, this is the first paper that describes an approach to perceive forest trails which is demonstrated on a quadrotor micro aerial vehicle.
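As a rough illustration of how a trail-direction classifier output could feed a controller, the following sketch maps class probabilities to a velocity command. The three-class layout (left, straight, right), the function name `command_from_trail_probabilities`, the gains, and the sign convention are all assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class VelocityCommand:
    forward_speed: float  # m/s, along the viewing direction
    yaw_rate: float       # rad/s, positive turns left (assumed convention)


def command_from_trail_probabilities(p_left, p_straight, p_right,
                                     max_speed=1.0, yaw_gain=0.5):
    """Map assumed three-class trail-direction probabilities to a command.

    p_left / p_straight / p_right: classifier confidence that the trail
    heads left of, along, or right of the viewing direction. This class
    layout is hypothetical; the three values must sum to 1.
    """
    total = p_left + p_straight + p_right
    if abs(total - 1.0) > 1e-6:
        raise ValueError("probabilities must sum to 1")
    # Trail to the left -> yaw left (positive); to the right -> yaw right.
    yaw_rate = yaw_gain * (p_left - p_right)
    # Fly faster the more confident the classifier is that the trail is ahead.
    forward_speed = max_speed * p_straight
    return VelocityCommand(forward_speed, yaw_rate)


if __name__ == "__main__":
    print(command_from_trail_probabilities(0.7, 0.2, 0.1))
```

With this mapping, an uncertain classifier (probability spread across classes) slows the robot down instead of committing to a turn, which is one simple way to keep behavior conservative on unseen trails.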



A. Giusti, J. Guzzi, D.C. Ciresan, F. He, J.P. Rodríguez, F. Fontana, M. Faessler, C. Forster, J. Schmidhuber, G. Di Caro, D. Scaramuzza, L.M. Gambardella

A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots

IEEE Robotics and Automation Letters (RA-L), pages 661-667, 2016.

Nominated for AAAI Best Video Award!


"On-the-spot Training" for Terrain Classification in Autonomous Air-Ground Collaborative Teams

We consider the problem of performing rapid training of a terrain classifier in the context of a collaborative robotic search and rescue system. Our system uses a vision-based flying robot to guide a ground robot through unknown terrain to a goal location by building a map of terrain class and elevation. However, due to the unknown environments present in search and rescue scenarios, our system requires a terrain classifier that can be trained and deployed quickly, based on data collected on the spot. We investigate the effect of training set size and complexity on training time and accuracy, for both feature-based and convolutional neural network classifiers in this scenario. Our goal is to minimize the deployment time of the classifier in our terrain mapping system within acceptable classification accuracy tolerances. We are therefore not concerned with training a classifier that generalizes well, only one that works well for this particular environment. We demonstrate that we can launch our aerial robot, gather data, train a classifier, and begin building a terrain map after only 60 seconds of flight.
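The trade-off between training set size, training time, and accuracy can be explored with a small timing sweep. The sketch below is only illustrative: it uses a nearest-centroid classifier on synthetic 2-D features as a stand-in, not the feature-based or convolutional classifiers studied in the paper, and the class names `grass` and `gravel` and all helper names are hypothetical.

```python
import random
import time


def make_samples(n, seed):
    """Synthetic 2-D 'terrain' features for two hypothetical classes."""
    rng = random.Random(seed)
    centers = {"grass": (0.0, 0.0), "gravel": (4.0, 4.0)}
    samples = []
    for i in range(n):
        label = "grass" if i % 2 == 0 else "gravel"
        cx, cy = centers[label]
        samples.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return samples


def train_nearest_centroid(samples):
    """Return the mean feature vector of each class."""
    sums = {}
    for (x, y), label in samples:
        sx, sy, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, count + 1)
    return {label: (sx / c, sy / c) for label, (sx, sy, c) in sums.items()}


def predict(centroids, point):
    """Assign the class whose centroid is closest to the point."""
    x, y = point
    return min(
        centroids,
        key=lambda lab: (centroids[lab][0] - x) ** 2 + (centroids[lab][1] - y) ** 2,
    )


def accuracy(centroids, samples):
    hits = sum(predict(centroids, p) == label for p, label in samples)
    return hits / len(samples)


def sweep(train_sizes, test_samples):
    """Train once per size; record (size, training seconds, test accuracy)."""
    results = []
    for n in train_sizes:
        train = make_samples(n, seed=n)
        start = time.perf_counter()
        centroids = train_nearest_centroid(train)
        elapsed = time.perf_counter() - start
        results.append((n, elapsed, accuracy(centroids, test_samples)))
    return results


if __name__ == "__main__":
    test_set = make_samples(500, seed=12345)
    for n, seconds, acc in sweep([10, 100, 1000], test_set):
        print(f"train size {n:5d}: {seconds * 1e3:.3f} ms, accuracy {acc:.3f}")
```

In a deployment setting, a sweep of this kind would be used to pick the smallest training set that still meets the accuracy tolerance, since collecting and training on more data costs flight time.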



J. Delmerico, A. Giusti, E. Mueggler, L.M. Gambardella, D. Scaramuzza

"On-the-spot Training" for Terrain Classification in Autonomous Air-Ground Collaborative Teams

International Symposium on Experimental Robotics (ISER), Tokyo, 2016.
