Deep Convolutional Neural Network-based Multisensor Fusion for Autonomous Vehicles
Multisensor fusion methods are widely used in many real-world applications such as autonomous systems, remote sensing, video surveillance, and military applications. The objective of multisensor fusion is to combine data from multiple sensors in order to obtain complementary information about the scene. The data may come either from a single sensor operated with different capture parameters or from several distinct sensors. The Deep Convolutional Neural Network (DCNN) has emerged as one of the main deep learning models and has been successfully applied to a wide range of computer vision tasks with state-of-the-art performance. For this reason, most multisensor fusion architectures for computer vision are built on DCNNs. DCNNs are also well suited to processing multisensory data, which usually carries rich information in its raw form while being sensitive to training time and model size. However, multisensor fusion approaches face two main challenges: (1) extracting features from different types of sensory data and (2) selecting a suitable fusion level. In this repository, we introduce the trend of DCNN-based multisensor fusion for object detection and describe some of our research objectives and contributions on this topic.
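To make the notion of a fusion level concrete, the sketch below contrasts early fusion (concatenating modalities at the raw-data level) with late fusion (merging per-modality feature maps deeper in the network). It is a minimal PyTorch illustration; the class names, layer sizes, and channel counts are placeholders, not the architecture of any specific detector discussed here.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Small convolutional stage shared by both fusion variants below.
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.BatchNorm2d(out_ch),
                         nn.ReLU(inplace=True))

class EarlyFusionNet(nn.Module):
    """Early fusion: stack RGB (3 ch) and depth (1 ch) into a 4-channel input."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(conv_block(4, 32), conv_block(32, 64))

    def forward(self, rgb, depth):
        x = torch.cat([rgb, depth], dim=1)  # fuse at the raw-data level
        return self.backbone(x)

class LateFusionNet(nn.Module):
    """Late fusion: one stream per modality, merged at the feature level."""
    def __init__(self):
        super().__init__()
        self.rgb_stream = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
        self.depth_stream = nn.Sequential(conv_block(1, 32), conv_block(32, 64))
        self.head = conv_block(128, 64)  # operates on concatenated feature maps

    def forward(self, rgb, depth):
        f = torch.cat([self.rgb_stream(rgb), self.depth_stream(depth)], dim=1)
        return self.head(f)

# Dummy inputs: one RGB image and one single-channel depth image.
rgb, depth = torch.randn(1, 3, 128, 256), torch.randn(1, 1, 128, 256)
print(EarlyFusionNet()(rgb, depth).shape, LateFusionNet()(rgb, depth).shape)
```

Choosing between these levels (and intermediate variants) is exactly the fusion-level selection challenge mentioned above.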
Fusing LiDAR and Color Imagery for Object Detection using Convolutional Neural Networks [1]:
The goal of this work is to answer the following question: how much can fusing LiDAR and color images improve the performance of a convolutional neural network (CNN)-based detector? To this end, we trained state-of-the-art CNN-based detectors on different configurations of color images and their associated LiDAR data, both jointly and independently. Moreover, we investigated the effect of sparse and dense LiDAR data on detection accuracy. For this purpose, we estimate a dense depth image from sparse LiDAR data using a recent self-supervised depth completion technique [2] that requires only sequences of color and sparse depth images, without the need for dense depth labels. We then compared the detectors when trained on sparse or dense LiDAR data. The results obtained on the KITTI dataset show that fusing dense LiDAR and color images is an efficient solution for future object detectors.
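As background for the sparse LiDAR inputs used in the frameworks below, the following sketch shows one common way to turn a LiDAR point cloud into a sparse 2D depth image: project the points into the camera frame using KITTI-style calibration matrices and keep the nearest depth per pixel. The function and variable names are illustrative assumptions, not code released with the paper.

```python
import numpy as np

def lidar_to_sparse_depth(points, T_velo_to_cam, P_rect, img_h, img_w):
    """Project LiDAR points (N, 3) into the image plane to build a sparse depth map.

    T_velo_to_cam: 4x4 extrinsic matrix (velodyne frame -> rectified camera frame).
    P_rect:        3x4 camera projection matrix (KITTI-style calibration).
    """
    # Homogeneous coordinates, transformed into the camera frame.
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])   # (N, 4)
    pts_cam = (T_velo_to_cam @ pts_h.T).T                        # (N, 4)

    # Keep only points in front of the camera.
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]

    # Project onto the image plane and round to pixel coordinates.
    proj = (P_rect @ pts_cam.T).T                                # (N, 3)
    u = np.round(proj[:, 0] / proj[:, 2]).astype(int)
    v = np.round(proj[:, 1] / proj[:, 2]).astype(int)
    z = pts_cam[:, 2]

    # Discard points that fall outside the image boundaries.
    valid = (u >= 0) & (u < img_w) & (v >= 0) & (v < img_h)
    u, v, z = u[valid], v[valid], z[valid]

    # Scatter depths into a sparse image, keeping the nearest point per pixel.
    depth = np.zeros((img_h, img_w), dtype=np.float32)
    for ui, vi, zi in zip(u, v, z):
        if depth[vi, ui] == 0 or zi < depth[vi, ui]:
            depth[vi, ui] = zi
    return depth
```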
Fig.1 illustrates our proposed frameworks. Using a common detection network structure, different kinds of data are used to perform network training as follows:
- Color-based framework: uses only color images for training the detection network as shown in Fig.1(a).
- Sparse LiDAR-based framework: uses only sparse depth images for training the detection network, as shown in Fig.1(b). This framework is similar to the color-based framework, except that LiDAR images are used instead of camera images; there is no fusion in this experiment. The sparse depth image is obtained by projecting the LiDAR point cloud onto the 2D image plane following [11] (a minimal projection sketch is given above).
- Dense LiDAR-based framework: uses only dense depth images for training the detection network, as shown in Fig.1(c). The dense depth image is obtained with the self-supervised depth completion algorithm [2]. Like the two frameworks above, this framework involves no fusion.
- Color and dense LiDAR-based framework: uses both color and dense LiDAR images for training the detection network, as shown in Fig.1(d). This framework is described in detail in [1]; a sketch of adapting a common detector to such input configurations is given below.
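The four configurations above differ mainly in the tensor fed to a common detection network. As one possible realization (not necessarily the detectors evaluated in [1]), the sketch below adapts a torchvision Faster R-CNN so its backbone accepts an arbitrary number of input channels, e.g. 4 channels for the color and dense LiDAR case; the normalization statistics are placeholders to be replaced with values computed from the training set.

```python
import torch
import torch.nn as nn
import torchvision

def detector_for_channels(in_channels):
    """Build a Faster R-CNN whose ResNet-50 backbone accepts `in_channels` inputs.

    in_channels: 3 (color), 1 (sparse or dense depth), or 4 (color + dense depth).
    """
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
        weights=None, weights_backbone=None)

    # Replace the first convolution of the backbone to match the input channels.
    old = model.backbone.body.conv1
    model.backbone.body.conv1 = nn.Conv2d(
        in_channels, old.out_channels, kernel_size=old.kernel_size,
        stride=old.stride, padding=old.padding, bias=False)

    # The built-in transform normalizes per channel, so give it matching
    # statistics (placeholder values here).
    model.transform.image_mean = [0.5] * in_channels
    model.transform.image_std = [0.25] * in_channels
    return model

# Example: color + dense LiDAR stacked into a single 4-channel image (Fig.1(d)).
model = detector_for_channels(4).eval()
rgb = torch.rand(3, 368, 1224)
dense_depth = torch.rand(1, 368, 1224)
with torch.no_grad():
    outputs = model([torch.cat([rgb, dense_depth], dim=0)])
print(outputs[0].keys())  # boxes, labels, scores
```

The same constructor covers the other three frameworks by changing `in_channels` and the stacked input.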
References
- [1] F. Farahnakian and J. Heikkonen, “Fusing LiDAR and Color Imagery for Object Detection using Convolutional Neural Networks”, in Proceedings of the 23rd IEEE International Conference on Information Fusion (FUSION), 2020.
- [2] F. Ma, G. V. Cavalheiro, and S. Karaman, “Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera”, CoRR, abs/1807.00275, 2018.