In this paper, we propose a correlated and individual multi-modal deep learning (CIMDL) method for RGB-D object recognition. Unlike most conventional RGB-D object recognition methods, which extract features from the RGB and depth channels individually, our CIMDL jointly learns feature representations from raw RGB-D data with a pair of deep neural networks, so that the shared and modal-specific information can be exploited simultaneously and explicitly. Specifically, we construct a pair of deep residual networks for the RGB and depth data, and concatenate them at the top layer of the network with a loss function that learns a new feature space in which both the correlated part and the individual part of the RGB-D information are well modelled. The parameters of the whole network are updated by back-propagation. Experimental results on two widely used RGB-D object image benchmark datasets clearly show that our method outperforms most state-of-the-art methods.
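The two-stream design described above can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation: the residual networks are replaced by a single linear map per stream, and the function names (`linear`, `make_weights`, `fuse`), the parameter `n_shared`, and the squared-distance penalty on the shared slices are all hypothetical choices made for illustration. The abstract does not specify the form of the loss, so the penalty here is only one plausible way to encourage a correlated part alongside an individual part.

```python
import random

def make_weights(n_in, n_out, rng):
    """Random weight matrix with n_in rows and n_out columns (hypothetical stand-in for a deep network)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]

def linear(x, w):
    """Multiply input vector x (length n_in) by weight matrix w (n_in x n_out)."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]

def fuse(rgb_feat, depth_feat, n_shared):
    """Concatenate the two stream features and compute a correlation penalty.

    Assumption: the first n_shared entries of each feature vector are treated
    as the correlated (shared) part, and the remaining entries as the
    individual (modal-specific) part. The penalty is the squared distance
    between the two shared slices; it is zero when they agree exactly.
    """
    rgb_shared = rgb_feat[:n_shared]
    depth_shared = depth_feat[:n_shared]
    penalty = sum((a - b) ** 2 for a, b in zip(rgb_shared, depth_shared))
    fused = rgb_feat + depth_feat  # top-layer concatenation of both streams
    return fused, penalty

rng = random.Random(0)
rgb_input = [rng.random() for _ in range(8)]    # placeholder RGB descriptor
depth_input = [rng.random() for _ in range(8)]  # placeholder depth descriptor

w_rgb = make_weights(8, 6, rng)
w_depth = make_weights(8, 6, rng)

rgb_feat = linear(rgb_input, w_rgb)
depth_feat = linear(depth_input, w_depth)
fused, penalty = fuse(rgb_feat, depth_feat, n_shared=2)
```

In a full training setup, a penalty of this kind would be added to a classification loss on `fused`, and both streams would be updated jointly by back-propagation; that part is omitted here.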