When training and test data are drawn from different distributions, the performance of a traditional object detection model drops significantly.
2. Proposed Solution
The Domain Adaptive Faster R-CNN model can be used to detect objects effectively across domains. Domain variation may result from differences in camera type, weather conditions, object appearance, image quality, backgrounds, etc. The following variations were considered here:
Datasets: Cityscapes, KITTI, SIM 10k, etc.
In the Faster R-CNN model, the image-level representation refers to the feature map outputs of the base convolutional layers (see the green parallelogram in Fig 1). To eliminate the domain distribution mismatch on the image level, a patch-based domain classifier was employed, as shown in the lower right part of Fig 1.
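A patch-based classifier can be read as one domain prediction per spatial location of the feature map, with the loss averaged over all locations. The following is a minimal pure-Python sketch of that idea, not the original implementation. The function name `patch_domain_loss`, the linear (1x1) form of the classifier, and the label convention (0 = source, 1 = target) are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def patch_domain_loss(feature_map, weights, bias, domain_label):
    """Mean binary cross-entropy over every spatial location of one feature map.

    feature_map:  H x W x C nested lists (stand-in for the backbone output).
    weights/bias: parameters of a linear per-location classifier (assumed form).
    domain_label: 0 for source, 1 for target (assumed convention).
    """
    eps = 1e-12  # guards against log(0)
    total, count = 0.0, 0
    for row in feature_map:
        for activation in row:
            # Each location (patch) gets its own domain prediction.
            logit = sum(w * a for w, a in zip(weights, activation)) + bias
            p = sigmoid(logit)
            total += -(domain_label * math.log(p + eps)
                       + (1 - domain_label) * math.log(1.0 - p + eps))
            count += 1
    return total / count


# Toy 2 x 2 feature map with 2 channels, labelled as a target-domain image.
fmap = [[[0.5, -1.0], [1.5, 0.2]],
        [[-0.3, 0.8], [0.0, 0.0]]]
loss = patch_domain_loss(fmap, weights=[1.0, -1.0], bias=0.0, domain_label=1)
print(round(loss, 4))
```

Because every location contributes a term, a single image yields many training samples for the domain classifier.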
The instance-level representation refers to the ROI-based feature vectors before they are fed into the final category classifiers (i.e., the rectangles after the “FC” layer in Fig 1). Aligning the instance-level representations helps to reduce local instance differences such as object appearance, size, viewpoint, etc. A domain classifier for the feature vectors was trained to align the instance-level distribution.
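To make "a domain classifier for the feature vectors was trained" concrete, the sketch below fits a logistic-regression domain classifier on a handful of ROI feature vectors with plain gradient descent. It is an illustration under assumptions: the helper names, the toy 2-D vectors, the learning rate, and the label convention (0 = source, 1 = target) are all hypothetical, and the real classifier may have a different architecture.

```python
import math


def predict(w, b, x):
    """Probability that ROI feature vector x comes from the target domain."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def mean_bce(w, b, rois, labels):
    """Mean binary cross-entropy over all ROI feature vectors."""
    eps = 1e-12
    total = 0.0
    for x, d in zip(rois, labels):
        p = predict(w, b, x)
        total += -(d * math.log(p + eps) + (1 - d) * math.log(1.0 - p + eps))
    return total / len(rois)


def sgd_step(w, b, rois, labels, lr=0.5):
    """One full-batch gradient-descent update of the classifier parameters."""
    n = len(rois)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, d in zip(rois, labels):
        err = predict(w, b, x) - d  # gradient of the loss w.r.t. the logit
        for j, xj in enumerate(x):
            grad_w[j] += err * xj / n
        grad_b += err / n
    new_w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    return new_w, b - lr * grad_b


# Toy ROI vectors: the first two from the source domain, the last two from the target.
rois = [[1.0, 0.2], [0.8, -0.1], [-0.9, 0.3], [-1.1, 0.0]]
labels = [0, 0, 1, 1]

w, b = [0.0, 0.0], 0.0
before = mean_bce(w, b, rois, labels)
for _ in range(20):
    w, b = sgd_step(w, b, rois, labels)
after = mean_bce(w, b, rois, labels)
print(before > after)
```

This covers only the classifier side; how the detector's features are updated against the classifier to achieve alignment is outside this sketch.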
2. Faster R-CNN
Faster R-CNN is composed of three major components: shared bottom convolutional layers, a region proposal network (RPN), and a region-of-interest (ROI) based classifier.
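The way the three components hand data to each other can be sketched as a small container class. This is a structural illustration only: `TwoStageDetector` and the `toy_*` functions are hypothetical stand-ins with trivial logic, not real network layers or any library's API.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class TwoStageDetector:
    backbone: Callable  # image -> feature map
    rpn: Callable       # feature map -> list of candidate boxes
    roi_head: Callable  # (feature map, box) -> (label, score)

    def detect(self, image):
        # The feature map is computed once and shared by both later stages.
        features = self.backbone(image)
        proposals = self.rpn(features)
        return [(box, *self.roi_head(features, box)) for box in proposals]


def toy_backbone(image):
    # Stand-in for the shared convolutional layers: copy pixels as floats.
    return [[float(px) for px in row] for row in image]


def toy_rpn(features):
    # Stand-in for the RPN: propose a single box covering the whole map.
    h, w = len(features), len(features[0])
    return [(0.0, 0.0, float(w), float(h))]


def toy_roi_head(features, box):
    # Stand-in for the ROI classifier: average the features inside the box.
    x1, y1, x2, y2 = (int(v) for v in box)
    values = [features[r][c] for r in range(y1, y2) for c in range(x1, x2)]
    score = sum(values) / len(values)
    return ("object" if score > 0.5 else "background", score)


detector = TwoStageDetector(toy_backbone, toy_rpn, toy_roi_head)
detections = detector.detect([[1, 0], [1, 1]])
print(detections)
```

Keeping the stages as separate callables mirrors the description above: the feature map and the per-ROI features are the two points where a domain classifier can be attached.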
3. Final Network
Domain Adaptive Faster R-CNN was obtained by augmenting Faster R-CNN with the domain adaptation components.
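One common way to train such an augmented network is to add the domain adaptation losses to the detection loss with a trade-off weight. The sketch below assumes that form; the function name `total_loss`, the weight `lam`, and its default value are hypothetical choices for illustration, and the text above does not specify how the terms are combined.

```python
def total_loss(det_loss, img_domain_loss, ins_domain_loss, lam=0.1):
    """Weighted sum of the detection loss and the two domain losses.

    det_loss:        Faster R-CNN detection loss (RPN + ROI classifier).
    img_domain_loss: loss of the image-level (patch-based) domain classifier.
    ins_domain_loss: loss of the instance-level (ROI) domain classifier.
    lam:             hypothetical trade-off weight, not a value from the text.
    """
    return det_loss + lam * (img_domain_loss + ins_domain_loss)


# Example with made-up loss values.
combined = total_loss(det_loss=1.0, img_domain_loss=0.7, ins_domain_loss=0.6)
print(round(combined, 2))
```

With `lam=0` the expression reduces to the plain Faster R-CNN objective, which makes the adaptation terms easy to switch off for comparison.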