A novel multimodal object detection framework with sparse transformer and explicit attention module. The illustration of our proposed multimodal object detection framework is shown in the following ...