SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection

By Prarthana Bhattacharyya, Chengjie Huang and Krzysztof Czarnecki.

We provide code support and configuration files to reproduce the results in the paper: SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection.
Our code is based on OpenPCDet, a clean open-source project for benchmarking 3D object detection methods.

Overview

Fig.1. Self-Attention augmented global-context aware backbone networks.

In this paper, we explore variations of self-attention for contextual modeling in 3D object detection by augmenting convolutional features with self-attention features.
We first incorporate the pairwise self-attention mechanism into current state-of-the-art BEV, voxel, point and point-voxel based detectors and show consistent improvement over strong baseline models, while significantly reducing their parameter footprint and computational cost. We call this variant full self-attention (FSA).
We also propose a self-attention variant that samples a subset of the most representative features by learning deformations over randomly sampled locations. This allows us to scale explicit global contextual modeling to larger point clouds, and it also leads to more discriminative and informative feature descriptors. We call this variant deformable self-attention (DSA).
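The difference between the two variants can be illustrated with a toy, dependency-free sketch. This is not the code shipped in this repository: it assumes identity projections (no learned query/key/value weights), concatenation as the way context features are combined with the input features, a nearest-neighbour lookup for gathering features at shifted locations, and a hypothetical `offset_fn` standing in for the learned deformation predictor. All function names are illustrative.

```python
import math
import random

def attend(queries, keys):
    """Scaled dot-product attention; the keys double as values (no learned projections)."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    contexts = []
    for query in queries:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        contexts.append([
            sum(e / total * key[d] for e, key in zip(exps, keys))
            for d in range(len(query))
        ])
    return contexts

def full_self_attention(features):
    """FSA-style: every location attends to every location (N x N scores)."""
    contexts = attend(features, features)
    # Assumption: context features are concatenated onto the input features.
    return [feat + ctx for feat, ctx in zip(features, contexts)]

def deformable_self_attention(coords, features, num_samples, offset_fn, rng):
    """DSA-style: attend only to features gathered at shifted random locations (N x K scores)."""
    subset = []
    for i in rng.sample(range(len(coords)), num_samples):
        dx, dy = offset_fn(features[i])  # hypothetical stand-in for a learned offset predictor
        shifted = (coords[i][0] + dx, coords[i][1] + dy)
        # Assumption: nearest-neighbour lookup instead of a learned or interpolated gather.
        nearest = min(range(len(coords)), key=lambda j: math.dist(shifted, coords[j]))
        subset.append(features[nearest])
    contexts = attend(features, subset)
    return [feat + ctx for feat, ctx in zip(features, contexts)]

# Toy data: 8 locations on a line, each with a 2-dimensional feature.
coords = [(float(i), 0.0) for i in range(8)]
features = [[math.sin(i), math.cos(i)] for i in range(8)]

fsa_out = full_self_attention(features)
dsa_out = deformable_self_attention(
    coords, features, num_samples=3,
    offset_fn=lambda feat: (0.5 * feat[0], 0.0),
    rng=random.Random(0),
)
print(len(fsa_out), len(fsa_out[0]), len(dsa_out), len(dsa_out[0]))
```

In this sketch the full variant computes one score per pair of locations, while the deformable variant computes scores only against the `num_samples` gathered features, which is what lets the second form scale to larger inputs.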

Results

Fig.2. 3D Car AP with respect to params and FLOPs of baseline and proposed self-attention variants.


Fig.3. Visualizing qualitative results between baseline and our proposed self-attention module.

Model Zoo

We provide our proposed detection models in this section. The table below reports 3D AP (R40) for the Car class at moderate difficulty on the KITTI 3D object detection validation set, together with parameter counts and FLOPs.

| Model | Car 3D AP | Params (M) | G-FLOPs | Download |
|---|---|---|---|---|
| PointPillar_baseline | 78.39 | 4.8 | 63.4 | PointPillar |
| PointPillar_red | 78.07 | 1.5 | 31.5 | PointPillar-red |
| PointPillar_DSA | 78.94 | 1.1 | 32.4 | PointPillar-DSA |
| PointPillar_FSA | 79.04 | 1.0 | 31.7 | PointPillar-FSA |
| SECOND_baseline | 81.61 | 4.6 | 76.7 | SECOND |
| SECOND_red | 81.11 | 2.5 | 51.2 | SECOND-red |
| SECOND_DSA | 82.03 | 2.2 | 52.6 | SECOND-DSA |
| SECOND_FSA | 81.86 | 2.2 | 51.9 | SECOND-FSA |
| Point-RCNN_baseline | 80.52 | 4.0 | 27.4 | Point-RCNN |
| Point-RCNN_red | 80.40 | 2.2 | 24 | Point-RCNN-red |
| Point-RCNN_DSA | 81.80 | 2.3 | 19.3 | Point-RCNN-DSA |
| Point-RCNN_FSA | 82.10 | 2.5 | 19.8 | Point-RCNN-FSA |
| PV-RCNN_baseline | 84.83 | 12 | 89 | PV-RCNN |
| PV-RCNN_DSA | 84.71 | 10 | 64 | PV-RCNN-DSA |
| PV-RCNN_FSA | 84.95 | 10 | 64.3 | PV-RCNN-FSA |
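To make the trade-off in the table easier to read, the short script below computes, for each detector, how the FSA variant differs from its baseline in AP, parameter count and G-FLOPs. The numbers are copied from the table above; the dictionary layout and the `relative_change` helper are illustrative and are not part of this repository.

```python
# (Car 3D AP, params in millions, G-FLOPs), copied from the table above.
results = {
    "PointPillar": {"baseline": (78.39, 4.8, 63.4), "FSA": (79.04, 1.0, 31.7)},
    "SECOND": {"baseline": (81.61, 4.6, 76.7), "FSA": (81.86, 2.2, 51.9)},
    "Point-RCNN": {"baseline": (80.52, 4.0, 27.4), "FSA": (82.10, 2.5, 19.8)},
    "PV-RCNN": {"baseline": (84.83, 12, 89), "FSA": (84.95, 10, 64.3)},
}

def relative_change(new, old):
    """Percentage change from old to new (negative means a reduction)."""
    return 100.0 * (new - old) / old

summary = {}
for name, rows in results.items():
    base_ap, base_params, base_flops = rows["baseline"]
    ap, params, flops = rows["FSA"]
    summary[name] = (
        ap - base_ap,                          # absolute AP difference
        relative_change(params, base_params),  # % change in parameters
        relative_change(flops, base_flops),    # % change in G-FLOPs
    )
    ap_diff, params_pct, flops_pct = summary[name]
    print(f"{name}: AP {ap_diff:+.2f}, params {params_pct:+.0f}%, G-FLOPs {flops_pct:+.0f}%")
```

The same loop can be pointed at the `DSA` or `red` rows by adding them to the dictionary.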

Usage

a. Clone the repo:

git clone --recursive https://github.com/AutoVision-cloud/SA-Det3D

b. Copy the SA-Det3D source into OpenPCDet:

sh ./init.sh

c. Install OpenPCDet and prepare KITTI data:

Please refer to INSTALL.md for installation and dataset preparation.

d. Run experiments with a specific configuration file:

Please refer to GETTING_STARTED.md to learn more about how to train and run inference on this detector.

Citation

If you find this project useful in your research, please consider citing:

@misc{bhattacharyya2021sadet3d,
      title={SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection}, 
      author={Prarthana Bhattacharyya and Chengjie Huang and Krzysztof Czarnecki},
      year={2021},
      eprint={2101.02672},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement