GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector

Abstract

We propose GO-N3RDet, a scene-geometry optimized multi-view 3D object detector enhanced by neural radiance fields. The key to accurate 3D object detection lies in effective voxel representation. However, due to occlusion and lack of 3D information, constructing 3D features from multi-view 2D images is challenging. To address this, we introduce a unique 3D positional information embedded voxel optimization mechanism to fuse multi-view features. To prioritize neural field reconstruction in object regions, we also devise a double importance sampling scheme for the NeRF branch of our detector. We additionally propose an opacity optimization module for precise voxel opacity prediction by enforcing multi-view consistency constraints. Moreover, to further improve voxel density consistency across multiple perspectives, we incorporate ray distance as a weighting factor to minimize cumulative ray errors. Our unique modules synergistically form an end-to-end neural model that establishes a new state of the art in NeRF-based multi-view 3D detection, verified with extensive experiments on ScanNet and ARKitScenes. Code will be available at https://github.com/ZechuanLi/GO-N3RDet.
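
As background for the sampling scheme mentioned above, the following is a minimal sketch of the standard hierarchical importance sampling used in the original NeRF formulation: volume-rendering weights are computed along a ray, and new sample depths are then drawn in proportion to those weights. It is not the double importance sampling scheme proposed here, whose details the abstract does not give. The function names `render_weights` and `importance_sample` and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def render_weights(sigma, t):
    """Standard NeRF volume-rendering weights for a single ray.

    sigma: (N,) non-negative densities at the sorted sample depths t: (N,).
    Returns w_i = T_i * alpha_i, where T_i is the accumulated transmittance.
    """
    # Distance between consecutive samples; the last interval is treated as open.
    delta = np.diff(t, append=t[-1] + 1e10)
    alpha = 1.0 - np.exp(-sigma * delta)
    # Transmittance: probability that the ray reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return trans * alpha

def importance_sample(t, weights, n_new, rng):
    """Draw n_new depths from the piecewise-constant PDF given by the weights.

    The weight of sample i is assigned to the interval [t[i], t[i+1]], and
    samples are drawn by inverting the resulting CDF.
    """
    w = weights[:-1] + 1e-5               # small floor avoids an all-zero PDF
    pdf = w / w.sum()
    cdf = np.concatenate([[0.0], np.cumsum(pdf)])
    u = rng.uniform(size=n_new)
    idx = np.clip(np.searchsorted(cdf, u, side="right") - 1, 0, len(pdf) - 1)
    frac = (u - cdf[idx]) / pdf[idx]      # position inside the selected interval
    return np.sort(t[idx] + frac * (t[idx + 1] - t[idx]))

# Hypothetical ray: coarse samples between depth 2 and 6, with a surface near depth 4.
rng = np.random.default_rng(0)
t_coarse = np.linspace(2.0, 6.0, 65)
sigma = 50.0 * np.exp(-((t_coarse - 4.0) / 0.1) ** 2)
weights = render_weights(sigma, t_coarse)
t_fine = importance_sample(t_coarse, weights, n_new=128, rng=rng)
```

In this toy setup the fine samples cluster around the high-density region, which is the behavior an object-focused sampling scheme builds on: sampling effort is spent where the radiance field indicates occupied space rather than uniformly along the ray.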