Although Neural Radiance Fields (NeRF) have recently become popular in
the computer vision community, registering multiple NeRFs has yet to gain much attention. Unlike the
existing work, NeRF2NeRF, which relies on traditional optimization methods and requires human-annotated
keypoints, we propose DReg-NeRF to solve the NeRF registration problem on object-centric scenes without human
intervention. After the NeRF models are trained, DReg-NeRF
first extracts features from the occupancy grid of each NeRF. It then uses a transformer architecture with self-attention and cross-attention layers to learn
the relations between pairwise NeRF blocks. In contrast
to state-of-the-art (SOTA) point cloud registration methods,
the decoupled correspondences are supervised by surface
fields without any ground-truth overlap labels. We construct a novel view synthesis dataset with 1,700+ 3D objects
obtained from Objaverse to train our network. When evaluated on the test set, our proposed method outperforms the SOTA
point cloud registration methods by a large margin.
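
To make the attention step described above concrete, the following is a minimal NumPy sketch of one self-attention pass per NeRF block followed by cross-attention in both directions. It is an illustration under stated assumptions, not the DReg-NeRF implementation: the function names (`attention`, `attend_pair`), the feature counts and dimension, the single attention head, and the absence of learned projections, normalization, and feed-forward layers are all simplifications introduced here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(queries, keys, values):
    """Single-head scaled dot-product attention (no learned projections)."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)  # (n_q, n_k) similarity scores
    weights = softmax(scores, axis=-1)      # each row sums to 1
    return weights @ values, weights


def attend_pair(feat_src, feat_tgt):
    """Self-attention within each block, then cross-attention between blocks."""
    # Self-attention: each block attends to its own features (residual added).
    src = feat_src + attention(feat_src, feat_src, feat_src)[0]
    tgt = feat_tgt + attention(feat_tgt, feat_tgt, feat_tgt)[0]
    # Cross-attention: queries from one block, keys and values from the other.
    src_update, w_src_to_tgt = attention(src, tgt, tgt)
    tgt_update, w_tgt_to_src = attention(tgt, src, src)
    return src + src_update, tgt + tgt_update, w_src_to_tgt, w_tgt_to_src


# Hypothetical stand-ins for features extracted from two occupancy grids.
rng = np.random.default_rng(0)
feat_src = rng.standard_normal((5, 8))  # 5 feature vectors of dimension 8
feat_tgt = rng.standard_normal((7, 8))  # 7 feature vectors of dimension 8

out_src, out_tgt, w_st, w_ts = attend_pair(feat_src, feat_tgt)
print(out_src.shape, out_tgt.shape, w_st.shape, w_ts.shape)
```

Each row of the cross-attention weight matrices is a distribution over the features of the other block, which is the kind of pairwise relation the transformer is meant to capture. How these relations are turned into correspondences and supervised by surface fields is not covered by this sketch.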