Abstract

Recent works such as BARF and GARF can bundle adjust camera poses with neural radiance fields (NeRF), which are based on coordinate MLPs. Despite the impressive results, these methods cannot be applied to Generalizable NeRFs (GeNeRFs), which require image feature extraction that is often based on more complicated 3D CNN or transformer architectures. In this work, we first analyze the difficulties of jointly optimizing camera poses with GeNeRFs, and then propose DBARF to tackle these issues. DBARF bundle adjusts camera poses by taking a cost feature map as an implicit cost function, and can be jointly trained with GeNeRFs in a self-supervised manner. Unlike BARF and its follow-up works, which apply only to per-scene optimized NeRFs and require accurate initial camera poses except on forward-facing scenes, our method generalizes across scenes and does not require good initialization. Experiments on real-world datasets show the effectiveness and generalization ability of DBARF.

Qualitative Analysis on LLFF dataset


Ground Truth
BARF
IBRNet
DBARF (Ours)

Rendering Results on LLFF dataset

We compare our novel view synthesis results with IBRNet, where our network is trained without ground-truth camera poses.

image depth image depth




IBRNet DBARF (ours)

Though DBARF only predicts relative camera poses, we estimate the absolute camera poses by chaining the relative poses along a maximum spanning tree, where the edge weight is the number of correspondences.
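The chaining step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `chain_poses`, its pose convention (`rel_poses[(i, j)]` maps view j into view i's frame as a 4x4 matrix), and the greedy Prim-style maximum-spanning-tree traversal are all assumptions made for this sketch.

```python
import numpy as np

def chain_poses(num_views, rel_poses, weights, root=0):
    """Recover absolute poses by chaining pairwise relative poses along a
    maximum spanning tree whose edge weights are correspondence counts.

    rel_poses: dict (i, j) -> 4x4 pose of view j expressed in view i's frame.
    weights:   dict (i, j) -> number of feature correspondences for that pair.
    """
    # Undirected adjacency; keep both directions so the relative pose can be
    # inverted when an edge is traversed backwards.
    adj = {v: [] for v in range(num_views)}
    for (i, j), w in weights.items():
        adj[i].append((j, w, (i, j), False))
        adj[j].append((i, w, (i, j), True))

    # Prim's algorithm for the *maximum* spanning tree: grow from the root,
    # always expanding along the heaviest edge that reaches a new vertex.
    abs_poses = {root: np.eye(4)}
    frontier = list(adj[root])  # entries: (vertex, weight, edge_key, inverted)
    while len(abs_poses) < num_views and frontier:
        frontier.sort(key=lambda e: e[1])
        v, _, (i, j), inverted = frontier.pop()  # heaviest remaining edge
        if v in abs_poses:
            continue
        rel = rel_poses[(i, j)]
        if inverted:
            rel = np.linalg.inv(rel)
        parent = j if inverted else i  # tree vertex this edge hangs off
        abs_poses[v] = abs_poses[parent] @ rel  # chain along the tree
        frontier.extend(adj[v])
    return abs_poses
```

For example, with relative translations 0→1 of (1, 0, 0) and 1→2 of (0, 1, 0), view 2's absolute translation comes out as (1, 1, 0).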

1st row: left: rendered images; right: rendered depth
2nd row: optimized camera poses
3rd row: optimized camera poses aligned to poses from COLMAP via a similarity transformation.

RGB Depth
forward-facing view
top-down view
forward-facing view
top-down view
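The similarity-transform alignment to COLMAP poses mentioned above is commonly computed in closed form with Umeyama's method on corresponding camera centers. A minimal sketch follows; the function name and the `(3, N)` input layout are assumptions of this sketch, not the paper's code.

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Least-squares similarity transform (scale s, rotation R, translation t)
    such that dst ≈ s * R @ src + t, via Umeyama's closed-form solution.

    src, dst: (3, N) arrays of corresponding 3D points (e.g. camera centers).
    """
    n = src.shape[1]
    mu_s = src.mean(axis=1, keepdims=True)
    mu_d = dst.mean(axis=1, keepdims=True)
    xs, xd = src - mu_s, dst - mu_d          # centered point sets
    cov = xd @ xs.T / n                      # cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                         # avoid a reflection
    R = U @ S @ Vt
    var_s = (xs ** 2).sum() / n              # variance of the source set
    s = np.trace(np.diag(D) @ S) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t
```

Given predicted and COLMAP camera centers, the recovered `(s, R, t)` maps the predicted trajectory into COLMAP's coordinate frame for the visual comparison.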

Qualitative Analysis on ScanNet dataset

ground-truth IBRNet DBARF

ground-truth IBRNet DBARF

Code

The code (PyTorch) is hosted on GitHub.
Instructions for downloading the datasets are also provided on the GitHub page.

Publications

CVPR 2023 paper: [ link ]

arXiv preprint: https://arxiv.org/abs/2104.06405

BibTex:

@InProceedings{Chen_2023_CVPR,
  author    = {Chen, Yu and Lee, Gim Hee},
  title     = {DBARF: Deep Bundle-Adjusting Generalizable Neural Radiance Fields},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2023},
  pages     = {24-34}
}