Learning Neural Light Fields with Ray-Space Embedding Networks

Benjamin Attal

Carnegie Mellon University

Jia-Bin Huang

Meta

Michael Zollhöfer

Reality Labs Research

Johannes Kopf

Meta

Changil Kim

Meta

Overview

We present Neural Light Fields with Ray-Space Embedding. A light field directly represents the integrated radiance along rays. Whereas neural radiance fields require many network evaluations per ray to approximate a volume rendering integral, rendering from a light field requires only a single evaluation per ray.
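For concreteness, the minimal sketch below shows what "one evaluation per ray" means in practice: a single MLP maps a 4D ray parameterization directly to color. The two-plane (u, v, s, t) input, the positional encoding, and the layer sizes are illustrative assumptions, not the architecture used in the paper.

```python
# Minimal sketch of a neural light field: one network evaluation per ray.
# Assumes rays come pre-parameterized as 4D two-plane coordinates (u, v, s, t);
# this is an illustration, not the paper's exact architecture.
import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """Standard frequency encoding applied to each ray coordinate."""

    def __init__(self, num_freqs=10):
        super().__init__()
        self.register_buffer("freqs", 2.0 ** torch.arange(num_freqs) * torch.pi)

    def forward(self, x):
        # x: (N, 4) ray coordinates -> (N, 4 * 2 * num_freqs) features
        scaled = x[..., None] * self.freqs                  # (N, 4, F)
        enc = torch.cat([scaled.sin(), scaled.cos()], dim=-1)
        return enc.flatten(start_dim=-2)


class NeuralLightField(nn.Module):
    """Maps a 4D ray directly to integrated radiance (RGB)."""

    def __init__(self, num_freqs=10, hidden=256):
        super().__init__()
        self.encode = PositionalEncoding(num_freqs)
        self.mlp = nn.Sequential(
            nn.Linear(4 * 2 * num_freqs, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, rays):
        # rays: (N, 4) -> colors: (N, 3), a single evaluation per ray
        return self.mlp(self.encode(rays))
```

Rendering an image then amounts to one batched forward pass over the pixel rays, rather than compositing many samples per ray as in volume rendering.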

However, learning a neural light field is challenging: naively applying popular coordinate-based network architectures leads to poor view synthesis quality. We address this with a novel ray-space embedding approach. Together with subdivision, our neural light fields offer a more favorable trade-off between quality, speed, and memory than the previous state of the art.
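The sketch below illustrates the structure of this idea under our assumptions: an embedding network first maps the 4D ray coordinates to a latent code, and a color network decodes that code to RGB, still with a single evaluation per ray. The layer sizes, embedding dimensionality, and exact form of the embedding are placeholders and differ from the paper; only the composition of the two networks is shown.

```python
# Rough sketch of the ray-space embedding idea: an embedding network
# re-parameterizes 4D ray space into a latent space that interpolates better,
# and a color network decodes the embedded ray to RGB. Sizes are illustrative.
import torch.nn as nn


class EmbeddedLightField(nn.Module):
    def __init__(self, ray_dim=4, embed_dim=32, hidden=256):
        super().__init__()
        # Embedding network: maps ray coordinates into a latent space.
        self.embed = nn.Sequential(
            nn.Linear(ray_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, embed_dim),
        )
        # Color network: decodes the embedded ray; still one evaluation per ray.
        self.color = nn.Sequential(
            nn.Linear(embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, rays):  # rays: (N, ray_dim) -> colors: (N, 3)
        return self.color(self.embed(rays))
```

With subdivision, a ray instead queries a small number of such local light field networks, which is roughly where the "few evaluations per pixel" for larger-baseline scenes comes from.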

Results

We demonstrate our method on a variety of scenes, both sparsely and densely sampled, from NeRF's Real Forward-Facing dataset, the Stanford Light Field dataset, and NeX-MPI's Shiny dataset. Our results show faithful reconstruction of highly complex view-dependent effects. More results, including baseline comparisons and ablations, can be found on our supplemental webpage. All rendered images from our method and all baseline methods are posted here.

[Image comparisons against ground truth: Ours 31.7dB / 2.1s vs. NeRF 31.1dB / 10.4s; Ours 36.8dB / 0.11s vs. NeRF 35.4dB / 13.7s vs. X-Fields 33.0dB / 0.05s.]

Abstract

Neural radiance fields (NeRFs) produce state-of-the-art view synthesis results. However, they are slow to render, requiring hundreds of network evaluations per pixel to approximate a volume rendering integral. Baking NeRFs into explicit data structures enables efficient rendering, but results in a large increase in memory footprint and, in many cases, a quality reduction. In this paper, we propose a novel neural light field representation that, in contrast, is compact and directly predicts integrated radiance along rays. Our method supports rendering with a single network evaluation per pixel for small baseline light field datasets and can also be applied to larger baselines with only a few evaluations per pixel. At the core of our approach is a ray-space embedding network that maps the 4D ray-space manifold into an intermediate, interpolable latent space. Our method achieves state-of-the-art quality on dense forward-facing datasets such as the Stanford Light Field dataset. In addition, for forward-facing scenes with sparser inputs we achieve results that are competitive with NeRF-based approaches in terms of quality while providing a better speed/quality/memory trade-off with far fewer network evaluations.
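As a concrete example of the 4D ray-space input referred to above, the sketch below computes two-plane ray coordinates by intersecting each ray with two parallel planes. The plane depths and this specific parameterization are assumptions made for illustration; the paper's ray parameterization may differ.

```python
# Illustrative two-plane ray parameterization for a forward-facing scene:
# intersect each ray with the planes z = z_uv and z = z_st, and use the
# intersection coordinates (u, v, s, t) as the 4D ray-space input.
import torch


def two_plane_coords(origins, directions, z_uv=0.0, z_st=1.0):
    """origins, directions: (N, 3) rays in a camera-facing coordinate frame."""
    t_uv = (z_uv - origins[:, 2]) / directions[:, 2]
    t_st = (z_st - origins[:, 2]) / directions[:, 2]
    uv = origins[:, :2] + t_uv[:, None] * directions[:, :2]  # (N, 2)
    st = origins[:, :2] + t_st[:, None] * directions[:, :2]  # (N, 2)
    return torch.cat([uv, st], dim=-1)                       # (N, 4)
```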

Overview Video

BibTeX

@inproceedings{attal2022learning,
  author    = {Benjamin Attal and Jia-Bin Huang and Michael Zollh{\"o}fer and Johannes Kopf and Changil Kim},
  title     = {Learning Neural Light Fields with Ray-Space Embedding Networks},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2022},
}