Radiance Fields are widely explored for 3D scene reconstruction and several downstream tasks, such as segmentation. However, prior radiance field segmentation methods require scene-specific training.
We propose distilling semantic features into Radiance Fields in a generalisable fashion using GNT, a transformer-based architecture, enabling 3D reconstruction and multi-view segmentation of arbitrary novel scenes. With fine-tuning, our method can distil any set of 2D features into a radiance field, providing better multi-view consistency than the original features.
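To make the distillation objective concrete, below is a minimal PyTorch sketch of the generic idea: per-sample semantic features are accumulated along each ray and supervised with 2D teacher features from a pre-trained feature extractor at the corresponding pixels. This is an illustration under simplifying assumptions, not our pipeline: GSN aggregates along rays with GNT's transformers rather than explicit volume-rendering weights, and the names here (render_ray_features, feature_distillation_loss) are hypothetical.

import torch
import torch.nn.functional as F

def render_ray_features(point_feats: torch.Tensor,
                        weights: torch.Tensor) -> torch.Tensor:
    """Accumulate per-sample semantic features along each ray.

    point_feats: (n_rays, n_samples, c) features predicted at 3D samples
    weights:     (n_rays, n_samples) volume-rendering weights
    returns:     (n_rays, c) one feature vector per ray/pixel
    """
    return (weights.unsqueeze(-1) * point_feats).sum(dim=1)

def feature_distillation_loss(rendered: torch.Tensor,
                              teacher: torch.Tensor) -> torch.Tensor:
    """L2 loss against 2D teacher features sampled at the pixels
    that the rays pass through."""
    return F.mse_loss(rendered, teacher)

if __name__ == "__main__":
    n_rays, n_samples, c = 1024, 64, 64
    point_feats = torch.randn(n_rays, n_samples, c, requires_grad=True)
    # Softmax gives valid (non-negative, normalised) rendering weights.
    weights = torch.softmax(torch.randn(n_rays, n_samples), dim=-1)
    teacher = torch.randn(n_rays, c)  # stand-in for 2D teacher features

    loss = feature_distillation_loss(
        render_ray_features(point_feats, weights), teacher)
    loss.backward()  # gradients flow back to the per-sample features
    print(f"distillation loss: {loss.item():.4f}")

Once features are distilled this way, segmentation of a novel scene reduces to rendering per-pixel features from the field and clustering or matching them, which is what gives the multi-view consistency the 2D features alone lack.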
We show multi-view segmentation results on standard datasets and compare our method against existing NeRF-based segmentation methods, performing on par with state-of-the-art scene-specific approaches. Our approach and experiments bring generalisable NeRF methods one step closer to the contemporary NeRF literature.
Segmentation results on four scenes from the Real Iconic dataset. Even though these scenes were seen during training of neither the radiance field nor the feature field, we still perform on par with scene-specific segmentation methods.
@inproceedings{gupta2024gsn,
  author    = {Gupta, Vinayak and Goel, Rahul and Sirikonda, Dhawal and Narayanan, P. J.},
  title     = {GSN: Generalisable Segmentation in Neural Radiance Fields},
  booktitle = {AAAI},
  year      = {2024},
}