GSN: Generalisable Segmentation in Neural Radiance Fields

AAAI 2024
¹IIT Madras, ²IIIT Hyderabad


Radiance Fields are widely explored for 3D scene reconstruction and for downstream tasks such as segmentation. Prior radiance field segmentation methods, however, require scene-specific training.

We propose distilling semantic features into Radiance Fields in a generalisable fashion using GNT, a transformer-based architecture, enabling 3D reconstruction and multi-view segmentation on arbitrary new scenes. By fine-tuning our method, any set of 2D features can be distilled into a radiance field, providing better multi-view consistency than the original features.
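To make the idea of feature distillation concrete, here is a minimal sketch of the kind of objective such a setup typically minimises: the features predicted by the student for a set of pixels are pulled towards the 2D teacher features for the same pixels. The function name `distillation_loss` and the choice of a plain mean squared error are assumptions made for illustration; the text above does not state which loss GSN actually uses.

```python
def distillation_loss(student_feats, teacher_feats):
    """Mean squared error between student and teacher features.

    Both arguments are lists of feature vectors, one vector per pixel (ray).
    Hypothetical helper for illustration; not taken from the GSN code.
    """
    if len(student_feats) != len(teacher_feats):
        raise ValueError("student and teacher must cover the same pixels")
    total, count = 0.0, 0
    for s, t in zip(student_feats, teacher_feats):
        if len(s) != len(t):
            raise ValueError("feature dimensions must match")
        # Accumulate the squared error over every feature channel.
        total += sum((a - b) ** 2 for a, b in zip(s, t))
        count += len(s)
    if count == 0:
        raise ValueError("no features given")
    return total / count


# Toy usage: two pixels with 3-channel features.
student = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
teacher = [[0.0, 0.0, 0.0], [1.0, 1.0, 3.0]]
loss = distillation_loss(student, teacher)
```

In practice the teacher features would come from a 2D extractor such as DINO, and the loss would be minimised with an automatic differentiation framework rather than evaluated on Python lists.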

We show multi-view segmentation results on standard datasets and compare our method against existing NeRF-based segmentation methods. We perform on par with the state-of-the-art scene-specific segmentation methods. Our approach and experiments bring generalisable NeRF methods one step closer to the contemporary NeRF literature.


Overview of GSN

DINO-Integrated Segmentation Results


Segmentation results on the four scenes from the Real Iconic Dataset. Even though these scenes were seen during training of neither the radiance field nor the feature field, we still perform on par with the scene-specific segmentation methods.

Other Semantic Field Segmentation Results


We show segmentation results using other semantic features integrated into our model: CLIP, DINOv2 and SAM, on the NeRF-LLFF dataset.

GSN Produces Finer Semantics than DINO


As the student, GSN can surpass its teacher feature extractors, producing better features in a multi-view setting.

GSN is More View-Consistent than GNT


Our method generates multi-view consistent features, which give view-consistent clusters, as highlighted in the boxes.
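One way to picture what view-consistent clusters means is to label the features of every view against a single shared set of cluster centres and then check how often corresponding points receive the same label. The sketch below does exactly that under strong simplifying assumptions: the helper names are hypothetical, the centroids are given rather than learned, and the feature at index i in each view is assumed to correspond to the same 3D point. It is not the evaluation protocol used for GSN.

```python
def nearest_centroid(feature, centroids):
    """Return the index of the centroid closest (squared L2) to `feature`."""
    best_idx, best_dist = 0, float("inf")
    for idx, centre in enumerate(centroids):
        dist = sum((f - c) ** 2 for f, c in zip(feature, centre))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def label_view(view_feats, centroids):
    """Assign a cluster label to every feature vector of one view."""
    return [nearest_centroid(f, centroids) for f in view_feats]


def consistency(labels_a, labels_b):
    """Fraction of corresponding points given the same label in two views."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("views must have the same, non-zero number of points")
    same = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return same / len(labels_a)


# Toy usage: two cluster centres, two views of the same two 3D points.
centroids = [[0.0, 0.0], [1.0, 1.0]]
view_a = [[0.1, 0.0], [0.9, 1.0]]
view_b = [[0.0, 0.2], [1.1, 0.8]]
labels_a = label_view(view_a, centroids)
labels_b = label_view(view_b, centroids)
score = consistency(labels_a, labels_b)
```

A score close to 1 means the clusters agree across views; features that drift between views would pull the score down.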


  author    = {Gupta, Vinayak and Goel, Rahul and Sirikonda, Dhawal and Narayanan, P. J.},
  title     = {GSN: Generalisable Segmentation in Neural Radiance Fields},
  journal   = {AAAI},
  year      = {2024},