Language-Guided 3D Object Detection

The goal of this project is to use language prompts to help find object parts in 3D.

Keywords: large language model, 3D scene understanding, 3D reconstruction

Description
This project uses language prompts to identify the components of 3D objects. Existing 3D semantic and instance segmentation methods mostly recognize whole objects and are prone to over- and under-segmentation errors. The project proposes integrating Large Language Models (LLMs) to segment the parts of composite objects more precisely, in both images and 3D representations. For example, if the object of interest is a microwave, the LLM can be queried for its typical components, such as the power button and display. The project then aims to locate these individual components in the 3D reconstruction and segmentation data, guided by the LLM’s output.
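The two steps described above (ask an LLM for part names, then locate those parts among 3D segments) could be prototyped roughly as follows. This is only a minimal sketch under stated assumptions, not the project's method: `list_parts` is a hypothetical stand-in that returns a canned answer instead of calling a real LLM, and the part embeddings and per-segment features are toy vectors assumed to live in a shared feature space. Matching by cosine similarity is one plausible choice among several.

```python
from math import sqrt


def list_parts(object_name):
    """Hypothetical stand-in for an LLM query such as
    'List the typical components of a <object_name>'.
    A real system would prompt a model; here the answer is canned."""
    canned = {"microwave": ["power button", "display"]}
    return canned.get(object_name, [])


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_parts_to_segments(part_embeddings, segment_features, min_score=0.5):
    """Assign each part name the 3D segment whose feature is most similar.

    part_embeddings:  {part name: vector}, assumed to come from a text encoder.
    segment_features: {segment id: vector}, assumed to come from the 3D
                      reconstruction and segmentation stage.
    Parts whose best similarity does not exceed min_score map to None.
    """
    matches = {}
    for part, embedding in part_embeddings.items():
        best_id, best_score = None, min_score
        for seg_id, feature in segment_features.items():
            score = cosine(embedding, feature)
            if score > best_score:
                best_id, best_score = seg_id, score
        matches[part] = best_id
    return matches


# Toy data (assumed, for illustration only).
parts = list_parts("microwave")
part_embeddings = {
    "power button": [1.0, 0.0, 0.0],
    "display": [0.0, 1.0, 0.0],
}
segment_features = {
    0: [0.9, 0.1, 0.0],
    1: [0.1, 0.8, 0.1],
    2: [0.0, 0.0, 1.0],
}
matches = match_parts_to_segments(
    {p: part_embeddings[p] for p in parts}, segment_features
)
print(matches)
```

The `min_score` threshold leaves a part unassigned when no segment resembles it, which is one simple way to avoid forcing a label onto an over- or under-segmented region.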
Goal
Not specified
Contact Details
Not specified

Calendar

Earliest start	2024-04-24
Latest end	No date

Location

Computer Vision and Geometry Group (ETHZ)

Labels

Semester Project
Master Thesis

Topics

Information, Computing and Communication Sciences