
Gradio Interface for SpatialLM Model

SpatialLM: A 3D Large Language Model for Structured Scene Understanding, Processing Point Cloud Data from Monocular Videos, RGBD Images, and LiDAR.

Watch the video

Run demo

    python gradio_demo.py
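For orientation, below is a minimal sketch of the shape such a Gradio wrapper can take. The run_spatiallm function, the widgets, and the returned text are hypothetical placeholders, not the actual contents of gradio_demo.py:

# Hypothetical sketch of a minimal Gradio wrapper around SpatialLM inference;
# the real gradio_demo.py differs (it also wires up the gradio_rerun viewer).
import gradio as gr

def run_spatiallm(point_cloud_path: str) -> str:
    # Placeholder: the real demo loads the .ply, runs the SpatialLM model,
    # and returns the predicted structured layout (walls, doors, boxes).
    return f"Structured layout for {point_cloud_path} would appear here."

demo = gr.Interface(
    fn=run_spatiallm,
    inputs=gr.File(label="Point cloud (.ply)", type="filepath"),
    outputs=gr.Textbox(label="Predicted layout"),
    title="SpatialLM Gradio Demo",
)

if __name__ == "__main__":
    demo.launch()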

Project GitHub
Hugging Face Dataset

Introduction

SpatialLM is a 3D large language model designed to process 3D point cloud data and generate structured 3D scene understanding outputs. These outputs include architectural elements like walls, doors, windows, and oriented object bounding boxes with their semantic categories. Unlike previous methods that require specialized equipment for data collection, SpatialLM can handle point clouds from diverse sources such as monocular video sequences, RGBD images, and LiDAR sensors. This multimodal architecture effectively bridges the gap between unstructured 3D geometric data and structured 3D representations, offering high-level semantic understanding. It enhances spatial reasoning capabilities for applications in embodied robotics, autonomous navigation, and other complex 3D scene analysis tasks. Project Page | Official Code

SpatialLM Models

Model               | Download
SpatialLM-Llama-1B  | 🤗 HuggingFace
SpatialLM-Qwen-0.5B | 🤗 HuggingFace
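The weights can also be fetched programmatically with the standard huggingface_hub client. A minimal sketch; the repo id is inferred from the model name above and should be verified on Hugging Face:

# Sketch: download SpatialLM weights via huggingface_hub.
# The repo id is an assumption based on the model name in the table.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="manycore-research/SpatialLM-Llama-1B")
print(f"Model files downloaded to {local_dir}")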

Usage

Installation

Tested with the following environment:

  • Python 3.11
  • PyTorch 2.4.1
  • CUDA Version 12.4

# Clone the repository
git clone https://github.com/manycore-research/SpatialLM-Gradio.git
cd SpatialLM-Gradio

# Create a conda environment with CUDA 12.4
conda create -n spatiallm-gradio python=3.11
conda activate spatiallm-gradio
conda install -y nvidia/label/cuda-12.4.0::cuda-toolkit conda-forge::sparsehash

# Install dependencies with Poetry
pip install poetry && poetry config virtualenvs.create false --local
poetry install
poe install-torchsparse  # Building the torchsparse wheel can take a while

# Install the Rerun viewer component used by the Gradio demo
pip install gradio_rerun
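After installation, a quick sanity check that the environment matches the tested versions listed above:

# Sketch: confirm PyTorch and CUDA match the tested environment.
import torch

print(f"PyTorch version: {torch.__version__}")         # expected 2.4.1
print(f"CUDA available:  {torch.cuda.is_available()}") # expected True
print(f"CUDA version:    {torch.version.cuda}")        # expected 12.4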

Inference

In the current version of SpatialLM, input point clouds are assumed to be axis-aligned, with the z-axis as the up axis. This orientation keeps spatial understanding and scene interpretation consistent across datasets and applications. Example preprocessed point clouds, reconstructed from RGB videos with MASt3R-SLAM, are available in SpatialLM-Testset.
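A point cloud reconstructed with a different up axis must be rotated to z-up before inference. A minimal sketch in plain NumPy, assuming a y-up input (the helper name is illustrative):

# Sketch: rotate a y-up point cloud into the z-up convention SpatialLM expects.
# Assumes points is an (N, 3) array; the y-up input convention is an assumption.
import numpy as np

def y_up_to_z_up(points: np.ndarray) -> np.ndarray:
    # 90-degree rotation about the x-axis: (x, y, z) -> (x, -z, y)
    R = np.array([[1.0, 0.0,  0.0],
                  [0.0, 0.0, -1.0],
                  [0.0, 1.0,  0.0]])
    return points @ R.T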

Download an example point cloud:

huggingface-cli download manycore-research/SpatialLM-Testset pcd/scene0000_00.ply --repo-type dataset --local-dir .
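To sanity-check the download before launching the demo, the scene can be loaded and inspected. A minimal sketch using open3d, which is a convenience here rather than a dependency of this repository:

# Sketch: load the example scene and confirm it is a non-empty point cloud.
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("pcd/scene0000_00.ply")
points = np.asarray(pcd.points)
print(f"Loaded {len(points)} points")
print(f"Extent per axis (x, y, z): {points.max(axis=0) - points.min(axis=0)}")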

License

SpatialLM-Llama-1B is derived from Llama3.2-1B-Instruct, which is licensed under the Llama3.2 license. SpatialLM-Qwen-0.5B is derived from the Qwen-2.5 series, originally licensed under the Apache 2.0 License.

All models are built upon the SceneScript point cloud encoder, licensed under the CC-BY-NC-4.0 License. TorchSparse, utilized in this project, is licensed under the MIT License.

Acknowledgements

I would like to thank the following projects that made this work possible:

SpatialLM
