
End-to-End-3D-Reconstruction-Paper-List

A personal list. With relevant research advancing fast and branching out widely, I will only add papers that fit my own needs from now on.



Cross View

  • CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion [NeurIPS 2022] [croco]

  • CroCo v2: Improved Cross-view Completion Pre-training for Stereo Matching and Optical Flow [ICCV 2023] [croco]

  • 3D-Consistent Image Inpainting with Diffusion Models [arXiv 2024] [croco-diff]

  • Alligat0R: Pre-Training through Co-Visibility Segmentation for Relative Camera Pose Regression [arXiv 2025]

Pose Estimation

  • Cameras as Rays: Pose Estimation via Ray Diffusion [ICLR 2024] [RayDiffusion]

  • Cameras as Relative Positional Encoding [arXiv 2025] [prope]

  • Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization [CVPR 2025] [reloc3r]

3D Reconstruction

  • Visual Geometry Grounded Deep Structure From Motion [CVPR 2024] [vggsfm]

  • DUSt3R: Geometric 3D Vision Made Easy [CVPR 2024] [dust3r]

  • Grounding Image Matching in 3D with MASt3R [ECCV 2024] [mast3r]

  • MASt3R-SfM: a Fully-Integrated Solution for Unconstrained Structure-from-Motion [arXiv 2024] [mast3r]

  • 3D Reconstruction with Spatial Memory [3DV 2025] [spann3r]

  • MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds [CVPR 2025] [mv-dust3rp]

  • Continuous 3D Perception Model with Persistent State [CVPR 2025] [cut3r]

  • Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass [CVPR 2025] [fast3r-3d]

  • Light3R-SfM: Towards Feed-forward Structure-from-Motion [arXiv 2025]

  • MUSt3R: Multi-view Network for Stereo 3D Reconstruction [arXiv 2025] [must3r]

  • PE3R: Perception-Efficient 3D Reconstruction [arXiv 2025] [pe3r]

  • VGGT: Visual Geometry Grounded Transformer [CVPR 2025] [vggt]

  • Pow3R: Empowering Unconstrained 3D Reconstruction with Camera and Scene Priors [CVPR 2025]

  • Matrix3D: Large Photogrammetry Model All-in-One [CVPR 2025] [ml-matrix3d]

  • DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion [CVPR 2025] [DiffusionSfM]

  • Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory [arXiv 2025] [Point3R]

  • π³: Scalable Permutation-Equivariant Visual Geometry Learning [arXiv 2025] [Pi3]

  • StreamVGGT: Streaming 4D Visual Geometry Transformer [arXiv 2025] [StreamVGGT]

  • Evict3R: Training-Free Token Eviction for Memory-Bounded Streaming Visual Geometry Transformers [arXiv 2025]

  • Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images [arXiv 2025] [Uni3R]

  • Surf3R: Rapid Surface Reconstruction from Sparse RGB Views in Seconds [arXiv 2025]

  • STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer [arXiv 2025] [STream3R]

  • WinT3R: Window-Based Streaming Reconstruction With Camera Token Pool [arXiv 2025] [WinT3R]

  • SAIL-Recon: Large SfM by Augmenting Scene Regression with Localization [arXiv 2025] [sail-recon]

  • FastVGGT: Training-Free Acceleration of Visual Geometry Transformer [arXiv 2025] [FastVGGT]

  • Faster VGGT with Block-Sparse Global Attention [arXiv 2025] [sparse-vggt]

  • Quantized Visual Geometry Grounded Transformer [arXiv 2025] [QuantVGGT]

  • MapAnything: Universal Feed-Forward Metric 3D Reconstruction [arXiv 2025] [map-anything]

  • TTT3R: 3D Reconstruction as Test-Time Training [arXiv 2025] [TTT3R]

  • WorldMirror: Universal 3D World Reconstruction with Any-Prior Prompting [arXiv 2025] [HunyuanWorld-Mirror]

  • OmniVGGT: Omni-Modality Driven Visual Geometry Grounded Transformer [arXiv 2025] [OmniVGGT]

  • Depth Anything 3: Recovering the Visual Space from Any Views [arXiv 2025] [DA3]

Generation

  • ReconViaGen: Towards Accurate Multi-view 3D Object Reconstruction via Generation [arXiv 2025] [ReconViaGen]

Semantic

  • PanSt3R: Multi-view Consistent Panoptic Segmentation [arXiv 2025]

  • IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction [arXiv 2025]

Depth

  • MoGe: Accurate Monocular Geometry Estimation [CVPR 2025] [MoGe]

  • DA2: Depth Anything in Any Direction [arXiv 2025] [DA2]

  • FastViDAR: Real-Time Omnidirectional Depth Estimation via Alternative Hierarchical Attention [arXiv 2025] [FastVidar]

Dynamic

  • MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion [ICLR 2025] [monst3r]

  • MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos [CVPR 2025] [mega-sam]

  • Driv3R: Learning Dense 4D Reconstruction for Autonomous Driving [arXiv 2024] [Driv3R]

  • Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction [arXiv 2025] [Geo4D]

  • ViPE: Video Pose Engine for Geometric 3D Perception [arXiv 2025] [vipe]

  • VGGT4D: Mining Motion Cues in Visual Geometry Transformers for 4D Scene Reconstruction [arXiv 2025] [vggt4d]

SLAM

  • SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos [CVPR 2025] [SLAM3R]

  • MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors [CVPR 2025] [mast3r-slam]

  • VGGT-SLAM: Dense RGB SLAM Optimized on the SL(4) Manifold [arXiv 2025] [VGGT-SLAM]

  • EC3R-SLAM: Efficient and Consistent Monocular Dense SLAM with Feed-Forward 3D Reconstruction [arXiv 2025] [EC3R-SLAM]

Novel View Synthesis

  • Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs [arXiv 2024] [splatt3r]

  • No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [ICLR 2025] [NoPoSplat]

  • PreF3R: Pose-Free Feed-Forward 3D Gaussian Splatting from Variable-length Image Sequence [arXiv 2024] [PreF3R]

  • SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction [CVPR 2025] [SPARS3R]

  • LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias [ICLR 2025] [LVSM]

  • FlowR: Flowing from Sparse to Dense 3D Reconstructions [arXiv 2025]

  • RayZer: A Self-supervised Large View Synthesis Model [arXiv 2025] [RayZer]

  • AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views [arXiv 2025] [AnySplat]

  • VGGT-X: When VGGT Meets Dense Novel View Synthesis [arXiv 2025] [VGGT-X]

  • YoNoSplat: You Only Need One Model for Feedforward 3D Gaussian Splatting [arXiv 2025] [yonosplat]

  • E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training [arXiv 2025] [E-RayZer]

  • Off The Grid: Detection of Primitives for Feed-Forward 3D Gaussian Splatting [arXiv 2025] [OffTheGrid]

  • Sharp Monocular View Synthesis in Less Than a Second [arXiv 2025] [ml-sharp]

  • EcoSplat: Efficiency-controllable Feed-forward 3D Gaussian Splatting from Multi-view Images [arXiv 2025] [ecosplat-site]

  • From Rays to Projections: Better Inputs for Feed-Forward View Synthesis [arXiv 2026] [pvsm-web]
