Last updated: December 18, 2025
Authors: Angel A. Barrera-Gomez, Inhwan Jung, Luke Hussung
External Research Collaboration: Oak Ridge National Laboratory (ORNL)
Figure 1: Final model prediction after feature engineering and hyperparameter tuning, showing coherent reconstruction of plant organs (65.58% mIoU, 82% stem recall).
- Abstract
- Repository Purpose
- Contributions
- Getting Started
- Code Structure
- Method Overview
- Problem Setting
- Manual Annotation Protocol
- Geometric Feature Engineering
- Learning Architecture
- Evaluation Protocol
- Qualitative Findings
- Future Research Directions
- Citation
- External Research Collaborators
- Academic Advisors
- Acknowledgments
- Disclaimer
This repository releases research code developed in collaboration with Oak Ridge National Laboratory for studying semantic segmentation of 3D plant point clouds.
Important:
- No raw or processed data is included.
- No trained model checkpoints are provided.
- The repository focuses on methodology, architecture, and evaluation.
- Geometry-based semantic segmentation of plant organs from 3D LiDAR point clouds.
- A structured manual annotation protocol tailored to complex biological structures.
- Integration of local geometric descriptors within dynamic graph convolutional networks.
- Robust learning under severe class imbalance in organ-level segmentation tasks.
- A research-oriented, modular codebase designed to support reproducible experimentation.
Please follow the installation guide. To run the pipeline end-to-end: install the dependencies, place labeled point clouds in `data/train` and `data/val`, train with `python train.py`, evaluate with `python evaluation.py`, and visualize predictions with `python visualization.py`.
The repository is organized as a modular research pipeline:
```
├── data/                  # Directory structure only (no data included)
│   ├── train/
│   ├── val/
│   └── test/
│
├── src/
│   ├── dataset.py         # ETL and geometric feature computation
│   ├── model.py           # Dynamic Edge CNN architecture
│   └── inference.py       # Inference and post-processing
│
├── validations/           # Data integrity and sanity checks
│   ├── check_data.py
│   ├── check_labels.py
│   └── count_nans.py
│
├── train.py               # Training loop
├── evaluation.py          # Metric computation and bootstrapping
├── visualization.py       # 3D visualization utilities
└── README.md
```
All directories related to raw data, predictions, and model checkpoints are intentionally excluded from this repository to comply with data confidentiality and intellectual property constraints associated with Oak Ridge National Laboratory (ORNL).
Given a 3D LiDAR point cloud acquired in a controlled phenotyping environment, the goal is to assign each point a semantic label corresponding to biologically meaningful plant structures:
- Stem
- Leaf
- Support Stake
- Background
The task is challenging due to:
- Severe class imbalance.
- Structural similarity between stems and stakes.
- Occlusion and sparse sampling.
- Absence of RGB or spectral information.
High-quality ground truth is critical for supervised semantic segmentation of 3D point clouds, particularly in plant phenotyping where geometric ambiguity, occlusion, and class imbalance are prevalent. To ensure consistent, accurate, and reproducible labels, a structured manual annotation protocol was developed using CloudCompare.
The protocol defines a rule-based workflow for point-wise segmentation and labeling of LiDAR point clouds into four semantic classes: stem, leaf, stake, and background. It enforces strict completeness, naming conventions, boundary rules, and class assignment guidelines, ensuring that every point in the original scan is assigned a biologically meaningful label.
This annotation strategy was essential for:
- Producing reliable supervision signals for deep learning.
- Reducing label noise in geometrically ambiguous regions.
- Enabling consistent evaluation across samples.
- Supporting reproducibility and future dataset extensions.
A total of 30 fully annotated 3D LiDAR point clouds were generated and used for supervised training and evaluation.
The full annotation procedure, including setup instructions, segmentation steps, labeling rules, and export formats, is documented in detail here: Manual Annotation Protocol
To overcome the limitations of raw XYZ coordinates, each point is augmented with local geometric descriptors computed within a fixed-radius neighborhood:
- Linearity
- Planarity
- Sphericity
- Relative Height
These features encode local shape properties critical for organ discrimination.
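As a rough illustration, these descriptors are commonly derived from the eigenvalues of the local covariance matrix (with λ1 ≥ λ2 ≥ λ3). The sketch below is a minimal NumPy version under that assumption; the function name, radius default, and brute-force neighbor search are illustrative only, and the actual implementation lives in `src/dataset.py`.

```python
import numpy as np

def geometric_features(points, radius=0.05):
    """Per-point eigenvalue features within a fixed-radius neighborhood.

    points: (N, 3) array of XYZ coordinates.
    Returns an (N, 4) array: linearity, planarity, sphericity, relative height.
    """
    n = len(points)
    feats = np.zeros((n, 4))
    z_min = points[:, 2].min()
    for i in range(n):
        # Fixed-radius neighborhood (brute force; a KD-tree scales better).
        d = np.linalg.norm(points - points[i], axis=1)
        nbrs = points[d <= radius]
        if len(nbrs) >= 3:
            cov = np.cov(nbrs.T)
            lam = np.sort(np.linalg.eigvalsh(cov))[::-1]  # λ1 ≥ λ2 ≥ λ3
            lam = np.maximum(lam, 1e-12)
            feats[i, 0] = (lam[0] - lam[1]) / lam[0]  # linearity
            feats[i, 1] = (lam[1] - lam[2]) / lam[0]  # planarity
            feats[i, 2] = lam[2] / lam[0]             # sphericity
        feats[i, 3] = points[i, 2] - z_min            # relative height
    return feats
```

Intuitively, stems score high on linearity, leaf surfaces on planarity, and noisy or volumetric regions on sphericity, while relative height helps separate ground-level background from the canopy.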
The core model is a Dynamic Edge Convolutional Neural Network (DECNN) that:
- Dynamically constructs neighborhood graphs per layer.
- Learns edge features capturing local geometry.
- Operates directly on unstructured point clouds.
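A single EdgeConv-style step can be sketched as follows. This is a NumPy illustration, not the trained model (see `src/model.py`): the k-NN graph is rebuilt from the current features at every layer, which is what makes the graph "dynamic", and each edge feature concatenates the center point with its relative offset before a shared MLP and max aggregation. The function name and the single-matrix MLP are simplifying assumptions.

```python
import numpy as np

def edge_conv(x, weight, k=4):
    """One EdgeConv-style layer (NumPy sketch).

    x: (N, F) point features; weight: (2F, F_out) shared MLP weights.
    """
    n = len(x)
    # Pairwise distances in feature space -> k nearest neighbors per point.
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    knn = np.argsort(d, axis=1)[:, :k]
    out = np.empty((n, weight.shape[1]))
    for i in range(n):
        # Edge feature: concatenate the center with the relative offsets.
        e = np.concatenate([np.tile(x[i], (k, 1)), x[knn[i]] - x[i]], axis=1)
        h = np.maximum(e @ weight, 0.0)        # shared MLP + ReLU per edge
        out[i] = h.max(axis=0)                 # max aggregation over edges
    return out
```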
To address extreme class imbalance, training uses a composite loss combining:
- Weighted Cross-Entropy
- Dice Loss
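A minimal NumPy sketch of such a composite loss is shown below. The mixing weight `alpha`, the epsilon value, and the function name are assumptions for illustration; the training script may combine the two terms differently.

```python
import numpy as np

def composite_loss(probs, labels, class_weights, alpha=0.5, eps=1e-7):
    """Weighted cross-entropy plus soft Dice loss (NumPy sketch).

    probs: (N, C) softmax probabilities; labels: (N,) integer class ids.
    class_weights: (C,) per-class weights (larger for rare classes).
    """
    n, c = probs.shape
    onehot = np.eye(c)[labels]                       # (N, C)
    # Weighted cross-entropy: rare classes (e.g. stem) get larger weights.
    w = class_weights[labels]
    ce = -(w * np.log(probs[np.arange(n), labels] + eps)).mean()
    # Soft Dice, averaged over classes: overlap-based, imbalance-robust.
    inter = (probs * onehot).sum(axis=0)
    dice = (2 * inter + eps) / (probs.sum(axis=0) + onehot.sum(axis=0) + eps)
    dice_loss = 1.0 - dice.mean()
    return alpha * ce + (1 - alpha) * dice_loss
```

The cross-entropy term provides stable point-wise gradients, while the Dice term directly optimizes region overlap, which keeps rare classes such as stems from being drowned out by the dominant leaf class.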
Model performance is evaluated using:
- Intersection over Union (IoU).
- Precision and Recall (per class).
- Sample-averaged metrics.
- Bootstrapped confidence intervals.
Qualitative evaluation is performed via 3D visualization of predicted segmentations.
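The quantitative metrics above can be sketched as follows: per-class IoU for a single cloud, plus a percentile bootstrap over per-sample scores to obtain confidence intervals. The function names and the bootstrap settings are illustrative; see `evaluation.py` for the actual implementation.

```python
import numpy as np

def per_class_iou(pred, true, num_classes):
    """Per-class IoU for one point cloud; NaN where a class is absent."""
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        inter = np.sum((pred == c) & (true == c))
        union = np.sum((pred == c) | (true == c))
        if union > 0:
            ious[c] = inter / union
    return ious

def bootstrap_ci(sample_scores, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the mean of per-sample scores."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(sample_scores, dtype=float)
    means = np.array([
        rng.choice(scores, size=len(scores), replace=True).mean()
        for _ in range(n_boot)
    ])
    lo, hi = np.percentile(
        means, [(1 - level) / 2 * 100, (1 + level) / 2 * 100])
    return lo, hi
```

Averaging per-sample metrics (rather than pooling all points) keeps small or sparsely sampled plants from being dominated by large ones, and the bootstrap quantifies how much the mean depends on the particular 30-cloud sample.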
Under the described experimental setup:
- Strong performance is observed on the dominant Leaf class.
- High recall is achieved for the biologically critical Stem class.
- Qualitative results show coherent reconstruction of plant structure.
Limitations include stem–stake ambiguity and boundary artifacts due to resolution constraints.
For a comprehensive discussion of experimental results, quantitative metrics, and additional analyses, please refer to the full exit report available here: Exit Report
Potential extensions of this work include:
- Incorporation of RGB or multispectral information to reduce geometric ambiguity
- Evaluation of alternative point cloud architectures (e.g., PointNet++, KPConv, transformer-based models)
- Temporal modeling across plant growth stages
- Scalable experimentation via containerization and MLOps workflows (e.g., Docker, experiment tracking, automated evaluation pipelines)
If you find this work useful in your research, please consider citing:
```bibtex
@unpublished{barrera2025plantseg,
  title  = {Plant Organ Segmentation from 3D LiDAR Point Clouds via Geometric Deep Learning},
  author = {Barrera-Gomez, Angel A. and Jung, Inhwan and Hussung, Luke},
  year   = {2025}
}
```
Oak Ridge National Laboratory (ORNL) | Biosciences Division
Dr. John Lagergren
R&D Associate Staff Member
lagergrenjr@ornl.gov
Dr. Larry M. York
Senior Staff Scientist
yorklm@ornl.gov
Anand Seethepalli
Biosciences Computer Vision Developer
seethepallia@ornl.gov
East Tennessee State University (ETSU) | Department of Mathematics & Statistics
Dr. Jeff R. Knisley
Professor
knisley@etsu.edu
Dr. Robert M. Price
Professor
pricer@etsu.edu
Dr. Michele Joyner
Professor
joynerm@etsu.edu
This research used resources of the Advanced Plant Phenotyping Laboratory and the Center for Bioenergy Innovation (CBI), which is a U.S. Department of Energy Bioenergy Research Center supported by the Office of Biological and Environmental Research in the DOE Office of Science. Oak Ridge National Laboratory is managed by UT-Battelle, LLC for the U.S. Department of Energy under Contract Number DE-AC05-00OR22725.
We sincerely thank Dr. John Lagergren, Dr. Larry M. York, and Anand Seethepalli (Oak Ridge National Laboratory, Biosciences Division) for providing access to experimental data, domain expertise, and valuable feedback throughout the project. We also thank Dr. Jeff R. Knisley, Dr. Robert M. Price, and Dr. Michele Joyner (Department of Mathematics & Statistics, East Tennessee State University) for their academic guidance and mentorship, and for making this collaboration possible by enabling meaningful real-world research and development experience in data science.
The views and conclusions expressed in this repository are those of the authors and do not necessarily represent the views of Oak Ridge National Laboratory or the U.S. Department of Energy. The code is provided for academic and research purposes only.
