SLEAP 1.5: Tracking AttributeError during prediction + tensor size mismatch when training without “Use Existing Training Config” #2438
I found one workaround: import my videos into a new project in SLEAP 1.5, then run inference with the .json training config from the old 1.4 project. From there, continuing training and prediction on the new PC with SLEAP 1.5 seems to work. It would still be great to solve this problem so that I can run the existing .slp files, which already contain all the labels, without repeating any work.
Summary
On a fresh SLEAP 1.5 install (Windows, RTX 5070 Ti + AMD Ryzen 9 9900X), I can train and get good predictions on random frames right after training. But:
• Prediction with tracking fails with AttributeError: 'SingleInstancePredictor' object has no attribute 'tracker' and no __dict__ for setting new attributes.
• Training without “Use Existing Training Config” fails with RuntimeError: The size of tensor a (11) must match the size of tensor b (22) at non-singleton dimension 1.
There’s also a logging UnicodeEncodeError on Windows cp1252 when printing the ✓ checkmark from Loguru.
⸻
Environment
• OS: Windows (PDT timezone in logs)
• GPU: NVIDIA GeForce RTX 5070 Ti
• CPU: AMD Ryzen 9 9900X (12-core)
• SLEAP: 1.5 (CLI sleap-nn-*)
• sleap_nn version (from logs): 0.0.2
• Python: 3.13.9 (per .../uv/python/cpython-3.13.9-windows-x86_64-none/Lib/...)
• Torch / Lightning: Torch 2.x inferred from TF32 warnings; Lightning present (see logs)
• Install method: uv tools environment on Windows
• Project provenance: Labels/models originally trained with SLEAP 1.4 on a different PC
⸻
What I did
A. Prediction with tracking on a 1.4-trained project
• Using an existing 1.4 model on a 1.5 install to run predictions across the full video with tracking.
Command:
sleap-nn-track
--data_path "C:/Users/SylwestrakGPU/Documents/SLEAP/Leah_AcuteNlx_local/labels.v010_test.slp"
--video_index 0
--frames 0,-418014
--model_paths "C:\Users\SylwestrakGPU\Documents\SLEAP\Leah_AcuteNlx_local\models\251007_211701.single_instance.n=2875"
-o "C:/Users/SylwestrakGPU/Documents/SLEAP/Leah_AcuteNlx_local/predictions/labels.v010_test.slp.251023_123453.predictions.slp"
--batch_size 4
--tracking
--track_matching_method hungarian
--tracking_window_size 10
--scoring_reduction robust_quantile
--features bboxes
--scoring_method iou
Expected: End-to-end predictions + tracks saved to the output .slp.
Actual: Fails with AttributeError when assigning tracker to SingleInstancePredictor. Also prints a Loguru ✓ and hits a Windows cp1252 UnicodeEncodeError.
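The Unicode part can be reproduced without SLEAP. The following is a minimal sketch, assuming only that the console stream is encoded as cp1252. It shows that the check mark has no cp1252 encoding, and that a stream opened with errors="replace" degrades to "?" instead of raising. The message text "weights verified" is made up for the example.

```python
import io

CHECK = "\u2713"  # the check mark character that appears in the log

# cp1252 has no code point for U+2713, so strict encoding raises.
try:
    CHECK.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError as err:
    cp1252_ok = False
    print(f"cp1252 cannot encode the check mark: {err.reason}")

# UTF-8 can encode it (three bytes).
utf8_bytes = CHECK.encode("utf-8")

# A cp1252 text stream with errors="replace" substitutes "?" instead of raising.
buffer = io.BytesIO()
stream = io.TextIOWrapper(buffer, encoding="cp1252", errors="replace", newline="\n")
stream.write(f"{CHECK} weights verified\n")
stream.flush()
replaced = buffer.getvalue()
print(replaced)
```

In a script you control, sys.stdout.reconfigure(encoding="utf-8") has a similar effect for the real console stream. For a CLI tool, an environment variable such as PYTHONIOENCODING=utf-8 is the less invasive route.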
⸻
B. Training without “Use Existing Training Config”
• Starting a fresh single-instance training run from the same labels file.
Observed: Immediate loss computation error: predicted confmaps have 11 channels, but target has 22.
Expected: Matching channel dimensions; training proceeds.
Actual:
RuntimeError: The size of tensor a (11) must match the size of tensor b (22) at non-singleton dimension 1
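The error can be read off the two shapes in the UserWarning alone. The sketch below is plain Python, not PyTorch: first_broadcast_conflict is a hypothetical helper written for this post that applies the usual broadcasting rule (sizes must be equal, or one of them must be 1) to equal-rank shapes. The part names are copied from the config excerpt.

```python
# Shapes copied from the UserWarning in the training log.
pred_shape = (4, 11, 192, 192)    # model output (the "input" to the loss)
target_shape = (4, 22, 192, 192)  # target produced by the data pipeline

# Part names copied from head_configs.single_instance.confmaps.part_names.
part_names = [
    "nose", "l_ear", "r_ear", "neck", "l_forepaw", "r_forepaw",
    "l_hindpaw", "r_hindpaw", "tailbase", "mid_tail", "tail_tip",
]


def first_broadcast_conflict(a, b):
    """Return the first dimension index where two equal-rank shapes cannot be
    broadcast together (sizes differ and neither is 1), or None if compatible."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles shapes of equal rank")
    for dim, (x, y) in enumerate(zip(a, b)):
        if x != y and x != 1 and y != 1:
            return dim
    return None


conflict_dim = first_broadcast_conflict(pred_shape, target_shape)
channel_ratio = target_shape[1] / pred_shape[1]

print(f"conflict at dimension {conflict_dim}")
print(f"head channels: {pred_shape[1]}, part_names: {len(part_names)}")
print(f"target/pred channel ratio: {channel_ratio}")
```

The head is consistent with the config (11 channels for 11 parts), and the target has exactly twice as many channels. That points at the target side of the pipeline rather than the head definition, although it does not say what the extra 11 channels are.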
⸻
Full logs (trimmed to errors)
Prediction error
Using already trained model for single_instance: C:/Users/SylwestrakGPU/Documents/SLEAP/Leah_AcuteNlx_local/models/251007_211701.single_instance.n=2875/training_config.json
Command line call:
sleap-nn-track --data_path C:/Users/SylwestrakGPU/Documents/SLEAP/Leah_AcuteNlx_local/labels.v010_test.slp --video_index 0 --frames 0,-418014 --model_paths C:\Users\SylwestrakGPU\Documents\SLEAP\Leah_AcuteNlx_local\models\251007_211701.single_instance.n=2875 -o C:/Users/SylwestrakGPU/Documents/SLEAP/Leah_AcuteNlx_local\predictions\labels.v010_test.slp.251023_123453.predictions.slp --batch_size 4 --tracking --track_matching_method hungarian --tracking_window_size 10 --scoring_reduction robust_quantile --features bboxes --scoring_method iou
--- Logging error in Loguru Handler #1 ---
... (cp1252 UnicodeEncodeError on '✓') ...
--- End of logging error ---
AttributeError: 'SingleInstancePredictor' object has no attribute 'tracker' and no __dict__ for setting new attributes
INFO: ... legacy model weights loaded and verified ...
Process return code: 1
Training error
Start training single_instance...
['sleap-nn-train', '--config-name', '251023_124155_config', '--config-dir', 'C:\Users\SYLWES~1\AppData\Local\Temp\tmp5r6zfx3_']
... (pydantic warnings; environment + model summary) ...
Sanity Checking DataLoader 0: 100%|...
Epoch 0: 0%|...
UserWarning: Using a target size (torch.Size([4, 22, 192, 192])) that is different to the input size (torch.Size([4, 11, 192, 192])) ...
RuntimeError: The size of tensor a (11) must match the size of tensor b (22) at non-singleton dimension 1
Run Path: C:/Users/SylwestrakGPU/Documents/SLEAP/Leah_AcuteNlx_local/models/251023_124155.single_instance.n=2342
Config excerpt used for training
data_config:
  train_labels_path: C:/Users/SylwestrakGPU/Documents/SLEAP/Leah_AcuteNlx_local/labels.v010_test.slp
  validation_fraction: 0.1
  provider: LabelsReader
  user_instances_only: true
  data_pipeline_fw: torch_dataset
  preprocessing:
    ensure_rgb: false
    ensure_grayscale: false
    max_height: 740
    max_width: 740
    scale: 0.5
  ...
model_config:
  backbone_config:
    unet:
      in_channels: 1
      filters: 16
      output_stride: 2
  head_configs:
    single_instance:
      confmaps:
        part_names: [nose, l_ear, r_ear, neck, l_forepaw, r_forepaw, l_hindpaw, r_hindpaw, tailbase, mid_tail, tail_tip]
        output_stride: 2
trainer_config:
  train_data_loader: { batch_size: 4, shuffle: true, num_workers: 0 }
  val_data_loader: { batch_size: 4, shuffle: false, num_workers: 0 }
  run_name: 251023_124155.single_instance.n=2342
  ...
sleap_nn_version: 0.0.2
⸻
Notes / Clues
• Legacy weight load succeeds for the 1.4 model (34/34 mapped).
• Tracking failure: error occurs when Tracker.from_config(...) is assigned into SingleInstancePredictor.tracker. The error suggests the predictor object doesn’t allow new attributes (likely a frozen dataclass / Pydantic model with extra="forbid" or slots), so assigning predictor.tracker = ... fails.
• Unicode logging: printing a ✓ to a cp1252 console raises UnicodeEncodeError on Windows. This is noisy but appears to be non-fatal; it is reported just before the real exception.
• Channel mismatch (11 vs 22): the head outputs 11 channels (matches 11 part_names), while the target tensor is 22. Possibly the dataset pipeline is producing two targets per part (e.g., confmaps + something else, or 2 instances), while the configured head only emits confmaps. There are no skeletons listed in the config; training is single_instance.
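To illustrate the mechanism suspected in the tracking note, here is a minimal sketch of how a slotted class rejects an undeclared attribute. SlottedPredictor and PredictorWithTracker are hypothetical stand-ins written for this post; they are not the real sleap_nn classes, and whether SingleInstancePredictor is actually slotted is an assumption.

```python
class SlottedPredictor:
    """Hypothetical stand-in for a predictor that declares no 'tracker' slot."""

    __slots__ = ("model_path",)

    def __init__(self, model_path):
        self.model_path = model_path


class PredictorWithTracker:
    """Same idea, but 'tracker' is declared, so assignment is allowed."""

    __slots__ = ("model_path", "tracker")

    def __init__(self, model_path, tracker=None):
        self.model_path = model_path
        self.tracker = tracker


predictor = SlottedPredictor("models/example")
try:
    # No 'tracker' slot and no instance __dict__, so this cannot succeed.
    predictor.tracker = object()
    assignment_failed = False
except AttributeError as err:
    assignment_failed = True
    print(f"AttributeError: {err}")

# Declaring the attribute up front makes the same assignment legal.
ok = PredictorWithTracker("models/example")
ok.tracker = object()
print("tracker assigned:", ok.tracker is not None)
```

If the real predictor behaves like the first class, the fix would be on the library side (declaring a tracker field on the single-instance predictor), not something a CLI flag could work around.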
⸻
Repro steps
⸻
What would help
• Guidance on:
  • Whether SingleInstancePredictor in SLEAP 1.5/sleap_nn 0.0.2 is intentionally immutable (and thus needs tracker passed/constructed differently), or if this is a regression.
  • Correct CLI/API usage for tracking under the 1.5 PyTorch rewrite (e.g., if tracker must be specified in a config section instead of setting predictor.tracker).
  • Why the target has 22 channels for a single-instance confmaps head with 11 parts. Is the pipeline outputting additional targets (e.g., per-part offsets or duplicate instances) that require a different head configuration?
⸻
Workarounds tried / Suggested quick checks
• ✅ Legacy weights load fine → model architecture likely mapped correctly.
• 🔎 Prediction: Try running without tracking:
sleap-nn-track ... --batch_size 4 # omit --tracking and related flags
If this works, the failure is isolated to tracker assignment in 1.5.
• 🔎 Training: Switch to a head that matches the produced targets. For example, if the pipeline is emitting both confmaps and centroids or dual outputs, confirm whether the single-instance head should include additional heads or settings. (Documentation/example configs for 1.5 would help verify the expected target spec for torch_dataset.)
• 🔎 Confirm whether multiple labeled instances per frame exist in the data while training a single_instance head; if so, should this double the channels or should they be reduced/combined?
• 🪟 Unicode logging (minor): Set the console/codepage to UTF-8 (chcp 65001) or set PYTHONIOENCODING=utf-8, or suppress the checkmark in the logger to avoid cp1252 encoding errors on Windows.
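The multiple-instances check above could be scripted along these lines. This is a sketch under stated assumptions: the helper names are hypothetical, the input is just a list with the number of user-labeled instances in each labeled frame, and the example counts are made up rather than taken from my project.

```python
from collections import Counter


def summarize_instance_counts(counts):
    """Tally how many labeled frames contain 0, 1, 2, ... instances."""
    return Counter(counts)


def frames_exceeding(counts, max_instances=1):
    """Return indices of labeled frames with more than max_instances instances."""
    return [i for i, n in enumerate(counts) if n > max_instances]


# Made-up example: instances per labeled frame.
example_counts = [1, 1, 2, 1, 2, 0]

summary = summarize_instance_counts(example_counts)
offenders = frames_exceeding(example_counts)

print("frames per instance count:", dict(summary))
print("frames with more than one instance:", offenders)
```

For a real project, the counts would have to come from the .slp file, for example by loading it with sleap_io and taking the number of user instances per labeled frame. I have not verified the exact attribute names against the installed version, so treat that part as an assumption. If any frame holds two instances, that would be consistent with a 22-channel target for an 11-part skeleton, but it would not prove it.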
⸻
Attachments
• Full CLI commands (see above)
• Full logs (see details sections)
• Training config excerpt (see details)
⸻
Expected behavior
• Tracking should initialize without mutating a frozen predictor or should use a supported path to configure a tracker.
• Fresh single-instance training should produce compatible target tensors (11 channels here) or the config generator should emit a head that matches the label pipeline’s targets.
⸻
Actual behavior
• sleap-nn-track crashes on predictor.tracker = ....
• Training crashes with RuntimeError due to 11 vs 22 channel mismatch.
⸻
Thanks
Happy to test patches or provide additional diagnostics (full configs, .slp metadata, etc.).
⸻
TL;DR (1-paragraph for maintainers)
Upgrading to SLEAP 1.5 (PyTorch rewrite, sleap_nn 0.0.2) on Windows, I can train and get immediate sample predictions, but batch prediction with tracking fails because SingleInstancePredictor appears immutable and rejects predictor.tracker = .... Starting a fresh training run (no existing config) fails immediately with a channel mismatch (11 predicted vs 22 target) in MSE loss. Legacy weights map fine, so the model loads; problems look isolated to (a) tracking configuration/assignment in 1.5 and (b) a mismatch between the dataset target spec and the configured single-instance confmaps head. There’s also a minor cp1252 UnicodeEncodeError from printing a ✓ in Loguru on Windows.