Replies: 1 comment
It can now handle any model. For example:

```python
import torch
import torch.nn as nn

class MyFeatureBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.flatten = nn.Flatten()

    def forward(self, x):  # forward added for completeness; the original snippet omitted it
        return self.flatten(self.relu(self.conv(x)))

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.block = MyFeatureBlock(3, 16)
        self.fc = nn.Linear(16 * 224 * 224, 10)

    def forward(self, x):  # forward added for completeness; the original snippet omitted it
        return self.fc(self.block(x))

def init_weights(m):
    # Initialize the weights (and biases) of every layer that has them to 1.0
    if hasattr(m, 'weight') and m.weight is not None:
        nn.init.constant_(m.weight, 1.0)
    if hasattr(m, 'bias') and m.bias is not None:
        nn.init.constant_(m.bias, 1.0)

model = MyModel()
model.apply(init_weights)  # recursively apply the initialization to every submodule
torch.save(model, "custom_model_ones.pt")
```

log
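A quick way to sanity-check such a saved file is to load it back and confirm every parameter is 1.0. This sketch uses a tiny `nn.Linear` stand-in (a hypothetical example, not part of the original snippet) so the check runs fast; the same pattern applies to the full model above:

```python
import torch
import torch.nn as nn

# Build a small module, force all parameters to 1.0, and pickle the whole object.
model = nn.Linear(4, 2)
for p in model.parameters():
    nn.init.constant_(p, 1.0)
torch.save(model, "ones_check.pt")

# torch.load unpickles the full nn.Module. Recent PyTorch versions default to
# weights_only=True, which rejects pickled module objects, so pass it explicitly;
# fall back for older versions that lack the keyword.
try:
    reloaded = torch.load("ones_check.pt", weights_only=False)
except TypeError:
    reloaded = torch.load("ones_check.pt")

all_ones = all(bool(torch.all(p == 1.0)) for p in reloaded.parameters())
```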
I wrote some code to tackle a difficult task: reading a PyTorch model saved as a Python pickle file and translating it into a JavaCPP-PyTorch model. This is a real challenge; in ten years nobody has implemented it, simply because it is so difficult.
This is currently just a proof of concept, and I'm showing the logs. Full support would require many more layer-type mappings to be implemented; so far the translation primarily covers Sequential and Linear.
This feature is worth improving: in the future, Java should be able to load original .pth models directly, which would be very appealing to users.
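For comparison, the established interop route today is a TorchScript export on the Python side, which libtorch (and therefore libtorch-based bindings such as JavaCPP-PyTorch) can already load via `torch::jit::load`; loading pickled .pth models directly, as proposed here, would remove that extra export step. A minimal sketch of the export, using a hypothetical small model:

```python
import torch
import torch.nn as nn

# Script a small Sequential model and save it in TorchScript format,
# which is loadable from C++ (and JavaCPP) without the Python pickle machinery.
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
scripted = torch.jit.script(model)
scripted.save("model_scripted.pt")

# Round-trip on the Python side to show the saved artifact is self-contained.
restored = torch.jit.load("model_scripted.pt")
out = restored(torch.zeros(1, 8))
```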
log