@Viren6 Viren6 commented Jan 10, 2026

Switches policy net training from Monty data to distillation from the lc0 BT4 net.
This is done in two stages: first a direct comparison against the master net, then a scale-up of the net size.

Stage 1: Same size, schedule, and number of games as master net:

UHO, 1 node
Results of Patch vs Baseline (1 nodes, 1t, 64MB, UHO_Lichess_4852_v1.epd):
Elo: 61.05 +/- 2.53, nElo: 90.44 +/- 3.67
LOS: 100.00 %, DrawRatio: 36.97 %, PairsRatio: 2.50
Games: 34416, Wins: 11362, Losses: 5376, Draws: 17678, Points: 20201.0 (58.70 %)
Ptnml(0-2): [427, 2676, 6362, 5970, 1773], WL/DD Ratio: 0.41
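
The summary lines above can be reproduced from the Ptnml counts. Below is a minimal Python sketch, assuming the five buckets hold game-pair scores of 0, 0.5, 1, 1.5 and 2, and that nElo is the mean score's offset from 50% divided by the per-game standard deviation and scaled by 800/ln(10). The name `pentanomial_stats` is illustrative and not taken from the testing tool.

```python
import math

def pentanomial_stats(ptnml):
    """Summarise game-pair counts for pair scores 0, 0.5, 1, 1.5, 2.

    Assumes a non-degenerate result (score strictly between 0 and 1,
    at least one losing pair), so no division by zero is handled here.
    """
    pairs = sum(ptnml)
    scores = [0.0, 0.25, 0.5, 0.75, 1.0]  # pair score rescaled to [0, 1]
    mean = sum(s * n for s, n in zip(scores, ptnml)) / pairs
    var = sum(n * (s - mean) ** 2 for s, n in zip(scores, ptnml)) / pairs

    # Logistic Elo from the overall score fraction.
    elo = -400.0 * math.log10(1.0 / mean - 1.0)
    # Normalised Elo: per-game variance is taken as twice the per-pair variance.
    nelo = (mean - 0.5) / math.sqrt(2.0 * var) * 800.0 / math.log(10.0)

    draw_ratio = ptnml[2] / pairs                                # middle bucket share
    pairs_ratio = (ptnml[3] + ptnml[4]) / (ptnml[0] + ptnml[1])  # winning / losing pairs
    return {"elo": elo, "nelo": nelo,
            "draw_ratio": draw_ratio, "pairs_ratio": pairs_ratio}

# Stage 1, UHO, 1 node (counts copied from the Ptnml line above).
stage1_uho = pentanomial_stats([427, 2676, 6362, 5970, 1773])
for name, value in stage1_uho.items():
    print(f"{name}: {value:.4f}")
```

The values should land close to the reported Elo, nElo, DrawRatio and PairsRatio. The WL/DD ratio cannot be recovered from these counts alone, because the middle bucket merges win-loss pairs with double draws.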

DFRC, 1 node
Results of Patch vs Baseline (1 nodes, 1t, 64MB, DFRC_openings.epd):
Elo: 128.77 +/- 3.68, nElo: 177.40 +/- 4.63
LOS: 100.00 %, DrawRatio: 27.16 %, PairsRatio: 5.26
Games: 21600, Wins: 10492, Losses: 2834, Draws: 8274, Points: 14629.0 (67.73 %)
Ptnml(0-2): [178, 1078, 2933, 4130, 2481], WL/DD Ratio: 0.91

Passed STC:
LLR: 2.93 (-2.94,2.94) <0.00,4.00>
Total: 1504 W: 482 L: 315 D: 707
Ptnml(0-2): 15, 128, 325, 243, 41
https://tests.montychess.org/tests/view/69627ee53974b6e428003a08

Passed LTC:
LLR: 2.94 (-2.94,2.94) <1.00,5.00>
Total: 1554 W: 429 L: 275 D: 850
Ptnml(0-2): 6, 136, 360, 248, 27
https://tests.montychess.org/tests/view/69627eee3974b6e428003a0a
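
The LLR bounds (-2.94, 2.94) in both tests are consistent with Wald's SPRT stopping thresholds at 5% error rates. A small sketch, assuming alpha = beta = 0.05 (the PR does not state these values; they are inferred from the bounds):

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald SPRT thresholds for the log-likelihood ratio.

    alpha: assumed false-positive rate, beta: assumed false-negative rate.
    The test fails when LLR drops to the lower bound and passes when it
    reaches the upper bound.
    """
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

lower, upper = sprt_bounds()
print(f"({lower:.2f}, {upper:.2f})")
```

Under that reading, the LTC run stopped exactly at the upper bound (LLR 2.94), while the STC run is reported at 2.93 against the same rounded bound.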

Stage 2: 100M games -> 400M games, L1 16384 -> L1 40960

UHO, 1 node
Results of Patch vs Baseline (1 nodes, 1t, 64MB, UHO_Lichess_4852_v1.epd):
Elo: 68.54 +/- 2.06, nElo: 94.73 +/- 2.78
LOS: 100.00 %, DrawRatio: 34.26 %, PairsRatio: 2.50
Games: 60102, Wins: 22986, Losses: 11281, Draws: 25835, Points: 35903.5 (59.74 %)
Ptnml(0-2): [965, 4673, 10296, 9926, 4191], WL/DD Ratio: 0.83

Speed testing: Comparing the stage 1 and stage 2 nets in this PR:

EPYC 9654 x2:
setoption name Threads value 384
setoption name Hash value 384000
go movetime 60000

Start position:
L1 16384:
info depth 19 seldepth 60 score cp 16 time 60017 nodes 2406951331 nps 40103928 pv d2d4 g8f6 g1f3 b7b6 g2g3 c8b7 c2c4 e7e6 f1g2 f8b4 c1d2 b4d2 d1d2 e8g8 b1c3 d7d6 d2f4 f6h5 f4e3
bestmove d2d4
L1 40960:
info depth 18 seldepth 53 score cp 15 time 60046 nodes 1453206163 nps 24201402 pv d2d4 g8f6 c2c4 e7e6 b1c3 f8b4 d1c2 e8g8 g1f3 c7c5 d4c5 b8a6 g2g3 a6c5 f1g2 c5e4 e1g1 e4c3
bestmove d2d4

Speed diff: -40% nps (L1 40960 vs L1 16384)

Endgame position:
position fen 7k/6p1/6Pp/3B1b1P/5Pn1/B7/4K3/8 w - - 3 70
L1 16384:
info depth 15 seldepth 56 score cp 108 time 60003 nodes 1947063535 nps 32448930 pv a3b2 f5b1 d5g2 b1f5 g2c6 f5e6 c6f3 g4h2 f3g2 e6c4 e2e3 h2g4 e3d2 g4h2 f4f5
bestmove a3b2
L1 40960:
info depth 16 seldepth 50 score cp 123 time 60029 nodes 1789662569 nps 29813008 pv a3c5 g4h2 c5g1 h2g4 g1d4 g4h2 d5g2 h2g4 g2h3 h8g8 e2f3 g4e5 f3g3 e5c6 d4c5 f5b1
bestmove a3c5

Speed diff: -8% nps (L1 40960 vs L1 16384)
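
Both speed figures follow from the nps values in the info lines. A short sketch of the arithmetic, with the L1 16384 net as the baseline (the helper name `speed_diff_percent` is illustrative):

```python
def speed_diff_percent(baseline_nps, candidate_nps):
    """Relative change in nodes per second, as a percentage of the baseline."""
    return (candidate_nps / baseline_nps - 1.0) * 100.0

# nps values copied from the info lines above (L1 16384 first, L1 40960 second).
startpos = speed_diff_percent(40103928, 24201402)
endgame = speed_diff_percent(32448930, 29813008)

print(f"start position: {startpos:.1f}%")
print(f"endgame:        {endgame:.1f}%")
```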

Finally, the L1 40960 net is compared to the original BT4 net:

Results of Patch vs Baseline (1 nodes, 1t - 0t, 64MB - NULL, 8moves_v3.pgn):
Elo: -673.44 +/- 14.80, nElo: -1577.71 +/- 5.56
LOS: 0.00 %, DrawRatio: 0.65 %, PairsRatio: 0.00
Games: 15000, Wins: 43, Losses: 14434, Draws: 523, Points: 304.5 (2.03 %)
Ptnml(0-2): [6944, 505, 49, 2, 0], WL/DD Ratio: 5.12

This result comes at 200,000x fewer operations per inference than BT4. The L1 40960 policy net performs at a 2000 FIDE level statically (this has been verified by other measurements, e.g. against weaker lc0 nets such as T79).

Bench: 1620531

@Viren6 Viren6 changed the title from "New Policy Network: nn-06e27b5ef6e7.network" to "New Policy Network: nn-6e49a41bd7c0.network" Jan 10, 2026
@Viren6 Viren6 merged commit e5938da into master Jan 10, 2026
9 checks passed