[Feature] DataLoadingPrimer.repeat #2822

Merged: 4 commits, Mar 11, 2025

[ghstack-poisoned]

pytorch-bot bot commented Mar 3, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2822

Note: Links to docs will display an error until the docs builds have been completed.

❌ 8 New Failures, 1 Unrelated Failure

As of commit 239f2d1 with merge base 6e40548:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label on Mar 3, 2025. (Label description: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
vmoens added a commit that referenced this pull request Mar 3, 2025
ghstack-source-id: df7ee5caf2850303068d073e0c7cf09d8941c5d3
Pull Request resolved: #2822

github-actions bot commented Mar 3, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}3$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.6045s 0.5275s 1.8957 Ops/s 1.8784 Ops/s $\color{#35bf28}+0.92\%$
test_transformed 1.1234s 1.0449s 0.9570 Ops/s 0.9681 Ops/s $\color{#d91a1a}-1.14\%$
test_serial 1.6364s 1.5483s 0.6459 Ops/s 0.6458 Ops/s $+0.01\%$
test_parallel 1.4326s 1.3262s 0.7540 Ops/s 0.7504 Ops/s $\color{#35bf28}+0.48\%$
test_step_mdp_speed[True-True-True-True-True] 0.2073ms 29.8384μs 33.5138 KOps/s 33.1790 KOps/s $\color{#35bf28}+1.01\%$
test_step_mdp_speed[True-True-True-True-False] 52.7080μs 17.8256μs 56.0991 KOps/s 56.0415 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[True-True-True-False-True] 0.5889ms 16.9990μs 58.8268 KOps/s 57.5725 KOps/s $\color{#35bf28}+2.18\%$
test_step_mdp_speed[True-True-True-False-False] 42.9200μs 9.9811μs 100.1891 KOps/s 99.7515 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[True-True-False-True-True] 84.9490μs 32.0068μs 31.2433 KOps/s 30.9526 KOps/s $\color{#35bf28}+0.94\%$
test_step_mdp_speed[True-True-False-True-False] 46.7080μs 19.6569μs 50.8728 KOps/s 50.9533 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[True-True-False-False-True] 68.0070μs 18.8766μs 52.9757 KOps/s 52.5478 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-True-False-False-False] 34.4450μs 11.8792μs 84.1805 KOps/s 84.8481 KOps/s $\color{#d91a1a}-0.79\%$
test_step_mdp_speed[True-False-True-True-True] 84.6190μs 34.2046μs 29.2358 KOps/s 29.2770 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[True-False-True-True-False] 71.0830μs 21.6150μs 46.2642 KOps/s 46.5131 KOps/s $\color{#d91a1a}-0.54\%$
test_step_mdp_speed[True-False-True-False-True] 55.8240μs 18.7362μs 53.3725 KOps/s 52.4806 KOps/s $\color{#35bf28}+1.70\%$
test_step_mdp_speed[True-False-True-False-False] 36.2580μs 11.8621μs 84.3020 KOps/s 84.8603 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[True-False-False-True-True] 87.8750μs 35.4305μs 28.2243 KOps/s 28.1326 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[True-False-False-True-False] 74.0400μs 23.2759μs 42.9628 KOps/s 43.1024 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-False-False-False-True] 49.1730μs 20.4518μs 48.8956 KOps/s 48.0326 KOps/s $\color{#35bf28}+1.80\%$
test_step_mdp_speed[True-False-False-False-False] 65.9540μs 13.6761μs 73.1202 KOps/s 73.5761 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[False-True-True-True-True] 85.8010μs 33.7552μs 29.6251 KOps/s 29.2392 KOps/s $\color{#35bf28}+1.32\%$
test_step_mdp_speed[False-True-True-True-False] 93.1930μs 21.4877μs 46.5383 KOps/s 46.0463 KOps/s $\color{#35bf28}+1.07\%$
test_step_mdp_speed[False-True-True-False-True] 73.8500μs 21.3126μs 46.9205 KOps/s 45.4093 KOps/s $\color{#35bf28}+3.33\%$
test_step_mdp_speed[False-True-True-False-False] 38.2020μs 13.2371μs 75.5451 KOps/s 75.7070 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[False-True-False-True-True] 90.7300μs 35.1466μs 28.4522 KOps/s 27.7582 KOps/s $\color{#35bf28}+2.50\%$
test_step_mdp_speed[False-True-False-True-False] 2.2827ms 23.2323μs 43.0434 KOps/s 42.6558 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[False-True-False-False-True] 78.1370μs 22.9259μs 43.6189 KOps/s 42.6682 KOps/s $\color{#35bf28}+2.23\%$
test_step_mdp_speed[False-True-False-False-False] 72.8770μs 14.8696μs 67.2515 KOps/s 67.1532 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[False-False-True-True-True] 0.1023ms 36.9043μs 27.0971 KOps/s 26.8739 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[False-False-True-True-False] 62.7380μs 25.2316μs 39.6328 KOps/s 40.0003 KOps/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[False-False-True-False-True] 56.7270μs 23.1072μs 43.2765 KOps/s 42.8093 KOps/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[False-False-True-False-False] 42.4590μs 14.9960μs 66.6846 KOps/s 66.2045 KOps/s $\color{#35bf28}+0.73\%$
test_step_mdp_speed[False-False-False-True-True] 89.4380μs 38.6543μs 25.8703 KOps/s 25.3792 KOps/s $\color{#35bf28}+1.93\%$
test_step_mdp_speed[False-False-False-True-False] 66.3140μs 26.6747μs 37.4887 KOps/s 36.8481 KOps/s $\color{#35bf28}+1.74\%$
test_step_mdp_speed[False-False-False-False-True] 83.2270μs 24.5452μs 40.7411 KOps/s 39.7913 KOps/s $\color{#35bf28}+2.39\%$
test_step_mdp_speed[False-False-False-False-False] 59.8010μs 16.4462μs 60.8043 KOps/s 59.8933 KOps/s $\color{#35bf28}+1.52\%$
test_values[generalized_advantage_estimate-True-True] 11.3580ms 9.5250ms 104.9869 Ops/s 101.3976 Ops/s $\color{#35bf28}+3.54\%$
test_values[vec_generalized_advantage_estimate-True-True] 36.0134ms 26.3701ms 37.9217 Ops/s 37.2667 Ops/s $\color{#35bf28}+1.76\%$
test_values[td0_return_estimate-False-False] 0.2074ms 0.1828ms 5.4693 KOps/s 5.4853 KOps/s $\color{#d91a1a}-0.29\%$
test_values[td1_return_estimate-False-False] 27.9000ms 23.7733ms 42.0640 Ops/s 40.7672 Ops/s $\color{#35bf28}+3.18\%$
test_values[vec_td1_return_estimate-False-False] 28.7690ms 26.0964ms 38.3195 Ops/s 38.3088 Ops/s $\color{#35bf28}+0.03\%$
test_values[td_lambda_return_estimate-True-False] 36.8747ms 34.2903ms 29.1627 Ops/s 28.3770 Ops/s $\color{#35bf28}+2.77\%$
test_values[vec_td_lambda_return_estimate-True-False] 31.3173ms 26.3159ms 37.9998 Ops/s 38.4306 Ops/s $\color{#d91a1a}-1.12\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 11.2688ms 8.3102ms 120.3336 Ops/s 115.8617 Ops/s $\color{#35bf28}+3.86\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2451ms 1.9626ms 509.5263 Ops/s 503.7684 Ops/s $\color{#35bf28}+1.14\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5374ms 0.3650ms 2.7399 KOps/s 2.6852 KOps/s $\color{#35bf28}+2.03\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 48.3213ms 46.7387ms 21.3956 Ops/s 21.4343 Ops/s $\color{#d91a1a}-0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.4909ms 3.4676ms 288.3799 Ops/s 289.6277 Ops/s $\color{#d91a1a}-0.43\%$
test_dqn_speed[False-None] 2.0092ms 1.3956ms 716.5435 Ops/s 697.0637 Ops/s $\color{#35bf28}+2.79\%$
test_dqn_speed[False-backward] 1.9371ms 1.8729ms 533.9210 Ops/s 520.3137 Ops/s $\color{#35bf28}+2.62\%$
test_dqn_speed[True-None] 0.7432ms 0.5500ms 1.8181 KOps/s 1.7533 KOps/s $\color{#35bf28}+3.70\%$
test_dqn_speed[True-backward] 1.0445ms 0.9755ms 1.0251 KOps/s 799.4572 Ops/s $\textbf{\color{#35bf28}+28.23\%}$
test_dqn_speed[reduce-overhead-None] 0.8135ms 0.5562ms 1.7978 KOps/s 1.7597 KOps/s $\color{#35bf28}+2.16\%$
test_dqn_speed[reduce-overhead-backward] 1.0390ms 0.9668ms 1.0343 KOps/s 1.0036 KOps/s $\color{#35bf28}+3.06\%$
test_ddpg_speed[False-None] 3.6442ms 2.9124ms 343.3538 Ops/s 341.5578 Ops/s $\color{#35bf28}+0.53\%$
test_ddpg_speed[False-backward] 4.0834ms 3.9840ms 251.0033 Ops/s 245.4392 Ops/s $\color{#35bf28}+2.27\%$
test_ddpg_speed[True-None] 1.7657ms 1.4301ms 699.2712 Ops/s 683.6942 Ops/s $\color{#35bf28}+2.28\%$
test_ddpg_speed[True-backward] 2.3911ms 2.3277ms 429.6121 Ops/s 425.0120 Ops/s $\color{#35bf28}+1.08\%$
test_ddpg_speed[reduce-overhead-None] 2.0120ms 1.4325ms 698.0903 Ops/s 684.4890 Ops/s $\color{#35bf28}+1.99\%$
test_ddpg_speed[reduce-overhead-backward] 2.4769ms 2.3117ms 432.5803 Ops/s 420.8404 Ops/s $\color{#35bf28}+2.79\%$
test_sac_speed[False-None] 8.2945ms 7.8666ms 127.1196 Ops/s 122.7474 Ops/s $\color{#35bf28}+3.56\%$
test_sac_speed[False-backward] 11.0477ms 10.7161ms 93.3177 Ops/s 92.4078 Ops/s $\color{#35bf28}+0.98\%$
test_sac_speed[True-None] 3.6961ms 2.6009ms 384.4878 Ops/s 375.3263 Ops/s $\color{#35bf28}+2.44\%$
test_sac_speed[True-backward] 4.2632ms 4.1939ms 238.4407 Ops/s 225.4770 Ops/s $\textbf{\color{#35bf28}+5.75\%}$
test_sac_speed[reduce-overhead-None] 3.0173ms 2.5537ms 391.5879 Ops/s 363.1457 Ops/s $\textbf{\color{#35bf28}+7.83\%}$
test_sac_speed[reduce-overhead-backward] 4.4004ms 4.2438ms 235.6383 Ops/s 230.6681 Ops/s $\color{#35bf28}+2.15\%$
test_redq_speed[False-None] 19.7371ms 13.2419ms 75.5180 Ops/s 72.1528 Ops/s $\color{#35bf28}+4.66\%$
test_redq_speed[False-backward] 31.9635ms 23.3764ms 42.7783 Ops/s 43.4983 Ops/s $\color{#d91a1a}-1.66\%$
test_redq_speed[True-None] 7.4989ms 6.7966ms 147.1328 Ops/s 142.5018 Ops/s $\color{#35bf28}+3.25\%$
test_redq_speed[True-backward] 14.8165ms 14.4148ms 69.3732 Ops/s 64.3634 Ops/s $\textbf{\color{#35bf28}+7.78\%}$
test_redq_speed[reduce-overhead-None] 7.6116ms 6.8109ms 146.8232 Ops/s 137.4140 Ops/s $\textbf{\color{#35bf28}+6.85\%}$
test_redq_speed[reduce-overhead-backward] 15.2787ms 14.7610ms 67.7459 Ops/s 67.4150 Ops/s $\color{#35bf28}+0.49\%$
test_redq_deprec_speed[False-None] 13.7892ms 12.9760ms 77.0654 Ops/s 76.6679 Ops/s $\color{#35bf28}+0.52\%$
test_redq_deprec_speed[False-backward] 19.6148ms 18.5546ms 53.8948 Ops/s 53.3525 Ops/s $\color{#35bf28}+1.02\%$
test_redq_deprec_speed[True-None] 5.7633ms 5.2952ms 188.8495 Ops/s 181.2779 Ops/s $\color{#35bf28}+4.18\%$
test_redq_deprec_speed[True-backward] 10.4216ms 10.0481ms 99.5213 Ops/s 93.5315 Ops/s $\textbf{\color{#35bf28}+6.40\%}$
test_redq_deprec_speed[reduce-overhead-None] 5.7036ms 5.2388ms 190.8819 Ops/s 187.0003 Ops/s $\color{#35bf28}+2.08\%$
test_redq_deprec_speed[reduce-overhead-backward] 10.9721ms 10.2024ms 98.0164 Ops/s 97.8247 Ops/s $\color{#35bf28}+0.20\%$
test_td3_speed[False-None] 8.3748ms 7.9626ms 125.5873 Ops/s 121.8031 Ops/s $\color{#35bf28}+3.11\%$
test_td3_speed[False-backward] 11.3178ms 10.3789ms 96.3491 Ops/s 94.0746 Ops/s $\color{#35bf28}+2.42\%$
test_td3_speed[True-None] 2.4416ms 2.2805ms 438.4984 Ops/s 428.7918 Ops/s $\color{#35bf28}+2.26\%$
test_td3_speed[True-backward] 4.0146ms 3.9301ms 254.4492 Ops/s 248.8924 Ops/s $\color{#35bf28}+2.23\%$
test_td3_speed[reduce-overhead-None] 2.8177ms 2.3058ms 433.6898 Ops/s 431.6150 Ops/s $\color{#35bf28}+0.48\%$
test_td3_speed[reduce-overhead-backward] 4.1342ms 3.9538ms 252.9221 Ops/s 246.8201 Ops/s $\color{#35bf28}+2.47\%$
test_cql_speed[False-None] 39.9933ms 37.3203ms 26.7951 Ops/s 27.0977 Ops/s $\color{#d91a1a}-1.12\%$
test_cql_speed[False-backward] 48.5278ms 46.6578ms 21.4327 Ops/s 21.3543 Ops/s $\color{#35bf28}+0.37\%$
test_cql_speed[True-None] 23.3732ms 22.3036ms 44.8358 Ops/s 44.1562 Ops/s $\color{#35bf28}+1.54\%$
test_cql_speed[True-backward] 30.6223ms 29.4075ms 34.0050 Ops/s 33.3972 Ops/s $\color{#35bf28}+1.82\%$
test_cql_speed[reduce-overhead-None] 23.8176ms 22.3017ms 44.8397 Ops/s 44.2809 Ops/s $\color{#35bf28}+1.26\%$
test_cql_speed[reduce-overhead-backward] 30.8504ms 29.7435ms 33.6208 Ops/s 33.5835 Ops/s $\color{#35bf28}+0.11\%$
test_a2c_speed[False-None] 8.4499ms 7.1947ms 138.9909 Ops/s 135.7331 Ops/s $\color{#35bf28}+2.40\%$
test_a2c_speed[False-backward] 15.6339ms 14.8078ms 67.5318 Ops/s 67.6905 Ops/s $\color{#d91a1a}-0.23\%$
test_a2c_speed[True-None] 4.7305ms 4.6429ms 215.3822 Ops/s 208.2800 Ops/s $\color{#35bf28}+3.41\%$
test_a2c_speed[True-backward] 11.5838ms 11.1503ms 89.6840 Ops/s 88.6399 Ops/s $\color{#35bf28}+1.18\%$
test_a2c_speed[reduce-overhead-None] 5.2504ms 4.7241ms 211.6818 Ops/s 211.8166 Ops/s $\color{#d91a1a}-0.06\%$
test_a2c_speed[reduce-overhead-backward] 11.4605ms 11.1751ms 89.4843 Ops/s 87.5557 Ops/s $\color{#35bf28}+2.20\%$
test_ppo_speed[False-None] 8.1604ms 7.5091ms 133.1711 Ops/s 131.0378 Ops/s $\color{#35bf28}+1.63\%$
test_ppo_speed[False-backward] 16.0680ms 15.3115ms 65.3104 Ops/s 67.2206 Ops/s $\color{#d91a1a}-2.84\%$
test_ppo_speed[True-None] 5.9392ms 5.0851ms 196.6539 Ops/s 190.8042 Ops/s $\color{#35bf28}+3.07\%$
test_ppo_speed[True-backward] 11.2197ms 10.9751ms 91.1154 Ops/s 88.5641 Ops/s $\color{#35bf28}+2.88\%$
test_ppo_speed[reduce-overhead-None] 5.6258ms 5.0590ms 197.6684 Ops/s 196.3453 Ops/s $\color{#35bf28}+0.67\%$
test_ppo_speed[reduce-overhead-backward] 11.5615ms 11.0911ms 90.1625 Ops/s 90.4031 Ops/s $\color{#d91a1a}-0.27\%$
test_reinforce_speed[False-None] 8.0439ms 6.6203ms 151.0503 Ops/s 151.2200 Ops/s $\color{#d91a1a}-0.11\%$
test_reinforce_speed[False-backward] 10.5872ms 10.1748ms 98.2818 Ops/s 99.9667 Ops/s $\color{#d91a1a}-1.69\%$
test_reinforce_speed[True-None] 4.8549ms 4.0523ms 246.7708 Ops/s 244.2138 Ops/s $\color{#35bf28}+1.05\%$
test_reinforce_speed[True-backward] 11.1999ms 10.0366ms 99.6350 Ops/s 99.3382 Ops/s $\color{#35bf28}+0.30\%$
test_reinforce_speed[reduce-overhead-None] 4.4952ms 4.0847ms 244.8133 Ops/s 241.3958 Ops/s $\color{#35bf28}+1.42\%$
test_reinforce_speed[reduce-overhead-backward] 10.4578ms 9.9806ms 100.1948 Ops/s 99.8311 Ops/s $\color{#35bf28}+0.36\%$
test_iql_speed[False-None] 37.9813ms 33.3461ms 29.9885 Ops/s 30.2927 Ops/s $\color{#d91a1a}-1.00\%$
test_iql_speed[False-backward] 59.5238ms 46.4438ms 21.5314 Ops/s 21.6410 Ops/s $\color{#d91a1a}-0.51\%$
test_iql_speed[True-None] 16.7551ms 15.8549ms 63.0718 Ops/s 61.4194 Ops/s $\color{#35bf28}+2.69\%$
test_iql_speed[True-backward] 28.9397ms 27.3928ms 36.5059 Ops/s 36.5758 Ops/s $\color{#d91a1a}-0.19\%$
test_iql_speed[reduce-overhead-None] 16.9719ms 16.2804ms 61.4234 Ops/s 62.3568 Ops/s $\color{#d91a1a}-1.50\%$
test_iql_speed[reduce-overhead-backward] 28.9202ms 27.9519ms 35.7757 Ops/s 36.7391 Ops/s $\color{#d91a1a}-2.62\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8559ms 4.9964ms 200.1436 Ops/s 204.3829 Ops/s $\color{#d91a1a}-2.07\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0074ms 0.5495ms 1.8198 KOps/s 1.8293 KOps/s $\color{#d91a1a}-0.52\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8781ms 0.5181ms 1.9301 KOps/s 1.9380 KOps/s $\color{#d91a1a}-0.41\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3822ms 4.7599ms 210.0886 Ops/s 215.9979 Ops/s $\color{#d91a1a}-2.74\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.9237ms 0.5319ms 1.8800 KOps/s 1.8783 KOps/s $\color{#35bf28}+0.09\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8513ms 0.5125ms 1.9512 KOps/s 1.9536 KOps/s $\color{#d91a1a}-0.12\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4336ms 1.6957ms 589.7434 Ops/s 577.6959 Ops/s $\color{#35bf28}+2.09\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3213ms 1.6057ms 622.7946 Ops/s 615.8155 Ops/s $\color{#35bf28}+1.13\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5049ms 5.0675ms 197.3366 Ops/s 207.7177 Ops/s $\color{#d91a1a}-5.00\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.6210ms 0.6753ms 1.4808 KOps/s 1.4843 KOps/s $\color{#d91a1a}-0.24\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0203ms 0.6521ms 1.5336 KOps/s 1.5277 KOps/s $\color{#35bf28}+0.38\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1094ms 4.9175ms 203.3553 Ops/s 214.5637 Ops/s $\textbf{\color{#d91a1a}-5.22\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.2744ms 0.5458ms 1.8321 KOps/s 1.8284 KOps/s $\color{#35bf28}+0.20\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7506ms 0.5199ms 1.9235 KOps/s 1.9376 KOps/s $\color{#d91a1a}-0.73\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.4780ms 4.8198ms 207.4778 Ops/s 218.4454 Ops/s $\textbf{\color{#d91a1a}-5.02\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.2703ms 0.5359ms 1.8661 KOps/s 1.8892 KOps/s $\color{#d91a1a}-1.22\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8629ms 0.5166ms 1.9356 KOps/s 1.9457 KOps/s $\color{#d91a1a}-0.52\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.5902ms 4.9963ms 200.1462 Ops/s 209.0595 Ops/s $\color{#d91a1a}-4.26\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.5807ms 0.6915ms 1.4461 KOps/s 1.4844 KOps/s $\color{#d91a1a}-2.58\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8651ms 0.6527ms 1.5321 KOps/s 1.5199 KOps/s $\color{#35bf28}+0.80\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.8712ms 4.2972ms 232.7113 Ops/s 241.8058 Ops/s $\color{#d91a1a}-3.76\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.5941ms 2.2563ms 443.1968 Ops/s 418.9245 Ops/s $\textbf{\color{#35bf28}+5.79\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.4917ms 1.3805ms 724.3911 Ops/s 739.0887 Ops/s $\color{#d91a1a}-1.99\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4619s 13.6375ms 73.3272 Ops/s 239.7407 Ops/s $\textbf{\color{#d91a1a}-69.41\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.9348ms 2.4306ms 411.4216 Ops/s 422.8399 Ops/s $\color{#d91a1a}-2.70\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.6954ms 1.4446ms 692.2459 Ops/s 692.4231 Ops/s $\color{#d91a1a}-0.03\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.1392ms 4.5284ms 220.8309 Ops/s 32.9211 Ops/s $\textbf{\color{#35bf28}+570.79\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.1198ms 2.5578ms 390.9550 Ops/s 395.9280 Ops/s $\color{#d91a1a}-1.26\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.1147ms 1.5501ms 645.1163 Ops/s 656.4094 Ops/s $\color{#d91a1a}-1.72\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.0056ms 11.7972ms 84.7659 Ops/s 78.3332 Ops/s $\textbf{\color{#35bf28}+8.21\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.3206ms 14.1222ms 70.8103 Ops/s 69.3857 Ops/s $\color{#35bf28}+2.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.7741ms 20.6005ms 48.5424 Ops/s 46.5161 Ops/s $\color{#35bf28}+4.36\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 15.9246ms 14.3180ms 69.8421 Ops/s 68.2783 Ops/s $\color{#35bf28}+2.29\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.8329ms 20.8099ms 48.0540 Ops/s 46.8918 Ops/s $\color{#35bf28}+2.48\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 16.2752ms 15.5170ms 64.4454 Ops/s 62.1275 Ops/s $\color{#35bf28}+3.73\%$
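
For readers checking the table by hand, the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, expressed in percent. The two throughput columns can use different unit prefixes in the same row (for example `1.0251 KOps/s` against `799.4572 Ops/s`), so both need converting to plain Ops/s first. The sketch below is an assumption about how the numbers relate, not the benchmark bot's actual code. The helper names `parse_ops`, `relative_change_pct` and `is_significant` are hypothetical. The 5% cutoff for bold entries is inferred from the rows above (-5.00% is plain, -5.02% is bold, and the bold rows add up to the 9 improved and 3 worsened in the summary); it is not stated anywhere in the comment.

```python
import re

# Multipliers for the unit prefixes seen in the table ("Ops/s", "KOps/s").
# "MOps/s" is an assumption; it does not appear in the rows above.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}

# Inferred from the table (-5.00% is plain, -5.02% is bold); not documented.
SIGNIFICANCE_THRESHOLD_PCT = 5.0


def parse_ops(cell: str) -> float:
    """Convert a cell such as '1.0251 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change_pct(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput against repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def is_significant(change_pct: float) -> bool:
    """True when the change exceeds the inferred bold threshold."""
    return abs(change_pct) > SIGNIFICANCE_THRESHOLD_PCT


if __name__ == "__main__":
    # (name, Ops, Ops on Repo HEAD) copied from three rows of the CPU table.
    rows = [
        ("test_simple", "1.8957 Ops/s", "1.8784 Ops/s"),
        ("test_dqn_speed[True-backward]", "1.0251 KOps/s", "799.4572 Ops/s"),
        (
            "test_rb_populate[TensorDictReplayBuffer-ListStorage-"
            "SamplerWithoutReplacement-400]",
            "73.3272 Ops/s",
            "239.7407 Ops/s",
        ),
    ]
    for name, ops, head in rows:
        change = relative_change_pct(parse_ops(ops), parse_ops(head))
        flag = " (significant)" if is_significant(change) else ""
        print(f"{name}: {change:+.2f}%{flag}")
```

Because the displayed values are rounded to four decimals, a recomputed percentage can differ from the table in the last digit, especially for rows that mix `KOps/s` and `Ops/s`.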

[ghstack-poisoned]
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 6, 2025
ghstack-source-id: c17a24a4594db737cae51e8897d215295aa52d03
Pull Request resolved: #2822
[ghstack-poisoned]

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}19$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.9093s 0.8228s 1.2154 Ops/s 1.2145 Ops/s $\color{#35bf28}+0.07\%$
test_transformed 1.4963s 1.4055s 0.7115 Ops/s 0.6655 Ops/s $\textbf{\color{#35bf28}+6.91\%}$
test_serial 2.4329s 2.3413s 0.4271 Ops/s 0.4146 Ops/s $\color{#35bf28}+3.03\%$
test_parallel 1.9337s 1.8965s 0.5273 Ops/s 0.5231 Ops/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-True-True-True-True] 0.1821ms 39.7589μs 25.1516 KOps/s 25.7121 KOps/s $\color{#d91a1a}-2.18\%$
test_step_mdp_speed[True-True-True-True-False] 55.0110μs 23.3352μs 42.8537 KOps/s 42.5151 KOps/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[True-True-True-False-True] 55.2510μs 21.8888μs 45.6855 KOps/s 44.4316 KOps/s $\color{#35bf28}+2.82\%$
test_step_mdp_speed[True-True-True-False-False] 43.4610μs 12.9941μs 76.9578 KOps/s 76.6014 KOps/s $\color{#35bf28}+0.47\%$
test_step_mdp_speed[True-True-False-True-True] 0.1660ms 42.6656μs 23.4381 KOps/s 23.6063 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[True-True-False-True-False] 58.9210μs 25.6247μs 39.0248 KOps/s 38.8553 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[True-True-False-False-True] 61.4810μs 24.7299μs 40.4369 KOps/s 40.7767 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[True-True-False-False-False] 0.1359ms 15.3263μs 65.2473 KOps/s 64.5214 KOps/s $\color{#35bf28}+1.13\%$
test_step_mdp_speed[True-False-True-True-True] 85.5510μs 44.3745μs 22.5355 KOps/s 22.0300 KOps/s $\color{#35bf28}+2.29\%$
test_step_mdp_speed[True-False-True-True-False] 58.0610μs 28.0773μs 35.6159 KOps/s 35.6361 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[True-False-True-False-True] 64.2710μs 24.5157μs 40.7901 KOps/s 40.2838 KOps/s $\color{#35bf28}+1.26\%$
test_step_mdp_speed[True-False-True-False-False] 47.8610μs 15.2619μs 65.5225 KOps/s 64.5216 KOps/s $\color{#35bf28}+1.55\%$
test_step_mdp_speed[True-False-False-True-True] 0.1299ms 45.2063μs 22.1208 KOps/s 21.0453 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_step_mdp_speed[True-False-False-True-False] 60.0200μs 30.0319μs 33.2979 KOps/s 34.1026 KOps/s $\color{#d91a1a}-2.36\%$
test_step_mdp_speed[True-False-False-False-True] 71.6110μs 26.6906μs 37.4664 KOps/s 37.0332 KOps/s $\color{#35bf28}+1.17\%$
test_step_mdp_speed[True-False-False-False-False] 0.2033ms 17.5620μs 56.9411 KOps/s 56.4271 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[False-True-True-True-True] 0.2176ms 45.1269μs 22.1597 KOps/s 22.1769 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[False-True-True-True-False] 0.2225ms 28.0719μs 35.6228 KOps/s 35.6372 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[False-True-True-False-True] 2.6371ms 28.9609μs 34.5293 KOps/s 35.3407 KOps/s $\color{#d91a1a}-2.30\%$
test_step_mdp_speed[False-True-True-False-False] 71.1910μs 17.2342μs 58.0243 KOps/s 58.3900 KOps/s $\color{#d91a1a}-0.63\%$
test_step_mdp_speed[False-True-False-True-True] 0.2030ms 47.4642μs 21.0685 KOps/s 21.0950 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[False-True-False-True-False] 0.2147ms 30.3501μs 32.9488 KOps/s 32.5248 KOps/s $\color{#35bf28}+1.30\%$
test_step_mdp_speed[False-True-False-False-True] 0.2226ms 30.7445μs 32.5261 KOps/s 32.6249 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[False-True-False-False-False] 50.7110μs 18.9989μs 52.6346 KOps/s 51.7614 KOps/s $\color{#35bf28}+1.69\%$
test_step_mdp_speed[False-False-True-True-True] 82.9110μs 48.9656μs 20.4225 KOps/s 20.2606 KOps/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[False-False-True-True-False] 78.3310μs 32.8902μs 30.4042 KOps/s 30.3217 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[False-False-True-False-True] 54.6010μs 30.3998μs 32.8949 KOps/s 32.7591 KOps/s $\color{#35bf28}+0.41\%$
test_step_mdp_speed[False-False-True-False-False] 46.8310μs 19.1923μs 52.1042 KOps/s 51.6398 KOps/s $\color{#35bf28}+0.90\%$
test_step_mdp_speed[False-False-False-True-True] 85.7310μs 50.8735μs 19.6566 KOps/s 19.3087 KOps/s $\color{#35bf28}+1.80\%$
test_step_mdp_speed[False-False-False-True-False] 65.7810μs 35.0521μs 28.5290 KOps/s 28.7186 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[False-False-False-False-True] 56.2110μs 32.0057μs 31.2444 KOps/s 30.8247 KOps/s $\color{#35bf28}+1.36\%$
test_step_mdp_speed[False-False-False-False-False] 53.0510μs 21.4406μs 46.6404 KOps/s 46.4455 KOps/s $\color{#35bf28}+0.42\%$
test_values[generalized_advantage_estimate-True-True] 26.9035ms 26.4851ms 37.7570 Ops/s 38.8186 Ops/s $\color{#d91a1a}-2.73\%$
test_values[vec_generalized_advantage_estimate-True-True] 94.7584ms 2.8095ms 355.9381 Ops/s 343.8882 Ops/s $\color{#35bf28}+3.50\%$
test_values[td0_return_estimate-False-False] 0.2248ms 85.0285μs 11.7608 KOps/s 12.4359 KOps/s $\textbf{\color{#d91a1a}-5.43\%}$
test_values[td1_return_estimate-False-False] 58.3870ms 57.9477ms 17.2569 Ops/s 17.6866 Ops/s $\color{#d91a1a}-2.43\%$
test_values[vec_td1_return_estimate-False-False] 1.3719ms 1.1141ms 897.6255 Ops/s 910.9372 Ops/s $\color{#d91a1a}-1.46\%$
test_values[td_lambda_return_estimate-True-False] 92.5805ms 92.0438ms 10.8644 Ops/s 10.7668 Ops/s $\color{#35bf28}+0.91\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3991ms 1.1125ms 898.8655 Ops/s 915.0139 Ops/s $\color{#d91a1a}-1.76\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 26.5256ms 26.0738ms 38.3527 Ops/s 37.2100 Ops/s $\color{#35bf28}+3.07\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0779ms 0.7816ms 1.2795 KOps/s 1.2883 KOps/s $\color{#d91a1a}-0.68\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8764ms 0.7089ms 1.4106 KOps/s 1.4231 KOps/s $\color{#d91a1a}-0.88\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.7432ms 1.5104ms 662.0688 Ops/s 663.2923 Ops/s $\color{#d91a1a}-0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8723ms 0.7023ms 1.4240 KOps/s 1.3895 KOps/s $\color{#35bf28}+2.48\%$
test_dqn_speed[False-None] 1.6922ms 1.5067ms 663.6957 Ops/s 641.0168 Ops/s $\color{#35bf28}+3.54\%$
test_dqn_speed[False-backward] 2.2970ms 2.1457ms 466.0474 Ops/s 457.5768 Ops/s $\color{#35bf28}+1.85\%$
test_dqn_speed[True-None] 0.7019ms 0.5480ms 1.8249 KOps/s 1.7518 KOps/s $\color{#35bf28}+4.17\%$
test_dqn_speed[True-backward] 1.3856ms 1.2354ms 809.4312 Ops/s 856.3219 Ops/s $\textbf{\color{#d91a1a}-5.48\%}$
test_dqn_speed[reduce-overhead-None] 0.7893ms 0.5820ms 1.7181 KOps/s 1.7200 KOps/s $\color{#d91a1a}-0.11\%$
test_dqn_speed[reduce-overhead-backward] 1.1271ms 1.0846ms 921.9805 Ops/s 1.0004 KOps/s $\textbf{\color{#d91a1a}-7.84\%}$
test_ddpg_speed[False-None] 3.1305ms 2.8302ms 353.3325 Ops/s 334.7604 Ops/s $\textbf{\color{#35bf28}+5.55\%}$
test_ddpg_speed[False-backward] 4.7359ms 4.2848ms 233.3837 Ops/s 237.7096 Ops/s $\color{#d91a1a}-1.82\%$
test_ddpg_speed[True-None] 1.7358ms 1.3519ms 739.6911 Ops/s 740.5445 Ops/s $\color{#d91a1a}-0.12\%$
test_ddpg_speed[True-backward] 2.9951ms 2.6191ms 381.8033 Ops/s 403.5402 Ops/s $\textbf{\color{#d91a1a}-5.39\%}$
test_ddpg_speed[reduce-overhead-None] 1.5226ms 1.3565ms 737.1818 Ops/s 730.6972 Ops/s $\color{#35bf28}+0.89\%$
test_ddpg_speed[reduce-overhead-backward] 2.2271ms 2.0653ms 484.1847 Ops/s 514.7588 Ops/s $\textbf{\color{#d91a1a}-5.94\%}$
test_sac_speed[False-None] 8.5409ms 8.0886ms 123.6313 Ops/s 120.3577 Ops/s $\color{#35bf28}+2.72\%$
test_sac_speed[False-backward] 12.1536ms 11.3953ms 87.7555 Ops/s 87.6555 Ops/s $\color{#35bf28}+0.11\%$
test_sac_speed[True-None] 2.1395ms 1.8623ms 536.9706 Ops/s 539.0915 Ops/s $\color{#d91a1a}-0.39\%$
test_sac_speed[True-backward] 4.1933ms 3.8195ms 261.8143 Ops/s 260.0520 Ops/s $\color{#35bf28}+0.68\%$
test_sac_speed[reduce-overhead-None] 20.8226ms 12.0068ms 83.2859 Ops/s 82.8173 Ops/s $\color{#35bf28}+0.57\%$
test_sac_speed[reduce-overhead-backward] 1.9243ms 1.7919ms 558.0813 Ops/s 568.2615 Ops/s $\color{#d91a1a}-1.79\%$
test_redq_speed[False-None] 8.2293ms 7.7227ms 129.4891 Ops/s 126.2826 Ops/s $\color{#35bf28}+2.54\%$
test_redq_speed[False-backward] 12.6262ms 12.0290ms 83.1327 Ops/s 82.7059 Ops/s $\color{#35bf28}+0.52\%$
test_redq_speed[True-None] 2.6782ms 2.3547ms 424.6814 Ops/s 407.0900 Ops/s $\color{#35bf28}+4.32\%$
test_redq_speed[True-backward] 4.7093ms 4.2927ms 232.9537 Ops/s 229.4954 Ops/s $\color{#35bf28}+1.51\%$
test_redq_speed[reduce-overhead-None] 2.6590ms 2.3960ms 417.3623 Ops/s 415.7099 Ops/s $\color{#35bf28}+0.40\%$
test_redq_speed[reduce-overhead-backward] 4.6383ms 4.3032ms 232.3831 Ops/s 230.5264 Ops/s $\color{#35bf28}+0.81\%$
test_redq_deprec_speed[False-None] 9.7730ms 9.1757ms 108.9835 Ops/s 106.8587 Ops/s $\color{#35bf28}+1.99\%$
test_redq_deprec_speed[False-backward] 13.2458ms 12.5166ms 79.8939 Ops/s 78.4747 Ops/s $\color{#35bf28}+1.81\%$
test_redq_deprec_speed[True-None] 2.8477ms 2.6645ms 375.2987 Ops/s 369.7135 Ops/s $\color{#35bf28}+1.51\%$
test_redq_deprec_speed[True-backward] 4.9689ms 4.5902ms 217.8567 Ops/s 211.5937 Ops/s $\color{#35bf28}+2.96\%$
test_redq_deprec_speed[reduce-overhead-None] 2.9279ms 2.6664ms 375.0392 Ops/s 368.7481 Ops/s $\color{#35bf28}+1.71\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.9906ms 4.5659ms 219.0136 Ops/s 214.7089 Ops/s $\color{#35bf28}+2.00\%$
test_td3_speed[False-None] 8.1150ms 8.0371ms 124.4223 Ops/s 122.5281 Ops/s $\color{#35bf28}+1.55\%$
test_td3_speed[False-backward] 11.2183ms 10.7079ms 93.3891 Ops/s 92.5386 Ops/s $\color{#35bf28}+0.92\%$
test_td3_speed[True-None] 1.7014ms 1.6550ms 604.2142 Ops/s 602.5796 Ops/s $\color{#35bf28}+0.27\%$
test_td3_speed[True-backward] 3.9288ms 3.3903ms 294.9579 Ops/s 284.6432 Ops/s $\color{#35bf28}+3.62\%$
test_td3_speed[reduce-overhead-None] 77.9704ms 26.4534ms 37.8024 Ops/s 38.0992 Ops/s $\color{#d91a1a}-0.78\%$
test_td3_speed[reduce-overhead-backward] 1.6232ms 1.4762ms 677.4256 Ops/s 724.7690 Ops/s $\textbf{\color{#d91a1a}-6.53\%}$
test_cql_speed[False-None] 17.5177ms 16.9276ms 59.0749 Ops/s 57.4458 Ops/s $\color{#35bf28}+2.84\%$
test_cql_speed[False-backward] 23.0859ms 22.5622ms 44.3219 Ops/s 44.0018 Ops/s $\color{#35bf28}+0.73\%$
test_cql_speed[True-None] 3.5568ms 3.2977ms 303.2420 Ops/s 289.5599 Ops/s $\color{#35bf28}+4.73\%$
test_cql_speed[True-backward] 6.0341ms 5.6382ms 177.3626 Ops/s 170.8024 Ops/s $\color{#35bf28}+3.84\%$
test_cql_speed[reduce-overhead-None] 0.6026s 16.4619ms 60.7462 Ops/s 78.7706 Ops/s $\textbf{\color{#d91a1a}-22.88\%}$
test_cql_speed[reduce-overhead-backward] 2.1519ms 2.0020ms 499.4952 Ops/s 537.4316 Ops/s $\textbf{\color{#d91a1a}-7.06\%}$
test_a2c_speed[False-None] 3.3976ms 3.1883ms 313.6479 Ops/s 304.3083 Ops/s $\color{#35bf28}+3.07\%$
test_a2c_speed[False-backward] 6.9931ms 6.3857ms 156.5989 Ops/s 157.0939 Ops/s $\color{#d91a1a}-0.32\%$
test_a2c_speed[True-None] 1.5152ms 1.3501ms 740.6700 Ops/s 730.9451 Ops/s $\color{#35bf28}+1.33\%$
test_a2c_speed[True-backward] 3.2546ms 3.0874ms 323.9018 Ops/s 317.5995 Ops/s $\color{#35bf28}+1.98\%$
test_a2c_speed[reduce-overhead-None] 16.0694ms 9.0186ms 110.8814 Ops/s 108.9090 Ops/s $\color{#35bf28}+1.81\%$
test_a2c_speed[reduce-overhead-backward] 1.7651ms 1.6206ms 617.0410 Ops/s 609.5364 Ops/s $\color{#35bf28}+1.23\%$
test_ppo_speed[False-None] 4.2393ms 3.7895ms 263.8839 Ops/s 262.0444 Ops/s $\color{#35bf28}+0.70\%$
test_ppo_speed[False-backward] 7.6155ms 7.1542ms 139.7784 Ops/s 136.6616 Ops/s $\color{#35bf28}+2.28\%$
test_ppo_speed[True-None] 1.6289ms 1.4299ms 699.3721 Ops/s 696.2404 Ops/s $\color{#35bf28}+0.45\%$
test_ppo_speed[True-backward] 3.4620ms 3.2723ms 305.5911 Ops/s 302.5664 Ops/s $\color{#35bf28}+1.00\%$
test_ppo_speed[reduce-overhead-None] 1.4139ms 0.9789ms 1.0216 KOps/s 1.0284 KOps/s $\color{#d91a1a}-0.67\%$
test_ppo_speed[reduce-overhead-backward] 1.6151ms 1.5778ms 633.7964 Ops/s 614.9558 Ops/s $\color{#35bf28}+3.06\%$
test_reinforce_speed[False-None] 2.7263ms 2.2894ms 436.8048 Ops/s 431.7110 Ops/s $\color{#35bf28}+1.18\%$
test_reinforce_speed[False-backward] 3.8657ms 3.3990ms 294.1999 Ops/s 288.8609 Ops/s $\color{#35bf28}+1.85\%$
test_reinforce_speed[True-None] 1.7418ms 1.3053ms 766.1198 Ops/s 763.2869 Ops/s $\color{#35bf28}+0.37\%$
test_reinforce_speed[True-backward] 3.2653ms 3.1074ms 321.8117 Ops/s 338.9722 Ops/s $\textbf{\color{#d91a1a}-5.06\%}$
test_reinforce_speed[reduce-overhead-None] 18.7306ms 10.2545ms 97.5185 Ops/s 95.6560 Ops/s $\color{#35bf28}+1.95\%$
test_reinforce_speed[reduce-overhead-backward] 1.8186ms 1.6523ms 605.2331 Ops/s 593.0812 Ops/s $\color{#35bf28}+2.05\%$
test_iql_speed[False-None] 9.6857ms 9.2349ms 108.2851 Ops/s 104.6292 Ops/s $\color{#35bf28}+3.49\%$
test_iql_speed[False-backward] 13.7430ms 13.2391ms 75.5341 Ops/s 73.0460 Ops/s $\color{#35bf28}+3.41\%$
test_iql_speed[True-None] 2.4770ms 2.2356ms 447.3054 Ops/s 438.5654 Ops/s $\color{#35bf28}+1.99\%$
test_iql_speed[True-backward] 5.5061ms 5.0484ms 198.0812 Ops/s 199.9310 Ops/s $\color{#d91a1a}-0.93\%$
test_iql_speed[reduce-overhead-None] 0.5365s 12.6813ms 78.8565 Ops/s 88.3015 Ops/s $\textbf{\color{#d91a1a}-10.70\%}$
test_iql_speed[reduce-overhead-backward] 2.1987ms 2.0780ms 481.2273 Ops/s 512.1055 Ops/s $\textbf{\color{#d91a1a}-6.03\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6851ms 6.2534ms 159.9123 Ops/s 157.6912 Ops/s $\color{#35bf28}+1.41\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5747ms 0.3221ms 3.1050 KOps/s 3.4292 KOps/s $\textbf{\color{#d91a1a}-9.45\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6038ms 0.3014ms 3.3181 KOps/s 3.9885 KOps/s $\textbf{\color{#d91a1a}-16.81\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4423ms 5.9669ms 167.5904 Ops/s 166.2375 Ops/s $\color{#35bf28}+0.81\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1697ms 0.3691ms 2.7094 KOps/s 3.0009 KOps/s $\textbf{\color{#d91a1a}-9.71\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5593ms 0.3258ms 3.0693 KOps/s 3.6691 KOps/s $\textbf{\color{#d91a1a}-16.35\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.8681ms 1.5148ms 660.1467 Ops/s 756.8413 Ops/s $\textbf{\color{#d91a1a}-12.78\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5111ms 1.2355ms 809.4096 Ops/s 728.5112 Ops/s $\textbf{\color{#35bf28}+11.10\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4876ms 6.1373ms 162.9382 Ops/s 161.3907 Ops/s $\color{#35bf28}+0.96\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2758ms 0.4287ms 2.3328 KOps/s 2.0568 KOps/s $\textbf{\color{#35bf28}+13.42\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6167ms 0.3979ms 2.5129 KOps/s 2.1271 KOps/s $\textbf{\color{#35bf28}+18.14\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1315ms 5.9508ms 168.0456 Ops/s 165.1641 Ops/s $\color{#35bf28}+1.74\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7021ms 0.3697ms 2.7049 KOps/s 2.9922 KOps/s $\textbf{\color{#d91a1a}-9.60\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6213ms 0.3529ms 2.8333 KOps/s 2.9721 KOps/s $\color{#d91a1a}-4.67\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2672ms 5.9136ms 169.1022 Ops/s 164.4718 Ops/s $\color{#35bf28}+2.82\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8894ms 0.3596ms 2.7806 KOps/s 3.3312 KOps/s $\textbf{\color{#d91a1a}-16.53\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6904ms 0.3336ms 2.9974 KOps/s 3.8326 KOps/s $\textbf{\color{#d91a1a}-21.79\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5947ms 6.1274ms 163.2020 Ops/s 160.0270 Ops/s $\color{#35bf28}+1.98\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8250ms 0.4614ms 2.1672 KOps/s 1.9896 KOps/s $\textbf{\color{#35bf28}+8.92\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6195ms 0.4136ms 2.4178 KOps/s 2.2502 KOps/s $\textbf{\color{#35bf28}+7.45\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.2134ms 5.5837ms 179.0941 Ops/s 176.6946 Ops/s $\color{#35bf28}+1.36\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.7869ms 1.7668ms 565.9891 Ops/s 441.7636 Ops/s $\textbf{\color{#35bf28}+28.12\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.3087ms 1.2756ms 783.9588 Ops/s 821.1161 Ops/s $\color{#d91a1a}-4.53\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.6120ms 5.7186ms 174.8677 Ops/s 176.1396 Ops/s $\color{#d91a1a}-0.72\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.0540ms 2.0675ms 483.6827 Ops/s 435.6504 Ops/s $\textbf{\color{#35bf28}+11.03\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.9451ms 0.9503ms 1.0523 KOps/s 910.7638 Ops/s $\textbf{\color{#35bf28}+15.54\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5310s 16.3807ms 61.0474 Ops/s 29.9378 Ops/s $\textbf{\color{#35bf28}+103.91\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.0516ms 2.2359ms 447.2555 Ops/s 459.8372 Ops/s $\color{#d91a1a}-2.74\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.0641ms 1.3509ms 740.2207 Ops/s 719.4784 Ops/s $\color{#35bf28}+2.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.8358ms 13.5617ms 73.7372 Ops/s 72.5310 Ops/s $\color{#35bf28}+1.66\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.0540ms 17.3401ms 57.6699 Ops/s 59.7164 Ops/s $\color{#d91a1a}-3.43\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.9459ms 18.1867ms 54.9851 Ops/s 54.7367 Ops/s $\color{#35bf28}+0.45\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.2146ms 17.6259ms 56.7348 Ops/s 58.5732 Ops/s $\color{#d91a1a}-3.14\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.8116ms 18.1857ms 54.9884 Ops/s 54.5349 Ops/s $\color{#35bf28}+0.83\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.2890ms 19.1211ms 52.2982 Ops/s 54.0349 Ops/s $\color{#d91a1a}-3.21\%$
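
The Change column appears to be the relative difference between the "Ops" and "Ops on Repo HEAD" columns. This is inferred from the rows above, not taken from the benchmark bot's code. Below is a minimal sketch of that arithmetic; the helper names `parse_ops` and `percent_change` are hypothetical, and only the two units that occur in the table (`Ops/s`, `KOps/s`) are handled.

```python
# Hypothetical helpers illustrating how the Change column can be reproduced
# from the two throughput columns. Not the benchmark bot's actual code.

# Units observed in the table, mapped to a multiplier in ops per second.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1_000.0}


def parse_ops(text: str) -> float:
    """Convert a cell such as '1.0523 KOps/s' to a float in ops per second."""
    value, unit = text.split()
    return float(value) * _UNIT_SCALE[unit]


def percent_change(ops: str, head_ops: str) -> float:
    """Relative change of `ops` against the Repo HEAD baseline, in percent."""
    new = parse_ops(ops)
    base = parse_ops(head_ops)
    return (new - base) / base * 100.0


if __name__ == "__main__":
    # Values copied from the test_cql_speed[reduce-overhead-None] row.
    print(f"{percent_change('60.7462 Ops/s', '78.7706 Ops/s'):+.2f}%")
```

Normalizing units first matters for rows such as `test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]`, where the PR value is reported in KOps/s and the baseline in Ops/s.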

@vmoens vmoens merged commit 239f2d1 into gh/vmoens/99/base Mar 11, 2025
59 of 72 checks passed
vmoens added a commit that referenced this pull request Mar 11, 2025
ghstack-source-id: fb2e2b62652cd60111d84e0a45f47153b54c44cc
Pull Request resolved: #2822
@vmoens vmoens deleted the gh/vmoens/99/head branch March 11, 2025 13:38
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement: New feature or request