[Feature] batch_size, reward, done, attention_key in LLMEnv #2824

Merged: 5 commits merged on Mar 11, 2025

[ghstack-poisoned]

pytorch-bot bot commented Mar 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2824

Note: Links to docs will display an error until the docs builds have been completed.

❌ 8 New Failures, 1 Unrelated Failure

As of commit 2d86afc with merge base 6e40548:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Mar 4, 2025
github-actions bot commented Mar 4, 2025

⚠ Warning
Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 11. Worsened: 7.

Name Max Mean Ops Ops on Repo HEAD Change (relative to Repo HEAD; values shown without a sign are decreases)
test_simple 0.6194s 0.5389s 1.8556 Ops/s 1.8968 Ops/s 2.17 %
test_transformed 1.1648s 1.0832s 0.9232 Ops/s 0.9817 Ops/s -5.96%
test_serial 1.6636s 1.5679s 0.6378 Ops/s 0.6509 Ops/s 2.02 %
test_parallel 1.4227s 1.3300s 0.7519 Ops/s 0.7429 Ops/s + 1.22 %
test_step_mdp_speed[True-True-True-True-True] 0.3382ms 30.2630μs 33.0437 KOps/s 33.5801 KOps/s 1.60 %
test_step_mdp_speed[True-True-True-True-False] 48.3010μs 17.7987μs 56.1839 KOps/s 56.9181 KOps/s 1.29 %
test_step_mdp_speed[True-True-True-False-True] 50.0840μs 16.9282μs 59.0730 KOps/s 59.7032 KOps/s 1.06 %
test_step_mdp_speed[True-True-True-False-False] 53.8290μs 9.9555μs 100.4473 KOps/s 101.0218 KOps/s 0.57 %
test_step_mdp_speed[True-True-False-True-True] 63.6790μs 31.9635μs 31.2857 KOps/s 31.2950 KOps/s 0.03 %
test_step_mdp_speed[True-True-False-True-False] 67.8570μs 19.2842μs 51.8559 KOps/s 51.4720 KOps/s + 0.75 %
test_step_mdp_speed[True-True-False-False-True] 0.1671ms 18.9293μs 52.8283 KOps/s 52.8929 KOps/s 0.12 %
test_step_mdp_speed[True-True-False-False-False] 0.1229ms 12.1521μs 82.2901 KOps/s 85.4057 KOps/s 3.65 %
test_step_mdp_speed[True-False-True-True-True] 87.6340μs 33.4094μs 29.9317 KOps/s 29.7141 KOps/s + 0.73 %
test_step_mdp_speed[True-False-True-True-False] 53.5510μs 21.3059μs 46.9354 KOps/s 46.6785 KOps/s + 0.55 %
test_step_mdp_speed[True-False-True-False-True] 73.2470μs 18.4931μs 54.0743 KOps/s 52.0846 KOps/s + 3.82 %
test_step_mdp_speed[True-False-True-False-False] 33.3020μs 11.5505μs 86.5765 KOps/s 85.5244 KOps/s + 1.23 %
test_step_mdp_speed[True-False-False-True-True] 89.5570μs 34.9926μs 28.5775 KOps/s 28.0906 KOps/s + 1.73 %
test_step_mdp_speed[True-False-False-True-False] 76.3930μs 23.1477μs 43.2009 KOps/s 43.4452 KOps/s 0.56 %
test_step_mdp_speed[True-False-False-False-True] 71.7140μs 20.0965μs 49.7599 KOps/s 48.8574 KOps/s + 1.85 %
test_step_mdp_speed[True-False-False-False-False] 39.8950μs 13.5075μs 74.0329 KOps/s 74.1966 KOps/s 0.22 %
test_step_mdp_speed[False-True-True-True-True] 90.2280μs 33.4787μs 29.8698 KOps/s 29.8004 KOps/s + 0.23 %
test_step_mdp_speed[False-True-True-True-False] 0.2242ms 21.6844μs 46.1161 KOps/s 46.5962 KOps/s 1.03 %
test_step_mdp_speed[False-True-True-False-True] 51.3160μs 21.3877μs 46.7558 KOps/s 44.9062 KOps/s + 4.12 %
test_step_mdp_speed[False-True-True-False-False] 57.5270μs 13.1043μs 76.3106 KOps/s 76.0310 KOps/s + 0.37 %
test_step_mdp_speed[False-True-False-True-True] 70.5020μs 35.0936μs 28.4952 KOps/s 28.1509 KOps/s + 1.22 %
test_step_mdp_speed[False-True-False-True-False] 2.6941ms 23.0683μs 43.3495 KOps/s 43.0543 KOps/s + 0.69 %
test_step_mdp_speed[False-True-False-False-True] 0.1009ms 23.1035μs 43.2835 KOps/s 43.1056 KOps/s + 0.41 %
test_step_mdp_speed[False-True-False-False-False] 40.7160μs 14.7487μs 67.8027 KOps/s 67.2993 KOps/s + 0.75 %
test_step_mdp_speed[False-False-True-True-True] 80.1290μs 36.6764μs 27.2655 KOps/s 26.6869 KOps/s + 2.17 %
test_step_mdp_speed[False-False-True-True-False] 58.5600μs 25.0402μs 39.9358 KOps/s 40.1005 KOps/s 0.41 %
test_step_mdp_speed[False-False-True-False-True] 78.2160μs 22.6881μs 44.0760 KOps/s 42.3952 KOps/s + 3.96 %
test_step_mdp_speed[False-False-True-False-False] 48.6010μs 14.6393μs 68.3091 KOps/s 66.9970 KOps/s + 1.96 %
test_step_mdp_speed[False-False-False-True-True] 90.9800μs 38.1161μs 26.2356 KOps/s 25.5232 KOps/s + 2.79 %
test_step_mdp_speed[False-False-False-True-False] 71.2030μs 26.4475μs 37.8108 KOps/s 38.0304 KOps/s 0.58 %
test_step_mdp_speed[False-False-False-False-True] 46.6770μs 24.4410μs 40.9148 KOps/s 40.5860 KOps/s + 0.81 %
test_step_mdp_speed[False-False-False-False-False] 65.7730μs 16.3313μs 61.2321 KOps/s 60.9914 KOps/s + 0.39 %
test_values[generalized_advantage_estimate-True-True] 12.5508ms 10.0881ms 99.1264 Ops/s 99.1114 Ops/s + 0.02 %
test_values[vec_generalized_advantage_estimate-True-True] 34.4323ms 24.5286ms 40.7687 Ops/s 37.8012 Ops/s +7.85%
test_values[td0_return_estimate-False-False] 0.2377ms 0.1821ms 5.4903 KOps/s 5.5718 KOps/s 1.46 %
test_values[td1_return_estimate-False-False] 26.0917ms 24.4751ms 40.8579 Ops/s 41.2040 Ops/s 0.84 %
test_values[vec_td1_return_estimate-False-False] 26.7862ms 24.4789ms 40.8514 Ops/s 37.4598 Ops/s +9.05%
test_values[td_lambda_return_estimate-True-False] 36.8299ms 35.3197ms 28.3128 Ops/s 28.4483 Ops/s 0.48 %
test_values[vec_td_lambda_return_estimate-True-False] 27.1706ms 24.5493ms 40.7343 Ops/s 36.6955 Ops/s +11.01%
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.7727ms 8.5391ms 117.1085 Ops/s 117.6850 Ops/s 0.49 %
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.4344ms 1.9780ms 505.5593 Ops/s 513.1082 Ops/s 1.47 %
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4856ms 0.3734ms 2.6780 KOps/s 2.6632 KOps/s + 0.56 %
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 58.5270ms 45.3447ms 22.0533 Ops/s 21.8262 Ops/s + 1.04 %
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.5567ms 3.4596ms 289.0528 Ops/s 289.8358 Ops/s 0.27 %
test_dqn_speed[False-None] 2.1105ms 1.4633ms 683.4014 Ops/s 691.1220 Ops/s 1.12 %
test_dqn_speed[False-backward] 2.0796ms 1.9631ms 509.4085 Ops/s 521.4027 Ops/s 2.30 %
test_dqn_speed[True-None] 0.8682ms 0.5674ms 1.7626 KOps/s 1.7651 KOps/s 0.14 %
test_dqn_speed[True-backward] 1.0342ms 0.9904ms 1.0097 KOps/s 989.5994 Ops/s + 2.03 %
test_dqn_speed[reduce-overhead-None] 0.8268ms 0.5699ms 1.7547 KOps/s 1.7436 KOps/s + 0.64 %
test_dqn_speed[reduce-overhead-backward] 1.0371ms 0.9788ms 1.0216 KOps/s 977.9295 Ops/s + 4.47 %
test_ddpg_speed[False-None] 3.7675ms 2.9993ms 333.4105 Ops/s 345.2649 Ops/s 3.43 %
test_ddpg_speed[False-backward] 4.1834ms 4.0891ms 244.5503 Ops/s 248.5773 Ops/s 1.62 %
test_ddpg_speed[True-None] 1.6604ms 1.4499ms 689.7167 Ops/s 677.1004 Ops/s + 1.86 %
test_ddpg_speed[True-backward] 2.4036ms 2.3404ms 427.2736 Ops/s 425.2714 Ops/s + 0.47 %
test_ddpg_speed[reduce-overhead-None] 1.7426ms 1.4560ms 686.8062 Ops/s 685.0886 Ops/s + 0.25 %
test_ddpg_speed[reduce-overhead-backward] 2.4298ms 2.3495ms 425.6282 Ops/s 416.7083 Ops/s + 2.14 %
test_sac_speed[False-None] 9.7692ms 8.2932ms 120.5812 Ops/s 121.5507 Ops/s 0.80 %
test_sac_speed[False-backward] 11.9032ms 11.1846ms 89.4084 Ops/s 90.8974 Ops/s 1.64 %
test_sac_speed[True-None] 2.8826ms 2.6087ms 383.3399 Ops/s 379.8646 Ops/s + 0.91 %
test_sac_speed[True-backward] 4.8481ms 4.4089ms 226.8147 Ops/s 227.2258 Ops/s 0.18 %
test_sac_speed[reduce-overhead-None] 3.6263ms 2.6848ms 372.4734 Ops/s 384.2587 Ops/s 3.07 %
test_sac_speed[reduce-overhead-backward] 4.7819ms 4.2980ms 232.6650 Ops/s 219.8050 Ops/s +5.85%
test_redq_speed[False-None] 15.2489ms 13.4998ms 74.0754 Ops/s 64.7577 Ops/s +14.39%
test_redq_speed[False-backward] 31.1851ms 23.6417ms 42.2981 Ops/s 37.3062 Ops/s +13.38%
test_redq_speed[True-None] 8.4842ms 7.1870ms 139.1409 Ops/s 143.4366 Ops/s 2.99 %
test_redq_speed[True-backward] 15.7462ms 14.9052ms 67.0909 Ops/s 66.9853 Ops/s + 0.16 %
test_redq_speed[reduce-overhead-None] 11.1575ms 7.3083ms 136.8305 Ops/s 144.6590 Ops/s -5.41%
test_redq_speed[reduce-overhead-backward] 15.9319ms 14.4811ms 69.0554 Ops/s 69.6468 Ops/s 0.85 %
test_redq_deprec_speed[False-None] 14.5066ms 13.0917ms 76.3840 Ops/s 76.9057 Ops/s 0.68 %
test_redq_deprec_speed[False-backward] 20.6378ms 18.4939ms 54.0720 Ops/s 54.0263 Ops/s + 0.08 %
test_redq_deprec_speed[True-None] 7.6294ms 5.7490ms 173.9436 Ops/s 184.2443 Ops/s -5.59%
test_redq_deprec_speed[True-backward] 11.9062ms 11.3184ms 88.3519 Ops/s 92.9187 Ops/s 4.91 %
test_redq_deprec_speed[reduce-overhead-None] 7.8896ms 5.8995ms 169.5073 Ops/s 191.4479 Ops/s -11.46%
test_redq_deprec_speed[reduce-overhead-backward] 11.9857ms 10.8972ms 91.7666 Ops/s 92.3952 Ops/s 0.68 %
test_td3_speed[False-None] 8.8883ms 8.4119ms 118.8798 Ops/s 120.2404 Ops/s 1.13 %
test_td3_speed[False-backward] 12.9769ms 11.1302ms 89.8457 Ops/s 91.3190 Ops/s 1.61 %
test_td3_speed[True-None] 2.4650ms 2.3156ms 431.8513 Ops/s 432.3783 Ops/s 0.12 %
test_td3_speed[True-backward] 4.3133ms 4.0862ms 244.7273 Ops/s 251.7414 Ops/s 2.79 %
test_td3_speed[reduce-overhead-None] 2.5837ms 2.3023ms 434.3416 Ops/s 420.2119 Ops/s + 3.36 %
test_td3_speed[reduce-overhead-backward] 4.4177ms 4.0085ms 249.4713 Ops/s 248.6697 Ops/s + 0.32 %
test_cql_speed[False-None] 38.7963ms 37.3703ms 26.7592 Ops/s 26.9602 Ops/s 0.75 %
test_cql_speed[False-backward] 50.5322ms 47.7162ms 20.9573 Ops/s 20.6589 Ops/s + 1.44 %
test_cql_speed[True-None] 26.1770ms 23.0745ms 43.3378 Ops/s 44.2648 Ops/s 2.09 %
test_cql_speed[True-backward] 31.4135ms 29.6424ms 33.7354 Ops/s 33.1991 Ops/s + 1.62 %
test_cql_speed[reduce-overhead-None] 24.4140ms 22.7468ms 43.9623 Ops/s 44.0839 Ops/s 0.28 %
test_cql_speed[reduce-overhead-backward] 31.3085ms 30.1779ms 33.1368 Ops/s 32.8409 Ops/s + 0.90 %
test_a2c_speed[False-None] 8.8055ms 7.5330ms 132.7495 Ops/s 135.7475 Ops/s 2.21 %
test_a2c_speed[False-backward] 16.8235ms 14.8597ms 67.2960 Ops/s 66.9539 Ops/s + 0.51 %
test_a2c_speed[True-None] 5.9795ms 4.8082ms 207.9791 Ops/s 207.6537 Ops/s + 0.16 %
test_a2c_speed[True-backward] 11.6799ms 11.2407ms 88.9628 Ops/s 86.5681 Ops/s + 2.77 %
test_a2c_speed[reduce-overhead-None] 5.7140ms 4.8131ms 207.7675 Ops/s 213.7169 Ops/s 2.78 %
test_a2c_speed[reduce-overhead-backward] 12.9974ms 11.8834ms 84.1507 Ops/s 86.5151 Ops/s 2.73 %
test_ppo_speed[False-None] 8.7144ms 7.7586ms 128.8896 Ops/s 130.6512 Ops/s 1.35 %
test_ppo_speed[False-backward] 17.4072ms 15.5439ms 64.3338 Ops/s 64.1261 Ops/s + 0.32 %
test_ppo_speed[True-None] 6.1433ms 5.1386ms 194.6039 Ops/s 192.4950 Ops/s + 1.10 %
test_ppo_speed[True-backward] 12.3517ms 11.2125ms 89.1863 Ops/s 88.3781 Ops/s + 0.91 %
test_ppo_speed[reduce-overhead-None] 6.1565ms 5.1619ms 193.7261 Ops/s 194.6587 Ops/s 0.48 %
test_ppo_speed[reduce-overhead-backward] 12.2377ms 11.5637ms 86.4778 Ops/s 85.1082 Ops/s + 1.61 %
test_reinforce_speed[False-None] 8.2340ms 6.7616ms 147.8944 Ops/s 145.2619 Ops/s + 1.81 %
test_reinforce_speed[False-backward] 10.5166ms 10.1562ms 98.4618 Ops/s 95.2094 Ops/s + 3.42 %
test_reinforce_speed[True-None] 5.1878ms 4.2390ms 235.9037 Ops/s 222.9743 Ops/s +5.80%
test_reinforce_speed[True-backward] 11.5056ms 10.4389ms 95.7959 Ops/s 96.2178 Ops/s 0.44 %
test_reinforce_speed[reduce-overhead-None] 5.2129ms 4.3279ms 231.0592 Ops/s 237.6324 Ops/s 2.77 %
test_reinforce_speed[reduce-overhead-backward] 11.2483ms 10.2036ms 98.0042 Ops/s 94.7526 Ops/s + 3.43 %
test_iql_speed[False-None] 40.0158ms 33.4119ms 29.9294 Ops/s 28.7653 Ops/s + 4.05 %
test_iql_speed[False-backward] 47.8032ms 46.0449ms 21.7179 Ops/s 21.1172 Ops/s + 2.84 %
test_iql_speed[True-None] 20.5084ms 15.9896ms 62.5408 Ops/s 60.4518 Ops/s + 3.46 %
test_iql_speed[True-backward] 29.3225ms 27.8434ms 35.9152 Ops/s 36.3963 Ops/s 1.32 %
test_iql_speed[reduce-overhead-None] 21.3820ms 16.3674ms 61.0969 Ops/s 61.7925 Ops/s 1.13 %
test_iql_speed[reduce-overhead-backward] 28.4455ms 27.1392ms 36.8470 Ops/s 35.8775 Ops/s + 2.70 %
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8116ms 4.7587ms 210.1428 Ops/s 187.0074 Ops/s +12.37%
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7429ms 0.5455ms 1.8333 KOps/s 1.8555 KOps/s 1.20 %
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.1155ms 0.5524ms 1.8103 KOps/s 1.9236 KOps/s -5.89%
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.3244ms 4.5620ms 219.2040 Ops/s 211.1071 Ops/s + 3.84 %
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3692ms 0.5319ms 1.8800 KOps/s 1.8815 KOps/s 0.08 %
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 1.3815ms 0.5076ms 1.9700 KOps/s 1.9720 KOps/s 0.10 %
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4758ms 1.7826ms 560.9820 Ops/s 589.3202 Ops/s 4.81 %
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.2875ms 1.6652ms 600.5351 Ops/s 616.3160 Ops/s 2.56 %
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.3685ms 4.7617ms 210.0109 Ops/s 199.3539 Ops/s +5.35%
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.2570ms 0.6952ms 1.4385 KOps/s 1.4703 KOps/s 2.17 %
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9263ms 0.6678ms 1.4975 KOps/s 1.5468 KOps/s 3.19 %
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.4082ms 4.7061ms 212.4879 Ops/s 207.8002 Ops/s + 2.26 %
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8811ms 0.5493ms 1.8206 KOps/s 1.8422 KOps/s 1.17 %
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8587ms 0.5217ms 1.9167 KOps/s 1.9549 KOps/s 1.96 %
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.8004ms 4.8083ms 207.9729 Ops/s 205.3733 Ops/s + 1.27 %
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9580ms 0.5424ms 1.8436 KOps/s 1.8230 KOps/s + 1.13 %
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.9166ms 0.5241ms 1.9082 KOps/s 1.9149 KOps/s 0.35 %
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 9.4720ms 5.0832ms 196.7269 Ops/s 197.5498 Ops/s 0.42 %
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.8361ms 0.7066ms 1.4152 KOps/s 1.3186 KOps/s +7.33%
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8570ms 0.6546ms 1.5277 KOps/s 1.4999 KOps/s + 1.85 %
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.4539ms 4.2260ms 236.6312 Ops/s 227.8934 Ops/s + 3.83 %
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.5711ms 2.2803ms 438.5372 Ops/s 432.9964 Ops/s + 1.28 %
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.5894ms 1.4288ms 699.8862 Ops/s 793.6664 Ops/s -11.82%
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4628s 13.5946ms 73.5585 Ops/s 32.3388 Ops/s +127.46%
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 18.1281ms 2.6098ms 383.1715 Ops/s 458.9659 Ops/s -16.51%
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.8688ms 1.4300ms 699.2966 Ops/s 733.0468 Ops/s 4.60 %
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.9477ms 4.5201ms 221.2317 Ops/s 221.5361 Ops/s 0.14 %
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.9987ms 2.5771ms 388.0256 Ops/s 382.2498 Ops/s + 1.51 %
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.3072ms 1.5727ms 635.8613 Ops/s 640.3240 Ops/s 0.70 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.6168ms 12.2013ms 81.9584 Ops/s 79.7733 Ops/s + 2.74 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.3666ms 14.6155ms 68.4204 Ops/s 68.1481 Ops/s + 0.40 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.8883ms 20.7834ms 48.1153 Ops/s 48.0738 Ops/s + 0.09 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.7647ms 14.7981ms 67.5761 Ops/s 69.0109 Ops/s 2.08 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 22.1026ms 21.3176ms 46.9096 Ops/s 48.1536 Ops/s 2.58 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 16.9257ms 16.1488ms 61.9240 Ops/s 64.0436 Ops/s 3.31 %
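
The Change column in the table above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below recomputes it for three rows copied from the table. The function names are hypothetical, and the 5 % cutoff for counting a benchmark as improved or worsened is an assumption inferred from which rows add up to the reported totals (11 improved, 7 worsened); it is not taken from the benchmark workflow itself.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of this PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; threshold_pct is an assumed cutoff, not a documented one."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_simple": (1.8556, 1.8968),
    "test_transformed": (0.9232, 0.9817),
    "test_values[vec_td1_return_estimate-False-False]": (40.8514, 37.4598),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

A positive value means the PR ran more operations per second than HEAD. Small differences of a few percent are typically within run-to-run noise on shared CI machines.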

[ghstack-poisoned]
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Mar 6, 2025
ghstack-source-id: b6657fc202e42b25c76b19602e71e1aebd196abf
Pull Request resolved: #2824
vmoens added 2 commits March 6, 2025 14:29
[ghstack-poisoned]
[ghstack-poisoned]
⚠ Warning
Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 28. Worsened: 16.

Name Max Mean Ops Ops on Repo HEAD Change (relative to Repo HEAD; values shown without a sign are decreases)
test_simple 0.9117s 0.8162s 1.2252 Ops/s 1.2156 Ops/s + 0.79 %
test_transformed 1.5041s 1.4111s 0.7087 Ops/s 0.6617 Ops/s +7.11%
test_serial 2.3604s 2.2746s 0.4396 Ops/s 0.4162 Ops/s +5.64%
test_parallel 1.9048s 1.8548s 0.5391 Ops/s 0.5303 Ops/s + 1.67 %
test_step_mdp_speed[True-True-True-True-True] 0.2326ms 41.2224μs 24.2587 KOps/s 26.0468 KOps/s -6.87%
test_step_mdp_speed[True-True-True-True-False] 0.4177ms 23.8533μs 41.9229 KOps/s 42.9173 KOps/s 2.32 %
test_step_mdp_speed[True-True-True-False-True] 0.4097ms 22.5294μs 44.3865 KOps/s 45.8488 KOps/s 3.19 %
test_step_mdp_speed[True-True-True-False-False] 45.4600μs 13.0657μs 76.5362 KOps/s 79.5226 KOps/s 3.76 %
test_step_mdp_speed[True-True-False-True-True] 0.4387ms 44.2040μs 22.6224 KOps/s 23.6587 KOps/s 4.38 %
test_step_mdp_speed[True-True-False-True-False] 0.4166ms 25.9430μs 38.5460 KOps/s 39.9107 KOps/s 3.42 %
test_step_mdp_speed[True-True-False-False-True] 0.4180ms 25.0676μs 39.8921 KOps/s 40.3663 KOps/s 1.17 %
test_step_mdp_speed[True-True-False-False-False] 80.6310μs 14.9002μs 67.1132 KOps/s 65.5725 KOps/s + 2.35 %
test_step_mdp_speed[True-False-True-True-True] 83.7000μs 44.9011μs 22.2712 KOps/s 22.4489 KOps/s 0.79 %
test_step_mdp_speed[True-False-True-True-False] 68.0510μs 28.0793μs 35.6134 KOps/s 35.9494 KOps/s 0.93 %
test_step_mdp_speed[True-False-True-False-True] 59.6310μs 25.0750μs 39.8803 KOps/s 41.2853 KOps/s 3.40 %
test_step_mdp_speed[True-False-True-False-False] 53.4700μs 15.4596μs 64.6847 KOps/s 65.7865 KOps/s 1.67 %
test_step_mdp_speed[True-False-False-True-True] 86.4110μs 47.2147μs 21.1799 KOps/s 21.8082 KOps/s 2.88 %
test_step_mdp_speed[True-False-False-True-False] 65.3300μs 30.0529μs 33.2746 KOps/s 33.6381 KOps/s 1.08 %
test_step_mdp_speed[True-False-False-False-True] 82.1610μs 26.6695μs 37.4960 KOps/s 37.6280 KOps/s 0.35 %
test_step_mdp_speed[True-False-False-False-False] 0.4868ms 17.5864μs 56.8621 KOps/s 57.5298 KOps/s 1.16 %
test_step_mdp_speed[False-True-True-True-True] 78.7410μs 45.5955μs 21.9320 KOps/s 22.4873 KOps/s 2.47 %
test_step_mdp_speed[False-True-True-True-False] 69.7910μs 28.2450μs 35.4045 KOps/s 35.9474 KOps/s 1.51 %
test_step_mdp_speed[False-True-True-False-True] 2.6431ms 29.2608μs 34.1754 KOps/s 35.6644 KOps/s 4.18 %
test_step_mdp_speed[False-True-True-False-False] 55.5510μs 17.1798μs 58.2078 KOps/s 60.3078 KOps/s 3.48 %
test_step_mdp_speed[False-True-False-True-True] 84.1610μs 47.2333μs 21.1715 KOps/s 21.5418 KOps/s 1.72 %
test_step_mdp_speed[False-True-False-True-False] 62.7100μs 30.6592μs 32.6166 KOps/s 32.8294 KOps/s 0.65 %
test_step_mdp_speed[False-True-False-False-True] 89.2310μs 31.1490μs 32.1037 KOps/s 32.6411 KOps/s 1.65 %
test_step_mdp_speed[False-True-False-False-False] 82.1010μs 19.0621μs 52.4601 KOps/s 51.7013 KOps/s + 1.47 %
test_step_mdp_speed[False-False-True-True-True] 93.3620μs 50.3422μs 19.8640 KOps/s 19.8662 KOps/s 0.01 %
test_step_mdp_speed[False-False-True-True-False] 0.4304ms 32.8182μs 30.4709 KOps/s 30.3720 KOps/s + 0.33 %
test_step_mdp_speed[False-False-True-False-True] 0.4179ms 31.7629μs 31.4832 KOps/s 32.7872 KOps/s 3.98 %
test_step_mdp_speed[False-False-True-False-False] 0.4301ms 19.3114μs 51.7828 KOps/s 52.0148 KOps/s 0.45 %
test_step_mdp_speed[False-False-False-True-True] 0.4480ms 51.9528μs 19.2482 KOps/s 19.6928 KOps/s 2.26 %
test_step_mdp_speed[False-False-False-True-False] 80.9110μs 35.6615μs 28.0415 KOps/s 28.6493 KOps/s 2.12 %
test_step_mdp_speed[False-False-False-False-True] 0.4418ms 32.9345μs 30.3633 KOps/s 30.7396 KOps/s 1.22 %
test_step_mdp_speed[False-False-False-False-False] 48.8010μs 21.5813μs 46.3363 KOps/s 47.8691 KOps/s 3.20 %
test_values[generalized_advantage_estimate-True-True] 24.3954ms 23.8273ms 41.9687 Ops/s 40.4191 Ops/s + 3.83 %
test_values[vec_generalized_advantage_estimate-True-True] 0.1153s 3.1941ms 313.0821 Ops/s 299.5932 Ops/s + 4.50 %
test_values[td0_return_estimate-False-False] 0.1156ms 77.3509μs 12.9281 KOps/s 12.8076 KOps/s + 0.94 %
test_values[td1_return_estimate-False-False] 53.9024ms 53.1274ms 18.8227 Ops/s 18.6744 Ops/s + 0.79 %
test_values[vec_td1_return_estimate-False-False] 1.3442ms 1.0706ms 934.0182 Ops/s 935.2057 Ops/s 0.13 %
test_values[td_lambda_return_estimate-True-False] 84.3882ms 83.9156ms 11.9167 Ops/s 11.1572 Ops/s +6.81%
test_values[vec_td_lambda_return_estimate-True-False] 1.3685ms 1.0757ms 929.6604 Ops/s 935.8886 Ops/s 0.67 %
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.1099ms 23.7515ms 42.1026 Ops/s 42.0939 Ops/s + 0.02 %
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0071ms 0.7326ms 1.3650 KOps/s 1.3492 KOps/s + 1.17 %
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 1.0632ms 0.6527ms 1.5322 KOps/s 1.4930 KOps/s + 2.63 %
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5300ms 1.4718ms 679.4534 Ops/s 675.7366 Ops/s + 0.55 %
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.0719ms 0.6631ms 1.5080 KOps/s 1.5020 KOps/s + 0.40 %
test_dqn_speed[False-None] 1.8704ms 1.4669ms 681.7320 Ops/s 656.5600 Ops/s + 3.83 %
test_dqn_speed[False-backward] 2.1026ms 2.0610ms 485.1997 Ops/s 475.5464 Ops/s + 2.03 %
test_dqn_speed[True-None] 0.6695ms 0.5491ms 1.8211 KOps/s 1.6867 KOps/s +7.97%
test_dqn_speed[True-backward] 1.2840ms 1.2274ms 814.6971 Ops/s 874.0176 Ops/s -6.79%
test_dqn_speed[reduce-overhead-None] 0.9756ms 0.5758ms 1.7368 KOps/s 1.7015 KOps/s + 2.08 %
test_dqn_speed[reduce-overhead-backward] 1.4492ms 1.0721ms 932.7486 Ops/s 1.0322 KOps/s -9.64%
test_ddpg_speed[False-None] 3.1326ms 2.7683ms 361.2337 Ops/s 352.1326 Ops/s + 2.58 %
test_ddpg_speed[False-backward] 4.5071ms 4.0865ms 244.7104 Ops/s 247.4089 Ops/s 1.09 %
test_ddpg_speed[True-None] 1.4285ms 1.3310ms 751.3327 Ops/s 753.5211 Ops/s 0.29 %
test_ddpg_speed[True-backward] 2.6796ms 2.5811ms 387.4306 Ops/s 384.8580 Ops/s + 0.67 %
test_ddpg_speed[reduce-overhead-None] 1.7343ms 1.3556ms 737.6984 Ops/s 745.5990 Ops/s 1.06 %
test_ddpg_speed[reduce-overhead-backward] 2.0899ms 2.0332ms 491.8444 Ops/s 509.2130 Ops/s 3.41 %
test_sac_speed[False-None] 8.2276ms 7.8231ms 127.8261 Ops/s 119.9116 Ops/s +6.60%
test_sac_speed[False-backward] 11.5819ms 10.8566ms 92.1102 Ops/s 89.3088 Ops/s + 3.14 %
test_sac_speed[True-None] 2.0091ms 1.8336ms 545.3720 Ops/s 543.2674 Ops/s + 0.39 %
test_sac_speed[True-backward] 3.8339ms 3.7499ms 266.6705 Ops/s 262.0953 Ops/s + 1.75 %
test_sac_speed[reduce-overhead-None] 21.3208ms 12.0733ms 82.8272 Ops/s 82.8208 Ops/s + 0.01 %
test_sac_speed[reduce-overhead-backward] 1.6460ms 1.5711ms 636.5036 Ops/s 560.6208 Ops/s +13.54%
test_redq_speed[False-None] 8.0938ms 7.4530ms 134.1737 Ops/s 130.3260 Ops/s + 2.95 %
test_redq_speed[False-backward] 11.7555ms 11.1485ms 89.6982 Ops/s 85.1207 Ops/s +5.38%
test_redq_speed[True-None] 2.5764ms 2.3792ms 420.3146 Ops/s 428.3232 Ops/s 1.87 %
test_redq_speed[True-backward] 4.1965ms 4.0310ms 248.0771 Ops/s 244.5034 Ops/s + 1.46 %
test_redq_speed[reduce-overhead-None] 2.4976ms 2.3375ms 427.8064 Ops/s 426.1166 Ops/s + 0.40 %
test_redq_speed[reduce-overhead-backward] 4.1689ms 4.0410ms 247.4643 Ops/s 234.4662 Ops/s +5.54%
test_redq_deprec_speed[False-None] 9.2832ms 8.8029ms 113.5985 Ops/s 110.7964 Ops/s + 2.53 %
test_redq_deprec_speed[False-backward] 12.2953ms 11.6571ms 85.7849 Ops/s 82.1633 Ops/s + 4.41 %
test_redq_deprec_speed[True-None] 2.7300ms 2.6092ms 383.2620 Ops/s 377.7929 Ops/s + 1.45 %
test_redq_deprec_speed[True-backward] 4.6866ms 4.3242ms 231.2540 Ops/s 215.8796 Ops/s +7.12%
test_redq_deprec_speed[reduce-overhead-None] 2.8453ms 2.6257ms 380.8523 Ops/s 375.5014 Ops/s + 1.43 %
test_redq_deprec_speed[reduce-overhead-backward] 4.4673ms 4.3195ms 231.5092 Ops/s 220.0422 Ops/s +5.21%
test_td3_speed[False-None] 8.0662ms 7.7556ms 128.9390 Ops/s 125.8749 Ops/s + 2.43 %
test_td3_speed[False-backward] 10.8044ms 10.0031ms 99.9687 Ops/s 96.4304 Ops/s + 3.67 %
test_td3_speed[True-None] 1.7821ms 1.6486ms 606.5593 Ops/s 598.9646 Ops/s + 1.27 %
test_td3_speed[True-backward] 3.2965ms 3.2049ms 312.0220 Ops/s 292.8063 Ops/s +6.56%
test_td3_speed[reduce-overhead-None] 76.9785ms 26.5495ms 37.6655 Ops/s 37.8798 Ops/s 0.57 %
test_td3_speed[reduce-overhead-backward] 1.3660ms 1.3120ms 762.2018 Ops/s 677.1422 Ops/s +12.56%
test_cql_speed[False-None] 16.7586ms 16.2939ms 61.3725 Ops/s 59.6459 Ops/s + 2.89 %
test_cql_speed[False-backward] 21.7788ms 21.2274ms 47.1090 Ops/s 45.2212 Ops/s + 4.17 %
test_cql_speed[True-None] 3.3729ms 3.2566ms 307.0711 Ops/s 306.8274 Ops/s + 0.08 %
test_cql_speed[True-backward] 6.1401ms 5.6041ms 178.4408 Ops/s 170.5149 Ops/s + 4.65 %
test_cql_speed[reduce-overhead-None] 0.5907s 16.3383ms 61.2059 Ops/s 74.9624 Ops/s -18.35%
test_cql_speed[reduce-overhead-backward] 1.8311ms 1.7918ms 558.0862 Ops/s 537.4650 Ops/s + 3.84 %
test_a2c_speed[False-None] 3.1421ms 3.0487ms 328.0091 Ops/s 311.5323 Ops/s +5.29%
test_a2c_speed[False-backward] 6.4944ms 5.8446ms 171.0987 Ops/s 161.1082 Ops/s +6.20%
test_a2c_speed[True-None] 1.4243ms 1.3289ms 752.5073 Ops/s 731.6069 Ops/s + 2.86 %
test_a2c_speed[True-backward] 3.0000ms 2.9008ms 344.7327 Ops/s 332.3225 Ops/s + 3.73 %
test_a2c_speed[reduce-overhead-None] 16.1922ms 9.2109ms 108.5670 Ops/s 107.6617 Ops/s + 0.84 %
test_a2c_speed[reduce-overhead-backward] 1.5376ms 1.4509ms 689.2065 Ops/s 615.7411 Ops/s +11.93%
test_ppo_speed[False-None] 3.7043ms 3.5582ms 281.0447 Ops/s 267.2722 Ops/s +5.15%
test_ppo_speed[False-backward] 6.9932ms 6.5594ms 152.4525 Ops/s 143.9911 Ops/s +5.88%
test_ppo_speed[True-None] 1.6245ms 1.4015ms 713.5072 Ops/s 688.9415 Ops/s + 3.57 %
test_ppo_speed[True-backward] 3.1995ms 3.0733ms 325.3841 Ops/s 301.9716 Ops/s +7.75%
test_ppo_speed[reduce-overhead-None] 1.0332ms 0.9592ms 1.0426 KOps/s 1.0110 KOps/s + 3.12 %
test_ppo_speed[reduce-overhead-backward] 1.4526ms 1.4010ms 713.7551 Ops/s 621.9193 Ops/s +14.77%
test_reinforce_speed[False-None] 2.2824ms 2.1867ms 457.3116 Ops/s 442.5956 Ops/s + 3.32 %
test_reinforce_speed[False-backward] 3.7135ms 3.1699ms 315.4698 Ops/s 295.7815 Ops/s +6.66%
test_reinforce_speed[True-None] 1.3853ms 1.2743ms 784.7500 Ops/s 735.1217 Ops/s +6.75%
test_reinforce_speed[True-backward] 3.1932ms 3.0989ms 322.6925 Ops/s 324.8383 Ops/s 0.66 %
test_reinforce_speed[reduce-overhead-None] 19.3336ms 10.4676ms 95.5329 Ops/s 93.3402 Ops/s + 2.35 %
test_reinforce_speed[reduce-overhead-backward] 1.6067ms 1.4821ms 674.7268 Ops/s 656.3941 Ops/s + 2.79 %
test_iql_speed[False-None] 9.4312ms 8.9777ms 111.3871 Ops/s 107.7695 Ops/s + 3.36 %
test_iql_speed[False-backward] 12.9225ms 12.4054ms 80.6099 Ops/s 77.7276 Ops/s + 3.71 %
test_iql_speed[True-None] 2.4393ms 2.2182ms 450.8179 Ops/s 439.9517 Ops/s + 2.47 %
test_iql_speed[True-backward] 4.9514ms 4.7971ms 208.4595 Ops/s 203.3689 Ops/s + 2.50 %
test_iql_speed[reduce-overhead-None] 0.5277s 13.2890ms 75.2504 Ops/s 87.4632 Ops/s -13.96%
test_iql_speed[reduce-overhead-backward] 1.9189ms 1.8543ms 539.3006 Ops/s 517.3882 Ops/s + 4.24 %
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6259ms 6.3011ms 158.7013 Ops/s 158.5520 Ops/s + 0.09 %
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5180ms 0.3099ms 3.2267 KOps/s 3.2187 KOps/s + 0.25 %
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5968ms 0.2944ms 3.3973 KOps/s 3.4052 KOps/s 0.23 %
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2862ms 6.0341ms 165.7245 Ops/s 166.2643 Ops/s 0.32 %
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7859ms 0.3276ms 3.0522 KOps/s 3.6232 KOps/s -15.76%
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5509ms 0.3131ms 3.1937 KOps/s 3.1366 KOps/s + 1.82 %
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6487ms 1.4081ms 710.1711 Ops/s 747.6938 Ops/s -5.02%
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5129ms 1.3212ms 756.8609 Ops/s 862.7263 Ops/s -12.27%
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2303ms 6.0945ms 164.0817 Ops/s 160.6620 Ops/s + 2.13 %
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9772ms 0.3953ms 2.5295 KOps/s 2.3007 KOps/s +9.95%
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6794ms 0.4573ms 2.1868 KOps/s 2.3241 KOps/s -5.91%
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1042ms 5.9579ms 167.8448 Ops/s 163.8610 Ops/s + 2.43 %
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1396ms 0.3515ms 2.8446 KOps/s 3.4036 KOps/s -16.42%
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5692ms 0.3199ms 3.1263 KOps/s 3.5647 KOps/s -12.30%
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 9.2597ms 5.9643ms 167.6651 Ops/s 166.4635 Ops/s + 0.72 %
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1476ms 0.3314ms 3.0175 KOps/s 3.2497 KOps/s -7.15%
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4437ms 0.2348ms 4.2589 KOps/s 3.5937 KOps/s +18.51%
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2766ms 6.1337ms 163.0343 Ops/s 161.2015 Ops/s + 1.14 %
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7257ms 0.3948ms 2.5326 KOps/s 2.0473 KOps/s +23.70%
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5955ms 0.3816ms 2.6203 KOps/s 2.3273 KOps/s +12.59%
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9684ms 5.3601ms 186.5629 Ops/s 180.3433 Ops/s + 3.45 %
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.2686ms 2.0152ms 496.2206 Ops/s 438.2735 Ops/s +13.22%
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.9253ms 1.2631ms 791.6720 Ops/s 880.2470 Ops/s -10.06%
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.8852ms 5.5609ms 179.8259 Ops/s 178.3510 Ops/s + 0.83 %
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.1252ms 2.0808ms 480.5901 Ops/s 444.3635 Ops/s +8.15%
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.3916ms 1.3101ms 763.2777 Ops/s 814.0927 Ops/s -6.24%
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5197s 16.0087ms 62.4660 Ops/s 30.5654 Ops/s +104.37%
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.2336ms 2.3034ms 434.1457 Ops/s 462.4868 Ops/s -6.13%
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.5195ms 1.4232ms 702.6380 Ops/s 885.8065 Ops/s -20.68%
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.5411ms 13.2951ms 75.2155 Ops/s 72.9793 Ops/s + 3.06 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.2369ms 16.6084ms 60.2105 Ops/s 60.4200 Ops/s 0.35 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.3796ms 17.7444ms 56.3557 Ops/s 54.7198 Ops/s + 2.99 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.2894ms 16.7530ms 59.6908 Ops/s 59.7705 Ops/s 0.13 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.9087ms 17.7832ms 56.2330 Ops/s 54.5413 Ops/s + 3.10 %
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.6394ms 18.1160ms 55.1998 Ops/s 55.3986 Ops/s 0.36 %
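
The Ops column in these tables looks like the reciprocal of the Mean column once the time unit is converted to seconds. The sketch below checks that reading against three rows from the GPU table. The helper names are hypothetical, and the results match the table only up to the rounding of the printed means.

```python
import re

# Seconds per unit; both the Greek mu and the micro sign are accepted.
_SECONDS_PER_UNIT = {
    "s": 1.0,
    "ms": 1e-3,
    "\u03bcs": 1e-6,
    "\u00b5s": 1e-6,
    "us": 1e-6,
}


def parse_duration(text: str) -> float:
    """Convert a table cell such as '1.4669ms' to seconds."""
    match = re.fullmatch(r"([0-9]*\.?[0-9]+)(\D+)", text.strip())
    if match is None or match.group(2) not in _SECONDS_PER_UNIT:
        raise ValueError(f"unrecognized duration: {text!r}")
    return float(match.group(1)) * _SECONDS_PER_UNIT[match.group(2)]


def ops_per_second(mean: str) -> float:
    """Throughput implied by a mean time per operation."""
    return 1.0 / parse_duration(mean)


# Mean values copied from the GPU table above.
for name, mean in [
    ("test_simple", "0.8162s"),
    ("test_dqn_speed[False-None]", "1.4669ms"),
    ("test_step_mdp_speed[True-True-True-True-True]", "41.2224\u03bcs"),
]:
    print(f"{name}: {ops_per_second(mean):.4f} Ops/s")
```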

@vmoens added the enhancement label on Mar 11, 2025
@vmoens merged commit 2d86afc into gh/vmoens/101/base on Mar 11, 2025
59 of 72 checks passed
vmoens added a commit that referenced this pull request Mar 11, 2025
ghstack-source-id: 90e0ff99b1a1e1dbc86781df842097505f862361
Pull Request resolved: #2824
@vmoens deleted the gh/vmoens/101/head branch on March 11, 2025 13:38
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement: New feature or request

2 participants