Explore Optimizing and Running Tests in Parallel for Faster CI in litdata
#612
Slowest test runs:
pytester (ubuntu-22.04, 3.11)
============================ slowest 100 durations =============================
195.33s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_with_large_num_chunks
60.00s teardown tests/streaming/test_dataloader.py::test_custom_collate_multiworker
43.44s call tests/processing/test_functions.py::test_optimize_append_overwrite
40.00s teardown tests/streaming/test_dataset.py::test_resumable_dataset_two_workers_2_epochs
40.00s teardown tests/streaming/test_dataloader.py::test_dataloader_states_with_persistent_workers
36.46s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even_multi_nodes[zstd-True]
36.45s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even_multi_nodes[zstd-False]
36.14s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even[zstd-True]
36.14s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even[zstd-False]
33.47s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_odd[zstd-False]
33.47s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_odd[zstd-True]
28.51s call tests/streaming/test_dataset.py::test_optimize_dataset[False-None-5]
28.43s call tests/streaming/test_dataset.py::test_optimize_dataset[False-None-75]
28.43s call tests/streaming/test_dataset.py::test_optimize_dataset[False-None-1200]
25.24s call tests/processing/test_functions.py::test_optimize_with_fernet_encryption
22.45s call tests/streaming/test_cache.py::test_cache_for_image_dataset_distributed[2]
20.00s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-None-5]
20.00s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-None-1200]
20.00s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-None-75]
20.00s teardown tests/processing/test_functions.py::test_map_with_text_files[False]
20.00s teardown tests/processing/test_functions.py::test_optimize_with_text_files[False]
20.00s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-64MB-None]
19.79s call tests/streaming/test_dataset.py::test_dataset_resume_on_future_chunks[True]
19.22s call tests/streaming/test_dataset.py::test_dataset_resume_on_future_chunks[False]
18.67s call tests/streaming/test_dataloader.py::test_resume_parallel_dataset[simple_transform-2-None]
17.47s call tests/processing/test_functions.py::test_optimize_with_rsa_encryption
17.34s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_distributed_num_workers_end_to_end
16.90s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[4-15]
14.74s call tests/streaming/test_dataset.py::test_optimize_dataset[True-None-75]
14.74s call tests/streaming/test_dataset.py::test_optimize_dataset[True-64MB-None]
14.74s call tests/streaming/test_dataset.py::test_optimize_dataset[True-None-1200]
14.73s call tests/streaming/test_dataset.py::test_optimize_dataset[True-None-5]
14.70s call tests/streaming/test_dataset.py::test_optimize_dataset[False-64MB-None]
13.03s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_2_epochs_int_length
12.97s call tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_partial_iterations[None-3-2]
12.96s call tests/streaming/test_dataset.py::test_dataset_with_mosaic_mds_data
12.71s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_2_epochs_none_length
12.68s call tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_partial_iterations[48-3-2]
12.38s call tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_partial_iterations[20-3-2]
12.18s call tests/processing/test_functions.py::test_map_with_path
12.14s call tests/streaming/test_combined.py::test_combined_dataset_with_dataloader_2_epochs
11.86s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[4-10]
11.53s call tests/processing/test_data_processor.py::test_data_processsor_nlp
11.33s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[False-random-2-None]
11.33s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[False-torch-2-7]
11.26s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[False-torch-2-None]
11.11s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[True-random-2-None]
11.08s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[True-numpy-2-None]
11.07s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[False-numpy-2-7]
11.01s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[False-random-2-7]
11.00s call tests/streaming/test_dataset.py::test_subsample_streaming_dataset_with_token_loader
10.99s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[True-numpy-2-7]
10.96s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[True-torch-2-7]
10.96s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[False-numpy-2-None]
10.94s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[True-random-2-7]
10.94s call tests/streaming/test_parallel.py::test_parallel_dataset_rng[True-torch-2-None]
10.41s call tests/streaming/test_dataloader.py::test_dataloader_with_loading_states
8.72s call tests/processing/test_data_processor.py::test_data_processsor[10-True]
8.69s call tests/streaming/test_dataloader.py::test_resume_parallel_dataset[rng_transform-2-None]
8.64s call tests/streaming/test_dataloader.py::test_resume_parallel_dataset[rng_transform-2-7]
8.61s call tests/streaming/test_dataloader.py::test_resume_parallel_dataset[simple_transform-2-7]
8.55s call tests/streaming/test_dataloader.py::test_resume_parallel_dataset[None-2-7]
8.52s call tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_complete_iterations[24-2]
8.50s call tests/streaming/test_dataloader.py::test_resume_parallel_dataset[None-2-None]
8.47s call tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_complete_iterations[None-2]
8.40s call tests/processing/test_functions.py::test_optimize_with_queues_as_input[2]
8.34s call tests/streaming/test_dataset.py::test_resumable_dataset_two_workers_2_epochs
8.08s call tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_partial_iterations[20-7-2]
7.93s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_complete_iterations[4]
7.86s call tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_partial_iterations[48-7-2]
7.85s call tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_partial_iterations[None-7-2]
7.64s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_with_large_block_size_multiple_workers
7.50s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[2-15]
7.47s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[2-10]
7.23s call tests/utilities/test_train_test_split.py::test_train_test_split_with_streaming_dataloader[zstd]
7.16s call tests/processing/test_functions.py::test_optimize_race_condition
6.84s call tests/streaming/test_dataset.py::test_dataset_reshuffling_every_epoch
6.73s call tests/processing/test_readers.py::test_reader
6.72s call tests/processing/test_data_processor.py::test_data_processing_optimize
6.70s call tests/processing/test_data_processor.py::test_map_is_last[2-expected1]
6.68s call tests/processing/test_functions.py::test_optimize_with_text_files[False]
6.68s call tests/processing/test_data_processor.py::test_data_processing_optimize_class_yield
6.66s call tests/processing/test_data_processor.py::test_data_processing_map
6.65s call tests/processing/test_data_processor.py::test_data_process_transform
6.65s call tests/processing/test_data_processor.py::test_data_processing_optimize_class
6.65s call tests/processing/test_functions.py::test_map_with_text_files[True]
6.63s call tests/processing/test_functions.py::test_optimize_with_text_files[True]
6.62s call tests/processing/test_functions.py::test_map_with_text_files[False]
6.28s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_no_shuffle[zstd-False]
6.07s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_no_shuffle[zstd-True]
6.00s call tests/processing/test_functions.py::test_optimize_with_queues_as_input[1]
5.95s call tests/streaming/test_dataloader.py::test_resume_dataloader_with_new_dataset
5.60s call tests/processing/test_functions.py::test_optimize_with_jpeg_array
5.55s call tests/streaming/test_dataloader.py::test_resume_dataloader_after_some_workers_are_done
5.47s call tests/streaming/test_reader.py::test_reader_chunk_removal
5.47s call tests/streaming/test_reader.py::test_reader_chunk_removal_compressed
5.16s call tests/processing/test_data_processor.py::test_map_batch_size
5.13s call tests/processing/test_data_processor.py::test_empty_optimize[inputs0]
5.12s call tests/processing/test_data_processor.py::test_data_processing_map_without_input_dir_and_folder
5.12s call tests/processing/test_data_processor.py::test_map_is_last[1-expected0]
========== 389 passed, 10 skipped, 19 warnings in 1826.10s (0:30:26) ===========
pytester (macos-14, 3.11)
============================ slowest 100 durations =============================
345.54s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_with_large_num_chunks
69.88s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[4-10]
68.63s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[4-15]
60.26s teardown tests/streaming/test_dataloader.py::test_custom_collate_multiworker
57.23s call tests/streaming/test_combined.py::test_combined_dataset_with_dataloader_2_epochs
56.89s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even_multi_nodes[zstd-False]
56.40s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even_multi_nodes[zstd-True]
49.28s call tests/streaming/test_dataloader.py::test_dataloader_with_loading_states
46.51s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_odd[zstd-False]
46.47s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_odd[zstd-True]
45.77s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_complete_iterations[4]
44.36s call tests/streaming/test_cache.py::test_cache_for_image_dataset_distributed[2]
40.27s teardown tests/streaming/test_dataset.py::test_resumable_dataset_two_workers_2_epochs
40.12s teardown tests/streaming/test_dataloader.py::test_dataloader_states_with_persistent_workers
38.81s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_distributed_num_workers_end_to_end
37.23s call tests/processing/test_functions.py::test_optimize_append_overwrite
35.67s call tests/streaming/test_dataset.py::test_dataset_resume_on_future_chunks[True]
35.52s call tests/streaming/test_dataset.py::test_dataset_resume_on_future_chunks[False]
34.97s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[2-15]
34.91s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[2-10]
30.92s call tests/streaming/test_dataset.py::test_optimize_dataset[False-None-1200]
30.25s call tests/streaming/test_dataset.py::test_optimize_dataset[False-None-75]
29.69s call tests/streaming/test_dataset.py::test_optimize_dataset[False-64MB-None]
29.04s call tests/streaming/test_dataset.py::test_optimize_dataset[False-None-5]
28.96s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_with_large_block_size_multiple_workers
26.34s call tests/streaming/test_dataset.py::test_optimize_dataset[True-64MB-None]
25.90s call tests/streaming/test_dataloader.py::test_resume_dataloader_with_new_dataset
24.53s call tests/processing/test_functions.py::test_optimize_with_fernet_encryption
24.29s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_complete_iterations[2]
21.50s call tests/streaming/test_dataset.py::test_dataset_reshuffling_every_epoch
20.11s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-64MB-None]
20.11s teardown tests/processing/test_functions.py::test_optimize_with_text_files[False]
20.10s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-None-1200]
20.09s teardown tests/processing/test_functions.py::test_map_with_text_files[False]
20.09s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-None-75]
20.01s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-None-5]
19.05s call tests/streaming/test_dataloader.py::test_resume_dataloader_after_some_workers_are_done
18.82s call tests/streaming/test_dataset.py::test_dataset_valid_state_override
17.89s call tests/processing/test_functions.py::test_optimize_with_rsa_encryption
16.34s call tests/streaming/test_dataset.py::test_resumable_dataset_two_workers_2_epochs
15.59s call tests/streaming/test_dataloader.py::test_dataloader_no_workers
14.74s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_multiple_workers
13.98s call tests/streaming/test_dataloader.py::test_dataloader_states_with_persistent_workers
13.61s call tests/streaming/test_combined.py::test_combined_dataset_with_per_stream_batching[2-4]
13.29s call tests/streaming/test_combined.py::test_combined_dataset_with_per_stream_batching[2-2]
12.74s call tests/streaming/test_dataset.py::test_optimize_dataset[True-None-5]
12.42s call tests/streaming/test_dataset.py::test_optimize_dataset[True-None-1200]
12.23s call tests/streaming/test_dataset.py::test_optimize_dataset[True-None-75]
11.19s call tests/utilities/test_train_test_split.py::test_train_test_split_with_streaming_dataloader[zstd]
10.85s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_no_shuffle[zstd-True]
10.12s call tests/processing/test_data_processor.py::test_data_processsor_distributed[False-False]
10.04s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_no_shuffle[zstd-False]
9.84s call tests/streaming/test_reader.py::test_reader_chunk_removal
9.43s call tests/processing/test_functions.py::test_map_with_path
9.34s call tests/streaming/test_dataset.py::test_dataset_valid_state
8.92s call tests/processing/test_data_processor.py::test_data_processsor_nlp
8.76s call tests/streaming/test_dataset.py::test_subsample_streaming_dataset_with_token_loader
8.46s call tests/streaming/test_reader.py::test_reader_chunk_removal_compressed
8.14s call tests/streaming/test_combined.py::test_combined_dataset_with_per_stream_batching[1-2]
8.02s call tests/processing/test_functions.py::test_optimize_with_queues_as_input[2]
8.00s call tests/processing/test_readers.py::test_parquet_reader
7.99s call tests/streaming/test_combined.py::test_combined_dataset_with_per_stream_batching[1-4]
7.76s call tests/streaming/test_dataset.py::test_streaming_dataset_deepcopy
7.60s call tests/processing/test_data_processor.py::test_data_processing_optimize_class
7.54s call tests/processing/test_data_processor.py::test_data_processsor[10-True]
7.46s call tests/processing/test_functions.py::test_optimize_with_queues_as_input[1]
7.30s call tests/processing/test_functions.py::test_map_with_text_files[False]
6.92s call tests/processing/test_functions.py::test_optimize_race_condition
6.89s call tests/streaming/test_combined.py::test_combined_dataset
6.88s call tests/streaming/test_combined.py::test_combined_dataset_with_dataloader_and_one_worker[2]
6.83s call tests/streaming/test_combined.py::test_combined_dataset_with_dataloader_and_one_worker[1]
6.67s call tests/processing/test_data_processor.py::test_map_is_last[2-expected1]
6.59s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_and_one_worker[2-None-expected1-num_samples_yielded1-num_cycles1]
6.49s call tests/processing/test_functions.py::test_optimize_with_text_files[True]
6.44s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_and_one_worker[1-None-expected0-num_samples_yielded0-num_cycles0]
6.36s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_and_one_worker[2-13-expected3-num_samples_yielded3-num_cycles3]
6.35s call tests/processing/test_data_processor.py::test_data_processing_optimize_class_yield
6.29s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_and_one_worker[1-13-expected2-num_samples_yielded2-num_cycles2]
5.84s call tests/processing/test_readers.py::test_reader
5.78s call tests/processing/test_data_processor.py::test_data_processing_map
5.66s call tests/processing/test_data_processor.py::test_data_processing_optimize
5.48s call tests/processing/test_functions.py::test_optimize_with_text_files[False]
5.40s call tests/streaming/test_dataset.py::test_streaming_dataset_max_cache_dir
5.27s call tests/processing/test_data_processor.py::test_data_processing_map_without_input_dir_remote
5.12s call tests/processing/test_data_processor.py::test_data_processing_map_without_input_dir_local
5.11s call tests/streaming/test_dataset.py::test_streaming_dataset[zstd]
5.01s call tests/processing/test_data_processor.py::test_map_is_last[1-expected0]
5.00s call tests/processing/test_data_processor.py::test_data_processing_map_non_absolute_path
4.85s call tests/processing/test_data_processor.py::test_data_processing_optimize_yield
4.80s call tests/processing/test_functions.py::test_optimize_with_jpeg_array
4.77s call tests/streaming/test_parquet.py::test_stream_hf_parquet_dataset[True-False]
4.69s call tests/processing/test_data_processor.py::test_empty_optimize[inputs2]
4.66s call tests/processing/test_data_processor.py::test_data_process_transform
4.64s call tests/processing/test_functions.py::test_map_with_text_files[True]
4.59s call tests/processing/test_data_processor.py::test_data_processing_map_without_input_dir_and_folder
4.42s call tests/processing/test_data_processor.py::test_empty_optimize[inputs0]
4.33s call tests/processing/test_data_processor.py::test_map_batch_size
4.21s call tests/streaming/test_dataset.py::test_dataset_with_mosaic_mds_data
4.20s call tests/processing/test_data_processor.py::test_empty_optimize[inputs1]
4.17s call tests/streaming/test_parquet.py::test_stream_hf_parquet_dataset[True-True]
========== 333 passed, 66 skipped, 23 warnings in 2295.58s (0:38:15) ===========
pytester (windows-2022, 3.11)
============================ slowest 100 durations ============================
60.03s teardown tests/streaming/test_dataloader.py::test_custom_collate_multiworker
40.02s teardown tests/streaming/test_dataloader.py::test_dataloader_states_with_persistent_workers
38.20s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even_multi_nodes[zstd-False]
38.14s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even_multi_nodes[zstd-True]
37.37s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even[zstd-False]
37.06s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even[zstd-True]
33.90s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_odd[zstd-True]
33.88s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_odd[zstd-False]
30.37s call tests/streaming/test_dataset.py::test_optimize_dataset[False-None-5]
30.02s call tests/streaming/test_dataset.py::test_optimize_dataset[False-None-1200]
29.85s call tests/streaming/test_dataset.py::test_optimize_dataset[False-None-75]
23.28s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_distributed_num_workers_end_to_end
20.01s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-None-75]
20.01s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-None-5]
20.01s teardown tests/processing/test_functions.py::test_map_with_text_files[False]
20.00s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-None-1200]
20.00s teardown tests/streaming/test_dataset.py::test_optimize_dataset[False-64MB-None]
20.00s teardown tests/processing/test_functions.py::test_optimize_with_text_files[False]
19.50s call tests/streaming/test_dataset.py::test_optimize_dataset[True-None-5]
19.23s call tests/streaming/test_dataset.py::test_optimize_dataset[False-64MB-None]
19.20s call tests/streaming/test_dataset.py::test_optimize_dataset[True-64MB-None]
18.81s call tests/streaming/test_dataset.py::test_optimize_dataset[True-None-1200]
18.66s call tests/streaming/test_dataset.py::test_optimize_dataset[True-None-75]
16.59s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[4-15]
16.51s call tests/processing/test_functions.py::test_map_with_path
16.28s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[4-10]
14.60s call tests/streaming/test_dataloader.py::test_dataloader_with_loading_states
14.53s call tests/streaming/test_dataset.py::test_subsample_streaming_dataset_with_token_loader
11.03s call tests/processing/test_functions.py::test_optimize_with_queues_as_input[2]
10.99s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_complete_iterations[4]
10.72s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[2-10]
10.66s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_partial_iterations[2-15]
10.41s call tests/processing/test_readers.py::test_parquet_reader
10.00s setup tests/streaming/test_dataloader.py::test_resume_parallel_dataset[rng_transform-2-None]
9.80s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_with_large_block_size_multiple_workers
9.52s call tests/processing/test_readers.py::test_reader
9.20s call tests/processing/test_functions.py::test_map_with_text_files[True]
8.98s call tests/processing/test_functions.py::test_map_with_text_files[False]
8.60s call tests/processing/test_functions.py::test_optimize_with_text_files[False]
8.50s call tests/processing/test_functions.py::test_optimize_with_text_files[True]
8.50s call tests/processing/test_functions.py::test_optimize_with_queues_as_input[1]
8.42s call tests/streaming/test_dataloader.py::test_resume_dataloader_with_new_dataset
7.68s call tests/streaming/test_dataloader.py::test_resume_dataloader_after_some_workers_are_done
7.62s call tests/processing/test_functions.py::test_optimize_with_jpeg_array
7.60s call tests/utilities/test_train_test_split.py::test_train_test_split_with_streaming_dataloader[zstd]
7.47s call tests/processing/test_data_processor.py::test_map_batch_size
7.45s call tests/processing/test_data_processor.py::test_empty_optimize[inputs0]
7.32s call tests/processing/test_data_processor.py::test_empty_optimize[inputs2]
7.24s call tests/processing/test_data_processor.py::test_empty_optimize[inputs1]
7.20s call tests/streaming/test_dataloader.py::test_dataloader_states_with_persistent_workers
7.16s call tests/streaming/test_combined.py::test_combined_dataset_dataloader_states_complete_iterations[2]
6.57s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_no_shuffle[zstd-False]
6.38s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_no_shuffle[zstd-True]
5.72s call tests/streaming/test_reader.py::test_reader_chunk_removal
5.59s call tests/streaming/test_reader.py::test_reader_chunk_removal_compressed
5.00s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_multiple_workers
4.89s call tests/streaming/test_combined.py::test_combined_dataset_with_per_stream_batching[2-2]
4.82s call tests/streaming/test_combined.py::test_combined_dataset_with_per_stream_batching[2-4]
4.55s call tests/streaming/test_dataset.py::test_streaming_dataset_max_cache_dir
4.36s call tests/streaming/test_combined.py::test_combined_dataset_with_per_stream_batching[1-2]
4.31s call tests/streaming/test_combined.py::test_combined_dataset_with_per_stream_batching[1-4]
3.93s call tests/streaming/test_dataloader.py::test_custom_collate_multiworker
3.85s call tests/streaming/test_dataset.py::test_streaming_dataset_deepcopy
3.66s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_and_one_worker[1-None-expected0-num_samples_yielded0-num_cycles0]
3.66s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_and_one_worker[1-13-expected2-num_samples_yielded2-num_cycles2]
3.64s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_and_one_worker[2-None-expected1-num_samples_yielded1-num_cycles1]
3.63s call tests/streaming/test_dataset.py::test_streaming_dataset[zstd]
3.49s call tests/streaming/test_parallel.py::test_parallel_dataset_with_dataloader_and_one_worker[2-13-expected3-num_samples_yielded3-num_cycles3]
3.40s call tests/streaming/test_combined.py::test_combined_dataset
3.26s call tests/streaming/test_combined.py::test_combined_dataset_with_dataloader_and_one_worker[1]
3.23s call tests/streaming/test_combined.py::test_combined_dataset_with_dataloader_and_one_worker[2]
2.95s call tests/streaming/test_parquet.py::test_stream_hf_parquet_dataset[False-False]
2.94s call tests/streaming/test_parquet.py::test_stream_hf_parquet_dataset[False-True]
2.93s call tests/streaming/test_parquet.py::test_stream_hf_parquet_dataset[True-True]
2.92s call tests/streaming/test_parquet.py::test_stream_hf_parquet_dataset[True-False]
2.20s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even_multi_nodes[None-True]
2.20s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even_multi_nodes[None-False]
1.92s call tests/streaming/test_sampler.py::test_batch_sampler_imagenet
1.71s call tests/streaming/test_dataset.py::test_dataset_as_iterator_and_non_iterator[True-True]
1.70s call tests/streaming/test_dataset.py::test_dataset_as_iterator_and_non_iterator[False-True]
1.66s call tests/streaming/test_parquet.py::test_cache_dir_option[False]
1.66s call tests/streaming/test_parquet.py::test_cache_dir_option[True]
1.57s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_distributed_num_workers
1.55s call tests/streaming/test_dataset.py::test_streaming_dataset[None]
1.53s call tests/streaming/test_dataset.py::test_dataset_cache_recreation
1.51s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even[None-False]
1.51s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_odd[None-False]
1.50s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even[None-True]
1.45s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_odd[None-True]
1.35s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens
1.30s call tests/utilities/test_train_test_split.py::test_split_a_subsampled_dataset[None]
1.30s call tests/utilities/test_train_test_split.py::test_split_a_subsampled_dataset[zstd]
1.19s call tests/streaming/test_dataloader.py::test_dataloader_no_workers
1.15s call tests/utilities/test_encryption.py::test_fernet_encryption
1.14s call tests/streaming/test_reader.py::test_prepare_chunks_thread_eviction
1.09s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_no_shuffle[None-False]
1.09s call tests/streaming/test_dataset.py::test_streaming_dataset_distributed_no_shuffle[None-True]
1.09s call tests/streaming/test_parallel.py::test_dataloader_shuffle[True]
1.08s setup tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_without_any_iterations[None]
1.08s setup tests/streaming/test_parallel.py::test_parallel_dataset_dataloader_states_without_any_iterations[3]
========= 264 passed, 135 skipped, 20 warnings in 1112.93s (0:18:32) ==========
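To see where the time goes before choosing a parallelization strategy, the duration reports above can be totalled per test file and per phase (setup, call, teardown). The following is a minimal sketch, assuming the report is available as plain text in the format shown above; `summarize_durations` is a hypothetical helper name, not part of litdata or pytest.

```python
import re
from collections import defaultdict

# Matches report lines such as:
# "195.33s call tests/streaming/test_dataset.py::test_name[param]"
DURATION_LINE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)s\s+(setup|call|teardown)\s+(\S+?)::(\S.*)$"
)


def summarize_durations(report: str):
    """Sum the reported seconds per test file and per phase (hypothetical helper)."""
    per_file = defaultdict(float)
    per_phase = defaultdict(float)
    for line in report.splitlines():
        match = DURATION_LINE.match(line)
        if match is None:
            continue  # skip banner and summary lines
        seconds, phase, path, _test_name = match.groups()
        per_file[path] += float(seconds)
        per_phase[phase] += float(seconds)
    return dict(per_file), dict(per_phase)


# First four ubuntu-22.04 entries copied from the report above.
sample = """\
============================ slowest 100 durations =============================
195.33s call tests/streaming/test_dataset.py::test_dataset_for_text_tokens_with_large_num_chunks
60.00s teardown tests/streaming/test_dataloader.py::test_custom_collate_multiworker
43.44s call tests/processing/test_functions.py::test_optimize_append_overwrite
40.00s teardown tests/streaming/test_dataset.py::test_resumable_dataset_two_workers_2_epochs
"""

per_file, per_phase = summarize_durations(sample)
for path, total in sorted(per_file.items(), key=lambda item: -item[1]):
    print(f"{total:8.2f}s  {path}")
print(per_phase)
```

Feeding a full report through the same function would show how much of each run is spent in teardown rather than in the test bodies, which is worth knowing separately from any gain from running tests in parallel.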
Title: Explore Running Tests in Parallel for Faster CI in litdata
Description:
We should investigate the possibility of running the test suite in parallel to speed up CI and local test execution. As the test base grows, serial execution becomes a bottleneck, especially when contributing frequently or iterating on PRs.
Tasks:
[pytest-xdist](https://pypi.org/project/pytest-xdist/) or an alternative for parallelization.
Goal:
Improve test speed while maintaining test accuracy and reproducibility.
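As a rough way to judge how much parallel workers could help, the reported durations can be run through an idealized scheduler. The sketch below is an assumption-laden estimate, not a description of how pytest-xdist distributes tests: it assigns each test, longest first, to the currently least-loaded worker and ignores worker startup cost, shared fixtures, and resource contention. `estimate_wall_time` is a hypothetical helper name.

```python
import heapq


def estimate_wall_time(durations, workers):
    """Greedy longest-first assignment; returns the busiest worker's total seconds."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    loads = [0.0] * workers  # a list of equal values is already a valid heap
    for seconds in sorted(durations, reverse=True):
        lightest = heapq.heappop(loads)  # least-loaded worker so far
        heapq.heappush(loads, lightest + seconds)
    return max(loads)


# Five slowest ubuntu-22.04 entries from the report above, in seconds.
slowest = [195.33, 60.00, 43.44, 40.00, 40.00]
for n in (1, 2, 4):
    print(n, round(estimate_wall_time(slowest, n), 2))
```

With this sample, adding workers beyond two does not lower the estimate, because a single test cannot be split across workers: the 195.33 s `test_dataset_for_text_tokens_with_large_num_chunks` call sets a floor on wall time. Shortening or splitting the longest tests therefore matters alongside parallel execution.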