Commit 3ddd421

FEAT: update readme
Author: xinyu gao
1 parent 8ce7aab commit 3ddd421

1 file changed, +1 -1 lines changed

README.md

+1 -1
@@ -16,7 +16,7 @@ fs_datasets.load_dataset('afqmc')
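
 Known issues with huggingface datasets:
 - **Processing large files is extremely inefficient, because the raw file has to be converted to PyArrow format and that conversion is slow. We therefore recommend splitting each file into pieces of about 500 MB. Load the split dataset with multiple processes and merge the parts afterwards; this significantly improves loading performance.**
-- **With large datasets, shuffling is very time-consuming; we recommend, when generating**
+- **With large datasets, shuffling is very time-consuming; we recommend shuffling when the cache is generated.**

 ## data process
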
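
The two recommendations in the README hunk above (split large files to roughly 500 MB, load the pieces in parallel and merge them, and shuffle once when the cache is built) can be sketched as follows. This is a minimal standard-library illustration, not the repository's implementation: `split_jsonl`, `load_shard`, and `load_and_merge` are hypothetical names, and it assumes the data is a JSON-lines file. With huggingface datasets, the per-shard step would typically be `datasets.load_dataset("json", data_files=..., split="train")`, the merge `datasets.concatenate_datasets`, and the shuffle `Dataset.shuffle(seed=...)`.

```python
import json
import multiprocessing
import random
from pathlib import Path


def split_jsonl(src, out_dir, max_bytes=500 * 1024 * 1024):
    """Split a JSON-lines file into shards of roughly max_bytes each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    shard_paths = []
    out = None
    written = 0
    with open(src, "rb") as f:
        for line in f:
            # Start a new shard when none is open or the current one is full.
            if out is None or written >= max_bytes:
                if out is not None:
                    out.close()
                path = out_dir / f"shard-{len(shard_paths):05d}.jsonl"
                shard_paths.append(path)
                out = open(path, "wb")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return shard_paths


def load_shard(path):
    """Parse one shard into a list of records (stand-in for a real loader)."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def load_and_merge(shard_paths, num_proc=4, seed=42):
    """Load shards (in parallel when num_proc > 1), merge, and shuffle once."""
    paths = [str(p) for p in shard_paths]
    if num_proc <= 1:
        parts = [load_shard(p) for p in paths]
    else:
        with multiprocessing.Pool(num_proc) as pool:
            parts = pool.map(load_shard, paths)
    merged = [row for part in parts for row in part]
    # Shuffle here, at cache-generation time, so later reads need no shuffle.
    random.Random(seed).shuffle(merged)
    return merged
```

A typical call would be `load_and_merge(split_jsonl("train.jsonl", "shards"), num_proc=8)`, where the file name and process count are placeholders. Fixing the seed keeps the cached order reproducible across runs. On platforms that start worker processes with spawn, the call with `num_proc > 1` should sit under an `if __name__ == "__main__":` guard.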
