I've noticed that the dataset does not include the 'understanding' data. May I ask whether this portion of the data will be open-sourced?