[Questions] Bias correction quantile mapping #2086
Unanswered
Balajigb703 asked this question in Questions
Replies: 1 comment
-
Hi @Balajigb703, does your initial MSWEP data contain NaN values? If so, they will remain after the adjustment.
Bias adjustment in ... Hope this helps!
-
Setup Information
0.54.0
Context
I have two datasets: IMD and model data (MSWEP). I need to perform bias correction on the MSWEP model data using daily data from 1990 to 2019. I am considering a 31-day moving window for this correction.
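To make the 31-day moving window concrete, here is a minimal pure-Python sketch of which calendar days would feed the quantile estimate for a given day of year. The helper `window_days` is hypothetical and assumes a fixed 365-day year. It is not how xclim's `Grouper` is implemented, only an illustration of a centered window that wraps around the year boundary.

```python
def window_days(center_doy, window=31, days_in_year=365):
    """Return the days of year in a centered window, wrapping at the year ends.

    Assumption: a fixed 365-day calendar; leap days are ignored in this sketch.
    """
    half = window // 2
    return [
        ((center_doy - 1 + offset) % days_in_year) + 1
        for offset in range(-half, half + 1)
    ]


# Window around 1 January: it reaches back into late December.
days = window_days(1)
print(days[0], days[-1], len(days))  # 351 16 31
```

With 30 years of daily data, each such window pools roughly 31 x 30 values per day of year, which is what makes 50 quantiles estimable at all.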
However, I am uncertain about the best interpolation method to use (nearest or linear); for extrapolation, I have set a constant value. Despite running the bias correction process, I am encountering NaN values in the corrected data columns.
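The difference between the two interpolation choices, and what constant extrapolation means, can be sketched in plain Python. This is only an illustration under stated assumptions: `lookup_factor`, the quantile values, and the adjustment factors are all made up, the quantiles are assumed strictly increasing, and nothing here reproduces xclim's internal code.

```python
from bisect import bisect_left


def lookup_factor(value, hist_q, af, interp="linear"):
    """Return the adjustment factor for `value`.

    hist_q: strictly increasing quantile values of the historical data
    af:     adjustment factor trained for each quantile (same length)
    """
    # Constant extrapolation: outside the trained range, reuse the end factor.
    if value <= hist_q[0]:
        return af[0]
    if value >= hist_q[-1]:
        return af[-1]
    hi = bisect_left(hist_q, value)
    lo = hi - 1
    if interp == "nearest":
        # Take the factor of whichever neighbouring quantile is closer.
        return af[lo] if value - hist_q[lo] <= hist_q[hi] - value else af[hi]
    # Linear: blend the two neighbouring factors by distance.
    weight = (value - hist_q[lo]) / (hist_q[hi] - hist_q[lo])
    return af[lo] + weight * (af[hi] - af[lo])


hist_q = [1.0, 2.0, 4.0]  # hypothetical quantiles in mm/d
af = [1.0, 2.0, 3.0]      # hypothetical multiplicative factors

print(lookup_factor(3.0, hist_q, af, "linear"))   # 2.5
print(lookup_factor(3.0, hist_q, af, "nearest"))  # 2.0
print(lookup_factor(10.0, hist_q, af))            # 3.0 (constant extrapolation)
```

Nearest produces a step-like correction, while linear varies smoothly between quantiles. In this sketch neither choice can create NaN values by itself as long as the factors are finite.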
Although I can see values in the ECDF graph, the bias-corrected columns remain empty or contain NaNs.
What is the best method and approach to bias correct daily data effectively to avoid NaN values?
import pandas as pd
# Load dataset
file_path = "C:/Users/User/Desktop/processes_datasets/first_location.csv"
df = pd.read_csv(file_path, parse_dates=['time'])
# Convert time column to datetime format
df['time'] = pd.to_datetime(df['time'], errors='coerce')
# Drop any rows where time conversion failed
df = df.dropna(subset=['time'])
print(f"✅ Step 1: Time column checked and converted. Total rows after cleanup: {len(df)}")
# Create a continuous daily time range
full_time_range = pd.date_range(start=df['time'].min(), end=df['time'].max(), freq='D')
# Reindex to ensure both datasets have the same time range
df = df.set_index('time').reindex(full_time_range).reset_index()
df.rename(columns={'index': 'time'}, inplace=True)
# Fill missing precipitation values with interpolation
df['IMD'] = df['IMD'].interpolate(method='linear')
df['MSWEP'] = df['MSWEP'].interpolate(method='linear')
# Check for remaining NaN values
print(f"IMD Missing: {df['IMD'].isna().sum()}, MSWEP Missing: {df['MSWEP'].isna().sum()}")
assert df['IMD'].isna().sum() == 0 and df['MSWEP'].isna().sum() == 0, "❌ ERROR: Missing values remain!"
print("✅ Step 3: Missing values handled.")
print(f"✅ Step 2: Time alignment ensured. Total time steps: {len(df)}")
import numpy as np  # needed for np.array_equal below
import xarray as xr
# Convert datasets to xarray DataArrays
ref = xr.DataArray(df['IMD'].values, dims='time', coords={'time': df['time']}, attrs={"units": "mm/d"}, name='ref')
hist = xr.DataArray(df['MSWEP'].values, dims='time', coords={'time': df['time']}, attrs={"units": "mm/d"}, name='hist')
# Ensure time alignment
assert np.array_equal(ref.time.values, hist.time.values), "❌ ERROR: Time coordinates are still misaligned!"
print("✅ Step 4: Converted to xarray. Ready for bias correction.")
from xclim import sdba
# Define the 31-day moving window for quantile mapping
group_doy_31 = sdba.Grouper('time.dayofyear', window=31)
# Train the Empirical Quantile Mapping (EQM) model
EQM = sdba.EmpiricalQuantileMapping.train(ref, hist, nquantiles=50, group=group_doy_31, kind='*')
# Apply bias correction
scen = EQM.adjust(hist, interp='linear', extrapolation="constant")
# Convert corrected values back to the pandas DataFrame.
# Assign by position: df has a default integer index after reset_index(), so
# assigning the time-indexed Series from scen.to_pandas() would align on index
# labels and fill the column with NaN.
df["bias_corrected_MSWEP"] = scen.values
# Save the bias-corrected dataset
output_path = "C:/Users/User/Desktop/processes_datasets/112olybias_corrected_MSWEP_FIXED.csv"
df.to_csv(output_path, index=False)
print(f"✅ Bias-corrected dataset saved to: {output_path}")
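One possible explanation for a column that is entirely NaN even though the adjusted values plot correctly, offered as an assumption since the thread does not confirm it: when a Series that carries a time index (such as the result of `to_pandas()` on a DataArray) is assigned to a DataFrame with a default integer index, pandas aligns on index labels rather than on position. The sketch below uses made-up values and hypothetical column names to show the effect in isolation.

```python
import pandas as pd

times = pd.date_range("2000-01-01", periods=4, freq="D")

# DataFrame with a default integer index and time kept as a regular column.
df = pd.DataFrame({"time": times, "MSWEP": [0.0, 1.5, 3.0, 0.2]})

# Stand-in for adjusted output: same length, but indexed by time.
adjusted = pd.Series([0.0, 2.0, 3.5, 0.1], index=times)

# Label alignment: no integer label matches a timestamp, so every row is NaN.
df["by_label"] = adjusted

# Positional assignment: the raw values are written row by row.
df["by_position"] = adjusted.to_numpy()

print(df[["by_label", "by_position"]])
```

If this is the cause, either assigning the underlying array or keeping `time` as the DataFrame index throughout should make the two objects line up.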
Steps To Reproduce