[Questions] Bias correction quantile mapping #2086

Balajigb703 · 2025-02-25T11:41:50Z

Setup Information

Xclim version:
0.54.0

Context

I have two datasets: IMD and model data (MSWEP). I need to perform bias correction on the MSWEP model data using daily data from 1990 to 2019. I am considering a 31-day moving window for this correction.

However, I am uncertain about the best interpolation method to use (nearest or linear), and for extrapolation I have set a constant value. Despite running the bias correction process, I am encountering NaN values in the corrected data columns.

Although I can see values in the ECDF graph, the bias-corrected columns remain empty or contain NaNs.

What is the best method and approach to bias correct daily data effectively to avoid NaN values?

import numpy as np
import pandas as pd

# Load dataset
file_path = "C:/Users/User/Desktop/processes_datasets/first_location.csv"
df = pd.read_csv(file_path, parse_dates=['time'])

# Convert time column to datetime format
df['time'] = pd.to_datetime(df['time'], errors='coerce')

# Drop any rows where time conversion failed
df = df.dropna(subset=['time'])

print(f"✅ Step 1: Time column checked and converted. Total rows after cleanup: {len(df)}")

# Create a continuous daily time range
full_time_range = pd.date_range(start=df['time'].min(), end=df['time'].max(), freq='D')

# Reindex to ensure both datasets have the same time range
df = df.set_index('time').reindex(full_time_range).reset_index()
df.rename(columns={'index': 'time'}, inplace=True)

# Fill missing precipitation values with interpolation
df['IMD'] = df['IMD'].interpolate(method='linear')
df['MSWEP'] = df['MSWEP'].interpolate(method='linear')

# Check for remaining NaN values
print(f"IMD Missing: {df['IMD'].isna().sum()}, MSWEP Missing: {df['MSWEP'].isna().sum()}")

assert df['IMD'].isna().sum() == 0 and df['MSWEP'].isna().sum() == 0, "❌ ERROR: Missing values remain!"
print("✅ Step 3: Missing values handled.")
print(f"✅ Step 2: Time alignment ensured. Total time steps: {len(df)}")

import xarray as xr

# Convert datasets to xarray DataArrays
ref = xr.DataArray(df['IMD'].values, dims='time', coords={'time': df['time']}, attrs={"units": "mm/d"}, name='ref')
hist = xr.DataArray(df['MSWEP'].values, dims='time', coords={'time': df['time']}, attrs={"units": "mm/d"}, name='hist')

# Ensure time alignment
assert np.array_equal(ref.time.values, hist.time.values), "❌ ERROR: Time coordinates are still misaligned!"

print("✅ Step 4: Converted to xarray. Ready for bias correction.")

from xclim import sdba

# Define the 31-day moving window for quantile mapping
group_doy_31 = sdba.Grouper('time.dayofyear', window=31)

# Train the Empirical Quantile Mapping (EQM) model
EQM = sdba.EmpiricalQuantileMapping.train(ref, hist, nquantiles=50, group=group_doy_31, kind='*')

# Apply bias correction
scen = EQM.adjust(hist, interp='linear', extrapolation="constant")

# Convert corrected values back to Pandas DataFrame
df["bias_corrected_MSWEP"] = scen.to_pandas()

# Save the bias-corrected dataset
output_path = "C:/Users/User/Desktop/processes_datasets/112olybias_corrected_MSWEP_FIXED.csv"
df.to_csv(output_path, index=False)

print(f"✅ Bias-corrected dataset saved to: {output_path}")
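
One step in the script above may be worth checking, offered as a guess rather than a confirmed diagnosis. `scen.to_pandas()` returns a Series indexed by time, while `df` carries a default integer index after `reset_index()`. When a Series is assigned to a DataFrame column, pandas aligns on index labels, so labels that do not match yield NaN. The sketch below reproduces this with made-up values; `df_demo` and `scen_series` are hypothetical stand-ins, not the data from this post.

```python
import pandas as pd

# Hypothetical stand-in for the frame in the post: integer index plus a 'time' column,
# as produced by reset_index().
times = pd.date_range("1990-01-01", periods=5, freq="D")
df_demo = pd.DataFrame({"time": times, "MSWEP": [0.0, 1.2, 3.4, 0.0, 5.6]})

# Hypothetical stand-in for scen.to_pandas(): a Series indexed by time.
scen_series = pd.Series([0.0, 1.0, 3.0, 0.0, 5.0], index=times)

# Label-based assignment: integer labels 0..4 never match the timestamps,
# so every row of the new column is NaN.
df_demo["by_label"] = scen_series

# Positional assignment: raw values are written row by row, ignoring labels.
df_demo["by_position"] = scen_series.to_numpy()

print(df_demo)
```

If this is what happens in the script, assigning `scen.values` (or setting `time` as the index of `df` before the assignment) would be one way to keep the rows matched.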


coxipi (Collaborator) · 2025-02-26T15:38:07Z

Hi @Balajigb703 ,

Does your initial MSWEP data contain NaN values? If so, these will remain after the adjustment.
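
A quick way to test this, sketched under the assumption that the input and the adjusted series have the same length and order: compare the NaN positions before and after adjustment. The helper name `nan_report` is hypothetical and not part of xclim.

```python
import numpy as np

def nan_report(before, after):
    """Count NaNs carried over from the input versus NaNs that appear only in the output."""
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    if before.shape != after.shape:
        raise ValueError("before and after must have the same shape")
    nan_before = np.isnan(before)
    nan_after = np.isnan(after)
    return {
        "input_nans": int(nan_before.sum()),
        # NaN in the input and still NaN in the output
        "inherited": int((nan_before & nan_after).sum()),
        # valid in the input but NaN in the output
        "introduced": int((~nan_before & nan_after).sum()),
    }

# Made-up values for illustration only.
report = nan_report([1.0, np.nan, 3.0], [2.0, np.nan, np.nan])
print(report)
```

With the names from the post, the call would be `nan_report(hist.values, scen.values)`. A non-zero `introduced` count would suggest that the NaNs come from somewhere other than the input data.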

> Despite running the bias correction process, I am encountering NaN values in the corrected data columns.

Bias adjustment in sdba simply keeps the NaN values untouched. The interpolation method is about the quantiles: typically, you will use fewer quantiles than the length of the time dimension (e.g. nquantiles=50, as you did). You then have 50 adjustment factors, but these must be interpolated so they can be applied at all possible quantiles.
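
To make the role of the interpolation concrete, here is a simplified NumPy sketch of the idea. It is not xclim's implementation: it uses synthetic data, ignores the day-of-year grouping, and the quantile levels and the function name `adjust` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
ref = rng.gamma(shape=2.0, scale=3.0, size=1000)   # synthetic "observations"
hist = rng.gamma(shape=2.0, scale=2.0, size=1000)  # synthetic "model"

nquantiles = 50
# Equally spaced quantile levels (an assumption for this sketch).
q = (np.arange(nquantiles) + 0.5) / nquantiles
ref_q = np.quantile(ref, q)
hist_q = np.quantile(hist, q)

# One multiplicative adjustment factor per quantile, as with kind='*'.
af = ref_q / hist_q

def adjust(values, interp="linear"):
    """Apply the 50 factors to arbitrary values by interpolating between quantiles."""
    values = np.asarray(values, dtype=float)
    if interp == "linear":
        # np.interp holds the end factors outside [hist_q[0], hist_q[-1]],
        # which corresponds to constant extrapolation.
        factors = np.interp(values, hist_q, af)
    elif interp == "nearest":
        # Use the factor of the closest model quantile.
        idx = np.abs(values[:, None] - hist_q[None, :]).argmin(axis=1)
        factors = af[idx]
    else:
        raise ValueError(f"unknown interp: {interp!r}")
    return values * factors

out_linear = adjust([np.nan, 1.0, 1e6], interp="linear")
out_nearest = adjust([np.nan, 1.0, 1e6], interp="nearest")
print(out_linear, out_nearest)
```

In both modes a NaN input stays NaN, and a value far above the largest model quantile receives the last factor. The choice between nearest and linear only changes how the factor varies between the 50 quantiles.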

Hope this helps!
