Feature/implement daily bcsd #28
@@ -1,5 +1,6 @@
from .bcsd import BcsdPrecipitation, BcsdTemperature
from .core import PointWiseDownscaler
from .gard import AnalogRegression, PureAnalog
from .groupers import PaddedDOYGrouper
from .utils import LinearTrendTransformer, QuantileMapper
from .zscore import ZScoreRegressor

@@ -0,0 +1,49 @@
import numpy as np
import pandas as pd


class SkdownscaleGroupGeneratorBase:
    pass


class PaddedDOYGrouper(SkdownscaleGroupGeneratorBase):
    def __init__(self, df, offset=15):
        self.df = df
        self.offset = offset
        self.max = 365
        self.days_of_year = np.arange(1, 366)
Review comment: Which calendars will this support? I'm thinking it will be useful to check that the calendar of df.index is valid for this grouper method.

        self.days_of_year_wrapped = np.pad(self.days_of_year, 15, mode="wrap")
        self.n = 1

    def __iter__(self):
        self.n = 1
        return self

    def __next__(self):
        # n as day of year
        if self.n > self.max:
            raise StopIteration

        i = self.n - 1
        total_days = (2 * self.offset) + 1

        # create day groups with +/- days
        # number of days defined by offset
        first_half = self.days_of_year_wrapped[i : i + self.offset]
        sec_half = self.days_of_year_wrapped[self.n + self.offset : i + total_days]
        all_days = np.concatenate((first_half, np.array([self.n]), sec_half), axis=0)

        assert len(set(all_days)) == total_days, all_days
Review comment: Let's convert this to a proper `if len(set(all_days)) != total_days: raise ValueError('...say something meaningful...')`

        result = self.df[self.df.index.dayofyear.isin(all_days)]

        self.n += 1

        return self.n - 1, result

    def mean(self):
        list_result = []
        for key, group in self:
            list_result.append(group.mean().values[0])
Review comment: I think this will be slightly faster if you allocate a numpy array via …
Reply: Done, thanks!

        result = pd.Series(list_result, index=self.days_of_year)
Review comment: Shouldn't this be a DataFrame? In Pandas, …
Reply: I think it should actually be a …
Reply: Ahh nm, got this to work along with some other updates for your comment below.

        return result
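
The review comments on this file ask for three things: a check on the index, a `ValueError` in place of the `assert`, and a preallocated array in `mean`. A minimal sketch of how these might fit together is shown here. It is not the merged implementation: the class name `PaddedDOYGrouperSketch`, the `DatetimeIndex` check, and the error messages are assumptions made for illustration. The sketch also pads by `offset` rather than the hard-coded 15, so that offsets other than 15 still yield full windows. As in the diff, day 366 of leap years is never part of any window.

```python
import numpy as np
import pandas as pd


class PaddedDOYGrouperSketch:
    """Hypothetical variant of PaddedDOYGrouper (illustration only)."""

    def __init__(self, df, offset=15):
        # Assumption: only a pandas DatetimeIndex is treated as a valid calendar.
        if not isinstance(df.index, pd.DatetimeIndex):
            raise TypeError("df.index must be a pandas DatetimeIndex")
        self.df = df
        self.offset = offset
        self.max = 365
        self.days_of_year = np.arange(1, self.max + 1)
        # Pad by the configured offset so the window always fits.
        self.days_of_year_wrapped = np.pad(self.days_of_year, offset, mode="wrap")
        self.n = 1

    def __iter__(self):
        self.n = 1
        return self

    def __next__(self):
        if self.n > self.max:
            raise StopIteration
        i = self.n - 1
        total_days = 2 * self.offset + 1
        # Day n sits at position i + offset in the wrapped array, so this slice
        # covers days n - offset .. n + offset, wrapping around the year ends.
        all_days = self.days_of_year_wrapped[i : i + total_days]
        if len(set(all_days)) != total_days:
            raise ValueError(
                f"expected {total_days} unique days of year around day {self.n}, "
                f"got {all_days}"
            )
        result = self.df[self.df.index.dayofyear.isin(all_days)]
        self.n += 1
        return self.n - 1, result

    def mean(self):
        # Preallocate the output instead of appending to a list.
        values = np.full(self.max, np.nan)
        for key, group in self:
            values[key - 1] = group.mean().values[0]
        return pd.Series(values, index=self.days_of_year)


# Usage: two non-leap years of daily data, windows of +/- 2 days.
index = pd.date_range("2001-01-01", "2002-12-31", freq="D")
frame = pd.DataFrame({"x": np.ones(len(index))}, index=index)
climatology = PaddedDOYGrouperSketch(frame, offset=2).mean()
```

Slicing the wrapped array once is equivalent to concatenating the two halves around `self.n`, because the padding already places day `n` in the middle of the window.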

@@ -166,3 +166,17 @@ def ensure_samples_features(obj):
        if obj.ndim == 1:
            return obj.reshape(-1, 1)
    return obj  # hope for the best, probably better to raise an error here


def check_datetime_index(obj, timestep):
    """ helper function to check datetime index for compatibility
    """
    if isinstance(obj, pd.DataFrame):
Review comment: What should happen when obj is not a DataFrame?
Reply: Actually don't think we really need this function. I didn't end up using it in the NASA-NEX daily implementation and I think it's redundant with the testing that is in place now. Seem reasonable @jhamman?
Reply: Right. I don't think we need this anymore.
Reply: Cool, just took this out.

        if timestep == "daily":
            obj.index = obj.index.values.astype("datetime64[D]")
            return obj
        elif timestep == "monthly":
            obj.index = obj.index.values.astype("datetime64[M]")
            return obj
        else:
            raise ValueError("this frequency has not yet been implemented in scikit-downscale")
Review comment: Let's move the DAY_GROUPER and MONTH_GROUPER functions to the groupers module.
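
The diff shown here does not include the bodies of `DAY_GROUPER` and `MONTH_GROUPER`, so the following is only a guess at what such functions could look like once they live in the groupers module. Both definitions are assumptions: they are written as plain functions that map a timestamp to a group key, which is one form pandas `groupby` accepts.

```python
import numpy as np
import pandas as pd


# Hypothetical bodies; the real functions are not shown in this diff.
def MONTH_GROUPER(x):
    """Group key: calendar month (1-12) of a timestamp."""
    return x.month


def DAY_GROUPER(x):
    """Group key: day of year of a timestamp."""
    return x.dayofyear


# Usage: a function passed to groupby is applied to each index label.
index = pd.date_range("2001-01-01", periods=60, freq="D")
frame = pd.DataFrame({"x": np.ones(len(index))}, index=index)
monthly_means = frame.groupby(MONTH_GROUPER).mean()
daily_means = frame.groupby(DAY_GROUPER).mean()
```

Keeping these next to `PaddedDOYGrouper` would put all grouping logic in one module, which appears to be the intent of the comment.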