ValueError: y data should have at least 1 samples, but found 0 #290

@ajayxcel

Hi, I get an error when running the 'Turbine Ideal Energy' analysis with less than 2 years of SCADA data. Even a dataset that falls short of 2 years by a single day triggers it. Is 2 years or more of data a requirement, or is this a bug? Could you please look into it? I have pasted the full traceback below for reference. I also see a similar error with the other analyses, except AEP, when I use less than 2 years of data. Thank you very much for your consideration.

ValueError                                Traceback (most recent call last)
Cell In[15], line 6
      1 # We can choose to save key plots to a file by setting enable_plotting=True and 
      2 # specifying a directory to save the images. For now we turn off this feature. 
      3 # ta.run(reanalysis_subset=['era5', 'merra2'], enable_plotting=False, plot_dir=None,
      4 #        wind_bin_thresh=wind_bin_thresh, max_power_filter=max_power_filter,
      5 #        correction_threshold=correction_threshold)
----> 6 ta.run(reanalysis_products=['era5', 'merra2'])

File ~\AppData\Local\anaconda3\envs\openoa-env\lib\site-packages\openoa\logging.py:33, in logged_method_call.<locals>._wrapper(self, *args, **kwargs)
     31 logger = logging.getLogger(the_method.__module__)
     32 logger.debug(f"{self.__class__.__name__}#{id(self)}.{the_method.__name__}: {msg}")
---> 33 return the_method(self, *args, **kwargs)

File ~\AppData\Local\anaconda3\envs\openoa-env\lib\site-packages\openoa\analysis\turbine_long_term_gross_energy.py:255, in TurbineLongTermGrossEnergy.run(self, num_sim, reanalysis_products, uncertainty_scada, wind_bin_threshold, max_power_filter, correction_threshold)
    253     self.filter_sum_impute_scada()  # Setup daily scada data
    254     self.setupturbine_model_dict()  # Setup daily data to be fit using the GAM
--> 255     self.fit_model()  # Fit daily turbine energy to atmospheric data
    256     self.apply_model(i)  # Apply fitting result to long-term reanalysis data
    258 # Log the completion of the run

File ~\AppData\Local\anaconda3\envs\openoa-env\lib\site-packages\openoa\logging.py:33, in logged_method_call.<locals>._wrapper(self, *args, **kwargs)
     31 logger = logging.getLogger(the_method.__module__)
     32 logger.debug(f"{self.__class__.__name__}#{id(self)}.{the_method.__name__}: {msg}")
---> 33 return the_method(self, *args, **kwargs)

File ~\AppData\Local\anaconda3\envs\openoa-env\lib\site-packages\openoa\analysis\turbine_long_term_gross_energy.py:519, in TurbineLongTermGrossEnergy.fit_model(self)
    516     df["energy_imputed"] = df["energy_imputed"] * self._run.scada_data_fraction
    518     # Consider wind speed, wind direction, and air density as features
--> 519     mod_results[t] = functions.gam_3param(
    520         windspeed_col="WMETR_HorWdSpd",
    521         wind_direction_col="WMETR_HorWdDir",
    522         air_density_col="WMETR_AirDen",
    523         power_col="energy_imputed",
    524         data=df,
    525     )
    526 self._model_results = mod_results

File ~\AppData\Local\anaconda3\envs\openoa-env\lib\site-packages\openoa\utils\_converters.py:294, in dataframe_method.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    292     # Update the args and kwargs as need and call the function
    293     args, kwargs = _update_arguments(args, kwargs, arg_ix_list, data_cols, arg_list)
--> 294     return func(*args, **kwargs)
    296 # When no data is provided, then convert the Series arguments, update args and kwargs,
    297 # appropriately, then call the function
    298 df, arg_list = series_to_df(*arg_list, names=data_cols)

File ~\AppData\Local\anaconda3\envs\openoa-env\lib\site-packages\openoa\utils\power_curve\functions.py:187, in gam_3param(windspeed_col, wind_direction_col, air_density_col, power_col, n_splines, data)
    184 y = data[power_col]
    186 # Fit the model
--> 187 model = LinearGAM(n_splines=n_splines).fit(X, y)
    189 # Wrap the prediction function in a closure to pack input variables
    190 @dataframe_method(data_cols=["windspeed_col", "wind_direction_col", "air_density_col"])
    191 def predict(
    192     windspeed_col: str | pd.Series,
   (...)
    195     data: pd.DataFrame = None,
    196 ):

File ~\AppData\Local\anaconda3\envs\openoa-env\lib\site-packages\pygam\pygam.py:887, in GAM.fit(self, X, y, weights)
    884 self._validate_params()
    886 # validate data
--> 887 y = check_y(y, self.link, self.distribution, verbose=self.verbose)
    888 X = check_X(X, verbose=self.verbose)
    889 check_X_y(X, y)

File ~\AppData\Local\anaconda3\envs\openoa-env\lib\site-packages\pygam\utils.py:234, in check_y(y, link, dist, min_samples, verbose)
    212 """
    213 tool to ensure that the targets:
    214 - are in the domain of the link function
   (...)
    230 y : array containing validated y-data
    231 """
    232 y = np.ravel(y)
--> 234 y = check_array(
    235     y,
    236     force_2d=False,
    237     min_samples=min_samples,
    238     ndim=1,
    239     name='y data',
    240     verbose=verbose,
    241 )
    243 with warnings.catch_warnings():
    244     warnings.simplefilter("ignore")

File ~\AppData\Local\anaconda3\envs\openoa-env\lib\site-packages\pygam\utils.py:203, in check_array(array, force_2d, n_feats, ndim, min_samples, name, verbose)
    201 n = array.shape[0]
    202 if n < min_samples:
--> 203     raise ValueError(
    204         '{} should have at least {} samples, '
    205         'but found {}'.format(name, min_samples, n)
    206     )
    208 return array

ValueError: y data should have at least 1 samples, but found 0

EDIT: I put the traceback in a Python code block to make it easier for me to read.
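
The last frames of the traceback show that `y = data[power_col]` is already empty when it reaches pyGAM, so at least one turbine's daily frame has zero rows by the time `fit_model` calls `gam_3param`. Why the rows are lost is not visible from the traceback. As a rough diagnostic, here is a minimal sketch, assuming the per-turbine data is a pandas DataFrame with the four columns named in the traceback, that counts how many rows are complete enough to fit. `count_usable_rows` is a hypothetical helper and not part of OpenOA, and the data is synthetic:

```python
import numpy as np
import pandas as pd

# Column names copied from the traceback; everything else here is illustrative.
MODEL_COLS = ["WMETR_HorWdSpd", "WMETR_HorWdDir", "WMETR_AirDen", "energy_imputed"]


def count_usable_rows(df: pd.DataFrame, cols=MODEL_COLS) -> int:
    """Hypothetical helper: number of rows with no missing value in any of `cols`."""
    return len(df.dropna(subset=cols))


# Synthetic daily data standing in for one turbine's frame (not real SCADA data).
days = pd.date_range("2020-01-01", periods=5, freq="D")
demo = pd.DataFrame(
    {
        "WMETR_HorWdSpd": [6.1, 7.4, np.nan, 8.0, 5.5],
        "WMETR_HorWdDir": [180.0, 190.0, 200.0, 210.0, 220.0],
        "WMETR_AirDen": [1.20, 1.21, 1.22, 1.19, 1.18],
        "energy_imputed": [900.0, 1100.0, 1000.0, 1300.0, np.nan],
    },
    index=days,
)

usable = count_usable_rows(demo)            # rows that could be passed to a fit
empty = count_usable_rows(demo.iloc[0:0])   # an empty frame, as in the failing case

if empty == 0:
    print("No usable rows: a model fit on this frame would fail on empty y data.")
```

Running a check like this on each turbine's frame just before the fit would show which turbine ends up with no rows, which may help tell a data-length requirement apart from a bug.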
