@@ -256,13 +256,37 @@ def transform(
sample_weight : ndarray of shape (n_features,), (n_samples, n_features), (1, n_features), or None, default=None
Individual weights for each data point of the input. If only one weight vector is
provided, it is assumed to be the same for all samples.
+ No weight may be negative (< 0.0), and at least one weight must be
+ positive (> 0.0).
+ Providing weights is mandatory when the optimum penalty weight ``lam`` is to be
+ determined automatically via the log marginal likelihood (``"logml"``)
+ method.
If ``None``, all features are assumed to have the same weight.
+ Please refer to the Notes section for further details on selecting the
+ weights.

Returns
-------
X_smoothed : ndarray of shape (n_samples, n_features)
The transformed data.

+ Notes
+ -----
+ If estimates of the standard deviations ``s_i`` of the data points are
+ available, e.g., from theoretical considerations or repeated measurements, it is
+ recommended to use the inverse of the squared standard deviations as weights,
+ i.e., ``w_i = 1 / (s_i * s_i)``. This is an effective way to down-weight
+ noisy data points and thus reduce the risk of noise-induced artifacts in the
+ smoothed signal. Conversely, features measured with high confidence will
+ remain well-preserved even under strong smoothing.
+ Sometimes, it is infeasible to provide standard deviations because theoretical
+ considerations are not appropriate and replicate measurements are not available
+ or feasible. In such scenarios, the weights can still be estimated by using
+ the function :func:`chemotools.smooth.estimate_noise_stddev` with ``power=-2``.
+ It relies on the parameter ``window_length`` to estimate the local/global noise
+ standard deviation of the spectrum; please refer to the documentation of the
+ function for further details.
+
""" # noqa: E501
# Check that the estimator is fitted
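The shape and sign rules stated for ``sample_weight`` above can be illustrated with a minimal NumPy sketch. The helper name `broadcast_and_check_weights` is hypothetical and is not part of chemotools; the sketch also assumes that the "at least one positive weight" rule is checked over the whole array, which the docstring does not spell out.

```python
import numpy as np


def broadcast_and_check_weights(sample_weight, n_samples, n_features):
    """Hypothetical helper: expand weights to (n_samples, n_features) and validate."""
    if sample_weight is None:
        # No weights given: every feature gets the same weight.
        return np.ones((n_samples, n_features))

    weights = np.asarray(sample_weight, dtype=float)
    # np.broadcast_to accepts the three documented shapes (and any other
    # broadcast-compatible shape); incompatible shapes raise ValueError.
    weights = np.broadcast_to(weights, (n_samples, n_features))

    if np.any(weights < 0.0):
        raise ValueError("No weights may be negative.")
    # Assumption: the positivity rule is applied to the array as a whole.
    if not np.any(weights > 0.0):
        raise ValueError("At least one weight needs to be positive.")
    return weights


# A single weight vector is reused for all three samples.
weights = broadcast_and_check_weights(np.array([1.0, 0.0, 2.0, 0.5]), 3, 4)
print(weights.shape)
```

The actual validation inside the library may differ; this only mirrors the constraints as the docstring words them.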
@@ -313,13 +337,35 @@ def fit_transform(
provided, it is assumed to be the same for all samples.
No weight may be negative (< 0.0), and at least one weight must be
positive (> 0.0).
+ Providing weights is mandatory when the optimum penalty weight ``lam`` is to be
+ determined automatically via the log marginal likelihood (``"logml"``)
+ method.
If ``None``, all features are assumed to have the same weight.
+ Please refer to the Notes section for further details on selecting the
+ weights.

Returns
-------
X_smoothed : ndarray of shape (n_samples, n_features)
The transformed data.

+ Notes
+ -----
+ If estimates of the standard deviations ``s_i`` of the data points are
+ available, e.g., from theoretical considerations or repeated measurements, it is
+ recommended to use the inverse of the squared standard deviations as weights,
+ i.e., ``w_i = 1 / (s_i * s_i)``. This is an effective way to down-weight
+ noisy data points and thus reduce the risk of noise-induced artifacts in the
+ smoothed signal. Conversely, features measured with high confidence will
+ remain well-preserved even under strong smoothing.
+ Sometimes, it is infeasible to provide standard deviations because theoretical
+ considerations are not appropriate and replicate measurements are not available
+ or feasible. In such scenarios, the weights can still be estimated by using
+ the function :func:`chemotools.smooth.estimate_noise_stddev` with ``power=-2``.
+ It relies on the parameter ``window_length`` to estimate the local/global noise
+ standard deviation of the spectrum; please refer to the documentation of the
+ function for further details.
+
""" # noqa: E501
return self.fit(X=X).transform(X=X, sample_weight=sample_weight)
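The inverse-variance rule from the Notes section, ``w_i = 1 / (s_i * s_i)``, could be applied to replicate measurements roughly as follows. This is a sketch under stated assumptions: the helper `inverse_variance_weights` and the example numbers are made up for illustration, only NumPy is used, and the library's own `estimate_noise_stddev` is deliberately not called here because its full signature is not shown in this diff.

```python
import numpy as np


def inverse_variance_weights(replicates):
    """Hypothetical helper: per-feature weights from repeated measurements.

    replicates has shape (n_replicates, n_features) and holds repeated
    measurements of the same sample.
    """
    replicates = np.asarray(replicates, dtype=float)
    if replicates.ndim != 2 or replicates.shape[0] < 2:
        raise ValueError("Need a 2-D array with at least two replicates.")

    # Sample standard deviation s_i of every feature across the replicates.
    s = replicates.std(axis=0, ddof=1)
    if np.any(s <= 0.0):
        # Assumption: a zero spread is treated as an error instead of being
        # turned into an infinite weight.
        raise ValueError("Zero standard deviation; weights are undefined.")

    # w_i = 1 / (s_i * s_i), as recommended in the Notes section.
    return 1.0 / (s * s)


# Two replicates of a two-feature signal; the second feature is noisier.
replicates = np.array([[1.0, 0.0],
                       [3.0, 4.0]])
weights = inverse_variance_weights(replicates)
print(weights)  # the noisier second feature receives the smaller weight
```

The result has shape ``(n_features,)``, which is one of the shapes documented for ``sample_weight``, so it could be passed on unchanged.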