Description
Why a Binary Log-Loss (Cross-Entropy) metric is needed
Without a built-in log-loss, users who train probabilistic classifiers in ml_algo cannot:
- compare calibrated models on a common scale,
- reproduce competition leaderboards (e.g., Kaggle), or
- speak the same language as scikit-learn / LightGBM / CatBoost users.
All in-package classifiers (e.g., LogisticRegressor) optimise log-likelihood internally—so surfacing that very metric in MetricType neatly closes the loop between training and validation.
Reference stacks that ship the metric out-of-the-box: scikit-learn metrics.log_loss, LightGBM binary_logloss, CatBoost Logloss.
A common loss-function interface (for training) landed in v0.17.0, and the Metric layer (for evaluation) already lists Accuracy / Precision / Recall. Binary Log-Loss is the missing puzzle piece.
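For reference, the metric in question is the standard binary cross-entropy over N samples with true labels $y_i \in \{0, 1\}$ and predicted probabilities $p_i$:

$$\text{LogLoss} = -\frac{1}{N}\sum_{i=1}^{N}\bigl[\,y_i \log p_i + (1 - y_i)\log(1 - p_i)\,\bigr]$$

with $p_i$ clipped away from 0 and 1 so the logarithm stays finite (see the ε handling in the sketch below).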
Proposed change
Public surface
// lib/src/metric/metric_type.dart
enum MetricType {
  /* existing entries */

  logLoss, // NEW – binary cross-entropy / log-loss
}
Reference implementation (evaluation-time metric)
import 'dart:math' as math;

import 'package:ml_algo/src/metric/metric.dart'; // adjust if path differs

/// Computes binary log-loss (cross-entropy) between `yTrue` and `yPred`.
///
/// * `yTrue` – iterable of 0/1 labels.
/// * `yPred` – iterable of probabilities in [0, 1].
class LogLossMetric implements Metric<int, double> {
  /// Numerical cushion to avoid log(0). Mirrors scikit-learn’s default.
  final double eps;

  const LogLossMetric({this.eps = 1e-15});

  double _clip(double p) =>
      p < eps ? eps : (p > 1 - eps ? 1 - eps : p);

  @override
  double getScore(Iterable<int> yTrue, Iterable<double> yPred) {
    if (yTrue.length != yPred.length) {
      throw ArgumentError('yTrue (${yTrue.length}) and yPred '
          '(${yPred.length}) lengths differ');
    }

    var sum = 0.0;
    final itY = yTrue.iterator, itP = yPred.iterator;

    while (itY.moveNext() && itP.moveNext()) {
      final p = _clip(itP.current);
      sum += itY.current == 1 ? -math.log(p) : -math.log(1 - p);
    }

    return sum / yTrue.length;
  }
}
- Only one log() call per sample, so the scalar CPU path stays as cheap as possible.
- ε-clip prevents −∞/NaN while adding negligible bias.
- Future SIMD implementation (via ml_linalg) can drop in without API churn.
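To make the numbers concrete, here is a quick hand-checkable usage example for the class above (this could go straight into a unit test; the reference value matches sklearn.metrics.log_loss):

void main() {
  const metric = LogLossMetric();

  // yTrue = [1, 0], yPred = [0.9, 0.2]
  // loss = -(ln 0.9 + ln 0.8) / 2 ≈ 0.1643
  // (same value as sklearn.metrics.log_loss([1, 0], [0.9, 0.2]))
  print(metric.getScore([1, 0], [0.9, 0.2]));

  // A confidently wrong prediction is punished hard: -ln(1 - 0.99) ≈ 4.605
  print(metric.getScore([0], [0.99]));
}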
File destination: lib/src/metric/classification/log_loss_metric.dart (mirrors existing structure).
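The enum value also has to be mapped to the new class wherever ml_algo resolves MetricType into a concrete metric. I have not tracked down the exact dispatch point, so the snippet below is purely illustrative (createMetricByType is a made-up name):

// Illustrative only – the real factory/switch inside ml_algo may look different.
Metric createMetricByType(MetricType type) {
  switch (type) {
    /* existing cases stay untouched */
    case MetricType.logLoss:
      return const LogLossMetric();
    default:
      throw UnsupportedError('Unexpected metric type: $type');
  }
}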
Training-time counterpart: if desired, a LogLossFunction can implement LossFunction under lib/src/loss/log_loss_function.dart, re-using the same clipping helper.
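I have not dug into the actual LossFunction contract, so the following only sketches the maths (per-sample loss plus its derivative with respect to the predicted probability) under made-up method names; it would need to be adapted to whatever the real interface expects:

import 'dart:math' as math;

/// Sketch only: the method names are placeholders, not the real
/// LossFunction interface from ml_algo.
class LogLossFunction {
  final double eps;

  const LogLossFunction({this.eps = 1e-15});

  double _clip(double p) => p < eps ? eps : (p > 1 - eps ? 1 - eps : p);

  /// Per-sample loss: -[y·ln(p) + (1 − y)·ln(1 − p)].
  double loss(int y, double p) {
    final clipped = _clip(p);
    return y == 1 ? -math.log(clipped) : -math.log(1 - clipped);
  }

  /// Derivative of the per-sample loss w.r.t. the predicted probability p:
  /// dL/dp = -y/p + (1 − y)/(1 − p).
  double derivative(int y, double p) {
    final clipped = _clip(p);
    return -y / clipped + (1 - y) / (1 - clipped);
  }
}

If the trainers work with the raw logit rather than the probability, the derivative of log-loss through the sigmoid collapses to the familiar p − y, which may be the more convenient form to expose.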
Developer experience
final loss = model.assess(testDf, MetricType.logLoss);
print('Hold-out log-loss: $loss');
Would a PR for this be welcome? Any other thoughts?