feat(metric): add Binary Log-Loss (Cross-Entropy) metric #261

Open
@ChristianKleineidam

Description

Why a Binary Log-Loss (Cross-Entropy) metric is needed

Without a built-in log-loss, users who train probabilistic classifiers in ml_algo cannot:

  • compare calibrated models on a common scale,
  • reproduce competition leaderboards (e.g., Kaggle), or
  • speak the same language as scikit-learn / LightGBM / CatBoost users.

The probabilistic classifiers in the package (e.g., LogisticRegressor) already optimise the log-likelihood internally, so surfacing the same quantity in MetricType closes the loop between training and validation.

Reference stacks that ship the metric out-of-the-box: scikit-learn metrics.log_loss, LightGBM binary_logloss, CatBoost Logloss.

A common loss-function interface (for training) landed in v0.17.0, and the Metric layer (for evaluation) already lists Accuracy / Precision / Recall. Binary Log-Loss is the missing puzzle piece.
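
For reference, the metric in question is the standard binary cross-entropy. For labels $y_i \in \{0, 1\}$ and predicted probabilities $p_i$ over $N$ samples:

    L = -\frac{1}{N} \sum_{i=1}^{N} \bigl[\, y_i \log p_i + (1 - y_i) \log(1 - p_i) \,\bigr]

Lower is better; as a baseline, a model that always predicts p = 0.5 scores ln 2 ≈ 0.693.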

Proposed change

Public surface

// lib/src/metric/metric_type.dart
enum MetricType {
  /* existing entries */
  logLoss, // NEW – binary cross-entropy / log-loss
}

Reference implementation (evaluation-time metric)

import 'dart:math' as math;
import 'package:ml_algo/src/metric/metric.dart'; // adjust if path differs

/// Computes binary log-loss (cross-entropy) between `yTrue` and `yPred`.
///
/// * `yTrue` – iterable of 0/1 labels.
/// * `yPred` – iterable of probabilities in [0, 1].
class LogLossMetric implements Metric<int, double> {
  /// Numerical cushion to avoid log(0). Mirrors scikit-learn’s default.
  final double eps;
  const LogLossMetric({this.eps = 1e-15});

  double _clip(double p) =>
      p < eps ? eps : (p > 1 - eps ? 1 - eps : p);

  @override
  double getScore(Iterable<int> yTrue, Iterable<double> yPred) {
    if (yTrue.length != yPred.length) {
      throw ArgumentError('yTrue (${yTrue.length}) and yPred '
          '(${yPred.length}) lengths differ');
    }
    var sum = 0.0;
    final itY = yTrue.iterator, itP = yPred.iterator;
    while (itY.moveNext() && itP.moveNext()) {
      final p = _clip(itP.current);
      sum += itY.current == 1 ? -math.log(p) : -math.log(1 - p);
    }
    return sum / yTrue.length;
  }
}

  • One log() per sample → the fastest scalar CPU route.
  • The ε-clip prevents −∞/NaN at p = 0 or p = 1 while adding negligible bias.
  • A future SIMD implementation (via ml_linalg) can drop in without API churn.
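
As a quick, self-contained sanity check (not part of the proposed API), the same per-sample arithmetic can be verified by hand; LogLossMetric().getScore([1, 0, 1], [0.9, 0.1, 0.8]) would return the same value:

import 'dart:math' as math;

void main() {
  // Hand-checkable example: true labels and predicted probabilities.
  final yTrue = [1, 0, 1];
  final yPred = [0.9, 0.1, 0.8];
  var sum = 0.0;
  for (var i = 0; i < yTrue.length; i++) {
    // Same per-sample term as LogLossMetric.getScore (no clipping needed here,
    // since every probability lies strictly inside (0, 1)).
    sum += yTrue[i] == 1 ? -math.log(yPred[i]) : -math.log(1 - yPred[i]);
  }
  print(sum / yTrue.length); // ≈ 0.1446  (= -(ln 0.9 + ln 0.9 + ln 0.8) / 3)
}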

File destination: lib/src/metric/classification/log_loss_metric.dart (mirrors existing structure).

Training-time counterpart: if desired, a LogLossFunction can implement LossFunction under lib/src/loss/log_loss_function.dart, re-using the same clipping helper.
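
A minimal sketch of that training-time counterpart is below. The LossFunction interface shown is a hypothetical stand-in used only to illustrate the shape; the actual interface that landed in v0.17.0 dictates the real name and signature, and the ε-clipping helper would be shared with LogLossMetric:

import 'dart:math' as math;

/// Hypothetical stand-in for ml_algo's loss-function interface; the real
/// interface from v0.17.0 may differ in name and signature.
abstract class LossFunction {
  double loss(double yTrue, double yPred);
}

/// Binary log-loss as a per-sample training loss, reusing the same
/// ε-clipping idea as LogLossMetric.
class LogLossFunction implements LossFunction {
  final double eps;
  const LogLossFunction({this.eps = 1e-15});

  double _clip(double p) => p < eps ? eps : (p > 1 - eps ? 1 - eps : p);

  @override
  double loss(double yTrue, double yPred) {
    final p = _clip(yPred);
    // -[y·log(p) + (1 − y)·log(1 − p)] for a single sample.
    return -(yTrue * math.log(p) + (1 - yTrue) * math.log(1 - p));
  }
}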

Developer experience

final loss = model.assess(testDf, MetricType.logLoss);
print('Hold-out log-loss: $loss');

Would a PR for this be welcome? Any other thoughts?
