Request for improved/updated beginner examples

I'm looking to use `linfa` for k-means clustering, and [the current k-means example](https://github.com/rust-ml/linfa/blob/master/algorithms/linfa-clustering/examples/kmeans.rs) is pretty incomprehensible to a newbie. It may be that this makes perfect sense to someone steeped in this API or even in `ndarray`, but to me, the issues are:
* The current version of `rand` (0.9.0 as of Feb 2025) appears to be incompatible with the version used in the example
* Generating random data from a PRNG doesn't help when my goal is to load data from somewhere else. How can I create a mutable data structure that I can push new vectors onto?
* [`DatasetBase` indicates](https://docs.rs/linfa/latest/linfa/dataset/struct.DatasetBase.html#method.records) that it contains records and maybe targets, weights, and feature names. I have no clue what the target/weights are when I'm trying to create input.
* Not having expected centroids, I'd like to lean on the API to generate something random, something evenly distributed, or otherwise use some quick heuristic.

Ultimately, my ideal is to do something like:

```rust
let mut records = Dataset::with_capacity(100_000); // expected number of input rows
for row in load_my_data("file.tsv") {
    // where 'row' is, say, a [f64; 5] or a Vec<f32>?
    records.push(row);
}

let initial_state = kmeans::generate_random_centroids(10 /* # clusters */, &records);

let clusters = kmeans::params_with(...).fit(&records);

for (id, cluster) in clusters.iter().enumerate() {
    // presumably cluster is [f64; 5] or &[f32]
    println!("Cluster {id} located @ {cluster:?}");
}
```

I realize this may diverge drastically from what currently exists, but I'd like to determine how to bridge this gap. Thanks!
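
To make the "push rows onto a mutable structure" part concrete, here is a minimal std-only sketch. `RowBuffer` and its methods are hypothetical helpers for illustration, not part of `linfa` or `ndarray`. It collects fixed-width rows into one flat, row-major `Vec<f64>` plus a `(rows, cols)` shape, which is the form `ndarray` can build a 2-D array from.

```rust
/// Hypothetical helper (not a linfa/ndarray type): a growable, row-major
/// buffer of fixed-width f64 rows.
struct RowBuffer {
    n_cols: usize,
    data: Vec<f64>,
}

impl RowBuffer {
    /// Pre-allocate room for `n_rows` rows of `n_cols` values each.
    fn with_capacity(n_rows: usize, n_cols: usize) -> Self {
        Self { n_cols, data: Vec::with_capacity(n_rows * n_cols) }
    }

    /// Append one row, rejecting rows whose width does not match.
    fn push(&mut self, row: &[f64]) -> Result<(), String> {
        if row.len() != self.n_cols {
            return Err(format!("expected {} columns, got {}", self.n_cols, row.len()));
        }
        self.data.extend_from_slice(row);
        Ok(())
    }

    /// Number of complete rows stored so far.
    fn n_rows(&self) -> usize {
        if self.n_cols == 0 { 0 } else { self.data.len() / self.n_cols }
    }

    /// Consume the buffer, returning ((rows, cols), flat row-major data).
    fn into_parts(self) -> ((usize, usize), Vec<f64>) {
        let shape = (self.n_rows(), self.n_cols);
        (shape, self.data)
    }
}

fn main() {
    // Stand-in for rows loaded from a file; the values are made up.
    let rows = [[1.0, 2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0, 9.0, 10.0]];

    let mut records = RowBuffer::with_capacity(100_000, 5);
    for row in rows {
        records.push(&row).expect("row width should match");
    }

    let (shape, flat) = records.into_parts();
    println!("shape = {:?}, values = {}", shape, flat.len());

    // Assumed handoff (untested here, requires the ndarray and linfa crates):
    //   let array = Array2::from_shape_vec(shape, flat)?;
    //   let dataset = DatasetBase::from(array);
    //   let model = KMeans::params(10).fit(&dataset)?;
    //   for (id, centroid) in model.centroids().rows().into_iter().enumerate() { ... }
}
```

The handoff in the trailing comments is an assumption about how the existing API fits together and has not been checked against the current release; if it holds, targets and weights can be left at their defaults for unsupervised clustering.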

Request for improved/updated beginner examples #378