pequiste

visidata’s current behavior for average, mean, and median is to coerce the result to a float. That works fine for the builtin types but doesn’t play nicely with custom precision-preserving types, like Python’s Fraction and Decimal.

It would also be possible to replace sum(vals) / len(vals) with Python’s native statistics.mean, which also preserves the int type when possible:

  • statistics.mean([1, 2]) # 1.5 (float)
  • statistics.mean([1, 2, 3]) # 2 (int)

But that would mean that two golden files have to change, since some values would lose their trailing .00.
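As a rough illustration of the difference described above, the sketch below compares a float-coercing mean with statistics.mean on Fraction, Decimal, and int inputs. The helper float_mean is a hypothetical stand-in written for this example, not visidata's actual implementation.

```python
import statistics
from decimal import Decimal
from fractions import Fraction


def float_mean(vals):
    # Hypothetical stand-in for a mean that coerces its result to float.
    vals = list(vals)
    return float(sum(vals)) / len(vals)


fracs = [Fraction(1, 3), Fraction(2, 3)]
decs = [Decimal("0.1"), Decimal("0.2")]

# Coercing to float discards the exact Fraction type.
print(type(float_mean(fracs)).__name__)        # float

# statistics.mean keeps the input type where it can.
print(repr(statistics.mean(fracs)))            # Fraction(1, 2)
print(repr(statistics.mean(decs)))             # Decimal('0.15')
print(repr(statistics.mean([1, 2])))           # 1.5
print(repr(statistics.mean([1, 2, 3])))        # 2
```

With statistics.mean the Fraction and Decimal results stay exact, which is the property the float coercion loses.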

vd.aggregator('avg', mean, 'arithmetic mean of values', type=float)
vd.aggregator('mean', mean, 'arithmetic mean of values', type=float)
vd.aggregator('median', statistics.median, 'median of values')
vd.aggregator('avg', mean, 'arithmetic mean of values', type=lambda x: x)
Owner

Use anytype for the identity function as a type.

Author

Using anytype instead of lambda x: x fails the following tests:

  • tests/golden/aggregators-set.tsv
  • tests/golden/avg-nulls.tsv
  • tests/golden/freq-error.tsv
  • tests/golden/freq-summary.tsv
  • tests/golden/pivot.tsv

Lots of n.00 values turn into n.0, and one 624.00 turns into 623.9999999999999.

Owner

Huh! That's weird to me. I guess the TSV saver must do something special for anytype columns?
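One possible reading of the golden-file differences, not confirmed anywhere in this thread, is that the same float values are being rendered two ways: with a fixed two-decimal format in one case and with plain str() in the other. The sketch below only shows that these two renderings reproduce the reported strings; where visidata applies either one is an assumption.

```python
# The value reported in the failing golden file.
value = 623.9999999999999

# Fixed two-decimal formatting rounds the float for display.
two_places = "{:.2f}".format(value)

# Plain str() shows the shortest repr that round-trips the float.
plain = str(value)

print(two_places)                        # 624.00
print(plain)                             # 623.9999999999999
print("{:.2f}".format(3.0), str(3.0))    # 3.00 3.0
```

If this is what happens, the underlying values did not change between the two runs, only how they were displayed.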
