Can create too many distinct numeric values #153

@yoid2000

For continuous numeric data that nevertheless has a "medium" number of distinct values, SynDiffix can end up creating substantially more distinct values than the original data contains.

What happens is that there are enough distinct original values that many of them have relatively low counts (say, 10-20), and when the column is combined with another column, some of those values get suppressed. Then, during microdata assignment, random values that are not original values are assigned, so more distinct values end up being created.
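
To illustrate the effect described above, here is a toy sketch in plain Python. It is not SynDiffix code: the suppression rule (every fourth distinct value) and the random assignment (a uniform draw near the suppressed value) are stand-in assumptions chosen only to show how the distinct-value count can grow.

```python
import random

rng = random.Random(42)

# Toy original column: 40 distinct values, each occurring 15 times.
original = [0.25 * i for i in range(40) for _ in range(15)]

# Assumption for illustration: every 4th distinct value is treated as
# suppressed once the column is combined with another column.
distinct = sorted(set(original))
suppressed = set(distinct[::4])

synthetic = []
for v in original:
    if v in suppressed:
        # Stand-in for microdata assignment: a random value near the
        # suppressed one, which is almost never an original value.
        synthetic.append(rng.uniform(v - 0.125, v + 0.125))
    else:
        synthetic.append(v)

n_original = len(set(original))
n_synthetic = len(set(synthetic))
print(n_original, n_synthetic)
```

In this toy setup, nearly every row belonging to a suppressed value receives its own new value, so the synthetic column ends up with several times as many distinct values as the original.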

We need a mechanism whereby, when the original data values are not suppressed, microdata is assigned only from those original values.
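
One possible shape for this, as a minimal sketch only: the function name `assign_values`, its parameters, and the idea of passing in a list of unsuppressed original values are all hypothetical and do not reflect SynDiffix's actual internals.

```python
import random

def assign_values(lo, hi, count, unsuppressed_values, rng):
    """Generate `count` microdata values for a bucket covering [lo, hi).

    `unsuppressed_values` is a hypothetical list of original distinct
    values that passed suppression. If any fall inside the bucket, draw
    only from them; otherwise fall back to random values in the range.
    """
    candidates = [v for v in unsuppressed_values if lo <= v < hi]
    if candidates:
        return [rng.choice(candidates) for _ in range(count)]
    # No usable original values in this bucket: random assignment.
    return [rng.uniform(lo, hi) for _ in range(count)]

rng = random.Random(0)
unsuppressed = [1.5, 2.0, 2.5, 7.25]

# Bucket containing unsuppressed originals: no new distinct values appear.
snapped = assign_values(1.0, 3.0, 50, unsuppressed, rng)

# Bucket with no unsuppressed originals: random values, as before.
fallback = assign_values(4.0, 6.0, 50, unsuppressed, rng)
```

A real fix would presumably also need to respect the relative counts of the original values rather than drawing uniformly from them; the sketch ignores that.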
