Confusing levels fallback #54

Open
@aplavin

Looking at the concrete implementation of levels in CategoricalArrays, I see that this function conveniently extracts possible levels from both arrays and individual values:

julia> a = CategoricalArray(["abc", "def", "abc"])
3-element CategoricalArray{String,1,UInt32}:
 "abc"
 "def"
 "abc"

julia> x = a[1]
CategoricalValue{String, UInt32} "abc"

julia> levels(a)
2-element Vector{String}:
 "abc"
 "def"

julia> levels(x)
2-element Vector{String}:
 "abc"
 "def"

However, the fallback implemented in DataAPI itself is only correct for collections, not individual values:

# like unique(x), makes sense?
julia> levels(["abc", "def", "abc"])
2-element Vector{String}:
 "abc"
 "def"

# makes no sense:
julia> levels("abc")
3-element Vector{Char}:
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
 'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)

# especially confusing given that:
julia> x == "abc"
true
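
The Char result follows from how a unique-based fallback treats its argument: a String is itself iterable, and iterating it yields Chars. A minimal sketch, assuming the fallback behaves roughly like a sorted unique(x); naive_levels is a hypothetical stand-in written for illustration, not DataAPI's actual code:

```julia
# Hypothetical stand-in for a generic fallback; NOT DataAPI's actual implementation.
# It treats whatever it is given as an iterable collection.
naive_levels(x) = sort(unique(x))

# A vector of strings is iterated element by element, so the result is the distinct strings.
naive_levels(["abc", "def", "abc"])

# A single string is iterated character by character, so the result is a Vector{Char}.
naive_levels("abc")
```

Under this assumption, nothing distinguishes "a collection of values" from "a single value that happens to be iterable", which is the source of the confusion.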

Maybe the fallback shouldn't exist at all, and something like haslevels(x)::Bool should be added instead?
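
A rough sketch of what such a trait could look like, to make the proposal concrete. Everything here is hypothetical: haslevels does not exist in DataAPI, MyCategoricalValue only stands in for a type like CategoricalValue, and the local levels function stands in for a method that a real package would add to DataAPI.levels:

```julia
# Hypothetical trait: by default, an arbitrary object carries no level information.
haslevels(x) = false

# Hypothetical type standing in for something like a CategoricalValue.
struct MyCategoricalValue
    value::String
    pool::Vector{String}   # all levels known to the parent array
end

# Types that really carry levels opt in explicitly.
haslevels(::MyCategoricalValue) = true

# Local stand-in; a real package would extend DataAPI.levels instead.
levels(x::MyCategoricalValue) = x.pool

# Callers can check the trait instead of relying on a generic fallback.
levels_or_nothing(v) = haslevels(v) ? levels(v) : nothing

x = MyCategoricalValue("abc", ["abc", "def"])
levels_or_nothing(x)      # the pool of the categorical value
levels_or_nothing("abc")  # nothing: a plain String has no levels
```

With an opt-in trait like this, a plain String would report that it has no levels rather than silently returning its characters.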
