Add a query compiler to convert & transform a user query to an execution tree #115

Open
@codein-dev

For example, consider the following user query.
Assume that csv_file.csv has the following metadata (column types):
string,uint,float

csv(csv_file.csv)
| project state = $0 as string, age = $1 as uint, income = $2 as float
| where age >= 20
| project state, age_group = age / 10, income
| group by [state, age_group]
| project state, age_group,
    population = count(*), sum_income = sum(income),
    max_income = max(income), min_income = min(income),
    avg_income = average(income), median_income = median(income)
| order by median_income desc
| limit 10
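
The same query, with the metadata supplied from csv_file_meta.txt in place of the first project step:

csv(csv_file.csv, csv_file_meta.txt)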
| where age >= 20
| project state, age_group = age / 10, income
| group by [state, age_group]
| project state, age_group,
    population = count(*), sum_income = sum(income),
    max_income = max(income), min_income = min(income),
    avg_income = average(income), median_income = median(income)
| order by median_income desc
| limit 10
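
As a rough illustration of the first step such a compiler might take, the sketch below splits the pipe-delimited query text into (operator, arguments) stages. The function name `split_pipeline` and the operator list are assumptions made for this example only; they are taken from the query shown here, not from any existing code in the project.

```python
# Hypothetical operator keywords, taken only from the example query.
OPERATORS = ("group by", "order by", "project", "where", "limit")


def split_pipeline(query: str) -> list[tuple[str, str]]:
    """Split a pipe-delimited query into (operator, argument_text) stages."""
    stages: list[tuple[str, str]] = []
    for raw in query.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("|"):
            # A new stage: match its leading keyword against the known operators.
            body = line[1:].strip()
            for op in OPERATORS:
                if body == op or body.startswith(op + " "):
                    stages.append((op, body[len(op):].strip()))
                    break
            else:
                raise ValueError(f"unknown operator in stage: {body!r}")
        elif not stages:
            # The first line is the data source, e.g. csv(...).
            stages.append(("source", line))
        else:
            # A continuation line belongs to the previous stage.
            op, args = stages[-1]
            stages[-1] = (op, f"{args} {line}")
    return stages


QUERY = """
csv(csv_file.csv, csv_file_meta.txt)
| where age >= 20
| project state, age_group = age / 10, income
| group by [state, age_group]
| project state, age_group,
    population = count(*), sum_income = sum(income),
    max_income = max(income), min_income = min(income),
    avg_income = average(income), median_income = median(income)
| order by median_income desc
| limit 10
"""

stages = split_pipeline(QUERY)
for op, args in stages:
    print(f"{op:10s} {args}")
```

A real compiler would tokenize and parse the expressions inside each stage; this sketch only recovers the stage boundaries.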

The user query can be translated into the following execution tree:

top_n_sort : state, age_group, population, sum_income, max_income, min_income,
      |      avg_income, median_income with limit(10) & sort by median_income desc
      |
      +-- project : state, age_group, population, sum_income, max_income, min_income,
             |      avg_income = sum_income / convert(population, double),
             |      median_income
             |
             +-- hash_agg : group [state, age_group],
                     |      aggregate [population = count(*), sum_income = sum(income),
                     |          max_income = max(income), min_income = min(income),
                     |          median_income = median(income)]
                     |
                     +-- csv_file_scanner : state = $0, age_group = convert($1, uint) / 10,
                                            income = convert($2, float) with filter(age_group >= 2)
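
The tree above implies at least three rewrites: `order by` followed by `limit` becomes a single top_n_sort, `average(income)` is computed as `sum_income / convert(population, double)` in a project above hash_agg, and the `age >= 20` filter is pushed into the scanner as `age_group >= 2`. The sketch below shows one possible way to express these. The helper names `fuse_top_n` and `split_average`, the stage and aggregate representations, and the synthesized column naming are all assumptions for illustration, not the project's design.

```python
def fuse_top_n(stages):
    """Fuse an adjacent ('order by', key), ('limit', n) pair into one top_n_sort stage."""
    out = []
    i = 0
    while i < len(stages):
        op, args = stages[i]
        if op == "order by" and i + 1 < len(stages) and stages[i + 1][0] == "limit":
            out.append(("top_n_sort", {"sort": args, "limit": int(stages[i + 1][1])}))
            i += 2
        else:
            out.append((op, args))
            i += 1
    return out


def split_average(aggregates):
    """Split aggregates into those hash_agg computes and a projection on top.

    `aggregates` maps an output name to a (function, column) pair.
    Assumes no null values, so count(*) can serve as the divisor, as in the tree.
    """
    # Pass 1: keep every aggregate that hash_agg can compute directly.
    direct = {name: call for name, call in aggregates.items() if call[0] != "average"}
    by_call = {call: name for name, call in direct.items()}

    def ensure(func, column):
        # Reuse an existing aggregate if present, else add one under a
        # hypothetical synthesized name.
        call = (func, column)
        if call not in by_call:
            name = f"_{func}_{column}"
            direct[name] = call
            by_call[call] = name
        return by_call[call]

    # Pass 2: build the projection, expanding average into sum / count.
    post = {}
    for name, (func, column) in aggregates.items():
        if func == "average":
            total = ensure("sum", column)
            count = ensure("count", "*")
            post[name] = f"{total} / convert({count}, double)"
        else:
            post[name] = name
    return direct, post


plan = fuse_top_n([("order by", "median_income desc"), ("limit", "10")])

hash_aggs, projection = split_average({
    "population": ("count", "*"),
    "sum_income": ("sum", "income"),
    "max_income": ("max", "income"),
    "min_income": ("min", "income"),
    "avg_income": ("average", "income"),
    "median_income": ("median", "income"),
})

# Filter pushdown: with unsigned integer division, `age >= 20` and
# `age / 10 >= 2` select exactly the same rows.
pushdown_ok = all((age >= 20) == (age // 10 >= 2) for age in range(1000))
```

Here `projection["avg_income"]` comes out as `sum_income / convert(population, double)`, matching the project node in the tree, because the existing `sum(income)` and `count(*)` aggregates are reused rather than duplicated.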
