Skip to content

(feat) Optimizer & Memo Full Logical -> Logical #101

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 55 commits into from
May 13, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
55 commits
Select commit Hold shift + click to select a range
2b6ce6c
Reorg src dir for optimizer
AlSchlo May 7, 2025
32b5601
Add optimizer files
AlSchlo May 7, 2025
32efa8a
Migrate memo types and interfaces (#102)
connortsui20 May 7, 2025
ac99222
Make compile
AlSchlo May 7, 2025
adabaa0
Add missing files
AlSchlo May 7, 2025
946ba74
Make all compile
AlSchlo May 7, 2025
57da3d3
Don't forget missing file
AlSchlo May 7, 2025
9563511
Adapt job executors and add task casters
AlSchlo May 8, 2025
3df4239
Cleanup
AlSchlo May 8, 2025
ec73d00
Add retriever
AlSchlo May 9, 2025
f848ee0
Implement fork logical handler
AlSchlo May 9, 2025
9387646
implement ensure group explore
AlSchlo May 9, 2025
3308cc5
launch jobs
AlSchlo May 9, 2025
1ffebc1
implement ensure goal mvp for log to log
AlSchlo May 9, 2025
b3c7c92
redo memo
AlSchlo May 9, 2025
ba895ff
Reboot memo implementation
AlSchlo May 9, 2025
85c408d
reorg and implement materialize and repr in memeo
AlSchlo May 9, 2025
3bb5a23
get props implemented
AlSchlo May 9, 2025
0f9327a
simplify memo result kiss
AlSchlo May 9, 2025
1df45a7
impl get all logical exprs
AlSchlo May 9, 2025
a143153
impl create group
AlSchlo May 9, 2025
30cf54f
add doc, index, and impl find logical
AlSchlo May 9, 2025
c8b598e
start merge impl
AlSchlo May 9, 2025
783d46b
Make materialized exprs and goals repr
AlSchlo May 10, 2025
f6a21e3
impl merge groups
AlSchlo May 10, 2025
81e2b26
merge works and add tests
AlSchlo May 10, 2025
1448c4a
big memo refactor
AlSchlo May 10, 2025
e4f845b
consolidate merge results
AlSchlo May 10, 2025
8cf416b
add more merge tests
AlSchlo May 10, 2025
4035c4d
add cascading merge test
AlSchlo May 10, 2025
5624fc6
Add extra memo test
AlSchlo May 11, 2025
bb5ee02
Fix memo fuzzing tests
AlSchlo May 11, 2025
4bf6104
update todo list task and remove useless functions
AlSchlo May 12, 2025
c8466d8
start task graph merge
AlSchlo May 12, 2025
dea8f8d
add missing file
AlSchlo May 12, 2025
4fc1e5e
refactor task merge and add consolidation step
AlSchlo May 12, 2025
91ee670
finish merge
AlSchlo May 12, 2025
b21478b
simplify task ir and remove notion of uncompleted
AlSchlo May 12, 2025
eeb873b
generate missing doc
AlSchlo May 12, 2025
ab5561e
address most of clippy
AlSchlo May 13, 2025
89ded5d
replace with faster hashmap and set in optimizer (hashbrown)
AlSchlo May 13, 2025
ba8a27d
add logical channel to result
AlSchlo May 13, 2025
59a3015
remove logical channel
AlSchlo May 13, 2025
a9848b7
Add missing file
AlSchlo May 13, 2025
ecca9bd
increase engine stack size
AlSchlo May 13, 2025
7302a40
Associated Error Types (#104)
connortsui20 May 13, 2025
6b347db
Add * handling and make tutorial compile
AlSchlo May 13, 2025
55a8e76
add * stored type handling and tests
AlSchlo May 13, 2025
b492c2b
merge
AlSchlo May 13, 2025
0dfb321
address connor's nits
AlSchlo May 13, 2025
e74d1b9
Merge branch 'main' into alexis/optimizer
connortsui20 May 13, 2025
b10b985
replace debug asserts into asserts
AlSchlo May 13, 2025
92f66bc
Merge branch 'alexis/optimizer' of github.com:cmu-db/optd into alexis…
AlSchlo May 13, 2025
e101300
union instead of extend
AlSchlo May 13, 2025
c075611
rename get ref to take ref
AlSchlo May 13, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,4 @@

# optd

Query Optimizer Service
Query Optimizer Service
2 changes: 1 addition & 1 deletion optd-cli/examples/higher_order.opt
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
data Logical =
| Operation =
| Binary =
| Arithmetic =
| Arith =
| Add(left: Logical, right: Logical)
| Mul(left: Logical, right: Logical)
| Div(left: Logical, right: Logical)
Expand Down
5 changes: 3 additions & 2 deletions optd-cli/examples/logical_rules.opt
Original file line number Diff line number Diff line change
Expand Up @@ -138,9 +138,10 @@ fn build_calculator_expr() =
const3 = Const(3),
const4 = Const(4),
addition = Add(const2, const3),
multiplication = Mult(addition, const4)
multiplication = Mult(addition, const4),
pow = Pow(multiplication, Const(10)),
in
multiplication.evaluate()
pow.evaluate()

[run]
fn run_mult_commute() =
Expand Down
57 changes: 29 additions & 28 deletions optd-cli/examples/tutorial.opt
Original file line number Diff line number Diff line change
Expand Up @@ -227,7 +227,7 @@ fn (catalog: Catalog) get_table_schema(name: String): Schema

fn (logical: Logical*) properties(): LogicalProperties

fn (costed: Physical$) statistics(): CostedProperties
fn (costed: Physical$) statistics(): Statistics

// -------------------------
// 9. Transformation Rules
Expand All @@ -239,11 +239,11 @@ fn (costed: Physical$) statistics(): CostedProperties
// Helper function for scalar rewrites.
fn (expr: Scalar) remap(bindings: {I64 : I64}): Scalar =
match expr
| ColumnRef(idx) ->
if bindings(idx) != none then // TODO: Fix none bug... Should not be a type! Rather Option<Nothing>.
ColumnRef(0) // TODO: Add ! once we have the `!` syntax.
else
ColumnRef(idx)
| ColumnRef(idx) -> ColumnRef(0)
// if bindings(idx) != none then // TODO: Fix EQ inference: just require EqHash & adapt engine.
// ColumnRef(0) // TODO: Add ! once we have the `!` syntax.
// else
// ColumnRef(idx)
| IntLiteral(value) -> IntLiteral(value)
| StringLiteral(value) -> StringLiteral(value)
| BoolLiteral(value) -> BoolLiteral(value)
Expand Down Expand Up @@ -277,12 +277,12 @@ fn (expr: Logical*) join_commute(): Logical? = match expr
right_indices = 0..right_len,
left_indices = 0..left_len,

remapping = (left_indices.map(i -> (i, i + right_len)) ++
right_indices.map(i -> (i + left_len, i))).to_map(),
remapping = (left_indices.map((i: I64) -> (i, i + right_len)) ++
right_indices.map((i: I64) -> (i + left_len, i))).to_map(),
in
Project(
Join(right, left, Inner, predicate.remap(remapping)),
(right_indices ++ left_indices).map(i -> ColumnRef(i))
(right_indices ++ left_indices).map((i: I64) -> ColumnRef(i))
)
\ _ -> none

Expand All @@ -306,8 +306,8 @@ fn (plan: Logical*) join_associativity(): Logical? = match plan
b_indices = 0..b_len,
c_indices = 0..c_len,

pred_bc_remapping = (b_indices.map(i -> (a_len + i, i)) ++
c_indices.map(i -> (a_len + b_len + i, b_len + i))).to_map(),
pred_bc_remapping = (b_indices.map((i: I64) -> (a_len + i, i)) ++
c_indices.map((i: I64) -> (a_len + b_len + i, b_len + i))).to_map(),

remapped_pred_bc = pred_bc.remap(pred_bc_remapping),

Expand All @@ -316,8 +316,8 @@ fn (plan: Logical*) join_associativity(): Logical? = match plan
b_indices_after = 0..b_len,
c_indices_after = 0..c_len,

pred_ab_remapping = (b_indices_after.map(i -> (a_len + i, a_len + i)) ++
c_indices_after.map(i -> (a_len + b_len + i, a_len + b_len + i))).to_map(),
pred_ab_remapping = (b_indices_after.map((i: I64) -> (a_len + i, a_len + i)) ++
c_indices_after.map((i: I64) -> (a_len + b_len + i, a_len + b_len + i))).to_map(),

remapped_pred_ab = pred_ab.remap(pred_ab_remapping)
in
Expand Down Expand Up @@ -353,10 +353,11 @@ fn (expr: Logical*) impl_filter_enforce(props: PhysicalProperties?): Physical? =
let
result = PhysFilter(child.optimize(none), predicate)
in
if props != none then
PhysSort(result, props#order_by)
else
result
result
// if props != none then
// PhysSort(result, props#order_by) // TODO: same ! problem
// else
// result
\ _ -> none

[implementation]
Expand All @@ -365,14 +366,14 @@ fn (expr: Logical*) impl_filter_passthrough(props: PhysicalProperties?): Physica
| Filter(child, predicate) -> PhysFilter(child.optimize(props), predicate)
\ _ -> none

[implementation]
fn (expr: Logical*) impl_sort(props: PhysicalProperties?): Physical? = match expr
| Sort(child, order_by) ->
if props == none then // TODO: The incoming `?` syntax will make this cleaner
PhysSort(child.optimize(none), order_by)
else
none
\ _ -> none
// [implementation]
// fn (expr: Logical*) impl_sort(props: PhysicalProperties?): Physical? = match expr
// | Sort(child, order_by) ->
// if props == none then // TODO: The incoming `?` syntax will make this cleaner. Also == problem for now.
// PhysSort(child.optimize(none), order_by)
// else
// none
// \ _ -> none

// -------------------------
// 11. Other Required Functions
Expand All @@ -381,14 +382,14 @@ fn (expr: Logical*) impl_sort(props: PhysicalProperties?): Physical? = match exp
// Cost function for physical operators - used by the cost-based optimizer to evaluate
// different physical implementation alternatives.

fn (expr: Physical*) cost(): F64 = 0
fn (expr: Physical*) cost(): F64 = 0.0

// Derive logical properties from a logical plan - propagates schema information and other
// logical properties (like cardinality estimates, uniqueness, or functional dependencies)
// through the logical plan.

fn (log: Logical*) derive(): LogicalProperties = match log
| Get(table_name) -> catalog.get_table_schema(table_name)
fn (log: Logical*) derive(): LogicalProperties? = match log
| Get(table_name) -> LogicalProperties(Catalog.get_table_schema(table_name))
| Filter(child, _) -> child.properties()
| Join(left, right, join_type, _) ->
let
Expand Down
22 changes: 17 additions & 5 deletions optd-cli/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -40,9 +40,10 @@ use optd::dsl::analyzer::hir::{CoreData, HIR, Udf, Value};
use optd::dsl::compile::{Config, compile_hir};
use optd::dsl::engine::{Continuation, Engine, EngineResponse};
use optd::dsl::utils::errors::{CompileError, Diagnose};
use optd::dsl::utils::retriever::{MockRetriever, Retriever};
use std::collections::HashMap;
use std::sync::Arc;
use tokio::runtime::Runtime;
use tokio::runtime::Builder;
use tokio::task::JoinSet;

#[derive(Parser)]
Expand All @@ -66,7 +67,11 @@ enum Commands {
}

/// A unimplemented user-defined function.
pub fn unimplemented_udf(_args: &[Value], _catalog: &dyn Catalog) -> Value {
pub fn unimplemented_udf(
_args: &[Value],
_catalog: &dyn Catalog,
_retriever: &dyn Retriever,
) -> Value {
println!("This user-defined function is unimplemented!");
Value::new(CoreData::<Value>::None)
}
Expand Down Expand Up @@ -122,7 +127,13 @@ fn run_all_functions(hir: &HIR) -> Result<(), Vec<CompileError>> {
println!("Found {} functions to run", functions.len());

// Create a multi-threaded runtime for parallel execution.
let runtime = Runtime::new().unwrap();
// TODO: We increase the stack size by x64 to avoid stack overflow
// given the lack of tail recursion in the engine (yet...)
let runtime = Builder::new_multi_thread()
.thread_stack_size(128 * 1024 * 1024)
.enable_all()
.build()
.unwrap();
let function_results = runtime.block_on(run_functions_in_parallel(hir, functions));

// Process and display function results.
Expand All @@ -139,10 +150,11 @@ fn run_all_functions(hir: &HIR) -> Result<(), Vec<CompileError>> {

async fn run_functions_in_parallel(hir: &HIR, functions: Vec<String>) -> Vec<FunctionResult> {
let catalog = Arc::new(memory_catalog());
let retriever = Arc::new(MockRetriever::new());
let mut set = JoinSet::new();

for function_name in functions {
let engine = Engine::new(hir.context.clone(), catalog.clone());
let engine = Engine::new(hir.context.clone(), catalog.clone(), retriever.clone());
let name = function_name.clone();

set.spawn(async move {
Expand All @@ -151,7 +163,7 @@ async fn run_functions_in_parallel(hir: &HIR, functions: Vec<String>) -> Vec<Fun
Arc::new(|value| Box::pin(async move { value }));

// Launch the function with an empty vector of arguments.
let result = engine.launch_rule(&name, vec![], result_handler).await;
let result = engine.launch(&name, vec![], result_handler).await;
FunctionResult { name, result }
});
}
Expand Down
7 changes: 6 additions & 1 deletion optd/src/dsl/analyzer/from_ast/converter.rs
Original file line number Diff line number Diff line change
Expand Up @@ -182,6 +182,7 @@ mod converter_tests {
use crate::dsl::analyzer::hir::{CoreData, FunKind};
use crate::dsl::analyzer::type_checks::registry::{Generic, TypeKind};
use crate::dsl::parser::ast::{self, Adt, Function, Item, Module, Type as AstType};
use crate::dsl::utils::retriever::Retriever;
use crate::dsl::utils::span::{Span, Spanned};

// Helper functions to create test items
Expand Down Expand Up @@ -381,7 +382,11 @@ mod converter_tests {
let ext_func = create_simple_function("external_function", false);
let module = create_module_with_functions(vec![ext_func]);

pub fn external_function(_args: &[Value], _catalog: &dyn Catalog) -> Value {
pub fn external_function(
_args: &[Value],
_catalog: &dyn Catalog,
_retriever: &dyn Retriever,
) -> Value {
println!("Hello from UDF!");
Value::new(CoreData::<Value>::None)
}
Expand Down
12 changes: 9 additions & 3 deletions optd/src/dsl/analyzer/hir/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@

use super::type_checks::registry::Type;
use crate::catalog::Catalog;
use crate::dsl::utils::retriever::Retriever;
use crate::dsl::utils::span::Span;
use context::Context;
use map::Map;
Expand Down Expand Up @@ -76,12 +77,17 @@ pub struct Udf {
/// The function pointer to the user-defined function.
///
/// Note that [`Value`]s passed to and returned from this UDF do not have associated metadata.
pub func: fn(&[Value], &dyn Catalog) -> Value,
pub func: fn(&[Value], &dyn Catalog, &dyn Retriever) -> Value,
}

impl Udf {
pub fn call(&self, values: &[Value], catalog: &dyn Catalog) -> Value {
(self.func)(values, catalog)
pub fn call(
&self,
values: &[Value],
catalog: &dyn Catalog,
retriever: &dyn Retriever,
) -> Value {
(self.func)(values, catalog, retriever)
}
}

Expand Down
12 changes: 10 additions & 2 deletions optd/src/dsl/analyzer/into_hir/converter.rs
Original file line number Diff line number Diff line change
Expand Up @@ -234,7 +234,12 @@ fn convert_field_access(
) -> ExprKind {
use ExprKind::*;

let expr_type = registry.resolve_type(&expr.metadata.ty);
let mut expr_type = registry.resolve_type(&expr.metadata.ty);

// Unwrap stored type if necessary.
while let TypeKind::Stored(inner_type) = *expr_type.value {
expr_type = inner_type;
}

match &*expr_type.value {
TypeKind::Tuple(_) => {
Expand Down Expand Up @@ -262,7 +267,10 @@ fn convert_field_access(
)
}

_ => panic!("Field access on non-struct, non-tuple type: error in type inference"),
_ => panic!(
"Field access on non-struct, non-tuple type: {:?}",
expr_type
),
}
}

Expand Down
4 changes: 2 additions & 2 deletions optd/src/dsl/analyzer/type_checks/glb.rs
Original file line number Diff line number Diff line change
Expand Up @@ -221,13 +221,13 @@ impl TypeRegistry {
let result = glb_kind.into();

// Verify post-condition in debug mode only
debug_assert!(
assert!(
self.is_subtype(&result, type1),
"GLB post-condition failed: {:?} is not a subtype of {:?}",
result,
type1
);
debug_assert!(
assert!(
self.is_subtype(&result, type2),
"GLB post-condition failed: {:?} is not a subtype of {:?}",
result,
Expand Down
4 changes: 2 additions & 2 deletions optd/src/dsl/analyzer/type_checks/lub.rs
Original file line number Diff line number Diff line change
Expand Up @@ -200,13 +200,13 @@ impl TypeRegistry {
let result = lub_kind.into();

// Verify post-condition in debug mode only
debug_assert!(
assert!(
self.is_subtype(type1, &result),
"LUB post-condition failed: {:?} is not a subtype of {:?}",
type1,
result
);
debug_assert!(
assert!(
self.is_subtype(type2, &result),
"LUB post-condition failed: {:?} is not a subtype of {:?}",
type2,
Expand Down
Loading