Pretty-print in `SHOW CREATE` #31933

ggevay · 2025-03-18T13:46:28Z

This makes SHOW CREATE and SHOW REDACTED CREATE pretty-print the result (e.g., by adding line breaks). (https://github.com/MaterializeInc/database-issues/issues/9078, and slack discussion)

The first commit just does a renaming, and then the next commit is the main thing. I recommend reviewing commit-by-commit, and starting the review of the main commit from humanize_sql_for_show_create. The diff looks somewhat big, but most of the code diff is just threading FormatMode through the sql-pretty crate, which was needed to enable pretty-printing for both the normal SHOW CREATE and for SHOW REDACTED CREATE. Also, there was a staggering amount of manual test rewrites needed, because Testdrive doesn't have auto-rewriting. (I also did some "spring cleaning": deleted some old tests, which were mirrors of other tests but with an old Kafka syntax, as discussed here.)

Note that in addition to pretty-printing, this also changes the output format of SHOW CREATE from FormatMode::Stable to FormatMode::Simple. The main effect of this is less quoting of identifiers: stable mode quotes all identifiers, thus cluttering up the screen quite a bit, while the simple mode quotes only when it's needed. I have made some efforts recently to get the "when it's needed" logic bug-free, so hopefully the simple mode is enough.
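
To make the difference between the two modes concrete, here is a minimal sketch of "quote always" versus "quote only when needed". The `Mode` enum, the keyword list, and the quoting rule are all assumptions made up for illustration; they are not the actual logic in the parser crate.

```rust
/// Toy stand-in for the two format modes (hypothetical, not the real FormatMode).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Mode {
    /// Quote every identifier.
    Stable,
    /// Quote only when the bare identifier would be ambiguous.
    Simple,
}

/// Assumed rule: an identifier can stay bare if it starts with a lowercase
/// ASCII letter or underscore, continues with lowercase letters, digits, or
/// underscores, and is not a keyword. The keyword list is a tiny sample.
fn needs_quoting(ident: &str) -> bool {
    const KEYWORDS: &[&str] = &["select", "from", "table", "create"];
    let mut chars = ident.chars();
    let starts_ok = matches!(chars.next(), Some(c) if c.is_ascii_lowercase() || c == '_');
    let rest_ok = chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_');
    !(starts_ok && rest_ok) || KEYWORDS.contains(&ident)
}

/// Renders an identifier, doubling embedded quotes when it gets quoted.
fn display_ident(ident: &str, mode: Mode) -> String {
    if mode == Mode::Stable || needs_quoting(ident) {
        format!("\"{}\"", ident.replace('"', "\"\""))
    } else {
        ident.to_string()
    }
}
```

Under these assumptions, `kafka_conn` is quoted only in the stable mode, while `My Table` and `select` are quoted in both.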

Motivation

This PR adds a known-desirable feature: https://github.com/MaterializeInc/database-issues/issues/9078, and slack discussion

Tips for reviewer

Checklist

This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

teskje

LGTM! Not sure about the removed SHOW CREATE statements in some of the test files, but I trust they all have good reasons.

The config parameter added to ~all functions in mz-sql-pretty makes me think that we should turn them into methods on a struct that holds the config. But I can understand that you don't want to deal with such a large refactor now.

misc/python/materialize/checks/all_checks/create_table.py

teskje · 2025-03-27T17:31:28Z

test/legacy-upgrade/check-from-v0.111.0-kafka-sink.td

 > SHOW CREATE SINK compression_implicit;
-"materialize.public.compression_implicit" ${expected-compression-implicit-create-sql}
+materialize.public.compression_implicit "CREATE SINK materialize.public.compression_implicit IN CLUSTER quickstart FROM materialize.public.kafka_sink_from INTO KAFKA CONNECTION materialize.public.kafka_conn (TOPIC = 'kafka-sink') FORMAT JSON ENVELOPE DEBEZIUM;"


Dumb question, but why do some of the SHOW CREATE outputs have newlines and some don't?

sql-pretty doesn't support every statement kind. E.g., to_doc has a case for CreateSource, but not for CreateSink.
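
As a rough illustration of why some outputs stay on one line, the sketch below has a dedicated multi-line case for one statement kind and a single-line fallback for everything else. The `Statement` enum and both functions are simplified stand-ins invented for this note, not the real AST or the real `to_doc`.

```rust
/// Simplified stand-in for the statement AST (hypothetical).
enum Statement {
    CreateSource { name: String, clauses: Vec<String> },
    CreateSink { name: String, clauses: Vec<String> },
}

/// Fallback rendering: everything on one line.
fn single_line(stmt: &Statement) -> String {
    let (kind, name, clauses) = match stmt {
        Statement::CreateSource { name, clauses } => ("SOURCE", name, clauses),
        Statement::CreateSink { name, clauses } => ("SINK", name, clauses),
    };
    format!("CREATE {} {} {}", kind, name, clauses.join(" "))
}

/// Pretty rendering: only CreateSource has a dedicated case that breaks
/// lines; every other kind falls through to the single-line form.
/// A trailing semicolon is appended in both cases.
fn to_pretty(stmt: &Statement) -> String {
    let body = match stmt {
        Statement::CreateSource { name, clauses } => {
            let mut out = format!("CREATE SOURCE {}", name);
            for clause in clauses {
                out.push('\n');
                out.push_str(clause);
            }
            out
        }
        other => single_line(other),
    };
    format!("{};", body)
}
```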

teskje · 2025-03-27T17:36:41Z

test/legacy-upgrade/check-from-v0.111.0-kafka-sink.td

 > SHOW CREATE SINK compression_implicit;
-"materialize.public.compression_implicit" ${expected-compression-implicit-create-sql}
+materialize.public.compression_implicit "CREATE SINK materialize.public.compression_implicit IN CLUSTER quickstart FROM materialize.public.kafka_sink_from INTO KAFKA CONNECTION materialize.public.kafka_conn (TOPIC = 'kafka-sink') FORMAT JSON ENVELOPE DEBEZIUM;"


Also, the pretty-printing adds a semicolon where there previously was none. Is this a compatibility concern? I think it would be if people ran SHOW CREATE in their scripts and expected output without a trailing semicolon. Not sure why they would do either, though.

Well, I'd say we should just risk this.

It's probably slightly better to have a semicolon than not to have one, because if you want to paste these statements into some SQL shell, it's slightly easier if they already end with a semicolon.

Btw. mzexplore might actually run into this problem. For example, its cluster cloning functionality might use SHOW CREATE in this way. I'm working on mzexplore, so I'll fix this in follow-up PRs.
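
A script that consumes SHOW CREATE output could be made indifferent to the trailing semicolon with a small normalization step. The helper below is a sketch with a made-up name; it is not taken from mzexplore.

```rust
/// Returns the statement text without trailing whitespace and without a
/// single trailing semicolon, so old and new output compare equal.
/// (Hypothetical helper for illustration.)
fn strip_trailing_semicolon(create_sql: &str) -> &str {
    let trimmed = create_sql.trim_end();
    trimmed
        .strip_suffix(';')
        .map(str::trim_end)
        .unwrap_or(trimmed)
}
```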

ParkMyCar · 2025-03-27T17:07:11Z

src/sql-pretty/src/util.rs

@@ -33,7 +33,7 @@ where

 pub(crate) fn title_comma_separate<'a, F, T, S>(title: S, f: F, v: &'a [T]) -> RcDoc<'a, ()>
 where
-    F: Fn(&'a T) -> RcDoc<'a>,
+    F: FnMut(&'a T) -> RcDoc<'a>,


I'm curious what motivated these changes from Fn to FnMut?

It was due to those ..._mapper functions somehow, but I switched to just having a closure everywhere instead of those ..._mapper functions, so I reverted these back to Fn.
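
For readers unfamiliar with the distinction, the sketch below shows in general terms when each bound matters: a closure that only reads a captured `Copy` config satisfies `Fn`, while a closure that mutates captured state needs `FnMut`. The `Config` type and all function names are invented for illustration and do not mirror the crate's helpers.

```rust
/// Small copyable config, standing in for something like a pretty-print config.
#[derive(Clone, Copy)]
struct Config {
    width: usize,
}

fn render(item: &str, config: Config) -> String {
    format!("{} (width {})", item, config.width)
}

/// Accepts closures that only read their captures.
fn map_all<F>(items: &[&str], f: F) -> Vec<String>
where
    F: Fn(&str) -> String,
{
    items.iter().map(|item| f(*item)).collect()
}

/// Accepts closures that may also mutate their captures.
fn map_all_mut<F>(items: &[&str], mut f: F) -> Vec<String>
where
    F: FnMut(&str) -> String,
{
    items.iter().map(|item| f(*item)).collect()
}

fn demo() -> (Vec<String>, usize) {
    let config = Config { width: 100 };
    // Captures `config` by copy and never mutates: `Fn` is enough.
    let plain = map_all(&["a", "b"], |v| render(v, config));
    // Mutates `calls`, so this closure is only `FnMut`; passing it to
    // `map_all` would be rejected by the compiler.
    let mut calls = 0;
    let counted = map_all_mut(&["a", "b"], |v| {
        calls += 1;
        render(v, config)
    });
    assert_eq!(plain, counted);
    (plain, calls)
}
```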

ParkMyCar · 2025-03-27T17:19:41Z

src/sql-pretty/src/lib.rs

-pub fn to_pretty<T: AstInfo>(stmt: &Statement<T>, width: usize) -> String {
-    format!("{};", to_doc(stmt).pretty(width))
+pub fn to_pretty<T: AstInfo>(stmt: &Statement<T>, config: PrettyConfig) -> String {
+    format!("{};", to_doc(stmt, config).pretty(config.width))


Threading a PrettyConfig into the doc* functions doesn't seem right?

See discussion here.

ParkMyCar · 2025-03-27T17:43:16Z

src/sql/src/plan/statement/show.rs

+    Ok(mz_sql_pretty::to_pretty(
+        &resolved,
+        PrettyConfig {
+            width: mz_sql_pretty::DEFAULT_WIDTH,
+            format_mode: if redacted {
+                FormatMode::SimpleRedacted
+            } else {
+                FormatMode::Simple
+            },
+        },
+    ))


It seems like a large part of this change is because we need to plumb the FormatMode into the sql-pretty crate?

IMO pretty printing shouldn't need to know about whether or not something is redacted. What do you think about something like:

    let raw_str = if redacted {
        resolved.to_ast_string_redacted()
    } else {
        resolved.to_ast_string_stable()
    };
    Ok(mz_sql_pretty::pretty_str(&raw_str))

Unfortunate that we need to format the string twice, but given this is only for SHOW CREATE I don't feel too bad about it? In a future world it feels like there is an API we could introduce for the mz-sql-pretty crate that would allow us to wrap a statement in some context which would automatically format values as redacted if necessary.

IMO pretty printing shouldn't need to know about whether or not something is redacted.

This didn't occur to me during the review, but I agree!

There is the issue that we might not be able to parse a redacted statement back. This is because maybe the parser expects a number somewhere, but then it gets something like <redacted>. https://github.com/MaterializeInc/database-issues/issues/8796 aims to solve this problem, but there might be a long tail of cases to solve there, so in the meantime we'd have to have some error handling after the pretty_str call, and just print the non-pretty statement if the parsing back errors out. This is doable, but maybe it tips the balance in favor of the PR's current approach. What do you think?
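
The error handling described here could look roughly like the sketch below. `toy_pretty_str` is a made-up stand-in that "fails to parse" any input containing `<redacted>`; the real crate's function, signature, and error type are not shown in this thread and are not assumed here.

```rust
/// Hypothetical parse error for the toy pretty-printer below.
#[derive(Debug)]
struct ParseError(String);

/// Toy pretty-printer: pretends that parsing fails on redacted input and
/// otherwise breaks the line before FROM. Purely illustrative.
fn toy_pretty_str(sql: &str) -> Result<String, ParseError> {
    if sql.contains("<redacted>") {
        Err(ParseError(format!("cannot parse back: {}", sql)))
    } else {
        Ok(sql.replace(" FROM ", "\nFROM "))
    }
}

/// Pretty-print when the statement parses back; otherwise fall back to
/// the non-pretty string unchanged.
fn pretty_or_raw(raw: &str) -> String {
    toy_pretty_str(raw).unwrap_or_else(|_| raw.to_string())
}
```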

One more consideration:

There are various aspects of AST printing that we want to control (see https://github.com/MaterializeInc/database-issues/issues/9082). If we want to keep mz-sql-pretty oblivious to all of them, and just use Parker's trick for adding redaction on top of any of the other AST printing options, then a problem is that mz-sql-pretty might undo some of the other formatting options when it calls AstDisplay as its "base case" with its hardwired FormatMode.

For example, imagine that we'd like to pretty-print in FormatMode::Stable. The above trick can't be adopted for this: if we first print with FormatMode::Stable, then parse back, then run mz-sql-pretty, then the problem is that mz-sql-pretty has FormatMode::Simple hardwired into its own calls of AstDisplay (which it does when it can't or doesn't want to deal with further chunking up an AST fragment), so it undoes the earlier FormatMode::Stable and just prints in FormatMode::Simple.

This wouldn't be a concern for this particular PR (because of 1. redaction not being undone by a hardwired FormatMode and 2. not involving FormatMode::Stable), but I think in the future it would be great to make all formatting options orthogonal (https://github.com/MaterializeInc/database-issues/issues/9082), for which it seems to me that we'd have to wire FormatMode through mz-sql-pretty, to avoid mz-sql-pretty undoing some formatting option by using its hardwired FormatMode.
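
The "undoing" problem can be shown with a toy model in which the only formatting difference is whether identifiers are quoted. Everything here (the `Mode` enum, `parse_ident`, both pretty functions) is invented for illustration and is far simpler than the real crates.

```rust
/// Toy format modes: Stable quotes identifiers, Simple leaves them bare.
#[derive(Clone, Copy)]
enum Mode {
    Simple,
    Stable,
}

fn display_ident(name: &str, mode: Mode) -> String {
    match mode {
        Mode::Simple => name.to_string(),
        Mode::Stable => format!("\"{}\"", name),
    }
}

/// Toy "parser": recovers the identifier, discarding how it was quoted.
fn parse_ident(printed: &str) -> &str {
    printed.trim_matches('"')
}

/// Pretty-printer whose base case has the mode hardwired to Simple.
fn pretty_hardwired(idents: &[&str]) -> String {
    let parts: Vec<String> = idents
        .iter()
        .map(|i| display_ident(i, Mode::Simple))
        .collect();
    parts.join(",\n")
}

/// Pretty-printer that receives the mode from its caller.
fn pretty_threaded(idents: &[&str], mode: Mode) -> String {
    let parts: Vec<String> = idents.iter().map(|i| display_ident(i, mode)).collect();
    parts.join(",\n")
}
```

In this model, printing in the stable mode, parsing back, and then calling `pretty_hardwired` loses the quotes, whereas `pretty_threaded` with `Mode::Stable` keeps them.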

Proceeding with the current implementation works for me! I'll put some more thought into how we could refactor this, but I don't want to block on it.

Thanks for thinking through this, Gabor!

antiguru

Left some comments inline.

I'll approve, but I think there could be more work done to improve the PR.

antiguru · 2025-03-27T19:12:27Z

src/sql-parser/src/ast/display.rs

+    fn to_ast_string_simple(&self) -> String {
+        self.to_ast_string(FormatMode::Simple)


I think to reduce noise, it'd make sense not to rename this function.

I renamed it after finding myself jumping to the definition repeatedly to see what FormatMode it uses. Now it's clear from the name.

Note that the renaming is separated into its own commit (as mentioned in the PR description). This way it doesn't really add noise when reviewing: One can look at the commits individually, and all the diff that is in the renaming commit doesn't need reviewing, just the name itself.

antiguru · 2025-03-27T19:14:10Z

src/sql-pretty/src/doc.rs

+fn doc_display_pass_mapper<T: AstDisplay>(
+    config: PrettyConfig,
+) -> impl for<'b> FnMut(&'b T) -> RcDoc<'b, ()> {
+    move |v| doc_display_pass(v, config)
+}
+
+pub(crate) fn doc_create_source<T: AstInfo>(
+    v: &CreateSourceStatement<T>,
+    config: PrettyConfig,
+) -> RcDoc {


It seems the PR needs to introduce a bunch of complexity to work around the existing code structure. It strikes me as potentially the wrong approach: Why don't we convert the freestanding functions to functions on a type instead? That way we wouldn't need to pass the config everywhere.

Yeah, I'll probably do this. I don't think it will really reduce the complexity, because the self parameter will just take the place of the current config parameter, but should help readability a bit.

(But first, I'd like to resolve the question of whether we even need to plumb FormatMode through the sql-pretty crate. If not, then most of the changes to sql-pretty can simply be reverted.)

I've done this now in the "Refactor sql-pretty: pass around PrettyConfig as &self." commit. Passes the config in &self.

(The commit also removed the ..._mapper functions. I just have closures everywhere now.)
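
The shape of that refactor, reduced to a sketch: the config lives in a struct and the former free functions become methods that read it through `&self`. The type names echo the ones in this thread, but the fields, the method bodies, and the `<redacted>` placeholder are simplified assumptions, not the crate's code.

```rust
/// Simplified stand-ins; the real types have more variants and fields.
#[derive(Clone, Copy, Debug)]
enum FormatMode {
    Simple,
    SimpleRedacted,
}

#[derive(Clone, Copy, Debug)]
struct PrettyConfig {
    width: usize,
    format_mode: FormatMode,
}

/// Holds the config so that helper methods need no extra parameter.
struct Pretty {
    config: PrettyConfig,
}

impl Pretty {
    /// Renders a string literal, hiding its value in the redacted mode.
    fn doc_literal(&self, value: &str) -> String {
        match self.config.format_mode {
            FormatMode::Simple => format!("'{}'", value),
            FormatMode::SimpleRedacted => "'<redacted>'".to_string(),
        }
    }

    /// Renders `(K = 'v', ...)` on one line if it fits the configured
    /// width, otherwise one option per line.
    fn doc_options(&self, options: &[(&str, &str)]) -> String {
        let parts: Vec<String> = options
            .iter()
            .map(|(k, v)| format!("{} = {}", k, self.doc_literal(v)))
            .collect();
        let one_line = format!("({})", parts.join(", "));
        if one_line.len() <= self.config.width {
            one_line
        } else {
            format!("(\n    {}\n)", parts.join(",\n    "))
        }
    }
}
```

Closures passed to helper functions can then capture `self` by reference, which is one way the separate `..._mapper` functions become unnecessary.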

antiguru · 2025-03-27T19:17:26Z

src/sql/src/func.rs

@@ -3838,7 +3838,7 @@ pub static MZ_CATALOG_BUILTINS: LazyLock<BTreeMap<&'static str, Func>> = LazyLoc
        "pretty_sql" => Scalar {
            params!(String, Int32) => BinaryFunc::PrettySql => String, oid::FUNC_PRETTY_SQL;
            params!(String) => Operation::unary(|_ecx, s| {
-                let width = HirScalarExpr::literal(Datum::Int32(100), ScalarType::Int32);
+                let width = HirScalarExpr::literal(Datum::Int32(mz_sql_pretty::DEFAULT_WIDTH.try_into().expect("must fit")), ScalarType::Int32);


You could make the constant a 32-bit number, which should be plenty enough. Then you can do cheap up conversion and don't have to unwrap here.

Good idea, will do!

Actually, unfortunately it's not so simple, because then the other use of DEFAULT_WIDTH, which gives it to PrettyConfig::width, would need to convert from i32 to usize. I could make PrettyConfig::width also an i32, but all these widths being signed types would look weird.

I think the original problem is that the pretty_sql scalar function takes a signed width. We could also change that, but changing the parameter types of an existing scalar function is probably more hassle than this is worth.

So, I'd like to just stay with these widths being unsigned types (as they should be, since a width can't be negative), and just work around the problem that pretty_sql takes an Int32 by doing a conversion only in this one spot where I'm giving the default width to this function.

You could also make it a u16. But I'm ambivalent about whether or not it makes sense to choose a slightly "weird" type to avoid an expect.
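
The trade-off between the two options can be seen with standard-library conversions alone. The constant names below are placeholders for this sketch; only the value 100 comes from the diff above.

```rust
/// Option kept in the PR: an unsigned, pointer-sized width.
const DEFAULT_WIDTH: usize = 100;

/// Alternative raised in review: a u16 converts losslessly in both directions.
const DEFAULT_WIDTH_U16: u16 = 100;

/// usize -> i32 is fallible, so it needs `try_from` plus an `expect`.
fn width_as_i32_checked() -> i32 {
    i32::try_from(DEFAULT_WIDTH).expect("must fit")
}

/// u16 -> i32 is infallible: `From` is implemented, no unwrap needed.
fn width_as_i32_lossless() -> i32 {
    i32::from(DEFAULT_WIDTH_U16)
}

/// u16 -> usize is infallible as well.
fn width_as_usize_lossless() -> usize {
    usize::from(DEFAULT_WIDTH_U16)
}
```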

ggevay · 2025-03-31T14:11:23Z

The "Refactor sql-pretty: pass around PrettyConfig as &self." commit did the refactoring that @teskje and @antiguru suggested.

The only remaining open question is whether we even need to pass around the config. I think we do, see here.

def- · 2025-04-10T00:51:37Z

@ggevay This is very messy to backport, is it really required?

ggevay · 2025-04-11T16:45:58Z

(We can live without it, as discussed on Slack.)

ggevay added the A-ADAPTER (Topics related to the ADAPTER layer) and self-managed-backport-v25.1 (Needs to be backported into the v25.1 self-managed release) labels Mar 18, 2025

ggevay force-pushed the show-create-pretty branch 15 times, most recently from b5ec1a6 to 07d8bd6 Compare March 25, 2025 12:53

ggevay force-pushed the show-create-pretty branch 5 times, most recently from 6ebb194 to b461943 Compare March 27, 2025 13:40

ggevay marked this pull request as ready for review March 27, 2025 13:41

ggevay requested review from a team as code owners March 27, 2025 13:41

ggevay requested a review from aljoscha March 27, 2025 13:41

ggevay force-pushed the show-create-pretty branch 2 times, most recently from 43fadec to a114777 Compare March 27, 2025 14:16

ggevay force-pushed the show-create-pretty branch 3 times, most recently from 8ddde38 to e4bcda0 Compare March 27, 2025 17:32

teskje approved these changes Mar 27, 2025

ParkMyCar reviewed Mar 27, 2025

ggevay force-pushed the show-create-pretty branch from e4bcda0 to da4b626 Compare March 27, 2025 18:32

antiguru approved these changes Mar 27, 2025

ggevay force-pushed the show-create-pretty branch 4 times, most recently from 7945e0c to 592ff6a Compare March 31, 2025 14:03

ggevay force-pushed the show-create-pretty branch from 592ff6a to 4c52447 Compare March 31, 2025 16:08

ggevay added 3 commits April 1, 2025 15:35

parser: Rename to_ast_string to to_ast_string_simple

1eb4d02

Pretty-print in SHOW CREATE and in SHOW REDACTED CREATE

f42e9f2

Refactor sql-pretty: pass around PrettyConfig as &self.

5e6278c

ggevay force-pushed the show-create-pretty branch from 4c52447 to 63444e2 Compare April 1, 2025 13:38

ParkMyCar approved these changes Apr 1, 2025

ggevay force-pushed the show-create-pretty branch from 63444e2 to 5ab09a6 Compare April 1, 2025 15:23

Version guards in SHOW CREATE tests

2f3f888

ggevay force-pushed the show-create-pretty branch from 5ab09a6 to 2f3f888 Compare April 1, 2025 16:06

ggevay merged commit 4a98668 into MaterializeInc:main Apr 1, 2025
219 of 249 checks passed

def- removed the self-managed-backport-v25.1 (Needs to be backported into the v25.1 self-managed release) label Apr 10, 2025
