Skip to content

Conversation

@apoelstra
Copy link
Member

As a step toward rewriting the expression parser to be non-recursive, add a pre-parsing well-formedness check, which verifies that an expression is well-formed, uses only the correct character set, and that the checksum (if present) is valid.

Along the way, replace the error types returned from the expression module with a new more-precise one which can identify the location of parse errors (and identify what the error was in a more correct way). This improves our error reporting and drops many instances of the stringly-typed BadDescriptor error.

There is currently a bunch of special logic for Taproot descriptors which have the extra characters { and }. To the extent possible, this PR doesn't touch that logic. It will be addressed in a later PR.

The benchmarks show a slight slowdown since we essentially added new validation logic without removing the old logic. Later PRs will improve things.

This creates a new checksum::Error type. For now it sticks it into the
giant global error enum, but we will fold it into an expression-parsing
enum in the following commits.

This eliminates several instances of the Error::BadDescriptor variant,
which is a meaningless stringly-typed variant that we'd like to
eliminate.
This is really useful for testing.
When we convert the parser to be non-recursive, it will be simpler if we
first do a pass to confirm that the expression is well-formed (all
parens match and are balanced, commas in the right place, etc.). After
that we can assume the string is well-formed and write an algorithm that
basically just translates it into a data structure.

Conveniently, because this check is independent of the parsing
algorithm, we can add it to the existing code as a pre-check.

Since all errors during parsing should occur during the pre-check, we
can then turn the existing stringly-typed errors into unreachable!()s.
By fuzzing the resulting code we can guarantee that our new pre-check
is at least as strict as the existing parser.

In a later PR we will generalize the pre-check a bit so that it treats
*both* () and {} as parentheses, and markes tree nodes based on which
parens are used. But until we change the Taproot parser (which is is an
ad-hoc mess of combination manual string parsing and tree parsing, and
expects tapleaves to show up as the "name"s of childless nodes in a tree
of {} combinators) we can't effectively do this.

There is a new unit test `parse_tree_taproot` that will track this
progress.
Right now our character validation logic is spread all over the place.
We have `expression::check_valid_chars` and `checksum::verify_checksum`
and also `checksum::Engine::input` which all do the same checks,
sharing the public array `expression::VALID_CHARS` which is arguably an
implementation detail and shouldn't be exposed.

In fact, whenever we parse any Miniscript object (a script, a
descriptor, a policy of either type), we are first validating the
checksum and then parsing into a tree, and our tree-parsing re-does
character validation that was already done as part of the checksum
check.

This commit moves the checksum error logic into expression, where it
belongs, and folds the checksum error variant into the expression
parsing error, where it belongs (removing it from the mega-enum at the
root).

However it leaves the Taproot parsing alone, which is still a bit of a
mess and will be addressed in a later PR.

Benchmarks show that this significantly slows down expression parsing
(as expected, since it is now doing checksumming) but significantly
speeds up descriptor parsing (which is no longer doing multiple checks
for character validity).
@apoelstra
Copy link
Member Author

Benchmarks on master
test benchmarks::parse_descriptor_balanced_segwit_a_0         ... bench:         416.16 ns/iter (+/- 6.42)
test benchmarks::parse_descriptor_balanced_segwit_b_1         ... bench:       6,355.02 ns/iter (+/- 77.32)
test benchmarks::parse_descriptor_balanced_segwit_c_10        ... bench:      67,655.17 ns/iter (+/- 2,277.04)
test benchmarks::parse_descriptor_balanced_segwit_d_20        ... bench:     134,077.92 ns/iter (+/- 997.23)
test benchmarks::parse_descriptor_balanced_segwit_e_40        ... bench:     269,365.70 ns/iter (+/- 886.21)
test benchmarks::parse_descriptor_balanced_segwit_f_60        ... bench:     404,551.50 ns/iter (+/- 1,189.81)
test benchmarks::parse_descriptor_balanced_segwit_g_80        ... bench:     538,862.60 ns/iter (+/- 2,835.53)
test benchmarks::parse_descriptor_balanced_segwit_h_90        ... bench:     607,355.80 ns/iter (+/- 2,490.32)
test benchmarks::parse_descriptor_balanced_segwit_thresh_a_1  ... bench:       6,348.42 ns/iter (+/- 33.81)
test benchmarks::parse_descriptor_balanced_segwit_thresh_b_10 ... bench:      71,440.19 ns/iter (+/- 193.71)
test benchmarks::parse_descriptor_balanced_segwit_thresh_c_20 ... bench:     142,918.70 ns/iter (+/- 662.25)
test benchmarks::parse_descriptor_balanced_segwit_thresh_d_40 ... bench:     287,775.83 ns/iter (+/- 2,069.47)
test benchmarks::parse_descriptor_balanced_segwit_thresh_e_60 ... bench:     433,455.70 ns/iter (+/- 1,727.61)
test benchmarks::parse_descriptor_balanced_segwit_thresh_f_80 ... bench:     581,765.00 ns/iter (+/- 3,346.93)
test benchmarks::parse_descriptor_balanced_segwit_thresh_g_90 ... bench:     652,067.80 ns/iter (+/- 2,709.12)
test benchmarks::parse_descriptor_deep_segwit_a_0             ... bench:         415.08 ns/iter (+/- 4.99)
test benchmarks::parse_descriptor_deep_segwit_b_1             ... bench:       6,357.24 ns/iter (+/- 36.34)
test benchmarks::parse_descriptor_deep_segwit_c_10            ... bench:      67,136.06 ns/iter (+/- 211.33)
test benchmarks::parse_descriptor_deep_segwit_d_20            ... bench:     134,446.55 ns/iter (+/- 400.57)
test benchmarks::parse_descriptor_deep_segwit_e_40            ... bench:     269,131.93 ns/iter (+/- 977.93)
test benchmarks::parse_descriptor_deep_segwit_f_60            ... bench:     404,084.65 ns/iter (+/- 1,177.09)
test benchmarks::parse_descriptor_deep_segwit_g_80            ... bench:     539,062.10 ns/iter (+/- 2,893.17)
test benchmarks::parse_descriptor_deep_segwit_h_90            ... bench:     607,630.40 ns/iter (+/- 2,915.79)
test benchmarks::parse_descriptor_deep_segwit_thresh_a_1      ... bench:       6,366.46 ns/iter (+/- 35.91)
test benchmarks::parse_descriptor_deep_segwit_thresh_b_10     ... bench:      72,475.06 ns/iter (+/- 285.33)
test benchmarks::parse_descriptor_deep_segwit_thresh_c_20     ... bench:     145,826.17 ns/iter (+/- 496.26)
test benchmarks::parse_descriptor_deep_segwit_thresh_d_40     ... bench:     293,380.37 ns/iter (+/- 1,074.54)
test benchmarks::parse_descriptor_deep_segwit_thresh_e_60     ... bench:     441,332.50 ns/iter (+/- 1,871.12)
test benchmarks::parse_descriptor_deep_segwit_thresh_f_80     ... bench:     588,580.30 ns/iter (+/- 3,422.26)
test benchmarks::parse_descriptor_deep_segwit_thresh_g_90     ... bench:     663,703.40 ns/iter (+/- 2,647.93)
test benchmarks::parse_descriptor_tr_bigtree_a_1              ... bench:      12,838.14 ns/iter (+/- 83.12)
test benchmarks::parse_descriptor_tr_bigtree_b_2              ... bench:      19,706.84 ns/iter (+/- 69.16)
test benchmarks::parse_descriptor_tr_bigtree_c_5              ... bench:      40,929.19 ns/iter (+/- 156.82)
test benchmarks::parse_descriptor_tr_bigtree_d_10             ... bench:      76,067.56 ns/iter (+/- 319.61)
test benchmarks::parse_descriptor_tr_bigtree_e_20             ... bench:     146,978.90 ns/iter (+/- 643.04)
test benchmarks::parse_descriptor_tr_bigtree_f_50             ... bench:     360,990.35 ns/iter (+/- 1,991.57)
test benchmarks::parse_descriptor_tr_bigtree_g_100            ... bench:     720,819.40 ns/iter (+/- 2,811.11)
test benchmarks::parse_descriptor_tr_bigtree_h_200            ... bench:   1,449,319.70 ns/iter (+/- 6,380.12)
test benchmarks::parse_descriptor_tr_bigtree_i_500            ... bench:   3,664,701.80 ns/iter (+/- 50,889.13)
test benchmarks::parse_descriptor_tr_bigtree_j_1000           ... bench:   7,340,243.75 ns/iter (+/- 95,284.79)
test benchmarks::parse_descriptor_tr_bigtree_k_2000           ... bench:  14,771,469.40 ns/iter (+/- 324,259.32)
test benchmarks::parse_descriptor_tr_bigtree_l_5000           ... bench:  36,976,883.80 ns/iter (+/- 1,047,557.69)
test benchmarks::parse_descriptor_tr_bigtree_m_10000          ... bench:  73,927,144.20 ns/iter (+/- 2,017,982.32)
test benchmarks::parse_descriptor_tr_deep_bigtree_a_1         ... bench:      12,872.21 ns/iter (+/- 45.01)
test benchmarks::parse_descriptor_tr_deep_bigtree_b_2         ... bench:      19,707.98 ns/iter (+/- 69.91)
test benchmarks::parse_descriptor_tr_deep_bigtree_c_5         ... bench:      40,768.39 ns/iter (+/- 140.55)
test benchmarks::parse_descriptor_tr_deep_bigtree_d_10        ... bench:      75,898.69 ns/iter (+/- 344.72)
test benchmarks::parse_descriptor_tr_deep_bigtree_e_20        ... bench:     146,889.58 ns/iter (+/- 534.29)
test benchmarks::parse_descriptor_tr_deep_bigtree_f_50        ... bench:     359,533.50 ns/iter (+/- 1,510.57)
test benchmarks::parse_descriptor_tr_deep_bigtree_g_100       ... bench:     716,975.90 ns/iter (+/- 2,332.70)
test benchmarks::parse_descriptor_tr_deep_bigtree_h_128       ... bench:     918,521.80 ns/iter (+/- 3,519.31)
test benchmarks::parse_descriptor_tr_deep_oneleaf_a_1         ... bench:      12,838.85 ns/iter (+/- 52.24)
test benchmarks::parse_descriptor_tr_deep_oneleaf_b_10        ... bench:      79,654.23 ns/iter (+/- 334.61)
test benchmarks::parse_descriptor_tr_deep_oneleaf_c_20        ... bench:     157,752.03 ns/iter (+/- 1,635.82)
test benchmarks::parse_descriptor_tr_deep_oneleaf_d_50        ... bench:     384,383.35 ns/iter (+/- 1,913.53)
test benchmarks::parse_descriptor_tr_deep_oneleaf_e_100       ... bench:     775,323.40 ns/iter (+/- 2,590.63)
test benchmarks::parse_descriptor_tr_deep_oneleaf_f_200       ... bench:   1,578,078.10 ns/iter (+/- 4,743.13)
test benchmarks::parse_descriptor_tr_oneleaf_a_1              ... bench:      12,860.97 ns/iter (+/- 44.70)
test benchmarks::parse_descriptor_tr_oneleaf_b_10             ... bench:      79,453.65 ns/iter (+/- 121.79)
test benchmarks::parse_descriptor_tr_oneleaf_c_20             ... bench:     157,421.55 ns/iter (+/- 517.65)
test benchmarks::parse_descriptor_tr_oneleaf_d_50             ... bench:     385,246.50 ns/iter (+/- 1,549.40)
test benchmarks::parse_descriptor_tr_oneleaf_e_100            ... bench:     778,200.90 ns/iter (+/- 6,522.65)
test benchmarks::parse_descriptor_tr_oneleaf_f_200            ... bench:   1,586,099.80 ns/iter (+/- 9,015.25)
test benchmarks::parse_descriptor_tr_oneleaf_g_500            ... bench:   4,034,182.20 ns/iter (+/- 26,087.10)
test benchmarks::parse_descriptor_tr_oneleaf_h_1000           ... bench:   8,195,196.00 ns/iter (+/- 13,130.05)
test benchmarks::parse_descriptor_tr_oneleaf_i_2000           ... bench:  16,536,516.10 ns/iter (+/- 16,851.49)
test benchmarks::parse_descriptor_tr_oneleaf_j_5000           ... bench:  47,157,298.00 ns/iter (+/- 155,154.44)
test benchmarks::parse_descriptor_tr_oneleaf_k_10000          ... bench:  99,418,389.10 ns/iter (+/- 331,785.09)
test benchmarks::parse_expression_balanced_a_0                ... bench:          59.06 ns/iter (+/- 7.66)
test benchmarks::parse_expression_balanced_b_1                ... bench:         203.90 ns/iter (+/- 6.48)
test benchmarks::parse_expression_balanced_c_2                ... bench:         417.93 ns/iter (+/- 10.63)
test benchmarks::parse_expression_balanced_d_5                ... bench:       1,094.49 ns/iter (+/- 14.86)
test benchmarks::parse_expression_balanced_e_10               ... bench:       2,457.32 ns/iter (+/- 31.48)
test benchmarks::parse_expression_balanced_f_20               ... bench:       5,290.49 ns/iter (+/- 46.03)
test benchmarks::parse_expression_balanced_g_50               ... bench:      13,243.76 ns/iter (+/- 65.15)
test benchmarks::parse_expression_balanced_h_100              ... bench:      26,363.25 ns/iter (+/- 207.25)
test benchmarks::parse_expression_balanced_i_200              ... bench:      51,766.08 ns/iter (+/- 394.52)
test benchmarks::parse_expression_balanced_j_500              ... bench:     128,949.43 ns/iter (+/- 860.85)
test benchmarks::parse_expression_balanced_k_1000             ... bench:     257,794.87 ns/iter (+/- 2,883.75)
test benchmarks::parse_expression_balanced_l_2000             ... bench:     513,741.90 ns/iter (+/- 8,375.77)
test benchmarks::parse_expression_balanced_m_5000             ... bench:   1,308,476.00 ns/iter (+/- 12,559.91)
test benchmarks::parse_expression_balanced_n_10000            ... bench:   2,625,852.10 ns/iter (+/- 20,333.94)
test benchmarks::parse_expression_deep_a_0                    ... bench:          58.98 ns/iter (+/- 3.02)
test benchmarks::parse_expression_deep_b_1                    ... bench:         206.49 ns/iter (+/- 5.66)
test benchmarks::parse_expression_deep_c_2                    ... bench:         420.99 ns/iter (+/- 13.47)
test benchmarks::parse_expression_deep_d_5                    ... bench:       1,116.10 ns/iter (+/- 12.39)
test benchmarks::parse_expression_deep_e_10                   ... bench:       2,433.14 ns/iter (+/- 13.96)
test benchmarks::parse_expression_deep_f_20                   ... bench:       5,239.88 ns/iter (+/- 61.44)
test benchmarks::parse_expression_deep_g_50                   ... bench:      13,136.91 ns/iter (+/- 109.53)
test benchmarks::parse_expression_deep_h_100                  ... bench:      25,395.95 ns/iter (+/- 175.86)
test benchmarks::parse_expression_deep_i_200                  ... bench:      50,030.04 ns/iter (+/- 473.71)
test benchmarks::parse_expression_deep_j_300                  ... bench:      74,139.48 ns/iter (+/- 469.20)
test benchmarks::parse_expression_deep_j_400                  ... bench:      98,320.38 ns/iter (+/- 384.30)

Benchmarks on this PR
test benchmarks::parse_descriptor_balanced_segwit_a_0         ... bench:         470.64 ns/iter (+/- 5.17)
test benchmarks::parse_descriptor_balanced_segwit_b_1         ... bench:       6,507.13 ns/iter (+/- 173.90)
test benchmarks::parse_descriptor_balanced_segwit_c_10        ... bench:      67,962.58 ns/iter (+/- 419.13)
test benchmarks::parse_descriptor_balanced_segwit_d_20        ... bench:     137,178.02 ns/iter (+/- 1,526.67)
test benchmarks::parse_descriptor_balanced_segwit_e_40        ... bench:     273,251.57 ns/iter (+/- 1,504.49)
test benchmarks::parse_descriptor_balanced_segwit_f_60        ... bench:     410,797.15 ns/iter (+/- 2,415.95)
test benchmarks::parse_descriptor_balanced_segwit_g_80        ... bench:     549,857.00 ns/iter (+/- 4,435.50)
test benchmarks::parse_descriptor_balanced_segwit_h_90        ... bench:     618,257.10 ns/iter (+/- 4,681.65)
test benchmarks::parse_descriptor_balanced_segwit_thresh_a_1  ... bench:       6,488.73 ns/iter (+/- 45.94)
test benchmarks::parse_descriptor_balanced_segwit_thresh_b_10 ... bench:      72,281.43 ns/iter (+/- 4,169.73)
test benchmarks::parse_descriptor_balanced_segwit_thresh_c_20 ... bench:     145,943.23 ns/iter (+/- 1,094.08)
test benchmarks::parse_descriptor_balanced_segwit_thresh_d_40 ... bench:     292,546.07 ns/iter (+/- 1,783.06)
test benchmarks::parse_descriptor_balanced_segwit_thresh_e_60 ... bench:     440,255.40 ns/iter (+/- 2,509.42)
test benchmarks::parse_descriptor_balanced_segwit_thresh_f_80 ... bench:     589,903.90 ns/iter (+/- 3,837.71)
test benchmarks::parse_descriptor_balanced_segwit_thresh_g_90 ... bench:     663,483.80 ns/iter (+/- 5,137.50)
test benchmarks::parse_descriptor_deep_segwit_a_0             ... bench:         475.30 ns/iter (+/- 10.35)
test benchmarks::parse_descriptor_deep_segwit_b_1             ... bench:       6,484.72 ns/iter (+/- 34.62)
test benchmarks::parse_descriptor_deep_segwit_c_10            ... bench:      68,004.62 ns/iter (+/- 251.59)
test benchmarks::parse_descriptor_deep_segwit_d_20            ... bench:     136,546.62 ns/iter (+/- 470.84)
test benchmarks::parse_descriptor_deep_segwit_e_40            ... bench:     273,568.80 ns/iter (+/- 1,417.77)
test benchmarks::parse_descriptor_deep_segwit_f_60            ... bench:     410,670.95 ns/iter (+/- 2,125.68)
test benchmarks::parse_descriptor_deep_segwit_g_80            ... bench:     546,762.00 ns/iter (+/- 2,718.21)
test benchmarks::parse_descriptor_deep_segwit_h_90            ... bench:     616,478.40 ns/iter (+/- 2,942.39)
test benchmarks::parse_descriptor_deep_segwit_thresh_a_1      ... bench:       6,477.58 ns/iter (+/- 72.33)
test benchmarks::parse_descriptor_deep_segwit_thresh_b_10     ... bench:      73,126.25 ns/iter (+/- 630.02)
test benchmarks::parse_descriptor_deep_segwit_thresh_c_20     ... bench:     147,223.90 ns/iter (+/- 497.92)
test benchmarks::parse_descriptor_deep_segwit_thresh_d_40     ... bench:     296,325.40 ns/iter (+/- 1,412.45)
test benchmarks::parse_descriptor_deep_segwit_thresh_e_60     ... bench:     445,616.80 ns/iter (+/- 2,389.77)
test benchmarks::parse_descriptor_deep_segwit_thresh_f_80     ... bench:     596,943.60 ns/iter (+/- 3,919.85)
test benchmarks::parse_descriptor_deep_segwit_thresh_g_90     ... bench:     674,759.80 ns/iter (+/- 4,033.43)
test benchmarks::parse_descriptor_tr_bigtree_a_1              ... bench:      13,236.42 ns/iter (+/- 124.95)
test benchmarks::parse_descriptor_tr_bigtree_b_2              ... bench:      20,350.54 ns/iter (+/- 377.20)
test benchmarks::parse_descriptor_tr_bigtree_c_5              ... bench:      41,951.42 ns/iter (+/- 296.54)
test benchmarks::parse_descriptor_tr_bigtree_d_10             ... bench:      78,537.86 ns/iter (+/- 413.68)
test benchmarks::parse_descriptor_tr_bigtree_e_20             ... bench:     152,192.38 ns/iter (+/- 984.61)
test benchmarks::parse_descriptor_tr_bigtree_f_50             ... bench:     372,410.05 ns/iter (+/- 8,083.66)
test benchmarks::parse_descriptor_tr_bigtree_g_100            ... bench:     739,770.20 ns/iter (+/- 7,940.74)
test benchmarks::parse_descriptor_tr_bigtree_h_200            ... bench:   1,484,661.40 ns/iter (+/- 21,573.86)
test benchmarks::parse_descriptor_tr_bigtree_i_500            ... bench:   3,907,661.00 ns/iter (+/- 14,959.10)
test benchmarks::parse_descriptor_tr_bigtree_j_1000           ... bench:   7,889,040.30 ns/iter (+/- 22,647.22)
test benchmarks::parse_descriptor_tr_bigtree_k_2000           ... bench:  15,808,461.80 ns/iter (+/- 229,508.74)
test benchmarks::parse_descriptor_tr_bigtree_l_5000           ... bench:  39,534,823.90 ns/iter (+/- 239,137.27)
test benchmarks::parse_descriptor_tr_bigtree_m_10000          ... bench:  80,157,162.60 ns/iter (+/- 641,393.85)
test benchmarks::parse_descriptor_tr_deep_bigtree_a_1         ... bench:      13,213.72 ns/iter (+/- 84.00)
test benchmarks::parse_descriptor_tr_deep_bigtree_b_2         ... bench:      20,305.54 ns/iter (+/- 109.85)
test benchmarks::parse_descriptor_tr_deep_bigtree_c_5         ... bench:      41,957.41 ns/iter (+/- 165.18)
test benchmarks::parse_descriptor_tr_deep_bigtree_d_10        ... bench:      78,210.05 ns/iter (+/- 345.16)
test benchmarks::parse_descriptor_tr_deep_bigtree_e_20        ... bench:     151,498.03 ns/iter (+/- 670.90)
test benchmarks::parse_descriptor_tr_deep_bigtree_f_50        ... bench:     370,511.35 ns/iter (+/- 1,948.76)
test benchmarks::parse_descriptor_tr_deep_bigtree_g_100       ... bench:     738,416.60 ns/iter (+/- 19,037.45)
test benchmarks::parse_descriptor_tr_deep_bigtree_h_128       ... bench:     945,320.70 ns/iter (+/- 6,278.54)
test benchmarks::parse_descriptor_tr_deep_oneleaf_a_1         ... bench:      13,207.25 ns/iter (+/- 72.16)
test benchmarks::parse_descriptor_tr_deep_oneleaf_b_10        ... bench:      81,345.43 ns/iter (+/- 601.78)
test benchmarks::parse_descriptor_tr_deep_oneleaf_c_20        ... bench:     160,400.68 ns/iter (+/- 869.46)
test benchmarks::parse_descriptor_tr_deep_oneleaf_d_50        ... bench:     391,496.10 ns/iter (+/- 3,813.95)
test benchmarks::parse_descriptor_tr_deep_oneleaf_e_100       ... bench:     788,146.20 ns/iter (+/- 3,788.72)
test benchmarks::parse_descriptor_tr_deep_oneleaf_f_200       ... bench:   1,604,875.00 ns/iter (+/- 7,280.04)
test benchmarks::parse_descriptor_tr_oneleaf_a_1              ... bench:      13,210.26 ns/iter (+/- 95.33)
test benchmarks::parse_descriptor_tr_oneleaf_b_10             ... bench:      81,522.05 ns/iter (+/- 376.10)
test benchmarks::parse_descriptor_tr_oneleaf_c_20             ... bench:     160,389.30 ns/iter (+/- 763.08)
test benchmarks::parse_descriptor_tr_oneleaf_d_50             ... bench:     392,662.35 ns/iter (+/- 8,103.82)
test benchmarks::parse_descriptor_tr_oneleaf_e_100            ... bench:     792,751.20 ns/iter (+/- 36,747.78)
test benchmarks::parse_descriptor_tr_oneleaf_f_200            ... bench:   1,616,307.30 ns/iter (+/- 7,163.84)
test benchmarks::parse_descriptor_tr_oneleaf_g_500            ... bench:   4,105,945.60 ns/iter (+/- 13,915.26)
test benchmarks::parse_descriptor_tr_oneleaf_h_1000           ... bench:   8,354,293.00 ns/iter (+/- 54,429.06)
test benchmarks::parse_descriptor_tr_oneleaf_i_2000           ... bench:  16,838,568.60 ns/iter (+/- 37,444.78)
test benchmarks::parse_descriptor_tr_oneleaf_j_5000           ... bench:  47,989,133.90 ns/iter (+/- 201,335.47)
test benchmarks::parse_descriptor_tr_oneleaf_k_10000          ... bench:  99,929,584.60 ns/iter (+/- 389,764.76)
test benchmarks::parse_expression_balanced_a_0                ... bench:         155.68 ns/iter (+/- 3.14)
test benchmarks::parse_expression_balanced_b_1                ... bench:         452.07 ns/iter (+/- 5.89)
test benchmarks::parse_expression_balanced_c_2                ... bench:         800.08 ns/iter (+/- 22.00)
test benchmarks::parse_expression_balanced_d_5                ... bench:       1,914.76 ns/iter (+/- 20.82)
test benchmarks::parse_expression_balanced_e_10               ... bench:       4,088.31 ns/iter (+/- 66.71)
test benchmarks::parse_expression_balanced_f_20               ... bench:       8,270.39 ns/iter (+/- 69.60)
test benchmarks::parse_expression_balanced_g_50               ... bench:      20,342.46 ns/iter (+/- 681.03)
test benchmarks::parse_expression_balanced_h_100              ... bench:      41,400.14 ns/iter (+/- 471.68)
test benchmarks::parse_expression_balanced_i_200              ... bench:      81,343.39 ns/iter (+/- 1,206.00)
test benchmarks::parse_expression_balanced_j_500              ... bench:     200,866.50 ns/iter (+/- 3,036.62)
test benchmarks::parse_expression_balanced_k_1000             ... bench:     408,045.35 ns/iter (+/- 4,329.46)
test benchmarks::parse_expression_balanced_l_2000             ... bench:     821,743.70 ns/iter (+/- 11,193.03)
test benchmarks::parse_expression_balanced_m_5000             ... bench:   2,055,374.00 ns/iter (+/- 27,628.21)
test benchmarks::parse_expression_balanced_n_10000            ... bench:   4,033,391.60 ns/iter (+/- 84,499.97)
test benchmarks::parse_expression_deep_a_0                    ... bench:         143.24 ns/iter (+/- 2.10)
test benchmarks::parse_expression_deep_b_1                    ... bench:         443.10 ns/iter (+/- 18.51)
test benchmarks::parse_expression_deep_c_2                    ... bench:         799.62 ns/iter (+/- 10.91)
test benchmarks::parse_expression_deep_d_5                    ... bench:       1,926.85 ns/iter (+/- 48.56)
test benchmarks::parse_expression_deep_e_10                   ... bench:       4,044.85 ns/iter (+/- 58.57)
test benchmarks::parse_expression_deep_f_20                   ... bench:       7,950.19 ns/iter (+/- 3,300.04)
test benchmarks::parse_expression_deep_g_50                   ... bench:      19,351.78 ns/iter (+/- 8,360.80)
test benchmarks::parse_expression_deep_h_100                  ... bench:      38,399.99 ns/iter (+/- 229.76)
test benchmarks::parse_expression_deep_i_200                  ... bench:      76,354.42 ns/iter (+/- 670.49)
test benchmarks::parse_expression_deep_j_300                  ... bench:     114,843.40 ns/iter (+/- 1,105.18)
test benchmarks::parse_expression_deep_j_400                  ... bench:     153,477.72 ns/iter (+/- 1,796.45)

Copy link
Member

@sanket1729 sanket1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK eb4f8c4

I love how this is shaping out to be. Thanks for detailed benchmarks.

let expected = eng.checksum_chars();
let mut actual = ['_'; CHECKSUM_LENGTH];
for (act, ch) in actual.iter_mut().zip(checksum_str.chars()) {
*act = ch;
Copy link
Member

@sanket1729 sanket1729 Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From chatgpt; with MSRV 1.63; you can avoid the mut and creating the actual with all '_' as

let mut iter = checksum_str.chars()
let actual: [char; CHECKSUM_LENGTH] = core::array::from_fn(|_| iter.next().expect("Len checked eariler))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I will add a commit to my next PR doing this. I'm thrilled to have from_fn now.

@apoelstra apoelstra merged commit 460400d into rust-bitcoin:master Nov 13, 2024
30 checks passed
@apoelstra apoelstra deleted the 2024-09--expression-1 branch November 13, 2024 15:47
heap-coder added a commit to heap-coder/rust-miniscript that referenced this pull request Sep 27, 2025
…nd checksum checking

eb4f8c4dce3c45d6e510075a63d2115f7198f7b4 expression: move checksum validation into expression parsing (Andrew Poelstra)
bb2e658cabe272e84270a0f30ce28ae0b81e9df2 expression: add explicit well-formedness check to parser (Andrew Poelstra)
9b03043a9e2f73f33188a541881e06404fd30a86 expression: add some unit tests (Andrew Poelstra)
ad85f92fb92cac9beb216f9a29d75ba165264edb expression: implement PartialEq/Eq for trees (Andrew Poelstra)
b0508d864c235fdf4fc03d24aca69acfab8d66e3 checksum: clean up errors and error paths (Andrew Poelstra)

Pull request description:

  As a step toward rewriting the expression parser to be non-recursive, add a pre-parsing well-formedness check, which verifies that an expression is well-formed, uses only the correct character set, and that the checksum (if present) is valid.

  Along the way, replace the error types returned from the `expression` module with a new more-precise one which can identify the location of parse errors (and identify what the error was in a more correct way). This improves our error reporting and drops many instances of the stringly-typed `BadDescriptor` error.

  There is currently a bunch of special logic for Taproot descriptors which have the extra characters `{` and `}`. To the extent possible, this PR doesn't touch that logic. It will be addressed in a later PR.

  The benchmarks show a slight slowdown since we essentially added new validation logic without removing the old logic. Later PRs will improve things.

ACKs for top commit:
  sanket1729:
    ACK eb4f8c4dce3c45d6e510075a63d2115f7198f7b4

Tree-SHA512: cb972e9683aca51f8d18ab61521af8606f47bd1d0bc37dd1ed085101dbc4dd69b382eb05e8e21d2856ac68ebe7d2ca7879ca8a0692dacbea0b22b7b10c9ef987
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants