Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion failure #142

Closed
simoncozens opened this issue Nov 8, 2024 · 13 comments · Fixed by #143 or #146
Closed

Assertion failure #142

simoncozens opened this issue Nov 8, 2024 · 13 comments · Fixed by #143 or #146
Assignees

Comments

@simoncozens
Copy link
Contributor

I'm having a problem with rustbuzz crashing inside of diffenator3 for a particular font (NotoSansMalayalam[wdth,wght].ttf.gz).

Here's a reproducer:

        let font = include_bytes!("NotoSansMalayalam[wdth,wght].ttf");
        let mut face = Face::from_slice(font, 0).unwrap();
        let mut buffer = UnicodeBuffer::new();
        let string = "സ്ഥലമായിരുന്നതിനാല്‍";
        buffer.push_str(string);
        println!(">{}", string);
        let plan = ShapePlan::new(
            &face,
            Direction::LeftToRight,
            Some(script::MALAYALAM),
            None,
            &[],
        );

        let output = shape_with_plan(&face, &plan, buffer);

And here's the backtrace:

                              ⋮ 12 frames hidden ⋮
13: core::panicking::panic::hc3763f3effcb9929
    at <unknown source file>
14: rustybuzz::hb::buffer::hb_buffer_t::move_to::he6a9580defbf1514
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/buffer.rs:1156
    1154 │         }
    1155 │
    1156 >         assert!(i <= self.out_len + (self.len - self.idx));
    1157 │
    1158 │         if self.out_len < i {
15: rustybuzz::hb::ot_layout_gsubgpos::apply_lookup::hb0bb938a671a37e1
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_layout_gsubgpos.rs:980
     978 │     }
     979 │
     980 >     ctx.buffer.move_to(end);
     981 │ }
     982 │
16: <ttf_parser::ggg::chained_context::ChainedContextLookup as rustybuzz::hb::ot_layout_gsubgpos::Apply>::apply::h6da40ea28f40f5f6
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_layout_gsubgpos.rs:663
     661 │                 ctx.buffer
     662 │                     .unsafe_to_break_from_outbuffer(Some(start_index), Some(end_index));
     663 >                 apply_lookup(
     664 │                     ctx,
     665 │                     usize::from(input_coverages.len()),
17: rustybuzz::hb::ot_layout_gsub_table::<impl rustybuzz::hb::ot_layout_gsubgpos::Apply for ttf_parser::tables::gsub::SubstitutionSubtable>::apply::h3f7e621c5b38371c
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_layout_gsub_table.rs:49
      47 │             Self::Ligature(t) => t.apply(ctx),
      48 │             Self::Context(t) => t.apply(ctx),
      49 >             Self::ChainContext(t) => t.apply(ctx),
      50 │             Self::ReverseChainSingle(t) => t.apply(ctx),
      51 │         }
18: rustybuzz::hb::ot::layout::GSUB::subst_lookup::<impl rustybuzz::hb::ot_layout_gsubgpos::Apply for rustybuzz::hb::ot_layout_common::SubstLookup>::apply::h1180360a0b8d580a
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot/layout/GSUB/subst_lookup.rs:35
      33 │         if self.digest().may_have_glyph(ctx.buffer.cur(0).as_glyph()) {
      34 │             for subtable in &self.subtables {
      35 >                 if subtable.apply(ctx).is_some() {
      36 │                     return Some(());
      37 │                 }
19: rustybuzz::hb::ot_layout::apply_forward::h40254d8f0cec948a
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_layout.rs:302
     300 │         if (cur.mask & ctx.lookup_mask()) != 0
     301 │             && ctx.check_glyph_property(cur, ctx.lookup_props)
     302 >             && lookup.apply(ctx).is_some()
     303 │         {
     304 │             ret = true;
20: rustybuzz::hb::ot_layout::apply_string::h41a8546a60574f08
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_layout.rs:282
     280 │         }
     281 │         ctx.buffer.idx = 0;
     282 >         apply_forward(ctx, lookup);
     283 │
     284 │         if !T::IN_PLACE {
21: rustybuzz::hb::ot_layout::apply_layout_table::h29833b22a4aa51bc
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_layout.rs:256
     254 │                     ctx.per_syllable = lookup_map.per_syllable;
     255 │
     256 >                     apply_string::<T>(&mut ctx, lookup);
     257 │                 }
     258 │             }
22: rustybuzz::hb::ot_layout_gsub_table::substitute::h1449b349b6791f0f
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_layout_gsub_table.rs:13
      11 │
      12 │ pub fn substitute(plan: &hb_ot_shape_plan_t, face: &hb_font_t, buffer: &mut hb_buffer_t) {
      13 >     apply_layout_table(plan, face, buffer, face.gsub.as_ref());
      14 │ }
      15 │
23: rustybuzz::hb::ot_shape::hb_ot_substitute_plan::hccd88507dd191d9a
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_shape.rs:393
     391 │         aat_layout::hb_aat_layout_substitute(ctx.plan, ctx.face, ctx.buffer);
     392 │     } else {
     393 >         super::ot_layout_gsub_table::substitute(ctx.plan, ctx.face, ctx.buffer);
     394 │     }
     395 │ }
24: rustybuzz::hb::ot_shape::substitute_pre::hcbfcfe8bf658fb43
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_shape.rs:346
     344 │ fn substitute_pre(ctx: &mut hb_ot_shape_context_t) {
     345 │     hb_ot_substitute_default(ctx);
     346 >     hb_ot_substitute_plan(ctx);
     347 │
     348 │     if ctx.plan.apply_morx && ctx.plan.apply_gpos {
25: rustybuzz::hb::ot_shape::shape_internal::hc835d90c43eef53c
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/ot_shape.rs:334
     332 │     }
     333 │
     334 >     substitute_pre(ctx);
     335 │     position(ctx);
     336 │     substitute_post(ctx);
26: rustybuzz::hb::shape::shape_with_plan::h2df57445f599ab4b
    at /Users/simon/.cargo/registry/src/index.crates.io-6f17d22bba15001f/rustybuzz-0.20.0/src/hb/shape.rs:71
      69 │         #[cfg(not(feature = "wasm-shaper"))]
      70 │         {
      71 >             shape_internal(&mut hb_ot_shape_context_t {
      72 │                 plan,
      73 │                 face,
27: diffenator3_lib::render::renderer::tests::test_rustybuzz_crash::he1916ce60e7b4dcd
    at /Users/simon/hacks/typography/diffenator3/diffenator3-lib/src/render/renderer.rs:223
     221 │         );
     222 │
     223 >         let output = shape_with_plan(&face, &plan, buffer);
     224 │         panic!("Fail")
     225 │     }
@behdad
Copy link
Member

behdad commented Nov 8, 2024

Does the same font/text crash HB by any chance?

That piece of code is the nastiest part of OT / HB.

@simoncozens
Copy link
Contributor Author

No crash in HB, no. I'll try adding some debugging prints around the values of end, match_positions etc and run it again in both.

@simoncozens
Copy link
Contributor Author

The problem is this line:

        end = (end as isize + delta) as _;

In this case, end is initially 0, delta is -2, so the value of end after this statement is 18446744073709551615. An attempt to resize the buffer to 18 exabytes does not go well.

@simoncozens
Copy link
Contributor Author

simoncozens commented Nov 9, 2024

I'm getting confused between harfruzz and rustybuzz. Fix in harfbuzz/harfruzz#15

@LaurenzV
Copy link
Collaborator

LaurenzV commented Nov 9, 2024

Will open a PR for here with a test case.

@alerque
Copy link
Member

alerque commented Nov 9, 2024

There are a few other instances of this exact construct as well as a large number of as invocations on integer types that are not very safe or idiomatic. That doesn't necessarily mean it is even possible to trigger other issues like this one, but it does mean there could be a little more defensive coding here.

@behdad
Copy link
Member

behdad commented Nov 9, 2024

The problem is this line:

        end = (end as isize + delta) as _;

In this case, end is initially 0, delta is -2, so the value of end after this statement is 18446744073709551615. An attempt to resize the buffer to 18 exabytes does not go well.

I think the problem is that in HB, end is defined as signed int, but that is lost in the translation maybe? I prefer staying close to HB instead of using saturating_add here, which would need verification that it satisfies the semantics.

@alerque alerque reopened this Nov 9, 2024
@alerque
Copy link
Member

alerque commented Nov 9, 2024

Yes, end was/is inheriting its type much earlier from ctx.buffer.backtrack_len() which returns usize not isize. I'll look into making mut end: isize to start with, and also whether the other unsigned types that have stuff subtracted from them may have been signed types in HB.

@alerque
Copy link
Member

alerque commented Nov 9, 2024

@behdad In fixing the integer type to track HB's implementation I have a question about this from HB:

static bool match_lookahead (hb_ot_apply_context_t *c,
			     unsigned int count,
			     const HBUINT lookahead[],
			     match_func_t match_func,
			     const void *match_data,
			     unsigned int start_index,
			     unsigned int *end_index)
{
  TRACE_APPLY (nullptr);

  hb_ot_apply_context_t::skipping_iterator_t &skippy_iter = c->iter_context;
  skippy_iter.reset (start_index - 1);

My C isn't very good so I don't claim to understand the implications on that side (maybe the iterator doesn't care about being reset to a negative value?), but on the Rust side it seems suspicious. The start_index is being passed into this function unsigned, but promptly has a value subtracted from it. This is one of the other cases I initially flagged in #146 as potentially needing a safeguard.

Is there any potential of this being called with a starting index of 0?

@behdad
Copy link
Member

behdad commented Nov 9, 2024

Is there any potential of this being called with a starting index of 0?

No. Because match_lookahead is always called with start_index following the "input" match, which always has at least one item in it. We can add an assert I think.

@alerque
Copy link
Member

alerque commented Nov 9, 2024

An assert it is.

A bit latter on (again from HB code):

unsigned int last_num_components = _hb_glyph_info_get_lig_num_comps (&buffer->cur());
unsigned int components_so_far = last_num_components;
// both numbers get manipulated, then later...
unsigned int new_lig_comp = components_so_far - last_num_components + hb_min (this_comp, last_num_components);

Should I just assert!(components_so_far >= last_num_components) in the places where this subtraction happens, or cast it to a signed value instead or use saturating_sub() instead?

@behdad
Copy link
Member

behdad commented Nov 11, 2024

An assert it is.

A bit latter on (again from HB code):

unsigned int last_num_components = _hb_glyph_info_get_lig_num_comps (&buffer->cur());
unsigned int components_so_far = last_num_components;
// both numbers get manipulated, then later...
unsigned int new_lig_comp = components_so_far - last_num_components + hb_min (this_comp, last_num_components);

Should I just assert!(components_so_far >= last_num_components) in the places where this subtraction happens, or cast it to a signed value instead or use saturating_sub() instead?

If you check out how they are manipulated, components_so_far starts as being equal to last_num_components, and every time last_num_components changes, it is added to components_so_far. So that assert you suggest is indeed guaranteed.

@alerque
Copy link
Member

alerque commented Nov 11, 2024

That makes sense, and my PR is asserting it now as well.

I did just spot another instance of a type mismatch (the prev() function has stop as a signed type where HB is unsigned) so I'll go add that too. That's the last of the related potentially problematic math situations I see see in the gsubgpos module at least.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants