Incorrect Use of Underscore _ for Italics in the Middle of Words #78

@PierreAtUptale

Description

When remark-slate serializes content to Markdown, it defaults to the underscore _ character for italic text. This causes issues when the italic text sits in the middle of a word: CommonMark-style parsers do not treat an underscore between word characters as emphasis, whereas an asterisk * works in that position. An asterisk * should therefore be used instead of an underscore _ to avoid rendering inconsistencies across Markdown parsers.

Example:

Input Node:

{
  "type": "paragraph",
  "children": [
    {
      "text": "Tes"
    },
    {
      "italic": true,
      "text": "t"
    },
    {
      "text": "ing"
    }
  ]
}

Expected Markdown Output:

Tes*t*ing

Actual Markdown Output:

Tes_t_ing

The current output (Tes_t_ing) does not render correctly in many Markdown parsers. For compatibility and consistency, the library should use * for italicization, especially when the italicized segment is in the middle of a word.

Even parsing "Tes_t_ing" again with remark-parse + remark-slate produces text without italics and with the underscores left visible:

{
  "type": "paragraph",
  "children": [
    {
      "text": "Tes_t_ing",
      "italic": false
    }
  ]
}

Given the current recommendations on best practices, this feels like a limitation of the remark-slate serializer rather than of the parsers.

Steps to Reproduce:

  1. Create a rich text structure where italicized text appears in the middle of a word.
  2. Serialize the structure using remark-slate.
  3. Observe that underscores _ are used for the italicized text, resulting in incorrect Markdown rendering.
  4. Optionally, parse the serialized output again with
    const markdown = "Tes_t_ing"; // the output of step 2
    const processor = unified().use(remarkParse).use(remarkSlate);
    const file = await processor.process(markdown);

and observe that the text surrounded by the underscores isn't italicized.

Proposed Solution:

  • Update the serialization logic to use * for italicized text when it appears in the middle of a word. This ensures proper rendering across Markdown parsers.
  • Ensure the library still supports underscores _ for italic text that is at word boundaries, as this is still valid Markdown.
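
To make the proposal concrete, here is a minimal sketch of the delimiter choice. It is not remark-slate's actual code: the `Leaf` type and the `isWordChar`, `italicDelimiter`, and `serializeLeaves` helpers are hypothetical names used only for illustration, and the word-character test is simplified to ASCII letters and digits.

```typescript
// Hypothetical leaf shape, mirroring the text nodes shown in this issue.
interface Leaf {
  text: string;
  italic?: boolean;
}

// Simplified assumption: only ASCII letters and digits count as word characters.
// An empty string (no neighbouring character) is not a word character.
function isWordChar(ch: string): boolean {
  return /^[A-Za-z0-9]$/.test(ch);
}

// Use "*" when the italic run touches a word character on either side
// (the intraword case); otherwise keep "_".
function italicDelimiter(prev: string, next: string): "*" | "_" {
  return isWordChar(prev) || isWordChar(next) ? "*" : "_";
}

// Serialize a flat list of leaves, looking at the characters adjacent to
// each italic leaf to decide which delimiter to emit.
function serializeLeaves(children: Leaf[]): string {
  return children
    .map((leaf, i) => {
      if (!leaf.italic || leaf.text === "") return leaf.text;
      const prevText = i > 0 ? children[i - 1].text : "";
      const nextText = i < children.length - 1 ? children[i + 1].text : "";
      const d = italicDelimiter(prevText.slice(-1), nextText.charAt(0));
      return `${d}${leaf.text}${d}`;
    })
    .join("");
}

// The node from this issue: italic "t" in the middle of "Testing".
console.log(
  serializeLeaves([{ text: "Tes" }, { text: "t", italic: true }, { text: "ing" }])
); // prints "Tes*t*ing"

// Italic text at word boundaries keeps the underscore.
console.log(
  serializeLeaves([{ text: "a " }, { text: "word", italic: true }, { text: " b" }])
); // prints "a _word_ b"
```

A real fix would also need to handle nested marks (for example bold plus italic) and non-ASCII letters, which this sketch deliberately leaves out.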

Let me know if you need further details or clarification!

Versions

{
  "remark-parse": "11.0.0",
  "remark-slate": "1.8.6",
  "unified": "11.0.5"
}
