Implement our own MDX parser

This task is about replacing remark, the MDX parser we currently use on GDSchool, and the plugins we maintain for it, with our own MDX parser.

The MDX parser should take an MDX document, ideally valid, extract content such as imports and YAML front matter, and output a TSX file with the metadata as exported properties and a default export with a React component written in JSX.

```tsx
export const title = "..."
export const index = 2
export const previous_lesson = {
  title: "Module Overview",
  slug: "module_overview"
}
export const next_lesson = { /* ... */ }
export const module_title = "Top Down Movement"

export const Content = () => {
  return <>
    <h1 className="main-title">Character Controller</h1>
    {/* ... */}
  </>
}
```
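
As a rough illustration of how the metadata exports above could be emitted, the sketch below serializes each field with `JSON.stringify`. The `renderMetadataExports` function and the `MetadataValue` type are hypothetical names chosen for this example, and it assumes every key is a valid JavaScript identifier.

```typescript
// Hypothetical metadata shape for this sketch; real lessons may carry other fields.
type MetadataValue = string | number | boolean | { [key: string]: MetadataValue }

// Turns each metadata field into an `export const` line.
// JSON.stringify yields a valid JavaScript literal for these value types.
// Assumption: every key is a valid JavaScript identifier.
function renderMetadataExports(metadata: { [key: string]: MetadataValue }): string {
  return Object.keys(metadata)
    .map(key => `export const ${key} = ${JSON.stringify(metadata[key])}`)
    .join("\n")
}

const output = renderMetadataExports({
  title: "Character Controller",
  index: 2,
  previous_lesson: { title: "Module Overview", slug: "module_overview" },
})
console.log(output)
```

Serializing through JSON keeps string escaping correct without hand-written quoting rules, at the cost of less readable nested objects.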

Stretch goal: output JavaScript instead, using the `React.createElement` API, to skip extra parsing steps in the build process.

```js
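import React from 'react'

export const Content = () => {
  // React.Fragment groups the children without adding a wrapper element.
  // Children are passed as extra arguments, so React does not ask for keys.
  return React.createElement(React.Fragment, null,
    React.createElement('h1', { className: 'main-title' }, 'Character Controller')
  )
}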
```

## MDX processing needs

We maintain our own remark plugins to make some MDX components easier to write in the source documents. They apply the following transformations:

- Sequences of Practice, Callout, and Searchable components are wrapped into a container. More types of components may use this mechanism in the future.
- Child components of Practice, YourTurn, and Challenge components are turned into properties of the parent component. For example, all the hint elements are turned into an array of hints in the parent component.

We need to replicate this behavior in our MDX parser.
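
The two transformations could look roughly like the sketch below. It is not a port of the existing remark plugins: the `MdxNode` shape, the `...Group` container names, the `Hint` child name, and the `hints` property are all assumptions made for illustration. It also assumes that only consecutive components of the same type are grouped, and that a run of one is still wrapped.

```typescript
// Hypothetical node shape for this sketch; this is not the remark AST.
interface MdxNode {
  name: string
  props: { [key: string]: unknown }
  children: MdxNode[]
}

const GROUPED = ["Practice", "Callout", "Searchable"]
const PARENTS = ["Practice", "YourTurn", "Challenge"]

// Wraps each run of consecutive siblings of the same groupable type in a container.
function wrapSequences(siblings: MdxNode[]): MdxNode[] {
  const result: MdxNode[] = []
  let openGroup: MdxNode | null = null
  for (const node of siblings) {
    if (GROUPED.indexOf(node.name) === -1) {
      openGroup = null // any other node ends the current run
      result.push(node)
      continue
    }
    if (openGroup === null || openGroup.children[0].name !== node.name) {
      openGroup = { name: node.name + "Group", props: {}, children: [] }
      result.push(openGroup)
    }
    openGroup.children.push(node)
  }
  return result
}

// Moves every `Hint` child of a parent component into a `hints` array property.
function hoistHints(node: MdxNode): MdxNode {
  if (PARENTS.indexOf(node.name) === -1) return node
  const hints = node.children.filter(child => child.name === "Hint")
  const rest = node.children.filter(child => child.name !== "Hint")
  return { name: node.name, props: { ...node.props, hints }, children: rest }
}

const leaf = (name: string): MdxNode => ({ name, props: {}, children: [] })
const wrapped = wrapSequences([leaf("Callout"), leaf("Callout"), leaf("p"), leaf("Practice")])
const hoisted = hoistHints({
  name: "Practice",
  props: {},
  children: [leaf("Hint"), leaf("Hint"), leaf("p")],
})
console.log(wrapped.map(node => node.name), hoisted.props)
```

Keeping both passes as plain functions over the tree would make it easy to add more grouped or hoisted component types later by extending the two lists.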

## Markdown code block parsing needs

We need to turn Markdown code fences into a specific HTML structure, and to parse and highlight the GDScript code. Options include using a PEG grammar with Nim's [npeg](https://github.com/zevv/npeg) library, writing our own specialized GDScript parser for highlighting, or passing the code to an external program like prism.js and injecting the result back. The existing build runs prism.js within the Next.js build system.

Code fences should be turned into this `pre` and `code` structure:

```html
<pre className="gdquest-code-container"><code className="gdquest-code">
// code here
</code></pre>
```

If the code block has the diff attribute (for example, if the language is `diff-gdscript`), we need to add a class to every line that starts with a plus or minus sign.
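
A minimal sketch of the code fence output, without syntax highlighting, might look like this. The `diff-added` and `diff-removed` class names and the choice of wrapping each marked line in a `span` are assumptions; the issue does not specify them.

```typescript
// Escapes the characters that are special in HTML text.
// JSX output would also need curly braces handled; this sketch skips that.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;")
}

// Renders a code fence into the pre/code structure described above.
// For `diff-*` languages, lines starting with + or - get a (hypothetical) class.
function renderCodeFence(language: string, code: string): string {
  const isDiff = language.indexOf("diff-") === 0
  const lines = code.split("\n").map(line => {
    const escaped = escapeHtml(line)
    if (!isDiff) return escaped
    if (line.charAt(0) === "+") return `<span className="diff-added">${escaped}</span>`
    if (line.charAt(0) === "-") return `<span className="diff-removed">${escaped}</span>`
    return escaped
  })
  return (
    `<pre className="gdquest-code-container"><code className="gdquest-code">\n` +
    lines.join("\n") +
    `\n</code></pre>`
  )
}

console.log(renderCodeFence("diff-gdscript", "+var a = 1\n-var b = 2\nvar c = a < b"))
```

Highlighting would slot in where `escapeHtml` is called, whichever of the options above ends up being used.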

## Markdown headings parsing

We need to extract the H1 heading to use it as a title fallback if a title is not specified in the YAML front matter of the document. We may also need to read the H2 headings to create a table of contents.
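
The heading extraction could be sketched as a line scan like the one below. It only recognizes ATX headings (`#` and `##`) and skips fenced code so that GDScript comments are not mistaken for headings. `extractHeadings`, `resolveTitle`, and the `Headings` type are hypothetical names; a real implementation would more likely read headings from the parser's token tree.

```typescript
interface Headings {
  title: string | null // first H1, if any
  sections: string[]   // all H2 headings, in document order
}

function extractHeadings(markdown: string): Headings {
  const headings: Headings = { title: null, sections: [] }
  let insideFence = false
  for (const line of markdown.split("\n")) {
    // Toggle on fence delimiters (three backticks or three tildes).
    if (/^\s*(`{3}|~{3})/.test(line)) {
      insideFence = !insideFence
      continue
    }
    if (insideFence) continue
    const match = /^(#{1,2})\s+(.+?)\s*$/.exec(line)
    if (match === null) continue
    if (match[1] === "#") {
      if (headings.title === null) headings.title = match[2]
    } else {
      headings.sections.push(match[2])
    }
  }
  return headings
}

// The front matter title wins; the H1 is only a fallback.
function resolveTitle(frontMatterTitle: string | undefined, headings: Headings): string | null {
  return frontMatterTitle !== undefined ? frontMatterTitle : headings.title
}

const sample = [
  "# Character Controller",
  "## Setup",
  "~~~gdscript",
  "# a comment, not a heading",
  "~~~",
  "## Movement",
].join("\n")
const found = extractHeadings(sample)
console.log(resolveTitle(undefined, found), found.sections)
```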

## Front matter parsing

We use the YAML format for the front matter. We only need to parse it with a YAML parser and inject optional fields or metadata when they are missing. The two main pieces of metadata are `title` and `unlocked`; `unlocked` should default to `false` if not specified.
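
Filling in the defaults could look like the sketch below. It assumes the YAML has already been parsed into a plain object by whichever YAML parser is chosen, which is not shown here. `applyFrontMatterDefaults` and the `FrontMatter` type are hypothetical names.

```typescript
interface FrontMatter {
  title: string
  unlocked: boolean
  [key: string]: unknown // any other fields pass through unchanged
}

// `parsed` is assumed to be the output of a YAML parser (not shown).
// `fallbackTitle` would typically be the document's H1 heading.
function applyFrontMatterDefaults(
  parsed: { [key: string]: unknown },
  fallbackTitle: string
): FrontMatter {
  const title = parsed["title"]
  const unlocked = parsed["unlocked"]
  return {
    ...parsed,
    title: typeof title === "string" ? title : fallbackTitle,
    unlocked: typeof unlocked === "boolean" ? unlocked : false,
  }
}

const withDefaults = applyFrontMatterDefaults({ index: 2 }, "Character Controller")
console.log(withDefaults)
```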

## Development

To approach this project, I would:

- Look into reusing an existing Markdown parser written in Nim, such as [nim-markdown](https://github.com/soasme/nim-markdown/), which produces a token tree that we can traverse to generate the output we need. We need to check whether it is usable as-is or whether we have to fork it to support MDX-specific syntax like imports and exports.
- Collect pairs of input MDX files and output TSX files to guide development and to test the parser against, so we can ensure it produces the expected output.
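
The input/output pairs could drive a small comparison harness along these lines. The sketch keeps the cases in memory and takes the converter as a parameter; `GoldenCase`, `findFailingCases`, and the stand-in `fakeConvert` are hypothetical, and a real setup would load the pairs from disk and call the actual parser.

```typescript
interface GoldenCase {
  name: string
  input: string    // MDX source
  expected: string // TSX output we want for that source
}

// Runs `convert` over each case and returns the names of the cases whose
// output differs from the expected text (ignoring leading/trailing whitespace).
function findFailingCases(cases: GoldenCase[], convert: (mdx: string) => string): string[] {
  return cases
    .filter(testCase => convert(testCase.input).trim() !== testCase.expected.trim())
    .map(testCase => testCase.name)
}

// Stand-in converter that only handles H1 headings, to show the harness at work.
const fakeConvert = (mdx: string): string => mdx.replace(/^# (.+)$/m, "<h1>$1</h1>")

const failing = findFailingCases(
  [
    { name: "heading", input: "# Hello", expected: "<h1>Hello</h1>" },
    { name: "paragraph", input: "Hello", expected: "<p>Hello</p>" },
  ],
  fakeConvert
)
console.log(failing)
```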

## Parsed token structure

For rendering, Jad suggested creating a node tree where tokens already represent HTML elements and properties, so that a single function can render it.

Our only output will be HTML for the foreseeable future, so this makes sense: the parser could directly parse the Markdown into editable tokens that represent an HTML structure.

Pseudocode example:

```js
{ token: 'Practice',
  render: { tag: 'section', class: 'gdquest-practice' },
  children: [{
    token: 'Requirement',
    render: { tag: 'div', class: 'gdquest-requirement' },
    children: [
      { type: 'TEXTNODE', tag: '', contents: 'blah blah' }
    ]
  }]
}
```

But with strong types and an object structure.
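
With strong types, the same idea could be sketched as a discriminated union plus one render function. The type and field names below (`ElementNode`, `TextNode`, `kind`, `className`) are assumptions for illustration, and the renderer skips escaping and other properties to stay short.

```typescript
interface ElementNode {
  kind: "element"
  token: string     // source component name, e.g. "Practice"
  tag: string       // element to render, e.g. "section"
  className: string
  children: RenderNode[]
}

interface TextNode {
  kind: "text"
  contents: string
}

type RenderNode = ElementNode | TextNode

// A single function renders the whole tree. No escaping is done in this sketch.
function render(node: RenderNode): string {
  if (node.kind === "text") return node.contents
  const inner = node.children.map(child => render(child)).join("")
  return `<${node.tag} className="${node.className}">${inner}</${node.tag}>`
}

const tree: RenderNode = {
  kind: "element",
  token: "Practice",
  tag: "section",
  className: "gdquest-practice",
  children: [
    {
      kind: "element",
      token: "Requirement",
      tag: "div",
      className: "gdquest-requirement",
      children: [{ kind: "text", contents: "blah blah" }],
    },
  ],
}
console.log(render(tree))
```

The `kind` field lets the compiler check that every node variant is handled, which is the main gain over the untyped pseudocode.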

### Parsed node tree manipulation

Make an API a bit like Godot's for convenient manipulation: reordering, reparenting, deleting, etc.

This should allow manipulating the node tree easily before rendering the output.
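
A manipulation API in that spirit might look like the sketch below. The method names are loosely modeled on Godot's node methods (`add_child`, `remove_child`, `move_child`, `reparent`), but `TreeNode` and its exact behavior are assumptions for this example. It does not guard against reparenting a node under its own descendant.

```typescript
class TreeNode {
  name: string
  parent: TreeNode | null = null
  children: TreeNode[] = []

  constructor(name: string) {
    this.name = name
  }

  // Appends `child`, detaching it from its previous parent first.
  addChild(child: TreeNode): void {
    if (child.parent !== null) child.parent.removeChild(child)
    child.parent = this
    this.children.push(child)
  }

  // Detaches `child`; does nothing if it is not a child of this node.
  removeChild(child: TreeNode): void {
    const index = this.children.indexOf(child)
    if (index === -1) return
    this.children.splice(index, 1)
    child.parent = null
  }

  // Moves an existing child to a new position among its siblings.
  moveChild(child: TreeNode, toIndex: number): void {
    const index = this.children.indexOf(child)
    if (index === -1) return
    this.children.splice(index, 1)
    this.children.splice(toIndex, 0, child)
  }

  // Moves this node under `newParent`. No cycle check in this sketch.
  reparent(newParent: TreeNode): void {
    newParent.addChild(this)
  }
}

const root = new TreeNode("root")
const practiceNode = new TreeNode("Practice")
const calloutNode = new TreeNode("Callout")
const hintNode = new TreeNode("Hint")
root.addChild(practiceNode)
root.addChild(calloutNode)
calloutNode.addChild(hintNode)
hintNode.reparent(practiceNode) // Hint now belongs to Practice
root.moveChild(calloutNode, 0)  // Callout now comes before Practice
console.log(root.children.map(child => child.name))
```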

Implement our own MDX parser #59