LaTeX not copied from conversation... #7

Open
malekinho8 opened this issue Aug 8, 2024 · 9 comments

@malekinho8

Problem Description

I had a conversation that looked like so:

Screenshot 2024-08-08 at 8 16 34 AM

When I ran the bookmarklet and displayed the Markdown in VS Code, I was expecting to see the LaTeX formatted text, but I saw this:

Screenshot 2024-08-08 at 8 17 26 AM

The Markdown code was this:

# ChatGPT

## PROMPT 1

you know how in electrical circuits the potential drop across all elements in series is equal to sum of potential drop across each element in the series?

## RESPONSE 1

Yes, I understand. In an electrical circuit, when resistors (or other components) are connected in series, the total voltage drop across the series combination is equal to the sum of the voltage drops across each individual component.

Mathematically, if you have resistors R1,R2,…,RnR\_1, R\_2, \\ldots, R\_nR1​,R2​,…,Rn​ connected in series, and a total voltage VVV applied across the entire series, the voltage drop across each resistor V1,V2,…,VnV\_1, V\_2, \\ldots, V\_nV1​,V2​,…,Vn​ can be expressed as:

V\=V1+V2+⋯+VnV = V\_1 + V\_2 + \\cdots + V\_nV\=V1​+V2​+⋯+Vn​

Where V1,V2,…,VnV\_1, V\_2, \\ldots, V\_nV1​,V2​,…,Vn​ are the voltage drops across R1,R2,…,RnR\_1, R\_2, \\ldots, R\_nR1​,R2​,…,Rn​ respectively.

Expected Behavior

It should have been this:

# ChatGPT

## PROMPT 1

you know how in electrical circuits the potential drop across all elements in series is equal to sum of potential drop across each element in the series?

## RESPONSE 1

Yes, I understand. In an electrical circuit, when resistors (or other components) are connected in series, the total voltage drop across the series combination is equal to the sum of the voltage drops across each individual component.

Mathematically, if you have resistors $R_1, R_2, \ldots, R_n$ connected in series, and a total voltage $V$ applied across the entire series, the voltage drop across each resistor $V_1, V_2, \ldots, V_n$ can be expressed as:

$$V = V_1 + V_2 + \cdots + V_n$$

Where $V_1, V_2, \ldots, V_n$ are the voltage drops across $R_1, R_2, \ldots, R_n$ respectively.

Which would look like so:

Screenshot 2024-08-08 at 8 22 03 AM
@yaph
Owner

yaph commented Aug 10, 2024

Thanks for the comprehensive report! The responses are converted to Markdown using the turndown package, and ChatGPT renders math with KaTeX markup. If the conversion step were skipped for responses containing LaTeX, document.querySelector('.katex').textContent would still yield the same result:

Vtotal=V1+V2+V3+⋯+VnV_{\\text{total}} = V_1 + V_2 + V_3 + \\dots + V_nVtotal​=V1​+V2​+V3​+⋯+Vn​
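
A possible explanation for the repeated text, offered as a sketch rather than a confirmed diagnosis: in its default output KaTeX emits both a MathML branch (.katex-mathml, which includes an annotation holding the TeX source) and a visual branch (.katex-html) inside each .katex element, and textContent concatenates the text of all of them. The node shape and the textContent helper below are simplified stand-ins made up for illustration, not ChatGPT's actual DOM.

```javascript
// Stand-in for DOM textContent: concatenate the text of all descendants.
function textContent(node) {
  if (typeof node === 'string') return node;
  return node.children.map(textContent).join('');
}

// Simplified, assumed shape of KaTeX output for the formula V = V_1.
const katex = {
  className: 'katex',
  children: [
    {
      className: 'katex-mathml',
      children: [
        'V=V1', // text of the MathML presentation markup
        { tag: 'annotation', children: ['V = V_1'] }, // original TeX source
      ],
    },
    // Visual rendering; subscripts are followed by a zero-width space.
    { className: 'katex-html', children: ['V=V1\u200b'] },
  ],
};

// Prints the three runs glued together, like the output reported above.
console.log(textContent(katex));
```

Under this reading, the middle run is the only part worth keeping, which is why reading the whole element's text cannot give clean LaTeX.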

@Ten0

Ten0 commented Jan 28, 2025

I'm struggling to understand what you imply by:

If the conversion step were skipped for responses containing LaTeX, document.querySelector('.katex').textContent would still yield the same result

Do you mean that this issue is impossible to fix because the page does not contain the raw LaTeX pre-rendering?

Otherwise, would it be possible to skip conversion for LaTeX components and add the $$ and $ signs where appropriate, so that the LaTeX boundaries are passed through to the Markdown and the Markdown renderer can display them, instead of having them converted beforehand?

In my case, I'm getting partially converted output, with e.g. \\in and \\frac{ but also , which is rather impossible to display nicely...

@yaph
Owner

yaph commented Jan 28, 2025

I don't mean it is impossible to fix, but how would you retrieve the text of the response other than using the value of ELEMENT.textContent?

@Ten0

Ten0 commented Jan 29, 2025

how would you retrieve the text of the response other than using the value of ELEMENT.textContent?

If your question is "how would we get the original LaTeX for a particular .katex?", it looks like this would work:

Image

Also it looks like it is possible to differentiate between inline LaTeX ($something$ in markdown) and LaTeX blocks ($$something$$ in markdown) by checking whether the .katex is within a .katex-display tag.
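
One way these two observations could be combined, as a minimal sketch. It assumes the original TeX sits in an annotation element with encoding="application/x-tex" inside each .katex element (KaTeX's default MathML output; not confirmed for ChatGPT's page in this thread). katexToMarkdown and katexRule are hypothetical names; the rule object follows the filter/replacement shape that turndown's addRule accepts.

```javascript
// Return the original TeX of a KaTeX-rendered element wrapped in Markdown
// math delimiters, or null when no TeX annotation is found.
function katexToMarkdown(katexEl) {
  // Assumption: the MathML branch carries the TeX source in this annotation.
  const annotation = katexEl.querySelector('annotation[encoding="application/x-tex"]');
  if (!annotation) return null;
  const tex = annotation.textContent.trim();
  // Display math has an ancestor with class "katex-display"; inline math does not.
  const isBlock = katexEl.closest('.katex-display') !== null;
  return isBlock ? '\n\n$$' + tex + '$$\n\n' : '$' + tex + '$';
}

// Rule object in the shape expected by turndown's addRule(key, rule).
const katexRule = {
  // Match the <span class="katex"> wrapper of each formula.
  filter: (node) => node.nodeName === 'SPAN' && node.classList.contains('katex'),
  // Fall back to the already converted content if no annotation exists.
  replacement: (content, node) => katexToMarkdown(node) ?? content,
};

// With a TurndownService instance available (assumed to be named
// turndownService), the rule would be registered like this:
// turndownService.addRule('katex', katexRule);
```

The fallback to the converted content keeps the current behavior for formulas that lack an annotation, so the rule should not make the output worse than it is today.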

@yaph
Owner

yaph commented Jan 29, 2025

Given that source code is no longer converted correctly either (see #8), maybe the current implementation should be replaced with one that triggers a click on the copy button of each response and then reads the content of the clipboard.

@Ten0

Ten0 commented Jan 29, 2025

Ah I didn't realize we could do this! Well that will definitely fix it for my particular case where I only had one document to convert, thanks 😅

@yaph
Owner

yaph commented Jan 30, 2025

Sadly, I cannot access the content of the clipboard in the bookmarklet though, only in a browser extension that has the necessary permission.

@Ten0

Ten0 commented Jan 30, 2025

Maybe the copy button reads what it should write from somewhere on the page? Otherwise, yes, it looks like a Tampermonkey script might be the easiest way to go... (IIRC these do have access to the clipboard, but it's still a plain JS file to paste)

@yaph
Owner

yaph commented Jan 30, 2025

I went through the click event in the dev console but couldn't figure out a way to get the text before it is copied to the clipboard.
