-
Notifications
You must be signed in to change notification settings - Fork 1.4k
feat(synced-lyrics): thai romanization #3618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Co-authored-by: Angelos Bouklis <[email protected]>
that is a really hard problem to solve, since there are songs that mix japanese/korean/chinese |
well, I never thought there are cases like that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds Thai language romanization support to the synced lyrics plugin by integrating the @dehoist/romanize-thai
library. The implementation includes Thai text detection, word segmentation using the Intl.Segmenter API, and fallback handling.
- Adds Thai romanization functionality with intelligent word segmentation
- Refactors the romanization logic to use a cleaner handler-based approach
- Integrates the
@dehoist/romanize-thai
external dependency
Reviewed Changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
src/plugins/synced-lyrics/renderer/utils.tsx | Implements Thai romanization function and refactors existing romanization logic |
package.json | Adds the @dehoist/romanize-thai dependency |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
.map((segment) => segment.isWordLike ? romanizeThaiFrag(segment.segment) : segment.segment.trim()) | ||
.join(' ') | ||
.trim(); | ||
|
||
return latin; | ||
}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The ternary operation could be more readable by extracting the logic into a separate function or using explicit if-else blocks for better clarity.
.map((segment) => segment.isWordLike ? romanizeThaiFrag(segment.segment) : segment.segment.trim()) | |
.join(' ') | |
.trim(); | |
return latin; | |
}; | |
.map(processThaiSegment) | |
.join(' ') | |
.trim(); | |
return latin; | |
}; | |
const processThaiSegment = (segment: { isWordLike: boolean; segment: string }): string => { | |
return segment.isWordLike ? romanizeThaiFrag(segment.segment) : segment.segment.trim(); | |
}; |
Copilot uses AI. Check for mistakes.
|
||
const segments = Array.from(thaiSegmenter.segment(line)); | ||
const latin = segments | ||
.map((segment) => segment.isWordLike ? romanizeThaiFrag(segment.segment) : segment.segment.trim()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider why .trim()
is only applied to non-word segments. This asymmetry might lead to inconsistent spacing behavior and should be documented or applied consistently.
.map((segment) => segment.isWordLike ? romanizeThaiFrag(segment.segment) : segment.segment.trim()) | |
.map((segment) => romanizeThaiFrag(segment.segment).trim()) |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'cause word's can't have spaces in them, dumb AI
One solution might be to detect what percentage of the lyrics is japanese, if it's above a threshold of like 95%, then we assume all chinese characters are actually kanji |
Support of Thai romanization from
@dehoist/romanize-thai