Fix incorrect PDF title display when multiple language titles present in XMP metadata (issue 20801)#20874
nyxsky404 wants to merge 4 commits into mozilla:master
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #20874 +/- ##
==========================================
+ Coverage 62.51% 62.56% +0.04%
==========================================
Files 173 173
Lines 121246 121278 +32
==========================================
+ Hits 75796 75872 +76
+ Misses 45450 45406 -44
themavik left a comment
Turning on hasAttributes for SimpleXMLParser and routing dc:title/dc:description through _parseLangAlt fixes the concatenated multi-language titles cleanly. Nit: _parseLangAlt grabs entry.childNodes[0] before _getSequence, so if a whitespace text node sneaks ahead of rdf:Alt, you might still fall back to odd textContent. Worth a regression test if you see real-world XMP like that.
Makes sense, I'll add a regression test for that case.
Fixes #20801
Problem
When a PDF has multiple language titles in XMP metadata using rdf:Alt, the titles were concatenated instead of a single title being selected. For example, a PDF with both x-default and en language titles would display "Hello WorldHello World" instead of "Hello World".
Solution
- Handles rdf:Alt elements for dc:title and dc:description
- Uses the x-default language entry if present, otherwise uses the first entry
- Enables the hasAttributes option in SimpleXMLParser to read xml:lang attributes
Testing
x-defaultx-default
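The selection rule described in this PR (prefer the x-default entry, otherwise take the first one) can be illustrated with a minimal standalone sketch. This is not the PR's actual _parseLangAlt code: the node shape below (plain objects with nodeName, attributes, childNodes, textContent) and the helper names getLang, textOf, and pickLangAlt are hypothetical stand-ins chosen for illustration, and they do not mirror the real SimpleXMLParser node API. The sketch also skips text nodes that precede rdf:Alt, which is the edge case raised in the review comment.

```javascript
// Hypothetical node shape, for illustration only (NOT the pdf.js node API):
//   { nodeName, attributes?: [{ name, value }], childNodes?: [...], textContent?: string }

// Return the xml:lang value of a node, or null if it has none.
function getLang(node) {
  const attr = (node.attributes || []).find(a => a.name === "xml:lang");
  return attr ? attr.value : null;
}

// Concatenate the text of a node and all of its descendants.
function textOf(node) {
  if (!node.childNodes || node.childNodes.length === 0) {
    return node.textContent || "";
  }
  return node.childNodes.map(textOf).join("");
}

// Pick a single string from a language-alternative entry such as dc:title.
function pickLangAlt(entry) {
  // Search all children for rdf:Alt instead of assuming it is childNodes[0],
  // so a leading whitespace text node does not hide it.
  const alt = (entry.childNodes || []).find(
    n => n.nodeName.toLowerCase() === "rdf:alt"
  );
  if (!alt) {
    return textOf(entry).trim(); // No rdf:Alt: fall back to plain text.
  }
  const items = (alt.childNodes || []).filter(
    n => n.nodeName.toLowerCase() === "rdf:li"
  );
  if (items.length === 0) {
    return textOf(entry).trim();
  }
  // Prefer x-default, otherwise use the first entry.
  const chosen = items.find(n => getLang(n) === "x-default") || items[0];
  return textOf(chosen).trim();
}

// Example mirroring the PR description: two entries with the same text.
const title = {
  nodeName: "dc:title",
  childNodes: [
    { nodeName: "#text", textContent: "\n  " }, // whitespace ahead of rdf:Alt
    {
      nodeName: "rdf:Alt",
      childNodes: [
        {
          nodeName: "rdf:li",
          attributes: [{ name: "xml:lang", value: "en" }],
          childNodes: [{ nodeName: "#text", textContent: "Hello World" }],
        },
        {
          nodeName: "rdf:li",
          attributes: [{ name: "xml:lang", value: "x-default" }],
          childNodes: [{ nodeName: "#text", textContent: "Hello World" }],
        },
      ],
    },
  ],
};

console.log(pickLangAlt(title)); // prints "Hello World"
```

Reading the whole entry's text instead, as in textOf(title), joins both rdf:li values, which is the concatenation this PR reports as the bug.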