Open
Description
Describe the bug
Server crashes during startup with StringIndexOutOfBoundsException when processing Macedonian (mk) language template redirects. Many Macedonian Wikipedia templates use 'Шаблон:' (Cyrillic for "Template") instead of the expected 'Предлошка:' namespace prefix, causing the substring operation to fail with index -1.
The crash occurs at:
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(String.java:1931)
at org.dbpedia.extraction.server.stats.MappingStatsHolder$$anonfun$1.apply(MappingStatsHolder.scala:54)
More than 100 mk templates are affected, including:
- Шаблон:Инфокутија Верски објект
- Шаблон:2TeamBracket
- Шаблон:Инфокутија Православна црква
- Шаблон:Оклопно возило
- And many more...
Expected behaviour
The server should start successfully and process mk language templates without throwing exceptions. It could do this by either:
- Handling alternative namespace prefixes for the mk language (recognizing both 'Предлошка:' and 'Шаблон:')
- Logging warnings and skipping invalid templates without crashing (PR #795 provides a temporary fix for this)
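To illustrate the two options above, here is a minimal sketch of a tolerant prefix strip. It is not the code in MappingStatsHolder.scala or PR #795; the object name `TemplatePrefixSketch`, the methods `stripPrefix` and `stripAll`, and the hard-coded prefix list are all hypothetical, and a real fix would presumably read the accepted prefixes from the mk namespace configuration.

```scala
import java.util.logging.Logger

object TemplatePrefixSketch {
  private val logger = Logger.getLogger(getClass.getName)

  // Assumption: hard-coded for illustration only, not taken from the DBpedia configuration.
  val acceptedPrefixes: Seq[String] = Seq("Предлошка:", "Шаблон:")

  /** Returns the title without its namespace prefix, or None if no accepted prefix matches. */
  def stripPrefix(title: String, prefixes: Seq[String] = acceptedPrefixes): Option[String] =
    prefixes.find(p => title.startsWith(p)).map(p => title.substring(p.length))

  /** Strips every title, logging a warning and skipping titles with an unknown prefix. */
  def stripAll(titles: Seq[String]): Seq[String] =
    titles.flatMap { title =>
      val stripped = stripPrefix(title)
      if (stripped.isEmpty)
        logger.warning(s"mk template '$title' has no accepted namespace prefix; skipping")
      stripped
    }
}
```

Because `substring` is only called after `startsWith` has matched, the index is always within bounds, so a title with an unexpected prefix produces a warning rather than an exception.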
Environment
- Extraction (commit hash): 5eb208b932a63a6f0cd5cbede3e446315686e6a7 (enable-wikidata-server branch)
- OS: Linux 6.14.0-33-generic (Ubuntu)
- Java SDK Version (java --version): 1.8.0_462 (OpenJDK)
- Maven version (mvn --version): Apache Maven 3.8.7
To reproduce
- Enable Macedonian (mk) language in server.default.properties with @mappings or any configuration
- Start the DBpedia extraction server: cd server && ../run server
- Server attempts to load mk template statistics
- Server crashes with StringIndexOutOfBoundsException during MappingStatsHolder initialization
Additional context & logs
The root cause is in MappingStatsHolder.scala:54 where the code expects all templates to start with 'Предлошка:' but many mk templates use 'Шаблон:'.
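The sketch below shows why an unguarded prefix strip is fragile. The actual code at line 54 is not reproduced in this report, so `UnguardedPrefixSketch` and `unguardedStrip` are hypothetical stand-ins that only assume the title is cut at the length of the expected prefix.

```scala
import scala.util.Try

object UnguardedPrefixSketch {
  val expectedPrefix = "Предлошка:" // 10 characters

  // Hypothetical unguarded strip: assumes every title starts with expectedPrefix.
  def unguardedStrip(title: String): String = title.substring(expectedPrefix.length)

  def main(args: Array[String]): Unit = {
    // Title shorter than the expected prefix: substring throws
    // StringIndexOutOfBoundsException, captured here as a Failure.
    println(Try(unguardedStrip("Шаблон:ab")))
    // Longer title with the other prefix: no exception, but the start
    // of the real template name is silently cut off.
    println(Try(unguardedStrip("Шаблон:2TeamBracket")))
  }
}
```

Under this assumption, a mismatched prefix either throws or silently yields a truncated name, so checking the prefix before calling `substring` matters in both cases.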
Related:
- PR #795 fixes the crash by adding validation but doesn't address the namespace mismatch
- This may require updating the mk language configuration to recognize 'Шаблон:' as a valid template namespace
Full stack trace:
org.dbpedia.extraction.server.stats.MappingStatsHolder$$anonfun$apply$2 apply
WARNING: mk template 'Шаблон:Инфокутија Верски објект' does not start with 'Предлошка:'
[... 100+ similar warnings ...]
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala_maven_executions.MainHelper.runMain(MainHelper.java:164)
at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(String.java:1931)
at org.dbpedia.extraction.server.stats.MappingStatsHolder$$anonfun$1.apply(MappingStatsHolder.scala:54)
at org.dbpedia.extraction.server.stats.MappingStatsHolder$$anonfun$1.apply(MappingStatsHolder.scala:54)
at scala.collection.MapLike$FilteredKeys$$anonfun$foreach$1.apply(MapLike.scala:231)
[...]