Server crashes with StringIndexOutOfBoundsException when processing Macedonian (mk) templates using 'Шаблон:' namespace #804

@haniyakonain

Describe the bug
The server crashes during startup with a StringIndexOutOfBoundsException while processing Macedonian (mk) template redirects. Many Macedonian Wikipedia templates use the 'Шаблон:' prefix (Cyrillic for "Template") instead of the expected 'Предлошка:' namespace prefix, so the substring operation fails with index -1.

The crash occurs at:

java.lang.StringIndexOutOfBoundsException: String index out of range: -1
        at java.lang.String.substring(String.java:1931)
        at org.dbpedia.extraction.server.stats.MappingStatsHolder$$anonfun$1.apply(MappingStatsHolder.scala:54)

More than 100 mk templates are affected, including:

  • Шаблон:Инфокутија Верски објект
  • Шаблон:2TeamBracket
  • Шаблон:Инфокутија Православна црква
  • Шаблон:Оклопно возило
  • And many more...

Expected behaviour
The server should either:

  1. Handle alternative namespace prefixes for mk language (recognizing both 'Предлошка:' and 'Шаблон:')
  2. Log warnings and skip invalid templates without crashing (PR #795 provides a temporary fix for this)
  3. Successfully start and process mk language templates without throwing exceptions
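Option 1 could look roughly like the following. This is a minimal sketch, not the code in MappingStatsHolder.scala: the object name `TemplateTitles`, the method `stripTemplatePrefix`, and the hard-coded prefix list are assumptions made for illustration. In the real server the accepted prefixes would presumably come from the mk language configuration.

```scala
object TemplateTitles {
  // Hypothetical prefix list for mk, built from the two prefixes named in this issue.
  val mkTemplatePrefixes: Seq[String] = Seq("Предлошка:", "Шаблон:")

  /** Returns the title without its namespace prefix, or None if no known prefix matches. */
  def stripTemplatePrefix(title: String, prefixes: Seq[String]): Option[String] =
    prefixes
      .find(p => title.startsWith(p))        // first prefix the title actually starts with
      .map(p => title.substring(p.length))   // safe: startsWith guarantees the length
}
```

With this shape, `TemplateTitles.stripTemplatePrefix("Шаблон:Оклопно возило", TemplateTitles.mkTemplatePrefixes)` yields `Some("Оклопно возило")`, and a title carrying neither prefix yields `None` instead of throwing.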

Environment

  • Extraction (commit hash): 5eb208b932a63a6f0cd5cbede3e446315686e6a7 (enable-wikidata-server branch)
  • OS: Linux 6.14.0-33-generic (Ubuntu)
  • Java SDK Version (java --version): 1.8.0_462 (OpenJDK)
  • Maven version (mvn --version): Apache Maven 3.8.7

To reproduce

  1. Enable Macedonian (mk) language in server.default.properties with @mappings or any configuration
  2. Start the DBpedia extraction server: cd server && ../run server
  3. Server attempts to load mk template statistics
  4. Server crashes with StringIndexOutOfBoundsException during MappingStatsHolder initialization

Additional context & logs
The root cause is in MappingStatsHolder.scala:54, where the code expects every template title to start with 'Предлошка:', but many mk templates use 'Шаблон:'.

Related:

  • PR #795 fixes the crash by adding validation but doesn't address the namespace mismatch
  • This may require updating the mk language configuration to recognize 'Шаблон:' as a valid template namespace
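For reference, a generic validate-and-skip pass over redirects might be sketched as below. This is not the code from PR #795. The `RedirectFilter` object, the `stripOrSkip` method, and the `Map[String, String]` shape of the redirects are assumptions made for illustration only.

```scala
import java.util.logging.Logger

object RedirectFilter {
  private val logger = Logger.getLogger(getClass.getName)

  /**
   * Keeps only redirects whose source and target both start with `prefix`
   * and strips the prefix from both sides. Other entries are logged and skipped.
   * Hypothetical helper; the redirect data structure is assumed to be a plain Map.
   */
  def stripOrSkip(redirects: Map[String, String], prefix: String): Map[String, String] = {
    val (valid, invalid) = redirects.partition { case (from, to) =>
      from.startsWith(prefix) && to.startsWith(prefix)
    }
    // Warn about every entry that would otherwise break the substring call.
    invalid.foreach { case (from, to) =>
      logger.warning(s"skipping redirect '$from' -> '$to': missing prefix '$prefix'")
    }
    valid.map { case (from, to) =>
      from.substring(prefix.length) -> to.substring(prefix.length)
    }
  }
}
```

A pass like this keeps the server starting, but it silently drops every 'Шаблон:' template from the statistics, which is why the namespace mismatch still needs a separate fix.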

Full stack trace:

org.dbpedia.extraction.server.stats.MappingStatsHolder$$anonfun$apply$2 apply
WARNING: mk template 'Шаблон:Инфокутија Верски објект' does not start with 'Предлошка:'
[... 100+ similar warnings ...]

java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at scala_maven_executions.MainHelper.runMain(MainHelper.java:164)
        at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
        at java.lang.String.substring(String.java:1931)
        at org.dbpedia.extraction.server.stats.MappingStatsHolder$$anonfun$1.apply(MappingStatsHolder.scala:54)
        at org.dbpedia.extraction.server.stats.MappingStatsHolder$$anonfun$1.apply(MappingStatsHolder.scala:54)
        at scala.collection.MapLike$FilteredKeys$$anonfun$foreach$1.apply(MapLike.scala:231)
        [...]
