Normalizing speech data #92

@jugglinmike

Implementation experience in the ARIA-AT project has uncovered a large amount of variation in the formatting of the text that screen readers send to text-to-speech engines.

For instance, we've observed text from JAWS such as:

Print Page  \u001d Button \u001e

(Where \u001d and \u001e represent the Unicode "group separator" and "record separator", respectively)

And text from VoiceOver like:

  Print Page
              button
  You are currently on a button. To click this button, press 
Control            -Option            -Space.

(Note the copious amount of empty space, including a trailing space on the third line)

To be sure, these examples are not only accurate but also compliant. The specification places no constraints on the way the text is formatted. The relevant language reads:

When the assistive technology would send some text data (a string, without speech-specific markup or annotations) to the Text-To-Speech system, or equivalent for non-speech assistive technology software, run these steps:

However, those examples are not the most intuitive way to express the spoken text. The formatting is important to ARIA-AT, so we've written some logic to normalize it at the application level. Since I expect formatting will also be important to many future consumers of the protocol, this seems like an opportunity for the standard to reduce repeated work.
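For illustration, application-level normalization of this kind could look roughly like the Python sketch below. This is not the ARIA-AT code: the function name `normalize_speech` and the decision to treat every Unicode control character as a plain space are assumptions made for the example.

```python
import unicodedata


def normalize_speech(text: str) -> str:
    """Replace control characters with spaces, then collapse whitespace.

    Hypothetical helper: the rules here are an assumption, not something
    the specification or ARIA-AT prescribes.
    """
    # Unicode general category "Cc" covers U+001D (group separator) and
    # U+001E (record separator) as well as tabs and line breaks.
    spaced = "".join(
        " " if unicodedata.category(ch) == "Cc" else ch for ch in text
    )
    # Collapse every run of whitespace to one space and trim both ends.
    return " ".join(spaced.split())


jaws = "Print Page  \u001d Button \u001e"
voiceover = (
    "  Print Page\n"
    "              button\n"
    "  You are currently on a button. To click this button, press \n"
    "Control            -Option            -Space."
)

print(normalize_speech(jaws))
print(normalize_speech(voiceover))
```

With these rules the JAWS sample becomes `Print Page Button`. The VoiceOver sample still ends in `Control -Option -Space.`, since deciding whether the space before each hyphen is meaningful is outside what simple whitespace collapsing can settle.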

A number of concerns come to mind:

  • removing details which have no impact on the vocalized text (e.g. extraneous whitespace, line breaks, some punctuation, some capitalization)
  • using a data type other than a simple string (e.g. an array of strings, each describing a discrete utterance)
  • expressing this in a localizable way (at first blush, Unicode's offerings seem promising)
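To make these concerns a bit more concrete, here is a rough Python sketch of an array-of-utterances representation plus a Unicode-based comparison key. Everything in it is hypothetical: the names `to_utterances` and `comparison_key`, the assumption that U+001D, U+001E, and line breaks mark utterance boundaries, and the assumption that all punctuation and case can be dropped without changing the vocalized text.

```python
import re
import unicodedata

# Assumption: these characters separate discrete utterances. Nothing in
# the specification says so; it is only a guess based on the samples.
_BOUNDARY = re.compile(r"[\u001d\u001e\r\n]+")


def to_utterances(text: str) -> list[str]:
    """Split raw speech text into a list of whitespace-collapsed utterances."""
    parts = _BOUNDARY.split(text)
    cleaned = (" ".join(part.split()) for part in parts)
    # Drop pieces that contained nothing but whitespace.
    return [part for part in cleaned if part]


def comparison_key(utterance: str) -> str:
    """Build a key for comparing utterances, ignoring case and punctuation."""
    # NFKC normalization and casefold() are defined by Unicode rather than
    # by one language, which is what makes them candidates here.
    folded = unicodedata.normalize("NFKC", utterance).casefold()
    # General categories starting with "P" are the punctuation classes.
    kept = "".join(
        ch for ch in folded if not unicodedata.category(ch).startswith("P")
    )
    return " ".join(kept.split())


print(to_utterances("Print Page  \u001d Button \u001e"))
print(comparison_key("Control -Option -Space."))
```

Applied to the VoiceOver sample, splitting on line breaks would separate "press" from "Control -Option -Space.", which suggests that line breaks are not a reliable utterance boundary and that the boundary rule would need to come from the screen reader itself.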

Should we expect implementations to eventually improve and "do the right thing" in these regards? Or should we constrain speech data in some way? If so, how?
