
Sub-Utterance Chant Alignment

Inline word- and phrase-level zema alignment and milikit markers.

Chant metadata does not always apply to a whole utterance. OLS v1.0 defines inline segments that attach chant metadata to a character range within an utterance's text.

{
  "id": "ut-segmented-zema",
  "roles": ["role-cantor"],
  "mode": "chanted",
  "text": {
    "gez-Ethi": "ቅዱስ እግዚአብሔር"
  },
  "segments": [
    {
      "id": "seg-001",
      "range": { "language": "gez-Ethi", "startChar": 0, "endChar": 3 },
      "text": "ቅዱስ",
      "chant": { "system": "yared", "mode": "geez", "milikit": ["mark-001"] }
    }
  ]
}

Range Integrity

Character ranges are interpreted against the exact string identified by range.language. Implementers should validate that:

  1. startChar is less than endChar.
  2. The range falls within the selected language value.
  3. The copied segment text matches that range.
  4. Segments do not overlap unless the active profile explicitly permits overlap.
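The four checks above can be sketched as a small validator. This is an illustration, not part of OLS: the function name validate_segments and the allow_overlap flag are hypothetical, and the sketch assumes that startChar and endChar count Unicode code points and that endChar is exclusive. Confirm both conventions against the schema before relying on it.

```python
def validate_segments(utterance, allow_overlap=False):
    """Return a list of problems found; an empty list means every check passed.

    Assumption (not stated by the spec text above): offsets are Unicode
    code points and endChar is exclusive, which matches Python slicing.
    """
    errors = []
    valid_ranges = {}  # language -> list of (start, end, segment id)

    for seg in utterance.get("segments", []):
        seg_id = seg.get("id", "<no id>")
        rng = seg["range"]
        lang, start, end = rng["language"], rng["startChar"], rng["endChar"]

        source = utterance.get("text", {}).get(lang)
        if source is None:
            errors.append(f"{seg_id}: utterance has no text for {lang!r}")
            continue

        # Check 1: startChar is less than endChar.
        if not start < end:
            errors.append(f"{seg_id}: startChar {start} is not less than endChar {end}")
            continue

        # Check 2: the range falls within the selected language value.
        if start < 0 or end > len(source):
            errors.append(f"{seg_id}: range {start}..{end} is outside the {lang!r} text")
            continue

        # Check 3: the copied segment text matches the range.
        if source[start:end] != seg.get("text"):
            errors.append(f"{seg_id}: text does not match range {start}..{end}")

        valid_ranges.setdefault(lang, []).append((start, end, seg_id))

    # Check 4: segments in the same language do not overlap,
    # unless the caller says the active profile permits it.
    if not allow_overlap:
        for ranges in valid_ranges.values():
            ranges.sort()
            furthest_end, furthest_id = ranges[0][1], ranges[0][2]
            for start, end, seg_id in ranges[1:]:
                if start < furthest_end:
                    errors.append(f"{seg_id}: overlaps {furthest_id}")
                if end > furthest_end:
                    furthest_end, furthest_id = end, seg_id

    return errors
```

Returning a list of messages rather than raising on the first failure lets a tool report every bad segment in one pass. Overlap is only compared between segments that share a range.language, since offsets in different language values are unrelated.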

Do not reuse Ge’ez character offsets for a transliteration or translation. Each writing system may have different token and character boundaries, so aligned languages need their own ranges.
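A short sketch shows why offsets cannot be shared. The "gez-Latn" key and the romanized spelling below are hypothetical placeholders chosen for illustration, and word_ranges is a helper invented here; offsets are again assumed to be code points with an exclusive end.

```python
import re

text = {
    "gez-Ethi": "ቅዱስ እግዚአብሔር",
    # Hypothetical transliteration, for illustration only.
    "gez-Latn": "qeddus egziabher",
}

def word_ranges(value):
    """Return (start, end) code point ranges for each whitespace-separated word."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", value)]

ethi = word_ranges(text["gez-Ethi"])  # [(0, 3), (4, 11)]
latn = word_ranges(text["gez-Latn"])  # [(0, 6), (7, 16)]

# Applying the Ge'ez range for the first word to the transliteration
# cuts the Latin word in half instead of selecting it.
start, end = ethi[0]
print(text["gez-Latn"][start:end])  # qed
```

Each Ethiopic syllable is a single character while its romanization usually takes two or more, so the same word occupies different ranges in each value.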
