Sub-Utterance Chant Alignment
Inline word-level and phrase-level zema alignment and milikit markers.
Chant metadata does not always apply to a whole utterance. OLS v1.0 defines inline segments so that chant metadata can attach to a word or phrase within an utterance.
{
  "id": "ut-segmented-zema",
  "roles": ["role-cantor"],
  "mode": "chanted",
  "text": {
    "gez-Ethi": "ቅዱስ እግዚአብሔር"
  },
  "segments": [
    {
      "id": "seg-001",
      "range": { "language": "gez-Ethi", "startChar": 0, "endChar": 3 },
      "text": "ቅዱስ",
      "chant": { "system": "yared", "mode": "geez", "milikit": ["mark-001"] }
    }
  ]
}
Range Integrity
Character ranges are interpreted against the exact string identified by range.language. Implementers should validate that:
- startChar is less than endChar.
- The range falls within the selected language value.
- The copied segment text matches that range.
- Segments do not overlap unless the active profile explicitly permits overlap.
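The checks above can be sketched as a small validator. This is a minimal illustration, not part of OLS: the Segment class, the validate_segments function, and the allow_overlap flag are hypothetical names, and the sketch assumes that offsets count Unicode code points and that endChar is exclusive, neither of which is stated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical in-memory form of one entry in "segments"."""
    id: str
    language: str  # range.language
    start: int     # range.startChar
    end: int       # range.endChar (assumed exclusive)
    text: str      # copied segment text


def validate_segments(text_by_lang, segments, allow_overlap=False):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    checked = {}  # language -> segments that passed the per-segment checks
    for seg in segments:
        value = text_by_lang.get(seg.language)
        if value is None:
            problems.append(f"{seg.id}: no text for language {seg.language!r}")
        elif not seg.start < seg.end:
            problems.append(f"{seg.id}: startChar must be less than endChar")
        elif seg.start < 0 or seg.end > len(value):
            problems.append(f"{seg.id}: range falls outside the {seg.language} value")
        elif value[seg.start:seg.end] != seg.text:
            problems.append(f"{seg.id}: text does not match the range")
        else:
            checked.setdefault(seg.language, []).append(seg)

    if not allow_overlap:
        for segs in checked.values():
            ordered = sorted(segs, key=lambda s: (s.start, s.end))
            widest = ordered[0]  # segment reaching furthest right so far
            for cur in ordered[1:]:
                if cur.start < widest.end:
                    problems.append(f"{widest.id} and {cur.id}: ranges overlap")
                if cur.end > widest.end:
                    widest = cur
    return problems


text = {"gez-Ethi": "ቅዱስ እግዚአብሔር"}
first_word = Segment("seg-001", "gez-Ethi", 0, 3, "ቅዱስ")
print(validate_segments(text, [first_word]))
```

Returning a list of problems rather than raising on the first failure lets a tool report every bad segment in an utterance at once. Pass allow_overlap=True only when the active profile explicitly permits overlapping segments.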
Do not reuse Ge’ez character offsets for a transliteration or translation. Each writing system may have different token and character boundaries, so aligned languages need their own ranges.
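To make the point concrete, the sketch below computes the range of the same word separately for each language value. The gez-Latn key and its transliteration are illustrative assumptions and are not taken from OLS, word_range is a hypothetical helper, and offsets are again assumed to be code point indexes with an exclusive endChar.

```python
def word_range(value, word):
    """Locate the first occurrence of word in value and return its range."""
    start = value.find(word)
    if start < 0:
        raise ValueError(f"{word!r} not found in {value!r}")
    return {"startChar": start, "endChar": start + len(word)}


text = {
    "gez-Ethi": "ቅዱስ እግዚአብሔር",
    "gez-Latn": "qeddus egziabher",  # illustrative transliteration only
}

# Each language value gets a range computed against its own string.
ranges = {
    "gez-Ethi": word_range(text["gez-Ethi"], "እግዚአብሔር"),
    "gez-Latn": word_range(text["gez-Latn"], "egziabher"),
}
print(ranges)

# Reusing the Ge'ez offsets on the transliteration cuts across word boundaries.
ethi = ranges["gez-Ethi"]
print(repr(text["gez-Latn"][ethi["startChar"]:ethi["endChar"]]))
```

Under these assumptions the second word spans 4 to 11 in the Ge'ez string but 7 to 16 in the transliteration, and slicing the transliteration with the Ge'ez offsets yields the fragment "us egzi" rather than a whole word.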