| 1 | +# Dictionary Formats |
| 2 | + |
| 3 | +Plover supports multiple proprietary steno dictionary formats, as well as |
| 4 | +some open formats widely used by the Open Steno community. At a high level, |
| 5 | +there are two main types of dictionaries: static and programmatic. |
| 6 | + |
| 7 | +## Static Dictionaries |
| 8 | + |
| 9 | +Static dictionaries consist of entries mapping steno outlines to translations. |
| 10 | +This is the simplest type of dictionary, and most Plover dictionaries will |
| 11 | +be of this form. |
| 12 | + |
| 13 | +### JSON |
| 14 | + |
| 15 | +The most common format for steno dictionaries in Plover is the **JavaScript |
| 16 | +Object Notation** (JSON) format. This consists of a series of key-value pairs |
| 17 | +separated by commas and surrounded by curly brackets `{}`: |
| 18 | + |
| 19 | +```json |
| 20 | +{ |
| 21 | + "KAT": "cat", |
| 22 | + "KAT/HROG": "catalog", |
| 23 | + "KA/TA/HROG": "catalog", |
| 24 | + "-S": "{^s}" |
| 25 | +} |
| 26 | +``` |
| 27 | + |
| 28 | +In each key-value pair, the key is the [canonical steno notation](steno-notation) |
| 29 | +of the outline, with the strokes separated by slashes, and the value is the |
| 30 | +translation for that outline in Plover's [translation language](translation_language). |
| 31 | +This format is used for dictionaries because it matches Plover's internal |
| 32 | +storage format almost exactly. |
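
To make the key-value structure concrete, here is a minimal sketch that parses the JSON text above with Python's standard `json` module and re-keys the entries by tuples of strokes. The helper name `load_entries` is hypothetical, and the tuple-keyed result is only an assumption about what a convenient in-memory form might look like; this is not Plover's actual loading code.

```python
import json

# The example dictionary from above, as raw JSON text.
RAW = """
{
  "KAT": "cat",
  "KAT/HROG": "catalog",
  "KA/TA/HROG": "catalog",
  "-S": "{^s}"
}
"""


def load_entries(text):
    """Parse JSON dictionary text into a {tuple_of_strokes: translation} dict.

    Hypothetical helper for illustration only.
    """
    entries = json.loads(text)
    # Split each outline on slashes so multi-stroke outlines become tuples.
    return {
        tuple(outline.split("/")): translation
        for outline, translation in entries.items()
    }


entries = load_entries(RAW)
print(entries[("KAT", "HROG")])
# The longest outline, in strokes, that this dictionary can translate.
print(max(len(outline) for outline in entries))
```

Keying by tuples keeps each stroke separate, which makes it straightforward to compute values such as the longest outline in the dictionary.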
| 33 | + |
| 34 | +### RTF/CRE |
| 35 | + |
| 36 | +Another common dictionary format, which is also supported by most proprietary |
| 37 | +steno software, is the |
| 38 | +[**Rich Text Format** with Court Reporting Extensions](http://www.legalxml.org/workgroups/substantive/transcripts/cre-spec.htm) |
| 39 | +(RTF/CRE) format. It was designed as an interchange format between steno |
| 40 | +systems, and Plover supports a subset of the features defined by the format. |
| 41 | + |
| 42 | +```rtf |
| 43 | +{\rtf1\ansi\cxrev100\cxdict |
| 44 | +{\*\cxs KAT}cat |
| 45 | +{\*\cxs KAT/HROG}catalog |
| 46 | +{\*\cxs KA/TA/HROG}catalog |
| 47 | +{\*\cxs -S}\cxds s{\*\cxcomment -s suffix} |
| 48 | +} |
| 49 | +``` |
| 50 | + |
| 51 | +In RTF dictionaries, the steno outline is written in the same notation as in |
| 52 | +JSON, but the translation isn't written in Plover's translation language; |
| 53 | +instead it uses RTF-specific formatting controls, which each supporting steno |
| 54 | +system maps to its own commands. RTF also supports some entry-level |
| 55 | +metadata, such as comments and historical usage data, but these can't be |
| 56 | +read by Plover. |
| 57 | + |
| 58 | +It's generally not recommended to maintain RTF dictionaries, since they can be |
| 59 | +slow to parse and the format isn't especially well defined. The format is |
| 60 | +still a reasonable option when, for example, professional stenographers want |
| 61 | +to use their existing personal dictionaries with Plover. |
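
As a rough illustration of the entry layout, the sketch below pulls outline and raw translation pairs out of the RTF example above with a regular expression. It is a deliberately simplified assumption about the format (one entry per line, no nested groups inside translations) and not how Plover or any other steno software parses RTF/CRE; the names `ENTRY_RE` and `extract_entries` are hypothetical.

```python
import re

# The example dictionary from above, as raw RTF text.
RTF_TEXT = r"""{\rtf1\ansi\cxrev100\cxdict
{\*\cxs KAT}cat
{\*\cxs KAT/HROG}catalog
{\*\cxs KA/TA/HROG}catalog
{\*\cxs -S}\cxds s{\*\cxcomment -s suffix}
}
"""

# Matches "{\*\cxs OUTLINE}" followed by the raw translation text, which runs
# until the next brace or the end of the line.
ENTRY_RE = re.compile(r"\{\\\*\\cxs ([^}]+)\}([^{}\r\n]*)")


def extract_entries(text):
    """Return {outline: raw_rtf_translation} for each steno entry found."""
    return {outline: raw.strip() for outline, raw in ENTRY_RE.findall(text)}


rtf_entries = extract_entries(RTF_TEXT)
for outline, raw in rtf_entries.items():
    print(outline, "->", raw)
```

Note that the extracted translations are still raw RTF: the `-S` entry comes out as `\cxds s`, and the trailing `\cxcomment` group is skipped. A real importer would still need to convert control words like `\cxds` into the target system's own commands.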
| 62 | + |
| 63 | +### Proprietary Formats |
| 64 | + |
| 65 | +Plover also supports some proprietary software's native dictionary formats, |
| 66 | +with the help of some plugins: |
| 67 | + |
| 68 | +- [plover-casecat-dictionary](https://github.com/marnanel/plover_casecat_dictionary) -- Stenograph Case CATalyst dictionaries (`.sgdct`) |
| 69 | +- [plover-digitalcat-dictionary](https://github.com/marnanel/plover_digitalcat_dictionary) -- Stenovations digitalCAT dictionaries (`.dct`) |
| 70 | +- [plover-eclipse-dictionary](https://github.com/marnanel/plover_eclipse_dictionary) -- Advantage Software Eclipse dictionaries (`.dix`) |
| 71 | + |
| 72 | +## Programmatic Dictionaries |
| 73 | + |
| 74 | +Programmatic dictionaries, instead of containing a list of entries, calculate |
| 75 | +translations on the fly, the moment Plover requests them. This is most useful |
| 76 | +for heavily regular dictionaries like a symbol system or a syllabic theory. |
| 77 | + |
| 78 | +The [plover-python-dictionary](https://github.com/benoit-pierre/plover_python_dictionary) |
| 79 | +plugin adds support for programmatic dictionaries written in Python, which can |
| 80 | +be used in Plover just like static ones. |
| 81 | + |
| 82 | +Programmatic dictionaries primarily expose a lookup function, which calculates |
| 83 | +a translation for a given steno outline. Some dictionaries may also provide a |
| 84 | +reverse-lookup function, which calculates all the possible outlines that |
| 85 | +translate to a particular text. |
| 86 | + |
| 87 | +```{data} LONGEST_KEY |
| 88 | +The maximum number of strokes that this dictionary can translate. Plover uses |
| 89 | +this value to optimize dictionary lookups by only using this dictionary when |
| 90 | +looking up outlines this length or shorter. |
| 91 | + |
| 92 | +This attribute is **required**. |
| 93 | +``` |
| 94 | + |
| 95 | +```{function} lookup(outline: Tuple[str, ...]) -> str |
| 96 | +Given an outline which is a tuple of steno strokes, returns the translation for |
| 97 | +this outline, or raises a `KeyError` when no translation is available. The |
| 98 | +translation should be in Plover's [translation language](translation_language). |
| 99 | + |
| 100 | +This function is **required**. |
| 101 | +``` |
| 102 | + |
| 103 | +```{function} reverse_lookup(translation: str) -> List[Tuple[str, ...]] |
| 104 | +Given a translation in Plover's [translation language](translation_language), |
| 105 | +returns the list of possible outlines that translate to it. The list may be |
| 106 | +empty if there are no possible outlines in this dictionary. |
| 107 | + |
| 108 | +This function is *optional*; the dictionary still works without implementing |
| 109 | +it, but it will not support searching in the Lookup tool. |
| 110 | +``` |
| 111 | + |
| 112 | +Here is an example of a very basic programmatic dictionary which just |
| 113 | +translates `KP-PL` to `example`: |
| 114 | + |
| 115 | +```python |
| 116 | +LONGEST_KEY = 1  # this dictionary only translates single-stroke outlines |
| 117 | + |
| 118 | + |
| 119 | +def lookup(outline): |
| 120 | +    assert len(outline) == 1  # Plover only passes outlines up to LONGEST_KEY strokes |
| 121 | + |
| 122 | +    stroke = outline[0] |
| 123 | +    if stroke == "KP-PL": |
| 124 | +        return "example" |
| 125 | +    else: |
| 126 | +        raise KeyError  # no translation for this outline |
| 127 | + |
| 128 | + |
| 129 | +def reverse_lookup(translation): |
| 130 | +    if translation == "example": |
| 131 | +        return [("KP-PL",)]  # a list of outlines, each a tuple of strokes |
| 132 | +    else: |
| 133 | +        return [] |
| 134 | +``` |
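
For a slightly more regular case, the following sketch computes translations from rules instead of listing every entry, which is the situation where programmatic dictionaries are most useful. The stroke tables `SYMBOLS` and `REPEATS`, and the outlines in them, are invented for this example and don't come from any real theory; only `LONGEST_KEY`, `lookup`, and `reverse_lookup` follow the interface described above.

```python
LONGEST_KEY = 2  # one symbol stroke plus an optional repeat stroke

# Hypothetical stroke tables, invented for this sketch.
SYMBOLS = {"SKHRAPL": "!", "KWEZ": "?"}
REPEATS = {"-T": 2, "-D": 3}


def lookup(outline):
    # Raise KeyError for anything this dictionary cannot translate.
    if not 1 <= len(outline) <= LONGEST_KEY:
        raise KeyError(outline)
    symbol = SYMBOLS[outline[0]]  # KeyError if the first stroke is unknown
    if len(outline) == 1:
        return symbol
    return symbol * REPEATS[outline[1]]  # KeyError if not a repeat stroke


def reverse_lookup(translation):
    # Enumerate every rule combination that produces this translation.
    outlines = []
    for stroke, symbol in SYMBOLS.items():
        if translation == symbol:
            outlines.append((stroke,))
        for repeat_stroke, count in REPEATS.items():
            if translation == symbol * count:
                outlines.append((stroke, repeat_stroke))
    return outlines


print(lookup(("KWEZ", "-D")))
print(reverse_lookup("!!"))
```

With two symbols and two repeat strokes the rules already cover six outlines; adding a symbol or a repeat count extends the dictionary without writing out each combination by hand.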