6.5.5 Translating strings

To replace characters or substrings in a string use the template:
to 
   'A-Z a-z'
translate
   'Foo'
endto
which outputs ‘foo’. The clauses between to and translate form the translator, and those between translate and endto form the output to which that translator is applied. For example, the rule ‘A-Z a-z’ above means that each uppercase letter is converted into the corresponding lowercase letter.
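
For readers who want to experiment outside MetaEdit+, the effect of a range rule such as 'A-Z a-z' can be approximated in Python. This is only a sketch of the behaviour described above, not MERL and not the product's implementation: the helper names expand_range and range_rule are hypothetical, and escaping of special characters is not handled.

```python
def expand_range(spec):
    """Expand a range such as 'A-Z' (or a reversed one such as 'z-a') into characters."""
    start, end = spec.split("-")
    step = 1 if ord(start) <= ord(end) else -1
    return [chr(code) for code in range(ord(start), ord(end) + step, step)]


def range_rule(rule):
    """Build a character table from a rule written as 'left-range right-range'."""
    left, right = rule.split(" ")
    sources, targets = expand_range(left), expand_range(right)
    if len(sources) != len(targets):
        # Assumption: mirror the documented requirement that ranges be of equal size.
        raise ValueError("ranges must be of equal size")
    return str.maketrans(dict(zip(sources, targets)))


# Counterpart of: to 'A-Z a-z' translate 'Foo' endto
print("Foo".translate(range_rule("A-Z a-z")))  # foo
```

Characters that no rule mentions, such as the two lowercase letters in 'Foo', pass through unchanged.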

Translators can be named and used many times. For example,
to 
   '%lower' newline 'A-Z a-z'
endto 
defines a translator named ‘lower’ whose rule is ‘A-Z a-z’. The clauses between to and endto form the translator definition: the first line sets the translator name and the following lines contain the rules. Once defined, the translator can be used any number of times:
to '%lower' translate 'Foo' endto
There is also a shortcut syntax for using named translators with simple commands, design element output commands, variables and literal strings:
id;2;%lower
:myName%lower
$variable%lower
A translator definition can contain multiple rules, each a single line of text that maps a left-hand side to a right-hand side. There are several kinds of rule, such as character to character or string to string; the available kinds are explained in the table below. Special characters (newline space \ / $ % - *) must be escaped with a backslash, e.g. to map spaces to underscores use: '\ _' (backslash, space, space, underscore). If the translator definition is written as a literal string, remember that a single quote ' must also be escaped by doubling it.
Name or comment
'%myName' as the first line in a translator definition gives the translator the name “myName”. Later lines in the translator that start with % are ignored as comments.
Character
'a b' maps each occurrence of character a to character b.
Range
'1-9 a-i' maps each character in the range on the left to the corresponding character in the range on the right. In this example numbers are mapped to letters: 1 becomes a, 2 becomes b and so on. Note that ranges must be of equal size, thus ‘a-c 1-4’ is not legal.
Note: a range can be reversed, e.g. 'a-z z-a'.
Multiple character
'123 abc' maps each number to a letter: 1 to a, 2 to b, 3 to c. The difference from range is that each character is specified explicitly.
String
'$dog $cat' means replace each occurrence of the string ‘dog’ with ‘cat’.
Mixed
'aeiou $VOWEL' means replace each vowel with the string "VOWEL". This rule is applied together with the character translations, not with the string rules.
Asterisk
An asterisk on the left is the default mapping, i.e. what to map all otherwise unspecified characters to (without it, they are left unchanged): '* $abc' means replace each such character with the string "abc".
An asterisk on the right means leave the characters on the left unchanged: 'abc *' means do not change a, b and c.
Regular expression
'/[A-Z][a-z]*/ $NAME' means replace each occurrence of a capital letter followed by lowercase letters with NAME. The left-hand side need not escape special translation characters, but can use the normal regular expression escapes; / must be escaped by doubling it.
The right-hand side (after the initial $) can use $0 to refer to the whole matched string, $1 for the substring matching the first parenthesized subexpression etc. E.g. the following rule (which should be on one line) would turn “Fred Bloggs and John Doe” into “Bloggs, Fred and Doe, John”
/([A-Z][a-z]*) ([A-Z][a-z]*)/ $$2,\ $1
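
The name-swapping rule above can be tried out in Python, whose re module writes \1 and \2 where the translator's right-hand side writes $1 and $2. This is a rough equivalent for experimentation only: the function name swap_names is hypothetical, and the regular expression dialect used by translators is assumed, not confirmed, to agree with Python's for this simple pattern.

```python
import re

# Left-hand side of the rule: two capitalised words separated by one space.
NAME_PAIR = re.compile(r"([A-Z][a-z]*) ([A-Z][a-z]*)")


def swap_names(text):
    """Rewrite every 'First Last' pair as 'Last, First'."""
    # r"\2, \1" plays the role of the rule's right-hand side.
    return NAME_PAIR.sub(r"\2, \1", text)


print(swap_names("Fred Bloggs and John Doe"))  # Bloggs, Fred and Doe, John
```

The word "and" is left alone because it does not start with a capital letter, so it cannot begin a match.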
All rules that apply to single characters are first collected into one large character mapping, which is applied to the input text in a single operation. After that, all rules that apply to strings, including regular expression rules, are applied in order, one at a time, to the whole text. If you need to change this order, e.g. to translate strings first and then characters, use two translators: the first translates just strings and the second just characters, and applying the first and then the second achieves the desired result. The right-hand side of a regular expression rule can also have its subexpression matches translated, e.g. $1%upper; finds the match for $1 and then translates it with the %upper translator. Note that the semicolon is obligatory here.
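
The order of application described above (one combined character mapping first, then each string rule in turn) matters when a character rule changes text that a string rule is looking for. The following Python sketch models that order under simplifying assumptions: apply_translator is a hypothetical helper, rules are passed as Python data instead of being parsed from rule lines, and regular expression rules are left out.

```python
def apply_translator(text, char_rules, string_rules):
    """Apply all character rules in one pass, then each string rule in order."""
    text = text.translate(str.maketrans(char_rules))
    for old, new in string_rules:
        text = text.replace(old, new)
    return text


# One translator holding both the rule 'c k' and the rule '$cat $dog':
# the character rule runs first, so the string rule no longer finds 'cat'.
combined = apply_translator("cat", {"c": "k"}, [("cat", "dog")])

# Two translators applied in sequence: strings first, then characters.
strings_first = apply_translator("cat", {}, [("cat", "dog")])
strings_first = apply_translator(strings_first, {"c": "k"}, [])

print(combined)       # kat
print(strings_first)  # dog
```

Splitting the rules across two translators is what lets the string rule see the original text before any character is changed.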

As translators can perform almost any edits on texts, the result of a translator does not preserve any formatting from the original text.

Some useful translators, such as %lower, can be found in the _translators generator in the Graph metatype. To use these translators in your own generators, call _translators() somewhere near the start of your outermost generator.
