
   characters are listed with their mnemonic in ascending order.  A
   character mnemonic of "??" indicates that the position is unused.  A
   character mnemonic of "__" indicates that the character set is not
   completely defined with the specifications in this memo.

   "&code2" has 2 parameters specifying the row and column in certain
   16-bit character sets.  The value 32 must be added to obtain the
   first and second byte respectively.  Mnemonics can be specified after
   the "&code2" specification as mentioned for the "&code"
   specification.
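
   The row/column arithmetic for "&code2" can be sketched in Python as
   follows.  This is a minimal illustration, not part of the memo: the
   function name is hypothetical, and the only range check is that each
   resulting value still fits in one octet.

```python
def code2_to_bytes(row: int, column: int) -> bytes:
    """Turn a "&code2" row/column pair into its two-octet code.

    Hypothetical helper: 32 is added to the row to get the first byte
    and to the column to get the second byte.
    """
    first, second = row + 32, column + 32
    # Assumption: both results must fit in a single octet.
    if not (0 <= first <= 255 and 0 <= second <= 255):
        raise ValueError("row/column out of range for a two-octet code")
    return bytes([first, second])


# Row 16, column 1 gives octets 48 and 33 (hex 30 21).
print(code2_to_bytes(16, 1).hex())  # 3021
```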

   "&codex" has 5 parameters, specifying the character set prefix
   string, the start row number, the end row number, the start column
   number and the end column number respectively. This is equivalent to
   specifying a series of mnemonics of the form "nrrcc" where "n" is the
   character set name prefix string, "rr" is the row number running from
   the specified start row number to the end row number, and "cc" is the
   column number running from the specified start column number to the
   end column number.  The series of mnemonics thus created is
   allocated to code positions obtained by adding 32 to the row and
   column numbers to get the row and column octets.
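
   The expansion of a "&codex" specification can be sketched as below.
   This is an illustration under stated assumptions: the function name
   and the prefix "j" are hypothetical, and row and column numbers are
   assumed to be written as two decimal digits each, as the "nrrcc"
   form suggests.

```python
def expand_codex(prefix, row_start, row_end, col_start, col_end):
    """Yield (mnemonic, octets) pairs for one "&codex" specification."""
    for row in range(row_start, row_end + 1):
        for col in range(col_start, col_end + 1):
            # Assumption: "rr" and "cc" are zero-padded two-digit decimals.
            mnemonic = f"{prefix}{row:02d}{col:02d}"
            # Add 32 to row and column to get the row and column octets.
            yield mnemonic, bytes([row + 32, col + 32])


# Hypothetical prefix "j", row 16 only, columns 1 to 3.
for mnemonic, octets in expand_codex("j", 16, 16, 1, 3):
    print(mnemonic, octets.hex())
```

   With these arguments the sketch produces the mnemonics "j1601",
   "j1602" and "j1603", allocated to octet pairs 30 21, 30 22 and 30 23
   (hexadecimal).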

   "&duplicate" has a special meaning indicating that a position is
   being used for more than one character. This is an ugly convention
   but it is a sad fact of life that the same code in one coded
   character set can mean different characters. "&duplicate" takes two
   parameters

Simonsen                                                       [Page 43]

RFC 1345          Character Mnemonics & Character Sets         June 1992


   - the first is the code to be duplicated, the other is the new
   mnemonic.

   "&rem" is followed by text to explain something in the table to a
   human reader.  All lines in such a remark have to start with this
   keyword.

   "&comb2" specifies a combination of two characters which signifies a
   third character.  All characters in the specification are given by
   their mnemonic.  The two combining characters must be specified
   previously in the code table.  The first combining character is
   specified as the first character after the keyword, and then the
   following pairs of characters are the second combining character and
   the result, respectively.  The specification can be repeated,
   terminated by an occurrence of a keyword.
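
   One way to read a single "&comb2" specification is sketched below.
   The sketch assumes that the mnemonics are separated by whitespace,
   handles only one specification (not a repeated series), and does not
   check that the combining characters were defined earlier in the code
   table.  The function name and the sample line are illustrative and
   are not taken from any table in this memo.

```python
def parse_comb2(spec: str) -> dict:
    """Parse one "&comb2" line into {(first, second): result}."""
    tokens = spec.split()  # assumption: whitespace-separated mnemonics
    if len(tokens) < 2 or tokens[0] != "&comb2":
        raise ValueError("not a &comb2 specification")
    first, rest = tokens[1], tokens[2:]
    if len(rest) % 2 != 0:
        raise ValueError("expected (second character, result) pairs")
    # Each pair is the second combining character and the result.
    return {(first, rest[i]): rest[i + 1] for i in range(0, len(rest), 2)}


# Illustrative input: first combining character "'!", then two pairs.
table = parse_comb2("&comb2 '! A A! E E!")
print(table[("'!", "A")])  # A!
```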

4.3  Mnemonic charsets

   The following is compatible with current practice on the internet
   within EUnet - the European not-for-profit networking organisation in
   Europe and North Africa currently operating in 24 countries.

   The mnemonic charsets are a family of charsets which have the
   facility that within the relevant parts of the message, encoded in an
   ordinary coded character set, text may have occurrences of the
   following sequence: an intro character sequence, followed by a string
   of characters that represent a character mnemonic, as described
   below.  Similarly, the intro character sequence may be doubled,
   indicating a single occurrence of the respective symbols in decoded
   format.
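
   A decoder for such text might look like the following sketch.  It
   makes several assumptions that are not stated in this paragraph: the
   intro is the ampersand, every mnemonic is exactly two characters
   long, and the three-entry table is a hypothetical excerpt of a full
   mnemonic table.

```python
# Hypothetical excerpt of a mnemonic table (two-character mnemonics only).
MNEMONICS = {"a:": "\u00e4", "e'": "\u00e9", "o/": "\u00f8"}


def decode(text: str, intro: str = "&") -> str:
    """Replace intro+mnemonic sequences; a doubled intro yields one intro."""
    out = []
    i = 0
    while i < len(text):
        if not text.startswith(intro, i):
            out.append(text[i])  # ordinary character, copied unchanged
            i += 1
            continue
        i += len(intro)
        if text.startswith(intro, i):
            out.append(intro)  # doubled intro stands for a single intro
            i += len(intro)
            continue
        mnemonic = text[i:i + 2]  # assumption: two-character mnemonics
        if mnemonic not in MNEMONICS:
            raise ValueError(f"unknown mnemonic {mnemonic!r}")
        out.append(MNEMONICS[mnemonic])
        i += 2
    return "".join(out)


print(decode("K&o/benhavn, R&&D"))  # København, R&D
```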

   Note that many characters within a mnemonic character set may be
   represented in two different ways.  Normally the character itself is
   used, but it is also possible to use the mnemonic allocated to the
   character in a mnemonic sequence.

   In this way all characters with assigned mnemonics can be represented
   without information loss in any character set that contains the
   invariant ISO 646 characters as a subset.  As a consequence, using a
   mnemonic character set all these characters can be generated
   uniformly on all keyboards and presented uniformly on all terminal
   equipment, whenever the real character is not available.

   Data encoded in a mnemonic charset is intended to be read by the end
   user possibly without further treatment.  If the transport encoding
   and the presentation encoding for the user differ, it is recommended
   that the data be translated into a mnemonic representation in the
   presentation encoding.

   A mnemonic charset is specified with the name
   "mnemonic+charset+intro" where "mnemonic" is written as given and
   "charset" and "intro" are specified as described below. The mnemonic
   charset "mnemonic" is a shorthand for "mnemonic+ascii+38".  The
   mnemonic charset "mnem" is a shorthand for "mnemonic+ascii+8200".
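
   Splitting such a name into its parts can be sketched as follows.
   The function name is hypothetical, and the sketch assumes that names
   are matched exactly as written and that the transport charset name
   contains no "+" character.

```python
# The two shorthands given in the text.
SHORTHANDS = {
    "mnemonic": "mnemonic+ascii+38",
    "mnem": "mnemonic+ascii+8200",
}


def parse_mnemonic_charset(name: str):
    """Return (transport charset, intro value) for a mnemonic charset name."""
    name = SHORTHANDS.get(name, name)
    parts = name.split("+")
    if len(parts) != 3 or parts[0] != "mnemonic" or not parts[2].isdigit():
        raise ValueError(f"not a mnemonic charset name: {name!r}")
    return parts[1], int(parts[2])


print(parse_mnemonic_charset("mnem"))               # ('ascii', 8200)
print(parse_mnemonic_charset("mnemonic+ascii+29"))  # ('ascii', 29)
```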

   The use of mnemonics for Chinese characters of either Chinese,
   Japanese or Korean origin is discouraged, as the probability that
   the end user equipment can deal with the original encoding is very
   high for the intended receiver, and the mnemonics for such Chinese
   characters described in this memo convey very little meaning to
   humans.

4.3.1  charset

   The charset is given as one of the charset names in this memo and is
   the encoding used for the transport.  It cannot be a mnemonic
   charset.

4.3.2  Intro

   The intro character sequence is given as the decimal value of the
   intro characters in the transport character set. There may be up to
   two characters in the intro character sequence; the decimal value of
   a two-character intro sequence is the first character value
   multiplied by 256 raised to the power of the number of octets used
   in the character set, plus the second character value.  The
   recommended value is 38 for the ampersand (&) character in ASCII.
   Other common values are 29 for the control character "Group
   Separator" and 8200 for "space" followed by "backspace", which may
   be convenient in some environments and leaves ordinary text
   unchanged.  Only the ampersand character may be chosen as intro from
   the invariant ISO 646 charset, but any character not in the
   invariant ISO 646 charset can be used as intro.  The intro
   character sequence is used for introducing character mnemonics when a


