Discussion:
DEC-MCS mapping, anyone?
Addison Phillips [wM]
2003-10-10 17:05:05 UTC
I'm looking for an authoritative mapping table of this old VMS encoding to Unicode and can't find one anywhere. Anyone got one handy?

Thanks in advance!

Addison

Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility

432 Lakeside Drive, Sunnyvale, CA, USA
+1 408.962.5487 (office) +1 408.210.3569 (mobile)
mailto:***@webmethods.com

Chair, W3C-I18N-WG, Web Services Task Force
http://www.w3.org/International/ws

Internationalization is an architecture.
It is not a feature.



------------------------ Yahoo! Groups Sponsor ---------------------~-->
KnowledgeStorm has over 22,000 B2B technology solutions. The most comprehensive IT buyers' information available. Research, compare, decide. E-Commerce | Application Dev | Accounting-Finance | Healthcare | Project Mgt | Sales-Marketing | More
http://us.click.yahoo.com/IMai8D/UYQGAA/cIoLAA/8FfwlB/TM
---------------------------------------------------------------------~->

To Unsubscribe, send a blank message to: unicode-***@yahooGroups.com

This mailing list is just an archive. The instructions to join the true Unicode List are on http://www.unicode.org/unicode/consortium/distlist.html


Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
John Cowan
2003-10-10 17:58:38 UTC
Post by Addison Phillips [wM]
I'm looking for an authoritative mapping table of this old VMS encoding
to Unicode and can't find one anywhere. Anyone got one handy?
http://crl.nmsu.edu/~mleisher/csets/DECMCS.TXT is not authoritative, but
it's the best I could do based on the chart I was working from. It should
be good enough for government work. :-)

Thanks to Mark Leisher for publishing it on his site.
--
All Gaul is divided into three parts: the part John Cowan
that cooks with lard and goose fat, the part www.ccil.org/~cowan
that cooks with olive oil, and the part that www.reutershealth.com
cooks with butter. -- David Chessler ***@reutershealth.com


Tim Greenwood
2003-10-10 18:28:54 UTC
The chart that John references looks good to me. DEC MCS (aka DEC
Standard 169) preceded 8859-1 and differs from it in the following ways:

Code points A0 A4 A6 AC AD AE AF B4 B8 BE D0 DE F0 FE FF were undefined
(left open for future standardization that never occurred)

A8 was general currency sign
D7 Uppercase OE ligature
DD Uppercase Y diaeresis
F7 Lowercase oe ligature
FD Lowercase y diaeresis

- Tim
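
For anyone who wants to turn the list above into something executable, here is a minimal sketch in Python. It assumes that every position not mentioned above is identical to ISO 8859-1, and that FD maps to U+00FF (the lowercase y diaeresis of Latin-1). The names UNDEFINED, REMAPPED and decode_dec_mcs are made up for this illustration; this is not an authoritative table.

```python
# Positions listed above as undefined in DEC MCS.
UNDEFINED = frozenset(bytes.fromhex("A0A4A6ACADAEAFB4B8BED0DEF0FEFF"))

# Positions that are defined but differ from ISO 8859-1.
REMAPPED = {
    0xA8: "\u00A4",  # general currency sign
    0xD7: "\u0152",  # uppercase OE ligature
    0xDD: "\u0178",  # uppercase Y diaeresis
    0xF7: "\u0153",  # lowercase oe ligature
    0xFD: "\u00FF",  # lowercase y diaeresis (assumed target)
}

def build_table():
    """Start from Latin-1, drop the undefined bytes, apply the remaps."""
    table = {}
    for b in range(256):
        if b in UNDEFINED:
            continue
        # chr(b) equals the Latin-1 decoding of byte b (assumption: all
        # other positions match ISO 8859-1).
        table[b] = REMAPPED.get(b, chr(b))
    return table

DEC_MCS = build_table()

def decode_dec_mcs(data, errors="strict"):
    """Decode bytes; undefined positions raise or become U+FFFD."""
    out = []
    for b in data:
        ch = DEC_MCS.get(b)
        if ch is None:
            if errors == "strict":
                raise ValueError(f"byte 0x{b:02X} is undefined in DEC MCS")
            ch = "\uFFFD"
        out.append(ch)
    return "".join(out)
```

Calling decode_dec_mcs(data, errors="replace") keeps going past undefined bytes, which may be more practical for old files of uncertain origin.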




Philippe Verdy
2003-10-10 21:39:01 UTC
Some good old DEC Ultrix and VAX VMS references give hints about this legacy
character set, which was the default in the standard VT1xx to VT3xx terminal
emulations (note that most terminal emulators now default to ISO-8859-1
instead of DEC-MCS, which is sometimes available only as an option):

- ISO 8859-1 National Character Set FAQ
http://www.ciesin.ee/OTHER/NCS-FAQ
(many other sources found in Google for the same FAQ)

- Message Exchange for VMS 5.5+ Release Notes
http://www.tmk.com/ftp/ftp-madgoat-com/mx/mx052/mx052.release_notes

Note also that the glibc NLS support (gconv) tables include a good table to
support DEC-MCS...

See also RFC 1340, "Assigned Numbers", by J. Reynolds & J. Postel (published
July 1992), page 80 of 139 (was STANDARD, obsoletes RFC 1060, obsoleted by
RFC 1700; now HISTORIC), where DEC-MCS is listed with this reference:

[170,KSX2]: Simonsen, K., "Character Mnemonics & Character Sets",
RFC 1345, Rationel Almen Planlaegning, June 1992.

Now looking in RFC 1345, I see this text:

&charset DEC-MCS
&rem VAX/VMS User's Manual, Order Number: AI-Y517A-TE, April 1986.
&alias dec
&code 0
NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI
DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US
SP ! " Nb DO % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
At A B C D E F G H I J K L M N O P Q R S T U V W X Y Z <( // )> '> _
'! a b c d e f g h i j k l m n o p q r s t u v w x y z (! !! !) '? DT
PA HO BH NH IN NL SA ES HS HJ VS PD PU RI S2 S3
DC P1 P2 TS CC MW SG EG SS GC SC CI ST OC PM AC
?? !I Ct Pd ?? Ye ?? SE Cu Co -a << ?? ?? ?? ??
DG +- 2S 3S ?? My PI .M ?? 1S -o >> 14 12 ?? ?I
A! A' A> A? A: AA AE C, E! E' E> E: I! I' I> I:
?? N? O! O' O> O? O: OE O/ U! U' U> U: Y: ?? ss
a! a' a> a? a: aa ae c, e! e' e> e: i! i' i> i:
?? n? o! o' o> o? o: oe o/ u! u' u> u: y: ?? ??

This seems authoritative, but the mnemonics used in that RFC aren't clear...
And RFC 1345 was (and still is) only INFORMATIONAL...
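
The grid quoted above can at least be read mechanically: it has 256 whitespace-separated mnemonics, so the position of each one in the sequence is its code value, and "??" marks an unassigned position. A small Python sketch follows; the names RFC1345_DEC_MCS, parse_grid, MNEMONICS and UNASSIGNED are invented for this example.

```python
# The &code 0 grid for DEC-MCS, copied from the RFC 1345 text quoted above.
RFC1345_DEC_MCS = r"""
NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI
DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US
SP ! " Nb DO % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
At A B C D E F G H I J K L M N O P Q R S T U V W X Y Z <( // )> '> _
'! a b c d e f g h i j k l m n o p q r s t u v w x y z (! !! !) '? DT
PA HO BH NH IN NL SA ES HS HJ VS PD PU RI S2 S3
DC P1 P2 TS CC MW SG EG SS GC SC CI ST OC PM AC
?? !I Ct Pd ?? Ye ?? SE Cu Co -a << ?? ?? ?? ??
DG +- 2S 3S ?? My PI .M ?? 1S -o >> 14 12 ?? ?I
A! A' A> A? A: AA AE C, E! E' E> E: I! I' I> I:
?? N? O! O' O> O? O: OE O/ U! U' U> U: Y: ?? ss
a! a' a> a? a: aa ae c, e! e' e> e: i! i' i> i:
?? n? o! o' o> o? o: oe o/ u! u' u> u: y: ?? ??
"""

def parse_grid(grid):
    """Split the grid into mnemonics; list index == code value."""
    tokens = grid.split()
    if len(tokens) != 256:
        raise ValueError(f"expected 256 mnemonics, got {len(tokens)}")
    return tokens

MNEMONICS = parse_grid(RFC1345_DEC_MCS)

# "??" is the placeholder used in the grid for unassigned positions.
UNASSIGNED = [code for code, m in enumerate(MNEMONICS) if m == "??"]

print(" ".join(f"{code:02X}" for code in UNASSIGNED))
# prints: A0 A4 A6 AC AD AE AF B4 B8 BE D0 DE F0 FE FF
```

The printed positions are the same fifteen that Tim Greenwood listed as undefined, so the two sources agree on where the holes are.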


Philippe Verdy
2003-10-10 21:59:14 UTC
I forgot this page:
http://std.dkuug.dk/i18n/charmaps/DEC-MCS
taken from a large list of downloadable charmaps referenced by
www.diffuse.org/chars.html



Philippe Verdy
2003-10-10 22:21:12 UTC
This mapping table would work...

Mnemonic Position Unicode CharacterName
<NU> /x00 <U0000> NULL (NUL)
<SH> /x01 <U0001> START OF HEADING (SOH)
<SX> /x02 <U0002> START OF TEXT (STX)
<EX> /x03 <U0003> END OF TEXT (ETX)
<ET> /x04 <U0004> END OF TRANSMISSION (EOT)
<EQ> /x05 <U0005> ENQUIRY (ENQ)
<AK> /x06 <U0006> ACKNOWLEDGE (ACK)
<BL> /x07 <U0007> BELL (BEL)
<BS> /x08 <U0008> BACKSPACE (BS)
<HT> /x09 <U0009> CHARACTER TABULATION (HT)
<LF> /x0A <U000A> LINE FEED (LF)
<VT> /x0B <U000B> LINE TABULATION (VT)
<FF> /x0C <U000C> FORM FEED (FF)
<CR> /x0D <U000D> CARRIAGE RETURN (CR)
<SO> /x0E <U000E> SHIFT OUT (SO)
<SI> /x0F <U000F> SHIFT IN (SI)
<DL> /x10 <U0010> DATALINK ESCAPE (DLE)
<D1> /x11 <U0011> DEVICE CONTROL ONE (DC1)
<D2> /x12 <U0012> DEVICE CONTROL TWO (DC2)
<D3> /x13 <U0013> DEVICE CONTROL THREE (DC3)
<D4> /x14 <U0014> DEVICE CONTROL FOUR (DC4)
<NK> /x15 <U0015> NEGATIVE ACKNOWLEDGE (NAK)
<SY> /x16 <U0016> SYNCHRONOUS IDLE (SYN)
<EB> /x17 <U0017> END OF TRANSMISSION BLOCK (ETB)
<CN> /x18 <U0018> CANCEL (CAN)
<EM> /x19 <U0019> END OF MEDIUM (EM)
<SB> /x1A <U001A> SUBSTITUTE (SUB)
<EC> /x1B <U001B> ESCAPE (ESC)
<FS> /x1C <U001C> FILE SEPARATOR (IS4)
<GS> /x1D <U001D> GROUP SEPARATOR (IS3)
<RS> /x1E <U001E> RECORD SEPARATOR (IS2)
<US> /x1F <U001F> UNIT SEPARATOR (IS1)
<SP> /x20 <U0020> SPACE
<!> /x21 <U0021> EXCLAMATION MARK
<"> /x22 <U0022> QUOTATION MARK
<Nb> /x23 <U0023> NUMBER SIGN
<DO> /x24 <U0024> DOLLAR SIGN
<%> /x25 <U0025> PERCENT SIGN
<&> /x26 <U0026> AMPERSAND
<'> /x27 <U0027> APOSTROPHE
<(> /x28 <U0028> LEFT PARENTHESIS
<)> /x29 <U0029> RIGHT PARENTHESIS
<*> /x2A <U002A> ASTERISK
<+> /x2B <U002B> PLUS SIGN
<,> /x2C <U002C> COMMA
<-> /x2D <U002D> HYPHEN-MINUS
<.> /x2E <U002E> FULL STOP
<//> /x2F <U002F> SOLIDUS
<0> /x30 <U0030> DIGIT ZERO
<1> /x31 <U0031> DIGIT ONE
<2> /x32 <U0032> DIGIT TWO
<3> /x33 <U0033> DIGIT THREE
<4> /x34 <U0034> DIGIT FOUR
<5> /x35 <U0035> DIGIT FIVE
<6> /x36 <U0036> DIGIT SIX
<7> /x37 <U0037> DIGIT SEVEN
<8> /x38 <U0038> DIGIT EIGHT
<9> /x39 <U0039> DIGIT NINE
<:> /x3A <U003A> COLON
<;> /x3B <U003B> SEMICOLON
<<> /x3C <U003C> LESS-THAN SIGN
<=> /x3D <U003D> EQUALS SIGN
</>> /x3E <U003E> GREATER-THAN SIGN
<?> /x3F <U003F> QUESTION MARK
<At> /x40 <U0040> COMMERCIAL AT
<A> /x41 <U0041> LATIN CAPITAL LETTER A
<B> /x42 <U0042> LATIN CAPITAL LETTER B
<C> /x43 <U0043> LATIN CAPITAL LETTER C
<D> /x44 <U0044> LATIN CAPITAL LETTER D
<E> /x45 <U0045> LATIN CAPITAL LETTER E
<F> /x46 <U0046> LATIN CAPITAL LETTER F
<G> /x47 <U0047> LATIN CAPITAL LETTER G
<H> /x48 <U0048> LATIN CAPITAL LETTER H
<I> /x49 <U0049> LATIN CAPITAL LETTER I
<J> /x4A <U004A> LATIN CAPITAL LETTER J
<K> /x4B <U004B> LATIN CAPITAL LETTER K
<L> /x4C <U004C> LATIN CAPITAL LETTER L
<M> /x4D <U004D> LATIN CAPITAL LETTER M
<N> /x4E <U004E> LATIN CAPITAL LETTER N
<O> /x4F <U004F> LATIN CAPITAL LETTER O
<P> /x50 <U0050> LATIN CAPITAL LETTER P
<Q> /x51 <U0051> LATIN CAPITAL LETTER Q
<R> /x52 <U0052> LATIN CAPITAL LETTER R
<S> /x53 <U0053> LATIN CAPITAL LETTER S
<T> /x54 <U0054> LATIN CAPITAL LETTER T
<U> /x55 <U0055> LATIN CAPITAL LETTER U
<V> /x56 <U0056> LATIN CAPITAL LETTER V
<W> /x57 <U0057> LATIN CAPITAL LETTER W
<X> /x58 <U0058> LATIN CAPITAL LETTER X
<Y> /x59 <U0059> LATIN CAPITAL LETTER Y
<Z> /x5A <U005A> LATIN CAPITAL LETTER Z
<<(> /x5B <U005B> LEFT SQUARE BRACKET
<////> /x5C <U005C> REVERSE SOLIDUS
<)/>> /x5D <U005D> RIGHT SQUARE BRACKET
<'/>> /x5E <U005E> CIRCUMFLEX ACCENT
<_> /x5F <U005F> LOW LINE
<'!> /x60 <U0060> GRAVE ACCENT
<a> /x61 <U0061> LATIN SMALL LETTER A
<b> /x62 <U0062> LATIN SMALL LETTER B
<c> /x63 <U0063> LATIN SMALL LETTER C
<d> /x64 <U0064> LATIN SMALL LETTER D
<e> /x65 <U0065> LATIN SMALL LETTER E
<f> /x66 <U0066> LATIN SMALL LETTER F
<g> /x67 <U0067> LATIN SMALL LETTER G
<h> /x68 <U0068> LATIN SMALL LETTER H
<i> /x69 <U0069> LATIN SMALL LETTER I
<j> /x6A <U006A> LATIN SMALL LETTER J
<k> /x6B <U006B> LATIN SMALL LETTER K
<l> /x6C <U006C> LATIN SMALL LETTER L
<m> /x6D <U006D> LATIN SMALL LETTER M
<n> /x6E <U006E> LATIN SMALL LETTER N
<o> /x6F <U006F> LATIN SMALL LETTER O
<p> /x70 <U0070> LATIN SMALL LETTER P
<q> /x71 <U0071> LATIN SMALL LETTER Q
<r> /x72 <U0072> LATIN SMALL LETTER R
<s> /x73 <U0073> LATIN SMALL LETTER S
<t> /x74 <U0074> LATIN SMALL LETTER T
<u> /x75 <U0075> LATIN SMALL LETTER U
<v> /x76 <U0076> LATIN SMALL LETTER V
<w> /x77 <U0077> LATIN SMALL LETTER W
<x> /x78 <U0078> LATIN SMALL LETTER X
<y> /x79 <U0079> LATIN SMALL LETTER Y
<z> /x7A <U007A> LATIN SMALL LETTER Z
<(!> /x7B <U007B> LEFT CURLY BRACKET
<!!> /x7C <U007C> VERTICAL LINE
<!)> /x7D <U007D> RIGHT CURLY BRACKET
<'?> /x7E <U007E> TILDE
<DT> /x7F <U007F> DELETE (DEL)
<PA> /x80 <U0080> PADDING CHARACTER (PAD)
<HO> /x81 <U0081> HIGH OCTET PRESET (HOP)
<BH> /x82 <U0082> BREAK PERMITTED HERE (BPH)
<NH> /x83 <U0083> NO BREAK HERE (NBH)
<IN> /x84 <U0084> INDEX (IND)
<NL> /x85 <U0085> NEXT LINE (NEL)
<SA> /x86 <U0086> START OF SELECTED AREA (SSA)
<ES> /x87 <U0087> END OF SELECTED AREA (ESA)
<HS> /x88 <U0088> CHARACTER TABULATION SET (HTS)
<HJ> /x89 <U0089> CHARACTER TABULATION WITH JUSTIFICATION (HTJ)
<VS> /x8A <U008A> LINE TABULATION SET (VTS)
<PD> /x8B <U008B> PARTIAL LINE FORWARD (PLD)
<PU> /x8C <U008C> PARTIAL LINE BACKWARD (PLU)
<RI> /x8D <U008D> REVERSE LINE FEED (RI)
<S2> /x8E <U008E> SINGLE-SHIFT TWO (SS2)
<S3> /x8F <U008F> SINGLE-SHIFT THREE (SS3)
<DC> /x90 <U0090> DEVICE CONTROL STRING (DCS)
<P1> /x91 <U0091> PRIVATE USE ONE (PU1)
<P2> /x92 <U0092> PRIVATE USE TWO (PU2)
<TS> /x93 <U0093> SET TRANSMIT STATE (STS)
<CC> /x94 <U0094> CANCEL CHARACTER (CCH)
<MW> /x95 <U0095> MESSAGE WAITING (MW)
<SG> /x96 <U0096> START OF GUARDED AREA (SPA)
<EG> /x97 <U0097> END OF GUARDED AREA (EPA)
<SS> /x98 <U0098> START OF STRING (SOS)
<GC> /x99 <U0099> SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI)
<SC> /x9A <U009A> SINGLE CHARACTER INTRODUCER (SCI)
<CI> /x9B <U009B> CONTROL SEQUENCE INTRODUCER (CSI)
<ST> /x9C <U009C> STRING TERMINATOR (ST)
<OC> /x9D <U009D> OPERATING SYSTEM COMMAND (OSC)
<PM> /x9E <U009E> PRIVACY MESSAGE (PM)
<AC> /x9F <U009F> APPLICATION PROGRAM COMMAND (APC)
<!I> /xA1 <U00A1> INVERTED EXCLAMATION MARK
<Ct> /xA2 <U00A2> CENT SIGN
<Pd> /xA3 <U00A3> POUND SIGN
<Ye> /xA5 <U00A5> YEN SIGN
<SE> /xA7 <U00A7> SECTION SIGN
<Cu> /xA8 <U00A4> CURRENCY SIGN
<Co> /xA9 <U00A9> COPYRIGHT SIGN
<-a> /xAA <U00AA> FEMININE ORDINAL INDICATOR
<<<> /xAB <U00AB> LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
<DG> /xB0 <U00B0> DEGREE SIGN
<+-> /xB1 <U00B1> PLUS-MINUS SIGN
<2S> /xB2 <U00B2> SUPERSCRIPT TWO
<3S> /xB3 <U00B3> SUPERSCRIPT THREE
<My> /xB5 <U00B5> MICRO SIGN
<PI> /xB6 <U00B6> PILCROW SIGN
<.M> /xB7 <U00B7> MIDDLE DOT
<1S> /xB9 <U00B9> SUPERSCRIPT ONE
<-o> /xBA <U00BA> MASCULINE ORDINAL INDICATOR
</>/>> /xBB <U00BB> RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
<14> /xBC <U00BC> VULGAR FRACTION ONE QUARTER
<12> /xBD <U00BD> VULGAR FRACTION ONE HALF
<?I> /xBF <U00BF> INVERTED QUESTION MARK
<A!> /xC0 <U00C0> LATIN CAPITAL LETTER A WITH GRAVE
<A'> /xC1 <U00C1> LATIN CAPITAL LETTER A WITH ACUTE
<A/>> /xC2 <U00C2> LATIN CAPITAL LETTER A WITH CIRCUMFLEX
<A?> /xC3 <U00C3> LATIN CAPITAL LETTER A WITH TILDE
<A:> /xC4 <U00C4> LATIN CAPITAL LETTER A WITH DIAERESIS
<AA> /xC5 <U00C5> LATIN CAPITAL LETTER A WITH RING ABOVE
<AE> /xC6 <U00C6> LATIN CAPITAL LETTER AE
<C,> /xC7 <U00C7> LATIN CAPITAL LETTER C WITH CEDILLA
<E!> /xC8 <U00C8> LATIN CAPITAL LETTER E WITH GRAVE
<E'> /xC9 <U00C9> LATIN CAPITAL LETTER E WITH ACUTE
<E/>> /xCA <U00CA> LATIN CAPITAL LETTER E WITH CIRCUMFLEX
<E:> /xCB <U00CB> LATIN CAPITAL LETTER E WITH DIAERESIS
<I!> /xCC <U00CC> LATIN CAPITAL LETTER I WITH GRAVE
<I'> /xCD <U00CD> LATIN CAPITAL LETTER I WITH ACUTE
<I/>> /xCE <U00CE> LATIN CAPITAL LETTER I WITH CIRCUMFLEX
<I:> /xCF <U00CF> LATIN CAPITAL LETTER I WITH DIAERESIS
<N?> /xD1 <U00D1> LATIN CAPITAL LETTER N WITH TILDE
<O!> /xD2 <U00D2> LATIN CAPITAL LETTER O WITH GRAVE
<O'> /xD3 <U00D3> LATIN CAPITAL LETTER O WITH ACUTE
<O/>> /xD4 <U00D4> LATIN CAPITAL LETTER O WITH CIRCUMFLEX
<O?> /xD5 <U00D5> LATIN CAPITAL LETTER O WITH TILDE
<O:> /xD6 <U00D6> LATIN CAPITAL LETTER O WITH DIAERESIS
<OE> /xD7 <U0152> LATIN CAPITAL LIGATURE OE
<O//> /xD8 <U00D8> LATIN CAPITAL LETTER O WITH STROKE
<U!> /xD9 <U00D9> LATIN CAPITAL LETTER U WITH GRAVE
<U'> /xDA <U00DA> LATIN CAPITAL LETTER U WITH ACUTE
<U/>> /xDB <U00DB> LATIN CAPITAL LETTER U WITH CIRCUMFLEX
<U:> /xDC <U00DC> LATIN CAPITAL LETTER U WITH DIAERESIS
<Y:> /xDD <U0178> LATIN CAPITAL LETTER Y WITH DIAERESIS
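
Rows in this layout are easy to load programmatically. The sketch below, in Python, assumes that "/" is the escape character inside the mnemonic (so "/>" is a literal ">" and "//" a literal "/"), which is how the rows above appear to be written. LINE_RE, parse_charmap and SAMPLE are names made up for this illustration, and the sample only repeats a few rows from the table.

```python
import re

# One row: <mnemonic> /xHH <UXXXX> CHARACTER NAME
# Inside the mnemonic, "/" escapes the following character (assumption).
LINE_RE = re.compile(
    r"^<(?P<mnemonic>(?:/.|[^/>])+)>\s+"
    r"/x(?P<byte>[0-9A-Fa-f]{2})\s+"
    r"<U(?P<cp>[0-9A-Fa-f]{4})>\s+"
    r"(?P<name>.+)$"
)

def parse_charmap(text):
    """Return {byte value: (character, name)} for every matching row."""
    table = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # header line, blank line, or anything else
        table[int(m["byte"], 16)] = (chr(int(m["cp"], 16)), m["name"])
    return table

SAMPLE = """\
Mnemonic Position Unicode CharacterName
<Cu> /xA8 <U00A4> CURRENCY SIGN
</>/>> /xBB <U00BB> RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
<OE> /xD7 <U0152> LATIN CAPITAL LIGATURE OE
<Y:> /xDD <U0178> LATIN CAPITAL LETTER Y WITH DIAERESIS
"""

table = parse_charmap(SAMPLE)
for byte, (char, name) in sorted(table.items()):
    print(f"0x{byte:02X} -> U+{ord(char):04X} {name}")
```

Positions that have no row simply stay absent from the dictionary, which keeps the unassigned bytes distinguishable from the assigned ones.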



Philippe Verdy
2003-10-11 00:43:26 UTC
Note that the IBM ICU data for DEC-MCS is based on the glibc implementation,
which is not authoritative for this DEC charset: glibc "fills the holes" by
mapping ISO-8859-1 characters to the unassigned positions. This is a common
practice that maximizes compatibility with ISO-8859-1 (and thus with Unicode
code points in the range U+0000 to U+00FF).

The listed incompatibilities of DEC-MCS with ISO-8859-1 are only these three:

<!--
<OE> /xD7 <U0152> LATIN CAPITAL LIGATURE OE,
instead of:
<??> /xD7 <U+00D7> MULTIPLICATION SIGN
-->
<a u="0152" b="D7"/>

<!--
<oe> /xF7 <U0153> LATIN SMALL LIGATURE OE
instead of:
<??> /xF7 <U+00F7> DIVISION SIGN
-->
<a u="0153" b="F7"/>

<!--
<Y:> /xDD <U0178> LATIN CAPITAL LETTER Y WITH DIAERESIS
instead of:
<??> /xDD <U+00DD> LATIN CAPITAL LETTER Y WITH ACUTE
-->
<a u="0178" b="DD"/>
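
To illustrate the "fill the holes" approach with these three entries, here is a rough Python sketch. It assumes the `<a u="..." b="..."/>` elements sit inside some wrapper element (called `<assignments>` here purely for the example) and that every byte without an override decodes as ISO-8859-1. The names SNIPPET, load_overrides and decode_filled are invented, and only the three overrides quoted above are included.

```python
import xml.etree.ElementTree as ET

# The three assignment elements quoted above, in a hypothetical wrapper.
SNIPPET = """<assignments>
  <a u="0152" b="D7"/>
  <a u="0153" b="F7"/>
  <a u="0178" b="DD"/>
</assignments>"""

def load_overrides(xml_text):
    """Map byte value -> character for every <a u=... b=.../> element."""
    root = ET.fromstring(xml_text)
    return {
        int(a.get("b"), 16): chr(int(a.get("u"), 16))
        for a in root.iter("a")
    }

def decode_filled(data, overrides):
    """Decode with overrides; all other bytes fall back to Latin-1.

    chr(b) for b in 0..255 is exactly the ISO-8859-1 decoding, so the
    unassigned DEC-MCS positions get "filled" with Latin-1 characters.
    """
    return "".join(overrides.get(b, chr(b)) for b in data)

overrides = load_overrides(SNIPPET)
print(decode_filled(b"c\xf7ur", overrides))
```

A decoder built this way never fails, which is convenient, but it cannot tell a genuine DEC-MCS byte from one that was never assigned in the original set.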

In the original DEC-MCS set, the following positions were normally unassigned
(except perhaps in some extension found in more recent versions of VMS???),
whereas the iconv tables for glibc assign them:
<??> /xA0 <U+00A0 in iconv???>
<??> /xA4 <U+00A4 in iconv???>
<??> /xA6 <U+00A6 in iconv???>
<??> /xAC <U+00AC in iconv???>
<??> /xAD <U+00AD in iconv???>
<??> /xAE <U+00AE in iconv???>
<??> /xAF <U+00AF in iconv???>
<??> /xB4 <U+00B4 in iconv???>
<??> /xB8 <U+00B8 in iconv???>
<??> /xBE <U+00BE in iconv???>
<??> /xD0 <U+00D0 in iconv???>
<??> /xDE <U+00DE in iconv???>
<??> /xF0 <U+00F0 in iconv???>
<??> /xFD <U+00FD in iconv???>
<??> /xFE <U+00FE in iconv???>

Additionally some characters were at other places (adding another
incompatibility...):
<??> /xA8 <U+00A4>

I have also found extensions of DEC-MCS built for DBCS handling, but I don't
have details about how they work (for Japanese?), whether they used the
ISO 2022 encoding model, or whether they used some DEC VT controls in the
encoded range /x80 to /x9F, which includes two private-use positions at /x91
and /x92...
If someone has kept the technical references for the old DEC VT terminals
that used them, maybe they could answer more precisely... Or ask Compaq or
HP now?



Frank da Cruz
2003-10-11 15:33:25 UTC
I've added a DEC MCS table to the character tables at:

http://www.columbia.edu/kermit/csettables.html

- Frank

