Discussion:
[OT?] QBCS
Marco Cimarosti
2003-08-28 11:38:26 UTC
It seems that the IT world has a new acronym: "QBCS". I understand that it
stands for "quadra-byte character set", and I heard it used to refer to GB
18030.

My question is: is it just a fancy synonym for GB 18030, or can it also refer
to Unicode or other encodings?

Thanks in advance.

_ Marco


------------------------ Yahoo! Groups Sponsor ---------------------~-->
KnowledgeStorm has over 22,000 B2B technology solutions. The most comprehensive IT buyers' information available. Research, compare, decide. E-Commerce | Application Dev | Accounting-Finance | Healthcare | Project Mgt | Sales-Marketing | More
http://us.click.yahoo.com/IMai8D/UYQGAA/cIoLAA/8FfwlB/TM
---------------------------------------------------------------------~->

To Unsubscribe, send a blank message to: unicode-***@yahooGroups.com

This mailing list is just an archive. The instructions to join the true Unicode List are on http://www.unicode.org/unicode/consortium/distlist.html


Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
Lars Marius Garshol
2003-08-28 14:31:55 UTC
* Marco Cimarosti
|
| It seems that the IT world has a new acronym: "QBCS". I understand
| that it stands for "quadra-byte character set", and I heard it used
| to refer to GB 18030.
|
| My question is: is it just a fancy synonym for GB 18030, or can it also
| refer to Unicode or other encodings?

This must be an oxymoron, in the sense that character sets don't
really have a byte width, being completely abstract assignments of
abstract characters to abstract numbers.

So what it really means must be "quadra-byte character encoding", and
both GB 18030 and UTF-32 should fit into that category.
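
A minimal Python sketch of that distinction, assuming the standard codecs that ship with CPython. The sample character and the list of encodings are illustrative choices only: the coded character set assigns one number to the character, and the byte width appears only once an encoding is chosen.

```python
# One abstract character, one code point, several byte-level encodings.
ch = "\u4e2d"  # U+4E2D, a common Chinese character (illustrative choice)

# The abstract number assigned by the coded character set.
code_point = ord(ch)

# The byte length is a property of the encoding, not of the character set.
byte_lengths = {
    enc: len(ch.encode(enc))
    for enc in ("utf-8", "utf-16-be", "utf-32-be", "gb18030")
}

print(f"U+{code_point:04X}", byte_lengths)
```

The same code point comes out as two, three, or four bytes depending on the encoding, which is why a byte width attached to a "character set" is at best shorthand.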
--
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50 <URL: http://www.garshol.priv.no >



Doug Ewell
2003-08-29 03:30:03 UTC
Lars Marius Garshol <larsga at garshol dot priv dot no> quoted Marco
Post by Lars Marius Garshol
| It seems that the IT world has a new acronym: "QBCS". I understand
| that it stands for "quadra-byte character set", and I heard it used
| to refer to GB 18030.
|
| My question is: is it just a fancy synonym for GB 18030, or can it also
| refer to Unicode or other encodings?
The original term "DBCS," or "double-byte character set," refers to a
variable-width encoding where each character requires either one or two
bytes. East Asian legacy character encodings fall into this category.

By extension, then, a "QBCS" would be a variable-width character
encoding where the code units can be anywhere from one to four bytes
long -- an apt description of GB 18030.
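
As a rough illustration, the per-character byte widths can be measured with Python's built-in codecs. This is only a sketch: `byte_widths` is a hypothetical helper, GBK stands in for a typical East Asian "DBCS", and the sample characters are arbitrary. Encoding one character at a time is only valid here because both encodings are stateless.

```python
def byte_widths(text: str, encoding: str) -> list[int]:
    """Return the encoded length in bytes of each character in text."""
    return [len(ch.encode(encoding)) for ch in text]

# ASCII letter, common hanzi, CJK Extension B ideograph (arbitrary samples).
sample = "A\u4e2d\U00020000"

gb18030_widths = byte_widths(sample, "gb18030")
gbk_widths = byte_widths(sample[:2], "gbk")

# GBK tops out at two bytes, so the Extension B character is not encodable.
try:
    sample[2].encode("gbk")
    gbk_can_encode_ext_b = True
except UnicodeEncodeError:
    gbk_can_encode_ext_b = False

print(gb18030_widths, gbk_widths, gbk_can_encode_ext_b)
```

With Python's codec, GB 18030 characters come out as one, two, or four bytes, while GBK stays within one or two.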

Paradoxically (at least to me), the term "multi-byte character set"
refers to a fixed-width encoding, such as UCS-2. The official name of
ISO/IEC 10646 is "Universal Multiple-Octet Coded Character Set."

(BTW, pet peeve: The word "acronym" should only be used to mean a
pronounceable WORD ("nym") formed from the initials of other words.
Classic examples are "scuba" and "radar." If you can figure out how to
pronounce "qbcs," more power to you, but to me it's just an
abbreviation.)
Post by Lars Marius Garshol
This must be an oxymoron, in the sense that character sets don't
really have a byte width, being completely abstract assignments of
abstract characters to abstract numbers.
This is technically true, but the terms SBCS and DBCS are so entrenched
in the industry that it doesn't seem useful to try to deprecate them
now.
Post by Lars Marius Garshol
So what it really means must be "quadra-byte character encoding", and
both GB 18030 and UTF-32 should fit into that category.
GB 18030, yes, because its code units vary from one to four bytes in
length. UTF-32, no, because its code units are uniformly 32 bits.
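
The practical difference can be sketched in Python, assuming the built-in `utf-32-be` and `gb18030` codecs. The helper name and sample string are made up for illustration. With a fixed-width encoding, the n-th character sits at a computable offset; with a variable-width one, the offsets depend on everything that came before.

```python
import struct
from itertools import accumulate

sample = "A\u4e2d\U00020000x"  # arbitrary sample text

utf32 = sample.encode("utf-32-be")

def nth_char_utf32(data: bytes, n: int) -> str:
    """Fetch the n-th character directly: every code unit is 4 bytes."""
    (value,) = struct.unpack_from(">I", data, 4 * n)
    return chr(value)

# In GB 18030 the starting offset of each character is a running sum
# of the widths of all preceding characters.
widths = [len(ch.encode("gb18030")) for ch in sample]
gb_offsets = [0, *accumulate(widths)][:-1]

print(len(utf32), repr(nth_char_utf32(utf32, 2)), gb_offsets)
```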

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/



Tex Texin
2003-09-01 17:43:51 UTC
Doug,

In most industry usages, MBCS refers to variable width encodings, not fixed
width.

tex
Post by Doug Ewell
Paradoxically (at least to me), the term "multi-byte character set"
refers to a fixed-width encoding, such as UCS-2. The official name of
ISO/IEC 10646 is "Universal Multiple-Octet Coded Character Set."
--
-------------------------------------------------------------
Tex Texin cell: +1 781 789 1898 mailto:***@XenCraft.com
Xen Master http://www.i18nGuy.com

XenCraft http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------


Doug Ewell
2003-09-02 03:26:54 UTC
Post by Tex Texin
In most industry usages, MBCS refers to variable width encodings, not
fixed width.
Well, if variable-width encodings are referred to as both DBCS (see, for
example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS, then what
term is used to describe a fixed-width encoding of more than 1 byte? Or
was the concept not common enough to warrant a name until Unicode?

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/



Asmus Freytag
2003-09-02 07:24:47 UTC
Post by Doug Ewell
Post by Tex Texin
In most industry usages, MBCS refers to variable width encodings, not
fixed width.
Well, if variable-width encodings are referred to as both DBCS (see, for
example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS, then what
term is used to describe a fixed-width encoding of more than 1 byte? Or
was the concept not common enough to warrant a name until Unicode?
The most common 'pure' DBCS was encountered in mainframe environments.
All the other platforms used 'mixed' single- and double-byte or other
variable-length encodings, so that 'DBCS' could stand in for a
variable-length encoding with maximum length 2 without confusion (except
when talking to mainframe people).

A./


Philippe Verdy
2003-09-02 23:04:38 UTC
Post by Asmus Freytag
Post by Doug Ewell
Post by Tex Texin
In most industry usages, MBCS refers to variable width encodings, not
fixed width.
Well, if variable-width encodings are referred to as both DBCS (see, for
example, http://czyborra.com/charsets/cjk.html#dbcs) and MBCS, then what
term is used to describe a fixed-width encoding of more than 1 byte? Or
was the concept not common enough to warrant a name until Unicode?
The most common 'pure' DBCS was encountered in mainframe environments.
All the other platforms used 'mixed' single and double-byte or other
variable length encodings, so that 'DBCS' could stand in for a variable
length encoding with maximum length 2 without confusion (except when
talking to mainframe people).
In the late 80's, the acronym DBCS was also used to refer to
user-defined characters that could be assigned in a codepage, defined by
a transferable bitmap, and accessed with an encoding sequence that let
you remap the upper half of the 8-bit character set.

In a 7-bit environment, these 8-bit "characters" (in fact relative
positions in a 7-bit codepage) could be accessed using control sequences
(like SS2, which shifts into the upper subset for the next character
only). For these reasons, the characters assigned in the selected
codepage for the upper half of the 8-bit encoding, and accessed with at
least two 7-bit bytes, were called "double-byte characters", and the
general encoding scheme was called "DBCS".
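
The single-shift mechanism can be sketched as a toy decoder in Python. This is a simplified illustration under stated assumptions, not a conforming ISO 2022 implementation: it assumes the upper half is ISO 8859-1, already designated as G2 (the designation escape sequence is omitted), and it recognises only the 7-bit form of SS2, which is ESC N. The function name is hypothetical.

```python
SS2_7BIT = b"\x1bN"  # ESC N: 7-bit representation of SINGLE SHIFT TWO

def decode_7bit_with_ss2(data: bytes) -> str:
    """Decode 7-bit data where ESC N shifts the next byte into the upper half.

    Assumption: the upper half (G2) is ISO 8859-1.
    """
    chars = []
    i = 0
    while i < len(data):
        if data[i:i + 2] == SS2_7BIT:
            if i + 2 >= len(data):
                raise ValueError("SS2 at end of input")
            shifted = data[i + 2]
            if not 0x20 <= shifted <= 0x7F:
                raise ValueError("invalid byte after SS2")
            # Setting the high bit selects the upper-half position.
            chars.append(bytes([shifted | 0x80]).decode("latin-1"))
            i += 3
        else:
            if data[i] > 0x7F:
                raise ValueError("8-bit byte in 7-bit stream")
            chars.append(chr(data[i]))
            i += 1
    return "".join(chars)

decoded = decode_7bit_with_ss2(b"caf\x1bNi")
print(decoded)
```

Here the final letter costs three 7-bit bytes (ESC, N, and the shifted position), while the unshifted letters cost one each.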

This inspired the ISO-2022 standards for East Asian languages, and also
the European Teletext and Videotext standard, which was then restricted
to a 7-bit encoding scheme. These systems are still in use today. In any
case, the "DBCS" usage referred to a complex encoding scheme with
variable-length characters (sometimes varying with the encoding context,
or exceeding the 2-byte limit). You may find references to these
character sets alongside references to the special escape sequences used
to define and transport the bitmaps needed to represent "user-defined"
characters (as they were defined notably to support Japanese or Chinese
in the late 80's, or to create custom graphic characters, in fact bitmap
glyphs, within interactive documents or applications).



Marco Cimarosti
2003-08-29 07:57:42 UTC
Post by Doug Ewell
[...]
(BTW, pet peeve: The word "acronym" should only be used to mean a
pronounceable WORD ("nym") formed from the initials of other words.
Classic examples are "scuba" and "radar." If you can figure
out how to pronounce "qbcs," more power to you, but to me it's just
an abbreviation.)
Right, sorry.

(I can pronounce ['kubks], although I wouldn't do it in front of my managers
and customers. :-)

Actually, I don't like this "QBCS" term and I'd rather avoid saying it
myself. But I wanted to be sure what other people mean when they use it.
Post by Doug Ewell
[...]
Post by Lars Marius Garshol
So what it really means must be "quadra-byte character
encoding", and both GB 18030 and UTF-32 should fit
into that category.
GB 18030, yes, because its code units vary from one to four bytes in
length. UTF-32, no, because its code units are uniformly 32 bits.
But UTF-8 fits the definition.
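
For illustration, here is a small Python sketch of why UTF-8 qualifies: the lead byte alone tells a decoder whether the character occupies one, two, three, or four bytes. The function name and sample characters are hypothetical, and the ranges are those of well-formed UTF-8 lead bytes.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the length of a UTF-8 sequence from its lead byte."""
    if lead < 0x80:
        return 1  # 0xxxxxxx: ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2  # 110xxxxx (0xC0 and 0xC1 would be overlong forms)
    if 0xE0 <= lead <= 0xEF:
        return 3  # 1110xxxx
    if 0xF0 <= lead <= 0xF4:
        return 4  # 11110xxx, limited to U+10FFFF
    raise ValueError(f"not a valid UTF-8 lead byte: {lead:#04x}")

# Arbitrary samples covering each length.
sample = "A\u00e9\u4e2d\U00020000"
lengths = [utf8_sequence_length(ch.encode("utf-8")[0]) for ch in sample]
print(lengths)
```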

_ Marco



Doug Ewell
2003-08-29 15:26:30 UTC
Post by Marco Cimarosti
Post by Doug Ewell
Post by Lars Marius Garshol
So what it really means must be "quadra-byte character
encoding", and both GB 18030 and UTF-32 should fit
into that category.
GB 18030, yes, because its code units vary from one to four bytes in
length. UTF-32, no, because its code units are uniformly 32 bits.
But UTF-8 fits the definition.
Correct.

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/


