Discussion:
TLG and Beta code
Raymond Mercier
2003-08-26 20:50:44 UTC
Last January when I asked if the Greek symbol for one-half might be included somewhere in Unicode I was led to understand that not only that but a whole range of Greek symbols were being proposed by the TLG people. There was for example http://www.tlg.uci.edu/Uni.prop.html. Indeed the Beta code (http://www.tlg.uci.edu/~tlg/BetaCode.html) used by the TLG covers a huge range of odd symbols which are needed in Unicode if the classical texts which they have digitised are ever to be "unicoded".

I was reminded of the need to enlarge the Greek coverage when converting some Greek numerical texts, and saw that not even the symbol for zero was part of the Greek block, so that I had to use U+014D, Latin lower-case 'o' with macron (ō), which is admittedly near enough.
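
To make the workaround concrete, here is a minimal Python sketch (not part of the original post; it assumes only the standard unicodedata module, and the variable and function names are illustrative). It compares U+014D with the Greek alternative, omicron followed by a combining macron.

```python
import unicodedata

# Latin workaround mentioned in the message: a single precomposed code point.
latin_o_macron = "\u014d"
# Hypothetical Greek alternative: omicron followed by U+0304 COMBINING MACRON.
greek_omicron_macron = "\u03bf\u0304"


def describe(text: str) -> list[str]:
    """Return 'U+XXXX NAME' for each code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(describe(latin_o_macron))
print(describe(greek_omicron_macron))

# Canonical decomposition shows the Latin letter is o + macron underneath.
latin_nfd = unicodedata.normalize("NFD", latin_o_macron)
# Composition leaves the Greek sequence as two code points, because Unicode
# has no precomposed omicron with macron.
greek_nfc = unicodedata.normalize("NFC", greek_omicron_macron)

print(describe(latin_nfd))
print(len(greek_nfc))
```

The two forms may look alike on screen, but they never compare equal, so a search for omicron in a Greek text will not find a zero entered as U+014D.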

Yet when I search the Unicode site now for TLG or Beta code I find nothing. Are the TLG proposals somewhere in the pipeline?

Raymond Mercier
David J. Perry
2003-08-27 00:11:02 UTC
Raymond,

If you go to http://www.tlg.uci.edu/Uni.prop.html you will see all the
proposals; the site indicates very clearly which ones have been accepted
by the UTC and which are pending (only one still pending at this point).
They must of course be voted on by WG2 before they are officially a part
of Unicode. The TLG folks have prepared a very useful document at
http://www.tlg.uci.edu/quickbeta.pdf that shows the Unicode equivalent
for each beta code character (some of these are existing Unicode
characters, some newly proposed, and some so rare or poorly understood
that TLG did not think them appropriate to propose for Unicode).

David





To Unsubscribe, send a blank message to: unicode-***@yahooGroups.com

This mailing list is just an archive. The instructions to join the true Unicode List are on http://www.unicode.org/unicode/consortium/distlist.html


Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
Raymond Mercier
2003-08-27 08:49:20 UTC
David,
I am glad to see this much progress, yet, as I noticed after posting, the zero symbol is actually missing in Beta code, so your Beta code-Unicode equivalences would not have it. I think it is fair to say that the TLG have avoided the parts of mathematical texts where the symbol is common, as in the various tables in Ptolemy's Almagest (where all the tables are omitted by TLG). This symbol is in reality more common than the rarities listed in quickbeta. In the editions I am involved with we use U+014D (ō), which is near enough I suppose.

Raymond



Raymond Mercier
2003-08-27 10:49:25 UTC
Post by Peter Kirk
In a Greek text, shouldn't you be using omicron and a combining macron
rather than Latin o with macron? If omicron plus combining macron is an
adequate representation of the glyph, then maybe there is no need for a
new character.
--
Peter Kirk
http://www.qaya.org/
Well, it is just simpler to use the Latin, since the combination is a single
codepoint. The real point is that it would be nice to have an appropriate
Greek form. The TLG assumption is that the Greek texts used omicron for
zero, but that is not what you find in the MSS. Against that assumption, I
have just written to the TLG as follows:

I know that you will find support in Heath, whose Greek Mathematics, Vol.1,
p. 45, is surprisingly misleading in just saying that they used omicron.
Also in his ed. of Ptolemy's Hypotheses Heiberg has rather perversely put a
macron on all the letters except omicron! (Opera Minora 78.29, for
example).

This does not adequately represent the Byzantine MSS. I don't have Heiberg's
Syntaxis in front of me, but Halma's edition of the Syntaxis is closer to
the MSS, and uses o+macron. Elsewhere in the MSS one finds a variety of
forms, according to the age etc. In the ninth century MSS zero is
represented by a rather small o with a long overline with serifs at either
end, much bigger than our macron. In late Byzantine mathematics one finds
sometimes a form like the Cyrillic che (U+0447). Certainly the form varies
a good deal, but I have not seen a simple omicron, whatever the editors may
have put.

In the texts edited in Georges Gémiste Pléthon (by Anne Tihon and myself),
which I see you include in the TLG, we use a macron on the o, and are doing
the same in our edition of Ptolemy's Handy Tables. If we had something
closer to the forms used in ninth century MSS we would use it.

Raymond



Nick Nicholas
2003-08-27 11:33:16 UTC
Post by Raymond Mercier
David,
I am glad to see this much progress, yet, as I noticed after posting,
the zero symbol is actually missing in Beta code, so your Beta
code-Unicode equivalences would not have it.
I think it is fair to say that the TLG have avoided the parts of
mathematical texts where the symbol is common, as in the various
tables in Ptolemy's Almagest (where all the tables are omitted by
TLG). This symbol is in reality more common than the rarities listed
in quickbeta. In the editions I am involved with we use U+014D (ō),
which is near enough I suppose.
I count 368 instances of #130, the TLG entity for Greek zero, in the
text of the Almagest the TLG has, and a further 543 in Pappus'
commentary on the Almagest, 80 in Theon's commentary, and well over a
thousand in Byzantine astronomers; so rumours of its absence in Beta
code are exaggerated. :-) The TLG didn't actually avoid the tables (at
least not those integrated into the text), though the current markup of
the tables is somewhat dated.
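
A count like this could be reproduced with a short script. The Python sketch below is only an assumption about how one might do it, not the TLG's own tooling: it treats #130 as an escape that must not be followed by a further digit, so that a longer escape number beginning with the same digits is not miscounted. The function name and the sample string are invented for illustration.

```python
import re

# "#130" followed by anything except another digit (or by end of text).
ZERO_ESCAPE = re.compile(r"#130(?!\d)")


def count_zero_escapes(beta_text: str) -> int:
    """Count occurrences of the Beta code escape #130 in a string."""
    return len(ZERO_ESCAPE.findall(beta_text))


# Invented sample, not real TLG data: two genuine #130 escapes, plus two
# other numbers (#1301, #13) that must not be counted.
sample = "M #130 K, #1301 #13 #130"
print(count_zero_escapes(sample))
```

Run over the Beta code files of a given work, a function of this kind would give per-text totals comparable to the figures quoted above.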

Of course, the scholarly markup of texts in general raises the question
of when a glyph does need a Unicode codepoint, and when it is merely a
variant of something else, or beyond the scope of plaintext. The
listing of Beta escapes includes much that is either idiosyncratic or
a variant of something else; the TLG has traditionally erred on the
side of caution in including Beta escapes (equivalent to XML entities),
but the requirements for TLG markup are not necessarily the same as those
for inclusion in Unicode.

The equivalent glyph the TLG has posted for #130 is omicron, though of
course the print edition used for the Almagest has its Greek zero
slightly different (it's closer to a Goudy-style Arabic zero, from
memory). Whether it merits its own codepoint, or is merely a glyph
variant of U+0030 Digit Zero, is probably a debate for another time and
place. What to do with such "one-off" glyphs is the kind of issue the
Text Encoding Initiative is having to deal with, though.

One might argue against omicron or o-macron for Greek Zero on the
grounds that this isn't really a character but a digit; but then these
texts use letters for digits anyway. So I don't see a clear rationale
for one way or the other. However, I think the numerical diacritic for
the zero should be the same as for other Greek letters, and it should
be U+0305 Combining Overline rather than U+0304 Combining Macron.
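
The two candidate diacritics can be inspected with a few lines of Python. This is a sketch under stated assumptions rather than a recommendation: it uses only the standard unicodedata module, and the helper name overlined is hypothetical.

```python
import unicodedata

MACRON = "\u0304"    # COMBINING MACRON
OVERLINE = "\u0305"  # COMBINING OVERLINE


def overlined(letters: str, mark: str = OVERLINE) -> str:
    """Attach the given combining mark to every letter in the string."""
    return "".join(ch + mark for ch in letters)


# Print code point, name and canonical combining class of each mark.
for mark in (MACRON, OVERLINE):
    print(f"U+{ord(mark):04X}", unicodedata.name(mark), unicodedata.combining(mark))

# A letter numeral (mu zeta, i.e. 47) and an omicron-based zero, both
# marked with the same diacritic, as suggested above.
forty_seven = overlined("\u03bc\u03b6")
zero = overlined("\u03bf")
print(forty_seven, zero)
```

Both marks sit above the base letter. Which of the two a font draws as a continuous bar across adjacent letters depends on the font, so the result is worth checking in the typeface actually used.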

|||
"Assuming, for whatever reasons, that neither scholar presented the
evidence properly, then there remains a body of evidence you have not
yet destroyed because it has never been presented." --- Harold Fleming
|NickNicholas|Dept.French&ItalianStudies|UniversityOfMelbourne|Australia
|
| ***@unimelb.edu.au http://www.opoudjis.net
|



Raymond Mercier
2003-08-27 12:37:01 UTC
Post by Nick Nicholas
The equivalent glyph the TLG has posted for #130 is omicron,
I know this is common in the TLG, but as you say, they assume it is just
omicron (an assumption repeated in a message just received from them).
But, I am trying to get across that that is wrong: it represents neither
papyri nor Byzantine MSS. Apart from what I have already said about MSS in
the previous posting, I see in Ifrah's Histoire Universelle des Chiffres
(there is also an English version), vol. 1, p. 372 seq., illustrations from
papyri, in some of which the zero is just as in 9th century Byzantine MSS: a
small circle and a rather long bar over it. Other examples in papyri are
rather similar to that: never a plain omicron.
So is there not a good reason to treat this as a distinct character, to be
assigned to a Unicode codepoint?

Raymond
John Hudson
2003-08-27 18:20:46 UTC
Post by Raymond Mercier
I know this is common in the TLG, but as you say, they assume it is just
omicron (an assumption repeated in a message just received from them).
But, I am trying to get across that that is wrong: it represents neither
papyri nor Byzantine MSS.
...
Post by Raymond Mercier
So is there not a good reason to treat this as a distinct character, to be
assigned to a Unicode codepoint ?
Raymond, based on what you have said, I would agree. A variety of visual
representations, clearly distinct from the omicron as formed in the same
documents, suggests a separate character. Would you be able to write up a
proposal to encode such a character, or at least an informational document
including illustrations of different forms of the Greek zero, preferably in
proximity to differently formed omicrons? Nothing is going to happen unless
the UTC receive such a document, and you sound like the best person to
prepare one.

John Hudson

Tiro Typeworks www.tiro.com
Vancouver, BC ***@tiro.com

You need a good operator to make type. If it were a
DIY affair the caster would only run for about five
minutes before the DIYer burned his butt off.
- Jim Rimmer



Raymond Mercier
2003-08-27 20:32:24 UTC
John,
I am glad to hear from you. I shall do what I can to get a proposal
together.

Raymond

