Discussion:
Need help about unicode encoding with Perl !
Hu Guoxin
2003-09-08 10:26:42 UTC
Permalink
Hello everyone,

I'm using Perl to develop a web-site. To support operating systems in
Chinese and Japanese, the CGI scripts written in Perl should convert the
characters that users enter into Unicode and save them on the server.

My questions are:

(1) How can a Perl script determine the user's OS?
(2) How do I convert GB, BIG5, and Shift-JIS into Unicode?
(3) How do I convert Unicode into GB, BIG5, and Shift-JIS?

Thanks a lot!

********************************************
胡 国昕 (Hu Guo Xin)
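Beijing Fujitsu System Engineering Co., Ltd., Development Department 2
10th Floor, Tower B, Pengrun Building, 26 Xiaoyun Road, Chaoyang District, Beijing, China
Postal code: 100016
Tel: +86-10-84584711-261 (external)
7987-261 (internal)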
Email: ***@bfs.cn.fujitsu.com
********************************************




------------------------ Yahoo! Groups Sponsor ---------------------~-->
KnowledgeStorm has over 22,000 B2B technology solutions. The most comprehensive IT buyers' information available. Research, compare, decide. E-Commerce | Application Dev | Accounting-Finance | Healthcare | Project Mgt | Sales-Marketing | More
http://us.click.yahoo.com/IMai8D/UYQGAA/cIoLAA/8FfwlB/TM
---------------------------------------------------------------------~->

To Unsubscribe, send a blank message to: unicode-***@yahooGroups.com

This mailing list is just an archive. The instructions to join the true Unicode List are on http://www.unicode.org/unicode/consortium/distlist.html


Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
Chris Jacobs
2003-09-08 14:23:49 UTC
Permalink
----- Original Message -----
From: "Hu Guoxin" <***@bfs.cn.fujitsu.com>
To: <***@unicode.org>
Sent: Monday, September 08, 2003 12:26 PM
Subject: Need help about unicode encoding with Perl !
Post by Hu Guoxin
I'm using Perl to develop a web-site.
But you don't tell which version of Perl.
Post by Hu Guoxin
From this website http://rf.net/~james/perli18n.html (the first site Google
finds for unicode perl web) I see that different versions of Perl have
different levels of support for Unicode.





Hu Guoxin
2003-09-09 03:01:29 UTC
Permalink
Post by Chris Jacobs
Post by Hu Guoxin
I'm using Perl to develop a web-site.
But you don't tell which version of Perl.
It's Perl 5.6.1.
Post by Chris Jacobs
Post by Hu Guoxin
From this website http://rf.net/~james/perli18n.html (the first site Google
finds for unicode perl web) I see that different versions of Perl have
different levels of support for Unicode.
I just want to know how to convert between encodings.

For example:

"中国人。" is a sentence in GBK encoding.
(1) How do I convert it into Unicode?
(2) How do I convert it back?
(3) If the user's OS is Japanese, how do I convert the Unicode message into Shift-JIS
as "中国人"?

Thanks a lot!



John Delacour
2003-09-09 13:43:05 UTC
Permalink
Post by Hu Guoxin
Post by Chris Jacobs
Post by Hu Guoxin
I'm using Perl to develop a web-site.
But you don't tell which version of Perl.
it's Perl 5.6.1.
If you get

ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP936.TXT


then you can build a table to work with. The script below simply
prints the two values (the CP936 bytes and the matching Unicode
value as two bytes), but you can build a hash to do what you want.
In Perl 5.8+ most people would use the Encode module, I suppose.


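#!/usr/bin/perl
# Print each CP936 (GBK) byte sequence next to its Unicode value,
# the latter written as two raw bytes (big-endian).
use strict;
use warnings;

my $f = "/Users/Shared/Downloads/CP936.TXT";    # path to downloaded file
open my $fh, '<', $f or die "Cannot open $f: $!";
binmode STDOUT;                                 # the output is raw bytes
while (my $line = <$fh>) {
    $line =~ s/[\015\012]+$//;                  # strip LF or CRLF line ending
    next if $line =~ /^#/;                      # skip comment lines
    # Columns: 0xGBK <tab> 0xUNICODE <tab> #comment
    next unless $line =~ /^0x([0-9A-Fa-f]+)\t0x([0-9A-Fa-f]{4})/;
    my ($gbk, $uni) = ($1, $2);
    # pack 'H*' turns a hex string such as "D6D0" into the bytes 0xD6 0xD0
    print pack('H*', $gbk), "\t", pack('H*', $uni), "\n";
}
close $fh;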
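The suggestion above to build a hash from the mapping table could look roughly like the following. This is only a sketch: build_maps() and gbk_to_codepoints() are hypothetical helper names, and the three sample rows are written inline in the CP936.TXT layout instead of being read from the downloaded file. It avoids the Encode module, so it should also suit Perl 5.6.x.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build lookup hashes from lines in the CP936.TXT layout:
#   0xGBK <tab> 0xUNICODE <tab> #comment
# build_maps() is a hypothetical helper, not part of any module.
sub build_maps {
    my (@lines) = @_;
    my (%to_unicode, %from_unicode);
    for my $line (@lines) {
        next if $line =~ /^#/;                          # skip comment lines
        next unless $line =~ /^0x([0-9A-Fa-f]+)\t0x([0-9A-Fa-f]+)/;
        my ($bytes, $cp) = (pack('H*', $1), hex $2);
        $to_unicode{$bytes} = $cp;                      # GBK bytes  -> code point
        $from_unicode{$cp}  = $bytes;                   # code point -> GBK bytes
    }
    return (\%to_unicode, \%from_unicode);
}

# Split a GBK byte string into characters and look each one up.
# Unmapped characters become U+FFFD; scanning stops at an invalid byte.
sub gbk_to_codepoints {
    my ($map, $bytes) = @_;
    my @cps;
    # One ASCII byte, or a lead byte 0x81-0xFE followed by a trail byte 0x40-0xFE
    while ($bytes =~ /\G([\x00-\x7F]|[\x81-\xFE][\x40-\xFE])/g) {
        push @cps, exists $map->{$1} ? $map->{$1} : 0xFFFD;
    }
    return @cps;
}

# Sample rows (in real use, read these from CP936.TXT)
my @sample = (
    "# comment line",
    "0x41\t0x0041\t#LATIN CAPITAL LETTER A",
    "0xD6D0\t0x4E2D\t#CJK UNIFIED IDEOGRAPH",
    "0xB9FA\t0x56FD\t#CJK UNIFIED IDEOGRAPH",
);
my ($to_unicode, $from_unicode) = build_maps(@sample);

my @cps = gbk_to_codepoints($to_unicode, "\xD6\xD0\xB9\xFAA");
print join(' ', map { sprintf 'U+%04X', $_ } @cps), "\n";   # U+4E2D U+56FD U+0041
```

The reverse hash ($from_unicode) covers the opposite direction; the same approach would apply to a Shift-JIS or Big5 mapping file with the same column layout.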
Post by Hu Guoxin
Post by Chris Jacobs
Post by Hu Guoxin
From this website http://rf.net/~james/perli18n.html (the first site Google
finds for unicode perl web) I see that different versions of Perl have
different levels of support for Unicode.
I just want to know how to convert between encodings.
"中国人。" is a sentence in GBK encoding.
(1) How do I convert it into Unicode?
(2) How do I convert it back?
(3) If the user's OS is Japanese, how do I convert the Unicode message into Shift-JIS
as "中国人"?
Thanks a lot!
Jungshik Shin
2003-09-09 23:53:13 UTC
Permalink
Post by John Delacour
Post by Hu Guoxin
it's Perl 5.6.1.
Perl 5.8+ most people would use the Encode
module, I suppose.
'Encode' was kinda backported to Perl 5.6.x as Encode-compat
by Autrijus Tang. See http://www.cpan.org/authors/id/A/AU/AUTRIJUS
However, I'd upgrade to Perl 5.8.x.

Jungshik
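For readers on Perl 5.8 or later, the three conversions asked about in this thread might be sketched with the Encode module as follows. This is an illustration under assumptions rather than a tested recipe for the original site: the sample string is written as byte escapes so the script does not depend on how the source file is saved, and 'cp936' and 'shiftjis' are the Encode names assumed here for GBK and Shift-JIS.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);    # core module from Perl 5.8 onward

# "中国人。" as raw GBK (CP936) bytes, e.g. as received from a form
my $gbk_bytes = "\xD6\xD0\xB9\xFA\xC8\xCB\xA1\xA3";

# (1) GBK bytes -> Perl Unicode string
my $text = decode('cp936', $gbk_bytes);

# (2) Unicode string -> GBK bytes again
my $back = encode('cp936', $text);

# (3) Unicode string -> Shift-JIS bytes for a Japanese client
my $sjis = encode('shiftjis', $text);

# One way to store the text on the server: UTF-8 bytes
my $utf8 = encode('UTF-8', $text);

printf "code points: %s\n",
    join ' ', map { sprintf 'U+%04X', ord($_) } split //, $text;
printf "round trip ok: %s\n", ($back eq $gbk_bytes ? 'yes' : 'no');
printf "Shift-JIS bytes: %s\n",
    join ' ', map { sprintf '%02X', ord($_) } split //, $sjis;
```

Encode also ships a 'big5-eten' encoding for Big5; note that simplified-only characters such as 国 have no Big5 equivalent, so text would need to be checked before converting in that direction.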


Chris Jacobs
2003-09-09 13:43:07 UTC
Permalink
Post by Hu Guoxin
I just want to know how to convert between encodings.
"中国人。" is a sentence in GBK encoding.
(1) How do I convert it into Unicode?
(2) How do I convert it back?
(3) If the user's OS is Japanese, how do I convert the Unicode message into Shift-JIS
as "中国人"?
Thanks a lot!
For Shift-JIS see:

http://homepage1.nifty.com/nomenclator/perl/ShiftJIS-CP932-MapUTF.html
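One practical point when converting Chinese text to Shift-JIS is that some characters have no Shift-JIS equivalent. The sketch below does not use the module linked above; it assumes Perl 5.8+ with the Encode module and its 'cp932' encoding, and uses 们 (U+4EEC), assumed here to be absent from CP932, as the unmappable character. It shows two ways such characters might be handled on a web page.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);    # core module from Perl 5.8 onward

# "中国人们" as a Unicode string; the last character is simplified-only
my $text = "\x{4E2D}\x{56FD}\x{4EBA}\x{4EEC}";

# Option 1: replace unmappable characters with HTML numeric references,
# which a browser can still display if it has a suitable font.
my $html_safe = encode('cp932', $text, Encode::FB_HTMLCREF());

# Option 2: treat an unmappable character as an error.
# FB_CROAK may modify its input, so work on a copy.
my $copy = $text;
my $ok   = eval { encode('cp932', $copy, Encode::FB_CROAK()); 1 };

print "with references: $html_safe\n";
print "strict conversion ", ($ok ? "succeeded" : "failed"), "\n";
```

Without a third argument, encode() silently substitutes a replacement character, which is easy to miss when the output is only viewed on one system.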



