Perl Unicode question

Jan. 22, 2006

Folks,

I'm trying to understand how Unicode works in Perl 5.8.7, and things are not
working the way they're described in perluniintro. In particular, I have a
UTF-8 encoded file, written from Perl, that I want to convert to Windows
Unicode (UTF-16LE). The intro provides this snippet for converting from one
character set to another:

    open(my $nihongo, '<:encoding(iso2022-jp)', 'text.jis');
    open(my $unicode, '>:utf8',                 'text.utf8');
    while (<$nihongo>) { print $unicode }

When I try that, substituting UTF-8 for the input encoding and
':encoding(UTF-16LE)' for the output layer, I get no warnings, but the output
has twice as many characters as it should, and they are all NULs.
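
A minimal sketch of the substitution described above, assuming placeholder
file names and a made-up sample line rather than the actual file. The ':raw'
in the output layer is an assumption on top of the perluniintro snippet; it
is there so that a default CRLF layer on Windows does not sit underneath the
encoder:

```perl
use strict;
use warnings;

# Hypothetical file names, for illustration only.
my ($in_name, $out_name) = ('text.utf8', 'text.utf16le');

# Write a small UTF-8 sample so the sketch is self-contained.
# The sample contains U+2019 RIGHT SINGLE QUOTATION MARK.
open(my $sample, '>:encoding(UTF-8)', $in_name)
    or die "Cannot write $in_name: $!";
print {$sample} "Skip\x{2019}s file\n";
close($sample) or die "Cannot close $in_name: $!";

# Decode UTF-8 on input, encode UTF-16LE on output.
open(my $utf8, '<:encoding(UTF-8)', $in_name)
    or die "Cannot read $in_name: $!";
open(my $utf16, '>:raw:encoding(UTF-16LE)', $out_name)
    or die "Cannot write $out_name: $!";

while (my $line = <$utf8>) {
    print {$utf16} $line;
}

close($utf8)  or die "Cannot close $in_name: $!";
close($utf16) or die "Cannot close $out_name: $!";
```

UTF-16LE stores every character of this sample in two bytes, so the output is
twice as long in bytes as the text is in characters, and each ASCII character
is followed by a 0x00 byte. A viewer that reads the file byte by byte will
therefore show a NUL after every ASCII character.
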
I can supply my UTF-8 file if anyone wants to try it. In particular, I think
the fact that the apostrophes (ASCII 0x27) in the text are encoded as RIGHT
SINGLE QUOTATION MARK (U+2019) may be part of the problem.
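
To check how that one character is represented in each encoding, a small
sketch using the core Encode module can print its bytes. The variable names
are illustrative only:

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+2019 RIGHT SINGLE QUOTATION MARK as a Perl character string.
my $quote = "\x{2019}";

# Encode the character into bytes in each target encoding.
my $utf8_bytes  = encode('UTF-8',    $quote);
my $utf16_bytes = encode('UTF-16LE', $quote);

# Render each byte string as space-separated hex pairs.
my $utf8_hex  = join ' ', map { sprintf '%02X', ord($_) } split //, $utf8_bytes;
my $utf16_hex = join ' ', map { sprintf '%02X', ord($_) } split //, $utf16_bytes;

print "UTF-8:    $utf8_hex\n";    # prints "UTF-8:    E2 80 99"
print "UTF-16LE: $utf16_hex\n";   # prints "UTF-16LE: 19 20"
```
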
Thanks,
Skip

Skip Gaede
