Ticket #1671 (closed bug: wontfix)

Opened 12 years ago

Last modified 12 years ago

Can't encode strings with non-ASCII characters in them

Reported by: masak
Owned by:
Priority: normal
Milestone:
Component: none
Version: 2.4.0
Severity: medium
Keywords:
Cc:
Language:
Patch status:
Platform:

Description

In the PIR below, encoding an all-ASCII string works fine, but Parrot dies when trying to translate the string "ö" to fixed_8. The error message is 'unimpl fixed_8'.

.sub _main :main
    .local int bin_coding, i, max, byte
    .local string bin_string
    .local pmc it, result
    # Case 1: an all-ASCII string transcodes to fixed_8 without error.
    $S0 = "OH HAI"
    bin_coding = find_encoding 'fixed_8'
    bin_string = trans_encoding $S0, bin_coding
    # Print the code of each character in the transcoded string.
    i = 0
    max = length bin_string
  bytes_loop:
    if i >= max goto bytes_done
    byte = ord bin_string, i
    say byte
    inc i
    goto bytes_loop
  bytes_done:

    # Case 2: a non-ASCII string; this call dies with 'unimpl fixed_8'.
    $S0 = unicode:"ö"
    bin_string = trans_encoding $S0, bin_coding
.end

Attachments

to_charset_binary.patch (1.4 KB) - added by NotFound 12 years ago.

Change History

Changed 12 years ago by NotFound

The attached patch, to_charset_binary, provides the requested functionality using trans_charset instead of trans_encoding. It does a raw copy of the string content, regardless of its charset and encoding.

However, I'm not sure this conversion is desirable to have. Opinions?

Changed 12 years ago by NotFound

Conversion of strings to and from raw bytes can now be done with the ByteBuffer PMC. Is that enough, or is a direct translation to a binary string still wanted?
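
A minimal sketch of the ByteBuffer route, reusing the loop from the description. It assumes that assigning a string to a ByteBuffer copies the string's raw bytes, and that the buffer supports elements and integer-keyed access. None of that is confirmed in this ticket, so check it against the ByteBuffer PMC documentation. The values printed depend on the string's internal encoding.

.sub _main :main
    .local pmc bb
    .local int i, max, byte
    $S0 = unicode:"ö"
    bb = new ['ByteBuffer']
    # Assumption: assigning a string copies its raw bytes into the buffer.
    bb = $S0
    i = 0
    # Assumption: elements returns the number of bytes in the buffer.
    max = elements bb
  bytes_loop:
    if i >= max goto bytes_done
    # Assumption: integer-keyed access returns the byte at index i.
    byte = bb[i]
    say byte
    inc i
    goto bytes_loop
  bytes_done:
.end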

Changed 12 years ago by masak

That should indeed be enough; translating directly to a string was more a sign of desperation than a viable long-term solution.

I've done an initial review of the ByteBuffer PMC, and it looks very good. Will port my Perl 6 code to make use of it tonight or tomorrow. Thanks.

Changed 12 years ago by NotFound

  • status changed from new to closed
  • resolution set to wontfix

Closing ticket.
