Ticket #1639 (assigned todo)

Opened 5 years ago

Last modified 4 years ago

StringHandle should be updated to use StringBuilder internally.

Reported by: bacek Owned by: kthakore
Priority: normal Milestone:
Component: core Version: trunk
Severity: low Keywords: newbie
Cc: Language:
Patch status: Platform:

Description

Hello.

In immutable strings world current implementation of StringHandle is sub-optimal. It should be migrated to use StringBuilder internally instead of String.

-- Bacek

Change History

Changed 4 years ago by bacek

  • keywords newbie added

Changed 4 years ago by kthakore

From what I have understood from Coke:

src/pmc/stringhandle.pmc has an attribute "*stringhandle* - which needs to be changed

[Relevant Files: src/pmc/string.pmc and src/pmc/stringbuilder.pmc]

The internal buffer is STRING, but should be StringBuilder instead of a String. The main reason is the guts of push_string(), which stringbuilder does better.

Coke 'The puts() method that makes it want to update' ??? [Clarification Needed]

Coke ' You could start by changing the open() method to initialize the stringhandle attribute with a Stringbuilder instead of a string and seeing what breaks '

Are there specific tests for this? How do I run those without running the whole test suite (like perl Build test --test_files )?

R

Changed 4 years ago by coke

The initial comments above were against the wrong file, which is why they reference a non-existant method.

src/pmc/stringhandle has an ATTR named *stringhandle. - this is initialized to a string PMC, but should be a stringbuilder PMC instead. the initialization happens in open().

The string is appended to in puts() - that's why we want stringbuilder, which is much faster/better with appends than the now-immutable string.

The stringhandle tests are in t/pmc/stringhandle.t. You can always ack through the t/ files to find tests that include a given pmc/opcode/etc.

perl t/harness t/pmc/stringhandle.t

to run a single test file with the standard harness settings.

Changed 4 years ago by kthakore

I have updated stringhandle and the usage of stringhandle in io with this patch

http://sdlperl.ath.cx/parrot_string_patch

However this causes

make[1]: Leaving directory `/home/kthakore/parrot/trunk/docs'
./ops2c --dynamic src/dynoplibs/io.ops --quiet
make: *** [src/dynoplibs/io_ops.c] Segmentation fault

I can't seem to find the dependency but I will look again later.

Changed 4 years ago by kthakore

Ok so I got a lot further ... the patch is here now

 http://sdlperl.ath.cx/parrot_string_patch

However now I am having trouble on the readline test in t/pmc/stringbuilder.t that is using

Parrot_io_readline(INTERP, SELF);

Which is calling

if (pmc->vtable->base_type == enum_class_StringHandle) {
        INTVAL offset, newline_pos, read_length, orig_length;
        PMC* stringhandle;
        GETATTR_StringHandle_stringhandle(interp, pmc, stringhandle);
        if (PMC_IS_NULL(stringhandle))
            Parrot_ex_throw_from_c_args(interp, NULL, EXCEPTION_PIO_ERROR,
                "Attempt to read from null or invalid PMC");

        result = VTABLE_get_string(interp, stringhandle);

        if (STRING_IS_NULL(result))
            Parrot_ex_throw_from_c_args(interp, NULL, EXCEPTION_PIO_ERROR,
                "Cannot read from a closed stringhandle");

        orig_length = Parrot_str_byte_length(interp, result);
GETATTR_StringHandle_read_offset(interp, pmc, offset);
        newline_pos = Parrot_str_find_index(interp, result, CONST_STRING(interp,

        /* No newline found, read the rest of the string. */
        if (newline_pos == -1)
            read_length = orig_length - offset;
        else
            read_length = newline_pos - offset + 1; /* +1 to include the newline

        result = Parrot_str_substr(interp, result, offset, read_length);
        SETATTR_StringHandle_read_offset(interp, pmc, newline_pos + 1);
    }

The problem with this is it is running out of memory. Below is the stack.

{{ Failed allocation of 214446736 bytes Parrot VM: PANIC: Out of mem! C file src/gc/alloc_memory.c, line 151 Parrot file (not available), line (not available) ... (gdb) bt full #0 0x00007f38e50fdf3b in raise () from /lib/libpthread.so.0 No symbol table info available. #1 0x00007f38e6e9d33c in do_panic (interp=0x0,

message=0x7f38e7005ec8 "Out of mem", file=0x7f38e7005e88 "src/gc/alloc_memory.c", line=151) at src/exceptions.c:728

_ASSERT_ARGS_CHECK = 0

#2 0x00007f38e6ea73b9 in failed_allocation (line=151, size=214446736)

at src/gc/alloc_memory.c:31

No locals. #3 0x00007f38e6ea7631 in meminternal_allocate_zeroed (size=214446736,

file=0x7f38e700b958 "src/gc/alloc_resources.c", line=237) at src/gc/alloc_memory.c:151

_ASSERT_ARGS_CHECK = 0 ptr = (void * const) 0x0

#4 0x00007f38e6f0a1cd in alloc_new_block (mem_pools=0x60b8e0, size=214446680,

pool=0x60b9c0, why=0x7f38e700bec8 "inside compact")

---Type <return> to continue, or q <return> to quit---

at src/gc/alloc_resources.c:236

_ASSERT_ARGS_CHECK = 0 new_block = (Memory_Block *) 0x0 alloc_size = 214446680

#5 0x00007f38e6f0a459 in compact_pool (interp=0x60b080, mem_pools=0x60b8e0,

pool=0x60b9c0) at src/gc/alloc_resources.c:467

_ASSERT_ARGS_CHECK = 0 j = 139882369041963 total_size = 214446680

...

#9 0x00007f38e6ea7fdd in Parrot_gc_mark_and_sweep (interp=0x60b080, flags=2)

at src/gc/api.c:690

_ASSERT_ARGS_CHECK = 0

#10 0x00007f38e6f0a748 in mem_allocate (interp=0x60b080, mem_pools=0x60b8e0,

size=48904, pool=0x60b9c0) at src/gc/alloc_resources.c:326

_ASSERT_ARGS_CHECK = 0 return_val = (void *) 0xa2c500

#11 0x00007f38e6eaa700 in gc_ms_allocate_string_storage (interp=0x60b080,

str=0xa3d960, size=48894) at src/gc/gc_ms.c:1188

_ASSERT_ARGS_CHECK = 0 new_size = 48904

---Type <return> to continue, or q <return> to quit---

pool = (Variable_Size_Pool *) 0x60b9c0 mem = 0x0

#12 0x00007f38e6ea8487 in Parrot_gc_allocate_string_storage (interp=0x60b080,

str=0xa3d960, size=48894) at src/gc/api.c:511

_ASSERT_ARGS_CHECK = 0

#13 0x00007f38e6e4b04c in Parrot_str_clone (interp=0x60b080, s=0x7213f0)

at src/string/api.c:360

_ASSERT_ARGS_CHECK = 0 alloc_size = 48894 result = (STRING * const) 0xa3d960

#14 0x00007f38e6fc3ed6 in Parrot_StringBuilder_get_string (interp=0x60b080,

_self=0x69dd10) at ./src/pmc/stringbuilder.pmc:114

buffer = (STRING *) 0x7213f0

#15 0x00007f38e6f1054b in Parrot_io_readline (interp=0x60b080, pmc=0x69dc90)

at src/io/api.c:428

newline_pos = 139882369096428 read_length = 6336640

---Type <return> to continue, or q <return> to quit---

offset = 40 orig_length = 10976080 stringhandle = (PMC *) 0x69dd10 _ASSERT_ARGS_CHECK = 0 result = (STRING *) 0x60b080

#16 0x00007f38e6fc5e67 in Parrot_StringHandle_nci_readline (interp=0x60b080,

_self=0x698450) at ./src/pmc/stringhandle.pmc:291

string_result = (STRING * const) 0x2 _self = (PMC *) 0x69dc90 _ctx = (PMC * const) 0x69c8d0 _ccont = (PMC * const) 0x69c950

}}

Changed 4 years ago by NotFound

  • owner set to kthakore

Changed 4 years ago by kthakore

  • status changed from new to assigned

Changed 4 years ago by kthakore

Memory issues have been fixed. One last issue remains.

Using utf8 encoding on opened StringHandle returns a 'fixed_8' encodingname instead of utf8.

Moreover it seems do open() doesn't call StringHandle's open method but Perl_io_open. Which has a special case the just sets the flags and returns the pmc.

The patch has been updated at

 http://sdlperl.ath.cx/parrot_string_patch

Changed 4 years ago by kthakore

t/pmc/stringhandle.t .. 1/25 ok 5 # skip no asynch calls yet t/pmc/stringhandle.t .. 11/25 not ok 12 - record_separator # TODO not yet implemented t/pmc/stringhandle.t .. 18/25 # Failed test 'readall() - utf8 on opened stringhandle' # at t/pmc/stringhandle.t line 696. # got: 'fixed_8 # ' # expected: 'utf8 # ' # Looks like you failed 1 test of 25. t/pmc/stringhandle.t .. Dubious, test returned 1 (wstat 256, 0x100) Failed 1/25 subtests

(less 1 skipped subtest: 23 okay)

Changed 4 years ago by allison

We need to kill that Parrot_io_open loop. The function call isn't doing anything but setting the the flags attribute, which StringHandle's open method could do directly.

Delete these 4 lines from Parrot_io_open in src/io/api.c.

149     else if (new_filehandle->vtable->base_type == enum_class_StringHandle) {
150         SETATTR_StringHandle_flags(interp, pmc, flags);
151         filehandle = pmc;
152     }

Delete this line from the open method in src/pmc/stringhandle.pmc:

210         SELF = Parrot_io_open(INTERP, SELF, filename, open_mode);

Move the function "Parrot_io_parse_open_flags" from src/io/filehandle.c to src/io/api.c (since it is used more broadly than just the FileHandle PMC).

Add the following lines in the appropriate places in the open method in src/pmc/stringhandle.pmc:

INTVAL flags;
...
flags = Parrot_io_parse_open_flags(INTERP, open_mode);
SET_ATTR_flags(INTERP, SELF, flags)

Test.

Changed 4 years ago by NotFound

Partially done in r48234, TODO the open_flags moving.

Changed 4 years ago by NotFound

Moving of Parrot_io_parse_open_flags done in 2611433

I'm not sure about the status of this ticket. Are all problems mentioned solved?

Note: See TracTickets for help on using tickets.