Ticket #757 (closed bug: invalid)

Opened 7 years ago

Last modified 5 years ago

Problem with threads and HLLs

Reported by: tene Owned by: whiteknight
Priority: normal Milestone:
Component: hll_interop Version: 1.2.0
Severity: medium Keywords: pt, oo
Cc: Language:
Patch status: Platform:


clone_interpreter() fails an assert when I try to spawn a new thread while perl6.pbc is loaded.

Attached is a PIR file that demonstrates the problem. You can uncomment the load_bytecode 'perl6.pbc' to watch the crash.

I don't actually understand what's going on in that loop in clone_interpreter...


async.pir Download (0.9 KB) - added by tene 7 years ago.
tt757.pir Download (175 bytes) - added by Infinoid 7 years ago.
simple test case
TRAC-757-1.patch Download (1.8 KB) - added by cdarroch 7 years ago.
adjust vtable max and alloced to avoid losing slots?
TRAC-757-2.patch Download (1.0 KB) - added by cdarroch 7 years ago.
possible partial fix, but leads to another crash
TRAC-757-3.patch Download (1.1 KB) - added by kyle_l5l 7 years ago.
correct n_vtable_max in the cloned interpreter

Change History

Changed 7 years ago by tene

Changed 7 years ago by whiteknight

  • keywords pt, oo added
  • status changed from new to assigned
  • owner set to whiteknight

Changed 7 years ago by Infinoid

I'm seeing the same thing, but without needing an actual HLL. Apparently normal PIR classes are enough. I'll attach a very simple test case. Comment out the ".include 'test_more.pir'" line and it passes.

Changed 7 years ago by Infinoid

simple test case

Changed 7 years ago by Infinoid

(gdb) bt
#0  0x00007fe7e47dc645 in raise () from /lib/libc.so.6
#1  0x00007fe7e47ddb63 in abort () from /lib/libc.so.6
#2  0x00007fe7e720ae3c in Parrot_confess (
    cond=0x7fe7e7464538 "VTABLE_exists_keyed_str(d, d->class_hash, class_name)", file=0x7fe7e7464508 "./src/pmc/parrotinterpreter.pmc", line=94)
    at src/exceptions.c:605
#3  0x00007fe7e7350c63 in clone_interpreter (d=0x20bad20, s=0x1e1a080,
    flags=127) at ./src/pmc/parrotinterpreter.pmc:93
#4  0x00007fe7e7351541 in do_thread_run (interp=0x1e1a080, thread=0x1f93440,
    clone_flags=127, sub=0x1f09940, args=0x0) at ./src/pmc/parrotthread.pmc:66
#5  0x00007fe7e7351665 in do_thread_run_clone_default (interp=0x1e1a080,
    thread=0x1f93440, sub=0x1f09940, args=0x0) at ./src/pmc/parrotthread.pmc:82
#6  0x00007fe7e724b9c8 in pcf_I_JOPxAT_ (interp=0x1e1a080, self=0x1eb6f10)
    at src/nci.c:990
#7  0x00007fe7e735c5d8 in Parrot_NCI_invoke (interp=0x1e1a080, pmc=0x1eb6f10,
    next=0x1f3b260) at ./src/pmc/nci.pmc:335
#8  0x00007fe7e71a9593 in Parrot_callmethodcc_p_sc (cur_opcode=0x1f3b248,
    interp=0x1e1a080) at src/ops/object.ops:80
#9  0x00007fe7e7287ec8 in runops_slow_core (interp=0x1e1a080, pc=0x1f3b248)
    at src/runcore/cores.c:462
#10 0x00007fe7e7286b9c in runops_int (interp=0x1e1a080, offset=0)
    at src/runcore/main.c:987
#11 0x00007fe7e722c7aa in runops (interp=0x1e1a080, offs=0)
    at src/call/ops.c:107
#12 0x00007fe7e722cb86 in runops_args (interp=0x1e1a080, sub=0x1f09970,
    obj=0x1e9d850, meth_unused=0x0, sig=0x7fe7e745736b "vP", ap=0x7fffef903140)
    at src/call/ops.c:256
#13 0x00007fe7e722df3c in Parrot_runops_fromc_args (interp=0x1e1a080,
    sub=0x1f09970, sig=0x7fe7e745736b "vP") at src/call/ops.c:325
#14 0x00007fe7e7207f51 in Parrot_runcode (interp=0x1e1a080, argc=1,
    argv=0x7fffef903440) at src/embed.c:1021
#15 0x00007fe7e742b72d in imcc_run_pbc (interp=0x1e1a080, obj_file=0,
    output_file=0x0, argc=1, argv=0x7fffef903440) at compilers/imcc/main.c:801
#16 0x00007fe7e742c391 in imcc_run (interp=0x1e1a080,
    sourcefile=0x7fffef903f04 "tt757.pir", argc=1, argv=0x7fffef903440)
    at compilers/imcc/main.c:1092
#17 0x0000000000400c14 in main (argc=1, argv=0x7fffef903440) at src/main.c:60
(gdb) up
#1  0x00007fe7e47ddb63 in abort () from /lib/libc.so.6
(gdb) up
#2  0x00007fe7e720ae3c in Parrot_confess (
    cond=0x7fe7e7464538 "VTABLE_exists_keyed_str(d, d->class_hash, class_name)", file=0x7fe7e7464508 "./src/pmc/parrotinterpreter.pmc", line=94)
    at src/exceptions.c:605
605         abort();
(gdb) up
#3  0x00007fe7e7350c63 in clone_interpreter (d=0x20bad20, s=0x1e1a080,
    flags=127) at ./src/pmc/parrotinterpreter.pmc:93
93                      PARROT_ASSERT(VTABLE_exists_keyed_str(d,
(gdb) print class_name->strstart
$1 = 0x1fd9710 "Test;Builder"

So it seems the class hash isn't being cloned properly.

Changed 7 years ago by jkeenan

  • component changed from none to hll_interop

Changed 7 years ago by cdarroch

adjust vtable max and alloced to avoid losing slots?

Changed 7 years ago by cdarroch

possible partial fix, but leads to another crash

Changed 7 years ago by cdarroch

Hi -- this is my first stab at looking at the Parrot code, so my apologies up front if I'm heading off in the wrong direction on anything. I stumbled in here based on this Rakudo-related blog posting ( Perlgeek.de - Rakudo "star" announced) and the subsequent comments.

My first thought was the there seems to be something slightly off about the way n_vtable_max and n_vtable_alloced are being handled. For example, it looks to me as though parrot_realloc_vtables() will "lose" a slot each time it's called, because it effectively sets n_vtable_alloced to n_vtable_alloced + 15, but actually allocates 16 more slots. I'm also unsure about the single call to that function in clone_interpreter(), where it seems to me there's no guarantee only 16 (or 15) more slots are needed, but potentially any number. So my attachment:TRAC-757-1.patch Download attempts to make a few adjustments to fix these things and clarify that n_vtable_alloced is the actual number of allocated slots, and n_vtable_max is the next slot index to hand out, and may be equal to (but not greater than) n_vtable_alloced, in which case more slots will need to be allocated the next time a slot needs to be handed out. This patch seems to pass all of the t test suite, at least, but I'm still not 100% sure it's correct.

Then attachment:TRAC-757-2.patch Download tries to work around the problem shown in the core dumps, so that VTABLE_delete_keyed_str() is only called for a class name if it exists. I confess that I don't really understand what the correct logic should be here. It looks like the logic is something like:

  • clone the class_hash from s to d
  • assume that d->n_vtable_max was set to some initial default (maybe enum_class_core_max??)
  • for everything above that in s->vtables, if it's a pmc_class:
    • delete it from d's cloned class_hash
    • decrement d->n_vtable_max unless we've skipped over something in the loop

With this second change, I believe prove t/pmc/threads.t still works, however, now the attachment:tt757.pir Download crashes in a different place, a little further along, inside pt_clone_globals() near the end of clone_interpreter(). This is where I get truly far beyond my depth into Parrot's guts. It looks like the crash occurs in a string comparison which is invoked, effectively, by make_local_copy() in thread.c; I notice that function has some comments along the lines of this is a workaround for cloning subroutines not actually working as one might expect.... But I really don't know if that has anything to do with the problem, or, in fact, if my change to clone_interpreter() is the right thing to do. For whatever it's worth, here is the stack trace when trying attachment:tt757.pir Download with both of my patches.

Unfortunately I just don't have any more time to spend trying to look at this for a few weeks, but maybe this will help someone, even if the solution turns out to be in a different direction altogether.

Program terminated with signal 11, Segmentation fault.
#0  0xb7e51ccd in ascii_compare (interp=0x81325f0, lhs=0x82716cc, 
    rhs=0x81dad94) at src/string/charset/ascii.c:555
555             const int ret_val = memcmp(lhs->strstart, rhs->strstart, min_len);
(gdb) bt
#0  0xb7e51ccd in ascii_compare (interp=0x81325f0, lhs=0x82716cc, 
    rhs=0x81dad94) at src/string/charset/ascii.c:555
#1  0xb7d1fe03 in STRING_compare (interp=0x81325f0, search_key=0x82716cc, 
    bucket_key=0x81dad94) at src/hash.c:212
#2  0xb7d1fa3a in parrot_hash_get_bucket (interp=0x81325f0, hash=0x81bb848, 
    key=0x82716cc) at src/hash.c:1207
#3  0xb7d1fbd6 in parrot_hash_get (interp=0x81325f0, hash=0x81bb848, 
    key=0x82716cc) at src/hash.c:1237
#4  0xb7ee4f36 in Parrot_NameSpace_get_pmc_keyed_str (interp=0x81325f0, 
    pmc=0x8193488, key=0x82716cc) at ./src/pmc/namespace.pmc:386
#5  0xb7d1e62c in internal_ns_keyed_key (interp=0x81325f0, ns=0x8193488, 
    key=0x80c4ce8, flags=1) at src/global.c:158
#6  0xb7d1e818 in internal_ns_keyed (interp=0x81325f0, base_ns=0x817f410, 
    pmc_key=0x80c4d00, flags=1) at src/global.c:199
#7  0xb7d1e9f2 in Parrot_make_namespace_keyed (interp=0x81325f0, 
    base_ns=0x817f410, pmc_key=0x80c4d00) at src/global.c:328
#8  0xb7d1eb7a in get_namespace_pmc (interp=0x81325f0, sub_pmc=0x82371f0)
    at src/global.c:737
#9  0xb7d1ec9a in Parrot_store_sub_in_namespace (interp=0x81325f0, 
    sub_pmc=0x82371f0) at src/global.c:803
#10 0xb7d5a3c8 in make_local_copy (interp=0x81325f0, from=0x804f040, 
    arg=0x80c4d18) at src/thread.c:213
#11 0xb7d5a87e in pt_ns_clone (interp=0x804f040, d=0x81325f0, 
    dest_ns=0x817f410, s=0x804f040, source_ns=0x80b1120) at src/thread.c:634
#12 0xb7d5a82f in pt_ns_clone (interp=0x804f040, d=0x81325f0, 
    dest_ns=0x817f440, s=0x804f040, source_ns=0x80b1150) at src/thread.c:628
#13 0xb7d5a9bf in pt_clone_globals (d=0x81325f0, s=0x804f040)
    at src/thread.c:666
#14 0xb7e709b4 in clone_interpreter (d=0x81325f0, s=0x804f040, flags=127)
    at ./src/pmc/parrotinterpreter.pmc:143
#15 0xb7e70b0f in do_thread_run (interp=0x804f040, thread=0x8102ff8, 
    clone_flags=127, sub=0x80c52e8, args=0x0) at ./src/pmc/parrotthread.pmc:66
#16 0xb7e70c0e in do_thread_run_clone_default (interp=0x804f040, 
    thread=0x8102ff8, sub=0x80c52e8, args=0x0) at ./src/pmc/parrotthread.pmc:81

Changed 7 years ago by kyle_l5l

I think attachment:TRAC-757-1.patch Download should not change src/pmc.c. The rest seems fine.

With the two patches applied, I see a failed assertion at vtables.c:95. This is because we end up with a gap in the vtables[] array for the new interpreter. At src/pmc/parrotinterpreter.pmc:84, the new interpreter only has vtable entries for the core PMCs, while the existing interpreter has a few more. If we set the new interpreter's n_vtable_max field to that of the existing interpreter, that causes any new vtables we add to leave that gap in the vtables[] array. See attachment:TRAC-757-3.patch Download , which replaces attachment:TRAC-757-2.patch Download . With this patch, after the loop to delete the non-core class names from the new interpreter, we adjust n_vtable_max back to the correct value. (This seems a bit strange, but it works.) Now, we still pass the threads.t tests, and attachment:tt757.pir Download works.

Changed 7 years ago by kyle_l5l

correct n_vtable_max in the cloned interpreter

Changed 7 years ago by moritz

I applied patches 1 and 3 locally here, and this is what 'make test' produced:

#   Failed test 'test compiling anonymous and named grammars'
#   at t/compilers/tge/grammar.t line 28.
# Exited with error code: [SIGNAL 11]

Rakudo builds fine on top of a parrot with this patch, and 'make spectest' looks clean so far too - if anything goes wrong I'll comment here again in an hour or so.

Changed 5 years ago by whiteknight

  • status changed from assigned to new

Changed 5 years ago by jkeenan


Could we get an update on the status of this ticket?

Thank you very much.


Changed 5 years ago by whiteknight

  • status changed from new to closed
  • resolution set to invalid

threads are gone now. I'm closing this ticket.

Note: See TracTickets for help on using tickets.