Ticket #1393 (closed bug: fixed)

Opened 5 years ago

Last modified 5 years ago

src/gc/api.c: Intermittent test failures at line 245 since r43211

Reported by: jkeenan Owned by: jkeenan
Priority: major Milestone:
Component: testing Version: 1.9.0
Severity: high Keywords:
Cc: mikehh, Util, lithos Language:
Patch status: applied Platform:

Description

Observed on Linux/i386. When I run a smolder test, I get a FAIL:

t/compilers/pge/pge_examples.t .............. 
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/2 subtests 

Test Summary Report
-------------------
t/compilers/pge/pge_examples.t            (Wstat: 256 Tests: 2 Failed: 1)
  Failed test:  2
  Non-zero exit status: 1
Files=340, Tests=12168, 254 wallclock secs 
  ( 1.80 usr  0.19 sys + 63.86 cusr  8.54 csys = 74.39 CPU)
Result: FAIL

But when I run the individual test file, it PASSes:

$ prove -v t/compilers/pge/pge_examples.t
t/compilers/pge/pge_examples.t .. 
1..2
ok 1 - This made Parrot m4 fail
ok 2 - parse FASTA
ok
All tests successful.
Files=1, Tests=2,  0 wallclock secs 
  ( 0.01 usr  0.01 sys +  0.34 cusr  0.05 csys =  0.41 CPU)
Result: PASS

First observed at r43237 on Dec 24. Still occurring on Dec 26 at r43246. Also observed by cotto.

No substantive changes in the test file itself since Nov 15. I suspect the problem emerged between r43138 and r43237 (inclusive) -- a span that appears to have had a branch merged in.

kid51

Attachments

tt_1393_debug_prints.diff (2.6 KB) - added by lithos 5 years ago.
debug prints used to diagnose the problem
tt_1393_log.txt (1.8 KB) - added by lithos 5 years ago.
output generated by the debug prints just before the abort
tt_1393_log2.txt (2.0 KB) - added by lithos 5 years ago.
log showing that the tailcall op is involved [fixed typo "context" -> "continuation"]
tt_1393_debug_prints2.diff (3.6 KB) - added by lithos 5 years ago.
tt_1393_debug_changes3.diff (4.6 KB) - added by lithos 5 years ago.
changes used to reproduce and diagnose jkeenan's test fail
tt_1393_log3.txt (6.7 KB) - added by lithos 5 years ago.
debug log starting with the creation of the guilty RetContinuation 0x8a93d44

Change History

in reply to: ↑ description ; follow-up: ↓ 2   Changed 5 years ago by jkeenan

Replying to jkeenan:

No substantive changes in the test file itself since Nov 15. I suspect the problem emerged between r43138 and r43237 (inclusive) -- a span that appears to have had a branch merged in.

Bisecting: problem occurred in or before r43210.

in reply to: ↑ 1   Changed 5 years ago by jkeenan

Replying to jkeenan:

Bisecting: problem occurred in or before r43210.

But upon further inspection, it appears that in r43210, the test itself was failing: This failure was apparently corrected, but the harness is still registering a failure.

  Changed 5 years ago by jkeenan

I'm not seeing any Smolder test failures that can be attributed to the patches applied to this ticket. However, we are getting failures on our Win32 smoke tests in t/library/pcre.t.

1..2
ok 1 - libpcre loading
not ok 2 - soup to nuts

#   Failed test 'soup to nuts'
#   at t/library/pcre.t line 104.
#          got: 'ok 1
# ok 2
# not ok 3
# not ok 4
# not ok 5
# '
#     expected: 'ok 1
# ok 2
# ok 3
# ok 4
# ok 5
# '
# Looks like you failed 1 test of 2.

I can't attribute those failures to the Win32 patch which was the last part applied from this ticket, but, out of caution, I am going to hold the ticket open.

kid51

  Changed 5 years ago by jkeenan

The last submission to this ticket was meant for TT #886. Please ignore.

  Changed 5 years ago by jkeenan

And, just to make things more aggravating ... this test file is passing for me at HEAD on Darwin/PPC and is being correctly handled by the test harness (make test). So it just gets weirder.

  Changed 5 years ago by jkeenan

Hope this debugging output enables someone to diagnose why this is failing only when run thru a harness and not on all OSes:

perl t/harness --gc-debug t/compilers/pge/pge_examples.t 
t/compilers/pge/pge_examples.t .. 1/2 
#   Failed test 'parse FASTA'
#   at t/compilers/pge/pge_examples.t line 59.
# Exited with error code: [SIGNAL 6]
# Received:
# src/gc/api.c:245: failed assertion 'PObj_is_PMC_TEST(obj)'
# Backtrace - Obtained 32 stack frames (max trace depth is 32).
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400e45b2]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_confess+0x9a) [0x400e471a]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_gc_mark_PMC_alive_fun+0x89) [0x400f2c99]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40287f7f]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_gc_mark_PMC_alive_fun+0x116) [0x400f2d26]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40288197]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_gc_mark_PMC_alive_fun+0x116) [0x400f2d26]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f6ee9]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f51f2]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f54b5]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f29b8]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f508d]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f4f04]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f2a3b]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f33b8]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_str_new_COW+0x8f) [0x400539ef]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4010bb46]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40237c50]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400faeab]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4027cf69]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4028edee]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x402ba09d]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4006a77c]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4015fdde]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4015e33f]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x401078cf]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_pcc_invoke_from_sig_object+0x1e9) [0x400fddd9]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_pcc_invoke_sub_from_c_args+0xd3) [0x400fdee3]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40157890]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40157a5b]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (do_sub_pragmas+0x1a2) [0x40157d12]
# /topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40157ee7]
# 
# Expected:
# >gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
# LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
# EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
# LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
# GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
# IENY
# >poly_a teasing the parser with DNA
# aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
# 
# Looks like you failed 1 test of 2.
t/compilers/pge/pge_examples.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/2 subtests 

Test Summary Report
-------------------
t/compilers/pge/pge_examples.t (Wstat: 256 Tests: 2 Failed: 1)
  Failed test:  2
  Non-zero exit status: 1
Files=1, Tests=2,  1 wallclock secs ( 0.00 usr  0.01 sys +  0.19 cusr  0.02 csys =  0.22 CPU)
Result: FAIL

  Changed 5 years ago by jkeenan

On #parrot, Coke suggested: does it still die if you disable GC, can you reduce the code needed to generate the signal, can you capture a backtrace in gdb?

Disabling GC 'worked':

$ perl t/harness t/compilers/pge/pge_examples.t
t/compilers/pge/pge_examples.t .. ok
All tests successful.
Files=1, Tests=2,  0 wallclock secs 
  ( 0.02 usr  0.00 sys +  0.34 cusr  0.01 csys =  0.37 CPU)
Result: PASS

So the --gc-debug is an essential factor in this problem -- though that still doesn't shed light on why I get this on Linux/i386 but not on Darwin/PPC.

The plot thickens.
kid51

  Changed 5 years ago by jkeenan

$ gdb ./parrot
GNU gdb 6.8-debian
...
This GDB was configured as "i486-linux-gnu"...

(gdb) r t/compilers/pge/pge_examples_2.pir

Starting program: /topdir/work/parrot/parrot t/compilers/pge/pge_examples_2.pir
[Thread debugging using libthread_db enabled]
warning: Lowest section in /usr/lib/libicudata.so.36 is .hash at 000000b4
>gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
IENY
>poly_a teasing the parser with DNA
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

Program exited normally.

follow-up: ↓ 10   Changed 5 years ago by jkeenan

  • cc mikehh added

Today I had the time to do some bisection on this problem. It appears that the culprit is r43211:

Index: src/ops/pmc.ops
===================================================================
--- src/ops/pmc.ops     (revision 43210)
+++ src/ops/pmc.ops     (revision 43211)
@@ -51,7 +51,9 @@
 
 op new(out PMC, in STR) {
     STRING * const name   = $2;
-    PMC    * const _class = Parrot_oo_get_class_str(interp, name);
+    PMC    * const _class = Parrot_pcc_get_HLL(interp, CURRENT_CONTEXT(interp))
+                          ? Parrot_oo_get_class_str(interp, name)
+                          : PMCNULL;
 
     if (!PMC_IS_NULL(_class))
         $1 = VTABLE_instantiate(interp, _class, PMCNULL);
@@ -69,7 +71,9 @@
 
 op new(out PMC, in STR, in PMC) {
     STRING * const name   = $2;
-    PMC    * const _class = Parrot_oo_get_class_str(interp, name);
+    PMC    * const _class = Parrot_pcc_get_HLL(interp, CURRENT_CONTEXT(interp))
+                          ? Parrot_oo_get_class_str(interp, name)
+                          : PMCNULL;
 
     if (!PMC_IS_NULL(_class))
         $1 = VTABLE_instantiate(interp, _class, $3);

Beginning with r43211, perl t/harness --gc-debug t/compilers/pge/pge_examples.t gives the FAIL described in previous posts, while perl t/harness t/compilers/pge/pge_examples.t gives a PASS.

Why such a small change should cause such an unusual FAIL I cannot say. mikehh, can you take a look? Thanks.

kid51

in reply to: ↑ 9   Changed 5 years ago by jkeenan

Replying to jkeenan:

Beginning with r43211,

See TT #1368 re r43211.

  Changed 5 years ago by jkeenan

See also this comment in Trac 473, where I get a similar SIGNAL 6 when testing a merge into trunk.

kid51

follow-up: ↓ 13   Changed 5 years ago by mikehh

this test does NOT fail on amd64 with all variants (gcc/g++ with or without --optimize)

I ran the test on i386 at r43380 and it ONLY failed on the gcc build (no --optimize).

see smoke #31459  http://smolder.plusthree.com/app/public_projects/report_details/31459

The test passes gcc build with --optimize and g++ builds with or without --optimize. smoke #31457, #31455, #31458 respectively.

All tests PASS (pre/post-config, make corevm/make coretest, smoke (#31455), fulltest) at r43380 - Ubuntu 9.10 i386 (g++ with --optimize)

All tests PASS (pre/post-config, make corevm/make coretest, smoke (#31457), fulltest) at r43380 - Ubuntu 9.10 i386 (gcc with --optimize)

All tests PASS (pre/post-config, make corevm/make coretest, smoke (#31458), fulltest) at r43380 - Ubuntu 9.10 i386 (g++)

t/compilers/pge/pge_examples.t - Failed test: 2 in smoke and fulltest [library_tests] all other tests PASS (pre/post-config, make corevm/make coretest, smoke (#31459), fulltest) at r43380 - Ubuntu 9.10 i386 (gcc)

regarding r43211 (TT #1368) - the change was introduced at r42924 and caused the failures detailed in the ticket. I did a partial reversion of this commit at r43033 to get the tests to pass for the release. After the merge of the context_unify3_simple branch the test passed with the changes in r42924 so I reversed the changes in r43033 at r43211. (IOW r43033 modified r42924 and r43211 restored it to its original form)

please also note that the test seems to pass in smoke #31454 (which is not mine) but seems to have been built on Ubuntu 9.10 i386 just using perl Configure.pl with no options (I tend to use extra options and perl 5.10.1).

in reply to: ↑ 12 ; follow-up: ↓ 14   Changed 5 years ago by jkeenan

Replying to mikehh:

I ran the test on i386 at r43380 and it ONLY failed on the gcc build (no --optimize). see smoke #31459  http://smolder.plusthree.com/app/public_projects/report_details/31459

Which is the case for which it is failing for me: Linux/i386, gcc --no-optimize. So this confirms that we have a problem.

in reply to: ↑ 13   Changed 5 years ago by jkeenan

Replying to jkeenan:

And if I build with --optimize the error in t/compilers/pge/pge_examples.t goes away.

But, since I've been submitting smoke tests from this box and without --optimize for years now, I'm not going to switch to --optimize simply to avoid getting a mysterious test failure.

kid51

  Changed 5 years ago by mikehh

I agree completely that we have a problem here. I was merely indicating that the problem did not occur with various options, not saying that we need not fix it because we get it to work under certain circumstances.

I normally build with gcc using the following configure options:

perl Configure.pl --optimize --test --maintainer --configure_trace

and try with and without the --optimize.

On Ubuntu 9.10 i386 we are having a problem with perl Configure.pl --test --maintainer --configure_trace

I ran a series of builds using different variations of this, at r43380, after a make realclean each time.

The --maintainer option appears to give different results (with or without --test and --configure_trace).

With the --maintainer option I get t/compilers/pge/pge_examples.t failing test 2 as covered in this ticket. prove passes and perl t/harness fails with the --gc-debug option but passes without.

If I drop the --maintainer option on the build the test passes, however the test t/library/test_more.t fails after test 60 (see TT #473, comments by jkeenan).

this test fails prove and perl t/harness with or without --gc-debug

but t/compilers/pge/pge_examples.t passes.

I tried this quite a few times and got the same results.

In both situations the test fails with:

src/gc/api.c:245: failed assertion 'PObj_is_PMC_TEST(obj)'

this is part of the function Parrot_gc_mark_PMC_alive_fun

and seems to be failing the assertion that the object being marked is a PMC.

Why this should happen and why the tests are ok in different builds and not others is what needs investigation and resolution.

I don't have any ideas at the moment but will continue investigating.

Hopefully we can get others (Whiteknight, bacek, chromatic perhaps) to have a look.

  Changed 5 years ago by jkeenan

  • priority changed from normal to major
  • severity changed from medium to high

mikehh, thanks for that analysis.

Now, to make matters more confusing ...

In  this Smolder test run at r43392 (last change at r43382), the problems in t/compilers/pge/pge_examples.t have mysteriously fixed themselves.

$ perl t/harness --gc-debug t/compilers/pge/pge_examples.t 
t/compilers/pge/pge_examples.t .. ok   
All tests successful.
Files=1, Tests=2,  1 wallclock secs ( 0.01 usr  0.00 sys +  0.41 cusr  0.03 csys =  0.45 CPU)
Result: PASS

But now, I'm getting failures in t/library/test_more.t:

$ perl t/harness --gc-debug t/library/test_more.t          
t/library/test_more.t .. 1/108 src/gc/api.c:245: failed assertion 'PObj_is_PMC_TEST(obj)'
Backtrace - Obtained 32 stack frames (max trace depth is 32).
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400e4572]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(Parrot_confess+0x9a) [0x400e46da]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(Parrot_gc_mark_PMC_alive_fun+0x89) [0x400f2b89]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40287dcf]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(Parrot_gc_mark_PMC_alive_fun+0x116) [0x400f2c16]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40287fe7]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(Parrot_gc_mark_PMC_alive_fun+0x116) [0x400f2c16]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f6dd9]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f50e2]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f53a5]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f28a8]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f4f7d]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f4df4]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f292b]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f32a8]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(Parrot_str_new_COW+0x8f) [0x400539af]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4010ba36]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40237aa0]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f7dec]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f8291]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(Parrot_get_namespace_keyed+0xad) [0x400f853d]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(Parrot_oo_get_class+0x13f) [0x4014fcff]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4007a3c8]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4015fc2e]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4015e18f]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x401077bf]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(Parrot_pcc_invoke_from_sig_object+0x1e9) [0x400fdcc9]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(Parrot_pcc_invoke_sub_from_c_args+0xd3) [0x400fddd3]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x401576e0]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x401578ab]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0(do_sub_pragmas+0x1a2) [0x40157b62]
/home/jimk/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40157d37]
t/library/test_more.t .. Failed 48/108 subtests 

Test Summary Report
-------------------
t/library/test_more.t (Wstat: 6 Tests: 60 Failed: 0)
  Non-zero wait status: 6
  Parse errors: Bad plan.  You planned 108 tests but ran 60.
Files=1, Tests=60,  0 wallclock secs ( 0.04 usr  0.00 sys +  0.07 cusr  0.01 csys =  0.12 CPU)
Result: FAIL
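
The Wstat values in these summaries are raw wait statuses. The following is a minimal sketch, assuming a POSIX system with the conventional status encoding (as on Linux), of how the two values seen in this ticket decode. The helper names are hypothetical and are not part of Parrot or of the test harness.

```c
#include <assert.h>
#include <sys/wait.h>

/* Exit code if the child exited normally, otherwise -1. */
int wstat_exit_code(int wstat)
{
    return WIFEXITED(wstat) ? WEXITSTATUS(wstat) : -1;
}

/* Terminating signal number if the child was killed by a signal, otherwise 0. */
int wstat_signal(int wstat)
{
    return WIFSIGNALED(wstat) ? WTERMSIG(wstat) : 0;
}

/* Self-check with the two values reported in this ticket
 * (assumes the conventional encoding used on Linux). */
void check_ticket_values(void)
{
    assert(wstat_exit_code(256) == 1);  /* "wstat 256, 0x100": exit status 1 */
    assert(wstat_signal(6) == 6);       /* "Wstat: 6": killed by signal 6    */
}
```

Under that assumption, 256 is a normal exit with status 1, which is how the harness reported pge_examples.t, while 6 is termination by signal 6 (SIGABRT on Linux), which matches the abort from the failed assertion in test_more.t.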

Astute readers of my posts will recall that I reported the same failures in a completely different ticket -- TT #473. In that ticket, where I am trying to merge the tt473_remove_memcpy_aligned branch into trunk, I got this error for a while, only to see it clear up and be replaced by errors in t/compilers/pct/complete_workflow.t.

We are clearly building up some massive technical debt here. I have not been able to get a 100% PASS on make test in two weeks -- and this on the least exotic platform with the least exotic compilation options. And I have been unable to complete merges that should have been a slam dunk because they were only removing unused code.

This is having a severe impact on what I can contribute to the Parrot project. If I can't get a clear PASS on Linux/i386, I can't confidently apply other contributors' patches. And I will have no confidence in Parrot 2.0 until I can get a PASS.

So I'm upping the priority and severity on this ticket. Thank you very much.

kid51

  Changed 5 years ago by lithos

Hello! I think I tracked down the cause of this bug:

In short: A RetContinuation PMC frees itself but there is still at least one pointer to it from a CallContext PMC. The context then tries to mark the RetContinuation (that is already on the free list) and thus triggers an abort.

The culprit: src/pmc/retcontinuation.pmc:85:

        /* recycle this PMC and make sure it doesn't get marked */
        if (!PMC_IS_NULL(from_ctx))
            Parrot_pcc_set_continuation(interp, from_ctx, NULL);
        Parrot_gc_free_pmc_header(interp, SELF);

I'll attach a patch with the debug prints I used plus the log output.

The log shows that the continuation seems to be propagated from one context to another context. So the RetContinuation PMC is wrong in assuming that it knows the single pointer to itself.

Hope this helps to fix this issue!
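
The diagnosis above can be illustrated with a toy model. This is only a sketch under stated assumptions: Header, Context, FLAG_IS_PMC, recycle_self and mark_context are hypothetical stand-ins, not Parrot's real types or functions, and the sketch assumes that a recycled header no longer carries the flag that the marking code checks. Where the real code aborts on the assertion, the sketch returns -1.

```c
#include <assert.h>
#include <stddef.h>

#define FLAG_IS_PMC 0x1u  /* hypothetical stand-in for the flag tested by the assertion */

typedef struct Header  { unsigned flags; } Header;
typedef struct Context { Header *continuation; } Context;

/* The self-freeing step: clear the ONE referrer the object knows about,
 * then recycle the header (modelled here by clearing its flags). */
void recycle_self(Header *self, Context *known_referrer)
{
    assert(self != NULL);
    if (known_referrer != NULL)
        known_referrer->continuation = NULL;
    self->flags = 0;
}

/* Marking a context: returns 0 if its continuation is absent or still a
 * live PMC, and -1 where the real marking code would hit the assertion. */
int mark_context(const Context *ctx)
{
    const Header *k = ctx->continuation;
    if (k == NULL)
        return 0;
    return (k->flags & FLAG_IS_PMC) ? 0 : -1;
}
```

If two contexts point at the same header and recycle_self is told about only one of them, marking the first context succeeds and marking the second one returns -1, which corresponds to the abort seen in the logs.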

Changed 5 years ago by lithos

debug prints used to diagnose the problem

Changed 5 years ago by lithos

output generated by the debug prints just before the abort

  Changed 5 years ago by bacek

Hello!

Main problem in Sub/RetContinuation interaction:

1. RetContinuation kills itself after invoke, setting from_ctx->current_cont to NULL.
2. OTOH, Sub.invoke can attach the same continuation to a different CallContext.

I'm not sure why this happens, but I strongly recommend removing line 88 in retcontinuation.pmc. Moreover, because RetContinuation's from_obj is always NULL (afaiu), we can remove the RetContinuation PMC entirely and use a normal Continuation.

-- Bacek

  Changed 5 years ago by lithos

BTW, the reason that the same continuation is attached to another context is a tailcall (at least in this case).

src/ops/core.ops:466

    interp->current_cont        = Parrot_pcc_get_continuation(interp, ctx);
    [...]
    dest = VTABLE_invoke(interp, p, dest);
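
A toy sketch of how a tailcall can leave a second referrer behind. All names here (Cont, Ctx, attach, tailcall_share, tailcall_move, invoke_and_recycle, dangling) are hypothetical and only model the behaviour described in this ticket. tailcall_move is one conceivable way to keep a single referrer and is not claimed to be what any Parrot revision actually does.

```c
#include <assert.h>
#include <stddef.h>

struct Ctx;
typedef struct Cont { struct Ctx *from_ctx; int live; } Cont;
typedef struct Ctx  { Cont *continuation; } Ctx;

/* Create the link between a continuation and the context it belongs to. */
void attach(Cont *k, Ctx *ctx)
{
    k->from_ctx = ctx;
    k->live = 1;
    ctx->continuation = k;
}

/* Tailcall as observed: the callee's context reuses the caller's
 * continuation, but the continuation still only knows the caller. */
void tailcall_share(Ctx *caller, Ctx *callee)
{
    callee->continuation = caller->continuation;
}

/* Hypothetical alternative: hand the continuation over so that exactly
 * one context refers to it and from_ctx stays accurate. */
void tailcall_move(Ctx *caller, Ctx *callee)
{
    Cont *k = caller->continuation;
    assert(k != NULL);
    callee->continuation = k;
    caller->continuation = NULL;
    k->from_ctx = callee;
}

/* Invoke: detach from the one context the continuation knows, then recycle. */
void invoke_and_recycle(Cont *k)
{
    if (k->from_ctx != NULL)
        k->from_ctx->continuation = NULL;
    k->live = 0;
}

/* Count contexts that still point at a recycled continuation. */
int dangling(const Ctx *ctxs, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (ctxs[i].continuation != NULL && !ctxs[i].continuation->live)
            count++;
    return count;
}
```

With tailcall_share followed by invoke_and_recycle, one of the two contexts is left with a dangling pointer; with tailcall_move, none is.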

Changed 5 years ago by lithos

log showing that the tailcall op is involved [fixed typo "context" -> "continuation"]

Changed 5 years ago by lithos

follow-up: ↓ 21   Changed 5 years ago by bacek

Hello.

I added updating CallContext in tailcalls in r43414. But honestly I still think that we have to remove RetContinuation altogether.

Keep ticket open for "final" decision.

-- Bacek.

in reply to: ↑ 20   Changed 5 years ago by jkeenan

Replying to bacek:

Hello. I added updating CallContext in tailcalls in r43414. But honestly I still think that we have to remove RetContinuation altogether. Keep ticket open for "final" decision.

On the same Linux/i386 box on which I have been reporting all along, the situation is unchanged as of r43417. As previously reported, while pge_examples.t is now passing, t/library/test_more.t continues to FAIL.

Test Summary Report
-------------------
t/library/test_more.t         (Wstat: 6 Tests: 60 Failed: 0)
  Non-zero wait status: 6
  Parse errors: Bad plan.  You planned 108 tests but ran 60.
Files=340, Tests=12160, 245 wallclock secs 
  ( 1.66 usr  0.18 sys + 63.66 cusr  8.04 csys = 73.54 CPU)
Result: FAIL

Moreover, these failures are not merely some weird side effect of running the files through the harness with --gc-debug in force. They occur just by running the test with prove.

src/gc/api.c:245: failed assertion 'PObj_is_PMC_TEST(obj)'
Backtrace - Obtained 32 stack frames (max trace depth is 32).
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400e49e2]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_confess+0x9a) [0x400e4b4a]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_gc_mark_PMC_alive_fun+0x89) [0x400f2ff9]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4028875f]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_gc_mark_PMC_alive_fun+0x116) [0x400f3086]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40288977]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_gc_mark_PMC_alive_fun+0x116) [0x400f3086]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f7249]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f5552]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f5815]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f2d18]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f53ed]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f5264]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f2d9b]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f3718]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_str_new_COW+0x8f) [0x400539ff]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4010bf26]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40238430]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f825c]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f8701]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_get_namespace_keyed+0xad) [0x400f89ad]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_oo_get_class+0x13f) [0x401501ef]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4007a418]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4016011e]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4015e67f]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40107caf]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_pcc_invoke_from_sig_object+0x1e9) [0x400fe1b9]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_pcc_invoke_sub_from_c_args+0xd3) [0x400fe2c3]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40157bd0]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40157d9b]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (do_sub_pragmas+0x1a2) [0x40158052]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40158227]
Failed 48/108 subtests 

Test Summary Report
-------------------
t/library/test_more.t (Wstat: 6 Tests: 60 Failed: 0)
  Non-zero wait status: 6
  Parse errors: Bad plan.  You planned 108 tests but ran 60.
Files=1, Tests=60,  0 wallclock secs 
  ( 0.02 usr  0.00 sys +  0.08 cusr  0.00 csys =  0.10 CPU)
Result: FAIL

follow-up: ↓ 23   Changed 5 years ago by bacek

Hello.

Interesting. Can you rebuild parrot with --nolinedirectives? It will help with investigations.

-- Bacek

in reply to: ↑ 22   Changed 5 years ago by jkeenan

Replying to bacek:

Can you rebuild parrot with --nolinedirectives?

That's perl Configure.pl --no-line-directives.

Similar results:

ok 60 - failing test isnt() for pmcs with description
src/gc/api.c:245: failed assertion 'PObj_is_PMC_TEST(obj)'
Backtrace - Obtained 32 stack frames (max trace depth is 32).
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400e49e2]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_confess+0x9a) [0x400e4b4a]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_gc_mark_PMC_alive_fun+0x89) [0x400f2ff9]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4028875f]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_gc_mark_PMC_alive_fun+0x116) [0x400f3086]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40288977]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_gc_mark_PMC_alive_fun+0x116) [0x400f3086]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f7249]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f5552]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f5815]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f2d18]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f53ed]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f5264]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f2d9b]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f3718]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_str_new_COW+0x8f) [0x400539ff]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4010bf26]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40238430]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f825c]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x400f8701]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_get_namespace_keyed+0xad) [0x400f89ad]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_oo_get_class+0x13f) [0x401501ef]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4007a418]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4016011e]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x4015e67f]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40107caf]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_pcc_invoke_from_sig_object+0x1e9) [0x400fe1b9]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (Parrot_pcc_invoke_sub_from_c_args+0xd3) [0x400fe2c3]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40157bd0]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40157d9b]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0
  (do_sub_pragmas+0x1a2) [0x40158052]
/topdir/work/parrot/blib/lib/libparrot.so.1.9.0 [0x40158227]
Failed 48/108 subtests 

Test Summary Report
-------------------
t/library/test_more.t (Wstat: 6 Tests: 60 Failed: 0)
  Non-zero wait status: 6
  Parse errors: Bad plan.  You planned 108 tests but ran 60.
Files=1, Tests=60,  1 wallclock secs 
  ( 0.04 usr  0.00 sys +  0.12 cusr  0.02 csys =  0.18 CPU)
Result: FAIL

  Changed 5 years ago by lithos

Hello!

I was not able to reproduce jkeenan's test FAIL out-of-the-box. However, if I add a mark-and-sweep call directly after the self-freeing call in src/pmc/retcontinuation.pmc like this (in order to immediately trigger bugs caused by left-over pointers to the RetContinuation):

        Parrot_gc_free_pmc_header(interp, SELF);
        Parrot_gc_mark_and_sweep(interp, 0);

then I get the exact same assertion failure after test 60.
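
Why a forced collection makes the failure reproducible can be sketched with a small simulation. It is a toy model under stated assumptions, not Parrot code: stale_pointer_detected and its parameters are hypothetical, a "collection" is reduced to checking one slot, and allocations are reduced to loop steps.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct Obj { bool live; } Obj;

/* Returns true if some collection observed the stale pointer.
 *   steps_until_overwrite: allocations before the stale slot is reused
 *   gc_interval:           a collection runs every gc_interval allocations
 *   collect_after_free:    force a collection right after the free        */
bool stale_pointer_detected(int steps_until_overwrite, int gc_interval,
                            bool collect_after_free)
{
    Obj cont  = { true };
    Obj other = { true };
    Obj *slot = &cont;            /* the left-over pointer */

    assert(gc_interval > 0);

    cont.live = false;            /* the object frees itself; slot is now stale */
    if (collect_after_free && !slot->live)
        return true;              /* the forced collection sees it immediately */

    for (int step = 1; step <= steps_until_overwrite + gc_interval; step++) {
        if (step == steps_until_overwrite)
            slot = &other;        /* stale pointer happens to be overwritten */
        if (step % gc_interval == 0 && !slot->live)
            return true;          /* a regular collection sees the stale pointer */
    }
    return false;                 /* the bug stayed hidden */
}
```

In this model the stale pointer is noticed only if a collection happens before the slot is overwritten, so the outcome depends on allocation patterns and collection frequency. Forcing a collection directly after the free removes that dependency, which is consistent with the change above making the assertion fire every time.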

The continuation is propagated to a second context in the Parrot_tailcallmethod_p_sc opcode in this case. The continuation is not detached in this case because Parrot_pcc_do_run_ops returned 0.

Output by my debug prints directly before the assertion failure:

tailcallmethod PMC STR 1 0
tailcallmethod PMC STR invoking with continuation 0x8d38d44
2515: ./src/pmc/sub.pmc:401: setting continuation 0x8d38d44 in context 0x8db7c54

where the "1" means it is a RetContinuation PMC and the "0" is the return value of Parrot_pcc_do_run_ops.

Changed 5 years ago by lithos

changes used to reproduce and diagnose jkeenan's test fail

  Changed 5 years ago by lithos

FYI, the parrot invocation was:

./parrot t/library/test_more.t 2>LOG

The output to stdout ends with:

ok 60 - failing test isnt() for pmcs with description
Aborted

Changed 5 years ago by lithos

debug log starting with the creation of the guilty RetContinuation 0x8a93d44

follow-ups: ↓ 27 ↓ 29   Changed 5 years ago by whiteknight

I just tried to reproduce this failure on Linux-x64 with no luck. I'm trying to wade through all the posts above, but I'm not sure I have a consistent view of the problem. What platforms is this test failing on? What are the incantations I need to reproduce it?

in reply to: ↑ 26   Changed 5 years ago by jkeenan

Replying to whiteknight:

> I just tried to reproduce this failure on Linux-x64 with no luck. 
> I'm trying to wade through all the posts above, but I'm not sure 
> I have a consistent view of the problem. What platforms is this 
> test failing on? What are the incantations I need to reproduce it?

I understand your confusion. One of the reasons why this bug has been so difficult to fix is that it falls into that maddening category of bugs that manifest themselves consistently on one system but cannot be reproduced on others. In addition, the file that fails mysteriously during testing has mysteriously changed from t/compilers/pge/pge_examples.t to t/library/test_more.t.

I hope to find the time to write to the list summarizing the current state of this problem. But for now let me list the following:

* No failures for me on Darwin/PPC.

* No failures for most other testers on Linux 64-bit.

* No failures for me on Linux/i386 when I configure with --optimize.

* Since r43211, I have consistently gotten failures on my Linux/i386 (Debian stable) when I do not configure with --optimize. Most recently, these failures have shown up in t/library/test_more.t and have been reported on Smolder repeatedly. The failures on this file -- unlike the earlier failures on t/compilers/pge/pge_examples.t -- occur regardless of whether I run the harness with --gc-debug or not; they occur during prove as well.

For well over two years I have been smoking Parrot on this box, always with simply perl Configure.pl --test --configure_trace -- i.e., with no optimization or fancy stuff. The past three weeks have been the longest period in that time during which I have been unable to get a 100% PASS from make test on that box.

Thank you very much.
kid51

  Changed 5 years ago by jkeenan

And to make things more mysterious ...

09:28 kid51 Well here's some good news ...
09:28 kid51 I just did 'svn up' on trunk after plobsing's merges.
09:28 kid51 Am currently at r43437.
09:28 kid51 I got the same smoke results in trunk as I did in the branch ...
09:29 kid51 i.e., PASS with t/pmc/eval.t TODO passed:   12
09:29 kid51 ... which means that the failure being discussed in TT #1393 did NOT occur!
09:29 kid51 The dog did not bark!
09:30 plobsing_ so the changes hide the bug? yay! sort of

in reply to: ↑ 26 ; follow-ups: ↓ 31 ↓ 37   Changed 5 years ago by lithos

Replying to whiteknight:

> I just tried to reproduce this failure on Linux-x64 with no luck. I'm trying to wade through all the posts above, but I'm not sure I have a consistent view of the problem. What platforms is this test failing on? What are the incantations I need to reproduce it?

To summarize my (limited) view of the problem:

RetContinuation wrongly assumes that it knows all pointers to itself (namely a single one from a context) and puts itself on the free list. If there are still pointers to the RetContinuation around, and a GC run is triggered while these pointers are still live, the assertion failure happens when marking the (former) RetContinuation. The requirement that a GC run happen while these pointers are still live is, IMHO, the reason this bug is so elusive. For example, it could depend on C compiler optimizations, if a pointer to a context holding a now-invalid pointer to the RetContinuation is still on the stack.
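
To make the mechanism concrete, here is a toy model in C. It is only a sketch, not Parrot code: the `Obj` struct, the `is_pmc` flag, the fixed-size pool and all function names are invented for illustration, and it assumes that returning a header to the free list clears the flag that the assertion checks. The point it tries to show is that the failure needs both a leftover pointer and a collection that runs while that pointer is still live.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object header; is_pmc stands in for the flag the assertion tests. */
typedef struct Obj {
    int is_pmc;         /* cleared when the header goes back on the free list */
    int live;           /* mark bit set by the collector */
    struct Obj *child;  /* one outgoing reference, e.g. context -> continuation */
} Obj;

#define POOL_SIZE 4
static Obj pool[POOL_SIZE];
static Obj *free_list[POOL_SIZE];
static size_t free_count;

/* Put every header of the toy pool on the free list. */
void pool_init(void) {
    free_count = 0;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool[i].is_pmc = 0;
        pool[i].live = 0;
        pool[i].child = NULL;
        free_list[free_count++] = &pool[i];
    }
}

/* Take a header off the free list and make it a valid object. */
Obj *obj_new(void) {
    assert(free_count > 0);
    Obj *o = free_list[--free_count];
    o->is_pmc = 1;
    o->live = 0;
    o->child = NULL;
    return o;
}

/* Eager free: the object recycles itself without asking who still points at it. */
void obj_free(Obj *o) {
    o->is_pmc = 0;
    o->child = NULL;
    free_list[free_count++] = o;
}

/* Mark phase: returns 1 on success, 0 where the real code would abort. */
int mark(Obj *o) {
    if (o == NULL)
        return 1;
    if (!o->is_pmc)
        return 0;       /* marking a recycled header */
    if (o->live)
        return 1;
    o->live = 1;
    return mark(o->child);
}

/* Returns 1 if no failure is observed, 0 if the mark phase hits a freed header. */
int run_scenario(int stale_pointer_left, int gc_runs_in_window) {
    pool_init();
    Obj *ctx = obj_new();
    Obj *other_ctx = obj_new();  /* a second holder the continuation does not know about */
    Obj *cont = obj_new();
    ctx->child = cont;
    other_ctx->child = cont;

    /* "invoke": clear the one known pointer, then recycle the continuation */
    ctx->child = NULL;
    obj_free(cont);
    if (!stale_pointer_left)
        other_ctx->child = NULL;

    if (!gc_runs_in_window)
        return 1;                /* no collection yet: the bug stays hidden */
    return mark(ctx) && mark(other_ctx);
}
```

In this model `run_scenario(1, 1)` is the only combination that reports a failure; leaving the stale pointer in place without a collection, or running a collection after the pointer is gone, both pass.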

bacek's changes in r43414 fixed some, but I think not all cases where there is another pointer to the RetContinuation.

I think this modification unhides the bug (I'm not sure, however, whether it also breaks valid code):

Index: src/pmc/retcontinuation.pmc
===================================================================
--- src/pmc/retcontinuation.pmc	(revision 43432)
+++ src/pmc/retcontinuation.pmc	(working copy)
@@ -86,6 +86,7 @@
         if (!PMC_IS_NULL(from_ctx))
             Parrot_pcc_set_continuation(interp, from_ctx, NULL);
         Parrot_gc_free_pmc_header(interp, SELF);
+        Parrot_gc_mark_and_sweep(interp, GC_trace_stack_FLAG);
 
         if (INTERP->code != seg)
             Parrot_switch_to_cs(INTERP, seg, 1);

An incantation to show the bug with this change is:

./parrot t/library/test_more.t

But a simple "make" also does parrot invocations where the bug is triggered with the above patch applied.

If my assumptions are right, this modification should make the bug disappear:

Index: src/pmc/retcontinuation.pmc
===================================================================
--- src/pmc/retcontinuation.pmc (revision 43432)
+++ src/pmc/retcontinuation.pmc (working copy)
@@ -85,7 +85,6 @@ Transfers control to the calling context
         /* recycle this PMC and make sure it doesn't get marked */
         if (!PMC_IS_NULL(from_ctx))
             Parrot_pcc_set_continuation(interp, from_ctx, NULL);
-        Parrot_gc_free_pmc_header(interp, SELF);
 
         if (INTERP->code != seg)
             Parrot_switch_to_cs(INTERP, seg, 1);
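
The idea behind dropping the manual free can be sketched with a toy mark-and-sweep collector in C. This is an illustration under assumptions, not Parrot's GC: `Node`, `heap`, `collect` and the other names are hypothetical. It shows that when reclamation is left to the sweep, an object that is still referenced from somewhere simply survives, and it is reclaimed only once nothing reachable points at it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap cell for the toy collector. */
typedef struct Node {
    int in_use;
    int marked;
    struct Node *ref;   /* single outgoing reference */
} Node;

#define HEAP_SIZE 4
static Node heap[HEAP_SIZE];

/* Return every cell to the unused state. */
void heap_reset(void) {
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        heap[i].in_use = 0;
        heap[i].marked = 0;
        heap[i].ref = NULL;
    }
}

/* Hand out the first unused cell, or NULL if the toy heap is full. */
Node *node_alloc(void) {
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        if (!heap[i].in_use) {
            heap[i].in_use = 1;
            heap[i].marked = 0;
            heap[i].ref = NULL;
            return &heap[i];
        }
    }
    return NULL;
}

/* Follow the reference chain from one root and set mark bits. */
static void mark_from(Node *n) {
    while (n != NULL && !n->marked) {
        assert(n->in_use);   /* nothing is freed by hand, so this holds */
        n->marked = 1;
        n = n->ref;
    }
}

/* Mark from the roots, then sweep; returns the number of cells reclaimed. */
size_t collect(Node **roots, size_t nroots) {
    size_t reclaimed = 0;
    for (size_t i = 0; i < nroots; i++)
        mark_from(roots[i]);
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = 0;
            heap[i].ref = NULL;
            reclaimed++;
        } else {
            heap[i].marked = 0;   /* reset for the next cycle */
        }
    }
    return reclaimed;
}

/* Two contexts share one continuation; nobody frees it manually. */
void demo_deferred_reclaim(size_t *first, size_t *second) {
    heap_reset();
    Node *ctx = node_alloc();
    Node *other_ctx = node_alloc();
    Node *cont = node_alloc();
    assert(ctx && other_ctx && cont);
    ctx->ref = cont;
    other_ctx->ref = cont;

    ctx->ref = NULL;              /* the known pointer is cleared, nothing else */
    Node *roots[] = { ctx, other_ctx };
    *first = collect(roots, 2);   /* cont is still reachable via other_ctx */

    other_ctx->ref = NULL;
    *second = collect(roots, 2);  /* now unreachable, so the sweep reclaims it */
}
```

The trade-off in this sketch is that the cell is recycled later than with an eager free, but the collector never sees a half-recycled object.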

jkeenan, does the failure you see persist if you apply the second patch?

  Changed 5 years ago by bacek

Hello.

The last patch by lithos is exactly the same as the one I proposed in http://trac.parrot.org/parrot/ticket/1393#comment:18. We shouldn't even try to "help" GC in this case.

If no one complains, I'll do exactly this and put RetContinuation into the DEPRECATED pool (because without manual deletion there is no point in keeping it around).

-- Bacek

in reply to: ↑ 29   Changed 5 years ago by jkeenan

Replying to lithos:

> jkeenan, does the failure you see persist if you apply the second patch?

Well, as I reported above, in trunk the failure 'disappeared' after a completely different branch was merged in last night. But, as plobsing characterized it, this is probably just papering over the bug.

So I'll have to do a checkout of a revision from, say, early yesterday that still manifested the bug and test your patches there.

Realistically speaking, I may not be able to get to that until Friday evening. But I will definitely look into it by the end of the weekend.

Thank you very much.

kid51

  Changed 5 years ago by Util

  • cc Util added

On Darwin 10.5, part of the PGE build has been intermittently failing for several revisions. Because the failures occur at the same place (src/gc/api.c:245) as the failures already discussed in this ticket, and since I have traced the problem to the same revision (r43211), I am appending to this ticket instead of opening a new ticket.

Tested with r43482.

./parrot --runcore=gcdebug runtime/parrot/library/PGE/Perl6Grammar.pir  --output=builtins_gen_DUMMY.pir compilers/pge/PGE/builtins.pg

This is the actual command from the normal build, with --runcore=gcdebug added, paths adjusted, and the output filename changed to prevent make rebuilds. The command works correctly without --runcore=gcdebug, and the output file ./builtins_gen_DUMMY.pir matches the real compilers/pge/PGE/builtins_gen.pir produced during the real build.

With --runcore=gcdebug, produces this failure:

src/gc/api.c:245: failed assertion 'PObj_is_PMC_TEST(obj)'
Backtrace - Obtained 32 stack frames (max trace depth is 32).
0   libparrot.dylib                     0x0048fe7d Parrot_do_check_events + 173
1   libparrot.dylib                     0x0048ffe7 Parrot_confess + 151
2   libparrot.dylib                     0x0049d487 Parrot_gc_mark_PMC_alive_fun + 135
3   libparrot.dylib                     0x00631e28 Parrot_ArrayIterator_get_isa + 14824
4   libparrot.dylib                     0x0049d517 Parrot_gc_mark_PMC_alive_fun + 279
5   libparrot.dylib                     0x00632050 Parrot_ArrayIterator_get_isa + 15376
6   libparrot.dylib                     0x0049d517 Parrot_gc_mark_PMC_alive_fun + 279
7   libparrot.dylib                     0x004a0e8e Parrot_is_blocked_GC_sweep + 5374
8   libparrot.dylib                     0x004a00d3 Parrot_is_blocked_GC_sweep + 1859
9   libparrot.dylib                     0x004a0215 Parrot_is_blocked_GC_sweep + 2181
10  libparrot.dylib                     0x0049e573 Parrot_gc_mark_STRING_alive_fun + 3811
11  libparrot.dylib                     0x0050a4df enable_event_checking + 3711
12  libparrot.dylib                     0x00508d1a Parrot_runcore_switch + 3978
13  libparrot.dylib                     0x004b2ee8 new_runloop_jump_point + 392
14  libparrot.dylib                     0x004a900c Parrot_pcc_invoke_from_sig_object + 428
15  libparrot.dylib                     0x004a9400 Parrot_pcc_invoke_sub_from_c_args + 208
16  libparrot.dylib                     0x004fb359 Parrot_ComposeRole + 3385
17  libparrot.dylib                     0x004fb524 Parrot_ComposeRole + 3844
18  libparrot.dylib                     0x004fb904 do_sub_pragmas + 388
19  libparrot.dylib                     0x00502a88 PackFile_Annotations_add_entry + 2184
20  libparrot.dylib                     0x00502ba9 PackFile_Annotations_add_entry + 2473
21  libparrot.dylib                     0x00503223 Parrot_load_bytecode + 515
22  libparrot.dylib                     0x003ff2f5 Parrot_str_from_int + 917
23  libparrot.dylib                     0x0050a51c enable_event_checking + 3772
24  libparrot.dylib                     0x00508d1a Parrot_runcore_switch + 3978
25  libparrot.dylib                     0x004b2ee8 new_runloop_jump_point + 392
26  libparrot.dylib                     0x004a900c Parrot_pcc_invoke_from_sig_object + 428
27  libparrot.dylib                     0x004a9400 Parrot_pcc_invoke_sub_from_c_args + 208
28  libparrot.dylib                     0x004fb359 Parrot_ComposeRole + 3385
29  libparrot.dylib                     0x004fb60a Parrot_ComposeRole + 4074
30  libparrot.dylib                     0x004fb904 do_sub_pragmas + 388
31  libparrot.dylib                     0x005032a7 PackFile_fixup_subs + 119
Abort trap

Running under gdb:

~/Perl/Parrot/Release_20_test/parrot $ gdb ./parrot
GNU gdb 6.3.50-20050815 (Apple version gdb-967) (Tue Jul 14 02:11:58 UTC 2009)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-apple-darwin"...Reading symbols for shared libraries ........... done

(gdb) run --runcore=gcdebug runtime/parrot/library/PGE/Perl6Grammar.pir  --output=builtins_gen_DUMMY.pir compilers/pge/PGE/builtins.pg
Starting program: /Users/bruce/Perl/Parrot/Release_20_test/parrot/parrot --runcore=gcdebug runtime/parrot/library/PGE/Perl6Grammar.pir  --output=builtins_gen_DUMMY.pir compilers/pge/PGE/builtins.pg
Reading symbols for shared libraries ++++++++++......... done
src/gc/api.c:245: failed assertion 'PObj_is_PMC_TEST(obj)'
Backtrace - Obtained 32 stack frames (max trace depth is 32).
0   libparrot.dylib                     0x0048fe7d Parrot_do_check_events + 173
1   libparrot.dylib                     0x0048ffe7 Parrot_confess + 151
2   libparrot.dylib                     0x0049d487 Parrot_gc_mark_PMC_alive_fun + 135
3   libparrot.dylib                     0x00631e28 Parrot_ArrayIterator_get_isa + 14824
4   libparrot.dylib                     0x0049d517 Parrot_gc_mark_PMC_alive_fun + 279
5   libparrot.dylib                     0x00632050 Parrot_ArrayIterator_get_isa + 15376
6   libparrot.dylib                     0x0049d517 Parrot_gc_mark_PMC_alive_fun + 279
7   libparrot.dylib                     0x004a0e8e Parrot_is_blocked_GC_sweep + 5374
8   libparrot.dylib                     0x004a00d3 Parrot_is_blocked_GC_sweep + 1859
9   libparrot.dylib                     0x004a0215 Parrot_is_blocked_GC_sweep + 2181
10  libparrot.dylib                     0x0049e573 Parrot_gc_mark_STRING_alive_fun + 3811
11  libparrot.dylib                     0x0050a4df enable_event_checking + 3711
12  libparrot.dylib                     0x00508d1a Parrot_runcore_switch + 3978
13  libparrot.dylib                     0x004b2ee8 new_runloop_jump_point + 392
14  libparrot.dylib                     0x004a900c Parrot_pcc_invoke_from_sig_object + 428
15  libparrot.dylib                     0x004a9400 Parrot_pcc_invoke_sub_from_c_args + 208
16  libparrot.dylib                     0x004fb359 Parrot_ComposeRole + 3385
17  libparrot.dylib                     0x004fb524 Parrot_ComposeRole + 3844
18  libparrot.dylib                     0x004fb904 do_sub_pragmas + 388
19  libparrot.dylib                     0x00502a88 PackFile_Annotations_add_entry + 2184
20  libparrot.dylib                     0x00502ba9 PackFile_Annotations_add_entry + 2473
21  libparrot.dylib                     0x00503223 Parrot_load_bytecode + 515
22  libparrot.dylib                     0x003ff2f5 Parrot_str_from_int + 917
23  libparrot.dylib                     0x0050a51c enable_event_checking + 3772
24  libparrot.dylib                     0x00508d1a Parrot_runcore_switch + 3978
25  libparrot.dylib                     0x004b2ee8 new_runloop_jump_point + 392
26  libparrot.dylib                     0x004a900c Parrot_pcc_invoke_from_sig_object + 428
27  libparrot.dylib                     0x004a9400 Parrot_pcc_invoke_sub_from_c_args + 208
28  libparrot.dylib                     0x004fb359 Parrot_ComposeRole + 3385
29  libparrot.dylib                     0x004fb60a Parrot_ComposeRole + 4074
30  libparrot.dylib                     0x004fb904 do_sub_pragmas + 388
31  libparrot.dylib                     0x005032a7 PackFile_fixup_subs + 119

Program received signal SIGABRT, Aborted.
0x94e82e42 in __kill ()
(gdb) bt
#0  0x94e82e42 in __kill ()
#1  0x94e82e34 in kill$UNIX2003 ()
#2  0x94ef523a in raise ()
#3  0x94f01679 in abort ()
#4  0x0048ffec in Parrot_confess (cond=0x6ca290 "PObj_is_PMC_TEST(obj)", file=0x6ca280 "src/gc/api.c", line=245) at src/exceptions.c:553
#5  0x0049d487 in Parrot_gc_mark_PMC_alive_fun (interp=0x901900, obj=0xa0def0) at src/gc/api.c:245
#6  0x00631e28 in Parrot_CallContext_mark (interp=0x901900, pmc=0xa0df04) at callcontext.pmc:519
#7  0x0049d517 in Parrot_gc_mark_PMC_alive_fun (interp=0x901900, obj=0xa0df04) at src/gc/api.c:264
#8  0x00632050 in Parrot_CallContext_mark (interp=0x901900, pmc=0xb606cc) at callcontext.pmc:531
#9  0x0049d517 in Parrot_gc_mark_PMC_alive_fun (interp=0x901900, obj=0xb606cc) at src/gc/api.c:264
#10 0x004a0e8e in Parrot_gc_trace_root (interp=0x901900, trace=GC_TRACE_FULL) at src/gc/mark_sweep.c:199
#11 0x004a00d3 in gc_ms_trace_active_PMCs (interp=0x901900, trace=GC_TRACE_FULL) at src/gc/gc_ms.c:254
#12 0x004a0215 in gc_ms_mark_and_sweep (interp=0x901900, flags=1) at src/gc/gc_ms.c:177
#13 0x0049e573 in Parrot_gc_mark_and_sweep (interp=0x901900, flags=1) at src/gc/api.c:842
#14 0x0050a4df in runops_gc_debug_core (interp=0x901900, runcore=0x90b750, pc=0xa4b124) at src/runcore/cores.c:879
#15 0x00508d1a in runops_int (interp=0x901900, offset=0) at src/runcore/main.c:546
#16 0x004b2ee8 in runops (interp=0x901900, offs=0) at src/call/ops.c:99
#17 0x004a900c in Parrot_pcc_invoke_from_sig_object (interp=0x901900, sub_obj=0xa095e4, call_object=0xb606cc) at src/call/pcc.c:314
#18 0x004a9400 in Parrot_pcc_invoke_sub_from_c_args (interp=0x901900, sub_obj=0xa095e4, sig=0x6cc840 "->P") at src/call/pcc.c:75
#19 0x004fb359 in run_sub (interp=0x901900, sub_pmc=0xa095e4) at src/packfile.c:684
#20 0x004fb524 in do_1_sub_pragma (interp=0x901900, sub_pmc=0xa095e4, action=PBC_LOADED) at src/packfile.c:746
#21 0x004fb904 in do_sub_pragmas (interp=0x901900, self=0x903480, action=PBC_LOADED, eval_pmc=0x0) at src/packfile.c:936
#22 0x00502a88 in PackFile_append_pbc (interp=0x901900, filename=0x90b630 "/Users/bruce/Perl/Parrot/Release_20_test/parrot/runtime/parrot/library/PGE.pbc") at src/packfile.c:4829
#23 0x00502ba9 in compile_or_load_file (interp=0x901900, path=0xa18350, file_type=PARROT_RUNTIME_FT_PBC) at src/packfile.c:4698
#24 0x00503223 in Parrot_load_bytecode (interp=0x901900, file_str=0x8f66f4) at src/packfile.c:4895
#25 0x003ff2f5 in Parrot_load_bytecode_sc (cur_opcode=0x2053660, interp=0x901900) at core.ops:167
#26 0x0050a51c in runops_gc_debug_core (interp=0x901900, runcore=0x90b750, pc=0x2053660) at src/runcore/cores.c:882
#27 0x00508d1a in runops_int (interp=0x901900, offset=24) at src/runcore/main.c:546
#28 0x004b2ee8 in runops (interp=0x901900, offs=24) at src/call/ops.c:99
#29 0x004a900c in Parrot_pcc_invoke_from_sig_object (interp=0x901900, sub_obj=0xa0bce0, call_object=0xa0bd80) at src/call/pcc.c:314
#30 0x004a9400 in Parrot_pcc_invoke_sub_from_c_args (interp=0x901900, sub_obj=0xa0bce0, sig=0x6cc840 "->P") at src/call/pcc.c:75
#31 0x004fb359 in run_sub (interp=0x901900, sub_pmc=0xa0bce0) at src/packfile.c:684
#32 0x004fb60a in do_1_sub_pragma (interp=0x901900, sub_pmc=0xa0bce0, action=PBC_MAIN) at src/packfile.c:776
#33 0x004fb904 in do_sub_pragmas (interp=0x901900, self=0x90be30, action=PBC_MAIN, eval_pmc=0x0) at src/packfile.c:936
#34 0x005032a7 in PackFile_fixup_subs (interp=0x901900, what=PBC_MAIN, eval=0x0) at src/packfile.c:4918
#35 0x006a1364 in imcc_run_pbc (interp=0x901900, obj_file=0, output_file=0x0, argc=3, argv=0xbffff5d4) at compilers/imcc/main.c:790
#36 0x006a2019 in imcc_run (interp=0x901900, sourcefile=0xbffff6d5 "runtime/parrot/library/PGE/Perl6Grammar.pir", argc=3, argv=0xbffff5d4) at compilers/imcc/main.c:1075
#37 0x00002399 in main (argc=3, argv=0xbffff5d4) at src/main.c:60
(gdb) 

  Changed 5 years ago by jkeenan

  • summary changed from t/compilers/pge/pge_examples.t: PASS by itself, FAIL during 'make smoke' to src/gc/api.c: Intermittent test failures at line 245 since r43211

Am changing the ticket's Summary to better describe the focal point of the problem.

kid51

  Changed 5 years ago by jkeenan

TT #1067 may be related to this one. The same part of src/gc/api.c is cited in both tickets.

  Changed 5 years ago by lithos

TT #1420 looks very similar to this bug.

  Changed 5 years ago by lithos

BTW, TT #1371 also has a similar stack trace as bacek already noted there.

in reply to: ↑ 29 ; follow-up: ↓ 38   Changed 5 years ago by jkeenan

Replying to lithos:

If my assumptions are right, this modification should make the bug disappear:

 -        Parrot_gc_free_pmc_header(interp, SELF);

I finally got around to trying this patch tonight. As I previously reported, other commits during the last two weeks appear to have papered over the problem such that all tests were once again PASSing on the box in question. So all I can accurately report is that deleting that one line did no harm, i.e., I had no problem running make and I got a PASS on make test.

Thank you very much.
kid51

in reply to: ↑ 37 ; follow-up: ↓ 39   Changed 5 years ago by jkeenan

  • cc lithos added

Replying to jkeenan:

Replying to lithos:

If my assumptions are right, this modification should make the bug disappear:

 -        Parrot_gc_free_pmc_header(interp, SELF);

This morning I realized that a better test would be to do a checkout of r43211 (the offending revision) and to apply lithos's patch to that revision. Before applying, I configured, built and then ran perl t/harness --gc-debug t/compilers/pge/pge_examples.t. The test failed as described way above.

I then applied the patch, rebuilt and reran that test. The test PASSed. I then ran make test and all tests PASSed. I then ran make fulltest. Again, all tests PASSed.

I think we should apply this patch. But to further facilitate its testing, I have created the tt1393_retcon branch in SVN. I would like to ask all who have contributed to this ticket to give it a trial. In that branch, I will also see if we can un-TODO any tests that were failing in the other TTs cited in this ticket.

Thank you very much.

kid51

in reply to: ↑ 38   Changed 5 years ago by jkeenan

  • owner set to jkeenan
  • status changed from new to assigned
  • patch set to applied

Replying to jkeenan:

Replying to jkeenan:

Replying to lithos:

If my assumptions are right, this modification should make the bug disappear:

 -        Parrot_gc_free_pmc_header(interp, SELF);

I think we should apply this patch. But to further facilitate its testing, I have created the tt1393_retcon branch in SVN. I would like to ask all who have contributed to this ticket to give it a trial. In that branch, I will also see if we can un-TODO any tests that were failing in the other TTs cited in this ticket.

We got some testing of the branch by mikehh++ and NotFound++. Both testers' results suggested that the patch, at the very least, did no harm. So I merged the branch into trunk at r43721.

I will keep the ticket open for 2-4 days to record any failures, complaints, etc.

Thank you very much.
kid51

  Changed 5 years ago by jkeenan

  • status changed from assigned to closed
  • resolution set to fixed

No complaints in the specified time period. Closing ticket.
