Ticket #953 (closed bug: fixed)

Opened 5 years ago

Last modified 5 years ago

[gc] t/op/copy.t failure in OpenBSD amd64

Reported by: darbelo Owned by: jkeenan
Priority: normal Milestone:
Component: GC Version: trunk
Severity: medium Keywords: gc order-of-destruction
Cc: jkeenan Language:
Patch status: applied Platform: openbsd

Description

At some point after the Great Merging of the Branches that followed the 1.5.0 release, t/op/copy.t started failing on OpenBSD amd64.

Amusingly, all of my Smolder reports show this failure as 100% PASS, because the failure (apparently a double free) happens at interpreter destruction, after the "ok" line has been printed. (Examples at http://smolder.plusthree.com/app/public_projects/report_details/26618 and http://smolder.plusthree.com/app/public_projects/tap_stream/26618/95 )

$ prove -v t/op/copy.t                                                                               
t/op/copy....1..4
ok 1 - copy should change type of PMC isa Float
ok 2 - ... and its value
ok 3 - copy should make independent copies
ok 4 - copy to null throws
parrot in free(): error: chunk is already free
dubious
        Test returned status 0 (wstat 134, 0x86)
        after all the subtests completed successfully
Failed Test Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/op/copy.t    0   134     4    0  ??
Failed 1/1 test scripts. 0/4 subtests failed.
Files=1, Tests=4,  0 wallclock secs ( 0.04 cusr +  0.03 csys =  0.07 CPU)
Failed 1/1 test programs. 0/4 subtests failed.
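
For anyone puzzled by "status 0 (wstat 134, 0x86)" in the output above, here is a minimal sketch of how that number breaks down. It assumes the traditional Unix wait-status layout (low 7 bits = terminating signal, bit 0x80 = core dumped, next 8 bits = exit code); the function names are made up for illustration and are not part of prove or Parrot.

```c
/* Assumption: traditional Unix wait-status encoding.
 * Portable code should use WIFSIGNALED/WTERMSIG from <sys/wait.h>
 * on a status actually returned by wait(2). */

/* Signal that terminated the process (0 if it exited normally). */
int wstat_signal(int wstat)
{
    return wstat & 0x7f;
}

/* Non-zero if the core-dump flag is set. */
int wstat_core_dumped(int wstat)
{
    return (wstat & 0x80) != 0;
}

/* Exit code, meaningful only when no signal was delivered. */
int wstat_exit_code(int wstat)
{
    return (wstat >> 8) & 0xff;
}
```

With wstat 134 this gives signal 6 with the core-dump flag set and an exit code of 0. Signal 6 is SIGABRT, which matches the abort() raised from free() and explains why "Stat" shows 0 while "Wstat" does not.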

Inspecting the core file left by this test run shows that the second free happens at the time of interpreter destruction, which indicates the early collection of an object or (more likely) an order of destruction bug.

(gdb) bt
#0  0x000000020cb3237a in kill () from /usr/lib/libc.so.51.0
#1  0x000000020cb7e765 in abort () at /usr/src/lib/libc/stdlib/abort.c:68
#2  0x000000020cb576a8 in wrterror (p=0x20cc8a9a4 "chunk is already free")
    at /usr/src/lib/libc/stdlib/malloc.c:375
#3  0x000000020cb5935d in free (ptr=0x2071d1830) at /usr/src/lib/libc/stdlib/malloc.c:1328
#4  0x000000020f097883 in mem_sys_free (from=0x2071d1830) at src/gc/alloc_memory.c:325
#5  0x000000020f1f94e0 in Parrot_FixedIntegerArray_destroy (interp=0x20a26ae00, pmc=0x20cfd1090)
    at fixedintegerarray.pmc:153
#6  0x000000020f0ec085 in Parrot_pmc_destroy (interp=0x20a26ae00, pmc=0x20cfd1090) at src/pmc.c:113
#7  0x000000020f09dfac in free_pmc_in_pool (interp=0x20a26ae00, pool_unused=0x20e2f7000, 
    p=0x20cfd1090) at src/gc/mark_sweep.c:781
#8  0x000000020f09d3ed in Parrot_gc_sweep_pool (interp=0x20a26ae00, pool=0x20e2f7000)
    at src/gc/mark_sweep.c:342
#9  0x000000020f09c44b in gc_ms_finalize (interp=0x20a26ae00, arena_base=0x20e2f7600)
    at src/gc/gc_ms.c:229
#10 0x000000020f09c257 in gc_ms_mark_and_sweep (interp=0x20a26ae00, flags=4) at src/gc/gc_ms.c:160
#11 0x000000020f09999d in Parrot_gc_mark_and_sweep (interp=0x20a26ae00, flags=4) at src/gc/api.c:805
#12 0x000000020f0abbd9 in Parrot_really_destroy (interp=0x20a26ae00, exit_code_unused=0, 
    arg_unused=0x0) at src/interp/inter_create.c:350
#13 0x000000020f08d7f4 in Parrot_exit (interp=0x20a26ae00, status=0) at src/exit.c:91
#14 0x0000000000400ecf in main (argc=1, argv=0x7f7ffffbd0d0) at src/main.c:65
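
To illustrate the failure mode in the backtrace (a destroy routine calling free() on storage that was already released), here is a minimal sketch of a destroy function that is safe to run twice. The int_array type and both functions are hypothetical stand-ins, not Parrot's FixedIntegerArray code, and this is not a claim about how the bug should be or was fixed.

```c
#include <stdlib.h>

/* Hypothetical fixed-size integer array; NOT Parrot's real layout. */
typedef struct {
    size_t size;
    long  *data;
} int_array;

/* Allocate zeroed storage; returns 0 on success, -1 on failure. */
int int_array_init(int_array *a, size_t size)
{
    a->data = calloc(size, sizeof *a->data);
    a->size = a->data ? size : 0;
    return a->data ? 0 : -1;
}

/* Release storage and clear the pointer, so a second call
 * ends up as free(NULL), which the C standard defines as a no-op. */
void int_array_destroy(int_array *a)
{
    free(a->data);
    a->data = NULL;
    a->size = 0;
}
```

Note that clearing the pointer only helps when both destroy calls go through the same struct. If two objects each hold their own copy of the pointer, the second free() still hits freed memory, and the ownership itself has to be sorted out.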

Change History

  Changed 5 years ago by jkeenan

  • cc jkeenan added

follow-up: ↓ 3   Changed 5 years ago by jkeenan

I got different results on different OSes at the same revision. At r40818, on Linux/i386, I got this:

$ prove -v t/op/copy.t 
t/op/copy.t .. 
1..4
ok 1 - copy should change type of PMC isa Float
ok 2 - ... and its value
ok 3 - copy should make independent copies
ok 4 - copy to null throws
ok
All tests successful.
Files=1, Tests=4,  1 wallclock secs ( 0.02 usr  0.00 sys +  0.02 cusr  0.00 csys =  0.04 CPU)
Result: PASS

I.e., no error reported at all.

On Darwin/PPC, I got this:

$ prove -v t/op/copy.t
t/op/copy.t .. 
1..4
ok 1 - copy should change type of PMC isa Float
ok 2 - ... and its value
ok 3 - copy should make independent copies
ok 4 - copy to null throws
parrot(11771) malloc: ***  Deallocation of a pointer not malloced: 0x630270; This could be a double free(), or free() called with the middle of an allocated block; Try setting environment variable MallocHelp to see tools to help debug
parrot(11771) malloc: ***  Deallocation of a pointer not malloced: 0x62ea40; This could be a double free(), or free() called with the middle of an allocated block; Try setting environment variable MallocHelp to see tools to help debug
ok
All tests successful.
Files=1, Tests=4, 21 wallclock secs ( 0.10 usr  0.05 sys +  0.08 cusr  0.15 csys =  0.38 CPU)
Result: PASS

I.e., a very different message from that reported on OpenBSD amd64.

So this problem has OS/platform-specific dimensions. Drat!

kid51

in reply to: ↑ 2 ; follow-up: ↓ 4   Changed 5 years ago by darbelo

Replying to jkeenan:

I got different results on different OSes at the same revision.

So this problem has OS/platform-specific dimensions. Drat!

The error message on darwin/ppc is different, but it points to the same root cause (a double free()). It's also possible that linux has the same issue but doesn't fail because its memory management is more forgiving. Also, I wouldn't be surprised if this failed similarly on other BSD derivatives.

in reply to: ↑ 3 ; follow-up: ↓ 5   Changed 5 years ago by jkeenan

Replying to darbelo:

Also, I wouldn't be surprised if this failed similarly on other BSD derivatives.

Well, the most recent Smolder reports on FreeBSD and NetBSD report no problems.

kid51

in reply to: ↑ 4   Changed 5 years ago by darbelo

Replying to jkeenan:

Replying to darbelo:

Also, I wouldn't be surprised if this failed similarly on other BSD derivatives.

Well, the most recent Smolder reports on FreeBSD and NetBSD report no problems.

NotFound++ committed a fix in r40824. If you can confirm that darwin/ppc PASSes, then we're good to close this ticket.

  Changed 5 years ago by jkeenan

  • status changed from new to assigned
  • owner set to jkeenan
  • patch set to applied

Confirmed on Darwin/PPC at r40855:

$ prove -v t/op/copy.t 
t/op/copy.t .. 
1..4
ok 1 - copy should change type of PMC isa Float
ok 2 - ... and its value
ok 3 - copy should make independent copies
ok 4 - copy to null throws
ok
All tests successful.
Files=1, Tests=4,  7 wallclock secs ( 0.10 usr  0.04 sys +  0.07 cusr  0.14 csys =  0.35 CPU)
Result: PASS

I will close the ticket.

kid51

  Changed 5 years ago by jkeenan

  • status changed from assigned to closed
  • resolution set to fixed