Ticket #551 (closed bug: fixed)

Opened 6 years ago

Last modified 5 years ago

t/pmc/nci.t: 2 test failures

Reported by: jkeenan Owned by:
Priority: normal Milestone:
Component: core Version:
Severity: medium Keywords: pmc nci pointer
Cc: cotto, mikehh, jkeen@…, rg@… Language:
Patch status: applied Platform:

Description

These two tests began to fail after r38036; bisection suggests r38037. Here are some pastes:

1. Output of my run on Linux/i386 of prove -v t/pmc/nci.t:

ok 65 - arity
not ok 66 - nci_vVi - void** out parameter

#   Failed test 'nci_vVi - void** out parameter'
#   at t/pmc/nci.t line 2650.
# Exited with error code: [SIGNAL 6]
# Received:
# src/call/pcc.c:978: failed assertion '(st->dest.sig & PARROT_ARG_TYPE_MASK) == PARROT_ARG_PMC'
# Backtrace - Obtained 18 stack frames (max trace depth is 32).
#   (unknown)
#     Parrot_confess
#       Parrot_fetch_arg_nci
#         (unknown)
#           (unknown)
#             Parrot_NCI_invoke
#               Parrot_invokecc_p
#                 (unknown)
#                   (unknown)
#                     (unknown)
#                       (unknown)
#                         Parrot_runops_fromc_args
#                           Parrot_runcode
#                             (unknown)
#                               imcc_run
#                                 (unknown)
#                                   __libc_start_main
#                                     (unknown)
# 
# Expected:
# got 10
# 
ok 67 - nci_ttt - t_tt parameter
ok 68 - nci_vfff - v_fff parameter
not ok 69 - nci_vV - char** out parameter

#   Failed test 'nci_vV - char** out parameter'
#   at t/pmc/nci.t line 2709.
# Exited with error code: [SIGNAL 11]
# Received:
# 
# Expected:
# Hello bright new world
# 
# Looks like you failed 2 tests of 69.
 Dubious, test returned 2 (wstat 512, 0x200)
 Failed 2/69 subtests 

Test Summary Report
-------------------
t/pmc/nci (Wstat: 512 Tests: 69 Failed: 2)
  Failed tests:  66, 69
  Non-zero exit status: 2
Files=1, Tests=69,  6 wallclock secs ( 0.00 usr  0.00 sys +  1.30 cusr  0.18 csys =  1.48 CPU)
Result: FAIL

2. mikehh's direct call:

mhk@mhk-desktop:~/parrot.t$ ./parrot -t t/pmc/nci_66.pir
     0 set S0, "libnci_test"                                        S0="(null)"
     3 loadlib P1, S0                                        P1=PMCNULL S0="libnci_test"
     6 dlfunc P3, P1, "nci_vVi", "vVi"                                        P3=PMCNULL P1=ParrotLibrary=PMC(0x9c25538)
    11 dlfunc P2, P1, "nci_vp", "vp"                                        P2=PMCNULL P1=ParrotLibrary=PMC(0x9c25538)
    16 new P0, PC14                                        P0=PMCNULL PC14=Key=PMC(0x9c25550)
    19 set I0, 10                                        I0=0
    22 set_args PC9 (2), P0, I0                                        PC9=FixedIntegerArray=PMC(0x9c25580) P0=Pointer=PMC(0x9c1e960) I0=10
    26 get_results PC11
    28 invokecc P3                                        P3=NCI=PMC(0x9c1ec60)
Segmentation fault

mhk@mhk-desktop:~/parrot.t$ ./parrot -t t/pmc/nci_69.pir
     0 set S1, "libnci_test"                                        S1="(null)"
     3 loadlib P0, S1                                        P0=PMCNULL S1="libnci_test"
     6 dlfunc P1, P0, "nci_vV", "vV"                                        P1=PMCNULL P0=ParrotLibrary=PMC(0x97c8550)
    11 new P2, PC10                                        P2=PMCNULL PC10=Key=PMC(0x97c8568)
    14 set_args PC4 (1), P2                                        PC4=FixedIntegerArray=PMC(0x97c8598) P2=Pointer=PMC(0x97c1990)
    17 get_results PC7
    19 invokecc P1                                        P1=NCI=PMC(0x97c1c60)
Segmentation fault

3. A diff

cat ~/learn/parrot/38036.38037.diff
Index: src/pmc/pointer.pmc
===================================================================
--- src/pmc/pointer.pmc (revision 38036)
+++ src/pmc/pointer.pmc (revision 38037)
@@ -23,6 +23,8 @@
 #include "parrot/parrot.h"
 
 pmclass Pointer need_ext {
+    ATTR void * mark_function;
+    ATTR void * pointer;
 
 /*
 
@@ -36,6 +38,7 @@
 
     VTABLE void init() {
         PObj_custom_mark_SET(SELF);
+        PMC_data(SELF) = mem_allocate_zeroed_typed(Parrot_Pointer_attributes);
     }
 
 /*
@@ -50,10 +53,10 @@
 
     VTABLE void mark() {
         void (*mark_function)(Interp *, void *) =
-            (void (*)(Interp *, void *))D2FPTR(PMC_struct_val(SELF));
-
-        if (PMC_data(SELF) && PMC_struct_val(SELF))
-            (*mark_function)(INTERP, PMC_data(SELF));
+            (void (*)(Interp *, void *))D2FPTR(PARROT_POINTER(SELF)->mark_function);
+        void * data = PARROT_POINTER(SELF)->pointer;
+        if (data && mark_function)
+            (*mark_function)(INTERP,data);
     }
 
 /*
@@ -84,7 +87,7 @@
 */
 
     VTABLE void *get_pointer() {
-        return PMC_data(SELF);
+        return PARROT_POINTER(SELF)->pointer;
     }
 
 /*
@@ -98,7 +101,7 @@
 */
 
     VTABLE INTVAL get_integer() {
-        return (INTVAL)PMC_data(SELF);
+        return (INTVAL)(PARROT_POINTER(SELF)->pointer);
     }
 
 /*
@@ -112,7 +115,7 @@
 */
 
     VTABLE FLOATVAL get_number() {
-        return (FLOATVAL)(INTVAL)PMC_data(SELF);
+        return (FLOATVAL)(INTVAL)(PARROT_POINTER(SELF)->pointer);
     }
 
 /*
@@ -126,7 +129,7 @@
 */
 
     VTABLE STRING *get_repr() {
-        return Parrot_sprintf_c(INTERP, "Pointer = 0x%p", PMC_data(SELF));
+        return Parrot_sprintf_c(INTERP, "Pointer = 0x%p", PARROT_POINTER(SELF)->pointer);
     }
 
 
@@ -141,7 +144,7 @@
 */
 
     VTABLE STRING *get_string() {
-        return Parrot_sprintf_c(INTERP, "%s", PMC_data(SELF));
+        return Parrot_sprintf_c(INTERP, "%s", PARROT_POINTER(SELF)->pointer);
     }
 
 /*
@@ -170,7 +173,7 @@
 
     VTABLE INTVAL is_same(PMC *pmc2) {
         return (INTVAL)(SELF->vtable   == pmc2->vtable &&
-                        PMC_data(SELF) == PMC_data(pmc2));
+                        PARROT_POINTER(SELF)->pointer == PARROT_POINTER(pmc2)->pointer);
     }
 }
 
Index: tools/build/nativecall.pl
===================================================================
--- tools/build/nativecall.pl   (revision 38036)
+++ tools/build/nativecall.pl   (revision 38037)
@@ -197,6 +197,7 @@
 #include "parrot/oplib/ops.h"
 #include "pmc/pmc_managedstruct.h"
 #include "pmc/pmc_nci.h"
+#include "pmc/pmc_pointer.h"
 #include "nci.str"
 #include "jit.h"
 
@@ -353,7 +354,7 @@
     /V/ && do {
         push @{$temps_ref},          "PMC *t_$temp_num;";
         push @{$extra_preamble_ref}, "t_$temp_num = GET_NCI_P($reg_num);";
-        return "(void**)&PMC_data(t_$temp_num)";
+        return "(void**)&PARROT_POINTER(t_$temp_num)->pointer";
     };
     /[ilIscfdNS]/ && do {
         my $ret_type = $sig_table{$_}{return_type};

Attachments

nci_V_fix.patch Download (5.7 KB) - added by cotto 6 years ago.
fix NCI calls with V in their signatures
pointerref.patch Download (4.9 KB) - added by rg 6 years ago.
different solution for pointer pmc
donthandle.patch Download (2.2 KB) - added by rg 6 years ago.
simplest workaround: don't handle 'V' signature

Change History

in reply to: ↑ description ; follow-up: ↓ 2   Changed 6 years ago by jkeenan

Replying to jkeenan:

These two tests began to fail after r38036; bisection suggests r38037. Here are some pastes: 1. Output of my run on Linux/i386 of prove -v t/pmc/nci.t:

I should clarify that I ran prove at r38067. cotto has indicated on IRC that commits subsequent to r38037 may have changed the resulting error.

in reply to: ↑ 1 ; follow-up: ↓ 3   Changed 6 years ago by whiteknight

Replying to jkeenan:

Replying to jkeenan:

These two tests began to fail after r38036; bisection suggests r38037. Here are some pastes: 1. Output of my run on Linux/i386 of prove -v t/pmc/nci.t:

I should clarify that I ran prove at r38067. cotto has indicated on IRC that commits subsequent to r38037 may have changed the resulting error.

I was aware of this error (I'm not on i386, so I found out after I committed r38037), so I committed in a fix at r38052. Can you do me a favor and update to r38052 (or later) and let me know what the status is and what the errors are if any? I had heard from somebody this morning that all tests were passing on i386 as of r38052, so I want to double-check that my information is correct.

in reply to: ↑ 2   Changed 6 years ago by jkeenan

Replying to whiteknight:

I was aware of this error (I'm not on i386, so I found out after I committed r38037), so I committed in a fix at r38052. Can you do me a favor and update to r38052 (or later) and let me know what the status is and what the errors are if any?

I checked out r38052 and got these results:

ok 65 - arity
not ok 66 - nci_vVi - void** out parameter

#   Failed test 'nci_vVi - void** out parameter'
#   at t/pmc/nci.t line 2650.
# Exited with error code: [SIGNAL 6]
# Received:
# src/call/pcc.c:978: failed assertion '(st->dest.sig & PARROT_ARG_TYPE_MASK) == PARROT_ARG_PMC'
# Backtrace - Obtained 18 stack frames (max trace depth is 32).
#   (unknown)
#     Parrot_confess
#       Parrot_fetch_arg_nci
#         (unknown)
#           (unknown)
#             Parrot_NCI_invoke
#               Parrot_invokecc_p
#                 (unknown)
#                   (unknown)
#                     (unknown)
#                       (unknown)
#                         Parrot_runops_fromc_args
#                           Parrot_runcode
#                             (unknown)
#                               imcc_run
#                                 (unknown)
#                                   __libc_start_main
#                                     (unknown)
# 
# Expected:
# got 10
# 
ok 67 - nci_ttt - t_tt parameter
ok 68 - nci_vfff - v_fff parameter
not ok 69 - nci_vV - char** out parameter

#   Failed test 'nci_vV - char** out parameter'
#   at t/pmc/nci.t line 2709.
# Exited with error code: [SIGNAL 11]
# Received:
# 
# Expected:
# Hello bright new world
# 
# Looks like you failed 2 tests of 69.
 Dubious, test returned 2 (wstat 512, 0x200)
 Failed 2/69 subtests 

Test Summary Report
-------------------
t/pmc/nci (Wstat: 512 Tests: 69 Failed: 2)
  Failed tests:  66, 69
  Non-zero exit status: 2
Files=1, Tests=69,  6 wallclock secs ( 0.00 usr  0.00 sys +  1.16 cusr  0.22 csys =  1.38 CPU)
Result: FAIL

And, of course, my original report was done at HEAD.

  Changed 6 years ago by jkeenan

  • cc cotto, mikehh, jkeen@… added; cotto mikehh removed

  Changed 6 years ago by cotto

These two tests work fine on non-jitcapable builds, where src/nci.c is used. This leads me to believe that the jit code for 'V' signature elements only does half of what it needs to. After the jit code generates a call to the NCI function, it also needs to generate calls to set_pointer, similar to what pcf_v_V in src/nci.c does. Whiteknight and I independently tried to figure out how to update the jit code to gracefully generate the appropriate calls to set_pointer after the function call, but neither of us could figure out how to make it work.

In r38069, I added an NCI test to make sure that functions with multiple out params work. As with the other two, it passes on non-jitcapable builds and fails on jitcapable builds.

follow-up: ↓ 7   Changed 6 years ago by whiteknight

The best way to fix this all that I can think of is to change the semantics of the Pointer PMC. Right now, the PMC is more like a "value" PMC, instead of a "pointer". Here's some code from src/nci.c:pcf_v_V():

t_872 = GET_NCI_P(0);
v_872 = VTABLE_get_pointer(interp, t_872);

(void)(*pointer)(&v_872);
    
VTABLE_set_pointer(interp, t_872, v_872);

If, instead of holding a single value v_872, the Pointer PMC could allocate a small amount of storage space, and pass a return a pointer to that space from get_pointer. Right now, VTABLE_get_pointer on the Pointer PMC does this:

return(PARROT_POINTER(SELF)->pointer);

Instead, I suggest it do something like this:

return(&(PARROT_POINTER(SELF)->buffer));

So then we can change the NCI thunk to do this:

t_872 = GET_NCI_P(0);
v_872 = VTABLE_get_pointer(interp, t_872);

(void)(*pointer)(v_872);

Any opinions on this?

in reply to: ↑ 6   Changed 6 years ago by rg

  • cc rg@… added

Replying to whiteknight:

Any opinions on this?

I think you wouldn't even need to allocate a buffer. The only API change required would be to return the address of the pointer instead of the pointer itself, or maybe instead add a different function to request the pointer address. I was about to try this, but it's not working for test 66, yet.

You definitely need to add this diff to fix the stack after the vtable call (sorry for inlining):

--- src/jit/i386/jit_defs.c	(revision 38087)
+++ src/jit/i386/jit_defs.c	(working copy)
@@ -2273,6 +2273,9 @@
                 emitm_movl_m_r(interp, pc, emit_EAX, emit_EAX, 0, 1, offsetof(VTABLE, get_pointer));
                 emitm_callr(pc, emit_EAX);
                 emitm_movl_r_m(interp, pc, emit_EAX, emit_EBP, 0, 1, args_offset);
+                /* reset ESP(4) */
+                emitm_lea_m_r(interp, pc, emit_EAX, emit_EBP, 0, 1, st_offset);
+                emitm_movl_r_m(interp, pc, emit_EAX, emit_EBP, 0, 1, temp_calls_offset + 4);
                 break;
             case 'b':   /* buffer (void*) pass PObj_bufstart(SReg) */
                 emitm_call_cfunc(pc, get_nci_S);

Maybe combined with your own fix attempt, this will get you further ;)

  Changed 6 years ago by cotto

With the attached patch, all tests pass (including the TODO'd ones) on both jitcapable and non-jitcapable. The patch includes Whiteknight's suggestion and rg's code, plus some changes to the nci thunk generation code. There are two issues with it that I'd like addressed before I'm comfortable applying it.

First, the patch changes nci_vp() in src/nci_test.c. I suspect that the logic in that function needed to be updated, but on the other hand it was working before as it was. I'm reasonably sure the change is correct, but I'd appreciate independent confirmation.

Second, I'm not 100% certain about all the pointer dereferencing logic in src/pmc/pointer.pmc. get_string and get_pointer are the only functions that have any test coverage, so a quick check by someone else would be helpful.

Once those issues are addressed, the patch can be applie and this ticket resolved.

Changed 6 years ago by cotto

fix NCI calls with V in their signatures

Changed 6 years ago by rg

different solution for pointer pmc

Changed 6 years ago by rg

simplest workaround: don't handle 'V' signature

  Changed 6 years ago by rg

I'm not entirely happy with cotto's suggestion. For one, it breaks the correlation between set_pointer and get_pointer.

I've now attached my other 2 possible solutions to this problem. One is to extend the Pointer pmc with a new function. This keeps the compatibility, however it needs some infrastructure changes to support returning a void **, plus it introduces a new vtable method, that probably no other class will use.

The other possible solution is to just not handle 'V' arguments with the jitted call stack, falling back to the predefined signatures list, just like any non-jitted architecture. While this is removing functionality that has worked before, it didn't work anymore, because of the conversion to ATTRs. With nobody really familiar with jit and the would-be examples (of passing the address of local variables to the called function) also not handled (with a "don't know how to do it" comment none the less), I don't think this is a big drawback.

So, who is going to decide which way to go? Can we take a vote? Any other suggestions?

follow-up: ↓ 11   Changed 6 years ago by donaldh

It might be counter productive to modify Pointer PMC since UnManagedStruct will need to work with 'V' signatures.

There is also a todo to remove Pointer PMC according to ticket #187

Note that all t/pmc/nci.t tests pass for me on Cygwin and Linux yet my Sqlite NCI code breaks on both platforms when using UnManagedStruct or Pointer in conjunction with 'V' signatures. I'll try and see if I can identify the cause.

in reply to: ↑ 10   Changed 6 years ago by rg

Replying to donaldh:

It might be counter productive to modify Pointer PMC since UnManagedStruct will need to work with 'V' signatures.

I'm sorry, I'm not too familiar with UnManagedStruct and I don't understand what you're getting at. Can you elaborate?

There is also a todo to remove Pointer PMC according to ticket #187

As you have noted in TT#187, there is code that is using the Pointer PMC and we don't have any replacement. chromatic hinted at solution that uses UnManagedStruct, but I don't know the code well enough to offer a patch.

Note that all t/pmc/nci.t tests pass for me on Cygwin and Linux

I'd guess this is because the offending tests are currently todo-ed (all 3 patches are un-todo-ing them to show the effect).

yet my Sqlite NCI code breaks on both platforms when using UnManagedStruct or Pointer in conjunction with 'V' signatures. I'll try and see if I can identify the cause.

Probably for the same reason the offending tests fail. Did you try one of the fixes (or all of them, but one at a time)?

  Changed 6 years ago by donaldh

Okay, so both Pointer and UnManagedStruct can be used to store an opaque pointer that is managed by something else. Quite some time back, I hit a bug with the custom mark() function in Pointer and discussed/submitted a patch. It was recommended I try using UnManagedStruct instead so I did. Anyway, if you modified Pointer PMC to have different semantics, then you'd also have to change UnManagedStruct and maybe others. So, as you said, maybe not a good idea.

You're right, I completely missed the fact that the offending tests are todo-ed.

I applied the donthandle.patch re-enabled the tests which pass of course because the required signatures are in config/gen/call_list/*. After adding the call signatures for Sqlite to config/gen/call_list/sqlite3.in, my code works fine.

For me, this confirms there is no need to modify Pointer.pmc or nativecall.pl or nci_test.c - they are just a bunch of band aids around broken jit.

I'm not entirely sure I understand what's happening in src/jit/i386/jit_defs.c but I'm pretty sure that the problem stems from not having any mechanism for postamble, which is required by V, 2, 3, and 4. In other words, there is a block of code for preparing the stack before the function call and a block of code for handling the return value, but no code for postprocessing the stack for out params.

What's needed is another block that iterates the signature and assigns the stack value back to the PMC in the relevant cases. I also assume that V requires a temporary for the pointer so that the address of the temporary can be put on the stack instead of the pointer itself, since V represents a void**.

My i386 assembly knowledge is non-existent so it would take me a while to attempt this. Does my description make enough sense for someone else to try?

  Changed 6 years ago by whiteknight

donaldh: Your plan here is the right one I think. It's the same conclusion I came to when this ticket was first opened, I suggested the semantic changes to the Pointer PMC as a temporary fix because I figured the JIT fix was going to be too difficult at the time.

However, it still is probably the "correct" solution to pursue. I'll be happy to check over any patches you work up, or if I can find the time over the weekend I'll take a stab at a fix myself.

  Changed 5 years ago by jkeenan

  • component changed from none to core

  Changed 5 years ago by chromatic

  • status changed from new to closed
  • resolution set to fixed
  • patch set to applied

I applied the most recent patch; let's call this good. We need to review how we handle out parameters in NCI calls in general -- not just V but 1, 2, and 3 types. That'll work for another ticket.

Note: See TracTickets for help on using tickets.