Ticket #581 (closed bug: worksforme)

Opened 5 years ago

Last modified 4 years ago

make segfaults with PGE.pbc on fedora (r38365)

Reported by: Lu. Owned by: Infinoid
Priority: normal Milestone:
Component: core Version:
Severity: medium Keywords:
Cc: Language:
Patch status: Platform: linux

Description

Well, the title says almost everything. After a successful perl Configure.pl, make segfaults when building PGE.pbc. I have seen other tickets about this problem, but all of them are marked 'fixed, patch included in following revisions'. Yet it happened again.

Here is the output:

gmake -C compilers/pge
gmake[1]: Entering directory `/home/lucien/Software/Source/parrot/compilers/pge'
/opt/bin/perl -MExtUtils::Command -e rm_f PGE.pbc ../../runtime/parrot/library/PGE.pbc
/opt/bin/perl -e "" >PGE/builtins_gen.pir
../../parrot -o PGE.pbc --output-pbc PGE.pir
../../parrot ../../runtime/parrot/library/PGE/Perl6Grammar.pir  --output=PGE/builtins_gen.pir PGE/builtins.pg
gmake[1]: *** [PGE.pbc] Segmentation fault
gmake[1]: *** Deleting file `PGE.pbc'
gmake[1]: Leaving directory `/home/lucien/Software/Source/parrot/compilers/pge'
make: *** [compilers.dummy] Error 2

I run Fedora 10 with a self-compiled perl 5.10.0.

perl -V
Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.27.5-117.fc10.i686, archname=i686-linux-thread-multi-ld
    uname='linux asmodee.localdomain 2.6.27.5-117.fc10.i686 #1 smp tue nov 18 12:19:59 est 2008 i686 athlon i386 gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=define
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='4.3.2 20081105 (Red Hat 4.3.2-7)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.9.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.9'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib'


Characteristics of this binary (from libperl): 
  Compile-time options: MULTIPLICITY PERL_DONT_CREATE_GVSV
                        PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP USE_ITHREADS
                        USE_LARGE_FILES USE_LONG_DOUBLE USE_PERLIO
                        USE_REENTRANT_API
  Built under linux
  Compiled at Apr  4 2009 15:41:51
  @INC:
    /opt/lib/perl5/5.10.0/i686-linux-thread-multi-ld
    /opt/lib/perl5/5.10.0
    /opt/lib/perl5/site_perl/5.10.0/i686-linux-thread-multi-ld
    /opt/lib/perl5/site_perl/5.10.0
    .

Attachments

test-jit-might-work.c Download (3.8 KB) - added by Infinoid 5 years ago.
test-jit-might-work.c from Santtu++
configure_jit_selinux.patch Download (1.1 KB) - added by markmont 5 years ago.
Improve config/auto/jit.pm test for PARROT_HAS_EXEC_PROTECT to work around Fedora/SELinux issues
configure_selinux_exec_protect.patch Download (1.1 KB) - added by markmont 5 years ago.
Improve config/auto/frames/test_exec_linux_c.in to work around SELinux/Fedora issues related to PARROT_HAS_EXEC_PROTECT. This patch obsoletes configure_jit_selinux.patch

Change History

follow-up: ↓ 2   Changed 5 years ago by pmichaud

  • owner changed from pmichaud to nobody

...reassigning to nobody, since the problem is likely not with PGE.

(In general, PGE is the first major component that gets built using Parrot, so problems with Parrot tend to show up in the PGE building step but aren't specific to PGE.)

Pm

in reply to: ↑ 1   Changed 5 years ago by Infinoid

  • component changed from PGE to core

Replying to pmichaud:

(In general, PGE is the first major component that gets built using Parrot, so problems with Parrot tend to show up in the PGE building step but aren't specific to PGE.)

Yeah, it's a crash in parrot itself.

@reporter, to allow us to understand this problem better, please provide a gdb backtrace. If you're not familiar with gdb, you can do something like the following:

infinoid@chirp test % cd compilers/pge
infinoid@chirp pge % gdb ../../parrot
GNU gdb 6.8
[snip]
(gdb) run ../../runtime/parrot/library/PGE/Perl6Grammar.pir  --output=PGE/builtins_gen.pir PGE/builtins.pg
[snip lots of crash breaky stuff]
(gdb) bt
(gdb) kill
(gdb) quit

And post the entire output as a comment to this ticket.

Without a backtrace, it's nearly impossible to say where the problem may be occurring. Thanks!

Mark

  Changed 5 years ago by Lu.

Here is the output from gdb. I'm not sure it will be that useful, though, since there is no stack to backtrace.

[18:12:08 lucien@asmodee:pge]$ gdb ../../parrot 
GNU gdb Fedora (6.8-29.fc10)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(no debugging symbols found)
(gdb) run ../../runtime/parrot/library/PGE/Perl6Grammar.pir  --output=PGE/builtins_gen.pir PGE/builtins.pg
Starting program: /home/lucien/Software/Source/parrot/parrot ../../runtime/parrot/library/PGE/Perl6Grammar.pir  --output=PGE/builtins_gen.pir PGE/builtins.pg
warning: "/usr/lib/debug/usr/lib/libicudata.so.40.0.debug": The separate debug info file has no debug info
[Thread debugging using libthread_db enabled]
PackFile_unpack: This is not a valid Parrot bytecode file
Parrot VM: Can't unpack packfile ./PGE.pir.
Unable to append PBC to the current directory
current instr.: 'parrot;PGE;Perl6Grammar;Compiler;__onload' pc 22 (../../runtime/parrot/library/PGE/Perl6Grammar.pir:75)
called from Sub 'parrot;PGE;Perl6Grammar;Compiler;main' pc -1 ((unknown file):-1)

Program exited with code 01.
(gdb) bt
No stack.
(gdb) kill
The program is not being run.
(gdb) quit

Hope this helps.
Thanks for your time and effort.
Lu.

  Changed 5 years ago by Infinoid

Thanks for trying. Something went wrong when running it; I'm not really sure what the story is. Is this the same tree from your report earlier today? Did a "make clean" happen somewhere in between?

What happens when you do:

$ make realclean
$ svn update
$ perl Configure.pl
$ make

And then when it errors out, cd into the pge directory and do the gdb thing from there?

The strange thing about your error is that PGE.pir isn't an autogenerated file; it should have come as part of the parrot checkout, so it really shouldn't be missing.

  Changed 5 years ago by Infinoid

  • owner changed from nobody to Infinoid

follow-up: ↓ 9   Changed 5 years ago by Lu.

Well, the strangest thing is that it's not missing. I checked: it's there, and its content is the same as on the svn server. Could it be a character encoding problem?

I already did a make realclean and svn update between my first report and the update, hoping it would change something. I tried again, but there is no difference: I still get the same output. Yet I noticed something strange in the Configure output:

auto::jit -           Determine JIT capability...p = 0x86d1000  PAGE_SIZE = 4096 (0x1000)
failure: Permission denied
.........................yes.

Could this have something to do with the segfault? It could be a SELinux policy problem. I'll try disabling SELinux and rebuilding parrot.

follow-up: ↓ 8   Changed 5 years ago by Lu.

Yes, setting SELinux to permissive did the trick. parrot was built, all tests passed, and it installed all right.

Thanks again for your time and effort.

By the way, is it normal for SELinux to block things in that way? I mean, I didn't have that problem when building perl, or Image::Magick, although I didn't do anything different this time. Does parrot require additional rights, or should SELinux policies be updated to include it?

Lu.

in reply to: ↑ 7   Changed 5 years ago by Infinoid

Replying to Lu.:

By the way, is it normal for SELinux to block things in that way?

Yeah, I've heard of this problem once before. (see TT #18.)

I mean, I didn't have that problem when building perl, or Image::Magick, although I didn't do anything different this time.

Perl and IM don't do JIT. For JIT, we need to be able to allocate a buffer somehow, write native machine code to it, and execute it. SELinux in enforcing mode doesn't allow you to have the write and execute permission bits set on the same buffer at the same time. Rather than crashing or returning an error when you try to set those bits, it apparently fails silently, doesn't tell the app, and lets it go on and crash horribly.

Before ticket #18, we were just using memory from the heap, which in hindsight was a really bad idea. Now we're using mmapped buffers, which plays a little nicer with SELinux, but we're still not all the way there.

The next step for us is to use a two-stage process where the memory is allocated, set for writing, written to, set for executing, and executed. We're not quite there yet, which is why SELinux barfs. I'd be interested to hear whether (for instance) the JVM has some special SELinux setup to get around this problem.
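
For reference, the two-stage process described above can be sketched in C roughly as follows. This is a minimal illustration and not Parrot's code: load_code_two_stage is a hypothetical helper name, and the sketch assumes a Linux-like system where mmap() supports MAP_ANONYMOUS. Whether a particular SELinux policy permits the final mprotect() call on anonymous memory is not guaranteed here, so the caller has to check the return value.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not a Parrot function): copy `len` bytes of machine
 * code into a fresh page-aligned buffer that is never writable and
 * executable at the same time. Returns the buffer and stores its mapped
 * size in *mapped_len, or returns NULL on failure. */
static void *load_code_two_stage(const unsigned char *code, size_t len,
                                 size_t *mapped_len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t size;
    void *buf;

    assert(code != NULL && mapped_len != NULL);
    if (page <= 0 || len == 0)
        return NULL;

    /* Round the request up to a whole number of pages. */
    size = ((len + (size_t)page - 1) / (size_t)page) * (size_t)page;

    /* Stage 1: anonymous mapping that is readable and writable only. */
    buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    memcpy(buf, code, len);

    /* Stage 2: drop write permission and add execute permission. */
    if (mprotect(buf, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, size);
        return NULL;
    }

    *mapped_len = size;
    return buf;
}
```

A caller would cast the returned pointer to a function pointer of the appropriate type (which is architecture-specific) and release the buffer with munmap() using the stored size when done. The point of the design is that the buffer is never writable and executable at the same moment.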

I think some of the parrot core developers are hoping that throwing away our jit engine and using llvm or libjit will fix this problem for us.

Mark

in reply to: ↑ 6 ; follow-ups: ↓ 10 ↓ 11   Changed 5 years ago by doughera

Replying to Lu.:

Yet I noticed something strange in the Configure output:

> auto::jit -           Determine JIT capability...p = 0x86d1000  PAGE_SIZE = 4096 (0x1000)
> failure: Permission denied
> .........................yes.

This is a bug in the JIT detection. Clearly something is reporting a failure somewhere, yet it's still reporting 'yes'. I think the short-term fix would be to alter the jit test so that it detected this situation and automatically set jitcapable=0.

in reply to: ↑ 9   Changed 5 years ago by Infinoid

Replying to doughera:

This is a bug in the JIT detection. Clearly something is reporting a failure somewhere, yet it's still reporting 'yes'. I think the short-term fix would be to alter the jit test so that it detected this situation and automatically set jitcapable=0.

Agreed. Santtu++ sent me a C file, test-jit-might-work.c, which tests the various types of accesses. He suggested the current configure test should be replaced with something based on this, and I would tend to agree. I'll attach the C file he provided to this ticket.

Changed 5 years ago by Infinoid

test-jit-might-work.c from Santtu++

in reply to: ↑ 9 ; follow-up: ↓ 12   Changed 5 years ago by Infinoid

Replying to doughera:

Replying to Lu.:

Yet I noticed something strange in the Configure output:

> auto::jit -           Determine JIT capability...p = 0x86d1000  PAGE_SIZE = 4096 (0x1000)
> failure: Permission denied
> .........................yes.

This is a bug in the JIT detection. Clearly something is reporting a failure somewhere, yet it's still reporting 'yes'. I think the short-term fix would be to alter the jit test so that it detected this situation and automatically set jitcapable=0.

I've done a little digging on this. I think there's some confusion caused by the fact that the auto::jit step contains checks for both JIT and other exec-related things.

The above error message is emitted by config/auto/jit/test_exec_linux_c.in. Failure of that test means "yes, this machine has exec protection" and causes PARROT_HAS_EXEC_PROTECT to be defined.

It is not taken as failure of the JIT engine. In fact, there is no active runtime testing of whether JIT works at all; there are just some lookup tables defining which arch/OS combinations are supposed to have working JIT. I'm working on getting Santtu's test file into this test, or at least the subset of it which pertains to our current JIT implementation. It could use a portability review, though: it quite obviously won't build on win32, and I'm not sure how many unixes it will run on as-is. It will probably need OS-specific test files, like the exec test has.

Hmm. Currently, JIT uses mmap() buffers when PARROT_HAS_EXEC_PROTECT is set and the heap otherwise. To test whether JIT will work, the test needs to know which style to test. Maybe those two tests should be merged together.
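
A merged test along the lines discussed above might probe both allocation styles and report which ones the system permits. The following is only a sketch under assumptions: it is not test-jit-might-work.c or the existing test_exec_linux_c.in, the function names and the bitmask convention are made up for illustration, and it checks only whether the kernel grants the permissions, not whether generated code actually runs.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Illustrative result bits (not Parrot configuration names). */
enum { EXEC_HEAP = 1, EXEC_MMAP = 2 };

static size_t page_size(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);  /* _SC_PAGESIZE is expected to be available */
    return (size_t)page;
}

/* Heap style: page-aligned heap memory, then mprotect() to RWX.
 * Returns 1 if the kernel granted the permissions, 0 otherwise. */
static int heap_style_works(void)
{
    size_t page = page_size();
    void *raw = NULL;
    int ok;

    if (posix_memalign(&raw, page, page) != 0)
        return 0;
    ok = (mprotect(raw, page, PROT_READ | PROT_WRITE | PROT_EXEC) == 0);
    if (ok)  /* restore normal heap permissions before freeing */
        mprotect(raw, page, PROT_READ | PROT_WRITE);
    free(raw);
    return ok;
}

/* mmap style: ask for an anonymous RWX mapping directly.
 * Returns 1 if the mapping was granted, 0 otherwise. */
static int mmap_style_works(void)
{
    size_t page = page_size();
    void *buf = mmap(NULL, page, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (buf == MAP_FAILED)
        return 0;
    munmap(buf, page);
    return 1;
}

/* Combine both probes into one bitmask a configure step could key on. */
static int probe_exec_styles(void)
{
    int result = 0;

    if (heap_style_works())
        result |= EXEC_HEAP;
    if (mmap_style_works())
        result |= EXEC_MMAP;
    return result;
}
```

A configure step could then treat a result of 0 as "no usable style" and otherwise pick the heap or mmap() style from the bits that are set, instead of printing a failure and still answering 'yes'.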

in reply to: ↑ 11 ; follow-up: ↓ 13   Changed 5 years ago by markmont

I've attached a patch, configure_jit_selinux.patch, for setting PARROT_HAS_EXEC_PROTECT under Fedora, thus causing JIT to use mmap() buffers on Fedora systems (which works) instead of using the heap (which results in a segmentation fault when SELinux is in enforcing mode). This was tested under Fedora 11 with "make test" for both parrot and rakudo.

I consider this to be a temporary workaround pending the proper/complete solutions described by Infinoid above.

The patch should be safe, as it changes only config/auto/jit/test_exec_linux_c.in to test for exec protection using mmap() instead of mprotect(), as this is how src/platform.c currently works for Linux platforms.

The change became necessary when Fedora tightened SELinux exec protection several releases back, breaking the mprotect() method when used with the heap when SELinux is in enforcing mode. See  http://people.redhat.com/drepper/selinux-mem.html

Changed 5 years ago by markmont

Improve config/auto/jit.pm test for PARROT_HAS_EXEC_PROTECT to work around Fedora/SELinux issues

in reply to: ↑ 12   Changed 5 years ago by markmont

Replying to markmont:

I've attached a patch, configure_jit_selinux.patch, for setting PARROT_HAS_EXEC_PROTECT under Fedora, thus causing JIT to use mmap() buffers on Fedora systems (which works) instead of using the heap (which results in a segmentation fault when SELinux is in enforcing mode).

JIT functionality has been removed from Parrot post 1.6.0, and the test for setting PARROT_HAS_EXEC_PROTECT was moved to config/auto/frames/test_exec_linux_c.in

I've updated the previous patch to be against r41635 and uploaded it as configure_selinux_exec_protect.patch. The new patch is identical to the previous patch, except for the location of the patched file. Please disregard (or delete) the old configure_jit_selinux.patch attachment.

Changed 5 years ago by markmont

Improve config/auto/frames/test_exec_linux_c.in to work around SELinux/Fedora issues related to PARROT_HAS_EXEC_PROTECT. This patch obsoletes configure_jit_selinux.patch

  Changed 5 years ago by coke

#1002 was just closed as a duplicate of this ticket.

  Changed 5 years ago by whiteknight

Can we get any confirmation from a dev on this platform that:

  • This is still an issue
  • The patches provided here do or do not work?

  Changed 4 years ago by jkeenan

Echoing whiteknight's questions of 10 months ago:

  • Are these still live issues?
  • Do the patches improve things?

And adding one:

  • Given that we are pretty Jit-less at this point, are these live issues?

kid51

  Changed 4 years ago by whiteknight

  • status changed from new to closed
  • resolution set to worksforme

I'm closing this ticket. We haven't seen any failing reports from Fedora systems in Smolder, and we no longer have many of the components mentioned in the ticket. WORKSFORME.
