Ticket #1789 (closed bug: fixed)
String concatenation is really slow
Reported by: | pmichaud | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | core | Version: | 2.7.0 |
Severity: | medium | Keywords: | |
Cc: | Language: | ||
Patch status: | Platform: |
Description
String concatenation can be really slow in Parrot. This was originally reported in TT #1749 as part of the 'readall' method on FileHandle PMCs; that ticket has been marked as "fixed" (by causing 'readall' to avoid string concatenations) but the underlying performance problem remains. This ticket focuses more directly on the string concatenation issue so it doesn't get lost.
The following Perl 5 program generates a string of 25,000 comma-separated random integers:
#! perl my $s = int(rand(10000)); for ($n=1; $n<25000; $n++) { $s = $s . ', ' . int(rand(10000)); } print length($s), "\n"
Hand translation to PIR results in something like:
pmichaud@orange:~/rakudo$ cat x1.pir .loadlib 'math_ops' .sub 'main' :main .local string s .local int n $I0 = rand 10000 s = $I0 n = 1 loop: unless n < 25000 goto done $I0 = rand 10000 $S0 = $I0 s = concat s, ', ' s = concat s, $S0 inc n goto loop done: $I0 = length s say $I0 .end
Running the program under r48933 takes a little over 3 seconds:
pmichaud@orange:~/rakudo$ parrot_install/bin/parrot_config revision 48933 pmichaud@orange:~/rakudo$ time parrot_install/bin/parrot x1.pir 147305 real 0m3.208s user 0m3.190s sys 0m0.000s
Adding a single line to load Rakudo's perl6.pbc bytecode file (but not use Rakudo in any way) makes the program at least nine times slower:
pmichaud@orange:~/rakudo$ diff -u x1.pir x2.pir --- x1.pir 2010-09-13 15:07:23.000000000 -0500 +++ x2.pir 2010-09-13 15:07:29.000000000 -0500 @@ -1,6 +1,8 @@ .loadlib 'math_ops' .sub 'main' :main + load_bytecode 'perl6.pbc' + .local string s .local int n pmichaud@orange:~/rakudo$ time parrot_install/bin/parrot x2.pir 147242 real 0m30.381s user 0m30.030s sys 0m0.200s
Notes:
1. It only takes about a second for x2.pir to complete the C<load_bytecode 'perl6.pbc'> step -- the increase in execution time is almost entirely in the concatenation steps.
2. There's no "large Rakudo parse tree or ast" lying around that would be artificially increasing the work of the GC -- this is truly compiled code.
3. The code is running entirely in the 'parrot' HLL namespace, there are no Rakudo-specific PMCs, subroutines, dispatchers, etc. being used once the load_bytecode has completed.
(By way of comparison, the Perl 5 version of this program completes in approximately 0.7 sec on my system.)
Pm