Changes between Version 21 and Version 22 of PIRCDevelopment

Timestamp:
08/09/09 09:41:19 (12 years ago)
Author:
kjs
Comment:

more heredoc explanation

  • PIRCDevelopment

    v21 v22  
    155155We will now walk through two different scenarios, in order to simplify the discussion. Scenario 1 discusses the case of single heredoc parsing, and Scenario 2 discusses multiple heredoc parsing. Multiple heredoc parsing starts out with Scenario 1, but is a bit more advanced. 
    156156 
    157 ==== Scenario 1: single heredoc parsing ==== 
     157==== Scenario 1a: single heredoc string parsing ==== 
    158158 
    159159Consider the following input: 
     
    178178Since the preprocessor does not build a data structure representing the input, but instead writes the output directly (to a file), the "rest of the line" needs to be stored somewhere. This is because the {{{<<'EOS'}}} heredoc token is basically a placeholder for the actual (heredoc) string contents. Hence, the [source:/trunk/compilers/pirc/src/hdocprep.l#L318 activation of SAVE_REST_OF_LINE state]. 
    179179 
    180 The state {{{SAVE_REST_OF_LINE}}} has only one function, and that is to SAVE the REST OF the LINE :-). It will match all the text after the {{{<<'EOS'}}} heredoc marker up to and include the end-of-line character. This, including an additional "\n" character is stored in the {{{linebuffer}}} field, which always contains the "rest of the line". As you can see, in this scenario there is no "rest of the line", except for the end-of-line character ("\n", or "\r\n" on Windows). 
    181  
    182 After the heredoc marker the actual heredoc string must be scanned, hence the activation of the HEREDOC_STRING state on [source:/trunk/compilers/pirc/src/hdocprep.l#L331 line 331]. 
    183  
    184  
    185  
     180The state {{{SAVE_REST_OF_LINE}}} has only one function, and that is to SAVE the REST OF the LINE :-). It will match all the text after the {{{<<'EOS'}}} heredoc marker up to and including the end-of-line character. This text, together with an additional "\n" character, is stored in the {{{linebuffer}}} field, which always contains the "rest of the line". As you can see, in this scenario there is no "rest of the line", except for the end-of-line character ("\n", or "\r\n" on Windows). See Scenario 1b below for a variant on this, in which the "rest of the line" contains a closing parenthesis of a subroutine invocation. 
     181 
     182After the heredoc marker the actual heredoc string must be scanned, hence the activation of the HEREDOC_STRING state on [source:/trunk/compilers/pirc/src/hdocprep.l#L331 line 331]. In the state HEREDOC_STRING, there are three different types of input: 
     183 
     184 1. "end-of-line" characters, basically an empty line (see [source:/trunk/compilers/pirc/src/hdocprep.l#L357 line 357]). An escaped newline character ("\\n") will be stored as part of the heredoc string. 
     185 
     186 2. "normal" heredoc string lines (see [source:/trunk/compilers/pirc/src/hdocprep.l#L376 line 376]). The line just read may be the heredoc string delimiter that was stored earlier, so the newline character is chopped off first to make the two strings comparable (see [source:/trunk/compilers/pirc/src/hdocprep.l#L381 lines 381-384]). Then, a string comparison is done in order to see whether we just read the heredoc string delimiter. If so, then we need to continue scanning the "rest of the line" that was saved earlier. However, since we need to switch back later to the current buffer, we need to store this current buffer ([source:/trunk/compilers/pirc/src/hdocprep.l#L395 line 395]). Also, the lexer's state is changed to SCAN_STRING, since we're going to scan a saved string. Then, the lexer is told to read the next input from the string buffer ([source:/trunk/compilers/pirc/src/hdocprep.l#L406 line 406]). 
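
The delimiter check described in item 2 can be illustrated with a small sketch. This is not the flex code from hdocprep.l; it is a Python approximation, and the names {{{chop_newline}}} and {{{scan_heredoc_body}}} are hypothetical helpers invented for this illustration. It assumes the input is already split into lines that still carry their end-of-line characters.

```python
def chop_newline(line):
    """Remove one trailing end-of-line sequence ("\\r\\n" or "\\n"), if present."""
    if line.endswith("\r\n"):
        return line[:-2]
    if line.endswith("\n"):
        return line[:-1]
    return line


def scan_heredoc_body(lines, delimiter):
    """Collect heredoc lines until the delimiter line is found.

    Returns (body, consumed), where body is the heredoc text and consumed
    is the number of lines read, including the delimiter line itself.
    """
    body = []
    for consumed, line in enumerate(lines, start=1):
        # Compare without the newline, as the delimiter was stored without one.
        if chop_newline(line) == delimiter:
            return "".join(body), consumed
        # Empty lines and normal lines both become part of the heredoc string.
        body.append(line)
    raise ValueError("unterminated heredoc: delimiter %r not found" % delimiter)
```

After the delimiter is found, the real preprocessor goes on to scan the saved "rest of the line"; the sketch stops at that point and simply reports how many lines were consumed.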
     187 
     188 
     189==== Scenario 1b: single heredoc argument parsing ==== 
     190 
     191Scenario 1b is almost the same as Scenario 1a, except that instead of a heredoc string being assigned to some target (register), the heredoc string is an argument to a function. Consider the following input: 
     192 
     193{{{ 
     194 
     195.sub main 
     196  foo(<<'EOS') 
     197This 
     198is 
     199a 
     200heredoc 
     201string. 
     202 
     203EOS 
     204.end 
     205  
     206}}} 
     207 
     208The process of parsing this heredoc string is pretty much the same as in Scenario 1a, except that the "rest of the line" contains the closing parenthesis ")" to close the argument list of the invocation of {{{foo}}}.  
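
To make the role of the "rest of the line" concrete, here is a toy Python sketch that handles a single heredoc per line, as in Scenarios 1a and 1b. It is only an approximation of the idea, not the actual preprocessor: the function name {{{flatten_single_heredoc}}} is hypothetical, and the output form (a double-quoted string with newlines written as "\n") is an assumption made for this illustration.

```python
import re

# Matches a single-quoted heredoc marker such as <<'EOS' and captures the delimiter.
HEREDOC_MARKER = re.compile(r"<<'([^']*)'")


def flatten_single_heredoc(source):
    """Replace each <<'DELIM' marker with the heredoc text as a quoted string."""
    lines = source.splitlines(keepends=True)
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        match = HEREDOC_MARKER.search(line)
        if match is None:
            out.append(line)
            continue
        delimiter = match.group(1)
        before = line[:match.start()]
        # Everything after the marker is the saved "rest of the line",
        # e.g. ")\n" in Scenario 1b or just "\n" in Scenario 1a.
        rest_of_line = line[match.end():]
        body = []
        while i < len(lines) and lines[i].rstrip("\r\n") != delimiter:
            body.append(lines[i])
            i += 1
        if i == len(lines):
            raise ValueError("unterminated heredoc: %r" % delimiter)
        i += 1  # skip the delimiter line itself
        text = "".join(body)
        # Assumed escaping for the emitted string: backslash, quote, newline.
        escaped = (text.replace("\\", "\\\\")
                       .replace('"', '\\"')
                       .replace("\n", "\\n"))
        # The heredoc string takes the marker's place; the saved rest of the
        # line is emitted only after the heredoc contents.
        out.append(before + '"' + escaped + '"' + rest_of_line)
    return "".join(out)
```

Applied to the Scenario 1b input, the closing parenthesis is emitted only after the heredoc contents have taken the marker's place, which is why it has to be saved first.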
    186209 
    187210