Changes between Version 5 and Version 6 of PGEBestPractices

Timestamp:
06/30/09 01:28:24
Author:
Austin_Hastings
Comment:

Fixed typos, added leader.

  • PGEBestPractices

    v5 v6  
    22 
    33== Parsing Techniques == 
     4 
     5The techniques listed here are things you might want to try. They're definitely not things you must do, only things that have helped in the past. 
    46 
    57=== ''Everything'' goes on a stack === 
     
    3436Remember when building PAST trees that the parser is "experimenting" with your rule. Maybe this is an expression, but maybe it's an anonymous declaration. 
    3537 
    36 As a result, you may build a very complex parse tree only to have the parser decide it's wrong, and discard it. Consider having some "sequence points" in your grammar where you can be sure that the parser has correctly identifier whatever it is you are building. 
     38As a result, you may build a very complex parse tree only to have the parser decide it's wrong, and discard it. Consider having some "sequence points" in your grammar where you can be sure that the parser has correctly identified whatever it is you are building. 
    3739 
    3840Consider a variable declaration. If you store the variable declaration in a lexical block, and the parser decides that those curly braces don't ''really'' indicate a block, there's no harm. The connection from declaration to block was encapsulated in the block itself, so discarding the block's PAST tree discards the variable declaration. 
    3941 
    40 But if you interpret a variable declaration as some kind of global, and jam a symbol into a namespace, then what will you do if the parse fails? The parser might find a valid alternative, but you've injected a bogus symbol into the symtable someplace. 
     42But if you interpret a variable declaration as some kind of global, and jam a symbol into a namespace, then what will you do if the parse fails? The parser might find a valid alternative, but you've already injected a bogus symbol into the symtable someplace. 
    4143 
    42 For a dynamic language, the best way to deal with this may be to ignore it -- let autovivification solve the problem. Alternatively, you may want to create a "transaction" block representing changes to the namespace, and merge that block when the parser returns to some higher level (the sequence points mentioned before: a completed function decl, for example). 
     44For a dynamic language, the best way to deal with this might be to ignore it -- let autovivification solve the problem. Or, you may want to create a "transaction" block representing changes to the namespace, and merge that block when the parser returns to some higher level (the sequence points mentioned before: a completed function decl, for example). 
    4345 
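The "transaction" idea can be sketched outside of PGE. The Python below is only an illustration under assumed names (`SymbolTable`, `Transaction`, `declare`, `commit`, and `rollback` are all hypothetical, not part of PGE or PCT): declarations are buffered while the parser is still experimenting, and merged into the namespace only at a sequence point.

```python
class SymbolTable:
    """Global namespace that changes only when a transaction commits."""

    def __init__(self):
        self.symbols = {}

    def begin(self):
        # Start buffering declarations for one speculative parse attempt.
        return Transaction(self)


class Transaction:
    """Pending namespace changes, dropped unless commit() is called."""

    def __init__(self, table):
        self.table = table
        self.pending = {}

    def declare(self, name, info):
        # Record the symbol locally; the shared table is not touched yet.
        self.pending[name] = info

    def commit(self):
        # Call this at a sequence point, e.g. a completed function decl.
        self.table.symbols.update(self.pending)
        self.pending = {}

    def rollback(self):
        # The parser backed out of this alternative: forget everything.
        self.pending = {}


table = SymbolTable()

# First alternative fails, so its symbol never reaches the table.
failed = table.begin()
failed.declare("bogus", "int")
failed.rollback()

# Second alternative succeeds and is merged at the sequence point.
ok = table.begin()
ok.declare("counter", "int")
ok.commit()

print(sorted(table.symbols))
```

The design choice is the same one the lexical-block case gets for free: the link from declaration to container lives in something the parser can throw away.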
    4446=== Use 'state' variables to avoid special cases === 
     
    4648For handling non-local conditions, like "first time this occurs" or "first one in file", use a state variable instead of trying to build a complex pattern. This can be used to start parsing with a "default" configuration that skips certain productions. 
    4749{{{ 
    48   rule TOP { 
    49     {{ $P0 = box 1 
    50        set_global '$Top_of_file', $P0 
    51     }} 
     50   rule TOP { 
     51      {{ $P0 = box 1 
     52          set_global '$Top_of_file', $P0 
     53      }} 
    5254 
    53     # ... rest of rule 
    54   } 
     55      <section>* 
     56      {*} 
     57   } 
    5558 
    56   rule section { 
    57     [ <section_header> 
     59   rule section { 
     60      [ <section_header> 
    5861 
    59     || # Note: Section Header is OPTIONAL at top of file 
    60       <?{{ $P0 = get_global '$Top_of_file' 
     62      || # Note: Section Header is OPTIONAL at top of file 
     63         <?{{ $P0 = get_global '$Top_of_file' 
    6164            .return($P0) 
    62        }}> 
    63     ] 
    64     # Not at TOF anymore. 
    65     {{ $P0 = box 0 
    66        set_global '$Top_of_file', $P0 
    67     }} 
     65         }}> 
     66      ] 
     67      # Not at TOF anymore. 
     68      {{ $P0 = box 0 
     69         set_global '$Top_of_file', $P0 
     70      }} 
    6871 
    69     # ... rest of rule 
    70   } 
     72      # ... rest of rule 
     73   } 
    7174}}} 
    7275 
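The same state-variable idea, reduced to a hand-written parser, might look like the Python sketch below. The input format is invented for illustration (a header is any line starting with `[`, and a section ends at a line reading `end`); the only point is that one flag replaces a special-case pattern for "first section in the file".

```python
def parse_sections(lines):
    """Split lines into (header, body) sections.

    Only the first section may omit its header. The '[' header marker
    and the 'end' terminator are assumptions made for this sketch.
    """
    top_of_file = True  # state variable, set once before parsing starts
    sections = []
    i = 0
    while i < len(lines):
        if lines[i].startswith("["):
            header = lines[i]
            i += 1
        elif top_of_file:
            header = None  # header is optional at top of file
        else:
            raise SyntaxError(f"line {i + 1}: expected a section header")
        top_of_file = False  # not at top of file anymore

        body = []
        while i < len(lines) and lines[i] != "end":
            body.append(lines[i])
            i += 1
        i += 1  # step past 'end' (or past the end of input)
        sections.append((header, body))
    return sections


print(parse_sections(["x = 1", "end", "[b]", "y = 2", "end"]))
```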
     
    7881{{{ 
    7982   rule variable_decl { 
    80       <storage_class> <type> <init_declarator> 
     83      <storage_class>  
     84      <type>  
     85      <init_declarator> 
    8186   } 
    8287 
    8388   rule parameter_decl { 
    84        <type> <init_declarator> 
     89      <type>  
     90      <init_declarator> 
    8591   } 
    8692}}} 
     
    9399   } 
    94100}}} 
    95 The mode is a special token that just executes some inline PIR code: 
     101The first line, above, looks like `[ A B ]?` where B happens to be an optional rule. That structure essentially means that if A is present, B is required also. It's a useful pattern for guard conditions. 
     102 
     103The mode rule is a special token that just executes some inline PIR code: 
    96104{{{ 
    97105token DECL_MODE_PARAM { 
     
    119127}}} 
    120128Note that the parser may try several alternatives, so don't try to insert error messages too early in the rule. In the example above, the '{' is the key indicator of a compound statement. It makes no sense to try to report a "Compound statement missing '{'" error, since the failure of the rule may tell the parser to try some other, valid subrule. But once the block is opened, it will obviously have to be closed. 
     129 
     130Note also that there is no `error()` method by default. You can start out using `panic()`, but you'll want to attach a less dramatic mechanism to your parser eventually. 
    121131 
    122132== Performance == 
     
    146156 
    147157If several rules can be uniquely distinguished by a simple prefix, hoist the prefix into a parent rule. 
    148  
    149158{{{ 
    150159rule built_in_function { 
     
    165174} 
    166175}}} 
    167  
     176The more completely you hoist, the more code you eliminate. Ideally you'll save the keyword for later use, maybe restructuring the rule to build the function call in pieces. (Put it on a stack!) 
    168177 
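As a rough analogy in Python regular expressions (not PGE syntax), hoisting looks like the sketch below. The `print_` built-in names are hypothetical, chosen only to share a prefix; the named group plays the role of saving the keyword for later use.

```python
import re

# Unhoisted: every alternative repeats the shared prefix.
UNHOISTED = re.compile(r"print_int|print_str|print_float")

# Hoisted: the prefix is matched once, and the distinguishing
# part is captured so later code can dispatch on it.
HOISTED = re.compile(r"print_(?P<kind>int|str|float)")


def classify(text):
    """Return the captured suffix for a built-in name, or None."""
    m = HOISTED.fullmatch(text)
    return m.group("kind") if m else None


for name in ["print_int", "print_float", "echo"]:
    print(name, bool(UNHOISTED.fullmatch(name)), classify(name))
```

Both patterns accept the same strings; the hoisted form simply stops re-testing the common prefix for each alternative.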
    169178== Debugging ==