Document Actions

PRS: Lexical scanner, macro processor, and parser; first pass of the SETL compiler.

by Paul McJones — last modified 2021-03-18 20:22
PRS: Lexical scanner, macro processor, and parser; first pass of the SETL compiler. stlprs.opl
       1 .=member intro
       2$           ssssssss   eeeeeeeeee  tttttttttt  ll
       3$          ssssssssss  eeeeeeeeee  tttttttttt  ll
       4$          ss      ss  ee              tt      ll
       5$          ss          ee              tt      ll
       6$          sssssssss   eeeeee          tt      ll
       7$           sssssssss  eeeeee          tt      ll
       8$                  ss  ee              tt      ll
       9$          ss      ss  ee              tt      ll
      10$          ssssssssss  eeeeeeeee       tt      llllllllll
      11$           ssssssss   eeeeeeeee       tt      llllllllll
      12$
      13$
      14$                ppppppppp   rrrrrrrrr    ssssssss
      15$                pppppppppp  rrrrrrrrrr  ssssssssss
      16$                pp      pp  rr      rr  ss      ss
      17$                pp      pp  rr      rr  ss
      18$                pppppppppp  rrrrrrrrrr  sssssssss
      19$                ppppppppp   rrrrrrrrr    sssssssss
      20$                pp          rr   rr             ss
      21$                pp          rr    rr    ss      ss
      22$                pp          rr     rr   ssssssssss
      23$                pp          rr      rr   ssssssss
      24$
      25$
      26$                 t h e    s e t l    p a r s e r
      27$
      28$       this software is part of the setl programming system
      29$                address queries and comments to
      30$
      31$                          setl project
      32$                 department of computer science
      33$                      new york university
      34$           courant institute of mathematical sciences
      35$                       251 mercer street
      36$                      new york, ny  10012
      37$
      38
      39
      40$ this file contains the source for the first phase of the setl
      41$ compiler.  this phase consists of a parser, macro processor,
      42$ and lexical scanner.  its output is a symbol table and a reversed
      43$ polish string.
      44$
      45$ the parser is by far the simplest of the three sections.
      46$ the setl grammar is written in an extended bnf which can
      47$ be viewed as an assembly language for some abstract
      48$ parsing machine.  this assembly language is turned into
      49$ interpreteble code by the utility program 'syn', and
      50$ interpreted by the parser.
      51$
      52$ the parser consists of an interpreter called 'parse' and
      53$ several small auxilliary routines.
      54$
      55$ the scanner is also quite simple. it consists of a single
      56$ routine called 'trulex' which returns the next token from the
      57$ input stream.
      58$
      59$ the macro processor is relatively complex; the rest of this
      60$ section will concentrate on its design.
      61
      62$ the macro processor consists of three hierarchical routines.
      63$ each of these routines can be viewed from the outside as
      64$ a black box which obtains a token from some lower level routine
      65$ and returns it. from above, these routines seem like a garden
      66$ variety lexical scanner. in actuality they intercept certain
      67$ tokens passed to them in order to expand macros and absorb
      68$ macro definitions.
      69$
      70$ the three routines are:
      71$
      72$ 1. lexscan
      73$
      74$    this is the top level routine of the macro processor.
      75$    each time it is called, it returns a token from a lower
      76$    level routine. before returning a token, it checks
      77$    whether the token is a macro. if so, it initializes the
      78$    expansion of the macro and requests another token. it
      79$    returns the first token which is not a macro.
      80$
      81$ 2. absorb
      82$
      83$    this is the second level routine of the macro processor.
      84$    it passes along tokens from the routine below it, looking
      85$    for occurrences of 'macro' and 'drop'. each time it sees
      86$    one of these it makes a detour to process a macro drop
      87$    or definition.
      88$
      89$ 3. expand
      90$
      91$    this is the bottom level routine of the macro processor.
      92$    it checks whether any macros are currently being expanded.
      93$    if so, it returns the next word of macro text; otherwise
      94$    it obtains a token from trulex.
      95
      96$ the macro processor uses three global data structures:
      97$
      98$ 1. mtab:
      99$
     100$    this is an array used to store macro text. each word of
     101$    mtab represents a single token of macro text. there are
     102$    three types of macro text entries:
     103$
     104$    a. simple text
     105$
     106$       a simple text entry is represented by a symbol table
     107$       pointer.
     108$
     109$    b. user supplied arguments
     110$
     111$       the i-th user supplied argument is represented by
     112$       i+bias_user. bias_user is a bias used to distinguish
     113$       user supplied arguments from plain text.
     114$
     115$    c. generated arguments
     116$
     117$       generated arguments are names which are supplied by the
     118$       compiler during each macro expansion. the first time
     119$       we see each generated argument in a new expansion, we
     120$       generate a new name for it; the next time we see it
     121$       we supply the old name.
     122$
     123$       generated arguments are represented in two forms:
     124$
     125$       i+bias_gen1:  first occurrence of i-th generated argument
     126$       i+bias_gen2:  any other occurrence of i-th generated arg.
     127$
     128$    the various biases are arranged as follows:
     129$
     130$    bias_user = dimension of symbol table + 1.
     131$    bias_gen1 = bias_user + 100
     132$    bias_gen2 = bias_gen1 + 100
     133
     134$    the text for each macro consists of a series of mtab entries
     135$    followed by a zero entry. if 'm' is a macro then 'morg(m)'
     136$    gives its first entry in mtab.
     137$
     138$ 2. astack
     139$
     140$    astack is an array used to hold actual arguments of macros
     141$    currently being expanded. each argument can consist of a
     142$    series of tokens. its corresponding astack entry consists
     143$    of a series of symbol table pointers followed by a zero word.
     144$
     145$ 3. mstack
     146$
     147$    mstack is a stack used to control recursion during macro
     148$    expansion. we associate a series of mstack entries, known
     149$    as a stack frame, with each macro we are in the process of
     150$    expanding.
     151
     152$    each stack frame is organized as follows:
     153$
     154$    first word:    points to next word of macro text
     155$    second word:   gives number of arguments
     156$    i+2 nd word:   gives pointer to start of i-th argument in astack
     157$
     158$    stack frames are built whenever lexscan encounters a macro.
     159$    expand returns the next word of macro text if we are processing
     160$    macros and calls trulex otherwise.
     161
     162$    when expand discovers a user-argument in the text of a macro
     163$    it begins substituting the actual argument. this is done by
     164$    building an extra stack frame which contains:
     165$
     166$    first word:    points to next word of argument text in astack
     167$    second word:   indicates zero arguments.
     168$
     169$    the pointer to astack is biased by the dimension of mtab to
     170$    distinguish an argument stack frame from a macro stack frame.
     171
     172$ the parser does not communicate with the scanner directly, but
     173$ instead calls an interface routine -gettok-.  this routine
     174$ manipulates a token buffer which is used to implement backtracking
     175$ in the parser.
     176
     177
     178$ outputs of the parser
     179$ ---------------------
     180
     181$ the parser produces two tables:
     182
     183$ 1. the polish string
     184
     185$    the polish string is represented as an array whose entries
     186$    have two fields:
     187
     188$    pol_typ:      type code pol_xxx
     189$    pol_val:      value of entry
     190
     191$    entries in the polish string are refered to as 'nodes'. there
     192$    are three types of nodes:
     193
     194$    a. names
     195
     196$       these nodes indicate names appearing in the source program.
     197$       they have:
     198
     199$       pol_typ:     pol_name
     200$       pol_val:     pointer to 'names' array
     201
     202$       the names array is used to store the actual name of
     203$       the node.
     204
     205$    b. counters
     206
     207$       counters are integers which indicate the number of clauses
     208$       found by each  operation of the parser. they have:
     209
     210$       pol_typ:     pol_count
     211$       pol_val:     integer indicating count
     212
     213$    c. markers
     214
     215$       markers are nodes indicating points where semantic routines
     216$       are to be invoked. they have:
     217
     218$       pol_typ:     pol_mark
     219$       pol_val:     code p_xxx
     220
     221$    the polish string is written out whenever it is full. rather
     222$    than write out both the polish string and the names array,
     223$    out, we write the names entry for each node directly after
     224$    the node itself.
     225
     226$    the variable 'polp' points to the last entry in the string.
     227
     228$    the polish string is actually written onto two files known
     229$    as the main file and the auxiliary file. the two files
     230$    are merged by the semaintic pass. this allows the semanitc
     231$    pass to act as if there were no forward references to
     232$    procedures.
     233
     234
     235$ 2. names
     236
     237$    names is a word-sized array used to store the names of tokens.
     238$    we pack each token into names by converting it to a 'standard'
     239$    self defining string, then putting each word of the string into
     240$    a successive word of names, starting with the low order word.
     241
     242$    pointers to names always point to the low order word of a string,
     243$    and thus allow us to access its slen and sorg fields.
     244
     245$    names entries are 'standardized' in the sense that an
     246$    n-character token will always have an sorg of .sds. n +1.
     247
     248
     249
     250
     251
     252$ the symbol table
     253$ ----------------
     254
     255$  the parser uses a symbol called symtab. it has the following
     256$  fields:
     257
     258
     259$   name:        pointer to names array
     260$   lex_typ:     extended lexical type
     261$   key_val:     lexical value, see below
     262$   lit_val:     literal value, see below.
     263$   link:        clash list link
     264$   morg:        points to first word of macro text.
     265$   margs:       number of macro arguments
     266$   mcode:       indicates representation in macro text
     267
     268$  in the code which follows we distinguish between 'keywords' and
     269$  'literals'.
     270
     271$  a 'keyword' is any token whose lex_typ is not some standard
     272$  type such as name, integer, or real. there are several groups
     273$  of keywords, such as binary operators, statement keywords, etc.
     274$  each set of keywords forms a separate lexical class.
     275
     276$  each keyword has a value associated with it. this value is
     277$  given by the symbols key_val field. its meaning is a function
     278$  of its lex_typ:
     279
     280$             lex_typ               key_val
     281$             --------               -------
     282
     283$             l_dkey                 index for branch on keyword
     284$             l_tkey1                as above
     285$             l_tkey2                as above
     286$             l_stkey                as above
     287$             l_lparen                 as above
     288$             l_bin                  precedence
     289$             l_un                   precedence
     290$             l_minus                precedence
     291$             l_debug                debugging code dbg_xxx
     292
     293$      key_val is zero for all other types.
     294
     295$  a 'literal' is any token which appears explicitly in the
     296$  grammar. the lit_val field of each literal contains a code
     297$  lit_xxx which identifies which literal it is. tokens which
     298$  are not literals have a lit_val of 0.
     299
     300$  note that key_val and lit_val must be disjoint, since certain
     301$  tokens are both literals and keywords.
     302
     303$  symtab is arranged as a hash table, with the link field used
     304$  to chain entries with the same hash code. a separate table
     305$  called 'heads' is used to hold the heads of clash chains, and
     306$  the variable 'symtabp' points to the last used symtab entry.
     307
     308$   note that tokens are hashed on their names only, not on their types.
     309$   the types of keywords are set by the initialization routine. if
     310$   trulex discovers a token whose lex_typ field is not already set,
     311$   it fills in a default.
     312
     313
       1 .=member mods
       2
       3
       4$ program revision history
       5$ ------------------------
       6
       7$ this section contains a description of each revision to the program.
       8$ these descriptions have the following format:
       9$
      10$ mm-dd-yy      jdate     author(s)
      11$
      12$ 1.............15........25........................................
      13$
      14$ where mm-dd-yy are the month, day, and year, and jdate is the julian
      15$ date.
      16$
      17$ each time a revision is installed, the author should insert a
      18$ description after line 'mods.21', and change the macro 'prog_level'
      19$ to the current julian date.
      20$
      21$ ......................................................................
sunb   1
sunb   2
sunb   3$ 07/24/84     84206     s. freudenberger
sunb   4$
sunb   5$  1. introduce program parameters -lcp- and -lcs- to control default
sunb   6$     output:  -lcp- controls the listing of program parameters, i.e.
sunb   7$     the initial phase heading;  -lcs- controls the listing of the
sunb   8$     final statistics.  if both are set, the old listing is generated;
sunb   9$     if neither is set, no output is generated unless an error occurs.
sunb  10$     modules affected: start, prsini, and prstrm.
asca   1
asca   2
asca   3$ 03/05/84     84065     d. shields
asca   4$
asca   5$  1. for s37, add option 'cset=asc' to permit translation of programs
asca   6$     developed in ascii without need to translated to portable
asca   7$     character set.
asca   8$     modules affected: start, prsini, and initlex.
suna   1
suna   2
suna   3$ 02/05/84     84065     s. freudenberger
suna   4$
suna   5$  1. support motorola mc68000 microprocessor on sun workstation.
suna   6$     modules affected: start, prsini, ptinit, trulex, initlex,
suna   7$                       stackexp, and svtab.
smfc   1
smfc   2
smfc   3$ 09/01/83     83244     s. freudenberger
smfc   4$
smfc   5$  1. document machine-dependency of integer representation in setl
smfc   6$     binary i/o.
smfd   1$     module affected:  newstat.
smfc   7$  2. correct a comditional assembly directive for s10/s20.
smfc   8$     module affected:  trulex.
smfb   1
smfb   2
smfb   3$ 08/08/83     83220     s. freudenberger
smfb   4$
smfb   5$  1. increase the dimension of astack and mstack to 1023.
smfb   6$     module affected:  start.
smfb   7$  2. check while collecting open tokens for the end of a loop header,
smfb   8$     to avoid scanning beyond it.  this required to add the keyword end
smfb   9$     to the list of keywords defined by default.
smfb  10$     modules affected: start, initstd, opentoks, endtoks, and chktoks.
smfb  11$  3. code anyc string search function inline.
smfb  12$     modules affected: trulex and initlex.
smfb  13$  4. change some cases of s32, s37, s47 conditional assembly to r32.
smfb  14$     modules affected: inparse and svtab.
smfa   1
smfa   2
smfa   3$ 12/16/82     82350     s. freudenberger
smfa   4$
smfa   5$  1. modify trulex to call an internal rather than an external
smfa   6$     subroutine to read the next character.
smfa   7$     module affected:  trulex.
smfa   8$     module deleted:   getc.
smfa   9$  2. check that there really exists an open loop when we encounter a
smfa  10$     'continue' or 'quit' statement with no tokens.
smfa  11$     module affected:  chktoks and ermsg.
      22
      23
      24$ 08/12/82     82224     s. freudenberger
      25$
      26$  1. we postpone the check for constant expressions to the semantic
      27$     pass, and are thus able to constant fold additional operators.
      28$     module affected:  remote text inclusion.
      29$  2. the token buffer for the current token is set to zero before we
      30$     attempt to find the token.
      31$     module affected:  gettok.
      32$  3. we check the symbol table index for zero before we retrieve the
      33$     token name.
      34$     mocudule affected:  symsds.
      35$  4. the enders for error recovery in expressions have been expanded
      36$     to include 'elseif' and 'else', and have been corrected to use
      37$     portable character set representations in the initialisation code.
      38$     module affected:  ermet.
      39$  5. we check that the token buffer indeces are defined (non-zero),
      40$     and replace the names of invalid tokens between tok_out and
      41$     tok_in by question marks.
      42$     module affected:  trcbak.
      43
      44
      45$ 06/15/82     82166     s. freudenberger
      46$
      47$  1. the setl source map file has been renamed to ssm from src.  it is
      48$     now written out whenever the ssm program parameter is supplied,
      49$     not as a funtion of the opt program parameter.  the opt program
      50$     parameter now merely checks the availability of the setl q1 inter-
      51$     face and that an ssm file has been specified.
      52$     modules affected: start, prsini, putcard, stat1, newstat, and
      53$                       prstrm.
      54$  2. trailink blanks are removed before the record is written to the
      55$     ssm file.
      56$     modules affected: trulex and putcard.
      57$  3. we count the 'case  of' of a case statement as a separate
      58$     statement.
      59$     module affected:  (remote text inclusion)
      60
      61
      62$ 06/01/82     82152     s. freudenberger
      63$
      64$  1. we added conditional code for the s37 mts implementation.
      65$     module affected:  prsini.
      66$  2. we re-introduced the keyword 'map' for ambiguous maps, i.e. maps
      67$     which can have both single- and multi-valued image points.
      68$     module affected:  initkey.
      69$  3. the globals 'stat_no' and 'cstat_no' have been deleted.  they were
      70$     dead since 82032.2.
      71$     modules affected: start, stat1, and newstat.
      72$  4. a new global flag has been introduced:  'eor_flag' is set whenever
      73$     a new input record is read.
      74$     modules affected: start and getc.
      75$  5. the implicit string concatenation now actually requires an end-of-
      76$     record between the two string operands.
      77$     module affected:  trulex.
      78
      79
      80$ 03/16/82     82075     s. freudenberger
      81$
      82$   1. the 'rtrs0' and 'rtrs1' debugging options have been deleted from
      83$      the initialisation code of the symbol table.
      84$      module affected:  initkey.
      85
      86
      87$ 02/01/82     82032     s. freudenberger
      88$
      89$  1. the listing output has been moved to start each line in column 1
      90$     rather than column 7.  dump outputs have not been modified.
      91$     modules affected: prsini, usratp, overfl, and prstrm.
      92$  2. the statement counters have been modified to correspond to the
      93$     variables used in the remaining compile phases, i.e. the use of
      94$     cstmt_count, etc.
      95$     modules affected: start, parse, putcard, stat1, and newstat.
      96$  3. the setl optimiser needs the setl binary interface to communicate
      97$     with other compiler phases.  thus we can check already here
      98$     whether we will run into problems later.  furthermore, the
      99$     optimiser likes to produce an annotated source listing:  we
     100$     generate here a map from cummulative statement numbers to tuples
     101$     of source lines, in setl binary format, and write it to the file
     102$     src.
     103$     new program parameters:
     104$             opt=0/1         global optimisation flag:  if set, write
     105$                             setl binary map from cummulative statement
     106$                             numbers to tuples of source lines
     107$             src=/           file to which the setl binary map is
     108$                             written
     109$     new inclusion from compl:  module binio.
     110$     modules affected: start, prsini, getcard, putcard, stat1, newstat,
     111$                       and prstrm.
     112$     module deleted:   lstcrd.
     113$  4. several tokens have been predefined, which before were defined
     114$     when needed:  ':=', 'then', and 'end'.
     115$     modules affected: start, initstd, chkrepr, and badasn.
     116$  5. we only check tokens up to the 'then' in an 'if' statement.  this
     117$     prevents look-ahead beyond the 'then', and thus keeps statement
     118$     numbers correct for the source listing.
     119$     modules affected: opentoks, endtoks, and checktoks.
     120
     121
     122$ 02/01/82    82032     d. shields
     123$
     124$ use r32 conditional symbol for standard 32-bit fields.
     125$ this replaces the field definitions for s32, s37 and s47.
     126
     127
     128$ 01/15/82     82015     s. freudenberger & d. shields
     129$
     130$  1. prsini has been modified to print the phase header to the terminal
     131$     whenever the new control card parameter 'termh=0/1' is set.
     132$     new control card parameter:
     133$         termh=0/1           print phase header on the terminal file
     134$     module affected:  prsini.
     135$  2. the nameset nsptab has been put in start to simplify compilation
     136$     for s32u and s47 (unix).
     137
     138
     139$ 11/29/81    81333     d.shields
     140$
     141$  1. support s47: amdahl uts (universal timesharing system).
     142$     this implementation runs on s37 architecture using an operating
     143$     system very close to unix (v7), and uses the ascii character set.
     144
     145
     146$ 08/13/81     81232     s. tihor
     147$
     148$  1.  expand the size of the s32 name table for the ada compiler.
     149$  2.  expand the heads, symtab, and polish file sizes accordingly.
     150$  3.  add the /upd switch and the /seq switch.  the /seq switch
     151$      initially supports the four options: l(line numbers),
     152$      s(stmt_num), ls(both) and nothing.
     153$  4.  add support for s20 (s10 with extended addressing).
     154
     155
     156$ 12/05/80     80340     s. tihor & d. shields
     157$
     158$  1. shields mods for unix checkout
     159$  1a. support lower case for unix
     160$  1b. use binary parse tables for unix checkout
     161$  2. use name$sio to capture the names of the stdin, stdout, and stderr
     162$     (term) files.
     163$  3. correct two stropped key words in ermet to unstropped form.
     164$  4. do not convert percent to colon inside of quotes even on s66.
     165
     166
     167$ 11/05/80     80310     s. freudenberger
     168$
     169$ the puterr routine has been split, so that the unexpected-end-of-file
     170$ error message is printed more reasonably.
     171
     172
     173$ 08/01/80     80214     s. freudenberger
     174$
     175$  1. for implementations selecting the .+lc conditional assembly, the
     176$     test in the getcard routine for blank after .title has been
     177$     changed to a search for blank-equivalent separators.
     178
     179
     180$ 06/20/80     80172      e. deak
     181$
     182$ modifications have been made to improve the error recovery.  most
     183$ changes have been made in the ermet routine to prevent scanning
     184$ over reserved words.
     185
     186
     187$ 07/08/80     80190     s. freudenberger
     188$
     189$  1. a special polish node is written to mark the end of both the
     190$     polish and the auxiliary polish files.
     191$  2. the line "no errors..." is not echoed to the terminal anymore.
     192$  3. the layout of the title line has been changed.
     193$  4. lines containing listing directives are counted as input lines.
     194
     195
     196$ 05/27/80     80148     s. freudenberger
     197$
     198$  1. adjust various macros for s37 in preparation to compile and
     199$     run the optimizer on that system.
     200
     201
     202$ 04/11/80     80102     d. shields
     203$
     204$ 1. delete is_primitive.
     205$ 2. 'rgcd1' was wrongly spelled 'tgcd1'.
     206
     207
     208$ 02/04/80     80035     s. freudenberger and d. shields
     209$
     210$ 1. implement unary operators acos, asin, atan, char, cos, exp,
     211$    log, sin, sqrt, tan and tanh.
     212$ 2. implement binary operators atan2 and interrogation (?).
     213$ 3. implement type predicates is_atom, is_boolean, is_integer,
     214$    is_map, is_real, is_set, is_string and is_tuple.
     215$    change prim to is_primitive.
     216$ 4. add procedure host() to provide means for adding
     217$    implementation- or site-dependent features.
     218$ 5. drop cset_dis option.
     219$ 6. correct the call to 'opninc' so that the '.copy' option must
     220$    start in column 2.
     221$ 7. the meta-macros and system parameter macros have been replaced
     222$    by an inclusion of cmnpl.sysmac.
     223$ 8. the recursion stack of the parser has been changed to account
     224$    for the possibility of more than 255 <*cl>'s.  this change was
     225$    a consequence of the increase in vlen_lim (cmnpl.q1symtab).
     226
     227
     228$ 01/16/80     80016     s. freudenberger
     229$
     230$ change 79248.5 has been undone, since the code generator symbol table
     231$ allocator does not handle this extension properly.
     232
     233
     234$ 01/15/80     80015     s. freudenberger
     235$
     236$ 1. correct 'puttab' and 'putxtab' for backtracked environment.
     237$ 2. correct 'retab'/'puttabs' so that subr-statement follows '*deck'.
     238$ 3. correct error message in correspondence to grammar change.
     239
     240
     241$ 12/17/79     79351     s. freudenberger
     242$
     243$ 1. 'puttabs' has been shortened to six characters: 'puttbs'.
     244
     245
     246$ 11/30/79     79334     s. freudenberger
     247$
     248$ 1. the names table has been increased to the same size as in cmnpl.
     249$ 2. on polish atbel overflow, we only write out hte non-backtracked
     250$    part of the polish.
     251
     252
     253$ 11/12/79      79316     s. freudenberger
     254$
     255$ 1. the mode keyword 'map' has been dropped.  the initialization
     256$    for the repr keywords has been updated accordingly.
     257$ 2. the error message for slash after compound operator has
     258$    been included.
     259$ 3. new error messages have been added according to the grammar
     260$    extensions.
     261
     262
     263$ 09/17/79     79259     s. freudenberger
     264$
     265$ 1. the current input line is printed unconditionally when an error
     266$    is found.
     267$ 2. the grammar action to force the printing of the current line does
     268$    not call 'put_card' directly anymore, but rather the newly added
     269$    routine 'lstcrd', which checks whether the 'list' option has been
     270$    selected.
     271
     272
     273$ 09/13/79     79256     s. freudenberger
     274$
     275$ 1. logical file names are sized using 'filenamlen' (define in
     276$    cmnpl.sysmac, duplicated in prspl.start since sysmac is not
     277$    included here.
     278
     279
     280$ 09/05/79     79248     s. freudenberger
     281$
     282$
     283$ this correction set installs setl 2.1
     284$
     285$
     286$  1. a conditional assembly 'lc' has been added to control upper/lower
     287$     case equivalency for identifiers.
     288$  2. a character set control card parameter ('cset') has been added to
     289$     control selection of either the portable or the extended character
     290$     sets. (for further details, see comment in member start)
     291$  3. the lexical classes have been changed and ammended.
     292$  4. the parser debugging options have been renamed and made part of
     293$     the general 'debug' statement.
     294$  5. we now accept certain lexical classes as names.  these are the
     295$     base type and the mode keywords, as well as the options for
     296$     the 'debug' and 'trace' statements.
     297$  6. 'trulex' and 'initlex' have been completely rewritten, using the
     298$     little string primitives.  the modification also reflects the
     299$     upper/lower case folding for identifiers.
     300$  7. 'trulex' now accepts '.5' as a real denotation.
     301$  8. 'initlex' contains the code to define one-character alternatives
     302$     for predefined tokens.
     303$  9. the error handler has been updated.
     304$ 10. the routines 'getc' and 'altchar' have been added.  'getc'
     305$     returns the next character from the input buffer, and 'altchar'
     306$     defines its second argument as an alternative for its first
     307$     argument.
     308
     309
     310$ 07/20/79     79201     s. freudenberger
     311$
     312$ 1. error messages on the s10 are echoed to the device tty:, and not
     313$    the file 'tty'.
     314$ 2. the default for the s32 has been changed so that, in the absence
     315$    of a 'term' parameter, errors are echoed to 'sys$error'.
     316$ 3. error messages on the s10 are preceded by '?', the standard error
     317$    marker for the dec-10.
     318
     319
     320$ 05/18/79     79138     s. freudenberger
     321
     322$ 1. the quick parser has been renamed from stlqprs to stlqrs.
     323$ 2. the term option has been cleaned up so that the skip's and
     324$    the call's to contlpr interleave correctly.
     325$ 3. a number of extreneous return statements have been deleted.
     326$ 4. the source has been resequenced.
     327
     328
     329$ 04/27/79     79117     s. freudenberger
     330
     331$ the environment has been modified to provide an assembly-level
     332$ module which contains the parse table output from syn.  this
     333$ module is used, and it is not necessary anymore to read 'synbin'
     334$ for each run.
     335
     336
     337$ 04/12/79     79102     s. freudenberger and d. shields
     338
     339$ 1. three character set changes have been implemented:
     340$     1. the ascii number sign is not a synonym for the keyword exist.
     341$       it is currently unused.
     342$     2. the ascii circumflex is not used as the break character.
     343$        it is currently unused.
     344$     3. the ascii underline is the new break character.
     345$        it is not acceptable as a synonym for the keyword in.
     346$ 2. initial modifications have been included to eliminatethe need
     347$    to read in the grammar file for every run.
     348
     349
     350$ 04/10/79     79100     s. freudenberger
     351
     352$ 1. the parse listing commands -.eject- and -.title- only eject a
     353$    page when the source is listed.
     354$ 2. since there was no precedenc 7, all precedence 8 operators have
     355$    been given precedence 7.  the operators +:= and -:= have been
     356$    given precedence 8.
     357
     358
     359$ 04/03/79     79093     s. freudenberger and d. shields
     360
     361$ 1. as a first step to remove prefix stropping, the pre-control-
     362$    card parameter has been removed, and the pre_flag initialized
     363$    to false, i.e. reserved word stropping.
     364$ 2. a conditional symbol -qp- has been added, allowing the creation
     365$    of a smaller version of the parser.  the main differences between
     366$    the regular parser and the -quick- parser are the following:
     367$    a. reduced macro- and symbol tables
     368$    b. no polish strings are written onto the files -pol-/-xpol-
     369$    c. there are no dump routines in this version.
     370$ 3. a terminal option has been added, echo-printing all error
     371$    messages onto the file specified by the term control card
     372$    parameter.
     373
     374
     375$ 03/27/79     79086     s. freudenberger
     376
     377$ 1. two tests were corrected in -endtoks- and -checktoks-,
     378$    so that the correct number of tokens will be checked.
     379$ 2. if dumps are requested, the dump routines will be invoked
     380$    from -puttabs- rather than from -puttab- and -putxtab-.
     381$ 3. two namesets have been renamed to avoid external name
     382$    conflicts on systems which do not distinct between data
     383$    and program segments:
     384$         initkeyns ---> nsinitkey
     385$         trulexns  ---> nstrulex
     386
     387
     388$ 03/05/79     79065     s. freudenberger
     389
     390$ 1. the size of the macro table (mtab) has been increased.
     391$ 2. subroutine -init- has been renamed to -prsini-.  this is both
     392$    more consistent with the remaining phases of the compiler, and
     393$    avoids a name conflict in the s10 implementation.
     394$ 3. ltlsyn has been changed to emit a member synmax.  this member
     395$    contains a macro -parselitmax- giving the total number of
     396$    literals in the grammar.  -initlit- has been modified accordingly.
     397$ 4. the error handling and recovery has been changed considerably.
     398$    4.1 a new control card parameter has been added: etoks=5/5
     399$        in the new version, each error message prints the last
     400$        etoks tokens that have been accepted by the parser, plus
     401$        the tokens examined but not accepted.  the latter are
     402$        underlined, using a minus sign for unreserved and an
     403$        equal sign for reserved tokens.
     404$    4.2 if the parser finishes to parse a program but is not at
     405$        the end of the input file, it assumes that some error must
     406$        have derailed it, and resumes parsing with the production
     407$        
     408$    4.3 if the tokens after -end- do not match, a check is performed
     409$        to see whether one ore more -end-s are missing.  if so, an
     410$        error is flaged, but the parse proceeds, inserting the missing
     411$        -end-(s).
     412$ 5. the statement numbering has been corrected (finally !).  the
     413$    cause for the neurotic behaviour was the routine -opentoks-,
     414$    which save a fixed number of tokens.  it has been modified so
     415$    that it will only save the tokens to the minimum of the pre-
     416$    defined limit and the next semicolon.  this change is justified
     417$    since both -endtoks- and -checktoks- only compare to the next
     418$    semicolon.
     419
     420
     421$ 12-27-78     78361     a. grand and d. shields
     422
     423$ this correction set installs machine-dependent code for the
     424$ ibm 370, dec-10, and vax.
     425
     426
     427$ 12-8-78     78342     a. grand
     428
     429$ 1. the scanner did not allow a signed exponent in real constants.
     430$ 2. there was a grammar bug which required two new error messages.
     431
     432
     433$ 11-15-78     78319     a. grand and s. freudenberger
     434
     435$ this correction does nothing but install the deck'mods'
     436
       1 .=member start
       2 .=include cndasm
suna   8
       3 .-qp.
       4 .+s66    subr start;
       5 .+r32    prog stlprs;
suna   9 .+r36    prog stlprs;
suna  10 .+qp.
suna  11          prog stlprs;
       8 ..qp
       9
      11
      12$ this is the main routine of the scanner-parser. it contains
      13$ all the global variable declarations, plus calls to invoke
      14$ the rest of the program.
      15
      16
      17      +* prog_level =    $ program level(julian date of last fix)
sunb  11          'prs(84206) '
      19          **
      20
      21 $    conditional symbols
      22
      23 $    enable qp to compile quick parse which has reduced
      24 $    table sizes and runs as stand-along program to assist
      25 $    in finding syntactic errors quickly.
      26
      27$ enable defenv_ptinit if procedure ptinit defined by environment
      28
      29 .+set defenv_ptinit
      30
      31 .+s47.
      32 .-set defenv_ptinit  $ read in parse tables during checkout.
      33 ..s47
      34
      35
      36 .=include sysmac
      37
      38$ character constants
      39
      40      +* cc_tab  =            $ tab character (if available) or blank
      41 .+s10    09
      42 .+s20    09
      43 .+s32    09
      44 .+s47    09
      45 .+s37    1r
      46 .+s66    1r
suna  13 .+s68    09
      47          **
      48
      49
      50
      51
      52      nameset symtab;    $ symbol table and related variables
      53
      54          +*  symtab_sz  =  $ size of symtab
      55 .+s66        120
      56 .+r32        168
      57 .+s10        108
      58 .+s20        108
      59          **
      60          +*  symtab_lim  =  2000  **  $ dimension of symtab
      61 .+qp     +*  symtab_lim  =  1000  **  $ (small) dimension of symtab
      62
      63          size symtab(symtab_sz);
      64          dims symtab(symtab_lim);
      65
      66          size symtabp(ps);   $ pointer to last symtab entry
      67          data symtabp = 0;
      68
      69 .+s66.
      70          +*  name(i)          =  .f. 001, 16, symtab(i)  **
      71          +*  lex_typ(i)       =  .f. 017, 08, symtab(i)  **
      72          +*  key_val(i)       =  .f. 025, 08, symtab(i)  **
      73          +*  lit_val(i)       =  .f. 033, 08, symtab(i)  **
      74          +*  link(i)          =  .f. 065, 16, symtab(i)  **
      75          +*  morg(i)          =  .f. 081, 16, symtab(i)  **
      76          +*  mcode(i)         =  .f. 097, 16, symtab(i)  **
      77          +*  margs(i)         =  .f. 113, 08, symtab(i)  **
      78 ..s66
      79
      80 .+r32.
      81          +*  name(i)          =  .f. 001, 32, symtab(i)   **
      82          +*  lex_typ(i)       =  .f. 033, 08, symtab(i)   **
      83          +*  key_val(i)       =  .f. 041, 08, symtab(i)   **
      84          +*  lit_val(i)       =  .f. 049, 08, symtab(i)   **
      85          +*  link(i)          =  .f. 065, 32, symtab(i)   **
      86          +*  morg(i)          =  .f. 097, 32, symtab(i)   **
      87          +*  mcode(i)         =  .f. 129, 32, symtab(i)   **
      88          +*  margs(i)         =  .f. 161, 08, symtab(i)   **
      89 ..r32
      90
      91 .+s10.
      92          +*  name(i)          =  .f. 001, 18, symtab(i)  **
      93          +*  lex_typ(i)       =  .f. 019, 08, symtab(i)  **
      94          +*  key_val(i)       =  .f. 027, 08, symtab(i)  **
      95          +*  lit_val(i)       =  .f. 037, 08, symtab(i)  **
      96          +*  link(i)          =  .f. 055, 18, symtab(i)  **
      97          +*  morg(i)          =  .f. 073, 18, symtab(i)  **
      98          +*  mcode(i)         =  .f. 091, 18, symtab(i)  **
      99          +*  margs(i)         =  .f. 045, 08, symtab(i)  **
     100 ..s10
     101
     102 .+s20.
     103          +*  name(i)          =  .f. 001, 18, symtab(i)  **
     104          +*  lex_typ(i)       =  .f. 019, 08, symtab(i)  **
     105          +*  key_val(i)       =  .f. 027, 08, symtab(i)  **
     106          +*  lit_val(i)       =  .f. 037, 08, symtab(i)  **
     107          +*  link(i)          =  .f. 055, 18, symtab(i)  **
     108          +*  morg(i)          =  .f. 073, 18, symtab(i)  **
     109          +*  mcode(i)         =  .f. 091, 18, symtab(i)  **
     110          +*  margs(i)         =  .f. 045, 08, symtab(i)  **
     111 ..s20
     112
     113
     114$ the array 'heads' is used to hold the head of each clash list.
     115$ its dimension must be a prime.
     116
     117          +*  heads_lim  =  509  **
     118
     119          size heads(ps);
     120          dims heads(heads_lim);
     121          data heads = 0(heads_lim);
     122
     123$ constants for lexical types
     124$ ---------------------------
     125
     126 .=zzyorg z
     127
     128      defc(l_name)            $ names
     129      defc(l_bold)            $ bold (operator) name
     130      defc(l_int)             $ integer denotation
     131      defc(l_real)            $ real denotation
     132      defc(l_string)          $ string denotation
     133      defc(l_delim)           $ delimiters
     134      defc(l_dkey)            $ declaration statement keyword
     135      defc(l_modekey)         $ system defined modes
     136      defc(l_btkey)           $ base types (local, remote, ...)
     137      defc(l_stkey)           $ statement keyword
     138      defc(l_rwkey)           $ read/write keywords
     139      defc(l_rkey1)           $ libraries, reads, writes
     140      defc(l_rkey2)           $ imports, exports
     141      defc(l_bin)             $ system defined binary operators
     142      defc(l_from)            $ from, etc.
     143      defc(l_un)              $ system defined unary operators
     144      defc(l_unbin)           $ system defined unary/binary operators
     145      defc(l_dots)            $ two or more dots
     146      defc(l_debug)           $ compiler debugging options
     147      defc(l_trace)           $ user debugging options
     148
     149      +*  l_min  =  l_name    **
     150      +*  l_max  =  l_trace   **
     151
     152$ standard symbols
     153$ ----------------
     154
     155$ the first few symbol table entries are used for standard literals
     156$ such as '.macro', '.endm', etc. these locations are identified by
     157$ a series of macros sym_xxx.
     158
     159 .=zzyorg z
     160
     161          defc(sym_macro)     $ keyword macro
     162          defc(sym_endm)      $ keyword endm
     163          defc(sym_drop)      $ keyword drop
     164          defc(sym_semi)      $ semicolon
     165          defc(sym_comma)     $ comma
     166          defc(sym_lp)        $ left parenthesis
     167          defc(sym_rp)        $ right parenthesis
     168          defc(sym_asn)       $ assignment operator
     169          defc(sym_then)      $ keyword then
     170          defc(sym_end)       $ keyword end
smfb  16          defc(sym_do)        $ keyword do
     171
     172      end nameset;
     173
     174
     175      nameset names;   $ names
     176                       $ -----
     177
     178      +* names_lim  =         $ dimension of names
     179 .+s10    1500
     180 .+s20    1500
     181 .+r32   65536
     182 .+s66    1024
     183          **
     184
     185          size names(ws);
     186          dims names(names_lim);
     187
     188          size namesp(ps);  $ pointer to last names entry
     189          data namesp = 0;
     190
     191$ the following fields access the slen and sorg fields of names
     192$ entries:
     193
     194          +*  n_slen(p)         =  slen names(p)                  **
     195          +*  n_sorg(p)         =  sorg names(p + (.sl.+1) / ws)  **
     196
     197      end nameset;
     198
     199
     200
     201      nameset polish;       $ declarations for polish string
     202                            $ ------------------------------
     203
     204          +*  polsz     =    16  **  $ size of polish
     205 .+r32    +*  polsz     =    32  **  $ size of pol must be 2+log2(name_
     206          +*  pol_lim   =  3000  **  $ dimension of polish string
     207 .+qp     +*  pol_lim   =    10  **  $ dimension of (small) polish array
     208          +*  xpol_lim  =  1000  **  $ dimension of auxiliary string
     209 .+qp     +*  xpol_lim  =    10  **  $ dimension of (small) xpolish arra
     210
     211          size polish(polsz);
     212          dims polish(pol_lim);
     213
     214          size polp(ps);   $ pointer to last entry
     215          data polp = 0;
     216
     217          +*  pol_typ(i)     =  .f. 01, 02, polish(i)  **
     218          +*  pol_val(i)     =  .f. 03, 14, polish(i)  **
     219 .+r32    +*  pol_val(i)     =  .f. 03, 30, polish(i)  **
     220
     221
     222$ auxiliary polish string
     223          size xpolish(polsz);
     224          dims xpolish(xpol_lim);
     225
     226          size xpolp(ps);   $ pointer to last entry
     227          data xpolp = 0;
     228
     229          +*  xpol_typ(i)     =  .f. 01, 02, xpolish(i)  **
     230          +*  xpol_val(i)     =  .f. 03, 14, xpolish(i)  **
     231 .+r32    +*  xpol_val(i)     =  .f. 03, 30, xpolish(i)  **
     232$ polish string types
     233
     234 .=zzyorg z
     235
     236          defc0(pol_name)     $ name
     237          defc0(pol_count)    $ counter
     238          defc0(pol_mark)     $ marker
     239          defc0(pol_end)      $ end-of-file marker
     240
     241          +*  pol_min  =  pol_name   ** $ minimum type
     242          +*  pol_max  =  pol_end    ** $ maximum type
     243
     244$ the macro definitions for the marker codes p_xxx are produced
     245$ by syn.
     246
     247          +*  synimpmap(a, b)  =  macdef(a = b)  **
     248
     249 .=include synmac
     250 .=include synimp
     251
     252
     253$ utilities to read and write polish
     254
     255          +*  putp(val, tp)  =  $ write node onto polish
     256 .-qp         if (polp = pol_lim) call puttab;
     257              polp = polp + 1;
     258 .-qp         pol_typ(polp) = tp;
     259 .-qp         pol_val(polp) = val;
     260              **
     261
     262
     263          +*  putxp(val, tp)  =  $ write node onto xpolish
     264 .-qp         if (xpolp = xpol_lim) call putxtab;
     265              xpolp = xpolp + 1;
     266 .-qp         xpol_typ(xpolp) = tp;
     267 .-qp         xpol_val(xpolp) = val;
     268              **
     269
     270
     271      end nameset polish;
     272
     273
     274      nameset macro;    $ declarations for macro processor
     275                        $ --------------------------------
     276
     277          +*  mtab_lim  =  3000  **  $ dimension of mtab
     278 .+qp     +*  mtab_lim  =   800  **  $ (small) dimension of mtab
     279
     280          size mtab(ps);
     281          dims mtab(mtab_lim);
     282
     283          size mtabp(ps);    $ pointer to last entry in mtab
     284          data mtabp = 0;
     285
smfb  17          +*  mstack_lim  =  1023  **  $ dimension of mstack
     287
     288          size mstack(ps);
     289          dims mstack(mstack_lim);
     290
     291          size mstackp(ps);   $ pointer to last mstack entry
     292          data mstackp = 0;
     293
smfb  18          +*  astack_lim  =  1023  **  $ dimension of astack
     295
     296          size astack(ps);
     297          dims astack(astack_lim);
     298
     299          size astackp(ps);   $ pointer to last entry
     300          data astackp = 0;
     301
     302          +*  param_lim =  26  **  $ maximum no. of formal parameters
     303
     304          size params(ps);  $ array of formal parameters
     305          dims params(param_lim);
     306
     307          size paramsp(ps);  $ pointer to last parameter
     308          data paramsp = 1;
     309
     310          size expand_buff(ps);  $ 1 word buffer used by 'expand'
     311          data expand_buff = 0;
     312
     313
     314    $ biases for macro text encoding
     315
     316          +*  bias_user  =  (symtab_lim + 1)  ** $ user arguments
     317          +*  bias_gen1  =  (bias_user + 100) ** $ first use of gen arg
     318          +*  bias_gen2  =  (bias_gen1 + 100) ** $ next use of gen arg
     319
     320      end nameset;
     321
     322
     323      nameset params;   $ control card parameters
     324
     325          size upd_flag(1);   $ update format source file
sunb  12          size lcp_flag(1);   $ listing control:  program parameters
sunb  13          size lcs_flag(1);   $ listing control:  program statistics
     326          size list_flag(1);  $ print source listing
     327          size at_flag(1);    $ produce titles automatically
     328          size mlen_lim(ps);  $ macro length limit
     329          size pel(ps);       $ parse error limit
     330          size etok_lim(ps);  $ no. of tokens printed after error msg
     331          size m_flag(1);     $ measurement flag
     332          size opt_flag(1);   $ global optimisation flag
     333          size tp_flag(1);    $ terminate after parsing
     334          size cset_str(.sds. 10);    $ source character set
     335          size seqf_str(.sds. 10);    $ sequence information format
     336          size seqf_flag(2);          $ sequence information code
     337
     338          $ options for parser debugging
     339          size et_flag(1);    $ dump tables after error
     340          size pd_flag(1);    $ dump polish
     341          size sd_flag(1);    $ dump symbol table
     342          size mt_flag(1);    $ trace macro processor
     343          size pt_flag(1);    $ trace parser
     344          size tt_flag(1);    $ trace tokens
     345
     346          size in_title(.sds. filenamlen);   $ input file
     347          size out_title(.sds. filenamlen);  $ output file
     348          size pol_title(.sds. filenamlen);  $ polish file
     349          size xpol_title(.sds. filenamlen); $ auxiliary polish file
     350          size ssm_title(.sds. filenamlen);  $ setl source map
     351          size term_title(.sds. filenamlen); $ terminal file/device
     352
     353      end nameset;
     354
     355      nameset misc;   $ miscelaneous
     356
     357          size error_count(ps);  $ number of errors
     358          data error_count = 0;
     359
     360          size line_no(ps);   $ current line number
     361          data line_no = 0;
     362
     363$ we keep three statement counters:
     364$
     365$ cstmt_count:   the cummulative statement count
     366$ ustmt_count:   the cummulative statement count at the start of the
     367$                current unit
     368$ estmt_count:   the cummulative statement count for the first code
     369$                producing instruction (currently not used in stlprs)
     370$
     371$ the current statement number equals cstmt_count - ustmt_count + 1.
     372
     373          +* stmt_count  =  (cstmt_count - ustmt_count + 1)  **
     374
     375          size cstmt_count(ps);   data cstmt_count = 0;
     376          size ustmt_count(ps);   data ustmt_count = 0;
     377          size estmt_count(ps);   data estmt_count = 0;
     378
     379          size eof_flag(1);   $ set when end of file seen
     380          data eof_flag = no;
     381
     382          size eor_flag(1);   $ set when new record is read
     383          data eor_flag = no;
     384
     385          size is_listed(1);  $ on if current line has been listed
     386          size is_written(1); $ current line has been written to src
     387          data is_listed = yes;   $ zero-th line has been listed
     388          data is_written = yes;  $ file - zero'th line has been written
     389
     390      end nameset;
     391
     392
     393$ the setl parser supports the following charcter sets, as selected
     394$ by the 'cset' control card parameter:
     395$
     396$ cset=por: the portable character set.  this set uses '<<' and '>>'
     397$           for set braces, '(/' and '/)' for tuple brackets, and
     398$           provides no single character alternates for any keyword.
     399$
     400$ cset=ext: the extended character set.  this set contains the por-
     401$           table character set as a proper subset, and specifies
     402$           an installation-dependend set of abbreviations for setl
     403$           characters and symbols, such as e.g. set braces.
     404$
     405
     406      size char_set(ps);      $ source character set
     407
     408$ character set options:
     409
     410 .=zzyorg z
     411
     412      defc(cset_por)          $ portable character set
     413      defc(cset_ext)          $ extended character set
asca   9 .+ascebc.
asca  10      defc(cset_asc)          $ for ascii text without conversion
asca  11 ..ascebc
     414
     415$ the string 'altchars' is used to store the one character
     416$ alternatives for setl symbols.  the table 'altchar_tab' maps
     417$ alternate symbols to the proper symbol table entry.
     418
     419      +*  altchar_max  =  10  **  $ maximum number of alternates
     420
     421      size altchars(.sds. altchar_max);
     422      data altchars = '' .pad. altchar_max;
     423
     424      size altchar_tab(ps);
     425      dims altchar_tab(altchar_max);
     426
     427
     428$ debugging codes
     429$ ---------------
     430
     431 .=zzyorg z
     432
     433      defc(dbg_ptrm0)         $ disable/enable macro processor trace
     434      defc(dbg_ptrm1)
     435      defc(dbg_ptrp0)         $ disable/enable parser trace
     436      defc(dbg_ptrp1)
     437      defc(dbg_ptrt0)         $ disable/enable token trace
     438      defc(dbg_ptrt1)
     439      defc(dbg_prsod)         $ open-token dump
     440      defc(dbg_prspd)         $ polish and xpolish string dumps
     441      defc(dbg_prssd)         $ symbol table dump
     442
     443      +*  dbg_min  =  dbg_ptrm0  **
     444      +*  dbg_max  =  dbg_prssd  **
     445
     446
     447      +*  prec_un    =  12  **  $ precedence for unary operators
     448      +*  prec_ubin  =  07  **  $ precedence for user defined binaries
     449
     450
     451$ file numbers
     452$ ------------
     453
     454 .=zzyorg z
     455
     456      defc(in_file)  $ input file
     457      defc(out_file)  $ output file
     458      defc(pol_file)  $ polish string file
     459      defc(xpol_file) $ auxiliary polish file
     460      defc(syn_file)  $ tables from syn
     461 .+sq1
     462$ the setl optimiser needs the setl binary interface to communicate with
     463$ other compiler phases.  thus we can check already here whether we will
     464$ run into problems later.  furthermore, the optimiser likes to  produce
     465$ an annotated source listing:  we generate here a map from  cummulative
     466$ statement numbers to tuples of source lines, in  setl  binary  format,
     467$ and write it to the file src.
     468
     469      defc(ssm_file)          $ setl source map
     470
     471 .=include binio
     472
     473      size putbhdrblk(ws);    $ binary header block
     474
     475      +* putbhdr(t, v)  =     $ write binary header block
     476          putbhdrblk         = 0;
     477          bh_typ_ putbhdrblk = t;
     478          bh_val_ putbhdrblk = v;
     479          write ssm_file, putbhdrblk;
     480          **
     481
     482      +* putbdat(v)  =        $ write one word binary data block
     483          write ssm_file, v;
     484          **
     485 ..sq1
     486
     487
     488        nameset nsptab;  $ nameset with parse table.
     489            size pt(ws);  $ interpretable code
     490            dims pt(parsearamax);
     491        end nameset nsptab;
     492
     493
     494      call prsini;
     495
     496      call parse;
     497
     498      call prstrm(no);
     499
     500 .-qp.
     501 .+s66    end subr start;
     502 .+r32    end prog stlprs;
suna  14 .+r36    end prog stlprs;
suna  15 .+qp.
suna  16          end prog stlprs;
     505 ..qp
       1 .=member prsini
       2      subr prsini;
       3
       4$ this is the main initialization routine. we begin by reading
       5$ the control card parameters and opening various files, then
       6$ call various separate routines to initialize symtab.
       7
       8
       9      size termh_flag(1);     $ print phase header on the terminal
      10
      11      size rc(ws);           $ return code from namesio
      12
      13
      14
      15      size hash(ps);          $ hashing function
      16      size timestr(.sds. 30); $ current time
      17
      18
      19      in_title = '';
      20      call namesio(in_file, rc, in_title, filenamlen); $ input file
      21      if (rc > 1) in_title = '';
      22
      23      out_title = '';
      24      call namesio(out_file, rc, out_title, filenamlen); $ output
      25      if (rc > 1) out_title = '';
      26
      27      term_title = '';
      28      call namesio(max_no_files, rc, term_title, filenamlen); $ terminal
      29      if (rc > 1) term_title = '';         $ has max file number
      30
      31
      32 .+s10.
      33      call getspp(pol_title,  'pol=pol/pol');     $ polish file
      34      call getspp(xpol_title, 'xpol=xpol/xpol');  $ aux. string
      35 ..s10
      36 .+s20.
      37      call getspp(pol_title,  'pol=pol/pol');     $ polish file
      38      call getspp(xpol_title, 'xpol=xpol/xpol');  $ aux. string
      39 ..s20
      40
      41 .+s32.
      42      call getspp(pol_title,  'pol=pol.tmp/');    $ polish file
      43      call getspp(xpol_title, 'xpol=xpol.tmp/');  $ aux. string
      44      call getspp(ssm_title,  'ssm=/ssm.tmp');    $ setl source map
      45 ..s32
      46
      47 .+s37cms.
      48      call getspp(pol_title,  'pol=pol/pol');     $ polish file
      49      call getspp(xpol_title, 'xpol=xpol/xpol');  $ aux. string
      50      call getspp(ssm_title,  'ssm=/ssm');        $ setl source map
      51 ..s37cms
      52 .+s37mts.
      53      call getspp(pol_title,  'pol=-setlpol/');   $ polish file
      54      call getspp(xpol_title, 'xpol=-setlxpol/'); $ aux. polish file
      55      call getspp(ssm_title,  'ssm=/-setlssm');   $ setl source map
      56 ..s37mts
      57 .+s47.
      58      call getspp(pol_title,  'pol=pol/pol');     $ polish file
      59      call getspp(xpol_title, 'xpol=xpol/xpol');  $ aux. string
      60      call getspp(ssm_title,  'ssm=/ssm');        $ setl source map
      61 ..s47
      62
      63 .+s66.
      64      call getspp(pol_title,  'pol=pol/pol');     $ polish file
      65      call getspp(xpol_title, 'xpol=xpol/xpol');  $ aux. string
      66 ..s66
suna  17
suna  18 .+s68.
suna  19      call getspp(pol_title,  'pol=setl.pol/');   $ polish file
suna  20      call getspp(xpol_title, 'xpol=setl.xpol/'); $ aux. polish file
suna  21      call getspp(ssm_title,  'ssm=/setl.ssm');   $ setl source map
suna  22 ..s68
      67
      68
      69      call getipp(list_flag,  'list=0/1');  $ listing control
      70      call getipp(etok_lim,   'etoks=5/5'); $ #toks printed after error
      71      call getipp(pel,        'pel=1000/1000');
      72      call getipp(mlen_lim,   'mlen=1000/1000'); $ macro length
      73      call getipp(tp_flag,    'tp=0/0');    $ terminate after parsing
      74      call getipp(pt_flag,    'pt=0/1');    $ parser trace
      75      call getipp(mt_flag,    'mt=0/1');    $ macro processor trace
      76      call getipp(et_flag,    'et=0/1');    $ error trace
      77      call getipp(tt_flag,    'tt=0/1');    $ token trace
      78      call getipp(sd_flag,    'sd=0/1');    $ symtab dump
      79      call getipp(pd_flag,    'pd=0/1');    $ polish dump
      80      call getipp(at_flag,    'at=0/1');    $ auto title
      81      call getipp(m_flag,     'meas=0/1');  $ measurements
      82      call getipp(opt_flag,   'opt=0/1');   $ global optimisation
      83      call getipp(upd_flag,   'upd=0/1');   $ upd format input
      84      call getspp(cset_str,   'cset=ext/por'); $ source character set
      85      call getspp(seqf_str,   'seq=ls/');   $ sequencing
      86      call getipp(termh_flag, 'termh=0/1'); $ print phase header
sunb  14
sunb  15
sunb  16 .-s68.
sunb  17 .-s47.
sunb  18 .-s32u.
sunb  19      call getipp(lcp_flag,   'lcp=1/1');   $ list program parameters
sunb  20      call getipp(lcs_flag,   'lcs=1/1');   $ list program statistics
sunb  21 .+s32u.
sunb  22      call getipp(lcp_flag,   'lcp=0/1');   $ list program parameters
sunb  23      call getipp(lcs_flag,   'lcs=0/1');   $ list program statistics
sunb  24 ..s32u
sunb  25 .+s47.
sunb  26      call getipp(lcp_flag,   'lcp=0/1');   $ list program parameters
sunb  27      call getipp(lcs_flag,   'lcs=0/1');   $ list program statistics
sunb  28 ..s47
sunb  29 .+s68.
sunb  30      call getipp(lcp_flag,   'lcp=0/1');   $ list program parameters
sunb  31      call getipp(lcs_flag,   'lcs=0/1');   $ list program statistics
sunb  32 ..s68
sunb  33
      87
      88      if mt_flag then $ trace macro processor
      89          monitor entry, limit = 10000;
      90      else
      91          monitor noentry;
      92      end if;
      93
      94      seqf_flag = 0;
      95      if ( 'l' .in. seqf_str) seqf_flag = 1;
      96      if ( 's' .in. seqf_str) seqf_flag = seqf_flag + 2;
      97
      98      call opninc('', 0, ' .copy ', upd_flag);  $ ltl inclusion facility
      99
     100 .+qp pol_title = '0';   xpol_title = '0'; $ set to null for quick parse
     101
     102      file pol_file    access = write,   title = pol_title;
     103      file xpol_file   access = write,   title = xpol_title;
     104 .+sq1.
     105      if slen ssm_title then
     106          file ssm_file  access = write,  title = ssm_title;
     107      end if;
     108 .-sq1.
     109      if opt_flag then
     110          put ,skip;            $ emit blank line
     111          call contlpr(27, yes);$ echo to the terminal
     112          put ,'*** setl optimiser not available ***' ,skip;
     113          call contlpr(27, no); $ stop to echo to the terminal
     114          call ltlfin(1, 0);    $ abnormally terminate
     115      end if;
     116 ..sq1
     117
     118 .+s66.
     119      rewind pol_file;   rewind xpol_file;
     120 ..s66
     121
     122      if  (.len. term_title) call opnterm(term_title);
     123
     124      if     cset_str .seq. 'por' then char_set = cset_por;
     125      elseif cset_str .seq. 'ext' then char_set = cset_ext;
asca  12 .+ascebc.
asca  13      elseif cset_str .seq. 'asc' then char_set = cset_asc;
asca  14 ..ascebc
     126      else                             char_set = cset_ext;
     127      end if;
     128
     129      $ initialize listing control
     130      call contlpr( 6, yes);  $ start paging
     131      call contlpr( 7, yes);  $ enable titling
     132      call lstime(timestr);   $ get current time
     133      call etitlr(0, 'cims.setl.' .cc. prog_level,  1, 0);
     134      call etitlr(0,                      timestr, 41, 0);
     135      call etitlr(0,                       'page', 71, 0);
     136      call contlpr( 8,  76);  $ set page number in column 76
     137      call contlpr(13,   0);  $ set number of current page
sunb  34      call contlpr(10,  rc);  $ get lines per page
sunb  35      call contlpr(15,  rc);  $ set line number within page
     138
     139
sunb  36    if lcp_flag then  $ print initial phase heading
sunb  37      put ,'parameters for this compilation: ' ,skip
     142          ,skip ,'source file: i = '              :in_title   ,a ,'. '
     143          ,skip ,'listing file: l = '             :out_title  ,a ,'. '
     144          ,skip ,'polish string file: pol = '     :pol_title  ,a ,'. '
     145          ,skip ,'auxiliary string file: xpol = ' :xpol_title ,a ,'. '
     146          ,skip ,'setl source map: ssm = '        :ssm_title  ,a ,'. '
     147          ,skip ,'list directives: list = '       :list_flag  ,i ,', '
     148                ,'at = '                          :at_flag    ,i ,'. '
     149                ,'seq = '                         :seqf_str   ,a ,'. '
     150          ,skip ,'character set: cset = '         :cset_str   ,a ,'. '
     151          ,skip ,'parse error limit: pel = '      :pel        ,i ,'. '
     152                ,'parse error file: term = '      :term_title ,a ,'. '
     153          ,skip ,'global optimisation: opt= '     :opt_flag   ,i ,'. '
     154                ,'measurements: meas = '          :m_flag     ,i ,'. '
     155          ,skip
     156          ,skip;
sunb  38    end if;
     157
     158
     159      if termh_flag then
     160          $ the following line is printed on the terminal file only
     161          call contlpr(26, no);   call contlpr(27, yes);
     162          put ,'  start cims.setl.' ,prog_level :timestr ,a ,skip;
     163          call contlpr(26, yes);  call contlpr(27, no);
     164      end if;
     165
     166
     167$ next we call three routines to initialize reserved names, keywords,
     168$ and literals. the order of these calls is important, since we assume
     169$ that some symbols will always have the same position in symtab.
     170
     171      call initstd;
     172      call initkey;
     173      call initlit;
     174
     175$ initialize trulex
     176      call initlex;
     177
     178      call initparse;  $ initialize parse
     179
     180
     181      end subr prsini;
       1 .=member initstd
       2      subr initstd;
       3
       4$ this routine initializes standard symbols used by the macro
smfb  19$ processor, such as 'macro'.  these must be the first entries
       6$ in symtab.
       7
       8
       9      size j(ps);             $ loop index
      10      size sym(ps);           $ symbol table pointer
      11
      12      size syms(.sds. 5);     $ symbols to be hashed into symtab
smfb  20      dims syms(11);
      14      data syms = 'macro', 'endm', 'drop',
      15                  ';', ',', '(', ')', ':=',
smfb  21                  'then', 'end', 'do';
      17
      18      size hashlit(ps);       $ hashes symbol into symbol table
      19
      20
smfb  22      do j = 1 to 11;
      22          sym = hashlit(syms(j));
      23      end do;
      24
      25
      26      end subr initstd;
       1 .=member initkey
       2      subr initkey;
       3
       4$ this routine initializes all the special lexical classes used
       5$ by the parser. it does this through a series of calls to the
       6$ lower level routine 'setkey'. each call to setkey processes
       7$ a dozen or so literals.
       8
       9$ setkey has three arguments, which are passed globally:
      10
      11$ the actual initialization is done by 'setkey', which has
      12$ two global inputs:
      13
      14$ key_lext:    gives the lex_typ of the tokens being initialized.
      15$ key_string:  a character string which drives the initiliation.
      16
      17$ key_string is a character string consisting of alternate integers
      18$ and keywords, separated by blanks. each integer indicates the
      19$ key_val of the keyword which follows it. a plus sign where an
      20$ integer is expected indicates that we should use the previous
      21$ integer, incremented by 1.
      22
      23
      24      nameset nsinitkey;
      25
      26          size key_lext(ps),
      27               key_string(.sds. 80);
      28
      29      end nameset nsinitkey;
      30
      31$ we use the following macro to call setkey
      32      +*  set(s) =
      33          key_string = s;
      34          call setkey;
      35          **
      36
      37
      38      $ declaration keywords
      39      key_lext = l_dkey;
      40      set('1 const + var + init ')
      41
      42      $ keywords for system defined modes
      43      key_lext = l_modekey;
      44      set('1 general + integer + real + string + boolean + atom ')
      45      set('7 error + elmt + tuple + set + smap + map + mmap ')
      46
      47      $ keywords for base types
      48      key_lext = l_btkey;
      49      set('1 local + remote + sparse + packed + untyped ')
      50
      51      $ keywords for statements
      52      key_lext = l_stkey;
      53      set('1 assert + case + continue + debug + exit ')
      54      set('6 fail + goto + if + loop + notrace + quit ')
      55      set('12 return + stop + succeed + trace + yield ')
      56
      57      $ keywords for read/write options
      58      key_lext = l_rwkey;
      59      set('1 rd + wr + rw ')
      60
      61      $ keywords for 'rights' list
      62      key_lext = l_rkey1;
      63      set('1 libraries + reads + writes ')
      64
      65      key_lext = l_rkey2;
      66      set('1 imports + exports ')
      67
      68      $ system defined binary operators
      69      key_lext = l_bin;
      70      set('1 impl 2 or 3 and 5 in 5 notin 5 incs 5 subset ')
      71      set('5 < 5 <= 5 > 5 >= 5 = 5 /= ')
      72      set('7 with 7 less 7 lessf 7 npow 7 min 7 max ')
      73      set('10 * 10 / 10 div 10 mod 10 ? 10 atan2 11 ** ')
      74      $
      75      key_lext = l_from;
      76      set('7 from 7 fromb 7 frome ');
      77
      78      $ system defined unary operators
      79      key_lext = l_un;
      80      set('4 not 6 even 6 odd 6 is_integer 6 is_real 6 is_string ')
      81      set('6 is_boolean 6 is_atom 6 is_tuple ')
      82      set('6 is_set 6 is_map 12 arb 12 domain 12 range 12 pow 12 # ')
      83      set('12 abs 12 char 12 ceil 12 floor 12 fix 12 float ')
      84      set('12 sin 12 cos 12 tan 12 asin 12 acos 12 atan 12 tanh ')
      85      set('12 exp 12 log 12 sqrt 12 random 12 sign 12 type ')
      86      set('12 str 12 val ')
      87
      88      $ system defined operators which are both unary and binary.
      89      $ their precedence is the precedence of the binary operator.
      90      key_lext = l_unbin;
      91      set('9 + 9 - ')
      92
      93      $ compiler debugging options
      94      key_lext = l_debug;
      95      set('1 ptrm0 + ptrm1 + ptrp0 + ptrp1 + ptrt0 + ptrt1 ')
      96      set('7 prsod + prspd + prssd ')
      97      set('10 stre0 + stre1 + strs0 + strs1 + sq1cd + sq1sd + scstd ')
      98      set('17 cq1cd + cq1sd + cq2cd ')
      99      set('20 rtre0 + rtre1 + rtrc0 + rtrc1 ')
     100      set('24 rtrg0 + rtrg1 + rgcd0 + rgcd1 + rdump + rgarb ')
     101
     102      $ user trace options
     103      key_lext = l_trace;
     104      set('1 calls + statements ')
     105
     106      macdrop(set);
     107
     108
     109      end subr initkey;
       1 .=member setkey
       2      subr setkey;
       3
       4$ this routine initializes a single group of keywords. its
       5$ arguments are passed globally as described in 'initkey'.
       6
       7$ the principle argument is a character string 'key_string'.
       8$ key_string consists of a series of tokens separated by a
       9$ single blank. numbering tokens from right to left, each odd
      10$ token is either:
      11
      12$ 1. an integer giving the key_val of the keyword which
      13$    immeditely follows it.
      14
      15$ 2. a plus sign, indicating that the new key_val is one more
      16$    than that of the previous keyword.
      17
      18$ the even numbered tokens are the keywords themselves.
      19
      20$ we loop along the string, finding each pair of tokens,
      21$ and setting the key_val of each keyword.
      22
      23
      24      size len(ps),  $ length of key_string
      25           val(ws), $ value of integer
      26           first(ps),  $ points to first character of token
      27           last(ps),   $ points to blank after token
      28           ch(cs),     $ current character
      29           keyw(.sds. 20),  $ current keyword
      30           sym(ps);         $ symbol table pointer to keyword
      31
      32      size hashlit(ps);  $ hashing function
      33
      34      access nsinitkey;
      35
      36
      37      first = 1;  $ start of first token
      38      val = 0;    $ initial keyword value
      39
      40      len = .len. key_string;
      41
      42      while first < len;
      43
      44          ch = .ch. first, key_string;
      45
      46          if ch = 1r+ then
      47              val = val + 1;
      48              first = first + 1;  $ point to blank
      49
      50          else $ convert integer
      51              val = 0;
      52
      53              while ch ^= 1r ;
      54                  val = 10 * val + digofchar(ch);
      55                  first = first + 1;
      56                  ch = .ch. first, key_string;
      57              end while;
      58          end if;
      59
      60$ first now points to the blank. find the keyword.
      61
      62          first = first + 1;  $ point to start of keyword
      63          last  = first + 1;  $ point to next character
      64
      65$ look for next blank
      66          while .ch. last, key_string ^= 1r ;
      67              last = last + 1;
      68          end while;
      69
      70$ 'last' now points to the blank after the keyword
      71
      72          keyw = .s. first, last-first, key_string;
      73          sym = hashlit(keyw);
      74
      75          lex_typ(sym) = key_lext;
      76          key_val(sym) = val;
      77
      78          first = last + 1;  $ start of next token.
      79      end while;
      80
      81
      82      end subr setkey;
       1 .=member initlit
       2      subr initlit;
       3
       4$ this routine initiailizes the literals contained in the grammar.
       5$ we use the results of 'syn' to create two arrays, which in turn
       6$ drive the initialization routine
       7
       8$ 1. lit_string    this is an array containing the actual literals
       9
      10$ 2. lit_code      this is an array containing the corresponding
      11$                  literal code lit_xxx.
      12
      13$ we iterate over the two arrays, hashing in each string and setting the
      14$ lit_val of the corresponding entry.
      15
      16$ note that there are more entries in lit_string than literals
      17$ in the grammar. this is because certain terminal symbols of the
      18$ grammar can be represented by one of several tokens.
      19
      20
      21 .=include synmax           $ this includes the macro parselitmax
      22
      23      size lit_string(.sds. 30);
      24      dims lit_string(parselitmax);
      25
      26      size lit_code(ps);
      27      dims lit_code(parselitmax);
      28
      29      size hashlit(ps);   $ hashing function
      30
      31 .=zzyorg a   $ counter for initilizing lit_string and lit_code
      32
      33      +*  synlitmap(string, code) =
      34          data lit_string(zzya) = string;
      35          data lit_code(zzya) = code;
      36          **
      37
      38 .=include 'synlit'  $ include synlitmap
      39
      40
      41      size j(ps),   $ loop index
      42           lit(.sds. 30),  $ current literal
      43           p(ps);          $ symbol table pointer to literal
      44
      45
      46      do j = 1 to parselitmax;
      47          lit = lit_string(j);
      48
      49 .+s66    if (lit = 2q::) lit = 1q: .cc. 1q:;
      50
      51          p = hashlit(lit);
      52
      53          lit_val(p) = lit_code(j);
      54      end do;
      55
      56
      57      end subr initlit;
       1 .=member inparse
       2      subr initparse;
       3
       4$ this routine defines the data structures for the parser. the
       5$ parser can be thought of as an interpreter for an abstract
       6$ parsing machine. the interpertable code is contained in the
       7$ table 'pt', and has the following fields:
       8
       9$ parse_op         an opcode po_xxx
      10$ parse_param      a single integer parameter
      11$ parse_parm1      a subfield of parse_param
      12$ parse_parm2      another subfield of parse_param
      13
      14$ the parse machine has two registers, represented by the array
      15$ parsereg. the first two registers have special purposes:
      16
      17$ reg. 1    indicates success of last parser operation
      18$ reg. 2    contains key_val of last keyword recognized.
      19
      20$ the variable 'parsep' points to the current instruction in 'pt'.
      21
      22$ the parse machine contains several instructions which are
      23$ essentially procedure calls. we handle procedure linkage
      24$ through a stack called 'rstack' (r-eturn stack). its fields are
      25
      26$ r_is_sev     $ indicates call from po_sev instruction
      27$ r_clause     $ starting address of desired clause
      28$ r_return     $ return address
      29$ r_number     $ number of clauses seen during  operation
      30
      31
      32$ 'parse' consists of an infinite loop with a case statement
      33$ which branches on the current opcode.
      34
      35
      36      nameset parsens;
      37
      38           +*  pt_op(i)      =  .f. 01, 04, pt(i)  **
      39           +*  pt_parm(i)    =  .f. 05, 12, pt(i)  **
      40           +*  pt_parm1(i)   =  .f. 05, 03, pt(i)  **
      41           +*  pt_parm2(i)   =  .f. 08, 09, pt(i)  **
      42
      43           size parsep(ps);  $ location counter
      44           data parsep = 1;
      45
      46           size parsereg(ps); $ registers
      47           dims parsereg(8);
      48           data parsereg = 0(8);
      49
      50           +*  parse_ok  = parsereg(1)  **  $ condition code
      51           +*  parse_tok = parsereg(2)  **  $ last token
      52
      53      +* rstack_sz  =         $ size of recursion stack
      54 .+s10    36
      55 .+s20    36
smfb  23 .+r32    64
      59 .+s66    60
      60          **
      61
      62          +*  rstack_lim  = 100  **  $ dimension of rstack
      63
      64      size rstack(rstack_sz); $ recursion stack
      65           dims rstack(rstack_lim);
      66
      67           size rstackp(ps);  $ pointer to top of rstack
      68           data rstackp = 0;
      69
      70 .+s10.
      71           +*  r_is_sev(i)    =  .f. 01, 01, rstack(i)  **
      72           +*  r_clause(i)    =  .f. 02, 11, rstack(i)  **
      73           +*  r_return(i)    =  .f. 13, 11, rstack(i)  **
      74           +*  r_number(i)    =  .f. 25, 08, rstack(i)  **
      75 ..s10
      76
      77 .+s20.
      78           +*  r_is_sev(i)    =  .f. 01, 01, rstack(i)  **
      79           +*  r_clause(i)    =  .f. 02, 11, rstack(i)  **
      80           +*  r_return(i)    =  .f. 13, 11, rstack(i)  **
      81           +*  r_number(i)    =  .f. 25, 08, rstack(i)  **
      82 ..s20
      83
      84
smfb  24 .+r32.
      86           +*  r_is_sev(i)    =  .f. 01, 01, rstack(i)  **
      87           +*  r_clause(i)    =  .f. 17, 16, rstack(i)  **
      88           +*  r_return(i)    =  .f. 33, 16, rstack(i)  **
      89           +*  r_number(i)    =  .f. 49, 16, rstack(i)  **
smfb  25 ..r32
     104
     105 .+s66.
     106           +*  r_is_sev(i)    =  .f. 01, 01, rstack(i)  **
     107           +*  r_clause(i)    =  .f. 02, 11, rstack(i)  **
     108           +*  r_return(i)    =  .f. 13, 11, rstack(i)  **
     109           +*  r_number(i)    =  .f. 25, 08, rstack(i)  **
     110 ..s66
     111
     112
     113      +* r_num_lim  =         $ limit for r_number
     114 .+s10    255
     115 .+s20    255
smfb  26 .+r32    65535
     119 .+s66    255
     120          **
     121
     122
     123$ the parser has the following opcodes:
     124
     125 .=zzyorg z
     126
     127            defc(po_act)   $ call separate routine
     128            defc(po_bak)   $ return from interpreter call
     129            defc(po_err)   $ emit error message
     130            defc(po_jif)   $ branch on parse_ok = no.
     131            defc(po_jmp)   $ jump
     132            defc(po_lex)   $ find lexical type
     133            defc(po_lit)   $ find literal
     134            defc(po_set)   $ set register
     135            defc(po_sev)   $ call to find several clauses
     136            defc(po_sub)   $ call to find 1 clause
     137            defc(po_skip)   $ branch on keyword
     138            defc(po_save)  $ save parser status
     139            defc(po_nosave) $ pop saved status
     140            defc(po_restore) $ restore saved status
     141            defc(po_m)       $ emit marker
     142
     143            +*  po_min   =  po_act  **
     144            +*  po_max   =  po_m    **
     145
     146      end nameset;
     147
     148
     149      nameset backup;
smfb  27
     150$ this nameset contains the variables which implement backtracking
     151$ in the parser. these include:
     152
     153$ 1. sstack:
     154
     155$    this is a stack for saving parser states. its fields are
     156
     157$    s_polp:     saves pointer to polish string
     158$    s_xpolp:    saves pointer to auxiliary string
     159$    s_tout:     saves pointer to token buffer
     160$    s_stat:     saves statement counter
     161
     162$ 2. tok_buff:
     163
     164$    this is a circular buffer used to save tokens. its related
     165$    variables are:
     166
     167$    tok_in:    points to last token added to buffer
     168$    tok_out:   points to last token returned to parser
     169
     170$ tok_buff is manipulated by the scanner-interface routine 'gettok'.
     171
     172          +*  sstack_sz   =          $ size of string
     177 .+s10        72
     178 .+s20        72
smfb  28 .+r32        128
smfb  29 .+s66        60
     179              **
     180
     181          +*  sstack_lim  =  50  **  $ number of saved states
     182
     183          size sstack(sstack_sz);
     184          dims sstack(sstack_lim);
     185
     186          size sstackp(ps);
     187          data sstackp = 0;
     188
     189 .+s66.
     191          +*  s_polp(i)   =  .f. 01, 15, sstack(i)  **
     192          +*  s_xpolp(i)  =  .f. 16, 15, sstack(i)  **
     193          +*  s_tout(i)   =  .f. 31, 15, sstack(i)  **
     194          +*  s_stat(i)   =  .f. 46, 15, sstack(i)  **
     195 ..s66
     204
smfb  30 .+r32.
     207          +*  s_polp(i)   =  .f. 01, 32, sstack(i)  **
     208          +*  s_xpolp(i)  =  .f. 33, 32, sstack(i)  **
     209          +*  s_tout(i)   =  .f. 65, 32, sstack(i)  **
     210          +*  s_stat(i)   =  .f. 97, 32, sstack(i)  **
smfb  31 ..r32
     221
     222 .+s10.
     224          +*  s_polp(i)   =  .f. 01, 18, sstack(i)  **
     225          +*  s_xpolp(i)  =  .f. 19, 18, sstack(i)  **
     226          +*  s_tout(i)   =  .f. 37, 18, sstack(i)  **
     227          +*  s_stat(i)   =  .f. 55, 18, sstack(i)  **
     228 ..s10
     229
     230 .+s20.
     232          +*  s_polp(i)   =  .f. 01, 18, sstack(i)  **
     233          +*  s_xpolp(i)  =  .f. 19, 18, sstack(i)  **
     234          +*  s_tout(i)   =  .f. 37, 18, sstack(i)  **
     235          +*  s_stat(i)   =  .f. 55, 18, sstack(i)  **
     236 ..s20
     237
     238
     239
     240          +*  tok_lim  =  500  **  $ maximum saved tokens
     241
     242          size tok_buff(ps);
     243          dims tok_buff(tok_lim);
     244
     245          size tok_in(ps);
     246          data tok_in = 0;
     247
     248          size tok_out(ps);
     249          data tok_out = 0;
     250      end nameset;
     251
     252      size j(ps);   $ loop index for initialization
     253
     254      call ptinit;  $ initialize parse table.
     255
     256$ initialize parsep and parse_ok
     257      parsep = 1;
     258      parse_ok = yes;
     259
     260
     261      end subr initparse;
       1 .=member ptinit
       2
       3
       4 .-defenv_ptinit.
       5
       6
       7      subr ptinit;
       8
       9$ initialize parse table by reading it in.
      10
      11$ the default version of ptinit reads in the parse table.
      12$ the parse table for setl is too large to permit compiling
      13$ the data statements that would be required to initialize
      14$ the table due to limitations of the little compiler.
      15$ however, some systems may be able to use the option of the
      16$ syn program by which the table is represented in form that
      17$ permits initialization by assembly language data statements,
      18$ and such systems can use defenv_ptinit option to avoid need
      19$ to read in parse table.
      20
      21
      22      size  syn_title(.sds.30);  $ title for 'syn' file.
      23
      24 .+s66   call getspp(syn_title,   'syn=synbin/synout');
      25 .+s37   call getspp(syn_title,   'syn=synbin/synout');
      26 .+s47   call getspp(syn_title,   'syn=synbin/synout');
      27 .+s32   call getspp(syn_title,   'syn=synbin.dat/synout.dat');
      28 .+s10   call getspp(syn_title,   'syn=synbin/synout');
      29 .+s20   call getspp(syn_title,   'syn=synbin/synout');
suna  23 .+s68   call getspp(syn_title,   'syn=synbin/');
      30
      31      file syn_file
      32          access = read,
      33          title = syn_title;
      34
      35      rewind syn_file;
      36
      37      read syn_file, pt(1) to pt(parsearamax);
      38
      39      file syn_file access=release;
      40
      41
      42      end subr ptinit;
      43
      44
      45 ..defenv_ptinit
      46
      47
       1 .=member parse
       2      subr parse;
       3
       4$ this is the actual parser. its operation is described in
       5$ initparse, above.
       6
       7      access parsens, backup;
       8
       9      size op(ps),    $ current opcode
      10           parm(ps), $ current parameter
      11           i(ps),  $ register number
      12           j(ps),  $ register value
      13           tok(ps),   $ token obtained from scanner
      14           junk(ps);  $ disgarded type code from polish
      15      size tp(ps);            $ type of current token
      16
      17      size gettok(ps); $ lexical scanner
      18
      19
      20      while 1;   $ main loop
      21
      22          op = pt_op(parsep);
      23          parm = pt_parm(parsep);
      24
      25          if pt_flag then  $ tracing parser
      26              put, skip, column(7), 'parsep: ': parsep, i,
      27                   column(20), 'op: ': op, i,
      28                   column(30), 'parm: ': parm, i;
      29          end if;
      30
      31          go to case(op) in po_min to po_max;
      32
      33      /case(po_lit)/    $ test for matching literal
      34
      35$ literals are identified by
      36
      37$ lit_val:     should be equal to 'parm'.
      38
      39$ a 'find literal' instruction is always the first of three
      40$ instructions
      41
      42$ 1. find literal
      43$ 2. branch on failure
      44$ 3. next action on success
      45
      46$ if we are successful, we increment parsep to point to (3).
      47$ otherwise we increment it to point to (2).
      48
      49          parse_tok = gettok(0);
      50
      51          if lit_val(parse_tok) = parm & ^ eof_flag then
      52              parse_ok = yes;
      53              parsep = parsep + 2;
      54          else
      55              parse_ok = no;
      56
      57              tok_out = tok_out - 1;  $ return token to buffer
      58              if (tok_out = 0) tok_out = tok_lim;
      59
      60              parsep = parsep + 1;
      61          end if;
      62
      63          cont;
      64
      65
      66      /case(po_lex)/    $ find desired lexical type
      67
      68$ a tokens lexical type is given by its 'lex_typ' field.
      69$ if the token has the proper type, we write it onto the polish
      70$ string and advance parsep by 2. otherwise we return it and
      71$ advance parsep by 1.
      72
      73          parse_tok = gettok(0);   tp = lex_typ(parse_tok);
      74
      75          if ^ eof_flag & parm = tp then
      76              parse_ok = yes;
      77
      78              putp(name(parse_tok), pol_name);
      79              parsep = parsep + 2;
      80
      81          else
      82              parse_ok = no;
      83
      84              tok_out = tok_out-1;  $ return token
      85              if (tok_out = 0) tok_out = tok_lim;
      86
      87              parsep = parsep + 1;
      88          end if;
      89
      90          cont;
      91
      92      /case(po_sev)/     $ 'procedure calls' to find clauses
      93
      94      /case(po_sub)/
      95
      96$ po_sev corresponds to the construct  in the grammar,
      97$ and seeks several instances of a clause. po_sub corresponds
      98$ to  and seeks a single instance.
      99
     100          countup(rstackp, rstack_lim, 'rstack');
     101
     102          r_is_sev(rstackp) = (op = po_sev);
     103          r_clause(rstackp) = parm;
     104          r_return(rstackp) = parsep + 1;
     105          r_number(rstackp) = 0;
     106
     107          parsep = parm;
     108          cont;
     109
     110
     111      /case(po_bak)/    $ return from  or 
     112
     113$ the action we take here depends on whether we are looking
     114$ for one clause or several.
     115
     116$ if we are looking for one clause, we return unconditionally. if
     117$ we are looking for several clauses, we take one of two actions:
     118
     119$ 1. if we have been successful so far, look for another clause.
     120$ 2. otherwise write the number of clauses found onto the
     121$    polish string and return.
     122
     123$ if rstack is empty, we have parsed a sentence symbol.  if the
     124$ input file is exhausted, we return;  otherwise -chkprs- has done
     125$ the necessary recovery, and we continue to parse.
     126
     127          if rstackp = 0 then
     128              call chkprs;
     129              if (eof_flag) return;
     130
     131              cont;
     132
     133
     134          elseif r_is_sev(rstackp) then   $ return from 
     135
     136              if parse_ok then  $ look for another
     137                  countup(r_number(rstackp), r_num_lim, '');
     138                  parsep = r_clause(rstackp);
     139
     140              else  $ write marker and return success
     141                  putp(r_number(rstackp), pol_count);
     142                  parse_ok = yes;
     143                  parsep = r_return(rstackp);
     144                  rstackp = rstackp-1;
     145              end if;
     146
     147          else   $ return from 
     148              parsep = r_return(rstackp);
     149              if (parse_ok) parsep = parsep + 1;
     150              rstackp = rstackp-1;
     151          end if;
     152
     153          cont;
     154
     155
     156      /case(po_err)/   $ issue error message
     157
     158$ if the condition code is true we merely advance. otherwise we
     159$ call 'ermet', which prints the error message and resets the
     160$ parser.
     161
     162          if parse_ok then
     163              parsep = parsep + 1;
     164          else
     165              call ermet(parm);
     166          end if;
     167
     168          cont;
     169
     170
     171      /case(po_jif)/     $ conditional jump  -lab
     172
     173          if parse_ok then
     174              parsep = parsep + 1;
     175          else
     176              parsep = parm;
     177          end if;
     178
     179          cont;
     180
     181
     182      /case(po_jmp)/   $ go to    +lab
     183
     184          parsep = parm;
     185          cont;
     186
     187
     188      /case(po_set)/    $ set register i to j
     189
     190          i = pt_parm1(parsep) + 1;
     191          j = pt_parm2(parsep);
     192
     193          parsereg(i) = j;
     194
     195          parsep = parsep + 1;
     196          cont;
     197
     198
     199      /case(po_skip)/    $ branch on keyword
     200
     201          parsep = parsep + key_val(parse_tok);
     202          cont;
     203
     204
     205      /case(po_save)/     $ save parser status
     206
     207          countup(sstackp, sstack_lim, 'save');
     208
     209          s_polp(sstackp)  = polp;
     210          s_xpolp(sstackp) = xpolp;
     211          s_tout(sstackp)  = tok_out;
     212          s_stat(sstackp)  = cstmt_count;
     213
     214          parsep = parsep + 1;
     215          cont;
     216
     217
     218      /case(po_nosave)/    $ disgard saved status
     219
     220          sstackp = sstackp-1;
     221          parsep = parsep + 1;
     222          cont;
     223
     224
     225      /case(po_restore)/    $ restore parser to saved status
     226
     227          polp        = s_polp(sstackp);
     228          xpolp       = s_xpolp(sstackp);
     229          tok_out     = s_tout(sstackp);
     230          cstmt_count = s_stat(sstackp);
     231
     232          sstackp = sstackp-1;
     233
     234          parsep = parsep + 1;
     235          cont;
     236
     237
     238      /case(po_m)/    $ write marker
     239
     240          putp(parm, pol_mark);
     241          parsep = parsep + 1;
     242          cont;
     243
     244
     245      /case(po_act)/   $ call generator routine
     246
     247          call actgen(parm);
     248          parsep = parsep + 1;
     249          cont;
     250
     251
     252      end while;
     253
     254      end subr parse;
       1 .=member actgen
       2      subr actgen(parm);
       3
       4$ this routine performs generated actions specified in the
       5$ grammar. the actual generator calls are included from the
       6$ output of 'syn'.
       7
       8      size parm(ps);
       9
      10      +*  pac  =  go to esac;  **   $ utility macro
      11
      12
      13      go to pa(parm) in 1 to parseactmax;
      14
      15 .=include 'synact'    $ include generator calls
      16
      17/esac/      $ end of case
      18
      19      return;
      20
      21      end subr actgen;
       1 .=member gettok
       2      fnct gettok(dummy);
       3
       4$ this routine acts as an interface bewteen the scanner and the parser.
       5$ it returns the next token in the input stream.
       6
       7$ gettok maintains a buffer of tokens obtained from the scanner.
       8$ this is used to implement backtracking in the parser.
       9
      10
      11$ the token buffer is circular. when we enter gettok we begin by
      12$ checking whether the buffer is empty(in pointer = out pointer).
      13$ if so, we get a token from the scanner and insert it in the buffer.
      14
      15$ once we know the buffer is non-empty, we merely return the first
      16$ token in the buffer.
      17
      18$ at certain points in the parse, we must save the status of the parser.
      19$ one of the variables saved is a pointer to the token buffer indicating
      20$ the last token returned. every time we add a token to the buffer, we
      21$ must make sure that we do not clobber the oldest token which is part
      22$ of a saved state.
      23
      24
      25      size tok(ps);   $ current token
      26
      27      size gettok(ps),  $ token returned
      28           lexscan(ps),  $ lower level scanner routine
      29           symsds(sds_sz);  $ gets token name
      30
      31      access backup;  $ nameset containing token buffer
      32
      33
      34      if tok_in = tok_out then   $ buffer empty
      35          tok_in = tok_in + 1;
      36          if (tok_in > tok_lim) tok_in = 1;
      37
      38          if sstackp ^= 0 then  $ saving parser states
      39              if (tok_in = s_tout(1)) call overfl('tok_buff');
      40          end if;
      41
      42          tok_buff(tok_in) = 0;    $ not yet found
      43          tok = lexscan(0);   $ get token from scanner
      44          tok_buff(tok_in) = tok;
      45
      46      end if;
      47
      48      tok_out = tok_out + 1;   $ remove token
      49      if (tok_out > tok_lim) tok_out = 1;
      50
      51      gettok = tok_buff(tok_out);
      52
      53      if tt_flag then  $ tracing tokens output by gettok
      54          put, skip, column(7), 'gettok = ': symsds(gettok), a;
      55      end if;
      56
      57
      58      end fnct gettok;
       1 .=member lexscan
       2      fnct lexscan(dummy);
       3
       4$ this is the top level routine of the macro processor. it
       5$ returns a token from the next routine down, intercepting
       6$ macro occurrences.
       7
       8$ when we see a macro occurrence, we call 'initexp' which
       9$ initiates expansion of the macro.
      10
      11
      12      size lexscan(ps),  $ value returned
      13           absorb(ps);   $ lower level scanner routine
      14
      15
      16      while 1;
      17          lexscan = absorb(0);
      18
      19          if eof_flag then  $ end of file
      20              return;
      21
      22          elseif morg(lexscan) = 0 then  $ not macro
      23              return;
      24
      25          else
      26              call initexp(lexscan);  $ collect arguments and try again
      27          end if;
      28
      29      end while;
      30
      31
      32      end fnct lexscan;
       1 .=member initexp
       2      subr initexp(nam);
       3
       4$ this routine is called to initiate expansion of the macro 'nam'.
       5$ we begin by checking whther the macro has arguments. if not,
       6$ we simply build a new stack frame. otherwise we collect the
       7$ macros arguments by calling absorb, then build the stack
       8$ frame.
       9
      10
      11      size nam(ps);   $ name of macro
      12
      13      size n(ps),  $ number of arguments
      14           nparen(ps), $ no of parens
      15           tok(ps),  $ current token
      16           delim(ps),  $ desired delimiter
      17           j(ps);  $ loop index
      18
      19      size args(ps);   $ array of astack pointers
      20      dims args(param_lim);
      21
      22      size absorb(ps);  $ lower level scanner routine
      23
      24
      25      n = margs(nam);  $ get number of arguments
      26
      27      if n = 0 then  $ no arguments
      28          mstackp = mstackp+2;
      29          if (mstackp > mstack_lim) call overfl('mstack');
      30
      31          mstack(mstackp) = morg(nam);
      32          mstack(mstackp-1) = 0;
      33
      34          return;
      35
      36      end if;
      37
      38
      39$ otherwise collect the macros arguments. we do this by calling
      40$ 'absorb' for tokens. as we get each token, we add it to the argument
      41$ buffer 'astack'. we save a pointer to the first astack entry for
      42$ each argument in an auxilliary table called 'args'. these pointers
      43$ are later used to build the stack frame.
      44
      45      if (absorb(0) ^= sym_lp) call ermsg(1, 0);  $ get leading '('
      46
      47      do j = 1 to n;   $ collect arguments
      48
      49          args(j) = astackp+1;  $ save pointer to argument
      50          nparen = 0;           $ depth of parenthesis
      51
      52          while 1;
      53              tok = absorb(0);
      54
      55              if eof_flag then  $ end of file
      56                  call eofr;
      57
      58              elseif tok = sym_lp then  $ left paren
      59                  nparen = nparen + 1;
      60
      61              elseif tok = sym_rp then  $ right paren
      62                  if (nparen = 0) quit;   $ end of argument list
      63                  nparen = nparen - 1;
      64
      65              elseif nparen = 0 & tok = sym_comma then  $ end of argumen
      66                  quit;
      67              end if;
      68
      69$ add token to astack.
      70              countup(astackp, astack_lim, 'astack');
      71              astack(astackp) = tok;
      72          end while;
      73
      74$ push a zero onto astack to indicate end of argument
      75          countup(astackp, astack_lim, 'astack');
      76          astack(astackp) = 0;
      77
      78$ check that we found the right delimiter
      79          if j = n then
      80              delim = sym_rp;
      81          else
      82              delim = sym_comma;
      83          end if;
      84
      85          if (delim ^= tok) call ermsg(2, nam);
      86      end do;
      87
      88$ build stack frame
      89      mstackp = mstackp + n + 2;
      90      if (mstackp > mstack_lim) call overfl('mstack');
      91
      92      mstack(mstackp) = morg(nam);
      93      mstack(mstackp-1) = n;
      94
      95      do j = 1 to n;
      96          mstack(mstackp-1-j) = args(j);
      97      end do;
      98
      99
     100      end subr initexp;
       1 .=member absorb
       2      fnct absorb(dummy);
       3
       4$ this is the second level of the macro processor. it passes along
       5$ tokens, intercepting macro drops and definitions.
       6
       7
       8      size absorb(ps),  $ value returned
       9           expand(ps);  $ lower level scanner routine
      10
      11
      12      while 1;
      13          absorb = expand(0);   $ obtain token
      14
      15          if eof_flag then  $ end of file
      16              return;
      17
      18          elseif absorb = sym_macro then
      19              call getdef;
      20
      21          elseif absorb = sym_drop then
      22              call getdrop;
      23
      24          else
      25              quit;
      26          end if;
      27
      28      end while;
      29
      30
      31      end fnct absorb;
       1 .=member getdef
       2      subr getdef;
       3
       4$ this routine processes macro definitions. this is handled in three
       5$ steps:
       6
       7$ 1. get the macro name and store it in the global 'new_mac'.
       8
       9$ 2. get the formal parameters and set the margs field of the
      10$    macro.
      11
      12$ 3. get the macro text and set the morg field of the macro.
      13
      14
      15      size mac(ps);   $ macro name
      16
      17      size expand(ps);  $ lower level scanner routine
      18
      19
      20      mac = expand(0);
      21
      22      if eof_flag then  $ end of file
      23          call eofr;
      24
      25      elseif lex_typ(mac) > l_bold then  $ illegal token type
      26          call ermsg(3, mac);
      27          return;
      28
      29      elseif morg(mac) ^= 0 then  $ redefinition
      30          call ermsg(4, mac);
      31      end if;
      32
      33      call getparms(mac);  $ get parameter list
      34      call gettext(mac);  $ get text
      35
      36
      37      end subr getdef;
       1 .=member getparms
       2      subr getparms(mac);
       3
       4$ this routine collects the formal parameters for the macro 'mac'.
       5$ we do this by calling 'expand' to get tokens and parsing the
       6$ parameter list. as we recognize each parameter, we do three
       7$ things:
       8
       9$ 1. check for duplicate parameters.
      10$ 2. enter the parameter name in the array 'params'.
      11$ 3. set the 'mcode' field of the parameter to indicate
      12$    how the parameter will be represented in the macros text.
      13
      14$ once we have processed all the parameters, we set the
      15$ margs field of the macro to indicate the number of
      16$ arguments.
      17
      18$ the mcode field is used to indicate the representation of arguments
      19$ in the macro text. the mcode field is initially zero, indicating
      20$ straight text, and is reset while we are processing each macro
      21$ definition. after we absorb the text we must reset the mcode
      22$ field of each of the parameters to 0.
      23
      24
      25      size mac(ps);  $ macro name
      26
      27      size delim(ps),  $ delimiter
      28           nuser(ps),  $ no. of user args
      29           ngen(ps),   $ no. of generated arguments
      30           semi_seen(1),  $ on if ';' has been seen
      31           tok(ps),  $ current token
      32           param(ps);   $ current parameter
      33
      34      size expand(ps);  $ lower level scanner function
      35
      36
      37$ we begin by checking whether the macro name is followed by a
      38$ semi-colon. if so, there are no arguments.
      39
      40      delim = expand(0);
      41
      42      if eof_flag then  $ end of file
      43          call eofr;
      44
      45      elseif delim = sym_semi then  $ macro with no arguments
      46          margs(mac) = 0;
      47          paramsp = 0;  $ indicate params is empty.
      48          return;
      49
      50      elseif delim ^= sym_lp then
      51          call ermsg(5, 0);
      52      end if;
      53
      54      nuser = 0;   $ number of user supplied arguments
      55      ngen = 0;    $ number of generated arguments
      56
      57      paramsp = 0;
      58
      59$ see if the left paren is followed by a semicolon. if so, there are
      60$ no user supplied arguments.
      61
      62      tok = expand(0);
      63
      64      if tok = sym_semi then
      65          semi_seen = yes;
      66      else
      67          semi_seen = no;
      68          expand_buff = tok;  $ return token
      69      end if;
      70
      71      while 1;
      72          param = expand(0);
      73
      74          if eof_flag then  $ end of file
      75              call eofr;
      76
      77          elseif mcode(param) ^= 0 then  $ duplicate argument
      78              call ermsg(6, param);
      79
      80          elseif semi_seen then  $ generated argument
      81              ngen = ngen + 1;
      82              mcode(param) = bias_gen1 + ngen;
      83
      84          else  $ user supplied argument
      85              nuser = nuser + 1;
      86              mcode(param) = bias_user + nuser;
      87          end if;
      88
      89$ store in set of parameters
      90          countup(paramsp, param_lim, 'params');
      91          params(paramsp) = param;
      92
      93          delim = expand(0);
      94
      95          if eof_flag then  $ end of file
      96              call eofr;
      97
      98          elseif delim = sym_rp then  $ end of list
      99              quit;
     100
     101          elseif delim = sym_semi then
     102              if (semi_seen) call ermsg(7, 0);
     103              semi_seen = yes;
     104
     105          elseif delim ^= sym_comma then
     106              call ermsg(8, 0);
     107          end if;
     108
     109      end while;
     110
     111      margs(mac) = nuser;  $ set no. of args
     112
     113      if (expand(0) ^= sym_semi) call ermsg(9, 0); $ trailing ';'
     114
     115
     116      end subr getparms;
       1 .=member gettext
       2      subr gettext(mac);
       3
       4$ this routine gets the text for the macro 'mac'. this is done
       5$ by getting tokens from the macro expander and adding them to
       6$ mttab until the end of the macro is encountered.
       7
       8$ as we get each token, we examine it to see if we have reached
       9$ the end of the macro. there are two ways to indicate the
      10$ end of a macro:
      11
      12$     endm mac;     or       endm;
      13
      14$ the word 'endm' alone is only recognized as the end of the
      15$ macro if the macro text contains an equal number of imbedded
      16$ 'macro'-s and 'endm'-s.
      17
      18$ we keep track of the length of the macro. if it exceeds a
      19$ certain length, we assume that there is a missing 'endm'.
      20
      21$ note that parameters and generated arguments recieve a
      22$ special representation in the macro text. this representation
      23$ is given by the mcode field of their symbol table entries.
      24$ a mcode field of zero indicates simple text; this is
      25$ represented directly by a symbol table pointer.
      26
      27$ once we have gathered all the text, we must clear the mcode
      28$ fields of the parameters. we use the set 'params' to do this.
      29
      30
      31      size mac(ps);   $ macro name
      32
      33      size len(ps),  $ length of macro
      34           lev(ps),  $ nesting depth for nested macros
      35           code(ps),   $ encoding of word of macro text
      36           j(ps),  $ loop index
      37           tok(ps),  $ current token
      38           tok1(ps), $ extra token
      39           param(ps);  $ parameter name
      40
      41      size expand(ps);  $ lower level scanner routine
      42
      43
      44      morg(mac) = mtabp + 1;   $ set origin field
      45
      46      len = 0;   $ length of macro
      47      lev = 0;  $ depth of nexting of macros
      48
      49      while 1;
      50          len = len+1;
      51
      52          if len > mlen_lim then
      53              call ermsg(10, 0);
      54              quit;
      55          end if;
      56
      57          tok = expand(0);
      58
      59          if eof_flag then  $ end of file
      60              call eofr;
      61
      62          elseif tok = sym_macro then  $ nested macro definitions
      63              lev = lev + 1;
      64
      65          elseif tok = sym_endm then  $ end of some macro
      66              if (lev = 0) quit;
      67              lev = lev-1;      $ decrement nesting depth
      68
      69              tok1 = expand(0);  $ look for 'endm mac'
      70              if (tok1 = mac) quit;
      71              expand_buff = tok1;  $ return token
      72
      73          elseif tok = mac then $ macro name occurs within itself
      74              call ermsg(11, mac);
      75          end if;
      76
      77
      78$ find out how we will encode the token in mtab. if the tokens
      79$ mcode is zero, then the token is straight text, and we simply
      80$ use its symbol table pointer. otherwise the token is an argument,
      81$ and mcode gives its encoding.
      82
      83$ each generated argument has one encoding on its first occurrence and
      84$ another on all later occurrences. after getting the code of the token,
      85$ see if it is a first occurrence of a generated argument. if so,
      86$ change its mcode field to indicate its encoding on all later
      87$ occurrences.
      88
      89          code = mcode(tok);
      90
      91          if code = 0 then  $ simple text
      92              code = tok;
      93
      94          elseif code > bias_gen1 & code < bias_gen2 then
      95              mcode(tok) = code + (bias_gen2 - bias_gen1);
      96          end if;
      97
      98          countup(mtabp, mtab_lim, 'mtab');
      99          mtab(mtabp) = code;
     100
     101$ adjust code for generated arguments
     102      end while;
     103
     104$ add zero entry to macro text
     105      countup(mtabp, mtab_lim, 'mtab');
     106      mtab(mtabp) = 0;
     107
     108$ get trailing tokens after 'endm'
     109      until eof_flag ! tok = sym_semi;
     110          tok = expand(0);
     111      end until;
     112
     113$ reset encoding for parameters
     114      do j = 1 to paramsp;
     115          param = params(j);
     116          mcode(param) = 0;
     117      end do;
     118
     119
     120      end subr gettext;
       1 .=member getdrop
       2      subr getdrop;
       3
       4$ this routine processes the 'macro drop' statement. this statement
       5$ has the form:
       6
       7$    drop  ;
       8
       9$ we iterate until we see a semicolon, dropping macros.
      10
      11
      12      size mac(ps),  $ macro name
      13           delim(ps);  $ delimeter
      14
      15      size expand(ps);  $ lower level scanner routine
      16
      17
      18      while 1;
      19          mac = expand(0);
      20
      21          if eof_flag then  $ end of file
      22              call eofr;
      23          elseif lex_typ(mac) > l_bold then  $ not valid macro name
      24              call ermsg(12, mac);
      25          else
      26              morg(mac) = 0;
      27              margs(mac) = 0;
      28          end if;
      29
      30          delim = expand(0);
      31
      32          if eof_flag then  $ end of file
      33              call eofr;
      34          elseif delim = sym_semi then
      35              return;
      36          elseif delim ^= sym_comma then
      37              call ermsg(13, 0);
      38          end if;
      39
      40      end while;
      41
      42
      43      end subr getdrop;
       1 .=member expand
       2      fnct expand(dummy);
       3
       4$ this is the bottom level routine of the macro expander. it
       5$ checks mstack to see whether we are currently expanding macros.
       6$ if so, it returns the next token of macro text; otherwise
       7$ it gets a token from the scanner.
       8
       9$ in order to parse macro definitions, we keep a one token
      10$ buffer. if this buffer contains a token, we return it.
      11$ otherwise loop until we can return a token, expanding macro
      12$ arguments where necessary.
      13
      14
      15      size p(ps),  $ pointer to mtab or astack
      16           n(ps),  $ number of arguments
      17           arg(ps);  $ pointer to astack
      18
      19      size expand(ps),  $ token returned
      20           trulex(ps),  $ lower level scanner routine
      21           bldgen(ps);  $ builds generated arguments
      22
      23
      24$ begin by checking 1 word buffer.
      25
      26      if expand_buff ^= 0 then  $ return saved token
      27          expand = expand_buff;
      28          expand_buff = 0;
      29          return;
      30      end if;
      31
      32$ otherwise loop until we can return a token of macro text
      33
      34      while 1;
      35
      36          if mstackp = 0 then  $ not expanding any macros
      37              expand = trulex(0);
      38              return;
      39          end if;
      40
      41$ get pointer to next word of macro text, then advance mstack.
      42          p = mstack(mstackp);
      43          mstack(mstackp) = p+1;
      44
      45$ p points to the next word of text in either astack or mtab.
      46$ if p is greater than 'mtab_lim' then it is a pointer into
      47$ astack biased by mtab_lim.
      48
      49          if p > mtab_lim then
      50              expand = astack(p-mtab_lim);
      51          else
      52              expand = mtab(p);
      53          end if;
      54
      55          if expand = 0 then  $ end of macro or argument. pop mstack.
      56              n = mstack(mstackp-1);   $ no. of args
      57
      58              if n ^= 0 then   $ pop astack
      59                  astackp = mstack(mstackp-2)-1;
      60              end if;
      61
      62              mstackp = mstackp-n-2;  $ pop mstack
      63
      64          elseif expand <= symtab_lim then $ simple text
      65              return;
      66
      67          elseif expand <= bias_gen1 then $ user argument
      68
      69$ get a pointer to the start of the argument in astack, then
      70$ build a stack frame.
      71
      72              n = expand-bias_user;  $ argument number
      73              arg = mstack(mstackp-1-n);  $ pointer to astack
      74
      75              mstackp = mstackp + 2;  $ build stack frame
      76              if (mstackp > mstack_lim) call overfl('mstack');
      77
      78              mstack(mstackp) = arg + mtab_lim;  $ store biased pointer
      79              mstack(mstackp-1) = 0;
      80
      81          elseif expand <= bias_gen2 then $ first use of gen. arg.
      82              expand = bldgen(expand-bias_gen1, yes);
      83              return;
      84
      85          else  $ second use of generated argument
      86              expand = bldgen(expand-bias_gen2, no);
      87              return;
      88          end if;
      89
      90      end while;
      91
      92
      93      end fnct expand;
       1 .=member trulex
       2      fnct trulex(dummy);
       3
       4$ this routine is the actual lexical scanner.  it returns the next
       5$ token from the input file.
       6$
       7$ trulex is essentially a giant case statement which brances on the
       8$ class of the first character in the token. each branch of the case
       9$ statement contains a separate loop for gathering the remainder
      10$ of the token.
      11$
      12$ the global variable 'char' always contains the next character of
      13$ the input file. there are several primitives for manipulating
      14$ 'char':
      15$
      16$ 1. getc:    advances 'char' to the next character
      17$ 2. backc:   removes last character from token.
      18$ 3. keepc:   adds 'char' to the token being built up.
      19$
      20$ the current card image is stored in a self defining string called
      21$ 'card' with a standard origin. the number of characters per line
      22$ is given by the macro 'cpc'. only the first 'scpc' characters
      23$ are significant; the remaining characters are used for sequence
      24$ numbers, etc.
      25$
      26$ the current token is stored as an array of characters called
      27$ 'tok'. it is converted to an sds before it is hashed.
      28$
      29$ the flag 'eof_flag' is set when the end of the input file is
      30$ encountered. it is checked every time we start processing a
      31$ new token.
      32$
      33$ once the end of file has been encountered, 'getc' always returns
      34$ a blank as the next character. the blank will act as a delimiter
      35$ for names, etc. without requiring a separate end of file check
      36$ within most of the loops.
      37
      38      +*  card_org  =   $ sorg for 'card'
      39          (.sds. cpc + 1)
      40          **
      41
      42
      43      +*  card_char(i)  =   $ i-th character of current line
      44          .f. card_org - (i*cs), cs, card
      45          **
      46
      47
      48      +*  keepc =   $ store character in 'tok'
      49          tok_pos = tok_pos + 1;
      50          if (tok_pos > toklen_lim) call tok_over;
      51          tok(tok_pos) = char;
      52          **
      53
      54
      55$ the following macro is used to back up char, tok_pos, and
      56$ card_pos. we will only back up as far as the start of
      57$ the current token, so there is never any need to retreat to
      58$ the previous line.
      59
      60      +*  backc =
      61          char = tok(tok_pos);
      62          tok_pos = tok_pos-1;
      63          card_pos = card_pos-1;
      64          **
smfa  13
smfa  14
smfa  15$ we define a standard code block at the end of this routine to get the
smfa  16$ next character from the input stream.  the following macro is used to
smfa  17$ simulate the internal procedure call.
smfa  18
smfa  19      +* l_call(lab)  =       $ local call to code block at label -lab-
smfa  20          retpt = zzya;       $ save return label
smfa  21          go to lab;          $ branch to internal procedure
smfa  22          /rlab(zzya)/        $ define label for return point
smfa  23          **
smfa  24
smfa  25      size retpt(ps);         $ return label
smfa  26
smfa  27 .=zzyorg a     $ reset counter for return labels
smfa  28
smfa  29 .+mc size ctpc(ws);          $ little 'convert-to-primary-case'
smfa  30
smfa  31      size fold(1);           $ indicates whether case-folding should be
smfa  32      data fold = yes;        $  performed
      65
      66
      67      nameset nstrulex;
      68
      69          size card(.sds. cpc); $ current card image
      70          size card_pos(ps);  $ current position in card
      71          size card_len(ps);  $ index of last non-blank on card
      72
      73          size eor_seen(1);   $ end-of-record has occurred
      74          data eor_seen = no;
      75
      76          size tok(cs);  $ token being built
      77          dims tok(toklen_lim);
      78
      79          size tok_pos(ps),         $ current position in tok = token le
      80               tok_typ(ps);        $ lex_typ of token
      81
      82          size char(ps),    $ current character
      83               char1(ps);   $ extra character
      84
      85          size p(ps);   $ symtab pointer
      86
      87$ little string search set definitions:
      88
      89$ little uses the following pre-defined string search sets:
      90
      91      +*  ss_blank   =  001  **  $ blank
      92      +*  ss_separ   =  002  **  $ blank-equivalent separators
      93      +*  ss_digit   =  004  **  $ digits 0...9
      94      +*  ss_ucchar  =  008  **  $ upper case alphabetics a...z
      95      +*  ss_lcchar  =  016  **  $ lower case alphabetics a...z
      96      +*  ss_break   =  032  **  $ break character ('_')
      97
      98$ setl uses the following additional search classes:
      99
     100      +*  ss_delim   =  064  **  $ delimiter
     101      +*  ss_ddelim  =  128  **  $ double delimiter
     102      +*  ss_alter   =  512  **  $ 'altchar_tab'-defined characters
     103
     104$ groups of search classes:
     105
     106      +*  ss_alpha   =  024  **  $ alphabetic
     107      +*  ss_alpham  =  060  **  $ alphameric
     108
smfb  32$ for efficiency, we duplicate some of the little string functions as
smfb  33$ internal subroutines, to avoid repeated external subroutine calls.
smfb  34$ this implies that we need to dublicate the definition of the string
smfb  35$ search tables.
smfb  36
smfb  37          size sstab(ws);     $ string search table
smfb  38          dims sstab(cssz);
smfb  39
smfc  11 .+r36.   $ initialise sstab for s10/s20 (9 bit ascii)
smfb  41          data sstab =
smfb  42              0(9), ss_separ /* tab */, 0(2), ss_separ /* form feed */,
smfb  43              0(19), ss_blank ! ss_separ, 0(15), ss_digit(10), 0(7),
smfb  44              ss_ucchar(26), 0(4), ss_break, 0, ss_lcchar(26), 0(5),
smfb  45              0(384);
smfc  12 ..r36
smfb  47 .+s32.   $ initialise sstab for s32 (8 bit ascii)
smfb  48          data sstab =
smfb  49              0(9), ss_separ /* tab */, 0(2), ss_separ /* form feed */,
smfb  50              0(19), ss_blank ! ss_separ, 0(15), ss_digit(10), 0(7),
smfb  51              ss_ucchar(26), 0(4), ss_break, 0, ss_lcchar(26), 0(5),
smfb  52              0(128);
smfb  53 ..s32
smfb  54 .+s37.   $ initialise sstab for s37 (8 bit ebcdic)
smfb  55          data sstab =
smfb  56              0(5), ss_separ /* tab */, 0(6), ss_separ /* form feed */,
smfb  57              0(51), ss_blank ! ss_separ, 0(44), ss_break, 0(18), 0(1),
smfb  58              ss_lcchar(9), 0(7), ss_lcchar(9), 0(8), ss_lcchar(8),
smfb  59              0(22), 0, ss_ucchar(9), 0(7), ss_ucchar(9), 0(8),
smfb  60              ss_ucchar(8), 0(6), ss_digit(10), 0(6);
smfb  61 ..s37
smfb  62 .+s47.   $ initialise sstab for s47 (8 bit ascii)
smfb  63          data sstab =
smfb  64              0(9), ss_separ /* tab */, 0(2), ss_separ /* form feed */,
smfb  65              0(19), ss_blank ! ss_separ, 0(15), ss_digit(10), 0(7),
smfb  66              ss_ucchar(26), 0(4), ss_break, 0, ss_lcchar(26), 0(5),
smfb  67              0(128);
smfb  68 ..s47
smfb  69 .+s66.   $ initialise sstab for s66
smfb  70          data sstab =
smfb  71              0,
smfb  72              ss_ucchar(26),  $ alphabetics
smfb  73              ss_digit(10),   $ numerics
smfb  74              0(8),
smfb  75              ss_blank ! ss_separ, $ blank (the only separator)
smfb  76              0(7),
smfb  77              ss_break,       $ break (underline)
smfb  78              0(10);          $ remaining characters
smfb  79 ..s66
suna  24 .+s68.   $ initialise sstab for mc68000 (8 bit ascii)
suna  25          data sstab =
suna  26              0(9), ss_separ /* tab */, 0(2), ss_separ /* form feed */,
suna  27              0(19), ss_blank ! ss_separ, 0(15), ss_digit(10), 0(7),
suna  28              ss_ucchar(26), 0(4), ss_break, 0, ss_lcchar(26), 0(5),
suna  29              0(128);
suna  30 ..s68
smfb  80
smfb  81      +* anyc(c, sm)  =       $ look for character in string set
smfb  82          ((sm & sstab(1+c)) ^= 0)
smfb  83          **
smfb  84
     109$ macros for character types:
     110
     111      +*  alphabetic(c)  =  anyc(c, ss_alpha)   **
     112      +*  alphameric(c)  =  anyc(c, ss_alpham)  **
     113      +*  numeric(c)     =  anyc(c, ss_digit)   **
     114
     115      +* is_blank(c)  =
     116 .+mc     anyc(c, ss_separ)
     117 .-mc     (c = 1r )
     118          **
smfb  85
smfb  86$ define the table to map each character to the proper lexical case.
smfb  87
smfb  88          size lttab(ws);     $ lexical table
smfb  89          dims lttab(cssz);
smfb  90
smfb  91 .=zzyorg z
smfb  92
smfb  93          defc(lt_separ)      $ token separators
smfb  94          defc(lt_alpha)      $ alphabetic character
smfb  95          defc(lt_ddelim)     $ double-character delimiter
smfb  96          defc(lt_delim)      $ single-character delimiter
smfb  97          defc(lt_alter)      $ one-character alternate character
smfb  98          defc(lt_period)     $ the character '.'
smfb  99          defc(lt_quote)      $ the character -'-
smfb 100          defc(lt_digit)      $ numeric character
smfb 101          defc(lt_dollar)     $ the character '$'
smfb 102          defc(lt_error)      $ all remaining characters
     119
     120
     121$ delimitiers
     122$ -----------
     123
     124$ the table 'dsym_tab' maps single delimiters to their symbol table
     125$ entries.
     126
     127      +*  delims  =  '(),;:=+-*/#?'  **
     128
     129      size dsym_tab(ps);
     130      dims dsym_tab(.len. delims);
     131
     132$ 'ddelims(i,j)' is true for characters ij which form two character
     133$ delimiters.
     134
     135      +* double_delims_1 = '(/:/*:'  **
     136      +* double_delims_2 = '/)====*:'  **
     137
     138      +*  ddelim(i, j)  =  .f. i+1, 1, a_ddelim(j+1)  **
     139
smfb 103          size a_ddelim(cssz);
     141          dims a_ddelim(cssz);
     142
     143$ codes for ddelim
     144
     145$ the code for ddelim are zero origined.  this allows us to
     146$ initialize the entries for all chracacters which are not
     147$ double delimiters in a single data statement.
     148
     149 .=zzyorg z
     150
     151          defc0(dd_single)   $ single delimiter
     152          defc0(dd_double)   $ double delimiter
     153
     154          +*  dd_min  =  dd_single  **  $ minumum value
     155          +*  dd_max  =  dd_double  **  $ maximum value
     156
     157          data a_ddelim = dd_single(cssz);  $ set to default case
     158
     159      end nameset;
     160
     161
     162      size string(.sds. toklen_lim);  $ token in sds form
     163
     164      size j(ps);             $ loop index
     165      size org(ps);           $ origin for self-defining string
     166
     167      size trulex(ps);        $ pointer returned
     168      size hash(ps);          $ hashing function called
     169
smfb 104      size brkc(ws);          $ little string primitives
     172      size spns(ws);
     173
     174
     175/begin/        $ main entry point
     176
     177      if (eof_flag) return;
     178
     179      tok_pos = 0;  $ initialize length of token
     180
smfb 105      go to case(lttab(1+char)) in lt_separ to lt_error;
smfb 106
smfb 107/case(lt_separ)/              $ token separators
     182          $
     183          $ advance to the next non-blank character
     184          $
     185          if card_pos = 0 then
smfa  33              l_call(getc);
     187              if (is_blank(char) = no) go to begin;
     188          end if;
     189
     190          card_pos = card_pos + spns(card, card_pos, ss_separ) - 1;
smfa  34          l_call(getc);
     192
     193          go to begin;
     194
     195
smfb 108/case(lt_alpha)/              $ alphabetic character
     197          $
     198          $ <*name>
     199          $
     200          until alphameric(char) = no;
smfa  35              keepc; l_call(getc);
     202          end until;
     203
     204          tok_typ = l_name;
     205          go to esac;
     206
     207
smfb 109/case(lt_ddelim)/             $ double-character delimiter
     209          $
     210          $ this class includes characters such as '*' which may
     211          $ either stand alone or be part of a two character
     212          $ delimiter such as '**'.
     213          $
     214          $ we begin by saving char and getting the next character.
     215          $ next we lookup the two characters in 'ddelim' which
     216          $ gives us one of two cases:
     217          $
     218          $ 1. the first character is a delimiter and the second is
     219          $    part of the next token.
     220          $
     221          $ 2. the two characters form a double delimiter.
     222          $
smfa  36          char1 = char; l_call(getc);  $ save char, then get next char
     224
     225          go to c(ddelim(char1, char)) in dd_min to dd_max;
     226
     227          /c(dd_single)/      $ char1 is a single delimiter
     228
     229              trulex = dsym_tab(brkc(delims, 1, char1)+1);
     230              return;
     231
     232          /c(dd_double)/      $ double delimiter
     233
     234              tok(1) = char1;   tok(2) = char;   tok_pos = 2;
     235              tok_typ = l_delim;
     236
smfa  37              l_call(getc);  $ get character after delimiter
     238              go to esac;
     239
     240
smfb 110/case(lt_delim)/              $ single-character delimiter
     242          $
     243          $ look up the symbol table pointer and return
     244          $
     245          trulex = dsym_tab(brkc(delims, 1, char)+1);
smfa  38          l_call(getc);
     247          return;
     248
     249
smfb 111/case(lt_alter)/              $ one-character alternate character
     251          $
     252          $ one character alternative for setl symbol or literal
     253          $
     254          trulex = altchar_tab(brkc(altchars, 1, char)+1);
smfa  39          l_call(getc);
     256          return;
     257
     258
smfb 112/case(lt_period)/             $ the character '.'
     260          $
     261          $ <*bold>, <*real>, or <*dots>
     262          $
     263          $ if the period is followed by an alphabetic, it is part
     264          $ of a <*bold>;  if it is followed by a numeric, it is
     265          $ part of a <*real>;  otherwise it must be the start of
     266          $ a <*dots>.
     267          $
smfa  40          keepc;  l_call(getc);  $ read character after dot
     269
     270          if alphabetic(char) then $ start of <*bold>
     271              until alphameric(char) = no;
smfa  41                  keepc; l_call(getc);
     273              end until;
     274
     275              tok_typ = l_bold;
     276
     277          elseif numeric(char) then $ start of <*real>
     278              go to get_real;
     279
     280          elseif char = 1r. then  $ start of <*dots>
     281              until char ^= 1r.;
smfa  42                  keepc; l_call(getc);
     283              end until;
     284
     285              tok_typ = l_dots;
     286
     287          else                    $ invalid starting character
     288              call ermsg(14, 0);
     289              go to begin;
     290          end if;
     291
     292          go to esac;
     293
     294
smfb 113/case(lt_quote)/              $ the character -'-
     296          $
     297          $ <*string>
     298          $
     299          $ imbedded quotes are represented by a pair of quotes
     300          $ with no intervening blanks.  two string denotations
     301          $ separated by blanks are treated as a single denotation.
     302          $
     303          $ n.b. both the opening and closing quotes are considered
     304          $      part of the token.  this way the string -'x'-
     305          $      looks different from the identifier -x-.
     306          $
     307          $ strings are particularly likely to overflow the token
     308          $ buffer, for example if the closing quote is missing.
     309          $ for this reason we make an explicit test and jump out
     310          $ of the loop if necessary.
     311
     312          eor_seen = no;      $ initialise end-of-record flag
     313          keepc;   $ save quote
smfa  43          fold = no;          $ do not fold case while reading string
     314
     315          while 1;
smfa  44              l_call(getc);
     317
     318              if char = 1r' then  $ found quote
smfa  45                  fold = yes;  $ could be end of string
smfa  46                  l_call(getc);  $ read next character
     320
     321                  if char = 1r' then  $ double quote
     322                      keepc;
     323
     324                  elseif char = 1r  then  $ look for adjacent string
     325                      until is_blank(char) = no;
smfa  47                          l_call(getc); if (eof_flag) quit;
     327                          if (eor_flag) eor_seen = yes;
     328                      end until;
     329
     330                      if (char ^= 1r') quit;
     331                      if ( ^ eor_seen) quit;
     332                  else
     333                      quit;
     334                  end if;
smfa  48
smfa  49                  fold = no;  $ continue to read string
     335
     336              elseif eof_flag then  $ missing close quote
     337                  call eofr;
     338
     339              else   $ simple text
     340                  keepc;
     341              end if;
     342
     343              if tok_pos = toklen_lim - 1 then  $ too long
     344                  call ermsg(18, 0);
     345                  quit;
     346              end if;
     347
     348          end while;
smfa  50
smfa  51          fold = yes;  $ fold case while not reading string
     349
     350          $ insert final quote in token
     351          tok_pos = tok_pos + 1;
     352          tok(tok_pos) = 1r';
     353
     354          tok_typ = l_string;
     355          go to esac;
     356
     357
smfb 114/case(lt_digit)/              $ numeric character
     359          $
     360          $ <*int>, or <*real>
     361          $
     362          $ begin by collecting the integer part of the token.
     363          $ then see if the digits are followed by a decimal point.
     364          $
     365          until numeric(char) = no;
smfa  52              keepc; l_call(getc);
     367          end until;
     368          $
     369          $ if the digits are followed by a period, we may have
     370          $ either one token, namely a real denotation such as 1.0,
     371          $ or two tokens, such as in '1 .op'.  we can determine
     372          $ the proper alternative by inspecting the character
     373          $ after the period.
     374          $
     375          if char = 1r. then
smfa  53              keepc; l_call(getc);
     377
     378              if numeric(char) then  $ <*real> - get fraction
     379                  go to get_real;
     380              else            $ period is part of the next token
     381                  backc;
     382              end if;
     383          end if;
     384
     385          tok_typ = l_int;
     386          go to esac;
     387
     388
     389/get_real/
     390
     391          until numeric(char) = no;
smfa  54              keepc; l_call(getc);
     393          end until;
     394
     395          if char = 1re then  $ probably its an exponent
smfa  55              keepc; l_call(getc);
     397
     398              if char = 1r+ ! char = 1r- then  $ sign for exponent
smfa  56                  keepc; l_call(getc);
     400              end if;
     401
     402              if numeric(char) then  $ valid exponent
     403                  until numeric(char) = no;
smfa  57                      keepc; l_call(getc);
     405                  end until;
     406
     407              else  $ we had '1.0  else'
     408                  backc;
     409              end if;
     410
     411          end if;
     412
     413          tok_typ = l_real;
     414          go to esac;
     415
     416
smfb 115/case(lt_dollar)/             $ the character '$'
     418          $
     419          $ rest of line is comment
     420          $
smfa  58          call getcard;
     422          go to begin;
     423
     424
smfb 116/case(lt_error)/              $ all remaining characters
     426          $
     427          $ current character can not start a token
     428          $
     429          call ermsg(14, 0);
     430
smfa  59          l_call(getc);
     432          go to begin;
     433
     435
     436/esac/     $ hash in token
     437
     438$ we begin by converting the token to an sds. the sds must
     439$ be standard. i.e. each token xxx must have the same sorg
     440$ as a string xxx generated by the little compiler.
     441
     442      org = .sds. tok_pos + 1;  $ origin for string
     443
     444      string = 0;  $ initialize string
     445      slen string = tok_pos;
     446      sorg string = org;
     447
     448      do j = 1 to tok_pos;
     449          .f. org-j*cs, cs, string = tok(j);
     450      end do;
     451
     452      p = hash(string);
     453
     454$ if the token is a literal or keyword then its lex_type has
     455$ already been set; otherwise we set it to tok_typ.
     456
     457      if (key_val(p) = 0 & lit_val(p) = 0) lex_typ(p) = tok_typ;
     458
     459      trulex = p;
smfa  60
smfa  61      return;
smfa  62
smfa  63
smfa  64/getc/                        $ local subroutine to read next character
smfa  65
smfa  66$ this routine places the next character from the input stream into the
smfa  67$ global variable 'char'.  the global 'fold' indicates whether the
smfa  68$ character is to be translated to its primary case equivalent.
smfa  69
smfa  70      if card_pos >= scpc then  $ read next card
smfa  71          eor_flag = yes;  $ indicate end-of-record
smfa  72          call getcard;  $ read next card
smfa  73      else    $ advance in buffer
smfa  74          eor_flag = no;
smfa  75          card_pos = card_pos + 1;
smfa  76          char     = card_char(card_pos);
smfa  77
smfa  78 .+s66.
smfa  79$ if using 64 set and outside of quotes, convert percent to colon
smfb 117          if (fold .and. (char=3b'63'))  char = 1r:;
smfa  81 ..s66
smfa  82
smfa  83 .+mc     if (fold) char = ctpc(char);
smfa  84      end if;
smfa  85
smfa  86      go to rlab(retpt) in 1 to zzya;  $ return from local subroutine
     460
     461
     462      end fnct trulex;
       1 .=member initlex
       2      subr initlex;
       3
       4$ this routine initializes various tables used by trulex.
       5
       6
       7      size j(ps);             $ loop index
       8      size p(ps);             $ symbol table pointer
       9      size str(.sds. 1);      $ used to initialize delimiters
      10      size c(cs);             $ character code
      11      size c1(cs), c2(cs);    $ pair of characters
      12
      13      access nstrulex;
      14
      15      size hashlit(ps);       $ hashing function
      16      size brkc(ws);          $ little break string primitive
      17
      18
      19      $ initialise the input line to a blank line
      20      card = ' ' .pad. cpc; sorg card = card_org; slen card = wpc * cpw;
      21      card_pos = 0;
      22      char = 1r ;
      23
      24      $ initialize delimiters, finding their hash, and setting
      25      $ their class.  we will revise the class for double
      26      $ delimiters later.
      27
      28      call blds(delims, ss_delim);
      29
      30      str = ' ';
      31
      32      do j = 1 to .len. delims;
      33          c = .ch. j, delims;   .ch. 1, str = c;
smfb 118          sstab(1+c) = sstab(1+c) ! ss_delim;  $ define pattern set
      34
      35          p = hashlit(str);
      36          dsym_tab(brkc(delims, 1, c)+1) = p;
      37      end do;
      38
      39      $ process double delimiters
      40      $
      41      $ n.b. 'ddelim' has already been initialized by a data state-
      42      $      ment so that all invalid pairs indicate dd_single.
      43
      44      call blds(double_delims_1, ss_ddelim);
      45
      46      do j = 1 to .len. double_delims_1;
      47          c1 = .ch. j, double_delims_1;
      48          c2 = .ch. j, double_delims_2;
      49          ddelim(c1, c2) = dd_double;
smfb 119
smfb 120          sstab(1+c1) = sstab(1+c1) ! ss_ddelim;  $ define pattern set
      50      end do;
      51
      52      $ initialize alternate characters
      53
      54      .len. altchars = 0;
      55
      56 .+s10.
      57      if char_set = cset_ext then
      58          call altchar('<<',     3b'173');  $ left set brace
      59          call altchar('>>',     3b'175');  $ right set brace
      60          call altchar('(/',     3b'133');  $ left square bracket
      61          call altchar('/)',     3b'135');  $ right square bracket
      62          call altchar('st',     3b'174');  $ vertical bar
      63          call altchar('st',     3b'041');  $ exclamation mark
      64      end if;
      65 ..s10
      66
      67 .+s20.
      68      if char_set = cset_ext then
      69          call altchar('<<',     3b'173');  $ left set brace
      70          call altchar('>>',     3b'175');  $ right set brace
      71          call altchar('(/',     3b'133');  $ left square bracket
      72          call altchar('/)',     3b'135');  $ right square bracket
      73          call altchar('st',     3b'174');  $ vertical bar
      74          call altchar('st',     3b'041');  $ exclamation mark
      75      end if;
      76 ..s20
      77
      78
      79 .+s32.
      80      if char_set = cset_ext then
      81          call altchar('<<',     4b'7b');   $ left set brace
      82          call altchar('>>',     4b'7d');   $ right set brace
      83          call altchar('(/',     4b'5b');   $ left square bracket
      84          call altchar('/)',     4b'5d');   $ right square bracket
      85          call altchar('st',     4b'7c');   $ vertical bar
      86          call altchar('st',     4b'21');   $ exclamation mark
      87      end if;
      88 ..s32
      89
      90 .+s37.
      91      if char_set = cset_ext then
      92          call altchar('<<',     4b'8b');   $ left set brace
      93          call altchar('>>',     4b'9b');   $ right set brace
      94          call altchar('(/',     4b'ad');   $ left square bracket
      95          call altchar('/)',     4b'bd');   $ right square bracket
      96          call altchar('st',     4b'4f');   $ vertical bar
      97          call altchar('st',     4b'5a');   $ exclamation mark
asca  15
asca  16 .+ascebc.
asca  17$ cset_por for s37 reflects tn chain conventions;  cset_asc reflects
asca  18$ the (cdc-based) ascii to ebcdic translation tables used at nyu.
asca  19$ use cset_asc for files originally written in ascii which are to
asca  20$ be compiled under ebcdic using the translation used by nyu tape
asca  21$ driver.
asca  22      elseif char_set = cset_asc then
asca  23          call altchar('<<',     4b'c0');   $ left set brace
asca  24          call altchar('>>',     4b'd0');   $ right set brace
asca  25          call altchar('(/',     4b'4a');   $ left square bracket
asca  26          call altchar('/)',     4b'5a');   $ right square bracket
asca  27          call altchar('st',     4b'6a');   $ vertical bar
asca  28          call altchar('st',     4b'4f');   $ exclamation mark
asca  29 ..ascebc
      98      end if;
      99 ..s37
     100
     101 .+s47.
     102      if char_set = cset_ext then
     103          call altchar('<<',     4b'7b');   $ left set brace
     104          call altchar('>>',     4b'7d');   $ right set brace
     105          call altchar('(/',     4b'5b');   $ left square bracket
     106          call altchar('/)',     4b'5d');   $ right square bracket
     107          call altchar('st',     4b'7c');   $ vertical bar
     108          call altchar('st',     4b'21');   $ exclamation mark
     109      end if;
     110 ..s47
     111
     112 .+s66.
     113      if char_set = cset_ext then
     114          call altchar('<<',     3b'74');   $ at sign
     115          call altchar('>>',     3b'75');   $ reverse slant
     116          call altchar('(/',     3b'61');   $ left square bracket
     117          call altchar('/)',     3b'62');   $ right square bracket
     118          call altchar('st',     3b'67');   $ ampersand
     119      end if;
     120 ..s66
suna  31
suna  32 .+s68.
suna  33      if char_set = cset_ext then
suna  34          call altchar('<<',     4b'7b');   $ left set brace
suna  35          call altchar('>>',     4b'7d');   $ right set brace
suna  36          call altchar('(/',     4b'5b');   $ left tuple bracket
suna  37          call altchar('/)',     4b'5d');   $ right tuple bracket
suna  38          call altchar('st',     4b'7c');   $ vertical bar
suna  39          call altchar('st',     4b'21');   $ exclamation mark
suna  40      end if;
suna  41 ..s68
     121
     122      call blds(altchars, ss_alter);
smfb 121
smfb 122      do j = 1 to .len. altchars;
smfb 123          c = .ch. j, altchars;
smfb 124          sstab(1+c) = sstab(1+c) ! ss_alter;  $ define pattern set
smfb 125      end do;
smfb 126
smfb 127      do c = 0 to cssz-1;
smfb 128          if is_blank(c)        then lttab(1+c) = lt_separ;  cont; end;
smfb 129          if alphabetic(c)      then lttab(1+c) = lt_alpha;  cont; end;
smfb 130          if anyc(c, ss_ddelim) then lttab(1+c) = lt_ddelim; cont; end;
smfb 131          if anyc(c, ss_delim)  then lttab(1+c) = lt_delim;  cont; end;
smfb 132          if anyc(c, ss_alter)  then lttab(1+c) = lt_alter;  cont; end;
smfb 133          if c = 1r.            then lttab(1+c) = lt_period; cont; end;
smfb 134          if c = 1r'            then lttab(1+c) = lt_quote;  cont; end;
smfb 135          if numeric(c)         then lttab(1+c) = lt_digit;  cont; end;
smfb 136          if c = 1r$            then lttab(1+c) = lt_dollar; cont; end;
smfb 137          lttab(1+c) = lt_error;
smfb 138      end do;
smfb 139
smfb 140
smfb 141      macdrop(anyc);
smfb 142
smfb 143      macdrop4(lt_separ, lt_alpha, lt_ddelim, lt_delim)
smfb 144      macdrop4(lt_alter, lt_period, lt_quote, lt_digit)
smfb 145      macdrop2(lt_dollar, lt_error)
     123
     124
     125      end subr initlex;
       1 .=member altch
       2      subr altchar(s, c);
       3
       4$ this routine initializes the character 'c' as an alternate to the
       5$ string 's'.
       6
       7
       8      size s(.sds. 72);       $ string representing token
       9      size c(cs);             $ one character alternate for 's'
      10
      11      size hashlit(ps);       $ hashing function called
      12
      13
      14      .len. altchars = (.len. altchars) + 1;
      15      .ch. (.len. altchars), altchars = c;
      16
      17      altchar_tab(.len. altchars) = hashlit(s);
      18
      19
      20      end subr altchar;
       1 .=member getcard
       2      subr getcard;
       3
       4$ this routine reads the next card image and sets 'char' to the first
       5$ character. it also looks for title, eject, and include cards.
       6
       7$ the card image is read by the little library routine 'getinc' which
       8$ returns an array in r(cpw) format. we begin by converting it to
       9$ a self defining string
      10
      11$ card images are added to the listing file after they have been
      12$ parsed. this way the card image for 'proc p' can be used as a
      13$ title before it appears in the text.
      14
      15
      16      size ara(ws);   $ array read by getinc
      17      dims ara(wpc);
      18
      19      size j(ps),  $ loop index
      20           tab_str(.sds. 1),      $ tab character
      21           init_col(ps),          $ initial column for title
      22           p1(ps),  $ pointer to start of title
      23           p2(ps);  $ pointer to end of title
      24
      25      size str(.sds. cpc),   $ string containing 'list', etc.
      26           title(.sds. cpc); $ new listing title
      27      size anyc(ws);          $ little string primitive
      28
      29      access nstrulex;
      30
      31
      32      call put_card;          $ print previous card
      33
      34/loop/  $ process listing directives
      35
      36      call getinc(ara, 1, wpc, eof_flag);
      37      line_no = line_no + 1;
      38
      39      if eof_flag then  $ set 'card' to blanks and return
      40          card = ' ' .pad. cpc;
      41          return;
      42      end if;
      43
      44      do j = 1 to wpc;
      45          .f. card_org - j*ws, ws, card = ara(j);
      46      end do;
      47
      48      sorg card = card_org;
      49      slen card = wpc * cpw;
      50
      51$ look for '.' in column 2. if we find one look for the next blank
      52$ and find the string between the dot and the blank.
      53
      54      if card_char(2) = 1r. then
      55
      56          do j = 3 to cpc;
      57 .+mc         if (anyc(card_char(j), 2)) quit do;
      58 .+mc         $ anyc(..., 2) matches blank-equivalent separators
      59 .-mc         if (card_char(j) = 1r )    quit do;
      60          end do;
      61
      62          str = .s. 3, j-3, card;
      63 .+mc     call stpc(str);     $ convert to primary case
      64
      65          if str .seq. 'list' then  $ turn on listing
      66              list_flag = yes;
      67              go to loop;
      68
      69          elseif str .seq. 'nolist' then  $ turn off listing
      70              list_flag = no;
      71              go to loop;
      72
      73          elseif str .seq. 'eject' then  $ page eject
      74              if (list_flag) put, page;
      75              go to loop;
      76
      77          elseif str .seq. 'title' then  $ change title
      78
      79$ the title is surrounded by quotes. find it.
      80              p1 = '''' .in. card;
      81              if (p1 = 0) p1 = 1;
      82              card_char(p1) = 1r ;
      83
      84              p2 = '''' .in. card;
      85              if (p2 = 0) p2 = scpc;
      86              title = .s. p1+1, p2-p1-1, card;
      87
      88$ call little library to set title
      89              if cc_tab = 1r then
      90                  call etitlr(yes, '', 9, cpc+1);
      91                  call etitlr(yes, ' ', 5, 1);
      92              else
      93                  tab_str = ' ';
      94                  .ch. 1, tab_str = cc_tab;
      95                  call etitlr(yes, tab_str, 9, cpc);
      96                  call etitlr(yes, ' ', 5, 1);
      97              end if;
      98
      99              if seqf_flag then
     100                  init_col = 10;
     101              else
     102                  init_col = 2;
     103              end if;
     104
     105              call etitlr(yes, title, init_col, 0);
     106              if (list_flag) put, page;
     107
     108              go to loop;
     109          end if;
     110      end if;
     111
     112      is_listed  = no;        $ line has not been listed
     113      is_written = no;        $ line has not been written for optimiser
     114
     115      card_pos = 0;
     116      char     = 1r ;  $ nominal blank at end of line
     117
     118
     119      end subr getcard;
       1 .=member putcard
       2      subr put_card;
       3
       4$ this routine prints the current card image.
       5
       6      size j(ps);             $ loop index
       7      size temp1(ws);         $ temporaries for line/statement numbers
       8      size temp2(ws);
       9
      10      size anyc(ws);          $ little seek-character string primitive
      11      size rsps(ws);          $ little right-span string-set string prim
      12
      13      access nstrulex;
      14
      15 .+sq1.
      16      if ((slen ssm_title) ^= 0) & ^ is_written then
      17          $ (logically) 'delete' trailink blanks
      18          card_len = slen card;
      19          if is_blank((.ch. card_len, card)) then
      20              card_len = card_len - rsps(card, card_len, ss_separ);
      21          end if;
      22
      23          $ copy the line for the optimiser
      24          putbhdr(bt_string, card_len);
      25
      26          do j = 1 to (card_len + cpw - 1) / cpw;
      27              putbdat((.f. (sorg card)-j*ws, ws, card))
      28          end do;
      29
      30          is_written = yes;   $ line has been copied
      31      end if;
      32 ..sq1
      33      if (is_listed ! ^ list_flag) return;
      34
      35      is_listed = yes;
      36
      37      go to prcase(seqf_flag) in 0 to 3;
      38
      39/prcase(1)/                        $ line number only
      40
      41          put :line_no ,i(7);
      42          go to print_tab;
      43
      44
      45/prcase(2)/                        $ statement number only
      46
      47          put :stmt_count ,i(7);
      48          go to print_tab;
      49
      50
      51/prcase(3)/                        $ 3 digits of each
      52
      53          temp1 = mod(line_no,1000);
      54          if (temp1 = 0) call setdig(1, line_no/1000, 2, yes);
      55
      56          temp2 = mod(stmt_count, 1000);
      57          if (temp1 = 0)       call setdig(1, stmt_count/1000, 6, yes);
      58          if (stmt_count < 10) call etitlr(1, '   ', 6, 3);
      59
      60          put :temp1 ,i(3) ,x(1) :temp2 ,i(3);
      61
      62
      63/print_tab/
      64
      65
      66$ if a tab character is available, print it so that the following
      67$ will be aligned properly.
      68
      69      if cc_tab ^= 1r  then
      70          put: cc_tab, r(1);
      71      else
      72          put, column(9);
      73      end if;
      74
      75/prcase(0)/
      76
      77      put: (.s. 1, scpc, card), a;
      78
      79      if ( (.s. scpc+1, cpc-scpc, card)
      80          .sne. (' ' .pad.  cpc-scpc)) then
      81          put, ' . ': (.s. scpc+1, cpc-scpc, card), a;
      82      end if;
      83
      84      if m_flag then
      85          if cc_tab ^= 1r  then
      86              put, column(16+cpc);
      87          else
      88              put, column(11+cpc);
      89          end if;
      90
      91          put :cstmt_count ,i;
      92      end if;
      93
      94      put, skip;
      95
      96
      97      end subr put_card;
       1 .=member setdig
       2      subr setdig(p1, numb, from, blank_zer);
       3
       4      size p1(ps),
       5           numb(ws),
       6           from(ps),
       7           blank_zer(1);
       8
       9      size temp(.sds. 3),
      10           i(ps),
      11           num(ws),
      12           leng(ps);
      13
      14      temp = '   '; leng = 3;
      15
      16      num = numb;
      17
      18      do i = 3 to 1 by -1;
      19          if ( (num > 0) .or. ^ blank_zer) .ch. i, temp = mod(num,10)
      20                                                           + 1r0;
      21          num = num / 10;
      22      end do;
      23
      24      call etitlr(p1, temp, from, leng);
      25
      26      end subr setdig;
       1 .=member atitle
       2      subr atitle;
       3
       4$ this routine performs automatic titling.
       5
       6      access nstrulex;
       7
       8      size j(ps),  $ loop index
       9           tab_str(.sds. 1),  $ tab character
      10           init_col(ps),      $ starting column for heading
      11           len(ps);  $ length of title
      12
      13      if at_flag then
      14
      15$ find first non blank then character
      16          j = 1;
      17
      18          while 1;
      19              if (j > cpc) quit;
      20              if (.ch. j, card ^= 1r ) quit;
      21
      22              j = j + 1;
      23          end while;
      24
      25          len = scpc-j+1;
      26
      27          if cc_tab = 1r then
      28              call etitlr(yes,  '', 9, cpc+1);
      29              call etitlr(yes, ' ', 5, 1);
      30          else
      31              tab_str = ' ';
      32              .ch. 1, tab_str = cc_tab;
      33              call etitlr(yes, tab_str, 9, cpc);
      34              call etitlr(yes, ' ', 5, 1);
      35          end if;
      36
      37          if (seqf_flag) then
      38              init_col = 10;
      39          else
      40              init_col = 2;
      41          end if;
      42
      43          call etitlr(yes, (.s. j, len, card), init_col, len);
      44          put, page;
      45      end if;
      46
      47
      48      end subr atitle;
       1 .=member tokover
       2      subr tok_over;
       3
       4$ this routine is called when a token exceeds the maximum token length.
       5$ we print an error message and then reset tok_pos so that only the
       6$ second half of the token will be saved.
       7
       8      access nstrulex;
       9
      10      call ermsg(18, 0);
      11      tok_pos = 0;
      12
      13
      14      end subr tok_over;
       1 .=member bldgen
       2      fnct bldgen(n, first);
       3
       4$ this routine builds a token for a generated macro argument.
       5$ we allow each macro to have up to 26 generated arguments.
       6
       7$ we generate a token of the form
       8
       9$    gax.yyyyy
      10
      11$ where x is the n-th letter of the alphabet and yyyyy is the number of
      12$ macro expansions which have contained a token starting with 'gax'.
      13
      14$ first is true if this is the first token 'gax' we are generating
      15$ for this macro expansion. if first is true, we increment the
      16$ counter yyyyy.
      17
      18
      19      size n(ps),     $ number of generated argument
      20           first(1);  $ on for first occurrence of argument
      21
      22      size nn(ps),  $ local copy of n.
      23           j(ps),  $ loop index
      24           str(.sds. 9); $ string for token
      25
      26      size count(ps); $ array of counters yyyyy
      27      dims count(param_lim);
      28      data count = 0(param_lim);
      29
      30      +*  count_lim = 99999  ** $ maximum value for yyyyy
      31
      32
      33      size bldgen(ps),  $ value returned
      34           hash(ps);   $ hashing function
      35
      36
      37      if first then
      38          countup(count(n), count_lim, 'generated arguments');
      39      end if;
      40
      41      str = 'ga .     ';  $ initialize string
      42      .ch. 3, str = .ch. n, 'abcdefghijklmnopqrstuvwxyz';
      43
      44      nn = count(n);  $ copy counter for conversion to string
      45
      46      do j = 9 to 5 by -1;
      47          .ch. j, str = charofdig(mod(nn, 10));
      48          nn = nn/10;
      49      end do;
      50
      51      bldgen = hash(str);
      52      lex_typ(bldgen) = l_name;
      53
      54
      55      end fnct bldgen;
       1 .=member bugact
       2      subr bugact;
       3
       4$ this routine processes compiler debugging options.
       5
       6
       7      access parsens;
       8
       9
      10      if dbg_min<=key_val(parse_tok) & key_val(parse_tok)<=dbg_max then
      11
      12          go to case(key_val(parse_tok)) in dbg_min to dbg_max;
      13
      14
      15/case(dbg_ptrm0)/             $ turn macro processor trace off
      16              mt_flag = no;   go to esac;
      17
      18/case(dbg_ptrm1)/             $ turn macro processor trace on
      19              mt_flag = yes;  go to esac;
      20
      21/case(dbg_ptrp0)/             $ turn parse trace off
      22              pt_flag = no;   go to esac;
      23
      24/case(dbg_ptrp1)/             $ turn parse trace on
      25              pt_flag = yes;  go to esac;
      26
      27/case(dbg_ptrt0)/             $ turn token trace off
      28              tt_flag = no;   go to esac;
      29
      30/case(dbg_ptrt1)/             $ turn token trace on
      31              tt_flag = yes;  go to esac;
      32
      33/case(dbg_prsod)/             $ dump open-tokens
      34              call odump;     go to esac;
      35
      36/case(dbg_prspd)/             $ dump polish and xpolish strings
      37              call pdump;     go to esac;
      38
      39/case(dbg_prssd)/             $ dump parser dbgbol table
      40              call sdump;     go to esac;
      41
      42
      43/esac/
      44
      45
      46      end if;
      47
      48
      49      end subr bugact;
       1 .=member hash
       2      fnct hash(string);
       3
       4$ this routine hashes a token into symtab and returns a pointer to
       5$ it.
       6
       7$ self defining strings generally contain a few unused bits
       8$ whose value is undefined. 'hash' assumes that these bits
       9$ are all zero. this is true for strings produced by 'trulex',
      10$ but not for strings produced by '.s.', etc.
      11
      12$ strings whose extra bits are not guarenteed zero should
      13$ be hashed by calling 'hashlit'.
      14
      15
      16      size string(sds_sz);   $ token to be hashed
      17
      18      size hash(ps);   $ symtab pointer returned
      19
      20      size indx(ps),  $ index into hash headers
      21           len(ps),  $ slen of string
      22           org(ps),  $ sorg of string
      23           words(ps),  $ no. of words in new names entry
      24           nam(ps),  $ pointer to names
      25           hashc(ws),  $ hash code of token
      26           word(ws),   $ current word of token
      27           head(ps),  $ pointer to hash header
      28           j(ps);      $ loop index
      29
      30
      31$ we begin by making a names entry for the new token. we do this
      32$ now since it is easier to compare two names entries than a names
      33$ entry and a string. we do not adjust 'namesp' until we build
      34$ a new symtab entry.
      35
      36$ as we make the names entry, we compute the hash code as the
      37$ exclusive-or of all its words.
      38
      39      len = slen string;  $ length of token
      40      org = sorg string;  $ origin of token
      41
      42      words = org/ws;  $ number of words needed
      43      if (namesp + words > names_lim) call overfl('names');
      44
      45      hashc = 0;  $ hash code
      46
      47      do j = 1 to words;
      48          word = .f. 1 + (j-1)*ws, ws, string;
      49
      50          hashc = hashc .ex. word;
      51          names(namesp+j) = word;
      52      end do;
      53
      54      hashc = (.f. 1, ws/2, hashc) .ex. (.f. ws/2+1, ws/2, hashc);
      55
      56$ get a pointer into heads, and search the clash list.
      57      indx = mod(hashc, heads_lim) + 1;
      58      hash = heads(indx);
      59
      60      while hash ^= 0;
      61
      62          nam = name(hash);
      63
      64          do j = 1 to words;
      65              if (names(nam-1+j) ^= names(namesp+j)) go to nxt;
      66          end do;
      67
      68          return;
      69
      70      /nxt/
      71          hash = link(hash);
      72      end while;
      73
      74$ no match. get free symtab entry
      75      countup(symtabp, symtab_lim, 'symtab');
      76      hash = symtabp;
      77
      78$ fill in symtab entry
      79      symtab(hash) = 0;
      80      name(hash) = namesp + 1;
      81      namesp = namesp + words;
      82
      83$ add to front of clash list
      84      link(hash) = heads(indx);
      85      heads(indx) = hash;
      86
      87
      88      end fnct hash;
       1 .=member hashlit
       2      fnct hashlit(str1);
       3
       4$ this is a special hashing routine for processing literals.
       5$ it hashes strings which are not necessarily in the format
       6$ required by 'hash'. it also converts bold names to names
       7$ if we are not in the prefix stropping mode.
       8
       9
      10      size str1(.sds. 30);  $ original string
      11
      12      size hashlit(ps);   $ pointer returned
      13
      14      size str2(.sds. 30),  $ new string built
      15           p1(ps),   $ position in str1
      16           p2(ps),   $ position in str2
      17           j(ps),  $ loop index
      18           len(ps);  $ length of strings
      19
      20      size hash(ps);  $ hashing function
      21
      22
      23      p1 = 1;   $ initialize
      24      p2 = 1;
      25
      26      len = slen str1;
      27
      28      str2 = 0;
      29
      30      sorg str2 = .sds. len+1;
      31      slen str2 = len;
      32
      33      do j = 0 to len-1;
      34          .ch. p2+j, str2 = .ch. p1+j, str1;
      35      end do;
      36
      37      hashlit = hash(str2);
      38
      39
      40      end fnct hashlit;
       1 .=member stat1
       2      subr stat1;
       3
       4$ this routine resets the statement counter to 1.
       5
       6      cstmt_count = cstmt_count + 1;
       7      ustmt_count = cstmt_count;
       8      estmt_count = 0;
       9 .+sq1.
      10      if (slen ssm_title) then
      11          if cstmt_count = 1 then
      12              putbhdr(bt_map, 0)
      13          else
      14              putbhdr(bt_tuple, 1)
      15              putbhdr(bt_tuple, 1)
      16          end if;
      17
      18          putbhdr(bt_tuple, 0)
      19          putbhdr(bt_int, 1)   putbdat(cstmt_count)
      20          putbhdr(bt_tuple, 0)
      21      end if;
      22 ..sq1
      23
      24
      25      end subr stat1;
       1 .=member newstat
       2      subr new_stat;
       3
       4$ this routine is called at the start of each statement.
       5
       6      cstmt_count = cstmt_count + 1;
       7 .+sq1.
smfc  13$ assert 0 <= cstmt_count < li_dbas as defined in compl.lipkg.
       8      if (slen ssm_title) then
       9          if cstmt_count = 1 then
      10              put ,skip;            $ emit blank line
      11              call contlpr(27, yes);$ echo to the terminal
      12              put ,'*** setl compiler error (code po1) ***' ,skip;
      13              call contlpr(27, no); $ stop to echo to the terminal
      14              call ltlfin(1, 0);    $ abnormally terminate
      15          end if;
      16
      17          putbhdr(bt_tuple, 1)
      18          putbhdr(bt_tuple, 1)
      19
      20          putbhdr(bt_tuple, 0)
      21          putbhdr(bt_int, 1)   putbdat(cstmt_count)
      22          putbhdr(bt_tuple, 0)
      23      end if;
      24 ..sq1
      25
      26
      27      end subr new_stat;
       1 .=member back
       2      subr back(i, j);
       3
       4$ this routine backs up the scanner by i tokens and the polish string
       5$ by j nodes.
       6
       7$ i and j are both in the range 0-1.
       8
       9
      10      size i(ps),
      11           j(ps);
      12
      13      access backup;
      14
      15
      16      tok_out = tok_out - i;
      17      if (tok_out = 0) tok_out = tok_lim;
      18
      19      polp = polp - j;
      20
      21
      22      end subr back;
       1 .=member backx
       2      subr backx;
       3
       4$ this routine backs up the auxiliary string one node.
       5
       6      xpolp = xpolp - 1;
       7
       8
       9      end subr backx;
       1 .=member copyx
       2      subr copyx;
       3
       4$ this routine copies the last node on the polish string file onto
       5$ the auxiliary file.
       6
       7      if (xpolp = xpol_lim) call putxtab;
       8      xpolp = xpolp + 1;
       9
      10      xpolish(xpolp) = polish(polp);
      11
      12
      13      end subr copyx;
       1 .=member mx
       2      subr mx(node);
       3
       4$ this routine writes a marker onto the auxiliary string.
       5
       6      size node(ps);  $ marker p_xxx
       7
       8      putxp(node, pol_mark);
       9
      10
      11      end subr mx;
       1 .=member stackexp
       2      subr stackexp;
       3
       4$ this is the first of three routine which make up an operator precedenc
       5$ parser for expressions.
       6
       7$ the precedence parser does not handle all expressions, but rather thos
       8$ expressions made up of binary and unary operators.
       9
      10$ when the top down parser recognizes operators and operands it immed-
      11$ iately writes them onto the polish string. each time it recognizes an
      12$ operator it calls the precedence parser which removes the operator
      13$ from the string, places it on a stack, and replaces it with any
      14$ previous operators which have higher precedence.
      15
      16$ since expressions can be nested, we must have some way of delimiting
      17$ the stack entries associated with each expression. we do this by
      18$ calling 'stackexp' at the start of each expression. we push a zero
      19$ word onto the stack. this has the effect both of separating the
      20$ entries for different expressions and of pushing an entry of
      21$ minimal precedence on the stack.
      22
      23$ the precedence stack is called 'ostack' and has three fields:
      24
      25$ oname:      names pointer for operator
      26$ omark:      a marker, one of p_bin, p_ubin, p_un, p_uun.
      27$ oprec:      the precedence.
      28
      29$ the marker node is emitted after the operator itself.
      30
      31$ the variable 'ostackp' points to the last entry in ostack.
      32
      33
      34      nameset ostack;
      35
      36      +*  ostack_lim  =  20  **
      37
suna  42          +* ostack_sz  =
suna  43 .+r32        64
suna  44 .+r36        32
suna  45 .+s66        32
suna  46              **
      41
      42          size ostack(ostack_sz);
      43          dims ostack(ostack_lim);
      44
      45          size ostackp(ps);
      46          data ostackp = 0;
      47
suna  47 .+r32.
suna  48          +*  oname(i)  =  .f. 01, 32, ostack(i)  **
suna  49          +*  omark(i)  =  .f. 33, 08, ostack(i)  **
suna  50          +*  oprec(i)  =  .f. 41, 08, ostack(i)  **
suna  51 ..r32
suna  52 .+r36.
suna  53          +*  oname(i)  =  .f. 01, 16, ostack(i)  **
suna  54          +*  omark(i)  =  .f. 17, 08, ostack(i)  **
suna  55          +*  oprec(i)  =  .f. 25, 08, ostack(i)  **
suna  56 ..r36
suna  57 .+s66.
suna  58          +*  oname(i)  =  .f. 01, 16, ostack(i)  **
suna  59          +*  omark(i)  =  .f. 17, 08, ostack(i)  **
suna  60          +*  oprec(i)  =  .f. 25, 08, ostack(i)  **
suna  61 ..s66
      62
      63      end nameset;
      64
      65
      66      countup(ostackp, ostack_lim, 'ostack');
      67      ostack(ostackp) = 0;
      68
      69
      70      end subr stackexp;
       1 .=member opprec
       2      subr opprec(m);
       3
       4$ this routine processes binary and unary operators used in expressions.
       5$
       6$ when it is called, the parser has examined the next token to see
       7$ whether the operator is the prefix of an on-the-fly assignment or
       8$ a compound operator.  therefore the binary operator we are interested
       9$ is the current last token in the token buffer, which may or may not be
      10$ identical to parse_tok.  rather than backing up the scanner and polish
      11$ to re-parse the operator (once we found it not to be part of a more
      12$ complex form) in order to reset parse_tok, we access tok_buff directly
      13$
      14$ 'm' is a marker node associated with the operator. by examining
      15$ m, we can tell whether the operator is binary, unary, etc.
      16
      17$ if the operator is unary, we simply make a new ostack entry.
      18
      19$ if the operator is binary, we must first emit all previous
      20$ operators whose precedence with greater or equal precedence.
      21$ after this we push the new operator.
      22
      23$ note that as we pop the stack, we never check whether it is
      24$ empty. this is unnecessary since stackexp always makes an
      25$ ostack entry with zero precedence. when we reach this
      26$ entry, we will always quit the loop.
      27
      28
      29      size m(ps);  $ marker node
      30
      31      size nam(ps),   $ name of operator
      32           prec(ps),  $ its precedence
      33           bin(1);    $ flags binary operator
      34
      35      access ostack;
      36      access parsens;
      37      access backup;
      38
      39
      40      parse_tok = tok_buff(tok_out);
      41      polp = polp - 1;
      42      nam  = name(parse_tok);
      43
      44$ find precedence and set 'bin' to flag binary operators.
      45
      46      if m = p_bin then   $ system binary
      47          prec = key_val(parse_tok);
      48          bin  = yes;
      49
      50      elseif m = p_un then $ system unary
      51          prec = key_val(parse_tok);
      52          if (lex_typ(parse_tok) = l_unbin) prec = prec_un;
      53
      54          bin  = no;
      55
      56      elseif m = p_ubin then  $ user-defined binary
      57          prec = prec_ubin;
      58          bin  = yes;
      59
      60      else     $ user unary
      61          prec = prec_un;
      62          bin  = no;
      63      end if;
      64
      65$ if this is a binary operator, pop all previous operators with
      66$ greater or equal precedence.
      67      if bin then
      68          while oprec(ostackp) >= prec;
      69              putp(oname(ostackp), pol_name);  $ write name
      70              putp(omark(ostackp), pol_mark);  $ write marker
      71
      72              ostackp = ostackp-1;  $ pop stack
      73          end while;
      74      end if;
      75
      76$ push new operator.
      77      countup(ostackp, ostack_lim, 'ostack');
      78      oname(ostackp) = nam;
      79      omark(ostackp) = m;
      80      oprec(ostackp) = prec;
      81
      82
      83      end subr opprec;
       1 .=member popop
       2      subr popop;
       3
       4$ this routine is called after we have seen a user defined niladic
       5$ operator. when we first parsed the operator, we assumed it was
       6$ unary, and pushed it onto ostack. we now pop it, and write it
       7$ onto the polish string.
       8
       9
      10      access ostack;
      11
      12
      13      putp(oname(ostackp), pol_name);
      14
      15      ostackp = ostackp - 1;
      16
      17
      18      end subr popop;
       1 .=member popop1
       2      subr popop1;
       3
       4$ this routine is like popop, but simply throws away the top
       5$ ostack entry.
       6
       7
       8      access ostack;
       9
      10
      11      ostackp = ostackp - 1;
      12
      13
      14      end subr popop1;
       1 .=member popexp
       2      subr popexp;
       3
       4$ this routine is called at the end of each expression. we pop
       5$ and emit all the remaining operators in the expression, then
       6$ pop the zero word left by 'stackexp'.
       7
       8
       9      access ostack;
      10
      11
      12      while ostack(ostackp) ^= 0;
      13          putp(oname(ostackp), pol_name);  $ write name
      14          putp(omark(ostackp), pol_mark);  $ write marker
      15
      16          ostackp = ostackp - 1;  $ pop stack
      17      end while;
      18
      19      ostackp = ostackp - 1;  $ pop zero word
      20
      21
      22      end subr popexp;
       1 .=member opentoks
       2      subr opentoks(iter, skip);
       3
       4$ this routine is called at the start of each control statement.
       5$ it collects information which is necessary to process 'end',
       6$ 'quit', and 'cont' statements.
       7
       8$ 1. a counter 'open_count' giving the number of control
       9$    statements which are currently open.
      10
      11$ 2. a bit vector 'is_iter' which indicates which control
      12$    statements are iterators. this bit vector is indexed
      13$    by values of open_count.
      14
      15$ 3. an array 'opntoks' containing the first few tokens
      16$    of each control statement.
      17
      18$ at this point we increment open_count, set is_iter, and
      19$ collect tokens in opntoks. we collect tokens by calling
      20$ gettok. when we are done we restore the token buffer to
      21$ its original status so that the tokens we have saved will
      22$ be passed onto the parser.
      23
      24$ the parameters to the routine are:
      25
      26$ iter:    indicates that we are opening an iterator
      27$ skip:    indicates that we are to skip the current token
      28
      29
      30      nameset opntoks;  $ nameset used to save tokens for openers
      31
      32          +*  save_lim     =  05  **  $ tokens saved/opener
      33          +*  open_lim     =  20  **  $ maximum openers
      34          +*  opntoks_lim  =  (save_lim * open_lim)  **  $ opntoks limit
      35
      36          size open_count(ps);  $ number of openers
      37          data open_count = 0;
      38
      39          +*  is_iter(p)  =  .f. p, 1, itervect  **
      40          size itervect(open_lim);
      41
      42          size opntoks(ps);   $ array of tokens
      43          dims opntoks(opntoks_lim);
      44
      45          size opntoksp(ps);   $ pointer to last saved token
      46          data opntoksp = 0;
      47
      48      end nameset;
      49
      50      size iter(1),  $ indicates opening iterator
      51           skip(1);  $ indicates skipping current token
      52
smfb 146      size tok(ps);           $ last token read
smfb 147      size temp(ps);          $ saved value of 'tok_out'
smfb 148      size j(ps);             $ loop index
      55
      56      size gettok(ps);   $ scanner function
      57
      58      access backup;
      59
      60
      61$ increment open_count and set is_iter
      62
      63      countup(open_count, opntoks_lim, 'openers');
      64      is_iter(open_count) = iter;
      65
      66      temp = tok_out;  $ save place in token buffer
      67
      68      if ^ skip then  $ save current token
      69          countup(opntoksp, opntoks_lim, 'opntoks');
      70          opntoks(opntoksp) = tok_buff(tok_out);
      71      end if;
      72
      73$ save remaining tokens
      74      do j = 1 to save_lim - 1 + skip;
      75          countup(opntoksp, opntoks_lim, 'opntoks');
      76
smfb 149          tok = tok_buff(tok_out);
smfb 150
smfb 151          if tok = sym_semi ! tok = sym_then !
smfb 152                  tok = sym_rp ! tok = sym_do then
smfb 153              $ note that since we do not consume a new token from the
smfb 154              $ input, 'tok_buff(tok_out)' and 'tok' will not change,
smfb 155              $ and this loop indeed will set all remaining entries of
smfb 156              $ 'opntoks' to zero.
smfb 157              opntoks(opntoksp) = 0;
      81          else
      82              opntoks(opntoksp) = gettok(0);
      83          end if;
      84      end do;
      85
      86      tok_out = temp;  $ reset token buffer
      87
      88
      89      end subr opentoks;
       1 .=member endtoks
       2      subr endtoks;
       3
       4$ this routine is called after seeing the keyword 'end'. we compare
       5$ the tokens which follow it with those of the most recent opener.
       6
       7$ on exit the token buffer is positioned so that the next token
       8$ will be the semicolon.
       9
      10
      11      size toks(ps);          $ array of tokens after '.end'
      12      dims toks(save_lim);
      13
      14      size toksp(ps),         $ pointer to last entry in toks
      15           tok(ps),           $ current token
      16           org(ps),           $ origin in opntoks
      17           lev(ps),           $ level of enders
      18           temp1(ps),         $ entry value of tok_out
      19           temp2(ps),         $ temporary used for swap
      20           i(ps),             $ loop index
      21           j(ps);             $ loop index
      22
      23      size gettok(ps);        $ scanner routine
      24
      25      access backup, opntoks, parsens;
      26
      27
      28      temp1 = tok_out;        $ save place in token buffer
      29
      30      toksp = 0;              $ collect tokens to next semicolon
      31      while 1;
smfb 158          tok = gettok(0); if (eof_flag) call eofr;
smfb 159          if (tok = sym_semi) quit;
smfb 160          if tok = sym_then ! tok = sym_rp ! tok = sym_do then
smfb 161              until tok = sym_semi;
smfb 162                  tok = gettok(0); if (eof_flag) call eofr;
smfb 163              end until;
smfb 164              quit while;
smfb 165          end if;
      36
      37          if toksp < save_lim then $ save token
      38              toksp = toksp + 1;
      39              toks(toksp) = tok;
      40          end if;
      41      end while;
      42
      43      tok_out = tok_out - 1;  $ return semicolon
      44      if (tok_out = 0) tok_out = tok_lim;
      45
      46      if toksp = 0 then       $ no tokens,
      47          call poptoks;       $     reset opntoksp, open_count
      48          return;             $     and return.
      49      end if;
      50
      51      $ iterate over the openers, looking for a match.
      52
      53      lev = 0;                $ nesting level
      54      do i = open_count to 1 by -1;
      55          lev = lev + 1;
      56          org = (i-1) * save_lim;  $ origin in opntoks
      57
      58          do j = 1 to toksp;  $ compare tokens
      59              if (toks(j) ^= opntoks(org+j)) cont do i;
      60          end do;
      61
      62          go to found;
      63      end do;
      64
      65      temp2   = tok_out;
      66      tok_out = temp1;
      67
      68      call ermsg(15, 0);
      69
      70      tok_out = temp2;
      71      call poptoks;
      72
      73      return;
      74
      75
      76/found/
      77
      78      call poptoks;           $ reset opntoksp, open_count
      79
      80      if (lev = 1) return;    $ found correct match
      81
      82$ at this point we found a match, but are missing one or more
      83$ -.end-s.  we recover by pretending we have not yet seen the
      84$ current -.end-.
      85
      86      tok_out = temp1;
      87
      88      call ermsg(16, 0);
      89
      90      tok_out = tok_out - 1;
      91      if (tok_out = 0) tok_out = tok_lim;
      92
      93      parsep = parseerrloc;
      94
      95      return;
      96
      97
      98      end subr endtoks;
       1 .=member poptoks
       2      subr poptoks;
       3
       4$ this routine is called at the end of a control statement, after we
       5$ have made any necessary checks for matching tokens. it pops
       6$ opntoks and resets open_count.
       7
       8
       9      access opntoks;
      10
      11
      12      opntoksp = opntoksp - save_lim;
      13      open_count = open_count - 1;
      14
      15
      16      end subr poptoks;
       1 .=member chktoks
       2      subr checktoks(dft);
       3
       4$ this routine compares the tokens after the keywords 'cont' and
       5$ 'quit' to find out which loop they refer to.
       6
       7$ we write out a counter to indicate which loop to cont(quit).
       8$ this is done as follows:
       9
      10$ 1. if the keyword cont(quit) is followed by a semicolon, we
      11$    write out a default, given by the parameter 'dft'.
      12
      13$ 2. otherwise we search opntoks starting with the most recent
      14$    entry until we find one which matches, keeping a counter
      15$    of the number of loop entries we look at, then use this
      16$    counter.
      17
      18$ on exit the token buffer is positioned so that the next token
      19$ will be the semicolon.
      20
      21
      22      size dft(ps);  $ default counter value
      23
      24      size toks(ps);  $ array of tokens after 'cont'
      25      dims toks(save_lim);
      26
      27      size toksp(ps), $ pointer to last entry in toks
      28           tok(ps),  $ current token
      29           org(ps),     $ origin in opntoks
      30           lev(ps),   $ level of loop nesting
      31           i(ps),  $ loop index
      32           j(ps);     $ loop index
      33
      34      size gettok(ps);   $ scanner routine
      35
      36      access opntoks;
      37
      38
      39$ we begin by collecting tokens up to a semicolon, and putting them
      40$ in the array 'toks'.
      41
      42      toksp = 0;
      43
      44      while 1;
smfb 166          tok = gettok(0); if (eof_flag) call eofr;
smfb 167          if (tok = sym_semi) quit;
smfb 168          if tok = sym_then ! tok = sym_rp ! tok = sym_do then
smfb 169              until tok = sym_semi;
smfb 170                  tok = gettok(0); if (eof_flag) call eofr;
smfb 171              end until;
smfb 172              quit while;
smfb 173          end if;
      49
      50          if toksp < save_lim then $ save token
      51              toksp = toksp + 1;
      52              toks(toksp) = tok;
      53          end if;
      54      end while;
      55
      56$ return semicolon
      57      call back(1, 0);
      58
      59$ if there were no tokens, write out a zero onto the polish string and
      60$ return.
      61
      62      if toksp = 0 then
smfa  87          until 1;
smfa  88              do i = open_count to 1 by -1;
smfa  89                  if (is_iter(i)) quit until 1;
smfa  90              end do;
smfa  91              call ermsg(21, 0);
smfa  92          end until;
      63          putp(dft, pol_count);
      64          return;
      65      end if;
      66
      67$ otherwise iterate over the openers, looking for a match.
      68
      69      lev = 0;  $ nesting level of loops
      70
      71      do i = open_count to 1 by -1;
      72          if (^ is_iter(i)) cont;
      73
      74          lev = lev + 1;
      75          org = (i-1) * save_lim;  $ origin in opntoks
      76
      77          do j = 1 to toksp;  $ compare tokens
      78              if (toks(j) ^= opntoks(org+j)) cont do i;
      79          end do;
      80
      81$ found match. write out count and return.
      82
      83          putp(lev, pol_count);
      84
      85          return;
      86      end do;
      87
      88$ no match found - issue message and write out a zero.
      89
      90      call ermsg(17, 0);
      91
      92      putp(dft, pol_count);
      93
      94
      95      end subr checktoks;
       1 .=member svtab
       2      trace entry;
       3
       4      subr svtab;
       5
       6$ this routine saves the values of mttab, namesp, and symtabp
       7$ so that we can later free the space used by local names.
       8$ of the scope.
       9
      10      nameset svtabns;
      11      +*  sv_lim  =  10  ** $ dimension of sv.
      12
smfb 176 .+r32    size sv(96);
suna  62 .+r36    size sv(72);
smfb 177 .+s66    size sv(60);
      19          dims sv(sv_lim);
      20
      21 .+s66.
      22          +*  sv_symtabp(i)  =  .f. 01, 16, sv(i)  **
      23          +*  sv_namesp(i)   =  .f. 17, 16, sv(i)  **
      24          +*  sv_mtabp(i)    =  .f. 33, 16, sv(i)  **
      25 ..s66
      26
smfb 178 .+r32.
      34          +*  sv_symtabp(i)  =  .f. 01, 32, sv(i)  **
      35          +*  sv_namesp(i)   =  .f. 33, 32, sv(i)  **
      36          +*  sv_mtabp(i)    =  .f. 65, 32, sv(i)  **
smfb 179 ..r32
      43
suna  63 .+r36.
      45          +*  sv_symtabp(i)  =  .f. 01, 18, sv(i)  **
      46          +*  sv_namesp(i)   =  .f. 19, 18, sv(i)  **
      47          +*  sv_mtabp(i)    =  .f. 37, 18, sv(i)  **
suna  64 ..r36
      55
      56
      57          size svp(ps);
      58          data svp = 0;
      59      end nameset;
      60
      61      countup(svp, sv_lim, 'sv');
      62
      63      sv_symtabp(svp) = symtabp;
      64      sv_namesp(svp)  = namesp;
      65      sv_mtabp(svp)   = mtabp;
      66
      68
      69      end subr svtab;
       1 .=member retab
       2      subr retab;
       3
       4$ this routine resets mttabp, namesp, and symtabp to their saved
       5$ values, thus freeing a block of table space.
       6
       7      access svtabns;
       8
       9      size j(ps),    $ loop index
      10           p(ps); $ symtab pointer
      11
      12      symtabp = sv_symtabp(svp);
      13      namesp  = sv_namesp(svp);
      14      mtabp   = sv_mtabp(svp);
      15
      16      svp = svp - 1;
      17
      18$ delete all the freed symtab entries from the clash lists.
      19      do j = 1 to heads_lim;
      20          p = heads(j);
      21
      22          while p > symtabp;
      23              p = link(p);
      24          end while;
      25
      26          heads(j) = p;
      27      end do;
      28
      29      return;
      30
      31      end subr retab;
       1 .=member puttabs
       2      subr puttbs;
       3
       4$ this routine writes out both polish strings.
       5
       6      if (sd_flag) call sdump;  $ dump symbol table
       7
       8      call puttab;            $ write polish string
       9      call putxtab;           $ write xpolish string
      10
      11
      12      end subr puttbs;
       1 .=member puttab
       2      subr puttab;
       3
       4$ this routine writes out the polish string. the polish string has
       5$ a somewhat different external format from its internal one.
       6$ markers and counters are written out exactly as they are stored
       7$ in core howwever names have:
       8
       9$ pol_typ:     pol_name
      10$ pol_val:     number of words in 'names' entry
      11
      12$ followed by the actual names entry.
      13
      14$ puttab may be called while we are saving parser states. in this
      15$ case we write out the part of the string we are not saving, move
      16$ the remaining nodes towards the front of the string and adjust
      17$ the pointers from sstack to the polish string. if we cannot write
      18$ anything out then the string is full, and we abort.
      19
      20      size last(ps),   $ last node we can write
      21           j(ps),         $ loop index
      22           nam(ps),  $ names pointer
      23           words(ps);  $ number of words
      24
      25      access backup;
      26
      27
      28      if (pd_flag) call pdump;
      29
      30$ find last node to write
      31
      32      if sstackp = 0 then
      33          last = polp;
      34
      35      else
      36          last = s_polp(1)-1;
      37          if (last = 0) call overfl('polish');
      38      end if;
      39 .-qp.
      40
      41$ write string
      42      do j = 1 to last;
      43          if pol_typ(j) = pol_name then
      44              nam   = pol_val(j);
      45              words = n_sorg(nam)/ws;
      46
      47              pol_val(j) = words;
      48
      49              write pol_file,
      50                  polish(j), names(nam) to names(nam+words-1);
      51
      52          else
      53              write pol_file, polish(j);
      54          end if;
      55      end do;
      56
      57$ move all remaining nodes to front of string then adjust sstack
      58      do j = last+1 to polp;
      59          polish(j-last) = polish(j);
      60      end do;
      61
      62      do j = 1 to sstackp;
      63          s_polp(j) = s_polp(j) - last;
      64      end do;
      65
      66 ..qp
      67      polp = polp - last;     $ reset polish pointer
      68
      69
      70      end subr puttab;
      71
      72      notrace entry;
      73
       1 .=member putxtab
       2      subr putxtab;
       3
       4$ this routine writes out the xpolish string. the xpolish string has
       5$ a somewhat different external format from its internal one.
       6$ markers and counters are written out exactly as they are stored
       7$ in core howwever names have:
       8
       9$ xpol_typ:     pol_name
      10$ xpol_val:     number of words in 'names' entry
      11
      12$ followed by the actual names entry.
      13
      14$ putxtab may be called while we are saving parser states. in this
      15$ case we write out the part of the string we are not saving, move
      16$ the remaining nodes towards the front of the string and adjust
      17$ the pointers from sstack to the xpolish string. if we cannot write
      18$ anything out then the string is full, and we abort.
      19
      20      size last(ps),   $ last node we can write
      21           j(ps),         $ loop index
      22           nam(ps),  $ names pointer
      23           words(ps);  $ number of words
      24
      25      access backup;
      26
      27
      28      if (pd_flag) call xpdump;
      29
      30$ find last node to write
      31
      32      if sstackp = 0 then
      33          last = xpolp;
      34
      35      else
      36          last = s_xpolp(1)-1;
      37          if (last = 0) call overfl('xpolish');
      38      end if;
      39
      40 .-qp.
      41$ write string
      42      do j = 1 to last;
      43          if xpol_typ(j) = pol_name then
      44              nam   = xpol_val(j);
      45              words = n_sorg(nam)/ws;
      46
      47              xpol_val(j) = words;
      48
      49              write xpol_file,
      50                  xpolish(j), names(nam) to names(nam+words-1);
      51
      52          else
      53              write xpol_file, xpolish(j);
      54          end if;
      55      end do;
      56
      57$ move all remaining nodes to front of string then adjust sstack
      58      do j = last+1 to xpolp;
      59          xpolish(j-last) = xpolish(j);
      60      end do;
      61
      62 ..qp
      63      do j = 1 to sstackp;
      64          s_xpolp(j) = s_xpolp(j) - last;
      65      end do;
      66
      67      xpolp = xpolp - last;   $ reset aux. polish pointer
      68
      69
      70      end subr putxtab;
      71
      72      notrace entry;
      73
       1 .=member symsds
       2      fnct symsds(p);
       3
       4$ this routine returns the name of a symbol table entry as a self
       5$ defining string.
       6
       7
       8      size p(ps);   $ symbol table pointer
       9
      10      size symsds(sds_sz),
      11           namesds(sds_sz); $ gets string from names ptr
      12
      13
      14      if p = 0 then
      15          symsds = '';
      16      else
      17          symsds = namesds(name(p));
      18      end if;
      19
      20
      21      end fnct symsds;
       1 .=member namesds
       2      fnct namesds(nam);
       3
       4$ this routine converts a names entry to an sds string.
       5
       6      size nam(ps);  $ pointer to names entry
       7
       8      size namesds(sds_sz);
       9
      10      size j(ps),  $ loop index
      11           words(ps);  $ number of words in names entry
      12
      13      if nam = 0 then  $ missing names entry
      14          namesds = '';
      15          return;
      16      end if;
      17
      18      words = n_sorg(nam)/ws;
      19
      20      do j = 0 to words-1;
      21          .f. 1+j*ws, ws, namesds = names(nam+j);
      22      end do;
      23
      24
      25      end fnct namesds;
       1 .=member ermsg
       2      subr ermsg(n, nam);
       3
       4$ this routine prints lexical error messages. 'n' is the message
       5$ number and 'nam' is an optional symbol table pointer to the
       6$ name which caused the error.
       7
       8
       9      size n(ps),  $ error number
      10           nam(ps);  $ symtab pointer
      11
      12      size string(.sds. 80);  $ message text
      13
      14
smfa  93      go to case(n) in 1 to 21;
      16
      17      +*  et(n, msg) =
      18          /case(n)/   string = msg;  go to esac;
      19          **
      20
      21      et(01, 'macro argument list to begin with -(-');
      22      et(02, 'to have proper number of arguments');
      23      et(03, 'to be valid macro name');
      24      et(04, 'to be defined only once');
      25      et(05, '-(- in macro parameter list');
      26      et(06, 'to appear only once in macro parameter list');
      27      et(07, 'only one -;- in macro parameter list');
      28      et(08, '-,- in macro parameter list');
      29      et(09, '-;- after -endm-');
      30      et(10, '-endm-. macro exceeds maximum length');
      31      et(11, 'macro not to occur within itself');
      32      et(12, 'to be a valid macro name');
      33      et(13, '-,- in macro list');
      34      et(14, 'valid starting character for token');
      35      et(15, 'matching tokens after -end-.')
      36      et(16, 'matching tokens after -end-; attempt to recover ' .cc.
      37             'by inserting missing -end-(s).')
      38      et(17, 'matching tokens after -quit- or -cont-');
      39      et(18, 'token to be less than 128 characters');
      40      et(19, 'end of file after last program member');
      41      et(20, '= after +: or -:.');
smfa  94      et(21, 'open loop to -continue- or -quit-')
      42
      43
      44/esac/
      45
      46      call put_err(n, nam, string);
      47
      48
      49      macdrop(et)
      50
      51      end subr ermsg;
       1 .=member eofr
       2      subr eofr;
       3
       4$ this routine is called when the end of the input file is
       5$ encounterd during a macro definition, etc.
       6
       7$ we print an error message and abort
       8
       9      call put_card;          $ print current line
      10      put, skip;              $ skip to next line
      11      call contlpr(27, yes);  $ start to echo to the terminal
      12 .+s10    put, '?';           $ emit s10 error message marker
      13 .+s20    put, '?';           $ emit s10 error message marker
      14      put,                    $ print message
      15          column(5),
      16          '*** compilation terminated by unexpected end-of-file ***',
      17          skip;
      18      call contlpr(27, no);   $ stop to echo to the terminal
      19
      20      call trcbak;            $ print current token stream
      21
      22      call prstrm(yes);
      23
      24      end subr eofr;
       1 .=member ermet
       2      subr ermet(n);
       3
       4$ this rouine processes syntactic errors. it prints an error message
       5$ then adjusts the parser in an attempt to recover from the error.
       6
       7$ the routine begins with a 'case' statement which maps the
       8$ error number into a message and a recovery mode. at the end of
       9$ the 'case' statement we print the message and perform the
      10$ appropriate recovery actions.
      11
      12$ there are 4  error recovery modes:
      13
      14$ 1. standard errors
      15
      16$    here we simply skip to the next semicolon and branch to
      17$    the grammar's  production.
      18
      19$ 2. errors in expressions
      20
      21$   these are treated like standard errors, except that we stop
      22$   when we see one of ';', ')', ']', '\', 'then', or 'end'.
      23
      24$ 3. errors in declarations.
      25
      26$    these generally occor in the context
      27
      28$ (1)        =     
      29$ (2)        =   
      30
      31$    when we fail to recognize a declaration we think we have
      32$    found an entire , return to line (1), and look
      33$    for a . if we don't find a  we advance the token
      34$    file to the next declaration keyword and backup the
      35$    parser to search for a .
      36
      37$ 4. special errors
      38
      39$    these are errors which are handled by separate error
      40$    recovery routines, so we simply return.
      41
      42 .=zzyorg z
      43
      44      defc(rec_std)
      45      defc(rec_nst)
      46      defc(rec_end)
      47      defc(rec_exp)
      48      defc(rec_dcl)
      49      defc(rec_spc)
      50
      51
      52      size n(ps);  $ error number
      53
      54      size string(.sds. 80),  $ message text
      55           mode(ps),  $ recovery mode
      56           tok(ps);  $ next token
      57
      58$ the array 'enders' gives all tokens which can end an expression.
      59
      60      +*  enders_lim  =  09  **  $ number of enders
      61
      62      size init(1);  $ initilization flag
      63      data init = no;
      64
      65
      66      size enders(.sds. 5);
      67      dims enders(enders_lim);
      68
      69      size j(ps),         $ loop index
      70           str(sds_sz);   $ next token as string
      71
      72      data enders =
      73           ')', '>>', '/)', ',', ';', 'then', 'elseif', 'else', 'end';
      74
      75      +* okey_lim = 11 **
      76      size okeywds(.sds. 9);
      77      dims okeywds(okey_lim);
      78      data okeywds =  'procedure',  'proc',  'operator', 'op',
      79                      'case',       'else',  'elseif',
      80                      'end',        '(',     'loop',     'if';
      81
      82
      83      size symsds(sds_sz),  $ returns name of token
      84           hashlit(ps),  $ hashes literals
      85           gettok(ps);   $ scanner interface
      86
      87      access parsens,
      88          opntoks,
      89             backup;
      90
      91
      92      if ^ init then  $ initialize tables
      93          init = yes;
      94
      95          do j = 1 to enders_lim;
      96              enders(j) = hashlit(enders(j));
      97          end do;
      98          do j = 1 to okey_lim;
      99              okeywds(j) = hashlit(okeywds(j));
     100          end do;
     101      end if;
     102
     103      go to case(n) in 0 to 98;
     104
     105      +*  et(n, m, str) =
     106          /case(n)/
     107              mode = m;
     108              string = str;
     109              go to esac;
     110          **
     111
     112      et(00, rec_std, 'valid error message')
     113      et(01, rec_std, 'semicolon')
     114      et(02, rec_std, 'left parenthesis')
     115      et(03, rec_std, 'right parenthesis')
     116      et(04, rec_std, 'left set brace')
     117      et(05, rec_std, 'right set brace')
     118      et(06, rec_std, 'left tuple bracket')
     119      et(07, rec_std, 'right tuple bracket')
     120      et(08, rec_std, 'colon')
     121      et(09, rec_std, 'assignment operator')
     122      et(10, rec_std, 'dash after directory name')
     123      et(11, rec_std, 'directory name in directory statement')
     124      et(12, rec_std, 'directory or program name in program statement')
     125      et(13, rec_std, 'program name in program statement')
     126      et(14, rec_std, 'library name in library statement')
     127      et(15, rec_std, 'directory name in module statement')
     128      et(16, rec_std, 'module name in module statement')
     129      et(17, rec_std, 'main program description in directory')
     130      et(18, rec_std, 'routine definition')
     131      et(19, rec_std, 'namelist after access right')
     132      et(20, rec_std, 'valid variable number of arguments option')
     133      et(21, rec_std, 'procedure description')
     134      et(22, rec_std, 'directory name in main program description')
     135      et(23, rec_std, 'program name in program description')
     136      et(24, rec_std, 'directory name in module description')
     137      et(25, rec_std, 'module name in module description')
     138      et(26, rec_nst, 'main program.')
     139      et(27, rec_std, 'name in declaration')
     140      et(28, rec_std, 'constant right-hand side in declaration')
     141      et(29, rec_std, 'keyword -back-')
     142      et(30, rec_std, 'mode descriptor')
     143      et(31, rec_std, 'mode name')
     144      et(32, rec_std, '-base- to follow -plex-')
     145      et(33, rec_std, 'list of base names')
     146      et(34, rec_std, 'mode descriptor after base type')
     147      et(35, rec_std, 'valid range definition for integer mode')
     148      et(36, rec_std, 'base name after -elmt-')
     149      et(37, rec_std, 'mode descriptor for tuple range')
     150      et(38, rec_std, 'constant length for homogeneous tuple')
     151      et(39, rec_std, 'mode descriptor for set elements')
     152      et(40, rec_std, 'mode descriptor for multi-valued map range')
     153      et(41, rec_std, 'mode descriptor for map domain elements')
     154      et(42, rec_std, 'procedure name')
     155      et(43, rec_std, 'operator name')
     156      et(44, rec_std, 'formal parameter name')
     157      et(45, rec_std, 'statement')
     158      et(46, rec_std, 'boolean expression in assertion')
     159      et(47, rec_std, 'right-hand side')
     160      et(48, rec_std, 'restricted expression after -from-, ...')
     161      et(49, rec_std, 'expression after -case-')
     162      et(50, rec_nst, 'keyword -of-')
     163      et(51, rec_std, 'case tag')
     164      et(52, rec_std, 'constant expression after -arb- in case tag')
     165      et(53, rec_std, 'debug option')
     166      et(54, rec_std, 'label name after -goto-')
     167      et(55, rec_std, 'boolean expression after -if-')
     168      et(56, rec_nst, 'keyword - then')
     169      et(57, rec_std, 'boolean expression after -elseif-')
     170      et(58, rec_nst, 'loop iterator')
     171      et(59, rec_nst, 'keyword -do- or -)-')
smfb 180      et(60, rec_std, 'expression or semicolon after -return-')
     173      et(61, rec_std, 'trace option')
     174      et(62, rec_std, 'expression after -yield-')
     175      et(63, rec_std, 'keyword -end-')
     176      et(64, rec_exp, 'expression after assignment operator')
     177      et(65, rec_exp, 'valid expression after -from-, ...')
     178      et(66, rec_exp, 'factor after compound operator')
     179      et(67, rec_exp, 'term after assigning binary operator')
     180      et(68, rec_exp, 'term after binary operator')
     181      et(69, rec_exp, 'iterator list')
     182      et(70, rec_exp, 'constraint after iterator list')
     183      et(71, rec_exp, 'term after unary operator')
     184      et(72, rec_exp, 'boolean expression after keyword -case-')
     185      et(73, rec_exp, 'keyword -of-')
     186      et(74, rec_exp, 'case tag')
     187      et(75, rec_exp, 'keyword -else-')
     188      et(76, rec_exp, 'expression after -else-')
     189      et(77, rec_exp, 'expression after case tag')
     190      et(78, rec_exp, 'boolean expression after keyword -if-')
     191      et(79, rec_exp, 'keyword -then-')
     192      et(80, rec_exp, 'expression after keyword -then-')
     193      et(81, rec_exp, 'boolean expression after keyword -elseif-')
     194      et(82, rec_exp, 'expression')
     195      et(83, rec_exp, 'left-hand side or dash')
     196      et(84, rec_exp, 'expression as index or routine parameter')
     197      et(85, rec_std, 'iterator')
     198      et(86, rec_std, 'boolean expression after keyword -while-')
     199      et(87, rec_std, 'boolean expression after keyword -where-')
     200      et(88, rec_std, 'boolean expression after keyword -until-')
     201      et(89, rec_std, 'name')
     202      et(90, rec_std, 'valid block structure')
     203      et(91, rec_std, 'valid statement')
     204      et(92, rec_nst, 'valid procedure definition.')
     205      et(93, rec_std, 'valid data structure representation statement')
     206      et(94, rec_spc, 'colon in assignment')
     207      et(95, rec_exp, '-/- after compound operator')
     208      et(96, rec_std, 'operator name')
     209      et(97, rec_std, 'integer or real denotation')
     210      et(98, rec_end, 'keyword -end-' )
     211
     212
     213/esac/
     214
     215      call put_err(n, 0, string);  $ print message
     216
     217      go to rcase(mode) in rec_std to rec_spc;
     218
     219/rcase(rec_std)/      $ standard error
     220
     221$ advance the scanner to the next semicolon and set the parser to
     222$ .
     223
     224      until tok = sym_semi;
     225          tok = gettok(0);
     226          if (eof_flag) call eofr;;
     227      end until;
     228
     229      parsep = parseerrloc;
     230
     231      return;
     232
     233/rcase(rec_nst)/
     234
     235      $ scan to next semicolon, not past -okey- statement.
     236      until tok = sym_semi;
     237          tok = gettok(0);   if (eof_flag) call eofr;
     238
     239          do j = 1 to okey_lim;
     240              if tok = okeywds(j) then
     241                  call back(1,0);   quit until;
     242              end if;
     243          end do;
     244      end until;
     245
     246      $ parser next looks for a statement block
     247      parse_ok = yes;   parsep = parsep + 1;   return;
     248
     249/rcase(rec_end)/
     250
     251      $ error recovery for missing end statement.
     252      $ since the current token might begin a new procedure, must
     253      $ not scan past.
     254      call contlpr(27, yes);  $ start to echo to the terminal
     255      call put_card;
     256      put,column(14), 'closing statment ';
     257      do j = 1 to save_lim;
     258          put: symsds(opntoks(opntoksp-save_lim + j)), a, x(1);
     259      end do;
     260      put, skip;
     261      call contlpr(27, no);   $ stop to echo to the terminal
     262
     263      until tok = sym_semi;
     264          tok = gettok(0);   if (eof_flag) call eofr;
     265          do j = 1 to 4;
     266              if tok = okeywds(j) then
     267                  call back(1, 0);   quit until;
     268              end if;
     269          end do;
     270      end until;
     271
     272      call poptoks;           $ since we are closing an opener
     273      parsep = parseerrloc;
     274      return;
     275
     276/rcase(rec_exp)/     $ error in expression
     277
     278      while 1;
     279          tok = gettok(0);
     280          if (eof_flag) call eofr;
     281
     282          do j = 1 to enders_lim;
     283              if (tok = enders(j)) quit while;
     284          end do;
     285      end while;
     286
     287      tok_out = tok_out-1;  $ return last token
     288      if (tok_out = 0) tok_out = tok_lim;
     289
     290      parsep = parseerrloc;
     291
     292      return;
     293
     294/rcase(rec_dcl)/     $ error in declaration
     295
     296      while 1;    $ advance token file
     297          tok = gettok(0);
     298
     299          if (lex_typ(tok) = l_dkey) quit;
     300          if (eof_flag) call eofr;
     301      end while;
     302
     303      tok_out = tok_out-1;  $ return last token
     304      if (tok_out = 0) tok_out = tok_lim;
     305
     306$ back up parser two  operations.
     307
     308      j = 0;
     309
     310      until j = 2;
     311          parsep = parsep-1;
     312          if (pt_op(parsep) = po_sub) j = j+1;
     313      end until;
     314
     315      putp(p_error, pol_mark);
     316
     317      return;
     318
     319/rcase(rec_spc)/     $ simply return
     320
     321
     322
     323      macdrop(et)
     324
     325      end subr ermet;
       1 .=member puterr
       2      subr put_err(n, nam, str);
       3
       4$ this routine prints error messages. error messages have the
       5$ form:
       6
       7$ *** error n: expect nam str ***
       8
       9$ nam is either zero or a symbol table pointer, while str is a
      10$ string.
      11
      12$ we print a maximum of 1 error message per line of input.
      13
      14      size n(ps),             $ error number
      15           nam(ps),           $ symbol table pointer
      16           str(sds_sz);       $ error message text
      17
      18      size j(ps),             $ loop index
      19           p(ps);             $ pointer into token buffer
      20
      21      size cur_tok(ps),       $ current token
      22           tok_sds(sds_sz),   $ current token as sds
      23           tok_len(ps),       $ length of cur_tok
      24           line_pos(ps),      $ position in current line
      25           mrk_line(sds_sz),  $ underline string
      26           mrk_flag(1);       $ on if tok_out <= p <= tok_in
      27
      28      size err_line(ps);      $ line number of last error
      29      data err_line = 0;
      30
      31      size symsds(sds_sz);    $ returns symbol as sds
      32
      33      access backup,
      34             nstrulex;
      35
      36
      37      if (err_line = line_no) return;
      38      err_line = line_no;
      39
      40      call contlpr(27, yes);  $ start to echo to terminal
      41
      42      call put_card;          $ print last line
      43
      44 .+s10    put, '?';           $ emit standard s10 error character
      45 .+s20    put, '?';           $ emit standard s10 error character
      46
      47      put, column(5), '******** error ': n, i, ': expect ';
      48      if (nam ^= 0) put: symsds(nam), a, x(1);
      49      put: str, a;
      50      if (m_flag) put, column(19+cpc), '*****';
      51      put, skip;
      52
      53      call trcbak;            $ print current token stream
      54
      55      error_count = error_count + 1;
      56
      57      if error_count > pel then
      58          put, skip, '**** parse error limit exceeded ***', skip;
      59          call prstrm(yes);
      60      end if;
      61
      62      call contlpr(27, no);   $ stop to echo to terminal
      63
      64
      65      end subr put_err;
       1 .=member trcbak
       2      subr trcbak;
       3$
       4$ this routine prints the input token trace back.
       5$
       6$ we print the token stream from -tok_out- minus -etok_lim- through
       7$ -tok_in-, if there are -etok_lim- tokens before -tok_out- in the
       8$ buffer.  we underline each token from -tom_out- through -tok_in,
       9$ using an equal sign for reserved, and a minus sign for unre-
      10$ served tokens.
      11$
      12      size cur_tok(ps);       $ current token
      13      size tok_sds(sds_sz);   $ current token as sds
      14      size tok_len(ps);       $ length of cur_tok
      15      size line_pos(ps);      $ position in current line
      16      size mrk_line(sds_sz);  $ underline string
      17      size mrk_flag(1);       $ on if tok_out <= p <= tok_in
      18
      19      size j(ps);             $ loop index
      20      size p(ps);             $ pointer into token buffer
      21
      22      size symsds(sds_sz);    $ returns symbol as sds
      23
      24      access backup;
      25      access nstrulex;
      26
      27
      28      if (tok_out = 0) return;    $ no tokens in buffer
      29
      30      call contlpr(27, yes);  $ start to echo to terminal
      31
      32      p = tok_out;
      33      do j = 1 to etok_lim + 1;
      34          p = p - 1;
      35          if (p = 0) p = tok_lim;
      36
      37          if p = tok_in then  $ less than -etok_lim- tokens in buffer
      38              p = p + 1;   if (p > tok_lim) p = 1;
      39          end if;
      40      end do;
      41
      42      put, column(14), 'parsing: ';   line_pos = 23;
      43      mrk_line = '' .pad. 9;  $ nb. .len. tok_line = 9
      44      mrk_flag = no;
      45
      46      until p = tok_in;
      47          p = p + 1;
      48          if (p > tok_lim) p = 1;
      49
      50          cur_tok = tok_buff(p);
      51          tok_sds = symsds(cur_tok);
      52          tok_len = .len. tok_sds;
      53
      54          if (mrk_flag = yes) & (tok_len = 0) then
      55              tok_sds = '?????';   tok_len = 5;
      56          end if;
      57
      58          if line_pos + tok_len + 1 > 132 then
      59              put, skip, column(14): mrk_line, a, skip, column(14);
      60              line_pos = 14;
      61
      62              mrk_line = '';
      63          end if;
      64
      65          put, ' ': tok_sds, a;   line_pos = line_pos + 1 + tok_len;
      66
      67          if mrk_flag then    $ underline blank
      68              mrk_line = mrk_line .cc. '-';
      69          else
      70              mrk_line = mrk_line .cc. ' ';
      71          end if;
      72
      73          if (p = tok_out) mrk_flag = yes;
      74
      75          do j = 1 to tok_len; $ underline token
      76              if mrk_flag then
      77                  if lex_typ(cur_tok) > l_string !
      78                              lit_val(cur_tok) ^= 0 then
      79                      mrk_line = mrk_line .cc. '='; $ reserved token
      80                  else
      81                      mrk_line = mrk_line .cc. '-'; $ unreserved token
      82                  end if;
      83              else
      84                  mrk_line = mrk_line .cc. ' ';
      85              end if mrk_flag;
      86          end do j;
      87      end until;
      88
      89      put, skip, column(14): mrk_line, a, skip;
      90
      91      call contlpr(27, no);   $ stop to echo to terminal
      92
      93
      94      end subr trcbak;
       1 .=member chkprs
       2      subr chkprs;
       3
       4$ this routine is called when we have parsed the sentence symbol.
       5
       6$ we check whether we are at the end of the input file.  if so, we
       7$ finished parsing, and merely return.  otherwise, some error must
       8$ have caused us to assume an incorrect block structure; we recover
       9$ by restarting the parser with the production 
      10
      11
      12      size tok(ps),           $ next token
      13           j(ps);             $ loop index
      14
      15      size first(1);          $ on if aleady initialized
      16      data first = yes;
      17
      18      size endnames(.sds. 10);  $ 'end' tokens as strings
      19      dims endnames(17);
      20
      21      data endnames =
      22           'procedure', 'proc', 'operator', 'op',
      23           'end', '(', ')', 'do', ',', 'else', 'elseif',
      24           'doing', 'while', 'where', 'step', 'until', 'term';
      25
      26      size endtoks(ps);
      27      dims endtoks(17);
      28
      29      size hashlit(ps),       $ hashes a literal
      30           gettok(ps);        $ returns the next token
      31
      32      access parsens, backup;
      33
      34
      35      if first then           $ initialize endtoks
      36          first = no;
      37
      38          do j = 1 to 17;
      39              endtoks(j) = hashlit(endnames(j));
      40          end do;
      41      end if;
      42
      43      tok = gettok(0);        $ check for end-of-file
      44      if (eof_flag) return;
      45
      46      do j = 1 to 17;
      47          if tok = endtoks(j) then $ found unexpected end, ...
      48          call ermet(91);
      49              tp_flag = yes;  $ terminate after parse
      50                               $ skip to next semicolon
      51              tok = gettok(0); $ check for end-of-file
      52              if (eof_flag) return;
      53
      54              quit do j;
      55          end if tok;
      56      end do j;
      57
      58      tok_out = tok_out - 1;  $ return token
      59      if (tok_out = 0) tok_out = tok_lim;
      60
      61      parsep   = parseerrmploc;
      62      parse_ok = yes;
      63
      64
      65      end subr chkprs;
       1 .=member chkst
       2      subr chkst;
       3
       4$ this routine is called when the parser looks for a 
       5$ and fails to find one. there are two possibilities:
       6
       7$ 1. the next token is 'end', or some other token which can
       8$    normally follow a series of statements.
       9
      10$ 2. otherwise we assume that the next token begins an invalid
      11$    statement, in which case we issue an error message.
      12
      13$ a token can follow a series of statements if either:
      14
      15$ 1. it is a bold name.
      16$ 2. it is in the set 'endtoks', below.
      17
      18      size tok(ps),   $ next token
      19           j(ps);     $ loop index
      20
      21      size init(1);   $ on if aleady initialized
      22      data init = no;
      23
      24      size endnames(.sds. 10);  $ 'end' tokens as strings
      25      dims endnames(17);
      26
      27      data endnames =
      28          '(',
      29           'procedure', 'proc', 'operator', 'op',
      30           'end', ')', 'do', ',', 'else', 'elseif',
      31           'doing', 'while', 'where', 'step', 'until', 'term';
      32
      33      size endtoks(ps);
      34      dims endtoks(16);
      35
      36      size hashlit(ps),   $ hashes a literal
      37           gettok(ps);    $ gets a token
      38
      39      access parsens, backup;
      40
      41      if init = no then     $ initialize endtoks
      42          init = yes;
      43
      44      do j = 1 to 17;
      45              endtoks(j) = hashlit(endnames(j));
      46          end do;
      47      end if;
      48
      49
      50$ see if next token can follow a series of statements.
      51      tok = gettok(0);
      52
      53      do j = 1 to 17;
      54          if (tok = endtoks(j)) go to pass;
      55      end do;
      56
      57/fail/      $ tok can't follow a block
      58
      59      call ermet(91);
      60      parse_ok = yes;
      61
      62      return;
      63
      64/pass/     $ valid token
      65
      66      tok_out = tok_out - 1;   $ return tok
      67      if (tok_out = 0) tok_out = tok_lim;
      68
      69      return;
      70
      71      end subr chkst;
       1 .=member errend
       2      subr errend;
       3
       4$ this routine is called generally when an extra end
       5$ is found in the program.
       6
       7      size savepp(ps);        $ save parsep location
       8      access parsens;
       9
      10
      11      savepp = parsep;
      12      call ermet(92);
      13      parsep = savepp;
      14
      15
      16      end subr errend;
       1 .=member scanend
       2      subr scanend;
       3
       4$ this routine simply scans ahead to the next semicolon after
       5$ an end statement is found.  it is called for error recovery.
       6
       7      size tok(ps);           $ token
       8      size gettok(ps);        $ returns the next token
       9
      10
      11      until tok = sym_semi;
      12          tok = gettok(0);   if (eof_flag) call eofr;
      13      end until;
      14
      15      call back(1, 0);
      16
      17
      18      end subr scanend;
       1 .=member chkbody
       1 .=member chkrepr
       2      subr chkrepr;
       3
       4$ this routine is similar to chkbody, but is called after we
       5$ fail to find a .
       6
       7$ 1. the next token is 'end'
       8$ 2. the next token is the start of an invalid repr.
       9
      10
      11      size tok(ps);   $ next token
      12
      13      size gettok(ps);        $ returns next token
      14
      15      access parsens, backup;
      16
      17
      18      tok = gettok(0);
      19
      20      if tok = sym_end then
      21          tok_out = tok_out - 1;
      22          if (tok_out = 0) tok_out = tok_lim;
      23      else
      24          call ermet(93);
      25          parse_ok = yes;
      26      end if;
      27
      28
      29      end subr chkrepr;
       1 .=member badasn
       2      subr badasn;
       3
       4$ this routine is called after seeing a statement of the form
       5$ "x = ..." where the colon has been left out of the assignment
       6$ symbol.
       7
       8$ we issue an error message then set up the token buffer so that the
       9$ next token seen will be ':='.
      10
      11
      12      access parsens, backup;
      13
      14      call ermet(75);
      15
      16      tok_out = tok_out - 1;
      17      if (tok_out = 0) tok_out = tok_lim;
      18
      19      tok_buff(tok_out) = sym_asn;
      20
      21
      22      end subr badasn;
       1 .=member sdump
       2      subr sdump;
       3 .-qp.
       4
       5$ this routine dumps the symbol table. the dump is formatted
       6$ in columns; every 'lines_lim' lines we print a series of
       7$ column headings.
       8
       9$ we print the lexical types of tokens using the 'synlexmap'
      10$ created by 'syn'.
      11
      12
      13      +*  lines_max  =  20  **
      14
      15      size j(ps),    $ loop indx
      16           str(sds_sz),  $ token name as sds
      17           tp(.sds. 20),  $ lexical type as ads
      18           lines(ps);    $ number of lines since last dump
      19
      20      size init(1);   $ flags first dump
      21      data init = yes;
      22
      23      size ltyps(.sds. 20);  $ array of lexical types
      24      dims ltyps(l_max);
      25
      26      +*  synlexmap(a, b)  =  data ltyps(b) = a;  **
      27
      28 .=include 'synlex'   $ include map from lexical codes to names
      29
      30      size symsds(sds_sz); $ returns name of symbol
      31
      32
      33$ if this is the first call, we must go over ltyps, trimming
      34$ each name to five characters.
      35
      36      if init then
      37          init = no;
      38
      39          do j = l_min to l_max;
      40              if (.len. ltyps(j) > 5) ltyps(j) = .s. 1, 5, ltyps(j);
      41          end do;
      42      end if;
      43
      44      put, skip(2), column(7), 's y m t a b    d u m p', skip(2);
      45      lines = lines_max;   $ to force new header
      46
      47      do j = 1 to symtabp;
      48          lines = lines + 1;
      49
      50          if lines > lines_max then  $ print header
      51              lines = 1;
      52
      53              put, skip(2), column(7),
      54                   'index     name     lxtyp  kyval  ltval',
      55                   '  link  morg  mcode  margs',
      56                   skip, column(7),
      57                   '--------------------------------------',
      58                   '--------------------------',
      59                   skip;
      60          end if;
      61
      62          str = symsds(j);  $ get token name
      63          if (.len. str > 10) .len. str = 10;
      64
      65          if lex_typ(j) = 0 then
      66              tp = '0';
      67          else
      68              tp = ltyps(lex_typ(j));
      69          end if;
      70
      71          put, column(07):  j, i,
      72               column(14):  str, a,
      73               column(26):  tp,  a,
      74               column(33):  key_val(j), i,
      75               column(40):  lit_val(j), i,
      76               column(47):  link(j),    i,
      77               column(54):  morg(j),    i,
      78               column(61):  mcode(j),   i,
      79               column(68):  margs(j),   i,
      80               skip;
      81
      82      end do;
      83
      84      put, skip(2), column(7),  'end symtab dump', skip(2);
      85
      86
      87 ..qp
      88      end subr sdump;
       1 .=member pdump
       2      subr pdump;
       3 .-qp.
       4
       5$ this routine dumps the polish string. we print nodes one after
       6$ another in standard columns. the handling of tabs is driven
       7$ by an array 'postab' which contains the various tabs.
       8
       9
      10      +*  pos_lim  =  4  **  $ maximum nodes/line
      11
      12      size postab(ps);
      13      dims postab(pos_lim);
      14
      15      data postab = 7, 22, 37, 52;
      16
      17      size pos(ps),   $ index into postab
      18           j(ps),     $ loop index
      19           typ(ps),  $ type of current node
      20           val(ps),  $ value of current node
      21           str(sds_sz);  $ token in sds form
      22
      23$ the names of the polish string markers are obtained from the
      24$ output of 'syn'. they are kept in a nameset so that they can
      25$ be accessed by both polish string dump routines.
      26
      27      nameset mark_names;
      28          size mark_name(.sds. 20);
      29          dims mark_name(parseimpmax);
      30
      31          +*  synmarkmap(a, b)  =  data mark_name(a) = b;  **
      32
      33 .=include synmark
      34
      35      end nameset;
      36
      37      size namesds(sds_sz);  $ gives name of token
      38
      39
      40      put, skip(2), column(7), 'p o l i s h    s t r i n g    d u m p',
      41           skip(2);
      42
      43      pos = 0;   $ column position
      44
      45      do j = 1 to polp;
      46          pos = pos + 1;
      47
      48          if pos > pos_lim then  $ end of line
      49              put, skip;
      50              pos = 1;
      51          end if;
      52
      53          put, column(postab(pos));  $ position in proper column
      54
      55          typ = pol_typ(j);
      56          val = pol_val(j);
      57
      58          go to case(typ) in pol_min to pol_max;
      59
      60      /case(pol_name)/
      61
      62          str = namesds(val);   $ get name as sds
      63          if (.len. str > 10) .len. str = 10;
      64
      65          put: str, a;
      66          cont;
      67
      68      /case(pol_count)/
      69
      70          put: val, i;
      71          cont;
      72
      73      /case(pol_mark)/
      74
      75          put: mark_name(val), a;
      76          cont do j;
      77
      78      /case(pol_end)/
      79
      80          put, skip(2), column(7), 'end of polish file', skip;
      81          cont do j;
      82
      83      end do;
      84
      85      put, skip(2), column(7), 'end polish dump', skip;
      86
      87
      88      macdrop(pos_lim)
      89
      90
      91 ..qp
      92      end subr pdump;
       1 .=member xpdump
       2      subr xpdump;
       3 .-qp.
       4
       5$ this routine is similar to pdump, but dumps the auxiliary string.
       6
       7
       8      +*  pos_lim  =  4  **  $ maximum nodes/line
       9
      10      size postab(ps);
      11      dims postab(pos_lim);
      12
      13      data postab = 7, 22, 37, 52;
      14
      15      size pos(ps),   $ index into postab
      16           j(ps),     $ loop index
      17           typ(ps),  $ type of current node
      18           val(ps),  $ value of current node
      19           str(sds_sz);  $ token in sds form
      20
      21      access mark_names;
      22
      23      size namesds(sds_sz);  $ gives name of token
      24
      25
      26      put, skip(2), column(7), 'x p o l    s t r i n g    d u m p',
      27           skip(2);
      28
      29      pos = 0;   $ column position
      30
      31      do j = 1 to xpolp;
      32          pos = pos + 1;
      33
      34          if pos > pos_lim then  $ end of line
      35              put, skip;
      36              pos = 1;
      37          end if;
      38
      39          put, column(postab(pos));  $ position in proper column
      40
      41          typ = xpol_typ(j);
      42          val = xpol_val(j);
      43
      44          go to case(typ) in pol_min to pol_max;
      45
      46      /case(pol_name)/
      47
      48          str = namesds(val);   $ get name as sds
      49          if (.len. str > 10) .len. str = 10;
      50
      51          put: str, a;
      52          cont;
      53
      54      /case(pol_count)/
      55
      56          put: val, i;
      57          cont;
      58
      59      /case(pol_mark)/
      60
      61          put: mark_name(val), a;
      62          cont do j;
      63
      64      /case(pol_end)/
      65
      66          put, skip(2), column(7), 'end of xpolish file', skip;
      67          cont do j;
      68
      69      end do;
      70
      71      put, skip(2), column(7), 'end xpolish dump', skip;
      72
      73
      74      macdrop(pos_lim)
      75
      76
      77 ..qp
      78      end subr xpdump;
       1 .=member odump
       2      subr odump;
       3 .-qp.
       4
       5$ this routine dumps opntoks. we simply print a series of names.
       6
       7
       8      size j(ps);   $ loop index
       9
      10      size symsds(sds_sz);  $ returns token name
      11
      12      access opntoks;
      13
      14
      15      put, skip(2), column(7), 'o p n t o k s    d u m p', skip(2);
      16
      17      do j = 1 to opntoksp;
      18          put, column(7): symsds(opntoks(j)), a, skip;
      19      end do;
      20
      21
      22 ..qp
      23      end subr odump;
       1 .=member overfl
       2      subr overfl(msg);
       3
       4$ this routine is called when a compiler array overflows. we print
       5$ a message and abort.
       6
       7
       8      size msg(.sds. 100);
       9
      10
      11      put ,skip;              $ emit blank line
      12
      13      call contlpr(27, yes);  $ start to echo to terminal
      14      put, '*** compiler table overflow - ': msg, a, ' ***', skip;
      15      call contlpr(27, no);   $ stop to echo to terminal
      16
      17      put, skip;
      18
      19      call prstrm(yes);
      20
      21
      22      end subr overfl;
       1 .=member usratp
       2      subr usratp;
       3
       4$ this routine is called if the system detects a fatal error.
       5
       6      put ,skip(2) ,'*** fatal error detected by system ***' ,skip(2);
       7
       8      if (^ et_flag) return;
       9
      10      call sdump;
      11      call pdump;
      12      call xpdump;
      13      call odump;
      14
      15
      16      end subr usratp;
       1 .=member prstrm
       2      subr prstrm(abnormal);
       3
       4$ this is the main termination routine. 'abnormal' is true for
       5$ abnormal termination.
       6
       7
       8      size abnormal(1);
       9
      10      size tok(ps);  $ next token on input file
      11
      12$ the following nameset is used to pass the number of lines in the
      13$ source to the overlay executive so that it can compute statistics.
      14
      15      nameset lexcard;
      16          size line_tot(ps);
      17      end nameset;
      18
      19
sunb  39      if lcs_flag then  $ print statistics
sunb  40          put ,skip;  $ emit blank line
      21
sunb  41          if error_count = 0 then
sunb  42              put, 'no errors were detected.', skip;
sunb  43          else
sunb  44              put, 'number of errors detected = ': error_count, i, skip;
sunb  45          end if;
sunb  46      end if;
      27
      28      if abnormal then        $ abort
sunb  47          if (lcs_flag) put ,skip ,'abnormal termination.' ,skip;
      30
      31          call ltlfin(1, 0);
      32
      33      elseif tp_flag then     $ terminate
sunb  48          if (lcs_flag) put ,skip ,'compilation terminated.' ,skip;
      35
      36          call ltlfin(0, 0);
      37
      38      else
sunb  49          if (lcs_flag) put ,skip ,'normal termination.' ,skip;
      40
      41          $ write end-of-file markers
      42          putp(0, pol_end);   putxp(0, pol_end);
      43 .-qp     call puttbs;        $ write remainder of tables
      44
smfa  95 .+sq1.
      46          if (slen ssm_title) then
      47              putbhdr(bt_tuple, 1)
      48              putbhdr(bt_tuple, 1)
      49              putbhdr(bt_map,   1)
      50              file ssm_file access = release;
      51          end if;
      52 ..sq1
      53
      54          file pol_file  access = release;  $ close all scratch files
      55          file xpol_file access = release;
      56          file syn_file  access = release;
      57
      58          line_tot = line_no;  $ pass to overlay exec
      59
      60 .-qp     call ltlterm(1, 0);
      61 .+qp     call ltlfin (0, 0);
      62      end if;
      63
      64
      65      end subr prstrm;
Software Preservation Group

Sections

Personal tools

Document Actions

PRS: Lexical scanner, macro processor, and parser; first pass of the SETL compiler.