This is the target repository for the SER515 MeetMe SCORE project
- parser_base_main.py – This module is the output of grako parser generation and contains two classes, MyParser and MySemantics. MySemantics is overridden to traverse the tree and implement customized handling for the translation phase. MyParser is used to instantiate a parser and parse the input with the help of the customized MySemantics.
- exp_parser.py – The module has a class ExpSemantics and a global function translate().
- ExpSemantics – Overrides the function corresponding to each rule in the grammar inherited from MySemantics, which enables the parser to use custom semantics. The functions in ExpSemantics can modify and format the AST, whose interpretation then happens in the model module; for example, CleanUp is applied in some functions to get rid of blank statements and redundant members.
- translate() –
  1. Instantiates a Parser generated by grako.
  2. Runs the preprocessor on the input file (given by the input parameter) to make it compatible with the Parser.
  3. Passes the custom Semantics class to the Parser object, which enables it to use the custom semantics.
  4. Writes the processed output to the output file.
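A minimal sketch of this flow, assuming the generated MyParser class and the usual grako `parse()` call; `preprocess()` is a hypothetical placeholder for the real preprocessing step, and the actual module may differ in details:

```python
from parser_base_main import MyParser
from exp_parser import ExpSemantics

def preprocess(text):
    # Placeholder for the real preprocessing step described above
    return text

def translate(input_file, output_file):
    # Read the TestMgr list and preprocess it so the grako parser accepts it
    with open(input_file) as f:
        text = preprocess(f.read())

    # Parsing with the custom semantics interprets each rule as it is matched
    parser = MyParser()
    translated = parser.parse(text, rule_name='start', semantics=ExpSemantics())

    # Write the translated Python output
    with open(output_file, 'w') as f:
        f.write(str(translated))
```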
The translator is the stage where the semantics come into the picture, since the syntax analysis is done once the AST is generated. The tree is traversed in a DFS manner, each node in the AST is interpreted according to the Semantics class provided (ExpSemantics here, for lists2helix), and ultimately, as the root is reached, the collective output of all the nodes below can be processed to produce the translated Python code. The translation phase using grako can be summarized in the following steps:
- Create a PEG grammar that represents the syntax to be parsed. A Parser and a Semantics class are generated from the input grammar by grako.
- Use the Parser class generated by grako, passing it the custom Semantics class.
- Define a custom Semantics class which implements a function for each rule; each function takes the AST as input and returns its interpretation.
- Define a custom class corresponding to each rule which has a template and subclasses the Model/Renderer base. This class is used by the Semantics functions to interpret the input AST and return the interpretation to the next rule (a sketch follows).
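As a rough illustration of the last two steps, a semantics method for a hypothetical `assign` rule might hand the AST to a model class that fills a template. The rule name, class names and the exact `render_fields` signature here are assumptions, not the actual lists2helix code (whose classes subclass the grako Model/Renderer base):

```python
# exp_parser.py / model.py (sketch)
from parser_base_main import MySemantics   # base semantics generated by grako

class AssignStmt:
    # Template for the translated Python statement
    template = '{lhs} = {rhs}'

    def __init__(self, ast):
        self.ast = ast

    def render_fields(self, fields):
        # Interpret the AST node and fill in the fields the template needs
        fields.update(lhs=self.ast.lhs, rhs=self.ast.rhs)

    def render(self):
        fields = {}
        self.render_fields(fields)
        return self.template.format(**fields)

class ExpSemantics(MySemantics):
    def assign(self, ast):
        # Return the rendered interpretation so the enclosing rule can use it
        return AssignStmt(ast).render()
```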
Some of the prominent features of the translation phase are as follows:
- Variable resolution
- Variable resolution is done in process_var() of the BaseStartStmt class and check_var_usage() of the start class for parent files, and in process_var_libl*() and the constructor of the libl_start class for included lists and libls.
- Variables are either defined or referred to somewhere in the code, which splits the semantics into three concerns: variable definition, variable reference and variable scope validation.
- Variable Definition –
  - Variable definition is mainly done by a declare statement in the TestMgr List syntax.
  - A variable definition in Python, however, needs a context, so resolution of the context to be used is deferred.
  - To signify that a particular variable is defined, the variable is prefixed with “$@var_def”. This is done in the constructor of expression; only the LHS variable receives this prefix. Nodes which are indexed are also prefixed and replaced with the context later at the root node.
  - The variable is resolved later, as the traversal reaches the root node, where the processing knows whether the particular statement is in the setup, testcase or teardown section. The prefix $@var_def is then replaced with its context.
  - Context resolution is done by the following replacements (a sketch of the substitution follows this list):
    - Testcase – $@var_def = self.(nodes), or no context (for variables)
    - Setup – $@var_def = cls.env.(nodes), or cls. (for variables)
    - Teardown – $@var_def = cls.env.(nodes), or cls. (for variables)
    - Included files – all variables are local and no context prefix is prepended
    - Nodes – env.nodes[]
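A minimal sketch of how the deferred $@var_def prefix might be resolved once the section is known. The context strings follow the table above; the helper name and the node/variable distinction by prefix are assumptions for illustration:

```python
import re

# Context substituted for the $@var_def prefix, per section (see table above)
VAR_CONTEXT  = {'testcase': '',      'setup': 'cls.',     'teardown': 'cls.',     'include': ''}
NODE_CONTEXT = {'testcase': 'self.', 'setup': 'cls.env.', 'teardown': 'cls.env.', 'include': ''}

def resolve_var_def(line, section):
    """Hypothetical helper: replace the deferred $@var_def marker with the
    context for the section the statement turned out to be in."""
    def repl(match):
        name = match.group(1)
        # Node addressing (e.g. node[0]) gets the env context; plain
        # variables get the class/instance context for the section.
        context = NODE_CONTEXT[section] if name.startswith('node') else VAR_CONTEXT[section]
        return context + name
    return re.sub(r'\$@var_def(\S+)', repl, line)

# resolve_var_def('$@var_defa = 1', 'setup')    ->  'cls.a = 1'
# resolve_var_def('$@var_defnode[0]', 'setup')  ->  'cls.env.node[0]'
```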
- Variable Reference –
- A “name” in the lists syntax grammar denotes the use of a variable. Therefore, when it is interpreted, a $@var_ prefix is added to denote that it’s a variable for later processing.
- Therefore, all variables used in the lists carry the $@var_ prefix, which is resolved at the root once the context is known by replacing the prefix with the context.
- The function find_replace_var() in the BaseStartStmt class also parses the RHS of an assignment statement and the command of a task statement for variable references, as these cannot readily be parsed into strings by the grammar.
- Such variable references are prefixed with $@replace_ and are later replaced by the context according to the section they appear in (see the sketch below).
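A hedged sketch of that marking step; the helper name is hypothetical, and only the tagging of `$$name` references is shown:

```python
import re

def mark_string_vars(text):
    """Sketch: tag $$variable references found in RHS strings or task
    commands with the $@replace_ marker so the context can be applied
    later, once the enclosing section is known."""
    return re.sub(r'\$\$(\w+)', r'$@replace_\1', text)

# mark_string_vars('echo "$$variable is $$value"')
#   -> 'echo "$@replace_variable is $@replace_value"'
# At the root these markers are resolved, ultimately producing something like
#   "{variable} is {value}".format(**...)
```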
- Variable Scope Validation –
- We assume that the Test list runs successfully with the TestMgr framework, so it is implied that there will not be any variable reference which is not defined. Such a case can still occur if an invalid test case is used; invalid variable references are therefore checked against the variables maintained in a dictionary. These variables are defined in included files and in the current local scope, and are stored as soon as they are defined.
- The scope of a TestMgr List is global, provided the included files also have their scope listed as global. To make this compatible with Python unit test cases and Python scope rules, every time a list or libl is included in global scope in the Test list, the current file's globals dictionary has to be updated with all the variables defined in the included list or libl.
- The following functions and class are therefore relevant to this functionality: read_include_symtable(), write_sym_table() and RegPatternInclude().process_var(). A sketch of the symbol-table merge follows.
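A minimal sketch of the bookkeeping described above; the helper name and signature are hypothetical (the real functions are the ones listed):

```python
def merge_include_symtable(included_symbols, current_symbols):
    """Merge the variables defined by an included list/libl into the
    including file's symbol dictionary, since included files are global."""
    for name, definition in included_symbols.items():
        # Every variable the included file defines becomes visible here,
        # so later scope validation treats references to it as defined.
        current_symbols[name] = definition
    return current_symbols

# Usage: symbols defined so far in the parent file, plus those read from
# the included file's symbol table (cf. read_include_symtable()).
parent_symbols = {'timeout': '30'}
parent_symbols = merge_include_symtable({'retries': '3'}, parent_symbols)
```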
- Handling Include Lists and libls
- The handling of included lists and libls requires a different starting point in the grammar as the format of the included libls and lists does not have setup, teardown and testcase sections. Hence, the rule libl_start deals with the start rule of libl and lists included in the test lists.
- Include statements are interpreted and translated to prefix #$@read_include_var@<file_to_include> to indicate that a file has been included for later processing.
- Include statements themselves are handled with the help of the function handle_include_stmt() from the class included.
- The handle_include_stmt() function is similar to the translate() function in the exp_parser module: it creates a Parser object and passes it the custom Semantics class ExpSemantics, but uses a different start rule, libl_start.
- Essentially, this resembles depth-first processing: whenever an include statement is encountered, processing of the present file is suspended until processing of the included file is complete, and the same applies within included files as well. Thus recursive inclusion of files is handled naturally (a sketch follows below).
- Following depicts the processing graphically. ##IMAGE
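A hedged sketch of the recursion just described, assuming the generated MyParser class and the usual grako `parse()` call; the function body is illustrative, not the actual implementation:

```python
from parser_base_main import MyParser
from exp_parser import ExpSemantics

def handle_include_stmt(include_path):
    """Sketch of recursive include handling."""
    with open(include_path) as f:
        text = f.read()

    parser = MyParser()
    # Included lists/libls have no setup/testcase/teardown sections,
    # so parsing starts from the libl_start rule instead of start.
    translated = parser.parse(text, rule_name='libl_start',
                              semantics=ExpSemantics())
    # If the included file itself contains include statements, this function
    # is reached again through the semantics, giving depth-first, recursive
    # processing of nested includes.
    return translated
```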
- Task execution
- The TestMgr framework has task statements which involve a certain command being executed on a set of clients or nodes. These tasks have specific variables like stdout, stderr, last_pid, etc. which are readily available for use.
- However, the Helix framework does not have these “special” variables readily available. One solution is to keep every task in a variable, but that would make the translated code ugly; it is therefore necessary to keep track of the last executed task at every statement.
- This enables the translator to prepend the last executed task statement with a variable assignment “proc = ” whenever it comes across a special variable like stdout, stderr, lastpid, etc.
- The ifStmt, however, can apply this strategy only within its own block of code. If the variable reference is outside the if/else blocks and the last executed task appears in both the if and the else block, then the last task of each block is prepended with the “proc = ” assignment. This is done because it is impossible to decide at translation time which of the blocks will be executed.
- The processing to maintain the last executed task takes place in check_var_usage() (for the parent file) and in the constructor of libl_start (for included files). A sketch of the idea follows.
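A minimal sketch of the "bind the last task" strategy with hypothetical names; only the prepending step is shown:

```python
SPECIAL_VARS = ('stdout', 'stderr', 'last_pid')

def bind_last_task(lines, last_task_index, current_line):
    """Sketch: if the current statement references a special variable,
    rewrite the most recently executed task so its result is captured in
    `proc` (e.g. `proc.stdout` becomes available)."""
    if any(var in current_line for var in SPECIAL_VARS):
        if not lines[last_task_index].startswith('proc = '):
            lines[last_task_index] = 'proc = ' + lines[last_task_index]
    return lines
```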
Adding functionality to the existing code base is an expected task, since new things will likely need to be added in the future. With this in mind, here are some cases and the corresponding suggested ways in which specific things can be added:
- **Add special variable handling** – Special variables like stdout, stderr, etc. (a list can be found in the taskmgr syntax documentation under the title “variable substitution”) which require special handling can be added by creating a class and implementing the function handle_var() in it. Module – variable_handle.py.
- In the variable_handle module, subclass the BaseVariableHandle class and implement a handle_var() function in it which returns the desired value of the variable in the translated code, e.g. stdout = proc.stdout.
- Add the variable name and its class instantiation as a key/value pair to the var_map_out dictionary in the __init__ function of the Variable_Handle class (see the sketch below).
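A hedged sketch of what such a handler might look like; BaseVariableHandle and Variable_Handle are the names mentioned above, while the handler class body is an assumption:

```python
# variable_handle.py (sketch)

class BaseVariableHandle:
    """Base class giving the structure of a special-variable handler."""
    def handle_var(self):
        raise NotImplementedError

class StdoutHandle(BaseVariableHandle):
    def handle_var(self):
        # stdout of the last executed task is available on `proc`
        return 'proc.stdout'

class Variable_Handle:
    def __init__(self):
        # variable name -> handler instance, used during translation
        self.var_map_out = {'stdout': StdoutHandle()}
```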
- **Add new node class variable handling** – Node class variables like linux, windows, nodes, client, etc. are handled in the class NodeHandle. BaseNodeHandle is the base class, which defines the structure of a new node class variable.
- Add a new class which denotes the new class variable by subclassing the BaseNodeHandle.
- Add the processed name corresponding to the variable name as a key/value pair in the VAR_MAP dictionary.
- Add the name of the node variable and its class instantiation to the var_class dictionary in the NodeHandle class; a structural sketch follows.
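A structural sketch only: NodeHandle, BaseNodeHandle, VAR_MAP and var_class are the names listed above, while the handler body and the mapped value are made up:

```python
class BaseNodeHandle:
    """Base class giving the structure of a node class variable handler."""
    pass

class LinuxHandle(BaseNodeHandle):
    """Hypothetical handler for the `linux` node class variable."""
    pass

# raw TestMgr name -> processed name used in the translated code (made-up value)
VAR_MAP = {'linux': 'linux_nodes'}

class NodeHandle:
    # node variable name -> handler instance
    var_class = {'linux': LinuxHandle()}
```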
- **Add a new rule to the grammar** –
- Add the new rule to the grammar and try to compile it with grako to ensure that there are no errors in the added grammar.
- After successful compilation a new parser is generated; it is important to use this new parser, since the previously generated one will not reflect the change made to the grammar.
- Add a function corresponding to the rule in the class ExpSemantics. The function will instantiate an object which corresponds to a class in model.py module.
- Add a class corresponding to the rule in model.py which subclasses the Model class. This subclass of Model should have a render_fields method and interpret the input AST so that the translated output is recorded when the object is instantiated (see the sketch below).
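Putting these steps together for a hypothetical new rule `wait_stmt` (the rule, grammar file and class names are illustrative, and the exact regeneration flags may vary by grako version):

```python
# Regenerate the parser after editing the grammar, e.g. from a shell
# (flag spelling may vary across grako versions):
#     grako lists_grammar.ebnf -o parser_base_main.py

from parser_base_main import MySemantics
import model

class ExpSemantics(MySemantics):
    def wait_stmt(self, ast):
        # Instantiate the matching model class; it records the translated
        # output for this rule when constructed, following the same
        # template/render_fields pattern sketched earlier for model.py.
        return model.WaitStmt(ast)
```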
- **Add new regular expression parsing** – There can be a requirement where new markers have to be introduced to signify that the processing of a particular entity cannot be done at the current moment and has to be deferred.
- Add a class in reg_exp_module.py which represents the regular expression to be used to parse.
- Add the class to the reg_pattern dictionary so that the factory method can find it (sketched below).
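A hedged sketch of such a pattern class and its registration; the marker, class name and process_var signature are assumptions (reg_pattern and reg_exp_module.py are the names mentioned above):

```python
# reg_exp_module.py (sketch)
import re

class RegPatternNewMarker:
    """Matches a hypothetical deferred-processing marker $@new_marker_."""
    pattern = re.compile(r'\$@new_marker_(\w+)')

    def process_var(self, line, context):
        # Replace the marker with its resolved context once it is known
        return self.pattern.sub(lambda m: context + m.group(1), line)

# Register the class so the factory method can look it up by marker name
reg_pattern = {'new_marker': RegPatternNewMarker}
```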
There are some directives that let the translator take specific actions to produce the desired output when processing reaches the root, since the output is formatted at each level of the tree traversal. The following is a brief account of the directives used.
- $@var_def – used for variable definitions to signify to the translator that a context has to be applied in place of the $@var_def prefix, provided the variable it is prepended to is present in the maintained variable dictionary. It is also used to denote node addressing, whether or not the node is indexed; nodes or clients in task statements likewise have their context substituted for the $@var_def prefix. e.g. $@var_defa = 1 => cls.a = 1 (setup/teardown section); $@var_defnode[0] => cls.env.node[0] (setup/teardown section)
- $@var_ – used for variable references and definitions. In the case of a definition it is replaced by $@var_def; otherwise it is replaced with the context prefix during translation. e.g. $@var_somevariable = 1 => somevariable = 1 (testcase)
- $@replace_ – variables used in strings on the RHS of assignments or in the command of task statements are parsed and prefixed with $@replace_, to be replaced later by the context prefix. e.g. "$$variable is $$value" => "{variable} is {value}".format(**)
- $@ifdefined_ – used for the "||=" operator so that the translated code checks whether the variable is already defined in locals() when it is processed later.
- #$@read_include_var@ – This directive takes care of include statements. Whenever an include statement is encountered, the processing at the root node (start rule) is directed to read all the variables from the included file to ensure strict scope validation of the variables.
- #$@decorate_ – Some statements, like hardware (--hardware), become decorators in the Helix framework. The decorators can only be added once it is known whether the current processing corresponds to the setup, teardown or testcase rule; therefore the decorators are emitted during processing of the start rule whenever this directive is encountered.
- @task – A task statement is denoted by the @task prefix in the interpretation of the task_stmt grammar rule. It is used to keep track of the last executed task so that it can be reformatted when a special variable (stdout, stderr, lastpid) is encountered.
- @$lookup – Lookup is used to identify the context of a processing section in the check_var_usage() function.
The parser does not follow a fixed flow chart, as the flow depends entirely on the AST being traversed. The functions corresponding to grammar rules are invoked by grako through its _call(), _invoke_rule() and rule() machinery.