IncrementalUpdateForDataFeature
ABB.SrcML.Data is a library for producing program-related data from srcML. It focuses on producing maps and relationships between program elements. Some of the maps it currently produces are:
- Type maps: Given a type usage (such as in a variable declaration), where is the type definition?
- Variable maps: Given a variable usage, where is the variable declaration?
  - If the variable is attached to an object, what object? (i.e. resolve b in the statement a.b; this relies on the type map)
- Function maps: Given a function call, what is the definition for the function? Because we may have to resolve method calls on objects and argument variables, this depends on both the variable map and the type map.
SrcML.Data also provides a full function call graph on top of these maps.
The basic idea behind SrcML.Data is that we can infer much of the information about programming statements by looking at the structure of the srcML document. For instance, given the following C++ code:
int MyObject::PrintTheString(string theString)
{
if(theString.length() > 0)
cout << theString;
return theString.length();
}
We get the following srcML:
<function><type><name>int</name></type> <name><name>MyObject</name><op:operator>::</op:operator><name>PrintTheString</name></name><parameter_list>(<param><decl><type><name>string</name></type> <name>theString</name></decl></param>)</parameter_list>
<block>{
<if>if<condition>(<expr><name>theString</name><op:operator>.</op:operator><call><name>length</name><argument_list>()</argument_list></call> <op:operator>&gt;</op:operator> <lit:literal type="number">0</lit:literal></expr>)</condition><then>
<expr_stmt><expr><name>cout</name> <op:operator>&lt;&lt;</op:operator> <name>theString</name></expr>;</expr_stmt></then></if>
<return>return <expr><name>theString</name><op:operator>.</op:operator><call><name>length</name><argument_list>()</argument_list></call></expr>;</return>
}</block></function>
Without having to compile the function, we can learn a number of things from the XML. First, what do we know about this function?
- The function is named PrintTheString
- It is a member of the MyObject class
- It returns an integer
- It takes one argument (a string)
From this information, we can easily construct a signature for this method.
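As a rough illustration of the signature construction described above, here is a minimal Python sketch that reads the relevant elements out of a trimmed copy of the srcML. It is not the SrcML.Data API: build_signature is a hypothetical helper, the unit wrapper is added only so the op: prefix resolves, and the namespace URI is an assumption about the srcML version in use.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the srcML above. The <unit> wrapper and its namespace URI
# are assumptions added so that the op: prefix can be parsed.
SRCML = (
    '<unit xmlns:op="http://www.sdml.info/srcML/operator">'
    '<function><type><name>int</name></type> '
    '<name><name>MyObject</name><op:operator>::</op:operator>'
    '<name>PrintTheString</name></name>'
    '<parameter_list>(<param><decl><type><name>string</name></type> '
    '<name>theString</name></decl></param>)</parameter_list>'
    '<block>{}</block></function></unit>'
)


def build_signature(function):
    """Build a signature string from a srcML <function> element (hypothetical helper)."""
    return_type = function.findtext("type/name")
    name = function.find("name")
    # A qualified name (MyObject::PrintTheString) nests one <name> per part;
    # an unqualified name holds its text directly.
    parts = [part.text for part in name.findall("name")] or [name.text]
    params = [
        f'{decl.findtext("type/name")} {decl.findtext("name")}'
        for decl in function.findall("parameter_list/param/decl")
    ]
    return f'{return_type} {"::".join(parts)}({", ".join(params)})'


root = ET.fromstring(SRCML)
signature = build_signature(root.find("function"))
print(signature)
```

The sketch only handles simple type names; templates, pointers, and qualifiers such as const would need more care.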
Additionally, we can look at the use of a variable in the function body (theString). theString is used three times in the body of the method. Where is it declared? We can answer this question by checking each enclosing scope in turn:
- Look at the current block: Is theString declared? No
- Look at the function: Is theString declared? Yes, it is an argument to the method
Now we have the declaration for theString and its type.
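The scope walk above can be sketched in Python as follows. This is an illustrative sketch, not how SrcML.Data is implemented: find_declaration is a hypothetical helper, the srcML is a trimmed and unqualified version of the example, and the unit wrapper with its namespace URI is an assumption added so the fragment parses.

```python
import xml.etree.ElementTree as ET

# Trimmed srcML for the example function. The <unit> wrapper and its
# namespace URI are assumptions added so that the op: prefix can be parsed.
SRCML = (
    '<unit xmlns:op="http://www.sdml.info/srcML/operator">'
    '<function><type><name>int</name></type> <name>PrintTheString</name>'
    '<parameter_list>(<param><decl><type><name>string</name></type> '
    '<name>theString</name></decl></param>)</parameter_list>'
    '<block>{<return>return <expr><name>theString</name>'
    '<op:operator>.</op:operator><call><name>length</name>'
    '<argument_list>()</argument_list></call></expr>;</return>}</block>'
    '</function></unit>'
)


def find_declaration(root, use):
    """Walk outward from a <name> use until a matching declaration is found."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    scope = parents.get(use)
    while scope is not None:
        if scope.tag == "block":
            kind, path = "local", "decl_stmt/decl"
        elif scope.tag == "function":
            kind, path = "parameter", "parameter_list/param/decl"
        else:
            kind, path = None, None
        if path is not None:
            for decl in scope.findall(path):
                if decl.findtext("name") == use.text:
                    return kind, decl.findtext("type/name")
        scope = parents.get(scope)
    return None


root = ET.fromstring(SRCML)
use = root.find(".//return/expr/name")  # the use of theString in the return
declaration = find_declaration(root, use)
print(declaration)
```

The sketch does not check that a local declaration appears before the use, and it stops at the function boundary; class members and globals would require further scopes.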
The current SrcML.Data implementation parses a multi-file srcML archive and dumps all of the relevant data (variable declarations, type definitions, method definitions, etc.) into a SQL Server database. Even for relatively small programs (for example Notepad++), this can take 5-10 minutes on a Core i5 laptop.
Another deficiency is updating: the current implementation requires the entire dataset to be thrown away and regenerated when any source file changes. This is not sustainable.
List users of the feature here
Describe use-cases for this feature here
Describe the design of the feature here