Graphviz integration and persistence refactor - #28
Graphviz integration and persistence refactor - #28alessandrocarminati merged 15 commits intonav-kernel-binfrom
Conversation
The SQL component used to be this clunky, old-school set of functions written in C style. But with this patch, I've gone ahead and given it a total makeover by abstracting it with interfaces and receivers. The result is a shiny new version that makes it way easier to make changes to the persistence code down the line. You could even add in new components, like non-SQL databases, just by sticking to the interface. Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com> Suggested-by: Maurizio Papini <mpapini@redhat.com>
This patch finishes up the test port for the newly refactored SQL component. I should mention that we split off the old test file that covered all the functions in db.go into two new files: one for the persistency interface, and the other for everything else. Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com> Suggested-by: Maurizio Papini <mpapini@redhat.com>
This commit introduces an enhancement to the Nav commandline tool that integrates Graphviz functionality for generating diagrams within the tool itself, thereby eliminating the need to pipe output to an external Graphviz tool like dot. Although there is a long-term goal of creating a web application for navigating code with diagram display, the objective for now is to maintain Nav as a commandline tool. Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
This patch introduces a small Flex - Bison dot parser. I'm not claiming it is essential, but it came easy from a work I did before, and I think it to be useful for testing the nav application. It is written in C which make nav use CGO at some point. For now, just the flex - bison - c code of the parser. Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
This patch updates the generateOutput function to use an external C parser for dot parsing. It includes the addition of CGO and a dedicated test case for the parser itself, which runs generateOutput and parses the resulting dot file. Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
With this patch, we've made the external C parser optional for generateOutput. As cgo can be problematic for some users, we've added an option to bypass it and test generateOutput against a sample dot file instead. This change should make the tool more accessible and user-friendly. Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com> Suggested-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com> Reported-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com> Reported-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com> Suggested-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com> Reported-by: Maurizio Papini <mpapini@redhat.com>
This commit adds the ability to turn debug messages on and off through a switch. When the switch is turned off, no code is inserted, but when it is turned on, code is generated and messages are printed. The benefit of this implementation is that messages can be left in place without impacting performance or producing output when the switch is turned off. Please see https://carminatialessandro.blogspot.com/2023/04/c-styel-print-debug-in-golang.html for additional references. Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
This patch introduces a new mock driver for the persistence-related functions that are used in the db version. The primary objective of this patch is to remove the coupling between high-level function tests and the persistence layer in the long term. By replacing the functions that depend on the persistence layer with a mock driver, we can test the high-level functions without actually relying on the persistence layer. This also means that we can easily switch between different persistence implementations in the future without having to update our tests. Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com> Suggested-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
|
This is the updated version of the "Graphviz integration and persistence refactor" |
rokkbert
left a comment
There was a problem hiding this comment.
Can't finish the review at this moment, but here are some comments already! :)
| callee int | ||
| } | ||
|
|
||
| // Return id an item is already in the list. |
There was a problem hiding this comment.
I guess "id" should be "if"?
Why does it have to be negated? "notIn?" -> "false" seems unnecessarily hard on the brain :)
There was a problem hiding this comment.
Touché, a typo has been spotted. It is just that 'd' and 'f' are so near... ;-P
|
|
||
| const SUBSYS_UNDEF = "The REST" | ||
|
|
||
| // Parent node. |
There was a problem hiding this comment.
All these types could use some explanation/comments, IMHO. Their names are very generic and they don't reference each other, so the intended structure isn't very clear.
Maybe a special type for symbold-ID could help too.
The name "adjM" seems suspicious. Does M stand for Matrix? But is an l-r-struct already a matrix?
What does it mean for an entry (a symbol?) to have multiple subsystems? Especially when there is a function "getSubsysFromSymbolName" which seems to return only one?
There was a problem hiding this comment.
Upon further reflection, I agree that adding more comments could be beneficial. However, if possible, I would prefer to postpone this to another commit or pull request.
Regarding the adjM, I am ashamed to say that it was a typo that I left unchanged out of laziness. My original intention was to express that two nodes were adjacent ("l" is for left, "r" is for right), and following this intent, it should have been named adjN.
| } | ||
|
|
||
| // Removes duplicates resulting by the exploration of a call tree. | ||
| func removeDuplicate(list []entry) []entry { |
There was a problem hiding this comment.
Is it guaranteed that the duplicate entries will be identical in all fields?
Maybe we should say that identity is defined by symId (after all, it's just one of the fields in the struct and nothing special, syntactically).
Could we avoid having to remove duplicates by just not creating them in the first place, by using a map? Or simplify this function by just adding all entries into a map?
There was a problem hiding this comment.
While it's not guaranteed, we don't really care about it in practice. Looking at how it's used, we can see that the navigate function calls removeDuplicate. This function processes the data provided by getSuccessorsById, which gives us all the functions called from another function. Although it's not uncommon for a function to call another function multiple times at different places within its body, our goal is simply to represent the fact that one function can reach another. As such, we do not need to worry about differences such as the calling address.
| } | ||
|
|
||
| // Checks if a given function needs to be explored. | ||
| func notExcluded(symbol string, excluded []string) bool { |
There was a problem hiding this comment.
Is this negation necessary? Could it be "excluded" or "included" instead?
I would prefer if I knew that "excluded" isn't a list of symbol names but of regular expressions without reading the body of the function. Do we even want that? The README gives "rcu_.*" as an example, but for example regexp.MatchString("foo", "foobar") returns true as well. Is that our intention?
The excluded-lists are user supplied - do we validate them in some way? Maybe we should, or alternatively not ignore the error from MatchString here?
There was a problem hiding this comment.
The function regexp.MatchString("foo", "foobar") returns a match because that is how regular expressions work. When I specify a field as a regular expression, I assume that the user is familiar with how regular expressions work. I do not want to limit the potential of regular expressions simply because a user may misuse them. The name notExcluded was chosen because it sounded appropriate to me, given that the excluded.* enums are utilized.
| var l, r, ll node | ||
|
|
||
| *visited = append(*visited, symbolId) | ||
| l = parentDispaly |
There was a problem hiding this comment.
I agree with you that "parentDisplay" may not be the most aesthetically pleasing name. However, despite considering other alternatives, I have yet to find a better option. My intention with this name was to convey the idea that this value represents the parent of the current node, and that it is solely used for display purposes.
| *visited = append(*visited, symbolId) | ||
| l = parentDispaly | ||
| successors, err := d.getSuccessorsById(symbolId, instance) | ||
| if mode == c.PrintAll { |
There was a problem hiding this comment.
"PrintAll" sounds to me like the opposite of "removeDupliacte". Is this correct?
Wouldn't it be more efficient to just not get the duplicates from the DB in the first place?
There was a problem hiding this comment.
The two items, c.PrintAll and removeDuplicate, may sound similar, but they are actually quite distinct. Firstly, removeDuplicate is a function, whereas c.PrintAll is an enum value. Secondly, the c.PrintAll value is the point at which the flow determines whether it needs to generate all the information or not. Please note that in a previous comment, I expressed concerns about having a dedicated function that does not emit duplicates.
| if mode == c.PrintAll { | ||
| successors = removeDuplicate(successors) | ||
| } | ||
| if err == nil { |
There was a problem hiding this comment.
If err != nil then we don't do anything, so we removed the duplicates for nothing. Should we pull it inside the if block?
There was a problem hiding this comment.
Yes, I agree with you. The current form of the function is a result of previous versions and may not be optimal. Upon reflection, it makes no sense to continue the computation if an error is produced. Therefore, it would be beneficial to modify this function to report the error condition externally.
This patchset includes several changes to a codebase, including: