Skip to content

Graphviz integration and persistence refactor - #28

Merged
alessandrocarminati merged 15 commits intonav-kernel-binfrom
ac-graphviz-integration_refactor-rebase-4b86da7b
Jul 4, 2023
Merged

Graphviz integration and persistence refactor - #28
alessandrocarminati merged 15 commits intonav-kernel-binfrom
ac-graphviz-integration_refactor-rebase-4b86da7b

Conversation

@alessandrocarminati
Copy link
Copy Markdown
Collaborator

This patchset includes several changes to a codebase, including:

  • Introducing a new mock driver for persistence-related functions to decouple high-level function tests from the persistence layer.
  • Adding support for C-style switchable debug messages.
  • Fixing errors and improving the makefile.
  • Making the external C parser optional for generateOutput and adding a dedicated test case for the parser itself.
  • Introducing a small Flex-Bison dot parser for testing the Nav application.
  • Refactoring Nav to include Graphviz functionality for diagram images production.
  • Completing the test port for the newly refactored SQL component.
  • Refactoring the SQL component by abstracting it with interfaces and receivers to make it easier to make changes to the persistence code down the line.

The SQL component used to be this clunky, old-school set of functions
written in C style. But with this patch, I've gone ahead and given it
a total makeover by abstracting it with interfaces and receivers. The
result is a shiny new version that makes it way easier to make changes
to the persistence code down the line.
You could even add in new components, like non-SQL databases, just by
sticking to the interface.

Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Suggested-by: Maurizio Papini <mpapini@redhat.com>
This patch finishes up the test port for the newly refactored SQL component.
I should mention that we split off the old test file that covered all the
functions in db.go into two new files: one for the persistency interface,
and the other for everything else.

Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Suggested-by: Maurizio Papini <mpapini@redhat.com>
This commit introduces an enhancement to the Nav commandline tool that
integrates Graphviz functionality for generating diagrams within the
tool itself, thereby eliminating the need to pipe output to an external
Graphviz tool like dot. Although there is a long-term goal of creating
a web application for navigating code with diagram display, the objective
for now is to maintain Nav as a commandline tool.

Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
This patch introduces a small Flex - Bison dot parser.
I'm not claiming it is essential, but it came easy from a work I did
before, and I think it to be useful for testing the nav application.
It is written in C which make nav use CGO at some point. For now,
just the flex - bison - c code of the parser.

Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
This patch updates the generateOutput function to use an external C parser
for dot parsing. It includes the addition of CGO and a dedicated test case
for the parser itself, which runs generateOutput and parses the resulting
dot file.

Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
With this patch, we've made the external C parser optional for generateOutput.
As cgo can be problematic for some users, we've added an option to bypass it
and test generateOutput against a sample dot file instead. This change should
 make the tool more accessible and user-friendly.

Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Suggested-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Reported-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Reported-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Suggested-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Reported-by: Maurizio Papini <mpapini@redhat.com>
This commit adds the ability to turn debug messages on and off through a
switch. When the switch is turned off, no code is inserted, but when it is
turned on, code is generated and messages are printed. The benefit of this
implementation is that messages can be left in place without impacting
performance or producing output when the switch is turned off.

Please see
https://carminatialessandro.blogspot.com/2023/04/c-styel-print-debug-in-golang.html
for additional references.

Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
This patch introduces a new mock driver for the persistence-related functions
that are used in the db version. The primary objective of this patch is to
remove the coupling between high-level function tests and the persistence
layer in the long term. By replacing the functions that depend on the
persistence layer with a mock driver, we can test the high-level functions
without actually relying on the persistence layer.
This also means that we can easily switch between different persistence
implementations in the future without having to update our tests.

Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
Suggested-by: Maurizio Papini <mpapini@redhat.com>
Signed-off-by: Alessandro Carminati <alessandro.carminati@gmail.com>
@alessandrocarminati
Copy link
Copy Markdown
Collaborator Author

alessandrocarminati commented Apr 27, 2023

This is the updated version of the "Graphviz integration and persistence refactor"
Former #24

Copy link
Copy Markdown

@rokkbert rokkbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't finish the review at this moment, but here are some comments already! :)

Comment thread db_util.go
callee int
}

// Return id an item is already in the list.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess "id" should be "if"?
Why does it have to be negated? "notIn?" -> "false" seems unnecessarily hard on the brain :)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Touché, a typo has been spotted. It is just that 'd' and 'f' are so near... ;-P

Comment thread db_util.go

const SUBSYS_UNDEF = "The REST"

// Parent node.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these types could use some explanation/comments, IMHO. Their names are very generic and they don't reference each other, so the intended structure isn't very clear.
Maybe a special type for symbold-ID could help too.
The name "adjM" seems suspicious. Does M stand for Matrix? But is an l-r-struct already a matrix?
What does it mean for an entry (a symbol?) to have multiple subsystems? Especially when there is a function "getSubsysFromSymbolName" which seems to return only one?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon further reflection, I agree that adding more comments could be beneficial. However, if possible, I would prefer to postpone this to another commit or pull request.

Regarding the adjM, I am ashamed to say that it was a typo that I left unchanged out of laziness. My original intention was to express that two nodes were adjacent ("l" is for left, "r" is for right), and following this intent, it should have been named adjN.

Comment thread db_util.go
}

// Removes duplicates resulting by the exploration of a call tree.
func removeDuplicate(list []entry) []entry {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it guaranteed that the duplicate entries will be identical in all fields?
Maybe we should say that identity is defined by symId (after all, it's just one of the fields in the struct and nothing special, syntactically).
Could we avoid having to remove duplicates by just not creating them in the first place, by using a map? Or simplify this function by just adding all entries into a map?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's not guaranteed, we don't really care about it in practice. Looking at how it's used, we can see that the navigate function calls removeDuplicate. This function processes the data provided by getSuccessorsById, which gives us all the functions called from another function. Although it's not uncommon for a function to call another function multiple times at different places within its body, our goal is simply to represent the fact that one function can reach another. As such, we do not need to worry about differences such as the calling address.

Comment thread db_util.go
}

// Checks if a given function needs to be explored.
func notExcluded(symbol string, excluded []string) bool {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this negation necessary? Could it be "excluded" or "included" instead?
I would prefer if I knew that "excluded" isn't a list of symbol names but of regular expressions without reading the body of the function. Do we even want that? The README gives "rcu_.*" as an example, but for example regexp.MatchString("foo", "foobar") returns true as well. Is that our intention?
The excluded-lists are user supplied - do we validate them in some way? Maybe we should, or alternatively not ignore the error from MatchString here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function regexp.MatchString("foo", "foobar") returns a match because that is how regular expressions work. When I specify a field as a regular expression, I assume that the user is familiar with how regular expressions work. I do not want to limit the potential of regular expressions simply because a user may misuse them. The name notExcluded was chosen because it sounded appropriate to me, given that the excluded.* enums are utilized.

Comment thread db_util.go
var l, r, ll node

*visited = append(*visited, symbolId)
l = parentDispaly
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parentDisplay?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you that "parentDisplay" may not be the most aesthetically pleasing name. However, despite considering other alternatives, I have yet to find a better option. My intention with this name was to convey the idea that this value represents the parent of the current node, and that it is solely used for display purposes.

Comment thread db_util.go
*visited = append(*visited, symbolId)
l = parentDispaly
successors, err := d.getSuccessorsById(symbolId, instance)
if mode == c.PrintAll {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"PrintAll" sounds to me like the opposite of "removeDupliacte". Is this correct?
Wouldn't it be more efficient to just not get the duplicates from the DB in the first place?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two items, c.PrintAll and removeDuplicate, may sound similar, but they are actually quite distinct. Firstly, removeDuplicate is a function, whereas c.PrintAll is an enum value. Secondly, the c.PrintAll value is the point at which the flow determines whether it needs to generate all the information or not. Please note that in a previous comment, I expressed concerns about having a dedicated function that does not emit duplicates.

Comment thread db_util.go
if mode == c.PrintAll {
successors = removeDuplicate(successors)
}
if err == nil {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If err != nil then we don't do anything, so we removed the duplicates for nothing. Should we pull it inside the if block?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree with you. The current form of the function is a result of previous versions and may not be optimal. Upon reflection, it makes no sense to continue the computation if an error is produced. Therefore, it would be beneficial to modify this function to report the error condition externally.

@alessandrocarminati alessandrocarminati merged commit 815d8b4 into nav-kernel-bin Jul 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants