Skip to content
This repository has been archived by the owner on Mar 8, 2020. It is now read-only.

Driver returns twice the same comment #36

Open
dpordomingo opened this issue Sep 17, 2018 · 8 comments
Open

Driver returns twice the same comment #36

dpordomingo opened this issue Sep 17, 2018 · 8 comments
Labels

Comments

@dpordomingo
Copy link
Member

dpordomingo commented Sep 17, 2018

If it's parsed the following example:

a = 1; //comment here
/* block comment */

It is returned each comment twice with different internalRole

image

addendum:
This issue might be related to these similar ones in other languages drivers:
bblfsh/go-driver#28 bblfsh/python-driver#167 bblfsh/java-driver#86

@dpordomingo
Copy link
Member Author

dpordomingo commented Sep 17, 2018

For comment blocks, the same comment can be returned three times, not only two

var user = [];
/**
 * adds two numbers togeter
 */
function add(first) {
}

image

@zurk
Copy link

zurk commented Sep 28, 2018

I confirm that for drivers v1.2.0 and v2.2.0 And it happens not only for comments.
This JS code produses next UAST tree:

import { PassportConfiguratorrr } from '@freecodecamp/loopback-component-passport';
#  Token                   Internal Role      Roles Tree                                         
                                                                                                 
1  ||                      File               FILE                                               
1  ||                      Program            ┣ MODULE                                           
1  ||                      ImportDeclaration  ┃ ┣ STATEMENT, DECLARATION, IMPORT                 
1  ||                      ImportSpecifier    ┃ ┃ ┣ IMPORT                                       
1  |PassportConfigurator|  Identifier         ┃ ┃ ┃ ┣ EXPRESSION, IDENTIFIER, IMPORT             
1  |PassportConfigurator|  Identifier         ┃ ┃ ┃ ┗ EXPRESSION, IDENTIFIER, IMPORT             
1  |@freecodecamp/loopba|  StringLiteral      ┗ ┗ ┗ EXPRESSION, LITERAL, STRING, IMPORT, PATHNAME

@dennwc
Copy link
Member

dennwc commented Sep 28, 2018

@zurk It doesn't show the internalRole field of these nodes. I'm quite sure they are different. It's probably a name and an alias fields (package-local name).

@zurk
Copy link

zurk commented Sep 28, 2018

oh, I just mixed, and column Internal Role corresponds to node.internal_type field. So, internal types are the same or we have a bug in there

@dennwc
Copy link
Member

dennwc commented Sep 28, 2018

@zurk Yes, but I'm talking about a different field :) internal_type in UAST v1 corresponds to a @type field in v2, so yes, you will have two Identifiers there. But there is an InternalRole property in v1 which corresponds to a field name (or predicate) in v2. This field should have two different values in your example. So there are two different identifiers - one for the external import name and the second one for the internal name (alias for current file/package). Or at least this is what I assume, looking at the sample.

But in case that @dpordomingo described, we can remove these duplicates from the tree safely. But for the case with names vs aliases, removing one of them will cause half of the UAST queries to break.

The correct solution will require adding support for DAGs in SDK that I was advocating from the very beginning. This will happen later during v2 cycle, or even in v3.

@zurk
Copy link

zurk commented Sep 28, 2018

yeah, now I get it. If in Internal Role I use node.properties["internalRole"] I get

#  Token                   Internal Role  Roles Tree                                         
                                                                                             
1  ||                                     FILE                                               
1  ||                      program        ┣ MODULE                                           
1  ||                      body           ┃ ┣ STATEMENT, DECLARATION, IMPORT                 
1  ||                      specifiers     ┃ ┃ ┣ IMPORT                                       
1  |PassportConfigurator|  imported       ┃ ┃ ┃ ┣ EXPRESSION, IDENTIFIER, IMPORT             
1  |PassportConfigurator|  local          ┃ ┃ ┃ ┗ EXPRESSION, IDENTIFIER, IMPORT             
1  |@freecodecamp/loopba|  source         ┗ ┗ ┗ EXPRESSION, LITERAL, STRING, IMPORT, PATHNAME

before it was node.internal_type on this place. UAST even more complicated than I think.

@bzz
Copy link
Contributor

bzz commented Nov 29, 2018

This seems still to be the case: the query to the same file //*/uast:Comment results in entities for me:

{"@pos":{"@type":"uast:Positions","end":{"@type":"uast:Position","col":22,"line":1,"offset":21},"start":{"@type":"uast:Position","col":8,"line":1,"offset":7}},"@type":"uast:Comment","Block":false,"Prefix":"","Suffix":"","Tab":"","Text":"comment here"}
{"@pos":{"@type":"uast:Positions","end":{"@type":"uast:Position","col":20,"line":2,"offset":41},"start":{"@type":"uast:Position","col":1,"line":2,"offset":22}},"@type":"uast:Comment","Block":true,"Prefix":" ","Suffix":" ","Tab":"","Text":"block comment"}
{"@pos":{"@type":"uast:Positions","end":{"@type":"uast:Position","col":22,"line":1,"offset":21},"start":{"@type":"uast:Position","col":8,"line":1,"offset":7}},"@type":"uast:Comment","Block":false,"Prefix":"","Suffix":"","Tab":"","Text":"comment here"}
{"@pos":{"@type":"uast:Positions","end":{"@type":"uast:Position","col":20,"line":2,"offset":41},"start":{"@type":"uast:Position","col":1,"line":2,"offset":22}},"@type":"uast:Comment","Block":true,"Prefix":" ","Suffix":" ","Tab":"","Text":"block comment"}

query.go

package main

import (
	"encoding/json"
	"fmt"

	"gopkg.in/bblfsh/client-go.v3"
	"gopkg.in/bblfsh/client-go.v3/tools"
	"gopkg.in/bblfsh/sdk.v2/uast/nodes"
)

func main() {
	client, err := bblfsh.NewClient("localhost:9432")
	if err != nil {
		panic(err)
	}

	res, _, err := client.NewParseRequest().ReadFile("js-driver-issue-36.js").UAST()
	if err != nil {
		panic(err)
	}

	query := "//*/uast:Comment"
	it, _ := tools.Filter(res, query)
	var nodeAr nodes.Array
	for it.Next() {
		nodeAr = append(nodeAr, it.Node().(nodes.Node))
	}

	data, err := json.MarshalIndent(nodeAr, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
$ go run ./client.go | jq -c '.[]' | wc -l

4

@dennwc do you know what is the reason? This seems like a 2 identical copies to me

@dennwc
Copy link
Member

dennwc commented Nov 29, 2018

@bzz This is exactly the problem I talked about when reffering to DAGs. There are two pointers to the same comment node in the native AST.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants