Can I use CodeQL to infer indentation level for each line of a Python script? #9183
Replies: 2 comments 5 replies
-
I am fairly sure the Python CodeQL extractor discards token information, so I would not expect this to be possible. @github/codeql-python do you agree?
-
I want to build a model with Python code as training data. To do that, I want to replace leading tabs with tokens such as TAB1, TAB2, and so on, where TABn replaces n successive tabs. In this example, line1 needs no replacement because there is no tab at the beginning. For line2, line3 and line4, replace the first tab (or four spaces) with TAB1, leaving the remaining spaces in line3 untouched. Finally, for line4, replace the first two tabs (or two groups of four spaces) with TAB2. In other words, TAB1 is indentation level 1, TAB2 is indentation level 2, and so on. Does that make sense?
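To make the scheme concrete, here is a minimal sketch of that replacement in plain Python rather than CodeQL. It is purely lexical: it only looks at the leading whitespace of each physical line. The function name `replace_indentation`, the `width` parameter (spaces per indentation unit) and the `sep` separator placed after the token are all hypothetical choices for illustration, not part of any existing tool.

```python
import re


def replace_indentation(source: str, width: int = 4, sep: str = " ") -> str:
    """Replace leading indentation units with a single TABn token per line.

    One unit is either one tab or `width` consecutive spaces (assumption).
    Whitespace left over after the last full unit is kept untouched.
    """
    unit = re.compile(r"\t| {%d}" % width)
    out = []
    for line in source.split("\n"):
        pos = 0
        level = 0
        # Consume indentation units one at a time from the start of the line.
        while True:
            match = unit.match(line, pos)
            if match is None:
                break
            pos = match.end()
            level += 1
        if level:
            out.append(f"TAB{level}{sep}{line[pos:]}")
        else:
            # No leading unit: the line is left exactly as it was.
            out.append(line)
    return "\n".join(out)


sample = "a = 1\n\tb = 2\n\t  c = 3\n\t\td = 4\n        e = 5"
print(replace_indentation(sample))
```

Because the sketch works on physical lines only, it also rewrites leading whitespace inside multi-line strings and bracketed continuation lines, so the resulting TABn value is the visual indentation and not necessarily the block nesting depth.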
-
I am a new user of CodeQL and I apologize in advance if my question seems naive.
I want to infer the indentation level of each line of a Python script, ideally including comment lines. Traversing the AST produced by Python's built-in `ast` module seems like a good start, but I found that it doesn't model comments or blank lines. Moreover, some constructs need special handling to track indentation increases and decreases properly. Can I get similar data from CodeQL's AstNode? If so, could anyone please show me some sample code or queries?
Thank you!