Analyzer

This document was generated from 'src/documentation/wiki-analyzer.ts' on 2025-12-01, 16:14:34 UTC presenting an overview of flowR's analyzer (v2.6.3, using R v4.5.0). Please do not edit this file/wiki page directly.

Overview
- Overview of the Analyzer
- Conducting Analyses
Builder Configuration
Plugins
- Plugin Types
  Dependency Identification, Project Discovery, File Loading, and Loading Order
- How to add a new plugin
Context Information
Caching

Overview

No matter whether you want to analyze a single R script, a couple of R notebooks, a complete project, or an R package, your journey starts with the FlowrAnalyzerBuilder (further described in Builder Configuration below). This builder allows you to configure the analysis in many different ways, for example, by specifying which plugins to use or what engine to use for the analysis.

When building the FlowrAnalyzer instance, the builder will take care to

- load the requested plugins
- set up an initial context
- create a cache for speeding up future analyses
- initialize the engine (e.g., TreeSitter) if needed

The builder provides two methods for building the analyzer:

- FlowrAnalyzerBuilder::build
  for an asynchronous build process that also initializes the engine if needed
- FlowrAnalyzerBuilder::buildSync
  for a synchronous build process, which requires that the engine (e.g., TreeSitter) has already been initialized before calling this method. As engines only have to be initialized once per process, this method is often more convenient to use.
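The relationship between the two build methods can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions and not flowR's implementation: `SketchEngine` and `SketchAnalyzerBuilder` are hypothetical stand-ins, and the "analyzer" is reduced to a plain string.

```typescript
// Hypothetical stand-in for an engine that has to be initialized once per process.
class SketchEngine {
    private static ready = false;

    /** Asynchronous one-time initialization (e.g., loading a parser). */
    static async init(): Promise<void> {
        SketchEngine.ready = true;
    }

    static isReady(): boolean {
        return SketchEngine.ready;
    }
}

// Hypothetical builder mirroring the build/buildSync split described above.
class SketchAnalyzerBuilder {
    /** Initializes the engine if needed, then delegates to the synchronous build. */
    async build(): Promise<string> {
        if (!SketchEngine.isReady()) {
            await SketchEngine.init();
        }
        return this.buildSync();
    }

    /** Only works if the engine has been initialized before. */
    buildSync(): string {
        if (!SketchEngine.isReady()) {
            throw new Error('engine not initialized, use build() or initialize it first');
        }
        return 'analyzer';
    }
}
```

In this sketch, the first call has to go through `build()` (or an explicit `SketchEngine.init()`); every later builder in the same process can then use `buildSync()`.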

For more information on how to configure the builder, please refer to the Builder Configuration section below.

Overview of the Analyzer

Once you have created an analyzer instance, you can add R files, folders, or even entire projects for analysis using the FlowrAnalyzer::addRequest method. All loaded plugins are applied automatically during the analysis. Please note that adding new files after you have already requested analysis results may invalidate larger parts of the cached results and trigger a re-analysis of previously analyzed files. With the files context, you can also add virtual files for the analysis to consider, or overwrite existing files with modified content. For this, have a look at the FlowrAnalyzer::addFile method.

Note

If you want to quickly try out the analyzer, you can use the following code snippet that analyzes a simple R expression:

const analyzer = await new FlowrAnalyzerBuilder()
    .setEngine('tree-sitter')
    .build();
// register a simple inline text-file for analysis
analyzer.addRequest('x <- 1; print(x)');
// get the dataflow
const df = await analyzer.dataflow();
// obtain the identified loading order
console.log(analyzer.inspectContext().files.loadingOrder.getLoadingOrder());
// run a dependency query
const results = await analyzer.query([{ type: 'dependencies' }]);

To reset the analysis (e.g., to provide new requests) you can use FlowrAnalyzer::reset. If you need to pre-compute analysis results (e.g., to speed up future queries), you can use FlowrAnalyzer::runFull.

Conducting Analyses

Please make sure to add all of the files, folders, and projects you want to analyze using the FlowrAnalyzer::addRequest method (or FlowrAnalyzer::addFile for virtual files). Afterwards, you can request different kinds of analysis results, such as:

- FlowrAnalyzer::parse to get the parsed information by the respective engine
  You can also use FlowrAnalyzer::peekParse to inspect the parse information if it was already computed (but without triggering a computation). With FlowrAnalyzer::parserInformation, you get additional information on the parser used for the analysis.
- FlowrAnalyzer::normalize to compute the Normalized AST
  Likewise, FlowrAnalyzer::peekNormalize returns the normalized AST if it was already computed but without triggering a computation.
- FlowrAnalyzer::dataflow to compute the Dataflow Graph
  Again, FlowrAnalyzer::peekDataflow allows you to inspect the dataflow graph if it was already computed (but without triggering a computation).
- FlowrAnalyzer::controlflow to compute the Control Flow Graph
  Also, FlowrAnalyzer::peekControlflow returns the control flow graph if it was already computed but without triggering a computation.
- FlowrAnalyzer::query to run queries on the analyzed code
- FlowrAnalyzer::runSearch to run a search query on the analyzed code using the search API
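The difference between a computing accessor and its `peek` counterpart can be sketched as follows. This is a minimal illustration and not flowR's implementation: `SketchStage` and its methods are hypothetical, and the sketch is synchronous for brevity, whereas the analysis methods of the analyzer are awaited in the snippet shown earlier.

```typescript
// Hypothetical sketch: a lazily computed result with a non-triggering peek.
class SketchStage<T> {
    private cached: T | undefined = undefined;
    private runs = 0;

    constructor(private readonly compute: () => T) {}

    /** Returns the result, computing it first if it is not available yet. */
    get(): T {
        if (this.cached === undefined) {
            this.runs++;
            this.cached = this.compute();
        }
        return this.cached;
    }

    /** Returns the result only if it was computed before; never computes. */
    peek(): T | undefined {
        return this.cached;
    }

    /** Number of times the computation actually ran. */
    computations(): number {
        return this.runs;
    }
}

const sketchParse = new SketchStage(() => ['x <- 1', 'print(x)']);
const beforeGet = sketchParse.peek();   // nothing computed yet
const parsed = sketchParse.get();       // triggers the computation
const afterGet = sketchParse.peek();    // returns the stored result
```

A `peek` is useful whenever you only want to display or reuse a result opportunistically, without paying for the computation if nobody requested it so far.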

We are working on providing a set of example repositories that demonstrate how to use the analyzer in different scenarios:

- flowr-analysis/sample-analyzer-project-query for an example project that runs queries on an R project
- flowr-analysis/sample-analyzer-df-diff for an example project that compares dataflow graphs

Builder Configuration

If you are interested in all available options, have a look at the Builder Reference below. The following sections highlight some of the most important configuration options:

How to configure flowR
How to configure the engine
How to register plugins

Configuring flowR

You can fundamentally change the behavior of flowR using the config file, embedded in the interface FlowrConfigOptions. With the builder you can either provide a complete configuration or amend the default configuration using:

FlowrAnalyzerBuilder::setConfig to set a complete configuration
FlowrAnalyzerBuilder::amendConfig to amend the default configuration

By default, the builder uses flowR's standard configuration obtained with defaultConfigOptions.
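The difference between setting and amending a configuration can be sketched as follows. This is a minimal illustration under stated assumptions: `SketchConfig`, its fields, and `SketchConfigBuilder` are hypothetical and do not reflect the actual shape of FlowrConfigOptions or the exact signatures of the builder methods.

```typescript
// Hypothetical, simplified config shape; the real FlowrConfigOptions has different fields.
interface SketchConfig {
    readonly verbose: boolean;
    readonly limits: { readonly maxFiles: number; readonly followLinks: boolean };
}

// Hypothetical defaults standing in for defaultConfigOptions.
const sketchDefaults: SketchConfig = {
    verbose: false,
    limits: { maxFiles: 10, followLinks: true }
};

class SketchConfigBuilder {
    private config: SketchConfig = sketchDefaults;

    /** Replace the whole configuration. */
    setConfig(config: SketchConfig): this {
        this.config = config;
        return this;
    }

    /** Derive a new configuration from the one currently held. */
    amendConfig(amend: (current: SketchConfig) => SketchConfig): this {
        this.config = amend(this.config);
        return this;
    }

    currentConfig(): SketchConfig {
        return this.config;
    }
}

// Only one nested value changes, everything else is kept from the defaults.
const amendedConfig = new SketchConfigBuilder()
    .amendConfig(c => ({ ...c, limits: { ...c.limits, maxFiles: 3 } }))
    .currentConfig();
```

Amending is usually the less error-prone choice, as you do not have to repeat (and keep in sync) all options you do not care about.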

Note

During the analysis with the FlowrAnalyzer, you can also access the configuration with the FlowrAnalyzerContext.

Configuring the Engine

FlowR supports multiple engines for parsing and analyzing R code. With the builder, you can select the engine to use with:

FlowrAnalyzerBuilder::setEngine to set the desired engine.
FlowrAnalyzerBuilder::setParser to set a specific parser implementation.

By default, the builder uses the TreeSitter engine with the TreeSitter parser. The builder also takes care to initialize the engine if needed during the asynchronous build process with FlowrAnalyzerBuilder::build. If you want to use the synchronous build process with FlowrAnalyzerBuilder::buildSync, please ensure that the engine has already been initialized before calling this method.

Configuring Plugins

There are various ways for you to register plugins with the builder, exemplified by the following snippet relying on the FlowrAnalyzerBuilder::registerPlugins method:

const analyzer = await new FlowrAnalyzerBuilder(false)
    .registerPlugins(
        'file:description',
        new FlowrAnalyzerQmdFilePlugin(),
        ['file:rmd', [/.*.rmd/i]]
    )
    .build();

This illustrates three ways to add a new plugin:

- By using a predefined name (e.g., file:description for the FlowrAnalyzerDescriptionFilePlugin)
  These mappings are controlled by the registerPluginMaker function in the PluginRegistry. Under the hood, this relies on makePlugin to create the plugin instance from the name.
- By providing an already instantiated plugin (e.g., the new FlowrAnalyzerQmdFilePlugin instance)
  You can pass these by reference, instantiating any class that conforms to the plugin specification.
- By providing a tuple of the plugin name and its constructor arguments (e.g., ['file:rmd', [/.*.rmd/i]] for the FlowrAnalyzerRmdFilePlugin)
  This will also use the makePlugin function under the hood to create the plugin instance.

Please note that by passing false to the builder constructor, no default plugins (see FlowrAnalyzerPluginDefaults) are registered (otherwise, all of the plugins in the example above would be registered by default). If you want to unregister specific plugins, you can use the FlowrAnalyzerBuilder::unregisterPlugins method.
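How a registry can resolve these three forms (name, instance, or tuple of name and arguments) is sketched below. This is a minimal, self-contained illustration: `SketchPlugin`, `sketchRegistry`, and `resolveSketchPlugin` are hypothetical and only mimic the idea behind registerPluginMaker and makePlugin, not their actual signatures.

```typescript
// Hypothetical stand-ins; not flowR's actual plugin classes or registry.
abstract class SketchPlugin {
    abstract readonly name: string;
}

class SketchRmdPlugin extends SketchPlugin {
    readonly name = 'file:rmd';
    constructor(readonly pattern: RegExp = /\.rmd$/i) {
        super();
    }
}

class SketchQmdPlugin extends SketchPlugin {
    readonly name = 'file:qmd';
}

type SketchPluginMaker = (...args: any[]) => SketchPlugin;

// Maps a name to a function creating the plugin from constructor arguments.
const sketchRegistry = new Map<string, SketchPluginMaker>([
    ['file:rmd', (pattern?: RegExp) => new SketchRmdPlugin(pattern)],
    ['file:qmd', () => new SketchQmdPlugin()]
]);

// A plugin can be given by name, as an instance, or as [name, constructor arguments].
type SketchPluginSpec = string | SketchPlugin | [string, unknown[]];

function resolveSketchPlugin(spec: SketchPluginSpec): SketchPlugin {
    if (spec instanceof SketchPlugin) {
        return spec; // instances are passed through by reference
    }
    const name = typeof spec === 'string' ? spec : spec[0];
    const args = typeof spec === 'string' ? [] : spec[1];
    const maker = sketchRegistry.get(name);
    if (maker === undefined) {
        throw new Error(`unknown plugin: ${name}`);
    }
    return maker(...args);
}
```

With this, `resolveSketchPlugin('file:qmd')`, `resolveSketchPlugin(new SketchQmdPlugin())`, and `resolveSketchPlugin(['file:rmd', [/\.Rmd$/]])` all yield a plugin instance.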

Note

If you directly access the API, please prefer creating the objects yourself by instantiating the respective classes instead of relying on the plugin registry. This avoids the indirection and potential issues with naming collisions in the registry. Moreover, this allows you to directly provide custom configuration to the plugin constructors in a readable fashion, and to re-use plugin instances. Instantiation by text is mostly for serialized communications (e.g., via a CLI or config format).

For more information on the different plugin types and how to create new plugins, please refer to the Plugins section below.

Builder Reference

The builder provides a plethora of methods to configure the resulting analyzer instance:

- FlowrAnalyzerBuilder::amendConfig
  Apply an amendment to the configuration the builder currently holds. By default, the defaultConfigOptions are used.
- FlowrAnalyzerBuilder::registerPlugins
  Register one or multiple additional plugins. For the default plugin set, please refer to FlowrAnalyzerPluginDefaults; they can be registered by passing true to the FlowrAnalyzerBuilder constructor.
- FlowrAnalyzerBuilder::setConfig
  Overwrite the configuration used by the resulting analyzer.
- FlowrAnalyzerBuilder::setEngine
  Set the engine and hence the parser that will be used by the analyzer. This is an alternative to FlowrAnalyzerBuilder::setParser if you do not have a parser instance at hand.
- FlowrAnalyzerBuilder::setInput
  Additional parameters for the analyses.
- FlowrAnalyzerBuilder::setParser
  Set the parser instance used by the analyzer. This is an alternative to FlowrAnalyzerBuilder::setEngine if you already have a parser instance. Please be aware that if you want to parallelize multiple analyzers, there should be separate parser instances.
- FlowrAnalyzerBuilder::unregisterPlugins
  Remove one or multiple plugins.

To build the analyzer after you have configured the builder, you can use one of the following:

- FlowrAnalyzerBuilder::build
  Create the FlowrAnalyzer instance using the given information. Please note that the only reason this is async is that if no parser is set, we need to retrieve the default engine instance, which is an async operation. If you have already initialized the engine (e.g., with TreeSitterExecutor::initTreeSitter), you can use the synchronous version FlowrAnalyzerBuilder::buildSync instead.
- FlowrAnalyzerBuilder::buildSync
  Synchronous version of FlowrAnalyzerBuilder::build. Please only use this if you have set the parser using FlowrAnalyzerBuilder::setParser before, otherwise an error will be thrown.

Plugins

Plugins allow you to extend the capabilities of the analyzer in many different ways. For example, they can be used to support other file formats, or to provide new algorithms to determine the loading order of files in a project. All plugins have to extend the FlowrAnalyzerPlugin base class and specify their PluginType. During the analysis, the analyzer will apply all registered plugins of the different types at the appropriate stages of the analysis. If you just want to use these plugins, you can usually ignore their type and just register them with the builder as described in the Builder Configuration section above. However, if you want to create new plugins, you should be aware of the different plugin types and when they are applied during the analysis.

Currently, flowR ships with the following built-in plugins:

| Name | Class | Type | Description |
|------|-------|------|-------------|
| `file:description` | `FlowrAnalyzerDescriptionFilePlugin` | file-load | This plugin provides support for R `DESCRIPTION` files. |
| `file:ipynb` | `FlowrAnalyzerJupyterFilePlugin` | file-load | The plugin provides support for Jupyter (`.ipynb`) files |
| `file:qmd` | `FlowrAnalyzerQmdFilePlugin` | file-load | The plugin provides support for Quarto R Markdown (`.qmd`) files |
| `file:rmd` | `FlowrAnalyzerRmdFilePlugin` | file-load | The plugin provides support for R Markdown (`.rmd`) files |
| `loading-order:description` | `FlowrAnalyzerLoadingOrderDescriptionFilePlugin` | loading-order | This plugin extracts loading order information from R `DESCRIPTION` files. It looks at the `Collate` field to determine the order in which files should be loaded. If no `Collate` field is present, it does nothing. |
| `versions:description` | `FlowrAnalyzerPackageVersionsDescriptionFilePlugin` | package-versions | This plugin extracts package versions from R `DESCRIPTION` files. It looks at the `Depends` and `Imports` fields to find package names and their version constraints. |

Plugin Types

During the construction of a new FlowrAnalyzer, plugins of different types are applied at different stages of the analysis. These plugins are grouped by their PluginType and are applied in the following order (as shown in the documentation of the PluginType):

┌───────────┐   ┌───────────────────┐   ┌─────────────┐   ┌───────────────┐   ┌───────┐
│           │   │                   │   │             │   │               │   │       │
│ *Builder* ├──>│ Project Discovery ├──>│ File Loader ├──>│ Dependencies  ├──>│ *DFA* │
│           │   │  (if necessary)   │   │             │   │   (static)    │   │       │
└───────────┘   └───────────────────┘   └──────┬──────┘   └───────────────┘   └───────┘
                                               │                                  ▲
                                               │          ┌───────────────┐       │
                                               │          │               │       │
                                               └─────────>│ Loading Order ├───────┘
                                                          │               │
                                                          └───────────────┘

Please note that every plugin type has a default implementation (e.g., see defaultPlugin) that is always active. We describe the different plugin types in more detail below.
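The staged application of plugins shown in the diagram can be sketched as follows. This is a minimal illustration under stated assumptions: the type name `'project-discovery'` is made up for this sketch (only `file-load`, `loading-order`, and `package-versions` appear in the table above), and `groupByStage` is a hypothetical helper, not part of flowR.

```typescript
// Hypothetical type names; 'project-discovery' is assumed for illustration only.
type SketchPluginType = 'project-discovery' | 'file-load' | 'package-versions' | 'loading-order';

interface SketchTypedPlugin {
    readonly name: string;
    readonly type: SketchPluginType;
}

// Stage order following the diagram. Dependencies and loading order both come after
// file loading; their relative order in this array is an arbitrary choice of the sketch.
const sketchStageOrder: readonly SketchPluginType[] = [
    'project-discovery', 'file-load', 'package-versions', 'loading-order'
];

/** Groups the registered plugins by stage, keeping the registration order within a stage. */
function groupByStage(plugins: readonly SketchTypedPlugin[]): SketchTypedPlugin[][] {
    return sketchStageOrder.map(stage => plugins.filter(p => p.type === stage));
}

const sketchStages = groupByStage([
    { name: 'loading-order:description', type: 'loading-order' },
    { name: 'file:rmd', type: 'file-load' },
    { name: 'versions:description', type: 'package-versions' },
    { name: 'file:description', type: 'file-load' }
]);
```

Independent of the order in which the plugins were registered, `sketchStages` lists the file-loading plugins before the plugins that rely on the loaded files.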

Project Discovery

These plugins trigger when confronted with a project analysis request (see RProjectAnalysisRequest). Their job is to identify the files that belong to the project and add them to the analysis. flowR provides the FlowrAnalyzerProjectDiscoveryPlugin with a defaultPlugin as the default implementation that simply collects all R source files in the given folder.

Please note that all project discovery plugins should conform to the FlowrAnalyzerProjectDiscoveryPlugin base class.

File Loading

These plugins register for every file encountered by the files context and determine whether and how they can process the file. They are responsible for transforming the raw file content into a representation that flowR can work with during the analysis. For example, the FlowrAnalyzerDescriptionFilePlugin adds support for R DESCRIPTION files by parsing their content into key-value pairs. These can then be used by other plugins, e.g. the FlowrAnalyzerPackageVersionsDescriptionFilePlugin that extracts package version information from these files.

If multiple file plugins could apply (DefaultFlowrAnalyzerFilePlugin::applies) to the same file, the loading order of these plugins determines which plugin gets to process the file. Please ensure that no two file plugins apply to the same file, as this could lead to unexpected behavior. Also, make sure that all file plugins conform to the FlowrAnalyzerFilePlugin base class.
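Selecting a file plugin and detecting overlapping plugins can be sketched as follows. This is a minimal illustration, not flowR's implementation: the `SketchFilePlugin` interface and the `selectFilePlugin` helper are hypothetical, and only the idea of an `applies` check is taken from the text above.

```typescript
// Hypothetical, reduced plugin interface with just the applicability check.
interface SketchFilePlugin {
    readonly name: string;
    applies(path: string): boolean;
}

const sketchRmdFiles: SketchFilePlugin = { name: 'rmd', applies: p => /\.rmd$/i.test(p) };
const sketchQmdFiles: SketchFilePlugin = { name: 'qmd', applies: p => /\.qmd$/i.test(p) };
// Deliberately too broad: overlaps with both plugins above.
const sketchAllMarkdown: SketchFilePlugin = { name: 'any-md', applies: p => /md$/i.test(p) };
const sketchFallback: SketchFilePlugin = { name: 'default', applies: () => true };

interface SketchSelection {
    readonly plugin: SketchFilePlugin;
    /** true if more than one registered plugin claimed the file */
    readonly ambiguous: boolean;
}

/** The first applicable plugin (in registration order) wins; otherwise the fallback is used. */
function selectFilePlugin(
    plugins: readonly SketchFilePlugin[],
    path: string,
    fallback: SketchFilePlugin
): SketchSelection {
    const candidates = plugins.filter(p => p.applies(path));
    return {
        plugin: candidates.length > 0 ? candidates[0] : fallback,
        ambiguous: candidates.length > 1
    };
}
```

The `ambiguous` flag makes the situation the text warns about visible: with `sketchAllMarkdown` registered next to `sketchRmdFiles`, the result for `report.Rmd` depends on the registration order.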

Dependency Identification

These plugins should identify which R packages are required, and in which versions, for the analysis. This information is then used to set up the R environment for the analysis correctly. For example, the FlowrAnalyzerPackageVersionsDescriptionFilePlugin extracts package version information from DESCRIPTION files to identify the required packages and their versions.

All dependency identification plugins should conform to the FlowrAnalyzerPackageVersionsPlugin base class.
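To give an idea of what extracting package names and version constraints from a `Depends` or `Imports` field involves, consider the following sketch. It is a simplified illustration and not the code of the FlowrAnalyzerPackageVersionsDescriptionFilePlugin: `SketchDependency` and `parseDependencyField` are hypothetical, and the constraint is kept as an uninterpreted string.

```typescript
// Hypothetical result shape: a package name with an optional, uninterpreted constraint.
interface SketchDependency {
    name: string;
    constraint?: string;
}

/**
 * Parses the value of a Depends/Imports field, i.e., a comma-separated list of
 * package names, each optionally followed by a version constraint in parentheses.
 */
function parseDependencyField(field: string): SketchDependency[] {
    return field
        .split(',')
        .map(entry => entry.trim())
        .filter(entry => entry.length > 0)
        .map(entry => {
            const match = /^([A-Za-z][A-Za-z0-9.]*)\s*(?:\(([^)]*)\))?$/.exec(entry);
            if (match === null) {
                throw new Error(`cannot parse dependency entry: ${entry}`);
            }
            const constraint = match[2] === undefined ? '' : match[2].trim();
            return constraint.length > 0 ? { name: match[1], constraint } : { name: match[1] };
        });
}

// Field values may span multiple lines in a DESCRIPTION file.
const sketchDeps = parseDependencyField('R (>= 4.0.0),\n    dplyr (>= 1.1.0), rlang');
```

Here, `sketchDeps` contains three entries, with `R` and `dplyr` carrying the constraints `>= 4.0.0` and `>= 1.1.0`, and `rlang` having no constraint.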

Loading Order

These plugins determine the order in which files are loaded and analyzed. This is crucial for correctly understanding the dependencies between files and for improving the analyses, especially in larger projects. For example, the FlowrAnalyzerLoadingOrderDescriptionFilePlugin provides a basic implementation that orders files based on the specification in a DESCRIPTION file, if present.

All loading order plugins should conform to the FlowrAnalyzerLoadingOrderPlugin base class.
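Deriving a loading order from a `Collate` field can be sketched as follows. This is a simplified illustration, not the code of the FlowrAnalyzerLoadingOrderDescriptionFilePlugin: `parseCollate` and `orderByCollate` are hypothetical helpers, and placing files that are not listed in the field at the end is an assumption of this sketch.

```typescript
/** Splits a Collate field into file names; names may be quoted with ' or ". */
function parseCollate(field: string): string[] {
    const names: string[] = [];
    const token = /'([^']*)'|"([^"]*)"|(\S+)/g;
    let m: RegExpExecArray | null;
    while ((m = token.exec(field)) !== null) {
        // exactly one of the three groups matched
        names.push(m[1] ?? m[2] ?? m[3]);
    }
    return names;
}

/**
 * Orders files according to the Collate listing. Files not mentioned in the listing
 * keep their relative order and are placed last (an assumption of this sketch).
 */
function orderByCollate(files: readonly string[], collate: readonly string[]): string[] {
    const rank = (file: string): number => {
        const index = collate.indexOf(file);
        return index === -1 ? collate.length : index;
    };
    return [...files].sort((a, b) => rank(a) - rank(b));
}

const sketchCollate = parseCollate('\'aaa.R\' \'zzz.R\'\n    "main file.R" util.R');
const sketchOrder = orderByCollate(['util.R', 'extra.R', 'zzz.R', 'aaa.R'], ['aaa.R', 'zzz.R', 'util.R']);
```

If no `Collate` field is present, such a plugin has nothing to contribute, which matches the "it does nothing" behavior described in the plugin table.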

How to add a new plugin

If you want to create a new plugin, you first have to decide which type of plugin you need (see Plugin Types above). Then, you must create a new class that extends the corresponding base class (e.g., FlowrAnalyzerFilePlugin for file loading plugins). In general, most plugins operate on the context information provided by the analyzer. It is usually a good idea to have a look at the existing plugins of the same type to get an idea of how to implement your own plugin.

Once you have your plugin you should register it with a sensible name using the registerPluginMaker function. This will allow users to register your plugin easily by name using the builder's FlowrAnalyzerBuilder::registerPlugins method. Otherwise, users will have to provide an instance of your plugin class directly.

Context Information

The FlowrAnalyzer provides various context information during the analysis. You can access the context with FlowrAnalyzer::inspectContext to receive a read-only view of the current analysis context. Likewise, you can use FlowrAnalyzerContext::inspect to get a read-only view of a given context. These read-only views prevent you from accidentally modifying the context during the analysis, which may cause inconsistencies (modifications should instead go through wrapping methods or plugins). The context is divided into multiple sub-contexts, each responsible for a specific aspect of the analysis. These sub-contexts are described in more detail below.

For the general structure from an implementation perspective, please have a look at FlowrAnalyzerContext.
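The idea of handing out a read-only view of a mutable context can be sketched in TypeScript as follows. This is a minimal illustration and not flowR's implementation: `ReadOnlySketchContext` and `SketchContext` are hypothetical, and the sketch only restricts access at the type level.

```typescript
// Hypothetical read-only interface: exposes inspection, but no modification.
interface ReadOnlySketchContext {
    files(): readonly string[];
}

// Hypothetical full context, as it would be available to plugins.
class SketchContext implements ReadOnlySketchContext {
    private readonly known: string[] = [];

    addFile(path: string): void {
        this.known.push(path);
    }

    files(): readonly string[] {
        return this.known;
    }

    /** Returns the same object, but typed so that only the read-only API is visible. */
    inspect(): ReadOnlySketchContext {
        return this;
    }
}

const sketchContext = new SketchContext();
sketchContext.addFile('main.R');
const sketchView = sketchContext.inspect();
// sketchView.addFile('other.R'); // rejected by the type checker
const sketchKnownFiles = sketchView.files();
```

As the view is the same object, it always reflects the current state of the context; the protection in this sketch is purely a compile-time guarantee.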

Tip

If you need a context for testing or to create analyses with lower-level components, you can use either contextFromInput to create a context from input data (which lifts the old requestFromInput) or contextFromSources to create a context from source files (e.g., if you need a virtual file system).

If for whatever reason you need to reset the context during an analysis, you can use FlowrAnalyzerContext::reset. To pre-compute all possible information in the context before starting the main analysis, you can use FlowrAnalyzerContext::resolvePreAnalysis.

Files Context

First, let's have a look at the FlowrAnalyzerFilesContext class that provides access to the files to be analyzed and their loading order:

FlowrAnalyzerFilesContext
This is the analyzer file context to be modified by all plugins that affect the files. If you are interested in inspecting these files, refer to ReadOnlyFlowrAnalyzerFilesContext. Plugins, however, can use this context directly to modify files.
(Defined at ./src/project/context/flowr-analyzer-files-context.ts#L112)
View more (AbstractFlowrAnalyzerContext, ReadOnlyFlowrAnalyzerFilesContext)
- AbstractFlowrAnalyzerContext
  Abstract class representing the context, a context may be modified and enriched by plugins (see FlowrAnalyzerPlugin). Please use the specialized contexts like FlowrAnalyzerFilesContext or FlowrAnalyzerLoadingOrderContext to work with flowR and in general, use the FlowrAnalyzerContext to access the full project context.
  (Defined at ./src/project/context/abstract-flowr-analyzer-context.ts#L11)
- ReadOnlyFlowrAnalyzerFilesContext
  This is the read-only interface for the files context, which is used to manage all files known to the FlowrAnalyzer. It prevents you from modifying the available files, but allows you to inspect them (which is probably what you want when using the FlowrAnalyzer). If you are a FlowrAnalyzerProjectDiscoveryPlugin and want to modify the available files, you can use the FlowrAnalyzerFilesContext directly.
  (Defined at ./src/project/context/flowr-analyzer-files-context.ts#L61)

Using the available plugins, the files context categorizes files by their FileRole (e.g., source files or DESCRIPTION files) and makes them accessible by these roles (e.g., via FlowrAnalyzerFilesContext::getFilesByRole). It also provides methods to check whether a file exists (e.g., FlowrAnalyzerFilesContext::hasFile, FlowrAnalyzerFilesContext::exists) and to translate requests so they respect the context (e.g., FlowrAnalyzerFilesContext::resolveRequest).

For legacy reasons it also provides the list of files considered by the dataflow analysis via FlowrAnalyzerFilesContext::consideredFilesList.

Loading Order Context

Note

Please be aware that the loading order is inherently tied to the files context (as it determines which files are available for ordering). Hence, the FlowrAnalyzerLoadingOrderContext is accessible (only) via the FlowrAnalyzerFilesContext.

Here is the structure of the FlowrAnalyzerLoadingOrderContext that provides access to the identified loading order of files:

FlowrAnalyzerLoadingOrderContext
This context is responsible for managing the loading order of script files in a project, including guesses and known orders provided by FlowrAnalyzerLoadingOrderPlugins. If you are interested in inspecting these orders, refer to ReadOnlyFlowrAnalyzerLoadingOrderContext. Plugins, however, can use this context directly to modify order guesses.
(Defined at ./src/project/context/flowr-analyzer-loading-order-context.ts#L50)
View more (AbstractFlowrAnalyzerContext, ReadOnlyFlowrAnalyzerLoadingOrderContext)
- AbstractFlowrAnalyzerContext
  Abstract class representing the context, a context may be modified and enriched by plugins (see FlowrAnalyzerPlugin). Please use the specialized contexts like FlowrAnalyzerFilesContext or FlowrAnalyzerLoadingOrderContext to work with flowR and in general, use the FlowrAnalyzerContext to access the full project context.
  (Defined at ./src/project/context/abstract-flowr-analyzer-context.ts#L11)
- ReadOnlyFlowrAnalyzerLoadingOrderContext
  Read-only interface for the loading order context, which is used to determine the order in which script files are loaded in a project. This interface prevents you from modifying the available files, but allows you to inspect them (which is probably what you want when using the FlowrAnalyzer). If you are a FlowrAnalyzerLoadingOrderPlugin and want to modify the available orders, you can use the FlowrAnalyzerLoadingOrderContext directly.
  (Defined at ./src/project/context/flowr-analyzer-loading-order-context.ts#L14)

Using the available plugins, the loading order context determines the order in which files are loaded and analyzed by flowR's analyzer. You can inspect the identified loading order using FlowrAnalyzerLoadingOrderContext::getLoadingOrder. If there are multiple possible loading orders (e.g., due to circular dependencies), you can inspect the candidates using FlowrAnalyzerLoadingOrderContext::currentGuesses.

Dependencies Context

Here is the structure of the FlowrAnalyzerDependenciesContext that provides access to the identified dependencies and their versions, including the version of R:

FlowrAnalyzerDependenciesContext
This context is responsible for managing the dependencies of the project, including their versions and interplays with FlowrAnalyzerPackageVersionsPlugins. If you are interested in inspecting these dependencies, refer to ReadOnlyFlowrAnalyzerDependenciesContext.
(Defined at ./src/project/context/flowr-analyzer-dependencies-context.ts#L33)
View more (AbstractFlowrAnalyzerContext, ReadOnlyFlowrAnalyzerDependenciesContext)
- AbstractFlowrAnalyzerContext
  Abstract class representing the context, a context may be modified and enriched by plugins (see FlowrAnalyzerPlugin). Please use the specialized contexts like FlowrAnalyzerFilesContext or FlowrAnalyzerLoadingOrderContext to work with flowR and in general, use the FlowrAnalyzerContext to access the full project context.
  (Defined at ./src/project/context/abstract-flowr-analyzer-context.ts#L11)
- ReadOnlyFlowrAnalyzerDependenciesContext
  This is a read-only interface to the FlowrAnalyzerDependenciesContext. It prevents you from modifying the dependencies, but allows you to inspect them (which is probably what you want when using the FlowrAnalyzer). If you are a FlowrAnalyzerPackageVersionsPlugin and want to modify the dependencies, you can use the FlowrAnalyzerDependenciesContext directly.
  (Defined at ./src/project/context/flowr-analyzer-dependencies-context.ts#L13)

Probably the most important method is FlowrAnalyzerDependenciesContext::getDependency that allows you to query for a specific dependency by name.

Environment Context

Here is the structure of the FlowrAnalyzerEnvironmentContext that provides access to the built-in environment:

FlowrAnalyzerEnvironmentContext
This context is responsible for providing the built-in environment. It creates the built-in environment based on the configuration provided in the FlowrAnalyzerContext.
(Defined at ./src/project/context/flowr-analyzer-environment-context.ts#L45)
View more (ReadOnlyFlowrAnalyzerEnvironmentContext)
- ReadOnlyFlowrAnalyzerEnvironmentContext
  This is the read-only interface to the FlowrAnalyzerEnvironmentContext, which provides access to the built-in environment used during analysis.
  (Defined at ./src/project/context/flowr-analyzer-environment-context.ts#L13)

The environment context provides access to the built-in environment via FlowrAnalyzerEnvironmentContext::makeCleanEnv. It also provides the empty built-in environment, which only contains primitives, via FlowrAnalyzerEnvironmentContext::makeCleanEnvWithEmptyBuiltIns.

Caching

To speed up analyses, flowR provides a caching mechanism that stores intermediate results of the analysis. The cache is maintained by the FlowrAnalyzerCache class and is used automatically by the analyzer during the analysis. Under the hood, it relies on the PipelineExecutor to cache results of different pipeline stages.

Usually, you do not have to worry about the cache, as it is managed automatically by the analyzer. If you want to overwrite cache information, the analysis methods in FlowrAnalyzer (see Conducting Analyses above) usually provide an optional force parameter to control whether to use the cache or recompute the results.
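The interplay of cached pipeline stages and a force parameter can be sketched as follows. This is a minimal illustration under stated assumptions and not the FlowrAnalyzerCache: `SketchPipelineCache` is hypothetical, the stage results are plain strings, and forcing a stage only recomputes that stage in this sketch.

```typescript
// Hypothetical sketch of stage caching with a force flag.
type SketchStageName = 'parse' | 'normalize' | 'dataflow';

class SketchPipelineCache {
    private readonly results = new Map<SketchStageName, string>();
    /** Records every stage that was actually (re-)computed. */
    readonly log: SketchStageName[] = [];

    constructor(private readonly source: string) {}

    parse(force = false): string {
        return this.cachedOr('parse', force, () => `parsed(${this.source})`);
    }

    // Later stages build on the (possibly cached) results of earlier stages.
    normalize(force = false): string {
        return this.cachedOr('normalize', force, () => `normalized(${this.parse()})`);
    }

    dataflow(force = false): string {
        return this.cachedOr('dataflow', force, () => `dataflow(${this.normalize()})`);
    }

    private cachedOr(stage: SketchStageName, force: boolean, compute: () => string): string {
        const hit = this.results.get(stage);
        if (hit !== undefined && !force) {
            return hit;
        }
        this.log.push(stage);
        const fresh = compute();
        this.results.set(stage, fresh);
        return fresh;
    }
}

const sketchCache = new SketchPipelineCache('x <- 1');
sketchCache.dataflow();      // computes dataflow, which requests normalize and parse
sketchCache.dataflow();      // served from the cache
sketchCache.dataflow(true);  // recomputes dataflow, earlier stages stay cached
```

After these three calls, `sketchCache.log` records four computations: one per stage for the first call and one more for the forced dataflow.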

Currently maintained by Florian Sihler and Oliver Gerstl at Ulm University
