
Expose subgraph logs via subgraph GraphQL #6278

Open
fordN wants to merge 28 commits into master from ford/subgraph-logs-via-graphql

Conversation

Contributor

@fordN fordN commented Jan 15, 2026

This PR introduces a subgraph log storage and querying system for Graph Node. Subgraph logs can be queried through the GraphQL API via a new _logs field. The implementation supports multiple storage backends (File, Elasticsearch, Loki) with a consistent query interface.

What's new

GraphQL Query API

  • New _logs query field on all subgraph deployments
  • Filter by log level, timestamp range, and text search
  • Structured log entries with metadata (source location, arguments)
  • Pagination via first/skip and sort order via orderDirection

Storage Backends

  • File: JSON Lines format, one file per subgraph (good for development)
  • Elasticsearch: Enterprise search and analytics (production)
  • Loki: Grafana log aggregation, with basic auth support (production)
  • Disabled: Default when no [log_store] section is configured

Configuration via graph-node.toml

Log store is configured through a [log_store] section in the TOML config file, following the same pattern as [store], [chains], and [deployment].

Architecture

  • LogDrain: Write-side sink for each backend (File, Loki, Elasticsearch)
  • LogStore: Read-side query interface for each backend
  • LoggerFactory: Refactored for multi-backend log routing
  • Uses slog::Level directly (no custom LogLevel enum)
  • Parse failures are logged as warnings, not silently dropped
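As a rough illustration of the read side of this split, the query interface might look like the following. The names `LogStore`, `LogQuery`, `LogEntry`, and `NoOpLogStore` come from the PR description, but the fields and signatures below are simplified guesses (the PR uses `slog::Level` directly; a standalone `Level` enum is defined here only to keep the sketch self-contained, and the real trait is likely async):

```rust
// Illustrative sketch only; the actual trait in the PR may differ.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Level {
    Critical,
    Error,
    Warning,
    Info,
    Debug,
}

/// Filters built from the `_logs` GraphQL arguments.
pub struct LogQuery {
    pub level: Option<Level>,
    pub search: Option<String>,
    pub first: usize,
    pub skip: usize,
}

/// One structured log entry returned to the GraphQL layer.
pub struct LogEntry {
    pub timestamp: String,
    pub level: Level,
    pub text: String,
}

/// Read-side query interface implemented by each backend
/// (File, Elasticsearch, Loki).
pub trait LogStore {
    fn query_logs(&self, query: &LogQuery) -> Result<Vec<LogEntry>, String>;
}

/// Default backend when no [log_store] section is configured:
/// every query returns an empty result set.
pub struct NoOpLogStore;

impl LogStore for NoOpLogStore {
    fn query_logs(&self, _query: &LogQuery) -> Result<Vec<LogEntry>, String> {
        Ok(Vec::new())
    }
}
```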

Examples

Querying logs

{
  _logs(
    level: ERROR
    search: "timeout"
    from: "2024-01-15T00:00:00Z"
    to: "2024-01-16T00:00:00Z"
    first: 100
    orderDirection: desc
  ) {
    id
    timestamp
    level
    text
    arguments { key value }
    meta { module line column }
  }
}

Configuring log store backends

File-based (development):

[log_store]
backend = "file"
directory = "/var/log/graph-node/subgraph-logs"
retention_hours = 72
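With the file backend, each subgraph gets its own JSON Lines file, one JSON object per line. A single entry might look like the following (field names mirror the `_Log_` GraphQL type and the lowercase level serialization mentioned in the commits; the exact on-disk schema shown here is an assumption):

```json
{"timestamp":"2024-01-15T12:34:56Z","level":"error","text":"handler timeout","arguments":{"attempt":"3"},"meta":{"module":"mapping","line":42,"column":7}}
```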

Loki (production):

[log_store]
backend = "loki"
url = "https://loki.example.com"
tenant_id = "my-tenant"
username = "user"
password = "secret"

Elasticsearch (production):

[log_store]
backend = "elasticsearch"
url = "http://localhost:9200"
username = "elastic"
password = "changeme"
index = "subgraph"
timeout_secs = 10

@fordN fordN requested a review from dwerner January 15, 2026 18:20
@fordN fordN self-assigned this Jan 15, 2026
@fordN fordN added the enhancement, area/graphql, and logs labels Jan 15, 2026
@fordN fordN removed the request for review from dwerner January 15, 2026 21:58
@fordN fordN force-pushed the ford/subgraph-logs-via-graphql branch from 688827a to 120d61b on January 16, 2026 00:10
@fordN fordN requested a review from dwerner January 16, 2026 00:12
@fordN fordN force-pushed the ford/subgraph-logs-via-graphql branch from 120d61b to ee0f228 on January 16, 2026 01:31
@fordN fordN requested review from lutter and removed request for lutter January 16, 2026 02:04
@fordN fordN force-pushed the ford/subgraph-logs-via-graphql branch from a4432ca to 384bf35 on January 16, 2026 17:39
@lutter
Collaborator

lutter commented Jan 16, 2026

One thing I wonder about: should this be configured via environment variables or through graph-node.toml? I would lean towards the latter, since it's configuration that is unlikely to change often, and having it in the config file would let us express more complicated configurations (though admittedly, right now the config is not very complex)

@fordN fordN force-pushed the ford/subgraph-logs-via-graphql branch from 384bf35 to 881e55a on January 16, 2026 19:53
Contributor

@dwerner dwerner left a comment

Lgtm! I have a few questions/suggestions, but yolo

@DaMandal0rian
Contributor

Great work on this! We're planning to use the Loki drain in production as part of our ES decommission.

One blocker: our production Loki endpoint (loki-logs.thegraph.com) requires HTTP basic auth. The current LokiDrain implementation only supports GRAPH_LOG_STORE_LOKI_URL and GRAPH_LOG_STORE_LOKI_TENANT_ID — no credentials.

Could we add support for basic auth? Something like:

GRAPH_LOG_STORE_LOKI_USER=<username>
GRAPH_LOG_STORE_LOKI_PASSWORD=<password>

This would allow the reqwest client to attach an Authorization: Basic ... header on push requests. Without this, we'd need to deploy a sidecar proxy or an unauthenticated push endpoint, which adds operational complexity.
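For context, HTTP Basic auth simply base64-encodes `user:password` into the `Authorization` header value. A minimal sketch of building that value (the base64 encoder is hand-rolled here purely for illustration; real code would use the `base64` crate, or reqwest's `RequestBuilder::basic_auth`, rather than this helper):

```rust
// Illustrative only: encode bytes as standard base64 with '=' padding.
fn encode_base64(input: &[u8]) -> String {
    const TABLE: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::new();
    for chunk in input.chunks(3) {
        // Pack up to 3 bytes into a 24-bit group, zero-padded on the right.
        let b = [chunk[0], *chunk.get(1).unwrap_or(&0), *chunk.get(2).unwrap_or(&0)];
        let n = ((b[0] as u32) << 16) | ((b[1] as u32) << 8) | (b[2] as u32);
        let idx = [(n >> 18) & 63, (n >> 12) & 63, (n >> 6) & 63, n & 63];
        for (i, &code) in idx.iter().enumerate() {
            if i <= chunk.len() {
                out.push(TABLE[code as usize] as char);
            } else {
                out.push('='); // pad positions beyond the input bytes
            }
        }
    }
    out
}

/// Value for the `Authorization` header on Loki push requests.
fn basic_auth_header(username: &str, password: &str) -> String {
    format!(
        "Basic {}",
        encode_base64(format!("{}:{}", username, password).as_bytes())
    )
}
```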

@fordN fordN force-pushed the ford/subgraph-logs-via-graphql branch from fd061b4 to 74debd3 on April 1, 2026 17:35
@fordN
Contributor Author

fordN commented Apr 1, 2026

Great work on this! We're planning to use the Loki drain in production as part of our ES decommission.

One blocker: our production Loki endpoint (loki-logs.thegraph.com) requires HTTP basic auth. The current LokiDrain implementation only supports GRAPH_LOG_STORE_LOKI_URL and GRAPH_LOG_STORE_LOKI_TENANT_ID — no credentials.

Could we add support for basic auth? Something like:

GRAPH_LOG_STORE_LOKI_USER=<username>
GRAPH_LOG_STORE_LOKI_PASSWORD=<password>

This would allow the reqwest client to attach an Authorization: Basic ... header on push requests. Without this, we'd need to deploy a sidecar proxy or an unauthenticated push endpoint, which adds operational complexity.

Added! ✔️

@fordN
Contributor Author

fordN commented Apr 1, 2026

One thing I wonder about: should this be configured via environment variables or through graph-node.toml? I would lean towards the latter, since it's configuration that is unlikely to change often, and having it in the config file would let us express more complicated configurations (though admittedly, right now the config is not very complex)

Yeah, might as well keep configs in the toml file, easier to manage. I've made the change to that effect in commit 64fa5278.

fordN added 11 commits April 1, 2026 10:43
Introduces the foundation for the log store system with:
- LogStore trait for querying logs from backends
- LogLevel enum with FromStr trait implementation
- LogEntry and LogQuery types for structured log data
- LogStoreFactory for creating backend instances
- NoOpLogStore as default (disabled) implementation
Implements three log storage backends for querying logs:

- FileLogStore: Streams JSON Lines files with bounded memory usage
- ElasticsearchLogStore: Queries Elasticsearch indices with full-text search
- LokiLogStore: Queries Grafana Loki using LogQL

All backends implement the LogStore trait and support:
- Filtering by log level, timestamp range, and text search
- Pagination via first/skip parameters
- Returning structured LogEntry objects

Dependencies added: reqwest, serde_json for HTTP clients.
Implements slog drains for capturing and writing logs:

- FileDrain: Writes logs to JSON Lines files (one file per subgraph)
- LokiDrain: Writes logs to Grafana Loki via HTTP push API

Both drains:
- Capture structured log entries with metadata (module, line, column)
- Format logs with timestamp, level, text, and arguments
- Use efficient serialization with custom KVSerializers
Adds a configuration layer for selecting and configuring log backends:

- LogStoreConfig enum with variants: Disabled, File, Elasticsearch, Loki
- LogConfigProvider for loading config from environment variables and CLI args
- Unified GRAPH_LOG_STORE_* environment variable naming
- CLI arguments with --log-store-backend and backend-specific options
- Configuration precedence: CLI args > env vars > defaults
- Deprecation warnings for old config variables

Supported configuration:
- Backend selection (disabled, file, elasticsearch, loki)
- File: directory, max size, retention days
- Elasticsearch: endpoint, credentials, index, timeout
- Loki: endpoint, tenant ID
Refactors LoggerFactory to use LogStoreConfig instead of elastic-only:

- Replaced elastic_config with log_store_config parameter
- Build ElasticLoggingConfig on-demand from LogStoreConfig::Elasticsearch
- Support all log drain types (File, Loki, Elasticsearch)
- Maintain backward compatibility with existing elastic configuration

This enables the factory to create drains for any configured backend
while preserving the existing component logger patterns.
Adds GraphQL API for querying subgraph logs:

Schema types:
- LogLevel enum (CRITICAL, ERROR, WARNING, INFO, DEBUG)
- _Log_ type with id, timestamp, level, text, arguments, meta
- _LogArgument_ type for structured key-value pairs
- _LogMeta_ type for source location (module, line, column)

Query field (_logs) with filters:
- level: Filter by log level
- from/to: Timestamp range (ISO 8601)
- search: Text search in log messages
- first/skip: Pagination (max 1000, skip max 10000)
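The pagination caps above (first capped at 1000, skip at 10000) could be enforced with a simple clamp before the query reaches the backend. This sketch silently clamps; the PR's actual validation might instead reject out-of-range values with a query error:

```rust
// Caps taken from the commit message above.
const MAX_FIRST: usize = 1_000;
const MAX_SKIP: usize = 10_000;

/// Clamp user-supplied pagination arguments to the documented limits.
/// A real API might return an error instead of clamping silently.
fn clamp_pagination(first: usize, skip: usize) -> (usize, usize) {
    (first.min(MAX_FIRST), skip.min(MAX_SKIP))
}
```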
Integrates _logs query into the GraphQL execution pipeline:

Execution layer:
- Execute _logs queries via log_store.query_logs()
- Convert LogEntry results to GraphQL response objects
- Handle log store errors gracefully

Query parsing:
- Recognize _logs as special query field
- Build LogQuery from GraphQL arguments
- Pass log_store to execution context

Service wiring:
- Create log store from configuration in launcher
- Provide log store to GraphQL runner
- Use NoOpLogStore in test environments

This completes the read path from GraphQL query to log storage backend.
Adds comprehensive integration test for _logs query:

Test implementation:
- Deploys logs-query subgraph and waits for sync
- Triggers contract events to generate logs
- Queries _logs field with various filters
- Verifies log entries are returned correctly
- Tests filtering by level and text search
- Create graph/src/log/common.rs for common log drain functionality
   - SimpleKVSerializer: Concatenates KV pairs to strings
   - VecKVSerializer: Collects KV pairs into Vec<(String, String)>
   - HashMapKVSerializer: Collects KV pairs into HashMap
   - LogMeta: Shared metadata structure (module, line, column)
   - LogEntryBuilder: Builder for common log entry fields
   - level_to_str(): Converts slog::Level to string
   - create_async_logger(): Consistent async logger creation
- Updated FileDrain, LokiDrain, and ElasticDrain to use the common log
utilities
fordN added 13 commits April 1, 2026 10:47
- include _logs in the set of special fields that bypass indexing error
shortcutting when subgraph failed
- add integration test to ensure _log queries return logs after subgraph
failed
- Keep logs within retention_hours of now, skipping cleanup if
--log-store-retention-hours=0
Use map_while instead of filter_map on lines() iterator to properly
handle read errors, and add missing orderDirection argument to the
_logs field in mock introspection JSON.
- Replace level: String with level: Level in ElasticLog, FileLogDocument,
and LokiLogDocument
- Add shared serialize_log_level to common.rs that serializes Level
as lowercase
- Remove level_to_str() and level_str()
- Add [log_store] section to graph-node.toml as the sole configuration
path for log store backends
- Remove env var (GRAPH_LOG_STORE_*) and CLI arg (--log-store-*)
config paths, LogStoreConfigProvider, and env var helper utilities
@fordN fordN force-pushed the ford/subgraph-logs-via-graphql branch from 74debd3 to f923ca3 on April 1, 2026 18:10
fordN added 2 commits April 1, 2026 11:43
- Use q::Pos::default() instead of Pos::default() in api.rs
- Add log_store field to Config constructors in tests
- Pass NoOpLogStore to GraphQlRunner in gnd test runner
- Update Cargo.lock
@fordN fordN force-pushed the ford/subgraph-logs-via-graphql branch from f923ca3 to fee13b2 on April 1, 2026 18:56
Switch from --postgres-url CLI args to --config with a generated TOML
file, so the [log_store] section is available for the logs-query test.
@fordN fordN requested a review from dwerner April 1, 2026 19:16
The --ethereum-rpc CLI mode defaults to archive,traces features, but
the generated TOML config had features = []. This caused subgraphs
that depend on those capabilities to fail during indexing.

4 participants