Skip to content

chore: add SQL comment logging with MDC context DHIS2-20921#22988

Draft
teleivo wants to merge 4 commits intomasterfrom
sql-logging
Draft

chore: add SQL comment logging with MDC context DHIS2-20921#22988
teleivo wants to merge 4 commits intomasterfrom
sql-logging

Conversation

@teleivo
Copy link
Contributor

@teleivo teleivo commented Feb 18, 2026

SQL Comment Logging

Prepends a SQL comment with request context to every SQL statement at the JDBC level using
datasource-proxy's QueryTransformer. Covers all queries (Hibernate HQL, Spring JDBC, native SQL)
on both the main and analytics data sources.

What it does

Every SQL statement gets a comment with up to 4 keys (controller, method, sessionId,
xRequestID). Missing keys are left out.

SQL snippet:

/* controller='TrackedEntitiesExportController',method='getTrackedEntities',
   sessionId='ID/lVK...',xRequestID='pr-22988-demo' */
select e.enrollmentid, e.uid, e.created, ...

DHIS2 application log (enable.query.logging = on):

[http-nio-8080-exec-1] o.h.d.datasource.query: ... Query:["/* controller='TrackedEntitiesExportController',
  method='getTrackedEntities',sessionId='IDZMH...',xRequestID='pr-22988-demo' */
  select program0_.programid as programi1_158_0_, ..."]

[pool-6-thread-1] o.h.d.datasource.query: ... Query:["/* sessionId='UID:DHIS2rocks1' */
  select uid from jobconfiguration where cancel = true"]

PostgreSQL server log (log_min_duration_statement = 0):

2026-02-19 05:52:40 UTC [43]: user=dhis,db=dhis LOG:  duration: 0.034 ms  execute <unnamed>:
  /* controller='TrackedEntitiesExportController',method='getTrackedEntities',
     sessionId='ID//Jlm...',xRequestID='pr-22988-pglog' */
  select program0_.programid ... from program program0_ where program0_.programid=$1

pg_stat_activity

/* controller='TrackedEntitiesExportController',method='getTrackedEntities',
   sessionId='IDD/YQP/...',xRequestID='pr-22988-demo' */
select e.enrollmentid, e.uid, e.created, ...

/* controller='TrackedEntitiesExportController',method='getTrackedEntities',
   sessionId='IDD/YQP/...',xRequestID='pr-22988-demo' */
select * from (select te.uid as te_uid, te.created, ...

pg_stat_statements

 calls | avg_ms | query
-------+--------+------
    14 |   0.06 | /* controller='...',method='...',xRequestID='pr-22988-demo' */ select program0_ ...
     7 |   0.04 | /* ... */ select programsta0_.programid ...
     1 |   1.48 | /* ... */ select e.enrollmentid, e.uid, e.created ...
     1 |   2.80 | /* ... */ select * from (select ev.uid as ev_uid ...
     1 |   0.34 | /* ... */ select * from (select te.uid as te_uid ...

DHIS2 Configuration

All settings in dhis.conf:

Key Default Purpose
logging.query.comments off Prepend SQL comments to statements
enable.query.logging off Log all SQL to org.hisp.dhis.datasource.query
slow.query.logging.threshold.time 1000 Threshold in ms for slow query warnings (only with enable.query.logging)
logging.request_id.enabled on Populate sessionId MDC key (hashed session ID)

Config interactions

We only create the proxy datasource when at least one of logging.query.comments or
enable.query.logging is on. Otherwise the raw datasource is used with zero overhead.

logging.query.comments enable.query.logging Result
off off No proxy. Raw datasource, zero overhead.
on off Comments prepended, no query log output. Useful when you want to observe queries via pg_stat_activity or PostgreSQL server logs without DHIS2 application log noise.
off on Query logging to application log, no comments. Queries go to org.hisp.dhis.datasource.query at INFO, slow queries at WARN.
on on Both. Comments prepended and queries logged (logged queries include the comment). The performance test dhis.conf uses this.

MDC keys in the comment

MDC key Source Always present?
controller SqlCommentInterceptor (MVC interceptor, always registered) Yes, for all Spring MVC requests
method SqlCommentInterceptor Yes, for all Spring MVC requests
sessionId SessionIdFilter (gated by logging.request_id.enabled, default on) Yes by default. Set logging.request_id.enabled = off to omit. Only for authenticated users.
xRequestID RequestIdFilter (reads X-Request-ID header, no config gate) Only if the client sends an X-Request-ID header

With default config and logging.query.comments = on:

  • API requests without X-Request-ID: /* controller='...',method='...',sessionId='...' */
  • API requests with X-Request-ID: all 4 keys
  • Background/scheduler queries (no MVC context): /* sessionId='...' */ only
  • Unauthenticated requests: no sessionId
  • With logging.request_id.enabled = off: no sessionId regardless of authentication

Security

xRequestID is user-supplied input embedded in a SQL comment. RequestIdFilter sanitizes it to
[-_a-zA-Z0-9]{1,36}, replacing anything else with (illegal). The other keys are safe:
controller and method come from Java class/method names, sessionId is a base64-encoded
SHA-256 hash.

PostgreSQL considerations

pg_stat_activity

pg_stat_activity.query gets truncated at track_activity_query_size (default 1024 bytes). The
comment adds ~100-150 bytes. Users with long queries (common in tracker) should increase this:

track_activity_query_size = 4096

pg_stat_statements

pg_stat_statements normalizes queries (replaces literals with $N) but preserves SQL comments.
So the same query shape with different comments ends up as separate entries. Since xRequestID
varies per request, you effectively get one entry per request instead of one per query shape.

This means:

  • pg_stat_statements.max (default 5000) fills up fast, causing constant eviction
  • Aggregating stats by query shape stops working

If you need pg_stat_statements aggregation:

  • Don't enable logging.query.comments in that environment
  • Or strip comments when querying: regexp_replace(query, '/\*.*?\*/', '')
  • Or use pg_stat_activity for per-request correlation and keep pg_stat_statements clean by
    leaving comments off

PostgreSQL 14+ has pg_stat_statements.track_utility and compute_query_id but neither strips
comments during normalization.

When to use DHIS2 logging vs PostgreSQL logging

Use case DHIS2 (logging.query.comments) PostgreSQL (log_min_duration_statement)
Which endpoint triggers a query Comment tags every query PG has no idea
Correlate queries within a request xRequestID groups them Have to correlate by connection/time
N+1 detection Count queries per request Harder
Slow query detection enable.query.logging + threshold log_min_duration_statement (more reliable)
Production monitoring with aggregation Breaks pg_stat_statements pg_stat_statements works fine
Query plan analysis No auto_explain logs plans
Overhead Microseconds per query (MDC read + string concat) Near zero for log_min_duration_statement

Use DHIS2 comments when you need to know which endpoint causes which queries (N+1 detection,
attributing load). Use PostgreSQL native logging (log_min_duration_statement, auto_explain,
pg_stat_statements) for production monitoring and aggregation.

Remove proxyResultSet() (unused proxy overhead), executeAfterMethod
stack-walking (expensive per JDBC call), redundant elapsed time
callback, and ANSI escape codes from log formatting (HIGHLIGHT to
BASIC). Deprecate METHOD_QUERY_LOGGING_ENABLED and
ELAPSED_TIME_QUERY_LOGGING_ENABLED for removal in v44.
…ansformer

Prepend SQL comments with controller, action, sessionId, and xRequestID
from MDC to every query when method.query.logging.enabled is on. This
makes queries visible in pg_stat_activity/pg_stat_statements without
needing DHIS2 query logging enabled.
Add javadoc warning that MDC values are embedded verbatim in SQL
comments and must be validated before being put into MDC. Enable
query logging and comments in test performance dhis.conf.
Propagate MDC context from the HTTP thread to the TRACKER-TE-FETCH
thread pool so SQL comments appear on queries executed by
TrackedEntityAggregate and EnrollmentAggregate.

Rename MDC key "action" to "method" for clarity.
@sonarqubecloud
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments