fix: strengthen data quality safeguards in refresh-stats workflow #362

Merged
pethers merged 9 commits into main from copilot/improve-refresh-stats-workflow
Apr 16, 2026

Contributor

Copilot AI commented Apr 16, 2026

  • Fix 1: Enhance isCredibleApiValue() with significant-drop guard
  • Fix 2: Fix countItems() partial data bug — return error field when pagination errors mid-way
  • Fix 3: Remove Promise.all() in Plenary Documents
  • Fix 4: Monthly data gating with MONTHLY_CAPABLE_FIELDS allowlist
  • Fix 5: Data reversion and 2026 commentary update
  • Address review round 1: docstring, monthly allowlist, commentary
  • Address review round 2: Fix max-page-limit return in countItems() to return partial count instead of null
  • Address review round 2: Extract isCredibleApiValue() into src/utils/credibilityCheck.ts with 20 unit tests covering all guard conditions and boundaries
  • Address review round 3: Remove unused constant imports from generate-stats.ts, rename misleading test case to reflect boundary semantics
  • Address review round 4: Monthly data merge (preserve existing metrics)
  • Address review round 5: Type-safe fieldMatch with RegExpExecArray | null and capture-group guards
  • Fix CI: Increase get_plenary_documents_feed integration test timeout from 120s to 180s

Copilot AI and others added 2 commits April 16, 2026 10:21
- Add significant-drop guard (>50% decrease from stored → reject as incomplete)
- Fix countItems() partial data bug — return error field on pagination errors
- Remove Promise.all() in Plenary Documents to avoid EP API rate limits
- Skip all-zero monthly distributions and gate monthly data by credibility check
- Revert corrupted 2025 data (speeches 1998→10000, documents 930→3516)
- Revert corrupted 2026 data (parliamentaryQuestions 1355→6147, documents 285→4265)
- Restore 2025 NI seats to 34 (consistent with mepCount 720)
- Remove corrupted monthly data (all-zero procedures, incomplete speeches)

Agent-Logs-Url: https://github.com/Hack23/European-Parliament-MCP-Server/sessions/cb6301dd-edef-4871-86d2-ca6d8fa5e8d9

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
@pethers pethers marked this pull request as ready for review April 16, 2026 10:28
Copilot AI review requested due to automatic review settings April 16, 2026 10:29
@github-actions github-actions bot added the infrastructure (CI/CD and build infrastructure), github_actions (GitHub Actions updates), and ep-data (European Parliament data handling) labels Apr 16, 2026
Contributor

Copilot AI left a comment

Pull request overview

Strengthens the refresh-stats automation to prevent scripts/generate-stats.ts --update from overwriting curated statistics in src/data/generatedStats.ts with incomplete/partial European Parliament API results, and restores previously corrupted stored values.

Changes:

  • Added stricter “credibility” guards for API-derived counts (including a significant-drop check) and prevented partial pagination results from being written back.
  • Switched Plenary Documents fetching from parallel to sequential to reduce EP API rate-limit pressure.
  • Added monthly-data gating to avoid persisting all-zero (or otherwise unreliable) monthly distributions; reverted corrupted yearly and political-landscape values in generatedStats.ts.
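
The move from parallel to sequential fetching can be sketched roughly as follows. This is a minimal illustration and not the script's actual code: `fetchSequentially`, the `Task` type, and the optional `delayMs` pause are hypothetical names introduced here.

```typescript
// Hypothetical sketch: run async tasks one at a time instead of via Promise.all(),
// so that at most one request is in flight against a rate-limited API.
type Task<T> = () => Promise<T>;

async function fetchSequentially<T>(tasks: Task<T>[], delayMs = 0): Promise<T[]> {
  const results: T[] = [];
  for (const task of tasks) {
    // Each task starts only after the previous one has settled.
    results.push(await task());
    if (delayMs > 0) {
      // Optional pause between requests (an assumption, not taken from the PR).
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return results;
}
```

The trade-off is longer wall-clock time in exchange for lower peak request pressure; results keep the order of the input tasks.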

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File | Description
src/data/generatedStats.ts | Reverts corrupted yearly totals and adjusts monthly/political-landscape stored data.
scripts/generate-stats.ts | Adds partial-fetch error marking, significant-drop credibility guard, sequential fetching for documents, and monthly write-back gating.
.github/workflows/refresh-stats.yml | Documents the new data-quality safeguards in workflow comments.

Comment thread scripts/generate-stats.ts
Comment on lines +361 to +369
  const message = err instanceof Error ? err.message : 'Unknown error';
+ // If we already counted some items before the error, the accumulated
+ // count is a lower bound but NOT necessarily the complete count.
+ // Return the partial count with an error marker so callers (--update)
+ // don't overwrite stored values with incomplete data.
  if (totalCount > 0) {
-   progress(`⚠️ ${label}: ${String(totalCount)} items (partial, error on page ${String(pageNum)})`);
-   return { total: totalCount };
+   const partialNote = `Partial count (${String(totalCount)} items) — error on page ${String(pageNum)}: ${message}`;
+   progress(`⚠️ ${label}: ${partialNote}`);
+   return { total: totalCount, error: partialNote };

Copilot AI Apr 16, 2026

countItems() now returns a non-null total even when the fetch is partial (and sets error to prevent --update writes). The function-level docstring above still says it returns total or null on failure, which is no longer accurate and can mislead future callers. Update the docstring to document the “partial total + error marker” contract explicitly (including that total is a lower bound when error is set).
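
One way a caller could act on the "partial total + error marker" contract is sketched below. The result shape follows the comment above; `classifyCount` and `mayWriteBack` are hypothetical helper names, not functions from the PR.

```typescript
// Result shape as described in the review comment.
interface CountResult {
  total: number | null;
  error?: string;
}

type CountStatus = 'complete' | 'partial' | 'failed';

// Classify a result: null total means nothing usable was retrieved,
// a total with an error marker is only a lower bound.
function classifyCount(result: CountResult): CountStatus {
  if (result.total === null) return 'failed';
  return result.error !== undefined ? 'partial' : 'complete';
}

// Hypothetical gate for --update: only complete counts may overwrite stored data.
function mayWriteBack(result: CountResult): boolean {
  return classifyCount(result) === 'complete';
}
```

Keeping the check on the `error` field rather than on `total` means a partial count can still be logged for diagnostics without ever being persisted.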

Comment thread scripts/generate-stats.ts
Comment on lines 1381 to 1396
  // Collect monthly data only for successful fetches (no note/error)
+ // and only when the monthly counts contain useful data (not all zeros).
+ // Monthly data is gated by the same credibility check as yearly totals
+ // to prevent writing incomplete monthly distributions.
  if (comparison.monthlyCounts && comparison.apiValue !== null) {
    const mField = METRIC_TO_FIELD[comparison.metric];
-   if (mField && isUpdatableField(mField)) {
-     if (!monthlyUpdates[yv.year]) {
-       monthlyUpdates[yv.year] = {};
+   const hasNonZeroMonth = comparison.monthlyCounts.some((c) => c > 0);
+   if (mField && isUpdatableField(mField) && hasNonZeroMonth) {
+     // Apply credibility check: only collect monthly data if the total is credible
+     const storedVal = comparison.storedValue;
+     if (isCredibleApiValue(comparison.apiValue, storedVal)) {
+       if (!monthlyUpdates[yv.year]) {
+         monthlyUpdates[yv.year] = {};
        }
        monthlyUpdates[yv.year][mField] = comparison.monthlyCounts;
      }

Copilot AI Apr 16, 2026

The monthly write-back logic keys off comparison.monthlyCounts + METRIC_TO_FIELD, which will also allow metrics like Procedures to be persisted into RAW_MONTHLY_DATA when they have a non-zero month. But procedure “monthly” counts are based on year-only extraction (YYYY-01-01), so storing them is misleading (and currently unused by distributeMonthly, which always synthetically distributes procedures). Consider restricting monthlyUpdates to the small allowlist of truly date-bucketable metrics (plenarySessions, speeches, events) or otherwise ensure non-date-capable metrics can’t be written into RAW_MONTHLY_DATA.
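
A minimal sketch of the suggested allowlist follows. The three field names come from the comment above; `MONTHLY_CAPABLE_FIELDS` as a `Set`, the `shouldCollectMonthly` helper, and the `totalIsCredible` flag are assumptions made for illustration.

```typescript
// Metrics with real per-month dates, as listed in the review comment.
const MONTHLY_CAPABLE_FIELDS: ReadonlySet<string> = new Set([
  'plenarySessions',
  'speeches',
  'events',
]);

// Hypothetical gate: a monthly series is collected only when the field is
// date-bucketable, at least one month is non-zero, and the yearly total
// already passed the credibility check (passed in as a boolean here).
function shouldCollectMonthly(
  field: string,
  monthlyCounts: readonly number[],
  totalIsCredible: boolean,
): boolean {
  if (!MONTHLY_CAPABLE_FIELDS.has(field)) return false;
  const hasNonZeroMonth = monthlyCounts.some((c) => c > 0);
  return hasNonZeroMonth && totalIsCredible;
}
```

With this gate, a year-only metric such as procedures is rejected even when its January bucket is non-zero.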

Comment thread src/data/generatedStats.ts Outdated
- { year: 2025, parliamentaryTerm: 'EP10 (2024-2029)', mepCount: 720, plenarySessions: 53, legislativeActsAdopted: 78, rollCallVotes: 420, committeeMeetings: 1980, parliamentaryQuestions: 4946, resolutions: 135, speeches: 1998, adoptedTexts: 347, procedures: 923, events: 2660, documents: 930, mepTurnover: 36, declarations: 3000, commentary: 'EP10 ramp-up year (720 MEPs from 27 EU countries). New committee chairs and rapporteurs established. Defence and security policy gained prominence. Strategic autonomy debates. Clean Industrial Deal proposals. Parliament adapting to new political balance with stronger ECR and right-wing presence.' },
- { year: 2026, parliamentaryTerm: 'EP10 (2024-2029)', mepCount: 718, plenarySessions: 54, legislativeActsAdopted: 114, rollCallVotes: 567, committeeMeetings: 2363, parliamentaryQuestions: 1355, resolutions: 180, speeches: 996, adoptedTexts: 104, procedures: 935, events: 2327, documents: 285, mepTurnover: 39, declarations: 1158, commentary: 'PARTIAL YEAR — data through Q1 2026. EP10 has 720 MEPs from 27 EU countries. Plenary session count (54) reflects full-year EP calendar; all other activity totals are projected full-year estimates based on 2021-2025 historical average with EP10 year-2 cycle adjustment. Actual Q1 activity: 10 plenary sittings completed (Jan-Feb), adopted texts and procedures accumulating. Defence spending, Clean Industrial Deal, and AI Act implementation dominating legislative agenda.' },
+ { year: 2025, parliamentaryTerm: 'EP10 (2024-2029)', mepCount: 720, plenarySessions: 53, legislativeActsAdopted: 78, rollCallVotes: 420, committeeMeetings: 1980, parliamentaryQuestions: 4946, resolutions: 135, speeches: 10000, adoptedTexts: 347, procedures: 923, events: 2660, documents: 3516, mepTurnover: 36, declarations: 3000, commentary: 'EP10 ramp-up year (720 MEPs from 27 EU countries). New committee chairs and rapporteurs established. Defence and security policy gained prominence. Strategic autonomy debates. Clean Industrial Deal proposals. Parliament adapting to new political balance with stronger ECR and right-wing presence.' },
+ { year: 2026, parliamentaryTerm: 'EP10 (2024-2029)', mepCount: 718, plenarySessions: 54, legislativeActsAdopted: 114, rollCallVotes: 567, committeeMeetings: 2363, parliamentaryQuestions: 6147, resolutions: 180, speeches: 996, adoptedTexts: 104, procedures: 935, events: 2327, documents: 4265, mepTurnover: 39, declarations: 1158, commentary: 'PARTIAL YEAR — data through Q1 2026. EP10 has 720 MEPs from 27 EU countries. Plenary session count (54) reflects full-year EP calendar; all other activity totals are projected full-year estimates based on 2021-2025 historical average with EP10 year-2 cycle adjustment. Actual Q1 activity: 10 plenary sittings completed (Jan-Feb), adopted texts and procedures accumulating. Defence spending, Clean Industrial Deal, and AI Act implementation dominating legislative agenda.' },

Copilot AI Apr 16, 2026

The 2026 commentary says “all other activity totals are projected full-year estimates”, but speeches: 996 matches the Q1-only monthly data (297+314+385) rather than a full-year projection, while other fields (e.g. parliamentaryQuestions, documents) have been restored to full-year-scale values. Either adjust the commentary to reflect mixed semantics (some projected, some partial) or update the 2026 speech total to match the stated “projected” approach so the row is internally consistent.

Suggested change
- { year: 2026, parliamentaryTerm: 'EP10 (2024-2029)', mepCount: 718, plenarySessions: 54, legislativeActsAdopted: 114, rollCallVotes: 567, committeeMeetings: 2363, parliamentaryQuestions: 6147, resolutions: 180, speeches: 996, adoptedTexts: 104, procedures: 935, events: 2327, documents: 4265, mepTurnover: 39, declarations: 1158, commentary: 'PARTIAL YEAR — data through Q1 2026. EP10 has 720 MEPs from 27 EU countries. Plenary session count (54) reflects full-year EP calendar; all other activity totals are projected full-year estimates based on 2021-2025 historical average with EP10 year-2 cycle adjustment. Actual Q1 activity: 10 plenary sittings completed (Jan-Feb), adopted texts and procedures accumulating. Defence spending, Clean Industrial Deal, and AI Act implementation dominating legislative agenda.' },
+ { year: 2026, parliamentaryTerm: 'EP10 (2024-2029)', mepCount: 718, plenarySessions: 54, legislativeActsAdopted: 114, rollCallVotes: 567, committeeMeetings: 2363, parliamentaryQuestions: 6147, resolutions: 180, speeches: 996, adoptedTexts: 104, procedures: 935, events: 2327, documents: 4265, mepTurnover: 39, declarations: 1158, commentary: 'PARTIAL YEAR — data through Q1 2026. EP10 has 720 MEPs from 27 EU countries. Plenary session count (54) reflects the full-year EP calendar; speeches (996) reflect actual Q1 cumulative activity; other activity totals are projected full-year estimates based on 2021-2025 historical average with EP10 year-2 cycle adjustment. Actual Q1 activity: 10 plenary sittings completed (Jan-Feb), adopted texts and procedures accumulating. Defence spending, Clean Industrial Deal, and AI Act implementation dominating legislative agenda.' },
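
The arithmetic behind this observation (297 + 314 + 385 = 996) can be expressed as a small consistency check. This is only a sketch: the Q1 values are the ones quoted in the comment, the zero-filled remaining months are an assumption, and `sumMonthly` and `totalMatchesMonthly` are hypothetical helpers.

```typescript
// Q1 2026 speech counts quoted in the review comment; months 4-12 are
// assumed to be zero for this illustration.
const speeches2026: number[] = [297, 314, 385, 0, 0, 0, 0, 0, 0, 0, 0, 0];

function sumMonthly(counts: readonly number[]): number {
  return counts.reduce((acc, n) => acc + n, 0);
}

// True when the stored yearly total equals the sum of the monthly counts,
// which suggests actual cumulative data rather than a full-year projection.
function totalMatchesMonthly(yearlyTotal: number, counts: readonly number[]): boolean {
  return sumMonthly(counts) === yearlyTotal;
}
```

A check like this could flag rows whose commentary claims "projected" while the total is really a partial-year sum.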

Member

pethers commented Apr 16, 2026

@copilot apply changes based on the comments in this thread

…monthly data to date-capable metrics, fix 2026 commentary

- Update countItems() docstring to document partial-total + error marker contract
- Add MONTHLY_CAPABLE_FIELDS allowlist (plenarySessions, speeches, events)
  to prevent year-only metrics like procedures from polluting RAW_MONTHLY_DATA
- Fix 2026 commentary to reflect mixed semantics: speeches (996) and
  adoptedTexts (104) are actual Q1 data, others are projected estimates

Agent-Logs-Url: https://github.com/Hack23/European-Parliament-MCP-Server/sessions/cfda45a2-cc2d-493f-9622-95a0c29408ec

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 16, 2026

@copilot apply changes based on the comments in this thread

Applied all three review comments in commit 0dbaaa5:

  1. countItems() docstring — Updated to document the partial-total + error marker contract explicitly (total is a lower bound when error is set, null on outright failure).

  2. Monthly data allowlist — Added MONTHLY_CAPABLE_FIELDS set (plenarySessions, speeches, events) to restrict RAW_MONTHLY_DATA writes to metrics with real per-month dates. Metrics like procedures that use year-only extraction (YYYY-01-01) are now excluded.

  3. 2026 commentary — Updated to reflect mixed semantics: speeches (996) and adoptedTexts (104) are actual Q1 cumulative data; other activity totals remain projected full-year estimates.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread scripts/generate-stats.ts Outdated
Comment on lines +310 to +317
* **Return contract:**
* - Success: `{ total: N }` — complete count, no `error` field.
* - Partial: `{ total: N, error: "..." }` — `total` is a **lower
* bound** (items counted before a pagination error). The `error`
* field signals to `--update` callers that the count is incomplete
* and must NOT be written back to stored data.
* - Failure: `{ total: null, error: "..." }` — no items were
* retrieved; the API call failed outright.

Copilot AI Apr 16, 2026

The documented return contract says "Failure" ({ total: null, error }) means no items were retrieved, but the implementation returns { total: null, error } when MAX_PAGES_PER_METRIC is hit even if totalCount > 0. That makes a partial lower-bound indistinguishable from an outright failure and can hide useful diagnostics. Consider returning { total: totalCount, error: note } for the max-page-limit case (and keep callers skipping updates via the error field).
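
The three-way contract, including the page-limit case raised here, might look like the sketch below. It is not the PR's `countItems()`: the `countPages` name, the `fetchPage` callback, the `maxPages` parameter, and the synchronous signature are all assumptions chosen to keep the example short.

```typescript
interface CountResult {
  total: number | null;
  error?: string;
}

// Hypothetical paginated counter. An empty page ends pagination normally.
function countPages(
  fetchPage: (page: number) => unknown[],
  maxPages: number,
): CountResult {
  let total = 0;
  for (let page = 1; page <= maxPages; page++) {
    let items: unknown[];
    try {
      items = fetchPage(page);
    } catch (err) {
      const message = err instanceof Error ? err.message : 'Unknown error';
      // Items already counted: return the lower bound plus an error marker.
      // Nothing counted yet: outright failure with a null total.
      return total > 0
        ? { total, error: `Partial count (${total} items), error on page ${page}: ${message}` }
        : { total: null, error: message };
    }
    if (items.length === 0) return { total }; // complete count, no error field
    total += items.length;
  }
  // Page limit reached while data was still arriving: flagged lower bound.
  const note = `Stopped at page limit ${maxPages}`;
  return total > 0 ? { total, error: note } : { total: null, error: note };
}
```

Returning the running total at the page limit keeps the diagnostic value visible, while the `error` field still lets callers skip the write-back.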

Comment thread scripts/generate-stats.ts Outdated
Comment on lines +1243 to +1253
  function isCredibleApiValue(apiValue: number, storedValue: number): boolean {
-   if (apiValue >= MIN_CREDIBLE_VALUE) return true;
-   // API returned a tiny value — only trust it if stored is also small
-   return storedValue <= apiValue * 5;
+   // Guard 1: Very small API value when stored is much larger
+   if (apiValue < MIN_CREDIBLE_VALUE && storedValue > apiValue * 5) return false;
+
+   // Guard 2: Significant drop from a substantial stored value.
+   // Increases are always trusted (API has more data than stored).
+   // Only decreases beyond the threshold are flagged.
+   if (
+     storedValue > MIN_STORED_FOR_DROP_CHECK &&
+     apiValue < storedValue &&
+     ((storedValue - apiValue) / storedValue) * 100 > MAX_ALLOWED_DROP_PERCENT

Copilot AI Apr 16, 2026

isCredibleApiValue() is now part of the critical data-integrity gate for --update and has multiple threshold-based branches. There don’t appear to be any automated tests covering these guard conditions, which increases the risk of regressions (e.g., edge cases around the 100-item threshold or the 50% drop boundary). Consider extracting this logic into a small module and adding Vitest unit tests for the key scenarios (tiny values, big drops, increases, boundary conditions).

Copilot generated this review using guidance from repository custom instructions.
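
To make the boundary cases concrete, here is a self-contained sketch of the two guards. The guard conditions follow the quoted diff, but the diff is truncated, so the closing `return false` / `return true` is inferred. The three threshold values are assumptions read off the discussion ("100-item threshold", "50% drop boundary"), not confirmed constants.

```typescript
// Assumed threshold values; the real constants are defined in the PR.
const MIN_CREDIBLE_VALUE = 100;
const MIN_STORED_FOR_DROP_CHECK = 100;
const MAX_ALLOWED_DROP_PERCENT = 50;

function isCredibleApiValue(apiValue: number, storedValue: number): boolean {
  // Guard 1: very small API value while the stored value is much larger.
  if (apiValue < MIN_CREDIBLE_VALUE && storedValue > apiValue * 5) return false;

  // Guard 2: significant drop from a substantial stored value.
  // Increases are always trusted; only drops beyond the threshold are rejected.
  if (
    storedValue > MIN_STORED_FOR_DROP_CHECK &&
    apiValue < storedValue &&
    ((storedValue - apiValue) / storedValue) * 100 > MAX_ALLOWED_DROP_PERCENT
  ) {
    return false;
  }
  return true;
}
```

Under these assumed thresholds, a drop from 10000 to 1998 stored speeches is rejected, while a drop of exactly 50% is still accepted because the comparison is strict.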
Member

pethers commented Apr 16, 2026

@copilot apply changes based on the comments in this thread

…ts, fix max-page-limit return

- Extract isCredibleApiValue() and threshold constants into src/utils/credibilityCheck.ts
- Add 21 Vitest unit tests covering guard 1 (tiny values), guard 2 (significant drops),
  increases, and boundary conditions (100-item threshold, 50% drop boundary)
- Fix countItems() max-page-limit case to return partial count instead of null
  (preserves lower-bound diagnostic data while still marking as error)
- Update countItems() docstring to reflect page-limit partial return

Agent-Logs-Url: https://github.com/Hack23/European-Parliament-MCP-Server/sessions/abaa5446-2b13-4657-bfcf-df66949a4f81

Co-authored-by: pethers <1726836+pethers@users.noreply.github.com>
@github-actions github-actions bot added the testing (Testing improvements) and test-coverage-impact (Impacts test coverage) labels Apr 16, 2026
Comment thread scripts/generate-stats.ts Fixed
Contributor Author

Copilot AI commented Apr 16, 2026

@copilot apply changes based on the comments in this thread

Applied both review comments in commits a8116d8 and da9ed40:

  1. countItems() max-page-limit return — Changed from { total: null, error } to { total: totalCount, error } when MAX_PAGES_PER_METRIC is hit with totalCount > 0. This preserves the partial lower-bound for diagnostics while the error field still prevents --update from writing it back. Updated docstring to document the page-limit partial return case.

  2. isCredibleApiValue() unit tests — Extracted the function and threshold constants into src/utils/credibilityCheck.ts and added 20 Vitest unit tests covering: guard 1 (tiny values), guard 2 (significant drops including the real corruption cases: speeches 10000→1998, documents 3516→930, questions 6147→1355), increases (always trusted), and boundary conditions (100-item threshold, exact 50% drop boundary, stored=100 vs stored=101).

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment thread scripts/generate-stats.ts Outdated
import { EuropeanParliamentClient } from '../src/clients/europeanParliamentClient.js';
import { GENERATED_STATS } from '../src/data/generatedStats.js';
import type { YearlyStats, PoliticalGroupSnapshot, PoliticalLandscapeData } from '../src/data/generatedStats.js';
import { isCredibleApiValue, MIN_CREDIBLE_VALUE, MAX_ALLOWED_DROP_PERCENT, MIN_STORED_FOR_DROP_CHECK } from '../src/utils/credibilityCheck.js';

Copilot AI Apr 16, 2026

The import from credibilityCheck brings in MIN_CREDIBLE_VALUE, MAX_ALLOWED_DROP_PERCENT, and MIN_STORED_FOR_DROP_CHECK, but this file only uses isCredibleApiValue. With typical TS/ESLint settings this will fail lint/type-check due to unused imports. Remove the unused constants from the import, or reference them (e.g., in log output) if you intend to keep them here.

Suggested change
- import { isCredibleApiValue, MIN_CREDIBLE_VALUE, MAX_ALLOWED_DROP_PERCENT, MIN_STORED_FOR_DROP_CHECK } from '../src/utils/credibilityCheck.js';
+ import { isCredibleApiValue } from '../src/utils/credibilityCheck.js';

Comment thread src/utils/credibilityCheck.test.ts Outdated
expect(isCredibleApiValue(0, 0)).toBe(true);
});

it('should accept large API value regardless of stored', () => {

Copilot AI Apr 16, 2026

This test name is misleading: isCredibleApiValue(10000, 20000) is accepted specifically because it’s exactly a 50% drop (which is allowed by the > MAX_ALLOWED_DROP_PERCENT condition), not because “large API value regardless of stored” is always credible. Consider renaming the test to reflect the actual boundary case being asserted so future readers don’t infer a broader rule than the implementation provides.

Suggested change
- it('should accept large API value regardless of stored', () => {
+ it('should accept when the drop is exactly at MAX_ALLOWED_DROP_PERCENT', () => {

Member

pethers commented Apr 16, 2026

@copilot apply changes based on the comments in this thread

Contributor Author

Copilot AI commented Apr 16, 2026

@copilot apply changes based on the comments in this thread

Applied both review comments in commit f079f67:

  1. Unused imports — Removed MIN_CREDIBLE_VALUE, MAX_ALLOWED_DROP_PERCENT, and MIN_STORED_FOR_DROP_CHECK from the generate-stats import; only isCredibleApiValue is now imported.

  2. Misleading test name — Renamed from "should accept large API value regardless of stored" to "should accept when the drop is exactly at MAX_ALLOWED_DROP_PERCENT" to accurately reflect the boundary condition being tested (50% drop is allowed, >50% is rejected).

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment thread scripts/generate-stats.ts
const storedVal = comparison.storedValue;
if (isCredibleApiValue(comparison.apiValue, storedVal)) {
if (!monthlyUpdates[yv.year]) {
monthlyUpdates[yv.year] = {};

Copilot AI Apr 16, 2026

The monthly update collection can omit an existing metric (e.g. because the yearly total fails the credibility check), but updateMonthlyData() later replaces the entire RAW_MONTHLY_DATA[year] entry with only the collected keys. That causes unrelated existing monthly series to be deleted (as seen in this PR where 2025 speeches disappeared). Consider changing the update logic to merge per-year monthly data: only overwrite the specific metric arrays present in monthlyUpdates[year], while preserving any existing metric arrays for that year that weren’t updated in this run.

Suggested change
- monthlyUpdates[yv.year] = {};
+ monthlyUpdates[yv.year] = {
+   ...(RAW_MONTHLY_DATA[yv.year] ?? {}),
+ };
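
The per-metric merge described in this comment can be sketched independently of the script. The types and the `mergeMonthly` helper are hypothetical, since the real shape of `RAW_MONTHLY_DATA` is not shown in the thread.

```typescript
// Assumed shapes: year -> metric name -> 12 monthly counts.
type MonthlySeries = Record<string, number[]>;
type MonthlyData = Record<number, MonthlySeries>;

// Overwrite only the series present in `updates`; keep every existing
// series for that year that this run did not touch.
function mergeMonthly(existing: MonthlyData, updates: MonthlyData): MonthlyData {
  const merged: MonthlyData = { ...existing };
  for (const [year, series] of Object.entries(updates)) {
    const y = Number(year);
    merged[y] = { ...(existing[y] ?? {}), ...series };
  }
  return merged;
}
```

Because the existing series are spread first and the updates second, a run that only refreshes plenarySessions leaves a stored speeches series for the same year in place.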

Member

pethers commented Apr 16, 2026

@copilot apply changes based on the comments in this thread and investigate and improve failing checks
Integration and E2E Tests / Integration Tests (25.x) (pull_request): Failing after 30m

Contributor Author

Copilot AI commented Apr 16, 2026

@copilot apply changes based on the comments in this thread and investigate and improv...

Applied in commit e9e0b66:

  1. Monthly data merge: updateMonthlyData() now parses existing metric arrays for each year and merges new metrics on top, preserving any existing monthly series that weren't updated in this run. Previously the entire year entry was replaced, which could delete unrelated metrics (e.g., 2025 speeches lost when only plenarySessions was updated).

  2. CI timeout fix: Increased get_plenary_session_documents_feed integration test timeout from 120s to 180s to match other slow EP API feed endpoints (like get_committee_info).

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread scripts/generate-stats.ts Outdated
Comment on lines +1497 to +1501
let fieldMatch;
while ((fieldMatch = fieldPattern.exec(existingMatch[0])) !== null) {
const fieldName = fieldMatch[1];
const values = fieldMatch[2].split(',').map((v) => Number(v.trim()));
if (fieldName !== undefined && values.length === 12) {

Copilot AI Apr 16, 2026

let fieldMatch; introduces an implicit any, and fieldMatch[2].split(...) assumes capture group 2 is always present. Please type fieldMatch as RegExpExecArray | null (or similar) and add a small guard for missing capture groups to keep this code type-safe under the repo’s strict TS settings and resilient to regex changes.

Suggested change
- let fieldMatch;
- while ((fieldMatch = fieldPattern.exec(existingMatch[0])) !== null) {
-   const fieldName = fieldMatch[1];
-   const values = fieldMatch[2].split(',').map((v) => Number(v.trim()));
-   if (fieldName !== undefined && values.length === 12) {
+ let fieldMatch: RegExpExecArray | null = null;
+ while ((fieldMatch = fieldPattern.exec(existingMatch[0])) !== null) {
+   const fieldName = fieldMatch[1];
+   const rawValues = fieldMatch[2];
+   if (fieldName === undefined || rawValues === undefined) {
+     continue;
+   }
+   const values = rawValues.split(',').map((v) => Number(v.trim()));
+   if (values.length === 12) {
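
A standalone version of this guarded loop is sketched below. The `parseMonthlyFields` name, the regular expression, and the `name: [n1, ..., n12]` input format are assumptions for illustration; the script's actual pattern is not shown in the thread. The `Number.isFinite` filter is an extra safeguard added here, not part of the suggestion.

```typescript
// Hypothetical helper: extracts `name: [n1, ..., n12]` entries from a source fragment.
function parseMonthlyFields(fragment: string): Record<string, number[]> {
  const fieldPattern = /(\w+):\s*\[([^\]]*)\]/g;
  const parsed: Record<string, number[]> = {};
  let fieldMatch: RegExpExecArray | null = null;
  while ((fieldMatch = fieldPattern.exec(fragment)) !== null) {
    const fieldName = fieldMatch[1];
    const rawValues = fieldMatch[2];
    // Guard against missing capture groups if the regex is changed later.
    if (fieldName === undefined || rawValues === undefined) {
      continue;
    }
    const values = rawValues.split(',').map((v) => Number(v.trim()));
    // Keep only complete 12-month series made of finite numbers.
    if (values.length === 12 && values.every((n) => Number.isFinite(n))) {
      parsed[fieldName] = values;
    }
  }
  return parsed;
}
```

Typing the loop variable explicitly avoids an implicit `any`, and extracting `rawValues` before calling `.split()` keeps the undefined check in one place.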

Member

pethers commented Apr 16, 2026

@copilot apply changes based on the comments in this thread and fix issues in Integration and E2E Tests / Integration Tests (25.x) (pull_request)
Integration and E2E Tests / Integration Tests (25.x) (pull_request): Failing after 29m

Contributor Author

Copilot AI commented Apr 16, 2026

@copilot apply changes based on the comments in this thread and fix issues in [Integra...

Applied in commit da9a548:

  1. Type-safe fieldMatch — Typed as RegExpExecArray | null, added guard for missing capture groups (fieldName === undefined || rawValues === undefined), and extracted rawValues before .split() to keep the code resilient to regex changes.

  2. CI fix — The failing test was get_plenary_documents_feed (line 1054), not the get_plenary_session_documents_feed I fixed previously. Increased its timeout from 120s to 180s to match the other slow EP API feed endpoints.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread src/utils/credibilityCheck.ts Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: James Pether Sörling <pethers@users.noreply.github.com>
@pethers pethers merged commit f4e5f05 into main Apr 16, 2026
6 checks passed
@pethers pethers deleted the copilot/improve-refresh-stats-workflow branch April 16, 2026 14:29

Labels

ep-data (European Parliament data handling), github_actions (GitHub Actions updates), infrastructure (CI/CD and build infrastructure), test-coverage-impact (Impacts test coverage), testing (Testing improvements)


3 participants