new importer #214
Actionable comments posted: 28
🧹 Nitpick comments (35)
apps/start/src/modals/index.tsx (1)
8-8: Consider lazy-loading AddImport to keep initial bundle slim

`AddImport` brings in TRPC/React Query; load it on demand.
Apply:
```diff
+import { lazy } from 'react';
-import AddImport from './add-import';
+const AddImport = lazy(() => import('./add-import'));
```

Ensure ModalProvider or the modal surface renders within a `<Suspense>` boundary.
Also applies to: 41-41
packages/importer/src/base-provider.ts (1)
7-8: Make core hooks async-friendly and identifiers immutable

Allow providers to perform async transforms/validation; mark identifiers readonly.
```diff
+// Utility type to allow sync or async implementations
+export type MaybePromise<T> = T | Promise<T>;
+
 export abstract class BaseImportProvider<
   TRawEvent extends BaseRawEvent = BaseRawEvent,
 > {
-  abstract provider: string;
-  abstract version: string;
+  abstract readonly provider: string;
+  abstract readonly version: string;

   /**
    * Stream-read and parse source (file/API) → yields raw events
    * This should be implemented as an async generator to handle large files efficiently
    */
   abstract parseSource(): AsyncGenerator<TRawEvent, void, unknown>;

   /**
    * Convert provider format → IClickhouseEvent
    */
-  abstract transformEvent(rawEvent: TRawEvent): IClickhouseEvent;
+  abstract transformEvent(rawEvent: TRawEvent): MaybePromise<IClickhouseEvent>;

   /**
    * Validate raw event structure
    */
-  abstract validate(rawEvent: TRawEvent): boolean;
+  abstract validate(rawEvent: TRawEvent): MaybePromise<boolean>;
```

(Note: the utility type is declared above the class — a type alias cannot be declared or exported inside a class body.)

Also applies to: 19-24
packages/importer/src/utils/referrer-parser.ts (3)
5-35: Type the map and normalize hosts; reduce duplicates

Strong types + normalization avoid listing www/m. variants twice and improve coverage.
```diff
-import { stripTrailingSlash } from '@openpanel/common';
+import { stripTrailingSlash } from '@openpanel/common';
+
+type ReferrerType = 'search' | 'social' | 'referral';
+type ReferrerEntry = { type: ReferrerType; name: string };
+
+const normalizeHost = (h: string) => h.replace(/^(www\.|m\.)/i, '').toLowerCase();

-const referrers: Record<string, { type: string; name: string }> = {
+const referrers: Record<string, ReferrerEntry> = {
   // Search engines
-  'google.com': { type: 'search', name: 'Google' },
-  'www.google.com': { type: 'search', name: 'Google' },
+  'google.com': { type: 'search', name: 'Google' },
   'bing.com': { type: 'search', name: 'Bing' },
-  'www.bing.com': { type: 'search', name: 'Bing' },
   'yahoo.com': { type: 'search', name: 'Yahoo!' },
-  'www.yahoo.com': { type: 'search', name: 'Yahoo!' },
   'duckduckgo.com': { type: 'search', name: 'DuckDuckGo' },
-  'www.duckduckgo.com': { type: 'search', name: 'DuckDuckGo' },
   'baidu.com': { type: 'search', name: 'Baidu' },
-  'www.baidu.com': { type: 'search', name: 'Baidu' },
   // Social media
-  'facebook.com': { type: 'social', name: 'Facebook' },
-  'www.facebook.com': { type: 'social', name: 'Facebook' },
+  'facebook.com': { type: 'social', name: 'Facebook' },
   'twitter.com': { type: 'social', name: 'Twitter' },
-  'www.twitter.com': { type: 'social', name: 'Twitter' },
   'x.com': { type: 'social', name: 'X' },
-  'www.x.com': { type: 'social', name: 'X' },
   'linkedin.com': { type: 'social', name: 'LinkedIn' },
-  'www.linkedin.com': { type: 'social', name: 'LinkedIn' },
   'instagram.com': { type: 'social', name: 'Instagram' },
-  'www.instagram.com': { type: 'social', name: 'Instagram' },
   'tiktok.com': { type: 'social', name: 'TikTok' },
-  'www.tiktok.com': { type: 'social', name: 'TikTok' },
   'youtube.com': { type: 'social', name: 'YouTube' },
-  'www.youtube.com': { type: 'social', name: 'YouTube' },
   'reddit.com': { type: 'social', name: 'Reddit' },
-  'www.reddit.com': { type: 'social', name: 'Reddit' },
+  't.co': { type: 'social', name: 'Twitter' },
+  'lnkd.in': { type: 'social', name: 'LinkedIn' },
 };

 export function parseReferrer(url: string | undefined) {
-  const hostname = getHostname(url);
-  const match = referrers[hostname] ?? referrers[hostname.replace('www.', '')];
+  const hostname = normalizeHost(getHostname(url));
+  const match = referrers[hostname];
```

Also applies to: 49-52
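The proposed `normalizeHost` helper is small enough to sanity-check in isolation; a standalone sketch (hostnames are examples):

```typescript
// Strips a leading "www." or "m." prefix and lower-cases, so a single map
// entry covers the common host variants.
const normalizeHost = (h: string): string =>
  h.replace(/^(www\.|m\.)/i, '').toLowerCase();

console.log(normalizeHost('www.Google.com')); // google.com
console.log(normalizeHost('m.facebook.com')); // facebook.com
console.log(normalizeHost('x.com')); // x.com
```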
49-57: Guard output URL to http(s) only

Avoid echoing non-web schemes (e.g., javascript:) back to callers.
```diff
 export function parseReferrer(url: string | undefined) {
   const hostname = getHostname(url);
   const match = referrers[hostname] ?? referrers[hostname.replace('www.', '')];
-  return {
+  let safeUrl = '';
+  try {
+    if (url) {
+      const u = new URL(url);
+      if (u.protocol === 'http:' || u.protocol === 'https:') {
+        safeUrl = stripTrailingSlash(url);
+      }
+    }
+  } catch {
+    // ignore
+  }
+
+  return {
     name: match?.name ?? '',
     type: match?.type ?? 'referral',
-    url: stripTrailingSlash(url ?? ''),
+    url: safeUrl,
   };
 }
```
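The same guard works as a standalone helper; a minimal sketch using the WHATWG `URL` parser (the function name is illustrative, not from the PR):

```typescript
// Echo only http(s) URLs back to callers; anything else (javascript:,
// data:, unparsable input) collapses to ''.
function safeWebUrl(url: string | undefined): string {
  if (!url) return '';
  try {
    const u = new URL(url);
    return u.protocol === 'http:' || u.protocol === 'https:' ? url : '';
  } catch {
    return '';
  }
}

console.log(safeWebUrl('https://example.com/a')); // https://example.com/a
console.log(safeWebUrl('javascript:alert(1)')); // ''
console.log(safeWebUrl('not a url')); // ''
```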
60-91: Minor: normalize source before lookup and keep types consistent

Lower-case once; reuse normalizeHost for direct key match; return typed result.
```diff
-export function getReferrerWithQuery(
-  query: Record<string, string> | undefined,
-) {
+export function getReferrerWithQuery(
+  query: Record<string, string> | undefined,
+): { name: string; type: ReferrerType; url: string } | null {
   if (!query) {
     return null;
   }
-  const source = query.utm_source ?? query.ref ?? query.utm_referrer ?? '';
+  const source = (query.utm_source ?? query.ref ?? query.utm_referrer ?? '').trim();
   if (source === '') {
     return null;
   }
-  const match =
-    Object.values(referrers).find(
-      (referrer) => referrer.name.toLowerCase() === source.toLowerCase(),
-    ) || referrers[source];
+  const lower = source.toLowerCase();
+  const match =
+    Object.values(referrers).find((r) => r.name.toLowerCase() === lower) ||
+    referrers[normalizeHost(lower)];
   if (match) {
     return {
       name: match.name,
       type: match.type,
       url: '',
     };
   }
```

packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
1-13: Consider more descriptive migration naming.

The migration name "fix" is not descriptive. Consider renaming to something like "remove_redundant_import_columns" or "migrate_import_config_to_json" to better indicate the purpose of this schema change.
packages/db/code-migrations/5-add-imports-table.sql (1)
32-32: Consider using an Enum for import_status.

The `import_status` field uses `LowCardinality(String)` with a default of `'pending'`. For better type safety and query performance, consider using an Enum type instead:

```sql
`import_status` LowCardinality(Enum8('pending' = 1, 'processing' = 2, 'completed' = 3, 'failed' = 4)) DEFAULT 'pending'
```

This ensures only valid status values can be stored and provides better compression.
packages/db/prisma/schema.prisma (1)
470-486: Consider using an enum for the status field.

The `status` field is defined as a `String` with a comment indicating it should be one of `'pending' | 'processing' | 'completed' | 'failed'`. For better type safety and database constraints, define a Prisma enum:

```prisma
enum ImportStatus {
  pending
  processing
  completed
  failed
}

model Import {
  // ...
  status ImportStatus // instead of String
  // ...
}
```

This provides compile-time type checking and database-level validation.
packages/trpc/src/root.ts (1)
7-7: Router key named "import" (reserved keyword) — consider renaming for DX.

Valid in JS/TS, but it can confuse tooling/usages (e.g., appRouter.import vs dynamic import()). Consider renaming to imports/importer for clarity.
Also applies to: 44-44
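The rename is mechanical; a toy sketch with plain objects standing in for tRPC routers (names are illustrative, not from the PR) shows the readability difference at call sites:

```typescript
// A key named `import` is legal ({ import: ... }), but `appRouter.import`
// reads ambiguously next to dynamic import() calls; a plural key does not.
const importRouter = { list: () => ['umami'] };

const appRouter = { imports: importRouter };

console.log(appRouter.imports.list()); // [ 'umami' ]
```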
packages/queue/src/queues.ts (1)
163-176: Harden import jobs: add attempts/backoff/timeout; optionally expose QueueEvents.

Imports call external providers; transient failures are expected. Provide sensible defaults and a metrics hook.
Apply:
```diff
 export const importQueue = new Queue<ImportQueuePayload>('import', {
   connection: getRedisQueue(),
   defaultJobOptions: {
     removeOnComplete: 10,
-    removeOnFail: 50,
+    removeOnFail: 50,
+    attempts: 3,
+    backoff: { type: 'exponential', delay: 30_000 },
+    timeout: 1000 * 60 * 30, // 30 min; tune as needed
   },
 });
+
+// Optional: enable events for monitoring dashboards/Bull Board progress
+export const importQueueEvents = new QueueEvents('import', {
+  connection: getRedisQueue(),
+});
```

apps/worker/src/boot-workers.ts (3)
138-142: Also drain import queue before shutdown to avoid cutting imports mid-run.

```diff
-  await waitForQueueToEmpty(cronQueue);
+  await waitForQueueToEmpty(cronQueue);
+  await waitForQueueToEmpty(importQueue, 10 * 60_000); // 10 min cap; tune as needed
```
155-167: Dev path skips graceful shutdown — can leave jobs in inconsistent state.

Unify behavior and call the same exit handler in dev.
```diff
-      process.on(evt, (code) => {
-        if (process.env.NODE_ENV === 'production') {
-          exitHandler(evt, code);
-        } else {
-          logger.info('Shutting down for development', {
-            event: evt,
-            code,
-          });
-          process.exit(0);
-        }
-      });
+      process.on(evt, (code) => {
+        exitHandler(evt, code ?? 0);
+      });
```
4-13: Remove unused imports and dead/commented code for clarity.
`eventsGroupQueue`, `GroupWorker`, `eventsJob`, `incomingEventPure` are no longer used; the commented worker block can be dropped.

Also applies to: 18-23, 33-48
packages/trpc/src/routers/import.ts (3)
121-131: Guard job removal on delete with error handling (active/locked jobs can throw).

```diff
   const job = await importQueue.getJob(importRecord.jobId);
-  if (job) {
-    await job.remove();
-  }
+  if (job) {
+    try {
+      await job.remove();
+    } catch (e) {
+      // best-effort removal; proceed with deletion
+    }
+  }
```
165-171: Retry should set stable jobId and attempts/backoff like create.

```diff
-      const job = await importQueue.add('import', {
-        type: 'import',
-        payload: {
-          importId: importRecord.id,
-        },
-      });
+      const job = await importQueue.add(
+        'import',
+        { type: 'import', payload: { importId: importRecord.id } },
+        { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+      );
```
3-6: Unused type import (IImportConfig).
`IImportConfig` is not referenced. Remove to satisfy lint.
1-14: Public API surface looks good; minor polish

Consider also re-exporting provider config/types (e.g., IUmamiImportConfig) from providers to keep consumers on a single import path. Optional.
IMPORT_REVIEW.md (1)
320-337: Move this review doc to /docs or PR notes to avoid shipping in package tarballs

Keeps the repo clean and avoids confusing consumers.
apps/worker/src/jobs/import.ts (4)
1-11: Avoid deep imports; prefer package barrel exports

Importing services via '@openpanel/db/src/services/import.service' couples to internal paths. Re-export these from '@openpanel/db' and import from there to reduce breakage.
75-81: Progress totalBatches fluctuates; compute once or report pct

As `totalEvents` grows, `totalBatches` changes mid-run. Either omit `totalBatches` here or compute a percentage for the UI.
```diff
-        totalBatches: Math.ceil(totalEvents / BATCH_SIZE),
+        progressPct: totalEvents ? Math.round((processedEvents / totalEvents) * 100) : 0,
```
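The suggested percentage calculation, extracted as a standalone function (name and numbers are illustrative):

```typescript
// The zero-guard matters: totalEvents can still be 0 while counting is underway.
function progressPct(processedEvents: number, totalEvents: number): number {
  return totalEvents ? Math.round((processedEvents / totalEvents) * 100) : 0;
}

console.log(progressPct(2500, 10000)); // 25
console.log(progressPct(0, 0)); // 0 (no division by zero)
```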
17-17: Make BATCH_SIZE configurable

Read from config/env to tune per deployment.
```diff
-const BATCH_SIZE = 5000;
+const BATCH_SIZE = Number(process.env.IMPORT_BATCH_SIZE ?? 5000);
```
200-209: Validate provider-specific config with a schema before instantiation

Avoid passing untyped Prisma JSON into providers. Parse with the provider schema (zod) and fail early.
```diff
 function createProvider(
   record: Prisma.ImportGetPayload<{ include: { project: true } }>,
 ) {
   switch (record.config.provider) {
     case 'umami':
-      return new UmamiProvider(record.projectId, record.config);
+      // Example: zUmamiImportConfig.safeParse(record.config)
+      return new UmamiProvider(
+        record.projectId,
+        (record.config as any) /* TODO: parse/validate */,
+      );
     default:
       throw new Error(`Unknown provider: ${record.config.provider}`);
   }
 }
```

I can wire in zod parsing if you expose the provider config schemas from @openpanel/importer.
packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (2)
9-16: Constrain status values

Add a CHECK constraint (or enum) to prevent invalid states.
```diff
   "status" TEXT NOT NULL,
@@
 );
+
+-- Restrict status values
+ALTER TABLE "public"."imports"
+  ADD CONSTRAINT "imports_status_check"
+  CHECK ("status" IN ('pending','processing','completed','failed'));
```
18-22: Add indexes for common queries

List/filters will hit projectId/status/createdAt frequently.
```diff
 -- AddForeignKey
 ALTER TABLE "public"."imports" ADD CONSTRAINT "imports_projectId_fkey" FOREIGN KEY ("projectId") REFERENCES "public"."projects"("id") ON DELETE CASCADE ON UPDATE CASCADE;
+
+-- Indexes for performance
+CREATE INDEX IF NOT EXISTS "idx_imports_project_createdAt"
+  ON "public"."imports" ("projectId", "createdAt" DESC);
+CREATE INDEX IF NOT EXISTS "idx_imports_status_createdAt"
+  ON "public"."imports" ("status", "createdAt" DESC);
```

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (5)
5-10: Remove unused icon import

`CheckCircle` is unused.
```diff
-import {
-  CheckCircle,
-  CheckCircleIcon,
-  Download,
-  Loader2Icon,
-  XCircleIcon,
-} from 'lucide-react';
+import { CheckCircleIcon, Download, Loader2Icon, XCircleIcon } from 'lucide-react';
```
83-104: Add onError to mutations for user feedback

Surface errors with toasts; also consider logging.
```diff
   const deleteImport = useMutation(
     trpc.import.delete.mutationOptions({
       onSuccess: () => {
         toast.success('Import deleted', {
           description: 'The import has been successfully deleted.',
         });
         queryClient.invalidateQueries(trpc.import.list.pathFilter());
       },
+      onError: (err) => {
+        toast.error('Delete failed', { description: String(err) });
+      },
     }),
   );
@@
   const retryImport = useMutation(
     trpc.import.retry.mutationOptions({
       onSuccess: () => {
         toast.success('Import retried', {
           description: 'The import has been queued for processing again.',
         });
         queryClient.invalidateQueries(trpc.import.list.pathFilter());
       },
+      onError: (err) => {
+        toast.error('Retry failed', { description: String(err) });
+      },
     }),
   );
```
244-261: Disable action buttons while pending to avoid double submits

Tie disabled state to mutation pending flags.
```diff
-              {imp.status === 'failed' && (
+              {imp.status === 'failed' && (
                 <Button
                   variant="outline"
                   size="sm"
-                  onClick={() => retryImport.mutate({ id: imp.id })}
+                  disabled={retryImport.isPending}
+                  onClick={() => retryImport.mutate({ id: imp.id })}
                 >
                   Retry
                 </Button>
               )}
               <Button
                 variant="ghost"
                 size="sm"
-                onClick={() => deleteImport.mutate({ id: imp.id })}
+                disabled={deleteImport.isPending}
+                onClick={() => deleteImport.mutate({ id: imp.id })}
               >
                 Delete
               </Button>
```
41-66: Consider serving logos locally or via asset proxy

External Brandfetch URLs can break and may violate CSP. If feasible, mirror assets or use a proxy.
141-161: Consider extending IntegrationCardLogoImage to accept a customizable alt prop for improved accessibility

The `alt` attribute is currently hardcoded to the generic "Integration Logo" at line 66 of `apps/start/src/components/integrations/integration-card.tsx`. Extending the component to accept an optional `alt` prop and passing `provider.name` would provide more descriptive alternative text for each integration, improving accessibility and SEO.

packages/validation/src/index.ts (1)
444-445: Local file support vs schema

`fileUrl` is `z.string().url()`, but the Umami provider supports local files. Either (a) allow `file://` or absolute paths, or (b) drop local-path support in the provider. Today this schema blocks local imports.

```diff
-    fileUrl: z.string().url(),
+    fileUrl: z.string().refine(
+      (v) => /^https?:\/\//i.test(v) || /^file:\/\//i.test(v) || v.startsWith('/'),
+      'Must be an http(s) URL, file:// URL, or absolute path',
+    ),
```

packages/importer/src/providers/umami.ts (2)
178-195: Byte counting via `body.on('data')` competes with pipeline

Attach a lightweight Transform to count bytes between `body` and `decompress` to preserve backpressure and avoid dual consumers.

```diff
-    if (maxBytes) {
-      body.on('data', (chunk: Buffer) => {
-        seenBytes += chunk.length;
-        if (seenBytes > maxBytes) {
-          controller.abort();
-          body.destroy(new Error(`Stream exceeded size limit (${seenBytes} > ${maxBytes})`));
-        }
-      });
-    }
+    const byteCounter = new (await import('node:stream')).Transform({
+      transform(chunk, _enc, cb) {
+        seenBytes += chunk.length;
+        if (maxBytes && seenBytes > maxBytes) {
+          cb(new Error(`Stream exceeded size limit (${seenBytes} > ${maxBytes})`));
+          return;
+        }
+        cb(null, chunk);
+      },
+    });
@@
-    if (decompress) {
-      await pipeline(body, decompress, parser, { signal: controller.signal });
-    } else {
-      await pipeline(body, parser, { signal: controller.signal });
-    }
+    if (decompress) {
+      await pipeline(body, byteCounter, decompress, parser, { signal: controller.signal });
+    } else {
+      await pipeline(body, byteCounter, parser, { signal: controller.signal });
+    }
```

Also applies to: 211-227
91-99: Optional: support local Brotli (.br) like remote path

Mirror remote handling and support `.br` locally if needed.

```diff
-  const isGzipped = filePath.endsWith('.gz');
+  const isGzipped = filePath.endsWith('.gz');
+  const isBrotli = filePath.endsWith('.br');
@@
-  if (isGzipped) {
+  if (isGzipped) {
     await pipeline(fileStream, createGunzip(), parser);
-  } else {
+  } else if (isBrotli) {
+    await pipeline(fileStream, createBrotliDecompress(), parser);
+  } else {
     await pipeline(fileStream, parser);
   }
```

packages/db/src/services/import.service.ts (3)
78-92: Consider using CTEs or subqueries for better maintainability.

The `argMin`/`argMax` tuple approach works but is fragile—the 17-element tuple order must match the destructuring on lines 147-173. If fields are added, removed, or reordered, this becomes error-prone and difficult to debug.

A clearer alternative would be to use CTEs with `ROW_NUMBER()` or separate queries for first/last events, making field selection explicit:

```sql
WITH ranked_events AS (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY device_id, session_id, project_id, profile_id ORDER BY created_at ASC) as first_rank,
    ROW_NUMBER() OVER (PARTITION BY device_id, session_id, project_id, profile_id ORDER BY created_at DESC) as last_rank
  FROM ${TABLE_NAMES.events_imports}
  WHERE ${rangeWhere}
    AND name NOT IN ('session_start', 'session_end')
)
SELECT ... FROM ranked_events WHERE first_rank = 1 OR last_rank = 1
```
26-525: Consider adding explicit error handling and logging.

None of the functions include try-catch blocks or error enrichment. While throwing errors to the caller is acceptable, adding contextual error handling would improve debuggability and allow for better error classification (transient vs. permanent failures).
Consider wrapping critical operations:
```typescript
export async function insertImportBatch(
  events: IClickhouseEvent[],
  importId: string,
): Promise<ImportStageResult> {
  try {
    if (events.length === 0) {
      return { importId, totalEvents: 0, insertedEvents: 0 };
    }

    const eventsWithMetadata = events.map((event) => ({
      ...event,
      import_id: importId,
      import_status: 'pending',
      imported_at_meta: new Date(),
    }));

    await ch.insert({
      table: TABLE_NAMES.events_imports,
      values: eventsWithMetadata,
      format: 'JSONEachRow',
    });

    return {
      importId,
      totalEvents: events.length,
      insertedEvents: events.length,
    };
  } catch (error) {
    // Add context and rethrow or log
    throw new Error(
      `Failed to insert import batch for ${importId}: ${error instanceof Error ? error.message : String(error)}`,
    );
  }
}
```

This pattern would help with debugging import failures in production.
278-356: Consider batch migration for very large imports.

The `INSERT INTO ... SELECT` approach is efficient for most cases, but for imports with millions of events, it might:
- Block for extended periods
- Consume significant memory
- Provide no progress feedback during the migration
For very large imports, consider implementing batch migration:
```typescript
export async function migrateImportToProduction(
  importId: string,
  from?: string,
  to?: string,
  batchSize = 100000, // configurable batch size
): Promise<void> {
  let offset = 0;
  let migrated = 0;

  while (true) {
    const migrationQuery = `
      INSERT INTO ${TABLE_NAMES.events} (...)
      SELECT ...
      FROM ${TABLE_NAMES.events_imports}
      WHERE import_id = {importId:String}
        ${from ? 'AND created_at >= {from:String}' : ''}
        ${to ? 'AND created_at < {to:String}' : ''}
      ORDER BY created_at ASC
      LIMIT {batchSize:UInt32} OFFSET {offset:UInt32}
    `;

    const result = await ch.command({
      query: migrationQuery,
      query_params: { importId, from, to, batchSize, offset },
    });

    // If no rows affected, we're done
    if (result.query_id && migrated === 0) break;

    offset += batchSize;
    // Could call updateImportProgress here
  }
}
```

This is optional and depends on expected import sizes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `packages/importer/a70ff57d-f632-4292-a98b-658d7734fbec.csv` is excluded by `!**/*.csv`
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (35)
- `IMPORT_REVIEW.md` (1 hunks)
- `apps/start/package.json` (1 hunks)
- `apps/start/src/components/integrations/integration-card.tsx` (1 hunks)
- `apps/start/src/components/ui/badge.tsx` (1 hunks)
- `apps/start/src/modals/add-import.tsx` (1 hunks)
- `apps/start/src/modals/index.tsx` (2 hunks)
- `apps/start/src/routeTree.gen.ts` (11 hunks)
- `apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx` (1 hunks)
- `apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx` (1 hunks)
- `apps/worker/package.json` (1 hunks)
- `apps/worker/src/boot-workers.ts` (5 hunks)
- `apps/worker/src/index.ts` (2 hunks)
- `apps/worker/src/jobs/import.ts` (1 hunks)
- `packages/db/code-migrations/5-add-imports-table.sql` (1 hunks)
- `packages/db/code-migrations/5-add-imports-table.ts` (1 hunks)
- `packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql` (1 hunks)
- `packages/db/prisma/migrations/20251018214030_fix/migration.sql` (1 hunks)
- `packages/db/prisma/schema.prisma` (2 hunks)
- `packages/db/src/clickhouse/client.ts` (1 hunks)
- `packages/db/src/services/event.service.ts` (1 hunks)
- `packages/db/src/services/import.service.ts` (1 hunks)
- `packages/db/src/types.ts` (2 hunks)
- `packages/importer/package.json` (1 hunks)
- `packages/importer/src/base-provider.ts` (1 hunks)
- `packages/importer/src/index.ts` (1 hunks)
- `packages/importer/src/providers/umami.ts` (1 hunks)
- `packages/importer/src/types.ts` (1 hunks)
- `packages/importer/src/utils/referrer-parser.ts` (1 hunks)
- `packages/importer/tsconfig.json` (1 hunks)
- `packages/importer/vitest.config.ts` (1 hunks)
- `packages/queue/src/queues.ts` (1 hunks)
- `packages/trpc/package.json` (1 hunks)
- `packages/trpc/src/root.ts` (2 hunks)
- `packages/trpc/src/routers/import.ts` (1 hunks)
- `packages/validation/src/index.ts` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
- apps/start/src/components/integrations/integration-card.tsx
- apps/start/src/modals/index.tsx
- apps/start/src/modals/add-import.tsx
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
- apps/start/src/components/ui/badge.tsx
- apps/start/src/routeTree.gen.ts
🧬 Code graph analysis (17)
packages/db/src/types.ts (1)
- packages/validation/src/index.ts (1)
  - `IImportConfig` (464-467)

packages/importer/src/utils/referrer-parser.ts (1)
- packages/common/src/string.ts (1)
  - `stripTrailingSlash` (1-3)

packages/queue/src/queues.ts (1)
- packages/redis/redis.ts (1)
  - `getRedisQueue` (93-105)

apps/worker/src/jobs/import.ts (3)
- packages/queue/src/queues.ts (1)
  - `ImportQueuePayload` (163-168)
- packages/importer/src/providers/umami.ts (2)
  - `UmamiRawEvent` (55-55), `UmamiProvider` (57-426)
- packages/db/src/services/import.service.ts (8)
  - `insertImportBatch` (26-53), `updateImportProgress` (387-400), `getImportDateBounds` (508-525), `reconstructSessions` (61-272), `migrateImportToProduction` (278-356), `markImportComplete` (361-382), `getImportProgress` (434-488), `markImportFailed` (405-429)

apps/start/src/modals/index.tsx (1)
- apps/start/src/modals/add-import.tsx (1)
  - `AddImport` (29-164)

packages/importer/vitest.config.ts (1)
- vitest.shared.ts (1)
  - `getSharedVitestConfig` (4-27)

apps/worker/src/boot-workers.ts (2)
- packages/queue/src/queues.ts (1)
  - `importQueue` (170-176)
- apps/worker/src/jobs/import.ts (1)
  - `importJob` (19-198)

packages/trpc/src/root.ts (1)
- packages/trpc/src/routers/import.ts (1)
  - `importRouter` (11-183)

packages/db/code-migrations/5-add-imports-table.ts (2)
- packages/db/code-migrations/helpers.ts (1)
  - `getIsCluster` (17-24)
- packages/db/src/clickhouse/migration.ts (2)
  - `createTable` (68-116), `runClickhouseMigrationCommands` (322-454)

apps/start/src/modals/add-import.tsx (2)
- apps/start/src/integrations/tanstack-query/root-provider.tsx (1)
  - `Provider` (68-86)
- packages/importer/src/types.ts (1)
  - `ImportConfig` (7-12)

packages/trpc/src/routers/import.ts (6)
- packages/trpc/src/trpc.ts (2)
  - `createTRPCRouter` (138-138), `protectedProcedure` (161-164)
- packages/trpc/src/access.ts (1)
  - `getProjectAccessCached` (4-4)
- packages/trpc/src/errors.ts (1)
  - `TRPCAccessError` (3-7)
- packages/db/src/prisma-client.ts (1)
  - `db` (270-270)
- packages/validation/src/index.ts (1)
  - `zCreateImport` (469-477)
- packages/queue/src/queues.ts (1)
  - `importQueue` (170-176)

packages/db/src/services/import.service.ts (3)
- packages/db/src/services/event.service.ts (1)
  - `IClickhouseEvent` (68-102)
- packages/db/src/clickhouse/client.ts (4)
  - `ch` (131-141), `TABLE_NAMES` (48-60), `convertClickhouseDateToJs` (218-220), `formatClickhouseDate` (188-199)
- packages/db/src/prisma-client.ts (1)
  - `db` (270-270)

apps/worker/src/index.ts (1)
- packages/queue/src/queues.ts (1)
  - `importQueue` (170-176)

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (2)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
  - `Route` (12-34)
- apps/start/src/components/integrations/integration-card.tsx (3)
  - `IntegrationCard` (91-112), `IntegrationCardLogoImage` (50-69), `IntegrationCardFooter` (3-15)

packages/importer/src/base-provider.ts (2)
- packages/importer/src/types.ts (3)
  - `BaseRawEvent` (40-42), `ImportJobMetadata` (69-73), `ErrorContext` (45-51)
- packages/db/src/services/event.service.ts (1)
  - `IClickhouseEvent` (68-102)

packages/importer/src/providers/umami.ts (4)
- packages/validation/src/index.ts (1)
  - `IUmamiImportConfig` (450-450)
- packages/db/src/services/event.service.ts (2)
  - `IClickhouseEvent` (68-102), `query` (701-847)
- packages/common/src/url.ts (2)
  - `parsePath` (11-49), `isSameDomain` (51-63)
- packages/importer/src/utils/referrer-parser.ts (2)
  - `parseReferrer` (49-58), `getReferrerWithQuery` (60-91)

packages/importer/src/types.ts (2)
- packages/db/src/services/event.service.ts (1)
  - `IServiceImportedEventPayload` (144-147)
- packages/db/src/services/import.service.ts (2)
  - `ImportProgress` (16-21), `ImportStageResult` (10-14)
🪛 Gitleaks (8.28.0)
apps/worker/src/index.ts
[high] 60-60: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.
(aws-access-token)
🪛 LanguageTool
IMPORT_REVIEW.md
[uncategorized] ~35-~35: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... batchNumber: number ) ``` --- ## 🟡 Medium Priority Issues ### 2. **Missing UUID Dependenc...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[grammar] ~156-~156: Ensure spelling is correct
Context: ...d for EVERY event ``` Improvement: Cleanup periodically (e.g., every 1000 events) ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-and-push-api
- GitHub Check: build-and-push-worker
🔇 Additional comments (24)
apps/start/package.json (1)
6-6: Clarify the motivation for renaming "testing" → "testing2".

The script was renamed from `"testing"` to `"testing2"`, but the name `"testing2"` is not descriptive and appears to be a placeholder. Additionally, this change seems orthogonal to the PR objective ("new importer").

Please clarify:

- What is the intended purpose of renaming this script?
- Is `"testing2"` a temporary name, or was a different name intended?
- Was the original `"testing"` script referenced in CI/CD pipelines, documentation, or developer workflows? If so, those references need to be updated to avoid breaking them.

If this change is unintended, consider reverting it.
apps/start/src/components/integrations/integration-card.tsx (1)
50-69: LGTM! Consistent API enhancement.

The addition of the optional `className` prop aligns `IntegrationCardLogoImage` with the other components in this file (`IntegrationCardFooter`, `IntegrationCardHeader`, `IntegrationCardHeaderButtons`) that already accept `className`. The prop is properly forwarded to `IntegrationCardLogo`, which correctly merges it using the `cn()` utility.

apps/start/src/components/ui/badge.tsx (2)
12-13: LGTM! Success variant follows good practices.

The emerald-700 background with white text provides good contrast and is semantically appropriate for success states.
16-16: LGTM! Info variant is well-implemented.

The blue-500 background with white text provides adequate contrast and follows the established pattern.
packages/importer/tsconfig.json (1)
1-9: LGTM! Clean TypeScript configuration.

The configuration is properly structured for the new importer package, extending the shared base config and setting appropriate compiler options for a source-in-src, output-to-dist layout.
packages/importer/package.json (1)
27-27: `bullmq` dependency placement is correct.

Verification shows `bullmq` is exclusively in `devDependencies` with no runtime imports or usage in the importer source code. The package does not directly enqueue or process jobs—job operations are handled through the `@openpanel/queue` workspace dependency. The current placement is appropriate.
2-2: Alignment with validation types looks good

The Prisma JSON alias cleanly mirrors the validation union. No issues spotted.
Also applies to: 16-16
packages/trpc/package.json (1)
19-19: No circular dependency detected—safe to proceed.

Verification confirms @openpanel/queue does not depend on @openpanel/trpc. The dependency is unidirectional: @openpanel/trpc (line 19) → @openpanel/queue. No cycle risk.
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
45-45: LGTM! Clean addition of the Imports tab.

The new imports tab follows the existing pattern and integrates well with the settings navigation structure.
apps/worker/package.json (1)
23-23: LGTM! Necessary dependency for import functionality.

The addition of the `@openpanel/importer` workspace dependency enables the worker to process import jobs.
59-59: LGTM! Table name constant added correctly.

The `events_imports` table name addition follows the existing pattern and aligns with the new ClickHouse migration.
8-8: LGTM! Import queue integration added correctly.

The `importQueue` is properly imported and registered with Bull Board for monitoring.

Also applies to: 42-42
packages/db/src/services/event.service.ts (1)
144-147: LGTM! Type definition is clear and purposeful.

The `IServiceImportedEventPayload` type correctly omits `profile` and `meta` fields for imported events, which makes sense as these would need to be resolved separately during import processing.
42-43: Verify the 7-day TTL is intentional for staging data.

The TTL of 7 days on `imported_at_meta` means all staging import data will be automatically deleted after one week. Confirm this aligns with your data retention requirements and import workflow expectations. If imports might take longer than 7 days to validate or if you need to keep historical import records, consider increasing this value or making it configurable.
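If the window does turn out to be too short, ClickHouse allows changing it in place; a hypothetical sketch (30 days is an arbitrary value; table and column names are taken from this migration):

```sql
-- Assumption: staging rows may need more than 7 days before validation/migration.
ALTER TABLE events_imports
  MODIFY TTL imported_at_meta + INTERVAL 30 DAY;
```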
1-38: LGTM! Well-designed ClickHouse table schema.

The table structure uses appropriate data types, codecs for compression, and partitioning strategy. The use of LowCardinality for categorical fields and specialized codecs (ZSTD, Delta, Gorilla) shows good understanding of ClickHouse optimization patterns.
packages/db/prisma/schema.prisma (1)
470-486: LGTM! Import model is well-structured.

The Import model has appropriate fields for tracking import progress, includes proper relations to Project with cascade delete, and uses sensible defaults. The use of Json for config provides flexibility, while the comment `/// [IPrismaImportConfig]` indicates type documentation exists elsewhere.

Also applies to: 197-197
IMPORT_REVIEW.md (2)
37-59: Remove the action item; importer uses node:crypto, not the uuid package

The script results show that `packages/importer` uses `randomUUID()` imported from `node:crypto` (in `packages/importer/src/providers/umami.ts` and `packages/cli/src/importer/importer.ts`), not from the external `uuid` package. The `uuid` package is used elsewhere in the codebase (specifically in `packages/db/src/services/event.service.ts`), but not in the importer package. The review comment's recommendation to add `uuid` to the importer's dependencies is unnecessary.

Likely an incorrect or invalid review comment.
10-31: Review comment is based on outdated or incorrect code state; actual implementation already uses proper types

The review comment cites code that does not match the current implementation:

- processBatch function: does not exist in the codebase (rg search found zero matches)
- `any` type usage: no matches found in `apps/worker/src/jobs/` (contradicts the claim of "extensive use")
- Actual typing: line 46 of `apps/worker/src/jobs/import.ts` already uses `const eventBatch: UmamiRawEvent[] = [];` (properly typed, not `any[]`)
- BaseImportProvider: confirmed generic with `<TRawEvent extends BaseRawEvent = BaseRawEvent>`, and `UmamiProvider` correctly extends it as `BaseImportProvider<UmamiRawEvent>`

The current implementation already has the proper type safety that the review comment suggests. The code is not coupled via weak typing but properly typed with `UmamiRawEvent`. No refactoring is needed.
apps/start/src/routeTree.gen.ts (1)
57-57: Generated file — no manual edits

Looks consistent with the new Imports tab. No action; keep excluded from lint/format.
Also applies to: 373-378, 511-511, 564-564, 800-801, 1170-1176, 1504-1517
packages/importer/src/types.ts (1)
20-25: Naming duplication exists but no actual import collision risk – review comment is incorrect.

The analysis shows that while `ImportProgress` and `ImportStageResult` are defined in both packages, the importer package does not import these collision types from @openpanel/db. The importer only imports unrelated types like `IClickhouseEvent` and `IServiceImportedEventPayload`.

Additionally, the `ImportProgress` types serve different purposes: the importer's version tracks batch processing state (totalEvents, processedEvents, currentBatch, totalBatches), while the db version tracks job metadata (importId, totalEvents, insertedEvents, status). This is intentional, not a collision.
Likely an incorrect or invalid review comment.
packages/db/src/services/import.service.ts (4)
1-21: LGTM!

The imports and type definitions are well-structured. The interfaces clearly define the data contracts for staging results and progress tracking.
26-53: LGTM!

The batch insert function properly handles empty arrays, adds appropriate metadata, and leverages the retry-wrapped ClickHouse insert. The implementation is clean and defensive.
493-503: Verify cleanup scope: should the Postgres import record also be deleted?

The `cleanupFailedImport` function only deletes from ClickHouse's `events_imports` table. The corresponding Postgres `import` record remains with status 'failed'.
If Postgres records should also be cleaned up, add:
 export async function cleanupFailedImport(importId: string): Promise<void> {
   const deleteQuery = `
     DELETE FROM ${TABLE_NAMES.events_imports}
     WHERE import_id = {importId:String}
       AND import_status = 'failed'
   `;

   await ch.command({
     query: deleteQuery,
     query_params: { importId },
   });
+
+  // Also remove from Postgres if desired
+  await db.import.delete({
+    where: { id: importId, status: 'failed' },
+  });
 }
508-525: LGTM!

The `getImportDateBounds` utility correctly handles the nullable case and provides a clean interface for date range queries. The defensive null check on line 524 prevents runtime errors.
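The nullable-bounds pattern praised above can be sketched in isolation: an aggregate over zero rows yields nulls, which should map to `null` rather than an Invalid Date. Names below are illustrative, not the service's actual API:

```typescript
// Hypothetical row shape returned by a min/max date aggregate query.
type DateBoundsRow = { minDate: string | null; maxDate: string | null } | null;

// Defensive null handling: only build Dates when both bounds exist.
function toDateBounds(row: DateBoundsRow): { min: Date; max: Date } | null {
  if (!row || !row.minDate || !row.maxDate) {
    return null; // no events staged for this import yet
  }
  return { min: new Date(row.minDate), max: new Date(row.maxDate) };
}

console.log(toDateBounds(null)); // null
```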
warning:
  'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
Fix color contrast for accessibility.
White text on a yellow-500 background likely fails WCAG AA contrast requirements (4.5:1). Yellow backgrounds conventionally require dark text for proper readability.
Apply this diff to fix the contrast issue:
- warning:
- 'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
+ warning:
+ 'border-transparent bg-yellow-500 text-yellow-950 [a&]:hover:bg-yellow-500/90',Alternatively, use a darker yellow background with white text:
- warning:
- 'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
+ warning:
+ 'border-transparent bg-yellow-600 text-white [a&]:hover:bg-yellow-600/90',📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-  warning:
-    'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
+  warning:
+    'border-transparent bg-yellow-500 text-yellow-950 [a&]:hover:bg-yellow-500/90',

Alternative:

-  warning:
-    'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
+  warning:
+    'border-transparent bg-yellow-600 text-white [a&]:hover:bg-yellow-600/90',
🤖 Prompt for AI Agents
In apps/start/src/components/ui/badge.tsx around lines 14-15, the warning
variant uses 'bg-yellow-500 text-white' which likely fails WCAG contrast; change
either the text color to a dark color (e.g., replace text-white with
text-gray-900 or text-black) so dark text sits on yellow-500, or keep white text
and use a darker yellow background (e.g., replace bg-yellow-500 with
bg-yellow-600 or bg-yellow-700) and adjust the hover class accordingly to
maintain accessible contrast.
        toast.error('Import failed', {
          description: error.message,
        });
      },
    }),
  );
Avoid leaking secrets in error paths/logs
error.message may include upstream details. Ensure server never echoes API secrets. Consider a generic toast and log details server-side only.
🤖 Prompt for AI Agents
In apps/start/src/modals/add-import.tsx around lines 55 to 60, the current toast
displays error.message which can leak upstream secrets; replace the client-side
toast with a generic user-facing message (e.g., "Import failed, try again
later") and remove inclusion of error.message, and instead ensure the server
logs the full error details securely (or returns a sanitized error code/message)
so secrets are never echoed to the client; update the toast usage accordingly
and confirm server-side logging/response sanitization is in place.
apps/worker/src/jobs/import.ts
Outdated
import { UmamiProvider, type UmamiRawEvent } from '@openpanel/importer';
import type { ImportQueuePayload } from '@openpanel/queue';
import type { Job } from 'bullmq';
🛠️ Refactor suggestion | 🟠 Major
Remove UmamiRawEvent coupling; use BaseImportProvider generics
This job should be provider-agnostic. Swap UmamiRawEvent for BaseImportProvider.
-import { UmamiProvider, type UmamiRawEvent } from '@openpanel/importer';
+import { UmamiProvider, type BaseImportProvider } from '@openpanel/importer';Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 12 to 14, remove the direct
coupling to UmamiRawEvent and instead use the generic BaseImportProvider<T> from
the importer package: drop the UmamiRawEvent import, import BaseImportProvider
(and its type) from '@openpanel/importer', and replace any references to
UmamiRawEvent in the job/payload types with the generic BaseImportProvider<T>
(e.g., ImportQueuePayload<BaseImportProvider<any>> or a concrete T where
appropriate). Ensure all imports and Job generics are updated to use the
BaseImportProvider generic so the job becomes provider-agnostic.
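The provider-agnostic shape this suggestion aims for can be sketched as a standalone loop. The interface below is an assumed, simplified stand-in for `BaseImportProvider` (the real one lives in `@openpanel/importer`):

```typescript
// Simplified stand-in for BaseImportProvider; field names are assumptions.
interface ProviderLike<T> {
  parseSource(): AsyncGenerator<T, void, unknown>;
  validate(raw: T): boolean;
  transformEvent(raw: T): { name: string };
}

// Generic over T, so the job never names UmamiRawEvent directly.
async function countValid<T>(provider: ProviderLike<T>): Promise<number> {
  let n = 0;
  for await (const raw of provider.parseSource()) {
    if (provider.validate(raw)) n++; // skip invalid rows, keep streaming
  }
  return n;
}

// A fake numeric provider exercises the generic loop.
const fake: ProviderLike<number> = {
  async *parseSource() {
    yield 1;
    yield 2;
    yield -1;
  },
  validate: (raw) => raw > 0,
  transformEvent: (raw) => ({ name: `event-${raw}` }),
};

countValid(fake).then((n) => console.log(n)); // prints 2
```

Swapping providers then only changes which concrete `ProviderLike<T>` is constructed; the loop itself stays untouched.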
apps/worker/src/jobs/import.ts
Outdated
// Phase 1: Load & Transform - Process events in batches
const eventBatch: UmamiRawEvent[] = [];

for await (const rawEvent of providerInstance.parseSource()) {
  // Validate event
  if (!providerInstance.validate(rawEvent)) {
    jobLogger.warn('Skipping invalid event', { rawEvent });
    continue;
  }

  eventBatch.push(rawEvent);
  totalEvents++;

  // Process batch when it reaches the batch size
  if (eventBatch.length >= BATCH_SIZE) {
    jobLogger.info('Processing batch', { batchSize: eventBatch.length });

    // Transform events
    const transformedEvents = eventBatch.map((event) =>
      providerInstance.transformEvent(event),
    );

    // Phase 2: Insert into staging table
    await insertImportBatch(transformedEvents, importId);
🛠️ Refactor suggestion | 🟠 Major
Make batch buffer provider-agnostic and harden transform stage
- Use unknown[] for the raw batch.
- Guard transformEvent with try/catch to skip bad rows without failing the whole job.
- const eventBatch: UmamiRawEvent[] = [];
+ const eventBatch: unknown[] = [];
@@
- const transformedEvents = eventBatch.map((event) =>
- providerInstance.transformEvent(event),
- );
+ const transformedEvents: any[] = [];
+ for (const event of eventBatch) {
+ try {
+ transformedEvents.push(providerInstance.transformEvent(event as any));
+ } catch (e) {
+ jobLogger.warn('Skipping event on transform failure', {
+ provider: providerInstance.provider,
+ reason: e instanceof Error ? e.message : String(e),
+ });
+ }
📝 Committable suggestion
Before:

// Phase 1: Load & Transform - Process events in batches
const eventBatch: UmamiRawEvent[] = [];

for await (const rawEvent of providerInstance.parseSource()) {
  // Validate event
  if (!providerInstance.validate(rawEvent)) {
    jobLogger.warn('Skipping invalid event', { rawEvent });
    continue;
  }

  eventBatch.push(rawEvent);
  totalEvents++;

  // Process batch when it reaches the batch size
  if (eventBatch.length >= BATCH_SIZE) {
    jobLogger.info('Processing batch', { batchSize: eventBatch.length });

    // Transform events
    const transformedEvents = eventBatch.map((event) =>
      providerInstance.transformEvent(event),
    );

    // Phase 2: Insert into staging table
    await insertImportBatch(transformedEvents, importId);

After:

// Phase 1: Load & Transform - Process events in batches
const eventBatch: unknown[] = [];

for await (const rawEvent of providerInstance.parseSource()) {
  // Validate event
  if (!providerInstance.validate(rawEvent)) {
    jobLogger.warn('Skipping invalid event', { rawEvent });
    continue;
  }

  eventBatch.push(rawEvent);
  totalEvents++;

  // Process batch when it reaches the batch size
  if (eventBatch.length >= BATCH_SIZE) {
    jobLogger.info('Processing batch', { batchSize: eventBatch.length });

    // Transform events
    const transformedEvents: any[] = [];
    for (const event of eventBatch) {
      try {
        transformedEvents.push(providerInstance.transformEvent(event as any));
      } catch (e) {
        jobLogger.warn('Skipping event on transform failure', {
          provider: providerInstance.provider,
          reason: e instanceof Error ? e.message : String(e),
        });
      }
    }

    // Phase 2: Insert into staging table
    await insertImportBatch(transformedEvents, importId);
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 45 to 69, the batch buffer and
transform stage need hardening: change eventBatch from UmamiRawEvent[] to
unknown[] so the buffer is provider-agnostic, and wrap each
providerInstance.transformEvent call in a try/catch so a thrown error on one row
is caught, logged (jobLogger.warn with the rawEvent and error), and that row is
skipped without failing the whole job; ensure you collect only successful
transformed events (filter out failed/undefined results) before calling
insertImportBatch(importId).
// Properties type for events - more specific than Record<string, any>
export interface EventProperties {
  [key: string]:
    | string
    | number
    | boolean
    | null
    | undefined
    | Record<string, unknown>;
  __query?: Record<string, unknown>;
  __title?: string;
  __screen?: string;
  __language?: string;
}
EventProperties allows nested objects, but ClickHouse properties is primitives-only
IClickhouseEvent.properties is Record<string, string|number|boolean|undefined|null>. Storing objects (e.g., __query) will break ingestion unless stringified.
- Constrain EventProperties values to primitives.
- Or require JSON.stringify for nested shapes (e.g., `__queryJson: string`).
🤖 Prompt for AI Agents
In packages/importer/src/types.ts around lines 53 to 66, EventProperties
currently allows nested objects which will break ClickHouse ingestion because
IClickhouseEvent.properties must be primitives-only; update the EventProperties
interface so values are limited to string | number | boolean | null | undefined
(remove Record<string, unknown>) or replace nested object fields with explicit
JSON string fields (e.g., __queryJson: string) and rename/remove existing
__query, ensuring all nested shapes are JSON.stringified before being assigned
to EventProperties.
@@ -0,0 +1,3 @@
import { getSharedVitestConfig } from '../../vitest.shared';

export default getSharedVitestConfig({ __dirname });
__dirname is unavailable in ESM modules.
Since package.json declares "type": "module", this file runs as ESM. The __dirname global doesn't exist in ESM and will throw a ReferenceError.
Apply this diff to compute the directory path in ESM:
+import { dirname } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
import { getSharedVitestConfig } from '../../vitest.shared';
-export default getSharedVitestConfig({ __dirname });
+export default getSharedVitestConfig({
+ __dirname: dirname(fileURLToPath(import.meta.url))
📝 Committable suggestion
Before:

export default getSharedVitestConfig({ __dirname });

After:

import { dirname } from 'node:path';
import { fileURLToPath } from 'node:url';

import { getSharedVitestConfig } from '../../vitest.shared';

export default getSharedVitestConfig({
  __dirname: dirname(fileURLToPath(import.meta.url))
});
🤖 Prompt for AI Agents
In packages/importer/vitest.config.ts around line 3, the code passes the
non-existent __dirname to getSharedVitestConfig (ESM modules don't have
__dirname); compute the directory from import.meta.url instead and pass that
value: import fileURLToPath from 'url' and path from 'path' (or use their named
functions), derive const __dirname =
path.dirname(fileURLToPath(import.meta.url)), then call getSharedVitestConfig({
__dirname }) so the config receives the correct directory path in ESM.
list: protectedProcedure
  .input(z.object({ projectId: z.string() }))
  .query(async ({ input, ctx }) => {
    const access = await getProjectAccessCached({
      projectId: input.projectId,
      userId: ctx.session.userId,
    });

    if (!access) {
      throw TRPCAccessError('You do not have access to this project');
    }

    return db.import.findMany({
      where: {
        projectId: input.projectId,
      },
      orderBy: {
        createdAt: 'desc',
      },
    });
  }),
Sensitive config exposure in list/get — restrict or redact.
config likely contains provider credentials. Current list/get return it to any project member with read access. Redact or gate by role.
Option A (redact for read-level):
list: protectedProcedure
.input(z.object({ projectId: z.string() }))
.query(async ({ input, ctx }) => {
const access = await getProjectAccessCached({
projectId: input.projectId,
userId: ctx.session.userId,
});
if (!access) {
throw TRPCAccessError('You do not have access to this project');
}
- return db.import.findMany({
+ const rows = await db.import.findMany({
where: { projectId: input.projectId },
orderBy: { createdAt: 'desc' },
});
+ const canSeeSecrets =
+ typeof access !== 'boolean' && access.level !== 'read';
+ return canSeeSecrets
+ ? rows
+ : rows.map((r) => ({ ...r, config: null }));
  }),

 get: protectedProcedure
.input(z.object({ id: z.string() }))
.query(async ({ input, ctx }) => {
const importRecord = await db.import.findUniqueOrThrow({
where: { id: input.id },
include: { project: true },
});
const access = await getProjectAccessCached({
projectId: importRecord.projectId,
userId: ctx.session.userId,
});
if (!access) {
throw TRPCAccessError('You do not have access to this import');
}
- return importRecord;
+ const canSeeSecrets =
+ typeof access !== 'boolean' && access.level !== 'read';
+ return canSeeSecrets ? importRecord : { ...importRecord, config: null };
}),Option B: Use Prisma select to exclude config in list and only include it for admin/write roles.
Also applies to: 34-57
🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 12 to 32 (and likewise apply
the change to lines 34 to 57), the current list/get resolvers return the full
import records including the sensitive config field to any project member;
change the DB queries to explicitly exclude the config field by using a Prisma
select (e.g., select: { config: false, ...otherFields }) for read-level access,
and only fetch/include config when the caller has a higher role (admin/write)
after re-checking the caller's role from ctx.session or the access object;
implement a conditional: if role allows, run a second query or include config in
select, otherwise omit/redact config from the returned objects and ensure no
accidental passthrough in list results.
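The redaction rule from Option A can be isolated into a small helper. The access shape below (`boolean | { level }`) mirrors what the review assumes about `getProjectAccessCached`; the row shape is illustrative:

```typescript
// Assumed access shape: false/true or an object carrying a role level.
type Access = boolean | { level: 'read' | 'write' | 'admin' };

type ImportRow = { id: string; config: unknown };

// Null out credentials unless the caller has more than read access.
function redactConfig(row: ImportRow, access: Access): ImportRow {
  const canSeeSecrets = typeof access !== 'boolean' && access.level !== 'read';
  return canSeeSecrets ? row : { ...row, config: null };
}

const row = { id: 'imp_1', config: { apiKey: 'secret' } };
console.log(redactConfig(row, { level: 'read' }).config); // null
console.log(redactConfig(row, { level: 'admin' }).config !== null); // true
```

Keeping the rule in one helper means `list` and `get` cannot drift apart in how they treat secrets.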
// Create import record
const importRecord = await db.import.create({
  data: {
    projectId: input.projectId,
    config: input.config,
    status: 'pending',
  },
});

// Add job to queue
const job = await importQueue.add('import', {
  type: 'import',
  payload: {
    importId: importRecord.id,
  },
});

// Update import record with job ID
await db.import.update({
  where: { id: importRecord.id },
  data: { jobId: job.id },
});

return {
  ...importRecord,
  jobId: job.id,
};
}),
Queue enqueue not wrapped — record can be left pending with no job.
If importQueue.add fails, DB keeps a dangling 'pending' import. Wrap in try/catch, set failed status, and pass stable jobId/attempts/backoff.
// Create import record
const importRecord = await db.import.create({
data: {
projectId: input.projectId,
config: input.config,
status: 'pending',
},
});
- // Add job to queue
- const job = await importQueue.add('import', {
- type: 'import',
- payload: {
- importId: importRecord.id,
- },
- });
-
- // Update import record with job ID
- await db.import.update({
- where: { id: importRecord.id },
- data: { jobId: job.id },
- });
+ // Add job to queue
+ try {
+ const job = await importQueue.add(
+ 'import',
+ { type: 'import', payload: { importId: importRecord.id } },
+ { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+ );
+ // Update import record with job ID (idempotent if jobId equals importRecord.id)
+ await db.import.update({
+ where: { id: importRecord.id },
+ data: { jobId: job.id },
+ });
+ } catch (err) {
+ await db.import.update({
+ where: { id: importRecord.id },
+ data: { status: 'failed', errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed' },
+ });
+ throw err;
📝 Committable suggestion
Before:

// Create import record
const importRecord = await db.import.create({
  data: {
    projectId: input.projectId,
    config: input.config,
    status: 'pending',
  },
});

// Add job to queue
const job = await importQueue.add('import', {
  type: 'import',
  payload: {
    importId: importRecord.id,
  },
});

// Update import record with job ID
await db.import.update({
  where: { id: importRecord.id },
  data: { jobId: job.id },
});

return {
  ...importRecord,
  jobId: job.id,
};
}),

After:

// Create import record
const importRecord = await db.import.create({
  data: {
    projectId: input.projectId,
    config: input.config,
    status: 'pending',
  },
});

// Add job to queue
try {
  const job = await importQueue.add(
    'import',
    { type: 'import', payload: { importId: importRecord.id } },
    { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
  );
  // Update import record with job ID (idempotent if jobId equals importRecord.id)
  await db.import.update({
    where: { id: importRecord.id },
    data: { jobId: job.id },
  });
} catch (err) {
  await db.import.update({
    where: { id: importRecord.id },
    data: { status: 'failed', errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed' },
  });
  throw err;
}

return {
  ...importRecord,
  jobId: job.id,
};
}),
🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 72-99, the code creates an
import DB record then calls importQueue.add without error handling so a failed
enqueue leaves a dangling 'pending' record; to fix, generate a stable jobId
(e.g. UUID) before enqueue, persist that jobId on the import record (or create
the record with that jobId) and then wrap importQueue.add in try/catch — on
success update any job metadata if needed, on failure update the DB record to
status 'failed' and write the error message and retry metadata
(attempts/backoff) to the record; ensure the job payload includes the stable
jobId and desired attempts/backoff so retries are deterministic.
export const zCreateImport = z.object({
  projectId: z.string().min(1),
  provider: z.enum(['umami', 'plausible', 'mixpanel']),
  config: z.union([
    zUmamiImportConfig,
    zPlausibleImportConfig,
    zMixpanelImportConfig,
  ]),
});
Discriminate by provider to prevent mismatched payloads
provider and config can disagree (e.g., provider='umami' with a Mixpanel config). Make zCreateImport a discriminated union on provider so config shape is tied to the selected provider.
-export const zCreateImport = z.object({
- projectId: z.string().min(1),
- provider: z.enum(['umami', 'plausible', 'mixpanel']),
- config: z.union([
- zUmamiImportConfig,
- zPlausibleImportConfig,
- zMixpanelImportConfig,
- ]),
-});
+export const zCreateImport = z.discriminatedUnion('provider', [
+ z.object({
+ projectId: z.string().min(1),
+ provider: z.literal('umami'),
+ config: zUmamiImportConfig,
+ }),
+ z.object({
+ projectId: z.string().min(1),
+ provider: z.literal('plausible'),
+ config: zPlausibleImportConfig,
+ }),
+ z.object({
+ projectId: z.string().min(1),
+ provider: z.literal('mixpanel'),
+ config: zMixpanelImportConfig,
+ }),
+]);🤖 Prompt for AI Agents
In packages/validation/src/index.ts around lines 469 to 477, zCreateImport
currently allows provider and config to disagree; change it to a discriminated
union so each provider literal is paired with the correct config shape. Replace
the single z.object with either z.discriminatedUnion('provider', [...]) or a
z.union of three z.object schemas where provider is a literal ('umami' |
'plausible' | 'mixpanel') and config is the matching zUmamiImportConfig,
zPlausibleImportConfig, or zMixpanelImportConfig respectively; keep the same
exported name and types so callers get compile-time validation that provider and
config match.
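The compile-time benefit the suggestion above buys can be shown without zod: once `provider` is a literal discriminant, TypeScript narrows `config` per branch. The config fields below are illustrative placeholders, not the real schemas:

```typescript
// Plain-TypeScript view of the discriminated union the zod diff builds.
// Config fields are illustrative, not the actual provider schemas.
type CreateImport =
  | { provider: 'umami'; config: { url: string } }
  | { provider: 'plausible'; config: { siteId: string } }
  | { provider: 'mixpanel'; config: { serviceAccount: string } };

function describeImport(input: CreateImport): string {
  switch (input.provider) {
    case 'umami':
      return `umami @ ${input.config.url}`; // config narrowed per provider
    case 'plausible':
      return `plausible site ${input.config.siteId}`;
    case 'mixpanel':
      return `mixpanel via ${input.config.serviceAccount}`;
  }
}

console.log(describeImport({ provider: 'umami', config: { url: 'https://x' } }));
// umami @ https://x
```

With this shape, `{ provider: 'umami', config: { serviceAccount: '…' } }` is a type error, which is exactly the mismatch the review wants rejected.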
d71712e to b60dd0e (Compare)
The latest updates on your projects. Learn more about Vercel for GitHub.
Walkthrough

A comprehensive import system is introduced, enabling data ingestion from external providers (Umami, Mixpanel) through a new packages/importer library, TRPC router, worker job processor, and UI modal. The packages/cli is removed, referrer parsing logic is reorganized into packages/common, database schemas are extended with an imports table, and UI components gain new styling variants and props.

Changes
Sequence Diagram(s)

sequenceDiagram
participant UI as ImportsSettings Modal
participant API as TRPC import.create
participant DB as Prisma/PostgreSQL
participant Queue as Redis Queue
participant Worker as Import Job
participant CH as ClickHouse
UI->>API: Create import (provider, config, projectId)
API->>DB: Create import record (status: pending)
DB-->>API: Return import with id
API->>Queue: Enqueue import job
Queue-->>API: Return jobId
API->>DB: Update import.jobId
Worker->>DB: Fetch import record
Worker->>DB: Phase 1: Insert events to staging (batches)
Worker->>DB: Update progress after each batch
alt Generate Sessions Enabled
Worker->>CH: Phase 2: Generate session IDs from staging
Worker->>DB: Update progress
end
Worker->>CH: Phase 3: Create session_start/end events
Worker->>CH: Phase 4: Move imports to production
Worker->>CH: Phase 5: Backfill sessions to production
Worker->>DB: Mark import completed (status: completed)
Worker-->>DB: Update final progress metrics
UI->>API: Poll/Fetch import history
API->>DB: Query imports with status + progress
DB-->>API: Return import list
API-->>UI: Display table with status, progress bar, actions
sequenceDiagram
participant Form as AddImport Modal
participant Subform as UmamiImport/MixpanelImport
participant Service as Import Service (TRPC)
participant Mutation as Job Mutation
Subform->>Subform: User fills provider-specific fields
Subform->>Service: onSubmit(config)
Service->>Mutation: import.create(projectId, provider, config)
alt Success
Mutation-->>Service: { id, status, jobId }
Service->>Service: Show success toast
Service->>Service: Invalidate import list cache
Service->>Form: Close modal
else Error
Mutation-->>Service: Error response
Service->>Service: Show error toast
Service->>Form: Keep modal open
end
end

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Poem
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/common/src/object.ts (1)
22-29: Inconsistent handling of whitespace-only strings.The guard on line 22 now filters out explicit empty strings, but whitespace-only strings (e.g.,
' ') will pass this check, get trimmed to an empty string on line 28, and still be added to the accumulator. This creates inconsistent behavior.Consider moving the empty string check after the cleaning step to ensure all empty values (including trimmed whitespace) are filtered consistently:
- if (value === undefined || value === null || value === '') { + if (value === undefined || value === null) { return acc; } const cleanedValue = typeof value === 'string' ? removeInvalidSurrogates(value).trim() : String(value); + + if (cleanedValue === '') { + return acc; + } return { ...acc, [`${path}${key}`]: cleanedValue, };
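The consistent rule the comment asks for — trim first, then drop anything that ends up empty — can be captured as a standalone helper (a sketch; `cleanValue` is not the file's actual function name):

```typescript
// Trim-then-filter: null/undefined and whitespace-only values all map to null,
// so the accumulator never receives an empty string.
function cleanValue(value: unknown): string | null {
  if (value === undefined || value === null) return null;
  const cleaned = String(value).trim();
  return cleaned === '' ? null : cleaned; // whitespace-only is dropped too
}

console.log(cleanValue('  hello ')); // hello
console.log(cleanValue('   ')); // null
```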
♻️ Duplicate comments (18)
apps/start/src/components/ui/badge.tsx (1)
12-16: New badge variants added for status indication.

The three new variants (success, warning, info) extend the badge component to support import status displays and other UI feedback needs.
Note: The warning variant's color contrast issue (white text on yellow-500) has already been flagged in a previous review.
packages/db/code-migrations/5-add-imports-table.ts (1)
65-69: TTL must be defined at CREATE time (and on the local replicated table in cluster mode).

ALTERing TTL on `events_imports` will hit the Distributed table in cluster mode and be a no-op. Bake TTL into `createTable` so it applies to the replicated local table and the single-node path. Remove the ALTER.

Apply these diffs:
In this file:
const sqls: string[] = [
  ...createTable({
    name: 'events_imports',
    columns: [
      // ...
    ],
    orderBy: ['import_id', 'created_at'],
    partitionBy: 'toYYYYMM(imported_at_meta)',
    settings: {
      index_granularity: 8192,
    },
    distributionHash: 'cityHash64(import_id)',
    replicatedVersion: '1',
    isClustered,
+   ttl: "imported_at_meta + INTERVAL 7 DAY",
  }),
];

-// Add TTL policy for auto-cleanup after 7 days
-sqls.push(`
-  ALTER TABLE events_imports
-  MODIFY TTL imported_at_meta + INTERVAL 7 DAY
-`);
createTable(packages/db/src/clickhouse/migration.ts):export function createTable({ name: tableName, columns, indices = [], engine = 'MergeTree()', orderBy = ['tuple()'], partitionBy, settings = {}, distributionHash, replicatedVersion, isClustered, + ttl, }: CreateTableOptions): string[] { const columnDefinitions = [...columns, ...indices].join(',\n '); const settingsClause = Object.entries(settings).length ? `SETTINGS ${Object.entries(settings).map(([k, v]) => `${k} = ${v}`).join(', ')}` : ''; const partitionByClause = partitionBy ? `PARTITION BY ${partitionBy}` : ''; + const ttlClause = ttl ? `\nTTL ${ttl}` : ''; if (!isClustered) { return [ `CREATE TABLE IF NOT EXISTS ${tableName} ( ${columnDefinitions} ) ENGINE = ${engine} ${partitionByClause} ORDER BY (${orderBy.join(', ')}) -${settingsClause}`.trim(), +${ttlClause} +${settingsClause}`.trim(), ]; } return [ `CREATE TABLE IF NOT EXISTS ${replicated(tableName)} ON CLUSTER '{cluster}' ( ${columnDefinitions} ) ENGINE = Replicated${engine.replace(/^(.+?)\((.+?)?\)/, `$1('${CLUSTER_REPLICA_PATH.replace('{replicatedVersion}', replicatedVersion)}', '{replica}', $2)`).replace(/, \)$/, ')')} ${partitionByClause} ORDER BY (${orderBy.join(', ')}) -${settingsClause}`.trim(), +${ttlClause} +${settingsClause}`.trim(), `CREATE TABLE IF NOT EXISTS ${tableName} ON CLUSTER '{cluster}' AS ${replicated(tableName)} ENGINE = Distributed('{cluster}', currentDatabase(), ${replicated(tableName)}, ${distributionHash})`, ]; }Also applies to: 15-63
packages/validation/src/index.ts (1)
478-486: Make zCreateImport a discriminated union to bind provider ⇔ config.

As-is, `provider='umami'` can carry a Mixpanel config. Use a discriminated union on `provider`:

```diff
-export const zCreateImport = z.object({
-  projectId: z.string().min(1),
-  provider: z.enum(['umami', 'plausible', 'mixpanel']),
-  config: z.union([
-    zUmamiImportConfig,
-    zPlausibleImportConfig,
-    zMixpanelImportConfig,
-  ]),
-});
+export const zCreateImport = z.discriminatedUnion('provider', [
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('umami'),
+    config: zUmamiImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('plausible'),
+    config: zPlausibleImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('mixpanel'),
+    config: zMixpanelImportConfig,
+  }),
+]);
```

packages/trpc/src/routers/import.ts (3)
12-32: Sensitive config exposure in list — restrict or redact.

`config` contains provider credentials (API keys, secrets). The current implementation returns it to any project member with read access. Redact or gate by role. Option A (redact for read-level):

```diff
 list: protectedProcedure
   .input(z.object({ projectId: z.string() }))
   .query(async ({ input, ctx }) => {
     const access = await getProjectAccessCached({
       projectId: input.projectId,
       userId: ctx.session.userId,
     });
     if (!access) {
       throw TRPCAccessError('You do not have access to this project');
     }
-    return db.import.findMany({
+    const rows = await db.import.findMany({
       where: { projectId: input.projectId },
       orderBy: { createdAt: 'desc' },
     });
+    const canSeeSecrets =
+      typeof access !== 'boolean' && access.level !== 'read';
+    return canSeeSecrets
+      ? rows
+      : rows.map((r) => ({ ...r, config: null }));
   }),
```
34-56: Sensitive config exposure in get — restrict or redact.

`config` contains provider credentials. Apply the same access-level check as in `list` to redact secrets for read-only users:

```diff
 get: protectedProcedure
   .input(z.object({ id: z.string() }))
   .query(async ({ input, ctx }) => {
     const importRecord = await db.import.findUniqueOrThrow({
       where: { id: input.id },
       include: { project: true },
     });
     const access = await getProjectAccessCached({
       projectId: importRecord.projectId,
       userId: ctx.session.userId,
     });
     if (!access) {
       throw TRPCAccessError('You do not have access to this import');
     }
-    return importRecord;
+    const canSeeSecrets =
+      typeof access !== 'boolean' && access.level !== 'read';
+    return canSeeSecrets ? importRecord : { ...importRecord, config: null };
   }),
```
72-99: Queue enqueue not wrapped — record can be left pending with no job.

If `importQueue.add` fails, the DB keeps a dangling 'pending' import. Wrap it in try/catch, set failed status, and pass a stable jobId plus attempts/backoff:

```diff
 // Create import record
 const importRecord = await db.import.create({
   data: {
     projectId: input.projectId,
     config: input.config,
     status: 'pending',
   },
 });

-// Add job to queue
-const job = await importQueue.add('import', {
-  type: 'import',
-  payload: {
-    importId: importRecord.id,
-  },
-});
-
-// Update import record with job ID
-await db.import.update({
-  where: { id: importRecord.id },
-  data: { jobId: job.id },
-});
+// Add job to queue
+try {
+  const job = await importQueue.add(
+    'import',
+    { type: 'import', payload: { importId: importRecord.id } },
+    { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+  );
+  await db.import.update({
+    where: { id: importRecord.id },
+    data: { jobId: job.id },
+  });
+} catch (err) {
+  await db.import.update({
+    where: { id: importRecord.id },
+    data: {
+      status: 'failed',
+      errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed',
+    },
+  });
+  throw err;
+}
```

apps/worker/src/jobs/import.ts (3)
130-139: Redact raw event logging to avoid PII leakage.

Line 137 logs the full `rawEvent`, which may contain PII (emails, user identifiers, etc.). Log only minimal, non-PII identifiers:

```diff
 if (
   !providerInstance.validate(
     // @ts-expect-error
     rawEvent,
   )
 ) {
-  jobLogger.warn('Skipping invalid event', { rawEvent });
+  jobLogger.warn('Skipping invalid event', {
+    provider: providerInstance.provider,
+    reason: 'validation_failed',
+  });
   continue;
 }
```
126-169: Harden batch buffer and transform stage.

The batch uses an `any` type (Line 126) and the transform stage (Lines 147-152) lacks error handling, so a single transform failure fails the entire batch. Use `unknown[]` and wrap transforms in try/catch to skip invalid events:

```diff
-const eventBatch: any = [];
+const eventBatch: unknown[] = [];

 for await (const rawEvent of providerInstance.parseSource(
   resumeLoadingFrom,
 )) {
   // ... validation ...

   // Process batch when it reaches the batch size
   if (eventBatch.length >= BATCH_SIZE) {
     jobLogger.info('Processing batch', { batchSize: eventBatch.length });
-    const transformedEvents: IClickhouseEvent[] = eventBatch.map(
-      (
-        // @ts-expect-error
-        event,
-      ) => providerInstance!.transformEvent(event),
-    );
+    const transformedEvents: IClickhouseEvent[] = [];
+    for (const event of eventBatch) {
+      try {
+        transformedEvents.push(providerInstance.transformEvent(event as any));
+      } catch (e) {
+        jobLogger.warn('Skipping event on transform failure', {
+          provider: providerInstance.provider,
+          reason: e instanceof Error ? e.message : String(e),
+        });
+      }
+    }
```
246-265: Sync final Postgres counters with ClickHouse progress.

The UI reads `processedEvents`/`totalEvents` from Postgres. After completion, Postgres isn't updated with the final ClickHouse counts, so progress may show stale values. Update both to the `finalProgress` values:

```diff
 await markImportComplete(importId);

 await updateImportStatus(jobLogger, job, importId, {
   step: 'completed',
 });

 jobLogger.info('Import marked as complete');

 // Get final progress
 const finalProgress = await getImportProgress(importId);

 jobLogger.info('Import job completed successfully', {
   totalEvents: finalProgress.totalEvents,
   insertedEvents: finalProgress.insertedEvents,
   status: finalProgress.status,
 });
+
+// Keep Postgres in sync for UI progress
+await db.import.update({
+  where: { id: importId },
+  data: {
+    totalEvents: finalProgress.totalEvents,
+    processedEvents: finalProgress.insertedEvents,
+  },
+});
```

apps/start/src/modals/add-import.tsx (1)
245-249: Avoid leaking secrets in error messages.

Line 247 displays `error.message`, which may include upstream API details or secrets. Use a generic client-facing message and log details server-side only:

```diff
 onError: (error) => {
   toast.error('Import failed', {
-    description: error.message,
+    description: 'Failed to start import. Please check your configuration and try again.',
   });
 },
```

packages/importer/src/providers/umami.ts (5)
79-198: SSRF exposure persists when fetching arbitrary URLs.

The fetch operation on line 97 accepts user-provided URLs without validating that they don't resolve to internal networks or restricted IP ranges. This remains a critical security risk.
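A minimal guard along these lines could run before the fetch. This is a sketch, not the provider's code: `isAllowedImportUrl` is a hypothetical helper, and a complete fix must also pin DNS results, since a public hostname can still resolve to an internal IP (DNS rebinding).

```typescript
// Reject URLs whose host is loopback, link-local, or RFC 1918 private
// space before fetching. Names here are illustrative, not from the PR.
function isPrivateIpv4(host: string): boolean {
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  const [a, b] = [Number(m[1]), Number(m[2])];
  return (
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

function isAllowedImportUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== 'https:' && url.protocol !== 'http:') return false;
  const host = url.hostname.toLowerCase();
  if (host === 'localhost' || host.endsWith('.internal')) return false;
  if (host.startsWith('[')) return false; // crude IPv6 rejection for this sketch
  return !isPrivateIpv4(host);
}
```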
213-220: Device ID fallback remains incorrect and unsafe.

Lines 216-217 still use `rawEvent.visit_id!` for both the `ip` and `ua` parameters with non-null assertions. The visit_id is semantically unrelated to IP address or user agent, and the non-null assertion will throw if visit_id is undefined.
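A safer shape is to fall back to an explicitly labeled value instead of passing visit_id where an IP/user-agent is expected. Sketch only: `UmamiRawEvent` and the injected `generate` function stand in for the PR's actual types and device-id helper.

```typescript
interface UmamiRawEvent {
  visit_id?: string;
  ip?: string;
  user_agent?: string;
}

function deviceIdFor(
  e: UmamiRawEvent,
  generate: (ip: string, ua: string) => string,
): string {
  // Only fingerprint when we actually have fingerprint material.
  if (e.ip && e.user_agent) return generate(e.ip, e.user_agent);
  // Otherwise fall back to the visit itself, clearly labeled,
  // rather than pretending visit_id is an IP — and never assert non-null.
  return e.visit_id ? `visit:${e.visit_id}` : '';
}
```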
224-228: URL construction issues remain unresolved.

Line 225 concatenates hostname, path, and query without ensuring proper formatting. The query parameter needs a leading '?' and the hostname should be required for absolute URLs.
231-233: Referrer query parameter is still ignored.

Line 232 constructs the referrerUrl but omits `rawEvent.referrer_query`, losing query string information from the referrer.
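Both URL issues can be addressed with the WHATWG URL API rather than string concatenation; setting `search` adds the leading '?' automatically. This is a sketch — the field names mirror the export columns mentioned in this review (hostname/path/query and their referrer counterparts), not the provider's actual code.

```typescript
// Build an absolute URL from separate host/path/query columns.
// Returns '' when no hostname is available, since an absolute URL
// cannot be formed without one.
function buildUrl(
  host: string | undefined,
  path: string,
  query?: string,
): string {
  if (!host) return '';
  const url = new URL(`https://${host}`);
  url.pathname = path || '/';
  if (query) url.search = query; // URL inserts the leading '?' for us
  return url.toString();
}
```

The same helper covers the referrer case: `buildUrl(referrer_domain, referrer_path, referrer_query)` keeps the referrer's query string instead of dropping it.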
257-296: Properties structure violates ClickHouse type expectations.

Lines 260 and 267-296 build nested objects (`properties.__query` and `properties.__query.utm_*`) which conflict with `IClickhouseEvent.properties: Record<string, string | number | boolean | undefined | null>`. ClickHouse expects flat primitive values.
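The expected flat shape can be produced by dot-flattening nested keys — a minimal sketch in the spirit of the `toDots` helper this repo already exposes (packages/common/src/object.ts), shown only to illustrate the target structure:

```typescript
type Primitive = string | number | boolean | null | undefined;

// Recursively flatten { __query: { utm_source: 'x' } } into
// { '__query.utm_source': 'x' } so every value is a primitive.
function flattenProps(
  obj: Record<string, unknown>,
  prefix = '',
): Record<string, Primitive> {
  const out: Record<string, Primitive> = {};
  for (const [key, value] of Object.entries(obj)) {
    const dotted = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(out, flattenProps(value as Record<string, unknown>, dotted));
    } else {
      out[dotted] = value as Primitive;
    }
  }
  return out;
}
```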
256-256: Timestamp collision risk persists.

Lines 256 and 293 adjust timestamps by exactly ±1000ms (1 second). This can collide with real events at those timestamps, causing non-deterministic ordering in time-based queries.
Also applies to: 293-293
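One way to shrink the collision window (a sketch, not the PR's implementation) is to offset the synthetic session markers by a single millisecond, assuming timestamps are stored with millisecond precision — second-granularity source data then never lands on the marker values:

```typescript
// Synthetic session_start/session_end markers offset by 1ms instead of
// a full second around the real first/last event timestamps.
function sessionMarkerTimestamps(firstEventMs: number, lastEventMs: number) {
  return {
    sessionStart: firstEventMs - 1, // 1ms before the first event
    sessionEnd: lastEventMs + 1, // 1ms after the last event
  };
}
```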
559-559: Non-deterministic status aggregation remains.

Line 559 uses `any(import_status)`, which returns an arbitrary status when events have mixed statuses. This makes progress reporting unpredictable.
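In ClickHouse itself a deterministic alternative is typically an `argMax` or a precedence expression; expressed as application-side logic the idea looks like this (status names and ranking are illustrative, not the service's actual enum):

```typescript
// Rank statuses by significance and always report the highest-ranked one,
// instead of "whichever value the engine happened to see first".
const STATUS_RANK: Record<string, number> = {
  failed: 3,
  pending: 2,
  completed: 1,
};

function aggregateStatus(statuses: string[]): string {
  return statuses.reduce(
    (acc, s) => ((STATUS_RANK[s] ?? 0) > (STATUS_RANK[acc] ?? 0) ? s : acc),
    'completed',
  );
}
```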
603-603: Type assertion bypasses runtime validation.

Line 603 uses a type assertion for `row.status` without runtime validation. If ClickHouse returns an unexpected value (due to the `any()` issue on line 559 or data corruption), this will silently pass invalid data.
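A type guard makes the narrowing explicit and fails loudly on unexpected values. Sketch only — the status list below is illustrative; the real fix should reuse the service's actual status enum (or a zod schema, which this repo already depends on):

```typescript
const IMPORT_STATUSES = ['pending', 'processing', 'completed', 'failed'] as const;
type ImportStatus = (typeof IMPORT_STATUSES)[number];

function isImportStatus(value: unknown): value is ImportStatus {
  return (
    typeof value === 'string' &&
    (IMPORT_STATUSES as readonly string[]).includes(value)
  );
}

// Use in place of `row.status as ImportStatus`:
function parseStatus(value: unknown): ImportStatus {
  if (!isImportStatus(value)) {
    throw new Error(`Unexpected import status from ClickHouse: ${String(value)}`);
  }
  return value;
}
```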
🧹 Nitpick comments (10)
apps/start/src/components/project-mapper.tsx (1)
50-50: Consider validating for duplicate source project IDs.

The component allows users to add multiple mappings with the same "from" (source project ID), which could lead to ambiguous or conflicting import behavior. Consider adding validation to prevent duplicate source IDs, either here or in the parent form's validation schema.
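A hypothetical helper for the form's validation layer could surface the offending IDs (names are illustrative, not from the component):

```typescript
interface ProjectMapping {
  from: string; // source project id
  to: string; // target project id
}

// Return every source id that appears more than once.
function findDuplicateSources(mappings: ProjectMapping[]): string[] {
  const seen = new Set<string>();
  const dupes = new Set<string>();
  for (const { from } of mappings) {
    if (seen.has(from)) dupes.add(from);
    seen.add(from);
  }
  return [...dupes];
}
```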
packages/db/src/clickhouse/client.ts (1)
136-156: Consider immutability in insert proxy.

The proxy mutates `args[0]` directly when injecting `clickhouse_settings`. While functional, this modifies the caller's object. Consider creating a new object to avoid mutation:

```diff
 return (...args: any[]) =>
   withRetry(() => {
-    args[0].clickhouse_settings = {
+    const settings = {
       // Allow bigger HTTP payloads/time to stream rows
       async_insert: 1,
       wait_for_async_insert: 1,
       // Increase insert timeouts and buffer sizes for large batches
       max_execution_time: 300,
       max_insert_block_size: '500000',
       max_http_get_redirects: '0',
       // Ensure JSONEachRow stays efficient
       input_format_parallel_parsing: 1,
       // Keep long-running inserts/queries from idling out at proxies by sending progress headers
       send_progress_in_http_headers: 1,
       http_headers_progress_interval_ms: '50000',
       // Ensure server holds the connection until the query is finished
       wait_end_of_query: 1,
       ...args[0].clickhouse_settings,
     };
+    return value.apply(target, [{ ...args[0], clickhouse_settings: settings }]);
-    return value.apply(target, args);
   });
```

packages/common/scripts/get-referrers.ts (2)
91-92: Improve error handling in main function.

The current error handling only logs to console without providing actionable information or proper error propagation. Apply this diff:

```diff
 } catch (e) {
-  console.log(e);
+  console.error('Failed to generate referrers:', e);
+  process.exit(1);
 }
```

This ensures the script fails with a non-zero exit code when referrer generation fails, allowing CI/build systems to detect the failure.
69-71: Implement robust error handling and fallback strategy for the external data source.

The Snowplow referrer data source is currently available and returns valid JSON (HTTP 200), but the script lacks fallback mechanisms. If the fetch fails during the build, the error is only logged to the console without preventing silent failures. Consider adding retry logic, a timeout, or fallback to a cached/bundled version to ensure build reliability when the external source becomes temporarily unavailable.
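The retry part could be a small generic wrapper around the fetch — a sketch under the assumption that the script can tolerate a few seconds of backoff during builds, not the script's current code:

```typescript
// Retry an async task with exponential backoff. Combine with
// AbortSignal.timeout(...) per attempt for the fetch itself, and fall
// back to the last committed referrers file if every attempt fails.
async function withRetries<T>(
  task: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```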
packages/importer/src/providers/metadata.ts (2)
13-30: Keep provider metadata in sync with validation (plausible present there).

Validation accepts 'plausible', but it's missing here. Either add Plausible to `IMPORT_PROVIDERS` or drop it from validation until ready.
Would you like a patch to add Plausible with a neutral placeholder logo?
18-19: Avoid hot‑linking external logos; serve from our assets/CDN.Brandfetch links can break or change. Prefer checked‑in assets or our CDN.
Also applies to: 26-28
packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1)
1-3: Consider indexes if filtering/sorting by progress fields.

If `currentBatch`/`currentStep` appear in WHERE/ORDER BY, add indexes (even partial) to avoid seq scans. Otherwise this change looks fine.
35-45: Minor improvement: Cleaner Bull Board initialization.

Refactoring from a wrapped object literal to a direct `createBullBoard()` call improves code clarity. This is a good incidental cleanup.
84-112: Consider simplifying the redundant type checks.

Each override field uses a pattern like:

```ts
typeof overrides?.__os === 'string' && overrides?.__os
  ? overrides?.__os
  : res.os.name
```

The condition `typeof overrides?.__os === 'string' && overrides?.__os` is redundant. If the typeof check passes and the value is truthy, the second check adds no value unless you're explicitly filtering out empty strings. If that's the intent, a clearer pattern would be:

```diff
-os:
-  typeof overrides?.__os === 'string' && overrides?.__os
-    ? overrides?.__os
-    : res.os.name,
+os:
+  typeof overrides?.__os === 'string' && overrides?.__os !== ''
+    ? overrides?.__os
+    : res.os.name,
```

Or if empty strings should be accepted:

```diff
-os:
-  typeof overrides?.__os === 'string' && overrides?.__os
-    ? overrides?.__os
-    : res.os.name,
+os: typeof overrides?.__os === 'string' ? overrides?.__os : res.os.name,
```
46-76: Polling interval appropriate for progress tracking.

The 5-second refetch interval on Line 50 provides real-time import progress updates without excessive load. Consider adding a condition to stop polling when all imports are completed/failed.

Optional optimization:

```diff
 const importsQuery = useQuery(
   trpc.import.list.queryOptions(
     { projectId },
     {
-      refetchInterval: 5000,
+      refetchInterval: (query) => {
+        const hasActiveImports = query.state.data?.some(
+          (imp) => imp.status === 'pending' || imp.status === 'processing'
+        );
+        return hasActiveImports ? 5000 : false;
+      },
     },
   ),
 );
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (63)
- `apps/api/package.json` (1 hunks)
- `apps/start/package.json` (2 hunks)
- `apps/start/src/components/integrations/integration-card.tsx` (1 hunks)
- `apps/start/src/components/project-mapper.tsx` (1 hunks)
- `apps/start/src/components/ui/badge.tsx` (1 hunks)
- `apps/start/src/modals/add-import.tsx` (1 hunks)
- `apps/start/src/modals/date-ranger-picker.tsx` (1 hunks)
- `apps/start/src/modals/index.tsx` (2 hunks)
- `apps/start/src/routeTree.gen.ts` (11 hunks)
- `apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx` (1 hunks)
- `apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx` (1 hunks)
- `apps/worker/package.json` (3 hunks)
- `apps/worker/src/boot-workers.ts` (4 hunks)
- `apps/worker/src/index.ts` (2 hunks)
- `apps/worker/src/jobs/events.incoming-event.ts` (3 hunks)
- `apps/worker/src/jobs/events.incoming-events.test.ts` (2 hunks)
- `apps/worker/src/jobs/import.ts` (1 hunks)
- `package.json` (1 hunks)
- `packages/cli/package.json` (0 hunks)
- `packages/cli/src/cli.ts` (0 hunks)
- `packages/cli/src/importer/importer.ts` (0 hunks)
- `packages/cli/src/importer/index.ts` (0 hunks)
- `packages/cli/tsconfig.json` (0 hunks)
- `packages/cli/tsup.config.ts` (0 hunks)
- `packages/common/package.json` (1 hunks)
- `packages/common/scripts/get-referrers.ts` (1 hunks)
- `packages/common/server/index.ts` (1 hunks)
- `packages/common/server/parse-referrer.test.ts` (5 hunks)
- `packages/common/server/parse-referrer.ts` (4 hunks)
- `packages/common/server/parser-user-agent.ts` (2 hunks)
- `packages/common/src/object.ts` (1 hunks)
- `packages/db/code-migrations/5-add-imports-table.sql` (1 hunks)
- `packages/db/code-migrations/5-add-imports-table.ts` (1 hunks)
- `packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql` (1 hunks)
- `packages/db/prisma/migrations/20251018214030_fix/migration.sql` (1 hunks)
- `packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql` (1 hunks)
- `packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql` (1 hunks)
- `packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql` (1 hunks)
- `packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql` (1 hunks)
- `packages/db/prisma/migrations/20251028152531_imports_again/migration.sql` (1 hunks)
- `packages/db/prisma/schema.prisma` (2 hunks)
- `packages/db/src/buffers/event-buffer.test.ts` (8 hunks)
- `packages/db/src/clickhouse/client.ts` (2 hunks)
- `packages/db/src/services/event.service.ts` (1 hunks)
- `packages/db/src/services/import.service.ts` (1 hunks)
- `packages/db/src/services/session.service.ts` (1 hunks)
- `packages/db/src/types.ts` (2 hunks)
- `packages/importer/package.json` (1 hunks)
- `packages/importer/src/base-provider.ts` (1 hunks)
- `packages/importer/src/index.ts` (1 hunks)
- `packages/importer/src/providers/metadata.ts` (1 hunks)
- `packages/importer/src/providers/mixpanel.test.ts` (1 hunks)
- `packages/importer/src/providers/mixpanel.ts` (1 hunks)
- `packages/importer/src/providers/umami.ts` (1 hunks)
- `packages/importer/src/types.ts` (1 hunks)
- `packages/importer/tsconfig.json` (1 hunks)
- `packages/importer/vitest.config.ts` (1 hunks)
- `packages/queue/src/queues.ts` (1 hunks)
- `packages/redis/package.json` (1 hunks)
- `packages/trpc/package.json` (1 hunks)
- `packages/trpc/src/root.ts` (2 hunks)
- `packages/trpc/src/routers/import.ts` (1 hunks)
- `packages/validation/src/index.ts` (1 hunks)
💤 Files with no reviewable changes (6)
- packages/cli/tsup.config.ts
- packages/cli/src/importer/index.ts
- packages/cli/package.json
- packages/cli/src/cli.ts
- packages/cli/src/importer/importer.ts
- packages/cli/tsconfig.json
✅ Files skipped from review due to trivial changes (2)
- packages/importer/src/providers/mixpanel.test.ts
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql
🚧 Files skipped from review as they are similar to previous changes (12)
- packages/db/src/services/event.service.ts
- packages/db/code-migrations/5-add-imports-table.sql
- packages/importer/vitest.config.ts
- apps/worker/package.json
- apps/start/src/components/integrations/integration-card.tsx
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql
- packages/db/prisma/migrations/20251018214030_fix/migration.sql
- packages/db/src/types.ts
- packages/importer/tsconfig.json
- packages/importer/package.json
- packages/importer/src/types.ts
- apps/start/src/routeTree.gen.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
apps/start/src/modals/index.tsxapps/start/src/modals/date-ranger-picker.tsxapps/start/src/components/ui/badge.tsxapps/start/src/components/project-mapper.tsxapps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsxapps/start/src/modals/add-import.tsxapps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx
🧬 Code graph analysis (17)
apps/start/src/modals/index.tsx (1)
apps/start/src/modals/add-import.tsx (1)
AddImport(231-282)
apps/worker/src/index.ts (1)
packages/queue/src/queues.ts (1)
importQueue(175-181)
packages/trpc/src/routers/import.ts (6)
packages/trpc/src/trpc.ts (2)
createTRPCRouter(138-138)protectedProcedure(161-164)packages/trpc/src/access.ts (1)
getProjectAccessCached(4-4)packages/trpc/src/errors.ts (1)
TRPCAccessError(3-7)packages/db/src/prisma-client.ts (1)
db(270-270)packages/validation/src/index.ts (1)
zCreateImport(478-486)packages/queue/src/queues.ts (1)
importQueue(175-181)
packages/trpc/src/root.ts (1)
packages/trpc/src/routers/import.ts (1)
importRouter(11-178)
packages/db/code-migrations/5-add-imports-table.ts (2)
packages/db/code-migrations/helpers.ts (1)
getIsCluster(17-24)packages/db/src/clickhouse/migration.ts (2)
createTable(68-116)runClickhouseMigrationCommands(322-454)
packages/queue/src/queues.ts (1)
packages/redis/redis.ts (1)
getRedisQueue(93-105)
apps/worker/src/boot-workers.ts (2)
packages/queue/src/queues.ts (1)
importQueue(175-181)apps/worker/src/jobs/import.ts (1)
importJob(27-283)
apps/start/src/components/project-mapper.tsx (2)
packages/db/src/services/project.service.ts (1)
IServiceProject(7-7)apps/start/src/components/ui/select.tsx (5)
Select(173-173)SelectTrigger(181-181)SelectValue(182-182)SelectContent(174-174)SelectItem(176-176)
packages/importer/src/base-provider.ts (2)
packages/importer/src/types.ts (3)
BaseRawEvent(40-42)ImportJobMetadata(69-73)ErrorContext(45-51)packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (4)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
Route(12-34)packages/importer/src/providers/metadata.ts (1)
IMPORT_PROVIDERS(13-30)apps/public/components/ui/tooltip.tsx (1)
Tooltiper(36-51)apps/start/src/components/integrations/integration-card.tsx (3)
IntegrationCard(91-112)IntegrationCardLogoImage(50-69)IntegrationCardFooter(3-15)
apps/start/src/modals/add-import.tsx (3)
packages/validation/src/index.ts (5)
zUmamiImportConfig(452-454)zMixpanelImportConfig(461-470)IUmamiImportConfig(456-456)IMixpanelImportConfig(471-471)IImportConfig(473-476)apps/start/src/components/project-mapper.tsx (1)
ProjectMapper(33-116)apps/public/lib/utils.ts (1)
cn(4-6)
packages/common/server/parse-referrer.test.ts (1)
packages/common/server/parse-referrer.ts (2)
parseReferrer(17-26)getReferrerWithQuery(28-66)
packages/importer/src/providers/mixpanel.ts (7)
packages/validation/src/index.ts (1)
IMixpanelImportConfig(471-471)packages/logger/index.ts (1)
ILogger(6-6)packages/db/src/services/event.service.ts (2)
IClickhouseEvent(68-102)query(701-847)packages/common/src/url.ts (2)
parsePath(11-49)isSameDomain(51-63)packages/common/server/parse-referrer.ts (2)
parseReferrer(17-26)getReferrerWithQuery(28-66)packages/common/server/parser-user-agent.ts (2)
parseUserAgent(72-115)UserAgentInfo(71-71)packages/common/src/object.ts (2)
toDots(3-36)strip(48-48)
packages/importer/src/providers/umami.ts (5)
packages/validation/src/index.ts (1)
IUmamiImportConfig(456-456)packages/logger/index.ts (1)
ILogger(6-6)packages/db/src/services/event.service.ts (2)
IClickhouseEvent(68-102)query(701-847)packages/common/src/url.ts (2)
parsePath(11-49)isSameDomain(51-63)packages/common/server/parse-referrer.ts (2)
parseReferrer(17-26)getReferrerWithQuery(28-66)
packages/db/src/services/import.service.ts (4)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)packages/db/src/clickhouse/client.ts (4)
ch(131-161)TABLE_NAMES(48-60)convertClickhouseDateToJs(238-240)formatClickhouseDate(208-219)packages/logger/index.ts (1)
ILogger(6-6)packages/db/src/prisma-client.ts (1)
db(270-270)
apps/worker/src/jobs/import.ts (9)
packages/queue/src/queues.ts (1)
ImportQueuePayload(168-173)packages/db/src/prisma-client.ts (1)
db(270-270)packages/db/src/services/import.service.ts (10)
ImportSteps(671-671)updateImportStatus(673-736)getImportDateBounds(610-634)insertImportBatch(28-55)generateSessionIds(62-98)createSessionsStartEndEvents(106-316)moveImportsToProduction(322-426)backfillSessionsToProduction(428-521)markImportComplete(526-545)getImportProgress(550-605)packages/db/src/clickhouse/client.ts (2)
formatClickhouseDate(208-219)error(40-45)packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)packages/logger/index.ts (1)
ILogger(6-6)packages/importer/src/index.ts (2)
UmamiProvider(1-1)MixpanelProvider(2-2)packages/importer/src/providers/umami.ts (1)
UmamiProvider(59-382)packages/importer/src/providers/mixpanel.ts (1)
MixpanelProvider(18-441)
packages/common/scripts/get-referrers.ts (1)
apps/worker/scripts/get-referrers.ts (1)
main(66-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-push-worker
- GitHub Check: build-and-push-dashboard
- GitHub Check: build-and-push-api
🔇 Additional comments (43)
packages/db/src/services/session.service.ts (1)
195-199: Incomplete scope: Performance concern applies to both session and event query services.

The dateIntervalInDays change (7 → 360 days) was applied to both `packages/db/src/services/session.service.ts` (lines 195-199) and `packages/db/src/services/event.service.ts` (lines 418-432) with identical logic. The original review only identifies the session service impact but misses the event service, effectively doubling the performance concern.
Recommendations remain the same for both services:
- Introduce intermediate tiers based on organization size
- Make the interval configurable per organization
- Add UI controls for explicit date range selection
Likely an incorrect or invalid review comment.
apps/start/src/modals/date-ranger-picker.tsx (1)
30-30: LGTM! Good UX improvement.

Adding `captionLayout="dropdown"` provides a more compact date navigation interface, especially useful in modal contexts where space is limited.

apps/api/package.json (1)
67-67: LGTM! Reasonable pinning for build tool.Pinning the build tool (tsdown) to an exact version ensures consistent builds across environments. Since this is a devDependency, the trade-off of missing automatic updates is acceptable.
packages/db/src/buffers/event-buffer.test.ts (2)
13-23: LGTM! Test data aligned with production settings.

The `clickhouseSettings` constant mirrors the settings added to the production ClickHouse client, ensuring tests validate the actual insert behavior including async operations, timeouts, and performance optimizations.
142-142: LGTM! Consistent test expectations.

All insert expectations now include the `clickhouse_settings` field, ensuring tests verify that the proxy-wrapped insert method properly injects these settings.

Also applies to: 187-187, 519-519, 570-570, 615-615, 640-640, 688-688
packages/common/package.json (1)
8-9: LGTM! Script for generating referrer mappings.

The new `gen:referrers` script generates up-to-date referrer mappings from an external source, with automatic formatting applied to the output.
59-59: LGTM! Table name for import tracking.

Adding `events_imports` to the table names registry supports the new import system introduced in this PR.
64-67: LGTM! Extended timeouts for large import operations.

The increased timeouts accommodate large data imports:

- request_timeout: 60s → 300s (5 minutes)
- idle_socket_ttl: 8s → 60s
max_execution_time: 300setting in the insert configuration.packages/redis/package.json (1)
11-11: Accept ioredis 5.8.2 upgrade with exact pinning.

ioredis 5.8.2 is the latest version, and the upgrade is safe. v5.8.2 (released 2025-10-21) contains only bug fixes: default IP family selection and CLIENT SETINFO command improvements—no breaking changes. No security vulnerabilities were detected. The exact version pin (removing the caret) trades automatic patch-level security updates for build reproducibility, which is a valid tradeoff for production stability.
apps/worker/src/jobs/events.incoming-events.test.ts (1)
100-103: Referrer type normalization aligns with existing code patterns — no breaking query filters found.
The codebase already normalizes `undefined`/`null` referrer fields to empty strings via nullish coalescing (`?? ''`) in event.service.ts. No queries or dashboards filtering on `referrerType='unknown'` were detected in the search. The change to normalize `referrerType` to `''` in the test is consistent with the production code's existing behavior.

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
45-45: LGTM! Clean addition of the Imports tab.

The new tab entry follows the existing pattern and integrates well with the settings navigation structure.
packages/trpc/package.json (1)
19-19: LGTM! Appropriate dependency addition.Adding
@openpanel/queueenables TRPC routes to interact with the import queue system, aligning with the new import workflow introduced in this PR.packages/common/server/index.ts (1)
4-4: LGTM! Consistent re-export pattern.The new re-export follows the existing pattern and appropriately expands the public API surface of the common/server module.
apps/worker/src/index.ts (1)
8-8: LGTM! ImportQueue properly integrated into Bull Board monitoring.The addition of
importQueueto the Bull Board UI enables visibility and monitoring of import jobs, which is essential for debugging and tracking the new import workflow.Also applies to: 42-42
apps/start/src/modals/index.tsx (1)
8-8: LGTM! Modal registration follows established pattern.The
AddImportmodal is properly registered in the modals registry, enabling the import configuration UI flow described in the PR objectives.Also applies to: 41-41
package.json (1)
12-12: Verified: the `common` package correctly defines the `gen:referrers` script.

The script exists in `packages/common/package.json` and runs the referrer generation logic. The filter change from `worker` to `common` is valid and properly configured.
1-2: Migration correctly implements nullable `currentStep` for import workflow resumption.

The codebase explicitly depends on `currentStep` being nullable: `apps/worker/src/jobs/import.ts:56` checks `if (record.currentStep === null)` to identify new imports versus resumed imports. This null state allows the import job to distinguish between first-time executions (null) and resumptions from a previous step. The migration is correct as-is.
34-34: LGTM! Dependency addition aligns with the import feature.The new
@openpanel/importerworkspace dependency is correctly added to support the import functionality introduced in this PR.packages/trpc/src/root.ts (1)
7-7: LGTM! Import router correctly integrated.The
importRouteris properly imported and wired into theappRouter, following the established pattern for other routers in the application.Also applies to: 44-44
packages/common/server/parser-user-agent.ts (1)
71-71: LGTM! Useful type export.The new
UserAgentInfotype export provides a clean type definition for consumers ofparseUserAgent.packages/common/server/parse-referrer.test.ts (1)
8-8: LGTM! Test expectations correctly updated.The test expectations have been updated from
'unknown'to''(empty string) to match the implementation changes inparse-referrer.ts. This is consistent with the new default behavior for unknown referrers.Also applies to: 14-14, 44-44, 52-52, 67-67, 91-91, 99-99, 113-113
packages/importer/src/index.ts (1)
1-13: LGTM! Clean barrel export for importer package.The index file properly exports the provider implementations and type definitions, establishing a clean public API for the importer package.
packages/queue/src/queues.ts (1)
168-181: LGTM! Import queue properly configured.
The new importQueue follows the established pattern for other queues in the codebase, with appropriate job retention settings (removeOnComplete: 10, removeOnFail: 50).
apps/worker/src/boot-workers.ts (1)
8-8: LGTM! Import worker correctly configured.
The importWorker is properly set up with:
- Correct queue and job handler imports
- Configurable concurrency via the IMPORT_JOB_CONCURRENCY environment variable
- Standard worker options
Also applies to: 23-23, 61-64
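Side note: an env-driven concurrency setting like this usually wants a guarded parse. A minimal sketch — the fallback value of 1 is an assumption here, not necessarily the worker's actual default:

```typescript
// Sketch of env-driven worker concurrency with a guarded parse.
// The variable name matches the review; the default of 1 is an assumption.
function getConcurrency(
  env: Record<string, string | undefined>,
  fallback = 1,
): number {
  const raw = env.IMPORT_JOB_CONCURRENCY;
  const parsed = raw ? Number.parseInt(raw, 10) : Number.NaN;
  // Fall back when the variable is unset, non-numeric, or non-positive.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}
```

Passing the result into the worker options keeps a typo in the environment from silently producing `NaN` concurrency.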
packages/common/server/parse-referrer.ts (3)
35-51: Improved referrer matching logic.
The updated matching logic is cleaner and more maintainable:
- Source is now normalized to lowercase upfront
- Matching checks the direct key, the .com suffix, and a case-insensitive name lookup
- Removes the redundant fallback
The logic now properly handles case-insensitive matching for query parameters like utm_source=GoOgLe.
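The lookup order described above can be sketched stand-alone. The tiny referrer table below is a hypothetical stand-in for the generated referrers module, not its real contents:

```typescript
// Hypothetical referrer table; the real one is generated into
// packages/common/server/referrers and is far larger.
const referrers: Record<string, { type: string; name: string }> = {
  'google.com': { type: 'search', name: 'Google' },
};

function matchReferrer(source: string): { type: string; name: string } {
  const key = source.toLowerCase(); // normalize upfront
  // 1. Direct key match (e.g. "google.com")
  if (referrers[key]) return referrers[key];
  // 2. Bare name with ".com" appended (e.g. "google" -> "google.com")
  if (referrers[`${key}.com`]) return referrers[`${key}.com`];
  // 3. Case-insensitive match on the display name (e.g. "GoOgLe")
  const byName = Object.values(referrers).find(
    (r) => r.name.toLowerCase() === key,
  );
  // Unknown referrers default to empty strings, matching the updated tests.
  return byName ?? { type: '', name: '' };
}
```

This mirrors the three-step order the review describes and the empty-string default the tests now expect.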
23-23: No breaking change detected.
The code already defaults to empty strings for unknown referrers. Tests confirm this is the expected behavior, and no consumers in the codebase check for type === 'unknown'. The alleged change from 'unknown' to '' either does not exist or was already in place prior to this PR.
Likely an incorrect or invalid review comment.
1-1: Import paths are correct and resolve properly.
Both imports in packages/common/server/parse-referrer.ts successfully resolve:
- Line 1: ../src/string → packages/common/src/string.ts ✓
- Line 3: ./referrers → packages/common/server/referrers/index.ts ✓
packages/trpc/src/routers/import.ts (1)
101-133: LGTM!
The delete procedure correctly checks write permissions and safely removes the queue job before deleting the record.
apps/worker/src/jobs/events.incoming-event.ts (2)
8-12: LGTM!
Import reorganization consolidates referrer utilities from @openpanel/common/server and correctly uses a type-only import for Job.
Also applies to: 21-21, 23-23
120-122: Referrer priority adjusted — verify behavior.
baseEvent.referrer now exclusively uses referrer?.url (removing the fallback to utmReferrer?.url), and referrerType prioritizes referrer?.type before utmReferrer?.type. This aligns with centralized referrer parsing, but ensure existing analytics flows handle the priority change correctly.
packages/db/prisma/schema.prisma (2)
470-475: LGTM!
The ImportStatus enum defines a clear import lifecycle: pending → processing → completed/failed.
197-197: LGTM!
The Project relation to imports is correctly defined and matches the Import model's foreign key.
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (2)
88-122: LGTM!
getStatusBadge provides clear visual feedback with appropriate icons and tooltips for failed imports showing error details.
229-263: LGTM!
Progress display handles multiple scenarios (estimated counts, N/A, processing) with clear user communication via tooltips and progress bars.
apps/worker/src/jobs/import.ts (1)
285-298: LGTM!
The createProvider factory cleanly instantiates provider-specific implementations and throws descriptive errors for unknown providers.
packages/importer/src/base-provider.ts (3)
4-31: LGTM!
The abstract base class defines a clear provider contract with streaming support (AsyncGenerator), transformation, validation, and event counting. Well-documented and type-safe.
33-58: LGTM!
Optional hooks provide extension points for providers without forcing implementation. Default behaviors (re-throw errors, return 0 for estimated total) are sensible.
78-120: LGTM!
The getDateChunks utility correctly splits date ranges for batched processing, handles edge cases (same start/end date, chunks exceeding the end date), and is well-documented.
apps/start/src/modals/add-import.tsx (3)
44-107: LGTM!
The UmamiImport form correctly uses the zUmamiImportConfig schema, includes required fields (provider, type, fileUrl), and integrates ProjectMapper for optional mappings.
115-229: LGTM!
The MixpanelImport form correctly implements required fields with appropriate input types (password for secret) and integrates a date range picker for selecting the import window.
253-259: LGTM!
Payload construction correctly passes config from the forms, which already include the required provider and type fields set in form defaults (Lines 59-60 for Umami, Lines 124-125 for Mixpanel).
1-14: The migration concerns are valid and already identified in the codebase.The migration under review makes
currentStepNOT NULL without backfilling data, and the very next migration (20251028150655_fix_imports_again) immediately reverts this change, indicating the problem was encountered. Additionally, another follow-up migration (20251028152531_imports_again) modifiescurrentBatchagain.Since these are part of an active development cycle on a newly created table (10 days old as of the migration date), consider consolidating these migrations:
- Combine 20251028150123, 20251028150655, and 20251028152531 into a single migration that correctly handles the schema without intermediate reversals, or
- If these are across separate PRs, ensure
currentStepremains nullable (or add a backfill to populate existing NULLs before setting NOT NULL).The current state requires the next migration to fix the previous one, which suggests the schema design should be finalized before deployment.
packages/db/src/services/import.service.ts (1)
682-691: Verify terminal state handling.
Lines 684-691 unconditionally set status: 'processing' when step is 'loading'. If an import has already reached a terminal state ('completed' or 'failed'), this would revert it. Confirm whether this is intentional for retry scenarios, or if a state check is needed.
If terminal states should be protected, add a guard:
const current = await db.import.findUnique({
  where: { id: importId },
  select: { status: true },
});
if (current?.status === 'completed' || current?.status === 'failed') {
  // Skip or throw depending on requirements
  return;
}
"type": "module",
"scripts": {
  "testing": "pnpm dev",
  "testing2": "pnpm dev",
Clarify the script rename from "testing" to "testing2".
The script key was renamed from "testing" to "testing2" without clear justification. This appears to be a development artifact rather than an intentional change for this PR.
If this rename is intentional, please clarify the purpose. Otherwise, consider reverting it or using a more descriptive name.
🤖 Prompt for AI Agents
In apps/start/package.json around line 6, the script key was renamed from
"testing" to "testing2" which looks like an accidental dev artifact; either
revert the key back to "testing" to preserve original functionality or replace
it with a descriptive name (e.g., "dev" or "start:testing") if the rename is
intentional, and update any references in documentation or CI that call the
script accordingly; if intentional, add a short comment in the PR description
explaining the reason for the new name.
interface ProjectMapperProps {
  fields: FieldArrayWithId<any, 'projectMapper', 'id'>[];
  append: UseFieldArrayAppend<any, 'projectMapper'>;
  remove: UseFieldArrayRemove;
  projects: IServiceProject[];
  register: UseFormRegister<any>;
  watch: UseFormWatch<any>;
  setValue: UseFormSetValue<any>;
}
🛠️ Refactor suggestion | 🟠 Major
Define a proper form type instead of using any.
The interface uses any for all generic type parameters, which defeats TypeScript's type safety. Define an explicit form schema type to enable proper type checking and autocomplete.
Apply this diff to add proper typing:
+interface ProjectMapperFormData {
+ projectMapper: Array<{ from: string; to: string }>;
+}
+
interface ProjectMapperProps {
- fields: FieldArrayWithId<any, 'projectMapper', 'id'>[];
- append: UseFieldArrayAppend<any, 'projectMapper'>;
+ fields: FieldArrayWithId<ProjectMapperFormData, 'projectMapper', 'id'>[];
+ append: UseFieldArrayAppend<ProjectMapperFormData, 'projectMapper'>;
remove: UseFieldArrayRemove;
projects: IServiceProject[];
- register: UseFormRegister<any>;
- watch: UseFormWatch<any>;
- setValue: UseFormSetValue<any>;
+ register: UseFormRegister<ProjectMapperFormData>;
+ watch: UseFormWatch<ProjectMapperFormData>;
+ setValue: UseFormSetValue<ProjectMapperFormData>;
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
interface ProjectMapperProps {
  fields: FieldArrayWithId<any, 'projectMapper', 'id'>[];
  append: UseFieldArrayAppend<any, 'projectMapper'>;
  remove: UseFieldArrayRemove;
  projects: IServiceProject[];
  register: UseFormRegister<any>;
  watch: UseFormWatch<any>;
  setValue: UseFormSetValue<any>;
}
interface ProjectMapperFormData {
  projectMapper: Array<{ from: string; to: string }>;
}
interface ProjectMapperProps {
  fields: FieldArrayWithId<ProjectMapperFormData, 'projectMapper', 'id'>[];
  append: UseFieldArrayAppend<ProjectMapperFormData, 'projectMapper'>;
  remove: UseFieldArrayRemove;
  projects: IServiceProject[];
  register: UseFormRegister<ProjectMapperFormData>;
  watch: UseFormWatch<ProjectMapperFormData>;
  setValue: UseFormSetValue<ProjectMapperFormData>;
}
🤖 Prompt for AI Agents
In apps/start/src/components/project-mapper.tsx around lines 23 to 31, the
ProjectMapperProps interface uses `any` for all react-hook-form generics;
replace these with a concrete form data type (e.g., define an exported
FormValues or ProjectForm type that describes fields used by the component,
including projectMapper: { id: string; <other fields> }[] and any nested
fields); update the interface to use that type in FieldArrayWithId<FormValues,
'projectMapper', 'id'>, UseFieldArrayAppend<FormValues, 'projectMapper'>,
UseFieldArrayRemove<FormValues>, UseFormRegister<FormValues>,
UseFormWatch<FormValues>, and UseFormSetValue<FormValues>; import or declare the
FormValues type near the top of the file (or from a shared types file) and
adjust any call sites to match the new type.
<Select
  value={watch(`projectMapper.${index}.to`)}
  onValueChange={(value) =>
    setValue(`projectMapper.${index}.to`, value)
  }
>
  <SelectTrigger className="mt-1 w-full" size="sm">
    <SelectValue placeholder="Select project" />
  </SelectTrigger>
  <SelectContent>
    {projects.slice(0, 10).map((project) => (
      <SelectItem key={project.id} value={project.id}>
        {project.name}
      </SelectItem>
    ))}
  </SelectContent>
</Select>
Remove the arbitrary 10-project limit.
Line 92 limits the dropdown to only the first 10 projects, preventing users with more projects from mapping to those beyond the limit. The Select component should display all available projects with scrolling, or implement search/pagination if performance is a concern.
Apply this diff to show all projects:
<SelectContent>
- {projects.slice(0, 10).map((project) => (
+ {projects.map((project) => (
<SelectItem key={project.id} value={project.id}>
{project.name}
      </SelectItem>
If performance is a concern with many projects, consider adding a search input or ComboBox component instead of a plain Select.
📝 Committable suggestion
<Select
  value={watch(`projectMapper.${index}.to`)}
  onValueChange={(value) =>
    setValue(`projectMapper.${index}.to`, value)
  }
>
  <SelectTrigger className="mt-1 w-full" size="sm">
    <SelectValue placeholder="Select project" />
  </SelectTrigger>
  <SelectContent>
    {projects.slice(0, 10).map((project) => (
      <SelectItem key={project.id} value={project.id}>
        {project.name}
      </SelectItem>
    ))}
  </SelectContent>
</Select>
<Select
  value={watch(`projectMapper.${index}.to`)}
  onValueChange={(value) =>
    setValue(`projectMapper.${index}.to`, value)
  }
>
  <SelectTrigger className="mt-1 w-full" size="sm">
    <SelectValue placeholder="Select project" />
  </SelectTrigger>
  <SelectContent>
    {projects.map((project) => (
      <SelectItem key={project.id} value={project.id}>
        {project.name}
      </SelectItem>
    ))}
  </SelectContent>
</Select>
🤖 Prompt for AI Agents
In apps/start/src/components/project-mapper.tsx around lines 82 to 98 the Select
is artificially limited by projects.slice(0, 10) which hides projects beyond the
first 10; remove the slice so the SelectContent maps over the entire projects
array (or replace with a virtualized list / searchable ComboBox if performance
is a concern) and ensure the SelectContent is allowed to scroll by keeping the
UI container styles intact.
<TableCell>
  <Tooltiper
    content={
      <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
        {JSON.stringify(imp.config, null, 2)}
      </pre>
    }
    tooltipClassName="max-w-xs"
  >
    <Badge>Config</Badge>
  </Tooltiper>
</TableCell>
Config tooltip exposes sensitive credentials.
The config JSON (Line 269) contains API keys, secrets, and service credentials. This tooltip exposes them to anyone viewing the imports page. Redact sensitive fields or restrict based on user role.
<Tooltiper
content={
<pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
- {JSON.stringify(imp.config, null, 2)}
+ {JSON.stringify(
+ {
+ ...imp.config,
+ apiKey: imp.config.apiKey ? '***' : undefined,
+ apiSecret: imp.config.apiSecret ? '***' : undefined,
+ serviceAccount: imp.config.serviceAccount ? '***' : undefined,
+ serviceSecret: imp.config.serviceSecret ? '***' : undefined,
+ },
+ null,
+ 2
+ )}
</pre>
}
tooltipClassName="max-w-xs"
>📝 Committable suggestion
| <TableCell> | |
| <Tooltiper | |
| content={ | |
| <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words"> | |
| {JSON.stringify(imp.config, null, 2)} | |
| </pre> | |
| } | |
| tooltipClassName="max-w-xs" | |
| > | |
| <Badge>Config</Badge> | |
| </Tooltiper> | |
| </TableCell> | |
| <TableCell> | |
| <Tooltiper | |
| content={ | |
| <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words"> | |
| {JSON.stringify( | |
| { | |
| ...imp.config, | |
| apiKey: imp.config.apiKey ? '***' : undefined, | |
| apiSecret: imp.config.apiSecret ? '***' : undefined, | |
| serviceAccount: imp.config.serviceAccount ? '***' : undefined, | |
| serviceSecret: imp.config.serviceSecret ? '***' : undefined, | |
| }, | |
| null, | |
| 2 | |
| )} | |
| </pre> | |
| } | |
| tooltipClassName="max-w-xs" | |
| > | |
| <Badge>Config</Badge> | |
| </Tooltiper> | |
| </TableCell> |
🤖 Prompt for AI Agents
In
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
around lines 265 to 276, the tooltip currently renders the raw imp.config JSON
(line 269) which can expose API keys/secrets; update the rendering to first
sanitize/redact sensitive fields and/or gate the full config behind an
authorization check. Implement a small sanitizeConfig utility that deep-copies
the config and replaces known sensitive keys (e.g., apiKey, api_secret, secret,
password, token, credentials, privateKey, accessKey, secretAccessKey) with a
redacted placeholder, then pass the sanitized object to JSON.stringify for
display; additionally, wrap the tooltip render with a permission check so only
authorized roles (admins/owners) can see the unredacted config, otherwise show
the sanitized version or a notice that details are hidden.
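A minimal sketch of such a sanitizeConfig helper — the key list and the '***' placeholder are assumptions for illustration, not an existing project utility:

```typescript
// Sketch of the redaction helper suggested above; the key list and the
// placeholder value are assumptions, not an existing project utility.
const SENSITIVE_KEYS = new Set([
  'apikey', 'apisecret', 'secret', 'password', 'token',
  'credentials', 'privatekey', 'accesskey', 'secretaccesskey',
  'serviceaccount', 'servicesecret',
]);

function sanitizeConfig(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sanitizeConfig);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) =>
        SENSITIVE_KEYS.has(k.toLowerCase())
          ? [k, '***'] // redact the value, but keep the key visible
          : [k, sanitizeConfig(v)],
      ),
    );
  }
  return value;
}
```

Deep-copying via recursion means nested provider configs are redacted too, and matching keys case-insensitively covers both `apiKey` and `api_key`-style names once underscores are stripped by the caller.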
importWorker,
// eventsGroupWorker,
Critical: Events processing appears to be disabled.
The eventsGroupWorker is commented out in the workers array (line 72), which means event processing is effectively disabled. However, the worker is still instantiated (lines 34-48) and will continue to run, but won't be included in the lifecycle management (ready/error/closed handlers, graceful shutdown).
This appears to be an unintended change that would break event processing in production. Please verify:
- Is event processing intentionally disabled?
- Should eventsGroupWorker be added back to the workers array?
- Or should the worker instantiation be removed entirely if no longer needed?
const workers = [
sessionsWorker,
cronWorker,
notificationWorker,
miscWorker,
importWorker,
- // eventsGroupWorker,
+ eventsGroupWorker,
];
📝 Committable suggestion
importWorker,
// eventsGroupWorker,
importWorker,
eventsGroupWorker,
🤖 Prompt for AI Agents
In apps/worker/src/boot-workers.ts around lines 71 to 72, the eventsGroupWorker
is commented out from the workers array which leaves its instance running but
unmanaged by lifecycle handlers; either uncomment/add eventsGroupWorker back
into the workers array so it participates in ready/error/close and
graceful-shutdown flows, or if event processing is intentionally disabled remove
the eventsGroupWorker instantiation and related setup (and any imports) so there
is no orphaned running worker; pick one of these two fixes and apply it
consistently across imports, worker list, and shutdown/handler registration.
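To illustrate why an instantiated-but-unregistered worker escapes shutdown, here is a minimal sketch; FakeWorker is a stand-in for illustration and not BullMQ's actual API:

```typescript
// Minimal sketch of why an instantiated-but-unregistered worker escapes
// lifecycle management. `FakeWorker` stands in for a BullMQ Worker; this
// is not BullMQ's actual API.
class FakeWorker {
  closed = false;
  constructor(public name: string) {}
  async close(): Promise<void> {
    this.closed = true;
  }
}

async function shutdown(workers: FakeWorker[]): Promise<void> {
  // Only workers in the managed array get closed on shutdown.
  await Promise.all(workers.map((w) => w.close()));
}
```

A worker constructed but left out of the array (like the commented-out eventsGroupWorker above) keeps its connection open after `shutdown` completes, which is exactly the orphaned-worker hazard the review describes.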
-- AlterTable
ALTER TABLE "public"."imports" ALTER COLUMN "currentBatch" DROP NOT NULL,
ALTER COLUMN "currentBatch" DROP DEFAULT,
ALTER COLUMN "currentBatch" SET DATA TYPE TEXT;
🧩 Analysis chain
Verify the relaxed constraints on currentBatch align with batch tracking requirements.
This migration makes three significant changes to currentBatch:
- Removes NOT NULL constraint (allows imports without batches)
- Removes DEFAULT value (no automatic batch assignment)
- Changes type to TEXT (allows flexible batch identifiers)
Please confirm that:
- Imports are expected to exist without a currentBatch (nullable is correct)
- The workflow handles missing batch values appropriately
- TEXT is the appropriate type for your batch identifier scheme (e.g., UUIDs, timestamps, custom strings)
🏁 Script executed:
#!/bin/bash
# Description: Check how currentBatch is used in import processing logic
# Search for currentBatch usage and null checks
rg -nP --type=ts -C3 'currentBatch' -g '!*.gen.ts' -g '!node_modules'Length of output: 4902
🏁 Script executed:
#!/bin/bash
# Find the Prisma schema and check currentBatch field definition
fd -name 'schema.prisma' -o -name '*.prisma'Length of output: 230
🏁 Script executed:
#!/bin/bash
# List migrations in order to find the one before this one
ls -1 packages/db/prisma/migrations/ | sort | tail -10Length of output: 295
🏁 Script executed:
#!/bin/bash
# Check the imports table definition in the latest schema or migrations
fd -name '*.sql' -path '*/migrations/*' -exec grep -l 'currentBatch' {} \;Length of output: 230
🏁 Script executed:
#!/bin/bash
# View the previous migration to understand what currentBatch type was
cat packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sqlLength of output: 86
🏁 Script executed:
#!/bin/bash
# View the migration that added the import table initially
cat packages/db/prisma/migrations/20251018205153_add_import_table/migration.sqlLength of output: 840
🏁 Script executed:
#!/bin/bash
# Search for schema.prisma in common locations
find packages/db -name 'schema.prisma' 2>/dev/nullLength of output: 32
🏁 Script executed:
#!/bin/bash
# Check what type currentBatch has in recent migrations
rg 'currentBatch' packages/db/prisma/migrations/ -A1 -B1Length of output: 859
🏁 Script executed:
#!/bin/bash
# View the Prisma schema definition
cat packages/db/prisma/schema.prisma | grep -A 50 'model imports'Length of output: 0
🏁 Script executed:
#!/bin/bash
# Check if there's a type generation process or codegen in Prisma
rg -A5 'generator|prisma' packages/db/prisma/schema.prisma | head -30Length of output: 432
🏁 Script executed:
#!/bin/bash
# Check how batch parameter is used - what values are passed
rg -nP 'options\.batch|batch:' packages/db/src/ apps/worker/src/ -B2 -A2Length of output: 4984
🏁 Script executed:
#!/bin/bash
# View full imports model from schema
rg -A 30 'model imports' packages/db/prisma/schema.prismaLength of output: 0
🏁 Script executed:
#!/bin/bash
# Check generated Prisma types for imports
cat packages/db/src/generated/prisma/index.d.ts 2>/dev/null | grep -A 30 'interface imports' | head -40Length of output: 0
🏁 Script executed:
#!/bin/bash
# View the imports.service.ts to see the UpdateImportStatusOptions type
rg -B5 -A10 'interface.*UpdateImportStatusOptions|type UpdateImportStatusOptions' packages/db/src/services/import.service.tsLength of output: 357
🏁 Script executed:
#!/bin/bash
# Find and view the ImportProgress interface definition
rg -B3 -A8 'interface ImportProgress' packages/importer/src/types.tsLength of output: 232
🏁 Script executed:
#!/bin/bash
# Check if there are any type errors or if the types are reconciled elsewhere
rg -C5 'currentBatch.*number|number.*currentBatch' packages/Length of output: 530
🏁 Script executed:
#!/bin/bash
# Check where ImportProgress is used and how currentBatch is accessed
rg -C3 'ImportProgress|currentBatch' packages/importer/src/ apps/Length of output: 3534
Fix TypeScript type definition for currentBatch in the ImportProgress interface.
The migration correctly changes currentBatch from INTEGER NOT NULL DEFAULT 0 to TEXT, nullable with no default. However, the TypeScript type definition needs to be updated:
- File: packages/importer/src/types.ts, line 23
- Current: currentBatch: number;
- Should be: currentBatch: string | null;
Reason: The migration stores TEXT values (timestamps, batch identifiers), and the code already handles null batches. The type should reflect this reality.
🤖 Prompt for AI Agents
In packages/importer/src/types.ts around line 23, the ImportProgress interface
currently declares currentBatch: number; but the DB migration changes
currentBatch to nullable TEXT; update the TypeScript definition to currentBatch:
string | null; so the type matches stored values (text/batch IDs or null) and
adjust any places expecting a number accordingly.
model Import {
  id              String       @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid
  projectId       String
  project         Project      @relation(fields: [projectId], references: [id], onDelete: Cascade)
  jobId           String?      // BullMQ job ID
  status          ImportStatus
  statusMessage   String?      // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs"
  errorMessage    String?
  /// [IPrismaImportConfig]
  config          Json
  totalEvents     Int          @default(0)
  processedEvents Int          @default(0)
  currentStep     String?
  currentBatch    String?      // String date 2020-01-01
  createdAt       DateTime     @default(now())
  completedAt     DateTime?
  updatedAt       DateTime     @default(now()) @updatedAt
  @@map("imports")
}
Inconsistency: currentStep is nullable in the schema but NOT NULL in the migration.
Line 489 defines currentStep as String? (optional), but migration 20251028150123_fix_imports_table/migration.sql sets it to NOT NULL. Align the schema and migration:
- If imports always have a step, make it required: currentStep String
- If it's optional during creation, remove the NOT NULL constraint from the migration
- currentStep String?
+ currentStep String
Or revert the migration's SET NOT NULL constraint.
📝 Committable suggestion
| model Import { | |
| id String @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid | |
| projectId String | |
| project Project @relation(fields: [projectId], references: [id], onDelete: Cascade) | |
| jobId String? // BullMQ job ID | |
| status ImportStatus | |
| statusMessage String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs" | |
| errorMessage String? | |
| /// [IPrismaImportConfig] | |
| config Json | |
| totalEvents Int @default(0) | |
| processedEvents Int @default(0) | |
| currentStep String? | |
| currentBatch String? // String date 2020-01-01 | |
| createdAt DateTime @default(now()) | |
| completedAt DateTime? | |
| updatedAt DateTime @default(now()) @updatedAt | |
| @@map("imports") | |
| } | |
| model Import { | |
| id String @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid | |
| projectId String | |
| project Project @relation(fields: [projectId], references: [id], onDelete: Cascade) | |
| jobId String? // BullMQ job ID | |
| status ImportStatus | |
| statusMessage String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs" | |
| errorMessage String? | |
| /// [IPrismaImportConfig] | |
| config Json | |
| totalEvents Int @default(0) | |
| processedEvents Int @default(0) | |
| currentStep String | |
| currentBatch String? // String date 2020-01-01 | |
| createdAt DateTime @default(now()) | |
| completedAt DateTime? | |
| updatedAt DateTime @default(now()) @updatedAt | |
| @@map("imports") | |
| } |
🤖 Prompt for AI Agents
In packages/db/prisma/schema.prisma around lines 477 to 496, the model Import
declares currentStep as nullable (String?) while the migration
20251028150123_fix_imports_table/migration.sql sets current_step to NOT NULL;
pick one and make them consistent: if imports must always have a step, change
schema currentStep to non-nullable (currentStep String) and run prisma migrate
to regenerate the SQL; otherwise, edit the migration SQL to remove the SET NOT
NULL on current_step (or create a new migration that drops the NOT NULL
constraint) so the column remains nullable, then reapply migrations.
e.import_id = ${sqlstring.escape(importId)}
AND toDate(e.created_at) >= ${sqlstring.escape(from)}
🛠️ Refactor suggestion | 🟠 Major
Inconsistent parameter handling.
Lines 505-506 use sqlstring.escape() for inline parameter substitution, while the rest of the codebase uses ClickHouse's query_params mechanism (e.g., lines 89-90, 416-417, 534-535). This inconsistency reduces maintainability and may bypass ClickHouse's optimizations.
Refactor to use query_params:
+ const sessionsInsertQuery = `
...
WHERE
- e.import_id = ${sqlstring.escape(importId)}
- AND toDate(e.created_at) >= ${sqlstring.escape(from)}
+ e.import_id = {importId:String}
+ AND toDate(e.created_at) >= {from:String}
AND e.session_id != ''
GROUP BY e.session_id
`;
await ch.command({
query: sessionsInsertQuery,
+ query_params: { importId, from },
  clickhouse_settings: {
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In packages/db/src/services/import.service.ts around lines 505-506, the SQL
currently inlines e.import_id = ${sqlstring.escape(importId)} AND
toDate(e.created_at) >= ${sqlstring.escape(from)}; replace these inline
sqlstring.escape substitutions with ClickHouse query parameter placeholders and
pass the corresponding parameters in the same query_params structure used
elsewhere in the file (e.g., add parameters for importId and from in the query
call), using the same parameter naming and types as other queries (e.g.,
{importId: String} and {from: DateTime}/Date) so the SQL becomes e.import_id =
{importId:String} AND toDate(e.created_at) >= {from:Date} (or matching the
project's param type) and include those values in the query_params array/object
for the query execution.
this.logger?.info('Fetching events from Mixpanel', {
  url: `${url}?${params}`,
  from,
  to,
  projectId,
  serviceAccount,
});
Sensitive credentials logged.
Line 90 logs the serviceAccount credential. Even though the secret is not logged, exposing account identifiers in logs can aid attackers in credential stuffing or enumeration attacks.
Consider removing sensitive fields from logs:
this.logger?.info('Fetching events from Mixpanel', {
url: `${url}?${params}`,
from,
to,
projectId,
- serviceAccount,
});
📝 Committable suggestion
this.logger?.info('Fetching events from Mixpanel', {
  url: `${url}?${params}`,
  from,
  to,
  projectId,
  serviceAccount,
});
this.logger?.info('Fetching events from Mixpanel', {
  url: `${url}?${params}`,
  from,
  to,
  projectId,
});
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 85 to 91, the logger
call includes the serviceAccount identifier which is considered sensitive;
remove serviceAccount from the logged metadata and only log non-sensitive
context (url, from, to, projectId) or replace serviceAccount with a
redacted/boolean flag (e.g., serviceAccountPresent: true) if presence needs to
be recorded; update the logger invocation accordingly and run a quick search to
ensure no other logs emit service account identifiers.
// Add new job to queue
const job = await importQueue.add('import', {
  type: 'import',
  payload: {
    importId: importRecord.id,
  },
});

// Update import record
return db.import.update({
  where: { id: importRecord.id },
  data: {
    jobId: job.id,
    status: 'pending',
    errorMessage: null,
  },
});
Wrap retry enqueue in try/catch.
Same issue as create: if importQueue.add fails, the record is updated to 'pending' but no job exists. Wrap in try/catch and revert the status on failure.
// Add new job to queue
- const job = await importQueue.add('import', {
- type: 'import',
- payload: {
- importId: importRecord.id,
- },
- });
-
- // Update import record
- return db.import.update({
- where: { id: importRecord.id },
- data: {
- jobId: job.id,
- status: 'pending',
- errorMessage: null,
- },
- });
+ try {
+ const job = await importQueue.add(
+ 'import',
+ { type: 'import', payload: { importId: importRecord.id } },
+ { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+ );
+ return db.import.update({
+ where: { id: importRecord.id },
+ data: {
+ jobId: job.id,
+ status: 'pending',
+ errorMessage: null,
+ },
+ });
+ } catch (err) {
+ throw new Error('Failed to enqueue retry job: ' + (err instanceof Error ? err.message : String(err)));
+ }
  }),

📝 Committable suggestion
  // Add new job to queue
- const job = await importQueue.add('import', {
-   type: 'import',
-   payload: {
-     importId: importRecord.id,
-   },
- });
-
- // Update import record
- return db.import.update({
-   where: { id: importRecord.id },
-   data: {
-     jobId: job.id,
-     status: 'pending',
-     errorMessage: null,
-   },
- });
+ try {
+   const job = await importQueue.add(
+     'import',
+     { type: 'import', payload: { importId: importRecord.id } },
+     { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+   );
+   return db.import.update({
+     where: { id: importRecord.id },
+     data: { jobId: job.id, status: 'pending', errorMessage: null },
+   });
+ } catch (err) {
+   throw new Error('Failed to enqueue retry job: ' + (err instanceof Error ? err.message : String(err)));
+ }
🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 160 to 176, the code calls
importQueue.add and then unconditionally updates the DB record to status
'pending', which can leave the DB pointing to a non‑existent job if add fails;
wrap the call to importQueue.add in a try/catch, only update the import record
to 'pending' and set jobId after a successful add, and in the catch update the
import record to a terminal state (e.g., 'failed') with errorMessage set to the
caught error (or revert to the previous status if tracked); ensure the DB
updates are awaited and consider doing the job add and record update in a
transaction or in the correct sequence so no pending state is recorded when
enqueueing fails.
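The ordering the prompt describes (enqueue first, persist 'pending' only on success, persist a terminal 'failed' state otherwise) can be sketched generically. The helper below and its parameter names are assumptions for illustration, not the PR's actual API:

```typescript
// Minimal sketch: enqueue first, then persist; on enqueue failure, persist a
// terminal 'failed' state instead of leaving a dangling 'pending' record.
type Status = 'pending' | 'failed';

async function enqueueThenPersist(
  enqueue: () => Promise<{ id: string }>,
  persist: (data: { jobId: string | null; status: Status; errorMessage: string | null }) => Promise<void>,
): Promise<boolean> {
  try {
    const job = await enqueue();
    // Only record 'pending' once a job actually exists.
    await persist({ jobId: job.id, status: 'pending', errorMessage: null });
    return true;
  } catch (err) {
    // Enqueue failed: record a terminal state so no orphaned 'pending' row remains.
    await persist({
      jobId: null,
      status: 'failed',
      errorMessage: err instanceof Error ? err.message : String(err),
    });
    return false;
  }
}
```

Wrapping both steps in a database transaction, as the prompt suggests, would tighten this further; the sketch only fixes the ordering.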
Force-pushed from b60dd0e to 1a11373 (Compare)
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
packages/trpc/src/routers/auth.ts (2)
227-229: Remove sensitive console logs from auth flow.

Logging session objects and cookie setters risks leaking tokens/PII.
Apply this diff:
- console.log('session', session);
  setSessionTokenCookie(ctx.setCookie, token, session.expiresAt);
- console.log('ctx.setCookie', ctx.setCookie);
201-203: Avoid credential enumeration; return a generic auth error.

“User does not exists” reveals account presence. Use a single generic message.
Apply this diff:
-import { TRPCAccessError, TRPCNotFoundError } from '../errors';
+import { TRPCAccessError, TRPCNotFoundError } from '../errors';
@@
-  if (!user) {
-    throw TRPCNotFoundError('User does not exists');
-  }
+  if (!user) {
+    throw TRPCAccessError('Incorrect email or password');
+  }

Note: If you prefer stricter semantics, you can also replace the other branch's incorrect-password error with the same message to keep responses uniform.

Also applies to: 31-31
apps/worker/src/jobs/events.incoming-event.ts (1)
39-44: Do not log full event payload; redact to avoid PII leakage.

Logging the entire payload and headers can include emails, IDs, and custom properties. Log only identifiers and sizes.

- logger.info('Creating event', { event: payload, jobData });
+ logger.info('Creating event', {
+   name: payload.name,
+   projectId: payload.projectId,
+   profileId: payload.profileId,
+   sessionId: (payload as any).sessionId,
+   deviceId: (payload as any).deviceId,
+   propertiesSize: Object.keys(payload.properties ?? {}).length,
+ });
♻️ Duplicate comments (31)
apps/start/src/components/project-mapper.tsx (2)
23-31: Define a proper form type instead of using any.

The interface uses any for all generic type parameters, which defeats TypeScript's type safety. Define an explicit form schema type to enable proper type checking and autocomplete.
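As a sketch of the "explicit form schema type" suggested above: all the names below (`ProjectMappingForm`, `unmappedRows`) are hypothetical, not the component's actual API.

```typescript
// Hypothetical form schema for a provider-to-OpenPanel project mapping step.
interface ProjectMappingRow {
  sourceProjectId: string;        // project id in the source provider's export
  targetProjectId: string | null; // OpenPanel project to map into, null = unmapped
}

interface ProjectMappingForm {
  mappings: ProjectMappingRow[];
}

// With a typed generic (e.g. useForm<ProjectMappingForm>() in react-hook-form),
// field paths like `mappings.0.targetProjectId` are checked at compile time
// instead of being `any`.
function unmappedRows(form: ProjectMappingForm): ProjectMappingRow[] {
  return form.mappings.filter((row) => row.targetProjectId === null);
}
```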
92-92: Remove the arbitrary 10-project limit.

The slice(0, 10) limits the dropdown to only the first 10 projects, preventing users with more projects from mapping to those beyond the limit. The Select component should display all available projects with scrolling, or implement search/pagination if performance is a concern.

packages/common/scripts/get-referrers.ts (1)
74-74: Fix incorrect output path.

The script writes to ../../worker/src/referrers/index.ts, but from packages/common/scripts/ this resolves to an unexpected location outside the package. The corresponding format command in package.json expects ./server/referrers/index.ts relative to the package root.

Apply this diff to fix the output path:

-  path.resolve(__dirname, '../../worker/src/referrers/index.ts'),
+  path.resolve(__dirname, '../server/referrers/index.ts'),

packages/db/code-migrations/5-add-imports-table.ts (2)
1-2: Fix __filename usage in ESM module.

The package is configured as ESM ("type": "module"), and using __filename directly will fail at runtime. Import fileURLToPath from node:url and derive __filename properly.

Apply this diff:

 import fs from 'node:fs';
 import path from 'node:path';
+import { fileURLToPath } from 'node:url';
 import { TABLE_NAMES } from '../src/clickhouse/client';
 import {
   chMigrationClient,
   createTable,
   runClickhouseMigrationCommands,
 } from '../src/clickhouse/migration';
 import { getIsCluster } from './helpers';
+
+const __filename = fileURLToPath(import.meta.url);
+
 export async function up() {

Also applies to: 72-72
65-69: TTL on Distributed table will fail in clustered mode.

ALTER TABLE events_imports MODIFY TTL will fail or be a no-op on the Distributed table in clustered deployments. TTL must be defined on the local Replicated MergeTree table, preferably in the CREATE TABLE statement.

Consider extending createTable to support a TTL parameter and include it in the CREATE TABLE statement, or target the replicated table explicitly with ON CLUSTER '{cluster}' if the ALTER approach is retained.

packages/validation/src/index.ts (1)
createTableto support TTL parameter and include it in the CREATE TABLE statement, or target the replicated table explicitly withON CLUSTER '{cluster}'if the ALTER approach is retained.packages/validation/src/index.ts (1)
478-486: Discriminate by provider to prevent mismatched payloads.

The current schema allows provider and config to disagree (e.g., provider='umami' with a Mixpanel config). This was flagged in a previous review and remains unresolved.

Apply a discriminated union on provider so the config shape is tied to the selected provider:

-export const zCreateImport = z.object({
-  projectId: z.string().min(1),
-  provider: z.enum(['umami', 'plausible', 'mixpanel']),
-  config: z.union([
-    zUmamiImportConfig,
-    zPlausibleImportConfig,
-    zMixpanelImportConfig,
-  ]),
-});
+export const zCreateImport = z.discriminatedUnion('provider', [
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('umami'),
+    config: zUmamiImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('plausible'),
+    config: zPlausibleImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('mixpanel'),
+    config: zMixpanelImportConfig,
+  }),
+]);
53-66: EventProperties allows nested objects, but ClickHouse requires primitives only.

The EventProperties interface (line 61) allows Record<string, unknown> and defines __query as Record<string, unknown> (line 62). This conflicts with ClickHouse's IClickhouseEvent.properties, which must be Record<string, string|number|boolean|undefined|null>. Storing nested objects will break ingestion unless stringified. This was flagged in a previous review and remains unresolved.

Either constrain EventProperties to primitives only:

 export interface EventProperties {
   [key: string]:
     | string
     | number
     | boolean
     | null
-    | undefined
-    | Record<string, unknown>;
-  __query?: Record<string, unknown>;
+    | undefined;
+  __query?: string; // JSON.stringify before storing
   __title?: string;
   __screen?: string;
   __language?: string;
 }

Or explicitly handle JSON stringification for nested fields before they reach ClickHouse.
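The stringify-or-flatten option mentioned above can be sketched with a small helper; `flattenProperties` is illustrative, not part of the package:

```typescript
type Primitive = string | number | boolean | null | undefined;

// Illustrative helper: flattens nested objects into dot-separated primitive
// keys so the result satisfies ClickHouse's Record<string, Primitive> shape.
// Arrays are treated as objects and flattened by index.
function flattenProperties(
  input: Record<string, unknown>,
  prefix = '',
): Record<string, Primitive> {
  const out: Record<string, Primitive> = {};
  for (const [key, value] of Object.entries(input)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object') {
      Object.assign(out, flattenProperties(value as Record<string, unknown>, path));
    } else {
      out[path] = value as Primitive;
    }
  }
  return out;
}
```

Running an event's properties through this before insert guarantees no nested object reaches the ClickHouse column.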
packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
10-13: Data loss risk: backfill config before dropping columns.

Migrate provider/source fields into config first, then DROP. Example:

+-- Backfill config from legacy columns before dropping them
+UPDATE "public"."imports" i
+SET "config" = COALESCE(
+  jsonb_build_object(
+    'provider', i."provider",
+    'sourceType', i."sourceType",
+    'sourceLocation', i."sourceLocation"
+  ) || COALESCE(i."config", '{}'::jsonb),
+  '{}'::jsonb
+)
+WHERE i."provider" IS NOT NULL
+   OR i."sourceType" IS NOT NULL
+   OR i."sourceLocation" IS NOT NULL;
+
 -- AlterTable
 ALTER TABLE "public"."imports"
   DROP COLUMN "provider",
   DROP COLUMN "sourceLocation",
   DROP COLUMN "sourceType",
   ALTER COLUMN "config" DROP DEFAULT;

packages/importer/vitest.config.ts (1)
3-3: ESM: __dirname is undefined. Compute it from import.meta.url.

Apply:

+import { dirname } from 'node:path';
+import { fileURLToPath } from 'node:url';
 import { getSharedVitestConfig } from '../../vitest.shared';

-export default getSharedVitestConfig({ __dirname });
+export default getSharedVitestConfig({
+  __dirname: dirname(fileURLToPath(import.meta.url)),
+});

apps/worker/src/boot-workers.ts (2)
66-73: eventsGroupWorker is unmanaged; either add it back to the lifecycle or remove it entirely.

Currently it runs but is excluded from handlers/close, risking leaks and unobserved failures. Choose one path and apply consistently.

Option A — manage it with the others (recommended):

 const workers = [
   sessionsWorker,
   cronWorker,
   notificationWorker,
   miscWorker,
   importWorker,
-  // eventsGroupWorker,
+  // GroupMQ worker participates in lifecycle
+  // (we still keep eventsGroupWorker.run(); above)
+  eventsGroupWorker as unknown as Worker,
 ];

Option B — if intentionally disabled, remove it completely:

-  const eventsGroupWorker = new GroupWorker<...>({ ... });
-  eventsGroupWorker.run();
+  // eventsGroupWorker intentionally disabled; remove instantiation to avoid orphaned worker.
155-166: Always use graceful shutdown; remove the dev fast-exit path.

Avoid skipping worker close/queue drain in dev.

-  if (process.env.NODE_ENV === 'production') {
-    exitHandler(evt, code);
-  } else {
-    logger.info('Shutting down for development', {
-      event: evt,
-      code,
-    });
-    process.exit(0);
-  }
+  exitHandler(evt, code);
489-489: Schema/migration mismatch on currentStep.Schema has
currentStep String?but the migration sets it NOT NULL. Keep it nullable or update both consistently. This also aligns with import.ts wherecurrentStep === nulldenotes a new import.packages/trpc/src/routers/import.ts (4)
24-31: Sensitive config exposure in list — redact for read-level access.Return imports without leaking config to read-only members. Gate by role and null-out config when needed.
Apply:
- return db.import.findMany({ - where: { - projectId: input.projectId, - }, - orderBy: { - createdAt: 'desc', - }, - }); + const rows = await db.import.findMany({ + where: { projectId: input.projectId }, + orderBy: { createdAt: 'desc' }, + }); + const canSeeSecrets = + typeof access !== 'boolean' && access.level !== 'read'; + return canSeeSecrets ? rows : rows.map((r) => ({ ...r, config: null }));
55-56: Sensitive config exposure in get — redact for read-level access.Avoid returning secrets to read-only members.
Apply:
- return importRecord; + const canSeeSecrets = + typeof access !== 'boolean' && access.level !== 'read'; + return canSeeSecrets ? importRecord : { ...importRecord, config: null };
81-99: Enqueue not guarded — pending record can orphan without a job.Wrap queue add in try/catch, use stable jobId, attempts, and mark DB failed on error.
Apply:
- // Add job to queue - const job = await importQueue.add('import', { - type: 'import', - payload: { - importId: importRecord.id, - }, - }); - - // Update import record with job ID - await db.import.update({ - where: { id: importRecord.id }, - data: { jobId: job.id }, - }); + // Add job to queue + try { + const job = await importQueue.add( + 'import', + { type: 'import', payload: { importId: importRecord.id } }, + { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } }, + ); + await db.import.update({ + where: { id: importRecord.id }, + data: { jobId: job.id }, + }); + } catch (err) { + await db.import.update({ + where: { id: importRecord.id }, + data: { + status: 'failed', + errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed', + }, + }); + throw err; + }
160-176: Retry enqueue not guarded — can set 'pending' without job.Same reliability issue: wrap in try/catch and only update status on success.
Apply:
- // Add new job to queue - const job = await importQueue.add('import', { - type: 'import', - payload: { - importId: importRecord.id, - }, - }); - - // Update import record - return db.import.update({ - where: { id: importRecord.id }, - data: { - jobId: job.id, - status: 'pending', - errorMessage: null, - }, - }); + // Add new job to queue + try { + const job = await importQueue.add( + 'import', + { type: 'import', payload: { importId: importRecord.id } }, + { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } }, + ); + return db.import.update({ + where: { id: importRecord.id }, + data: { jobId: job.id, status: 'pending', errorMessage: null }, + }); + } catch (err) { + throw new Error( + 'Failed to enqueue retry job: ' + + (err instanceof Error ? err.message : String(err)), + ); + }apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1)
265-276: Config tooltip exposes credentials — sanitize/redact before display.Redact known secret keys and handle null config.
Apply:
+const redactConfig = (cfg: any) => { + if (!cfg) return null; + const secretKeys = new Set([ + 'apiKey','apiSecret','serviceAccount','serviceSecret','token','password', + 'secret','privateKey','accessKey','secretAccessKey', + ]); + const walk = (obj: any): any => + Object.fromEntries( + Object.entries(obj).map(([k, v]) => [ + k, + v && typeof v === 'object' ? walk(v) : secretKeys.has(k) ? '***' : v, + ]), + ); + return walk(cfg); +}; ... <Tooltiper content={ <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words"> - {JSON.stringify(imp.config, null, 2)} + {JSON.stringify(redactConfig(imp.config), null, 2)} </pre> } tooltipClassName="max-w-xs" > <Badge>Config</Badge> </Tooltiper>apps/worker/src/jobs/import.ts (2)
129-137: Avoid logging full raw events (PII leak).Log minimal reason/context only.
Apply:
- ) { - jobLogger.warn('Skipping invalid event', { rawEvent }); + ) { + jobLogger.warn('Skipping invalid event', { + provider: providerInstance.provider, + reason: 'validation_failed', + }); continue; }
251-258: Sync final Postgres counters with ClickHouse progress for UI.Update totalEvents/processedEvents on completion.
Apply:
const finalProgress = await getImportProgress(importId); @@ jobLogger.info('Import job completed successfully', { totalEvents: finalProgress.totalEvents, insertedEvents: finalProgress.insertedEvents, status: finalProgress.status, }); + + await db.import.update({ + where: { id: importId }, + data: { + totalEvents: finalProgress.totalEvents, + processedEvents: finalProgress.insertedEvents, + }, + });packages/importer/src/providers/umami.ts (6)
1-15: Add SSRF guard before fetching remote files.Validate scheme and resolve host to block private/loopback ranges.
Apply:
import { randomUUID } from 'node:crypto'; import { Readable } from 'node:stream'; import { pipeline } from 'node:stream/promises'; import { createBrotliDecompress, createGunzip } from 'node:zlib'; +import { lookup } from 'node:dns/promises'; +import { isIP } from 'node:net'; import { isSameDomain, parsePath } from '@openpanel/common'; import { generateDeviceId } from '@openpanel/common/server'; import { getReferrerWithQuery, parseReferrer } from '@openpanel/common/server'; import type { IClickhouseEvent } from '@openpanel/db'; import type { ILogger } from '@openpanel/logger'; import type { IUmamiImportConfig } from '@openpanel/validation'; import { parse } from 'csv-parse'; -import { assocPath } from 'ramda'; import { z } from 'zod'; import { BaseImportProvider } from '../base-provider'; + +async function assertRemoteUrlAllowed(target: string) { + const u = new URL(target); + if (!/^https?:$/i.test(u.protocol)) { + throw new Error('Only http/https URLs are allowed'); + } + const addrs = await lookup(u.hostname, { all: true, verbatim: true }); + const isPrivate = (addr: string) => + addr === '127.0.0.1' || + addr === '::1' || + addr.startsWith('10.') || + /^172\.(1[6-9]|2\d|3[0-1])\./.test(addr) || + addr.startsWith('192.168.') || + addr.startsWith('fc') || + addr.startsWith('fd') || + addr.startsWith('fe80:'); + for (const { address } of addrs) { + if (isIP(address) && isPrivate(address)) { + throw new Error('Blocked private address'); + } + } +}
79-87: Call SSRF guard before fetch.Block unsafe targets early.
Apply:
private async *parseRemoteFile( url: string, @@ ): AsyncGenerator<UmamiRawEvent, void, unknown> { const { signal, maxBytes, maxRows } = opts; const controller = new AbortController(); @@ - const res = await fetch(url, { signal: controller.signal }); + await assertRemoteUrlAllowed(url); + const res = await fetch(url, { signal: controller.signal });Also applies to: 97-103
213-221: Device ID fallback is invalid; use stable available identifiers.Stop passing visit_id as ip/ua.
Apply:
- const deviceId = - rawEvent.visit_id || - generateDeviceId({ - ip: rawEvent.visit_id!, - ua: rawEvent.visit_id!, - origin: projectId, - salt: 'xxx', - }); + const deviceId = + rawEvent.visit_id || + rawEvent.distinct_id || + rawEvent.session_id || + rawEvent.event_id || + randomUUID();
224-229: Fix URL construction (require hostname, prefix query with '?').Avoid invalid URLs.
Apply:
- const url = rawEvent.url_path - ? `https://${[rawEvent.hostname, rawEvent.url_path, rawEvent.url_query] - .filter(Boolean) - .join('')}` - : ''; + let url = ''; + if (rawEvent.hostname && rawEvent.url_path) { + const q = rawEvent.url_query + ? rawEvent.url_query.startsWith('?') + ? rawEvent.url_query + : `?${rawEvent.url_query}` + : ''; + url = `https://${rawEvent.hostname}${rawEvent.url_path}${q}`; + }
231-234: Include referrer_query in referrer URL.Preserve query parameters.
Apply:
- const referrerUrl = rawEvent.referrer_domain - ? `https://${rawEvent.referrer_domain}${rawEvent.referrer_path || ''}` - : ''; + const refQuery = rawEvent.referrer_query + ? rawEvent.referrer_query.startsWith('?') + ? rawEvent.referrer_query + : `?${rawEvent.referrer_query}` + : ''; + const referrerUrl = rawEvent.referrer_domain + ? `https://${rawEvent.referrer_domain}${rawEvent.referrer_path || ''}${refQuery}` + : '';
257-297: Flatten properties to primitives; ClickHouse rejects nested objects.Replace nested __query and assocPath usage with flat primitive keys.
Apply:
- let properties: Record<string, any> = {}; - - if (query) { - properties.__query = query; - } + const properties: Record< + string, + string | number | boolean | null | undefined + > = {}; + if (query) { + for (const [k, v] of Object.entries(query)) { + properties[`query.${k}`] = v as string; + } + } @@ - if (rawEvent.utm_source) - properties = assocPath( - ['__query', 'utm_source'], - rawEvent.utm_source, - properties, - ); - if (rawEvent.utm_medium) - properties = assocPath( - ['__query', 'utm_medium'], - rawEvent.utm_medium, - properties, - ); - if (rawEvent.utm_campaign) - properties = assocPath( - ['__query', 'utm_campaign'], - rawEvent.utm_campaign, - properties, - ); - if (rawEvent.utm_content) - properties = assocPath( - ['__query', 'utm_content'], - rawEvent.utm_content, - properties, - ); - if (rawEvent.utm_term) - properties = assocPath( - ['__query', 'utm_term'], - rawEvent.utm_term, - properties, - ); + if (rawEvent.utm_source) properties.utm_source = rawEvent.utm_source; + if (rawEvent.utm_medium) properties.utm_medium = rawEvent.utm_medium; + if (rawEvent.utm_campaign) properties.utm_campaign = rawEvent.utm_campaign; + if (rawEvent.utm_content) properties.utm_content = rawEvent.utm_content; + if (rawEvent.utm_term) properties.utm_term = rawEvent.utm_term;apps/start/src/modals/add-import.tsx (1)
245-249: Avoid leaking upstream error details to end users.Use a generic error toast; log details server-side.
Apply:
- onError: (error) => { - toast.error('Import failed', { - description: error.message, - }); - }, + onError: () => { + toast.error('Import failed', { + description: 'We could not start the import. Please try again later.', + }); + },packages/importer/src/providers/mixpanel.ts (1)
85-91: Sensitive credentials logged.Line 90 logs the
serviceAccountcredential, which can aid attackers in credential stuffing or enumeration attacks.packages/db/src/services/import.service.ts (4)
256-256: Risk of timestamp collisions with existing events.Lines 256 and 293 adjust timestamps by exactly ±1000ms (1 second), which could create timestamp conflicts with real events at those exact timestamps, leading to non-deterministic event ordering.
Also applies to: 293-293
505-506: Inconsistent parameter handling.

Lines 505-506 use sqlstring.escape() for inline parameter substitution, while the rest of the codebase uses ClickHouse's query_params mechanism. This inconsistency reduces maintainability.
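With @clickhouse/client, values are normally passed via `query_params` and `{name:Type}` placeholders instead of being escaped into the SQL string. A hedged sketch of building such a query object (the table and column names here are illustrative, not the service's actual schema):

```typescript
// Illustrative: build a parameterized ClickHouse query object instead of
// escaping values into the SQL string. {importId:String} is ClickHouse's
// server-side parameter placeholder syntax.
function buildProgressQuery(importId: string) {
  return {
    query:
      'SELECT count() AS total FROM events_imports WHERE import_id = {importId:String}',
    query_params: { importId },
  };
}
```

The resulting object can be spread into `client.query({ ... })`, and the value never appears inline in the SQL text.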
559-559: Use deterministic status aggregation instead of any(import_status).

Line 559 uses any(import_status), which returns an arbitrary status from the group, making the result non-deterministic when events have mixed statuses.
any(import_status)which returns an arbitrary status from the group, making the result non-deterministic when events have mixed statuses.
603-603: Validate status value instead of using a type assertion.

Line 603 uses a type assertion which bypasses type safety. If ClickHouse returns an unexpected status value, the type system won't catch it.
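A dependency-free guard at the boundary can replace the assertion; the status values below mirror those used elsewhere in this PR, and the function name is illustrative:

```typescript
const IMPORT_STATUSES = ['pending', 'processing', 'completed', 'failed'] as const;
type ImportStatus = (typeof IMPORT_STATUSES)[number];

// Narrow the unknown ClickHouse value to ImportStatus, throwing on surprises
// instead of silently asserting with `as`.
function parseImportStatus(raw: unknown): ImportStatus {
  if (typeof raw === 'string' && (IMPORT_STATUSES as readonly string[]).includes(raw)) {
    return raw as ImportStatus;
  }
  throw new Error(`Unexpected import status from ClickHouse: ${String(raw)}`);
}
```

Since the codebase already uses zod, an equivalent `z.enum([...]).parse(raw)` would give the same runtime check with an inferred type.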
🧹 Nitpick comments (15)
packages/common/server/parser-user-agent.ts (1)
84-112: Consider reducing code duplication.

The pattern typeof overrides?.__x === 'string' && overrides?.__x ? overrides?.__x : res.x is repeated seven times for different fields. This increases maintenance overhead: if the override logic needs adjustment, you'd need to update it in seven places.

Consider extracting this logic into a helper function:
+function getOverride<T>( + override: unknown, + fallback: T | undefined +): T | undefined | string { + return typeof override === 'string' && override !== '' ? override : fallback; +} + export function parseUserAgent( ua?: string | null, overrides?: Record<string, unknown>, ) { if (!ua) return parsedServerUa; const res = parse(ua); if (isServer(res)) { return parsedServerUa; } return { - os: - typeof overrides?.__os === 'string' && overrides?.__os - ? overrides?.__os - : res.os.name, + os: getOverride(overrides?.__os, res.os.name), - osVersion: - typeof overrides?.__osVersion === 'string' && overrides?.__osVersion - ? overrides?.__osVersion - : res.os.version, + osVersion: getOverride(overrides?.__osVersion, res.os.version), - browser: - typeof overrides?.__browser === 'string' && overrides?.__browser - ? overrides?.__browser - : res.browser.name, + browser: getOverride(overrides?.__browser, res.browser.name), - browserVersion: - typeof overrides?.__browserVersion === 'string' && - overrides?.__browserVersion - ? overrides?.__browserVersion - : res.browser.version, + browserVersion: getOverride(overrides?.__browserVersion, res.browser.version), - device: - typeof overrides?.__device === 'string' && overrides?.__device - ? overrides?.__device - : res.device.type || getDevice(ua), + device: getOverride(overrides?.__device, res.device.type || getDevice(ua)), - brand: - typeof overrides?.__brand === 'string' && overrides?.__brand - ? overrides?.__brand - : res.device.vendor, + brand: getOverride(overrides?.__brand, res.device.vendor), - model: - typeof overrides?.__model === 'string' && overrides?.__model - ? overrides?.__model - : res.device.model, + model: getOverride(overrides?.__model, res.device.model), isServer: false, } as const; }packages/db/src/buffers/base-buffer.ts (1)
92-126: Avoid lock expiry during long flushes (possible double processing).

If onFlush exceeds lockTimeout (60s), the Redis lock can expire mid-run and another worker may acquire it. Add a heartbeat to extend the TTL only when the lockId still matches.
Apply this diff:
@@ private async releaseLock(lockId: string): Promise<void> { @@ } + private async refreshLock(lockId: string): Promise<void> { + const script = ` + if redis.call("get", KEYS[1]) == ARGV[1] then + return redis.call("expire", KEYS[1], ARGV[2]) + else + return 0 + end + `; + await getRedisCache().eval( + script, + 1, + this.lockKey, + lockId, + this.lockTimeout.toString(), + ); + } + async tryFlush() { @@ - if (acquired === 'OK') { + if (acquired === 'OK') { + const intervalMs = + Math.max(10, Math.floor(this.lockTimeout / 2)) * 1000; + const heartbeat = setInterval(() => { + void this.refreshLock(lockId).catch(() => {}); + }, intervalMs); try { @@ } catch (error) { @@ } finally { + clearInterval(heartbeat); await this.releaseLock(lockId); @@ } } }packages/trpc/src/routers/auth.ts (1)
152-154: Use the correct error type for “User already exists”.

This is a conflict, not a not-found case. Prefer a 400/409-style error.
Apply this diff:
-import { TRPCAccessError, TRPCNotFoundError } from '../errors'; +import { + TRPCAccessError, + TRPCNotFoundError, + TRPCBadRequestError, +} from '../errors'; @@ - if (user) { - throw TRPCNotFoundError('User already exists'); - } + if (user) { + throw TRPCBadRequestError('User already exists'); + }Also applies to: 31-31
packages/importer/src/providers/mixpanel.test.ts (1)
5-23: Granularity mismatch: daily vs monthly chunks.

The test expects daily chunks, but the Mixpanel provider's comments mention monthly chunking for reliability. Align the expectation and the implementation (or update the comment) to avoid future drift.
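Whichever granularity wins, the chunking itself is plain date arithmetic; a hedged sketch of a day-based chunker (the function name is illustrative, not the provider's actual code):

```typescript
// Illustrative: split an inclusive [from, to] range into consecutive
// day-sized chunks, which is what the daily-granularity test expects.
function chunkByDay(from: Date, to: Date): Array<{ from: Date; to: Date }> {
  const DAY_MS = 24 * 60 * 60 * 1000;
  const chunks: Array<{ from: Date; to: Date }> = [];
  for (let start = from.getTime(); start <= to.getTime(); start += DAY_MS) {
    // Each chunk ends one millisecond before the next starts, capped at `to`.
    const end = Math.min(start + DAY_MS - 1, to.getTime());
    chunks.push({ from: new Date(start), to: new Date(end) });
  }
  return chunks;
}
```

Switching the step constant to a month-walking loop would give the monthly variant the provider's comments describe.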
packages/importer/src/index.ts (1)
1-13: LGTM: coherent public surface.Exports for Umami/Mixpanel and core types look consistent. Consider exporting BaseImportProvider if external providers are intended later.
apps/worker/src/boot-workers.ts (2)
61-64: Clamp concurrency to ≥1 to avoid invalid worker configs.

ENV may be "0" or non-numeric; guard with a fallback.
- concurrency: Number.parseInt(process.env.IMPORT_JOB_CONCURRENCY || '1', 10), + concurrency: + Math.max(1, Number.parseInt(process.env.IMPORT_JOB_CONCURRENCY ?? '', 10) || 1),- concurrency: Number.parseInt(process.env.EVENT_JOB_CONCURRENCY || '1', 10), + concurrency: + Math.max(1, Number.parseInt(process.env.EVENT_JOB_CONCURRENCY ?? '', 10) || 1),Also applies to: 37-38
137-143: Graceful shutdown should consider all active queues (include import).

Only waiting on cron can exit while imports are active. Wait for key queues or rely solely on worker.close.
- await waitForQueueToEmpty(cronQueue); + // Wait for critical queues to drain before closing workers + await Promise.all([ + waitForQueueToEmpty(cronQueue), + waitForQueueToEmpty(sessionsQueue), + waitForQueueToEmpty(notificationQueue), + waitForQueueToEmpty(miscQueue), + waitForQueueToEmpty(importQueue), + ]);packages/queue/src/queues.ts (1)
168-181: Add retries/backoff for import jobs.

Imports are long-running and transiently fail (rate limits, net hiccups). Configure sane defaults.
export const importQueue = new Queue<ImportQueuePayload>('import', { connection: getRedisQueue(), defaultJobOptions: { removeOnComplete: 10, removeOnFail: 50, + attempts: 3, + backoff: { type: 'exponential', delay: 5000 }, + // prevent duplicates if enqueued idempotently from API + // jobId can be importId if desired at call-site }, });packages/db/prisma/schema.prisma (1)
477-496: Import model polish: set defaults and add indexes; keep currentStep nullable.
- Set status default to 'pending' to match lifecycle.
- Add indexes for common lookups (by projectId, status, recency).
- Keep currentStep optional; worker logic uses NULL for “new import”.
model Import { id String @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid projectId String project Project @relation(fields: [projectId], references: [id], onDelete: Cascade) jobId String? // BullMQ job ID - status ImportStatus + status ImportStatus @default(pending) statusMessage String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs" errorMessage String? /// [IPrismaImportConfig] config Json totalEvents Int @default(0) processedEvents Int @default(0) - currentStep String? + currentStep String? currentBatch String? // String date 2020-01-01 createdAt DateTime @default(now()) completedAt DateTime? updatedAt DateTime @default(now()) @updatedAt @@map("imports") + @@index([projectId, createdAt(sort: Desc)]) + @@index([status]) + @@index([projectId, status]) }packages/trpc/src/routers/import.ts (2)
121-126: Make job removal resilient.

Wrap job.remove() in try/catch so delete isn't blocked by queue errors; optionally skip removal if status is processing.
Apply:
- if (importRecord.jobId) { - const job = await importQueue.getJob(importRecord.jobId); - if (job) { - await job.remove(); - } - } + if (importRecord.jobId) { + try { + const job = await importQueue.getJob(importRecord.jobId); + if (job) await job.remove(); + } catch (e) { + // Best-effort cleanup; continue with delete + } + }
24-31: Paginate list to avoid unbounded result sets.

Add take/skip or cursor-based pagination; default e.g. last 50, with client-driven paging.
Apply:
- const rows = await db.import.findMany({ - where: { projectId: input.projectId }, - orderBy: { createdAt: 'desc' }, - }); + const rows = await db.import.findMany({ + where: { projectId: input.projectId }, + orderBy: { createdAt: 'desc' }, + take: 50, + });apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (2)
206-213: Null-safe provider/type display to work with server-side redaction.

Avoid crashes when config is null.
Apply:
- <div>{imp.config.provider}</div> + <div>{imp.config?.provider ?? 'unknown'}</div> ... - {imp.config.type} + {imp.config?.type ?? 'n/a'}
102-111: Don't surface full internal error details in the UI.

Truncate error text to reduce leak risk.
Apply:
- <Tooltiper - content={errorMessage} + <Tooltiper + content={errorMessage ? String(errorMessage).slice(0, 300) : 'Error'} tooltipClassName="max-w-xs break-words" >apps/worker/src/jobs/import.ts (1)
201-207: Use the job logger instead of console.log.

Align logging and ensure capture.
Apply:
- console.log('Generating session IDs', { from }); + jobLogger.info('Generating session IDs', { from });packages/importer/src/providers/mixpanel.ts (1)
309-314: Address the TODO or document why it's deferred.

The TODO comment indicates incomplete screen-view detection logic. This heuristic-based approach may cause incorrect event-name mapping for production data.
If this is a temporary workaround, please document when and how it should be removed. Otherwise, do you want me to help implement a more robust solution or open an issue to track this task?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (72)
- .github/workflows/docker-build.yml (3 hunks)
- apps/api/package.json (1 hunks)
- apps/start/Dockerfile (2 hunks)
- apps/start/package.json (2 hunks)
- apps/start/src/components/integrations/integration-card.tsx (1 hunks)
- apps/start/src/components/project-mapper.tsx (1 hunks)
- apps/start/src/components/ui/badge.tsx (1 hunks)
- apps/start/src/modals/add-import.tsx (1 hunks)
- apps/start/src/modals/date-ranger-picker.tsx (1 hunks)
- apps/start/src/modals/index.tsx (2 hunks)
- apps/start/src/routeTree.gen.ts (11 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
- apps/start/src/utils/math.ts (1 hunks)
- apps/start/src/utils/slug.ts (1 hunks)
- apps/worker/Dockerfile (2 hunks)
- apps/worker/package.json (3 hunks)
- apps/worker/src/boot-workers.ts (4 hunks)
- apps/worker/src/index.ts (2 hunks)
- apps/worker/src/jobs/events.incoming-event.ts (3 hunks)
- apps/worker/src/jobs/events.incoming-events.test.ts (2 hunks)
- apps/worker/src/jobs/import.ts (1 hunks)
- package.json (1 hunks)
- packages/cli/package.json (0 hunks)
- packages/cli/src/cli.ts (0 hunks)
- packages/cli/src/importer/importer.ts (0 hunks)
- packages/cli/src/importer/index.ts (0 hunks)
- packages/cli/tsconfig.json (0 hunks)
- packages/cli/tsup.config.ts (0 hunks)
- packages/common/package.json (1 hunks)
- packages/common/scripts/get-referrers.ts (1 hunks)
- packages/common/server/index.ts (1 hunks)
- packages/common/server/parse-referrer.test.ts (5 hunks)
- packages/common/server/parse-referrer.ts (4 hunks)
- packages/common/server/parser-user-agent.ts (2 hunks)
- packages/common/src/object.ts (1 hunks)
- packages/db/code-migrations/5-add-imports-table.sql (1 hunks)
- packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
- packages/db/index.ts (1 hunks)
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1 hunks)
- packages/db/prisma/schema.prisma (2 hunks)
- packages/db/src/buffers/base-buffer.ts (1 hunks)
- packages/db/src/buffers/event-buffer.test.ts (8 hunks)
- packages/db/src/clickhouse/client.ts (2 hunks)
- packages/db/src/services/event.service.ts (1 hunks)
- packages/db/src/services/import.service.ts (1 hunks)
- packages/db/src/services/session.service.ts (1 hunks)
- packages/db/src/types.ts (2 hunks)
- packages/importer/package.json (1 hunks)
- packages/importer/src/base-provider.ts (1 hunks)
- packages/importer/src/index.ts (1 hunks)
- packages/importer/src/providers/metadata.ts (1 hunks)
- packages/importer/src/providers/mixpanel.test.ts (1 hunks)
- packages/importer/src/providers/mixpanel.ts (1 hunks)
- packages/importer/src/providers/umami.ts (1 hunks)
- packages/importer/src/types.ts (1 hunks)
- packages/importer/tsconfig.json (1 hunks)
- packages/importer/vitest.config.ts (1 hunks)
- packages/queue/src/queues.ts (1 hunks)
- packages/redis/package.json (1 hunks)
- packages/trpc/package.json (1 hunks)
- packages/trpc/src/root.ts (2 hunks)
- packages/trpc/src/routers/auth.ts (1 hunks)
- packages/trpc/src/routers/import.ts (1 hunks)
- packages/trpc/src/routers/organization.ts (1 hunks)
- packages/validation/src/index.ts (1 hunks)
💤 Files with no reviewable changes (6)
- packages/cli/src/cli.ts
- packages/cli/src/importer/importer.ts
- packages/cli/tsconfig.json
- packages/cli/package.json
- packages/cli/tsup.config.ts
- packages/cli/src/importer/index.ts
🚧 Files skipped from review as they are similar to previous changes (26)
- apps/start/src/modals/date-ranger-picker.tsx
- packages/common/server/index.ts
- apps/api/package.json
- apps/start/src/modals/index.tsx
- packages/db/src/services/event.service.ts
- packages/trpc/package.json
- packages/db/src/types.ts
- packages/common/server/parse-referrer.test.ts
- apps/start/src/components/ui/badge.tsx
- apps/start/src/components/integrations/integration-card.tsx
- apps/start/src/routes/app.$organizationId.$projectId.settings._tabs.tsx
- packages/db/src/buffers/event-buffer.test.ts
- packages/common/package.json
- packages/importer/tsconfig.json
- packages/db/src/services/session.service.ts
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql
- packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql
- packages/common/src/object.ts
- apps/start/package.json
- apps/start/src/routeTree.gen.ts
- packages/db/src/clickhouse/client.ts
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql
- packages/importer/src/base-provider.ts
- packages/importer/package.json
- package.json
- packages/trpc/src/root.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
- apps/start/src/utils/slug.ts
- apps/start/src/components/project-mapper.tsx
- apps/start/src/modals/add-import.tsx
- apps/start/src/utils/math.ts
🧬 Code graph analysis (15)
apps/worker/src/boot-workers.ts (2)
- packages/queue/src/queues.ts (1): `importQueue` (175-181)
- apps/worker/src/jobs/import.ts (1): `importJob` (25-281)

packages/db/code-migrations/5-add-imports-table.ts (2)
- packages/db/code-migrations/helpers.ts (1): `getIsCluster` (17-24)
- packages/db/src/clickhouse/migration.ts (2): `createTable` (68-116), `runClickhouseMigrationCommands` (322-454)

packages/queue/src/queues.ts (1)
- packages/redis/redis.ts (1): `getRedisQueue` (93-105)

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (4)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1): `Route` (12-34)
- packages/importer/src/providers/metadata.ts (1): `IMPORT_PROVIDERS` (13-30)
- apps/public/components/ui/tooltip.tsx (1): `Tooltiper` (36-51)
- apps/start/src/components/integrations/integration-card.tsx (3): `IntegrationCard` (91-112), `IntegrationCardLogoImage` (50-69), `IntegrationCardFooter` (3-15)

packages/trpc/src/routers/import.ts (6)
- packages/trpc/src/trpc.ts (2): `createTRPCRouter` (138-138), `protectedProcedure` (161-164)
- packages/trpc/src/access.ts (1): `getProjectAccessCached` (4-4)
- packages/trpc/src/errors.ts (1): `TRPCAccessError` (3-7)
- packages/db/src/prisma-client.ts (1): `db` (270-270)
- packages/validation/src/index.ts (1): `zCreateImport` (478-486)
- packages/queue/src/queues.ts (1): `importQueue` (175-181)

apps/start/src/components/project-mapper.tsx (2)
- packages/db/src/services/project.service.ts (1): `IServiceProject` (7-7)
- apps/start/src/components/ui/select.tsx (5): `Select` (173-173), `SelectTrigger` (181-181), `SelectValue` (182-182), `SelectContent` (174-174), `SelectItem` (176-176)

packages/importer/vitest.config.ts (1)
- vitest.shared.ts (1): `getSharedVitestConfig` (4-27)

apps/worker/src/jobs/import.ts (6)
- packages/queue/src/queues.ts (1): `ImportQueuePayload` (168-173)
- packages/db/src/prisma-client.ts (1): `db` (270-270)
- packages/db/src/services/import.service.ts (10): `ImportSteps` (671-671), `updateImportStatus` (673-736), `getImportDateBounds` (610-634), `insertImportBatch` (28-55), `generateSessionIds` (62-98), `createSessionsStartEndEvents` (106-316), `moveImportsToProduction` (322-426), `backfillSessionsToProduction` (428-521), `markImportComplete` (526-545), `getImportProgress` (550-605)
- packages/db/src/clickhouse/client.ts (2): `formatClickhouseDate` (208-219), `error` (40-45)
- packages/db/src/services/event.service.ts (1): `IClickhouseEvent` (68-102)
- packages/logger/index.ts (1): `ILogger` (6-6)

packages/importer/src/types.ts (2)
- packages/db/src/services/event.service.ts (1): `IServiceImportedEventPayload` (144-147)
- packages/db/src/services/import.service.ts (2): `ImportProgress` (18-23), `ImportStageResult` (12-16)

apps/worker/src/index.ts (1)
- packages/queue/src/queues.ts (1): `importQueue` (175-181)

packages/importer/src/providers/umami.ts (5)
- packages/validation/src/index.ts (1): `IUmamiImportConfig` (456-456)
- packages/logger/index.ts (1): `ILogger` (6-6)
- packages/db/src/services/event.service.ts (2): `IClickhouseEvent` (68-102), `query` (701-847)
- packages/common/src/url.ts (2): `parsePath` (11-49), `isSameDomain` (51-63)
- packages/common/server/parse-referrer.ts (2): `parseReferrer` (17-26), `getReferrerWithQuery` (28-66)

apps/start/src/modals/add-import.tsx (3)
- packages/validation/src/index.ts (5): `zUmamiImportConfig` (452-454), `zMixpanelImportConfig` (461-470), `IUmamiImportConfig` (456-456), `IMixpanelImportConfig` (471-471), `IImportConfig` (473-476)
- apps/start/src/components/project-mapper.tsx (1): `ProjectMapper` (33-116)
- apps/public/lib/utils.ts (1): `cn` (4-6)

packages/importer/src/providers/mixpanel.test.ts (1)
- packages/importer/src/providers/mixpanel.ts (1): `MixpanelProvider` (18-441)

packages/db/src/services/import.service.ts (4)
- packages/db/src/services/event.service.ts (1): `IClickhouseEvent` (68-102)
- packages/db/src/clickhouse/client.ts (4): `ch` (131-161), `TABLE_NAMES` (48-60), `convertClickhouseDateToJs` (238-240), `formatClickhouseDate` (208-219)
- packages/logger/index.ts (1): `ILogger` (6-6)
- packages/db/src/prisma-client.ts (1): `db` (270-270)

packages/importer/src/providers/mixpanel.ts (7)
- packages/validation/src/index.ts (1): `IMixpanelImportConfig` (471-471)
- packages/logger/index.ts (1): `ILogger` (6-6)
- packages/db/src/services/event.service.ts (2): `IClickhouseEvent` (68-102), `query` (701-847)
- packages/common/src/url.ts (2): `parsePath` (11-49), `isSameDomain` (51-63)
- packages/common/server/parse-referrer.ts (2): `parseReferrer` (17-26), `getReferrerWithQuery` (28-66)
- packages/common/server/parser-user-agent.ts (2): `parseUserAgent` (72-115), `UserAgentInfo` (71-71)
- packages/common/src/object.ts (2): `toDots` (3-36), `strip` (48-48)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-push-dashboard
- GitHub Check: build-and-push-worker
- GitHub Check: build-and-push-api
🔇 Additional comments (29)
.github/workflows/docker-build.yml (1)
149-149: Guard tag creation to main branch only; LGTM. The conditional guards on all three tag-creation steps (
`if: github.ref == 'refs/heads/main'`) correctly restrict tag creation to the main branch. This is good practice for release/deployment management, preventing accidental tags from feature branches while allowing Docker images to be built for all branches. The syntax and consistency across all three jobs are correct.
Also applies to: 210-210, 271-271
packages/common/server/parser-user-agent.ts (1)
71-71: Good addition of type export. Exporting the return type as
`UserAgentInfo` is a best practice that improves type safety and developer experience for consumers of this function.
11-11: Verify the intent behind pinning ioredis to an exact version. Version 5.8.2 is confirmed as valid, latest, and published by the official Redis Labs maintainer. However, this is the only production dependency in the package.json using exact pinning; all others use caret constraints (
`^`) or monorepo references (`workspace:*`, `catalog:`). Ensure this pinning is intentional for your queue/job processing stability, rather than an accidental divergence from project conventions. If stability is not a critical concern here, consider aligning with the project's versioning strategy.
1-1: Import-path update is consistent with the monorepo re-exports. For safety, run the same re-export check script shared in organization.ts to ensure no accidental duplicate exports.
packages/trpc/src/routers/auth.ts (1)
15-15: Import-path change LGTM.packages/trpc/src/routers/organization.ts (1)
14-14: No issues found; import change is correct. Re-export verified:
`generateSecureId` is defined once in `packages/common/server/id.ts` and properly re-exported. The import path consolidation to `@openpanel/common/server` is consistent with existing patterns across the codebase (already used in auth.ts, db/buffers, and other files). Tree-shaking will work correctly.
102-102: LGTM! Test expectations correctly updated. The referrerType expectations now correctly use an empty string instead of 'unknown', aligning with the updated behavior in packages/common/server/parse-referrer.ts, where the default type for non-matches changed from 'unknown' to ''.
Also applies to: 210-210
apps/start/src/utils/math.ts (1)
1-1: LGTM! Re-export now uses package root. The change from '@openpanel/common/src/math' to '@openpanel/common' aligns with the pattern of consuming the public API surface from package roots rather than internal paths.
packages/db/code-migrations/5-add-imports-table.sql (1)
1-43: LGTM! Table schema is well-structured. The `events_imports` table schema provides comprehensive event tracking with import-specific metadata. The 7-day TTL on `imported_at_meta` provides automatic cleanup for staging data. Note that in clustered deployments, the TTL should be applied to the replicated table rather than the distributed table (see related comment on the .ts migration file).
apps/start/src/utils/slug.ts (1)
1-1: LGTM! Re-export now uses package root. Consistent with the broader pattern of consuming the public API surface from '@openpanel/common' rather than internal paths.
packages/common/server/parse-referrer.ts (1)
1-3: LGTM! Referrer parsing logic improved. The changes improve consistency and efficiency:
- Default type changed from 'unknown' to empty string for unmatched referrers
- Source is now lowercased once during computation rather than repeatedly during comparisons
- Updated matching logic is cleaner and more maintainable
Test files correctly reflect these behavioral changes.
Also applies to: 23-23, 35-51, 63-63
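The default-type change described above can be illustrated with a small sketch (the lookup table and function name are illustrative stand-ins, not the actual parse-referrer implementation):

```typescript
// Hypothetical sketch of the new behavior: the source is lowercased once,
// and unmatched referrers fall back to '' instead of 'unknown'.
const referrerTypes: Record<string, string> = {
  'google.com': 'search',
  'facebook.com': 'social',
};

function parseReferrerType(source: string | undefined): string {
  const host = (source ?? '').toLowerCase(); // lowercase once, up front
  return referrerTypes[host] ?? ''; // default is now '' rather than 'unknown'
}

console.log(parseReferrerType('Google.com')); // 'search'
console.log(parseReferrerType('example.org')); // ''
```

Returning an empty string rather than a sentinel like 'unknown' keeps comparisons and test expectations simpler, as the updated tests above show.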
packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1)
1-2: LGTM! The migration correctly adds a nullable
`statusMessage` column to track detailed import status information.
1-2: LGTM! The migration correctly makes
`currentStep` nullable, which is appropriate for representing imports that haven't started or tracking optional step progression.
41-41: Verify intentional removal of version range flexibility. The
`tsdown` dependency was changed from `^0.14.2` to `0.14.2` (caret removed), which prevents automatic minor/patch updates. This appears unrelated to the importer feature. Is this pinning intentional for reproducibility, or was it accidental? If unintentional, consider restoring the caret:
```diff
-    "tsdown": "0.14.2",
+    "tsdown": "^0.14.2",
```
22-22: LGTM! Adding the
`@openpanel/importer` dependency correctly wires the worker to process import jobs.
25-25: LGTM! The export correctly surfaces the new import service, consistent with other service exports in this file.
apps/worker/src/index.ts (1)
8-8: LGTM! The
`importQueue` is correctly imported and added to the Bull Board UI, enabling monitoring of import jobs alongside existing queues.
Also applies to: 42-42
apps/worker/Dockerfile (2)
38-38: LGTM: workspace caching for importer. Including packages/importer/package.json improves pnpm cache hits.
85-87: LGTM: importer copied into runner image. Runtime availability is ensured; ordering with logger/common is fine.
packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
13-13: Migration is safe; config validation requires the field. The
`config` field in `zCreateImport` is mandatory (a union of three required config types with no `.optional()`), and the single import-creation path always provides it from validated input. Since the column remains `NOT NULL`, dropping the DEFAULT poses no risk of introducing NULLs.
8-12: Consolidation to @openpanel/common/server looks good. Shared parser usage reduces drift.
120-123: Referrer precedence change: confirm downstream expectations. Now referrer/url comes only from the parsed referrer (not UTM), and type prioritizes referrer over UTM. Validate dashboards/segments relying on the previous precedence (tests are adjusted, but product semantics matter).
packages/importer/src/providers/mixpanel.ts (1)
401-440: LGTM! The property stripping and string coercion logic correctly handles edge cases including null/undefined values and nested objects. The regex pattern effectively filters Mixpanel-specific properties.
packages/db/src/services/import.service.ts (6)
28-55: LGTM! The batch insertion logic correctly handles empty arrays, adds required import metadata, and leverages ClickHouse's JSONEachRow format for efficient bulk inserts.
62-98: LGTM! The deterministic session ID generation using MD5 hashing of device_id and 30-minute time windows ensures consistent results across imports. The use of
`mutations_sync: '2'` properly ensures replication before proceeding.
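The windowed hashing described above can be sketched as follows (the function name and window size here follow the review text; this is not the actual `generateSessionIds` implementation):

```typescript
import { createHash } from 'node:crypto';

// Deterministic session ids: hash the device id together with a 30-minute
// time bucket, so re-running the same import produces the same ids.
const WINDOW_MS = 30 * 60 * 1000;

function sessionId(deviceId: string, createdAt: Date): string {
  const bucket = Math.floor(createdAt.getTime() / WINDOW_MS);
  return createHash('md5').update(`${deviceId}:${bucket}`).digest('hex');
}

const a = sessionId('dev-1', new Date('2025-01-01T10:00:00Z'));
const b = sessionId('dev-1', new Date('2025-01-01T10:10:00Z')); // same window
const c = sessionId('dev-1', new Date('2025-01-01T10:40:00Z')); // next window
console.log(a === b, a === c); // true false
```

Because the id depends only on the inputs, retried or resumed imports regenerate identical session ids instead of creating duplicates.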
322-426: LGTM! The migration logic correctly handles session events with adjusted timestamps by including them via session_id lookup rather than strict date filtering. The progress headers configuration prevents connection timeouts for large imports.
526-545: LGTM! The completion marking correctly uses
`mutations_sync: '2'` to ensure all replicas are updated before the function returns, preventing race conditions.
610-634: LGTM! The date bounds query correctly handles the optional
`fromCreatedAt` parameter and provides sensible defaults for empty result sets.
673-736: Well-structured status update flow. The step-based union type provides excellent type safety, ensuring each step can only include valid fields. The status message generation is clear and user-friendly.
Note: The Postgres update at the end (lines 732-735) occurs after the job progress update. If the database update fails, the job progress will be out of sync with the database state. Consider whether this ordering is intentional or if database-first updates would be more appropriate.
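If database-first ordering is preferred, the fix is a simple reorder; a minimal sketch (the two update callbacks stand in for the real Postgres and job-progress calls, which are assumptions here):

```typescript
// Persist status to the database first, then mirror it to job progress,
// so a failed DB write never leaves the job's progress ahead of the
// database's recorded state.
type Status = { step: string; processed: number };

async function updateStatusDbFirst(
  dbUpdate: (s: Status) => Promise<void>,
  jobUpdate: (s: Status) => Promise<void>,
  status: Status,
): Promise<void> {
  await dbUpdate(status); // source of truth first
  await jobUpdate(status); // best-effort mirror afterwards
}
```

With this ordering, a crash between the two awaits leaves the job progress stale but the database correct, which is usually the easier inconsistency to recover from.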
```dockerfile
COPY packages/trpc/package.json packages/trpc/
COPY packages/json/package.json packages/json/
COPY packages/common/package.json packages/common/
COPY packages/importer/package.json packages/importer/
```
🧩 Analysis chain
Ensure @openpanel/importer is either available at runtime or not linked.
You add packages/importer/package.json for install caching, but the runner image does not copy packages/importer. If apps/start does any value import (not type-only) of @openpanel/importer, PNPM’s workspace symlink will point to a missing path at runtime.
- Option A: If only type imports are used, keep as-is.
- Option B: If runtime imports exist, also copy the importer sources into runner:
```diff
+# In runner stage, alongside other packages
+COPY --from=build /app/packages/importer ./packages/importer
```

Run to confirm no value imports from apps/start:
Also applies to: 89-89
🏁 Script executed:

```bash
#!/bin/bash
# Find imports of @openpanel/importer in Start; flag non-type imports
rg -nP --type=ts --type=tsx -C2 "from ['\"]@openpanel/importer['\"]" apps/start \
  | grep -v "import type" || echo "No value imports found."
```

Length of output: 51
I need to re-run the verification with a corrected command, as the file type flag issue makes the result inconclusive.
---
🏁 Script executed:
```shell
#!/bin/bash
# Broader search: find any imports of @openpanel/importer in apps/start
# without file type restrictions to avoid flag parsing issues
echo "=== Searching for @openpanel/importer imports ==="
rg -n "@openpanel/importer" apps/start || echo "No matches found."
echo ""
echo "=== Directory structure of apps/start ==="
find apps/start -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" \) | head -20
```

Length of output: 1250
Need to verify if this is the only import and search more thoroughly:
---
🏁 Script executed:
```shell
#!/bin/bash
# Find all imports/requires of @openpanel/importer in apps/start
echo "=== All references to @openpanel/importer ==="
rg -n "@openpanel/importer" apps/start -A2 -B2
echo ""
echo "=== Count of files with @openpanel/importer imports ==="
rg -l "@openpanel/importer" apps/start | wc -l
```

Length of output: 1172
Copy @openpanel/importer sources to runner stage to fix broken workspace symlink at runtime.
apps/start has a value import (`import { IMPORT_PROVIDERS } from '@openpanel/importer/providers'`) in apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx:22. The Dockerfile copies packages/importer/package.json for caching but omits the source files in the runner stage, so the workspace symlink points to a missing path at runtime.
Copy the importer sources in the runner stage:
```diff
+# In runner stage, alongside other packages
+COPY --from=build /app/packages/importer ./packages/importer
```

Also applies to: line 89
🤖 Prompt for AI Agents
In apps/start/Dockerfile around lines 32 and 89, the Dockerfile only copies
packages/importer/package.json which leaves the @openpanel/importer source files
absent in the runner stage and breaks the workspace symlink at runtime; fix by
copying the importer sources into the runner stage (either COPY
packages/importer/ packages/importer/ in the runner stage or COPY --from=builder
/app/packages/importer/ /app/packages/importer/ if the builder stage contains
the sources/artifacts) so the import path @openpanel/importer resolves at
runtime.
```typescript
const jobLogger = logger.child({
  importId,
  config: record.config,
});
```
Logger context leaks config secrets.
Avoid attaching full config to logs; include non-sensitive fields only.
Apply:
```diff
-    const jobLogger = logger.child({
-      importId,
-      config: record.config,
-    });
+    const jobLogger = logger.child({
+      importId,
+      provider: record.config?.provider,
+    });
```
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 35 to 38, the logger child is
attaching the entire record.config which may contain secrets; replace that with
a whitelist of non-sensitive fields (e.g., config.name, config.type,
config.source, config.region, config.bucket or other known safe metadata) or
explicitly redact sensitive keys before passing to logger.child — do not log the
full config object; update the jobLogger creation to only include importId plus
the selected safe config fields (or a redact/mask function) so secrets are never
emitted to logs.
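One way to implement the whitelist described above is a small pick helper; a sketch (the field names are assumptions for illustration, not the actual config shape):

```typescript
// Copy only known-safe config fields before passing them to logger.child,
// so secrets present in the config object are never emitted to logs.
function pickSafeConfig(
  config: Record<string, unknown> | null | undefined,
): Record<string, unknown> {
  const safeKeys = ['provider', 'type'] as const;
  const out: Record<string, unknown> = {};
  for (const key of safeKeys) {
    if (config && key in config) out[key] = config[key];
  }
  return out;
}

console.log(pickSafeConfig({ provider: 'mixpanel', apiSecret: 's3cr3t' }));
// → { provider: 'mixpanel' }
```

An explicit allow-list is safer than a deny-list of known secret keys, because newly added config fields stay out of the logs by default.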
```typescript
const eventBatch: any = [];
for await (const rawEvent of providerInstance.parseSource(
  resumeLoadingFrom,
)) {
  // Validate event
  if (
    !providerInstance.validate(
      // @ts-expect-error
      rawEvent,
    )
  ) {
    jobLogger.warn('Skipping invalid event', { rawEvent });
    continue;
  }

  eventBatch.push(rawEvent);

  // Process batch when it reaches the batch size
  if (eventBatch.length >= BATCH_SIZE) {
    jobLogger.info('Processing batch', { batchSize: eventBatch.length });

    const transformedEvents: IClickhouseEvent[] = eventBatch.map(
      (
        // @ts-expect-error
        event,
      ) => providerInstance!.transformEvent(event),
    );

    await insertImportBatch(transformedEvents, importId);

    processedEvents += eventBatch.length;
    eventBatch.length = 0;

    const createdAt = new Date(transformedEvents[0]?.created_at || '')
      .toISOString()
      .split('T')[0];

    await updateImportStatus(jobLogger, job, importId, {
      step: 'loading',
      batch: createdAt,
      totalEvents,
      processedEvents,
    });
  }
}
```
Harden batch transform; avoid crashes on empty/failed transforms.
- Use unknown[] buffer.
- Guard per-event transform with try/catch.
- Safe createdAt derivation.
Apply:
```diff
-    const eventBatch: any = [];
+    const eventBatch: unknown[] = [];
@@
     if (eventBatch.length >= BATCH_SIZE) {
       jobLogger.info('Processing batch', { batchSize: eventBatch.length });
-
-      const transformedEvents: IClickhouseEvent[] = eventBatch.map(
-        (
-          // @ts-expect-error
-          event,
-        ) => providerInstance!.transformEvent(event),
-      );
+      const transformedEvents: IClickhouseEvent[] = [];
+      for (const ev of eventBatch) {
+        try {
+          transformedEvents.push(
+            providerInstance!.transformEvent(ev as any),
+          );
+        } catch (e) {
+          jobLogger.warn('Skipping event on transform failure', {
+            provider: providerInstance.provider,
+            reason: e instanceof Error ? e.message : String(e),
+          });
+        }
+      }
       await insertImportBatch(transformedEvents, importId);
       processedEvents += eventBatch.length;
       eventBatch.length = 0;
-      const createdAt = new Date(transformedEvents[0]?.created_at || '')
-        .toISOString()
-        .split('T')[0];
+      const firstCreatedAt = transformedEvents[0]?.created_at;
+      const createdAt = firstCreatedAt
+        ? firstCreatedAt.split('T')[0]!
+        : formatClickhouseDate(new Date(), true);
       await updateImportStatus(jobLogger, job, importId, {
         step: 'loading',
         batch: createdAt,
         totalEvents,
         processedEvents,
       });
```
```typescript
// Process remaining events in the last batch
if (eventBatch.length > 0) {
  const transformedEvents = eventBatch.map(
    (
      // @ts-expect-error
      event,
    ) => providerInstance!.transformEvent(event),
  );

  await insertImportBatch(transformedEvents, importId);

  processedEvents += eventBatch.length;
  eventBatch.length = 0;

  const createdAt = new Date(transformedEvents[0]?.created_at || '')
    .toISOString()
    .split('T')[0];

  await updateImportStatus(jobLogger, job, importId, {
    step: 'loading',
    batch: createdAt,
  });
}
```
Repeat: harden final partial batch and persist counters.
Mirror guarded transform and include totals in status update.
Apply:
```diff
-  if (eventBatch.length > 0) {
-    const transformedEvents = eventBatch.map(
-      (
-        // @ts-expect-error
-        event,
-      ) => providerInstance!.transformEvent(event),
-    );
+  if (eventBatch.length > 0) {
+    const transformedEvents: IClickhouseEvent[] = [];
+    for (const ev of eventBatch) {
+      try {
+        transformedEvents.push(providerInstance!.transformEvent(ev as any));
+      } catch (e) {
+        jobLogger.warn('Skipping event on transform failure', {
+          provider: providerInstance.provider,
+          reason: e instanceof Error ? e.message : String(e),
+        });
+      }
+    }
     await insertImportBatch(transformedEvents, importId);
     processedEvents += eventBatch.length;
     eventBatch.length = 0;
-    const createdAt = new Date(transformedEvents[0]?.created_at || '')
-      .toISOString()
-      .split('T')[0];
+    const firstCreatedAt = transformedEvents[0]?.created_at;
+    const createdAt = firstCreatedAt
+      ? firstCreatedAt.split('T')[0]!
+      : formatClickhouseDate(new Date(), true);
     await updateImportStatus(jobLogger, job, importId, {
       step: 'loading',
       batch: createdAt,
+      totalEvents,
+      processedEvents,
     });
   }
```

🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 170 to 192, the final partial
batch is processed without the same guards and error handling used earlier and
the status update omits processed/total counters; update the block to perform
the guarded transform exactly like the earlier batch processing (check
providerInstance exists, handle per-event transform errors or wrap transform in
try/catch and skip/collect failures), ensure transformedEvents is computed only
if providerInstance is present and has valid created_at before deriving
createdAt, and include the processedEvents and total/expected total (e.g.,
totalEvents) in the call to updateImportStatus so counters are persisted
alongside step and batch.
```sql
-- CreateEnum
CREATE TYPE "public"."ImportStatus" AS ENUM ('pending', 'processing', 'completed', 'failed');

-- AlterTable
ALTER TABLE "public"."imports" DROP COLUMN "status",
ADD COLUMN "status" "public"."ImportStatus" NOT NULL,
ALTER COLUMN "currentStep" SET NOT NULL;
```
Destructive migration: status drop/recreate will lose data; currentStep NOT NULL conflicts with code.
- Map TEXT → enum safely; do not drop without backfill.
- Keep currentStep nullable; worker import.ts uses null to detect new imports.
Safe migration example:
```diff
--- a/packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
+++ b/packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
 CREATE TYPE "public"."ImportStatus" AS ENUM ('pending', 'processing', 'completed', 'failed');
-ALTER TABLE "public"."imports" DROP COLUMN "status",
-ADD COLUMN "status" "public"."ImportStatus" NOT NULL,
-ALTER COLUMN "currentStep" SET NOT NULL;
+-- 1) Add new enum column with default to preserve existing rows
+ALTER TABLE "public"."imports"
+  ADD COLUMN "status_v2" "public"."ImportStatus" NOT NULL DEFAULT 'pending';
+
+-- 2) Map old text values to enum; fall back to 'pending' for unknowns
+UPDATE "public"."imports"
+SET "status_v2" = CASE lower(coalesce("status", 'pending'))
+  WHEN 'pending' THEN 'pending'::"public"."ImportStatus"
+  WHEN 'processing' THEN 'processing'::"public"."ImportStatus"
+  WHEN 'completed' THEN 'completed'::"public"."ImportStatus"
+  WHEN 'failed' THEN 'failed'::"public"."ImportStatus"
+  ELSE 'pending'::"public"."ImportStatus"
+END;
+
+-- 3) Drop old column and rename
+ALTER TABLE "public"."imports" DROP COLUMN "status";
+ALTER TABLE "public"."imports" RENAME COLUMN "status_v2" TO "status";
+ALTER TABLE "public"."imports" ALTER COLUMN "status" DROP DEFAULT;
+
+-- 4) Keep currentStep nullable (code relies on NULL to denote new import)
+-- (Remove the SET NOT NULL)
```

Also, consider wrapping in a transaction if your migration runner allows it.
🤖 Prompt for AI Agents
In packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
around lines 8 to 14, the migration currently drops and recreates the "status"
column (which will lose data) and forces "currentStep" to NOT NULL (which
conflicts with import worker logic); instead, create the new enum type, add a
new nullable enum column (e.g. status_new), backfill status_new from the
existing text status using a safe mapping, then drop the old text column and
rename status_new to status (or use ALTER COLUMN TYPE after backfill), and leave
currentStep nullable (do not set NOT NULL). Wrap the whole sequence in a
transaction if your runner supports it and ensure any NOT NULL constraints are
only applied after backfilling.
const rawEvent = {
  event: '$mp_web_page_view',
  properties: {
    time: 1746097970,
    distinct_id: '$device:123',
    $browser: 'Chrome',
    $browser_version: 135,
    $city: 'Mumbai',
    $current_url:
      'https://domain.com/state/maharashtra?utm_source=google&utm_medium=cpc&utm_campaignid=890&utm_adgroupid=&utm_adid=&utm_term=&utm_device=m&utm_network=x&utm_location=123&gclid=oqneoqow&gad_sour',
    $device: 'Android',
    $device_id: '123',
    $initial_referrer: 'https://referrer.com/',
    $initial_referring_domain: 'referrer.com',
    $insert_id: 'source_id',
    $lib_version: '2.60.0',
    $mp_api_endpoint: 'api-js.mixpanel.com',
    $mp_api_timestamp_ms: 1746078175363,
    $mp_autocapture: true,
    $os: 'Android',
    $referrer: 'https://google.com/',
    $referring_domain: 'referrer.com',
    $region: 'Maharashtra',
    $screen_height: 854,
    $screen_width: 384,
    current_domain: 'domain.com',
    current_page_title:
      'Landeed: Satbara Utara, 7/12 Extract, Property Card & Index 2',
    current_url_path: '/state/maharashtra',
    current_url_protocol: 'https:',
    current_url_search:
      '?utm_source=google&utm_medium=cpc&utm_campaignid=890&utm_adgroupid=&utm_adid=&utm_term=&utm_device=m&utm_network=x&utm_location=123&gclid=oqneoqow&gad_source=5&gclid=EAIaIQobChMI6MnvhciBjQMVlS-DAx',
    gclid: 'oqneoqow',
    mp_country_code: 'IN',
    mp_lib: 'web',
    mp_processing_time_ms: 1746078175546,
    mp_sent_by_lib_version: '2.60.0',
    utm_medium: 'cpc',
    utm_source: 'google',
  },
};

const res = provider.transformEvent(rawEvent);

expect(res).toMatchObject({
  id: expect.any(String),
  name: 'screen_view',
  device_id: '123',
  profile_id: '123',
  project_id: 'pid',
  session_id: '',
  properties: {
    __source_insert_id: 'source_id',
    __screen: '384x854',
    __lib_version: '2.60.0',
    '__query.utm_source': 'google',
    '__query.utm_medium': 'cpc',
    '__query.utm_campaignid': '890',
    '__query.utm_device': 'm',
    '__query.utm_network': 'x',
    '__query.utm_location': '123',
    '__query.gclid': 'oqneoqow',
    __title:
      'Landeed: Satbara Utara, 7/12 Extract, Property Card & Index 2',
  },
  created_at: '2025-05-01T11:12:50.000Z',
  country: 'IN',
  city: 'Mumbai',
  region: 'Maharashtra',
  longitude: null,
  latitude: null,
  os: 'Android',
  os_version: undefined,
  browser: 'Chrome',
  browser_version: '',
  device: 'mobile',
  brand: '',
  model: '',
  duration: 0,
  path: '/state/maharashtra',
  origin: 'https://domain.com',
  referrer: 'https://referrer.com',
  referrer_name: 'Google',
  referrer_type: 'search',
  imported_at: expect.any(String),
  sdk_name: 'mixpanel',
  sdk_version: '1.0.0',
});
});
Test hides mapping bugs in provider (browser/os version).
transformEvent currently ignores $browser_version and $os_version due to camel/snake mismatch; test expecting empty/undefined masks this. Tighten assertions to expect real values, then fix provider:
- os_version: undefined,
- browser_version: '',
+  os_version: '', // or 'Android <ver>' if available
+  browser_version: '135',

Provider fix (packages/importer/src/providers/mixpanel.ts):
- os: uaInfo.os || props.$os,
- os_version: uaInfo.osVersion || props.$osVersion,
- browser: uaInfo.browser || props.$browser,
- browser_version:
- uaInfo.browserVersion || props.$browserVersion
- ? String(props.$browser_version)
- : '',
+ os: uaInfo.os || props.$os || '',
+ os_version: uaInfo.osVersion || props.$os_version || '',
+ browser: uaInfo.browser || props.$browser || '',
+  browser_version: String(uaInfo.browserVersion || props.$browser_version || ''),
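The snake_case-first lookup the diff relies on can be factored into a tiny helper. This is a sketch with assumed property names, not the provider's actual code:

```typescript
// Reads a Mixpanel-style property, preferring the snake_case key and falling
// back to the camelCase variant; always returns a string ('' when absent).
type RawProps = Record<string, string | number | boolean | undefined>;

function readProp(props: RawProps, snake: string, camel: string): string {
  const value = props[snake] ?? props[camel];
  return value === undefined ? '' : String(value);
}

const props: RawProps = { $os: 'Android', $browser_version: 135 };
const browserVersion = readProp(props, '$browser_version', '$browserVersion'); // -> '135'
const osVersion = readProp(props, '$os_version', '$osVersion'); // -> '' (key absent)
```

Centralizing the fallback in one helper means a future naming drift shows up in one place instead of four separate field mappings.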
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.test.ts lines 38-126: the test
currently masks a mapping bug by expecting browser_version = '' and os_version =
undefined even though rawEvent.properties contains $browser_version and $os;
update the test to assert the actual expected values (browser_version: '135',
os_version: 'Android' or the correct field value) and ensure created_at remains
derived from the time field. Then fix the provider in
packages/importer/src/providers/mixpanel.ts to properly read both camelCase and
snake_case Mixpanel fields (e.g., $browser_version / browser_version,
$os_version / os_version, and fallback to $os if os_version missing), coerce
numeric versions to strings, and assign them to the transformed event's
browser_version and os_version fields so the tightened test passes.
        const event = JSON.parse(line);
        yield event;
      } catch (error) {
        console.warn('Failed to parse Mixpanel event:', line);
🛠️ Refactor suggestion | 🟠 Major
Use configured logger instead of console.warn.
Lines 134 and 146 use console.warn, bypassing the structured logger passed to the constructor. This creates inconsistent logging and prevents proper log aggregation and filtering.
Apply this diff to use the configured logger:
for (const line of lines) {
if (line.trim()) {
try {
const event = JSON.parse(line);
yield event;
} catch (error) {
- console.warn('Failed to parse Mixpanel event:', line);
+ this.logger?.warn('Failed to parse Mixpanel event', { line, error });
}
}
      }
    }

    if (buffer.trim()) {
try {
const event = JSON.parse(buffer);
yield event;
} catch (error) {
- console.warn('Failed to parse final Mixpanel event:', buffer);
+ this.logger?.warn('Failed to parse final Mixpanel event', { buffer, error });
}
}Also applies to: 146-146
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 134 and 146, replace
the direct console.warn calls with the instance's configured logger (e.g.,
this.logger.warn or the logger variable passed into the constructor) so logging
uses the structured logger; update both occurrences to call the logger with the
same message and include the event/line and any parse error details, preserving
message clarity and log level.
const userAgent = props.osVersion || '';
const uaInfo = this.isWebEvent(props.mp_lib)
  ? parseUserAgent(userAgent, props)
  : this.parseServerDeviceInfo(props);
Incorrect user agent assignment.
Line 202 assigns props.osVersion to the userAgent variable, but OS version is not the same as the user agent string. The user agent should be extracted from a property like props.$user_agent or similar.
Verify the correct Mixpanel property name for the user agent string and apply the fix:
- const userAgent = props.osVersion || '';
+ const userAgent = props.$user_agent || '';
const uaInfo = this.isWebEvent(props.mp_lib)
? parseUserAgent(userAgent, props)
: this.parseServerDeviceInfo(props);Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 202 to 205, the code
incorrectly assigns props.osVersion to userAgent; replace this with the Mixpanel
user agent property (e.g., props.$user_agent or the correct property name from
Mixpanel events), fall back to an empty string or props.osVersion if user agent
is not present, and pass that userAgent into parseUserAgent; ensure the
isWebEvent branch uses the actual UA string and not the OS version.
os: uaInfo.os || props.$os,
os_version: uaInfo.osVersion || props.$osVersion,
browser: uaInfo.browser || props.$browser,
browser_version:
  uaInfo.browserVersion || props.$browserVersion
    ? String(props.$browser_version)
    : '',
🧩 Analysis chain
Inconsistent property naming may cause data loss.
Lines 289 and 292-293 use props.$osVersion and props.$browser_version respectively, but Mixpanel typically uses underscores (e.g., $os_version). This inconsistency may result in missing data if the wrong property names are used.
Verify the correct Mixpanel property naming convention:
🌐 Web query:
What are the correct Mixpanel event property names for OS version and browser version?
💡 Result:
Use Mixpanel's default property names:
- OS name: $os — OS version: $os_version. [1]
- Browser name: $browser — Browser version: $browser_version. [1]
Source: Mixpanel Default/Property reference. [1]
Fix inconsistent Mixpanel property names to prevent data loss.
The code uses camelCase property names (props.$osVersion, props.$browserVersion) but Mixpanel's standard property names are $os_version and $browser_version (snake_case). Additionally, lines 292-294 check props.$browserVersion in the condition but access props.$browser_version in the ternary, creating a logic inconsistency. Update both lines 289 and 292 to use the correct snake_case property names to ensure data is properly captured.
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 288 to 294, the
Mixpanel property names are inconsistent (using camelCase props.$osVersion and
props.$browserVersion) and the ternary condition checks a different prop than it
accesses, which can drop data; change references to use Mixpanel's snake_case
names (props.$os_version and props.$browser_version) and make the ternary check
and the value access use the same snake_case prop, converting to string as
needed (e.g., String(props.$browser_version)) so os_version and browser_version
are populated correctly.
Force-pushed from 1a11373 to c9ebec6.
Actionable comments posted: 3
♻️ Duplicate comments (29)
packages/validation/src/index.ts (1)
478-486: This issue has already been flagged in a previous review. The concern about provider and config mismatches remains unaddressed. Please refer to the previous review comment, which provides a discriminated-union solution to enforce that each provider literal is paired with its correct config shape.
packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1)
1-4: SQL migration is correct. The migration correctly alters the currentBatch column to TEXT nullable without a default. The TypeScript type mismatch identified in the past review comment (lines 1-4) should be addressed separately in packages/importer/src/types.ts.
packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)
10-13: Critical: Data migration missing before dropping columns. As identified in the past review comment (lines 10-12), this migration drops the provider, sourceLocation, and sourceType columns without first migrating their data to the config JSON field. This will cause data loss for existing import records and break the worker's createProvider function, which expects config.provider.
packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1)
8-14: Destructive migration: the status drop/recreate will lose data, and the currentStep NOT NULL constraint conflicts with code. This migration drops and recreates the status column, which will lose existing data. Additionally, forcing currentStep to NOT NULL conflicts with the schema definition (line 489 in schema.prisma shows currentStep String?) and with worker code that relies on null values to detect new imports. The safe migration approach is to:
- Add a new enum column with a default value
- Backfill from the existing text column with proper mapping
- Drop the old column and rename
- Keep currentStep nullable
Refer to the detailed migration example in the existing past review comment.
apps/start/src/components/project-mapper.tsx (2)
23-31: Define a proper form type instead of any. The interface uses any for all generic type parameters, defeating TypeScript's type safety. Define an explicit form schema type to enable proper type checking and autocomplete. Apply the diff suggested in the past review comment to add proper typing with a ProjectMapperFormData interface.
92-92: Remove the arbitrary 10-project limit. Line 92 limits the dropdown to only the first 10 projects, preventing users with more projects from mapping to those beyond the limit. The Select component should display all available projects with scrolling. Remove .slice(0, 10) to show all projects, or implement search/pagination if performance is a concern.
packages/importer/package.json (2)
5-5: Main entry point references source instead of build output. The main field points to src/index.ts, but consumers will need the compiled JavaScript output. This should point to ./dist/index.js. Update to:
-  "main": "src/index.ts",
+  "main": "./dist/index.js",
Also ensure the "types" field points to the compiled declaration file.
21-22: Review CSV library usage and align Ramda versions. Two concerns:
- Only csv-parse is listed (csv-parser is not present), so this may be fine
- ramda is at ^0.29.1 but @types/ramda is at ^0.31.1; the version mismatch may cause type errors
Update ramda to match its types:
-    "ramda": "^0.29.1",
+    "ramda": "^0.31.1",
packages/db/prisma/schema.prisma (1)
479-498: Inconsistency: currentStep is nullable in the schema but NOT NULL in the migration. Line 489 defines currentStep as String? (optional), but migration 20251028150123_fix_imports_table/migration.sql sets it to NOT NULL. This mismatch will cause runtime issues. Align schema and migration: either make it required in the schema (currentStep String) or remove the NOT NULL constraint from the migration. Because the worker code relies on null to detect new imports, keeping it nullable is recommended.
packages/db/code-migrations/5-add-imports-table.ts (1)
74-85: Fix __filename usage in ESM context. The package is configured as ESM ("type": "module"), and __filename is undefined in ESM. Other migration files (2-accounts.ts, 3-init-ch.ts) already use the correct pattern. Import and use fileURLToPath:
+import { fileURLToPath } from 'node:url';
+
+const __filename = fileURLToPath(import.meta.url);
+
 export async function up() {
packages/importer/src/providers/mixpanel.test.ts (1)
39-127: Test masks mapping bugs in the provider (browser/os version). The test expects browser_version: '' and os_version: undefined even though the rawEvent contains $browser_version: 135 and $os: 'Android'. This masks a bug in the provider's field mapping logic. Tighten the test assertions to expect the actual values:
-  os_version: undefined,
-  browser_version: '',
+  os_version: 'Android', // or appropriate parsed version
+  browser_version: '135',
Then fix the provider (packages/importer/src/providers/mixpanel.ts) to correctly map the $browser_version and $os_version fields, handling both snake_case and camelCase variants.
packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1)
1-3: Ensure gen_random_uuid() is available. The migration uses gen_random_uuid() without ensuring the pgcrypto extension is enabled, which will cause the migration to fail on databases where pgcrypto isn't already available. Add the extension check before the table creation:
+-- Enable required extensions
+CREATE EXTENSION IF NOT EXISTS pgcrypto;
+
 -- CreateTable
 CREATE TABLE "public"."imports" (
packages/importer/src/providers/mixpanel.ts (4)
128-147: Route parse errors through the provider logger. console.warn bypasses our structured logger and can disappear in production. Use this.logger?.warn (with the error and raw line) instead. Apply this diff:
-  console.warn('Failed to parse Mixpanel event:', line);
+  this.logger?.warn('Failed to parse Mixpanel event', {
+    line,
+    error,
+  });
@@
-  console.warn('Failed to parse final Mixpanel event:', buffer);
+  this.logger?.warn('Failed to parse final Mixpanel event', {
+    buffer,
+    error,
+  });
85-91: Remove the service account from logs. Logging serviceAccount discloses a credential identifier. Drop it (or mask it) from the metadata. Apply this diff:
 this.logger?.info('Fetching events from Mixpanel', {
   url: `${url}?${params}`,
   from,
   to,
   projectId,
-  serviceAccount,
 });
201-205: Parse the UA from $user_agent, not osVersion. props.osVersion is not a user-agent string, so every web event ends up with blank UA/device data. Prefer $user_agent (falling back to an empty string) before feeding parseUserAgent. Apply this diff:
-  const userAgent = props.osVersion || '';
+  const userAgent =
+    typeof props.$user_agent === 'string' && props.$user_agent
+      ? props.$user_agent
+      : '';
273-279: Fix Mixpanel property names and the browser-version fallback. Mixpanel uses snake_case ($os_version, $browser_version), and the current ternary always stringifies $browser_version even when undefined, yielding 'undefined'. Use the correct keys and direct fallbacks. Apply this diff:
-  os: uaInfo.os || props.$os,
-  os_version: uaInfo.osVersion || props.$osVersion,
-  browser: uaInfo.browser || props.$browser,
-  browser_version:
-    uaInfo.browserVersion || props.$browserVersion
-      ? String(props.$browser_version)
-      : '',
+  os: uaInfo.os || props.$os || '',
+  os_version: uaInfo.osVersion || props.$os_version || '',
+  browser: uaInfo.browser || props.$browser || '',
+  browser_version:
+    uaInfo.browserVersion || props.$browser_version || '',
apps/worker/src/jobs/import.ts (5)
35-38: Redact import configuration in logs. record.config can carry API keys and secrets; logging it verbatim leaks credentials. Log only non-sensitive identifiers such as provider (and explicitly redacted fields) before fanning out to child logs. Apply this diff:
-  const jobLogger = logger.child({
-    importId,
-    config: record.config,
-  });
+  const jobLogger = logger.child({
+    importId,
+    provider: record.config?.provider,
+  });
134-142: Do not log raw provider events. Invalid events are currently dumped to the logger, exposing full payloads (PII, tokens, etc.). Emit only minimal context (provider + reason) so we stay compliant while still highlighting the failure. Apply this diff:
-  if (
-    !providerInstance.validate(
-      // @ts-expect-error
-      rawEvent,
-    )
-  ) {
-    jobLogger.warn('Skipping invalid event', { rawEvent });
+  if (!providerInstance.validate(rawEvent as never)) {
+    jobLogger.warn('Skipping invalid event', {
+      provider: providerInstance.provider,
+      reason: 'validation_failed',
+    });
     continue;
   }
128-198: Harden batch transforms to avoid import aborts. A single transformEvent exception currently tears down the job, and we over-count processed rows even when transforms fail. Buffer the batch as unknown[], wrap each transform in try/catch, log failures, and increment processedEvents based on the successfully transformed rows only (both for full and tail batches). Apply this diff:
-  const eventBatch: any = [];
+  const eventBatch: unknown[] = [];
@@
-  const transformedEvents: IClickhouseEvent[] = eventBatch.map(
-    (
-      // @ts-expect-error
-      event,
-    ) => providerInstance!.transformEvent(event),
-  );
+  const transformedEvents: IClickhouseEvent[] = [];
+  for (const raw of eventBatch) {
+    try {
+      transformedEvents.push(
+        providerInstance.transformEvent(
+          raw as Parameters<typeof providerInstance.transformEvent>[0],
+        ),
+      );
+    } catch (error) {
+      jobLogger.warn('Skipping event on transform failure', {
+        provider: providerInstance.provider,
+        reason: error instanceof Error ? error.message : String(error),
+      });
+    }
+  }
+  if (transformedEvents.length === 0) {
+    eventBatch.length = 0;
+    continue;
+  }
   await insertImportBatch(transformedEvents, importId);
-  processedEvents += eventBatch.length;
+  processedEvents += transformedEvents.length;
@@
-  if (eventBatch.length > 0) {
-    const transformedEvents = eventBatch.map(
-      (
-        // @ts-expect-error
-        event,
-      ) => providerInstance!.transformEvent(event),
-    );
+  if (eventBatch.length > 0) {
+    const transformedEvents: IClickhouseEvent[] = [];
+    for (const raw of eventBatch) {
+      try {
+        transformedEvents.push(
+          providerInstance.transformEvent(
+            raw as Parameters<typeof providerInstance.transformEvent>[0],
+          ),
+        );
+      } catch (error) {
+        jobLogger.warn('Skipping event on transform failure', {
+          provider: providerInstance.provider,
+          reason: error instanceof Error ? error.message : String(error),
+        });
+      }
+    }
     await insertImportBatch(transformedEvents, importId);
-    processedEvents += eventBatch.length;
+    processedEvents += transformedEvents.length;
194-197: Include counters in the final loading update. The tail-batch status update omits totalEvents and processedEvents, leaving Postgres stale for the UI. Pass the counters just like the earlier batches. Apply this diff:
 await updateImportStatus(jobLogger, job, importId, {
   step: 'loading',
   batch: createdAt,
+  totalEvents,
+  processedEvents,
 });
250-269: Persist final totals after completion. After markImportComplete, Postgres still holds outdated processedEvents/totalEvents. Fetch the final ClickHouse progress and write those numbers back (either via updateImportStatus or a direct db.import.update) so the UI reflects 100% completion. Apply this diff:
-  await updateImportStatus(jobLogger, job, importId, {
-    step: 'completed',
-  });
+  await updateImportStatus(jobLogger, job, importId, {
+    step: 'completed',
+    totalEvents,
+    processedEvents,
+  });
@@
-  const finalProgress = await getImportProgress(importId);
+  const finalProgress = await getImportProgress(importId);
+  await db.import.update({
+    where: { id: importId },
+    data: {
+      totalEvents: finalProgress.totalEvents,
+      processedEvents: finalProgress.insertedEvents,
+    },
+  });
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1)
265-275: Redact sensitive config before displaying. imp.config frequently contains API keys and secrets; dumping it straight into the tooltip leaks credentials. Sanitize (or redact) known secret fields before stringifying, and/or restrict the tooltip to privileged roles. Apply this diff (or similar):
-  content={
-    <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
-      {JSON.stringify(imp.config, null, 2)}
-    </pre>
-  }
+  content={
+    <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
+      {JSON.stringify(redactImportConfig(imp.config), null, 2)}
+    </pre>
+  }
Add a helper such as:
function redactImportConfig(config: Record<string, unknown>) {
  const masked = { ...config };
  for (const key of ['apiKey', 'apiSecret', 'serviceAccount', 'serviceSecret', 'token', 'secret']) {
    if (masked[key]) masked[key] = '***';
  }
  return masked;
}
apps/start/src/modals/add-import.tsx (1)
246-248: Use a generic failure message. Surfacing error.message can leak upstream credentials or debug details. Replace it with a neutral user-facing string and keep the sensitive message server-side. Apply this diff:
-  onError: (error) => {
-    toast.error('Import failed', {
-      description: error.message,
-    });
+  onError: () => {
+    toast.error('Import failed', {
+      description: 'We could not queue the import. Please try again or contact support.',
+    });
packages/importer/src/types.ts (1)
53-66: EventProperties allows nested objects, but the ClickHouse properties column is primitives-only. EventProperties permits Record<string, unknown> (line 61) and defines __query as a nested object (line 62). However, IClickhouseEvent.properties expects Record<string, string | number | boolean | undefined | null>. Storing objects will break ingestion unless stringified. Consider:
- Constraining values to primitives: string | number | boolean | null | undefined
- Or requiring JSON stringification for nested data (e.g., __queryJson: string)
Apply this diff to constrain to primitives:
 export interface EventProperties {
   [key: string]:
     | string
     | number
     | boolean
     | null
-    | undefined
-    | Record<string, unknown>;
-  __query?: Record<string, unknown>;
+    | undefined;
   __title?: string;
   __screen?: string;
   __language?: string;
 }
packages/db/src/services/import.service.ts (5)
1-12: Missing import for `crypto.randomUUID()`.
Lines 272 and 309 call `crypto.randomUUID()`, but `randomUUID` is not imported from `node:crypto`. Add the import:

```diff
 import type { ILogger } from '@openpanel/logger';
+import { randomUUID } from 'node:crypto';
 import sqlstring from 'sqlstring';
```

Then replace usage:

```diff
-    id: crypto.randomUUID(),
+    id: randomUUID(),
```
262-288: Risk of timestamp collisions with existing events. Adjusting session boundaries by ±1 second (line 288: -1000ms, line 325: +500ms) could create timestamp conflicts if real events exist at those exact timestamps, leading to non-deterministic ordering.
Consider:
- Use sub-millisecond offsets (±1ms) to reduce collision probability
- Add an `is_synthetic` flag and use `(created_at, is_synthetic)` for ordering
- Use ClickHouse's native session reconstruction capabilities
```diff
-      created_at: adjustTimestamp(session.first_timestamp, -1000),
+      created_at: adjustTimestamp(session.first_timestamp, -1),
```
553-554: Inconsistent parameter handling.
Lines 553-554 use `sqlstring.escape()` for inline substitution, while the rest of the file uses ClickHouse's `query_params` mechanism (e.g., lines 122, 449). This inconsistency reduces maintainability. Refactor to use `query_params`:
```diff
 WHERE
-  e.import_id = ${sqlstring.escape(importId)}
-  AND toDate(e.created_at) = ${sqlstring.escape(from)}
+  e.import_id = {importId:String}
+  AND toDate(e.created_at) = {from:String}
   AND e.session_id != ''
 GROUP BY e.session_id
 `;
 await ch.command({
   query: sessionsInsertQuery,
+  query_params: { importId, from },
   clickhouse_settings: {
```
601-613: Use deterministic status aggregation instead of `any(import_status)`.
`any(import_status)` (line 607) returns an arbitrary status from the group, making results non-deterministic when events have mixed statuses (e.g., some 'pending', some 'processed'). Use deterministic aggregation:
```diff
 SELECT
   import_id,
   COUNT(*) as total_events,
   COUNTIf(import_status = 'pending') as pending_events,
   COUNTIf(import_status = 'processed') as processed_events,
-  any(import_status) as status
+  CASE
+    WHEN COUNTIf(import_status = 'failed') > 0 THEN 'failed'
+    WHEN COUNTIf(import_status = 'processed') = COUNT(*) THEN 'processed'
+    WHEN COUNTIf(import_status = 'pending') > 0 THEN 'processing'
+    ELSE 'pending'
+  END as status
 FROM ${TABLE_NAMES.events_imports}
```
647-652: Validate status value instead of using type assertion.
The type assertion on line 651 bypasses type safety. If ClickHouse returns an unexpected status value, the type system won't catch it.
Add runtime validation:
```diff
+  const validStatuses = ['pending', 'processing', 'processed', 'failed'] as const;
+  const status = validStatuses.includes(row.status as any)
+    ? (row.status as (typeof validStatuses)[number])
+    : 'pending';
+
   return {
     importId,
     totalEvents: row.total_events,
     insertedEvents: row.processed_events,
-    status: row.status as 'pending' | 'processing' | 'processed' | 'failed',
+    status,
   };
```
🧹 Nitpick comments (5)
packages/common/src/object.test.ts (1)
5-32: Strong test coverage for main scenarios. The test thoroughly validates the core functionality including primitives, arrays, nested objects, and JSON string parsing. The expectations correctly reflect the implementation's behavior of stringifying values, flattening structures, and filtering out null/undefined/empty strings.
Consider adding test cases for additional edge cases:
- Boolean values (`true`/`false`)
- Falsy number (`0`)
- Empty arrays and objects
- Arrays containing null/undefined elements
These would further strengthen the test suite but are not critical for the current functionality.
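The flattening behavior these tests pin down can be sketched in a few lines. This is an illustrative stand-in for `toDots`, not the actual implementation in `packages/common/src/object.ts`; the handling of malformed JSON (keep the raw string) is an assumption based on the test descriptions.

```typescript
// Illustrative flattener: nested objects become "a.b" keys, JSON-looking
// strings are parsed and expanded, null/undefined/empty values are dropped,
// and remaining primitives are stringified.
function toDotsSketch(
  obj: Record<string, unknown>,
  prefix = '',
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, raw] of Object.entries(obj)) {
    let value: unknown = raw;
    if (typeof value === 'string' && value.trim().startsWith('{')) {
      try {
        value = JSON.parse(value);
      } catch {
        // malformed JSON: keep the raw string (assumed behavior)
      }
    }
    if (value === null || value === undefined || value === '') continue;
    const path = prefix ? `${prefix}.${key}` : key;
    if (Array.isArray(value)) {
      value.forEach((item, i) => {
        if (item !== null && item !== undefined) out[`${path}.${i}`] = String(item);
      });
    } else if (typeof value === 'object') {
      Object.assign(out, toDotsSketch(value as Record<string, unknown>, path));
    } else {
      out[path] = String(value);
    }
  }
  return out;
}
```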
packages/validation/src/index.ts (1)
461-471: Consider date validation for temporal fields.
The `from` and `to` fields likely represent date ranges but are only validated as non-empty strings. Invalid date formats could cause runtime errors in the import processor. Consider adding date format validation:
```diff
 export const zMixpanelImportConfig = z.object({
   provider: z.literal('mixpanel'),
   type: z.literal('api'),
   serviceAccount: z.string().min(1),
   serviceSecret: z.string().min(1),
   projectId: z.string().min(1),
-  from: z.string().min(1),
-  to: z.string().min(1),
+  from: z.string().datetime(),
+  to: z.string().datetime(),
   mapScreenViewProperty: z.string().optional(),
 });
```

Note: Use `.datetime()` for ISO 8601 timestamps or `.date()` for YYYY-MM-DD format, depending on the expected input format.

packages/redis/package.json (1)
11-11: Consider using caret range for consistency.
The change from `"^5.7.0"` to `"5.8.2"` removes semver flexibility and pins to an exact version. This is inconsistent with dependency management patterns elsewhere in the monorepo and prevents automatic patch updates. Consider reverting to a caret range for consistency:
```diff
-    "ioredis": "5.8.2"
+    "ioredis": "^5.8.2"
```

packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1)
1-3: Migration creates type churn.
The migration adds `currentBatch` as `INTEGER NOT NULL DEFAULT 0`, but a subsequent migration (20251028152531_imports_again) changes it to nullable `TEXT`. While both migrations are technically valid, this indicates design iteration. For future migrations, consider finalizing type decisions before creating schema changes to reduce churn.

packages/db/src/services/import.service.ts (1)
721-784: Consider coordinating Postgres and ClickHouse state updates.
`updateImportStatus` updates Postgres only, while `markImportComplete` (lines 574-593) updates ClickHouse only. If one system's update fails, import state becomes inconsistent across the two databases. Consider one of these approaches:
- Update Postgres first (as source of truth), then ClickHouse, with retry logic
- Use an idempotent reconciliation job that periodically syncs state
- At minimum, add try-catch with logging for manual reconciliation
```diff
 export async function updateImportStatus(
   jobLogger: ILogger,
   job: { updateProgress: (progress: Record<string, any>) => void },
   importId: string,
   options: UpdateImportStatusOptions,
 ): Promise<void> {
+  try {
     const data: Prisma.ImportUpdateInput = {};
     // ... build data object ...
     await job.updateProgress(data);
     await db.import.update({
       where: { id: importId },
       data,
     });
+  } catch (error) {
+    jobLogger.error('Failed to update import status', { importId, error });
+    throw error;
+  }
 }
```
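The "Postgres first, log-and-reconcile on ClickHouse failure" option above can be sketched as a tiny coordinator. This is a synchronous sketch for illustration only; the real services are async and the function names here are hypothetical.

```typescript
// Coordinate two stores: Postgres (source of truth) is updated first; a
// ClickHouse failure is logged for a later reconciliation pass instead of
// leaving the already-committed Postgres state ambiguous.
function updateBoth(
  updatePostgres: () => void,
  updateClickhouse: () => void,
  onClickhouseFailure: (error: unknown) => void,
): void {
  updatePostgres();
  try {
    updateClickhouse();
  } catch (error) {
    // Swallow here; a periodic reconciliation job re-syncs ClickHouse.
    onClickhouseFailure(error);
  }
}
```

The key design choice is asymmetry: a Postgres failure aborts the whole operation, while a ClickHouse failure degrades to eventual consistency.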
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (76)
- apps/api/package.json (1 hunks)
- apps/start/Dockerfile (2 hunks)
- apps/start/package.json (2 hunks)
- apps/start/src/components/integrations/integration-card.tsx (1 hunks)
- apps/start/src/components/project-mapper.tsx (1 hunks)
- apps/start/src/components/ui/badge.tsx (1 hunks)
- apps/start/src/modals/add-import.tsx (1 hunks)
- apps/start/src/modals/date-ranger-picker.tsx (1 hunks)
- apps/start/src/modals/index.tsx (2 hunks)
- apps/start/src/routeTree.gen.ts (11 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
- apps/start/src/utils/math.ts (1 hunks)
- apps/start/src/utils/slug.ts (1 hunks)
- apps/worker/Dockerfile (2 hunks)
- apps/worker/package.json (3 hunks)
- apps/worker/src/boot-workers.ts (4 hunks)
- apps/worker/src/index.ts (2 hunks)
- apps/worker/src/jobs/cron.delete-projects.ts (1 hunks)
- apps/worker/src/jobs/events.incoming-event.ts (3 hunks)
- apps/worker/src/jobs/events.incoming-events.test.ts (2 hunks)
- apps/worker/src/jobs/import.ts (1 hunks)
- package.json (1 hunks)
- packages/cli/package.json (0 hunks)
- packages/cli/src/cli.ts (0 hunks)
- packages/cli/src/importer/importer.ts (0 hunks)
- packages/cli/src/importer/index.ts (0 hunks)
- packages/cli/tsconfig.json (0 hunks)
- packages/cli/tsup.config.ts (0 hunks)
- packages/common/package.json (1 hunks)
- packages/common/scripts/get-referrers.ts (1 hunks)
- packages/common/server/index.ts (1 hunks)
- packages/common/server/parse-referrer.test.ts (5 hunks)
- packages/common/server/parse-referrer.ts (4 hunks)
- packages/common/server/parser-user-agent.ts (2 hunks)
- packages/common/src/object.test.ts (1 hunks)
- packages/common/src/object.ts (2 hunks)
- packages/db/code-migrations/5-add-imports-table.sql (1 hunks)
- packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
- packages/db/index.ts (2 hunks)
- packages/db/package.json (1 hunks)
- packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1 hunks)
- packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1 hunks)
- packages/db/prisma/schema.prisma (2 hunks)
- packages/db/src/buffers/base-buffer.ts (1 hunks)
- packages/db/src/buffers/event-buffer.test.ts (8 hunks)
- packages/db/src/clickhouse/client.ts (4 hunks)
- packages/db/src/clickhouse/csv.ts (1 hunks)
- packages/db/src/clickhouse/migration.ts (1 hunks)
- packages/db/src/services/event.service.ts (1 hunks)
- packages/db/src/services/import.service.ts (1 hunks)
- packages/db/src/services/session.service.ts (1 hunks)
- packages/db/src/types.ts (2 hunks)
- packages/importer/package.json (1 hunks)
- packages/importer/src/base-provider.ts (1 hunks)
- packages/importer/src/index.ts (1 hunks)
- packages/importer/src/providers/metadata.ts (1 hunks)
- packages/importer/src/providers/mixpanel.test.ts (1 hunks)
- packages/importer/src/providers/mixpanel.ts (1 hunks)
- packages/importer/src/providers/umami.ts (1 hunks)
- packages/importer/src/types.ts (1 hunks)
- packages/importer/tsconfig.json (1 hunks)
- packages/importer/vitest.config.ts (1 hunks)
- packages/queue/src/queues.ts (1 hunks)
- packages/redis/package.json (1 hunks)
- packages/trpc/package.json (1 hunks)
- packages/trpc/src/root.ts (2 hunks)
- packages/trpc/src/routers/auth.ts (1 hunks)
- packages/trpc/src/routers/import.ts (1 hunks)
- packages/trpc/src/routers/organization.ts (1 hunks)
- packages/validation/src/index.ts (1 hunks)
💤 Files with no reviewable changes (6)
- packages/cli/package.json
- packages/cli/tsup.config.ts
- packages/cli/src/cli.ts
- packages/cli/src/importer/importer.ts
- packages/cli/src/importer/index.ts
- packages/cli/tsconfig.json
🚧 Files skipped from review as they are similar to previous changes (33)
- packages/common/server/index.ts
- apps/start/src/modals/date-ranger-picker.tsx
- apps/worker/src/index.ts
- packages/common/server/parse-referrer.ts
- packages/trpc/src/root.ts
- packages/db/src/types.ts
- apps/start/src/modals/index.tsx
- packages/common/src/object.ts
- apps/worker/Dockerfile
- packages/db/src/buffers/event-buffer.test.ts
- packages/db/index.ts
- packages/trpc/src/routers/import.ts
- apps/start/src/components/integrations/integration-card.tsx
- packages/trpc/src/routers/organization.ts
- apps/start/src/utils/math.ts
- packages/importer/src/providers/metadata.ts
- packages/common/scripts/get-referrers.ts
- apps/start/package.json
- apps/start/src/components/ui/badge.tsx
- package.json
- packages/importer/src/providers/umami.ts
- apps/worker/package.json
- packages/importer/vitest.config.ts
- packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql
- packages/db/src/services/event.service.ts
- apps/start/Dockerfile
- packages/importer/src/index.ts
- apps/worker/src/boot-workers.ts
- packages/queue/src/queues.ts
- packages/importer/tsconfig.json
- packages/trpc/src/routers/auth.ts
- apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx
- packages/db/src/services/session.service.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}
📄 CodeRabbit inference engine (apps/start/.cursorrules)
apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'
Files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/modals/add-import.tsx
apps/start/src/utils/slug.ts
apps/start/src/routeTree.gen.ts
apps/start/src/components/project-mapper.tsx
🧠 Learnings (9)
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, use the helper functions in ./packages/db/src/clickhouse/query-functions.ts rather than crafting raw queries
Applied to files:
packages/db/src/clickhouse/csv.ts
packages/db/code-migrations/5-add-imports-table.ts
packages/db/src/clickhouse/client.ts
packages/db/src/services/import.service.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*.tsx : Define React route files with createFileRoute and let the bundler/CLI manage the path string
Applied to files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/__root.tsx : Declare the root route in routes/__root.tsx using createRootRoute or createRootRouteWithContext
Applied to files:
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/{index,*.index}.tsx : For index routes, use index.tsx or *.index.tsx files and ensure the createFileRoute path uses a trailing slash (e.g., '/posts/')
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*$*.tsx : Name dynamic route files with $param (e.g., posts.$postId.tsx) to capture path parameters
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/$.tsx : Use $.tsx for splat (catch-all) routes to capture the remaining path into params._splat
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*_.tsx : Use a trailing underscore in a segment filename to create a non-nested (un-nested) route (e.g., posts_.tsx)
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/_*.tsx : Use leading underscore filenames for pathless layout routes that wrap children without matching a URL segment
Applied to files:
apps/start/src/routeTree.gen.ts
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, always construct them using the custom query builder in ./packages/db/src/clickhouse/query-builder.ts
Applied to files:
packages/db/src/clickhouse/client.ts
packages/db/src/services/import.service.ts
🧬 Code graph analysis (12)
packages/db/code-migrations/5-add-imports-table.ts (2)
packages/db/code-migrations/helpers.ts (1)
getIsCluster (17-24)

packages/db/src/clickhouse/migration.ts (3)
createTable (68-116), modifyTTL (118-132), runClickhouseMigrationCommands (338-470)
packages/importer/src/base-provider.ts (2)
packages/importer/src/types.ts (3)
BaseRawEvent (40-42), ImportJobMetadata (69-73), ErrorContext (45-51)

packages/db/src/services/event.service.ts (1)
IClickhouseEvent(68-102)
packages/importer/src/providers/mixpanel.test.ts (1)
packages/importer/src/providers/mixpanel.ts (1)
MixpanelProvider(18-452)
packages/common/server/parse-referrer.test.ts (1)
packages/common/server/parse-referrer.ts (2)
parseReferrer (17-26), getReferrerWithQuery (28-66)
apps/worker/src/jobs/import.ts (8)
packages/queue/src/queues.ts (1)
ImportQueuePayload (168-173)

packages/db/src/prisma-client.ts (1)
db (270-270)

packages/db/src/services/import.service.ts (10)
ImportSteps (719), updateImportStatus (721-784), getImportDateBounds (658-682), insertImportBatch (30-87), generateSessionIds (94-130), createSessionsStartEndEvents (138-348), moveImportsToProduction (354-458), backfillSessionsToProduction (460-569), markImportComplete (574-593), getImportProgress (598-653)

packages/db/src/clickhouse/client.ts (2)
formatClickhouseDate (237-248), error (41-46)

packages/db/src/services/event.service.ts (1)
IClickhouseEvent (68-102)

packages/logger/index.ts (1)
ILogger (6-6)

packages/importer/src/providers/umami.ts (1)
UmamiProvider (59-382)

packages/importer/src/providers/mixpanel.ts (1)
MixpanelProvider (18-452)
packages/common/src/object.test.ts (1)
packages/common/src/object.ts (1)
toDots(16-67)
packages/importer/src/types.ts (2)
packages/db/src/services/event.service.ts (1)
IServiceImportedEventPayload (144-147)

packages/db/src/services/import.service.ts (2)
ImportProgress (20-25), ImportStageResult (14-18)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (4)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
Route (12-34)

packages/importer/src/providers/metadata.ts (1)
IMPORT_PROVIDERS (13-30)

apps/public/components/ui/tooltip.tsx (1)
Tooltiper (36-51)

apps/start/src/components/integrations/integration-card.tsx (3)
IntegrationCard (91-112), IntegrationCardLogoImage (50-69), IntegrationCardFooter (3-15)
apps/start/src/modals/add-import.tsx (3)
packages/validation/src/index.ts (5)
zUmamiImportConfig (452-454), zMixpanelImportConfig (461-470), IUmamiImportConfig (456), IMixpanelImportConfig (471), IImportConfig (473-476)

apps/start/src/components/project-mapper.tsx (1)
ProjectMapper (33-116)

apps/public/lib/utils.ts (1)
cn(4-6)
apps/start/src/components/project-mapper.tsx (2)
packages/db/src/services/project.service.ts (1)
IServiceProject (7-7)

apps/start/src/components/ui/select.tsx (5)
Select (173), SelectTrigger (181), SelectValue (182), SelectContent (174), SelectItem (176)
packages/importer/src/providers/mixpanel.ts (7)
packages/validation/src/index.ts (1)
IMixpanelImportConfig (471)

packages/logger/index.ts (1)
ILogger (6-6)

packages/db/src/services/event.service.ts (2)
IClickhouseEvent (68-102), query (701-847)

packages/common/src/url.ts (2)
parsePath (11-49), isSameDomain (51-63)

packages/common/server/parse-referrer.ts (2)
parseReferrer (17-26), getReferrerWithQuery (28-66)

packages/common/server/parser-user-agent.ts (1)
parseUserAgent (72-115)

packages/common/src/object.ts (2)
toDots (16-67), strip (79-79)
packages/db/src/services/import.service.ts (5)
packages/db/src/services/event.service.ts (1)
IClickhouseEvent (68-102)

packages/db/src/clickhouse/csv.ts (2)
csvEscapeField (42-53), csvEscapeJson (15-38)

packages/db/src/clickhouse/client.ts (5)
formatClickhouseDate (237-248), chInsertCSV (202-228), TABLE_NAMES (49-61), ch (132-162), convertClickhouseDateToJs (267-269)

packages/logger/index.ts (1)
ILogger (6-6)

packages/db/src/prisma-client.ts (1)
db(270-270)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-push-worker
- GitHub Check: build-and-push-api
- GitHub Check: build-and-push-dashboard
🔇 Additional comments (30)
apps/api/package.json (1)
67-67: Verify the rationale for pinning tsdown to an exact version.The caret operator has been removed from tsdown, changing it from
"^0.14.2"to"0.14.2"(exact version pin). The AI summary indicates this change is coordinated across the monorepo. Ensure this is intentional for reproducible builds or a known stability requirement with this specific version.Is this pinning part of a broader monorepo strategy, or is there a specific reason (e.g., bug fix, stability) for locking to 0.14.2? You may want to verify consistency across other workspace packages to confirm alignment.
packages/db/src/buffers/base-buffer.ts (1)
1-1: LGTM! Import path consolidation.
The import path change from `@openpanel/common/server/id` to `@openpanel/common/server` is part of a coordinated refactor to centralize server utilities, with similar updates across auth.ts and organization.ts.

packages/common/src/object.test.ts (2)
1-2: LGTM! The imports are correct and appropriate for a vitest test file.
34-47: Excellent validation of malformed JSON handling. This test effectively validates that the function correctly:
- Parses valid JSON strings and expands them into dot notation
- Skips strings that appear to be JSON but are malformed
- Preserves regular strings as-is
The test expectations align well with the implementation's defensive approach to JSON parsing.
packages/common/package.json (1)
6-9: Public exports field enables dual entry points.
The new `exports` field correctly exposes the main entry point and a server-specific export, aligning with the referrer utilities being made part of the public API surface (parse-referrer, id).

packages/validation/src/index.ts (2)
439-459: Well-structured file-based import configs.
The `createFileImportConfig` helper elegantly reduces duplication while maintaining type safety with provider literals. The `zProjectMapper` schema and its integration into `zUmamiImportConfig` provide clear validation for project mapping requirements.
473-476: Clean union type export.
The `IImportConfig` union type appropriately combines the three provider-specific config types for type safety in consuming code.

packages/trpc/package.json (1)
19-19: LGTM!
The addition of the `@openpanel/queue` workspace dependency aligns with the new import queue functionality introduced in this PR.

packages/db/src/clickhouse/csv.ts (3)
5-13: LGTM! The escape order is correct (backslashes first), and the function properly handles all necessary control characters for ClickHouse Map format.
15-38: LGTM! The function correctly handles null values, normalizes to strings for Map(String,String), and properly delegates to `csvEscapeField` for final CSV escaping.
42-53: LGTM! The function correctly implements the CSV RFC 4180 standard by escaping double quotes through doubling and wrapping fields containing special characters.
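The quoting rule described here is compact enough to sketch. This is an illustrative re-implementation of the RFC 4180 behavior, not the reviewed `csvEscapeField`:

```typescript
// RFC 4180-style field escaping: double any embedded quotes, then wrap
// the field in quotes when it contains a comma, quote, or line break.
function csvEscape(field: string): string {
  if (/[",\r\n]/.test(field)) {
    return `"${field.replace(/"/g, '""')}"`;
  }
  return field;
}
```

Plain fields pass through untouched, which keeps the output compact for the common case.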
packages/db/src/clickhouse/client.ts (4)
1-1: LGTM!
The `Readable` import is correctly used in the new `chInsertCSV` function, and the `events_imports` table addition aligns with the import system introduced in this PR.

Also applies to: 60-60
65-65: Verify timeout increases are appropriate for the import workload. The request timeout increased 5x (60s → 300s) and idle socket TTL increased 7.5x (8s → 60s). While these increases support long-running import operations, consider monitoring these values in production to ensure they don't mask underlying performance issues or cause resource exhaustion.
Also applies to: 68-68
137-157: LGTM! The retry wrapper and comprehensive ClickHouse settings properly support large batch inserts. The settings spread order (line 154) correctly allows user settings to override defaults, and the progress headers prevent proxy timeouts during long-running operations.
202-228: LGTM! The CSV streaming implementation is correct, and the format settings properly align with the CSV escaping utilities (double quotes for CSV fields, single quotes for Map format internally). Error handling and logging are appropriate.
apps/worker/src/jobs/cron.delete-projects.ts (1)
57-57: LGTM! The type change from numeric `0` to string `'0'` aligns with ClickHouse setting conventions and likely reflects updated type definitions in the newer @clickhouse/client version.

packages/importer/src/base-provider.ts (3)
4-31: LGTM! The abstract class design is clean with appropriate type constraints. Using `AsyncGenerator` for `parseSource` enables memory-efficient streaming for large imports, which is essential for this use case.
36-68: LGTM! The optional hooks are well-designed with sensible defaults. The `shouldGenerateSessionIds()` default of `false` appropriately assumes providers handle session ID generation during streaming.
78-120: LGTM! The date chunking implementation correctly splits date ranges while handling edge cases (same start/end date) and ensuring chunks don't exceed the end date. The break condition prevents infinite loops.
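The chunking behavior described here can be sketched as follows. This is an assumption-laden illustration (day-granularity boundaries, inclusive range), not the reviewed base-provider implementation:

```typescript
// Split an inclusive [start, end] range into chunks of at most chunkDays
// days: each chunk is clamped to the overall end, a same-day range yields
// exactly one chunk, and the break guards against infinite loops.
const DAY_MS = 24 * 60 * 60 * 1000;

function chunkDateRange(
  start: Date,
  end: Date,
  chunkDays: number,
): Array<{ from: Date; to: Date }> {
  const chunks: Array<{ from: Date; to: Date }> = [];
  let cursor = start.getTime();
  const endMs = end.getTime();
  while (cursor <= endMs) {
    const to = Math.min(cursor + chunkDays * DAY_MS, endMs);
    chunks.push({ from: new Date(cursor), to: new Date(to) });
    if (to >= endMs) break; // never exceed the end date
    cursor = to + DAY_MS; // next chunk starts the following day
  }
  return chunks;
}
```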
packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1)
1-2: LGTM! The migration correctly adds an optional `statusMessage` column to support detailed import status tracking, aligning with the PR's import system objectives.

packages/db/package.json (1)
16-16: Verify compatibility with @clickhouse/client 1.12.1.
The upgrade from `^1.2.0` to `^1.12.1` spans 10 minor versions. While maintaining semver compatibility, ensure there are no behavioral changes affecting existing ClickHouse operations.

apps/worker/src/jobs/events.incoming-events.test.ts (1)
102-102: LGTM! Test expectations align with updated referrer parsing behavior.
The expectations correctly reflect that `referrerType` now returns an empty string (`''`) instead of `'unknown'` for non-derived referrers, consistent with the changes in packages/common/server/parse-referrer.ts.

Also applies to: 210-210
packages/db/src/clickhouse/migration.ts (1)
118-132: LGTM! Clean implementation following established patterns.
The `modifyTTL` function correctly handles both clustered and non-clustered scenarios, using the `replicated()` helper consistently with other migration utilities in this file. The implementation is type-safe and follows the existing code structure.

packages/common/server/parse-referrer.test.ts (1)
8-8: LGTM! Comprehensive test coverage for the referrer type change.
All test expectations have been consistently updated to reflect that unknown or empty referrer types now return `''` instead of `'unknown'`. The changes cover both `parseReferrer` and `getReferrerWithQuery` functions across various scenarios (undefined, empty, invalid, and unknown inputs).

Also applies to: 14-14, 44-44, 52-52, 67-67, 91-91, 99-99, 113-113
packages/db/code-migrations/5-add-imports-table.sql (1)
1-43: LGTM! Well-designed import staging table.
The `events_imports` table schema is well-structured:
- Mirrors the main events table structure with additional import tracking fields (`import_id`, `import_status`, `imported_at_meta`)
- Uses appropriate compression codecs for each data type
- Partitioning by month on `imported_at_meta` aligns with the 7-day TTL for efficient partition drops
- `ORDER BY (import_id, created_at)` supports efficient queries by import batch

The separate TTL `ALTER` statement follows correct ClickHouse patterns.

packages/common/server/parser-user-agent.ts (2)
71-71: Good API expansion with exported type.
The `UserAgentInfo` type export provides a clean public API for consumers to type the return value of `parseUserAgent`, using TypeScript's `ReturnType` utility type.
84-112: LGTM! More defensive override handling.
The updated override logic adds explicit type guards and truthiness checks, preventing empty strings or non-string values from being used as overrides. This is more defensive than the previous `??` operator approach and ensures only valid, non-empty string overrides are applied. The pattern is consistently applied across all fields.

packages/db/code-migrations/5-add-imports-table.ts (1)
65-72: TTL implementation via modifyTTL helper is correct.
The code uses the `modifyTTL` helper function which correctly targets the replicated table in cluster mode (see packages/db/src/clickhouse/migration.ts lines 117-131). This addresses the concern about TTL on distributed tables. The helper properly constructs:
- `ALTER TABLE ${replicated(tableName)} ON CLUSTER '{cluster}' MODIFY TTL ${ttl}` for clustered
- `ALTER TABLE ${tableName} MODIFY TTL ${ttl}` for non-clustered

apps/worker/src/jobs/events.incoming-event.ts (2)
8-23: Import reorganization aligns with common package refactoring.
The imports have been updated to use public exports from `@openpanel/common/server`, and Job is now correctly imported as a type-only import. The separate import of `omit` from ramda alongside the namespace import is acceptable.
120-122: Referrer logic improvements are correct.
The changes improve referrer handling:
- Line 120: `baseEvent.referrer` now uses only `referrer?.url` (removed `utmReferrer?.url`), which is correct since UTM parameters shouldn't override the actual referrer URL
- Line 122: `referrerType` now prioritizes `referrer?.type` before `utmReferrer?.type`, which properly favors detected referrer types (like 'search') over UTM-based types
```diff
@@ -1 +1 @@
-export * from '@openpanel/common/src/slug';
+export * from '@openpanel/common';
```
🧩 Analysis chain
Verify export scope: file exports entire common package instead of slug utilities.
This file is named slug.ts but now exports everything from @openpanel/common, not just slug-related functions. This introduces namespace pollution and breaks the single-responsibility principle for utility modules.
Consider using named exports to maintain the focused scope:
```diff
-export * from '@openpanel/common';
+export { slugify, /* other slug-specific exports */ } from '@openpanel/common';
```

Run the following script to verify what slug-related symbols should be exported:
🏁 Script executed:
#!/bin/bash
# Find usages of this slug utility module to determine required exports
rg -n --type=ts --type=tsx "from ['\"]@/utils/slug['\"]" -A 1 -B 1
# Find the original slug exports from common package
fd -e ts -e tsx --full-path "packages/common" | xargs rg -n "export.*slug" -i

Length of output: 157
🏁 Script executed:
#!/bin/bash
# Check all exports in the common package's index
echo "=== Contents of packages/common/index.ts ==="
cat packages/common/index.ts
echo ""
echo "=== Search for any imports of apps/start/src/utils/slug.ts ==="
rg -n --type=ts "utils/slug" -C 2
echo ""
echo "=== Check if apps/start/src/utils/slug.ts file exists and its content ==="
cat apps/start/src/utils/slug.ts 2>/dev/null || echo "File not found"

Length of output: 921
🏁 Script executed:
#!/bin/bash
# Check full content of packages/common/src/slug.ts to see all exports
echo "=== Full content of packages/common/src/slug.ts ==="
cat packages/common/src/slug.ts
echo ""
echo "=== Check if there are any other references to slug utilities ==="
rg -n "slug" apps/start/src/utils/ --type=ts

Length of output: 616
Use named export to maintain module scope: export { slug } from '@openpanel/common';
The file is named slug.ts but re-exports the entire @openpanel/common package (11 modules), when it should only export the slug function. While named imports in consuming files (import { slug }) will tree-shake correctly, this creates a confusing public interface that doesn't match the module's name and breaks the expectation of focused utility exports.
```diff
-export * from '@openpanel/common';
+export { slug } from '@openpanel/common';
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-export * from '@openpanel/common';
+export { slug } from '@openpanel/common';
```
🤖 Prompt for AI Agents
In apps/start/src/utils/slug.ts around line 1, the file currently re-exports the
entire @openpanel/common package which exposes many unrelated modules; change it
to a focused named re-export so this module only exposes the slug utility.
Replace the bulk export with a named export that re-exports just slug from
@openpanel/common (maintain TypeScript types and default/namespace
considerations), run a quick type-check/build to ensure no other modules rely on
the previous broad export surface.
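After the change, the whole file reduces to a single focused re-export; a type-only re-export can ride along if consumers need it (a sketch, assuming `slug` is the only symbol this module should surface; `SlugOptions` is a hypothetical type name):

```typescript
// apps/start/src/utils/slug.ts
// Re-export only the slug helper so this module's surface matches its name.
export { slug } from '@openpanel/common';

// If @openpanel/common also exports a related type, forward it type-only so
// it is erased at compile time (hypothetical name — adjust to the real export):
// export type { SlugOptions } from '@openpanel/common';
```

Named imports in consumers (`import { slug } from '~/utils/slug'`) continue to work unchanged, but accidental dependencies on the other ten re-exported modules now fail at type-check time.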
  "test": "vitest",
- "typecheck": "tsc --noEmit"
+ "typecheck": "tsc --noEmit",
+ "gen:referrers": "jiti scripts/get-referrers.ts && biome format --write ./server/referrers/index.ts"
🧩 Analysis chain
Verify target path for gen:referrers script output.
The gen:referrers script writes to ./server/referrers/index.ts, but the AI summary context mentions the script writes to packages/worker/src/referrers/index.ts. Please confirm that the relative path is correct and matches where the get-referrers.ts script actually writes its output.
🏁 Script executed:
#!/bin/bash
# Verify the get-referrers script exists and check its output path
fd -e ts 'get-referrers' packages/common/
cat packages/common/scripts/get-referrers.ts 2>/dev/null | grep -n 'writeFile\|output' | head -20

Length of output: 65
🏁 Script executed:
#!/bin/bash
# Show the full writeFileSync call and surrounding context
sed -n '70,80p' packages/common/scripts/get-referrers.ts

Length of output: 638
🏁 Script executed:
#!/bin/bash
# Verify directory structure to confirm relative path resolution
ls -la packages/ | head -20

Length of output: 1005
🏁 Script executed:
#!/bin/bash
# Search for worker directory and referrers paths
fd -t d 'worker' packages/ 2>/dev/null
fd -t d 'referrers' . 2>/dev/null | head -20

Length of output: 35
🏁 Script executed:
#!/bin/bash
# Check if the file exists at the only known location
ls -la packages/common/server/referrers/index.ts 2>&1

Length of output: 92
Fix get-referrers.ts output path mismatch.
The script writes to a non-existent path. In packages/common/scripts/get-referrers.ts line 73, change the output path from path.resolve(__dirname, '../../worker/src/referrers/index.ts') to path.resolve(__dirname, '../server/referrers/index.ts'). With __dirname at packages/common/scripts, two ../ segments climb out of the package entirely (the resulting packages/worker does not exist), while a single ../ lands in packages/common, whose server/referrers/ directory exists and is the path the package.json script formats.
🤖 Prompt for AI Agents
In packages/common/scripts/get-referrers.ts around line 73 (and referenced by
the packages/common/package.json script at line 13), the script writes output to
the non-existent path '../../worker/src/referrers/index.ts'; update the call to
path.resolve(__dirname, '../server/referrers/index.ts') so the generated file is
written into the existing packages/common/server/referrers directory (ensure
path.resolve is used and the target directory exists).
export interface SessionInfo {
  id: string;
  lastTimestamp: number;
  lastEvent: IServiceImportedEventPayload;
}

export interface ImportProgress {
  totalEvents: number;
  processedEvents: number;
  currentBatch: number;
  totalBatches: number;
}
🛠️ Refactor suggestion | 🟠 Major
Remove duplicate type definitions.
ImportProgress (lines 20-25) and ImportStageResult (lines 75-80) are redefined in packages/db/src/services/import.service.ts. This duplication risks type drift if one definition is updated without the other.
The service file should import these types from this module:
In packages/db/src/services/import.service.ts, remove the duplicate definitions and add:
import type { ImportProgress, ImportStageResult } from '@openpanel/importer';Also applies to: 75-80
🤖 Prompt for AI Agents
In packages/importer/src/types.ts around lines 14-25 and 75-80, ImportProgress
and ImportStageResult are defined here but duplicated in
packages/db/src/services/import.service.ts; remove the duplicate type
definitions from packages/db/src/services/import.service.ts and replace them
with a type-only import from this module: add "import type { ImportProgress,
ImportStageResult } from '@openpanel/importer';" at the top of the service file
and update any local references to use the imported types.
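The de-duplicated service file then needs only a type-only import, which is erased at compile time and leaves the runtime bundle unchanged. A sketch of the resulting shape (the surrounding service code is omitted, and `reportProgress` is a hypothetical consumer shown only to illustrate usage):

```typescript
// packages/db/src/services/import.service.ts
import type { ImportProgress, ImportStageResult } from '@openpanel/importer';

// Local re-definitions removed — signatures now reference the shared types,
// so a field added in @openpanel/importer propagates here automatically.
export function reportProgress(progress: ImportProgress): ImportStageResult {
  // ...service logic...
}
```

Using `import type` rather than a plain `import` keeps the dependency purely at the type level, which matters if `@openpanel/db` should not pull `@openpanel/importer` code into its runtime graph.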