Conversation

@lindesvard lindesvard commented Oct 19, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added data import functionality allowing users to import analytics data from external providers (Umami, Mixpanel) into projects
    • New Imports tab in project settings to manage import configurations and view history
    • Import tracking with status indicators, progress bars, and error messaging
    • New badge variants (success, warning, info) for enhanced UI flexibility
  • Bug Fixes

    • Improved referrer type classification handling


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 28

🧹 Nitpick comments (35)
apps/start/src/modals/index.tsx (1)

8-8: Consider lazy‑loading AddImport to keep initial bundle slim

AddImport brings TRPC/React Query; load it on demand.

Apply:

+import { lazy } from 'react';
-import AddImport from './add-import';
+const AddImport = lazy(() => import('./add-import'));

Ensure ModalProvider or the modal surface renders within a Suspense boundary (React.lazy components must be rendered inside one).

Also applies to: 41-41

packages/importer/src/base-provider.ts (1)

7-8: Make core hooks async‑friendly and identifiers immutable

Allow providers to perform async transforms/validation; mark identifiers readonly.

+// Utility type to allow sync or async implementations
+type MaybePromise<T> = T | Promise<T>;
+
 export abstract class BaseImportProvider<
   TRawEvent extends BaseRawEvent = BaseRawEvent,
 > {
-  abstract provider: string;
-  abstract version: string;
+  abstract readonly provider: string;
+  abstract readonly version: string;

   /**
    * Stream-read and parse source (file/API) → yields raw events
    * This should be implemented as an async generator to handle large files efficiently
    */
   abstract parseSource(): AsyncGenerator<TRawEvent, void, unknown>;

   /**
    * Convert provider format → IClickhouseEvent
    */
-  abstract transformEvent(rawEvent: TRawEvent): IClickhouseEvent;
+  abstract transformEvent(rawEvent: TRawEvent): MaybePromise<IClickhouseEvent>;

   /**
    * Validate raw event structure
    */
-  abstract validate(rawEvent: TRawEvent): boolean;
+  abstract validate(rawEvent: TRawEvent): MaybePromise<boolean>;

Also applies to: 19-24

packages/importer/src/utils/referrer-parser.ts (3)

5-35: Type the map and normalize hosts; reduce duplicates

Strong types + normalization avoid listing www/m. variants twice and improve coverage.

 import { stripTrailingSlash } from '@openpanel/common';

+type ReferrerType = 'search' | 'social' | 'referral';
+type ReferrerEntry = { type: ReferrerType; name: string };
+const normalizeHost = (h: string) => h.replace(/^(www\.|m\.)/i, '').toLowerCase();

-const referrers: Record<string, { type: string; name: string }> = {
+const referrers: Record<string, ReferrerEntry> = {
   // Search engines
   'google.com': { type: 'search', name: 'Google' },
-  'www.google.com': { type: 'search', name: 'Google' },
   'bing.com': { type: 'search', name: 'Bing' },
-  'www.bing.com': { type: 'search', name: 'Bing' },
   'yahoo.com': { type: 'search', name: 'Yahoo!' },
-  'www.yahoo.com': { type: 'search', name: 'Yahoo!' },
   'duckduckgo.com': { type: 'search', name: 'DuckDuckGo' },
-  'www.duckduckgo.com': { type: 'search', name: 'DuckDuckGo' },
   'baidu.com': { type: 'search', name: 'Baidu' },
-  'www.baidu.com': { type: 'search', name: 'Baidu' },

   // Social media
   'facebook.com': { type: 'social', name: 'Facebook' },
-  'www.facebook.com': { type: 'social', name: 'Facebook' },
   'twitter.com': { type: 'social', name: 'Twitter' },
-  'www.twitter.com': { type: 'social', name: 'Twitter' },
   'x.com': { type: 'social', name: 'X' },
-  'www.x.com': { type: 'social', name: 'X' },
   'linkedin.com': { type: 'social', name: 'LinkedIn' },
-  'www.linkedin.com': { type: 'social', name: 'LinkedIn' },
   'instagram.com': { type: 'social', name: 'Instagram' },
-  'www.instagram.com': { type: 'social', name: 'Instagram' },
   'tiktok.com': { type: 'social', name: 'TikTok' },
-  'www.tiktok.com': { type: 'social', name: 'TikTok' },
   'youtube.com': { type: 'social', name: 'YouTube' },
-  'www.youtube.com': { type: 'social', name: 'YouTube' },
   'reddit.com': { type: 'social', name: 'Reddit' },
-  'www.reddit.com': { type: 'social', name: 'Reddit' },
+  't.co': { type: 'social', name: 'Twitter' },
+  'lnkd.in': { type: 'social', name: 'LinkedIn' },
 };
 
 export function parseReferrer(url: string | undefined) {
-  const hostname = getHostname(url);
-  const match = referrers[hostname] ?? referrers[hostname.replace('www.', '')];
+  const hostname = normalizeHost(getHostname(url));
+  const match = referrers[hostname];

Also applies to: 49-52


49-57: Guard output URL to http(s) only

Avoid echoing non-web schemes (e.g., javascript:) back to callers.

 export function parseReferrer(url: string | undefined) {
   const hostname = getHostname(url);
   const match = referrers[hostname] ?? referrers[hostname.replace('www.', '')];

-  return {
+  let safeUrl = '';
+  try {
+    if (url) {
+      const u = new URL(url);
+      if (u.protocol === 'http:' || u.protocol === 'https:') {
+        safeUrl = stripTrailingSlash(url);
+      }
+    }
+  } catch {
+    // ignore
+  }
+
+  return {
     name: match?.name ?? '',
     type: match?.type ?? 'referral',
-    url: stripTrailingSlash(url ?? ''),
+    url: safeUrl,
   };
 }
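As a standalone illustration of the http(s)-only rule, here is the same guard with stripTrailingSlash inlined as a regex; a sketch for reference, not the shipped helper:

```typescript
// Standalone sketch of the http(s)-only referrer guard proposed above.
// stripTrailingSlash is inlined; everything else mirrors the diff.
function safeReferrerUrl(url: string | undefined): string {
  try {
    if (url) {
      const u = new URL(url);
      if (u.protocol === 'http:' || u.protocol === 'https:') {
        return url.replace(/\/$/, '');
      }
    }
  } catch {
    // malformed URL: treat as no referrer
  }
  return '';
}
```

Non-web schemes such as javascript: parse as valid URLs, so the protocol check (not the try/catch) is what filters them out.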

60-91: Minor: normalize source before lookup and keep types consistent

Lower-case once; reuse normalizeHost for direct key match; return typed result.

-export function getReferrerWithQuery(
-  query: Record<string, string> | undefined,
-) {
+export function getReferrerWithQuery(
+  query: Record<string, string> | undefined,
+): { name: string; type: ReferrerType; url: string } | null {
   if (!query) {
     return null;
   }

-  const source = query.utm_source ?? query.ref ?? query.utm_referrer ?? '';
+  const source = (query.utm_source ?? query.ref ?? query.utm_referrer ?? '').trim();
   if (source === '') {
     return null;
   }

-  const match =
-    Object.values(referrers).find(
-      (referrer) => referrer.name.toLowerCase() === source.toLowerCase(),
-    ) || referrers[source];
+  const lower = source.toLowerCase();
+  const match =
+    Object.values(referrers).find((r) => r.name.toLowerCase() === lower) ||
+    referrers[normalizeHost(lower)];

   if (match) {
     return {
       name: match.name,
       type: match.type,
       url: '',
     };
   }
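Taken together, the normalization approach runs standalone like this; the referrers table is a two-entry excerpt and getHostname is inlined, so treat it as a sketch of the single-lookup idea, not the full map:

```typescript
// Dependency-free sketch of normalizeHost + single-lookup parseReferrer.
type ReferrerType = 'search' | 'social' | 'referral';

const referrers: Record<string, { type: ReferrerType; name: string }> = {
  'google.com': { type: 'search', name: 'Google' },
  'facebook.com': { type: 'social', name: 'Facebook' },
};

// Strip common subdomain prefixes once, instead of listing www./m. variants
const normalizeHost = (h: string) => h.replace(/^(www\.|m\.)/i, '').toLowerCase();

function getHostname(url: string | undefined): string {
  try {
    return url ? new URL(url).hostname : '';
  } catch {
    return '';
  }
}

function parseReferrer(url: string | undefined) {
  const match = referrers[normalizeHost(getHostname(url))];
  return { name: match?.name ?? '', type: match?.type ?? 'referral' };
}
```

With normalization applied before the lookup, 'https://m.facebook.com/...' and 'https://www.facebook.com/...' both resolve through the single 'facebook.com' entry.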
packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)

1-13: Consider more descriptive migration naming.

The migration name "fix" is not descriptive. Consider renaming to something like "remove_redundant_import_columns" or "migrate_import_config_to_json" to better indicate the purpose of this schema change.

packages/db/code-migrations/5-add-imports-table.sql (1)

32-32: Consider using an Enum for import_status.

The import_status field uses LowCardinality(String) with a default of 'pending'. For better type safety and query performance, consider using an Enum type instead:

`import_status` LowCardinality(Enum8('pending' = 1, 'processing' = 2, 'completed' = 3, 'failed' = 4)) DEFAULT 'pending'

This ensures only valid status values can be stored and provides better compression.

packages/db/prisma/schema.prisma (1)

470-486: Consider using an enum for the status field.

The status field is defined as a String with a comment indicating it should be one of 'pending' | 'processing' | 'completed' | 'failed'. For better type safety and database constraints, define a Prisma enum:

enum ImportStatus {
  pending
  processing
  completed
  failed
}

model Import {
  // ...
  status ImportStatus // instead of String
  // ...
}

This provides compile-time type checking and database-level validation.

packages/trpc/src/root.ts (1)

7-7: Router key named "import" (reserved keyword) — consider renaming for DX.

Valid in JS/TS, but can confuse tooling/usages (e.g., appRouter.import vs dynamic import()). Consider renaming to imports/importer for clarity.

Also applies to: 44-44
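One concrete ergonomics cost, shown with a toy object standing in for the real appRouter (whose shape is not reproduced here): import is a reserved word, so callers can never destructure that key without aliasing.

```typescript
// 'import' is a legal property key but a reserved binding name.
const appRouter = { import: { list: () => 'ok' } };

// const { import } = appRouter;           // SyntaxError: reserved word
const { import: importRouter } = appRouter; // callers are forced to alias
const result = importRouter.list();
```

Renaming the key to imports (or importer) removes this friction and the visual clash with dynamic import() at call sites.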

packages/queue/src/queues.ts (1)

163-176: Harden import jobs: add attempts/backoff; optionally expose QueueEvents.

Imports call external providers; transient failures are expected. Provide sensible defaults and a metrics hook. (Note: BullMQ has no per-job timeout option, unlike legacy Bull; enforce run-time limits inside the processor if needed.)

Apply:

 export const importQueue = new Queue<ImportQueuePayload>('import', {
   connection: getRedisQueue(),
   defaultJobOptions: {
     removeOnComplete: 10,
     removeOnFail: 50,
+    attempts: 3,
+    backoff: { type: 'exponential', delay: 30_000 },
   },
 });
+
+// Optional: enable events for monitoring dashboards/Bull Board progress
+// (add QueueEvents to the existing 'bullmq' import)
+export const importQueueEvents = new QueueEvents('import', {
+  connection: getRedisQueue(),
+});
apps/worker/src/boot-workers.ts (3)

138-142: Also drain import queue before shutdown to avoid cutting imports mid-run.

       await waitForQueueToEmpty(cronQueue);
+      await waitForQueueToEmpty(importQueue, 10 * 60_000); // 10 min cap; tune as needed
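The drain-with-cap pattern generalizes like this; getPending is a hypothetical stand-in for however waitForQueueToEmpty polls BullMQ counts, since its real signature is not shown in this diff:

```typescript
// Sketch: poll a pending-count callback until empty or a deadline passes.
async function drainWithDeadline(
  getPending: () => Promise<number>,
  timeoutMs = 10 * 60_000,
  pollMs = 1_000,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if ((await getPending()) === 0) return true;
    await new Promise((r) => setTimeout(r, pollMs));
  }
  return false; // timed out; caller decides whether to shut down anyway
}
```

Returning a boolean rather than throwing lets the exit handler log a warning and proceed with shutdown when the cap is hit.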

155-167: Dev path skips graceful shutdown — can leave jobs in inconsistent state.

Unify behavior and call the same exit handler in dev.

-      process.on(evt, (code) => {
-        if (process.env.NODE_ENV === 'production') {
-          exitHandler(evt, code);
-        } else {
-          logger.info('Shutting down for development', {
-            event: evt,
-            code,
-          });
-          process.exit(0);
-        }
-      });
+      process.on(evt, (code) => {
+        exitHandler(evt, code ?? 0);
+      });

4-13: Remove unused imports and dead/commented code for clarity.

eventsGroupQueue, GroupWorker, eventsJob, incomingEventPure are no longer used; commented worker block can be dropped.

Also applies to: 18-23, 33-48

packages/trpc/src/routers/import.ts (3)

121-131: Guard job removal on delete with error handling (active/locked jobs can throw).

        const job = await importQueue.getJob(importRecord.jobId);
-        if (job) {
-          await job.remove();
-        }
+        if (job) {
+          try {
+            await job.remove();
+          } catch (e) {
+            // best-effort removal; proceed with deletion
+          }
+        }

165-171: Retry should set stable jobId and attempts/backoff like create.

-      const job = await importQueue.add('import', {
-        type: 'import',
-        payload: {
-          importId: importRecord.id,
-        },
-      });
+      const job = await importQueue.add(
+        'import',
+        { type: 'import', payload: { importId: importRecord.id } },
+        { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+      );

3-6: Unused type import (IImportConfig).

IImportConfig is not referenced. Remove to satisfy lint.

packages/importer/src/index.ts (1)

1-14: Public API surface looks good; minor polish

Consider also re-exporting provider config/types (e.g., IUmamiImportConfig) from providers to keep consumers on a single import path. Optional.

IMPORT_REVIEW.md (1)

320-337: Move this review doc to /docs or PR notes to avoid shipping in package tarballs

Keeps repo clean and avoids confusing consumers.

apps/worker/src/jobs/import.ts (4)

1-11: Avoid deep imports; prefer package barrel exports

Importing services via '@openpanel/db/src/services/import.service' couples to internal paths. Re-export these from '@openpanel/db' and import from there to reduce breakage.


75-81: Progress totalBatches fluctuates; compute once or report pct

As totalEvents grows, totalBatches changes mid-run. Either omit totalBatches here or compute a percentage for UI.

-          totalBatches: Math.ceil(totalEvents / BATCH_SIZE),
+          progressPct: totalEvents ? Math.round((processedEvents / totalEvents) * 100) : 0,
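The suggested progressPct is worth clamping, since processedEvents can briefly exceed a stale totalEvents while counting is still in flight; a minimal helper:

```typescript
// Clamped progress percentage; guards divide-by-zero and >100% readings.
function progressPct(processed: number, total: number): number {
  if (!total) return 0;
  return Math.min(100, Math.max(0, Math.round((processed / total) * 100)));
}
```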

17-17: Make BATCH_SIZE configurable

Read from config/env to tune per deployment.

-const BATCH_SIZE = 5000;
+const BATCH_SIZE = Number(process.env.IMPORT_BATCH_SIZE ?? 5000);
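Note that Number(process.env.IMPORT_BATCH_SIZE ?? 5000) yields NaN for junk values like 'abc'; a slightly more defensive parse (function name is illustrative):

```typescript
// Parse the batch-size env var defensively: fall back to 5000 on missing,
// non-numeric, zero, or negative values.
function getImportBatchSize(raw: string | undefined): number {
  const n = Number(raw);
  return Number.isFinite(n) && n > 0 ? Math.floor(n) : 5000;
}
```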

200-209: Validate provider-specific config with a schema before instantiation

Avoid passing untyped Prisma JSON into providers. Parse with provider schema (zod) and fail early.

 function createProvider(
   record: Prisma.ImportGetPayload<{ include: { project: true } }>,
 ) {
   switch (record.config.provider) {
     case 'umami':
-      return new UmamiProvider(record.projectId, record.config);
+      // Example: zUmamiImportConfig.safeParse(record.config)
+      return new UmamiProvider(
+        record.projectId,
+        (record.config as any) /* TODO: parse/validate */,
+      );
     default:
       throw new Error(`Unknown provider: ${record.config.provider}`);
   }
 }

I can wire in zod parsing if you expose the provider config schemas from @openpanel/importer.

packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (2)

9-16: Constrain status values

Add a CHECK constraint (or enum) to prevent invalid states.

   "status" TEXT NOT NULL,
@@
 ) ;
+
+-- Restrict status values
+ALTER TABLE "public"."imports"
+  ADD CONSTRAINT "imports_status_check"
+  CHECK ("status" IN ('pending','processing','completed','failed'));

18-22: Add indexes for common queries

List/filters will hit projectId/status/createdAt frequently.

 -- AddForeignKey
 ALTER TABLE "public"."imports" ADD CONSTRAINT "imports_projectId_fkey" FOREIGN KEY ("projectId") REFERENCES "public"."projects"("id") ON DELETE CASCADE ON UPDATE CASCADE;
+
+-- Indexes for performance
+CREATE INDEX IF NOT EXISTS "idx_imports_project_createdAt"
+  ON "public"."imports" ("projectId", "createdAt" DESC);
+CREATE INDEX IF NOT EXISTS "idx_imports_status_createdAt"
+  ON "public"."imports" ("status", "createdAt" DESC);
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (5)

5-10: Remove unused icon import

CheckCircle is unused.

-import {
-  CheckCircle,
-  CheckCircleIcon,
-  Download,
-  Loader2Icon,
-  XCircleIcon,
-} from 'lucide-react';
+import { CheckCircleIcon, Download, Loader2Icon, XCircleIcon } from 'lucide-react';

83-104: Add onError to mutations for user feedback

Surface errors with toasts; also consider logging.

 const deleteImport = useMutation(
   trpc.import.delete.mutationOptions({
     onSuccess: () => {
       toast.success('Import deleted', {
         description: 'The import has been successfully deleted.',
       });
       queryClient.invalidateQueries(trpc.import.list.pathFilter());
     },
+    onError: (err) => {
+      toast.error('Delete failed', { description: String(err) });
+    },
   }),
 );
@@
 const retryImport = useMutation(
   trpc.import.retry.mutationOptions({
     onSuccess: () => {
       toast.success('Import retried', {
         description: 'The import has been queued for processing again.',
       });
       queryClient.invalidateQueries(trpc.import.list.pathFilter());
     },
+    onError: (err) => {
+      toast.error('Retry failed', { description: String(err) });
+    },
   }),
 );

244-261: Disable action buttons while pending to avoid double submits

Tie disabled state to mutation pending flags.

                     {imp.status === 'failed' && (
                       <Button
                         variant="outline"
                         size="sm"
-                        onClick={() => retryImport.mutate({ id: imp.id })}
+                        disabled={retryImport.isPending}
+                        onClick={() => retryImport.mutate({ id: imp.id })}
                       >
                         Retry
                       </Button>
                     )}
                     <Button
                       variant="ghost"
                       size="sm"
-                      onClick={() => deleteImport.mutate({ id: imp.id })}
+                      disabled={deleteImport.isPending}
+                      onClick={() => deleteImport.mutate({ id: imp.id })}
                     >
                       Delete
                     </Button>

41-66: Consider serving logos locally or via asset proxy

External Brandfetch URLs can break and may violate CSP. If feasible, mirror assets or use a proxy.


141-161: Consider extending IntegrationCardLogoImage to accept a customizable alt prop for improved accessibility

The alt attribute is currently hardcoded to the generic "Integration Logo" at line 66 of apps/start/src/components/integrations/integration-card.tsx. Extending the component to accept an optional alt prop and passing provider.name would provide more descriptive alternative text for each integration, improving accessibility and SEO.
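The fallback itself is a one-liner; shown here as a plain function over an assumed prop shape rather than the real component props:

```typescript
// Assumed prop shape for the suggested optional alt; falls back to the
// current generic text when the caller does not pass one.
type LogoImageProps = { src: string; alt?: string; className?: string };

const resolveAlt = (props: LogoImageProps): string =>
  props.alt ?? 'Integration Logo';
```

Call sites would then pass alt={provider.name} to get per-integration text.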

packages/validation/src/index.ts (1)

444-445: Local file support vs schema

fileUrl is z.string().url(), but the Umami provider supports local files. Either (a) allow file:/// or absolute paths, or (b) drop local-path support in the provider. Today this schema blocks local imports.

- fileUrl: z.string().url(),
+ fileUrl: z.string().refine(
+   (v) => /^https?:\/\//i.test(v) || /^file:\/\//i.test(v) || v.startsWith('/'),
+   'Must be an http(s) URL, file:// URL, or absolute path',
+ ),
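The refine boils down to a plain predicate; shown standalone (zod omitted) so the accepted shapes are explicit:

```typescript
// Accepts http(s) URLs, file:// URLs, or absolute filesystem paths,
// mirroring the suggested z.string().refine(...) rule.
function isAcceptableFileUrl(v: string): boolean {
  return /^https?:\/\//i.test(v) || /^file:\/\//i.test(v) || v.startsWith('/');
}
```

Note this intentionally rejects Windows drive paths and non-web schemes such as ftp://.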
packages/importer/src/providers/umami.ts (2)

178-195: Byte counting via body.on('data') competes with pipeline

Attach a lightweight Transform to count bytes between body and decompress to preserve backpressure and avoid dual consumers.

-    if (maxBytes) {
-      body.on('data', (chunk: Buffer) => {
-        seenBytes += chunk.length;
-        if (seenBytes > maxBytes) {
-          controller.abort();
-          body.destroy(new Error(`Stream exceeded size limit (${seenBytes} > ${maxBytes})`));
-        }
-      });
-    }
+    // (requires adding Transform to the 'node:stream' imports)
+    const byteCounter = new Transform({
+      transform(chunk, _enc, cb) {
+        seenBytes += chunk.length;
+        if (maxBytes && seenBytes > maxBytes) {
+          cb(new Error(`Stream exceeded size limit (${seenBytes} > ${maxBytes})`));
+          return;
+        }
+        cb(null, chunk);
+      },
+    });
@@
-        if (decompress) {
-          await pipeline(body, decompress, parser, { signal: controller.signal });
-        } else {
-          await pipeline(body, parser, { signal: controller.signal });
-        }
+        if (decompress) {
+          await pipeline(body, byteCounter, decompress, parser, { signal: controller.signal });
+        } else {
+          await pipeline(body, byteCounter, parser, { signal: controller.signal });
+        }

Also applies to: 211-227
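Run standalone, the counting Transform behaves like this; Readable.from stands in for the HTTP body, and the error text mirrors the diff:

```typescript
import { Readable, Transform, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

// Byte-counting Transform: passes chunks through unchanged, erroring the
// pipeline once maxBytes is exceeded, so backpressure is preserved and
// there is only one consumer of the source stream.
function makeByteCounter(maxBytes?: number): Transform {
  let seenBytes = 0;
  return new Transform({
    transform(chunk: Buffer, _enc, cb) {
      seenBytes += chunk.length;
      if (maxBytes && seenBytes > maxBytes) {
        cb(new Error(`Stream exceeded size limit (${seenBytes} > ${maxBytes})`));
        return;
      }
      cb(null, chunk);
    },
  });
}
```

Because the counter sits inside the pipeline, exceeding the limit tears down the whole chain (body, decompress, parser) via normal stream error propagation, with no manual destroy/abort bookkeeping.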


91-99: Optional: support local Brotli (.br) like remote path

Mirror remote handling and support .br locally if needed.

     const isGzipped = filePath.endsWith('.gz');
+    const isBrotli = filePath.endsWith('.br');
@@
         if (isGzipped) {
           await pipeline(fileStream, createGunzip(), parser);
-        } else {
+        } else if (isBrotli) {
+          // createBrotliDecompress comes from 'node:zlib', like createGunzip
+          await pipeline(fileStream, createBrotliDecompress(), parser);
+        } else {
           await pipeline(fileStream, parser);
         }
packages/db/src/services/import.service.ts (3)

78-92: Consider using CTEs or subqueries for better maintainability.

The argMin/argMax tuple approach works but is fragile—the 17-element tuple order must match the destructuring on lines 147-173. If fields are added, removed, or reordered, this becomes error-prone and difficult to debug.

A clearer alternative would be to use CTEs with ROW_NUMBER() or separate queries for first/last events, making field selection explicit:

WITH ranked_events AS (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY device_id, session_id, project_id, profile_id ORDER BY created_at ASC) as first_rank,
    ROW_NUMBER() OVER (PARTITION BY device_id, session_id, project_id, profile_id ORDER BY created_at DESC) as last_rank
  FROM ${TABLE_NAMES.events_imports}
  WHERE ${rangeWhere} AND name NOT IN ('session_start', 'session_end')
)
SELECT ...
FROM ranked_events
WHERE first_rank = 1 OR last_rank = 1

26-525: Consider adding explicit error handling and logging.

None of the functions include try-catch blocks or error enrichment. While throwing errors to the caller is acceptable, adding contextual error handling would improve debuggability and allow for better error classification (transient vs. permanent failures).

Consider wrapping critical operations:

export async function insertImportBatch(
  events: IClickhouseEvent[],
  importId: string,
): Promise<ImportStageResult> {
  try {
    if (events.length === 0) {
      return { importId, totalEvents: 0, insertedEvents: 0 };
    }
    
    const eventsWithMetadata = events.map((event) => ({
      ...event,
      import_id: importId,
      import_status: 'pending',
      imported_at_meta: new Date(),
    }));

    await ch.insert({
      table: TABLE_NAMES.events_imports,
      values: eventsWithMetadata,
      format: 'JSONEachRow',
    });

    return {
      importId,
      totalEvents: events.length,
      insertedEvents: events.length,
    };
  } catch (error) {
    // Add context and rethrow or log
    throw new Error(`Failed to insert import batch for ${importId}: ${error instanceof Error ? error.message : String(error)}`);
  }
}

This pattern would help with debugging import failures in production.


278-356: Consider batch migration for very large imports.

The INSERT INTO ... SELECT approach is efficient for most cases, but for imports with millions of events, it might:

  • Block for extended periods
  • Consume significant memory
  • Provide no progress feedback during the migration

For very large imports, consider implementing batch migration:

export async function migrateImportToProduction(
  importId: string,
  from?: string,
  to?: string,
  batchSize = 100_000, // configurable batch size
): Promise<void> {
  const rangeFilter = `
    WHERE import_id = {importId:String}
    ${from ? 'AND created_at >= {from:String}' : ''}
    ${to ? 'AND created_at < {to:String}' : ''}
  `;

  // Determine the total up front so the loop has a reliable exit condition
  // (ch.command does not report affected row counts).
  const countResult = await ch.query({
    query: `SELECT count() AS total FROM ${TABLE_NAMES.events_imports} ${rangeFilter}`,
    query_params: { importId, from, to },
    format: 'JSONEachRow',
  });
  const [{ total }] = await countResult.json<{ total: string }>();

  for (let offset = 0; offset < Number(total); offset += batchSize) {
    await ch.command({
      query: `
        INSERT INTO ${TABLE_NAMES.events} (...)
        SELECT ...
        FROM ${TABLE_NAMES.events_imports}
        ${rangeFilter}
        ORDER BY created_at ASC
        LIMIT {batchSize:UInt32} OFFSET {offset:UInt32}
      `,
      query_params: { importId, from, to, batchSize, offset },
    });
    // Could call updateImportProgress here
  }
}

This is optional and depends on expected import sizes.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8a297e and d71712e.

⛔ Files ignored due to path filters (2)
  • packages/importer/a70ff57d-f632-4292-a98b-658d7734fbec.csv is excluded by !**/*.csv
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (35)
  • IMPORT_REVIEW.md (1 hunks)
  • apps/start/package.json (1 hunks)
  • apps/start/src/components/integrations/integration-card.tsx (1 hunks)
  • apps/start/src/components/ui/badge.tsx (1 hunks)
  • apps/start/src/modals/add-import.tsx (1 hunks)
  • apps/start/src/modals/index.tsx (2 hunks)
  • apps/start/src/routeTree.gen.ts (11 hunks)
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
  • apps/worker/package.json (1 hunks)
  • apps/worker/src/boot-workers.ts (5 hunks)
  • apps/worker/src/index.ts (2 hunks)
  • apps/worker/src/jobs/import.ts (1 hunks)
  • packages/db/code-migrations/5-add-imports-table.sql (1 hunks)
  • packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
  • packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
  • packages/db/prisma/schema.prisma (2 hunks)
  • packages/db/src/clickhouse/client.ts (1 hunks)
  • packages/db/src/services/event.service.ts (1 hunks)
  • packages/db/src/services/import.service.ts (1 hunks)
  • packages/db/src/types.ts (2 hunks)
  • packages/importer/package.json (1 hunks)
  • packages/importer/src/base-provider.ts (1 hunks)
  • packages/importer/src/index.ts (1 hunks)
  • packages/importer/src/providers/umami.ts (1 hunks)
  • packages/importer/src/types.ts (1 hunks)
  • packages/importer/src/utils/referrer-parser.ts (1 hunks)
  • packages/importer/tsconfig.json (1 hunks)
  • packages/importer/vitest.config.ts (1 hunks)
  • packages/queue/src/queues.ts (1 hunks)
  • packages/trpc/package.json (1 hunks)
  • packages/trpc/src/root.ts (2 hunks)
  • packages/trpc/src/routers/import.ts (1 hunks)
  • packages/validation/src/index.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/start/.cursorrules)

apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'

Files:

  • apps/start/src/components/integrations/integration-card.tsx
  • apps/start/src/modals/index.tsx
  • apps/start/src/modals/add-import.tsx
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
  • apps/start/src/components/ui/badge.tsx
  • apps/start/src/routeTree.gen.ts
🧬 Code graph analysis (17)
packages/db/src/types.ts (1)
packages/validation/src/index.ts (1)
  • IImportConfig (464-467)
packages/importer/src/utils/referrer-parser.ts (1)
packages/common/src/string.ts (1)
  • stripTrailingSlash (1-3)
packages/queue/src/queues.ts (1)
packages/redis/redis.ts (1)
  • getRedisQueue (93-105)
apps/worker/src/jobs/import.ts (3)
packages/queue/src/queues.ts (1)
  • ImportQueuePayload (163-168)
packages/importer/src/providers/umami.ts (2)
  • UmamiRawEvent (55-55)
  • UmamiProvider (57-426)
packages/db/src/services/import.service.ts (8)
  • insertImportBatch (26-53)
  • updateImportProgress (387-400)
  • getImportDateBounds (508-525)
  • reconstructSessions (61-272)
  • migrateImportToProduction (278-356)
  • markImportComplete (361-382)
  • getImportProgress (434-488)
  • markImportFailed (405-429)
apps/start/src/modals/index.tsx (1)
apps/start/src/modals/add-import.tsx (1)
  • AddImport (29-164)
packages/importer/vitest.config.ts (1)
vitest.shared.ts (1)
  • getSharedVitestConfig (4-27)
apps/worker/src/boot-workers.ts (2)
packages/queue/src/queues.ts (1)
  • importQueue (170-176)
apps/worker/src/jobs/import.ts (1)
  • importJob (19-198)
packages/trpc/src/root.ts (1)
packages/trpc/src/routers/import.ts (1)
  • importRouter (11-183)
packages/db/code-migrations/5-add-imports-table.ts (2)
packages/db/code-migrations/helpers.ts (1)
  • getIsCluster (17-24)
packages/db/src/clickhouse/migration.ts (2)
  • createTable (68-116)
  • runClickhouseMigrationCommands (322-454)
apps/start/src/modals/add-import.tsx (2)
apps/start/src/integrations/tanstack-query/root-provider.tsx (1)
  • Provider (68-86)
packages/importer/src/types.ts (1)
  • ImportConfig (7-12)
packages/trpc/src/routers/import.ts (6)
packages/trpc/src/trpc.ts (2)
  • createTRPCRouter (138-138)
  • protectedProcedure (161-164)
packages/trpc/src/access.ts (1)
  • getProjectAccessCached (4-4)
packages/trpc/src/errors.ts (1)
  • TRPCAccessError (3-7)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
packages/validation/src/index.ts (1)
  • zCreateImport (469-477)
packages/queue/src/queues.ts (1)
  • importQueue (170-176)
packages/db/src/services/import.service.ts (3)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
packages/db/src/clickhouse/client.ts (4)
  • ch (131-141)
  • TABLE_NAMES (48-60)
  • convertClickhouseDateToJs (218-220)
  • formatClickhouseDate (188-199)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
apps/worker/src/index.ts (1)
packages/queue/src/queues.ts (1)
  • importQueue (170-176)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (2)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
  • Route (12-34)
apps/start/src/components/integrations/integration-card.tsx (3)
  • IntegrationCard (91-112)
  • IntegrationCardLogoImage (50-69)
  • IntegrationCardFooter (3-15)
packages/importer/src/base-provider.ts (2)
packages/importer/src/types.ts (3)
  • BaseRawEvent (40-42)
  • ImportJobMetadata (69-73)
  • ErrorContext (45-51)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
packages/importer/src/providers/umami.ts (4)
packages/validation/src/index.ts (1)
  • IUmamiImportConfig (450-450)
packages/db/src/services/event.service.ts (2)
  • IClickhouseEvent (68-102)
  • query (701-847)
packages/common/src/url.ts (2)
  • parsePath (11-49)
  • isSameDomain (51-63)
packages/importer/src/utils/referrer-parser.ts (2)
  • parseReferrer (49-58)
  • getReferrerWithQuery (60-91)
packages/importer/src/types.ts (2)
packages/db/src/services/event.service.ts (1)
  • IServiceImportedEventPayload (144-147)
packages/db/src/services/import.service.ts (2)
  • ImportProgress (16-21)
  • ImportStageResult (10-14)
🪛 Gitleaks (8.28.0)
apps/worker/src/index.ts

[high] 60-60: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.

(aws-access-token)

🪛 LanguageTool
IMPORT_REVIEW.md

[uncategorized] ~35-~35: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... batchNumber: number ) ``` --- ## 🟡 Medium Priority Issues ### 2. **Missing UUID Dependenc...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[grammar] ~156-~156: Ensure spelling is correct
Context: ...d for EVERY event ``` Improvement: Cleanup periodically (e.g., every 1000 events) ...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-and-push-api
  • GitHub Check: build-and-push-worker
🔇 Additional comments (24)
apps/start/package.json (1)

6-6: Clarify the motivation for renaming "testing" → "testing2".

The script was renamed from "testing" to "testing2", but the name "testing2" is not descriptive and appears to be a placeholder. Additionally, this change seems orthogonal to the PR objective ("new importer").

Please clarify:

  1. What is the intended purpose of renaming this script?
  2. Is "testing2" a temporary name, or was a different name intended?
  3. Was the original "testing" script referenced in CI/CD pipelines, documentation, or developer workflows? If so, those references need to be updated to avoid breaking them.

If this change is unintended, consider reverting it.

apps/start/src/components/integrations/integration-card.tsx (1)

50-69: LGTM! Consistent API enhancement.

The addition of the optional className prop aligns IntegrationCardLogoImage with the other components in this file (IntegrationCardFooter, IntegrationCardHeader, IntegrationCardHeaderButtons) that already accept className. The prop is properly forwarded to IntegrationCardLogo, which correctly merges it using the cn() utility.

apps/start/src/components/ui/badge.tsx (2)

12-13: LGTM! Success variant follows good practices.

The emerald-700 background with white text provides good contrast and is semantically appropriate for success states.


16-16: LGTM! Info variant is well-implemented.

The blue-500 background with white text provides adequate contrast and follows the established pattern.

packages/importer/tsconfig.json (1)

1-9: LGTM! Clean TypeScript configuration.

The configuration is properly structured for the new importer package, extending the shared base config and setting appropriate compiler options for a source-in-src, output-to-dist layout.

packages/importer/package.json (1)

27-27: bullmq dependency placement is correct.

Verification shows bullmq is exclusively in devDependencies with no runtime imports or usage in the importer source code. The package does not directly enqueue or process jobs—job operations are handled through the @openpanel/queue workspace dependency. The current placement is appropriate.

packages/db/src/types.ts (1)

2-2: Alignment with validation types looks good

The Prisma JSON alias cleanly mirrors the validation union. No issues spotted.

Also applies to: 16-16

packages/trpc/package.json (1)

19-19: No circular dependency detected—safe to proceed.

Verification confirms @openpanel/queue does not depend on @openpanel/trpc. The dependency is unidirectional: @openpanel/trpc (line 19) → @openpanel/queue. No cycle risk.

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)

45-45: LGTM! Clean addition of the Imports tab.

The new imports tab follows the existing pattern and integrates well with the settings navigation structure.

apps/worker/package.json (1)

23-23: LGTM! Necessary dependency for import functionality.

The addition of the @openpanel/importer workspace dependency enables the worker to process import jobs.

packages/db/src/clickhouse/client.ts (1)

59-59: LGTM! Table name constant added correctly.

The events_imports table name addition follows the existing pattern and aligns with the new ClickHouse migration.

apps/worker/src/index.ts (1)

8-8: LGTM! Import queue integration added correctly.

The importQueue is properly imported and registered with Bull Board for monitoring.

Also applies to: 42-42

packages/db/src/services/event.service.ts (1)

144-147: LGTM! Type definition is clear and purposeful.

The IServiceImportedEventPayload type correctly omits profile and meta fields for imported events, which makes sense as these would need to be resolved separately during import processing.

packages/db/code-migrations/5-add-imports-table.sql (2)

42-43: Verify the 7-day TTL is intentional for staging data.

The TTL of 7 days on imported_at_meta means all staging import data will be automatically deleted after one week. Confirm this aligns with your data retention requirements and import workflow expectations. If imports might take longer than 7 days to validate or if you need to keep historical import records, consider increasing this value or making it configurable.


1-38: LGTM! Well-designed ClickHouse table schema.

The table structure uses appropriate data types, codecs for compression, and partitioning strategy. The use of LowCardinality for categorical fields and specialized codecs (ZSTD, Delta, Gorilla) shows good understanding of ClickHouse optimization patterns.

packages/db/prisma/schema.prisma (1)

470-486: LGTM! Import model is well-structured.

The Import model has appropriate fields for tracking import progress, includes proper relations to Project with cascade delete, and uses sensible defaults. The use of Json for config provides flexibility while the comment /// [IPrismaImportConfig] indicates type documentation exists elsewhere.

Also applies to: 197-197

IMPORT_REVIEW.md (2)

37-59: Remove the action item; importer uses node:crypto, not the uuid package

The script results show that packages/importer uses randomUUID() imported from node:crypto (in packages/importer/src/providers/umami.ts and packages/cli/src/importer/importer.ts), not from the external uuid package. The uuid package is used elsewhere in the codebase (specifically in packages/db/src/services/event.service.ts), but not in the importer package. The review comment's recommendation to add uuid to the importer's dependencies is unnecessary.

Likely an incorrect or invalid review comment.


10-31: Review comment is based on outdated or incorrect code state; actual implementation already uses proper types

The review comment cites code that does not match the current implementation:

  • processBatch function: Does not exist in the codebase (rg search found zero matches)
  • any type usage: No matches found in apps/worker/src/jobs/ (contradicts claim of "extensive use")
  • Actual typing: Line 46 of apps/worker/src/jobs/import.ts already uses const eventBatch: UmamiRawEvent[] = []; (properly typed, not any[])
  • BaseImportProvider: Confirmed generic with <TRawEvent extends BaseRawEvent = BaseRawEvent>, and UmamiProvider correctly extends it as BaseImportProvider<UmamiRawEvent>

The current implementation already has the proper type safety that the review comment suggests. The code is not coupled via weak typing but properly typed with UmamiRawEvent. No refactoring is needed.

Likely an incorrect or invalid review comment.

apps/start/src/routeTree.gen.ts (1)

57-57: Generated file — no manual edits

Looks consistent with the new Imports tab. No action; keep excluded from lint/format.

Also applies to: 373-378, 511-511, 564-564, 800-801, 1170-1176, 1504-1517

packages/importer/src/types.ts (1)

20-25: Naming duplication exists but no actual import collision risk – review comment is incorrect.

The analysis shows that while ImportProgress and ImportStageResult are defined in both packages, the importer package does not import these collision types from @openpanel/db. The importer only imports unrelated types like IClickhouseEvent and IServiceImportedEventPayload.

Additionally, the ImportProgress types serve different purposes: the importer's version tracks batch processing state (totalEvents, processedEvents, currentBatch, totalBatches), while the db version tracks job metadata (importId, totalEvents, insertedEvents, status). This is intentional, not a collision.

Since there are no cross-imports and the types aren't used interchangeably, no "confusion in imports" or "subtle type bugs" will occur. Both packages maintain independent type contracts appropriate to their concerns.

Likely an incorrect or invalid review comment.

packages/db/src/services/import.service.ts (4)

1-21: LGTM!

The imports and type definitions are well-structured. The interfaces clearly define the data contracts for staging results and progress tracking.


26-53: LGTM!

The batch insert function properly handles empty arrays, adds appropriate metadata, and leverages the retry-wrapped ClickHouse insert. The implementation is clean and defensive.


493-503: Verify cleanup scope: should Postgres import record also be deleted?

The cleanupFailedImport function only deletes from ClickHouse's events_imports table. The corresponding Postgres import record remains with status 'failed'.

Is this intentional (to preserve audit history), or should the Postgres record also be deleted for consistency?

If Postgres records should also be cleaned up, add:

 export async function cleanupFailedImport(importId: string): Promise<void> {
   const deleteQuery = `
     DELETE FROM ${TABLE_NAMES.events_imports}
     WHERE import_id = {importId:String} AND import_status = 'failed'
   `;

   await ch.command({
     query: deleteQuery,
     query_params: { importId },
   });
+  
+  // Also remove from Postgres if desired
+  await db.import.delete({
+    where: { id: importId, status: 'failed' },
+  });
 }

508-525: LGTM!

The getImportDateBounds utility correctly handles the nullable case and provides a clean interface for date range queries. Defensive null check on line 524 prevents runtime errors.

Comment on lines +14 to +15
warning:
'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',

⚠️ Potential issue | 🟠 Major

Fix color contrast for accessibility.

White text on a yellow-500 background likely fails WCAG AA contrast requirements (4.5:1). Yellow backgrounds conventionally require dark text for proper readability.

Apply this diff to fix the contrast issue:

-        warning:
-          'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
+        warning:
+          'border-transparent bg-yellow-500 text-yellow-950 [a&]:hover:bg-yellow-500/90',

Alternatively, use a darker yellow background with white text:

-        warning:
-          'border-transparent bg-yellow-500 text-white [a&]:hover:bg-yellow-500/90',
+        warning:
+          'border-transparent bg-yellow-600 text-white [a&]:hover:bg-yellow-600/90',
🤖 Prompt for AI Agents
In apps/start/src/components/ui/badge.tsx around lines 14-15, the warning
variant uses 'bg-yellow-500 text-white' which likely fails WCAG contrast; change
either the text color to a dark color (e.g., replace text-white with
text-gray-900 or text-black) so dark text sits on yellow-500, or keep white text
and use a darker yellow background (e.g., replace bg-yellow-500 with
bg-yellow-600 or bg-yellow-700) and adjust the hover class accordingly to
maintain accessible contrast.
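The numbers behind that recommendation can be checked directly. A small sketch using the WCAG 2.x relative-luminance and contrast-ratio formulas; the hex values (#eab308 for yellow-500, #422006 for yellow-950) are assumptions taken from Tailwind v3's palette, not from this PR:

```typescript
// Sketch: WCAG 2.x relative luminance and contrast ratio, used to check the
// claim that white text on Tailwind's yellow-500 (#eab308) fails AA (4.5:1).
function channel(c: number): number {
  const s = c / 255;
  return s <= 0.03928 ? s / 12.92 : ((s + 0.055) / 1.055) ** 2.4;
}

function luminance(hex: string): number {
  const n = Number.parseInt(hex.replace('#', ''), 16);
  const r = channel((n >> 16) & 0xff);
  const g = channel((n >> 8) & 0xff);
  const b = channel(n & 0xff);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

function contrastRatio(fg: string, bg: string): number {
  const [hi, lo] = [luminance(fg), luminance(bg)].sort((a, b) => b - a);
  return (hi + 0.05) / (lo + 0.05);
}

// White on yellow-500 sits far below the 4.5:1 AA threshold,
// while dark text (yellow-950) on the same background clears it.
console.log(contrastRatio('#ffffff', '#eab308').toFixed(2));
console.log(contrastRatio('#422006', '#eab308').toFixed(2));
```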

Comment on lines +55 to +251
toast.error('Import failed', {
description: error.message,
});
},
}),
);

⚠️ Potential issue | 🟡 Minor

Avoid leaking secrets in error paths/logs

error.message may include upstream details. Ensure server never echoes API secrets. Consider a generic toast and log details server-side only.

🤖 Prompt for AI Agents
In apps/start/src/modals/add-import.tsx around lines 55 to 60, the current toast
displays error.message which can leak upstream secrets; replace the client-side
toast with a generic user-facing message (e.g., "Import failed, try again
later") and remove inclusion of error.message, and instead ensure the server
logs the full error details securely (or returns a sanitized error code/message)
so secrets are never echoed to the client; update the toast usage accordingly
and confirm server-side logging/response sanitization is in place.
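A minimal sketch of that separation, assuming a hypothetical sanitizeClientError helper (not part of the OpenPanel codebase): the toast receives a fixed generic string, while the raw message survives only for server-side logging.

```typescript
// Hypothetical helper: never forward raw error messages to the client UI.
function sanitizeClientError(error: unknown): {
  userMessage: string;
  logDetail: string;
} {
  // Keep the raw message for server-side logs only.
  const logDetail = error instanceof Error ? error.message : String(error);
  return {
    // Stable, generic copy for the toast; upstream details never reach the UI.
    userMessage: 'Import failed. Please try again later.',
    logDetail,
  };
}

const result = sanitizeClientError(
  new Error('401 from provider: invalid api_secret'),
);
console.log(result.userMessage); // prints "Import failed. Please try again later."
```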

Comment on lines 12 to 14
import { UmamiProvider, type UmamiRawEvent } from '@openpanel/importer';
import type { ImportQueuePayload } from '@openpanel/queue';
import type { Job } from 'bullmq';

🛠️ Refactor suggestion | 🟠 Major

Remove UmamiRawEvent coupling; use BaseImportProvider generics

This job should be provider-agnostic. Swap UmamiRawEvent for BaseImportProvider.

-import { UmamiProvider, type UmamiRawEvent } from '@openpanel/importer';
+import { UmamiProvider, type BaseImportProvider } from '@openpanel/importer';

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 12 to 14, remove the direct
coupling to UmamiRawEvent and instead use the generic BaseImportProvider<T> from
the importer package: drop the UmamiRawEvent import, import BaseImportProvider
(and its type) from '@openpanel/importer', and replace any references to
UmamiRawEvent in the job/payload types with the generic BaseImportProvider<T>
(e.g., ImportQueuePayload<BaseImportProvider<any>> or a concrete T where
appropriate). Ensure all imports and Job generics are updated to use the
BaseImportProvider generic so the job becomes provider-agnostic.

Comment on lines 45 to 69
// Phase 1: Load & Transform - Process events in batches
const eventBatch: UmamiRawEvent[] = [];

for await (const rawEvent of providerInstance.parseSource()) {
// Validate event
if (!providerInstance.validate(rawEvent)) {
jobLogger.warn('Skipping invalid event', { rawEvent });
continue;
}

eventBatch.push(rawEvent);
totalEvents++;

// Process batch when it reaches the batch size
if (eventBatch.length >= BATCH_SIZE) {
jobLogger.info('Processing batch', { batchSize: eventBatch.length });

// Transform events
const transformedEvents = eventBatch.map((event) =>
providerInstance.transformEvent(event),
);

// Phase 2: Insert into staging table
await insertImportBatch(transformedEvents, importId);


🛠️ Refactor suggestion | 🟠 Major

Make batch buffer provider-agnostic and harden transform stage

  • Use unknown[] for the raw batch.
  • Guard transformEvent with try/catch to skip bad rows without failing the whole job.
-    const eventBatch: UmamiRawEvent[] = [];
+    const eventBatch: unknown[] = [];
@@
-        const transformedEvents = eventBatch.map((event) =>
-          providerInstance.transformEvent(event),
-        );
+        const transformedEvents: any[] = [];
+        for (const event of eventBatch) {
+          try {
+            transformedEvents.push(providerInstance.transformEvent(event as any));
+          } catch (e) {
+            jobLogger.warn('Skipping event on transform failure', {
+              provider: providerInstance.provider,
+              reason: e instanceof Error ? e.message : String(e),
+            });
+          }
+        }
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 45 to 69, the batch buffer and
transform stage need hardening: change eventBatch from UmamiRawEvent[] to
unknown[] so the buffer is provider-agnostic, and wrap each
providerInstance.transformEvent call in a try/catch so a thrown error on one row
is caught, logged (jobLogger.warn with the rawEvent and error), and that row is
skipped without failing the whole job; ensure you collect only successful
transformed events (filter out failed/undefined results) before calling
insertImportBatch(transformedEvents, importId).
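The skip-on-failure transform stage can be sketched provider-agnostically; safeMapTransform and the logger shape below are illustrative, not the worker's actual API:

```typescript
type Logger = { warn: (msg: string, meta?: Record<string, unknown>) => void };

// Illustrative helper: transform a batch, dropping rows whose transform throws
// so one bad row cannot fail the whole import job.
function safeMapTransform<TRaw, TOut>(
  batch: TRaw[],
  transform: (raw: TRaw) => TOut,
  logger: Logger,
): TOut[] {
  const out: TOut[] = [];
  for (const raw of batch) {
    try {
      out.push(transform(raw));
    } catch (e) {
      // The failing row is logged and skipped; the batch keeps going.
      logger.warn('Skipping event on transform failure', {
        reason: e instanceof Error ? e.message : String(e),
      });
    }
  }
  return out;
}

// Example: the second row throws and is skipped.
const warnings: string[] = [];
const result = safeMapTransform(
  [1, 2, 3],
  (n) => {
    if (n === 2) throw new Error('bad row');
    return n * 10;
  },
  { warn: (msg) => warnings.push(msg) },
);
console.log(result); // prints [ 10, 30 ]
```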

Comment on lines +53 to +66
// Properties type for events - more specific than Record<string, any>
export interface EventProperties {
[key: string]:
| string
| number
| boolean
| null
| undefined
| Record<string, unknown>;
__query?: Record<string, unknown>;
__title?: string;
__screen?: string;
__language?: string;
}

⚠️ Potential issue | 🟠 Major

EventProperties allows nested objects, but ClickHouse properties is primitives-only

IClickhouseEvent.properties is Record<string, string|number|boolean|undefined|null>. Storing objects (e.g., __query) will break ingestion unless stringified.

  • Constrain EventProperties values to primitives.
  • Or require JSON.stringify for nested shapes (e.g., __queryJson: string).
🤖 Prompt for AI Agents
In packages/importer/src/types.ts around lines 53 to 66, EventProperties
currently allows nested objects which will break ClickHouse ingestion because
IClickhouseEvent.properties must be primitives-only; update the EventProperties
interface so values are limited to string | number | boolean | null | undefined
(remove Record<string, unknown>) or replace nested object fields with explicit
JSON string fields (e.g., __queryJson: string) and rename/remove existing
__query, ensuring all nested shapes are JSON.stringified before being assigned
to EventProperties.
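One way to meet the primitives-only constraint is to stringify nested values at the boundary; a sketch with a hypothetical toPrimitiveProperties normalizer (not part of the importer package):

```typescript
type Primitive = string | number | boolean | null | undefined;

// Hypothetical normalizer: nested objects (and arrays) become JSON strings so
// the resulting record matches ClickHouse's primitives-only properties map.
function toPrimitiveProperties(
  input: Record<string, unknown>,
): Record<string, Primitive> {
  const out: Record<string, Primitive> = {};
  for (const [key, value] of Object.entries(input)) {
    if (value !== null && typeof value === 'object') {
      out[key] = JSON.stringify(value); // e.g. __query -> '{"utm_source":"x"}'
    } else {
      out[key] = value as Primitive;
    }
  }
  return out;
}

const props = toPrimitiveProperties({
  __title: 'Home',
  __query: { utm_source: 'x' },
});
console.log(props.__query); // prints {"utm_source":"x"}
```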

@@ -0,0 +1,3 @@
import { getSharedVitestConfig } from '../../vitest.shared';

export default getSharedVitestConfig({ __dirname });

⚠️ Potential issue | 🔴 Critical

__dirname is unavailable in ESM modules.

Since package.json declares "type": "module", this file runs as ESM. The __dirname global doesn't exist in ESM and will throw a ReferenceError.

Apply this diff to compute the directory path in ESM:

+import { dirname } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
 import { getSharedVitestConfig } from '../../vitest.shared';

-export default getSharedVitestConfig({ __dirname });
+export default getSharedVitestConfig({ 
+  __dirname: dirname(fileURLToPath(import.meta.url)) 
+});
🤖 Prompt for AI Agents
In packages/importer/vitest.config.ts around line 3, the code passes the
non-existent __dirname to getSharedVitestConfig (ESM modules don't have
__dirname); compute the directory from import.meta.url instead and pass that
value: import { fileURLToPath } from 'node:url' and path from 'node:path',
derive const __dirname =
path.dirname(fileURLToPath(import.meta.url)), then call getSharedVitestConfig({
__dirname }) so the config receives the correct directory path in ESM.

Comment on lines +12 to +32
list: protectedProcedure
.input(z.object({ projectId: z.string() }))
.query(async ({ input, ctx }) => {
const access = await getProjectAccessCached({
projectId: input.projectId,
userId: ctx.session.userId,
});

if (!access) {
throw TRPCAccessError('You do not have access to this project');
}

return db.import.findMany({
where: {
projectId: input.projectId,
},
orderBy: {
createdAt: 'desc',
},
});
}),

⚠️ Potential issue | 🟠 Major

Sensitive config exposure in list/get — restrict or redact.

config likely contains provider credentials. The current list and get procedures return it to any project member with read access. Redact it or gate access by role.

Option A (redact for read-level):

   list: protectedProcedure
     .input(z.object({ projectId: z.string() }))
     .query(async ({ input, ctx }) => {
       const access = await getProjectAccessCached({
         projectId: input.projectId,
         userId: ctx.session.userId,
       });
       if (!access) {
         throw TRPCAccessError('You do not have access to this project');
       }
-      return db.import.findMany({
+      const rows = await db.import.findMany({
         where: { projectId: input.projectId },
         orderBy: { createdAt: 'desc' },
       });
+      const canSeeSecrets =
+        typeof access !== 'boolean' && access.level !== 'read';
+      return canSeeSecrets
+        ? rows
+        : rows.map((r) => ({ ...r, config: null }));
     }),
   get: protectedProcedure
     .input(z.object({ id: z.string() }))
     .query(async ({ input, ctx }) => {
       const importRecord = await db.import.findUniqueOrThrow({
         where: { id: input.id },
         include: { project: true },
       });
       const access = await getProjectAccessCached({
         projectId: importRecord.projectId,
         userId: ctx.session.userId,
       });
       if (!access) {
         throw TRPCAccessError('You do not have access to this import');
       }
-      return importRecord;
+      const canSeeSecrets =
+        typeof access !== 'boolean' && access.level !== 'read';
+      return canSeeSecrets ? importRecord : { ...importRecord, config: null };
     }),

Option B: Use Prisma select to exclude config in list and only include it for admin/write roles.

Also applies to: 34-57

🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 12 to 32 (and likewise apply
the change to lines 34 to 57), the current list/get resolvers return the full
import records including the sensitive config field to any project member;
change the DB queries to exclude the config field, either via Prisma's omit
option (e.g., omit: { config: true }) or a select listing only non-sensitive fields, for read-level access,
and only fetch/include config when the caller has a higher role (admin/write)
after re-checking the caller's role from ctx.session or the access object;
implement a conditional: if role allows, run a second query or include config in
select, otherwise omit/redact config from the returned objects and ensure no
accidental passthrough in list results.

Comment on lines +72 to +99
// Create import record
const importRecord = await db.import.create({
data: {
projectId: input.projectId,
config: input.config,
status: 'pending',
},
});

// Add job to queue
const job = await importQueue.add('import', {
type: 'import',
payload: {
importId: importRecord.id,
},
});

// Update import record with job ID
await db.import.update({
where: { id: importRecord.id },
data: { jobId: job.id },
});

return {
...importRecord,
jobId: job.id,
};
}),

⚠️ Potential issue | 🟠 Major

Queue enqueue not wrapped — record can be left pending with no job.

If importQueue.add fails, DB keeps a dangling 'pending' import. Wrap in try/catch, set failed status, and pass stable jobId/attempts/backoff.

       // Create import record
       const importRecord = await db.import.create({
         data: {
           projectId: input.projectId,
           config: input.config,
           status: 'pending',
         },
       });

-      // Add job to queue
-      const job = await importQueue.add('import', {
-        type: 'import',
-        payload: {
-          importId: importRecord.id,
-        },
-      });
-
-      // Update import record with job ID
-      await db.import.update({
-        where: { id: importRecord.id },
-        data: { jobId: job.id },
-      });
+      // Add job to queue
+      try {
+        const job = await importQueue.add(
+          'import',
+          { type: 'import', payload: { importId: importRecord.id } },
+          { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+        );
+        // Update import record with job ID (idempotent if jobId equals importRecord.id)
+        await db.import.update({
+          where: { id: importRecord.id },
+          data: { jobId: job.id },
+        });
+      } catch (err) {
+        await db.import.update({
+          where: { id: importRecord.id },
+          data: { status: 'failed', errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed' },
+        });
+        throw err;
+      }
🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 72-99, the code creates an
import DB record then calls importQueue.add without error handling so a failed
enqueue leaves a dangling 'pending' record; to fix, generate a stable jobId
(e.g. UUID) before enqueue, persist that jobId on the import record (or create
the record with that jobId) and then wrap importQueue.add in try/catch — on
success update any job metadata if needed, on failure update the DB record to
status 'failed' and write the error message and retry metadata
(attempts/backoff) to the record; ensure the job payload includes the stable
jobId and desired attempts/backoff so retries are deterministic.
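The create-then-enqueue compensation described above can be sketched with in-memory stand-ins for db.import and importQueue; all names here are illustrative, not the real Prisma or BullMQ APIs:

```typescript
type Status = 'pending' | 'failed';
type ImportRecord = {
  id: string;
  status: Status;
  jobId?: string;
  errorMessage?: string;
};

// In-memory stand-ins for db.import and importQueue (illustrative only).
const store = new Map<string, ImportRecord>();
const queue = {
  failNext: false,
  async add(id: string): Promise<{ id: string }> {
    if (queue.failNext) throw new Error('redis down');
    return { id };
  },
};

// Create the record first, then enqueue. If enqueue throws, mark the record
// failed instead of leaving it dangling in 'pending', then rethrow.
async function createImport(id: string): Promise<ImportRecord> {
  const rec: ImportRecord = { id, status: 'pending' };
  store.set(id, rec);
  try {
    const job = await queue.add(id); // stable jobId = record id
    rec.jobId = job.id;
  } catch (err) {
    rec.status = 'failed';
    rec.errorMessage = err instanceof Error ? err.message : 'enqueue failed';
    throw err;
  }
  return rec;
}

createImport('ok').then((rec) => console.log(rec.status, rec.jobId)); // prints "pending ok"
```

Using the record id as the queue jobId also makes retries idempotent: re-adding the same job is a no-op in BullMQ when the jobId already exists.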

Comment on lines +469 to +486
export const zCreateImport = z.object({
projectId: z.string().min(1),
provider: z.enum(['umami', 'plausible', 'mixpanel']),
config: z.union([
zUmamiImportConfig,
zPlausibleImportConfig,
zMixpanelImportConfig,
]),
});

⚠️ Potential issue | 🟠 Major

Discriminate by provider to prevent mismatched payloads

provider and config can disagree (e.g., provider='umami' with a Mixpanel config). Make zCreateImport a discriminated union on provider so config shape is tied to the selected provider.

-export const zCreateImport = z.object({
-  projectId: z.string().min(1),
-  provider: z.enum(['umami', 'plausible', 'mixpanel']),
-  config: z.union([
-    zUmamiImportConfig,
-    zPlausibleImportConfig,
-    zMixpanelImportConfig,
-  ]),
-});
+export const zCreateImport = z.discriminatedUnion('provider', [
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('umami'),
+    config: zUmamiImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('plausible'),
+    config: zPlausibleImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('mixpanel'),
+    config: zMixpanelImportConfig,
+  }),
+]);
🤖 Prompt for AI Agents
In packages/validation/src/index.ts around lines 469 to 477, zCreateImport
currently allows provider and config to disagree; change it to a discriminated
union so each provider literal is paired with the correct config shape. Replace
the single z.object with either z.discriminatedUnion('provider', [...]) or a
z.union of three z.object schemas where provider is a literal ('umami' |
'plausible' | 'mixpanel') and config is the matching zUmamiImportConfig,
zPlausibleImportConfig, or zMixpanelImportConfig respectively; keep the same
exported name and types so callers get compile-time validation that provider and
config match.
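As a plain-TypeScript analogue of what z.discriminatedUnion enforces (the config shapes here are simplified stand-ins, not the real zod schemas):

```typescript
// Minimal stand-ins for the real provider config schemas.
type UmamiConfig = { fileUrl: string };
type MixpanelConfig = { apiSecret: string };

// Discriminated union: the provider literal pins the config shape.
type CreateImport =
  | { provider: 'umami'; config: UmamiConfig }
  | { provider: 'mixpanel'; config: MixpanelConfig };

// Runtime guard mirroring the zod discriminated union: a config that does not
// match its provider literal is rejected.
function isCreateImport(input: {
  provider: string;
  config: Record<string, unknown>;
}): input is CreateImport {
  switch (input.provider) {
    case 'umami':
      return typeof input.config.fileUrl === 'string';
    case 'mixpanel':
      return typeof input.config.apiSecret === 'string';
    default:
      return false;
  }
}

// A Mixpanel-shaped config under provider 'umami' is rejected.
console.log(isCreateImport({ provider: 'umami', config: { apiSecret: 'x' } })); // prints false
console.log(isCreateImport({ provider: 'umami', config: { fileUrl: 'u' } })); // prints true
```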

@vercel

vercel bot commented Oct 28, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: openpanel-public · Deployment: Error · Preview: Error · Updated (UTC): Oct 30, 2025 2:29pm

@coderabbitai

coderabbitai bot commented Oct 28, 2025

Walkthrough

A comprehensive import system is introduced, enabling data ingestion from external providers (Umami, Mixpanel) through a new packages/importer library, TRPC router, worker job processor, and UI modal. The packages/cli is removed, referrer parsing logic is reorganized into packages/common, database schemas are extended with an imports table, and UI components gain new styling variants and props.

Changes

Cohort / File(s) — Summary

• New Importer Package (packages/importer/*): Establishes a new TypeScript package with base provider class, Umami and Mixpanel provider implementations, type definitions, and metadata. Exports provider instances and type contracts for import lifecycle management.
• Import Feature - Database Layer (packages/db/prisma/schema.prisma, packages/db/prisma/migrations/*, packages/db/code-migrations/5-add-imports-table.*, packages/db/src/services/import.service.ts, packages/db/src/clickhouse/*): Adds Import model and ImportStatus enum; creates ClickHouse and PostgreSQL tables for imports; introduces import service with staging workflow (batch insertion, session generation, production migration); adds CSV helper utilities and TTL migration support.
• Import Feature - API Layer (packages/trpc/src/routers/import.ts, packages/queue/src/queues.ts, packages/validation/src/index.ts): Adds TRPC router with list/get/create/delete/retry procedures; introduces importQueue with import job payload; defines zod schemas for Umami/Mixpanel/Plausible import configs.
• Import Feature - Worker Processing (apps/worker/src/boot-workers.ts, apps/worker/src/index.ts, apps/worker/src/jobs/import.ts): Registers import worker with configurable concurrency; adds comprehensive import job handler orchestrating five-phase pipeline (load → batch → session generation → session events → production migration).
• Import Feature - UI (apps/start/src/modals/add-import.tsx, apps/start/src/modals/index.tsx, apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx, apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx, apps/start/src/components/project-mapper.tsx): Adds AddImport modal with provider-specific subforms (Umami, Mixpanel); registers modal in modal index; introduces ImportsSettings tab and route with import history table, provider grid, and mutation handlers.
• Referrer Logic Reorganization (packages/common/server/parse-referrer.ts, packages/common/server/parser-user-agent.ts, packages/common/scripts/get-referrers.ts, packages/common/server/index.ts, packages/common/package.json, package.json): Moves referrer parsing and generation logic to common/server; changes default referrer type from 'unknown' to '' (empty string); adds gen:referrers script; updates import paths across worker and apps.
• UI Component Enhancements (apps/start/src/components/ui/badge.tsx, apps/start/src/components/integrations/integration-card.tsx, apps/start/src/modals/date-ranger-picker.tsx): Adds badge variants (success, warning, info); adds optional className prop to IntegrationCardLogoImage; adds captionLayout="dropdown" to DateRangerPicker.
• CLI Package Removal (packages/cli/*): Removes entire CLI package including package.json, entry point, importer logic, and build configurations.
• Dependency & Config Updates (apps/start/package.json, apps/worker/package.json, packages/trpc/package.json, packages/common/package.json, packages/db/package.json, packages/redis/package.json, apps/api/package.json): Adds @openpanel/importer and @openpanel/queue dependencies; updates tsdown pins; adds exports mappings in common; updates ClickHouse client version.
• Utility & Helper Updates (packages/common/src/object.ts, packages/common/src/object.test.ts, packages/db/src/services/event.service.ts, packages/db/src/types.ts, apps/start/src/utils/math.ts, apps/start/src/utils/slug.ts, apps/worker/src/jobs/events.incoming-event.ts, apps/worker/src/jobs/events.incoming-events.test.ts, apps/worker/src/jobs/cron.delete-projects.ts, packages/db/src/buffers/base-buffer.ts, packages/trpc/src/routers/auth.ts, packages/trpc/src/routers/organization.ts): Enhances toDots to handle malformed JSON; adds IServiceImportedEventPayload type; updates import paths from deep modules to package exports; normalizes ClickHouse settings types and session query defaults.
Enhances toDots to handle malformed JSON; adds IServiceImportedEventPayload type; updates import paths from deep modules to package exports; normalizes ClickHouse settings types and session query defaults.
Dockerfile Updates
apps/start/Dockerfile, apps/worker/Dockerfile
Adds importer package to workspace and production image copy stages for both frontend and worker containers.
Route & Config Generation
apps/start/src/routeTree.gen.ts
Registers new imports tab route under project settings; extends FileRoutesByFullPath, FileRoutesByTo, and FileRoutesById type unions.

Sequence Diagram(s)

sequenceDiagram
    participant UI as ImportsSettings Modal
    participant API as TRPC import.create
    participant DB as Prisma/PostgreSQL
    participant Queue as Redis Queue
    participant Worker as Import Job
    participant CH as ClickHouse

    UI->>API: Create import (provider, config, projectId)
    API->>DB: Create import record (status: pending)
    DB-->>API: Return import with id
    API->>Queue: Enqueue import job
    Queue-->>API: Return jobId
    API->>DB: Update import.jobId
    
    Worker->>DB: Fetch import record
    Worker->>DB: Phase 1: Insert events to staging (batches)
    Worker->>DB: Update progress after each batch
    alt Generate Sessions Enabled
        Worker->>CH: Phase 2: Generate session IDs from staging
        Worker->>DB: Update progress
    end
    Worker->>CH: Phase 3: Create session_start/end events
    Worker->>CH: Phase 4: Move imports to production
    Worker->>CH: Phase 5: Backfill sessions to production
    Worker->>DB: Mark import completed (status: completed)
    Worker-->>DB: Update final progress metrics

    UI->>API: Poll/Fetch import history
    API->>DB: Query imports with status + progress
    DB-->>API: Return import list
    API-->>UI: Display table with status, progress bar, actions
sequenceDiagram
    participant Form as AddImport Modal
    participant Subform as UmamiImport/MixpanelImport
    participant Service as Import Service (TRPC)
    participant Mutation as Job Mutation

    Subform->>Subform: User fills provider-specific fields
    Subform->>Service: onSubmit(config)
    Service->>Mutation: import.create(projectId, provider, config)
    
    alt Success
        Mutation-->>Service: { id, status, jobId }
        Service->>Service: Show success toast
        Service->>Service: Invalidate import list cache
        Service->>Form: Close modal
    else Error
        Mutation-->>Service: Error response
        Service->>Service: Show error toast
        Service->>Form: Keep modal open
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Specific areas requiring extra attention:
    • packages/importer/src/providers/umami.ts and mixpanel.ts: Dense logic for event transformation, streaming, decompression, and provider-specific field mapping; validate schema compliance and error handling paths
    • apps/worker/src/jobs/import.ts: Five-phase pipeline with resume logic, chunked processing, progress tracking, and state transitions; verify phase boundaries and error recovery mechanisms
    • packages/db/src/services/import.service.ts: Comprehensive import service layer with multiple staging/production migrations; ensure ClickHouse query correctness and data consistency guarantees
    • apps/start/src/modals/add-import.tsx: Modal state management, form delegation, and TRPC mutation orchestration; validate loading states and error messaging
    • Database migrations (5 SQL files): Multiple schema revisions indicate iterative refinement; verify final schema matches Prisma model and no conflicting constraints remain
    • CLI removal impact: Confirm no downstream dependencies or documentation references remain; verify migration path for existing CLI users

Poem

🐰 Hop along the import trail,
Five phases deep, events never fail,
From Umami hops to Mixpanel's call,
Sessions born, production haul,
Data streams like carrots stacked so tall!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00%, which is below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "new importer" is directly related to the primary change in the pull request, which introduces a comprehensive new import/importer system. The changeset includes new provider implementations (Umami, Mixpanel), database tables and migrations, UI components for managing imports, TRPC router endpoints, worker jobs, and supporting infrastructure. The title accurately and concisely summarizes this main addition without excessive detail or vague terminology. While the title is brief, it clearly conveys that a new importer feature is being introduced, which is appropriate since not all implementation details are expected to be captured in the title.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/import

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/common/src/object.ts (1)

22-29: Inconsistent handling of whitespace-only strings.

The guard on line 22 now filters out explicit empty strings, but whitespace-only strings (e.g., ' ') will pass this check, get trimmed to an empty string on line 28, and still be added to the accumulator. This creates inconsistent behavior.

Consider moving the empty string check after the cleaning step to ensure all empty values (including trimmed whitespace) are filtered consistently:

-    if (value === undefined || value === null || value === '') {
+    if (value === undefined || value === null) {
       return acc;
     }
 
     const cleanedValue =
       typeof value === 'string'
         ? removeInvalidSurrogates(value).trim()
         : String(value);
+
+    if (cleanedValue === '') {
+      return acc;
+    }
 
     return {
       ...acc,
       [`${path}${key}`]: cleanedValue,
     };
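The intended post-fix behavior can be illustrated with a minimal standalone sketch (a hypothetical reimplementation of the flattening guard, not the actual `toDots` code): cleaning happens first, and the emptiness check runs on the cleaned value, so `' '` and `''` are treated the same way.

```typescript
// Hypothetical sketch of the suggested guard order: clean first, then drop
// entries whose cleaned value is empty, so ' ' and '' behave identically.
type Primitive = string | number | boolean;

function flattenValues(
  obj: Record<string, Primitive | null | undefined>,
): Record<string, string> {
  return Object.entries(obj).reduce<Record<string, string>>(
    (acc, [key, value]) => {
      if (value === undefined || value === null) {
        return acc;
      }
      const cleanedValue =
        typeof value === 'string' ? value.trim() : String(value);
      if (cleanedValue === '') {
        return acc; // whitespace-only strings are now filtered too
      }
      acc[key] = cleanedValue;
      return acc;
    },
    {},
  );
}

const flattened = flattenValues({ a: ' x ', b: '   ', c: '', d: 0, e: null });
// 'b' and 'c' are both dropped; 'a' is trimmed, 'd' stringified.
```

Note that explicit `''` values still fall out naturally, since trimming an empty string yields an empty string.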
♻️ Duplicate comments (18)
apps/start/src/components/ui/badge.tsx (1)

12-16: New badge variants added for status indication.

The three new variants (success, warning, info) extend the badge component to support import status displays and other UI feedback needs.

Note: The warning variant's color contrast issue (white text on yellow-500) has already been flagged in a previous review.

packages/db/code-migrations/5-add-imports-table.ts (1)

65-69: TTL must be defined at CREATE time (and on local replicated table in cluster).

ALTERing TTL on events_imports will hit the Distributed table in cluster mode and be a no-op. Bake TTL into createTable so it applies to the replicated local table and the single-node path. Remove the ALTER.

Apply these diffs:

In this file:

   const sqls: string[] = [
     ...createTable({
       name: 'events_imports',
       columns: [
         // ...
       ],
       orderBy: ['import_id', 'created_at'],
       partitionBy: 'toYYYYMM(imported_at_meta)',
       settings: {
         index_granularity: 8192,
       },
       distributionHash: 'cityHash64(import_id)',
       replicatedVersion: '1',
       isClustered,
+      ttl: "imported_at_meta + INTERVAL 7 DAY",
     }),
   ];
 
-  // Add TTL policy for auto-cleanup after 7 days
-  sqls.push(`
-    ALTER TABLE events_imports 
-    MODIFY TTL imported_at_meta + INTERVAL 7 DAY
-  `);

And extend createTable (packages/db/src/clickhouse/migration.ts):

 export function createTable({
   name: tableName,
   columns,
   indices = [],
   engine = 'MergeTree()',
   orderBy = ['tuple()'],
   partitionBy,
   settings = {},
   distributionHash,
   replicatedVersion,
   isClustered,
+  ttl,
 }: CreateTableOptions): string[] {
   const columnDefinitions = [...columns, ...indices].join(',\n  ');
   const settingsClause = Object.entries(settings).length
     ? `SETTINGS ${Object.entries(settings).map(([k, v]) => `${k} = ${v}`).join(', ')}`
     : '';
   const partitionByClause = partitionBy ? `PARTITION BY ${partitionBy}` : '';
+  const ttlClause = ttl ? `\nTTL ${ttl}` : '';
 
   if (!isClustered) {
     return [
       `CREATE TABLE IF NOT EXISTS ${tableName} (
   ${columnDefinitions}
 )
 ENGINE = ${engine}
 ${partitionByClause}
 ORDER BY (${orderBy.join(', ')})
-${settingsClause}`.trim(),
+${ttlClause}
+${settingsClause}`.trim(),
     ];
   }
 
   return [
     `CREATE TABLE IF NOT EXISTS ${replicated(tableName)} ON CLUSTER '{cluster}' (
   ${columnDefinitions}
 )
 ENGINE = Replicated${engine.replace(/^(.+?)\((.+?)?\)/, `$1('${CLUSTER_REPLICA_PATH.replace('{replicatedVersion}', replicatedVersion)}', '{replica}', $2)`).replace(/, \)$/, ')')}
 ${partitionByClause}
 ORDER BY (${orderBy.join(', ')})
-${settingsClause}`.trim(),
+${ttlClause}
+${settingsClause}`.trim(),
     `CREATE TABLE IF NOT EXISTS ${tableName} ON CLUSTER '{cluster}' AS ${replicated(tableName)}
 ENGINE = Distributed('{cluster}', currentDatabase(), ${replicated(tableName)}, ${distributionHash})`,
   ];
 }

Also applies to: 15-63

packages/validation/src/index.ts (1)

478-486: Make zCreateImport a discriminated union to bind provider ⇔ config.

As-is, provider='umami' can carry a Mixpanel config. Use a discriminated union on provider.

-export const zCreateImport = z.object({
-  projectId: z.string().min(1),
-  provider: z.enum(['umami', 'plausible', 'mixpanel']),
-  config: z.union([
-    zUmamiImportConfig,
-    zPlausibleImportConfig,
-    zMixpanelImportConfig,
-  ]),
-});
+export const zCreateImport = z.discriminatedUnion('provider', [
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('umami'),
+    config: zUmamiImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('plausible'),
+    config: zPlausibleImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('mixpanel'),
+    config: zMixpanelImportConfig,
+  }),
+]);
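The effect of the discriminated union can be sketched without zod: once `provider` is narrowed, the config type follows, so an 'umami' record can no longer carry a Mixpanel config at the type level. The config shapes below are hypothetical simplifications, not the real schemas.

```typescript
// Hypothetical simplified config shapes; the real ones live in zod schemas.
type UmamiConfig = { fileUrl: string };
type MixpanelConfig = { apiSecret: string; from: string; to: string };

type CreateImport =
  | { provider: 'umami'; config: UmamiConfig }
  | { provider: 'mixpanel'; config: MixpanelConfig };

function describeImport(input: CreateImport): string {
  switch (input.provider) {
    case 'umami':
      // input.config is narrowed to UmamiConfig here.
      return `umami file: ${input.config.fileUrl}`;
    case 'mixpanel':
      return `mixpanel range: ${input.config.from}..${input.config.to}`;
  }
}

const label = describeImport({
  provider: 'umami',
  config: { fileUrl: 'https://example.com/x.csv' },
});
```

`z.discriminatedUnion` gives the same binding at runtime: a payload with `provider: 'umami'` is validated against the Umami config schema only.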
packages/trpc/src/routers/import.ts (3)

12-32: Sensitive config exposure in list — restrict or redact.

config contains provider credentials (API keys, secrets). Current implementation returns it to any project member with read access. Redact or gate by role.

Option A (redact for read-level):

  list: protectedProcedure
    .input(z.object({ projectId: z.string() }))
    .query(async ({ input, ctx }) => {
      const access = await getProjectAccessCached({
        projectId: input.projectId,
        userId: ctx.session.userId,
      });
      if (!access) {
        throw TRPCAccessError('You do not have access to this project');
      }
-      return db.import.findMany({
+      const rows = await db.import.findMany({
        where: { projectId: input.projectId },
        orderBy: { createdAt: 'desc' },
      });
+      const canSeeSecrets =
+        typeof access !== 'boolean' && access.level !== 'read';
+      return canSeeSecrets
+        ? rows
+        : rows.map((r) => ({ ...r, config: null }));
    }),

34-56: Sensitive config exposure in get — restrict or redact.

config contains provider credentials. Apply the same access-level check as in list to redact secrets for read-only users.

  get: protectedProcedure
    .input(z.object({ id: z.string() }))
    .query(async ({ input, ctx }) => {
      const importRecord = await db.import.findUniqueOrThrow({
        where: { id: input.id },
        include: { project: true },
      });
      const access = await getProjectAccessCached({
        projectId: importRecord.projectId,
        userId: ctx.session.userId,
      });
      if (!access) {
        throw TRPCAccessError('You do not have access to this import');
      }
-      return importRecord;
+      const canSeeSecrets =
+        typeof access !== 'boolean' && access.level !== 'read';
+      return canSeeSecrets ? importRecord : { ...importRecord, config: null };
    }),

72-99: Queue enqueue not wrapped — record can be left pending with no job.

If importQueue.add fails, DB keeps a dangling 'pending' import. Wrap in try/catch, set failed status, and pass stable jobId/attempts/backoff.

      // Create import record
      const importRecord = await db.import.create({
        data: {
          projectId: input.projectId,
          config: input.config,
          status: 'pending',
        },
      });

-      // Add job to queue
-      const job = await importQueue.add('import', {
-        type: 'import',
-        payload: {
-          importId: importRecord.id,
-        },
-      });
-
-      // Update import record with job ID
-      await db.import.update({
-        where: { id: importRecord.id },
-        data: { jobId: job.id },
-      });
+      // Add job to queue
+      try {
+        const job = await importQueue.add(
+          'import',
+          { type: 'import', payload: { importId: importRecord.id } },
+          { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+        );
+        await db.import.update({
+          where: { id: importRecord.id },
+          data: { jobId: job.id },
+        });
+      } catch (err) {
+        await db.import.update({
+          where: { id: importRecord.id },
+          data: { status: 'failed', errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed' },
+        });
+        throw err;
+      }
apps/worker/src/jobs/import.ts (3)

130-139: Redact raw event logging to avoid PII leakage.

Line 137 logs the full rawEvent, which may contain PII (emails, user identifiers, etc.). Log only minimal, non-PII identifiers.

      if (
        !providerInstance.validate(
          // @ts-expect-error
          rawEvent,
        )
      ) {
-        jobLogger.warn('Skipping invalid event', { rawEvent });
+        jobLogger.warn('Skipping invalid event', {
+          provider: providerInstance.provider,
+          reason: 'validation_failed',
+        });
        continue;
      }

126-169: Harden batch buffer and transform stage.

The batch buffer is typed as `any` (Line 126), and the transform stage (Lines 147-152) lacks error handling, so a single transform failure fails the entire batch. Use `unknown[]` and wrap transforms in try/catch to skip invalid events.

-    const eventBatch: any = [];
+    const eventBatch: unknown[] = [];
    for await (const rawEvent of providerInstance.parseSource(
      resumeLoadingFrom,
    )) {
      // ... validation ...
      
      // Process batch when it reaches the batch size
      if (eventBatch.length >= BATCH_SIZE) {
        jobLogger.info('Processing batch', { batchSize: eventBatch.length });

-        const transformedEvents: IClickhouseEvent[] = eventBatch.map(
-          (
-            // @ts-expect-error
-            event,
-          ) => providerInstance!.transformEvent(event),
-        );
+        const transformedEvents: IClickhouseEvent[] = [];
+        for (const event of eventBatch) {
+          try {
+            transformedEvents.push(providerInstance.transformEvent(event as any));
+          } catch (e) {
+            jobLogger.warn('Skipping event on transform failure', {
+              provider: providerInstance.provider,
+              reason: e instanceof Error ? e.message : String(e),
+            });
+          }
+        }

246-265: Sync final Postgres counters with ClickHouse progress.

UI reads processedEvents/totalEvents from Postgres. After completion, Postgres isn't updated with final ClickHouse counts, so progress may show stale values. Update both to finalProgress values.

    await markImportComplete(importId);
    await updateImportStatus(jobLogger, job, importId, {
      step: 'completed',
    });
    jobLogger.info('Import marked as complete');

    // Get final progress
    const finalProgress = await getImportProgress(importId);

    jobLogger.info('Import job completed successfully', {
      totalEvents: finalProgress.totalEvents,
      insertedEvents: finalProgress.insertedEvents,
      status: finalProgress.status,
    });
+
+    // Keep Postgres in sync for UI progress
+    await db.import.update({
+      where: { id: importId },
+      data: {
+        totalEvents: finalProgress.totalEvents,
+        processedEvents: finalProgress.insertedEvents,
+      },
+    });
apps/start/src/modals/add-import.tsx (1)

245-249: Avoid leaking secrets in error messages.

Line 247 displays error.message which may include upstream API details or secrets. Use a generic client-facing message and log details server-side only.

      onError: (error) => {
        toast.error('Import failed', {
-          description: error.message,
+          description: 'Failed to start import. Please check your configuration and try again.',
        });
      },
packages/importer/src/providers/umami.ts (5)

79-198: SSRF exposure persists when fetching arbitrary URLs.

The fetch operation on line 97 accepts user-provided URLs without validating that they don't resolve to internal networks or restricted IP ranges. This remains a critical security risk.


213-220: Device ID fallback remains incorrect and unsafe.

Lines 216-217 still use rawEvent.visit_id! for both ip and ua parameters with non-null assertions. The visit_id is semantically unrelated to IP address or user agent, and the non-null assertion will throw if visit_id is undefined.


224-228: URL construction issues remain unresolved.

Line 225 concatenates hostname, path, and query without ensuring proper formatting. The query parameter needs a leading '?' and the hostname should be required for absolute URLs.


231-233: Referrer query parameter is still ignored.

Line 232 constructs the referrerUrl but omits rawEvent.referrer_query, losing query string information from the referrer.


257-296: Properties structure violates ClickHouse type expectations.

Lines 260 and 267-296 build nested objects (properties.__query and properties.__query.utm_*) which conflict with IClickhouseEvent.properties: Record<string, string | number | boolean | undefined | null>. ClickHouse expects flat primitive values.
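A flat shape that satisfies `Record<string, string | number | boolean | undefined | null>` can be produced by dot-flattening the nested query object. A hedged sketch with a hypothetical standalone helper (the codebase's own `toDots` serves a similar purpose):

```typescript
// Hypothetical helper: flatten nested objects into dot-keyed primitives so
// the result matches the flat Record type expected for properties.
type Flat = Record<string, string | number | boolean>;

function flatten(obj: Record<string, unknown>, prefix = ''): Flat {
  const out: Flat = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object') {
      // Recurse into nested objects, extending the dot path.
      Object.assign(out, flatten(value as Record<string, unknown>, path));
    } else if (value !== undefined && value !== null) {
      out[path] = value as string | number | boolean;
    }
  }
  return out;
}

const props = flatten({
  __query: { utm_source: 'newsletter', utm_medium: 'email' },
});
// → { '__query.utm_source': 'newsletter', '__query.utm_medium': 'email' }
```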

packages/db/src/services/import.service.ts (3)

256-256: Timestamp collision risk persists.

Lines 256 and 293 adjust timestamps by exactly ±1000ms (1 second). This can collide with real events at those timestamps, causing non-deterministic ordering in time-based queries.

Also applies to: 293-293
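One way to shrink the collision window is to offset the synthetic boundary events by the smallest representable step (1 ms for DateTime64(3)) instead of a full second. A hedged sketch with a hypothetical helper, not the service's actual code:

```typescript
// Hypothetical sketch: derive session_start/session_end timestamps from the
// min/max event times using a 1 ms offset (the resolution of DateTime64(3))
// rather than ±1000 ms, greatly narrowing the window in which a real event
// could land between a boundary event and the session's first/last event.
function sessionBoundaries(eventTimesMs: number[]): {
  start: number;
  end: number;
} {
  const min = Math.min(...eventTimesMs);
  const max = Math.max(...eventTimesMs);
  return { start: min - 1, end: max + 1 };
}

const bounds = sessionBoundaries([1_700_000_000_000, 1_700_000_005_000]);
// start is 1 ms before the first event, end is 1 ms after the last.
```

Fully deterministic ordering would still need a secondary sort key (e.g. event type rank) to break exact-timestamp ties.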


559-559: Non-deterministic status aggregation remains.

Line 559 uses any(import_status), which returns an arbitrary status when events have mixed statuses. This makes progress reporting unpredictable.


603-603: Type assertion bypasses runtime validation.

Line 603 uses a type assertion for row.status without runtime validation. If ClickHouse returns an unexpected value (due to the any() issue on line 559 or data corruption), this will silently pass invalid data.

🧹 Nitpick comments (10)
apps/start/src/components/project-mapper.tsx (1)

50-50: Consider validating for duplicate source project IDs.

The component allows users to add multiple mappings with the same "from" (source project ID), which could lead to ambiguous or conflicting import behavior. Consider adding validation to prevent duplicate source IDs, either here or in the parent form's validation schema.
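A minimal duplicate check could live either in the component or in the form schema. A sketch with a hypothetical helper, not part of the current code:

```typescript
// Hypothetical validation helper: returns the source project IDs that appear
// in more than one mapping, so the form can reject ambiguous configurations.
type Mapping = { from: string; to: string };

function duplicateSources(mappings: Mapping[]): string[] {
  const seen = new Map<string, number>();
  for (const m of mappings) {
    seen.set(m.from, (seen.get(m.from) ?? 0) + 1);
  }
  return [...seen.entries()]
    .filter(([, count]) => count > 1)
    .map(([from]) => from);
}

const dupes = duplicateSources([
  { from: 'site-a', to: 'proj-1' },
  { from: 'site-a', to: 'proj-2' },
  { from: 'site-b', to: 'proj-3' },
]);
// → ['site-a']
```

In a zod-based form this could be wired up via `superRefine`, attaching the error to the offending row.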

packages/db/src/clickhouse/client.ts (1)

136-156: Consider immutability in insert proxy.

The proxy mutates args[0] directly when injecting clickhouse_settings. While functional, this modifies the caller's object.

Consider creating a new object to avoid mutation:

       return (...args: any[]) =>
         withRetry(() => {
-          args[0].clickhouse_settings = {
+          const settings = {
             // Allow bigger HTTP payloads/time to stream rows
             async_insert: 1,
             wait_for_async_insert: 1,
             // Increase insert timeouts and buffer sizes for large batches
             max_execution_time: 300,
             max_insert_block_size: '500000',
             max_http_get_redirects: '0',
             // Ensure JSONEachRow stays efficient
             input_format_parallel_parsing: 1,
             // Keep long-running inserts/queries from idling out at proxies by sending progress headers
             send_progress_in_http_headers: 1,
             http_headers_progress_interval_ms: '50000',
             // Ensure server holds the connection until the query is finished
             wait_end_of_query: 1,
             ...args[0].clickhouse_settings,
           };
+          return value.apply(target, [{ ...args[0], clickhouse_settings: settings }]);
-          return value.apply(target, args);
         });
packages/common/scripts/get-referrers.ts (2)

91-92: Improve error handling in main function.

The current error handling only logs to console without providing actionable information or proper error propagation.

Apply this diff to improve error handling:

   } catch (e) {
-    console.log(e);
+    console.error('Failed to generate referrers:', e);
+    process.exit(1);
   }

This ensures the script fails with a non-zero exit code when referrer generation fails, allowing CI/build systems to detect the failure.


69-71: Implement robust error handling and fallback strategy for the external data source.

The Snowplow referrer data source is currently available and returns valid JSON (HTTP 200), but the script lacks fallback mechanisms. If the fetch fails during the build, the error is only logged to the console, so the failure passes silently. Consider adding retry logic, a timeout, or a fallback to a cached/bundled version to ensure build reliability when the external source becomes temporarily unavailable.
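A retry-with-timeout wrapper is one option. A hedged sketch with an injectable fetcher so the external source can be faked in tests (hypothetical names, not the script's current API):

```typescript
// Hypothetical retry wrapper: tries the fetcher up to `retries` times with a
// per-attempt timeout, rethrowing the last error after the final attempt.
type Fetcher = (url: string, signal: AbortSignal) => Promise<string>;

async function fetchWithRetry(
  url: string,
  fetcher: Fetcher,
  { retries = 3, timeoutMs = 10_000 } = {},
): Promise<string> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= retries; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      return await fetcher(url, controller.signal);
    } catch (err) {
      lastError = err;
    } finally {
      clearTimeout(timer);
    }
  }
  throw lastError;
}

// Fake fetcher that fails twice before succeeding, simulating a flaky source.
let calls = 0;
const flaky: Fetcher = async () => {
  calls++;
  if (calls < 3) throw new Error('network down');
  return '{"referrers": []}';
};

const body = await fetchWithRetry('https://example.com/referrers.json', flaky);
```

A cached copy of the generated referrers file could additionally serve as a last-resort fallback when all retries fail.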

packages/importer/src/providers/metadata.ts (2)

13-30: Keep provider metadata in sync with validation (plausible present there).

Validation accepts 'plausible', but it’s missing here. Either add Plausible to IMPORT_PROVIDERS or drop it from validation until ready.

Would you like a patch to add Plausible with a neutral placeholder logo?


18-19: Avoid hot‑linking external logos; serve from our assets/CDN.

Brandfetch links can break or change. Prefer checked‑in assets or our CDN.

Also applies to: 26-28

packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1)

1-3: Consider indexes if filtering/sorting by progress fields.

If currentBatch/currentStep appear in WHERE/ORDER BY, add indexes (even partial) to avoid seq scans. Otherwise this change looks fine.

apps/worker/src/index.ts (1)

35-45: Minor improvement: Cleaner Bull Board initialization.

Refactoring from a wrapped object literal to a direct createBullBoard() call improves code clarity. This is a good incidental cleanup.

packages/common/server/parser-user-agent.ts (1)

84-112: Consider simplifying the redundant type checks.

Each override field uses a pattern like:

typeof overrides?.__os === 'string' && overrides?.__os
  ? overrides?.__os
  : res.os.name

The truthiness check after the typeof guard is easy to misread: it only has an effect when the override is an empty string. If empty strings should be rejected, a clearer pattern would be:

-    os:
-      typeof overrides?.__os === 'string' && overrides?.__os
-        ? overrides?.__os
-        : res.os.name,
+    os:
+      (typeof overrides?.__os === 'string' && overrides?.__os !== '')
+        ? overrides?.__os
+        : res.os.name,

Or if empty strings should be accepted:

-    os:
-      typeof overrides?.__os === 'string' && overrides?.__os
-        ? overrides?.__os
-        : res.os.name,
+    os: typeof overrides?.__os === 'string' ? overrides?.__os : res.os.name,
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1)

46-76: Polling interval appropriate for progress tracking.

The 5-second refetch interval on Line 50 provides near-real-time import progress updates without excessive load. Consider adding a condition to stop polling once all imports are completed or failed.

Optional optimization:

  const importsQuery = useQuery(
    trpc.import.list.queryOptions(
      { projectId },
      {
-        refetchInterval: 5000,
+        refetchInterval: (query) => {
+          const hasActiveImports = query.state.data?.some(
+            (imp) => imp.status === 'pending' || imp.status === 'processing'
+          );
+          return hasActiveImports ? 5000 : false;
+        },
      },
    ),
  );
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d71712e and b60dd0e.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (63)
  • apps/api/package.json (1 hunks)
  • apps/start/package.json (2 hunks)
  • apps/start/src/components/integrations/integration-card.tsx (1 hunks)
  • apps/start/src/components/project-mapper.tsx (1 hunks)
  • apps/start/src/components/ui/badge.tsx (1 hunks)
  • apps/start/src/modals/add-import.tsx (1 hunks)
  • apps/start/src/modals/date-ranger-picker.tsx (1 hunks)
  • apps/start/src/modals/index.tsx (2 hunks)
  • apps/start/src/routeTree.gen.ts (11 hunks)
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
  • apps/worker/package.json (3 hunks)
  • apps/worker/src/boot-workers.ts (4 hunks)
  • apps/worker/src/index.ts (2 hunks)
  • apps/worker/src/jobs/events.incoming-event.ts (3 hunks)
  • apps/worker/src/jobs/events.incoming-events.test.ts (2 hunks)
  • apps/worker/src/jobs/import.ts (1 hunks)
  • package.json (1 hunks)
  • packages/cli/package.json (0 hunks)
  • packages/cli/src/cli.ts (0 hunks)
  • packages/cli/src/importer/importer.ts (0 hunks)
  • packages/cli/src/importer/index.ts (0 hunks)
  • packages/cli/tsconfig.json (0 hunks)
  • packages/cli/tsup.config.ts (0 hunks)
  • packages/common/package.json (1 hunks)
  • packages/common/scripts/get-referrers.ts (1 hunks)
  • packages/common/server/index.ts (1 hunks)
  • packages/common/server/parse-referrer.test.ts (5 hunks)
  • packages/common/server/parse-referrer.ts (4 hunks)
  • packages/common/server/parser-user-agent.ts (2 hunks)
  • packages/common/src/object.ts (1 hunks)
  • packages/db/code-migrations/5-add-imports-table.sql (1 hunks)
  • packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
  • packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1 hunks)
  • packages/db/prisma/schema.prisma (2 hunks)
  • packages/db/src/buffers/event-buffer.test.ts (8 hunks)
  • packages/db/src/clickhouse/client.ts (2 hunks)
  • packages/db/src/services/event.service.ts (1 hunks)
  • packages/db/src/services/import.service.ts (1 hunks)
  • packages/db/src/services/session.service.ts (1 hunks)
  • packages/db/src/types.ts (2 hunks)
  • packages/importer/package.json (1 hunks)
  • packages/importer/src/base-provider.ts (1 hunks)
  • packages/importer/src/index.ts (1 hunks)
  • packages/importer/src/providers/metadata.ts (1 hunks)
  • packages/importer/src/providers/mixpanel.test.ts (1 hunks)
  • packages/importer/src/providers/mixpanel.ts (1 hunks)
  • packages/importer/src/providers/umami.ts (1 hunks)
  • packages/importer/src/types.ts (1 hunks)
  • packages/importer/tsconfig.json (1 hunks)
  • packages/importer/vitest.config.ts (1 hunks)
  • packages/queue/src/queues.ts (1 hunks)
  • packages/redis/package.json (1 hunks)
  • packages/trpc/package.json (1 hunks)
  • packages/trpc/src/root.ts (2 hunks)
  • packages/trpc/src/routers/import.ts (1 hunks)
  • packages/validation/src/index.ts (1 hunks)
💤 Files with no reviewable changes (6)
  • packages/cli/tsup.config.ts
  • packages/cli/src/importer/index.ts
  • packages/cli/package.json
  • packages/cli/src/cli.ts
  • packages/cli/src/importer/importer.ts
  • packages/cli/tsconfig.json
✅ Files skipped from review due to trivial changes (2)
  • packages/importer/src/providers/mixpanel.test.ts
  • packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql
🚧 Files skipped from review as they are similar to previous changes (12)
  • packages/db/src/services/event.service.ts
  • packages/db/code-migrations/5-add-imports-table.sql
  • packages/importer/vitest.config.ts
  • apps/worker/package.json
  • apps/start/src/components/integrations/integration-card.tsx
  • packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql
  • packages/db/prisma/migrations/20251018214030_fix/migration.sql
  • packages/db/src/types.ts
  • packages/importer/tsconfig.json
  • packages/importer/package.json
  • packages/importer/src/types.ts
  • apps/start/src/routeTree.gen.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/start/.cursorrules)

apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'

Files:

  • apps/start/src/modals/index.tsx
  • apps/start/src/modals/date-ranger-picker.tsx
  • apps/start/src/components/ui/badge.tsx
  • apps/start/src/components/project-mapper.tsx
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
  • apps/start/src/modals/add-import.tsx
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx
🧬 Code graph analysis (17)
apps/start/src/modals/index.tsx (1)
apps/start/src/modals/add-import.tsx (1)
  • AddImport (231-282)
apps/worker/src/index.ts (1)
packages/queue/src/queues.ts (1)
  • importQueue (175-181)
packages/trpc/src/routers/import.ts (6)
packages/trpc/src/trpc.ts (2)
  • createTRPCRouter (138-138)
  • protectedProcedure (161-164)
packages/trpc/src/access.ts (1)
  • getProjectAccessCached (4-4)
packages/trpc/src/errors.ts (1)
  • TRPCAccessError (3-7)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
packages/validation/src/index.ts (1)
  • zCreateImport (478-486)
packages/queue/src/queues.ts (1)
  • importQueue (175-181)
packages/trpc/src/root.ts (1)
packages/trpc/src/routers/import.ts (1)
  • importRouter (11-178)
packages/db/code-migrations/5-add-imports-table.ts (2)
packages/db/code-migrations/helpers.ts (1)
  • getIsCluster (17-24)
packages/db/src/clickhouse/migration.ts (2)
  • createTable (68-116)
  • runClickhouseMigrationCommands (322-454)
packages/queue/src/queues.ts (1)
packages/redis/redis.ts (1)
  • getRedisQueue (93-105)
apps/worker/src/boot-workers.ts (2)
packages/queue/src/queues.ts (1)
  • importQueue (175-181)
apps/worker/src/jobs/import.ts (1)
  • importJob (27-283)
apps/start/src/components/project-mapper.tsx (2)
packages/db/src/services/project.service.ts (1)
  • IServiceProject (7-7)
apps/start/src/components/ui/select.tsx (5)
  • Select (173-173)
  • SelectTrigger (181-181)
  • SelectValue (182-182)
  • SelectContent (174-174)
  • SelectItem (176-176)
packages/importer/src/base-provider.ts (2)
packages/importer/src/types.ts (3)
  • BaseRawEvent (40-42)
  • ImportJobMetadata (69-73)
  • ErrorContext (45-51)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (4)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
  • Route (12-34)
packages/importer/src/providers/metadata.ts (1)
  • IMPORT_PROVIDERS (13-30)
apps/public/components/ui/tooltip.tsx (1)
  • Tooltiper (36-51)
apps/start/src/components/integrations/integration-card.tsx (3)
  • IntegrationCard (91-112)
  • IntegrationCardLogoImage (50-69)
  • IntegrationCardFooter (3-15)
apps/start/src/modals/add-import.tsx (3)
packages/validation/src/index.ts (5)
  • zUmamiImportConfig (452-454)
  • zMixpanelImportConfig (461-470)
  • IUmamiImportConfig (456-456)
  • IMixpanelImportConfig (471-471)
  • IImportConfig (473-476)
apps/start/src/components/project-mapper.tsx (1)
  • ProjectMapper (33-116)
apps/public/lib/utils.ts (1)
  • cn (4-6)
packages/common/server/parse-referrer.test.ts (1)
packages/common/server/parse-referrer.ts (2)
  • parseReferrer (17-26)
  • getReferrerWithQuery (28-66)
packages/importer/src/providers/mixpanel.ts (7)
packages/validation/src/index.ts (1)
  • IMixpanelImportConfig (471-471)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/db/src/services/event.service.ts (2)
  • IClickhouseEvent (68-102)
  • query (701-847)
packages/common/src/url.ts (2)
  • parsePath (11-49)
  • isSameDomain (51-63)
packages/common/server/parse-referrer.ts (2)
  • parseReferrer (17-26)
  • getReferrerWithQuery (28-66)
packages/common/server/parser-user-agent.ts (2)
  • parseUserAgent (72-115)
  • UserAgentInfo (71-71)
packages/common/src/object.ts (2)
  • toDots (3-36)
  • strip (48-48)
packages/importer/src/providers/umami.ts (5)
packages/validation/src/index.ts (1)
  • IUmamiImportConfig (456-456)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/db/src/services/event.service.ts (2)
  • IClickhouseEvent (68-102)
  • query (701-847)
packages/common/src/url.ts (2)
  • parsePath (11-49)
  • isSameDomain (51-63)
packages/common/server/parse-referrer.ts (2)
  • parseReferrer (17-26)
  • getReferrerWithQuery (28-66)
packages/db/src/services/import.service.ts (4)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
packages/db/src/clickhouse/client.ts (4)
  • ch (131-161)
  • TABLE_NAMES (48-60)
  • convertClickhouseDateToJs (238-240)
  • formatClickhouseDate (208-219)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
apps/worker/src/jobs/import.ts (9)
packages/queue/src/queues.ts (1)
  • ImportQueuePayload (168-173)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
packages/db/src/services/import.service.ts (10)
  • ImportSteps (671-671)
  • updateImportStatus (673-736)
  • getImportDateBounds (610-634)
  • insertImportBatch (28-55)
  • generateSessionIds (62-98)
  • createSessionsStartEndEvents (106-316)
  • moveImportsToProduction (322-426)
  • backfillSessionsToProduction (428-521)
  • markImportComplete (526-545)
  • getImportProgress (550-605)
packages/db/src/clickhouse/client.ts (2)
  • formatClickhouseDate (208-219)
  • error (40-45)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/importer/src/index.ts (2)
  • UmamiProvider (1-1)
  • MixpanelProvider (2-2)
packages/importer/src/providers/umami.ts (1)
  • UmamiProvider (59-382)
packages/importer/src/providers/mixpanel.ts (1)
  • MixpanelProvider (18-441)
packages/common/scripts/get-referrers.ts (1)
apps/worker/scripts/get-referrers.ts (1)
  • main (66-94)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-and-push-worker
  • GitHub Check: build-and-push-dashboard
  • GitHub Check: build-and-push-api
🔇 Additional comments (43)
packages/db/src/services/session.service.ts (1)

195-199: Incomplete scope: Performance concern applies to both session and event query services.

The dateIntervalInDays change (7 → 360 days) was applied to both packages/db/src/services/session.service.ts (lines 195-199) and packages/db/src/services/event.service.ts (lines 418-432) with identical logic. The original review only identifies the session service impact but misses the event service, effectively doubling the performance concern.

Both services use the same binary threshold (1 day for >1M events, 360 days otherwise) to filter ClickHouse queries, meaning most organizations will experience a 51x increase in the lookback window across both query surfaces.

Recommendations remain the same for both services:

  • Introduce intermediate tiers based on organization size
  • Make the interval configurable per organization
  • Add UI controls for explicit date range selection

Likely an incorrect or invalid review comment.
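
To make the tiered-interval recommendation above concrete, here is a hypothetical sketch. The function name, tier boundaries, and event-count thresholds are all illustrative assumptions, not the services' actual values:

```typescript
// Hypothetical tiered lookback window, replacing a binary
// 1-day / 360-day threshold. All thresholds are illustrative.
function dateIntervalInDays(totalEventCount: number): number {
  if (totalEventCount > 1_000_000) return 1; // very large orgs: tight window
  if (totalEventCount > 100_000) return 30; // large orgs
  if (totalEventCount > 10_000) return 90; // medium orgs
  return 360; // small orgs: full lookback
}
```

A per-organization override (stored alongside other org settings) could then replace the hard-coded tiers entirely.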

apps/start/src/modals/date-ranger-picker.tsx (1)

30-30: LGTM! Good UX improvement.

Adding captionLayout="dropdown" provides a more compact date navigation interface, especially useful in modal contexts where space is limited.

apps/api/package.json (1)

67-67: LGTM! Reasonable pinning for build tool.

Pinning the build tool (tsdown) to an exact version ensures consistent builds across environments. Since this is a devDependency, the trade-off of missing automatic updates is acceptable.

packages/db/src/buffers/event-buffer.test.ts (2)

13-23: LGTM! Test data aligned with production settings.

The clickhouseSettings constant mirrors the settings added to the production ClickHouse client, ensuring tests validate the actual insert behavior including async operations, timeouts, and performance optimizations.


142-142: LGTM! Consistent test expectations.

All insert expectations now include the clickhouse_settings field, ensuring tests verify that the proxy-wrapped insert method properly injects these settings.

Also applies to: 187-187, 519-519, 570-570, 615-615, 640-640, 688-688

packages/common/package.json (1)

8-9: LGTM! Script for generating referrer mappings.

The new gen:referrers script generates up-to-date referrer mappings from an external source, with automatic formatting applied to the output.

packages/db/src/clickhouse/client.ts (2)

59-59: LGTM! Table name for import tracking.

Adding events_imports to the table names registry supports the new import system introduced in this PR.


64-67: LGTM! Extended timeouts for large import operations.

The increased timeouts accommodate large data imports:

  • request_timeout: 60s → 300s (5 minutes)
  • idle_socket_ttl: 8s → 60s

These align with the max_execution_time: 300 setting in the insert configuration.

packages/redis/package.json (1)

11-11: Accept ioredis 5.8.2 upgrade with exact pinning.

ioredis 5.8.2 is the latest version, and the upgrade is safe. v5.8.2 (released 2025-10-21) contains only bug fixes: default IP family selection and CLIENT SETINFO command improvements—no breaking changes. No security vulnerabilities were detected. The exact version pin (removing the caret) trades automatic patch-level security updates for build reproducibility, which is a valid tradeoff for production stability.

apps/worker/src/jobs/events.incoming-events.test.ts (1)

100-103:

Referrer type normalization aligns with existing code patterns — no breaking query filters found.

The codebase already normalizes undefined/null referrer fields to empty strings via nullish coalescing (?? '') in event.service.ts. No queries or dashboards filtering on referrerType='unknown' were detected in the search. The change to normalize referrerType to '' in the test is consistent with the production code's existing behavior.

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)

45-45: LGTM! Clean addition of the Imports tab.

The new tab entry follows the existing pattern and integrates well with the settings navigation structure.

packages/trpc/package.json (1)

19-19: LGTM! Appropriate dependency addition.

Adding @openpanel/queue enables TRPC routes to interact with the import queue system, aligning with the new import workflow introduced in this PR.

packages/common/server/index.ts (1)

4-4: LGTM! Consistent re-export pattern.

The new re-export follows the existing pattern and appropriately expands the public API surface of the common/server module.

apps/worker/src/index.ts (1)

8-8: LGTM! ImportQueue properly integrated into Bull Board monitoring.

The addition of importQueue to the Bull Board UI enables visibility and monitoring of import jobs, which is essential for debugging and tracking the new import workflow.

Also applies to: 42-42

apps/start/src/modals/index.tsx (1)

8-8: LGTM! Modal registration follows established pattern.

The AddImport modal is properly registered in the modals registry, enabling the import configuration UI flow described in the PR objectives.

Also applies to: 41-41

package.json (1)

12-12: Verified: the common package correctly defines the gen:referrers script.

The script exists in packages/common/package.json and runs the referrer generation logic. The filter change from worker to common is valid and properly configured.

packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1)

1-2: Migration correctly implements nullable currentStep for import workflow resumption.

The codebase explicitly depends on currentStep being nullable: apps/worker/src/jobs/import.ts:56 checks if (record.currentStep === null) to identify new imports versus resumed imports. This null state allows the import job to distinguish between first-time executions (null) and resumptions from a previous step. The migration is correct as-is.

apps/start/package.json (1)

34-34: LGTM! Dependency addition aligns with the import feature.

The new @openpanel/importer workspace dependency is correctly added to support the import functionality introduced in this PR.

packages/trpc/src/root.ts (1)

7-7: LGTM! Import router correctly integrated.

The importRouter is properly imported and wired into the appRouter, following the established pattern for other routers in the application.

Also applies to: 44-44

packages/common/server/parser-user-agent.ts (1)

71-71: LGTM! Useful type export.

The new UserAgentInfo type export provides a clean type definition for consumers of parseUserAgent.

packages/common/server/parse-referrer.test.ts (1)

8-8: LGTM! Test expectations correctly updated.

The test expectations have been updated from 'unknown' to '' (empty string) to match the implementation changes in parse-referrer.ts. This is consistent with the new default behavior for unknown referrers.

Also applies to: 14-14, 44-44, 52-52, 67-67, 91-91, 99-99, 113-113

packages/importer/src/index.ts (1)

1-13: LGTM! Clean barrel export for importer package.

The index file properly exports the provider implementations and type definitions, establishing a clean public API for the importer package.

packages/queue/src/queues.ts (1)

168-181: LGTM! Import queue properly configured.

The new importQueue follows the established pattern for other queues in the codebase, with appropriate job retention settings (removeOnComplete: 10, removeOnFail: 50).

apps/worker/src/boot-workers.ts (1)

8-8: LGTM! Import worker correctly configured.

The importWorker is properly set up with:

  • Correct queue and job handler imports
  • Configurable concurrency via IMPORT_JOB_CONCURRENCY environment variable
  • Standard worker options

Also applies to: 23-23, 61-64
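
A minimal sketch of reading worker concurrency from the IMPORT_JOB_CONCURRENCY environment variable with a safe fallback. The default of 1 and the helper name are assumptions, not the worker's actual code:

```typescript
// Parse IMPORT_JOB_CONCURRENCY, falling back to 1 when the variable is
// missing or not a positive integer. The default value is an assumption.
function getImportConcurrency(
  env: Record<string, string | undefined> = process.env,
): number {
  const parsed = Number.parseInt(env.IMPORT_JOB_CONCURRENCY ?? '', 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 1;
}
```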

packages/common/server/parse-referrer.ts (3)

35-51: Improved referrer matching logic.

The updated matching logic is cleaner and more maintainable:

  • Source is now normalized to lowercase upfront
  • Matching checks direct key, .com suffix, and case-insensitive name lookup
  • Removes redundant fallback

The logic now properly handles case-insensitive matching for query parameters like utm_source=GoOgLe.
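
The three-step lookup can be sketched as follows. The referrers map and entry shape are illustrative stand-ins; the real table lives under packages/common/server/referrers:

```typescript
// Illustrative referrer lookup: direct key, then `.com` suffix, then a
// case-insensitive match on the display name.
type ReferrerEntry = { type: string; name: string };

const referrers: Record<string, ReferrerEntry> = {
  'google.com': { type: 'search', name: 'Google' },
  'news.ycombinator.com': { type: 'social', name: 'Hacker News' },
};

function matchSource(raw: string): ReferrerEntry | undefined {
  const source = raw.toLowerCase(); // normalize upfront
  return (
    referrers[source] ?? // direct key, e.g. "google.com"
    referrers[`${source}.com`] ?? // bare name, e.g. "google"
    Object.values(referrers).find(
      (entry) => entry.name.toLowerCase() === source, // e.g. "hacker news"
    )
  );
}
```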


23-23: No breaking change detected.

The code already defaults to empty strings for unknown referrers. Tests confirm this is the expected behavior, and no consumers in the codebase check for type === 'unknown'. The alleged change from 'unknown' to '' either does not exist or was already in place prior to this PR.

Likely an incorrect or invalid review comment.


1-1: Import paths are correct and resolve properly.

Both imports in packages/common/server/parse-referrer.ts successfully resolve:

  • Line 1: ../src/string → packages/common/src/string.ts
  • Line 3: ./referrers → packages/common/server/referrers/index.ts
packages/trpc/src/routers/import.ts (1)

101-133: LGTM!

Delete procedure correctly checks write permissions and safely removes queue job before deleting the record.

apps/worker/src/jobs/events.incoming-event.ts (2)

8-12: LGTM!

Import reorganization consolidates referrer utilities from @openpanel/common/server and correctly uses type-only import for Job.

Also applies to: 21-21, 23-23


120-122: Referrer priority adjusted — verify behavior.

baseEvent.referrer now exclusively uses referrer?.url (removing fallback to utmReferrer?.url), and referrerType prioritizes referrer?.type before utmReferrer?.type. This aligns with centralized referrer parsing, but ensure existing analytics flows handle the priority change correctly.

packages/db/prisma/schema.prisma (2)

470-475: LGTM!

ImportStatus enum defines a clear import lifecycle: pending → processing → completed/failed.


197-197: LGTM!

Project relation to imports is correctly defined and matches the Import model's foreign key.

apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (2)

88-122: LGTM!

getStatusBadge provides clear visual feedback with appropriate icons and tooltips for failed imports showing error details.


229-263: LGTM!

Progress display handles multiple scenarios (estimated counts, N/A, processing) with clear user communication via tooltips and progress bars.

apps/worker/src/jobs/import.ts (1)

285-298: LGTM!

createProvider factory cleanly instantiates provider-specific implementations and throws descriptive errors for unknown providers.

packages/importer/src/base-provider.ts (3)

4-31: LGTM!

Abstract base class defines a clear provider contract with streaming support (AsyncGenerator), transformation, validation, and event counting. Well-documented and type-safe.


33-58: LGTM!

Optional hooks provide extension points for providers without forcing implementation. Default behaviors (re-throw errors, return 0 for estimated total) are sensible.


78-120: LGTM!

getDateChunks utility correctly splits date ranges for batched processing, handles edge cases (same start/end date, chunks exceeding end date), and is well-documented.
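
An illustrative re-implementation of the chunking behaviour described above (day-granularity windows, last chunk clamped to the end date). The signature is an assumption; the real getDateChunks in base-provider.ts may differ:

```typescript
// Split [start, end] into windows of `chunkDays` days, clamping the last
// window to `end`. Day-granularity arithmetic; signature is an assumption.
const DAY_MS = 24 * 60 * 60 * 1000;

function getDateChunks(
  start: Date,
  end: Date,
  chunkDays: number,
): Array<{ from: Date; to: Date }> {
  const chunks: Array<{ from: Date; to: Date }> = [];
  let cursor = start.getTime();
  while (cursor <= end.getTime()) {
    const to = Math.min(cursor + (chunkDays - 1) * DAY_MS, end.getTime());
    chunks.push({ from: new Date(cursor), to: new Date(to) });
    cursor = to + DAY_MS;
  }
  return chunks;
}
```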

apps/start/src/modals/add-import.tsx (3)

44-107: LGTM!

UmamiImport form correctly uses zUmamiImportConfig schema, includes required fields (provider, type, fileUrl), and integrates ProjectMapper for optional mappings.


115-229: LGTM!

MixpanelImport form correctly implements required fields with appropriate input types (password for secret) and integrates date range picker for selecting import window.


253-259: LGTM!

Payload construction correctly passes the config from the forms, which already include the required provider and type fields set in their defaults (Lines 59-60 for Umami, Lines 124-125 for Mixpanel).

packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1)

1-14: The migration concerns are valid and already identified in the codebase.

The migration under review makes currentStep NOT NULL without backfilling data, and the very next migration (20251028150655_fix_imports_again) immediately reverts this change, indicating the problem was encountered. Additionally, another follow-up migration (20251028152531_imports_again) modifies currentBatch again.

Since these are part of an active development cycle on a newly created table (10 days old as of the migration date), consider consolidating these migrations:

  • Combine 20251028150123, 20251028150655, and 20251028152531 into a single migration that correctly handles the schema without intermediate reversals, or
  • If these are across separate PRs, ensure currentStep remains nullable (or add a backfill to populate existing NULLs before setting NOT NULL).

The current state requires the next migration to fix the previous one, which suggests the schema design should be finalized before deployment.

packages/db/src/services/import.service.ts (1)

682-691: Verify terminal state handling.

Lines 684-691 unconditionally set status: 'processing' when step is 'loading'. If an import has already reached a terminal state ('completed' or 'failed'), this would revert it. Confirm whether this is intentional for retry scenarios, or if a state check is needed.

If terminal states should be protected, add a guard:

const current = await db.import.findUnique({ 
  where: { id: importId },
  select: { status: true }
});
if (current?.status === 'completed' || current?.status === 'failed') {
  // Skip or throw depending on requirements
  return;
}

"type": "module",
"scripts": {
"testing": "pnpm dev",
"testing2": "pnpm dev",

⚠️ Potential issue | 🟡 Minor

Clarify the script rename from "testing" to "testing2".

The script key was renamed from "testing" to "testing2" without clear justification. This appears to be a development artifact rather than an intentional change for this PR.

If this rename is intentional, please clarify the purpose. Otherwise, consider reverting it or using a more descriptive name.

🤖 Prompt for AI Agents
In apps/start/package.json around line 6, the script key was renamed from
"testing" to "testing2" which looks like an accidental dev artifact; either
revert the key back to "testing" to preserve original functionality or replace
it with a descriptive name (e.g., "dev" or "start:testing") if the rename is
intentional, and update any references in documentation or CI that call the
script accordingly; if intentional, add a short comment in the PR description
explaining the reason for the new name.

Comment on lines +23 to +31
interface ProjectMapperProps {
fields: FieldArrayWithId<any, 'projectMapper', 'id'>[];
append: UseFieldArrayAppend<any, 'projectMapper'>;
remove: UseFieldArrayRemove;
projects: IServiceProject[];
register: UseFormRegister<any>;
watch: UseFormWatch<any>;
setValue: UseFormSetValue<any>;
}

🛠️ Refactor suggestion | 🟠 Major

Define a proper form type instead of using any.

The interface uses any for all generic type parameters, which defeats TypeScript's type safety. Define an explicit form schema type to enable proper type checking and autocomplete.

Apply this diff to add proper typing:

+interface ProjectMapperFormData {
+  projectMapper: Array<{ from: string; to: string }>;
+}
+
 interface ProjectMapperProps {
-  fields: FieldArrayWithId<any, 'projectMapper', 'id'>[];
-  append: UseFieldArrayAppend<any, 'projectMapper'>;
+  fields: FieldArrayWithId<ProjectMapperFormData, 'projectMapper', 'id'>[];
+  append: UseFieldArrayAppend<ProjectMapperFormData, 'projectMapper'>;
   remove: UseFieldArrayRemove;
   projects: IServiceProject[];
-  register: UseFormRegister<any>;
-  watch: UseFormWatch<any>;
-  setValue: UseFormSetValue<any>;
+  register: UseFormRegister<ProjectMapperFormData>;
+  watch: UseFormWatch<ProjectMapperFormData>;
+  setValue: UseFormSetValue<ProjectMapperFormData>;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
interface ProjectMapperProps {
fields: FieldArrayWithId<any, 'projectMapper', 'id'>[];
append: UseFieldArrayAppend<any, 'projectMapper'>;
remove: UseFieldArrayRemove;
projects: IServiceProject[];
register: UseFormRegister<any>;
watch: UseFormWatch<any>;
setValue: UseFormSetValue<any>;
}
interface ProjectMapperFormData {
projectMapper: Array<{ from: string; to: string }>;
}
interface ProjectMapperProps {
fields: FieldArrayWithId<ProjectMapperFormData, 'projectMapper', 'id'>[];
append: UseFieldArrayAppend<ProjectMapperFormData, 'projectMapper'>;
remove: UseFieldArrayRemove;
projects: IServiceProject[];
register: UseFormRegister<ProjectMapperFormData>;
watch: UseFormWatch<ProjectMapperFormData>;
setValue: UseFormSetValue<ProjectMapperFormData>;
}
🤖 Prompt for AI Agents
In apps/start/src/components/project-mapper.tsx around lines 23 to 31, the
ProjectMapperProps interface uses `any` for all react-hook-form generics;
replace these with a concrete form data type (e.g., define an exported
FormValues or ProjectForm type that describes fields used by the component,
including projectMapper: { id: string; <other fields> }[] and any nested
fields); update the interface to use that type in FieldArrayWithId<FormValues,
'projectMapper', 'id'>, UseFieldArrayAppend<FormValues, 'projectMapper'>,
UseFieldArrayRemove<FormValues>, UseFormRegister<FormValues>,
UseFormWatch<FormValues>, and UseFormSetValue<FormValues>; import or declare the
FormValues type near the top of the file (or from a shared types file) and
adjust any call sites to match the new type.

Comment on lines +82 to +98
<Select
value={watch(`projectMapper.${index}.to`)}
onValueChange={(value) =>
setValue(`projectMapper.${index}.to`, value)
}
>
<SelectTrigger className="mt-1 w-full" size="sm">
<SelectValue placeholder="Select project" />
</SelectTrigger>
<SelectContent>
{projects.slice(0, 10).map((project) => (
<SelectItem key={project.id} value={project.id}>
{project.name}
</SelectItem>
))}
</SelectContent>
</Select>

⚠️ Potential issue | 🟠 Major

Remove the arbitrary 10-project limit.

Line 92 limits the dropdown to only the first 10 projects, preventing users with more projects from mapping to those beyond the limit. The Select component should display all available projects with scrolling, or implement search/pagination if performance is a concern.

Apply this diff to show all projects:

                     <SelectContent>
-                      {projects.slice(0, 10).map((project) => (
+                      {projects.map((project) => (
                         <SelectItem key={project.id} value={project.id}>
                           {project.name}
                         </SelectItem>

If performance is a concern with many projects, consider adding a search input or ComboBox component instead of a plain Select.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<Select
value={watch(`projectMapper.${index}.to`)}
onValueChange={(value) =>
setValue(`projectMapper.${index}.to`, value)
}
>
<SelectTrigger className="mt-1 w-full" size="sm">
<SelectValue placeholder="Select project" />
</SelectTrigger>
<SelectContent>
{projects.slice(0, 10).map((project) => (
<SelectItem key={project.id} value={project.id}>
{project.name}
</SelectItem>
))}
</SelectContent>
</Select>
<Select
value={watch(`projectMapper.${index}.to`)}
onValueChange={(value) =>
setValue(`projectMapper.${index}.to`, value)
}
>
<SelectTrigger className="mt-1 w-full" size="sm">
<SelectValue placeholder="Select project" />
</SelectTrigger>
<SelectContent>
{projects.map((project) => (
<SelectItem key={project.id} value={project.id}>
{project.name}
</SelectItem>
))}
</SelectContent>
</Select>
🤖 Prompt for AI Agents
In apps/start/src/components/project-mapper.tsx around lines 82 to 98 the Select
is artificially limited by projects.slice(0, 10) which hides projects beyond the
first 10; remove the slice so the SelectContent maps over the entire projects
array (or replace with a virtualized list / searchable ComboBox if performance
is a concern) and ensure the SelectContent is allowed to scroll by keeping the
UI container styles intact.

Comment on lines +265 to +276
<TableCell>
<Tooltiper
content={
<pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
{JSON.stringify(imp.config, null, 2)}
</pre>
}
tooltipClassName="max-w-xs"
>
<Badge>Config</Badge>
</Tooltiper>
</TableCell>

⚠️ Potential issue | 🟠 Major

Config tooltip exposes sensitive credentials.

The config JSON (Line 269) contains API keys, secrets, and service credentials. This tooltip exposes them to anyone viewing the imports page. Redact sensitive fields or restrict based on user role.

  <Tooltiper
    content={
      <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
-        {JSON.stringify(imp.config, null, 2)}
+        {JSON.stringify(
+          {
+            ...imp.config,
+            apiKey: imp.config.apiKey ? '***' : undefined,
+            apiSecret: imp.config.apiSecret ? '***' : undefined,
+            serviceAccount: imp.config.serviceAccount ? '***' : undefined,
+            serviceSecret: imp.config.serviceSecret ? '***' : undefined,
+          },
+          null,
+          2
+        )}
      </pre>
    }
    tooltipClassName="max-w-xs"
  >
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<TableCell>
<Tooltiper
content={
<pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
{JSON.stringify(imp.config, null, 2)}
</pre>
}
tooltipClassName="max-w-xs"
>
<Badge>Config</Badge>
</Tooltiper>
</TableCell>
<TableCell>
<Tooltiper
content={
<pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
{JSON.stringify(
{
...imp.config,
apiKey: imp.config.apiKey ? '***' : undefined,
apiSecret: imp.config.apiSecret ? '***' : undefined,
serviceAccount: imp.config.serviceAccount ? '***' : undefined,
serviceSecret: imp.config.serviceSecret ? '***' : undefined,
},
null,
2
)}
</pre>
}
tooltipClassName="max-w-xs"
>
<Badge>Config</Badge>
</Tooltiper>
</TableCell>
🤖 Prompt for AI Agents
In
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
around lines 265 to 276, the tooltip currently renders the raw imp.config JSON
(line 269) which can expose API keys/secrets; update the rendering to first
sanitize/redact sensitive fields and/or gate the full config behind an
authorization check. Implement a small sanitizeConfig utility that deep-copies
the config and replaces known sensitive keys (e.g., apiKey, api_secret, secret,
password, token, credentials, privateKey, accessKey, secretAccessKey) with a
redacted placeholder, then pass the sanitized object to JSON.stringify for
display; additionally, wrap the tooltip render with a permission check so only
authorized roles (admins/owners) can see the unredacted config, otherwise show
the sanitized version or a notice that details are hidden.
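
A possible shape for the sanitizeConfig utility the prompt describes. The key list and the '***' placeholder are assumptions, not the project's actual implementation:

```typescript
// Recursively copy a config value, redacting entries whose keys look like
// credentials. Key list and placeholder are illustrative assumptions.
const SENSITIVE_KEYS = new Set([
  'apikey', 'apisecret', 'secret', 'password', 'token',
  'credentials', 'privatekey', 'accesskey', 'secretaccesskey',
  'serviceaccount', 'servicesecret',
]);

function sanitizeConfig(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(sanitizeConfig);
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([key, v]) =>
        SENSITIVE_KEYS.has(key.toLowerCase())
          ? [key, '***']
          : [key, sanitizeConfig(v)],
      ),
    );
  }
  return value; // primitives pass through unchanged
}
```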

Comment on lines +71 to +72
importWorker,
// eventsGroupWorker,

⚠️ Potential issue | 🔴 Critical

Critical: Events processing appears to be disabled.

The eventsGroupWorker is commented out in the workers array (line 72), which means event processing is effectively disabled. However, the worker is still instantiated (lines 34-48) and will continue to run, but won't be included in the lifecycle management (ready/error/closed handlers, graceful shutdown).

This appears to be an unintended change that would break event processing in production. Please verify:

  1. Is event processing intentionally disabled?
  2. Should eventsGroupWorker be added back to the workers array?
  3. Or should the worker instantiation be removed entirely if no longer needed?
  const workers = [
    sessionsWorker,
    cronWorker,
    notificationWorker,
    miscWorker,
    importWorker,
-    // eventsGroupWorker,
+    eventsGroupWorker,
  ];
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
  importWorker,
- // eventsGroupWorker,
+ eventsGroupWorker,
🤖 Prompt for AI Agents
In apps/worker/src/boot-workers.ts around lines 71 to 72, the eventsGroupWorker
is commented out from the workers array which leaves its instance running but
unmanaged by lifecycle handlers; either uncomment/add eventsGroupWorker back
into the workers array so it participates in ready/error/close and
graceful-shutdown flows, or if event processing is intentionally disabled remove
the eventsGroupWorker instantiation and related setup (and any imports) so there
is no orphaned running worker; pick one of these two fixes and apply it
consistently across imports, worker list, and shutdown/handler registration.

Comment on lines +1 to +4
-- AlterTable
ALTER TABLE "public"."imports" ALTER COLUMN "currentBatch" DROP NOT NULL,
ALTER COLUMN "currentBatch" DROP DEFAULT,
ALTER COLUMN "currentBatch" SET DATA TYPE TEXT;

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the relaxed constraints on currentBatch align with batch tracking requirements.

This migration makes three significant changes to currentBatch:

  1. Removes NOT NULL constraint (allows imports without batches)
  2. Removes DEFAULT value (no automatic batch assignment)
  3. Changes type to TEXT (allows flexible batch identifiers)

Please confirm that:

  • Imports are expected to exist without a currentBatch (nullable is correct)
  • The workflow handles missing batch values appropriately
  • TEXT is the appropriate type for your batch identifier scheme (e.g., UUIDs, timestamps, custom strings)

🏁 Script executed:

#!/bin/bash
# Description: Check how currentBatch is used in import processing logic

# Search for currentBatch usage and null checks
rg -nP --type=ts -C3 'currentBatch' -g '!*.gen.ts' -g '!node_modules'

Length of output: 4902



🏁 Script executed:

#!/bin/bash
# Find the Prisma schema and check currentBatch field definition
fd -e prisma

Length of output: 230


🏁 Script executed:

#!/bin/bash
# List migrations in order to find the one before this one
ls -1 packages/db/prisma/migrations/ | sort | tail -10

Length of output: 295


🏁 Script executed:

#!/bin/bash
# Check the imports table definition in the latest schema or migrations
find . -name '*.sql' -path '*/migrations/*' -exec grep -l 'currentBatch' {} \;

Length of output: 230



🏁 Script executed:

#!/bin/bash
# View the previous migration to understand what currentBatch type was
cat packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql

Length of output: 86


🏁 Script executed:

#!/bin/bash
# View the migration that added the import table initially
cat packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql

Length of output: 840


🏁 Script executed:

#!/bin/bash
# Search for schema.prisma in common locations
find packages/db -name 'schema.prisma' 2>/dev/null

Length of output: 32


🏁 Script executed:

#!/bin/bash
# Check what type currentBatch has in recent migrations
rg 'currentBatch' packages/db/prisma/migrations/ -A1 -B1

Length of output: 859



🏁 Script executed:

#!/bin/bash
# View the Prisma schema definition
cat packages/db/prisma/schema.prisma | grep -A 50 'model imports'

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check if there's a type generation process or codegen in Prisma
rg -A5 'generator|prisma' packages/db/prisma/schema.prisma | head -30

Length of output: 432


🏁 Script executed:

#!/bin/bash
# Check how batch parameter is used - what values are passed
rg -nP 'options\.batch|batch:' packages/db/src/ apps/worker/src/ -B2 -A2

Length of output: 4984



🏁 Script executed:

#!/bin/bash
# View full imports model from schema
rg -A 30 'model imports' packages/db/prisma/schema.prisma

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check generated Prisma types for imports
cat packages/db/src/generated/prisma/index.d.ts 2>/dev/null | grep -A 30 'interface imports' | head -40

Length of output: 0


🏁 Script executed:

#!/bin/bash
# View the imports.service.ts to see the UpdateImportStatusOptions type
rg -B5 -A10 'interface.*UpdateImportStatusOptions|type UpdateImportStatusOptions' packages/db/src/services/import.service.ts

Length of output: 357



🏁 Script executed:

#!/bin/bash
# Find and view the ImportProgress interface definition
rg -B3 -A8 'interface ImportProgress' packages/importer/src/types.ts

Length of output: 232


🏁 Script executed:

#!/bin/bash
# Check if there are any type errors or if the types are reconciled elsewhere
rg -C5 'currentBatch.*number|number.*currentBatch' packages/

Length of output: 530



🏁 Script executed:

#!/bin/bash
# Check where ImportProgress is used and how currentBatch is accessed
rg -C3 'ImportProgress|currentBatch' packages/importer/src/ apps/

Length of output: 3534


Fix TypeScript type definition for currentBatch in ImportProgress interface.

The migration correctly changes currentBatch from INTEGER NOT NULL DEFAULT 0 to TEXT nullable with no default. However, the TypeScript type definition needs to be updated:

  • File: packages/importer/src/types.ts, line 23
  • Current: currentBatch: number;
  • Should be: currentBatch: string | null;

Reason: The migration stores TEXT values (timestamps, batch identifiers), and the code already handles null batches. The type should reflect this reality.

🤖 Prompt for AI Agents
In packages/importer/src/types.ts around line 23, the ImportProgress interface
currently declares currentBatch: number; but the DB migration changes
currentBatch to nullable TEXT; update the TypeScript definition to currentBatch:
string | null; so the type matches stored values (text/batch IDs or null) and
adjust any places expecting a number accordingly.

Comment on lines +477 to +498
model Import {
id String @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid
projectId String
project Project @relation(fields: [projectId], references: [id], onDelete: Cascade)
jobId String? // BullMQ job ID
status ImportStatus
statusMessage String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs"
errorMessage String?
/// [IPrismaImportConfig]
config Json
totalEvents Int @default(0)
processedEvents Int @default(0)
currentStep String?
currentBatch String? // String date 2020-01-01
createdAt DateTime @default(now())
completedAt DateTime?
updatedAt DateTime @default(now()) @updatedAt
@@map("imports")
}

⚠️ Potential issue | 🟠 Major

Inconsistency: currentStep nullable in schema but NOT NULL in migration.

Line 489 defines currentStep as String? (optional), but migration 20251028150123_fix_imports_table/migration.sql sets it to NOT NULL. Align schema and migration:

  • If imports always have a step, make it required: currentStep String
  • If it's optional during creation, remove NOT NULL constraint from migration
- currentStep     String?
+ currentStep     String

Or revert the migration's SET NOT NULL constraint.

📝 Committable suggestion

Suggested change
model Import {
  id String @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid
  projectId String
  project Project @relation(fields: [projectId], references: [id], onDelete: Cascade)
  jobId String? // BullMQ job ID
  status ImportStatus
  statusMessage String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs"
  errorMessage String?
  /// [IPrismaImportConfig]
  config Json
  totalEvents Int @default(0)
  processedEvents Int @default(0)
- currentStep String?
+ currentStep String
  currentBatch String? // String date 2020-01-01
  createdAt DateTime @default(now())
  completedAt DateTime?
  updatedAt DateTime @default(now()) @updatedAt
  @@map("imports")
}
🤖 Prompt for AI Agents
In packages/db/prisma/schema.prisma around lines 477 to 496, the model Import
declares currentStep as nullable (String?) while the migration
20251028150123_fix_imports_table/migration.sql sets current_step to NOT NULL;
pick one and make them consistent: if imports must always have a step, change
schema currentStep to non-nullable (currentStep String) and run prisma migrate
to regenerate the SQL; otherwise, edit the migration SQL to remove the SET NOT
NULL on current_step (or create a new migration that drops the NOT NULL
constraint) so the column remains nullable, then reapply migrations.

Comment on lines 505 to 506
e.import_id = ${sqlstring.escape(importId)}
AND toDate(e.created_at) >= ${sqlstring.escape(from)}

🛠️ Refactor suggestion | 🟠 Major

Inconsistent parameter handling.

Lines 505-506 use sqlstring.escape() for inline parameter substitution, while the rest of the codebase uses ClickHouse's query_params mechanism (e.g., lines 89-90, 416-417, 534-535). This inconsistency reduces maintainability and may bypass ClickHouse's optimizations.

Refactor to use query_params:

+  const sessionsInsertQuery = `
     ...
     WHERE 
-      e.import_id = ${sqlstring.escape(importId)}
-      AND toDate(e.created_at) >= ${sqlstring.escape(from)}
+      e.import_id = {importId:String}
+      AND toDate(e.created_at) >= {from:String}
       AND e.session_id != ''
     GROUP BY e.session_id
   `;

   await ch.command({
     query: sessionsInsertQuery,
+    query_params: { importId, from },
     clickhouse_settings: {

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In packages/db/src/services/import.service.ts around lines 505-506, the SQL
currently inlines e.import_id = ${sqlstring.escape(importId)} AND
toDate(e.created_at) >= ${sqlstring.escape(from)}; replace these inline
sqlstring.escape substitutions with ClickHouse query parameter placeholders and
pass the corresponding parameters in the same query_params structure used
elsewhere in the file (e.g., add parameters for importId and from in the query
call), using the same parameter naming and types as other queries (e.g.,
{importId: String} and {from: DateTime}/Date) so the SQL becomes e.import_id =
{importId:String} AND toDate(e.created_at) >= {from:Date} (or matching the
project's param type) and include those values in the query_params array/object
for the query execution.
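The parameterized form can be sketched as a pure query builder, assuming `@clickhouse/client`-style `{name:Type}` placeholders; `buildSessionsQuery` is a hypothetical helper, not code from the PR:

```typescript
// Hypothetical builder returning the query plus its bound parameters,
// instead of inlining values with sqlstring.escape().
function buildSessionsQuery(importId: string, from: string) {
  return {
    query: `
      SELECT e.session_id
      FROM events_imports e
      WHERE e.import_id = {importId:String}
        AND toDate(e.created_at) >= {from:Date}
        AND e.session_id != ''
      GROUP BY e.session_id
    `,
    query_params: { importId, from },
  };
}
```

The result would then be spread into the client call, e.g. `await ch.command({ ...buildSessionsQuery(importId, from), clickhouse_settings })`.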

Comment on lines +85 to +91
this.logger?.info('Fetching events from Mixpanel', {
url: `${url}?${params}`,
from,
to,
projectId,
serviceAccount,
});

⚠️ Potential issue | 🟠 Major

Sensitive credentials logged.

Line 90 logs the serviceAccount credential. Even though the secret is not logged, exposing account identifiers in logs can aid attackers in credential stuffing or enumeration attacks.

Consider removing sensitive fields from logs:

 this.logger?.info('Fetching events from Mixpanel', {
   url: `${url}?${params}`,
   from,
   to,
   projectId,
-  serviceAccount,
 });
📝 Committable suggestion

Suggested change
  this.logger?.info('Fetching events from Mixpanel', {
    url: `${url}?${params}`,
    from,
    to,
    projectId,
-   serviceAccount,
  });
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 85 to 91, the logger
call includes the serviceAccount identifier which is considered sensitive;
remove serviceAccount from the logged metadata and only log non-sensitive
context (url, from, to, projectId) or replace serviceAccount with a
redacted/boolean flag (e.g., serviceAccountPresent: true) if presence needs to
be recorded; update the logger invocation accordingly and run a quick search to
ensure no other logs emit service account identifiers.

Comment on lines +160 to +176
// Add new job to queue
const job = await importQueue.add('import', {
type: 'import',
payload: {
importId: importRecord.id,
},
});

// Update import record
return db.import.update({
where: { id: importRecord.id },
data: {
jobId: job.id,
status: 'pending',
errorMessage: null,
},
});

⚠️ Potential issue | 🟠 Major

Wrap retry enqueue in try/catch.

Same issue as create: if importQueue.add fails, the record is updated to 'pending' but no job exists. Wrap in try/catch and revert status on failure.

      // Add new job to queue
-      const job = await importQueue.add('import', {
-        type: 'import',
-        payload: {
-          importId: importRecord.id,
-        },
-      });
-
-      // Update import record
-      return db.import.update({
-        where: { id: importRecord.id },
-        data: {
-          jobId: job.id,
-          status: 'pending',
-          errorMessage: null,
-        },
-      });
+      try {
+        const job = await importQueue.add(
+          'import',
+          { type: 'import', payload: { importId: importRecord.id } },
+          { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+        );
+        return db.import.update({
+          where: { id: importRecord.id },
+          data: {
+            jobId: job.id,
+            status: 'pending',
+            errorMessage: null,
+          },
+        });
+      } catch (err) {
+        throw new Error('Failed to enqueue retry job: ' + (err instanceof Error ? err.message : String(err)));
+      }
    }),
📝 Committable suggestion

Suggested change
  // Add new job to queue
- const job = await importQueue.add('import', {
-   type: 'import',
-   payload: {
-     importId: importRecord.id,
-   },
- });
-
- // Update import record
- return db.import.update({
-   where: { id: importRecord.id },
-   data: {
-     jobId: job.id,
-     status: 'pending',
-     errorMessage: null,
-   },
- });
+ try {
+   const job = await importQueue.add(
+     'import',
+     { type: 'import', payload: { importId: importRecord.id } },
+     { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+   );
+   return db.import.update({
+     where: { id: importRecord.id },
+     data: {
+       jobId: job.id,
+       status: 'pending',
+       errorMessage: null,
+     },
+   });
+ } catch (err) {
+   throw new Error('Failed to enqueue retry job: ' + (err instanceof Error ? err.message : String(err)));
+ }
🤖 Prompt for AI Agents
In packages/trpc/src/routers/import.ts around lines 160 to 176, the code calls
importQueue.add and then unconditionally updates the DB record to status
'pending', which can leave the DB pointing to a non‑existent job if add fails;
wrap the call to importQueue.add in a try/catch, only update the import record
to 'pending' and set jobId after a successful add, and in the catch update the
import record to a terminal state (e.g., 'failed') with errorMessage set to the
caught error (or revert to the previous status if tracked); ensure the DB
updates are awaited and consider doing the job add and record update in a
transaction or in the correct sequence so no pending state is recorded when
enqueueing fails.
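The enqueue-then-update ordering can be isolated as a small helper: the record is only marked `pending` once the job exists, and a failed enqueue records the error instead. This is a minimal sketch with hypothetical names (`retryImport`, `ImportRow`), not the PR's actual code:

```typescript
type Status = 'created' | 'pending' | 'failed';

interface ImportRow {
  id: string;
  status: Status;
  jobId?: string;
  errorMessage?: string;
}

// Only transition to 'pending' after the queue confirms the job;
// otherwise mark the row 'failed' with the enqueue error.
async function retryImport(
  record: ImportRow,
  enqueue: (importId: string) => Promise<{ id: string }>,
): Promise<ImportRow> {
  try {
    const job = await enqueue(record.id);
    return { ...record, jobId: job.id, status: 'pending', errorMessage: undefined };
  } catch (err) {
    return {
      ...record,
      status: 'failed',
      errorMessage: err instanceof Error ? err.message : String(err),
    };
  }
}
```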

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
packages/trpc/src/routers/auth.ts (2)

227-229: Remove sensitive console logs from auth flow.

Logging session objects and cookie setters risks leaking tokens/PII.

Apply this diff:

-      console.log('session', session);
       setSessionTokenCookie(ctx.setCookie, token, session.expiresAt);
-      console.log('ctx.setCookie', ctx.setCookie);

201-203: Avoid credential enumeration; return a generic auth error.

“User does not exists” reveals account presence. Use a single generic message.

Apply this diff:

-import { TRPCAccessError, TRPCNotFoundError } from '../errors';
+import { TRPCAccessError, TRPCNotFoundError } from '../errors';
@@
-      if (!user) {
-        throw TRPCNotFoundError('User does not exists');
-      }
+      if (!user) {
+        throw TRPCAccessError('Incorrect email or password');
+      }
Note: If you prefer stricter semantics, you can also change the other branch’s incorrect-password error to the same message to keep responses uniform.

Also applies to: 31-31

apps/worker/src/jobs/events.incoming-event.ts (1)

39-44: Do not log full event payload; redact to avoid PII leakage.

Logging entire payload and headers can include emails, IDs, and custom properties. Log only identifiers and sizes.

-  logger.info('Creating event', { event: payload, jobData });
+  logger.info('Creating event', {
+    name: payload.name,
+    projectId: payload.projectId,
+    profileId: payload.profileId,
+    sessionId: (payload as any).sessionId,
+    deviceId: (payload as any).deviceId,
+    propertiesSize: Object.keys(payload.properties ?? {}).length,
+  });
♻️ Duplicate comments (31)
apps/start/src/components/project-mapper.tsx (2)

23-31: Define a proper form type instead of using any.

The interface uses any for all generic type parameters, which defeats TypeScript's type safety. Define an explicit form schema type to enable proper type checking and autocomplete.


92-92: Remove the arbitrary 10-project limit.

The slice(0, 10) limits the dropdown to only the first 10 projects, preventing users with more projects from mapping to those beyond the limit. The Select component should display all available projects with scrolling, or implement search/pagination if performance is a concern.

packages/common/scripts/get-referrers.ts (1)

74-74: Fix incorrect output path.

The script writes to ../../worker/src/referrers/index.ts, but from packages/common/scripts/, this resolves to an unexpected location outside the package. The corresponding format command in package.json expects ./server/referrers/index.ts relative to the package root.

Apply this diff to fix the output path:

-      path.resolve(__dirname, '../../worker/src/referrers/index.ts'),
+      path.resolve(__dirname, '../server/referrers/index.ts'),
packages/db/code-migrations/5-add-imports-table.ts (2)

1-2: Fix __filename usage in ESM module.

The package is configured as ESM ("type": "module"), and using __filename directly will fail at runtime. Import fileURLToPath from node:url and derive __filename properly.

Apply this diff:

 import fs from 'node:fs';
 import path from 'node:path';
+import { fileURLToPath } from 'node:url';
 import { TABLE_NAMES } from '../src/clickhouse/client';
 import {
   chMigrationClient,
   createTable,
   runClickhouseMigrationCommands,
 } from '../src/clickhouse/migration';
 import { getIsCluster } from './helpers';
 
+const __filename = fileURLToPath(import.meta.url);
+
 export async function up() {

Also applies to: 72-72


65-69: TTL on Distributed table will fail in clustered mode.

ALTER TABLE events_imports MODIFY TTL will fail or be a no-op on the Distributed table in clustered deployments. TTL must be defined on the local Replicated MergeTree table, preferably in the CREATE TABLE statement.

Consider extending createTable to support TTL parameter and include it in the CREATE TABLE statement, or target the replicated table explicitly with ON CLUSTER '{cluster}' if the ALTER approach is retained.

packages/validation/src/index.ts (1)

478-486: Discriminate by provider to prevent mismatched payloads.

The current schema allows provider and config to disagree (e.g., provider='umami' with a Mixpanel config). This was flagged in a previous review and remains unresolved.

Apply a discriminated union on provider so the config shape is tied to the selected provider:

-export const zCreateImport = z.object({
-  projectId: z.string().min(1),
-  provider: z.enum(['umami', 'plausible', 'mixpanel']),
-  config: z.union([
-    zUmamiImportConfig,
-    zPlausibleImportConfig,
-    zMixpanelImportConfig,
-  ]),
-});
+export const zCreateImport = z.discriminatedUnion('provider', [
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('umami'),
+    config: zUmamiImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('plausible'),
+    config: zPlausibleImportConfig,
+  }),
+  z.object({
+    projectId: z.string().min(1),
+    provider: z.literal('mixpanel'),
+    config: zMixpanelImportConfig,
+  }),
+]);
packages/importer/src/types.ts (1)

53-66: EventProperties allows nested objects, but ClickHouse requires primitives only.

The EventProperties interface (line 61) allows Record<string, unknown> and defines __query as Record<string, unknown> (line 62). This conflicts with ClickHouse's IClickhouseEvent.properties which must be Record<string, string|number|boolean|undefined|null>. Storing nested objects will break ingestion unless stringified. This was flagged in a previous review and remains unresolved.

Either constrain EventProperties to primitives only:

 export interface EventProperties {
   [key: string]:
     | string
     | number
     | boolean
     | null
-    | undefined
-    | Record<string, unknown>;
-  __query?: Record<string, unknown>;
+    | undefined;
+  __query?: string; // JSON.stringify before storing
   __title?: string;
   __screen?: string;
   __language?: string;
 }

Or explicitly handle JSON stringification for nested fields before they reach ClickHouse.

packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)

10-13: Data loss risk: backfill config before dropping columns.

Migrate provider/source fields into config first, then DROP. Example:

+-- Backfill config from legacy columns before dropping them
+UPDATE "public"."imports" i
+SET "config" = COALESCE(
+  jsonb_build_object(
+    'provider', i."provider",
+    'sourceType', i."sourceType",
+    'sourceLocation', i."sourceLocation"
+  ) || COALESCE(i."config", '{}'::jsonb),
+  '{}'::jsonb
+)
+WHERE i."provider" IS NOT NULL
+   OR i."sourceType" IS NOT NULL
+   OR i."sourceLocation" IS NOT NULL;
+
 -- AlterTable
 ALTER TABLE "public"."imports" DROP COLUMN "provider",
 DROP COLUMN "sourceLocation",
 DROP COLUMN "sourceType",
 ALTER COLUMN "config" DROP DEFAULT;
packages/importer/vitest.config.ts (1)

3-3: ESM: __dirname is undefined. Compute from import.meta.url.

Apply:

+import { dirname } from 'node:path';
+import { fileURLToPath } from 'node:url';
 import { getSharedVitestConfig } from '../../vitest.shared';
 
-export default getSharedVitestConfig({ __dirname });
+export default getSharedVitestConfig({
+  __dirname: dirname(fileURLToPath(import.meta.url)),
+});
apps/worker/src/boot-workers.ts (2)

66-73: eventsGroupWorker is unmanaged; either add it back to lifecycle or remove it entirely.

Currently it runs but is excluded from handlers/close, risking leaks and unobserved failures. Choose one path and apply consistently.

Option A — manage it with the others (recommended):

   const workers = [
     sessionsWorker,
     cronWorker,
     notificationWorker,
     miscWorker,
     importWorker,
-    // eventsGroupWorker,
+    // GroupMQ worker participates in lifecycle
+    // (we still keep eventsGroupWorker.run(); above)
+    eventsGroupWorker as unknown as Worker,
   ];

Option B — if intentionally disabled, remove it completely:

-  const eventsGroupWorker = new GroupWorker<...>({ ... });
-  eventsGroupWorker.run();
+  // eventsGroupWorker intentionally disabled; remove instantiation to avoid orphaned worker.

155-166: Always use graceful shutdown; remove the dev fast-exit path.

Avoid skipping worker close/queue drain in dev.

-        if (process.env.NODE_ENV === 'production') {
-          exitHandler(evt, code);
-        } else {
-          logger.info('Shutting down for development', {
-            event: evt,
-            code,
-          });
-          process.exit(0);
-        }
+        exitHandler(evt, code);
packages/db/prisma/schema.prisma (1)

489-489: Schema/migration mismatch on currentStep.

Schema has currentStep String? but the migration sets it NOT NULL. Keep it nullable or update both consistently. This also aligns with import.ts where currentStep === null denotes a new import.

packages/trpc/src/routers/import.ts (4)

24-31: Sensitive config exposure in list — redact for read-level access.

Return imports without leaking config to read-only members. Gate by role and null-out config when needed.

Apply:

-      return db.import.findMany({
-        where: {
-          projectId: input.projectId,
-        },
-        orderBy: {
-          createdAt: 'desc',
-        },
-      });
+      const rows = await db.import.findMany({
+        where: { projectId: input.projectId },
+        orderBy: { createdAt: 'desc' },
+      });
+      const canSeeSecrets =
+        typeof access !== 'boolean' && access.level !== 'read';
+      return canSeeSecrets ? rows : rows.map((r) => ({ ...r, config: null }));

55-56: Sensitive config exposure in get — redact for read-level access.

Avoid returning secrets to read-only members.

Apply:

-      return importRecord;
+      const canSeeSecrets =
+        typeof access !== 'boolean' && access.level !== 'read';
+      return canSeeSecrets ? importRecord : { ...importRecord, config: null };

81-99: Enqueue not guarded — pending record can orphan without a job.

Wrap queue add in try/catch, use stable jobId, attempts, and mark DB failed on error.

Apply:

-      // Add job to queue
-      const job = await importQueue.add('import', {
-        type: 'import',
-        payload: {
-          importId: importRecord.id,
-        },
-      });
-
-      // Update import record with job ID
-      await db.import.update({
-        where: { id: importRecord.id },
-        data: { jobId: job.id },
-      });
+      // Add job to queue
+      try {
+        const job = await importQueue.add(
+          'import',
+          { type: 'import', payload: { importId: importRecord.id } },
+          { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+        );
+        await db.import.update({
+          where: { id: importRecord.id },
+          data: { jobId: job.id },
+        });
+      } catch (err) {
+        await db.import.update({
+          where: { id: importRecord.id },
+          data: {
+            status: 'failed',
+            errorMessage: err instanceof Error ? err.message : 'Queue enqueue failed',
+          },
+        });
+        throw err;
+      }

160-176: Retry enqueue not guarded — can set 'pending' without job.

Same reliability issue: wrap in try/catch and only update status on success.

Apply:

-      // Add new job to queue
-      const job = await importQueue.add('import', {
-        type: 'import',
-        payload: {
-          importId: importRecord.id,
-        },
-      });
-
-      // Update import record
-      return db.import.update({
-        where: { id: importRecord.id },
-        data: {
-          jobId: job.id,
-          status: 'pending',
-          errorMessage: null,
-        },
-      });
+      // Add new job to queue
+      try {
+        const job = await importQueue.add(
+          'import',
+          { type: 'import', payload: { importId: importRecord.id } },
+          { jobId: importRecord.id, attempts: 3, backoff: { type: 'exponential', delay: 30_000 } },
+        );
+        return db.import.update({
+          where: { id: importRecord.id },
+          data: { jobId: job.id, status: 'pending', errorMessage: null },
+        });
+      } catch (err) {
+        throw new Error(
+          'Failed to enqueue retry job: ' +
+            (err instanceof Error ? err.message : String(err)),
+        );
+      }
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1)

265-276: Config tooltip exposes credentials — sanitize/redact before display.

Redact known secret keys and handle null config.

Apply:

+const redactConfig = (cfg: any) => {
+  if (!cfg) return null;
+  const secretKeys = new Set([
+    'apiKey','apiSecret','serviceAccount','serviceSecret','token','password',
+    'secret','privateKey','accessKey','secretAccessKey',
+  ]);
+  const walk = (obj: any): any =>
+    Object.fromEntries(
+      Object.entries(obj).map(([k, v]) => [
+        k,
+        v && typeof v === 'object' ? walk(v) : secretKeys.has(k) ? '***' : v,
+      ]),
+    );
+  return walk(cfg);
+};
...
                   <Tooltiper
                     content={
                       <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
-                          {JSON.stringify(imp.config, null, 2)}
+                          {JSON.stringify(redactConfig(imp.config), null, 2)}
                       </pre>
                     }
                     tooltipClassName="max-w-xs"
                   >
                     <Badge>Config</Badge>
                   </Tooltiper>
apps/worker/src/jobs/import.ts (2)

129-137: Avoid logging full raw events (PII leak).

Log minimal reason/context only.

Apply:

-        ) {
-          jobLogger.warn('Skipping invalid event', { rawEvent });
+        ) {
+          jobLogger.warn('Skipping invalid event', {
+            provider: providerInstance.provider,
+            reason: 'validation_failed',
+          });
           continue;
         }

251-258: Sync final Postgres counters with ClickHouse progress for UI.

Update totalEvents/processedEvents on completion.

Apply:

     const finalProgress = await getImportProgress(importId);
@@
     jobLogger.info('Import job completed successfully', {
       totalEvents: finalProgress.totalEvents,
       insertedEvents: finalProgress.insertedEvents,
       status: finalProgress.status,
     });
+
+    await db.import.update({
+      where: { id: importId },
+      data: {
+        totalEvents: finalProgress.totalEvents,
+        processedEvents: finalProgress.insertedEvents,
+      },
+    });
packages/importer/src/providers/umami.ts (6)

1-15: Add SSRF guard before fetching remote files.

Validate scheme and resolve host to block private/loopback ranges.

Apply:

 import { randomUUID } from 'node:crypto';
 import { Readable } from 'node:stream';
 import { pipeline } from 'node:stream/promises';
 import { createBrotliDecompress, createGunzip } from 'node:zlib';
+import { lookup } from 'node:dns/promises';
+import { isIP } from 'node:net';
 import { isSameDomain, parsePath } from '@openpanel/common';
 import { generateDeviceId } from '@openpanel/common/server';
 import { getReferrerWithQuery, parseReferrer } from '@openpanel/common/server';
 import type { IClickhouseEvent } from '@openpanel/db';
 import type { ILogger } from '@openpanel/logger';
 import type { IUmamiImportConfig } from '@openpanel/validation';
 import { parse } from 'csv-parse';
-import { assocPath } from 'ramda';
 import { z } from 'zod';
 import { BaseImportProvider } from '../base-provider';
+
+async function assertRemoteUrlAllowed(target: string) {
+  const u = new URL(target);
+  if (!/^https?:$/i.test(u.protocol)) {
+    throw new Error('Only http/https URLs are allowed');
+  }
+  const addrs = await lookup(u.hostname, { all: true, verbatim: true });
+  const isPrivate = (addr: string) =>
+    addr === '127.0.0.1' ||
+    addr === '::1' ||
+    addr.startsWith('10.') ||
+    /^172\.(1[6-9]|2\d|3[0-1])\./.test(addr) ||
+    addr.startsWith('192.168.') ||
+    addr.startsWith('fc') ||
+    addr.startsWith('fd') ||
+    addr.startsWith('fe80:');
+  for (const { address } of addrs) {
+    if (isIP(address) && isPrivate(address)) {
+      throw new Error('Blocked private address');
+    }
+  }
+}

79-87: Call SSRF guard before fetch.

Block unsafe targets early.

Apply:

   private async *parseRemoteFile(
     url: string,
@@
   ): AsyncGenerator<UmamiRawEvent, void, unknown> {
     const { signal, maxBytes, maxRows } = opts;
     const controller = new AbortController();
@@
-    const res = await fetch(url, { signal: controller.signal });
+    await assertRemoteUrlAllowed(url);
+    const res = await fetch(url, { signal: controller.signal });

Also applies to: 97-103


213-221: Device ID fallback is invalid; use stable available identifiers.

Stop passing visit_id as ip/ua.

Apply:

-    const deviceId =
-      rawEvent.visit_id ||
-      generateDeviceId({
-        ip: rawEvent.visit_id!,
-        ua: rawEvent.visit_id!,
-        origin: projectId,
-        salt: 'xxx',
-      });
+    const deviceId =
+      rawEvent.visit_id ||
+      rawEvent.distinct_id ||
+      rawEvent.session_id ||
+      rawEvent.event_id ||
+      randomUUID();

224-229: Fix URL construction (require hostname, prefix query with '?').

Avoid invalid URLs.

Apply:

-    const url = rawEvent.url_path
-      ? `https://${[rawEvent.hostname, rawEvent.url_path, rawEvent.url_query]
-          .filter(Boolean)
-          .join('')}`
-      : '';
+    let url = '';
+    if (rawEvent.hostname && rawEvent.url_path) {
+      const q = rawEvent.url_query
+        ? rawEvent.url_query.startsWith('?')
+          ? rawEvent.url_query
+          : `?${rawEvent.url_query}`
+        : '';
+      url = `https://${rawEvent.hostname}${rawEvent.url_path}${q}`;
+    }

231-234: Include referrer_query in referrer URL.

Preserve query parameters.

Apply:

-    const referrerUrl = rawEvent.referrer_domain
-      ? `https://${rawEvent.referrer_domain}${rawEvent.referrer_path || ''}`
-      : '';
+    const refQuery = rawEvent.referrer_query
+      ? rawEvent.referrer_query.startsWith('?')
+        ? rawEvent.referrer_query
+        : `?${rawEvent.referrer_query}`
+      : '';
+    const referrerUrl = rawEvent.referrer_domain
+      ? `https://${rawEvent.referrer_domain}${rawEvent.referrer_path || ''}${refQuery}`
+      : '';

257-297: Flatten properties to primitives; ClickHouse rejects nested objects.

Replace nested __query and assocPath usage with flat primitive keys.

Apply:

-    let properties: Record<string, any> = {};
-
-    if (query) {
-      properties.__query = query;
-    }
+    const properties: Record<
+      string,
+      string | number | boolean | null | undefined
+    > = {};
+    if (query) {
+      for (const [k, v] of Object.entries(query)) {
+        properties[`query.${k}`] = v as string;
+      }
+    }
@@
-    if (rawEvent.utm_source)
-      properties = assocPath(
-        ['__query', 'utm_source'],
-        rawEvent.utm_source,
-        properties,
-      );
-    if (rawEvent.utm_medium)
-      properties = assocPath(
-        ['__query', 'utm_medium'],
-        rawEvent.utm_medium,
-        properties,
-      );
-    if (rawEvent.utm_campaign)
-      properties = assocPath(
-        ['__query', 'utm_campaign'],
-        rawEvent.utm_campaign,
-        properties,
-      );
-    if (rawEvent.utm_content)
-      properties = assocPath(
-        ['__query', 'utm_content'],
-        rawEvent.utm_content,
-        properties,
-      );
-    if (rawEvent.utm_term)
-      properties = assocPath(
-        ['__query', 'utm_term'],
-        rawEvent.utm_term,
-        properties,
-      );
+    if (rawEvent.utm_source) properties.utm_source = rawEvent.utm_source;
+    if (rawEvent.utm_medium) properties.utm_medium = rawEvent.utm_medium;
+    if (rawEvent.utm_campaign) properties.utm_campaign = rawEvent.utm_campaign;
+    if (rawEvent.utm_content) properties.utm_content = rawEvent.utm_content;
+    if (rawEvent.utm_term) properties.utm_term = rawEvent.utm_term;
apps/start/src/modals/add-import.tsx (1)

245-249: Avoid leaking upstream error details to end users.

Use a generic error toast; log details server-side.

Apply:

-      onError: (error) => {
-        toast.error('Import failed', {
-          description: error.message,
-        });
-      },
+      onError: () => {
+        toast.error('Import failed', {
+          description: 'We could not start the import. Please try again later.',
+        });
+      },
packages/importer/src/providers/mixpanel.ts (1)

85-91: Sensitive credentials logged.

Line 90 logs the serviceAccount credential, which can aid attackers in credential stuffing or enumeration attacks.

packages/db/src/services/import.service.ts (4)

256-256: Risk of timestamp collisions with existing events.

Lines 256 and 293 adjust timestamps by exactly ±1000ms (1 second), which could create timestamp conflicts with real events at those exact timestamps, leading to non-deterministic event ordering.

Also applies to: 293-293
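
One way to avoid fixed-offset collisions is to derive a deterministic sub-second offset per session instead of a constant ±1000ms. This is an illustrative sketch only, not the repo's code; the function names and the 1–999ms range are assumptions:

```typescript
// Sketch: map the session ID to a deterministic 1..999 ms offset so
// boundary events rarely land on a real event's exact timestamp and
// ordering stays reproducible across re-runs.
import { createHash } from 'node:crypto';

function boundaryOffsetMs(sessionId: string): number {
  const digest = createHash('sha256').update(sessionId).digest();
  return 1 + (digest.readUInt32BE(0) % 999); // always within one second
}

function sessionStartTs(firstEventTs: number, sessionId: string): number {
  return firstEventTs - boundaryOffsetMs(sessionId);
}

function sessionEndTs(lastEventTs: number, sessionId: string): number {
  return lastEventTs + boundaryOffsetMs(sessionId);
}

const ts = Date.parse('2025-01-01T12:00:00.500Z');
const start = sessionStartTs(ts, 'session-a');
const end = sessionEndTs(ts, 'session-a');
console.log(start < ts && ts < end); // true
```

Because the offset is a pure function of the session ID, re-importing the same data produces identical boundary timestamps.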


505-506: Inconsistent parameter handling.

Lines 505-506 use sqlstring.escape() for inline parameter substitution, while the rest of the codebase uses ClickHouse's query_params mechanism. This inconsistency reduces maintainability.
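
A minimal sketch of the server-side binding style, assuming ClickHouse's `{name:Type}` placeholder syntax as supported by `@clickhouse/client`; the table name and helper are illustrative:

```typescript
// Sketch: build a query that relies on ClickHouse server-side parameter
// binding instead of client-side string escaping. The server substitutes
// {importId:String}, so no sqlstring.escape() is needed.
type QueryWithParams = {
  query: string;
  query_params: Record<string, string | number>;
};

function importCountQuery(importId: string): QueryWithParams {
  return {
    query: `
      SELECT count() AS total
      FROM events_imports
      WHERE import_id = {importId:String}
    `,
    query_params: { importId },
  };
}

// With @clickhouse/client this object would be spread into
// ch.query({ ...importCountQuery(id), format: 'JSONEachRow' }).
const q = importCountQuery('abc-123');
console.log(q.query.includes('{importId:String}')); // true
```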


559-559: Use deterministic status aggregation instead of any(import_status).

Line 559 uses any(import_status) which returns an arbitrary status from the group, making the result non-deterministic when events have mixed statuses.
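
One deterministic alternative is `argMax`, a real ClickHouse aggregate that picks the value associated with the maximum of a second expression. The query below is a sketch; the table and column names are assumptions:

```typescript
// Sketch: select the status of the most recently created row per group
// via argMax(import_status, created_at), instead of any(import_status),
// which returns an arbitrary row's value.
function progressQuery(importId: string) {
  return {
    query: `
      SELECT
        argMax(import_status, created_at) AS status,
        count() AS total_events
      FROM events_imports
      WHERE import_id = {importId:String}
    `,
    query_params: { importId },
  };
}

const pq = progressQuery('abc-123');
console.log(pq.query.includes('argMax(import_status, created_at)')); // true
```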


603-603: Validate status value instead of using type assertion.

Line 603 uses a type assertion which bypasses type safety. If ClickHouse returns an unexpected status value, the type system won't catch it.
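
A runtime guard makes unexpected values fail loudly instead of being silently asserted. This is a sketch; the status names below follow a typical import lifecycle and are not necessarily the repo's Prisma enum:

```typescript
// Sketch: narrow an unknown value to ImportStatus at runtime rather than
// using `as ImportStatus` on raw ClickHouse output.
const IMPORT_STATUSES = ['pending', 'processing', 'completed', 'failed'] as const;
type ImportStatus = (typeof IMPORT_STATUSES)[number];

function parseImportStatus(raw: unknown): ImportStatus {
  if (
    typeof raw === 'string' &&
    (IMPORT_STATUSES as readonly string[]).includes(raw)
  ) {
    return raw as ImportStatus; // safe: membership was just checked
  }
  throw new Error(`Unexpected import status from ClickHouse: ${String(raw)}`);
}

console.log(parseImportStatus('completed')); // completed
```

The same shape works with a zod enum (`z.enum(IMPORT_STATUSES)`) if the project prefers schema-based validation.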

🧹 Nitpick comments (15)
packages/common/server/parser-user-agent.ts (1)

84-112: Consider reducing code duplication.

The pattern typeof overrides?.__x === 'string' && overrides?.__x ? overrides?.__x : res.x is repeated seven times for different fields. This increases maintenance overhead—if the override logic needs adjustment, you'd need to update it in seven places.

Consider extracting this logic into a helper function:

+function getOverride<T>(
+  override: unknown,
+  fallback: T | undefined
+): T | undefined | string {
+  return typeof override === 'string' && override !== '' ? override : fallback;
+}
+
 export function parseUserAgent(
   ua?: string | null,
   overrides?: Record<string, unknown>,
 ) {
   if (!ua) return parsedServerUa;
   const res = parse(ua);
 
   if (isServer(res)) {
     return parsedServerUa;
   }
 
   return {
-    os:
-      typeof overrides?.__os === 'string' && overrides?.__os
-        ? overrides?.__os
-        : res.os.name,
+    os: getOverride(overrides?.__os, res.os.name),
-    osVersion:
-      typeof overrides?.__osVersion === 'string' && overrides?.__osVersion
-        ? overrides?.__osVersion
-        : res.os.version,
+    osVersion: getOverride(overrides?.__osVersion, res.os.version),
-    browser:
-      typeof overrides?.__browser === 'string' && overrides?.__browser
-        ? overrides?.__browser
-        : res.browser.name,
+    browser: getOverride(overrides?.__browser, res.browser.name),
-    browserVersion:
-      typeof overrides?.__browserVersion === 'string' &&
-      overrides?.__browserVersion
-        ? overrides?.__browserVersion
-        : res.browser.version,
+    browserVersion: getOverride(overrides?.__browserVersion, res.browser.version),
-    device:
-      typeof overrides?.__device === 'string' && overrides?.__device
-        ? overrides?.__device
-        : res.device.type || getDevice(ua),
+    device: getOverride(overrides?.__device, res.device.type || getDevice(ua)),
-    brand:
-      typeof overrides?.__brand === 'string' && overrides?.__brand
-        ? overrides?.__brand
-        : res.device.vendor,
+    brand: getOverride(overrides?.__brand, res.device.vendor),
-    model:
-      typeof overrides?.__model === 'string' && overrides?.__model
-        ? overrides?.__model
-        : res.device.model,
+    model: getOverride(overrides?.__model, res.device.model),
     isServer: false,
   } as const;
 }
packages/db/src/buffers/base-buffer.ts (1)

92-126: Avoid lock expiry during long flushes (possible double processing).

If onFlush exceeds lockTimeout (60s), the Redis lock can expire mid-run and another worker may acquire it. Add a heartbeat to extend TTL only when the lockId still matches.

Apply this diff:

@@
   private async releaseLock(lockId: string): Promise<void> {
@@
   }
 
+  private async refreshLock(lockId: string): Promise<void> {
+    const script = `
+      if redis.call("get", KEYS[1]) == ARGV[1] then
+        return redis.call("expire", KEYS[1], ARGV[2])
+      else
+        return 0
+      end
+    `;
+    await getRedisCache().eval(
+      script,
+      1,
+      this.lockKey,
+      lockId,
+      this.lockTimeout.toString(),
+    );
+  }
+
   async tryFlush() {
@@
-    if (acquired === 'OK') {
+    if (acquired === 'OK') {
+      const intervalMs =
+        Math.max(10, Math.floor(this.lockTimeout / 2)) * 1000;
+      const heartbeat = setInterval(() => {
+        void this.refreshLock(lockId).catch(() => {});
+      }, intervalMs);
       try {
@@
       } catch (error) {
@@
       } finally {
+        clearInterval(heartbeat);
         await this.releaseLock(lockId);
@@
       }
     }
   }
packages/trpc/src/routers/auth.ts (1)

152-154: Use the correct error type for “User already exists”.

This is a conflict, not a not-found case. Prefer a 400/409-style error.

Apply this diff:

-import { TRPCAccessError, TRPCNotFoundError } from '../errors';
+import {
+  TRPCAccessError,
+  TRPCNotFoundError,
+  TRPCBadRequestError,
+} from '../errors';
@@
-      if (user) {
-        throw TRPCNotFoundError('User already exists');
-      }
+      if (user) {
+        throw TRPCBadRequestError('User already exists');
+      }

Also applies to: 31-31

packages/importer/src/providers/mixpanel.test.ts (1)

5-23: Granularity mismatch: daily vs monthly chunks.

Test expects daily chunks, but Mixpanel provider comments mention monthly chunking for reliability. Align expectation and implementation (or update the comment) to avoid future drift.
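
If daily granularity is the intended behavior, the chunking the test asserts can be sketched like this; the generator name and shape are assumptions, not the provider's actual API:

```typescript
// Sketch: yield one inclusive {fromDate, toDate} pair per UTC day.
function* dailyChunks(
  from: string, // inclusive, 'YYYY-MM-DD'
  to: string, // inclusive, 'YYYY-MM-DD'
): Generator<{ fromDate: string; toDate: string }> {
  const cursor = new Date(`${from}T00:00:00Z`);
  const end = new Date(`${to}T00:00:00Z`);
  while (cursor <= end) {
    const day = cursor.toISOString().slice(0, 10);
    yield { fromDate: day, toDate: day };
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
}

const chunks = [...dailyChunks('2025-01-30', '2025-02-02')];
console.log(chunks.length); // 4
```

Swapping the increment to `setUTCMonth` would give the monthly chunking the provider comments describe, so whichever granularity wins should be encoded in one place.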

packages/importer/src/index.ts (1)

1-13: LGTM: coherent public surface.

Exports for Umami/Mixpanel and core types look consistent. Consider exporting BaseImportProvider if external providers are intended later.

apps/worker/src/boot-workers.ts (2)

61-64: Clamp concurrency to ≥1 to avoid invalid worker configs.

ENV may be "0" or non-numeric; guard with fallback.

-  concurrency: Number.parseInt(process.env.IMPORT_JOB_CONCURRENCY || '1', 10),
+  concurrency:
+    Math.max(1, Number.parseInt(process.env.IMPORT_JOB_CONCURRENCY ?? '', 10) || 1),
-  concurrency: Number.parseInt(process.env.EVENT_JOB_CONCURRENCY || '1', 10),
+  concurrency:
+    Math.max(1, Number.parseInt(process.env.EVENT_JOB_CONCURRENCY ?? '', 10) || 1),

Also applies to: 37-38


137-143: Graceful shutdown should consider all active queues (include import).

Only waiting on cron can exit while imports are active. Wait for key queues or rely solely on worker.close.

-      await waitForQueueToEmpty(cronQueue);
+      // Wait for critical queues to drain before closing workers
+      await Promise.all([
+        waitForQueueToEmpty(cronQueue),
+        waitForQueueToEmpty(sessionsQueue),
+        waitForQueueToEmpty(notificationQueue),
+        waitForQueueToEmpty(miscQueue),
+        waitForQueueToEmpty(importQueue),
+      ]);
packages/queue/src/queues.ts (1)

168-181: Add retries/backoff for import jobs.

Imports are long-running and transiently fail (rate limits, net hiccups). Configure sane defaults.

 export const importQueue = new Queue<ImportQueuePayload>('import', {
   connection: getRedisQueue(),
   defaultJobOptions: {
     removeOnComplete: 10,
     removeOnFail: 50,
+    attempts: 3,
+    backoff: { type: 'exponential', delay: 5000 },
+    // prevent duplicates if enqueued idempotently from API
+    // jobId can be importId if desired at call-site
   },
 });
packages/db/prisma/schema.prisma (1)

477-496: Import model polish: set defaults and add indexes; keep currentStep nullable.

  • Set status default to 'pending' to match lifecycle.
  • Add indexes for common lookups (by projectId, status, recency).
  • Keep currentStep optional; worker logic uses NULL for “new import”.
 model Import {
   id              String       @id @default(dbgenerated("gen_random_uuid()")) @db.Uuid
   projectId       String
   project         Project      @relation(fields: [projectId], references: [id], onDelete: Cascade)
   jobId           String? // BullMQ job ID
-  status          ImportStatus
+  status          ImportStatus @default(pending)
   statusMessage   String? // Human-readable current step like "Importing events (Feb 2025)", "Generating session IDs"
   errorMessage    String?
   /// [IPrismaImportConfig]
   config          Json
   totalEvents     Int          @default(0)
   processedEvents Int          @default(0)
-  currentStep     String?
+  currentStep     String?
   currentBatch    String? // String date 2020-01-01
   createdAt       DateTime     @default(now())
   completedAt     DateTime?
   updatedAt       DateTime     @default(now()) @updatedAt

   @@map("imports")
+  @@index([projectId, createdAt(sort: Desc)])
+  @@index([status])
+  @@index([projectId, status])
 }
packages/trpc/src/routers/import.ts (2)

121-126: Make job removal resilient.

Wrap job.remove() in try/catch so delete isn’t blocked by queue errors; optionally skip removal if status is processing.

Apply:

-      if (importRecord.jobId) {
-        const job = await importQueue.getJob(importRecord.jobId);
-        if (job) {
-          await job.remove();
-        }
-      }
+      if (importRecord.jobId) {
+        try {
+          const job = await importQueue.getJob(importRecord.jobId);
+          if (job) await job.remove();
+        } catch (e) {
+          // Best-effort cleanup; continue with delete
+        }
+      }

24-31: Paginate list to avoid unbounded result sets.

Add take/skip or cursor-based pagination; default e.g. last 50, with client-driven paging.

Apply:

-      const rows = await db.import.findMany({
-        where: { projectId: input.projectId },
-        orderBy: { createdAt: 'desc' },
-      });
+      const rows = await db.import.findMany({
+        where: { projectId: input.projectId },
+        orderBy: { createdAt: 'desc' },
+        take: 50,
+      });
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (2)

206-213: Null-safe provider/type display to work with server-side redaction.

Avoid crashes when config is null.

Apply:

-                      <div>{imp.config.provider}</div>
+                      <div>{imp.config?.provider ?? 'unknown'}</div>
...
-                        {imp.config.type}
+                        {imp.config?.type ?? 'n/a'}

102-111: Don’t surface full internal error details in UI.

Truncate error text to reduce leak risk.

Apply:

-        <Tooltiper
-          content={errorMessage}
+        <Tooltiper
+          content={errorMessage ? String(errorMessage).slice(0, 300) : 'Error'}
           tooltipClassName="max-w-xs break-words"
         >
apps/worker/src/jobs/import.ts (1)

201-207: Use job logger instead of console.log.

Align logging and ensure capture.

Apply:

-        console.log('Generating session IDs', { from });
+        jobLogger.info('Generating session IDs', { from });
packages/importer/src/providers/mixpanel.ts (1)

309-314: Address TODO or document why it's deferred.

The TODO comment indicates incomplete screen view detection logic. This heuristic-based approach may cause incorrect event name mapping for production data.

If this is a temporary workaround, please document when and how it should be removed. Otherwise, do you want me to help implement a more robust solution or open an issue to track this task?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b60dd0e and 1a11373.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (72)
  • .github/workflows/docker-build.yml (3 hunks)
  • apps/api/package.json (1 hunks)
  • apps/start/Dockerfile (2 hunks)
  • apps/start/package.json (2 hunks)
  • apps/start/src/components/integrations/integration-card.tsx (1 hunks)
  • apps/start/src/components/project-mapper.tsx (1 hunks)
  • apps/start/src/components/ui/badge.tsx (1 hunks)
  • apps/start/src/modals/add-import.tsx (1 hunks)
  • apps/start/src/modals/date-ranger-picker.tsx (1 hunks)
  • apps/start/src/modals/index.tsx (2 hunks)
  • apps/start/src/routeTree.gen.ts (11 hunks)
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
  • apps/start/src/utils/math.ts (1 hunks)
  • apps/start/src/utils/slug.ts (1 hunks)
  • apps/worker/Dockerfile (2 hunks)
  • apps/worker/package.json (3 hunks)
  • apps/worker/src/boot-workers.ts (4 hunks)
  • apps/worker/src/index.ts (2 hunks)
  • apps/worker/src/jobs/events.incoming-event.ts (3 hunks)
  • apps/worker/src/jobs/events.incoming-events.test.ts (2 hunks)
  • apps/worker/src/jobs/import.ts (1 hunks)
  • package.json (1 hunks)
  • packages/cli/package.json (0 hunks)
  • packages/cli/src/cli.ts (0 hunks)
  • packages/cli/src/importer/importer.ts (0 hunks)
  • packages/cli/src/importer/index.ts (0 hunks)
  • packages/cli/tsconfig.json (0 hunks)
  • packages/cli/tsup.config.ts (0 hunks)
  • packages/common/package.json (1 hunks)
  • packages/common/scripts/get-referrers.ts (1 hunks)
  • packages/common/server/index.ts (1 hunks)
  • packages/common/server/parse-referrer.test.ts (5 hunks)
  • packages/common/server/parse-referrer.ts (4 hunks)
  • packages/common/server/parser-user-agent.ts (2 hunks)
  • packages/common/src/object.ts (1 hunks)
  • packages/db/code-migrations/5-add-imports-table.sql (1 hunks)
  • packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
  • packages/db/index.ts (1 hunks)
  • packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1 hunks)
  • packages/db/prisma/schema.prisma (2 hunks)
  • packages/db/src/buffers/base-buffer.ts (1 hunks)
  • packages/db/src/buffers/event-buffer.test.ts (8 hunks)
  • packages/db/src/clickhouse/client.ts (2 hunks)
  • packages/db/src/services/event.service.ts (1 hunks)
  • packages/db/src/services/import.service.ts (1 hunks)
  • packages/db/src/services/session.service.ts (1 hunks)
  • packages/db/src/types.ts (2 hunks)
  • packages/importer/package.json (1 hunks)
  • packages/importer/src/base-provider.ts (1 hunks)
  • packages/importer/src/index.ts (1 hunks)
  • packages/importer/src/providers/metadata.ts (1 hunks)
  • packages/importer/src/providers/mixpanel.test.ts (1 hunks)
  • packages/importer/src/providers/mixpanel.ts (1 hunks)
  • packages/importer/src/providers/umami.ts (1 hunks)
  • packages/importer/src/types.ts (1 hunks)
  • packages/importer/tsconfig.json (1 hunks)
  • packages/importer/vitest.config.ts (1 hunks)
  • packages/queue/src/queues.ts (1 hunks)
  • packages/redis/package.json (1 hunks)
  • packages/trpc/package.json (1 hunks)
  • packages/trpc/src/root.ts (2 hunks)
  • packages/trpc/src/routers/auth.ts (1 hunks)
  • packages/trpc/src/routers/import.ts (1 hunks)
  • packages/trpc/src/routers/organization.ts (1 hunks)
  • packages/validation/src/index.ts (1 hunks)
💤 Files with no reviewable changes (6)
  • packages/cli/src/cli.ts
  • packages/cli/src/importer/importer.ts
  • packages/cli/tsconfig.json
  • packages/cli/package.json
  • packages/cli/tsup.config.ts
  • packages/cli/src/importer/index.ts
🚧 Files skipped from review as they are similar to previous changes (26)
  • apps/start/src/modals/date-ranger-picker.tsx
  • packages/common/server/index.ts
  • apps/api/package.json
  • apps/start/src/modals/index.tsx
  • packages/db/src/services/event.service.ts
  • packages/trpc/package.json
  • packages/db/src/types.ts
  • packages/common/server/parse-referrer.test.ts
  • apps/start/src/components/ui/badge.tsx
  • apps/start/src/components/integrations/integration-card.tsx
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx
  • packages/db/src/buffers/event-buffer.test.ts
  • packages/common/package.json
  • packages/importer/tsconfig.json
  • packages/db/src/services/session.service.ts
  • packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql
  • packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql
  • packages/common/src/object.ts
  • apps/start/package.json
  • apps/start/src/routeTree.gen.ts
  • packages/db/src/clickhouse/client.ts
  • packages/db/prisma/migrations/20251028152531_imports_again/migration.sql
  • packages/importer/src/base-provider.ts
  • packages/importer/package.json
  • package.json
  • packages/trpc/src/root.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/start/.cursorrules)

apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'

Files:

  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
  • apps/start/src/utils/slug.ts
  • apps/start/src/components/project-mapper.tsx
  • apps/start/src/modals/add-import.tsx
  • apps/start/src/utils/math.ts
🧬 Code graph analysis (15)
apps/worker/src/boot-workers.ts (2)
packages/queue/src/queues.ts (1)
  • importQueue (175-181)
apps/worker/src/jobs/import.ts (1)
  • importJob (25-281)
packages/db/code-migrations/5-add-imports-table.ts (2)
packages/db/code-migrations/helpers.ts (1)
  • getIsCluster (17-24)
packages/db/src/clickhouse/migration.ts (2)
  • createTable (68-116)
  • runClickhouseMigrationCommands (322-454)
packages/queue/src/queues.ts (1)
packages/redis/redis.ts (1)
  • getRedisQueue (93-105)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (4)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
  • Route (12-34)
packages/importer/src/providers/metadata.ts (1)
  • IMPORT_PROVIDERS (13-30)
apps/public/components/ui/tooltip.tsx (1)
  • Tooltiper (36-51)
apps/start/src/components/integrations/integration-card.tsx (3)
  • IntegrationCard (91-112)
  • IntegrationCardLogoImage (50-69)
  • IntegrationCardFooter (3-15)
packages/trpc/src/routers/import.ts (6)
packages/trpc/src/trpc.ts (2)
  • createTRPCRouter (138-138)
  • protectedProcedure (161-164)
packages/trpc/src/access.ts (1)
  • getProjectAccessCached (4-4)
packages/trpc/src/errors.ts (1)
  • TRPCAccessError (3-7)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
packages/validation/src/index.ts (1)
  • zCreateImport (478-486)
packages/queue/src/queues.ts (1)
  • importQueue (175-181)
apps/start/src/components/project-mapper.tsx (2)
packages/db/src/services/project.service.ts (1)
  • IServiceProject (7-7)
apps/start/src/components/ui/select.tsx (5)
  • Select (173-173)
  • SelectTrigger (181-181)
  • SelectValue (182-182)
  • SelectContent (174-174)
  • SelectItem (176-176)
packages/importer/vitest.config.ts (1)
vitest.shared.ts (1)
  • getSharedVitestConfig (4-27)
apps/worker/src/jobs/import.ts (6)
packages/queue/src/queues.ts (1)
  • ImportQueuePayload (168-173)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
packages/db/src/services/import.service.ts (10)
  • ImportSteps (671-671)
  • updateImportStatus (673-736)
  • getImportDateBounds (610-634)
  • insertImportBatch (28-55)
  • generateSessionIds (62-98)
  • createSessionsStartEndEvents (106-316)
  • moveImportsToProduction (322-426)
  • backfillSessionsToProduction (428-521)
  • markImportComplete (526-545)
  • getImportProgress (550-605)
packages/db/src/clickhouse/client.ts (2)
  • formatClickhouseDate (208-219)
  • error (40-45)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/importer/src/types.ts (2)
packages/db/src/services/event.service.ts (1)
  • IServiceImportedEventPayload (144-147)
packages/db/src/services/import.service.ts (2)
  • ImportProgress (18-23)
  • ImportStageResult (12-16)
apps/worker/src/index.ts (1)
packages/queue/src/queues.ts (1)
  • importQueue (175-181)
packages/importer/src/providers/umami.ts (5)
packages/validation/src/index.ts (1)
  • IUmamiImportConfig (456-456)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/db/src/services/event.service.ts (2)
  • IClickhouseEvent (68-102)
  • query (701-847)
packages/common/src/url.ts (2)
  • parsePath (11-49)
  • isSameDomain (51-63)
packages/common/server/parse-referrer.ts (2)
  • parseReferrer (17-26)
  • getReferrerWithQuery (28-66)
apps/start/src/modals/add-import.tsx (3)
packages/validation/src/index.ts (5)
  • zUmamiImportConfig (452-454)
  • zMixpanelImportConfig (461-470)
  • IUmamiImportConfig (456-456)
  • IMixpanelImportConfig (471-471)
  • IImportConfig (473-476)
apps/start/src/components/project-mapper.tsx (1)
  • ProjectMapper (33-116)
apps/public/lib/utils.ts (1)
  • cn (4-6)
packages/importer/src/providers/mixpanel.test.ts (1)
packages/importer/src/providers/mixpanel.ts (1)
  • MixpanelProvider (18-441)
packages/db/src/services/import.service.ts (4)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
packages/db/src/clickhouse/client.ts (4)
  • ch (131-161)
  • TABLE_NAMES (48-60)
  • convertClickhouseDateToJs (238-240)
  • formatClickhouseDate (208-219)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
packages/importer/src/providers/mixpanel.ts (7)
packages/validation/src/index.ts (1)
  • IMixpanelImportConfig (471-471)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/db/src/services/event.service.ts (2)
  • IClickhouseEvent (68-102)
  • query (701-847)
packages/common/src/url.ts (2)
  • parsePath (11-49)
  • isSameDomain (51-63)
packages/common/server/parse-referrer.ts (2)
  • parseReferrer (17-26)
  • getReferrerWithQuery (28-66)
packages/common/server/parser-user-agent.ts (2)
  • parseUserAgent (72-115)
  • UserAgentInfo (71-71)
packages/common/src/object.ts (2)
  • toDots (3-36)
  • strip (48-48)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-and-push-dashboard
  • GitHub Check: build-and-push-worker
  • GitHub Check: build-and-push-api
🔇 Additional comments (29)
.github/workflows/docker-build.yml (1)

149-149: Guard tag creation to main branch only—LGTM.

The conditional guards on all three tag-creation steps (if: github.ref == 'refs/heads/main') correctly restrict tag creation to the main branch. This is good practice for release/deployment management, preventing accidental tags from feature branches while allowing Docker images to be built for all branches. The syntax and consistency across all three jobs are correct.

Also applies to: 210-210, 271-271

packages/common/server/parser-user-agent.ts (1)

71-71: Good addition of type export.

Exporting the return type as UserAgentInfo is a best practice that improves type safety and developer experience for consumers of this function.

packages/redis/package.json (1)

11-11: Verify the intent behind pinning ioredis to an exact version.

Version 5.8.2 is confirmed as valid, latest, and published by the official Redis Labs maintainer. However, this is the only production dependency in the package.json using exact pinning—all others use caret constraints (^) or monorepo references (workspace:*, catalog:). Ensure this pinning is intentional for your queue/job processing stability, rather than an accidental divergence from project conventions. If stability is not a critical concern here, consider aligning with the project's versioning strategy.

packages/db/src/buffers/base-buffer.ts (1)

1-1: Import-path update is consistent with the monorepo re-exports.

For safety, run the same re-export check script shared in organization.ts to ensure no accidental duplicate exports.

packages/trpc/src/routers/auth.ts (1)

15-15: Import-path change LGTM.

packages/trpc/src/routers/organization.ts (1)

14-14: No issues found; import change is correct.

Re-export verified: generateSecureId is defined once in packages/common/server/id.ts and properly re-exported. The import path consolidation to @openpanel/common/server is consistent with existing patterns across the codebase (already used in auth.ts, db/buffers, and other files). Tree-shaking will work correctly.

apps/worker/src/jobs/events.incoming-events.test.ts (1)

102-102: LGTM! Test expectations correctly updated.

The referrerType expectations now correctly use an empty string instead of 'unknown', aligning with the updated behavior in packages/common/server/parse-referrer.ts where the default type for non-matches changed from 'unknown' to ''.

Also applies to: 210-210

apps/start/src/utils/math.ts (1)

1-1: LGTM! Re-export now uses package root.

The change from '@openpanel/common/src/math' to '@openpanel/common' aligns with the pattern of consuming the public API surface from package roots rather than internal paths.

packages/db/code-migrations/5-add-imports-table.sql (1)

1-43: LGTM! Table schema is well-structured.

The events_imports table schema provides comprehensive event tracking with import-specific metadata. The 7-day TTL on imported_at_meta provides automatic cleanup for staging data. Note that in clustered deployments, the TTL should be applied to the replicated table rather than the distributed table (see related comment on the .ts migration file).

apps/start/src/utils/slug.ts (1)

1-1: LGTM! Re-export now uses package root.

Consistent with the broader pattern of consuming the public API surface from '@openpanel/common' rather than internal paths.

packages/common/server/parse-referrer.ts (1)

1-3: LGTM! Referrer parsing logic improved.

The changes improve consistency and efficiency:

  • Default type changed from 'unknown' to empty string for unmatched referrers
  • Source is now lowercased once during computation rather than repeatedly during comparisons
  • Updated matching logic is cleaner and more maintainable

Test files correctly reflect these behavioral changes.

Also applies to: 23-23, 35-51, 63-63
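
The described behavior can be sketched as follows. This is a minimal illustration, not the real implementation: the rule table, helper name, and matching strategy here are assumptions, while the actual logic lives in packages/common/server/parse-referrer.ts.

```typescript
// Minimal sketch of the revised parse-referrer behavior.
type ReferrerInfo = { name: string; type: string };

// Hypothetical rule table keyed by hostname fragment.
const RULES: Record<string, ReferrerInfo> = {
  'google.': { name: 'Google', type: 'search' },
  'facebook.': { name: 'Facebook', type: 'social' },
};

function parseReferrerSketch(url: string | undefined): ReferrerInfo {
  // Lowercase once up front instead of per comparison.
  const host = (url ?? '').toLowerCase();
  for (const [needle, info] of Object.entries(RULES)) {
    if (host.includes(needle)) return info;
  }
  // Default type is now '' rather than 'unknown'.
  return { name: '', type: '' };
}
```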

packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1)

1-2: LGTM!

The migration correctly adds a nullable statusMessage column to track detailed import status information.

packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1)

1-2: LGTM!

The migration correctly makes currentStep nullable, which is appropriate for representing imports that haven't started or tracking optional step progression.

apps/worker/package.json (2)

41-41: Verify intentional removal of version range flexibility.

The tsdown dependency was changed from ^0.14.2 to 0.14.2 (caret removed), which prevents automatic minor/patch updates. This appears unrelated to the importer feature.

Is this pinning intentional for reproducibility, or was it accidental? If unintentional, consider restoring the caret:

-    "tsdown": "0.14.2",
+    "tsdown": "^0.14.2",

22-22: LGTM!

Adding the @openpanel/importer dependency correctly wires the worker to process import jobs.

packages/db/index.ts (1)

25-25: LGTM!

The export correctly surfaces the new import service, consistent with other service exports in this file.

apps/worker/src/index.ts (1)

8-8: LGTM!

The importQueue is correctly imported and added to the Bull Board UI, enabling monitoring of import jobs alongside existing queues.

Also applies to: 42-42

apps/worker/Dockerfile (2)

38-38: LGTM: workspace caching for importer.

Including packages/importer/package.json improves pnpm cache hits.


85-87: LGTM: importer copied into runner image.

Runtime availability is ensured; ordering with logger/common is fine.

packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)

13-13: Migration is safe—config validation requires the field.

The config field in zCreateImport is mandatory (a union of three required config types with no .optional()), and the single import creation path always provides it from validated input. Since the column remains NOT NULL, dropping the DEFAULT poses no risk of introducing NULLs.

apps/worker/src/jobs/events.incoming-event.ts (2)

8-12: Consolidation to @openpanel/common/server looks good.

Shared parser usage reduces drift.


120-123: Referrer precedence change: confirm downstream expectations.

Now referrer/url is only from parsed referrer (not UTM), and type prioritizes referrer over UTM. Validate dashboards/segments relying on previous precedence (tests adjusted, but product semantics matter).

packages/importer/src/providers/mixpanel.ts (1)

401-440: LGTM!

The property stripping and string coercion logic correctly handles edge cases including null/undefined values and nested objects. The regex pattern effectively filters Mixpanel-specific properties.
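
As a sketch of that kind of stripping and coercion (the regex and function name are assumptions; the provider's actual pattern may differ):

```typescript
// Illustrative version: drop Mixpanel-internal keys ($-prefixed, mp_*),
// skip null/undefined, and coerce remaining values to strings.
const MIXPANEL_KEY = /^(\$|mp_)/;

function stripProperties(props: Record<string, unknown>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(props)) {
    if (MIXPANEL_KEY.test(key)) continue; // provider-internal key
    if (value === null || value === undefined) continue;
    out[key] =
      typeof value === 'object' ? JSON.stringify(value) : String(value);
  }
  return out;
}
```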

packages/db/src/services/import.service.ts (6)

28-55: LGTM!

The batch insertion logic correctly handles empty arrays, adds required import metadata, and leverages ClickHouse's JSONEachRow format for efficient bulk inserts.


62-98: LGTM!

The deterministic session ID generation using MD5 hashing of device_id and 30-minute time windows ensures consistent results across imports. The use of mutations_sync: '2' properly ensures replication before proceeding.
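
The scheme can be sketched as below; function and argument names are assumptions, not the service's API. Hashing the device id together with a 30-minute (1800 s) time bucket makes the id reproducible across re-imports.

```typescript
import { createHash } from 'node:crypto';

// Deterministic session id: same device + same 30-minute window
// always yields the same MD5 hash.
function deterministicSessionId(deviceId: string, unixSeconds: number): string {
  const window = Math.floor(unixSeconds / 1800);
  return createHash('md5').update(`${deviceId}:${window}`).digest('hex');
}
```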


322-426: LGTM!

The migration logic correctly handles session events with adjusted timestamps by including them via session_id lookup rather than strict date filtering. The progress headers configuration prevents connection timeouts for large imports.


526-545: LGTM!

The completion marking correctly uses mutations_sync: '2' to ensure all replicas are updated before the function returns, preventing race conditions.


610-634: LGTM!

The date bounds query correctly handles the optional fromCreatedAt parameter and provides sensible defaults for empty result sets.


673-736: Well-structured status update flow.

The step-based union type provides excellent type safety, ensuring each step can only include valid fields. The status message generation is clear and user-friendly.
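
A step-keyed discriminated union of the kind described might look like this; the field and function names are illustrative, not the service's actual types:

```typescript
// Hypothetical shape; the real union lives in import.service.ts.
type ImportStatusUpdate =
  | { step: 'loading'; batch: string; totalEvents?: number; processedEvents?: number }
  | { step: 'migrating'; processedEvents: number }
  | { step: 'completed' }
  | { step: 'failed'; error: string };

function statusMessage(u: ImportStatusUpdate): string {
  // Narrowing on `step` gives compile-time access to only the valid fields.
  switch (u.step) {
    case 'loading':
      return `Loading batch ${u.batch}`;
    case 'migrating':
      return `Migrated ${u.processedEvents} events`;
    case 'completed':
      return 'Import completed';
    case 'failed':
      return `Import failed: ${u.error}`;
  }
}
```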

Note: The Postgres update at the end (lines 732-735) occurs after the job progress update. If the database update fails, the job progress will be out of sync with the database state. Consider whether this ordering is intentional or if database-first updates would be more appropriate.

COPY packages/trpc/package.json packages/trpc/
COPY packages/json/package.json packages/json/
COPY packages/common/package.json packages/common/
COPY packages/importer/package.json packages/importer/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Ensure @openpanel/importer is either available at runtime or not linked.

You add packages/importer/package.json for install caching, but the runner image does not copy packages/importer. If apps/start does any value import (not type-only) of @openpanel/importer, PNPM’s workspace symlink will point to a missing path at runtime.

  • Option A: If only type imports are used, keep as-is.
  • Option B: If runtime imports exist, also copy the importer sources into runner:
+# In runner stage, alongside other packages
+COPY --from=build /app/packages/importer ./packages/importer

Run to confirm no value imports from apps/start:

Also applies to: 89-89


🏁 Script executed:

#!/bin/bash
# Find imports of @openpanel/importer in Start; flag non-type imports
rg -nP --type=ts --type=tsx -C2 "from ['\"]@openpanel/importer['\"]" apps/start \
  | grep -v "import type" || echo "No value imports found."

Length of output: 51


I need to re-run the verification with a corrected command, as the file type flag issue makes the result inconclusive.


---

🏁 Script executed:

```shell
#!/bin/bash
# Broader search: find any imports of @openpanel/importer in apps/start
# without file type restrictions to avoid flag parsing issues
echo "=== Searching for @openpanel/importer imports ==="
rg -n "@openpanel/importer" apps/start || echo "No matches found."

echo ""
echo "=== Directory structure of apps/start ==="
find apps/start -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" \) | head -20
```
Length of output: 1250


Need to verify if this is the only import and search more thoroughly:


---

🏁 Script executed:

```shell
#!/bin/bash
# Find all imports/requires of @openpanel/importer in apps/start
echo "=== All references to @openpanel/importer ==="
rg -n "@openpanel/importer" apps/start -A2 -B2

echo ""
echo "=== Count of files with @openpanel/importer imports ==="
rg -l "@openpanel/importer" apps/start | wc -l
```
Length of output: 1172


Copy @openpanel/importer sources to runner stage to fix broken workspace symlink at runtime.

apps/start has a value import (import { IMPORT_PROVIDERS } from '@openpanel/importer/providers') in apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx:22. The Dockerfile copies packages/importer/package.json for caching but omits the source files in the runner stage, causing the workspace symlink to point to a missing path at runtime.

Copy the importer sources in the runner stage:

+# In runner stage, alongside other packages
+COPY --from=build /app/packages/importer ./packages/importer

Also applies to: line 89

🤖 Prompt for AI Agents
In apps/start/Dockerfile around lines 32 and 89, the Dockerfile only copies
packages/importer/package.json which leaves the @openpanel/importer source files
absent in the runner stage and breaks the workspace symlink at runtime; fix by
copying the importer sources into the runner stage (either COPY
packages/importer/ packages/importer/ in the runner stage or COPY --from=builder
/app/packages/importer/ /app/packages/importer/ if the builder stage contains
the sources/artifacts) so the import path @openpanel/importer resolves at
runtime.

Comment on lines +35 to +38
const jobLogger = logger.child({
importId,
config: record.config,
});

⚠️ Potential issue | 🟠 Major

Logger context leaks config secrets.

Avoid attaching full config to logs; include non-sensitive fields only.

Apply:

-  const jobLogger = logger.child({
-    importId,
-    config: record.config,
-  });
+  const jobLogger = logger.child({
+    importId,
+    provider: record.config?.provider,
+  });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const jobLogger = logger.child({
importId,
config: record.config,
});
const jobLogger = logger.child({
importId,
provider: record.config?.provider,
});
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 35 to 38, the logger child is
attaching the entire record.config which may contain secrets; replace that with
a whitelist of non-sensitive fields (e.g., config.name, config.type,
config.source, config.region, config.bucket or other known safe metadata) or
explicitly redact sensitive keys before passing to logger.child — do not log the
full config object; update the jobLogger creation to only include importId plus
the selected safe config fields (or a redact/mask function) so secrets are never
emitted to logs.

Comment on lines +124 to +173
const eventBatch: any = [];
for await (const rawEvent of providerInstance.parseSource(
resumeLoadingFrom,
)) {
// Validate event
if (
!providerInstance.validate(
// @ts-expect-error
rawEvent,
)
) {
jobLogger.warn('Skipping invalid event', { rawEvent });
continue;
}

eventBatch.push(rawEvent);

// Process batch when it reaches the batch size
if (eventBatch.length >= BATCH_SIZE) {
jobLogger.info('Processing batch', { batchSize: eventBatch.length });

const transformedEvents: IClickhouseEvent[] = eventBatch.map(
(
// @ts-expect-error
event,
) => providerInstance!.transformEvent(event),
);

await insertImportBatch(transformedEvents, importId);

processedEvents += eventBatch.length;
eventBatch.length = 0;

const createdAt = new Date(transformedEvents[0]?.created_at || '')
.toISOString()
.split('T')[0];

await updateImportStatus(jobLogger, job, importId, {
step: 'loading',
batch: createdAt,
totalEvents,
processedEvents,
});
}

⚠️ Potential issue | 🔴 Critical

Harden batch transform; avoid crashes on empty/failed transforms.

  • Use unknown[] buffer.
  • Guard per-event transform with try/catch.
  • Safe createdAt derivation.

Apply:

-      const eventBatch: any = [];
+      const eventBatch: unknown[] = [];
@@
-        if (eventBatch.length >= BATCH_SIZE) {
+        if (eventBatch.length >= BATCH_SIZE) {
           jobLogger.info('Processing batch', { batchSize: eventBatch.length });
-
-          const transformedEvents: IClickhouseEvent[] = eventBatch.map(
-            (
-              // @ts-expect-error
-              event,
-            ) => providerInstance!.transformEvent(event),
-          );
+          const transformedEvents: IClickhouseEvent[] = [];
+          for (const ev of eventBatch) {
+            try {
+              transformedEvents.push(
+                providerInstance!.transformEvent(ev as any),
+              );
+            } catch (e) {
+              jobLogger.warn('Skipping event on transform failure', {
+                provider: providerInstance.provider,
+                reason: e instanceof Error ? e.message : String(e),
+              });
+            }
+          }
 
           await insertImportBatch(transformedEvents, importId);
 
           processedEvents += eventBatch.length;
           eventBatch.length = 0;
 
-          const createdAt = new Date(transformedEvents[0]?.created_at || '')
-            .toISOString()
-            .split('T')[0];
+          const firstCreatedAt = transformedEvents[0]?.created_at;
+          const createdAt = firstCreatedAt
+            ? firstCreatedAt.split('T')[0]!
+            : formatClickhouseDate(new Date(), true);
 
           await updateImportStatus(jobLogger, job, importId, {
             step: 'loading',
             batch: createdAt,
             totalEvents,
             processedEvents,
           });

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +170 to +198
// Process remaining events in the last batch
if (eventBatch.length > 0) {
const transformedEvents = eventBatch.map(
(
// @ts-expect-error
event,
) => providerInstance!.transformEvent(event),
);

await insertImportBatch(transformedEvents, importId);

processedEvents += eventBatch.length;
eventBatch.length = 0;

const createdAt = new Date(transformedEvents[0]?.created_at || '')
.toISOString()
.split('T')[0];

await updateImportStatus(jobLogger, job, importId, {
step: 'loading',
batch: createdAt,
});
}

⚠️ Potential issue | 🟠 Major

Repeat: harden final partial batch and persist counters.

Mirror guarded transform and include totals in status update.

Apply:

-      if (eventBatch.length > 0) {
-        const transformedEvents = eventBatch.map(
-          (
-            // @ts-expect-error
-            event,
-          ) => providerInstance!.transformEvent(event),
-        );
+      if (eventBatch.length > 0) {
+        const transformedEvents: IClickhouseEvent[] = [];
+        for (const ev of eventBatch) {
+          try {
+            transformedEvents.push(providerInstance!.transformEvent(ev as any));
+          } catch (e) {
+            jobLogger.warn('Skipping event on transform failure', {
+              provider: providerInstance.provider,
+              reason: e instanceof Error ? e.message : String(e),
+            });
+          }
+        }
 
         await insertImportBatch(transformedEvents, importId);
 
         processedEvents += eventBatch.length;
         eventBatch.length = 0;
 
-        const createdAt = new Date(transformedEvents[0]?.created_at || '')
-          .toISOString()
-          .split('T')[0];
+        const firstCreatedAt = transformedEvents[0]?.created_at;
+        const createdAt = firstCreatedAt
+          ? firstCreatedAt.split('T')[0]!
+          : formatClickhouseDate(new Date(), true);
 
         await updateImportStatus(jobLogger, job, importId, {
           step: 'loading',
           batch: createdAt,
+          totalEvents,
+          processedEvents,
         });
       }
🤖 Prompt for AI Agents
In apps/worker/src/jobs/import.ts around lines 170 to 192, the final partial
batch is processed without the same guards and error handling used earlier and
the status update omits processed/total counters; update the block to perform
the guarded transform exactly like the earlier batch processing (check
providerInstance exists, handle per-event transform errors or wrap transform in
try/catch and skip/collect failures), ensure transformedEvents is computed only
if providerInstance is present and has valid created_at before deriving
createdAt, and include the processedEvents and total/expected total (e.g.,
totalEvents) in the call to updateImportStatus so counters are persisted
alongside step and batch.

Comment on lines +8 to +14
-- CreateEnum
CREATE TYPE "public"."ImportStatus" AS ENUM ('pending', 'processing', 'completed', 'failed');

-- AlterTable
ALTER TABLE "public"."imports" DROP COLUMN "status",
ADD COLUMN "status" "public"."ImportStatus" NOT NULL,
ALTER COLUMN "currentStep" SET NOT NULL;

⚠️ Potential issue | 🔴 Critical

Destructive migration: status drop/recreate will lose data; currentStep NOT NULL conflicts with code.

  • Map TEXT → enum safely; do not drop without backfill.
  • Keep currentStep nullable; worker import.ts uses null to detect new imports.

Safe migration example:

--- a/packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
+++ b/packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
 CREATE TYPE "public"."ImportStatus" AS ENUM ('pending', 'processing', 'completed', 'failed');

-ALTER TABLE "public"."imports" DROP COLUMN "status",
-ADD COLUMN     "status" "public"."ImportStatus" NOT NULL,
-ALTER COLUMN "currentStep" SET NOT NULL;
+-- 1) Add new enum column with default to preserve existing rows
+ALTER TABLE "public"."imports"
+  ADD COLUMN "status_v2" "public"."ImportStatus" NOT NULL DEFAULT 'pending';
+
+-- 2) Map old text values to enum; fall back to 'pending' for unknowns
+UPDATE "public"."imports"
+SET "status_v2" = CASE lower(coalesce("status", 'pending'))
+  WHEN 'pending' THEN 'pending'::"public"."ImportStatus"
+  WHEN 'processing' THEN 'processing'::"public"."ImportStatus"
+  WHEN 'completed' THEN 'completed'::"public"."ImportStatus"
+  WHEN 'failed' THEN 'failed'::"public"."ImportStatus"
+  ELSE 'pending'::"public"."ImportStatus"
+END;
+
+-- 3) Drop old column and rename
+ALTER TABLE "public"."imports" DROP COLUMN "status";
+ALTER TABLE "public"."imports" RENAME COLUMN "status_v2" TO "status";
+ALTER TABLE "public"."imports" ALTER COLUMN "status" DROP DEFAULT;
+
+-- 4) Keep currentStep nullable (code relies on NULL to denote new import)
+-- (Remove the SET NOT NULL)

Also, consider wrapping in a transaction if your migration runner allows it.


Suggested change
-- CreateEnum
CREATE TYPE "public"."ImportStatus" AS ENUM ('pending', 'processing', 'completed', 'failed');
-- AlterTable
ALTER TABLE "public"."imports" DROP COLUMN "status",
ADD COLUMN "status" "public"."ImportStatus" NOT NULL,
ALTER COLUMN "currentStep" SET NOT NULL;
-- CreateEnum
CREATE TYPE "public"."ImportStatus" AS ENUM ('pending', 'processing', 'completed', 'failed');
-- AlterTable
-- 1) Add new enum column with default to preserve existing rows
ALTER TABLE "public"."imports"
ADD COLUMN "status_v2" "public"."ImportStatus" NOT NULL DEFAULT 'pending';
-- 2) Map old text values to enum; fall back to 'pending' for unknowns
UPDATE "public"."imports"
SET "status_v2" = CASE lower(coalesce("status", 'pending'))
WHEN 'pending' THEN 'pending'::"public"."ImportStatus"
WHEN 'processing' THEN 'processing'::"public"."ImportStatus"
WHEN 'completed' THEN 'completed'::"public"."ImportStatus"
WHEN 'failed' THEN 'failed'::"public"."ImportStatus"
ELSE 'pending'::"public"."ImportStatus"
END;
-- 3) Drop old column and rename
ALTER TABLE "public"."imports" DROP COLUMN "status";
ALTER TABLE "public"."imports" RENAME COLUMN "status_v2" TO "status";
ALTER TABLE "public"."imports" ALTER COLUMN "status" DROP DEFAULT;
-- 4) Keep currentStep nullable (code relies on NULL to denote new import)
-- (Remove the SET NOT NULL)
🤖 Prompt for AI Agents
In packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql
around lines 8 to 14, the migration currently drops and recreates the "status"
column (which will lose data) and forces "currentStep" to NOT NULL (which
conflicts with import worker logic); instead, create the new enum type, add a
new nullable enum column (e.g. status_new), backfill status_new from the
existing text status using a safe mapping, then drop the old text column and
rename status_new to status (or use ALTER COLUMN TYPE after backfill), and leave
currentStep nullable (do not set NOT NULL). Wrap the whole sequence in a
transaction if your runner supports it and ensure any NOT NULL constraints are
only applied after backfilling.

Comment on lines 38 to 127
const rawEvent = {
event: '$mp_web_page_view',
properties: {
time: 1746097970,
distinct_id: '$device:123',
$browser: 'Chrome',
$browser_version: 135,
$city: 'Mumbai',
$current_url:
'https://domain.com/state/maharashtra?utm_source=google&utm_medium=cpc&utm_campaignid=890&utm_adgroupid=&utm_adid=&utm_term=&utm_device=m&utm_network=x&utm_location=123&gclid=oqneoqow&gad_sour',
$device: 'Android',
$device_id: '123',
$initial_referrer: 'https://referrer.com/',
$initial_referring_domain: 'referrer.com',
$insert_id: 'source_id',
$lib_version: '2.60.0',
$mp_api_endpoint: 'api-js.mixpanel.com',
$mp_api_timestamp_ms: 1746078175363,
$mp_autocapture: true,
$os: 'Android',
$referrer: 'https://google.com/',
$referring_domain: 'referrer.com',
$region: 'Maharashtra',
$screen_height: 854,
$screen_width: 384,
current_domain: 'domain.com',
current_page_title:
'Landeed: Satbara Utara, 7/12 Extract, Property Card & Index 2',
current_url_path: '/state/maharashtra',
current_url_protocol: 'https:',
current_url_search:
'?utm_source=google&utm_medium=cpc&utm_campaignid=890&utm_adgroupid=&utm_adid=&utm_term=&utm_device=m&utm_network=x&utm_location=123&gclid=oqneoqow&gad_source=5&gclid=EAIaIQobChMI6MnvhciBjQMVlS-DAx',
gclid: 'oqneoqow',
mp_country_code: 'IN',
mp_lib: 'web',
mp_processing_time_ms: 1746078175546,
mp_sent_by_lib_version: '2.60.0',
utm_medium: 'cpc',
utm_source: 'google',
},
};

const res = provider.transformEvent(rawEvent);

expect(res).toMatchObject({
id: expect.any(String),
name: 'screen_view',
device_id: '123',
profile_id: '123',
project_id: 'pid',
session_id: '',
properties: {
__source_insert_id: 'source_id',
__screen: '384x854',
__lib_version: '2.60.0',
'__query.utm_source': 'google',
'__query.utm_medium': 'cpc',
'__query.utm_campaignid': '890',
'__query.utm_device': 'm',
'__query.utm_network': 'x',
'__query.utm_location': '123',
'__query.gclid': 'oqneoqow',
__title:
'Landeed: Satbara Utara, 7/12 Extract, Property Card & Index 2',
},
created_at: '2025-05-01T11:12:50.000Z',
country: 'IN',
city: 'Mumbai',
region: 'Maharashtra',
longitude: null,
latitude: null,
os: 'Android',
os_version: undefined,
browser: 'Chrome',
browser_version: '',
device: 'mobile',
brand: '',
model: '',
duration: 0,
path: '/state/maharashtra',
origin: 'https://domain.com',
referrer: 'https://referrer.com',
referrer_name: 'Google',
referrer_type: 'search',
imported_at: expect.any(String),
sdk_name: 'mixpanel',
sdk_version: '1.0.0',
});
});

⚠️ Potential issue | 🟠 Major

Test hides mapping bugs in provider (browser/os version).

transformEvent currently ignores $browser_version and $os_version due to camel/snake mismatch; test expecting empty/undefined masks this. Tighten assertions to expect real values, then fix provider:

-      os_version: undefined,
-      browser_version: '',
+      os_version: '',            // or 'Android <ver>' if available
+      browser_version: '135',

Provider fix (packages/importer/src/providers/mixpanel.ts):

-      os: uaInfo.os || props.$os,
-      os_version: uaInfo.osVersion || props.$osVersion,
-      browser: uaInfo.browser || props.$browser,
-      browser_version:
-        uaInfo.browserVersion || props.$browserVersion
-          ? String(props.$browser_version)
-          : '',
+      os: uaInfo.os || props.$os || '',
+      os_version: uaInfo.osVersion || props.$os_version || '',
+      browser: uaInfo.browser || props.$browser || '',
+      browser_version: String(uaInfo.browserVersion || props.$browser_version || ''),

Suggested change
const rawEvent = {
event: '$mp_web_page_view',
properties: {
time: 1746097970,
distinct_id: '$device:123',
$browser: 'Chrome',
$browser_version: 135,
$city: 'Mumbai',
$current_url:
'https://domain.com/state/maharashtra?utm_source=google&utm_medium=cpc&utm_campaignid=890&utm_adgroupid=&utm_adid=&utm_term=&utm_device=m&utm_network=x&utm_location=123&gclid=oqneoqow&gad_sour',
$device: 'Android',
$device_id: '123',
$initial_referrer: 'https://referrer.com/',
$initial_referring_domain: 'referrer.com',
$insert_id: 'source_id',
$lib_version: '2.60.0',
$mp_api_endpoint: 'api-js.mixpanel.com',
$mp_api_timestamp_ms: 1746078175363,
$mp_autocapture: true,
$os: 'Android',
$referrer: 'https://google.com/',
$referring_domain: 'referrer.com',
$region: 'Maharashtra',
$screen_height: 854,
$screen_width: 384,
current_domain: 'domain.com',
current_page_title:
'Landeed: Satbara Utara, 7/12 Extract, Property Card & Index 2',
current_url_path: '/state/maharashtra',
current_url_protocol: 'https:',
current_url_search:
'?utm_source=google&utm_medium=cpc&utm_campaignid=890&utm_adgroupid=&utm_adid=&utm_term=&utm_device=m&utm_network=x&utm_location=123&gclid=oqneoqow&gad_source=5&gclid=EAIaIQobChMI6MnvhciBjQMVlS-DAx',
gclid: 'oqneoqow',
mp_country_code: 'IN',
mp_lib: 'web',
mp_processing_time_ms: 1746078175546,
mp_sent_by_lib_version: '2.60.0',
utm_medium: 'cpc',
utm_source: 'google',
},
};
const res = provider.transformEvent(rawEvent);
expect(res).toMatchObject({
id: expect.any(String),
name: 'screen_view',
device_id: '123',
profile_id: '123',
project_id: 'pid',
session_id: '',
properties: {
__source_insert_id: 'source_id',
__screen: '384x854',
__lib_version: '2.60.0',
'__query.utm_source': 'google',
'__query.utm_medium': 'cpc',
'__query.utm_campaignid': '890',
'__query.utm_device': 'm',
'__query.utm_network': 'x',
'__query.utm_location': '123',
'__query.gclid': 'oqneoqow',
__title:
'Landeed: Satbara Utara, 7/12 Extract, Property Card & Index 2',
},
created_at: '2025-05-01T11:12:50.000Z',
country: 'IN',
city: 'Mumbai',
region: 'Maharashtra',
longitude: null,
latitude: null,
os: 'Android',
os_version: undefined,
browser: 'Chrome',
browser_version: '',
device: 'mobile',
brand: '',
model: '',
duration: 0,
path: '/state/maharashtra',
origin: 'https://domain.com',
referrer: 'https://referrer.com',
referrer_name: 'Google',
referrer_type: 'search',
imported_at: expect.any(String),
sdk_name: 'mixpanel',
sdk_version: '1.0.0',
});
});
const rawEvent = {
event: '$mp_web_page_view',
properties: {
time: 1746097970,
distinct_id: '$device:123',
$browser: 'Chrome',
$browser_version: 135,
$city: 'Mumbai',
$current_url:
'https://domain.com/state/maharashtra?utm_source=google&utm_medium=cpc&utm_campaignid=890&utm_adgroupid=&utm_adid=&utm_term=&utm_device=m&utm_network=x&utm_location=123&gclid=oqneoqow&gad_sour',
$device: 'Android',
$device_id: '123',
$initial_referrer: 'https://referrer.com/',
$initial_referring_domain: 'referrer.com',
$insert_id: 'source_id',
$lib_version: '2.60.0',
$mp_api_endpoint: 'api-js.mixpanel.com',
$mp_api_timestamp_ms: 1746078175363,
$mp_autocapture: true,
$os: 'Android',
$referrer: 'https://google.com/',
$referring_domain: 'referrer.com',
$region: 'Maharashtra',
$screen_height: 854,
$screen_width: 384,
current_domain: 'domain.com',
current_page_title:
'Landeed: Satbara Utara, 7/12 Extract, Property Card & Index 2',
current_url_path: '/state/maharashtra',
current_url_protocol: 'https:',
current_url_search:
'?utm_source=google&utm_medium=cpc&utm_campaignid=890&utm_adgroupid=&utm_adid=&utm_term=&utm_device=m&utm_network=x&utm_location=123&gclid=oqneoqow&gad_source=5&gclid=EAIaIQobChMI6MnvhciBjQMVlS-DAx',
gclid: 'oqneoqow',
mp_country_code: 'IN',
mp_lib: 'web',
mp_processing_time_ms: 1746078175546,
mp_sent_by_lib_version: '2.60.0',
utm_medium: 'cpc',
utm_source: 'google',
},
};
const res = provider.transformEvent(rawEvent);
expect(res).toMatchObject({
id: expect.any(String),
name: 'screen_view',
device_id: '123',
profile_id: '123',
project_id: 'pid',
session_id: '',
properties: {
__source_insert_id: 'source_id',
__screen: '384x854',
__lib_version: '2.60.0',
'__query.utm_source': 'google',
'__query.utm_medium': 'cpc',
'__query.utm_campaignid': '890',
'__query.utm_device': 'm',
'__query.utm_network': 'x',
'__query.utm_location': '123',
'__query.gclid': 'oqneoqow',
__title:
'Landeed: Satbara Utara, 7/12 Extract, Property Card & Index 2',
},
created_at: '2025-05-01T11:12:50.000Z',
country: 'IN',
city: 'Mumbai',
region: 'Maharashtra',
longitude: null,
latitude: null,
os: 'Android',
os_version: '',
browser: 'Chrome',
browser_version: '135',
device: 'mobile',
brand: '',
model: '',
duration: 0,
path: '/state/maharashtra',
origin: 'https://domain.com',
referrer: 'https://referrer.com',
referrer_name: 'Google',
referrer_type: 'search',
imported_at: expect.any(String),
sdk_name: 'mixpanel',
sdk_version: '1.0.0',
});
});
🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.test.ts lines 38-126: the test
currently masks a mapping bug by expecting browser_version = '' and os_version =
undefined even though rawEvent.properties contains $browser_version and $os;
update the test to assert the actual expected values (browser_version: '135',
os_version: 'Android' or the correct field value) and ensure created_at remains
derived from the time field. Then fix the provider in
packages/importer/src/providers/mixpanel.ts to properly read both camelCase and
snake_case Mixpanel fields (e.g., $browser_version / browser_version,
$os_version / os_version, and fallback to $os if os_version missing), coerce
numeric versions to strings, and assign them to the transformed event's
browser_version and os_version fields so the tightened test passes.

const event = JSON.parse(line);
yield event;
} catch (error) {
console.warn('Failed to parse Mixpanel event:', line);

🛠️ Refactor suggestion | 🟠 Major

Use configured logger instead of console.warn.

Lines 134 and 146 use console.warn, bypassing the structured logger passed to the constructor. This creates inconsistent logging and prevents proper log aggregation and filtering.

Apply this diff to use the configured logger:

         for (const line of lines) {
           if (line.trim()) {
             try {
               const event = JSON.parse(line);
               yield event;
             } catch (error) {
-              console.warn('Failed to parse Mixpanel event:', line);
+              this.logger?.warn('Failed to parse Mixpanel event', { line, error });
             }
           }
         }
       if (buffer.trim()) {
         try {
           const event = JSON.parse(buffer);
           yield event;
         } catch (error) {
-          console.warn('Failed to parse final Mixpanel event:', buffer);
+          this.logger?.warn('Failed to parse final Mixpanel event', { buffer, error });
         }
       }

Also applies to: 146-146

🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 134 and 146, replace
the direct console.warn calls with the instance's configured logger (e.g.,
this.logger.warn or the logger variable passed into the constructor) so logging
uses the structured logger; update both occurrences to call the logger with the
same message and include the event/line and any parse error details, preserving
message clarity and log level.

Comment on lines +202 to +205
const userAgent = props.osVersion || '';
const uaInfo = this.isWebEvent(props.mp_lib)
? parseUserAgent(userAgent, props)
: this.parseServerDeviceInfo(props);

⚠️ Potential issue | 🔴 Critical

Incorrect user agent assignment.

Line 202 assigns props.osVersion to the userAgent variable, but OS version is not the same as the user agent string. The user agent should be extracted from a property like props.$user_agent or similar.

Verify the correct Mixpanel property name for the user agent string and apply the fix:

-    const userAgent = props.osVersion || '';
+    const userAgent = props.$user_agent || '';
     const uaInfo = this.isWebEvent(props.mp_lib)
       ? parseUserAgent(userAgent, props)
       : this.parseServerDeviceInfo(props);

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 202 to 205, the code
incorrectly assigns props.osVersion to userAgent; replace this with the Mixpanel
user agent property (e.g., props.$user_agent or the correct property name from
Mixpanel events), fall back to an empty string or props.osVersion if user agent
is not present, and pass that userAgent into parseUserAgent; ensure the
isWebEvent branch uses the actual UA string and not the OS version.

Comment on lines +288 to +294
os: uaInfo.os || props.$os,
os_version: uaInfo.osVersion || props.$osVersion,
browser: uaInfo.browser || props.$browser,
browser_version:
uaInfo.browserVersion || props.$browserVersion
? String(props.$browser_version)
: '',

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Inconsistent property naming may cause data loss.

Lines 289 and 292-293 use props.$osVersion and props.$browser_version respectively, but Mixpanel typically uses underscores (e.g., $os_version). This inconsistency may result in missing data if the wrong property names are used.

Verify the correct Mixpanel property naming convention:


🌐 Web query:

What are the correct Mixpanel event property names for OS version and browser version?

💡 Result:

Use Mixpanel's default property names:

  • OS name: $os — OS version: $os_version. [1]
  • Browser name: $browser — Browser version: $browser_version. [1]

Source: Mixpanel Default/Property reference. [1]


Fix inconsistent Mixpanel property names to prevent data loss.

The code uses camelCase property names (props.$osVersion, props.$browserVersion) but Mixpanel's standard property names are $os_version and $browser_version (snake_case). Additionally, lines 292-294 check props.$browserVersion in the condition but access props.$browser_version in the ternary, creating a logic inconsistency. Update both lines 289 and 292 to use the correct snake_case property names to ensure data is properly captured.

🤖 Prompt for AI Agents
In packages/importer/src/providers/mixpanel.ts around lines 288 to 294, the
Mixpanel property names are inconsistent (using camelCase props.$osVersion and
props.$browserVersion) and the ternary condition checks a different prop than it
accesses, which can drop data; change references to use Mixpanel's snake_case
names (props.$os_version and props.$browser_version) and make the ternary check
and the value access use the same snake_case prop, converting to string as
needed (e.g., String(props.$browser_version)) so os_version and browser_version
are populated correctly.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (29)
packages/validation/src/index.ts (1)

478-486: This issue has already been flagged in a previous review.

The concern about provider and config mismatches remains unaddressed. Please refer to the previous review comment which provides a discriminated union solution to enforce that each provider literal is paired with its correct config shape.
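In outline, the discriminated-union idea looks like the following dependency-free sketch; field names are illustrative, not the project's actual schema:

```typescript
// Each provider literal is paired with exactly one config shape, so a
// record tagged 'umami' cannot carry Mixpanel credentials. In zod this
// would be z.discriminatedUnion('provider', [...]); here we show the
// equivalent plain TypeScript narrowing.
type UmamiImportConfig = {
  provider: 'umami';
  type: 'file';
  fileUrl: string;
};

type MixpanelImportConfig = {
  provider: 'mixpanel';
  type: 'api';
  serviceAccount: string;
  serviceSecret: string;
  projectId: string;
};

type ImportConfig = UmamiImportConfig | MixpanelImportConfig;

function describeImport(config: ImportConfig): string {
  // Switching on the discriminant narrows to the matching shape.
  switch (config.provider) {
    case 'umami':
      return `umami file import from ${config.fileUrl}`;
    case 'mixpanel':
      return `mixpanel API import for project ${config.projectId}`;
  }
}
```

The compiler then rejects any attempt to pass, say, `{ provider: 'umami', serviceSecret: '...' }`.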

packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1)

1-4: SQL migration is correct.

The migration correctly alters the currentBatch column to TEXT nullable without a default. The TypeScript type mismatch identified in the past review comment (lines 1-4) should be addressed separately in packages/importer/src/types.ts.

packages/db/prisma/migrations/20251018214030_fix/migration.sql (1)

10-13: Critical: Data migration missing before dropping columns.

As identified in the past review comment (lines 10-12), this migration drops provider, sourceLocation, and sourceType columns without first migrating their data to the config JSON field. This will cause data loss for existing import records and break the worker's createProvider function which expects config.provider.
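A hedged sketch of the backfill that should run before the drops, assuming `config` is `jsonb`; verify column names and types against the live schema first:

```sql
-- Fold the legacy scalar columns into the "config" JSON field so the
-- worker's createProvider() still finds config.provider after the drop.
UPDATE "public"."imports"
SET "config" = COALESCE("config", '{}'::jsonb)
  || jsonb_build_object(
       'provider', "provider",
       'sourceType', "sourceType",
       'sourceLocation', "sourceLocation"
     )
WHERE "provider" IS NOT NULL;

-- Only after the backfill:
-- ALTER TABLE "public"."imports"
--   DROP COLUMN "provider",
--   DROP COLUMN "sourceLocation",
--   DROP COLUMN "sourceType";
```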

packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1)

8-14: Destructive migration: status drop/recreate will lose data; currentStep NOT NULL conflicts with code.

This migration drops and recreates the status column, which will lose existing data. Additionally, forcing currentStep to NOT NULL conflicts with the schema definition (line 489 in schema.prisma shows currentStep String?) and worker code that relies on null values to detect new imports.

The safe migration approach is to:

  1. Add a new enum column with a default value
  2. Backfill from the existing text column with proper mapping
  3. Drop the old column and rename
  4. Keep currentStep nullable

Refer to the detailed migration example in the existing past review comment.
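The four steps above can be sketched as follows, with illustrative enum and column names; adapt to the actual `ImportStatus` values before use:

```sql
-- 1. Add a new enum column with a default.
CREATE TYPE "ImportStatus_new" AS ENUM ('pending', 'processing', 'completed', 'failed');
ALTER TABLE "public"."imports"
  ADD COLUMN "status_new" "ImportStatus_new" NOT NULL DEFAULT 'pending';

-- 2. Backfill from the existing text column with explicit mapping.
UPDATE "public"."imports"
SET "status_new" = CASE "status"
  WHEN 'completed'  THEN 'completed'::"ImportStatus_new"
  WHEN 'failed'     THEN 'failed'::"ImportStatus_new"
  WHEN 'processing' THEN 'processing'::"ImportStatus_new"
  ELSE 'pending'::"ImportStatus_new"
END;

-- 3. Drop the old column and rename the new one into place.
ALTER TABLE "public"."imports" DROP COLUMN "status";
ALTER TABLE "public"."imports" RENAME COLUMN "status_new" TO "status";

-- 4. currentStep stays nullable: no NOT NULL constraint is added here,
--    so the worker can still detect new imports via NULL.
```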

apps/start/src/components/project-mapper.tsx (2)

23-31: Define a proper form type instead of using any.

The interface uses any for all generic type parameters, defeating TypeScript's type safety. Define an explicit form schema type to enable proper type checking and autocomplete.

Apply the diff suggested in the past review comment to add proper typing with a ProjectMapperFormData interface.


92-92: Remove the arbitrary 10-project limit.

Line 92 limits the dropdown to only the first 10 projects, preventing users with more projects from mapping to those beyond the limit. The Select component should display all available projects with scrolling.

Remove .slice(0, 10) to show all projects, or implement search/pagination if performance is a concern.

packages/importer/package.json (2)

5-5: Main entry point references source instead of build output.

The main field points to src/index.ts, but consumers will need the compiled JavaScript output. This should point to ./dist/index.js.

Update to:

-  "main": "src/index.ts",
+  "main": "./dist/index.js",

Also ensure "types" field points to the compiled declaration file.


21-22: Review CSV library usage and align Ramda versions.

Two concerns:

  1. Only csv-parse is listed (csv-parser is not present), so this may be fine
  2. ramda is at ^0.29.1 but @types/ramda is at ^0.31.1 - version mismatch may cause type errors

Update ramda to match its types:

-    "ramda": "^0.29.1",
+    "ramda": "^0.31.1",
packages/db/prisma/schema.prisma (1)

479-498: Inconsistency: currentStep nullable in schema but NOT NULL in migration.

Line 489 defines currentStep as String? (optional), but migration 20251028150123_fix_imports_table/migration.sql sets it to NOT NULL. This mismatch will cause runtime issues.

Align schema and migration: either make it required in schema (currentStep String) or remove the NOT NULL constraint from the migration. Based on the worker code needing null to detect new imports, keeping it nullable is recommended.

packages/db/code-migrations/5-add-imports-table.ts (1)

74-85: Fix __filename usage in ESM context.

The package is configured as ESM ("type": "module"), and __filename is undefined in ESM. Other migration files (2-accounts.ts, 3-init-ch.ts) already use the correct pattern.

Import and use fileURLToPath:

+import { fileURLToPath } from 'node:url';
+
+const __filename = fileURLToPath(import.meta.url);
+
 export async function up() {
packages/importer/src/providers/mixpanel.test.ts (1)

39-127: Test masks mapping bugs in provider (browser/os version).

The test expects browser_version: '' and os_version: undefined even though the rawEvent contains $browser_version: 135 and $os: 'Android'. This masks a bug in the provider's field mapping logic.

Tighten the test assertions to expect the actual values:

-      os_version: undefined,
-      browser_version: '',
+      os_version: 'Android',  // or appropriate parsed version
+      browser_version: '135',

Then fix the provider (packages/importer/src/providers/mixpanel.ts) to correctly map $browser_version and $os_version fields, handling both snake_case and camelCase variants.

packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1)

1-3: Ensure gen_random_uuid() is available.

The migration uses gen_random_uuid() without ensuring the pgcrypto extension is enabled, which will cause the migration to fail on databases where pgcrypto isn't already available.

Add the extension check before the table creation:

+-- Enable required extensions
+CREATE EXTENSION IF NOT EXISTS pgcrypto;
+
 -- CreateTable
 CREATE TABLE "public"."imports" (
packages/importer/src/providers/mixpanel.ts (4)

128-147: Route parse errors through the provider logger

console.warn bypasses our structured logger and can disappear in production. Use this.logger?.warn (with the error and raw line) instead.

Apply this diff:

-              console.warn('Failed to parse Mixpanel event:', line);
+              this.logger?.warn('Failed to parse Mixpanel event', {
+                line,
+                error,
+              });
@@
-          console.warn('Failed to parse final Mixpanel event:', buffer);
+          this.logger?.warn('Failed to parse final Mixpanel event', {
+            buffer,
+            error,
+          });

85-91: Remove service account from logs

Logging serviceAccount discloses a credential identifier. Drop it (or mask it) from the metadata.

Apply this diff:

     this.logger?.info('Fetching events from Mixpanel', {
       url: `${url}?${params}`,
       from,
       to,
       projectId,
-      serviceAccount,
     });

201-205: Parse UA from $user_agent, not osVersion

props.osVersion is not a user-agent string, so every web event ends up with blank UA/device data. Prefer $user_agent (fallback to empty string) before feeding parseUserAgent.

Apply this diff:

-    const userAgent = props.osVersion || '';
+    const userAgent =
+      typeof props.$user_agent === 'string' && props.$user_agent
+        ? props.$user_agent
+        : '';

273-279: Fix Mixpanel property names and browser version fallback

Mixpanel uses snake_case ($os_version, $browser_version), and the current ternary always stringifies $browser_version even when undefined, yielding 'undefined'. Use the correct keys and direct fallbacks.

Apply this diff:

-      os: uaInfo.os || props.$os,
-      os_version: uaInfo.osVersion || props.$osVersion,
-      browser: uaInfo.browser || props.$browser,
-      browser_version:
-        uaInfo.browserVersion || props.$browserVersion
-          ? String(props.$browser_version)
-          : '',
+      os: uaInfo.os || props.$os || '',
+      os_version: uaInfo.osVersion || props.$os_version || '',
+      browser: uaInfo.browser || props.$browser || '',
+      browser_version:
+        uaInfo.browserVersion || props.$browser_version || '',
apps/worker/src/jobs/import.ts (5)

35-38: Redact import configuration in logs

record.config can carry API keys and secrets; logging it verbatim leaks credentials. Please log only non-sensitive identifiers such as provider (and explicitly redacted fields) before fanning out to child logs.

Apply this diff:

-  const jobLogger = logger.child({
-    importId,
-    config: record.config,
-  });
+  const jobLogger = logger.child({
+    importId,
+    provider: record.config?.provider,
+  });

134-142: Do not log raw provider events

Invalid events are currently dumped to the logger, exposing full payloads (PII, tokens, etc.). Emit only minimal context (provider + reason) so we stay compliant while still highlighting the failure.

Apply this diff:

-      if (
-        !providerInstance.validate(
-          // @ts-expect-error
-          rawEvent,
-        )
-      ) {
-        jobLogger.warn('Skipping invalid event', { rawEvent });
+      if (!providerInstance.validate(rawEvent as never)) {
+        jobLogger.warn('Skipping invalid event', {
+          provider: providerInstance.provider,
+          reason: 'validation_failed',
+        });
         continue;
       }

128-198: Harden batch transforms to avoid import aborts

A single transformEvent exception currently tears down the job, and we over-count processed rows even when transforms fail. Buffer the batch as unknown[], wrap each transform in try/catch, log failures, and increment processedEvents based on the successfully transformed rows only (both for full and tail batches).

Apply this diff:

-      const eventBatch: any = [];
+      const eventBatch: unknown[] = [];
@@
-          const transformedEvents: IClickhouseEvent[] = eventBatch.map(
-            (
-              // @ts-expect-error
-              event,
-            ) => providerInstance!.transformEvent(event),
-          );
+          const transformedEvents: IClickhouseEvent[] = [];
+          for (const raw of eventBatch) {
+            try {
+              transformedEvents.push(
+                providerInstance.transformEvent(
+                  raw as Parameters<typeof providerInstance.transformEvent>[0],
+                ),
+              );
+            } catch (error) {
+              jobLogger.warn('Skipping event on transform failure', {
+                provider: providerInstance.provider,
+                reason: error instanceof Error ? error.message : String(error),
+              });
+            }
+          }
+          if (transformedEvents.length === 0) {
+            eventBatch.length = 0;
+            continue;
+          }
 
           await insertImportBatch(transformedEvents, importId);
 
-          processedEvents += eventBatch.length;
+          processedEvents += transformedEvents.length;
@@
-      if (eventBatch.length > 0) {
-        const transformedEvents = eventBatch.map(
-          (
-            // @ts-expect-error
-            event,
-          ) => providerInstance!.transformEvent(event),
-        );
+      if (eventBatch.length > 0) {
+        const transformedEvents: IClickhouseEvent[] = [];
+        for (const raw of eventBatch) {
+          try {
+            transformedEvents.push(
+              providerInstance.transformEvent(
+                raw as Parameters<typeof providerInstance.transformEvent>[0],
+              ),
+            );
+          } catch (error) {
+            jobLogger.warn('Skipping event on transform failure', {
+              provider: providerInstance.provider,
+              reason: error instanceof Error ? error.message : String(error),
+            });
+          }
+        }
+        if (transformedEvents.length === 0) {
+          eventBatch.length = 0;
+          continue;
+        }
 
         await insertImportBatch(transformedEvents, importId);
 
-        processedEvents += eventBatch.length;
+        processedEvents += transformedEvents.length;

194-197: Include counters in the final loading update

The tail-batch status update omits totalEvents and processedEvents, leaving Postgres stale for the UI. Pass the counters just like the earlier batches.

Apply this diff:

         await updateImportStatus(jobLogger, job, importId, {
           step: 'loading',
           batch: createdAt,
+          totalEvents,
+          processedEvents,
         });

250-269: Persist final totals after completion

After markImportComplete, Postgres still holds outdated processedEvents/totalEvents. Fetch the final ClickHouse progress and write those numbers back (either via updateImportStatus or a direct db.import.update) so the UI reflects 100 % completion.

Apply this diff:

-    await updateImportStatus(jobLogger, job, importId, {
-      step: 'completed',
-    });
+    await updateImportStatus(jobLogger, job, importId, {
+      step: 'completed',
+      totalEvents,
+      processedEvents,
+    });
@@
-    const finalProgress = await getImportProgress(importId);
+    const finalProgress = await getImportProgress(importId);
+    await db.import.update({
+      where: { id: importId },
+      data: {
+        totalEvents: finalProgress.totalEvents,
+        processedEvents: finalProgress.insertedEvents,
+      },
+    });
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1)

265-275: Redact sensitive config before displaying

imp.config frequently contains API keys and secrets; dumping it straight into the tooltip leaks credentials. Sanitize (or redact) known secret fields before stringifying, and/or restrict the tooltip to privileged roles.

Apply this diff (or similar):

-                      content={
-                        <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
-                          {JSON.stringify(imp.config, null, 2)}
-                        </pre>
-                      }
+                      content={
+                        <pre className="font-mono text-sm leading-normal whitespace-pre-wrap break-words">
+                          {JSON.stringify(redactImportConfig(imp.config), null, 2)}
+                        </pre>
+                      }

Add a helper such as:

function redactImportConfig(config: Record<string, unknown>) {
  const masked = { ...config };
  for (const key of ['apiKey', 'apiSecret', 'serviceAccount', 'serviceSecret', 'token', 'secret']) {
    if (masked[key]) masked[key] = '***';
  }
  return masked;
}
apps/start/src/modals/add-import.tsx (1)

246-248: Use a generic failure message

Surfacing error.message can leak upstream credentials or debug details. Replace it with a neutral user-facing string and keep the sensitive message server-side.

Apply this diff:

-      onError: (error) => {
-        toast.error('Import failed', {
-          description: error.message,
-        });
+      onError: () => {
+        toast.error('Import failed', {
+          description: 'We could not queue the import. Please try again or contact support.',
+        });
packages/importer/src/types.ts (1)

53-66: EventProperties allows nested objects, but ClickHouse properties is primitives-only.

EventProperties permits Record<string, unknown> (line 61) and defines __query as a nested object (line 62). However, IClickhouseEvent.properties expects Record<string, string | number | boolean | undefined | null>. Storing objects will break ingestion unless stringified.

Consider:

  • Constrain values to primitives: string | number | boolean | null | undefined
  • Or require JSON stringification for nested data (e.g., __queryJson: string)

Apply this diff to constrain to primitives:

 export interface EventProperties {
   [key: string]:
     | string
     | number
     | boolean
     | null
-    | undefined
-    | Record<string, unknown>;
-  __query?: Record<string, unknown>;
+    | undefined;
   __title?: string;
   __screen?: string;
   __language?: string;
 }
packages/db/src/services/import.service.ts (5)

1-12: Missing import for crypto.randomUUID().

Lines 272 and 309 call crypto.randomUUID(), but randomUUID is not imported from node:crypto.

Add the import:

 import type { ILogger } from '@openpanel/logger';
+import { randomUUID } from 'node:crypto';
 import sqlstring from 'sqlstring';

Then replace usage:

-      id: crypto.randomUUID(),
+      id: randomUUID(),

262-288: Risk of timestamp collisions with existing events.

Adjusting session boundaries by ±1 second (line 288: -1000ms, line 325: +500ms) could create timestamp conflicts if real events exist at those exact timestamps, leading to non-deterministic ordering.

Consider:

  1. Use sub-millisecond offsets (±1ms) to reduce collision probability
  2. Add an is_synthetic flag and use (created_at, is_synthetic) for ordering
  3. Use ClickHouse's native session reconstruction capabilities
-     created_at: adjustTimestamp(session.first_timestamp, -1000),
+     created_at: adjustTimestamp(session.first_timestamp, -1),

553-554: Inconsistent parameter handling.

Lines 553-554 use sqlstring.escape() for inline substitution, while the rest of the file uses ClickHouse's query_params mechanism (e.g., lines 122, 449). This inconsistency reduces maintainability.

Refactor to use query_params:

     WHERE 
-      e.import_id = ${sqlstring.escape(importId)}
-      AND toDate(e.created_at) = ${sqlstring.escape(from)}
+      e.import_id = {importId:String}
+      AND toDate(e.created_at) = {from:String}
       AND e.session_id != ''
     GROUP BY e.session_id
   `;

   await ch.command({
     query: sessionsInsertQuery,
+    query_params: { importId, from },
     clickhouse_settings: {

601-613: Use deterministic status aggregation instead of any(import_status).

any(import_status) (line 607) returns an arbitrary status from the group, making results non-deterministic when events have mixed statuses (e.g., some 'pending', some 'processed').

Use deterministic aggregation:

     SELECT 
       import_id,
       COUNT(*) as total_events,
       COUNTIf(import_status = 'pending') as pending_events,
       COUNTIf(import_status = 'processed') as processed_events,
-      any(import_status) as status
+      CASE
+        WHEN COUNTIf(import_status = 'failed') > 0 THEN 'failed'
+        WHEN COUNTIf(import_status = 'processed') = COUNT(*) THEN 'processed'
+        WHEN COUNTIf(import_status = 'pending') > 0 THEN 'processing'
+        ELSE 'pending'
+      END as status
     FROM ${TABLE_NAMES.events_imports}

647-652: Validate status value instead of using type assertion.

The type assertion on line 651 bypasses type safety. If ClickHouse returns an unexpected status value, the type system won't catch it.

Add runtime validation:

+ const validStatuses = ['pending', 'processing', 'processed', 'failed'] as const;
+ const status = validStatuses.includes(row.status as any) 
+   ? (row.status as typeof validStatuses[number])
+   : 'pending';
+
  return {
    importId,
    totalEvents: row.total_events,
    insertedEvents: row.processed_events,
-   status: row.status as 'pending' | 'processing' | 'processed' | 'failed',
+   status,
  };
🧹 Nitpick comments (5)
packages/common/src/object.test.ts (1)

5-32: Strong test coverage for main scenarios.

The test thoroughly validates the core functionality including primitives, arrays, nested objects, and JSON string parsing. The expectations correctly reflect the implementation's behavior of stringifying values, flattening structures, and filtering out null/undefined/empty strings.

Consider adding test cases for additional edge cases:

  • Boolean values (true/false)
  • Falsy number (0)
  • Empty arrays and objects
  • Arrays containing null/undefined elements

These would further strengthen the test suite but are not critical for the current functionality.
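To make the suggested edge cases concrete, here is a self-contained sketch of the flattening behavior the review describes (stringify values, flatten nested keys, drop null/undefined/empty strings); it is not the actual implementation in `packages/common/src/object.ts`:

```typescript
// Illustrative dot-path flattener matching the documented behavior.
function flatten(
  obj: Record<string, unknown>,
  prefix = '',
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    // Filter out null, undefined, and empty strings.
    if (value === null || value === undefined || value === '') continue;
    if (typeof value === 'object') {
      // Arrays and objects recurse; empty containers vanish entirely.
      Object.assign(out, flatten(value as Record<string, unknown>, path));
    } else {
      out[path] = String(value);
    }
  }
  return out;
}

// Edge cases worth asserting: booleans, the falsy number 0, empty
// containers, and null elements nested inside objects.
console.log(flatten({ a: true, b: 0, c: [], d: { e: null } }));
// prints: { a: 'true', b: '0' }
```

A test along these lines pins down that `true` and `0` survive as strings while empty arrays and null-only objects disappear.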

packages/validation/src/index.ts (1)

461-471: Consider date validation for temporal fields.

The from and to fields likely represent date ranges but are only validated as non-empty strings. Invalid date formats could cause runtime errors in the import processor.

Consider adding date format validation:

 export const zMixpanelImportConfig = z.object({
   provider: z.literal('mixpanel'),
   type: z.literal('api'),
   serviceAccount: z.string().min(1),
   serviceSecret: z.string().min(1),
   projectId: z.string().min(1),
-  from: z.string().min(1),
-  to: z.string().min(1),
+  from: z.string().datetime(),
+  to: z.string().datetime(),
   mapScreenViewProperty: z.string().optional(),
 });

Note: Use .datetime() for ISO 8601 timestamps or .date() for YYYY-MM-DD format, depending on the expected input format.

packages/redis/package.json (1)

11-11: Consider using caret range for consistency.

The change from "^5.7.0" to "5.8.2" removes semver flexibility and pins to an exact version. This is inconsistent with dependency management patterns elsewhere in the monorepo and prevents automatic patch updates.

Consider reverting to a caret range for consistency:

-    "ioredis": "5.8.2"
+    "ioredis": "^5.8.2"
packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1)

1-3: Migration creates type churn.

The migration adds currentBatch as INTEGER NOT NULL DEFAULT 0, but a subsequent migration (20251028152531_imports_again) changes it to TEXT nullable. While both migrations are technically valid, this indicates design iteration. For future migrations, consider finalizing type decisions before creating schema changes to reduce churn.

packages/db/src/services/import.service.ts (1)

721-784: Consider coordinating Postgres and ClickHouse state updates.

updateImportStatus updates Postgres only, while markImportComplete (lines 574-593) updates ClickHouse only. If one system's update fails, import state becomes inconsistent across the two databases.

Consider one of these approaches:

  1. Update Postgres first (as source of truth), then ClickHouse, with retry logic
  2. Use an idempotent reconciliation job that periodically syncs state
  3. At minimum, add try-catch with logging for manual reconciliation
 export async function updateImportStatus(
   jobLogger: ILogger,
   job: { updateProgress: (progress: Record<string, any>) => void },
   importId: string,
   options: UpdateImportStatusOptions,
 ): Promise<void> {
+  try {
     const data: Prisma.ImportUpdateInput = {};
     // ... build data object ...
     
     await job.updateProgress(data);
     await db.import.update({
       where: { id: importId },
       data,
     });
+  } catch (error) {
+    jobLogger.error('Failed to update import status', { importId, error });
+    throw error;
+  }
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a11373 and c9ebec6.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (76)
  • apps/api/package.json (1 hunks)
  • apps/start/Dockerfile (2 hunks)
  • apps/start/package.json (2 hunks)
  • apps/start/src/components/integrations/integration-card.tsx (1 hunks)
  • apps/start/src/components/project-mapper.tsx (1 hunks)
  • apps/start/src/components/ui/badge.tsx (1 hunks)
  • apps/start/src/modals/add-import.tsx (1 hunks)
  • apps/start/src/modals/date-ranger-picker.tsx (1 hunks)
  • apps/start/src/modals/index.tsx (2 hunks)
  • apps/start/src/routeTree.gen.ts (11 hunks)
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (1 hunks)
  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1 hunks)
  • apps/start/src/utils/math.ts (1 hunks)
  • apps/start/src/utils/slug.ts (1 hunks)
  • apps/worker/Dockerfile (2 hunks)
  • apps/worker/package.json (3 hunks)
  • apps/worker/src/boot-workers.ts (4 hunks)
  • apps/worker/src/index.ts (2 hunks)
  • apps/worker/src/jobs/cron.delete-projects.ts (1 hunks)
  • apps/worker/src/jobs/events.incoming-event.ts (3 hunks)
  • apps/worker/src/jobs/events.incoming-events.test.ts (2 hunks)
  • apps/worker/src/jobs/import.ts (1 hunks)
  • package.json (1 hunks)
  • packages/cli/package.json (0 hunks)
  • packages/cli/src/cli.ts (0 hunks)
  • packages/cli/src/importer/importer.ts (0 hunks)
  • packages/cli/src/importer/index.ts (0 hunks)
  • packages/cli/tsconfig.json (0 hunks)
  • packages/cli/tsup.config.ts (0 hunks)
  • packages/common/package.json (1 hunks)
  • packages/common/scripts/get-referrers.ts (1 hunks)
  • packages/common/server/index.ts (1 hunks)
  • packages/common/server/parse-referrer.test.ts (5 hunks)
  • packages/common/server/parse-referrer.ts (4 hunks)
  • packages/common/server/parser-user-agent.ts (2 hunks)
  • packages/common/src/object.test.ts (1 hunks)
  • packages/common/src/object.ts (2 hunks)
  • packages/db/code-migrations/5-add-imports-table.sql (1 hunks)
  • packages/db/code-migrations/5-add-imports-table.ts (1 hunks)
  • packages/db/index.ts (2 hunks)
  • packages/db/package.json (1 hunks)
  • packages/db/prisma/migrations/20251018205153_add_import_table/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251018214030_fix/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028103556_import_batch_and_status/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028150123_fix_imports_table/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql (1 hunks)
  • packages/db/prisma/migrations/20251028152531_imports_again/migration.sql (1 hunks)
  • packages/db/prisma/schema.prisma (2 hunks)
  • packages/db/src/buffers/base-buffer.ts (1 hunks)
  • packages/db/src/buffers/event-buffer.test.ts (8 hunks)
  • packages/db/src/clickhouse/client.ts (4 hunks)
  • packages/db/src/clickhouse/csv.ts (1 hunks)
  • packages/db/src/clickhouse/migration.ts (1 hunks)
  • packages/db/src/services/event.service.ts (1 hunks)
  • packages/db/src/services/import.service.ts (1 hunks)
  • packages/db/src/services/session.service.ts (1 hunks)
  • packages/db/src/types.ts (2 hunks)
  • packages/importer/package.json (1 hunks)
  • packages/importer/src/base-provider.ts (1 hunks)
  • packages/importer/src/index.ts (1 hunks)
  • packages/importer/src/providers/metadata.ts (1 hunks)
  • packages/importer/src/providers/mixpanel.test.ts (1 hunks)
  • packages/importer/src/providers/mixpanel.ts (1 hunks)
  • packages/importer/src/providers/umami.ts (1 hunks)
  • packages/importer/src/types.ts (1 hunks)
  • packages/importer/tsconfig.json (1 hunks)
  • packages/importer/vitest.config.ts (1 hunks)
  • packages/queue/src/queues.ts (1 hunks)
  • packages/redis/package.json (1 hunks)
  • packages/trpc/package.json (1 hunks)
  • packages/trpc/src/root.ts (2 hunks)
  • packages/trpc/src/routers/auth.ts (1 hunks)
  • packages/trpc/src/routers/import.ts (1 hunks)
  • packages/trpc/src/routers/organization.ts (1 hunks)
  • packages/validation/src/index.ts (1 hunks)
💤 Files with no reviewable changes (6)
  • packages/cli/package.json
  • packages/cli/tsup.config.ts
  • packages/cli/src/cli.ts
  • packages/cli/src/importer/importer.ts
  • packages/cli/src/importer/index.ts
  • packages/cli/tsconfig.json
🚧 Files skipped from review as they are similar to previous changes (33)
  • packages/common/server/index.ts
  • apps/start/src/modals/date-ranger-picker.tsx
  • apps/worker/src/index.ts
  • packages/common/server/parse-referrer.ts
  • packages/trpc/src/root.ts
  • packages/db/src/types.ts
  • apps/start/src/modals/index.tsx
  • packages/common/src/object.ts
  • apps/worker/Dockerfile
  • packages/db/src/buffers/event-buffer.test.ts
  • packages/db/index.ts
  • packages/trpc/src/routers/import.ts
  • apps/start/src/components/integrations/integration-card.tsx
  • packages/trpc/src/routers/organization.ts
  • apps/start/src/utils/math.ts
  • packages/importer/src/providers/metadata.ts
  • packages/common/scripts/get-referrers.ts
  • apps/start/package.json
  • apps/start/src/components/ui/badge.tsx
  • package.json
  • packages/importer/src/providers/umami.ts
  • apps/worker/package.json
  • packages/importer/vitest.config.ts
  • packages/db/prisma/migrations/20251028150655_fix_imports_again/migration.sql
  • packages/db/src/services/event.service.ts
  • apps/start/Dockerfile
  • packages/importer/src/index.ts
  • apps/worker/src/boot-workers.ts
  • packages/queue/src/queues.ts
  • packages/importer/tsconfig.json
  • packages/trpc/src/routers/auth.ts
  • apps/start/src/routes/app.$organizationId.$projectId.settings._tabs.tsx
  • packages/db/src/services/session.service.ts
🧰 Additional context used
📓 Path-based instructions (1)
apps/start/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/start/.cursorrules)

apps/start/**/*.{ts,tsx}: Instrument server functions created with createServerFn by wrapping their implementation in Sentry.startSpan and importing Sentry from '@sentry/tanstackstart-react'
When using Sentry in the TanStack Start React app, import it as: import * as Sentry from '@sentry/tanstackstart-react'

Files:

  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
  • apps/start/src/modals/add-import.tsx
  • apps/start/src/utils/slug.ts
  • apps/start/src/routeTree.gen.ts
  • apps/start/src/components/project-mapper.tsx
🧠 Learnings (9)
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, use the helper functions in ./packages/db/src/clickhouse/query-functions.ts rather than crafting raw queries

Applied to files:

  • packages/db/src/clickhouse/csv.ts
  • packages/db/code-migrations/5-add-imports-table.ts
  • packages/db/src/clickhouse/client.ts
  • packages/db/src/services/import.service.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*.tsx : Define React route files with createFileRoute and let the bundler/CLI manage the path string

Applied to files:

  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
  • apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/__root.tsx : Declare the root route in routes/__root.tsx using createRootRoute or createRootRouteWithContext

Applied to files:

  • apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx
  • apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/{index,*.index}.tsx : For index routes, use index.tsx or *.index.tsx files and ensure the createFileRoute path uses a trailing slash (e.g., '/posts/')

Applied to files:

  • apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*$*.tsx : Name dynamic route files with $param (e.g., posts.$postId.tsx) to capture path parameters

Applied to files:

  • apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/$.tsx : Use $.tsx for splat (catch-all) routes to capture the remaining path into params._splat

Applied to files:

  • apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/*_.tsx : Use a trailing underscore in a segment filename to create a non-nested (un-nested) route (e.g., posts_.tsx)

Applied to files:

  • apps/start/src/routeTree.gen.ts
📚 Learning: 2025-10-16T10:29:28.924Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: apps/start/.cursorrules:0-0
Timestamp: 2025-10-16T10:29:28.924Z
Learning: Applies to apps/start/routes/**/_*.tsx : Use leading underscore filenames for pathless layout routes that wrap children without matching a URL segment

Applied to files:

  • apps/start/src/routeTree.gen.ts
📚 Learning: 2025-09-26T18:30:17.476Z
Learnt from: CR
PR: Openpanel-dev/openpanel#0
File: .cursorrules:0-0
Timestamp: 2025-09-26T18:30:17.476Z
Learning: When writing ClickHouse queries, always construct them using the custom query builder in ./packages/db/src/clickhouse/query-builder.ts

Applied to files:

  • packages/db/src/clickhouse/client.ts
  • packages/db/src/services/import.service.ts
🧬 Code graph analysis (12)
packages/db/code-migrations/5-add-imports-table.ts (2)
packages/db/code-migrations/helpers.ts (1)
  • getIsCluster (17-24)
packages/db/src/clickhouse/migration.ts (3)
  • createTable (68-116)
  • modifyTTL (118-132)
  • runClickhouseMigrationCommands (338-470)
packages/importer/src/base-provider.ts (2)
packages/importer/src/types.ts (3)
  • BaseRawEvent (40-42)
  • ImportJobMetadata (69-73)
  • ErrorContext (45-51)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
packages/importer/src/providers/mixpanel.test.ts (1)
packages/importer/src/providers/mixpanel.ts (1)
  • MixpanelProvider (18-452)
packages/common/server/parse-referrer.test.ts (1)
packages/common/server/parse-referrer.ts (2)
  • parseReferrer (17-26)
  • getReferrerWithQuery (28-66)
apps/worker/src/jobs/import.ts (8)
packages/queue/src/queues.ts (1)
  • ImportQueuePayload (168-173)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
packages/db/src/services/import.service.ts (10)
  • ImportSteps (719-719)
  • updateImportStatus (721-784)
  • getImportDateBounds (658-682)
  • insertImportBatch (30-87)
  • generateSessionIds (94-130)
  • createSessionsStartEndEvents (138-348)
  • moveImportsToProduction (354-458)
  • backfillSessionsToProduction (460-569)
  • markImportComplete (574-593)
  • getImportProgress (598-653)
packages/db/src/clickhouse/client.ts (2)
  • formatClickhouseDate (237-248)
  • error (41-46)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/importer/src/providers/umami.ts (1)
  • UmamiProvider (59-382)
packages/importer/src/providers/mixpanel.ts (1)
  • MixpanelProvider (18-452)
packages/common/src/object.test.ts (1)
packages/common/src/object.ts (1)
  • toDots (16-67)
packages/importer/src/types.ts (2)
packages/db/src/services/event.service.ts (1)
  • IServiceImportedEventPayload (144-147)
packages/db/src/services/import.service.ts (2)
  • ImportProgress (20-25)
  • ImportStageResult (14-18)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.imports.tsx (4)
apps/start/src/routes/_app.$organizationId.$projectId_.settings._tabs.tsx (1)
  • Route (12-34)
packages/importer/src/providers/metadata.ts (1)
  • IMPORT_PROVIDERS (13-30)
apps/public/components/ui/tooltip.tsx (1)
  • Tooltiper (36-51)
apps/start/src/components/integrations/integration-card.tsx (3)
  • IntegrationCard (91-112)
  • IntegrationCardLogoImage (50-69)
  • IntegrationCardFooter (3-15)
apps/start/src/modals/add-import.tsx (3)
packages/validation/src/index.ts (5)
  • zUmamiImportConfig (452-454)
  • zMixpanelImportConfig (461-470)
  • IUmamiImportConfig (456-456)
  • IMixpanelImportConfig (471-471)
  • IImportConfig (473-476)
apps/start/src/components/project-mapper.tsx (1)
  • ProjectMapper (33-116)
apps/public/lib/utils.ts (1)
  • cn (4-6)
apps/start/src/components/project-mapper.tsx (2)
packages/db/src/services/project.service.ts (1)
  • IServiceProject (7-7)
apps/start/src/components/ui/select.tsx (5)
  • Select (173-173)
  • SelectTrigger (181-181)
  • SelectValue (182-182)
  • SelectContent (174-174)
  • SelectItem (176-176)
packages/importer/src/providers/mixpanel.ts (7)
packages/validation/src/index.ts (1)
  • IMixpanelImportConfig (471-471)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/db/src/services/event.service.ts (2)
  • IClickhouseEvent (68-102)
  • query (701-847)
packages/common/src/url.ts (2)
  • parsePath (11-49)
  • isSameDomain (51-63)
packages/common/server/parse-referrer.ts (2)
  • parseReferrer (17-26)
  • getReferrerWithQuery (28-66)
packages/common/server/parser-user-agent.ts (1)
  • parseUserAgent (72-115)
packages/common/src/object.ts (2)
  • toDots (16-67)
  • strip (79-79)
packages/db/src/services/import.service.ts (5)
packages/db/src/services/event.service.ts (1)
  • IClickhouseEvent (68-102)
packages/db/src/clickhouse/csv.ts (2)
  • csvEscapeField (42-53)
  • csvEscapeJson (15-38)
packages/db/src/clickhouse/client.ts (5)
  • formatClickhouseDate (237-248)
  • chInsertCSV (202-228)
  • TABLE_NAMES (49-61)
  • ch (132-162)
  • convertClickhouseDateToJs (267-269)
packages/logger/index.ts (1)
  • ILogger (6-6)
packages/db/src/prisma-client.ts (1)
  • db (270-270)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-and-push-worker
  • GitHub Check: build-and-push-api
  • GitHub Check: build-and-push-dashboard
🔇 Additional comments (30)
apps/api/package.json (1)

67-67: Verify the rationale for pinning tsdown to an exact version.

The caret operator has been removed from tsdown, changing it from "^0.14.2" to "0.14.2" (an exact version pin). The AI summary indicates this change is coordinated across the monorepo. Ensure this is intentional, e.g. for reproducible builds or a known stability requirement with this specific version.

Is this pinning part of a broader monorepo strategy, or is there a specific reason (e.g., bug fix, stability) for locking to 0.14.2? You may want to verify consistency across other workspace packages to confirm alignment.

packages/db/src/buffers/base-buffer.ts (1)

1-1: LGTM! Import path consolidation.

The import path change from @openpanel/common/server/id to @openpanel/common/server is part of a coordinated refactor to centralize server utilities, with similar updates across auth.ts and organization.ts.

packages/common/src/object.test.ts (2)

1-2: LGTM!

The imports are correct and appropriate for a vitest test file.


34-47: Excellent validation of malformed JSON handling.

This test effectively validates that the function correctly:

  • Parses valid JSON strings and expands them into dot notation
  • Skips strings that appear to be JSON but are malformed
  • Preserves regular strings as-is

The test expectations align well with the implementation's defensive approach to JSON parsing.
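The behavior described above can be sketched roughly as follows. This is a hypothetical approximation of `toDots`, not the actual implementation in `packages/common/src/object.ts`; the name `toDotsSketch` and the `{`-prefix heuristic are assumptions:

```typescript
// Hypothetical sketch of the dot-notation expansion described above.
// Assumption: nested objects and valid JSON strings are flattened into
// "a.b.c" keys; malformed JSON-looking strings are left untouched.
function toDotsSketch(
  obj: Record<string, unknown>,
  prefix = '',
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, rawValue] of Object.entries(obj)) {
    let value = rawValue;
    // Try to expand strings that look like JSON objects; skip on parse failure
    if (typeof value === 'string' && value.trim().startsWith('{')) {
      try {
        value = JSON.parse(value);
      } catch {
        // Malformed JSON: keep the original string as-is
      }
    }
    const path = prefix ? `${prefix}.${key}` : key;
    if (value && typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(out, toDotsSketch(value as Record<string, unknown>, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}
```

With this sketch, `{ a: '{"b":1}', c: '{broken' }` would expand `a` into `a.b` while preserving `c` verbatim, matching the test expectations above.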

packages/common/package.json (1)

6-9: Public exports field enables dual entry points.

The new exports field correctly exposes the main entry point and a server-specific export, aligning with the referrer utilities being made part of the public API surface (parse-referrer, id).

packages/validation/src/index.ts (2)

439-459: Well-structured file-based import configs.

The createFileImportConfig helper elegantly reduces duplication while maintaining type safety with provider literals. The zProjectMapper schema and its integration into zUmamiImportConfig provides clear validation for project mapping requirements.


473-476: Clean union type export.

The IImportConfig union type appropriately combines the three provider-specific config types for type safety in consuming code.

packages/trpc/package.json (1)

19-19: LGTM!

The addition of the @openpanel/queue workspace dependency aligns with the new import queue functionality introduced in this PR.

packages/db/src/clickhouse/csv.ts (3)

5-13: LGTM!

The escape order is correct (backslashes first), and the function properly handles all necessary control characters for ClickHouse Map format.


15-38: LGTM!

The function correctly handles null values, normalizes to strings for Map(String,String), and properly delegates to csvEscapeField for final CSV escaping.


42-53: LGTM!

The function correctly implements CSV RFC 4180 standard by escaping double quotes through doubling and wrapping fields containing special characters.
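As a rough illustration of the RFC 4180 rule being applied (a hypothetical sketch, not the code under review — the real `csvEscapeField` lives in `packages/db/src/clickhouse/csv.ts`):

```typescript
// Hypothetical RFC 4180-style field escaping: double any embedded quotes,
// then wrap the field in quotes if it contains a quote, comma, or newline.
function csvEscapeFieldSketch(value: string): string {
  if (/[",\n\r]/.test(value)) {
    return `"${value.replace(/"/g, '""')}"`;
  }
  return value;
}
```

Fields without special characters pass through unchanged, which keeps the generated CSV compact.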

packages/db/src/clickhouse/client.ts (4)

1-1: LGTM!

The Readable import is correctly used in the new chInsertCSV function, and the events_imports table addition aligns with the import system introduced in this PR.

Also applies to: 60-60


65-65: Verify timeout increases are appropriate for the import workload.

The request timeout increased 5x (60s → 300s) and idle socket TTL increased 7.5x (8s → 60s). While these increases support long-running import operations, consider monitoring these values in production to ensure they don't mask underlying performance issues or cause resource exhaustion.

Also applies to: 68-68


137-157: LGTM!

The retry wrapper and comprehensive ClickHouse settings properly support large batch inserts. The settings spread order (line 154) correctly allows user settings to override defaults, and the progress headers prevent proxy timeouts during long-running operations.


202-228: LGTM!

The CSV streaming implementation is correct, and the format settings properly align with the CSV escaping utilities (double quotes for CSV fields, single quotes for Map format internally). Error handling and logging are appropriate.

apps/worker/src/jobs/cron.delete-projects.ts (1)

57-57: LGTM!

The type change from numeric 0 to string '0' aligns with ClickHouse setting conventions and likely reflects updated type definitions in the newer @clickhouse/client version.

packages/importer/src/base-provider.ts (3)

4-31: LGTM!

The abstract class design is clean with appropriate type constraints. Using AsyncGenerator for parseSource enables memory-efficient streaming for large imports, which is essential for this use case.


36-68: LGTM!

The optional hooks are well-designed with sensible defaults. The shouldGenerateSessionIds() default of false appropriately assumes providers handle session ID generation during streaming.


78-120: LGTM!

The date chunking implementation correctly splits date ranges while handling edge cases (same start/end date) and ensuring chunks don't exceed the end date. The break condition prevents infinite loops.
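The chunking logic as described — split a range into fixed-size chunks, clamp the last chunk to the end date, and break to avoid looping forever — can be approximated like this (a sketch under assumed semantics; the provider's actual signature may differ):

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical sketch of splitting a date range into inclusive chunks.
function chunkDateRange(
  start: Date,
  end: Date,
  days: number,
): Array<{ start: Date; end: Date }> {
  const chunks: Array<{ start: Date; end: Date }> = [];
  let cursor = start.getTime();
  const endMs = end.getTime();
  while (cursor <= endMs) {
    // A chunk covers `days` days, inclusive of its start day, clamped to `end`
    const chunkEnd = Math.min(cursor + (days - 1) * DAY_MS, endMs);
    chunks.push({ start: new Date(cursor), end: new Date(chunkEnd) });
    if (chunkEnd >= endMs) break; // last chunk reached the end date
    cursor = chunkEnd + DAY_MS; // next chunk starts the following day
  }
  return chunks;
}
```

A same-day range yields exactly one chunk, and the final chunk never extends past `end`, mirroring the edge cases the review calls out.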

packages/db/prisma/migrations/20251022191315_add_status_message/migration.sql (1)

1-2: LGTM!

The migration correctly adds an optional statusMessage column to support detailed import status tracking, aligning with the PR's import system objectives.

packages/db/package.json (1)

16-16: Verify compatibility with @clickhouse/client 1.12.1.

The upgrade from ^1.2.0 to ^1.12.1 spans 10 minor versions. Although the new range remains semver-compatible, verify there are no behavioral changes affecting existing ClickHouse operations.

apps/worker/src/jobs/events.incoming-events.test.ts (1)

102-102: LGTM! Test expectations align with updated referrer parsing behavior.

The expectations correctly reflect that referrerType now returns an empty string ('') instead of 'unknown' for non-derived referrers, consistent with the changes in packages/common/server/parse-referrer.ts.

Also applies to: 210-210

packages/db/src/clickhouse/migration.ts (1)

118-132: LGTM! Clean implementation following established patterns.

The modifyTTL function correctly handles both clustered and non-clustered scenarios, using the replicated() helper consistently with other migration utilities in this file. The implementation is type-safe and follows the existing code structure.

packages/common/server/parse-referrer.test.ts (1)

8-8: LGTM! Comprehensive test coverage for the referrer type change.

All test expectations have been consistently updated to reflect that unknown or empty referrer types now return '' instead of 'unknown'. The changes cover both parseReferrer and getReferrerWithQuery functions across various scenarios (undefined, empty, invalid, and unknown inputs).

Also applies to: 14-14, 44-44, 52-52, 67-67, 91-91, 99-99, 113-113
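In effect, the fallback changed from `'unknown'` to the empty string. A hypothetical lookup illustrating the new contract (the table, host names, and function name are made up for this sketch):

```typescript
// Hypothetical referrer classification table for illustration only.
const REFERRER_TYPES: Record<string, string> = {
  'google.com': 'search',
  'twitter.com': 'social',
};

function referrerTypeSketch(host: string | undefined): string {
  // Previously unmatched hosts yielded 'unknown'; now they yield ''.
  return (host && REFERRER_TYPES[host]) || '';
}
```

Returning `''` rather than `'unknown'` keeps the stored value falsy, so downstream code can use a simple truthiness check instead of comparing against a sentinel string.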

packages/db/code-migrations/5-add-imports-table.sql (1)

1-43: LGTM! Well-designed import staging table.

The events_imports table schema is well-structured:

  • Mirrors the main events table structure with additional import tracking fields (import_id, import_status, imported_at_meta)
  • Uses appropriate compression codecs for each data type
  • Partitioning by month on imported_at_meta aligns with the 7-day TTL for efficient partition drops
  • ORDER BY (import_id, created_at) supports efficient queries by import batch

The separate TTL ALTER statement follows correct ClickHouse patterns.

packages/common/server/parser-user-agent.ts (2)

71-71: Good API expansion with exported type.

The UserAgentInfo type export provides a clean public API for consumers to type the return value of parseUserAgent, using TypeScript's ReturnType utility type.


84-112: LGTM! More defensive override handling.

The updated override logic adds explicit type guards and truthiness checks, preventing empty strings or non-string values from being used as overrides. This is more defensive than the previous ?? operator approach and ensures only valid, non-empty string overrides are applied. The pattern is consistently applied across all fields.
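The guard pattern described — apply an override only when it is a non-empty string — amounts to something like this (a sketch; `pickOverride` is a hypothetical name, not the parser's actual API):

```typescript
// Hedged sketch: prefer the override only when it is a non-empty string.
// A plain `override ?? parsed` would let '' (or other falsy non-null
// values) win, which is exactly what the type guard prevents.
function pickOverride(parsed: string, override: unknown): string {
  return typeof override === 'string' && override.length > 0
    ? override
    : parsed;
}
```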

packages/db/code-migrations/5-add-imports-table.ts (1)

65-72: TTL implementation via modifyTTL helper is correct.

The code uses the modifyTTL helper function which correctly targets the replicated table in cluster mode (see packages/db/src/clickhouse/migration.ts lines 117-131). This addresses the concern about TTL on distributed tables.

The helper properly constructs:

  • ALTER TABLE ${replicated(tableName)} ON CLUSTER '{cluster}' MODIFY TTL ${ttl} for clustered
  • ALTER TABLE ${tableName} MODIFY TTL ${ttl} for non-clustered
apps/worker/src/jobs/events.incoming-event.ts (2)

8-23: Import reorganization aligns with common package refactoring.

The imports have been updated to use public exports from @openpanel/common/server, and Job is now correctly imported as a type-only import. The separate import of omit from ramda alongside the namespace import is acceptable.


120-122: Referrer logic improvements are correct.

The changes improve referrer handling:

  • Line 120: baseEvent.referrer now uses only referrer?.url (removed utmReferrer?.url), which is correct since UTM parameters shouldn't override the actual referrer URL
  • Line 122: referrerType now prioritizes referrer?.type before utmReferrer?.type, which properly favors detected referrer types (like 'search') over UTM-based types

@@ -1 +1 @@
-export * from '@openpanel/common/src/slug';
+export * from '@openpanel/common';

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify export scope: file exports entire common package instead of slug utilities.

This file is named slug.ts but now exports everything from @openpanel/common, not just slug-related functions. This introduces namespace pollution and breaks the single-responsibility principle for utility modules.

Consider using named exports to maintain the focused scope:

-export * from '@openpanel/common';
+export { slugify, /* other slug-specific exports */ } from '@openpanel/common';

Run the following script to verify what slug-related symbols should be exported:


🏁 Script executed:

#!/bin/bash
# Find usages of this slug utility module to determine required exports
rg -n --type=ts --type=tsx "from ['\"]@/utils/slug['\"]" -A 1 -B 1

# Find the original slug exports from common package
fd -e ts -e tsx --full-path "packages/common" | xargs rg -n "export.*slug" -i

Length of output: 157


🏁 Script executed:

#!/bin/bash
# Check all exports in the common package's index
echo "=== Contents of packages/common/index.ts ==="
cat packages/common/index.ts

echo ""
echo "=== Search for any imports of apps/start/src/utils/slug.ts ==="
rg -n --type=ts "utils/slug" -C 2

echo ""
echo "=== Check if apps/start/src/utils/slug.ts file exists and its content ==="
cat apps/start/src/utils/slug.ts 2>/dev/null || echo "File not found"

Length of output: 921


🏁 Script executed:

#!/bin/bash
# Check full content of packages/common/src/slug.ts to see all exports
echo "=== Full content of packages/common/src/slug.ts ==="
cat packages/common/src/slug.ts

echo ""
echo "=== Check if there are any other references to slug utilities ==="
rg -n "slug" apps/start/src/utils/ --type=ts

Length of output: 616


Use named export to maintain module scope: export { slug } from '@openpanel/common';

The file is named slug.ts but re-exports the entire @openpanel/common package (11 modules), when it should only export the slug function. While named imports in consuming files (import { slug }) will tree-shake correctly, this creates a confusing public interface that doesn't match the module's name and breaks the expectation of focused utility exports.

-export * from '@openpanel/common';
+export { slug } from '@openpanel/common';
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-export * from '@openpanel/common';
+export { slug } from '@openpanel/common';
🤖 Prompt for AI Agents
In apps/start/src/utils/slug.ts around line 1, the file currently re-exports the
entire @openpanel/common package which exposes many unrelated modules; change it
to a focused named re-export so this module only exposes the slug utility.
Replace the bulk export with a named export that re-exports just slug from
@openpanel/common (maintain TypeScript types and default/namespace
considerations), run a quick type-check/build to ensure no other modules rely on
the previous broad export surface.

  "test": "vitest",
-  "typecheck": "tsc --noEmit"
+  "typecheck": "tsc --noEmit",
+  "gen:referrers": "jiti scripts/get-referrers.ts && biome format --write ./server/referrers/index.ts"

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify target path for gen:referrers script output.

The gen:referrers script writes to ./server/referrers/index.ts, but the AI summary context mentions the script writes to packages/worker/src/referrers/index.ts. Please confirm that the relative path is correct and matches where the get-referrers.ts script actually writes its output.


🏁 Script executed:

#!/bin/bash
# Verify the get-referrers script exists and check its output path
fd -e ts 'get-referrers' packages/common/
cat packages/common/scripts/get-referrers.ts 2>/dev/null | grep -n 'writeFile\|output' | head -20

Length of output: 65


🏁 Script executed:

#!/bin/bash
# Show the full writeFileSync call and surrounding context
sed -n '70,80p' packages/common/scripts/get-referrers.ts

Length of output: 638


🏁 Script executed:

#!/bin/bash
# Verify directory structure to confirm relative path resolution
ls -la packages/ | head -20

Length of output: 1005


🏁 Script executed:

#!/bin/bash
# Search for worker directory and referrers paths
fd -t d 'worker' packages/ 2>/dev/null
fd -t d 'referrers' . 2>/dev/null | head -20

Length of output: 35


🏁 Script executed:

#!/bin/bash
# Check if the file exists at the only known location
ls -la packages/common/server/referrers/index.ts 2>&1

Length of output: 92


Fix get-referrers.ts output path mismatch.

The script writes to a non-existent path. In packages/common/scripts/get-referrers.ts line 73, change the output path from path.resolve(__dirname, '../../worker/src/referrers/index.ts') to path.resolve(__dirname, '../../server/referrers/index.ts'). The directory packages/worker does not exist; only packages/common/server/referrers/ exists where the file is correctly formatted by the package.json script.

🤖 Prompt for AI Agents
In packages/common/scripts/get-referrers.ts around line 73 (and referenced by
packages/common/package.json script at line 13), the script writes output to a
non-existent path '../../worker/src/referrers/index.ts'; update the path to
'../../server/referrers/index.ts' using path.resolve(__dirname,
'../../server/referrers/index.ts') so the generated file is written into the
existing packages/common/server/referrers directory (ensure path.resolve is used
and the target directory exists).

Comment on lines +14 to +25
export interface SessionInfo {
id: string;
lastTimestamp: number;
lastEvent: IServiceImportedEventPayload;
}

export interface ImportProgress {
totalEvents: number;
processedEvents: number;
currentBatch: number;
totalBatches: number;
}

🛠️ Refactor suggestion | 🟠 Major

Remove duplicate type definitions.

ImportProgress (lines 20-25) and ImportStageResult (lines 75-80) are redefined in packages/db/src/services/import.service.ts. This duplication risks type drift if one definition is updated without the other.

The service file should import these types from this module:

In packages/db/src/services/import.service.ts, remove the duplicate definitions and add:

import type { ImportProgress, ImportStageResult } from '@openpanel/importer';

Also applies to: 75-80

🤖 Prompt for AI Agents
In packages/importer/src/types.ts around lines 14-25 and 75-80, ImportProgress
and ImportStageResult are defined here but duplicated in
packages/db/src/services/import.service.ts; remove the duplicate type
definitions from packages/db/src/services/import.service.ts and replace them
with a type-only import from this module: add "import type { ImportProgress,
ImportStageResult } from '@openpanel/importer';" at the top of the service file
and update any local references to use the imported types.
