
Conversation

@surajmn1 (Contributor) commented Dec 16, 2025

merge after merging #30

Summary by CodeRabbit

Release Notes

  • New Features

    • Real-time progress streaming during dataset creation, refresh, and upload operations with detailed step-by-step status updates
    • Live progress indicators showing completion percentage for improved user awareness
  • UI/UX Improvements

    • Enhanced feedback system with progress bars and status messages throughout data processing workflows
    • Streamlined notifications for a cleaner user experience


coderabbitai bot (Contributor) commented Dec 16, 2025

Walkthrough

This PR introduces server-sent events (SSE) streaming across dataset creation and refresh operations. Backend handlers emit real-time progress events via SSE channels, replacing single-response patterns. Frontend components add progress UI integration and new SSE-based mutation hooks to consume and display streaming updates.
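Concretely, a text/event-stream response is a sequence of `data:` lines, each followed by a blank line. A minimal Go sketch of building one such frame — the `type`/`message` field names mirror the SSEEvent shape summarized in this review and are assumptions, not the PR's exact definitions:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sseEvent approximates the SSEEvent shape described in the walkthrough;
// the real struct in the PR may carry additional fields.
type sseEvent struct {
	Type    string `json:"type"`
	Message string `json:"message"`
}

// formatSSEFrame renders one event as a text/event-stream frame:
// a "data:" line holding a JSON payload, terminated by a blank line.
func formatSSEFrame(evtType, message string) (string, error) {
	payload, err := json.Marshal(sseEvent{Type: evtType, Message: message})
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("data: %s\n\n", payload), nil
}

func main() {
	frame, _ := formatSSEFrame("status_update", "validating project")
	fmt.Print(frame)
}
```

The handler's only other obligations are the `Content-Type: text/event-stream` header and a flush after each frame so the client sees events as they happen.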

Changes

Cohort / File(s) Summary
Backend: Database Create/Refresh
server/interfaces/http/routes/source/database/create.go, server/interfaces/http/routes/source/database/refresh.go
Added SSE infrastructure (SSEEvent, SSEData types, channels, sendEvent helpers) and refactored handlers to emit status_update events for each stage (validating project, creating table, fetching metrics, generating summaries) instead of single JSON responses; errors now stream as SSE events with proper cleanup via goroutine-based async processing.
Backend: S3 Create/Refresh
server/interfaces/http/routes/source/s3/create.go, server/interfaces/http/routes/source/s3/refresh.go
Added SSE support with SSEEvent and SSEData types; replaced single-response JSON with streamed progress events (validating, ingesting, computing metrics, creating dataset/summary, uploading schema); integrated async processing in goroutines with error propagation via SSE channel; updated HTTP response headers to text/event-stream.
Frontend: SSE Client Utility
web/src/lib/sse-client.ts
Created new SSE client with SSEEvent interface and fetchWithSSE function; handles streaming line-by-line parsing, authentication via access token, org/user headers, and error/complete event propagation; returns result data from complete event.
Frontend: Dataset Mutation Hooks
web/src/lib/mutations/dataset/source-dataset.ts, web/src/lib/mutations/dataset/source-database-dataset.ts, web/src/lib/mutations/dataset/refresh-dataset.ts, web/src/lib/mutations/dataset/refresh-dataset-database.ts
Added SSE-enabled variants (useSourceDatasetSSE, useSourceDatabaseDatasetSSE, useRefreshDatasetSSE, useRefreshDatabaseDatasetSSE) alongside existing mutations; new hooks use fetchWithSSE with onProgress callback; updated SourceDatabaseDatasetParams to include onProgress callback and removed non-SSE useSourceDatabaseDataset export.
Frontend: Progress UI Components
web/src/components/dataset/database-source-form.tsx, web/src/components/dataset/dataset-header.tsx, web/src/components/dataset/dataset-upload-wizard.tsx, web/src/components/dataset/file-refresh-wizard.tsx
Integrated SSE-based progress tracking with uploadProgress/dialogProgress state; replaced mutation calls with SSE hooks; added conditional progress UI (banner, progress bar, loader) during creation/refresh; increments progress capped at 90%, displays status messages, and handles dialog vs toast flows for different refresh paths; updated FileRefreshWizardProps to include datasetId.
Frontend: Upload UI
web/src/components/dataset/csv-validation-uppy.tsx, web/src/components/dataset/unified-uploader.tsx
Removed success toast ("File uploaded to S3 successfully!" and "File uploaded successfully!") from upload-success handlers; all other upload logic and error handling remain unchanged.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Key areas requiring attention:

  • Backend SSE event ordering and completeness: Verify each handler emits all expected status_update events in correct sequence and closes channels/goroutines cleanly on error or completion across all four backend files
  • Error propagation in async context: Confirm error handling via handleFailure properly streams errors to client and prevents orphaned goroutines
  • Frontend progress capping logic: Validate onProgress increments progress correctly, caps at 90%, and displays accurate status messages from SSEEvent.message
  • Response data extraction: Ensure SSE complete events carry correct dataset and summary payloads; verify frontend extracts res.dataset.id vs res.data.dataset.id paths consistently
  • SSE client auth/header handling: Review fetchWithSSE's use of NEXT_PUBLIC_ENABLE_AUTH, org ID headers, and access token injection for correctness
  • UI state transitions: Check progress UI is hidden after completion/error and form remains accessible for retry after error in database-source-form and file-refresh-wizard

Possibly related PRs

  • #108: Modifies dataset-upload-wizard.tsx with overlapping step navigation and validation logic changes
  • #92: Modifies database-source-form.tsx with mutation/response handling flow updates

Suggested reviewers

  • Nirlep5252

Poem

🐰 In streams of events the data flows,
Progress bars that gently grow,
From S3 clouds to databases deep,
Toasts and dialogs their vigil keep,
Real-time whispers, no more long waits—
SSE magic opens the gates! 🚀

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

  • Title check ⚠️ Warning — The PR title refers to disabling 'Next' in the upload flow, but the changeset primarily implements comprehensive SSE-based progress streaming across multiple backend routes and frontend components, with only a minor UI state management improvement. Resolution: revise the title to reflect the main changes, e.g. 'Add SSE-based progress streaming for dataset operations' or 'Implement server-sent events for upload and refresh flows'.
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 20.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (1 passed)

  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.


claude bot commented Dec 16, 2025

Claude finished @surajmn1's task.


PR Code Review

I've completed a comprehensive review of this pull request that implements SSE (Server-Sent Events) for the dataset upload flow. Here are my findings:

Strengths

  1. Well-structured SSE Implementation: The new sse-client.ts provides a clean abstraction for SSE handling with proper error handling and event parsing.

  2. Comprehensive Resource Cleanup: Both Go backend routes implement robust cleanup mechanisms (cleanupResources functions) that properly handle partial failures and prevent resource leaks.

  3. Consistent Error Handling: The backend properly propagates errors through SSE events and the frontend handles them appropriately.

  4. Progress Tracking: The UI provides real-time feedback with progress bars and status messages during long-running operations.

  5. Type Safety: Good TypeScript interfaces for SSE events and data structures.

⚠️ Issues & Concerns

Security Issues
  1. Database Connection String Exposure (database-source-form.tsx:194):

    <Input type="password" />

    While the input type is "password", connection strings may still be logged or exposed in error messages. Consider implementing server-side validation and sanitization.

  2. SQL Injection Risk (database-source-form.tsx:210):
    The SQL query field accepts raw SQL without validation. While this may be intentional for flexibility, consider adding warnings or basic SQL validation.

Performance Concerns
  1. Blocking Sleep in Go Code (server/interfaces/http/routes/source/database/create.go:200):

    time.Sleep(2 * time.Second) // Wait for the table to be created in OLAP

    Hard-coded sleeps are problematic. Consider implementing proper polling or async notifications.

  2. Memory Usage in SSE (sse-client.ts:60-94):
    The SSE client keeps reading the entire stream without backpressure control. For large datasets, this could cause memory issues.

  3. Inefficient Progress Updates (dataset-upload-wizard.tsx:460-462):

    currentProgress = Math.min(currentProgress + 10, 90);

    Artificial progress increments don't reflect actual progress and could mislead users.

Code Quality Issues
  1. Console Logging in Production (multiple files):
    Found numerous console.log, console.error statements that should use proper logging mechanisms.

  2. Commented Dead Code (dataset-header.tsx:93-140):
    Large blocks of commented code should be removed.

  3. Inconsistent Error Handling:

    • Some places use toast notifications
    • Others use alert components
    • Some suppress errors silently
Potential Bugs
  1. Race Condition (dataset-upload-wizard.tsx:323-327):

    await new Promise((resolve) => setTimeout(resolve, 2000));

    Hard-coded delays for state updates can cause race conditions.

  2. Missing Null Checks (database-source-form.tsx:97-112):
    Response parsing assumes certain structure without proper validation.

  3. URL Parsing Complexity (dataset-upload-wizard.tsx:362-431):
    Complex URL parsing logic with many assumptions could fail with edge cases.

📋 Recommendations

High Priority
  1. Remove hard-coded sleeps - Replace with proper async patterns
  2. Add input validation for SQL queries and connection strings
  3. Implement proper logging - Replace console statements with structured logging
  4. Add error boundaries for better error handling in React components
Medium Priority
  1. Add progress tracking - Implement real progress reporting from backend
  2. Optimize memory usage - Add backpressure control to SSE client
  3. Add timeout handling - Long-running operations should have timeouts
  4. Improve type safety - Add more strict typing for API responses
Low Priority
  1. Clean up dead code - Remove commented code blocks
  2. Standardize error handling - Use consistent error handling patterns
  3. Add JSDoc comments - Document complex functions

🧪 Test Coverage Assessment

The repository has test infrastructure (Go tests, Python tests) but this PR lacks:

  • Unit tests for new SSE functionality
  • Integration tests for upload flow
  • Error scenario testing
  • Performance testing for large files

🔒 Security Assessment

While generally secure, pay attention to:

  • Database connection string handling
  • SQL query validation
  • Input sanitization
  • Error message sanitization to prevent information leakage

Overall Assessment

This is a substantial improvement to the upload experience with real-time progress tracking. The implementation is generally solid but has some performance and reliability issues that should be addressed before merging. The SSE architecture is well-designed and the code quality is good overall.

Recommendation: Address the high-priority issues (especially the hard-coded sleeps and error handling) before merging.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web/src/components/dataset/file-refresh-wizard.tsx (1)

58-67: Unused datasetId prop in component signature.

datasetId is declared in FileRefreshWizardProps, but the component extracts it from useParams() on lines 70-73 instead. Either use the prop or remove it from the interface to avoid confusion.

 export interface FileRefreshWizardProps {
   projectId: string;
-  datasetId: string;
   onRefreshComplete: () => void;
 }

 export function FileRefreshWizard({
   projectId,
   onRefreshComplete,
 }: FileRefreshWizardProps) {
♻️ Duplicate comments (1)
server/interfaces/http/routes/source/s3/create.go (1)

17-28: Duplicate SSE types - same as database package.

SSEEvent and SSEData are duplicated here. Consider consolidating with the types in the database package.

🧹 Nitpick comments (18)
server/interfaces/http/routes/source/database/refresh.go (3)

64-73: Consider handling JSON marshaling errors.

The json.Marshal error is silently ignored on line 70. While unlikely to fail for these simple structs, logging or handling the error would improve debuggability.

-		payloadBytes, _ := json.Marshal(eventPayload)
+		payloadBytes, err := json.Marshal(eventPayload)
+		if err != nil {
+			h.logger.Error("Failed to marshal SSE event", zap.Error(err))
+			return
+		}

83-86: Goroutine continues processing after client disconnect.

The goroutine lacks context cancellation awareness. If the client disconnects mid-stream, the background operations (database queries, OLAP refreshes) continue unnecessarily, wasting resources.

Consider passing ctx.Context() to the goroutine and checking for cancellation at key points.

+	// Get cancellable context for client disconnect detection
+	reqCtx := ctx.Context()
+
 	// Start async refresh process
 	go func() {
 		defer close(sseChan)
+
+		// Helper to check if client disconnected
+		checkCancelled := func() bool {
+			select {
+			case <-reqCtx.Done():
+				h.logger.Info("Client disconnected, aborting refresh")
+				return true
+			default:
+				return false
+			}
+		}

214-215: Hardcoded sleep for data commit synchronization.

The 2-second sleep is a fragile synchronization mechanism. Consider implementing a polling/retry strategy with exponential backoff, or using a proper synchronization mechanism from the OLAP service if available.

server/interfaces/http/routes/source/database/create.go (2)

17-28: Consider extracting SSE types to a shared package.

SSEEvent and SSEData types are duplicated across database and s3 packages. Extract them to a shared package (e.g., interfaces/http/routes/common or domain/models) to avoid duplication and ensure consistency.


158-161: Same client disconnect issue as refresh handler.

The goroutine lacks context cancellation awareness. Consider adding the same pattern suggested for refresh.go to abort processing when the client disconnects.

server/interfaces/http/routes/source/s3/create.go (1)

157-161: Background context used instead of request context.

Using context.Background() means operations continue even if the client disconnects. Consider using a context derived from the request (ctx.Context()) with appropriate timeout/cancellation handling.

web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)

66-93: SSE hook doesn't integrate with React Query state management.

The useRefreshDatabaseDatasetSSE hook returns a raw async function, bypassing React Query's state management (loading, error, caching). This diverges from the coding guideline to use React Query for server state management.

Consider wrapping with useMutation for consistent state handling:

-export const useRefreshDatabaseDatasetSSE = () => {
+export const useRefreshDatabaseDatasetSSE = () => {
   const { accessToken } = useAuth();
-  return async ({
+  
+  const mutationFn = async ({
     projectId,
     datasetName,
     refreshType,
     onProgress,
   }: {
     projectId: string;
     datasetName: string;
     refreshType: "full" | "incremental";
     onProgress: (event: SSEEvent) => void;
   }) => {
     return await fetchWithSSE(
       '/source/database/refresh',
       {
         method: 'POST',
         body: JSON.stringify({
           project_id: projectId,
           dataset_name: datasetName,
           refresh_type: refreshType,
         }),
       },
       onProgress,
       accessToken
     );
   };
+  
+  return { mutate: mutationFn, mutateAsync: mutationFn };
 };

Alternatively, use useMutation from @tanstack/react-query directly for full state management.

web/src/lib/mutations/dataset/refresh-dataset.ts (1)

73-106: SSE hook bypasses React Query state management.

The useRefreshDatasetSSE hook returns a raw async function instead of integrating with React Query. This means:

  • No automatic loading/error state tracking
  • No cache invalidation integration
  • Inconsistent with the existing useRefreshDataset mutation pattern

Consider wrapping with useMutation from React Query to maintain consistency with the coding guidelines that specify using React Query for server state management:

+import { useMutation } from "@tanstack/react-query";
+
-export const useRefreshDatasetSSE = () => {
+export const useRefreshDatasetSSE = () => {
   const { accessToken } = useAuth();
-  return async ({
+  
+  return useMutation({
+    mutationKey: ["refresh-dataset-sse"],
+    mutationFn: async ({
       datasetName,
       projectId,
       s3Url,
       source,
       ignoreErrors,
       onProgress,
     }: {
       datasetName: string;
       projectId: string;
       s3Url: string;
       source?: string;
       ignoreErrors?: boolean;
       onProgress: (event: SSEEvent) => void
     }) => {
       return await fetchWithSSE(
         '/source/s3/refresh',
         {
           method: 'POST',
           body: JSON.stringify({
             dataset_name: datasetName,
             project_id: projectId,
             file_path: s3Url,
             ignore_errors: ignoreErrors ?? true,
             source: source,
           }),
         },
         onProgress,
         accessToken
       );
-  };
+    },
+  });
 };
web/src/lib/mutations/dataset/source-database-dataset.ts (1)

14-14: Consider making onProgress optional with a no-op default.

Making onProgress required forces all callers to provide a progress handler even when they don't need progress updates. Consider making it optional:

-  onProgress: (event: SSEEvent) => void;
+  onProgress?: (event: SSEEvent) => void;

Then in the call:

-      params.onProgress,
+      params.onProgress ?? (() => {}),
web/src/components/dataset/database-source-form.tsx (2)

70-93: Progress tracking uses closure variable which may cause stale reads.

The currentProgress variable is captured in the onProgress closure. If multiple rapid SSE events arrive, React's batching could cause setUploadProgress to be called with potentially stale values. Consider using a ref or functional state update:

+  const currentProgressRef = React.useRef(0);
+
   const handleSubmit = async (e: React.FormEvent) => {
     // ...
-    let currentProgress = 0;
+    currentProgressRef.current = 0;
     setUploadProgress({ message: "Initializing connection...", percentage: 0 });

     try {
       const response = await startDatabaseUpload({
         // ...
         onProgress: (event: SSEEvent) => {
           if (event.type === 'status_update') {
-            currentProgress = Math.min(currentProgress + 10, 90);
+            currentProgressRef.current = Math.min(currentProgressRef.current + 10, 90);
             setUploadProgress({
               message: event.message,
-              percentage: currentProgress
+              percentage: currentProgressRef.current
             });
           }
         }
       });

114-119: Inconsistent error callback behavior between validation and runtime errors.

Line 106 calls onError(errMessage) for invalid responses, but line 118 intentionally skips onError to allow retries. This asymmetry could confuse parent components. Consider documenting this behavior or making it consistent:

       if (!dataset?.id) {
         const errMessage = "Invalid response from server: Dataset ID not found.";
         setFormError(errMessage);
-        onError(errMessage);
+        // Don't call onError to keep dialog open for retry
         setUploadProgress(null);
         return;
       }
web/src/lib/sse-client.ts (2)

10-15: Consider adding AbortController support for cancellation.

Long-running SSE connections should be cancellable. The function accepts RequestInit but callers cannot easily abort the stream:

 export async function fetchWithSSE<T = unknown>(
   url: string,
   options: RequestInit,
   onEvent: (event: SSEEvent<T>) => void,
-  accessToken?: string | null
+  accessToken?: string | null,
+  signal?: AbortSignal
 ): Promise<T> {
   // ...
   const response = await fetch(process.env.NEXT_PUBLIC_GOPIE_API_URL + url, {
     ...options,
     headers,
+    signal,
   });

This enables callers to cancel uploads when users navigate away or close dialogs.


21-23: Minor: Inconsistent indentation.

Line 22 has inconsistent indentation compared to surrounding code.

   // Add Auth & Org Headers (Reusing logic from api-client)
-const isAuthEnabled = String(process.env.NEXT_PUBLIC_ENABLE_AUTH).trim() === "true";
+  const isAuthEnabled = String(process.env.NEXT_PUBLIC_ENABLE_AUTH).trim() === "true";
   const orgId = getGlobalOrganizationId();
web/src/components/dataset/dataset-upload-wizard.tsx (3)

323-324: Hardcoded timeout is a race condition workaround.

Using setTimeout(resolve, 2000) to wait for the store update is fragile. If the upload completes faster or slower, this could cause issues. Consider using a more deterministic approach, such as subscribing to store changes or using a callback from the upload process.


476-480: Type assertion lacks runtime validation.

The as { dataset: Dataset } assertion assumes the server always returns the expected shape. If the SSE endpoint returns a different structure, this could cause subtle runtime errors. Consider adding a runtime check or using a type guard.

-      }) as { dataset: Dataset };
-
-      if (!res?.dataset?.id) {
+      });
+
+      const typedRes = res as { dataset?: Dataset } | undefined;
+      if (!typedRes?.dataset?.id) {

163-163: Minor formatting inconsistency.

There's extra leading whitespace before const. Consider aligning with the surrounding code style.

-const columnMappings = useColumnNameStore((state) => state.columnMappings); 
+  const columnMappings = useColumnNameStore((state) => state.columnMappings);
web/src/components/dataset/file-refresh-wizard.tsx (1)

482-497: Remove unnecessary JSX wrapper braces.

The RefreshFileUploader component is wrapped in unnecessary curly braces. This works but is unconventional.

-              {
-                <RefreshFileUploader
-                  ref={uploaderRef}
-                  existingSchema={existingSchemaData?.schema as ColumnInfo[]}
-                  onValidationStateChange={handleValidationStateChange}
-                  onUploadSuccess={handleUploadComplete}
-                  onUploadError={(msg) => {
-                    // Update toast if active, otherwise just set error
-                    toast.error("Upload Failed", {
-                      id: REFRESH_TOAST_ID,
-                      description: msg,
-                    });
-                    setApiError(msg);
-                  }}
-                />
-              }
+              <RefreshFileUploader
+                ref={uploaderRef}
+                existingSchema={existingSchemaData?.schema as ColumnInfo[]}
+                onValidationStateChange={handleValidationStateChange}
+                onUploadSuccess={handleUploadComplete}
+                onUploadError={(msg) => {
+                  toast.error("Upload Failed", {
+                    id: REFRESH_TOAST_ID,
+                    description: msg,
+                  });
+                  setApiError(msg);
+                }}
+              />
web/src/components/dataset/dataset-header.tsx (1)

147-147: Consider aliasing imports for clarity.

Renaming toast to toasty from useToast while importing toast from sonner works but is confusing. Consider using clearer aliases at the import level.

-import { toast } from "sonner"; 
+import { toast as sonnerToast } from "sonner"; 

Then use sonnerToast instead of toast throughout the SSE progress flows, keeping the original toast from useToast unchanged.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2948d39 and ba3bacb.

📒 Files selected for processing (15)
  • server/interfaces/http/routes/source/database/create.go (4 hunks)
  • server/interfaces/http/routes/source/database/refresh.go (3 hunks)
  • server/interfaces/http/routes/source/s3/create.go (5 hunks)
  • server/interfaces/http/routes/source/s3/refresh.go (3 hunks)
  • web/src/components/dataset/csv-validation-uppy.tsx (0 hunks)
  • web/src/components/dataset/database-source-form.tsx (6 hunks)
  • web/src/components/dataset/dataset-header.tsx (12 hunks)
  • web/src/components/dataset/dataset-upload-wizard.tsx (11 hunks)
  • web/src/components/dataset/file-refresh-wizard.tsx (10 hunks)
  • web/src/components/dataset/unified-uploader.tsx (0 hunks)
  • web/src/lib/mutations/dataset/refresh-dataset-database.ts (2 hunks)
  • web/src/lib/mutations/dataset/refresh-dataset.ts (2 hunks)
  • web/src/lib/mutations/dataset/source-database-dataset.ts (2 hunks)
  • web/src/lib/mutations/dataset/source-dataset.ts (2 hunks)
  • web/src/lib/sse-client.ts (1 hunks)
💤 Files with no reviewable changes (2)
  • web/src/components/dataset/unified-uploader.tsx
  • web/src/components/dataset/csv-validation-uppy.tsx
🧰 Additional context used
📓 Path-based instructions (4)
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx}: Use Next.js 15 with App Router and React 19 for the web frontend framework
Use TypeScript as the language for the web frontend
Generate and use types from API responses for type safety
Use dynamic imports and code splitting for optimal performance
Load DuckDB WASM on-demand and use web workers for client-side SQL execution

Files:

  • web/src/lib/mutations/dataset/refresh-dataset.ts
  • web/src/lib/mutations/dataset/refresh-dataset-database.ts
  • web/src/lib/mutations/dataset/source-dataset.ts
  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/dataset-header.tsx
  • web/src/components/dataset/database-source-form.tsx
  • web/src/lib/sse-client.ts
  • web/src/components/dataset/file-refresh-wizard.tsx
  • web/src/lib/mutations/dataset/source-database-dataset.ts
web/src/lib/{queries,mutations}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Use React Query (TanStack Query) for server state management

Files:

  • web/src/lib/mutations/dataset/refresh-dataset.ts
  • web/src/lib/mutations/dataset/refresh-dataset-database.ts
  • web/src/lib/mutations/dataset/source-dataset.ts
  • web/src/lib/mutations/dataset/source-database-dataset.ts
web/src/components/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/components/**/*.{ts,tsx}: Use TailwindCSS with Radix UI components for styling instead of custom CSS
Use React Hook Form with Zod validation for form handling
Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Wrap features in error boundaries to handle errors gracefully
Always handle loading and error states in UI components
Use Next.js Image component for image optimization

Files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/dataset-header.tsx
  • web/src/components/dataset/database-source-form.tsx
  • web/src/components/dataset/file-refresh-wizard.tsx
web/src/components/dataset/**upload**.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Use Uppy for dataset file uploads in upload components

Files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
🧠 Learnings (14)
📓 Common learnings
Learnt from: CR
Repo: factly/gopie PR: 0
File: chat-server/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:34.955Z
Learning: Applies to chat-server/app/api/v1/routers/**/*.py : Use Server-Sent Events for streaming real-time query progress responses
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/**upload**.{ts,tsx} : Use Uppy for dataset file uploads in upload components

Applied to files:

  • web/src/lib/mutations/dataset/refresh-dataset.ts
  • web/src/lib/mutations/dataset/refresh-dataset-database.ts
  • web/src/lib/mutations/dataset/source-dataset.ts
  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/dataset-header.tsx
  • web/src/components/dataset/database-source-form.tsx
  • web/src/components/dataset/file-refresh-wizard.tsx
  • web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/**/*.{ts,tsx} : Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls

Applied to files:

  • web/src/lib/mutations/dataset/refresh-dataset.ts
  • web/src/lib/mutations/dataset/refresh-dataset-database.ts
  • web/src/lib/mutations/dataset/source-dataset.ts
  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/dataset-header.tsx
  • web/src/components/dataset/file-refresh-wizard.tsx
  • web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/lib/{queries,mutations}/**/*.{ts,tsx} : Use React Query (TanStack Query) for server state management

Applied to files:

  • web/src/lib/mutations/dataset/refresh-dataset.ts
  • web/src/lib/mutations/dataset/refresh-dataset-database.ts
  • web/src/lib/mutations/dataset/source-dataset.ts
  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Integrate DuckDB WASM for client-side SQL execution in the Next.js frontend

Applied to files:

  • web/src/lib/mutations/dataset/refresh-dataset-database.ts
  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/dataset-header.tsx
  • web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-12-02T11:48:34.955Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: chat-server/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:34.955Z
Learning: Applies to chat-server/app/api/v1/routers/**/*.py : Use Server-Sent Events for streaming real-time query progress responses

Applied to files:

  • server/interfaces/http/routes/source/database/refresh.go
  • server/interfaces/http/routes/source/s3/refresh.go
  • server/interfaces/http/routes/source/database/create.go
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Integrate DuckDB for OLAP queries in the Go backend server

Applied to files:

  • server/interfaces/http/routes/source/database/refresh.go
  • server/interfaces/http/routes/source/database/create.go
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Store dataset files in S3-compatible storage (MinIO) in the Go backend server

Applied to files:

  • server/interfaces/http/routes/source/database/refresh.go
  • server/interfaces/http/routes/source/s3/refresh.go
  • server/interfaces/http/routes/source/s3/create.go
  • server/interfaces/http/routes/source/database/create.go
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/sql-editor.{ts,tsx} : Use Monaco Editor for SQL editing with syntax highlighting in SQL editor components

Applied to files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/dataset-header.tsx
  • web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Zustand for global state management and React Query for server state in the Next.js frontend

Applied to files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/file-refresh-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Monaco Editor for SQL editing in the Next.js frontend

Applied to files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/dataset-header.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/**/*.{ts,tsx} : Load DuckDB WASM on-demand and use web workers for client-side SQL execution

Applied to files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/dataset-header.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/lib/stores/**/*.{ts,tsx} : Use Zustand for global state management (auth, chat, SQL, visualization stores)

Applied to files:

  • web/src/components/dataset/dataset-header.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Use PostgreSQL with SQLC for type-safe queries in the Go backend server

Applied to files:

  • server/interfaces/http/routes/source/database/create.go
🧬 Code graph analysis (11)
web/src/lib/mutations/dataset/refresh-dataset.ts (1)
web/src/lib/sse-client.ts (1)
  • fetchWithSSE (10-96)
web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
web/src/lib/sse-client.ts (1)
  • fetchWithSSE (10-96)
server/interfaces/http/routes/source/database/refresh.go (3)
server/interfaces/http/routes/source/database/create.go (2)
  • SSEData (25-28)
  • SSEEvent (18-22)
server/infrastructure/postgres/gen/database_sources.sql.go (1)
  • UpdateDatabaseSourceLastUpdatedAtParams (172-175)
server/infrastructure/postgres/gen/datasets.sql.go (1)
  • UpdateDatasetParams (356-367)
web/src/lib/mutations/dataset/source-dataset.ts (1)
web/src/lib/sse-client.ts (1)
  • fetchWithSSE (10-96)
web/src/components/dataset/dataset-upload-wizard.tsx (2)
web/src/lib/mutations/dataset/source-dataset.ts (1)
  • useSourceDatasetSSE (59-103)
web/src/lib/sse-client.ts (1)
  • SSEEvent (3-8)
web/src/components/dataset/dataset-header.tsx (2)
web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
  • useRefreshDatabaseDatasetSSE (66-93)
web/src/lib/queries/dataset/check-timestamp-column.ts (1)
  • useCheckTimestampColumn (16-42)
web/src/components/dataset/database-source-form.tsx (1)
web/src/lib/mutations/dataset/source-database-dataset.ts (1)
  • useSourceDatabaseDatasetSSE (17-41)
server/interfaces/http/routes/source/database/create.go (2)
server/domain/pkg/utils.go (1)
  • RandomString (40-46)
server/domain/models/database_source.go (1)
  • CreateDatabaseSourceParams (17-29)
web/src/lib/sse-client.ts (1)
web/src/lib/api-client.ts (1)
  • getGlobalOrganizationId (42-44)
web/src/components/dataset/file-refresh-wizard.tsx (2)
web/src/lib/mutations/dataset/refresh-dataset.ts (1)
  • useRefreshDatasetSSE (73-106)
web/src/components/dataset/refresh-file-uploader.tsx (1)
  • RefreshFileUploader (64-402)
web/src/lib/mutations/dataset/source-database-dataset.ts (1)
web/src/lib/sse-client.ts (1)
  • fetchWithSSE (10-96)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: claude-review
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (14)
server/interfaces/http/routes/source/database/refresh.go (1)

275-280: LGTM - Completion event structure is correct.

The final completion event correctly includes both dataset and summary data, aligning with the SSE client expectations on the frontend.

server/interfaces/http/routes/source/s3/create.go (1)

288-293: LGTM - S3 upload SSE flow is well-structured.

The completion event and cleanup handling are correct for the S3 upload flow.

server/interfaces/http/routes/source/s3/refresh.go (2)

238-243: LGTM - S3 refresh completion handling.

The completion event correctly includes dataset and summary data.


182-189: No type compatibility issue exists.

Line 184 passes int(count) to RowCount, which is correct. The code uses models.UpdateDatasetParams from the domain model layer where RowCount is typed as int, not pgtype.Int4. The type conversion is appropriate and the code is valid.

Likely an incorrect or invalid review comment.

web/src/lib/mutations/dataset/source-dataset.ts (1)

59-103: LGTM - SSE variant maintains backward compatibility.

The implementation correctly:

  • Preserves the existing useSourceDataset mutation for non-streaming use cases
  • Mirrors the request body structure from the original mutation
  • Integrates with the SSE client for streaming progress

The same React Query integration suggestion from the other SSE hooks applies here for consistency with the codebase's state management patterns.

web/src/components/dataset/database-source-form.tsx (1)

144-164: LGTM - Progress UI implementation.

The progress UI correctly:

  • Shows a spinner with the current status message
  • Displays percentage completion
  • Provides user feedback about expected duration
  • Conditionally renders based on upload state
web/src/components/dataset/dataset-upload-wizard.tsx (2)

43-46: LGTM: SSE integration imports are properly structured.

The imports for SSE-based upload flow (useSourceDatasetSSE, SSEEvent, Dataset) and the Progress component are correctly added to support the new streaming progress functionality.


662-669: LGTM: Step 3 navigation guard properly implemented.

The canProceedFromStep3 logic correctly prevents users from proceeding when:

  • Processing is ongoing
  • Invalid column names exist
  • Description generation is in progress
  • Descriptions haven't been generated yet

This aligns with the PR objective of properly disabling the "Next" button.

web/src/components/dataset/file-refresh-wizard.tsx (2)

125-269: LGTM: SSE-based refresh flow is well-structured.

The callRefreshApiEndpoint function correctly:

  • Resumes progress from where S3 upload left off (25%)
  • Uses a persistent toast ID for seamless UX transitions
  • Properly handles SSE progress events
  • Invalidates relevant queries on success
  • Provides clear error feedback

271-291: LGTM: Clean handoff from S3 upload to refresh API.

The handleUploadComplete function properly bridges the S3 upload completion to the SSE-based refresh flow with a visible progress indicator, ensuring users see continuous progress.

web/src/components/dataset/dataset-header.tsx (4)

313-405: LGTM: Dual-path progress handling for dialog and toast flows.

The handleDbRefresh function correctly manages two UX paths:

  • Dialog-based progress when the dialog is open
  • Toast-based progress for direct refresh actions

Both paths properly handle progress updates, success notifications, and error states with appropriate cleanup.


971-975: LGTM: Dialog close prevention during refresh.

Preventing the dialog from closing while a refresh operation is in progress is good UX. Users won't accidentally dismiss the progress indicator.


1002-1043: LGTM: Dialog progress UI implementation.

The conditional rendering cleanly switches between the progress view and the action buttons based on dialogProgress state. The progress UI is consistent with the toast-based approach used elsewhere.


370-375: LGTM: Comprehensive cache invalidation after refresh.

Using Promise.all for parallel query invalidation is efficient. The invalidation covers all relevant queries: dataset, datasets list, schema, and table data.

Comment on lines +226 to 229
summaryBytes, _ := json.Marshal(datasetSummary)
summaryString := string(summaryBytes)
cleanup.hasSummary = true


⚠️ Potential issue | 🟡 Minor

Premature hasSummary flag setting.

Setting cleanup.hasSummary = true on line 228 after GetDatasetSummary is premature. The summary isn't persisted until CreateDatasetSummary on line 345. If a failure occurs between these points, cleanup will attempt to delete a non-existent summary.

 		summaryBytes, _ := json.Marshal(datasetSummary)
 		summaryString := string(summaryBytes)
-		cleanup.hasSummary = true
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change:

  summaryBytes, _ := json.Marshal(datasetSummary)
  summaryString := string(summaryBytes)
- cleanup.hasSummary = true
🤖 Prompt for AI Agents
In server/interfaces/http/routes/source/database/create.go around lines 226-229,
the code sets cleanup.hasSummary = true immediately after marshalling the
summary, which is premature because the summary is not persisted until
CreateDatasetSummary (around line 345); change the logic so the flag is only set
after CreateDatasetSummary returns successfully (move the assignment to
immediately after the successful persistence), and ensure any error path between
marshalling and persistence does not mark hasSummary true so cleanup won't
attempt to delete a non-existent summary.

Comment on lines +346 to +352
if err != nil {
h.logger.Error("Error creating dataset summary", zap.Error(err))
cleanup.hasSummary = true
h.cleanupResources(cleanup)
handleFailure(fmt.Errorf("error creating dataset summary: %w", err))
return
}

⚠️ Potential issue | 🟡 Minor

Incorrect hasSummary flag in error block.

Setting cleanup.hasSummary = true inside the error block (line 348) is incorrect. If CreateDatasetSummary fails, there's no summary to clean up.

 		if err != nil {
 			h.logger.Error("Error creating dataset summary", zap.Error(err))
-			cleanup.hasSummary = true
 			h.cleanupResources(cleanup)
 			handleFailure(fmt.Errorf("error creating dataset summary: %w", err))
 			return
 		}
🤖 Prompt for AI Agents
In server/interfaces/http/routes/source/database/create.go around lines 346 to
352, the error branch for CreateDatasetSummary incorrectly sets
cleanup.hasSummary = true even though summary creation failed; remove that
assignment (or set it to false) so cleanup does not attempt to delete a
non-existent summary, keep the subsequent cleanup call and error handling
unchanged.

Comment on lines +128 to +139
// Ensure transactions are committed or rolled back
defer func() {
if err != nil {
h.logger.Error("Rolling back transactions due to error", zap.Error(err))
olapTx.Rollback()
storeTx.Rollback(txCtx)
} else {
h.logger.Info("Committing transactions")
olapTx.Commit()
storeTx.Commit(txCtx)
}
}()

⚠️ Potential issue | 🟠 Major

Fragile transaction commit/rollback pattern.

The defer block relies on the outer err variable being non-nil to determine rollback vs commit. This pattern is fragile because:

  1. Any shadowed err := ... won't affect the outer err, potentially causing incorrect commits.
  2. Any future code additions could inadvertently affect the commit decision.

Consider using an explicit success flag or handling commit/rollback at the end of the goroutine.

+		var txErr error // Explicit error tracking for transaction
+
 		// Ensure transactions are committed or rolled back
 		defer func() {
-			if err != nil {
-				h.logger.Error("Rolling back transactions due to error", zap.Error(err))
+			if txErr != nil {
+				h.logger.Error("Rolling back transactions due to error", zap.Error(txErr))
 				olapTx.Rollback()
 				storeTx.Rollback(txCtx)
 			} else {

Then set txErr = err before each early return, or use a success flag pattern.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In server/interfaces/http/routes/source/s3/refresh.go around lines 128 to 139,
the defer currently checks the outer err variable to decide commit vs rollback
which is fragile due to shadowing and future changes; change the pattern to use
an explicit success flag (e.g., named success or txSuccess) initialized false,
set it to true only after all transactional work succeeds (before normal
return), and have the defer examine that flag to commit on success and rollback
otherwise; alternatively, if you prefer the existing err variable, ensure every
early return sets a shared txErr variable to the error before returning so the
defer reads that explicit txErr instead of a potentially-shadowed err.

Comment on lines +1292 to +1306
{/* {uploadProgress.status === 'processing' && (
<div className="space-y-3 p-4 border rounded-lg bg-secondary/10">
<div className="flex items-center justify-between text-sm">
<span className="font-medium flex items-center gap-2">
<Loader2 className="h-4 w-4 animate-spin text-primary" />
{uploadProgress.message}
</span>
<span className="text-muted-foreground">{uploadProgress.percentage}%</span>
</div>
<Progress value={uploadProgress.percentage} className="h-2" />
<p className="text-xs text-muted-foreground">
Processing your data. Large datasets may take a moment.
</p>
</div>
)} */}

🛠️ Refactor suggestion | 🟠 Major

Remove commented-out code.

This block references uploadProgress state that doesn't exist in the current implementation. If this UI is needed, it should be integrated with the new SSE-based progress flow; otherwise, remove it to reduce code noise.

-              {/* {uploadProgress.status === 'processing' && (
-      <div className="space-y-3 p-4 border rounded-lg bg-secondary/10">
-        <div className="flex items-center justify-between text-sm">
-          <span className="font-medium flex items-center gap-2">
-            <Loader2 className="h-4 w-4 animate-spin text-primary" />
-            {uploadProgress.message}
-          </span>
-          <span className="text-muted-foreground">{uploadProgress.percentage}%</span>
-        </div>
-        <Progress value={uploadProgress.percentage} className="h-2" />
-        <p className="text-xs text-muted-foreground">
-          Processing your data. Large datasets may take a moment.
-        </p>
-      </div>
-    )} */}
-                           <Button
+              <Button
🤖 Prompt for AI Agents
In web/src/components/dataset/dataset-upload-wizard.tsx around lines 1292 to
1306, remove the commented-out UI block that references the obsolete
uploadProgress state; this code is dead and either should be integrated with the
new SSE-based progress flow or deleted to reduce noise. Delete the entire
commented block and, if progress UI is required later, re-implement it using the
current SSE progress state and handlers rather than the removed uploadProgress
variable.

Comment on lines +67 to +68
const { accessToken } = useAuth();
return async ({

⚠️ Potential issue | 🟡 Minor

Access token may become stale.

The accessToken is captured when the hook is called, but the returned async function may be invoked later. If the token refreshes in between, the mutation uses a stale token.

Consider getting a fresh token inside the async function, or use a ref to always access the latest token.

 export const useRefreshDatabaseDatasetSSE = () => {
-  const { accessToken } = useAuth();
+  const { getAccessToken } = useAuth(); // Assuming useAuth provides a getter
   return async ({
     projectId,
     datasetName,
     refreshType,
     onProgress,
   }: {
     projectId: string;
     datasetName: string;
     refreshType: "full" | "incremental";
     onProgress: (event: SSEEvent) => void;
   }) => {
+    const accessToken = await getAccessToken?.();
     return await fetchWithSSE(

If useAuth doesn't provide getAccessToken, consider using a ref pattern or restructuring the hook.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In web/src/lib/mutations/dataset/refresh-dataset-database.ts around lines 67-68,
the hook captures accessToken when the hook runs which can become stale by the
time the returned async function executes; modify the implementation to retrieve
a fresh token at invocation time (either call a provided getAccessToken inside
the async function or read accessToken from a ref that useAuth updates) and
update any call sites accordingly so the network request always uses the current
token.
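Stripped of React, the "read at call time" idea the comment recommends can be sketched as a mutable box that the request consults only when it actually runs. `createTokenBox` and `buildAuthHeader` are hypothetical illustrations, not code from this PR:

```typescript
// Hypothetical sketch: instead of capturing the token once in a closure,
// keep it in a mutable box so the request reads the current value at call time.
function createTokenBox(initial: string) {
  let current = initial;
  return {
    set(token: string): void { current = token; }, // called on token refresh
    get(): string { return current; },             // read at request time
  };
}

// A request built against the box sees refreshes that happen before it runs.
function buildAuthHeader(box: { get(): string }): string {
  return `Bearer ${box.get()}`;
}
```

In a React hook this would be a ref kept in sync by `useAuth`, but the stale-closure mechanics are the same.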

Comment on lines +60 to +93
while (true) {
const { done, value } = await reader.read();
if (done) break;

const chunk = decoder.decode(value, { stream: true });
const lines = chunk.split('\n');

for (const line of lines) {
if (line.startsWith('data: ')) {
try {
const jsonStr = line.slice(6);
if (!jsonStr.trim()) continue;

const event = JSON.parse(jsonStr);

// Map backend event types to frontend expected types if necessary
// Server sends: "status_update", "complete", "error"
onEvent(event);

if (event.type === 'complete') {
resultData = event.data;
}
if (event.type === 'error') {
throw new Error(event.message);
}
} catch (e) {
if (e instanceof Error && e.message !== "Unexpected end of JSON input") {
console.error("SSE Parse Error", e);
}
if (line.includes('"type":"error"')) throw e; // Re-throw actual errors
}
}
}
}

⚠️ Potential issue | 🟠 Major

SSE parsing doesn't handle messages split across chunks.

The current implementation splits each chunk by newlines, but SSE messages can be split across multiple chunks. This could cause parsing failures for large payloads:

+  let buffer = '';
+
   while (true) {
     const { done, value } = await reader.read();
     if (done) break;

-    const chunk = decoder.decode(value, { stream: true });
-    const lines = chunk.split('\n');
+    buffer += decoder.decode(value, { stream: true });
+    const lines = buffer.split('\n');
+    buffer = lines.pop() || ''; // Keep incomplete line in buffer

     for (const line of lines) {
       // ... existing parsing logic
     }
   }
+
+  // Process any remaining buffer content
+  if (buffer.trim() && buffer.startsWith('data: ')) {
+    // ... handle remaining data
+  }

Committable suggestion skipped: line range outside the PR's diff.
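A self-contained sketch of the buffering approach the diff suggests, pulled out of the fetch loop for clarity. `createSSELineParser` is a hypothetical helper (not part of the PR), and it assumes the server's `data: {json}\n` framing shown above:

```typescript
// Chunk-safe SSE line parsing: keep the trailing partial line in a buffer
// between reads so a "data: {...}" record split across two network chunks
// is only parsed once it is complete.
type SSEEvent = { type: string; data?: unknown; message?: string };

function createSSELineParser(onEvent: (e: SSEEvent) => void) {
  let buffer = "";
  return {
    // Feed one decoded chunk; complete lines are parsed, the rest is buffered.
    push(chunk: string): void {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? ""; // keep the incomplete tail for the next read
      for (const line of lines) {
        if (!line.startsWith("data: ")) continue;
        const jsonStr = line.slice(6).trim();
        if (jsonStr) onEvent(JSON.parse(jsonStr));
      }
    },
    // Call once the stream is done to drain a final unterminated line.
    flush(): void {
      if (buffer.startsWith("data: ")) {
        const jsonStr = buffer.slice(6).trim();
        if (jsonStr) onEvent(JSON.parse(jsonStr));
      }
      buffer = "";
    },
  };
}
```

Inside `fetchWithSSE` this would be driven by the `reader.read()` loop: `push` each decoded chunk, then `flush` after `done`.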

}
}

return resultData as T;

⚠️ Potential issue | 🟠 Major

Unsafe cast when no complete event received.

If the SSE stream ends without a complete event (e.g., connection dropped), resultData remains undefined but is cast to T. This could cause runtime errors in callers expecting valid data:

-  return resultData as T;
+  if (resultData === undefined) {
+    throw new Error('SSE stream ended without completion event');
+  }
+  return resultData;
🤖 Prompt for AI Agents
In web/src/lib/sse-client.ts around line 95, the code unsafely casts resultData
to T even when resultData may be undefined if no 'complete' event arrived;
change the logic to validate resultData before casting: if resultData is
undefined, reject/throw a clear error (e.g., "SSE stream ended without complete
event") or return a nullable type, otherwise cast and return; update any promise
resolution paths and ensure any EventSource/error handlers also trigger the same
rejection to avoid silent undefined returns.
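The guard the diff proposes can be factored into a tiny helper, shown here as a hypothetical sketch rather than the PR's actual code:

```typescript
// Fail loudly when the stream ends without a "complete" event, instead of
// casting a possibly-undefined value to T and pushing the failure downstream.
function requireCompletion<T>(resultData: T | undefined): T {
  if (resultData === undefined) {
    throw new Error("SSE stream ended without completion event");
  }
  return resultData;
}
```

The caller then gets either valid data or a clear error it can surface in a toast, rather than an `undefined` that blows up later.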
