web : fix row validation errors not rendering properly #35
base: main
Conversation
- for uploads and refreshes
Walkthrough

Added Server-Sent Events (SSE) to the backend upload/refresh handlers (database and S3), introduced a frontend SSE client and SSE-based mutation hooks, and updated multiple UI components to stream incremental progress, errors, and final results; also simplified the rejected-row validation logic.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant HTTP as HTTP Handler
    participant SSE as Stream Writer
    participant Worker as Background Goroutine
    participant DB as Database
    participant AI as AI/External Service
    Client->>HTTP: POST /source/... (upload / refresh)
    HTTP->>Client: 200 + headers (text/event-stream)
    HTTP->>SSE: start streaming loop
    HTTP->>Worker: spawn goroutine to perform steps
    activate Worker
    Worker->>SSE: event(status_update, "validating")
    SSE->>Client: data: {type:"status_update", message:"validating"}
    Worker->>DB: check/create tables or start ingest
    Worker->>SSE: event(status_update, "table_created" / "ingesting")
    SSE->>Client: data: {type:"status_update", message:"table_created"}
    Worker->>DB: fetch metrics / verify
    Worker->>AI: generate descriptions / schema (if applicable)
    Worker->>SSE: event(status_update, "ai_summary_generated")
    SSE->>Client: data: {type:"status_update", message:"ai_summary_generated"}
    Worker->>DB: create dataset/source, upload schema
    alt success
        Worker->>SSE: event(complete, "done", data:{dataset,...})
        SSE->>Client: data: {type:"complete", message:"done", data:{...}}
    else error
        Worker->>SSE: event(error, "reason")
        SSE->>Client: data: {type:"error", message:"reason"}
    end
    deactivate Worker
    HTTP->>Client: close stream
```
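For readers unfamiliar with the wire format: each `data:` line in the diagram is one SSE frame, a JSON payload followed by a blank line. A minimal Go sketch of rendering such a frame (the `sseFrame` helper is illustrative, not code from this PR):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sseFrame renders one Server-Sent Events frame as shown in the diagram:
// a "data:" line carrying a JSON payload, terminated by a blank line.
func sseFrame(eventType, message string, data any) string {
	payload := map[string]any{"type": eventType, "message": message}
	if data != nil {
		payload["data"] = data
	}
	b, _ := json.Marshal(payload)
	return fmt.Sprintf("data: %s\n\n", b)
}

func main() {
	fmt.Print(sseFrame("status_update", "validating", nil))
}
```

The blank line after the payload is what terminates an SSE event, so a client can split the stream on double newlines.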
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ❌ failed checks (2 warnings), ✅ passed checks (1 passed)
Claude finished @surajmn1's task — View job

PR Review: Row Validation Errors Rendering Fix

Comprehensive review completed for this PR, which fixes row validation errors not rendering properly and adds streaming support for DB-based uploads.
🔍 Detailed Code Review

✅ Positive Findings

1. Excellent validation error rendering improvements: the validation error display has been significantly improved across components.
2. Robust SSE implementation: new streaming client.
3. Comprehensive backend streaming: database source endpoints demonstrate excellent patterns.
4. Improved user experience.
Actionable comments posted: 4
🧹 Nitpick comments (24)
server/interfaces/http/routes/source/database/create.go (4)
199-200: Avoid hardcoded sleep for waiting on table creation.

Using a fixed 2-second delay is fragile: it may be insufficient under load or wastefully long otherwise. Consider polling with exponential backoff or using a callback/notification mechanism from the OLAP service.
```diff
- sendEvent("status_update", "Waiting for table creation...", nil)
- time.Sleep(2 * time.Second) // Wait for the table to be created in OLAP
+ sendEvent("status_update", "Waiting for table creation...", nil)
+ // TODO: Replace with polling/retry mechanism or callback from OLAP service
+ if err := h.waitForTableReady(tableName); err != nil {
+     h.logger.Error("Table creation timeout", zap.Error(err), zap.String("table_name", tableName))
+     handleFailure(fmt.Errorf("table creation timeout: %w", err))
+     return
+ }
```
230-238: Quote the table identifier for defense in depth.

While `tableName` is internally generated, directly interpolating identifiers into SQL queries is a risky pattern. If the generation logic changes or is reused, this could become a SQL injection vector. Consider quoting the identifier.

```diff
- rows, err := h.olapSvc.ExecuteQuery(fmt.Sprintf("select * from %s order by random() limit 50", tableName))
+ rows, err := h.olapSvc.ExecuteQuery(fmt.Sprintf(`select * from "%s" order by random() limit 50`, tableName))
```

Alternatively, if `olapSvc` supports parameterized identifiers or has a dedicated method for safe identifier handling, prefer that approach.
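The escaping rule behind the suggested fix, wrapping the name in double quotes and doubling any embedded quote, can be sketched as a small helper (illustrative only, not part of the PR):

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent wraps an identifier in double quotes and doubles any embedded
// quotes, the standard escaping rule for SQL identifiers in DuckDB/Postgres.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

func main() {
	fmt.Printf("select * from %s order by random() limit 50\n", quoteIdent(`my"table`))
}
```

Note that quoting alone only protects the identifier position; values still belong in parameters.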
264-278: Typo: `datasetDesciption` → `datasetDescription`.

```diff
- datasetDesciption, err := h.aiSvc.GenerateDatasetDescription(
+ datasetDescription, err := h.aiSvc.GenerateDatasetDescription(
      body.Alias, columnNames, descriptions, rowsString, summaryString,
  )
  if err != nil {
      h.logger.Error("Error generating dataset description", zap.Error(err))
      h.cleanupResources(cleanup)
      handleFailure(fmt.Errorf("error generating dataset description: %w", err))
      return
  }
  // override the description provided by user for now
- body.Description = datasetDesciption
+ body.Description = datasetDescription
```
158-161: Handle client disconnection to avoid wasted work.

The goroutine doesn't check for client disconnection. If a client drops the connection mid-stream, the goroutine continues executing all operations (AI calls, database writes), wasting resources. Consider passing a context and checking for cancellation at key points.

```diff
+ // Get context for cancellation detection
+ reqCtx := ctx.Context()
+
  // Start async upload process
  go func() {
      defer close(sseChan)
+
+     // Helper to check if client disconnected
+     checkCancelled := func() bool {
+         select {
+         case <-reqCtx.Done():
+             h.logger.Info("Client disconnected, aborting operation")
+             return true
+         default:
+             return false
+         }
+     }
```

Then call `checkCancelled()` before expensive operations (AI calls, database writes) and return early if true.

web/src/lib/validation/validate-file.ts (1)
622-622: Extract the hardcoded limit to a named constant.

The `LIMIT 200` is hardcoded in the query. Consider extracting it to a named constant at the file level for better maintainability.

```diff
+// Maximum number of rejected rows to retrieve for display
+const MAX_REJECTED_ROWS_PREVIEW = 200;
+
 async function getRejectedRows(
   conn: duckdb.AsyncDuckDBConnection,
 ): Promise<{ rejectedRows: RejectedRow[]; rejectedRowCount: number }> {
   try {
     // ... existing code ...
     const rejectedQ = await conn.query(`
       SELECT *
       FROM reject_errors
       ORDER BY file_id, line
-      LIMIT 200;
+      LIMIT ${MAX_REJECTED_ROWS_PREVIEW};
     `);
```

server/interfaces/http/routes/source/s3/refresh.go (3)
128-139: Transaction commit/rollback logic relies on `err` variable state at defer time.

The defer closure captures `err` by reference. Since `err` is reassigned throughout the goroutine, the final value at defer execution determines commit vs rollback. This works correctly, but the pattern is subtle and could be fragile if future code paths return early without setting `err`.

Consider using a named `success` boolean set explicitly before the completion event for clearer intent:

```diff
+ success := false
  // Ensure transactions are committed or rolled back
  defer func() {
-     if err != nil {
+     if !success {
          h.logger.Error("Rolling back transactions due to error", zap.Error(err))
          olapTx.Rollback()
          storeTx.Rollback(txCtx)
      } else {
          h.logger.Info("Committing transactions")
          olapTx.Commit()
          storeTx.Commit(txCtx)
      }
  }()
```

Then set `success = true` before the completion event at line 239.
167-168: Hardcoded sleep for data commit synchronization is fragile.

The 2-second sleep may be insufficient under load or excessive in normal conditions. Consider using a polling mechanism or event-based synchronization if the OLAP service supports it.
78-84: Error event format may not be parsed correctly by the frontend SSE client.

The `handleFailure` helper emits `event: error\ndata: ...`, but the frontend `fetchWithSSE` only processes lines starting with `data:`. The `event: error` line is ignored. The error is still detected via `event.type === 'error'` in the JSON payload, so this works, but the `event:` line is redundant.

For consistency with `sendEvent`, consider simplifying:

```diff
 handleFailure := func(failErr error) {
     errMsg := failErr.Error()
-    errorPayload, _ := json.Marshal(map[string]string{"type": "error", "message": errMsg})
-    errorMsg := fmt.Sprintf("event: error\ndata: %s\n\n", errorPayload)
-    sseChan <- SSEData{Data: []byte(errorMsg)}
+    sendEvent("error", errMsg, nil)
 }
```

server/interfaces/http/routes/source/s3/create.go (1)
17-28: SSEEvent and SSEData types are duplicated across packages.

These types are also defined in `server/interfaces/http/routes/source/database/create.go` (per relevant snippets). Extract to a shared package to avoid duplication and ensure consistency:

```go
// e.g., server/interfaces/http/routes/common/sse.go
package common

type SSEEvent struct {
	Type    string `json:"type"`
	Message string `json:"message"`
	Data    any    `json:"data,omitempty"`
}

type SSEData struct {
	Data  []byte
	Error error
}
```

server/interfaces/http/routes/source/database/refresh.go (1)
214-215: Same hardcoded sleep pattern as other handlers.

Consider extracting this to a configuration constant or implementing proper synchronization.
web/src/lib/mutations/dataset/refresh-dataset.ts (1)
73-106: Hook pattern doesn't integrate with React Query mutation lifecycle.

The `useRefreshDatasetSSE` hook returns a plain async function rather than using `createMutation`. This means:
- No automatic loading/error state tracking
- No mutation cache integration
- No retry logic or callbacks (onSuccess, onError, onSettled)
Per coding guidelines, React Query should be used for server state management. Consider wrapping with `useMutation` from React Query:

```typescript
export const useRefreshDatasetSSE = () => {
  const { accessToken } = useAuth();

  return useMutation({
    mutationKey: ["refresh-dataset-sse"],
    mutationFn: async ({
      datasetName,
      projectId,
      s3Url,
      source,
      ignoreErrors,
      onProgress,
    }: RefreshDatasetSSEVariables) => {
      return fetchWithSSE(
        '/source/s3/refresh',
        {
          method: 'POST',
          body: JSON.stringify({
            dataset_name: datasetName,
            project_id: projectId,
            file_path: s3Url,
            ignore_errors: ignoreErrors ?? true,
            source: source,
          }),
        },
        onProgress,
        accessToken
      );
    },
  });
};
```

web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
66-93: Same hook pattern issues asuseRefreshDatasetSSE.This hook has the same concerns:
- Plain async function return instead of React Query mutation integration
- Access token captured at hook render time (potential staleness)
Apply the same refactoring suggestions as
useRefreshDatasetSSEfor consistency and proper React Query integration. As per coding guidelines, use React Query for server state management.web/src/lib/mutations/dataset/source-dataset.ts (1)
59-103: Consider adding explicit return type for type safety.

The `useSourceDatasetSSE` hook returns the `fetchWithSSE` result without an explicit type, which loses type safety. Additionally, `SSEEvent` is used without a generic parameter.

```diff
-export const useSourceDatasetSSE = () => {
+interface SourceDatasetSSEResult {
+  dataset: Dataset;
+  summary: {
+    dataset_name: string;
+    summary: Record<string, string>[];
+  };
+}
+
+export const useSourceDatasetSSE = () => {
   const { accessToken } = useAuth();
   return async ({
     ...
-    onProgress: (event: SSEEvent) => void
+    onProgress: (event: SSEEvent<SourceDatasetSSEResult>) => void
   }) => {
     ...
-    return await fetchWithSSE(
+    return await fetchWithSSE<SourceDatasetSSEResult>(
       '/source/s3/upload',
       ...
     );
   };
 };
```

web/src/lib/mutations/dataset/source-database-dataset.ts (1)
4-15: Interface design differs from the `useSourceDatasetSSE` pattern.

The `onProgress` callback is included in the `SourceDatabaseDatasetParams` interface, while `useSourceDatasetSSE` defines it inline in the function parameters. Consider aligning the patterns for consistency across SSE hooks.

web/src/components/dataset/dataset-header.tsx (2)
342-366: Progress increment logic may not accurately reflect actual progress.

The progress increments by a fixed 10% per `status_update` event and caps at 90%. If the server sends many events, progress will quickly hit 90% and stall; if it sends few events, progress may appear slower than actual work. Consider using `event.progress` if the server provides it, or adjusting the increment based on expected event count.

```diff
 onProgress: (event: SSEEvent) => {
   if (event.type === 'status_update') {
-    currentProgress = Math.min(currentProgress + 10, 90);
+    // Prefer server-provided progress if available
+    currentProgress = event.progress != null
+      ? Math.min(event.progress, 90)
+      : Math.min(currentProgress + 10, 90);

     if (isDialogFlow) {
```
147-147: Variable rename from `toast` to `toasty` is functional, but consider alternatives.

Renaming the destructured `toast` to `toasty` to avoid shadowing the imported `toast` from sonner works, but is slightly confusing. Consider renaming the sonner import instead for clarity.

```diff
-import { toast } from "sonner";
+import { toast as sonnerToast } from "sonner";
 ...
-const { toast : toasty } = useToast();
+const { toast } = useToast();
```

Then use `sonnerToast` for the sonner calls (lines 325, 354, 380, 382, 394, 396).

web/src/components/dataset/database-source-form.tsx (2)
85-93: Same progress increment pattern; consider extracting to a shared utility.

This progress calculation logic (`Math.min(currentProgress + 10, 90)`) is duplicated in `dataset-header.tsx`. Consider extracting a shared utility or using server-provided progress values.
96-112: Complex type assertions for response handling.

The response structure handling with multiple type assertions is defensive but hard to follow. Consider defining a proper response type interface.

```diff
+interface DatabaseUploadResponse {
+  data?: {
+    dataset: {
+      id: string;
+      alias: string;
+    };
+  };
+  dataset?: {
+    id: string;
+    alias: string;
+  };
+}

 // In handleSubmit:
-const result = (response as unknown as Record<string, unknown>)?.data || response;
-const typedResult = result as Record<string, unknown>;
-const dataset = typedResult?.dataset as Record<string, unknown> | undefined;
+const typedResponse = response as DatabaseUploadResponse;
+const dataset = typedResponse?.data?.dataset || typedResponse?.dataset;
```

web/src/lib/sse-client.ts (2)
22-22: Inconsistent indentation on line 22.

The `const isAuthEnabled` line has inconsistent indentation compared to the surrounding code.

```diff
 // Add Auth & Org Headers (Reusing logic from api-client)
-const isAuthEnabled = String(process.env.NEXT_PUBLIC_ENABLE_AUTH).trim() === "true";
+  const isAuthEnabled = String(process.env.NEXT_PUBLIC_ENABLE_AUTH).trim() === "true";
 const orgId = getGlobalOrganizationId();
```
60-93: SSE parsing may fail with chunked JSON split across reads.

The current implementation splits chunks by newline and parses each `data:` line as complete JSON. If a large JSON payload is split across multiple chunks (mid-line), parsing will fail. The `"Unexpected end of JSON input"` check suggests this is known but silently ignored.

Consider buffering incomplete lines:

```diff
 const reader = response.body.getReader();
 const decoder = new TextDecoder();
 let resultData: T | undefined;
+let buffer = '';

 while (true) {
   const { done, value } = await reader.read();
   if (done) break;

-  const chunk = decoder.decode(value, { stream: true });
-  const lines = chunk.split('\n');
+  buffer += decoder.decode(value, { stream: true });
+  const lines = buffer.split('\n');
+  // Keep the last incomplete line in the buffer
+  buffer = lines.pop() || '';

   for (const line of lines) {
```
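The buffering idea is easy to isolate: carry the trailing partial line between reads and only process complete lines. A Go sketch of that step (illustrative; the real client is TypeScript):

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines appends a chunk to the carry-over buffer, returns every
// complete line, and keeps the trailing partial line for the next read.
func splitLines(buffer, chunk string) (lines []string, rest string) {
	buffer += chunk
	parts := strings.Split(buffer, "\n")
	return parts[:len(parts)-1], parts[len(parts)-1]
}

func main() {
	// A JSON payload split mid-line across two reads:
	lines, rest := splitLines("", "data: {\"type\":\"sta")
	fmt.Println(len(lines), rest) // no complete line yet
	lines, rest = splitLines(rest, "tus_update\"}\n")
	fmt.Println(lines[0], rest) // the reassembled line, empty remainder
}
```

With this in place, each emitted line is guaranteed to be a whole `data:` record, so JSON parsing never sees a truncated payload.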
58-73: Unused `datasetId` prop in component signature.

The `datasetId` is declared in `FileRefreshWizardProps` (line 60) but is never destructured or used from props. Instead, it's obtained from `useParams()` on line 70. This creates a confusing API where callers must pass `datasetId` but it's ignored.

Either remove `datasetId` from the props interface and rely solely on `useParams()`, or use the prop and remove the `useParams()` call for consistency.

```diff
 export interface FileRefreshWizardProps {
   projectId: string;
-  datasetId: string;
   onRefreshComplete: () => void;
 }
```
482-497: Remove unnecessary curly braces around JSX.

The curly braces wrapping `RefreshFileUploader` on lines 482 and 497 serve no purpose and can be removed for cleaner code.

```diff
-{
-  <RefreshFileUploader
-    ref={uploaderRef}
-    existingSchema={existingSchemaData?.schema as ColumnInfo[]}
-    onValidationStateChange={handleValidationStateChange}
-    onUploadSuccess={handleUploadComplete}
-    onUploadError={(msg) => {
-      // Update toast if active, otherwise just set error
-      toast.error("Upload Failed", {
-        id: REFRESH_TOAST_ID,
-        description: msg,
-      });
-      setApiError(msg);
-    }}
-  />
-}
+<RefreshFileUploader
+  ref={uploaderRef}
+  existingSchema={existingSchemaData?.schema as ColumnInfo[]}
+  onValidationStateChange={handleValidationStateChange}
+  onUploadSuccess={handleUploadComplete}
+  onUploadError={(msg) => {
+    // Update toast if active, otherwise just set error
+    toast.error("Upload Failed", {
+      id: REFRESH_TOAST_ID,
+      description: msg,
+    });
+    setApiError(msg);
+  }}
+/>
```

web/src/components/dataset/dataset-upload-wizard.tsx (2)
1272-1286: Remove commented-out code block.

This commented JSX block for `uploadProgress` appears to be dead code. If it's intended for future implementation, consider removing it and tracking the feature in a separate issue or TODO comment instead.

```diff
-{/* {uploadProgress.status === 'processing' && (
-  <div className="space-y-3 p-4 border rounded-lg bg-secondary/10">
-    <div className="flex items-center justify-between text-sm">
-      <span className="font-medium flex items-center gap-2">
-        <Loader2 className="h-4 w-4 animate-spin text-primary" />
-        {uploadProgress.message}
-      </span>
-      <span className="text-muted-foreground">{uploadProgress.percentage}%</span>
-    </div>
-    <Progress value={uploadProgress.percentage} className="h-2" />
-    <p className="text-xs text-muted-foreground">
-      Processing your data. Large datasets may take a moment.
-    </p>
-  </div>
-)} */}
-<Button
+<Button
```
324-341: Hardcoded timeout for store update is fragile.

The 2-second hardcoded wait on line 325 assumes the store will be updated within that timeframe. This is a fragile pattern that could cause race conditions on slow connections or unnecessary delays on fast ones.

Consider using a polling approach with a maximum timeout, or restructuring the flow so the upload response is passed directly rather than relying on store synchronization timing.

```diff
-// Wait a bit longer for the store to update with the upload response
-await new Promise((resolve) => setTimeout(resolve, 2000));
+// Poll for the upload response with timeout
+const maxAttempts = 10;
+const pollInterval = 200;
+for (let i = 0; i < maxAttempts; i++) {
+  const newUploadResponse = useUploadStore.getState().uploadResponse;
+  if (newUploadResponse && typeof newUploadResponse === "object") {
+    break;
+  }
+  await new Promise((resolve) => setTimeout(resolve, pollInterval));
+}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- server/interfaces/http/routes/source/database/create.go (4 hunks)
- server/interfaces/http/routes/source/database/refresh.go (3 hunks)
- server/interfaces/http/routes/source/s3/create.go (5 hunks)
- server/interfaces/http/routes/source/s3/refresh.go (3 hunks)
- web/src/components/dataset/csv-validation-uppy.tsx (0 hunks)
- web/src/components/dataset/database-source-form.tsx (6 hunks)
- web/src/components/dataset/dataset-header.tsx (12 hunks)
- web/src/components/dataset/dataset-upload-wizard.tsx (11 hunks)
- web/src/components/dataset/file-refresh-wizard.tsx (10 hunks)
- web/src/components/dataset/unified-uploader.tsx (0 hunks)
- web/src/lib/mutations/dataset/refresh-dataset-database.ts (2 hunks)
- web/src/lib/mutations/dataset/refresh-dataset.ts (2 hunks)
- web/src/lib/mutations/dataset/source-database-dataset.ts (2 hunks)
- web/src/lib/mutations/dataset/source-dataset.ts (2 hunks)
- web/src/lib/sse-client.ts (1 hunks)
- web/src/lib/validation/validate-file.ts (1 hunks)
💤 Files with no reviewable changes (2)
- web/src/components/dataset/csv-validation-uppy.tsx
- web/src/components/dataset/unified-uploader.tsx
🧰 Additional context used
📓 Path-based instructions (4)
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx}: Use Next.js 15 with App Router and React 19 for the web frontend framework
Use TypeScript as the language for the web frontend
Generate and use types from API responses for type safety
Use dynamic imports and code splitting for optimal performance
Load DuckDB WASM on-demand and use web workers for client-side SQL execution
Files:
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/lib/sse-client.ts
- web/src/components/dataset/dataset-header.tsx
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/lib/mutations/dataset/source-database-dataset.ts
- web/src/lib/validation/validate-file.ts
- web/src/components/dataset/database-source-form.tsx
- web/src/components/dataset/file-refresh-wizard.tsx
- web/src/components/dataset/dataset-upload-wizard.tsx
web/src/lib/{queries,mutations}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Use React Query (TanStack Query) for server state management
Files:
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/lib/mutations/dataset/source-database-dataset.ts
web/src/components/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/components/**/*.{ts,tsx}: Use TailwindCSS with Radix UI components for styling instead of custom CSS
Use React Hook Form with Zod validation for form handling
Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Wrap features in error boundaries to handle errors gracefully
Always handle loading and error states in UI components
Use Next.js Image component for image optimization
Files:
- web/src/components/dataset/dataset-header.tsx
- web/src/components/dataset/database-source-form.tsx
- web/src/components/dataset/file-refresh-wizard.tsx
- web/src/components/dataset/dataset-upload-wizard.tsx
web/src/components/dataset/**upload**.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Use Uppy for dataset file uploads in upload components
Files:
web/src/components/dataset/dataset-upload-wizard.tsx
🧠 Learnings (14)
📓 Common learnings
Learnt from: CR
Repo: factly/gopie PR: 0
File: chat-server/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:34.955Z
Learning: Applies to chat-server/app/api/v1/routers/**/*.py : Use Server-Sent Events for streaming real-time query progress responses
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/**/*.{ts,tsx} : Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Applied to files:
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/components/dataset/dataset-header.tsx
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/lib/mutations/dataset/source-database-dataset.ts
- web/src/components/dataset/file-refresh-wizard.tsx
- web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/lib/{queries,mutations}/**/*.{ts,tsx} : Use React Query (TanStack Query) for server state management
Applied to files:
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/lib/mutations/dataset/source-database-dataset.ts
- web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Integrate DuckDB WASM for client-side SQL execution in the Next.js frontend
Applied to files:
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/components/dataset/dataset-header.tsx
- web/src/lib/mutations/dataset/source-database-dataset.ts
- web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:34.955Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: chat-server/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:34.955Z
Learning: Applies to chat-server/app/api/v1/routers/**/*.py : Use Server-Sent Events for streaming real-time query progress responses
Applied to files:
- server/interfaces/http/routes/source/database/create.go
- server/interfaces/http/routes/source/s3/create.go
- server/interfaces/http/routes/source/database/refresh.go
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Integrate DuckDB for OLAP queries in the Go backend server
Applied to files:
server/interfaces/http/routes/source/database/create.go
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Store dataset files in S3-compatible storage (MinIO) in the Go backend server
Applied to files:
- server/interfaces/http/routes/source/database/create.go
- server/interfaces/http/routes/source/s3/refresh.go
- server/interfaces/http/routes/source/s3/create.go
- server/interfaces/http/routes/source/database/refresh.go
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Use PostgreSQL with SQLC for type-safe queries in the Go backend server
Applied to files:
server/interfaces/http/routes/source/database/create.go
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/**upload**.{ts,tsx} : Use Uppy for dataset file uploads in upload components
Applied to files:
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/components/dataset/dataset-header.tsx
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/lib/mutations/dataset/source-database-dataset.ts
- web/src/components/dataset/database-source-form.tsx
- web/src/components/dataset/file-refresh-wizard.tsx
- web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/sql-editor.{ts,tsx} : Use Monaco Editor for SQL editing with syntax highlighting in SQL editor components
Applied to files:
- web/src/components/dataset/dataset-header.tsx
- web/src/lib/mutations/dataset/source-database-dataset.ts
- web/src/components/dataset/database-source-form.tsx
- web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Monaco Editor for SQL editing in the Next.js frontend
Applied to files:
- web/src/components/dataset/dataset-header.tsx
- web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/**/*.{ts,tsx} : Load DuckDB WASM on-demand and use web workers for client-side SQL execution
Applied to files:
- web/src/components/dataset/dataset-header.tsx
- web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/lib/stores/**/*.{ts,tsx} : Use Zustand for global state management (auth, chat, SQL, visualization stores)
Applied to files:
web/src/components/dataset/dataset-header.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Zustand for global state management and React Query for server state in the Next.js frontend
Applied to files:
- web/src/components/dataset/file-refresh-wizard.tsx
- web/src/components/dataset/dataset-upload-wizard.tsx
🧬 Code graph analysis (10)
web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
web/src/lib/sse-client.ts (1)
fetchWithSSE(10-96)
web/src/lib/mutations/dataset/refresh-dataset.ts (1)
web/src/lib/sse-client.ts (1)
fetchWithSSE(10-96)
web/src/lib/sse-client.ts (1)
web/src/lib/api-client.ts (1)
getGlobalOrganizationId(42-44)
web/src/components/dataset/dataset-header.tsx (2)
web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
useRefreshDatabaseDatasetSSE (66-93)
web/src/lib/queries/dataset/check-timestamp-column.ts (1)
useCheckTimestampColumn(16-42)
web/src/lib/mutations/dataset/source-dataset.ts (1)
web/src/lib/sse-client.ts (1)
fetchWithSSE(10-96)
web/src/lib/mutations/dataset/source-database-dataset.ts (1)
web/src/lib/sse-client.ts (1)
fetchWithSSE(10-96)
server/interfaces/http/routes/source/s3/create.go (3)
server/interfaces/http/routes/source/database/create.go (2)
SSEData (25-28), SSEEvent (18-22)
server/domain/models/dataset.go (1)
CreateDatasetParams (84-108)
server/infrastructure/postgres/gen/datasets.sql.go (1)
CreateDatasetParams(32-45)
web/src/components/dataset/database-source-form.tsx (1)
web/src/lib/mutations/dataset/source-database-dataset.ts (1)
useSourceDatabaseDatasetSSE(17-41)
web/src/components/dataset/file-refresh-wizard.tsx (2)
web/src/lib/mutations/dataset/refresh-dataset.ts (1)
useRefreshDatasetSSE (73-106)
web/src/components/dataset/refresh-file-uploader.tsx (1)
RefreshFileUploader(64-402)
web/src/components/dataset/dataset-upload-wizard.tsx (2)
web/src/lib/mutations/dataset/source-dataset.ts (1)
useSourceDatasetSSE (59-103)
web/src/lib/sse-client.ts (1)
SSEEvent(3-8)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: claude-review
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (25)
server/interfaces/http/routes/source/database/create.go (4)
17-28: LGTM!

The SSE type definitions are well-structured for streaming progress updates. `SSEEvent` provides a clean JSON structure, and `SSEData` appropriately wraps channel communication with error handling capability.
135-156: LGTM!

The SSE channel setup and helper functions are well-designed. The buffer size of 10 is reasonable, and the helpers reduce code duplication. Note: ignoring `json.Marshal` errors (lines 145, 153) is acceptable here since these simple structs won't fail marshaling.
226-229: Verify cleanup logic: `hasSummary` set prematurely?

Setting `cleanup.hasSummary = true` after `GetDatasetSummary` (line 228) seems premature. That call appears to compute/fetch the summary, not persist it; the actual persistence happens at `CreateDatasetSummary` (line 345). If cleanup runs before line 345, `DeleteDatasetSummary` may be called on a non-existent record.

Consider moving this assignment to after the summary is actually persisted:

```diff
 summaryBytes, _ := json.Marshal(datasetSummary)
 summaryString := string(summaryBytes)
-cleanup.hasSummary = true
```

And keep only the assignment at line 353 (after `CreateDatasetSummary` succeeds). Please verify the semantics of `GetDatasetSummary` vs `CreateDatasetSummary` to confirm.
380-406: LGTM!

The SSE streaming setup follows best practices: correct headers, proper use of `SetBodyStreamWriter`, and appropriate error handling for write/flush operations. Based on learnings, this aligns with the recommendation to use Server-Sent Events for streaming real-time progress responses.

web/src/lib/validation/validate-file.ts (2)
607-616: LGTM! The early-exit pattern efficiently handles the case where no validation errors occurred, avoiding unnecessary queries.

644-646: LGTM! Graceful error handling that returns empty results instead of throwing ensures the validation process doesn't fail completely when rejected-row retrieval encounters issues.
server/interfaces/http/routes/source/s3/refresh.go (1)
252-268: Client disconnection doesn't stop the background refresh operation.

If w.Write or w.Flush fails (client disconnected), the stream writer returns but the goroutine continues processing. This is acceptable if server-side completion is desired regardless of client presence, but wastes resources otherwise.

Consider using a context with cancellation passed to the goroutine if early termination on disconnect is preferred.
server/interfaces/http/routes/source/s3/create.go (2)
189-202: LGTM! The progressive resource cleanup pattern is well-implemented. Resources are tracked incrementally and cleaned up appropriately on failure paths.

295-319: SSE streaming implementation is consistent with refresh.go.

The streaming pattern matches the refresh handler. The same considerations about client disconnection and error event format apply here.
server/interfaces/http/routes/source/database/refresh.go (1)
240-268: No rollback for partial failures after dataset update.

If summary creation (line 251) or AI agent upload (line 260) fails, the dataset record is already updated with new metrics. This may leave the system in an inconsistent state where dataset metadata is current but the summary/schema are stale or missing.
Consider whether a transactional approach (similar to s3/refresh.go) or compensating actions are needed for consistency.
web/src/lib/mutations/dataset/refresh-dataset.ts (1)
73-75: Access token captured at hook call time may become stale.

The accessToken is captured when the hook renders, but the returned async function may be called later. If the token refreshes in between, a stale token is used.

Consider accessing the token inside the async function, or use a ref to always get the current value:

export const useRefreshDatasetSSE = () => {
  const { getAccessToken } = useAuth(); // if available
  return async (...) => {
    const accessToken = await getAccessToken();
    return fetchWithSSE(..., accessToken);
  };
};

web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
79-91: LGTM on the fetchWithSSE integration.

The SSE request configuration correctly maps the mutation parameters to the expected backend payload format, and properly passes the access token for authorization.
web/src/lib/mutations/dataset/source-database-dataset.ts (1)
17-41: LGTM: SSE hook implementation is correct.

The hook properly extracts the access token and delegates to fetchWithSSE. Consider adding an explicit return type for improved type safety, similar to the suggestion for useSourceDatasetSSE.

web/src/components/dataset/dataset-header.tsx (2)
370-375: Good use of Promise.all for parallel query invalidations.

Consolidating query invalidations into a single Promise.all improves efficiency by running them concurrently rather than sequentially.
971-975: Good UX: preventing dialog close during active refresh.

Preventing the dialog from closing while a refresh is in progress avoids confusing state and potential data inconsistency.
web/src/components/dataset/database-source-form.tsx (1)
144-164: Progress UI during upload looks good.

The progress UI provides clear feedback with a spinner, message, percentage, and progress bar. The informational text about process duration is helpful.
web/src/lib/sse-client.ts (1)
3-8: SSEEvent interface is well-defined.

The interface properly captures the expected event types, with appropriate optional fields for data and progress.
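For readers unfamiliar with the wire format, a minimal sketch of how a raw SSE `data:` line could map onto the SSEEvent shape described above. The field names mirror the interface as summarized in this review; the parsing helper itself is hypothetical and not part of sse-client.ts:

```typescript
// Event shape as summarized in this review (field names are assumptions).
interface SSEEvent {
  type: "status_update" | "complete" | "error";
  message: string;
  data?: unknown;
  progress?: number;
}

// Parse one raw SSE line, e.g. `data: {"type":"status_update",...}`.
// Returns null for comments, blank lines, or malformed JSON.
function parseSSELine(line: string): SSEEvent | null {
  if (!line.startsWith("data:")) return null;
  try {
    const payload = JSON.parse(line.slice(5).trim());
    if (typeof payload?.type !== "string") return null;
    return payload as SSEEvent;
  } catch {
    return null;
  }
}

const ev = parseSSELine('data: {"type":"status_update","message":"ingesting","progress":40}');
console.log(ev?.message); // "ingesting"
```

Keeping the parser pure like this makes the stream-reading loop trivially testable without a network.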
web/src/components/dataset/file-refresh-wizard.tsx (3)
195-230: SSE progress handling looks good.

The SSE integration correctly updates the persistent toast with incremental progress. The pattern of capping progress at 90% before completion and using a persistent toast ID ensures a smooth user experience.

247-260: Error handling with toast transition is correct.

The error handling properly converts the persistent loading toast to an error toast with a finite duration, ensuring proper cleanup. The error message is captured and displayed appropriately.

463-480: New validation state alerts are well-structured.

The no_schema_to_compare and validation_unavailable states now have corresponding alert components with appropriate styling and messaging, improving the user experience for edge cases.

web/src/components/dataset/dataset-upload-wizard.tsx (5)
164-165: Good extraction of rejected rows into a computed variable.

Creating rejected as a derived variable from validationResult?.rejectedRows ?? [] with a fallback to an empty array is a clean pattern that simplifies the rendering logic and prevents potential undefined access issues.
1123-1156: Validation error rendering fix looks correct.

The validation warnings now properly render with:

- Stable composite keys using ${err.rowNumber}-${err.columnName}
- The errorMessage field displayed correctly
- A fallback to an empty array that prevents crashes

This addresses the PR objective of fixing row validation errors not rendering properly.
1544-1559: Dialog validation error rendering is correct, but the key can be simplified.

The rendering logic is correct. The key ${error.rowNumber}-${error.columnName}-${index} is unique, but the index suffix is redundant if rowNumber + columnName is already unique per error. This is a minor nit.
459-476: SSE progress callback pattern is consistent with other components.

The onProgress callback correctly updates the toast with SSE status messages. The artificial progress increment (capped at 90%) provides good UX feedback while waiting for actual completion.
450-481: Incorrect response structure assumption masks a runtime logic error.

The fetchWithSSE function returns Promise&lt;unknown&gt; (no type parameter passed), but the code asserts it as { dataset: Dataset }, while the actual response structure is { data: { dataset: Dataset, summary: {...} } }. This means the runtime check if (!res?.dataset?.id) accesses the wrong property path; it should be res?.data?.dataset?.id.

To fix: pass a proper type parameter to fetchWithSSE&lt;Response&gt;() where Response is properly defined, or validate the SSE response structure using Zod (per coding guidelines for API response validation) to ensure type safety at both compile time and runtime.
if strings.HasPrefix(source.ConnectionString, "postgres") {
    driver = "postgres"
} else {
    driver = "mysql"
}
Driver detection may incorrectly default to MySQL.
The check only matches postgres prefix. Consider handling postgresql:// as well, or adding explicit validation:
- if strings.HasPrefix(source.ConnectionString, "postgres") {
+ if strings.HasPrefix(source.ConnectionString, "postgres://") || strings.HasPrefix(source.ConnectionString, "postgresql://") {
driver = "postgres"
- } else {
+ } else if strings.HasPrefix(source.ConnectionString, "mysql://") {
driver = "mysql"
+ } else {
+ handleFailure(fmt.Errorf("unsupported database driver in connection string"))
+ return
+ }

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In server/interfaces/http/routes/source/database/refresh.go around lines 136 to
140, the current logic defaults to "mysql" whenever the connection string
doesn't start with "postgres", which can misdetect "postgresql://" or other
schemes; update the detection to explicitly handle both "postgres" and
"postgresql" prefixes (or better, parse the connection string with url.Parse and
switch on the URL scheme) and if the scheme is neither postgres/postgresql nor
mysql, return an error or validation failure instead of silently defaulting to
"mysql".
  <DialogFooter className="pt-4">
    <DialogClose asChild>
      <Button type="button" variant="outline" onClick={onCloseDialog}>
        Cancel
      </Button>
    </DialogClose>
-   <Button type="submit">
-     Create Dataset
-   </Button>
+   <Button type="submit" disabled={sourceDatabaseDataset.isPending}>
+     {sourceDatabaseDataset.isPending ? "Creating..." : "Create Dataset"}
+   </Button>
  </DialogFooter>
</form>
Form lacks ability to cancel during upload.
When uploadProgress is set, the form is hidden and there's no cancel button in the progress UI. If the upload takes too long or the user wants to abort, they have no option except closing the browser tab. Consider adding a cancel mechanism or at minimum a note about closing the dialog.
🤖 Prompt for AI Agents
In web/src/components/dataset/database-source-form.tsx around lines 234-244, the
form footer hides any cancel action while uploadProgress is set so users cannot
abort an ongoing upload; update the footer rendering to show a Cancel/Abort
button (or a close icon) whenever uploadProgress is active, wiring that button
to an abort handler (prefer an injected abortUpload prop or store an
AbortController for the current upload and call controller.abort()), and then
call onCloseDialog or appropriate cleanup after abort; additionally, display a
short hint near the progress UI that the user can cancel to stop the upload.
    }
  }

  return resultData as T;
Return type assertion may return undefined as T.
If no complete event is received before the stream ends, resultData remains undefined but is cast to T. This could cause runtime issues in callers expecting a valid value.
- return resultData as T;
+ if (resultData === undefined) {
+ throw new Error('SSE stream ended without a complete event');
+ }
+ return resultData;

🤖 Prompt for AI Agents
In web/src/lib/sse-client.ts around line 95, the code returns resultData as T
which may be undefined if no 'complete' event was received; update the function
to either (A) validate resultData before returning and throw a descriptive
runtime error if it's undefined (e.g., "SSE stream ended without complete
event") or (B) change the function return type to T | undefined and propagate
that change to callers; implement one of these fixes and update any call sites
or types accordingly so callers handle the absent value instead of receiving an
unsafe asserted T.
  rejectedRows.push({
    rowNumber: Number(row.line || 0),
    columnName,
-   expectedType,
-   actualValue,
-   errorMessage,
- });
- });
-
- // Get total count of rejected rows
- const countQuery = await conn.query(
-   `SELECT COUNT(*) as total FROM reject_errors`
- );
- const totalCount = Number(countQuery.toArray()[0]?.total || 0);
+   expectedType: "",
+   actualValue: "",
+   errorMessage: errorMsg,
+ });
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find the validate-file.ts file and examine the relevant section
find web/src -name "validate-file.ts" -type f | head -5

Repository: factly/gopie
Length of output: 95
🏁 Script executed:
#!/bin/bash
# Search for RejectedRow interface definition
rg -n "interface RejectedRow" web/src/ -A 10

Repository: factly/gopie
Length of output: 765
🏁 Script executed:
#!/bin/bash
# Search for components that consume rejectedRows
rg -n "rejectedRows" web/src/ --type=ts --type=tsx | head -20

Repository: factly/gopie
Length of output: 83
🏁 Script executed:
#!/bin/bash
# Search for components that use or display rejectedRows
rg -l "rejectedRows" web/src --type=ts | head -10

Repository: factly/gopie
Length of output: 243
🏁 Script executed:
#!/bin/bash
# Get the actual implementation in validate-file.ts around lines 631-637
sed -n '610,650p' web/src/lib/validation/validate-file.ts

Repository: factly/gopie
Length of output: 1228
🏁 Script executed:
#!/bin/bash
# Search for imports or references to RejectedRow interface
rg -n "RejectedRow\|rejectedRows" web/src --type=ts -A 2 | head -50

Repository: factly/gopie
Length of output: 38
🏁 Script executed:
#!/bin/bash
# Check csv-validation-uppy.tsx for how it displays rejectedRows
rg -n "expectedType\|actualValue\|rejectedRows" web/src/components/dataset/csv-validation-uppy.tsx -B 2 -A 2

Repository: factly/gopie
Length of output: 38
🏁 Script executed:
#!/bin/bash
# Check dataset-upload-wizard.tsx for how it displays rejectedRows
rg -n "expectedType\|actualValue\|rejectedRows" web/src/components/dataset/dataset-upload-wizard.tsx -B 2 -A 2

Repository: factly/gopie
Length of output: 38
🏁 Script executed:
#!/bin/bash
# Look for any table or display component that renders the validation errors
rg -n "RejectedRow\|rejected.*map\|rejected.*render" web/src/components/dataset/ -A 5

Repository: factly/gopie
Length of output: 38
🏁 Script executed:
#!/bin/bash
# Search for where rejectedRows is actually used in components
rg -n "rejectedRows" web/src/components/dataset/ -B 5 -A 5

Repository: factly/gopie
Length of output: 9286
🏁 Script executed:
#!/bin/bash
# Look for table or list rendering patterns in csv-validation-uppy.tsx
head -200 web/src/components/dataset/csv-validation-uppy.tsx | tail -150

Repository: factly/gopie
Length of output: 4851
🏁 Script executed:
#!/bin/bash
# Search more broadly for validation error display
rg -n "errorMessage\|error_message" web/src/components/dataset/csv-validation-uppy.tsx -B 3 -A 3

Repository: factly/gopie
Length of output: 38
🏁 Script executed:
#!/bin/bash
# Get lines 789-831 from csv-validation-uppy.tsx to see the complete displayMessage logic
sed -n '789,831p' web/src/components/dataset/csv-validation-uppy.tsx

Repository: factly/gopie
Length of output: 2129
🏁 Script executed:
#!/bin/bash
# Also check url-uploader.tsx logic for displayMessage
sed -n '652,680p' web/src/components/dataset/url-uploader.tsx

Repository: factly/gopie
Length of output: 1767
Empty expectedType and actualValue fields reduce error message clarity for users.
The expectedType and actualValue fields in RejectedRow objects (lines 631-637) are hardcoded to empty strings, but the UI components in csv-validation-uppy.tsx and url-uploader.tsx render these fields and have logic to display them when available. With empty values, users see degraded error messages:
- actualValue always displays as "is empty" instead of the problematic value
- expectedType always displays as "a valid type" instead of the actual expected type
Users lose critical context when understanding why rows were rejected. Consider:
- Parse error_message: Extract type or value information from DuckDB's error message string
- Query table schema: Retrieve expected types by querying information_schema.columns for the target table
- Make fields optional: If values cannot be reliably populated, mark them as optional in the interface and update UI consumers accordingly
🤖 Prompt for AI Agents
In web/src/lib/validation/validate-file.ts around lines 631-637 the RejectedRow
objects are populated with empty expectedType and actualValue which degrades UI
error clarity; update the code to populate these fields by (1) attempting to
parse DuckDB's error message to extract the offending value and expected type,
and set actualValue/expectedType when parsing succeeds, (2) if parsing fails,
query information_schema.columns for the target table to get the expected type
for the column and set expectedType, and (3) if neither approach yields reliable
values, change the RejectedRow type to make expectedType and actualValue
optional and adjust callers (csv-validation-uppy.tsx and url-uploader.tsx) to
handle absent fields gracefully.
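As a sketch of option (1) above, here is one way the error string could be mined for the missing fields. The message format matched here ("Could not convert string … to TYPE") is an assumption about DuckDB's reject_errors text and should be verified against real output:

```typescript
// Hypothetical parser: pull actualValue / expectedType out of a DuckDB-style
// conversion error message; fall back to empty strings when nothing matches.
function parseRejectError(errorMsg: string): { expectedType: string; actualValue: string } {
  const m = errorMsg.match(/convert string "([^"]*)" to '?([A-Z_]+)'?/i);
  return m
    ? { actualValue: m[1], expectedType: m[2].toUpperCase() }
    : { actualValue: "", expectedType: "" };
}

const parsed = parseRejectError(`Could not convert string "abc" to 'INTEGER'`);
console.log(parsed.expectedType); // "INTEGER"
console.log(parsed.actualValue);  // "abc"
```

The empty-string fallback keeps the existing UI behavior for unrecognized messages, so this can be adopted incrementally.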
I'll analyze this and get back to you.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
web/src/components/dataset/dataset-upload-wizard.tsx (1)
243-526: Decompose this long function for maintainability.

The handleCreateDataset function is 283 lines long, handling S3 URL extraction, upload triggering, URL parsing, SSE communication, and toast management. This violates the single-responsibility principle and makes the code harder to test and maintain.

Consider extracting these helper functions:

// Extract S3 URL from upload response
function extractS3Url(uploadResponse: unknown): string {
  // Lines 277-288, 331-342 logic
}

// Parse and normalize S3 URL
function normalizeS3Url(uploadURL: string): string {
  // Lines 362-432 logic
}

// Create progress toast updater
function createToastUpdater(toastId: string) {
  return {
    updateProgress: (message: string, progress: number) => {
      toast.loading(
        <div className="flex flex-col gap-2 w-full min-w-[250px]">
          <div className="flex justify-between text-sm">
            <span className="font-medium">{message}</span>
            <span className="text-muted-foreground">{progress}%</span>
          </div>
          <Progress value={progress} className="h-2" />
        </div>,
        { id: toastId }
      );
    },
    // ... other toast helpers
  };
}

Then refactor handleCreateDataset to orchestrate these helpers, reducing it to ~100 lines.
🧹 Nitpick comments (1)
web/src/components/dataset/dataset-upload-wizard.tsx (1)
459-476: Consider progress-tracking limitations.

The progress increments by 10% on each status_update event and caps at 90%, which doesn't reflect actual server-side progress. While this provides visual feedback, users might expect the percentage to represent real completion.

Consider these options:
- Use event.progress if the server provides it (line 5 in web/src/lib/sse-client.ts shows SSEEvent has an optional progress?: number)
- Add a comment explaining that the progress is estimated
- Remove percentage display and show only a pulsing progress bar
Example using server progress:
onProgress: (event: SSEEvent) => {
  if (event.type === 'status_update') {
-   // Increment progress artificially (capped at 90%)
-   currentProgress = Math.min(currentProgress + 10, 90);
+   // Use server-provided progress or fall back to artificial increment
+   currentProgress = event.progress ?? Math.min(currentProgress + 10, 90);

    // 6. UPDATE TOAST with SSE Message
    toast.loading(
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
web/src/components/dataset/dataset-upload-wizard.tsx (11 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx}: Use Next.js 15 with App Router and React 19 for the web frontend framework
Use TypeScript as the language for the web frontend
Generate and use types from API responses for type safety
Use dynamic imports and code splitting for optimal performance
Load DuckDB WASM on-demand and use web workers for client-side SQL execution
Files:
web/src/components/dataset/dataset-upload-wizard.tsx
web/src/components/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/components/**/*.{ts,tsx}: Use TailwindCSS with Radix UI components for styling instead of custom CSS
Use React Hook Form with Zod validation for form handling
Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Wrap features in error boundaries to handle errors gracefully
Always handle loading and error states in UI components
Use Next.js Image component for image optimization
Files:
web/src/components/dataset/dataset-upload-wizard.tsx
web/src/components/dataset/**upload**.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Use Uppy for dataset file uploads in upload components
Files:
web/src/components/dataset/dataset-upload-wizard.tsx
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
Repo: factly/gopie PR: 0
File: chat-server/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:34.955Z
Learning: Applies to chat-server/app/api/v1/routers/**/*.py : Use Server-Sent Events for streaming real-time query progress responses
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/**upload**.{ts,tsx} : Use Uppy for dataset file uploads in upload components
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/sql-editor.{ts,tsx} : Use Monaco Editor for SQL editing with syntax highlighting in SQL editor components
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/**/*.{ts,tsx} : Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Zustand for global state management and React Query for server state in the Next.js frontend
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Integrate DuckDB WASM for client-side SQL execution in the Next.js frontend
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/lib/{queries,mutations}/**/*.{ts,tsx} : Use React Query (TanStack Query) for server state management
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Monaco Editor for SQL editing in the Next.js frontend
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/**/*.{ts,tsx} : Load DuckDB WASM on-demand and use web workers for client-side SQL execution
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
🧬 Code graph analysis (1)
web/src/components/dataset/dataset-upload-wizard.tsx (2)
web/src/lib/mutations/dataset/source-dataset.ts (1)
useSourceDatasetSSE (59-103)
web/src/lib/sse-client.ts (1)
SSEEvent (3-8)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: claude-review
🔇 Additional comments (5)
web/src/components/dataset/dataset-upload-wizard.tsx (5)
43-46: LGTM: Clean SSE integration imports.

The new imports for SSE support are well-organized and properly typed.

79-79: LGTM: SSE hook integration.

The hook usage follows the established pattern for mutations in this codebase.

164-164: LGTM: Clean derived state.

This simplifies access to rejected rows throughout the component and handles the undefined case properly.

1123-1164: LGTM: Clean validation error rendering.

The validation error display properly limits the list to 5 items and provides a clear summary count. The use of the derived rejected array simplifies the code and improves readability.

1552-1567: LGTM: Consistent validation error display.

The dialog maintains consistency with Step 2's validation rendering pattern, properly limiting the display and showing a summary count.
  }) as { dataset: Dataset };

- if (!res?.data.dataset.id) {
+ if (!res?.dataset?.id) {
    throw new Error("Invalid response from server: Dataset ID not found.");
  }
Unsafe type assertion could cause runtime errors.
Line 477 uses a type assertion as { dataset: Dataset } without validating the response shape. If the SSE endpoint returns an unexpected structure, this will cause runtime errors.
Apply this diff to add runtime validation:
- }) as { dataset: Dataset };
+ });
- if (!res?.dataset?.id) {
+ // Validate response shape
+ if (!res || typeof res !== 'object') {
+ throw new Error("Invalid response from server: Expected an object.");
+ }
+
+ const response = res as Record<string, unknown>;
+ if (!response.dataset || typeof response.dataset !== 'object') {
+ throw new Error("Invalid response from server: Dataset object not found.");
+ }
+
+ const dataset = response.dataset as Dataset;
+ if (!dataset.id) {
throw new Error("Invalid response from server: Dataset ID not found.");
}

Then update the subsequent usage:
setCreatedDataset({
- ...res.dataset,
+ ...dataset,
formatDisplay,
columnMappings: alter_column_names,
columnDescriptions: column_descriptions,
validationResult,
});
- toast.success(`Dataset ${res.dataset.alias} (${formatDisplay}) created successfully`, {
+ toast.success(`Dataset ${dataset.alias} (${formatDisplay}) created successfully`, {
id: toastId,
duration: 4000,
});

Committable suggestion skipped: line range outside the PR's diff.
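A dependency-free variant of the validation suggested above, sketched as a runtime type guard. Dataset is reduced here to the fields actually checked; the real interface has more:

```typescript
// Minimal stand-in for the real Dataset type; only the checked fields appear.
interface Dataset { id: string; alias: string }

// Runtime type guard: narrows an unknown SSE result to { dataset: Dataset }
// only when the shape actually matches, avoiding the unsafe `as` assertion.
function isDatasetResponse(res: unknown): res is { dataset: Dataset } {
  if (typeof res !== "object" || res === null) return false;
  const d = (res as { dataset?: unknown }).dataset;
  return (
    typeof d === "object" && d !== null &&
    typeof (d as { id?: unknown }).id === "string"
  );
}

console.log(isDatasetResponse({ dataset: { id: "ds1", alias: "sales" } })); // true
console.log(isDatasetResponse({ data: { dataset: { id: "ds1" } } }));       // false
```

A Zod schema would give the same guarantee with less boilerplate, per the coding guidelines; this guard is just the zero-dependency fallback.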
{/* {uploadProgress.status === 'processing' && (
  <div className="space-y-3 p-4 border rounded-lg bg-secondary/10">
    <div className="flex items-center justify-between text-sm">
      <span className="font-medium flex items-center gap-2">
        <Loader2 className="h-4 w-4 animate-spin text-primary" />
        {uploadProgress.message}
      </span>
      <span className="text-muted-foreground">{uploadProgress.percentage}%</span>
    </div>
    <Progress value={uploadProgress.percentage} className="h-2" />
    <p className="text-xs text-muted-foreground">
      Processing your data. Large datasets may take a moment.
    </p>
  </div>
)} */}
Remove commented code.
This large commented block appears to be old upload progress UI that was replaced by the SSE toast implementation. Commented code should be removed to improve readability—git history preserves this for reference if needed.
Apply this diff:
- {/* {uploadProgress.status === 'processing' && (
- <div className="space-y-3 p-4 border rounded-lg bg-secondary/10">
- <div className="flex items-center justify-between text-sm">
- <span className="font-medium flex items-center gap-2">
- <Loader2 className="h-4 w-4 animate-spin text-primary" />
- {uploadProgress.message}
- </span>
- <span className="text-muted-foreground">{uploadProgress.percentage}%</span>
- </div>
- <Progress value={uploadProgress.percentage} className="h-2" />
- <p className="text-xs text-muted-foreground">
- Processing your data. Large datasets may take a moment.
- </p>
- </div>
- )} */}
  <Button

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
{/* {uploadProgress.status === 'processing' && (
  <div className="space-y-3 p-4 border rounded-lg bg-secondary/10">
    <div className="flex items-center justify-between text-sm">
      <span className="font-medium flex items-center gap-2">
        <Loader2 className="h-4 w-4 animate-spin text-primary" />
        {uploadProgress.message}
      </span>
      <span className="text-muted-foreground">{uploadProgress.percentage}%</span>
    </div>
    <Progress value={uploadProgress.percentage} className="h-2" />
    <p className="text-xs text-muted-foreground">
      Processing your data. Large datasets may take a moment.
    </p>
  </div>
)} */}
🤖 Prompt for AI Agents
In web/src/components/dataset/dataset-upload-wizard.tsx around lines 1280 to
1294, remove the entire commented-out JSX block that renders the old upload
progress UI (the multi-line comment starting with {/* and ending with */}) so
the file no longer contains dead commented code; rely on git history if you need
to restore it.
Claude finished @surajmn1's task. View job: PR Review: Row Validation Errors Rendering Fix
Actionable comments posted: 0
♻️ Duplicate comments (2)
web/src/components/dataset/dataset-upload-wizard.tsx (2)
477-477: Duplicate: unsafe type assertion flagged in a previous review.

This type assertion without runtime validation was already identified in a previous review comment. The suggested fix includes proper runtime validation of the response structure before casting.

1280-1294: Duplicate: remove commented code as previously flagged.

This commented-out upload progress UI was already identified for removal in a previous review. Dead code should be deleted to improve maintainability; git history preserves it if needed.
🧹 Nitpick comments (1)
web/src/components/dataset/dataset-upload-wizard.tsx (1)
459-476: Consider smoothing progress increments based on event count.

The current logic increments progress by 10% per SSE event (capped at 90%). If the backend sends many status updates quickly, progress could jump rapidly and feel jarring.
Consider a more gradual increment or time-based smoothing:
- // Increment progress artificially (capped at 90%)
- currentProgress = Math.min(currentProgress + 10, 90);
+ // Increment progress gradually (capped at 90%)
+ const increment = (90 - currentProgress) * 0.2; // 20% of remaining progress
+ currentProgress = Math.min(currentProgress + increment, 90);

This creates a decelerating (exponential-decay) progress curve that feels smoother to users.
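The smoothing above can be sketched as a pure helper to make the curve visible (the numbers below assume the 20%-of-remaining rule with a 90% cap):

```typescript
// Each tick advances 20% of the remaining distance to the 90% cap,
// so early ticks move fast and later ticks decelerate smoothly.
function nextProgress(current: number): number {
  return Math.min(current + (90 - current) * 0.2, 90);
}

let p = 0;
for (let i = 0; i < 3; i++) p = nextProgress(p);
console.log(p.toFixed(1)); // "43.9" after three ticks (18 -> 32.4 -> 43.92)
```

Because each step is a fixed fraction of the remaining distance, progress never overshoots the cap regardless of how many status events arrive.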
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
web/src/components/dataset/dataset-upload-wizard.tsx (11 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx}: Use Next.js 15 with App Router and React 19 for the web frontend framework
Use TypeScript as the language for the web frontend
Generate and use types from API responses for type safety
Use dynamic imports and code splitting for optimal performance
Load DuckDB WASM on-demand and use web workers for client-side SQL execution
Files:
web/src/components/dataset/dataset-upload-wizard.tsx
web/src/components/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/components/**/*.{ts,tsx}: Use TailwindCSS with Radix UI components for styling instead of custom CSS
Use React Hook Form with Zod validation for form handling
Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Wrap features in error boundaries to handle errors gracefully
Always handle loading and error states in UI components
Use Next.js Image component for image optimization
Files:
web/src/components/dataset/dataset-upload-wizard.tsx
web/src/components/dataset/**upload**.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Use Uppy for dataset file uploads in upload components
Files:
web/src/components/dataset/dataset-upload-wizard.tsx
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
Repo: factly/gopie PR: 0
File: chat-server/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:34.955Z
Learning: Applies to chat-server/app/api/v1/routers/**/*.py : Use Server-Sent Events for streaming real-time query progress responses
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/**upload**.{ts,tsx} : Use Uppy for dataset file uploads in upload components
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/sql-editor.{ts,tsx} : Use Monaco Editor for SQL editing with syntax highlighting in SQL editor components
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/**/*.{ts,tsx} : Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Zustand for global state management and React Query for server state in the Next.js frontend
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Integrate DuckDB WASM for client-side SQL execution in the Next.js frontend
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/lib/{queries,mutations}/**/*.{ts,tsx} : Use React Query (TanStack Query) for server state management
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Monaco Editor for SQL editing in the Next.js frontend
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/**/*.{ts,tsx} : Load DuckDB WASM on-demand and use web workers for client-side SQL execution
Applied to files:
web/src/components/dataset/dataset-upload-wizard.tsx
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: claude-review
🔇 Additional comments (5)
web/src/components/dataset/dataset-upload-wizard.tsx (5)
43-46: LGTM! Clean SSE integration imports.

The new imports properly support the SSE-based upload flow with progress tracking.
79-79: LGTM! SSE hook properly initialized.

The `useSourceDatasetSSE` hook correctly replaces the previous mutation approach for streaming upload progress. Based on learnings, using Server-Sent Events for streaming real-time progress is the recommended pattern.
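For context, the streaming side of such a hook typically reduces to parsing `data:` lines from the event stream into typed progress events. A minimal standalone sketch (the `type`/`message`/`data` field names are assumptions for illustration, not the hook's actual contract):

```typescript
// Parse raw Server-Sent Events text into typed progress events.
interface SSEProgressEvent {
  type: string;
  message: string;
  data?: unknown;
}

function parseSSEChunk(chunk: string): SSEProgressEvent[] {
  const events: SSEProgressEvent[] = [];
  for (const line of chunk.split("\n")) {
    // SSE payload lines start with "data: "; blank lines and comments are ignored here.
    if (!line.startsWith("data: ")) continue;
    events.push(JSON.parse(line.slice(6)) as SSEProgressEvent);
  }
  return events;
}
```

A real client would also buffer partial chunks across network reads and handle multi-line `data:` fields, which this sketch omits.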
164-164: LGTM! Clean derived state pattern.

Extracting `rejected` as a derived value improves readability and eliminates repetitive `validationResult?.rejectedRows ?? []` checks throughout the component.
1123-1164: LGTM! Clear and concise validation error display.

The refactored rendering using the `rejected` array provides a clean, user-friendly display of validation errors with appropriate limiting (5 items max) and a summary for additional issues.
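The limiting pattern described here can be expressed as a small pure helper (a sketch; `RejectedRow` is a hypothetical stand-in for the component's real validation type):

```typescript
// Show the first few rejected rows and summarize the rest with a count.
interface RejectedRow {
  row: number;
  reason: string;
}

const MAX_VISIBLE = 5;

function summarizeRejected(rejected: RejectedRow[]): {
  visible: RejectedRow[];
  hiddenCount: number;
} {
  return {
    visible: rejected.slice(0, MAX_VISIBLE),
    hiddenCount: Math.max(rejected.length - MAX_VISIBLE, 0),
  };
}
```

Keeping the slicing in a pure helper lets both the Step 2 panel and the dialog render from the same logic.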
1552-1569: LGTM! Consistent validation warning display.

The dialog rendering matches the pattern used in Step 2, providing a consistent user experience when displaying validation errors.

merge after #30
Summary by CodeRabbit
New Features
User Experience
UI Improvements