web : properly disable “Next” in the upload flow #34
Conversation
- for uploads and refreshes
Walkthrough

This PR introduces server-sent events (SSE) streaming across dataset creation and refresh operations. Backend handlers emit real-time progress events via SSE channels, replacing single-response patterns. Frontend components add progress UI integration and new SSE-based mutation hooks to consume and display streaming updates.
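The channel-backed SSE pattern the walkthrough describes can be sketched in isolation. This is a hedged illustration, not the repository's actual handler: the `progressEvent` fields, stage names, and `formatEvent` helper are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// progressEvent is a hypothetical payload shape; the real handlers'
// SSE event fields may differ.
type progressEvent struct {
	Type    string `json:"type"`
	Message string `json:"message"`
}

// formatEvent renders one event in text/event-stream framing.
func formatEvent(ev progressEvent) (string, error) {
	payload, err := json.Marshal(ev)
	if err != nil {
		return "", err
	}
	return "data: " + string(payload) + "\n\n", nil
}

// streamHandler shows the general pattern: a worker goroutine sends
// progress events on a channel, and the handler flushes each one to
// the client until the channel is closed.
func streamHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}

	events := make(chan progressEvent)
	go func() {
		defer close(events)
		for _, msg := range []string{"validating file", "creating dataset"} {
			events <- progressEvent{Type: "status_update", Message: msg}
		}
	}()

	for ev := range events {
		frame, err := formatEvent(ev)
		if err != nil {
			continue // skip unmarshalable events rather than break the stream
		}
		fmt.Fprint(w, frame)
		flusher.Flush()
	}
}

func main() {
	http.HandleFunc("/stream", streamHandler)
	// http.ListenAndServe(":8080", nil) // omitted in this sketch
}
```

The key design point is `defer close(events)` in the worker: closing the channel is what ends the handler's range loop and terminates the stream cleanly.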
Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Key areas requiring attention:
Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Claude finished @surajmn1's task

PR Code Review

I've completed a comprehensive review of this pull request that implements SSE (Server-Sent Events) for the dataset upload flow. Here are my findings:

✅ Strengths
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
web/src/components/dataset/file-refresh-wizard.tsx (1)
58-67: Unused `datasetId` prop in component signature.

`datasetId` is declared in `FileRefreshWizardProps` but the component extracts it from `useParams()` on lines 70-73 instead. Either use the prop or remove it from the interface to avoid confusion.

```diff
 export interface FileRefreshWizardProps {
   projectId: string;
-  datasetId: string;
   onRefreshComplete: () => void;
 }

 export function FileRefreshWizard({
   projectId,
   onRefreshComplete,
 }: FileRefreshWizardProps) {
```
♻️ Duplicate comments (1)
server/interfaces/http/routes/source/s3/create.go (1)
17-28: Duplicate SSE types - same as database package.
`SSEEvent` and `SSEData` are duplicated here. Consider consolidating with the types in the database package.
🧹 Nitpick comments (18)
server/interfaces/http/routes/source/database/refresh.go (3)
64-73: Consider handling JSON marshaling errors.

The `json.Marshal` error is silently ignored on line 70. While unlikely to fail for these simple structs, logging or handling the error would improve debuggability.

```diff
-	payloadBytes, _ := json.Marshal(eventPayload)
+	payloadBytes, err := json.Marshal(eventPayload)
+	if err != nil {
+		h.logger.Error("Failed to marshal SSE event", zap.Error(err))
+		return
+	}
```
83-86: Goroutine continues processing after client disconnect.

The goroutine lacks context cancellation awareness. If the client disconnects mid-stream, the background operations (database queries, OLAP refreshes) continue unnecessarily, wasting resources.
Consider passing `ctx.Context()` to the goroutine and checking for cancellation at key points.

```diff
+	// Get cancellable context for client disconnect detection
+	reqCtx := ctx.Context()
+
 	// Start async refresh process
 	go func() {
 		defer close(sseChan)
+
+		// Helper to check if client disconnected
+		checkCancelled := func() bool {
+			select {
+			case <-reqCtx.Done():
+				h.logger.Info("Client disconnected, aborting refresh")
+				return true
+			default:
+				return false
+			}
+		}
```
214-215: Hardcoded sleep for data commit synchronization.

The 2-second sleep is a fragile synchronization mechanism. Consider implementing a polling/retry strategy with exponential backoff, or using a proper synchronization mechanism from the OLAP service if available.
server/interfaces/http/routes/source/database/create.go (2)
17-28: Consider extracting SSE types to a shared package.

`SSEEvent` and `SSEData` types are duplicated across the `database` and `s3` packages. Extract them to a shared package (e.g., `interfaces/http/routes/common` or `domain/models`) to avoid duplication and ensure consistency.
158-161: Same client disconnect issue as refresh handler.

The goroutine lacks context cancellation awareness. Consider adding the same pattern suggested for refresh.go to abort processing when the client disconnects.
server/interfaces/http/routes/source/s3/create.go (1)
157-161: Background context used instead of request context.

Using `context.Background()` means operations continue even if the client disconnects. Consider using a context derived from the request (`ctx.Context()`) with appropriate timeout/cancellation handling.

web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
66-93: SSE hook doesn't integrate with React Query state management.

The `useRefreshDatabaseDatasetSSE` hook returns a raw async function, bypassing React Query's state management (loading, error, caching). This diverges from the coding guideline to use React Query for server state management.

Consider wrapping with `useMutation` for consistent state handling:

```diff
 export const useRefreshDatabaseDatasetSSE = () => {
   const { accessToken } = useAuth();
-  return async ({
+
+  const mutationFn = async ({
     projectId,
     datasetName,
     refreshType,
     onProgress,
   }: {
     projectId: string;
     datasetName: string;
     refreshType: "full" | "incremental";
     onProgress: (event: SSEEvent) => void;
   }) => {
     return await fetchWithSSE(
       '/source/database/refresh',
       {
         method: 'POST',
         body: JSON.stringify({
           project_id: projectId,
           dataset_name: datasetName,
           refresh_type: refreshType,
         }),
       },
       onProgress,
       accessToken
     );
   };
+
+  return { mutate: mutationFn, mutateAsync: mutationFn };
 };
```

Alternatively, use `useMutation` from `@tanstack/react-query` directly for full state management.

web/src/lib/mutations/dataset/refresh-dataset.ts (1)
73-106: SSE hook bypasses React Query state management.

The `useRefreshDatasetSSE` hook returns a raw async function instead of integrating with React Query. This means:

- No automatic loading/error state tracking
- No cache invalidation integration
- Inconsistent with the existing `useRefreshDataset` mutation pattern

Consider wrapping with `useMutation` from React Query to maintain consistency with the coding guidelines that specify using React Query for server state management:

```diff
+import { useMutation } from "@tanstack/react-query";
+
 export const useRefreshDatasetSSE = () => {
   const { accessToken } = useAuth();
-  return async ({
+
+  return useMutation({
+    mutationKey: ["refresh-dataset-sse"],
+    mutationFn: async ({
       datasetName,
       projectId,
       s3Url,
       source,
       ignoreErrors,
       onProgress,
     }: {
       datasetName: string;
       projectId: string;
       s3Url: string;
       source?: string;
       ignoreErrors?: boolean;
       onProgress: (event: SSEEvent) => void;
     }) => {
       return await fetchWithSSE(
         '/source/s3/refresh',
         {
           method: 'POST',
           body: JSON.stringify({
             dataset_name: datasetName,
             project_id: projectId,
             file_path: s3Url,
             ignore_errors: ignoreErrors ?? true,
             source: source,
           }),
         },
         onProgress,
         accessToken
       );
-  };
+    },
+  });
 };
```

web/src/lib/mutations/dataset/source-database-dataset.ts (1)
14-14: Consider making `onProgress` optional with a no-op default.

Making `onProgress` required forces all callers to provide a progress handler even when they don't need progress updates. Consider making it optional:

```diff
-  onProgress: (event: SSEEvent) => void;
+  onProgress?: (event: SSEEvent) => void;
```

Then in the call:

```diff
-  params.onProgress,
+  params.onProgress ?? (() => {}),
```

web/src/components/dataset/database-source-form.tsx (2)
70-93: Progress tracking uses closure variable which may cause stale reads.

The `currentProgress` variable is captured in the `onProgress` closure. If multiple rapid SSE events arrive, React's batching could cause `setUploadProgress` to be called with potentially stale values. Consider using a ref or functional state update:

```diff
+  const currentProgressRef = React.useRef(0);
+
   const handleSubmit = async (e: React.FormEvent) => {
     // ...
-    let currentProgress = 0;
+    currentProgressRef.current = 0;
     setUploadProgress({ message: "Initializing connection...", percentage: 0 });
     try {
       const response = await startDatabaseUpload({
         // ...
         onProgress: (event: SSEEvent) => {
           if (event.type === 'status_update') {
-            currentProgress = Math.min(currentProgress + 10, 90);
+            currentProgressRef.current = Math.min(currentProgressRef.current + 10, 90);
             setUploadProgress({
               message: event.message,
-              percentage: currentProgress
+              percentage: currentProgressRef.current
             });
           }
         }
       });
```
114-119: Inconsistent error callback behavior between validation and runtime errors.

Line 106 calls `onError(errMessage)` for invalid responses, but line 118 intentionally skips `onError` to allow retries. This asymmetry could confuse parent components. Consider documenting this behavior or making it consistent:

```diff
 if (!dataset?.id) {
   const errMessage = "Invalid response from server: Dataset ID not found.";
   setFormError(errMessage);
-  onError(errMessage);
+  // Don't call onError to keep dialog open for retry
   setUploadProgress(null);
   return;
 }
```

web/src/lib/sse-client.ts (2)
10-15: Consider adding AbortController support for cancellation.

Long-running SSE connections should be cancellable. The function accepts `RequestInit` but callers cannot easily abort the stream:

```diff
 export async function fetchWithSSE<T = unknown>(
   url: string,
   options: RequestInit,
   onEvent: (event: SSEEvent<T>) => void,
-  accessToken?: string | null
+  accessToken?: string | null,
+  signal?: AbortSignal
 ): Promise<T> {
   // ...
   const response = await fetch(process.env.NEXT_PUBLIC_GOPIE_API_URL + url, {
     ...options,
     headers,
+    signal,
   });
```

This enables callers to cancel uploads when users navigate away or close dialogs.
21-23: Minor: Inconsistent indentation.

Line 22 has inconsistent indentation compared to surrounding code.

```diff
   // Add Auth & Org Headers (Reusing logic from api-client)
-const isAuthEnabled = String(process.env.NEXT_PUBLIC_ENABLE_AUTH).trim() === "true";
+  const isAuthEnabled = String(process.env.NEXT_PUBLIC_ENABLE_AUTH).trim() === "true";
   const orgId = getGlobalOrganizationId();
```

web/src/components/dataset/dataset-upload-wizard.tsx (3)
323-324: Hardcoded timeout is a race condition workaround.

Using `setTimeout(resolve, 2000)` to wait for the store update is fragile. If the upload completes faster or slower, this could cause issues. Consider using a more deterministic approach, such as subscribing to store changes or using a callback from the upload process.
476-480: Type assertion lacks runtime validation.

The `as { dataset: Dataset }` assertion assumes the server always returns the expected shape. If the SSE endpoint returns a different structure, this could cause subtle runtime errors. Consider adding a runtime check or using a type guard.

```diff
-  }) as { dataset: Dataset };
-
-  if (!res?.dataset?.id) {
+  });
+
+  const typedRes = res as { dataset?: Dataset } | undefined;
+  if (!typedRes?.dataset?.id) {
```
163-163: Minor formatting inconsistency.

There's extra leading whitespace before `const`. Consider aligning with the surrounding code style.

```diff
-const columnMappings = useColumnNameStore((state) => state.columnMappings);
+  const columnMappings = useColumnNameStore((state) => state.columnMappings);
```
482-497: Remove unnecessary JSX wrapper braces.

The `RefreshFileUploader` component is wrapped in unnecessary curly braces. This works but is unconventional.

```diff
-  {
-    <RefreshFileUploader
-      ref={uploaderRef}
-      existingSchema={existingSchemaData?.schema as ColumnInfo[]}
-      onValidationStateChange={handleValidationStateChange}
-      onUploadSuccess={handleUploadComplete}
-      onUploadError={(msg) => {
-        // Update toast if active, otherwise just set error
-        toast.error("Upload Failed", {
-          id: REFRESH_TOAST_ID,
-          description: msg,
-        });
-        setApiError(msg);
-      }}
-    />
-  }
+  <RefreshFileUploader
+    ref={uploaderRef}
+    existingSchema={existingSchemaData?.schema as ColumnInfo[]}
+    onValidationStateChange={handleValidationStateChange}
+    onUploadSuccess={handleUploadComplete}
+    onUploadError={(msg) => {
+      toast.error("Upload Failed", {
+        id: REFRESH_TOAST_ID,
+        description: msg,
+      });
+      setApiError(msg);
+    }}
+  />
```
147-147: Consider aliasing imports for clarity.

Renaming `toast` to `toasty` from `useToast` while importing `toast` from `sonner` works but is confusing. Consider using clearer aliases at the import level.

```diff
-import { toast } from "sonner";
+import { toast as sonnerToast } from "sonner";
```

Then use `sonnerToast` instead of `toast` throughout the SSE progress flows, keeping the original `toast` from `useToast` unchanged.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- server/interfaces/http/routes/source/database/create.go (4 hunks)
- server/interfaces/http/routes/source/database/refresh.go (3 hunks)
- server/interfaces/http/routes/source/s3/create.go (5 hunks)
- server/interfaces/http/routes/source/s3/refresh.go (3 hunks)
- web/src/components/dataset/csv-validation-uppy.tsx (0 hunks)
- web/src/components/dataset/database-source-form.tsx (6 hunks)
- web/src/components/dataset/dataset-header.tsx (12 hunks)
- web/src/components/dataset/dataset-upload-wizard.tsx (11 hunks)
- web/src/components/dataset/file-refresh-wizard.tsx (10 hunks)
- web/src/components/dataset/unified-uploader.tsx (0 hunks)
- web/src/lib/mutations/dataset/refresh-dataset-database.ts (2 hunks)
- web/src/lib/mutations/dataset/refresh-dataset.ts (2 hunks)
- web/src/lib/mutations/dataset/source-database-dataset.ts (2 hunks)
- web/src/lib/mutations/dataset/source-dataset.ts (2 hunks)
- web/src/lib/sse-client.ts (1 hunk)
💤 Files with no reviewable changes (2)
- web/src/components/dataset/unified-uploader.tsx
- web/src/components/dataset/csv-validation-uppy.tsx
🧰 Additional context used
📓 Path-based instructions (4)
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx}: Use Next.js 15 with App Router and React 19 for the web frontend framework
Use TypeScript as the language for the web frontend
Generate and use types from API responses for type safety
Use dynamic imports and code splitting for optimal performance
Load DuckDB WASM on-demand and use web workers for client-side SQL execution
Files:
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/components/dataset/dataset-header.tsx
- web/src/components/dataset/database-source-form.tsx
- web/src/lib/sse-client.ts
- web/src/components/dataset/file-refresh-wizard.tsx
- web/src/lib/mutations/dataset/source-database-dataset.ts
web/src/lib/{queries,mutations}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Use React Query (TanStack Query) for server state management
Files:
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/lib/mutations/dataset/source-database-dataset.ts
web/src/components/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/components/**/*.{ts,tsx}: Use TailwindCSS with Radix UI components for styling instead of custom CSS
Use React Hook Form with Zod validation for form handling
Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Wrap features in error boundaries to handle errors gracefully
Always handle loading and error states in UI components
Use Next.js Image component for image optimization
Files:
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/components/dataset/dataset-header.tsx
- web/src/components/dataset/database-source-form.tsx
- web/src/components/dataset/file-refresh-wizard.tsx
web/src/components/dataset/**upload**.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Use Uppy for dataset file uploads in upload components
Files:
web/src/components/dataset/dataset-upload-wizard.tsx
🧠 Learnings (14)
📓 Common learnings
Learnt from: CR
Repo: factly/gopie PR: 0
File: chat-server/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:34.955Z
Learning: Applies to chat-server/app/api/v1/routers/**/*.py : Use Server-Sent Events for streaming real-time query progress responses
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/**upload**.{ts,tsx} : Use Uppy for dataset file uploads in upload components
Applied to files:
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/components/dataset/dataset-header.tsx
- web/src/components/dataset/database-source-form.tsx
- web/src/components/dataset/file-refresh-wizard.tsx
- web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/**/*.{ts,tsx} : Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Applied to files:
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/components/dataset/dataset-header.tsx
- web/src/components/dataset/file-refresh-wizard.tsx
- web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/lib/{queries,mutations}/**/*.{ts,tsx} : Use React Query (TanStack Query) for server state management
Applied to files:
- web/src/lib/mutations/dataset/refresh-dataset.ts
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/lib/mutations/dataset/source-dataset.ts
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Integrate DuckDB WASM for client-side SQL execution in the Next.js frontend
Applied to files:
- web/src/lib/mutations/dataset/refresh-dataset-database.ts
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/components/dataset/dataset-header.tsx
- web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-12-02T11:48:34.955Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: chat-server/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:34.955Z
Learning: Applies to chat-server/app/api/v1/routers/**/*.py : Use Server-Sent Events for streaming real-time query progress responses
Applied to files:
- server/interfaces/http/routes/source/database/refresh.go
- server/interfaces/http/routes/source/s3/refresh.go
- server/interfaces/http/routes/source/database/create.go
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Integrate DuckDB for OLAP queries in the Go backend server
Applied to files:
- server/interfaces/http/routes/source/database/refresh.go
- server/interfaces/http/routes/source/database/create.go
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Store dataset files in S3-compatible storage (MinIO) in the Go backend server
Applied to files:
- server/interfaces/http/routes/source/database/refresh.go
- server/interfaces/http/routes/source/s3/refresh.go
- server/interfaces/http/routes/source/s3/create.go
- server/interfaces/http/routes/source/database/create.go
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/sql-editor.{ts,tsx} : Use Monaco Editor for SQL editing with syntax highlighting in SQL editor components
Applied to files:
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/components/dataset/dataset-header.tsx
- web/src/lib/mutations/dataset/source-database-dataset.ts
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Zustand for global state management and React Query for server state in the Next.js frontend
Applied to files:
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/components/dataset/file-refresh-wizard.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Monaco Editor for SQL editing in the Next.js frontend
Applied to files:
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/components/dataset/dataset-header.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/**/*.{ts,tsx} : Load DuckDB WASM on-demand and use web workers for client-side SQL execution
Applied to files:
- web/src/components/dataset/dataset-upload-wizard.tsx
- web/src/components/dataset/dataset-header.tsx
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/lib/stores/**/*.{ts,tsx} : Use Zustand for global state management (auth, chat, SQL, visualization stores)
Applied to files:
web/src/components/dataset/dataset-header.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Use PostgreSQL with SQLC for type-safe queries in the Go backend server
Applied to files:
server/interfaces/http/routes/source/database/create.go
🧬 Code graph analysis (11)
web/src/lib/mutations/dataset/refresh-dataset.ts (1)
web/src/lib/sse-client.ts (1)
fetchWithSSE(10-96)
web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
web/src/lib/sse-client.ts (1)
fetchWithSSE(10-96)
server/interfaces/http/routes/source/database/refresh.go (3)
server/interfaces/http/routes/source/database/create.go (2)
SSEData (25-28), SSEEvent (18-22)

server/infrastructure/postgres/gen/database_sources.sql.go (1)

UpdateDatabaseSourceLastUpdatedAtParams (172-175)

server/infrastructure/postgres/gen/datasets.sql.go (1)
UpdateDatasetParams(356-367)
web/src/lib/mutations/dataset/source-dataset.ts (1)
web/src/lib/sse-client.ts (1)
fetchWithSSE(10-96)
web/src/components/dataset/dataset-upload-wizard.tsx (2)
web/src/lib/mutations/dataset/source-dataset.ts (1)
useSourceDatasetSSE (59-103)

web/src/lib/sse-client.ts (1)
SSEEvent(3-8)
web/src/components/dataset/dataset-header.tsx (2)
web/src/lib/mutations/dataset/refresh-dataset-database.ts (1)
useRefreshDatabaseDatasetSSE (66-93)

web/src/lib/queries/dataset/check-timestamp-column.ts (1)
useCheckTimestampColumn(16-42)
web/src/components/dataset/database-source-form.tsx (1)
web/src/lib/mutations/dataset/source-database-dataset.ts (1)
useSourceDatabaseDatasetSSE(17-41)
server/interfaces/http/routes/source/database/create.go (2)
server/domain/pkg/utils.go (1)
RandomString (40-46)

server/domain/models/database_source.go (1)
CreateDatabaseSourceParams(17-29)
web/src/lib/sse-client.ts (1)
web/src/lib/api-client.ts (1)
getGlobalOrganizationId(42-44)
web/src/components/dataset/file-refresh-wizard.tsx (2)
web/src/lib/mutations/dataset/refresh-dataset.ts (1)
useRefreshDatasetSSE (73-106)

web/src/components/dataset/refresh-file-uploader.tsx (1)
RefreshFileUploader(64-402)
web/src/lib/mutations/dataset/source-database-dataset.ts (1)
web/src/lib/sse-client.ts (1)
fetchWithSSE(10-96)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: claude-review
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (go)
🔇 Additional comments (14)
server/interfaces/http/routes/source/database/refresh.go (1)
275-280: LGTM - Completion event structure is correct.

The final completion event correctly includes both dataset and summary data, aligning with the SSE client expectations on the frontend.
server/interfaces/http/routes/source/s3/create.go (1)
288-293: LGTM - S3 upload SSE flow is well-structured.

The completion event and cleanup handling are correct for the S3 upload flow.
server/interfaces/http/routes/source/s3/refresh.go (2)
238-243: LGTM - S3 refresh completion handling.

The completion event correctly includes dataset and summary data.
182-189: No type compatibility issue exists.

Line 184 passes `int(count)` to `RowCount`, which is correct. The code uses `models.UpdateDatasetParams` from the domain model layer, where `RowCount` is typed as `int`, not `pgtype.Int4`. The type conversion is appropriate and the code is valid.

Likely an incorrect or invalid review comment.
web/src/lib/mutations/dataset/source-dataset.ts (1)
59-103: LGTM - SSE variant maintains backward compatibility.

The implementation correctly:

- Preserves the existing `useSourceDataset` mutation for non-streaming use cases
- Mirrors the request body structure from the original mutation
- Integrates with the SSE client for streaming progress
The same React Query integration suggestion from the other SSE hooks applies here for consistency with the codebase's state management patterns.
web/src/components/dataset/database-source-form.tsx (1)
144-164: LGTM - Progress UI implementation.

The progress UI correctly:
- Shows a spinner with the current status message
- Displays percentage completion
- Provides user feedback about expected duration
- Conditionally renders based on upload state
web/src/components/dataset/dataset-upload-wizard.tsx (2)
43-46: LGTM: SSE integration imports are properly structured.

The imports for the SSE-based upload flow (`useSourceDatasetSSE`, `SSEEvent`, `Dataset`) and the `Progress` component are correctly added to support the new streaming progress functionality.
662-669: LGTM: Step 3 navigation guard properly implemented.

The `canProceedFromStep3` logic correctly prevents users from proceeding when:
- Processing is ongoing
- Invalid column names exist
- Description generation is in progress
- Descriptions haven't been generated yet
This aligns with the PR objective of properly disabling the "Next" button.
web/src/components/dataset/file-refresh-wizard.tsx (2)
125-269: LGTM: SSE-based refresh flow is well-structured.

The `callRefreshApiEndpoint` function correctly:
- Resumes progress from where S3 upload left off (25%)
- Uses a persistent toast ID for seamless UX transitions
- Properly handles SSE progress events
- Invalidates relevant queries on success
- Provides clear error feedback
271-291: LGTM: Clean handoff from S3 upload to refresh API.

The `handleUploadComplete` function properly bridges the S3 upload completion to the SSE-based refresh flow with a visible progress indicator, ensuring users see continuous progress.

web/src/components/dataset/dataset-header.tsx (4)
313-405: LGTM: Dual-path progress handling for dialog and toast flows.

The `handleDbRefresh` function correctly manages two UX paths:
- Dialog-based progress when the dialog is open
- Toast-based progress for direct refresh actions
Both paths properly handle progress updates, success notifications, and error states with appropriate cleanup.
971-975: LGTM: Dialog close prevention during refresh.

Preventing the dialog from closing while a refresh operation is in progress is good UX. Users won't accidentally dismiss the progress indicator.
1002-1043: LGTM: Dialog progress UI implementation.

The conditional rendering cleanly switches between the progress view and the action buttons based on `dialogProgress` state. The progress UI is consistent with the toast-based approach used elsewhere.
370-375: LGTM: Comprehensive cache invalidation after refresh.

Using `Promise.all` for parallel query invalidation is efficient. The invalidation covers all relevant queries: dataset, datasets list, schema, and table data.
```go
	summaryBytes, _ := json.Marshal(datasetSummary)
	summaryString := string(summaryBytes)
	cleanup.hasSummary = true
```
Premature `hasSummary` flag setting.

Setting `cleanup.hasSummary = true` on line 228, after `GetDatasetSummary`, is premature. The summary isn't persisted until `CreateDatasetSummary` on line 345. If a failure occurs between these points, cleanup will attempt to delete a non-existent summary.
```diff
 	summaryBytes, _ := json.Marshal(datasetSummary)
 	summaryString := string(summaryBytes)
-	cleanup.hasSummary = true
```
```go
	if err != nil {
		h.logger.Error("Error creating dataset summary", zap.Error(err))
		cleanup.hasSummary = true
		h.cleanupResources(cleanup)
		handleFailure(fmt.Errorf("error creating dataset summary: %w", err))
		return
	}
```
Incorrect `hasSummary` flag in error block.

Setting `cleanup.hasSummary = true` inside the error block (line 348) is incorrect. If `CreateDatasetSummary` fails, there's no summary to clean up.
```diff
 	if err != nil {
 		h.logger.Error("Error creating dataset summary", zap.Error(err))
-		cleanup.hasSummary = true
 		h.cleanupResources(cleanup)
 		handleFailure(fmt.Errorf("error creating dataset summary: %w", err))
 		return
 	}
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if err != nil { | |
| h.logger.Error("Error creating dataset summary", zap.Error(err)) | |
| cleanup.hasSummary = true | |
| h.cleanupResources(cleanup) | |
| handleFailure(fmt.Errorf("error creating dataset summary: %w", err)) | |
| return | |
| } | |
| if err != nil { | |
| h.logger.Error("Error creating dataset summary", zap.Error(err)) | |
| h.cleanupResources(cleanup) | |
| handleFailure(fmt.Errorf("error creating dataset summary: %w", err)) | |
| return | |
| } |
```go
// Ensure transactions are committed or rolled back
defer func() {
	if err != nil {
		h.logger.Error("Rolling back transactions due to error", zap.Error(err))
		olapTx.Rollback()
		storeTx.Rollback(txCtx)
	} else {
		h.logger.Info("Committing transactions")
		olapTx.Commit()
		storeTx.Commit(txCtx)
	}
}()
```
Fragile transaction commit/rollback pattern.
The defer block relies on the outer `err` variable being non-nil to determine rollback vs commit. This pattern is fragile because:

- Any shadowed `err := ...` won't affect the outer `err`, potentially causing incorrect commits.
- Any future code additions could inadvertently affect the commit decision.
Consider using an explicit success flag or handling commit/rollback at the end of the goroutine.
```diff
+var txErr error // Explicit error tracking for transaction
+
 // Ensure transactions are committed or rolled back
 defer func() {
-	if err != nil {
-		h.logger.Error("Rolling back transactions due to error", zap.Error(err))
+	if txErr != nil {
+		h.logger.Error("Rolling back transactions due to error", zap.Error(txErr))
 		olapTx.Rollback()
 		storeTx.Rollback(txCtx)
 	} else {
```

Then set `txErr = err` before each early return, or use a success flag pattern.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In server/interfaces/http/routes/source/s3/refresh.go around lines 128 to 139,
the defer currently checks the outer err variable to decide commit vs rollback
which is fragile due to shadowing and future changes; change the pattern to use
an explicit success flag (e.g., named success or txSuccess) initialized false,
set it to true only after all transactional work succeeds (before normal
return), and have the defer examine that flag to commit on success and rollback
otherwise; alternatively, if you prefer the existing err variable, ensure every
early return sets a shared txErr variable to the error before returning so the
defer reads that explicit txErr instead of a potentially-shadowed err.
```tsx
{/* {uploadProgress.status === 'processing' && (
  <div className="space-y-3 p-4 border rounded-lg bg-secondary/10">
    <div className="flex items-center justify-between text-sm">
      <span className="font-medium flex items-center gap-2">
        <Loader2 className="h-4 w-4 animate-spin text-primary" />
        {uploadProgress.message}
      </span>
      <span className="text-muted-foreground">{uploadProgress.percentage}%</span>
    </div>
    <Progress value={uploadProgress.percentage} className="h-2" />
    <p className="text-xs text-muted-foreground">
      Processing your data. Large datasets may take a moment.
    </p>
  </div>
)} */}
```
🛠️ Refactor suggestion | 🟠 Major
Remove commented-out code.
This block references `uploadProgress` state that doesn't exist in the current implementation. If this UI is needed, it should be integrated with the new SSE-based progress flow; otherwise, remove it to reduce code noise.
```diff
-        {/* {uploadProgress.status === 'processing' && (
-          <div className="space-y-3 p-4 border rounded-lg bg-secondary/10">
-            <div className="flex items-center justify-between text-sm">
-              <span className="font-medium flex items-center gap-2">
-                <Loader2 className="h-4 w-4 animate-spin text-primary" />
-                {uploadProgress.message}
-              </span>
-              <span className="text-muted-foreground">{uploadProgress.percentage}%</span>
-            </div>
-            <Progress value={uploadProgress.percentage} className="h-2" />
-            <p className="text-xs text-muted-foreground">
-              Processing your data. Large datasets may take a moment.
-            </p>
-          </div>
-        )} */}
-        <Button
+        <Button
```
🤖 Prompt for AI Agents
In web/src/components/dataset/dataset-upload-wizard.tsx around lines 1292 to
1306, remove the commented-out UI block that references the obsolete
uploadProgress state; this code is dead and either should be integrated with the
new SSE-based progress flow or deleted to reduce noise. Delete the entire
commented block and, if progress UI is required later, re-implement it using the
current SSE progress state and handlers rather than the removed uploadProgress
variable.
```ts
const { accessToken } = useAuth();
return async ({
```
Access token may become stale.
The `accessToken` is captured when the hook is called, but the returned async function may be invoked later. If the token refreshes in between, the mutation uses a stale token.
Consider getting a fresh token inside the async function, or use a ref to always access the latest token.
```diff
 export const useRefreshDatabaseDatasetSSE = () => {
-  const { accessToken } = useAuth();
+  const { getAccessToken } = useAuth(); // Assuming useAuth provides a getter
   return async ({
     projectId,
     datasetName,
     refreshType,
     onProgress,
   }: {
     projectId: string;
     datasetName: string;
     refreshType: "full" | "incremental";
     onProgress: (event: SSEEvent) => void;
   }) => {
+    const accessToken = await getAccessToken?.();
     return await fetchWithSSE(
```

If `useAuth` doesn't provide `getAccessToken`, consider using a ref pattern or restructuring the hook.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In web/src/lib/mutations/dataset/refresh-dataset-database.ts around lines 67-68,
the hook captures accessToken when the hook runs which can become stale by the
time the returned async function executes; modify the implementation to retrieve
a fresh token at invocation time (either call a provided getAccessToken inside
the async function or read accessToken from a ref that useAuth updates) and
update any call sites accordingly so the network request always uses the current
token.
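A minimal sketch of the ref pattern mentioned above (the names `TokenRef` and `makeMutation` are illustrative, not from this PR): the hook writes the latest token into a mutable container on every render, and the returned async function reads that container only at invocation time.

```typescript
// Illustrative only: TokenRef stands in for React's useRef container.
type TokenRef = { current: string | null };

// The returned function captures the ref object, not the token value,
// so it sees whatever token is current when it actually runs.
function makeMutation(tokenRef: TokenRef) {
  return async (): Promise<string> => {
    const token = tokenRef.current; // read at call time, not at hook-call time
    if (!token) throw new Error("not authenticated");
    return `Bearer ${token}`;
  };
}

const ref: TokenRef = { current: "old-token" };
const mutate = makeMutation(ref);
ref.current = "fresh-token"; // simulate a token refresh after hook setup
```

Calling `mutate()` after the refresh resolves with the fresh token's header, whereas capturing `accessToken` by value at hook-call time would have baked in `old-token`.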
```ts
while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  const chunk = decoder.decode(value, { stream: true });
  const lines = chunk.split('\n');

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      try {
        const jsonStr = line.slice(6);
        if (!jsonStr.trim()) continue;

        const event = JSON.parse(jsonStr);

        // Map backend event types to frontend expected types if necessary
        // Server sends: "status_update", "complete", "error"
        onEvent(event);

        if (event.type === 'complete') {
          resultData = event.data;
        }
        if (event.type === 'error') {
          throw new Error(event.message);
        }
      } catch (e) {
        if (e instanceof Error && e.message !== "Unexpected end of JSON input") {
          console.error("SSE Parse Error", e);
        }
        if (line.includes('"type":"error"')) throw e; // Re-throw actual errors
      }
    }
  }
}
```
SSE parsing doesn't handle messages split across chunks.
The current implementation splits each chunk by newlines, but SSE messages can be split across multiple chunks. This could cause parsing failures for large payloads:
```diff
+let buffer = '';
+
 while (true) {
   const { done, value } = await reader.read();
   if (done) break;
-  const chunk = decoder.decode(value, { stream: true });
-  const lines = chunk.split('\n');
+  buffer += decoder.decode(value, { stream: true });
+  const lines = buffer.split('\n');
+  buffer = lines.pop() || ''; // Keep incomplete line in buffer
   for (const line of lines) {
     // ... existing parsing logic
   }
 }
+
+// Process any remaining buffer content
+if (buffer.trim() && buffer.startsWith('data: ')) {
+  // ... handle remaining data
+}
```

Committable suggestion skipped: line range outside the PR's diff.
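The buffering idea in the suggestion above can be isolated into a small pure helper (the name `splitSSEChunk` is hypothetical): complete lines are returned for parsing, while any unterminated trailing line is carried over to the next chunk.

```typescript
// Hypothetical helper: combine the carry-over buffer with the new chunk,
// return only complete lines, and keep the unterminated tail for later.
function splitSSEChunk(
  buffer: string,
  chunk: string
): { lines: string[]; rest: string } {
  const parts = (buffer + chunk).split("\n");
  const rest = parts.pop() ?? ""; // incomplete trailing line, if any
  return { lines: parts, rest };
}

// A JSON payload split across two network chunks is reassembled intact.
const events: string[] = [];
let carry = "";
for (const chunk of [
  'data: {"type":"sta',
  'tus_update"}\ndata: {"type":"complete"}\n',
]) {
  const { lines, rest } = splitSSEChunk(carry, chunk);
  carry = rest;
  for (const line of lines) {
    if (line.startsWith("data: ")) events.push(line.slice(6));
  }
}
```

With naive per-chunk splitting, the first chunk's partial `data:` line would fail to parse; with the carry-over buffer both events come out whole.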
```ts
  }
}

return resultData as T;
```
Unsafe cast when no complete event received.
If the SSE stream ends without a `complete` event (e.g., connection dropped), `resultData` remains undefined but is cast to `T`. This could cause runtime errors in callers expecting valid data:
```diff
-return resultData as T;
+if (resultData === undefined) {
+  throw new Error('SSE stream ended without completion event');
+}
+return resultData;
```
🤖 Prompt for AI Agents
In web/src/lib/sse-client.ts around line 95, the code unsafely casts resultData
to T even when resultData may be undefined if no 'complete' event arrived;
change the logic to validate resultData before casting: if resultData is
undefined, reject/throw a clear error (e.g., "SSE stream ended without complete
event") or return a nullable type, otherwise cast and return; update any promise
resolution paths and ensure any EventSource/error handlers also trigger the same
rejection to avoid silent undefined returns.
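The guarded return above can be sketched as a standalone helper (the name `finishStream` is illustrative, not from the PR): it fails loudly when the stream ended without a `complete` event instead of casting `undefined` to `T`.

```typescript
// Illustrative helper: validate the accumulated result before returning it.
function finishStream<T>(resultData: T | undefined): T {
  if (resultData === undefined) {
    throw new Error("SSE stream ended without completion event");
  }
  return resultData;
}

const ok = finishStream({ datasetId: "d1" }); // valid completion passes through
let dropped = false;
try {
  finishStream<{ datasetId: string }>(undefined);
} catch {
  dropped = true; // a dropped connection surfaces as an error, not as undefined
}
```

Callers then handle a dropped connection in their existing error path rather than receiving an `undefined` value typed as valid data.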
merge after merging #30
Summary by CodeRabbit
Release Notes
New Features
UI/UX Improvements