Breaking: Remove courseassets module, inline _assetIds on content documents #114

@taylortom

Summary

The courseassets collection maintains a materialized join table (courseId + contentId → assetId) that is rebuilt on every content insert, update, and delete. This is the primary performance bottleneck during clone operations: profiling shows that the courseassets postInsertHook takes 5-6 seconds per 242 items on the server and, with 8 concurrent multilang clones, causes 37+ seconds of contention.

The proposal is to store _assetIds: [...] directly on each content document, eliminating the separate collection and the courseassets module entirely.

Current problem

  • Courseassets hooks into content's postInsertHook and runs per-item: deleteMany + getSchema + extractAssetIds + per-asset findOne + insert
  • For 242 items with assets: 300-500+ DB round-trips per clone
  • 8 concurrent multilang clones = 2,400-4,000 queries competing for the same DB connection
  • Cloning a 242-item, 8-language course takes ~47 seconds, of which ~37 seconds is spent in courseassets

Proposed changes

Remove courseassets module entirely:

  • The only remaining logic is the asset deletion guard (~10 lines), which moves into ContentModule.init() as a preDeleteHook tap on the assets module
  • Remove the courseassets collection, courseasset schema, routes, and package
  • Remove adapt-authoring-courseassets from all peerDependencies and workspace configs
  • The RESOURCE_IN_USE error moves to the content module's error definitions
  • POST /api/courseassets/query has no frontend consumers (confirmed: the UI does not reference courseassets anywhere). The only server-side consumers are AdaptFrameworkBuild and AdaptFrameworkImport, which switch to querying content's _assetIds field directly
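
The asset deletion guard mentioned in the list above could look roughly like the following. This is a minimal sketch, not the module's actual code: `createAssetDeletionGuard`, the `findContent` callback, and the shape of the thrown error are hypothetical stand-ins for the real hook and error APIs. Only `_assetIds` and `RESOURCE_IN_USE` come from the proposal itself.

```javascript
// Hypothetical sketch of the asset deletion guard. `findContent` stands in for
// a query against the content collection; it is not a real adapt-authoring API.
function createAssetDeletionGuard(findContent) {
  // Returns an async function intended to be tapped into the assets module's
  // preDeleteHook (the tapping itself is omitted here).
  return async function guardAssetDelete(assetId) {
    // A single lookup: which content documents reference this asset?
    const inUse = await findContent({ _assetIds: assetId });
    if (inUse.length > 0) {
      const error = new Error(`Asset ${assetId} is used by ${inUse.length} content item(s)`);
      error.code = 'RESOURCE_IN_USE';
      throw error;
    }
  };
}

// In-memory stand-in for the content collection, for illustration only.
const contentDocs = [
  { _id: 'c1', _courseId: 'course1', _assetIds: ['a1', 'a2'] },
  { _id: 'c2', _courseId: 'course1', _assetIds: [] }
];
const findContent = async (query) =>
  contentDocs.filter((doc) => doc._assetIds.includes(query._assetIds));

const guardAssetDelete = createAssetDeletionGuard(findContent);
```

In MongoDB, a filter of the form `{ _assetIds: assetId }` matches documents whose array contains that value, so the guard stays a single query against the proposed index.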

Content module:

  • Move extractAssetIds utility from courseassets into content's lib/utils/
  • Compute and store _assetIds during insert() and via preUpdateHook tap
  • Add MongoDB index on { _assetIds: 1 }
  • Add asset deletion guard in init() (tap assets.preDeleteHook)
  • Clone: zero changes needed — _assetIds copies automatically with the document
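
To illustrate how `_assetIds` might be computed at write time, here is a minimal sketch of a schema-driven extractor. It assumes asset fields are flagged in the schema with a made-up `isAsset: true` marker; the real schema convention and the real `extractAssetIds` signature may well differ, and `withAssetIds` is a hypothetical helper name.

```javascript
// Hypothetical sketch: walk a JSON-schema-like object alongside the data and
// collect the values of every field flagged as an asset. The `isAsset` flag is
// an assumption made for this example, not the real schema convention.
function extractAssetIds(schema, data, found = new Set()) {
  if (!schema || data === undefined || data === null) return found;
  if (schema.isAsset) {
    if (typeof data === 'string' && data !== '') found.add(data);
    return found;
  }
  if (schema.type === 'object' && schema.properties) {
    for (const [key, childSchema] of Object.entries(schema.properties)) {
      extractAssetIds(childSchema, data[key], found);
    }
  } else if (schema.type === 'array' && schema.items && Array.isArray(data)) {
    for (const item of data) extractAssetIds(schema.items, item, found);
  }
  return found;
}

// Returns a copy of the document with `_assetIds` set, ready to be written.
function withAssetIds(schema, doc) {
  return { ...doc, _assetIds: [...extractAssetIds(schema, doc)] };
}

// Example schema and document, both invented for illustration.
const exampleSchema = {
  type: 'object',
  properties: {
    title: { type: 'string' },
    _graphic: {
      type: 'object',
      properties: { src: { type: 'string', isAsset: true } }
    },
    _items: {
      type: 'array',
      items: {
        type: 'object',
        properties: { _src: { type: 'string', isAsset: true } }
      }
    }
  }
};
const exampleDoc = {
  title: 'Intro',
  _graphic: { src: 'a1' },
  _items: [{ _src: 'a2' }, { _src: 'a1' }]
};
const stored = withAssetIds(exampleSchema, exampleDoc);
```

Collecting into a `Set` deduplicates assets that are referenced more than once in the same document, so `_assetIds` stays small even for content with repeated media.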

AdaptFrameworkBuild:

  • Change loadAssetData to query content.find({ _courseId }, { projection: { _assetIds: 1 } }) instead of courseassets.find({ courseId })
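
As a rough sketch of what the build side might do with the projected documents, the following flattens them into a unique list of asset IDs. `collectCourseAssetIds` is a hypothetical helper name, and the sample array merely stands in for the query result.

```javascript
// Hypothetical sketch: gather the unique asset IDs used by a course from
// content documents that were fetched with only `_assetIds` projected.
function collectCourseAssetIds(contentDocs) {
  const ids = new Set();
  for (const doc of contentDocs) {
    // Documents written before the migration may lack the field entirely.
    for (const id of doc._assetIds ?? []) ids.add(id);
  }
  return [...ids];
}

// Stand-in for the result of
// content.find({ _courseId }, { projection: { _assetIds: 1 } }).
const projected = [
  { _id: 'c1', _assetIds: ['a1', 'a2'] },
  { _id: 'c2', _assetIds: ['a2', 'a3'] },
  { _id: 'c3' }
];
const courseAssetIds = collectCourseAssetIds(projected);
```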

AdaptFrameworkImport:

  • Remove courseassets dependency; _assetIds computed automatically during content insert

Schema (adapt-schemas):

  • Add _assetIds array field to content schema with editorOnly flag

Migration

One-time migration to populate _assetIds on existing content documents:

  1. For each content document, resolve its schema and extract asset IDs
  2. Bulk $set _assetIds on all documents
  3. Drop the courseassets collection
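
Steps 1 and 2 above could be sketched as follows. The helper names (`buildAssetIdUpdates`, `chunk`) and the injected `extractIds` callback are hypothetical; the operation objects follow the `updateOne` shape accepted by MongoDB's `bulkWrite`.

```javascript
// Hypothetical sketch of steps 1 and 2. `extractIds` stands in for the
// schema-aware extraction and is passed in so the example is self-contained.
function buildAssetIdUpdates(docs, extractIds) {
  return docs.map((doc) => ({
    updateOne: {
      filter: { _id: doc._id },
      update: { $set: { _assetIds: extractIds(doc) } }
    }
  }));
}

// Splits the operations into batches so each bulk write stays a manageable size.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Invented sample data; the extractor here only looks at one field.
const existingDocs = [
  { _id: 'c1', _graphic: { src: 'a1' } },
  { _id: 'c2' },
  { _id: 'c3', _graphic: { src: 'a2' } }
];
const simpleExtract = (doc) => (doc._graphic?.src ? [doc._graphic.src] : []);

const operations = buildAssetIdUpdates(existingDocs, simpleExtract);
const batches = chunk(operations, 2);
```

Each batch could then be passed to the driver's `collection.bulkWrite(batch)`. Dropping the courseassets collection (step 3) would only be safe once every batch has succeeded.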

Performance impact

| Phase | Before | After |
| --- | --- | --- |
| Clone (per language) | 300-500 DB queries (5-6s) | 0 extra queries |
| Clone (8 languages) | 37-47s total | ~1-2s total |
| Regular insert | 1 + 2N queries (N = assets) | 0 extra queries (computed inline) |
| Regular update | 1 + 2N queries | 0 extra queries (computed inline) |
| Asset delete check | 1 query on courseassets | 1 query on content._assetIds (indexed) |

Rollout

Can be phased:

  1. Content module adds _assetIds computation (backward-compatible, courseassets still works in parallel)
  2. Run migration to populate existing content
  3. Switch consumers (build, import) to read _assetIds from content
  4. Deprecate and remove courseassets module (breaking major version)
