Describe the Housekeeping
Under-represented wells are marked in the Limber UI, generally on the final plate in the library prep pipeline. Look for purposes with the mark_under_represented_wells key in their config.
This information is broadcast to the MLWH in one go for a sequencing batch when the batch is released. See the post_release_batch method in sequencing_pipeline.rb.
Additional context
This came up because of a bug where loading a Batch page accidentally triggered the code to find under-represented wells for that whole Batch, which took 6-10 minutes.
See Slack thread https://psd-team.slack.com/archives/C095N1K4W80/p1781267914912399
Results of asking Co-Pilot about performance issues in the code:
The method has multiple layers of N+1 query problems:
def under_rep_requests_for_lane(lane)
  lane.ancestors.grep(Plate)   # 1 query for ancestors
      .flat_map(&:wells)       # N queries, one per plate
      .flat_map(&:requests)    # M queries, one per well
      .compact.uniq
      # P queries, one per request
      .select { |r| r.poly_metadata.any? { |pol| pol.key == UNDER_REPRESENTED_KEY } }
end
Each step loads associated records lazily in Ruby iteration rather than at the database level:
.grep(Plate): filters in Ruby after loading all ancestors, so no issue here, but...
.flat_map(&:wells): fires one SELECT wells WHERE plate_id = ? per ancestor plate
.flat_map(&:requests): fires one SELECT requests WHERE ... = well_id per well
.select { |r| r.poly_metadata... }: fires one SELECT poly_metadata WHERE metadatable_id = ? per request
With e.g. 3 plates × 96 wells × 2 requests each = 576 requests, you'd get roughly 3 + 288 + 576 = 867 queries just for one lane.
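The estimate above can be reproduced with a few lines of Ruby. This is only a back-of-the-envelope sketch: the method name estimated_queries and its parameters are made up for illustration, and the figures are the example ones from the text, not measurements. The single ancestors query is left out, as in the estimate above.

```ruby
# Rough query-count estimate for the lazy-loading version.
# Hypothetical helper; numbers are illustrative, not measured.
def estimated_queries(plates:, wells_per_plate:, requests_per_well:)
  wells = plates * wells_per_plate
  requests = wells * requests_per_well
  {
    wells_queries: plates,       # one SELECT on wells per ancestor plate
    requests_queries: wells,     # one SELECT on requests per well
    metadata_queries: requests,  # one SELECT on poly_metadata per request
    total: plates + wells + requests
  }
end

ESTIMATE = estimated_queries(plates: 3, wells_per_plate: 96, requests_per_well: 2)
puts ESTIMATE[:total]
```

The total grows with the number of requests, so a batch with several lanes multiplies this figure again.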
The fix is to push the loading into the database using eager loading or a single SQL query. The ancestors association likely supports includes, and you could restructure to something like:
lane.ancestors.grep(Plate)
    .flat_map { |plate| plate.wells.includes(requests: :poly_metadata).flat_map(&:requests) }
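To make the difference concrete without needing the real models, here is a plain-Ruby simulation. It is a sketch only: FakeDb, lazy_under_rep and batched_under_rep are hypothetical names, the in-memory hashes stand in for the wells, requests and poly_metadata tables, and the value of UNDER_REPRESENTED_KEY is assumed. Each call to FakeDb.query counts as one SELECT. The batched version mirrors what eager loading would be expected to do: one query per table, however many rows are involved.

```ruby
UNDER_REPRESENTED_KEY = 'under_represented' # assumed value, for illustration

# In-memory stand-in for the database; every query call counts as one SELECT.
module FakeDb
  WELLS = {}     # plate id   => well ids
  REQUESTS = {}  # well id    => request ids
  METADATA = {}  # request id => metadata keys

  class << self
    attr_accessor :count

    # Fetch rows for all given ids in a single counted round trip.
    def query(table, ids)
      self.count += 1
      ids.to_h { |id| [id, table.fetch(id, [])] }
    end
  end
  self.count = 0
end

# Seed 3 plates x 96 wells x 2 requests; every 10th request is flagged.
request_id = 0
(1..3).each do |plate|
  FakeDb::WELLS[plate] = (1..96).map { |w| plate * 1000 + w }
  FakeDb::WELLS[plate].each do |well|
    FakeDb::REQUESTS[well] = Array.new(2) { request_id += 1 }
    FakeDb::REQUESTS[well].each do |req|
      FakeDb::METADATA[req] = (req % 10).zero? ? [UNDER_REPRESENTED_KEY] : []
    end
  end
end

# N+1 style: one query per plate, per well and per request.
def lazy_under_rep(plate_ids)
  plate_ids
    .flat_map { |plate| FakeDb.query(FakeDb::WELLS, [plate]).values.flatten }
    .flat_map { |well| FakeDb.query(FakeDb::REQUESTS, [well]).values.flatten }
    .select { |req| FakeDb.query(FakeDb::METADATA, [req])[req].include?(UNDER_REPRESENTED_KEY) }
end

# Batched style: one query per table.
def batched_under_rep(plate_ids)
  wells = FakeDb.query(FakeDb::WELLS, plate_ids).values.flatten
  requests = FakeDb.query(FakeDb::REQUESTS, wells).values.flatten
  metadata = FakeDb.query(FakeDb::METADATA, requests)
  requests.select { |req| metadata[req].include?(UNDER_REPRESENTED_KEY) }
end

FakeDb.count = 0
LAZY_RESULT = lazy_under_rep([1, 2, 3])
LAZY_QUERIES = FakeDb.count

FakeDb.count = 0
BATCHED_RESULT = batched_under_rep([1, 2, 3])
BATCHED_QUERIES = FakeDb.count

puts "lazy: #{LAZY_QUERIES} queries, batched: #{BATCHED_QUERIES} queries"
```

Both versions return the same requests; only the number of round trips changes. Whether the real associations can be preloaded this cleanly would need checking against the actual models.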