Commit 0a30288

htejun authored and axboe committed
blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe
blk-mq uses percpu_ref for its usage counter, which tracks the number of in-flight commands and is used to synchronously drain the queue on freeze. percpu_ref shutdown takes measurable wallclock time as it involves a sched RCU grace period. This means that draining a blk-mq queue takes measurable wallclock time. One would think that this shouldn't matter as queue shutdown should be a rare event which takes place asynchronously w.r.t. userland.

Unfortunately, SCSI probing involves synchronously setting up and then tearing down a lot of request_queues back-to-back for non-existent LUNs. This means that SCSI probing may take more than ten seconds when scsi-mq is used.

This will be properly fixed by implementing a mechanism to keep q->mq_usage_counter in atomic mode till genhd registration; however, that involves rather big updates to percpu_ref which are difficult to apply late in the devel cycle (v3.17-rc6 at the moment). As a stop-gap measure till the proper fix can be implemented in the next cycle, this patch introduces __percpu_ref_kill_expedited() and makes blk_mq_freeze_queue() use it. This is heavy-handed but should work for testing the experimental SCSI blk-mq implementation.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Christoph Hellwig <[email protected]>
Link: http://lkml.kernel.org/g/[email protected]
Fixes: add703f ("blk-mq: use percpu_ref for mq usage count")
Cc: Kent Overstreet <[email protected]>
Cc: Jens Axboe <[email protected]>
Tested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
1 parent 452b636 commit 0a30288
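
To make the commit message's description of the usage counter concrete, here is a minimal userspace sketch of a counter that starts in per-CPU mode and is folded into a single count when killed. It is not the kernel's implementation: `struct model_ref`, the `model_ref_*()` helpers, and `NR_CPUS` are hypothetical names invented for this illustration, the real counter's bias and memory-ordering details are omitted, and the grace period the real code must wait for is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

struct model_ref {
	long pcpu[NR_CPUS];	/* per-CPU deltas, used only while !dead */
	long atomic_count;	/* authoritative count once dead */
	bool dead;		/* loosely models the PCPU_REF_DEAD flag */
};

/* Start in per-CPU mode, holding one initial reference. */
static void model_ref_init(struct model_ref *ref)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		ref->pcpu[cpu] = 0;
	ref->atomic_count = 1;
	ref->dead = false;
}

/* Take a reference on @cpu, e.g. for a command going in flight. */
static void model_ref_get(struct model_ref *ref, int cpu)
{
	if (ref->dead)
		ref->atomic_count++;
	else
		ref->pcpu[cpu]++;
}

/* Drop a reference on @cpu; it need not be the CPU that took it. */
static void model_ref_put(struct model_ref *ref, int cpu)
{
	if (ref->dead)
		ref->atomic_count--;
	else
		ref->pcpu[cpu]--;
}

/*
 * Kill the ref: mark it dead, fold the per-CPU deltas into the single
 * count, and drop the initial reference.  In the kernel a sched RCU
 * grace period must elapse between marking and folding; this model is
 * single-threaded, so that wait is only a comment here.
 */
static void model_ref_kill(struct model_ref *ref)
{
	assert(!ref->dead);	/* stands in for the double-kill warning */
	ref->dead = true;
	/* (grace period would elapse here) */
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		ref->atomic_count += ref->pcpu[cpu];
		ref->pcpu[cpu] = 0;
	}
	ref->atomic_count--;
}

/* Zero can only be observed after the switch to the single count. */
static bool model_ref_is_zero(const struct model_ref *ref)
{
	return ref->dead && ref->atomic_count == 0;
}
```

In this model, a queue drain corresponds to calling `model_ref_kill()` and then waiting until `model_ref_is_zero()` holds. The point of the sketch is that the zero check cannot succeed before the fold, which is why the time spent waiting for the grace period shows up directly in every freeze.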

3 files changed, +27 -1 lines changed


block/blk-mq.c

+10 -1

@@ -119,7 +119,16 @@ void blk_mq_freeze_queue(struct request_queue *q)
 	spin_unlock_irq(q->queue_lock);

 	if (freeze) {
-		percpu_ref_kill(&q->mq_usage_counter);
+		/*
+		 * XXX: Temporary kludge to work around SCSI blk-mq stall.
+		 * SCSI synchronously creates and destroys many queues
+		 * back-to-back during probe leading to lengthy stalls.
+		 * This will be fixed by keeping ->mq_usage_counter in
+		 * atomic mode until genhd registration, but, for now,
+		 * let's work around using expedited synchronization.
+		 */
+		__percpu_ref_kill_expedited(&q->mq_usage_counter);
+
 		blk_mq_run_queues(q, false);
 	}
 	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->mq_usage_counter));

include/linux/percpu-refcount.h

+1 -0

@@ -71,6 +71,7 @@ void percpu_ref_reinit(struct percpu_ref *ref);
 void percpu_ref_exit(struct percpu_ref *ref);
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill);
+void __percpu_ref_kill_expedited(struct percpu_ref *ref);

 /**
  * percpu_ref_kill - drop the initial ref

lib/percpu-refcount.c

+16 -0

@@ -184,3 +184,19 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
+
+/*
+ * XXX: Temporary kludge to work around SCSI blk-mq stall. Used only by
+ * block/blk-mq.c::blk_mq_freeze_queue(). Will be removed during v3.18
+ * devel cycle. Do not use anywhere else.
+ */
+void __percpu_ref_kill_expedited(struct percpu_ref *ref)
+{
+	WARN_ONCE(ref->pcpu_count_ptr & PCPU_REF_DEAD,
+		  "percpu_ref_kill() called more than once on %pf!",
+		  ref->release);
+
+	ref->pcpu_count_ptr |= PCPU_REF_DEAD;
+	synchronize_sched_expedited();
+	percpu_ref_kill_rcu(&ref->rcu);
+}
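
The hunk above differs from percpu_ref_kill_and_confirm() in how the callback gets run: the existing path queues it with call_rcu_sched() and returns, while the new helper waits for an expedited grace period and then calls percpu_ref_kill_rcu() directly. The userspace sketch below models only that ordering difference, not the timing. Every name in it (`model_call_rcu()`, `model_grace_period_end()`, `model_kill_deferred()`, `model_kill_expedited()`, the one-slot callback queue) is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*model_cb_t)(void *arg);

/* One-slot stand-in for the RCU callback queue. */
static model_cb_t pending_cb;
static void *pending_arg;
static unsigned long grace_periods;	/* grace periods elapsed so far */

/* Loosely models call_rcu_sched(): queue the callback and return. */
static void model_call_rcu(model_cb_t cb, void *arg)
{
	assert(pending_cb == NULL);	/* model holds one callback at a time */
	pending_cb = cb;
	pending_arg = arg;
}

/* Models a grace period ending: run the queued callback, if any. */
static void model_grace_period_end(void)
{
	grace_periods++;
	if (pending_cb) {
		model_cb_t cb = pending_cb;

		pending_cb = NULL;
		cb(pending_arg);
	}
}

/* Callback: record that the switch to atomic mode has completed. */
static void model_kill_done(void *arg)
{
	*(bool *)arg = true;
}

/* Deferred kill: completion depends on a later grace period. */
static void model_kill_deferred(bool *done)
{
	model_call_rcu(model_kill_done, done);
}

/*
 * Expedited kill: the increment stands in for forcing a grace period
 * inline, after which the callback runs before the function returns.
 */
static void model_kill_expedited(bool *done)
{
	grace_periods++;
	model_kill_done(done);
}
```

With the deferred variant, the caller still has to wait for something else to end the grace period before the counter can reach zero. With the expedited variant, that work is finished by the time the call returns, so the subsequent wait in the freeze path only depends on in-flight commands draining.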
