linux/block
Milan Broz 0e435ac26e block: fix setting of max_segment_size and seg_boundary mask
Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
devices.

When stacking devices (LVM over MD over SCSI) some of the request queue
parameters are not set up correctly in some cases by default, namely
max_segment_size and and seg_boundary mask.

If you create MD device over SCSI, these attributes are zeroed.

Problem become when there is over this mapping next device-mapper mapping
- queue attributes are set in DM this way:

request_queue   max_segment_size  seg_boundary_mask
SCSI                65536             0xffffffff
MD RAID1                0                      0
LVM                 65536                 -1 (64bit)

Unfortunately bio_add_page (resp.  bio_phys_segments) calculates number of
physical segments according to these parameters.

During the generic_make_request() is segment cout recalculated and can
increase bio->bi_phys_segments count over the allowed limit.  (After
bio_clone() in stack operation.)

Thi is specially problem in CCISS driver, where it produce OOPS here

    BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);

(MAXSEGENTRIES is 31 by default.)

Sometimes even this command is enough to cause oops:

  dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10

This command generates bios with 250 sectors, allocated in 32 4k-pages
(last page uses only 1024 bytes).

For LVM layer, it allocates bio with 31 segments (still OK for CCISS),
unfortunatelly on lower layer it is recalculated to 32 segments and this
violates CCISS restriction and triggers BUG_ON().

The patch tries to fix it by:

 * initializing attributes above in queue request constructor
   blk_queue_make_request()

 * make sure that blk_queue_stack_limits() inherits setting

 (DM uses its own function to set the limits because it
 blk_queue_stack_limits() was introduced later.  It should probably switch
 to use generic stack limit function too.)

 * sets the default seg_boundary value in one place (blkdev.h)

 * use this mask as default in DM (instead of -1, which differs in 64bit)

Bugs related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=471639
http://bugzilla.kernel.org/show_bug.cgi?id=8672

Signed-off-by: Milan Broz <mbroz@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Cc: Neil Brown <neilb@suse.de>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-03 12:55:55 +01:00
..
as-iosched.c block: as/cfq ssd idle check update 2008-10-09 08:56:19 +02:00
blk-barrier.c block: internal dequeue shouldn't start timer 2008-12-03 12:41:26 +01:00
blk-core.c block: fix setting of max_segment_size and seg_boundary mask 2008-12-03 12:55:55 +01:00
blk-exec.c
blk-integrity.c block: Switch blk_integrity_compare from bdev to gendisk 2008-10-09 08:56:21 +02:00
blk-ioc.c
blk-map.c When block layer fails to map iov, it calls bio_unmap_user to undo 2008-12-03 12:41:20 +01:00
blk-merge.c block: remove unused ll_new_mergeable() 2008-11-06 08:41:55 +01:00
blk-settings.c block: fix setting of max_segment_size and seg_boundary mask 2008-12-03 12:55:55 +01:00
blk-softirq.c block: add fault injection mechanism for faking request timeouts 2008-10-09 08:56:17 +02:00
blk-sysfs.c block: add support for IO CPU affinity 2008-10-09 08:56:09 +02:00
blk-tag.c block: reserve some tags just for sync IO 2008-10-09 08:56:19 +02:00
blk-timeout.c Block: use round_jiffies_up() 2008-11-06 08:42:49 +01:00
blk.h block: remove __generic_unplug_device() from exports 2008-10-17 14:03:08 +02:00
blktrace.c blktrace: use BLKTRACE_BDEV_SIZE as the name size for setup structure 2008-10-09 08:56:20 +02:00
bsg.c [PATCH] switch scsi_cmd_ioctl() to passing fmode_t 2008-10-21 07:47:14 -04:00
cfq-iosched.c block: as/cfq ssd idle check update 2008-10-09 08:56:19 +02:00
cmd-filter.c [PATCH] introduce fmode_t, do annotations 2008-10-21 07:47:06 -04:00
compat_ioctl.c compat_blkdev_driver_ioctl: Remove unused variable warning 2008-10-23 10:28:25 -07:00
deadline-iosched.c
elevator.c block: internal dequeue shouldn't start timer 2008-12-03 12:41:26 +01:00
genhd.c block: set disk->node_id before it's being used 2008-12-03 12:41:20 +01:00
ioctl.c block: make add_partition() return pointer to hd_struct 2008-11-18 15:08:56 +01:00
Kconfig
Kconfig.iosched
Makefile block: unify request timeout handling 2008-10-09 08:56:13 +02:00
noop-iosched.c
scsi_ioctl.c [PATCH] switch scsi_cmd_ioctl() to passing fmode_t 2008-10-21 07:47:14 -04:00