linux

Commit Graph

Author	SHA1	Message	Date
Hannes Reinecke	91921e016a	scsi: use dev_printk variants where possible Using dev_printk variants prefixes the logging message with the originating device, which makes debugging easier. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:42 +02:00
Hannes Reinecke	e5f73ce324	scsi: use dev_printk() variants for ioctl Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:41 +02:00
Hannes Reinecke	b30d8bca5b	scsi: Implement st_printk() Update the st driver to use dev_printk() variants instead of plain printk(); this will prefix logging messages with the appropriate device. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:41 +02:00
Hannes Reinecke	28c31729c8	scsi: Implement ch_printk() Update the ch driver to use dev_printk() variants instead of plain printk(); this will prefix logging messages with the appropriate device. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:40 +02:00
Hannes Reinecke	95e159d6dd	scsi: Implement sg_printk() Update the sg driver to use dev_printk() variants instead of plain printk(); this will prefix logging messages with the appropriate device. Signed-off-by: Hannes Reinecke <hare@suse.de> Acked-by: Doug Gilbert <dgilbert@interlog.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:40 +02:00
Hannes Reinecke	96eefad2d9	scsi: Implement sr_printk() Update the sr driver to use dev_printk() variants instead of plain printk(); this will prefix logging messages with the appropriate device. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:39 +02:00
Hannes Reinecke	d9e5d61837	scsi_scan: Fixup scsilun_to_int() scsilun_to_int() has an error which prevents it from generating correct LUN numbers for 64bit values. Also we should remove the misleading comment about portions of the LUN being ignored; the initiator should treat the LUN as an opaque value. And, finally, the example given should use the correct prefix (here: extended flat space addressing scheme). This patch includes the modifications suggested by Bart van Assche. Cc: Bart van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: James Bottomley <jbottomley@parallels.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:39 +02:00
Hannes Reinecke	1abf635d2f	scsi: use 64-bit value for 'max_luns' Now that we're using 64-bit LUNs internally we need to increase the size of max_luns to 64 bits, too. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:38 +02:00
Hannes Reinecke	b4210b810e	Add module param type 'ullong' Some driver might want to pass in an 64-bit value, so introduce a module param type 'ullong'. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ewan Milne <emilne@redhat.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:37 +02:00
Hannes Reinecke	9cb78c16f5	scsi: use 64-bit LUNs The SCSI standard defines 64-bit values for LUNs, and large arrays employing large or hierarchical LUN numbers become more and more common. So update the linux SCSI stack to use 64-bit LUN numbers. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:37 +02:00
Hannes Reinecke	755f516bbb	qla2xxx: Restrict max_lun to 16-bit for older HBAs Older HBAs are only capable of supporting 16-bit LUNs, so we need to make sure to adjust max_lun accordingly. Signed-off-by: Hannes Reinecke <hare@suse.de> Acked-by: Chad Dupuis <chad.dupuis@qlogic.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:36 +02:00
Hannes Reinecke	22ffeb48b7	scsi_scan: Restrict sequential scan to 256 LUNs Sequential scan for more than 256 LUNs is very fragile as LUNs might not be numbered sequentially after that point. SAM revisions later than SCSI-3 impose a structure on LUNs larger than 256, making LUN numbers between 256 and 16384 illegal. SCSI-3, however allows for plain 64-bit numbers with no internal structure. So restrict sequential LUN scan to 256 LUNs and add a new blacklist flag 'BLIST_SCSI3LUN' to scan up to max_lun devices. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:35 +02:00
Hannes Reinecke	c309b35171	scsi: Remove CONFIG_SCSI_MULTI_LUN Obsolete; either use 'max_lun' if the host supports only a limited number of LUNs or BLIST_NOLUN if the target has problems addressing more than one LUN. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:35 +02:00
Douglas Gilbert	cc833acbee	sg: O_EXCL and other lock handling This addresses a problem reported by Vaughan Cao concerning the correctness of the O_EXCL logic in the sg driver. POSIX doesn't defined O_EXCL semantics on devices but "allow only one open file descriptor at a time per sg device" is a rough definition. The sg driver's semantics have been to wait on an open() when O_NONBLOCK is not given and there are O_EXCL headwinds. Nasty things can happen during that wait such as the device being detached (removed). So multiple locks are reworked in this patch making it large and hard to break down into digestible bits. This patch is against Linus's current git repository which doesn't include any sg patches sent in the last few weeks. Hence this patch touches as little as possible that it doesn't need to and strips out most SCSI_LOG_TIMEOUT() changes in v3 because Hannes said he was going to rework all that stuff. The sg3_utils package has several test programs written to test this patch. See examples/sg_tst_excl*.cpp . Not all the locks and flags in sg have been re-worked in this patch, notably sg_request::done . That can wait for a follow-up patch if this one meets with approval. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:07:34 +02:00
Douglas Gilbert	16070cc189	sg: add SG_FLAG_Q_AT_TAIL flag When the SG_IO ioctl was copied into the block layer and later into the bsg driver, subtle differences emerged. One difference is the way injected commands are queued through the block layer (i.e. this is not SCSI device queueing nor SATA NCQ). Summarizing: - SG_IO in the block layer: blk_exec*(at_head=false) - sg SG_IO: at_head=true - bsg SG_IO: at_head=true Some time ago Boaz Harrosh introduced a sg v4 flag called BSG_FLAG_Q_AT_TAIL to override the bsg driver default. This patch does the equivalent for the sg driver. ChangeLog: Introduce SG_FLAG_Q_AT_TAIL flag to cause commands to be injected into the block layer with at_head=false. Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:34 +02:00
Douglas Gilbert	65c26a0f39	sg: relax 16 byte cdb restriction - remove the 16 byte CDB (SCSI command) length limit from the sg driver by handling longer CDBs the same way as the bsg driver. Remove comment from sg.h public interface about the cmd_len field being limited to 16 bytes. - remove some dead code caused by this change - cleanup comment block at the top of sg.h, fix urls Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:33 +02:00
Martin K. Petersen	bcdb247c6b	sd: Limit transfer length Until now the per-command transfer length has exclusively been gated by the max_sectors parameter in the scsi_host template. Given that the size of this parameter has been bumped to an unsigned int we have to be careful not to exceed the target device's capabilities. If the if the device specifies a Maximum Transfer Length in the Block Limits VPD we'll use that value. Otherwise we'll use 0xffffffff for devices that have use_16_for_rw set and 0xffff for the rest. We then combine the chosen disk limit with max_sectors in the host template. The smaller of the two will be used to set the max_hw_sectors queue limit. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:33 +02:00
Clément Calmels	8d964478b2	sd: bad return code of init_sd In init_sd function, if kmem_cache_create or mempool_create_slab_pools calls fail, the error will not be correclty reported because class_register previously set the value of err to 0. Signed-off-by: Clément Calmels <clement.calmels@free.fr> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:32 +02:00
Vaughan Cao	cb2fb68d06	sd: notify block layer when using temporary change to cache_type This is a fix for commit `39c60a0948` "sd: fix array cache flushing bug causing performance problems" We must notify the block layer via q->flush_flags after a temporary change of the cache_type to write through. Without this, a SYNCHRONIZE CACHE command will still be generated. This patch factors out a helper that can be called from sd_revalidate_disk and cache_type_store. Signed-off-by: Vaughan Cao <vaughan.cao@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:31 +02:00
Akinobu Mita	6bb5e6e772	scsi_debug: allow huge transfer length for read/write commands This change enables to test read/write commands with huge transfer length such as 1GB. For example: # modprobe scsi_debug dev_size_mb=1024 clustering=1 opts=1 # cat /sys/block/$DEV/queue/max_hw_sectors_kb > \ /sys/block/$DEV/queue/max_sectors_kb # fio --name=test --rw=write --bs=1g --size=1g --filename=/dev/$DEV \ --mem=mmaphuge --direct=1 The data type of max_sectors in scsi_host_template has been extended to unsigned int by the previous change. So we can increase it from 0xffff to 0xffffffff to allow such huge transfer length. Also, this increases sg_tablesize and max_segment_size, otherwise the maximum transfer length is limited to 64MB. (sg_tablesize * max_segment_size = 256 * 256KB) Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:31 +02:00
Akinobu Mita	8ed5a4d2f7	scsi: increase upper limit for max_sectors max_sectors in struct Scsi_Host specifies maximum number of sectors allowed in a single SCSI command. The data type of max_sectors is unsigned short, so the maximum transfer length per SCSI command is limited to less than 256MB in 4096-bytes sector size. (0xffff * 4096) This commit increases the SCSI mid level's limitation for max_sectors upto the block layer's limitation for max_hw_sectors by extending the data type of max_sectors in struct Scsi_Host and scsi_host_template, so that SCSI lower level drivers can specify more than 0xffff. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:30 +02:00
Akinobu Mita	e430cbc8bb	sd: use READ_16 or WRITE_16 when transfer length is greater than 0xffff This change makes the scsi disk driver handle the requests whose transfer length is greater than 0xffff with READ_16 or WRITE_16. However, this is a preparation for extending the data type of max_sectors in struct Scsi_Host and scsi_host_template. So, it is impossible to happen this condition for now, because SCSI low-level drivers can not specify max_sectors greater than 0xffff due to the data type limitation. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:30 +02:00
Akinobu Mita	46f69e6a6b	sg: prevent integer overflow when converting from sectors to bytes This prevents integer overflow when converting the request queue's max_sectors from sectors to bytes. However, this is a preparation for extending the data type of max_sectors in struct Scsi_Host and scsi_host_template. So, it is impossible to happen this integer overflow for now, because SCSI low-level drivers can not specify max_sectors greater than 0xffff due to the data type limitation. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:29 +02:00
Bart Van Assche	fcc95a7634	scsi: remove two cancel_delayed_work() calls from the mid-layer scsi_put_command() is either invoked before blk_start_request() or after block layer processing has completed. scsi_cmnd.abort_work is scheduled from inside the SCSI timeout handler. The block layer guarantees that either the regular completion handler (softirq_done_fn()) or the timeout handler (rq_timed_out_fn()) is invoked but not both. This means that scsi_put_command() is never invoked while abort_work is scheduled. Hence remove the cancel_delayed_work() call from scsi_put_command(). Similarly, scsi_abort_command() is only invoked from the SCSI timeout handler. If scsi_abort_command() is invoked for a SCSI command with the SCSI_EH_ABORT_SCHEDULED flag set this means that scmd_eh_abort_handler() has already invoked scsi_queue_insert() and hence that scsi_cmnd.abort_work is no longer pending. Hence also remove the cancel_delayed_work() call from scsi_abort_command(). Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:28 +02:00
James Bottomley	89fb4cd1f7	scsi: handle flush errors properly Flush commands don't transfer data and thus need to be special cased in the I/O completion handler so that we can propagate errors to the block layer and filesystem. Signed-off-by: James Bottomley <JBottomley@Parallels.com> Reported-by: Steven Haber <steven@qumulo.com> Tested-by: Steven Haber <steven@qumulo.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 21:56:34 +02:00
Linus Torvalds	59ca9ee428	Fixes: - Fixes console deadlock when resuming PV guests - Fix regression hit when ballooning and resuming PV guests -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJTxt1pAAoJEFjIrFwIi8fJbGcH/RW88DhFw3wJrtyd68R2uY4R BQVsUeXltzS7cRZ46ytStn+UveXOII9SI1kdU0yAPWbpzgKwXkXLTGUbL26vXmfq ZpAGIVFBG+SrHuTDEYP/IhB2TI6ugTeHVK8+Mo+HfjEs99OJ+BAf4IlZ9cLUvrW8 l51HcFe7c9ueH7YculTAesUWZrdS6p5FBbTz0pwG8BvU5NxJ2EC+MktLEKZmV27Q Y3CN///BhIN6YXhPA3Frykxs36m4gFJUPgXbEQFea79I67f+frBX/VdKNPeaMoaR yb305PStdxwo6Ja8Qx/CrTl+y8vTtLP1cADRGSdLD3m1vI42PNF4Qjr1ruLbA34= =ky5L -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen fixes from Konrad Rzeszutek Wilk: "Two fixes found during migration of PV guests. David would be the one doing this pull but he is on vacation. Fixes: - fix console deadlock when resuming PV guests - fix regression hit when ballooning and resuming PV guests" * tag 'stable/for-linus-3.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: set ballooned out pages as invalid in p2m xen/manage: fix potential deadlock when resuming the console	2014-07-17 08:02:35 -10:00
Linus Torvalds	22d368544b	A few more fixes for ftrace infrastructure. I was cleaning out my INBOX and found two fixes from zhangwei from a year ago that were lost in my mail. These fix an inconsistency between trace_puts() and the way trace_printk() works. The reason this is important to fix is because when trace_printk() doesn't have any arguments, it turns into a trace_puts(). Not being able to enable a stack trace against trace_printk() because it does not have any arguments is quite confusing. Also, the fix is rather trivial and low risk. While porting some changes to PowerPC I discovered that it still has the function graph tracer filter bug that if you also enable stack tracing the function graph tracer filter is ignored. I fixed that up. Finally, Martin Lau, fixed a bug that would cause readers of the ftrace ring buffer to block forever even though it was suppose to be NONBLOCK. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJTxfAHAAoJEKQekfcNnQGueDkH/jQ/FgNPw+GbKuNpZwWqMQds ivGM8yb6NgmlDQx60oBOL9TduKN2LCqW3HWDoyzduSozuLUxpRTPWtVThKo0Q9Fp FPHWewdrX7DwWkyEqeG7FAAhjqPedgEZoDOwy7BArzJYpHcpex42trDY4BhjPrlM ESA+/M+Xm1vGCJzcNSqNHYE/mF/tHptdM1mlLZ3g9ql49MiTbQxEMbHX7lih24sv AzNP1SlisYN5CkX4hbRGhCYkZdmdVZt/xiyBAQRWCGFx+jsdJbMXDJnevDCdU9Mv GNtKwqsi1/GeE5nvKxZyQKsDEE98Pd8LyQ0kDqKpAfTutqNE4mpWJkCxUqTmE4Q= =aFqO -----END PGP SIGNATURE----- Merge tag 'trace-fixes-v3.16-rc5-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "A few more fixes for ftrace infrastructure. I was cleaning out my INBOX and found two fixes from zhangwei from a year ago that were lost in my mail. These fix an inconsistency between trace_puts() and the way trace_printk() works. The reason this is important to fix is because when trace_printk() doesn't have any arguments, it turns into a trace_puts(). Not being able to enable a stack trace against trace_printk() because it does not have any arguments is quite confusing. Also, the fix is rather trivial and low risk. While porting some changes to PowerPC I discovered that it still has the function graph tracer filter bug that if you also enable stack tracing the function graph tracer filter is ignored. I fixed that up. Finally, Martin Lau, fixed a bug that would cause readers of the ftrace ring buffer to block forever even though it was suppose to be NONBLOCK" This also includes the fix from an earlier pull request: "Oleg Nesterov fixed a memory leak that happens if a user creates a tracing instance, sets up a filter in an event, and then removes that instance. The filter allocates memory that is never freed when the instance is destroyed" * tag 'trace-fixes-v3.16-rc5-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ring-buffer: Fix polling on trace_pipe tracing: Add TRACE_ITER_PRINTK flag check in __trace_puts/__trace_bputs tracing: Fix graph tracer with stack tracer on other archs tracing: Add ftrace_trace_stack into __trace_puts/__trace_bputs tracing: instance_rmdir() leaks ftrace_event_file->filter	2014-07-17 07:57:33 -10:00
Linus Torvalds	b6603fe574	MTD updates for 3.16-rc6 - Fix ELM suspend/resume - Reduce warnings if NAND ECC is too weak - Add CFI support for Sharp LH28F640BF NOR The last fix is coming in because other commits in the 3.16 cycle depended on this support. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTxrg9AAoJEFySrpd9RFgtJvMP/1xlHR4Pb8uJnTPFwDZbR6CQ 0I9q7K0wgvdtlTXLKqbmOEVZ7yn6EVpx/TR9Pdk+aCvJFrnKjkiTenZ7yeT80iad lOp4zWf2+x74zMn9Q2m16YAoL/AwoGoiP5iaMnD+vwkwf2kzr7QbgxO/YqwtCv0O gY/eAPiCWzlKlswEavSVBj//eqSdHWn4imKDisZ0roDYw8J+T3n42gwnVoqp5MqU mrk0N/hpuxaYA4TH3XFhDb/QgfKSsNt/c3FzeQKO+aurwfM7TuuKVOGYdT/URuhg oSLCRBoVF9EIUVIDeEWxmktJJx8HRCpmoLZE7jj8g/y9I6P4JZk73KTMIEdE9170 x9j7jYngp68yKw9wel6zcTPpEzzHzyenJKlUqUjwhsxemfdXjDOp+X+QlI8MGvuL kypLN8xnIMYhSpFhOq0W+Emi67CXi9uN83HMZV4nP9krZDHb796pi1iRX66Vdl61 1U9dJ2YZy4mSoDQXS8bo98yR9Qw/UDA+Fus+OOfnGtTxcMYF2rdmL5sCMNBvAK+1 fpkSJDnNrmaa6zvSrKm6JOk5UTEFvmXZocBcrwIzMG3jI+JsZ9jwY7kpUA25Dz5b wowABmfnjmuazsI8Ga9zGUjl70vHV/VuVYYSZybMxGdHxYKmdXJ1kSDqtuSlSQ3A +KxZd8I/mdnOAy/OWIsd =OGTK -----END PGP SIGNATURE----- Merge tag 'for-linus-20140716' of git://git.infradead.org/linux-mtd Pull MTD fixes from Brian Norris: - Fix ELM suspend/resume - Reduce warnings if NAND ECC is too weak - Add CFI support for Sharp LH28F640BF NOR The last fix is coming in because other commits in the 3.16 cycle depended on this support. * tag 'for-linus-20140716' of git://git.infradead.org/linux-mtd: mtd: cfi_cmdset_0001.c: add support for Sharp LH28F640BF NOR mtd: nand: reduce the warning noise when the ECC is too weak mtd: devices: elm: fix elm_context_save() and elm_context_restore() functions	2014-07-16 10:11:42 -10:00
Linus Torvalds	bcf44bfe5e	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A cpufreq lockup fix and a compiler warning fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix compiler warnings x86, tsc: Fix cpufreq lockup	2014-07-16 10:11:02 -10:00
Linus Torvalds	d14aef3872	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Tooling fixes and an Intel PMU driver fixlet" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Do not allow optimized switch for non-cloned events perf/x86/intel: ignore CondChgd bit to avoid false NMI handling perf symbols: Get kernel start address by symbol name perf tools: Fix segfault in cumulative.callchain report	2014-07-16 10:10:27 -10:00
Linus Torvalds	2da2944740	sound fixes for 3.16-rc6 Things seem to calm down so far, just a small few HD-audio fixes (regression fixes and a new codec ID addition) popping up. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJTxlkHAAoJEGwxgFQ9KSmk0EEP/iKDIFbGzlQeH/X6G0pwnk9+ Cl+j8KBUQfevV49SCGZm+h8NHF+RpnV3OrVHQFH+a1AwZqYRA0pEs4sDjRarcSud fwMNqe/AUmTtyazeF2fSvNokbzHLlWaWG/+vLzxsvvsGlu2tNppNPK6WYrQyRp/G 8ZCruUQz+Cl/GY4VN2Cx4zV7xNkcyUQnbnL3Uo8EM2hZKK4I2YdtQPc8rrD4+DzG YPDtbton4Sb0xPOnAcmmmuXQDM3rSh0xMK8ZMwfDeruLapXhY1z25PbugUvet2Xx gkROW9JfseEjvJKT55sYbrj+jEFlIPAeom/BSVJBerJmYs+jbDkpCzlbtzU5dT0R yUp6v5Z9VYmIkExJZOFmJVq9D2UbCd1b4kKZdwExvCtVPJ8Ce3HhCvrB5MJWl7vo nA3Z7pPXeqhBmxBHT1oREREn5OydRR1MfERNEtxuVWrg7jl9ugUGHBc3/F24K2Ia Xkbnh7orTmUZy9LeMA9BTUFKGmAKr/phju8RRahv846TFnd98NSK8g1F+Gi1UEfB 4ZqG9fJJBhwzm2ZIh8ocMOF7fVJsTOPurPkfL3PbosZ8J/7qsh3unVihrHuLDbC+ 0sM3JS7kG6IgMCgyB8+697AtAP+jw1sAz/gDG+bjXx2BA2HtycLlb5mXwB8Jvg0y K9XlJ6urPLsQGKsxCTO4 =Ua+N -----END PGP SIGNATURE----- Merge tag 'sound-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Things seem to calm down so far, just a small few HD-audio fixes (regression fixes and a new codec ID addition) popping up" * tag 'sound-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Fix broken PM due to incomplete i915 initialization ALSA: hda - Revert stream assignment order for Intel controllers ALSA: hda - Add new GPU codec ID 0x10de0070 to snd-hda ALSA: hda: Fix build warning	2014-07-16 06:48:08 -10:00
Linus Torvalds	c20ddc6499	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota fix from Jan Kara: "Fix locking of dquot shrinker" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: missing lock in dqcache_shrink_scan()	2014-07-15 17:47:42 -10:00
Linus Torvalds	d7b6e53e32	A GPIO fix for the v3.16 series - fixes up some merge confusion from the merge window. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTxXoGAAoJEEEQszewGV1z6ogQAIIe+juP1FUjRdsZkD8S6ewF gdtOSMuPfL3HV+hSlRLKchAQbA+4HMxyH9lv0Vd8CHKy91w932w6x84O10B1XQJf ethsxxXmMi7OGTMQkIrbuZZDcnOVkcNmw1GuPXknvHUWgNFDUMz6nu0YShQ+ZehQ ctIAyfV0wdFilZHuSkO+thj/+cDjBEjlklgAcVVQ6OgnmshPDd1PmELzVbKIueWf E+/KgyLYvunubcuYB56dfDWtG4pE48qVekvDzXHZXxVQbuW3Ov4fOzQvyO01s/WQ cK7S7T5T3FepYzWIvh5fvGIXKfoD6sjdQWQ4hUBYTZI+qpdoN7Diri3T1Rhe4avj lJ5xCZujJ3AnFSEowqn9oM0CQQ5J0Dx3joUYnL+Krkt+va+LayrRwWxdMmoCX57H 2uw3uAVdAg4qhqHcxJP7HnsgXEocDVKcGXV59++wdE8Ebann3IeyH4MnDCr2hQM+ u0O1IgjjZlOM0NAfGySCcTsfUxW4mB8znAOex+66PnX1Mvncm0UVc+VrkAxcYp3Q 3Ek7hz6sIPB/b8RYzPb5X4aT+lXZyD8I8lOE7OruxhWggCC3cnCDQWRJQ+DNvwYJ z3GiPLJRgFuvoF7QrYhOeBlUH3fZ9mrqLQietQiAS1tz0oMKYO4TBkM8HDjLMNAx 3hQEuyp9EHDA9K1gxERL =h0US -----END PGP SIGNATURE----- Merge tag 'gpio-v3.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fix from Linus Walleij: "Fix up some merge confusion from the merge window" * tag 'gpio-v3.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: mcp23s08: Eliminates redundant checking.	2014-07-15 17:46:51 -10:00
Martin Lau	97b8ee8453	ring-buffer: Fix polling on trace_pipe ring_buffer_poll_wait() should always put the poll_table to its wait_queue even there is immediate data available. Otherwise, the following epoll and read sequence will eventually hang forever: 1. Put some data to make the trace_pipe ring_buffer read ready first 2. epoll_ctl(efd, EPOLL_CTL_ADD, trace_pipe_fd, ee) 3. epoll_wait() 4. read(trace_pipe_fd) till EAGAIN 5. Add some more data to the trace_pipe ring_buffer 6. epoll_wait() -> this epoll_wait() will block forever ~ During the epoll_ctl(efd, EPOLL_CTL_ADD,...) call in step 2, ring_buffer_poll_wait() returns immediately without adding poll_table, which has poll_table->_qproc pointing to ep_poll_callback(), to its wait_queue. ~ During the epoll_wait() call in step 3 and step 6, ring_buffer_poll_wait() cannot add ep_poll_callback() to its wait_queue because the poll_table->_qproc is NULL and it is how epoll works. ~ When there is new data available in step 6, ring_buffer does not know it has to call ep_poll_callback() because it is not in its wait queue. Hence, block forever. Other poll implementation seems to call poll_wait() unconditionally as the very first thing to do. For example, tcp_poll() in tcp.c. Link: http://lkml.kernel.org/p/20140610060637.GA14045@devbig242.prn2.facebook.com Cc: stable@vger.kernel.org # 2.6.27 Fixes: `2a2cc8f7c4` "ftrace: allow the event pipe to be polled" Reviewed-by: Chris Mason <clm@fb.com> Signed-off-by: Martin Lau <kafai@fb.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-07-15 17:07:47 -04:00
Niu Yawei	d68aab6b8f	quota: missing lock in dqcache_shrink_scan() Commit `1ab6c4997e` (fs: convert fs shrinkers to new scan/count API) accidentally removed locking from quota shrinker. Fix it - dqcache_shrink_scan() should use dq_list_lock to protect the scan on free_dquots list. CC: stable@vger.kernel.org Fixes: `1ab6c4997e` Signed-off-by: Niu Yawei <yawei.niu@intel.com> Signed-off-by: Jan Kara <jack@suse.cz>	2014-07-15 22:36:18 +02:00
zhangwei(Jovi)	f0160a5a29	tracing: Add TRACE_ITER_PRINTK flag check in __trace_puts/__trace_bputs The TRACE_ITER_PRINTK check in __trace_puts/__trace_bputs is missing, so add it, to be consistent with __trace_printk/__trace_bprintk. Those functions are all called by the same function: trace_printk(). Link: http://lkml.kernel.org/p/51E7A7D6.8090900@huawei.com Cc: stable@vger.kernel.org # 3.11+ Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-07-15 11:58:40 -04:00
Linus Torvalds	0b632204c7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "This contains miscellaneous fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: replace count*size kzalloc by kcalloc fuse: release temporary page if fuse_writepage_locked() failed fuse: restructure ->rename2() fuse: avoid scheduling while atomic fuse: handle large user and group ID fuse: inode: drop cast fuse: ignore entry-timeout on LOOKUP_REVAL fuse: timeout comparison fix	2014-07-15 08:57:17 -07:00
Linus Torvalds	5615f9f822	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Bluetooth pairing fixes from Johan Hedberg. 2) ieee80211_send_auth() doesn't allocate enough tail room for the SKB, from Max Stepanov. 3) New iwlwifi chip IDs, from Oren Givon. 4) bnx2x driver reads wrong PCI config space MSI register, from Yijing Wang. 5) IPV6 MLD Query validation isn't strong enough, from Hangbin Liu. 6) Fix double SKB free in openvswitch, from Andy Zhou. 7) Fix sk_dst_set() being racey with UDP sockets, leading to strange crashes, from Eric Dumazet. 8) Interpret the NAPI budget correctly in the new systemport driver, from Florian Fainelli. 9) VLAN code frees percpu stats in the wrong place, leading to crashes in the get stats handler. From Eric Dumazet. 10) TCP sockets doing a repair can crash with a divide by zero, because we invoke tcp_push() with an MSS value of zero. Just skip that part of the sendmsg paths in repair mode. From Christoph Paasch. 11) IRQ affinity bug fixes in mlx4 driver from Amir Vadai. 12) Don't ignore path MTU icmp messages with a zero mtu, machines out there still spit them out, and all of our per-protocol handlers for PMTU can cope with it just fine. From Edward Allcutt. 13) Some NETDEV_CHANGE notifier invocations were not passing in the correct kind of cookie as the argument, from Loic Prylli. 14) Fix crashes in long multicast/broadcast reassembly, from Jon Paul Maloy. 15) ip_tunnel_lookup() doesn't interpret wildcard keys correctly, fix from Dmitry Popov. 16) Fix skb->sk assigned without taking a reference to 'sk' in appletalk, from Andrey Utkin. 17) Fix some info leaks in ULP event signalling to userspace in SCTP, from Daniel Borkmann. 18) Fix deadlocks in HSO driver, from Olivier Sobrie. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits) hso: fix deadlock when receiving bursts of data hso: remove unused workqueue net: ppp: don't call sk_chk_filter twice mlx4: mark napi id for gro_skb bonding: fix ad_select module param check net: pppoe: use correct channel MTU when using Multilink PPP neigh: sysctl - simplify address calculation of gc_* variables net: sctp: fix information leaks in ulpevent layer MAINTAINERS: update r8169 maintainer net: bcmgenet: fix RGMII_MODE_EN bit tipc: clear 'next'-pointer of message fragments before reassembly r8152: fix r8152_csum_workaround function be2net: set EQ DB clear-intr bit in be_open() GRE: enable offloads for GRE farsync: fix invalid memory accesses in fst_add_one() and fst_init_card() igb: do a reset on SR-IOV re-init if device is down igb: Workaround for i210 Errata 25: Slow System Clock usbnet: smsc95xx: add reset_resume function with reset operation dp83640: Always decode received status frames r8169: disable L23 ...	2014-07-15 08:42:52 -07:00
Steven Rostedt (Red Hat)	5f8bf2d263	tracing: Fix graph tracer with stack tracer on other archs Running my ftrace tests on PowerPC, it failed the test that checks if function_graph tracer is affected by the stack tracer. It was. Looking into this, I found that the update_function_graph_func() must be called even if the trampoline function is not changed. This is because archs like PowerPC do not support ftrace_ops being passed by assembly and instead uses a helper function (what the trampoline function points to). Since this function is not changed even when multiple ftrace_ops are added to the code, the test that falls out before calling update_function_graph_func() will miss that the update must still be done. Call update_function_graph_function() for all calls to update_ftrace_function() Cc: stable@vger.kernel.org # 3.3+ Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-07-15 11:10:25 -04:00
zhangwei(Jovi)	8abfb8727f	tracing: Add ftrace_trace_stack into __trace_puts/__trace_bputs Currently trace option stacktrace is not applicable for trace_printk with constant string argument, the reason is in __trace_puts/__trace_bputs ftrace_trace_stack is missing. In contrast, when using trace_printk with non constant string argument(will call into __trace_printk/__trace_bprintk), then trace option stacktrace is workable, this inconstant result will confuses users a lot. Link: http://lkml.kernel.org/p/51E7A7C9.9040401@huawei.com Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-07-15 11:04:40 -04:00
Takashi Iwai	4da63c6fc4	ALSA: hda - Fix broken PM due to incomplete i915 initialization When the initialization of Intel HDMI controller fails due to missing i915 kernel symbols (e.g. HD-audio is built in while i915 is module), the driver discontinues the probe. However, since the probe was done asynchronously, the driver object still remains, thus the relevant PM ops are still called at suspend/resume. This results in the bad access to the incomplete audio card object, eventually leads to Oops or stall at PM. This patch adds the missing checks of chip->init_failed flag at each PM callback in order to fix the problem above. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79561 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>	2014-07-15 15:19:43 +02:00
Olivier Sobrie	8f9818af4e	hso: fix deadlock when receiving bursts of data When the module sends bursts of data, sometimes a deadlock happens in the hso driver when the tty buffer doesn't get the chance to be flushed quickly enough. Remove the endless while loop in function put_rxbuf_data() which is called by the urb completion handler. If there isn't enough room in the tty buffer, discards all the data received in the URB. Cc: David Miller <davem@davemloft.net> Cc: David Laight <David.Laight@ACULAB.COM> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Dan Williams <dcbw@redhat.com> Cc: Jan Dumon <j.dumon@option.com> Signed-off-by: Olivier Sobrie <olivier@sobrie.be> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-14 19:27:34 -07:00
Olivier Sobrie	5c763edfe4	hso: remove unused workqueue The workqueue "retry_unthrottle_workqueue" is not scheduled anywhere in the code. So, remove it. Signed-off-by: Olivier Sobrie <olivier@sobrie.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-14 19:27:34 -07:00
Linus Torvalds	1b81e88189	IEEE 1394 (FireWire) subsystem fix: The 1394 drivers cannot and are not supposed to be built on platforms which don't provide the DMA mapping API. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJTxCGPAAoJEHnzb7JUXXnQ0W8QAKTGboWwSU5kznTcWGveIgEl fckaDnyinK9Vd6lOpixxsSc0rxdaKPmMmoC1ta2AGcD4NgRyfM2aD3Lw6dArzlHr xgcDRPE4K/41atxf6Z/wZnsXFSLrNEJbr6e8oZncGDzcRp4DKzv7tX2meNzyQRFE ns0/sjeN7QJEamd8vUkELiVdBED7NLiQjxXdULQ9AjY/cC3byzWAHzyoII0ff5NY +URYGtTIxwJ2KoNAtqD9OAazwPUV8rt1XXWnqkUHS89TnACsKiWmHpaBq+6jTkqG YsRxY9h0eZ+sSHoeEBUS2WLiaKWufvsBWBTCOUVoO0eS2iYHJQ1SwSLYU1+Zh0IR Fey2N2YaMA1HNthJ0OJJdC+sqSZ/tBmE+ytN7rTVHu0Yw0ZffzSPgJ303jIO0y87 rYQ57K6dXhn5kDqh/PheDJCZ1OwMqz+zvgJDyv9XaEcm9lynDmmhc2IqmzlqONmz 0UbiHITiqMrv6bSzZ83+PVeoROw2DBgH/fEu/Xk2vC1hBA7s4NynGSllEht1tXT3 zLbVW/bBbZfjqc0a1UADBRHVKSllac+BPlxvGHSUxJusqC2VN9S9tgEhboj7+baY 7VHbGkuCaEqtZeErzfNTN9uGvUsBN0KGDatW4Kp0g2CZ33dFDpJFOt7O5QaephnF DhlO4daXNsvHvFVgpBGZ =vY8J -----END PGP SIGNATURE----- Merge tag 'firewire-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Stefan Richter: "The 1394 drivers cannot and are not supposed to be built on platforms which don't provide the DMA mapping API (regression since v3.16-rc1 with CONFIG_COMPILE_TEST=y on some architectures)" * tag 'firewire-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: IEEE 1394 (FireWire) support should depend on HAS_DMA	2014-07-14 17:17:53 -07:00
Linus Torvalds	8ec8ba8e7d	Merge git://git.kvack.org/~bcrl/aio-fixes Pull another aio fix from Ben LaHaise: "put_reqs_available() can now be called from within irq context, which means that it (and its sibling function get_reqs_available()) now need to be irq-safe, not just preempt-safe" * git://git.kvack.org/~bcrl/aio-fixes: aio: protect reqs_available updates from changes in interrupt handlers	2014-07-14 17:11:50 -07:00
Sasha Levin	3cf521f7dc	net/l2tp: don't fall back on UDP [get\|set]sockopt The l2tp [get\|set]sockopt() code has fallen back to the UDP functions for socket option levels != SOL_PPPOL2TP since day one, but that has never actually worked, since the l2tp socket isn't an inet socket. As David Miller points out: "If we wanted this to work, it'd have to look up the tunnel and then use tunnel->sk, but I wonder how useful that would be" Since this can never have worked so nobody could possibly have depended on that functionality, just remove the broken code and return -EINVAL. Reported-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: James Chapman <jchapman@katalix.com> Acked-by: David Miller <davem@davemloft.net> Cc: Phil Turnbull <phil.turnbull@oracle.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Willy Tarreau <w@1wt.eu> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-07-14 17:02:31 -07:00
Christoph Schulz	3916a31927	net: ppp: don't call sk_chk_filter twice Commit `568f194e8b` ("net: ppp: use sk_unattached_filter api") causes sk_chk_filter() to be called twice when setting a PPP pass or active filter. This applies to both the generic PPP subsystem implemented by drivers/net/ppp/ppp_generic.c and the ISDN PPP subsystem implemented by drivers/isdn/i4l/isdn_ppp.c. The first call is from within get_filter(). The second one is through the call chain ppp_ioctl() or isdn_ppp_ioctl() --> sk_unattached_filter_create() --> __sk_prepare_filter() --> sk_chk_filter() The first call from within get_filter() should be deleted as get_filter() is called just before calling sk_unattached_filter_create() later on, which eventually calls sk_chk_filter() anyway. For 3.15.x, this proposed change is a bugfix rather than a pure optimization as in that branch, sk_chk_filter() may replace filter codes by other codes which are not recognized when executing sk_chk_filter() a second time. So with 3.15.x, if sk_chk_filter() is called twice, the second invocation may yield EINVAL (this depends on the filter codes found in the filter to be set, but because the replacement is done for frequently used codes, this is almost always the case). The net effect is that setting pass and/or active PPP filters does not work anymore, since sk_unattached_filter_create() always returns EINVAL due to the second call to sk_chk_filter(), regardless whether the filter was originally sane or not. Signed-off-by: Christoph Schulz <develop@kristov.de> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-14 16:15:12 -07:00
Jason Wang	32b333fe99	mlx4: mark napi id for gro_skb Napi id was not marked for gro_skb, this will lead rx busy loop won't work correctly since they stack never try to call low latency receive method because of a zero socket napi id. Fix this by marking napi id for gro_skb. The transaction rate of 1 byte netperf tcp_rr gets about 50% increased (from 20531.68 to 30610.88). Cc: Amir Vadai <amirv@mellanox.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-14 16:14:16 -07:00
Nikolay Aleksandrov	548d28bd0e	bonding: fix ad_select module param check Obvious copy/paste error when I converted the ad_select to the new option API. "lacp_rate" there should be "ad_select" so we can get the proper value. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: David S. Miller <davem@davemloft.net> Fixes: `9e5f5eebe7` ("bonding: convert ad_select to use the new option API") Reported-by: Karim Scheik <karim.scheik@prisma-solutions.at> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-14 14:36:58 -07:00
Christoph Schulz	a8a3e41c67	net: pppoe: use correct channel MTU when using Multilink PPP The PPP channel MTU is used with Multilink PPP when ppp_mp_explode() (see ppp_generic module) tries to determine how big a fragment might be. According to RFC 1661, the MTU excludes the 2-byte PPP protocol field, see the corresponding comment and code in ppp_mp_explode(): /* * hdrlen includes the 2-byte PPP protocol field, but the * MTU counts only the payload excluding the protocol field. * (RFC1661 Section 2) / mtu = pch->chan->mtu - (hdrlen - 2); However, the pppoe module does* include the PPP protocol field in the channel MTU, which is wrong as it causes the PPP payload to be 1-2 bytes too big under certain circumstances (one byte if PPP protocol compression is used, two otherwise), causing the generated Ethernet packets to be dropped. So the pppoe module has to subtract two bytes from the channel MTU. This error only manifests itself when using Multilink PPP, as otherwise the channel MTU is not used anywhere. In the following, I will describe how to reproduce this bug. We configure two pppd instances for multilink PPP over two PPPoE links, say eth2 and eth3, with a MTU of 1492 bytes for each link and a MRRU of 2976 bytes. (This MRRU is computed by adding the two link MTUs and subtracting the MP header twice, which is 4 bytes long.) The necessary pppd statements on both sides are "multilink mtu 1492 mru 1492 mrru 2976". On the client side, we additionally need "plugin rp-pppoe.so eth2" and "plugin rp-pppoe.so eth3", respectively; on the server side, we additionally need to start two pppoe-server instances to be able to establish two PPPoE sessions, one over eth2 and one over eth3. We set the MTU of the PPP network interface to the MRRU (2976) on both sides of the connection in order to make use of the higher bandwidth. (If we didn't do that, IP fragmentation would kick in, which we want to avoid.) Now we send a ICMPv4 echo request with a payload of 2948 bytes from client to server over the PPP link. This results in the following network packet: 2948 (echo payload) + 8 (ICMPv4 header) + 20 (IPv4 header) --------------------- 2976 (PPP payload) These 2976 bytes do not exceed the MTU of the PPP network interface, so the IP packet is not fragmented. Now the multilink PPP code in ppp_mp_explode() prepends one protocol byte (0x21 for IPv4), making the packet one byte bigger than the negotiated MRRU. So this packet would have to be divided in three fragments. But this does not happen as each link MTU is assumed to be two bytes larger. So this packet is diveded into two fragments only, one of size 1489 and one of size 1488. Now we have for that bigger fragment: 1489 (PPP payload) + 4 (MP header) + 2 (PPP protocol field for the MP payload (0x3d)) + 6 (PPPoE header) -------------------------- 1501 (Ethernet payload) This packet exceeds the link MTU and is discarded. If one configures the link MTU on the client side to 1501, one can see the discarded Ethernet frames with tcpdump running on the client. A ping -s 2948 -c 1 192.168.15.254 leads to the smaller fragment that is correctly received on the server side: (tcpdump -vvvne -i eth3 pppoes and ppp proto 0x3d) 52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864), length 1514: PPPoE [ses 0x3] MLPPP (0x003d), length 1494: seq 0x000, Flags [end], length 1492 and to the bigger fragment that is not received on the server side: (tcpdump -vvvne -i eth2 pppoes and ppp proto 0x3d) 52:54:00:70:9e:89 > 52:54:00:5d:6f:b0, ethertype PPPoE S (0x8864), length 1515: PPPoE [ses 0x5] MLPPP (0x003d), length 1495: seq 0x000, Flags [begin], length 1493 With the patch below, we correctly obtain three fragments: 52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864), length 1514: PPPoE [ses 0x1] MLPPP (0x003d), length 1494: seq 0x000, Flags [begin], length 1492 52:54:00:70:9e:89 > 52:54:00:5d:6f:b0, ethertype PPPoE S (0x8864), length 1514: PPPoE [ses 0x1] MLPPP (0x003d), length 1494: seq 0x000, Flags [none], length 1492 52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864), length 27: PPPoE [ses 0x1] MLPPP (0x003d), length 7: seq 0x000, Flags [end], length 5 And the ICMPv4 echo request is successfully received at the server side: IP (tos 0x0, ttl 64, id 21925, offset 0, flags [DF], proto ICMP (1), length 2976) 192.168.222.2 > 192.168.15.254: ICMP echo request, id 30530, seq 0, length 2956 The bug was introduced in commit `c9aa689537` ("[PPPOE]: Advertise PPPoE MTU") from the very beginning. This patch applies to 3.10 upwards but the fix can be applied (with minor modifications) to kernels as old as 2.6.32. Signed-off-by: Christoph Schulz <develop@kristov.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-07-14 14:35:46 -07:00

1 2 3 4 5 ...

456284 Commits All Branches Search

456284 Commits

All Branches