linux

Author	SHA1	Message	Date
Dan Carpenter	1397490938	IB/mlx4: Handle -ENOMEM in forward_trap() ib_create_send_mad() can return ERR_PTR(-ENOMEM) here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-10 17:42:06 -08:00
Vladimir Sokolovsky	3afa9f19e5	IB/mlx4: Don't call dma_free_coherent() with irqs disabled mlx4_ib_free_cq_buf() should not be called under spin_lock_irq() since it calls dma_free_coherent(), which needs irqs enabled. Fix this by deferring the free to outside the locked region. This was found due to the WARN_ON(irqs_disabled()); in swiotlb_free_coherent(). Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-10 17:42:06 -08:00
Or Gerlitz	8ae31e5b1f	IPoIB: Add GRO support Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-10 17:41:55 -08:00
Or Gerlitz	19e364f680	IPoIB: Remove LRO support As a first step in moving from LRO to GRO, revert commit `af40da894e` ("IPoIB: add LRO support"). Also eliminate the ethtool set_flags callback which isn't needed anymore. Finally, we need to include <linux/sched.h> directly to get the declaration of restart_syscall() (which used to be included implicitly through <linux/inet_lro.h>). Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-10 17:41:54 -08:00
Joe Perches	1eba27e87a	IB/ipath: Use printf extension %pR for struct resource Using %pR standardizes the struct resource output. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-10 17:41:50 -08:00
Steve Wise	db8b101671	RDMA/cxgb4: Don't re-init wait object in init/fini paths Re-initializing the wait object in rdma_init()/rdma_fini() causes a timing window which can lead to a deadlock during close. Once this deadlock hits, all RDMA activity over the T4 device will be stuck. There's no need to re-init the wait object, so remove it. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-10 17:41:43 -08:00
Stephen Hemminger	c943109163	RDMA/cxgb3,cxgb4: Remove dead code This removes unused code found by running 'make namespacecheck'; compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-10 17:41:43 -08:00
David Dillow	9af762719e	IB/srp: consolidate hot-path variables into cache lines Put the variables accessed together in the hot-path into common cachelines, and separate them by RW vs RO to avoid false dirtying. We keep a local copy of the lkey and rkey in the target to avoid traversing pointers (and associated cache lines) to find them. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Dillow <dillowda@ornl.gov>	2011-01-10 15:44:51 -05:00
Bart Van Assche	e968467822	IB/srp: stop sharing the host lock with SCSI We don't need protection against the SCSI stack, so use our own lock to allow parallel progress on separate CPUs. Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out and small cleanups by David Dillow ] Signed-off-by: David Dillow <dillowda@ornl.gov>	2011-01-10 15:44:50 -05:00
Bart Van Assche	94a9174c63	IB/srp: reduce lock coverage of command completion We only need the lock to cover list and credit manipulations, so push those into srp_remove_req() and update the call chains. We reorder the request removal and command completion in srp_process_rsp() to avoid the SCSI mid-layer sending another command before we've released our request and added any credits returned by the target. This prevents us from returning HOST_BUSY unneccesarily. Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out, small cleanups, and modified to avoid potential extraneous HOST_BUSY returns by David Dillow ] Signed-off-by: David Dillow <dillowda@ornl.gov>	2011-01-10 15:44:50 -05:00
Bart Van Assche	76c75b258f	IB/srp: reduce local coverage for command submission and EH We only need locks to protect our lists and number of credits available. By pre-consuming the credit for the request, we can reduce our lock coverage to just those areas. If we don't actually send the request, we'll need to put the credit back into the pool. Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out and small cleanups by David Dillow ] Signed-off-by: David Dillow <dillowda@ornl.gov>	2011-01-10 15:44:49 -05:00
Bart Van Assche	536ae14e75	IB/srp: don't move active requests to their own list We use req->scmnd != NULL to indicate an active request, so there's no need to keep a separate list for them. We can afford the array iteration during error handling, and dropping it gives us one less item that needs lock protection. Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out and small cleanups by David Dillow ] Signed-off-by: David Dillow <dillowda@ornl.gov>	2011-01-10 15:44:42 -05:00
Linus Torvalds	b4a45f5fe8	Merge branch 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin * 'vfs-scale-working' of git://git.kernel.org/pub/scm/linux/kernel/git/npiggin/linux-npiggin: (57 commits) fs: scale mntget/mntput fs: rename vfsmount counter helpers fs: implement faster dentry memcmp fs: prefetch inode data in dcache lookup fs: improve scalability of pseudo filesystems fs: dcache per-inode inode alias locking fs: dcache per-bucket dcache hash locking bit_spinlock: add required includes kernel: add bl_list xfs: provide simple rcu-walk ACL implementation btrfs: provide simple rcu-walk ACL implementation ext2,3,4: provide simple rcu-walk ACL implementation fs: provide simple rcu-walk generic_check_acl implementation fs: provide rcu-walk aware permission i_ops fs: rcu-walk aware d_revalidate method fs: cache optimise dentry and inode for rcu-walk fs: dcache reduce branches in lookup path fs: dcache remove d_mounted fs: fs_struct use seqlock fs: rcu-walk for path lookup ...	2011-01-07 08:56:33 -08:00
Nick Piggin	dc0474be3e	fs: dcache rationalise dget variants dget_locked was a shortcut to avoid the lazy lru manipulation when we already held dcache_lock (lru manipulation was relatively cheap at that point). However, how that the lru lock is an innermost one, we never hold it at any caller, so the lock cost can now be avoided. We already have well working lazy dcache LRU, so it should be fine to defer LRU manipulations to scan time. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:24 +11:00
Nick Piggin	b5c84bf6f6	fs: dcache remove dcache_lock dcache_lock no longer protects anything. remove it. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:23 +11:00
Nick Piggin	b7ab39f631	fs: dcache scale dentry refcount Make d_count non-atomic and protect it with d_lock. This allows us to ensure a 0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when we start protecting many other dentry members with d_lock. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:21 +11:00
Bart Van Assche	dcb4cb85f4	IB/srp: allow lockless work posting Only one CPU at a time will own an RX IU, so using the address of the IU as the work request cookie allows us to avoid taking a lock. We can similarly prepare the TX path for lockless posting by moving the free TX IUs to a list. This also removes the requirement that the queue sizes be a power of 2. Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out, small cleanups, and modified to avoid needing an extra field in the IU by David Dillow] Signed-off-by: David Dillow <dillowda@ornl.gov>	2011-01-05 15:24:25 -05:00
Bart Van Assche	9709f0e05b	IB/srp: consolidate state change code Signed-off-by: Bart Van Assche <bvanassche@acm.org> [ broken out and small cleanups by David Dillow ] Signed-off-by: David Dillow <dillowda@ornl.gov>	2011-01-05 15:24:25 -05:00
David Dillow	f8b6e31e4e	IB/srp: allow task management without a previous request We can only have one task management comment outstanding, so move the completion and status to the target port. This allows us to handle resets of a LUN without a corresponding request having been sent. Meanwhile, we don't need to play games with host_scribble, just use it as the pointer it is. This fixes a crash when we issue a bus reset using sg_reset. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=13893 Reported-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Dillow <dillowda@ornl.gov>	2011-01-05 15:24:25 -05:00
David S. Miller	17f7f4d9fc	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/fib_frontend.c	2010-12-26 22:37:05 -08:00
Jiri Kosina	4b7bd36470	Merge branch 'master' into for-next Conflicts: MAINTAINERS arch/arm/mach-omap2/pm24xx.c drivers/scsi/bfa/bfa_fcpim.c Needed to update to apply fixes for which the old branch was too outdated.	2010-12-22 18:57:02 +01:00
Dan Carpenter	7182afea8d	IB/uverbs: Handle large number of entries in poll CQ In ib_uverbs_poll_cq() code there is a potential integer overflow if userspace passes in a large cmd.ne. The calls to kmalloc() would allocate smaller buffers than intended, leading to memory corruption. There iss also an information leak if resp wasn't all used. Unprivileged userspace may call this function, although only if an RDMA device that uses this function is present. Fix this by copying CQ entries one at a time, which avoids the allocation entirely, and also by moving this copying into a function that makes sure to initialize all memory copied to userspace. Special thanks to Jason Gunthorpe <jgunthorpe@obsidianresearch.com> for his help and advice. Cc: <stable@kernel.org> Signed-off-by: Dan Carpenter <error27@gmail.com> [ Monkey around with things a bit to avoid bad code generation by gcc when designated initializers are used. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-12-08 15:23:49 -08:00
Linus Torvalds	75318ec327	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB: Fix information leak in marshalling code IB/pack: Remove some unused code added by the IBoE patches IB/mlx4: Fix IBoE link state IB/mlx4: Fix IBoE reported link rate mlx4_core: Workaround firmware bug in query dev cap IB/mlx4: Fix memory ordering of VLAN insertion control bits MAINTAINERS: Update NetEffect entry	2010-12-02 12:10:56 -08:00
Roland Dreier	7adce751ce	Merge branches 'misc', 'mlx4' and 'nes' into for-next	2010-12-01 16:33:47 -08:00
Vasiliy Kulikov	91a4d157d0	IB: Fix information leak in marshalling code ib_ucm_init_qp_attr() and ucma_init_qp_attr() pass struct ib_uverbs_qp_attr with reserved, qp_state, {ah_attr,alt_ah_attr}{reserved,->grh.reserved} fields uninitialized to copy_to_user(). This leads to leaking of contents of kernel stack memory to userspace. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-12-01 16:33:18 -08:00
Or Gerlitz	f55864a4f4	IB/pack: Remove some unused code added by the IBoE patches Remove unused functions added by commit `ff7f5aab35` ("IB/pack: IBoE UD packet packing support"). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>	2010-12-01 16:30:18 -08:00
Eli Cohen	21d606090e	IB/mlx4: Fix IBoE link state Use netif_running() and netif_carrier_ok() to report link state, exactly as is done to report Ethernet link state in sysfs. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-12-01 16:11:29 -08:00
Eli Cohen	328266c561	IB/mlx4: Fix IBoE reported link rate The link rate is the product of the link speed in the link width. For Etherent ports the rate is 10G, so we use 1 for the width and 4 for speed to get the correct rate. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-12-01 16:10:35 -08:00
Eli Cohen	e27535b9c6	IB/mlx4: Fix memory ordering of VLAN insertion control bits We must fully update the control segment before marking it as valid, so that hardware doesn't start executing it before we're ready. Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Move VLAN control bit setting to before wmb(). - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-12-01 11:08:54 -08:00
Eric Dumazet	22f4fbd9bd	infiniband: remove dev_base_lock use dev_base_lock is the legacy way to lock the device list, and is planned to disappear. (writers hold RTNL, readers hold RCU lock) Convert rdma_translate_ip() and update_ipv6_gids() to RCU locking. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-24 11:41:56 -08:00
Arnd Bergmann	451a3c24b0	BKL: remove extraneous #include <smp_lock.h> The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-17 08:59:32 -08:00
Jeff Garzik	f281233d3e	SCSI host lock push-down Move the mid-layer's ->queuecommand() invocation from being locked with the host lock to being unlocked to facilitate speeding up the critical path for drivers who don't need this lock taken anyway. The patch below presents a simple SCSI host lock push-down as an equivalent transformation. No locking or other behavior should change with this patch. All existing bugs and locking orders are preserved. Additionally, add one parameter to queuecommand, struct Scsi_Host * and remove one parameter from queuecommand, void (done)(struct scsi_cmnd ) Scsi_Host* is a convenient pointer that most host drivers need anyway, and 'done' is redundant to struct scsi_cmnd->scsi_done. Minimal code disturbance was attempted with this change. Most drivers needed only two one-line modifications for their host lock push-down. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Acked-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-16 13:33:23 -08:00
Jesper Juhl	e987fa357a	infiniband: Only include mutex.h once in drivers/infiniband/hw/cxgb4/iw_cxgb4.h Only include the header linux/mutex.h once inside drivers/infiniband/hw/cxgb4/iw_cxgb4.h Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-11-15 14:35:19 +01:00
Eric Dumazet	72cdd1d971	net: get rid of rtable->idev It seems idev field in struct rtable has no special purpose, but adding extra atomic ops. We hold refcounts on the device itself (using percpu data, so pretty cheap in current kernel). infiniband case is solved using dst.dev instead of idev->dev Removal of this field means routing without route cache is now using shared data, percpu data, and only potential contention is a pair of atomic ops on struct neighbour per forwarded packet. About 5% speedup on routing test. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-11 10:29:40 -08:00
Uwe Kleine-König	b595076a18	tree-wide: fix comment/printk typos "gadget", "through", "command", "maintain", "maintain", "controller", "address", "between", "initiali[zs]e", "instead", "function", "select", "already", "equal", "access", "management", "hierarchy", "registration", "interest", "relative", "memory", "offset", "already", Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-11-01 15:38:34 -04:00
Al Viro	fc14f2fef6	convert get_sb_single() users Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-29 04:16:28 -04:00
Linus Torvalds	426e1f5cec	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) split invalidate_inodes() fs: skip I_FREEING inodes in writeback_sb_inodes fs: fold invalidate_list into invalidate_inodes fs: do not drop inode_lock in dispose_list fs: inode split IO and LRU lists fs: switch bdev inode bdi's correctly fs: fix buffer invalidation in invalidate_list fsnotify: use dget_parent smbfs: use dget_parent exportfs: use dget_parent fs: use RCU read side protection in d_validate fs: clean up dentry lru modification fs: split __shrink_dcache_sb fs: improve DCACHE_REFERENCED usage fs: use percpu counter for nr_dentry and nr_dentry_unused fs: simplify __d_free fs: take dcache_lock inside __d_path fs: do not assign default i_ino in new_inode fs: introduce a per-cpu last_ino allocator new helper: ihold() ...	2010-10-26 17:58:44 -07:00
Linus Torvalds	9e5fca251f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (63 commits) IB/qib: clean up properly if pci_set_consistent_dma_mask() fails IB/qib: Allow driver to load if PCIe AER fails IB/qib: Fix uninitialized pointer if CONFIG_PCI_MSI not set IB/qib: Fix extra log level in qib_early_err() RDMA/cxgb4: Remove unnecessary KERN_<level> use RDMA/cxgb3: Remove unnecessary KERN_<level> use IB/core: Add link layer type information to sysfs IB/mlx4: Add VLAN support for IBoE IB/core: Add VLAN support for IBoE IB/mlx4: Add support for IBoE mlx4_en: Change multicast promiscuous mode to support IBoE mlx4_core: Update data structures and constants for IBoE mlx4_core: Allow protocol drivers to find corresponding interfaces IB/uverbs: Return link layer type to userspace for query port operation IB/srp: Sync buffer before posting send IB/srp: Use list_first_entry() IB/srp: Reduce number of BUSY conditions IB/srp: Eliminate two forward declarations IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144 IB: Replace EXTRA_CFLAGS with ccflags-y ...	2010-10-26 17:54:22 -07:00
Hagen Paul Pfeifer	732eacc054	replace nested max/min macros with {max,min}3 macro Use the new {max,min}3 macros to save some cycles and bytes on the stack. This patch substitutes trivial nested macros with their counterpart. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Joe Perches <joe@perches.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-26 16:52:12 -07:00
Roland Dreier	116e9535fe	Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iboe', 'ipoib', 'misc', 'mlx4', 'nes', 'qib' and 'srp' into for-next	2010-10-26 16:09:11 -07:00
Jason Gunthorpe	2ca78d23a7	IB/qib: clean up properly if pci_set_consistent_dma_mask() fails Clean up properly if pci_set_consistent_dma_mask() fails. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-26 16:09:02 -07:00
Ralph Campbell	5d26a1df23	IB/qib: Allow driver to load if PCIe AER fails Some PCIe root complex chip sets don't support advanced error reporting. Allow the driver to load OK if pci_enable_pcie_error_reporting() fails. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-26 16:09:02 -07:00
Ralph Campbell	9e43e0106d	IB/qib: Fix uninitialized pointer if CONFIG_PCI_MSI not set If CONFIG_PCI_MSI is not set, and a QLE7140 is present, the pointer "dd" is uninitialized. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-26 16:09:02 -07:00
Jason Gunthorpe	82fdb0ab54	IB/qib: Fix extra log level in qib_early_err() Noticed this odd looking thing in dmesg: ib_qib 0000:02:00.0: <3>ib_qib: Unable to enable pcie error reporting: -5 which is due to a bad use of dev_info. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-26 16:09:02 -07:00
Joe Perches	aa1ad26089	RDMA/cxgb4: Remove unnecessary KERN_<level> use Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-26 13:45:59 -07:00
Joe Perches	ca7cf94f8b	RDMA/cxgb3: Remove unnecessary KERN_<level> use Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-26 13:45:49 -07:00
Christoph Hellwig	85fe4025c6	fs: do not assign default i_ino in new_inode Instead of always assigning an increasing inode number in new_inode move the call to assign it into those callers that actually need it. For now callers that need it is estimated conservatively, that is the call is added to all filesystems that do not assign an i_ino by themselves. For a few more filesystems we can avoid assigning any inode number given that they aren't user visible, and for others it could be done lazily when an inode number is actually needed, but that's left for later patches. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-25 21:26:11 -04:00
Eli Cohen	8ad330a002	IB/core: Add link layer type information to sysfs Since an IB transport port may use either IB or Ethernet as its link layer, add the file /sys/class/infiniband/<device>/ports/<port_num>/link_layer to show the link layer for the port. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-25 10:20:39 -07:00
Eli Cohen	4c3eb3ca13	IB/mlx4: Add VLAN support for IBoE This patch allows IBoE traffic to be encapsulated in 802.1Q tagged VLAN frames. The VLAN tag is encoded in the GID and derived from it by a simple computation. The netdev notifier callback is modified to catch VLAN device addition/removal and the port's GID table is updated to reflect the change, so that for each netdevice there is an entry in the GID table. When the port's GID table is exhausted, GID entries will not be added. Only children of the main interfaces can add to the GID table; if a VLAN interface is added on another VLAN interface (e.g. "vconfig add eth2.6 8"), then that interfaces will not add an entry to the GID table. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-25 10:20:39 -07:00
Eli Cohen	af7bd46376	IB/core: Add VLAN support for IBoE Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the GID derived from a link local address in the following way: GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN. The 3 bits user priority field of the packets are identical to the 3 bits of the SL. In case of rdma_cm apps, the TOS field is used to generate the SL field by doing a shift right of 5 bits effectively taking to 3 MS bits of the TOS field. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-25 10:20:39 -07:00
Eli Cohen	fa417f7b52	IB/mlx4: Add support for IBoE Add support for IBoE to mlx4_ib. The bulk of the code is handling the new address vector fields; mlx4 needs the MAC address of a remote node to include it in a WQE (for datagrams) or in the QP context (for connected QPs). Address resolution is done by assuming all unicast GIDs are either link-local IPv6 addresses. Multicast group attach/detach needs to update the NIC's multicast filters; but since attaching a QP to a multicast group can be done before the QP is bound to a port, for IBoE we need to keep track of all multicast groups that a QP is attached too before it transitions from INIT to RTR (since it does not have a port in the INIT state). Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Many things cleaned up and otherwise monkeyed with; hope I didn't introduce too many bugs. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-25 10:20:39 -07:00
Eli Cohen	2420b60b1d	IB/uverbs: Return link layer type to userspace for query port operation Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-25 10:20:39 -07:00
David Dillow	19081f31ce	IB/srp: Sync buffer before posting send srp_send_tsk_mgmt() was missing the proper DMA sync calls before posting the buffer to the device. Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-24 22:14:23 -07:00
Bart Van Assche	21c1a90769	IB/srp: Use list_first_entry() Use the list_first_entry() macro in ib_srp instead of open-coding the equivalent, which makes the source code slightly more descriptive. The list_first_entry() macro itself was introduced in kernel 2.6.22. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-24 22:14:19 -07:00
Bart Van Assche	7ade400aba	IB/srp: Reduce number of BUSY conditions As proposed by the SRP (draft) standard, ib_srp reserves one ring element for SRP_TSK_MGMT requests. This patch makes sure that the SCSI mid-layer never tries to queue more than (SRP request limit) - 1 SCSI commands to ib_srp. This improves performance for targets whose request limit is less than or equal to SRP_NORMAL_REQ_SQ_SIZE by reducing the number of BUSY responses reported by ib_srp to the SCSI mid-layer. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-24 22:14:14 -07:00
David Dillow	05a1d7504f	IB/srp: Eliminate two forward declarations Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-24 22:14:08 -07:00
Linus Torvalds	229aebb873	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Update broken web addresses in arch directory. Update broken web addresses in the kernel. Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget Revert "Fix typo: configuation => configuration" partially ida: document IDA_BITMAP_LONGS calculation ext2: fix a typo on comment in ext2/inode.c drivers/scsi: Remove unnecessary casts of private_data drivers/s390: Remove unnecessary casts of private_data net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data drivers/infiniband: Remove unnecessary casts of private_data drivers/gpu/drm: Remove unnecessary casts of private_data kernel/pm_qos_params.c: Remove unnecessary casts of private_data fs/ecryptfs: Remove unnecessary casts of private_data fs/seq_file.c: Remove unnecessary casts of private_data arm: uengine.c: remove C99 comments arm: scoop.c: remove C99 comments Fix typo configue => configure in comments Fix typo: configuation => configuration Fix typo interrest[ing\|ed] => interest[ing\|ed] Fix various typos of valid in comments ... Fix up trivial conflicts in: drivers/char/ipmi/ipmi_si_intf.c drivers/usb/gadget/rndis.c net/irda/irnet/irnet_ppp.c	2010-10-24 13:41:39 -07:00
Jack Morgenstein	d0d68b8693	IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144 The Node Description cannot be changed via MADs (it is read-only). Until now, it was changed in the driver via sysfs, and the new Node Description was simply inserted by the driver into MAD responses (replacing the description returned by FW). System startup scripts use the sysfs interface to change the node description at driver startup to show the hostname, etc. However, this has a race condition: the SM could discover the original FW node description rather than the system-specific description if it queried the port before the startup scripts finish running. For mlx4, we fix this with a new FW command (SET_NODE) that allows passing the new node description to FW. When this command is invoked, FW sends a trap 144 to the SM. When it gets this trap, the SM can query the node to obtain the new node description -- thus eliminating the effects of the race. This patch simply calls SET_NODE command when a new node description is entered via sysfs (thus causing trap 144 to be issued by the FW). We ignore all failures of the SET_NODE command (including those caused by using a device FW that predates the SET_NODE command), since in that case things work just as before. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-23 13:53:09 -07:00
matt mooney	7454159d3c	IB: Replace EXTRA_CFLAGS with ccflags-y Signed-off-by: matt mooney <mfm@muteddisk.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-23 13:45:03 -07:00
Steve Wise	97cb7e40c6	RDMA/ucma: Allow tuning the max listen backlog For iWARP connections, the connect request is carried in a TCP payload on an already established TCP connection. So if the ucma's backlog is full, the connection request is transmitted and acked at the TCP level by the time the connect request gets dropped in the ucma. The end result is the connection gets rejected by the iWARP provider. Further, a 32 node 256NP OpenMPI job will generate > 128 connect requests on some ranks. This patch increases the default max backlog to 1024, and adds a sysctl variable so the backlog can be adjusted at run time. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-23 13:41:40 -07:00
Eli Cohen	c3aa9b186b	IPoIB: Set dev_id field of net_device Use the net device's dev_id field to encode the port number of the pci device. This can be used to to associate a net device with the pci device's port. The encoding is: dev_id = port - 1. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-23 13:35:48 -07:00
Linus Torvalds	5f05647dd8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1699 commits) bnx2/bnx2x: Unsupported Ethtool operations should return -EINVAL. vlan: Calling vlan_hwaccel_do_receive() is always valid. tproxy: use the interface primary IP address as a default value for --on-ip tproxy: added IPv6 support to the socket match cxgb3: function namespace cleanup tproxy: added IPv6 support to the TPROXY target tproxy: added IPv6 socket lookup function to nf_tproxy_core be2net: Changes to use only priority codes allowed by f/w tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled tproxy: added tproxy sockopt interface in the IPV6 layer tproxy: added udp6_lib_lookup function tproxy: added const specifiers to udp lookup functions tproxy: split off ipv6 defragmentation to a separate module l2tp: small cleanup nf_nat: restrict ICMP translation for embedded header can: mcp251x: fix generation of error frames can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set can-raw: add msg_flags to distinguish local traffic 9p: client code cleanup rds: make local functions/variables static ... Fix up conflicts in net/core/dev.c, drivers/net/pcmcia/smc91c92_cs.c and drivers/net/wireless/ath/ath9k/debug.c as per David	2010-10-23 11:47:02 -07:00
David Dillow	bb12588a38	IB/srp: Implement SRP_CRED_REQ and SRP_AER_REQ This patch adds support for SRP_CRED_REQ to avoid a lockup by targets that use that mechanism to return credits to the initiator. This prevents a lockup observed in the field where we would never add the credits from the SRP_CRED_REQ to our current count, and would therefore never send another command to the target. Minimal support for SRP_AER_REQ is also added, as these messages can also be used to convey additional credits to the initiator. Based upon extensive debugging and code by Bart Van Assche and a bug report by Chris Worley. Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-22 22:19:10 -07:00
Bart Van Assche	dd5e6e38b2	IB/srp: Preparation for transmit ring response allocation The transmit ring in ib_srp (srp_target.tx_ring) is currently only used for allocating requests sent by the initiator to the target. This patch prepares using that ring for allocation of both requests and responses. Also, this patch differentiates the uses of SRP_SQ_SIZE, increases the size of the IB send completion queue by one element and reserves one transmit ring slot for SRP_TSK_MGMT requests. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-22 22:19:10 -07:00
Jason Gunthorpe	5715f5d44b	IB/qib: Process RDMA WRITE ONLY with IMMEDIATE properly See table 35 in IBA - the header order for RDMA_WRITE_ONLY_WITH_IMMEDIATE and SEND_LAST_WITH_IMMEDIATE is different: the RDMA_WRITE_ONLY has a RETH header before the immediate data, so we need a different code path to extract the immediate data. I tested this with a userspace app that does RDMA_WRITE with immediate on a QLE7140. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-22 22:12:15 -07:00
Steve Wise	b955150ea7	RDMA/cxgb3: When a user QP is marked in error, also mark the CQs in error The flushing of work requests for user QPs is implemented entirely in the user mode library. The only kernel interaction is to mark the user QP object indicating it is in error when the QP exits RTS. When the user QP operations are called by the application (eg: post_send, post_recv), the QP in error bit is checked and if set, the library flushes the QP. If, however, the application is not doing IO, but rather just polling the CQ, it will never get flushed work requests. This breaks some classes of applications. This patch adds logic to mark user CQs in error when a QP that is bound to the CQ is marked in error. The library poll code can then notice the CQ is in error and flush all the in error QPs bound to that CQ. Design: - add 1 extra CQE entry to the CQ memory that will be used to indicate in error status. - return the desired CQ memory size that should be mapped by the library - bump the ABI since the create_cq uverbs response changes. - detect older libraries and reduce the mmap size accordingly. (The ABI bump doesn't break old libraries, since they didn't check the ABI field anyway) Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-22 22:00:53 -07:00
Steve Wise	da411ba1da	RDMA/cxgb4: Use cxgb4 service for packet gl to skb Remove the local service t4_pktgl_to_skb() and use cxgb4_pktgl_to_skb() exported by cxgb4. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-22 21:58:50 -07:00
Steve Wise	de5dd81b49	RDMA/cxgb4: Export T4 TCP MIB Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-22 21:57:26 -07:00
Linus Torvalds	092e0e7e52	Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek	2010-10-22 10:52:56 -07:00
Justin P. Mattock	631dd1a885	Update broken web addresses in the kernel. The patch below updates broken web addresses in the kernel Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Finn Thain <fthain@telegraphics.com.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com> Cc: Mike Frysinger <vapier.adi@gmail.com> Acked-by: Ben Pfaff <blp@cs.stanford.edu> Acked-by: Hans J. Koch <hjk@linutronix.de> Reviewed-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-10-18 11:03:14 +02:00
Randy Dunlap	e00ce92e0b	infiniband: fix mlx4 kconfig dependency warning Fix kconfig dependency warning to satisfy dependencies: warning: (MLX4_EN && NETDEVICES && NETDEV_10000 && PCI && INET \|\| MLX4_INFINIBAND && INFINIBAND) selects MLX4_CORE which has unmet direct dependencies (NETDEVICES && NETDEV_10000 && PCI) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-16 11:13:25 -07:00
Arnd Bergmann	6038f373a3	llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode i, struct file f) { <+... ( nonseekable_open(...) \| nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable / }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, / open uses nonseekable / }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, / we have seq_read / }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, / readdir is present / }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, / read accesses f_pos / }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, / write accesses f_pos / }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, / read and write both use no f_pos / }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, / write uses no f_pos / }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, / read uses no f_pos / }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, / no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-15 15:53:27 +02:00
Eli Cohen	ff7f5aab35	IB/pack: IBoE UD packet packing support Add support for packing IBoE packet headers. Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Clean up and fix ib_ud_header_init() a bit. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-14 12:41:29 -07:00
Eli Cohen	3c86aa70bf	RDMA/cm: Add RDMA CM support for IBoE devices Add support for IBoE device binding and IP --> GID resolution. Path resolving and multicast joining are implemented within cma.c by filling in the responses and running callbacks in the CMA work queue. IP --> GID resolution always yields IPv6 link local addresses; remote GIDs are derived from the destination MAC address of the remote port. Multicast GIDs are always mapped to multicast MACs as is done in IPv6. (IPv4 multicast is enabled by translating IPv4 multicast addresses to IPv6 multicast as described in <http://www.mail-archive.com/ipng@sunroof.eng.sun.com/msg02134.html>.) Some helper functions are added to ib_addr.h. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-13 15:46:43 -07:00
Eli Cohen	fac70d5191	IB/mad: IBoE supports only QP1 (no QP0) Since IBoE is using Ethernet as its link layer, there is no central management entity so there is need for QP0. QP1 is still needed since it handles communications between CM agents. This patch will skip QP0 and create only QP1 for IBoE ports. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-13 09:38:11 -07:00
Eli Cohen	7b4c876961	IPoIB: Skip IBoE ports IPoIB is IP-over-Infiniband link layer. In the case of IBoE, the link layer is Ethernet and IP can work directly over Ethernet, so disable IPoIB for non-IB_LINK_LAYER_INFINIBAND ports. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-13 09:38:11 -07:00
Eric Dumazet	29b4433d99	net: percpu net_device refcount We tried very hard to remove all possible dev_hold()/dev_put() pairs in network stack, using RCU conversions. There is still an unavoidable device refcount change for every dst we create/destroy, and this can slow down some workloads (routers or some app servers, mmap af_packet) We can switch to a percpu refcount implementation, now dynamic per_cpu infrastructure is mature. On a 64 cpus machine, this consumes 256 bytes per device. On x86, dev_hold(dev) code : before lock incl 0x280(%ebx) after: movl 0x260(%ebx),%eax incl fs:(%eax) Stress bench : (Sending 160.000.000 UDP frames, IP route cache disabled, dual E5540 @2.53GHz, 32bit kernel, FIB_TRIE) Before: real 1m1.662s user 0m14.373s sys 12m55.960s After: real 0m51.179s user 0m15.329s sys 10m15.942s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-10-12 12:35:25 -07:00
Animesh K Trivedi	26012f0750	RDMA/iwcm: Fix hang in uninterruptible wait on cm_id destroy A process can get stuck in an uninterruptible wait in the kernel while destroying a cm_id when iw_cm_connect() fails: For example, When creation of a PD fails but the user continues with an attempt to connect to the server without checking the return value, in iw_cm_connect() a NULL qp is found so the call fails. However the IWCM_F_CONNECT_WAIT bit is not cleared. destroy_cm_id() then waits forever for IWCM_F_CONNECT_WAIT to be cleared. The same problem exists on the passive side with the accept call. Fix this by clearing the bit and waking up any waiters in the appropriate spots. Signed-off-by: Animesh Trivedi <atr@zurich.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-11 20:24:04 -07:00
Steve Wise	3160977a6e	RDMA/cxgb4: Use simple_read_from_buffer() for debugfs handlers We can replace our equivalent open-coded version. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-11 20:15:14 -07:00
Steve Wise	8bbac892fb	RDMA/cxgb4: Add default_llseek to debugfs files Incorporate BKL removal changes. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-11 20:14:00 -07:00
Eli Cohen	5a0fd09428	IB/mlx4: Limit size of fast registration WRs Fix the limit on the size of max fast registration WRs that can be posted to match hardware capabilities. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-11 14:33:17 -07:00
Joe Perches	0f2f930a67	IB/qib: Remove unnecessary casts of private_data Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-06 14:43:51 -07:00
Maciej Sosnowski	52106bd24c	RDMA/nes: Turn carrier off on ifdown This lets the bonding driver to detect when an interface goes down. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-06 14:42:32 -07:00
Chien Tung	2932772156	RDMA/nes: Report correct port state if interface is down With commit `cd6860eb` ("RDMA/nes: Fix hangs on ifdown") we no longer remove nes interfaces on ifdown. On nes_query_port(), add an additional check of the netdev queue and report IB_PORT_DOWN if the queue is not running. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-06 12:59:56 -07:00
Sonny Rao	625fbd3a36	IB/ehca: Fix driver on relocatable kernel the eHCA driver registers a MR for all of kernel memory, but makes the assumption that valid memory exists at KERNELBASE. This assumption may not be true in the case of a relocatable kernel, so use KERNELBASE + PHYSICAL_START to get the true beginning of usable kernel memory. cc: Joachim Fenkes <fenkes@de.ibm.com> cc: Christoph Raisch <raisch@de.ibm.com> cc: Hoan-Ham Hguyen <hnguyen@de.ibm.com> Signed-off-by: Sonny Rao <sonnyrao@us.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-06 12:57:07 -07:00
Thomas Gleixner	557d0540b9	IB/umad: Make user_mad semaphore a real one Get rid of init_MUTEX[_LOCKED]() and use sema_init() instead. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 20:52:21 -07:00
Joe Perches	fc4ec9bd82	RDMA/amso1100: Remove KERN_<level> from pr_<level> use Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 20:51:20 -07:00
Dan Carpenter	4d8d6389b2	RDMA/nes: Remove unneeded variable Just a small cleanup. The "passive_state" variable isn't used any more after commit `dae58728dc` ("RDMA/nes: Fix double CLOSE event indication crash") Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 20:50:07 -07:00
Christoph Lameter	fed1db33fe	IPoIB: Set pkt_type correctly for multicast packets (fix IGMP breakage) IGMP processing is broken because the IPOIB does not set the skb->pkt_type the right way for multicast traffic. All incoming packets are set to PACKET_HOST which means that igmp_recv() will ignore the IGMP broadcasts/multicasts. This in turn means that the IGMP timers are firing and are sending information about multicast subscriptions unnecessarily. In a large private network this can cause traffic spikes. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 11:09:23 -07:00
Steve Wise	40dbf6ee38	RDMA/cxgb4: Fastreg NSMR fixes - Remove dsgl support - doesn't work in T4. - Wrap the immediate PBL as needed when building it in the wr. - Adjust max pbl depth allowed based on ulptx alignment requirements. - Bump the slots per SQ to 5 to allow up to 128MB fast registers. - Advertise fastreg support by default. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:53:50 -07:00
Steve Wise	410ade4c26	RDMA/cxgb4: Don't set completion flag for read requests Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:53:49 -07:00
Steve Wise	98ae68b7ee	RDMA/cxgb4: Set the default TCP send window to 128KB This helps with large IO throughput. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:53:48 -07:00
Steve Wise	2f5b48c3ad	RDMA/cxgb4: Use a mutex for QP and EP state transitions Move the connection setup/teardown paths to the workq thread removing spin lock/irq disable requirements for these paths. This allows calls down to the LLD for EP and QP state transition actions to be atomic with respect to processing CPL messages coming up from the HW. Namely, calls to rdma_init() and rdma_fini() can now be called with the mutex held avoiding many race conditions with the abort path. The QP spinlock is still used but only to manipulate the qp state. This allows the fastpaths, poll, post_send, and pos_recv, to run in the irq context. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:53:48 -07:00
Steve Wise	c6d7b26791	RDMA/cxgb4: Support on-chip SQs T4 support on-chip SQs to reduce latency. This patch adds support for this in iw_cxgb4: - Manage ocqp memory like other adapter mem resources. - Allocate user mode SQs from ocqp mem if available. - Map ocqp mem to user process using write combining. - Map PCIE_MA_SYNC reg to user process. Bump uverbs ABI. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:46:35 -07:00
Steve Wise	aadc4df308	RDMA/cxgb4: Centralize the wait logic Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:46:34 -07:00
Steve Wise	9e8d1fa342	RDMA/cxgb4: debugfs files for dumping active stags Add "stags" debugfs file. This is useful for examining the TPTE and PBL entries in adapter memory. It allows scripts to dump just the active entries. Also clean up the "qps" file handlers and shared common code. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:46:33 -07:00
Steve Wise	05fb962947	RDMA/cxgb4: Log HW lack-of-resource errors This helps debug cases where HW resources are depleted. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:46:33 -07:00
Steve Wise	0e42c1f430	RDMA/cxgb4: Handle CPL_RDMA_TERMINATE messages T4 FW sends up CPL_RDMA_TERMINATE to indicate a peer TERM. This triggers the QP moving to TERMINATE state. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:46:32 -07:00
Steve Wise	6ff0e343b3	RDMA/cxgb4: Ignore TERMINATE CQEs T4 incorrectly inserts TERM CQEs into the CQ. Silently ignore them. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:46:31 -07:00
Steve Wise	7459486133	RDMA/cxgb4: Ignore positive return values from cxgb4__send() functions The cxgb4__send() functions return NET_XMIT_ values, which are positive integers or negative errno values. So don't treat positive return values as an error. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:46:31 -07:00
Steve Wise	13fecb83b4	RDMA/cxgb4: Zero out ISGL padding The HW design requires zeroing any pad in SGLs. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:46:30 -07:00
Steve Wise	af93fb5dcc	RDMA/cxgb4: Don't use null ep ptr In c4iw_modify_qp() error path, only use qhp->ep if ep is not already set. Otherwise qhp->ep can be NULL and we crash. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 10:46:29 -07:00
Roland Dreier	183ae74bda	RDMA/nes: Fix cast-to-pointer warnings on 32-bit Fix: drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_alloc_fast_reg_page_list': drivers/infiniband/hw/nes/nes_verbs.c:477: warning: cast to pointer from integer of different size drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_post_send': drivers/infiniband/hw/nes/nes_verbs.c:3486: warning: cast to pointer from integer of different size drivers/infiniband/hw/nes/nes_verbs.c:3486: warning: cast to pointer from integer of different size by printing u64 quantities by casting to unsigned long and long and using %llx, rather than casting to void* and using %p. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-27 17:51:33 -07:00
Eli Cohen	a3f5adaf49	IB/core: Add link layer property to ports This patch allows ports to have different link layers: IB_LINK_LAYER_INFINIBAND or IB_LINK_LAYER_ETHERNET. This is required for adding IBoE (InfiniBand-over-Ethernet, aka RoCE) support. For devices that do not provide an implementation for querying the link layer property of a port, we return a default value based on the transport: RMA_TRANSPORT_IB nodes will return IB_LINK_LAYER_INFINIBAND and RDMA_TRANSPORT_IWARP nodes will return IB_LINK_LAYER_ETHERNET. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-27 17:51:10 -07:00
Roland Dreier	c8e081a1bf	RDMA/cxgb4: Fix warnings about casts to/from pointers of different sizes Fix: drivers/infiniband/hw/cxgb4/qp.c: In function ‘create_qp’: drivers/infiniband/hw/cxgb4/qp.c:147: warning: cast from pointer to integer of different size drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_fini’: drivers/infiniband/hw/cxgb4/qp.c:988: warning: cast from pointer to integer of different size drivers/infiniband/hw/cxgb4/qp.c: In function ‘rdma_init’: drivers/infiniband/hw/cxgb4/qp.c:1063: warning: cast from pointer to integer of different size drivers/infiniband/hw/cxgb4/mem.c: In function ‘write_adapter_mem’: drivers/infiniband/hw/cxgb4/mem.c:74: warning: cast from pointer to integer of different size drivers/infiniband/hw/cxgb4/cq.c: In function ‘destroy_cq’: drivers/infiniband/hw/cxgb4/cq.c:58: warning: cast from pointer to integer of different size drivers/infiniband/hw/cxgb4/cq.c: In function ‘create_cq’: drivers/infiniband/hw/cxgb4/cq.c:135: warning: cast from pointer to integer of different size drivers/infiniband/hw/cxgb4/cm.c: In function ‘fw6_msg’: drivers/infiniband/hw/cxgb4/cm.c:2326: warning: cast to pointer from integer of different size by casting pointers to unsigned long instead of u64. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-27 17:51:04 -07:00
Steve Wise	bec658ff31	RDMA/cxgb3: Turn off RX coalescing for iWARP connections The HW by default has RX coalescing on. For iWARP connections, this causes a 100ms delay in connection establishement due to the ingress MPA Start message being stalled in HW. So explicitly turn RX coalescing off when setting up iWARP connections. This was causing very bad performance for NP64 gather operations using Open MPI, due to the way it sets up connections on larger jobs. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-27 09:28:55 -07:00
Joe Perches	ea3f0e6bc5	drivers/infiniband: Remove unnecessary casts of private_data Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-09-23 13:38:28 +02:00
Roland Dreier	17859d07c8	Merge branches 'cxgb3' and 'nes' into for-linus	2010-09-08 14:43:28 -07:00
Faisal Latif	29da03b9d1	RDMA/nes: Fix hang with modified FIN handling on A0 cards Changing state to CLOSING when FIN is received causes A0 cards to hang. Fix this by checking for A0 cards in FIN handling. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-08 14:38:23 -07:00
Faisal Latif	67d7072115	RDMA/nes: Change state to closing after FIN When the driver receives an AE for FIN received, it closes the connection without changing the state of the connection in the hardware to closing. By changing the state to closing, hardware will do a normal close sequence. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-08 14:38:04 -07:00
Faisal Latif	dae58728dc	RDMA/nes: Fix double CLOSE event indication crash During a stress testing in a large cluster, multiple close event are detected and BUG() is hit in the iWARP core. The cause is that the active node gave up while waiting for an MPA response from the peer and tried to close the connection by sending RST. The passive node driver receives the RST but is waiting for MPA response from the user. When the MPA accept is received, the driver offloads the connection and sends a CLOSE event. The driver gets an AE indicating RESET received and also sends a CLOSE event, hitting a BUG(). Fix this by correcting RESET handling and sending CLOSE events. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-08 14:35:48 -07:00
Chien Tung	70c9db0fdf	RDMA/nes: Write correct register write to set TX pause param Setting TX pause param writes to the wrong register location causing the adapter to hang. Correct the define used to write the reigster. Addresses: https://bugs.openfabrics.org/show_bug.cgi?id=2116 Reported-by: Shiri Franchi <shirif@voltaire.com> Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-08 14:29:19 -07:00
Steve Wise	dc4e96ce2d	RDMA/cxgb3: Don't exceed the max HW CQ depth The max depth supported by T3 is 64K entries. This fixes a bug introduced in commit `9918b28d` ("RDMA/cxgb3: Increase the max CQ depth") that causes stalls and possibly crashes in large MPI clusters. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-02 14:52:21 -07:00
Linus Torvalds	58d4ea65b9	Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6 * 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: mmc_spi: Fix unterminated of_match_table of/sparc: fix build regression from of_device changes of/device: Replace struct of_device with struct platform_device	2010-08-12 09:11:31 -07:00
Steve Wise	93fb72e443	RDMA/cxgb4: Obtain RDMA QID ranges from LLD/FW Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-07 23:08:47 -07:00
Linus Torvalds	3cc08fc35d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (42 commits) IB/qib: Add missing <linux/slab.h> include IB/ehca: Drop unnecessary NULL test RDMA/nes: Fix confusing if statement indentation IB/ehca: Init irq tasklet before irq can happen RDMA/nes: Fix misindented code RDMA/nes: Fix showing wqm_quanta RDMA/nes: Get rid of "set but not used" variables RDMA/nes: Read firmware version from correct place IB/srp: Export req_lim via sysfs IB/srp: Make receive buffer handling more robust IB/srp: Use print_hex_dump() IB: Rename RAW_ETY to RAW_ETHERTYPE RDMA/nes: Fix two sparse warnings RDMA/cxgb3: Make needlessly global iwch_l2t_send() static IB/iser: Make needlessly global iser_alloc_rx_descriptors() static RDMA/cxgb4: Add timeouts when waiting for FW responses IB/qib: Fix race between qib_error_qp() and receive packet processing IB/qib: Limit the number of packets processed per interrupt IB/qib: Allow writes to the diag_counters to be able to clear them IB/qib: Set cfgctxts to number of CPUs by default ...	2010-08-07 17:08:02 -07:00
Grant Likely	2dc1158137	of/device: Replace struct of_device with struct platform_device of_device is just an alias for platform_device, so remove it entirely. Also replace to_of_device() with to_platform_device() and update comment blocks. This patch was initially generated from the following semantic patch, and then edited by hand to pick up the bits that coccinelle didn't catch. @@ @@ -struct of_device +struct platform_device Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Reviewed-by: David S. Miller <davem@davemloft.net>	2010-08-06 09:25:50 -06:00
Roland Dreier	03b37ecdb3	Merge branches 'cxgb3', 'cxgb4', 'ehca', 'ipath', 'misc', 'nes', 'qib' and 'srp' into for-next	2010-08-05 14:27:14 -07:00
David Miller	ba818afdc6	IB/qib: Add missing <linux/slab.h> include Fix build failure on sparc64 which is missing the include of <linux/slab.h> via <asm/pci.h> that x86, powerpc, ia64, etc. have. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-05 14:26:58 -07:00
Julia Lawall	2db0032181	IB/ehca: Drop unnecessary NULL test list_for_each_entry binds its first argument to a non-null value, and thus any null test on the value of that argument is superfluous. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ iterator I; expression x; statement S,S1,S2; @@ I(x,...) { <... - if (x == NULL && ...) S ...> } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-05 14:24:55 -07:00
Roland Dreier	817979ac45	RDMA/nes: Fix confusing if statement indentation Fix confusing indentation that makes a statement look as if it's part of an if statement when in fact it isn't. Reported-by: Julia Lawall <julia@diku.dk> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-05 14:21:31 -07:00
Alexander Schmidt	bd5d0ccbef	IB/ehca: Init irq tasklet before irq can happen Initialize tasklet before interrupts are requested to prevent scheduling of an uninitialized tasklet. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 16:14:33 -07:00
Linus Torvalds	3cfc2c42c1	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (48 commits) Documentation: update broken web addresses. fix comment typo "choosed" -> "chosen" hostap:hostap_hw.c Fix typo in comment Fix spelling contorller -> controller in comments Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault fs/Kconfig: Fix typo Userpace -> Userspace Removing dead MACH_U300_BS26 drivers/infiniband: Remove unnecessary casts of private_data fs/ocfs2: Remove unnecessary casts of private_data libfc: use ARRAY_SIZE scsi: bfa: use ARRAY_SIZE drm: i915: use ARRAY_SIZE drm: drm_edid: use ARRAY_SIZE synclink: use ARRAY_SIZE block: cciss: use ARRAY_SIZE comment typo fixes: charater => character fix comment typos concerning "challenge" arm: plat-spear: fix typo in kerneldoc reiserfs: typo comment fix update email address ...	2010-08-04 15:31:02 -07:00
Roland Dreier	b2a899eaf3	RDMA/nes: Fix misindented code In nes_probe(), a bit of code is indented one tab stop too far. Fix this. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 14:29:31 -07:00
Roland Dreier	df924f833c	RDMA/nes: Fix showing wqm_quanta In nes_show_wqm_quanta(), the wrong value is printed. Fix this. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 14:27:01 -07:00
Roland Dreier	69d5102383	RDMA/nes: Get rid of "set but not used" variables Delete dead code in various places that is shown by gcc 4.6's new -Wunused-but-set-variable warnings. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 14:25:40 -07:00
Miroslaw Walukiewicz	ff0380ce39	RDMA/nes: Read firmware version from correct place Signed-off-by: Mirek Walukiewicz <miroslaw.walukiewicz@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 13:22:28 -07:00
Bart Van Assche	89de74866b	IB/srp: Export req_lim via sysfs Export req_lim via sysfs for debugging. Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com> Acked-by: David Dillow <dave@thedillows.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 13:02:26 -07:00
Linus Torvalds	6ba74014c1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits) phy/marvell: add 88ec048 support igb: Program MDICNFG register prior to PHY init e1000e: correct MAC-PHY interconnect register offset for 82579 hso: Add new product ID can: Add driver for esd CAN-USB/2 device l2tp: fix export of header file for userspace can-raw: Fix skb_orphan_try handling Revert "net: remove zap_completion_queue" net: cleanup inclusion phy/marvell: add 88e1121 interface mode support u32: negative offset fix net: Fix a typo from "dev" to "ndev" igb: Use irq_synchronize per vector when using MSI-X ixgbevf: fix null pointer dereference due to filter being set for VLAN 0 e1000e: Fix irq_synchronize in MSI-X case e1000e: register pm_qos request on hardware activation ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice net: Add getsockopt support for TCP thin-streams cxgb4: update driver version cxgb4: add new PCI IDs ... Manually fix up conflicts in: - drivers/net/e1000e/netdev.c: due to pm_qos registration infrastructure changes - drivers/net/phy/marvell.c: conflict between adding 88ec048 support and cleaning up the IDs - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req conflict (registration change vs marking it static)	2010-08-04 11:47:58 -07:00
Bart Van Assche	c996bb47bb	IB/srp: Make receive buffer handling more robust The current strategy in ib_srp for posting receive buffers is: * Post one buffer after channel establishment. * Post one buffer before sending an SRP_CMD or SRP_TSK_MGMT to the target. As a result, only the first non-SRP_RSP information unit from the target will be processed. If that first information unit is an SRP_T_LOGOUT, it will be processed. On the other hand, if the initiator receives an SRP_CRED_REQ or SRP_AER_REQ before it receives a SRP_T_LOGOUT, the SRP_T_LOGOUT won't be processed. We can fix this inconsistency by changing the strategy for posting receive buffers to: * Post all receive buffers after channel establishment. * After a receive buffer has been consumed and processed, post it again. A side effect is that the ib_post_recv() call is moved out of the SCSI command processing path. Since __srp_post_recv() is not called directly any more, get rid of it and move the code directly into srp_post_recv(). Also, move srp_post_recv() up in the file to avoid a forward declaration. Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com> Acked-by: David Dillow <dave@thedillows.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 11:47:39 -07:00
Bart Van Assche	7a7008110b	IB/srp: Use print_hex_dump() Replace an open-coded dump of the receive buffer with a call to print_hex_dump(). Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 11:24:12 -07:00
Aleksey Senin	a2ebf07ae5	IB: Rename RAW_ETY to RAW_ETHERTYPE Change abbreviated IB_QPT_RAW_ETY to IB_QPT_RAW_ETHERTYPE to make the special QP type easier to understand. cf http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg04530.html Signed-off-by: Aleksey Senin <alekseys@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 10:44:19 -07:00
Or Gerlitz	812d867221	RDMA/nes: Fix two sparse warnings Simple changes to fix warnings: CHECK drivers/infiniband/hw/nes/nes_verbs.c nes_verbs.c:1944:45: warning: Using plain integer as NULL pointer nes_verbs.c:1944:48: warning: Using plain integer as NULL pointer CHECK drivers/infiniband/hw/nes/nes_cm.c nes_cm.c:2645:43: warning: mixing different enum types nes_cm.c:2645:43: int enum iw_cm_event_type versus nes_cm.c:2645:43: int enum iw_cm_event_status Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Acked-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 10:00:34 -07:00
Or Gerlitz	18199f573e	RDMA/cxgb3: Make needlessly global iwch_l2t_send() static Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 09:57:01 -07:00
Or Gerlitz	48d8fcebb7	IB/iser: Make needlessly global iser_alloc_rx_descriptors() static Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 09:54:55 -07:00
Steve Wise	a5f4a07820	RDMA/cxgb4: Add timeouts when waiting for FW responses Don't hang a host thread if the FW stops responding. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 09:54:42 -07:00
Jiri Kosina	d790d4d583	Merge branch 'master' into for-next	2010-08-04 15:14:38 +02:00
Ralph Campbell	a5210c12b7	IB/qib: Fix race between qib_error_qp() and receive packet processing When transitioning a QP to the error state, in progress RWQEs need to be marked complete. This also involves releasing the reference count to the memory regions referenced in the SGEs. The locking in the receive packet processing wasn't sufficient to prevent qib_error_qp() from modifying the r_sge state at the same time, thus leading to kernel panics. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-03 13:59:47 -07:00
Ralph Campbell	3e3aed0b88	IB/qib: Limit the number of packets processed per interrupt Don't processes too many packets without allowing other IRQ functions a chance to run. Otherwise, there is a chance of getting a "soft lockup" messages and poor application response times. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-03 13:59:25 -07:00
Ira Weiny	4c6931f5d4	IB/qib: Allow writes to the diag_counters to be able to clear them Signed-off-by: Ira Weiny <weiny2@llnl.gov> Acked-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-03 13:59:19 -07:00
Ralph Campbell	0502f94c62	IB/qib: Set cfgctxts to number of CPUs by default Up to now, we have set the number of available user contexts based on the number of hardware contexts which is set according to the number of available CPUs. This was fine since most CPUs had a power of two number of cores and the chip supported 4, 8, or 16 user contexts. Now that some systems have 12 cores, the default isn't optimal and should be set to 12 even though 16 hardware contexts need to be enabled. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-03 13:59:05 -07:00
Steve Wise	ca5a22028d	RDMA/cxgb4: Set/reset the EP timer inside EP lock Endpoint timer manipulation needs to be done inside the lock. Otherwise we can get into a situation where a timer is stopped before it is started, which hits the WARN_ON() in stop_ep_timer(). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-02 21:06:17 -07:00
Steve Wise	d4f1a5c6ef	RDMA/cxgb4: Use correct control txq There is only one control txq per tx channel. So use the port number as the queue index when sending. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-02 21:06:12 -07:00
Steve Wise	73d6fcad2a	RDMA/cxgb4: Fix race in fini path There exists a race condition where the app disconnects, which initiates an orderly close (via rdma_fini()), concurrently with an ingress abort condition, which initiates an abortive close operation. Since rdma_fini() must be called without IRQs disabled, the fini can be called after the QP has been transitioned to ERROR. This is ok, but we need to protect against qp->ep getting NULLed. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-02 21:06:06 -07:00
Sean Hefty	50a025c69e	IB/cm: Check LAP state before sending an MRA NULL pointer dereferences in ib_cm_init_qp_attr() were seen by some users. From a crash dump, I determined that we died in cm_init_qp_rts_attr() (it's inlined, so it doesn't show up in the traceback) on the line labeled below: static int cm_init_qp_rts_attr(struct cm_id_private cm_id_priv, struct ib_qp_attr qp_attr, int qp_attr_mask) { ........ if (cm_id_priv->id.lap_state == IB_CM_LAP_UNINIT) { ..... } else { qp_attr_mask = IB_QP_ALT_PATH \| IB_QP_PATH_MIG_STATE; qp_attr->alt_port_num = cm_id_priv->alt_av.port->port_num; <-die The problem is that the rdma_cm can call ib_send_cm_mra() after a connection has been established. The ib_cm incorrectly assumes that the MRA is in response to a LAP (load alternate path) message, even though no LAP message has been received. The ib_cm needs to check the lap_state before sending an MRA if the cm_id state is established. Reported-by: Arthur Kepner <akepner@sgi.com> Reported-by: Josh England <jjengla@gmail.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-28 15:18:24 -07:00
Faisal Latif	cd6860eb03	RDMA/nes: Fix hangs on ifdown When ib_unregister_device() is called from netdev stop during ifdown, it sometimes hangs. Changes made to indicate port_err to ib_dispatch_event() during netdev stop and port_active during netdev open. The ib_unregister_device() is only called during remove of the module. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-28 15:14:27 -07:00
Chien Tung	0eec495ee6	RDMA/nes: Store and print eeprom version Read and print eeprom version and save it off for later use. Also delete a tab. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-28 15:12:38 -07:00
Peter Huewe	33085bb8da	RDMA/nes: Convert pci_table entries to PCI_VDEVICE This patch converts pci_table entries, where .subvendor=PCI_ANY_ID and .subdevice=PCI_ANY_ID, .class=0 and .class_mask=0, to use the PCI_VDEVICE macro, and thus improves readability. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-28 10:39:33 -07:00
Alexander Schmidt	e675b6db12	IB/ehca: Catch failing ioremap() When ioremap() fails with a NULL pointer, catch the error and pass it to the caller of create_qp() or create_cq() instead of trying to dereference the NULL pointer later on. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-21 12:46:29 -07:00
Dave Olson	bdf8edcb57	IB/qib: Allow PSM to select from multiple port assignment algorithms We used to allow only full specification, or using all contexts within an HCA before moving to the next HCA. We now allow an additional method -- round-robining through HCAs -- and make that the default. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-21 11:39:36 -07:00
Ralph Campbell	2d978a953b	IB/qib: Turn off IB latency mode Turn off IB latency mode. This improves link quality for slower process chips. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-21 11:39:31 -07:00
Arnd Bergmann	dd378c2102	IB/qib: Use generic_file_llseek When the default llseek action gets changed to no_llseek, all file systems relying on the current behaviour need to set explicit .llseek operations. In case of qib_fs, we want the files to be seekable, so generic_file_llseek fits best. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-21 11:39:27 -07:00
Steve Wise	d37ac31ddc	RDMA/cxgb4: Support variable sized work requests T4 EQ entries are in multiples of 64 bytes. Currently the RDMA SQ and RQ use fixed sized entries composed of 4 EQ entries for the SQ and 2 EQ entries for the RQ. For optimial latency with small IO, we need to change this so the HW only needs to DMA the EQ entries actually used by a given work request. Implementation: - add wq_pidx counter to track where we are in the EQ. cidx/pidx are used for the sw sq/rq tracking and flow control. - the variable part of work requests is the SGL. Add new functions to build the SGL and/or immediate data directly in the EQ memory wrapping when needed. - adjust the min burst size for the EQ contexts to 64B. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-21 11:16:20 -07:00
Dan Carpenter	3d4f9a28e0	RDMA/cxgb3: Clean up signed check of unsigned variable Q_FREECNT() returns the number of spaces free. This should never be a negative amount. Also the num_wrs is an unsigned int so it can never be less than zero. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-21 10:57:25 -07:00
David Rientjes	d3c814e8b2	RDMA/cxgb4: Remove dependency on __GFP_NOFAIL The alloc_skb() in various allocations are failable, so remove __GFP_NOFAIL from their masks. Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-21 10:55:05 -07:00
Steve Wise	ba6d39256b	RDMA/cxgb4: Add module option to tweak delayed ack Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-21 10:53:52 -07:00
Ben Hutchings	dccb816de3	IB/ipath: Fix probe failure path The failure path in ipath_init_one() does not match the cleanup code in ipath_remove_one() and appears to leave interrupts enabled in some cases. Change it to match. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-21 10:48:39 -07:00
Joe Perches	78e2c6415a	drivers/infiniband: Remove unnecessary casts of private_data Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-07-20 17:23:32 +02:00
Alexander Schmidt	91fb0dd9cb	IB/ehca: Fix bitmask handling for lock_hcalls Fix reading hcall locking capability bit from device capabilities. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-19 13:23:32 -07:00
Ralph Campbell	cc323b2aaa	IB/qib: Avoid variable-length array Rather than use a variable size array allocation on the stack, define a constant for the maximum array size possible. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-19 13:21:24 -07:00
Roland Dreier	85963e4cbc	RDMA/cxgb4: Remove unneeded NULL check The rest of the code seems to assume that ep->com.cm_id can't be NULL, so remove an unneeded test. Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-19 13:13:09 -07:00
Dan Carpenter	c1d7356c85	RDMA/cxgb4: Remove unneeded assignment We don't need to assign rpl here, we do that later on. Signed-off-by: Dan Carpenter <error27@gmail.com> [ Indeed this assignment makes no sense, since skb is set to NULL a couple of lines before. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-19 13:09:40 -07:00
Roland Dreier	ea9f3bc6d1	RDMA/nes: Rewrite expression to avoid undefined semantics Change code like x = expr(++x) that assigns to x twice without a sequence point in between to the intended (and well-defined) x = expr(x + 1) Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-14 13:29:21 -07:00
Roland Dreier	f400e5b38a	IB/umad: Remove unused-but-set variable 'already_dead' Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-14 13:25:04 -07:00
Ben Hutchings	ecd4b48a16	IB/qib: Use request_firmware() to load SD7220 firmware Extract the microcode for the QLogic QLE7220 series IB HCA and use the kernel microcode request facility to load the microcode. This supports Debian Linux's requirements to separate microcode which doesn't have open source code available from the device driver. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-08 13:27:05 -07:00
Linus Torvalds	e467e104bb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IPoIB: Fix world-writable child interface control sysfs attributes IB/qib: Clean up properly if qib_init() fails IB/qib: Completion queue callback needs to be single threaded IB/qib: Update 7322 serdes tables IB/qib: Clear 6120 hardware error register IB/qib: Clear eager buffer memory for each new process IB/qib: Mask hardware error during link reset IB/qib: Don't mark VL15 bufs as WC to avoid a rare 7322 chip problem RDMA/cxgb4: Derive smac_idx from port viid RDMA/cxgb4: Avoid false GTS CIDX_INC overflows RDMA/cxgb4: Don't call abort_connection() for active connect failures RDMA/cxgb4: Use the DMA state API instead of the pci equivalents	2010-07-08 12:20:54 -07:00
Roland Dreier	9e770044a0	Merge branches 'cxgb4', 'ipoib' and 'qib' into for-next	2010-07-08 09:10:24 -07:00
Or Gerlitz	7a52b34b07	IPoIB: Fix world-writable child interface control sysfs attributes Sumeet Lahorani <sumeet.lahorani@oracle.com> reported that the IPoIB child entries are world-writable; however we don't want ordinary users to be able to create and destroy child interfaces, so fix them to be writable only by root. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:23:22 -07:00
Ralph Campbell	756a33b8dc	IB/qib: Clean up properly if qib_init() fails If qib_init() fails, the driver fails to free memory, unregister device files, and unregister with the PCIe framework. The driver will unload without error but a subsequent driver load will cause the system to panic. This was found by changing the 7220 code to load the serdes microcode separately and not installing the microcode file. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:14:04 -07:00
Ralph Campbell	950aff5394	IB/qib: Completion queue callback needs to be single threaded Workqueues aren't exactly equivalent to tasklets since the callback function may be called from multiple CPUs before the callback returns. This causes completion notification callbacks to have MT bugs since they weren't expecting this behavior. The fix is to use a single threaded work queue. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:13:58 -07:00
Ralph Campbell	7c7a416ef8	IB/qib: Update 7322 serdes tables Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:13:46 -07:00
Ralph Campbell	2d757a7ce0	IB/qib: Clear 6120 hardware error register The hardware error register needs to be cleared or another interrupt will be generated, thus causing an infinite loop. This is a regression introduced when removing debug output. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:13:40 -07:00
Ralph Campbell	5df4223a44	IB/qib: Clear eager buffer memory for each new process The eager buffers are not being cleared before being mmapped into a new user address space. This is a potential security risk and should be fixed. Note that the eager header queue is already being cleared. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:13:21 -07:00
Ralph Campbell	b9e03e0489	IB/qib: Mask hardware error during link reset The HCA checks for certain hardware errors which can be falsely triggered when the IB link is reset. The fix is to mask them rather than report them. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:13:20 -07:00
Dave Olson	fce24a9d28	IB/qib: Don't mark VL15 bufs as WC to avoid a rare 7322 chip problem Don't set write combining via PAT on the VL15 buffers to avoid a rare problem with unaligned writes from interrupt-flushed store buffers. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:13:20 -07:00
Steve Wise	2c5934bfc5	RDMA/cxgb4: Derive smac_idx from port viid Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:05:16 -07:00
Steve Wise	1973e8b8ed	RDMA/cxgb4: Avoid false GTS CIDX_INC overflows The T4 IQ hw design assumes CIDX_INC credits will be returned on a regular basis and always before the CIDX counter crosses over the PIDX counter. For RDMA CQs, however, returning CIDX_INC credits is only needed and desired when and if the CQ is armed for notification. This can lead to a GTS write returning credits that causes the HW to reject the credit update because it causes CIDX to pass PIDX. Once this happens, the CIDX/PIDX counters get out of whack and an application can miss a notification and get stuck blocked awaiting a notification. To avoid this, we allocate the HW IQ 2x times the requested size. This seems to avoid the false overflow failures. If we see more issues with this, then we'll have to add code in the poll path to return credits periodically like when the amount reaches 1/2 the queue depth). I would like to avoid this as it adds a PCI write transaction for applications that never arm the CQ (like most MPIs). Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:04:04 -07:00
Steve Wise	b21ef16a8b	RDMA/cxgb4: Don't call abort_connection() for active connect failures Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:02:54 -07:00
FUJITA Tomonori	f38926aa1d	RDMA/cxgb4: Use the DMA state API instead of the pci equivalents This replace the PCI DMA state API (include/linux/pci-dma.h) with the DMA equivalents since the PCI DMA state API will be obsolete. No functional change. For further information about the background: http://marc.info/?l=linux-netdev&m=127037540020276&w=2 Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-06 14:01:42 -07:00
Ben Hutchings	39827be26b	IB/{nes, ipoib}: Pass supported flags to ethtool_op_set_flags() Following commit `1437ce3983` "ethtool: Change ethtool_op_set_flags to validate flags", ethtool_op_set_flags takes a third parameter and cannot be used directly as an implementation of ethtool_ops::set_flags. Changes nes and ipoib driver to pass in the appropriate value. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-07-04 11:48:14 -07:00
Jiri Kosina	f1bbbb6912	Merge branch 'master' into for-next	2010-06-16 18:08:13 +02:00
Uwe Kleine-König	421f91d21a	fix typos concerning "initiali[zs]e" Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-06-16 18:05:05 +02:00
Uwe Kleine-König	732bee7af3	fix typos concerning "hierarchy" Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-06-16 18:03:14 +02:00
Changli Gao	d8d1f30b95	net-next: remove useless union keyword remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-10 23:31:35 -07:00
Al Viro	971b2e8a3f	fix the deadlock in qib_fs get_sb_single() calls fill_super with superblock locked; calling deactivate_super() will deadlock immedately. Moreover, if fill_super callback returns an error, get_sb_single() will release the reference to superblock itself just fine. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-06-04 17:16:27 -04:00
Linus Torvalds	3e9345edd8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/qib: Remove DCA support until feature is finished IB/qib: Use a single txselect module parameter for serdes tuning IB/qib: Don't rely on (undefined) order of function parameter evaluation IB/ucm: Use memdup_user() IB/qib: Fix undefined symbol error when CONFIG_PCI_MSI=n	2010-05-30 09:12:16 -07:00
Roland Dreier	767dcd42e5	Merge branches 'misc' and 'qib' into for-next	2010-05-27 11:05:04 -07:00
Ralph Campbell	7145c45a06	IB/qib: Remove DCA support until feature is finished The DCA code was left over from internal development to test the hardware feature and allow performance testing. The results were mixed and will require some additional work to make full use of the feature. Therefore, it is being removed for now. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-27 11:04:48 -07:00
Akinobu Mita	1dee31f74f	ehca: convert cpu notifier to return encapsulate errno value By the previous modification, the cpu notifier can return encapsulate errno value. This converts the cpu notifiers for ehca. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Cc: Christoph Raisch <raisch@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-27 09:12:48 -07:00
Ralph Campbell	a77fcf8950	IB/qib: Use a single txselect module parameter for serdes tuning As part of the earlier patches submitted and reviewed, it was agreed to change the way serdes tuning parameters were specified to the driver. The updated patch got dropped by the linux-rdma email list so the earlier version of qib_iba7322.c ended up being used. This patch updates qib_iab7322.c to the simpler, single parameter method of setting the serdes parameters. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-26 16:37:39 -07:00
Roland Dreier	f27ec1d6db	IB/qib: Don't rely on (undefined) order of function parameter evaluation Some of the qib sysfs code passes a buffer pointer into simple_read_from_buffer() but relies on a function call in another parameter of the same call to initialize that pointer. Since the order of evaluation of function parameters is undefined, this will break if gcc chooses the wrong order. Fix this by splitting the code into two separate function calls. This was noticed because of warnings like the following on ppc: drivers/infiniband/hw/qib/qib_fs.c: In function 'portcntrs_2_read': drivers/infiniband/hw/qib/qib_fs.c:203: warning: 'counters' is used uninitialized in this function Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-26 13:15:06 -07:00
Julia Lawall	e642df6a0b	IB/ucm: Use memdup_user() Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = \(kmalloc@p\\|kzalloc@p\)(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) \|\| ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk>	2010-05-25 21:10:57 -07:00
Ralph Campbell	7e3a1f4ab1	IB/qib: Fix undefined symbol error when CONFIG_PCI_MSI=n This patch fixes a compile error saying qib_init_iba6120_funcs() is undefined when CONFIG_PCI_MSI is not defined. Thanks to Randy Dunlap <randy.dunlap@oracle.com> for finding this and suggesting the fix. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-25 21:09:43 -07:00
Linus Torvalds	8e9815a0f8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/nes: Fix incorrect unlock in nes_process_mac_intr() RDMA/nes: Async event for closed QP causes crash RDMA/nes: Have ethtool read hardware registers for rx/tx stats RDMA/cxgb4: Only insert sq qid in lookup table RDMA/cxgb4: Support IB_WR_READ_WITH_INV opcode RDMA/cxgb4: Set fence flag for inv-local-stag work requests RDMA/cxgb4: Update some HW limits RDMA/cxgb4: Don't limit fastreg page list depth RDMA/cxgb4: Return proper errors in fastreg mr/pbl allocation RDMA/cxgb4: Fix overflow bug in CQ arm RDMA/cxgb4: Optimize CQ overflow detection RDMA/cxgb4: CQ size must be IQ size - 2 RDMA/cxgb4: Register RDMA provider based on LLD state_change events RDMA/cxgb4: Detach from the LLD after unregistering RDMA device IB/ipath: Remove support for QLogic PCIe QLE devices IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters IB/mad: Make needlessly global mad_sendq_size/mad_recvq_size static IB/core: Allow device-specific per-port sysfs files mlx4_core: Clean up mlx4_alloc_icm() a bit mlx4_core: Fix possible chunk sg list overflow in mlx4_alloc_icm()	2010-05-25 12:05:17 -07:00
Roland Dreier	acdc30b56a	Merge branches 'cxgb4', 'misc', 'mlx4', 'nes' and 'qib' into for-next	2010-05-25 09:54:03 -07:00
Chien Tung	b17e0969dc	RDMA/nes: Fix incorrect unlock in nes_process_mac_intr() Commit `ce6e74f2` ("RDMA/nes: Make nesadapter->phy_lock usage consistent") introduced a problem where phy_lock was only unlocked within an if statement and so nes_process_mac_intr() could return with phy_lock still held. Fix this. This was discovered because of the sparse warning: drivers/infiniband/hw/nes/nes_hw.c:2643:9: warning: context imbalance in 'nes_process_mac_intr' - different lock contexts for basic block Reported-by: Roland Dreier <rdreier@cisco.com> Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-25 09:53:06 -07:00
Faisal Latif	df02902313	RDMA/nes: Async event for closed QP causes crash Under abnormal termination, modify_qp() closes the QP, and async event (AE) handling also attempts to close the same QP, causing a crash. Fix this by checking the state of the QP before processing the AE. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-24 21:12:54 -07:00
Faisal Latif	39942a028c	RDMA/nes: Have ethtool read hardware registers for rx/tx stats Enhance ethtool to read hardware registers for rcv/tx error stats. Also add support for free pbl resources. Remove cq depth stats, which are not used. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-24 21:12:54 -07:00
Steve Wise	30a6a62fc3	RDMA/cxgb4: Only insert sq qid in lookup table Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-24 21:08:05 -07:00
Steve Wise	2f1fb507ee	RDMA/cxgb4: Support IB_WR_READ_WITH_INV opcode Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-24 21:08:04 -07:00

... 2 3 4 5 6 ...

2724 Commits