Commit Graph

2154 Commits

Author SHA1 Message Date
Alexander Schmidt
1d4d6da535 IB/ehca: Bump version number
Increment version number for DMEM toleration.

Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-23 10:30:04 -07:00
Roland Dreier
99987bea47 IB/mthca: Replace dma_sync_single() use with proper functions
dma_sync_single() is deprecated now, and the use in mthca is wrong:
there should be a dma_sync_single_for_cpu() before touching the memory
from the CPU, and a dma_sync_single_for_device() afterwards.  Fix
this, prompted by a kick in the pants from a patch from FUJITA
Tomonori <fujita.tomonori@lab.ntt.co.jp>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 23:04:13 -07:00
Faisal Latif
68237a0ff8 RDMA/nes: Fix FIN state handling under error conditions
During cluster testing, one QP was not closed, as FIN is not handled
properly when its rexmit count expires or in some cases when RST is is
received after sending FIN.  The reason is that the cm_id does not get
decremented under these conditions.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:53:28 -07:00
Faisal Latif
66388d67a0 RDMA/nes: Fix max_qp_init_rd_atom returned from query device
In nes_query_device(), max_qp_init_rd_atom is incorrectly set to
max_qp_wr.  This was found when a test application had a dapl async
event error.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:52:30 -07:00
Roel Kluin
af04662b4d IB/ehca: Ensure that guid_entry index is not negative
This prevents the memcpy() of a guid_entries element using a negative index.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:23:48 -07:00
Hannes Hering
0cf89dcdbc IB/ehca: Tolerate dynamic memory operations before driver load
Implement toleration of dynamic memory operations and 16 GB gigantic
pages, where "toleration" means that the driver can cope with dynamic
memory operations that happen before the driver is loaded.  While the
ehca driver is loaded, dynamic memory operations are still prohibited
by returning NOTIFY_BAD from the memory notifier.

On module load the driver walks through available system memory,
checks for available memory ranges and then registers the kernel
internal memory region accordingly.  The translation of address ranges
is implemented via a 3-level busmap.

Signed-off-by: Hannes Hering <hering2@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-22 22:18:51 -07:00
Greg Kroah-Hartman
f899c2ddd4 infiniband: ehca: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device.  Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.

Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: general@lists.openfabrics.org
Cc: Christoph Raisch <raisch@de.ibm.com>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:27 -07:00
Greg Kroah-Hartman
3f7c58a05f infiniband: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device.  Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used.  These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.


Cc: general@lists.openfabrics.org
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15 21:30:26 -07:00
David S. Miller
9cbc1cb8cd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:
	Documentation/feature-removal-schedule.txt
	drivers/scsi/fcoe/fcoe.c
	net/core/drop_monitor.c
	net/core/net-traces.c
2009-06-15 03:02:23 -07:00
Linus Torvalds
cf5046323e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  mlx4_core: Don't double-free IRQs when falling back from MSI-X to INTx
  IB/mthca: Don't double-free IRQs when falling back from MSI-X to INTx
  IB/mlx4: Add strong ordering to local inval and fast reg work requests
  IB/ehca: Remove superfluous bitmasks from QP control block
  RDMA/cxgb3: Limit fast register size based on T3 limitations
  RDMA/cxgb3: Report correct port state and MTU
  mlx4_core: Add module parameter for number of MTTs per segment
  IB/mthca: Add module parameter for number of MTTs per segment
  RDMA/nes: Fix off-by-one bugs in reset_adapter_ne020() and init_serdes()
  infiniband: Remove void casts
  IB/ehca: Increment version number
  IB/ehca: Remove unnecessary memory operations for userspace queue pairs
  IB/ehca: Fall back to vmalloc() for big allocations
  IB/ehca: Replace vmalloc() with kmalloc() for queue allocation
2009-06-14 13:53:22 -07:00
Roland Dreier
8d34ff3401 Merge branches 'cxgb3', 'ehca', 'misc', 'mlx4', 'mthca' and 'nes' into for-linus 2009-06-14 13:31:19 -07:00
Roland Dreier
9aa0a489d9 IB/mthca: Don't double-free IRQs when falling back from MSI-X to INTx
When both MSI-X and legacy INTx fail to generate an interrupt, the
driver frees the MSI-X interrupts twice.  Fix this by clearing the
have_irq flag for the MSI-X interrupts when they are freed the first
time.

Reported-by: Yinghai Lu <yhlu.kernel@gmail.com>
Tested-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-13 15:14:09 -07:00
Jack Morgenstein
2ac6bf4ddc IB/mlx4: Add strong ordering to local inval and fast reg work requests
The ConnectX Programmer's Reference Manual states that the "SO" bit
must be set when posting Fast Register and Local Invalidate send work
requests.  When this bit is set, the work request will be executed
only after all previous work requests on the send queue have been
executed.  (If the bit is not set, Fast Register and Local Invalidate
WQEs may begin execution too early, which violates the defined
semantics for these operations)

This fixes the issue with NFS/RDMA reported in
<http://lists.openfabrics.org/pipermail/general/2009-April/059253.html>

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-05 10:36:24 -07:00
Joachim Fenkes
25a5239327 IB/ehca: Remove superfluous bitmasks from QP control block
All the fields in the control block are nicely right-aligned, so no
masking is necessary.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-06-03 13:25:42 -07:00
Eric Dumazet
adf30907d6 net: skb->dst accessors
Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-03 02:51:04 -07:00
Eric Dumazet
86d15cd833 net: unset IFF_XMIT_DST_RELEASE for qeth and ipoib
Last two drivers that need skb->dst in their start_xmit() function

Tell dev_hard_start_xmit() to no release it by unsetting  IFF_XMIT_DST_RELEASE

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-30 23:04:46 -07:00
Steve Wise
3026c19a14 RDMA/cxgb3: Limit fast register size based on T3 limitations
T3 firmware only supports one WRs worth of page list for fast register
work requests.  The driver currently allows 2 WRs worth, which
doesn't work for T3, so reduce the limit in the driver.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27 14:43:39 -07:00
Steve Wise
7ab1a2b31d RDMA/cxgb3: Report correct port state and MTU
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27 14:42:36 -07:00
Eli Cohen
c1f67a88bf IB/mthca: Add module parameter for number of MTTs per segment
The current MTT allocator uses kmalloc() to allocate a buffer for its
buddy allocator, and thus is limited in the amount of MTT segments
that it can control.  As a result, the size of memory that can be
registered is limited too.  This patch uses a module parameter to
control the number of MTT entries that each segment represents,
allowing more memory to be registered with the same number of
segments.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-27 14:36:16 -07:00
Mike Christie
b3cd5050bf [SCSI] libiscsi: add task aborted state
If a task did not complete normally due to a TMF, libiscsi will
now complete the task with the state ISCSI_TASK_ABRT_TMF. Drivers
like bnx2i that need to free resources if a command did not complete normally
can then check the task state. If a driver does not need to send
a special command if we have dropped the session then they can check
for ISCSI_TASK_ABRT_SESS_RECOV.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-23 15:44:13 -05:00
Mike Christie
10eb0f013c [SCSI] iscsi: pass ep connect shost
When we create the tcp/ip connection by calling ep_connect, we currently
just go by the routing table info.

I think there are two problems with this.

1. Some drivers do not have access to a routing table. Some drivers like
qla4xxx do not even know about other ports.

2. If you have two initiator ports on the same subnet, the user may have
set things up so that session1 was supposed to be run through port1. and
session2 was supposed to be run through port2. It looks like we could
end with both sessions going through one of the ports.

Fixes for cxgb3i from Karen Xie.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-05-23 15:44:09 -05:00
Eric W. Biederman
26574401fe net: Fix ipoib rtnl_lock sysfs deadlock.
Network device sysfs files that grab the rtnl_lock unconditionally
will deadlock if accessed when the network device is being
unregistered.  So use trylock and syscall_restart to avoid this
deadlock.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-18 22:15:59 -07:00
Roel Kluin
28e43a519b RDMA/nes: Fix off-by-one bugs in reset_adapter_ne020() and init_serdes()
With a postfix increment, i is incremented one past 10K/5K before the
loop ends, so the error messages will be displayed too soon if the
test succeeds on the last iteration.  Fix the comparisons to be >
instead of >=.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-15 10:16:45 -07:00
Jack Stone
5b891a9332 infiniband: Remove void casts
Remove uneeded casts of void *.

Signed-off-by: Jack Stone <jwjstone@fastmail.fm>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:53:39 -07:00
Stefan Roscher
bde2cfaf8f IB/ehca: Increment version number
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:43 -07:00
Stefan Roscher
1988d1fa1a IB/ehca: Remove unnecessary memory operations for userspace queue pairs
The queue map for flush completion circumvention is only used for
kernel space queue pairs.  This patch skips the allocation of the
queue maps in case the QP is created for userspace.  In addition, this
patch does not iomap the galpas for kernel usage if the queue pair is
only used in userspace.  These changes will improve the performance of
creation of userspace queue pairs.

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:43 -07:00
Stefan Roscher
c94f156f63 IB/ehca: Fall back to vmalloc() for big allocations
In case of large queue pairs there is the possibillity of allocation
failures due to memory fragmentation when using kmalloc().  To ensure
the memory is allocated even if kmalloc() can not find chunks which
are big enough, we fall back to allocating the memory with vmalloc().

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:42 -07:00
Anton Blanchard
bf31a1a02e IB/ehca: Replace vmalloc() with kmalloc() for queue allocation
To improve performance of driver resource allocation, replace
vmalloc() calls with kmalloc().

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-13 16:52:40 -07:00
Linus Torvalds
c98861f7de Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Don't overwrite fast registration page list when posting work request
  RDMA/cxgb3: Don't complete flushed send work requests twice
2009-05-13 16:31:12 -07:00
Roland Dreier
8be741b0ac Merge branches 'cxgb3' and 'mlx4' into for-linus 2009-05-13 15:16:17 -07:00
Al Viro
265e771e81 Fix deadlock in ipathfs ->get_sb()
forgot to unlock superblock before calling deactivate_super()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-05-09 10:49:40 -04:00
Jack Morgenstein
2b6b7d4be4 IB/mlx4: Don't overwrite fast registration page list when posting work request
The low-level mlx4 driver modified the page-list addresses for fast
register work requests post send to big-endian, and set a "present"
bit.  This caused problems later when the consumer attempted to unmap
the pages using the page-list (using the list addresses which were
assumed to be still in CPU-endian order).  Fix the mlx4 driver to
allocate two buffers and use a private buffer for the hardware-format
bus addresses.

This patch fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1571>,
an NFS/RDMA server crash.  The cause of the crash was found by Vu Pham
of Mellanox.  The fix is along the lines suggested by Steve Wise in
comment #21 in bug 1571.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-05-07 21:35:13 -07:00
Linus Torvalds
61bd1e858d Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (53 commits)
  [SCSI] libosd: OSD2r05: on-the-wire changes for latest OSD2 revision 5.
  [SCSI] libosd: OSD2r05: OSD_CRYPTO_KEYID_SIZE will grow 20 => 32 bytes
  [SCSI] libosd: OSD2r05: Prepare for rev5 attribute list changes
  [SCSI] libosd: fix potential ERR_PTR dereference in osd_initiator.c
  [SCSI] mpt2sas : bump driver version to 01.100.02.00
  [SCSI] mpt2sas: fix hotplug event processing
  [SCSI] mpt2sas : release diagnotic buffers prior host reset
  [SCSI] mpt2sas : Broadcast Primative AEN bug fix
  [SCSI] mpt2sas : Identify Dell series-7 adapters at driver load time
  [SCSI] mpt2sas : driver name needs to be in the MPT2IOCINFO ioctl
  [SCSI] mpt2sas : running out of message frames
  [SCSI] mpt2sas : fix oops when firmware sends large sense buffer size
  [SCSI] mpt2sas : the sanity check in base_interrupt needs to be on dword boundary
  [SCSI] mpt2sas : unique ioctl magic number
  [SCSI] fix sign extension with 1.5TB usb-storage LBD=y
  [SCSI] ipr: Fix sleeping function called with interrupts disabled
  [SCSI] fcoe: fip: add multicast filter to receive FIP advertisements.
  [SCSI] libfc: Fix compilation warnings with allmodconfig
  [SCSI] fcoe: fix spelling typos and bad comments
  [SCSI] fcoe: don't export functions that are internal to fcoe
  ...
2009-05-02 16:36:34 -07:00
Steve Wise
ec6995ddaa RDMA/cxgb3: Don't complete flushed send work requests twice
When the SQ is flushed, mark the flushed entries as not signaled so
the poll logic doesn't re-insert the CQ entry thinking its an out of
order completion.

The bug can cause the NFS/RDMA server to crash due to processing the
same completed work request twice.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-29 15:15:59 -07:00
Roland Dreier
9308f96c79 Merge branches 'cxgb3', 'ipoib', 'mthca', 'mlx4' and 'nes' into for-linus 2009-04-28 16:01:31 -07:00
Chien Tung
26cc5e57bb RDMA/nes: Update iw_nes version
Update version number to 1.5.0.0

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:46:29 -07:00
Faisal Latif
9256b25130 RDMA/nes: Fix error path in nes_accept()
If reg_phys_mem() fails, we need to free memory allocated for MPA
frame with private data before returning the error. Also move
nes_add_ref() after the reg_phys_mem() is successful.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:45:19 -07:00
Faisal Latif
109d67e4f1 RDMA/nes: Fix hang issues for large cluster dynamic connections
Running large cluster setup, we are hanging after many hours of
testing.  Fixing this required going over the code and making sure the
rexmit entry was properly removed based on the cm_node's state and
packet received.  Also when receiving a FIN packet, check seq# and
make sure there were no errors before calling handle_fin().

Following are the changes done in nes_cm.c:

* handle_ack_pkt() needs to return error value, so in case of error,
  handle_fin() is not called. Some cleanup done while going over the code.

* handle_rst_pkt(), handling of cm_node's NES_CM_STATE_LAST_ACK is missing.

* process_packet(), in case of FIN only packet is received, call
  check_seq() before processing.

* in handle_fin_pkt(), we are calling cleanup_retrans_entry() for all
  conditions, even if the packets need to be dropped.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:41:06 -07:00
Faisal Latif
4e9c390036 RDMA/nes: Increase rexmit timeout interval
Under heavy load with large cluster testing, it may take longer to
receive a response to MPA requests.  Change the driver to wait longer
after each rexmit to max time value.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:39:36 -07:00
Faisal Latif
c11470f9f4 RDMA/nes: Check for sequence number wrap-around
check_seq() was not checking if the seq#s have wrapped.  Fix it.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:38:31 -07:00
Faisal Latif
53094c388f RDMA/nes: Do not set apbvt entry for loopback
When a connect request comes, apbvt should only be set for
non-loopback connections.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:37:34 -07:00
Chien Tung
1f0dba1e51 RDMA/nes: Fix unused variable compile warning when INFINIBAND_NES_DEBUG=n
Remove the NES_DEBUG that is causing the compile warning about an
unused variable when INFINIBAND_NES_DEBUG is not enabled.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:36:03 -07:00
Chien Tung
0e4562da9e RDMA/nes: Fix fw_ver in /sys
/sys/class/infiniband/nes?/fw_ver is not displaying firmware version
properly (it shows 0.0.0 with the current code).  Fill in the correct
firmware version number.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:33:48 -07:00
Chien Tung
923223776b RDMA/nes: Set trace length to 1 inch for SFP_D
With updated PHY firmware for SFP_D, setting the trace length to 1
inch for SFP_D provides a more stable link.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:30:35 -07:00
Chien Tung
e998c25bc2 RDMA/nes: Enable repause timer for port 1
Enable repause timer for port 1.  Without this setting, under stress,
the chip may misbehave.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:29:42 -07:00
Chien Tung
366835e249 RDMA/nes: Correct CDR loop filter setting for port 1
In commit 1b949324 ("RDMA/nes: Fix SFP+ PHY initialization") there is
a mistake in the clean up code that removed port 1 CDR loop filter
settings for 10G cards other than CX4.  Put the correct setting back
for appropriate PHY types.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:28:41 -07:00
Chien Tung
010db4d127 RDMA/nes: Modify thermo mitigation to flip SerDes1 ref clk to internal
Change thermo mitigation code to flip the SerDes1 reference clock to
internal, to match the change in commit a4849fc1 ("RDMA/nes: Add
wide_ppm_offset parm for switch compatibility").

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-27 13:27:21 -07:00
Mike Christie
6b5d6c443a [SCSI] cxgb3i, iser, iscsi_tcp: set target can queue
Set target can queue limit to the number of preallocated
session tasks we have.

This along with the cxgb3i can_queue patch will fix a throughput
problem where it could only queue one LU worth of data at a time.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-04-27 10:09:54 -05:00
Miroslaw Walukiewicz
5d1af5c832 RDMA/nes: Fix resource issues in nes_create_cq() and nes_destroy_cq()
In error paths where a CQ is not created, pbl is not freeed properly.

In nes_destroy_cq(), add the corresponding check for nescq->mcrqf to
not call nes_free_resource() when it is already done in nes_create_cq().

Signed-off-by: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-21 16:16:48 -07:00
Matt Kraai
cc005fa20c RDMA/nes: Remove root_256()'s unused pbl_count_256 parameter
Signed-off-by: Matt Kraai <kraai@ftbfs.org>
Acked-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-21 10:43:21 -07:00
Jack Morgenstein
8531f1f14a IB/mthca: Fix timeout for INIT_HCA and a few other commands
Commands INIT_HCA, CLOSE_HCA, SYS_EN, SYS_DIS, and CLOSE_IB all have 1
second timeouts.  For INIT_HCA this causes problems when had more than
2^18 are QPs configured, since the command takes more than 1 second to
complete.

All other commands have 60-second timeouts.  This patch makes the
above commands consistent with the rest of the commands (and with the
chip documentation).

This patch is an expansion of a patch from Arthur Kepner
<akepner@sgi.com> fixing just the INIT_HCA timeout.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 21:12:25 -07:00
Steve Wise
cde9e2f930 RDMA/cxgb3: Don't zero QP attrs when moving to IDLE
QP attributes must stay initialized when moving back to IDLE.  Zeroing
them will crash the system in _flush_qp() if the QP is subsequently
moved to ERROR and back to IDLE.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 17:00:53 -07:00
Don Wood
3f32eb1185 RDMA/nes: Fix bugs in nes_reg_phys_mr()
The code incorrectly failed memory registration if the buffer was not
page aligned.  Also, the length field is mangled causing the hardware
to think the registration is much larger than it really is.

The fix is to remove the page alignment restriction as well the
incorrect length adjustment.  Also make sure that all buffers after
the first start at a page boundary, and all buffers except the last
end on a page boundary.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 14:53:00 -07:00
Chien Tung
1af9222b52 RDMA/nes: Fix compiler warning at nes_verbs.c:1955
Initialize pbl_count_256 to 0 to get rid of the warning:

    drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_reg_mr':
    drivers/infiniband/hw/nes/nes_verbs.c:1955: warning: 'pbl_count_256' may be used uninitialized in this function

Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 14:50:36 -07:00
Yossi Etigin
e028cc55cc IPoIB: Disable NAPI while CQ is being drained
If NAPI is enabled while IPoIB's CQ is being drained, it creates a
race on priv->ibwc between ipoib_poll() and ipoib_drain_cq(), leading
to memory corruption.

The solution is to enable/disable NAPI in ipoib_ib_dev_{open/stop}()
instead of in ipoib_{open/stop}(), and sync NAPI on the INITIALIZED
flag instead on the ADMIN_UP flag. This way NAPI will be disabled when
ipoib_drain_cq() is called.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1587>.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 13:58:08 -07:00
Steve Wise
96ac7e8892 RDMA/cxgb3: Adjust ORD/IRD (if needed) for peer2peer connections
NFS/RDMA currently fails to set up connections if peer2peer is on.
This is due to the fact that the NFS/RDMA client sets its ORD to 0.

If peer2peer is set, make sure the active side ORD is >= 1 and the
passive side IRD is >=1.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-20 13:53:15 -07:00
Linus Torvalds
0534c8cb5c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/nes: Add support for new SFP+ PHY
  RDMA/nes: Add wide_ppm_offset parm for switch compatibility
  RDMA/nes: Fix SFP+ PHY initialization
  RDMA/nes: Fix nes_nic_cm_xmit() error handling
  RDMA/nes: Fix error handling issues
  RDMA/nes: Fix incorrect casts on 32-bit architectures
  IPoIB: Document newish features
  RDMA/cma: Create cm id even when IB port is down
  RDMA/cma: Use rate from IPoIB broadcast when joining IPoIB multicast groups
  IPoIB: Avoid free_netdev() BUG when destroying a child interface
  mlx4_core: Don't leak mailbox for SET_PORT on Ethernet ports
  RDMA/cxgb3: Release dependent resources only when endpoint memory is freed.
  RDMA/cxgb3: Handle EEH events
  IB/mlx4: Use pgprot_writecombine() for BlueFlame pages
2009-04-09 16:42:26 -07:00
Roland Dreier
07306c0b98 Merge branches 'cma', 'cxgb3', 'ipoib', 'mlx4' and 'nes' into for-next 2009-04-08 14:28:21 -07:00
Chien Tung
4303565df4 RDMA/nes: Add support for new SFP+ PHY
Add new register settings for new SFP+ PHY/firmware.
Add new PHY to to nes_netdev_get/set_settings.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:27:56 -07:00
Chien Tung
a4849fc157 RDMA/nes: Add wide_ppm_offset parm for switch compatibility
We have observed unstable link with a new BNT switch.

Add wide_ppm_offset parameter to allow the user to control the clock
ppm offset on the CX4 interface for better compatibility.  Default is
100ppm, setting it to 1 will increase it to 300ppm.  Change default
SerDes1 reference clock to external source.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:27:18 -07:00
Chien Tung
1b9493248c RDMA/nes: Fix SFP+ PHY initialization
SFP+ PHY initialization has very long delays, incorrect settings for
direct attach copper cables, and inconsistent link detection.

Adjust delays to the minimum required by the PHY.  Worst case is now
less than 4 seconds.  Add new register settings for direct attach
cables.  Change link detection logic to use two new registers for more
consistent link state detection.  Reorganize code to shorten line
length.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:27:09 -07:00
Faisal Latif
5962c2c803 RDMA/nes: Fix nes_nic_cm_xmit() error handling
We are getting crash or hung situation when we are running network
cable pull tests during RDMA traffic.

In schedule_nes_timer(), we return an error if nes_nic_cm_xmit()
returns failure.  This is changed to success as skb is being put on
the timer routines to be processed later.  In send_syn() case, we are
indicating connect failure once from nes_connect() and the other when
the rexmit retries expires.

The other issue is skb->users which we are incrementing before calling
nes_nic_cm_xmit() which calls dev_queue_xmit() but in case of failure
we are decrementing the skb->users at the same time putting the skb on
the rexmit path.  Even if dev_queue_xmit() fails, the skb->users is
decremented already.  We are removing the decrement of skb->users in
case of failure from both schedule_nes_timer() as well as from
nes_cm_timer_tick().

There is also extra check in nes_cm_timer_tick() for rexmit failure
which does a break from the loop is removed.  This causes problem as
the other nodes have their cm_node->ref_count incremented and are not
processed.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:23:55 -07:00
Faisal Latif
79fc3d7410 RDMA/nes: Fix error handling issues
Fix issues found by static code analysis:

(1) Check if cm_node was successfully created for loopback connection.

(2) schedule_nes_timer() does not free up allocated memory after
    encountering an error.  There is a WARN_ON() for this condition.

(3) there is a cm_node->freed flag which is set but not used.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:22:20 -07:00
Don Wood
7a5efb62f6 RDMA/nes: Fix incorrect casts on 32-bit architectures
The were some incorrect casts to unsigned long that caused 64-bit values
to be truncated on 32-bit architectures and made the driver pass invalid
adresses and lengths to the hardware.  The problems were primarily seen
with kernels with highmem configured but some could show up in
non-highmem kernels, too.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:21:02 -07:00
Yossi Etigin
d2ca39f262 RDMA/cma: Create cm id even when IB port is down
When doing rdma_resolve_addr(), if the relevant IB port is down, the
function fails and the cm_id is not bound to the correct device.
Therefore, application does not have a device handle and cannot wait
for the port to become active.  The function fails because the
underlying IPoIB interface is not joined to the broadcast group and
therefore the SA does not have a multicast record to take a Q_Key
from.

The fix is to use lazy Q_Key resolution - cma_set_qkey() will set
id_priv->qkey if it was not set, and will be called just before the
Q_Key is really required.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 13:42:33 -07:00
Yang Hongyang
284901a90a dma-mapping: replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:11 -07:00
Yang Hongyang
6a35528a83 dma-mapping: replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)
Replace all DMA_64BIT_MASK macro with DMA_BIT_MASK(64)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:10 -07:00
Yossi Etigin
84adeee9aa RDMA/cma: Use rate from IPoIB broadcast when joining IPoIB multicast groups
When joining an IPoIB multicast group, use the same rate as in the
broadcast group.  Otherwise, if the RDMA CM creates this group before
IPoIB does, it might get a different rate.  This will cause IPoIB to
fail joining to the same group later on, because IPoIB uses strict
rate selection.

Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-01 13:55:32 -07:00
Roland Dreier
edb5abb1e2 IPoIB: Avoid free_netdev() BUG when destroying a child interface
We have to release the RTNL before calling free_netdev() so that the
device state has a chance to become NETREG_UNREGISTERED.  Otherwise
when removing a child interface, we hit the BUG() that tests the
device state in free_netdev().

Reported-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-31 10:22:32 -07:00
Steve Wise
874d8df5ed RDMA/cxgb3: Release dependent resources only when endpoint memory is freed.
The cxgb3 l2t entry, hwtid, and dst entry were being released before
all the iwch_ep references were released.  This can cause a crash in
t3_l2t_send_slow() and other places where the l2t entry is used.

The fix is to defer releasing these resources until all endpoint
references are gone.

Details:

- move flags field to the iwch_ep_common struct.
- add a flag indicating resources are to be released.
- release resources at endpoint free time instead of close/abort time.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-30 08:37:59 -07:00
Steve Wise
04b5d028f5 RDMA/cxgb3: Handle EEH events
- wrap calls into cxgb3 and fail them if we're in the middle
  of a PCI EEH event.

- correctly unwind and release endpoint and other resources when
  we are in an EEH event.

- dispatch IB_EVENT_DEVICE_FATAL event when cxgb3 notifies iw_cxgb3 of
  a fatal error.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-30 08:37:56 -07:00
Roland Dreier
e1d60ec669 IB/mlx4: Use pgprot_writecombine() for BlueFlame pages
The PAT work on x86 has finally made pgprot_writecombine() a usable API
for modular drivers.  As the comment indicates, this is exactly what we
want to use in mlx4_ib to map BlueFlame pages up to userspace, since
using WC for these pages improves small message latency significantly.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-30 08:31:05 -07:00
Linus Torvalds
d54b3538b0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (119 commits)
  [SCSI] scsi_dh_rdac: Retry for NOT_READY check condition
  [SCSI] mpt2sas: make global symbols unique
  [SCSI] sd: Make revalidate less chatty
  [SCSI] sd: Try READ CAPACITY 16 first for SBC-2 devices
  [SCSI] sd: Refactor sd_read_capacity()
  [SCSI] mpt2sas v00.100.11.15
  [SCSI] mpt2sas: add MPT2SAS_MINOR(221) to miscdevice.h
  [SCSI] ch: Add scsi type modalias
  [SCSI] 3w-9xxx: add power management support
  [SCSI] bsg: add linux/types.h include to bsg.h
  [SCSI] cxgb3i: fix function descriptions
  [SCSI] libiscsi: fix possbile null ptr session command cleanup
  [SCSI] iscsi class: remove host no argument from session creation callout
  [SCSI] libiscsi: pass session failure a session struct
  [SCSI] iscsi lib: remove qdepth param from iscsi host allocation
  [SCSI] iscsi lib: have lib create work queue for transmitting IO
  [SCSI] iscsi class: fix lock dep warning on logout
  [SCSI] libiscsi: don't cap queue depth in iscsi modules
  [SCSI] iscsi_tcp: replace scsi_debug/tcp_debug logging with iscsi conn logging
  [SCSI] libiscsi_tcp: replace tcp_debug/scsi_debug logging with session/conn logging
  ...
2009-03-28 13:30:43 -07:00
Roland Dreier
7c757eb9f8 RDMA/nes: Fix mis-merge
When net-next and infiniband were merged upstream, each branch deleted
one of a pair of adjacent lines from nes_nic.c, but when Linus fixed the
conflict up, he brought back both of the lines.  Fix up to the intended
final tree state.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-26 17:00:25 -07:00
Linus Torvalds
6671de344c Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
  posix timers: fix RLIMIT_CPU && fork()
  time: ntp: fix bug in ntp_update_offset() & do_adjtimex(), fix
  time: ntp: clean up second_overflow()
  time: ntp: simplify ntp_tick_adj calculations
  time: ntp: make 64-bit constants more robust
  time: ntp: refactor do_adjtimex() some more
  time: ntp: refactor do_adjtimex()
  time: ntp: fix bug in ntp_update_offset() & do_adjtimex()
  time: ntp: micro-optimize ntp_update_offset()
  time: ntp: simplify ntp_update_offset_fll()
  time: ntp: refactor and clean up ntp_update_offset()
  time: ntp: refactor up ntp_update_frequency()
  time: ntp: clean up ntp_update_frequency()
  time: ntp: simplify the MAX_TICKADJ_SCALED definition
  time: ntp: simplify the second_overflow() code flow
  time: ntp: clean up kernel/time/ntp.c
  x86: hpet: stop HPET_COUNTER when programming periodic mode
  x86: hpet: provide separate functions to stop and start the counter
  x86: hpet: print HPET registers during setup (if hpet=verbose is used)
  time: apply NTP frequency/tick changes immediately
  ...
2009-03-26 16:05:42 -07:00
Linus Torvalds
13220a94d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1750 commits)
  ixgbe: Allow Priority Flow Control settings to survive a device reset
  net: core: remove unneeded include in net/core/utils.c.
  e1000e: update version number
  e1000e: fix close interrupt race
  e1000e: fix loss of multicast packets
  e1000e: commonize tx cleanup routine to match e1000 & igb
  netfilter: fix nf_logger name in ebt_ulog.
  netfilter: fix warning in ebt_ulog init function.
  netfilter: fix warning about invalid const usage
  e1000: fix close race with interrupt
  e1000: cleanup clean_tx_irq routine so that it completely cleans ring
  e1000: fix tx hang detect logic and address dma mapping issues
  bridge: bad error handling when adding invalid ether address
  bonding: select current active slave when enslaving device for mode tlb and alb
  gianfar: reallocate skb when headroom is not enough for fcb
  Bump release date to 25Mar2009 and version to 0.22
  r6040: Fix second PHY address
  qeth: fix wait_event_timeout handling
  qeth: check for completion of a running recovery
  qeth: unregister MAC addresses during recovery.
  ...

Manually fixed up conflicts in:
	drivers/infiniband/hw/cxgb3/cxio_hal.h
	drivers/infiniband/hw/nes/nes_nic.c
2009-03-26 15:54:36 -07:00
Linus Torvalds
39b566eedb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (30 commits)
  RDMA/cxgb3: Enforce required firmware
  IB/mlx4: Unregister IB device prior to CLOSE PORT command
  mlx4_core: Add link type autosensing
  mlx4_core: Don't perform SET_PORT command for Ethernet ports
  RDMA/nes: Handle MPA Reject message properly
  RDMA/nes: Improve use of PBLs
  RDMA/nes: Remove LLTX
  RDMA/nes: Inform hardware that asynchronous event has been handled
  RDMA/nes: Fix tmp_addr compilation warning
  RDMA/nes: Report correct vendor_id and vendor_part_id
  RDMA/nes: Update copyright to new legal entity and year
  RDMA/nes: Account for freed PBL after HW operation
  IB: Remove useless ibdev_is_alive() tests from sysfs code
  IB/sa_query: Fix AH leak due to update_sm_ah() race
  IB/mad: Fix ib_post_send_mad() returning 0 with no generate send comp
  IB/mad: initialize mad_agent_priv before putting on lists
  IB/mad: Fix null pointer dereference in local_completions()
  IB/mad: Fix RMPP header RRespTime manipulation
  IB/iser: Remove hard setting of path MTU
  mlx4_core: Add device IDs for MT25458 10GigE devices
  ...
2009-03-26 15:47:08 -07:00
David S. Miller
08abe18af1 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/net/wimax/i2400m/usb-notif.c
2009-03-26 15:23:24 -07:00
Ingo Molnar
7c526e1fef Merge branches 'timers/new-apis', 'timers/ntp' and 'timers/urgent' into timers/core 2009-03-26 15:45:52 +01:00
Roland Dreier
09f98bafea Merge branches 'cxgb3', 'endian', 'ipath', 'ipoib', 'iser', 'mad', 'misc', 'mlx4', 'mthca', 'nes' and 'sysfs' into for-next 2009-03-24 20:44:41 -07:00
Steve Wise
d1fbe04eee RDMA/cxgb3: Enforce required firmware
The cxgb3 NIC driver can handle more firmware versions than iw_cxgb3,
and since commit 8207befa ("cxgb3: untie strict FW matching") cxgb3
will load with firmware versions that iw_cxgb3 can't handle.  The FW
major number indicates a specific interface between the FW and
iw_cxgb3.  Thus if the major number of the running firmware does not
match the required version compiled into iw_cxgb3, then iw_cxgb3 must
not register that device.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-24 20:44:18 -07:00
Stephen Hemminger
fe8114e8e1 infiniband: convert ipoib to net_device_ops
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 19:19:14 -07:00
Stephen Hemminger
d0929553be infiniband: convert nes driver to net_device_ops
Also, removed unnecessary memset() since alloc_netdev returns
zeroed memory.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 19:19:13 -07:00
Stephen Hemminger
687c75dcf3 infiniband: convert c2 to net_device_ops
Convert this driver to new net_device_ops infrastructure.
Also use default net_device get-stats infrastructure

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-21 19:19:13 -07:00
Yevgeny Petrilin
a6a47771b1 IB/mlx4: Unregister IB device prior to CLOSE PORT command
According to the ConnectX programmer's reference manual, all
operations should be stopped, all QPs should be torn down and all WQEs
flushed before the CLOSE_PORT command is invoked.  In some cases
reversing the order of operations (as implemented now) could cause
a loss of completions.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-18 19:49:54 -07:00
Mike Christie
5e7facb77f [SCSI] iscsi class: remove host no argument from session creation callout
We do not need to have llds set the host no for the session's
parent, because we know the session's parent is going to be
the host. This removes it from the session creation callback
and converts the drivers.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:29:39 -05:00
Mike Christie
4d1083509a [SCSI] iscsi lib: remove qdepth param from iscsi host allocation
The qdepth setting was useful when we needed libiscsi to verify
the setting. Now we just need to make sure if older tools
passed in zero then we need to set some default.

So this patch just has us use the sht->cmd_per_lun or if
for LLD does a host per session then we can set it on per
host basis.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:28:55 -05:00
Mike Christie
32ae763e3f [SCSI] iscsi lib: have lib create work queue for transmitting IO
We were using the shost work queue which ended up being
a little akward since all iscsi hosts need a thread for
scanning, but only drivers hooked into libiscsi need
a workqueue for transmitting. So this patch moves the
xmit workqueue to the lib.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:28:37 -05:00
Mike Christie
e28f3d5b51 [SCSI] libiscsi: don't cap queue depth in iscsi modules
There is no need to cap the queue depth in the modules. We set
this in userspace and can do that there. For performance testing
with ram based targets, this is helpful since we can have very
high queue depths.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:28:06 -05:00
Mike Christie
48a237a26d [SCSI] iser: have iser use its own logging
iser has its own logging inrfastrucutre. Convert it to use
it instead of libiscsi.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-03-13 15:26:51 -05:00
Faisal Latif
c12e56ef69 RDMA/nes: Don't allow userspace QPs to use STag zero
STag zero is a special STag that allows consumers to access any bus
address without registering memory.  The nes driver unfortunately
allows STag zero to be used even with QPs created by unprivileged
userspace consumers, which means that any process with direct verbs
access to the nes device can read and write any memory accessible to
the underlying PCI device (usually any memory in the system).  Such
access is usually given for cluster software such as MPI to use, so
this is a local privilege escalation bug on most systems running this
driver.

The driver was using STag zero to receive the last streaming mode
data; to allow STag zero to be disabled for unprivileged QPs, the
driver now registers a special MR for this data.

Cc: <stable@kernel.org>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-12 16:21:41 -07:00
Faisal Latif
9d5ab13325 RDMA/nes: Handle MPA Reject message properly
While doing testing, there are failures as MPA Reject call is not
handled.  To handle MPA Reject call, following changes are done:

*Handle inbound/outbound MPA Reject response message.
	When nes_reject() is called for pending MPA request reply,
	send the MPA Reject message to its peer (active
	side)cm_node. The peer cm_node (active side) will indicate
	Reject message event for the pending Connect Request.

*Handle MPA Reject response message for loopback connections and listener.
	When MPA Request is rejected, check if it is a loopback
	connection and if it is then it will send Reject message event
	to its peer loopback node. Also when destroying listener,
	check if the cm_nodes for that listener are loopback or not.

*Add gracefull connection close with the MPA Reject response message.
	Send gracefull close (FIN, FIN ACK..) to terminate the cm_nodes.

*Some code re-org while making the above changes.
	Removed recv_list and recv_list_lock from the cm_node
	structure as there can be only one receive close entry on the
	timer. Also implemented handle_recv_entry() as receive close
	entry is processed from both nes_rem_ref_cm_node() as well as
	nes_cm_timer_tick().

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:15:01 -08:00
Don Wood
0145f341a9 RDMA/nes: Improve use of PBLs
Two level 256 byte PBLs was not implemented so the driver could report
out of memory when in fact there were PBLs still available.

This solution prefers to use 4KB PBLs over two level 256B PBLs until
the number of 4KB PBLs falls below a threshold.  At this point the 4KB
PBL structure is converted to use 256B PBLs which prevents the driver
from running out of 4KB PBLs too quickly.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:15:00 -08:00
Faisal Latif
2869975cfb RDMA/nes: Remove LLTX
NETIF_F_LLTX is deprecated. Remove private TX locking from the driver
and remove the NETIF_F_LLTX feature flag.  This also fixes a warning
in some configs that comes from doing skb_linearize() call in the
hard_start_xmit method with IRQs disabled (if HIGHMEM is enabled,
skb_linearize() may end up enabling BHs, which is a no-no if hard IRQs
are disabled in that context).  By getting rid of LLTX, we do not
disable IRQs when skb_linearize() is called.

Remove the sq_lock as it is not needed for non-LLTX.  Fix ethtool not
to show the counter for sq_lock.

Reported-by: aluno3@poczta.onet.pl
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:11 -08:00
Don Wood
fd87778cb9 RDMA/nes: Inform hardware that asynchronous event has been handled
When asynchronous events are processed by software, it is necessary
to let the hardware know that software has handled the event.  This
frees up the entry in the asynchronous event queue.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:11 -08:00
Chien Tung
7b14ab0b43 RDMA/nes: Fix tmp_addr compilation warning
In find_node(), tmp_addr causes an "unused variable" warning when
INFINIBAND_NES_DEBUG is not defined.  It's only used in a nes_debug()
and the print does not make sense.  So take out the whole thing.

Reported-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:11 -08:00
Chien Tung
b9c367e7e6 RDMA/nes: Report correct vendor_id and vendor_part_id
ibv_devinfo displays 0 for vendor_id and vendor_part_id.  Fill in OUI
and device_id for those two fields.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:10 -08:00
Chien Tung
cd6853d3eb RDMA/nes: Update copyright to new legal entity and year
Update copyright to the new legal entity, Intel-NE, Inc., an Intel
company.  Update copyright for the new year.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:10 -08:00
Don Wood
dae5d13a7e RDMA/nes: Account for freed PBL after HW operation
Fix occurrences where the software PBL counts were changed before the
hardware was updated.  This bug allowed another thread to overallocate
the hardware resources.

Add proper PBL accounting in case nes_reg_mr() fails.

Signed-off-by: Don Wood <donald.e.wood@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-06 15:12:09 -08:00
Roland Dreier
6432f36684 IB: Remove useless ibdev_is_alive() tests from sysfs code
Some attribute show functions test ibdev_is_alive() to make sure that
it's OK to access device state.  However, the sysfs attributes will
not be registered until the device is fully initialized, and they'll
be unregistered before anything is torn down, so ibdev_is_alive()
doesn't do anything useful.  Remove it.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-03-04 15:22:39 -08:00