iser-target: Add iSCSI Extensions for RDMA (iSER) target driver

This patch adds support for iSCSI Extensions for RDMA target mode,
and includes CQ pooling per isert_device context distributed across
multiple active iser target sessions.

It also uses cmwq process context for RX / TX ib_post_cq() polling
via isert_cq_desc->cq_[rx,tx]_work invoked by isert_cq_[rx,tx]_callback()
hardIRQ context callbacks.

v5 changes:

- Use ISER_RECV_DATA_SEG_LEN instead of hardcoded value in ISER_RX_PAD_SIZE (Or)
- Fix make W=1 warnings (Or)
- Add missing depends on NET && INFINIBAND_ADDR_TRANS in Kconfig (Randy + Or)
- Make isert_device_find_by_ib_dev() return proper ERR_PTR (Wei Yongjun)
- Properly setup iscsi_np->np_sockaddr in isert_setup_np() (Shlomi + nab)
- Add special case for early ISCSI_OP_SCSI_CMD exception handling (nab)

v4 changes:
- Mark isert_cq_rx_work as static (Or)
- Drop unnecessary ib_dma_sync_single_for_cpu + ib_dma_sync_single_for_device
  calls for isert_cmd->sense_buf_dma from isert_put_response (Or)
- Use 12288 for ISER_RX_PAD_SIZE base to save extra page per
  struct iser_rx_desc (Or + nab)
- Drop now unnecessary isert_rx_desc usage, and convert RX users to
  iser_rx_desc (Or + nab)
- Move isert_[alloc,free]_rx_descriptors() ahead of
  isert_create_device_ib_res() usage (nab)
- Mark isert_cq_[rx,tx]_callback() + prototypes as static
- Fix 'warning: 'ret' may be used uninitialized' warning for
  isert_create_device_ib_res on powerpc allmodconfig (fengguang + nab)
- Fix 'warning: 'ret' may be used uninitialized' warning for
  isert_connect_request on i386 allyesconfig (fengguang + nab)
- Fix pr_debug conversion specification in isert_rx_completion()
  (fengguang + nab)
- Drop unnecessary isert_conn->conn_cm_id != NULL check in
  isert_connect_release causing the build warning:
  "variable dereferenced before check 'isert_conn->conn_cm_id'"
- Fix isert_lid + isert_np leak in isert_setup_np failure path
- Add isert_conn->conn_wait_comp_err usage in isert_free_conn()
  for isert_cq_comp_err completion path
- Add isert_conn->logout_posted bit to determine decrement of
  isert_conn->post_send_buf_count from logout response completion
- Always set ISER_CONN_DOWN from isert_disconnect_work() callback

v3 changes:

- Convert to use per isert_cq_desc->cq_[rx,tx]_work + drop tasklets (Or + nab)
- Move IB_EVENT_QP_LAST_WQE_REACHED warn into correct
  isert_qp_event_callback (Or)
- Drop unnecessary IB_ACCESS_REMOTE_* access flag usage in
  isert_create_device_ib_res (Or)
- Add common isert_init_send_wr(), and convert isert_put_* calls (Or)
- Move to verbs+core logic to single ib_isert.[c,h]  (Or + nab)
- Add kmem_cache isert_cmd_cache usage for descriptor allocation (nab)
- Move common ib_post_send() logic used by isert_put_*() to
  isert_post_response() (nab)
- Add isert_put_reject call in isert_response_queue() for posting
  ISCSI_REJECT response. (nab)
- Add ISTATE_SEND_REJECT checking in isert_do_control_comp. (nab)

v2 changes:

- Drop unused ISERT_ADDR_ROUTE_TIMEOUT define
- Add rdma_notify() call for IB_EVENT_COMM_EST in isert_qp_event_callback()
- Make isert_query_device() less verbose
- Drop unused RDMA_CM_EVENT_ADDR_ERROR and RDMA_CM_EVENT_ROUTE_ERROR
  cases from isert_cma_handler()
- Drop unused rdma/ib_fmr_pool.h include
- Update isert_conn_setup_qp() to assign cq based upon least used
- Add isert_create_device_ib_res() to setup PD, CQs and MRs for each
  underlying struct ib_device, instead of using per isert_conn resources.
- Add isert_free_device_ib_res() to release PD, CQs and MRs for each
  underlying struct ib_device.
- Add isert_device_find_by_ib_dev()
- Change isert_connect_request() to drop PD, CQs and MRs allocation,
  and use isert_device_find_by_ib_dev() instead.
- Add isert_device_try_release()
- Change isert_connect_release() to decrement cq_active_qps, and drop
  PD, CQs and MRs resource release.
- Update isert_connect_release() to call isert_device_try_release()
- Make isert_create_device_ib_res() determine device->cqs_used based
  upon num_online_cpus()
- Drop misleading isert_dump_ib_wc() usage
- Drop unused rdma/ib_fmr_pool.h include
- Use proper xfer_len for login PDUs in isert_rx_completion()
- Add isert_release_cmd() usage
- Change isert_alloc_cmd() to setup iscsi_cmd.release_cmd() pointer
- Change isert_put_cmd() to perform per iscsi_opcode specific release
  logic
- Add isert_unmap_cmd() call for ISCSI_OP_SCSI_CMD from isert_put_cmd()
- Change isert_send_completion() to call
  atomic_dec(&isert_conn->post_send_buf_count)
  based upon per iscsi_opcode logic
- Drop ISTATE_REMOVE processing from isert_immediate_queue()
- Drop ISTATE_SEND_DATAIN processing from isert_response_queue()
- Drop ISTATE_SEND_STATUS processing from isert_response_queue()
- Drop iscsit_transport->iscsit_unmap_cmd() and ->iscsit_free_cmd()
- Convert iser_cq_tx_tasklet() to use struct isert_cq_desc pooling logic
- Convert isert_cq_tx_callback() to use struct isert_cq_desc pooling
  logic
- Convert iser_cq_rx_tasklet() to use struct isert_cq_desc pooling logic
- Convert isert_cq_rx_callback() to use struct isert_cq_desc pooling
  logic
- Add explict iscsit_stop_dataout_timer() call to
  isert_do_rdma_read_comp()
- Use isert_get_dataout() for iscsit_transport->iscsit_get_dataout()
  caller
- Drop ISTATE_SEND_R2T processing from isert_immediate_queue()
- Drop unused rdma/ib_fmr_pool.h include
- Drop isert_cmd->cmd_kref in favor of se_cmd->cmd_kref usage
- Add struct isert_device in order to support multiple EQs + CQ pooling
- Add struct isert_cq_desc
- Drop tasklets and cqs from isert_conn
- Bump ISERT_MAX_CQ to 64
- Various minor checkpatch fixes

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This commit is contained in:
Nicholas Bellinger 2013-03-07 00:56:19 -08:00
parent 04b59babc0
commit b8d26b3be8
7 changed files with 2475 additions and 0 deletions

View File

@ -59,5 +59,6 @@ source "drivers/infiniband/ulp/srp/Kconfig"
source "drivers/infiniband/ulp/srpt/Kconfig"
source "drivers/infiniband/ulp/iser/Kconfig"
source "drivers/infiniband/ulp/isert/Kconfig"
endif # INFINIBAND

View File

@ -13,3 +13,4 @@ obj-$(CONFIG_INFINIBAND_IPOIB) += ulp/ipoib/
obj-$(CONFIG_INFINIBAND_SRP) += ulp/srp/
obj-$(CONFIG_INFINIBAND_SRPT) += ulp/srpt/
obj-$(CONFIG_INFINIBAND_ISER) += ulp/iser/
obj-$(CONFIG_INFINIBAND_ISERT) += ulp/isert/

View File

@ -0,0 +1,5 @@
config INFINIBAND_ISERT
tristate "iSCSI Extentions for RDMA (iSER) target support"
depends on INET && INFINIBAND_ADDR_TRANS && TARGET_CORE && ISCSI_TARGET
---help---
Support for iSCSI Extentions for RDMA (iSER) Target on Infiniband fabrics.

View File

@ -0,0 +1,2 @@
ccflags-y := -Idrivers/target -Idrivers/target/iscsi
obj-$(CONFIG_INFINIBAND_ISERT) += ib_isert.o

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,138 @@
#include <linux/socket.h>
#include <linux/in.h>
#include <linux/in6.h>
#include <rdma/ib_verbs.h>
#include <rdma/rdma_cm.h>
#define ISERT_RDMA_LISTEN_BACKLOG 10
enum isert_desc_type {
ISCSI_TX_CONTROL,
ISCSI_TX_DATAIN
};
enum iser_ib_op_code {
ISER_IB_RECV,
ISER_IB_SEND,
ISER_IB_RDMA_WRITE,
ISER_IB_RDMA_READ,
};
enum iser_conn_state {
ISER_CONN_INIT,
ISER_CONN_UP,
ISER_CONN_TERMINATING,
ISER_CONN_DOWN,
};
struct iser_rx_desc {
struct iser_hdr iser_header;
struct iscsi_hdr iscsi_header;
char data[ISER_RECV_DATA_SEG_LEN];
u64 dma_addr;
struct ib_sge rx_sg;
char pad[ISER_RX_PAD_SIZE];
} __packed;
struct iser_tx_desc {
struct iser_hdr iser_header;
struct iscsi_hdr iscsi_header;
enum isert_desc_type type;
u64 dma_addr;
struct ib_sge tx_sg[2];
int num_sge;
struct isert_cmd *isert_cmd;
struct ib_send_wr send_wr;
} __packed;
struct isert_rdma_wr {
struct list_head wr_list;
struct isert_cmd *isert_cmd;
enum iser_ib_op_code iser_ib_op;
struct ib_sge *ib_sge;
int num_sge;
struct scatterlist *sge;
int send_wr_num;
struct ib_send_wr *send_wr;
};
struct isert_cmd {
uint32_t read_stag;
uint32_t write_stag;
uint64_t read_va;
uint64_t write_va;
u64 sense_buf_dma;
u32 sense_buf_len;
u32 read_va_off;
u32 write_va_off;
u32 rdma_wr_num;
struct isert_conn *conn;
struct iscsi_cmd iscsi_cmd;
struct ib_sge *ib_sge;
struct iser_tx_desc tx_desc;
struct isert_rdma_wr rdma_wr;
struct work_struct comp_work;
};
struct isert_device;
struct isert_conn {
enum iser_conn_state state;
bool logout_posted;
int post_recv_buf_count;
atomic_t post_send_buf_count;
u32 responder_resources;
u32 initiator_depth;
u32 max_sge;
char *login_buf;
char *login_req_buf;
char *login_rsp_buf;
u64 login_req_dma;
u64 login_rsp_dma;
unsigned int conn_rx_desc_head;
struct iser_rx_desc *conn_rx_descs;
struct ib_recv_wr conn_rx_wr[ISERT_MIN_POSTED_RX];
struct iscsi_conn *conn;
struct list_head conn_accept_node;
struct completion conn_login_comp;
struct iser_tx_desc conn_login_tx_desc;
struct rdma_cm_id *conn_cm_id;
struct ib_pd *conn_pd;
struct ib_mr *conn_mr;
struct ib_qp *conn_qp;
struct isert_device *conn_device;
struct work_struct conn_logout_work;
wait_queue_head_t conn_wait;
wait_queue_head_t conn_wait_comp_err;
struct kref conn_kref;
};
#define ISERT_MAX_CQ 64
struct isert_cq_desc {
struct isert_device *device;
int cq_index;
struct work_struct cq_rx_work;
struct work_struct cq_tx_work;
};
struct isert_device {
int cqs_used;
int refcount;
int cq_active_qps[ISERT_MAX_CQ];
struct ib_device *ib_device;
struct ib_pd *dev_pd;
struct ib_mr *dev_mr;
struct ib_cq *dev_rx_cq[ISERT_MAX_CQ];
struct ib_cq *dev_tx_cq[ISERT_MAX_CQ];
struct isert_cq_desc *cq_desc;
struct list_head dev_node;
};
struct isert_np {
wait_queue_head_t np_accept_wq;
struct rdma_cm_id *np_cm_id;
struct mutex np_accept_mutex;
struct list_head np_accept_list;
struct completion np_login_comp;
};

View File

@ -0,0 +1,47 @@
/* From iscsi_iser.h */
struct iser_hdr {
u8 flags;
u8 rsvd[3];
__be32 write_stag; /* write rkey */
__be64 write_va;
__be32 read_stag; /* read rkey */
__be64 read_va;
} __packed;
/*Constant PDU lengths calculations */
#define ISER_HEADERS_LEN (sizeof(struct iser_hdr) + sizeof(struct iscsi_hdr))
#define ISER_RECV_DATA_SEG_LEN 8192
#define ISER_RX_PAYLOAD_SIZE (ISER_HEADERS_LEN + ISER_RECV_DATA_SEG_LEN)
#define ISER_RX_LOGIN_SIZE (ISER_HEADERS_LEN + ISCSI_DEF_MAX_RECV_SEG_LEN)
/* QP settings */
/* Maximal bounds on received asynchronous PDUs */
#define ISERT_MAX_TX_MISC_PDUS 4 /* NOOP_IN(2) , ASYNC_EVENT(2) */
#define ISERT_MAX_RX_MISC_PDUS 6 /* NOOP_OUT(2), TEXT(1), *
* SCSI_TMFUNC(2), LOGOUT(1) */
#define ISCSI_DEF_XMIT_CMDS_MAX 128 /* from libiscsi.h, must be power of 2 */
#define ISERT_QP_MAX_RECV_DTOS (ISCSI_DEF_XMIT_CMDS_MAX)
#define ISERT_MIN_POSTED_RX (ISCSI_DEF_XMIT_CMDS_MAX >> 2)
#define ISERT_INFLIGHT_DATAOUTS 8
#define ISERT_QP_MAX_REQ_DTOS (ISCSI_DEF_XMIT_CMDS_MAX * \
(1 + ISERT_INFLIGHT_DATAOUTS) + \
ISERT_MAX_TX_MISC_PDUS + \
ISERT_MAX_RX_MISC_PDUS)
#define ISER_RX_PAD_SIZE (ISER_RECV_DATA_SEG_LEN + 4096 - \
(ISER_RX_PAYLOAD_SIZE + sizeof(u64) + sizeof(struct ib_sge)))
#define ISER_VER 0x10
#define ISER_WSV 0x08
#define ISER_RSV 0x04
#define ISCSI_CTRL 0x10
#define ISER_HELLO 0x20
#define ISER_HELLORPLY 0x30