Commit Graph

22 Commits

Author SHA1 Message Date
Linus Torvalds 58e57fbd1c Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (41 commits)
  Revert "Seperate read and write statistics of in_flight requests"
  cfq-iosched: don't delay async queue if it hasn't dispatched at all
  block: Topology ioctls
  cfq-iosched: use assigned slice sync value, not default
  cfq-iosched: rename 'desktop' sysfs entry to 'low_latency'
  cfq-iosched: implement slower async initiate and queue ramp up
  cfq-iosched: delay async IO dispatch, if sync IO was just done
  cfq-iosched: add a knob for desktop interactiveness
  Add a tracepoint for block request remapping
  block: allow large discard requests
  block: use normal I/O path for discard requests
  swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL
  fs/bio.c: move EXPORT* macros to line after function
  Add missing blk_trace_remove_sysfs to be in pair with blk_trace_init_sysfs
  cciss: fix build when !PROC_FS
  block: Do not clamp max_hw_sectors for stacking devices
  block: Set max_sectors correctly for stacking devices
  cciss: cciss_host_attr_groups should be const
  cciss: Dynamically allocate the drive_info_struct for each logical drive.
  cciss: Add usage_count attribute to each logical drive in /sys
  ...
2009-10-04 12:39:14 -07:00
Philipp Reisner 5788c56891 dst/connector: Disallow unpliviged users to configure dst
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-02 10:54:13 -07:00
Philipp Reisner 7069331dbe connector: Provide the sender's credentials to the callback
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-02 10:54:01 -07:00
Christoph Hellwig c15227de13 block: use normal I/O path for discard requests
prepare_discard_fn() was being called in a place where memory allocation
was effectively impossible.  This makes it inappropriate for all but
the most trivial translations of Linux's DISCARD operation to the block
command set.  Additionally adding a payload there makes the ownership
of the bio backing unclear as it's now allocated by the device driver
and not the submitter as usual.

It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether
the queue supports discard operations or not.  blkdev_issue_discard now
allocates a one-page, sector-length payload which is the right thing
for the common ATA and SCSI implementations.

The mtd implementation of prepare_discard_fn() is replaced with simply
checking for the request being a discard.

Largely based on a previous patch from Matthew Wilcox <matthew@wil.cx>
which did the prepare_discard_fn but not the different payload allocation
yet.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-10-01 21:19:30 +02:00
Julia Lawall d0e0507ad6 Staging: dst: correct error-handling code
dst_state_alloc returns an ERR_PTR value in an error case instead of NULL.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@match exists@
expression x, E;
statement S1, S2;
@@

x = dst_state_alloc(...)
... when != x = E
(
*  if (x == NULL || ...) S1 else S2
|
*  if (x == NULL && ...) S1 else S2
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-15 12:02:06 -07:00
Linus Torvalds 355bbd8cb8 Merge branch 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.32' of git://git.kernel.dk/linux-2.6-block: (29 commits)
  block: use blkdev_issue_discard in blk_ioctl_discard
  Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads
  block: don't assume device has a request list backing in nr_requests store
  block: Optimal I/O limit wrapper
  cfq: choose a new next_req when a request is dispatched
  Seperate read and write statistics of in_flight requests
  aoe: end barrier bios with EOPNOTSUPP
  block: trace bio queueing trial only when it occurs
  block: enable rq CPU completion affinity by default
  cfq: fix the log message after dispatched a request
  block: use printk_once
  cciss: memory leak in cciss_init_one()
  splice: update mtime and atime on files
  block: make blk_iopoll_prep_sched() follow normal 0/1 return convention
  cfq-iosched: get rid of must_alloc flag
  block: use interrupts disabled version of raise_softirq_irqoff()
  block: fix comment in blk-iopoll.c
  block: adjust default budget for blk-iopoll
  block: fix long lines in block/blk-iopoll.c
  block: add blk-iopoll, a NAPI like approach for block devices
  ...
2009-09-14 17:55:15 -07:00
Jens Axboe 1f98a13f62 bio: first step in sanitizing the bio->bi_rw flag testing
Get rid of any functions that test for these bits and make callers
use bio_rw_flagged() directly. Then it is at least directly apparent
what variable and flag they check.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 14:33:31 +02:00
Mike Frysinger 0741241c6b connector: make callback argument type explicit
The connector documentation states that the argument to the callback
function is always a pointer to a struct cn_msg, but rather than encode it
in the API itself, it uses a void pointer everywhere.  This doesn't make
much sense to encode the pointer in documentation as it prevents proper C
type checking from occurring and can easily allow people to use the wrong
pointer type.  So convert the argument type to an explicit struct cn_msg
pointer.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-17 10:13:21 -07:00
Evgeniy Polyakov e333720166 Staging: DST: fix build dependancy
On Mon, Jan 19, 2009 at 10:12:24AM -0800, Randy Dunlap (randy.dunlap@oracle.com) wrote:
> 
> DST build fails when CONFIG_BLOCK=n:

DST should depend on block and block device, in the original patch its
kconfig entry was in the BLK_DEV menu, so this dependency was satisfied
automatically. Should attached patch be pushed into drivers/staging?

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:33 -07:00
Evgeniy Polyakov e5f753a5c7 Staging: DST: Kconfig text update.
Kconfig help text update

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:33 -07:00
Evgeniy Polyakov 30c7c1c630 Staging: DST: Do not allow empty barriers.
Do not allow empty barriers or generic_make_request() ->  scsi_setup_fs_cmnd()
will explode

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:33 -07:00
Evgeniy Polyakov e55b689268 Staging: DST: extend thread pool exit conditions.
Added thread pool exit condition into the thread_pool_del_worker(). If
called in parallel another thread can steal

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:33 -07:00
Evgeniy Polyakov 3e5510ab0c Staging: DST: optimize bio allocation.
Use bio prepend feature as suggested by Jens Axboe.

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:33 -07:00
Evgeniy Polyakov cac22275ac Staging: dst: kconfig update.
On Wed, Jan 14, 2009 at 02:05:33AM +0300, Evgeniy Polyakov (zbr@ioremap.net) wrote:
> --- /dev/null
> +++ b/drivers/staging/dst/Kconfig
> @@ -0,0 +1,71 @@
> +config DST
> +	tristate "Distributed storage"
> +	depends on NET && CRYPTO && SYSFS
> +	select CONNECTOR
> +	select LIBCRC32C

Above is not needed.

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:33 -07:00
Evgeniy Polyakov 9539bec7b7 Staging: dst: kconfig and makefile changes.
Signed-off-by: Evgeniy Polaykov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:32 -07:00
Greg Kroah-Hartman fdc4f2e95d staging: dst: replace bus_id with dev_set_name
bus_id is going away, use the dev_set_name() function instead.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:32 -07:00
Evgeniy Polyakov 1c93d9cf46 Staging: dst: crypto processing.
DST may fully encrypt the data channel in case of untrusted channel and implement
strong checksum of the transferred data. It is possible to configure algorithms
and crypto keys, they should match on both sides of the network channel.
Crypto processing does not introduce noticeble performance overhead, since DST
uses configurable pool of threads to perform crypto processing.

This patch introduces crypto processing helpers and crypto engine initialization:
glueing with the crypto layer, allocation and initialization of the crypto
processing thread pool, allocation of the cached pages, which are used to temporary
encrypt data into, since it is forbidden to encrypt data in-place, since pages
are used by the higher layers.

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:32 -07:00
Evgeniy Polyakov a0fde6e06e Staging: dst: transactions.
DST uses transaction model, when each store has to be explicitly acked
from the remote node to be considered as successfully written. There
may be lots of in-flight transactions. When remote host does not ack
the transaction it will be resent predefined number of times with specified
timeouts between them. All those parameters are configurable. Transactions
are marked as failed after all resends completed unsuccessfully, having
long enough resend timeout and/or large number of resends allows not to
return error to the higher (FS usually) layer in case of short network
problems or remote node outages. In case of network RAID setup this means
that storage will not degrade until transactions are marked as failed, and
thus will not force checksum recalculation and data rebuild. In case of
connection failure DST will try to reconnect to the remote node automatically.
DST sends ping commands at idle time to detect if remote node is alive.

Because of transactional model it is possible to use zero-copy sending
without worry of data corruption (which in turn could be detected by the
strong checksums though).

Transactions are handled in this patch: allocation/freeing/completion,
scanning for stall and to-be-resent transactions and overall management
of the storing tree.

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:32 -07:00
Evgeniy Polyakov 6694b31ac1 Staging: dst: thread pool.
Kernel currently does not allow to queue work into some entity which
will perform it in the process context and have simple way to extend
number of worker and work with them not as separate objects, but with
pool as a whole. So thread pool model was implemented in the DST.

Thread pool abstraction allows to schedule a work to be performed
on behalf of kernel thread. One does not operate with threads itself,
instead user provides setup and cleanup callbacks for thread pool itself,
and action and cleanup callbacks for each submitted work.

Each worker has private data initialized at creation time and data,
provided by user at scheduling time.

When action is being performed, thread can not be used by other users,
instead they will sleep until there is free thread to pick their work.

Thread pool is used for crypto processing of incoming and outgoing IO
requests to reduce the overall overhead.

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:32 -07:00
Evgeniy Polyakov 03b55b9ded Staging: dst: export node.
This patch introduces remote (export) node machinery: initialization
address/port (and other socket parameters), export block device (can be
another DST storage for example or local device like /dev/sda1), local
IO processing engine (BIO state machines, receiving/submitting logic).
Network management for the export node like accepting new client, scheduling
its command processing thread, receiving/sending IO requests, all are placed
here.

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:32 -07:00
Evgeniy Polyakov a2376185dc Staging: dst: network state machine.
Each DST device contains of two nodes: local and remote (called also as export node).
This patch contains local node processing engine: network state storage,
socket processing loops and state machine, socket polling machinery, reconnection
logic, send/receive basic helpers, related IO commands and so on.

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:32 -07:00
Evgeniy Polyakov ce0d9d7255 Staging: dst: core files.
This patch contains DST core files, which introduce
block layer, connector and sysfs registration glue and main headers.

Connector is used for the configuration of the node (its type, address,
device name and so on). Sysfs provides bits of information about running
devices in the following format:

+/*
+ * DST sysfs tree for device called 'storage':
+ *
+ * /sys/bus/dst/devices/storage/
+ * /sys/bus/dst/devices/storage/type : 192.168.4.80:1025
+ * /sys/bus/dst/devices/storage/size : 800
+ * /sys/bus/dst/devices/storage/name : storage
+ */

DST header contains structure definitions and protocol command description.

Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-04-03 14:53:32 -07:00