Commit Graph

304 Commits

Author SHA1 Message Date
Alex Elder 43bfe5de9f libceph: define osd data initialization helpers
Define and use functions that encapsulate the initializion of a
ceph_osd_data structure.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:18:06 -07:00
Alex Elder 6010a451c3 rbd: define inbound data size for method ops
When rbd creates an object request containing an object method call
operation it is passing 0 for the size.  I originally thought this
was because the length was not needed for method calls, but I think
it really should be supplied, to describe how much space is
available to receive response data.  So provide the supplied length.

This resolves:
    http://tracker.ceph.com/issues/4659

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:18:04 -07:00
Alex Elder fdce58ccb5 libceph: record length of bio list with bio
When assigning a bio pointer to an osd request, we don't have an
efficient way of knowing the total length bytes in the bio list.
That information is available at the point it's set up by the rbd
code, so record it with the osd data when it's set.

This and the next patch are related to maintaining the length of a
message's data independent of the message header, as described here:
    http://tracker.ceph.com/issues/4589

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:17:56 -07:00
Alex Elder 33803f3300 libceph: define source request op functions
The rbd code has a function that allocates and populates a
ceph_osd_req_op structure (the in-core version of an osd request
operation).  When reviewed, Josh suggested two things: that the
big varargs function might be better split into type-specific
functions; and that this functionality really belongs in the osd
client rather than rbd.

This patch implements both of Josh's suggestions.  It breaks
up the rbd function into separate functions and defines them
in the osd client module as exported interfaces.  Unlike the
rbd version, however, the functions don't allocate an osd_req_op
structure; they are provided the address of one and that is
initialized instead.

The rbd function has been eliminated and calls to it have been
replaced by calls to the new routines.  The rbd code now now use a
stack (struct) variable to hold the op rather than allocating and
freeing it each time.

For now only the capabilities used by rbd are implemented.
Implementing all the other osd op types, and making the rest of the
code use it will be done separately, in the next few patches.

Note that only the extent, cls, and watch portions of the
ceph_osd_req_op structure are currently used.  Delete the others
(xattr, pgls, and snap) from its definition so nobody thinks it's
actually implemented or needed.  We can add it back again later
if needed, when we know it's been tested.

This (and a few follow-on patches) resolves:
    http://tracker.ceph.com/issues/3861

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:17:45 -07:00
Alex Elder adfe695a25 ceph: move max constant definitions
Move some definitions for max integer values out of the rbd code and
into the more central "decode.h" header file.  These really belong
in a Linux (or libc) header somewhere, but I haven't gotten around
to proposing that yet.

This is in preparation for moving some code out of rbd.c and into
the osd client.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:17:42 -07:00
Alex Elder 175face2ba libceph: let osd ops determine request data length
The length of outgoing data in an osd request is dependent on the
osd ops that are embedded in that request.  Each op is encoded into
a request message using osd_req_encode_op(), so that should be used
to determine the amount of outgoing data implied by the op as it
is encoded.

Have osd_req_encode_op() return the number of bytes of outgoing data
implied by the op being encoded, and accumulate and use that in
ceph_osdc_build_request().

As a result, ceph_osdc_build_request() no longer requires its "len"
parameter, so get rid of it.

Using the sum of the op lengths rather than the length provided is
a valid change because:
    - The only callers of osd ceph_osdc_build_request() are
      rbd and the osd client (in ceph_osdc_new_request() on
      behalf of the file system).
    - When rbd calls it, the length provided is only non-zero for
      write requests, and in that case the single op has the
      same length value as what was passed here.
    - When called from ceph_osdc_new_request(), (it's not all that
      easy to see, but) the length passed is also always the same
      as the extent length encoded in its (single) write op if
      present.

This resolves:
    http://tracker.ceph.com/issues/4406

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:17:02 -07:00
Alex Elder e0c594878e libceph: record byte count not page count
Record the byte count for an osd request rather than the page count.
The number of pages can always be derived from the byte count (and
alignment/offset) but the reverse is not true.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:16:36 -07:00
Alex Elder 0fff87ec79 libceph: separate read and write data
An osd request defines information about where data to be read
should be placed as well as where data to write comes from.
Currently these are represented by common fields.

Keep information about data for writing separate from data to be
read by splitting these into data_in and data_out fields.

This is the key patch in this whole series, in that it actually
identifies which osd requests generate outgoing data and which
generate incoming data.  It's less obvious (currently) that an osd
CALL op generates both outgoing and incoming data; that's the focus
of some upcoming work.

This resolves:
    http://tracker.ceph.com/issues/4127

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:16:27 -07:00
Alex Elder 2ac2b7a6d4 libceph: distinguish page and bio requests
An osd request uses either pages or a bio list for its data.  Use a
union to record information about the two, and add a data type
tag to select between them.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:16:25 -07:00
Alex Elder 2794a82a11 libceph: separate osd request data info
Pull the fields in an osd request structure that define the data for
the request out into a separate structure.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:16:24 -07:00
Alex Elder 46faeed4a6 rbd: do a safe list traversal in rbd_img_request_submit()
It's possible that the reference to the object request dropped
inside the loop in rbd_img_request_submit() will be the last
one, in which case the content of the object pointer can't be
trusted.

Use a safe form of the object request list traversal to avoid
problems.

This resolves:
    http://tracker.ceph.com/issues/4705

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-04-17 11:39:09 -07:00
Alex Elder 6e2a4505db rbd: don't zero-fill non-image object requests
A result of ENOENT from a read request for an object that's part of
an rbd image indicates that there is a hole in that portion of the
image.  Similarly, a short read for such an object indicates that
the remainder of the read should be interpreted a full read with
zeros filling out the end of the request.

This behavior is not correct for objects that are not backing rbd
image data.  Currently rbd_img_obj_request_callback() assumes it
should be done for all objects.

Change rbd_img_obj_request_callback() so it only does this zeroing
for image objects.  Encapsulate that special handling in its own
function.  Add an assertion that the image object request is a bio
request, since we assume that (and we currently don't support any
other types).

This resolves a problem identified here:
    http://tracker.ceph.com/issues/4559

The regression was introduced by bf0d5f503d.

Reported-by: Dan van der Ster <dan@vanderster.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-off-by: Sage Weil <sage@inktank.com>
2013-03-29 11:32:07 -07:00
Sage Weil 1b83bef24c libceph: update osd request/reply encoding
Use the new version of the encoding for osd requests and replies.  In the
process, update the way we are tracking request ops and reply lengths and
results in the struct ceph_osd_request.  Update the rbd and fs/ceph users
appropriately.

The main changes are:
 - we keep pointers into the request memory for fields we need to update
   each time the request is sent out over the wire
 - we keep information about the result in an array in the request struct
   where the users can easily get at it.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
2013-02-26 15:02:50 -08:00
Alex Elder c47f937154 rbd: pass length, not op for osd completions
The only thing type-specific osd completion functions do with their
osd op parameter is (in some cases) extract the number of bytes
transferred from it.  In the other cases, the xferred bytes field
is not used, and total message data transfer byte count (which may
well be zero) is used.

Just set the object request transfer count in the main osd request
callback function and provide that to the other routines.  There is
then no longer any need to pass the op pointer to the type-specific
completion routines, so drop those parameters.

Stop doing anything with the total message data length.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-02-26 15:00:06 -08:00
Alex Elder 39bf2c5d09 rbd: move rbd_osd_trivial_callback()
This function is slightly out of place, probably the result
of an errant automatic merge or something.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-02-26 14:59:49 -08:00
Alex Elder cc344fa1b5 rbd: eliminate sparse warnings
Fengguang Wu reminded me that there were outstanding sparse reports
in the ceph and rbd code.  This patch fixes these problems in rbd
that lead to those reports:
    - Convert functions that are never referenced externally to have
      static scope.
    - Add a lockdep annotation to rbd_request_fn(), because it
      releases a lock before acquiring it again.

This partially resolves:
    http://tracker.ceph.com/issues/4184

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-25 15:37:08 -06:00
Alex Elder 37206ee5be rbd: normalize dout() calls
Add dout() calls to facilitate tracing of image and object requests.
Change a few existing calls so they use __func__ rather than the
hard-coded function name.  Have calls always add ":" after the name
of the function, and prefix pointer values with a consistent tag
indicating what it represents.  (Note that there remain some older
dout() calls that are left untouched by this patch.)

Issue a warning if rbd_osd_write_callback() ever gets a short write.

This resolves:
    http://tracker.ceph.com/issues/4235

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-25 15:36:56 -06:00
Alex Elder 632b88cade rbd: barriers are hard
Let's go shopping!

I'm afraid this may not have gotten it right:
    07741308  rbd: add barriers near done flag operations

The smp_wmb() should have been done *before* setting the done flag,
to ensure all other data was valid before marking the object request
done.

Switch to use atomic_inc_return() here to set the done flag, which
allows us to verify we don't mark something done more than once.
Doing this also implies general barriers before and after the call.

And although a read memory barrier might have been sufficient before
reading the done flag, convert this to a full memory barrier just
to put this issue to bed.

This resolves:
    http://tracker.ceph.com/issues/4238

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-25 15:36:50 -06:00
Alex Elder 4dda41d3d7 rbd: ignore zero-length requests
The old request code simply ignored zero-length requests.  We should
still operate that same way to avoid any changes in behavior.  We
can implement handling for special zero-length requests separately
(see http://tracker.ceph.com/issues/4236).

Add some assertions based on this new constraint.

This resolves:
    http://tracker.ceph.com/issues/4237

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-25 15:36:36 -06:00
Alex Elder 903bb32e89 libceph: drop return value from page vector copy routines
The return values provided for ceph_copy_to_page_vector() and
ceph_copy_from_page_vector() serve no purpose, so get rid of them.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-19 19:14:05 -06:00
Alex Elder 23ed6e13b3 rbd: ignore result of ceph_copy_from_page_vector()
The result of ceph_copy_from_page_vector() is simply the length
argument it is provided.

This is called by rbd_obj_method_sync(), which returns the result if
it's non-negative.  But we always either ignore or overwrite that
return value.  So explicitly ignore what's returned by the copy
function, and have rbd_obj_method_sync() always return either a
negative errno or 0.

We also return the result of ceph_copy_from_page_vector() in
rbd_obj_read_sync().  There we still want to return the number of
bytes transferred, but we can use the value we already have in hand
rather than what ceph_copy_from_page_vector() provides.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-19 19:14:05 -06:00
Alex Elder 1ceae7ef0f rbd: prevent bytes transferred overflow
In rbd_obj_read_sync(), verify the number of bytes transferred won't
exceed what can be represented by a size_t before using it to
indicate the number of bytes to copy to the result buffer.

(The real motivation for this is to prepare for the next patch.)

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-19 19:14:04 -06:00
Alex Elder fbfab53966 libceph: allow STAT osd operations
Add support for CEPH_OSD_OP_STAT operations in the osd client
and in rbd.

This operation sends no data to the osd; everything required is
encoded in identity of the target object.

The result will be ENOENT if the object doesn't exist.  If it does
exist and no other error occurs the server returns the size and last
modification time of the target object as output data (in little
endian format).  The size is a 64 bit unsigned and the time is
ceph_timespec structure (two unsigned 32-bit integers, representing
a seconds and nanoseconds value).

This resolves:
    http://tracker.ceph.com/issues/4007

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-19 19:14:03 -06:00
Alex Elder ef06f4d32a rbd: add parentheses to object request iterator macros
The for_each_obj_request*() macros should parenthesize their uses of
the ireq parameter.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-19 19:14:02 -06:00
Alex Elder 3c663bbdcd libceph: kill ceph_osdc_create_event() "one_shot" parameter
There is only one caller of ceph_osdc_create_event(), and it
provides 0 as its "one_shot" argument.  Get rid of that argument and
just use 0 in its place.

Replace the code in handle_watch_notify() that executes if one_shot
is nonzero in the event with a BUG_ON() call.

While modifying "osd_client.c", give handle_watch_notify() static
scope.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-18 12:20:00 -06:00
Alex Elder 077413082f rbd: add barriers near done flag operations
Somehow, I missed this little item in Documentation/atomic_ops.txt:
    *** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! ***

Create and use some helper functions that include the proper memory
barriers for manipulating the done field.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:11 -08:00
Alex Elder a14ea269dd rbd: turn off interrupts for open/remove locking
This commit:
    bc7a62ee5 rbd: prevent open for image being removed
added checking for removing rbd before allowing an open, and used
the same request spinlock for protecting that and updating the open
count as is used for the request queue.

However it used the non-irq protected version of the spinlocks.
Fix that.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:11 -08:00
Alex Elder 9cbb1d7268 libceph: don't require r_num_pages for bio requests
There is a check in the completion path for osd requests that
ensures the number of pages allocated is enough to hold the amount
of incoming data expected.

For bio requests coming from rbd the "number of pages" is not really
meaningful (although total length would be).  So stop requiring that
nr_pages be supplied for bio requests.  This is done by checking
whether the pages pointer is null before checking the value of
nr_pages.

Note that this value is passed on to the messenger, but there it's
only used for debugging--it's never used for validation.

While here, change another spot that used r_pages in a debug message
inappropriately, and also invalidate the r_con_filling_msg pointer
after dropping a reference to it.

This resolves:
    http://tracker.ceph.com/issues/3875

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:11 -08:00
Alex Elder 1e32d34cfa rbd: don't take extra bio reference for osd client
Currently, if the OSD client finds an osd request has had a bio list
attached to it, it drops a reference to it (or rather, to the first
entry on that list) when the request is released.

The code that added that reference (i.e., the rbd client) is
therefore required to take an extra reference to that first bio
structure.

The osd client doesn't really do anything with the bio pointer other
than transfer it from the osd request structure to outgoing (for
writes) and ingoing (for reads) messages.  So it really isn't the
right place to be taking or dropping references.

Furthermore, the rbd client already holds references to all bio
structures it passes to the osd client, and holds them until the
request is completed.  So there's no need for this extra reference
whatsoever.

So remove the bio_put() call in ceph_osdc_release_request(), as
well as its matching bio_get() call in rbd_osd_req_create().

This change could lead to a crash if old libceph.ko was used with
new rbd.ko.  Add a compatibility check at rbd initialization time to
avoid this possibilty.

This resolves:
    http://tracker.ceph.com/issues/3798    and
    http://tracker.ceph.com/issues/3799

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:11 -08:00
Alex Elder b82d167be6 rbd: prevent open for image being removed
An open request for a mapped rbd image can arrive while removal of
that mapping is underway.  We need to prevent such an open request
from succeeding.  (It appears that Maciej Galkiewicz ran into this
problem.)

Define and use a "removing" flag to indicate a mapping is getting
removed.  Set it in the remove path after verifying nothing holds
the device open.  And check it in the open path before allowing the
open to proceed.  Acquire the rbd device's lock around each of these
spots to avoid any races accessing the flags and open_count fields.

This addresses:
    http://tracker.newdream.net/issues/3427

Reported-by: Maciej Galkiewicz <maciejgalkiewicz@ragnarson.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:10 -08:00
Alex Elder 6d292906f8 rbd: define flags field, use it for exists flag
Define a new rbd device flags field, manipulated using bit
operations.  Replace the use of the current "exists" flag with a bit
in this new "flags" field.  Add a little commentary about the
"exists" flag, which does not need to be manipulated atomically.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:10 -08:00
Alex Elder 8eb8756530 rbd: don't drop watch requests on completion
When we register an osd request to linger, it means that request
will stay around (under control of the osd client) until we've
unregistered it.  We do that for an rbd image's header object, and
we keep a pointer to the object request associated with it.

Keep a reference to the watch object request for as long as it is
registered to linger.  Drop it again after we've removed the linger
registration.

This resolves:
    http://tracker.ceph.com/issues/3937

(Note: this originally came about because the osd client was
issuing a callback more than once.  But that behavior will be
changing soon, documented in tracker issue 3967.)

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:10 -08:00
Alex Elder 25dcf954c3 rbd: decrement obj request count when deleting
Decrement the obj_request_count value when deleting an object
request from its image request's list.  Rearrange a few lines
in the surrounding code.

This resolves:
    http://tracker.ceph.com/issues/3940

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:10 -08:00
Alex Elder 975241afcb rbd: track object rather than osd request for watch
Switch to keeping track of the object request pointer rather than
the osd request used to watch the rbd image header object.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:10 -08:00
Alex Elder 6977c3f983 rbd: unregister linger in watch sync routine
Move the code that unregisters an rbd device's lingering header
object watch request into rbd_dev_header_watch_sync(), so it
occurs in the same function that originally sets up that request.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:09 -08:00
Alex Elder 9f20e02a53 rbd: get rid of rbd_req_sync_exec()
Get rid rbd_req_sync_exec() because it is no longer used.  That
eliminates the last use of rbd_req_sync_op(), so get rid of that
too.  And finally, that leaves rbd_do_request() unreferenced, so get
rid of that.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:09 -08:00
Alex Elder 36be9a7618 rbd: implement sync method with new code
Reimplement synchronous object method calls using the new request
tracking code.  Use the name rbd_obj_method_sync()

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:09 -08:00
Alex Elder cf81b60e4b rbd: send notify ack asynchronously
When we receive notification of a change to an rbd image's header
object we need to refresh our information about the image (its
size and snapshot context).  Once we have refreshed our rbd image
we need to acknowledge the notification.

This acknowledgement was previously done synchronously, but there's
really no need to wait for it to complete.

Change it so the caller doesn't wait for the notify acknowledgement
request to complete.  And change the name to reflect it's no longer
synchronous.

This resolves:
    http://tracker.newdream.net/issues/3877

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:09 -08:00
Alex Elder 5ae9db81b4 rbd: get rid of rbd_req_sync_notify_ack()
Get rid rbd_req_sync_notify_ack() because it is no longer used.
As a result rbd_simple_req_cb() becomes unreferenced, so get rid
of that too.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:09 -08:00
Alex Elder b8d70035b3 rbd: use new code for notify ack
Use the new object request tracking mechanism for handling a
notify_ack request.

Move the callback function below the definition of this so we don't
have to do a pre-declaration.

This resolves:
    http://tracker.newdream.net/issues/3754

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:08 -08:00
Alex Elder ecf7a0318b rbd: get rid of rbd_req_sync_watch()
Get rid of rbd_req_sync_watch(), because it is no longer used.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:08 -08:00
Alex Elder 9969ebc5af rbd: implement watch/unwatch with new code
Implement a new function to set up or tear down a watch event
for an mapped rbd image header using the new request code.

Create a new object request type "nodata" to handle this.  And
define rbd_osd_trivial_callback() which simply marks a request done.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:08 -08:00
Alex Elder 86ea43bfcb rbd: get rid of rbd_req_sync_read()
Delete rbd_req_sync_read() is no longer used, so get rid of it.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:08 -08:00
Alex Elder 788e2df3b9 rbd: implement sync object read with new code
Reimplement the synchronous read operation used for reading a
version 1 header using the new request tracking code.  Name the
resulting function rbd_obj_read_sync() to better reflect that
it's a full object operation, not an object request.  To do this,
implement a new OBJ_REQUEST_PAGES object request type.

This implements a new mechanism to allow the caller to wait for
completion for an rbd_obj_request by calling rbd_obj_request_wait().

This partially resolves:
    http://tracker.newdream.net/issues/3755

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:08 -08:00
Alex Elder 7d250b949a rbd: kill rbd_req_coll and rbd_request
The two remaining callers of rbd_do_request() always pass a null
collection pointer, so the "coll" and "coll_index" parameters are
not needed.  There is no other use of that data structure, so it
can be eliminated.

Deleting them means there is no need to allocate a rbd_request
structure for the callback function.  And since that's the only use
of *that* structure, it too can be eliminated.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:07 -08:00
Alex Elder 2250a71b59 rbd: kill rbd_rq_fn() and all other related code
Now that the request function has been replaced by one using the new
request management data structures the old one can go away.
Deleting it makes rbd_dev_do_request() no longer needed, and
deleting that makes other functions unneeded, and so on.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:07 -08:00
Alex Elder bf0d5f503d rbd: new request tracking code
This patch fully implements the new request tracking code for rbd
I/O requests.

Each I/O request to an rbd image will get an rbd_image_request
structure allocated to track it.  This provides access to all
information about the original request, as well as access to the
set of one or more object requests that are initiated as a result
of the image request.

An rbd_obj_request structure defines a request sent to a single osd
object (possibly) as part of an rbd image request.  An rbd object
request refers to a ceph_osd_request structure built up to represent
the request; for now it will contain a single osd operation.  It
also provides space to hold the result status and the version of the
object when the osd request completes.

An rbd_obj_request structure can also stand on its own.  This will
be used for reading the version 1 header object, for issuing
acknowledgements to event notifications, and for making object
method calls.

All rbd object requests now complete asynchronously with respect
to the osd client--they supply a common callback routine.

This resolves:
    http://tracker.newdream.net/issues/3741

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-13 18:29:07 -08:00
Alex Elder c04306471a rbd: don't retry setting up header watch
When an rbd image is initially mapped a watch event is registered so
we can do something if the header object changes.

The code that does this currently loops if initiating the watch
request results in an ERANGE error.  The osds will never return
ERANGE, so there's no reason to do this loop, so get rid of it.

This resolves:
    http://tracker.newdream.net/issues/3860

Note that the problem this loop was intended to solve is a race
between collecting image header information and setting up the watch
on the header object.  The real fix for that problem is described
here:
    http://tracker.newdream.net/issues/3871

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-25 17:33:37 -06:00
Alex Elder 38901e0f24 rbd: check for overflow in rbd_get_num_segments()
The return type of rbd_get_num_segments() is int, but the values it
operates on are u64.  Although it's not likely, there's no guarantee
the result won't exceed what can be respresented in an int.  The
function is already designed to return -ERANGE on error, so just add
this possible overflow as another reason to return that.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-25 17:33:14 -06:00
Alex Elder 98571b5aa7 rbd: small changes
A few very minor changes to the rbd code:
    - RBD_MAX_OPT_LEN is unused, so get rid of it
    - Consolidate rbd options definitions
    - Make rbd_segment_name() return pointer to const char

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-01-25 17:32:46 -06:00