linux

Author	SHA1	Message	Date
David Teigland	222d396092	[DLM] fix master recovery If master recovery happens on an rsb in one recovery sequence, then that sequence is aborted before lock recovery happens, then in the next sequence, we rely on the previous master recovery (which may now be invalid due to another node ignoring a lookup result) and go on do to the lock recovery where we get stuck due to an invalid master value. recovery cycle begins: master of rsb X has left nodes A and B send node C an rcom lookup for X to find the new master C gets lookup from B first, sets B as new master, and sends reply back to B C gets lookup from A next, and sends reply back to A saying B is master A gets lookup reply from C and sets B as the new master in the rsb recovery cycle on A, B and C is aborted to start a new recovery B gets lookup reply from C and ignores it since there's a new recovery recovery cycle begins: some other node has joined B doesn't think it's the master of X so it doesn't rebuild it in the directory C looks up the master of X, no one is master, so it becomes new master B looks up the master of X, finds it's C A believes that B is the master of X, so it sends its lock to B B sends an error back to A A resends this repeats forever, the incorrect master value on A is never corrected The fix is to do master recovery on an rsb that still has the NEW_MASTER flag set from an earlier recovery sequence, and therefore didn't complete lock recovery. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:58 -05:00
David Teigland	a1bc86e6bd	[DLM] fix user unlocking When a user process exits, we clear all the locks it holds. There is a problem, though, with locks that the process had begun unlocking before it exited. We couldn't find the lkb's that were in the process of being unlocked remotely, to flag that they are DEAD. To solve this, we move lkb's being unlocked onto a new list in the per-process structure that tracks what locks the process is holding. We can then go through this list to flag the necessary lkb's when clearing locks for a process when it exits. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:55 -05:00
Patrick Caulfield	1d6e8131cf	[DLM] Use workqueues for dlm lowcomms This patch converts the DLM TCP lowcomms to use workqueues rather than using its own daemon functions. Simultaneously removing a lot of code and making it more scalable on multi-processor machines. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:52 -05:00
David Teigland	d200778e12	[DLM] expose dlm_config_info fields in configfs Make the dlm_config_info values readable and writeable via configfs entries. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:43 -05:00
David Teigland	99fc64874a	[DLM] add config entry to enable log_debug Add a new dlm_config_info field to enable log_debug output and change log_debug() to use it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:40 -05:00
David Teigland	68c817a1c4	[DLM] rename dlm_config_info fields Add a "ci_" prefix to the fields in the dlm_config_info struct so that we can use macros to add configfs functions to access them (in a later patch). No functional changes in this patch, just naming changes. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:37 -05:00
David Teigland	8ec6886748	[DLM] change some log_error to log_debug Some common, non-error messages should use log_debug instead of log_error so they can be turned off. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:34 -05:00
Patrick Caulfield	4edde74eed	[DLM] Fix spin lock already unlocked bug I just noticed this message when testing some other changes I'd made to lowcomms (to use workqueues) but the problem seems to be in the current git trees too. I'm amazed no-one has seen it. BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868 Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:21 -05:00
Patrick Caulfield	3fb4a251fe	[DLM] Fix schedule() calls I was a little over-enthusiastic turning schedule() calls int cond_sched() when fixing the DLM for Andrew Morton. These four should really be calls to schedule() or the dlm can busy-wait. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:18 -05:00
Adrian Bunk	927255f038	[DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions Remove the following unused functions: - lowcomms_send_message() - lowcomms_max_buffer_size() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:05 -05:00
David Teigland	075529b5e1	[DLM] fix lost flags in stub replies When the dlm fakes an unlock/cancel reply from a failed node using a stub message struct, it wasn't setting the flags in the stub message. So, in the process of receiving the fake message the lkb flags would be updated and cleared from the zero flags in the message. The problem observed in tests was the loss of the USER flag which caused the dlm to think a user lock was a kernel lock and subsequently fail an assertion checking the validity of the ast/callback field. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:02 -05:00
David Teigland	8d07fd509e	[DLM] fix receive_request() lvb copying LVB's are not sent as part of new requests, but the code receiving the request was copying data into the lvb anyway. The space in the message where it mistakenly thought the lvb lived actually contained the resource name, so it wound up incorrectly copying this name data into the lvb. Fix is to just create the lvb, not copy junk into it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:59 -05:00
David Teigland	da49f36f4f	[DLM] fix send_args() lvb copying The send_args() function is used to copy parameters into a message for a number different message types. Only some of those types are set up beforehand (in create_message) to include space for sending lvb data. send_args was wrongly copying the lvb for all message types as long as the lock had an lvb. This means that the lvb data was being written past the end of the message into unknown space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:56 -05:00
David Teigland	9e971b715d	[DLM] add version check Check if we receive a message from another lockspace member running a version of the dlm with an incompatible inter-node message protocol. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:53 -05:00
David Teigland	38aa8b0c59	[DLM] fix old rcom messages A reply to a recovery message will often be received after the relevant recovery sequence has aborted and the next recovery sequence has begun. We need to ignore replies to these old messages from the previous recovery. There's already a way to do this for synchronous recovery requests using the rc_id number, but not for async. Each recovery sequence already has a locally unique sequence number associated with it. This patch adds a field to the rcom (recovery message) structure where this recovery sequence number can be placed, rc_seq. When a node sends a reply to a recovery request, it copies the rc_seq number it received into rc_seq_reply. When the first node receives the reply to its recovery message, it will check whether rc_seq_reply matches the current recovery sequence number, ls_recover_seq, and if not then it ignores the old reply. An old, inadequate approach to filtering out old replies (checking if the current stage of recovery has moved back to the start) has been removed from two spots. The protocol version number is changed to reflect the different rcom structures. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:50 -05:00
David Teigland	dc200a8848	[DLM] fix resend rcom lock There's a chance the new master of resource hasn't learned it's the new master before another node sends it a lock during recovery. The node sending the lock needs to resend if this happens. - A sends a master lookup for resource R to C - B sends a master lookup for resource R to C - C receives A's lookup, assigns A to be master of R and sends a reply back to A - C receives B's lookup and sends a reply back to B saying that A is the master - B receives lookup reply from C and sends its lock for R to A - A receives lock from B, doesn't think it's the master of R and sends an error back to B - A receives lookup reply from C and becomes master of R - B gets error back from A and resends its lock back to A (this resending is what this patch does) - A receives lock from B, it now sees it's the master of R and takes the lock Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:47 -05:00
Patrick Caulfield	c80e7c83d5	[DLM] fix compile warning This patch fixes a compile warning in lowcomms-tcp.c indicating that kmem_cache_t is deprecated. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-12-15 12:51:22 -05:00
Linus Torvalds	1c1afa3c05	Merge master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * master.kernel.org:/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (73 commits) [DLM] Clean up lowcomms [GFS2] Change gfs2_fsync() to use write_inode_now() [GFS2] Fix indent in recovery.c [GFS2] Don't flush everything on fdatasync [GFS2] Add a comment about reading the super block [GFS2] Mount problem with the GFS2 code [GFS2] Remove gfs2_check_acl() [DLM] fix format warnings in rcom.c and recoverd.c [GFS2] lock function parameter [DLM] don't accept replies to old recovery messages [DLM] fix size of STATUS_REPLY message [GFS2] fs/gfs2/log.c:log_bmap() fix printk format warning [DLM] fix add_requestqueue checking nodes list [GFS2] Fix recursive locking in gfs2_getattr [GFS2] Fix recursive locking in gfs2_permission [GFS2] Reduce number of arguments to meta_io.c:getbuf() [GFS2] Move gfs2_meta_syncfs() into log.c [GFS2] Fix journal flush problem [GFS2] mark_inode_dirty after write to stuffed file [GFS2] Fix glock ordering on inode creation ...	2006-12-07 09:13:20 -08:00
Christoph Lameter	e18b890bb0	[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name ".c" -o -name ".h"\|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:25 -08:00
Patrick Caulfield	ac33d07105	[DLM] Clean up lowcomms This fixes up most of the things pointed out by akpm and Pavel Machek with comments below indicating why some things have been left: Andrew Morton wrote: > >> +static struct nodeinfo nodeid2nodeinfo(int nodeid, gfp_t alloc) >> +{ >> + struct nodeinfo ni; >> + int r; >> + int n; >> + >> + down_read(&nodeinfo_lock); > > Given that this function can sleep, I wonder if `alloc' is useful. > > I see lots of callers passing in a literal "0" for `alloc'. That's in fact > a secret (GFP_ATOMIC & ~__GFP_HIGH). I doubt if that's what you really > meant. Particularly as the code could at least have used __GFP_WAIT (aka > GFP_NOIO) which is much, much more reliable than "0". In fact "0" is the > least reliable mode possible. > > IOW, this is all bollixed up. When 0 is passed into nodeid2nodeinfo the function does not try to allocate a new structure at all. it's an indication that the caller only wants the nodeinfo struct for that nodeid if there actually is one in existance. I've tidied the function itself so it's more obvious, (and tidier!) >> +/* Data received from remote end / >> +static int receive_from_sock(void) >> +{ >> + int ret = 0; >> + struct msghdr msg; >> + struct kvec iov[2]; >> + unsigned len; >> + int r; >> + struct sctp_sndrcvinfo sinfo; >> + struct cmsghdr cmsg; >> + struct nodeinfo ni; >> + >> + /* These two are marginally too big for stack allocation, but this >> + * function is (currently) only called by dlm_recvd so static should be >> + * OK. >> + / >> + static struct sockaddr_storage msgname; >> + static char incmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; > > whoa. This is globally singly-threaded code?? Yes. it is only ever run in the context of dlm_recvd. >> >> +static void initiate_association(int nodeid) >> +{ >> + struct sockaddr_storage rem_addr; >> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; > > Another static buffer to worry about. Globally singly-threaded code? Yes. Only ever called by dlm_sendd. >> + >> +/ Send a message / >> +static int send_to_sock(struct nodeinfo ni) >> +{ >> + int ret = 0; >> + struct writequeue_entry e; >> + int len, offset; >> + struct msghdr outmsg; >> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))]; > > Singly-threaded? Yep. >> >> +static void dealloc_nodeinfo(void) >> +{ >> + int i; >> + >> + for (i=1; i<=max_nodeid; i++) { >> + struct nodeinfo ni = nodeid2nodeinfo(i, 0); >> + if (ni) { >> + idr_remove(&nodeinfo_idr, i); > > Didn't that need locking? Not. it's only ever called at DLM shutdown after all the other threads have been stopped. >> >> +static int write_list_empty(void) >> +{ >> + int status; >> + >> + spin_lock_bh(&write_nodes_lock); >> + status = list_empty(&write_nodes); >> + spin_unlock_bh(&write_nodes_lock); >> + >> + return status; >> +} > > This function's return value is meaningless. As soon as the lock gets > dropped, the return value can get out of sync with reality. > > Looking at the caller, this _might_ happen to be OK, but it's a nasty and > dangerous thing. Really the locking should be moved into the caller. It's just an optimisation to allow the caller to schedule if there is no work to do. if something arrives immediately afterwards then it will get picked up when the process re-awakes (and it will be woken by that arrival). The 'accepting' atomic has gone completely. as Andrew pointed out it didn't really achieve much anyway. I suspect it was a plaster over some other startup or shutdown bug to be honest. Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Pavel Machek <pavel@ucw.cz>	2006-12-07 09:25:13 -05:00
Ryusuke Konishi	57adf7eede	[DLM] fix format warnings in rcom.c and recoverd.c This fixes the following gcc warnings generated on the architectures where uint64_t != unsigned long long (e.g. ppc64). fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'uint64_t' fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'uint64_t' fs/dlm/recoverd.c:48: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' fs/dlm/recoverd.c:202: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' fs/dlm/recoverd.c:210: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t' Signed-off-by: Ryusuke Konishi <ryusuke@osrg.net> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:37:22 -05:00
David Teigland	98f176fb32	[DLM] don't accept replies to old recovery messages We often abort a recovery after sending a status request to a remote node. We want to ignore any potential status reply we get from the remote node. If we get one of these unwanted replies, we've often moved on to the next recovery message and incremented the message sequence counter, so the reply will be ignored due to the seq number. In some cases, we've not moved on to the next message so the seq number of the reply we want to ignore is still correct, causing the reply to be accepted. The next recovery message will then mistake this old reply as a new one. To fix this, we add the flag RCOM_WAIT to indicate when we can accept a new reply. We clear this flag if we abort recovery while waiting for a reply. Before the flag is set again (to allow new replies) we know that any old replies will be rejected due to their sequence number. We also initialize the recovery-message sequence number to a random value when a lockspace is first created. This makes it clear when messages are being rejected from an old instance of a lockspace that has since been recreated. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:37:14 -05:00
David Teigland	1babdb4531	[DLM] fix size of STATUS_REPLY message When the not_ready routine sends a "fake" status reply with blank status flags, it needs to use the correct size for a normal STATUS_REPLY by including the size of the would-be config parameters. We also fill in the non-existant config parameters with an invalid lvblen value so it's easier to notice if these invalid paratmers are ever being used. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:37:08 -05:00
David Teigland	2896ee37cc	[DLM] fix add_requestqueue checking nodes list Requests that arrive after recovery has started are saved in the requestqueue and processed after recovery is done. Some of these requests are purged during recovery if they are from nodes that have been removed. We move the purging of the requests (dlm_purge_requestqueue) to later in the recovery sequence which allows the routine saving requests (dlm_add_requestqueue) to avoid filtering out requests by nodeid since the same will be done by the purge. The current code has add_requestqueue filtering by nodeid but doesn't hold any locks when accessing the list of current nodes. This also means that we need to call the purge routine when the lockspace is being shut down since the add routine will not be rejecting requests itself any more. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:37:00 -05:00
Patrick Caulfield	b98c95af01	[DLM] Fix DLM config The attached patch fixes the DLM config so that it selects the chosen network transport. It should fix the bug where DLM can be left selected when NET gets unselected. This incorporates all the comments received about this patch. Cc: Adrian Bunk <bunk@stusta.de> Cc: Andrew Morton <akpm@osdl.org> Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:41 -05:00
David Teigland	6f90a8b1b8	[DLM] clear sbflags on lock master RH BZ 211622 The ALTMODE flag can be set in the lock master's copy of the lock but never cleared, so ALTMODE will also be returned in a subsequent conversion of the lock when it shouldn't be. This results in lock_dlm incorrectly switching to the alternate lock mode when returning the result to gfs which then asserts when it sees the wrong lock state. The fix is to propagate the cleared sbflags value to the master node when the lock is requested. QA's d_rwrandirectlarge test triggers this bug very quickly. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:27 -05:00
David Teigland	4b77f2c93d	[DLM] do full recover_locks barrier Red Hat BZ 211914 The previous patch "[DLM] fix aborted recovery during node removal" was incomplete as discovered with further testing. It set the bit for the RS_LOCKS barrier but did not then wait for the barrier. This is often ok, but sometimes it will cause yet another recovery hang. If it's a new node that also has the lowest nodeid that skips the barrier wait, then it misses the important step of collecting and reporting the barrier status from the other nodes (which is the job of the low nodeid in the barrier wait routine). Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:24 -05:00
David Teigland	2cdc98aaf0	[DLM] fix stopping unstarted recovery Red Hat BZ 211914 When many nodes are joining a lockspace simultaneously, the dlm gets a quick sequence of stop/start events, a pair for adding each node. dlm_controld in user space sends dlm_recoverd in the kernel each stop and start event. dlm_controld will sometimes send the stop before dlm_recoverd has had a chance to take up the previously queued start. The stop aborts the processing of the previous start by setting the RECOVERY_STOP flag. dlm_recoverd is erroneously clearing this flag and ignoring the stop/abort if it happens to take up the start after the stop meant to abort it. The fix is to check the sequence number that's incremented for each stop/start before clearing the flag. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:16 -05:00
David Teigland	91c0dc93a1	[DLM] fix aborted recovery during node removal Red Hat BZ 211914 With the new cluster infrastructure, dlm recovery for a node removal can be aborted and restarted for a node addition. When this happens, the restarted recovery isn't aware that it's doing recovery for the earlier removal as well as the addition. So, it then skips the recovery steps only required when nodes are removed. This can result in locks not being purged for failed/removed nodes. The fix is to check for removed nodes for which recovery has not been completed at the start of a new recovery sequence. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:13 -05:00
David Teigland	d4400156d4	[DLM] fix requestqueue race Red Hat BZ 211914 There's a race between dlm_recoverd (1) enabling locking and (2) clearing out the requestqueue, and dlm_recvd (1) checking if locking is enabled and (2) adding a message to the requestqueue. An order of recoverd(1), recvd(1), recvd(2), recoverd(2) will result in a message being left on the requestqueue. The fix is to have dlm_recvd check if dlm_recoverd has enabled locking after taking the mutex for the requestqueue and if it has processing the message instead of queueing it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:10 -05:00
David Teigland	435618b75b	[DLM] status messages ping-pong between unmounted nodes Red Hat BZ 213682 If two nodes leave the lockspace (while unmounting the fs in the case of gfs) after one has sent a STATUS message to the other, STATUS/STATUS_REPLY messages will then ping-pong between the nodes when neither of them can find the lockspace in question any longer. We kill this by not sending another STATUS message when we get a STATUS_REPLY for an unknown lockspace. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:06 -05:00
David Teigland	5206980964	[DLM] res_recover_locks_count not reset when recover_locks is aborted Red Hat BZ 213684 If a node sends an lkb to the new master (RCOM_LOCK message) during recovery and recovery is then aborted on both nodes before it gets a reply, the res_recover_locks_count needs to be reset to 0 so that when the subsequent recovery comes along and sends the lkb to the new master again the assertion doesn't trigger that checks that counter is zero. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:03 -05:00
Patrick Caulfield	fdda387f73	[DLM] Add support for tcp communications The following patch adds a TCP based communications layer to the DLM which is compile time selectable. The existing SCTP layer gives the advantage of allowing multihoming, whereas the TCP layer has been heavily tested in previous versions of the DLM and is known to be robust and therefore can be used as a baseline for performance testing. Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-30 10:35:00 -05:00
Patrick Caulfield	e2de7f5655	[DLM] fix oops in kref_put when removing a lockspace Now that the lockspace struct is freed when the last sysfs object is released this patch prevents use of that lockspace by sysfs. We attempt to re-get the lockspace from the lockspace list and fail the request if it has been removed. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-06 09:28:01 -05:00
Patrick Caulfield	ba542e3b92	[DLM] Fix kref_put oops This patch fixes the recounting on the lockspace kobject. Previously the lockspace was freed while userspace could have had a reference to one of its sysfs files, causing an oops in kref_put. Now the lockspace kfree is moved into the kobject release() function Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-11-06 09:01:07 -05:00
Patrick Caulfield	42fb00838a	[DLM] fix iovec length in recvmsg I didn't spot that the msg_iovlen was set to 2 if there were two elements in the iovec but left at zero if not :( I think this might be why bob was still seeing trouble. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-20 09:13:10 -04:00
Patrick Caulfield	4c5e1b1a8c	[DLM] fix iovec length in recvmsg The DLM always passes the iovec length as 1, this is wrong when the circular buffer wraps round. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-12 17:11:33 -04:00
Adrian Bunk	1ee48af22e	[DLM] Kconfig: don't show an empty DLM menu Don't show an empty "Distributed Lock Manager" menu if IP_SCTP=n. Reported by Dmytro Bagrii in kernel Bugzilla #7268. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-10-12 17:10:35 -04:00
Al Viro	38d6fd26ea	[PATCH] dlm gfp_t annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-09 14:19:08 -07:00
Theodore Ts'o	bba9dfd835	[GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs) The following patches reduce the size of the VFS inode structure by 28 bytes on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction in the inode size on a UP kernel that is configured in a production mode (i.e., with no spinlock or other debugging functions enabled; if you want to save memory taken up by in-core inodes, the first thing you should do is disable the debugging options; they are responsible for a huge amount of bloat in the VFS inode structure). This patch: The filesystem or device-specific pointer in the inode is inside a union, which is pretty pointless given that all 30+ users of this field have been using the void pointer. Get rid of the union and rename it to i_private, with a comment to explain who is allowed to use the void pointer. This is just a cleanup, but it allows us to reuse the union 'u' for something something where the union will actually be used. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org>	2006-09-28 08:32:24 -04:00
Steven Whitehouse	907b9bceb4	[GFS2/DLM] Fix trailing whitespace As per Andrew Morton's request, removed trailing whitespace. Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-25 09:26:04 -04:00
David Teigland	fa9f0e4925	[DLM] confirm master for recovered waiting requests Fixing the following scenario: - A request is on the waiters list waiting for a reply from a remote node. - The request is the first one on the resource, so first_lkid is set. - The remote node fails causing recovery. - During recovery the requesting node becomes master. - The request is now processed locally instead of being a remote operation. - At this point we need to call confirm_master() on the resource since we're certain we're now the master node. This will clear first_lkid. - We weren't calling confirm_master(), so first_lkid was not being cleared causing subsequent requests on that resource to get stuck. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-08 17:00:12 -04:00
David Teigland	a1d144c71d	[DLM] use snprintf in sysfs show Use snprintf(buf, PAGE_SIZE, ...) instead of sprintf in sysfs show methods. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-09-07 09:44:01 -04:00
David Teigland	c6e6f0ba8f	[DLM] force removal of user lockspace Check if the FORCEFREE flag has been provided from user space. If so, set the force option to dlm_release_lockspace() so that any remaining locks will be freed. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-31 12:15:37 -04:00
David Teigland	5f88f1ea16	[DLM] add new lockspace to list ealier When a new lockspace was being created, the recoverd thread was being started for it before the lockspace was added to the global list of lockspaces. The new thread was looking up the lockspace in the global list and sometimes not finding it due to the race with the original thread adding it to the list. We need to add the lockspace to the global list before starting the thread instead of after, and if the new thread can't find the lockspace for some reason, it should return an error. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-25 10:02:53 -04:00
David Teigland	233e515f40	[DLM] recover_locks not clearing NEW_MASTER flag When there are no locks on a resource, the recover_locks() function fails to clear the NEW_MASTER flag by going directly to out, missing the line that clears the flag. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-24 09:38:19 -04:00
David Teigland	f5888750aa	[DLM] sequence number missing in not_ready reply When a status reply is sent for a lockspace that doesn't yet exist, the message sequence number from the sender was not being copied into the reply causing the sender to ignore the reply. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-24 09:37:43 -04:00
David Teigland	32f105a123	[DLM] down conversion clearing flags The down-conversion optimization was resulting in the lkb flags being cleared because the stub message reply had no flags value set. Copy the current flags into the stub message so they'll be copied back into the lkb as part of processing the fake reply. Also add an assertion to catch this error more directly if it exists elsewhere. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-23 16:07:31 -04:00
Patrick Caulfield	c059f70e35	[DLM] down conversion clearing flags Oh, and here's (hopefully) the last of these ua_tmp patches. I think I've caught all the paths now. Sorry it didn't make the last one. Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-23 10:33:06 -04:00
Patrick Caulfield	10948eb4ed	[DLM] preserve lksb address in user conversions This patch fixes bz#203444 where the LKSB was lost during userland conversion operations Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-23 09:55:40 -04:00
David Teigland	a345da3e8f	[DLM] dump rsb and locks on assert Introduce new function dlm_dump_rsb() to call within assertions instead of dlm_print_rsb(). The new function dumps info about all locks on the rsb in addition to rsb details. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-21 09:50:09 -04:00
David Teigland	fcc8abc8d4	[DLM] move kmap to after spin_unlock Doing the kmap() while holding the spinlock was causing recursive spinlock problems. It seems the kmap was scheduling, although there was no warning as I'd expect. Patrick, do we need locking around the kmap? Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-11 09:44:00 -04:00
David Teigland	4a99c3d9d6	[DLM] reject replies to old requests When recoveries are aborted by other recoveries we can get replies to status or names requests that we've given up on. This can cause problems if we're making another request and receive an old reply. Add a sequence number to status/names requests and reject replies that don't match. A field already exists for the seq number that's used in other message types. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-09 17:32:07 -04:00
David Teigland	faa0f26772	[DLM] show nodeid for recovery message To aid debugging, it's useful to be able to see what nodeid the dlm is waiting on for a message reply. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-09 09:46:38 -04:00
David Teigland	06442440bc	[DLM] break from snprintf loop When the debug buffer has filled up, break from the loop and return the correct number of bytes that have been written. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-09 09:46:04 -04:00
David Teigland	f6db1b8e72	[DLM] abort recovery more quickly When we abort one recovery to do another, break out of the ping_members() routine more quickly, and wake up the dlm_recoverd thread more quickly instead of waiting for it to time out. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-09 09:45:31 -04:00
David Teigland	5ff519112a	[DLM] print bad length in assertion Print the violating name length in the assertion. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-09 09:44:54 -04:00
Patrick Caulfield	cc346d555f	[DLM] fix userland unlock This patch fixes the userland DLM unlock code so that it correctly returns the address of the userland lock status block in its completion AST. It fixes bug #201348 Patrick Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-08-08 10:34:40 -04:00
David Teigland	ae4a382004	[DLM] fix i_private > I think you must have an old version of the base kernel as well? > i_private no longer exists in struct inode, so you'll have to use > something else, I have that patch in my stack but didn't send it; for some reason I thought it was already changed in your git tree. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-26 15:31:15 -04:00
David Teigland	20abf975f7	[DLM] fix broken patches On Wed, Jul 26, 2006 at 10:47:14AM +0100, Steven Whitehouse wrote: > Hi, > > I've applied all the patches you sent, but they don't build: Argh, sorry about that... when I fixed these a long time ago they somehow never got included in the quilt patches. I mistakenly assumed the quilt patches matched the source I had in front of me. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-26 14:42:05 -04:00
David Teigland	81456807a3	[DLM] schedule during long loop through locks The loop through all waiting locks in recover_waiters can potentially be long, so we should schedule explicitly. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-26 08:42:57 -04:00
David Teigland	2b4e926aab	[DLM] fix loop in grant_after_purge The loop in grant_after_purge is intended to find all rsb's in each hash bucket that have the LOCKS_PURGED flag set. The loop was quitting the current bucket after finding just one rsb instead of going until there are no more. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-26 08:42:26 -04:00
David Teigland	f7da790d74	[DLM] set purged flag on rsbs If a node becomes the new master of an rsb during recovery, the LOCKS_PURGED flag needs to be set on it so that any waiting/converting locks will try to be granted. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-26 08:42:01 -04:00
David Teigland	5de6319b18	[DLM] more info through debugfs Display more information from debugfs, particularly locks waiting for a master lookup or operations waiting for a remote reply. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-26 08:41:37 -04:00
David Teigland	3609819818	[DLM] fix whitespace damage My previous dlm patch added trailing whitespace damage, fix that. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-21 01:55:41 -04:00
David Teigland	34e22bed19	[DLM] fix leaking user locks User NOQUEUE lock requests to a remote node that failed with -EAGAIN were never being removed from a process's list of locks. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-20 00:11:15 -04:00
Adrian Bunk	3b4a0a7494	[DLM] [RFC: -mm patch] fs/dlm/lock.c: unexport dlm_lvb_operations On Thu, Jul 13, 2006 at 10:48:00PM -0700, Andrew Morton wrote: >... > Changes since 2.6.18-rc1-mm1: >... > git-gfs2.patch >... > git trees. >... This patch removes the unused EXPORT_SYMBOL_GPL(dlm_lvb_operations). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-20 00:09:09 -04:00
David Teigland	597d0cae0f	[DLM] dlm: user locks This changes the way the dlm handles user locks. The core dlm is now aware of user locks so they can be dealt with more efficiently. There is no more dlm_device module which previously managed its own duplicate copy of every user lock. Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-07-13 09:25:34 -04:00
Patrick Caulfield	9f5aa2a921	[DLM] Fix potential conflict in DLM userland locks Just spotted this one. The lockinfo structs are hashed by lockid but into a global structure. So that if there are two lockspaces with the same lockid all hell breaks loose. I'm not exactly sure what will happen but it can't be good! The attached patch moves the lockinfo_idr into the user_ls structure so that lockids are localised. patrick Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-19 16:27:52 -04:00
David Teigland	7d5513d58d	[DLM] init rwsem earlier The nodeinfo_lock rwsem needs to be initialized when the module is loaded instead of when the dlm is first used. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-19 09:15:38 -04:00
Patrick Caulfield	22da645fd6	[DLM] compat patch Here's a patch which add 32/64 bit compat to the DLM IOs and tidies the structures for alignment. As it causes an ABI change I had few qualms about adding the extra flag for "is64bit" as it simply uses a byte that would have been padding. Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-06-09 16:14:20 -04:00
Steven Whitehouse	47c96298cd	[GFS2] Change name due to local_nodeid being a macro Change names of local_nodeid to dlm_local_nodeid to prevent a namespace collision. Changed other local variable to match. Cc: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-25 17:43:14 -04:00
David Teigland	9229f01349	[GFS2] Cast 64 bit printk args to unsigned long long. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-24 09:21:30 -04:00
David Teigland	97a35d1e5f	[DLM] fix grant_after_purge softlockup In dlm_grant_after_purge() we were holding a hash table read_lock while calling put_rsb() which potentially removes the rsb from the hash table, taking the same lock in write. Fix this by flagging rsb's ahead of time that have been purged. Then iteratively read_lock the hash table, find a flagged rsb, unlock, process rsb. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-05-02 13:34:03 -04:00
David Teigland	c56b39cd2c	[DLM] PATCH 3/3 dlm: show recover state Expose the current recovery state in sysfs to help in debugging. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-04-28 10:51:53 -04:00
David Teigland	1c032c0311	[DLM] PATCH 2/3 dlm: lowcomms close When a node is removed from a lockspace configuration, close our connection to it, clearing any remaining messages for it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-04-28 10:50:41 -04:00
David Teigland	ae118962b9	[DLM] PATCH 1/3 dlm: force free user lockspace Lockspaces created from user space should be forcibly freed without requiring any further user space interaction. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-04-28 10:48:59 -04:00
Patrick Caulfield	714dc65c34	[DLM] Convert a semaphore to a completion Convert a semaphore into a completion in device.c. Cc: David Teigland <teigland@redhat.com> Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-04-25 14:49:01 -04:00
Steven Whitehouse	96c2c0083d	[DLM] Update Kconfig in the light of comments on lkml We now depend on user selectable options rather than select them. There is no dependancy on SYSFS since this selection is independant of the DLM (even though it wouldn't be sensible to build the DLM without it) Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-04-25 13:23:09 -04:00
David Teigland	b3f58d8f2b	[DLM] Pass in lockspace to lkb put function In some cases a lockspace isn't attached to the lkb, so that it needs to be passed directly to the lkb put function. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-02-28 11:16:37 -05:00
David Teigland	3bcd3687f8	[DLM] Remove range locks from the DLM This patch removes support for range locking from the DLM Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2006-02-23 09:56:38 +00:00
David Teigland	901359256b	[DLM] Update DLM to the latest patch level Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>	2006-01-20 08:47:07 +00:00
David Teigland	e7fd41792f	[DLM] The core of the DLM for GFS2/CLVM This is the core of the distributed lock manager which is required to use GFS2 as a cluster filesystem. It is also used by CLVM and can be used as a standalone lock manager independantly of either of these two projects. It implements VAX-style locking modes. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>	2006-01-18 09:30:29 +00:00

1 2 3 4 5

233 Commits