2907 Commits

Author SHA1 Message Date
Oleg Nesterov 5debfa6da5 [PATCH] coredump: shutdown current process first
This patch optimizes zap_threads() for the case when there are no ->mm
users except the current's thread group.  In that case we can avoid
'for_each_process()' loop.

It also adds a useful invariant: SIGNAL_GROUP_EXIT (if checked under
->siglock) always implies that all threads (except maybe current) have
pending SIGKILL.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:27 -07:00
Oleg Nesterov dcf560c593 [PATCH] coredump: some code relocations
This is a preparation for the next patch.  No functional changes.
Basically, this patch moves '->flags & SIGNAL_GROUP_EXIT' check into
zap_threads(), and 'complete(vfork_done)' into coredump_wait outside of
->mmap_sem protected area.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:27 -07:00
Oleg Nesterov 7b1c6154fa [PATCH] coredump: don't take tasklist_lock
This patch removes tasklist_lock from zap_threads().
This is safe wrt:

	do_exit:
		The caller holds mm->mmap_sem. This means that task which
		shares the same ->mm can't pass exit_mm(), so it can't be
		unhashed from init_task.tasks or ->thread_group lists.

	fork:
		None of sub-threads can fork after zap_process(leader). All
		processes which were created before this point should be
		visible to zap_threads() because copy_process() adds the new
		process to the tail of init_task.tasks list, and ->siglock
		lock/unlock provides a memory barrier.

	de_thread:
		It does list_replace_rcu(&leader->tasks, &current->tasks).
		So zap_threads() will see either old or new leader, it does
		not matter. However, it can change p->sighand, so we should
		use lock_task_sighand() in zap_process().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:27 -07:00
Oleg Nesterov d5f70c00ad [PATCH] coredump: kill ptrace related stuff
With this patch zap_process() sets SIGNAL_GROUP_EXIT while sending SIGKILL to
the thread group.  This means that a TASK_TRACED task

	1. Will be awakened by signal_wake_up(1)

	2. Can't sleep again via ptrace_notify()

	3. Can't go to do_signal_stop() after return
	   from ptrace_stop() in get_signal_to_deliver()

So we can remove all ptrace related stuff from coredump path.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:27 -07:00
Oleg Nesterov 281de339ce [PATCH] coredump: speedup SIGKILL sending
With this patch a thread group is killed atomically under ->siglock.  This is
faster because we can use sigaddset() instead of force_sig_info() and this is
used in further patches.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:27 -07:00
Oleg Nesterov aceecc0412 [PATCH] coredump: optimize ->mm users traversal
zap_threads() iterates over all threads to find those ones which share
current->mm.  All threads in the thread group share the same ->mm, so we can
skip entire thread group if it has another ->mm.

This patch shifts the killing of thread group into the newly added
zap_process() function.  This looks like an unnecessary complication, but it is
used in further patches.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
Oleg Nesterov 2ceb8693ef [PATCH] de_thread: fix lockless do_each_thread
We should keep the value of old_leader->tasks.next in de_thread, otherwise
we can't do for_each_process/do_each_thread without tasklist_lock held.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
Eric Paris 42c3e03ef6 [PATCH] SELinux: Add sockcreate node to procattr API
This patch adds a new /proc/self/attr/sockcreate entry.  A process may write a
context into this interface and all subsequent sockets created will be labeled
with that context.  This is the same idea as the fscreate interface where a
process can specify the label of a file about to be created.  At this time one
envisioned user of this will be xinetd.  It will be able to better label
sockets for the actual services.  At this time all sockets take the label of
the creating process, so all xinetd sockets would just be labeled the same.

I tested this by creating a tcp sender and listener.  The sender was able to
write to this new proc file and then create sockets with the specified label.
I am able to be sure the new label was used since the avc denial messages
kicked out by the kernel included both the new security permission
setsockcreate and all the socket denials were for the new label, not the label
of the running process.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
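
The sockcreate interface described in the commit above can be exercised from user space roughly as follows. This is a minimal sketch, not code from the patch: the helper names write_attr() and labeled_tcp_socket() are hypothetical, and the context string a caller passes is policy specific. The write is expected to fail when SELinux is disabled or the policy denies setsockcreate.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: write a security context to an attr file such as
 * /proc/self/attr/sockcreate.  Returns 0 on success, -1 on any failure. */
static int write_attr(const char *path, const char *context)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	size_t len = strlen(context);
	ssize_t written = write(fd, context, len);
	close(fd);

	return (written == (ssize_t)len) ? 0 : -1;
}

/* Hypothetical helper: label the next socket, then create it.
 * Sockets created after a successful write take the written context. */
static int labeled_tcp_socket(const char *context)
{
	if (write_attr("/proc/self/attr/sockcreate", context) != 0)
		return -1;
	return socket(AF_INET, SOCK_STREAM, 0);
}
```

Taking the path as a parameter in write_attr() is only a convenience for trying the helper against an ordinary file; the same pattern would apply to the fscreate entry mentioned in the message.
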
Oleg Nesterov c1df7fb88a [PATCH] cleanup next_tid()
Try to make next_tid() a bit more readable and delete the unnecessary
"pid_alive(pos)" check.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
Oleg Nesterov a872ff0cb2 [PATCH] simplify/fix first_tid()
first_tid:

	/* If nr exceeds the number of threads there is nothing todo */
	if (nr) {
		if (nr >= get_nr_threads(leader))
			goto done;
	}

This is not reliable: sub-threads can exit after this check, so the
'for' loop below can overlap and proc_task_readdir() can return
already filldir'ed dirents.

	for (; pos && pid_alive(pos); pos = next_thread(pos)) {
		if (--nr > 0)
			continue;

Off-by-one error, will return 'leader' when nr == 1.

This patch tries to fix these problems and simplify the code.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
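
The off-by-one described in the commit above can be reproduced in a small user-space model. This is an illustrative sketch only: struct thread_group, nth_tid_buggy() and nth_tid_fixed() are hypothetical names, an array stands in for the kernel's thread list, and the corrected loop shows one way to skip exactly nr entries rather than the code the patch actually introduced.

```c
#include <stddef.h>

/* Hypothetical stand-in for a thread group: tids[0] is the leader. */
struct thread_group {
	const int *tids;
	size_t count;
};

/* Mirrors the quoted loop: "--nr > 0" stops one entry too early,
 * so nr == 0 and nr == 1 both yield the leader. */
static int nth_tid_buggy(const struct thread_group *g, int nr)
{
	for (size_t i = 0; i < g->count; i++) {
		if (--nr > 0)
			continue;
		return g->tids[i];
	}
	return -1;
}

/* Walk exactly nr entries forward from the leader and report -1
 * if the walk runs off the end of the group. */
static int nth_tid_fixed(const struct thread_group *g, int nr)
{
	size_t pos = 0;

	for (; nr > 0; --nr) {
		if (++pos >= g->count)
			return -1;
	}
	return pos < g->count ? g->tids[pos] : -1;
}
```

With a group of three threads, the buggy variant returns the leader for both nr == 0 and nr == 1, while the corrected walk returns the second thread for nr == 1.
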
Eric W. Biederman cc288738c9 [PATCH] proc: Remove tasklist_lock from proc_task_readdir.
This is just like my previous removal of tasklist_lock from first_tgid, and
next_tgid.  It simply had to wait until it was rcu safe to walk the thread
list.

This should be the last instance of the tasklist_lock in proc.  So user
processes should not be able to influence the tasklist lock hold times.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
Eric W. Biederman df26c40e56 [PATCH] proc: Cleanup proc_fd_access_allowed
In process of getting proc_fd_access_allowed to work it has developed a few
warts.  In particular the special case that always allows introspection and
the special case to allow inspection of kernel threads.

The special case for introspection is needed for /proc/self/mem.

The special case for kernel threads really should be overridable
by security modules.

So consolidate these checks into ptrace.c:may_attach().

The check to always allow introspection is trivial.

The check to allow access to kernel threads, and zombies is a little
trickier.  mem_read and mem_write already verify an mm exists so it isn't
needed twice.  proc_fd_access_allowed only doesn't want a check to verify
task->mm exists, as it prevents all access to kernel threads.  So just move
the task->mm check into ptrace_attach where it is needed for practical
reasons.

I did a quick audit and none of the security modules in the kernel seem to
care if they are passed a task without an mm into security_ptrace.  So the
above move should be safe and it allows security modules to come up with
more restrictive policy.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
Eric W. Biederman 778c114477 [PATCH] proc: Use sane permission checks on the /proc/<pid>/fd/ symlinks
Since 2.2 we have been doing a chroot check to see if it is appropriate to
return a read or follow one of these magic symlinks.  The chroot check was
asking a question about the visibility of files to the calling process and
it was actually checking the destination process, and not the files
themselves.  That test was clearly bogus.

In my first pass through I simply fixed the test to check the visibility of
the files themselves.  That naive approach to fixing the permissions was
too strict and resulted in cases where a task could not even see all of
its file descriptors.

What has disturbed me about relaxing this check is that file descriptors
are per-process private things, and they are occasionally used as user space
capability tokens.  Looking a little farther into the symlink path on /proc
I did find userid checks and a check for capability (CAP_DAC_OVERRIDE) so
there were permissions checking this.

But I was still concerned about privacy.  Besides /proc there is only one
other way to find out this kind of information, and that is ptrace.  ptrace
has been around for a long time and it has a well established security
model.

So after thinking about it I finally realized that the permission checks
that make sense are the permission checks applied to ptrace_attach.  The
checks are simple per process, and won't cause nasty surprises for people
coming from less capable unices.

Unfortunately there is one case that the current ptrace_attach test does
not cover: Zombies and kernel threads.  Single stepping those kinds of
processes is impossible.  Being able to see which file descriptors are open
on these tasks is important to lsof, fuser and friends.  So for these
special processes I made the rule you can't find out unless you have
CAP_SYS_PTRACE.

These proc permission checks should now conform to the principle of least
surprise.  As well as using much less code to implement :)

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
Eric W. Biederman 5b0c1dd38b [PATCH] proc: optimize proc_check_dentry_visible
The code doesn't need to sleep when making this check so I can just do the
comparison and not worry about the reference counts.

TODO: While looking at this I realized that my original cleanup did not push
the permission check far enough down into the stack.  The call of
proc_check_dentry_visible needs to move out of the generic proc
readlink/follow link code and into the individual get_link instances.
Otherwise the shared resources checks are not quite correct (shared
files_struct does not require a shared fs_struct), and there are races with
unshare.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
Eric W. Biederman 13b41b0949 [PATCH] proc: Use struct pid not struct task_ref
Incrementally update my proc-dont-lock-task_structs-indefinitely patches so
that they work with struct pid instead of struct task_ref.

Mostly this is a straight 1-1 substitution.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
Eric W. Biederman 99f8955183 [PATCH] proc: don't lock task_structs indefinitely
Every inode in /proc holds a reference to a struct task_struct.  If a
directory or file is opened and remains open after the task exits this
pinning continues.  With 8K stacks on a 32bit machine the amount pinned per
file descriptor is about 10K.

Normally I would figure a reasonable per user process limit is about 100
processes.  With 80 processes, with a 1000 file descriptors each I can trigger
the OOM killer on a 32bit kernel, because I have pinned about 800MB of useless
data.

This patch replaces the struct task_struct pointer with a pointer to a struct
task_ref which has a struct task_struct pointer.  So the pinning of dead
tasks does not happen.

The code now has to contend with the fact that the task may now exit at any
time.  Which is a little but not much more complicated.

With this change it takes about 1000 processes each opening up 1000 file
descriptors before I can trigger the OOM killer.  Much better.

[mlp@google.com: task_mmu small fixes]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Paul Jackson <pj@sgi.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Albert Cahalan <acahalan@gmail.com>
Signed-off-by: Prasanna Meda <mlp@google.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:25 -07:00
Eric W. Biederman 8578cea750 [PATCH] proc: make PROC_NUMBUF the buffer size for holding integers as strings
Currently in /proc at several different places we define buffers to hold a
process id, or a file descriptor.  In most of them we use either a hard coded
number or a different define.  Modify them all to use PROC_NUMBUF, so the code
has a chance of being maintained.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:25 -07:00
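
As a rough illustration of the idea in the commit above, a single named buffer size can be shared by every place that formats an id as a string. This is a sketch under assumptions: the value 13 is assumed here for illustration (a sign, ten decimal digits of a 32-bit integer, a trailing NUL, and one spare byte), and format_id() is a hypothetical helper, not a function from the patch.

```c
#include <stdio.h>

/* Assumed value for illustration; see the note above. */
#define PROC_NUMBUF 13

/* Hypothetical helper: format an id into buf.  Returns the number of
 * characters written, or -1 if the id would not fit in the buffer. */
static int format_id(char *buf, size_t size, unsigned int id)
{
	int len = snprintf(buf, size, "%u", id);

	if (len < 0 || (size_t)len >= size)
		return -1;
	return len;
}
```

A caller would declare `char buf[PROC_NUMBUF];` and pass `sizeof(buf)`, so that a change to the constant propagates to every user.
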
Eric W. Biederman 9cc8cbc7f8 [PATCH] simply fix first_tgid
Like the bug Oleg spotted in first_tid there was also a small off by one
error in first_tgid, when a seek was done on the /proc directory.  This
fixes that and changes the code structure to make it a little more obvious
what is going on.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:25 -07:00
Eric W. Biederman de7587343b [PATCH] proc: Remove tasklist_lock from proc_pid_lookup() and proc_task_lookup()
Since we no longer need the tasklist_lock for get_task_struct the lookup
methods no longer need the tasklist_lock.

This just depends on my previous patch that makes get_task_struct() rcu
safe.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:25 -07:00
Eric W. Biederman 454cc105ef [PATCH] proc: Remove tasklist_lock from proc_pid_readdir
We don't need the tasklist_lock to safely iterate through processes
anymore.

This depends on my previous two task patches that make get_task_struct rcu
safe, and that make next_task() rcu safe.  I haven't gotten
first_tid/next_tid yet only because next_thread is missing an
rcu_dereference.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:25 -07:00
Eric W. Biederman 0bc58a9102 [PATCH] proc: refactor reading directories of tasks
There are a couple of problems this patch addresses.
- /proc/<tgid>/task currently does not work correctly if you stop reading
  in the middle of a directory.

- /proc/ currently requires a full pass through the task list with
  the tasklist lock held, to determine there are no more processes to read.

- The hand rolled integer to string conversion does not properly handle running
  out of buffer space.

- We seem to be batching reading of pids from the tasklist without reason,
  and complicating the logic of the code.

This patch addresses that by changing how tasks are processed.  A
first_<task_type> function is built that handles restarts, and a
next_<task_type> function is built that just advances to the next task.

first_<task_type> when it detects a restart usually uses find_task_by_pid.  If
that doesn't work because there has been a seek on the directory, or we have
already given a complete directory listing, it first checks the number of tasks
of that type, and only if we are under that count does it walk through all of
the tasks to find the one we are interested in.

The code that fills in the directory is simpler because there is only a single
for loop.

The hand rolled integer to string conversion is replaced by snprintf which
should handle the out of buffer case correctly.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:25 -07:00
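
The first_<task_type>/next_<task_type> split described in the commit above can be modelled in user space as follows. This is a simplified sketch, not the kernel code: struct task_table, first_task(), next_task() and read_tasks() are hypothetical names, an array stands in for the task list, and the restart position is a plain index rather than a pid lookup.

```c
#include <stddef.h>

/* Hypothetical task table standing in for the kernel's task list. */
struct task_table {
	const int *pids;
	size_t count;
};

/* first_*: handle a (re)start at directory position nr.
 * Returns NULL when nr is past the end of the table. */
static const int *first_task(const struct task_table *t, size_t nr)
{
	return nr < t->count ? &t->pids[nr] : NULL;
}

/* next_*: just advance to the following task, NULL at the end. */
static const int *next_task(const struct task_table *t, const int *pos)
{
	++pos;
	return pos < t->pids + t->count ? pos : NULL;
}

/* Fill at most room entries starting at *fpos, and leave *fpos where
 * the next call should resume.  Note the single for loop. */
static size_t read_tasks(const struct task_table *t, size_t *fpos,
			 int *out, size_t room)
{
	size_t filled = 0;

	for (const int *pos = first_task(t, *fpos);
	     pos != NULL && filled < room;
	     pos = next_task(t, pos)) {
		out[filled++] = *pos;
		++*fpos;
	}
	return filled;
}
```

Because the restart logic lives entirely in first_task(), a reader that stops in the middle of the directory can call read_tasks() again with the saved position and continue without repeating or skipping entries.
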
Eric W. Biederman cd6a3ce9ec [PATCH] proc: Close the race of a process dying during lookup
proc_lookup and task exiting are not synchronized, although some of the
previous code may have suggested that.  Every time before we reuse a dentry
namei.c calls d_op->d_revalidate which prevents us from reusing a stale dcache
entry.  Unfortunately it does not prevent us from returning a stale dcache
entry.  This race has been explicitly plugged in proc_pid_lookup but there is
nothing to confine it to just that proc lookup function.

So to prevent the race I call revalidate explicitly in all of the proc lookup
functions after I call d_add, and report an error if the revalidate does not
succeed.

Years ago Al Viro did something similar but those changes got lost in the
churn.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:25 -07:00
Eric W. Biederman 48e6484d49 [PATCH] proc: Rewrite the proc dentry flush on exit optimization
To keep the dcache from filling up with dead /proc entries we flush them on
process exit.  However over the years that code has gotten hairy with a
dentry_pointer and a lock in task_struct and misdocumented as a correctness
feature.

I have rewritten this code to look and see if we have a corresponding entry in
the dcache and if so flush it on process exit.  This removes the extra fields
in the task_struct and allows me to trivially handle the case of a
/proc/<tgid>/task/<pid> entry as well as the current /proc/<pid> entries.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:24 -07:00
Eric W. Biederman 662795deb8 [PATCH] proc: Move proc_maps_operations into task_mmu.c
All of the functions for proc_maps_operations are already defined in
task_mmu.c so move the operations structure to keep the functionality
together.

Since task_nommu.c implements a dummy version of /proc/<pid>/maps give it a
simplified version of proc_maps_operations that it can modify to best suit its
needs.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:24 -07:00
Eric W. Biederman 6e66b52bf5 [PATCH] proc: Fix the link count for /proc/<pid>/task
Use getattr to get an accurate link count when needed.  This is cheaper and
more accurate than trying to derive it by walking the thread list of a
process.

Especially as it happens when needed, at stat time, instead of at readdir time.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:24 -07:00
Eric W. Biederman 0f2fe20f55 [PATCH] proc: Properly filter out files that are not visible to a process
Long ago and far away in 2.2 we started checking to ensure the files we
displayed in /proc were visible to the current process.  It was an
unsophisticated time and no one was worried about functions full of FIXMES in
a stable kernel.  As time passed the function became sacred and was enshrined
in the shrine of how things have always been.  The fixes came in but only to
keep the function working no one really remembering or documenting why we did
things that way.

The intent and the functionality make a lot of sense.  Don't let /proc be an
access point for files a process can see no other way.  The implementation
however is completely wrong.

We are currently checking the root directories of the two processes, we are
not checking the actual file descriptors themselves.

We are strangely checking with a permission method instead of just when we use
the data.

This patch fixes the logic to actually check the file descriptors and make a
note that implementing a permission method for this part of /proc almost
certainly indicates a bug in the reasoning.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:24 -07:00
Eric W. Biederman 22c2c5d75e [PATCH] proc: Kill proc_mem_inode_operations
The inode operations only exist to support the proc_permission function.
Currently mem_read and mem_write have all the same permission checks as
ptrace.  The fs check makes no sense in this context, and we can trivially get
around it by calling ptrace.

So simplify the code by killing the strange weird case.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:24 -07:00
Eric W. Biederman 68602066c3 [PATCH] proc: Remove bogus proc_task_permission
First we can access every /proc/<tgid>/task/<pid> directory as /proc/<pid> so
proc_task_permission is not usefully limiting visibility.

Second having related filesystems information should have nothing to do with
process visibility.  kill does not implement any checks like that.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:24 -07:00
Eric W. Biederman aed7a6c476 [PATCH] proc: Replace proc_inode.type with proc_inode.fd
The sole remaining use of proc_inode.type is to discover the file descriptor
number, so just store the file descriptor number and don't worry about
processing this field.  This removes any /proc limits on the maximum number of
file descriptors, and clears the path to make the hard coded /proc inode
numbers go away.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:24 -07:00
Eric W. Biederman 87bfbf679f [PATCH] proc: Simplify the ownership rules for /proc
Currently in /proc if the task is dumpable all of the files are owned by the
task's effective user.  Otherwise the files are owned by root.  Unless it is the
/proc/<tgid>/ or /proc/<tgid>/task/<pid> directory in that case we always make
the directory owned by the effective user.

However the special case for directories is pointless except as a way to read
the effective user, because the permissions on both of those directories are
world readable, and executable.

/proc/<tgid>/status provides a much better way to read a process's effective
userid, so it is silly to try to provide that on the directory.

So this patch simplifies the code by removing a pointless special case and
gets us one step closer to being able to remove the hard coded /proc inode
numbers.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:23 -07:00
Eric W. Biederman 1679654951 [PATCH] proc: Remove unnecessary and misleading assignments from proc_pid_make_inode
The removed fields are already set by proc_alloc_inode.  Initializing them in
proc_alloc_inode implies they need it for proper cleanup.  At least ei->pde
was not set on all paths making it look like proc_alloc_inode was buggy.  So
just remove the redundant assignments.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:23 -07:00
Eric W. Biederman ff9724a3f7 [PATCH] proc: Remove useless BKL in proc_pid_readlink
We already call everything except do_proc_readlink outside of the BKL in
proc_pid_followlink, and there appears to be nothing in do_proc_readlink that
needs any special protection.

So remove this leftover from one of the BKL cleanup efforts.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:23 -07:00
Eric W. Biederman 5634708b5f [PATCH] proc: Fix the .. inode number on /proc/<pid>/fd
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:23 -07:00
Herbert Xu f05e15b594 [PATCH] nfsd kconfig: select things at the closest tristate instead of bool
I noticed recently that my CONFIG_CRYPTO_MD5 turned into a y again instead
of m.  It turns out that CONFIG_NFSD_V4 is selecting it to be y even though
I've chosen to compile nfsd as a module.

In general when we have a bool sitting under a tristate it is better to
select things you need from the tristate rather than the bool since that
allows the things you select to be modules.

The following patch does it for nfsd.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:23 -07:00
Hansjoerg Lipp 5024ad4af6 [PATCH] i4l: Gigaset drivers: add IOCTLs to compat_ioctl.h
Add the IOCTLs of the Gigaset drivers to compat_ioctl.h in order to make
them available for 32 bit programs on 64 bit platforms.  Please merge.

Signed-off-by: Hansjoerg Lipp <hjlipp@web.de>
Acked-by: Tilman Schmidt <tilman@imap.cc>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:23 -07:00
Badari Pulavarty ade1a29e16 [PATCH] ext3: Add "-o bh" option
This patch adds "-o bh" option to force use of buffer_heads.  This option
is needed when we make "nobh" as default - and if we run into problems.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:20 -07:00
Alexey Dobriyan 9637f28f8b [PATCH] reiserfs: remove reiserfs_aio_write()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <reiserfs-dev@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:19 -07:00
Michael LeMay 4eb582cf1f [PATCH] keys: add a way to store the appropriate context for newly-created keys
Add a /proc/<pid>/attr/keycreate entry that stores the appropriate context for
newly-created keys.  Modify the selinux_key_alloc hook to make use of the new
entry.  Update the flask headers to include a new "setkeycreate" permission
for processes.  Update the flask headers to include a new "create" permission
for keys.  Use the create permission to restrict which SIDs each task can
assign to newly-created keys.  Add a new parameter to the security hook
"security_key_alloc" to indicate whether it is being invoked by the kernel, or
from userspace.  If it is being invoked by the kernel, the security hook
should never fail.  Update the documentation to reflect these changes.

Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:18 -07:00
Akinobu Mita f116629d03 [PATCH] fs: use list_move()
This patch converts the combination of list_del(A) and list_add(A, B) to
list_move(A, B) under fs/.

Cc: Ian Kent <raven@themaw.net>
Acked-by: Joel Becker <joel.becker@oracle.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Hans Reiser <reiserfs-dev@namesys.com>
Cc: Urban Widmark <urban@teststation.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:18 -07:00
Akinobu Mita 1bfba4e8ea [PATCH] core: use list_move()
This patch converts the combination of list_del(A) and list_add(A, B) to
list_move(A, B).

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:17 -07:00
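
The conversion made by the two list_move() commits above can be illustrated with a minimal user-space list. This is a sketch modelled on the kernel's circular doubly linked list; the implementation below is written for illustration and is not copied from the kernel headers, and list_init() and list_insert() are hypothetical helper names.

```c
/* Minimal circular doubly linked list, modelled on the kernel's. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link entry between two known consecutive nodes. */
static void list_insert(struct list_head *entry,
			struct list_head *prev, struct list_head *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

/* Add entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	list_insert(entry, head, head->next);
}

/* Unlink entry from whatever list it is on. */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* One call replacing the list_del(A); list_add(A, B); pair. */
static void list_move(struct list_head *entry, struct list_head *head)
{
	list_del(entry);
	list_add(entry, head);
}
```

The behaviour is the same as the two-call sequence; the single call simply states the intent (move an entry from one list to another) more directly.
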
Akinobu Mita 8e13059a37 [PATCH] use list_add_tail() instead of list_add()
This patch converts list_add(A, B.prev) to list_add_tail(A, &B) for
readability.

Acked-by: Karsten Keil <kkeil@suse.de>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Acked-by: Jan Kara <jack@suse.cz>
AOLed-by: David Woodhouse <dwmw2@infradead.org>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:17 -07:00
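
The equivalence behind the list_add_tail() commit above can be checked with a small user-space model. This is a sketch for illustration only: the list below imitates the kernel's circular doubly linked list but is not copied from it, and list_init() and list_insert() are hypothetical helper names.

```c
/* Minimal circular doubly linked list, modelled on the kernel's. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link entry between two known consecutive nodes. */
static void list_insert(struct list_head *entry,
			struct list_head *prev, struct list_head *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

/* Add entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	list_insert(entry, head, head->next);
}

/* Add entry right before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	list_insert(entry, head->prev, head);
}
```

Adding after B.prev and adding before B both place the new entry between the old tail and B, so list_add(A, B.prev) and list_add_tail(A, &B) build the same list; the second form says so by name.
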
Malcolm Parsons fcc18e83e1 [PATCH] uclinux: use PER_LINUX_32BIT in binfmt_flat
binfmt_flat.c calls set_personality with PER_LINUX as the personality.
On the arm architecture this results in the program running in 26bit
usermode.  PER_LINUX_32BIT should be used instead.  This doesn't affect
other architectures that use binfmt_flat.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 21:04:24 -07:00
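
The difference between the two personality values named in the commit above can be inspected from user space. This is a small sketch assuming the glibc <sys/personality.h> definitions, where PER_LINUX_32BIT is PER_LINUX with the ADDR_LIMIT_32BIT flag set; has_32bit_addr_limit() is a hypothetical helper, and the kernel-side set_personality() call in binfmt_flat.c is not reproduced here.

```c
#include <sys/personality.h>

/* Hypothetical helper: nonzero when the personality value carries
 * the ADDR_LIMIT_32BIT flag. */
static int has_32bit_addr_limit(unsigned long persona)
{
	return (persona & ADDR_LIMIT_32BIT) != 0;
}
```

Plain PER_LINUX lacks the flag, which is consistent with the message's statement that it leads to 26bit usermode on arm, while PER_LINUX_32BIT carries it.
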
Alexey Dobriyan 1e788f8d1a [PATCH] xfs: update ->flush method proto
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 17:43:32 -07:00
Linus Torvalds f36f44de72 Fix NFS2 compile error
Trond had apparently merged the same patch twice, causing a duplicate
include of the "internal.h" file, with resulting obvious confusion.

Tssk.  I'm the only one allowed to send out trees that don't even
compile! Who does this Trond guy think he is?

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 12:30:33 -07:00
Linus Torvalds 1d77062b14 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (51 commits)
  nfs: remove nfs_put_link()
  nfs-build-fix-99
  git-nfs-build-fixes
  Merge branch 'odirect'
  NFS: alloc nfs_read/write_data as direct I/O is scheduled
  NFS: Eliminate nfs_get_user_pages()
  NFS: refactor nfs_direct_free_user_pages
  NFS: remove user_addr, user_count, and pos from nfs_direct_req
  NFS: "open code" the NFS direct write rescheduler
  NFS: Separate functions for counting outstanding NFS direct I/Os
  NLM: Fix reclaim races
  NLM: sem to mutex conversion
  locks.c: add the fl_owner to nlm_compare_locks
  NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts
  NFS: Split fs/nfs/inode.c
  NFS: Fix typo in nfs_do_clone_mount()
  NFS: Fix compile errors introduced by referrals patches
  NFSv4: Ensure that referral mounts bind to a reserved port
  NFSv4: A root pathname is sent as a zero component4
  NFSv4: Follow a referral
  ...
2006-06-25 10:54:14 -07:00
Linus Torvalds 25581ad107 Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (244 commits)
  V4L/DVB (4210b): git-dvb: tea575x-tuner build fix
  V4L/DVB (4210a): git-dvb versus matroxfb
  V4L/DVB (4209): Added some BTTV PCI IDs for newer boards
  Fixes some sync issues between V4L/DVB development and GIT
  V4L/DVB (4206): Cx88-blackbird: always set encoder height based on tvnorm->id
  V4L/DVB (4205): Merge tda9887 module into tuner.
  V4L/DVB (4203): Explicitly set the enum values.
  V4L/DVB (4202): allow selecting CX2341x port mode
  V4L/DVB (4200): Disable bitrate_mode when encoding mpeg-1.
  V4L/DVB (4199): Add cx2341x-specific control array to cx2341x.c
  V4L/DVB (4198): Avoid newer usages of obsoleted experimental MPEGCOMP API
  V4L/DVB (4197): Port new MPEG API to saa7134-empress with saa6752hs
  V4L/DVB (4196): Port cx88-blackbird to the new MPEG API.
  V4L/DVB (4193): Update cx2341x fw encoding API doc.
  V4L/DVB (4192): Use control helpers for saa7115, cx25840, msp3400.
  V4L/DVB (4191): Add CX2341X MPEG encoder module.
  V4L/DVB (4190): Add helper functions for control processing to v4l2-common.
  V4L/DVB (4189): Add videodev support for VIDIOC_S/G/TRY_EXT_CTRLS.
  V4L/DVB (4188): Add new MPEG control/ioctl definitions to videodev2.h
  V4L/DVB (4186): Add support for the DNTV Live! mini DVB-T card.
  ...
2006-06-25 10:09:31 -07:00
Hua Zhong f58a1ebb22 [PATCH] remove unlikely(sb) in prune_dcache
likely/unlikely profiling shows that the following annotation is a miss.

After boot:
[+- ] Type | # True | # False | Function:Filename@Line
+unlikely |     1074|        0  prune_dcache()@:fs/dcache.c@409

After a bonnie++ run:
+unlikely |    66716|    19584  prune_dcache()@:fs/dcache.c@409

So remove it.
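
As a rough illustration of why the counts above argue against the hint, the
sketch below computes how often the annotated condition held. It reads the
"# True" column as the number of times the condition was true, which is an
assumption about the profiler's output format. The macros are built on the
GCC/Clang __builtin_expect builtin in the style of the kernel's
likely()/unlikely(); true_fraction() and count_nonnull() are hypothetical
helpers, not kernel code.

```c
#include <stddef.h>

/* Branch hints in the style of the kernel macros (GCC/Clang builtin). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: fraction of evaluations in which the annotated
 * condition was true.  For unlikely(), a high value means the hint is wrong. */
double true_fraction(unsigned long n_true, unsigned long n_false)
{
    unsigned long total = n_true + n_false;

    return total ? (double)n_true / (double)total : 0.0;
}

/* Hypothetical stand-in for an annotated test.  The hint only influences
 * code layout and prediction; it never changes the result. */
int count_nonnull(void *const *items, size_t n)
{
    int count = 0;

    for (size_t i = 0; i < n; i++)
        if (unlikely(items[i] != NULL))
            count++;
    return count;
}
```

With the bonnie++ numbers, true_fraction(66716, 19584) comes out at roughly
0.77, so under that reading the condition is true far more often than not.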

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:26 -07:00
Evgeniy Dushistov 7d93a1a53a [PATCH] ext2: cleanup: put_page and comment fix
Things which forced me to think a little: why so?

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:25 -07:00
Ulrich Drepper 45c9b11a1d [PATCH] Implement AT_SYMLINK_FOLLOW flag for linkat
When the linkat() syscall was added, the flag parameter was added at the
last minute, but it has not been used so far.  The following patch changes
that.  My tests show that this is all that's needed.

If OLDNAME is a symlink, setting the flag causes linkat to follow the
symlink and create a hardlink to the target.  This is actually the
behavior POSIX demands for link() as well but Linux wisely does not do
this.  With this flag (which will most likely be in the next POSIX
revision) the programmer can choose the behavior, defaulting to the safe
variant.  As a side effect it is now possible to implement a
POSIX-compliant link(2) function for those who are interested.

  touch file
  ln -s file symlink

  linkat(fd, "symlink", fd, "newlink", 0)
    -> newlink is hardlink of symlink

  linkat(fd, "symlink", fd, "newlink", AT_SYMLINK_FOLLOW)
    -> newlink is hardlink of file

The value of AT_SYMLINK_FOLLOW is determined by the definition we already
use in glibc.
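
The two cases above can be checked from userspace with a small sketch like
the following. It assumes a Linux system with a writable /tmp and a
filesystem that allows hard links to symlinks; newlink_is_symlink() is a
hypothetical helper written for this illustration and is not part of the
patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* open, openat, AT_SYMLINK_FOLLOW, AT_SYMLINK_NOFOLLOW */
#include <stdlib.h>     /* mkdtemp */
#include <sys/stat.h>   /* fstatat, S_ISLNK */
#include <unistd.h>     /* linkat, symlinkat, unlinkat, close, rmdir */

/* Hypothetical helper: in a scratch directory, create "file" and
 * "symlink" -> "file", then call linkat() on "symlink" with the given flags.
 * Returns 1 if "newlink" is itself a symlink, 0 if it is not, -1 on error. */
int newlink_is_symlink(int flags)
{
    char dir[] = "/tmp/linkat-demo-XXXXXX";
    struct stat st;
    int dfd, fd, result = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0) {
        rmdir(dir);
        return -1;
    }

    fd = openat(dfd, "file", O_CREAT | O_WRONLY, 0600);    /* touch file */
    if (fd >= 0) {
        close(fd);
        if (symlinkat("file", dfd, "symlink") == 0 &&       /* ln -s file symlink */
            linkat(dfd, "symlink", dfd, "newlink", flags) == 0 &&
            fstatat(dfd, "newlink", &st, AT_SYMLINK_NOFOLLOW) == 0)
            result = S_ISLNK(st.st_mode) ? 1 : 0;
    }

    /* Best-effort cleanup; failures here are ignored in this sketch. */
    unlinkat(dfd, "newlink", 0);
    unlinkat(dfd, "symlink", 0);
    unlinkat(dfd, "file", 0);
    close(dfd);
    rmdir(dir);
    return result;
}
```

On a kernel with this change, newlink_is_symlink(0) should report a symlink
and newlink_is_symlink(AT_SYMLINK_FOLLOW) should report a regular file.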

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:22 -07:00
Frode Isaksen 04a3446c90 [PATCH] fs: sys_poll with timeout -1 bug fix
If you do a poll() call with timeout -1, the wait will be a large finite
number of jiffies (depending on HZ) instead of an infinite wait, since -1 is
passed to the msecs_to_jiffies function.
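
A userspace model of the problem, as a sketch only: it assumes the
millisecond argument is treated as unsigned during conversion, and every
name in it (TICKS_INFINITE, msecs_to_ticks, naive_poll_ticks,
checked_poll_ticks) is hypothetical. It is not the kernel's code and not
necessarily how the patch is written.

```c
#include <limits.h>

/* Hypothetical sentinel standing in for "wait forever". */
#define TICKS_INFINITE LLONG_MAX

/* Hypothetical conversion: round milliseconds up to ticks at the given hz. */
static long long msecs_to_ticks(unsigned int msecs, unsigned int hz)
{
    return (long long)(((unsigned long long)msecs * hz + 999ULL) / 1000ULL);
}

/* Buggy shape: a negative timeout is converted like any other value, so -1
 * becomes a huge but finite tick count that depends on hz. */
long long naive_poll_ticks(int timeout_msecs, unsigned int hz)
{
    return msecs_to_ticks((unsigned int)timeout_msecs, hz);
}

/* Checked shape: negative means infinite, zero means do not wait, and only
 * positive values go through the conversion. */
long long checked_poll_ticks(int timeout_msecs, unsigned int hz)
{
    if (timeout_msecs < 0)
        return TICKS_INFINITE;
    if (timeout_msecs == 0)
        return 0;
    return msecs_to_ticks((unsigned int)timeout_msecs, hz);
}
```

In this model naive_poll_ticks(-1, hz) grows with hz but never equals the
infinite sentinel, which matches the "big number (depending on HZ)"
symptom described above.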

Signed-off-by: Frode Isaksen <frode.isaksen@gmail.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:22 -07:00