linux

Commit Graph

Author	SHA1	Message	Date
Ingo Molnar	86998aa653	[PATCH] genirq core: fix handle_level_irq() while porting the -rt tree to 2.6.18-rc7 i noticed the following screaming-IRQ scenario on an SMP system: 2274 0Dn.:1 0.001ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.010ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.020ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.029ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.039ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.048ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.058ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.068ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.077ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.087ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) 2274 0Dn.:1 0.097ms: do_IRQ+0xc/0x103 <= (ret_from_intr+0x0/0xf) as it turns out, the bug is caused by handle_level_irq(), which if it races with another CPU already handling this IRQ, it _unmasks_ the IRQ line on the way out. This is not how 2.6.17 works, and we introduced this bug in one of the early genirq cleanups right before it went into -mm. (the bug was not in the genirq patchset for a long time, and we didnt notice the bug due to the lack of -rt rebase to the new genirq code. -rt, and hardirq-preemption in particular opens up such races much wider than anything else.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-19 07:57:20 -07:00
Kenneth Lee	e4b69aa2a1	[PATCH] bug fix in kernel/kmod.c I think there is a bug in kmod.c: In __call_usermodehelper(), when kernel_thread(wait_for_helper, ...) return success, since wait_for_helper() might call complete() at any time, the sub_info should not be used any more. Normally wait_for_helper() take a long time to finish, you may not get problem for most of the case. But if you remove /sbin/modprobe, it may become easier for you to get a oop in khelper. Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-16 12:54:32 -07:00
Imre Deak	e1ed7ac77b	[PATCH] genirq: fix typo in IRQ resend Fix a bug where the IRQ_PENDING flag is never cleared and the ISR is called endlessly without an actual interrupt. Signed-off-by: Imre Deak <imre.deak@solidboot.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-16 12:54:30 -07:00
Oleg Nesterov	dd9daa221e	[PATCH] rcu_do_batch: make ->qlen decrement irq safe rcu_do_batch() decrements rdp->qlen with irqs enabled. This is not good, it can also be modified by call_rcu() from interrupt. Decrement ->qlen once with irqs disabled, after a main loop. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-13 07:32:14 -07:00
Ingo Molnar	9bb25bf36f	[PATCH] lockdep: double the number of stack-trace entries Miles Lane reported the "BUG: MAX_STACK_TRACE_ENTRIES too low!" message, which means that during normal use his system produced enough lockdep events so that the 128-thousand entries stack-trace array got exhausted. Double the size of the array. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Miles Lane <miles.lane@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-13 07:32:14 -07:00
Al Viro	55669bfa14	[PATCH] audit: AUDIT_PERM support add support for AUDIT_PERM predicate Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-09-11 13:32:30 -04:00
Amy Griffis	5974501e2d	[PATCH] update audit rule change messages Make the audit message for implicit rule removal more informative. Make the rule update message consistent with other messages. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-09-11 13:32:17 -04:00
Amy Griffis	8ef2d3040e	[PATCH] sanity check audit_buffer Add sanity checks for NULL audit_buffer consistent with other audit_log* routines. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-09-11 13:32:17 -04:00
Steve Grubb	3b33ac3182	[PATCH] fix ppid bug in 2.6.18 kernel Hello, During some troubleshooting, I found that ppid was accidentally omitted from the legacy rule section. This resulted in EINVAL for any rule with ppid sent with AUDIT_ADD. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-09-11 13:32:04 -04:00
Thomas Gleixner	c5780e976e	[PATCH] Use the correct restart option for futex_lock_pi The current implementation of futex_lock_pi returns -ERESTART_RESTARTBLOCK in case that the lock operation has been interrupted by a signal. This results in a return of -EINTR to userspace in case there is an handler for the signal. This is wrong, because userspace expects that the lock function does not return in any case of signal delivery. This was not caught by my insufficient test case, but triggered a nasty userspace problem in an high load application scenario. Unfortunately also glibc does not check for this invalid return value. Using -ERSTARTNOINTR makes sure, that the interrupted syscall is restarted. The restart block related code can be safely removed, as the possible timeout argument is an absolute time value. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-08 10:22:50 -07:00
Ingo Molnar	068c4579fe	[PATCH] lockdep: do not touch console state when tainting the kernel Remove an unintended console_verbose() side-effect from add_taint(). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-06 11:00:02 -07:00
Pavel Machek	471b40d0df	[PATCH] prevent swsusp with PAE PAE + swsusp results in hard-to-debug crash about 50% of time during resume. Cause is known, fix needs to be ported from x86-64 (but we can't make it to 2.6.18, and I'd like this to be worked around in 2.6.18). Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-06 11:00:02 -07:00
Jarek Poplawski	fc47e7b592	[PATCH] lockdep ifdef fix With CONFIG_SMP=y CONFIG_PREEMPT=y CONFIG_LOCKDEP=y CONFIG_DEBUG_LOCK_ALLOC=y # CONFIG_PROVE_LOCKING is not set spin_unlock_irqrestore() goes through lockdep but spin_lock_irqsave() doesn't. Apparently, bad things happen. Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-06 11:00:01 -07:00
Oleg Nesterov	3b6362b833	[PATCH] eligible_child: remove an obsolete ->tgid check It is not possible to find a sub-thread in ->children/->ptrace_children lists, ptrace_attach() does not allow to attach to sub-threads. Even if it was possible to ptrace the task from the same thread group, we can't allow to release ->group_leader while there are others (ptracer) threads in the same group. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-02 14:51:27 -07:00
Henrik Kretzschmar	43a1dd502f	[PATCH] kerneldoc for handle_bad_irq() Adds the description of the parameters from handle_bad_irq(). Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-01 11:39:09 -07:00
Shailabh Nagar	35df17c57c	[PATCH] task delay accounting fixes Cleanup allocation and freeing of tsk->delays used by delay accounting. This solves two problems reported for delay accounting: 1. oops in __delayacct_blkio_ticks http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1844.html Currently tsk->delays is getting freed too early in task exit which can cause a NULL tsk->delays to get accessed via reading of /proc/<tgid>/stats. The patch fixes this problem by freeing tsk->delays closer to when task_struct itself is freed up. As a result, it also eliminates the use of tsk->delays_lock which was only being used (inadequately) to safeguard access to tsk->delays while a task was exiting. 2. Possible memory leak in kernel/delayacct.c http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1389.html The patch cleans up tsk->delays allocations after a bad fork which was missing earlier. The patch has been tested to fix the problems listed above and stress tested with rapid calls to delay accounting's taskstats command interface (which is the other path that can access the same data, besides the /proc interface causing the oops above). Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-01 11:39:08 -07:00
Nick Piggin	0d673a5a47	[PATCH] cpuset: oom panic fix cpuset_excl_nodes_overlap always returns 0 if current is exiting. This caused customer's systems to panic in the OOM killer when processes were having trouble getting memory for the final put_user in mm_release. Even though there were lots of processes to kill. Change to returning 1 in this case. This achieves parity with !CONFIG_CPUSETS case, and was observed to fix the problem. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-27 11:01:32 -07:00
Paul Jackson	4c4d50f7b3	[PATCH] cpuset: top_cpuset tracks hotplug changes to cpu_online_map Change the list of cpus allowed to tasks in the top (root) cpuset to dynamically track what cpus are online, using a CPU hotplug notifier. Make this top cpus file read-only. On systems that have cpusets configured in their kernel, but that aren't actively using cpusets (for some distros, this covers the majority of systems) all tasks end up in the top cpuset. If that system does support CPU hotplug, then these tasks cannot make use of CPUs that are added after system boot, because the CPUs are not allowed in the top cpuset. This is a surprising regression over earlier kernels that didn't have cpusets enabled. In order to keep the behaviour of cpusets consistent between systems actively making use of them and systems not using them, this patch changes the behaviour of the 'cpus' file in the top (root) cpuset, making it read only, and making it automatically track the value of cpu_online_map. Thus tasks in the top cpuset will have automatic use of hot plugged CPUs allowed by their cpuset. Thanks to Anton Blanchard and Nathan Lynch for reporting this problem, driving the fix, and earlier versions of this patch. Signed-off-by: Paul Jackson <pj@sgi.com> Cc: Nathan Lynch <ntl@pobox.com> Cc: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-27 11:01:32 -07:00
Yingchao Zhou	4edb9a143e	[PATCH] Remove redundant up() in stop_machine() An up() is called in kernel/stop_machine.c on failure, and also in the caller (unconditionally). Signed-off-by: Zhou Yingchao <yingchao.zhou@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-27 11:01:31 -07:00
Oleg Nesterov	d015baebba	[PATCH] futex_find_get_task(): remove an obscure EXIT_ZOMBIE check futex_find_get_task: if (p->state == EXIT_ZOMBIE \|\| p->exit_state == EXIT_ZOMBIE) return NULL; I can't understand this. First, p->state can't be EXIT_ZOMBIE. The ->exit_state check looks strange too. Sub-threads or tasks whose ->parent ignores SIGCHLD go directly to EXIT_DEAD state (I am ignoring a ptrace case). Why EXIT_DEAD tasks should be ok? Yes, EXIT_ZOMBIE is more important (a task may stay zombie for a long time), but this doesn't mean we should explicitely ignore other EXIT_XXX states. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-27 11:01:30 -07:00
Oleg Nesterov	f8986c241d	[PATCH] revert "Drop tasklist lock in do_sched_setscheduler" sched_setscheduler() looks at ->signal->rlim[]. It is unsafe do dereference ->signal unless tasklist_lock or ->siglock is held (or p == current). We pin the task structure, but this can't prevent from release_task()->__exit_signal() which sets ->signal = NULL. Restore tasklist_lock across the setscheduler call. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-27 11:01:29 -07:00
Andrew Morton	9b41ea7289	[PATCH] workqueue: remove lock_cpu_hotplug() Use a private lock instead. It protects all per-cpu data structures in workqueue.c, including the workqueues list. Fix a bug in schedule_on_each_cpu(): it was forgetting to lock down the per-cpu resources. Unfixed long-standing bug: if someone unplugs the CPU identified by `singlethread_cpu' the kernel will get very sick. Cc: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-08-14 12:54:29 -07:00
john stultz	e579dcbf23	[PATCH] futex_handle_fault always fails We found this issue last week w/ the -RT kernel, but it seems the same issue is in mainline as well. Basically it is possible for futex_unlock_pi to return without actually freeing the lock. This is due to buggy logic in the use of futex_handle_fault() and its attempt argument in a failure case. Looking at futex.c the logic is as follows: 1) In futex_unlock_pi() we start w/ ret=0 and we go down to the first futex_atomic_cmpxchg_inatomic(), where we find uval==-EFAULT. We then jump to the pi_faulted label. 2) From pi_faulted: We increment attempt, unlock the sem and hit the retry label. 3) From the retry label, with ret still zero, we again hit EFAULT on the first futex_atomic_cmpxchg_inatomic(), and again goto the pi_faulted label. 4) Again from pi_faulted: we increment attempt and enter the conditional, where we call futex_handle_fault. 5) futex_handle_fault fails, and we goto the out_unlock_release_sem label. 6) From out_unlock_release_sem we return, and since ret is still zero, we return without error, while never actually unlocking the lock. Issue #1: at the first futex_atomic_cmpxchg_inatomic() we should probably be setting ret=-EFAULT before jumping to pi_faulted: However in our case this doesn't really affect anything, as the glibc we're using ignores the error value from futex_unlock_pi(). Issue #2: Look at futex_handle_fault(), its first conditional will return -EFAULT if attempt is >= 2. However, from the "if(attempt++) futex_handle_fault(attempt)" logic above, we'll never call futex_handle_fault when attempt is less then two. So we never get a chance to even try to fault the page in. The following patch addresses these two issues by 1) Always setting ret to -EFAULT if futex_handle_fault fails, and 2) Removing the = in futex_handle_fault's (attempt >= 2) check. I'm really not sure this is the right fix, but wanted to bring it up so folks knew the issue is alive and well in the current -git tree. From looking at the git logs the logic was first introduced (then later copied to other places) in the following commit almost a year ago: http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4732efbeb997189d9f9b04708dc26bf8613ed721;hp=5b039e681b8c5f30aac9cc04385cc94be45d0823 Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-08-14 12:54:29 -07:00
Kirill Korotaev	6997a6faaa	[PATCH] sys_getppid oopses on debug kernel sys_getppid() optimization can access a freed memory. On kernels with DEBUG_SLAB turned ON, this results in Oops. As Dave Hansen noted, this optimization is also unsafe for memory hotplug. So this patch always takes the lock to be safe. [oleg@tv-sign.ru: simplifications] Signed-off-by: Kirill Korotaev <dev@openvz.org> Cc: <stable@kernel.org> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-08-14 12:54:29 -07:00
Andrew Morton	657b3010d8	[PATCH] panic.c build fix kernel/panic.c: In function 'add_taint': kernel/panic.c:176: warning: implicit declaration of function 'debug_locks_off' Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-08-14 12:54:28 -07:00
Jan Blunck	3773dc9205	[PATCH] fix hrtimer percpu usage typo The percpu variable is used incorrectly in switch_hrtimer_base(). Signed-off-by: Jan Blunck <jblunck@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-08-14 12:54:28 -07:00
Thomas Gleixner	ce2c6b5384	[PATCH] futex: Apply recent futex fixes to futex_compat The recent fixups in futex.c need to be applied to futex_compat.c too. Fixes a hang reported by Olaf. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-06 08:57:49 -07:00
KAMEZAWA Hiroyuki	58c1b5b079	[PATCH] memory hotadd fixes: find_next_system_ram catch range fix find_next_system_ram() is used to find available memory resource at onlining newly added memory. This patch fixes following problem. find_next_system_ram() cannot catch this case. Resource: (start)-------------(end) Section : (start)-------------(end) Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Keith Mannthey <kmannth@gmail.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-06 08:57:48 -07:00
KAMEZAWA Hiroyuki	0f04ab5efb	[PATCH] memory hotadd fixes: change find_next_system_ram's return value manner find_next_system_ram() returns valid memory range which meets requested area, only used by memory-hot-add. This function always rewrite requested resource even if returned area is not fully fit in requested one. And sometimes the returnd resource is larger than requested area. This annoyes the caller. This patch changes the returned value to fit in requested area. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Keith Mannthey <kmannth@gmail.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-06 08:57:48 -07:00
Antonino A. Daplas	78944e549d	[PATCH] vt: printk: Fix framebuffer console triggering might_sleep assertion Reported by: Dave Jones Whilst printk'ing to both console and serial console, I got this... (2.6.18rc1) BUG: sleeping function called from invalid context at kernel/sched.c:4438 in_atomic():0, irqs_disabled():1 Call Trace: [<ffffffff80271db8>] show_trace+0xaa/0x23d [<ffffffff80271f60>] dump_stack+0x15/0x17 [<ffffffff8020b9f8>] __might_sleep+0xb2/0xb4 [<ffffffff8029232e>] __cond_resched+0x15/0x55 [<ffffffff80267eb8>] cond_resched+0x3b/0x42 [<ffffffff80268c64>] console_conditional_schedule+0x12/0x14 [<ffffffff80368159>] fbcon_redraw+0xf6/0x160 [<ffffffff80369c58>] fbcon_scroll+0x5d9/0xb52 [<ffffffff803a43c4>] scrup+0x6b/0xd6 [<ffffffff803a4453>] lf+0x24/0x44 [<ffffffff803a7ff8>] vt_console_print+0x166/0x23d [<ffffffff80295528>] __call_console_drivers+0x65/0x76 [<ffffffff80295597>] _call_console_drivers+0x5e/0x62 [<ffffffff80217e3f>] release_console_sem+0x14b/0x232 [<ffffffff8036acd6>] fb_flashcursor+0x279/0x2a6 [<ffffffff80251e3f>] run_workqueue+0xa8/0xfb [<ffffffff8024e5e0>] worker_thread+0xef/0x122 [<ffffffff8023660f>] kthread+0x100/0x136 [<ffffffff8026419e>] child_rip+0x8/0x12 This can occur when release_console_sem() is called but the log buffer still has contents that need to be flushed. The console drivers are called while the console_may_schedule flag is still true. The might_sleep() is triggered when fbcon calls console_conditional_schedule(). Fix by setting console_may_schedule to zero earlier, before the call to the console drivers. Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-06 08:57:47 -07:00
Chuck Ebbert	9f59ce5d0e	[PATCH] ptrace: make pid of child process available for PTRACE_EVENT_VFORK_DONE When delivering PTRACE_EVENT_VFORK_DONE, provide pid of the child process when tracer calls ptrace(PTRACE_GETEVENTMSG). This is already (accidentally) available when the tracer is tracing VFORK in addition to VFORK_DONE. Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Cc: Daniel Jacobowitz <dan@debian.org> Cc: Albert Cahalan <acahalan@gmail.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-06 08:57:46 -07:00
Christian Borntraeger	e91467ecd1	[PATCH] bug in futex unqueue_me This patch adds a barrier() in futex unqueue_me to avoid aliasing of two pointers. On my s390x system I saw the following oops: Unable to handle kernel pointer dereference at virtual kernel address 0000000000000000 Oops: 0004 [#1] CPU: 0 Not tainted Process mytool (pid: 13613, task: 000000003ecb6ac0, ksp: 00000000366bdbd8) Krnl PSW : 0704d00180000000 00000000003c9ac2 (_spin_lock+0xe/0x30) Krnl GPRS: 00000000ffffffff 000000003ecb6ac0 0000000000000000 0700000000000000 0000000000000000 0000000000000000 000001fe00002028 00000000000c091f 000001fe00002054 000001fe00002054 0000000000000000 00000000366bddc0 00000000005ef8c0 00000000003d00e8 0000000000144f91 00000000366bdcb8 Krnl Code: ba 4e 20 00 12 44 b9 16 00 3e a7 84 00 08 e3 e0 f0 88 00 04 Call Trace: ([<0000000000144f90>] unqueue_me+0x40/0xe4) [<0000000000145a0c>] do_futex+0x33c/0xc40 [<000000000014643e>] sys_futex+0x12e/0x144 [<000000000010bb00>] sysc_noemu+0x10/0x16 [<000002000003741c>] 0x2000003741c The code in question is: static int unqueue_me(struct futex_q q) { int ret = 0; spinlock_t lock_ptr; /* In the common case we don't take the spinlock, which is nice. / retry: lock_ptr = q->lock_ptr; if (lock_ptr != 0) { spin_lock(lock_ptr); / * q->lock_ptr can change between reading it and * spin_lock(), causing us to take the wrong lock. This * corrects the race condition. [...] and my compiler (gcc 4.1.0) makes the following out of it: 00000000000003c8 <unqueue_me>: 3c8: eb bf f0 70 00 24 stmg %r11,%r15,112(%r15) 3ce: c0 d0 00 00 00 00 larl %r13,3ce <unqueue_me+0x6> 3d0: R_390_PC32DBL .rodata+0x2a 3d4: a7 f1 1e 00 tml %r15,7680 3d8: a7 84 00 01 je 3da <unqueue_me+0x12> 3dc: b9 04 00 ef lgr %r14,%r15 3e0: a7 fb ff d0 aghi %r15,-48 3e4: b9 04 00 b2 lgr %r11,%r2 3e8: e3 e0 f0 98 00 24 stg %r14,152(%r15) 3ee: e3 c0 b0 28 00 04 lg %r12,40(%r11) /* write q->lock_ptr in r12 / 3f4: b9 02 00 cc ltgr %r12,%r12 3f8: a7 84 00 4b je 48e <unqueue_me+0xc6> / if r12 is zero then jump over the code.... / 3fc: e3 20 b0 28 00 04 lg %r2,40(%r11) / write q->lock_ptr in r2 / 402: c0 e5 00 00 00 00 brasl %r14,402 <unqueue_me+0x3a> 404: R_390_PC32DBL _spin_lock+0x2 / use r2 as parameter for spin_lock */ So the code becomes more or less: if (q->lock_ptr != 0) spin_lock(q->lock_ptr) instead of if (lock_ptr != 0) spin_lock(lock_ptr) Which caused the oops from above. After adding a barrier gcc creates code without this problem: [...] (the same) 3ee: e3 c0 b0 28 00 04 lg %r12,40(%r11) 3f4: b9 02 00 cc ltgr %r12,%r12 3f8: b9 04 00 2c lgr %r2,%r12 3fc: a7 84 00 48 je 48c <unqueue_me+0xc4> 400: c0 e5 00 00 00 00 brasl %r14,400 <unqueue_me+0x38> 402: R_390_PC32DBL _spin_lock+0x2 As a general note, this code of unqueue_me seems a bit fishy. The retry logic of unqueue_me only works if we can guarantee, that the original value of q->lock_ptr is always a spinlock (Otherwise we overwrite kernel memory). We know that q->lock_ptr can change. I dont know what happens with the original spinlock, as I am not an expert with the futex code. Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@timesys.com> Signed-off-by: Christian Borntraeger <borntrae@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-06 08:57:46 -07:00
Rafael J. Wysocki	a7ef7878ea	[PATCH] Make suspend possible with a traced process at a breakpoint It should be possible to suspend, either to RAM or to disk, if there's a traced process that has just reached a breakpoint. However, this is a special case, because its parent process might have been frozen already and then we are unable to deliver the "freeze" signal to the traced process. If this happens, it's better to cancel the freezing of the traced process. Ref. http://bugzilla.kernel.org/show_bug.cgi?id=6787 Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-06 08:57:45 -07:00
Al Viro	3f2792ffbd	[PATCH] take filling ->pid, etc. out of audit_get_context() move that stuff downstream and into the only branch where it'll be used. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-08-03 10:59:51 -04:00
Al Viro	5ac3a9c26c	[PATCH] don't bother with aux entires for dummy context Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-08-03 10:59:42 -04:00
Al Viro	d51374adf5	[PATCH] mark context of syscall entered with no rules as dummy Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-08-03 10:59:26 -04:00
Al Viro	471a5c7c83	[PATCH] introduce audit rules counter Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-08-03 10:55:18 -04:00
Amy Griffis	5422e01ac1	[PATCH] fix audit oops with invalid operator Michael C Thompson wrote: [Tue Aug 01 2006, 02:36:36PM EDT] > The trigger for this oops is: > # auditctl -a exit,always -S pread64 -F 'inode<1' Setting the err value will fix it. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-08-03 10:54:43 -04:00
Amy Griffis	6988434ee5	[PATCH] fix oops with CONFIG_AUDIT and !CONFIG_AUDITSYSCALL Always initialize the audit_inode_hash[] so we don't oops on list rules. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-08-03 10:50:39 -04:00
Amy Griffis	73d3ec5aba	[PATCH] fix missed create event for directory audit When an object is created via a symlink into an audited directory, audit misses the event due to not having collected the inode data for the directory. Modify __audit_inode_child() to copy the parent inode data if a parent wasn't found in audit_names[]. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-08-03 10:50:30 -04:00
Amy Griffis	3e2efce067	[PATCH] fix faulty inode data collection for open() with O_CREAT When the specified path is an existing file or when it is a symlink, audit collects the wrong inode number, which causes it to miss the open() event. Adding a second hook to the open() path fixes this. Also add audit_copy_inode() to consolidate some code. Signed-off-by: Amy Griffis <amy.griffis@hp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-08-03 10:50:21 -04:00
Linus Torvalds	ae74c3b69a	Fix force_sig_info() semantics after cleanups Suresh points out that commit `b0423a0d9c` broke the semantics of a synchronous signal like SIGSEGV occurring recursively inside its own handler handler (or, indeed, any other context when the signal was blocked). That was unintentional, and this fixes things up by reinstating the old semantics, but without reverting the cleanups. Cc: Paul E. McKenney <paulmck@us.ibm.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-08-02 20:17:49 -07:00
Josh Triplett	51d8c5edd3	[PATCH] timer: Fix tvec_bases initializer kernel/timer.c defines a (per-cpu) pointer to tvec_base_t, but initializes it using { &a_tvec_base_t }, which sparse warns about; change this to just &a_tvec_base_t. Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:44 -07:00
Steven Rostedt	d07fe82c24	[PATCH] reference rt-mutex-design in rtmutex.c In order to prevent Doc Rot, this patch adds a reference to the design document for rtmutex.c in rtmutex.c. So when someone needs to update or change the design of that file they will know that a document actually exists that explains the design (helping them change it), and hopefully that they will update the document if they too change the design. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:43 -07:00
Tim Chen	3c829c367a	[PATCH] Reducing local_bh_enable/disable overhead in irqtrace The recent changes from irqtrace feature has added overheads to local_bh_disable and local_bh_enable that reduces UDP performance across x86_64 and IA64, even though IA64 does not support the irqtrace feature. Patch in question is [PATCH]lockdep: irqtrace subsystem, core http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=c ommit;h=de30a2b355ea85350ca2f58f3b9bf4e5bc007986 Prior to this patch, local_bh_disable was a short macro. Now it is a function which calls __local_bh_disable with added irq flags save and restore. The irq flags save and restore were also added to local_bh_enable, probably for injecting the trace irqs code. This overhead is on the generic code path across all architectures. On a IA_64 test machine (Itanium-2 1.6 GHz) running a benchmark like netperf's UDP streaming test, the added overhead results in a drop of 3% in throughput, as udp_sendmsg calls the local_bh_enable/disable several times. Other workloads that have heavy usages of local_bh_enable/disable could also be affected. The patch ideally should not have affected IA-64 performance as it does not have IRQ tracing support. A significant portion of the overhead is in the added irq flags save and restore, which I think is not needed if IRQ tracing is unused. A suggested patch is attached below that recovers the lost performance. However, the "ifdef"s in the patch are a bit ugly. Signed-off-by: Tim Chen <tim.c.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:42 -07:00
Heiko Carstens	b50f60ceee	[PATCH] pi-futex: missing pi_waiters plist initialization Initialize init task's pi_waiters plist. Otherwise cpu hotplug of cpu 0 might crash, since rt_mutex_getprio() accesses an uninitialized list head. call chain which led to crash: take_cpu_down sched_idle_next __setscheduler rt_mutex_getprio Using PLIST_HEAD_INIT in the INIT_TASK macro doesn't work unfortunately, since the pi_waiters member is only conditionally present. Cc: Arjan van de Ven <arjan@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:41 -07:00
Rolf Eike Beer	0fcb78c22f	[PATCH] Add DocBook documentation for workqueue functions kernel/workqueue.c was omitted from generating kernel documentation. This adds a new section "Workqueues and Kevents" and adds documentation for some of the functions. Some functions in this file already had DocBook-style comments, now they finally become visible. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:40 -07:00
Jim Houston	2d7d253548	[PATCH] fix cond_resched() fix In cond_resched_lock() it calls __resched_legal() before dropping the spin lock. __resched_legal() will always finds the preempt_count non-zero and will prevent the call to __cond_resched(). The attached patch adds a parameter to __resched_legal() with the expected preempt_count value. Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:40 -07:00
Steven Rostedt	6ea24f9ad1	[PATCH] fix bad macro param in timer.c We have #define INDEX(N) (base->timer_jiffies >> (TVR_BITS + N * TVN_BITS)) & TVN_MASK and it's used via list = varray[i + 1]->vec + (INDEX(i + 1)); So, due to underparenthesisation, this INDEX(i+1) is now a ... (TVR_BITS + i + 1 * TVN_BITS)) ... So this bugfix changes behaviour. It worked before by sheer luck: "If i was anything but 0, it was broken. But this was only used by s390 and arm. Since it was for the next interrupt, could that next interrupt be a problem (going into the second cascade)? But it was probably seldom wrong. That is, this would fail if the next interrupt was in the second cascade, and was wrapped. Which may never of happened. Also if it did happen, it would have just missed the interrupt. If an interrupt was missed, and no one was there to miss it, was it really missed :-)" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:40 -07:00
Chandra Seetharaman	8c78f3075d	[PATCH] cpu hotplug: replace __devinit* with __cpuinit* for cpu notifications Few of the callback functions and notifier blocks that are associated with cpu notifications incorrectly have __devinit and __devinitdata. They should be __cpuinit and __cpuinitdata instead. It makes no functional difference but wastes text area when CONFIG_HOTPLUG is enabled and CONFIG_HOTPLUG_CPU is not. This patch fixes all those instances. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Cc: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:39 -07:00
bibo, mao	a9ad965ea9	[PATCH] IA64: kprobe invalidate icache of jump buffer Kprobe inserts breakpoint instruction in probepoint and then jumps to instruction slot when breakpoint is hit, the instruction slot icache must be consistent with dcache. Here is the patch which invalidates instruction slot icache area. Without this patch, in some machines there will be fault when executing instruction slot where icache content is inconsistent with dcache. Signed-off-by: bibo,mao <bibo.mao@intel.com> Acked-by: "Luck, Tony" <tony.luck@intel.com> Acked-by: Keshavamurthy Anil S <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:38 -07:00
Shailabh Nagar	163ecdff06	[PATCH] delay accounting: temporarily enable by default Enable delay accounting by default so that feature gets coverage testing without requiring special measures. Earlier, it was off by default and had to be enabled via a boot time param. This patch reverses the default behaviour to improve coverage testing. It can be removed late in the kernel development cycle if its believed users shouldn't have to incur any cost if they don't want delay accounting. Or it can be retained forever if the utility of the stats is deemed common enough to warrant keeping the feature on. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:37 -07:00
Shailabh Nagar	d94a041519	[PATCH] taskstats: free skb, avoid returns in send_cpu_listeners Add a missing freeing of skb in the case there are no listeners at all. Also remove the returning of error values by the function as it is unused by the sole caller. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:37 -07:00
Shailabh Nagar	7d94dddd43	[PATCH] make taskstats sending completely independent of delay accounting on/off status Complete the separation of delay accounting and taskstats by ignoring the return value of delay accounting functions that fill in parts of taskstats before it is sent out (either in response to a command or as part of a task exit). Also make delayacct_add_tsk return silently when delay accounting is turned off rather than treat it as an error. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:37 -07:00
David Brownell	15a647eba9	[PATCH] genirq: {en,dis}able_irq_wake() need refcounting too IRQs need refcounting and a state flag to track whether the the IRQ should be enabled or disabled as a "normal IRQ" source after a series of calls to {en,dis}able_irq(). For shared IRQs, the IRQ must be enabled so long as at least one driver needs it active. Likewise, IRQs need the same support to track whether the IRQ should be enabled or disabled as a "wakeup event" source after a series of calls to {en,dis}able_irq_wake(). For shared IRQs, the IRQ must be enabled as a wakeup source during sleep so long as at least one driver needs it. But right now they _don't have_ that refcounting ... which means sharing a wakeup-capable IRQ can't work correctly in some configurations. This patch adds the refcount and flag mechanisms to set_irq_wake() -- which is what {en,dis}able_irq_wake() call -- and minimal documentation of what the irq wake mechanism does. Drivers relying on the older (broken) "toggle" semantics will trigger a warning; that'll be a handful of drivers on ARM systems. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:36 -07:00
Siddha, Suresh B	f712c0c7e1	[PATCH] sched: build_sched_domains() fix Use the correct groups while initializing sched groups power for allnodes_domain. This fixes the crash observed while creating exclusive cpusets. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Reported-and-tested-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:36 -07:00
Ingo Molnar	e3f2ddeac7	[PATCH] pi-futex: robust-futex exit Fix robust PI-futexes to be properly unlocked on unexpected exit. For this to work the kernel has to know whether a futex is a PI or a non-PI one, because the semantics are different. Since the space in relevant glibc data structures is extremely scarce, the best solution is to encode the 'PI' information in bit 0 of the robust list pointer. Existing (non-PI) glibc robust futexes have this bit always zero, so the ABI is kept. New glibc with PI-robust-futexes will set this bit. Further fixes from Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-28 21:02:00 -07:00
Ingo Molnar	627371d73c	[PATCH] pi-futex: robust-futex exit crash fix Fix pi_state->list handling bugs: list handling mishap, locking error. Plus add more debug checks and fix a few style issues i noticed while debugging this. (reported by Ulrich Drepper and Jakub Jelinek.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-28 21:02:00 -07:00
Paul Jackson	abb5a5cc6b	[PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset callback_mutex lock. It only happens on cpu_exclusive cpusets, due to the dynamic sched domain code trying to take the cpu hotplug lock inside the cpuset callback_mutex lock. This bug has apparently been here for several months, but didn't get hit until the right customer load on a large system. This fix appears right from inspection, but it will take a few more days running it on that customers workload to be confident we nailed it. We don't have any other reproducible test case. The cpu_hotplug_lock() tends to cover large runs of code. The other places that hold both that lock and the cpuset callback mutex lock always nest the cpuset lock inside the hotplug lock. This place tries to do the reverse, risking an ABBA deadlock. This is in the cpuset_rmdir() code, where we: * take the callback_mutex lock * mark the cpuset CS_REMOVED * call update_cpu_domains for cpu_exclusive cpusets * in that call, take the cpu_hotplug lock if the cpuset is marked for removal. Thanks to Jack Steiner for identifying this deadlock. The fix is to tear down the dynamic sched domain before we grab the cpuset callback_mutex lock. This way, the two locks are serialized, with the hotplug lock taken and released before trying for the cpuset lock. I suspect that this bug was introduced when I changed the cpuset locking from one lock to two. The dynamic sched domain dependency on cpu_exclusive cpusets and its hotplug hooks were added to this code earlier, when cpusets had only a single lock. It may well have been fine then. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-23 13:03:05 -07:00
Linus Torvalds	aa95387774	cpu hotplug: simplify and hopefully fix locking The CPU hotplug locking was quite messy, with a recursive lock to handle the fact that both the actual up/down sequence wanted to protect itself from being re-entered, but the callbacks that it called also tended to want to protect themselves from CPU events. This splits the lock into two (one to serialize the whole hotplug sequence, the other to protect against the CPU present bitmaps changing). The latter still allows recursive usage because some subsystems (ondemand policy for cpufreq at least) had already gotten too used to the lax locking, but the locking mistakes are hopefully now less fundamental, and we now warn about recursive lock usage when we see it, in the hope that it can be fixed. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-23 12:12:16 -07:00
Shailabh Nagar	bb129994c3	[PATCH] Remove down_write() from taskstats code invoked on the exit() path In send_cpu_listeners(), which is called on the exit path, a down_write() was protecting operations like skb_clone() and genlmsg_unicast() that do GFP_KERNEL allocations. If the oom-killer decides to kill tasks to satisfy the allocations,the exit of those tasks could block on the same semphore. The down_write() was only needed to allow removal of invalid listeners from the listener list. The patch converts the down_write to a down_read and defers the removal to a separate critical region. This ensures that even if the oom-killer is called, no other task's exit is blocked as it can still acquire another down_read. Thanks to Andrew Morton & Herbert Xu for pointing out the oom related pitfalls, and to Chandra Seetharaman for suggesting this fix instead of using something more complex like RCU. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:57 -07:00
Shailabh Nagar	f9fd8914c1	[PATCH] per-task delay accounting taskstats interface: control exit data through cpumasks On systems with a large number of cpus, with even a modest rate of tasks exiting per cpu, the volume of taskstats data sent on thread exit can overflow a userspace listener's buffers. One approach to avoiding overflow is to allow listeners to get data for a limited and specific set of cpus. By scaling the number of listeners and/or the cpus they monitor, userspace can handle the statistical data overload more gracefully. In this patch, each listener registers to listen to a specific set of cpus by specifying a cpumask. The interest is recorded per-cpu. When a task exits on a cpu, its taskstats data is unicast to each listener interested in that cpu. Thanks to Andrew Morton for pointing out the various scalability and general concerns of previous attempts and for suggesting this design. [akpm@osdl.org: build fix] Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:57 -07:00
Shailabh Nagar	ad4ecbcba7	[PATCH] delay accounting taskstats interface send tgid once Send per-tgid data only once during exit of a thread group instead of once with each member thread exit. Currently, when a thread exits, besides its per-tid data, the per-tgid data of its thread group is also sent out, if its thread group is non-empty. The per-tgid data sent consists of the sum of per-tid stats for all remaining threads of the thread group. This patch modifies this sending in two ways: - the per-tgid data is sent only when the last thread of a thread group exits. This cuts down heavily on the overhead of sending/receiving per-tgid data, especially when other exploiters of the taskstats interface aren't interested in per-tgid stats - the semantics of the per-tgid data sent are changed. Instead of being the sum of per-tid data for remaining threads, the value now sent is the true total accumalated statistics for all threads that are/were part of the thread group. The patch also addresses a minor issue where failure of one accounting subsystem to fill in the taskstats structure was causing the send of taskstats to not be sent at all. The patch has been tested for stability and run cerberus for over 4 hours on an SMP. [akpm@osdl.org: bugfixes] Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:57 -07:00
Shailabh Nagar	2589045466	[PATCH] per-task-delay-accounting: /proc export of aggregated block I/O delays Export I/O delays seen by a task through /proc/<tgid>/stats for use in top etc. Note that delays for I/O done for swapping in pages (swapin I/O) is clubbed together with all other I/O here (this is not the case in the netlink interface where the swapin I/O is kept distinct) [akpm@osdl.org: printk warning fix] Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:57 -07:00
Shailabh Nagar	6f44993fe1	[PATCH] per-task-delay-accounting: delay accounting usage of taskstats interface Usage of taskstats interface by delay accounting. Signed-off-by: Shailabh Nagar <nagar@us.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:56 -07:00
Shailabh Nagar	c757249af1	[PATCH] per-task-delay-accounting: taskstats interface Create a "taskstats" interface based on generic netlink (NETLINK_GENERIC family), for getting statistics of tasks and thread groups during their lifetime and when they exit. The interface is intended for use by multiple accounting packages though it is being created in the context of delay accounting. This patch creates the interface without populating the fields of the data that is sent to the user in response to a command or upon the exit of a task. Each accounting package interested in using taskstats has to provide an additional patch to add its stats to the common structure. [akpm@osdl.org: cleanups, Kconfig fix] Signed-off-by: Shailabh Nagar <nagar@us.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:56 -07:00
Chandra Seetharaman	52f17b6c2b	[PATCH] per-task-delay-accounting: cpu delay collection via schedstats Make the task-related schedstats functions callable by delay accounting even if schedstats collection isn't turned on. This removes the dependency of delay accounting on schedstats. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:56 -07:00
Shailabh Nagar	0ff922452d	[PATCH] per-task-delay-accounting: sync block I/O and swapin delay collection Unlike earlier iterations of the delay accounting patches, now delays are only collected for the actual I/O waits rather than try and cover the delays seen in I/O submission paths. Account separately for block I/O delays incurred as a result of swapin page faults whose frequency can be affected by the task/process' rss limit. Hence swapin delays can act as feedback for rss limit changes independent of I/O priority changes. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:56 -07:00
Shailabh Nagar	ca74e92b46	[PATCH] per-task-delay-accounting: setup Initialization code related to collection of per-task "delay" statistics which measure how long it had to wait for cpu, sync block io, swapping etc. The collection of statistics and the interface are in other patches. This patch sets up the data structures and allows the statistics collection to be disabled through a kernel boot parameter. Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com> Signed-off-by: Balbir Singh <balbir@in.ibm.com> Cc: Jes Sorensen <jes@sgi.com> Cc: Peter Chubb <peterc@gelato.unsw.edu.au> Cc: Erich Focht <efocht@ess.nec.de> Cc: Levent Serinol <lserinol@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:56 -07:00
Ingo Molnar	3a5f5e488c	[PATCH] lockdep: core, fix rq-lock handling on __ARCH_WANT_UNLOCKED_CTXSW On platforms that have __ARCH_WANT_UNLOCKED_CTXSW set and want to implement lock validator support there's a bug in rq->lock handling: in this case we dont 'carry over' the runqueue lock into another task - but still we did a spinlock_release() of it. Fix this by making the spinlock_release() in context_switch() dependent on !__ARCH_WANT_UNLOCKED_CTXSW. (Reported by Ralf Baechle on MIPS, which has __ARCH_WANT_UNLOCKED_CTXSW. This fixes a lockdep-internal BUG message on such platforms.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:55 -07:00
OGAWA Hirofumi	a4afee02a5	[PATCH] Fix sighand->siglock usage in kernel/acct.c IRQs must be disabled before taking ->siglock. Noticed by lockdep. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:54 -07:00
john stultz	3e143475c2	[PATCH] improve timekeeping resume robustness Resolve problems seen w/ APM suspend. Due to resume initialization ordering, its possible we could get a timer interrupt before the timekeeping resume() function is called. This patch ensures we don't do any timekeeping accounting before we're fully resumed. (akpm: fixes the machine-freezes-on-APM-resume bug) Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:54 -07:00
Adrian Bunk	6bc02d8412	[PATCH] unexport open_softirq Christoph Hellwig: open_softirq just enables a softirq. The softirq array is statically allocated so to add a new one you would have to patch the kernel. So there's no point to keep this export at all as any user would have to patch the enum in include/linux/interrupt.h anyway. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:53 -07:00
Luca Tettamanti	60198f9992	[PATCH] Add try_to_freeze() to rt-test kthreads When CONFIG_RT_MUTEX_TESTER is enabled kernel refuses to suspend the machine because it's unable to freeze the rt-test-* threads. Add try_to_freeze() after schedule() so that the threads will be freezed correctly; I've tested the patch and it lets the notebook suspends and resumes nicely. Signed-off-by: Luca Tettamanti <kronos.it@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:53 -07:00
Andrew Morton	a0009652af	[PATCH] del_timer_sync(): add cpu_relax() Relax the CPU in the del_timer_sync() busywait loop. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:52 -07:00
Adrian Bunk	52e92e5788	[PATCH] remove kernel/kthread.c:kthread_stop_sem() Remove the now-unneeded kthread_stop_sem(). Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:52 -07:00
Andreas Gruenbacher	098c5eea03	[PATCH] null-terminate over-long /proc/kallsyms symbols Got a customer bug report (https://bugzilla.novell.com/190296) about kernel symbols longer than 127 characters which end up in a string buffer that is not NULL terminated, leading to garbage in /proc/kallsyms. Using strlcpy prevents this from happening, even though such symbols still won't come out right. A better fix would be to not use a fixed-size buffer, but it's probably not worth the trouble. (Modversion'ed symbols even have a length limit of 60.) [bunk@stusta.de: build fix] Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:52 -07:00
Adrian Bunk	cd6ef2ada5	[PATCH] The scheduled unexport of insert_resource Implement the scheduled unexport of insert_resource. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-07-12 16:09:08 -07:00
Adrian Bunk	26865e9c26	[PATCH] remove kernel/power/pm.c:pm_unregister_all() Remove the deprecated and no longer used pm_unregister_all(). Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-07-12 16:09:08 -07:00
Marcel Holtmann	abf75a5033	[PATCH] Fix prctl privilege escalation and suid_dumpable (CVE-2006-2451) Based on a patch from Ernie Petrides During security research, Red Hat discovered a behavioral flaw in core dump handling. A local user could create a program that would cause a core file to be dumped into a directory they would not normally have permissions to write to. This could lead to a denial of service (disk consumption), or allow the local user to gain root privileges. The prctl() system call should never allow to set "dumpable" to the value 2. Especially not for non-privileged users. This can be split into three cases: 1) running as root -- then core dumps will already be done as root, and so prctl(PR_SET_DUMPABLE, 2) is not useful 2) running as non-root w/setuid-to-root -- this is the debatable case 3) running as non-root w/setuid-to-non-root -- then you definitely do NOT want "dumpable" to get set to 2 because you have the privilege escalation vulnerability With case #2, the only potential usefulness is for a program that has designed to run with higher privilege (than the user invoking it) that wants to be able to create root-owned root-validated core dumps. This might be useful as a debugging aid, but would only be safe if the program had done a chdir() to a safe directory. There is no benefit to a production setuid-to-root utility, because it shouldn't be dumping core in the first place. If this is true, then the same debugging aid could also be accomplished with the "suid_dumpable" sysctl. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-12 12:50:25 -07:00
Arjan van de Ven	2c16e9c888	[PATCH] lockdep: disable lock debugging when kernel state becomes untrusted Disable lockdep debugging in two situations where the integrity of the kernel no longer is guaranteed: when oopsing and when hitting a tainting-condition. The goal is to not get weird lockdep traces that don't make sense or are otherwise undebuggable, to not waste time. Lockdep assumes that the previous state it knows about is valid to operate, which is why lockdep turns itself off after the first violation it reports, after that point it can no longer make that assumption. A kernel oops means that the integrity of the kernel compromised; in addition anything lockdep would report is of lesser importance than the oops. All the tainting conditions are of similar integrity-violating nature and also make debugging/diagnosing more difficult. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:27 -07:00
Christoph Hellwig	c59923a15c	[PATCH] remove the tasklist_lock export As announced half a year ago this patch will remove the tasklist_lock export. The previous two patches got rid of the remaining modular users. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:26 -07:00
Ingo Molnar	21d71f513b	[PATCH] uninline init_waitqueue_head() allyesconfig vmlinux size delta: text data bss dec filename 20736884 6073834 3075176 29885894 vmlinux.before 20721009 6073966 3075176 29870151 vmlinux.after ~18 bytes per callsite, 15K of text size (~0.1%) saved. (as an added bonus this also removes a lockdep annotation.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:25 -07:00
Linus Torvalds	aeceb15738	[PATCH] swsusp: fix panic when signature can't be read Do not panic a machine when swsusp signature can't be read. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:22 -07:00
Andrew Morton	712f403af6	[PATCH] swsusp warning fix kernel/power/swap.c: In function 'swsusp_write': kernel/power/swap.c:275: warning: 'start' may be used uninitialized in this function gcc isn't smart enough, so help it. Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:22 -07:00
Rafael J. Wysocki	95018f7c94	[PATCH] swsusp: do not use memcpy for snapshotting memory swsusp should not use memcpy for snapshotting memory, because on some architectures memcpy may increase preempt_count (i386 does this when CONFIG_X86_USE_3DNOW is set). Then, as a result, wrong value of preempt_count is stored in the image. Replace memcpy in copy_data_pages with an open-coded loop. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:22 -07:00
Roman Zippel	e154ff3d2c	[PATCH] adjust clock for lost ticks A large number of lost ticks can cause an overadjustment of the clock. To compensate for this we look at the current error and the larger the error already is the more careful we are at adjusting the error. As small extra fix reset the error when the clock is set. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Acked-by: john stultz <johnstul@us.ibm.com> Cc: Uwe Bugla <uwe.bugla@gmx.de> Cc: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:18 -07:00
Thomas Gleixner	06a9ec291b	[PATCH] pi-futex: Validate futex type instead of oopsing Calling futex_lock_pi is called with a reference to a non PI futex and waiters exist already, lookup_pi_state() oopses due to pi_state == NULL. Check this condition and return -EINVAL to userspace. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:18 -07:00
Adrian Bunk	80d6679a62	[PATCH] kernel/softirq.c: EXPORT_UNUSED_SYMBOL This patch marks an unused export as EXPORT_UNUSED_SYMBOL. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:18 -07:00
Adrian Bunk	c0fc84d2e5	[PATCH] kernel/printk.c: EXPORT_SYMBOL_UNUSED This patch marks unused exports as EXPORT_SYMBOL_UNUSED. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:17 -07:00
Ingo Molnar	d6d897cec2	[PATCH] lockdep: core, reduce per-lock class-cache size lockdep_map is embedded into every lock, which blows up data structure sizes all around the kernel. Reduce the class-cache to be for the default class only - that is used in 99.9% of the cases and even if we dont have a class cached, the lookup in the class-hash is lockless. This change reduces the per-lock dep_map overhead by 56 bytes on 64-bit platforms and by 28 bytes on 32-bit platforms. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:14 -07:00
Arjan van de Ven	55794a412f	[PATCH] lockdep: improve debug output Make lockdep print which lock is held, in the "kfree() of a live lock" scenario. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:14 -07:00
Andi Kleen	f9829cceb6	[PATCH] Minor cleanup to lockdep.c - Use printk formatting for indentation - Don't leave NTFS in the default event filter Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:14 -07:00
Andreas Mohr	2ed6e34f88	[PATCH] small kernel/sched.c cleanup - constify and optimize stat_nam (thanks to Michael Tokarev!) - spelling and comment fixes Signed-off-by: Andreas Mohr <andi@lisas.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:13 -07:00
Peter Williams	0a565f7919	[PATCH] sched: fix bug in __migrate_task() Problem: In the function __migrate_task(), deactivate_task() followed by activate_task() is used to move the task from one run queue to another. This has two undesirable effects: 1. The task's priority is recalculated. (Nowhere else in the scheduler code is the priority recalculated for a change of CPU.) 2. The task's time stamp is set to the current time. At the very least, this makes the adjustment of the time stamp before the call to deactivate_task() redundant but I believe the problem is more serious as the time stamp now holds the time of the queue change instead of the time at which the task was woken. In addition, unless dest_rq is the same queue as "current" is on the time stamp could be inaccurate due to inter CPU drift. Solution: Replace the call to activate_task() with one to __activate_task(). Signed-off-by: Peter Williams <pwil3058@bigpond.net.au> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:13 -07:00
Linus Torvalds	ca78f6baca	Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq * master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq: Move workqueue exports to where the functions are defined. [CPUFREQ] Misc cleanups in ondemand. [CPUFREQ] Make ondemand sampling per CPU and remove the mutex usage in sampling path. [CPUFREQ] Add queue_delayed_work_on() interface for workqueues. [CPUFREQ] Remove slowdown from ondemand sampling path.	2006-07-04 14:00:26 -07:00
Andrew Morton	d8cb7c1ded	[PATCH] revert "kthread: convert stop_machine into a kthread" Jiri reports that the stop_machin kthread conversion caused his machine to hang when suspending. Hyperthreading is apparently involved. I don't see why that would be and I can't reproduce it. Revert to the 2.6.17 code. Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-03 21:25:20 -07:00
Linus Torvalds	912b2539e1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: powerpc: add defconfig for Freescale MPC8349E-mITX board powerpc: Add base support for the Freescale MPC8349E-mITX eval board Documentation: correct values in MPC8548E SEC example node [POWERPC] Actually copy over i8259.c to arch/ppc/syslib this time [POWERPC] Add new interrupt mapping core and change platforms to use it [POWERPC] Copy i8259 code back to arch/ppc [POWERPC] New device-tree interrupt parsing code [POWERPC] Use the genirq framework [PATCH] genirq: Allow fasteoi handler to retrigger disabled interrupts [POWERPC] Update the SWIM3 (powermac) floppy driver [POWERPC] Fix error handling in detecting legacy serial ports [POWERPC] Fix booting on Momentum "Apache" board (a Maple derivative) [POWERPC] Fix various offb and BootX-related issues [POWERPC] Add a default config for 32-bit CHRP machines [POWERPC] fix implicit declaration on cell. [POWERPC] change get_property to return void *	2006-07-03 15:28:34 -07:00
Ingo Molnar	70b97a7f0b	[PATCH] sched: cleanup, convert sched.c-internal typedefs to struct convert: - runqueue_t to 'struct rq' - prio_array_t to 'struct prio_array' - migration_req_t to 'struct migration_req' I was the one who added these but they are both against the kernel coding style and also were used inconsistently at places. So just get rid of them at once, now that we are flushing the scheduler patch-queue anyway. Conversion was mostly scripted, the result was reviewed and all secondary whitespace and style impact (if any) was fixed up by hand. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-03 15:27:11 -07:00
Ingo Molnar	36c8b58689	[PATCH] sched: cleanup, remove task_t, convert to struct task_struct cleanup: remove task_t and convert all the uses to struct task_struct. I introduced it for the scheduler anno and it was a mistake. Conversion was mostly scripted, the result was reviewed and all secondary whitespace and style impact (if any) was fixed up by hand. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-03 15:27:11 -07:00

1 2 3 4 5 ...

1355 Commits