linux

Author	SHA1	Message	Date
Magnus Damm	28b146c84e	sh: intc - add support for SH7710 This patch converts the cpu specific interrupt setup code for sh7710 from ipr to intc. While at it new vectors are added to match the information provided by the datasheet. Version two simplifies the Kconfig part. Vectors for IRQ4 and IRQ5 are enabled by default. Use plat_irq_setup_pins() if pins IRQ0-3 should be used in IRQ mode. This patch also adds sh7710 specific platform data for the rtc driver. The base address of SCIF1 is adjusted to match the datasheet. Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-09-21 11:57:45 +09:00
Magnus Damm	70e8be0a4e	sh: intc - add support for SH7705 This patch converts the cpu specific interrupt setup code for sh7705 from ipr to intc. While at it new vectors are added to match the information provided by the datasheet. Vectors for IRQ4 and IRQ5 are enabled by default. Use plat_irq_setup_pins() if pins IRQ0-3 should be used in IRQ mode. This patch also adds sh7705 specific platform data for the rtc driver. Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-09-21 11:57:45 +09:00
Linus Torvalds	335fb8fc71	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [libata] ahci: add ATI SB800 PCI IDs libata-sff: Fix documentation libata: Update the blacklist with a few more devices	2007-09-20 13:25:35 -07:00
Davide Libenzi	b8fceee17a	signalfd simplification This simplifies signalfd code, by avoiding it to remain attached to the sighand during its lifetime. In this way, the signalfd remain attached to the sighand only during poll(2) (and select and epoll) and read(2). This also allows to remove all the custom "tsk == current" checks in kernel/signal.c, since dequeue_signal() will only be called by "current". I think this is also what Ben was suggesting time ago. The external effect of this, is that a thread can extract only its own private signals and the group ones. I think this is an acceptable behaviour, in that those are the signals the thread would be able to fetch w/out signalfd. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-20 13:19:59 -07:00
Wolfgang Walter	9db619e665	rpc: fix garbage in printk in svc_tcp_accept() we upgraded the kernel of a nfs-server from 2.6.17.11 to 2.6.22.6. Since then we get the message lockd: too many open TCP sockets, consider increasing the number of nfsd threads lockd: last TCP connect from ^\\236^\É^D These random characters in the second line are caused by a bug in svc_tcp_accept. (Note: there are two previous __svc_print_addr(sin, buf, sizeof(buf)) calls in this function, either of which would initialize buf correctly; but both are inside "if"'s and are not necessarily executed. This is less obvious in the second case, which is inside a dprintk(), which is a macro which expands to an if statement.) Signed-off-by: Wolfgang Walter <wolfgang.walter@studentenwerk.mhn.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-20 13:15:57 -07:00
henry su	c69c0892d8	[libata] ahci: add ATI SB800 PCI IDs ATI/AMD SB800 shares some device IDs with SB700, and SB800 adds two more device IDs:0x4394,0x4395. Signed-off-by: henry su <henry.su.ati@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 16:07:33 -04:00
Alan Cox	e1cc9de836	libata-sff: Fix documentation Code moved to ioread/iowrite but the comment didn't Also note a posting issue Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 15:58:26 -04:00
Alan Cox	0e3dbc01d5	libata: Update the blacklist with a few more devices Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 15:58:26 -04:00
Linus Torvalds	f685ddaf0f	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [BNX2]: Add PHY workaround for 5709 A1. [PPP] L2TP: Fix skb handling in pppol2tp_xmit [PPP] L2TP: Fix skb handling in pppol2tp_recv_core [PPP] L2TP: Disallow non-UDP datagram sockets [PPP] pppoe: Fix double-free on skb after transmit failure [PKT_SCHED]: Fix 'SFQ qdisc crashes with limit of 2 packets' [NETFILTER]: MAINTAINERS update [NETFILTER]: nfnetlink_log: fix sending of multipart messages	2007-09-20 12:42:47 -07:00
Linus Torvalds	460edb3cd0	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: sky2: version 1.18 sky2: receive FIFO checking sky2: fe+ chip support sky2: reorganize chip revision features sky2: ethtool speed report bug sky2: fix VLAN receive processing (resend) phy: export phy_mii_ioctl myri10ge: Add support for PCI device id 9	2007-09-20 12:42:23 -07:00
Stephen Hemminger	faf60e72d0	sky2: version 1.18 Update version number Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 15:23:00 -04:00
Stephen Hemminger	75e806838a	sky2: receive FIFO checking A driver writer from another operating system hinted that the versions of Yukon 2 chip with rambuffer (EC and XL) have a hardware bug that if the FIFO ever gets completely full it will hang. Sounds like a classic ring full vs ring empty wrap around bug. As a workaround, use the existing watchdog timer to check for ring full lockup. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 15:23:00 -04:00
Stephen Hemminger	05745c4ab1	sky2: fe+ chip support Add support for newest Marvell chips. The Yukon FE plus chip is found in some not yet released laptops. Tested on hardware evaluation boards. This version of the patch is for 2.6.23. It supersedes the two previous patches that are sitting in netdev-2.6 (upstream branch). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 15:23:00 -04:00
Stephen Hemminger	ea76e63598	sky2: reorganize chip revision features This patch should cause no functional changes in driver behaviour. There are (too) many revisions of the Yukon 2 chip now. Instead of adding more conditionals based on chip revision, reorganize into a set of feature flags so adding new versions is less problematic. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 15:23:00 -04:00
Stephen Hemminger	c99210b50f	sky2: ethtool speed report bug On 100mbit versions, the driver always reports gigabit speed available. The correct modes are already computed, then overwritten. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 15:23:00 -04:00
Stephen Hemminger	d6532232cd	sky2: fix VLAN receive processing (resend) The length check for truncated frames was not correctly handling the case where VLAN acceleration had already read the tag. Also, the Yukon EX has some features that use high bit of status as security tag. Signed-off-by: Pierre-Yves Ritschard <pyr@spootnik.org> Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 15:22:59 -04:00
Stefan Richter	be7963b7e7	ieee1394: ohci1394: fix initialization if built non-modular Initialization of ohci1394 was broken according to one reporter if the driver was statically linked, i.e. not built as loadable module. Dmesg: PCI: Device 0000:02:07.0 not available because of resource collisions ohci1394: Failed to enable OHCI hardware. This was reported for a Toshiba Satellite 5100-503. The cause is commit `8df4083c52` in Linux 2.6.19-rc1 which only served purposes of early remote debugging via FireWire. This functionality is better provided by the currently out-of-tree driver ohci1394_earlyinit. Reversal of the commit was OK'd by Andi Kleen. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2007-09-20 21:19:45 +02:00
Michael Chan	cd46171c72	[BNX2]: Add PHY workaround for 5709 A1. Add the DIS_EARLY_DAC PHY workaround for 5709 A1. Without it, link sometimes does not come up. Update version to 1.6.5. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-20 12:14:21 -07:00
Herbert Xu	f3d5e3a415	[PPP] L2TP: Fix skb handling in pppol2tp_xmit This patch makes pppol2tp_xmit call skb_cow_head so that we don't modify cloned skb data. It also gets rid of skb2, since we only need to preserve the original skb for congestion notification, which is only applicable for ppp_async and ppp_sync. The other semantic change made here is the removal of socket accounting for data transmitted out of pppol2tp_xmit. The original code leaked any existing socket skb accounting. We could fix this by dropping the original skb owner. However, this is undesirable as the packet has not physically left the host yet. In fact, all other tunnels in the kernel do not account skb's passing through to their own socket. In particular, ESP over UDP does not do so and it is the closest tunnel type to PPPoL2TP. So this patch simply removes the socket accounting in pppol2tp_xmit. The accounting still applies to control packets of course. I've also added a reminder that the outgoing checksum here doesn't work. I suppose existing deployments don't actually enable checksums. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-20 12:14:18 -07:00
Herbert Xu	7a70e39b66	[PPP] L2TP: Fix skb handling in pppol2tp_recv_core The function pppol2tp_recv_core doesn't handle non-linear packets properly. It also fails to check the remote offset field. This patch fixes these problems. It also removes an unnecessary check on the UDP header which has already been performed by the UDP layer. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-20 12:14:17 -07:00
Herbert Xu	a14d6abc94	[PPP] L2TP: Disallow non-UDP datagram sockets With the addition of UDP-Lite we need to refine the socket check so that only genuine UDP sockets are allowed through. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-20 12:14:17 -07:00
Herbert Xu	21d0c83302	[PPP] pppoe: Fix double-free on skb after transmit failure When I got rid of the second packet in __pppoe_xmit I created a double-free on the skb because of the goto abort on failure. This patch removes that. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-20 12:14:16 -07:00
Alexey Kuznetsov	5588b40d7c	[PKT_SCHED]: Fix 'SFQ qdisc crashes with limit of 2 packets' Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-20 12:14:08 -07:00
Patrick McHardy	1a03b81db9	[NETFILTER]: MAINTAINERS update Update netfilter list addresses and an old email address of myself. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-20 12:13:56 -07:00
Eric Leblond	29c5d4afba	[NETFILTER]: nfnetlink_log: fix sending of multipart messages The following patch fixes the handling of netlink packets containing multiple messages. As exposed during the netfilter workshop, nfnetlink_log was overwriting the message type of the last message (setting it to MSG_DONE) in a multipart packet. The consequence was that libnfnetlink ignored the last message in the packet. The following patch adds a supplementary message (with type MSG_DONE) at the end of the netlink skb. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-20 12:13:52 -07:00
Linus Torvalds	6d0b842d3b	Fix CRLF line endings in Documentation/input/iforce-protocol.txt Emil Medve points out that this documentation file uses CRLF line endings, which means that if you use [core] autocrlf=input (which makes sense if you ever develop under Windows, for example, or if you use other broken tools) in your git config, git will always complain about the file being dirty. This removes the bogus DOS line endings, and removes whitespace at the end of line. Cc: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-20 11:33:45 -07:00
Paul Bolle	bbc15f46fe	[x86 setup] Fix typo in arch/i386/boot/header.S There's an obvious typo in arch/i386/boot/header.S (in your linux-2.6-x86setup.git) that I noticed by just studying the code. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2007-09-20 11:06:59 -07:00
H. Peter Anvin	91c4b8cb5a	[acpi] Correct the decoding of video mode numbers in wakeup.S wakeup.S looks at the video mode number from the setup header and looks to see if it is a VESA mode. Unfortunately, the decoding is done incorrectly and it will attempt to frob the VESA BIOS for any mode number 0x0200 or larger. Correct this, and remove a bunch of #if 0'd code. Massive thanks to Jeff Chua for reporting the bug, and suffering through a large number of experiments in order to track this problem down. Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2007-09-20 11:06:58 -07:00
H. Peter Anvin	3f662b3f6e	[x86 setup] Present the canonical video mode number to the kernel Canonicalize the video mode number as presented to the kernel. The video mode number may be user-entered (e.g. ASK_VGA), an alias (e.g. NORMAL_VGA), or a size specification, and that confuses the suspend wakeup code. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2007-09-20 11:06:58 -07:00
Domen Puncer	680e9fe9d6	phy: export phy_mii_ioctl Export phy_mii_ioctl, so network drivers can use it when built as modules too. Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-20 02:35:50 -04:00
Linus Torvalds	81cfe79b9c	Linux 2.6.23-rc7	2007-09-19 16:01:13 -07:00
Linus Torvalds	097cc62283	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: fix invalid sched_class use sched: add /proc/sys/kernel/sched_compat_yield	2007-09-19 15:47:59 -07:00
Eric Paris	31e8793094	SELinux: fix array out of bounds when mounting with selinux options Given an illegal selinux option it was possible for match_token to work in random memory at the end of the match_table_t array. Note that privilege is required to perform a context mount, so this issue is effectively limited to root only. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>	2007-09-20 08:06:40 +10:00
Hiroshi Shimamoto	9c95e7319b	sched: fix invalid sched_class use When using rt_mutex, a NULL pointer dereference occurs at enqueue_task_rt. Here is a scenario: 1) There are two threads; thread A is fair_sched_class and thread B is rt_sched_class. 2) Thread A is boosted up to rt_sched_class, because thread A holds an rt_mutex lock and thread B is waiting for the lock. 3) At this time, when thread A creates a new thread C, thread C has rt_sched_class. 4) When doing wake_up_new_task() for thread C, the priority of thread C is out of the RT priority range, because the normal priority of thread A is not an RT priority. This corrupts data by overflowing the rt_prio_array. The new thread C should be fair_sched_class. The new thread should have a valid scheduler class before queuing. This patch sets the suitable scheduler class. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-09-19 23:34:46 +02:00
Ingo Molnar	1799e35d5b	sched: add /proc/sys/kernel/sched_compat_yield add /proc/sys/kernel/sched_compat_yield to make sys_sched_yield() more aggressive, by moving the yielding task to the last position in the rbtree. with sched_compat_yield=0: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2539 mingo 20 0 1576 252 204 R 50 0.0 0:02.03 loop_yield 2541 mingo 20 0 1576 244 196 R 50 0.0 0:02.05 loop with sched_compat_yield=1: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2584 mingo 20 0 1576 248 196 R 99 0.0 0:52.45 loop 2582 mingo 20 0 1576 256 204 R 0 0.0 0:00.00 loop_yield Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2007-09-19 23:34:46 +02:00
Brice Goglin	a07bc1ffae	myri10ge: Add support for PCI device id 9 Add support for new Myri-10G boards with PCI device id 9. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-09-19 16:22:09 -04:00
Linus Torvalds	a88a8eff1e	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] cpu-bugs64.c: GCC 3.3 constraint workaround [MIPS] DEC: Initialise ioasic_ssr_lock	2007-09-19 11:45:32 -07:00
Linus Torvalds	c39c06b961	Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb * master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: V4L/DVB (6173a): Documentation: Remove reference to dead "cpia_pp=" boot-time option Revert "V4L/DVB (6173a): Documentation: Remove reference to dead "cpia_pp=" boot-time option"	2007-09-19 11:41:15 -07:00
Linus Torvalds	a78feb7c8a	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: [XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer. [XFS] Ensure file size updates have been completed before writing inode to disk. [XFS] On-demand reaping of the MRU cache	2007-09-19 11:40:13 -07:00
Linus Torvalds	91fe7d7cdd	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SUNSAB]: Fix several bugs.	2007-09-19 11:39:39 -07:00
Linus Torvalds	d56c5c414c	Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: ide: remove unused variables from drivers/ide/ppc/pmac.c ide: ST320413A has the same problem as ST340823A	2007-09-19 11:39:10 -07:00
Linus Torvalds	f15f41383d	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Fix timekeeping on PowerPC 601 [POWERPC] Don't expose clock vDSO functions when CPU has no timebase [POWERPC] spusched: Fix null pointer dereference in find_victim	2007-09-19 11:38:25 -07:00
Linus Torvalds	dbe3ed1c07	x86-64: page faults from user mode are always user faults Randy Dunlap noticed an interesting "crashme" behaviour on his dual Prescott Xeon setup, where he gets page faults with the error code having a zero "user" bit, but the register state points back to user mode. This may be a CPU microcode buglet triggered by some strange instruction pattern that crashme generates, and loading a microcode update seems to possibly have fixed it. Regardless, we really should trust the register state more than the error code, since it's really the register state that determines whether we can actually send a signal, or whether we're in kernel mode and need to oops/kill the process in the case of a page fault. Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-19 11:37:14 -07:00
Maciej W. Rozycki	09abbcffb3	[MIPS] cpu-bugs64.c: GCC 3.3 constraint workaround Add a workaround to address warnings generated on the "n" constraint by GCC 3.3 and below. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-09-19 19:33:14 +01:00
Maciej W. Rozycki	6883599943	[MIPS] DEC: Initialise ioasic_ssr_lock Fix the definition of the ioasic_ssr_lock spinlock to include a proper initialisation. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-09-19 19:33:14 +01:00
Dmitry Torokhov	4f01a757e7	Driver core: fix deprectated sysfs structure for nested class devices Nested class devices used to have 'device' symlink point to a real (physical) device instead of a parent class device. When converting subsystems to struct device we need to keep doing what class devices did if CONFIG_SYSFS_DEPRECATED is Y, otherwise parts of udev break. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Cc: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Greg KH <greg@kroah.com> Tested-by: Anssi Hannula <anssi.hannula@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-19 11:24:18 -07:00
Jeff Dike	508a92741a	uml: fix irqstack crash This patch fixes a crash caused by an interrupt coming in when an IRQ stack is being torn down. When this happens, handle_signal will loop, setting up the IRQ stack again because the tearing down had finished, and handling whatever signals had come in. However, to_irq_stack returns a mask of pending signals to be handled, plus bit zero is set if the IRQ stack was already active, and thus shouldn't be torn down. This causes a problem because when handle_signal goes around the loop, sig will be zero, and to_irq_stack will duly set bit zero in the returned mask, faking handle_signal into believing that it shouldn't tear down the IRQ stack and return thread_info pointers back to their original values. This will eventually cause a crash, as the IRQ stack thread_info will continue pointing to the original task_struct and an interrupt will look into it after it has been freed. The fix is to stop passing a signal number into to_irq_stack. Rather, the pending signals mask is initialized beforehand with the bit for sig already set. References to sig in to_irq_stack can be replaced with references to the mask. [akpm@linux-foundation.org: use UL] Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-19 11:24:18 -07:00
Lee Schermerhorn	480eccf9ae	Fix NUMA Memory Policy Reference Counting This patch proposes fixes to the reference counting of memory policy in the page allocation paths and in show_numa_map(). Extracted from my "Memory Policy Cleanups and Enhancements" series as stand-alone. Shared policy lookup [shmem] has always added a reference to the policy, but this was never unrefed after page allocation or after formatting the numa map data. Default system policy should not require additional ref counting, nor should the current task's task policy. However, show_numa_map() calls get_vma_policy() to examine what may be [likely is] another task's policy. The latter case needs protection against freeing of the policy. This patch adds a reference count to a mempolicy returned by get_vma_policy() when the policy is a vma policy or another task's mempolicy. Again, shared policy is already reference counted on lookup. A matching "unref" [__mpol_free()] is performed in alloc_page_vma() for shared and vma policies, and in show_numa_map() for shared and another task's mempolicy. We can call __mpol_free() directly, saving an admittedly inexpensive inline NULL test, because we know we have a non-NULL policy. Handling policy ref counts for hugepages is a bit trickier. huge_zonelist() returns a zone list that might come from a shared or vma 'BIND policy. In this case, we should hold the reference until after the huge page allocation in dequeue_hugepage(). The patch modifies huge_zonelist() to return a pointer to the mempolicy if it needs to be unref'd after allocation. 
Kernel Build [16cpu, 32GB, ia64] - average of 10 runs: w/o patch w/ refcount patch Avg Std Devn Avg Std Devn Real: 100.59 0.38 100.63 0.43 User: 1209.60 0.37 1209.91 0.31 System: 81.52 0.42 81.64 0.34 Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Acked-by: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-19 11:24:18 -07:00
Pavel Emelyanov	28f300d236	Fix user namespace exiting OOPs It turned out, that the user namespace is released during the do_exit() in exit_task_namespaces(), but the struct user_struct is released only during the put_task_struct(), i.e. MUCH later. On debug kernels with poisoned slabs this will cause the oops in uid_hash_remove() because the head of the chain, which resides inside the struct user_namespace, will be already freed and poisoned. Since the uid hash itself is required only when someone can search it, i.e. when the namespace is alive, we can safely unhash all the user_struct-s from it during the namespace exiting. The subsequent free_uid() will complete the user_struct destruction. For example simple program #include <sched.h> char stack[2 * 1024 * 1024]; int f(void *foo) { return 0; } int main(void) { clone(f, stack + 1 * 1024 * 1024, 0x10000000, 0); return 0; } run on kernel with CONFIG_USER_NS turned on will oops the kernel immediately. This was spotted during OpenVZ kernel testing. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Acked-by: "Serge E. Hallyn" <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-19 11:24:18 -07:00
Pavel Emelyanov	735de2230f	Convert uid hash to hlist Surprisingly, but (spotted by Alexey Dobriyan) the uid hash still uses list_heads, thus occupying twice as much place as it could. Convert it to hlist_heads. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-19 11:24:18 -07:00

65193 Commits