ia64 and ppc64 had hugetlb_free_pgtables functions which were no longer being
called, and it wasn't obvious what to do about them.
The ppc64 case turns out to be easy: the associated tables are noted elsewhere
and freed later, safe to either skip its hugetlb areas or go through the
motions of freeing nothing. Since ia64 does need a special case, restore to
ppc64 the special case of skipping them.
The ia64 hugetlb case has been broken since pgd_addr_end went in, though it
probably appeared to work okay if you just had one such area; in fact it's
been broken much longer if you consider a long munmap spanning from another
region into the hugetlb region.
In the ia64 hugetlb region, more virtual address bits are available than in
the other regions, yet the page tables are structured the same way: the page
at the bottom is larger. Here we need to scale down each addr before passing
it to the standard free_pgd_range. Was about to write a hugely_scaled_down
macro, but found htlbpage_to_page already exists for just this purpose. Fixed
off-by-one in ia64 is_hugepage_only_range.
Uninline free_pgd_range to make it available to ia64. Make sure the
vma-gathering loop in free_pgtables cannot join a hugepage_only_range to any
other (safe to join huges? probably but don't bother).
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There's only one usage of MM_VM_SIZE(mm) left, and it's a troublesome macro
because mm doesn't contain the (32-bit emulation?) info needed. But it too is
only needed because we ignore the end from the vma list.
We could make flush_pgtables return that end, or unmap_vmas. Choose the
latter, since it's a natural fit with unmap_mapping_range_vma needing to know
its restart addr. This does make more than minimal change, but if unmap_vmas
had returned the end before, this is how we'd have done it, rather than
storing the break_addr in zap_details.
unmap_vmas used to return count of vmas scanned, but that's just debug which
hasn't been useful in a while; and if we want the map_count 0 on exit check
back, it can easily come from the final remove_vm_struct loop.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Recent woes with some arches needing their own pgd_addr_end macro; and 4-level
clear_page_range regression since 2.6.10's clear_page_tables; and its
long-standing well-known inefficiency in searching throughout the higher-level
page tables for those few entries to clear and free: all can be blamed on
ignoring the list of vmas when we free page tables.
Replace exit_mmap's clear_page_range of the total user address space by
free_pgtables operating on the mm's vma list; unmap_region use it in the same
way, giving floor and ceiling beyond which it may not free tables. This
brings lmbench fork/exec/sh numbers back to 2.6.10 (unless preempt is enabled,
in which case latency fixes spoil unmap_vmas throughput).
Beware: the do_mmap_pgoff driver failure case must now use unmap_region
instead of zap_page_range, since a page table might have been allocated, and
can only be freed while it is touched by some vma.
Move free_pgtables from mmap.c to memory.c, where its lower levels are adapted
from the clear_page_range levels. (Most of free_pgtables' old code was
actually for a non-existent case, prev not properly set up, dating from before
hch gave us split_vma.) Pass mmu_gather** in the public interfaces, since we
might want to add latency lockdrops later; but no attempt to do so yet, going
by vma should itself reduce latency.
But what if is_hugepage_only_range? Those ia64 and ppc64 cases need careful
examination: put that off until a later patch of the series.
What of x86_64's 32bit vdso page __map_syscall32 maps outside any vma?
And the range to sparc64's flush_tlb_pgtables? It's less clear to me now that
we need to do more than is done here - every PMD_SIZE ever occupied will be
flushed, do we really have to flush every PGDIR_SIZE ever partially occupied?
A shame to complicate it unnecessarily.
Special thanks to David Miller for time spent repairing my ceilings.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Fix prototypes for debugfs functions (in configurations where
debugfs is disabled).
Signed-off-by: Michal Ostrowski <mostrows@speakeasy.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The current <linux/debugfs.h> include file is a little fragile in that
it is not self-contained and hence may cause compile warnings or
errors depending on the files included before it, the kernel config
and the architecture. This patch makes things a little more robust by:
- including <linux/types.h> to get definitions of u32, mode_t, and so on.
- forward declaring struct file_operations.
- including <linux/err.h> when CONFIG_DEBUG_FS is not set
The last change is particularly useful, as a kernel developer is
likely to build with debugfs always enabled and never see the build
breakage cased if debugfs is disabled.
Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
sysfs: allow changing the permissions for already created attributes
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is the first of a few installments of PM API updates to match the
recent switch to "pm_message_t". This installment primarily affects
USB device drivers (for USB interfaces), and it changes the handful of
drivers which currently implement suspend methods:
- <linux/usb.h> and usbcore, signature change
- Some drivers only changed the signature, net effect this just
shuts up "sparse -Wbitwise":
* hid-core
* stir4200
- Two network drivers did that, and also grew slightly more
featureful suspend code ... they now properly shut down
their activities. (As should stir4200...)
* pegasus
* usbnet
Note that the Wake-On-Lan (WOL) support in pegasus doesn't yet work; looks
to me like it's missing a request to turn it on, vs just configuring it.
The ASIX code in usbnet also has WOL hooks that are ready to use; untested.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Index: gregkh-2.6/drivers/net/irda/stir4200.c
===================================================================
With older gcc's:
In file included from drivers/usb/class/cdc-acm.c:63:
include/linux/usb_cdc.h:117: field `bDetailData' has incomplete type
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
diff -puN include/linux/usb_cdc.h~usb_cdc-build-fix include/linux/usb_cdc.h
Like Alpha, sparc64's struct stat was defined before we had the
nanosecond et al. fields added. So like Alpha I have to cons up a
struct stat64 to get this stuff. I'll work on the glibc bits soon.
Also, we were forgetting to fill in the nanosecond fields in the sparc
compat stat64 syscalls.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch removes volatile qualifier from scsi_device->device_busy,
Scsi_Host->host_busy and ->host_failed as the volatile qualifiers
don't serve any purpose now. While at it, convert those fields from
unsigned short to unsigned int as suggested by Christoph.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We have a DID_IMM_RETRY to require a retry at once, but we could do with
a DID_REQUEUE to instruct the mid-layer to treat this command in the
same manner as QUEUE_FULL or BUSY (i.e. halt the submission until
another command returns ... or the queue pressure builds if there are no
outstanding commands).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_cmnd->serial_number_at_timeout doesn't serve any purpose
anymore. All serial_number == serial_number_at_timeout tests
are always true in abort callbacks. Kill the field. Also, as
->pid always equals ->serial_number and ->serial_number
doesn't have any special meaning anymore, update comments
above ->serial_number accordingly. Once we remove all uses of
this field from all lldd's, this field should go.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_cmnd->internal_timeout field doesn't have any meaning
anymore. Kill the field.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We were flushing the D-cache excessively for ptrace() processing
and this makes debugging threads so slow as to be totally unusable.
All process page accesses via ptrace() go via access_process_vm().
This routine, for each process page, uses get_user_pages(). That
in turn does a flush_dcache_page() on the child pages before we
copy in/out the ptrace request data.
Therefore, all we need to do after the data movement is:
1) Flush the D-cache pages if the kernel maps the page to a different
color than userspace does.
2) If we wrote to the page, we need to flush the I-cache on older cpus.
Previously we just flushed the entire cache at the end of a ptrace()
request, and that was beyond stupid.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix show_regs() to provide a backtrace. Provide a new __show_regs()
function which implements the common subset of show_regs() and die().
Add prototypes to asm-arm/system.h
Signed-off-by: Russell King <rmk@arm.linux.org.uk>
We have a DID_IMM_RETRY to require a retry at once, but we could do with
a DID_REQUEUE to instruct the mid-layer to treat this command in the
same manner as QUEUE_FULL or BUSY (i.e. halt the submission until
another command returns ... or the queue pressure builds if there are no
outstanding commands).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_cmnd->serial_number_at_timeout doesn't serve any purpose
anymore. All serial_number == serial_number_at_timeout tests
are always true in abort callbacks. Kill the field. Also, as
->pid always equals ->serial_number and ->serial_number
doesn't have any special meaning anymore, update comments
above ->serial_number accordingly. Once we remove all uses of
this field from all lldd's, this field should go.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
scsi_cmnd->internal_timeout field doesn't have any meaning
anymore. Kill the field.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- add a comment to the device structure that the device_busy field
is now protected by the request_queue->queue_lock
- null out sdev->request_queue after the queue is released to trap
any (and there shouldn't be any) use after the queue is freed.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The current problem seen is that the queue lock is actually in the
SCSI device structure, so when that structure is freed on device
release, we go boom if the queue tries to access the lock again.
The fix here is to move the lock from the scsi_device to the queue.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch hides reparent_to_init(). reparent_to_init() should only be
called by daemonize().
Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
gcc-4 warns with
include/linux/cpuset.h:21: warning: type qualifiers ignored on function
return type
cpuset_cpus_allowed is declared with const
extern const cpumask_t cpuset_cpus_allowed(const struct task_struct *p);
First const should be __attribute__((const)), but the gcc manual
explains that:
"Note that a function that has pointer arguments and examines the data
pointed to must not be declared const. Likewise, a function that calls a
non-const function usually must not be const. It does not make sense for
a const function to return void."
The following patch remove const from the function declaration.
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the new io infrastructure, all of our operators are expecting the
underlying device to be little endian (because the PCI bus, their main
consumer, is LE).
However, there are a fair few devices and busses in the world that are
actually Big Endian. There's even evidence that some of these BE bus and
chip types are attached to LE systems. Thus, there's a need for a BE
equivalent of our io{read,write}{16,32} operations.
The attached patch adds this as io{read,write}{16,32}be. When it's in,
I'll add the first consume (the 53c700 SCSI chip driver).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The hlist_for_each_entry_rcu() comment block refers to a nonexistent
hlist_add_rcu() API, needs to change to hlist_add_head_rcu().
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
These have been deprecated since ->compat_ioctl when in, thus only a short
deprecation period. There's four users left: i2o_config, s390/z90crypy,
s390/dasd and s390/zfcp and for the first two patches are about to be
submitted to get rid of it.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This fixes u32 vs. pm_message_t confusion in remaining places. Fortunately
there's few of them.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This fixes pm_message_t vs. u32 confusion in ppc and aty (I *hope* that's
basically radeon code...). I was not able to test most of these, but I'm
not really changing anything, so it should be okay.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I thought I'm done with fixing u32 vs. pm_message_t ... unfortunately
that turned out not to be the case as Russel King pointed out. Here are
fixes for Documentation and common code (mainly system devices).
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This will allow hotplug CPU in the future and in general cleans up a lot of
crufty code. It also should plug some races that the old hackish way
introduces. Remove one old race workaround in NMI watchdog setup that is not
needed anymore.
I removed the old total sum of bogomips reporting code. The brag value of
BogoMips has been greatly devalued in the last years on the open market.
Real CPU hotplug will need some more work, but the infrastructure for it is
there now.
One drawback: the new TSC sync algorithm is less accurate than before. The
old way of zeroing TSCs is too intrusive to do later. Instead the TSC of the
BP is duplicated now, which is less accurate.
akpm:
- sync_tsc_bp_init seems to have the sense of `init' inverted.
- SPIN_LOCK_UNLOCKED is deprecated - use DEFINE_SPINLOCK.
Cc: <rusty@rustcorp.com.au>
Cc: <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It was confusingly named.
Signed-off-by: Andi Kleen <ak@suse.de>
DESC
x86_64: Switch SMP bootup over to new CPU hotplug state machine
EDESC
From: "Andi Kleen" <ak@suse.de>
This will allow hotplug CPU in the future and in general cleans up a lot of
crufty code. It also should plug some races that the old hackish way
introduces. Remove one old race workaround in NMI watchdog setup that is not
needed anymore.
I removed the old total sum of bogomips reporting code. The brag value of
BogoMips has been greatly devalued in the last years on the open market.
Real CPU hotplug will need some more work, but the infrastructure for it is
there now.
One drawback: the new TSC sync algorithm is less accurate than before. The
old way of zeroing TSCs is too intrusive to do later. Instead the TSC of the
BP is duplicated now, which is less accurate.
Cc: <rusty@rustcorp.com.au>
Cc: <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Appended patch adds the support for Intel dual-core detection and displaying
the core related information in /proc/cpuinfo.
It adds two new fields "core id" and "cpu cores" to x86 /proc/cpuinfo and the
"core id" field for x86_64("cpu cores" field is already present in x86_64).
Number of processor cores in a die is detected using cpuid(4) and this is
documented in IA-32 Intel Architecture Software Developer's Manual (vol 2a)
(http://developer.intel.com/design/pentium4/manuals/index_new.htm#sdm_vol2a)
This patch also adds cpu_core_map similar to cpu_sibling_map.
Slightly hacked by AK.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Calling a notifier three times in the debug handler does not make much sense,
because a debugger can figure out the various conditions by itself. Remove
the additional calls to DIE_DEBUG and DIE_DEBUGSTEP completely.
This matches what i386 does now.
This also makes sure interrupts are always still disabled when calling a
debugger, which prevents:
BUG: using smp_processor_id() in preemptible [00000001] code: tpopf/1470
caller is post_kprobe_handler+0x9/0x70
Call Trace:<ffffffff8024f10f>{smp_processor_id+191} <ffffffff80120e69>{post_kpro
be_handler+9}
<ffffffff80120f7a>{kprobe_exceptions_notify+58}
<ffffffff80144fc0>{notifier_call_chain+32} <ffffffff80110daf>{do_debug+335}
<ffffffff8010f513>{debug+127} <EOE>
on preemptible debug kernels with kprobes when single stepping in user space.
This was probably a bug even on non preempt kernels, this function was
supposed to be running with interrupts off according to a comment there.
Note to third part debugger maintainers: please double check your debugger can
still single step.
Cc: <prasanna@in.ibm.com>
Cc: <jbeulich@novell.com>
Cc: <kaos@sgi.com>
Cc: <jim.houston@ccur.com>
Cc: <jfv@bluesong.net>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Look for gaps in the e820 memory map to put PCI resources in.
This hopefully fixes problems with the PCI code assigning 32bit BARs MMIO
resources which are >32bit.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
local_t is actually a win over atomic_t because it does not need lock
prefixes.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On Intel Noconas the TSC ticks with a constant frequency. Don't scale the
factor used by udelay when cpufreq changes the frequency.
This generalizes an earlier patch by Intel for this.
Cc: <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This allows to use them on x86-64
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use a real VMA to map the 32bit vsyscall page
This interacts better with Hugh's upcomming VMA walk optimization
Also removes some ugly special cases.
Code roughly modelled after the ppc64 vdso version from Ben Herrenschmidt.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the correct file name in BUG()
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>