Commit 2a95ea6c0d ("procfs: do not overflow get_{idle,iowait}_time
for nohz") did not take into account that one some architectures jiffies
and cputime use different units.
This causes get_idle_time() to return numbers in the wrong units, making
the idle time fields in /proc/stat wrong.
Instead of converting the usec value returned by
get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
usecs_to_cputime64 to convert it to the correct unit of cputime64_t.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Artem S. Tashkinov" <t.artem@mailcity.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove dead sysctl_userprocess_debug. The variable is
gone since we map our old userprocess_debug sysctl to
the common code show_unhandled_signals sysctl on s390.
Signed-off-by: Heiko Carsten <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The queue_start_poll function pointer field in struct qdio_initialize
had to change its type and become a vector of function pointers to
support asynchronous delivery of storage blocks so rename the field to
make the type change explicit and ensure no other user of qdio tries
to use the field the old way. During setting up the qdio queues, only
dereference vector elements if the vector is actually allocated.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Einar Lueck <elelueck@de.ibm.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The panic function will first print the panic message to the console,
then stop additional cpus with smp_send_stop and finally call the
function on the panic notifier list.
In case of an I/O based console the panic message will cause I/O to
be started and a function on the panic notifier list will wait for the
completion of the I/O. That does not work if an I/O completion interrupt
has already been delivered to a cpu that is then stopped by smp_send_stop.
To break this cyclic dependency add code to smp_send_stop that gives
the additional cpu the opportunity to complete outstanding interrupts.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Call generic IPC demultiplexer instead of having a nearly identical
s390 variant. Also make sure that native and compat handling now have
the same behaviour.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move the program interruption code and the translation exception identifier
to the pt_regs structure as 'int_code' and 'int_parm_long' and make the
first level interrupt handler in entry[64].S store the two values. That
makes it possible to drop 'prot_addr' and 'trap_no' from the thread_struct
and to reduce the number of arguments to a lot of functions. Finally
un-inline do_trap. Overall this saves 5812 bytes in the .text section of
the 64 bit kernel.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Increase cpu topology change poll frequency if a change is anticipated.
Otherwise a user might be a bit confused to have to wait up to a minute
in order to see a change this should be visible immediatly.
However there is no guarantee that the change will happen during the
time frame the poll frequency is increased.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Another round of cleanup for entry[64].S, in particular the program check
handler looks more reasonable now. The code size for the 31 bit kernel
has been reduced by 616 byte and by 528 byte for the 64 bit version.
Even better the code is a bit faster as well.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There is no reason for the cpu-measurement-facility host id constant to
reside in the lowcore where space is precious. Use an entry in the literal
pool in HANDLE_SIE_INTERCEPT and a stack slot in sie64a.
While we are at it replace the id -1 with 0 to indicate host execution.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove all ifdefs from topology code and also only compile it for the
CONFIG_SCHED_BOOK case. The new code selects SCHED_MC if SCHED_BOOK is
selected. SCHED_MC without SCHED_BOOK is not possible anymore.
Furthermore various sysfs attributes are not available anymore for the
!SCHED_BOOK case. In particular all attributes that correspond to
CPU polarization.
But since all real world kernels have SCHED_BOOK selected anyway this
doesn't matter too much.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The kernel address space of a 64 bit kernel currently uses a three level
page table and the vmemmap array has a fixed address and a fixed maximum
size. A three level page table is good enough for systems with less than
3.8TB of memory, for bigger systems four page table levels need to be
used. Each page table level costs a bit of performance, use 3 levels for
normal systems and 4 levels only for the really big systems.
To avoid bloating sparse.o too much set MAX_PHYSMEM_BITS to 46 for a
maximum of 64TB of memory.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We simply say that regular this_cpu use must be safe regardless of
preemption and interrupt state. That has no material change for x86
and s390 implementations of this_cpu operations. However, arches that
do not provide their own implementation for this_cpu operations will
now get code generated that disables interrupts instead of preemption.
-tj: This is part of on-going percpu API cleanup. For detailed
discussion of the subject, please refer to the following thread.
http://thread.gmane.org/gmane.linux.kernel/1222078
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <alpine.DEB.2.00.1112221154380.11787@router.home>
* master: (848 commits)
SELinux: Fix RCU deref check warning in sel_netport_insert()
binary_sysctl(): fix memory leak
mm/vmalloc.c: remove static declaration of va from __get_vm_area_node
ipmi_watchdog: restore settings when BMC reset
oom: fix integer overflow of points in oom_badness
memcg: keep root group unchanged if creation fails
nilfs2: potential integer overflow in nilfs_ioctl_clean_segments()
nilfs2: unbreak compat ioctl
cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask
evm: prevent racing during tfm allocation
evm: key must be set once during initialization
mmc: vub300: fix type of firmware_rom_wait_states module parameter
Revert "mmc: enable runtime PM by default"
mmc: sdhci: remove "state" argument from sdhci_suspend_host
x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT
IB/qib: Correct sense on freectxts increment and decrement
RDMA/cma: Verify private data length
cgroups: fix a css_set not found bug in cgroup_attach_proc
oprofile: Fix uninitialized memory access when writing to writing to oprofilefs
Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
...
Conflicts:
kernel/cgroup_freezer.c
Put the logic to compute the event index into a per pmu method. This
is required because the x86 rules are weird and wonderful and don't
match the capabilities of the current scheme.
AFAIK only powerpc actually has a usable userspace read of the PMCs
but I'm not at all sure anybody actually used that.
ARM is restored to the default since it currently does not support
userspace access at all. And all software events are provided with a
method that reports their index as 0 (disabled).
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arun Sharma <asharma@fb.com>
Link: http://lkml.kernel.org/n/tip-dfydxodki16lylkt3gl2j7cw@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make cputime_t and cputime64_t nocast to enable sparse checking to
detect incorrect use of cputime. Drop the cputime macros for simple
scalar operations. The conversion macros are still needed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch makes sure we don't underindicate _PAGE_CHANGED in case
we have a race between an operation that changes the page and this
code path that hits us between page_get_storage_key and
page_set_storage_key. Note that we still have a potential
underindication on _PAGE_REFERENCED in the unlikely event that
the page was changed but not referenced _and_ someone references
the page in the race window. That's not considered to be a problem.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The forcedeth changes had a conflict with the conversion over
to atomic u64 statistics in net-next.
The libertas cfg.c code had a conflict with the bss reference
counting fix by John Linville in net-next.
Conflicts:
drivers/net/ethernet/nvidia/forcedeth.c
drivers/net/wireless/libertas/cfg.c
SIGP sense running may cause an intercept on higher level
virtualization, so handle it by checking the CPUSTAT_RUNNING flag.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
CPUSTAT_RUNNING was implemented signifying that a vcpu is not stopped.
This is not, however, what the architecture says: RUNNING should be
set when the host is acting on the behalf of the guest operating
system.
CPUSTAT_RUNNING has been changed to be set in kvm_arch_vcpu_load()
and to be unset in kvm_arch_vcpu_put().
For signifying stopped state of a vcpu, a host-controlled bit has
been used and is set/unset basically on the reverse as the old
CPUSTAT_RUNNING bit (including pushing it down into stop handling
proper in handle_stop()).
Cc: stable@kernel.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
In ESA mode STCKF is not defined even if the facility bit is enabled.
To prevent an illegal operation we must also check if we run a 64 bit kernel.
To make the check perform well add the STCKF bit to the machine flags.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The pgste_update_all / pgste_update_young and pgste_set_pte need to
check if the pte entry contains a valid page address before the storage
key can be accessed. In addition pgste_set_pte needs to set the access
key and fetch protection bit of the new pte entry, not the old entry.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The 802.1X EAPOL handshake hostapd does requires
knowing whether the frame was ack'ed by the peer.
Currently, we fudge this pretty badly by not even
transmitting the frame as a normal data frame but
injecting it with radiotap and getting the status
out of radiotap monitor as well. This is rather
complex, confuses users (mon.wlan0 presence) and
doesn't work with all hardware.
To get rid of that hack, introduce a real wifi TX
status option for data frame transmissions.
This works similar to the existing TX timestamping
in that it reflects the SKB back to the socket's
error queue with a SCM_WIFI_STATUS cmsg that has
an int indicating ACK status (0/1).
Since it is possible that at some point we will
want to have TX timestamping and wifi status in a
single errqueue SKB (there's little point in not
doing that), redefine SO_EE_ORIGIN_TIMESTAMPING
to SO_EE_ORIGIN_TXSTATUS which can collect more
than just the timestamp; keep the old constant
as an alias of course. Currently the internal APIs
don't make that possible, but it wouldn't be hard
to split them up in a way that makes it possible.
Thanks to Neil Horman for helping me figure out
the functions that add the control messages.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (75 commits)
KVM: SVM: Keep intercepting task switching with NPT enabled
KVM: s390: implement sigp external call
KVM: s390: fix register setting
KVM: s390: fix return value of kvm_arch_init_vm
KVM: s390: check cpu_id prior to using it
KVM: emulate lapic tsc deadline timer for guest
x86: TSC deadline definitions
KVM: Fix simultaneous NMIs
KVM: x86 emulator: convert push %sreg/pop %sreg to direct decode
KVM: x86 emulator: switch lds/les/lss/lfs/lgs to direct decode
KVM: x86 emulator: streamline decode of segment registers
KVM: x86 emulator: simplify OpMem64 decode
KVM: x86 emulator: switch src decode to decode_operand()
KVM: x86 emulator: qualify OpReg inhibit_byte_regs hack
KVM: x86 emulator: switch OpImmUByte decode to decode_imm()
KVM: x86 emulator: free up some flag bits near src, dst
KVM: x86 emulator: switch src2 to generic decode_operand()
KVM: x86 emulator: expand decode flags to 64 bits
KVM: x86 emulator: split dst decode to a generic decode_operand()
KVM: x86 emulator: move memop, memopp into emulation context
...
We use both the external call and emergency call IPIs to signal remote
cpus. Therefore it makes sense to account them differently withing
/proc/irqstats so we actually know what happened.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this compiler error for !CONFIG_SMP:
CC arch/s390/mm/pgtable.o
arch/s390/mm/pgtable.c: In function ‘gmap_flush_tlb’:
arch/s390/mm/pgtable.c:202:3: error: implicit declaration of function ‘__tlb_flush_global’ [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix three sparse warnings in math-emu / sysinfo:
arch/s390/kernel/sysinfo.c:448:17: error: return expression in void function
arch/s390/kernel/sysinfo.c:445:25: warning: shift too big (32) for type unsigned int
arch/s390/kernel/sysinfo.c:445:25: warning: shift too big (32) for type unsigned int
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add get_clock_fast() which uses the slightly faster stckf if available.
If stckf is not available fall back to stck, which has the same width.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Linux on System z uses a ballooner based on diagnose 0x10. (aka as
collaborative memory management). This patch implements diagnose
0x10 on the guest address space.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
gmap_fault needs to walk the guest page table. However, parts of
that may change if some other thread does munmap. In that case
gmap_unmap_notifier will also unmap the corresponding parts from
the guest page table. We need to take mmap_sem in order to serialize
these operations.
do_exception now calls __gmap_fault with mmap_sem held which does
not get exported to modules. The exported function, which is called
from KVM, now takes mmap_sem.
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add support for CHSC I/O interrupt statistics in /proc/interrupts.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The user space program can change its addressing mode between the
24-bit, 31-bit and the 64-bit mode if the kernel is 64 bit. Currently
the kernel always forces the standard amode on signal delivery and
signal return and on ptrace: 64-bit for a 64-bit process, 31-bit for
a compat process and 31-bit kernels. Change the signal and ptrace code
to allow the full range of addressing modes. Signal handlers are
run in the standard addressing mode for the process.
One caveat is that even an 31-bit compat process can switch to the
64-bit mode. The next signal will switch back into the 31-bit mode
and there is no room in the 31-bit compat signal frame to store the
information that the program came from the 64-bit mode.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Split out addressing mode bits from PSW_BASE_BITS, rename PSW_BASE_BITS
to PSW_MASK_BASE, get rid of psw_user32_bits, remove unused function
enabled_wait(), introduce PSW_MASK_USER, and drop PSW_MASK_MERGE macros.
Change psw_kernel_bits / psw_user_bits to contain only the bits that
are always set in the respective mode.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add an explicit TIF_SYSCALL bit that indicates if a task is inside
a system call. The svc_code in the pt_regs structure is now only
valid if TIF_SYSCALL is set. With this definition TIF_RESTART_SVC
can be replaced with TIF_SYSCALL. Overall do_signal is a bit more
readable and it saves a few lines of code.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
An instruction with an address right below the adress limit for the
current addressing mode will wrap. The instruction restart logic in
the protection fault handler and the signal code need to follow the
wrapping rules to find the correct instruction address.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For a ERESTARTNOHAND/ERESTARTSYS/ERESTARTNOINTR restarting system call
do_signal will prepare the restart of the system call with a rewind of
the PSW before calling get_signal_to_deliver (where the debugger might
take control). For A ERESTART_RESTARTBLOCK restarting system call
do_signal will set -EINTR as return code.
There are two issues with this approach:
1) strace never sees ERESTARTNOHAND, ERESTARTSYS, ERESTARTNOINTR or
ERESTART_RESTARTBLOCK as the rewinding already took place or the
return code has been changed to -EINTR
2) if get_signal_to_deliver does not return with a signal to deliver
the restart via the repeat of the svc instruction is left in place.
This opens a race if another signal is made pending before the
system call instruction can be reexecuted. The original system call
will be restarted even if the second signal would have ended the
system call with -EINTR.
These two issues can be solved by dropping the early rewind of the
system call before get_signal_to_deliver has been called and by using
the TIF_RESTART_SVC magic to do the restart if no signal has to be
delivered. The only situation where the system call restart via the
repeat of the svc instruction is appropriate is when a SA_RESTART
signal is delivered to user space.
Unfortunately this breaks inferior calls by the debugger again. The
system call number and the length of the system call instruction is
lost over the inferior call and user space will see ERESTARTNOHAND/
ERESTARTSYS/ERESTARTNOINTR/ERESTART_RESTARTBLOCK. To correct this a
new ptrace interface is added to save/restore the system call number
and system call instruction length.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the save_area_64 field from the 0xe00 - 0xf00 area in the lowcore.
Use a free slot in the save_area array instead.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch implements the crash_map_pages() function for s390.
KEXEC_CRASH_MEM_ALIGN is set to HPAGE_SIZE, in order to support
kernel mappings that use large pages. We also use HPAGE_SIZE alignment
for CONFIG_HUGETLB_PAGE=n in order to have the same 1 MiB alignment on
all s390 systems.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch defines for s390 an ABI defined pointer to the vmcoreinfo note at
a well known address. With this patch tools are able to find this information
in dumps created by stand-alone or hypervisor dump tools.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch provides the architecture specific part of the s390 kdump
support.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add access function for real memory needed by s390 kdump backend.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
PSW restart can be triggered on offline CPUs. If this happens, currently
the PSW restart code fails, because functions like smp_processor_id()
do not work on offline CPUs. This patch fixes this as follows:
If PSW restart is triggered on an offline CPU, the PSW restart (sigp restart)
is done a second time on another CPU that is online and the old CPU is
stopped afterwards.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
_TIF_SINGLE_STEP is incorrectly defined as 1<<TIF_FREEZE. Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Current IRQ statistics support does not show detail counts for I/O
interrupts which are processed internally only. The result is a
summation count which is way off such as this one:
CPU0 CPU1 CPU2
I/O: 1331 710 442
[...]
QAI: 15 16 16 [I/O] QDIO Adapter Interrupt
QDI: 1 0 0 [I/O] QDIO Interrupt
DAS: 706 645 381 [I/O] DASD
C15: 26 10 0 [I/O] 3215
C70: 0 0 0 [I/O] 3270
TAP: 0 0 0 [I/O] Tape
VMR: 0 0 0 [I/O] Unit Record Devices
LCS: 0 0 0 [I/O] LCS
CLW: 0 0 0 [I/O] CLAW
CTC: 0 0 0 [I/O] CTC
APB: 0 0 0 [I/O] AP Bus
Fix this by moving I/O interrupt accounting into the common I/O layer.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Implement sigp external call, which might be required for guests that
issue an external call instead of an emergency signal for IPI.
This fixes an issue with "KVM: unknown SIGP: 0x02" when booting
such an SMP guest.
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits)
leases: fix write-open/read-lease race
nfs: drop unnecessary locking in llseek
ext4: replace cut'n'pasted llseek code with generic_file_llseek_size
vfs: add generic_file_llseek_size
vfs: do (nearly) lockless generic_file_llseek
direct-io: merge direct_io_walker into __blockdev_direct_IO
direct-io: inline the complete submission path
direct-io: separate map_bh from dio
direct-io: use a slab cache for struct dio
direct-io: rearrange fields in dio/dio_submit to avoid holes
direct-io: fix a wrong comment
direct-io: separate fields only used in the submission path from struct dio
vfs: fix spinning prevention in prune_icache_sb
vfs: add a comment to inode_permission()
vfs: pass all mask flags check_acl and posix_acl_permission
vfs: add hex format for MAY_* flag values
vfs: indicate that the permission functions take all the MAY_* flags
compat: sync compat_stats with statfs.
vfs: add "device" tag to /proc/self/mountstats
cleanup: vfs: small comment fix for block_invalidatepage
...
Fix up trivial conflict in fs/gfs2/file.c (llseek changes)
This was found by inspection while tracking a similar
bug in compat_statfs64, that has been fixed in mainline
since decemeber.
- This fixes a bug where not all of the f_spare fields
were cleared on mips and s390.
- Add the f_flags field to struct compat_statfs
- Copy f_flags to userspace in case someone cares.
- Use __clear_user to copy the f_spare field to userspace
to ensure that all of the elements of f_spare are cleared.
On some architectures f_spare is has 5 ints and on some
architectures f_spare only has 4 ints. Which makes
the previous technique of clearing each int individually
broken.
I don't expect anyone actually uses the old statfs system
call anymore but if they do let them benefit from having
the compat and the native version working the same.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Analog to git commit 59e4c3a2fe
do not clear the additional personality flags on exec. We
need to inherit the personality bits in PER_MASK across exec.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
598841ca99 ([S390] use gmap address
spaces for kvm guest images) changed kvm to use a separate address
space for kvm guests. This address space was switched in __vcpu_run
In some cases (preemption, page fault) there is the possibility that
this address space switch is lost.
The typical symptom was a huge amount of validity intercepts or
random guest addressing exceptions.
Fix this by doing the switch in sie_loop and sie_exit and saving the
address space in the gmap structure itself. Also use the preempt
notifier.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
FICON Express8S supports hardware data router, which requires an
adapted qdio request format.
This part 1/2 provides the qdio base required for exploitation in
zfcp.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch ensures that signal adapter commands are issued if they are
indicated to be required.
Signed-off-by: Einar Lueck <elelueck@de.ibm.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces support for asynchronous delivery of storage blocks for
Hipersockets. Upper layers may exploit this functionality to reuse SBALs for
which the delivery status is still pending.
Signed-off-by: Einar Lueck <elelueck@de.ibm.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The address limit is already set in flush_old_exec() so those calls to
set_fs(USER_DS) are redundant.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The diagnose 308 call is the prefered method for clearing all ongoing I/O.
Therefore if it is available we use it instead of doing a manual reset.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
With this patch a new S390 shutdown trigger "restart" is added. If under
z/VM "systerm restart" is entered or under the HMC the "PSW restart" button
is pressed, the PSW located at 0 (31 bit) or 0x1a0 (64 bit) bit is loaded.
Now we execute do_restart() that processes the restart action that is
defined under /sys/firmware/shutdown_actions/on_restart. Currently the
following actions are possible: reipl (default), stop, vmcmd, dump, and
dump_reipl.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
After changing all consumers of atomics to include <linux/atomic.h>, we
ran into some compile time errors due to this dependency chain:
linux/atomic.h
-> asm/atomic.h
-> asm-generic/atomic-long.h
where atomic-long.h could use funcs defined later in linux/atomic.h
without a prototype. This patches moves the code that includes
asm-generic/atomic*.h to linux/atomic.h.
Archs that need <asm-generic/atomic64.h> need to select
CONFIG_GENERIC_ATOMIC64 from now on (some of them used to include it
unconditionally).
Compile tested on i386 and x86_64 with allnoconfig.
Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is in preparation for more generic atomic primitives based on
__atomic_add_unless.
Signed-off-by: Arun Sharma <asharma@fb.com>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The majority of architectures implement ext2 atomic bitops as
test_and_{set,clear}_bit() without spinlock.
This adds this type of generic implementation in ext2-atomic-setbit.h and
use it wherever possible.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Suggested-by: Andreas Dilger <adilger@dilger.ca>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ poleg@redhat.com: no need to declare show_regs() in ptrace.h, sched.h does this ]
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SIGP emerg needs to pass the source vpu adress into __LC_CPU_ADDRESS of the
target guest.
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch removes kvm-s390 internal assumption of a linear mapping
of guest address space to user space. Previously, guest memory was
translated to user addresses using a fixed offset (gmsor). The new
code uses gmap_fault to resolve guest addresses.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch switches kvm from using (Qemu's) user address space to
Martin's gmap address space. This way QEMU does not have to use a
linker script in order to fit large guests at low addresses in its
address space.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add code that allows KVM to control the virtual memory layout that
is seen by a guest. The guest address space uses a second page table
that shares the last level pte-tables with the process page table.
If a page is unmapped from the process page table it is automatically
unmapped from the guest page table as well.
The guest address space mapping starts out empty, KVM can map any
individual 1MB segments from the process virtual memory to any 1MB
aligned location in the guest virtual memory. If a target segment in
the process virtual memory does not exist or is unmapped while a
guest mapping exists the desired target address is stored as an
invalid segment table entry in the guest page table.
The population of the guest page table is fault driven.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The alignment is missing for various global symbols in s390 assembly code.
With a recent gcc and an instruction like stgrl this can lead to a
specification exception if the instruction uses such a mis-aligned address.
Specify the alignment explicitely and while add it define __ALIGN for s390
and use the ENTRY define to save some lines of code.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The entry to / exit from sie has subtle dependencies to the first level
interrupt handler. Move the sie assembler code to entry64.S and replace
the SIE_HOOK callback with a test and the new _TIF_SIE bit.
In addition this patch fixes several problems in regard to the check for
the_TIF_EXIT_SIE bits. The old code checked the TIF bits before executing
the interrupt handler and it only modified the instruction address if it
pointed directly to the sie instruction. In both cases it could miss
a TIF bit that normally would cause an exit from the guest and would
reenter the guest context.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When running a kvm guest we can get intercepts for tprot, if the host
page table is read-only or not populated. This patch implements the
most common case (linux memory detection).
This also allows host copy on write for guest memory on newer systems.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Do not trace arch_local_save_flags(), arch_local_irq_*() and friends.
Although they are marked inline, gcc may still make a function out of
them and add it to the pool of functions that are traced by the function
tracer. This can cause undesirable results (kernel panic, triple faults,
etc).
Add the notrace notation to prevent them from ever being traced.
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
KVM is not available for 31 bit but the KVM defines cause warnings:
arch/s390/include/asm/pgtable.h: In function 'ptep_test_and_clear_user_dirty':
arch/s390/include/asm/pgtable.h:817: warning: integer constant is too large for 'unsigned long' type
arch/s390/include/asm/pgtable.h:818: warning: integer constant is too large for 'unsigned long' type
arch/s390/include/asm/pgtable.h: In function 'ptep_test_and_clear_user_young':
arch/s390/include/asm/pgtable.h:837: warning: integer constant is too large for 'unsigned long' type
arch/s390/include/asm/pgtable.h:838: warning: integer constant is too large for 'unsigned long' type
Add 31 bit versions of the KVM defines to remove the warnings.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Replace the s390 specific rcu page-table freeing code with the
generic variant. This requires to duplicate the definition for the
struct mmu_table_batch as s390 does not use the generic tlb flush
code.
While we are at it remove the restriction that page table fragments
can not be reused after a single fragment has been freed with rcu
and split out allocation and freeing of page tables with pgstes.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The qdio SBAL entry flag is made-up of four different values that are
independent of one another. Some of the bits are reserved by the
hardware and should not be changed by qdio. Currently all four values
are overwritten since the SBAL entry flag is defined as an u32.
Split the SBAL entry flag into four u8's as defined by the hardware
and don't touch the reserved bits.
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
page_get_storage_key() and page_set_storage_key() expect a page address
and not its page frame number. This got inconsistent with 2d42552d
"[S390] merge page_test_dirty and page_clear_dirty".
Result is that we read/write storage keys from random pages and do not
have a working dirty bit tracking at all.
E.g. SetPageUpdate() doesn't clear the dirty bit of requested pages, which
for example ext4 doesn't like very much and panics after a while.
Unable to handle kernel paging request at virtual user address (null)
Oops: 0004 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
CPU: 1 Not tainted 2.6.39-07551-g139f37f-dirty #152
Process flush-94:0 (pid: 1576, task: 000000003eb34538, ksp: 000000003c287b70)
Krnl PSW : 0704c00180000000 0000000000316b12 (jbd2_journal_file_inode+0x10e/0x138)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000 0700000000000000
0000000000316a62 000000003eb34cd0 0000000000000025 000000003c287b88
0000000000000001 000000003c287a70 000000003f1ec678 000000003f1ec000
0000000000000000 000000003e66ec00 0000000000316a62 000000003c287988
Krnl Code: 0000000000316b04: f0a0000407f4 srp 4(11,%r0),2036,0
0000000000316b0a: b9020022 ltgr %r2,%r2
0000000000316b0e: a7740015 brc 7,316b38
>0000000000316b12: e3d0c0000024 stg %r13,0(%r12)
0000000000316b18: 4120c010 la %r2,16(%r12)
0000000000316b1c: 4130d060 la %r3,96(%r13)
0000000000316b20: e340d0600004 lg %r4,96(%r13)
0000000000316b26: c0e50002b567 brasl %r14,36d5f4
Call Trace:
([<0000000000316a62>] jbd2_journal_file_inode+0x5e/0x138)
[<00000000002da13c>] mpage_da_map_and_submit+0x2e8/0x42c
[<00000000002daac2>] ext4_da_writepages+0x2da/0x504
[<00000000002597e8>] writeback_single_inode+0xf8/0x268
[<0000000000259f06>] writeback_sb_inodes+0xd2/0x18c
[<000000000025a700>] writeback_inodes_wb+0x80/0x168
[<000000000025aa92>] wb_writeback+0x2aa/0x324
[<000000000025abde>] wb_do_writeback+0xd2/0x274
[<000000000025ae3a>] bdi_writeback_thread+0xba/0x1c4
[<00000000001737be>] kthread+0xa6/0xb0
[<000000000056c1da>] kernel_thread_starter+0x6/0xc
[<000000000056c1d4>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<0000000000316a8a>] jbd2_journal_file_inode+0x86/0x138
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
32bit and 64bit on x86 are tested and working. The rest I have looked
at closely and I can't find any problems.
setns is an easy system call to wire up. It just takes two ints so I
don't expect any weird architecture porting problems.
While doing this I have noticed that we have some architectures that are
very slow to get new system calls. cris seems to be the slowest where
the last system calls wired up were preadv and pwritev. avr32 is weird
in that recvmmsg was wired up but never declared in unistd.h. frv is
behind with perf_event_open being the last syscall wired up. On h8300
the last system call wired up was epoll_wait. On m32r the last system
call wired up was fallocate. mn10300 has recvmmsg as the last system
call wired up. The rest seem to at least have syncfs wired up which was
new in the 2.6.39.
v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com>
v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com>
v4: Moved wiring up of the system call to another patch
v5: ported to v2.6.39-rc6
v6: rebased onto parisc-next and net-next to avoid syscall conflicts.
v7: ported to Linus's latest post 2.6.39 tree.
> arch/blackfin/include/asm/unistd.h | 3 ++-
> arch/blackfin/mach-common/entry.S | 1 +
Acked-by: Mike Frysinger <vapier@gentoo.org>
Oh - ia64 wiring looks good.
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM: Fix PM QOS's user mode interface to work with ASCII input
PM / Hibernate: Update kerneldoc comments in hibernate.c
PM / Hibernate: Remove arch_prepare_suspend()
PM / Hibernate: Update some comments in core hibernate code
The previous style change enables to use asm-generic/bitops/le.h on s390.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The style that we normally use in asm-generic is to test the macro itself
for existence, so in asm-generic, do:
#ifndef find_next_zero_bit_le
extern unsigned long find_next_zero_bit_le(const void *addr,
unsigned long size, unsigned long offset);
#endif
and in the architectures, write
static inline unsigned long find_next_zero_bit_le(const void *addr,
unsigned long size, unsigned long offset)
#define find_next_zero_bit_le find_next_zero_bit_le
This adds the #define for each of the optimized find bitops in the
architectures.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement ndelay() on s390 as well.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Both functions take an int instead of an unsigned int. Fixes these
compile warnings:
kernel/sched.c:7167:2: warning: initialization from incompatible pointer type
kernel/sched.c:7170:2: warning: initialization from incompatible pointer type
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Turn __access_ok() into a define and add a __chk_user_ptr() call
instead.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge irq.c and s390_ext.c into irq.c. That way all external interrupt
related functions are together.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Interrupt sources like pfault, sclp, dasd_diag and virtio all use the
service signal external interrupt subclass mask in control register 0
to enable and disable the corresponding interrupt.
Because no reference counting is implemented each subsystem thinks it
is the only user of subclass and sets and clears the bit like it wants.
This leads to case that unloading the dasd diag module under z/VM
causes both sclp and pfault interrupts to be masked. The result will
be locked up system sooner or later.
Fix this by introducing a new way to set (register) and clear
(unregister) the service signal subclass mask bit in cr0.
Also convert all drivers.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Adapt the stand-alone s390 mmu_gather implementation to the new
preemptible mmu_gather interface.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All architectures supporting hibernation define
arch_prepare_suspend() as an empty function, so remove it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
sendmmsg is reachable via the socket system call. We don't enable a second
way on s390 to reach the same system call.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Rework the architecture page table functions to access the bits in the
page table extension array (pgste). There are a number of changes:
1) Fix missing pgste update if the attach_count for the mm is <= 1.
2) For every operation that affects the invalid bit in the pte or the
rcp byte in the pgste the pcl lock needs to be acquired. The function
pgste_get_lock gets the pcl lock and returns the current pgste value
for a pte pointer. The function pgste_set_unlock stores the pgste
and releases the lock. Between these two calls the bits in the pgste
can be shuffled.
3) Define two software bits in the pte _PAGE_SWR and _PAGE_SWC to avoid
calling SetPageDirty and SetPageReferenced from pgtable.h. If the
host reference backup bit or the host change backup bit has been
set the dirty/referenced state is transfered to the pte. The common
code will pick up the state from the pte.
4) Add ptep_modify_prot_start and ptep_modify_prot_commit for mprotect.
5) Remove pgd_populate_kernel, pud_populate_kernel, pmd_populate_kernel
pgd_clear_kernel, pud_clear_kernel, pmd_clear_kernel and ptep_invalidate.
6) Rename kvm_s390_test_and_clear_page_dirty to
ptep_test_and_clear_user_dirty and add ptep_test_and_clear_user_young.
7) Define mm_exclusive() and mm_has_pgste() helper to improve readability.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The page_clear_dirty primitive always sets the default storage key
which resets the access control bits and the fetch protection bit.
That will surprise a KVM guest that sets non-zero access control
bits or the fetch protection bit. Merge page_test_dirty and
page_clear_dirty back to a single function and only clear the
dirty bit from the storage key.
In addition move the function page_test_and_clear_dirty and
page_test_and_clear_young to page.h where they belong. This
requires to change the parameter from a struct page * to a page
frame number.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>