Git commit d97d929f06 ("s390: move cacheinfo sysfs to generic cacheinfo
infrastructure") removed the general-instructions-extension availability
check before the ecag instruction is executed.
Without this check this may lead to crashes on machines without this facility.
Therefore add the check again where needed.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Given that sizeof(long) is now always 8, we can simplify syscall_get_arch()
a bit.
Just another piece I didn't find when removing 31 bit support.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a BUILD_BUG_ON() to enforce that irqclass_sub_desc contains the required
number of defined interrupt descriptions and won't be filled up with zeros.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Rename two more files which I forgot. Also remove the "asm" from the
swsusp_asm64.S file, since the ".S" suffix already makes it obvious
that this file contains assembler code.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
ipl.c uses 3 different macros to create a sysfs show function for ipl
attributes. Define IPL_ATTR_SHOW_FN which is used by all macros.
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use macros wherever applicable and create a shutdown_action attribute
group to simplify the code.
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use macros wherever applicable and put bin_attributes inside attribute_groups
to simplify/remove some code.
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove __user address space annotation for sim_stor_event() calls since it
generates false positive warnings from sparse.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
As reported by sparse these can and should be static.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use the sturg instruction instead of the stura instruction. This allows to
modify up to eight bytes in a row instead of only four.
For function tracer enabling and disabling this reduces the time needed to
modify the text sections by 50%, since for each mcount call site six bytes
need to be changed.
Also remove the EXTABLE entries, since calls to this function are not
supposed to fail.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the s390 architecture implementation of probe_kernel_write() and
instead use a new function s390_kernel_write() to modify kernel text and
data everywhere.
The s390 implementation of probe_kernel_write() was potentially broken
since it modified memory in a read-modify-write fashion, which read four
bytes, modified the requested bytes within those four bytes and wrote
the result back.
If two cpus would modify the same four byte area at different locations
within that area, this could lead to corruption.
Right now the only places which called probe_kernel_write() did run within
stop_machine_run. Therefore the scenario can't happen right now, however
that might change at any time.
To fix this rename probe_kernel_write() to s390_kernel_write() which can
have special semantics, like only call it while running within stop_machine().
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch extends the diag288 watchdog driver to be able to deal with KVM
hypervisors. Only z/VM needs special handling, we can use the same interface
as on LPAR. Remove all pr_info output to avoid misconception. Because there
is no value in these messages and only the pr_err messages make sense.
Signed-off-by: Xu Wang <gesaint@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There's no reason why we wouldn't want to be able to send a keep alive
message to /dev/watchdog (feed dog) for the s390 diag288 watchdog, so
let's enable the WDIOF_KEEPALIVEPING option.
Signed-off-by: Xu Wang <gesaint@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the hard coded scheduler for the DASD device driver to enable
change of the scheduler during runtime. Set recommended deadline
scheduler via additional udev rule.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In case of a translation exception the page tables are corrupted. If the
exception handler then calls die() which again calls show_regs()
-> show_code() -> copy_from_user(), the kernel may access the same memory
location again and generates yet another translation exception. Which in turn
will lead to a deadlock on the die_lock spinlock, which the kernel tries to
grab recursively.
Given that the page tables are corrupted anyway, if we see such an exception,
let's simply panic.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Given that the kernel now always runs in 64 bit mode, it is
pointless to check if the z/Architecture mode is active.
Remove the checks.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Since sizeof(long) == 4 is always false now, simplify cmpxchg_double a bit.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the 31 bit syscalls from the syscall table. This is a separate patch
just in case I screwed something up so it can be easily reverted.
However the conversion was done with a script, so everything should be ok.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Rename a couple of files to get rid of the "64" suffix.
"git blame" will still work.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the 31 bit support in order to reduce maintenance cost and
effectively remove dead code. Since a couple of years there is no
distribution left that comes with a 31 bit kernel.
The 31 bit kernel also has been broken since more than a year before
anybody noticed. In addition I added a removal warning to the kernel
shown at ipl for 5 minutes: a960062e58 ("s390: add 31 bit warning
message") which let everybody know about the plan to remove 31 bit
code. We didn't get any response.
Given that the last 31 bit only machine was introduced in 1999 let's
remove the code.
Anybody with 31 bit user space code can still use the compat mode.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After a suspend/resume cycle we missed to enable smt again, which leads
to all sorts of bugs, since the kernel assumes smt is enabled, while the
hardware thinks it is not.
Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reported-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For compat tasks the mmap randomization does not use the maximum
randomization value from mmap_rnd_mask but the fixed value of 0x7ff.
This needs to be respected in the definition of STACK_RND_MASK as
well.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The SF_CYCLES_BASIC_DIAG is always registered even if it is turned of in the
current hardware configuration. Because diagnostic-sampling is typically not
turned on in the hardware configuration, do not register this perf event by
default. Enable it only if the diagnostic-sampling function is authorized.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge misc fixes from Andrew Morton:
"13 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
memcg: disable hierarchy support if bound to the legacy cgroup hierarchy
mm: reorder can_do_mlock to fix audit denial
kasan, module: move MODULE_ALIGN macro into <linux/moduleloader.h>
kasan, module, vmalloc: rework shadow allocation for modules
fanotify: fix event filtering with FAN_ONDIR set
mm/nommu.c: export symbol max_mapnr
arch/c6x/include/asm/pgtable.h: define dummy pgprot_writecombine for !MMU
nilfs2: fix deadlock of segment constructor during recovery
mm: cma: fix CMA aligned offset calculation
mm, hugetlb: close race when setting PageTail for gigantic pages
mm, oom: do not fail __GFP_NOFAIL allocation if oom killer is disabled
drivers/rtc/rtc-s3c.c: add .needs_src_clk to s3c6410 RTC data
ocfs2: make append_dio an incompat feature
If the memory cgroup controller is initially mounted in the scope of the
default cgroup hierarchy and then remounted to a legacy hierarchy, it will
still have hierarchy support enabled, which is incorrect. We should
disable hierarchy support if bound to the legacy cgroup hierarchy.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A userspace call to mmap(MAP_LOCKED) may result in the successful locking
of memory while also producing a confusing audit log denial. can_do_mlock
checks capable and rlimit. If either of these return positive
can_do_mlock returns true. The capable check leads to an LSM hook used by
apparmour and selinux which produce the audit denial. Reordering so
rlimit is checked first eliminates the denial on success, only recording a
denial when the lock is unsuccessful as a result of the denial.
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Paul Cassella <cassella@cray.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
include/linux/moduleloader.h is more suitable place for this macro.
Also change alignment to PAGE_SIZE for CONFIG_KASAN=n as such
alignment already assumed in several places.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current approach in handling shadow memory for modules is broken.
Shadow memory could be freed only after memory shadow corresponds it is no
longer used. vfree() called from interrupt context could use memory its
freeing to store 'struct llist_node' in it:
void vfree(const void *addr)
{
...
if (unlikely(in_interrupt())) {
struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
if (llist_add((struct llist_node *)addr, &p->list))
schedule_work(&p->wq);
Later this list node used in free_work() which actually frees memory.
Currently module_memfree() called in interrupt context will free shadow
before freeing module's memory which could provoke kernel crash.
So shadow memory should be freed after module's memory. However, such
deallocation order could race with kasan_module_alloc() in module_alloc().
Free shadow right before releasing vm area. At this point vfree()'d
memory is not used anymore and yet not available for other allocations.
New VM_KASAN flag used to indicate that vm area has dynamically allocated
shadow memory so kasan frees shadow only if it was previously allocated.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With FAN_ONDIR set, the user can end up getting events, which it hasn't
marked. This was revealed with fanotify04 testcase failure on
Linux-4.0-rc1, and is a regression from 3.19, revealed with 66ba93c0d7
("fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask").
# /opt/ltp/testcases/bin/fanotify04
[ ... ]
fanotify04 7 TPASS : event generated properly for type 100000
fanotify04 8 TFAIL : fanotify04.c:147: got unexpected event 30
fanotify04 9 TPASS : No event as expected
The testcase sets the adds the following marks : FAN_OPEN | FAN_ONDIR for
a fanotify on a dir. Then does an open(), followed by close() of the
directory and expects to see an event FAN_OPEN(0x20). However, the
fanotify returns (FAN_OPEN|FAN_CLOSE_NOWRITE(0x10)). This happens due to
the flaw in the check for event_mask in fanotify_should_send_event() which
does:
if (event_mask & marks_mask & ~marks_ignored_mask)
return true;
where, event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE),
marks_mask == (FAN_ONDIR | FAN_OPEN),
marks_ignored_mask == 0
Fix this by masking the outgoing events to the user, as we already take
care of FAN_ONDIR and FAN_EVENT_ON_CHILD.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Eric Paris <eparis@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several modules may need max_mapnr, so export, the related error with
allmodconfig under c6x:
MODPOST 3327 modules
ERROR: "max_mapnr" [fs/pstore/ramoops.ko] undefined!
ERROR: "max_mapnr" [drivers/media/v4l2-core/videobuf2-dma-contig.ko] undefined!
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When !MMU, asm-generic will not define default pgprot_writecombine, so c6x
needs to define it by itself. The related error:
CC [M] fs/pstore/ram_core.o
fs/pstore/ram_core.c: In function 'persistent_ram_vmap':
fs/pstore/ram_core.c:399:10: error: implicit declaration of function 'pgprot_writecombine' [-Werror=implicit-function-declaration]
prot = pgprot_writecombine(PAGE_KERNEL);
^
fs/pstore/ram_core.c:399:8: error: incompatible types when assigning to type 'pgprot_t {aka struct <anonymous>}' from type 'int'
prot = pgprot_writecombine(PAGE_KERNEL);
^
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
during recovery at mount time. The code path that caused the deadlock was
as follows:
nilfs_fill_super()
load_nilfs()
nilfs_salvage_orphan_logs()
* Do roll-forwarding, attach segment constructor for recovery,
and kick it.
nilfs_segctor_thread()
nilfs_segctor_thread_construct()
* A lock is held with nilfs_transaction_lock()
nilfs_segctor_do_construct()
nilfs_segctor_drop_written_files()
iput()
iput_final()
write_inode_now()
writeback_single_inode()
__writeback_single_inode()
do_writepages()
nilfs_writepage()
nilfs_construct_dsync_segment()
nilfs_transaction_lock() --> deadlock
This can happen if commit 7ef3ff2fea ("nilfs2: fix deadlock of segment
constructor over I_SYNC flag") is applied and roll-forward recovery was
performed at mount time. The roll-forward recovery can happen if datasync
write is done and the file system crashes immediately after that. For
instance, we can reproduce the issue with the following steps:
< nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
# dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
# dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k
count=1 && reboot -nfh
< the system will immediately reboot >
# mount -t nilfs2 /dev/sdb1 /nilfs
The deadlock occurs because iput() can run segment constructor through
writeback_single_inode() if MS_ACTIVE flag is not set on sb->s_flags. The
above commit changed segment constructor so that it calls iput()
asynchronously for inodes with i_nlink == 0, but that change was
imperfect.
This fixes the another deadlock by deferring iput() in segment constructor
even for the case that mount is not finished, that is, for the case that
MS_ACTIVE flag is not set.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reported-by: Yuxuan Shui <yshuiv7@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The CMA aligned offset calculation is incorrect for non-zero order_per_bit
values.
For example, if cma->order_per_bit=1, cma->base_pfn= 0x2f800000 and
align_order=12, the function returns a value of 0x17c00 instead of 0x400.
This patch fixes the CMA aligned offset calculation.
The previous calculation was wrong and would return too-large values for
the offset, so that when cma_alloc looks for free pages in the bitmap with
the requested alignment > order_per_bit, it starts too far into the bitmap
and so CMA allocations will fail despite there actually being plenty of
free pages remaining. It will also probably have the wrong alignment.
With this change, we will get the correct offset into the bitmap.
One affected user is powerpc KVM, which has kvm_cma->order_per_bit set to
KVM_CMA_CHUNK_ORDER - PAGE_SHIFT, or 18 - 12 = 6.
[gregory.0xf0@gmail.com: changelog additions]
Signed-off-by: Danesh Petigara <dpetigara@broadcom.com>
Reviewed-by: Gregory Fong <gregory.0xf0@gmail.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that gigantic pages are dynamically allocatable, care must be taken to
ensure that p->first_page is valid before setting PageTail.
If this isn't done, then it is possible to race and have compound_head()
return NULL.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo Handa has pointed out that __GFP_NOFAIL allocations might fail
after OOM killer is disabled if the allocation is performed by a kernel
thread. This behavior was introduced from the very beginning by
7f33d49a2e ("mm, PM/Freezer: Disable OOM killer when tasks are frozen").
This means that the basic contract for the allocation request is broken
and the context requesting such an allocation might blow up unexpectedly.
There are basically two ways forward.
1) move oom_killer_disable after kernel threads are frozen. This has a
risk that the OOM victim wouldn't be able to finish because it would
depend on an already frozen kernel thread. This would be really tricky
to debug.
2) do not fail GFP_NOFAIL allocation no matter what and risk a
potential Freezable kernel threads will loop and fail the suspend.
Incidental allocations after kernel threads are frozen will at least
dump a warning - if we are lucky and the serial console is still active
of course...
This patch implements the later option because it is safer. We would see
warning rather than allocation failures for the kernel threads which would
blow up otherwise and have a higher chances to identify __GFP_NOFAIL users
from deeper pm code.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@gooogle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit df9e26d093 ("rtc: s3c: add support for RTC of Exynos3250 SoC")
added an "rtc_src" DT property to specify the clock used as a source to
the S3C real-time clock.
Not all SoCs needs this so commit eaf3a65908 ("drivers/rtc/rtc-s3c.c:
fix initialization failure without rtc source clock") changed to check
the struct s3c_rtc_data .needs_src_clk to conditionally grab the clock.
But that commit didn't update the data for each IP version so the RTC
broke on the boards that needs a source clock. This is the case of at
least Exynos5250 and Exynos5440 which uses the s3c6410 RTC IP block.
This commit fixes the S3C rtc on the Exynos5250 Snow and Exynos5420
Peach Pit and Pi Chromebooks.
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Chanwoo Choi <cw00.choi@samsung.com>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Tyler Baker <tyler.baker@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It turns out that making this feature ro_compat isn't quite enough to
prevent accidental corruption on mount from older kernels. Ocfs2 (like
other file systems) will process orphaned inodes even when the user mounts
in 'ro' mode. So for the case of a filesystem not knowing the append_dio
feature, mounting the filesystem could result in orphaned-for-dio files
being deleted, which we clearly don't want.
So instead, turn this into an incompat flag.
Btw, this is kind of my fault - initially I asked that we add a flag to
cover the feature and even suggested that we use an ro flag. It wasn't
until I was looking through our commits for v4.0-rc1 that I realized we
actually want this to be incompat.
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The wrong value is being returned by change_huge_pmd since commit
10c1045f28 ("mm: numa: avoid unnecessary TLB flushes when setting
NUMA hinting entries") which allows a fallthrough that tries to adjust
non-existent PTEs. This patch corrects it.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull i2c fix from Wolfram Sang:
"An important bugfix for the I2C subsystem core"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Revert "i2c: core: Dispose OF IRQ mapping at client removal time"
APM X-Gene host bridge driver
- Add register offset to config space base address (Feng Kan)
Miscellaneous
- Don't read past the end of sysfs "driver_override" buffer (Sasha Levin)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVAbsOAAoJEFmIoMA60/r858gP/0eW9rgawzcdtsidmvmPligj
NLJ/L8H+z4n9az0o3EDef4Tcv4lO0J6bLgr+YblTLJaYWQfbKZo3cXCXi3EnM0MF
+vkWh8TQvHeTW7L3e/KwwWtkg14zpJ6KTgpLSGzW87BNcSOzC76dfGNyZJ5CIuSf
nJgQtQ2gFQNRM0BgR5S+BGGeXPOtOE8ytJyOV6Z3MOtzTYprMaixzDs9XgDLASEu
6vzb7S62f//FWbTLF+gvBuAMb6VFv/ORZOHxlsZPjhXSJ1bfHKO6caYIgJsYuau1
E9OYuIdsAr0sXm6ejNmlgSxSGB1yUvEi7onOwGe3N11AwRzzd/BfyFbS46sqzpBN
IwflhW4SNX8dfZYB3lowd2aDirwGlBLSxOsepTBlDgBlQ7ANoemoAmOY0pOvIkCu
jUObW8PaD3sCfwMCNNwu+eISYBAP7GC2KfgWK2jvCqjfEH5+myP+ibDed8Z01Yie
838QRgPys+Z4nVmeDi0HnXkwYpDmcwez6YKpYukl62GJUb5zSbZDDjoYE7kVk90h
8aBeaQO0SkR3DB++hirQPhWz5YAIJ4Looxr/86SbZ6y2zQhDimDQ15eKmV11PyfO
CkmiQCJ0rf3n0AhVgHPt7OCaZ8hmmDShQs32Xtf26+MVf59lBYTsM3zAs93kOyqN
kKMKQknE6rJ09FOFmVnC
=4mzw
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.0-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Here are a couple updates for v4.0.
One fixes a config accessor problem on APM X-Gene that we introduced
when switching to generic config accessors, and the other fixes an
older read-past-end-of-buffer problem in sysfs.
APM X-Gene host bridge driver
- Add register offset to config space base address (Feng Kan)
Miscellaneous
- Don't read past the end of sysfs "driver_override" buffer (Sasha Levin)"
* tag 'pci-v4.0-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: xgene: Add register offset to config space base address
PCI: Don't read past the end of sysfs "driver_override" buffer
- Fix syscall error recovery
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iEYEABECAAYFAlUBReoACgkQykllyylKDCE2JQCdGkwwVSH7hoPHSUUAIcstxR1U
JJYAoIElkFV8azSi1y4Cf6spNL76mYNs
=JekW
-----END PGP SIGNATURE-----
Merge tag 'microblaze-4.0-rc4' of git://git.monstr.eu/linux-2.6-microblaze
Pull arch/microblaze fixes from Michal Simek:
"Fix syscall error recovery.
Two patches - one is just preparation patch for the second which is
fixing the problem with syscalls"
* tag 'microblaze-4.0-rc4' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Fix syscall error recovery for invalid syscall IDs
microblaze: Coding style cleanup
Dave Chinner reported that commit 4d94246699 ("mm: convert
p[te|md]_mknonnuma and remaining page table manipulations") slowed down
his xfsrepair test enormously. In particular, it was using more system
time due to extra TLB flushing.
The ultimate reason turns out to be how the change to use the regular
page table accessor functions broke the NUMA grouping logic. The old
special mknuma/mknonnuma code accessed the page table present bit and
the magic NUMA bit directly, while the new code just changes the page
protections using PROT_NONE and the regular vma protections.
That sounds equivalent, and from a fault standpoint it really is, but a
subtle side effect is that the *other* protection bits of the page table
entries also change. And the code to decide how to group the NUMA
entries together used the writable bit to decide whether a particular
page was likely to be shared read-only or not.
And with the change to make the NUMA handling use the regular permission
setting functions, that writable bit was basically always cleared for
private mappings due to COW. So even if the page actually ends up being
written to in the end, the NUMA balancing would act as if it was always
shared RO.
This code is a heuristic anyway, so the fix - at least for now - is to
instead check whether the page is dirty rather than writable. The bit
doesn't change with protection changes.
NOTE! This also adds a FIXME comment to revisit this issue,
Not only should we probably re-visit the whole "is this a shared
read-only page" heuristic (we might want to take the vma permissions
into account and base this more on those than the per-page ones, and
also look at whether the particular access that triggers it is a write
or not), but the whole COW issue shows that we should think about the
NUMA fault handling some more.
For example, maybe we should do the early-COW thing that a regular fault
does. Or maybe we should accept that while using the same bits as
PROTNONE was a good thing (and got rid of the specual NUMA bit), we
might still want to just preseve the other protection bits across NUMA
faulting.
Those are bigger questions, left for later. This just fixes up the
heuristic so that it at least approximates working again. More analysis
and work needed.
Reported-by: Dave Chinner <david@fromorbit.com>
Tested-by: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>,
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit e4df3a0b62
("i2c: core: Dispose OF IRQ mapping at client removal time")
Calling irq_dispose_mapping() will destroy the mapping and disassociate
the IRQ from the IRQ chip to which it belongs. Keeping it is OK, because
existent mappings are reused properly.
Also, this commit breaks drivers using devm* for IRQ management on
OF-based systems because devm* cleanup happens in device code, after
bus's remove() method returns.
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Reported-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[wsa: updated the commit message with findings fromt the other bug report]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Fixes: e4df3a0b62
Remove struct pt_regs from user header and use generic ucontext.h.
Signed-off-by: Chung-Ling Tang <cltang@codesourcery.com>
Acked-by: Ley Foon Tan <lftan@altera.com>
* pxa3xx_nand
- fix timeout issues when draining the FIFO (BCH only)
- don't crash when no chip-selects are used
* hisi504_nand
- depend on HAS_DMA, to fix compile errors
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU/4s9AAoJEFySrpd9RFgtTp8P/AoOj+FoRZAjIcr1oRRvIj68
gUJTihk/2caf4PzLek1mk2BMHbs3axvnyNEGyXeTkPBnM03ZSMkq0MCdCF+qpGpV
qWnkUG3Q1SlMTkhOr1Uq28DZpkx5PVi5j859wKxw9wfkprUXb5JksX2qno+Hmfet
+CYp+65tU/6xNeaLGtDawfxvsVXEn40hKfP7JGYfyrrbQ3jEcCuZRm19fDb+JtTg
W8oBR08CwSoOKmpcNyltHsjlmDiF35/u0pdfIZxym1BFWyLXAsmV723MtI9J2ILG
hEHZZwG3rxyrV/JBFl6sPlcyU6zQGmXl3rCrEoK/rxqmY00gY2x9fsqGCJOUls+u
A6aCoIjlJ/EOuRBm+wpyEopuljZ6ybVAkdMRmiY2cIt1oOcvMtkRQzw+I+9i/fLa
qQuhqPlEFwB63TCsi/xVwNL8VXpeBfhsSTAeE/y2ZJFuKmOym040eoENKWoEThY/
8Nhma3sLazeSCW4vX1r5glPP0sseARo0/z9HhzjRe249VBjR7Zz60/GoF4cCXW63
RJNXHfvxPbWvOPizT6x6HxlGcZX6dq18wgvBV46uxX2a0X3r1G3qtluyrNZHO3ab
T2lG8Yx44uN+Bwbaxh74UB/RgvDBxeNoVf5onc+u1MhN1QU/09NfroLIY5eLEDcf
d0+yZdOFKvYjY7pU1SFa
=GF4t
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20150310' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
* pxa3xx_nand
- fix timeout issues when draining the FIFO (BCH only)
- don't crash when no chip-selects are used
* hisi504_nand
- depend on HAS_DMA, to fix compile errors
* tag 'for-linus-20150310' of git://git.infradead.org/linux-mtd:
mtd: nand: MTD_NAND_HISI504 should depend on HAS_DMA
mtd: pxa3xx_nand: fix driver when num_cs is 0
mtd: nand: pxa3xx: Fix PIO FIFO draining
The patches contain:
* Fix multiple ARM IOMMU drivers to behave well when the
hardware is not present
* Mark MSM driver as broken
* Fix build errors with the new ARM generic io-page-table code
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJU/zbaAAoJECvwRC2XARrjZFsP/2rZAiDJ1NGBOoJO+v+QlEY/
//wxpyKWnJ977vkgIAVvBZqWLFbwO8UF3jnRlfF4nrpbkbRynB62kczov9G/3Mzo
kur1gT472xYSTd87yGfBKl9hfIn5GQpe8BFX3UU9SkkSR11z7iDBb6X7LevmeyqB
B24WgzX0haKCSYMj5/jd1iskzI1KomhoLLUrh5+vmDHj12P9B2TmCuK/6hZOBtQI
GSCvrje3FWHdymCfGym2QZmwQIdm6ZlvzSBuTLp8F3PUKbf7hVvpszCbMRJ/ZYCl
gDvn0mDXyi14heEeXzHVF6574Sd1UBfxRWpgmDQpDRUAREzt3NW+FUbwMQg1XnzD
nN8ft1QCjhzK2elb0rsp6ZN4tkLEGZMU/EgUfnrLvt9/yhhOzwN7u2tDIXpM5dwW
KsgX/cKQssH8tJBfRy8YyXmb6ez9nV3yo0byHUCrC5YlCYrSGEpSe9mMstO5/Zo7
UFL0OPdbscdU1H2kYTA8YA1RywmUM/icn8mTrh/B2qkhxwusEJeqj4UOmtM3oiwF
RuUp3hfeYYuDaDIiAsJ0yaaSMdmQ2LummJClYJOVVOSAKacLWwG4HkyBLSrMIiol
H30cUEtGGnSbrptnYAI5M3gVFuNj1kTVjSw2ccc3EaEiAC7/iOPk4ho52ItEAkpT
ZGCBV1jtJ51RNeOh8r66
=dL5c
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"The patches contain:
- fix multiple ARM IOMMU drivers to behave well when the hardware is
not present
- mark MSM driver as broken
- fix build errors with the new ARM generic io-page-table code"
* tag 'iommu-fixes-v4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/io-pgtable-arm: Add built time dependency
iommu/msm: Mark driver BROKEN
iommu/rockchip: Play nice in multi-platform builds
iommu/omap: Play nice in multi-platform builds
iommu/exynos: Play nice in multi-platform builds
iommu/io-pgtable-arm: Fix self-test WARNs on i386
Pull kvm/s390 bugfixes from Marcelo Tosatti.
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: non-LPAR case obsolete during facilities mask init
KVM: s390: include guest facilities in kvm facility test
KVM: s390: fix in memory copy of facility lists
KVM: s390/cpacf: Fix kernel bug under z/VM
KVM: s390/cpacf: Enable key wrapping by default