Commit Graph

109 Commits

Author SHA1 Message Date
David Hildenbrand 0266def913 xen/balloon: Fix mapping PG_offline pages to user space
The XEN balloon driver - in contrast to other balloon drivers - allows
to map some inflated pages to user space. Such pages are allocated via
alloc_xenballooned_pages() and freed via free_xenballooned_pages().
The pfn space of these allocated pages is used to map other things
by the hypervisor using hypercalls.

Pages marked with PG_offline must never be mapped to user space (as
this page type uses the mapcount field of struct pages).

So what we can do is, clear/set PG_offline when allocating/freeing an
inflated pages. This way, most inflated pages can be excluded by
dumping tools and the "reused for other purpose" balloon pages are
correctly not marked as PG_offline.

Fixes: 77c4adf6a6 (xen/balloon: mark inflated pages PG_offline)
Reported-by: Julien Grall <julien.grall@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-03-15 15:35:35 +01:00
David Hildenbrand 77c4adf6a6 xen/balloon: mark inflated pages PG_offline
Mark inflated and never onlined pages PG_offline, to tell the world that
the content is stale and should not be dumped.

Link: http://lkml.kernel.org/r/20181119101616.8901-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Julien Freche <jfreche@vmware.com>
Cc: Kairui Song <kasong@redhat.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Lianbo Jiang <lijiang@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Pankaj gupta <pagupta@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:14 -08:00
Arun KS a9cd410a3d mm/page_alloc.c: memory hotplug: free pages as higher order
When freeing pages are done with higher order, time spent on coalescing
pages by buddy allocator can be reduced.  With section size of 256MB,
hot add latency of a single section shows improvement from 50-60 ms to
less than 1 ms, hence improving the hot add latency by 60 times.  Modify
external providers of online callback to align with the change.

[arunks@codeaurora.org: v11]
  Link: http://lkml.kernel.org/r/1547792588-18032-1-git-send-email-arunks@codeaurora.org
[akpm@linux-foundation.org: remove unused local, per Arun]
[akpm@linux-foundation.org: avoid return of void-returning __free_pages_core(), per Oscar]
[akpm@linux-foundation.org: fix it for mm-convert-totalram_pages-and-totalhigh_pages-variables-to-atomic.patch]
[arunks@codeaurora.org: v8]
  Link: http://lkml.kernel.org/r/1547032395-24582-1-git-send-email-arunks@codeaurora.org
[arunks@codeaurora.org: v9]
  Link: http://lkml.kernel.org/r/1547098543-26452-1-git-send-email-arunks@codeaurora.org
Link: http://lkml.kernel.org/r/1538727006-5727-1-git-send-email-arunks@codeaurora.org
Signed-off-by: Arun KS <arunks@codeaurora.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:14 -08:00
David Hildenbrand f29d8e9c01 mm/memory_hotplug: drop "online" parameter from add_memory_resource()
Userspace should always be in charge of how to online memory and if memory
should be onlined automatically in the kernel.  Let's drop the parameter
to overwrite this - XEN passes memhp_auto_online, just like add_memory(),
so we can directly use that instead internally.

Link: http://lkml.kernel.org/r/20181123123740.27652-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:48 -08:00
Linus Torvalds 292974c5ac xen: fixes for 4.20-rc5
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCXAOQWAAKCRCAXGG7T9hj
 vqqLAQCC+eJNcIEGXm2B5SpadB4v/TKPk+GYRK/zUP0FVPHo3AEAoSTZlmCUUSno
 7me8n0sTrd4+ZKMGHvFgAYbpoO90XQM=
 =BxP8
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.20a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - A revert of a previous commit as it is no longer necessary and has
   shown to cause problems in some memory hotplug cases.

 - Some small fixes and a minor cleanup.

 - A patch for adding better diagnostic data in a very rare failure
   case.

* tag 'for-linus-4.20a-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  pvcalls-front: fixes incorrect error handling
  Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"
  xen: xlate_mmu: add missing header to fix 'W=1' warning
  xen/x86: add diagnostic printout to xen_mc_flush() in case of error
  x86/xen: cleanup includes in arch/x86/xen/spinlock.c
2018-12-02 12:15:55 -08:00
Igor Druzhinin 123664101a Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"
This reverts commit b3cf8528bb.

That commit unintentionally broke Xen balloon memory hotplug with
"hotplug_unpopulated" set to 1. As long as "System RAM" resource
got assigned under a new "Unusable memory" resource in IO/Mem tree
any attempt to online this memory would fail due to general kernel
restrictions on having "System RAM" resources as 1st level only.

The original issue that commit has tried to workaround fa564ad963
("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f,
60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move
and shrink AMD 64-bit window to avoid conflict") which made the
original fix to Xen ballooning unnecessary.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2018-11-29 17:53:31 +01:00
David Hildenbrand 8df1d0e4a2 mm/memory_hotplug: make add_memory() take the device_hotplug_lock
add_memory() currently does not take the device_hotplug_lock, however
is aleady called under the lock from
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
to synchronize against CPU hot-remove and similar.

In general, we should hold the device_hotplug_lock when adding memory to
synchronize against online/offline request (e.g.  from user space) - which
already resulted in lock inversions due to device_lock() and
mem_hotplug_lock - see 30467e0b3b ("mm, hotplug: fix concurrent memory
hot-add deadlock").  add_memory()/add_memory_resource() will create memory
block devices, so this really feels like the right thing to do.

Holding the device_hotplug_lock makes sure that a memory block device
can really only be accessed (e.g. via .online/.state) from user space,
once the memory has been fully added to the system.

The lock is not held yet in
	drivers/xen/balloon.c
	arch/powerpc/platforms/powernv/memtrace.c
	drivers/s390/char/sclp_cmd.c
	drivers/hv/hv_balloon.c
So, let's either use the locked variants or take the lock.

Don't export add_memory_resource(), as it once was exported to be used by
XEN, which is never built as a module.  If somebody requires it, we also
have to export a locked variant (as device_hotplug_lock is never
exported).

Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:17 -07:00
Mike Rapoport 57c8a661d9 mm: remove include/linux/bootmem.h
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
  Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:16 -07:00
Oleksandr Andrushchenko ae4c51a50c xen/balloon: Share common memory reservation routines
Memory {increase|decrease}_reservation and VA mappings update/reset
code used in balloon driver can be made common, so other drivers can
also re-use the same functionality without open-coding.
Create a dedicated file for the shared code and export corresponding
symbols for other kernel modules.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-07-26 23:05:13 -04:00
Boris Ostrovsky b3cf8528bb xen/balloon: Mark unallocated host memory as UNUSABLE
Commit f5775e0b61 ("x86/xen: discard RAM regions above the maximum
reservation") left host memory not assigned to dom0 as available for
memory hotplug.

Unfortunately this also meant that those regions could be used by
others. Specifically, commit fa564ad963 ("x86/PCI: Enable a 64bit BAR
on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") may try to map those
addresses as MMIO.

To prevent this mark unallocated host memory as E820_TYPE_UNUSABLE (thus
effectively reverting f5775e0b61) and keep track of that region as
a hostmem resource that can be used for the hotplug.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
2017-12-20 13:16:20 -05:00
Boris Ostrovsky b194da25ca xen: Don't try to call xen_alloc_p2m_entry() on autotranslating guests
Commit aba831a69632 ("xen: remove tests for pvh mode in pure pv paths")
removed XENFEAT_auto_translated_physmap test in xen_alloc_p2m_entry()
since it is assumed that the routine is never called by non-PV guests.

However, alloc_xenballooned_pages() may make this call on a PVH guest.
Prevent this from happening by adding XENFEAT_auto_translated_physmap
check there.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Fixes: aba831a69632 ("xen: remove tests for pvh mode in pure pv paths")
2017-08-31 09:45:55 -04:00
Juergen Gross 96edd61dcf xen/balloon: don't online new memory initially
When setting up the Xenstore watch for the memory target size the new
watch will fire at once. Don't try to reach the configured target size
by onlining new memory in this case, as the current memory size will
be smaller in almost all cases due to e.g. BIOS reserved pages.

Onlining new memory will lead to more problems e.g. undesired conflicts
with NVMe devices meant to be operated as block devices.

Instead remember the difference between target size and current size
when the watch fires for the first time and apply it to any further
size changes, too.

In order to avoid races between balloon.c and xen-balloon.c init calls
do the xen-balloon.c initialization from balloon.c.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-07-23 08:13:18 +02:00
Vitaly Kuznetsov 4fee9ad84e xen/balloon: decorate PV-only parts with #ifdef CONFIG_XEN_PV
Balloon driver uses several PV-only concepts (xen_start_info,
xen_extra_mem,..) and it seems the simpliest solution to make HVM-only
build happy is to decorate these parts with #ifdefs.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:09:56 +02:00
Ingo Molnar 5b825c3af1 sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h>
Add #include <linux/cred.h> dependencies to all .c files rely on sched.h
doing that for them.

Note that even if the count where we need to add extra headers seems high,
it's still a net win, because <linux/sched.h> is included in over
2,200 files ...

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:31 +01:00
Ross Lagerwall 709613ad2b xen/balloon: Only mark a page as managed when it is released
Only mark a page as managed when it is released back to the allocator.
This ensures that the managed page count does not get falsely increased
when a VM is running. Correspondingly change it so that pages are
marked as unmanaged after getting them from the allocator.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-12 15:22:22 +01:00
Ross Lagerwall 842775f150 xen/balloon: Fix declared-but-not-defined warning
Fix a declared-but-not-defined warning when building with
XEN_BALLOON_MEMORY_HOTPLUG=n. This fixes a regression introduced by
commit dfd74a1edf ("xen/balloon: Fix crash when ballooning on x86 32
bit PAE").

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-06-23 11:36:15 +01:00
Ross Lagerwall dfd74a1edf xen/balloon: Fix crash when ballooning on x86 32 bit PAE
Commit 55b3da98a4 (xen/balloon: find
non-conflicting regions to place hotplugged memory) caused a
regression in 4.4.

When ballooning on an x86 32 bit PAE system with close to 64 GiB of
memory, the address returned by allocate_resource may be above 64 GiB.
When using CONFIG_SPARSEMEM, this setup is limited to using physical
addresses < 64 GiB.  When adding memory at this address, it runs off
the end of the mem_section array and causes a crash.  Instead, fail
the ballooning request.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: <stable@vger.kernel.org> # 4.4+
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-04-06 11:29:18 +01:00
Linus Torvalds 55fc733c7e xen: features and fixes for 4.6-rc0
- Make earlyprintk=xen work for HVM guests.
 - Remove module support for things never built as modules.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJW8YhlAAoJEFxbo/MsZsTRiCgH/3GnL93s/BsRTrA/DYykYqA4
 rPPkDd6WMlEbrko6FY7o4IxUAZ50LWU8wjmDNVTT+R/VKlWCVh2+ksxjbH8e0nzr
 0QohAYTdG1VGmhkrwTjKj0wC3Nr8vaDqoBXAFRz4DiQdbobn12wzXjaVrbl8RRNc
 oHDqR+DfZj6ontqx8oFkkTk7CFKIGEOtm/uBhCAV2fNkQSUCzuQyKLlarxM6jq/2
 EnLR1bhnyoy5zNFzXxmaJgZOit8QFMKvPdDRknlp9yDHYJ5zolJkv+6FPiAG6SZs
 A8yAUPQoAD+ekAiV2DIhb8qoNNNdZXdRCFBpoudwX3GyyU4O2P5Intl0xEg17Nk=
 =L8S4
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from David Vrabel:
 "Features and fixes for 4.6:

  - Make earlyprintk=xen work for HVM guests

  - Remove module support for things never built as modules"

* tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  drivers/xen: make platform-pci.c explicitly non-modular
  drivers/xen: make sys-hypervisor.c explicitly non-modular
  drivers/xen: make xenbus_dev_[front/back]end explicitly non-modular
  drivers/xen: make [xen-]ballon explicitly non-modular
  xen: audit usages of module.h ; remove unnecessary instances
  xen/x86: Drop mode-selecting ifdefs in startup_xen()
  xen/x86: Zero out .bss for PV guests
  hvc_xen: make early_printk work with HVM guests
  hvc_xen: fix xenboot for DomUs
  hvc_xen: add earlycon support
2016-03-22 12:55:17 -07:00
Paul Gortmaker 106eaa8e6e drivers/xen: make [xen-]ballon explicitly non-modular
The Makefile / Kconfig currently controlling compilation here is:

obj-y   += grant-table.o features.o balloon.o manage.o preempt.o time.o
[...]
obj-$(CONFIG_XEN_BALLOON)               += xen-balloon.o

...with:

drivers/xen/Kconfig:config XEN_BALLOON
drivers/xen/Kconfig:    bool "Xen memory balloon driver"

...meaning that they currently are not being built as modules by anyone.

Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

In doing so we uncover two implict includes that were obtained
by module.h having such a wide include scope itself:

In file included from drivers/xen/xen-balloon.c:41:0:
include/xen/balloon.h:26:51: warning: ‘struct page’ declared inside parameter list [enabled by default]
 int alloc_xenballooned_pages(int nr_pages, struct page **pages);
                                                   ^
include/xen/balloon.h: In function ‘register_xen_selfballooning’:
include/xen/balloon.h:35:10: error: ‘ENOSYS’ undeclared (first use in this function)
  return -ENOSYS;
          ^

This is fixed by adding mm-types.h and errno.h to the list.

We also delete the MODULE_LICENSE tags since all that information
is already contained at the top of the file in the comments.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-03-21 15:13:44 +00:00
Vitaly Kuznetsov 703fc13a3f xen_balloon: support memory auto onlining policy
Add support for the newly added kernel memory auto onlining policy to
Xen ballon driver.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Suggested-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vitaly Kuznetsov 31bc3858ea memory-hotplug: add automatic onlining policy for the newly added memory
Currently, all newly added memory blocks remain in 'offline' state
unless someone onlines them, some linux distributions carry special udev
rules like:

  SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

to make this happen automatically.  This is not a great solution for
virtual machines where memory hotplug is being used to address high
memory pressure situations as such onlining is slow and a userspace
process doing this (udev) has a chance of being killed by the OOM killer
as it will probably require to allocate some memory.

Introduce default policy for the newly added memory blocks in
/sys/devices/system/memory/auto_online_blocks file with two possible
values: "offline" which preserves the current behavior and "online"
which causes all newly added memory blocks to go online as soon as
they're added.  The default is "offline".

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Toshi Kani 782b86641e xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM
Set IORESOURCE_SYSTEM_RAM in struct resource.flags of "System
RAM" entries.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: David Vrabel <david.vrabel@citrix.com> # xen
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-mm <linux-mm@kvack.org>
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1453841853-11383-9-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-30 09:49:58 +01:00
Julien Grall 3990dd2703 xen/balloon: Use the correct sizeof when declaring frame_list
The type of the item in frame_list is xen_pfn_t which is not an unsigned
long on ARM but an uint64_t.

With the current computation, the size of frame_list will be 2 *
PAGE_SIZE rather than PAGE_SIZE.

I bet it's just mistake when the type has been switched from "unsigned
long" to "xen_pfn_t" in commit 965c0aaafe
"xen: balloon: use correct type for frame_list".

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:44 +01:00
Julien Grall 30756c6299 xen/balloon: Don't rely on the page granularity is the same for Xen and Linux
For ARM64 guests, Linux is able to support either 64K or 4K page
granularity. Although, the hypercall interface is always based on 4K
page granularity.

With 64K page granularity, a single page will be spread over multiple
Xen frame.

To avoid splitting the page into 4K frame, take advantage of the
extent_order field to directly allocate/free chunk of the Linux page
size.

Note that PVMMU is only used for PV guest (which is x86) and the page
granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
that because the code has not been modified.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-10-23 14:20:38 +01:00
David Vrabel 4a69c909de xen/balloon: pre-allocate p2m entries for ballooned pages
Pages returned by alloc_xenballooned_pages() will be used for grant
mapping which will call set_phys_to_machine() (in PV guests).

Ballooned pages are set as INVALID_P2M_ENTRY in the p2m and thus may
be using the (shared) missing tables and a subsequent
set_phys_to_machine() will need to allocate new tables.

Since the grant mapping may be done from a context that cannot sleep,
the p2m entries must already be allocated.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2015-10-23 14:20:31 +01:00
David Vrabel 1cf6a6c829 xen/balloon: use hotplugged pages for foreign mappings etc.
alloc_xenballooned_pages() is used to get ballooned pages to back
foreign mappings etc.  Instead of having to balloon out real pages,
use (if supported) hotplugged memory.

This makes more memory available to the guest and reduces
fragmentation in the p2m.

This is only enabled if the xen.balloon.hotplug_unpopulated sysctl is
set to 1.  This sysctl defaults to 0 in case the udev rules to
automatically online hotplugged memory do not exist.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Add xen.balloon.hotplug_unpopulated sysctl to enable use of hotplug
  for unpopulated pages.
2015-10-23 14:20:05 +01:00
David Vrabel 81b286e0f1 xen/balloon: make alloc_xenballoon_pages() always allocate low pages
All users of alloc_xenballoon_pages() wanted low memory pages, so
remove the option for high memory.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2015-10-23 14:20:05 +01:00
David Vrabel b2ac6aa8f7 xen/balloon: only hotplug additional memory if required
Now that we track the total number of pages (included hotplugged
regions), it is easy to determine if more memory needs to be
hotplugged.

Add a new BP_WAIT state to signal that the balloon process needs to
wait until kicked by the memory add notifier (when the new section is
onlined by userspace).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Return BP_WAIT if enough sections are already hotplugged.

v2:
- New BP_WAIT status after adding new memory sections.
2015-10-23 14:20:04 +01:00
David Vrabel de5a77d842 xen/balloon: rationalize memory hotplug stats
The stats used for memory hotplug make no sense and are fiddled with
in odd ways.  Remove them and introduce total_pages to track the total
number of pages (both populated and unpopulated) including those within
hotplugged regions (note that this includes not yet onlined pages).

This will be used in a subsequent commit (xen/balloon: only hotplug
additional memory if required) when deciding whether additional memory
needs to be hotplugged.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2015-10-23 14:20:04 +01:00
David Vrabel 55b3da98a4 xen/balloon: find non-conflicting regions to place hotplugged memory
Instead of placing hotplugged memory at the end of RAM (which may
conflict with PCI devices or reserved regions) use allocate_resource()
to get a new, suitably aligned resource that does not conflict.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
---
v3:
- Remove stale comment.
2015-10-23 14:20:03 +01:00
Julien Grall 0df4f266b3 xen: Use correctly the Xen memory terminologies
Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
is meant, I suspect this is because the first support for Xen was for
PV. This resulted in some misimplementation of helpers on ARM and
confused developers about the expected behavior.

For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
Although, if we look at the implementation on x86, it's returning a GFN.

For clarity and avoid new confusion, replace any reference to mfn with
gfn in any helpers used by PV drivers. The x86 code will still keep some
reference of pfn_to_mfn which may be used by all kind of guests
No changes as been made in the hypercall field, even
though they may be invalid, in order to keep the same as the defintion
in xen repo.

Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a
name to close to the KVM function gfn_to_page.

Take also the opportunity to simplify simple construction such
as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up
will come in follow-up patches.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08 18:03:49 +01:00
Juergen Gross 626d750866 xen: switch extra memory accounting to use pfns
Instead of using physical addresses for accounting of extra memory
areas available for ballooning switch to pfns as this is much less
error prone regarding partial pages.

Reported-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-08 16:28:06 +01:00
Juergen Gross 929423fa83 xen: release lock occasionally during ballooning
When dom0 is being ballooned balloon_process() will hold the balloon
mutex until it is finished. This will block e.g. creation of new
domains as the device backends for the new domain need some
autoballooned pages for the ring buffers.

Avoid this by releasing the balloon mutex from time to time during
ballooning. Adjust the comment above balloon_process() regarding
multiple instances of balloon_process().

Instead of open coding it, just use cond_resched().

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-07-20 13:37:05 +01:00
Juergen Gross 3c56b3a12c xen/balloon: before adding hotplugged memory, set frames to invalid
Commit 25b884a83d ("x86/xen: set
regions above the end of RAM as 1:1") introduced a regression.

To be able to add memory pages which were added via memory hotplug to
a pv domain, the pages must be "invalid" instead of "identity" in the
p2m list before they can be added.

Suggested-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: <stable@vger.kernel.org> # 3.16+
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-03-23 15:15:53 +00:00
David Vrabel 0bb599fd30 xen: remove scratch frames for ballooned pages and m2p override
The scratch frame mappings for ballooned pages and the m2p override
are broken.  Remove them in preparation for replacing them with
simpler mechanisms that works.

The scratch pages did not ensure that the page was not in use.  In
particular, the foreign page could still be in use by hardware.  If
the guest reused the frame the hardware could read or write that
frame.

The m2p override did not handle the same frame being granted by two
different grant references.  Trying an M2P override lookup in this
case is impossible.

With the m2p override removed, the grant map/unmap for the kernel
mappings (for x86 PV) can be easily batched in
set_foreign_p2m_mapping() and clear_foreign_p2m_mapping().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-01-28 14:03:10 +00:00
Boris Ostrovsky fd8b795113 xen/balloon: Don't continue ballooning when BP_ECANCELED is encountered
Commit 3dcf63677d ("xen/balloon: cancel ballooning if adding new
memory failed") makes reserve_additional_memory() return BP_ECANCELED
when an error is encountered. This error, however, is ignored by the
caller (balloon_process()) since it is overwritten by subsequent call
to update_schedule(). This results in continuous attempts to add more
memory, all of which are likely to fail again.

We should stop trying to schedule next iteration of ballooning when
the current one has failed.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-23 16:24:01 +01:00
David Vrabel 3dcf63677d xen/balloon: cancel ballooning if adding new memory failed
If the balloon driver is adding additional memory regions to the
balloon and add_memory() fails it will likely continuously fail so
cancel the balloon operation.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2014-09-02 15:37:19 +01:00
David Vrabel fb9a0c4436 xen/balloon: set ballooned out pages as invalid in p2m
Since cd9151e26d (xen/balloon: set a
mapping for ballooned out pages), a ballooned out page had its entry
in the p2m set to the MFN of one of the scratch pages.  This means
that the p2m will contain many entries pointing to the same MFN.

During a domain save, these many-to-one entries are not identified as
such and the scratch page is saved multiple times. On restore the
ballooned pages are populated with new frames and the domain may use
up its allocation before all pages can be restored.

Since the original fix only needed to keep a mapping for the ballooned
page it is safe to set ballooned out pages as INVALID_P2M_ENTRY in the
p2m (as they were before). Thus preventing them from being saved and
re-populated on restore.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reported-by: Marek Marczykowski <marmarek@invisiblethingslab.com>
Tested-by: Marek Marczykowski <marmarek@invisiblethingslab.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: <stable@vger.kernel.org>
2014-07-04 15:17:43 +01:00
Linus Torvalds 467a9e1633 CPU hotplug notifiers registration fixes for 3.15-rc1
The purpose of this single series of commits from Srivatsa S Bhat (with
 a small piece from Gautham R Shenoy) touching multiple subsystems that use
 CPU hotplug notifiers is to provide a way to register them that will not
 lead to deadlocks with CPU online/offline operations as described in the
 changelog of commit 93ae4f978c (CPU hotplug: Provide lockless versions
 of callback registration functions).
 
 The first three commits in the series introduce the API and document it
 and the rest simply goes through the users of CPU hotplug notifiers and
 converts them to using the new method.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTQow2AAoJEILEb/54YlRxW4QQAJlYRDUzwFJzJzYhltQYuVR+
 4D74XMtvXgoJfg3cwdSWvMKKpJZnA9BVN0f7Hcx9wYmgdexYUuHeZJmMNyc3S2+g
 KjKBIsugvgmZhHbbLd6TJ6GBbhGT5JLt9VmSfL9zIkveInU1YHFUUqL/mxdHm4J0
 BSGKjk2rN3waRJgmY+xfliFLtQjDKFwJpMuvrgtoUyfas3f4sIV43UNbqdvA/weJ
 rzedxXOlKH/id4b56lj/4iIzcoL3mwvJJ7r6n0CEMsKv87z09kqR0O+69Tsq/cgs
 j17CsvoJOmZGk3QTeKVMQWBsvk6aPoDu3zK83gLbQMt+qjOpSTbJLz/3HZw4/TrW
 ss4nuZne1DLMGS+6hoxYbTP+6Ni//Kn+l/LrHc5jb7m1X3lMO4W2aV3IROtIE1rv
 lEP1IG01NU4u9YwkVj1dyhrkSp8tLPul4SrUK8W+oNweOC5crjJV7vJbIPJgmYiM
 IZN55wln0yVRtR4TX+rmvN0PixsInE8MeaVCmReApyF9pdzul/StxlBze5BKLSJD
 cqo1kNPpsmdxoDucqUpQ/gSvy+IOl2qnlisB5PpV93sk7De6TFDYrGHxjYIW7jMf
 StXwdCDDQhzd2Q8Kfpp895A1dbIl8rKtwA6bTU2eX+BfMVFzuMdT44cvosx1+UdQ
 sWl//rg76nb13dFjvF+q
 =SW7Q
 -----END PGP SIGNATURE-----

Merge tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull CPU hotplug notifiers registration fixes from Rafael Wysocki:
 "The purpose of this single series of commits from Srivatsa S Bhat
  (with a small piece from Gautham R Shenoy) touching multiple
  subsystems that use CPU hotplug notifiers is to provide a way to
  register them that will not lead to deadlocks with CPU online/offline
  operations as described in the changelog of commit 93ae4f978c ("CPU
  hotplug: Provide lockless versions of callback registration
  functions").

  The first three commits in the series introduce the API and document
  it and the rest simply goes through the users of CPU hotplug notifiers
  and converts them to using the new method"

* tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
  net/iucv/iucv.c: Fix CPU hotplug callback registration
  net/core/flow.c: Fix CPU hotplug callback registration
  mm, zswap: Fix CPU hotplug callback registration
  mm, vmstat: Fix CPU hotplug callback registration
  profile: Fix CPU hotplug callback registration
  trace, ring-buffer: Fix CPU hotplug callback registration
  xen, balloon: Fix CPU hotplug callback registration
  hwmon, via-cputemp: Fix CPU hotplug callback registration
  hwmon, coretemp: Fix CPU hotplug callback registration
  thermal, x86-pkg-temp: Fix CPU hotplug callback registration
  octeon, watchdog: Fix CPU hotplug callback registration
  oprofile, nmi-timer: Fix CPU hotplug callback registration
  intel-idle: Fix CPU hotplug callback registration
  clocksource, dummy-timer: Fix CPU hotplug callback registration
  drivers/base/topology.c: Fix CPU hotplug callback registration
  acpi-cpufreq: Fix CPU hotplug callback registration
  zsmalloc: Fix CPU hotplug callback registration
  scsi, fcoe: Fix CPU hotplug callback registration
  scsi, bnx2fc: Fix CPU hotplug callback registration
  scsi, bnx2i: Fix CPU hotplug callback registration
  ...
2014-04-07 14:55:46 -07:00
Wei Liu 09ed3d5ba0 xen/balloon: flush persistent kmaps in correct position
Xen balloon driver will update ballooned out pages' P2M entries to point
to scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't alloc
page while non-preemptible", kmap_flush_unused was moved after updating
P2M table. In that case for 32 bit PV guest we might end up with

  P2M    X -----> S  (S is mfn of balloon scratch page)
  M2P    Y -----> X  (Y is mfn in persistent kmap entry)

kmap_flush_unused() iterates through all the PTEs in the kmap address
space, using pte_to_page() to obtain the page. If the p2m and the m2p
are inconsistent the incorrect page is returned.  This will clear
page->address on the wrong page which may cause subsequent oopses if
that page is currently kmap'ed.

Move the flush back between get_page and __set_phys_to_machine to fix
this.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: stable@vger.kernel.org # 3.12+
2014-03-25 10:38:30 +00:00
Srivatsa S. Bhat 43ea953649 xen, balloon: Fix CPU hotplug callback registration
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

	get_online_cpus();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	register_cpu_notifier(&foobar_cpu_notifier);

	put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

The xen balloon driver doesn't take get/put_online_cpus() around this code,
but that is also buggy, since it can miss CPU hotplug events in between the
initialization and callback registration:

	for_each_online_cpu(cpu)
		init_cpu(cpu);
		   ^
		   |  Race window; Can miss CPU hotplug events here.
		   v
	register_cpu_notifier(&foobar_cpu_notifier);

Interestingly, the balloon code in xen can simply be reorganized as shown
below, to have a race-free method to register hotplug callbacks, without even
taking get/put_online_cpus(). This is because the initialization performed for
already online CPUs is exactly the same as that performed for CPUs that come
online later. Moreover, the code has checks in place to avoid double
initialization.

	register_cpu_notifier(&foobar_cpu_notifier);

	get_online_cpus();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	put_online_cpus();

A hotplug operation that occurs between registering the notifier and calling
get_online_cpus(), won't disrupt anything, because the code takes care to
perform the memory allocations only once.

So reorganize the balloon code in xen this way to fix the issues with CPU
hotplug callback registration.

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-20 13:43:48 +01:00
Jie Liu 9346c2a8de xen: simplify balloon_first_page() with list_first_entry_or_null()
Replace the code logic at balloon_first_page() by calling
list_first_entry_or_null() directly.  since here is only
one user of that routine, therefore we can just remove it.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2014-01-06 10:07:29 -05:00
Stefano Stabellini c1d15f5c8b xen/balloon: Seperate the auto-translate logic properly (v2)
The auto-xlat logic vs the non-xlat means that we don't need to for
auto-xlat guests (like PVH, HVM or ARM):
 - use P2M
 - use scratch page.

However the code in increase_reservation does modify the p2m for
auto_translate guests, but not in decrease_reservation.

Fix that by avoiding any p2m modifications in both increase_reservation
and decrease_reservation for auto_translated guests.

And also avoid allocating or using scratch pages for auto_translated guests.

Lastly, since !auto-xlat is really another way of saying 'xen_pv'
remove the redundant 'xen_pv_domain' check.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
[v2: Updated the description]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-12-13 11:20:30 -05:00
Paul Gortmaker 3b284bde70 xen: delete new instances of added __cpuinit
commit 6efa20e49b
("xen: Support 64-bit PV guest receiving NMIs") and
commit cd9151e26d
( "xen/balloon: set a mapping for ballooned out pages")
added new instances of __cpuinit usage.

We removed this a couple versions ago; we now want to remove
the compat no-op stubs.  Introducing new users is not what
we want to see at this point in time, as it will break once
the stubs are gone.

Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-11-08 15:13:16 -05:00
Boris Ostrovsky c275a57f5e xen/balloon: Set balloon's initial state to number of existing RAM pages
Currently balloon's initial value is set to max_pfn which includes
non-RAM ranges such as MMIO hole. As result, initial memory target
(specified by guest's configuration file) will appear smaller than
what balloon driver perceives to be the current number of available
pages. Thus it will balloon down "extra" pages, decreasing amount of
available memory for no good reason.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-11-08 15:13:03 -05:00
David Vrabel 24f69373e2 xen/balloon: don't alloc page while non-preemptible
get_balloon_scratch_page() disables preemption so we cannot call
alloc_page() in between get/put_balloon_scratch_page().  Shuffle bits
around in decrease_reservation() to avoid this.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-09-24 16:22:27 -04:00
Wei Liu 6a6f6e72ec xen/balloon: remove BUG_ON in increase_reservation
The BUG_ON in increase_reservation is wrong as we have P2M entry
ballooned out page set to balloon scratch page, so it might have a valid
P2M entry at that point.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2013-09-11 17:54:02 +00:00
David Vrabel 2bad07ce28 xen/balloon: ensure preemption is disabled when using a scratch page
In decrease_reservation(), if the kernel is preempted between updating
the mapping and updating the p2m then they may end up using different
scratch pages.

Use get_balloon_scratch_page() and put_balloon_scratch_page() which use
get_cpu_var() and put_cpu_var() to correctly disable preemption.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
2013-09-11 17:52:58 +00:00
Wei Liu 04660bb5d0 xen/balloon: don't set P2M entry for auto translated guest
In commit cd9151e2: xen/balloon: set a mapping for ballooned out pages
we have the ballooned out page's mapping set to a scratch page.

That commit also sets the P2M entry of ballooned out page to the scratch
page's MFN. This is necessary for PV guest but not for HVM guest. On the
other hand, setting the P2M entry would trigger BUG_ON in
__set_phys_to_machine.

The correct thing to do here is to avoid calling __set_phys_to_machine
for auto translated guest.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2013-08-30 13:24:13 -04:00
Stefano Stabellini cd9151e26d xen/balloon: set a mapping for ballooned out pages
Currently ballooned out pages are mapped to 0 and have INVALID_P2M_ENTRY
in the p2m. These ballooned out pages are used to map foreign grants
by gntdev and blkback (see alloc_xenballooned_pages).

Allocate a page per cpu and map all the ballooned out pages to the
corresponding mfn. Set the p2m accordingly. This way reading from a
ballooned out page won't cause a kernel crash (see
http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html).

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
CC: alex@alex.org.uk
CC: dcrisan@flexiant.com
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-08-09 11:23:24 -04:00