linux/Documentation
Alexey Kardashevskiy 2157e7b82f vfio: powerpc/spapr: Register memory and define IOMMU v2
The existing implementation accounts the whole DMA window in
the locked_vm counter. This is going to be worse with multiple
containers and huge DMA windows. Also, real-time accounting would require
additional tracking of accounted pages due to the page size difference -
IOMMU uses 4K pages and system uses 4K or 64K pages.
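
To illustrate the page size mismatch, here is a minimal sketch (not kernel code) that counts how many system pages a run of IOMMU pages touches. The constants and the function name `system_pages_touched` are hypothetical, and a 64K system page size is assumed. Overflow of the address arithmetic is not handled.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, not kernel definitions. */
#define IOMMU_PAGE_SHIFT  12u  /* 4K IOMMU page */
#define SYSTEM_PAGE_SHIFT 16u  /* 64K system page (assumed for this sketch) */

/*
 * Number of distinct system pages covered by `npages` IOMMU pages
 * starting at userspace address `ua`.
 */
static uint64_t system_pages_touched(uint64_t ua, uint64_t npages)
{
    assert(SYSTEM_PAGE_SHIFT >= IOMMU_PAGE_SHIFT);
    if (npages == 0)
        return 0;

    uint64_t first = ua >> SYSTEM_PAGE_SHIFT;
    uint64_t last = (ua + (npages << IOMMU_PAGE_SHIFT) - 1) >> SYSTEM_PAGE_SHIFT;

    return last - first + 1;
}
```

Sixteen aligned 4K IOMMU pages fall inside one 64K system page, so accounting on each map request would have to remember which system pages were already counted.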

Another issue is that actual page pinning/unpinning happens on every
DMA map/unmap request. This does not affect performance much at present,
as we spend far too much time switching context between
guest/userspace/host, but it will start to matter when we add in-kernel
DMA map/unmap acceleration.

This introduces a new IOMMU type for SPAPR - VFIO_SPAPR_TCE_v2_IOMMU.
The new IOMMU deprecates VFIO_IOMMU_ENABLE/VFIO_IOMMU_DISABLE and introduces
2 new ioctls to register/unregister DMA memory -
VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY -
which receive user space address and size of a memory region which
needs to be pinned/unpinned and counted in locked_vm.
The new IOMMU splits physical page pinning and TCE table update
into 2 different operations. It requires:
1) guest pages to be registered first
2) subsequent map/unmap requests to work only with pre-registered memory.
For the default single window case this means that the entire guest
(instead of 2GB) needs to be pinned before using VFIO.
When a huge DMA window is added, no additional pinning will be
required, otherwise it would be guest RAM + 2GB.
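
The register-then-map split described above can be sketched as a toy userspace model. This is not the VFIO implementation: `struct container`, `register_memory`, `map_dma` and `locked_bytes` are hypothetical names, and pinning is reduced to bumping a counter that stands in for locked_vm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 8

struct region {
    uint64_t vaddr;
    uint64_t size;
};

struct container {
    struct region regions[MAX_REGIONS];
    size_t nregions;
    uint64_t locked_bytes;  /* stands in for the locked_vm accounting */
};

/* Step 1: register (pin and account) a memory region once, up front. */
static bool register_memory(struct container *c, uint64_t vaddr, uint64_t size)
{
    assert(c != NULL);
    if (size == 0 || c->nregions == MAX_REGIONS)
        return false;

    c->regions[c->nregions++] = (struct region){ vaddr, size };
    c->locked_bytes += size;
    return true;
}

/*
 * Step 2: a map request succeeds only if it lies entirely inside a
 * pre-registered region. No pinning or accounting happens here.
 */
static bool map_dma(const struct container *c, uint64_t vaddr, uint64_t size)
{
    assert(c != NULL);
    for (size_t i = 0; i < c->nregions; i++) {
        const struct region *r = &c->regions[i];

        if (vaddr >= r->vaddr && size <= r->size &&
            vaddr - r->vaddr <= r->size - size)
            return true;
    }
    return false;
}
```

In this model the accounted amount changes only at registration time, so repeated map/unmap requests leave `locked_bytes` untouched.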

The new memory registration ioctls are not supported by
VFIO_SPAPR_TCE_IOMMU. Dynamic DMA window and in-kernel acceleration
will require memory to be preregistered in order to work.

Accounting is done per user process.

This advertises v2 SPAPR TCE IOMMU and restricts what the userspace
can do with v1 or v2 IOMMUs.

In order to support memory pre-registration, we need a way to track
the use of every registered memory region and only allow unregistration
if a region is not in use anymore. So we need a way to tell which
region a just-cleared TCE came from.

This adds a userspace view of the TCE table into iommu_table struct.
It contains userspace addresses, one per TCE entry. The table is only
allocated when the ownership over an IOMMU group is taken which means
it is only used from outside of the powernv code (such as VFIO).
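
As a rough illustration of why a per-entry userspace address helps, the toy model below keeps a parallel array next to the TCE array and uses it to find the owning region when an entry is cleared. All names (`struct table`, `set_tce`, `clear_tce`, `unregister_region`) are hypothetical, the TCE value is a placeholder, and treating userspace address 0 as "empty" is a simplification of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_ENTRIES 16
#define MAX_REGIONS 4

struct region {
    uint64_t vaddr, size;
    unsigned long used;  /* number of TCEs currently pointing into it */
    bool valid;
};

struct table {
    uint64_t tce[TABLE_ENTRIES];        /* hardware view (placeholder values) */
    uint64_t userspace[TABLE_ENTRIES];  /* userspace address per entry, 0 = empty */
    struct region regions[MAX_REGIONS];
};

static struct region *find_region(struct table *t, uint64_t ua)
{
    for (size_t i = 0; i < MAX_REGIONS; i++) {
        struct region *r = &t->regions[i];

        if (r->valid && ua >= r->vaddr && ua - r->vaddr < r->size)
            return r;
    }
    return NULL;
}

/* Set an entry; only addresses inside a registered region are accepted. */
static bool set_tce(struct table *t, size_t idx, uint64_t ua)
{
    assert(idx < TABLE_ENTRIES);
    struct region *r = find_region(t, ua);

    if (!r || t->userspace[idx])
        return false;
    t->tce[idx] = 1;          /* placeholder for the real translation */
    t->userspace[idx] = ua;   /* remember where this entry came from */
    r->used++;
    return true;
}

/* Clear an entry and drop the use count of the region it came from. */
static void clear_tce(struct table *t, size_t idx)
{
    assert(idx < TABLE_ENTRIES);
    struct region *r = find_region(t, t->userspace[idx]);

    if (t->userspace[idx] && r)
        r->used--;
    t->tce[idx] = 0;
    t->userspace[idx] = 0;
}

/* Unregistration is refused while any TCE still uses the region. */
static bool unregister_region(struct table *t, uint64_t vaddr)
{
    struct region *r = find_region(t, vaddr);

    if (!r || r->vaddr != vaddr || r->used)
        return false;
    r->valid = false;
    return true;
}
```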

As v2 IOMMU supports IODA2 and pre-IODA2 IOMMUs (which do not support
DDW API), this creates a default DMA window for IODA2 for consistency.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-11 15:16:55 +10:00
..
ABI cxl: Document external user of existing API 2015-06-03 13:27:16 +10:00
DocBook Merge branch 'drm-next-merged' of git://people.freedesktop.org/~airlied/linux into v4l_for_linus 2015-04-21 09:44:55 -03:00
EDID
PCI The documentation tree update for 4.1. Numerous fixes, the overdue removal 2015-04-18 11:10:49 -04:00
RCU
accounting
acpi ACPI / documentation: Fix ambiguity in the GPIO properties document 2015-05-04 14:26:14 +02:00
aoe
arm ARM: SoC multiplatform code changes for v4.1 2015-04-22 09:20:15 -07:00
arm64 ARM64 / ACPI: additions of ACPI documentation for arm64 2015-03-26 15:13:09 +00:00
auxdisplay
backlight
blackfin Documentation: blackfin: Makefile: Typo building issue 2015-04-11 15:19:31 +02:00
block Documentation: Remove mentioning of block barriers 2015-03-20 07:41:56 -06:00
blockdev Merge branch 'for-4.1/drivers' of git://git.kernel.dk/linux-block 2015-04-16 22:05:27 -04:00
bus-devices
cdrom
cgroups The documentation tree update for 4.1. Numerous fixes, the overdue removal 2015-04-18 11:10:49 -04:00
cma cma: debug: document new debugfs interface 2015-04-14 16:49:00 -07:00
connector
console
cpu-freq
cpuidle
cris
crypto crypto: doc - AEAD / RNG AF_ALG interface 2015-03-09 21:06:18 +11:00
development-process
device-mapper dm crypt: update URLs to new cryptsetup project page 2015-04-15 12:10:24 -04:00
devicetree ARM: SoC fixes for 4.1-rc2 2015-05-09 16:13:38 -07:00
dmaengine Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma 2015-02-18 08:49:20 -08:00
driver-model Char/Misc driver patches for 4.1-rc1 2015-04-21 09:42:58 -07:00
dvb
early-userspace
extcon
fault-injection
fb
filesystems xfs: update for 4.1-rc1 2015-04-24 07:08:41 -07:00
firmware_class
fmc
frv
gpio The documentation tree update for 4.1. Numerous fixes, the overdue removal 2015-04-18 11:10:49 -04:00
hid HID: sensor: Update document for custom sensor 2015-04-10 22:22:56 +02:00
hwmon hwmon: (it87) Add support for IT8620E 2015-04-05 06:01:00 -07:00
i2c i2c: slave: add documentation for i2c-slave-eeprom 2015-03-27 16:53:39 +01:00
ia64
ide
infiniband
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2015-04-21 12:54:08 -07:00
ioctl platform/chrome: Add Chrome OS EC userspace device interface 2015-02-26 15:45:06 -08:00
isdn
ja_JP
kbuild Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2015-02-19 10:31:37 -08:00
kdump
ko_KR
laptops thinkpad_acpi: Add adaptive_kbd_mode sysfs attr 2015-03-03 09:00:08 -08:00
leds Documentation: leds: Add description of LED Flash class extension 2015-03-09 17:18:00 -07:00
locking
m68k
memory-devices
metag
mic
mips
misc-devices
mmc
mn10300
mtd
namespaces
netlabel
networking net: rfs: fix crash in get_rps_cpus() 2015-04-26 16:07:57 -04:00
nfc
nios2
parisc
pcmcia
phy
platform
power Power management and ACPI updates for v4.1-rc1 2015-04-14 20:21:54 -07:00
powerpc powerpc/dscr: Add documentation for DSCR support 2015-06-07 19:29:27 +10:00
pps
prctl
pti
ptp
rapidio
s390
scheduler docs/completion.txt: Various tweaks and corrections 2015-04-04 15:20:26 +02:00
scsi genirq: Remove the deprecated 'IRQF_DISABLED' request_irq() flag entirely 2015-03-05 20:53:06 +01:00
security Smack: Updates for Smack documentation 2015-03-31 10:35:31 -07:00
serial
sh
sound ASoC: Updates for v4.1 2015-04-13 14:14:29 +02:00
spi Documentation/spi/spidev_test.c: fix warning 2015-04-17 09:04:12 -04:00
sysctl Doc/sysctl/kernel.txt: document threads-max 2015-04-17 09:04:07 -04:00
target target: Version 2 of TCMU ABI 2015-04-19 22:40:26 -07:00
thermal
timers documentation: Update NO_HZ_FULL interaction with POSIX timers 2015-02-26 11:57:29 -08:00
tpm
trace coresight: Correcting documentation typographical error 2015-04-03 16:17:03 +02:00
usb usb: patches for v4.1 merge window 2015-03-24 22:57:49 +01:00
vDSO
video4linux [media] media/Documentation: New flag EXECUTE_ON_WRITE 2015-04-08 06:35:16 -03:00
virtual KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation. 2015-04-21 15:21:29 +02:00
vm The documentation tree update for 4.1. Numerous fixes, the overdue removal 2015-04-18 11:10:49 -04:00
w1
watchdog
wimax
x86 x86/mm/KASLR: Propagate KASLR status to kernel proper 2015-04-03 15:26:15 +02:00
xtensa
zh_CN Documentation:Update Documentation/zh_CN/arm64/memory.txt 2015-04-04 15:20:26 +02:00
00-INDEX
BUG-HUNTING
Changes
CodeOfConflict Code of Conflict 2015-02-27 11:44:24 -08:00
CodingStyle The documentation tree update for 4.1. Numerous fixes, the overdue removal 2015-04-18 11:10:49 -04:00
DMA-API-HOWTO.txt
DMA-API.txt
DMA-ISA-LPC.txt
DMA-attributes.txt
HOWTO
IPMI.txt ipmi:ssif: Ignore spaces when comparing I2C adapter names 2015-05-05 14:24:45 -05:00
IRQ-affinity.txt
IRQ-domain.txt IRQCHIP: Update docs regarding irq_domain_add_tree() 2015-04-01 17:21:35 +02:00
IRQ.txt
Intel-IOMMU.txt
Makefile Documentation: Remove ZBOOT MMC/SDHI utility and docs 2015-02-24 06:45:25 +09:00
ManagementStyle
SAK.txt
SM501.txt
SecurityBugs
SubmitChecklist
SubmittingDrivers
SubmittingPatches checkpatch, SubmittingPatches: suggest line wrapping commit messages at 75 columns 2015-04-17 09:03:57 -04:00
VGA-softcursor.txt
applying-patches.txt
assoc_array.txt
atomic_ops.txt documentation: Clarify memory-barrier semantics of atomic operations 2015-02-26 11:57:31 -08:00
bad_memory.txt
basic_profiling.txt
bcache.txt
binfmt_misc.txt
braille-console.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
cachetlb.txt
circular-buffers.txt
clk.txt
coccinelle.txt
cpu-hotplug.txt cpumask: fix cpu-hotplug documentation 2015-03-05 13:37:01 +10:30
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
devices.txt
digsig.txt
dma-buf-sharing.txt dma-buf: cleanup dma_buf_export() to make it easily extensible 2015-04-21 14:47:16 +05:30
dontdiff
dynamic-debug-howto.txt
edac.txt
efi-stub.txt
eisa.txt
email-clients.txt Documentation/email-clients.txt: Fix one grammar mistake, add extra info about TB 2015-03-20 07:41:55 -06:00
flexible-arrays.txt
futex-requeue-pi.txt
gcov.txt
gdb-kernel-debugging.txt scripts/gdb: add basic documentation 2015-02-17 14:34:54 -08:00
highuid.txt
hsi.txt
hw_random.txt
hwspinlock.txt
init.txt
initrd.txt
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt
irqflags-tracing.txt
isapnp.txt
java.txt
kasan.txt kasan: show gcc version requirements in Kconfig and Documentation 2015-05-05 17:10:10 -07:00
kernel-doc-nano-HOWTO.txt
kernel-docs.txt
kernel-parameters.txt uas: Add US_FL_MAX_SECTORS_240 flag 2015-04-28 12:48:57 +02:00
kernel-per-CPU-kthreads.txt documentation: Update per-CPU kthreads documentation 2015-02-26 11:57:30 -08:00
kmemcheck.txt Documentation: update the CONFIG_DEBUG_PAGEALLOC description 2015-03-20 07:41:55 -06:00
kmemleak.txt
kobject.txt
kprobes.txt kprobes: Update Documentation/kprobes.txt 2015-03-20 07:41:55 -06:00
kref.txt
kselftest.txt
ldm.txt
local_ops.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lzo.txt
magic-number.txt
mailbox.txt
md-cluster.txt md-cluster: Design Documentation 2015-02-23 07:16:46 -06:00
md.txt
media-framework.txt
memory-barriers.txt The documentation tree update for 4.1. Numerous fixes, the overdue removal 2015-04-18 11:10:49 -04:00
memory-hotplug.txt mem-hotplug: fix typo in Documentation/memory-hotplug.txt 2015-03-20 07:41:55 -06:00
module-signing.txt modsign: change default key details 2015-04-30 09:35:41 -07:00
mono.txt
nommu-mmap.txt
numastat.txt
oops-tracing.txt
padata.txt
parport-lowlevel.txt
parport.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pinctrl.txt pinctrl: fix example .get_group_pins implementation signature 2015-03-18 02:02:20 +01:00
pnp.txt
preempt-locking.txt
printk-formats.txt The documentation tree update for 4.1. Numerous fixes, the overdue removal 2015-04-18 11:10:49 -04:00
pwm.txt
ramoops.txt
rbtree.txt
remoteproc.txt
rfkill.txt
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt Documentation, split up rtc.txt into documentation and test file 2015-03-24 22:01:58 -06:00
serial-console.txt
sgi-ioc4.txt
smsc_ece1099.txt
sparse.txt
stable_api_nonsense.txt
stable_kernel_rules.txt stable_kernel_rules: Add clause about specification of kernel versions to patch. 2015-03-26 23:52:24 +01:00
static-keys.txt
svga.txt
sysfs-rules.txt
sysrq.txt
this_cpu_ops.txt
unaligned-memory-access.txt
unicode.txt
unshare.txt
vfio.txt vfio: powerpc/spapr: Register memory and define IOMMU v2 2015-06-11 15:16:55 +10:00
vgaarbiter.txt
video-output.txt
vme_api.txt
volatile-considered-harmful.txt
workqueue.txt
xillybus.txt
xz.txt
zorro.txt