linux/Documentation
David Howells cc53ce53c8 Add a dentry op to allow processes to be held during pathwalk transit
Add a dentry op (d_manage) to permit a filesystem to hold a process and make it
sleep when it tries to transit away from one of that filesystem's directories
during a pathwalk.  The operation is keyed off a new dentry flag
(DCACHE_MANAGE_TRANSIT).

The filesystem is allowed to be selective about which processes it holds and
which it permits to continue on or prohibits from transiting from each flagged
directory.  This will allow autofs to hold up client processes whilst letting
its userspace daemon through to maintain the directory or the stuff behind it
or mounted upon it.

The ->d_manage() dentry operation:

	int (*d_manage)(struct path *path, bool mounting_here);

takes a pointer to the directory about to be transited away from and a flag
indicating whether the transit is undertaken by do_add_mount() or
do_move_mount() skipping through a pile of filesystems mounted on a mountpoint.

It should return 0 if successful and to let the process continue on its way;
-EISDIR to prohibit the caller from skipping to overmounted filesystems or
automounting, and to use this directory; or some other error code to return to
the user.

->d_manage() is called with namespace_sem writelocked if mounting_here is true
and no other locks held, so it may sleep.  However, if mounting_here is true,
it may not initiate or wait for a mount or unmount upon the parameter
directory, even if the act is actually performed by userspace.

Within fs/namei.c, follow_managed() is extended to check with d_manage() first
on each managed directory, before transiting away from it or attempting to
automount upon it.

follow_down() is renamed follow_down_one() and should only be used where the
filesystem deliberately intends to avoid management steps (e.g. autofs).

A new follow_down() is added that incorporates the loop done by all other
callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS
and CIFS do use it, their use is removed by converting them to use
d_automount()).  The new follow_down() calls d_manage() as appropriate.  It
also takes an extra parameter to indicate if it is being called from mount code
(with namespace_sem writelocked) which it passes to d_manage().  follow_down()
ignores automount points so that it can be used to mount on them.

__follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with
DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to
sleep.  It would be possible to enter d_manage() in rcu-walk mode too, and have
that determine whether to abort or not itself.  That would allow the autofs
daemon to continue on in rcu-walk mode.

Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't
required as every tranist from that directory will cause d_manage() to be
invoked.  It can always be set again when necessary.

==========================
WHAT THIS MEANS FOR AUTOFS
==========================

Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to
trigger the automounting of indirect mounts, and both of these can be called
with i_mutex held.

autofs knows that the i_mutex will be held by the caller in lookup(), and so
can drop it before invoking the daemon - but this isn't so for d_revalidate(),
since the lock is only held on _some_ of the code paths that call it.  This
means that autofs can't risk dropping i_mutex from its d_revalidate() function
before it calls the daemon.

The bug could manifest itself as, for example, a process that's trying to
validate an automount dentry that gets made to wait because that dentry is
expired and needs cleaning up:

	mkdir         S ffffffff8014e05a     0 32580  24956
	Call Trace:
	 [<ffffffff885371fd>] :autofs4:autofs4_wait+0x674/0x897
	 [<ffffffff80127f7d>] avc_has_perm+0x46/0x58
	 [<ffffffff8009fdcf>] autoremove_wake_function+0x0/0x2e
	 [<ffffffff88537be6>] :autofs4:autofs4_expire_wait+0x41/0x6b
	 [<ffffffff88535cfc>] :autofs4:autofs4_revalidate+0x91/0x149
	 [<ffffffff80036d96>] __lookup_hash+0xa0/0x12f
	 [<ffffffff80057a2f>] lookup_create+0x46/0x80
	 [<ffffffff800e6e31>] sys_mkdirat+0x56/0xe4

versus the automount daemon which wants to remove that dentry, but can't
because the normal process is holding the i_mutex lock:

	automount     D ffffffff8014e05a     0 32581      1              32561
	Call Trace:
	 [<ffffffff80063c3f>] __mutex_lock_slowpath+0x60/0x9b
	 [<ffffffff8000ccf1>] do_path_lookup+0x2ca/0x2f1
	 [<ffffffff80063c89>] .text.lock.mutex+0xf/0x14
	 [<ffffffff800e6d55>] do_rmdir+0x77/0xde
	 [<ffffffff8005d229>] tracesys+0x71/0xe0
	 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

which means that the system is deadlocked.

This patch allows autofs to hold up normal processes whilst the daemon goes
ahead and does things to the dentry tree behind the automouter point without
risking a deadlock as almost no locks are held in d_manage() and none in
d_automount().

Signed-off-by: David Howells <dhowells@redhat.com>
Was-Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-01-15 20:07:31 -05:00
..
ABI Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 2011-01-13 20:15:35 -08:00
DocBook Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-01-13 10:05:56 -08:00
PCI Documentation: pci.txt: fix typo 2010-07-11 22:17:45 +02:00
RCU rcu: update documentation/comments for Lai's adoption patch 2010-11-29 22:01:59 -08:00
accounting taskstats: pad taskstats netlink response for aligment issues on ia64 2010-12-22 19:43:34 -08:00
acpi ACPI, APEI, Add APEI generic error status printing support 2010-12-13 23:42:12 -05:00
aoe Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
arm Merge branch 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6 2011-01-06 19:13:58 -08:00
auxdisplay
blackfin Blackfin: document SPI CS limitations with CPHA=0 2010-08-06 12:55:52 -04:00
block Documentation: remove anticipatory scheduler info 2010-11-11 12:09:59 +01:00
blockdev Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
cdrom Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
cgroups revert documentaion update for memcg's dirty ratio. 2011-01-14 07:52:02 -08:00
connector Documentation/: it's -> its where appropriate 2010-04-23 02:09:52 +02:00
console doc: fix console doc typo 2010-02-24 13:51:32 +01:00
cpu-freq [CPUFREQ] Processor Clocking Control interface driver 2010-01-13 10:55:16 -05:00
cpuidle
cris
crypto
development-process Documentation/development-process: more staging info 2010-11-18 15:00:47 -08:00
device-mapper dm: raid456 basic support 2011-01-13 20:00:02 +00:00
driver-model driver core: prune docs about device_interface 2010-11-10 16:57:11 -08:00
dvb [media] Documentation/lmedm04: Fix firmware extract information 2010-12-29 08:16:30 -02:00
early-userspace
fault-injection lkdtm: add debugfs access and loosen KPROBE ties 2010-03-06 11:26:32 -08:00
fb Merge branch 'fbdev/udlfb' 2011-01-06 18:10:09 +09:00
filesystems Add a dentry op to allow processes to be held during pathwalk transit 2011-01-15 20:07:31 -05:00
firmware_class firmware: Update hotplug script 2010-08-05 13:53:34 -07:00
frv
hwmon hwmon: (dme1737) Add support for in7 for SCH5127 2011-01-12 21:55:13 +01:00
i2c i2c: Add generic I2C multiplexer using GPIO API 2011-01-10 22:11:23 +01:00
i2o
ia64 Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
ide
infiniband Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
input Input: fix force feedback capability query example 2011-01-11 01:07:55 -08:00
ioctl pps: add kernel consumer support 2011-01-13 08:03:21 -08:00
isdn Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2010-08-04 15:31:02 -07:00
ja_JP Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
kbuild Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6 2011-01-10 08:57:03 -08:00
kdump kdump: update kexec-tools URL and Vivek's email 2010-11-25 14:36:38 +01:00
ko_KR Docs/Kconfig: Update: osdl.org -> linuxfoundation.org 2010-11-15 23:50:13 +01:00
kvm Merge branch 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm 2011-01-13 10:14:24 -08:00
laptops thinkpad-acpi: untangle ACPI/vendor backlight selection 2010-08-16 11:54:50 -04:00
leds Documentation: led drivers lp5521 and lp5523 2010-11-12 07:55:32 -08:00
lguest Docs/Kconfig: Update: osdl.org -> linuxfoundation.org 2010-11-15 23:50:13 +01:00
m68k
make kbuild: introduce HDR_ARCH_LIST for headers_install_all 2010-12-14 22:16:19 +01:00
mips
misc-devices Documentation: short descriptions for bh1770glc and apds990x drivers 2010-10-26 16:52:14 -07:00
mmc mmc: add erase, secure erase, trim and secure trim operations 2010-08-12 08:43:30 -07:00
mn10300
mtd Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
namespaces
netlabel Documentation/: it's -> its where appropriate 2010-04-23 02:09:52 +02:00
networking Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-01-13 10:05:56 -08:00
nfc NFC: Driver for NXP Semiconductors PN544 NFC chip. 2011-01-13 08:03:19 -08:00
parisc
pcmcia pcmcia: use autoconfiguration feature for ioports and iomem 2010-09-29 17:20:24 +02:00
power PM: Fix references to basic-pm-debugging.txt in drivers-testing.txt 2010-12-24 15:02:41 +01:00
powerpc Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-01-13 10:05:56 -08:00
pps pps: add parallel port PPS signal generator 2011-01-13 08:03:21 -08:00
prctl
s390 Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
scheduler Docs: typo: Complete -> Completely 2010-11-02 22:47:39 -04:00
scsi Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-01-13 10:05:56 -08:00
serial pps: timestamp is always passed to dcd_change() 2011-01-13 08:03:20 -08:00
sh sh: clkfwk: Kill off unused clk_set_rate_ex(). 2010-11-15 18:25:12 +09:00
sound Merge branch 'topic/hda' into for-linus 2011-01-13 08:37:19 +01:00
sparc
spi arm/pxa2xx: reorgazine SSP and SPI header files 2010-12-01 12:18:33 +01:00
sysctl sysctl: remove obsolete comments 2011-01-13 08:03:18 -08:00
telephony Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
thermal thermal: Add event notification to thermal framework 2011-01-12 00:08:35 -05:00
timers tree-wide: fix comment/printk typos 2010-11-01 15:38:34 -04:00
trace Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-01-13 10:05:56 -08:00
uml Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
usb USB: use the runtime-PM autosuspend implementation 2010-11-16 14:03:41 -08:00
video4linux [media] cardlist: Update lists for em28xx and saa7134 2010-12-29 08:17:17 -02:00
vm thp: transparent hugepage support documentation 2011-01-13 17:32:38 -08:00
w1 w1: DS2423 counter driver and documentation 2011-01-13 08:03:22 -08:00
watchdog watchdog: docs: add an entry for imx2_wdt 2010-07-01 16:02:55 +00:00
wimax
x86 x86: support XZ-compressed kernel 2011-01-13 08:03:25 -08:00
zh_CN Docs/Kconfig: Update: osdl.org -> linuxfoundation.org 2010-11-15 23:50:13 +01:00
.gitignore add random binaries to .gitignore 2010-04-08 11:34:34 +02:00
00-INDEX mmc: add erase, secure erase, trim and secure trim operations 2010-08-12 08:43:30 -07:00
BUG-HUNTING
Changes Documentation update broken web addresses 2010-07-11 21:55:42 +02:00
CodingStyle
DMA-API-HOWTO.txt Documentation: DMA-API-HOWTO.txt: rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN 2010-08-14 11:56:46 -07:00
DMA-API.txt dma-mapping: remove dma_is_consistent API 2010-08-11 08:59:21 -07:00
DMA-ISA-LPC.txt
DMA-attributes.txt
HOWTO Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
IPMI.txt IPMI: Add the document description of ipmi_get_smi_info 2010-12-14 00:22:00 -05:00
IRQ-affinity.txt
IRQ.txt
Intel-IOMMU.txt
Makefile [media] Remove the old V4L1 v4lgrab.c file 2010-12-29 08:17:12 -02:00
ManagementStyle
SAK.txt
SELinux.txt
SM501.txt
SecurityBugs
Smack.txt Documentation/: it's -> its where appropriate 2010-04-23 02:09:52 +02:00
SubmitChecklist Documentation: update SubmitChecklist for O=objdir and kconfig testing 2010-05-24 07:31:20 -07:00
SubmittingDrivers Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
SubmittingPatches SubmittingPatches: add more about patch descriptions 2010-08-09 20:45:05 -07:00
VGA-softcursor.txt
apparmor.txt AppArmor: update Maintainer and Documentation 2010-08-02 15:35:15 +10:00
applying-patches.txt
atomic_ops.txt Documentation/: it's -> its where appropriate 2010-04-23 02:09:52 +02:00
bad_memory.txt
basic_profiling.txt
binfmt_misc.txt Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
braille-console.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt documentation: fix almost duplicate filenames (IO/io-mapping.txt) 2010-07-20 17:49:30 +00:00
cachetlb.txt Documentation/: it's -> its where appropriate 2010-04-23 02:09:52 +02:00
circular-buffers.txt Document Linux's circular buffering capabilities 2010-03-24 16:31:22 -07:00
coccinelle.txt scripts/coccinelle: update for compatability with Coccinelle 0.2.4 2010-12-03 12:27:01 +01:00
cpu-hotplug.txt documentation: fix erroneous email address. 2010-08-11 23:04:10 +09:30
cpu-load.txt
cputopology.txt topology/sysfs: Provide book id and siblings attributes 2010-09-09 20:41:25 +02:00
credentials.txt CRED: Fix __task_cred()'s lockdep check and banner comment 2010-07-29 15:16:18 -07:00
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
devices.txt Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 2010-10-28 09:35:11 -07:00
dmaengine.txt
dontdiff Documentation/dontdiff: add further autogenerated files to ignore list 2011-01-06 09:59:37 -08:00
dynamic-debug-howto.txt Dynamic Debug: Introduce ddebug_query= boot parameter 2010-10-22 10:16:42 -07:00
edac.txt EDAC: Fix typos in Documentation/edac.txt 2010-11-25 17:32:47 +01:00
eisa.txt doc: fix Defaultd -> Defaults typo in EISA doc 2010-02-05 12:22:39 +01:00
email-clients.txt Documentation/email-clients.txt: update Thunderbird docs with wordwrap plugin 2011-01-13 08:03:15 -08:00
feature-removal-schedule.txt Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 2011-01-13 20:15:35 -08:00
flexible-arrays.txt
futex-requeue-pi.txt
gcov.txt
gpio.txt Revert "gpiolib: annotate gpio-intialization with __must_check" 2011-01-13 17:26:46 -08:00
highuid.txt
hw_random.txt
init.txt init/main.c: improve usability in case of init binary failure 2010-03-06 11:26:29 -08:00
initrd.txt
intel_txt.txt Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
io-mapping.txt
io_ordering.txt
iostats.txt remove extraneous 'is' from Documentation/iostats.txt 2011-01-03 14:00:28 +01:00
irqflags-tracing.txt
isapnp.txt
java.txt
kernel-doc-nano-HOWTO.txt docbook: warn on unused doc entries 2010-09-11 16:49:21 -07:00
kernel-docs.txt Documentation: update kernel-docs.txt 2011-01-06 09:59:38 -08:00
kernel-parameters.txt Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 2011-01-13 20:15:35 -08:00
keys-request-key.txt
keys-trusted-encrypted.txt keys: add new trusted key-type 2010-11-29 08:55:25 +11:00
keys.txt
kmemcheck.txt
kmemleak.txt
kobject.txt kobject: documentation: Update to refer to kset-example.c. 2010-03-19 07:12:20 -07:00
kprobes.txt tree-wide: fix comment/printk typos 2010-11-01 15:38:34 -04:00
kref.txt
ldm.txt Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
leds-class.txt led-class: always implement blinking 2010-11-12 07:55:32 -08:00
leds-lp3944.txt
local_ops.txt
lockdep-design.txt
lockstat.txt lockstat: Add usage info to Documentation/lockstat.txt 2009-12-06 13:20:02 +01:00
logo.gif
logo.txt
magic-number.txt take coda-private headers out of include/linux 2011-01-12 20:02:48 -05:00
mca.txt
md.txt Documentation: update broken web addresses. 2010-08-04 15:21:40 +02:00
memory-barriers.txt Document Linux's circular buffering capabilities 2010-03-24 16:31:22 -07:00
memory-hotplug.txt mm: add numa node symlink for memory section in sysfs 2009-12-15 08:53:17 -08:00
memory.txt
mono.txt
mutex-design.txt mutex: Fix annotations to include it in kernel-locking docbook 2010-09-03 08:19:51 +02:00
nmi_watchdog.txt
nommu-mmap.txt nommu: fix malloc performance by adding uninitialized flag 2009-12-15 08:53:24 -08:00
numastat.txt
oops-tracing.txt panic: Add taint flag TAINT_FIRMWARE_WORKAROUND ('I') 2010-05-19 08:37:43 +01:00
padata.txt Documentation/padata.txt: fix typos etc. 2010-08-11 08:59:18 -07:00
parport-lowlevel.txt
parport.txt
pi-futex.txt
pnp.txt doc: capitalization and other minor fixes in pnp doc 2010-02-05 12:22:44 +01:00
preempt-locking.txt
printk-formats.txt
prio_tree.txt
rbtree.txt Documentation: remove anticipatory scheduler info 2010-11-11 12:09:59 +01:00
rfkill.txt Document the rfkill sysfs ABI 2010-03-10 17:09:33 -05:00
robust-futex-ABI.txt
robust-futexes.txt
rt-mutex-design.txt variable name fix to Documentation/rt-mutex-design.txt 2010-06-05 17:39:09 +02:00
rt-mutex.txt
rtc.txt
serial-console.txt
sgi-ioc4.txt
sgi-visws.txt
sparse.txt update email address 2010-07-19 10:56:54 +02:00
spinlocks.txt Documentation: rw_lock lessons learned 2009-12-14 09:46:56 -08:00
stable_api_nonsense.txt
stable_kernel_rules.txt Documentation: -stable rules: upstream commit ID requirement reworded 2010-04-22 15:24:56 -07:00
svga.txt
sysfs-rules.txt Fix typos in comments 2010-03-16 11:47:56 +01:00
sysrq.txt documentation: update sysrq.txt magic sysrq keys 2010-10-26 17:32:41 -07:00
tomoyo.txt TOMOYO: Update version to 2.3.0 2010-08-02 15:35:10 +10:00
unaligned-memory-access.txt
unicode.txt
unshare.txt
vgaarbiter.txt vgaarbiter: fix a typo in the vgaarbiter Documentation 2009-12-16 11:28:58 -08:00
video-output.txt
volatile-considered-harmful.txt Documentation/volatile-considered-harmful.txt: correct cpu_relax() documentation 2010-03-24 16:31:20 -07:00
workqueue.txt workqueue: add and use WQ_MEM_RECLAIM flag 2010-10-11 15:20:26 +02:00
xz.txt decompressors: add XZ decompressor module 2011-01-13 08:03:24 -08:00
zorro.txt