32599 Commits

Author SHA1 Message Date
Dave Chinner 9fbe24d95e xfs: split out EFI/EFD log item format definition
The EFI/EFD item format definitions are shared with userspace. Split
them out of header files that contain kernel-only definitions to make
it simple to share them.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:07:13 -05:00
Dave Chinner a8da0da25c xfs: split out buf log item format definitions
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:06:37 -05:00
Dave Chinner 69432832fd xfs: split out inode log item format definition
The log item format definitions are shared with userspace. Split
them out of header files that contain kernel-only definitions to make
it simple to share them.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:05:19 -05:00
Dave Chinner fc06c6d064 xfs: separate out log format definitions
The on-disk format definitions for the log are spread randomly
through a couple of header files. Consolidate it all in a single
file that can be shared easily with userspace. This means that
xfs_log.h and xfs_log_priv.h no longer need to be shared with
userspace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-12 16:03:51 -05:00
Tejun Heo 7a378c9aea xfs: WQ_NON_REENTRANT is meaningless and going away
dbf2576e37 ("workqueue: make all workqueues non-reentrant") made
WQ_NON_REENTRANT no-op and the flag is going away.  Remove its usages.

This patch doesn't introduce any behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: Alex Elder <elder@kernel.org>
Cc: xfs@oss.sgi.com
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-30 13:11:17 -05:00
Dave Chinner e60896d8f2 xfs: di_flushiter considered harmful
When we made all inode updates transactional, we no longer needed
the log recovery detection for inodes being newer on disk than the
transaction being replayed - it was redundant as replay of the log
would always result in the latest version of the inode being on
disk. It was redundant, but left in place because it wasn't
considered to be a problem.

However, with the new "don't read inodes on create" optimisation,
flushiter has come back to bite us. Essentially, the optimisation
always initialises flushiter to zero in the create transaction,
and so if we then crash and run recovery and the inode already on
disk has a non-zero flushiter it will skip recovery of that inode.
As a result, log recovery does the wrong thing and we end up with a
corrupt filesystem.

Because we have to support old kernel to new kernel upgrades, we
can't just get rid of the flushiter support in log recovery as we
might be upgrading from a kernel that doesn't have fully transactional
inode updates.  Unfortunately, for v4 superblocks there is no way to
guarantee that log recovery knows about this fact.

We cannot add a new inode format flag to say it's a "special inode
create" because it won't be understood by older kernels and so
recovery could do the wrong thing on downgrade. We cannot specially
detect the combination of zero mode/non-zero flushiter on disk to
non-zero mode, zero flushiter in the log item during recovery
because wrapping of the flushiter can result in false detection.

Hence that makes this "don't use flushiter" optimisation limited to
a disk format that guarantees that we don't need it. And that means
the only fix here is to limit the "no read IO on create"
optimisation to version 5 superblocks....
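
The failure mode described above can be illustrated with a small model.
This is a simplified sketch, not the kernel's recovery code:
demo_replay_skipped() and its parameters are hypothetical names, and the
flushiter wrap handling mentioned above is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model (an assumption, not the kernel implementation) of the
 * legacy check: replay of a logged inode is skipped when the copy already
 * on disk carries a larger flush counter than the copy in the log.
 */
static bool demo_replay_skipped(uint16_t log_flushiter, uint16_t disk_flushiter)
{
    return disk_flushiter > log_flushiter;
}
```

With a create transaction that logs flushiter 0 and a stale on-disk inode
whose flushiter is 5, demo_replay_skipped(0, 5) is true, so the newly
created inode would not be replayed.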

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-24 12:15:23 -05:00
Chandra Seetharaman d892d5864f xfs: Start using pquotaino from the superblock.
Start using pquotino and define a macro to check if the
superblock has pquotino.

Keep backward compatibility by allowing the mount of an older superblock
with no separate pquota inode.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-22 14:46:26 -05:00
Chandra Seetharaman 0102629776 xfs: Initialize all quota inodes to be NULLFSINO
mkfs doesn't initialize the quota inodes to NULLFSINO as it does for the
other internal inodes. As a result, two in-core values (0 and NULLFSINO)
have to be checked to determine whether a quota inode is valid.

Solve that problem by initializing the in-core values of all quotaino
fields to NULLFSINO if they are 0 on disk.

Note that these values are not written back to the on-disk superblock unless
some quota is enabled on the filesystem. Even in that case, sb_pquotino is
written to disk only if the on-disk superblock supports pquotino.
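
The normalisation described above can be sketched as follows. This is a
minimal illustration under stated assumptions: demo_ino_t,
DEMO_NULLFSINO and both helper functions are hypothetical stand-ins, not
the XFS definitions.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t demo_ino_t;                    /* stand-in inode number type */
#define DEMO_NULLFSINO ((demo_ino_t)-1)         /* "no inode" sentinel */

/* Map the on-disk "unset" value 0 to the single in-core sentinel. */
static demo_ino_t demo_quotaino_from_disk(demo_ino_t on_disk)
{
    return on_disk == 0 ? DEMO_NULLFSINO : on_disk;
}

/* After normalisation only one value has to be checked. */
static bool demo_quotaino_valid(demo_ino_t incore)
{
    return incore != DEMO_NULLFSINO;
}
```

Callers then test a quota inode with one comparison instead of checking
for both 0 and the sentinel.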

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-22 14:10:53 -05:00
Chandra Seetharaman 297aa63769 xfs: Fix a deadlock in xfs_log_commit_cil() code path
While testing and rearranging pquota/gquota code, I stumbled
on an xfs_shutdown() during a mount. But the mount just hung.

Debugged and found that there is a deadlock involving
&log->l_cilp->xc_ctx_lock.

It is in a code path where &log->l_cilp->xc_ctx_lock is first
acquired in read mode and some levels down the same semaphore
is being acquired in write mode causing a deadlock.

This is the stack:
xfs_log_commit_cil -> acquires &log->l_cilp->xc_ctx_lock in read mode
  xlog_print_tic_res
    xfs_force_shutdown
      xfs_log_force_umount
        xlog_cil_force
          xlog_cil_force_lsn
            xlog_cil_push_foreground
              xlog_cil_push - tries to acquire same semaphore in write mode

This patch fixes the deadlock by changing the reason code for
xfs_force_shutdown in xlog_print_tic_res() to SHUTDOWN_LOG_IO_ERROR.

SHUTDOWN_LOG_IO_ERROR is the right reason code to be set since
we are in the log path.
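
The locking pattern behind the hang can be modelled in a few lines. This
is a toy single-threaded sketch, not the kernel rw_semaphore: the
demo_rwsem type and its trylock helpers are hypothetical, and they only
show why a write acquisition cannot succeed while a read hold on the
same lock is still outstanding.

```c
#include <stdbool.h>

/* Toy reader/writer lock state: any number of readers or one writer. */
struct demo_rwsem {
    int  readers;
    bool writer;
};

static bool demo_down_read_trylock(struct demo_rwsem *s)
{
    if (s->writer)
        return false;
    s->readers++;
    return true;
}

static void demo_up_read(struct demo_rwsem *s)
{
    s->readers--;
}

/* A writer needs exclusive access, so any outstanding reader blocks it. */
static bool demo_down_write_trylock(struct demo_rwsem *s)
{
    if (s->writer || s->readers > 0)
        return false;
    s->writer = true;
    return true;
}
```

A blocking write acquisition issued further down the same call chain
would wait for a reader that can never release the lock, which matches
the stack shown above.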

Thanks to Dave for suggesting this solution.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-22 13:58:10 -05:00
Jie Liu 58e59854a3 xfs: fix assertion failure in xfs_vm_write_failed()
In xfs_vm_write_failed(), we evaluate the block_offset of pos with
PAGE_MASK which is an unsigned long.  That is fine on 64-bit platforms
regardless of whether the request pos is 32-bit or 64-bit.  However, on
32-bit platforms the value is 0xfffff000 and so the high 32 bits in it
will be masked off with (pos & PAGE_MASK) for a 64-bit pos.

As a result, the evaluated block_offset is incorrect which will cause
this failure ASSERT(block_offset + from == pos); and potentially pass
the wrong block to xfs_vm_kill_delalloc_range().

In this case, we can get a kernel panic if CONFIG_XFS_DEBUG is enabled:

XFS: Assertion failed: block_offset + from == pos, file: fs/xfs/xfs_aops.c, line: 1504

------------[ cut here ]------------
 kernel BUG at fs/xfs/xfs_message.c:100!
 invalid opcode: 0000 [#1] SMP
 ........
 Pid: 4057, comm: mkfs.xfs Tainted: G           O 3.9.0-rc2 #1
 EIP: 0060:[<f94a7e8b>] EFLAGS: 00010282 CPU: 0
 EIP is at assfail+0x2b/0x30 [xfs]
 EAX: 00000056 EBX: f6ef28a0 ECX: 00000007 EDX: f57d22a4
 ESI: 1c2fb000 EDI: 00000000 EBP: ea6b5d30 ESP: ea6b5d1c
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
 CR0: 8005003b CR2: 094f3ff4 CR3: 2bcb4000 CR4: 000006f0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
 Process mkfs.xfs (pid: 4057, ti=ea6b4000 task=ea5799e0 task.ti=ea6b4000)
 Stack:
 00000000 f9525c48 f951fa80 f951f96b 000005e4 ea6b5d7c f9494b34 c19b0ea2
 00000066 f3d6c620 c19b0ea2 00000000 e9a91458 00001000 00000000 00000000
 00000000 c15c7e89 00000000 1c2fb000 00000000 00000000 1c2fb000 00000080
 Call Trace:
 [<f9494b34>] xfs_vm_write_failed+0x74/0x1b0 [xfs]
 [<c15c7e89>] ? printk+0x4d/0x4f
 [<f9494d7d>] xfs_vm_write_begin+0x10d/0x170 [xfs]
 [<c110a34c>] generic_file_buffered_write+0xdc/0x210
 [<f949b669>] xfs_file_buffered_aio_write+0xf9/0x190 [xfs]
 [<f949b7f3>] xfs_file_aio_write+0xf3/0x160 [xfs]
 [<c115e504>] do_sync_write+0x94/0xd0
 [<c115ed1f>] vfs_write+0x8f/0x160
 [<c115e470>] ? wait_on_retry_sync_kiocb+0x50/0x50
 [<c115f017>] sys_write+0x47/0x80
 [<c15d860d>] sysenter_do_call+0x12/0x28
 .............
 EIP: [<f94a7e8b>] assfail+0x2b/0x30 [xfs] SS:ESP 0068:ea6b5d1c
 ---[ end trace cdd9af4f4ecab42f ]---
 Kernel panic - not syncing: Fatal exception

To avoid this, we can evaluate the block_offset of the start
of the page by using shifts rather than masks, which sidesteps the
mismatch problem.
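
The truncation can be reproduced in userspace. This is a sketch under
stated assumptions: a 4096-byte page, with DEMO_PAGE_SHIFT,
DEMO_PAGE_MASK_32 and both helpers as hypothetical names. The 32-bit
mask is modelled with uint32_t so the example behaves the same on any
host.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT   12
/* What a 32-bit unsigned long page mask looks like: 0xfffff000. */
#define DEMO_PAGE_MASK_32 ((uint32_t)~((1u << DEMO_PAGE_SHIFT) - 1))

/* Mask variant: the 32-bit mask is zero-extended, clearing bits 32..63. */
static int64_t demo_block_offset_mask(int64_t pos)
{
    return pos & DEMO_PAGE_MASK_32;
}

/* Shift variant: rounds down to the page start and keeps the high bits. */
static int64_t demo_block_offset_shift(int64_t pos)
{
    return (pos >> DEMO_PAGE_SHIFT) << DEMO_PAGE_SHIFT;
}
```

For a position above 4GiB such as 0x11c2fb123, the mask variant yields
0x1c2fb000 while the shift variant yields 0x11c2fb000, so only the shift
variant satisfies block_offset + from == pos.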

Thanks to Dave Chinner for help finding and fixing this bug.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-22 13:12:19 -05:00
Linus Torvalds 41d9884c44 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs stuff from Al Viro:
 "O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including
  making simple_lookup() usable for filesystems with non-NULL s_d_op,
  which allows us to get rid of quite a bit of ugliness"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  sunrpc: now we can just set ->s_d_op
  cgroup: we can use simple_lookup() now
  efivarfs: we can use simple_lookup() now
  make simple_lookup() usable for filesystems that set ->s_d_op
  configfs: don't open-code d_alloc_name()
  __rpc_lookup_create_exclusive: pass string instead of qstr
  rpc_create_*_dir: don't bother with qstr
  llist: llist_add() can use llist_add_batch()
  llist: fix/simplify llist_add() and llist_add_batch()
  fput: turn "list_head delayed_fput_list" into llist_head
  fs/file_table.c:fput(): add comment
  Safer ABI for O_TMPFILE
2013-07-14 11:42:26 -07:00
Al Viro 6e8cd2cb46 efivarfs: we can use simple_lookup() now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14 17:48:35 +04:00
Al Viro 74931da7a6 make simple_lookup() usable for filesystems that set ->s_d_op
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14 17:43:25 +04:00
Al Viro ec193cf5af configfs: don't open-code d_alloc_name()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-14 17:16:52 +04:00
Linus Torvalds be9c6d9169 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Just a bunch of small fixes and tidy ups:

   1) Finish the "busy_poll" renames, from Eliezer Tamir.

   2) Fix RCU stalls in IFB driver, from Ding Tianhong.

   3) Linearize buffers properly in tun/macvtap zerocopy code.

   4) Don't crash on rmmod in vxlan, from Pravin B Shelar.

   5) Spinlock used before init in alx driver, from Maarten Lankhorst.

   6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry
      Kravkov.

   7) Dummy and ifb driver load failure paths can oops, fixes from Tan
      Xiaojun and Ding Tianhong.

   8) Correct MTU calculations in IP tunnels, from Alexander Duyck.

   9) Account all TCP retransmits in SNMP stats properly, from Yuchung
      Cheng.

  10) atl1e and via-rhine do not handle DMA mapping failures properly,
      from Neil Horman.

  11) Various equal-cost multipath route fixes in ipv6 from Hannes
      Frederic Sowa"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
  ipv6: only static routes qualify for equal cost multipathing
  via-rhine: fix dma mapping errors
  atl1e: fix dma mapping warnings
  tcp: account all retransmit failures
  usb/net/r815x: fix cast to restricted __le32
  usb/net/r8152: fix integer overflow in expression
  net: access page->private by using page_private
  net: strict_strtoul is obsolete, use kstrtoul instead
  drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe
  drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe
  drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe
  net/usb: add relative mii functions for r815x
  net/tipc: use %*phC to dump small buffers in hex form
  qlcnic: Adding Maintainers.
  gre: Fix MTU sizing check for gretap tunnels
  pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts
  pkt_sched: sch_qfq: improve efficiency of make_eligible
  gso: Update tunnel segmentation to support Tx checksum offload
  inet: fix spacing in assignment
  ifb: fix oops when loading the ifb failed
  ...
2013-07-13 17:42:22 -07:00
Linus Torvalds 239dab4636 xfs: update (#2) for 3.11-rc1
- fix for xfs_fsr returning -EINVAL
 - cleanup in xfs_bulkstat
 - cleanup in xfs_open_by_handle
 - update mount options documentation
 - clean up local format handling in xfs_bmapi_write
 - fix dquot log reservations which were too small
 - fix sgid inheritance for subdirectories when default acls are in use
 - add project quota fields to various structures
 - fix teardown of quotainfo structures when quotas are turned off

Merge tag 'for-linus-v3.11-rc1-2' of git://oss.sgi.com/xfs/xfs

Pull more xfs updates from Ben Myers:
 "Here are a fix for xfs_fsr, a cleanup in bulkstat, a cleanup in
  xfs_open_by_handle, updated mount options documentation, a cleanup in
  xfs_bmapi_write, a fix for the size of dquot log reservations, a fix
  for sgid inheritance when acls are in use, a fix for cleaning up
  quotainfo structures, and some more of the work which allows group and
  project quotas to be used together.

  We had a few more in this last quota category that we might have liked
  to get in, but it looks like there are still a few items that need to be
  addressed.

   - fix for xfs_fsr returning -EINVAL
   - cleanup in xfs_bulkstat
   - cleanup in xfs_open_by_handle
   - update mount options documentation
   - clean up local format handling in xfs_bmapi_write
   - fix dquot log reservations which were too small
   - fix sgid inheritance for subdirectories when default acls are in use
   - add project quota fields to various structures
   - fix teardown of quotainfo structures when quotas are turned off"

* tag 'for-linus-v3.11-rc1-2' of git://oss.sgi.com/xfs/xfs:
  xfs: Fix the logic check for all quotas being turned off
  xfs: Add pquota fields where gquota is used.
  xfs: fix sgid inheritance for subdirectories inheriting default acls [V3]
  xfs: dquot log reservations are too small
  xfs: remove local fork format handling from xfs_bmapi_write()
  xfs: update mount options documentation
  xfs: use get_unused_fd_flags(0) instead of get_unused_fd()
  xfs: clean up unused codes at xfs_bulkstat()
  xfs: use XFS_BMAP_BMDR_SPACE vs. XFS_BROOT_SIZE_ADJ
2013-07-13 11:40:24 -07:00
Linus Torvalds f1c4108852 Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "Fixes for 4 cifs bugs, including a reconnect problem, a problem
  parsing responses to SMB2 open request, and setting nlink incorrectly
  to some servers which don't report it properly on the wire.  Also
  improves data integrity on reconnect with series from Pavel which adds
  durable handle support for SMB2."

* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Fix a deadlock when a file is reopened
  CIFS: Reopen the file if reconnect durable handle failed
  [CIFS] Fix minor endian error in durable handle patch series
  CIFS: Reconnect durable handles for SMB2
  CIFS: Make SMB2_open use cifs_open_parms struct
  CIFS: Introduce cifs_open_parms struct
  CIFS: Request durable open for SMB2 opens
  CIFS: Simplify SMB2 create context handling
  CIFS: Simplify SMB2_open code path
  CIFS: Respect create_options in smb2_open_file
  CIFS: Fix lease context buffer parsing
  [CIFS] use sensible file nlink values if unprovided
  Limit allocation of crypto mechanisms to dialect which requires
2013-07-13 11:20:49 -07:00
Oleg Nesterov 4f5e65a1cc fput: turn "list_head delayed_fput_list" into llist_head
fput() and delayed_fput() can use llist and avoid the locking.

This is an unlikely path, so the change is not expected to improve
performance, but the code looks simpler this way.
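
The idea of a lock-free list can be sketched with C11 atomics. This is a
userspace illustration, not the kernel llist code: demo_node, demo_head
and both functions are hypothetical names. The add helper reports
whether the list was empty beforehand, which a caller could use to
decide whether deferred work needs to be scheduled.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_node { struct demo_node *next; };
struct demo_head { _Atomic(struct demo_node *) first; };

/* Push one node with a compare-and-swap loop; no lock is taken. */
static bool demo_llist_add(struct demo_node *n, struct demo_head *h)
{
    struct demo_node *old = atomic_load(&h->first);

    do {
        n->next = old;  /* on CAS failure 'old' holds the current head */
    } while (!atomic_compare_exchange_weak(&h->first, &old, n));

    return old == NULL; /* true if the list was empty before the add */
}

/* Detach the whole list in one atomic exchange. */
static struct demo_node *demo_llist_del_all(struct demo_head *h)
{
    return atomic_exchange(&h->first, (struct demo_node *)NULL);
}
```

The consumer receives the nodes in last-in, first-out order and can walk
them without further synchronisation, since the list is no longer
reachable from the shared head.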

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-13 13:29:10 +04:00
Andrew Morton 64372501e2 fs/file_table.c:fput(): add comment
A missed update to "fput: task_work_add() can fail if the caller has
passed exit_task_work()".

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-13 13:27:42 +04:00
Al Viro bb458c644a Safer ABI for O_TMPFILE
[suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
that will fail on old kernels in a lot more cases than what I came up with.
And make sure O_CREAT doesn't get there...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-07-13 13:26:37 +04:00
Pavel Shilovsky 689c3db4d5 CIFS: Fix a deadlock when a file is reopened
If we request reading or writing on a file that needs to be
reopened, it causes a deadlock: we are already holding the rw
semaphore for reading and then we try to acquire it for writing
in cifs_relock_file. Fix this by acquiring the semaphore for
reading in cifs_relock_file, since we don't make any changes to
locks and don't need write access.

CC: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-07-11 18:05:41 -05:00
Pavel Shilovsky b33fcf1c9d CIFS: Reopen the file if reconnect durable handle failed
This is a follow-on patch for the 8/8 patch from the durable handles
series. It fixes the problem where the durable file handle timeout
has expired on the server and reopen returns -ENOENT for such files.
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-07-11 18:05:08 -05:00
Chandra Seetharaman c31ad439e8 xfs: Fix the logic check for all quotas being turned off
During the review of separate pquota inode patches, David noticed
that the test to detect all quotas being turned off was
incorrect, and hence the block was not freeing all the quota
information.

The check made sense in Irix, but in Linux, quota is turned off
one at a time, which makes the test invalid for Linux.

This problem existed since XFS was ported to Linux.

David suggested fixing the problem by detecting when all quotas are
turned off by checking m_qflags.
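
The suggested check can be illustrated with a small model. This is a
sketch under stated assumptions, not the XFS code: the DEMO_* flag
values, struct demo_mount and demo_quota_off() are hypothetical. It
shows quota types being turned off one at a time, with teardown
happening only once no quota flag remains set.

```c
#include <stdbool.h>

enum {
    DEMO_UQUOTA = 1u << 0,
    DEMO_GQUOTA = 1u << 1,
    DEMO_PQUOTA = 1u << 2,
};

struct demo_mount {
    unsigned int qflags;          /* quota types currently enabled */
    bool         quotainfo_freed; /* set once teardown has run */
};

/* Turn off the requested quota types; tear down when none are left. */
static void demo_quota_off(struct demo_mount *mp, unsigned int flags)
{
    mp->qflags &= ~flags;
    if (mp->qflags == 0)
        mp->quotainfo_freed = true;
}
```

Checking the remaining flags rather than the flags passed in by the
caller is what makes the test work when quotas are disabled one type at
a time.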

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-11 16:49:10 -05:00
Linus Torvalds 36805aaea5 Merge branch 'for-3.11/core' of git://git.kernel.dk/linux-block
Pull core block IO updates from Jens Axboe:
 "Here are the core IO block bits for 3.11. It contains:

   - A tweak to the reserved tag logic from Jan, for weirdo devices with
     just 3 free tags.  But for those it improves things substantially
     for random writes.

   - Periodic writeback fix from Jan.  Marked for stable as well.

   - Fix for a race condition in IO scheduler switching from Jianpeng.

   - The hierarchical blk-cgroup support from Tejun.  This is the grunt
     of the series.

   - blk-throttle fix from Vivek.

  Just a note that I'm in the middle of a relocation, whole family is
  flying out tomorrow.  Hence I will be away the remainder of this week,
  but back at work again on Monday the 15th.  CC'ing Tejun, since any
  potential "surprises" will most likely be from the blk-cgroup work.
  But it's been brewing for a while and sitting in my tree and
  linux-next for a long time, so should be solid."

* 'for-3.11/core' of git://git.kernel.dk/linux-block: (36 commits)
  elevator: Fix a race in elevator switching
  block: Reserve only one queue tag for sync IO if only 3 tags are available
  writeback: Fix periodic writeback after fs mount
  blk-throttle: implement proper hierarchy support
  blk-throttle: implement throtl_grp->has_rules[]
  blk-throttle: Account for child group's start time in parent while bio climbs up
  blk-throttle: add throtl_qnode for dispatch fairness
  blk-throttle: make throtl_pending_timer_fn() ready for hierarchy
  blk-throttle: make tg_dispatch_one_bio() ready for hierarchy
  blk-throttle: make blk_throtl_bio() ready for hierarchy
  blk-throttle: make blk_throtl_drain() ready for hierarchy
  blk-throttle: dispatch from throtl_pending_timer_fn()
  blk-throttle: implement dispatch looping
  blk-throttle: separate out throtl_service_queue->pending_timer from throtl_data->dispatch_work
  blk-throttle: set REQ_THROTTLED from throtl_charge_bio() and gate stats update with it
  blk-throttle: implement sq_to_tg(), sq_to_td() and throtl_log()
  blk-throttle: add throtl_service_queue->parent_sq
  blk-throttle: generalize update_disptime optimization in blk_throtl_bio()
  blk-throttle: dispatch to throtl_data->service_queue.bio_lists[]
  blk-throttle: move bio_lists[] and friends to throtl_service_queue
  ...
2013-07-11 13:03:24 -07:00
Linus Torvalds 1466b77a7b NFS client updates for Linux 3.11 (part 2)
Highlights include:
 - Fix an rpc_pipefs regression that causes a deadlock on mount
 - Readdir optimisations by Scott Mayhew and Jeff Layton
 - clean up the rpc_pipefs dentry operation setup

Merge tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull second set of NFS client updates from Trond Myklebust:
 "This mainly contains some small readdir optimisations that had
  dependencies on Al Viro's readdir rewrite.  There is also a fix for a
  nasty deadlock which surfaced earlier in this merge window.

  Highlights include:
   - Fix an rpc_pipefs regression that causes a deadlock on mount
   - Readdir optimisations by Scott Mayhew and Jeff Layton
   - clean up the rpc_pipefs dentry operation setup"

* tag 'nfs-for-3.11-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Fix a deadlock in rpc_client_register()
  rpc_pipe: rpc_dir_inode_operations can be static
  NFS: Allow nfs_updatepage to extend a write under additional circumstances
  NFS: Make nfs_readdir revalidate less often
  NFS: Make nfs_attribute_cache_expired() non-static
  rpc_pipe: set dentry operations at d_alloc time
  nfs: set verifier on existing dentries in nfs_prime_dcache
2013-07-11 12:11:35 -07:00
Linus Torvalds 19d2f8e0fb Second round of 9p patches for the 3.11 merge window.
Several of these patches were rebased in order to correct style issues.
 Only stylistic changes were made versus the patches which were in linux-next
 for two weeks.  The rebases have been in linux-next for 3 days and have
 passed my regressions.
 
 The bulk of these are RDMA fixes and improvements.  There are also some
 additions on the extended attributes front to support some additional
 namespaces and a new option for TCP to force allocation of mount requests
 from a privileged port.

Merge tag 'for-linus-3.11-merge-window-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

Pull second round of 9p patches from Eric Van Hensbergen:
 "Several of these patches were rebased in order to correct style
  issues.  Only stylistic changes were made versus the patches which
  were in linux-next for two weeks.  The rebases have been in linux-next
  for 3 days and have passed my regressions.

  The bulk of these are RDMA fixes and improvements.  There are also some
  additions on the extended attributes front to support some additional
  namespaces and a new option for TCP to force allocation of mount
  requests from a privileged port"

* tag 'for-linus-3.11-merge-window-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  fs/9p: Remove the unused variable "err" in v9fs_vfs_getattr()
  9P: Add cancelled() to the transport functions.
  9P/RDMA: count posted buffers without a pending request
  9P/RDMA: Improve error handling in rdma_request
  9P/RDMA: Do not free req->rc in error handling in rdma_request()
  9P/RDMA: Use a semaphore to protect the RQ
  9P/RDMA: Protect against duplicate replies
  9P/RDMA: increase P9_RDMA_MAXSIZE to 1MB
  9pnet: refactor struct p9_fcall alloc code
  9P/RDMA: rdma_request() needs not allocate req->rc
  9P: Fix fcall allocation for rdma
  fs/9p: xattr: add trusted and security namespaces
  net/9p: add privport option to 9p tcp transport
2013-07-11 10:21:23 -07:00
Linus Torvalds 746919d266 Code cleanups and improved buffer handling during page crypto operations
- Remove redundant code by merging some encrypt and decrypt functions
 - Get rid of a helper page allocation during page decryption by using in-place
   decryption
 - Better use of entire pages during page crypto operations
 - Several code cleanups

Merge tag 'ecryptfs-3.11-rc1-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs

Pull eCryptfs updates from Tyler Hicks:
 "Code cleanups and improved buffer handling during page crypto
  operations:
   - Remove redundant code by merging some encrypt and decrypt functions
   - Get rid of a helper page allocation during page decryption by using
     in-place decryption
   - Better use of entire pages during page crypto operations
   - Several code cleanups"

* tag 'ecryptfs-3.11-rc1-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
  Use ecryptfs_dentry_to_lower_path in a couple of places
  eCryptfs: Make extent and scatterlist crypt function parameters similar
  eCryptfs: Collapse crypt_page_offset() into crypt_extent()
  eCryptfs: Merge ecryptfs_encrypt_extent() and ecryptfs_decrypt_extent()
  eCryptfs: Combine page_offset crypto functions
  eCryptfs: Combine encrypt_scatterlist() and decrypt_scatterlist()
  eCryptfs: Decrypt pages in-place
  eCryptfs: Accept one offset parameter in page offset crypto functions
  eCryptfs: Simplify lower file offset calculation
  eCryptfs: Read/write entire page during page IO
  eCryptfs: Use entire helper page during page crypto operations
  eCryptfs: Cocci spatch "memdup.spatch"
2013-07-11 10:20:18 -07:00
Linus Torvalds 9db019278c A couple cleanups to JFS for 3.11

Merge tag 'jfs-3.11' of git://github.com/kleikamp/linux-shaggy

Pull jfs update from Dave Kleikamp:
 "A couple cleanups to JFS for 3.11"

* tag 'jfs-3.11' of git://github.com/kleikamp/linux-shaggy:
  jfs: Update jfs_error
  jfs: fix sparse warning in fs/jfs/xattr.c
2013-07-11 10:19:34 -07:00
Linus Torvalds 0ff08ba5d0 Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from Bruce Fields:
 "Changes this time include:

   - 4.1 enabled on the server by default: the last 4.1-specific issues
     I know of are fixed, so we're not going to find the rest of the
     bugs without more exposure.
   - Experimental support for NFSv4.2 MAC Labeling (to allow running
     selinux over NFS), from Dave Quigley.
   - Fixes for some delicate cache/upcall races that could cause rare
     server hangs; thanks to Neil Brown and Bodo Stroesser for extreme
     debugging persistence.
   - Fixes for some bugs found at the recent NFS bakeathon, mostly v4
     and v4.1-specific, but also a generic bug handling fragmented rpc
     calls"

* 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits)
  nfsd4: support minorversion 1 by default
  nfsd4: allow destroy_session over destroyed session
  svcrpc: fix failures to handle -1 uid's
  sunrpc: Don't schedule an upcall on a replaced cache entry.
  net/sunrpc: xpt_auth_cache should be ignored when expired.
  sunrpc/cache: ensure items removed from cache do not have pending upcalls.
  sunrpc/cache: use cache_fresh_unlocked consistently and correctly.
  sunrpc/cache: remove races with queuing an upcall.
  nfsd4: return delegation immediately if lease fails
  nfsd4: do not throw away 4.1 lock state on last unlock
  nfsd4: delegation-based open reclaims should bypass permissions
  svcrpc: don't error out on small tcp fragment
  svcrpc: fix handling of too-short rpc's
  nfsd4: minor read_buf cleanup
  nfsd4: fix decoding of compounds across page boundaries
  nfsd4: clean up nfs4_open_delegation
  NFSD: Don't give out read delegations on creates
  nfsd4: allow client to send no cb_sec flavors
  nfsd4: fail attempts to request gss on the backchannel
  nfsd4: implement minimal SP4_MACH_CRED
  ...
2013-07-11 10:17:13 -07:00
Chandra Seetharaman 92f8ff73f1 xfs: Add pquota fields where gquota is used.
Add project quota changes to all the places where group quota field
is used:
   * add separate project quota members into various structures
   * split project quota and group quotas so that instead of overriding
     the group quota members incore, the new project quota members are
     used instead
   * get rid of usage of the OQUOTA flag incore, in favor of separate
     group and project quota flags.
   * add a project dquot argument to various functions.

Not using the pquotino field from superblock yet.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-11 10:35:32 -05:00
Michel Lespinasse 98d1e64f95 mm: remove free_area_cache
Since all architectures have been converted to use vm_unmapped_area(),
there is no remaining use for the free_area_cache.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-07-10 18:11:34 -07:00
Eliezer Tamir 076bb0c82a net: rename include/net/ll_poll.h to include/net/busy_poll.h
Rename the file and correct all the places where it is included.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-10 17:08:27 -07:00
Steve French 1c46943f84 [CIFS] Fix minor endian error in durable handle patch series
Fix endian warning:

  CHECK   fs/cifs/smb2pdu.c
fs/cifs/smb2pdu.c:1068:40: warning: incorrect type in assignment (different base types)
fs/cifs/smb2pdu.c:1068:40:    expected restricted __le32 [usertype] Next
fs/cifs/smb2pdu.c:1068:40:    got unsigned long
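
The warning above is what sparse reports when a CPU-native integer is stored into a field annotated as little-endian (__le32). In kernel code the usual remedy is an explicit conversion such as cpu_to_le32(). The patch body is not shown in this log, so the following is only an illustrative sketch of what fixing the byte order of a 32-bit field means; the helper names are hypothetical and the code is Python rather than kernel C.

```python
import struct

def encode_le32(value: int) -> bytes:
    """Serialize an unsigned 32-bit integer in little-endian byte order.

    Hypothetical stand-in for what cpu_to_le32() guarantees in the kernel:
    the stored bytes are little-endian regardless of the host CPU.
    """
    return struct.pack("<I", value)

def decode_le32(raw: bytes) -> int:
    """Read a little-endian unsigned 32-bit integer back into a native int."""
    return struct.unpack("<I", raw)[0]

if __name__ == "__main__":
    wire = encode_le32(0x12345678)
    print(wire.hex())              # least significant byte comes first
    print(hex(decode_le32(wire)))
```

The point of the annotation is that the conversion has to be visible in the source; a plain assignment happens to work on little-endian hosts and silently breaks on big-endian ones.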

Signed-off-by: Steve French <smfrench@gmail.com>
2013-07-10 13:08:55 -05:00
Pavel Shilovsky 9cbc0b7339 CIFS: Reconnect durable handles for SMB2
On reconnects, we need to reopen the file and then obtain all byte-range
locks held by the client. The SMB2 protocol provides a feature to make
this process atomic by reconnecting to the same file handle
with all its byte-range locks. This patch adds this capability
for SMB2 shares.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10 13:08:40 -05:00
Pavel Shilovsky 064f6047a1 CIFS: Make SMB2_open use cifs_open_parms struct
to prepare it for further durable handle reconnect processing.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10 13:08:40 -05:00
Pavel Shilovsky 226730b4d8 CIFS: Introduce cifs_open_parms struct
and pass it to the open() call.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10 13:08:40 -05:00
Pavel Shilovsky 63eb3def32 CIFS: Request durable open for SMB2 opens
by passing durable context together with a handle caching lease or
batch oplock.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10 13:08:39 -05:00
Pavel Shilovsky d22cbfecbd CIFS: Simplify SMB2 create context handling
to make it easier to add other create contexts later.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10 13:08:39 -05:00
Pavel Shilovsky 59aa371841 CIFS: Simplify SMB2_open code path
by passing a filename to a separate iovec regardless of its length.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10 13:08:39 -05:00
Pavel Shilovsky ca81983fe5 CIFS: Respect create_options in smb2_open_file
and eliminate the unused file_attribute parms of SMB2_open.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10 13:08:39 -05:00
Pavel Shilovsky fd55439638 CIFS: Fix lease context buffer parsing
to prevent missing RqLs context if it's not the first one.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steven French <steven@steven-GA-970A-DS3.(none)>
2013-07-10 13:08:39 -05:00
Carlos Maiolino 42c49d7f24 xfs: fix sgid inheritance for subdirectories inheriting default acls [V3]
XFS removes the sgid bit from subdirectories under a directory containing a
default acl.

When a default acl is set, XFS calls xfs_setattr_nonsize() in its
code path. This function is shared by the mkdir and chmod system calls, and
performs some checks that mkdir does not need (calling inode_change_ok()). These
checks remove the sgid bit from the inode after it has been granted.

With this patch, we extend the meaning of XFS_ATTR_NOACL flag to avoid these
checks when acls are being inherited (thanks hch).

Also, xfs_setattr_mode() doesn't need to re-check group id and capability
permissions; doing so only results in another attempt to remove the sgid bit
from the directories. That check is already done in either inode_change_ok() or
xfs_setattr_nonsize().

Changelog:

V2: Extends the meaning of XFS_ATTR_NOACL instead of wrapping the tests in another
    function

V3: Remove S_ISDIR check in xfs_setattr_nonsize() from the patch

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-10 10:21:51 -05:00
Matthew Wilcox cc18ec3c8f Use ecryptfs_dentry_to_lower_path in a couple of places
There are two places in ecryptfs that benefit from using
ecryptfs_dentry_to_lower_path() instead of separate calls to
ecryptfs_dentry_to_lower() and ecryptfs_dentry_to_lower_mnt().  Both
sites use fewer instructions and less stack (determined by examining
objdump output).

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2013-07-09 23:40:28 -07:00
Linus Torvalds 496322bc91 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "This is a re-do of the net-next pull request for the current merge
  window.  The only difference from the one I made the other day is that
  this has Eliezer's interface renames and the timeout handling changes
  made based upon your feedback, as well as a few bug fixes that have
  trickled in.

  Highlights:

   1) Low latency device polling, eliminating the cost of interrupt
      handling and context switches.  Allows direct polling of a network
      device from socket operations, such as recvmsg() and poll().

      Currently ixgbe, mlx4, and bnx2x support this feature.

      Full high level description, performance numbers, and design in
      commit 0a4db187a9 ("Merge branch 'll_poll'")

      From Eliezer Tamir.

   2) With the routing cache removed, ip_check_mc_rcu() gets exercised
      more than ever before in the case where we have lots of multicast
      addresses.  Use a hash table instead of a simple linked list, from
      Eric Dumazet.

   3) Add driver for Atheros QCA98xx 802.11ac wireless devices, from
      Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
      Marek Puzyniak, Michal Kazior, and Sujith Manoharan.

   4) Support reporting the TUN device persist flag to userspace, from
      Pavel Emelyanov.

   5) Allow controlling network device VF link state using netlink, from
      Rony Efraim.

   6) Support GRE tunneling in openvswitch, from Pravin B Shelar.

   7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
      Daniel Borkmann and Eric Dumazet.

   8) Allow controlling of TCP quickack behavior on a per-route basis,
      from Cong Wang.

   9) Several bug fixes and improvements to vxlan from Stephen
      Hemminger, Pravin B Shelar, and Mike Rapoport.  In particular,
      support receiving on multiple UDP ports.

  10) Major cleanups, particularly in the area of debugging and cookie
      lifetime handling, to the SCTP protocol code.  From Daniel
      Borkmann.

  11) Allow packets to cross network namespaces when traversing tunnel
      devices.  From Nicolas Dichtel.

  12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
      manner akin to how we monitor real network traffic via ptype_all.
      From Daniel Borkmann.

  13) Several bug fixes and improvements for the new alx device driver,
      from Johannes Berg.

  14) Fix scalability issues in the netem packet scheduler's time queue,
      by using an rbtree.  From Eric Dumazet.

  15) Several bug fixes in TCP loss recovery handling, from Yuchung
      Cheng.

  16) Add support for GSO segmentation of MPLS packets, from Simon
      Horman.

  17) Make network notifiers have a real data type for the opaque
      pointer that's passed into them.  Use this to properly handle
      network device flag changes in arp_netdev_event().  From Jiri
      Pirko and Timo Teräs.

  18) Convert several drivers over to module_pci_driver(), from Peter
      Huewe.

  19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use an
      O(1) calculation instead.  From Eric Dumazet.

  20) Support setting of explicit tunnel peer addresses in ipv6, just
      like ipv4.  From Nicolas Dichtel.

  21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.

  22) Prevent a single high rate flow from overrunning an individual cpu
      during RX packet processing via selective flow shedding.  From
      Willem de Bruijn.

  23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
      Dumazet.

  24) Don't just drop GSO packets which are above the TBF scheduler's
      burst limit, chop them up so they are in-bounds instead.  Also
      from Eric Dumazet.

  25) VLAN offloads are missed when configured on top of a bridge, fix
      from Vlad Yasevich.

  26) Support IPV6 in ping sockets.  From Lorenzo Colitti.

  27) Receive flow steering targets should be updated at poll() time
      too, from David Majnemer.

  28) Fix several corner case regressions in PMTU/redirect handling due
      to the routing cache removal, from Timo Teräs.

  29) We have to be mindful of ipv4 mapped ipv6 sockets in
      udp_v6_push_pending_frames().  From Hannes Frederic Sowa.

  30) Fix L2TP sequence number handling bugs, from James Chapman."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
  drivers/net: caif: fix wrong rtnl_is_locked() usage
  drivers/net: enic: release rtnl_lock on error-path
  vhost-net: fix use-after-free in vhost_net_flush
  net: mv643xx_eth: do not use port number as platform device id
  net: sctp: confirm route during forward progress
  virtio_net: fix race in RX VQ processing
  virtio: support unlocked queue poll
  net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
  Documentation: Fix references to defunct linux-net@vger.kernel.org
  net/fs: change busy poll time accounting
  net: rename low latency sockets functions to busy poll
  bridge: fix some kernel warning in multicast timer
  sfc: Fix memory leak when discarding scattered packets
  sit: fix tunnel update via netlink
  dt:net:stmmac: Add dt specific phy reset callback support.
  dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
  dt:net:stmmac: Allocate platform data only if its NULL.
  net:stmmac: fix memleak in the open method
  ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
  net: ipv6: fix wrong ping_v6_sendmsg return value
  ...
2013-07-09 18:24:39 -07:00
Scott Mayhew c7559663e4 NFS: Allow nfs_updatepage to extend a write under additional circumstances
Currently nfs_updatepage allows a write to be extended to cover a full
page only if we don't have a byte range lock on the file... but if
we have a write delegation on the file or if we have the whole file
locked for writing then we should be allowed to extend the write as
well.
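
The rule described above can be written as a small predicate. This is a minimal sketch of the decision only, assuming three boolean inputs with hypothetical names; it is not the kernel's code, and it ignores any other preconditions the real nfs_updatepage path checks.

```python
def may_extend_write(has_byte_range_locks: bool,
                     has_write_delegation: bool,
                     whole_file_write_locked: bool) -> bool:
    """Decide whether a partial-page write may be extended to the full page.

    Hypothetical model of the rule in the commit message:
    - a write delegation or a write lock on the whole file always allows it;
    - otherwise it is allowed only when no byte range locks are held.
    """
    if has_write_delegation or whole_file_write_locked:
        return True
    return not has_byte_range_locks

if __name__ == "__main__":
    # Old behaviour: any byte range lock blocked the extension.
    print(may_extend_write(True, False, False))
    # New behaviour: a write delegation lifts that restriction.
    print(may_extend_write(True, True, False))
```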

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
[Trond: fix up call to nfs_have_delegation()]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-09 19:32:50 -04:00
Dave Chinner b0a9dab78a xfs: dquot log reservations are too small
During review of the separate project quota inode patches, it became
obvious that the dquot log reservation calculation underestimated
the number dquots that can be modified in a transaction. This has
it's roots way back in the Irix quota implementation.

That is, when quotas were first implemented in XFS, it only
supported user and project quotas as Irix did not have group quotas.
Hence the worst case operation involving dquot modification was
calculated to involve 2 user dquots and 1 project dquot or 1 user
dquot and 2 project dquots. i.e. 3 dquots. This was determined back
in 1996, and has remained unchanged ever since.

However, back in 2001, the Linux XFS port dropped all support for
project quota and implemented group quotas over the top. This was
effectively done with a search-and-replace of project with group,
and as such the log reservation was not changed. However, with the
advent of group quotas, chmod and rename now could modify more than
3 dquots in a single transaction - both could modify 4 dquots. Hence
this log reservation has been wrong for a long time.

In 2005, project quota support was reintroduced into Linux, but it
was implemented to be mutually exclusive to group quotas and so this
didn't add any new changes to the dquot log reservation. Hence when
project quotas were in use (rather than group quotas) the log
reservation was again valid, just like in the Irix days.

Now, with the addition of the separate project quota inode, group
and project quotas are no longer mutually exclusive, and hence
operations can now modify three dquots per inode where previously it
was only two. The worst case here is the rename transaction, which
can allocate/free space on two different directory inodes, and if
they have different uid/gid/prid configurations and are world
writeable, then rename can actually modify 6 different dquots now.

Further, the dquot log reservation doesn't take into account the
space used by the dquot log format structure that precedes the dquot
that is logged, and hence further underestimates the worst case
log space required by dquots during a transaction. This has been
missing since the first commit in 1996.

Hence the worst case log reservation needs to be increased from 3 to
6, and it needs to take into account a log format header for each of
those dquots.
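
The arithmetic behind that conclusion can be sketched as follows. The function and parameter names are hypothetical, and the byte sizes in the usage example are placeholders rather than the real on-disk structure sizes; only the counts (3 quota types per inode, 2 directory inodes in a rename, one log format header per dquot) come from the description above.

```python
QUOTA_TYPES = ("user", "group", "project")

def worst_case_dquots(inodes_touched: int) -> int:
    """Each inode can now carry one dquot per quota type."""
    return inodes_touched * len(QUOTA_TYPES)

def dquot_log_reservation(ndquots: int, dquot_bytes: int, header_bytes: int) -> int:
    """Log space for ndquots dquots, each preceded by a log format header.

    dquot_bytes and header_bytes are caller-supplied assumptions here.
    """
    return ndquots * (dquot_bytes + header_bytes)

if __name__ == "__main__":
    # Rename can allocate/free space on two different directory inodes.
    n = worst_case_dquots(inodes_touched=2)
    print(n)
    # Placeholder sizes, for illustration only.
    print(dquot_log_reservation(n, dquot_bytes=100, header_bytes=20))
```

Under these assumptions the old reservation was short in two ways at once: it counted 3 dquots instead of 6, and it omitted the per-dquot header term entirely.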

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-09 16:43:16 -05:00
Dave Chinner f3508bcddf xfs: remove local fork format handling from xfs_bmapi_write()
The conversion from local format to extent format requires
interpretation of the data in the fork being converted, so it cannot
be done in a generic way. It is up to the caller to convert the fork
format to extent format before calling into xfs_bmapi_write() so
format conversion can be done correctly.

The code in xfs_bmapi_write() to convert the format is used
implicitly by the attribute and directory code, but they
specifically zero the fork size so that the conversion does not do
any allocation or manipulation. Move this conversion into the
shortform to leaf functions for the dir/attr code so the conversions
are explicitly controlled by all callers.

Now we can remove the conversion code in xfs_bmapi_write.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-09 16:40:22 -05:00
Scott Mayhew 07b5ce8ef2 NFS: Make nfs_readdir revalidate less often
Make nfs_readdir revalidate only when we're at the beginning of the directory or
if the cached attributes have expired.
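
As a rough illustration of the condition, assuming the directory position and the cache state are available as plain values (the names below are hypothetical, not kernel identifiers):

```python
def should_revalidate(dir_position: int, attr_cache_expired: bool) -> bool:
    """Revalidate only at the start of the directory or on an expired cache.

    Hypothetical model of the rule in the commit message: dir_position == 0
    stands for "at the beginning of the directory".
    """
    return dir_position == 0 or attr_cache_expired

if __name__ == "__main__":
    print(should_revalidate(0, False))    # first readdir call
    print(should_revalidate(128, False))  # mid-directory, cache still fresh
    print(should_revalidate(128, True))   # mid-directory, cache expired
```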

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-09 17:17:07 -04:00
Scott Mayhew 43f291cd07 NFS: Make nfs_attribute_cache_expired() non-static
NFS: Make nfs_attribute_cache_expired() non-static so we can call it from
nfs_readdir().

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-09 17:17:07 -04:00
Jeff Layton cda57a1ef6 nfs: set verifier on existing dentries in nfs_prime_dcache
nfs_prime_dcache currently only sets the verifier when it doesn't
initially find a matching dentry in the dcache. Set the verifier in the case
where we do find a dentry in the dcache. This ensures that we don't
have to look up the dentry again if we want to use it after a readdir.

Cc: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-09 17:16:39 -04:00