Commit Graph

36728 Commits

Author SHA1 Message Date
Qu Wenruo 65d33fd7a6 btrfs: Add check to avoid cleanup roots already in fs_info->dead_roots.
Current btrfs_orphan_cleanup will also cleanup roots which is already in
fs_info->dead_roots without protection.
This will have conditional race with fs_info->cleaner_kthread.

This patch will use refs in root->root_item to detect roots in
dead_roots and avoid conflicts.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:35 -07:00
Miao Xie 21c7e75654 Btrfs: reclaim the reserved metadata space at background
Before applying this patch, the task had to reclaim the metadata space
by itself if the metadata space was not enough. And When the task started
the space reclamation, all the other tasks which wanted to reserve the
metadata space were blocked. At some cases, they would be blocked for
a long time, it made the performance fluctuate wildly.

So we introduce the background metadata space reclamation, when the space
is about to be exhausted, we insert a reclaim work into the workqueue, the
worker of the workqueue helps us to reclaim the reserved space at the
background. By this way, the tasks needn't reclaim the space by themselves at
most cases, and even if the tasks have to reclaim the space or are blocked
for the space reclamation, they will get enough space more quickly.

Here is my test result(Tested by compilebench):
 Memory:	2GB
 CPU:		2Cores * 1CPU
 Partition:	40GB(SSD)

Test command:
 # compilebench -D <mnt> -m

Without this patch:
 intial create total runs 30 avg 54.36 MB/s (user 0.52s sys 2.44s)
 compile total runs 30 avg 123.72 MB/s (user 0.13s sys 1.17s)
 read compiled tree total runs 3 avg 81.15 MB/s (user 0.74s sys 4.89s)
 delete compiled tree total runs 30 avg 5.32 seconds (user 0.35s sys 4.37s)

With this patch:
 intial create total runs 30 avg 59.80 MB/s (user 0.52s sys 2.53s)
 compile total runs 30 avg 151.44 MB/s (user 0.13s sys 1.11s)
 read compiled tree total runs 3 avg 83.25 MB/s (user 0.76s sys 4.91s)
 delete compiled tree total runs 30 avg 5.29 seconds (user 0.34s sys 4.34s)

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:34 -07:00
Miao Xie 32d6b47fe6 Btrfs: output warning instead of error when loading free space cache failed
If we fail to load a free space cache, we can rebuild it from the extent tree,
so it is not a serious error, we should not output a error message that
would make the users uncomfortable. This patch uses warning message instead
of it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:33 -07:00
Qu Wenruo 5a1972bd9f btrfs: Add ctime/mtime update for btrfs device add/remove.
Btrfs will send uevent to udev inform the device change,
but ctime/mtime for the block device inode is not udpated, which cause
libblkid used by btrfs-progs unable to detect device change and use old
cache, causing 'btrfs dev scan; btrfs dev rmove; btrfs dev scan' give an
error message.

Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Cc: Karel Zak <kzak@redhat.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:33 -07:00
David Sterba 61155aa04e btrfs: assert that send is not in progres before root deletion
CC: Miao Xie <miaox@cn.fujitsu.com>
CC: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:32 -07:00
David Sterba 521e0546c9 btrfs: protect snapshots from deleting during send
The patch "Btrfs: fix protection between send and root deletion"
(18f687d538) does not actually prevent to delete the snapshot
and just takes care during background cleaning, but this seems rather
user unfriendly, this patch implements the idea presented in

http://www.spinics.net/lists/linux-btrfs/msg30813.html

- add an internal root_item flag to denote a dead root
- check if the send_in_progress is set and refuse to delete, otherwise
  set the flag and proceed
- check the flag in send similar to the btrfs_root_readonly checks, for
  all involved roots

The root lookup in send via btrfs_read_fs_root_no_name will check if the
root is really dead or not. If it is, ENOENT, aborted send. If it's
alive, it's protected by send_in_progress, send can continue.

CC: Miao Xie <miaox@cn.fujitsu.com>
CC: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:31 -07:00
Daeseok Youn 944a4515b2 btrfs: remove redundant null check in btrfs_dentry_release()
It doesn't need to check NULL for kfree()

Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:31 -07:00
Filipe Manana ef3b9af50b Btrfs: implement inode_operations callback tmpfile
This implements the tmpfile callback of struct inode_operations, introduced
in the linux kernel 3.11, and implemented already by some filesystems. This
callback is invoked by the VFS when the flag O_TMPFILE is passed to the open
system call.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
2014-06-09 17:20:30 -07:00
David Sterba e4ef90ff61 btrfs: make FS_INFO ioctl available to anyone
This ioctl provides basic info about the filesystem that can be obtained
in other ways (eg. sysfs), there's no reason to restrict it to
CAP_SYSADMIN.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:29 -07:00
David Sterba 7d6213c5a7 btrfs: make DEV_INFO ioctl available to anyone
This ioctl provides basic info about the devices that can be obtained in
other ways (eg. sysfs), there's no reason to restrict it to
CAP_SYSADMIN.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:28 -07:00
David Sterba df93589a17 btrfs: export more from FS_INFO to sysfs
Similar to the FS_INFO updates, export the basic filesystem info through
sysfs: node size, sector size and clone alignment.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:28 -07:00
David Sterba 80a773fbfc btrfs: retrieve more info from FS_INFO ioctl
Provide the basic information about filesystem through the ioctl:
* b-tree node size (same as leaf size)
* sector size
* expected alignment of CLONE_RANGE and EXTENT_SAME ioctl arguments

Backward compatibility: if the values are 0, kernel does not provide
this information, the applications should ignore them.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:27 -07:00
David Sterba 7d824b6f9c btrfs: balance filter: add limit of processed chunks
This started as debugging helper, to watch the effects of converting
between raid levels on multiple devices, but could be useful standalone.

In my case the usage filter was not finegrained enough and led to
converting too many chunks at once. Another example use is in connection
with drange+devid or vrange filters that allow to work with a specific
chunk or even with a chunk on a given device.

The limit filter applies last, the value of 0 means no limiting.

CC: Ilya Dryomov <idryomov@gmail.com>
CC: Hugo Mills <hugo@carfax.org.uk>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:26 -07:00
Filipe Manana fc19c5e736 Btrfs: fix leaf corruption caused by ENOSPC while hole punching
While running a stress test with multiple threads writing to the same btrfs
file system, I ended up with a situation where a leaf was corrupted in that
it had 2 file extent item keys that had the same exact key. I was able to
detect this quickly thanks to the following patch which triggers an assertion
as soon as a leaf is marked dirty if there are duplicated keys or out of order
keys:

    Btrfs: check if items are ordered when a leaf is marked dirty
    (https://patchwork.kernel.org/patch/3955431/)

Basically while running the test, I got the following in dmesg:

    [28877.415877] WARNING: CPU: 2 PID: 10706 at fs/btrfs/file.c:553 btrfs_drop_extent_cache+0x435/0x440 [btrfs]()
    (...)
    [28877.415917] Call Trace:
    [28877.415922]  [<ffffffff816f1189>] dump_stack+0x4e/0x68
    [28877.415926]  [<ffffffff8104a32c>] warn_slowpath_common+0x8c/0xc0
    [28877.415929]  [<ffffffff8104a37a>] warn_slowpath_null+0x1a/0x20
    [28877.415944]  [<ffffffffa03775a5>] btrfs_drop_extent_cache+0x435/0x440 [btrfs]
    [28877.415949]  [<ffffffff8118e7be>] ? kmem_cache_alloc+0xfe/0x1c0
    [28877.415962]  [<ffffffffa03777d9>] fill_holes+0x229/0x3e0 [btrfs]
    [28877.415972]  [<ffffffffa0345865>] ? block_rsv_add_bytes+0x55/0x80 [btrfs]
    [28877.415984]  [<ffffffffa03792cb>] btrfs_fallocate+0xb6b/0xc20 [btrfs]
    (...)
    [29854.132560] BTRFS critical (device sdc): corrupt leaf, bad key order: block=955232256,root=1, slot=24
    [29854.132565] BTRFS info (device sdc): leaf 955232256 total ptrs 40 free space 778
    (...)
    [29854.132637] 	item 23 key (3486 108 667648) itemoff 2694 itemsize 53
    [29854.132638] 		extent data disk bytenr 14574411776 nr 286720
    [29854.132639] 		extent data offset 0 nr 286720 ram 286720
    [29854.132640] 	item 24 key (3486 108 954368) itemoff 2641 itemsize 53
    [29854.132641] 		extent data disk bytenr 0 nr 0
    [29854.132643] 		extent data offset 0 nr 0 ram 0
    [29854.132644] 	item 25 key (3486 108 954368) itemoff 2588 itemsize 53
    [29854.132645] 		extent data disk bytenr 8699670528 nr 77824
    [29854.132646] 		extent data offset 0 nr 77824 ram 77824
    [29854.132647] 	item 26 key (3486 108 1146880) itemoff 2535 itemsize 53
    [29854.132648] 		extent data disk bytenr 8699670528 nr 77824
    [29854.132649] 		extent data offset 0 nr 77824 ram 77824
    (...)
    [29854.132707] kernel BUG at fs/btrfs/ctree.h:3901!
    (...)
    [29854.132771] Call Trace:
    [29854.132779]  [<ffffffffa0342b5c>] setup_items_for_insert+0x2dc/0x400 [btrfs]
    [29854.132791]  [<ffffffffa0378537>] __btrfs_drop_extents+0xba7/0xdd0 [btrfs]
    [29854.132794]  [<ffffffff8109c0d6>] ? trace_hardirqs_on_caller+0x16/0x1d0
    [29854.132797]  [<ffffffff8109c29d>] ? trace_hardirqs_on+0xd/0x10
    [29854.132800]  [<ffffffff8118e7be>] ? kmem_cache_alloc+0xfe/0x1c0
    [29854.132810]  [<ffffffffa036783b>] insert_reserved_file_extent.constprop.66+0xab/0x310 [btrfs]
    [29854.132820]  [<ffffffffa036a6c6>] __btrfs_prealloc_file_range+0x116/0x340 [btrfs]
    [29854.132830]  [<ffffffffa0374d53>] btrfs_prealloc_file_range+0x23/0x30 [btrfs]
    (...)

So this is caused by getting an -ENOSPC error while punching a file hole, more
specifically, we get -ENOSPC error from __btrfs_drop_extents in the while loop
of file.c:btrfs_punch_hole() when it's unable to modify the btree to delete one
or more file extent items due to lack of enough free space. When this happens,
in btrfs_punch_hole(), we attempt to reclaim free space by switching our transaction
block reservation object to root->fs_info->trans_block_rsv, end our transaction and
start a new transaction basically - and, we keep increasing our current offset
(cur_offset) as long as it's smaller than the end of the target range (lockend) -
this makes use leave the loop with cur_offset == drop_end which in turn makes us
call fill_holes() for inserting a file extent item that represents a 0 bytes range
hole (and this insertion succeeds, as in the meanwhile more space became available).

This 0 bytes file hole extent item is a problem because any subsequent caller of
__btrfs_drop_extents (regular file writes, or fallocate calls for e.g.), with a
start file offset that is equal to the offset of the hole, will not remove this
extent item due to the following conditional in the while loop of
__btrfs_drop_extents:

    if (extent_end <= search_start) {
            path->slots[0]++;
            goto next_slot;
    }

This later makes the call to setup_items_for_insert() (at the very end of
__btrfs_drop_extents), insert a new file extent item with the same offset as
the 0 bytes file hole extent item that follows it. Needless is to say that this
causes chaos, either when reading the leaf from disk (btree_readpage_end_io_hook),
where we perform leaf sanity checks or in subsequent operations that manipulate
file extent items, as in the fallocate call as shown by the dmesg trace above.

Without my other patch to perform the leaf sanity checks once a leaf is marked
as dirty (if the integrity checker is enabled), it would have been much harder
to debug this issue.

This change might fix a few similar issues reported by users in the mailing
list regarding assertion failures in btrfs_set_item_key_safe calls performed
by __btrfs_drop_extents, such as the following report:

    http://comments.gmane.org/gmane.comp.file-systems.btrfs/32938

Asking fill_holes() to create a 0 bytes wide file hole item also produced the
first warning in the trace above, as we passed a range to btrfs_drop_extent_cache
that has an end smaller (by -1) than its start.

On 3.14 kernels this issue manifests itself through leaf corruption, as we get
duplicated file extent item keys in a leaf when calling setup_items_for_insert(),
but on older kernels, setup_items_for_insert() isn't called by __btrfs_drop_extents(),
instead we have callers of __btrfs_drop_extents(), namely the functions
inode.c:insert_inline_extent() and inode.c:insert_reserved_file_extent(), calling
btrfs_insert_empty_item() to insert the new file extent item, which would fail with
error -EEXIST, instead of inserting a duplicated key - which is still a serious
issue as it would make all similar file extent item replace operations keep
failing if they target the same file range.

Cc: stable@vger.kernel.org
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:26 -07:00
Liu Bo d2cbf2a260 Btrfs: do not increment on bio_index one by one
'bio_index' is just a index, it's really not necessary to do increment
one by one.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:25 -07:00
Filipe Manana a1a50f60a6 Btrfs: read inode size after acquiring the mutex when punching a hole
In a previous change, commit 12870f1c9b,
I accidentally moved the roundup of inode->i_size to outside of the
critical section delimited by the inode mutex, which is not atomic and
not correct since the size can be changed by other task before we acquire
the mutex. Therefore fix it.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:24 -07:00
Tobias Klauser 7fb18a0664 btrfs: Remove unnecessary check for NULL
iput() already checks for the inode being NULL, thus it's unnecessary to
check before calling.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:23 -07:00
Zach Brown 166ae5a418 btrfs: fix inline compressed read err corruption
uncompress_inline() is dropping the error from btrfs_decompress() after
testing it and zeroing the page that was supposed to hold decompressed
data.  This can silently turn compressed inline data in to zeros if
decompression fails due to corrupt compressed data or memory allocation
failure.

I verified this by manually forcing the error from btrfs_decompress()
for a silly named copy of od:

	if (!strcmp(current->comm, "failod"))
		ret = -ENOMEM;

  # od -x /mnt/btrfs/dir/80 | head -1
  0000000 3031 3038 310a 2d30 6f70 6e69 0a74 3031
  # echo 3 > /proc/sys/vm/drop_caches
  # cp $(which od) /tmp/failod
  # /tmp/failod -x /mnt/btrfs/dir/80 | head -1
  0000000 0000 0000 0000 0000 0000 0000 0000 0000

The fix is to pass the error to its caller.  Which still has a BUG_ON().
So we fix that too.

There seems to be no reason for the zeroing of the page on the error
from btrfs_decompress() but not from the allocation error a few lines
above.  So the page zeroing is removed.

Signed-off-by: Zach Brown <zab@redhat.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:23 -07:00
Zach Brown 774bcb35f0 btrfs: return ptr error from compression workspace
The btrfs compression wrappers translated errors from workspace
allocation to either -ENOMEM or -1.  The compression type workspace
allocators are already returning a ERR_PTR(-ENOMEM).  Just return that
and get rid of the magical -1.

This helps a future patch return errors from the compression wrappers.

Signed-off-by: Zach Brown <zab@redhat.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:22 -07:00
Zach Brown 60e1975acb btrfs: return errno instead of -1 from compression
The compression layer seems to have been built to return -1 and have
callers make up errors that make sense.  This isn't great because there
are different errors that originate down in the compression layer.

Let's return real negative errnos from the compression layer so that
callers can pass on the error without having to guess what happened.
ENOMEM for allocation failure, E2BIG when compression exceeds the
uncompressed input, and EIO for everything else.

This helps a future path return errors from btrfs_decompress().

Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:21 -07:00
Stefan Behrens 98806b446d btrfs: check_int: propagate out-of-memory error upwards
This issue was not causing any harm but IMO (and in the opinion of the
static code checker) it is better to propagate this error status upwards.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:21 -07:00
Filipe Manana 61391d5622 Btrfs: fix hang on error (such as ENOSPC) when writing extent pages
When running low on available disk space and having several processes
doing buffered file IO, I got the following trace in dmesg:

[ 4202.720152] INFO: task kworker/u8:1:5450 blocked for more than 120 seconds.
[ 4202.720401]       Not tainted 3.13.0-fdm-btrfs-next-26+ #1
[ 4202.720596] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4202.720874] kworker/u8:1    D 0000000000000001     0  5450      2 0x00000000
[ 4202.720904] Workqueue: btrfs-flush_delalloc normal_work_helper [btrfs]
[ 4202.720908]  ffff8801f62ddc38 0000000000000082 ffff880203ac2490 00000000001d3f40
[ 4202.720913]  ffff8801f62ddfd8 00000000001d3f40 ffff8800c4f0c920 ffff880203ac2490
[ 4202.720918]  00000000001d4a40 ffff88020fe85a40 ffff88020fe85ab8 0000000000000001
[ 4202.720922] Call Trace:
[ 4202.720931]  [<ffffffff816a3cb9>] schedule+0x29/0x70
[ 4202.720950]  [<ffffffffa01ec48d>] btrfs_start_ordered_extent+0x6d/0x110 [btrfs]
[ 4202.720956]  [<ffffffff8108e620>] ? bit_waitqueue+0xc0/0xc0
[ 4202.720972]  [<ffffffffa01ec559>] btrfs_run_ordered_extent_work+0x29/0x40 [btrfs]
[ 4202.720988]  [<ffffffffa0201987>] normal_work_helper+0x137/0x2c0 [btrfs]
[ 4202.720994]  [<ffffffff810680e5>] process_one_work+0x1f5/0x530
(...)
[ 4202.721027] 2 locks held by kworker/u8:1/5450:
[ 4202.721028]  #0:  (%s-%s){++++..}, at: [<ffffffff81068083>] process_one_work+0x193/0x530
[ 4202.721037]  #1:  ((&work->normal_work)){+.+...}, at: [<ffffffff81068083>] process_one_work+0x193/0x530
[ 4202.721054] INFO: task btrfs:7891 blocked for more than 120 seconds.
[ 4202.721258]       Not tainted 3.13.0-fdm-btrfs-next-26+ #1
[ 4202.721444] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4202.721699] btrfs           D 0000000000000001     0  7891   7890 0x00000001
[ 4202.721704]  ffff88018c2119e8 0000000000000086 ffff8800a33d2490 00000000001d3f40
[ 4202.721710]  ffff88018c211fd8 00000000001d3f40 ffff8802144b0000 ffff8800a33d2490
[ 4202.721714]  ffff8800d8576640 ffff88020fe85bc0 ffff88020fe85bc8 7fffffffffffffff
[ 4202.721718] Call Trace:
[ 4202.721723]  [<ffffffff816a3cb9>] schedule+0x29/0x70
[ 4202.721727]  [<ffffffff816a2ebc>] schedule_timeout+0x1dc/0x270
[ 4202.721732]  [<ffffffff8109bd79>] ? mark_held_locks+0xb9/0x140
[ 4202.721736]  [<ffffffff816a90c0>] ? _raw_spin_unlock_irq+0x30/0x40
[ 4202.721740]  [<ffffffff8109bf0d>] ? trace_hardirqs_on_caller+0x10d/0x1d0
[ 4202.721744]  [<ffffffff816a488f>] wait_for_completion+0xdf/0x120
[ 4202.721749]  [<ffffffff8107fa90>] ? try_to_wake_up+0x310/0x310
[ 4202.721765]  [<ffffffffa01ebee4>] btrfs_wait_ordered_extents+0x1f4/0x280 [btrfs]
[ 4202.721781]  [<ffffffffa020526e>] btrfs_mksubvol.isra.62+0x30e/0x5a0 [btrfs]
[ 4202.721786]  [<ffffffff8108e620>] ? bit_waitqueue+0xc0/0xc0
[ 4202.721799]  [<ffffffffa02056a9>] btrfs_ioctl_snap_create_transid+0x1a9/0x1b0 [btrfs]
[ 4202.721813]  [<ffffffffa020583a>] btrfs_ioctl_snap_create_v2+0x10a/0x170 [btrfs]
(...)

It turns out that extent_io.c:__extent_writepage(), which ends up being called
through filemap_fdatawrite_range() in btrfs_start_ordered_extent(), was getting
-ENOSPC when calling the fill_delalloc callback. In this situation, it returned
without the writepage_end_io_hook callback (inode.c:btrfs_writepage_end_io_hook)
ever being called for the respective page, which prevents the ordered extent's
bytes_left count from ever reaching 0, and therefore a finish_ordered_fn work
is never queued into the endio_write_workers queue. This makes the task that
called btrfs_start_ordered_extent() hang forever on the wait queue of the ordered
extent.

This is fairly easy to reproduce using a small filesystem and fsstress on
a quad core vm:

    mkfs.btrfs -f -b `expr 2100 \* 1024 \* 1024` /dev/sdd
    mount /dev/sdd /mnt

    fsstress -p 6 -d /mnt -n 100000 -x \
        "btrfs subvolume snapshot -r /mnt /mnt/mysnap" \
	    -f allocsp=0 \
	    -f bulkstat=0 \
	    -f bulkstat1=0 \
	    -f chown=0 \
	    -f creat=1 \
	    -f dread=0 \
	    -f dwrite=0 \
	    -f fallocate=1 \
	    -f fdatasync=0 \
	    -f fiemap=0 \
	    -f freesp=0 \
	    -f fsync=0 \
	    -f getattr=0 \
	    -f getdents=0 \
	    -f link=0 \
	    -f mkdir=0 \
	    -f mknod=0 \
	    -f punch=1 \
	    -f read=0 \
	    -f readlink=0 \
	    -f rename=0 \
	    -f resvsp=0 \
	    -f rmdir=0 \
	    -f setxattr=0 \
	    -f stat=0 \
	    -f symlink=0 \
	    -f sync=0 \
	    -f truncate=1 \
	    -f unlink=0 \
	    -f unresvsp=0 \
	    -f write=4

So just ensure that if an error happens while writing the extent page
we call the writepage_end_io_hook callback. Also make it return the
error code and ensure the caller (extent_write_cache_pages) processes
all pages in the page vector even if an error happens only for some
of them, so that ordered extents end up released.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:20 -07:00
Dave Chinner 7691283d05 Merge branch 'xfs-misc-fixes-3-for-3.16' into for-next 2014-06-10 07:32:56 +10:00
Dave Chinner 8612c7e594 Merge branch 'xfs-da-geom' into for-next 2014-06-10 07:32:41 +10:00
Dave Chinner 35f46c5f04 xfs: fix xfs_da_args sparse warning in xfs_readdir
The kbuild test robot reported:

>> fs/xfs/xfs_dir2_readdir.c:672:41: sparse: Using plain integer as NULL pointer

Fix it.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-10 07:30:36 +10:00
J. Bruce Fields 48385408b4 nfsd4: fix FREE_STATEID lockowner leak
27b11428b7 ("nfsd4: remove lockowner when removing lock stateid")
introduced a memory leak.

Cc: stable@vger.kernel.org
Reported-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-09 17:13:54 -04:00
Tom Haynes f383b7e8fd NFSv4.1: Fix typo in dprintk
Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-06-09 09:54:42 -04:00
Tom Haynes bf96bc0b3b NFSv4.1: Comment is now wrong and redundant to code
The save of the write offset was removed some time ago, so that
part of the comment is bogus.

The remainder is pretty self-evident.

So off with it!

Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-06-09 09:54:29 -04:00
Linus Torvalds 963649d735 9p fixes for the 3.16 merge window
Two bug fixes, one in xattr error path and the other in parsing
 major/minor numbers from devices.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: GPGTools - http://gpgtools.org
 
 iQIcBAABAgAGBQJTlM0UAAoJEDZk62b0Tg6xmiMP/i1t8K01JS64ZcyRYAy34oNn
 YR20Vz8zVfJ0Xkn7FdPXUZYIqi4CqawEWMUc1Yrw2qnRCkRMo1xtLT/Jobw7X0XF
 tfyc/bW9pvAfdfIA2npRENJtVWXB90nvzCf5ezCdr5tpRdWz1JJ5AhvTsubzEWrB
 oXZbn0/NHzEwvIt8sI/ZjuMKrvR1fD95a7wWnEg5Ckni+UZdKWJKA350vTXPfmfx
 334YbAy1/zBDnGsebzYMIEGtksybeiFZ2BWOscvafe0fh3eg0vqyE1i3wb7QZhO4
 IYuJA2Lq/L98BoFi3wshL+slfTwwXghFOct8OOV1NLdJh/osUndGeRGNXXZlW8R/
 zm2PE+pUKgKv0RjuSKUFr52bdcXsfKpyRJrSbh5Vh0jIT+FdZSW6sOD5ZTInK36o
 mD+VamD+wyJAmVwqoNlXRRweHyS1LCA7esuCjugK4gpy+lffRBWahnqxWQjK7Pc+
 7rAh6Q8ta79Y2AMQzC7SaiM87zcUOF6Dw/FkIHsS8UOI94dJNDNxdSfa8u+44WdT
 DgkR2ANQ5LryMJIwLwLKyEAt91zWSOUGk9dMbeTkBlJaBWWQDgTIhs6Ba7ptdI9/
 O/iu+S4jXSyFmemm0tSQwCysa4TQeAaiunjHQ5o+SnwbamuppyR4Y+/X40rGm+79
 Q5iBBF48tGfQal2AqLTx
 =EjHz
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-3.16-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

Pull 9p fixes from Eric Van Hensbergen:
 "Two bug fixes, one in xattr error path and the other in parsing
  major/minor numbers from devices"

* tag 'for-linus-3.16-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9P: fix return value in v9fs_fid_xattr_set
  fs/9p: adjust sscanf parameters accordingly to the variable types
2014-06-08 14:35:19 -07:00
Linus Torvalds f8409abdc5 Clean ups and miscellaneous bug fixes, in particular for the new
collapse_range and zero_range fallocate functions.  In addition,
 improve the scalability of adding and remove inodes from the orphan
 list.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTk9x7AAoJENNvdpvBGATwQQ4QAN85xkNWWiq0feLGZjUVTre/
 JUgRQWXZYVogAQckQoTDXqJt1qKYxO45A8oIoUMI4uzgcFJm7iJIZJAv3Hjd2ftz
 48RVwjWHblmBz6e+CdmETpzJUaJr3KXbnk3EDQzagWg3Q64dBU/yT0c4foBO8wfX
 FI1MNin70r5NGQv6Mp4xNUfMoU6liCrsMO2RWkyxY2rcmxy6tkpNO/NBAPwhmn0e
 vwKHvnnqKM08Frrt6Lz3MpXGAJ+rhTSvmL+qSRXQn9BcbphdGa4jy+i3HbviRX4N
 z77UZMgMbfK1V3YHm8KzmmbIHrmIARXUlCM7jp4HPSnb4qhyERrhVmGCJZ8civ6Q
 3Cm9WwA93PQDfRX6Kid3K1tR/ql+ryac55o9SM990osrWp4C0IH+P/CdlSN0GspN
 3pJTLHUVVcxF6gSnOD+q/JzM8Iudl87Rxb17wA+6eg3AJRaPoQSPJoqtwZ89ZwOz
 RiZGuugFp7gDOxqo32lJ53fivO/e1zxXxu0dVHHjOnHBVWX063hlcibTg8kvFWg1
 7bBvUkvgT5jR+UuDX81wPZ+c0kkmfk4gxT5sHg6RlMKeCYi3uuLmAYgla3AM4j9G
 GeNNdVTmilH7wMgYB2wxd0C5HofgKgM5YFLZWc0FVSXMeFs5ST2kbLMXAZqzrKPa
 szHFEJHIGZByXfkP/jix
 =C1ZV
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "Clean ups and miscellaneous bug fixes, in particular for the new
  collapse_range and zero_range fallocate functions.  In addition,
  improve the scalability of adding and remove inodes from the orphan
  list"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits)
  ext4: handle symlink properly with inline_data
  ext4: fix wrong assert in ext4_mb_normalize_request()
  ext4: fix zeroing of page during writeback
  ext4: remove unused local variable "stored" from ext4_readdir(...)
  ext4: fix ZERO_RANGE test failure in data journalling
  ext4: reduce contention on s_orphan_lock
  ext4: use sbi in ext4_orphan_{add|del}()
  ext4: use EXT_MAX_BLOCKS in ext4_es_can_be_merged()
  ext4: add missing BUFFER_TRACE before ext4_journal_get_write_access
  ext4: remove unnecessary double parentheses
  ext4: do not destroy ext4_groupinfo_caches if ext4_mb_init() fails
  ext4: make local functions static
  ext4: fix block bitmap validation when bigalloc, ^flex_bg
  ext4: fix block bitmap initialization under sparse_super2
  ext4: find the group descriptors on a 1k-block bigalloc,meta_bg filesystem
  ext4: avoid unneeded lookup when xattr name is invalid
  ext4: fix data integrity sync in ordered mode
  ext4: remove obsoleted check
  ext4: add a new spinlock i_raw_lock to protect the ext4's raw inode
  ext4: fix locking for O_APPEND writes
  ...
2014-06-08 13:03:35 -07:00
Linus Torvalds 3f17ea6dea Merge branch 'next' (accumulated 3.16 merge window patches) into master
Now that 3.15 is released, this merges the 'next' branch into 'master',
bringing us to the normal situation where my 'master' branch is the
merge window.

* accumulated work in next: (6809 commits)
  ufs: sb mutex merge + mutex_destroy
  powerpc: update comments for generic idle conversion
  cris: update comments for generic idle conversion
  idle: remove cpu_idle() forward declarations
  nbd: zero from and len fields in NBD_CMD_DISCONNECT.
  mm: convert some level-less printks to pr_*
  MAINTAINERS: adi-buildroot-devel is moderated
  MAINTAINERS: add linux-api for review of API/ABI changes
  mm/kmemleak-test.c: use pr_fmt for logging
  fs/dlm/debug_fs.c: replace seq_printf by seq_puts
  fs/dlm/lockspace.c: convert simple_str to kstr
  fs/dlm/config.c: convert simple_str to kstr
  mm: mark remap_file_pages() syscall as deprecated
  mm: memcontrol: remove unnecessary memcg argument from soft limit functions
  mm: memcontrol: clean up memcg zoneinfo lookup
  mm/memblock.c: call kmemleak directly from memblock_(alloc|free)
  mm/mempool.c: update the kmemleak stack trace for mempool allocations
  lib/radix-tree.c: update the kmemleak stack trace for radix tree allocations
  mm: introduce kmemleak_update_trace()
  mm/kmemleak.c: use %u to print ->checksum
  ...
2014-06-08 11:31:16 -07:00
Linus Torvalds 9d2cd01b15 Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd into next
Pull exofs raid6 support from Boaz Harrosh:
 "These simple patches will enable raid6 using the kernel's raid6_pq
  engine for support under exofs and pnfs-objects.

  There is nothing needed to do at exofs and pnfs-obj.  Just fire your
  mkfs.exofs with --raid=6 (that was already supported before) and off
  you go as usual.  The ORE will pick up the new map and will start
  writing two devices of redundancy bits.  The patches are so simple
  because most of the ORE was already for the general raid case, only a
  few bug fixes were needed and the actual wiring into the raid6_pq
  engine"

* 'for-linus' of git://git.open-osd.org/linux-open-osd:
  ore: Support for raid 6
  ore: Remove redundant dev_order(), more cleanups
  ore: (trivial) reformat some code
2014-06-07 17:07:20 -07:00
Jaegeuk Kim 9ab7013492 f2fs: support f2fs_fiemap
This patch links f2fs_fiemap with generic function with get_block.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-06-08 08:56:49 +09:00
Linus Torvalds c593e89787 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fix from Chris Mason:
 "I had this in my 3.16 merge window queue, but it is small and obvious
  enough for 3.15.  I cherry-picked and retested against current rc8"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: send, fix corrupted path strings for long paths
2014-06-07 15:12:18 -07:00
Yan, Zheng 4e217b5dc8 ceph: use truncate_pagecache() instead of truncate_inode_pages()
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-08 05:09:28 +08:00
Jeff Layton 1b19453d1c nfsd: don't halt scanning the DRC LRU list when there's an RC_INPROG entry
Currently, the DRC cache pruner will stop scanning the list when it
hits an entry that is RC_INPROG. It's possible however for a call to
take a *very* long time. In that case, we don't want it to block other
entries from being pruned if they are expired or we need to trim the
cache to get back under the limit.

Fix the DRC cache pruner to just ignore RC_INPROG entries.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06 19:22:49 -04:00
J. Bruce Fields 999e568354 nfs4: remove unused CHANGE_SECURITY_LABEL
This constant has the wrong value.  And we don't use it.  And it's been
removed from the 4.2 spec anyway.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06 19:22:49 -04:00
J. Bruce Fields 542d1ab3c7 nfsd4: kill READ64
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06 19:22:48 -04:00
J. Bruce Fields 06553991e7 nfsd4: kill READ32
While we're here, let's kill off a couple of the read-side macros.

Leaving the more complicated ones alone for now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06 19:22:47 -04:00
J. Bruce Fields 05638dc73a nfsd4: simplify server xdr->next_page use
The rpc code makes available to the NFS server an array of pages to
encod into.  The server represents its reply as an xdr buf, with the
head pointing into the first page in that array, the pages ** array
starting just after that, and the tail (if any) sharing any leftover
space in the page used by the head.

While encoding, we use xdr_stream->page_ptr to keep track of which page
we're currently using.

Currently we set xdr_stream->page_ptr to buf->pages, which makes the
head a weird exception to the rule that page_ptr always points to the
page we're currently encoding into.  So, instead set it to buf->pages -
1 (the page actually containing the head), and remove the need for a
little unintuitive logic in xdr_get_next_encode_buffer() and
xdr_truncate_encode.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06 19:22:46 -04:00
Fabian Frederick 0244756edc ufs: sb mutex merge + mutex_destroy
Commit 788257d610 ("ufs: remove the BKL") replaced BKL with mutex
protection using functions lock_ufs, unlock_ufs and struct mutex 'mutex'
in sb_info.

Commit b6963327e0 ("ufs: drop lock/unlock super") removed lock/unlock
super and added struct mutex 's_lock' in sb_info.

Those 2 mutexes are generally locked/unlocked at the same time except in
allocation (balloc, ialloc).

This patch merges the 2 mutexes and propagates first commit solution.
It also adds mutex destruction before kfree during ufs_fill_super
failure and ufs_put_super.

[akpm@linux-foundation.org: avoid ifdefs, return -EROFS not -EINVAL]
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: "Chen, Jet" <jet.chen@intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:18 -07:00
Fabian Frederick c1d4518c4e fs/dlm/debug_fs.c: replace seq_printf by seq_puts
Replace seq_printf where possible.  This patch also fixes the following
checkpatch warning "unnecessary whitespace before a quoted newline"

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:18 -07:00
Fabian Frederick 6edb56871a fs/dlm/lockspace.c: convert simple_str to kstr
Replace obsolete functions.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:18 -07:00
Fabian Frederick 4f4c337fb7 fs/dlm/config.c: convert simple_str to kstr
Replace obsolete functions

simple_strtoul/kstrtouint
simple_strtol/kstrtoint
(kstr __must_check requires the right function to be applied)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:17 -07:00
Fabian Frederick 310bf8f749 fs/reiserfs/stree.c: remove obsolete __constant
__constant_cpu_to_le32 converted to cpu_to_le32

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:17 -07:00
Fabian Frederick ae0a50aba0 fs/reiserfs/bitmap.c: coding style fixes
-Trivial code clean-up
-Fix endif }; (coccinelle warning)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:16 -07:00
Joe Perches 1f7e0616cd fs: convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:16 -07:00
Joe Perches 5eccdf3954 ntfs: convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:16 -07:00
Joe Perches 92f778dd5d inotify: convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:16 -07:00
Joe Perches f5102e5630 nfs: convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:16 -07:00
Joe Perches 7ac9fe571d lockd: convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:16 -07:00
Joe Perches 75a3294ec5 fscache: convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:16 -07:00
Joe Perches a88bbbeef6 coda: convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:16 -07:00
Fabian Frederick 04541a2f31 fs/devpts/inode.c: convert printk to pr_foo()
Also convert spaces to tabs (checkpatch warnings) if (!dentry) KERN_NOTICE
converted to pr_err (like if (!inode) error process)

[akpm@linux-foundation.org: use KBUILD_MODNAME, per Joe]
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:14 -07:00
Fabian Frederick 0227d6abb3 fs/cachefiles: replace kerror by pr_err
Also add pr_fmt in internal.h

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:14 -07:00
Fabian Frederick 4e1eb88305 FS/CACHEFILES: convert printk to pr_foo()
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:14 -07:00
Fabian Frederick ef74885353 fs/pstore: logging clean-up
- Define pr_fmt in plateform.c and ram_core.c for global prefix.

- Coalesce format fragments.

- Separate format/arguments on lines > 80 characters.

Note: Some pr_foo() were initially declared without prefix and therefore
this could break existing log analyzer.

[akpm@linux-foundation.org: missed a couple of prefix removals]
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:13 -07:00
Fabian Frederick 9606d9aa85 fs/affs: pr_debug cleanup
- Remove AFFS: prefix (defined in pr_fmt)

- Use __func__

- Separate format/arguments on lines > 80 characters.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:13 -07:00
Fabian Frederick 0158de12b0 fs/affs: convert printk to pr_foo()
-All printk(KERN_foo converted to pr_foo()

-Default printk converted to pr_warn()

-Add pr_fmt to affs.h

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:13 -07:00
Fabian Frederick 0c89d67016 fs/affs/file.c: remove unnecessary function parameters
- affs_do_readpage_ofs is always called with from = 0 ie reading from
  page->index

- File parameter is never used

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:13 -07:00
Fabian Frederick a05e16ada4 fs/proc/vmcore.c: remove NULL assignment to static
Static values are automatically initialized to NULL.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:12 -07:00
Fabian Frederick 17c2b4ee40 fs/proc/task_mmu.c: replace seq_printf by seq_puts
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:12 -07:00
Oleg Nesterov c240837fa7 signals: jffs2: fix the wrong usage of disallow_signal()
jffs2_garbage_collect_thread() does disallow_signal(SIGHUP) around
jffs2_garbage_collect_pass() and the comment says "We don't want SIGHUP
to interrupt us".

But disallow_signal() can't ensure that jffs2_garbage_collect_pass()
won't be interrupted by SIGHUP, the problem is that SIGHUP can be
already pending when disallow_signal() is called, and in this case any
interruptible sleep won't block.

Note: this is in fact because disallow_signal() is buggy and should be
fixed, see the next changes.

But there is another reason why disallow_signal() is wrong: SIG_IGN set
by disallow_signal() silently discards any SIGHUP which can be sent
before the next allow_signal(SIGHUP).

Change this code to use sigprocmask(SIG_UNBLOCK/SIG_BLOCK, SIGHUP).
This even matches the old (and wrong) semantics allow/disallow had when
this logic was written.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:11 -07:00
Manuel Schölling ef19470ef8 fs/fat/inode.c: clean up string initializations (char[] instead of char *)
Initializations like 'char *foo = "bar"' will create two variables: a
static string and a pointer (foo) to that static string.  Instead 'char
foo[] = "bar"' will declare a single variable and will end up in shorter
assembly (according to Jeff Garzik on the KernelJanitor's TODO list).

Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:11 -07:00
Conrad Meyer 190a8843de fs/fat/: add support for DOS 1.x formatted volumes
Add structure for parsed BPB information, struct fat_bios_param_block,
and move all of the deserialization and validation logic from
fat_fill_super() into fat_read_bpb().

Add a 'dos1xfloppy' mount option to infer DOS 2.x BIOS Parameter Block
defaults from block device geometry for ancient floppies and floppy
images, as a fall-back from the default BPB parsing logic.

When fat_read_bpb() finds an invalid FAT filesystem and dos1xfloppy is
set, fall back to fat_read_static_bpb().  fat_read_static_bpb()
validates that the entire BPB is zero, and that the floppy has a
DOS-style 8086 code bootstrapping header.  Then it fills in default BPB
values from media size and a table.[0]

Media size is assumed to be static for archaic FAT volumes.  See also:
[1].

Fixes kernel.org bug #42617.

[0]: https://en.wikipedia.org/wiki/File_Allocation_Table#Exceptions
[1]: http://www.win.tue.nl/~aeb/linux/fs/fat/fat-1.html

[hirofumi@mail.parknet.co.jp: fix missed error code]
Signed-off-by: Conrad Meyer <cse.cem@gmail.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Tested-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick a19189e553 fs/hpfs: increase pr_warn level
This patch applies a suggestion by Mikulas Patocka asking to increase
all pr_warn without commented ones to pr_err

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick 1749a10e02 fs/hpfs: use __func__ for logging
Normalize function display fx() using __func__

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick 14da17f9c4 fs/hpfs: use pr_fmt for logging
Also remove redundant level names (warning:...)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick b7cb1ce220 fs/hpfs: convert printk to pr_foo()
No level printk in hptfs_error converted to pr_err (others to pr_warn or
pr_info)

This patch also fixes if/then/else checkpatch warnings

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick 45641c82c1 fs/ufs/balloc.c: remove err parameter in ufs_add_fragments
err is used in ufs_new_fragments (ufs_add_fragments only callsite)
not in ufs_add_fragments.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Christian Kujau df3d4e7a24 hfsplus: fix compiler warning on PowerPC
Commit a99b7069aa ("hfsplus: Fix undefined __divdi3 in
hfsplus_init_header_node()") introduced do_div() to xattr.c and the
warning below too.

As Geert remarked: "tmp" is "loff_t" which is "__kernel_loff_t", which
is "long long", i.e.  signed, while include/asm-generic/div64.h compares
its type with "uint64_t".  As inode sizes are positive, it should be
safe to change the type of "tmp" to "u64".

  In file included from
  arch/powerpc/include/asm/div64.h:1:0,
                    from include/linux/kernel.h:124,
                    from include/asm-generic/bug.h:13,
                    from arch/powerpc/include/asm/bug.h:127,
                    from include/linux/bug.h:4,
                    from include/linux/thread_info.h:11,
                    from include/asm-generic/preempt.h:4,
                    from arch/powerpc/include/generated/asm/preempt.h:1,
                    from include/linux/preempt.h:18,
                    from include/linux/spinlock.h:50,
                    from include/linux/wait.h:8,
                    from include/linux/fs.h:6,
                    from fs/hfsplus/hfsplus_fs.h:19,
                    from fs/hfsplus/xattr.c:9:
  fs/hfsplus/xattr.c: In function 'hfsplus_init_header_node':
  include/asm-generic/div64.h:43:28: warning: comparison of distinct pointer types lacks a cast [enabled by default]
     (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \
                               ^
  fs/hfsplus/xattr.c:86:2: note: in expansion of macro 'do_div'
     do_div(tmp, node_size);
     ^

Signed-off-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Sergei Antonov <saproj@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick b73f3d0e70 fs/hfsplus: fix pr_foo() and hfs_dbg formats
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Suggested-By: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Sergei Antonov ffbc067161 hfsplus: coding style fix for declarations in hfsplus_fs.h
Some function declarations in hfsplus_fs.h were with argument names,
some without, and some were mixed.  This patch adds argument names
everywhere, sorts function in order they go in .c files, and moves
hfs_part_find() to a proper section.

Auto-formatting and sorting was done with:
cfunctions *.c | indent -linux | sed "s| \* | \*|"

Signed-off-by: Sergei Antonov <saproj@gmail.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick 297cc27207 fs/hfsplus/wrapper.c: replace shift loop by ilog2
Replace while blocksize;shift by ilog2

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Sergei Antonov 2cd282a1bc hfsplus: fix "unused node is not erased" error
Zero newly allocated extents in the catalog tree if volume attributes
tell us to.  Not doing so we risk getting the "unused node is not
erased" error.  See kHFSUnusedNodeFix flag in Apple's source code for
reference.

There was a previous commit clearing the node when it is freed: commit
899bed05e9 ("hfsplus: fix issue with unzeroed unused b-tree nodes").
But it did not handle newly allocated extents (this patch fixes it).
And it zeroed nodes in all trees unconditionally which is an overkill.

This patch adds a condition and also switches to 'tree->node_size' as a
simpler method of getting the length to zero.

Signed-off-by: Sergei Antonov <saproj@gmail.com>
Cc: Anton Altaparmakov <aia21@cam.ac.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Kyle Laracey <kalaracey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick 915ab236d3 fs/hfsplus/wrapper.c: replace min/casting by min_t
Also add * before function comments (it was not detected by kernel-doc)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick d8983ca0aa fs/hfsplus/options.c: replace seq_printf by seq_puts
Replace seq_printf where possible

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:10 -07:00
Fabian Frederick e46707d153 fs/hfsplus/bnode.c: replace min/casting by min_t
Also fixes some pr_ formats

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Sergei Antonov 97a62eaefd hfsplus: emit proper file type from readdir
hfsplus_readdir() incorrectly returned DT_REG for symbolic links and
special files.  Return DT_REG, DT_LNK, DT_FIFO, DT_CHR, DT_BLK, DT_SOCK,
or DT_UNKNOWN according to mode field in catalog record.  Programs
relying on information from readdir will now work correctly with HFS+.

Signed-off-by: Sergei Antonov <saproj@gmail.com>
Cc: Anton Altaparmakov <aia21@cam.ac.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Hin-Tak Leung 7f2fc81ea2 hfsplus: remove unused routine hfsplus_attr_build_key_uni
The directory/file catalog b-tree equivalent, hfsplus_build_key_uni(),
is used by hfsplus_find_cat() for internal referencing between catalog
records.  There is no corresponding usage for attributes - attribute
records do not refer to one another.

Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Sougata Santra <sougata@tuxera.com>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Hin-Tak Leung bf29e886b2 hfsplus: correct usage of HFSPLUS_ATTR_MAX_STRLEN for non-English attributes
HFSPLUS_ATTR_MAX_STRLEN (=127) is the limit of attribute names for the
number of unicode character (UTF-16BE) storable in the HFS+ file system.
Almost all the current usage of it is wrong, in relation to NLS to
on-disk conversion.

Except for one use calling hfsplus_asc2uni (which should stay the same)
and its uses in calling hfsplus_uni2asc (which was corrected in the
earlier patch in this series concerning usage of hfsplus_uni2asc), all
the other uses are of the forms:

- char buffer[size]

- bound check: "if (namespace_adjusted_input_length > size) return failure;"

Conversion between on-disk unicode representation and NLS char strings
(in whichever direction) always needs to accommodate the worst-case NLS
conversion, so all char buffers of that size need to have a
NLS_MAX_CHARSET_SIZE x .

The bound checks are all wrong, since they compare nls_length derived
from strlen() to a unicode length limit.

It turns out that all the bound-checks do is to protect
hfsplus_asc2uni(), which can fail if the input is too large.

There is only one usage of it as far as attributes are concerned, in
hfsplus_attr_build_key().  It is in turn used by hfsplus_find_attr(),
hfsplus_create_attr(), hfsplus_delete_attr().  Thus making sure that
errors from hfsplus_asc2uni() is caught in hfsplus_attr_build_key() and
propagated is sufficient to replace all the bound checks.

Unpropagated errors from hfsplus_asc2uni() in the file catalog code was
addressed recently in an independent patch "hfsplus: fix longname
handling" by Sougata Santra.

Before this patch, trying to set a 55 CJK character (in a UTF-8 locale,
> 127/3=42) attribute plus user prefix fails with:

    $ setfattr -n user.`cat testing-string` -v `cat testing-string` \
        testing-string
    setfattr: testing-string: Operation not supported

and retrieving a stored long attributes is particular ugly(!):

    find /mnt/* -type f -exec getfattr -d {} \;
    getfattr: /mnt/testing-string: Input/output error

with console log:
    [268008.389781] hfsplus: unicode conversion failed

After the patch, both of the above works.

FYI, the test attribute string is prepared with:

echo -e -n \
"\xe9\x80\x99\xe6\x98\xaf\xe4\xb8\x80\xe5\x80\x8b\xe9\x9d\x9e\xe5" \
"\xb8\xb8\xe6\xbc\xab\xe9\x95\xb7\xe8\x80\x8c\xe6\xa5\xb5\xe5\x85" \
"\xb6\xe4\xb9\x8f\xe5\x91\xb3\xe5\x92\x8c\xe7\x9b\xb8\xe7\x95\xb6" \
"\xe7\x84\xa1\xe8\xb6\xa3\xe3\x80\x81\xe4\xbb\xa5\xe5\x8f\x8a\xe7" \
"\x84\xa1\xe7\x94\xa8\xe7\x9a\x84\xe3\x80\x81\xe5\x86\x8d\xe5\x8a" \
"\xa0\xe4\xb8\x8a\xe6\xaf\xab\xe7\x84\xa1\xe6\x84\x8f\xe7\xbe\xa9" \
"\xe7\x9a\x84\xe6\x93\xb4\xe5\xb1\x95\xe5\xb1\xac\xe6\x80\xa7\xef" \
"\xbc\x8c\xe8\x80\x8c\xe5\x85\xb6\xe5\x94\xaf\xe4\xb8\x80\xe5\x89" \
"\xb5\xe5\xbb\xba\xe7\x9b\xae\xe7\x9a\x84\xe5\x83\x85\xe6\x98\xaf" \
"\xe7\x82\xba\xe4\xba\x86\xe6\xb8\xac\xe8\xa9\xa6\xe4\xbd\x9c\xe7" \
"\x94\xa8\xe3\x80\x82" | tr -d ' '

(= "pointlessly long attribute for testing", elaborate Chinese in
UTF-8 enoding).

However, it is not possible to set double the size (110 + 5 is still
under 127) in a UTF-8 locale:

    $setfattr -n user.`cat testing-string testing-string` -v \
        `cat testing-string testing-string` testing-string
    setfattr: testing-string: Numerical result out of range

110 CJK char in UTF-8 is 330 bytes - the generic get/set attribute
system call code in linux/fs/xattr.c imposes a 255 byte limit.  One can
use a combination of iconv to encode content, changing terminal locale
for viewing, and an nls=cp932/cp936/cp949/cp950 mount option to fully
use 127-unicode attribute in a double-byte locale.

Also, as an additional information, it is possible to (mis-)use unicode
half-width/full-width forms (U+FFxx) to write attributes which looks
like english but not actually ascii.

Thanks Anton Altaparmakov for reviewing the earlier ideas behind this
change.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Hin-Tak Leung 017f8da43e hfsplus: fix worst-case unicode to char conversion of file names and attributes
This is a series of 3 patches which corrects issues in HFS+ concerning
the use of non-english file names and attributes.  Names and attributes
are stored internally as UTF-16 units up to a fixed maximum size, and
convert to and from user-representation by NLS.  The code incorrectly
assume that NLS string lengths are equal to unicode lengths, which is
only true for English ascii usage.

This patch (of 3):

The HFS Plus Volume Format specification (TN1150) states that file names
are stored internally as a maximum of 255 unicode characters, as defined
by The Unicode Standard, Version 2.0 [Unicode, Inc.  ISBN
0-201-48345-9].  File names are converted by the NLS system on Linux
before presented to the user.

255 CJK characters converts to UTF-8 with 1 unicode character to up to 3
bytes, and to GB18030 with 1 unicode character to up to 4 bytes.  Thus,
trying in a UTF-8 locale to list files with names of more than 85 CJK
characters results in:

    $ ls /mnt
    ls: reading directory /mnt: File name too long

The receiving buffer to hfsplus_uni2asc() needs to be 255 x
NLS_MAX_CHARSET_SIZE bytes, not 255 bytes as the code has always been.

Similar consideration applies to attributes, which are stored internally
as a maximum of 127 UTF-16BE units.  See XNU source for an up-to-date
reference on attributes.

Strictly speaking, the maximum value of NLS_MAX_CHARSET_SIZE = 6 is not
attainable in the case of conversion to UTF-8, as going beyond 3 bytes
requires the use of surrogate pairs, i.e.  consuming two input units.

Thanks Anton Altaparmakov for reviewing an earlier version of this
change.

This patch fixes all callers of hfsplus_uni2asc(), and also enables the
use of long non-English file names in HFS+.  The getting and setting,
and general usage of long non-English attributes requires further
forthcoming work, in the following patches of this series.

[akpm@linux-foundation.org: fix build]
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Fabian Frederick 6d6bd94f4d fs/coda: use __func__
Replace all function names by __func__ in pr_foo()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Fabian Frederick f38cfb2564 fs/coda: logging prefix uniformization
- Add pr_fmt based on module name.

- Remove Coda: coda: from pr_foo()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Fabian Frederick d9b4b3195a fs/coda: replace printk by pr_foo()
No level printk converted to pr_warn or pr_info

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Fabian Frederick 817e1d902a fs/befs: kernel-doc fixes
Fix some comment errors.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Fabian Frederick f38f41c31b fs/befs/linuxvfs.c: remove positive test on sector_t
sector_t is unsigned.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Fabian Frederick 6cb103b6f4 fs/befs/btree.c: replace strncpy by strlcpy + coding style fixing
- strncpy + end of string assignement replaced by strlcpy

- Fix endif };

- Fix typo

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Fabian Frederick 39d7a29f86 fs/befs/linuxvfs.c: replace strncpy by strlcpy
strncpy + end of string assignment replaced by strlcpy

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:09 -07:00
Fabian Frederick 3364d113c8 fs/ceph/debugfs.c: replace seq_printf by seq_puts
Replace seq_printf where possible.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Sage Weil <sage@inktank.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:06 -07:00
Fabian Frederick f3ae1b97be fs/ceph: replace pr_warning by pr_warn
Update the last pr_warning callsites in fs branch

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Sage Weil <sage@inktank.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:06 -07:00
Naoya Horiguchi d4c54919ed mm: add !pte_present() check on existing hugetlb_entry callbacks
The age table walker doesn't check non-present hugetlb entry in common
path, so hugetlb_entry() callbacks must check it.  The reason for this
behavior is that some callers want to handle it in its own way.

[ I think that reason is bogus, btw - it should just do what the regular
  code does, which is to call the "pte_hole()" function for such hugetlb
  entries  - Linus]

However, some callers don't check it now, which causes unpredictable
result, for example when we have a race between migrating hugepage and
reading /proc/pid/numa_maps.  This patch fixes it by adding !pte_present
checks on buggy callbacks.

This bug exists for years and got visible by introducing hugepage
migration.

ChangeLog v2:
- fix if condition (check !pte_present() instead of pte_present())

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org> [3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Backported to 3.15.  Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 13:21:16 -07:00
Filipe Manana 01a9a8a9e2 Btrfs: send, fix corrupted path strings for long paths
If a path has more than 230 characters, we allocate a new buffer to
use for the path, but we were forgotting to copy the contents of the
previous buffer into the new one, which has random content from the
kmalloc call.

Test:

    mkfs.btrfs -f /dev/sdd
    mount /dev/sdd /mnt

    TEST_PATH="/mnt/fdmanana/.config/google-chrome-mysetup/Default/Pepper_Data/Shockwave_Flash/WritableRoot/#SharedObjects/JSHJ4ZKN/s.wsj.net/[[IMPORT]]/players.edgesuite.net/flash/plugins/osmf/advanced-streaming-plugin/v2.7/osmf1.6/Ak#"
    mkdir -p $TEST_PATH
    echo "hello world" > $TEST_PATH/amaiAdvancedStreamingPlugin.txt

    btrfs subvolume snapshot -r /mnt /mnt/mysnap1
    btrfs send /mnt/mysnap1 -f /tmp/1.snap

A test for xfstests follows.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Cc: Marc Merlin <marc@merlins.org>
Tested-by: Marc MERLIN <marc@merlins.org>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-06 12:00:46 -07:00
Jaegeuk Kim 86928f984e f2fs: avoid not to call remove_dirty_inode
There is an errorneous case during the recovery like below.

In recovery_dentry,
 1) dir = f2fs_iget();
 2) mark the dir with FI_DELAY_IPUT
 3) goto unmap_out

After the end of recovery routine, there is no dirty dentries so the dir cannot
be released by iput in remove_dirty_dir_inode.

This patch fixes such the bug case by handling the iget and iput in the
recovery_dentry procedure.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-06-07 03:18:36 +09:00
Jaegeuk Kim 6fa1df533a f2fs: recover fallocated space
If a fallocated file is fsynced, we should recover the i_size after sudden
power cut.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-06-07 03:18:35 +09:00
Jan Kara 30265117ee xfs: Fix rounding in xfs_alloc_fix_len()
Rounding in xfs_alloc_fix_len() is wrong. As the comment states, the
result should be a number of a form (k*prod+mod) however due to sign
mistake the result is different. As a result allocations on raid arrays
could be misaligned in some cases.

This also seems to fix occasional assertion failure:
	XFS_WANT_CORRUPTED_GOTO(rlen <= flen, error0)
in xfs_alloc_ag_vextent_size().

Also add an assertion that the result of xfs_alloc_fix_len() is of
expected form.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 16:06:37 +10:00
Christoph Hellwig 448011e2ab xfs: tone down writepage/releasepage WARN_ONs
I recently ran into the issue fixed by

  "xfs: kill buffers over failed write ranges properly"

which spams the log with lots of backtraces.  Make debugging any
issues like that easier by using WARN_ON_ONCE in the writeback code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 16:05:15 +10:00
Dan Carpenter 72208ee060 xfs: small cleanup in xfs_lowbit64()
There are two checkpatch.pl complaints here because of the bad
indenting and because of the assignment inside the condition.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 16:04:42 +10:00
Dave Chinner 36de95567f xfs: kill xfs_buf_geterror()
Most of the callers are just calling ASSERT(!xfs_buf_geterror())
which means they are checking for bp->b_error == 0. If bp is null in
this case, we will assert fail, and hence it's no different in
result to oopsing because of a null bp. In some cases, errors have
already been checked for or the function returning the buffer can't
return a buffer with an error, so it's just a redundant assert.
Either way, the assert can either be removed.

The other two non-assert callers can just test for a buffer and
error properly.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 16:02:12 +10:00
Dave Chinner 556b8883cf xfs: xfs_readsb needs to check for magic numbers
Commit daba542 ("xfs: skip verification on initial "guess"
superblock read") dropped the use of a verifier for the initial
superblock read so we can probe the sector size of the filesystem
stored in the superblock. It, however, now fails to validate that
what was read initially is actually an XFS superblock and hence will
fail the sector size check and return ENOSYS.

This causes probe-based mounts to fail because it expects XFS to
return EINVAL when it doesn't recognise the superblock format.

cc: <stable@vger.kernel.org>
Reported-by: Plamen Petrov <plamen.sisi@gmail.com>
Tested-by: Plamen Petrov <plamen.sisi@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 16:00:43 +10:00
Dave Chinner 1f6d64829d xfs: block allocation work needs to be kswapd aware
Upon memory pressure, kswapd calls xfs_vm_writepage() from
shrink_page_list(). This can result in delayed allocation occurring
and that gets deferred to the the allocation workqueue.

The allocation then runs outside kswapd context, which means if it
needs memory (and it does to demand page metadata from disk) it can
block in shrink_inactive_list() waiting for IO congestion. These
blocking waits are normally avoiding in kswapd context, so under
memory pressure writeback from kswapd can be arbitrarily delayed by
memory reclaim.

To avoid this, pass the kswapd context to the allocation being done
by the workqueue, so that memory reclaim understands correctly that
the work is being done for kswapd and therefore it is not blocked
and does not delay memory reclaim.

To avoid issues with int->char conversion of flag fields (as noticed
in v1 of this patch) convert the flag fields in the struct
xfs_bmalloca to bool types. pahole indicates these variables are
still single byte variables, so no extra space is consumed by this
change.

cc: <stable@vger.kernel.org>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:59:59 +10:00
Adrian Hunter 82b897782d perf: Differentiate exec() and non-exec() comm events
perf tools like 'perf report' can aggregate samples by comm strings,
which generally works.  However, there are other potential use-cases.
For example, to pair up 'calls' with 'returns' accurately (from branch
events like Intel BTS) it is necessary to identify whether the process
has exec'd.  Although a comm event is generated when an 'exec' happens
it is also generated whenever the comm string is changed on a whim
(e.g. by prctl PR_SET_NAME).  This patch adds a flag to the comm event
to differentiate one case from the other.

In order to determine whether the kernel supports the new flag, a
selection bit named 'exec' is added to struct perf_event_attr.  The
bit does nothing but will cause perf_event_open() to fail if the bit
is set on kernels that do not have it defined.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/537D9EBE.7030806@intel.com
Cc: Paul Mackerras <paulus@samba.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-06 07:56:22 +02:00
Peter Zijlstra e041e328c4 perf: Fix perf_event_comm() vs. exec() assumption
perf_event_comm() assumes that set_task_comm() is only called on
exec(), and in particular that its only called on current.

Neither are true, as Dave reported a WARN triggered by set_task_comm()
being called on !current.

Separate the exec() hook from the comm hook.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20140521153219.GH5226@laptop.programming.kicks-ass.net
[ Build fix. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-06 07:54:02 +02:00
Dave Chinner b2a21e7a6b xfs: remove redundant geometry information from xfs_da_state
It's carried in state->args->geo, so there's no need to duplicate it
and use more stack space than necessary.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:22:04 +10:00
Dave Chinner c2c4c477e0 xfs: replace attr LBSIZE with xfs_da_geometry
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:21:45 +10:00
Dave Chinner c59f0ad23a xfs: pass xfs_da_args to xfs_attr_leaf_newentsize
As it's only ever called from contexts where the xfs_da_args is
present and contains all the information needed inside the args
structure.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:21:27 +10:00
Dave Chinner 33a6039007 xfs: use xfs_da_geometry for block size in attr code
Rather than using the superblock value obtained through the
xfs_mount.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:21:10 +10:00
Dave Chinner bc85178a76 xfs: remove mp->m_dir_geo from directory logging
We don't pass the xfs_da_args or the geometry all the way down to
the directory buffer logging code, hence we have to use
mp->m_dir_geo here. Fix this to use the geometry passed via the
xfs_da_args, and convert all the directory logging functions for
consistency.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:20:54 +10:00
Dave Chinner 53f82db003 xfs: reduce direct usage of mp->m_dir_geo
There are many places in the directory code were we don't pass the
args into and so have to extract the geometry direct from the mount
structure. Push the args or the geometry into these leaf functions
so that we don't need to grab it from the struct xfs_mount.

This, in turn, brings use to the point where directory geometry is
no longer a property of the struct xfs_mount; it is not a global
property anymore, and hence we can start to consider per-directory
configuration of physical geometries.

Start by converting the xfs_dir_isblock/leaf code - pass in the
xfs_da_args and convert the readdir code to use xfs_da_args like
the rest of the directory code to pass information around.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:20:32 +10:00
Dave Chinner 7ab610f9e0 xfs: move node entry counts to xfs_da_geometry
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:20:02 +10:00
Dave Chinner ed358c0058 xfs: convert dir/attr btree threshold to xfs_da_geometry
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:18:10 +10:00
Dave Chinner 8f66193c89 xfs: convert m_dirblksize to xfs_da_geometry
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:15:59 +10:00
Dave Chinner d6cf13051f xfs: convert m_dirblkfsbs to xfs_da_geometry
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:14:11 +10:00
Dave Chinner 7dda6e8644 xfs: convert directory segment limits to xfs_da_geometry
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:11:18 +10:00
Dave Chinner 30028030b1 xfs: convert directory db conversion to xfs_da_geometry
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:08:18 +10:00
Dave Chinner 2998ab1d45 xfs: convert directory dablk conversion to xfs_da_geometry
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:07:53 +10:00
Dave Chinner 9b3b5522d3 xfs: convert dir byte/off conversion to xfs_da_geometry
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:06:53 +10:00
Dave Chinner 8c44a28561 xfs: kill XFS_DIR2...FIRSTDB macros
They are just simple wrappers around xfs_dir2_byte_to_db(), and
we've already removed one usage earlier in the patch set. Kill
the rest before we start removing the xfs_mount from conversion
functions.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:04:41 +10:00
Dave Chinner 892e3f342f xfs: move directory block translatiosn to xfs_dir2_priv.h
Because they aren't actually part of the on-disk format, and so
shouldn't be in xfs_da_format.h.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:04:05 +10:00
Dave Chinner 0650b55497 xfs: introduce directory geometry structure
The directory code has a dependency on the struct xfs_mount to
supply the directory block geometry. Block size, block log size,
and other parameters are pre-caclulated in the struct xfs_mount or
access directly from the superblock embedded in the struct
xfs_mount.

Extract all of this geometry information out of the struct xfs_mount
and superblock and place it into a new struct xfs_da_geometry
defined by the directory code. Allocate and initialise it at mount
time, and attach it to the struct xfs_mount so it canbe passed back
into the directory code appropriately rather than using the struct
xfs_mount.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-06 15:01:58 +10:00
Sage Weil b8e69066d8 ceph: include time stamp in every MDS request
We recently modified the client/MDS protocol to include a timestamp in the
client request.  This allows ctime updates to follow the client's clock
in most cases, which avoids subtle problems when clocks are out of sync
and timestamps are updated sometimes by the MDS clock (for most requests)
and sometimes by the client clock (for cap writeback).

Signed-off-by: Sage Weil <sage@inktank.com>
2014-06-06 09:30:00 +08:00
Zhang Zhen 23cd573b46 ceph: refactor readpage_nounlock() to make the logic clearer
If the return value of ceph_osdc_readpages() is not negative,
it is certainly greater than or equal to zero.

Remove the useless condition judgment and redundant braces.

Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:56 +08:00
Yan, Zheng ca665e0282 mds: check cap ID when handling cap export message
handle following sequence of events:
- mds0 exports an inode to mds1. client receives the cap import
  message from mds1. caps from mds0 are removed while handling
  the cap import message.
- mds1 exports an inode to mds0. client receives the cap export
  message from mds1. handle_cap_export() adds placeholder caps
  for mds0
- client receives the first cap export message (for exporting
  inode from mds0 to mds1)

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:55 +08:00
Yan, Zheng 8d08503c13 ceph: remember subtree root dirfrag's auth MDS
remember dirfrag's auth MDS when it's different from its parent inode's
auth MDS.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:55 +08:00
Yan, Zheng 3e7fbe9ceb ceph: introduce ceph_fill_fragtree()
Move the code that update the i_fragtree into a separate function.
Also add simple probabilistic test to decide whether the i_fragtree
should be updated

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:54 +08:00
Yan, Zheng 2cd698be9a ceph: handle cap import atomically
cap import messages are processed by both handle_cap_import() and
handle_cap_grant(). These two functions are not executed in the same
atomic context, so they can races with cap release.

The fix is make handle_cap_import() not release the i_ceph_lock when
it returns. Let handle_cap_grant() release the lock after it finishes
its job.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:53 +08:00
Yan, Zheng d9df278350 ceph: pre-allocate ceph_cap struct for ceph_add_cap()
So that ceph_add_cap() can be used while i_ceph_lock is locked.
This simplifies the code that handle cap import/export.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:53 +08:00
Yan, Zheng f98a128a55 ceph: update inode fields according to issued caps
Cap message and request reply from non-auth MDS may carry stale
information (corresponding locks are in LOCK states) even they
have the newest inode version. So client should update inode fields
according to issued caps.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:52 +08:00
Yan, Zheng c6bcda6f52 ceph: queue vmtruncate if necessary when handing cap grant/revoke
cap grant/revoke message from non-auth MDS can update inode's size
and truncate_seq/truncate_size. (the message arrives before auth
MDS's cap trunc message)

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:51 +08:00
Zhang Zhen 979d4c1895 ceph: remove useless ACL check
posix_acl_xattr_set() already does the check, and it's the only
way to feed in an ACL from userspace.
So the check here is useless, remove it.

Signed-off-by: zhang zhen <zhenzhang.zhang@huawei.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:50 +08:00
Fengguang Wu e84be11c53 ceph: ceph_get_parent() can be static
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
2014-06-06 09:29:50 +08:00
Trond Myklebust abbec2da13 NFS: Use raw_write_seqcount_begin/end int nfs4_reclaim_open_state
The addition of lockdep code to write_seqcount_begin/end has lead to
a bunch of false positive claims of ABBA deadlocks with the so_lock
spinlock. Audits show that this simply cannot happen because the
read side code does not spin while holding so_lock.

Cc: <stable@vger.kernel.org> # 3.13.x
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-06-05 13:02:47 -04:00
Linus Torvalds a0abcf2e8f Merge branch 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull x86 cdso updates from Peter Anvin:
 "Vdso cleanups and improvements largely from Andy Lutomirski.  This
  makes the vdso a lot less ''special''"

* 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso, build: Make LE access macros clearer, host-safe
  x86/vdso, build: Fix cross-compilation from big-endian architectures
  x86/vdso, build: When vdso2c fails, unlink the output
  x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET
  x86, mm: Replace arch_vma_name with vm_ops->name for vsyscalls
  x86, mm: Improve _install_special_mapping and fix x86 vdso naming
  mm, fs: Add vm_ops->name as an alternative to arch_vma_name
  x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET
  x86, vdso: Remove vestiges of VDSO_PRELINK and some outdated comments
  x86, vdso: Move the vvar and hpet mappings next to the 64-bit vDSO
  x86, vdso: Move the 32-bit vdso special pages after the text
  x86, vdso: Reimplement vdso.so preparation in build-time C
  x86, vdso: Move syscall and sysenter setup into kernel/cpu/common.c
  x86, vdso: Clean up 32-bit vs 64-bit vdso params
  x86, mm: Ensure correct alignment of the fixmap
2014-06-05 08:05:29 -07:00
Fabian Frederick 3ff6db3287 fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init
autofs_dev_ioctl_init is only called by __init init_autofs4_fs

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:21 -07:00
Fabian Frederick 8091b895b7 fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoul
Remove obsolete simple_strtoul in ncp_getopt

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:21 -07:00
Axel Lin 3430343572 fs/binfmt_flat.c: make old_reloc() static
old_reloc() is only used in this file, make it static.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:21 -07:00
Fabian Frederick b219e25f8d fs/binfmt_elf.c: fix bool assignements
Fix coccinelle warnings.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:21 -07:00
Fabian Frederick d1826f2a3d fs/efs: convert printk(KERN_DEBUG to pr_debug
All KERN_DEBUG callsites being under #ifdef DEBUG we can safely convert
everything to pr_debug without changing current behaviour.

Remove #ifdef DEBUG around pr_debugs only (suggested by Joe Perches)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:21 -07:00
Fabian Frederick f403d1dbac fs/efs: add pr_fmt / use __func__
Also uniformize function arguments.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:20 -07:00
Fabian Frederick 179b87fb18 fs/efs: convert printk to pr_foo()
Convert all except KERN_DEBUG
(pr_debug doesn't work the same as printk(KERN_DEBUG and requires
special check)

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:20 -07:00
Fabian Frederick 00f01791e1 fs/exportfs/expfs.c: kernel-doc warning fixes
Fixing 2 typo in function comments.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:14 -07:00
Fabian Frederick e37dcbfbb2 fs/efivarfs/super.c: use static const for dentry_operations
...like other filesystems.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:14 -07:00
Tim Chen d23da150a3 fs/superblock: avoid locking counting inodes and dentries before reclaiming them
We remove the call to grab_super_passive in call to super_cache_count.
This becomes a scalability bottleneck as multiple threads are trying to do
memory reclamation, e.g.  when we are doing large amount of file read and
page cache is under pressure.  The cached objects quickly got reclaimed
down to 0 and we are aborting the cache_scan() reclaim.  But counting
creates a log jam acquiring the sb_lock.

We are holding the shrinker_rwsem which ensures the safety of call to
list_lru_count_node() and s_op->nr_cached_objects.  The shrinker is
unregistered now before ->kill_sb() so the operation is safe when we are
doing unmount.

The impact will depend heavily on the machine and the workload but for a
small machine using postmark tuned to use 4xRAM size the results were

                                  3.15.0-rc5            3.15.0-rc5
                                     vanilla         shrinker-v1r1
Ops/sec Transactions         21.00 (  0.00%)       24.00 ( 14.29%)
Ops/sec FilesCreate          39.00 (  0.00%)       44.00 ( 12.82%)
Ops/sec CreateTransact       10.00 (  0.00%)       12.00 ( 20.00%)
Ops/sec FilesDeleted       6202.00 (  0.00%)     6202.00 (  0.00%)
Ops/sec DeleteTransact       11.00 (  0.00%)       12.00 (  9.09%)
Ops/sec DataRead/MB          25.97 (  0.00%)       29.10 ( 12.05%)
Ops/sec DataWrite/MB         49.99 (  0.00%)       56.02 ( 12.06%)

ffsb running in a configuration that is meant to simulate a mail server showed

                                 3.15.0-rc5             3.15.0-rc5
                                    vanilla          shrinker-v1r1
Ops/sec readall           9402.63 (  0.00%)      9567.97 (  1.76%)
Ops/sec create            4695.45 (  0.00%)      4735.00 (  0.84%)
Ops/sec delete             173.72 (  0.00%)       179.83 (  3.52%)
Ops/sec Transactions     14271.80 (  0.00%)     14482.81 (  1.48%)
Ops/sec Read                37.00 (  0.00%)        37.60 (  1.62%)
Ops/sec Write               18.20 (  0.00%)        18.30 (  0.55%)

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:11 -07:00
Dave Chinner 28f2cd4f6d fs/superblock: unregister sb shrinker before ->kill_sb()
This series is aimed at regressions noticed during reclaim activity.  The
first two patches are shrinker patches that were posted ages ago but never
merged for reasons that are unclear to me.  I'm posting them again to see
if there was a reason they were dropped or if they just got lost.  Dave?
Time?  The last patch adjusts proportional reclaim.  Yuanhan Liu, can you
retest the vm scalability test cases on a larger machine?  Hugh, does this
work for you on the memcg test cases?

Based on ext4, I get the following results but unfortunately my larger
test machines are all unavailable so this is based on a relatively small
machine.

postmark
                                  3.15.0-rc5            3.15.0-rc5
                                     vanilla       proportion-v1r4
Ops/sec Transactions         21.00 (  0.00%)       25.00 ( 19.05%)
Ops/sec FilesCreate          39.00 (  0.00%)       45.00 ( 15.38%)
Ops/sec CreateTransact       10.00 (  0.00%)       12.00 ( 20.00%)
Ops/sec FilesDeleted       6202.00 (  0.00%)     6202.00 (  0.00%)
Ops/sec DeleteTransact       11.00 (  0.00%)       12.00 (  9.09%)
Ops/sec DataRead/MB          25.97 (  0.00%)       30.02 ( 15.59%)
Ops/sec DataWrite/MB         49.99 (  0.00%)       57.78 ( 15.58%)

ffsb (mail server simulator)
                                 3.15.0-rc5             3.15.0-rc5
                                    vanilla        proportion-v1r4
Ops/sec readall           9402.63 (  0.00%)      9805.74 (  4.29%)
Ops/sec create            4695.45 (  0.00%)      4781.39 (  1.83%)
Ops/sec delete             173.72 (  0.00%)       177.23 (  2.02%)
Ops/sec Transactions     14271.80 (  0.00%)     14764.37 (  3.45%)
Ops/sec Read                37.00 (  0.00%)        38.50 (  4.05%)
Ops/sec Write               18.20 (  0.00%)        18.50 (  1.65%)

dd of a large file
                                3.15.0-rc5            3.15.0-rc5
                                   vanilla       proportion-v1r4
WallTime DownloadTar       75.00 (  0.00%)       61.00 ( 18.67%)
WallTime DD               423.00 (  0.00%)      401.00 (  5.20%)
WallTime Delete             2.00 (  0.00%)        5.00 (-150.00%)

stutter (times mmap latency during large amounts of IO)

                            3.15.0-rc5            3.15.0-rc5
                               vanilla       proportion-v1r4
Unit >5ms Delays  80252.0000 (  0.00%)  81523.0000 ( -1.58%)
Unit Mmap min         8.2118 (  0.00%)      8.3206 ( -1.33%)
Unit Mmap mean       17.4614 (  0.00%)     17.2868 (  1.00%)
Unit Mmap stddev     24.9059 (  0.00%)     34.6771 (-39.23%)
Unit Mmap max      2811.6433 (  0.00%)   2645.1398 (  5.92%)
Unit Mmap 90%        20.5098 (  0.00%)     18.3105 ( 10.72%)
Unit Mmap 93%        22.9180 (  0.00%)     20.1751 ( 11.97%)
Unit Mmap 95%        25.2114 (  0.00%)     22.4988 ( 10.76%)
Unit Mmap 99%        46.1430 (  0.00%)     43.5952 (  5.52%)
Unit Ideal  Tput     85.2623 (  0.00%)     78.8906 (  7.47%)
Unit Tput min        44.0666 (  0.00%)     43.9609 (  0.24%)
Unit Tput mean       45.5646 (  0.00%)     45.2009 (  0.80%)
Unit Tput stddev      0.9318 (  0.00%)      1.1084 (-18.95%)
Unit Tput max        46.7375 (  0.00%)     46.7539 ( -0.04%)

This patch (of 3):

We will like to unregister the sb shrinker before ->kill_sb().  This will
allow cached objects to be counted without call to grab_super_passive() to
update ref count on sb.  We want to avoid locking during memory
reclamation especially when we are skipping the memory reclaim when we are
out of cached objects.

This is safe because grab_super_passive does a try-lock on the
sb->s_umount now, and so if we are in the unmount process, it won't ever
block.  That means what used to be a deadlock and races we were avoiding
by using grab_super_passive() is now:

        shrinker                        umount

        down_read(shrinker_rwsem)
                                        down_write(sb->s_umount)
                                        shrinker_unregister
                                          down_write(shrinker_rwsem)
                                            <blocks>
        grab_super_passive(sb)
          down_read_trylock(sb->s_umount)
            <fails>
        <shrinker aborts>
        ....
        <shrinkers finish running>
        up_read(shrinker_rwsem)
                                          <unblocks>
                                          <removes shrinker>
                                          up_write(shrinker_rwsem)
                                        ->kill_sb()
                                        ....

So it is safe to deregister the shrinker before ->kill_sb().

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:11 -07:00
Fabian Frederick 6e6870d4fd fs/hugetlbfs/inode.c: remove null test before kfree
Fix checkpatch warning:
WARNING: kfree(NULL) is safe this check is probably not required

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:11 -07:00
Fabian Frederick be1d2cf5e3 fs/hugetlbfs/inode.c: use static const for dentry_operations
...like other filesystems.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:11 -07:00
Fabian Frederick 422b2448fc fs/hugetlbfs/inode.c: add static to hugetlbfs_i_mmap_mutex_key
hugetlbfs_i_mmap_mutex_key is only used in inode.c

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:11 -07:00
Mel Gorman 2457aec637 mm: non-atomically mark page accessed during page cache allocation where possible
aops->write_begin may allocate a new page and make it visible only to have
mark_page_accessed called almost immediately after.  Once the page is
visible the atomic operations are necessary which is noticable overhead
when writing to an in-memory filesystem like tmpfs but should also be
noticable with fast storage.  The objective of the patch is to initialse
the accessed information with non-atomic operations before the page is
visible.

The bulk of filesystems directly or indirectly use
grab_cache_page_write_begin or find_or_create_page for the initial
allocation of a page cache page.  This patch adds an init_page_accessed()
helper which behaves like the first call to mark_page_accessed() but may
called before the page is visible and can be done non-atomically.

The primary APIs of concern in this care are the following and are used
by most filesystems.

	find_get_page
	find_lock_page
	find_or_create_page
	grab_cache_page_nowait
	grab_cache_page_write_begin

All of them are very similar in detail to the patch creates a core helper
pagecache_get_page() which takes a flags parameter that affects its
behavior such as whether the page should be marked accessed or not.  Then
old API is preserved but is basically a thin wrapper around this core
function.

Each of the filesystems are then updated to avoid calling
mark_page_accessed when it is known that the VM interfaces have already
done the job.  There is a slight snag in that the timing of the
mark_page_accessed() has now changed so in rare cases it's possible a page
gets to the end of the LRU as PageReferenced where as previously it might
have been repromoted.  This is expected to be rare but it's worth the
filesystem people thinking about it in case they see a problem with the
timing change.  It is also the case that some filesystems may be marking
pages accessed that previously did not but it makes sense that filesystems
have consistent behaviour in this regard.

The test case used to evaulate this is a simple dd of a large file done
multiple times with the file deleted on each iterations.  The size of the
file is 1/10th physical memory to avoid dirty page balancing.  In the
async case it will be possible that the workload completes without even
hitting the disk and will have variable results but highlight the impact
of mark_page_accessed for async IO.  The sync results are expected to be
more stable.  The exception is tmpfs where the normal case is for the "IO"
to not hit the disk.

The test machine was single socket and UMA to avoid any scheduling or NUMA
artifacts.  Throughput and wall times are presented for sync IO, only wall
times are shown for async as the granularity reported by dd and the
variability is unsuitable for comparison.  As async results were variable
do to writback timings, I'm only reporting the maximum figures.  The sync
results were stable enough to make the mean and stddev uninteresting.

The performance results are reported based on a run with no profiling.
Profile data is based on a separate run with oprofile running.

async dd
                                    3.15.0-rc3            3.15.0-rc3
                                       vanilla           accessed-v2
ext3    Max      elapsed     13.9900 (  0.00%)     11.5900 ( 17.16%)
tmpfs	Max      elapsed      0.5100 (  0.00%)      0.4900 (  3.92%)
btrfs   Max      elapsed     12.8100 (  0.00%)     12.7800 (  0.23%)
ext4	Max      elapsed     18.6000 (  0.00%)     13.3400 ( 28.28%)
xfs	Max      elapsed     12.5600 (  0.00%)      2.0900 ( 83.36%)

The XFS figure is a bit strange as it managed to avoid a worst case by
sheer luck but the average figures looked reasonable.

        samples percentage
ext3       86107    0.9783  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
ext3       23833    0.2710  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
ext3        5036    0.0573  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
ext4       64566    0.8961  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
ext4        5322    0.0713  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
ext4        2869    0.0384  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
xfs        62126    1.7675  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
xfs         1904    0.0554  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
xfs          103    0.0030  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
btrfs      10655    0.1338  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
btrfs       2020    0.0273  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
btrfs        587    0.0079  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
tmpfs      59562    3.2628  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
tmpfs       1210    0.0696  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
tmpfs         94    0.0054  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed

[akpm@linux-foundation.org: don't run init_page_accessed() against an uninitialised pointer]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Tested-by: Prabhakar Lad <prabhakar.csengg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:10 -07:00
Mel Gorman e7470ee89f fs: buffer: do not use unnecessary atomic operations when discarding buffers
Discarding buffers uses a bunch of atomic operations when discarding
buffers because ......  I can't think of a reason.  Use a cmpxchg loop to
clear all the necessary flags.  In most (all?) cases this will be a single
atomic operations.

[akpm@linux-foundation.org: move BUFFER_FLAGS_DISCARD into the .c file]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:10 -07:00
Mel Gorman b745bc85f2 mm: page_alloc: convert hot/cold parameter and immediate callers to bool
cold is a bool, make it one.  Make the likely case the "if" part of the
block instead of the else as according to the optimisation manual this is
preferred.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:09 -07:00