Merge tag 'for-f2fs-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "This round introduces several interesting features such as on-disk NAT
  bitmaps, IO alignment, and a discard thread. And it includes a couple
  of major bug fixes as below.

  Enhancements:

   - introduce on-disk bitmaps to avoid scanning NAT blocks when getting
     free nids

   - support IO alignment to prepare for open-channel SSD integration in
     the future

   - introduce a discard thread to avoid long latency during checkpoint
     and fstrim

   - use SSR for warm node and enable inline_xattr by default

   - introduce in-memory bitmaps to check FS consistency for debugging

   - improve write_begin by avoiding needless read IO

  Bug fixes:

   - fix broken zone_reset behavior for SMR drive

   - fix wrong victim selection policy during GC

   - fix missing behavior when preparing discard commands

   - fix bugs in atomic write support and fiemap

   - work around multiple f2fs_add_link calls having the same name

  ... and it includes a bunch of clean-up patches as well"
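
The free-nid bitmap is the headline change here. The idea, roughly: keep one
bit per NAT entry so that allocating a nid becomes a bit-scan over cached
state rather than reading and parsing on-disk NAT blocks. A standalone sketch
of that idea (illustrative only, not the kernel implementation):

/*
 * Rough sketch of the free-nid bitmap idea (illustrative, not the
 * kernel code): one bit per nid, clear = free. Finding a free nid is
 * a bit-scan instead of a NAT block read.
 */
#include <stdint.h>

static int find_free_nid(const uint8_t *bitmap, unsigned int nr_nids)
{
	unsigned int nid;

	for (nid = 0; nid < nr_nids; nid++)
		if (!(bitmap[nid / 8] & (1u << (nid % 8))))
			return (int)nid;	/* bit clear: nid is free */
	return -1;			/* no free nid tracked */
}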

* tag 'for-f2fs-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (97 commits)
  f2fs: avoid to flush nat journal entries
  f2fs: avoid to issue redundant discard commands
  f2fs: fix a plint compile warning
  f2fs: add f2fs_drop_inode tracepoint
  f2fs: Fix zoned block device support
  f2fs: remove redundant set_page_dirty()
  f2fs: fix to enlarge size of write_io_dummy mempool
  f2fs: fix memory leak of write_io_dummy mempool during umount
  f2fs: fix to update F2FS_{CP_}WB_DATA count correctly
  f2fs: use MAX_FREE_NIDS for the free nids target
  f2fs: introduce free nid bitmap
  f2fs: new helper cur_cp_crc() getting crc in f2fs_checkpoint
  f2fs: update the comment of default nr_pages to skipping
  f2fs: drop the duplicate pval in f2fs_getxattr
  f2fs: Don't update the xattr data that same as the exist
  f2fs: kill __is_extent_same
  f2fs: avoid bggc->fggc when enough free segments are avaliable after cp
  f2fs: select target segment with closer temperature in SSR mode
  f2fs: show simple call stack in fault injection message
  f2fs: no need lock_op in f2fs_write_inline_data
  ...
Commit 25c4e6c3f0, merged by Linus Torvalds on 2017-03-01 15:55:04 -08:00.
21 changed files with 1969 additions and 813 deletions.

diff --git a/Documentation/filesystems/f2fs.txt b/Documentation/filesystems/f2fs.txt
--- a/Documentation/filesystems/f2fs.txt
+++ b/Documentation/filesystems/f2fs.txt
@@ -125,13 +125,14 @@ active_logs=%u         Support configuring the number of active logs. In the
 disable_ext_identify   Disable the extension list configured by mkfs, so f2fs
                        does not aware of cold files such as media files.
 inline_xattr           Enable the inline xattrs feature.
+noinline_xattr         Disable the inline xattrs feature.
 inline_data            Enable the inline data feature: New created small(<~3.4k)
                        files can be written into inode block.
 inline_dentry          Enable the inline dir feature: data in new created
                        directory entries can be written into inode block. The
                        space of inode block which is used to store inline
                        dentries is limited to ~3.4k.
-noinline_dentry        Diable the inline dentry feature.
+noinline_dentry        Disable the inline dentry feature.
 flush_merge            Merge concurrent cache_flush commands as much as possible
                        to eliminate redundant command issues. If the underlying
                        device handles the cache_flush command relatively slowly,
@@ -157,6 +158,8 @@ data_flush             Enable data flushing before checkpoint in order to
 mode=%s                Control block allocation mode which supports "adaptive"
                        and "lfs". In "lfs" mode, there should be no random
                        writes towards main area.
+io_bits=%u             Set the bit size of write IO requests. It should be set
+                       with "mode=lfs".
 
 ================================================================================
 DEBUGFS ENTRIES
@@ -174,7 +177,7 @@ f2fs. Each file shows the whole f2fs information.
 ================================================================================
 SYSFS ENTRIES
 ================================================================================
-Information about mounted f2f2 file systems can be found in
+Information about mounted f2fs file systems can be found in
 /sys/fs/f2fs. Each mounted filesystem will have a directory in
 /sys/fs/f2fs based on its device name (i.e., /sys/fs/f2fs/sda).
 The files in each per-device directory are shown in table below.
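
The io_bits option documented above pairs with mode=lfs. A minimal sketch of
enabling it from userspace (device and mountpoint names are placeholders;
io_bits=2 requests write IOs aligned to 1 << 2 = 4 blocks):

/*
 * Minimal sketch (not from the patch): mounting f2fs with the new
 * IO-alignment option. Per the documentation above, "io_bits" must be
 * combined with "mode=lfs".
 */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	if (mount("/dev/sdb1", "/mnt/f2fs", "f2fs", 0,
			"mode=lfs,io_bits=2") != 0) {
		perror("mount");
		return 1;
	}
	return 0;
}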

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -249,7 +249,8 @@ static int f2fs_write_meta_page(struct page *page,
 	dec_page_count(sbi, F2FS_DIRTY_META);
 
 	if (wbc->for_reclaim)
-		f2fs_submit_merged_bio_cond(sbi, NULL, page, 0, META, WRITE);
+		f2fs_submit_merged_bio_cond(sbi, page->mapping->host,
+						0, page->index, META, WRITE);
 
 	unlock_page(page);
@@ -493,6 +494,7 @@ int acquire_orphan_inode(struct f2fs_sb_info *sbi)
 #ifdef CONFIG_F2FS_FAULT_INJECTION
 	if (time_to_inject(sbi, FAULT_ORPHAN)) {
 		spin_unlock(&im->ino_lock);
+		f2fs_show_injection_info(FAULT_ORPHAN);
 		return -ENOSPC;
 	}
 #endif
@@ -681,8 +683,7 @@ static int get_checkpoint_version(struct f2fs_sb_info *sbi, block_t cp_addr,
 		return -EINVAL;
 	}
 
-	crc = le32_to_cpu(*((__le32 *)((unsigned char *)*cp_block
-						+ crc_offset)));
+	crc = cur_cp_crc(*cp_block);
 	if (!f2fs_crc_valid(sbi, crc, *cp_block, crc_offset)) {
 		f2fs_msg(sbi->sb, KERN_WARNING, "invalid crc value");
 		return -EINVAL;
@@ -891,7 +892,7 @@ retry:
 				F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
 		return 0;
 	}
-	fi = list_entry(head->next, struct f2fs_inode_info, dirty_list);
+	fi = list_first_entry(head, struct f2fs_inode_info, dirty_list);
 	inode = igrab(&fi->vfs_inode);
 	spin_unlock(&sbi->inode_lock[type]);
 	if (inode) {
@@ -924,7 +925,7 @@ int f2fs_sync_inode_meta(struct f2fs_sb_info *sbi)
 		spin_unlock(&sbi->inode_lock[DIRTY_META]);
 		return 0;
 	}
-	fi = list_entry(head->next, struct f2fs_inode_info,
+	fi = list_first_entry(head, struct f2fs_inode_info,
 							gdirty_list);
 	inode = igrab(&fi->vfs_inode);
 	spin_unlock(&sbi->inode_lock[DIRTY_META]);
@@ -998,8 +999,6 @@ out:
 static void unblock_operations(struct f2fs_sb_info *sbi)
 {
 	up_write(&sbi->node_write);
-
-	build_free_nids(sbi, false);
 	f2fs_unlock_all(sbi);
 }
@@ -1025,6 +1024,10 @@ static void update_ckpt_flags(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	spin_lock(&sbi->cp_lock);
 
+	if (cpc->reason == CP_UMOUNT && ckpt->cp_pack_total_block_count >
+			sbi->blocks_per_seg - NM_I(sbi)->nat_bits_blocks)
+		disable_nat_bits(sbi, false);
+
 	if (cpc->reason == CP_UMOUNT)
 		__set_ckpt_flags(ckpt, CP_UMOUNT_FLAG);
 	else
@@ -1137,6 +1140,28 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	start_blk = __start_cp_next_addr(sbi);
 
+	/* write nat bits */
+	if (enabled_nat_bits(sbi, cpc)) {
+		__u64 cp_ver = cur_cp_version(ckpt);
+		unsigned int i;
+		block_t blk;
+
+		cp_ver |= ((__u64)crc32 << 32);
+		*(__le64 *)nm_i->nat_bits = cpu_to_le64(cp_ver);
+
+		blk = start_blk + sbi->blocks_per_seg - nm_i->nat_bits_blocks;
+		for (i = 0; i < nm_i->nat_bits_blocks; i++)
+			update_meta_page(sbi, nm_i->nat_bits +
+					(i << F2FS_BLKSIZE_BITS), blk + i);
+
+		/* Flush all the NAT BITS pages */
+		while (get_pages(sbi, F2FS_DIRTY_META)) {
+			sync_meta_pages(sbi, META, LONG_MAX);
+			if (unlikely(f2fs_cp_error(sbi)))
+				return -EIO;
+		}
+	}
+
 	/* need to wait for end_io results */
 	wait_on_all_pages_writeback(sbi);
 	if (unlikely(f2fs_cp_error(sbi)))
@@ -1248,15 +1273,20 @@ int write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	f2fs_flush_merged_bios(sbi);
 
 	/* this is the case of multiple fstrims without any changes */
-	if (cpc->reason == CP_DISCARD && !is_sbi_flag_set(sbi, SBI_IS_DIRTY)) {
-		f2fs_bug_on(sbi, NM_I(sbi)->dirty_nat_cnt);
-		f2fs_bug_on(sbi, SIT_I(sbi)->dirty_sentries);
-		f2fs_bug_on(sbi, prefree_segments(sbi));
-		flush_sit_entries(sbi, cpc);
-		clear_prefree_segments(sbi, cpc);
-		f2fs_wait_all_discard_bio(sbi);
-		unblock_operations(sbi);
-		goto out;
+	if (cpc->reason == CP_DISCARD) {
+		if (!exist_trim_candidates(sbi, cpc)) {
+			unblock_operations(sbi);
+			goto out;
+		}
+
+		if (NM_I(sbi)->dirty_nat_cnt == 0 &&
+				SIT_I(sbi)->dirty_sentries == 0 &&
+				prefree_segments(sbi) == 0) {
+			flush_sit_entries(sbi, cpc);
+			clear_prefree_segments(sbi, cpc);
+			unblock_operations(sbi);
+			goto out;
+		}
 	}
 
 	/*
@@ -1268,17 +1298,15 @@ int write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	ckpt->checkpoint_ver = cpu_to_le64(++ckpt_ver);
 
 	/* write cached NAT/SIT entries to NAT/SIT area */
-	flush_nat_entries(sbi);
+	flush_nat_entries(sbi, cpc);
 	flush_sit_entries(sbi, cpc);
 
 	/* unlock all the fs_lock[] in do_checkpoint() */
 	err = do_checkpoint(sbi, cpc);
-	if (err) {
+	if (err)
 		release_discard_addrs(sbi);
-	} else {
+	else
 		clear_prefree_segments(sbi, cpc);
-		f2fs_wait_all_discard_bio(sbi);
-	}
 
 	unblock_operations(sbi);
 	stat_inc_cp_count(sbi->stat_info);
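
The nat_bits bookkeeping in do_checkpoint() above boils down to two small
computations: pack the checkpoint version and CRC into one 64-bit header
word, and aim the payload at the tail of the checkpoint pack. A standalone
sketch (plain integer types stand in for the kernel types; illustrative only):

/* Illustrative sketch of the nat_bits arithmetic in do_checkpoint(). */
#include <assert.h>
#include <stdint.h>

/* low 32 bits: checkpoint version; high 32 bits: checkpoint CRC */
static uint64_t pack_cp_ver_crc(uint32_t cp_ver, uint32_t crc)
{
	return (uint64_t)cp_ver | ((uint64_t)crc << 32);
}

/* nat_bits occupy the last nat_bits_blocks blocks of the segment */
static uint32_t nat_bits_start_blk(uint32_t start_blk,
				   uint32_t blocks_per_seg,
				   uint32_t nat_bits_blocks)
{
	return start_blk + blocks_per_seg - nat_bits_blocks;
}

int main(void)
{
	assert(pack_cp_ver_crc(7, 0xabcd) == (((uint64_t)0xabcd << 32) | 7));
	assert(nat_bits_start_blk(4096, 512, 2) == 4606);
	return 0;
}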

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -55,8 +55,10 @@ static void f2fs_read_end_io(struct bio *bio)
 	int i;
 
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-	if (time_to_inject(F2FS_P_SB(bio->bi_io_vec->bv_page), FAULT_IO))
+	if (time_to_inject(F2FS_P_SB(bio->bi_io_vec->bv_page), FAULT_IO)) {
+		f2fs_show_injection_info(FAULT_IO);
 		bio->bi_error = -EIO;
+	}
 #endif
 
 	if (f2fs_bio_encrypted(bio)) {
@@ -93,6 +95,17 @@ static void f2fs_write_end_io(struct bio *bio)
 		struct page *page = bvec->bv_page;
 		enum count_type type = WB_DATA_TYPE(page);
 
+		if (IS_DUMMY_WRITTEN_PAGE(page)) {
+			set_page_private(page, (unsigned long)NULL);
+			ClearPagePrivate(page);
+			unlock_page(page);
+			mempool_free(page, sbi->write_io_dummy);
+
+			if (unlikely(bio->bi_error))
+				f2fs_stop_checkpoint(sbi, true);
+			continue;
+		}
+
 		fscrypt_pullback_bio_page(&page, true);
 
 		if (unlikely(bio->bi_error)) {
@@ -171,10 +184,46 @@ static inline void __submit_bio(struct f2fs_sb_info *sbi,
 				struct bio *bio, enum page_type type)
 {
 	if (!is_read_io(bio_op(bio))) {
+		unsigned int start;
+
 		if (f2fs_sb_mounted_blkzoned(sbi->sb) &&
 			current->plug && (type == DATA || type == NODE))
 			blk_finish_plug(current->plug);
+
+		if (type != DATA && type != NODE)
+			goto submit_io;
+
+		start = bio->bi_iter.bi_size >> F2FS_BLKSIZE_BITS;
+		start %= F2FS_IO_SIZE(sbi);
+
+		if (start == 0)
+			goto submit_io;
+
+		/* fill dummy pages */
+		for (; start < F2FS_IO_SIZE(sbi); start++) {
+			struct page *page =
+				mempool_alloc(sbi->write_io_dummy,
+					GFP_NOIO | __GFP_ZERO | __GFP_NOFAIL);
+			f2fs_bug_on(sbi, !page);
+
+			SetPagePrivate(page);
+			set_page_private(page, (unsigned long)DUMMY_WRITTEN_PAGE);
+			lock_page(page);
+			if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE)
+				f2fs_bug_on(sbi, 1);
+		}
+		/*
+		 * In the NODE case, we lose next block address chain. So, we
+		 * need to do checkpoint in f2fs_sync_file.
+		 */
+		if (type == NODE)
+			set_sbi_flag(sbi, SBI_NEED_CP);
 	}
+submit_io:
+	if (is_read_io(bio_op(bio)))
+		trace_f2fs_submit_read_bio(sbi->sb, type, bio);
+	else
+		trace_f2fs_submit_write_bio(sbi->sb, type, bio);
 	submit_bio(bio);
 }
@@ -185,19 +234,19 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
 	if (!io->bio)
 		return;
 
-	if (is_read_io(fio->op))
-		trace_f2fs_submit_read_bio(io->sbi->sb, fio, io->bio);
-	else
-		trace_f2fs_submit_write_bio(io->sbi->sb, fio, io->bio);
-
 	bio_set_op_attrs(io->bio, fio->op, fio->op_flags);
 
+	if (is_read_io(fio->op))
+		trace_f2fs_prepare_read_bio(io->sbi->sb, fio->type, io->bio);
+	else
+		trace_f2fs_prepare_write_bio(io->sbi->sb, fio->type, io->bio);
+
 	__submit_bio(io->sbi, io->bio, fio->type);
 	io->bio = NULL;
 }
 
-static bool __has_merged_page(struct f2fs_bio_info *io, struct inode *inode,
-						struct page *page, nid_t ino)
+static bool __has_merged_page(struct f2fs_bio_info *io,
+				struct inode *inode, nid_t ino, pgoff_t idx)
 {
 	struct bio_vec *bvec;
 	struct page *target;
@@ -206,7 +255,7 @@ static bool __has_merged_page(struct f2fs_bio_info *io, struct inode *inode,
 	if (!io->bio)
 		return false;
 
-	if (!inode && !page && !ino)
+	if (!inode && !ino)
 		return true;
 
 	bio_for_each_segment_all(bvec, io->bio, i) {
@@ -216,10 +265,11 @@ static bool __has_merged_page(struct f2fs_bio_info *io, struct inode *inode,
 		else
 			target = fscrypt_control_page(bvec->bv_page);
 
+		if (idx != target->index)
+			continue;
+
 		if (inode && inode == target->mapping->host)
 			return true;
-		if (page && page == target)
-			return true;
 		if (ino && ino == ino_of_node(target))
 			return true;
 	}
@@ -228,22 +278,21 @@ static bool __has_merged_page(struct f2fs_bio_info *io, struct inode *inode,
 }
 
 static bool has_merged_page(struct f2fs_sb_info *sbi, struct inode *inode,
-				struct page *page, nid_t ino,
-				enum page_type type)
+				nid_t ino, pgoff_t idx, enum page_type type)
 {
 	enum page_type btype = PAGE_TYPE_OF_BIO(type);
 	struct f2fs_bio_info *io = &sbi->write_io[btype];
 	bool ret;
 
 	down_read(&io->io_rwsem);
-	ret = __has_merged_page(io, inode, page, ino);
+	ret = __has_merged_page(io, inode, ino, idx);
 	up_read(&io->io_rwsem);
 	return ret;
 }
 
 static void __f2fs_submit_merged_bio(struct f2fs_sb_info *sbi,
-				struct inode *inode, struct page *page,
-				nid_t ino, enum page_type type, int rw)
+				struct inode *inode, nid_t ino, pgoff_t idx,
+				enum page_type type, int rw)
 {
 	enum page_type btype = PAGE_TYPE_OF_BIO(type);
 	struct f2fs_bio_info *io;
@@ -252,16 +301,16 @@ static void __f2fs_submit_merged_bio(struct f2fs_sb_info *sbi,
 	down_write(&io->io_rwsem);
 
-	if (!__has_merged_page(io, inode, page, ino))
+	if (!__has_merged_page(io, inode, ino, idx))
 		goto out;
 
 	/* change META to META_FLUSH in the checkpoint procedure */
 	if (type >= META_FLUSH) {
 		io->fio.type = META_FLUSH;
 		io->fio.op = REQ_OP_WRITE;
-		io->fio.op_flags = REQ_PREFLUSH | REQ_META | REQ_PRIO;
+		io->fio.op_flags = REQ_META | REQ_PRIO;
 		if (!test_opt(sbi, NOBARRIER))
-			io->fio.op_flags |= REQ_FUA;
+			io->fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
 	}
 	__submit_merged_bio(io);
 out:
@@ -271,15 +320,15 @@ out:
 void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi, enum page_type type,
 									int rw)
 {
-	__f2fs_submit_merged_bio(sbi, NULL, NULL, 0, type, rw);
+	__f2fs_submit_merged_bio(sbi, NULL, 0, 0, type, rw);
 }
 
 void f2fs_submit_merged_bio_cond(struct f2fs_sb_info *sbi,
-				struct inode *inode, struct page *page,
-				nid_t ino, enum page_type type, int rw)
+				struct inode *inode, nid_t ino, pgoff_t idx,
+				enum page_type type, int rw)
 {
-	if (has_merged_page(sbi, inode, page, ino, type))
-		__f2fs_submit_merged_bio(sbi, inode, page, ino, type, rw);
+	if (has_merged_page(sbi, inode, ino, idx, type))
+		__f2fs_submit_merged_bio(sbi, inode, ino, idx, type, rw);
 }
 
 void f2fs_flush_merged_bios(struct f2fs_sb_info *sbi)
@@ -315,13 +364,14 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
 	return 0;
 }
 
-void f2fs_submit_page_mbio(struct f2fs_io_info *fio)
+int f2fs_submit_page_mbio(struct f2fs_io_info *fio)
 {
 	struct f2fs_sb_info *sbi = fio->sbi;
 	enum page_type btype = PAGE_TYPE_OF_BIO(fio->type);
 	struct f2fs_bio_info *io;
 	bool is_read = is_read_io(fio->op);
 	struct page *bio_page;
+	int err = 0;
 
 	io = is_read ? &sbi->read_io : &sbi->write_io[btype];
@@ -331,6 +381,9 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *fio)
 
 	bio_page = fio->encrypted_page ? fio->encrypted_page : fio->page;
 
+	/* set submitted = 1 as a return value */
+	fio->submitted = 1;
+
 	if (!is_read)
 		inc_page_count(sbi, WB_DATA_TYPE(bio_page));
@@ -342,6 +395,13 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *fio)
 		__submit_merged_bio(io);
 alloc_new:
 	if (io->bio == NULL) {
+		if ((fio->type == DATA || fio->type == NODE) &&
+				fio->new_blkaddr & F2FS_IO_SIZE_MASK(sbi)) {
+			err = -EAGAIN;
+			if (!is_read)
+				dec_page_count(sbi, WB_DATA_TYPE(bio_page));
+			goto out_fail;
+		}
 		io->bio = __bio_alloc(sbi, fio->new_blkaddr,
 						BIO_MAX_PAGES, is_read);
 		io->fio = *fio;
@@ -355,9 +415,10 @@ alloc_new:
 	io->last_block_in_bio = fio->new_blkaddr;
 	f2fs_trace_ios(fio, 0);
 
+out_fail:
 	up_write(&io->io_rwsem);
 	trace_f2fs_submit_page_mbio(fio->page, fio);
+	return err;
 }
 
 static void __set_data_blkaddr(struct dnode_of_data *dn)
@@ -453,7 +514,7 @@ int f2fs_reserve_block(struct dnode_of_data *dn, pgoff_t index)
 
 int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index)
 {
-	struct extent_info ei;
+	struct extent_info ei = {0,0,0};
 	struct inode *inode = dn->inode;
 
 	if (f2fs_lookup_extent_cache(inode, index, &ei)) {
@@ -470,7 +531,7 @@ struct page *get_read_data_page(struct inode *inode, pgoff_t index,
 	struct address_space *mapping = inode->i_mapping;
 	struct dnode_of_data dn;
 	struct page *page;
-	struct extent_info ei;
+	struct extent_info ei = {0,0,0};
 	int err;
 	struct f2fs_io_info fio = {
 		.sbi = F2FS_I_SB(inode),
@@ -694,6 +755,9 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
 	struct f2fs_map_blocks map;
 	int err = 0;
 
+	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
+		return 0;
+
 	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
 	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
 	if (map.m_len > map.m_lblk)
@@ -742,7 +806,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
 	int err = 0, ofs = 1;
 	unsigned int ofs_in_node, last_ofs_in_node;
 	blkcnt_t prealloc;
-	struct extent_info ei;
+	struct extent_info ei = {0,0,0};
 	block_t blkaddr;
 
 	if (!maxblocks)
@@ -806,7 +870,7 @@ next_block:
 			}
 			if (err)
 				goto sync_out;
-			map->m_flags = F2FS_MAP_NEW;
+			map->m_flags |= F2FS_MAP_NEW;
 			blkaddr = dn.data_blkaddr;
 		} else {
 			if (flag == F2FS_GET_BLOCK_BMAP) {
@@ -906,7 +970,7 @@ static int __get_data_block(struct inode *inode, sector_t iblock,
 	if (!err) {
 		map_bh(bh, inode->i_sb, map.m_pblk);
 		bh->b_state = (bh->b_state & ~F2FS_MAP_FLAGS) | map.m_flags;
-		bh->b_size = map.m_len << inode->i_blkbits;
+		bh->b_size = (u64)map.m_len << inode->i_blkbits;
 	}
 	return err;
 }
@@ -1088,7 +1152,7 @@ static int f2fs_mpage_readpages(struct address_space *mapping,
 
 		prefetchw(&page->flags);
 		if (pages) {
-			page = list_entry(pages->prev, struct page, lru);
+			page = list_last_entry(pages, struct page, lru);
 			list_del(&page->lru);
 			if (add_to_page_cache_lru(page, mapping,
 						  page->index,
@@ -1207,7 +1271,7 @@ static int f2fs_read_data_pages(struct file *file,
 			struct list_head *pages, unsigned nr_pages)
 {
 	struct inode *inode = file->f_mapping->host;
-	struct page *page = list_entry(pages->prev, struct page, lru);
+	struct page *page = list_last_entry(pages, struct page, lru);
 
 	trace_f2fs_readpages(inode, page, nr_pages);
@@ -1288,8 +1352,8 @@ out_writepage:
 	return err;
 }
 
-static int f2fs_write_data_page(struct page *page,
+static int __write_data_page(struct page *page, bool *submitted,
 					struct writeback_control *wbc)
 {
 	struct inode *inode = page->mapping->host;
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
@@ -1307,6 +1371,7 @@ static int f2fs_write_data_page(struct page *page,
 		.op_flags = wbc_to_write_flags(wbc),
 		.page = page,
 		.encrypted_page = NULL,
+		.submitted = false,
 	};
 
 	trace_f2fs_writepage(page, DATA);
@@ -1352,9 +1417,12 @@ write:
 		goto redirty_out;
 
 	err = -EAGAIN;
-	f2fs_lock_op(sbi);
-	if (f2fs_has_inline_data(inode))
+	if (f2fs_has_inline_data(inode)) {
 		err = f2fs_write_inline_data(inode, page);
+		if (!err)
+			goto out;
+	}
+	f2fs_lock_op(sbi);
 	if (err == -EAGAIN)
 		err = do_write_data_page(&fio);
 	if (F2FS_I(inode)->last_disk_size < psize)
@@ -1370,15 +1438,22 @@ out:
 		ClearPageUptodate(page);
 
 	if (wbc->for_reclaim) {
-		f2fs_submit_merged_bio_cond(sbi, NULL, page, 0, DATA, WRITE);
+		f2fs_submit_merged_bio_cond(sbi, inode, 0, page->index,
+						DATA, WRITE);
 		remove_dirty_inode(inode);
+		submitted = NULL;
 	}
 
 	unlock_page(page);
 	f2fs_balance_fs(sbi, need_balance_fs);
 
-	if (unlikely(f2fs_cp_error(sbi)))
+	if (unlikely(f2fs_cp_error(sbi))) {
 		f2fs_submit_merged_bio(sbi, DATA, WRITE);
+		submitted = NULL;
+	}
+
+	if (submitted)
+		*submitted = fio.submitted;
 
 	return 0;
@@ -1390,6 +1465,12 @@ redirty_out:
 	return err;
 }
 
+static int f2fs_write_data_page(struct page *page,
+					struct writeback_control *wbc)
+{
+	return __write_data_page(page, NULL, wbc);
+}
+
 /*
  * This function was copied from write_cche_pages from mm/page-writeback.c.
  * The major change is making write step of cold data page separately from
@@ -1406,10 +1487,10 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
 	pgoff_t index;
 	pgoff_t end;		/* Inclusive */
 	pgoff_t done_index;
+	pgoff_t last_idx = ULONG_MAX;
 	int cycled;
 	int range_whole = 0;
 	int tag;
-	int nwritten = 0;
 
 	pagevec_init(&pvec, 0);
@@ -1446,6 +1527,7 @@ retry:
 
 		for (i = 0; i < nr_pages; i++) {
 			struct page *page = pvec.pages[i];
+			bool submitted = false;
 
 			if (page->index > end) {
 				done = 1;
				break;
@@ -1479,7 +1561,7 @@ continue_unlock:
 			if (!clear_page_dirty_for_io(page))
 				goto continue_unlock;
 
-			ret = mapping->a_ops->writepage(page, wbc);
+			ret = __write_data_page(page, &submitted, wbc);
 			if (unlikely(ret)) {
 				/*
 				 * keep nr_to_write, since vfs uses this to
@@ -1493,8 +1575,8 @@ continue_unlock:
 					done_index = page->index + 1;
 					done = 1;
 					break;
-				} else {
-					nwritten++;
+				} else if (submitted) {
+					last_idx = page->index;
 				}
 
 				if (--wbc->nr_to_write <= 0 &&
@@ -1516,9 +1598,9 @@ continue_unlock:
 	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
 		mapping->writeback_index = done_index;
 
-	if (nwritten)
+	if (last_idx != ULONG_MAX)
 		f2fs_submit_merged_bio_cond(F2FS_M_SB(mapping), mapping->host,
-						NULL, 0, DATA, WRITE);
+						0, last_idx, DATA, WRITE);
 
 	return ret;
 }
@@ -1591,14 +1673,15 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,
 	struct dnode_of_data dn;
 	struct page *ipage;
 	bool locked = false;
-	struct extent_info ei;
+	struct extent_info ei = {0,0,0};
 	int err = 0;
 
 	/*
 	 * we already allocated all the blocks, so we don't need to get
 	 * the block addresses when there is no need to fill the page.
 	 */
-	if (!f2fs_has_inline_data(inode) && len == PAGE_SIZE)
+	if (!f2fs_has_inline_data(inode) && len == PAGE_SIZE &&
+			!is_inode_flag_set(inode, FI_NO_PREALLOC))
 		return 0;
 
 	if (f2fs_has_inline_data(inode) ||
@@ -1682,7 +1765,12 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
 		goto fail;
 	}
 repeat:
-	page = grab_cache_page_write_begin(mapping, index, flags);
+	/*
+	 * Do not use grab_cache_page_write_begin() to avoid deadlock due to
+	 * wait_for_stable_page. Will wait that below with our IO control.
+	 */
+	page = pagecache_get_page(mapping, index,
+				FGP_LOCK | FGP_WRITE | FGP_CREAT, GFP_NOFS);
 	if (!page) {
 		err = -ENOMEM;
 		goto fail;
@@ -1715,6 +1803,11 @@ repeat:
 	if (len == PAGE_SIZE || PageUptodate(page))
 		return 0;
 
+	if (!(pos & (PAGE_SIZE - 1)) && (pos + len) >= i_size_read(inode)) {
+		zero_user_segment(page, len, PAGE_SIZE);
+		return 0;
+	}
+
 	if (blkaddr == NEW_ADDR) {
 		zero_user_segment(page, 0, PAGE_SIZE);
 		SetPageUptodate(page);
@@ -1768,7 +1861,7 @@ static int f2fs_write_end(struct file *file,
 	 * let generic_perform_write() try to copy data again through copied=0.
 	 */
 	if (!PageUptodate(page)) {
-		if (unlikely(copied != PAGE_SIZE))
+		if (unlikely(copied != len))
 			copied = 0;
 		else
 			SetPageUptodate(page);
@@ -1917,7 +2010,7 @@ static int f2fs_set_data_page_dirty(struct page *page)
 	if (!PageUptodate(page))
 		SetPageUptodate(page);
 
-	if (f2fs_is_atomic_file(inode)) {
+	if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)) {
 		if (!IS_ATOMIC_WRITTEN_PAGE(page)) {
 			register_inmem_page(inode, page);
 			return 1;
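
The dummy-page loop added to __submit_bio() above is, at heart, modular
arithmetic: pad the bio from its current block count up to the next
F2FS_IO_SIZE boundary. A standalone sketch of just that computation
(illustrative, not the kernel code):

/*
 * How many zeroed dummy pages must be appended before submission,
 * given the bio size in blocks and the configured IO size
 * (1 << io_bits blocks). Mirrors the "start" arithmetic above.
 */
#include <assert.h>

static unsigned int dummy_pages_needed(unsigned int bio_blocks,
				       unsigned int io_size_blocks)
{
	unsigned int start = bio_blocks % io_size_blocks;

	return start ? io_size_blocks - start : 0;
}

int main(void)
{
	/* io_bits=2 -> 4-block alignment */
	assert(dummy_pages_needed(5, 4) == 3);	/* pad 5 up to 8 */
	assert(dummy_pages_needed(8, 4) == 0);	/* already aligned */
	return 0;
}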

diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -50,8 +50,16 @@ static void update_general_status(struct f2fs_sb_info *sbi)
 	si->ndirty_files = sbi->ndirty_inode[FILE_INODE];
 	si->ndirty_all = sbi->ndirty_inode[DIRTY_META];
 	si->inmem_pages = get_pages(sbi, F2FS_INMEM_PAGES);
+	si->aw_cnt = atomic_read(&sbi->aw_cnt);
+	si->max_aw_cnt = atomic_read(&sbi->max_aw_cnt);
 	si->nr_wb_cp_data = get_pages(sbi, F2FS_WB_CP_DATA);
 	si->nr_wb_data = get_pages(sbi, F2FS_WB_DATA);
+	if (SM_I(sbi) && SM_I(sbi)->fcc_info)
+		si->nr_flush =
+			atomic_read(&SM_I(sbi)->fcc_info->submit_flush);
+	if (SM_I(sbi) && SM_I(sbi)->dcc_info)
+		si->nr_discard =
+			atomic_read(&SM_I(sbi)->dcc_info->submit_discard);
 	si->total_count = (int)sbi->user_block_count / sbi->blocks_per_seg;
 	si->rsvd_segs = reserved_segments(sbi);
 	si->overp_segs = overprovision_segments(sbi);
@@ -62,6 +70,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
 	si->inline_xattr = atomic_read(&sbi->inline_xattr);
 	si->inline_inode = atomic_read(&sbi->inline_inode);
 	si->inline_dir = atomic_read(&sbi->inline_dir);
+	si->append = sbi->im[APPEND_INO].ino_num;
+	si->update = sbi->im[UPDATE_INO].ino_num;
 	si->orphans = sbi->im[ORPHAN_INO].ino_num;
 	si->utilization = utilization(sbi);
@@ -183,6 +193,9 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
 	/* build nm */
 	si->base_mem += sizeof(struct f2fs_nm_info);
 	si->base_mem += __bitmap_size(sbi, NAT_BITMAP);
+	si->base_mem += (NM_I(sbi)->nat_bits_blocks << F2FS_BLKSIZE_BITS);
+	si->base_mem += NM_I(sbi)->nat_blocks * NAT_ENTRY_BITMAP_SIZE;
+	si->base_mem += NM_I(sbi)->nat_blocks / 8;
 
 get_cache:
 	si->cache_mem = 0;
@@ -192,8 +205,10 @@ get_cache:
 		si->cache_mem += sizeof(struct f2fs_gc_kthread);
 
 	/* build merge flush thread */
-	if (SM_I(sbi)->cmd_control_info)
+	if (SM_I(sbi)->fcc_info)
 		si->cache_mem += sizeof(struct flush_cmd_control);
+	if (SM_I(sbi)->dcc_info)
+		si->cache_mem += sizeof(struct discard_cmd_control);
 
 	/* free nids */
 	si->cache_mem += (NM_I(sbi)->nid_cnt[FREE_NID_LIST] +
@@ -254,8 +269,8 @@ static int stat_show(struct seq_file *s, void *v)
 			   si->inline_inode);
 		seq_printf(s, "  - Inline_dentry Inode: %u\n",
 			   si->inline_dir);
-		seq_printf(s, "  - Orphan Inode: %u\n",
-			   si->orphans);
+		seq_printf(s, "  - Orphan/Append/Update Inode: %u, %u, %u\n",
+			   si->orphans, si->append, si->update);
 		seq_printf(s, "\nMain area: %d segs, %d secs %d zones\n",
 			   si->main_area_segs, si->main_area_sections,
 			   si->main_area_zones);
@@ -314,8 +329,11 @@ static int stat_show(struct seq_file *s, void *v)
 		seq_printf(s, "  - Inner Struct Count: tree: %d(%d), node: %d\n",
 				si->ext_tree, si->zombie_tree, si->ext_node);
 		seq_puts(s, "\nBalancing F2FS Async:\n");
-		seq_printf(s, "  - inmem: %4d, wb_cp_data: %4d, wb_data: %4d\n",
-			   si->inmem_pages, si->nr_wb_cp_data, si->nr_wb_data);
+		seq_printf(s, "  - IO (CP: %4d, Data: %4d, Flush: %4d, Discard: %4d)\n",
+			   si->nr_wb_cp_data, si->nr_wb_data,
+			   si->nr_flush, si->nr_discard);
+		seq_printf(s, "  - inmem: %4d, atomic IO: %4d (Max. %4d)\n",
+			   si->inmem_pages, si->aw_cnt, si->max_aw_cnt);
 		seq_printf(s, "  - nodes: %4d in %4d\n",
 			   si->ndirty_node, si->node_pages);
 		seq_printf(s, "  - dents: %4d in dirs:%4d (%4d)\n",
@@ -414,6 +432,9 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi)
 	atomic_set(&sbi->inline_dir, 0);
 	atomic_set(&sbi->inplace_count, 0);
 
+	atomic_set(&sbi->aw_cnt, 0);
+	atomic_set(&sbi->max_aw_cnt, 0);
+
 	mutex_lock(&f2fs_stat_mutex);
 	list_add_tail(&si->stat_list, &f2fs_stat_list);
 	mutex_unlock(&f2fs_stat_mutex);

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -207,9 +207,13 @@ static struct f2fs_dir_entry *find_in_level(struct inode *dir,
 		f2fs_put_page(dentry_page, 0);
 	}
 
-	if (!de && room && F2FS_I(dir)->chash != namehash) {
-		F2FS_I(dir)->chash = namehash;
-		F2FS_I(dir)->clevel = level;
+	/* This is to increase the speed of f2fs_create */
+	if (!de && room) {
+		F2FS_I(dir)->task = current;
+		if (F2FS_I(dir)->chash != namehash) {
+			F2FS_I(dir)->chash = namehash;
+			F2FS_I(dir)->clevel = level;
+		}
 	}
 
 	return de;
@@ -548,8 +552,10 @@ int f2fs_add_regular_entry(struct inode *dir, const struct qstr *new_name,
 start:
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-	if (time_to_inject(F2FS_I_SB(dir), FAULT_DIR_DEPTH))
+	if (time_to_inject(F2FS_I_SB(dir), FAULT_DIR_DEPTH)) {
+		f2fs_show_injection_info(FAULT_DIR_DEPTH);
 		return -ENOSPC;
+	}
 #endif
 	if (unlikely(current_depth == MAX_DIR_HASH_DEPTH))
 		return -ENOSPC;
@@ -646,14 +652,34 @@ int __f2fs_add_link(struct inode *dir, const struct qstr *name,
 				struct inode *inode, nid_t ino, umode_t mode)
 {
 	struct fscrypt_name fname;
+	struct page *page = NULL;
+	struct f2fs_dir_entry *de = NULL;
 	int err;
 
 	err = fscrypt_setup_filename(dir, name, 0, &fname);
 	if (err)
 		return err;
 
-	err = __f2fs_do_add_link(dir, &fname, inode, ino, mode);
+	/*
+	 * An immature stakable filesystem shows a race condition between lookup
+	 * and create. If we have same task when doing lookup and create, it's
+	 * definitely fine as expected by VFS normally. Otherwise, let's just
+	 * verify on-disk dentry one more time, which guarantees filesystem
+	 * consistency more.
+	 */
+	if (current != F2FS_I(dir)->task) {
+		de = __f2fs_find_entry(dir, &fname, &page);
+		F2FS_I(dir)->task = NULL;
+	}
+	if (de) {
+		f2fs_dentry_kunmap(dir, page);
+		f2fs_put_page(page, 0);
+		err = -EEXIST;
+	} else if (IS_ERR(page)) {
+		err = PTR_ERR(page);
+	} else {
+		err = __f2fs_do_add_link(dir, &fname, inode, ino, mode);
+	}
 	fscrypt_free_filename(&fname);
+
 	return err;
 }
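
The __f2fs_add_link() change above is easier to see as a pattern: the
negative lookup records which task probed the directory, and a create
arriving from a different task re-verifies the on-disk dentry before
linking, turning a racy double-add into a clean -EEXIST. A toy model of
that flow (illustrative only; find_entry() and do_add() stand in for
__f2fs_find_entry() and __f2fs_do_add_link()):

/* Toy model (not kernel code) of the lookup/create consistency check. */
#include <errno.h>
#include <stdio.h>

struct dir_info {
	const void *task;	/* task that did the negative lookup */
};

static int add_link(struct dir_info *dir, const void *current_task,
		    int (*find_entry)(void), int (*do_add)(void))
{
	/* lookup and create ran in different tasks: re-check disk state */
	if (current_task != dir->task) {
		if (find_entry())
			return -EEXIST;
		dir->task = NULL;
	}
	return do_add();
}

static int entry_exists(void) { return 1; }
static int no_entry(void) { return 0; }
static int do_add_ok(void) { return 0; }

int main(void)
{
	struct dir_info dir = { .task = (void *)0x1 };

	/* same task: trust the earlier negative lookup */
	printf("%d\n", add_link(&dir, (void *)0x1, entry_exists, do_add_ok));
	/* different task, entry already on disk: -EEXIST */
	printf("%d\n", add_link(&dir, (void *)0x2, entry_exists, do_add_ok));
	/* different task, still absent: proceed with the add */
	printf("%d\n", add_link(&dir, (void *)0x2, no_entry, do_add_ok));
	return 0;
}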

diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c
--- a/fs/f2fs/extent_cache.c
+++ b/fs/f2fs/extent_cache.c
@@ -77,7 +77,7 @@ static struct extent_tree *__grab_extent_tree(struct inode *inode)
 	struct extent_tree *et;
 	nid_t ino = inode->i_ino;
 
-	down_write(&sbi->extent_tree_lock);
+	mutex_lock(&sbi->extent_tree_lock);
 	et = radix_tree_lookup(&sbi->extent_tree_root, ino);
 	if (!et) {
 		et = f2fs_kmem_cache_alloc(extent_tree_slab, GFP_NOFS);
@@ -94,7 +94,7 @@ static struct extent_tree *__grab_extent_tree(struct inode *inode)
 		atomic_dec(&sbi->total_zombie_tree);
 		list_del_init(&et->list);
 	}
-	up_write(&sbi->extent_tree_lock);
+	mutex_unlock(&sbi->extent_tree_lock);
 
 	/* never died until evict_inode */
 	F2FS_I(inode)->extent_tree = et;
@@ -311,28 +311,24 @@ static struct extent_node *__lookup_extent_tree_ret(struct extent_tree *et,
 	tmp_node = parent;
 	if (parent && fofs > en->ei.fofs)
 		tmp_node = rb_next(parent);
-	*next_ex = tmp_node ?
-		rb_entry(tmp_node, struct extent_node, rb_node) : NULL;
+	*next_ex = rb_entry_safe(tmp_node, struct extent_node, rb_node);
 
 	tmp_node = parent;
 	if (parent && fofs < en->ei.fofs)
 		tmp_node = rb_prev(parent);
-	*prev_ex = tmp_node ?
-		rb_entry(tmp_node, struct extent_node, rb_node) : NULL;
+	*prev_ex = rb_entry_safe(tmp_node, struct extent_node, rb_node);
 	return NULL;
 
lookup_neighbors:
 	if (fofs == en->ei.fofs) {
 		/* lookup prev node for merging backward later */
 		tmp_node = rb_prev(&en->rb_node);
-		*prev_ex = tmp_node ?
-			rb_entry(tmp_node, struct extent_node, rb_node) : NULL;
+		*prev_ex = rb_entry_safe(tmp_node, struct extent_node, rb_node);
 	}
 	if (fofs == en->ei.fofs + en->ei.len - 1) {
 		/* lookup next node for merging frontward later */
 		tmp_node = rb_next(&en->rb_node);
-		*next_ex = tmp_node ?
-			rb_entry(tmp_node, struct extent_node, rb_node) : NULL;
+		*next_ex = rb_entry_safe(tmp_node, struct extent_node, rb_node);
 	}
 	return en;
 }
@@ -352,11 +348,12 @@ static struct extent_node *__try_merge_extent_node(struct inode *inode,
 	}
 
 	if (next_ex && __is_front_mergeable(ei, &next_ex->ei)) {
+		if (en)
+			__release_extent_node(sbi, et, prev_ex);
 		next_ex->ei.fofs = ei->fofs;
 		next_ex->ei.blk = ei->blk;
 		next_ex->ei.len += ei->len;
-		if (en)
-			__release_extent_node(sbi, et, prev_ex);
 		en = next_ex;
 	}
@@ -416,7 +413,7 @@ do_insert:
 	return en;
 }
 
-static unsigned int f2fs_update_extent_tree_range(struct inode *inode,
+static void f2fs_update_extent_tree_range(struct inode *inode,
 				pgoff_t fofs, block_t blkaddr, unsigned int len)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
@@ -429,7 +426,7 @@ static unsigned int f2fs_update_extent_tree_range(struct inode *inode,
 	unsigned int pos = (unsigned int)fofs;
 
 	if (!et)
-		return false;
+		return;
 
 	trace_f2fs_update_extent_tree_range(inode, fofs, blkaddr, len);
@@ -437,7 +434,7 @@ static unsigned int f2fs_update_extent_tree_range(struct inode *inode,
 
 	if (is_inode_flag_set(inode, FI_NO_EXTENT)) {
 		write_unlock(&et->lock);
-		return false;
+		return;
 	}
 
 	prev = et->largest;
@@ -492,9 +489,8 @@ static unsigned int f2fs_update_extent_tree_range(struct inode *inode,
 		if (!next_en) {
 			struct rb_node *node = rb_next(&en->rb_node);
 
-			next_en = node ?
-				rb_entry(node, struct extent_node, rb_node)
-				: NULL;
+			next_en = rb_entry_safe(node, struct extent_node,
+						rb_node);
 		}
 
 		if (parts)
@@ -535,8 +531,6 @@ static unsigned int f2fs_update_extent_tree_range(struct inode *inode,
 		__free_extent_tree(sbi, et);
 
 	write_unlock(&et->lock);
-
-	return !__is_extent_same(&prev, &et->largest);
 }
 
 unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink)
@@ -552,7 +546,7 @@ unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink)
 	if (!atomic_read(&sbi->total_zombie_tree))
 		goto free_node;
 
-	if (!down_write_trylock(&sbi->extent_tree_lock))
+	if (!mutex_trylock(&sbi->extent_tree_lock))
 		goto out;
 
 	/* 1. remove unreferenced extent tree */
@@ -574,11 +568,11 @@ unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink)
 			goto unlock_out;
 		cond_resched();
 	}
-	up_write(&sbi->extent_tree_lock);
+	mutex_unlock(&sbi->extent_tree_lock);
 
free_node:
 	/* 2. remove LRU extent entries */
-	if (!down_write_trylock(&sbi->extent_tree_lock))
+	if (!mutex_trylock(&sbi->extent_tree_lock))
 		goto out;
 
 	remained = nr_shrink - (node_cnt + tree_cnt);
@@ -608,7 +602,7 @@ free_node:
 	spin_unlock(&sbi->extent_lock);
 
unlock_out:
-	up_write(&sbi->extent_tree_lock);
+	mutex_unlock(&sbi->extent_tree_lock);
out:
 	trace_f2fs_shrink_extent_tree(sbi, node_cnt, tree_cnt);
@@ -655,10 +649,10 @@ void f2fs_destroy_extent_tree(struct inode *inode)
 	if (inode->i_nlink && !is_bad_inode(inode) &&
 					atomic_read(&et->node_cnt)) {
-		down_write(&sbi->extent_tree_lock);
+		mutex_lock(&sbi->extent_tree_lock);
 		list_add_tail(&et->list, &sbi->zombie_list);
 		atomic_inc(&sbi->total_zombie_tree);
-		up_write(&sbi->extent_tree_lock);
+		mutex_unlock(&sbi->extent_tree_lock);
 		return;
 	}
@@ -666,12 +660,12 @@ void f2fs_destroy_extent_tree(struct inode *inode)
 	node_cnt = f2fs_destroy_extent_node(inode);
 
 	/* delete extent tree entry in radix tree */
-	down_write(&sbi->extent_tree_lock);
+	mutex_lock(&sbi->extent_tree_lock);
 	f2fs_bug_on(sbi, atomic_read(&et->node_cnt));
 	radix_tree_delete(&sbi->extent_tree_root, inode->i_ino);
 	kmem_cache_free(extent_tree_slab, et);
 	atomic_dec(&sbi->total_ext_tree);
-	up_write(&sbi->extent_tree_lock);
+	mutex_unlock(&sbi->extent_tree_lock);
 
 	F2FS_I(inode)->extent_tree = NULL;
@@ -718,7 +712,7 @@ void f2fs_update_extent_cache_range(struct dnode_of_data *dn,
 void init_extent_cache_info(struct f2fs_sb_info *sbi)
 {
 	INIT_RADIX_TREE(&sbi->extent_tree_root, GFP_NOIO);
-	init_rwsem(&sbi->extent_tree_lock);
+	mutex_init(&sbi->extent_tree_lock);
 	INIT_LIST_HEAD(&sbi->extent_list);
 	spin_lock_init(&sbi->extent_lock);
 	atomic_set(&sbi->total_ext_tree, 0);
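
The rb_entry_safe() conversions above fold the repeated
"non-NULL ? rb_entry() : NULL" ternaries into one helper. For reference,
the macro pair in include/linux/rbtree.h is roughly (quoted from memory,
so treat as approximate):

#define rb_entry(ptr, type, member) container_of(ptr, type, member)

#define rb_entry_safe(ptr, type, member) \
	({ typeof(ptr) ____ptr = (ptr); \
	   ____ptr ? rb_entry(____ptr, type, member) : NULL; \
	})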

View File

@ -112,9 +112,9 @@ struct f2fs_mount_info {
#define F2FS_HAS_FEATURE(sb, mask) \ #define F2FS_HAS_FEATURE(sb, mask) \
((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0) ((F2FS_SB(sb)->raw_super->feature & cpu_to_le32(mask)) != 0)
#define F2FS_SET_FEATURE(sb, mask) \ #define F2FS_SET_FEATURE(sb, mask) \
F2FS_SB(sb)->raw_super->feature |= cpu_to_le32(mask) (F2FS_SB(sb)->raw_super->feature |= cpu_to_le32(mask))
#define F2FS_CLEAR_FEATURE(sb, mask) \ #define F2FS_CLEAR_FEATURE(sb, mask) \
F2FS_SB(sb)->raw_super->feature &= ~cpu_to_le32(mask) (F2FS_SB(sb)->raw_super->feature &= ~cpu_to_le32(mask))
/* /*
* For checkpoint manager * For checkpoint manager
@ -132,11 +132,14 @@ enum {
CP_DISCARD, CP_DISCARD,
}; };
#define DEF_BATCHED_TRIM_SECTIONS 2 #define DEF_BATCHED_TRIM_SECTIONS 2048
#define BATCHED_TRIM_SEGMENTS(sbi) \ #define BATCHED_TRIM_SEGMENTS(sbi) \
(SM_I(sbi)->trim_sections * (sbi)->segs_per_sec) (SM_I(sbi)->trim_sections * (sbi)->segs_per_sec)
#define BATCHED_TRIM_BLOCKS(sbi) \ #define BATCHED_TRIM_BLOCKS(sbi) \
(BATCHED_TRIM_SEGMENTS(sbi) << (sbi)->log_blocks_per_seg) (BATCHED_TRIM_SEGMENTS(sbi) << (sbi)->log_blocks_per_seg)
#define MAX_DISCARD_BLOCKS(sbi) \
((1 << (sbi)->log_blocks_per_seg) * (sbi)->segs_per_sec)
#define DISCARD_ISSUE_RATE 8
#define DEF_CP_INTERVAL 60 /* 60 secs */ #define DEF_CP_INTERVAL 60 /* 60 secs */
#define DEF_IDLE_INTERVAL 5 /* 5 secs */ #define DEF_IDLE_INTERVAL 5 /* 5 secs */
@ -185,11 +188,30 @@ struct discard_entry {
int len; /* # of consecutive blocks of the discard */ int len; /* # of consecutive blocks of the discard */
}; };
struct bio_entry { enum {
struct list_head list; D_PREP,
struct bio *bio; D_SUBMIT,
struct completion event; D_DONE,
int error; };
struct discard_cmd {
struct list_head list; /* command list */
struct completion wait; /* compleation */
block_t lstart; /* logical start address */
block_t len; /* length */
struct bio *bio; /* bio */
int state; /* state */
};
struct discard_cmd_control {
struct task_struct *f2fs_issue_discard; /* discard thread */
struct list_head discard_entry_list; /* 4KB discard entry list */
int nr_discards; /* # of discards in the list */
struct list_head discard_cmd_list; /* discard cmd list */
wait_queue_head_t discard_wait_queue; /* waiting queue for wake-up */
struct mutex cmd_lock;
int max_discards; /* max. discards to be issued */
atomic_t submit_discard; /* # of issued discard */
}; };
/* for the list of fsync inodes, used only during recovery */ /* for the list of fsync inodes, used only during recovery */
@ -214,6 +236,7 @@ struct fsync_inode_entry {
static inline int update_nats_in_cursum(struct f2fs_journal *journal, int i) static inline int update_nats_in_cursum(struct f2fs_journal *journal, int i)
{ {
int before = nats_in_cursum(journal); int before = nats_in_cursum(journal);
journal->n_nats = cpu_to_le16(before + i); journal->n_nats = cpu_to_le16(before + i);
return before; return before;
} }
@ -221,6 +244,7 @@ static inline int update_nats_in_cursum(struct f2fs_journal *journal, int i)
static inline int update_sits_in_cursum(struct f2fs_journal *journal, int i) static inline int update_sits_in_cursum(struct f2fs_journal *journal, int i)
{ {
int before = sits_in_cursum(journal); int before = sits_in_cursum(journal);
journal->n_sits = cpu_to_le16(before + i); journal->n_sits = cpu_to_le16(before + i);
return before; return before;
} }
@ -306,12 +330,14 @@ static inline void make_dentry_ptr(struct inode *inode,
if (type == 1) { if (type == 1) {
struct f2fs_dentry_block *t = (struct f2fs_dentry_block *)src; struct f2fs_dentry_block *t = (struct f2fs_dentry_block *)src;
d->max = NR_DENTRY_IN_BLOCK; d->max = NR_DENTRY_IN_BLOCK;
d->bitmap = &t->dentry_bitmap; d->bitmap = &t->dentry_bitmap;
d->dentry = t->dentry; d->dentry = t->dentry;
d->filename = t->filename; d->filename = t->filename;
} else { } else {
struct f2fs_inline_dentry *t = (struct f2fs_inline_dentry *)src; struct f2fs_inline_dentry *t = (struct f2fs_inline_dentry *)src;
d->max = NR_INLINE_DENTRY; d->max = NR_INLINE_DENTRY;
d->bitmap = &t->dentry_bitmap; d->bitmap = &t->dentry_bitmap;
d->dentry = t->dentry; d->dentry = t->dentry;
@ -438,8 +464,8 @@ struct f2fs_inode_info {
atomic_t dirty_pages; /* # of dirty pages */ atomic_t dirty_pages; /* # of dirty pages */
f2fs_hash_t chash; /* hash value of given file name */ f2fs_hash_t chash; /* hash value of given file name */
unsigned int clevel; /* maximum level of given file name */ unsigned int clevel; /* maximum level of given file name */
struct task_struct *task; /* lookup and create consistency */
nid_t i_xattr_nid; /* node id that contains xattrs */ nid_t i_xattr_nid; /* node id that contains xattrs */
unsigned long long xattr_ver; /* cp version of xattr modification */
loff_t last_disk_size; /* lastly written file size */ loff_t last_disk_size; /* lastly written file size */
struct list_head dirty_list; /* dirty list for dirs and files */ struct list_head dirty_list; /* dirty list for dirs and files */
@ -474,13 +500,6 @@ static inline void set_extent_info(struct extent_info *ei, unsigned int fofs,
ei->len = len; ei->len = len;
} }
static inline bool __is_extent_same(struct extent_info *ei1,
struct extent_info *ei2)
{
return (ei1->fofs == ei2->fofs && ei1->blk == ei2->blk &&
ei1->len == ei2->len);
}
static inline bool __is_extent_mergeable(struct extent_info *back, static inline bool __is_extent_mergeable(struct extent_info *back,
struct extent_info *front) struct extent_info *front)
{ {
@ -500,7 +519,7 @@ static inline bool __is_front_mergeable(struct extent_info *cur,
return __is_extent_mergeable(cur, front); return __is_extent_mergeable(cur, front);
} }
extern void f2fs_mark_inode_dirty_sync(struct inode *, bool); extern void f2fs_mark_inode_dirty_sync(struct inode *inode, bool sync);
static inline void __try_update_largest_extent(struct inode *inode, static inline void __try_update_largest_extent(struct inode *inode,
struct extent_tree *et, struct extent_node *en) struct extent_tree *et, struct extent_node *en)
{ {
@ -532,6 +551,7 @@ struct f2fs_nm_info {
struct list_head nat_entries; /* cached nat entry list (clean) */ struct list_head nat_entries; /* cached nat entry list (clean) */
unsigned int nat_cnt; /* the # of cached nat entries */ unsigned int nat_cnt; /* the # of cached nat entries */
unsigned int dirty_nat_cnt; /* total num of nat entries in set */ unsigned int dirty_nat_cnt; /* total num of nat entries in set */
unsigned int nat_blocks; /* # of nat blocks */
/* free node ids management */ /* free node ids management */
struct radix_tree_root free_nid_root;/* root of the free_nid cache */ struct radix_tree_root free_nid_root;/* root of the free_nid cache */
@ -539,9 +559,19 @@ struct f2fs_nm_info {
unsigned int nid_cnt[MAX_NID_LIST]; /* the number of free node id */ unsigned int nid_cnt[MAX_NID_LIST]; /* the number of free node id */
spinlock_t nid_list_lock; /* protect nid lists ops */ spinlock_t nid_list_lock; /* protect nid lists ops */
struct mutex build_lock; /* lock for build free nids */ struct mutex build_lock; /* lock for build free nids */
unsigned char (*free_nid_bitmap)[NAT_ENTRY_BITMAP_SIZE];
unsigned char *nat_block_bitmap;
/* for checkpoint */ /* for checkpoint */
char *nat_bitmap; /* NAT bitmap pointer */ char *nat_bitmap; /* NAT bitmap pointer */
unsigned int nat_bits_blocks; /* # of nat bits blocks */
unsigned char *nat_bits; /* NAT bits blocks */
unsigned char *full_nat_bits; /* full NAT pages */
unsigned char *empty_nat_bits; /* empty NAT pages */
#ifdef CONFIG_F2FS_CHECK_FS
char *nat_bitmap_mir; /* NAT bitmap mirror */
#endif
int bitmap_size; /* bitmap size */ int bitmap_size; /* bitmap size */
}; };
@ -632,12 +662,6 @@ struct f2fs_sm_info {
/* a threshold to reclaim prefree segments */ /* a threshold to reclaim prefree segments */
unsigned int rec_prefree_segments; unsigned int rec_prefree_segments;
/* for small discard management */
struct list_head discard_list; /* 4KB discard list */
struct list_head wait_list; /* linked with issued discard bio */
int nr_discards; /* # of discards in the list */
int max_discards; /* max. discards to be issued */
/* for batched trimming */ /* for batched trimming */
unsigned int trim_sections; /* # of sections to trim */ unsigned int trim_sections; /* # of sections to trim */
@ -648,8 +672,10 @@ struct f2fs_sm_info {
unsigned int min_fsync_blocks; /* threshold for fsync */ unsigned int min_fsync_blocks; /* threshold for fsync */
/* for flush command control */ /* for flush command control */
struct flush_cmd_control *cmd_control_info; struct flush_cmd_control *fcc_info;
/* for discard command control */
struct discard_cmd_control *dcc_info;
}; };
/* /*
@ -708,6 +734,7 @@ struct f2fs_io_info {
block_t old_blkaddr; /* old block address before Cow */ block_t old_blkaddr; /* old block address before Cow */
struct page *page; /* page to be written */ struct page *page; /* page to be written */
struct page *encrypted_page; /* encrypted page */ struct page *encrypted_page; /* encrypted page */
bool submitted; /* indicate IO submission */
}; };
#define is_read_io(rw) (rw == READ) #define is_read_io(rw) (rw == READ)
@@ -787,6 +814,8 @@ struct f2fs_sb_info {
 	struct f2fs_bio_info read_io;			/* for read bios */
 	struct f2fs_bio_info write_io[NR_PAGE_TYPE];	/* for write bios */
 	struct mutex wio_mutex[NODE + 1];	/* bio ordering for NODE/DATA */
+	int write_io_size_bits;			/* Write IO size bits */
+	mempool_t *write_io_dummy;		/* Dummy pages */

 	/* for checkpoint */
 	struct f2fs_checkpoint *ckpt;		/* raw checkpoint pointer */
@@ -811,7 +840,7 @@ struct f2fs_sb_info {
 	/* for extent tree cache */
 	struct radix_tree_root extent_tree_root;/* cache extent cache entries */
-	struct rw_semaphore extent_tree_lock;	/* locking extent radix tree */
+	struct mutex extent_tree_lock;	/* locking extent radix tree */
 	struct list_head extent_list;		/* lru list for shrinker */
 	spinlock_t extent_lock;			/* locking extent lru list */
 	atomic_t total_ext_tree;		/* extent tree count */
@@ -858,6 +887,9 @@ struct f2fs_sb_info {
 	struct f2fs_gc_kthread	*gc_thread;	/* GC thread */
 	unsigned int cur_victim_sec;		/* current victim section num */

+	/* threshold for converting bg victims for fg */
+	u64 fggc_threshold;
+
 	/* maximum # of trials to find a victim segment for SSR and GC */
 	unsigned int max_victim_search;
@@ -877,6 +909,8 @@ struct f2fs_sb_info {
 	atomic_t inline_xattr;			/* # of inline_xattr inodes */
 	atomic_t inline_inode;			/* # of inline_data inodes */
 	atomic_t inline_dir;			/* # of inline_dentry inodes */
+	atomic_t aw_cnt;			/* # of atomic writes */
+	atomic_t max_aw_cnt;			/* max # of atomic writes */
 	int bg_gc;				/* background gc calls */
 	unsigned int ndirty_inode[NR_INODE_TYPE];	/* # of dirty inodes */
 #endif
@@ -908,6 +942,10 @@ struct f2fs_sb_info {
 };

 #ifdef CONFIG_F2FS_FAULT_INJECTION
+#define f2fs_show_injection_info(type)				\
+	printk("%sF2FS-fs : inject %s in %s of %pF\n",		\
+		KERN_INFO, fault_name[type],			\
+		__func__, __builtin_return_address(0))
 static inline bool time_to_inject(struct f2fs_sb_info *sbi, int type)
 {
 	struct f2fs_fault_info *ffi = &sbi->fault_info;
@@ -921,10 +959,6 @@ static inline bool time_to_inject(struct f2fs_sb_info *sbi, int type)
 	atomic_inc(&ffi->inject_ops);
 	if (atomic_read(&ffi->inject_ops) >= ffi->inject_rate) {
 		atomic_set(&ffi->inject_ops, 0);
-		printk("%sF2FS-fs : inject %s in %pF\n",
-				KERN_INFO,
-				fault_name[type],
-				__builtin_return_address(0));
 		return true;
 	}
 	return false;
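Folding the printk into f2fs_show_injection_info() lets each site report the faulting function, so time_to_inject() itself no longer logs. Every caller in this series is converted to the same two-line shape, e.g.:

	if (time_to_inject(sbi, FAULT_BLOCK)) {
		f2fs_show_injection_info(FAULT_BLOCK);
		return false;	/* behave as if the allocation failed */
	}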
@@ -1089,6 +1123,12 @@ static inline unsigned long long cur_cp_version(struct f2fs_checkpoint *cp)
 	return le64_to_cpu(cp->checkpoint_ver);
 }

+static inline __u64 cur_cp_crc(struct f2fs_checkpoint *cp)
+{
+	size_t crc_offset = le32_to_cpu(cp->checksum_offset);
+	return le32_to_cpu(*((__le32 *)((unsigned char *)cp + crc_offset)));
+}
+
 static inline bool __is_set_ckpt_flags(struct f2fs_checkpoint *cp, unsigned int f)
 {
 	unsigned int ckpt_flags = le32_to_cpu(cp->ckpt_flags);
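cur_cp_crc() pairs with cur_cp_version() above: a checkpoint records the offset of its own CRC in checksum_offset, so the helper only has to index past the struct. A hedged sketch of one plausible use; the nat_bits validity check elsewhere in this series is along these lines, folding both values into one identity:

	/* sketch: treat (version, crc) together as the checkpoint's identity */
	__u64 cp_ver = cur_cp_version(ckpt);

	cp_ver |= (cur_cp_crc(ckpt) << 32);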
@@ -1133,6 +1173,27 @@ static inline void clear_ckpt_flags(struct f2fs_sb_info *sbi, unsigned int f)
 	spin_unlock(&sbi->cp_lock);
 }

+static inline void disable_nat_bits(struct f2fs_sb_info *sbi, bool lock)
+{
+	set_sbi_flag(sbi, SBI_NEED_FSCK);
+
+	if (lock)
+		spin_lock(&sbi->cp_lock);
+	__clear_ckpt_flags(F2FS_CKPT(sbi), CP_NAT_BITS_FLAG);
+	kfree(NM_I(sbi)->nat_bits);
+	NM_I(sbi)->nat_bits = NULL;
+	if (lock)
+		spin_unlock(&sbi->cp_lock);
+}
+
+static inline bool enabled_nat_bits(struct f2fs_sb_info *sbi,
+					struct cp_control *cpc)
+{
+	bool set = is_set_ckpt_flags(sbi, CP_NAT_BITS_FLAG);
+
+	return (cpc) ? (cpc->reason == CP_UMOUNT) && set : set;
+}
+
 static inline void f2fs_lock_op(struct f2fs_sb_info *sbi)
 {
 	down_read(&sbi->cp_rwsem);
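The cpc argument makes enabled_nat_bits() checkpoint-aware: given a cp_control, the cached bits only count as enabled for a CP_UMOUNT checkpoint, since that is the only time they are persisted; given NULL, it simply reports the flag. disable_nat_bits() is the distrust path, and it tags the filesystem for fsck because the on-disk bits may now be stale. A hedged sketch of the gate:

	struct cp_control cpc = { .reason = CP_UMOUNT };

	if (enabled_nat_bits(sbi, &cpc))
		pr_info("F2FS-fs: nat_bits persist with this checkpoint\n");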
@@ -1212,8 +1273,10 @@ static inline bool inc_valid_block_count(struct f2fs_sb_info *sbi,
 	blkcnt_t diff;

 #ifdef CONFIG_F2FS_FAULT_INJECTION
-	if (time_to_inject(sbi, FAULT_BLOCK))
+	if (time_to_inject(sbi, FAULT_BLOCK)) {
+		f2fs_show_injection_info(FAULT_BLOCK);
 		return false;
+	}
 #endif
 	/*
 	 * let's increase this in prior to actual block count change in order
@@ -1449,11 +1512,14 @@ static inline struct page *f2fs_grab_cache_page(struct address_space *mapping,
 {
 #ifdef CONFIG_F2FS_FAULT_INJECTION
 	struct page *page = find_lock_page(mapping, index);

 	if (page)
 		return page;

-	if (time_to_inject(F2FS_M_SB(mapping), FAULT_PAGE_ALLOC))
+	if (time_to_inject(F2FS_M_SB(mapping), FAULT_PAGE_ALLOC)) {
+		f2fs_show_injection_info(FAULT_PAGE_ALLOC);
 		return NULL;
+	}
 #endif
 	if (!for_write)
 		return grab_cache_page(mapping, index);
@@ -1532,6 +1598,7 @@ static inline void f2fs_radix_tree_insert(struct radix_tree_root *root,
 static inline bool IS_INODE(struct page *page)
 {
 	struct f2fs_node *p = F2FS_NODE(page);
+
 	return RAW_IS_INODE(p);
 }
@@ -1545,6 +1612,7 @@ static inline block_t datablock_addr(struct page *node_page,
 {
 	struct f2fs_node *raw_node;
 	__le32 *addr_array;
+
 	raw_node = F2FS_NODE(node_page);
 	addr_array = blkaddr_in_node(raw_node);
 	return le32_to_cpu(addr_array[offset]);
@@ -1628,6 +1696,7 @@ enum {
 	FI_UPDATE_WRITE,	/* inode has in-place-update data */
 	FI_NEED_IPU,		/* used for ipu per file */
 	FI_ATOMIC_FILE,		/* indicate atomic file */
+	FI_ATOMIC_COMMIT,	/* indicate the state of atomical committing */
 	FI_VOLATILE_FILE,	/* indicate volatile file */
 	FI_FIRST_BLOCK_WRITTEN,	/* indicate #0 data block was written */
 	FI_DROP_CACHE,		/* drop dirty page cache */
@@ -1635,6 +1704,7 @@ enum {
 	FI_INLINE_DOTS,		/* indicate inline dot dentries */
 	FI_DO_DEFRAG,		/* indicate defragment is running */
 	FI_DIRTY_FILE,		/* indicate regular/symlink has dirty pages */
+	FI_NO_PREALLOC,		/* indicate skipped preallocated blocks */
 };

 static inline void __mark_inode_dirty_flag(struct inode *inode,
@@ -1779,6 +1849,7 @@ static inline unsigned int addrs_per_inode(struct inode *inode)
 static inline void *inline_xattr_addr(struct page *page)
 {
 	struct f2fs_inode *ri = F2FS_INODE(page);
+
 	return (void *)&(ri->i_addr[DEF_ADDRS_PER_INODE -
 					F2FS_INLINE_XATTR_ADDRS]);
 }
@@ -1817,6 +1888,11 @@ static inline bool f2fs_is_atomic_file(struct inode *inode)
 	return is_inode_flag_set(inode, FI_ATOMIC_FILE);
 }

+static inline bool f2fs_is_commit_atomic_write(struct inode *inode)
+{
+	return is_inode_flag_set(inode, FI_ATOMIC_COMMIT);
+}
+
 static inline bool f2fs_is_volatile_file(struct inode *inode)
 {
 	return is_inode_flag_set(inode, FI_VOLATILE_FILE);
@@ -1835,6 +1911,7 @@ static inline bool f2fs_is_drop_cache(struct inode *inode)
 static inline void *inline_data_addr(struct page *page)
 {
 	struct f2fs_inode *ri = F2FS_INODE(page);
+
 	return (void *)&(ri->i_addr[1]);
 }
@@ -1918,8 +1995,10 @@ static inline void *f2fs_kmalloc(struct f2fs_sb_info *sbi,
 					size_t size, gfp_t flags)
 {
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-	if (time_to_inject(sbi, FAULT_KMALLOC))
+	if (time_to_inject(sbi, FAULT_KMALLOC)) {
+		f2fs_show_injection_info(FAULT_KMALLOC);
 		return NULL;
+	}
 #endif
 	return kmalloc(size, flags);
 }
@@ -1957,29 +2036,30 @@ static inline void *f2fs_kvzalloc(size_t size, gfp_t flags)
 /*
  * file.c
  */
-int f2fs_sync_file(struct file *, loff_t, loff_t, int);
-void truncate_data_blocks(struct dnode_of_data *);
-int truncate_blocks(struct inode *, u64, bool);
-int f2fs_truncate(struct inode *);
-int f2fs_getattr(struct vfsmount *, struct dentry *, struct kstat *);
-int f2fs_setattr(struct dentry *, struct iattr *);
-int truncate_hole(struct inode *, pgoff_t, pgoff_t);
-int truncate_data_blocks_range(struct dnode_of_data *, int);
-long f2fs_ioctl(struct file *, unsigned int, unsigned long);
-long f2fs_compat_ioctl(struct file *, unsigned int, unsigned long);
+int f2fs_sync_file(struct file *file, loff_t start, loff_t end, int datasync);
+void truncate_data_blocks(struct dnode_of_data *dn);
+int truncate_blocks(struct inode *inode, u64 from, bool lock);
+int f2fs_truncate(struct inode *inode);
+int f2fs_getattr(struct vfsmount *mnt, struct dentry *dentry,
+			struct kstat *stat);
+int f2fs_setattr(struct dentry *dentry, struct iattr *attr);
+int truncate_hole(struct inode *inode, pgoff_t pg_start, pgoff_t pg_end);
+int truncate_data_blocks_range(struct dnode_of_data *dn, int count);
+long f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
+long f2fs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg);

 /*
  * inode.c
  */
-void f2fs_set_inode_flags(struct inode *);
-struct inode *f2fs_iget(struct super_block *, unsigned long);
-struct inode *f2fs_iget_retry(struct super_block *, unsigned long);
-int try_to_free_nats(struct f2fs_sb_info *, int);
-int update_inode(struct inode *, struct page *);
-int update_inode_page(struct inode *);
-int f2fs_write_inode(struct inode *, struct writeback_control *);
-void f2fs_evict_inode(struct inode *);
-void handle_failed_inode(struct inode *);
+void f2fs_set_inode_flags(struct inode *inode);
+struct inode *f2fs_iget(struct super_block *sb, unsigned long ino);
+struct inode *f2fs_iget_retry(struct super_block *sb, unsigned long ino);
+int try_to_free_nats(struct f2fs_sb_info *sbi, int nr_shrink);
+int update_inode(struct inode *inode, struct page *node_page);
+int update_inode_page(struct inode *inode);
+int f2fs_write_inode(struct inode *inode, struct writeback_control *wbc);
+void f2fs_evict_inode(struct inode *inode);
+void handle_failed_inode(struct inode *inode);

 /*
  * namei.c
@@ -1989,40 +2069,47 @@ struct dentry *f2fs_get_parent(struct dentry *child);
 /*
  * dir.c
  */
-void set_de_type(struct f2fs_dir_entry *, umode_t);
-unsigned char get_de_type(struct f2fs_dir_entry *);
-struct f2fs_dir_entry *find_target_dentry(struct fscrypt_name *,
-			f2fs_hash_t, int *, struct f2fs_dentry_ptr *);
-int f2fs_fill_dentries(struct dir_context *, struct f2fs_dentry_ptr *,
-			unsigned int, struct fscrypt_str *);
-void do_make_empty_dir(struct inode *, struct inode *,
-			struct f2fs_dentry_ptr *);
-struct page *init_inode_metadata(struct inode *, struct inode *,
-		const struct qstr *, const struct qstr *, struct page *);
-void update_parent_metadata(struct inode *, struct inode *, unsigned int);
-int room_for_filename(const void *, int, int);
-void f2fs_drop_nlink(struct inode *, struct inode *);
-struct f2fs_dir_entry *__f2fs_find_entry(struct inode *, struct fscrypt_name *,
-		struct page **);
-struct f2fs_dir_entry *f2fs_find_entry(struct inode *, const struct qstr *,
-		struct page **);
-struct f2fs_dir_entry *f2fs_parent_dir(struct inode *, struct page **);
-ino_t f2fs_inode_by_name(struct inode *, const struct qstr *, struct page **);
-void f2fs_set_link(struct inode *, struct f2fs_dir_entry *,
-		struct page *, struct inode *);
-int update_dent_inode(struct inode *, struct inode *, const struct qstr *);
-void f2fs_update_dentry(nid_t ino, umode_t mode, struct f2fs_dentry_ptr *,
-		const struct qstr *, f2fs_hash_t , unsigned int);
-int f2fs_add_regular_entry(struct inode *, const struct qstr *,
-		const struct qstr *, struct inode *, nid_t, umode_t);
-int __f2fs_do_add_link(struct inode *, struct fscrypt_name*, struct inode *,
-		nid_t, umode_t);
-int __f2fs_add_link(struct inode *, const struct qstr *, struct inode *, nid_t,
-		umode_t);
-void f2fs_delete_entry(struct f2fs_dir_entry *, struct page *, struct inode *,
-		struct inode *);
-int f2fs_do_tmpfile(struct inode *, struct inode *);
-bool f2fs_empty_dir(struct inode *);
+void set_de_type(struct f2fs_dir_entry *de, umode_t mode);
+unsigned char get_de_type(struct f2fs_dir_entry *de);
+struct f2fs_dir_entry *find_target_dentry(struct fscrypt_name *fname,
+			f2fs_hash_t namehash, int *max_slots,
+			struct f2fs_dentry_ptr *d);
+int f2fs_fill_dentries(struct dir_context *ctx, struct f2fs_dentry_ptr *d,
+			unsigned int start_pos, struct fscrypt_str *fstr);
+void do_make_empty_dir(struct inode *inode, struct inode *parent,
+			struct f2fs_dentry_ptr *d);
+struct page *init_inode_metadata(struct inode *inode, struct inode *dir,
+			const struct qstr *new_name,
+			const struct qstr *orig_name, struct page *dpage);
+void update_parent_metadata(struct inode *dir, struct inode *inode,
+			unsigned int current_depth);
+int room_for_filename(const void *bitmap, int slots, int max_slots);
+void f2fs_drop_nlink(struct inode *dir, struct inode *inode);
+struct f2fs_dir_entry *__f2fs_find_entry(struct inode *dir,
+			struct fscrypt_name *fname, struct page **res_page);
+struct f2fs_dir_entry *f2fs_find_entry(struct inode *dir,
+			const struct qstr *child, struct page **res_page);
+struct f2fs_dir_entry *f2fs_parent_dir(struct inode *dir, struct page **p);
+ino_t f2fs_inode_by_name(struct inode *dir, const struct qstr *qstr,
+			struct page **page);
+void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry *de,
+			struct page *page, struct inode *inode);
+int update_dent_inode(struct inode *inode, struct inode *to,
+			const struct qstr *name);
+void f2fs_update_dentry(nid_t ino, umode_t mode, struct f2fs_dentry_ptr *d,
+			const struct qstr *name, f2fs_hash_t name_hash,
+			unsigned int bit_pos);
+int f2fs_add_regular_entry(struct inode *dir, const struct qstr *new_name,
+			const struct qstr *orig_name,
+			struct inode *inode, nid_t ino, umode_t mode);
+int __f2fs_do_add_link(struct inode *dir, struct fscrypt_name *fname,
+			struct inode *inode, nid_t ino, umode_t mode);
+int __f2fs_add_link(struct inode *dir, const struct qstr *name,
+			struct inode *inode, nid_t ino, umode_t mode);
+void f2fs_delete_entry(struct f2fs_dir_entry *dentry, struct page *page,
+			struct inode *dir, struct inode *inode);
+int f2fs_do_tmpfile(struct inode *inode, struct inode *dir);
+bool f2fs_empty_dir(struct inode *dir);

 static inline int f2fs_add_link(struct dentry *dentry, struct inode *inode)
 {
@@ -2033,18 +2120,18 @@ static inline int f2fs_add_link(struct dentry *dentry, struct inode *inode)
 /*
  * super.c
  */
-int f2fs_inode_dirtied(struct inode *, bool);
-void f2fs_inode_synced(struct inode *);
-int f2fs_commit_super(struct f2fs_sb_info *, bool);
-int f2fs_sync_fs(struct super_block *, int);
+int f2fs_inode_dirtied(struct inode *inode, bool sync);
+void f2fs_inode_synced(struct inode *inode);
+int f2fs_commit_super(struct f2fs_sb_info *sbi, bool recover);
+int f2fs_sync_fs(struct super_block *sb, int sync);
 extern __printf(3, 4)
-void f2fs_msg(struct super_block *, const char *, const char *, ...);
+void f2fs_msg(struct super_block *sb, const char *level, const char *fmt, ...);
 int sanity_check_ckpt(struct f2fs_sb_info *sbi);

 /*
  * hash.c
  */
-f2fs_hash_t f2fs_dentry_hash(const struct qstr *);
+f2fs_hash_t f2fs_dentry_hash(const struct qstr *name_info);

 /*
  * node.c
@@ -2052,163 +2139,183 @@ f2fs_hash_t f2fs_dentry_hash(const struct qstr *);
 struct dnode_of_data;
 struct node_info;

-bool available_free_memory(struct f2fs_sb_info *, int);
-int need_dentry_mark(struct f2fs_sb_info *, nid_t);
-bool is_checkpointed_node(struct f2fs_sb_info *, nid_t);
-bool need_inode_block_update(struct f2fs_sb_info *, nid_t);
-void get_node_info(struct f2fs_sb_info *, nid_t, struct node_info *);
-pgoff_t get_next_page_offset(struct dnode_of_data *, pgoff_t);
-int get_dnode_of_data(struct dnode_of_data *, pgoff_t, int);
-int truncate_inode_blocks(struct inode *, pgoff_t);
-int truncate_xattr_node(struct inode *, struct page *);
-int wait_on_node_pages_writeback(struct f2fs_sb_info *, nid_t);
-int remove_inode_page(struct inode *);
-struct page *new_inode_page(struct inode *);
-struct page *new_node_page(struct dnode_of_data *, unsigned int, struct page *);
-void ra_node_page(struct f2fs_sb_info *, nid_t);
-struct page *get_node_page(struct f2fs_sb_info *, pgoff_t);
-struct page *get_node_page_ra(struct page *, int);
-void move_node_page(struct page *, int);
-int fsync_node_pages(struct f2fs_sb_info *, struct inode *,
-			struct writeback_control *, bool);
-int sync_node_pages(struct f2fs_sb_info *, struct writeback_control *);
-void build_free_nids(struct f2fs_sb_info *, bool);
-bool alloc_nid(struct f2fs_sb_info *, nid_t *);
-void alloc_nid_done(struct f2fs_sb_info *, nid_t);
-void alloc_nid_failed(struct f2fs_sb_info *, nid_t);
-int try_to_free_nids(struct f2fs_sb_info *, int);
-void recover_inline_xattr(struct inode *, struct page *);
-void recover_xattr_data(struct inode *, struct page *, block_t);
-int recover_inode_page(struct f2fs_sb_info *, struct page *);
-int restore_node_summary(struct f2fs_sb_info *, unsigned int,
-			struct f2fs_summary_block *);
-void flush_nat_entries(struct f2fs_sb_info *);
-int build_node_manager(struct f2fs_sb_info *);
-void destroy_node_manager(struct f2fs_sb_info *);
+bool available_free_memory(struct f2fs_sb_info *sbi, int type);
+int need_dentry_mark(struct f2fs_sb_info *sbi, nid_t nid);
+bool is_checkpointed_node(struct f2fs_sb_info *sbi, nid_t nid);
+bool need_inode_block_update(struct f2fs_sb_info *sbi, nid_t ino);
+void get_node_info(struct f2fs_sb_info *sbi, nid_t nid, struct node_info *ni);
+pgoff_t get_next_page_offset(struct dnode_of_data *dn, pgoff_t pgofs);
+int get_dnode_of_data(struct dnode_of_data *dn, pgoff_t index, int mode);
+int truncate_inode_blocks(struct inode *inode, pgoff_t from);
+int truncate_xattr_node(struct inode *inode, struct page *page);
+int wait_on_node_pages_writeback(struct f2fs_sb_info *sbi, nid_t ino);
+int remove_inode_page(struct inode *inode);
+struct page *new_inode_page(struct inode *inode);
+struct page *new_node_page(struct dnode_of_data *dn,
+			unsigned int ofs, struct page *ipage);
+void ra_node_page(struct f2fs_sb_info *sbi, nid_t nid);
+struct page *get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid);
+struct page *get_node_page_ra(struct page *parent, int start);
+void move_node_page(struct page *node_page, int gc_type);
+int fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
+			struct writeback_control *wbc, bool atomic);
+int sync_node_pages(struct f2fs_sb_info *sbi, struct writeback_control *wbc);
+void build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount);
+bool alloc_nid(struct f2fs_sb_info *sbi, nid_t *nid);
+void alloc_nid_done(struct f2fs_sb_info *sbi, nid_t nid);
+void alloc_nid_failed(struct f2fs_sb_info *sbi, nid_t nid);
+int try_to_free_nids(struct f2fs_sb_info *sbi, int nr_shrink);
+void recover_inline_xattr(struct inode *inode, struct page *page);
+int recover_xattr_data(struct inode *inode, struct page *page,
+			block_t blkaddr);
+int recover_inode_page(struct f2fs_sb_info *sbi, struct page *page);
+int restore_node_summary(struct f2fs_sb_info *sbi,
+			unsigned int segno, struct f2fs_summary_block *sum);
+void flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc);
+int build_node_manager(struct f2fs_sb_info *sbi);
+void destroy_node_manager(struct f2fs_sb_info *sbi);
 int __init create_node_manager_caches(void);
 void destroy_node_manager_caches(void);

 /*
  * segment.c
  */
-void register_inmem_page(struct inode *, struct page *);
-void drop_inmem_pages(struct inode *);
-int commit_inmem_pages(struct inode *);
-void f2fs_balance_fs(struct f2fs_sb_info *, bool);
-void f2fs_balance_fs_bg(struct f2fs_sb_info *);
-int f2fs_issue_flush(struct f2fs_sb_info *);
-int create_flush_cmd_control(struct f2fs_sb_info *);
-void destroy_flush_cmd_control(struct f2fs_sb_info *, bool);
-void invalidate_blocks(struct f2fs_sb_info *, block_t);
-bool is_checkpointed_data(struct f2fs_sb_info *, block_t);
-void refresh_sit_entry(struct f2fs_sb_info *, block_t, block_t);
-void f2fs_wait_all_discard_bio(struct f2fs_sb_info *);
-void clear_prefree_segments(struct f2fs_sb_info *, struct cp_control *);
-void release_discard_addrs(struct f2fs_sb_info *);
-int npages_for_summary_flush(struct f2fs_sb_info *, bool);
-void allocate_new_segments(struct f2fs_sb_info *);
-int f2fs_trim_fs(struct f2fs_sb_info *, struct fstrim_range *);
-struct page *get_sum_page(struct f2fs_sb_info *, unsigned int);
-void update_meta_page(struct f2fs_sb_info *, void *, block_t);
-void write_meta_page(struct f2fs_sb_info *, struct page *);
-void write_node_page(unsigned int, struct f2fs_io_info *);
-void write_data_page(struct dnode_of_data *, struct f2fs_io_info *);
-void rewrite_data_page(struct f2fs_io_info *);
-void __f2fs_replace_block(struct f2fs_sb_info *, struct f2fs_summary *,
-					block_t, block_t, bool, bool);
-void f2fs_replace_block(struct f2fs_sb_info *, struct dnode_of_data *,
-				block_t, block_t, unsigned char, bool, bool);
-void allocate_data_block(struct f2fs_sb_info *, struct page *,
-		block_t, block_t *, struct f2fs_summary *, int);
-void f2fs_wait_on_page_writeback(struct page *, enum page_type, bool);
-void f2fs_wait_on_encrypted_page_writeback(struct f2fs_sb_info *, block_t);
-void write_data_summaries(struct f2fs_sb_info *, block_t);
-void write_node_summaries(struct f2fs_sb_info *, block_t);
-int lookup_journal_in_cursum(struct f2fs_journal *, int, unsigned int, int);
-void flush_sit_entries(struct f2fs_sb_info *, struct cp_control *);
-int build_segment_manager(struct f2fs_sb_info *);
-void destroy_segment_manager(struct f2fs_sb_info *);
+void register_inmem_page(struct inode *inode, struct page *page);
+void drop_inmem_pages(struct inode *inode);
+int commit_inmem_pages(struct inode *inode);
+void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need);
+void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi);
+int f2fs_issue_flush(struct f2fs_sb_info *sbi);
+int create_flush_cmd_control(struct f2fs_sb_info *sbi);
+void destroy_flush_cmd_control(struct f2fs_sb_info *sbi, bool free);
+void invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr);
+bool is_checkpointed_data(struct f2fs_sb_info *sbi, block_t blkaddr);
+void refresh_sit_entry(struct f2fs_sb_info *sbi, block_t old, block_t new);
+void f2fs_wait_discard_bio(struct f2fs_sb_info *sbi, block_t blkaddr);
+void clear_prefree_segments(struct f2fs_sb_info *sbi, struct cp_control *cpc);
+void release_discard_addrs(struct f2fs_sb_info *sbi);
+int npages_for_summary_flush(struct f2fs_sb_info *sbi, bool for_ra);
+void allocate_new_segments(struct f2fs_sb_info *sbi);
+int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range);
+bool exist_trim_candidates(struct f2fs_sb_info *sbi, struct cp_control *cpc);
+struct page *get_sum_page(struct f2fs_sb_info *sbi, unsigned int segno);
+void update_meta_page(struct f2fs_sb_info *sbi, void *src, block_t blk_addr);
+void write_meta_page(struct f2fs_sb_info *sbi, struct page *page);
+void write_node_page(unsigned int nid, struct f2fs_io_info *fio);
+void write_data_page(struct dnode_of_data *dn, struct f2fs_io_info *fio);
+void rewrite_data_page(struct f2fs_io_info *fio);
+void __f2fs_replace_block(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
+			block_t old_blkaddr, block_t new_blkaddr,
+			bool recover_curseg, bool recover_newaddr);
+void f2fs_replace_block(struct f2fs_sb_info *sbi, struct dnode_of_data *dn,
+			block_t old_addr, block_t new_addr,
+			unsigned char version, bool recover_curseg,
+			bool recover_newaddr);
+void allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
+			block_t old_blkaddr, block_t *new_blkaddr,
+			struct f2fs_summary *sum, int type);
+void f2fs_wait_on_page_writeback(struct page *page,
+			enum page_type type, bool ordered);
+void f2fs_wait_on_encrypted_page_writeback(struct f2fs_sb_info *sbi,
+			block_t blkaddr);
+void write_data_summaries(struct f2fs_sb_info *sbi, block_t start_blk);
+void write_node_summaries(struct f2fs_sb_info *sbi, block_t start_blk);
+int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
+			unsigned int val, int alloc);
+void flush_sit_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc);
+int build_segment_manager(struct f2fs_sb_info *sbi);
+void destroy_segment_manager(struct f2fs_sb_info *sbi);
 int __init create_segment_manager_caches(void);
 void destroy_segment_manager_caches(void);

 /*
  * checkpoint.c
  */
-void f2fs_stop_checkpoint(struct f2fs_sb_info *, bool);
-struct page *grab_meta_page(struct f2fs_sb_info *, pgoff_t);
-struct page *get_meta_page(struct f2fs_sb_info *, pgoff_t);
-struct page *get_tmp_page(struct f2fs_sb_info *, pgoff_t);
-bool is_valid_blkaddr(struct f2fs_sb_info *, block_t, int);
-int ra_meta_pages(struct f2fs_sb_info *, block_t, int, int, bool);
-void ra_meta_pages_cond(struct f2fs_sb_info *, pgoff_t);
-long sync_meta_pages(struct f2fs_sb_info *, enum page_type, long);
-void add_ino_entry(struct f2fs_sb_info *, nid_t, int type);
-void remove_ino_entry(struct f2fs_sb_info *, nid_t, int type);
-void release_ino_entry(struct f2fs_sb_info *, bool);
-bool exist_written_data(struct f2fs_sb_info *, nid_t, int);
-int f2fs_sync_inode_meta(struct f2fs_sb_info *);
-int acquire_orphan_inode(struct f2fs_sb_info *);
-void release_orphan_inode(struct f2fs_sb_info *);
-void add_orphan_inode(struct inode *);
-void remove_orphan_inode(struct f2fs_sb_info *, nid_t);
-int recover_orphan_inodes(struct f2fs_sb_info *);
-int get_valid_checkpoint(struct f2fs_sb_info *);
-void update_dirty_page(struct inode *, struct page *);
-void remove_dirty_inode(struct inode *);
-int sync_dirty_inodes(struct f2fs_sb_info *, enum inode_type);
-int write_checkpoint(struct f2fs_sb_info *, struct cp_control *);
-void init_ino_entry_info(struct f2fs_sb_info *);
+void f2fs_stop_checkpoint(struct f2fs_sb_info *sbi, bool end_io);
+struct page *grab_meta_page(struct f2fs_sb_info *sbi, pgoff_t index);
+struct page *get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index);
+struct page *get_tmp_page(struct f2fs_sb_info *sbi, pgoff_t index);
+bool is_valid_blkaddr(struct f2fs_sb_info *sbi, block_t blkaddr, int type);
+int ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
+			int type, bool sync);
+void ra_meta_pages_cond(struct f2fs_sb_info *sbi, pgoff_t index);
+long sync_meta_pages(struct f2fs_sb_info *sbi, enum page_type type,
+			long nr_to_write);
+void add_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type);
+void remove_ino_entry(struct f2fs_sb_info *sbi, nid_t ino, int type);
+void release_ino_entry(struct f2fs_sb_info *sbi, bool all);
+bool exist_written_data(struct f2fs_sb_info *sbi, nid_t ino, int mode);
+int f2fs_sync_inode_meta(struct f2fs_sb_info *sbi);
+int acquire_orphan_inode(struct f2fs_sb_info *sbi);
+void release_orphan_inode(struct f2fs_sb_info *sbi);
+void add_orphan_inode(struct inode *inode);
+void remove_orphan_inode(struct f2fs_sb_info *sbi, nid_t ino);
+int recover_orphan_inodes(struct f2fs_sb_info *sbi);
+int get_valid_checkpoint(struct f2fs_sb_info *sbi);
+void update_dirty_page(struct inode *inode, struct page *page);
+void remove_dirty_inode(struct inode *inode);
+int sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type);
+int write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc);
+void init_ino_entry_info(struct f2fs_sb_info *sbi);
 int __init create_checkpoint_caches(void);
 void destroy_checkpoint_caches(void);

 /*
  * data.c
  */
-void f2fs_submit_merged_bio(struct f2fs_sb_info *, enum page_type, int);
-void f2fs_submit_merged_bio_cond(struct f2fs_sb_info *, struct inode *,
-				struct page *, nid_t, enum page_type, int);
-void f2fs_flush_merged_bios(struct f2fs_sb_info *);
-int f2fs_submit_page_bio(struct f2fs_io_info *);
-void f2fs_submit_page_mbio(struct f2fs_io_info *);
-struct block_device *f2fs_target_device(struct f2fs_sb_info *,
-				block_t, struct bio *);
-int f2fs_target_device_index(struct f2fs_sb_info *, block_t);
-void set_data_blkaddr(struct dnode_of_data *);
-void f2fs_update_data_blkaddr(struct dnode_of_data *, block_t);
-int reserve_new_blocks(struct dnode_of_data *, blkcnt_t);
-int reserve_new_block(struct dnode_of_data *);
-int f2fs_get_block(struct dnode_of_data *, pgoff_t);
-int f2fs_preallocate_blocks(struct kiocb *, struct iov_iter *);
-int f2fs_reserve_block(struct dnode_of_data *, pgoff_t);
-struct page *get_read_data_page(struct inode *, pgoff_t, int, bool);
-struct page *find_data_page(struct inode *, pgoff_t);
-struct page *get_lock_data_page(struct inode *, pgoff_t, bool);
-struct page *get_new_data_page(struct inode *, struct page *, pgoff_t, bool);
-int do_write_data_page(struct f2fs_io_info *);
-int f2fs_map_blocks(struct inode *, struct f2fs_map_blocks *, int, int);
-int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *, u64, u64);
-void f2fs_set_page_dirty_nobuffers(struct page *);
-void f2fs_invalidate_page(struct page *, unsigned int, unsigned int);
-int f2fs_release_page(struct page *, gfp_t);
+void f2fs_submit_merged_bio(struct f2fs_sb_info *sbi, enum page_type type,
+			int rw);
+void f2fs_submit_merged_bio_cond(struct f2fs_sb_info *sbi,
+			struct inode *inode, nid_t ino, pgoff_t idx,
+			enum page_type type, int rw);
+void f2fs_flush_merged_bios(struct f2fs_sb_info *sbi);
+int f2fs_submit_page_bio(struct f2fs_io_info *fio);
+int f2fs_submit_page_mbio(struct f2fs_io_info *fio);
+struct block_device *f2fs_target_device(struct f2fs_sb_info *sbi,
+			block_t blk_addr, struct bio *bio);
+int f2fs_target_device_index(struct f2fs_sb_info *sbi, block_t blkaddr);
+void set_data_blkaddr(struct dnode_of_data *dn);
+void f2fs_update_data_blkaddr(struct dnode_of_data *dn, block_t blkaddr);
+int reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count);
+int reserve_new_block(struct dnode_of_data *dn);
+int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index);
+int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from);
+int f2fs_reserve_block(struct dnode_of_data *dn, pgoff_t index);
+struct page *get_read_data_page(struct inode *inode, pgoff_t index,
+			int op_flags, bool for_write);
+struct page *find_data_page(struct inode *inode, pgoff_t index);
+struct page *get_lock_data_page(struct inode *inode, pgoff_t index,
+			bool for_write);
+struct page *get_new_data_page(struct inode *inode,
+			struct page *ipage, pgoff_t index, bool new_i_size);
+int do_write_data_page(struct f2fs_io_info *fio);
+int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
+			int create, int flag);
+int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+			u64 start, u64 len);
+void f2fs_set_page_dirty_nobuffers(struct page *page);
+void f2fs_invalidate_page(struct page *page, unsigned int offset,
+			unsigned int length);
+int f2fs_release_page(struct page *page, gfp_t wait);
 #ifdef CONFIG_MIGRATION
-int f2fs_migrate_page(struct address_space *, struct page *, struct page *,
-				enum migrate_mode);
+int f2fs_migrate_page(struct address_space *mapping, struct page *newpage,
+			struct page *page, enum migrate_mode mode);
 #endif

 /*
  * gc.c
  */
-int start_gc_thread(struct f2fs_sb_info *);
-void stop_gc_thread(struct f2fs_sb_info *);
-block_t start_bidx_of_node(unsigned int, struct inode *);
-int f2fs_gc(struct f2fs_sb_info *, bool, bool);
-void build_gc_manager(struct f2fs_sb_info *);
+int start_gc_thread(struct f2fs_sb_info *sbi);
+void stop_gc_thread(struct f2fs_sb_info *sbi);
+block_t start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
+int f2fs_gc(struct f2fs_sb_info *sbi, bool sync, bool background);
+void build_gc_manager(struct f2fs_sb_info *sbi);

 /*
  * recovery.c
  */
-int recover_fsync_data(struct f2fs_sb_info *, bool);
-bool space_for_roll_forward(struct f2fs_sb_info *);
+int recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only);
+bool space_for_roll_forward(struct f2fs_sb_info *sbi);

 /*
  * debug.c
@@ -2227,8 +2334,9 @@ struct f2fs_stat_info {
 	unsigned int ndirty_dirs, ndirty_files, ndirty_all;
 	int nats, dirty_nats, sits, dirty_sits, free_nids, alloc_nids;
 	int total_count, utilization;
-	int bg_gc, nr_wb_cp_data, nr_wb_data;
-	int inline_xattr, inline_inode, inline_dir, orphans;
+	int bg_gc, nr_wb_cp_data, nr_wb_data, nr_flush, nr_discard;
+	int inline_xattr, inline_inode, inline_dir, append, update, orphans;
+	int aw_cnt, max_aw_cnt;
 	unsigned int valid_count, valid_node_count, valid_inode_count, discard_blks;
 	unsigned int bimodal, avg_vblocks;
 	int util_free, util_valid, util_invalid;
@@ -2300,6 +2408,17 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi)
 		((sbi)->block_count[(curseg)->alloc_type]++)
 #define stat_inc_inplace_blocks(sbi)					\
 		(atomic_inc(&(sbi)->inplace_count))
+#define stat_inc_atomic_write(inode)					\
+		(atomic_inc(&F2FS_I_SB(inode)->aw_cnt))
+#define stat_dec_atomic_write(inode)					\
+		(atomic_dec(&F2FS_I_SB(inode)->aw_cnt))
+#define stat_update_max_atomic_write(inode)				\
+	do {								\
+		int cur = atomic_read(&F2FS_I_SB(inode)->aw_cnt);	\
+		int max = atomic_read(&F2FS_I_SB(inode)->max_aw_cnt);	\
+		if (cur > max)						\
+			atomic_set(&F2FS_I_SB(inode)->max_aw_cnt, cur);	\
+	} while (0)
 #define stat_inc_seg_count(sbi, type, gc_type)				\
 	do {								\
 		struct f2fs_stat_info *si = F2FS_STAT(sbi);		\
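stat_update_max_atomic_write() is a racy read-compare-set high-water mark, which is acceptable for a debug counter. The intended pairing, mirroring the file.c hunks later in this series:

	/* on F2FS_IOC_START_ATOMIC_WRITE */
	stat_inc_atomic_write(inode);
	stat_update_max_atomic_write(inode);

	/* after a successful F2FS_IOC_COMMIT_ATOMIC_WRITE */
	stat_dec_atomic_write(inode);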
@@ -2332,8 +2451,8 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi)
 		si->bg_node_blks += (gc_type == BG_GC) ? (blks) : 0;	\
 	} while (0)

-int f2fs_build_stats(struct f2fs_sb_info *);
-void f2fs_destroy_stats(struct f2fs_sb_info *);
+int f2fs_build_stats(struct f2fs_sb_info *sbi);
+void f2fs_destroy_stats(struct f2fs_sb_info *sbi);
 int __init f2fs_create_root_stats(void);
 void f2fs_destroy_root_stats(void);
 #else
@@ -2353,6 +2472,9 @@ void f2fs_destroy_root_stats(void);
 #define stat_dec_inline_inode(inode)
 #define stat_inc_inline_dir(inode)
 #define stat_dec_inline_dir(inode)
+#define stat_inc_atomic_write(inode)
+#define stat_dec_atomic_write(inode)
+#define stat_update_max_atomic_write(inode)
 #define stat_inc_seg_type(sbi, curseg)
 #define stat_inc_block_count(sbi, curseg)
 #define stat_inc_inplace_blocks(sbi)
@@ -2382,49 +2504,55 @@ extern struct kmem_cache *inode_entry_slab;
 /*
  * inline.c
  */
-bool f2fs_may_inline_data(struct inode *);
-bool f2fs_may_inline_dentry(struct inode *);
-void read_inline_data(struct page *, struct page *);
-bool truncate_inline_inode(struct page *, u64);
-int f2fs_read_inline_data(struct inode *, struct page *);
-int f2fs_convert_inline_page(struct dnode_of_data *, struct page *);
-int f2fs_convert_inline_inode(struct inode *);
-int f2fs_write_inline_data(struct inode *, struct page *);
-bool recover_inline_data(struct inode *, struct page *);
-struct f2fs_dir_entry *find_in_inline_dir(struct inode *,
-				struct fscrypt_name *, struct page **);
-int make_empty_inline_dir(struct inode *inode, struct inode *, struct page *);
-int f2fs_add_inline_entry(struct inode *, const struct qstr *,
-		const struct qstr *, struct inode *, nid_t, umode_t);
-void f2fs_delete_inline_entry(struct f2fs_dir_entry *, struct page *,
-						struct inode *, struct inode *);
-bool f2fs_empty_inline_dir(struct inode *);
-int f2fs_read_inline_dir(struct file *, struct dir_context *,
-						struct fscrypt_str *);
-int f2fs_inline_data_fiemap(struct inode *,
-		struct fiemap_extent_info *, __u64, __u64);
+bool f2fs_may_inline_data(struct inode *inode);
+bool f2fs_may_inline_dentry(struct inode *inode);
+void read_inline_data(struct page *page, struct page *ipage);
+bool truncate_inline_inode(struct page *ipage, u64 from);
+int f2fs_read_inline_data(struct inode *inode, struct page *page);
+int f2fs_convert_inline_page(struct dnode_of_data *dn, struct page *page);
+int f2fs_convert_inline_inode(struct inode *inode);
+int f2fs_write_inline_data(struct inode *inode, struct page *page);
+bool recover_inline_data(struct inode *inode, struct page *npage);
+struct f2fs_dir_entry *find_in_inline_dir(struct inode *dir,
+			struct fscrypt_name *fname, struct page **res_page);
+int make_empty_inline_dir(struct inode *inode, struct inode *parent,
+			struct page *ipage);
+int f2fs_add_inline_entry(struct inode *dir, const struct qstr *new_name,
+			const struct qstr *orig_name,
+			struct inode *inode, nid_t ino, umode_t mode);
+void f2fs_delete_inline_entry(struct f2fs_dir_entry *dentry, struct page *page,
+			struct inode *dir, struct inode *inode);
+bool f2fs_empty_inline_dir(struct inode *dir);
+int f2fs_read_inline_dir(struct file *file, struct dir_context *ctx,
+			struct fscrypt_str *fstr);
+int f2fs_inline_data_fiemap(struct inode *inode,
+			struct fiemap_extent_info *fieinfo,
+			__u64 start, __u64 len);

 /*
  * shrinker.c
  */
-unsigned long f2fs_shrink_count(struct shrinker *, struct shrink_control *);
-unsigned long f2fs_shrink_scan(struct shrinker *, struct shrink_control *);
-void f2fs_join_shrinker(struct f2fs_sb_info *);
-void f2fs_leave_shrinker(struct f2fs_sb_info *);
+unsigned long f2fs_shrink_count(struct shrinker *shrink,
+			struct shrink_control *sc);
+unsigned long f2fs_shrink_scan(struct shrinker *shrink,
+			struct shrink_control *sc);
+void f2fs_join_shrinker(struct f2fs_sb_info *sbi);
+void f2fs_leave_shrinker(struct f2fs_sb_info *sbi);

 /*
  * extent_cache.c
  */
-unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *, int);
-bool f2fs_init_extent_tree(struct inode *, struct f2fs_extent *);
-void f2fs_drop_extent_tree(struct inode *);
-unsigned int f2fs_destroy_extent_node(struct inode *);
-void f2fs_destroy_extent_tree(struct inode *);
-bool f2fs_lookup_extent_cache(struct inode *, pgoff_t, struct extent_info *);
-void f2fs_update_extent_cache(struct dnode_of_data *);
+unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink);
+bool f2fs_init_extent_tree(struct inode *inode, struct f2fs_extent *i_ext);
+void f2fs_drop_extent_tree(struct inode *inode);
+unsigned int f2fs_destroy_extent_node(struct inode *inode);
+void f2fs_destroy_extent_tree(struct inode *inode);
+bool f2fs_lookup_extent_cache(struct inode *inode, pgoff_t pgofs,
+			struct extent_info *ei);
+void f2fs_update_extent_cache(struct dnode_of_data *dn);
 void f2fs_update_extent_cache_range(struct dnode_of_data *dn,
-						pgoff_t, block_t, unsigned int);
-void init_extent_cache_info(struct f2fs_sb_info *);
+			pgoff_t fofs, block_t blkaddr, unsigned int len);
+void init_extent_cache_info(struct f2fs_sb_info *sbi);
 int __init create_extent_cache(void);
 void destroy_extent_cache(void);

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c

@@ -20,6 +20,7 @@
 #include <linux/uaccess.h>
 #include <linux/mount.h>
 #include <linux/pagevec.h>
+#include <linux/uio.h>
 #include <linux/uuid.h>
 #include <linux/file.h>
@@ -140,8 +141,6 @@ static inline bool need_do_checkpoint(struct inode *inode)
 		need_cp = true;
 	else if (!is_checkpointed_node(sbi, F2FS_I(inode)->i_pino))
 		need_cp = true;
-	else if (F2FS_I(inode)->xattr_ver == cur_cp_version(F2FS_CKPT(sbi)))
-		need_cp = true;
 	else if (test_opt(sbi, FASTBOOT))
 		need_cp = true;
 	else if (sbi->active_logs == 2)
@@ -167,7 +166,6 @@ static void try_to_fix_pino(struct inode *inode)
 	nid_t pino;

 	down_write(&fi->i_sem);
-	fi->xattr_ver = 0;
 	if (file_wrong_pino(inode) && inode->i_nlink == 1 &&
 			get_parent_ino(inode, &pino)) {
 		f2fs_i_pino_write(inode, pino);
@@ -276,7 +274,8 @@ sync_nodes:
 flush_out:
 	remove_ino_entry(sbi, ino, UPDATE_INO);
 	clear_inode_flag(inode, FI_UPDATE_WRITE);
-	ret = f2fs_issue_flush(sbi);
+	if (!atomic)
+		ret = f2fs_issue_flush(sbi);
 	f2fs_update_time(sbi, REQ_TIME);
 out:
 	trace_f2fs_sync_file_exit(inode, need_cp, datasync, ret);
@@ -567,8 +566,9 @@ int truncate_blocks(struct inode *inode, u64 from, bool lock)
 	}

 	if (f2fs_has_inline_data(inode)) {
-		if (truncate_inline_inode(ipage, from))
-			set_page_dirty(ipage);
+		truncate_inline_inode(ipage, from);
+		if (from == 0)
+			clear_inode_flag(inode, FI_DATA_EXIST);
 		f2fs_put_page(ipage, 1);
 		truncate_page = true;
 		goto out;
@@ -1541,6 +1541,8 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
 	if (ret)
 		clear_inode_flag(inode, FI_ATOMIC_FILE);
 out:
+	stat_inc_atomic_write(inode);
+	stat_update_max_atomic_write(inode);
 	inode_unlock(inode);
 	mnt_drop_write_file(filp);
 	return ret;
@@ -1564,15 +1566,18 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 		goto err_out;

 	if (f2fs_is_atomic_file(inode)) {
-		clear_inode_flag(inode, FI_ATOMIC_FILE);
 		ret = commit_inmem_pages(inode);
-		if (ret) {
-			set_inode_flag(inode, FI_ATOMIC_FILE);
+		if (ret)
 			goto err_out;
-		}
-	}

-	ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
+		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
+		if (!ret) {
+			clear_inode_flag(inode, FI_ATOMIC_FILE);
+			stat_dec_atomic_write(inode);
+		}
+	} else {
+		ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
+	}
 err_out:
 	inode_unlock(inode);
 	mnt_drop_write_file(filp);
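The reordering closes a small correctness hole: the old path cleared FI_ATOMIC_FILE before commit_inmem_pages() and re-set it on failure, so a racing observer could briefly see a mid-commit atomic file as non-atomic. Now the in-memory pages are committed and fsync'ed first, and only a fully successful commit drops the flag and the atomic-write stat; the else branch keeps plain fsync semantics for non-atomic callers.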
@@ -1870,7 +1875,7 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
 {
 	struct inode *inode = file_inode(filp);
 	struct f2fs_map_blocks map = { .m_next_pgofs = NULL };
-	struct extent_info ei;
+	struct extent_info ei = {0,0,0};
 	pgoff_t pg_start, pg_end;
 	unsigned int blk_per_seg = sbi->blocks_per_seg;
 	unsigned int total = 0, sec_num;
@@ -2250,8 +2255,12 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	inode_lock(inode);
 	ret = generic_write_checks(iocb, from);
 	if (ret > 0) {
-		int err = f2fs_preallocate_blocks(iocb, from);
+		int err;
+
+		if (iov_iter_fault_in_readable(from, iov_iter_count(from)))
+			set_inode_flag(inode, FI_NO_PREALLOC);
+
+		err = f2fs_preallocate_blocks(iocb, from);
 		if (err) {
 			inode_unlock(inode);
 			return err;
@@ -2259,6 +2268,7 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		blk_start_plug(&plug);
 		ret = __generic_file_write_iter(iocb, from);
 		blk_finish_plug(&plug);
+		clear_inode_flag(inode, FI_NO_PREALLOC);
 	}
 	inode_unlock(inode);
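If iov_iter_fault_in_readable() fails, part of the user buffer may fault during the copy and the write can complete short; preallocating for the full request would then strand blocks that are never written. FI_NO_PREALLOC flags that window around the write so the preallocation path can opt out. A hedged sketch of a consumer (placement illustrative, not a hunk from this series):

	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
		return 0;	/* skip preallocation; map blocks on demand */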

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

@@ -48,8 +48,10 @@ static int gc_thread_func(void *data)
 		}

 #ifdef CONFIG_F2FS_FAULT_INJECTION
-		if (time_to_inject(sbi, FAULT_CHECKPOINT))
+		if (time_to_inject(sbi, FAULT_CHECKPOINT)) {
+			f2fs_show_injection_info(FAULT_CHECKPOINT);
 			f2fs_stop_checkpoint(sbi, false);
+		}
 #endif

 		/*
@@ -166,7 +168,8 @@ static void select_policy(struct f2fs_sb_info *sbi, int gc_type,
 		p->ofs_unit = sbi->segs_per_sec;
 	}

-	if (p->max_search > sbi->max_victim_search)
+	/* we need to check every dirty segments in the FG_GC case */
+	if (gc_type != FG_GC && p->max_search > sbi->max_victim_search)
 		p->max_search = sbi->max_victim_search;

 	p->offset = sbi->last_victim[p->gc_mode];
@@ -199,6 +202,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info *sbi)
 	for_each_set_bit(secno, dirty_i->victim_secmap, MAIN_SECS(sbi)) {
 		if (sec_usage_check(sbi, secno))
 			continue;
+
+		if (no_fggc_candidate(sbi, secno))
+			continue;
+
 		clear_bit(secno, dirty_i->victim_secmap);
 		return secno * sbi->segs_per_sec;
 	}
@@ -237,6 +244,16 @@ static unsigned int get_cb_cost(struct f2fs_sb_info *sbi, unsigned int segno)
 	return UINT_MAX - ((100 * (100 - u) * age) / (100 + u));
 }

+static unsigned int get_greedy_cost(struct f2fs_sb_info *sbi,
+						unsigned int segno)
+{
+	unsigned int valid_blocks =
+			get_valid_blocks(sbi, segno, sbi->segs_per_sec);
+
+	return IS_DATASEG(get_seg_entry(sbi, segno)->type) ?
+				valid_blocks * 2 : valid_blocks;
+}
+
 static inline unsigned int get_gc_cost(struct f2fs_sb_info *sbi,
 			unsigned int segno, struct victim_sel_policy *p)
 {
@@ -245,7 +262,7 @@ static inline unsigned int get_gc_cost(struct f2fs_sb_info *sbi,

 	/* alloc_mode == LFS */
 	if (p->gc_mode == GC_GREEDY)
-		return get_valid_blocks(sbi, segno, sbi->segs_per_sec);
+		return get_greedy_cost(sbi, segno);
 	else
 		return get_cb_cost(sbi, segno);
 }
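The doubled cost makes greedy selection prefer node over data sections at equal utilization, reflecting that moving data blocks also dirties their dnodes and so roughly doubles the real write-out. For example, a data segment with 100 valid blocks now costs 200 and loses to a node segment with 150 valid blocks (cost 150), where the old policy would have picked the data segment.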
@@ -322,13 +339,15 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
 			nsearched++;
 		}

 		secno = GET_SECNO(sbi, segno);

 		if (sec_usage_check(sbi, secno))
 			goto next;
 		if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
 			goto next;
+		if (gc_type == FG_GC && p.alloc_mode == LFS &&
+					no_fggc_candidate(sbi, secno))
+			goto next;

 		cost = get_gc_cost(sbi, segno, &p);
@@ -569,6 +588,9 @@ static void move_encrypted_block(struct inode *inode, block_t bidx,
 	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
 		goto out;

+	if (f2fs_is_atomic_file(inode))
+		goto out;
+
 	set_new_dnode(&dn, inode, NULL, NULL, 0);
 	err = get_dnode_of_data(&dn, bidx, LOOKUP_NODE);
 	if (err)
@@ -661,6 +683,9 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
 	if (!check_valid_map(F2FS_I_SB(inode), segno, off))
 		goto out;

+	if (f2fs_is_atomic_file(inode))
+		goto out;
+
 	if (gc_type == BG_GC) {
 		if (PageWriteback(page))
 			goto out;
@@ -921,8 +946,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync, bool background)
 	cpc.reason = __get_cp_reason(sbi);
 gc_more:
-	segno = NULL_SEGNO;
-
 	if (unlikely(!(sbi->sb->s_flags & MS_ACTIVE)))
 		goto stop;
 	if (unlikely(f2fs_cp_error(sbi))) {
@@ -930,30 +953,23 @@ gc_more:
 		goto stop;
 	}

-	if (gc_type == BG_GC && has_not_enough_free_secs(sbi, sec_freed, 0)) {
-		gc_type = FG_GC;
+	if (gc_type == BG_GC && has_not_enough_free_secs(sbi, 0, 0)) {
 		/*
-		 * If there is no victim and no prefree segment but still not
-		 * enough free sections, we should flush dent/node blocks and do
-		 * garbage collections.
+		 * For example, if there are many prefree_segments below given
+		 * threshold, we can make them free by checkpoint. Then, we
+		 * secure free segments which doesn't need fggc any more.
 		 */
-		if (__get_victim(sbi, &segno, gc_type) ||
-						prefree_segments(sbi)) {
-			ret = write_checkpoint(sbi, &cpc);
-			if (ret)
-				goto stop;
-			segno = NULL_SEGNO;
-		} else if (has_not_enough_free_secs(sbi, 0, 0)) {
-			ret = write_checkpoint(sbi, &cpc);
-			if (ret)
-				goto stop;
-		}
-	} else if (gc_type == BG_GC && !background) {
-		/* f2fs_balance_fs doesn't need to do BG_GC in critical path. */
-		goto stop;
+		ret = write_checkpoint(sbi, &cpc);
+		if (ret)
+			goto stop;
+		if (has_not_enough_free_secs(sbi, 0, 0))
+			gc_type = FG_GC;
 	}

-	if (segno == NULL_SEGNO && !__get_victim(sbi, &segno, gc_type))
+	/* f2fs_balance_fs doesn't need to do BG_GC in critical path. */
+	if (gc_type == BG_GC && !background)
+		goto stop;
+
+	if (!__get_victim(sbi, &segno, gc_type))
 		goto stop;
 	ret = 0;
@@ -983,5 +999,16 @@ stop:
 void build_gc_manager(struct f2fs_sb_info *sbi)
 {
+	u64 main_count, resv_count, ovp_count, blocks_per_sec;
+
 	DIRTY_I(sbi)->v_ops = &default_v_ops;
+
+	/* threshold of # of valid blocks in a section for victims of FG_GC */
+	main_count = SM_I(sbi)->main_segments << sbi->log_blocks_per_seg;
+	resv_count = SM_I(sbi)->reserved_segments << sbi->log_blocks_per_seg;
+	ovp_count = SM_I(sbi)->ovp_segments << sbi->log_blocks_per_seg;
+	blocks_per_sec = sbi->blocks_per_seg * sbi->segs_per_sec;
+
+	sbi->fggc_threshold = div64_u64((main_count - ovp_count) * blocks_per_sec,
+					(main_count - resv_count));
 }
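fggc_threshold is the mean number of valid blocks a section would hold if all non-overprovisioned data were spread evenly across the non-reserved main area; no_fggc_candidate(), referenced in the hunks above, uses it to skip sections too full to be worth foreground GC. A worked example under assumed geometry (all numbers illustrative, standalone user-space arithmetic only):

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		/* assumed: 512 blocks per segment, 1 segment per section */
		uint64_t main_count = 10000ULL * 512;	/* main area, in blocks */
		uint64_t ovp_count  =   800ULL * 512;	/* overprovisioned blocks */
		uint64_t resv_count =   400ULL * 512;	/* reserved blocks */
		uint64_t blocks_per_sec = 512;

		/* same arithmetic as build_gc_manager() above */
		uint64_t t = (main_count - ovp_count) * blocks_per_sec /
						(main_count - resv_count);
		printf("fggc_threshold = %llu\n", (unsigned long long)t); /* 490 */
		return 0;
	}

So here a section keeping 490 or more of its 512 blocks valid would be passed over by foreground GC.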

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c

@@ -373,8 +373,10 @@ void f2fs_evict_inode(struct inode *inode)
 		goto no_delete;

 #ifdef CONFIG_F2FS_FAULT_INJECTION
-	if (time_to_inject(sbi, FAULT_EVICT_INODE))
+	if (time_to_inject(sbi, FAULT_EVICT_INODE)) {
+		f2fs_show_injection_info(FAULT_EVICT_INODE);
 		goto no_delete;
+	}
 #endif

 	remove_ino_entry(sbi, inode->i_ino, APPEND_INO);

diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c

@@ -321,9 +321,9 @@ static struct dentry *f2fs_lookup(struct inode *dir, struct dentry *dentry,
 		if (err)
 			goto err_out;
 	}
-	if (!IS_ERR(inode) && f2fs_encrypted_inode(dir) &&
-			(S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode)) &&
-			!fscrypt_has_permitted_context(dir, inode)) {
+	if (f2fs_encrypted_inode(dir) &&
+		(S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode)) &&
+		!fscrypt_has_permitted_context(dir, inode)) {
 		bool nokey = f2fs_encrypted_inode(inode) &&
 			!fscrypt_has_encryption_key(inode);
 		err = nokey ? -ENOKEY : -EPERM;
@@ -663,6 +663,12 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	bool is_old_inline = f2fs_has_inline_dentry(old_dir);
 	int err = -ENOENT;

+	if ((f2fs_encrypted_inode(old_dir) &&
+			!fscrypt_has_encryption_key(old_dir)) ||
+			(f2fs_encrypted_inode(new_dir) &&
+			!fscrypt_has_encryption_key(new_dir)))
+		return -ENOKEY;
+
 	if ((old_dir != new_dir) && f2fs_encrypted_inode(new_dir) &&
 			!fscrypt_has_permitted_context(new_dir, old_inode)) {
 		err = -EPERM;
@@ -843,6 +849,12 @@ static int f2fs_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
 	int old_nlink = 0, new_nlink = 0;
 	int err = -ENOENT;

+	if ((f2fs_encrypted_inode(old_dir) &&
+			!fscrypt_has_encryption_key(old_dir)) ||
+			(f2fs_encrypted_inode(new_dir) &&
+			!fscrypt_has_encryption_key(new_dir)))
+		return -ENOKEY;
+
 	if ((f2fs_encrypted_inode(old_dir) || f2fs_encrypted_inode(new_dir)) &&
 			(old_dir != new_dir) &&
 			(!fscrypt_has_permitted_context(new_dir, old_inode) ||

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c

@@ -245,12 +245,24 @@ bool need_inode_block_update(struct f2fs_sb_info *sbi, nid_t ino)
 	return need_update;
 }

-static struct nat_entry *grab_nat_entry(struct f2fs_nm_info *nm_i, nid_t nid)
+static struct nat_entry *grab_nat_entry(struct f2fs_nm_info *nm_i, nid_t nid,
+								bool no_fail)
 {
 	struct nat_entry *new;

-	new = f2fs_kmem_cache_alloc(nat_entry_slab, GFP_NOFS);
-	f2fs_radix_tree_insert(&nm_i->nat_root, nid, new);
+	if (no_fail) {
+		new = f2fs_kmem_cache_alloc(nat_entry_slab, GFP_NOFS);
+		f2fs_radix_tree_insert(&nm_i->nat_root, nid, new);
+	} else {
+		new = kmem_cache_alloc(nat_entry_slab, GFP_NOFS);
+		if (!new)
+			return NULL;
+		if (radix_tree_insert(&nm_i->nat_root, nid, new)) {
+			kmem_cache_free(nat_entry_slab, new);
+			return NULL;
+		}
+	}
+
 	memset(new, 0, sizeof(struct nat_entry));
 	nat_set_nid(new, nid);
 	nat_reset_flag(new);
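The no_fail split matters because the two callers have different contracts: the read-side cache fill may simply skip caching under memory pressure, while the update side must not lose a mapping. The following hunks show exactly that pairing:

	/* cache_nat_entry(): opportunistic, tolerates NULL */
	e = grab_nat_entry(nm_i, nid, false);
	if (e)
		node_info_from_raw_nat(&e->ni, ne);

	/* set_node_addr(): must succeed (f2fs_kmem_cache_alloc retries) */
	e = grab_nat_entry(nm_i, ni->nid, true);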
@@ -267,8 +279,9 @@ static void cache_nat_entry(struct f2fs_sb_info *sbi, nid_t nid,
 	e = __lookup_nat_cache(nm_i, nid);
 	if (!e) {
-		e = grab_nat_entry(nm_i, nid);
-		node_info_from_raw_nat(&e->ni, ne);
+		e = grab_nat_entry(nm_i, nid, false);
+		if (e)
+			node_info_from_raw_nat(&e->ni, ne);
 	} else {
 		f2fs_bug_on(sbi, nat_get_ino(e) != le32_to_cpu(ne->ino) ||
 				nat_get_blkaddr(e) !=
@@ -286,7 +299,7 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
 	down_write(&nm_i->nat_tree_lock);
 	e = __lookup_nat_cache(nm_i, ni->nid);
 	if (!e) {
-		e = grab_nat_entry(nm_i, ni->nid);
+		e = grab_nat_entry(nm_i, ni->nid, true);
 		copy_node_info(&e->ni, ni);
 		f2fs_bug_on(sbi, ni->blk_addr == NEW_ADDR);
 	} else if (new_blkaddr == NEW_ADDR) {
@@ -325,6 +338,9 @@ static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
 	set_nat_flag(e, IS_CHECKPOINTED, false);
 	__set_nat_cache_dirty(nm_i, e);

+	if (enabled_nat_bits(sbi, NULL) && new_blkaddr == NEW_ADDR)
+		clear_bit_le(NAT_BLOCK_OFFSET(ni->nid), nm_i->empty_nat_bits);
+
 	/* update fsync_mark if its inode nat entry is still alive */
 	if (ni->nid != ni->ino)
 		e = __lookup_nat_cache(nm_i, ni->ino);
@ -958,9 +974,6 @@ int truncate_xattr_node(struct inode *inode, struct page *page)
f2fs_i_xnid_write(inode, 0); f2fs_i_xnid_write(inode, 0);
/* need to do checkpoint during fsync */
F2FS_I(inode)->xattr_ver = cur_cp_version(F2FS_CKPT(sbi));
set_new_dnode(&dn, inode, page, npage, nid); set_new_dnode(&dn, inode, page, npage, nid);
if (page) if (page)
@ -1018,7 +1031,7 @@ struct page *new_node_page(struct dnode_of_data *dn,
unsigned int ofs, struct page *ipage) unsigned int ofs, struct page *ipage)
{ {
struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode); struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode);
struct node_info old_ni, new_ni; struct node_info new_ni;
struct page *page; struct page *page;
int err; int err;
@ -1033,13 +1046,15 @@ struct page *new_node_page(struct dnode_of_data *dn,
err = -ENOSPC; err = -ENOSPC;
goto fail; goto fail;
} }
#ifdef CONFIG_F2FS_CHECK_FS
get_node_info(sbi, dn->nid, &old_ni); get_node_info(sbi, dn->nid, &new_ni);
f2fs_bug_on(sbi, new_ni.blk_addr != NULL_ADDR);
/* Reinitialize old_ni with new node page */ #endif
f2fs_bug_on(sbi, old_ni.blk_addr != NULL_ADDR); new_ni.nid = dn->nid;
new_ni = old_ni;
new_ni.ino = dn->inode->i_ino; new_ni.ino = dn->inode->i_ino;
new_ni.blk_addr = NULL_ADDR;
new_ni.flag = 0;
new_ni.version = 0;
set_node_addr(sbi, &new_ni, NEW_ADDR, false); set_node_addr(sbi, &new_ni, NEW_ADDR, false);
f2fs_wait_on_page_writeback(page, NODE, true); f2fs_wait_on_page_writeback(page, NODE, true);
@ -1305,16 +1320,99 @@ continue_unlock:
return last_page; return last_page;
} }
static int __write_node_page(struct page *page, bool atomic, bool *submitted,
struct writeback_control *wbc)
{
struct f2fs_sb_info *sbi = F2FS_P_SB(page);
nid_t nid;
struct node_info ni;
struct f2fs_io_info fio = {
.sbi = sbi,
.type = NODE,
.op = REQ_OP_WRITE,
.op_flags = wbc_to_write_flags(wbc),
.page = page,
.encrypted_page = NULL,
.submitted = false,
};
trace_f2fs_writepage(page, NODE);
if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
goto redirty_out;
if (unlikely(f2fs_cp_error(sbi)))
goto redirty_out;
/* get old block addr of this node page */
nid = nid_of_node(page);
f2fs_bug_on(sbi, page->index != nid);
if (wbc->for_reclaim) {
if (!down_read_trylock(&sbi->node_write))
goto redirty_out;
} else {
down_read(&sbi->node_write);
}
get_node_info(sbi, nid, &ni);
/* This page is already truncated */
if (unlikely(ni.blk_addr == NULL_ADDR)) {
ClearPageUptodate(page);
dec_page_count(sbi, F2FS_DIRTY_NODES);
up_read(&sbi->node_write);
unlock_page(page);
return 0;
}
if (atomic && !test_opt(sbi, NOBARRIER))
fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
set_page_writeback(page);
fio.old_blkaddr = ni.blk_addr;
write_node_page(nid, &fio);
set_node_addr(sbi, &ni, fio.new_blkaddr, is_fsync_dnode(page));
dec_page_count(sbi, F2FS_DIRTY_NODES);
up_read(&sbi->node_write);
if (wbc->for_reclaim) {
f2fs_submit_merged_bio_cond(sbi, page->mapping->host, 0,
page->index, NODE, WRITE);
submitted = NULL;
}
unlock_page(page);
if (unlikely(f2fs_cp_error(sbi))) {
f2fs_submit_merged_bio(sbi, NODE, WRITE);
submitted = NULL;
}
if (submitted)
*submitted = fio.submitted;
return 0;
redirty_out:
redirty_page_for_writepage(wbc, page);
return AOP_WRITEPAGE_ACTIVATE;
}
static int f2fs_write_node_page(struct page *page,
struct writeback_control *wbc)
{
return __write_node_page(page, false, NULL, wbc);
}
int fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode, int fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
struct writeback_control *wbc, bool atomic) struct writeback_control *wbc, bool atomic)
{ {
pgoff_t index, end; pgoff_t index, end;
pgoff_t last_idx = ULONG_MAX;
struct pagevec pvec; struct pagevec pvec;
int ret = 0; int ret = 0;
struct page *last_page = NULL; struct page *last_page = NULL;
bool marked = false; bool marked = false;
nid_t ino = inode->i_ino; nid_t ino = inode->i_ino;
int nwritten = 0;
if (atomic) { if (atomic) {
last_page = last_fsync_dnode(sbi, ino); last_page = last_fsync_dnode(sbi, ino);
@ -1336,6 +1434,7 @@ retry:
for (i = 0; i < nr_pages; i++) { for (i = 0; i < nr_pages; i++) {
struct page *page = pvec.pages[i]; struct page *page = pvec.pages[i];
bool submitted = false;
if (unlikely(f2fs_cp_error(sbi))) { if (unlikely(f2fs_cp_error(sbi))) {
f2fs_put_page(last_page, 0); f2fs_put_page(last_page, 0);
@ -1384,13 +1483,15 @@ continue_unlock:
if (!clear_page_dirty_for_io(page)) if (!clear_page_dirty_for_io(page))
goto continue_unlock; goto continue_unlock;
ret = NODE_MAPPING(sbi)->a_ops->writepage(page, wbc); ret = __write_node_page(page, atomic &&
page == last_page,
&submitted, wbc);
if (ret) { if (ret) {
unlock_page(page); unlock_page(page);
f2fs_put_page(last_page, 0); f2fs_put_page(last_page, 0);
break; break;
} else { } else if (submitted) {
nwritten++; last_idx = page->index;
} }
if (page == last_page) { if (page == last_page) {
@ -1416,8 +1517,9 @@ continue_unlock:
goto retry; goto retry;
} }
out: out:
if (nwritten) if (last_idx != ULONG_MAX)
f2fs_submit_merged_bio_cond(sbi, NULL, NULL, ino, NODE, WRITE); f2fs_submit_merged_bio_cond(sbi, NULL, ino, last_idx,
NODE, WRITE);
return ret ? -EIO: 0; return ret ? -EIO: 0;
} }
@ -1445,6 +1547,7 @@ next_step:
for (i = 0; i < nr_pages; i++) { for (i = 0; i < nr_pages; i++) {
struct page *page = pvec.pages[i]; struct page *page = pvec.pages[i];
bool submitted = false;
if (unlikely(f2fs_cp_error(sbi))) { if (unlikely(f2fs_cp_error(sbi))) {
pagevec_release(&pvec); pagevec_release(&pvec);
@ -1498,9 +1601,10 @@ continue_unlock:
set_fsync_mark(page, 0); set_fsync_mark(page, 0);
set_dentry_mark(page, 0); set_dentry_mark(page, 0);
if (NODE_MAPPING(sbi)->a_ops->writepage(page, wbc)) ret = __write_node_page(page, false, &submitted, wbc);
if (ret)
unlock_page(page); unlock_page(page);
else else if (submitted)
nwritten++; nwritten++;
if (--wbc->nr_to_write == 0) if (--wbc->nr_to_write == 0)
@ -1564,72 +1668,6 @@ int wait_on_node_pages_writeback(struct f2fs_sb_info *sbi, nid_t ino)
return ret; return ret;
} }
static int f2fs_write_node_page(struct page *page,
struct writeback_control *wbc)
{
struct f2fs_sb_info *sbi = F2FS_P_SB(page);
nid_t nid;
struct node_info ni;
struct f2fs_io_info fio = {
.sbi = sbi,
.type = NODE,
.op = REQ_OP_WRITE,
.op_flags = wbc_to_write_flags(wbc),
.page = page,
.encrypted_page = NULL,
};
trace_f2fs_writepage(page, NODE);
if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
goto redirty_out;
if (unlikely(f2fs_cp_error(sbi)))
goto redirty_out;
/* get old block addr of this node page */
nid = nid_of_node(page);
f2fs_bug_on(sbi, page->index != nid);
if (wbc->for_reclaim) {
if (!down_read_trylock(&sbi->node_write))
goto redirty_out;
} else {
down_read(&sbi->node_write);
}
get_node_info(sbi, nid, &ni);
/* This page is already truncated */
if (unlikely(ni.blk_addr == NULL_ADDR)) {
ClearPageUptodate(page);
dec_page_count(sbi, F2FS_DIRTY_NODES);
up_read(&sbi->node_write);
unlock_page(page);
return 0;
}
set_page_writeback(page);
fio.old_blkaddr = ni.blk_addr;
write_node_page(nid, &fio);
set_node_addr(sbi, &ni, fio.new_blkaddr, is_fsync_dnode(page));
dec_page_count(sbi, F2FS_DIRTY_NODES);
up_read(&sbi->node_write);
if (wbc->for_reclaim)
f2fs_submit_merged_bio_cond(sbi, NULL, page, 0, NODE, WRITE);
unlock_page(page);
if (unlikely(f2fs_cp_error(sbi)))
f2fs_submit_merged_bio(sbi, NODE, WRITE);
return 0;
redirty_out:
redirty_page_for_writepage(wbc, page);
return AOP_WRITEPAGE_ACTIVATE;
}
static int f2fs_write_node_pages(struct address_space *mapping, static int f2fs_write_node_pages(struct address_space *mapping,
struct writeback_control *wbc) struct writeback_control *wbc)
{ {
@ -1727,7 +1765,8 @@ static void __remove_nid_from_list(struct f2fs_sb_info *sbi,
radix_tree_delete(&nm_i->free_nid_root, i->nid); radix_tree_delete(&nm_i->free_nid_root, i->nid);
} }
static int add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build) /* return if the nid is recognized as free */
static bool add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
{ {
struct f2fs_nm_info *nm_i = NM_I(sbi); struct f2fs_nm_info *nm_i = NM_I(sbi);
struct free_nid *i; struct free_nid *i;
@ -1736,14 +1775,14 @@ static int add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
/* 0 nid should not be used */ /* 0 nid should not be used */
if (unlikely(nid == 0)) if (unlikely(nid == 0))
return 0; return false;
if (build) { if (build) {
/* do not add allocated nids */ /* do not add allocated nids */
ne = __lookup_nat_cache(nm_i, nid); ne = __lookup_nat_cache(nm_i, nid);
if (ne && (!get_nat_flag(ne, IS_CHECKPOINTED) || if (ne && (!get_nat_flag(ne, IS_CHECKPOINTED) ||
nat_get_blkaddr(ne) != NULL_ADDR)) nat_get_blkaddr(ne) != NULL_ADDR))
return 0; return false;
} }
i = f2fs_kmem_cache_alloc(free_nid_slab, GFP_NOFS); i = f2fs_kmem_cache_alloc(free_nid_slab, GFP_NOFS);
@ -1752,7 +1791,7 @@ static int add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
if (radix_tree_preload(GFP_NOFS)) { if (radix_tree_preload(GFP_NOFS)) {
kmem_cache_free(free_nid_slab, i); kmem_cache_free(free_nid_slab, i);
return 0; return true;
} }
spin_lock(&nm_i->nid_list_lock); spin_lock(&nm_i->nid_list_lock);
@ -1761,9 +1800,9 @@ static int add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
radix_tree_preload_end(); radix_tree_preload_end();
if (err) { if (err) {
kmem_cache_free(free_nid_slab, i); kmem_cache_free(free_nid_slab, i);
return 0; return true;
} }
return 1; return true;
} }
static void remove_free_nid(struct f2fs_sb_info *sbi, nid_t nid) static void remove_free_nid(struct f2fs_sb_info *sbi, nid_t nid)
@ -1784,17 +1823,36 @@ static void remove_free_nid(struct f2fs_sb_info *sbi, nid_t nid)
kmem_cache_free(free_nid_slab, i); kmem_cache_free(free_nid_slab, i);
} }
void update_free_nid_bitmap(struct f2fs_sb_info *sbi, nid_t nid, bool set)
{
struct f2fs_nm_info *nm_i = NM_I(sbi);
unsigned int nat_ofs = NAT_BLOCK_OFFSET(nid);
unsigned int nid_ofs = nid - START_NID(nid);
if (!test_bit_le(nat_ofs, nm_i->nat_block_bitmap))
return;
if (set)
set_bit_le(nid_ofs, nm_i->free_nid_bitmap[nat_ofs]);
else
clear_bit_le(nid_ofs, nm_i->free_nid_bitmap[nat_ofs]);
}
static void scan_nat_page(struct f2fs_sb_info *sbi, static void scan_nat_page(struct f2fs_sb_info *sbi,
struct page *nat_page, nid_t start_nid) struct page *nat_page, nid_t start_nid)
{ {
struct f2fs_nm_info *nm_i = NM_I(sbi); struct f2fs_nm_info *nm_i = NM_I(sbi);
struct f2fs_nat_block *nat_blk = page_address(nat_page); struct f2fs_nat_block *nat_blk = page_address(nat_page);
block_t blk_addr; block_t blk_addr;
unsigned int nat_ofs = NAT_BLOCK_OFFSET(start_nid);
int i; int i;
set_bit_le(nat_ofs, nm_i->nat_block_bitmap);
i = start_nid % NAT_ENTRY_PER_BLOCK; i = start_nid % NAT_ENTRY_PER_BLOCK;
for (; i < NAT_ENTRY_PER_BLOCK; i++, start_nid++) { for (; i < NAT_ENTRY_PER_BLOCK; i++, start_nid++) {
bool freed = false;
if (unlikely(start_nid >= nm_i->max_nid)) if (unlikely(start_nid >= nm_i->max_nid))
break; break;
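
update_free_nid_bitmap() above splits a nid into a per-NAT-block slot: NAT_BLOCK_OFFSET(nid) picks the block, and nid - START_NID(nid) is the offset inside it, so the free-nid state of one NAT block fits in one small bitmap that is only trusted after scan_nat_page() has marked the block scanned. A worked sketch of the index math, assuming the mainline value of 455 NAT entries per 4KB block:

    #define NAT_ENTRY_PER_BLOCK 455   /* 4096 / sizeof(struct f2fs_nat_entry), assumed here */

    /* e.g. nid 1000 -> NAT block 2 (1000 / 455), offset 90 (1000 % 455) */
    static void nid_to_bitmap_pos(unsigned int nid,
                                  unsigned int *nat_ofs, unsigned int *nid_ofs)
    {
        *nat_ofs = nid / NAT_ENTRY_PER_BLOCK;   /* NAT_BLOCK_OFFSET(nid) */
        *nid_ofs = nid % NAT_ENTRY_PER_BLOCK;   /* nid - START_NID(nid) */
    }
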
@@ -1802,11 +1860,106 @@ static void scan_nat_page(struct f2fs_sb_info *sbi,
 
 		blk_addr = le32_to_cpu(nat_blk->entries[i].block_addr);
 		f2fs_bug_on(sbi, blk_addr == NEW_ADDR);
 		if (blk_addr == NULL_ADDR)
-			add_free_nid(sbi, start_nid, true);
+			freed = add_free_nid(sbi, start_nid, true);
+		update_free_nid_bitmap(sbi, start_nid, freed);
 	}
 }
 
-static void __build_free_nids(struct f2fs_sb_info *sbi, bool sync)
+static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
+{
+	struct f2fs_nm_info *nm_i = NM_I(sbi);
+	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
+	struct f2fs_journal *journal = curseg->journal;
+	unsigned int i, idx;
+
+	down_read(&nm_i->nat_tree_lock);
+
+	for (i = 0; i < nm_i->nat_blocks; i++) {
+		if (!test_bit_le(i, nm_i->nat_block_bitmap))
+			continue;
+		for (idx = 0; idx < NAT_ENTRY_PER_BLOCK; idx++) {
+			nid_t nid;
+
+			if (!test_bit_le(idx, nm_i->free_nid_bitmap[i]))
+				continue;
+
+			nid = i * NAT_ENTRY_PER_BLOCK + idx;
+			add_free_nid(sbi, nid, true);
+
+			if (nm_i->nid_cnt[FREE_NID_LIST] >= MAX_FREE_NIDS)
+				goto out;
+		}
+	}
+out:
+	down_read(&curseg->journal_rwsem);
+	for (i = 0; i < nats_in_cursum(journal); i++) {
+		block_t addr;
+		nid_t nid;
+
+		addr = le32_to_cpu(nat_in_journal(journal, i).block_addr);
+		nid = le32_to_cpu(nid_in_journal(journal, i));
+		if (addr == NULL_ADDR)
+			add_free_nid(sbi, nid, true);
+		else
+			remove_free_nid(sbi, nid);
+	}
+	up_read(&curseg->journal_rwsem);
+	up_read(&nm_i->nat_tree_lock);
+}
+
+static int scan_nat_bits(struct f2fs_sb_info *sbi)
+{
+	struct f2fs_nm_info *nm_i = NM_I(sbi);
+	struct page *page;
+	unsigned int i = 0;
+	nid_t nid;
+
+	if (!enabled_nat_bits(sbi, NULL))
+		return -EAGAIN;
+
+	down_read(&nm_i->nat_tree_lock);
+check_empty:
+	i = find_next_bit_le(nm_i->empty_nat_bits, nm_i->nat_blocks, i);
+	if (i >= nm_i->nat_blocks) {
+		i = 0;
+		goto check_partial;
+	}
+
+	for (nid = i * NAT_ENTRY_PER_BLOCK; nid < (i + 1) * NAT_ENTRY_PER_BLOCK;
+									nid++) {
+		if (unlikely(nid >= nm_i->max_nid))
+			break;
+		add_free_nid(sbi, nid, true);
+	}
+
+	if (nm_i->nid_cnt[FREE_NID_LIST] >= MAX_FREE_NIDS)
+		goto out;
+	i++;
+	goto check_empty;
+
+check_partial:
+	i = find_next_zero_bit_le(nm_i->full_nat_bits, nm_i->nat_blocks, i);
+	if (i >= nm_i->nat_blocks) {
+		disable_nat_bits(sbi, true);
+		up_read(&nm_i->nat_tree_lock);
+		return -EINVAL;
+	}
+
+	nid = i * NAT_ENTRY_PER_BLOCK;
+	page = get_current_nat_page(sbi, nid);
+	scan_nat_page(sbi, page, nid);
+	f2fs_put_page(page, 1);
+
+	if (nm_i->nid_cnt[FREE_NID_LIST] < MAX_FREE_NIDS) {
+		i++;
+		goto check_partial;
+	}
+out:
+	up_read(&nm_i->nat_tree_lock);
+	return 0;
+}
+
+static void __build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount)
 {
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
@@ -1821,6 +1974,29 @@ static void __build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount)
 	if (!sync && !available_free_memory(sbi, FREE_NIDS))
 		return;
 
+	if (!mount) {
+		/* try to find free nids in free_nid_bitmap */
+		scan_free_nid_bits(sbi);
+
+		if (nm_i->nid_cnt[FREE_NID_LIST])
+			return;
+
+		/* try to find free nids with nat_bits */
+		if (!scan_nat_bits(sbi) && nm_i->nid_cnt[FREE_NID_LIST])
+			return;
+	}
+
+	/* find next valid candidate */
+	if (enabled_nat_bits(sbi, NULL)) {
+		int idx = find_next_zero_bit_le(nm_i->full_nat_bits,
+					nm_i->nat_blocks, 0);
+
+		if (idx >= nm_i->nat_blocks)
+			set_sbi_flag(sbi, SBI_NEED_FSCK);
+		else
+			nid = idx * NAT_ENTRY_PER_BLOCK;
+	}
+
 	/* readahead nat pages to be scanned */
 	ra_meta_pages(sbi, NAT_BLOCK_OFFSET(nid), FREE_NID_PAGES,
 							META_NAT, true);
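
So after mount, building free nids tries three sources in increasing order of cost: the in-memory free_nid_bitmap (no IO), then the checkpoint's nat_bits, and only as a last resort a readahead scan of raw NAT pages. Roughly, as a sketch — have_enough_free_nids() and scan_nat_pages() are hypothetical stand-ins for the inline count checks and the readahead loop:

    /* illustrative outline only */
    static void build_free_nids_outline(struct f2fs_sb_info *sbi)
    {
        scan_free_nid_bits(sbi);                 /* 1: in-memory bitmaps, no IO */
        if (have_enough_free_nids(sbi))
            return;
        if (!scan_nat_bits(sbi) && have_enough_free_nids(sbi))
            return;                              /* 2: checkpoint nat_bits */
        scan_nat_pages(sbi);                     /* 3: read NAT blocks from disk */
    }
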
@@ -1863,10 +2039,10 @@ static void __build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount)
 					nm_i->ra_nid_pages, META_NAT, false);
 }
 
-void build_free_nids(struct f2fs_sb_info *sbi, bool sync)
+void build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount)
 {
 	mutex_lock(&NM_I(sbi)->build_lock);
-	__build_free_nids(sbi, sync);
+	__build_free_nids(sbi, sync, mount);
 	mutex_unlock(&NM_I(sbi)->build_lock);
 }
 
@@ -1881,8 +2057,10 @@ bool alloc_nid(struct f2fs_sb_info *sbi, nid_t *nid)
 	struct free_nid *i = NULL;
 retry:
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-	if (time_to_inject(sbi, FAULT_ALLOC_NID))
+	if (time_to_inject(sbi, FAULT_ALLOC_NID)) {
+		f2fs_show_injection_info(FAULT_ALLOC_NID);
 		return false;
+	}
 #endif
 	spin_lock(&nm_i->nid_list_lock);
 
@@ -1902,13 +2080,16 @@ retry:
 		i->state = NID_ALLOC;
 		__insert_nid_to_list(sbi, i, ALLOC_NID_LIST, false);
 		nm_i->available_nids--;
+
+		update_free_nid_bitmap(sbi, *nid, false);
+
 		spin_unlock(&nm_i->nid_list_lock);
 		return true;
 	}
 	spin_unlock(&nm_i->nid_list_lock);
 
 	/* Let's scan nat pages and its caches to get free nids */
-	build_free_nids(sbi, true);
+	build_free_nids(sbi, true, false);
 	goto retry;
 }
 
@@ -1956,6 +2137,8 @@ void alloc_nid_failed(struct f2fs_sb_info *sbi, nid_t nid)
 
 	nm_i->available_nids++;
 
+	update_free_nid_bitmap(sbi, nid, true);
+
 	spin_unlock(&nm_i->nid_list_lock);
 
 	if (need_free)
@@ -2018,18 +2201,18 @@ update_inode:
 	f2fs_put_page(ipage, 1);
 }
 
-void recover_xattr_data(struct inode *inode, struct page *page, block_t blkaddr)
+int recover_xattr_data(struct inode *inode, struct page *page, block_t blkaddr)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
 	nid_t prev_xnid = F2FS_I(inode)->i_xattr_nid;
 	nid_t new_xnid = nid_of_node(page);
 	struct node_info ni;
+	struct page *xpage;
 
-	/* 1: invalidate the previous xattr nid */
 	if (!prev_xnid)
 		goto recover_xnid;
 
-	/* Deallocate node address */
+	/* 1: invalidate the previous xattr nid */
 	get_node_info(sbi, prev_xnid, &ni);
 	f2fs_bug_on(sbi, ni.blk_addr == NULL_ADDR);
 	invalidate_blocks(sbi, ni.blk_addr);
@@ -2037,19 +2220,27 @@ int recover_xattr_data(struct inode *inode, struct page *page, block_t blkaddr)
 	set_node_addr(sbi, &ni, NULL_ADDR, false);
 
 recover_xnid:
-	/* 2: allocate new xattr nid */
+	/* 2: update xattr nid in inode */
+	remove_free_nid(sbi, new_xnid);
+	f2fs_i_xnid_write(inode, new_xnid);
 	if (unlikely(!inc_valid_node_count(sbi, inode)))
 		f2fs_bug_on(sbi, 1);
+	update_inode_page(inode);
+
+	/* 3: update and set xattr node page dirty */
+	xpage = grab_cache_page(NODE_MAPPING(sbi), new_xnid);
+	if (!xpage)
+		return -ENOMEM;
+
+	memcpy(F2FS_NODE(xpage), F2FS_NODE(page), PAGE_SIZE);
 
-	remove_free_nid(sbi, new_xnid);
 	get_node_info(sbi, new_xnid, &ni);
 	ni.ino = inode->i_ino;
 	set_node_addr(sbi, &ni, NEW_ADDR, false);
-	f2fs_i_xnid_write(inode, new_xnid);
+	set_page_dirty(xpage);
+	f2fs_put_page(xpage, 1);
 
-	/* 3: update xattr blkaddr */
-	refresh_sit_entry(sbi, NEW_ADDR, blkaddr);
-	set_node_addr(sbi, &ni, blkaddr, false);
+	return 0;
 }
 
 int recover_inode_page(struct f2fs_sb_info *sbi, struct page *page)
@@ -2152,7 +2343,7 @@ static void remove_nats_in_journal(struct f2fs_sb_info *sbi)
 
 		ne = __lookup_nat_cache(nm_i, nid);
 		if (!ne) {
-			ne = grab_nat_entry(nm_i, nid);
+			ne = grab_nat_entry(nm_i, nid, true);
 			node_info_from_raw_nat(&ne->ni, &raw_ne);
 		}
 
@@ -2192,8 +2383,39 @@ add_out:
 	list_add_tail(&nes->set_list, head);
 }
 
+void __update_nat_bits(struct f2fs_sb_info *sbi, nid_t start_nid,
+						struct page *page)
+{
+	struct f2fs_nm_info *nm_i = NM_I(sbi);
+	unsigned int nat_index = start_nid / NAT_ENTRY_PER_BLOCK;
+	struct f2fs_nat_block *nat_blk = page_address(page);
+	int valid = 0;
+	int i;
+
+	if (!enabled_nat_bits(sbi, NULL))
+		return;
+
+	for (i = 0; i < NAT_ENTRY_PER_BLOCK; i++) {
+		if (start_nid == 0 && i == 0)
+			valid++;
+		if (nat_blk->entries[i].block_addr)
+			valid++;
+	}
+	if (valid == 0) {
+		set_bit_le(nat_index, nm_i->empty_nat_bits);
+		clear_bit_le(nat_index, nm_i->full_nat_bits);
+		return;
+	}
+
+	clear_bit_le(nat_index, nm_i->empty_nat_bits);
+	if (valid == NAT_ENTRY_PER_BLOCK)
+		set_bit_le(nat_index, nm_i->full_nat_bits);
+	else
+		clear_bit_le(nat_index, nm_i->full_nat_bits);
+}
+
 static void __flush_nat_entry_set(struct f2fs_sb_info *sbi,
-		struct nat_entry_set *set)
+		struct nat_entry_set *set, struct cp_control *cpc)
 {
 	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
 	struct f2fs_journal *journal = curseg->journal;
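
__update_nat_bits() above classifies every NAT block as it is written back, with nid 0 always counted as valid so block 0 can never be reported empty. The resulting states:

    valid entries in block     empty_nat_bits    full_nat_bits
    0                          1                 0
    partial                    0                 0
    NAT_ENTRY_PER_BLOCK        0                 1
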
@@ -2208,7 +2430,8 @@ static void __flush_nat_entry_set(struct f2fs_sb_info *sbi,
 	 * #1, flush nat entries to journal in current hot data summary block.
 	 * #2, flush nat entries to nat page.
 	 */
-	if (!__has_cursum_space(journal, set->entry_cnt, NAT_JOURNAL))
+	if (enabled_nat_bits(sbi, cpc) ||
+		!__has_cursum_space(journal, set->entry_cnt, NAT_JOURNAL))
 		to_journal = false;
 
 	if (to_journal) {
@@ -2244,14 +2467,21 @@ static void __flush_nat_entry_set(struct f2fs_sb_info *sbi,
 			add_free_nid(sbi, nid, false);
 			spin_lock(&NM_I(sbi)->nid_list_lock);
 			NM_I(sbi)->available_nids++;
+			update_free_nid_bitmap(sbi, nid, true);
+			spin_unlock(&NM_I(sbi)->nid_list_lock);
+		} else {
+			spin_lock(&NM_I(sbi)->nid_list_lock);
+			update_free_nid_bitmap(sbi, nid, false);
 			spin_unlock(&NM_I(sbi)->nid_list_lock);
 		}
 	}
 
-	if (to_journal)
+	if (to_journal) {
 		up_write(&curseg->journal_rwsem);
-	else
+	} else {
+		__update_nat_bits(sbi, start_nid, page);
 		f2fs_put_page(page, 1);
+	}
 
 	f2fs_bug_on(sbi, set->entry_cnt);
 
@@ -2262,7 +2492,7 @@ static void __flush_nat_entry_set(struct f2fs_sb_info *sbi,
 /*
  * This function is called during the checkpointing process.
  */
-void flush_nat_entries(struct f2fs_sb_info *sbi)
+void flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 {
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
@@ -2283,7 +2513,8 @@ void flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	 * entries, remove all entries from journal and merge them
 	 * into nat entry set.
 	 */
-	if (!__has_cursum_space(journal, nm_i->dirty_nat_cnt, NAT_JOURNAL))
+	if (enabled_nat_bits(sbi, cpc) ||
+		!__has_cursum_space(journal, nm_i->dirty_nat_cnt, NAT_JOURNAL))
 		remove_nats_in_journal(sbi);
 
 	while ((found = __gang_lookup_nat_set(nm_i,
@@ -2297,27 +2528,69 @@ void flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 
 	/* flush dirty nats in nat entry set */
 	list_for_each_entry_safe(set, tmp, &sets, set_list)
-		__flush_nat_entry_set(sbi, set);
+		__flush_nat_entry_set(sbi, set, cpc);
 
 	up_write(&nm_i->nat_tree_lock);
 
 	f2fs_bug_on(sbi, nm_i->dirty_nat_cnt);
 }
 
+static int __get_nat_bitmaps(struct f2fs_sb_info *sbi)
+{
+	struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
+	struct f2fs_nm_info *nm_i = NM_I(sbi);
+	unsigned int nat_bits_bytes = nm_i->nat_blocks / BITS_PER_BYTE;
+	unsigned int i;
+	__u64 cp_ver = cur_cp_version(ckpt);
+	block_t nat_bits_addr;
+
+	if (!enabled_nat_bits(sbi, NULL))
+		return 0;
+
+	nm_i->nat_bits_blocks = F2FS_BYTES_TO_BLK((nat_bits_bytes << 1) + 8 +
+						F2FS_BLKSIZE - 1);
+	nm_i->nat_bits = kzalloc(nm_i->nat_bits_blocks << F2FS_BLKSIZE_BITS,
+						GFP_KERNEL);
+	if (!nm_i->nat_bits)
+		return -ENOMEM;
+
+	nat_bits_addr = __start_cp_addr(sbi) + sbi->blocks_per_seg -
+						nm_i->nat_bits_blocks;
+	for (i = 0; i < nm_i->nat_bits_blocks; i++) {
+		struct page *page = get_meta_page(sbi, nat_bits_addr++);
+
+		memcpy(nm_i->nat_bits + (i << F2FS_BLKSIZE_BITS),
+					page_address(page), F2FS_BLKSIZE);
+		f2fs_put_page(page, 1);
+	}
+
+	cp_ver |= (cur_cp_crc(ckpt) << 32);
+	if (cpu_to_le64(cp_ver) != *(__le64 *)nm_i->nat_bits) {
+		disable_nat_bits(sbi, true);
+		return 0;
+	}
+
+	nm_i->full_nat_bits = nm_i->nat_bits + 8;
+	nm_i->empty_nat_bits = nm_i->full_nat_bits + nat_bits_bytes;
+
+	f2fs_msg(sbi->sb, KERN_NOTICE, "Found nat_bits in checkpoint");
+	return 0;
+}
+
 static int init_node_manager(struct f2fs_sb_info *sbi)
 {
 	struct f2fs_super_block *sb_raw = F2FS_RAW_SUPER(sbi);
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	unsigned char *version_bitmap;
-	unsigned int nat_segs, nat_blocks;
+	unsigned int nat_segs;
+	int err;
 
 	nm_i->nat_blkaddr = le32_to_cpu(sb_raw->nat_blkaddr);
 
 	/* segment_count_nat includes pair segment so divide to 2. */
 	nat_segs = le32_to_cpu(sb_raw->segment_count_nat) >> 1;
-	nat_blocks = nat_segs << le32_to_cpu(sb_raw->log_blocks_per_seg);
-
-	nm_i->max_nid = NAT_ENTRY_PER_BLOCK * nat_blocks;
+	nm_i->nat_blocks = nat_segs << le32_to_cpu(sb_raw->log_blocks_per_seg);
+	nm_i->max_nid = NAT_ENTRY_PER_BLOCK * nm_i->nat_blocks;
 
 	/* not used nids: 0, node, meta, (and root counted as valid node) */
 	nm_i->available_nids = nm_i->max_nid - sbi->total_valid_node_count -
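
__get_nat_bitmaps() expects the nat_bits area packed at the tail of the checkpoint pack, and rejects it if the 8-byte header does not match the current checkpoint. A sketch of the layout it reads (byte offsets from the start of the area; this restates the code above, not a separate spec):

    /*
     * [0, 8)                      cp_ver | (cp_crc << 32), little-endian;
     *                             must equal the current checkpoint's values
     * [8, 8 + nat_bits_bytes)     full_nat_bits
     * [8 + nat_bits_bytes, ...)   empty_nat_bits
     *
     * where nat_bits_bytes = nat_blocks / BITS_PER_BYTE, and the whole area
     * spans the last nat_bits_blocks blocks of the checkpoint segment.
     */
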
@@ -2350,6 +2623,34 @@ static int init_node_manager(struct f2fs_sb_info *sbi)
 							GFP_KERNEL);
 	if (!nm_i->nat_bitmap)
 		return -ENOMEM;
+
+	err = __get_nat_bitmaps(sbi);
+	if (err)
+		return err;
+
+#ifdef CONFIG_F2FS_CHECK_FS
+	nm_i->nat_bitmap_mir = kmemdup(version_bitmap, nm_i->bitmap_size,
+					GFP_KERNEL);
+	if (!nm_i->nat_bitmap_mir)
+		return -ENOMEM;
+#endif
+
+	return 0;
+}
+
+int init_free_nid_cache(struct f2fs_sb_info *sbi)
+{
+	struct f2fs_nm_info *nm_i = NM_I(sbi);
+
+	nm_i->free_nid_bitmap = f2fs_kvzalloc(nm_i->nat_blocks *
+					NAT_ENTRY_BITMAP_SIZE, GFP_KERNEL);
+	if (!nm_i->free_nid_bitmap)
+		return -ENOMEM;
+
+	nm_i->nat_block_bitmap = f2fs_kvzalloc(nm_i->nat_blocks / 8,
+								GFP_KERNEL);
+	if (!nm_i->nat_block_bitmap)
+		return -ENOMEM;
+
 	return 0;
 }
 
@@ -2365,7 +2666,11 @@ int build_node_manager(struct f2fs_sb_info *sbi)
 	if (err)
 		return err;
 
-	build_free_nids(sbi, true);
+	err = init_free_nid_cache(sbi);
+	if (err)
+		return err;
+
+	build_free_nids(sbi, true, true);
 	return 0;
 }
 
@@ -2423,7 +2728,14 @@ void destroy_node_manager(struct f2fs_sb_info *sbi)
 	}
 	up_write(&nm_i->nat_tree_lock);
 
+	kvfree(nm_i->nat_block_bitmap);
+	kvfree(nm_i->free_nid_bitmap);
+
 	kfree(nm_i->nat_bitmap);
+	kfree(nm_i->nat_bits);
+#ifdef CONFIG_F2FS_CHECK_FS
+	kfree(nm_i->nat_bitmap_mir);
+#endif
 	sbi->nm_info = NULL;
 	kfree(nm_i);
 }

diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
@@ -174,7 +174,7 @@ static inline void next_free_nid(struct f2fs_sb_info *sbi, nid_t *nid)
 		spin_unlock(&nm_i->nid_list_lock);
 		return;
 	}
-	fnid = list_entry(nm_i->nid_list[FREE_NID_LIST].next,
+	fnid = list_first_entry(&nm_i->nid_list[FREE_NID_LIST],
 						struct free_nid, list);
 	*nid = fnid->nid;
 	spin_unlock(&nm_i->nid_list_lock);
@@ -186,6 +186,12 @@ static inline void next_free_nid(struct f2fs_sb_info *sbi, nid_t *nid)
 static inline void get_nat_bitmap(struct f2fs_sb_info *sbi, void *addr)
 {
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
+
+#ifdef CONFIG_F2FS_CHECK_FS
+	if (memcmp(nm_i->nat_bitmap, nm_i->nat_bitmap_mir,
+						nm_i->bitmap_size))
+		f2fs_bug_on(sbi, 1);
+#endif
 	memcpy(addr, nm_i->nat_bitmap, nm_i->bitmap_size);
 }
 
@@ -228,6 +234,9 @@ static inline void set_to_next_nat(struct f2fs_nm_info *nm_i, nid_t start_nid)
 	unsigned int block_off = NAT_BLOCK_OFFSET(start_nid);
 
 	f2fs_change_bit(block_off, nm_i->nat_bitmap);
+#ifdef CONFIG_F2FS_CHECK_FS
+	f2fs_change_bit(block_off, nm_i->nat_bitmap_mir);
+#endif
 }
 
 static inline nid_t ino_of_node(struct page *node_page)
@@ -291,14 +300,11 @@ static inline void fill_node_footer_blkaddr(struct page *page, block_t blkaddr)
 {
 	struct f2fs_checkpoint *ckpt = F2FS_CKPT(F2FS_P_SB(page));
 	struct f2fs_node *rn = F2FS_NODE(page);
-	size_t crc_offset = le32_to_cpu(ckpt->checksum_offset);
-	__u64 cp_ver = le64_to_cpu(ckpt->checkpoint_ver);
+	__u64 cp_ver = cur_cp_version(ckpt);
 
-	if (__is_set_ckpt_flags(ckpt, CP_CRC_RECOVERY_FLAG)) {
-		__u64 crc = le32_to_cpu(*((__le32 *)
-				((unsigned char *)ckpt + crc_offset)));
-		cp_ver |= (crc << 32);
-	}
+	if (__is_set_ckpt_flags(ckpt, CP_CRC_RECOVERY_FLAG))
+		cp_ver |= (cur_cp_crc(ckpt) << 32);
+
 	rn->footer.cp_ver = cpu_to_le64(cp_ver);
 	rn->footer.next_blkaddr = cpu_to_le32(blkaddr);
 }
@@ -306,14 +312,11 @@ static inline void fill_node_footer_blkaddr(struct page *page, block_t blkaddr)
 static inline bool is_recoverable_dnode(struct page *page)
 {
 	struct f2fs_checkpoint *ckpt = F2FS_CKPT(F2FS_P_SB(page));
-	size_t crc_offset = le32_to_cpu(ckpt->checksum_offset);
 	__u64 cp_ver = cur_cp_version(ckpt);
 
-	if (__is_set_ckpt_flags(ckpt, CP_CRC_RECOVERY_FLAG)) {
-		__u64 crc = le32_to_cpu(*((__le32 *)
-				((unsigned char *)ckpt + crc_offset)));
-		cp_ver |= (crc << 32);
-	}
+	if (__is_set_ckpt_flags(ckpt, CP_CRC_RECOVERY_FLAG))
+		cp_ver |= (cur_cp_crc(ckpt) << 32);
+
 	return cp_ver == cpver_of_node(page);
 }
 
@@ -343,7 +346,7 @@ static inline bool IS_DNODE(struct page *node_page)
 	unsigned int ofs = ofs_of_node(node_page);
 
 	if (f2fs_has_xattr_block(ofs))
-		return false;
+		return true;
 	if (ofs == 3 || ofs == 4 + NIDS_PER_BLOCK ||
 			ofs == 5 + 2 * NIDS_PER_BLOCK)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
@@ -378,11 +378,9 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct inode *inode,
 	if (IS_INODE(page)) {
 		recover_inline_xattr(inode, page);
 	} else if (f2fs_has_xattr_block(ofs_of_node(page))) {
-		/*
-		 * Deprecated; xattr blocks should be found from cold log.
-		 * But, we should remain this for backward compatibility.
-		 */
-		recover_xattr_data(inode, page, blkaddr);
+		err = recover_xattr_data(inode, page, blkaddr);
+		if (!err)
+			recovered++;
 		goto out;
 	}
 
@@ -428,8 +426,9 @@ retry_dn:
 		}
 
 		if (!file_keep_isize(inode) &&
-				(i_size_read(inode) <= (start << PAGE_SHIFT)))
-			f2fs_i_size_write(inode, (start + 1) << PAGE_SHIFT);
+			(i_size_read(inode) <= ((loff_t)start << PAGE_SHIFT)))
+			f2fs_i_size_write(inode,
+					(loff_t)(start + 1) << PAGE_SHIFT);
 
 		/*
 		 * dest is reserved block, invalidate src block
@@ -552,10 +551,8 @@ next:
 
 int recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
 {
-	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_WARM_NODE);
 	struct list_head inode_list;
 	struct list_head dir_list;
-	block_t blkaddr;
 	int err;
 	int ret = 0;
 	bool need_writecp = false;
@@ -571,8 +568,6 @@ int recover_fsync_data(struct f2fs_sb_info *sbi, bool check_only)
 	/* prevent checkpoint */
 	mutex_lock(&sbi->cp_mutex);
 
-	blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
-
 	/* step #1: find fsynced inode numbers */
 	err = find_fsync_dnodes(sbi, &inode_list);
 	if (err || list_empty(&inode_list))

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
@@ -26,7 +26,7 @@
 #define __reverse_ffz(x) __reverse_ffs(~(x))
 
 static struct kmem_cache *discard_entry_slab;
-static struct kmem_cache *bio_entry_slab;
+static struct kmem_cache *discard_cmd_slab;
 static struct kmem_cache *sit_entry_set_slab;
 static struct kmem_cache *inmem_entry_slab;
 
@@ -242,11 +242,12 @@ void drop_inmem_pages(struct inode *inode)
 {
 	struct f2fs_inode_info *fi = F2FS_I(inode);
 
-	clear_inode_flag(inode, FI_ATOMIC_FILE);
-
 	mutex_lock(&fi->inmem_lock);
 	__revoke_inmem_pages(inode, &fi->inmem_pages, true, false);
 	mutex_unlock(&fi->inmem_lock);
+
+	clear_inode_flag(inode, FI_ATOMIC_FILE);
+	stat_dec_atomic_write(inode);
 }
 
 static int __commit_inmem_pages(struct inode *inode,
@@ -262,7 +263,7 @@ static int __commit_inmem_pages(struct inode *inode,
 		.op_flags = REQ_SYNC | REQ_PRIO,
 		.encrypted_page = NULL,
 	};
-	bool submit_bio = false;
+	pgoff_t last_idx = ULONG_MAX;
 	int err = 0;
 
 	list_for_each_entry_safe(cur, tmp, &fi->inmem_pages, list) {
@@ -288,15 +289,15 @@ static int __commit_inmem_pages(struct inode *inode,
 
 			/* record old blkaddr for revoking */
 			cur->old_addr = fio.old_blkaddr;
-
-			submit_bio = true;
+			last_idx = page->index;
 		}
 		unlock_page(page);
 		list_move_tail(&cur->list, revoke_list);
 	}
 
-	if (submit_bio)
-		f2fs_submit_merged_bio_cond(sbi, inode, NULL, 0, DATA, WRITE);
+	if (last_idx != ULONG_MAX)
+		f2fs_submit_merged_bio_cond(sbi, inode, 0, last_idx,
+							DATA, WRITE);
 
 	if (!err)
 		__revoke_inmem_pages(inode, revoke_list, false, false);
@@ -315,6 +316,8 @@ int commit_inmem_pages(struct inode *inode)
 	f2fs_balance_fs(sbi, true);
 	f2fs_lock_op(sbi);
 
+	set_inode_flag(inode, FI_ATOMIC_COMMIT);
+
 	mutex_lock(&fi->inmem_lock);
 	err = __commit_inmem_pages(inode, &revoke_list);
 	if (err) {
@@ -336,6 +339,8 @@ int commit_inmem_pages(struct inode *inode)
 	}
 	mutex_unlock(&fi->inmem_lock);
 
+	clear_inode_flag(inode, FI_ATOMIC_COMMIT);
+
 	f2fs_unlock_op(sbi);
 	return err;
 }
@@ -347,8 +352,10 @@ int commit_inmem_pages(struct inode *inode)
 void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need)
 {
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-	if (time_to_inject(sbi, FAULT_CHECKPOINT))
+	if (time_to_inject(sbi, FAULT_CHECKPOINT)) {
+		f2fs_show_injection_info(FAULT_CHECKPOINT);
 		f2fs_stop_checkpoint(sbi, false);
+	}
 #endif
 
 	if (!need)
@@ -381,7 +388,7 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi)
 	if (!available_free_memory(sbi, FREE_NIDS))
 		try_to_free_nids(sbi, MAX_FREE_NIDS);
 	else
-		build_free_nids(sbi, false);
+		build_free_nids(sbi, false, false);
 
 	if (!is_idle(sbi))
 		return;
@@ -423,6 +430,9 @@ static int submit_flush_wait(struct f2fs_sb_info *sbi)
 
 	if (sbi->s_ndevs && !ret) {
 		for (i = 1; i < sbi->s_ndevs; i++) {
+			trace_f2fs_issue_flush(FDEV(i).bdev,
+					test_opt(sbi, NOBARRIER),
+					test_opt(sbi, FLUSH_MERGE));
 			ret = __submit_flush_wait(FDEV(i).bdev);
 			if (ret)
 				break;
@@ -434,7 +444,7 @@ static int submit_flush_wait(struct f2fs_sb_info *sbi)
 static int issue_flush_thread(void *data)
 {
 	struct f2fs_sb_info *sbi = data;
-	struct flush_cmd_control *fcc = SM_I(sbi)->cmd_control_info;
+	struct flush_cmd_control *fcc = SM_I(sbi)->fcc_info;
 	wait_queue_head_t *q = &fcc->flush_wait_queue;
 repeat:
 	if (kthread_should_stop())
@@ -463,16 +473,16 @@ repeat:
 
 int f2fs_issue_flush(struct f2fs_sb_info *sbi)
 {
-	struct flush_cmd_control *fcc = SM_I(sbi)->cmd_control_info;
+	struct flush_cmd_control *fcc = SM_I(sbi)->fcc_info;
 	struct flush_cmd cmd;
 
-	trace_f2fs_issue_flush(sbi->sb, test_opt(sbi, NOBARRIER),
-					test_opt(sbi, FLUSH_MERGE));
-
 	if (test_opt(sbi, NOBARRIER))
 		return 0;
 
-	if (!test_opt(sbi, FLUSH_MERGE) || !atomic_read(&fcc->submit_flush)) {
+	if (!test_opt(sbi, FLUSH_MERGE))
+		return submit_flush_wait(sbi);
+
+	if (!atomic_read(&fcc->submit_flush)) {
 		int ret;
 
 		atomic_inc(&fcc->submit_flush);
@@ -506,8 +516,8 @@ int create_flush_cmd_control(struct f2fs_sb_info *sbi)
 	struct flush_cmd_control *fcc;
 	int err = 0;
 
-	if (SM_I(sbi)->cmd_control_info) {
-		fcc = SM_I(sbi)->cmd_control_info;
+	if (SM_I(sbi)->fcc_info) {
+		fcc = SM_I(sbi)->fcc_info;
 		goto init_thread;
 	}
 
@@ -517,14 +527,14 @@ int create_flush_cmd_control(struct f2fs_sb_info *sbi)
 	atomic_set(&fcc->submit_flush, 0);
 	init_waitqueue_head(&fcc->flush_wait_queue);
 	init_llist_head(&fcc->issue_list);
-	SM_I(sbi)->cmd_control_info = fcc;
+	SM_I(sbi)->fcc_info = fcc;
 init_thread:
 	fcc->f2fs_issue_flush = kthread_run(issue_flush_thread, sbi,
 				"f2fs_flush-%u:%u", MAJOR(dev), MINOR(dev));
 	if (IS_ERR(fcc->f2fs_issue_flush)) {
 		err = PTR_ERR(fcc->f2fs_issue_flush);
 		kfree(fcc);
-		SM_I(sbi)->cmd_control_info = NULL;
+		SM_I(sbi)->fcc_info = NULL;
 		return err;
 	}
 
@@ -533,7 +543,7 @@ init_thread:
 
 void destroy_flush_cmd_control(struct f2fs_sb_info *sbi, bool free)
 {
-	struct flush_cmd_control *fcc = SM_I(sbi)->cmd_control_info;
+	struct flush_cmd_control *fcc = SM_I(sbi)->fcc_info;
 
 	if (fcc && fcc->f2fs_issue_flush) {
 		struct task_struct *flush_thread = fcc->f2fs_issue_flush;
@@ -543,7 +553,7 @@ void destroy_flush_cmd_control(struct f2fs_sb_info *sbi, bool free)
 	}
 	if (free) {
 		kfree(fcc);
-		SM_I(sbi)->cmd_control_info = NULL;
+		SM_I(sbi)->fcc_info = NULL;
 	}
 }
@@ -623,60 +633,144 @@ static void locate_dirty_segment(struct f2fs_sb_info *sbi, unsigned int segno)
 	mutex_unlock(&dirty_i->seglist_lock);
 }
 
-static struct bio_entry *__add_bio_entry(struct f2fs_sb_info *sbi,
-							struct bio *bio)
+static void __add_discard_cmd(struct f2fs_sb_info *sbi,
+			struct bio *bio, block_t lstart, block_t len)
 {
-	struct list_head *wait_list = &(SM_I(sbi)->wait_list);
-	struct bio_entry *be = f2fs_kmem_cache_alloc(bio_entry_slab, GFP_NOFS);
+	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
+	struct list_head *cmd_list = &(dcc->discard_cmd_list);
+	struct discard_cmd *dc;
 
-	INIT_LIST_HEAD(&be->list);
-	be->bio = bio;
-	init_completion(&be->event);
-	list_add_tail(&be->list, wait_list);
+	dc = f2fs_kmem_cache_alloc(discard_cmd_slab, GFP_NOFS);
+	INIT_LIST_HEAD(&dc->list);
+	dc->bio = bio;
+	bio->bi_private = dc;
+	dc->lstart = lstart;
+	dc->len = len;
+	dc->state = D_PREP;
+	init_completion(&dc->wait);
 
-	return be;
+	mutex_lock(&dcc->cmd_lock);
+	list_add_tail(&dc->list, cmd_list);
+	mutex_unlock(&dcc->cmd_lock);
 }
 
-void f2fs_wait_all_discard_bio(struct f2fs_sb_info *sbi)
+static void __remove_discard_cmd(struct f2fs_sb_info *sbi, struct discard_cmd *dc)
 {
-	struct list_head *wait_list = &(SM_I(sbi)->wait_list);
-	struct bio_entry *be, *tmp;
+	int err = dc->bio->bi_error;
 
-	list_for_each_entry_safe(be, tmp, wait_list, list) {
-		struct bio *bio = be->bio;
-		int err;
+	if (dc->state == D_DONE)
+		atomic_dec(&(SM_I(sbi)->dcc_info->submit_discard));
 
-		wait_for_completion_io(&be->event);
-		err = be->error;
-		if (err == -EOPNOTSUPP)
-			err = 0;
+	if (err == -EOPNOTSUPP)
+		err = 0;
 
-		if (err)
-			f2fs_msg(sbi->sb, KERN_INFO,
-				"Issue discard failed, ret: %d", err);
+	if (err)
+		f2fs_msg(sbi->sb, KERN_INFO,
+			"Issue discard failed, ret: %d", err);
 
-		bio_put(bio);
-		list_del(&be->list);
-		kmem_cache_free(bio_entry_slab, be);
-	}
+	bio_put(dc->bio);
+	list_del(&dc->list);
+	kmem_cache_free(discard_cmd_slab, dc);
 }
 
-static void f2fs_submit_bio_wait_endio(struct bio *bio)
+/* This should be covered by global mutex, &sit_i->sentry_lock */
+void f2fs_wait_discard_bio(struct f2fs_sb_info *sbi, block_t blkaddr)
 {
-	struct bio_entry *be = (struct bio_entry *)bio->bi_private;
+	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
+	struct list_head *wait_list = &(dcc->discard_cmd_list);
+	struct discard_cmd *dc, *tmp;
+	struct blk_plug plug;
 
-	be->error = bio->bi_error;
-	complete(&be->event);
+	mutex_lock(&dcc->cmd_lock);
+
+	blk_start_plug(&plug);
+
+	list_for_each_entry_safe(dc, tmp, wait_list, list) {
+
+		if (blkaddr == NULL_ADDR) {
+			if (dc->state == D_PREP) {
+				dc->state = D_SUBMIT;
+				submit_bio(dc->bio);
+				atomic_inc(&dcc->submit_discard);
+			}
+			continue;
+		}
+
+		if (dc->lstart <= blkaddr && blkaddr < dc->lstart + dc->len) {
+			if (dc->state == D_SUBMIT)
+				wait_for_completion_io(&dc->wait);
+			else
+				__remove_discard_cmd(sbi, dc);
+		}
+	}
+	blk_finish_plug(&plug);
+
+	/* this comes from f2fs_put_super */
+	if (blkaddr == NULL_ADDR) {
+		list_for_each_entry_safe(dc, tmp, wait_list, list) {
+			wait_for_completion_io(&dc->wait);
+			__remove_discard_cmd(sbi, dc);
+		}
+	}
+	mutex_unlock(&dcc->cmd_lock);
+}
+
+static void f2fs_submit_discard_endio(struct bio *bio)
+{
+	struct discard_cmd *dc = (struct discard_cmd *)bio->bi_private;
+
+	complete(&dc->wait);
+	dc->state = D_DONE;
+}
+
+static int issue_discard_thread(void *data)
+{
+	struct f2fs_sb_info *sbi = data;
+	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
+	wait_queue_head_t *q = &dcc->discard_wait_queue;
+	struct list_head *cmd_list = &dcc->discard_cmd_list;
+	struct discard_cmd *dc, *tmp;
+	struct blk_plug plug;
+	int iter = 0;
+repeat:
+	if (kthread_should_stop())
+		return 0;
+
+	blk_start_plug(&plug);
+
+	mutex_lock(&dcc->cmd_lock);
+	list_for_each_entry_safe(dc, tmp, cmd_list, list) {
+		if (dc->state == D_PREP) {
+			dc->state = D_SUBMIT;
+			submit_bio(dc->bio);
+			atomic_inc(&dcc->submit_discard);
+			if (iter++ > DISCARD_ISSUE_RATE)
+				break;
+		} else if (dc->state == D_DONE) {
+			__remove_discard_cmd(sbi, dc);
+		}
+	}
+	mutex_unlock(&dcc->cmd_lock);
+
+	blk_finish_plug(&plug);
+
+	iter = 0;
+	congestion_wait(BLK_RW_SYNC, HZ/50);
+
+	wait_event_interruptible(*q,
+		kthread_should_stop() || !list_empty(&dcc->discard_cmd_list));
+	goto repeat;
 }
 
 /* this function is copied from blkdev_issue_discard from block/blk-lib.c */
 static int __f2fs_issue_discard_async(struct f2fs_sb_info *sbi,
 		struct block_device *bdev, block_t blkstart, block_t blklen)
 {
 	struct bio *bio = NULL;
+	block_t lblkstart = blkstart;
 	int err;
 
-	trace_f2fs_issue_discard(sbi->sb, blkstart, blklen);
+	trace_f2fs_issue_discard(bdev, blkstart, blklen);
 
 	if (sbi->s_ndevs) {
 		int devi = f2fs_target_device_index(sbi, blkstart);
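
Each queued discard now carries a small state machine: a command sits in D_PREP until the discard thread (or f2fs_wait_discard_bio(), for a block about to be reused) submits its bio, the end_io callback flips it to D_DONE, and only then is it reaped. As a sketch of the lifecycle implied by the code above:

    /* states of a struct discard_cmd, restated for illustration */
    enum {
        D_PREP,     /* on discard_cmd_list, bio not yet issued */
        D_SUBMIT,   /* submit_bio() called, submit_discard count raised */
        D_DONE,     /* end_io completed dc->wait; safe to __remove_discard_cmd() */
    };
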
@@ -688,14 +782,12 @@ static int __f2fs_issue_discard_async(struct f2fs_sb_info *sbi,
 				SECTOR_FROM_BLOCK(blklen),
 				GFP_NOFS, 0, &bio);
 	if (!err && bio) {
-		struct bio_entry *be = __add_bio_entry(sbi, bio);
-
-		bio->bi_private = be;
-		bio->bi_end_io = f2fs_submit_bio_wait_endio;
+		bio->bi_end_io = f2fs_submit_discard_endio;
 		bio->bi_opf |= REQ_SYNC;
-		submit_bio(bio);
-	}
 
+		__add_discard_cmd(sbi, bio, lblkstart, blklen);
+		wake_up(&SM_I(sbi)->dcc_info->discard_wait_queue);
+	}
 	return err;
 }
 
@@ -703,24 +795,13 @@ static int __f2fs_issue_discard_async(struct f2fs_sb_info *sbi,
 static int __f2fs_issue_discard_zone(struct f2fs_sb_info *sbi,
 		struct block_device *bdev, block_t blkstart, block_t blklen)
 {
-	sector_t nr_sects = SECTOR_FROM_BLOCK(blklen);
-	sector_t sector;
+	sector_t sector, nr_sects;
 	int devi = 0;
 
 	if (sbi->s_ndevs) {
 		devi = f2fs_target_device_index(sbi, blkstart);
 		blkstart -= FDEV(devi).start_blk;
 	}
-	sector = SECTOR_FROM_BLOCK(blkstart);
-
-	if (sector & (bdev_zone_sectors(bdev) - 1) ||
-	    nr_sects != bdev_zone_sectors(bdev)) {
-		f2fs_msg(sbi->sb, KERN_INFO,
-			"(%d) %s: Unaligned discard attempted (block %x + %x)",
-			devi, sbi->s_ndevs ? FDEV(devi).path: "",
-			blkstart, blklen);
-		return -EIO;
-	}
 
 	/*
 	 * We need to know the type of the zone: for conventional zones,
@@ -735,7 +816,18 @@ static int __f2fs_issue_discard_zone(struct f2fs_sb_info *sbi,
 		return __f2fs_issue_discard_async(sbi, bdev, blkstart, blklen);
 	case BLK_ZONE_TYPE_SEQWRITE_REQ:
 	case BLK_ZONE_TYPE_SEQWRITE_PREF:
-		trace_f2fs_issue_reset_zone(sbi->sb, blkstart);
+		sector = SECTOR_FROM_BLOCK(blkstart);
+		nr_sects = SECTOR_FROM_BLOCK(blklen);
+
+		if (sector & (bdev_zone_sectors(bdev) - 1) ||
+				nr_sects != bdev_zone_sectors(bdev)) {
+			f2fs_msg(sbi->sb, KERN_INFO,
+				"(%d) %s: Unaligned discard attempted (block %x + %x)",
+				devi, sbi->s_ndevs ? FDEV(devi).path: "",
+				blkstart, blklen);
+			return -EIO;
+		}
+		trace_f2fs_issue_reset_zone(bdev, blkstart);
 		return blkdev_reset_zones(bdev, sector,
 					  nr_sects, GFP_NOFS);
 	default:
@@ -800,13 +892,14 @@ static void __add_discard_entry(struct f2fs_sb_info *sbi,
 		struct cp_control *cpc, struct seg_entry *se,
 		unsigned int start, unsigned int end)
 {
-	struct list_head *head = &SM_I(sbi)->discard_list;
+	struct list_head *head = &SM_I(sbi)->dcc_info->discard_entry_list;
 	struct discard_entry *new, *last;
 
 	if (!list_empty(head)) {
 		last = list_last_entry(head, struct discard_entry, list);
 		if (START_BLOCK(sbi, cpc->trim_start) + start ==
-						last->blkaddr + last->len) {
+				last->blkaddr + last->len &&
+				last->len < MAX_DISCARD_BLOCKS(sbi)) {
 			last->len += end - start;
 			goto done;
 		}
@@ -818,10 +911,11 @@ static void __add_discard_entry(struct f2fs_sb_info *sbi,
 	new->len = end - start;
 	list_add_tail(&new->list, head);
 done:
-	SM_I(sbi)->nr_discards += end - start;
+	SM_I(sbi)->dcc_info->nr_discards += end - start;
 }
 
-static void add_discard_addrs(struct f2fs_sb_info *sbi, struct cp_control *cpc)
+static bool add_discard_addrs(struct f2fs_sb_info *sbi, struct cp_control *cpc,
+							bool check_only)
 {
 	int entries = SIT_VBLOCK_MAP_SIZE / sizeof(unsigned long);
 	int max_blocks = sbi->blocks_per_seg;
@@ -835,12 +929,13 @@ static bool add_discard_addrs(struct f2fs_sb_info *sbi, struct cp_control *cpc,
 	int i;
 
 	if (se->valid_blocks == max_blocks || !f2fs_discard_en(sbi))
-		return;
+		return false;
 
 	if (!force) {
 		if (!test_opt(sbi, DISCARD) || !se->valid_blocks ||
-			SM_I(sbi)->nr_discards >= SM_I(sbi)->max_discards)
-			return;
+			SM_I(sbi)->dcc_info->nr_discards >=
+				SM_I(sbi)->dcc_info->max_discards)
+			return false;
 	}
 
 	/* SIT_VBLOCK_MAP_SIZE should be multiple of sizeof(unsigned long) */
@@ -848,7 +943,8 @@ static bool add_discard_addrs(struct f2fs_sb_info *sbi, struct cp_control *cpc,
 		dmap[i] = force ? ~ckpt_map[i] & ~discard_map[i] :
 				(cur_map[i] ^ ckpt_map[i]) & ckpt_map[i];
 
-	while (force || SM_I(sbi)->nr_discards <= SM_I(sbi)->max_discards) {
+	while (force || SM_I(sbi)->dcc_info->nr_discards <=
+				SM_I(sbi)->dcc_info->max_discards) {
 		start = __find_rev_next_bit(dmap, max_blocks, end + 1);
 		if (start >= max_blocks)
 			break;
@@ -858,13 +954,17 @@ static bool add_discard_addrs(struct f2fs_sb_info *sbi, struct cp_control *cpc,
 				&& (end - start) < cpc->trim_minlen)
 			continue;
 
+		if (check_only)
+			return true;
+
 		__add_discard_entry(sbi, cpc, se, start, end);
 	}
+	return false;
 }
 
 void release_discard_addrs(struct f2fs_sb_info *sbi)
 {
-	struct list_head *head = &(SM_I(sbi)->discard_list);
+	struct list_head *head = &(SM_I(sbi)->dcc_info->discard_entry_list);
 	struct discard_entry *entry, *this;
 
 	/* drop caches */
@@ -890,17 +990,14 @@ static void set_prefree_as_free_segments(struct f2fs_sb_info *sbi)
 
 void clear_prefree_segments(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 {
-	struct list_head *head = &(SM_I(sbi)->discard_list);
+	struct list_head *head = &(SM_I(sbi)->dcc_info->discard_entry_list);
 	struct discard_entry *entry, *this;
 	struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
-	struct blk_plug plug;
 	unsigned long *prefree_map = dirty_i->dirty_segmap[PRE];
 	unsigned int start = 0, end = -1;
 	unsigned int secno, start_segno;
 	bool force = (cpc->reason == CP_DISCARD);
 
-	blk_start_plug(&plug);
-
 	mutex_lock(&dirty_i->seglist_lock);
 
 	while (1) {
@@ -916,9 +1013,13 @@ void clear_prefree_segments(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 		dirty_i->nr_dirty[PRE] -= end - start;
 
-		if (force || !test_opt(sbi, DISCARD))
+		if (!test_opt(sbi, DISCARD))
 			continue;
 
+		if (force && start >= cpc->trim_start &&
+					(end - 1) <= cpc->trim_end)
+			continue;
+
 		if (!test_opt(sbi, LFS) || sbi->segs_per_sec == 1) {
 			f2fs_issue_discard(sbi, START_BLOCK(sbi, start),
 				(end - start) << sbi->log_blocks_per_seg);
@@ -935,6 +1036,8 @@ next:
 		start = start_segno + sbi->segs_per_sec;
 		if (start < end)
 			goto next;
+		else
+			end = start - 1;
 	}
 	mutex_unlock(&dirty_i->seglist_lock);
 
@@ -946,11 +1049,62 @@ next:
 		cpc->trimmed += entry->len;
 skip:
 		list_del(&entry->list);
-		SM_I(sbi)->nr_discards -= entry->len;
+		SM_I(sbi)->dcc_info->nr_discards -= entry->len;
 		kmem_cache_free(discard_entry_slab, entry);
 	}
+}
 
-	blk_finish_plug(&plug);
+static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
+{
+	dev_t dev = sbi->sb->s_bdev->bd_dev;
+	struct discard_cmd_control *dcc;
+	int err = 0;
+
+	if (SM_I(sbi)->dcc_info) {
+		dcc = SM_I(sbi)->dcc_info;
+		goto init_thread;
+	}
+
+	dcc = kzalloc(sizeof(struct discard_cmd_control), GFP_KERNEL);
+	if (!dcc)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&dcc->discard_entry_list);
+	INIT_LIST_HEAD(&dcc->discard_cmd_list);
+	mutex_init(&dcc->cmd_lock);
+	atomic_set(&dcc->submit_discard, 0);
+	dcc->nr_discards = 0;
+	dcc->max_discards = 0;
+
+	init_waitqueue_head(&dcc->discard_wait_queue);
+	SM_I(sbi)->dcc_info = dcc;
+init_thread:
+	dcc->f2fs_issue_discard = kthread_run(issue_discard_thread, sbi,
+				"f2fs_discard-%u:%u", MAJOR(dev), MINOR(dev));
+	if (IS_ERR(dcc->f2fs_issue_discard)) {
+		err = PTR_ERR(dcc->f2fs_issue_discard);
+		kfree(dcc);
+		SM_I(sbi)->dcc_info = NULL;
+		return err;
+	}
+
+	return err;
+}
+
+static void destroy_discard_cmd_control(struct f2fs_sb_info *sbi, bool free)
+{
+	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
+
+	if (dcc && dcc->f2fs_issue_discard) {
+		struct task_struct *discard_thread = dcc->f2fs_issue_discard;
+
+		dcc->f2fs_issue_discard = NULL;
+		kthread_stop(discard_thread);
+	}
+	if (free) {
+		kfree(dcc);
+		SM_I(sbi)->dcc_info = NULL;
+	}
 }
 
 static bool __mark_sit_entry_dirty(struct f2fs_sb_info *sbi, unsigned int segno)
@@ -995,14 +1149,32 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
 
 	/* Update valid block bitmap */
 	if (del > 0) {
-		if (f2fs_test_and_set_bit(offset, se->cur_valid_map))
+		if (f2fs_test_and_set_bit(offset, se->cur_valid_map)) {
+#ifdef CONFIG_F2FS_CHECK_FS
+			if (f2fs_test_and_set_bit(offset,
+						se->cur_valid_map_mir))
+				f2fs_bug_on(sbi, 1);
+			else
+				WARN_ON(1);
+#else
 			f2fs_bug_on(sbi, 1);
+#endif
+		}
 		if (f2fs_discard_en(sbi) &&
 			!f2fs_test_and_set_bit(offset, se->discard_map))
 			sbi->discard_blks--;
 	} else {
-		if (!f2fs_test_and_clear_bit(offset, se->cur_valid_map))
+		if (!f2fs_test_and_clear_bit(offset, se->cur_valid_map)) {
+#ifdef CONFIG_F2FS_CHECK_FS
+			if (!f2fs_test_and_clear_bit(offset,
+						se->cur_valid_map_mir))
+				f2fs_bug_on(sbi, 1);
+			else
+				WARN_ON(1);
+#else
 			f2fs_bug_on(sbi, 1);
+#endif
+		}
 		if (f2fs_discard_en(sbi) &&
 			f2fs_test_and_clear_bit(offset, se->discard_map))
 			sbi->discard_blks++;
@@ -1167,17 +1339,6 @@ static void write_current_sum_page(struct f2fs_sb_info *sbi,
 	f2fs_put_page(page, 1);
 }
 
-static int is_next_segment_free(struct f2fs_sb_info *sbi, int type)
-{
-	struct curseg_info *curseg = CURSEG_I(sbi, type);
-	unsigned int segno = curseg->segno + 1;
-	struct free_segmap_info *free_i = FREE_I(sbi);
-
-	if (segno < MAIN_SEGS(sbi) && segno % sbi->segs_per_sec)
-		return !test_bit(segno, free_i->free_segmap);
-	return 0;
-}
-
 /*
  * Find a new segment from the free segments bitmap to right order
  * This function should be returned with success, otherwise BUG
@@ -1382,16 +1543,39 @@ static int get_ssr_segment(struct f2fs_sb_info *sbi, int type)
 {
 	struct curseg_info *curseg = CURSEG_I(sbi, type);
 	const struct victim_selection *v_ops = DIRTY_I(sbi)->v_ops;
+	int i, cnt;
+	bool reversed = false;
 
-	if (IS_NODESEG(type) || !has_not_enough_free_secs(sbi, 0, 0))
-		return v_ops->get_victim(sbi,
-				&(curseg)->next_segno, BG_GC, type, SSR);
+	/* need_SSR() already forces to do this */
+	if (v_ops->get_victim(sbi, &(curseg)->next_segno, BG_GC, type, SSR))
+		return 1;
 
-	/* For data segments, let's do SSR more intensively */
-	for (; type >= CURSEG_HOT_DATA; type--)
+	/* For node segments, let's do SSR more intensively */
+	if (IS_NODESEG(type)) {
+		if (type >= CURSEG_WARM_NODE) {
+			reversed = true;
+			i = CURSEG_COLD_NODE;
+		} else {
+			i = CURSEG_HOT_NODE;
+		}
+		cnt = NR_CURSEG_NODE_TYPE;
+	} else {
+		if (type >= CURSEG_WARM_DATA) {
+			reversed = true;
+			i = CURSEG_COLD_DATA;
+		} else {
+			i = CURSEG_HOT_DATA;
+		}
+		cnt = NR_CURSEG_DATA_TYPE;
+	}
+
+	for (; cnt-- > 0; reversed ? i-- : i++) {
+		if (i == type)
+			continue;
 		if (v_ops->get_victim(sbi, &(curseg)->next_segno,
-						BG_GC, type, SSR))
+						BG_GC, i, SSR))
 			return 1;
+	}
 	return 0;
 }
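The rewritten get_ssr_segment() first retries the segment's own log, then walks the other logs of the same class, scanning toward cold for warm/cold types and toward hot otherwise, skipping the requested type itself. A small demo of the visit order for the three data logs (the enum values mirror fs/f2fs/segment.h; data-only for brevity):

	#include <stdbool.h>
	#include <stdio.h>

	enum { CURSEG_HOT_DATA, CURSEG_WARM_DATA, CURSEG_COLD_DATA };
	#define NR_CURSEG_DATA_TYPE 3

	static void show_scan_order(int type)
	{
		bool reversed = type >= CURSEG_WARM_DATA;
		int i = reversed ? CURSEG_COLD_DATA : CURSEG_HOT_DATA;
		int cnt = NR_CURSEG_DATA_TYPE;

		printf("type %d tries:", type);
		for (; cnt-- > 0; reversed ? i-- : i++) {
			if (i == type)
				continue;	/* own log was already tried */
			printf(" %d", i);
		}
		printf("\n");
	}

	int main(void)
	{
		show_scan_order(CURSEG_HOT_DATA);	/* prints: type 0 tries: 1 2 */
		show_scan_order(CURSEG_COLD_DATA);	/* prints: type 2 tries: 1 0 */
		return 0;
	}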
@@ -1402,20 +1586,17 @@ static int get_ssr_segment(struct f2fs_sb_info *sbi, int type)
 static void allocate_segment_by_default(struct f2fs_sb_info *sbi,
 						int type, bool force)
 {
-	struct curseg_info *curseg = CURSEG_I(sbi, type);
-
 	if (force)
 		new_curseg(sbi, type, true);
-	else if (type == CURSEG_WARM_NODE)
-		new_curseg(sbi, type, false);
-	else if (curseg->alloc_type == LFS && is_next_segment_free(sbi, type))
+	else if (!is_set_ckpt_flags(sbi, CP_CRC_RECOVERY_FLAG) &&
+					type == CURSEG_WARM_NODE)
 		new_curseg(sbi, type, false);
 	else if (need_SSR(sbi) && get_ssr_segment(sbi, type))
 		change_curseg(sbi, type, true);
 	else
 		new_curseg(sbi, type, false);
 
-	stat_inc_seg_type(sbi, curseg);
+	stat_inc_seg_type(sbi, CURSEG_I(sbi, type));
 }
 
 void allocate_new_segments(struct f2fs_sb_info *sbi)
@@ -1424,9 +1605,6 @@ void allocate_new_segments(struct f2fs_sb_info *sbi)
 	unsigned int old_segno;
 	int i;
 
-	if (test_opt(sbi, LFS))
-		return;
-
 	for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) {
 		curseg = CURSEG_I(sbi, i);
 		old_segno = curseg->segno;
@@ -1439,6 +1617,24 @@ static const struct segment_allocation default_salloc_ops = {
 	.allocate_segment = allocate_segment_by_default,
 };
 
+bool exist_trim_candidates(struct f2fs_sb_info *sbi, struct cp_control *cpc)
+{
+	__u64 trim_start = cpc->trim_start;
+	bool has_candidate = false;
+
+	mutex_lock(&SIT_I(sbi)->sentry_lock);
+	for (; cpc->trim_start <= cpc->trim_end; cpc->trim_start++) {
+		if (add_discard_addrs(sbi, cpc, true)) {
+			has_candidate = true;
+			break;
+		}
+	}
+	mutex_unlock(&SIT_I(sbi)->sentry_lock);
+
+	cpc->trim_start = trim_start;
+	return has_candidate;
+}
+
 int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
 {
 	__u64 start = F2FS_BYTES_TO_BLK(range->start);
@@ -1573,6 +1769,8 @@ void allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
 
 	*new_blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);
 
+	f2fs_wait_discard_bio(sbi, *new_blkaddr);
+
 	/*
 	 * __add_sum_entry should be resided under the curseg_mutex
 	 * because, this function updates a summary entry in the
@@ -1584,14 +1782,15 @@ void allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
 
 	stat_inc_block_count(sbi, curseg);
 
-	if (!__has_curseg_space(sbi, type))
-		sit_i->s_ops->allocate_segment(sbi, type, false);
-
 	/*
 	 * SIT information should be updated before segment allocation,
 	 * since SSR needs latest valid block information.
 	 */
 	refresh_sit_entry(sbi, old_blkaddr, *new_blkaddr);
 
+	if (!__has_curseg_space(sbi, type))
+		sit_i->s_ops->allocate_segment(sbi, type, false);
+
 	mutex_unlock(&sit_i->sentry_lock);
 
 	if (page && IS_NODESEG(type))
@@ -1603,15 +1802,20 @@ void allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
 static void do_write_page(struct f2fs_summary *sum, struct f2fs_io_info *fio)
 {
 	int type = __get_segment_type(fio->page, fio->type);
+	int err;
 
 	if (fio->type == NODE || fio->type == DATA)
 		mutex_lock(&fio->sbi->wio_mutex[fio->type]);
+reallocate:
 	allocate_data_block(fio->sbi, fio->page, fio->old_blkaddr,
 					&fio->new_blkaddr, sum, type);
 
 	/* writeout dirty page into bdev */
-	f2fs_submit_page_mbio(fio);
+	err = f2fs_submit_page_mbio(fio);
+	if (err == -EAGAIN) {
+		fio->old_blkaddr = fio->new_blkaddr;
+		goto reallocate;
+	}
 
 	if (fio->type == NODE || fio->type == DATA)
 		mutex_unlock(&fio->sbi->wio_mutex[fio->type]);
@@ -1753,7 +1957,8 @@ void f2fs_wait_on_page_writeback(struct page *page,
 	if (PageWriteback(page)) {
 		struct f2fs_sb_info *sbi = F2FS_P_SB(page);
 
-		f2fs_submit_merged_bio_cond(sbi, NULL, page, 0, type, WRITE);
+		f2fs_submit_merged_bio_cond(sbi, page->mapping->host,
+						0, page->index, type, WRITE);
 		if (ordered)
 			wait_on_page_writeback(page);
 		else
@@ -2228,7 +2433,7 @@ void flush_sit_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 		/* add discard candidates */
 		if (cpc->reason != CP_DISCARD) {
 			cpc->trim_start = segno;
-			add_discard_addrs(sbi, cpc);
+			add_discard_addrs(sbi, cpc, false);
 		}
 
 		if (to_journal) {
@@ -2263,8 +2468,12 @@ void flush_sit_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	f2fs_bug_on(sbi, sit_i->dirty_sentries);
 out:
 	if (cpc->reason == CP_DISCARD) {
+		__u64 trim_start = cpc->trim_start;
+
 		for (; cpc->trim_start <= cpc->trim_end; cpc->trim_start++)
-			add_discard_addrs(sbi, cpc);
+			add_discard_addrs(sbi, cpc, false);
+
+		cpc->trim_start = trim_start;
 	}
 	mutex_unlock(&sit_i->sentry_lock);
@@ -2276,7 +2485,7 @@ static int build_sit_info(struct f2fs_sb_info *sbi)
 	struct f2fs_super_block *raw_super = F2FS_RAW_SUPER(sbi);
 	struct sit_info *sit_i;
 	unsigned int sit_segs, start;
-	char *src_bitmap, *dst_bitmap;
+	char *src_bitmap;
 	unsigned int bitmap_size;
 
 	/* allocate memory for SIT information */
@@ -2305,6 +2514,13 @@ static int build_sit_info(struct f2fs_sb_info *sbi)
 			!sit_i->sentries[start].ckpt_valid_map)
 			return -ENOMEM;
 
+#ifdef CONFIG_F2FS_CHECK_FS
+		sit_i->sentries[start].cur_valid_map_mir
+			= kzalloc(SIT_VBLOCK_MAP_SIZE, GFP_KERNEL);
+		if (!sit_i->sentries[start].cur_valid_map_mir)
+			return -ENOMEM;
+#endif
+
 		if (f2fs_discard_en(sbi)) {
 			sit_i->sentries[start].discard_map
 				= kzalloc(SIT_VBLOCK_MAP_SIZE, GFP_KERNEL);
@@ -2331,17 +2547,22 @@ static int build_sit_info(struct f2fs_sb_info *sbi)
 	bitmap_size = __bitmap_size(sbi, SIT_BITMAP);
 	src_bitmap = __bitmap_ptr(sbi, SIT_BITMAP);
 
-	dst_bitmap = kmemdup(src_bitmap, bitmap_size, GFP_KERNEL);
-	if (!dst_bitmap)
+	sit_i->sit_bitmap = kmemdup(src_bitmap, bitmap_size, GFP_KERNEL);
+	if (!sit_i->sit_bitmap)
 		return -ENOMEM;
 
+#ifdef CONFIG_F2FS_CHECK_FS
+	sit_i->sit_bitmap_mir = kmemdup(src_bitmap, bitmap_size, GFP_KERNEL);
+	if (!sit_i->sit_bitmap_mir)
+		return -ENOMEM;
+#endif
+
 	/* init SIT information */
 	sit_i->s_ops = &default_salloc_ops;
 
 	sit_i->sit_base_addr = le32_to_cpu(raw_super->sit_blkaddr);
 	sit_i->sit_blocks = sit_segs << sbi->log_blocks_per_seg;
 	sit_i->written_valid_blocks = 0;
-	sit_i->sit_bitmap = dst_bitmap;
 	sit_i->bitmap_size = bitmap_size;
 	sit_i->dirty_sentries = 0;
 	sit_i->sents_per_block = SIT_ENTRY_PER_BLOCK;
@@ -2626,11 +2847,6 @@ int build_segment_manager(struct f2fs_sb_info *sbi)
 	sm_info->min_ipu_util = DEF_MIN_IPU_UTIL;
 	sm_info->min_fsync_blocks = DEF_MIN_FSYNC_BLOCKS;
 
-	INIT_LIST_HEAD(&sm_info->discard_list);
-	INIT_LIST_HEAD(&sm_info->wait_list);
-	sm_info->nr_discards = 0;
-	sm_info->max_discards = 0;
-
 	sm_info->trim_sections = DEF_BATCHED_TRIM_SECTIONS;
 
 	INIT_LIST_HEAD(&sm_info->sit_entry_set);
@@ -2641,6 +2857,10 @@ int build_segment_manager(struct f2fs_sb_info *sbi)
 			return err;
 	}
 
+	err = create_discard_cmd_control(sbi);
+	if (err)
+		return err;
+
 	err = build_sit_info(sbi);
 	if (err)
 		return err;
@@ -2734,6 +2954,9 @@ static void destroy_sit_info(struct f2fs_sb_info *sbi)
 	if (sit_i->sentries) {
 		for (start = 0; start < MAIN_SEGS(sbi); start++) {
 			kfree(sit_i->sentries[start].cur_valid_map);
+#ifdef CONFIG_F2FS_CHECK_FS
+			kfree(sit_i->sentries[start].cur_valid_map_mir);
+#endif
 			kfree(sit_i->sentries[start].ckpt_valid_map);
 			kfree(sit_i->sentries[start].discard_map);
 		}
@@ -2746,6 +2969,9 @@ static void destroy_sit_info(struct f2fs_sb_info *sbi)
 	SM_I(sbi)->sit_info = NULL;
 
 	kfree(sit_i->sit_bitmap);
+#ifdef CONFIG_F2FS_CHECK_FS
+	kfree(sit_i->sit_bitmap_mir);
+#endif
 	kfree(sit_i);
 }
 
@@ -2756,6 +2982,7 @@ void destroy_segment_manager(struct f2fs_sb_info *sbi)
 	if (!sm_info)
 		return;
 	destroy_flush_cmd_control(sbi, true);
+	destroy_discard_cmd_control(sbi, true);
 	destroy_dirty_segmap(sbi);
 	destroy_curseg(sbi);
 	destroy_free_segmap(sbi);
@@ -2771,15 +2998,15 @@ int __init create_segment_manager_caches(void)
 	if (!discard_entry_slab)
 		goto fail;
 
-	bio_entry_slab = f2fs_kmem_cache_create("bio_entry",
-			sizeof(struct bio_entry));
-	if (!bio_entry_slab)
+	discard_cmd_slab = f2fs_kmem_cache_create("discard_cmd",
+			sizeof(struct discard_cmd));
+	if (!discard_cmd_slab)
 		goto destroy_discard_entry;
 
 	sit_entry_set_slab = f2fs_kmem_cache_create("sit_entry_set",
 			sizeof(struct sit_entry_set));
 	if (!sit_entry_set_slab)
-		goto destroy_bio_entry;
+		goto destroy_discard_cmd;
 
 	inmem_entry_slab = f2fs_kmem_cache_create("inmem_page_entry",
 			sizeof(struct inmem_pages));
@@ -2789,8 +3016,8 @@ int __init create_segment_manager_caches(void)
 
 destroy_sit_entry_set:
 	kmem_cache_destroy(sit_entry_set_slab);
-destroy_bio_entry:
-	kmem_cache_destroy(bio_entry_slab);
+destroy_discard_cmd:
+	kmem_cache_destroy(discard_cmd_slab);
 destroy_discard_entry:
 	kmem_cache_destroy(discard_entry_slab);
 fail:
@@ -2800,7 +3027,7 @@ fail:
 void destroy_segment_manager_caches(void)
 {
 	kmem_cache_destroy(sit_entry_set_slab);
-	kmem_cache_destroy(bio_entry_slab);
+	kmem_cache_destroy(discard_cmd_slab);
 	kmem_cache_destroy(discard_entry_slab);
 	kmem_cache_destroy(inmem_entry_slab);
 }

diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
@@ -164,6 +164,9 @@ struct seg_entry {
 	unsigned int ckpt_valid_blocks:10;	/* # of valid blocks last cp */
 	unsigned int padding:6;		/* padding */
 	unsigned char *cur_valid_map;	/* validity bitmap of blocks */
+#ifdef CONFIG_F2FS_CHECK_FS
+	unsigned char *cur_valid_map_mir;	/* mirror of current valid bitmap */
+#endif
 	/*
 	 * # of valid blocks and the validity bitmap stored in the the last
 	 * checkpoint pack. This information is used by the SSR mode.
@@ -186,9 +189,12 @@ struct segment_allocation {
  * the page is atomically written, and it is in inmem_pages list.
  */
 #define ATOMIC_WRITTEN_PAGE		((unsigned long)-1)
+#define DUMMY_WRITTEN_PAGE		((unsigned long)-2)
 
 #define IS_ATOMIC_WRITTEN_PAGE(page)			\
 		(page_private(page) == (unsigned long)ATOMIC_WRITTEN_PAGE)
+#define IS_DUMMY_WRITTEN_PAGE(page)			\
+		(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
 
 struct inmem_pages {
 	struct list_head list;
@@ -203,6 +209,9 @@ struct sit_info {
 	block_t sit_blocks;		/* # of blocks used by SIT area */
 	block_t written_valid_blocks;	/* # of valid blocks in main area */
 	char *sit_bitmap;		/* SIT bitmap pointer */
+#ifdef CONFIG_F2FS_CHECK_FS
+	char *sit_bitmap_mir;		/* SIT bitmap mirror */
+#endif
 	unsigned int bitmap_size;	/* SIT bitmap size */
 
 	unsigned long *tmp_map;		/* bitmap for temporal use */
@@ -317,6 +326,9 @@ static inline void seg_info_from_raw_sit(struct seg_entry *se,
 	se->ckpt_valid_blocks = GET_SIT_VBLOCKS(rs);
 	memcpy(se->cur_valid_map, rs->valid_map, SIT_VBLOCK_MAP_SIZE);
 	memcpy(se->ckpt_valid_map, rs->valid_map, SIT_VBLOCK_MAP_SIZE);
+#ifdef CONFIG_F2FS_CHECK_FS
+	memcpy(se->cur_valid_map_mir, rs->valid_map, SIT_VBLOCK_MAP_SIZE);
+#endif
 	se->type = GET_SIT_TYPE(rs);
 	se->mtime = le64_to_cpu(rs->mtime);
 }
@@ -414,6 +426,12 @@ static inline void get_sit_bitmap(struct f2fs_sb_info *sbi,
 					void *dst_addr)
 {
 	struct sit_info *sit_i = SIT_I(sbi);
+
+#ifdef CONFIG_F2FS_CHECK_FS
+	if (memcmp(sit_i->sit_bitmap, sit_i->sit_bitmap_mir,
+						sit_i->bitmap_size))
+		f2fs_bug_on(sbi, 1);
+#endif
 	memcpy(dst_addr, sit_i->sit_bitmap, sit_i->bitmap_size);
 }
 
@@ -634,6 +652,12 @@ static inline pgoff_t current_sit_addr(struct f2fs_sb_info *sbi,
 
 	check_seg_range(sbi, start);
 
+#ifdef CONFIG_F2FS_CHECK_FS
+	if (f2fs_test_bit(offset, sit_i->sit_bitmap) !=
+			f2fs_test_bit(offset, sit_i->sit_bitmap_mir))
+		f2fs_bug_on(sbi, 1);
+#endif
+
 	/* calculate sit block address */
 	if (f2fs_test_bit(offset, sit_i->sit_bitmap))
 		blk_addr += sit_i->sit_blocks;
@@ -659,6 +683,9 @@ static inline void set_to_next_sit(struct sit_info *sit_i, unsigned int start)
 	unsigned int block_off = SIT_BLOCK_OFFSET(start);
 
 	f2fs_change_bit(block_off, sit_i->sit_bitmap);
+#ifdef CONFIG_F2FS_CHECK_FS
+	f2fs_change_bit(block_off, sit_i->sit_bitmap_mir);
+#endif
 }
 
 static inline unsigned long long get_mtime(struct f2fs_sb_info *sbi)
@@ -689,6 +716,15 @@ static inline block_t sum_blk_addr(struct f2fs_sb_info *sbi, int base, int type)
 				- (base + 1) + type;
 }
 
+static inline bool no_fggc_candidate(struct f2fs_sb_info *sbi,
+						unsigned int secno)
+{
+	if (get_valid_blocks(sbi, secno, sbi->segs_per_sec) >=
+						sbi->fggc_threshold)
+		return true;
+	return false;
+}
+
 static inline bool sec_usage_check(struct f2fs_sb_info *sbi, unsigned int secno)
 {
 	if (IS_CURSEC(sbi, secno) || (sbi->cur_victim_sec == secno))
@@ -700,8 +736,8 @@ static inline bool sec_usage_check(struct f2fs_sb_info *sbi, unsigned int secno)
  * It is very important to gather dirty pages and write at once, so that we can
  * submit a big bio without interfering other data writes.
  * By default, 512 pages for directory data,
- * 512 pages (2MB) * 3 for three types of nodes, and
- * max_bio_blocks for meta are set.
+ * 512 pages (2MB) * 8 for nodes, and
+ * 256 pages * 8 for meta are set.
  */
 static inline int nr_pages_to_skip(struct f2fs_sb_info *sbi, int type)
 {
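no_fggc_candidate() skips sections whose valid-block count is already at or above sbi->fggc_threshold, since reclaiming such a section would move nearly a full section of live data for very little free space. A toy evaluation of the test; the 98% threshold below is only an assumption for illustration, the real value is computed elsewhere at mount time:

	#include <stdbool.h>
	#include <stdio.h>

	static unsigned int fggc_threshold;

	static bool no_fggc_candidate(unsigned int valid_blocks)
	{
		return valid_blocks >= fggc_threshold;
	}

	int main(void)
	{
		unsigned int blocks_per_sec = 512;	/* one 2MB segment per section */

		fggc_threshold = blocks_per_sec * 98 / 100;	/* assumed policy */
		printf("500 valid blocks: %s\n",
		       no_fggc_candidate(500) ? "skip" : "candidate");
		printf("300 valid blocks: %s\n",
		       no_fggc_candidate(300) ? "skip" : "candidate");
		return 0;
	}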

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
@@ -89,6 +89,7 @@ enum {
 	Opt_active_logs,
 	Opt_disable_ext_identify,
 	Opt_inline_xattr,
+	Opt_noinline_xattr,
 	Opt_inline_data,
 	Opt_inline_dentry,
 	Opt_noinline_dentry,
@@ -101,6 +102,7 @@ enum {
 	Opt_noinline_data,
 	Opt_data_flush,
 	Opt_mode,
+	Opt_io_size_bits,
 	Opt_fault_injection,
 	Opt_lazytime,
 	Opt_nolazytime,
@@ -121,6 +123,7 @@ static match_table_t f2fs_tokens = {
 	{Opt_active_logs, "active_logs=%u"},
 	{Opt_disable_ext_identify, "disable_ext_identify"},
 	{Opt_inline_xattr, "inline_xattr"},
+	{Opt_noinline_xattr, "noinline_xattr"},
 	{Opt_inline_data, "inline_data"},
 	{Opt_inline_dentry, "inline_dentry"},
 	{Opt_noinline_dentry, "noinline_dentry"},
@@ -133,6 +136,7 @@ static match_table_t f2fs_tokens = {
 	{Opt_noinline_data, "noinline_data"},
 	{Opt_data_flush, "data_flush"},
 	{Opt_mode, "mode=%s"},
+	{Opt_io_size_bits, "io_bits=%u"},
 	{Opt_fault_injection, "fault_injection=%u"},
 	{Opt_lazytime, "lazytime"},
 	{Opt_nolazytime, "nolazytime"},
@@ -143,6 +147,7 @@ static match_table_t f2fs_tokens = {
 enum {
 	GC_THREAD,	/* struct f2fs_gc_thread */
 	SM_INFO,	/* struct f2fs_sm_info */
+	DCC_INFO,	/* struct discard_cmd_control */
 	NM_INFO,	/* struct f2fs_nm_info */
 	F2FS_SBI,	/* struct f2fs_sb_info */
 #ifdef CONFIG_F2FS_FAULT_INJECTION
@@ -166,6 +171,8 @@ static unsigned char *__struct_ptr(struct f2fs_sb_info *sbi, int struct_type)
 		return (unsigned char *)sbi->gc_thread;
 	else if (struct_type == SM_INFO)
 		return (unsigned char *)SM_I(sbi);
+	else if (struct_type == DCC_INFO)
+		return (unsigned char *)SM_I(sbi)->dcc_info;
 	else if (struct_type == NM_INFO)
 		return (unsigned char *)NM_I(sbi);
 	else if (struct_type == F2FS_SBI)
@@ -281,7 +288,7 @@ F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_max_sleep_time, max_sleep_time);
 F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_no_gc_sleep_time, no_gc_sleep_time);
 F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_idle, gc_idle);
 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, reclaim_segments, rec_prefree_segments);
-F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, max_small_discards, max_discards);
+F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_small_discards, max_discards);
 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, batched_trim_sections, trim_sections);
 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, ipu_policy, ipu_policy);
 F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, min_ipu_util, min_ipu_util);
@@ -439,6 +446,9 @@ static int parse_options(struct super_block *sb, char *options)
 		case Opt_inline_xattr:
 			set_opt(sbi, INLINE_XATTR);
 			break;
+		case Opt_noinline_xattr:
+			clear_opt(sbi, INLINE_XATTR);
+			break;
 #else
 		case Opt_user_xattr:
 			f2fs_msg(sb, KERN_INFO,
@@ -452,6 +462,10 @@ static int parse_options(struct super_block *sb, char *options)
 			f2fs_msg(sb, KERN_INFO,
 				"inline_xattr options not supported");
 			break;
+		case Opt_noinline_xattr:
+			f2fs_msg(sb, KERN_INFO,
+				"noinline_xattr options not supported");
+			break;
 #endif
 #ifdef CONFIG_F2FS_FS_POSIX_ACL
 		case Opt_acl:
@@ -535,11 +549,23 @@ static int parse_options(struct super_block *sb, char *options)
 			}
 			kfree(name);
 			break;
+		case Opt_io_size_bits:
+			if (args->from && match_int(args, &arg))
+				return -EINVAL;
+			if (arg > __ilog2_u32(BIO_MAX_PAGES)) {
+				f2fs_msg(sb, KERN_WARNING,
+					"Not support %d, larger than %d",
+					1 << arg, BIO_MAX_PAGES);
+				return -EINVAL;
+			}
+			sbi->write_io_size_bits = arg;
+			break;
 		case Opt_fault_injection:
 			if (args->from && match_int(args, &arg))
 				return -EINVAL;
 #ifdef CONFIG_F2FS_FAULT_INJECTION
 			f2fs_build_fault_attr(sbi, arg);
+			set_opt(sbi, FAULT_INJECTION);
 #else
 			f2fs_msg(sb, KERN_INFO,
 				"FAULT_INJECTION was not selected");
@@ -558,6 +584,13 @@ static int parse_options(struct super_block *sb, char *options)
 			return -EINVAL;
 		}
 	}
+
+	if (F2FS_IO_SIZE_BITS(sbi) && !test_opt(sbi, LFS)) {
+		f2fs_msg(sb, KERN_ERR,
+				"Should set mode=lfs with %uKB-sized IO",
+				F2FS_IO_SIZE_KB(sbi));
+		return -EINVAL;
+	}
 	return 0;
 }
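An io_bits=N mount asks f2fs to build write bios in units of 2^N blocks, which is why the option is rejected above unless mode=lfs is also set and N stays within __ilog2_u32(BIO_MAX_PAGES). In numbers, mirroring the F2FS_IO_SIZE* macros added in this series:

	#include <stdio.h>

	int main(void)
	{
		unsigned int io_bits = 7;	/* e.g. mount -o mode=lfs,io_bits=7 */
		unsigned int blocks = 1u << io_bits;

		/* 128 blocks = 512 KB = 524288 bytes per aligned IO unit */
		printf("unit: %u blocks = %u KB = %u bytes\n",
		       blocks, 1u << (io_bits + 2), 1u << (io_bits + 12));
		printf("mask: 0x%x\n", blocks - 1);
		return 0;
	}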
@@ -591,6 +624,7 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
 
 static int f2fs_drop_inode(struct inode *inode)
 {
+	int ret;
 	/*
 	 * This is to avoid a deadlock condition like below.
 	 * writeback_single_inode(inode)
@@ -623,10 +657,12 @@ static int f2fs_drop_inode(struct inode *inode)
 			spin_lock(&inode->i_lock);
 			atomic_dec(&inode->i_count);
 		}
+		trace_f2fs_drop_inode(inode, 0);
 		return 0;
 	}
-
-	return generic_drop_inode(inode);
+	ret = generic_drop_inode(inode);
+	trace_f2fs_drop_inode(inode, ret);
+	return ret;
 }
 
 int f2fs_inode_dirtied(struct inode *inode, bool sync)
@@ -750,6 +786,9 @@ static void f2fs_put_super(struct super_block *sb)
 		write_checkpoint(sbi, &cpc);
 	}
 
+	/* be sure to wait for any on-going discard commands */
+	f2fs_wait_discard_bio(sbi, NULL_ADDR);
+
 	/* write_checkpoint can update stat informaion */
 	f2fs_destroy_stats(sbi);
 
@@ -782,7 +821,7 @@ static void f2fs_put_super(struct super_block *sb)
 	kfree(sbi->raw_super);
 
 	destroy_device_list(sbi);
-
+	mempool_destroy(sbi->write_io_dummy);
 	destroy_percpu_info(sbi);
 	kfree(sbi);
 }
@@ -882,6 +921,8 @@ static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
 		seq_puts(seq, ",nouser_xattr");
 	if (test_opt(sbi, INLINE_XATTR))
 		seq_puts(seq, ",inline_xattr");
+	else
+		seq_puts(seq, ",noinline_xattr");
 #endif
 #ifdef CONFIG_F2FS_FS_POSIX_ACL
 	if (test_opt(sbi, POSIX_ACL))
@@ -918,6 +959,12 @@ static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
 	else if (test_opt(sbi, LFS))
 		seq_puts(seq, "lfs");
 	seq_printf(seq, ",active_logs=%u", sbi->active_logs);
+	if (F2FS_IO_SIZE_BITS(sbi))
+		seq_printf(seq, ",io_size=%uKB", F2FS_IO_SIZE_KB(sbi));
+#ifdef CONFIG_F2FS_FAULT_INJECTION
+	if (test_opt(sbi, FAULT_INJECTION))
+		seq_puts(seq, ",fault_injection");
+#endif
 
 	return 0;
 }
@@ -995,6 +1042,7 @@ static void default_options(struct f2fs_sb_info *sbi)
 	sbi->active_logs = NR_CURSEG_TYPE;
 
 	set_opt(sbi, BG_GC);
+	set_opt(sbi, INLINE_XATTR);
 	set_opt(sbi, INLINE_DATA);
 	set_opt(sbi, INLINE_DENTRY);
 	set_opt(sbi, EXTENT_CACHE);
@@ -1686,36 +1734,55 @@ int f2fs_commit_super(struct f2fs_sb_info *sbi, bool recover)
 static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
 {
 	struct f2fs_super_block *raw_super = F2FS_RAW_SUPER(sbi);
+	unsigned int max_devices = MAX_DEVICES;
 	int i;
 
-	for (i = 0; i < MAX_DEVICES; i++) {
-		if (!RDEV(i).path[0])
+	/* Initialize single device information */
+	if (!RDEV(0).path[0]) {
+		if (!bdev_is_zoned(sbi->sb->s_bdev))
 			return 0;
+		max_devices = 1;
+	}
 
-		if (i == 0) {
-			sbi->devs = kzalloc(sizeof(struct f2fs_dev_info) *
-						MAX_DEVICES, GFP_KERNEL);
-			if (!sbi->devs)
-				return -ENOMEM;
-		}
-
-		memcpy(FDEV(i).path, RDEV(i).path, MAX_PATH_LEN);
-		FDEV(i).total_segments = le32_to_cpu(RDEV(i).total_segments);
-		if (i == 0) {
-			FDEV(i).start_blk = 0;
-			FDEV(i).end_blk = FDEV(i).start_blk +
-				(FDEV(i).total_segments <<
-				sbi->log_blocks_per_seg) - 1 +
-				le32_to_cpu(raw_super->segment0_blkaddr);
-		} else {
-			FDEV(i).start_blk = FDEV(i - 1).end_blk + 1;
-			FDEV(i).end_blk = FDEV(i).start_blk +
-				(FDEV(i).total_segments <<
-				sbi->log_blocks_per_seg) - 1;
-		}
-
-		FDEV(i).bdev = blkdev_get_by_path(FDEV(i).path,
-					sbi->sb->s_mode, sbi->sb->s_type);
+	/*
+	 * Initialize multiple devices information, or single
+	 * zoned block device information.
+	 */
+	sbi->devs = kcalloc(max_devices, sizeof(struct f2fs_dev_info),
+				GFP_KERNEL);
+	if (!sbi->devs)
+		return -ENOMEM;
+
+	for (i = 0; i < max_devices; i++) {
+		if (i > 0 && !RDEV(i).path[0])
+			break;
+
+		if (max_devices == 1) {
+			/* Single zoned block device mount */
+			FDEV(0).bdev =
+				blkdev_get_by_dev(sbi->sb->s_bdev->bd_dev,
+					sbi->sb->s_mode, sbi->sb->s_type);
+		} else {
+			/* Multi-device mount */
+			memcpy(FDEV(i).path, RDEV(i).path, MAX_PATH_LEN);
+			FDEV(i).total_segments =
+				le32_to_cpu(RDEV(i).total_segments);
+			if (i == 0) {
+				FDEV(i).start_blk = 0;
+				FDEV(i).end_blk = FDEV(i).start_blk +
+					(FDEV(i).total_segments <<
+					sbi->log_blocks_per_seg) - 1 +
+					le32_to_cpu(raw_super->segment0_blkaddr);
+			} else {
+				FDEV(i).start_blk = FDEV(i - 1).end_blk + 1;
+				FDEV(i).end_blk = FDEV(i).start_blk +
					(FDEV(i).total_segments <<
+					sbi->log_blocks_per_seg) - 1;
+			}
+			FDEV(i).bdev = blkdev_get_by_path(FDEV(i).path,
+					sbi->sb->s_mode, sbi->sb->s_type);
+		}
 		if (IS_ERR(FDEV(i).bdev))
 			return PTR_ERR(FDEV(i).bdev);
 
@@ -1735,6 +1802,8 @@ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
 				"Failed to initialize F2FS blkzone information");
 			return -EINVAL;
 		}
+		if (max_devices == 1)
+			break;
 		f2fs_msg(sbi->sb, KERN_INFO,
 			"Mount Device [%2d]: %20s, %8u, %8x - %8x (zone: %s)",
 				i, FDEV(i).path,
@@ -1751,6 +1820,8 @@ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
 				FDEV(i).total_segments,
 				FDEV(i).start_blk, FDEV(i).end_blk);
 	}
+	f2fs_msg(sbi->sb, KERN_INFO,
+		"IO Block Size: %8d KB", F2FS_IO_SIZE_KB(sbi));
 	return 0;
 }
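The rewritten f2fs_scan_devices() lays devices out back to back: each one spans total_segments << log_blocks_per_seg blocks, and device 0 additionally absorbs segment0_blkaddr. A sketch of the range bookkeeping with made-up numbers:

	#include <stdio.h>

	int main(void)
	{
		unsigned int log_blocks_per_seg = 9;	/* 512 blocks = 2MB segment */
		unsigned int segment0_blkaddr = 512;	/* illustrative value */
		unsigned int total_segments[2] = { 1024, 2048 };
		unsigned int start = 0, end = 0;
		int i;

		for (i = 0; i < 2; i++) {
			start = i ? end + 1 : 0;
			end = start + (total_segments[i] << log_blocks_per_seg) - 1
				+ (i ? 0 : segment0_blkaddr);
			printf("device %d: blocks %8u - %8u\n", i, start, end);
		}
		return 0;
	}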
@@ -1868,12 +1939,19 @@ try_onemore:
 	if (err)
 		goto free_options;
 
+	if (F2FS_IO_SIZE(sbi) > 1) {
+		sbi->write_io_dummy =
+			mempool_create_page_pool(2 * (F2FS_IO_SIZE(sbi) - 1), 0);
+		if (!sbi->write_io_dummy)
+			goto free_options;
+	}
+
 	/* get an inode for meta space */
 	sbi->meta_inode = f2fs_iget(sb, F2FS_META_INO(sbi));
 	if (IS_ERR(sbi->meta_inode)) {
 		f2fs_msg(sb, KERN_ERR, "Failed to read F2FS meta data inode");
 		err = PTR_ERR(sbi->meta_inode);
-		goto free_options;
+		goto free_io_dummy;
 	}
 
 	err = get_valid_checkpoint(sbi);
@@ -2048,6 +2126,8 @@ skip_recovery:
 				sbi->valid_super_block ? 1 : 2, err);
 	}
 
+	f2fs_msg(sbi->sb, KERN_NOTICE, "Mounted with checkpoint version = %llx",
+				cur_cp_version(F2FS_CKPT(sbi)));
 	f2fs_update_time(sbi, CP_TIME);
 	f2fs_update_time(sbi, REQ_TIME);
 	return 0;
@@ -2091,6 +2171,8 @@ free_devices:
 free_meta_inode:
 	make_bad_inode(sbi->meta_inode);
 	iput(sbi->meta_inode);
+free_io_dummy:
+	mempool_destroy(sbi->write_io_dummy);
 free_options:
 	destroy_percpu_info(sbi);
 	kfree(options);
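The dummy-page mempool is sized to 2 * (F2FS_IO_SIZE(sbi) - 1) pages: padding out one partially filled 2^io_bits-block bio needs at most F2FS_IO_SIZE - 1 filler pages, and the factor of two appears to provision for a second in-flight bio (my reading; the code does not state it). The arithmetic:

	#include <stdio.h>

	int main(void)
	{
		unsigned int io_bits = 7;			/* io_bits=7 */
		unsigned int io_size = 1u << io_bits;		/* 128 blocks */
		unsigned int pool_pages = 2 * (io_size - 1);	/* 254 pages */

		printf("dummy-page pool: %u pages (%u KB)\n",
		       pool_pages, pool_pages * 4);
		return 0;
	}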

diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c
@@ -217,6 +217,112 @@ static struct f2fs_xattr_entry *__find_xattr(void *base_addr, int index,
 	return entry;
 }
 
+static struct f2fs_xattr_entry *__find_inline_xattr(void *base_addr,
+					void **last_addr, int index,
+					size_t len, const char *name)
+{
+	struct f2fs_xattr_entry *entry;
+	unsigned int inline_size = F2FS_INLINE_XATTR_ADDRS << 2;
+
+	list_for_each_xattr(entry, base_addr) {
+		if ((void *)entry + sizeof(__u32) > base_addr + inline_size ||
+			(void *)XATTR_NEXT_ENTRY(entry) + sizeof(__u32) >
+			base_addr + inline_size) {
+			*last_addr = entry;
+			return NULL;
+		}
+		if (entry->e_name_index != index)
+			continue;
+		if (entry->e_name_len != len)
+			continue;
+		if (!memcmp(entry->e_name, name, len))
+			break;
+	}
+	return entry;
+}
+
+static int lookup_all_xattrs(struct inode *inode, struct page *ipage,
+				unsigned int index, unsigned int len,
+				const char *name, struct f2fs_xattr_entry **xe,
+				void **base_addr)
+{
+	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+	void *cur_addr, *txattr_addr, *last_addr = NULL;
+	nid_t xnid = F2FS_I(inode)->i_xattr_nid;
+	unsigned int size = xnid ? VALID_XATTR_BLOCK_SIZE : 0;
+	unsigned int inline_size = 0;
+	int err = 0;
+
+	inline_size = inline_xattr_size(inode);
+
+	if (!size && !inline_size)
+		return -ENODATA;
+
+	txattr_addr = kzalloc(inline_size + size + sizeof(__u32),
+							GFP_F2FS_ZERO);
+	if (!txattr_addr)
+		return -ENOMEM;
+
+	/* read from inline xattr */
+	if (inline_size) {
+		struct page *page = NULL;
+		void *inline_addr;
+
+		if (ipage) {
+			inline_addr = inline_xattr_addr(ipage);
+		} else {
+			page = get_node_page(sbi, inode->i_ino);
+			if (IS_ERR(page)) {
+				err = PTR_ERR(page);
+				goto out;
+			}
+			inline_addr = inline_xattr_addr(page);
+		}
+		memcpy(txattr_addr, inline_addr, inline_size);
+		f2fs_put_page(page, 1);
+
+		*xe = __find_inline_xattr(txattr_addr, &last_addr,
+						index, len, name);
+		if (*xe)
+			goto check;
+	}
+
+	/* read from xattr node block */
+	if (xnid) {
+		struct page *xpage;
+		void *xattr_addr;
+
+		/* The inode already has an extended attribute block. */
+		xpage = get_node_page(sbi, xnid);
+		if (IS_ERR(xpage)) {
+			err = PTR_ERR(xpage);
+			goto out;
+		}
+
+		xattr_addr = page_address(xpage);
+		memcpy(txattr_addr + inline_size, xattr_addr, size);
+		f2fs_put_page(xpage, 1);
+	}
+
+	if (last_addr)
+		cur_addr = XATTR_HDR(last_addr) - 1;
+	else
+		cur_addr = txattr_addr;
+
+	*xe = __find_xattr(cur_addr, index, len, name);
+check:
+	if (IS_XATTR_LAST_ENTRY(*xe)) {
+		err = -ENODATA;
+		goto out;
+	}
+
+	*base_addr = txattr_addr;
+	return 0;
+out:
+	kzfree(txattr_addr);
+	return err;
+}
+
 static int read_all_xattrs(struct inode *inode, struct page *ipage,
 				void **base_addr)
 {
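lookup_all_xattrs() searches the inline area first and only pulls in the xattr node block when needed; __find_inline_xattr() therefore stops as soon as the next entry would cross the end of the inline area and remembers where it stopped through *last_addr, so the search can resume in the block. A simplified, runnable walk with that bound (entry format reduced to length + name):

	#include <stdio.h>

	struct entry { unsigned char name_len; char name[15]; };

	int main(void)
	{
		struct entry area[3] = { {3, "foo"}, {3, "bar"}, {0, ""} };
		size_t inline_size = sizeof(struct entry);	/* deliberately short */
		struct entry *e, *last = NULL;

		for (e = area; e->name_len; e++) {
			if ((char *)(e + 1) > (char *)area + inline_size) {
				last = e;	/* continue from here in the block */
				break;
			}
			printf("inline entry: %.*s\n", e->name_len, e->name);
		}
		printf(last ? "stopped early, resume in xattr block\n"
			    : "inline area fully scanned\n");
		return 0;
	}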
@@ -348,23 +454,20 @@ static inline int write_all_xattrs(struct inode *inode, __u32 hsize,
 	}
 
 	xattr_addr = page_address(xpage);
-	memcpy(xattr_addr, txattr_addr + inline_size, PAGE_SIZE -
-						sizeof(struct node_footer));
+	memcpy(xattr_addr, txattr_addr + inline_size, MAX_XATTR_BLOCK_SIZE);
 	set_page_dirty(xpage);
 	f2fs_put_page(xpage, 1);
 
-	/* need to checkpoint during fsync */
-	F2FS_I(inode)->xattr_ver = cur_cp_version(F2FS_CKPT(sbi));
 	return 0;
 }
 
 int f2fs_getxattr(struct inode *inode, int index, const char *name,
 		void *buffer, size_t buffer_size, struct page *ipage)
 {
-	struct f2fs_xattr_entry *entry;
-	void *base_addr;
+	struct f2fs_xattr_entry *entry = NULL;
 	int error = 0;
-	size_t size, len;
+	unsigned int size, len;
+	void *base_addr = NULL;
 
 	if (name == NULL)
 		return -EINVAL;
@@ -373,21 +476,16 @@ int f2fs_getxattr(struct inode *inode, int index, const char *name,
 	if (len > F2FS_NAME_LEN)
 		return -ERANGE;
 
-	error = read_all_xattrs(inode, ipage, &base_addr);
+	error = lookup_all_xattrs(inode, ipage, index, len, name,
+				&entry, &base_addr);
 	if (error)
 		return error;
 
-	entry = __find_xattr(base_addr, index, len, name);
-	if (IS_XATTR_LAST_ENTRY(entry)) {
-		error = -ENODATA;
-		goto cleanup;
-	}
-
 	size = le16_to_cpu(entry->e_value_size);
 
 	if (buffer && size > buffer_size) {
 		error = -ERANGE;
-		goto cleanup;
+		goto out;
 	}
 
 	if (buffer) {
@@ -395,8 +493,7 @@ int f2fs_getxattr(struct inode *inode, int index, const char *name,
 		memcpy(buffer, pval, size);
 	}
 	error = size;
-
-cleanup:
+out:
 	kzfree(base_addr);
 	return error;
 }
@@ -445,6 +542,13 @@ cleanup:
 	return error;
 }
 
+static bool f2fs_xattr_value_same(struct f2fs_xattr_entry *entry,
+					const void *value, size_t size)
+{
+	void *pval = entry->e_name + entry->e_name_len;
+
+	return (entry->e_value_size == size) && !memcmp(pval, value, size);
+}
+
 static int __f2fs_setxattr(struct inode *inode, int index,
 			const char *name, const void *value, size_t size,
 			struct page *ipage, int flags)
@@ -479,12 +583,17 @@ static int __f2fs_setxattr(struct inode *inode, int index,
 
 	found = IS_XATTR_LAST_ENTRY(here) ? 0 : 1;
 
-	if ((flags & XATTR_REPLACE) && !found) {
+	if (found) {
+		if ((flags & XATTR_CREATE)) {
+			error = -EEXIST;
+			goto exit;
+		}
+
+		if (f2fs_xattr_value_same(here, value, size))
+			goto exit;
+	} else if ((flags & XATTR_REPLACE)) {
 		error = -ENODATA;
 		goto exit;
-	} else if ((flags & XATTR_CREATE) && found) {
-		error = -EEXIST;
-		goto exit;
 	}
 
 	last = here;
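The reworked branch makes the flag handling explicit: XATTR_CREATE fails on an existing name, XATTR_REPLACE fails on a missing one, and setting a value identical to the current one now returns early without rewriting anything. The same semantics as seen from userspace through setxattr(2) (needs an existing file on an xattr-capable filesystem):

	#include <errno.h>
	#include <stdio.h>
	#include <sys/xattr.h>

	int main(void)
	{
		const char *f = "testfile", *n = "user.demo";

		if (setxattr(f, n, "v1", 2, XATTR_CREATE))
			perror("create");	/* EEXIST if the name is present */
		if (setxattr(f, n, "v2", 2, XATTR_REPLACE))
			perror("replace");	/* ENODATA if the name is absent */
		/* rewriting the same value now returns early in __f2fs_setxattr */
		if (setxattr(f, n, "v2", 2, 0))
			perror("set");
		return 0;
	}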

diff --git a/fs/f2fs/xattr.h b/fs/f2fs/xattr.h
@@ -72,9 +72,10 @@ struct f2fs_xattr_entry {
 	for (entry = XATTR_FIRST_ENTRY(addr);\
 			!IS_XATTR_LAST_ENTRY(entry);\
 			entry = XATTR_NEXT_ENTRY(entry))
+#define MAX_XATTR_BLOCK_SIZE	(PAGE_SIZE - sizeof(struct node_footer))
+#define VALID_XATTR_BLOCK_SIZE	(MAX_XATTR_BLOCK_SIZE - sizeof(__u32))
 
-#define MIN_OFFSET(i)	XATTR_ALIGN(inline_xattr_size(i) + PAGE_SIZE -	\
-				sizeof(struct node_footer) - sizeof(__u32))
+#define MIN_OFFSET(i)	XATTR_ALIGN(inline_xattr_size(i) +	\
+				VALID_XATTR_BLOCK_SIZE)
 
 #define MAX_VALUE_LEN(i)	(MIN_OFFSET(i) -			\
 				sizeof(struct f2fs_xattr_header) -	\
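The two new macros pin down how much of an xattr node block is usable: everything except the node footer, minus the trailing __u32 terminator. Evaluated for a 4KB page (assumed; sizeof(struct node_footer) is 24 bytes there):

	#include <stdio.h>

	int main(void)
	{
		unsigned int page_size = 4096;
		unsigned int node_footer = 24;	/* 3 * __le32 + __le64 + __le32 */
		unsigned int max_xattr_block = page_size - node_footer;	/* 4072 */
		unsigned int valid_xattr_block = max_xattr_block - 4;	/* 4068 */

		printf("MAX_XATTR_BLOCK_SIZE   = %u\n", max_xattr_block);
		printf("VALID_XATTR_BLOCK_SIZE = %u\n", valid_xattr_block);
		return 0;
	}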

diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
@@ -36,6 +36,12 @@
 #define F2FS_NODE_INO(sbi)	(sbi->node_ino_num)
 #define F2FS_META_INO(sbi)	(sbi->meta_ino_num)
 
+#define F2FS_IO_SIZE(sbi)	(1 << (sbi)->write_io_size_bits) /* Blocks */
+#define F2FS_IO_SIZE_KB(sbi)	(1 << ((sbi)->write_io_size_bits + 2)) /* KB */
+#define F2FS_IO_SIZE_BYTES(sbi)	(1 << ((sbi)->write_io_size_bits + 12)) /* B */
+#define F2FS_IO_SIZE_BITS(sbi)	((sbi)->write_io_size_bits) /* power of 2 */
+#define F2FS_IO_SIZE_MASK(sbi)	(F2FS_IO_SIZE(sbi) - 1)
+
 /* This flag is used by node and meta inodes, and by recovery */
 #define GFP_F2FS_ZERO	(GFP_NOFS | __GFP_ZERO)
 #define GFP_F2FS_HIGH_ZERO	(GFP_NOFS | __GFP_ZERO | __GFP_HIGHMEM)
@@ -108,6 +114,7 @@ struct f2fs_super_block {
 /*
  * For checkpoint
  */
+#define CP_NAT_BITS_FLAG	0x00000080
 #define CP_CRC_RECOVERY_FLAG	0x00000040
 #define CP_FASTBOOT_FLAG	0x00000020
 #define CP_FSCK_FLAG		0x00000010
@@ -272,6 +279,7 @@ struct f2fs_node {
  * For NAT entries
  */
 #define NAT_ENTRY_PER_BLOCK	(PAGE_SIZE / sizeof(struct f2fs_nat_entry))
+#define NAT_ENTRY_BITMAP_SIZE	((NAT_ENTRY_PER_BLOCK + 7) / 8)
 
 struct f2fs_nat_entry {
 	__u8 version;		/* latest version of cached nat entry */
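NAT_ENTRY_BITMAP_SIZE backs the new on-disk NAT bitmaps: one bit per NAT entry of a block, rounded up to whole bytes. With a 4KB page and the 9-byte packed struct f2fs_nat_entry, that works out as follows:

	#include <stdio.h>

	int main(void)
	{
		unsigned int page_size = 4096;
		unsigned int nat_entry_size = 9;	/* version + ino + blkaddr */
		unsigned int per_block = page_size / nat_entry_size;	/* 455 */
		unsigned int bitmap_bytes = (per_block + 7) / 8;	/* 57 */

		printf("NAT entries per block: %u, bitmap: %u bytes\n",
		       per_block, bitmap_bytes);
		return 0;
	}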

diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
@@ -6,8 +6,8 @@
 
 #include <linux/tracepoint.h>
 
-#define show_dev(entry)		MAJOR(entry->dev), MINOR(entry->dev)
-#define show_dev_ino(entry)	show_dev(entry), (unsigned long)entry->ino
+#define show_dev(dev)		MAJOR(dev), MINOR(dev)
+#define show_dev_ino(entry)	show_dev(entry->dev), (unsigned long)entry->ino
 
 TRACE_DEFINE_ENUM(NODE);
 TRACE_DEFINE_ENUM(DATA);
@@ -55,25 +55,35 @@ TRACE_DEFINE_ENUM(CP_DISCARD);
 		{ IPU,	"IN-PLACE" },					\
 		{ OPU,	"OUT-OF-PLACE" })
 
-#define F2FS_BIO_FLAG_MASK(t)	(t & (REQ_RAHEAD | REQ_PREFLUSH | REQ_FUA))
-#define F2FS_BIO_EXTRA_MASK(t)	(t & (REQ_META | REQ_PRIO))
+#define F2FS_OP_FLAGS	(REQ_RAHEAD | REQ_SYNC | REQ_PREFLUSH | REQ_META |\
+			REQ_PRIO)
+#define F2FS_BIO_FLAG_MASK(t)	(t & F2FS_OP_FLAGS)
 
-#define show_bio_type(op_flags)	show_bio_op_flags(op_flags),		\
-					show_bio_extra(op_flags)
+#define show_bio_type(op,op_flags)	show_bio_op(op),		\
+					show_bio_op_flags(op_flags)
+
+#define show_bio_op(op)							\
+	__print_symbolic(op,						\
+		{ REQ_OP_READ,			"READ" },		\
+		{ REQ_OP_WRITE,			"WRITE" },		\
+		{ REQ_OP_FLUSH,			"FLUSH" },		\
+		{ REQ_OP_DISCARD,		"DISCARD" },		\
+		{ REQ_OP_ZONE_REPORT,		"ZONE_REPORT" },	\
+		{ REQ_OP_SECURE_ERASE,		"SECURE_ERASE" },	\
+		{ REQ_OP_ZONE_RESET,		"ZONE_RESET" },		\
+		{ REQ_OP_WRITE_SAME,		"WRITE_SAME" },		\
+		{ REQ_OP_WRITE_ZEROES,		"WRITE_ZEROES" })
 
 #define show_bio_op_flags(flags)					\
 	__print_symbolic(F2FS_BIO_FLAG_MASK(flags),			\
-		{ 0,			"WRITE" },			\
-		{ REQ_RAHEAD,		"READAHEAD" },			\
-		{ REQ_SYNC,		"REQ_SYNC" },			\
-		{ REQ_PREFLUSH,		"REQ_PREFLUSH" },		\
-		{ REQ_FUA,		"REQ_FUA" })
-
-#define show_bio_extra(type)						\
-	__print_symbolic(F2FS_BIO_EXTRA_MASK(type),			\
+		{ REQ_RAHEAD,		"(RA)" },			\
+		{ REQ_SYNC,		"(S)" },			\
+		{ REQ_SYNC | REQ_PRIO,	"(SP)" },			\
 		{ REQ_META,		"(M)" },			\
-		{ REQ_PRIO,		"(P)" },			\
 		{ REQ_META | REQ_PRIO,	"(MP)" },			\
+		{ REQ_SYNC | REQ_PREFLUSH , "(SF)" },			\
+		{ REQ_SYNC | REQ_META | REQ_PRIO, "(SMP)" },		\
+		{ REQ_PREFLUSH | REQ_META | REQ_PRIO, "(FMP)" },	\
 		{ 0, " \b" })
 
 #define show_data_type(type)						\
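After the rework, an event decodes the bio op and the flag combination separately, so a synchronous high-priority metadata write renders as WRITE(SMP). A userspace mimic of the decoding (the flag bits are stand-ins, not the kernel's REQ_* values):

	#include <stdio.h>

	#define RA   0x1
	#define SYNC 0x2
	#define META 0x4
	#define PRIO 0x8

	static const char *flags_str(unsigned int f)
	{
		switch (f) {
		case RA:			return "(RA)";
		case SYNC:			return "(S)";
		case SYNC | PRIO:		return "(SP)";
		case META:			return "(M)";
		case META | PRIO:		return "(MP)";
		case SYNC | META | PRIO:	return "(SMP)";
		default:			return "";
		}
	}

	int main(void)
	{
		printf("rw = %s%s\n", "WRITE", flags_str(SYNC | META | PRIO));
		return 0;
	}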
@@ -235,7 +245,7 @@ TRACE_EVENT(f2fs_sync_fs,
 	),
 
 	TP_printk("dev = (%d,%d), superblock is %s, wait = %d",
-		show_dev(__entry),
+		show_dev(__entry->dev),
 		__entry->dirty ? "dirty" : "not dirty",
 		__entry->wait)
 );
@@ -305,6 +315,13 @@ DEFINE_EVENT(f2fs__inode_exit, f2fs_unlink_exit,
 	TP_ARGS(inode, ret)
 );
 
+DEFINE_EVENT(f2fs__inode_exit, f2fs_drop_inode,
+
+	TP_PROTO(struct inode *inode, int ret),
+
+	TP_ARGS(inode, ret)
+);
+
 DEFINE_EVENT(f2fs__inode, f2fs_truncate,
 
 	TP_PROTO(struct inode *inode),
@@ -534,7 +551,7 @@ TRACE_EVENT(f2fs_background_gc,
 	),
 
 	TP_printk("dev = (%d,%d), wait_ms = %ld, prefree = %u, free = %u",
-		show_dev(__entry),
+		show_dev(__entry->dev),
 		__entry->wait_ms,
 		__entry->prefree,
 		__entry->free)
@@ -555,6 +572,7 @@ TRACE_EVENT(f2fs_get_victim,
 		__field(int,	alloc_mode)
 		__field(int,	gc_mode)
 		__field(unsigned int,	victim)
+		__field(unsigned int,	cost)
 		__field(unsigned int,	ofs_unit)
 		__field(unsigned int,	pre_victim)
 		__field(unsigned int,	prefree)
@@ -568,20 +586,23 @@ TRACE_EVENT(f2fs_get_victim,
 		__entry->alloc_mode	= p->alloc_mode;
 		__entry->gc_mode	= p->gc_mode;
 		__entry->victim		= p->min_segno;
+		__entry->cost		= p->min_cost;
 		__entry->ofs_unit	= p->ofs_unit;
 		__entry->pre_victim	= pre_victim;
 		__entry->prefree	= prefree;
 		__entry->free		= free;
 	),
 
-	TP_printk("dev = (%d,%d), type = %s, policy = (%s, %s, %s), victim = %u "
-		"ofs_unit = %u, pre_victim_secno = %d, prefree = %u, free = %u",
-		show_dev(__entry),
+	TP_printk("dev = (%d,%d), type = %s, policy = (%s, %s, %s), "
+		"victim = %u, cost = %u, ofs_unit = %u, "
+		"pre_victim_secno = %d, prefree = %u, free = %u",
+		show_dev(__entry->dev),
 		show_data_type(__entry->type),
 		show_gc_type(__entry->gc_type),
 		show_alloc_mode(__entry->alloc_mode),
 		show_victim_policy(__entry->gc_mode),
 		__entry->victim,
+		__entry->cost,
 		__entry->ofs_unit,
 		(int)__entry->pre_victim,
 		__entry->prefree,
@@ -713,7 +734,7 @@ TRACE_EVENT(f2fs_reserve_new_blocks,
 	),
 
 	TP_printk("dev = (%d,%d), nid = %u, ofs_in_node = %u, count = %llu",
-		show_dev(__entry),
+		show_dev(__entry->dev),
 		(unsigned int)__entry->nid,
 		__entry->ofs_in_node,
 		(unsigned long long)__entry->count)
@@ -753,7 +774,7 @@ DECLARE_EVENT_CLASS(f2fs__submit_page_bio,
 		(unsigned long)__entry->index,
 		(unsigned long long)__entry->old_blkaddr,
 		(unsigned long long)__entry->new_blkaddr,
-		show_bio_type(__entry->op_flags),
+		show_bio_type(__entry->op, __entry->op_flags),
 		show_block_type(__entry->type))
 );
 
@@ -775,15 +796,15 @@ DEFINE_EVENT_CONDITION(f2fs__submit_page_bio, f2fs_submit_page_mbio,
 	TP_CONDITION(page->mapping)
 );
 
-DECLARE_EVENT_CLASS(f2fs__submit_bio,
+DECLARE_EVENT_CLASS(f2fs__bio,
 
-	TP_PROTO(struct super_block *sb, struct f2fs_io_info *fio,
-						struct bio *bio),
+	TP_PROTO(struct super_block *sb, int type, struct bio *bio),
 
-	TP_ARGS(sb, fio, bio),
+	TP_ARGS(sb, type, bio),
 
 	TP_STRUCT__entry(
 		__field(dev_t,	dev)
+		__field(dev_t,	target)
 		__field(int,	op)
 		__field(int,	op_flags)
 		__field(int,	type)
@@ -793,37 +814,55 @@ DECLARE_EVENT_CLASS(f2fs__bio,
 
 	TP_fast_assign(
 		__entry->dev		= sb->s_dev;
-		__entry->op		= fio->op;
-		__entry->op_flags	= fio->op_flags;
-		__entry->type		= fio->type;
+		__entry->target		= bio->bi_bdev->bd_dev;
+		__entry->op		= bio_op(bio);
+		__entry->op_flags	= bio->bi_opf;
+		__entry->type		= type;
 		__entry->sector		= bio->bi_iter.bi_sector;
 		__entry->size		= bio->bi_iter.bi_size;
 	),
 
-	TP_printk("dev = (%d,%d), rw = %s%s, %s, sector = %lld, size = %u",
-		show_dev(__entry),
-		show_bio_type(__entry->op_flags),
+	TP_printk("dev = (%d,%d)/(%d,%d), rw = %s%s, %s, sector = %lld, size = %u",
+		show_dev(__entry->target),
+		show_dev(__entry->dev),
+		show_bio_type(__entry->op, __entry->op_flags),
 		show_block_type(__entry->type),
 		(unsigned long long)__entry->sector,
 		__entry->size)
 );
 
-DEFINE_EVENT_CONDITION(f2fs__submit_bio, f2fs_submit_write_bio,
+DEFINE_EVENT_CONDITION(f2fs__bio, f2fs_prepare_write_bio,
 
-	TP_PROTO(struct super_block *sb, struct f2fs_io_info *fio,
-						struct bio *bio),
+	TP_PROTO(struct super_block *sb, int type, struct bio *bio),
 
-	TP_ARGS(sb, fio, bio),
+	TP_ARGS(sb, type, bio),
 
 	TP_CONDITION(bio)
 );
 
-DEFINE_EVENT_CONDITION(f2fs__submit_bio, f2fs_submit_read_bio,
+DEFINE_EVENT_CONDITION(f2fs__bio, f2fs_prepare_read_bio,
 
-	TP_PROTO(struct super_block *sb, struct f2fs_io_info *fio,
-						struct bio *bio),
+	TP_PROTO(struct super_block *sb, int type, struct bio *bio),
 
-	TP_ARGS(sb, fio, bio),
+	TP_ARGS(sb, type, bio),
+
+	TP_CONDITION(bio)
+);
+
+DEFINE_EVENT_CONDITION(f2fs__bio, f2fs_submit_read_bio,
+
+	TP_PROTO(struct super_block *sb, int type, struct bio *bio),
+
+	TP_ARGS(sb, type, bio),
+
+	TP_CONDITION(bio)
+);
+
+DEFINE_EVENT_CONDITION(f2fs__bio, f2fs_submit_write_bio,
+
+	TP_PROTO(struct super_block *sb, int type, struct bio *bio),
+
+	TP_ARGS(sb, type, bio),
 
 	TP_CONDITION(bio)
 );
@@ -1082,16 +1121,16 @@ TRACE_EVENT(f2fs_write_checkpoint,
 	),
 
 	TP_printk("dev = (%d,%d), checkpoint for %s, state = %s",
-		show_dev(__entry),
+		show_dev(__entry->dev),
 		show_cpreason(__entry->reason),
 		__entry->msg)
 );
 
 TRACE_EVENT(f2fs_issue_discard,
 
-	TP_PROTO(struct super_block *sb, block_t blkstart, block_t blklen),
+	TP_PROTO(struct block_device *dev, block_t blkstart, block_t blklen),
 
-	TP_ARGS(sb, blkstart, blklen),
+	TP_ARGS(dev, blkstart, blklen),
 
 	TP_STRUCT__entry(
 		__field(dev_t,	dev)
@@ -1100,22 +1139,22 @@ TRACE_EVENT(f2fs_issue_discard,
 	),
 
 	TP_fast_assign(
-		__entry->dev	= sb->s_dev;
+		__entry->dev	= dev->bd_dev;
 		__entry->blkstart = blkstart;
 		__entry->blklen = blklen;
 	),
 
 	TP_printk("dev = (%d,%d), blkstart = 0x%llx, blklen = 0x%llx",
-		show_dev(__entry),
+		show_dev(__entry->dev),
 		(unsigned long long)__entry->blkstart,
 		(unsigned long long)__entry->blklen)
 );
 
 TRACE_EVENT(f2fs_issue_reset_zone,
 
-	TP_PROTO(struct super_block *sb, block_t blkstart),
+	TP_PROTO(struct block_device *dev, block_t blkstart),
 
-	TP_ARGS(sb, blkstart),
+	TP_ARGS(dev, blkstart),
 
 	TP_STRUCT__entry(
 		__field(dev_t,	dev)
@@ -1123,21 +1162,21 @@ TRACE_EVENT(f2fs_issue_reset_zone,
 	),
 
 	TP_fast_assign(
-		__entry->dev	= sb->s_dev;
+		__entry->dev	= dev->bd_dev;
 		__entry->blkstart = blkstart;
 	),
 
 	TP_printk("dev = (%d,%d), reset zone at block = 0x%llx",
-		show_dev(__entry),
+		show_dev(__entry->dev),
 		(unsigned long long)__entry->blkstart)
 );
 
 TRACE_EVENT(f2fs_issue_flush,
 
-	TP_PROTO(struct super_block *sb, unsigned int nobarrier,
+	TP_PROTO(struct block_device *dev, unsigned int nobarrier,
 					unsigned int flush_merge),
 
-	TP_ARGS(sb, nobarrier, flush_merge),
+	TP_ARGS(dev, nobarrier, flush_merge),
 
 	TP_STRUCT__entry(
 		__field(dev_t,	dev)
@@ -1146,13 +1185,13 @@ TRACE_EVENT(f2fs_issue_flush,
 	),
 
 	TP_fast_assign(
-		__entry->dev	= sb->s_dev;
+		__entry->dev	= dev->bd_dev;
 		__entry->nobarrier = nobarrier;
 		__entry->flush_merge = flush_merge;
 	),
 
 	TP_printk("dev = (%d,%d), %s %s",
-		show_dev(__entry),
+		show_dev(__entry->dev),
 		__entry->nobarrier ? "skip (nobarrier)" : "issue",
 		__entry->flush_merge ? " with flush_merge" : "")
 );
@@ -1267,7 +1306,7 @@ TRACE_EVENT(f2fs_shrink_extent_tree,
 	),
 
 	TP_printk("dev = (%d,%d), shrunk: node_cnt = %u, tree_cnt = %u",
-		show_dev(__entry),
+		show_dev(__entry->dev),
 		__entry->node_cnt,
 		__entry->tree_cnt)
 );
@@ -1314,7 +1353,7 @@ DECLARE_EVENT_CLASS(f2fs_sync_dirty_inodes,
 	),
 
 	TP_printk("dev = (%d,%d), %s, dirty count = %lld",
-		show_dev(__entry),
+		show_dev(__entry->dev),
 		show_file_type(__entry->type),
 		__entry->count)
 );