There is a bug in btrfs-convert in 4.1.2, even we don't allow mixed
block group for converted image, btrfs-convert will still create image
with data and metadata inside one chunk.
And further more, the chunk type is still DATA or METADATA, not
DATA|METADATA (not mixed).
So add btrfsck check for it right now.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Added a missing newline to some error messages.
Also printf() was changed to fprintf(stderr) for error messages.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Reviewed-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
mkfs creates more than one fs_devices in fs_uuids.
1: one is for file system being created
2: others are created in test_dev_for_mkfs in order to check mount point
test_dev_for_mkfs()-> ... -> btrfs_scan_one_device()
Current code only closes 1, and this patch also closes in case 2.
Similar problem exist in other tools, eg.::
cmd-check.c: the function is:
cmd_check()->check_mounted()-> ... -> btrfs_scan_one_device()
...
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
rec->crossing_stripes is not initialized in allocate place,
and have possibility causing wrong report for normal tree block.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We need not check path before btrfs_free_path() is called because
path is checked in btrfs_free_path().
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For a special case, discount file extent repair function will cause
infinite loop.
The case is, if the file loses all its extents, we won't have a hole
to fill, causing repair function doing nothing. Since the
I_ERR_DISCOUNT doesn't disappear, fsck will do an infinite loop.
For such case, just puch hole to fill the whole range to fix it.
Reported-by: Robert Munteanu <robert.munteanu@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If a file lost all its extents, fsck will unable to print out the hole.
Add an extra check to print out the hole range.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Kernel btrfs_map_block() function has a limitation that it can only
map BTRFS_STRIPE_LEN size.
That will cause scrub fails to scrub tree block which crosses strip
boundary, causing BUG_ON().
Normally, it's OK as metadata is always in metadata chunk and
BTRFS_STRIPE_LEN can always be divided by node/leaf size.
So without mixed block group, tree block won't cross stripe boundary.
But for mixed block group, especially for btrfs converted from ext4,
it's almost sure one or more tree blocks are not aligned with node size
and cross stripe boundary.
Causing bug with kernel scrub.
This patch will report the problem, although we don't have a good idea
how to fix it in user space until we add the ability to relocate tree
block in user space.
Also, kernel code should also be checked for such tree block alloc
problems.
Reported-by: Chris Murphy <lists@colorremedies.com>
Reported-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Some unknown kernel bug makes inode nbytes modification out of sync with
file extent update.
But it's quite easy to fix in btrfs-progs anyway.
So just fix it by adding a new function repair_inode_nbytes by using the
found_size in inode_record.
Reported-by: Christian <cdysthe@gmail.com>
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In copy_inode_rec(), a shallow copy happens on rec->holes rb_root.
So for shared inode case, new rec->holes still points to old rb_root,
and when the old inode record is freed, the new inode_rec->holes will
points to garbage and cause segfault when we try to free new
inode_rec->holes.
Fix it by calling copy_file_extent_holes() to do deep copy.
Reported-by: Eric Sandeen <sandeen@redhat.com>
Reported-by: Filipe David Manana <fdmanana@gmail.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
make[1]: Nothing to be done for `all'.
cmds-check.c: In function ‘del_file_extent_hole’:
cmds-check.c:289:26: warning: ‘prev.start’ may be used uninitialized in this function
cmds-check.c:289:26: warning: ‘prev.len’ may be used uninitialized in this function
cmds-check.c:290:26: warning: ‘next.start’ may be used uninitialized in this function
cmds-check.c:290:26: warning: ‘next.len’ may be used uninitialized in this function
Reported-by: Anand Jain <Anand.Jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
We're not using it anywhere. The best practice is to add enums with
values > 255 for the long options, option index counting is error prone.
Signed-off-by: David Sterba <dsterba@suse.cz>
Before this patch, csum tree rebuild will not work with extent tree
rebuild, since extent tree rebuild will only build up basic block
groups, but csum tree rebuild needs data extents to rebuild.
So if one use btrfsck with --init-csum-tree and --init-extent-tree, csum
tree will be empty and tons of "missing csum" error will be outputted.
This patch allows csum tree rebuild get its data from fs/subvol trees
using regular file extents (which is also the only one using csum tree
currently).
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[renamed to fill_csum_tree_from_one_fs_root]
Signed-off-by: David Sterba <dsterba@suse.cz>
There is a case that can cause nlink fix function.
For example, lost+found dir already has the following files:
---------------------------
|ino |filename |
|-------------------------|
|258 |normal_file |
|259 |normal_file.260 |
---------------------------
The next inode to be fixed is the following:
---------------------------
|260 |normail_file |
---------------------------
And when trying to move inode to lost+found dir, its file name conflicts
with inode 258, and even add ".INO" suffix, it still conflicts with
inode 259.
Since the move failed, the LINK_COUNT_ERR flag is not cleared, the inode
record will not be freed, btrfsck will try fix it again and again,
causing the infinite loop.
The patch will first change the ".INO" suffix naming to a loop behavior,
and clear the LINK_COUNT_ERR flag anyway to avoid infinite loop.
Reported-by: Naohiro Aota <naota@elisp.net>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
We can have FULL_BACKREF set or not set when we need the opposite, this patch
fixes this problem by setting a bit when the flag is set improperly. This way
we can either correct the problem when we re-create the extent item if the
backrefs are also wrong, or we can just set the flag properly in the extent
item. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
We don't want to keep extent records pinned down if we fix stuff as we may need
the space and we can be pretty sure that these records are correct. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
We hold a transaction open for the entirety of fixing extent refs. This works
out ok most of the time but we can be tight on space and run out of space when
fixing things. To get around this just push down the transaction starting dance
into the functions that actually fix things. This keeps us from ending up with
ENOSPC because we pinned everything and allows the code to be a bit simpler.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
The METADUMP super flag makes us skip doing the chunk tree reading which isn't
helpful for the new restore since we have a valid chunk tree. But we still want
to have a way for the kernel to know that this is a metadump restore so it
doesn't do things like verify data checksums. We also want to skip some of the
device extent checks in fsck since those will obviously not match. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
The data reloc root is weird with it's csums. It'll copy an entire extent and
then log any csums it finds, which makes it look weird when it comes to prealloc
extents. So just skip the data reloc tree, it's special and we just don't need
to worry about it. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
We have logic to fix the root locations for roots in response to a corruption
bug we had earlier. However this work doesn't apply to reloc roots and can
screw things up worse, so make sure we skip any reloc roots that we find.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
If we fix bad blocks during run_next_block we will return -EAGAIN to loop around
and start again. The deal_with_roots work messed up this handling, this patch
fixes it. With this patch we can properly deal with broken tree blocks.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Previously we used to just set FULL_BACKREF if we couldn't lookup an extent info
for an extent. Now we just bail out if we can't lookup the extent info, which
is less than good since fsck is supposed to fix these very problems. So instead
figure out the flag we are supposed to use and pass that along instead. This
patch also provides a test image to test this functionality. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Allow read_tree_block() and read_node_slot() to return error pointer.
This should help caller to get more specified error number.
For existing callers, change (!eb) judgmentt to
(!extent_buffer_uptodate(eb)) to keep the compatibility, and for caller
missing the check, use PTR_ERR(eb) if possible.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Since orphan extents are handled in previous patches, now just punch
holes to fill the file extents hole.
Also since now btrfsck is aware of whether the inode has orphan file
extent, allow repair_inode_no_item() to determine filetype according to
orphan file extent.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
In some fs tree leaf/node corruption case, file extents may be lost, but
in extent tree, its record may still exists.
This provide the possibility for such orphan file extents to be
recovered even we can't recover its compression and other info, we can
still insert it as a normal non-compression file extent.
This patch provides the repair and report function for such orphan file
extent.
Even after such repair, user may still need to try to decompress its
data if user knows that is a compressed extent.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Record every file extent discontinuous hole in inode_record using a
rb_tree member.
Before the patch, btrfsck will only record the first file extent hole by
using first_extent_gap, that's good for detecting error, but not
suitable for fixing it.
This patch provides the ability to record every file extent hole and
report it.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Before this patch, when a extent's data ref points to a invalid key in
fs tree, this happens if a leaf/node of fs tree is corrupted, btrfsck
can't do any repair and just exit.
In fact, such problem can be handled in fs tree repair routines, rebuild
the inode item(if missing) and add back the extent data (with some
assumption).
So this patch records such data extent refs for later fs tree recovery
routine.
TODO:
Restore orphan data extent refs into btrfs_root is not the best
method. It's best to directly restore it into inode_record, however
current extent tree and fs tree can't cooperate together, so use
btrfs_root as a temporary storage until inode_cache is built.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
- use standard PACKAGE_{NAME,VERSION,STRING,URL,...} autoconf macros
rather than homemade BTRFS_BUILD_VERSION
- don't #include version.h, now the file is necessary for library API only
Note that "btrfs version" returns "btrfs-progs <version>" instead of
the original confusing "btrfs <version>".
Signed-off-by: Karel Zak <kzak@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Particularly during support conversations, people get confused about
which options to use with btrfs check. Adding a flag, --readonly, which
implies the default read-only behaviour and which conflicts with the
read-write operations, should help make the behaviour of the tool clear.
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
Signed-off-by: David Sterba <dsterba@suse.cz>
The current approach to option parsing, where long-only options are
selected on the basis of their position in the long_options array is
fragile and painful to modify if options are to be inserted into the
list, rather than appended.
Instead, use the last field of struct option to return a value which
cannot be a char (and hence a short option), and simply switch on those
within the case statement.
Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
Signed-off-by: David Sterba <dsterba@suse.cz>
glibc 2.10+ (5+ years old) enables all the desired features:
_XOPEN_SOURCE 700, __XOPEN2K8, POSIX_C_SOURCE, DEFAULT_SOURCE; with a
single _GNU_SOURCE define in the makefile alone. For portability to
other libc implementations (e.g. dietlibc) _XOPEN_SOURCE=700 is also
defined.
This also resolves Debian bug report filed by Michael Tautschnig -
"Inconsistent use of _XOPEN_SOURCE results in conflicting
declarations". Whilst I was not able to reproduce the results, the
reported fact is that _XOPEN_SOURCE set to 500 in one set of files
(e.g. cmds-filesystem.c) generates/defines different struct stat from
other files (cmds-replace.c).
This patch thus cleans up all feature defines, and sets them at a
consistent level.
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=747969
Signed-off-by: Dimitri John Ledkov <dimitri.j.ledkov@intel.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The original check_inode_recs() will return -1 if found any error in a
inode_record. This is OK for original design since there is almost
nothing can repair at that time.
However more and more error from nlink mismatch to missing inode item
can be repaired in try_repair_inode(), check_inode_recs() should not
increase the error count if the inode can be repair.
With this patch, repair function for leaf-corruption will not return
error if all corruption inode can be recovered.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
The commit f495a2ac66 ("btrfs-progs: fsck: remove unfriendly BUG_ON()
for searching tree failure") is causing tons of extent buffer leak if some
csum mismatches in btrfsck.
This is caused by a misplaced btrfs_release_path(), fix it.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
In case the buffer is corrupted and the for loop does not happen, we'd
return garbage. The caller retunrs -EIO in case of any corruption, use
that value in fix_key_order.
Resolves-coverity-id: 1246944
Signed-off-by: David Sterba <dsterba@suse.cz>
In reset_nlink(), we believe rec->found_link as accurate number of the
valid links. However it only records the number of found DIR_ITEM, so we
can't use it as reliable value.
Before this patch, in some case, leaf corruption recovery will believe
there is a valid backref but don't add_link() since it can't find any
valid one and don't put it into the lost+found dir.
So the recovered inode will be considered as an orphan item without
orphan item and repair_inode_orphan_item() will add orphan item for it,
causing all the filename/filetype we recovered being a waste of time.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If the tree's empty, we'll get NULL and dereference it.
Resolves-Coverity-CID: 1248828
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
If some critical roots are corrupt, reapr_root_items() will fail,
this is detected by fsck_tests.sh's extent rebuilding tests.
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
This patch makes us to rebuild a really corrupt extent tree with snapshots.
To implement this, we have to verify whether a block is FULL BACKREF.
This idea come from Josef Bacik:
1) We walk down the original tree, every eb we encounter has
btrfs_header_owner(eb) == root->objectid. We add normal references
for this root (BTRFS_TREE_BLOCK_REF_KEY) for this root. World peace
is achieved.
2) We walk down the snapshotted tree. Say we didn't change anything
at all, it was just a clean snapshot and then boom. So the
btrfs_header_owner(root->node) == root->objectid, so normal backref.
We walk down to the next level, where btrfs_header_owner(eb) !=
root->objectid, but the level above did, so we add normal refs for all
of these blocks. We go down the next level, now our
btrfs_header_owner(parent) != root->objectid and
btrfs_header_owner(eb) != root->objectid. This is where we need to
now go back and see if btrfs_header_owner(eb) currently has a ref on
eb. If it does we are done, move on to the next block in this same
level, we don't have to go further down.
3) Harder case, we snapshotted and then changed things in the original
root. Do the same thing as in step 2, but now we get down to
btrfs_header_owner(eb) != root->objectid && btrfs_header_owner(parent)
!= root->objectid. We lookup the references we have for eb and notice
that btrfs_header_owner(eb) no longer refers to eb. So now we must
set FULL_BACKREF on this extent reference and add a
SHARED_BLOCK_REF_KEY for this eb using the parent->start as the
offset. And we need to keep walking down and doing the same thing
until we either hit level 0 or btrfs_header_owner(eb) has a ref on the
block.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Previously, we deal with node block firstly and then leaf block which can
maximize readahead. However, to rebuild extent tree, we need deal with snapshot
one by one.
This patch makes us deal with snapshot one by one if we need rebuild extent
tree otherwise we drop into previous way.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Add a basic inode item rebuild function for I_ERR_NO_INODE_ITEM.
The main use case is to repair btrfs which fs root has corrupted leaf,
but it is already working for case if the corrupteed fs root leaf/node
contains no inode extent_data.
The repair needs 3 elements for inode rebuild:
1. inode number
This is quite easy, existing inode_record codes will detect it quite
well.
2. inode type
This is the trick part. The only reliable method is to recovery it from
parent's dir_index/item.
The remaining method will search for regular file extent for FILE
type or child's backref for DIR(todo).
Fallback will be FILE.
Inode name(inode_ref) will be recoverd by nlink repair function.
This is just a fundamental implement, some advanced recovery can be
improved later with btrfs-progs infrastructure change.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>