3757 commits

Author SHA1 Message Date
David Sterba 509784bee2 fixup for btrfs_find_create_tree_block
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-12 14:35:57 +02:00
Qu Wenruo b782d087ae btrfs-progs: scrub: Introduce offline scrub function
Now, btrfs-progs has a kernel scrub equivalent.
A new option, --offline is added to "btrfs scrub start".

If --offline is given, btrfs scrub acts like kernel scrub: it checks
every copy of each extent and reports corrupted data and whether it is
recoverable.

The advantages compared to kernel scrub are:
1) No race
   Unlike kernel scrub, which is done in parallel, offline scrub is done
   by a single thread.
   Although it may be slower than the kernel one, it's safer and gives no
   false alerts.

2) Correctness
   Kernel has a known bug (fix submitted) which will recover RAID5/6
   data but screw up P/Q, due to the difficulty of coding in the kernel.
   In btrfs-progs there are no pages and (almost) no memory size limit,
   so we can focus on the scrub and make things easier.

New offline scrub can detect and report P/Q corruption with a
recoverability report, while kernel will only report data stripe errors.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Su <suy.fnst@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
Qu Wenruo f3d259fe62 btrfs-progs: scrub: Introduce function to check a whole block group
Introduce new function, scrub_one_block_group(), to scrub a block group.

For Single/DUP/RAID0/RAID1/RAID10, we use the old mirror-number-based
map_block, and check extent by extent.

For parity-based profiles (RAID5/6), we use the new map_block_v2() and
check full stripe by full stripe.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
Qu Wenruo 6bbeb7df3d btrfs-progs: scrub: Introduce a function to scrub one full stripe
Introduce a new function, scrub_one_full_stripe(), to check a full
stripe.

It handles the full stripe scrub in the following steps:
0) Check if we need to check the full stripe
   If the full stripe contains no extent, why waste our CPU and IO?

1) Read out the full stripe
   Then we know how many devices are missing or have read errors.
   If that is beyond repair, then exit.

   If there is a missing device or a read error, try to recover here.

2) Check data stripe against csum
   We add data stripes with csum errors as corrupted stripes, just like
   dev missing or read error.
   Then recheck if csum mismatch is still below tolerance.

Finally we check the full stripe using 2 factors only:
A) Whether the full stripe has ever gone through recovery
B) Whether the full stripe has a csum error

Combining factors A and B we get:
1) A && B: Recovered, csum mismatch
   Screwed up totally
2) A && !B: Recovered, csum match
   Recoverable, data corrupted but P/Q is good to recover
3) !A && B: Not recovered, csum mismatch
   Try to recover corrupted data stripes
   If recovered csum match, then recoverable
   Else, screwed up
4) !A && !B: Not recovered, no csum mismatch
   Best case, just check if P/Q matches.
   If P/Q matches, everything is good
   Else, just P/Q is screwed up, still recoverable.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
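
The A/B decision table in the commit message above can be sketched as a small function. This is only an illustration of the four cases as described, not the code from scrub_one_full_stripe(); the function name, the verdict strings, and the `retry_recovers` and `parity_matches` parameters are hypothetical stand-ins for the follow-up checks the message mentions.

```python
def classify_full_stripe(recovered, csum_mismatch,
                         retry_recovers=False, parity_matches=True):
    """Map factors A (recovered) and B (csum_mismatch) to a verdict.

    retry_recovers: hypothetical result of trying to rebuild the corrupted
                    data stripes and re-checking csums (case 3).
    parity_matches: hypothetical result of the P/Q comparison (case 4).
    """
    if recovered and csum_mismatch:
        # 1) A && B: recovery already ran and csums still mismatch.
        return "unrecoverable"
    if recovered and not csum_mismatch:
        # 2) A && !B: P/Q was good enough to rebuild the data.
        return "recoverable"
    if not recovered and csum_mismatch:
        # 3) !A && B: try to rebuild the corrupted data stripes.
        return "recoverable" if retry_recovers else "unrecoverable"
    # 4) !A && !B: data is fine, only P/Q may be wrong.
    return "good" if parity_matches else "recoverable (parity only)"
```

For example, a stripe that was read cleanly, has good csums, but whose parity does not match falls into case 4 and is still reported as recoverable.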
Qu Wenruo 0b13425729 btrfs-progs: scrub: Introduce helper to write a full stripe
Introduce an internal helper, write_full_stripe(), to calculate P/Q and
write the whole full stripe.

This is useful to recover RAID56 stripes.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
Qu Wenruo fa8040d571 btrfs-progs: scrub: Introduce function to recover data parity
Introduce function, recover_from_parities(), to recover data stripes.

It just wraps raid56_recov() with extra checks on the
scrub_full_stripe structure.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
Qu Wenruo e2c28e53dd btrfs-progs: extent-tree: Introduce function to check if there is any extent in given range.
Introduce a new function, btrfs_check_extent_exists(), to check if there
is any extent in the range specified by user.

The parameter can be a large range, and if any extent exists in the
range, it will return >0 (in fact it will return 1).
Or return 0 if no extent is found.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
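
The return convention described above (1 if any extent overlaps the range, 0 otherwise) can be sketched as follows. This is a minimal model, assuming extents are given as a plain list of (start, length) pairs; the real btrfs_check_extent_exists() searches the extent tree, which is not modelled here, and the Python function name is hypothetical.

```python
def check_extent_exists(extents, start, length):
    """Return 1 if any (ext_start, ext_len) overlaps [start, start+length), else 0."""
    end = start + length
    for ext_start, ext_len in extents:
        # Two half-open ranges overlap when each starts before the other ends.
        if ext_start < end and ext_start + ext_len > start:
            return 1
    return 0
```

A caller such as the full stripe scrub can use this kind of check to skip a full stripe that contains no extent at all.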
Qu Wenruo 116b4d33ce btrfs-progs: scrub: Introduce function to verify parities
Introduce new function, verify_parities(), to check whether the parities
match a full stripe whose data stripes match their csums.

Caller should fill the scrub_full_stripe structure properly before
calling this function.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
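
As a rough illustration of what verifying parity means, the sketch below recomputes the RAID5 P stripe as the byte-wise XOR of the data stripes and compares it with the on-disk copy. It covers P only; the RAID6 Q stripe needs Galois-field arithmetic and is left out. The function names are hypothetical and this is not the verify_parities() implementation.

```python
def xor_parity(data_stripes):
    """Compute the P stripe as the byte-wise XOR of equally sized data stripes."""
    parity = bytearray(len(data_stripes[0]))
    for stripe in data_stripes:
        for i, byte in enumerate(stripe):
            parity[i] ^= byte
    return bytes(parity)


def verify_p(data_stripes, on_disk_p):
    """Return True when the recomputed P matches the P read from disk."""
    return xor_parity(data_stripes) == on_disk_p
```

The comparison is only meaningful once the data stripes themselves have passed their csum checks, which is why the commit message asks the caller to fill in the structure first.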
Qu Wenruo 9cfd624e9a btrfs-progs: scrub: Introduce function to scrub one data stripe
Introduce new function, scrub_one_data_stripe(), to check all data and
tree blocks inside the data stripe.

This function will not try to recover from any error, but only checks
whether any data/tree block has a mismatched csum.

If data is missing its csum, which is completely valid for cases like
nodatasum, it will just record it, but not report it as an error.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
Qu Wenruo c99430a4a1 btrfs-progs: scrub: Introduce function to scrub one mirror-based extent
Introduce a new function, scrub_one_extent(), as a wrapper to check one
mirror-based extent.

It will accept a btrfs_path parameter @path, which must point to a
META/EXTENT_ITEM.
And @start, @len, which must be a subset of META/EXTENT_ITEM.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
Qu Wenruo 5552d0fc46 btrfs-progs: scrub: Introduce functions to scrub mirror based data blocks
Introduce new functions, check/recover_data_mirror(), to check and recover
mirror-based data blocks.

Unlike tree blocks, data blocks must be recovered sector by sector, so we
introduce corrupted_bitmap for check and recover.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
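
The idea of a per-sector corrupted bitmap can be sketched like this. It is a simplified model under stated assumptions: zlib.crc32 stands in for the crc32c that btrfs actually uses, the bitmap is a plain integer with bit i set for a bad sector i, and both function names are hypothetical rather than taken from the patch.

```python
import zlib

SECTOR = 4096  # assumed sector size for the sketch


def find_corrupted(data, csums, sector=SECTOR):
    """Return a bitmap with bit i set when sector i fails its checksum."""
    bitmap = 0
    for i, expected in enumerate(csums):
        chunk = data[i * sector:(i + 1) * sector]
        if zlib.crc32(chunk) != expected:  # stand-in for crc32c
            bitmap |= 1 << i
    return bitmap


def recover_from_mirror(bad, good, bitmap, sector=SECTOR):
    """Copy only the sectors flagged in bitmap from the good mirror."""
    out = bytearray(bad)
    i = 0
    while bitmap >> i:
        if (bitmap >> i) & 1:
            out[i * sector:(i + 1) * sector] = good[i * sector:(i + 1) * sector]
        i += 1
    return bytes(out)
```

Working per sector means one bad sector does not force the whole extent to be rewritten, and different sectors can be repaired from different mirrors.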
Qu Wenruo 1e5a6f6c88 btrfs-progs: scrub: Introduce functions to scrub mirror based tree block
Introduce new functions, check/recover_tree_mirror(), to check and
recover mirror-based tree blocks (Single/DUP/RAID0/1/10).

check_tree_mirror() can also be used on in-memory tree blocks using @data
parameter.
This is very handy for RAID5/6 case, either checking the data stripe
tree block by @bytenr and 0 as @mirror, or using @data parameter for
recovered in-memory data.

While recover_tree_mirror() is only used for mirror-based profiles, as
RAID56 recovery is done by stripe unit, not mirror unit.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
Qu Wenruo f44a420be7 btrfs-progs: scrub: Introduce structures to support offline scrub for RAID56
Introduce new local structures, scrub_full_stripe and scrub_stripe, for
incoming offline RAID56 scrub support.

Pure stripe/mirror based profiles, like raid0/1/10/dup/single, will
follow the original bytenr and mirror number based iteration, so
they don't need any extra structures.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:35:05 +02:00
Qu Wenruo 453e66a3ba btrfs-progs: csum: Introduce function to read out data csums
Introduce a new function: btrfs_read_data_csums(), to read out csums
for sectors in range.

This is quite useful for reading out data csums, so we don't need to
open-code it.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:34:37 +02:00
Qu Wenruo 4ab47d1e90 btrfs-progs: Allow __btrfs_map_block_v2 to remove unrelated stripes
For READ, the caller normally wants exactly what was requested, rather
than the full stripe map.

In this case, we should remove unrelated stripe maps, as in the
following case:
               32K               96K
               |<-request range->|
         0              64k           128K
RAID0:   |    Data 1    |   Data 2    |
              disk1         disk2
Before this patch, we return the full stripe:
Stripe 0: Logical 0, Physical X, Len 64K, Dev disk1
Stripe 1: Logical 64k, Physical Y, Len 64K, Dev disk2

After this patch, we limit the stripe result to the request range:
Stripe 0: Logical 32K, Physical X+32K, Len 32K, Dev disk1
Stripe 1: Logical 64k, Physical Y, Len 32K, Dev disk2

And if it's a RAID5/6 stripe, we just handle it like RAID0, ignoring
parities.

This should make the function easier for callers to use.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
2017-09-12 14:33:59 +02:00
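
The trimming shown in the RAID0 example above can be sketched as clamping each stripe to the request range and shifting its physical address by the amount cut from the front. This is a minimal model, assuming stripes are plain dicts; the function name and dict keys are hypothetical and do not mirror the C structures.

```python
K = 1024


def trim_stripes(stripes, req_start, req_len):
    """Limit each stripe to [req_start, req_start + req_len); drop stripes outside it."""
    req_end = req_start + req_len
    result = []
    for s in stripes:
        start = max(s["logical"], req_start)
        end = min(s["logical"] + s["len"], req_end)
        if start >= end:
            continue  # stripe does not intersect the request
        result.append({
            "logical": start,
            # Physical address moves by the same amount trimmed from the front.
            "physical": s["physical"] + (start - s["logical"]),
            "len": end - start,
            "dev": s["dev"],
        })
    return result
```

With the two 64K stripes from the message and a request for 32K..96K, this yields a 32K piece at X+32K on disk1 and a 32K piece at Y on disk2.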
Qu Wenruo 066745d697 btrfs-progs: Introduce new btrfs_map_block function which returns more unified result.
Introduce a new function, __btrfs_map_block_v2().

Unlike the old btrfs_map_block(), which needs different parameters to
handle different RAID profiles, this new function uses a unified
btrfs_map_block structure to handle all RAID profiles in a more
meaningful way:

Return physical address along with logical address for each stripe.

For RAID1/Single/DUP (non-striped):
the result would be like:
Map block: Logical 128M, Len 10M, Type RAID1, Stripe len 0, Nr_stripes 2
Stripe 0: Logical 128M, Physical X, Len: 10M Dev dev1
Stripe 1: Logical 128M, Physical Y, Len: 10M Dev dev2

Result will be as long as possible, since it's not striped at all.

For RAID0/10 (striped without parity):
Result will be aligned to full stripe size:
Map block: Logical 64K, Len 128K, Type RAID10, Stripe len 64K, Nr_stripes 4
Stripe 0: Logical 64K, Physical X, Len 64K Dev dev1
Stripe 1: Logical 64K, Physical Y, Len 64K Dev dev2
Stripe 2: Logical 128K, Physical Z, Len 64K Dev dev3
Stripe 3: Logical 128K, Physical W, Len 64K Dev dev4

For RAID5/6 (striped with parity and dev-rotation):
Result will be aligned to full stripe size:
Map block: Logical 64K, Len 128K, Type RAID6, Stripe len 64K, Nr_stripes 4
Stripe 0: Logical 64K, Physical X, Len 64K Dev dev1
Stripe 1: Logical 128K, Physical Y, Len 64K Dev dev2
Stripe 2: Logical RAID5_P, Physical Z, Len 64K Dev dev3
Stripe 3: Logical RAID6_Q, Physical W, Len 64K Dev dev4

The new unified layout should be very flexible and can even handle things
like N-way RAID1 (which the old mirror_num based one can't handle well).

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
2017-09-12 14:33:59 +02:00
David Sterba cb1be701ce Btrfs progs v4.13
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:20:05 +02:00
David Sterba 3d341f5baa btrfs-progs: update CHANGES for v4.13
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Misono, Tomohiro a351dd8478 btrfs-progs: test: add new test for inspect-internal rootid
This new test checks inspect-internal rootid
 - handle path to subvolume/directory/file as an argument
 - get different id for each subvolume
 - get the expected id for each file/directory (i.e. the same as
	 containing subvolume)

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 6fb88e2859 btrfs-progs: tests: check for kernel support for reiserfs
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 7299e0d294 btrfs-progs: tests: enhance post-rollback fsck tests
The post-rollback helper still assumes just extN, we need an extra
argument that'll get passed to fsck. Change all callsites at once so the
tests do not fail temporarily.

Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Zhang Yu e96921bcaa Btrfs-progs: print-tree: check num_stripes in print_chunk
[TEST/fuzz] case: 004-simple-dump-tree

Since a wrong key (DATA_RELOC_TREE CHUNK_ITEM 0) is in the root tree,
print_chunk() is called in error, resulting in num_stripes == 0.

ERROR:
     [TEST/fuzz]   004-simple-dump-tree
ctree.h:317: btrfs_chunk_item_size: BUG_ON `num_stripes == 0`
        triggered, value 1

failed (ignored, ret=134): /myproject/btrfs-progs/btrfs
inspect-internal dump-tree
/myproject/btrfs-progs/tests/fuzz-tests/images/
bko-155201-wrong-chunk-item-in-root-tree.raw.restored

test failed for case 004-simple-dump-tree
Makefile:288: recipe for target 'test-fuzz' failed
make: *** [test-fuzz] Error 1

So, check num_stripes in print_chunk.

Signed-off-by: Zhang Yu <zhangyu-fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Misono, Tomohiro cb39164f9d btrfs-progs: test: fix name generation not to contain trailing spaces
The first patch causes test-convert to fail.  This is because
generate_dataset() creates a name containing trailing spaces for the
"slow_symlink" type, which causes a getfacl error in convert_test_perm().
(This was not noticed because the original run_check_stdout() throws away
the error.)

Fix this by using space as the delimiter for cut.

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Misono, Tomohiro 02e9bb9f23 btrfs-progs: test: fix run_check_stdout() call _fail()
run_check_stdout() uses "... | tee ... || _fail".  However, since tee
won't fail, _fail() is not called even if the first command fails.

Fix this by checking PIPESTATUS in the end.

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Qu Wenruo dbe96ecd3f btrfs-progs: tests: Add test case for mkfs --rootdir parameter
Add test case which checks if -r|--rootdir mkfs option can handle
symlink/char/block/fifo files.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Qu Wenruo 081e4e9bb8 btrfs-progs: mkfs: Fix wrong file type for dir items and indexes when specifying root directory
[Bug]
If mkfs.btrfs is used with the "-r" parameter and the specified directory
has fifo/socket/char/block special files, then the created filesystem
can't pass fsck:

------
checking fs roots
	unresolved ref dir 241158 index 3 namelen 9 name S.dirmngr filetype 0 errors 80, filetype mismatch
ERROR: errors found in fs roots
------

[Reason]
Btrfs dir items/indexes record the inode type, while "-r" only handles
directories, regular files and symlinks. It treats such special files as
regular files, which caused the problem.

[Fix]
Add missing types for add_directory_items(), so that the result of
"mkfs.btrfs -r" can pass fsck.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
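
The kind of mapping the fix above needs, from an inode's mode to the file type stored in dir items, can be sketched as below. The BTRFS_FT_* values follow the on-disk format constants; the function name and the table-driven shape are hypothetical and not taken from add_directory_items().

```python
import stat

# On-disk dir item file types (BTRFS_FT_*).
BTRFS_FT_UNKNOWN = 0
BTRFS_FT_REG_FILE = 1
BTRFS_FT_DIR = 2
BTRFS_FT_CHRDEV = 3
BTRFS_FT_BLKDEV = 4
BTRFS_FT_FIFO = 5
BTRFS_FT_SOCK = 6
BTRFS_FT_SYMLINK = 7

_MODE_TO_FT = {
    stat.S_IFREG: BTRFS_FT_REG_FILE,
    stat.S_IFDIR: BTRFS_FT_DIR,
    stat.S_IFCHR: BTRFS_FT_CHRDEV,
    stat.S_IFBLK: BTRFS_FT_BLKDEV,
    stat.S_IFIFO: BTRFS_FT_FIFO,
    stat.S_IFSOCK: BTRFS_FT_SOCK,
    stat.S_IFLNK: BTRFS_FT_SYMLINK,
}


def btrfs_filetype(mode):
    """Translate st_mode into the dir item file type, covering special files."""
    return _MODE_TO_FT.get(stat.S_IFMT(mode), BTRFS_FT_UNKNOWN)
```

Handling only the regular file, directory and symlink rows of such a table is what leaves fifo/socket/char/block entries with a type that fsck later flags as a mismatch.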
Gu Jinxiang 944e14da14 btrfs-progs: mkfs: Move the tree root creation to own function
make_btrfs is too long to understand, so move creation of the root tree
into its own function.

Some of the tree roots are now created in a loop, where the code is just
copypasted. We now make use of the reference_root_table to translate
block index to root objectid.

Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
[ updated changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba df4a04484a btrfs-progs: tests: missing device and slack space report
Verify that a missing device will not result in reporting a negative
value interpreted as 16EiB.

Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Patrik Lundquist 8c6b05d726 btrfs-progs: device usage: don't calculate slack on missing device
Print      Device slack:              0.00B
instead of Device slack:           16.00EiB

Signed-off-by: Patrik Lundquist <patrik.lundquist@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
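
The 16.00EiB figure is what an unsigned 64-bit subtraction produces when it underflows. The sketch below models that, assuming slack is computed as the physical device size minus the size the filesystem uses on it, and that a missing device reports a physical size of 0; the function and parameter names are hypothetical.

```python
U64_MASK = (1 << 64) - 1


def slack_unchecked(device_size, fs_size):
    """Model a u64 subtraction: wraps around when device_size < fs_size."""
    return (device_size - fs_size) & U64_MASK


def slack_checked(device_size, fs_size):
    """Skip the calculation for a missing device (assumed to report size 0)."""
    if device_size == 0:
        return 0
    return (device_size - fs_size) & U64_MASK
```

With a missing device and 1GiB of filesystem size, the unchecked version returns 2^64 - 2^30 bytes, which a pretty-printer rounds to 16.00EiB, while the checked version returns 0.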
Misono, Tomohiro 88ef0b8397 btrfs-progs: inspect rootid: Allow a file to be specified
Since cmd_inspect_rootid() calls btrfs_open_dir(), it rejects a file
given as an argument. But as the documentation says, a file should be
supported.

This patch introduces btrfs_open_file_or_dir(), which is a counterpart
of btrfs_open_dir(), to safely check and open a btrfs file or directory.
The original btrfs_open_dir() content is moved to btrfs_open() and shared
by both functions.

Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba f47587d83d btrfs-progs: tests: convert misc/011-delete-missing-device to loopdevs
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 83fe48c54b btrfs-progs: tests: convert misc/006-image-on-missing-device to loopdevs
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 528a5bf6ad btrfs-progs: tests: move loopdev helpers out of the testcase to common
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 3a6895b823 btrfs-progs: tests: cleanup loop device helpers
Make the loop device helpers a bit more generic before moving them to
the common helpers.

Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 36db1080c3 btrfs-progs: print-tree: factor out extent_csum dump
Factor out code to own helper and tweak the format so it matches the
rest.

Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Josef Bacik 5fbc00cc73 btrfs-progs: print the csum length in debug-tree
While looking at a log of a corrupted fs I needed to verify we were
missing csums for a given range.  Make this easier by printing out the
range of bytes a csum item covers.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
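
The byte range covered by a csum item follows from its size: each checksum covers one sector, starting at the logical address in the item key. A minimal sketch, assuming 4-byte crc32c checksums and a 4096-byte sector size as defaults; the function name is hypothetical.

```python
def csum_item_range(key_offset, item_size, csum_size=4, sectorsize=4096):
    """Return (start, end) of the bytes covered by one csum item.

    key_offset: logical address of the first covered sector (the key offset).
    item_size:  size of the item payload in bytes.
    """
    nr_csums = item_size // csum_size
    return key_offset, key_offset + nr_csums * sectorsize
```

For instance, a 16-byte item at logical 1MiB holds four checksums and covers 16KiB, so a gap between the end of one item and the start of the next shows which range has no csums.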
David Sterba 872837ebbf btrfs-progs: tests: add testcase for 'fi du' and empty subvol
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Goffredo Baroncelli b5665d66e8 btrfs-progs: fi du: don't call lookup_path_rootid for BTRFS_EMPTY_SUBVOL_DIR_OBJECTID
When ino is BTRFS_EMPTY_SUBVOL_DIR_OBJECTID, the item does not refer to
any file tree, so lookup_path_rootid() doesn't return any meaningful
value.

As was reported, this can be triggered by

$ btrfs sub create test1
$ btrfs sub create test1/test2
$ btrfs sub snap test1 test1.snap
$ btrfs fi du -s test1
  Total   Exclusive  Set shared  Filename
  0.00B       0.00B       0.00B  test1
$ btrfs fi du -s test1.snap
  Total   Exclusive  Set shared  Filename
ERROR: cannot check space of 'test1.snap': Inappropriate ioctl for device

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
Goffredo Baroncelli 02d04d8b23 btrfs-progs: reset the ret value when ignoring an error from du_add_file
In du_walk_dir(), an error returned by du_add_file() is usually
ignored. However, if the error is returned while querying the last item,
it is returned to the caller.

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
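
The pattern behind this fix can be sketched in a few lines: a loop that ignores per-item errors but returns the last value of ret leaks the error when the failing item happens to be the last one. This is a generic model, not the du_walk_dir() code; which errors are actually ignored there is not modelled, and the function names are hypothetical.

```python
def walk_stale_ret(entries, add_file):
    """Ignore per-entry errors, but return whatever ret held last."""
    ret = 0
    for entry in entries:
        ret = add_file(entry)
        if ret < 0:
            continue  # error ignored, yet ret still holds the error code
    return ret


def walk_reset_ret(entries, add_file):
    """Same loop, with ret reset when an error is deliberately ignored."""
    ret = 0
    for entry in entries:
        ret = add_file(entry)
        if ret < 0:
            ret = 0  # reset so an ignored error never reaches the caller
            continue
    return ret
```

With entries ["ok", "bad"] the first version returns the error from "bad", while with ["bad", "ok"] it returns 0, which is why the problem only showed up for the last item.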
David Sterba a36d92cb8b btrfs-progs: tests: add test for check --force
Basic test of the --force functionality, on an empty filesystem.

Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 162fdf9538 btrfs-progs: check: add option to skip mount checks
Sometimes it's needed to do a check on a mounted filesystem. This should
work fine on a quiescent filesystem or a read-only mount. Changes on the
block device done by kernel might confuse the userspace checker and it
might crash when it reads some stale data.

Repair without mount checks is not supported right now.

Signed-off-by: David Sterba <dsterba@suse.cz>
2017-09-08 16:15:05 +02:00
David Sterba 8609c8bad6 btrfs-progs: print-tree: factor out temporary_item dump
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba a4b65f00d5 btrfs-progs: print-tree: factor out persistent_item dump
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 61a578751d btrfs-progs: print-tree: factor out qgroup_limit dump
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 90631b721e btrfs-progs: print-tree: factor out qgroup_info dump
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 346d2e16dd btrfs-progs: print-tree: factor out qgroup_status dump
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba 59714a77c3 btrfs-progs: print-tree: factor out dev_extent dump
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba c3b767a208 btrfs-progs: print-tree: factor out free_space_info dump
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba b3122697f6 btrfs-progs: print-tree: factor out shared_data_ref dump
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00
David Sterba c23c1271d3 btrfs-progs: print-tree: factor out extent_data_ref dump
Signed-off-by: David Sterba <dsterba@suse.com>
2017-09-08 16:15:05 +02:00