Commit graph

79 commits

Author SHA1 Message Date
Anand Jain d46a0ef6a0 btrfs-progs: rename struct open_ctree_flags to open_ctree_args
The struct open_ctree_flags currently holds arguments for
open_ctree_fs_info(), it can be confusing when mixed with a local variable
named open_ctree_flags as below in the function cmd_inspect_dump_tree().

  cmd_inspect_dump_tree()
  ::
  struct open_ctree_flags ocf = { 0 };
  ::
  unsigned open_ctree_flags;

So rename struct open_ctree_flags to struct open_ctree_args.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-26 15:00:47 +02:00
David Sterba 22cf63d8ee btrfs-progs: kernel-shared: add helper write_extent_buffer_chunk_tree_uuid
Sync the helper write_extent_buffer_chunk_tree_uuid from kernel.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-27 23:40:56 +02:00
David Sterba 339de9b2d7 btrfs-progs: kernel-shared: use write_extent_buffer_fsid where possible
We already have the helper but don't use it everywhere.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-27 16:29:58 +02:00
Qu Wenruo 33a21e7578 btrfs-progs: tune: rework the main idea of csum change
The existing attempt for changing csum types is as the following:

- Create a new temporary csum root
- Generate new data csums into the temporary csum root
- Drop the old csum tree and make the temporary one as csum root
- Change the checksums for metadata in-place

Unfortunately after some experiments, the csum root switch method has a
big pitfall, the backref items in extent tree.

Those backref items still point back to the old tree, meaning without a
lot of extra tricks, the extent tree would be corrupted.

Thus we have to go a new single tree variant:

- Generate new data csums into the csum root
  The new data csums would have a different objectid to distinguish
  them.
- Drop the old data csum items
- Change the key objectids of the new csums
- Change the checksums for metadata in-place

This means unfortunately we have to revert most of the old code, and
update the temporary item format.

The new temporary item would only record the target csum type.
At every stage we have a method to determine the progress, thus no need
for an item, but in the future it's still open for change.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:32 +02:00
David Sterba 8f33e591e9 btrfs-progs: partial sync of ctree.c from kernel
Sync checksum helpers, extent buffer helpers with kernel 6.4-rc1.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:31 +02:00
Qu Wenruo 46364d3766 btrfs-progs: replace write_and_map_eb() by write_data_to_disk()
The function write_and_map_eb() is quite abused as a way to write any
generic buffer back to disk.

But we have a more suitable function already, write_data_to_disk().

This patch would remove the abused write_data_to_disk() calls, and
convert the only three valid call sites to write_data_to_disk() instead.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:31 +02:00
Josef Bacik e150c843ea btrfs-progs: sync tree-checker.[ch] from kernel
This syncs tree-checker.c from the kernel.  The main modification was to
add a open ctree flag to skip the deeper leaf checks, and plumbing this
through tree-checker.c.  We need this for things like fsck or
btrfs-image that need to work with slightly corrupted file systems, and
these checks simply make us unable to look at the corrupted blocks.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:30 +02:00
Josef Bacik 361e4bea13 btrfs-progs: rename btrfs_check_* to __btrfs_check_*
These helpers are called __btrfs_check_* in the kernel as they return
the special enum to indicate what part of the leaf/node failed.  Rename
the uses in btrfs-progs to match the kernel naming convention to make it
easier to sync that code.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:30 +02:00
Josef Bacik a474379935 btrfs-progs: add a btrfs_read_extent_buffer helper
This exists in the kernel to do the read on an extent buffer we may have
already looked up and initialized.  Simply create this helper by
extracting out the existing code from read_tree_block and make
read_tree_block call this helper.  This gives us the helper we need to
sync ctree.c into btrfs-progs, and keeps the code the same in
btrfs-progs.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:30 +02:00
Josef Bacik e176d5eab6 btrfs-progs: add an atomic parameter to btrfs_buffer_uptodate
We have this extra parameter in the kernel to indicate if we are atomic
and thus can't lock the io_tree when checking the transid for an extent
buffer.  This isn't necessary in btrfs-progs, but to allow for easier
syncing of ctree.c add this argument to our copy of btrfs_buffer_uptodate.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:30 +02:00
Josef Bacik 5d84ee58e9 btrfs-progs: update arguments of find_extent_buffer
In the kernel we only take a bytenr for this as the extent buffer cache
is indexed on bytenr.  Since we're passing in the btrfs_fs_info we can
simply use the ->nodesize for the blocksize, and drop the blocksize
argument completely.  This brings us into parity with the kernel, which
will allow the syncing of ctree.c.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:30 +02:00
Josef Bacik b3477244f9 btrfs-progs: update read_tree_block to match the kernel definition
The in-kernel version of read_tree_block adds some extra sanity checks
to make sure we don't return blocks that don't match what we expect.
This includes the owning root, the level, and the expected first key.
We don't actually do these checks in btrfs-progs, however kernel code
we're going to sync will expect this calling convention, so update it to
match the in-kernel code and then update all the callers.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:30 +02:00
Josef Bacik 6da52f41c7 btrfs-progs: pass root_id for btrfs_free_tree_block
In the kernel we pass in the root_id for btrfs_free_tree_block instead
of the root itself.  Update the btrfs-progs version of the helper to
match what we do in the kernel.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:30 +02:00
Josef Bacik 656938b665 btrfs-progs: rename clear_extent_buffer_dirty to btrfs_clear_buffer_dirty
This is a mirror of the change I've done in the kernel, but in progs
it's even more simply because clean_tree_block was just a wrapper around
clear_extent_buffer_dirty.  Change this to btrfs_clear_buffer_dirty, and
then update all the callers to use this helper instead of
clean_tree_block and clear_extent_buffer_dirty.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:29 +02:00
Josef Bacik 5780714b58 btrfs-progs: add btrfs_locking_nest to btrfs_alloc_tree_block
This is how btrfs_alloc_tree_block is defined in the kernel, so when we
go to sync this code in it'll be easier if we're already setup to accept
this argument.  Since we're in progs we don't care about nesting so just
use BTRFS_NORMAL_NESTING everywhere, as we sync in the kernel code it'll
get updated to whatever is appropriate.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:29 +02:00
Josef Bacik 006baaecdd btrfs-progs: rename btrfs_alloc_free_block to btrfs_alloc_tree_block
This is in keeping with what the function actually does, and is named
this way in the kernel.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:29 +02:00
Josef Bacik 8e427ada49 btrfs-progs: copy btrfs_root::state from kernel
We changed from members in the root for all the different flags to a
bit based flag system.  In order to make syncing the kernel code into
btrfs-progs easier go ahead and sync in the bits we use and update all
the users of the old ->track_dirty and ->ref_cows to use the state bits.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:29 +02:00
Josef Bacik 4a9a8f2a8a btrfs-progs: sync extent-io-tree.[ch] and misc.h from the kernel
This is a bit larger than the previous syncs, because we use
extent_io_tree's everywhere.  There's a lot of stuff added to
kerncompat.h, and then I went through and cleaned up all the API
changes, which were

- extent_io_tree_init takes an fs_info and an owner now.
- extent_io_tree_cleanup is now extent_io_tree_release.
- set_extent_dirty takes a gfpmask.
- clear_extent_dirty takes a cached_state.
- find_first_extent_bit takes a cached_state.

The diffstat looks insane for this, but keep in mind extent-io-tree.c
and extent-io-tree.h are ~2000 loc just by themselves.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:29 +02:00
Josef Bacik b0a4eab561 btrfs-progs: remove parent_key arg from btrfs_check_* helpers
Now that this is unused by these helpers and only used by the repair
related code we can remove this argument from the main helpers.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:28 +02:00
Josef Bacik d36571ed9a btrfs-progs: remove fs_info argument from btrfs_check_* helpers
This can be pulled out of the extent buffer that is passed in, drop the
fs_info argument from the function.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:28 +02:00
Qu Wenruo 68a04bc710 btrfs-progs: tune: add new option to convert back to extent tree
With previous btrfstune support to convert to block-group-tree, it has
implemented most of the infrastructure for bi-directional conversion.

This patch will implement the remaining conversion support to go back to
extent tree.

The modification includes:

- New convert_to_extent_tree() function in btrfstune.c
  It's almost the same as convert_to_bg_tree(), but with small changes:
  * No need to set extra features like NO_HOLES/FST.
  * Need to delete the block group tree when everything finished.

- Update btrfs_delete_and_free_root() to handle non-global roots
  Currently the function can only accepts global roots (extent/csum/free
  space trees)

  If we pass a non-global root into the function, we will screw up
  global_roots_tree and crash.

  Since we're going to use btrfs_delete_and_free_root() to free block
  group tree which is not a global tree, this is needed.

- New handling for half converted fs in get_last_converted_bg()
  There are two cases need to be handled:

  * The bg tree is already empty
    We need to grab the first bg in extent tree.
    Or at conversion function we will fail at grabbing the first bg.

  * The bg tree is not empty
    Then we need to grab the last bg in extent tree.

- Extra root switching in involved functions. This involves:

  * read_converting_block_groups()
  * insert_block_group_item()
  * update_block_group_item()

  We just need to update our target root according to the current
  compat_ro and super flags.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-19 01:10:24 +02:00
David Sterba 9be33f558c btrfs-progs: tune: update checksum conversion
The checksum conversion is still experimental and still does not convert
all filesystems correctly. Do not use on valuable data.

Previous implementation copied the UUID conversion which was not a good
base for the checksum conversion so it left out basically all trees
except extent and checksum.

This update adds the base for the required safety features:

- let the old csum tree intact until the full conversion is done (ie.
  all data are still verifiable)
- add on-disk status tracking item, this should keep the from/to
  checksum conversion, last generation to catch potential updates of the
  underlying filesystem if conversion is interrupted and the filesystem
  mounted
- convert most of the fundamental trees, the subvolumes, tree log and
  relocation trees are not converted
- trees are converted in-place to avoid potentially running out of space
  but this might be better done by transaction protection with a
  temporary tree

Known issues:

- not all trees are converted
- not all checksums are correctly inserted into the new tree and reading
  the files leads to EIO

Issue: #438
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-28 20:11:22 +01:00
Josef Bacik fac1fae3ef btrfs-progs: rename extent buffer flags to EXTENT_BUFFER_*
We have been overloading the extent_state flags for use on the extent
buffers as well.  When we sync extent-io-tree.[ch] this will become
impossible, so rename these flags to EXTENT_BUFFER_* and use those
definitions instead of the extent_state definitions.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-28 18:57:44 +01:00
Josef Bacik 20d88c17e7 btrfs-progs: move extent cache code directly into btrfs_fs_info
We have some extra features in the btrfs-progs copy of the
extent_io_tree that don't exist in the kernel.  In order to make syncing
easier simply move this functionality into btrfs_fs_info, that way we
can sync in the new extent_io_tree code and not have to worry about
breaking anything.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-28 18:57:44 +01:00
Josef Bacik 412eea9e97 btrfs-progs: do not pass io_tree into verify_parent_transid
We do not use the io_tree, don't bother passing it into
verify_parent_transid.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-28 18:57:44 +01:00
Josef Bacik ccee633f3a btrfs-progs: move dirty eb tracking to it's own io_tree
btrfs-progs has a cache tree embedded in the extent_io_tree in order to
track extent buffers.  We use the extent_io_tree part to track dirty,
and the cache tree to keep the extent buffers in.  When we sync
extent-io-tree.[ch] we'll lose this ability, so separate out the dirty
tracking into its own extent_io_tree.  Subsequent patches will adjust
the extent buffer lookup so it doesn't use the custom extent_io_tree
thing.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-28 18:57:43 +01:00
Josef Bacik af30cf2e3e btrfs-progs: make the find extent buffer helpers take fs_info
This is a cleanup patch to make syncing the btrfs kernel code into
btrfs-progs easier.  In btrfs-progs we have an extra cache in the
extent_io_tree that's exclusively used for the extent buffer tracking.
In order to untangle this dependency start passing around the fs_info to
search for extent_buffers, and then have the helpers use the appropriate
structure to find the extent buffer.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-28 18:57:43 +01:00
David Sterba c2be0e2ce0 btrfs-progs: use template for out of memory error messages
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:09 +02:00
David Sterba feef6aaaf6 btrfs-progs: kernel-lib: remove radix-tree
The radix-tree is not used in userspace code. In kernel it's for
tracking unpersisted and in-memory structures and has been replaced by
the xarray.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:08:07 +02:00
Qu Wenruo d8f3355734 btrfs-progs: unexport csum_tree_block()
The function csum_tree_block() is not really utilized by anyone, all
current callers just use csum_tree_block_size().

Furthermore there is a stale definition in common/utils.h which is using
the old "struct btrfs_root" as the first argument, while we have already
migrated to "struct btrfs_fs_info".

So just unexport csum_tree_block() and remove the stale definition.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-11 09:06:11 +02:00
Qu Wenruo 2f2f6bfe17 btrfs-progs: btrfstune: add the ability to convert to block group tree feature
The new '-b' option will be responsible for converting to block group
tree compat ro feature.

The workflow looks like this for new convert:

- Setting CHANGING_BG_TREE flag
  And initialize fs_info->last_converted_bg_bytenr value to (u64)-1.

  Any bg with bytenr >= last_converted_bg_bytenr will have its bg item
  update go to the new root (bg tree).

- Iterate each block group by their bytenr in descending order
  This involves:
  * Delete the old bg item from the old tree (extent tree)
  * Update last_converted_bg_bytenr to the bytenr of the bg
  * Add the new bg item into the new tree (bg tree)
  * If we have converted a bunch of bgs, commit current transaction

- Clear CHANGING_BG_TREE flag
  And set the new BLOCK_GROUP_TREE compat ro flag and commit.

And since we're doing the convert in multiple transactions, we also need
to resume from last interrupted convert.

In that case, we just grab the last unconverted bg, and start from it.

And to co-operate with the new kernel requirement for both no-holes and
free-space-tree features, the convert tool will check for
free-space-tree feature. If not enabled, will error out with an error
message to how to continue (by mounting with "-o space_cache=v2").

For missing no-holes feature, we just need to set the flag during
convert.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 18:25:32 +02:00
Qu Wenruo 1430b41427 btrfs-progs: separate block group tree from extent tree v2
Block group tree feature is completely a standalone feature, and it has
been over 5 years before the initial introduction to solve the long
mount time.

I don't really want to waste another 5 years waiting for a feature which
may or may not work, but definitely not properly reviewed for its
preparation patches.

So this patch will separate the block group tree feature into a
standalone compat RO feature.

There is a catch, in mkfs create_block_group_tree(), current
tree-checker only accepts block group item with valid chunk_objectid,
but the existing code from extent-tree-v2 didn't properly initialize it.

This patch will also fix above mentioned problem so kernel can mount it
correctly.

Now mkfs/fsck should be able to handle the fs with block group tree.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 18:25:32 +02:00
Qu Wenruo c5a21a7814 btrfs-progs: don't save block group root into super block
The extent tree v2 (thankfully not yet fully materialized) needs a
new root for storing all block group items.

My initial proposal years ago just added a new tree rootid, and load it
from tree root, just like what we did for quota/free space tree/uuid/extent
roots.

But the extent tree v2 patches introduced a completely new (and to me,
wasteful) way to store block group tree root into super block.

Currently there are only 3 trees stored in super blocks, and they all
have their valid reasons:

- Chunk root
  Needed for bootstrap.

- Tree root
  Really the entrance of all trees.

- Log root
  This is special as log root has to be updated out of existing
  transaction mechanism.

There is not even any reason to put block group root into super blocks,
the block group tree is updated at the same timing as old extent tree,
no need for extra bootstrap/out-of-transaction update.

So just move block group root from super block into tree root.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-12 15:31:27 +02:00
Qu Wenruo 75fea7496c btrfs-progs: use write_data_to_disk() to handle RAID56 in write_and_map_eb()
Function write_data_to_disk() can handle RAID56 writes without any
problem.

So just call write_data_to_disk() inside write_and_map_eb() instead of
manually doing the RAID56 write.

Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-08-16 15:18:12 +02:00
Qu Wenruo fc6925bfd3 btrfs-progs: avoid repeated data write for metadata
[BUG]
Shinichiro reported that "mkfs.btrfs -m DUP" is doing repeated write
into the device.
For non-zoned device this is not a big deal, but for zoned device this
is critical, as zoned device doesn't support overwrite at all.

[CAUSE]
The problem is related to write_and_map_eb() call, since commit
2a93728391 ("btrfs-progs: use write_data_to_disk() to replace
write_extent_to_disk()"), we call write_data_to_disk() for metadata
write back.

But the problem is, write_data_to_disk() will call btrfs_map_block()
with rw = WRITE.

By that btrfs_map_block() will always return all stripes, while in
write_data_to_disk() we also iterate through each mirror of the range.

This results above repeated writeback.

[FIX]
Fix this problem by completely remove @mirror argument
from write_data_to_disk().
With extra comments to explicitly show that function will write to
all mirrors.

Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 2a93728391 ("btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()")
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-08-16 15:18:12 +02:00
Qu Wenruo 3ff9d35257 btrfs-progs: use read_data_from_disk() to replace read_extent_from_disk() and replace read_extent_data()
The function read_extent_from_disk() is only a wrapper to read tree
block.

And read_extent_data() is just a while loop to eliminate short read
caused by stripe boundary.

In fact, a lot of call sites of read_extent_data() are either reading
metadata (thus no possible short read) or doing extra loop by
themselves.

This patch will replace those two functions with read_data_from_disk(),
making it the only entrance for data/metadata read.
And update read_data_from_disk() to return the read bytes, so caller can
do a simple while loop.

For the few callers of read_extent_data(), open-code a small while loop
for them.

This will allow later RAID56 read repair using P/Q much easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:08:30 +02:00
Qu Wenruo 2a93728391 btrfs-progs: use write_data_to_disk() to replace write_extent_to_disk()
Function write_extent_to_disk() is just writing the content of a tree
block to disk.

It can not handle RAID56, and its work is the same as
write_data_to_disk().

Thus we can replace write_extent_to_disk() with write_data_to_disk()
easily.

There is only one special call site in write_raid56_with_parity(), which
can easily be replace with btrfs_pwrite() directly.

This reduce the write entrance, and make later eb::fd removal easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:08:29 +02:00
Qu Wenruo 01c25d73f1 btrfs-progs: extract metadata restore read code into its own helper
For metadata restore, our logical address is mapped to a single device
with logical address 1:1 mapped to device physical address.

Move this part of code into a helper, this will make later extent buffer
read path refactoer much easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:07:09 +02:00
Qu Wenruo 7a0c4b5dc1 btrfs-progs: remove the unnecessary BTRFS_SUPER_INFO_OFFSET path for tree block read
We used to use read_whole_eb() to read super block, but it's no longer
the case (so long that I can not even find out which commit did the
conversion).

Thus there is no need for read_whole_eb() to handle super block read
anymore.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-04-25 19:07:08 +02:00
Josef Bacik 02fb308bdc btrfs-progs: make btrfs_create_tree take a key for the root key
We're going to start create global roots from mkfs, and we need to have
a offset set for the root key.  Make the btrfs_create_tree() take a key
for the root_key instead of just the objectid so we can setup these new
style roots properly.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:22 +01:00
Josef Bacik c4164edeb5 btrfs-progs: add a btrfs_delete_and_free_root helper
The free space tree code already does this, but we need it for cleaning
up per block group roots.  Abstract this code out into a helper so that
we can use it in multiple places in the future.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:19 +01:00
Josef Bacik e33738306c btrfs-progs: handle the per-block group global root id
We will now be using block_group->chunk_objectid to point at the global
root id for this particular block group.  For now we'll assign this
based on mod'ing the offset of the block group against the number of
global root id's and handle the block_group_item updating appropriately.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:17 +01:00
Josef Bacik 27eaa3b514 btrfs-progs: set the number of global roots in the super block
In order to make sure the file system is consistent we need to record
the number of global roots we should have in the super block.  We could
infer this from the number of global roots we find, however this could
lead to interesting fuzzing problems, so add a source of truth to the
super block in order to make it easier to verify the file system is
consistent.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:07:14 +01:00
Josef Bacik 9ee6cc78a8 btrfs-progs: add support for loading the block group root
This adds the ability to load the block group root, as well as make sure
the various backup super block and super block updates are made
appropriately.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 18:06:51 +01:00
Josef Bacik 6cf2da1f6f btrfs-progs: store BTRFS_LEAF_DATA_SIZE in the fs_info
This is going to be a different value based on the incompat settings of
the file system, just store this in the fs_info instead of calculating
it every time.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-09 15:13:12 +01:00
Josef Bacik ee21714e21 btrfs-progs: don't check skip_csum_check if there's no fs_info
The chunk recover code passes in a buffer it allocates with metadata but
no fs_info, causing fuzz-test 008 to segfault.  Fix this test to only
check the flag if we have buf->fs_info set.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-08 18:18:01 +01:00
Josef Bacik 401690ff68 btrfs-progs: properly populate missing trees
With my global roots prep patches I regressed us on handling the case
where we didn't find a root at all.  In this case we need to return an
error (prior we returned -ENOENT) or we need to populate a dummy tree if
we have OPEN_CTREE_PARTIAL set.  This fixes a segfault of fuzz test 006.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-08 18:18:01 +01:00
Josef Bacik 29beeb1a30 btrfs-progs: do not try to load the free space tree if it's not enabled
We were previously getting away with this because the
load_global_roots() treated ENOENT like everything was a-ok.  However
that was a bug and fixing that bug uncovered a problem where we were
unconditionally trying to load the free space tree.  Fix that by
skipping the load if we do not have the compat bit set.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-08 18:18:01 +01:00
David Sterba 1bb6fb896d btrfs-progs: btrfstune: experimental, new option to switch csums
This is still work in progress but can survive some stress testing.
There are still some sanity checks missing, do not user this on valuable
data. To enables this, configure must be run with the experimental
features enabled.

  $ mkfs.btrfs --csum crc32c /dev/sdx
  $ <mount, fill with data, unmount>

  $ btrfstune --csum sha256

Will change the checksum to sha256.

Implementation:

- set bit on superblock when the checksums are being changed (similar to
  the uuid rewrite)
- metadata checksums are overwritten in place
- data checksums:
  - the checksum tree is completely deleted and no checksums are
    verified
  - data blocks are enumerated and all checksums generated (same as
    check --init-csum-tree)

To make it usable, it should be restartable and track the current
progress somehow. Also the previous data checksums should be verified
any time they're available.

Signed-off-by: David Sterba <dsterba@suse.com>
2022-03-08 18:10:03 +01:00
Josef Bacik 532bf58b5b btrfs-progs: sanity check global roots key.offset
For !extent tree v2 we should validate the key.offset == 0, and for
extent tree v2 we should validate that key.offset < nr_global_roots.  If
this fails we need to fail to load the global root so that the
appropriate action is taken.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-02-16 22:48:01 +01:00