Commit graph

33 commits

Author SHA1 Message Date
Anand Jain e6a466d44d btrfs-progs: tune: fix btrfstune --help for -m -M option
The -m | -M option for btrfstune, sounds like metadata_uuid is being
changed which is wrong. The fsid is being changed the original fsid is
being copied into the metadata_uuid. So update the help text.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-05-17 17:36:59 +02:00
Qu Wenruo 738bb7f0f0 btrfs-progs: tune: properly open zoned devices for RW
[BUG]
There is a report that, for zoned devices btrfstune is unable to convert
it to block group tree.

 # btrfstune /dev/nullb0 --convert-to-block-group-tree
 Error reading 1342193664, -1
 Error reading 1342193664, -1
 ERROR: cannot read chunk root
 ERROR: open ctree failed

[CAUSE]
For read-write opened zoned devices, all the read/write has to be
aligned to its sector size.

However btrfs stores its metadata by extent_buffer::data[], which has
all the structures before it, thus never aligned to zoned device sector
size.

Normally we would require btrfs_pread() and btrfs_pwrite() to do the
extra alignment, but during open_ctree(), we are not aware if a device
is zoned or not.

Thus we rely on if the fd is opened with O_DIRECT flag, if the fd has
O_DIRECT, then we would temporarily set fs_info->zoned for chunk tree
read.

Unforunately not all open_ctree_fd() callers have the flags set
properly, and btrfstune is one of the missing call site.

This makes all the read not properly aligned and cause read failure.

[FIX]
Just manually check if the target device is a zoned one, and set
O_DIRECT accordingly.

Issue: #765
Pull-request: #767
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-04-18 19:16:00 +02:00
Qu Wenruo ace7d241cc btrfs-progs: tune: fix the missing close() of filesystem fd
[BUG]
In "btrfs tune" subcommand, we're using open_ctree_fd(), which
requires passing a valid fd, and the caller is responsible to properly
handling the lifespan of the fd.

But we just call close_ctree() and btrfs_close_all_devices(), without
closing the fd.

[FIX]
Just manually close the fd.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-02-08 08:30:37 +01:00
David Sterba d71ece203a btrfs-progs: tune: use long option to enable simple quota
The initial patch used -q for enabling simple quota but this is not
right, -q is reserved for --quiet and we want the long options first.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-21 15:51:07 +02:00
Qu Wenruo 146cca7e16 btrfs-progs: move clear-cache.[ch] from check/ to common/ directory
The clear-cache functionality is shared by several commands:

- btrfs check
  For --clear-cache and --clear-ino-cache.

- btrfstune
  Mostly for block-group-tree feature conversion.

- btrfs-convert
  To enable the now default v2 space cache.

Thus it's no longer proper to keep clear-cache.[ch] under check/
directory, move them to common/ directory.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-13 18:13:12 +02:00
David Sterba 21aa6777b2 btrfs-progs: clean up includes, using include-what-you-use
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:57 +02:00
Anand Jain dd718e9f6f btrfs-progs: tune: use the latest bdev in fs_devices for super_copy
Currently, btrfstune relies on the superblock of the device specified
in the btrfstune argument for fs_info::super_copy. However, it should
use fs_devices::latest_bdev, as it points to the device with the highest
fs_devices::generation number. This will contain the superblock updates
that other devices may have missed and we can now support reuniting
devices following failures of btrfstune -m|M|u|U as in the patches:

   btrfs-progs: add support to fix superblock with CHANGING_FSID_V2 flag
   btrfs-progs: recover from the failed btrfstune -m|M

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:55 +02:00
Boris Burkov 4a0e715d78 btrfs-progs: tune: add support for squota
Add the ability to enable simple quotas on an existing file system at
rest with btrfstune.

This is similar to the functionality in mkfs, except it must also find
all the roots for which it must create qgroups. Note that this *does
not* retroactively compute usage for existing extents as that is
impossible for data. This is consistent with the behavior of the live
enable ioctl.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-03 01:11:55 +02:00
Qu Wenruo 7a0da24f52 btrfs-progs: tune: do not allow multiple incompatible features to be set
[PROBLEM]
Btrfstune allows multiple different options to be executed in one go,
some options are completely fine, like no-holes along with extref, but
with more and more options, we need more exclusive checks.

In fact a lot of new options are already not following the old
success/total checks.

[ENHANCEMENT]
There is really no need to allow multiple features to be set in one go.

So this patch introduces an array which groups all the compatible
options into following categories:

- Extent tree
  This includes converting to/from extent and block group tree.

- Space cache
  This includes converting to v2 space cache.

- Metadata UUID
  This includes changing metadata uuid.

- FSID change
  This includes the slower full fs fsid rewrites.

- Csum change
  This includes the csum rewrites.

- Seed devices
  This includes changing the device seed flag.

- Legacy options
  This includes no-holes/extref/skinny-metadata features, which are
  already default mkfs features.

Now we only allow options inside the same group to be specified.
E.g. "btrfstune -r -S 1" would fail as it includes both legacy and seed
groups.
Meanwhile "btrfstune -r -n" would still be allowed.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-09-01 13:58:46 +02:00
Anand Jain 3aa3342959 btrfs-progs: tune: cache local variable of fs_info
Since the root pointer dereferences for the fs_info several times,
it is rational to save the fs_info.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-28 17:24:56 +02:00
Anand Jain 59fe1f46fe btrfs-progs: btrfstune: consolidate error handling in main()
The upcoming "--device" option requires memory to parse devices, which
should be freed before returning from the main() function. As a
preparation for adding the "--device" option to the "btrfstune" command,
provide a consolidated error return exit from the main function with a
"goto" labeled "free_out". The label "free_out" may not make sense
currently, but it will when the "--device" option is added.

There are several return statements within the main function, and
changing all of them in the main "--device" feature patch would deviate
from the actual for the feature changes. Hence, it made sense to create
a preparatory patch.

The return value for any failure remains the same as in the original code.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-28 17:24:23 +02:00
Anand Jain 9ce233965f btrfs-progs: track active metadata_uuid per fs_devices
When the fsid does not match the metadata_uuid, the METADATA_UUID flag
is set in the superblock.

Changing the fsid using the btrfstune -U|-u option is not possible on a
filesystem with the METADATA_UUID flag set. But we are checking the
METADATA_UUID only from the super_copy, and not from the other scanned
device.

To fix this bug, track the metadata_uuid at the fs_devices level instead
of checking it only on the specified device in the argument, and use it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-28 17:24:22 +02:00
Anand Jain 19c337ad9a btrfs-progs: btrfstune: check for missing devices
If btrfstune is executed on a filesystem that contains a missing device,
the command will now fail.

It is ok to fail when any of the options supported by btrfstune are
used, the filesystem devices should be all available.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-23 19:36:31 +02:00
Anand Jain 485fceec8d btrfs-progs: register device after successfully changing the fsid
Testing with the fstests config option POST_MKFS_CMD="btrfstune -m"
reported failure, as shown below:

  ./check btrfs/003

  [111.635618] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 1 transid 6 /dev/sdb2 scanned by systemd-udevd (1117)
  [111.642199] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 2 transid 6 /dev/sdb3 scanned by systemd-udevd (1114)
  [111.660882] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 3 transid 6 /dev/sdb5 scanned by systemd-udevd (1116)
  [111.672623] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 4 transid 6 /dev/sdb6 scanned by systemd-udevd (993)
  [111.701301] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 6 transid 6 /dev/sdb8 scanned by systemd-udevd (1080)
  [111.706513] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 5 transid 6 /dev/sdb7 scanned by systemd-udevd (1117)
  [111.716532] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 7 transid 6 /dev/sdb9 scanned by systemd-udevd (1114)
  [111.721253] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 8 transid 6 /dev/sdb10 scanned by mkfs.btrfs (1504)
  [112.405186] BTRFS: device fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e devid 4 transid 8 /dev/sdb6 scanned by systemd-udevd (1117)
  [112.422104] BTRFS: device fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e devid 6 transid 8 /dev/sdb8 scanned by systemd-udevd (1523)
  [112.448355] BTRFS: device fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e devid 1 transid 8 /dev/sdb2 scanned by systemd-udevd (1115)
  [112.456126] BTRFS error: device /dev/sdb3 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
  [112.461299] BTRFS error: device /dev/sdb7 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
  [112.465690] BTRFS info (device sdb2): using crc32c (crc32c-generic) checksum algorithm
  [112.468758] BTRFS info (device sdb2): using free space tree
  [112.471318] BTRFS error: device /dev/sdb9 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
  [112.475962] BTRFS error: device /dev/sdb10 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
  [112.481934] BTRFS error: device /dev/sdb5 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
  [112.494614] BTRFS error (device sdb2): devid 2 uuid 99a57db7-2ef6-4240-a700-07ee7e64fb36 is missing
  [112.497834] BTRFS error (device sdb2): failed to read chunk tree: -2
  [112.507705] BTRFS error (device sdb2): open_ctree failed

The original fsid created by mkfs was a6599a65-8b6d-4156-bb55-0a3a2f0eae9d,
and the fsid created by the btrfstune -m option was
1b3bacbf-14db-49c9-a3ef-547998aacc4e.

During mount (after btrfstune -m), only 3 out of 8 devices were scanned
by systemd, while the rest were still being discovered. Consequently, the
mount command raced to find the missing devices. Since the mount command
in the kernel sets the flag fsdevices::opened, any further new alloc_device()
were blocked, resulting in the error "the fs is already mounted."

It is a good idea to register all devices after changing the fsid.
The previous registrations are already stale after changing the fsid.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-11 19:59:17 +02:00
Anand Jain 77f366c9da btrfs-progs: add noscan parameter to check_where_mounted
The function check_where_mounted() scans the system for all other btrfs
devices, which is necessary for its operation.  However, in certain
cases, devices remaining in the scanned state is undesirable.  Introduce
the 'noscan' argument to make devices unscanned before return.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-07-26 15:00:48 +02:00
Qu Wenruo 33a21e7578 btrfs-progs: tune: rework the main idea of csum change
The existing attempt for changing csum types is as the following:

- Create a new temporary csum root
- Generate new data csums into the temporary csum root
- Drop the old csum tree and make the temporary one as csum root
- Change the checksums for metadata in-place

Unfortunately after some experiments, the csum root switch method has a
big pitfall, the backref items in extent tree.

Those backref items still point back to the old tree, meaning without a
lot of extra tricks, the extent tree would be corrupted.

Thus we have to go a new single tree variant:

- Generate new data csums into the csum root
  The new data csums would have a different objectid to distinguish
  them.
- Drop the old data csum items
- Change the key objectids of the new csums
- Change the checksums for metadata in-place

This means unfortunately we have to revert most of the old code, and
update the temporary item format.

The new temporary item would only record the target csum type.
At every stage we have a method to determine the progress, thus no need
for an item, but in the future it's still open for change.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:32 +02:00
Qu Wenruo d4f4d7b76e btrfs-progs: tune: add --convert-to-free-space-tree option
From the very beginning of free-space-tree feature, we allow mount
option "space_cache=v2" to convert the filesystem to the new feature.

But this is not the proper practice for new features (no matter if it's
incompat or compat_ro).

This is already making the clear_cache/space_cache mount option more
complex.

Thus this patch introduces the proper way to enable free-space-tree, and
I hope one day we can deprecate the "space_cache=" mount option.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-26 18:02:27 +02:00
David Sterba 455b1cf094 btrfs-progs: tune: rename bgt conversion options
Rename the options so they more accurately reflect what the command is
actually doing. The feature is enabled/disabled in the end but it's not
a simple on/off like for others, the conversion takes time.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-27 14:23:52 +02:00
Qu Wenruo 68a04bc710 btrfs-progs: tune: add new option to convert back to extent tree
With previous btrfstune support to convert to block-group-tree, it has
implemented most of the infrastructure for bi-directional conversion.

This patch will implement the remaining conversion support to go back to
extent tree.

The modification includes:

- New convert_to_extent_tree() function in btrfstune.c
  It's almost the same as convert_to_bg_tree(), but with small changes:
  * No need to set extra features like NO_HOLES/FST.
  * Need to delete the block group tree when everything finished.

- Update btrfs_delete_and_free_root() to handle non-global roots
  Currently the function can only accepts global roots (extent/csum/free
  space trees)

  If we pass a non-global root into the function, we will screw up
  global_roots_tree and crash.

  Since we're going to use btrfs_delete_and_free_root() to free block
  group tree which is not a global tree, this is needed.

- New handling for half converted fs in get_last_converted_bg()
  There are two cases need to be handled:

  * The bg tree is already empty
    We need to grab the first bg in extent tree.
    Or at conversion function we will fail at grabbing the first bg.

  * The bg tree is not empty
    Then we need to grab the last bg in extent tree.

- Extra root switching in involved functions. This involves:

  * read_converting_block_groups()
  * insert_block_group_item()
  * update_block_group_item()

  We just need to update our target root according to the current
  compat_ro and super flags.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-19 01:10:24 +02:00
Qu Wenruo 716c3be363 btrfs-progs: move block-group-tree out of experimental features
The feedback from the community on block group tree is very positive,
the only complain is, end users need to recompile btrfs-progs with
experimental features to enjoy the new feature.

So let's move it out of experimental features and let more people enjoy
faster mount speed.

Also change the option of btrfstune, from `-b` to
`--enable-block-group-tree` to avoid short option.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 19:28:05 +02:00
David Sterba a7fa81f296 btrfs-progs: open code print_usage where applicable
After previous change to usage() that now has the return code, there's
no purpose of the print_usage() wrapper so it can be removed.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-28 20:11:23 +01:00
Qu Wenruo f61b90aff9 btrfs-progs: make usage call properly return an exit value
[BUG]
Currently cli/009 test case failed with different exit number:

  ====== RUN CHECK /home/adam/btrfs-progs/btrfstune --help
  usage: btrfstune [options] device
  [...]
  failed: /home/adam/btrfs-progs/btrfstune --help
  test failed for case 009-btrfstune

[CAUSE]
In tune/main.c, we have the following call on usage():

  static void print_usage(int ret)
  {
	usage(&tune_cmd);
	exit(ret);
  }

However usage() itself would always call exit(1):

  void usage(const struct cmd_struct *cmd)
  {
	usage_command_usagestr(cmd->usagestr, NULL, 0, true, true);
	exit(1);
  }

This makes prevents any caller of usage() to modify its exit number.

[FIX]
Add a new argument @error for print_usage(), so we can properly return 0
for -h/--help usage.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-28 20:11:23 +01:00
David Sterba 9be33f558c btrfs-progs: tune: update checksum conversion
The checksum conversion is still experimental and still does not convert
all filesystems correctly. Do not use on valuable data.

Previous implementation copied the UUID conversion which was not a good
base for the checksum conversion so it left out basically all trees
except extent and checksum.

This update adds the base for the required safety features:

- let the old csum tree intact until the full conversion is done (ie.
  all data are still verifiable)
- add on-disk status tracking item, this should keep the from/to
  checksum conversion, last generation to catch potential updates of the
  underlying filesystem if conversion is interrupted and the filesystem
  mounted
- convert most of the fundamental trees, the subvolumes, tree log and
  relocation trees are not converted
- trees are converted in-place to avoid potentially running out of space
  but this might be better done by transaction protection with a
  temporary tree

Known issues:

- not all trees are converted
- not all checksums are correctly inserted into the new tree and reading
  the files leads to EIO

Issue: #438
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-28 20:11:22 +01:00
David Sterba 79b3ed9ab8 btrfs-progs: tune: fix typos and mistakes in messages
The conversion have been copy&pasted from one code but not all messages
reflect that and mistakenly say fsid instead of csum, etc. Also rename
functions converting the trees to more descriptive names.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-28 19:49:31 +01:00
David Sterba 85f49ce2cd btrfs-progs: tune: convert printf to message helpers
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-28 19:49:31 +01:00
David Sterba 512467c17e btrfs-progs: tune: use help and cmd_struct for printing help text
Unify the btrfstune help text so it uses the help framework. The cmd
struct is set up only partially.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:44:02 +01:00
David Sterba 4de40f3a5e btrfs-progs: tune: remove unused includes
Remove includes reported by include-what-you-use after the refactoring.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:44:02 +01:00
David Sterba 3cf00a0a38 btrfs-progs: tune: factor out checksum change to own file
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:44:01 +01:00
David Sterba 44fecc5fd2 btrfs-progs: tune: factor out conversion to block group tree to own file
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:43:47 +01:00
David Sterba 8962705987 btrfs-progs: tune: factor out metadata UUID change to own file
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:43:32 +01:00
David Sterba b2616464d3 btrfs-progs: tune: factor out UUID change to own file
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:43:19 +01:00
David Sterba b380d972e2 btrfs-progs: tune: factor out seeding to own file
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:43:04 +01:00
David Sterba f66478b961 btrfs-progs: move btrfstune to own directory
Move the source file to own directory so it can be further split and
refactored. File needs to be renamed to main.c so the build magic works.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-18 17:42:04 +01:00
Renamed from btrfstune.c (Browse further)