Commit graph

132 commits

Author SHA1 Message Date
Chris Mason
4ff9e2af17 Add btrfs-list for listing subvolumes
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2010-02-28 15:35:38 -05:00
Chris Mason
95d3f20b51 Mixed back reference (FORWARD ROLLING FORMAT CHANGE)
This commit introduces a new kind of back reference for btrfs metadata.
Once a filesystem has been mounted with this commit, IT WILL NO LONGER
BE MOUNTABLE BY OLDER KERNELS.

The new back ref provides information about pointer's key, level and in which
tree the pointer lives. This information allow us to find the pointer by
searching the tree. The shortcoming of the new back ref is that it only works
for pointers in tree blocks referenced by their owner trees.

This is mostly a problem for snapshots, where resolving one of these fuzzy back
references would be O(number_of_snapshots) and quite slow.  The solution used
here is to use the fuzzy back references in the common case where a given tree
block is only referenced by one root, and use the full back references when
multiple roots have a reference
2009-06-08 13:30:36 -04:00
Chris Mason
cc04d99e90 Add scan of the btrfs log tree to btrfs-debug-tree 2009-04-15 14:30:14 -04:00
Yan Zheng
9a6930e9be Add semantic checks to btrfsck for files and directories
This patch makes btrfsck check more things, including
directory items, file extents, checksumming, inode link
counts etc.

The code for these checks is similar to the code verifies
extent back references. The main difference is that
shared tree blocks are treated specially. The partial
checking results(unresolved references and/or errors)
of shared sub-trees are cached. This avoids scanning
the shared blocks several times. Thank you,

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2009-01-07 14:57:12 -05:00
Yan Zheng
0d53b212d8 Btrfs: update converter for the new disk format
This patch updates the ext3 to btrfs converter for the new
disk format. This mainly involves changing the convert's
data relocation and free space management code. This patch
also ports some functions from kernel module to btrfs-progs.
Thank you,

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-12-17 16:10:07 -05:00
Chris Mason
b238b4b072 Btrfs: Add inode sequence number for NFS and reserved space in a few structs
This adds a sequence number to the btrfs inode that is increased on
every update.  NFS will be able to use that to detect when an inode has
changed, without relying on inaccurate time fields.

While we're here, this also:

Puts reserved space into the super block and inode

Adds a log root transid to the super so we can pick the newest super
based on the fsync log as well as the main transaction ID.  For now
the log root transid is always zero, but that'll get fixed.

Adds a starting offset to the dev_item.  This will let us do better
alignment calculations if we know the start of a partition on the disk.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-12-08 17:01:14 -05:00
Chris Mason
d79f499eae Btrfs: move data checksumming into a dedicated tree
Btrfs stores checksums for each data block.  Until now, they have
been stored in the subvolume trees, indexed by the inode that is
referencing the data block.  This means that when we read the inode,
we've probably read in at least some checksums as well.

But, this has a few problems:

* The checksums are indexed by logical offset in the file.  When
compression is on, this means we have to do the expensive checksumming
on the uncompressed data.  It would be faster if we could checksum
the compressed data instead.

* If we implement encryption, we'll be checksumming the plain text and
storing that on disk.  This is significantly less secure.

* For either compression or encryption, we have to get the plain text
back before we can verify the checksum as correct.  This makes the raid
layer balancing and extent moving much more expensive.

* It makes the front end caching code more complex, as we have touch
the subvolume and inodes as we cache extents.

* There is potentitally one copy of the checksum in each subvolume
referencing an extent.

The solution used here is to store the extent checksums in a dedicated
tree.  This allows us to index the checksums by phyiscal extent
start and length.  It means:

* The checksum is against the data stored on disk, after any compression
or encryption is done.

* The checksum is stored in a central location, and can be verified without
following back references, or reading inodes.

This makes compression significantly faster by reducing the amount of
data that needs to be checksummed.  It will also allow much faster
raid management code in general.

The checksums are indexed by a key with a fixed objectid (a magic value
in ctree.h) and offset set to the starting byte of the extent.  This
allows us to copy the checksum items into the fsync log tree directly (or
any other tree), without having to invent a second format for them.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-12-08 17:00:31 -05:00
Chris Mason
d45ee76e4f Rev the disk format for the compat code and the csum selection
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-12-02 10:20:36 -05:00
Josef Bacik
1148e55804 btrfs-progs: support for different csum algorithims
This is the btrfs-progs version of the patch to add the ability to have
different csum algorithims.  Note I didn't change the image maker since it
seemed a bit more complicated than just changing some stuff around so I will let
Yan take care of that.

Everything else was converted and for now a mkfs just
sets the type to be BTRFS_CSUM_TYPE_CRC32.

Signed-off-by: Josef Bacik <jbacik@redhat.com>
2008-12-02 09:58:23 -05:00
Josef Bacik
76b8244a7a btrfs-progs: add support for compat flags
This patch updates btrfs-progs with the disk format changes for the
compatability flags.

Signed-off-by: Josef Bacik <jbacik@redhat.com>
2008-12-02 07:18:36 -05:00
Yan Zheng
aa62e84c84 Btrfs image tool
This patch adds btrfs image tool. The image tool is
a debugging tool that creates/restores btrfs metadump
image.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-11-20 09:52:48 -05:00
Chris Mason
49bc666d5f Update the super magic string to match the seed and root format changes 2008-11-18 14:13:44 -05:00
Chris Mason
2d9bc57b9a Add disk format requirements for subvol backward and forward refs 2008-11-18 10:34:08 -05:00
Yan Zheng
4d1d3a59d6 update btrfs-progs for seed device support
This patch does the following:

1) Update device management code to match the kernel code.

2) Allocator fixes.

3) Add a program called btrfstune to set/clear the SEEDING
   super block flags.
2008-11-18 10:40:06 -05:00
Yan Zheng
95470dfaf1 Add fallocate support v2
This patch updates btrfs-progs for fallocate support.
 
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-10-31 12:48:02 -04:00
Chris Mason
712c1dea5e Rev the disk format for compression and root pointer generation fields
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-10-29 14:09:40 -04:00
Yan Zheng
38702ea7c6 Add root tree pointer transaction ids
This patch adds transaction IDs to root tree pointers.
Transaction IDs in tree pointers are compared with the
generation numbers in block headers when reading root
blocks of trees. This can detect some types of IO errors.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-10-29 14:07:47 -04:00
Chris Mason
c830821ddf Add disk format elements for compression
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-10-29 14:37:41 -04:00
Chris Mason
b431f25ec7 Rev the disk format for the new back reference format
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-10-09 13:40:23 -04:00
Yan Zheng
9559e0b09e Count space allocated to file in bytes
This patch updates btrfs-progs for counting space
allocated to file in bytes.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-10-09 11:55:26 -04:00
Yan Zheng
5986faaf47 Remove offset field from struct btrfs_extent_ref
The offset field in struct btrfs_extent_ref records the position
inside file that file extent is referenced by. In the new back
reference system, tree leaves holding reference to file extent
are recorded explicitly. We can quickly scan these tree leaves, so the
offset field is not required.

This patch also makes the back reference system check the objectid
when extents are being deleted

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-10-09 11:55:30 -04:00
Zheng Yan
428b7fa630 Full back reference support
This patch makes the back reference system to explicit record the
location of parent node for all types of extents. The location of
parent node is placed into the offset field of backref key. Every
time a tree block is balanced, the back references for the affected
lower level extents are updated.
2008-09-23 12:29:10 -04:00
Chris Mason
cea88ec1d7 Disk format changes required for write ahead tree log 2008-09-05 16:15:58 -04:00
Chris Mason
2f2f9ef77a Rev the disk format 2008-08-21 15:49:34 -04:00
Josef Bacik
0045e0dd70 btrfs-progs: add orphan support to print-tree
This adds orphan support to print-tree so when debug_tree hits an orphan item it
will print out "orphan item" under it so you know what it is
2008-07-30 09:15:02 -04:00
Chris Mason
e74b89d675 Rev the disk format 2008-07-24 13:52:04 -04:00
Josef Bacik
059c20b384 btrfs-progs new dir index support 2008-07-24 12:13:32 -04:00
Chris Mason
a62332eeb2 Add a readonly flag open_ctree to force RO opens 2008-05-05 09:45:26 -04:00
Chris Mason
083faf794f Add mkfs.btrfs -A offset to control allocation start on devices
This is a utility option for the resizer, it makes sure to allocate
at offset bytes in the disk or higher.  It ensures the resizer will have
something to move when testing it.
2008-04-25 16:55:21 -04:00
Yan Zheng
4415143185 Speed improvement and bug fixes for ext3 converter
This patch improves converter's allocator and fixes a bug in data relocation
function. The new allocator caches free blocks as Btrfs's default allocator.
In testing here, the user CPU time reduced to half of the original when
checksum and small file packing was disabled. This patch also enlarges the
size of block groups created by the converter.
2008-04-24 14:57:50 -04:00
Chris Mason
8bfbb6b6f8 Update the Ext3 converter
The main changes in this patch are adding chunk handing and data relocation
ability. In the last step of conversion, the converter relocates data in system
chunk and move chunk tree into system chunk. In the rollback process, the
converter remove chunk tree from system chunk and copy data back.

Regards
YZ
---
2008-04-22 14:06:56 -04:00
Chris Mason
588bb9dfff Add support for filesystem labels via mkfs.btrfs -L 2008-04-18 10:31:42 -04:00
Chris Mason
d25165e95c Use device uuids when scanning devices 2008-04-18 10:31:42 -04:00
Chris Mason
1f81c1b6fc Add raid10 support 2008-04-16 11:14:21 -04:00
Chris Mason
951fd7371c Add chunk uuids and update multi-device back references
Block headers now store the chunk tree uuid

Chunk items records the device uuid for each stripes

Device extent items record better back refs to the chunk tree

Block groups record better back refs to the chunk tree

The chunk tree format has also changed.  The objectid of BTRFS_CHUNK_ITEM_KEY
used to be the logical offset of the chunk.  Now it is a chunk tree id,
with the logical offset being stored in the offset field of the key.

This allows a single chunk tree to record multiple logical address spaces,
upping the number of bytes indexed by a chunk tree from 2^64 to
2^128.
2008-04-15 15:42:08 -04:00
Chris Mason
a37e1e7204 Recow all roots at the end of mkfs
The mkfs code bootstraps the filesystem on a single device.  Once
the raid block groups are setup, it needs to recow all of the blocks so
that each tree is properly allocated.
2008-04-04 15:42:17 -04:00
Chris Mason
c7be130df7 Add support for single single duplication of metadata 2008-04-03 16:35:48 -04:00
Chris Mason
a6de0bd778 Add mirroring support across multiple drives 2008-04-03 16:35:48 -04:00
Chris Mason
ad67cd73b7 Update struct btrfs_header flags, and use it to indicate buffers are written 2008-04-01 10:20:06 -04:00
Chris Mason
e9e3422f85 Implement raid0 when multiple devices are present
This defaults to striping across all devices
2008-03-25 16:50:20 -04:00
Chris Mason
0dcfa3b827 Walk all block devices looking for btrfs 2008-03-24 15:05:44 -04:00
Chris Mason
26afd0f31d ioctls to scan for btrfs filesystems 2008-03-24 15:04:49 -04:00
Chris Mason
1f3ba6a3f9 Btrfsck updates for multi-device filesystems 2008-03-24 15:04:37 -04:00
Chris Mason
d12d4c7203 Dynamic chunk allocation 2008-03-24 15:03:58 -04:00
Chris Mason
510be29677 Add support for multiple devices per filesystem 2008-03-24 15:03:18 -04:00
Chris Mason
aa69dec31c Add inode item and backref in one insert, reducing cpu usage 2008-01-29 15:15:18 -05:00
Chris Mason
cbf87cad07 During deletes and truncate, remove many items at once from the tree 2008-01-29 15:11:36 -05:00
Chris Mason
80791984f6 Rename the extent_map code to extent_io
This mirrors the changes in the kernel code.
2008-03-04 11:16:54 -05:00
David Miller
8871a0eaa9 Unaligned access fixes
The first problem is that these SETGET macros lose typing information,
and therefore can't see the 'packed' attribute and therefore take
unaligned access SIGBUS signals on sparc64 when trying to derefernce
the member.

The next problem is a similar issue in btrfs_name_hash().  This gets
passed things like &key.offset which is a member of a packed
structure, losing this packed'ness information btrfs_name_hash()
performs a potentially unaligned memory access, again resulting in a
SIGBUS.
2008-02-15 11:19:58 -05:00
Chris Mason
f64e047c7c Update magic 2008-02-04 10:11:12 -05:00