Page MenuHomeFreeBSD

ext2fs: extents, initial RW support
ClosedPublic

Authored by fsu on Aug 20 2017, 4:28 PM.

Details

Test Plan

Could be tested with script from attachment.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Very interesting feature. Small knits for now. Thanks!

sys/fs/ext2fs/ext2_extents.c
51 ↗(On Diff #32267)

Perhaps these definitions be in ext2_extents.h ?

131 ↗(On Diff #32267)

style: the assignment should be done after the declaration section. Preferably just before the variable is first used.

275 ↗(On Diff #32267)

Same as above.

400 ↗(On Diff #32267)

Minor style: please switch the multiplication to leave the size last. This matches the suggested syntax for calloc(3).

Remove forgotten unneeded stuff.

I might have missed it, but have you considered the case that requires writing EXT4_HUGE_FILEs?

Of course, I am waiting for kevlo@ to chime in as he has more experience with this than me.

In D12087#251372, @pfg wrote:

I might have missed it, but have you considered the case that requires writing EXT4_HUGE_FILEs?

Yep, you can see it from attached test script, the mkfs option -O huge_file was added in all cases.

Of course, I am waiting for kevlo@ to chime in as he has more experience with this than me.

Me too:)

In D12087#251428, @thisisadrgreenthumb_gmail.com wrote:
In D12087#251372, @pfg wrote:

I might have missed it, but have you considered the case that requires writing EXT4_HUGE_FILEs?

Yep, you can see it from attached test script, the mkfs option -O huge_file was added in all cases.

Such "test" is insufficient.

We hande huge_file fine for reading: we actually detect EXT4_HUGE_FILE and convert from blocks, which works fine for reading. For write though, we never set EXT4_HUGE_FILE.

There are two approaches: NetBSD uses blocks if the attribute is present, we instead always do the conversion for reading (but not for writing since we didn't have write support). I am unsure what linux does but we should at least check for overflow and do the conversion back to blocks when needed.

Ok, I am not sure that it is possible to overflow this variable.
We have 16 bit from e2di_nblock_high and 32 bit from e2di_nblock.
So:
pow(2, 48) * 512 / pow(1024,4) = 131072
Where result is in TBs.
The we can see from https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout (File System Maximums table)
That for 64k blocks in both 32bit and 64bit cases we have maximum file size 256TB.

As I can see on linux there is the next logic:
if (i_blocks <= 0xffffffffffffULL) {

		/*
		 * i_blocks can be represented in a 48 bit variable
		 * as multiple of 512 bytes
		 */
		...

} else {
... - use blocks
}

But, I can not imagine case when the uint64_t i_blocks will be more than 0xffffffffffff.
That's why write support was not added here: https://reviews.freebsd.org/D11209
If you can suggest the case, when i_blocks is more than pow(2,48), it would be great.
Also, we can add the same logic as in linux, how it could be formulated... just to be sure.

In D12087#251589, @thisisadrgreenthumb_gmail.com wrote:

Ok, I am not sure that it is possible to overflow this variable.
We have 16 bit from e2di_nblock_high and 32 bit from e2di_nblock.
So:
pow(2, 48) * 512 / pow(1024,4) = 131072
Where result is in TBs.
The we can see from https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout (File System Maximums table)
That for 64k blocks in both 32bit and 64bit cases we have maximum file size 256TB.

As I can see on linux there is the next logic:
if (i_blocks <= 0xffffffffffffULL) {

		/*
		 * i_blocks can be represented in a 48 bit variable
		 * as multiple of 512 bytes
		 */
		...

} else {
... - use blocks
}

The only direct reference to the flag I see in linux is here:

http://src.illumos.org/source/xref/linux-master/fs/ext4/super.c#2603

The upper_limit depends on the huge_file option.

But, I can not imagine case when the uint64_t i_blocks will be more than 0xffffffffffff.
That's why write support was not added here: https://reviews.freebsd.org/D11209
If you can suggest the case, when i_blocks is more than pow(2,48), it would be great.
Also, we can add the same logic as in linux, how it could be formulated... just to be sure.

As you note, linux will store 48 bits but on FreeBSD we handle an i_blocks of uint64_t, so theoretically an overflow is possible.

My doubt though is .. does ext4 do the conversion unconditionally when the filesystem can handle it (huge_file feature is set), or is it just a case that kicks in when an overflow happens. The later means you can reset the huge_file flag and still mount the filesystem without losing everything (not that it matters too much as it is an inode flag).

This looks good to me.

This revision is now accepted and ready to land.Sep 24 2017, 2:46 PM

Add cem@, he may want to review it before it lands.

In D12087#259854, @pfg wrote:

Add cem@, he may want to review it before it lands.

Thanks Pedro, I expecting, that @cem's review really will help, it is, how it could be said... fresh eyes.

I'm not really familiar with this code nor ext4, sorry.

This revision was automatically updated to reflect the committed changes.