Page MenuHomeFreeBSD

ext2fs: RW support for Extended Attributes
ClosedPublic

Authored by fsu on Apr 22 2017, 2:14 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Nov 15, 7:30 AM
Unknown Object (File)
Sun, Nov 10, 11:56 AM
Unknown Object (File)
Tue, Nov 5, 5:33 PM
Unknown Object (File)
Sun, Nov 3, 11:04 PM
Unknown Object (File)
Sun, Nov 3, 11:03 PM
Unknown Object (File)
Sun, Nov 3, 11:03 PM
Unknown Object (File)
Sun, Nov 3, 10:42 PM
Unknown Object (File)
Fri, Nov 1, 2:48 PM
Subscribers

Details

Test Plan

The next script with patched tools/regression/fsx was used.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Hmm.. have you given a thought on making the in memory inode more similar to UFS? I mean using something like this:

	/*
	 * Data for extended attribute modification.
 	 */
	u_char	  *i_ea_area;	/* Pointer to malloced copy of EA area */
	unsigned  i_ea_len;	/* Length of i_ea_area */
	int	  i_ea_error;	/* First errno in transaction */
	int	  i_ea_refs;	/* Number of users of EA area */

I admit it looks pretty straightforward as you did it but I was hoping to keep more similarities with UFS.

In D10460#216919, @pfg wrote:

Hmm.. have you given a thought on making the in memory inode more similar to UFS? I mean using something like this:

	/*
	 * Data for extended attribute modification.
 	 */
	u_char	  *i_ea_area;	/* Pointer to malloced copy of EA area */
	unsigned  i_ea_len;	/* Length of i_ea_area */
	int	  i_ea_error;	/* First errno in transaction */
	int	  i_ea_refs;	/* Number of users of EA area */

I admit it looks pretty straightforward as you did it but I was hoping to keep more similarities with UFS.

Am I right, that your idea is to allocate the memory area for EA's in case of inode reading,
then work with it, the write this memory to disk and free it, in case of inode inactivation?

In D10460#216920, @thisisadrgreenthumb_gmail.com wrote:
In D10460#216919, @pfg wrote:

Hmm.. have you given a thought on making the in memory inode more similar to UFS? I mean using something like this:

	/*
	 * Data for extended attribute modification.
 	 */
	u_char	  *i_ea_area;	/* Pointer to malloced copy of EA area */
	unsigned  i_ea_len;	/* Length of i_ea_area */
	int	  i_ea_error;	/* First errno in transaction */
	int	  i_ea_refs;	/* Number of users of EA area */

I admit it looks pretty straightforward as you did it but I was hoping to keep more similarities with UFS.

Am I right, that your idea is to allocate the memory area for EA's in case of inode reading,
then work with it, the write this memory to disk and free it, in case of inode inactivation?

Indeed that would seem necessary but the UFS implementation seems to do more specific VFS interaction and takes care of locking.
Look at sys/ufs/ufs/ufs_extattr.c.

DISCLAIMER: I really don't know this code well at all.

I am not sure, that keep EA's in memory is good idea. It is right, that, the number of bread()/bwrite() will be decreased. But:

  • The EA area is not consistent, iblock/ibody.
  • The EA blocks could be shared (which is not implemented in my patch), the additional syncronization will be required.
  • The EA's max size could be decreased in next revisions of extfs by adding EA blocks redirection for example.

So, it is completely possible to implement, but the future EA's functionality integration will be difficult, from my point of view.

In D10460#216926, @thisisadrgreenthumb_gmail.com wrote:

I am not sure, that keep EA's in memory is good idea. It is right, that, the number of bread()/bwrite() will be decreased. But:

  • The EA area is not consistent, iblock/ibody.
  • The EA blocks could be shared (which is not implemented in my patch), the additional syncronization will be required.
  • The EA's max size could be decreased in next revisions of extfs by adding EA blocks redirection for example.

So, it is completely possible to implement, but the future EA's functionality integration will be difficult, from my point of view.

We have been trying to keep the fs implementation as near as UFS as we possibly can in the hopes of making ext2fs behave "natively" and also to make it easy to adopt changes from UFS. OTOH, as we start to implement some fs specifics it is OK to diverge and your patch is certainly an improvement.

Let's wait a bit ... maybe someone provides better feedback.

rwatson implemented the UFS EAs, perhaps he could give us some feedback on this approach for ext2fs.

So, it is not difficult to implement a pointers to EA data in the memory inode. It will be like:

        /*
	 * Data for extended attribute modification.
 	 */
        u_char	  *i_ea_inode_area;
        unsigned  i_ea_inode_len;

u_char *i_ea_block_area;

unsigned  i_ea_block_len;

int i_ea_error; /* First errno in transaction */
int i_ea_refs; /* Number of users of EA area */

But, again, I am not agree, that it is a good idea to keep attributes in memory.
Ok, let's wait third opinion.

FYI, the ufs_extattr.c implementation is for UFS1 only, where there isn't in-layout storage for attributes. UFS2 uses code in ffs_vnops.c, relying on an additional block hung off the inode, and is probably a better reference for this work. The UFS implementation provides transaction-like semantics for multi-EA update -- hence the open/close behaviour. That is desirable when working with multiple simultaneous MAC policies that are each adding metadata in different attributes. I'm not sure how useful that will be to FreeBSD ext2fs users in practice, but it is the semantics in our VFS as a result -- and is what implies the in-memory copy so that we can atomically commit all the updates. I wonder if the 'default' mapping for ext2_extattr_index_to_linux() is safe..? I'm not sure what namespaces exist in Linux these days, but it might be one prefers to be conservative and protect non-user namespaces from access by unprivileged users.

Yep, the question about attr namespace mapping still open, because there is no "empty" attribute name in linux.
So, in case of empty namespace passed to setattr(), seems like some error should be returned, I will try to verify ufs behaviour in this case, when I will fix these switches.

Ok, Robert, could you please share your opinion about keeping EA's data in memory in case of ext2fs, based on our discussion above?

:) In all cases I should agree with Pedro in the end because he is code base owner.

The expert on the UFS2 extattr code is phk, who wrote it. I believe UFS2 generally relies on the buffer cache to cache the extattr block associated with an inode, so that it follows normal LRU-like eviction rules, etc. I believe that the only time UFS2 keeps extattr data hung off the inode in a special in-memory buffer is during a multi-operation transaction started by VOP_OPENEXTATTR and a corresponding later VOP_CLOSEXTATTR. Between those two VOPs, if a buffer is present, writes occur against the buffer rather than against the buffer cache, allowing the writes to be batched atomically. Otherwise, I believe that UFS2 will simply issue updates to bits of buffer-cache-resident extattr data. Take a look at ffs_extread and ffs_extwrite for details.

Ok, thanks for answer.
Will try to prepare another revision, which will be closer to ffs.

In D10460#216936, @thisisadrgreenthumb_gmail.com wrote:

Yep, the question about attr namespace mapping still open, because there is no "empty" attribute name in linux.
So, in case of empty namespace passed to setattr(), seems like some error should be returned, I will try to verify ufs behaviour in this case, when I will fix these switches.

Ok, Robert, could you please share your opinion about keeping EA's data in memory in case of ext2fs, based on our discussion above?

:) In all cases I should agree with Pedro in the end because he is code base owner.

Well, not sure I am the code "owner" since I don't really use linux at all :). Ultimately your code is not wrong but ...

If the objective is to make ext2fs an alternative to UFS then it would be ideal to support the VFS semantics and that involves the in-memory copy. Would it be difficult to just copy the ffs_*_ea() functionality from ffs_vnops.c?

I am truly sorry that this would mean rewriting a big hunk of your code though :(.

Make implementation more closer to ufs/ffs.

pfg requested changes to this revision.Apr 30 2017, 8:22 PM

I just started looking at it, but perhaps you missed getting rid of i_facl? It's no something we carry in the BSD inode.

sys/fs/ext2fs/ext2_extattr.c
126 ↗(On Diff #27870)

style(9): missing parenthesis for the return.

144 ↗(On Diff #27870)

style(9): Missing parenthesis for return.

188 ↗(On Diff #27870)

style(9): Missing parenthesis for return.

sys/fs/ext2fs/inode.h
102 ↗(On Diff #27870)

I thought the above would make i_facl unnecessary?

This revision now requires changes to proceed.Apr 30 2017, 8:22 PM

Hmm .. to clarify further:

I would think we can convert the linux attributes to FBSD attributes (e2di_facl to i_ea_area & friends) in ext2_inode_cnv.c so we wouldn't carry an in memory i_facl. That would make the UFS and ext2fs code identical?

The removing i_facl from BSD node will cost additional block (with on-disk inode, to get i_facl value) reading under ext2_extread()/ext2_extwrite(). I can try to implement it, if additional bread() is not so critical.

In D10460#218676, @thisisadrgreenthumb_gmail.com wrote:

The removing i_facl from BSD node will cost additional block (with on-disk inode, to get i_facl value) reading under ext2_extread()/ext2_extwrite(). I can try to implement it, if additional bread() is not so critical.

I think you could leave it as a temporary variable in ext2_ei2i() and call a helper function to do the conversion. Same thing, another helper function, for ext2_i2ei.
In other words you could be moving all the conversion process to ext2_inode_cnv.c

Sorry to put you into all this ;).

fsu edited edge metadata.

Fix parenthesis for return (style(9)) + remove facl field from in-memory inode.

Thanks!

sys/fs/ext2fs/ext2_alloc.c
54 ↗(On Diff #27887)

Why this change?
ffs_alloccg() is static so I think ext2_alloccg() should be the same.

sys/fs/ext2fs/ext2_inode_cnv.c
55 ↗(On Diff #27887)

Does this compile?

I am not unhappy with this version but I should note there is somewhat of an inefficiency:

While in memory operations we shouldn't be dealing with the dinode: we normally only deal with the dinode in ext2_inode_cnv.c where we basically determine what is/goes in the disk and what not.

Ok, could you plase suggest me the way,
how I should allocate EA blocks in the ext2_vnops.c
instead of ext2_alloccg() function.
Am I right, that I should add something like ext2_allocfacl() to ext2_alloc.c,
and declare it sys/fs/ext2fs/ext2_extern.h?

Also, about the dinode, if I have no facl value under ea vnops,
how I can get it without dinode reading from disk?
Or I should add to something like:
void ext2_ei_getfacl() and void ext2_ei_putfacl() to ext2_inode_cnv.c?

sys/fs/ext2fs/ext2_inode_cnv.c
55 ↗(On Diff #27887)

This is my typo, will be fixed in next review request.

In D10460#218829, @thisisadrgreenthumb_gmail.com wrote:

Ok, could you plase suggest me the way,
how I should allocate EA blocks in the ext2_vnops.c
instead of ext2_alloccg() function.
Am I right, that I should add something like ext2_allocfacl() to ext2_alloc.c,
and declare it sys/fs/ext2fs/ext2_extern.h?

Also, about the dinode, if I have no facl value under ea vnops,
how I can get it without dinode reading from disk?
Or I should add to something like:
void ext2_ei_getfacl() and void ext2_ei_putfacl() to ext2_inode_cnv.c?

Let's go back to the previous revision ... the one with i_facl in the in-memory inode. I think it adapts better to what I would like.

I think we can commit that and then work out how to get rid of i_facl cleanly.

:) Ok, but could you please suggest something about ext2_alloccg()?

In D10460#218836, @thisisadrgreenthumb_gmail.com wrote:

:) Ok, but could you please suggest something about ext2_alloccg()?

OK.. I missed that one completely. How does FFS do it?

FFS uses:
#define UFS_BALLOC ... VFSTOUFS((aa)->v_mount)->um_balloc(aa, bb, cc, dd, ee, ff).
But, as I understand, I can not use:
ext2_balloc(struct inode *ip, e2fs_lbn_t lbn, int size, struct ucred *cred,

struct buf **bpp, int flags)

in case of EA, because EA block is not part of file blocks.
So, as I could be seen for me:
I can use alloccg() and remove static declaration (as now), or implement the wrapper function, like ext2_allocfacl(), which will call alloccg().

FYI, there is another important semantic difference between BSD and Linux extended attributes. In FreeBSD, ACLs are exposed (and manipulated) via separate vnode operations in VFS, and similarly ACL system calls, since our VFS is ACL-aware, whereas in Linux, they use the extended attribute system calls and inode operations to carry a variety of metadata including ACLs. As such, if extfs is implementing ACLs, we'll want a wrapper that maps them to/from FreeBSD ACL vnode operations on the way through. There is arguably a desire to do something similar in the linuxulator to ensure that ACL operations enter our VFS as ACL operations rather than EA operations.

In D10460#218854, @thisisadrgreenthumb_gmail.com wrote:

FFS uses:
#define UFS_BALLOC ... VFSTOUFS((aa)->v_mount)->um_balloc(aa, bb, cc, dd, ee, ff).
But, as I understand, I can not use:
ext2_balloc(struct inode *ip, e2fs_lbn_t lbn, int size, struct ucred *cred,

struct buf **bpp, int flags)

in case of EA, because EA block is not part of file blocks.
So, as I could be seen for me:
I can use alloccg() and remove static declaration (as now), or implement the wrapper function, like ext2_allocfacl(), which will call alloccg().

I would go for the wrapper ext2_allocfacl() function.

This said, just before Robert's comment I was considering the added complexity, and that the UFS approach wasn't really buying us anything due to the differences in EAs, and thinking we could just go back to you initial KISS approach.

FYI, there is another important semantic difference between BSD and Linux extended attributes. In FreeBSD, ACLs are exposed (and manipulated) via separate vnode operations in VFS, and similarly ACL system calls, since our VFS is ACL-aware, whereas in Linux, they use the extended attribute system calls and inode operations to carry a variety of metadata including ACLs. As such, if extfs is implementing ACLs, we'll want a wrapper that maps them to/from FreeBSD ACL vnode operations on the way through. There is arguably a desire to do something similar in the linuxulator to ensure that ACL operations enter our VFS as ACL operations rather than EA operations.

Thanks for the input. Fedor is/was certainly considering adding support for ACLs. The big question is whether it's easier to keep the UFS approach for EAs or start from the simpler approach using the in-memory i_facl.

I can confirm that, I ready to try add ext2fs ACL's, but we should managed with EAs before this.
So, I am waiting for final decision about EA's implementation approach.

In D10460#218871, @thisisadrgreenthumb_gmail.com wrote:

I can confirm that, I ready to try add ext2fs ACL's, but we should managed with EAs before this.
So, I am waiting for final decision about EA's implementation approach.

As I meant to say -.. if you prefer we go back to the initial patch, I am fine with that. I was hoping doing EAs the UFS way would make easier the ACL stuff but it doesn't seem that way anymore.

Switch back to initial version + some style(9) fixes.

I approve going back to the simpler approach, and seeing from there how the ACLs go.

sys/fs/ext2fs/ext2_extattr.c
218 ↗(On Diff #27967)

Couldn't help noticing ...

Given that this break is followed by two closing semicolons ... does the break do anything?

sys/fs/ext2fs/ext2_extattr.c
218 ↗(On Diff #27967)

The full code of the for loop:
for (entry = EXT2_IFIRST(header); !EXT2_IS_LAST_ENTRY(entry);

	    entry = EXT2_EXTATTR_NEXT(entry)) {
		if (ext2_extattr_index_to_bsd(entry->e_name_index) != attrnamespace)
			continue;

		if (uio == NULL)
			*size += entry->e_name_len + 1;
		else {
			char *attr_name = malloc(entry->e_name_len + 1, M_TEMP, M_WAITOK);
			attr_name[0] = entry->e_name_len;
			memcpy(&attr_name[1], entry->e_name, entry->e_name_len);
			error = uiomove(attr_name, entry->e_name_len + 1, uio);
			free(attr_name, M_TEMP);
			if (error)
				break;
		}

}

So, if uiomove() return error > 0, the loop will be break, and error will be returned by function.
BTW: Can not find how to pass full diff, that context will be available. I use git format-patch --full-index, but review board do not want to display it.

This revision is now accepted and ready to land.May 3 2017, 9:49 PM
This revision was automatically updated to reflect the committed changes.