Page MenuHomeFreeBSD

account for ashift when gathering buffers to be written to l2arc device
ClosedPublic

Authored by avg on Jun 9 2015, 10:00 AM.

Details

Summary

The problem is that since OpenSolaris commit
illumos/illumos-gate@e14bb3258d05c1b1077e2db7cf77088924e56919
l2ad_hand is kept aligned based on ashift (which is derived from
the cache device's logical and physical block sizes).
So, the hand could be advanced by more than the sum of b_asize-s
of the written L2ARC buffers. This is because b_asize is a misnomer
at the moment as it does not always represent the allocated size:
if a buffer is compressed, then the compressed size is properly rounded,
but if the compression fails or it is not applied, then the original
size is kept and it could be smaller than what ashift requires.

For the same reasons arcstat_l2_asize and the reported used space
on the cache device could be smaller than the actual allocated size
if ashift > 9. That problem is not fixed by this change.

This change only ensures that l2ad_hand is not advanced by more
than target_sz. Otherwise we would overwrite active (unevicted)
L2ARC buffers. That problem is manifested as growing l2_cksum_bad
and l2_io_error counters.

This change also changes 'p' prefix to 'a' prefix in a few places
where variables represent allocated rather than physical size.

The resolved problem may also result in the reported allocated size
being greater than the cache device's capacity, because of the
overwritten buffers (more than one buffer claiming the same disk
space).

PR: 198242
PR: 195746 (possibly related)

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

avg retitled this revision from to account for ashift when gathering buffers to be written to l2arc device.
avg updated this object.
avg edited the test plan for this revision. (Show Details)
avg added a reviewer: delphij.
avg added a subscriber: ZFS.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
5291 ↗(On Diff #6055)

I don't really like the fact that arcstat_l2_asize isn't calculating asize tbh. Would it not be better to fix this properly in this patch too?

5380 ↗(On Diff #6055)

As this doesn't include the uplift to match block size then the space allocation will be incorrect.

I've not checked but this could cause issues later when the device is full but we still think there is space available?

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
5291 ↗(On Diff #6055)

Two notes:

  • it's been like that for years, so no rush
  • I do have a patch that ensures that all l2 buffers have ashift adjusted sizes, that automatically makes the stats correct

I will submit the patch for a review soon.

5380 ↗(On Diff #6055)

First, nothing changes from the previous code in this respect: you can easily see that stats_size is exactly the same as what write_asize used to be.
Second, this size is used for reporting only. Decisions about how much to write and where to write are made based on l2ad_hand, l2ad_start and l2ad_end values.

In D2764#52934, @smh wrote:

That does not help much as the commit was in the typical Oracle style: many unrelated changes piled together in one huge commit.
Here is a slightly better link: https://github.com/illumos/illumos-gate/commit/e14bb3258d05c1b1077e2db7cf77088924e56919#diff-dc835266d68d5a3711f1bae8d024d796R4186

While that commit introduced the problem of buffers overwriting I think that the accounting problem actually came from the following much more recent change:
https://github.com/illumos/illumos-gate/commit/aad02571bc59671aa3103bb070ae365f531b0b62#diff-dc835266d68d5a3711f1bae8d024d796R4653

avg edited edge metadata.

Rebase after the recent massive imports from the upstream.

This revision was automatically updated to reflect the committed changes.