
tests/sys/aio: Fix vectored_big_iovcnt flaky test failure
ClosedPublic

Authored by olivier on Sun, Oct 5, 10:59 AM.

Details

Summary

The vectored_big_iovcnt test was failing intermittently with "aio short
write (16384)".

The test creates an MD device with a fixed size of GLOBAL_MAX (16384
bytes) but calculates the buffer size as 512 * (max_buf_aio + 1). When
max_buf_aio exceeds 31, this results in a buffer larger than the MD
device capacity, causing the write operation to be truncated to the
device size and triggering a test failure.

Test Plan

We saw this test fail only rarely, with this error message:
aio short write (16384)
After improving the error message, the new output was:
aio short write: got 16384, expected: 131584 (max_buf_aio=256, iovcnt=257)

To reproduce easily:

$ sudo sysctl vfs.aio.max_buf_aio=256
vfs.aio.max_buf_aio: 16 -> 256
$ sudo kyua test sys/aio/aio_test:vectored_big_iovcnt

But remember: nothing in the standard aio tests modifies the default value.

Diff Detail

Lint: Lint Skipped
Unit: Tests Skipped

Event Timeline

olivier created this revision.
asomers requested changes to this revision. Sun, Oct 5, 5:39 PM
asomers added inline comments.
sys/aio/aio_test.c
848

This part should be unnecessary, since Kyua runs each test case in a fresh tempdir. Why did you find it necessary?

1834

Lowering max_buf_aio here will prevent overshooting the device capacity. But it will also prevent the test from exceeding the kernel's internal limit, defeating the point of the test. Instead of reducing max_buf_aio, it would be better to increase the md's device size.

This revision now requires changes to proceed. Sun, Oct 5, 5:39 PM
olivier marked an inline comment as done.
olivier added inline comments.
sys/aio/aio_test.c
848

Because I was seeing this in my Jenkins error report:

Files left in work directory after failure: mdunit_link

But you're right, I need to remove it: it is useless, since Kyua removes the whole tempdir at the end, and less code is better.

1834

OK, will propose a new version!

olivier marked an inline comment as done.
olivier edited the summary of this revision.

Increase the MD device size to support larger (non-default) vfs.aio.max_buf_aio values

Added a new, simpler version: just increase the MD size.

asomers requested changes to this revision. Tue, Oct 7, 3:51 PM

The MD_LEN variable is now badly named. I think it's ok to leave its value the same. Most test cases don't require 1 MB of I/O. But you should at least change that name.

This revision now requires changes to proceed. Tue, Oct 7, 3:51 PM

So, let's use DEVICE_IO_LEN then? It clarifies that this constant represents the I/O buffer length used for device tests (16 KB), not the size of the MD device itself, which is now 1 MB.

Ok, that should work.

Rename MD_LEN to DEVICE_IO_LEN

This revision is now accepted and ready to land. Tue, Oct 7, 8:14 PM