
CAM SMR (Shingled Magnetic Recording) and SATA passthrough patches
Closed, Public

Authored by ken on Apr 29 2016, 3:28 AM.

Details

Summary

This includes support for Host Managed, Host Aware and Drive Managed SMR drives that are either SCSI (ZBC) or ATA (ZAC) attached via a SAS controller. This also includes support for Host Aware SATA drives that are attached via a SATA controller.

The big drive vendors are moving to SMR for at least some of their drives. The primary challenge with SMR is that it requires writing a relatively large zone sequentially starting at the beginning of the zone. The usual zone size is 256MB. It is conceptually almost like having a 256MB sector size.
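The sequential-write constraint can be sketched with a little arithmetic. This is a simplified model rather than code from the patches: it assumes 256 MiB zones and 512-byte logical blocks, and the names zone_index(), zone_start_lba() and write_is_sequential() are hypothetical. Real drives report their actual zone layout and write pointers through the report zones command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration only: 256 MiB zones, 512-byte blocks. */
#define ZONE_BYTES	(256ULL * 1024 * 1024)
#define BLOCK_BYTES	512ULL
#define ZONE_BLOCKS	(ZONE_BYTES / BLOCK_BYTES)	/* 524288 blocks */

/* Index of the zone that contains the given LBA. */
static uint64_t
zone_index(uint64_t lba)
{
	return (lba / ZONE_BLOCKS);
}

/* First LBA of the zone that contains the given LBA. */
static uint64_t
zone_start_lba(uint64_t lba)
{
	return (zone_index(lba) * ZONE_BLOCKS);
}

/*
 * Simplified model of a sequential zone: a write is in order only if
 * it begins exactly at the zone's write pointer and does not run past
 * the end of that zone.
 */
static bool
write_is_sequential(uint64_t zone_start, uint64_t write_pointer,
    uint64_t lba, uint64_t nblocks)
{
	assert(nblocks > 0);
	assert(write_pointer >= zone_start);
	return (lba == write_pointer &&
	    lba + nblocks <= zone_start + ZONE_BLOCKS);
}
```

Under these assumptions, a write to LBA 524300 lands in zone 1, which starts at LBA 524288, and is in order only if that zone's write pointer is currently 524300.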

We (Spectra Logic) are working on ZFS changes that will use this CAM and GEOM infrastructure to make ZFS play well with SMR drives. Those changes aren't yet done.

These patches include:
o A new 'camcontrol zone' command that allows displaying and managing drive zones via SCSI/ATA passthrough.
o A new zonectl(8) utility that uses the new DIOCZONECMD ioctl to display and manage zones via the da(4) (and later ada(4)) driver.
o Changes to diskinfo -v to display the zone mode of a drive.
o A new disk zone API, sys/sys/disk_zone.h.
o A new bio type, BIO_ZONE, and modifications to GEOM to support it. This new bio will allow filesystems to query zone support in a drive and manage zoned drives.
o Extensive modifications to the da(4) driver to handle probing SCSI SMR drives and SATA SMR drives attached behind SAS controllers.
o Modifications to the ada(4) driver to handle probing of SATA SMR drives.
o ATA passthrough improvements, including support for the new 32-byte SCSI ATA PASS-THROUGH(32) command that supports the auxiliary register. This hasn't been tested due to lack of a SAT layer that implements it. It isn't used in the da(4) driver, because if a SAT layer supports the 32 byte passthrough command, it will likely also support ZBC -> ZAC translation.
o A new camcontrol epc subcommand that allows getting and setting Extended Power Conditions timers for ATA drives that support them.

If you look through the code, you'll notice that the disk_zone.h API is separate from the SCSI and ATA APIs. The intent is to allow filesystems and other consumers of the API to just talk to the disk zone API without dealing with the SCSI and ATA specifics. Another reason behind all of this is that even though the SCSI ZBC and ATA ZAC specs were developed in concert, and are intended to be functionally identical, they are still SCSI and ATA. As usual, SCSI is big endian and ATA is little endian. So to present a common API to the filesystem, we give all of the zone data back in native byte order, regardless of the underlying device protocol.
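To illustrate the byte order point, here is a minimal sketch of decoding a 64-bit zone field (for example a zone start LBA) into native byte order. It is not taken from the patches: the enum zone_proto tag and the function names are hypothetical, and the decoders are written out portably. On FreeBSD, be64dec() and le64dec() from <sys/endian.h> do the same job.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag for the protocol the raw data came from. */
enum zone_proto { ZONE_PROTO_SCSI, ZONE_PROTO_ATA };

/* Decode 8 bytes stored most significant byte first (SCSI/ZBC). */
static uint64_t
decode_be64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return (v);
}

/* Decode 8 bytes stored least significant byte first (ATA/ZAC). */
static uint64_t
decode_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return (v);
}

/* Return the field in native byte order, whatever the wire format. */
static uint64_t
zone_field_to_host(enum zone_proto proto, const uint8_t *p)
{
	assert(p != NULL);
	return (proto == ZONE_PROTO_SCSI ? decode_be64(p) : decode_le64(p));
}
```

A consumer that only ever sees the decoded value does not need to know which protocol the drive speaks, which is the point of keeping the disk zone API separate.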

Another thing to note is the extensive use of ATA passthrough in the da(4) driver. This is necessary because although the SCSI SAT (SCSI to ATA Translation) specification has been updated to include translation of SCSI zone commands (ZBC) to ATA zone commands (ZAC), LSI/Avago/Broadcom has not yet released SAS controller firmware that implements it. That said, the da(4) driver is set up to prefer the SCSI command set, because that will almost certainly be more efficient than using ATA passthrough. (ATA passthrough commands are typically single-stepped by necessity, although that could theoretically change for NCQ commands.)

I have only tested the code so far with Seagate SATA Drive Managed and Host Aware drives. I would appreciate testing with any drives. (And testing to make sure that the patches don't cause problems with existing hardware.) Right now, all you can really do is manage the zones manually using camcontrol(8) or zonectl(8). Automatic management will come with the ZFS changes. (Or changes to other filesystems if people want to do it.)

If you have a SATA Host Aware drive, in theory camcontrol(8) should allow you to manage the drive if you have it attached to a SATA controller.

Test Plan

Get a zone list with camcontrol:

camcontrol zone da0 -c rz -v

Get a zone list with zonectl:

zonectl -d /dev/da0 -c rz

Look at the drive zone status:

diskinfo -v /dev/da0
sysctl kern.cam.da.0

Look at EPC power modes:

camcontrol epc da0 -c list

Set the drive to spin down after an hour of idle time:

camcontrol epc da0 -c timer -T 3600 -p Standby_z -e -s

Tell the drive to go to the Standby_z power state:

camcontrol epc da4 -c goto -p Standby_z

Check the current drive state without causing it to spin up:

camcontrol epc da0 -c status -P

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

ken retitled this revision to CAM SMR (Shingled Magnetic Recording) and SATA passthrough patches.
ken updated this object.
ken edited the test plan for this revision.
ken added reviewers: scottl, imp, asomers, slm, mav, smh.
ken set the repository for this revision to rS FreeBSD src repository - subversion.
ken edited edge metadata.
ken added a reviewer: trasz.

Add two changes:

Change 1217733 on 2016/04/29 by kenm@ken.spectrabsd

Enable NCQ capabilities for ZAC management commands in the ada(4)
driver.

Since we have the ability to set the SATA auxiliary register for
ATA CCBs in FreeBSD/head now, we can enable NCQ ZAC commands.

This has not yet been tested.

sys/cam/ata/ata_all.c:
        In ata_zac_mgmt_out() and ata_zac_mgmt_in(), enable the
        use_ncq code that was previously commented out.  Now that
        the spec defines what is needed and we have the ability to
        set the auxiliary register, we can do NCQ ZAC commands with
        hardware that supports those commands.

sys/cam/ata/ata_da.c:
        Change the ADA_FLAG_PIM_CAN_NCQ_TRIM bit to
        ADA_FLAG_PIM_ATA_EXT, since it really is a more generic
        driver and controller capability to set the auxiliary
        register.

        If the controller and driver support it, use NCQ for the
        ZAC management commands.  We might need to further qualify
        this if there is a way to identify whether a given drive
        supports NCQ ZAC commands.

Change 1217633 on 2016/04/29 by kenm@ken.spectrabsd

Merge change 1217630 into my test3 branch:

Change 1217630 by kenm@ken.spectrabsd8 on 2016/04/29 14:34:28

Add some missing commands to ata_op_string and re-order the various
SET FEATURES subcommands.

sys/cam/ata/ata_all.c:
        In ata_op_string(), add a few missing commands, like READ
        LOG DMA EXT.

        Add decoding for the subcommands of NCQ NON-DATA, SEND
        FPDMA QUEUED, and RECEIVE FPDMA QUEUED.

        Re-order the SET FEATURES subcommands in numeric order, so
        that it is easier to determine what is and is not in the
        list.

Only looked at the man pages, because Phabricator is acting flaky and really seems to want to lose my comments.

sbin/camcontrol/camcontrol.8
2197 ↗(On Diff #15754)

Use the serial comma:
s/Finish/Finish,/

2232 ↗(On Diff #15754)

"recommends should have" is weird. Maybe

Report zones where the device recommends resetting write pointers.
2332 ↗(On Diff #15754)

Typo.

2349 ↗(On Diff #15754)

Spaces around the "/" make the options a little less clear. Is it "condition enable"/"disable state"? Or "condition enable/disable state"?

2360 ↗(On Diff #15754)

As above, spaces around slash.

2676 ↗(On Diff #15754)

Passive->active. Also probably a little clearer using .Pa for the device names. Or maybe there is a better macro.

Set the timer for the Idle_a power condition on drive
.Pa ada0
to 60.1 seconds, enable that particular power condition, and save the timer
value and the enabled state of the power condition.
2684 ↗(On Diff #15754)

"should" is best used as a recommendation. Is the aside saying "(which is probably the drive's lowest power state)"?

So maybe:

Tell drive
.Pa da4
to go to the Standby_z power state (which is probably the
drive's lowest power state) and hold in that state until it is
explicitly released by another
.Cm goto
command.
2692 ↗(On Diff #15754)

Kind of redundant, also passive "will report" rather than active "report".

Report only the current power state of drive
.Pa da .
2699 ↗(On Diff #15754)

Passive->active:

Display the ATA Power Conditions log (Log Address 0x08) for drive
.Pa ada0 .
ken edited edge metadata.
  1. Updated to fix things pointed out by wblock.
  2. Limit the size of data fetched by REPORT ZONES to whatever the controller / system can support.
  3. Fix SEND / RECEIVE FPDMA QUEUED in the ahci(4) driver.
  4. Add a hidden option in camcontrol(8) to enable sending an NCQ report zones request. The ada(4) driver will do it automatically if the controller supports it.
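Item 2 above can be sketched roughly as follows. This is not the code from the diff, only an illustration of the idea: the function names are hypothetical, and it assumes the ZBC REPORT ZONES reply layout of a 64-byte header followed by 64-byte zone descriptors. The allocation length is clamped to the largest transfer the controller or system can do, then rounded down so that the reply holds only whole descriptors.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed REPORT ZONES reply layout: 64-byte header, 64-byte descriptors. */
#define RZ_HEADER_BYTES	64u
#define RZ_DESC_BYTES	64u

/*
 * Hypothetical helper: pick an allocation length for a request that
 * would like to fetch wanted_zones descriptors, without exceeding
 * max_io_bytes.
 */
static uint32_t
report_zones_alloc_len(uint32_t wanted_zones, uint32_t max_io_bytes)
{
	uint64_t want;
	uint32_t fit;

	/* The limit must leave room for the header and one descriptor. */
	assert(max_io_bytes >= RZ_HEADER_BYTES + RZ_DESC_BYTES);

	want = RZ_HEADER_BYTES + (uint64_t)wanted_zones * RZ_DESC_BYTES;
	if (want <= max_io_bytes)
		return ((uint32_t)want);

	/* Too big: keep as many whole descriptors as fit under the limit. */
	fit = (max_io_bytes - RZ_HEADER_BYTES) / RZ_DESC_BYTES;
	return (RZ_HEADER_BYTES + fit * RZ_DESC_BYTES);
}

/* Number of whole zone descriptors carried by a reply of len bytes. */
static uint32_t
report_zones_desc_count(uint32_t len)
{
	assert(len >= RZ_HEADER_BYTES);
	return ((len - RZ_HEADER_BYTES) / RZ_DESC_BYTES);
}
```

With a 128 KiB transfer limit, a request for 30000 zones is trimmed to 131072 bytes, which carries 2047 descriptors. The caller would then issue further requests starting at the next zone to walk the rest of the drive.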
ken marked 9 inline comments as done. May 17 2016, 2:15 PM

Fixed issues pointed out by wblock.

sbin/camcontrol/camcontrol.8
2347 ↗(On Diff #16470)

Typo is still there... specifically, "required".

sys/cam/ata/ata_da.c
1563 ↗(On Diff #16470)

Duplicated word "do do".

2658 ↗(On Diff #16470)

Spelling: definitions

sys/cam/scsi/scsi_da.c
3948 ↗(On Diff #16470)

Spelling: definitions

sys/sys/ata.h
408 ↗(On Diff #16470)

Spelling: receive

usr.sbin/zonectl/zonectl.8
138 ↗(On Diff #16470)

Use the serial comma: s/Finish/Finish,/

210 ↗(On Diff #16470)

On this and the next few examples, use active rather than passive.

"Display" instead of "This will display"
"Issue" instead of "This will issue"

ken marked 5 inline comments as done. May 18 2016, 5:26 PM

Thanks for the review!

ken edited edge metadata.

Fix things pointed out by wblock.

wblock added a reviewer: wblock.

The man pages look okay. Always a good idea to run igor -R and mandoc -Tlint on them before commit, too. Thank you!

This revision is now accepted and ready to land. May 19 2016, 4:18 AM
ken edited edge metadata.

A couple of minor edits, and bump __FreeBSD_version.

This revision now requires review to proceed. May 19 2016, 1:41 PM
This revision was automatically updated to reflect the committed changes.