Index: stable/10/share/man/man4/pass.4 =================================================================== --- stable/10/share/man/man4/pass.4 (revision 292347) +++ stable/10/share/man/man4/pass.4 (revision 292348) @@ -1,120 +1,232 @@ .\" .\" Copyright (c) 1998, 1999 Kenneth D. Merry. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. The name of the author may not be used to endorse or promote products .\" derived from this software without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd October 10, 1998 +.Dd March 17, 2015 .Dt PASS 4 .Os .Sh NAME .Nm pass .Nd CAM application passthrough driver .Sh SYNOPSIS .Cd device pass .Sh DESCRIPTION The .Nm driver provides a way for userland applications to issue CAM CCBs to the kernel. .Pp Since the .Nm driver allows direct access to the CAM subsystem, system administrators should exercise caution when granting access to this driver. If used improperly, this driver can allow userland applications to crash a machine or cause data loss. .Pp The .Nm driver attaches to every .Tn SCSI +and +.Tn ATA device found in the system. Since it attaches to every device, it provides a generic means of accessing .Tn SCSI +and +.Tn ATA devices, and allows the user to access devices which have no "standard" peripheral driver associated with them. .Sh KERNEL CONFIGURATION It is only necessary to configure one .Nm device in the kernel; .Nm devices are automatically allocated as .Tn SCSI +and +.Tn ATA devices are found. .Sh IOCTLS -.Bl -tag -width 012345678901234 -.It CAMIOCOMMAND +.Bl -tag -width 5n +.It CAMIOCOMMAND union ccb * This ioctl takes most kinds of CAM CCBs and passes them through to the CAM transport layer for action. Note that some CCB types are not allowed through the passthrough device, and must be sent through the .Xr xpt 4 device instead. Some examples of xpt-only CCBs are XPT_SCAN_BUS, XPT_DEV_MATCH, XPT_RESET_BUS, XPT_SCAN_LUN, XPT_ENG_INQ, and XPT_ENG_EXEC. These CCB types have various attributes that make it illogical or impossible to service them through the passthrough interface. -.It CAMGETPASSTHRU +.It CAMGETPASSTHRU union ccb * This ioctl takes an XPT_GDEVLIST CCB, and returns the passthrough device corresponding to the device in question. 
Although this ioctl is available through the .Nm driver, it is of limited use, since the caller must already know that the device in question is a passthrough device if they are issuing this ioctl. It is probably more useful to issue this ioctl through the .Xr xpt 4 device. +.It CAMIOQUEUE union ccb * +Queue a CCB to the +.Xr pass 4 +driver to be executed asynchronously. +The caller may use +.Xr select 2 , +.Xr poll 2 +or +.Xr kevent 2 +to receive notification when the CCB has completed. +.Pp +This ioctl takes most CAM CCBs, but some CCB types are not allowed through +the pass device, and must be sent through the +.Xr xpt 4 +device instead. +Some examples of xpt-only CCBs are XPT_SCAN_BUS, +XPT_DEV_MATCH, XPT_RESET_BUS, XPT_SCAN_LUN, XPT_ENG_INQ, and XPT_ENG_EXEC. +These CCB types have various attributes that make it illogical or +impossible to service them through the passthrough interface. +.Pp +Although the +.Dv CAMIOQUEUE +ioctl is not defined to take an argument, it does require a +pointer to a union ccb. +It is not defined to take an argument to avoid an extra malloc and copy +inside the generic +.Xr ioctl 2 +handler. +.Pp +The completed CCB will be returned via the +.Dv CAMIOGET +ioctl. +An error will only be returned from the +.Dv CAMIOQUEUE +ioctl if there is an error allocating memory for the request or copying +memory from userland. +All other errors will be reported as standard CAM CCB status errors. +Since the CCB is not copied back to the user process from the pass driver +in the +.Dv CAMIOQUEUE +ioctl, the user's passed-in CCB will not be modified. +This is the case even with immediate CCBs. +Instead, the completed CCB must be retrieved via the +.Dv CAMIOGET +ioctl and the status examined. +.Pp +Multiple CCBs may be queued via the +.Dv CAMIOQUEUE +ioctl at any given time, and they may complete in a different order than +they were submitted. +The caller must take steps to identify CCBs that are queued and completed. +The +.Dv periph_priv +structure inside struct ccb_hdr is available for userland use with the +.Dv CAMIOQUEUE +and +.Dv CAMIOGET +ioctls, and will be preserved across calls. +Also, the periph_links linked list pointers inside struct ccb_hdr are +available for userland use with the +.Dv CAMIOQUEUE +and +.Dv CAMIOGET +ioctls and will be preserved across calls. +.It CAMIOGET union ccb * +Retrieve completed CAM CCBs queued via the +.Dv CAMIOQUEUE +ioctl. +An error will only be returned from the +.Dv CAMIOGET +ioctl if the +.Xr pass 4 +driver fails to copy data to the user process or if there are no completed +CCBs available to retrieve. +If no CCBs are available to retrieve, +errno will be set to +.Dv ENOENT . +.Pp +All other errors will be reported as standard CAM CCB status errors. +.Pp +Although the +.Dv CAMIOGET +ioctl is not defined to take an argument, it does require a +pointer to a union ccb. +It is not defined to take an argument to avoid an extra malloc and copy +inside the generic +.Xr ioctl 2 +handler. +.Pp +The pass driver will report via +.Xr select 2 , +.Xr poll 2 +or +.Xr kevent 2 +when a CCB has completed. +One CCB may be retrieved per +.Dv CAMIOGET +call. +CCBs may be returned in a different order than they were submitted, +so the caller should use the +.Dv periph_priv +area inside the CCB header to store pointers to identifying information. .El
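.Pp
The following abbreviated sketch shows one possible way to drive the
asynchronous interface from userland with
.Xr cam 3 ;
the device path, data buffer, and timeout are examples only, and error
handling is trimmed for brevity:
.Bd -literal -offset indent
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <err.h>
#include <fcntl.h>
#include <camlib.h>
#include <cam/scsi/scsi_message.h>

int
main(void)
{
	struct cam_device *dev;
	union ccb *ccb;
	struct kevent kv;
	char buf[256];
	int kq;

	/* "/dev/pass0" is an example device. */
	if ((dev = cam_open_device("/dev/pass0", O_RDWR)) == NULL)
		errx(1, "%s", cam_errbuf);
	if ((ccb = cam_getccb(dev)) == NULL)
		errx(1, "cam_getccb failed");

	/* Build a SCSI INQUIRY in the CCB. */
	scsi_inquiry(&ccb->csio, /*retries*/ 1, /*cbfcnp*/ NULL,
	    MSG_SIMPLE_Q_TAG, (u_int8_t *)buf, sizeof(buf),
	    /*evpd*/ 0, /*page_code*/ 0, SSD_FULL_SIZE,
	    /*timeout*/ 5000);

	/* Queue the CCB; this returns before the CCB completes. */
	if (ioctl(dev->fd, CAMIOQUEUE, ccb) == -1)
		err(1, "CAMIOQUEUE");

	/* Wait for completion, then collect the finished CCB. */
	if ((kq = kqueue()) == -1)
		err(1, "kqueue");
	EV_SET(&kv, dev->fd, EVFILT_READ, EV_ADD | EV_ENABLE,
	    0, 0, NULL);
	if (kevent(kq, &kv, 1, &kv, 1, NULL) == -1)
		err(1, "kevent");
	if (ioctl(dev->fd, CAMIOGET, ccb) == -1)
		err(1, "CAMIOGET");

	if ((ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP)
		errx(1, "INQUIRY failed, CAM status %#x",
		    ccb->ccb_h.status);
	return (0);
}
.Ed
.Pp
When several CCBs are kept in flight, the
.Dv periph_priv
fields described above can be used to match completions to requests.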
.Sh FILES .Bl -tag -width /dev/passn -compact .It Pa /dev/pass Ns Ar n Character device nodes for the .Nm driver. There should be one of these for each device accessed through the CAM subsystem. .El .Sh DIAGNOSTICS None. .Sh SEE ALSO +.Xr kqueue 2 , +.Xr poll 2 , +.Xr select 2 , .Xr cam 3 , .Xr cam 4 , .Xr cam_cdbparse 3 , +.Xr cd 4 , +.Xr ctl 4 , +.Xr da 4 , +.Xr sa 4 , .Xr xpt 4 , -.Xr camcontrol 8 +.Xr camcontrol 8 , +.Xr camdd 8 .Sh HISTORY The CAM passthrough driver first appeared in .Fx 3.0 . .Sh AUTHORS .An Kenneth Merry Aq ken@FreeBSD.org -.Sh BUGS -It might be nice to have a way to asynchronously send CCBs through the -passthrough driver. -This would probably require some sort of read/write -interface or an asynchronous ioctl interface. Index: stable/10/sys/cam/ata/ata_da.c =================================================================== --- stable/10/sys/cam/ata/ata_da.c (revision 292347) +++ stable/10/sys/cam/ata/ata_da.c (revision 292348) @@ -1,2145 +1,2156 @@ /*- * Copyright (c) 2009 Alexander Motin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #include __FBSDID("$FreeBSD$"); #include "opt_ada.h" #include #ifdef _KERNEL #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #endif /* _KERNEL */ #ifndef _KERNEL #include #include #endif /* _KERNEL */ #include #include #include #include #include #include #include /* geometry translation */ #ifdef _KERNEL #define ATA_MAX_28BIT_LBA 268435455UL typedef enum { ADA_STATE_RAHEAD, ADA_STATE_WCACHE, ADA_STATE_NORMAL } ada_state; typedef enum { ADA_FLAG_CAN_48BIT = 0x0002, ADA_FLAG_CAN_FLUSHCACHE = 0x0004, ADA_FLAG_CAN_NCQ = 0x0008, ADA_FLAG_CAN_DMA = 0x0010, ADA_FLAG_NEED_OTAG = 0x0020, ADA_FLAG_WAS_OTAG = 0x0040, ADA_FLAG_CAN_TRIM = 0x0080, ADA_FLAG_OPEN = 0x0100, ADA_FLAG_SCTX_INIT = 0x0200, ADA_FLAG_CAN_CFA = 0x0400, ADA_FLAG_CAN_POWERMGT = 0x0800, ADA_FLAG_CAN_DMA48 = 0x1000, ADA_FLAG_DIRTY = 0x2000 } ada_flags; typedef enum { ADA_Q_NONE = 0x00, ADA_Q_4K = 0x01, } ada_quirks; #define ADA_Q_BIT_STRING \ "\020" \ "\0014K" typedef enum { ADA_CCB_RAHEAD = 0x01, ADA_CCB_WCACHE = 0x02, ADA_CCB_BUFFER_IO = 0x03, ADA_CCB_DUMP = 0x05, ADA_CCB_TRIM = 0x06, ADA_CCB_TYPE_MASK = 0x0F, } ada_ccb_state; /* Offsets into our private area for storing information */ #define ccb_state ppriv_field0 #define ccb_bp ppriv_ptr1 struct disk_params { u_int8_t heads; u_int8_t secs_per_track; u_int32_t cylinders; u_int32_t secsize; /* Number of bytes/logical sector */ u_int64_t sectors; /* Total number sectors */ }; #define TRIM_MAX_BLOCKS 8 #define TRIM_MAX_RANGES (TRIM_MAX_BLOCKS * ATA_DSM_BLK_RANGES) struct trim_request { uint8_t data[TRIM_MAX_RANGES * ATA_DSM_RANGE_SIZE]; TAILQ_HEAD(, bio) bps; }; struct ada_softc { struct bio_queue_head bio_queue; struct bio_queue_head trim_queue; int outstanding_cmds; /* Number of active commands */ int refcount; /* Active xpt_action() calls */ ada_state state; ada_flags flags; ada_quirks quirks; int sort_io_queue; int trim_max_ranges; int trim_running; int read_ahead; int write_cache; #ifdef ADA_TEST_FAILURE int force_read_error; int force_write_error; int periodic_read_error; int periodic_read_count; #endif struct disk_params params; struct disk *disk; struct task sysctl_task; struct sysctl_ctx_list sysctl_ctx; struct sysctl_oid *sysctl_tree; struct callout sendordered_c; struct trim_request trim_req; }; struct ada_quirk_entry { struct scsi_inquiry_pattern inq_pat; ada_quirks quirks; }; static struct ada_quirk_entry ada_quirk_table[] = { { /* Hitachi Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "Hitachi H??????????E3*", "*" }, /*quirks*/ADA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "SAMSUNG HD155UI*", "*" }, /*quirks*/ADA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "SAMSUNG HD204UI*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST????DL*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Barracuda Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST???DM*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Barracuda Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST????DM*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST9500423AS*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST9500424AS*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Momentus 
Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST9640423AS*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST9640424AS*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST9750420AS*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST9750422AS*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST9750423AS*", "*" }, /*quirks*/ADA_Q_4K }, { /* Seagate Momentus Thin Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "ST???LT*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Caviar Red Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD????CX*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD????RS*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Caviar Green/Red Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD????RX*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Caviar Red Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD??????CX*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Caviar Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD??????EX*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD??????RS*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD??????RX*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD???PKT*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD?????PKT*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD???PVT*", "*" }, /*quirks*/ADA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "WDC WD?????PVT*", "*" }, /*quirks*/ADA_Q_4K }, /* SSDs */ { /* * Corsair Force 2 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "Corsair CSSD-F*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Corsair Force 3 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "Corsair Force 3*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Corsair Neutron GTX SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "Corsair Neutron GTX*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Corsair Force GT & GS SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "Corsair Force G*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Crucial M4 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "M4-CT???M4SSD2*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Crucial RealSSD C300 SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "C300-CTFDDAC???MAG*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Intel 320 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "INTEL SSDSA2CW*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Intel 330 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { 
T_DIRECT, SIP_MEDIA_FIXED, "*", "INTEL SSDSC2CT*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Intel 510 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "INTEL SSDSC2MH*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Intel 520 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "INTEL SSDSC2BW*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Intel X25-M Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "INTEL SSDSA2M*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Kingston E100 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "KINGSTON SE100S3*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Kingston HyperX 3k SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "KINGSTON SH103S3*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Marvell SSDs (entry taken from OpenSolaris) * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "MARVELL SD88SA02*", "*" }, /*quirks*/ADA_Q_4K }, { /* * OCZ Agility 2 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "OCZ-AGILITY2*", "*" }, /*quirks*/ADA_Q_4K }, { /* * OCZ Agility 3 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "OCZ-AGILITY3*", "*" }, /*quirks*/ADA_Q_4K }, { /* * OCZ Deneva R Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "DENRSTE251M45*", "*" }, /*quirks*/ADA_Q_4K }, { /* * OCZ Vertex 2 SSDs (inc pro series) * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "OCZ?VERTEX2*", "*" }, /*quirks*/ADA_Q_4K }, { /* * OCZ Vertex 3 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "OCZ-VERTEX3*", "*" }, /*quirks*/ADA_Q_4K }, { /* * OCZ Vertex 4 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "OCZ-VERTEX4*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Samsung 830 Series SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "SAMSUNG SSD 830 Series*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Samsung 840 SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "Samsung SSD 840*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Samsung 843T Series SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "SAMSUNG MZ7WD*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Samsung 850 SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "Samsung SSD 850*", "*" }, /*quirks*/ADA_Q_4K }, { /* * Samsung PM853T Series SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "SAMSUNG MZ7GE*", "*" }, /*quirks*/ADA_Q_4K }, { /* * SuperTalent TeraDrive CT SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "FTM??CT25H*", "*" }, /*quirks*/ADA_Q_4K }, { /* * XceedIOPS SATA SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "SG9XCS2D*", "*" }, /*quirks*/ADA_Q_4K }, { /* Default */ { T_ANY, SIP_MEDIA_REMOVABLE|SIP_MEDIA_FIXED, /*vendor*/"*", /*product*/"*", /*revision*/"*" }, /*quirks*/0 }, }; static disk_strategy_t adastrategy; static dumper_t adadump; static periph_init_t adainit; static void adaasync(void *callback_arg, u_int32_t code, struct cam_path *path, void *arg); static void adasysctlinit(void *context, int pending); static periph_ctor_t 
adaregister; static periph_dtor_t adacleanup; static periph_start_t adastart; static periph_oninv_t adaoninvalidate; static void adadone(struct cam_periph *periph, union ccb *done_ccb); static int adaerror(union ccb *ccb, u_int32_t cam_flags, u_int32_t sense_flags); static void adagetparams(struct cam_periph *periph, struct ccb_getdev *cgd); static timeout_t adasendorderedtag; static void adashutdown(void *arg, int howto); static void adasuspend(void *arg); static void adaresume(void *arg); #ifndef ADA_DEFAULT_LEGACY_ALIASES #define ADA_DEFAULT_LEGACY_ALIASES 1 #endif #ifndef ADA_DEFAULT_TIMEOUT #define ADA_DEFAULT_TIMEOUT 30 /* Timeout in seconds */ #endif #ifndef ADA_DEFAULT_RETRY #define ADA_DEFAULT_RETRY 4 #endif #ifndef ADA_DEFAULT_SEND_ORDERED #define ADA_DEFAULT_SEND_ORDERED 1 #endif #ifndef ADA_DEFAULT_SPINDOWN_SHUTDOWN #define ADA_DEFAULT_SPINDOWN_SHUTDOWN 1 #endif #ifndef ADA_DEFAULT_SPINDOWN_SUSPEND #define ADA_DEFAULT_SPINDOWN_SUSPEND 1 #endif #ifndef ADA_DEFAULT_READ_AHEAD #define ADA_DEFAULT_READ_AHEAD 1 #endif #ifndef ADA_DEFAULT_WRITE_CACHE #define ADA_DEFAULT_WRITE_CACHE 1 #endif #define ADA_RA (softc->read_ahead >= 0 ? \ softc->read_ahead : ada_read_ahead) #define ADA_WC (softc->write_cache >= 0 ? \ softc->write_cache : ada_write_cache) #define ADA_SIO (softc->sort_io_queue >= 0 ? \ softc->sort_io_queue : cam_sort_io_queues) /* * Most platforms map firmware geometry to actual, but some don't. If * not overridden, default to nothing. */ #ifndef ata_disk_firmware_geom_adjust #define ata_disk_firmware_geom_adjust(disk) #endif static int ada_legacy_aliases = ADA_DEFAULT_LEGACY_ALIASES; static int ada_retry_count = ADA_DEFAULT_RETRY; static int ada_default_timeout = ADA_DEFAULT_TIMEOUT; static int ada_send_ordered = ADA_DEFAULT_SEND_ORDERED; static int ada_spindown_shutdown = ADA_DEFAULT_SPINDOWN_SHUTDOWN; static int ada_spindown_suspend = ADA_DEFAULT_SPINDOWN_SUSPEND; static int ada_read_ahead = ADA_DEFAULT_READ_AHEAD; static int ada_write_cache = ADA_DEFAULT_WRITE_CACHE; static SYSCTL_NODE(_kern_cam, OID_AUTO, ada, CTLFLAG_RD, 0, "CAM Direct Access Disk driver"); SYSCTL_INT(_kern_cam_ada, OID_AUTO, legacy_aliases, CTLFLAG_RW, &ada_legacy_aliases, 0, "Create legacy-like device aliases"); TUNABLE_INT("kern.cam.ada.legacy_aliases", &ada_legacy_aliases); SYSCTL_INT(_kern_cam_ada, OID_AUTO, retry_count, CTLFLAG_RW, &ada_retry_count, 0, "Normal I/O retry count"); TUNABLE_INT("kern.cam.ada.retry_count", &ada_retry_count); SYSCTL_INT(_kern_cam_ada, OID_AUTO, default_timeout, CTLFLAG_RW, &ada_default_timeout, 0, "Normal I/O timeout (in seconds)"); TUNABLE_INT("kern.cam.ada.default_timeout", &ada_default_timeout); SYSCTL_INT(_kern_cam_ada, OID_AUTO, send_ordered, CTLFLAG_RW, &ada_send_ordered, 0, "Send Ordered Tags"); TUNABLE_INT("kern.cam.ada.send_ordered", &ada_send_ordered); SYSCTL_INT(_kern_cam_ada, OID_AUTO, spindown_shutdown, CTLFLAG_RW, &ada_spindown_shutdown, 0, "Spin down upon shutdown"); TUNABLE_INT("kern.cam.ada.spindown_shutdown", &ada_spindown_shutdown); SYSCTL_INT(_kern_cam_ada, OID_AUTO, spindown_suspend, CTLFLAG_RW, &ada_spindown_suspend, 0, "Spin down upon suspend"); TUNABLE_INT("kern.cam.ada.spindown_suspend", &ada_spindown_suspend); SYSCTL_INT(_kern_cam_ada, OID_AUTO, read_ahead, CTLFLAG_RW, &ada_read_ahead, 0, "Enable disk read-ahead"); TUNABLE_INT("kern.cam.ada.read_ahead", &ada_read_ahead); SYSCTL_INT(_kern_cam_ada, OID_AUTO, write_cache, CTLFLAG_RW, &ada_write_cache, 0, "Enable disk write cache"); TUNABLE_INT("kern.cam.ada.write_cache", &ada_write_cache); /* * 
ADA_ORDEREDTAG_INTERVAL determines how often, relative * to the default timeout, we check to see whether an ordered * tagged transaction is appropriate to prevent simple tag * starvation. Since we'd like to ensure that there is at least * 1/2 of the timeout length left for a starved transaction to * complete after we've sent an ordered tag, we must poll at least * four times in every timeout period. This takes care of the worst * case where a starved transaction starts during an interval that * meets the requirement "don't send an ordered tag" test so it takes * us two intervals to determine that a tag must be sent. */ #ifndef ADA_ORDEREDTAG_INTERVAL #define ADA_ORDEREDTAG_INTERVAL 4 #endif static struct periph_driver adadriver = { adainit, "ada", TAILQ_HEAD_INITIALIZER(adadriver.units), /* generation */ 0 }; PERIPHDRIVER_DECLARE(ada, adadriver); static int adaopen(struct disk *dp) { struct cam_periph *periph; struct ada_softc *softc; int error; periph = (struct cam_periph *)dp->d_drv1; if (cam_periph_acquire(periph) != CAM_REQ_CMP) { return(ENXIO); } cam_periph_lock(periph); if ((error = cam_periph_hold(periph, PRIBIO|PCATCH)) != 0) { cam_periph_unlock(periph); cam_periph_release(periph); return (error); } CAM_DEBUG(periph->path, CAM_DEBUG_TRACE | CAM_DEBUG_PERIPH, ("adaopen\n")); softc = (struct ada_softc *)periph->softc; softc->flags |= ADA_FLAG_OPEN; cam_periph_unhold(periph); cam_periph_unlock(periph); return (0); } static int adaclose(struct disk *dp) { struct cam_periph *periph; struct ada_softc *softc; union ccb *ccb; int error; periph = (struct cam_periph *)dp->d_drv1; softc = (struct ada_softc *)periph->softc; cam_periph_lock(periph); CAM_DEBUG(periph->path, CAM_DEBUG_TRACE | CAM_DEBUG_PERIPH, ("adaclose\n")); /* We only sync the cache if the drive is capable of it. */ if ((softc->flags & ADA_FLAG_DIRTY) != 0 && (softc->flags & ADA_FLAG_CAN_FLUSHCACHE) != 0 && (periph->flags & CAM_PERIPH_INVALID) == 0 && cam_periph_hold(periph, PRIBIO) == 0) { ccb = cam_periph_getccb(periph, CAM_PRIORITY_NORMAL); cam_fill_ataio(&ccb->ataio, 1, adadone, CAM_DIR_NONE, 0, NULL, 0, ada_default_timeout*1000); if (softc->flags & ADA_FLAG_CAN_48BIT) ata_48bit_cmd(&ccb->ataio, ATA_FLUSHCACHE48, 0, 0, 0); else ata_28bit_cmd(&ccb->ataio, ATA_FLUSHCACHE, 0, 0, 0); error = cam_periph_runccb(ccb, adaerror, /*cam_flags*/0, /*sense_flags*/0, softc->disk->d_devstat); if (error != 0) xpt_print(periph->path, "Synchronize cache failed\n"); else softc->flags &= ~ADA_FLAG_DIRTY; xpt_release_ccb(ccb); cam_periph_unhold(periph); } softc->flags &= ~ADA_FLAG_OPEN; while (softc->refcount != 0) cam_periph_sleep(periph, &softc->refcount, PRIBIO, "adaclose", 1); cam_periph_unlock(periph); cam_periph_release(periph); return (0); } static void adaschedule(struct cam_periph *periph) { struct ada_softc *softc = (struct ada_softc *)periph->softc; if (softc->state != ADA_STATE_NORMAL) return; /* Check if we have more work to do. */ if (bioq_first(&softc->bio_queue) || (!softc->trim_running && bioq_first(&softc->trim_queue))) { xpt_schedule(periph, CAM_PRIORITY_NORMAL); } } /* * Actually translate the requested transfer into one the physical driver * can understand. The transfer is described by a buf and will include * only one physical transfer. 
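* For example (illustrative numbers only), a single 64KB read of a disk * with 512 byte sectors arrives here as one bio and is later issued by * adastart() as one ATA command with lba = bio_pblkno and a sector count * of bio_bcount / secsize = 128.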
*/ static void adastrategy(struct bio *bp) { struct cam_periph *periph; struct ada_softc *softc; periph = (struct cam_periph *)bp->bio_disk->d_drv1; softc = (struct ada_softc *)periph->softc; cam_periph_lock(periph); CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("adastrategy(%p)\n", bp)); /* * If the device has been made invalid, error out */ if ((periph->flags & CAM_PERIPH_INVALID) != 0) { cam_periph_unlock(periph); biofinish(bp, NULL, ENXIO); return; } /* * Place it in the queue of disk activities for this disk */ if (bp->bio_cmd == BIO_DELETE) { bioq_disksort(&softc->trim_queue, bp); } else { if (ADA_SIO) bioq_disksort(&softc->bio_queue, bp); else bioq_insert_tail(&softc->bio_queue, bp); } /* * Schedule ourselves for performing the work. */ adaschedule(periph); cam_periph_unlock(periph); return; } static int adadump(void *arg, void *virtual, vm_offset_t physical, off_t offset, size_t length) { struct cam_periph *periph; struct ada_softc *softc; u_int secsize; union ccb ccb; struct disk *dp; uint64_t lba; uint16_t count; int error = 0; dp = arg; periph = dp->d_drv1; softc = (struct ada_softc *)periph->softc; cam_periph_lock(periph); secsize = softc->params.secsize; lba = offset / secsize; count = length / secsize; if ((periph->flags & CAM_PERIPH_INVALID) != 0) { cam_periph_unlock(periph); return (ENXIO); } if (length > 0) { xpt_setup_ccb(&ccb.ccb_h, periph->path, CAM_PRIORITY_NORMAL); ccb.ccb_h.ccb_state = ADA_CCB_DUMP; cam_fill_ataio(&ccb.ataio, 0, adadone, CAM_DIR_OUT, 0, (u_int8_t *) virtual, length, ada_default_timeout*1000); if ((softc->flags & ADA_FLAG_CAN_48BIT) && (lba + count >= ATA_MAX_28BIT_LBA || count >= 256)) { ata_48bit_cmd(&ccb.ataio, ATA_WRITE_DMA48, 0, lba, count); } else { ata_28bit_cmd(&ccb.ataio, ATA_WRITE_DMA, 0, lba, count); } xpt_polled_action(&ccb); error = cam_periph_error(&ccb, 0, SF_NO_RECOVERY | SF_NO_RETRY, NULL); if ((ccb.ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(ccb.ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); if (error != 0) printf("Aborting dump due to I/O error.\n"); cam_periph_unlock(periph); return (error); } if (softc->flags & ADA_FLAG_CAN_FLUSHCACHE) { xpt_setup_ccb(&ccb.ccb_h, periph->path, CAM_PRIORITY_NORMAL); ccb.ccb_h.ccb_state = ADA_CCB_DUMP; cam_fill_ataio(&ccb.ataio, 0, adadone, CAM_DIR_NONE, 0, NULL, 0, ada_default_timeout*1000); if (softc->flags & ADA_FLAG_CAN_48BIT) ata_48bit_cmd(&ccb.ataio, ATA_FLUSHCACHE48, 0, 0, 0); else ata_28bit_cmd(&ccb.ataio, ATA_FLUSHCACHE, 0, 0, 0); xpt_polled_action(&ccb); error = cam_periph_error(&ccb, 0, SF_NO_RECOVERY | SF_NO_RETRY, NULL); if ((ccb.ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(ccb.ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); if (error != 0) xpt_print(periph->path, "Synchronize cache failed\n"); } cam_periph_unlock(periph); return (error); } static void adainit(void) { cam_status status; /* * Install a global async callback. This callback will * receive async callbacks like "new device found". 
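* Here that is AC_FOUND_DEVICE: adaasync() responds by allocating and * probing a new ada peripheral instance for each ATA device that appears.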
*/ status = xpt_register_async(AC_FOUND_DEVICE, adaasync, NULL, NULL); if (status != CAM_REQ_CMP) { printf("ada: Failed to attach master async callback " "due to status 0x%x!\n", status); } else if (ada_send_ordered) { /* Register our event handlers */ if ((EVENTHANDLER_REGISTER(power_suspend, adasuspend, NULL, EVENTHANDLER_PRI_LAST)) == NULL) printf("adainit: power event registration failed!\n"); if ((EVENTHANDLER_REGISTER(power_resume, adaresume, NULL, EVENTHANDLER_PRI_LAST)) == NULL) printf("adainit: power event registration failed!\n"); if ((EVENTHANDLER_REGISTER(shutdown_post_sync, adashutdown, NULL, SHUTDOWN_PRI_DEFAULT)) == NULL) printf("adainit: shutdown event registration failed!\n"); } } /* * Callback from GEOM, called when it has finished cleaning up its * resources. */ static void adadiskgonecb(struct disk *dp) { struct cam_periph *periph; periph = (struct cam_periph *)dp->d_drv1; cam_periph_release(periph); } static void adaoninvalidate(struct cam_periph *periph) { struct ada_softc *softc; softc = (struct ada_softc *)periph->softc; /* * De-register any async callbacks. */ xpt_register_async(0, adaasync, periph, periph->path); /* * Return all queued I/O with ENXIO. * XXX Handle any transactions queued to the card * with XPT_ABORT_CCB. */ bioq_flush(&softc->bio_queue, NULL, ENXIO); bioq_flush(&softc->trim_queue, NULL, ENXIO); disk_gone(softc->disk); } static void adacleanup(struct cam_periph *periph) { struct ada_softc *softc; softc = (struct ada_softc *)periph->softc; cam_periph_unlock(periph); /* * If we can't free the sysctl tree, oh well... */ if ((softc->flags & ADA_FLAG_SCTX_INIT) != 0 && sysctl_ctx_free(&softc->sysctl_ctx) != 0) { xpt_print(periph->path, "can't remove sysctl context\n"); } disk_destroy(softc->disk); callout_drain(&softc->sendordered_c); free(softc, M_DEVBUF); cam_periph_lock(periph); } static void adaasync(void *callback_arg, u_int32_t code, struct cam_path *path, void *arg) { struct ccb_getdev cgd; struct cam_periph *periph; struct ada_softc *softc; periph = (struct cam_periph *)callback_arg; switch (code) { case AC_FOUND_DEVICE: { struct ccb_getdev *cgd; cam_status status; cgd = (struct ccb_getdev *)arg; if (cgd == NULL) break; if (cgd->protocol != PROTO_ATA) break; /* * Allocate a peripheral instance for * this device and start the probe * process. 
*/ status = cam_periph_alloc(adaregister, adaoninvalidate, adacleanup, adastart, "ada", CAM_PERIPH_BIO, path, adaasync, AC_FOUND_DEVICE, cgd); if (status != CAM_REQ_CMP && status != CAM_REQ_INPROG) printf("adaasync: Unable to attach to new device " "due to status 0x%x\n", status); break; } case AC_GETDEV_CHANGED: { softc = (struct ada_softc *)periph->softc; xpt_setup_ccb(&cgd.ccb_h, periph->path, CAM_PRIORITY_NORMAL); cgd.ccb_h.func_code = XPT_GDEV_TYPE; xpt_action((union ccb *)&cgd); if ((cgd.ident_data.capabilities1 & ATA_SUPPORT_DMA) && (cgd.inq_flags & SID_DMA)) softc->flags |= ADA_FLAG_CAN_DMA; else softc->flags &= ~ADA_FLAG_CAN_DMA; if (cgd.ident_data.support.command2 & ATA_SUPPORT_ADDRESS48) { softc->flags |= ADA_FLAG_CAN_48BIT; if (cgd.inq_flags & SID_DMA48) softc->flags |= ADA_FLAG_CAN_DMA48; else softc->flags &= ~ADA_FLAG_CAN_DMA48; } else softc->flags &= ~(ADA_FLAG_CAN_48BIT | ADA_FLAG_CAN_DMA48); if ((cgd.ident_data.satacapabilities & ATA_SUPPORT_NCQ) && (cgd.inq_flags & SID_DMA) && (cgd.inq_flags & SID_CmdQue)) softc->flags |= ADA_FLAG_CAN_NCQ; else softc->flags &= ~ADA_FLAG_CAN_NCQ; if ((cgd.ident_data.support_dsm & ATA_SUPPORT_DSM_TRIM) && (cgd.inq_flags & SID_DMA)) softc->flags |= ADA_FLAG_CAN_TRIM; else softc->flags &= ~ADA_FLAG_CAN_TRIM; cam_periph_async(periph, code, path, arg); break; } case AC_ADVINFO_CHANGED: { uintptr_t buftype; buftype = (uintptr_t)arg; if (buftype == CDAI_TYPE_PHYS_PATH) { struct ada_softc *softc; softc = periph->softc; disk_attr_changed(softc->disk, "GEOM::physpath", M_NOWAIT); } break; } case AC_SENT_BDR: case AC_BUS_RESET: { softc = (struct ada_softc *)periph->softc; cam_periph_async(periph, code, path, arg); if (softc->state != ADA_STATE_NORMAL) break; xpt_setup_ccb(&cgd.ccb_h, periph->path, CAM_PRIORITY_NORMAL); cgd.ccb_h.func_code = XPT_GDEV_TYPE; xpt_action((union ccb *)&cgd); if (ADA_RA >= 0 && cgd.ident_data.support.command1 & ATA_SUPPORT_LOOKAHEAD) softc->state = ADA_STATE_RAHEAD; else if (ADA_WC >= 0 && cgd.ident_data.support.command1 & ATA_SUPPORT_WRITECACHE) softc->state = ADA_STATE_WCACHE; else break; if (cam_periph_acquire(periph) != CAM_REQ_CMP) softc->state = ADA_STATE_NORMAL; else xpt_schedule(periph, CAM_PRIORITY_DEV); } default: cam_periph_async(periph, code, path, arg); break; } } static void adasysctlinit(void *context, int pending) { struct cam_periph *periph; struct ada_softc *softc; char tmpstr[80], tmpstr2[80]; periph = (struct cam_periph *)context; /* periph was held for us when this task was enqueued */ if ((periph->flags & CAM_PERIPH_INVALID) != 0) { cam_periph_release(periph); return; } softc = (struct ada_softc *)periph->softc; snprintf(tmpstr, sizeof(tmpstr), "CAM ADA unit %d", periph->unit_number); snprintf(tmpstr2, sizeof(tmpstr2), "%d", periph->unit_number); sysctl_ctx_init(&softc->sysctl_ctx); softc->flags |= ADA_FLAG_SCTX_INIT; softc->sysctl_tree = SYSCTL_ADD_NODE(&softc->sysctl_ctx, SYSCTL_STATIC_CHILDREN(_kern_cam_ada), OID_AUTO, tmpstr2, CTLFLAG_RD, 0, tmpstr); if (softc->sysctl_tree == NULL) { printf("adasysctlinit: unable to allocate sysctl tree\n"); cam_periph_release(periph); return; } SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "read_ahead", CTLFLAG_RW | CTLFLAG_MPSAFE, &softc->read_ahead, 0, "Enable disk read ahead."); SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "write_cache", CTLFLAG_RW | CTLFLAG_MPSAFE, &softc->write_cache, 0, "Enable disk write cache."); SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), 
OID_AUTO, "sort_io_queue", CTLFLAG_RW | CTLFLAG_MPSAFE, &softc->sort_io_queue, 0, "Sort IO queue to try and optimise disk access patterns"); #ifdef ADA_TEST_FAILURE /* * Add a 'door bell' sysctl which allows one to set it from userland * and cause something bad to happen. For the moment, we only allow * whacking the next read or write. */ SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "force_read_error", CTLFLAG_RW | CTLFLAG_MPSAFE, &softc->force_read_error, 0, "Force a read error for the next N reads."); SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "force_write_error", CTLFLAG_RW | CTLFLAG_MPSAFE, &softc->force_write_error, 0, "Force a write error for the next N writes."); SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "periodic_read_error", CTLFLAG_RW | CTLFLAG_MPSAFE, &softc->periodic_read_error, 0, "Force a read error every N reads (don't set too low)."); #endif cam_periph_release(periph); } static int adagetattr(struct bio *bp) { int ret; struct cam_periph *periph; periph = (struct cam_periph *)bp->bio_disk->d_drv1; cam_periph_lock(periph); ret = xpt_getattr(bp->bio_data, bp->bio_length, bp->bio_attribute, periph->path); cam_periph_unlock(periph); if (ret == 0) bp->bio_completed = bp->bio_length; return ret; } static cam_status adaregister(struct cam_periph *periph, void *arg) { struct ada_softc *softc; struct ccb_pathinq cpi; struct ccb_getdev *cgd; char announce_buf[80], buf1[32]; struct disk_params *dp; caddr_t match; u_int maxio; int legacy_id, quirks; cgd = (struct ccb_getdev *)arg; if (cgd == NULL) { printf("adaregister: no getdev CCB, can't register device\n"); return(CAM_REQ_CMP_ERR); } softc = (struct ada_softc *)malloc(sizeof(*softc), M_DEVBUF, M_NOWAIT|M_ZERO); if (softc == NULL) { printf("adaregister: Unable to probe new device. " "Unable to allocate softc\n"); return(CAM_REQ_CMP_ERR); } bioq_init(&softc->bio_queue); bioq_init(&softc->trim_queue); if ((cgd->ident_data.capabilities1 & ATA_SUPPORT_DMA) && (cgd->inq_flags & SID_DMA)) softc->flags |= ADA_FLAG_CAN_DMA; if (cgd->ident_data.support.command2 & ATA_SUPPORT_ADDRESS48) { softc->flags |= ADA_FLAG_CAN_48BIT; if (cgd->inq_flags & SID_DMA48) softc->flags |= ADA_FLAG_CAN_DMA48; } if (cgd->ident_data.support.command2 & ATA_SUPPORT_FLUSHCACHE) softc->flags |= ADA_FLAG_CAN_FLUSHCACHE; if (cgd->ident_data.support.command1 & ATA_SUPPORT_POWERMGT) softc->flags |= ADA_FLAG_CAN_POWERMGT; if ((cgd->ident_data.satacapabilities & ATA_SUPPORT_NCQ) && (cgd->inq_flags & SID_DMA) && (cgd->inq_flags & SID_CmdQue)) softc->flags |= ADA_FLAG_CAN_NCQ; if ((cgd->ident_data.support_dsm & ATA_SUPPORT_DSM_TRIM) && (cgd->inq_flags & SID_DMA)) { softc->flags |= ADA_FLAG_CAN_TRIM; softc->trim_max_ranges = TRIM_MAX_RANGES; if (cgd->ident_data.max_dsm_blocks != 0) { softc->trim_max_ranges = min(cgd->ident_data.max_dsm_blocks * ATA_DSM_BLK_RANGES, softc->trim_max_ranges); } } if (cgd->ident_data.support.command2 & ATA_SUPPORT_CFA) softc->flags |= ADA_FLAG_CAN_CFA; periph->softc = softc; /* * See if this device has any quirks. 
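* cam_quirkmatch() pattern-matches the ATA identify strings against * ada_quirk_table above; the trailing T_ANY entry serves as a no-quirk * default.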
*/ match = cam_quirkmatch((caddr_t)&cgd->ident_data, (caddr_t)ada_quirk_table, sizeof(ada_quirk_table)/sizeof(*ada_quirk_table), sizeof(*ada_quirk_table), ata_identify_match); if (match != NULL) softc->quirks = ((struct ada_quirk_entry *)match)->quirks; else softc->quirks = ADA_Q_NONE; bzero(&cpi, sizeof(cpi)); xpt_setup_ccb(&cpi.ccb_h, periph->path, CAM_PRIORITY_NONE); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); TASK_INIT(&softc->sysctl_task, 0, adasysctlinit, periph); /* * Register this media as a disk */ (void)cam_periph_hold(periph, PRIBIO); cam_periph_unlock(periph); snprintf(announce_buf, sizeof(announce_buf), "kern.cam.ada.%d.quirks", periph->unit_number); quirks = softc->quirks; TUNABLE_INT_FETCH(announce_buf, &quirks); softc->quirks = quirks; softc->read_ahead = -1; snprintf(announce_buf, sizeof(announce_buf), "kern.cam.ada.%d.read_ahead", periph->unit_number); TUNABLE_INT_FETCH(announce_buf, &softc->read_ahead); softc->write_cache = -1; snprintf(announce_buf, sizeof(announce_buf), "kern.cam.ada.%d.write_cache", periph->unit_number); TUNABLE_INT_FETCH(announce_buf, &softc->write_cache); /* Disable queue sorting for non-rotational media by default. */ if (cgd->ident_data.media_rotation_rate == ATA_RATE_NON_ROTATING) softc->sort_io_queue = 0; else softc->sort_io_queue = -1; adagetparams(periph, cgd); softc->disk = disk_alloc(); softc->disk->d_rotation_rate = cgd->ident_data.media_rotation_rate; softc->disk->d_devstat = devstat_new_entry(periph->periph_name, periph->unit_number, softc->params.secsize, DEVSTAT_ALL_SUPPORTED, DEVSTAT_TYPE_DIRECT | XPORT_DEVSTAT_TYPE(cpi.transport), DEVSTAT_PRIORITY_DISK); softc->disk->d_open = adaopen; softc->disk->d_close = adaclose; softc->disk->d_strategy = adastrategy; softc->disk->d_getattr = adagetattr; softc->disk->d_dump = adadump; softc->disk->d_gone = adadiskgonecb; softc->disk->d_name = "ada"; softc->disk->d_drv1 = periph; maxio = cpi.maxio; /* Honor max I/O size of SIM */ if (maxio == 0) maxio = DFLTPHYS; /* traditional default */ else if (maxio > MAXPHYS) maxio = MAXPHYS; /* for safety */ if (softc->flags & ADA_FLAG_CAN_48BIT) maxio = min(maxio, 65536 * softc->params.secsize); else /* 28bit ATA command limit */ maxio = min(maxio, 256 * softc->params.secsize); softc->disk->d_maxsize = maxio; softc->disk->d_unit = periph->unit_number; softc->disk->d_flags = DISKFLAG_DIRECT_COMPLETION; if (softc->flags & ADA_FLAG_CAN_FLUSHCACHE) softc->disk->d_flags |= DISKFLAG_CANFLUSHCACHE; if (softc->flags & ADA_FLAG_CAN_TRIM) { softc->disk->d_flags |= DISKFLAG_CANDELETE; softc->disk->d_delmaxsize = softc->params.secsize * ATA_DSM_RANGE_MAX * softc->trim_max_ranges; } else if ((softc->flags & ADA_FLAG_CAN_CFA) && !(softc->flags & ADA_FLAG_CAN_48BIT)) { softc->disk->d_flags |= DISKFLAG_CANDELETE; softc->disk->d_delmaxsize = 256 * softc->params.secsize; } else softc->disk->d_delmaxsize = maxio; if ((cpi.hba_misc & PIM_UNMAPPED) != 0) softc->disk->d_flags |= DISKFLAG_UNMAPPED_BIO; strlcpy(softc->disk->d_descr, cgd->ident_data.model, MIN(sizeof(softc->disk->d_descr), sizeof(cgd->ident_data.model))); strlcpy(softc->disk->d_ident, cgd->ident_data.serial, MIN(sizeof(softc->disk->d_ident), sizeof(cgd->ident_data.serial))); softc->disk->d_hba_vendor = cpi.hba_vendor; softc->disk->d_hba_device = cpi.hba_device; softc->disk->d_hba_subvendor = cpi.hba_subvendor; softc->disk->d_hba_subdevice = cpi.hba_subdevice; softc->disk->d_sectorsize = softc->params.secsize; softc->disk->d_mediasize = (off_t)softc->params.sectors * softc->params.secsize; if 
(ata_physical_sector_size(&cgd->ident_data) != softc->params.secsize) { softc->disk->d_stripesize = ata_physical_sector_size(&cgd->ident_data); softc->disk->d_stripeoffset = (softc->disk->d_stripesize - ata_logical_sector_offset(&cgd->ident_data)) % softc->disk->d_stripesize; } else if (softc->quirks & ADA_Q_4K) { softc->disk->d_stripesize = 4096; softc->disk->d_stripeoffset = 0; } softc->disk->d_fwsectors = softc->params.secs_per_track; softc->disk->d_fwheads = softc->params.heads; ata_disk_firmware_geom_adjust(softc->disk); if (ada_legacy_aliases) { #ifdef ATA_STATIC_ID legacy_id = xpt_path_legacy_ata_id(periph->path); #else legacy_id = softc->disk->d_unit; #endif if (legacy_id >= 0) { snprintf(announce_buf, sizeof(announce_buf), "kern.devalias.%s%d", softc->disk->d_name, softc->disk->d_unit); snprintf(buf1, sizeof(buf1), "ad%d", legacy_id); setenv(announce_buf, buf1); } } else legacy_id = -1; /* * Acquire a reference to the periph before we register with GEOM. * We'll release this reference once GEOM calls us back (via * adadiskgonecb()) telling us that our provider has been freed. */ if (cam_periph_acquire(periph) != CAM_REQ_CMP) { xpt_print(periph->path, "%s: lost periph during " "registration!\n", __func__); cam_periph_lock(periph); return (CAM_REQ_CMP_ERR); } disk_create(softc->disk, DISK_VERSION); cam_periph_lock(periph); cam_periph_unhold(periph); dp = &softc->params; snprintf(announce_buf, sizeof(announce_buf), "%juMB (%ju %u byte sectors)", ((uintmax_t)dp->secsize * dp->sectors) / (1024 * 1024), (uintmax_t)dp->sectors, dp->secsize); xpt_announce_periph(periph, announce_buf); xpt_announce_quirks(periph, softc->quirks, ADA_Q_BIT_STRING); if (legacy_id >= 0) printf("%s%d: Previously was known as ad%d\n", periph->periph_name, periph->unit_number, legacy_id); /* * Create our sysctl variables, now that we know * we have successfully attached. */ if (cam_periph_acquire(periph) == CAM_REQ_CMP) taskqueue_enqueue(taskqueue_thread, &softc->sysctl_task); /* * Add async callbacks for bus reset and * bus device reset calls. I don't bother * checking if this fails as, in most cases, * the system will function just fine without * them and the only alternative would be to * not attach the device on failure. */ xpt_register_async(AC_SENT_BDR | AC_BUS_RESET | AC_LOST_DEVICE | AC_GETDEV_CHANGED | AC_ADVINFO_CHANGED, adaasync, periph, periph->path); /* * Schedule a periodic event to occasionally send an * ordered tag to a device. */ callout_init_mtx(&softc->sendordered_c, cam_periph_mtx(periph), 0); callout_reset(&softc->sendordered_c, (ada_default_timeout * hz) / ADA_ORDEREDTAG_INTERVAL, adasendorderedtag, softc); if (ADA_RA >= 0 && cgd->ident_data.support.command1 & ATA_SUPPORT_LOOKAHEAD) { softc->state = ADA_STATE_RAHEAD; } else if (ADA_WC >= 0 && cgd->ident_data.support.command1 & ATA_SUPPORT_WRITECACHE) { softc->state = ADA_STATE_WCACHE; } else { softc->state = ADA_STATE_NORMAL; return(CAM_REQ_CMP); } if (cam_periph_acquire(periph) != CAM_REQ_CMP) softc->state = ADA_STATE_NORMAL; else xpt_schedule(periph, CAM_PRIORITY_DEV); return(CAM_REQ_CMP); } static void ada_dsmtrim(struct ada_softc *softc, struct bio *bp, struct ccb_ataio *ataio) { struct trim_request *req = &softc->trim_req; uint64_t lastlba = (uint64_t)-1; int c, lastcount = 0, off, ranges = 0; bzero(req, sizeof(*req)); TAILQ_INIT(&req->bps); do { uint64_t lba = bp->bio_pblkno; int count = bp->bio_bcount / softc->params.secsize; bioq_remove(&softc->trim_queue, bp); /* Try to extend the previous range. 
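* Each DSM range descriptor built below is 8 bytes: bytes 0-5 hold the * 48-bit starting LBA and bytes 6-7 a 16-bit sector count, so a * contiguous run is split into chunks of at most ATA_DSM_RANGE_MAX * (65535) sectors.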
*/ if (lba == lastlba) { c = min(count, ATA_DSM_RANGE_MAX - lastcount); lastcount += c; off = (ranges - 1) * ATA_DSM_RANGE_SIZE; req->data[off + 6] = lastcount & 0xff; req->data[off + 7] = (lastcount >> 8) & 0xff; count -= c; lba += c; } while (count > 0) { c = min(count, ATA_DSM_RANGE_MAX); off = ranges * ATA_DSM_RANGE_SIZE; req->data[off + 0] = lba & 0xff; req->data[off + 1] = (lba >> 8) & 0xff; req->data[off + 2] = (lba >> 16) & 0xff; req->data[off + 3] = (lba >> 24) & 0xff; req->data[off + 4] = (lba >> 32) & 0xff; req->data[off + 5] = (lba >> 40) & 0xff; req->data[off + 6] = c & 0xff; req->data[off + 7] = (c >> 8) & 0xff; lba += c; count -= c; lastcount = c; ranges++; /* * Its the caller's responsibility to ensure the * request will fit so we don't need to check for * overrun here */ } lastlba = lba; TAILQ_INSERT_TAIL(&req->bps, bp, bio_queue); bp = bioq_first(&softc->trim_queue); if (bp == NULL || bp->bio_bcount / softc->params.secsize > (softc->trim_max_ranges - ranges) * ATA_DSM_RANGE_MAX) break; } while (1); cam_fill_ataio(ataio, ada_retry_count, adadone, CAM_DIR_OUT, 0, req->data, ((ranges + ATA_DSM_BLK_RANGES - 1) / ATA_DSM_BLK_RANGES) * ATA_DSM_BLK_SIZE, ada_default_timeout * 1000); ata_48bit_cmd(ataio, ATA_DATA_SET_MANAGEMENT, ATA_DSM_TRIM, 0, (ranges + ATA_DSM_BLK_RANGES - 1) / ATA_DSM_BLK_RANGES); } static void ada_cfaerase(struct ada_softc *softc, struct bio *bp, struct ccb_ataio *ataio) { struct trim_request *req = &softc->trim_req; uint64_t lba = bp->bio_pblkno; uint16_t count = bp->bio_bcount / softc->params.secsize; bzero(req, sizeof(*req)); TAILQ_INIT(&req->bps); bioq_remove(&softc->trim_queue, bp); TAILQ_INSERT_TAIL(&req->bps, bp, bio_queue); cam_fill_ataio(ataio, ada_retry_count, adadone, CAM_DIR_NONE, 0, NULL, 0, ada_default_timeout*1000); if (count >= 256) count = 0; ata_28bit_cmd(ataio, ATA_CFA_ERASE, 0, lba, count); } static void adastart(struct cam_periph *periph, union ccb *start_ccb) { struct ada_softc *softc = (struct ada_softc *)periph->softc; struct ccb_ataio *ataio = &start_ccb->ataio; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("adastart\n")); switch (softc->state) { case ADA_STATE_NORMAL: { struct bio *bp; u_int8_t tag_code; /* Run TRIM if not running yet. */ if (!softc->trim_running && (bp = bioq_first(&softc->trim_queue)) != 0) { if (softc->flags & ADA_FLAG_CAN_TRIM) { ada_dsmtrim(softc, bp, ataio); } else if ((softc->flags & ADA_FLAG_CAN_CFA) && !(softc->flags & ADA_FLAG_CAN_48BIT)) { ada_cfaerase(softc, bp, ataio); } else { /* This can happen if DMA was disabled. */ bioq_remove(&softc->trim_queue, bp); biofinish(bp, NULL, EOPNOTSUPP); xpt_release_ccb(start_ccb); adaschedule(periph); return; } softc->trim_running = 1; start_ccb->ccb_h.ccb_state = ADA_CCB_TRIM; start_ccb->ccb_h.flags |= CAM_UNLOCKED; goto out; } /* Run regular command. 
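* A bio flagged BIO_ORDERED (or a pending ADA_FLAG_NEED_OTAG) clears * tag_code below, so the request goes out as a non-NCQ command and * thereby orders itself against any queued I/O.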
*/ bp = bioq_first(&softc->bio_queue); if (bp == NULL) { xpt_release_ccb(start_ccb); break; } bioq_remove(&softc->bio_queue, bp); if ((bp->bio_flags & BIO_ORDERED) != 0 || (softc->flags & ADA_FLAG_NEED_OTAG) != 0) { softc->flags &= ~ADA_FLAG_NEED_OTAG; softc->flags |= ADA_FLAG_WAS_OTAG; tag_code = 0; } else { tag_code = 1; } switch (bp->bio_cmd) { case BIO_WRITE: - softc->flags |= ADA_FLAG_DIRTY; - /* FALLTHROUGH */ case BIO_READ: { uint64_t lba = bp->bio_pblkno; uint16_t count = bp->bio_bcount / softc->params.secsize; + void *data_ptr; + int rw_op; + + if (bp->bio_cmd == BIO_WRITE) { + softc->flags |= ADA_FLAG_DIRTY; + rw_op = CAM_DIR_OUT; + } else { + rw_op = CAM_DIR_IN; + } + + data_ptr = bp->bio_data; + if ((bp->bio_flags & (BIO_UNMAPPED|BIO_VLIST)) != 0) { + rw_op |= CAM_DATA_BIO; + data_ptr = bp; + } + #ifdef ADA_TEST_FAILURE int fail = 0; /* * Support the failure ioctls. If the command is a * read, and there are pending forced read errors, or * if a write and pending write errors, then fail this * operation with EIO. This is useful for testing * purposes. Also, support having every Nth read fail. * * This is a rather blunt tool. */ if (bp->bio_cmd == BIO_READ) { if (softc->force_read_error) { softc->force_read_error--; fail = 1; } if (softc->periodic_read_error > 0) { if (++softc->periodic_read_count >= softc->periodic_read_error) { softc->periodic_read_count = 0; fail = 1; } } } else { if (softc->force_write_error) { softc->force_write_error--; fail = 1; } } if (fail) { biofinish(bp, NULL, EIO); xpt_release_ccb(start_ccb); adaschedule(periph); return; } #endif KASSERT((bp->bio_flags & BIO_UNMAPPED) == 0 || round_page(bp->bio_bcount + bp->bio_ma_offset) / PAGE_SIZE == bp->bio_ma_n, ("Short bio %p", bp)); cam_fill_ataio(ataio, ada_retry_count, adadone, - (bp->bio_cmd == BIO_READ ? CAM_DIR_IN : - CAM_DIR_OUT) | ((bp->bio_flags & BIO_UNMAPPED) - != 0 ? CAM_DATA_BIO : 0), + rw_op, tag_code, - ((bp->bio_flags & BIO_UNMAPPED) != 0) ? 
(void *)bp : - bp->bio_data, + data_ptr, bp->bio_bcount, ada_default_timeout*1000); if ((softc->flags & ADA_FLAG_CAN_NCQ) && tag_code) { if (bp->bio_cmd == BIO_READ) { ata_ncq_cmd(ataio, ATA_READ_FPDMA_QUEUED, lba, count); } else { ata_ncq_cmd(ataio, ATA_WRITE_FPDMA_QUEUED, lba, count); } } else if ((softc->flags & ADA_FLAG_CAN_48BIT) && (lba + count >= ATA_MAX_28BIT_LBA || count > 256)) { if (softc->flags & ADA_FLAG_CAN_DMA48) { if (bp->bio_cmd == BIO_READ) { ata_48bit_cmd(ataio, ATA_READ_DMA48, 0, lba, count); } else { ata_48bit_cmd(ataio, ATA_WRITE_DMA48, 0, lba, count); } } else { if (bp->bio_cmd == BIO_READ) { ata_48bit_cmd(ataio, ATA_READ_MUL48, 0, lba, count); } else { ata_48bit_cmd(ataio, ATA_WRITE_MUL48, 0, lba, count); } } } else { if (count == 256) count = 0; if (softc->flags & ADA_FLAG_CAN_DMA) { if (bp->bio_cmd == BIO_READ) { ata_28bit_cmd(ataio, ATA_READ_DMA, 0, lba, count); } else { ata_28bit_cmd(ataio, ATA_WRITE_DMA, 0, lba, count); } } else { if (bp->bio_cmd == BIO_READ) { ata_28bit_cmd(ataio, ATA_READ_MUL, 0, lba, count); } else { ata_28bit_cmd(ataio, ATA_WRITE_MUL, 0, lba, count); } } } break; } case BIO_FLUSH: cam_fill_ataio(ataio, 1, adadone, CAM_DIR_NONE, 0, NULL, 0, ada_default_timeout*1000); if (softc->flags & ADA_FLAG_CAN_48BIT) ata_48bit_cmd(ataio, ATA_FLUSHCACHE48, 0, 0, 0); else ata_28bit_cmd(ataio, ATA_FLUSHCACHE, 0, 0, 0); break; } start_ccb->ccb_h.ccb_state = ADA_CCB_BUFFER_IO; start_ccb->ccb_h.flags |= CAM_UNLOCKED; out: start_ccb->ccb_h.ccb_bp = bp; softc->outstanding_cmds++; softc->refcount++; cam_periph_unlock(periph); xpt_action(start_ccb); cam_periph_lock(periph); softc->refcount--; /* May have more work to do, so ensure we stay scheduled */ adaschedule(periph); break; } case ADA_STATE_RAHEAD: case ADA_STATE_WCACHE: { cam_fill_ataio(ataio, 1, adadone, CAM_DIR_NONE, 0, NULL, 0, ada_default_timeout*1000); if (softc->state == ADA_STATE_RAHEAD) { ata_28bit_cmd(ataio, ATA_SETFEATURES, ADA_RA ? ATA_SF_ENAB_RCACHE : ATA_SF_DIS_RCACHE, 0, 0); start_ccb->ccb_h.ccb_state = ADA_CCB_RAHEAD; } else { ata_28bit_cmd(ataio, ATA_SETFEATURES, ADA_WC ? ATA_SF_ENAB_WCACHE : ATA_SF_DIS_WCACHE, 0, 0); start_ccb->ccb_h.ccb_state = ADA_CCB_WCACHE; } start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; xpt_action(start_ccb); break; } } } static void adadone(struct cam_periph *periph, union ccb *done_ccb) { struct ada_softc *softc; struct ccb_ataio *ataio; struct ccb_getdev *cgd; struct cam_path *path; int state; softc = (struct ada_softc *)periph->softc; ataio = &done_ccb->ataio; path = done_ccb->ccb_h.path; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("adadone\n")); state = ataio->ccb_h.ccb_state & ADA_CCB_TYPE_MASK; switch (state) { case ADA_CCB_BUFFER_IO: case ADA_CCB_TRIM: { struct bio *bp; int error; cam_periph_lock(periph); if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { error = adaerror(done_ccb, 0, 0); if (error == ERESTART) { /* A retry was scheduled, so just return. 
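* (the CCB has been resubmitted and adadone() will run again when the * retry completes).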
*/ cam_periph_unlock(periph); return; } if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } else { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) panic("REQ_CMP with QFRZN"); error = 0; } bp = (struct bio *)done_ccb->ccb_h.ccb_bp; bp->bio_error = error; if (error != 0) { bp->bio_resid = bp->bio_bcount; bp->bio_flags |= BIO_ERROR; } else { if (state == ADA_CCB_TRIM) bp->bio_resid = 0; else bp->bio_resid = ataio->resid; if (bp->bio_resid > 0) bp->bio_flags |= BIO_ERROR; } softc->outstanding_cmds--; if (softc->outstanding_cmds == 0) softc->flags |= ADA_FLAG_WAS_OTAG; xpt_release_ccb(done_ccb); if (state == ADA_CCB_TRIM) { TAILQ_HEAD(, bio) queue; struct bio *bp1; TAILQ_INIT(&queue); TAILQ_CONCAT(&queue, &softc->trim_req.bps, bio_queue); /* * Normally, the xpt_release_ccb() above would make sure * that when we have more work to do, that work would * get kicked off. However, we specifically keep * trim_running set to 0 before the call above to allow * other I/O to progress when many BIO_DELETE requests * are pushed down. We set trim_running to 0 and call * daschedule again so that we don't stall if there are * no other I/Os pending apart from BIO_DELETEs. */ softc->trim_running = 0; adaschedule(periph); cam_periph_unlock(periph); while ((bp1 = TAILQ_FIRST(&queue)) != NULL) { TAILQ_REMOVE(&queue, bp1, bio_queue); bp1->bio_error = error; if (error != 0) { bp1->bio_flags |= BIO_ERROR; bp1->bio_resid = bp1->bio_bcount; } else bp1->bio_resid = 0; biodone(bp1); } } else { cam_periph_unlock(periph); biodone(bp); } return; } case ADA_CCB_RAHEAD: { if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { if (adaerror(done_ccb, 0, 0) == ERESTART) { out: /* Drop freeze taken due to CAM_DEV_QFREEZE */ cam_release_devq(path, 0, 0, 0, FALSE); return; } else if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { cam_release_devq(path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } /* * Since our peripheral may be invalidated by an error * above or an external event, we must release our CCB * before releasing the reference on the peripheral. * The peripheral will only go away once the last reference * is removed, and we need it around for the CCB release * operation. */ cgd = (struct ccb_getdev *)done_ccb; xpt_setup_ccb(&cgd->ccb_h, path, CAM_PRIORITY_NORMAL); cgd->ccb_h.func_code = XPT_GDEV_TYPE; xpt_action((union ccb *)cgd); if (ADA_WC >= 0 && cgd->ident_data.support.command1 & ATA_SUPPORT_WRITECACHE) { softc->state = ADA_STATE_WCACHE; xpt_release_ccb(done_ccb); xpt_schedule(periph, CAM_PRIORITY_DEV); goto out; } softc->state = ADA_STATE_NORMAL; xpt_release_ccb(done_ccb); /* Drop freeze taken due to CAM_DEV_QFREEZE */ cam_release_devq(path, 0, 0, 0, FALSE); adaschedule(periph); cam_periph_release_locked(periph); return; } case ADA_CCB_WCACHE: { if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { if (adaerror(done_ccb, 0, 0) == ERESTART) { goto out; } else if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { cam_release_devq(path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } softc->state = ADA_STATE_NORMAL; /* * Since our peripheral may be invalidated by an error * above or an external event, we must release our CCB * before releasing the reference on the peripheral. * The peripheral will only go away once the last reference * is removed, and we need it around for the CCB release * operation. 
 */
	xpt_release_ccb(done_ccb);
	/* Drop freeze taken due to CAM_DEV_QFREEZE */
	cam_release_devq(path, 0, 0, 0, FALSE);
	adaschedule(periph);
	cam_periph_release_locked(periph);
	return;
}
case ADA_CCB_DUMP:
	/* No-op. We're polling */
	return;
default:
	break;
}
xpt_release_ccb(done_ccb);
}

static int
adaerror(union ccb *ccb, u_int32_t cam_flags, u_int32_t sense_flags)
{

	return(cam_periph_error(ccb, cam_flags, sense_flags, NULL));
}

static void
adagetparams(struct cam_periph *periph, struct ccb_getdev *cgd)
{
	struct ada_softc *softc = (struct ada_softc *)periph->softc;
	struct disk_params *dp = &softc->params;
	u_int64_t lbasize48;
	u_int32_t lbasize;

	dp->secsize = ata_logical_sector_size(&cgd->ident_data);
	if ((cgd->ident_data.atavalid & ATA_FLAG_54_58) &&
	    cgd->ident_data.current_heads && cgd->ident_data.current_sectors) {
		dp->heads = cgd->ident_data.current_heads;
		dp->secs_per_track = cgd->ident_data.current_sectors;
		dp->cylinders = cgd->ident_data.cylinders;
		dp->sectors = (u_int32_t)cgd->ident_data.current_size_1 |
			  ((u_int32_t)cgd->ident_data.current_size_2 << 16);
	} else {
		dp->heads = cgd->ident_data.heads;
		dp->secs_per_track = cgd->ident_data.sectors;
		dp->cylinders = cgd->ident_data.cylinders;
		dp->sectors = cgd->ident_data.cylinders *
		    dp->heads * dp->secs_per_track;
	}
	lbasize = (u_int32_t)cgd->ident_data.lba_size_1 |
		  ((u_int32_t)cgd->ident_data.lba_size_2 << 16);

	/* use the 28bit LBA size if valid or bigger than the CHS mapping */
	if (cgd->ident_data.cylinders == 16383 || dp->sectors < lbasize)
		dp->sectors = lbasize;

	/* use the 48bit LBA size if valid */
	lbasize48 = ((u_int64_t)cgd->ident_data.lba_size48_1) |
		    ((u_int64_t)cgd->ident_data.lba_size48_2 << 16) |
		    ((u_int64_t)cgd->ident_data.lba_size48_3 << 32) |
		    ((u_int64_t)cgd->ident_data.lba_size48_4 << 48);
	if ((cgd->ident_data.support.command2 & ATA_SUPPORT_ADDRESS48) &&
	    lbasize48 > ATA_MAX_28BIT_LBA)
		dp->sectors = lbasize48;
}

static void
adasendorderedtag(void *arg)
{
	struct ada_softc *softc = arg;

	if (ada_send_ordered) {
		if (softc->outstanding_cmds > 0) {
			if ((softc->flags & ADA_FLAG_WAS_OTAG) == 0)
				softc->flags |= ADA_FLAG_NEED_OTAG;
			softc->flags &= ~ADA_FLAG_WAS_OTAG;
		}
	}
	/* Queue us up again */
	callout_reset(&softc->sendordered_c,
	    (ada_default_timeout * hz) / ADA_ORDEREDTAG_INTERVAL,
	    adasendorderedtag, softc);
}

/*
 * Step through all ADA peripheral drivers, and if the device is still open,
 * sync the disk cache to physical media.
 */
static void
adaflush(void)
{
	struct cam_periph *periph;
	struct ada_softc *softc;
	union ccb *ccb;
	int error;

	CAM_PERIPH_FOREACH(periph, &adadriver) {
		softc = (struct ada_softc *)periph->softc;
		if (SCHEDULER_STOPPED()) {
			/* If we panicked with the lock held, do not recurse. */
			if (!cam_periph_owned(periph) &&
			    (softc->flags & ADA_FLAG_OPEN)) {
				adadump(softc->disk, NULL, 0, 0, 0);
			}
			continue;
		}
		cam_periph_lock(periph);
		/*
		 * We only sync the cache if the drive is still open, and
		 * if the drive is capable of it.
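		 *
		 * Since this runs from the shutdown/suspend path, the
		 * flush below is issued with SF_NO_RECOVERY | SF_NO_RETRY,
		 * so a misbehaving device cannot hold the system in error
		 * recovery; a failure is only reported via xpt_print().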
		 */
		if (((softc->flags & ADA_FLAG_OPEN) == 0) ||
		    (softc->flags & ADA_FLAG_CAN_FLUSHCACHE) == 0) {
			cam_periph_unlock(periph);
			continue;
		}

		ccb = cam_periph_getccb(periph, CAM_PRIORITY_NORMAL);
		cam_fill_ataio(&ccb->ataio,
				    0,
				    adadone,
				    CAM_DIR_NONE,
				    0,
				    NULL,
				    0,
				    ada_default_timeout*1000);
		if (softc->flags & ADA_FLAG_CAN_48BIT)
			ata_48bit_cmd(&ccb->ataio, ATA_FLUSHCACHE48, 0, 0, 0);
		else
			ata_28bit_cmd(&ccb->ataio, ATA_FLUSHCACHE, 0, 0, 0);

		error = cam_periph_runccb(ccb, adaerror, /*cam_flags*/0,
		    /*sense_flags*/ SF_NO_RECOVERY | SF_NO_RETRY,
		    softc->disk->d_devstat);
		if (error != 0)
			xpt_print(periph->path, "Synchronize cache failed\n");
		xpt_release_ccb(ccb);
		cam_periph_unlock(periph);
	}
}

static void
adaspindown(uint8_t cmd, int flags)
{
	struct cam_periph *periph;
	struct ada_softc *softc;
	union ccb *ccb;
	int error;

	CAM_PERIPH_FOREACH(periph, &adadriver) {
		/* If we panicked with the lock held, do not recurse here. */
		if (cam_periph_owned(periph))
			continue;
		cam_periph_lock(periph);
		softc = (struct ada_softc *)periph->softc;
		/*
		 * We only spin down the drive if it is capable of it.
		 */
		if ((softc->flags & ADA_FLAG_CAN_POWERMGT) == 0) {
			cam_periph_unlock(periph);
			continue;
		}

		if (bootverbose)
			xpt_print(periph->path, "spin-down\n");

		ccb = cam_periph_getccb(periph, CAM_PRIORITY_NORMAL);
		cam_fill_ataio(&ccb->ataio,
				    0,
				    adadone,
				    CAM_DIR_NONE | flags,
				    0,
				    NULL,
				    0,
				    ada_default_timeout*1000);
		ata_28bit_cmd(&ccb->ataio, cmd, 0, 0, 0);

		error = cam_periph_runccb(ccb, adaerror, /*cam_flags*/0,
		    /*sense_flags*/ SF_NO_RECOVERY | SF_NO_RETRY,
		    softc->disk->d_devstat);
		if (error != 0)
			xpt_print(periph->path, "Spin-down disk failed\n");
		xpt_release_ccb(ccb);
		cam_periph_unlock(periph);
	}
}

static void
adashutdown(void *arg, int howto)
{

	adaflush();
	if (ada_spindown_shutdown != 0 &&
	    (howto & (RB_HALT | RB_POWEROFF)) != 0)
		adaspindown(ATA_STANDBY_IMMEDIATE, 0);
}

static void
adasuspend(void *arg)
{

	adaflush();
	if (ada_spindown_suspend != 0)
		adaspindown(ATA_SLEEP, CAM_DEV_QFREEZE);
}

static void
adaresume(void *arg)
{
	struct cam_periph *periph;
	struct ada_softc *softc;

	if (ada_spindown_suspend == 0)
		return;

	CAM_PERIPH_FOREACH(periph, &adadriver) {
		cam_periph_lock(periph);
		softc = (struct ada_softc *)periph->softc;
		/*
		 * We only spun down drives capable of power management,
		 * so only those need to be resumed here.
		 */
		if ((softc->flags & ADA_FLAG_CAN_POWERMGT) == 0) {
			cam_periph_unlock(periph);
			continue;
		}

		if (bootverbose)
			xpt_print(periph->path, "resume\n");

		/*
		 * Drop freeze taken due to CAM_DEV_QFREEZE flag set on
		 * sleep request.
		 */
		cam_release_devq(periph->path,
			 /*relsim_flags*/0,
			 /*openings*/0,
			 /*timeout*/0,
			 /*getcount_only*/0);

		cam_periph_unlock(periph);
	}
}

#endif /* _KERNEL */
Index: stable/10/sys/cam/cam_ccb.h
===================================================================
--- stable/10/sys/cam/cam_ccb.h	(revision 292347)
+++ stable/10/sys/cam/cam_ccb.h	(revision 292348)
@@ -1,1347 +1,1350 @@
/*-
 * Data structures and definitions for CAM Control Blocks (CCBs).
 *
 * Copyright (c) 1997, 1998 Justin T. Gibbs.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions, and the following disclaimer,
 *    without modification, immediately at the beginning of the file.
 * 2. The name of the author may not be used to endorse or promote products
 *    derived from this software without specific prior written permission.
* * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _CAM_CAM_CCB_H #define _CAM_CAM_CCB_H 1 #include #include #include #include #ifndef _KERNEL #include #endif #include #include #include /* General allocation length definitions for CCB structures */ #define IOCDBLEN CAM_MAX_CDBLEN /* Space for CDB bytes/pointer */ #define VUHBALEN 14 /* Vendor Unique HBA length */ #define SIM_IDLEN 16 /* ASCII string len for SIM ID */ #define HBA_IDLEN 16 /* ASCII string len for HBA ID */ #define DEV_IDLEN 16 /* ASCII string len for device names */ #define CCB_PERIPH_PRIV_SIZE 2 /* size of peripheral private area */ #define CCB_SIM_PRIV_SIZE 2 /* size of sim private area */ /* Struct definitions for CAM control blocks */ /* Common CCB header */ /* CAM CCB flags */ typedef enum { CAM_CDB_POINTER = 0x00000001,/* The CDB field is a pointer */ CAM_QUEUE_ENABLE = 0x00000002,/* SIM queue actions are enabled */ CAM_CDB_LINKED = 0x00000004,/* CCB contains a linked CDB */ CAM_NEGOTIATE = 0x00000008,/* * Perform transport negotiation * with this command. */ CAM_DATA_ISPHYS = 0x00000010,/* Data type with physical addrs */ CAM_DIS_AUTOSENSE = 0x00000020,/* Disable autosense feature */ CAM_DIR_BOTH = 0x00000000,/* Data direction (00:IN/OUT) */ CAM_DIR_IN = 0x00000040,/* Data direction (01:DATA IN) */ CAM_DIR_OUT = 0x00000080,/* Data direction (10:DATA OUT) */ CAM_DIR_NONE = 0x000000C0,/* Data direction (11:no data) */ CAM_DIR_MASK = 0x000000C0,/* Data direction Mask */ CAM_DATA_VADDR = 0x00000000,/* Data type (000:Virtual) */ CAM_DATA_PADDR = 0x00000010,/* Data type (001:Physical) */ CAM_DATA_SG = 0x00040000,/* Data type (010:sglist) */ CAM_DATA_SG_PADDR = 0x00040010,/* Data type (011:sglist phys) */ CAM_DATA_BIO = 0x00200000,/* Data type (100:bio) */ CAM_DATA_MASK = 0x00240010,/* Data type mask */ CAM_SOFT_RST_OP = 0x00000100,/* Use Soft reset alternative */ CAM_ENG_SYNC = 0x00000200,/* Flush resid bytes on complete */ CAM_DEV_QFRZDIS = 0x00000400,/* Disable DEV Q freezing */ CAM_DEV_QFREEZE = 0x00000800,/* Freeze DEV Q on execution */ CAM_HIGH_POWER = 0x00001000,/* Command takes a lot of power */ CAM_SENSE_PTR = 0x00002000,/* Sense data is a pointer */ CAM_SENSE_PHYS = 0x00004000,/* Sense pointer is physical addr*/ CAM_TAG_ACTION_VALID = 0x00008000,/* Use the tag action in this ccb*/ CAM_PASS_ERR_RECOVER = 0x00010000,/* Pass driver does err. 
recovery*/
	CAM_DIS_DISCONNECT	= 0x00020000,/* Disable disconnect */
	CAM_MSG_BUF_PHYS	= 0x00080000,/* Message buffer ptr is physical*/
	CAM_SNS_BUF_PHYS	= 0x00100000,/* Autosense data ptr is physical*/
	CAM_CDB_PHYS		= 0x00400000,/* CDB pointer is physical */
	CAM_ENG_SGLIST		= 0x00800000,/* SG list is for the HBA engine */

/* Phase cognizant mode flags */
	CAM_DIS_AUTOSRP		= 0x01000000,/* Disable autosave/restore ptrs */
	CAM_DIS_AUTODISC	= 0x02000000,/* Disable auto disconnect */
	CAM_TGT_CCB_AVAIL	= 0x04000000,/* Target CCB available */
	CAM_TGT_PHASE_MODE	= 0x08000000,/* The SIM runs in phase mode */
	CAM_MSGB_VALID		= 0x10000000,/* Message buffer valid */
	CAM_STATUS_VALID	= 0x20000000,/* Status buffer valid */
	CAM_DATAB_VALID		= 0x40000000,/* Data buffer valid */

/* Host target Mode flags */
	CAM_SEND_SENSE		= 0x08000000,/* Send sense data with status */
	CAM_TERM_IO		= 0x10000000,/* Terminate I/O Message sup. */
	CAM_DISCONNECT		= 0x20000000,/* Disconnects are mandatory */
	CAM_SEND_STATUS		= 0x40000000,/* Send status after data phase */

	CAM_UNLOCKED		= 0x80000000 /* Call callback without lock. */
} ccb_flags;

typedef enum {
	CAM_EXTLUN_VALID	= 0x00000001,/* 64bit lun field is valid */
+	CAM_USER_DATA_ADDR	= 0x00000002,/* Userspace data pointers */
+	CAM_SG_FORMAT_IOVEC	= 0x00000004,/* iovec instead of busdma S/G*/
+	CAM_UNMAPPED_BUF	= 0x00000008 /* use unmapped I/O */
} ccb_xflags;

/* XPT Opcodes for xpt_action */
typedef enum {
/* Function code flags are bits greater than 0xff */
	XPT_FC_QUEUED		= 0x100,
				/* Non-immediate function code */
	XPT_FC_USER_CCB		= 0x200,
	XPT_FC_XPT_ONLY		= 0x400,
				/* Only for the transport layer device */
	XPT_FC_DEV_QUEUED	= 0x800 | XPT_FC_QUEUED,
				/* Passes through the device queues */
/* Common function commands: 0x00->0x0F */
	XPT_NOOP		= 0x00,
				/* Execute Nothing */
	XPT_SCSI_IO		= 0x01 | XPT_FC_DEV_QUEUED,
				/* Execute the requested I/O operation */
	XPT_GDEV_TYPE		= 0x02,
				/* Get type information for specified device */
	XPT_GDEVLIST		= 0x03,
				/* Get a list of peripheral devices */
	XPT_PATH_INQ		= 0x04,
				/* Path routing inquiry */
	XPT_REL_SIMQ		= 0x05,
				/* Release a frozen device queue */
	XPT_SASYNC_CB		= 0x06,
				/* Set Asynchronous Callback Parameters */
	XPT_SDEV_TYPE		= 0x07,
				/* Set device type information */
	XPT_SCAN_BUS		= 0x08 | XPT_FC_QUEUED | XPT_FC_USER_CCB
				       | XPT_FC_XPT_ONLY,
				/* (Re)Scan the SCSI Bus */
	XPT_DEV_MATCH		= 0x09 | XPT_FC_XPT_ONLY,
				/* Get EDT entries matching the given pattern */
	XPT_DEBUG		= 0x0a,
				/* Turn on debugging for a bus, target or lun */
	XPT_PATH_STATS		= 0x0b,
				/* Path statistics (error counts, etc.) */
	XPT_GDEV_STATS		= 0x0c,
				/* Device statistics (error counts, etc.) */
	XPT_DEV_ADVINFO		= 0x0e,
				/* Get/Set Device advanced information */
	XPT_ASYNC		= 0x0f | XPT_FC_QUEUED | XPT_FC_USER_CCB
				       | XPT_FC_XPT_ONLY,
				/* Asynchronous event */
/* SCSI Control Functions: 0x10->0x1F */
	XPT_ABORT		= 0x10,
				/* Abort the specified CCB */
	XPT_RESET_BUS		= 0x11 | XPT_FC_XPT_ONLY,
				/* Reset the specified SCSI bus */
	XPT_RESET_DEV		= 0x12 | XPT_FC_DEV_QUEUED,
				/* Bus Device Reset the specified SCSI device */
	XPT_TERM_IO		= 0x13,
				/* Terminate the I/O process */
	XPT_SCAN_LUN		= 0x14 | XPT_FC_QUEUED | XPT_FC_USER_CCB
				       | XPT_FC_XPT_ONLY,
				/* Scan Logical Unit */
	XPT_GET_TRAN_SETTINGS	= 0x15,
				/*
				 * Get default/user transfer settings
				 * for the target
				 */
	XPT_SET_TRAN_SETTINGS	= 0x16,
				/*
				 * Set transfer rate/width
				 * negotiation settings
				 */
	XPT_CALC_GEOMETRY	= 0x17,
				/*
				 * Calculate the geometry parameters for
				 * a device given the sector size and
				 * volume size.
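				 *
				 * A minimal sketch of the usual call,
				 * assuming a peripheral driver with a valid
				 * path (struct ccb_calc_geometry is defined
				 * later in this file; sizes are illustrative):
				 *
				 *	struct ccb_calc_geometry ccg;
				 *
				 *	xpt_setup_ccb(&ccg.ccb_h, periph->path,
				 *	    CAM_PRIORITY_NORMAL);
				 *	ccg.ccb_h.func_code = XPT_CALC_GEOMETRY;
				 *	ccg.block_size = 512;
				 *	ccg.volume_size = vol_size_in_blocks;
				 *	xpt_action((union ccb *)&ccg);
				 *	(ccg.cylinders, ccg.heads and
				 *	ccg.secs_per_track are filled in)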
*/ XPT_ATA_IO = 0x18 | XPT_FC_DEV_QUEUED, /* Execute the requested ATA I/O operation */ XPT_GET_SIM_KNOB = 0x18, /* * Get SIM specific knob values. */ XPT_SET_SIM_KNOB = 0x19, /* * Set SIM specific knob values. */ XPT_SMP_IO = 0x1b | XPT_FC_DEV_QUEUED, /* Serial Management Protocol */ XPT_SCAN_TGT = 0x1E | XPT_FC_QUEUED | XPT_FC_USER_CCB | XPT_FC_XPT_ONLY, /* Scan Target */ /* HBA engine commands 0x20->0x2F */ XPT_ENG_INQ = 0x20 | XPT_FC_XPT_ONLY, /* HBA engine feature inquiry */ XPT_ENG_EXEC = 0x21 | XPT_FC_DEV_QUEUED, /* HBA execute engine request */ /* Target mode commands: 0x30->0x3F */ XPT_EN_LUN = 0x30, /* Enable LUN as a target */ XPT_TARGET_IO = 0x31 | XPT_FC_DEV_QUEUED, /* Execute target I/O request */ XPT_ACCEPT_TARGET_IO = 0x32 | XPT_FC_QUEUED | XPT_FC_USER_CCB, /* Accept Host Target Mode CDB */ XPT_CONT_TARGET_IO = 0x33 | XPT_FC_DEV_QUEUED, /* Continue Host Target I/O Connection */ XPT_IMMED_NOTIFY = 0x34 | XPT_FC_QUEUED | XPT_FC_USER_CCB, /* Notify Host Target driver of event (obsolete) */ XPT_NOTIFY_ACK = 0x35, /* Acknowledgement of event (obsolete) */ XPT_IMMEDIATE_NOTIFY = 0x36 | XPT_FC_QUEUED | XPT_FC_USER_CCB, /* Notify Host Target driver of event */ XPT_NOTIFY_ACKNOWLEDGE = 0x37 | XPT_FC_QUEUED | XPT_FC_USER_CCB, /* Acknowledgement of event */ /* Vendor Unique codes: 0x80->0x8F */ XPT_VUNIQUE = 0x80 } xpt_opcode; #define XPT_FC_GROUP_MASK 0xF0 #define XPT_FC_GROUP(op) ((op) & XPT_FC_GROUP_MASK) #define XPT_FC_GROUP_COMMON 0x00 #define XPT_FC_GROUP_SCSI_CONTROL 0x10 #define XPT_FC_GROUP_HBA_ENGINE 0x20 #define XPT_FC_GROUP_TMODE 0x30 #define XPT_FC_GROUP_VENDOR_UNIQUE 0x80 #define XPT_FC_IS_DEV_QUEUED(ccb) \ (((ccb)->ccb_h.func_code & XPT_FC_DEV_QUEUED) == XPT_FC_DEV_QUEUED) #define XPT_FC_IS_QUEUED(ccb) \ (((ccb)->ccb_h.func_code & XPT_FC_QUEUED) != 0) typedef enum { PROTO_UNKNOWN, PROTO_UNSPECIFIED, PROTO_SCSI, /* Small Computer System Interface */ PROTO_ATA, /* AT Attachment */ PROTO_ATAPI, /* AT Attachment Packetized Interface */ PROTO_SATAPM, /* SATA Port Multiplier */ PROTO_SEMB, /* SATA Enclosure Management Bridge */ } cam_proto; typedef enum { XPORT_UNKNOWN, XPORT_UNSPECIFIED, XPORT_SPI, /* SCSI Parallel Interface */ XPORT_FC, /* Fiber Channel */ XPORT_SSA, /* Serial Storage Architecture */ XPORT_USB, /* Universal Serial Bus */ XPORT_PPB, /* Parallel Port Bus */ XPORT_ATA, /* AT Attachment */ XPORT_SAS, /* Serial Attached SCSI */ XPORT_SATA, /* Serial AT Attachment */ XPORT_ISCSI, /* iSCSI */ XPORT_SRP, /* SCSI RDMA Protocol */ } cam_xport; #define XPORT_IS_ATA(t) ((t) == XPORT_ATA || (t) == XPORT_SATA) #define XPORT_IS_SCSI(t) ((t) != XPORT_UNKNOWN && \ (t) != XPORT_UNSPECIFIED && \ !XPORT_IS_ATA(t)) #define XPORT_DEVSTAT_TYPE(t) (XPORT_IS_ATA(t) ? DEVSTAT_TYPE_IF_IDE : \ XPORT_IS_SCSI(t) ? 
DEVSTAT_TYPE_IF_SCSI : \ DEVSTAT_TYPE_IF_OTHER) #define PROTO_VERSION_UNKNOWN (UINT_MAX - 1) #define PROTO_VERSION_UNSPECIFIED UINT_MAX #define XPORT_VERSION_UNKNOWN (UINT_MAX - 1) #define XPORT_VERSION_UNSPECIFIED UINT_MAX typedef union { LIST_ENTRY(ccb_hdr) le; SLIST_ENTRY(ccb_hdr) sle; TAILQ_ENTRY(ccb_hdr) tqe; STAILQ_ENTRY(ccb_hdr) stqe; } camq_entry; typedef union { void *ptr; u_long field; u_int8_t bytes[sizeof(uintptr_t)]; } ccb_priv_entry; typedef union { ccb_priv_entry entries[CCB_PERIPH_PRIV_SIZE]; u_int8_t bytes[CCB_PERIPH_PRIV_SIZE * sizeof(ccb_priv_entry)]; } ccb_ppriv_area; typedef union { ccb_priv_entry entries[CCB_SIM_PRIV_SIZE]; u_int8_t bytes[CCB_SIM_PRIV_SIZE * sizeof(ccb_priv_entry)]; } ccb_spriv_area; typedef struct { struct timeval *etime; uintptr_t sim_data; uintptr_t periph_data; } ccb_qos_area; struct ccb_hdr { cam_pinfo pinfo; /* Info for priority scheduling */ camq_entry xpt_links; /* For chaining in the XPT layer */ camq_entry sim_links; /* For chaining in the SIM layer */ camq_entry periph_links; /* For chaining in the type driver */ u_int32_t retry_count; void (*cbfcnp)(struct cam_periph *, union ccb *); /* Callback on completion function */ xpt_opcode func_code; /* XPT function code */ u_int32_t status; /* Status returned by CAM subsystem */ struct cam_path *path; /* Compiled path for this ccb */ path_id_t path_id; /* Path ID for the request */ target_id_t target_id; /* Target device ID */ lun_id_t target_lun; /* Target LUN number */ lun64_id_t ext_lun; /* 64bit extended/multi-level LUNs */ u_int32_t flags; /* ccb_flags */ u_int32_t xflags; /* Extended flags */ ccb_ppriv_area periph_priv; ccb_spriv_area sim_priv; ccb_qos_area qos; u_int32_t timeout; /* Hard timeout value in mseconds */ struct timeval softtimeout; /* Soft timeout value in sec + usec */ }; /* Get Device Information CCB */ struct ccb_getdev { struct ccb_hdr ccb_h; cam_proto protocol; struct scsi_inquiry_data inq_data; struct ata_params ident_data; u_int8_t serial_num[252]; u_int8_t inq_flags; u_int8_t serial_num_len; }; /* Device Statistics CCB */ struct ccb_getdevstats { struct ccb_hdr ccb_h; int dev_openings; /* Space left for more work on device*/ int dev_active; /* Transactions running on the device */ int allocated; /* CCBs allocated for the device */ int queued; /* CCBs queued to be sent to the device */ int held; /* * CCBs held by peripheral drivers * for this device */ int maxtags; /* * Boundary conditions for number of * tagged operations */ int mintags; struct timeval last_reset; /* Time of last bus reset/loop init */ }; typedef enum { CAM_GDEVLIST_LAST_DEVICE, CAM_GDEVLIST_LIST_CHANGED, CAM_GDEVLIST_MORE_DEVS, CAM_GDEVLIST_ERROR } ccb_getdevlist_status_e; struct ccb_getdevlist { struct ccb_hdr ccb_h; char periph_name[DEV_IDLEN]; u_int32_t unit_number; unsigned int generation; u_int32_t index; ccb_getdevlist_status_e status; }; typedef enum { PERIPH_MATCH_NONE = 0x000, PERIPH_MATCH_PATH = 0x001, PERIPH_MATCH_TARGET = 0x002, PERIPH_MATCH_LUN = 0x004, PERIPH_MATCH_NAME = 0x008, PERIPH_MATCH_UNIT = 0x010, PERIPH_MATCH_ANY = 0x01f } periph_pattern_flags; struct periph_match_pattern { char periph_name[DEV_IDLEN]; u_int32_t unit_number; path_id_t path_id; target_id_t target_id; lun_id_t target_lun; periph_pattern_flags flags; }; typedef enum { DEV_MATCH_NONE = 0x000, DEV_MATCH_PATH = 0x001, DEV_MATCH_TARGET = 0x002, DEV_MATCH_LUN = 0x004, DEV_MATCH_INQUIRY = 0x008, DEV_MATCH_DEVID = 0x010, DEV_MATCH_ANY = 0x00f } dev_pattern_flags; struct device_id_match_pattern { uint8_t id_len; uint8_t 
id[256]; }; struct device_match_pattern { path_id_t path_id; target_id_t target_id; lun_id_t target_lun; dev_pattern_flags flags; union { struct scsi_static_inquiry_pattern inq_pat; struct device_id_match_pattern devid_pat; } data; }; typedef enum { BUS_MATCH_NONE = 0x000, BUS_MATCH_PATH = 0x001, BUS_MATCH_NAME = 0x002, BUS_MATCH_UNIT = 0x004, BUS_MATCH_BUS_ID = 0x008, BUS_MATCH_ANY = 0x00f } bus_pattern_flags; struct bus_match_pattern { path_id_t path_id; char dev_name[DEV_IDLEN]; u_int32_t unit_number; u_int32_t bus_id; bus_pattern_flags flags; }; union match_pattern { struct periph_match_pattern periph_pattern; struct device_match_pattern device_pattern; struct bus_match_pattern bus_pattern; }; typedef enum { DEV_MATCH_PERIPH, DEV_MATCH_DEVICE, DEV_MATCH_BUS } dev_match_type; struct dev_match_pattern { dev_match_type type; union match_pattern pattern; }; struct periph_match_result { char periph_name[DEV_IDLEN]; u_int32_t unit_number; path_id_t path_id; target_id_t target_id; lun_id_t target_lun; }; typedef enum { DEV_RESULT_NOFLAG = 0x00, DEV_RESULT_UNCONFIGURED = 0x01 } dev_result_flags; struct device_match_result { path_id_t path_id; target_id_t target_id; lun_id_t target_lun; cam_proto protocol; struct scsi_inquiry_data inq_data; struct ata_params ident_data; dev_result_flags flags; }; struct bus_match_result { path_id_t path_id; char dev_name[DEV_IDLEN]; u_int32_t unit_number; u_int32_t bus_id; }; union match_result { struct periph_match_result periph_result; struct device_match_result device_result; struct bus_match_result bus_result; }; struct dev_match_result { dev_match_type type; union match_result result; }; typedef enum { CAM_DEV_MATCH_LAST, CAM_DEV_MATCH_MORE, CAM_DEV_MATCH_LIST_CHANGED, CAM_DEV_MATCH_SIZE_ERROR, CAM_DEV_MATCH_ERROR } ccb_dev_match_status; typedef enum { CAM_DEV_POS_NONE = 0x000, CAM_DEV_POS_BUS = 0x001, CAM_DEV_POS_TARGET = 0x002, CAM_DEV_POS_DEVICE = 0x004, CAM_DEV_POS_PERIPH = 0x008, CAM_DEV_POS_PDPTR = 0x010, CAM_DEV_POS_TYPEMASK = 0xf00, CAM_DEV_POS_EDT = 0x100, CAM_DEV_POS_PDRV = 0x200 } dev_pos_type; struct ccb_dm_cookie { void *bus; void *target; void *device; void *periph; void *pdrv; }; struct ccb_dev_position { u_int generations[4]; #define CAM_BUS_GENERATION 0x00 #define CAM_TARGET_GENERATION 0x01 #define CAM_DEV_GENERATION 0x02 #define CAM_PERIPH_GENERATION 0x03 dev_pos_type position_type; struct ccb_dm_cookie cookie; }; struct ccb_dev_match { struct ccb_hdr ccb_h; ccb_dev_match_status status; u_int32_t num_patterns; u_int32_t pattern_buf_len; struct dev_match_pattern *patterns; u_int32_t num_matches; u_int32_t match_buf_len; struct dev_match_result *matches; struct ccb_dev_position pos; }; /* * Definitions for the path inquiry CCB fields. */ #define CAM_VERSION 0x18 /* Hex value for current version */ typedef enum { PI_MDP_ABLE = 0x80, /* Supports MDP message */ PI_WIDE_32 = 0x40, /* Supports 32 bit wide SCSI */ PI_WIDE_16 = 0x20, /* Supports 16 bit wide SCSI */ PI_SDTR_ABLE = 0x10, /* Supports SDTR message */ PI_LINKED_CDB = 0x08, /* Supports linked CDBs */ PI_SATAPM = 0x04, /* Supports SATA PM */ PI_TAG_ABLE = 0x02, /* Supports tag queue messages */ PI_SOFT_RST = 0x01 /* Supports soft reset alternative */ } pi_inqflag; typedef enum { PIT_PROCESSOR = 0x80, /* Target mode processor mode */ PIT_PHASE = 0x40, /* Target mode phase cog. 
mode */ PIT_DISCONNECT = 0x20, /* Disconnects supported in target mode */ PIT_TERM_IO = 0x10, /* Terminate I/O message supported in TM */ PIT_GRP_6 = 0x08, /* Group 6 commands supported */ PIT_GRP_7 = 0x04 /* Group 7 commands supported */ } pi_tmflag; typedef enum { PIM_EXTLUNS = 0x100,/* 64bit extended LUNs supported */ PIM_SCANHILO = 0x80, /* Bus scans from high ID to low ID */ PIM_NOREMOVE = 0x40, /* Removeable devices not included in scan */ PIM_NOINITIATOR = 0x20, /* Initiator role not supported. */ PIM_NOBUSRESET = 0x10, /* User has disabled initial BUS RESET */ PIM_NO_6_BYTE = 0x08, /* Do not send 6-byte commands */ PIM_SEQSCAN = 0x04, /* Do bus scans sequentially, not in parallel */ PIM_UNMAPPED = 0x02, PIM_NOSCAN = 0x01 /* SIM does its own scanning */ } pi_miscflag; /* Path Inquiry CCB */ struct ccb_pathinq_settings_spi { u_int8_t ppr_options; }; struct ccb_pathinq_settings_fc { u_int64_t wwnn; /* world wide node name */ u_int64_t wwpn; /* world wide port name */ u_int32_t port; /* 24 bit port id, if known */ u_int32_t bitrate; /* Mbps */ }; struct ccb_pathinq_settings_sas { u_int32_t bitrate; /* Mbps */ }; #define PATHINQ_SETTINGS_SIZE 128 struct ccb_pathinq { struct ccb_hdr ccb_h; u_int8_t version_num; /* Version number for the SIM/HBA */ u_int8_t hba_inquiry; /* Mimic of INQ byte 7 for the HBA */ u_int16_t target_sprt; /* Flags for target mode support */ u_int32_t hba_misc; /* Misc HBA features */ u_int16_t hba_eng_cnt; /* HBA engine count */ /* Vendor Unique capabilities */ u_int8_t vuhba_flags[VUHBALEN]; u_int32_t max_target; /* Maximum supported Target */ u_int32_t max_lun; /* Maximum supported Lun */ u_int32_t async_flags; /* Installed Async handlers */ path_id_t hpath_id; /* Highest Path ID in the subsystem */ target_id_t initiator_id; /* ID of the HBA on the SCSI bus */ char sim_vid[SIM_IDLEN]; /* Vendor ID of the SIM */ char hba_vid[HBA_IDLEN]; /* Vendor ID of the HBA */ char dev_name[DEV_IDLEN];/* Device name for SIM */ u_int32_t unit_number; /* Unit number for SIM */ u_int32_t bus_id; /* Bus ID for SIM */ u_int32_t base_transfer_speed;/* Base bus speed in KB/sec */ cam_proto protocol; u_int protocol_version; cam_xport transport; u_int transport_version; union { struct ccb_pathinq_settings_spi spi; struct ccb_pathinq_settings_fc fc; struct ccb_pathinq_settings_sas sas; char ccb_pathinq_settings_opaque[PATHINQ_SETTINGS_SIZE]; } xport_specific; u_int maxio; /* Max supported I/O size, in bytes. */ u_int16_t hba_vendor; /* HBA vendor ID */ u_int16_t hba_device; /* HBA device ID */ u_int16_t hba_subvendor; /* HBA subvendor ID */ u_int16_t hba_subdevice; /* HBA subdevice ID */ }; /* Path Statistics CCB */ struct ccb_pathstats { struct ccb_hdr ccb_h; struct timeval last_reset; /* Time of last bus reset/loop init */ }; typedef enum { SMP_FLAG_NONE = 0x00, SMP_FLAG_REQ_SG = 0x01, SMP_FLAG_RSP_SG = 0x02 } ccb_smp_pass_flags; /* * Serial Management Protocol CCB * XXX Currently the semantics for this CCB are that it is executed either * by the addressed device, or that device's parent (i.e. an expander for * any device on an expander) if the addressed device doesn't support SMP. * Later, once we have the ability to probe SMP-only devices and put them * in CAM's topology, the CCB will only be executed by the addressed device * if possible. 
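 *
 * A hedged sketch of filling one out with cam_fill_smpio() (defined later
 * in this file; buffer allocation and error handling omitted).  Note the
 * KASSERTs in cam_fill_smpio(): SMP CCBs send a request and receive a
 * response in one transaction, so CAM_DIR_BOTH and non-empty buffers are
 * mandatory:
 *
 *	cam_fill_smpio(&ccb->smpio,
 *		       4,			(retries)
 *		       NULL,			(cbfcnp)
 *		       CAM_DIR_BOTH,		(flags)
 *		       request_buf,		(hypothetical buffer)
 *		       request_len,
 *		       response_buf,		(hypothetical buffer)
 *		       response_len,
 *		       5000);			(timeout, in ms)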
 */
struct ccb_smpio {
	struct ccb_hdr		ccb_h;
	uint8_t			*smp_request;
	int			smp_request_len;
	uint16_t		smp_request_sglist_cnt;
	uint8_t			*smp_response;
	int			smp_response_len;
	uint16_t		smp_response_sglist_cnt;
	ccb_smp_pass_flags	flags;
};

typedef union {
	u_int8_t *sense_ptr;		/*
					 * Pointer to storage
					 * for sense information
					 */
					/* Storage Area for sense information */
	struct	 scsi_sense_data sense_buf;
} sense_t;

typedef union {
	u_int8_t  *cdb_ptr;		/* Pointer to the CDB bytes to send */
					/* Area for the CDB send */
	u_int8_t  cdb_bytes[IOCDBLEN];
} cdb_t;

/*
 * SCSI I/O Request CCB used for the XPT_SCSI_IO and XPT_CONT_TARGET_IO
 * function codes.
 */
struct ccb_scsiio {
	struct	   ccb_hdr ccb_h;
	union	   ccb *next_ccb;	/* Ptr for next CCB for action */
	u_int8_t   *req_map;		/* Ptr to mapping info */
	u_int8_t   *data_ptr;		/* Ptr to the data buf/SG list */
	u_int32_t  dxfer_len;		/* Data transfer length */
	/* Autosense storage */
	struct     scsi_sense_data sense_data;
	u_int8_t   sense_len;		/* Number of bytes to autosense */
	u_int8_t   cdb_len;		/* Number of bytes for the CDB */
	u_int16_t  sglist_cnt;		/* Number of SG list entries */
	u_int8_t   scsi_status;		/* Returned SCSI status */
	u_int8_t   sense_resid;		/* Autosense resid length: 2's comp */
	u_int32_t  resid;		/* Transfer residual length: 2's comp */
	cdb_t	   cdb_io;		/* Union for CDB bytes/pointer */
	u_int8_t   *msg_ptr;		/* Pointer to the message buffer */
	u_int16_t  msg_len;		/* Number of bytes for the Message */
	u_int8_t   tag_action;		/* What to do for tag queueing */
	/*
	 * The tag action should be either the define below (to send a
	 * non-tagged transaction) or one of the defined scsi tag messages
	 * from scsi_message.h.
	 */
#define		CAM_TAG_ACTION_NONE	0x00
	u_int	   tag_id;		/* tag id from initiator (target mode) */
	u_int	   init_id;		/* initiator id of who selected */
};

/*
 * ATA I/O Request CCB used for the XPT_ATA_IO function code.
 */
struct ccb_ataio {
	struct	   ccb_hdr ccb_h;
	union	   ccb *next_ccb;	/* Ptr for next CCB for action */
	struct ata_cmd	cmd;		/* ATA command register set */
	struct ata_res	res;		/* ATA result register set */
	u_int8_t   *data_ptr;		/* Ptr to the data buf/SG list */
	u_int32_t  dxfer_len;		/* Data transfer length */
	u_int32_t  resid;		/* Transfer residual length: 2's comp */
	u_int8_t   tag_action;		/* What to do for tag queueing */
	/*
	 * The tag action should be either the define below (to send a
	 * non-tagged transaction) or one of the defined scsi tag messages
	 * from scsi_message.h.
	 */
#define		CAM_TAG_ACTION_NONE	0x00
	u_int	   tag_id;		/* tag id from initiator (target mode) */
	u_int	   init_id;		/* initiator id of who selected */
};

struct ccb_accept_tio {
	struct	   ccb_hdr ccb_h;
	cdb_t	   cdb_io;		/* Union for CDB bytes/pointer */
	u_int8_t   cdb_len;		/* Number of bytes for the CDB */
	u_int8_t   tag_action;		/* What to do for tag queueing */
	u_int8_t   sense_len;		/* Number of bytes of Sense Data */
	u_int	   tag_id;		/* tag id from initiator (target mode) */
	u_int	   init_id;		/* initiator id of who selected */
	struct     scsi_sense_data sense_data;
};

/* Release SIM Queue */
struct ccb_relsim {
	struct ccb_hdr ccb_h;
	u_int32_t      release_flags;
#define RELSIM_ADJUST_OPENINGS		0x01
#define RELSIM_RELEASE_AFTER_TIMEOUT	0x02
#define RELSIM_RELEASE_AFTER_CMDCMPLT	0x04
#define RELSIM_RELEASE_AFTER_QEMPTY	0x08
	u_int32_t      openings;
	u_int32_t      release_timeout;	/* Abstract argument. */
	u_int32_t      qfrozen_cnt;
};

/*
 * Definitions for the asynchronous callback CCB fields.
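 *
 * A peripheral driver subscribes to a subset of the ac_code events below
 * using an XPT_SASYNC_CB CCB.  A hedged sketch, using the struct
 * ccb_setasync fields defined below (mycallback and mysoftc are
 * hypothetical):
 *
 *	struct ccb_setasync csa;
 *
 *	xpt_setup_ccb(&csa.ccb_h, path, CAM_PRIORITY_NORMAL);
 *	csa.ccb_h.func_code = XPT_SASYNC_CB;
 *	csa.event_enable = AC_FOUND_DEVICE | AC_LOST_DEVICE;
 *	csa.callback = mycallback;	(an ac_callback_t)
 *	csa.callback_arg = mysoftc;
 *	xpt_action((union ccb *)&csa);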
 */
typedef enum {
	AC_UNIT_ATTENTION	= 0x4000,/* Device reported UNIT ATTENTION */
	AC_ADVINFO_CHANGED	= 0x2000,/* Advanced info might have changed */
	AC_CONTRACT		= 0x1000,/* A contractual callback */
	AC_GETDEV_CHANGED	= 0x800,/* Getdev info might have changed */
	AC_INQ_CHANGED		= 0x400,/* Inquiry info might have changed */
	AC_TRANSFER_NEG		= 0x200,/* New transfer settings in effect */
	AC_LOST_DEVICE		= 0x100,/* A device went away */
	AC_FOUND_DEVICE		= 0x080,/* A new device was found */
	AC_PATH_DEREGISTERED	= 0x040,/* A path has de-registered */
	AC_PATH_REGISTERED	= 0x020,/* A new path has been registered */
	AC_SENT_BDR		= 0x010,/* A BDR message was sent to target */
	AC_SCSI_AEN		= 0x008,/* A SCSI AEN has been received */
	AC_UNSOL_RESEL		= 0x002,/* Unsolicited reselection occurred */
	AC_BUS_RESET		= 0x001 /* A SCSI bus reset occurred */
} ac_code;

typedef void ac_callback_t (void *softc, u_int32_t code,
			    struct cam_path *path, void *args);

/*
 * Generic Asynchronous callbacks.
 *
 * Generic arguments passed back, which are then interpreted according to
 * a per-system contract number.
 */
#define	AC_CONTRACT_DATA_MAX (128 - sizeof (u_int64_t))
struct ac_contract {
	u_int64_t	contract_number;
	u_int8_t	contract_data[AC_CONTRACT_DATA_MAX];
};

#define AC_CONTRACT_DEV_CHG 1
struct ac_device_changed {
	u_int64_t	wwpn;
	u_int32_t	port;
	target_id_t	target;
	u_int8_t	arrived;
};

/* Set Asynchronous Callback CCB */
struct ccb_setasync {
	struct ccb_hdr	 ccb_h;
	u_int32_t	 event_enable;	/* Async Event enables */
	ac_callback_t	*callback;
	void		*callback_arg;
};

/* Set Device Type CCB */
struct ccb_setdev {
	struct	   ccb_hdr ccb_h;
	u_int8_t   dev_type;	/* Value for dev type field in EDT */
};

/* SCSI Control Functions */

/* Abort XPT request CCB */
struct ccb_abort {
	struct	ccb_hdr ccb_h;
	union	ccb *abort_ccb;	/* Pointer to CCB to abort */
};

/* Reset SCSI Bus CCB */
struct ccb_resetbus {
	struct	ccb_hdr ccb_h;
};

/* Reset SCSI Device CCB */
struct ccb_resetdev {
	struct	ccb_hdr ccb_h;
};

/* Terminate I/O Process Request CCB */
struct ccb_termio {
	struct	ccb_hdr ccb_h;
	union	ccb *termio_ccb;	/* Pointer to CCB to terminate */
};

typedef enum {
	CTS_TYPE_CURRENT_SETTINGS,
	CTS_TYPE_USER_SETTINGS
} cts_type;

struct ccb_trans_settings_scsi
{
	u_int	valid;	/* Which fields to honor */
#define	CTS_SCSI_VALID_TQ		0x01
	u_int	flags;
#define	CTS_SCSI_FLAGS_TAG_ENB		0x01
};

struct ccb_trans_settings_ata
{
	u_int	valid;	/* Which fields to honor */
#define	CTS_ATA_VALID_TQ		0x01
	u_int	flags;
#define	CTS_ATA_FLAGS_TAG_ENB		0x01
};

struct ccb_trans_settings_spi
{
	u_int	  valid;	/* Which fields to honor */
#define	CTS_SPI_VALID_SYNC_RATE		0x01
#define	CTS_SPI_VALID_SYNC_OFFSET	0x02
#define	CTS_SPI_VALID_BUS_WIDTH		0x04
#define	CTS_SPI_VALID_DISC		0x08
#define CTS_SPI_VALID_PPR_OPTIONS	0x10
	u_int	flags;
#define	CTS_SPI_FLAGS_DISC_ENB		0x01
	u_int	sync_period;
	u_int	sync_offset;
	u_int	bus_width;
	u_int	ppr_options;
};

struct ccb_trans_settings_fc {
	u_int     valid;	/* Which fields to honor */
#define	CTS_FC_VALID_WWNN		0x8000
#define	CTS_FC_VALID_WWPN		0x4000
#define	CTS_FC_VALID_PORT		0x2000
#define	CTS_FC_VALID_SPEED		0x1000
	u_int64_t wwnn;		/* world wide node name */
	u_int64_t wwpn;		/* world wide port name */
	u_int32_t port;		/* 24 bit port id, if known */
	u_int32_t bitrate;	/* Mbps */
};

struct ccb_trans_settings_sas {
	u_int     valid;	/* Which fields to honor */
#define	CTS_SAS_VALID_SPEED		0x1000
	u_int32_t bitrate;	/* Mbps */
};

struct ccb_trans_settings_pata {
	u_int     valid;	/* Which fields to honor */
#define	CTS_ATA_VALID_MODE		0x01
#define	CTS_ATA_VALID_BYTECOUNT		0x02
#define
CTS_ATA_VALID_ATAPI 0x20 #define CTS_ATA_VALID_CAPS 0x40 int mode; /* Mode */ u_int bytecount; /* Length of PIO transaction */ u_int atapi; /* Length of ATAPI CDB */ u_int caps; /* Device and host SATA caps. */ #define CTS_ATA_CAPS_H 0x0000ffff #define CTS_ATA_CAPS_H_DMA48 0x00000001 /* 48-bit DMA */ #define CTS_ATA_CAPS_D 0xffff0000 }; struct ccb_trans_settings_sata { u_int valid; /* Which fields to honor */ #define CTS_SATA_VALID_MODE 0x01 #define CTS_SATA_VALID_BYTECOUNT 0x02 #define CTS_SATA_VALID_REVISION 0x04 #define CTS_SATA_VALID_PM 0x08 #define CTS_SATA_VALID_TAGS 0x10 #define CTS_SATA_VALID_ATAPI 0x20 #define CTS_SATA_VALID_CAPS 0x40 int mode; /* Legacy PATA mode */ u_int bytecount; /* Length of PIO transaction */ int revision; /* SATA revision */ u_int pm_present; /* PM is present (XPT->SIM) */ u_int tags; /* Number of allowed tags */ u_int atapi; /* Length of ATAPI CDB */ u_int caps; /* Device and host SATA caps. */ #define CTS_SATA_CAPS_H 0x0000ffff #define CTS_SATA_CAPS_H_PMREQ 0x00000001 #define CTS_SATA_CAPS_H_APST 0x00000002 #define CTS_SATA_CAPS_H_DMAAA 0x00000010 /* Auto-activation */ #define CTS_SATA_CAPS_H_AN 0x00000020 /* Async. notification */ #define CTS_SATA_CAPS_D 0xffff0000 #define CTS_SATA_CAPS_D_PMREQ 0x00010000 #define CTS_SATA_CAPS_D_APST 0x00020000 }; /* Get/Set transfer rate/width/disconnection/tag queueing settings */ struct ccb_trans_settings { struct ccb_hdr ccb_h; cts_type type; /* Current or User settings */ cam_proto protocol; u_int protocol_version; cam_xport transport; u_int transport_version; union { u_int valid; /* Which fields to honor */ struct ccb_trans_settings_ata ata; struct ccb_trans_settings_scsi scsi; } proto_specific; union { u_int valid; /* Which fields to honor */ struct ccb_trans_settings_spi spi; struct ccb_trans_settings_fc fc; struct ccb_trans_settings_sas sas; struct ccb_trans_settings_pata ata; struct ccb_trans_settings_sata sata; } xport_specific; }; /* * Calculate the geometry parameters for a device * give the block size and volume size in blocks. */ struct ccb_calc_geometry { struct ccb_hdr ccb_h; u_int32_t block_size; u_int64_t volume_size; u_int32_t cylinders; u_int8_t heads; u_int8_t secs_per_track; }; /* * Set or get SIM (and transport) specific knobs */ #define KNOB_VALID_ADDRESS 0x1 #define KNOB_VALID_ROLE 0x2 #define KNOB_ROLE_NONE 0x0 #define KNOB_ROLE_INITIATOR 0x1 #define KNOB_ROLE_TARGET 0x2 #define KNOB_ROLE_BOTH 0x3 struct ccb_sim_knob_settings_spi { u_int valid; u_int initiator_id; u_int role; }; struct ccb_sim_knob_settings_fc { u_int valid; u_int64_t wwnn; /* world wide node name */ u_int64_t wwpn; /* world wide port name */ u_int role; }; struct ccb_sim_knob_settings_sas { u_int valid; u_int64_t wwnn; /* world wide node name */ u_int role; }; #define KNOB_SETTINGS_SIZE 128 struct ccb_sim_knob { struct ccb_hdr ccb_h; union { u_int valid; /* Which fields to honor */ struct ccb_sim_knob_settings_spi spi; struct ccb_sim_knob_settings_fc fc; struct ccb_sim_knob_settings_sas sas; char pad[KNOB_SETTINGS_SIZE]; } xport_specific; }; /* * Rescan the given bus, or bus/target/lun */ struct ccb_rescan { struct ccb_hdr ccb_h; cam_flags flags; }; /* * Turn on debugging for the given bus, bus/target, or bus/target/lun. */ struct ccb_debug { struct ccb_hdr ccb_h; cam_debug_flags flags; }; /* Target mode structures. 
 */
struct ccb_en_lun {
	struct	  ccb_hdr ccb_h;
	u_int16_t grp6_len;		/* Group 6 VU CDB length */
	u_int16_t grp7_len;		/* Group 7 VU CDB length */
	u_int8_t  enable;
};

/* old, barely used immediate notify, binary compatibility */
struct ccb_immed_notify {
	struct	  ccb_hdr ccb_h;
	struct    scsi_sense_data sense_data;
	u_int8_t  sense_len;		/* Number of bytes in sense buffer */
	u_int8_t  initiator_id;		/* Id of initiator that selected */
	u_int8_t  message_args[7];	/* Message Arguments */
};

struct ccb_notify_ack {
	struct	  ccb_hdr ccb_h;
	u_int16_t seq_id;		/* Sequence identifier */
	u_int8_t  event;		/* Event flags */
};

struct ccb_immediate_notify {
	struct    ccb_hdr ccb_h;
	u_int     tag_id;		/* Tag for immediate notify */
	u_int     seq_id;		/* Tag for target of notify */
	u_int     initiator_id;		/* Initiator Identifier */
	u_int     arg;			/* Function specific */
};

struct ccb_notify_acknowledge {
	struct    ccb_hdr ccb_h;
	u_int     tag_id;		/* Tag for immediate notify */
	u_int     seq_id;		/* Tag for target of notify */
	u_int     initiator_id;		/* Initiator Identifier */
	u_int     arg;			/* Function specific */
};

/* HBA engine structures. */

typedef enum {
	EIT_BUFFER,	/* Engine type: buffer memory */
	EIT_LOSSLESS,	/* Engine type: lossless compression */
	EIT_LOSSY,	/* Engine type: lossy compression */
	EIT_ENCRYPT	/* Engine type: encryption */
} ei_type;

typedef enum {
	EAD_VUNIQUE,	/* Engine algorithm ID: vendor unique */
	EAD_LZ1V1,	/* Engine algorithm ID: LZ1 var.1 */
	EAD_LZ2V1,	/* Engine algorithm ID: LZ2 var.1 */
	EAD_LZ2V2	/* Engine algorithm ID: LZ2 var.2 */
} ei_algo;

struct ccb_eng_inq {
	struct	  ccb_hdr ccb_h;
	u_int16_t eng_num;	/* The engine number for this inquiry */
	ei_type   eng_type;	/* Returned engine type */
	ei_algo   eng_algo;	/* Returned engine algorithm type */
	u_int32_t eng_memeory;	/* Returned engine memory size */
};

struct ccb_eng_exec {	/* This structure must match SCSIIO size */
	struct	  ccb_hdr ccb_h;
	u_int8_t  *pdrv_ptr;	/* Ptr used by the peripheral driver */
	u_int8_t  *req_map;	/* Ptr for mapping info on the req. */
	u_int8_t  *data_ptr;	/* Pointer to the data buf/SG list */
	u_int32_t dxfer_len;	/* Data transfer length */
	u_int8_t  *engdata_ptr;	/* Pointer to the engine buffer data */
	u_int16_t sglist_cnt;	/* Num of scatter gather list entries */
	u_int32_t dmax_len;	/* Destination data maximum length */
	u_int32_t dest_len;	/* Destination data length */
	int32_t	  src_resid;	/* Source residual length: 2's comp */
	u_int32_t timeout;	/* Timeout value */
	u_int16_t eng_num;	/* Engine number for this request */
	u_int16_t vu_flags;	/* Vendor Unique flags */
};

/*
 * Definitions for the timeout field in the SCSI I/O CCB.
 */
#define	CAM_TIME_DEFAULT	0x00000000	/* Use SIM default value */
#define	CAM_TIME_INFINITY	0xFFFFFFFF	/* Infinite timeout */

#define	CAM_SUCCESS	0	/* For signaling general success */
#define	CAM_FAILURE	1	/* For signaling general failure */

#define CAM_FALSE	0
#define CAM_TRUE	1

#define	XPT_CCB_INVALID	-1	/* for signaling a bad CCB to free */

/*
 * CCB for working with advanced device information.  This operates in a
 * fashion similar to XPT_GDEV_TYPE.  Specify the target in ccb_h, the buffer
 * type requested, and provide a buffer size/buffer to write to.  If the
 * buffer is too small, provsiz will be larger than bufsiz.
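 *
 * The size report enables a two-pass pattern: issue the CCB once just to
 * learn provsiz, allocate a buffer that large, then reissue.  A rough
 * sketch, assuming the transport fills in provsiz even when the supplied
 * buffer is too small:
 *
 *	cdai.ccb_h.func_code = XPT_DEV_ADVINFO;
 *	cdai.flags = CDAI_FLAG_NONE;
 *	cdai.buftype = CDAI_TYPE_SERIAL_NUM;
 *	cdai.bufsiz = 0;
 *	xpt_action((union ccb *)&cdai);
 *	(then allocate cdai.provsiz bytes for cdai.buf, set cdai.bufsiz
 *	accordingly, and issue the CCB again)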
*/ struct ccb_dev_advinfo { struct ccb_hdr ccb_h; uint32_t flags; #define CDAI_FLAG_NONE 0x0 /* No flags set */ #define CDAI_FLAG_STORE 0x1 /* If set, action becomes store */ uint32_t buftype; /* IN: Type of data being requested */ /* NB: buftype is interpreted on a per-transport basis */ #define CDAI_TYPE_SCSI_DEVID 1 #define CDAI_TYPE_SERIAL_NUM 2 #define CDAI_TYPE_PHYS_PATH 3 #define CDAI_TYPE_RCAPLONG 4 #define CDAI_TYPE_EXT_INQ 5 off_t bufsiz; /* IN: Size of external buffer */ #define CAM_SCSI_DEVID_MAXLEN 65536 /* length in buffer is an uint16_t */ off_t provsiz; /* OUT: Size required/used */ uint8_t *buf; /* IN/OUT: Buffer for requested data */ }; /* * CCB for sending async events */ struct ccb_async { struct ccb_hdr ccb_h; uint32_t async_code; off_t async_arg_size; void *async_arg_ptr; }; /* * Union of all CCB types for kernel space allocation. This union should * never be used for manipulating CCBs - its only use is for the allocation * and deallocation of raw CCB space and is the return type of xpt_ccb_alloc * and the argument to xpt_ccb_free. */ union ccb { struct ccb_hdr ccb_h; /* For convenience */ struct ccb_scsiio csio; struct ccb_getdev cgd; struct ccb_getdevlist cgdl; struct ccb_pathinq cpi; struct ccb_relsim crs; struct ccb_setasync csa; struct ccb_setdev csd; struct ccb_pathstats cpis; struct ccb_getdevstats cgds; struct ccb_dev_match cdm; struct ccb_trans_settings cts; struct ccb_calc_geometry ccg; struct ccb_sim_knob knob; struct ccb_abort cab; struct ccb_resetbus crb; struct ccb_resetdev crd; struct ccb_termio tio; struct ccb_accept_tio atio; struct ccb_scsiio ctio; struct ccb_en_lun cel; struct ccb_immed_notify cin; struct ccb_notify_ack cna; struct ccb_immediate_notify cin1; struct ccb_notify_acknowledge cna2; struct ccb_eng_inq cei; struct ccb_eng_exec cee; struct ccb_smpio smpio; struct ccb_rescan crcn; struct ccb_debug cdbg; struct ccb_ataio ataio; struct ccb_dev_advinfo cdai; struct ccb_async casync; }; __BEGIN_DECLS static __inline void cam_fill_csio(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int8_t tag_action, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int8_t sense_len, u_int8_t cdb_len, u_int32_t timeout); static __inline void cam_fill_ctio(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int tag_action, u_int tag_id, u_int init_id, u_int scsi_status, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout); static __inline void cam_fill_ataio(struct ccb_ataio *ataio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int tag_action, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout); static __inline void cam_fill_smpio(struct ccb_smpio *smpio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint32_t flags, uint8_t *smp_request, int smp_request_len, uint8_t *smp_response, int smp_response_len, uint32_t timeout); static __inline void cam_fill_csio(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int8_t tag_action, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int8_t sense_len, u_int8_t cdb_len, u_int32_t timeout) { csio->ccb_h.func_code = XPT_SCSI_IO; csio->ccb_h.flags = flags; csio->ccb_h.xflags = 0; csio->ccb_h.retry_count = retries; csio->ccb_h.cbfcnp = cbfcnp; csio->ccb_h.timeout = timeout; csio->data_ptr = data_ptr; csio->dxfer_len = dxfer_len; csio->sense_len = sense_len; csio->cdb_len = 
cdb_len; csio->tag_action = tag_action; } static __inline void cam_fill_ctio(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int tag_action, u_int tag_id, u_int init_id, u_int scsi_status, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout) { csio->ccb_h.func_code = XPT_CONT_TARGET_IO; csio->ccb_h.flags = flags; csio->ccb_h.xflags = 0; csio->ccb_h.retry_count = retries; csio->ccb_h.cbfcnp = cbfcnp; csio->ccb_h.timeout = timeout; csio->data_ptr = data_ptr; csio->dxfer_len = dxfer_len; csio->scsi_status = scsi_status; csio->tag_action = tag_action; csio->tag_id = tag_id; csio->init_id = init_id; } static __inline void cam_fill_ataio(struct ccb_ataio *ataio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int32_t flags, u_int tag_action, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int32_t timeout) { ataio->ccb_h.func_code = XPT_ATA_IO; ataio->ccb_h.flags = flags; ataio->ccb_h.retry_count = retries; ataio->ccb_h.cbfcnp = cbfcnp; ataio->ccb_h.timeout = timeout; ataio->data_ptr = data_ptr; ataio->dxfer_len = dxfer_len; ataio->tag_action = tag_action; } static __inline void cam_fill_smpio(struct ccb_smpio *smpio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint32_t flags, uint8_t *smp_request, int smp_request_len, uint8_t *smp_response, int smp_response_len, uint32_t timeout) { #ifdef _KERNEL KASSERT((flags & CAM_DIR_MASK) == CAM_DIR_BOTH, ("direction != CAM_DIR_BOTH")); KASSERT((smp_request != NULL) && (smp_response != NULL), ("need valid request and response buffers")); KASSERT((smp_request_len != 0) && (smp_response_len != 0), ("need non-zero request and response lengths")); #endif /*_KERNEL*/ smpio->ccb_h.func_code = XPT_SMP_IO; smpio->ccb_h.flags = flags; smpio->ccb_h.retry_count = retries; smpio->ccb_h.cbfcnp = cbfcnp; smpio->ccb_h.timeout = timeout; smpio->smp_request = smp_request; smpio->smp_request_len = smp_request_len; smpio->smp_response = smp_response; smpio->smp_response_len = smp_response_len; } static __inline void cam_set_ccbstatus(union ccb *ccb, cam_status status) { ccb->ccb_h.status &= ~CAM_STATUS_MASK; ccb->ccb_h.status |= status; } static __inline cam_status cam_ccb_status(union ccb *ccb) { return ((cam_status)(ccb->ccb_h.status & CAM_STATUS_MASK)); } void cam_calc_geometry(struct ccb_calc_geometry *ccg, int extended); __END_DECLS #endif /* _CAM_CAM_CCB_H */ Index: stable/10/sys/cam/cam_xpt.c =================================================================== --- stable/10/sys/cam/cam_xpt.c (revision 292347) +++ stable/10/sys/cam/cam_xpt.c (revision 292348) @@ -1,5322 +1,5329 @@ /*- * Implementation of the Common Access Method Transport (XPT) layer. * * Copyright (c) 1997, 1998, 1999 Justin T. Gibbs. * Copyright (c) 1997, 1998, 1999 Kenneth D. Merry. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions, and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. 
* * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* geometry translation */ #include /* for xpt_print below */ #include "opt_cam.h" /* * This is the maximum number of high powered commands (e.g. start unit) * that can be outstanding at a particular time. */ #ifndef CAM_MAX_HIGHPOWER #define CAM_MAX_HIGHPOWER 4 #endif /* Datastructures internal to the xpt layer */ MALLOC_DEFINE(M_CAMXPT, "CAM XPT", "CAM XPT buffers"); MALLOC_DEFINE(M_CAMDEV, "CAM DEV", "CAM devices"); MALLOC_DEFINE(M_CAMCCB, "CAM CCB", "CAM CCBs"); MALLOC_DEFINE(M_CAMPATH, "CAM path", "CAM paths"); /* Object for defering XPT actions to a taskqueue */ struct xpt_task { struct task task; void *data1; uintptr_t data2; }; struct xpt_softc { uint32_t xpt_generation; /* number of high powered commands that can go through right now */ struct mtx xpt_highpower_lock; STAILQ_HEAD(highpowerlist, cam_ed) highpowerq; int num_highpower; /* queue for handling async rescan requests. 
*/ TAILQ_HEAD(, ccb_hdr) ccb_scanq; int buses_to_config; int buses_config_done; /* Registered busses */ TAILQ_HEAD(,cam_eb) xpt_busses; u_int bus_generation; struct intr_config_hook *xpt_config_hook; int boot_delay; struct callout boot_callout; struct mtx xpt_topo_lock; struct mtx xpt_lock; struct taskqueue *xpt_taskq; }; typedef enum { DM_RET_COPY = 0x01, DM_RET_FLAG_MASK = 0x0f, DM_RET_NONE = 0x00, DM_RET_STOP = 0x10, DM_RET_DESCEND = 0x20, DM_RET_ERROR = 0x30, DM_RET_ACTION_MASK = 0xf0 } dev_match_ret; typedef enum { XPT_DEPTH_BUS, XPT_DEPTH_TARGET, XPT_DEPTH_DEVICE, XPT_DEPTH_PERIPH } xpt_traverse_depth; struct xpt_traverse_config { xpt_traverse_depth depth; void *tr_func; void *tr_arg; }; typedef int xpt_busfunc_t (struct cam_eb *bus, void *arg); typedef int xpt_targetfunc_t (struct cam_et *target, void *arg); typedef int xpt_devicefunc_t (struct cam_ed *device, void *arg); typedef int xpt_periphfunc_t (struct cam_periph *periph, void *arg); typedef int xpt_pdrvfunc_t (struct periph_driver **pdrv, void *arg); /* Transport layer configuration information */ static struct xpt_softc xsoftc; MTX_SYSINIT(xpt_topo_init, &xsoftc.xpt_topo_lock, "XPT topology lock", MTX_DEF); TUNABLE_INT("kern.cam.boot_delay", &xsoftc.boot_delay); SYSCTL_INT(_kern_cam, OID_AUTO, boot_delay, CTLFLAG_RDTUN, &xsoftc.boot_delay, 0, "Bus registration wait time"); SYSCTL_UINT(_kern_cam, OID_AUTO, xpt_generation, CTLFLAG_RD, &xsoftc.xpt_generation, 0, "CAM peripheral generation count"); struct cam_doneq { struct mtx_padalign cam_doneq_mtx; STAILQ_HEAD(, ccb_hdr) cam_doneq; int cam_doneq_sleep; }; static struct cam_doneq cam_doneqs[MAXCPU]; static int cam_num_doneqs; static struct proc *cam_proc; TUNABLE_INT("kern.cam.num_doneqs", &cam_num_doneqs); SYSCTL_INT(_kern_cam, OID_AUTO, num_doneqs, CTLFLAG_RDTUN, &cam_num_doneqs, 0, "Number of completion queues/threads"); struct cam_periph *xpt_periph; static periph_init_t xpt_periph_init; static struct periph_driver xpt_driver = { xpt_periph_init, "xpt", TAILQ_HEAD_INITIALIZER(xpt_driver.units), /* generation */ 0, CAM_PERIPH_DRV_EARLY }; PERIPHDRIVER_DECLARE(xpt, xpt_driver); static d_open_t xptopen; static d_close_t xptclose; static d_ioctl_t xptioctl; static d_ioctl_t xptdoioctl; static struct cdevsw xpt_cdevsw = { .d_version = D_VERSION, .d_flags = 0, .d_open = xptopen, .d_close = xptclose, .d_ioctl = xptioctl, .d_name = "xpt", }; /* Storage for debugging datastructures */ struct cam_path *cam_dpath; u_int32_t cam_dflags = CAM_DEBUG_FLAGS; TUNABLE_INT("kern.cam.dflags", &cam_dflags); SYSCTL_UINT(_kern_cam, OID_AUTO, dflags, CTLFLAG_RW, &cam_dflags, 0, "Enabled debug flags"); u_int32_t cam_debug_delay = CAM_DEBUG_DELAY; TUNABLE_INT("kern.cam.debug_delay", &cam_debug_delay); SYSCTL_UINT(_kern_cam, OID_AUTO, debug_delay, CTLFLAG_RW, &cam_debug_delay, 0, "Delay in us after each debug message"); /* Our boot-time initialization hook */ static int cam_module_event_handler(module_t, int /*modeventtype_t*/, void *); static moduledata_t cam_moduledata = { "cam", cam_module_event_handler, NULL }; static int xpt_init(void *); DECLARE_MODULE(cam, cam_moduledata, SI_SUB_CONFIGURE, SI_ORDER_SECOND); MODULE_VERSION(cam, 1); static void xpt_async_bcast(struct async_list *async_head, u_int32_t async_code, struct cam_path *path, void *async_arg); static path_id_t xptnextfreepathid(void); static path_id_t xptpathid(const char *sim_name, int sim_unit, int sim_bus); static union ccb *xpt_get_ccb(struct cam_periph *periph); static union ccb *xpt_get_ccb_nowait(struct cam_periph *periph); 
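/*
 * A note on the completion machinery declared above: CCBs finished by
 * SIMs are placed on one of the cam_doneqs and completed from the
 * xpt_done_td() kernel threads (or drained synchronously via
 * camisr_runqueue()).  The number of completion queues/threads can be
 * capped with the kern.cam.num_doneqs tunable registered above.
 */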
static void xpt_run_allocq(struct cam_periph *periph, int sleep); static void xpt_run_allocq_task(void *context, int pending); static void xpt_run_devq(struct cam_devq *devq); static timeout_t xpt_release_devq_timeout; static void xpt_release_simq_timeout(void *arg) __unused; static void xpt_acquire_bus(struct cam_eb *bus); static void xpt_release_bus(struct cam_eb *bus); static uint32_t xpt_freeze_devq_device(struct cam_ed *dev, u_int count); static int xpt_release_devq_device(struct cam_ed *dev, u_int count, int run_queue); static struct cam_et* xpt_alloc_target(struct cam_eb *bus, target_id_t target_id); static void xpt_acquire_target(struct cam_et *target); static void xpt_release_target(struct cam_et *target); static struct cam_eb* xpt_find_bus(path_id_t path_id); static struct cam_et* xpt_find_target(struct cam_eb *bus, target_id_t target_id); static struct cam_ed* xpt_find_device(struct cam_et *target, lun_id_t lun_id); static void xpt_config(void *arg); static int xpt_schedule_dev(struct camq *queue, cam_pinfo *dev_pinfo, u_int32_t new_priority); static xpt_devicefunc_t xptpassannouncefunc; static void xptaction(struct cam_sim *sim, union ccb *work_ccb); static void xptpoll(struct cam_sim *sim); static void camisr_runqueue(void); static void xpt_done_process(struct ccb_hdr *ccb_h); static void xpt_done_td(void *); static dev_match_ret xptbusmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_eb *bus); static dev_match_ret xptdevicematch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_ed *device); static dev_match_ret xptperiphmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_periph *periph); static xpt_busfunc_t xptedtbusfunc; static xpt_targetfunc_t xptedttargetfunc; static xpt_devicefunc_t xptedtdevicefunc; static xpt_periphfunc_t xptedtperiphfunc; static xpt_pdrvfunc_t xptplistpdrvfunc; static xpt_periphfunc_t xptplistperiphfunc; static int xptedtmatch(struct ccb_dev_match *cdm); static int xptperiphlistmatch(struct ccb_dev_match *cdm); static int xptbustraverse(struct cam_eb *start_bus, xpt_busfunc_t *tr_func, void *arg); static int xpttargettraverse(struct cam_eb *bus, struct cam_et *start_target, xpt_targetfunc_t *tr_func, void *arg); static int xptdevicetraverse(struct cam_et *target, struct cam_ed *start_device, xpt_devicefunc_t *tr_func, void *arg); static int xptperiphtraverse(struct cam_ed *device, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg); static int xptpdrvtraverse(struct periph_driver **start_pdrv, xpt_pdrvfunc_t *tr_func, void *arg); static int xptpdperiphtraverse(struct periph_driver **pdrv, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg); static xpt_busfunc_t xptdefbusfunc; static xpt_targetfunc_t xptdeftargetfunc; static xpt_devicefunc_t xptdefdevicefunc; static xpt_periphfunc_t xptdefperiphfunc; static void xpt_finishconfig_task(void *context, int pending); static void xpt_dev_async_default(u_int32_t async_code, struct cam_eb *bus, struct cam_et *target, struct cam_ed *device, void *async_arg); static struct cam_ed * xpt_alloc_device_default(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id); static xpt_devicefunc_t xptsetasyncfunc; static xpt_busfunc_t xptsetasyncbusfunc; static cam_status xptregister(struct cam_periph *periph, void *arg); static __inline int device_is_queued(struct cam_ed *device); static __inline int xpt_schedule_devq(struct cam_devq *devq, struct cam_ed *dev) { int retval; mtx_assert(&devq->send_mtx, MA_OWNED); if 
((dev->ccbq.queue.entries > 0) && (dev->ccbq.dev_openings > 0) && (dev->ccbq.queue.qfrozen_cnt == 0)) { /* * The priority of a device waiting for controller * resources is that of the highest priority CCB * enqueued. */ retval = xpt_schedule_dev(&devq->send_queue, &dev->devq_entry, CAMQ_GET_PRIO(&dev->ccbq.queue)); } else { retval = 0; } return (retval); } static __inline int device_is_queued(struct cam_ed *device) { return (device->devq_entry.index != CAM_UNQUEUED_INDEX); } static void xpt_periph_init() { make_dev(&xpt_cdevsw, 0, UID_ROOT, GID_OPERATOR, 0600, "xpt0"); } static int xptopen(struct cdev *dev, int flags, int fmt, struct thread *td) { /* * Only allow read-write access. */ if (((flags & FWRITE) == 0) || ((flags & FREAD) == 0)) return(EPERM); /* * We don't allow nonblocking access. */ if ((flags & O_NONBLOCK) != 0) { printf("%s: can't do nonblocking access\n", devtoname(dev)); return(ENODEV); } return(0); } static int xptclose(struct cdev *dev, int flag, int fmt, struct thread *td) { return(0); } /* * Don't automatically grab the xpt softc lock here even though this is going * through the xpt device. The xpt device is really just a back door for * accessing other devices and SIMs, so the right thing to do is to grab * the appropriate SIM lock once the bus/SIM is located. */ static int xptioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td) { int error; if ((error = xptdoioctl(dev, cmd, addr, flag, td)) == ENOTTY) { error = cam_compat_ioctl(dev, cmd, addr, flag, td, xptdoioctl); } return (error); } static int xptdoioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td) { int error; error = 0; switch(cmd) { /* * For the transport layer CAMIOCOMMAND ioctl, we really only want * to accept CCB types that don't quite make sense to send through a * passthrough driver. XPT_PATH_INQ is an exception to this, as stated * in the CAM spec. */ case CAMIOCOMMAND: { union ccb *ccb; union ccb *inccb; struct cam_eb *bus; inccb = (union ccb *)addr; bus = xpt_find_bus(inccb->ccb_h.path_id); if (bus == NULL) return (EINVAL); switch (inccb->ccb_h.func_code) { case XPT_SCAN_BUS: case XPT_RESET_BUS: if (inccb->ccb_h.target_id != CAM_TARGET_WILDCARD || inccb->ccb_h.target_lun != CAM_LUN_WILDCARD) { xpt_release_bus(bus); return (EINVAL); } break; case XPT_SCAN_TGT: if (inccb->ccb_h.target_id == CAM_TARGET_WILDCARD || inccb->ccb_h.target_lun != CAM_LUN_WILDCARD) { xpt_release_bus(bus); return (EINVAL); } break; default: break; } switch(inccb->ccb_h.func_code) { case XPT_SCAN_BUS: case XPT_RESET_BUS: case XPT_PATH_INQ: case XPT_ENG_INQ: case XPT_SCAN_LUN: case XPT_SCAN_TGT: ccb = xpt_alloc_ccb(); /* * Create a path using the bus, target, and lun the * user passed in. */ if (xpt_create_path(&ccb->ccb_h.path, NULL, inccb->ccb_h.path_id, inccb->ccb_h.target_id, inccb->ccb_h.target_lun) != CAM_REQ_CMP){ error = EINVAL; xpt_free_ccb(ccb); break; } /* Ensure all of our fields are correct */ xpt_setup_ccb(&ccb->ccb_h, ccb->ccb_h.path, inccb->ccb_h.pinfo.priority); xpt_merge_ccb(ccb, inccb); xpt_path_lock(ccb->ccb_h.path); cam_periph_runccb(ccb, NULL, 0, 0, NULL); xpt_path_unlock(ccb->ccb_h.path); bcopy(ccb, inccb, sizeof(union ccb)); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); break; case XPT_DEBUG: { union ccb ccb; /* * This is an immediate CCB, so it's okay to * allocate it on the stack. */ /* * Create a path using the bus, target, and lun the * user passed in. 
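			 *
			 * xpt_create_path() allocates a path that this
			 * caller then owns, so on success it must be
			 * released with xpt_free_path() once the CCB has
			 * been acted on, as is done at the end of this case.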
*/ if (xpt_create_path(&ccb.ccb_h.path, NULL, inccb->ccb_h.path_id, inccb->ccb_h.target_id, inccb->ccb_h.target_lun) != CAM_REQ_CMP){ error = EINVAL; break; } /* Ensure all of our fields are correct */ xpt_setup_ccb(&ccb.ccb_h, ccb.ccb_h.path, inccb->ccb_h.pinfo.priority); xpt_merge_ccb(&ccb, inccb); xpt_action(&ccb); bcopy(&ccb, inccb, sizeof(union ccb)); xpt_free_path(ccb.ccb_h.path); break; } case XPT_DEV_MATCH: { struct cam_periph_map_info mapinfo; struct cam_path *old_path; /* * We can't deal with physical addresses for this * type of transaction. */ if ((inccb->ccb_h.flags & CAM_DATA_MASK) != CAM_DATA_VADDR) { error = EINVAL; break; } /* * Save this in case the caller had it set to * something in particular. */ old_path = inccb->ccb_h.path; /* * We really don't need a path for the matching * code. The path is needed because of the * debugging statements in xpt_action(). They * assume that the CCB has a valid path. */ inccb->ccb_h.path = xpt_periph->path; bzero(&mapinfo, sizeof(mapinfo)); /* * Map the pattern and match buffers into kernel * virtual address space. */ error = cam_periph_mapmem(inccb, &mapinfo, MAXPHYS); if (error) { inccb->ccb_h.path = old_path; break; } /* * This is an immediate CCB, we can send it on directly. */ xpt_action(inccb); /* * Map the buffers back into user space. */ cam_periph_unmapmem(inccb, &mapinfo); inccb->ccb_h.path = old_path; error = 0; break; } default: error = ENOTSUP; break; } xpt_release_bus(bus); break; } /* * This is the getpassthru ioctl. It takes an XPT_GDEVLIST ccb as input, * with the peripheral driver name and unit number filled in. The other * fields don't really matter as input. The passthrough driver name * ("pass") and unit number are passed back in the ccb. The current * device generation number, the index into the device peripheral * driver list, and the status are also passed back. Note that * since we do everything in one pass, unlike the XPT_GDEVLIST ccb, * we never return a status of CAM_GDEVLIST_LIST_CHANGED. It is * (or rather should be) impossible for the device peripheral driver * list to change since we look at the whole thing in one pass, and * we do it with lock protection. * */ case CAMGETPASSTHRU: { union ccb *ccb; struct cam_periph *periph; struct periph_driver **p_drv; char *name; u_int unit; int base_periph_found; ccb = (union ccb *)addr; unit = ccb->cgdl.unit_number; name = ccb->cgdl.periph_name; base_periph_found = 0; /* * Sanity check -- make sure we don't get a null peripheral * driver name. */ if (*ccb->cgdl.periph_name == '\0') { error = EINVAL; break; } /* Keep the list from changing while we traverse it */ xpt_lock_buses(); /* first find our driver in the list of drivers */ for (p_drv = periph_drivers; *p_drv != NULL; p_drv++) if (strcmp((*p_drv)->driver_name, name) == 0) break; if (*p_drv == NULL) { xpt_unlock_buses(); ccb->ccb_h.status = CAM_REQ_CMP_ERR; ccb->cgdl.status = CAM_GDEVLIST_ERROR; *ccb->cgdl.periph_name = '\0'; ccb->cgdl.unit_number = 0; error = ENOENT; break; } /* * Run through every peripheral instance of this driver * and check to see whether it matches the unit passed * in by the user. If it does, get out of the loops and * find the passthrough driver associated with that * peripheral driver.
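*
* For reference, the caller's side of this ioctl is small.  A
* hypothetical, minimal userland sketch (no error handling) that asks
* for the pass device backing "da0" looks like:
*
*	union ccb ccb;
*	int fd = open("/dev/xpt0", O_RDWR);
*
*	bzero(&ccb, sizeof(ccb));
*	strlcpy(ccb.cgdl.periph_name, "da",
*	    sizeof(ccb.cgdl.periph_name));
*	ccb.cgdl.unit_number = 0;
*	ioctl(fd, CAMGETPASSTHRU, &ccb);
*
* On success the code below rewrites cgdl.periph_name to "pass" and
* cgdl.unit_number to the matching pass unit.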
*/ for (periph = TAILQ_FIRST(&(*p_drv)->units); periph != NULL; periph = TAILQ_NEXT(periph, unit_links)) { if (periph->unit_number == unit) break; } /* * If we found the peripheral driver that the user passed * in, go through all of the peripheral drivers for that * particular device and look for a passthrough driver. */ if (periph != NULL) { struct cam_ed *device; int i; base_periph_found = 1; device = periph->path->device; for (i = 0, periph = SLIST_FIRST(&device->periphs); periph != NULL; periph = SLIST_NEXT(periph, periph_links), i++) { /* * Check to see whether we have a * passthrough device or not. */ if (strcmp(periph->periph_name, "pass") == 0) { /* * Fill in the getdevlist fields. */ strcpy(ccb->cgdl.periph_name, periph->periph_name); ccb->cgdl.unit_number = periph->unit_number; if (SLIST_NEXT(periph, periph_links)) ccb->cgdl.status = CAM_GDEVLIST_MORE_DEVS; else ccb->cgdl.status = CAM_GDEVLIST_LAST_DEVICE; ccb->cgdl.generation = device->generation; ccb->cgdl.index = i; /* * Fill in some CCB header fields * that the user may want. */ ccb->ccb_h.path_id = periph->path->bus->path_id; ccb->ccb_h.target_id = periph->path->target->target_id; ccb->ccb_h.target_lun = periph->path->device->lun_id; ccb->ccb_h.status = CAM_REQ_CMP; break; } } } /* * If the periph is null here, one of two things has * happened. The first possibility is that we couldn't * find the unit number of the particular peripheral driver * that the user is asking about. e.g. the user asks for * the passthrough driver for "da11". We find the list of * "da" peripherals all right, but there is no unit 11. * The other possibility is that we went through the list * of peripheral drivers attached to the device structure, * but didn't find one with the name "pass". Either way, * we return ENOENT, since we couldn't find something. */ if (periph == NULL) { ccb->ccb_h.status = CAM_REQ_CMP_ERR; ccb->cgdl.status = CAM_GDEVLIST_ERROR; *ccb->cgdl.periph_name = '\0'; ccb->cgdl.unit_number = 0; error = ENOENT; /* * It is unfortunate that this is even necessary, * but there are many, many clueless users out there. * If this is true, the user is looking for the * passthrough driver, but doesn't have one in his * kernel. 
*/ if (base_periph_found == 1) { printf("xptioctl: pass driver is not in the " "kernel\n"); printf("xptioctl: put \"device pass\" in " "your kernel config file\n"); } } xpt_unlock_buses(); break; } default: error = ENOTTY; break; } return(error); } static int cam_module_event_handler(module_t mod, int what, void *arg) { int error; switch (what) { case MOD_LOAD: if ((error = xpt_init(NULL)) != 0) return (error); break; case MOD_UNLOAD: return EBUSY; default: return EOPNOTSUPP; } return 0; } static void xpt_rescan_done(struct cam_periph *periph, union ccb *done_ccb) { if (done_ccb->ccb_h.ppriv_ptr1 == NULL) { xpt_free_path(done_ccb->ccb_h.path); xpt_free_ccb(done_ccb); } else { done_ccb->ccb_h.cbfcnp = done_ccb->ccb_h.ppriv_ptr1; (*done_ccb->ccb_h.cbfcnp)(periph, done_ccb); } xpt_release_boot(); } /* thread to handle bus rescans */ static void xpt_scanner_thread(void *dummy) { union ccb *ccb; struct cam_path path; xpt_lock_buses(); for (;;) { if (TAILQ_EMPTY(&xsoftc.ccb_scanq)) msleep(&xsoftc.ccb_scanq, &xsoftc.xpt_topo_lock, PRIBIO, "-", 0); if ((ccb = (union ccb *)TAILQ_FIRST(&xsoftc.ccb_scanq)) != NULL) { TAILQ_REMOVE(&xsoftc.ccb_scanq, &ccb->ccb_h, sim_links.tqe); xpt_unlock_buses(); /* * Since lock can be dropped inside and path freed * by completion callback even before return here, * take our own path copy for reference. */ xpt_copy_path(&path, ccb->ccb_h.path); xpt_path_lock(&path); xpt_action(ccb); xpt_path_unlock(&path); xpt_release_path(&path); xpt_lock_buses(); } } } void xpt_rescan(union ccb *ccb) { struct ccb_hdr *hdr; /* Prepare request */ if (ccb->ccb_h.path->target->target_id == CAM_TARGET_WILDCARD && ccb->ccb_h.path->device->lun_id == CAM_LUN_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_BUS; else if (ccb->ccb_h.path->target->target_id != CAM_TARGET_WILDCARD && ccb->ccb_h.path->device->lun_id == CAM_LUN_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_TGT; else if (ccb->ccb_h.path->target->target_id != CAM_TARGET_WILDCARD && ccb->ccb_h.path->device->lun_id != CAM_LUN_WILDCARD) ccb->ccb_h.func_code = XPT_SCAN_LUN; else { xpt_print(ccb->ccb_h.path, "illegal scan path\n"); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); return; } ccb->ccb_h.ppriv_ptr1 = ccb->ccb_h.cbfcnp; ccb->ccb_h.cbfcnp = xpt_rescan_done; xpt_setup_ccb(&ccb->ccb_h, ccb->ccb_h.path, CAM_PRIORITY_XPT); /* Don't make duplicate entries for the same paths. */ xpt_lock_buses(); if (ccb->ccb_h.ppriv_ptr1 == NULL) { TAILQ_FOREACH(hdr, &xsoftc.ccb_scanq, sim_links.tqe) { if (xpt_path_comp(hdr->path, ccb->ccb_h.path) == 0) { wakeup(&xsoftc.ccb_scanq); xpt_unlock_buses(); xpt_print(ccb->ccb_h.path, "rescan already queued\n"); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); return; } } } TAILQ_INSERT_TAIL(&xsoftc.ccb_scanq, &ccb->ccb_h, sim_links.tqe); xsoftc.buses_to_config++; wakeup(&xsoftc.ccb_scanq); xpt_unlock_buses(); } /* Functions accessed by the peripheral drivers */ static int xpt_init(void *dummy) { struct cam_sim *xpt_sim; struct cam_path *path; struct cam_devq *devq; cam_status status; int error, i; TAILQ_INIT(&xsoftc.xpt_busses); TAILQ_INIT(&xsoftc.ccb_scanq); STAILQ_INIT(&xsoftc.highpowerq); xsoftc.num_highpower = CAM_MAX_HIGHPOWER; mtx_init(&xsoftc.xpt_lock, "XPT lock", NULL, MTX_DEF); mtx_init(&xsoftc.xpt_highpower_lock, "XPT highpower lock", NULL, MTX_DEF); xsoftc.xpt_taskq = taskqueue_create("CAM XPT task", M_WAITOK, taskqueue_thread_enqueue, /*context*/&xsoftc.xpt_taskq); #ifdef CAM_BOOT_DELAY /* * Override this value at compile time to assist our users * who don't use loader to boot a kernel. 
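*
* A kernel config line such as
*
*	options CAM_BOOT_DELAY=10000
*
* is the form that override takes; the value lands in
* xsoftc.boot_delay just below, the same field the
* kern.cam.boot_delay loader tunable sets.  The field is expressed
* in milliseconds, so 10000 asks for a ten second settling delay;
* the number here is only an example.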
xsoftc.boot_delay = CAM_BOOT_DELAY; #endif /* * The xpt layer is, itself, the equivalent of a SIM. * Allow 16 ccbs in the ccb pool for it. This should * give decent parallelism when we probe busses and * perform other XPT functions. */ devq = cam_simq_alloc(16); xpt_sim = cam_sim_alloc(xptaction, xptpoll, "xpt", /*softc*/NULL, /*unit*/0, /*mtx*/&xsoftc.xpt_lock, /*max_dev_transactions*/0, /*max_tagged_dev_transactions*/0, devq); if (xpt_sim == NULL) return (ENOMEM); mtx_lock(&xsoftc.xpt_lock); if ((status = xpt_bus_register(xpt_sim, NULL, 0)) != CAM_SUCCESS) { mtx_unlock(&xsoftc.xpt_lock); printf("xpt_init: xpt_bus_register failed with status %#x," " failing attach\n", status); return (EINVAL); } mtx_unlock(&xsoftc.xpt_lock); /* * Looking at the XPT from the SIM layer, the XPT is * the equivalent of a peripheral driver. Allocate * a peripheral driver entry for us. */ if ((status = xpt_create_path(&path, NULL, CAM_XPT_PATH_ID, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD)) != CAM_REQ_CMP) { printf("xpt_init: xpt_create_path failed with status %#x," " failing attach\n", status); return (EINVAL); } xpt_path_lock(path); cam_periph_alloc(xptregister, NULL, NULL, NULL, "xpt", CAM_PERIPH_BIO, path, NULL, 0, xpt_sim); xpt_path_unlock(path); xpt_free_path(path); if (cam_num_doneqs < 1) cam_num_doneqs = 1 + mp_ncpus / 6; else if (cam_num_doneqs > MAXCPU) cam_num_doneqs = MAXCPU; for (i = 0; i < cam_num_doneqs; i++) { mtx_init(&cam_doneqs[i].cam_doneq_mtx, "CAM doneq", NULL, MTX_DEF); STAILQ_INIT(&cam_doneqs[i].cam_doneq); error = kproc_kthread_add(xpt_done_td, &cam_doneqs[i], &cam_proc, NULL, 0, 0, "cam", "doneq%d", i); if (error != 0) { cam_num_doneqs = i; break; } } if (cam_num_doneqs < 1) { printf("xpt_init: Cannot init completion queues " "- failing attach\n"); return (ENOMEM); } /* * Register a callback for when interrupts are enabled.
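*
* The intr_config_hook contract, summarized: ich_func is invoked once
* interrupts are working, boot-time probing may proceed from it, and
* the hook must eventually be removed with
* config_intrhook_disestablish() or the boot will wait on it
* indefinitely.  Here xpt_config is that function, and the hook is
* torn down elsewhere in this file once the initial bus scans settle.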
*/ xsoftc.xpt_config_hook = (struct intr_config_hook *)malloc(sizeof(struct intr_config_hook), M_CAMXPT, M_NOWAIT | M_ZERO); if (xsoftc.xpt_config_hook == NULL) { printf("xpt_init: Cannot malloc config hook " "- failing attach\n"); return (ENOMEM); } xsoftc.xpt_config_hook->ich_func = xpt_config; if (config_intrhook_establish(xsoftc.xpt_config_hook) != 0) { free (xsoftc.xpt_config_hook, M_CAMXPT); printf("xpt_init: config_intrhook_establish failed " "- failing attach\n"); } return (0); } static cam_status xptregister(struct cam_periph *periph, void *arg) { struct cam_sim *xpt_sim; if (periph == NULL) { printf("xptregister: periph was NULL!!\n"); return(CAM_REQ_CMP_ERR); } xpt_sim = (struct cam_sim *)arg; xpt_sim->softc = periph; xpt_periph = periph; periph->softc = NULL; return(CAM_REQ_CMP); } int32_t xpt_add_periph(struct cam_periph *periph) { struct cam_ed *device; int32_t status; TASK_INIT(&periph->periph_run_task, 0, xpt_run_allocq_task, periph); device = periph->path->device; status = CAM_REQ_CMP; if (device != NULL) { mtx_lock(&device->target->bus->eb_mtx); device->generation++; SLIST_INSERT_HEAD(&device->periphs, periph, periph_links); mtx_unlock(&device->target->bus->eb_mtx); atomic_add_32(&xsoftc.xpt_generation, 1); } return (status); } void xpt_remove_periph(struct cam_periph *periph) { struct cam_ed *device; device = periph->path->device; if (device != NULL) { mtx_lock(&device->target->bus->eb_mtx); device->generation++; SLIST_REMOVE(&device->periphs, periph, cam_periph, periph_links); mtx_unlock(&device->target->bus->eb_mtx); atomic_add_32(&xsoftc.xpt_generation, 1); } } void xpt_announce_periph(struct cam_periph *periph, char *announce_string) { struct cam_path *path = periph->path; cam_periph_assert(periph, MA_OWNED); periph->flags |= CAM_PERIPH_ANNOUNCED; printf("%s%d at %s%d bus %d scbus%d target %d lun %jx\n", periph->periph_name, periph->unit_number, path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id, path->bus->path_id, path->target->target_id, (uintmax_t)path->device->lun_id); printf("%s%d: ", periph->periph_name, periph->unit_number); if (path->device->protocol == PROTO_SCSI) scsi_print_inquiry(&path->device->inq_data); else if (path->device->protocol == PROTO_ATA || path->device->protocol == PROTO_SATAPM) ata_print_ident(&path->device->ident_data); else if (path->device->protocol == PROTO_SEMB) semb_print_ident( (struct sep_identify_data *)&path->device->ident_data); else printf("Unknown protocol device\n"); if (path->device->serial_num_len > 0) { /* Don't wrap the screen - print only the first 60 chars */ printf("%s%d: Serial Number %.60s\n", periph->periph_name, periph->unit_number, path->device->serial_num); } /* Announce transport details. */ (*(path->bus->xport->announce))(periph); /* Announce command queueing. */ if (path->device->inq_flags & SID_CmdQue || path->device->flags & CAM_DEV_TAG_AFTER_COUNT) { printf("%s%d: Command Queueing enabled\n", periph->periph_name, periph->unit_number); } /* Announce caller's details if any were passed in.
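*
* Taken together, the printfs above produce the familiar boot lines;
* an illustrative (not captured) example for a SCSI disk:
*
*	da0 at mpt0 bus 0 scbus0 target 1 lun 0
*	da0: <VENDOR PRODUCT 0001> Fixed Direct Access SCSI-5 device
*	da0: Serial Number ABC123
*
* with the transport and command queueing lines following, and the
* caller's announce_string, when present, printed below.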
*/ if (announce_string != NULL) printf("%s%d: %s\n", periph->periph_name, periph->unit_number, announce_string); } void xpt_announce_quirks(struct cam_periph *periph, int quirks, char *bit_string) { if (quirks != 0) { printf("%s%d: quirks=0x%b\n", periph->periph_name, periph->unit_number, quirks, bit_string); } } void xpt_denounce_periph(struct cam_periph *periph) { struct cam_path *path = periph->path; cam_periph_assert(periph, MA_OWNED); printf("%s%d at %s%d bus %d scbus%d target %d lun %jx\n", periph->periph_name, periph->unit_number, path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id, path->bus->path_id, path->target->target_id, (uintmax_t)path->device->lun_id); printf("%s%d: ", periph->periph_name, periph->unit_number); if (path->device->protocol == PROTO_SCSI) scsi_print_inquiry_short(&path->device->inq_data); else if (path->device->protocol == PROTO_ATA || path->device->protocol == PROTO_SATAPM) ata_print_ident_short(&path->device->ident_data); else if (path->device->protocol == PROTO_SEMB) semb_print_ident_short( (struct sep_identify_data *)&path->device->ident_data); else printf("Unknown protocol device"); if (path->device->serial_num_len > 0) printf(" s/n %.60s", path->device->serial_num); printf(" detached\n"); } int xpt_getattr(char *buf, size_t len, const char *attr, struct cam_path *path) { int ret = -1, l; struct ccb_dev_advinfo cdai; struct scsi_vpd_id_descriptor *idd; xpt_path_assert(path, MA_OWNED); memset(&cdai, 0, sizeof(cdai)); xpt_setup_ccb(&cdai.ccb_h, path, CAM_PRIORITY_NORMAL); cdai.ccb_h.func_code = XPT_DEV_ADVINFO; cdai.bufsiz = len; if (!strcmp(attr, "GEOM::ident")) cdai.buftype = CDAI_TYPE_SERIAL_NUM; else if (!strcmp(attr, "GEOM::physpath")) cdai.buftype = CDAI_TYPE_PHYS_PATH; else if (strcmp(attr, "GEOM::lunid") == 0 || strcmp(attr, "GEOM::lunname") == 0) { cdai.buftype = CDAI_TYPE_SCSI_DEVID; cdai.bufsiz = CAM_SCSI_DEVID_MAXLEN; } else goto out; cdai.buf = malloc(cdai.bufsiz, M_CAMXPT, M_NOWAIT|M_ZERO); if (cdai.buf == NULL) { ret = ENOMEM; goto out; } xpt_action((union ccb *)&cdai); /* can only be synchronous */ if ((cdai.ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(cdai.ccb_h.path, 0, 0, 0, FALSE); if (cdai.provsiz == 0) goto out; if (cdai.buftype == CDAI_TYPE_SCSI_DEVID) { if (strcmp(attr, "GEOM::lunid") == 0) { idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_naa); if (idd == NULL) idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_eui64); } else idd = NULL; if (idd == NULL) idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_t10); if (idd == NULL) idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf, cdai.provsiz, scsi_devid_is_lun_name); if (idd == NULL) goto out; ret = 0; if ((idd->proto_codeset & SVPD_ID_CODESET_MASK) == SVPD_ID_CODESET_ASCII) { if (idd->length < len) { for (l = 0; l < idd->length; l++) buf[l] = idd->identifier[l] ? 
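/* nonzero bytes are kept; embedded NULs become spaces */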
idd->identifier[l] : ' '; buf[l] = 0; } else ret = EFAULT; } else if ((idd->proto_codeset & SVPD_ID_CODESET_MASK) == SVPD_ID_CODESET_UTF8) { l = strnlen(idd->identifier, idd->length); if (l < len) { bcopy(idd->identifier, buf, l); buf[l] = 0; } else ret = EFAULT; } else { if (idd->length * 2 < len) { for (l = 0; l < idd->length; l++) sprintf(buf + l * 2, "%02x", idd->identifier[l]); } else ret = EFAULT; } } else { ret = 0; if (strlcpy(buf, cdai.buf, len) >= len) ret = EFAULT; } out: if (cdai.buf != NULL) free(cdai.buf, M_CAMXPT); return ret; } static dev_match_ret xptbusmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_eb *bus) { dev_match_ret retval; int i; retval = DM_RET_NONE; /* * If we aren't given something to match against, that's an error. */ if (bus == NULL) return(DM_RET_ERROR); /* * If there are no match entries, then this bus matches no * matter what. */ if ((patterns == NULL) || (num_patterns == 0)) return(DM_RET_DESCEND | DM_RET_COPY); for (i = 0; i < num_patterns; i++) { struct bus_match_pattern *cur_pattern; /* * If the pattern in question isn't for a bus node, we * aren't interested. However, we do indicate to the * calling routine that we should continue descending the * tree, since the user wants to match against lower-level * EDT elements. */ if (patterns[i].type != DEV_MATCH_BUS) { if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE) retval |= DM_RET_DESCEND; continue; } cur_pattern = &patterns[i].pattern.bus_pattern; /* * If they want to match any bus node, we give them any * device node. */ if (cur_pattern->flags == BUS_MATCH_ANY) { /* set the copy flag */ retval |= DM_RET_COPY; /* * If we've already decided on an action, go ahead * and return. */ if ((retval & DM_RET_ACTION_MASK) != DM_RET_NONE) return(retval); } /* * Not sure why someone would do this... */ if (cur_pattern->flags == BUS_MATCH_NONE) continue; if (((cur_pattern->flags & BUS_MATCH_PATH) != 0) && (cur_pattern->path_id != bus->path_id)) continue; if (((cur_pattern->flags & BUS_MATCH_BUS_ID) != 0) && (cur_pattern->bus_id != bus->sim->bus_id)) continue; if (((cur_pattern->flags & BUS_MATCH_UNIT) != 0) && (cur_pattern->unit_number != bus->sim->unit_number)) continue; if (((cur_pattern->flags & BUS_MATCH_NAME) != 0) && (strncmp(cur_pattern->dev_name, bus->sim->sim_name, DEV_IDLEN) != 0)) continue; /* * If we get to this point, the user definitely wants * information on this bus. So tell the caller to copy the * data out. */ retval |= DM_RET_COPY; /* * If the return action has been set to descend, then we * know that we've already seen a non-bus matching * expression, therefore we need to further descend the tree. * This won't change by continuing around the loop, so we * go ahead and return. If we haven't seen a non-bus * matching expression, we keep going around the loop until * we exhaust the matching expressions. We'll set the stop * flag once we fall out of the loop. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_DESCEND) return(retval); } /* * If the return action hasn't been set to descend yet, that means * we haven't seen anything other than bus matching patterns. So * tell the caller to stop descending the tree -- the user doesn't * want to match against lower level tree elements. 
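*
* To make the input concrete: a hypothetical XPT_DEV_MATCH caller
* that only wants bus 0 of an "ahc" SIM would hand this routine a
* pattern built roughly as
*
*	struct dev_match_pattern p;
*
*	p.type = DEV_MATCH_BUS;
*	p.pattern.bus_pattern.flags = BUS_MATCH_NAME | BUS_MATCH_PATH;
*	p.pattern.bus_pattern.path_id = 0;
*	strlcpy(p.pattern.bus_pattern.dev_name, "ahc",
*	    sizeof(p.pattern.bus_pattern.dev_name));
*
* which the loop above answers with DM_RET_COPY and, with no deeper
* patterns present, DM_RET_STOP from the code below.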
*/ if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE) retval |= DM_RET_STOP; return(retval); } static dev_match_ret xptdevicematch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_ed *device) { dev_match_ret retval; int i; retval = DM_RET_NONE; /* * If we aren't given something to match against, that's an error. */ if (device == NULL) return(DM_RET_ERROR); /* * If there are no match entries, then this device matches no * matter what. */ if ((patterns == NULL) || (num_patterns == 0)) return(DM_RET_DESCEND | DM_RET_COPY); for (i = 0; i < num_patterns; i++) { struct device_match_pattern *cur_pattern; struct scsi_vpd_device_id *device_id_page; /* * If the pattern in question isn't for a device node, we * aren't interested. */ if (patterns[i].type != DEV_MATCH_DEVICE) { if ((patterns[i].type == DEV_MATCH_PERIPH) && ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE)) retval |= DM_RET_DESCEND; continue; } cur_pattern = &patterns[i].pattern.device_pattern; /* Error out if mutually exclusive options are specified. */ if ((cur_pattern->flags & (DEV_MATCH_INQUIRY|DEV_MATCH_DEVID)) == (DEV_MATCH_INQUIRY|DEV_MATCH_DEVID)) return(DM_RET_ERROR); /* * If they want to match any device node, we give them any * device node. */ if (cur_pattern->flags == DEV_MATCH_ANY) goto copy_dev_node; /* * Not sure why someone would do this... */ if (cur_pattern->flags == DEV_MATCH_NONE) continue; if (((cur_pattern->flags & DEV_MATCH_PATH) != 0) && (cur_pattern->path_id != device->target->bus->path_id)) continue; if (((cur_pattern->flags & DEV_MATCH_TARGET) != 0) && (cur_pattern->target_id != device->target->target_id)) continue; if (((cur_pattern->flags & DEV_MATCH_LUN) != 0) && (cur_pattern->target_lun != device->lun_id)) continue; if (((cur_pattern->flags & DEV_MATCH_INQUIRY) != 0) && (cam_quirkmatch((caddr_t)&device->inq_data, (caddr_t)&cur_pattern->data.inq_pat, 1, sizeof(cur_pattern->data.inq_pat), scsi_static_inquiry_match) == NULL)) continue; device_id_page = (struct scsi_vpd_device_id *)device->device_id; if (((cur_pattern->flags & DEV_MATCH_DEVID) != 0) && (device->device_id_len < SVPD_DEVICE_ID_HDR_LEN || scsi_devid_match((uint8_t *)device_id_page->desc_list, device->device_id_len - SVPD_DEVICE_ID_HDR_LEN, cur_pattern->data.devid_pat.id, cur_pattern->data.devid_pat.id_len) != 0)) continue; copy_dev_node: /* * If we get to this point, the user definitely wants * information on this device. So tell the caller to copy * the data out. */ retval |= DM_RET_COPY; /* * If the return action has been set to descend, then we * know that we've already seen a peripheral matching * expression, therefore we need to further descend the tree. * This won't change by continuing around the loop, so we * go ahead and return. If we haven't seen a peripheral * matching expression, we keep going around the loop until * we exhaust the matching expressions. We'll set the stop * flag once we fall out of the loop. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_DESCEND) return(retval); } /* * If the return action hasn't been set to descend yet, that means * we haven't seen any peripheral matching patterns. So tell the * caller to stop descending the tree -- the user doesn't want to * match against lower level tree elements. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE) retval |= DM_RET_STOP; return(retval); } /* * Match a single peripheral against any number of match patterns. 
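* A matching illustration: to match peripheral "da" unit 4, a caller
* sets type = DEV_MATCH_PERIPH and, in the periph_pattern,
* flags = PERIPH_MATCH_NAME | PERIPH_MATCH_UNIT, periph_name "da"
* and unit_number 4.  (Illustrative only; any flag combination the
* caller chooses is honored by the checks below.)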
*/ static dev_match_ret xptperiphmatch(struct dev_match_pattern *patterns, u_int num_patterns, struct cam_periph *periph) { dev_match_ret retval; int i; /* * If we aren't given something to match against, that's an error. */ if (periph == NULL) return(DM_RET_ERROR); /* * If there are no match entries, then this peripheral matches no * matter what. */ if ((patterns == NULL) || (num_patterns == 0)) return(DM_RET_STOP | DM_RET_COPY); /* * There aren't any nodes below a peripheral node, so there's no * reason to descend the tree any further. */ retval = DM_RET_STOP; for (i = 0; i < num_patterns; i++) { struct periph_match_pattern *cur_pattern; /* * If the pattern in question isn't for a peripheral, we * aren't interested. */ if (patterns[i].type != DEV_MATCH_PERIPH) continue; cur_pattern = &patterns[i].pattern.periph_pattern; /* * If they want to match on anything, then we will do so. */ if (cur_pattern->flags == PERIPH_MATCH_ANY) { /* set the copy flag */ retval |= DM_RET_COPY; /* * We've already set the return action to stop, * since there are no nodes below peripherals in * the tree. */ return(retval); } /* * Not sure why someone would do this... */ if (cur_pattern->flags == PERIPH_MATCH_NONE) continue; if (((cur_pattern->flags & PERIPH_MATCH_PATH) != 0) && (cur_pattern->path_id != periph->path->bus->path_id)) continue; /* * For the target and lun id's, we have to make sure the * target and lun pointers aren't NULL. The xpt peripheral * has a wildcard target and device. */ if (((cur_pattern->flags & PERIPH_MATCH_TARGET) != 0) && ((periph->path->target == NULL) ||(cur_pattern->target_id != periph->path->target->target_id))) continue; if (((cur_pattern->flags & PERIPH_MATCH_LUN) != 0) && ((periph->path->device == NULL) || (cur_pattern->target_lun != periph->path->device->lun_id))) continue; if (((cur_pattern->flags & PERIPH_MATCH_UNIT) != 0) && (cur_pattern->unit_number != periph->unit_number)) continue; if (((cur_pattern->flags & PERIPH_MATCH_NAME) != 0) && (strncmp(cur_pattern->periph_name, periph->periph_name, DEV_IDLEN) != 0)) continue; /* * If we get to this point, the user definitely wants * information on this peripheral. So tell the caller to * copy the data out. */ retval |= DM_RET_COPY; /* * The return action has already been set to stop, since * peripherals don't have any nodes below them in the EDT. */ return(retval); } /* * If we get to this point, the peripheral that was passed in * doesn't match any of the patterns. */ return(retval); } static int xptedtbusfunc(struct cam_eb *bus, void *arg) { struct ccb_dev_match *cdm; struct cam_et *target; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; /* * If our position is for something deeper in the tree, that means * that we've already seen this node. So, we keep going down. */ if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target != NULL)) retval = DM_RET_DESCEND; else retval = xptbusmatch(cdm->patterns, cdm->num_patterns, bus); /* * If we got an error, bail out of the search. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this bus out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. 
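*
* The userland half of this handshake (a sketch inferred from the
* status codes used here, not from any one consumer) is a
* resubmission loop over an XPT_DEV_MATCH ccb:
*
*	do {
*		ioctl(fd, CAMIOCOMMAND, &ccb);
*	} while (ccb.cdm.status == CAM_DEV_MATCH_MORE);
*
* The cdm.pos saved here is handed back unchanged on the next pass,
* which is how the traversal resumes where it stopped.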
*/ if (spaceleft < sizeof(struct dev_match_result)) { bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_EDT | CAM_DEV_POS_BUS; cdm->pos.cookie.bus = bus; cdm->pos.generations[CAM_BUS_GENERATION]= xsoftc.bus_generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_BUS; cdm->matches[j].result.bus_result.path_id = bus->path_id; cdm->matches[j].result.bus_result.bus_id = bus->sim->bus_id; cdm->matches[j].result.bus_result.unit_number = bus->sim->unit_number; strncpy(cdm->matches[j].result.bus_result.dev_name, bus->sim->sim_name, DEV_IDLEN); } /* * If the user is only interested in busses, there's no * reason to descend to the next level in the tree. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_STOP) return(1); /* * If there is a target generation recorded, check it to * make sure the target list hasn't changed. */ mtx_lock(&bus->eb_mtx); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target != NULL)) { if ((cdm->pos.generations[CAM_TARGET_GENERATION] != bus->generation)) { mtx_unlock(&bus->eb_mtx); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return (0); } target = (struct cam_et *)cdm->pos.cookie.target; target->refcount++; } else target = NULL; mtx_unlock(&bus->eb_mtx); return (xpttargettraverse(bus, target, xptedttargetfunc, arg)); } static int xptedttargetfunc(struct cam_et *target, void *arg) { struct ccb_dev_match *cdm; struct cam_eb *bus; struct cam_ed *device; cdm = (struct ccb_dev_match *)arg; bus = target->bus; /* * If there is a device list generation recorded, check it to * make sure the device list hasn't changed. */ mtx_lock(&bus->eb_mtx); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target == target) && (cdm->pos.position_type & CAM_DEV_POS_DEVICE) && (cdm->pos.cookie.device != NULL)) { if (cdm->pos.generations[CAM_DEV_GENERATION] != target->generation) { mtx_unlock(&bus->eb_mtx); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } device = (struct cam_ed *)cdm->pos.cookie.device; device->refcount++; } else device = NULL; mtx_unlock(&bus->eb_mtx); return (xptdevicetraverse(target, device, xptedtdevicefunc, arg)); } static int xptedtdevicefunc(struct cam_ed *device, void *arg) { struct cam_eb *bus; struct cam_periph *periph; struct ccb_dev_match *cdm; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; bus = device->target->bus; /* * If our position is for something deeper in the tree, that means * that we've already seen this node. So, we keep going down. */ if ((cdm->pos.position_type & CAM_DEV_POS_DEVICE) && (cdm->pos.cookie.device == device) && (cdm->pos.position_type & CAM_DEV_POS_PERIPH) && (cdm->pos.cookie.periph != NULL)) retval = DM_RET_DESCEND; else retval = xptdevicematch(cdm->patterns, cdm->num_patterns, device); if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this device out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. 
*/ if (spaceleft < sizeof(struct dev_match_result)) { bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_EDT | CAM_DEV_POS_BUS | CAM_DEV_POS_TARGET | CAM_DEV_POS_DEVICE; cdm->pos.cookie.bus = device->target->bus; cdm->pos.generations[CAM_BUS_GENERATION]= xsoftc.bus_generation; cdm->pos.cookie.target = device->target; cdm->pos.generations[CAM_TARGET_GENERATION] = device->target->bus->generation; cdm->pos.cookie.device = device; cdm->pos.generations[CAM_DEV_GENERATION] = device->target->generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_DEVICE; cdm->matches[j].result.device_result.path_id = device->target->bus->path_id; cdm->matches[j].result.device_result.target_id = device->target->target_id; cdm->matches[j].result.device_result.target_lun = device->lun_id; cdm->matches[j].result.device_result.protocol = device->protocol; bcopy(&device->inq_data, &cdm->matches[j].result.device_result.inq_data, sizeof(struct scsi_inquiry_data)); bcopy(&device->ident_data, &cdm->matches[j].result.device_result.ident_data, sizeof(struct ata_params)); /* Let the user know whether this device is unconfigured */ if (device->flags & CAM_DEV_UNCONFIGURED) cdm->matches[j].result.device_result.flags = DEV_RESULT_UNCONFIGURED; else cdm->matches[j].result.device_result.flags = DEV_RESULT_NOFLAG; } /* * If the user isn't interested in peripherals, don't descend * the tree any further. */ if ((retval & DM_RET_ACTION_MASK) == DM_RET_STOP) return(1); /* * If there is a peripheral list generation recorded, make sure * it hasn't changed. */ xpt_lock_buses(); mtx_lock(&bus->eb_mtx); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus == bus) && (cdm->pos.position_type & CAM_DEV_POS_TARGET) && (cdm->pos.cookie.target == device->target) && (cdm->pos.position_type & CAM_DEV_POS_DEVICE) && (cdm->pos.cookie.device == device) && (cdm->pos.position_type & CAM_DEV_POS_PERIPH) && (cdm->pos.cookie.periph != NULL)) { if (cdm->pos.generations[CAM_PERIPH_GENERATION] != device->generation) { mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } periph = (struct cam_periph *)cdm->pos.cookie.periph; periph->refcount++; } else periph = NULL; mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); return (xptperiphtraverse(device, periph, xptedtperiphfunc, arg)); } static int xptedtperiphfunc(struct cam_periph *periph, void *arg) { struct ccb_dev_match *cdm; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; retval = xptperiphmatch(cdm->patterns, cdm->num_patterns, periph); if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this peripheral out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. 
*/ if (spaceleft < sizeof(struct dev_match_result)) { bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_EDT | CAM_DEV_POS_BUS | CAM_DEV_POS_TARGET | CAM_DEV_POS_DEVICE | CAM_DEV_POS_PERIPH; cdm->pos.cookie.bus = periph->path->bus; cdm->pos.generations[CAM_BUS_GENERATION]= xsoftc.bus_generation; cdm->pos.cookie.target = periph->path->target; cdm->pos.generations[CAM_TARGET_GENERATION] = periph->path->bus->generation; cdm->pos.cookie.device = periph->path->device; cdm->pos.generations[CAM_DEV_GENERATION] = periph->path->target->generation; cdm->pos.cookie.periph = periph; cdm->pos.generations[CAM_PERIPH_GENERATION] = periph->path->device->generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_PERIPH; cdm->matches[j].result.periph_result.path_id = periph->path->bus->path_id; cdm->matches[j].result.periph_result.target_id = periph->path->target->target_id; cdm->matches[j].result.periph_result.target_lun = periph->path->device->lun_id; cdm->matches[j].result.periph_result.unit_number = periph->unit_number; strncpy(cdm->matches[j].result.periph_result.periph_name, periph->periph_name, DEV_IDLEN); } return(1); } static int xptedtmatch(struct ccb_dev_match *cdm) { struct cam_eb *bus; int ret; cdm->num_matches = 0; /* * Check the bus list generation. If it has changed, the user * needs to reset everything and start over. */ xpt_lock_buses(); if ((cdm->pos.position_type & CAM_DEV_POS_BUS) && (cdm->pos.cookie.bus != NULL)) { if (cdm->pos.generations[CAM_BUS_GENERATION] != xsoftc.bus_generation) { xpt_unlock_buses(); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } bus = (struct cam_eb *)cdm->pos.cookie.bus; bus->refcount++; } else bus = NULL; xpt_unlock_buses(); ret = xptbustraverse(bus, xptedtbusfunc, cdm); /* * If we get back 0, that means that we had to stop before fully * traversing the EDT. It also means that one of the subroutines * has set the status field to the proper value. If we get back 1, * we've fully traversed the EDT and copied out any matching entries. */ if (ret == 1) cdm->status = CAM_DEV_MATCH_LAST; return(ret); } static int xptplistpdrvfunc(struct periph_driver **pdrv, void *arg) { struct cam_periph *periph; struct ccb_dev_match *cdm; cdm = (struct ccb_dev_match *)arg; xpt_lock_buses(); if ((cdm->pos.position_type & CAM_DEV_POS_PDPTR) && (cdm->pos.cookie.pdrv == pdrv) && (cdm->pos.position_type & CAM_DEV_POS_PERIPH) && (cdm->pos.cookie.periph != NULL)) { if (cdm->pos.generations[CAM_PERIPH_GENERATION] != (*pdrv)->generation) { xpt_unlock_buses(); cdm->status = CAM_DEV_MATCH_LIST_CHANGED; return(0); } periph = (struct cam_periph *)cdm->pos.cookie.periph; periph->refcount++; } else periph = NULL; xpt_unlock_buses(); return (xptpdperiphtraverse(pdrv, periph, xptplistperiphfunc, arg)); } static int xptplistperiphfunc(struct cam_periph *periph, void *arg) { struct ccb_dev_match *cdm; dev_match_ret retval; cdm = (struct ccb_dev_match *)arg; retval = xptperiphmatch(cdm->patterns, cdm->num_patterns, periph); if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } /* * If the copy flag is set, copy this peripheral out. */ if (retval & DM_RET_COPY) { int spaceleft, j; spaceleft = cdm->match_buf_len - (cdm->num_matches * sizeof(struct dev_match_result)); /* * If we don't have enough space to put in another * match result, save our position and tell the * user there are more devices to check. 
*/ if (spaceleft < sizeof(struct dev_match_result)) { struct periph_driver **pdrv; pdrv = NULL; bzero(&cdm->pos, sizeof(cdm->pos)); cdm->pos.position_type = CAM_DEV_POS_PDRV | CAM_DEV_POS_PDPTR | CAM_DEV_POS_PERIPH; /* * This may look a bit nonsensical, but it is * actually quite logical. There are very few * peripheral drivers, and bloating every peripheral * structure with a pointer back to its parent * peripheral driver linker set entry would cost * more in the long run than doing this quick lookup. */ for (pdrv = periph_drivers; *pdrv != NULL; pdrv++) { if (strcmp((*pdrv)->driver_name, periph->periph_name) == 0) break; } if (*pdrv == NULL) { cdm->status = CAM_DEV_MATCH_ERROR; return(0); } cdm->pos.cookie.pdrv = pdrv; /* * The periph generation slot does double duty, as * does the periph pointer slot. They are used for * both EDT and pdrv lookups and positioning. */ cdm->pos.cookie.periph = periph; cdm->pos.generations[CAM_PERIPH_GENERATION] = (*pdrv)->generation; cdm->status = CAM_DEV_MATCH_MORE; return(0); } j = cdm->num_matches; cdm->num_matches++; cdm->matches[j].type = DEV_MATCH_PERIPH; cdm->matches[j].result.periph_result.path_id = periph->path->bus->path_id; /* * The transport layer peripheral doesn't have a target or * lun. */ if (periph->path->target) cdm->matches[j].result.periph_result.target_id = periph->path->target->target_id; else cdm->matches[j].result.periph_result.target_id = CAM_TARGET_WILDCARD; if (periph->path->device) cdm->matches[j].result.periph_result.target_lun = periph->path->device->lun_id; else cdm->matches[j].result.periph_result.target_lun = CAM_LUN_WILDCARD; cdm->matches[j].result.periph_result.unit_number = periph->unit_number; strncpy(cdm->matches[j].result.periph_result.periph_name, periph->periph_name, DEV_IDLEN); } return(1); } static int xptperiphlistmatch(struct ccb_dev_match *cdm) { int ret; cdm->num_matches = 0; /* * At this point in the EDT traversal function, we check the bus * list generation to make sure that no busses have been added or * removed since the user last sent an XPT_DEV_MATCH ccb through. * For the peripheral driver list traversal function, however, we * don't have to worry about new peripheral driver types coming or * going; they're in a linker set, and therefore can't change * without a recompile. */ if ((cdm->pos.position_type & CAM_DEV_POS_PDPTR) && (cdm->pos.cookie.pdrv != NULL)) ret = xptpdrvtraverse( (struct periph_driver **)cdm->pos.cookie.pdrv, xptplistpdrvfunc, cdm); else ret = xptpdrvtraverse(NULL, xptplistpdrvfunc, cdm); /* * If we get back 0, that means that we had to stop before fully * traversing the peripheral driver tree. It also means that one of * the subroutines has set the status field to the proper value. If * we get back 1, we've fully traversed the peripheral driver list * and copied out any matching entries.
*/ if (ret == 1) cdm->status = CAM_DEV_MATCH_LAST; return(ret); } static int xptbustraverse(struct cam_eb *start_bus, xpt_busfunc_t *tr_func, void *arg) { struct cam_eb *bus, *next_bus; int retval; retval = 1; if (start_bus) bus = start_bus; else { xpt_lock_buses(); bus = TAILQ_FIRST(&xsoftc.xpt_busses); if (bus == NULL) { xpt_unlock_buses(); return (retval); } bus->refcount++; xpt_unlock_buses(); } for (; bus != NULL; bus = next_bus) { retval = tr_func(bus, arg); if (retval == 0) { xpt_release_bus(bus); break; } xpt_lock_buses(); next_bus = TAILQ_NEXT(bus, links); if (next_bus) next_bus->refcount++; xpt_unlock_buses(); xpt_release_bus(bus); } return(retval); } static int xpttargettraverse(struct cam_eb *bus, struct cam_et *start_target, xpt_targetfunc_t *tr_func, void *arg) { struct cam_et *target, *next_target; int retval; retval = 1; if (start_target) target = start_target; else { mtx_lock(&bus->eb_mtx); target = TAILQ_FIRST(&bus->et_entries); if (target == NULL) { mtx_unlock(&bus->eb_mtx); return (retval); } target->refcount++; mtx_unlock(&bus->eb_mtx); } for (; target != NULL; target = next_target) { retval = tr_func(target, arg); if (retval == 0) { xpt_release_target(target); break; } mtx_lock(&bus->eb_mtx); next_target = TAILQ_NEXT(target, links); if (next_target) next_target->refcount++; mtx_unlock(&bus->eb_mtx); xpt_release_target(target); } return(retval); } static int xptdevicetraverse(struct cam_et *target, struct cam_ed *start_device, xpt_devicefunc_t *tr_func, void *arg) { struct cam_eb *bus; struct cam_ed *device, *next_device; int retval; retval = 1; bus = target->bus; if (start_device) device = start_device; else { mtx_lock(&bus->eb_mtx); device = TAILQ_FIRST(&target->ed_entries); if (device == NULL) { mtx_unlock(&bus->eb_mtx); return (retval); } device->refcount++; mtx_unlock(&bus->eb_mtx); } for (; device != NULL; device = next_device) { mtx_lock(&device->device_mtx); retval = tr_func(device, arg); mtx_unlock(&device->device_mtx); if (retval == 0) { xpt_release_device(device); break; } mtx_lock(&bus->eb_mtx); next_device = TAILQ_NEXT(device, links); if (next_device) next_device->refcount++; mtx_unlock(&bus->eb_mtx); xpt_release_device(device); } return(retval); } static int xptperiphtraverse(struct cam_ed *device, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg) { struct cam_eb *bus; struct cam_periph *periph, *next_periph; int retval; retval = 1; bus = device->target->bus; if (start_periph) periph = start_periph; else { xpt_lock_buses(); mtx_lock(&bus->eb_mtx); periph = SLIST_FIRST(&device->periphs); while (periph != NULL && (periph->flags & CAM_PERIPH_FREE) != 0) periph = SLIST_NEXT(periph, periph_links); if (periph == NULL) { mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); return (retval); } periph->refcount++; mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); } for (; periph != NULL; periph = next_periph) { retval = tr_func(periph, arg); if (retval == 0) { cam_periph_release_locked(periph); break; } xpt_lock_buses(); mtx_lock(&bus->eb_mtx); next_periph = SLIST_NEXT(periph, periph_links); while (next_periph != NULL && (next_periph->flags & CAM_PERIPH_FREE) != 0) next_periph = SLIST_NEXT(next_periph, periph_links); if (next_periph) next_periph->refcount++; mtx_unlock(&bus->eb_mtx); xpt_unlock_buses(); cam_periph_release_locked(periph); } return(retval); } static int xptpdrvtraverse(struct periph_driver **start_pdrv, xpt_pdrvfunc_t *tr_func, void *arg) { struct periph_driver **pdrv; int retval; retval = 1; /* * We don't traverse the peripheral driver 
list like we do the * other lists, because it is a linker set, and therefore cannot be * changed during runtime. If the peripheral driver list is ever * re-done to be something other than a linker set (i.e. it can * change while the system is running), the list traversal should * be modified to work like the other traversal functions. */ for (pdrv = (start_pdrv ? start_pdrv : periph_drivers); *pdrv != NULL; pdrv++) { retval = tr_func(pdrv, arg); if (retval == 0) return(retval); } return(retval); } static int xptpdperiphtraverse(struct periph_driver **pdrv, struct cam_periph *start_periph, xpt_periphfunc_t *tr_func, void *arg) { struct cam_periph *periph, *next_periph; int retval; retval = 1; if (start_periph) periph = start_periph; else { xpt_lock_buses(); periph = TAILQ_FIRST(&(*pdrv)->units); while (periph != NULL && (periph->flags & CAM_PERIPH_FREE) != 0) periph = TAILQ_NEXT(periph, unit_links); if (periph == NULL) { xpt_unlock_buses(); return (retval); } periph->refcount++; xpt_unlock_buses(); } for (; periph != NULL; periph = next_periph) { cam_periph_lock(periph); retval = tr_func(periph, arg); cam_periph_unlock(periph); if (retval == 0) { cam_periph_release(periph); break; } xpt_lock_buses(); next_periph = TAILQ_NEXT(periph, unit_links); while (next_periph != NULL && (next_periph->flags & CAM_PERIPH_FREE) != 0) next_periph = TAILQ_NEXT(next_periph, unit_links); if (next_periph) next_periph->refcount++; xpt_unlock_buses(); cam_periph_release(periph); } return(retval); } static int xptdefbusfunc(struct cam_eb *bus, void *arg) { struct xpt_traverse_config *tr_config; tr_config = (struct xpt_traverse_config *)arg; if (tr_config->depth == XPT_DEPTH_BUS) { xpt_busfunc_t *tr_func; tr_func = (xpt_busfunc_t *)tr_config->tr_func; return(tr_func(bus, tr_config->tr_arg)); } else return(xpttargettraverse(bus, NULL, xptdeftargetfunc, arg)); } static int xptdeftargetfunc(struct cam_et *target, void *arg) { struct xpt_traverse_config *tr_config; tr_config = (struct xpt_traverse_config *)arg; if (tr_config->depth == XPT_DEPTH_TARGET) { xpt_targetfunc_t *tr_func; tr_func = (xpt_targetfunc_t *)tr_config->tr_func; return(tr_func(target, tr_config->tr_arg)); } else return(xptdevicetraverse(target, NULL, xptdefdevicefunc, arg)); } static int xptdefdevicefunc(struct cam_ed *device, void *arg) { struct xpt_traverse_config *tr_config; tr_config = (struct xpt_traverse_config *)arg; if (tr_config->depth == XPT_DEPTH_DEVICE) { xpt_devicefunc_t *tr_func; tr_func = (xpt_devicefunc_t *)tr_config->tr_func; return(tr_func(device, tr_config->tr_arg)); } else return(xptperiphtraverse(device, NULL, xptdefperiphfunc, arg)); } static int xptdefperiphfunc(struct cam_periph *periph, void *arg) { struct xpt_traverse_config *tr_config; xpt_periphfunc_t *tr_func; tr_config = (struct xpt_traverse_config *)arg; tr_func = (xpt_periphfunc_t *)tr_config->tr_func; /* * Unlike the other default functions, we don't check for depth * here. The peripheral driver level is the last level in the EDT, * so if we're here, we should execute the function in question. */ return(tr_func(periph, tr_config->tr_arg)); } /* * Execute the given function for every bus in the EDT. */ static int xpt_for_all_busses(xpt_busfunc_t *tr_func, void *arg) { struct xpt_traverse_config tr_config; tr_config.depth = XPT_DEPTH_BUS; tr_config.tr_func = tr_func; tr_config.tr_arg = arg; return(xptbustraverse(NULL, xptdefbusfunc, &tr_config)); } /* * Execute the given function for every device in the EDT. 
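* A usage sketch (hypothetical callback, not taken from this file):
*
*	static int
*	count_dev(struct cam_ed *device, void *arg)
*	{
*		(*(int *)arg)++;
*		return (1);
*	}
*
*	int n = 0;
*	xpt_for_all_devices(count_dev, &n);
*
* Returning 1 from the callback continues the walk and 0 stops it,
* matching how the traversal functions above interpret the return
* value.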
*/ static int xpt_for_all_devices(xpt_devicefunc_t *tr_func, void *arg) { struct xpt_traverse_config tr_config; tr_config.depth = XPT_DEPTH_DEVICE; tr_config.tr_func = tr_func; tr_config.tr_arg = arg; return(xptbustraverse(NULL, xptdefbusfunc, &tr_config)); } static int xptsetasyncfunc(struct cam_ed *device, void *arg) { struct cam_path path; struct ccb_getdev cgd; struct ccb_setasync *csa = (struct ccb_setasync *)arg; /* * Don't report unconfigured devices (Wildcard devs, * devices only for target mode, device instances * that have been invalidated but are waiting for * their last reference count to be released). */ if ((device->flags & CAM_DEV_UNCONFIGURED) != 0) return (1); xpt_compile_path(&path, NULL, device->target->bus->path_id, device->target->target_id, device->lun_id); xpt_setup_ccb(&cgd.ccb_h, &path, CAM_PRIORITY_NORMAL); cgd.ccb_h.func_code = XPT_GDEV_TYPE; xpt_action((union ccb *)&cgd); csa->callback(csa->callback_arg, AC_FOUND_DEVICE, &path, &cgd); xpt_release_path(&path); return(1); } static int xptsetasyncbusfunc(struct cam_eb *bus, void *arg) { struct cam_path path; struct ccb_pathinq cpi; struct ccb_setasync *csa = (struct ccb_setasync *)arg; xpt_compile_path(&path, /*periph*/NULL, bus->path_id, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); xpt_path_lock(&path); xpt_setup_ccb(&cpi.ccb_h, &path, CAM_PRIORITY_NORMAL); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); csa->callback(csa->callback_arg, AC_PATH_REGISTERED, &path, &cpi); xpt_path_unlock(&path); xpt_release_path(&path); return(1); } void xpt_action(union ccb *start_ccb) { CAM_DEBUG(start_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_action\n")); start_ccb->ccb_h.status = CAM_REQ_INPROG; (*(start_ccb->ccb_h.path->bus->xport->action))(start_ccb); } void xpt_action_default(union ccb *start_ccb) { struct cam_path *path; struct cam_sim *sim; int lock; path = start_ccb->ccb_h.path; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_action_default\n")); switch (start_ccb->ccb_h.func_code) { case XPT_SCSI_IO: { struct cam_ed *device; /* * For the sake of compatibility with SCSI-1 * devices that may not understand the identify * message, we include lun information in the * second byte of all commands. SCSI-1 specifies * that luns are a 3 bit value and reserves only 3 * bits for lun information in the CDB. Later * revisions of the SCSI spec allow for more than 8 * luns, but have deprecated lun information in the * CDB. So, if the lun won't fit, we must omit. * * Also be aware that during initial probing for devices, * the inquiry information is unknown but initialized to 0. * This means that this code will be exercised while probing * devices with an ANSI revision greater than 2. 
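*
* Worked example of the encoding just below: for target_lun == 2 the
* code ORs 2 << 5 == 0x40 into cdb_bytes[1], so a READ(6) whose
* second byte was 0x00 goes out on the wire as 0x40.  Luns of 8 and
* above do not fit in those three bits, which is exactly why the
* lun < 8 test skips them.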
*/ device = path->device; if (device->protocol_version <= SCSI_REV_2 && start_ccb->ccb_h.target_lun < 8 && (start_ccb->ccb_h.flags & CAM_CDB_POINTER) == 0) { start_ccb->csio.cdb_io.cdb_bytes[1] |= start_ccb->ccb_h.target_lun << 5; } start_ccb->csio.scsi_status = SCSI_STATUS_OK; } /* FALLTHROUGH */ case XPT_TARGET_IO: case XPT_CONT_TARGET_IO: start_ccb->csio.sense_resid = 0; start_ccb->csio.resid = 0; /* FALLTHROUGH */ case XPT_ATA_IO: if (start_ccb->ccb_h.func_code == XPT_ATA_IO) start_ccb->ataio.resid = 0; /* FALLTHROUGH */ case XPT_RESET_DEV: case XPT_ENG_EXEC: case XPT_SMP_IO: { struct cam_devq *devq; devq = path->bus->sim->devq; mtx_lock(&devq->send_mtx); cam_ccbq_insert_ccb(&path->device->ccbq, start_ccb); if (xpt_schedule_devq(devq, path->device) != 0) xpt_run_devq(devq); mtx_unlock(&devq->send_mtx); break; } case XPT_CALC_GEOMETRY: /* Filter out garbage */ if (start_ccb->ccg.block_size == 0 || start_ccb->ccg.volume_size == 0) { start_ccb->ccg.cylinders = 0; start_ccb->ccg.heads = 0; start_ccb->ccg.secs_per_track = 0; start_ccb->ccb_h.status = CAM_REQ_CMP; break; } #if defined(PC98) || defined(__sparc64__) /* * In a PC-98 system, geometry translation depends on * the "real" device geometry obtained from mode page 4. * SCSI geometry translation is performed in the * initialization routine of the SCSI BIOS and the result * stored in host memory. If the translation is available * in host memory, use it. If not, rely on the default * translation the device driver performs. * For sparc64, we may need to adjust the geometry of large * disks in order to fit the limitations of the 16-bit * fields of the VTOC8 disk label. */ if (scsi_da_bios_params(&start_ccb->ccg) != 0) { start_ccb->ccb_h.status = CAM_REQ_CMP; break; } #endif goto call_sim; case XPT_ABORT: { union ccb* abort_ccb; abort_ccb = start_ccb->cab.abort_ccb; if (XPT_FC_IS_DEV_QUEUED(abort_ccb)) { if (abort_ccb->ccb_h.pinfo.index >= 0) { struct cam_ccbq *ccbq; struct cam_ed *device; device = abort_ccb->ccb_h.path->device; ccbq = &device->ccbq; cam_ccbq_remove_ccb(ccbq, abort_ccb); abort_ccb->ccb_h.status = CAM_REQ_ABORTED|CAM_DEV_QFRZN; xpt_freeze_devq(abort_ccb->ccb_h.path, 1); xpt_done(abort_ccb); start_ccb->ccb_h.status = CAM_REQ_CMP; break; } if (abort_ccb->ccb_h.pinfo.index == CAM_UNQUEUED_INDEX && (abort_ccb->ccb_h.status & CAM_SIM_QUEUED) == 0) { /* * We've caught this ccb en route to * the SIM. Flag it for abort and the * SIM will do so just before starting * real work on the CCB. */ abort_ccb->ccb_h.status = CAM_REQ_ABORTED|CAM_DEV_QFRZN; xpt_freeze_devq(abort_ccb->ccb_h.path, 1); start_ccb->ccb_h.status = CAM_REQ_CMP; break; } } if (XPT_FC_IS_QUEUED(abort_ccb) && (abort_ccb->ccb_h.pinfo.index == CAM_DONEQ_INDEX)) { /* * It's already completed but waiting * for our SWI to get to it. */ start_ccb->ccb_h.status = CAM_UA_ABORT; break; } /* * If we weren't able to take care of the abort request * in the XPT, pass the request down to the SIM for processing.
*/ } /* FALLTHROUGH */ case XPT_ACCEPT_TARGET_IO: case XPT_EN_LUN: case XPT_IMMED_NOTIFY: case XPT_NOTIFY_ACK: case XPT_RESET_BUS: case XPT_IMMEDIATE_NOTIFY: case XPT_NOTIFY_ACKNOWLEDGE: case XPT_GET_SIM_KNOB: case XPT_SET_SIM_KNOB: case XPT_GET_TRAN_SETTINGS: case XPT_SET_TRAN_SETTINGS: case XPT_PATH_INQ: call_sim: sim = path->bus->sim; lock = (mtx_owned(sim->mtx) == 0); if (lock) CAM_SIM_LOCK(sim); (*(sim->sim_action))(sim, start_ccb); if (lock) CAM_SIM_UNLOCK(sim); break; case XPT_PATH_STATS: start_ccb->cpis.last_reset = path->bus->last_reset; start_ccb->ccb_h.status = CAM_REQ_CMP; break; case XPT_GDEV_TYPE: { struct cam_ed *dev; dev = path->device; if ((dev->flags & CAM_DEV_UNCONFIGURED) != 0) { start_ccb->ccb_h.status = CAM_DEV_NOT_THERE; } else { struct ccb_getdev *cgd; cgd = &start_ccb->cgd; cgd->protocol = dev->protocol; cgd->inq_data = dev->inq_data; cgd->ident_data = dev->ident_data; cgd->inq_flags = dev->inq_flags; cgd->ccb_h.status = CAM_REQ_CMP; cgd->serial_num_len = dev->serial_num_len; if ((dev->serial_num_len > 0) && (dev->serial_num != NULL)) bcopy(dev->serial_num, cgd->serial_num, dev->serial_num_len); } break; } case XPT_GDEV_STATS: { struct cam_ed *dev; dev = path->device; if ((dev->flags & CAM_DEV_UNCONFIGURED) != 0) { start_ccb->ccb_h.status = CAM_DEV_NOT_THERE; } else { struct ccb_getdevstats *cgds; struct cam_eb *bus; struct cam_et *tar; struct cam_devq *devq; cgds = &start_ccb->cgds; bus = path->bus; tar = path->target; devq = bus->sim->devq; mtx_lock(&devq->send_mtx); cgds->dev_openings = dev->ccbq.dev_openings; cgds->dev_active = dev->ccbq.dev_active; cgds->allocated = dev->ccbq.allocated; cgds->queued = cam_ccbq_pending_ccb_count(&dev->ccbq); cgds->held = cgds->allocated - cgds->dev_active - cgds->queued; cgds->last_reset = tar->last_reset; cgds->maxtags = dev->maxtags; cgds->mintags = dev->mintags; if (timevalcmp(&tar->last_reset, &bus->last_reset, <)) cgds->last_reset = bus->last_reset; mtx_unlock(&devq->send_mtx); cgds->ccb_h.status = CAM_REQ_CMP; } break; } case XPT_GDEVLIST: { struct cam_periph *nperiph; struct periph_list *periph_head; struct ccb_getdevlist *cgdl; u_int i; struct cam_ed *device; int found; found = 0; /* * Don't want anyone mucking with our data. */ device = path->device; periph_head = &device->periphs; cgdl = &start_ccb->cgdl; /* * Check and see if the list has changed since the user * last requested a list member. If so, tell them that the * list has changed, and therefore they need to start over * from the beginning. */ if ((cgdl->index != 0) && (cgdl->generation != device->generation)) { cgdl->status = CAM_GDEVLIST_LIST_CHANGED; break; } /* * Traverse the list of peripherals and attempt to find * the requested peripheral. */ for (nperiph = SLIST_FIRST(periph_head), i = 0; (nperiph != NULL) && (i <= cgdl->index); nperiph = SLIST_NEXT(nperiph, periph_links), i++) { if (i == cgdl->index) { strncpy(cgdl->periph_name, nperiph->periph_name, DEV_IDLEN); cgdl->unit_number = nperiph->unit_number; found = 1; } } if (found == 0) { cgdl->status = CAM_GDEVLIST_ERROR; break; } if (nperiph == NULL) cgdl->status = CAM_GDEVLIST_LAST_DEVICE; else cgdl->status = CAM_GDEVLIST_MORE_DEVS; cgdl->index++; cgdl->generation = device->generation; cgdl->ccb_h.status = CAM_REQ_CMP; break; } case XPT_DEV_MATCH: { dev_pos_type position_type; struct ccb_dev_match *cdm; cdm = &start_ccb->cdm; /* * There are two ways of getting at information in the EDT. * The first way is via the primary EDT tree. 
It starts * with a list of busses, then a list of targets on a bus, * then devices/luns on a target, and then peripherals on a * device/lun. The "other" way is by the peripheral driver * lists. The peripheral driver lists are organized by * peripheral driver. (obviously) So it makes sense to * use the peripheral driver list if the user is looking * for something like "da1", or all "da" devices. If the * user is looking for something on a particular bus/target * or lun, it's generally better to go through the EDT tree. */ if (cdm->pos.position_type != CAM_DEV_POS_NONE) position_type = cdm->pos.position_type; else { u_int i; position_type = CAM_DEV_POS_NONE; for (i = 0; i < cdm->num_patterns; i++) { if ((cdm->patterns[i].type == DEV_MATCH_BUS) ||(cdm->patterns[i].type == DEV_MATCH_DEVICE)){ position_type = CAM_DEV_POS_EDT; break; } } if (cdm->num_patterns == 0) position_type = CAM_DEV_POS_EDT; else if (position_type == CAM_DEV_POS_NONE) position_type = CAM_DEV_POS_PDRV; } switch(position_type & CAM_DEV_POS_TYPEMASK) { case CAM_DEV_POS_EDT: xptedtmatch(cdm); break; case CAM_DEV_POS_PDRV: xptperiphlistmatch(cdm); break; default: cdm->status = CAM_DEV_MATCH_ERROR; break; } if (cdm->status == CAM_DEV_MATCH_ERROR) start_ccb->ccb_h.status = CAM_REQ_CMP_ERR; else start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_SASYNC_CB: { struct ccb_setasync *csa; struct async_node *cur_entry; struct async_list *async_head; u_int32_t added; csa = &start_ccb->csa; added = csa->event_enable; async_head = &path->device->asyncs; /* * If there is already an entry for us, simply * update it. */ cur_entry = SLIST_FIRST(async_head); while (cur_entry != NULL) { if ((cur_entry->callback_arg == csa->callback_arg) && (cur_entry->callback == csa->callback)) break; cur_entry = SLIST_NEXT(cur_entry, links); } if (cur_entry != NULL) { /* * If the request has no flags set, * remove the entry. */ added &= ~cur_entry->event_enable; if (csa->event_enable == 0) { SLIST_REMOVE(async_head, cur_entry, async_node, links); xpt_release_device(path->device); free(cur_entry, M_CAMXPT); } else { cur_entry->event_enable = csa->event_enable; } csa->event_enable = added; } else { cur_entry = malloc(sizeof(*cur_entry), M_CAMXPT, M_NOWAIT); if (cur_entry == NULL) { csa->ccb_h.status = CAM_RESRC_UNAVAIL; break; } cur_entry->event_enable = csa->event_enable; cur_entry->event_lock = mtx_owned(path->bus->sim->mtx) ? 1 : 0; cur_entry->callback_arg = csa->callback_arg; cur_entry->callback = csa->callback; SLIST_INSERT_HEAD(async_head, cur_entry, links); xpt_acquire_device(path->device); } start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_REL_SIMQ: { struct ccb_relsim *crs; struct cam_ed *dev; crs = &start_ccb->crs; dev = path->device; if (dev == NULL) { crs->ccb_h.status = CAM_DEV_NOT_THERE; break; } if ((crs->release_flags & RELSIM_ADJUST_OPENINGS) != 0) { /* Don't ever go below one opening */ if (crs->openings > 0) { xpt_dev_ccbq_resize(path, crs->openings); if (bootverbose) { xpt_print(path, "number of openings is now %d\n", crs->openings); } } } mtx_lock(&dev->sim->devq->send_mtx); if ((crs->release_flags & RELSIM_RELEASE_AFTER_TIMEOUT) != 0) { if ((dev->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) { /* * Just extend the old timeout and decrement * the freeze count so that a single timeout * is sufficient for releasing the queue. 
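*
* (Units note, derived from the callout_reset_sbt() call below:
* release_timeout is scaled by SBT_1MS, so callers express the
* timeout in milliseconds; release_timeout = 500, for example,
* re-arms the release callout half a second out.)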
*/ start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE; callout_stop(&dev->callout); } else { start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; } callout_reset_sbt(&dev->callout, SBT_1MS * crs->release_timeout, 0, xpt_release_devq_timeout, dev, 0); dev->flags |= CAM_DEV_REL_TIMEOUT_PENDING; } if ((crs->release_flags & RELSIM_RELEASE_AFTER_CMDCMPLT) != 0) { if ((dev->flags & CAM_DEV_REL_ON_COMPLETE) != 0) { /* * Decrement the freeze count so that a single * completion is still sufficient to unfreeze * the queue. */ start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE; } else { dev->flags |= CAM_DEV_REL_ON_COMPLETE; start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; } } if ((crs->release_flags & RELSIM_RELEASE_AFTER_QEMPTY) != 0) { if ((dev->flags & CAM_DEV_REL_ON_QUEUE_EMPTY) != 0 || (dev->ccbq.dev_active == 0)) { start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE; } else { dev->flags |= CAM_DEV_REL_ON_QUEUE_EMPTY; start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE; } } mtx_unlock(&dev->sim->devq->send_mtx); if ((start_ccb->ccb_h.flags & CAM_DEV_QFREEZE) == 0) xpt_release_devq(path, /*count*/1, /*run_queue*/TRUE); start_ccb->crs.qfrozen_cnt = dev->ccbq.queue.qfrozen_cnt; start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_DEBUG: { struct cam_path *oldpath; /* Check that all request bits are supported. */ if (start_ccb->cdbg.flags & ~(CAM_DEBUG_COMPILE)) { start_ccb->ccb_h.status = CAM_FUNC_NOTAVAIL; break; } cam_dflags = CAM_DEBUG_NONE; if (cam_dpath != NULL) { oldpath = cam_dpath; cam_dpath = NULL; xpt_free_path(oldpath); } if (start_ccb->cdbg.flags != CAM_DEBUG_NONE) { if (xpt_create_path(&cam_dpath, NULL, start_ccb->ccb_h.path_id, start_ccb->ccb_h.target_id, start_ccb->ccb_h.target_lun) != CAM_REQ_CMP) { start_ccb->ccb_h.status = CAM_RESRC_UNAVAIL; } else { cam_dflags = start_ccb->cdbg.flags; start_ccb->ccb_h.status = CAM_REQ_CMP; xpt_print(cam_dpath, "debugging flags now %x\n", cam_dflags); } } else start_ccb->ccb_h.status = CAM_REQ_CMP; break; } case XPT_NOOP: if ((start_ccb->ccb_h.flags & CAM_DEV_QFREEZE) != 0) xpt_freeze_devq(path, 1); start_ccb->ccb_h.status = CAM_REQ_CMP; break; default: case XPT_SDEV_TYPE: case XPT_TERM_IO: case XPT_ENG_INQ: /* XXX Implement */ printf("%s: CCB type %#x not supported\n", __func__, start_ccb->ccb_h.func_code); start_ccb->ccb_h.status = CAM_PROVIDE_FAIL; if (start_ccb->ccb_h.func_code & XPT_FC_DEV_QUEUED) { xpt_done(start_ccb); } break; } } void xpt_polled_action(union ccb *start_ccb) { u_int32_t timeout; struct cam_sim *sim; struct cam_devq *devq; struct cam_ed *dev; timeout = start_ccb->ccb_h.timeout * 10; sim = start_ccb->ccb_h.path->bus->sim; devq = sim->devq; dev = start_ccb->ccb_h.path->device; mtx_unlock(&dev->device_mtx); /* * Steal an opening so that no other queued requests * can get it before us while we simulate interrupts. */ mtx_lock(&devq->send_mtx); dev->ccbq.dev_openings--; while((devq->send_openings <= 0 || dev->ccbq.dev_openings < 0) && (--timeout > 0)) { mtx_unlock(&devq->send_mtx); DELAY(100); CAM_SIM_LOCK(sim); (*(sim->sim_poll))(sim); CAM_SIM_UNLOCK(sim); camisr_runqueue(); mtx_lock(&devq->send_mtx); } dev->ccbq.dev_openings++; mtx_unlock(&devq->send_mtx); if (timeout != 0) { xpt_action(start_ccb); while(--timeout > 0) { CAM_SIM_LOCK(sim); (*(sim->sim_poll))(sim); CAM_SIM_UNLOCK(sim); camisr_runqueue(); if ((start_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_INPROG) break; DELAY(100); } if (timeout == 0) { /* * XXX Is it worth adding a sim_timeout entry * point so we can attempt recovery? If * this is only used for dumps, I don't think * it is. 
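 *
 * (In practice this polled path is exercised by the crash dump
 * code, which cannot sleep waiting for a completion interrupt.)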
*/ start_ccb->ccb_h.status = CAM_CMD_TIMEOUT; } } else { start_ccb->ccb_h.status = CAM_RESRC_UNAVAIL; } mtx_lock(&dev->device_mtx); } /* * Schedule a peripheral driver to receive a ccb when its * target device has space for more transactions. */ void xpt_schedule(struct cam_periph *periph, u_int32_t new_priority) { CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("xpt_schedule\n")); cam_periph_assert(periph, MA_OWNED); if (new_priority < periph->scheduled_priority) { periph->scheduled_priority = new_priority; xpt_run_allocq(periph, 0); } } /* * Schedule a device to run on a given queue. * If the device was inserted as a new entry on the queue, * return 1 meaning the device queue should be run. If we * were already queued, implying someone else has already * started the queue, return 0 so the caller doesn't attempt * to run the queue. */ static int xpt_schedule_dev(struct camq *queue, cam_pinfo *pinfo, u_int32_t new_priority) { int retval; u_int32_t old_priority; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_schedule_dev\n")); old_priority = pinfo->priority; /* * Are we already queued? */ if (pinfo->index != CAM_UNQUEUED_INDEX) { /* Simply reorder based on new priority */ if (new_priority < old_priority) { camq_change_priority(queue, pinfo->index, new_priority); CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("changed priority to %d\n", new_priority)); retval = 1; } else retval = 0; } else { /* New entry on the queue */ if (new_priority < old_priority) pinfo->priority = new_priority; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("Inserting onto queue\n")); pinfo->generation = ++queue->generation; camq_insert(queue, pinfo); retval = 1; } return (retval); } static void xpt_run_allocq_task(void *context, int pending) { struct cam_periph *periph = context; cam_periph_lock(periph); periph->flags &= ~CAM_PERIPH_RUN_TASK; xpt_run_allocq(periph, 1); cam_periph_unlock(periph); cam_periph_release(periph); } static void xpt_run_allocq(struct cam_periph *periph, int sleep) { struct cam_ed *device; union ccb *ccb; uint32_t prio; cam_periph_assert(periph, MA_OWNED); if (periph->periph_allocating) return; periph->periph_allocating = 1; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_run_allocq(%p)\n", periph)); device = periph->path->device; ccb = NULL; restart: while ((prio = min(periph->scheduled_priority, periph->immediate_priority)) != CAM_PRIORITY_NONE && (periph->periph_allocated - (ccb != NULL ? 
1 : 0) < device->ccbq.total_openings || prio <= CAM_PRIORITY_OOB)) { if (ccb == NULL && (ccb = xpt_get_ccb_nowait(periph)) == NULL) { if (sleep) { ccb = xpt_get_ccb(periph); goto restart; } if (periph->flags & CAM_PERIPH_RUN_TASK) break; cam_periph_doacquire(periph); periph->flags |= CAM_PERIPH_RUN_TASK; taskqueue_enqueue(xsoftc.xpt_taskq, &periph->periph_run_task); break; } xpt_setup_ccb(&ccb->ccb_h, periph->path, prio); if (prio == periph->immediate_priority) { periph->immediate_priority = CAM_PRIORITY_NONE; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("waking cam_periph_getccb()\n")); SLIST_INSERT_HEAD(&periph->ccb_list, &ccb->ccb_h, periph_links.sle); wakeup(&periph->ccb_list); } else { periph->scheduled_priority = CAM_PRIORITY_NONE; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("calling periph_start()\n")); periph->periph_start(periph, ccb); } ccb = NULL; } if (ccb != NULL) xpt_release_ccb(ccb); periph->periph_allocating = 0; } static void xpt_run_devq(struct cam_devq *devq) { char cdb_str[(SCSI_MAX_CDBLEN * 3) + 1]; int lock; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_run_devq\n")); devq->send_queue.qfrozen_cnt++; while ((devq->send_queue.entries > 0) && (devq->send_openings > 0) && (devq->send_queue.qfrozen_cnt <= 1)) { struct cam_ed *device; union ccb *work_ccb; struct cam_sim *sim; device = (struct cam_ed *)camq_remove(&devq->send_queue, CAMQ_HEAD); CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("running device %p\n", device)); work_ccb = cam_ccbq_peek_ccb(&device->ccbq, CAMQ_HEAD); if (work_ccb == NULL) { printf("device on run queue with no ccbs???\n"); continue; } if ((work_ccb->ccb_h.flags & CAM_HIGH_POWER) != 0) { mtx_lock(&xsoftc.xpt_highpower_lock); if (xsoftc.num_highpower <= 0) { /* * We got a high power command, but we * don't have any available slots. Freeze * the device queue until we have a slot * available. */ xpt_freeze_devq_device(device, 1); STAILQ_INSERT_TAIL(&xsoftc.highpowerq, device, highpowerq_entry); mtx_unlock(&xsoftc.xpt_highpower_lock); continue; } else { /* * Consume a high power slot while * this ccb runs. */ xsoftc.num_highpower--; } mtx_unlock(&xsoftc.xpt_highpower_lock); } cam_ccbq_remove_ccb(&device->ccbq, work_ccb); cam_ccbq_send_ccb(&device->ccbq, work_ccb); devq->send_openings--; devq->send_active++; xpt_schedule_devq(devq, device); mtx_unlock(&devq->send_mtx); if ((work_ccb->ccb_h.flags & CAM_DEV_QFREEZE) != 0) { /* * The client wants to freeze the queue * after this CCB is sent. */ xpt_freeze_devq(work_ccb->ccb_h.path, 1); } /* In Target mode, the peripheral driver knows best... */ if (work_ccb->ccb_h.func_code == XPT_SCSI_IO) { if ((device->inq_flags & SID_CmdQue) != 0 && work_ccb->csio.tag_action != CAM_TAG_ACTION_NONE) work_ccb->ccb_h.flags |= CAM_TAG_ACTION_VALID; else /* * Clear this in case of a retried CCB that * failed due to a rejected tag. */ work_ccb->ccb_h.flags &= ~CAM_TAG_ACTION_VALID; } switch (work_ccb->ccb_h.func_code) { case XPT_SCSI_IO: CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_CDB,("%s. CDB: %s\n", scsi_op_desc(work_ccb->csio.cdb_io.cdb_bytes[0], &device->inq_data), scsi_cdb_string(work_ccb->csio.cdb_io.cdb_bytes, cdb_str, sizeof(cdb_str)))); break; case XPT_ATA_IO: CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_CDB,("%s. ACB: %s\n", ata_op_string(&work_ccb->ataio.cmd), ata_cmd_string(&work_ccb->ataio.cmd, cdb_str, sizeof(cdb_str)))); break; default: break; } /* * Device queues can be shared among multiple SIM instances * that reside on different busses. Use the SIM from the * queued device, rather than the one from the calling bus. 
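 *
 * (Several busses registered by one driver may share a single
 * cam_devq, so the CCB is handed to the sim_action routine of the
 * SIM that owns the device at the head of the queue.)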
*/ sim = device->sim; lock = (mtx_owned(sim->mtx) == 0); if (lock) CAM_SIM_LOCK(sim); (*(sim->sim_action))(sim, work_ccb); if (lock) CAM_SIM_UNLOCK(sim); mtx_lock(&devq->send_mtx); } devq->send_queue.qfrozen_cnt--; } /* * This function merges stuff from the slave ccb into the master ccb, while * keeping important fields in the master ccb constant. */ void xpt_merge_ccb(union ccb *master_ccb, union ccb *slave_ccb) { /* * Pull fields that are valid for peripheral drivers to set * into the master CCB along with the CCB "payload". */ master_ccb->ccb_h.retry_count = slave_ccb->ccb_h.retry_count; master_ccb->ccb_h.func_code = slave_ccb->ccb_h.func_code; master_ccb->ccb_h.timeout = slave_ccb->ccb_h.timeout; master_ccb->ccb_h.flags = slave_ccb->ccb_h.flags; bcopy(&(&slave_ccb->ccb_h)[1], &(&master_ccb->ccb_h)[1], sizeof(union ccb) - sizeof(struct ccb_hdr)); } void -xpt_setup_ccb(struct ccb_hdr *ccb_h, struct cam_path *path, u_int32_t priority) +xpt_setup_ccb_flags(struct ccb_hdr *ccb_h, struct cam_path *path, + u_int32_t priority, u_int32_t flags) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_setup_ccb\n")); ccb_h->pinfo.priority = priority; ccb_h->path = path; ccb_h->path_id = path->bus->path_id; if (path->target) ccb_h->target_id = path->target->target_id; else ccb_h->target_id = CAM_TARGET_WILDCARD; if (path->device) { ccb_h->target_lun = path->device->lun_id; ccb_h->pinfo.generation = ++path->device->ccbq.queue.generation; } else { ccb_h->target_lun = CAM_TARGET_WILDCARD; } ccb_h->pinfo.index = CAM_UNQUEUED_INDEX; - ccb_h->flags = 0; + ccb_h->flags = flags; ccb_h->xflags = 0; +} + +void +xpt_setup_ccb(struct ccb_hdr *ccb_h, struct cam_path *path, u_int32_t priority) +{ + xpt_setup_ccb_flags(ccb_h, path, priority, /*flags*/ 0); } /* Path manipulation functions */ cam_status xpt_create_path(struct cam_path **new_path_ptr, struct cam_periph *perph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id) { struct cam_path *path; cam_status status; path = (struct cam_path *)malloc(sizeof(*path), M_CAMPATH, M_NOWAIT); if (path == NULL) { status = CAM_RESRC_UNAVAIL; return(status); } status = xpt_compile_path(path, perph, path_id, target_id, lun_id); if (status != CAM_REQ_CMP) { free(path, M_CAMPATH); path = NULL; } *new_path_ptr = path; return (status); } cam_status xpt_create_path_unlocked(struct cam_path **new_path_ptr, struct cam_periph *periph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id) { return (xpt_create_path(new_path_ptr, periph, path_id, target_id, lun_id)); } cam_status xpt_compile_path(struct cam_path *new_path, struct cam_periph *perph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id) { struct cam_eb *bus; struct cam_et *target; struct cam_ed *device; cam_status status; status = CAM_REQ_CMP; /* Completed without error */ target = NULL; /* Wildcarded */ device = NULL; /* Wildcarded */ /* * We will potentially modify the EDT, so block interrupts * that may attempt to create cam paths. 
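 *
 * (The lookups below take the topology and per-bus locks, creating
 * any missing target and device entries along the way.)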
*/ bus = xpt_find_bus(path_id); if (bus == NULL) { status = CAM_PATH_INVALID; } else { xpt_lock_buses(); mtx_lock(&bus->eb_mtx); target = xpt_find_target(bus, target_id); if (target == NULL) { /* Create one */ struct cam_et *new_target; new_target = xpt_alloc_target(bus, target_id); if (new_target == NULL) { status = CAM_RESRC_UNAVAIL; } else { target = new_target; } } xpt_unlock_buses(); if (target != NULL) { device = xpt_find_device(target, lun_id); if (device == NULL) { /* Create one */ struct cam_ed *new_device; new_device = (*(bus->xport->alloc_device))(bus, target, lun_id); if (new_device == NULL) { status = CAM_RESRC_UNAVAIL; } else { device = new_device; } } } mtx_unlock(&bus->eb_mtx); } /* * Only touch the user's data if we are successful. */ if (status == CAM_REQ_CMP) { new_path->periph = perph; new_path->bus = bus; new_path->target = target; new_path->device = device; CAM_DEBUG(new_path, CAM_DEBUG_TRACE, ("xpt_compile_path\n")); } else { if (device != NULL) xpt_release_device(device); if (target != NULL) xpt_release_target(target); if (bus != NULL) xpt_release_bus(bus); } return (status); } cam_status xpt_clone_path(struct cam_path **new_path_ptr, struct cam_path *path) { struct cam_path *new_path; new_path = (struct cam_path *)malloc(sizeof(*path), M_CAMPATH, M_NOWAIT); if (new_path == NULL) return(CAM_RESRC_UNAVAIL); xpt_copy_path(new_path, path); *new_path_ptr = new_path; return (CAM_REQ_CMP); } void xpt_copy_path(struct cam_path *new_path, struct cam_path *path) { *new_path = *path; if (path->bus != NULL) xpt_acquire_bus(path->bus); if (path->target != NULL) xpt_acquire_target(path->target); if (path->device != NULL) xpt_acquire_device(path->device); } void xpt_release_path(struct cam_path *path) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_release_path\n")); if (path->device != NULL) { xpt_release_device(path->device); path->device = NULL; } if (path->target != NULL) { xpt_release_target(path->target); path->target = NULL; } if (path->bus != NULL) { xpt_release_bus(path->bus); path->bus = NULL; } } void xpt_free_path(struct cam_path *path) { CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_free_path\n")); xpt_release_path(path); free(path, M_CAMPATH); } void xpt_path_counts(struct cam_path *path, uint32_t *bus_ref, uint32_t *periph_ref, uint32_t *target_ref, uint32_t *device_ref) { xpt_lock_buses(); if (bus_ref) { if (path->bus) *bus_ref = path->bus->refcount; else *bus_ref = 0; } if (periph_ref) { if (path->periph) *periph_ref = path->periph->refcount; else *periph_ref = 0; } xpt_unlock_buses(); if (target_ref) { if (path->target) *target_ref = path->target->refcount; else *target_ref = 0; } if (device_ref) { if (path->device) *device_ref = path->device->refcount; else *device_ref = 0; } } /* * Return -1 for failure, 0 for exact match, 1 for match with wildcards * in path1, 2 for match with wildcards in path2. 
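 *
 * For example, comparing a fully wildcarded path against a fully
 * specified one returns 1, the reverse comparison returns 2,
 * comparing a path against itself returns 0, and comparing two
 * different fully specified paths returns -1.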
*/ int xpt_path_comp(struct cam_path *path1, struct cam_path *path2) { int retval = 0; if (path1->bus != path2->bus) { if (path1->bus->path_id == CAM_BUS_WILDCARD) retval = 1; else if (path2->bus->path_id == CAM_BUS_WILDCARD) retval = 2; else return (-1); } if (path1->target != path2->target) { if (path1->target->target_id == CAM_TARGET_WILDCARD) { if (retval == 0) retval = 1; } else if (path2->target->target_id == CAM_TARGET_WILDCARD) retval = 2; else return (-1); } if (path1->device != path2->device) { if (path1->device->lun_id == CAM_LUN_WILDCARD) { if (retval == 0) retval = 1; } else if (path2->device->lun_id == CAM_LUN_WILDCARD) retval = 2; else return (-1); } return (retval); } int xpt_path_comp_dev(struct cam_path *path, struct cam_ed *dev) { int retval = 0; if (path->bus != dev->target->bus) { if (path->bus->path_id == CAM_BUS_WILDCARD) retval = 1; else if (dev->target->bus->path_id == CAM_BUS_WILDCARD) retval = 2; else return (-1); } if (path->target != dev->target) { if (path->target->target_id == CAM_TARGET_WILDCARD) { if (retval == 0) retval = 1; } else if (dev->target->target_id == CAM_TARGET_WILDCARD) retval = 2; else return (-1); } if (path->device != dev) { if (path->device->lun_id == CAM_LUN_WILDCARD) { if (retval == 0) retval = 1; } else if (dev->lun_id == CAM_LUN_WILDCARD) retval = 2; else return (-1); } return (retval); } void xpt_print_path(struct cam_path *path) { if (path == NULL) printf("(nopath): "); else { if (path->periph != NULL) printf("(%s%d:", path->periph->periph_name, path->periph->unit_number); else printf("(noperiph:"); if (path->bus != NULL) printf("%s%d:%d:", path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id); else printf("nobus:"); if (path->target != NULL) printf("%d:", path->target->target_id); else printf("X:"); if (path->device != NULL) printf("%jx): ", (uintmax_t)path->device->lun_id); else printf("X): "); } } void xpt_print_device(struct cam_ed *device) { if (device == NULL) printf("(nopath): "); else { printf("(noperiph:%s%d:%d:%d:%jx): ", device->sim->sim_name, device->sim->unit_number, device->sim->bus_id, device->target->target_id, (uintmax_t)device->lun_id); } } void xpt_print(struct cam_path *path, const char *fmt, ...) 
{ va_list ap; xpt_print_path(path); va_start(ap, fmt); vprintf(fmt, ap); va_end(ap); } int xpt_path_string(struct cam_path *path, char *str, size_t str_len) { struct sbuf sb; sbuf_new(&sb, str, str_len, 0); if (path == NULL) sbuf_printf(&sb, "(nopath): "); else { if (path->periph != NULL) sbuf_printf(&sb, "(%s%d:", path->periph->periph_name, path->periph->unit_number); else sbuf_printf(&sb, "(noperiph:"); if (path->bus != NULL) sbuf_printf(&sb, "%s%d:%d:", path->bus->sim->sim_name, path->bus->sim->unit_number, path->bus->sim->bus_id); else sbuf_printf(&sb, "nobus:"); if (path->target != NULL) sbuf_printf(&sb, "%d:", path->target->target_id); else sbuf_printf(&sb, "X:"); if (path->device != NULL) sbuf_printf(&sb, "%jx): ", (uintmax_t)path->device->lun_id); else sbuf_printf(&sb, "X): "); } sbuf_finish(&sb); return(sbuf_len(&sb)); } path_id_t xpt_path_path_id(struct cam_path *path) { return(path->bus->path_id); } target_id_t xpt_path_target_id(struct cam_path *path) { if (path->target != NULL) return (path->target->target_id); else return (CAM_TARGET_WILDCARD); } lun_id_t xpt_path_lun_id(struct cam_path *path) { if (path->device != NULL) return (path->device->lun_id); else return (CAM_LUN_WILDCARD); } struct cam_sim * xpt_path_sim(struct cam_path *path) { return (path->bus->sim); } struct cam_periph* xpt_path_periph(struct cam_path *path) { return (path->periph); } int xpt_path_legacy_ata_id(struct cam_path *path) { struct cam_eb *bus; int bus_id; if ((strcmp(path->bus->sim->sim_name, "ata") != 0) && strcmp(path->bus->sim->sim_name, "ahcich") != 0 && strcmp(path->bus->sim->sim_name, "mvsch") != 0 && strcmp(path->bus->sim->sim_name, "siisch") != 0) return (-1); if (strcmp(path->bus->sim->sim_name, "ata") == 0 && path->bus->sim->unit_number < 2) { bus_id = path->bus->sim->unit_number; } else { bus_id = 2; xpt_lock_buses(); TAILQ_FOREACH(bus, &xsoftc.xpt_busses, links) { if (bus == path->bus) break; if ((strcmp(bus->sim->sim_name, "ata") == 0 && bus->sim->unit_number >= 2) || strcmp(bus->sim->sim_name, "ahcich") == 0 || strcmp(bus->sim->sim_name, "mvsch") == 0 || strcmp(bus->sim->sim_name, "siisch") == 0) bus_id++; } xpt_unlock_buses(); } if (path->target != NULL) { if (path->target->target_id < 2) return (bus_id * 2 + path->target->target_id); else return (-1); } else return (bus_id * 2); } /* * Release a CAM control block for the caller. Remit the cost of the structure * to the device referenced by the path. If this device had no 'credits' * and peripheral drivers have registered async callbacks for this notification, * call them now. */ void xpt_release_ccb(union ccb *free_ccb) { struct cam_ed *device; struct cam_periph *periph; CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_release_ccb\n")); xpt_path_assert(free_ccb->ccb_h.path, MA_OWNED); device = free_ccb->ccb_h.path->device; periph = free_ccb->ccb_h.path->periph; xpt_free_ccb(free_ccb); periph->periph_allocated--; cam_ccbq_release_opening(&device->ccbq); xpt_run_allocq(periph, 0); } /* Functions accessed by SIM drivers */ static struct xpt_xport xport_default = { .alloc_device = xpt_alloc_device_default, .action = xpt_action_default, .async = xpt_dev_async_default, }; /* * A sim structure, listing the SIM entry points and instance * identification info, is passed to xpt_bus_register to hook the SIM * into the CAM framework. xpt_bus_register creates a cam_eb entry * for this new bus and places it in the array of busses and assigns * it a path_id. The path_id may be influenced by "hard wiring" * information specified by the user.
Once interrupt services are * available, the bus will be probed. */ int32_t xpt_bus_register(struct cam_sim *sim, device_t parent, u_int32_t bus) { struct cam_eb *new_bus; struct cam_eb *old_bus; struct ccb_pathinq cpi; struct cam_path *path; cam_status status; mtx_assert(sim->mtx, MA_OWNED); sim->bus_id = bus; new_bus = (struct cam_eb *)malloc(sizeof(*new_bus), M_CAMXPT, M_NOWAIT|M_ZERO); if (new_bus == NULL) { /* Couldn't satisfy request */ return (CAM_RESRC_UNAVAIL); } mtx_init(&new_bus->eb_mtx, "CAM bus lock", NULL, MTX_DEF); TAILQ_INIT(&new_bus->et_entries); cam_sim_hold(sim); new_bus->sim = sim; timevalclear(&new_bus->last_reset); new_bus->flags = 0; new_bus->refcount = 1; /* Held until a bus_deregister event */ new_bus->generation = 0; xpt_lock_buses(); sim->path_id = new_bus->path_id = xptpathid(sim->sim_name, sim->unit_number, sim->bus_id); old_bus = TAILQ_FIRST(&xsoftc.xpt_busses); while (old_bus != NULL && old_bus->path_id < new_bus->path_id) old_bus = TAILQ_NEXT(old_bus, links); if (old_bus != NULL) TAILQ_INSERT_BEFORE(old_bus, new_bus, links); else TAILQ_INSERT_TAIL(&xsoftc.xpt_busses, new_bus, links); xsoftc.bus_generation++; xpt_unlock_buses(); /* * Set a default transport so that a PATH_INQ can be issued to * the SIM. This will then allow for probing and attaching of * a more appropriate transport. */ new_bus->xport = &xport_default; status = xpt_create_path(&path, /*periph*/NULL, sim->path_id, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); if (status != CAM_REQ_CMP) { xpt_release_bus(new_bus); free(path, M_CAMXPT); return (CAM_RESRC_UNAVAIL); } xpt_setup_ccb(&cpi.ccb_h, path, CAM_PRIORITY_NORMAL); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); if (cpi.ccb_h.status == CAM_REQ_CMP) { switch (cpi.transport) { case XPORT_SPI: case XPORT_SAS: case XPORT_FC: case XPORT_USB: case XPORT_ISCSI: case XPORT_SRP: case XPORT_PPB: new_bus->xport = scsi_get_xport(); break; case XPORT_ATA: case XPORT_SATA: new_bus->xport = ata_get_xport(); break; default: new_bus->xport = &xport_default; break; } } /* Notify interested parties */ if (sim->path_id != CAM_XPT_PATH_ID) { xpt_async(AC_PATH_REGISTERED, path, &cpi); if ((cpi.hba_misc & PIM_NOSCAN) == 0) { union ccb *scan_ccb; /* Initiate bus rescan. */ scan_ccb = xpt_alloc_ccb_nowait(); if (scan_ccb != NULL) { scan_ccb->ccb_h.path = path; scan_ccb->ccb_h.func_code = XPT_SCAN_BUS; scan_ccb->crcn.flags = 0; xpt_rescan(scan_ccb); } else { xpt_print(path, "Can't allocate CCB to scan bus\n"); xpt_free_path(path); } } else xpt_free_path(path); } else xpt_free_path(path); return (CAM_SUCCESS); } int32_t xpt_bus_deregister(path_id_t pathid) { struct cam_path bus_path; cam_status status; status = xpt_compile_path(&bus_path, NULL, pathid, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); if (status != CAM_REQ_CMP) return (status); xpt_async(AC_LOST_DEVICE, &bus_path, NULL); xpt_async(AC_PATH_DEREGISTERED, &bus_path, NULL); /* Release the reference count held while registered. */ xpt_release_bus(bus_path.bus); xpt_release_path(&bus_path); return (CAM_REQ_CMP); } static path_id_t xptnextfreepathid(void) { struct cam_eb *bus; path_id_t pathid; const char *strval; mtx_assert(&xsoftc.xpt_topo_lock, MA_OWNED); pathid = 0; bus = TAILQ_FIRST(&xsoftc.xpt_busses); retry: /* Find an unoccupied pathid */ while (bus != NULL && bus->path_id <= pathid) { if (bus->path_id == pathid) pathid++; bus = TAILQ_NEXT(bus, links); } /* * Ensure that this pathid is not reserved for * a bus that may be registered in the future. 
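 *
 * (For example, a kernel config entry such as "device scbus0 at
 * ahc0" wires down pathid 0, so dynamic allocation must skip it.)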
*/ if (resource_string_value("scbus", pathid, "at", &strval) == 0) { ++pathid; /* Start the search over */ goto retry; } return (pathid); } static path_id_t xptpathid(const char *sim_name, int sim_unit, int sim_bus) { path_id_t pathid; int i, dunit, val; char buf[32]; const char *dname; pathid = CAM_XPT_PATH_ID; snprintf(buf, sizeof(buf), "%s%d", sim_name, sim_unit); if (strcmp(buf, "xpt0") == 0 && sim_bus == 0) return (pathid); i = 0; while ((resource_find_match(&i, &dname, &dunit, "at", buf)) == 0) { if (strcmp(dname, "scbus")) { /* Avoid a bit of foot shooting. */ continue; } if (dunit < 0) /* unwired?! */ continue; if (resource_int_value("scbus", dunit, "bus", &val) == 0) { if (sim_bus == val) { pathid = dunit; break; } } else if (sim_bus == 0) { /* Unspecified matches bus 0 */ pathid = dunit; break; } else { printf("Ambiguous scbus configuration for %s%d " "bus %d, cannot wire down. The kernel " "config entry for scbus%d should " "specify a controller bus.\n" "Scbus will be assigned dynamically.\n", sim_name, sim_unit, sim_bus, dunit); break; } } if (pathid == CAM_XPT_PATH_ID) pathid = xptnextfreepathid(); return (pathid); } static const char * xpt_async_string(u_int32_t async_code) { switch (async_code) { case AC_BUS_RESET: return ("AC_BUS_RESET"); case AC_UNSOL_RESEL: return ("AC_UNSOL_RESEL"); case AC_SCSI_AEN: return ("AC_SCSI_AEN"); case AC_SENT_BDR: return ("AC_SENT_BDR"); case AC_PATH_REGISTERED: return ("AC_PATH_REGISTERED"); case AC_PATH_DEREGISTERED: return ("AC_PATH_DEREGISTERED"); case AC_FOUND_DEVICE: return ("AC_FOUND_DEVICE"); case AC_LOST_DEVICE: return ("AC_LOST_DEVICE"); case AC_TRANSFER_NEG: return ("AC_TRANSFER_NEG"); case AC_INQ_CHANGED: return ("AC_INQ_CHANGED"); case AC_GETDEV_CHANGED: return ("AC_GETDEV_CHANGED"); case AC_CONTRACT: return ("AC_CONTRACT"); case AC_ADVINFO_CHANGED: return ("AC_ADVINFO_CHANGED"); case AC_UNIT_ATTENTION: return ("AC_UNIT_ATTENTION"); } return ("AC_UNKNOWN"); } static int xpt_async_size(u_int32_t async_code) { switch (async_code) { case AC_BUS_RESET: return (0); case AC_UNSOL_RESEL: return (0); case AC_SCSI_AEN: return (0); case AC_SENT_BDR: return (0); case AC_PATH_REGISTERED: return (sizeof(struct ccb_pathinq)); case AC_PATH_DEREGISTERED: return (0); case AC_FOUND_DEVICE: return (sizeof(struct ccb_getdev)); case AC_LOST_DEVICE: return (0); case AC_TRANSFER_NEG: return (sizeof(struct ccb_trans_settings)); case AC_INQ_CHANGED: return (0); case AC_GETDEV_CHANGED: return (0); case AC_CONTRACT: return (sizeof(struct ac_contract)); case AC_ADVINFO_CHANGED: return (-1); case AC_UNIT_ATTENTION: return (sizeof(struct ccb_scsiio)); } return (0); } static int xpt_async_process_dev(struct cam_ed *device, void *arg) { union ccb *ccb = arg; struct cam_path *path = ccb->ccb_h.path; void *async_arg = ccb->casync.async_arg_ptr; u_int32_t async_code = ccb->casync.async_code; int relock; if (path->device != device && path->device->lun_id != CAM_LUN_WILDCARD && device->lun_id != CAM_LUN_WILDCARD) return (1); /* * The async callback could free the device. * If it is a broadcast async, it doesn't hold * device reference, so take our own reference. */ xpt_acquire_device(device); /* * If async for specific device is to be delivered to * the wildcard client, take the specific device lock. * XXX: We may need a way for client to specify it. 
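 *
 * (When that case applies, the device lock is dropped and the
 * wildcard client's path lock is held across the callbacks; the
 * device lock is retaken afterwards.)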
*/ if ((device->lun_id == CAM_LUN_WILDCARD && path->device->lun_id != CAM_LUN_WILDCARD) || (device->target->target_id == CAM_TARGET_WILDCARD && path->target->target_id != CAM_TARGET_WILDCARD) || (device->target->bus->path_id == CAM_BUS_WILDCARD && path->target->bus->path_id != CAM_BUS_WILDCARD)) { mtx_unlock(&device->device_mtx); xpt_path_lock(path); relock = 1; } else relock = 0; (*(device->target->bus->xport->async))(async_code, device->target->bus, device->target, device, async_arg); xpt_async_bcast(&device->asyncs, async_code, path, async_arg); if (relock) { xpt_path_unlock(path); mtx_lock(&device->device_mtx); } xpt_release_device(device); return (1); } static int xpt_async_process_tgt(struct cam_et *target, void *arg) { union ccb *ccb = arg; struct cam_path *path = ccb->ccb_h.path; if (path->target != target && path->target->target_id != CAM_TARGET_WILDCARD && target->target_id != CAM_TARGET_WILDCARD) return (1); if (ccb->casync.async_code == AC_SENT_BDR) { /* Update our notion of when the last reset occurred */ microtime(&target->last_reset); } return (xptdevicetraverse(target, NULL, xpt_async_process_dev, ccb)); } static void xpt_async_process(struct cam_periph *periph, union ccb *ccb) { struct cam_eb *bus; struct cam_path *path; void *async_arg; u_int32_t async_code; path = ccb->ccb_h.path; async_code = ccb->casync.async_code; async_arg = ccb->casync.async_arg_ptr; CAM_DEBUG(path, CAM_DEBUG_TRACE | CAM_DEBUG_INFO, ("xpt_async(%s)\n", xpt_async_string(async_code))); bus = path->bus; if (async_code == AC_BUS_RESET) { /* Update our notion of when the last reset occurred */ microtime(&bus->last_reset); } xpttargettraverse(bus, NULL, xpt_async_process_tgt, ccb); /* * If this wasn't a fully wildcarded async, tell all * clients that want all async events. */ if (bus != xpt_periph->path->bus) { xpt_path_lock(xpt_periph->path); xpt_async_process_dev(xpt_periph->path->device, ccb); xpt_path_unlock(xpt_periph->path); } if (path->device != NULL && path->device->lun_id != CAM_LUN_WILDCARD) xpt_release_devq(path, 1, TRUE); else xpt_release_simq(path->bus->sim, TRUE); if (ccb->casync.async_arg_size > 0) free(async_arg, M_CAMXPT); xpt_free_path(path); xpt_free_ccb(ccb); } static void xpt_async_bcast(struct async_list *async_head, u_int32_t async_code, struct cam_path *path, void *async_arg) { struct async_node *cur_entry; int lock; cur_entry = SLIST_FIRST(async_head); while (cur_entry != NULL) { struct async_node *next_entry; /* * Grab the next list entry before we call the current * entry's callback. This is because the callback function * can delete its async callback entry. 
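 *
 * (A handler usually removes itself by submitting an XPT_SASYNC_CB
 * request with event_enable set to 0, which unlinks and frees its
 * async_node.)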
*/ next_entry = SLIST_NEXT(cur_entry, links); if ((cur_entry->event_enable & async_code) != 0) { lock = cur_entry->event_lock; if (lock) CAM_SIM_LOCK(path->device->sim); cur_entry->callback(cur_entry->callback_arg, async_code, path, async_arg); if (lock) CAM_SIM_UNLOCK(path->device->sim); } cur_entry = next_entry; } } void xpt_async(u_int32_t async_code, struct cam_path *path, void *async_arg) { union ccb *ccb; int size; ccb = xpt_alloc_ccb_nowait(); if (ccb == NULL) { xpt_print(path, "Can't allocate CCB to send %s\n", xpt_async_string(async_code)); return; } if (xpt_clone_path(&ccb->ccb_h.path, path) != CAM_REQ_CMP) { xpt_print(path, "Can't allocate path to send %s\n", xpt_async_string(async_code)); xpt_free_ccb(ccb); return; } ccb->ccb_h.path->periph = NULL; ccb->ccb_h.func_code = XPT_ASYNC; ccb->ccb_h.cbfcnp = xpt_async_process; ccb->ccb_h.flags |= CAM_UNLOCKED; ccb->casync.async_code = async_code; ccb->casync.async_arg_size = 0; size = xpt_async_size(async_code); if (size > 0 && async_arg != NULL) { ccb->casync.async_arg_ptr = malloc(size, M_CAMXPT, M_NOWAIT); if (ccb->casync.async_arg_ptr == NULL) { xpt_print(path, "Can't allocate argument to send %s\n", xpt_async_string(async_code)); xpt_free_path(ccb->ccb_h.path); xpt_free_ccb(ccb); return; } memcpy(ccb->casync.async_arg_ptr, async_arg, size); ccb->casync.async_arg_size = size; } else if (size < 0) { ccb->casync.async_arg_ptr = async_arg; ccb->casync.async_arg_size = size; } if (path->device != NULL && path->device->lun_id != CAM_LUN_WILDCARD) xpt_freeze_devq(path, 1); else xpt_freeze_simq(path->bus->sim, 1); xpt_done(ccb); } static void xpt_dev_async_default(u_int32_t async_code, struct cam_eb *bus, struct cam_et *target, struct cam_ed *device, void *async_arg) { /* * We only need to handle events for real devices. */ if (target->target_id == CAM_TARGET_WILDCARD || device->lun_id == CAM_LUN_WILDCARD) return; printf("%s called\n", __func__); } static uint32_t xpt_freeze_devq_device(struct cam_ed *dev, u_int count) { struct cam_devq *devq; uint32_t freeze; devq = dev->sim->devq; mtx_assert(&devq->send_mtx, MA_OWNED); CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_freeze_devq_device(%d) %u->%u\n", count, dev->ccbq.queue.qfrozen_cnt, dev->ccbq.queue.qfrozen_cnt + count)); freeze = (dev->ccbq.queue.qfrozen_cnt += count); /* Remove frozen device from sendq. 
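 * (xpt_schedule_devq() will re-queue the device once its freeze
 * count drops back to zero.)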
*/ if (device_is_queued(dev)) camq_remove(&devq->send_queue, dev->devq_entry.index); return (freeze); } u_int32_t xpt_freeze_devq(struct cam_path *path, u_int count) { struct cam_ed *dev = path->device; struct cam_devq *devq; uint32_t freeze; devq = dev->sim->devq; mtx_lock(&devq->send_mtx); CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_freeze_devq(%d)\n", count)); freeze = xpt_freeze_devq_device(dev, count); mtx_unlock(&devq->send_mtx); return (freeze); } u_int32_t xpt_freeze_simq(struct cam_sim *sim, u_int count) { struct cam_devq *devq; uint32_t freeze; devq = sim->devq; mtx_lock(&devq->send_mtx); freeze = (devq->send_queue.qfrozen_cnt += count); mtx_unlock(&devq->send_mtx); return (freeze); } static void xpt_release_devq_timeout(void *arg) { struct cam_ed *dev; struct cam_devq *devq; dev = (struct cam_ed *)arg; CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_release_devq_timeout\n")); devq = dev->sim->devq; mtx_assert(&devq->send_mtx, MA_OWNED); if (xpt_release_devq_device(dev, /*count*/1, /*run_queue*/TRUE)) xpt_run_devq(devq); } void xpt_release_devq(struct cam_path *path, u_int count, int run_queue) { struct cam_ed *dev; struct cam_devq *devq; CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_release_devq(%d, %d)\n", count, run_queue)); dev = path->device; devq = dev->sim->devq; mtx_lock(&devq->send_mtx); if (xpt_release_devq_device(dev, count, run_queue)) xpt_run_devq(dev->sim->devq); mtx_unlock(&devq->send_mtx); } static int xpt_release_devq_device(struct cam_ed *dev, u_int count, int run_queue) { mtx_assert(&dev->sim->devq->send_mtx, MA_OWNED); CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_release_devq_device(%d, %d) %u->%u\n", count, run_queue, dev->ccbq.queue.qfrozen_cnt, dev->ccbq.queue.qfrozen_cnt - count)); if (count > dev->ccbq.queue.qfrozen_cnt) { #ifdef INVARIANTS printf("xpt_release_devq(): requested %u > present %u\n", count, dev->ccbq.queue.qfrozen_cnt); #endif count = dev->ccbq.queue.qfrozen_cnt; } dev->ccbq.queue.qfrozen_cnt -= count; if (dev->ccbq.queue.qfrozen_cnt == 0) { /* * No longer need to wait for a successful * command completion. */ dev->flags &= ~CAM_DEV_REL_ON_COMPLETE; /* * Remove any timeouts that might be scheduled * to release this queue. */ if ((dev->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) { callout_stop(&dev->callout); dev->flags &= ~CAM_DEV_REL_TIMEOUT_PENDING; } /* * Now that we are unfrozen schedule the * device so any pending transactions are * run. */ xpt_schedule_devq(dev->sim->devq, dev); } else run_queue = 0; return (run_queue); } void xpt_release_simq(struct cam_sim *sim, int run_queue) { struct cam_devq *devq; devq = sim->devq; mtx_lock(&devq->send_mtx); if (devq->send_queue.qfrozen_cnt <= 0) { #ifdef INVARIANTS printf("xpt_release_simq: requested 1 > present %u\n", devq->send_queue.qfrozen_cnt); #endif } else devq->send_queue.qfrozen_cnt--; if (devq->send_queue.qfrozen_cnt == 0) { /* * If there is a timeout scheduled to release this * sim queue, remove it. The queue frozen count is * already at 0. */ if ((sim->flags & CAM_SIM_REL_TIMEOUT_PENDING) != 0){ callout_stop(&sim->callout); sim->flags &= ~CAM_SIM_REL_TIMEOUT_PENDING; } if (run_queue) { /* * Now that we are unfrozen run the send queue. */ xpt_run_devq(sim->devq); } } mtx_unlock(&devq->send_mtx); } /* * XXX Appears to be unused. 
*/ static void xpt_release_simq_timeout(void *arg) { struct cam_sim *sim; sim = (struct cam_sim *)arg; xpt_release_simq(sim, /* run_queue */ TRUE); } void xpt_done(union ccb *done_ccb) { struct cam_doneq *queue; int run, hash; CAM_DEBUG(done_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_done\n")); if ((done_ccb->ccb_h.func_code & XPT_FC_QUEUED) == 0) return; hash = (done_ccb->ccb_h.path_id + done_ccb->ccb_h.target_id + done_ccb->ccb_h.target_lun) % cam_num_doneqs; queue = &cam_doneqs[hash]; mtx_lock(&queue->cam_doneq_mtx); run = (queue->cam_doneq_sleep && STAILQ_EMPTY(&queue->cam_doneq)); STAILQ_INSERT_TAIL(&queue->cam_doneq, &done_ccb->ccb_h, sim_links.stqe); done_ccb->ccb_h.pinfo.index = CAM_DONEQ_INDEX; mtx_unlock(&queue->cam_doneq_mtx); if (run) wakeup(&queue->cam_doneq); } void xpt_done_direct(union ccb *done_ccb) { CAM_DEBUG(done_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_done_direct\n")); if ((done_ccb->ccb_h.func_code & XPT_FC_QUEUED) == 0) return; xpt_done_process(&done_ccb->ccb_h); } union ccb * xpt_alloc_ccb() { union ccb *new_ccb; new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_WAITOK); return (new_ccb); } union ccb * xpt_alloc_ccb_nowait() { union ccb *new_ccb; new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_NOWAIT); return (new_ccb); } void xpt_free_ccb(union ccb *free_ccb) { free(free_ccb, M_CAMCCB); } /* Private XPT functions */ /* * Get a CAM control block for the caller. Charge the structure to the device * referenced by the path. If we don't have sufficient resources to allocate * more ccbs, we return NULL. */ static union ccb * xpt_get_ccb_nowait(struct cam_periph *periph) { union ccb *new_ccb; new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_NOWAIT); if (new_ccb == NULL) return (NULL); periph->periph_allocated++; cam_ccbq_take_opening(&periph->path->device->ccbq); return (new_ccb); } static union ccb * xpt_get_ccb(struct cam_periph *periph) { union ccb *new_ccb; cam_periph_unlock(periph); new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_WAITOK); cam_periph_lock(periph); periph->periph_allocated++; cam_ccbq_take_opening(&periph->path->device->ccbq); return (new_ccb); } union ccb * cam_periph_getccb(struct cam_periph *periph, u_int32_t priority) { struct ccb_hdr *ccb_h; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("cam_periph_getccb\n")); cam_periph_assert(periph, MA_OWNED); while ((ccb_h = SLIST_FIRST(&periph->ccb_list)) == NULL || ccb_h->pinfo.priority != priority) { if (priority < periph->immediate_priority) { periph->immediate_priority = priority; xpt_run_allocq(periph, 0); } else cam_periph_sleep(periph, &periph->ccb_list, PRIBIO, "cgticb", 0); } SLIST_REMOVE_HEAD(&periph->ccb_list, periph_links.sle); return ((union ccb *)ccb_h); } static void xpt_acquire_bus(struct cam_eb *bus) { xpt_lock_buses(); bus->refcount++; xpt_unlock_buses(); } static void xpt_release_bus(struct cam_eb *bus) { xpt_lock_buses(); KASSERT(bus->refcount >= 1, ("bus->refcount >= 1")); if (--bus->refcount > 0) { xpt_unlock_buses(); return; } TAILQ_REMOVE(&xsoftc.xpt_busses, bus, links); xsoftc.bus_generation++; xpt_unlock_buses(); KASSERT(TAILQ_EMPTY(&bus->et_entries), ("destroying bus, but target list is not empty")); cam_sim_release(bus->sim); mtx_destroy(&bus->eb_mtx); free(bus, M_CAMXPT); } static struct cam_et * xpt_alloc_target(struct cam_eb *bus, target_id_t target_id) { struct cam_et *cur_target, *target; mtx_assert(&xsoftc.xpt_topo_lock, MA_OWNED); mtx_assert(&bus->eb_mtx, MA_OWNED); target = (struct cam_et *)malloc(sizeof(*target), M_CAMXPT, M_NOWAIT|M_ZERO); if (target == 
NULL) return (NULL); TAILQ_INIT(&target->ed_entries); target->bus = bus; target->target_id = target_id; target->refcount = 1; target->generation = 0; target->luns = NULL; mtx_init(&target->luns_mtx, "CAM LUNs lock", NULL, MTX_DEF); timevalclear(&target->last_reset); /* * Hold a reference to our parent bus so it * will not go away before we do. */ bus->refcount++; /* Insertion sort into our bus's target list */ cur_target = TAILQ_FIRST(&bus->et_entries); while (cur_target != NULL && cur_target->target_id < target_id) cur_target = TAILQ_NEXT(cur_target, links); if (cur_target != NULL) { TAILQ_INSERT_BEFORE(cur_target, target, links); } else { TAILQ_INSERT_TAIL(&bus->et_entries, target, links); } bus->generation++; return (target); } static void xpt_acquire_target(struct cam_et *target) { struct cam_eb *bus = target->bus; mtx_lock(&bus->eb_mtx); target->refcount++; mtx_unlock(&bus->eb_mtx); } static void xpt_release_target(struct cam_et *target) { struct cam_eb *bus = target->bus; mtx_lock(&bus->eb_mtx); if (--target->refcount > 0) { mtx_unlock(&bus->eb_mtx); return; } TAILQ_REMOVE(&bus->et_entries, target, links); bus->generation++; mtx_unlock(&bus->eb_mtx); KASSERT(TAILQ_EMPTY(&target->ed_entries), ("destroying target, but device list is not empty")); xpt_release_bus(bus); mtx_destroy(&target->luns_mtx); if (target->luns) free(target->luns, M_CAMXPT); free(target, M_CAMXPT); } static struct cam_ed * xpt_alloc_device_default(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id) { struct cam_ed *device; device = xpt_alloc_device(bus, target, lun_id); if (device == NULL) return (NULL); device->mintags = 1; device->maxtags = 1; return (device); } static void xpt_destroy_device(void *context, int pending) { struct cam_ed *device = context; mtx_lock(&device->device_mtx); mtx_destroy(&device->device_mtx); free(device, M_CAMDEV); } struct cam_ed * xpt_alloc_device(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id) { struct cam_ed *cur_device, *device; struct cam_devq *devq; cam_status status; mtx_assert(&bus->eb_mtx, MA_OWNED); /* Make space for us in the device queue on our bus */ devq = bus->sim->devq; mtx_lock(&devq->send_mtx); status = cam_devq_resize(devq, devq->send_queue.array_size + 1); mtx_unlock(&devq->send_mtx); if (status != CAM_REQ_CMP) return (NULL); device = (struct cam_ed *)malloc(sizeof(*device), M_CAMDEV, M_NOWAIT|M_ZERO); if (device == NULL) return (NULL); cam_init_pinfo(&device->devq_entry); device->target = target; device->lun_id = lun_id; device->sim = bus->sim; if (cam_ccbq_init(&device->ccbq, bus->sim->max_dev_openings) != 0) { free(device, M_CAMDEV); return (NULL); } SLIST_INIT(&device->asyncs); SLIST_INIT(&device->periphs); device->generation = 0; device->flags = CAM_DEV_UNCONFIGURED; device->tag_delay_count = 0; device->tag_saved_openings = 0; device->refcount = 1; mtx_init(&device->device_mtx, "CAM device lock", NULL, MTX_DEF); callout_init_mtx(&device->callout, &devq->send_mtx, 0); TASK_INIT(&device->device_destroy_task, 0, xpt_destroy_device, device); /* * Hold a reference to our parent bus so it * will not go away before we do. 
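 *
 * (The reference taken below is on the parent target; the target in
 * turn holds a reference on its bus, keeping the whole chain alive.)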
*/ target->refcount++; cur_device = TAILQ_FIRST(&target->ed_entries); while (cur_device != NULL && cur_device->lun_id < lun_id) cur_device = TAILQ_NEXT(cur_device, links); if (cur_device != NULL) TAILQ_INSERT_BEFORE(cur_device, device, links); else TAILQ_INSERT_TAIL(&target->ed_entries, device, links); target->generation++; return (device); } void xpt_acquire_device(struct cam_ed *device) { struct cam_eb *bus = device->target->bus; mtx_lock(&bus->eb_mtx); device->refcount++; mtx_unlock(&bus->eb_mtx); } void xpt_release_device(struct cam_ed *device) { struct cam_eb *bus = device->target->bus; struct cam_devq *devq; mtx_lock(&bus->eb_mtx); if (--device->refcount > 0) { mtx_unlock(&bus->eb_mtx); return; } TAILQ_REMOVE(&device->target->ed_entries, device,links); device->target->generation++; mtx_unlock(&bus->eb_mtx); /* Release our slot in the devq */ devq = bus->sim->devq; mtx_lock(&devq->send_mtx); cam_devq_resize(devq, devq->send_queue.array_size - 1); mtx_unlock(&devq->send_mtx); KASSERT(SLIST_EMPTY(&device->periphs), ("destroying device, but periphs list is not empty")); KASSERT(device->devq_entry.index == CAM_UNQUEUED_INDEX, ("destroying device while still queued for ccbs")); if ((device->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) callout_stop(&device->callout); xpt_release_target(device->target); cam_ccbq_fini(&device->ccbq); /* * Free allocated memory. free(9) does nothing if the * supplied pointer is NULL, so it is safe to call without * checking. */ free(device->supported_vpds, M_CAMXPT); free(device->device_id, M_CAMXPT); free(device->ext_inq, M_CAMXPT); free(device->physpath, M_CAMXPT); free(device->rcap_buf, M_CAMXPT); free(device->serial_num, M_CAMXPT); taskqueue_enqueue(xsoftc.xpt_taskq, &device->device_destroy_task); } u_int32_t xpt_dev_ccbq_resize(struct cam_path *path, int newopenings) { int result; struct cam_ed *dev; dev = path->device; mtx_lock(&dev->sim->devq->send_mtx); result = cam_ccbq_resize(&dev->ccbq, newopenings); mtx_unlock(&dev->sim->devq->send_mtx); if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0 || (dev->inq_flags & SID_CmdQue) != 0) dev->tag_saved_openings = newopenings; return (result); } static struct cam_eb * xpt_find_bus(path_id_t path_id) { struct cam_eb *bus; xpt_lock_buses(); for (bus = TAILQ_FIRST(&xsoftc.xpt_busses); bus != NULL; bus = TAILQ_NEXT(bus, links)) { if (bus->path_id == path_id) { bus->refcount++; break; } } xpt_unlock_buses(); return (bus); } static struct cam_et * xpt_find_target(struct cam_eb *bus, target_id_t target_id) { struct cam_et *target; mtx_assert(&bus->eb_mtx, MA_OWNED); for (target = TAILQ_FIRST(&bus->et_entries); target != NULL; target = TAILQ_NEXT(target, links)) { if (target->target_id == target_id) { target->refcount++; break; } } return (target); } static struct cam_ed * xpt_find_device(struct cam_et *target, lun_id_t lun_id) { struct cam_ed *device; mtx_assert(&target->bus->eb_mtx, MA_OWNED); for (device = TAILQ_FIRST(&target->ed_entries); device != NULL; device = TAILQ_NEXT(device, links)) { if (device->lun_id == lun_id) { device->refcount++; break; } } return (device); } void xpt_start_tags(struct cam_path *path) { struct ccb_relsim crs; struct cam_ed *device; struct cam_sim *sim; int newopenings; device = path->device; sim = path->bus->sim; device->flags &= ~CAM_DEV_TAG_AFTER_COUNT; xpt_freeze_devq(path, /*count*/1); device->inq_flags |= SID_CmdQue; if (device->tag_saved_openings != 0) newopenings = device->tag_saved_openings; else newopenings = min(device->maxtags, sim->max_tagged_dev_openings); 
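/*
 * Grow the device's ccb queue to the tagged opening count and tell
 * interested parties that the device's characteristics have changed.
 */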
xpt_dev_ccbq_resize(path, newopenings); xpt_async(AC_GETDEV_CHANGED, path, NULL); xpt_setup_ccb(&crs.ccb_h, path, CAM_PRIORITY_NORMAL); crs.ccb_h.func_code = XPT_REL_SIMQ; crs.release_flags = RELSIM_RELEASE_AFTER_QEMPTY; crs.openings = crs.release_timeout = crs.qfrozen_cnt = 0; xpt_action((union ccb *)&crs); } void xpt_stop_tags(struct cam_path *path) { struct ccb_relsim crs; struct cam_ed *device; struct cam_sim *sim; device = path->device; sim = path->bus->sim; device->flags &= ~CAM_DEV_TAG_AFTER_COUNT; device->tag_delay_count = 0; xpt_freeze_devq(path, /*count*/1); device->inq_flags &= ~SID_CmdQue; xpt_dev_ccbq_resize(path, sim->max_dev_openings); xpt_async(AC_GETDEV_CHANGED, path, NULL); xpt_setup_ccb(&crs.ccb_h, path, CAM_PRIORITY_NORMAL); crs.ccb_h.func_code = XPT_REL_SIMQ; crs.release_flags = RELSIM_RELEASE_AFTER_QEMPTY; crs.openings = crs.release_timeout = crs.qfrozen_cnt = 0; xpt_action((union ccb *)&crs); } static void xpt_boot_delay(void *arg) { xpt_release_boot(); } static void xpt_config(void *arg) { /* * Now that interrupts are enabled, go find our devices */ if (taskqueue_start_threads(&xsoftc.xpt_taskq, 1, PRIBIO, "CAM taskq")) printf("xpt_config: failed to create taskqueue thread.\n"); /* Setup debugging path */ if (cam_dflags != CAM_DEBUG_NONE) { if (xpt_create_path(&cam_dpath, NULL, CAM_DEBUG_BUS, CAM_DEBUG_TARGET, CAM_DEBUG_LUN) != CAM_REQ_CMP) { printf("xpt_config: xpt_create_path() failed for debug" " target %d:%d:%d, debugging disabled\n", CAM_DEBUG_BUS, CAM_DEBUG_TARGET, CAM_DEBUG_LUN); cam_dflags = CAM_DEBUG_NONE; } } else cam_dpath = NULL; periphdriver_init(1); xpt_hold_boot(); callout_init(&xsoftc.boot_callout, 1); callout_reset_sbt(&xsoftc.boot_callout, SBT_1MS * xsoftc.boot_delay, 0, xpt_boot_delay, NULL, 0); /* Fire up rescan thread. */ if (kproc_kthread_add(xpt_scanner_thread, NULL, &cam_proc, NULL, 0, 0, "cam", "scanner")) { printf("xpt_config: failed to create rescan thread.\n"); } } void xpt_hold_boot(void) { xpt_lock_buses(); xsoftc.buses_to_config++; xpt_unlock_buses(); } void xpt_release_boot(void) { xpt_lock_buses(); xsoftc.buses_to_config--; if (xsoftc.buses_to_config == 0 && xsoftc.buses_config_done == 0) { struct xpt_task *task; xsoftc.buses_config_done = 1; xpt_unlock_buses(); /* Call manually because we don't have any busses */ task = malloc(sizeof(struct xpt_task), M_CAMXPT, M_NOWAIT); if (task != NULL) { TASK_INIT(&task->task, 0, xpt_finishconfig_task, task); taskqueue_enqueue(taskqueue_thread, &task->task); } } else xpt_unlock_buses(); } /* * If the given device only has one peripheral attached to it, and if that * peripheral is the passthrough driver, announce it. This ensures that the * user sees some sort of announcement for every peripheral in their system. */ static int xptpassannouncefunc(struct cam_ed *device, void *arg) { struct cam_periph *periph; int i; for (periph = SLIST_FIRST(&device->periphs), i = 0; periph != NULL; periph = SLIST_NEXT(periph, periph_links), i++); periph = SLIST_FIRST(&device->periphs); if ((i == 1) && (strncmp(periph->periph_name, "pass", 4) == 0)) xpt_announce_periph(periph, NULL); return(1); } static void xpt_finishconfig_task(void *context, int pending) { periphdriver_init(2); /* * Check for devices with no "standard" peripheral driver * attached. For any devices like that, announce the * passthrough driver so the user will see something. */ if (!bootverbose) xpt_for_all_devices(xptpassannouncefunc, NULL); /* Release our hook so that the boot can continue.
*/ config_intrhook_disestablish(xsoftc.xpt_config_hook); free(xsoftc.xpt_config_hook, M_CAMXPT); xsoftc.xpt_config_hook = NULL; free(context, M_CAMXPT); } cam_status xpt_register_async(int event, ac_callback_t *cbfunc, void *cbarg, struct cam_path *path) { struct ccb_setasync csa; cam_status status; int xptpath = 0; if (path == NULL) { status = xpt_create_path(&path, /*periph*/NULL, CAM_XPT_PATH_ID, CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD); if (status != CAM_REQ_CMP) return (status); xpt_path_lock(path); xptpath = 1; } xpt_setup_ccb(&csa.ccb_h, path, CAM_PRIORITY_NORMAL); csa.ccb_h.func_code = XPT_SASYNC_CB; csa.event_enable = event; csa.callback = cbfunc; csa.callback_arg = cbarg; xpt_action((union ccb *)&csa); status = csa.ccb_h.status; if (xptpath) { xpt_path_unlock(path); xpt_free_path(path); } if ((status == CAM_REQ_CMP) && (csa.event_enable & AC_FOUND_DEVICE)) { /* * Get this peripheral up to date with all * the currently existing devices. */ xpt_for_all_devices(xptsetasyncfunc, &csa); } if ((status == CAM_REQ_CMP) && (csa.event_enable & AC_PATH_REGISTERED)) { /* * Get this peripheral up to date with all * the currently existing busses. */ xpt_for_all_busses(xptsetasyncbusfunc, &csa); } return (status); } static void xptaction(struct cam_sim *sim, union ccb *work_ccb) { CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xptaction\n")); switch (work_ccb->ccb_h.func_code) { /* Common cases first */ case XPT_PATH_INQ: /* Path routing inquiry */ { struct ccb_pathinq *cpi; cpi = &work_ccb->cpi; cpi->version_num = 1; /* XXX??? */ cpi->hba_inquiry = 0; cpi->target_sprt = 0; cpi->hba_misc = 0; cpi->hba_eng_cnt = 0; cpi->max_target = 0; cpi->max_lun = 0; cpi->initiator_id = 0; strncpy(cpi->sim_vid, "FreeBSD", SIM_IDLEN); strncpy(cpi->hba_vid, "", HBA_IDLEN); strncpy(cpi->dev_name, sim->sim_name, DEV_IDLEN); cpi->unit_number = sim->unit_number; cpi->bus_id = sim->bus_id; cpi->base_transfer_speed = 0; cpi->protocol = PROTO_UNSPECIFIED; cpi->protocol_version = PROTO_VERSION_UNSPECIFIED; cpi->transport = XPORT_UNSPECIFIED; cpi->transport_version = XPORT_VERSION_UNSPECIFIED; cpi->ccb_h.status = CAM_REQ_CMP; xpt_done(work_ccb); break; } default: work_ccb->ccb_h.status = CAM_REQ_INVALID; xpt_done(work_ccb); break; } } /* * The xpt as a "controller" has no interrupt sources, so polling * is a no-op. */ static void xptpoll(struct cam_sim *sim) { } void xpt_lock_buses(void) { mtx_lock(&xsoftc.xpt_topo_lock); } void xpt_unlock_buses(void) { mtx_unlock(&xsoftc.xpt_topo_lock); } struct mtx * xpt_path_mtx(struct cam_path *path) { return (&path->device->device_mtx); } static void xpt_done_process(struct ccb_hdr *ccb_h) { struct cam_sim *sim; struct cam_devq *devq; struct mtx *mtx = NULL; if (ccb_h->flags & CAM_HIGH_POWER) { struct highpowerlist *hphead; struct cam_ed *device; mtx_lock(&xsoftc.xpt_highpower_lock); hphead = &xsoftc.highpowerq; device = STAILQ_FIRST(hphead); /* * Increment the count since this command is done. */ xsoftc.num_highpower++; /* * Any high powered commands queued up? 
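 *
 * (If so, undo the devq freeze that xpt_run_devq applied when it
 * deferred the command for lack of a high power slot.)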
*/ if (device != NULL) { STAILQ_REMOVE_HEAD(hphead, highpowerq_entry); mtx_unlock(&xsoftc.xpt_highpower_lock); mtx_lock(&device->sim->devq->send_mtx); xpt_release_devq_device(device, /*count*/1, /*runqueue*/TRUE); mtx_unlock(&device->sim->devq->send_mtx); } else mtx_unlock(&xsoftc.xpt_highpower_lock); } sim = ccb_h->path->bus->sim; if (ccb_h->status & CAM_RELEASE_SIMQ) { xpt_release_simq(sim, /*run_queue*/FALSE); ccb_h->status &= ~CAM_RELEASE_SIMQ; } if ((ccb_h->flags & CAM_DEV_QFRZDIS) && (ccb_h->status & CAM_DEV_QFRZN)) { xpt_release_devq(ccb_h->path, /*count*/1, /*run_queue*/TRUE); ccb_h->status &= ~CAM_DEV_QFRZN; } devq = sim->devq; if ((ccb_h->func_code & XPT_FC_USER_CCB) == 0) { struct cam_ed *dev = ccb_h->path->device; mtx_lock(&devq->send_mtx); devq->send_active--; devq->send_openings++; cam_ccbq_ccb_done(&dev->ccbq, (union ccb *)ccb_h); if (((dev->flags & CAM_DEV_REL_ON_QUEUE_EMPTY) != 0 && (dev->ccbq.dev_active == 0))) { dev->flags &= ~CAM_DEV_REL_ON_QUEUE_EMPTY; xpt_release_devq_device(dev, /*count*/1, /*run_queue*/FALSE); } if (((dev->flags & CAM_DEV_REL_ON_COMPLETE) != 0 && (ccb_h->status&CAM_STATUS_MASK) != CAM_REQUEUE_REQ)) { dev->flags &= ~CAM_DEV_REL_ON_COMPLETE; xpt_release_devq_device(dev, /*count*/1, /*run_queue*/FALSE); } if (!device_is_queued(dev)) (void)xpt_schedule_devq(devq, dev); xpt_run_devq(devq); mtx_unlock(&devq->send_mtx); if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0) { mtx = xpt_path_mtx(ccb_h->path); mtx_lock(mtx); if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0 && (--dev->tag_delay_count == 0)) xpt_start_tags(ccb_h->path); } } if ((ccb_h->flags & CAM_UNLOCKED) == 0) { if (mtx == NULL) { mtx = xpt_path_mtx(ccb_h->path); mtx_lock(mtx); } } else { if (mtx != NULL) { mtx_unlock(mtx); mtx = NULL; } } /* Call the peripheral driver's callback */ ccb_h->pinfo.index = CAM_UNQUEUED_INDEX; (*ccb_h->cbfcnp)(ccb_h->path->periph, (union ccb *)ccb_h); if (mtx != NULL) mtx_unlock(mtx); } void xpt_done_td(void *arg) { struct cam_doneq *queue = arg; struct ccb_hdr *ccb_h; STAILQ_HEAD(, ccb_hdr) doneq; STAILQ_INIT(&doneq); mtx_lock(&queue->cam_doneq_mtx); while (1) { while (STAILQ_EMPTY(&queue->cam_doneq)) { queue->cam_doneq_sleep = 1; msleep(&queue->cam_doneq, &queue->cam_doneq_mtx, PRIBIO, "-", 0); queue->cam_doneq_sleep = 0; } STAILQ_CONCAT(&doneq, &queue->cam_doneq); mtx_unlock(&queue->cam_doneq_mtx); THREAD_NO_SLEEPING(); while ((ccb_h = STAILQ_FIRST(&doneq)) != NULL) { STAILQ_REMOVE_HEAD(&doneq, sim_links.stqe); xpt_done_process(ccb_h); } THREAD_SLEEPING_OK(); mtx_lock(&queue->cam_doneq_mtx); } } static void camisr_runqueue(void) { struct ccb_hdr *ccb_h; struct cam_doneq *queue; int i; /* Process global queues. */ for (i = 0; i < cam_num_doneqs; i++) { queue = &cam_doneqs[i]; mtx_lock(&queue->cam_doneq_mtx); while ((ccb_h = STAILQ_FIRST(&queue->cam_doneq)) != NULL) { STAILQ_REMOVE_HEAD(&queue->cam_doneq, sim_links.stqe); mtx_unlock(&queue->cam_doneq_mtx); xpt_done_process(ccb_h); mtx_lock(&queue->cam_doneq_mtx); } mtx_unlock(&queue->cam_doneq_mtx); } } Index: stable/10/sys/cam/cam_xpt.h =================================================================== --- stable/10/sys/cam/cam_xpt.h (revision 292347) +++ stable/10/sys/cam/cam_xpt.h (revision 292348) @@ -1,137 +1,141 @@ /*- * Data structures and definitions for dealing with the * Common Access Method Transport (xpt) layer. * * Copyright (c) 1997 Justin T. Gibbs. * All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions, and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _CAM_CAM_XPT_H #define _CAM_CAM_XPT_H 1 /* Forward Declarations */ union ccb; struct cam_periph; struct cam_ed; struct cam_sim; /* * Definition of a CAM path. Paths are created from bus, target, and lun ids * via xpt_create_path and allow for reference to devices without recurring * lookups in the edt. */ struct cam_path; /* Path functions */ #ifdef _KERNEL /* * Definition of an async handler callback block. These are used to add * SIMs and peripherals to the async callback lists. */ struct async_node { SLIST_ENTRY(async_node) links; u_int32_t event_enable; /* Async Event enables */ u_int32_t event_lock; /* Take SIM lock for handlers. 
*/ void (*callback)(void *arg, u_int32_t code, struct cam_path *path, void *args); void *callback_arg; }; SLIST_HEAD(async_list, async_node); SLIST_HEAD(periph_list, cam_periph); void xpt_action(union ccb *new_ccb); void xpt_action_default(union ccb *new_ccb); union ccb *xpt_alloc_ccb(void); union ccb *xpt_alloc_ccb_nowait(void); void xpt_free_ccb(union ccb *free_ccb); +void xpt_setup_ccb_flags(struct ccb_hdr *ccb_h, + struct cam_path *path, + u_int32_t priority, + u_int32_t flags); void xpt_setup_ccb(struct ccb_hdr *ccb_h, struct cam_path *path, u_int32_t priority); void xpt_merge_ccb(union ccb *master_ccb, union ccb *slave_ccb); cam_status xpt_create_path(struct cam_path **new_path_ptr, struct cam_periph *perph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id); cam_status xpt_create_path_unlocked(struct cam_path **new_path_ptr, struct cam_periph *perph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id); int xpt_getattr(char *buf, size_t len, const char *attr, struct cam_path *path); void xpt_free_path(struct cam_path *path); void xpt_path_counts(struct cam_path *path, uint32_t *bus_ref, uint32_t *periph_ref, uint32_t *target_ref, uint32_t *device_ref); int xpt_path_comp(struct cam_path *path1, struct cam_path *path2); int xpt_path_comp_dev(struct cam_path *path, struct cam_ed *dev); void xpt_print_path(struct cam_path *path); void xpt_print_device(struct cam_ed *device); void xpt_print(struct cam_path *path, const char *fmt, ...); int xpt_path_string(struct cam_path *path, char *str, size_t str_len); path_id_t xpt_path_path_id(struct cam_path *path); target_id_t xpt_path_target_id(struct cam_path *path); lun_id_t xpt_path_lun_id(struct cam_path *path); int xpt_path_legacy_ata_id(struct cam_path *path); struct cam_sim *xpt_path_sim(struct cam_path *path); struct cam_periph *xpt_path_periph(struct cam_path *path); void xpt_async(u_int32_t async_code, struct cam_path *path, void *async_arg); void xpt_rescan(union ccb *ccb); void xpt_hold_boot(void); void xpt_release_boot(void); void xpt_lock_buses(void); void xpt_unlock_buses(void); struct mtx * xpt_path_mtx(struct cam_path *path); #define xpt_path_lock(path) mtx_lock(xpt_path_mtx(path)) #define xpt_path_unlock(path) mtx_unlock(xpt_path_mtx(path)) #define xpt_path_assert(path, what) mtx_assert(xpt_path_mtx(path), (what)) #define xpt_path_owned(path) mtx_owned(xpt_path_mtx(path)) #define xpt_path_sleep(path, chan, priority, wmesg, timo) \ msleep((chan), xpt_path_mtx(path), (priority), (wmesg), (timo)) cam_status xpt_register_async(int event, ac_callback_t *cbfunc, void *cbarg, struct cam_path *path); cam_status xpt_compile_path(struct cam_path *new_path, struct cam_periph *perph, path_id_t path_id, target_id_t target_id, lun_id_t lun_id); cam_status xpt_clone_path(struct cam_path **new_path, struct cam_path *path); void xpt_copy_path(struct cam_path *new_path, struct cam_path *path); void xpt_release_path(struct cam_path *path); #endif /* _KERNEL */ #endif /* _CAM_CAM_XPT_H */ Index: stable/10/sys/cam/scsi/scsi_da.c =================================================================== --- stable/10/sys/cam/scsi/scsi_da.c (revision 292347) +++ stable/10/sys/cam/scsi/scsi_da.c (revision 292348) @@ -1,4025 +1,4036 @@ /*- * Implementation of SCSI Direct Access Peripheral driver for CAM. * * Copyright (c) 1997 Justin T. Gibbs. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions, and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #ifdef _KERNEL #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #endif /* _KERNEL */ #ifndef _KERNEL #include #include #endif /* _KERNEL */ #include #include #include #include #include #include #ifndef _KERNEL #include #endif /* !_KERNEL */ #ifdef _KERNEL typedef enum { DA_STATE_PROBE_RC, DA_STATE_PROBE_RC16, DA_STATE_PROBE_LBP, DA_STATE_PROBE_BLK_LIMITS, DA_STATE_PROBE_BDC, DA_STATE_PROBE_ATA, DA_STATE_NORMAL } da_state; typedef enum { DA_FLAG_PACK_INVALID = 0x001, DA_FLAG_NEW_PACK = 0x002, DA_FLAG_PACK_LOCKED = 0x004, DA_FLAG_PACK_REMOVABLE = 0x008, DA_FLAG_NEED_OTAG = 0x020, DA_FLAG_WAS_OTAG = 0x040, DA_FLAG_RETRY_UA = 0x080, DA_FLAG_OPEN = 0x100, DA_FLAG_SCTX_INIT = 0x200, DA_FLAG_CAN_RC16 = 0x400, DA_FLAG_PROBED = 0x800, DA_FLAG_DIRTY = 0x1000, DA_FLAG_ANNOUNCED = 0x2000 } da_flags; typedef enum { DA_Q_NONE = 0x00, DA_Q_NO_SYNC_CACHE = 0x01, DA_Q_NO_6_BYTE = 0x02, DA_Q_NO_PREVENT = 0x04, DA_Q_4K = 0x08, DA_Q_NO_RC16 = 0x10, DA_Q_NO_UNMAP = 0x20, DA_Q_RETRY_BUSY = 0x40 } da_quirks; #define DA_Q_BIT_STRING \ "\020" \ "\001NO_SYNC_CACHE" \ "\002NO_6_BYTE" \ "\003NO_PREVENT" \ "\0044K" \ "\005NO_RC16" \ "\006NO_UNMAP" \ "\007RETRY_BUSY" typedef enum { DA_CCB_PROBE_RC = 0x01, DA_CCB_PROBE_RC16 = 0x02, DA_CCB_PROBE_LBP = 0x03, DA_CCB_PROBE_BLK_LIMITS = 0x04, DA_CCB_PROBE_BDC = 0x05, DA_CCB_PROBE_ATA = 0x06, DA_CCB_BUFFER_IO = 0x07, DA_CCB_DUMP = 0x0A, DA_CCB_DELETE = 0x0B, DA_CCB_TUR = 0x0C, DA_CCB_TYPE_MASK = 0x0F, DA_CCB_RETRY_UA = 0x10 } da_ccb_state; /* * Order here is important for method choice * * We prefer ATA_TRIM as tests run against a Sandforce 2281 SSD attached to * LSI 2008 (mps) controller (FW: v12, Drv: v14) resulted in 20% quicker deletes * using ATA_TRIM than the corresponding UNMAP results for a real-world MySQL * import taking 5 minutes.
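A userland sketch of how this preference order is applied: dadeletemethodchoose(), later in this file, walks the enum from DA_DELETE_MIN to DA_DELETE_MAX, skips ZERO, and takes the first method whose bit is set in the availability mask. The DEL_* names below are stand-ins for the da_delete_methods values:

#include <stdio.h>

enum del_method {
	DEL_NONE, DEL_DISABLE, DEL_ATA_TRIM, DEL_UNMAP,
	DEL_WS16, DEL_WS10, DEL_ZERO
};
#define DEL_MIN DEL_ATA_TRIM
#define DEL_MAX DEL_ZERO

/* First set bit in preference order wins; ZERO is never auto-selected. */
static enum del_method
choose_delete_method(int avail, enum del_method fallback)
{
	int i;

	for (i = DEL_MIN; i <= DEL_MAX; i++) {
		if (i == DEL_ZERO)
			continue;
		if (avail & (1 << i))
			return ((enum del_method)i);
	}
	return (fallback);
}

int
main(void)
{
	/* Device reports UNMAP and WRITE SAME(16): UNMAP wins over WS16. */
	int avail = (1 << DEL_UNMAP) | (1 << DEL_WS16);

	printf("chose method %d\n", choose_delete_method(avail, DEL_DISABLE));
	return (0);
}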
* */ typedef enum { DA_DELETE_NONE, DA_DELETE_DISABLE, DA_DELETE_ATA_TRIM, DA_DELETE_UNMAP, DA_DELETE_WS16, DA_DELETE_WS10, DA_DELETE_ZERO, DA_DELETE_MIN = DA_DELETE_ATA_TRIM, DA_DELETE_MAX = DA_DELETE_ZERO } da_delete_methods; typedef void da_delete_func_t (struct cam_periph *periph, union ccb *ccb, struct bio *bp); static da_delete_func_t da_delete_trim; static da_delete_func_t da_delete_unmap; static da_delete_func_t da_delete_ws; static const void * da_delete_functions[] = { NULL, NULL, da_delete_trim, da_delete_unmap, da_delete_ws, da_delete_ws, da_delete_ws }; static const char *da_delete_method_names[] = { "NONE", "DISABLE", "ATA_TRIM", "UNMAP", "WS16", "WS10", "ZERO" }; static const char *da_delete_method_desc[] = { "NONE", "DISABLED", "ATA TRIM", "UNMAP", "WRITE SAME(16) with UNMAP", "WRITE SAME(10) with UNMAP", "ZERO" }; /* Offsets into our private area for storing information */ #define ccb_state ppriv_field0 #define ccb_bp ppriv_ptr1 struct disk_params { u_int8_t heads; u_int32_t cylinders; u_int8_t secs_per_track; u_int32_t secsize; /* Number of bytes/sector */ u_int64_t sectors; /* total number sectors */ u_int stripesize; u_int stripeoffset; }; #define UNMAP_RANGE_MAX 0xffffffff #define UNMAP_HEAD_SIZE 8 #define UNMAP_RANGE_SIZE 16 #define UNMAP_MAX_RANGES 2048 /* Protocol Max is 4095 */ #define UNMAP_BUF_SIZE ((UNMAP_MAX_RANGES * UNMAP_RANGE_SIZE) + \ UNMAP_HEAD_SIZE) #define WS10_MAX_BLKS 0xffff #define WS16_MAX_BLKS 0xffffffff #define ATA_TRIM_MAX_RANGES ((UNMAP_BUF_SIZE / \ (ATA_DSM_RANGE_SIZE * ATA_DSM_BLK_SIZE)) * ATA_DSM_BLK_SIZE) struct da_softc { struct bio_queue_head bio_queue; struct bio_queue_head delete_queue; struct bio_queue_head delete_run_queue; LIST_HEAD(, ccb_hdr) pending_ccbs; int tur; /* TEST UNIT READY should be sent */ int refcount; /* Active xpt_action() calls */ da_state state; da_flags flags; da_quirks quirks; int sort_io_queue; int minimum_cmd_size; int error_inject; int trim_max_ranges; int delete_running; int delete_available; /* Delete methods possibly available */ u_int maxio; uint32_t unmap_max_ranges; uint32_t unmap_max_lba; /* Max LBAs in UNMAP req */ uint64_t ws_max_blks; da_delete_methods delete_method_pref; da_delete_methods delete_method; da_delete_func_t *delete_func; struct disk_params params; struct disk *disk; union ccb saved_ccb; struct task sysctl_task; struct sysctl_ctx_list sysctl_ctx; struct sysctl_oid *sysctl_tree; struct callout sendordered_c; uint64_t wwpn; uint8_t unmap_buf[UNMAP_BUF_SIZE]; struct scsi_read_capacity_data_long rcaplong; struct callout mediapoll_c; }; #define dadeleteflag(softc, delete_method, enable) \ if (enable) { \ softc->delete_available |= (1 << delete_method); \ } else { \ softc->delete_available &= ~(1 << delete_method); \ } struct da_quirk_entry { struct scsi_inquiry_pattern inq_pat; da_quirks quirks; }; static const char quantum[] = "QUANTUM"; static const char microp[] = "MICROP"; static struct da_quirk_entry da_quirk_table[] = { /* SPI, FC devices */ { /* * Fujitsu M2513A MO drives. * Tested devices: M2513A2 firmware versions 1200 & 1300. * (dip switch selects whether T_DIRECT or T_OPTICAL device) * Reported by: W.Scholten */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "FUJITSU", "M2513A", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* See above. */ {T_OPTICAL, SIP_MEDIA_REMOVABLE, "FUJITSU", "M2513A", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * This particular Fujitsu drive doesn't like the * synchronize cache command. 
* Reported by: Tom Jackson */ {T_DIRECT, SIP_MEDIA_FIXED, "FUJITSU", "M2954*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * This drive doesn't like the synchronize cache command * either. Reported by: Matthew Jacob * in NetBSD PR kern/6027, August 24, 1998. */ {T_DIRECT, SIP_MEDIA_FIXED, microp, "2217*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * This drive doesn't like the synchronize cache command * either. Reported by: Hellmuth Michaelis (hm@kts.org) * (PR 8882). */ {T_DIRECT, SIP_MEDIA_FIXED, microp, "2112*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't like the synchronize cache command. * Reported by: Blaz Zupan */ {T_DIRECT, SIP_MEDIA_FIXED, "NEC", "D3847*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't like the synchronize cache command. * Reported by: Blaz Zupan */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "MAVERICK 540S", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't like the synchronize cache command. */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "LPS525S", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't like the synchronize cache command. * Reported by: walter@pelissero.de */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "LPS540S", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't work correctly with 6 byte reads/writes. * Returns illegal request, and points to byte 9 of the * 6-byte CDB. * Reported by: Adam McDougall */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "VIKING 4*", "*"}, /*quirks*/ DA_Q_NO_6_BYTE }, { /* See above. */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "VIKING 2*", "*"}, /*quirks*/ DA_Q_NO_6_BYTE }, { /* * Doesn't like the synchronize cache command. * Reported by: walter@pelissero.de */ {T_DIRECT, SIP_MEDIA_FIXED, "CONNER", "CP3500*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * The CISS RAID controllers do not support SYNC_CACHE */ {T_DIRECT, SIP_MEDIA_FIXED, "COMPAQ", "RAID*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * The STEC SSDs sometimes hang on UNMAP. */ {T_DIRECT, SIP_MEDIA_FIXED, "STEC", "*", "*"}, /*quirks*/ DA_Q_NO_UNMAP }, { /* * VMware returns BUSY status when storage has transient * connectivity problems, so better wait. */ {T_DIRECT, SIP_MEDIA_FIXED, "VMware*", "*", "*"}, /*quirks*/ DA_Q_RETRY_BUSY }, /* USB mass storage devices supported by umass(4) */ { /* * EXATELECOM (Sigmatel) i-Bead 100/105 USB Flash MP3 Player * PR: kern/51675 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "EXATEL", "i-BEAD10*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Power Quotient Int. (PQI) USB flash key * PR: kern/53067 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Generic*", "USB Flash Disk*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Creative Nomad MUVO mp3 player (USB) * PR: kern/53094 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "CREATIVE", "NOMAD_MUVO", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE|DA_Q_NO_PREVENT }, { /* * Jungsoft NEXDISK USB flash key * PR: kern/54737 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "JUNGSOFT", "NEXDISK*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * FreeDik USB Mini Data Drive * PR: kern/54786 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "FreeDik*", "Mini Data Drive", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Sigmatel USB Flash MP3 Player * PR: kern/57046 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "SigmaTel", "MSCN", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE|DA_Q_NO_PREVENT }, { /* * Neuros USB Digital Audio Computer * PR: kern/63645 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "NEUROS", "dig. 
audio comp.", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * SEAGRAND NP-900 MP3 Player * PR: kern/64563 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "SEAGRAND", "NP-900*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE|DA_Q_NO_PREVENT }, { /* * iRiver iFP MP3 player (with UMS Firmware) * PR: kern/54881, i386/63941, kern/66124 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "iRiver", "iFP*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Frontier Labs NEX IA+ Digital Audio Player, rev 1.10/0.01 * PR: kern/70158 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "FL" , "Nex*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * ZICPlay USB MP3 Player with FM * PR: kern/75057 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "ACTIONS*" , "USB DISK*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * TEAC USB floppy mechanisms */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "TEAC" , "FD-05*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Kingston DataTraveler II+ USB Pen-Drive. * Reported by: Pawel Jakub Dawidek */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Kingston" , "DataTraveler II+", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * USB DISK Pro PMAP * Reported by: jhs * PR: usb/96381 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, " ", "USB DISK Pro", "PMAP"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Motorola E398 Mobile Phone (TransFlash memory card). * Reported by: Wojciech A. Koszek * PR: usb/89889 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Motorola" , "Motorola Phone", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Qware BeatZkey! Pro * PR: usb/79164 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "GENERIC", "USB DISK DEVICE", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Time DPA20B 1GB MP3 Player * PR: usb/81846 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "USB2.0*", "(FS) FLASH DISK*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Samsung USB key 128Mb * PR: usb/90081 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "USB-DISK", "FreeDik-FlashUsb", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Kingston DataTraveler 2.0 USB Flash memory. 
* PR: usb/89196 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Kingston", "DataTraveler 2.0", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Creative MUVO Slim mp3 player (USB) * PR: usb/86131 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "CREATIVE", "MuVo Slim", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE|DA_Q_NO_PREVENT }, { /* * United MP5512 Portable MP3 Player (2-in-1 USB DISK/MP3) * PR: usb/80487 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Generic*", "MUSIC DISK", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * SanDisk Micro Cruzer 128MB * PR: usb/75970 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "SanDisk" , "Micro Cruzer", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * TOSHIBA TransMemory USB sticks * PR: kern/94660 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "TOSHIBA", "TransMemory", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * PNY USB 3.0 Flash Drives */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "PNY", "USB 3.0 FD*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE | DA_Q_NO_RC16 }, { /* * PNY USB Flash keys * PR: usb/75578, usb/72344, usb/65436 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "*" , "USB DISK*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Genesys 6-in-1 Card Reader * PR: usb/94647 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Generic*", "STORAGE DEVICE*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Rekam Digital CAMERA * PR: usb/98713 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "CAMERA*", "4MP-9J6*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * iRiver H10 MP3 player * PR: usb/102547 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "iriver", "H10*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * iRiver U10 MP3 player * PR: usb/92306 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "iriver", "U10*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * X-Micro Flash Disk * PR: usb/96901 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "X-Micro", "Flash Disk", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * EasyMP3 EM732X USB 2.0 Flash MP3 Player * PR: usb/96546 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "EM732X", "MP3 Player*", "1.00"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Denver MP3 player * PR: usb/107101 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "DENVER", "MP3 PLAYER", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Philips USB Key Audio KEY013 * PR: usb/68412 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "PHILIPS", "Key*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE | DA_Q_NO_PREVENT }, { /* * JNC MP3 Player * PR: usb/94439 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "JNC*" , "MP3 Player*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * SAMSUNG MP0402H * PR: usb/108427 */ {T_DIRECT, SIP_MEDIA_FIXED, "SAMSUNG", "MP0402H", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * I/O Magic USB flash - Giga Bank * PR: usb/108810 */ {T_DIRECT, SIP_MEDIA_FIXED, "GS-Magic", "stor*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * JoyFly 128mb USB Flash Drive * PR: 96133 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "USB 2.0", "Flash Disk*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * ChipsBnk usb stick * PR: 103702 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "ChipsBnk", "USB*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Storcase (Kingston) InfoStation IFS FC2/SATA-R 201A * PR: 129858 */ {T_DIRECT, SIP_MEDIA_FIXED, "IFS", "FC2/SATA-R*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Samsung YP-U3 mp3-player * PR: 125398 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Samsung", "YP-U3", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { {T_DIRECT, SIP_MEDIA_REMOVABLE, "Netac", "OnlyDisk*", "2000"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Sony Cyber-Shot DSC cameras * PR: usb/137035 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Sony", "Sony DSC", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE | DA_Q_NO_PREVENT }, { {T_DIRECT, 
SIP_MEDIA_REMOVABLE, "Kingston", "DataTraveler G3", "1.00"}, /*quirks*/ DA_Q_NO_PREVENT }, { /* At least several Transcent USB sticks lie on RC16. */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "JetFlash", "Transcend*", "*"}, /*quirks*/ DA_Q_NO_RC16 }, /* ATA/SATA devices over SAS/USB/... */ { /* Hitachi Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "Hitachi", "H??????????E3*", "*" }, /*quirks*/DA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SAMSUNG HD155UI*", "*" }, /*quirks*/DA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "SAMSUNG", "HD155UI*", "*" }, /*quirks*/DA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SAMSUNG HD204UI*", "*" }, /*quirks*/DA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "SAMSUNG", "HD204UI*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST????DL*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST????DL", "*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST???DM*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST???DM*", "*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST????DM*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST????DM", "*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9500423AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST950042", "3AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9500424AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST950042", "4AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9640423AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST964042", "3AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9640424AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST964042", "4AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9750420AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST975042", "0AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9750422AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST975042", "2AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9750423AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST975042", "3AS*", "*" }, 
/*quirks*/DA_Q_4K }, { /* Seagate Momentus Thin Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST???LT*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Thin Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST???LT*", "*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD????RS*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "??RS*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD????RX*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "??RX*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD??????RS*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "????RS*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD??????RX*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "????RX*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD???PKT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "?PKT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD?????PKT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "???PKT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD???PVT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "?PVT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD?????PVT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "???PVT*", "*" }, /*quirks*/DA_Q_4K }, { /* * Olympus FE-210 camera */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "OLYMPUS", "FE210*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * LG UP3S MP3 player */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "LG", "UP3S", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Laser MP3-2GA13 MP3 player */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "USB 2.0", "(HS) Flash Disk", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * LaCie external 250GB Hard drive designed by Porsche * Submitted by: Ben Stuyts * PR: 121474 */ {T_DIRECT, SIP_MEDIA_FIXED, "SAMSUNG", "HM250JI", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, /* SATA SSDs */ { /* * Corsair Force 2 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Corsair CSSD-F*", "*" }, /*quirks*/DA_Q_4K }, { /* * Corsair Force 3 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Corsair Force 3*", "*" }, /*quirks*/DA_Q_4K }, { /* * Corsair Neutron GTX SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "Corsair Neutron GTX*", "*" }, /*quirks*/DA_Q_4K }, { /* * Corsair
Force GT & GS SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Corsair Force G*", "*" }, /*quirks*/DA_Q_4K }, { /* * Crucial M4 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "M4-CT???M4SSD2*", "*" }, /*quirks*/DA_Q_4K }, { /* * Crucial RealSSD C300 SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "C300-CTFDDAC???MAG*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel 320 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSA2CW*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel 330 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSC2CT*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel 510 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSC2MH*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel 520 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSC2BW*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel X25-M Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSA2M*", "*" }, /*quirks*/DA_Q_4K }, { /* * Kingston E100 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "KINGSTON SE100S3*", "*" }, /*quirks*/DA_Q_4K }, { /* * Kingston HyperX 3k SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "KINGSTON SH103S3*", "*" }, /*quirks*/DA_Q_4K }, { /* * Marvell SSDs (entry taken from OpenSolaris) * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "MARVELL SD88SA02*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Agility 2 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "OCZ-AGILITY2*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Agility 3 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "OCZ-AGILITY3*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Deneva R Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "DENRSTE251M45*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Vertex 2 SSDs (inc pro series) * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "OCZ?VERTEX2*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Vertex 3 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "OCZ-VERTEX3*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Vertex 4 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "OCZ-VERTEX4*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 830 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SAMSUNG SSD 830 Series*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 840 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Samsung SSD 840*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 843T Series SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SAMSUNG MZ7WD*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 850 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, 
SIP_MEDIA_FIXED, "ATA", "Samsung SSD 850*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung PM853T Series SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SAMSUNG MZ7GE*", "*" }, /*quirks*/DA_Q_4K }, { /* * SuperTalent TeraDrive CT SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "FTM??CT25H*", "*" }, /*quirks*/DA_Q_4K }, { /* * XceedIOPS SATA SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SG9XCS2D*", "*" }, /*quirks*/DA_Q_4K }, { /* * Hama Innostor USB-Stick */ { T_DIRECT, SIP_MEDIA_REMOVABLE, "Innostor", "Innostor*", "*" }, /*quirks*/DA_Q_NO_RC16 }, { /* * MX-ES USB Drive by Mach Xtreme */ { T_DIRECT, SIP_MEDIA_REMOVABLE, "MX", "MXUB3*", "*"}, /*quirks*/DA_Q_NO_RC16 }, }; static disk_strategy_t dastrategy; static dumper_t dadump; static periph_init_t dainit; static void daasync(void *callback_arg, u_int32_t code, struct cam_path *path, void *arg); static void dasysctlinit(void *context, int pending); static int dacmdsizesysctl(SYSCTL_HANDLER_ARGS); static int dadeletemethodsysctl(SYSCTL_HANDLER_ARGS); static int dadeletemaxsysctl(SYSCTL_HANDLER_ARGS); static void dadeletemethodset(struct da_softc *softc, da_delete_methods delete_method); static off_t dadeletemaxsize(struct da_softc *softc, da_delete_methods delete_method); static void dadeletemethodchoose(struct da_softc *softc, da_delete_methods default_method); static void daprobedone(struct cam_periph *periph, union ccb *ccb); static periph_ctor_t daregister; static periph_dtor_t dacleanup; static periph_start_t dastart; static periph_oninv_t daoninvalidate; static void dadone(struct cam_periph *periph, union ccb *done_ccb); static int daerror(union ccb *ccb, u_int32_t cam_flags, u_int32_t sense_flags); static void daprevent(struct cam_periph *periph, int action); static void dareprobe(struct cam_periph *periph); static void dasetgeom(struct cam_periph *periph, uint32_t block_len, uint64_t maxsector, struct scsi_read_capacity_data_long *rcaplong, size_t rcap_size); static timeout_t dasendorderedtag; static void dashutdown(void *arg, int howto); static timeout_t damediapoll; #ifndef DA_DEFAULT_POLL_PERIOD #define DA_DEFAULT_POLL_PERIOD 3 #endif #ifndef DA_DEFAULT_TIMEOUT #define DA_DEFAULT_TIMEOUT 60 /* Timeout in seconds */ #endif #ifndef DA_DEFAULT_RETRY #define DA_DEFAULT_RETRY 4 #endif #ifndef DA_DEFAULT_SEND_ORDERED #define DA_DEFAULT_SEND_ORDERED 1 #endif #define DA_SIO (softc->sort_io_queue >= 0 ? 
\ softc->sort_io_queue : cam_sort_io_queues) static int da_poll_period = DA_DEFAULT_POLL_PERIOD; static int da_retry_count = DA_DEFAULT_RETRY; static int da_default_timeout = DA_DEFAULT_TIMEOUT; static int da_send_ordered = DA_DEFAULT_SEND_ORDERED; static SYSCTL_NODE(_kern_cam, OID_AUTO, da, CTLFLAG_RD, 0, "CAM Direct Access Disk driver"); SYSCTL_INT(_kern_cam_da, OID_AUTO, poll_period, CTLFLAG_RW, &da_poll_period, 0, "Media polling period in seconds"); TUNABLE_INT("kern.cam.da.poll_period", &da_poll_period); SYSCTL_INT(_kern_cam_da, OID_AUTO, retry_count, CTLFLAG_RW, &da_retry_count, 0, "Normal I/O retry count"); TUNABLE_INT("kern.cam.da.retry_count", &da_retry_count); SYSCTL_INT(_kern_cam_da, OID_AUTO, default_timeout, CTLFLAG_RW, &da_default_timeout, 0, "Normal I/O timeout (in seconds)"); TUNABLE_INT("kern.cam.da.default_timeout", &da_default_timeout); SYSCTL_INT(_kern_cam_da, OID_AUTO, send_ordered, CTLFLAG_RW, &da_send_ordered, 0, "Send Ordered Tags"); TUNABLE_INT("kern.cam.da.send_ordered", &da_send_ordered); /* * DA_ORDEREDTAG_INTERVAL determines how often, relative * to the default timeout, we check to see whether an ordered * tagged transaction is appropriate to prevent simple tag * starvation. Since we'd like to ensure that there is at least * 1/2 of the timeout length left for a starved transaction to * complete after we've sent an ordered tag, we must poll at least * four times in every timeout period. This takes care of the worst * case, where a starved transaction starts during an interval that * passes the "don't send an ordered tag" test, so it takes * us two intervals to determine that a tag must be sent. */ #ifndef DA_ORDEREDTAG_INTERVAL #define DA_ORDEREDTAG_INTERVAL 4 #endif static struct periph_driver dadriver = { dainit, "da", TAILQ_HEAD_INITIALIZER(dadriver.units), /* generation */ 0 }; PERIPHDRIVER_DECLARE(da, dadriver); static MALLOC_DEFINE(M_SCSIDA, "scsi_da", "scsi_da buffers"); static int daopen(struct disk *dp) { struct cam_periph *periph; struct da_softc *softc; int error; periph = (struct cam_periph *)dp->d_drv1; if (cam_periph_acquire(periph) != CAM_REQ_CMP) { return (ENXIO); } cam_periph_lock(periph); if ((error = cam_periph_hold(periph, PRIBIO|PCATCH)) != 0) { cam_periph_unlock(periph); cam_periph_release(periph); return (error); } CAM_DEBUG(periph->path, CAM_DEBUG_TRACE | CAM_DEBUG_PERIPH, ("daopen\n")); softc = (struct da_softc *)periph->softc; dareprobe(periph); /* Wait for the disk size update. */ error = cam_periph_sleep(periph, &softc->disk->d_mediasize, PRIBIO, "dareprobe", 0); if (error != 0) xpt_print(periph->path, "unable to retrieve capacity data\n"); if (periph->flags & CAM_PERIPH_INVALID) error = ENXIO; if (error == 0 && (softc->flags & DA_FLAG_PACK_REMOVABLE) != 0 && (softc->quirks & DA_Q_NO_PREVENT) == 0) daprevent(periph, PR_PREVENT); if (error == 0) { softc->flags &= ~DA_FLAG_PACK_INVALID; softc->flags |= DA_FLAG_OPEN; } cam_periph_unhold(periph); cam_periph_unlock(periph); if (error != 0) cam_periph_release(periph); return (error); } static int daclose(struct disk *dp) { struct cam_periph *periph; struct da_softc *softc; union ccb *ccb; int error; periph = (struct cam_periph *)dp->d_drv1; softc = (struct da_softc *)periph->softc; cam_periph_lock(periph); CAM_DEBUG(periph->path, CAM_DEBUG_TRACE | CAM_DEBUG_PERIPH, ("daclose\n")); if (cam_periph_hold(periph, PRIBIO) == 0) { /* Flush disk cache.
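The flush below is issued through scsi_synchronize_cache(), which builds a SYNCHRONIZE CACHE(10) CDB (opcode 0x35); a zero starting LBA and block count ask the device to flush its entire cache, which is what daclose() and dadump() want. A standalone sketch of that wire-level CDB layout (not the CAM API):

#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Build a SYNCHRONIZE CACHE(10) CDB.  begin_lba/lb_count of 0/0 means
 * "the whole medium". */
static void
build_sync_cache_10(uint8_t cdb[10], uint32_t begin_lba, uint16_t lb_count)
{
	memset(cdb, 0, 10);
	cdb[0] = 0x35;			/* SYNCHRONIZE CACHE(10) */
	cdb[2] = begin_lba >> 24;	/* LBA, big-endian */
	cdb[3] = begin_lba >> 16;
	cdb[4] = begin_lba >> 8;
	cdb[5] = begin_lba;
	cdb[7] = lb_count >> 8;		/* number of blocks, big-endian */
	cdb[8] = lb_count;
}

int
main(void)
{
	uint8_t cdb[10];
	int i;

	build_sync_cache_10(cdb, 0, 0);
	for (i = 0; i < 10; i++)
		printf("%02x ", cdb[i]);
	printf("\n");
	return (0);
}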
*/ if ((softc->flags & DA_FLAG_DIRTY) != 0 && (softc->quirks & DA_Q_NO_SYNC_CACHE) == 0 && (softc->flags & DA_FLAG_PACK_INVALID) == 0) { ccb = cam_periph_getccb(periph, CAM_PRIORITY_NORMAL); scsi_synchronize_cache(&ccb->csio, /*retries*/1, /*cbfcnp*/dadone, MSG_SIMPLE_Q_TAG, /*begin_lba*/0, /*lb_count*/0, SSD_FULL_SIZE, 5 * 60 * 1000); error = cam_periph_runccb(ccb, daerror, /*cam_flags*/0, /*sense_flags*/SF_RETRY_UA | SF_QUIET_IR, softc->disk->d_devstat); if (error == 0) softc->flags &= ~DA_FLAG_DIRTY; xpt_release_ccb(ccb); } /* Allow medium removal. */ if ((softc->flags & DA_FLAG_PACK_REMOVABLE) != 0 && (softc->quirks & DA_Q_NO_PREVENT) == 0) daprevent(periph, PR_ALLOW); cam_periph_unhold(periph); } /* * If we've got removable media, mark the blocksize as * unavailable, since it could change when new media is * inserted. */ if ((softc->flags & DA_FLAG_PACK_REMOVABLE) != 0) softc->disk->d_devstat->flags |= DEVSTAT_BS_UNAVAILABLE; softc->flags &= ~DA_FLAG_OPEN; while (softc->refcount != 0) cam_periph_sleep(periph, &softc->refcount, PRIBIO, "daclose", 1); cam_periph_unlock(periph); cam_periph_release(periph); return (0); } static void daschedule(struct cam_periph *periph) { struct da_softc *softc = (struct da_softc *)periph->softc; if (softc->state != DA_STATE_NORMAL) return; /* Check if we have more work to do. */ if (bioq_first(&softc->bio_queue) || (!softc->delete_running && bioq_first(&softc->delete_queue)) || softc->tur) { xpt_schedule(periph, CAM_PRIORITY_NORMAL); } } /* * Actually translate the requested transfer into one the physical driver * can understand. The transfer is described by a buf and will include * only one physical transfer. */ static void dastrategy(struct bio *bp) { struct cam_periph *periph; struct da_softc *softc; periph = (struct cam_periph *)bp->bio_disk->d_drv1; softc = (struct da_softc *)periph->softc; cam_periph_lock(periph); /* * If the device has been made invalid, error out */ if ((softc->flags & DA_FLAG_PACK_INVALID)) { cam_periph_unlock(periph); biofinish(bp, NULL, ENXIO); return; } CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dastrategy(%p)\n", bp)); /* * Place it in the queue of disk activities for this disk */ if (bp->bio_cmd == BIO_DELETE) { bioq_disksort(&softc->delete_queue, bp); } else if (DA_SIO) { bioq_disksort(&softc->bio_queue, bp); } else { bioq_insert_tail(&softc->bio_queue, bp); } /* * Schedule ourselves for performing the work.
*/ daschedule(periph); cam_periph_unlock(periph); return; } static int dadump(void *arg, void *virtual, vm_offset_t physical, off_t offset, size_t length) { struct cam_periph *periph; struct da_softc *softc; u_int secsize; struct ccb_scsiio csio; struct disk *dp; int error = 0; dp = arg; periph = dp->d_drv1; softc = (struct da_softc *)periph->softc; cam_periph_lock(periph); secsize = softc->params.secsize; if ((softc->flags & DA_FLAG_PACK_INVALID) != 0) { cam_periph_unlock(periph); return (ENXIO); } if (length > 0) { xpt_setup_ccb(&csio.ccb_h, periph->path, CAM_PRIORITY_NORMAL); csio.ccb_h.ccb_state = DA_CCB_DUMP; scsi_read_write(&csio, /*retries*/0, dadone, MSG_ORDERED_Q_TAG, /*read*/SCSI_RW_WRITE, /*byte2*/0, /*minimum_cmd_size*/ softc->minimum_cmd_size, offset / secsize, length / secsize, /*data_ptr*/(u_int8_t *) virtual, /*dxfer_len*/length, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); xpt_polled_action((union ccb *)&csio); error = cam_periph_error((union ccb *)&csio, 0, SF_NO_RECOVERY | SF_NO_RETRY, NULL); if ((csio.ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(csio.ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); if (error != 0) printf("Aborting dump due to I/O error.\n"); cam_periph_unlock(periph); return (error); } /* * Sync the disk cache contents to the physical media. */ if ((softc->quirks & DA_Q_NO_SYNC_CACHE) == 0) { xpt_setup_ccb(&csio.ccb_h, periph->path, CAM_PRIORITY_NORMAL); csio.ccb_h.ccb_state = DA_CCB_DUMP; scsi_synchronize_cache(&csio, /*retries*/0, /*cbfcnp*/dadone, MSG_SIMPLE_Q_TAG, /*begin_lba*/0,/* Cover the whole disk */ /*lb_count*/0, SSD_FULL_SIZE, 5 * 60 * 1000); xpt_polled_action((union ccb *)&csio); error = cam_periph_error((union ccb *)&csio, 0, SF_NO_RECOVERY | SF_NO_RETRY | SF_QUIET_IR, NULL); if ((csio.ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(csio.ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); if (error != 0) xpt_print(periph->path, "Synchronize cache failed\n"); } cam_periph_unlock(periph); return (error); } static int dagetattr(struct bio *bp) { int ret; struct cam_periph *periph; periph = (struct cam_periph *)bp->bio_disk->d_drv1; cam_periph_lock(periph); ret = xpt_getattr(bp->bio_data, bp->bio_length, bp->bio_attribute, periph->path); cam_periph_unlock(periph); if (ret == 0) bp->bio_completed = bp->bio_length; return ret; } static void dainit(void) { cam_status status; /* * Install a global async callback. This callback will * receive async callbacks like "new device found". */ status = xpt_register_async(AC_FOUND_DEVICE, daasync, NULL, NULL); if (status != CAM_REQ_CMP) { printf("da: Failed to attach master async callback " "due to status 0x%x!\n", status); } else if (da_send_ordered) { /* Register our shutdown event handler */ if ((EVENTHANDLER_REGISTER(shutdown_post_sync, dashutdown, NULL, SHUTDOWN_PRI_DEFAULT)) == NULL) printf("dainit: shutdown event registration failed!\n"); } } /* * Callback from GEOM, called when it has finished cleaning up its * resources. */ static void dadiskgonecb(struct disk *dp) { struct cam_periph *periph; periph = (struct cam_periph *)dp->d_drv1; cam_periph_release(periph); } static void daoninvalidate(struct cam_periph *periph) { struct da_softc *softc; softc = (struct da_softc *)periph->softc; /* * De-register any async callbacks. */ xpt_register_async(0, daasync, periph, periph->path); softc->flags |= DA_FLAG_PACK_INVALID; /* * Return all queued I/O with ENXIO. 
* XXX Handle any transactions queued to the card * with XPT_ABORT_CCB. */ bioq_flush(&softc->bio_queue, NULL, ENXIO); bioq_flush(&softc->delete_queue, NULL, ENXIO); /* * Tell GEOM that we've gone away, we'll get a callback when it is * done cleaning up its resources. */ disk_gone(softc->disk); } static void dacleanup(struct cam_periph *periph) { struct da_softc *softc; softc = (struct da_softc *)periph->softc; cam_periph_unlock(periph); /* * If we can't free the sysctl tree, oh well... */ if ((softc->flags & DA_FLAG_SCTX_INIT) != 0 && sysctl_ctx_free(&softc->sysctl_ctx) != 0) { xpt_print(periph->path, "can't remove sysctl context\n"); } callout_drain(&softc->mediapoll_c); disk_destroy(softc->disk); callout_drain(&softc->sendordered_c); free(softc, M_DEVBUF); cam_periph_lock(periph); } static void daasync(void *callback_arg, u_int32_t code, struct cam_path *path, void *arg) { struct cam_periph *periph; struct da_softc *softc; periph = (struct cam_periph *)callback_arg; switch (code) { case AC_FOUND_DEVICE: { struct ccb_getdev *cgd; cam_status status; cgd = (struct ccb_getdev *)arg; if (cgd == NULL) break; if (cgd->protocol != PROTO_SCSI) break; if (SID_QUAL(&cgd->inq_data) != SID_QUAL_LU_CONNECTED) break; if (SID_TYPE(&cgd->inq_data) != T_DIRECT && SID_TYPE(&cgd->inq_data) != T_RBC && SID_TYPE(&cgd->inq_data) != T_OPTICAL) break; /* * Allocate a peripheral instance for * this device and start the probe * process. */ status = cam_periph_alloc(daregister, daoninvalidate, dacleanup, dastart, "da", CAM_PERIPH_BIO, path, daasync, AC_FOUND_DEVICE, cgd); if (status != CAM_REQ_CMP && status != CAM_REQ_INPROG) printf("daasync: Unable to attach to new device " "due to status 0x%x\n", status); return; } case AC_ADVINFO_CHANGED: { uintptr_t buftype; buftype = (uintptr_t)arg; if (buftype == CDAI_TYPE_PHYS_PATH) { struct da_softc *softc; softc = periph->softc; disk_attr_changed(softc->disk, "GEOM::physpath", M_NOWAIT); } break; } case AC_UNIT_ATTENTION: { union ccb *ccb; int error_code, sense_key, asc, ascq; softc = (struct da_softc *)periph->softc; ccb = (union ccb *)arg; /* * Handle all UNIT ATTENTIONs except our own, * as they will be handled by daerror(). */ if (xpt_path_periph(ccb->ccb_h.path) != periph && scsi_extract_sense_ccb(ccb, &error_code, &sense_key, &asc, &ascq)) { if (asc == 0x2A && ascq == 0x09) { xpt_print(ccb->ccb_h.path, "Capacity data has changed\n"); softc->flags &= ~DA_FLAG_PROBED; dareprobe(periph); } else if (asc == 0x28 && ascq == 0x00) { softc->flags &= ~DA_FLAG_PROBED; disk_media_changed(softc->disk, M_NOWAIT); } else if (asc == 0x3F && ascq == 0x03) { xpt_print(ccb->ccb_h.path, "INQUIRY data has changed\n"); softc->flags &= ~DA_FLAG_PROBED; dareprobe(periph); } } cam_periph_async(periph, code, path, arg); break; } case AC_SCSI_AEN: softc = (struct da_softc *)periph->softc; if (!softc->tur) { if (cam_periph_acquire(periph) == CAM_REQ_CMP) { softc->tur = 1; daschedule(periph); } } /* FALLTHROUGH */ case AC_SENT_BDR: case AC_BUS_RESET: { struct ccb_hdr *ccbh; softc = (struct da_softc *)periph->softc; /* * Don't fail on the expected unit attention * that will occur. 
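The ASC/ASCQ pairs special-cased in daasync() above have fixed meanings in SPC; a small standalone helper expressing the same classification (any other unit attention is left to the generic error recovery in daerror()):

#include <stdio.h>

/* Map the unit-attention ASC/ASCQ pairs that da(4) reacts to. */
static const char *
ua_reason(int asc, int ascq)
{
	if (asc == 0x2A && ascq == 0x09)
		return ("capacity data has changed");
	if (asc == 0x28 && ascq == 0x00)
		return ("medium may have changed");
	if (asc == 0x3F && ascq == 0x03)
		return ("INQUIRY data has changed");
	return (NULL);
}

int
main(void)
{
	const char *why = ua_reason(0x2A, 0x09);

	printf("%s\n", why != NULL ? why : "unhandled unit attention");
	return (0);
}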
*/ softc->flags |= DA_FLAG_RETRY_UA; LIST_FOREACH(ccbh, &softc->pending_ccbs, periph_links.le) ccbh->ccb_state |= DA_CCB_RETRY_UA; break; } default: break; } cam_periph_async(periph, code, path, arg); } static void dasysctlinit(void *context, int pending) { struct cam_periph *periph; struct da_softc *softc; char tmpstr[80], tmpstr2[80]; struct ccb_trans_settings cts; periph = (struct cam_periph *)context; /* * periph was held for us when this task was enqueued */ if (periph->flags & CAM_PERIPH_INVALID) { cam_periph_release(periph); return; } softc = (struct da_softc *)periph->softc; snprintf(tmpstr, sizeof(tmpstr), "CAM DA unit %d", periph->unit_number); snprintf(tmpstr2, sizeof(tmpstr2), "%d", periph->unit_number); sysctl_ctx_init(&softc->sysctl_ctx); softc->flags |= DA_FLAG_SCTX_INIT; softc->sysctl_tree = SYSCTL_ADD_NODE(&softc->sysctl_ctx, SYSCTL_STATIC_CHILDREN(_kern_cam_da), OID_AUTO, tmpstr2, CTLFLAG_RD, 0, tmpstr); if (softc->sysctl_tree == NULL) { printf("dasysctlinit: unable to allocate sysctl tree\n"); cam_periph_release(periph); return; } /* * Now register the sysctl handler, so the user can change the value on * the fly. */ SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "delete_method", CTLTYPE_STRING | CTLFLAG_RWTUN, softc, 0, dadeletemethodsysctl, "A", "BIO_DELETE execution method"); SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "delete_max", CTLTYPE_U64 | CTLFLAG_RW, softc, 0, dadeletemaxsysctl, "Q", "Maximum BIO_DELETE size"); SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "minimum_cmd_size", CTLTYPE_INT | CTLFLAG_RW, &softc->minimum_cmd_size, 0, dacmdsizesysctl, "I", "Minimum CDB size"); SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "sort_io_queue", CTLFLAG_RW, &softc->sort_io_queue, 0, "Sort IO queue to try and optimise disk access patterns"); SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "error_inject", CTLFLAG_RW, &softc->error_inject, 0, "error_inject leaf"); /* * Add some addressing info. */ memset(&cts, 0, sizeof (cts)); xpt_setup_ccb(&cts.ccb_h, periph->path, CAM_PRIORITY_NONE); cts.ccb_h.func_code = XPT_GET_TRAN_SETTINGS; cts.type = CTS_TYPE_CURRENT_SETTINGS; cam_periph_lock(periph); xpt_action((union ccb *)&cts); cam_periph_unlock(periph); if (cts.ccb_h.status != CAM_REQ_CMP) { cam_periph_release(periph); return; } if (cts.protocol == PROTO_SCSI && cts.transport == XPORT_FC) { struct ccb_trans_settings_fc *fc = &cts.xport_specific.fc; if (fc->valid & CTS_FC_VALID_WWPN) { softc->wwpn = fc->wwpn; SYSCTL_ADD_UQUAD(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "wwpn", CTLFLAG_RD, &softc->wwpn, "World Wide Port Name"); } } cam_periph_release(periph); } static int dadeletemaxsysctl(SYSCTL_HANDLER_ARGS) { int error; uint64_t value; struct da_softc *softc; softc = (struct da_softc *)arg1; value = softc->disk->d_delmaxsize; error = sysctl_handle_64(oidp, &value, 0, req); if ((error != 0) || (req->newptr == NULL)) return (error); /* only accept values smaller than the calculated value */ if (value > dadeletemaxsize(softc, softc->delete_method)) { return (EINVAL); } softc->disk->d_delmaxsize = value; return (0); } static int dacmdsizesysctl(SYSCTL_HANDLER_ARGS) { int error, value; value = *(int *)arg1; error = sysctl_handle_int(oidp, &value, 0, req); if ((error != 0) || (req->newptr == NULL)) return (error); /* * Acceptable values here are 6, 10, 12 or 16. 
*/ if (value < 6) value = 6; else if ((value > 6) && (value <= 10)) value = 10; else if ((value > 10) && (value <= 12)) value = 12; else if (value > 12) value = 16; *(int *)arg1 = value; return (0); } static void dadeletemethodset(struct da_softc *softc, da_delete_methods delete_method) { softc->delete_method = delete_method; softc->disk->d_delmaxsize = dadeletemaxsize(softc, delete_method); softc->delete_func = da_delete_functions[delete_method]; if (softc->delete_method > DA_DELETE_DISABLE) softc->disk->d_flags |= DISKFLAG_CANDELETE; else softc->disk->d_flags &= ~DISKFLAG_CANDELETE; } static off_t dadeletemaxsize(struct da_softc *softc, da_delete_methods delete_method) { off_t sectors; switch(delete_method) { case DA_DELETE_UNMAP: sectors = (off_t)softc->unmap_max_lba; break; case DA_DELETE_ATA_TRIM: sectors = (off_t)ATA_DSM_RANGE_MAX * softc->trim_max_ranges; break; case DA_DELETE_WS16: sectors = omin(softc->ws_max_blks, WS16_MAX_BLKS); break; case DA_DELETE_ZERO: case DA_DELETE_WS10: sectors = omin(softc->ws_max_blks, WS10_MAX_BLKS); break; default: return 0; } return (off_t)softc->params.secsize * omin(sectors, softc->params.sectors); } static void daprobedone(struct cam_periph *periph, union ccb *ccb) { struct da_softc *softc; softc = (struct da_softc *)periph->softc; dadeletemethodchoose(softc, DA_DELETE_NONE); if (bootverbose && (softc->flags & DA_FLAG_ANNOUNCED) == 0) { char buf[80]; int i, sep; snprintf(buf, sizeof(buf), "Delete methods: <"); sep = 0; for (i = 0; i <= DA_DELETE_MAX; i++) { if ((softc->delete_available & (1 << i)) == 0 && i != softc->delete_method) continue; if (sep) strlcat(buf, ",", sizeof(buf)); strlcat(buf, da_delete_method_names[i], sizeof(buf)); if (i == softc->delete_method) strlcat(buf, "(*)", sizeof(buf)); sep = 1; } strlcat(buf, ">", sizeof(buf)); printf("%s%d: %s\n", periph->periph_name, periph->unit_number, buf); } /* * Since our peripheral may be invalidated by an error * above or an external event, we must release our CCB * before releasing the probe lock on the peripheral. * The peripheral will only go away once the last lock * is removed, and we need it around for the CCB release * operation. */ xpt_release_ccb(ccb); softc->state = DA_STATE_NORMAL; softc->flags |= DA_FLAG_PROBED; daschedule(periph); wakeup(&softc->disk->d_mediasize); if ((softc->flags & DA_FLAG_ANNOUNCED) == 0) { softc->flags |= DA_FLAG_ANNOUNCED; cam_periph_unhold(periph); } else cam_periph_release_locked(periph); } static void dadeletemethodchoose(struct da_softc *softc, da_delete_methods default_method) { int i, methods; /* If available, prefer the method requested by user. */ i = softc->delete_method_pref; methods = softc->delete_available | (1 << DA_DELETE_DISABLE); if (methods & (1 << i)) { dadeletemethodset(softc, i); return; } /* Use the pre-defined order to choose the best performing delete. */ for (i = DA_DELETE_MIN; i <= DA_DELETE_MAX; i++) { if (i == DA_DELETE_ZERO) continue; if (softc->delete_available & (1 << i)) { dadeletemethodset(softc, i); return; } } /* Fallback to default. 
*/ dadeletemethodset(softc, default_method); } static int dadeletemethodsysctl(SYSCTL_HANDLER_ARGS) { char buf[16]; const char *p; struct da_softc *softc; int i, error, methods, value; softc = (struct da_softc *)arg1; value = softc->delete_method; if (value < 0 || value > DA_DELETE_MAX) p = "UNKNOWN"; else p = da_delete_method_names[value]; strncpy(buf, p, sizeof(buf)); error = sysctl_handle_string(oidp, buf, sizeof(buf), req); if (error != 0 || req->newptr == NULL) return (error); methods = softc->delete_available | (1 << DA_DELETE_DISABLE); for (i = 0; i <= DA_DELETE_MAX; i++) { if (strcmp(buf, da_delete_method_names[i]) == 0) break; } if (i > DA_DELETE_MAX) return (EINVAL); softc->delete_method_pref = i; dadeletemethodchoose(softc, DA_DELETE_NONE); return (0); } static cam_status daregister(struct cam_periph *periph, void *arg) { struct da_softc *softc; struct ccb_pathinq cpi; struct ccb_getdev *cgd; char tmpstr[80]; caddr_t match; cgd = (struct ccb_getdev *)arg; if (cgd == NULL) { printf("daregister: no getdev CCB, can't register device\n"); return(CAM_REQ_CMP_ERR); } softc = (struct da_softc *)malloc(sizeof(*softc), M_DEVBUF, M_NOWAIT|M_ZERO); if (softc == NULL) { printf("daregister: Unable to probe new device. " "Unable to allocate softc\n"); return(CAM_REQ_CMP_ERR); } LIST_INIT(&softc->pending_ccbs); softc->state = DA_STATE_PROBE_RC; bioq_init(&softc->bio_queue); bioq_init(&softc->delete_queue); bioq_init(&softc->delete_run_queue); if (SID_IS_REMOVABLE(&cgd->inq_data)) softc->flags |= DA_FLAG_PACK_REMOVABLE; softc->unmap_max_ranges = UNMAP_MAX_RANGES; softc->unmap_max_lba = UNMAP_RANGE_MAX; softc->ws_max_blks = WS16_MAX_BLKS; softc->trim_max_ranges = ATA_TRIM_MAX_RANGES; softc->sort_io_queue = -1; periph->softc = softc; /* * See if this device has any quirks. */ match = cam_quirkmatch((caddr_t)&cgd->inq_data, (caddr_t)da_quirk_table, sizeof(da_quirk_table)/sizeof(*da_quirk_table), sizeof(*da_quirk_table), scsi_inquiry_match); if (match != NULL) softc->quirks = ((struct da_quirk_entry *)match)->quirks; else softc->quirks = DA_Q_NONE; /* Check if the SIM does not want 6 byte commands */ bzero(&cpi, sizeof(cpi)); xpt_setup_ccb(&cpi.ccb_h, periph->path, CAM_PRIORITY_NORMAL); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); if (cpi.ccb_h.status == CAM_REQ_CMP && (cpi.hba_misc & PIM_NO_6_BYTE)) softc->quirks |= DA_Q_NO_6_BYTE; TASK_INIT(&softc->sysctl_task, 0, dasysctlinit, periph); /* * Take an exclusive refcount on the periph while dastart is called * to finish the probe. The reference will be dropped in dadone at * the end of probe. */ (void)cam_periph_hold(periph, PRIBIO); /* * Schedule a periodic event to occasionally send an * ordered tag to a device. */ callout_init_mtx(&softc->sendordered_c, cam_periph_mtx(periph), 0); callout_reset(&softc->sendordered_c, (da_default_timeout * hz) / DA_ORDEREDTAG_INTERVAL, dasendorderedtag, softc); cam_periph_unlock(periph); /* * RBC devices don't have to support READ(6), only READ(10). */ if (softc->quirks & DA_Q_NO_6_BYTE || SID_TYPE(&cgd->inq_data) == T_RBC) softc->minimum_cmd_size = 10; else softc->minimum_cmd_size = 6; /* * Load the user's default, if any. */ snprintf(tmpstr, sizeof(tmpstr), "kern.cam.da.%d.minimum_cmd_size", periph->unit_number); TUNABLE_INT_FETCH(tmpstr, &softc->minimum_cmd_size); /* * 6, 10, 12 and 16 are the currently permissible values. 
*/ if (softc->minimum_cmd_size < 6) softc->minimum_cmd_size = 6; else if ((softc->minimum_cmd_size > 6) && (softc->minimum_cmd_size <= 10)) softc->minimum_cmd_size = 10; else if ((softc->minimum_cmd_size > 10) && (softc->minimum_cmd_size <= 12)) softc->minimum_cmd_size = 12; else if (softc->minimum_cmd_size > 12) softc->minimum_cmd_size = 16; /* Predict whether device may support READ CAPACITY(16). */ if (SID_ANSI_REV(&cgd->inq_data) >= SCSI_REV_SPC3 && (softc->quirks & DA_Q_NO_RC16) == 0) { softc->flags |= DA_FLAG_CAN_RC16; softc->state = DA_STATE_PROBE_RC16; } /* * Register this media as a disk. */ softc->disk = disk_alloc(); softc->disk->d_devstat = devstat_new_entry(periph->periph_name, periph->unit_number, 0, DEVSTAT_BS_UNAVAILABLE, SID_TYPE(&cgd->inq_data) | XPORT_DEVSTAT_TYPE(cpi.transport), DEVSTAT_PRIORITY_DISK); softc->disk->d_open = daopen; softc->disk->d_close = daclose; softc->disk->d_strategy = dastrategy; softc->disk->d_dump = dadump; softc->disk->d_getattr = dagetattr; softc->disk->d_gone = dadiskgonecb; softc->disk->d_name = "da"; softc->disk->d_drv1 = periph; if (cpi.maxio == 0) softc->maxio = DFLTPHYS; /* traditional default */ else if (cpi.maxio > MAXPHYS) softc->maxio = MAXPHYS; /* for safety */ else softc->maxio = cpi.maxio; softc->disk->d_maxsize = softc->maxio; softc->disk->d_unit = periph->unit_number; softc->disk->d_flags = DISKFLAG_DIRECT_COMPLETION; if ((softc->quirks & DA_Q_NO_SYNC_CACHE) == 0) softc->disk->d_flags |= DISKFLAG_CANFLUSHCACHE; if ((cpi.hba_misc & PIM_UNMAPPED) != 0) softc->disk->d_flags |= DISKFLAG_UNMAPPED_BIO; cam_strvis(softc->disk->d_descr, cgd->inq_data.vendor, sizeof(cgd->inq_data.vendor), sizeof(softc->disk->d_descr)); strlcat(softc->disk->d_descr, " ", sizeof(softc->disk->d_descr)); cam_strvis(&softc->disk->d_descr[strlen(softc->disk->d_descr)], cgd->inq_data.product, sizeof(cgd->inq_data.product), sizeof(softc->disk->d_descr) - strlen(softc->disk->d_descr)); softc->disk->d_hba_vendor = cpi.hba_vendor; softc->disk->d_hba_device = cpi.hba_device; softc->disk->d_hba_subvendor = cpi.hba_subvendor; softc->disk->d_hba_subdevice = cpi.hba_subdevice; /* * Acquire a reference to the periph before we register with GEOM. * We'll release this reference once GEOM calls us back (via * dadiskgonecb()) telling us that our provider has been freed. */ if (cam_periph_acquire(periph) != CAM_REQ_CMP) { xpt_print(periph->path, "%s: lost periph during " "registration!\n", __func__); cam_periph_lock(periph); return (CAM_REQ_CMP_ERR); } disk_create(softc->disk, DISK_VERSION); cam_periph_lock(periph); /* * Add async callbacks for events of interest. * I don't bother checking if this fails as, * in most cases, the system will function just * fine without them, and the only alternative * would be to not attach the device on failure. */ xpt_register_async(AC_SENT_BDR | AC_BUS_RESET | AC_LOST_DEVICE | AC_ADVINFO_CHANGED | AC_SCSI_AEN | AC_UNIT_ATTENTION, daasync, periph, periph->path); /* * Emit an attribute changed notification just in case * physical path information arrived before our async * event handler was registered, but after anyone attaching * to our disk device polled it. */ disk_attr_changed(softc->disk, "GEOM::physpath", M_NOWAIT); /* * Schedule periodic media polling events.
*/ callout_init_mtx(&softc->mediapoll_c, cam_periph_mtx(periph), 0); if ((softc->flags & DA_FLAG_PACK_REMOVABLE) && (cgd->inq_flags & SID_AEN) == 0 && da_poll_period != 0) callout_reset(&softc->mediapoll_c, da_poll_period * hz, damediapoll, periph); xpt_schedule(periph, CAM_PRIORITY_DEV); return(CAM_REQ_CMP); } static void dastart(struct cam_periph *periph, union ccb *start_ccb) { struct da_softc *softc; softc = (struct da_softc *)periph->softc; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dastart\n")); skipstate: switch (softc->state) { case DA_STATE_NORMAL: { struct bio *bp; uint8_t tag_code; /* Run BIO_DELETE if not running yet. */ if (!softc->delete_running && (bp = bioq_first(&softc->delete_queue)) != NULL) { if (softc->delete_func != NULL) { softc->delete_func(periph, start_ccb, bp); goto out; } else { bioq_flush(&softc->delete_queue, NULL, 0); /* FALLTHROUGH */ } } /* Run regular command. */ bp = bioq_takefirst(&softc->bio_queue); if (bp == NULL) { if (softc->tur) { softc->tur = 0; scsi_test_unit_ready(&start_ccb->csio, /*retries*/ da_retry_count, dadone, MSG_SIMPLE_Q_TAG, SSD_FULL_SIZE, da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_TUR; xpt_action(start_ccb); } else xpt_release_ccb(start_ccb); break; } if (softc->tur) { softc->tur = 0; cam_periph_release_locked(periph); } if ((bp->bio_flags & BIO_ORDERED) != 0 || (softc->flags & DA_FLAG_NEED_OTAG) != 0) { softc->flags &= ~DA_FLAG_NEED_OTAG; softc->flags |= DA_FLAG_WAS_OTAG; tag_code = MSG_ORDERED_Q_TAG; } else { tag_code = MSG_SIMPLE_Q_TAG; } switch (bp->bio_cmd) { case BIO_WRITE: - softc->flags |= DA_FLAG_DIRTY; - /* FALLTHROUGH */ case BIO_READ: + { + void *data_ptr; + int rw_op; + + if (bp->bio_cmd == BIO_WRITE) { + softc->flags |= DA_FLAG_DIRTY; + rw_op = SCSI_RW_WRITE; + } else { + rw_op = SCSI_RW_READ; + } + + data_ptr = bp->bio_data; + if ((bp->bio_flags & (BIO_UNMAPPED|BIO_VLIST)) != 0) { + rw_op |= SCSI_RW_BIO; + data_ptr = bp; + } + scsi_read_write(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/tag_code, - /*read_op*/(bp->bio_cmd == BIO_READ ? - SCSI_RW_READ : SCSI_RW_WRITE) | - ((bp->bio_flags & BIO_UNMAPPED) != 0 ? - SCSI_RW_BIO : 0), + rw_op, /*byte2*/0, softc->minimum_cmd_size, /*lba*/bp->bio_pblkno, /*block_count*/bp->bio_bcount / softc->params.secsize, - /*data_ptr*/ (bp->bio_flags & - BIO_UNMAPPED) != 0 ? (void *)bp : - bp->bio_data, + data_ptr, /*dxfer_len*/ bp->bio_bcount, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); break; + } case BIO_FLUSH: /* * BIO_FLUSH doesn't currently communicate * range data, so we synchronize the cache * over the whole disk. We also force * ordered tag semantics so the flush applies * to all previously queued I/O.
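Issued with both begin_lba and lb_count zero, as below, SYNCHRONIZE CACHE(10) covers the entire medium. A minimal standalone sketch of that CDB's wire format (SBC); the builder name and the caller-supplied 10-byte buffer are assumptions for illustration:

    #include <stdint.h>
    #include <string.h>

    // SYNCHRONIZE CACHE(10): opcode 0x35, big-endian LBA in bytes 2-5,
    // big-endian block count in bytes 7-8; 0/0 means the whole medium.
    static void
    sync_cache10_cdb(uint8_t cdb[10], uint32_t lba, uint16_t nblocks)
    {
            memset(cdb, 0, 10);
            cdb[0] = 0x35;
            cdb[2] = lba >> 24;
            cdb[3] = lba >> 16;
            cdb[4] = lba >> 8;
            cdb[5] = lba;
            cdb[7] = nblocks >> 8;
            cdb[8] = nblocks;
    }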
*/ scsi_synchronize_cache(&start_ccb->csio, /*retries*/1, /*cbfcnp*/dadone, MSG_ORDERED_Q_TAG, /*begin_lba*/0, /*lb_count*/0, SSD_FULL_SIZE, da_default_timeout*1000); break; } start_ccb->ccb_h.ccb_state = DA_CCB_BUFFER_IO; start_ccb->ccb_h.flags |= CAM_UNLOCKED; out: LIST_INSERT_HEAD(&softc->pending_ccbs, &start_ccb->ccb_h, periph_links.le); /* We expect a unit attention from this device */ if ((softc->flags & DA_FLAG_RETRY_UA) != 0) { start_ccb->ccb_h.ccb_state |= DA_CCB_RETRY_UA; softc->flags &= ~DA_FLAG_RETRY_UA; } start_ccb->ccb_h.ccb_bp = bp; softc->refcount++; cam_periph_unlock(periph); xpt_action(start_ccb); cam_periph_lock(periph); softc->refcount--; /* May have more work to do, so ensure we stay scheduled */ daschedule(periph); break; } case DA_STATE_PROBE_RC: { struct scsi_read_capacity_data *rcap; rcap = (struct scsi_read_capacity_data *) malloc(sizeof(*rcap), M_SCSIDA, M_NOWAIT|M_ZERO); if (rcap == NULL) { printf("dastart: Couldn't malloc read_capacity data\n"); /* da_free_periph??? */ break; } scsi_read_capacity(&start_ccb->csio, /*retries*/da_retry_count, dadone, MSG_SIMPLE_Q_TAG, rcap, SSD_FULL_SIZE, /*timeout*/5000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_RC; xpt_action(start_ccb); break; } case DA_STATE_PROBE_RC16: { struct scsi_read_capacity_data_long *rcaplong; rcaplong = (struct scsi_read_capacity_data_long *) malloc(sizeof(*rcaplong), M_SCSIDA, M_NOWAIT|M_ZERO); if (rcaplong == NULL) { printf("dastart: Couldn't malloc read_capacity data\n"); /* da_free_periph??? */ break; } scsi_read_capacity_16(&start_ccb->csio, /*retries*/ da_retry_count, /*cbfcnp*/ dadone, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*lba*/ 0, /*reladr*/ 0, /*pmi*/ 0, /*rcap_buf*/ (uint8_t *)rcaplong, /*rcap_buf_len*/ sizeof(*rcaplong), /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_RC16; xpt_action(start_ccb); break; } case DA_STATE_PROBE_LBP: { struct scsi_vpd_logical_block_prov *lbp; if (!scsi_vpd_supported_page(periph, SVPD_LBP)) { /* * If we get here we don't support any SBC-3 delete * methods with UNMAP as the Logical Block Provisioning * VPD page support is required for devices which * support it according to T10/1799-D Revision 31 * however older revisions of the spec don't mandate * this so we currently don't remove these methods * from the available set. */ softc->state = DA_STATE_PROBE_BLK_LIMITS; goto skipstate; } lbp = (struct scsi_vpd_logical_block_prov *) malloc(sizeof(*lbp), M_SCSIDA, M_NOWAIT|M_ZERO); if (lbp == NULL) { printf("dastart: Couldn't malloc lbp data\n"); /* da_free_periph??? */ break; } scsi_inquiry(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, /*inq_buf*/(u_int8_t *)lbp, /*inq_len*/sizeof(*lbp), /*evpd*/TRUE, /*page_code*/SVPD_LBP, /*sense_len*/SSD_MIN_SIZE, /*timeout*/da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_LBP; xpt_action(start_ccb); break; } case DA_STATE_PROBE_BLK_LIMITS: { struct scsi_vpd_block_limits *block_limits; if (!scsi_vpd_supported_page(periph, SVPD_BLOCK_LIMITS)) { /* Not supported skip to next probe */ softc->state = DA_STATE_PROBE_BDC; goto skipstate; } block_limits = (struct scsi_vpd_block_limits *) malloc(sizeof(*block_limits), M_SCSIDA, M_NOWAIT|M_ZERO); if (block_limits == NULL) { printf("dastart: Couldn't malloc block_limits data\n"); /* da_free_periph??? 
*/ break; } scsi_inquiry(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, /*inq_buf*/(u_int8_t *)block_limits, /*inq_len*/sizeof(*block_limits), /*evpd*/TRUE, /*page_code*/SVPD_BLOCK_LIMITS, /*sense_len*/SSD_MIN_SIZE, /*timeout*/da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_BLK_LIMITS; xpt_action(start_ccb); break; } case DA_STATE_PROBE_BDC: { struct scsi_vpd_block_characteristics *bdc; if (!scsi_vpd_supported_page(periph, SVPD_BDC)) { softc->state = DA_STATE_PROBE_ATA; goto skipstate; } bdc = (struct scsi_vpd_block_characteristics *) malloc(sizeof(*bdc), M_SCSIDA, M_NOWAIT|M_ZERO); if (bdc == NULL) { printf("dastart: Couldn't malloc bdc data\n"); /* da_free_periph??? */ break; } scsi_inquiry(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, /*inq_buf*/(u_int8_t *)bdc, /*inq_len*/sizeof(*bdc), /*evpd*/TRUE, /*page_code*/SVPD_BDC, /*sense_len*/SSD_MIN_SIZE, /*timeout*/da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_BDC; xpt_action(start_ccb); break; } case DA_STATE_PROBE_ATA: { struct ata_params *ata_params; if (!scsi_vpd_supported_page(periph, SVPD_ATA_INFORMATION)) { daprobedone(periph, start_ccb); break; } ata_params = (struct ata_params*) malloc(sizeof(*ata_params), M_SCSIDA, M_NOWAIT|M_ZERO); if (ata_params == NULL) { printf("dastart: Couldn't malloc ata_params data\n"); /* da_free_periph??? */ break; } scsi_ata_identify(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, /*data_ptr*/(u_int8_t *)ata_params, /*dxfer_len*/sizeof(*ata_params), /*sense_len*/SSD_FULL_SIZE, /*timeout*/da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_ATA; xpt_action(start_ccb); break; } } } /* * In each of the methods below, while it's the caller's * responsibility to ensure the request will fit into a * single device request, we might have changed the delete * method due to the device incorrectly advertising either * its supported methods or limits. * * To prevent this causing further issues, we validate * against the method's limits and warn, which would * otherwise be unnecessary. */ static void da_delete_unmap(struct cam_periph *periph, union ccb *ccb, struct bio *bp) { struct da_softc *softc = (struct da_softc *)periph->softc; struct bio *bp1; uint8_t *buf = softc->unmap_buf; uint64_t lba, lastlba = (uint64_t)-1; uint64_t totalcount = 0; uint64_t count; uint32_t lastcount = 0, c; uint32_t off, ranges = 0; /* * Currently this doesn't take the UNMAP * Granularity and Granularity Alignment * fields into account. * * This could result in both suboptimal unmap * requests as well as UNMAP calls unmapping * fewer LBAs than requested. */ softc->delete_running = 1; bzero(softc->unmap_buf, sizeof(softc->unmap_buf)); bp1 = bp; do { bioq_remove(&softc->delete_queue, bp1); if (bp1 != bp) bioq_insert_tail(&softc->delete_run_queue, bp1); lba = bp1->bio_pblkno; count = bp1->bio_bcount / softc->params.secsize; /* Try to extend the previous range.
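Each descriptor in the UNMAP parameter list is 16 bytes, an 8-byte big-endian LBA followed by a 4-byte big-endian block count and 4 reserved bytes, placed after an 8-byte list header; when a BIO begins exactly where the previous range ended, the code below widens that range rather than spending a new descriptor. A standalone sketch of the descriptor encoding (the helper is hypothetical; the 8- and 16-byte offsets mirror UNMAP_HEAD_SIZE and UNMAP_RANGE_SIZE):

    #include <stdint.h>

    // Fill descriptor 'idx' of an UNMAP parameter list; big-endian
    // fields, equivalent to the scsi_u64to8b() and scsi_ulto4b()
    // calls used below.
    static void
    unmap_desc(uint8_t *list, unsigned idx, uint64_t lba, uint32_t nblks)
    {
            uint8_t *d = list + 8 + idx * 16;
            int i;

            for (i = 0; i < 8; i++)
                    d[i] = lba >> (56 - 8 * i);
            for (i = 0; i < 4; i++)
                    d[8 + i] = nblks >> (24 - 8 * i);
    }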
*/ if (lba == lastlba) { c = omin(count, UNMAP_RANGE_MAX - lastcount); lastcount += c; off = ((ranges - 1) * UNMAP_RANGE_SIZE) + UNMAP_HEAD_SIZE; scsi_ulto4b(lastcount, &buf[off + 8]); count -= c; lba += c; totalcount += c; } while (count > 0) { c = omin(count, UNMAP_RANGE_MAX); if (totalcount + c > softc->unmap_max_lba || ranges >= softc->unmap_max_ranges) { xpt_print(periph->path, "%s issuing short delete %ld > %ld || %d >= %d\n", da_delete_method_desc[softc->delete_method], totalcount + c, softc->unmap_max_lba, ranges, softc->unmap_max_ranges); break; } off = (ranges * UNMAP_RANGE_SIZE) + UNMAP_HEAD_SIZE; scsi_u64to8b(lba, &buf[off + 0]); scsi_ulto4b(c, &buf[off + 8]); lba += c; totalcount += c; ranges++; count -= c; lastcount = c; } lastlba = lba; bp1 = bioq_first(&softc->delete_queue); if (bp1 == NULL || ranges >= softc->unmap_max_ranges || totalcount + bp1->bio_bcount / softc->params.secsize > softc->unmap_max_lba) break; } while (1); scsi_ulto2b(ranges * 16 + 6, &buf[0]); scsi_ulto2b(ranges * 16, &buf[2]); scsi_unmap(&ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, /*byte2*/0, /*data_ptr*/ buf, /*dxfer_len*/ ranges * 16 + 8, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); ccb->ccb_h.ccb_state = DA_CCB_DELETE; ccb->ccb_h.flags |= CAM_UNLOCKED; } static void da_delete_trim(struct cam_periph *periph, union ccb *ccb, struct bio *bp) { struct da_softc *softc = (struct da_softc *)periph->softc; struct bio *bp1; uint8_t *buf = softc->unmap_buf; uint64_t lastlba = (uint64_t)-1; uint64_t count; uint64_t lba; uint32_t lastcount = 0, c, requestcount; int ranges = 0, off, block_count; softc->delete_running = 1; bzero(softc->unmap_buf, sizeof(softc->unmap_buf)); bp1 = bp; do { bioq_remove(&softc->delete_queue, bp1); if (bp1 != bp) bioq_insert_tail(&softc->delete_run_queue, bp1); lba = bp1->bio_pblkno; count = bp1->bio_bcount / softc->params.secsize; requestcount = count; /* Try to extend the previous range.
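ATA DSM ranges are packed differently from UNMAP: each entry is 8 bytes, a 48-bit little-endian LBA in bytes 0-5 and a 16-bit sector count in bytes 6-7, 64 entries to a 512-byte payload block, which is exactly what the byte shifts below implement. A standalone sketch (hypothetical helper name):

    #include <stdint.h>

    // Encode one ATA DSM (TRIM) range entry: 48-bit LBA plus 16-bit
    // count, both little-endian; a zero count marks an unused entry.
    static void
    dsm_range(uint8_t ent[8], uint64_t lba, uint16_t nsectors)
    {
            int i;

            for (i = 0; i < 6; i++)
                    ent[i] = lba >> (8 * i);
            ent[6] = nsectors;
            ent[7] = nsectors >> 8;
    }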
*/ if (lba == lastlba) { c = omin(count, ATA_DSM_RANGE_MAX - lastcount); lastcount += c; off = (ranges - 1) * 8; buf[off + 6] = lastcount & 0xff; buf[off + 7] = (lastcount >> 8) & 0xff; count -= c; lba += c; } while (count > 0) { c = omin(count, ATA_DSM_RANGE_MAX); off = ranges * 8; buf[off + 0] = lba & 0xff; buf[off + 1] = (lba >> 8) & 0xff; buf[off + 2] = (lba >> 16) & 0xff; buf[off + 3] = (lba >> 24) & 0xff; buf[off + 4] = (lba >> 32) & 0xff; buf[off + 5] = (lba >> 40) & 0xff; buf[off + 6] = c & 0xff; buf[off + 7] = (c >> 8) & 0xff; lba += c; ranges++; count -= c; lastcount = c; if (count != 0 && ranges == softc->trim_max_ranges) { xpt_print(periph->path, "%s issuing short delete %ld > %ld\n", da_delete_method_desc[softc->delete_method], requestcount, (softc->trim_max_ranges - ranges) * ATA_DSM_RANGE_MAX); break; } } lastlba = lba; bp1 = bioq_first(&softc->delete_queue); if (bp1 == NULL || bp1->bio_bcount / softc->params.secsize > (softc->trim_max_ranges - ranges) * ATA_DSM_RANGE_MAX) break; } while (1); block_count = (ranges + ATA_DSM_BLK_RANGES - 1) / ATA_DSM_BLK_RANGES; scsi_ata_trim(&ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, block_count, /*data_ptr*/buf, /*dxfer_len*/block_count * ATA_DSM_BLK_SIZE, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); ccb->ccb_h.ccb_state = DA_CCB_DELETE; ccb->ccb_h.flags |= CAM_UNLOCKED; } /* * We calculate ws_max_blks here based on d_delmaxsize instead * of softc->ws_max_blks, since the latter is the absolute max * for the device, not the protocol max, which may well be lower. */ static void da_delete_ws(struct cam_periph *periph, union ccb *ccb, struct bio *bp) { struct da_softc *softc; struct bio *bp1; uint64_t ws_max_blks; uint64_t lba; uint64_t count; /* forward compat with WS32 */ softc = (struct da_softc *)periph->softc; ws_max_blks = softc->disk->d_delmaxsize / softc->params.secsize; softc->delete_running = 1; lba = bp->bio_pblkno; count = 0; bp1 = bp; do { bioq_remove(&softc->delete_queue, bp1); if (bp1 != bp) bioq_insert_tail(&softc->delete_run_queue, bp1); count += bp1->bio_bcount / softc->params.secsize; if (count > ws_max_blks) { xpt_print(periph->path, "%s issuing short delete %ld > %ld\n", da_delete_method_desc[softc->delete_method], count, ws_max_blks); count = omin(count, ws_max_blks); break; } bp1 = bioq_first(&softc->delete_queue); if (bp1 == NULL || lba + count != bp1->bio_pblkno || count + bp1->bio_bcount / softc->params.secsize > ws_max_blks) break; } while (1); scsi_write_same(&ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, /*byte2*/softc->delete_method == DA_DELETE_ZERO ? 0 : SWS_UNMAP, softc->delete_method == DA_DELETE_WS16 ? 16 : 10, /*lba*/lba, /*block_count*/count, /*data_ptr*/ __DECONST(void *, zero_region), /*dxfer_len*/ softc->params.secsize, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); ccb->ccb_h.ccb_state = DA_CCB_DELETE; ccb->ccb_h.flags |= CAM_UNLOCKED; } static int cmd6workaround(union ccb *ccb) { struct scsi_rw_6 cmd6; struct scsi_rw_10 *cmd10; struct da_softc *softc; u_int8_t *cdb; struct bio *bp; int frozen; cdb = ccb->csio.cdb_io.cdb_bytes; softc = (struct da_softc *)xpt_path_periph(ccb->ccb_h.path)->softc; if (ccb->ccb_h.ccb_state == DA_CCB_DELETE) { da_delete_methods old_method = softc->delete_method; /* * Typically there are two reasons for failure here: * 1. Delete method was detected as supported but isn't * 2. Delete failed due to invalid params e.g.
too big * * While we will attempt to choose an alternative delete method, * this may result in short deletes if the existing delete * requests from GEOM are too big for the newly chosen method. * * This method assumes that the error which triggered this * will not retry the I/O; otherwise a panic will occur. */ dadeleteflag(softc, old_method, 0); dadeletemethodchoose(softc, DA_DELETE_DISABLE); if (softc->delete_method == DA_DELETE_DISABLE) xpt_print(ccb->ccb_h.path, "%s failed, disabling BIO_DELETE\n", da_delete_method_desc[old_method]); else xpt_print(ccb->ccb_h.path, "%s failed, switching to %s BIO_DELETE\n", da_delete_method_desc[old_method], da_delete_method_desc[softc->delete_method]); while ((bp = bioq_takefirst(&softc->delete_run_queue)) != NULL) bioq_disksort(&softc->delete_queue, bp); bioq_disksort(&softc->delete_queue, (struct bio *)ccb->ccb_h.ccb_bp); ccb->ccb_h.ccb_bp = NULL; return (0); } /* Detect unsupported PREVENT ALLOW MEDIUM REMOVAL. */ if ((ccb->ccb_h.flags & CAM_CDB_POINTER) == 0 && (*cdb == PREVENT_ALLOW) && (softc->quirks & DA_Q_NO_PREVENT) == 0) { if (bootverbose) xpt_print(ccb->ccb_h.path, "PREVENT ALLOW MEDIUM REMOVAL not supported.\n"); softc->quirks |= DA_Q_NO_PREVENT; return (0); } /* Detect unsupported SYNCHRONIZE CACHE(10). */ if ((ccb->ccb_h.flags & CAM_CDB_POINTER) == 0 && (*cdb == SYNCHRONIZE_CACHE) && (softc->quirks & DA_Q_NO_SYNC_CACHE) == 0) { if (bootverbose) xpt_print(ccb->ccb_h.path, "SYNCHRONIZE CACHE(10) not supported.\n"); softc->quirks |= DA_Q_NO_SYNC_CACHE; softc->disk->d_flags &= ~DISKFLAG_CANFLUSHCACHE; return (0); } /* Translation only possible if CDB is an array and cmd is R/W6 */ if ((ccb->ccb_h.flags & CAM_CDB_POINTER) != 0 || (*cdb != READ_6 && *cdb != WRITE_6)) return 0; xpt_print(ccb->ccb_h.path, "READ(6)/WRITE(6) not supported, " "increasing minimum_cmd_size to 10.\n"); softc->minimum_cmd_size = 10; bcopy(cdb, &cmd6, sizeof(struct scsi_rw_6)); cmd10 = (struct scsi_rw_10 *)cdb; cmd10->opcode = (cmd6.opcode == READ_6) ? READ_10 : WRITE_10; cmd10->byte2 = 0; scsi_ulto4b(scsi_3btoul(cmd6.addr), cmd10->addr); cmd10->reserved = 0; scsi_ulto2b(cmd6.length, cmd10->length); cmd10->control = cmd6.control; ccb->csio.cdb_len = sizeof(*cmd10); /* Requeue request, unfreezing queue if necessary */ frozen = (ccb->ccb_h.status & CAM_DEV_QFRZN) != 0; ccb->ccb_h.status = CAM_REQUEUE_REQ; xpt_action(ccb); if (frozen) { cam_release_devq(ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } return (ERESTART); } static void dadone(struct cam_periph *periph, union ccb *done_ccb) { struct da_softc *softc; struct ccb_scsiio *csio; u_int32_t priority; da_ccb_state state; softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone\n")); csio = &done_ccb->csio; state = csio->ccb_h.ccb_state & DA_CCB_TYPE_MASK; switch (state) { case DA_CCB_BUFFER_IO: case DA_CCB_DELETE: { struct bio *bp, *bp1; cam_periph_lock(periph); bp = (struct bio *)done_ccb->ccb_h.ccb_bp; if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { int error; int sf; if ((csio->ccb_h.ccb_state & DA_CCB_RETRY_UA) != 0) sf = SF_RETRY_UA; else sf = 0; error = daerror(done_ccb, CAM_RETRY_SELTO, sf); if (error == ERESTART) { /* * A retry was scheduled, so * just return.
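(daerror() returns ERESTART when cam_periph_error() has requeued the CCB for another attempt; ownership of the CCB has passed back to the transport, so the completion path must not release the CCB or complete the BIO here.)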
*/ cam_periph_unlock(periph); return; } bp = (struct bio *)done_ccb->ccb_h.ccb_bp; if (error != 0) { int queued_error; /* * return all queued I/O with EIO, so that * the client can retry these I/Os in the * proper order should it attempt to recover. */ queued_error = EIO; if (error == ENXIO && (softc->flags & DA_FLAG_PACK_INVALID)== 0) { /* * Catastrophic error. Mark our pack as * invalid. */ /* * XXX See if this is really a media * XXX change first? */ xpt_print(periph->path, "Invalidating pack\n"); softc->flags |= DA_FLAG_PACK_INVALID; queued_error = ENXIO; } bioq_flush(&softc->bio_queue, NULL, queued_error); if (bp != NULL) { bp->bio_error = error; bp->bio_resid = bp->bio_bcount; bp->bio_flags |= BIO_ERROR; } } else if (bp != NULL) { if (state == DA_CCB_DELETE) bp->bio_resid = 0; else bp->bio_resid = csio->resid; bp->bio_error = 0; if (bp->bio_resid != 0) bp->bio_flags |= BIO_ERROR; } if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } else if (bp != NULL) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) panic("REQ_CMP with QFRZN"); if (state == DA_CCB_DELETE) bp->bio_resid = 0; else bp->bio_resid = csio->resid; if (csio->resid > 0) bp->bio_flags |= BIO_ERROR; if (softc->error_inject != 0) { bp->bio_error = softc->error_inject; bp->bio_resid = bp->bio_bcount; bp->bio_flags |= BIO_ERROR; softc->error_inject = 0; } } LIST_REMOVE(&done_ccb->ccb_h, periph_links.le); if (LIST_EMPTY(&softc->pending_ccbs)) softc->flags |= DA_FLAG_WAS_OTAG; xpt_release_ccb(done_ccb); if (state == DA_CCB_DELETE) { TAILQ_HEAD(, bio) queue; TAILQ_INIT(&queue); TAILQ_CONCAT(&queue, &softc->delete_run_queue.queue, bio_queue); softc->delete_run_queue.insert_point = NULL; /* * Normally, the xpt_release_ccb() above would make sure * that when we have more work to do, that work would * get kicked off. However, we specifically keep * delete_running set to 0 before the call above to * allow other I/O to progress when many BIO_DELETE * requests are pushed down. We set delete_running to 0 * and call daschedule again so that we don't stall if * there are no other I/Os pending apart from BIO_DELETEs. */ softc->delete_running = 0; daschedule(periph); cam_periph_unlock(periph); while ((bp1 = TAILQ_FIRST(&queue)) != NULL) { TAILQ_REMOVE(&queue, bp1, bio_queue); bp1->bio_error = bp->bio_error; if (bp->bio_flags & BIO_ERROR) { bp1->bio_flags |= BIO_ERROR; bp1->bio_resid = bp1->bio_bcount; } else bp1->bio_resid = 0; biodone(bp1); } } else cam_periph_unlock(periph); if (bp != NULL) biodone(bp); return; } case DA_CCB_PROBE_RC: case DA_CCB_PROBE_RC16: { struct scsi_read_capacity_data *rdcap; struct scsi_read_capacity_data_long *rcaplong; char announce_buf[80]; int lbp; lbp = 0; rdcap = NULL; rcaplong = NULL; if (state == DA_CCB_PROBE_RC) rdcap =(struct scsi_read_capacity_data *)csio->data_ptr; else rcaplong = (struct scsi_read_capacity_data_long *) csio->data_ptr; if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { struct disk_params *dp; uint32_t block_size; uint64_t maxsector; u_int lbppbe; /* LB per physical block exponent. */ u_int lalba; /* Lowest aligned LBA. 
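Together these two READ CAPACITY(16) fields describe the physical-block geometry that dasetgeom() later turns into stripe parameters. A standalone sketch of that derivation (function and parameter names are illustrative; the driver only applies it when lbppbe is non-zero):

    #include <stdint.h>

    // Stripe geometry from READ CAPACITY(16): lbppbe is the logical-
    // blocks-per-physical-block exponent, lalba the lowest LBA aligned
    // to a physical block boundary.
    static void
    rc16_geometry(uint32_t secsize, unsigned lbppbe, unsigned lalba,
        uint64_t *stripesize, uint64_t *stripeoffset)
    {
            *stripesize = (uint64_t)secsize << lbppbe;
            *stripeoffset = (*stripesize - (uint64_t)secsize * lalba) %
                *stripesize;
    }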
*/ if (state == DA_CCB_PROBE_RC) { block_size = scsi_4btoul(rdcap->length); maxsector = scsi_4btoul(rdcap->addr); lbppbe = 0; lalba = 0; /* * According to SBC-2, if the standard 10 * byte READ CAPACITY command returns 2^32 - 1, * we should issue the 16 byte version of * the command, since the device in question * has more sectors than can be represented * with the short version of the command. */ if (maxsector == 0xffffffff) { free(rdcap, M_SCSIDA); xpt_release_ccb(done_ccb); softc->state = DA_STATE_PROBE_RC16; xpt_schedule(periph, priority); return; } } else { block_size = scsi_4btoul(rcaplong->length); maxsector = scsi_8btou64(rcaplong->addr); lbppbe = rcaplong->prot_lbppbe & SRC16_LBPPBE; lalba = scsi_2btoul(rcaplong->lalba_lbp); } /* * Because the GEOM code will just panic if we * give it an 'illegal' value, we avoid that * here. */ if (block_size == 0) { block_size = 512; if (maxsector == 0) maxsector = -1; } if (block_size >= MAXPHYS) { xpt_print(periph->path, "unsupportable block size %ju\n", (uintmax_t) block_size); announce_buf[0] = '\0'; cam_periph_invalidate(periph); } else { /* * We pass rcaplong into dasetgeom(), * because it will only use it if it is * non-NULL. */ dasetgeom(periph, block_size, maxsector, rcaplong, sizeof(*rcaplong)); lbp = (lalba & SRC16_LBPME_A); dp = &softc->params; snprintf(announce_buf, sizeof(announce_buf), "%juMB (%ju %u byte sectors)", ((uintmax_t)dp->secsize * dp->sectors) / (1024 * 1024), (uintmax_t)dp->sectors, dp->secsize); } } else { int error; announce_buf[0] = '\0'; /* * Retry any UNIT ATTENTION type errors. They * are expected at boot. */ error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) { /* * A retry was scheduled, so * just return. */ return; } else if (error != 0) { int asc, ascq; int sense_key, error_code; int have_sense; cam_status status; struct ccb_getdev cgd; /* Don't wedge this device's queue */ status = done_ccb->ccb_h.status; if ((status & CAM_DEV_QFRZN) != 0) cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); xpt_setup_ccb(&cgd.ccb_h, done_ccb->ccb_h.path, CAM_PRIORITY_NORMAL); cgd.ccb_h.func_code = XPT_GDEV_TYPE; xpt_action((union ccb *)&cgd); if (scsi_extract_sense_ccb(done_ccb, &error_code, &sense_key, &asc, &ascq)) have_sense = TRUE; else have_sense = FALSE; /* * If we tried READ CAPACITY(16) and failed, * fall back to READ CAPACITY(10). */ if ((state == DA_CCB_PROBE_RC16) && (softc->flags & DA_FLAG_CAN_RC16) && (((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_INVALID) || ((have_sense) && (error_code == SSD_CURRENT_ERROR) && (sense_key == SSD_KEY_ILLEGAL_REQUEST)))) { softc->flags &= ~DA_FLAG_CAN_RC16; free(rdcap, M_SCSIDA); xpt_release_ccb(done_ccb); softc->state = DA_STATE_PROBE_RC; xpt_schedule(periph, priority); return; } else /* * Attach to anything that claims to be a * direct access or optical disk device, * as long as it doesn't return a "Logical * unit not supported" (0x25) error.
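The error_code, sense_key, asc and ascq values tested below come from the sense data returned by the device; for fixed-format sense the fields sit at well-known offsets. A minimal standalone decode (fixed format only, per SPC; the function name is an assumption, and descriptor-format sense is deliberately not handled):

    #include <stdint.h>

    // Fixed-format sense (response codes 0x70 current, 0x71 deferred):
    // sense key in byte 2 bits 0-3, ASC in byte 12, ASCQ in byte 13.
    static int
    fixed_sense_fields(const uint8_t *s, int len, int *key, int *asc,
        int *ascq)
    {
            if (len < 14 ||
                ((s[0] & 0x7f) != 0x70 && (s[0] & 0x7f) != 0x71))
                    return (-1);
            *key = s[2] & 0x0f;
            *asc = s[12];
            *ascq = s[13];
            return (0);
    }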
*/ if ((have_sense) && (asc != 0x25) && (error_code == SSD_CURRENT_ERROR)) { const char *sense_key_desc; const char *asc_desc; dasetgeom(periph, 512, -1, NULL, 0); scsi_sense_desc(sense_key, asc, ascq, &cgd.inq_data, &sense_key_desc, &asc_desc); snprintf(announce_buf, sizeof(announce_buf), "Attempt to query device " "size failed: %s, %s", sense_key_desc, asc_desc); } else { if (have_sense) scsi_sense_print( &done_ccb->csio); else { xpt_print(periph->path, "got CAM status %#x\n", done_ccb->ccb_h.status); } xpt_print(periph->path, "fatal error, " "failed to attach to device\n"); /* * Free up resources. */ cam_periph_invalidate(periph); } } } free(csio->data_ptr, M_SCSIDA); if (announce_buf[0] != '\0' && ((softc->flags & DA_FLAG_ANNOUNCED) == 0)) { /* * Create our sysctl variables, now that we know * we have successfully attached. */ /* increase the refcount */ if (cam_periph_acquire(periph) == CAM_REQ_CMP) { taskqueue_enqueue(taskqueue_thread, &softc->sysctl_task); xpt_announce_periph(periph, announce_buf); xpt_announce_quirks(periph, softc->quirks, DA_Q_BIT_STRING); } else { xpt_print(periph->path, "fatal error, " "could not acquire reference count\n"); } } /* We already probed the device. */ if (softc->flags & DA_FLAG_PROBED) { daprobedone(periph, done_ccb); return; } /* Ensure re-probe doesn't see old delete. */ softc->delete_available = 0; dadeleteflag(softc, DA_DELETE_ZERO, 1); if (lbp && (softc->quirks & DA_Q_NO_UNMAP) == 0) { /* * Based on older SBC-3 spec revisions, * any of the UNMAP methods "may" be * available via LBP given this flag, so * we flag all of them as available and * then remove those which further * probes confirm aren't available * later. * * We could also check the readcap(16) * p_type flag to exclude one or more * invalid write same (X) types here. */ dadeleteflag(softc, DA_DELETE_WS16, 1); dadeleteflag(softc, DA_DELETE_WS10, 1); dadeleteflag(softc, DA_DELETE_UNMAP, 1); xpt_release_ccb(done_ccb); softc->state = DA_STATE_PROBE_LBP; xpt_schedule(periph, priority); return; } xpt_release_ccb(done_ccb); softc->state = DA_STATE_PROBE_BDC; xpt_schedule(periph, priority); return; } case DA_CCB_PROBE_LBP: { struct scsi_vpd_logical_block_prov *lbp; lbp = (struct scsi_vpd_logical_block_prov *)csio->data_ptr; if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { /* * T10/1799-D Revision 31 states at least one of these * must be supported but we don't currently enforce this.
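The SVPD_LBP bits tested next all live in byte 5 of the Logical Block Provisioning VPD page (B2h); in SBC-3, LBPU is bit 7, LBPWS bit 6 and LBPWS10 bit 5. A standalone sketch of the decode (the struct and names are illustrative):

    #include <stdint.h>

    // Provisioning commands advertised by VPD page B2h, byte 5.
    struct lbp_methods {
            int unmap;      // LBPU:    UNMAP
            int ws16;       // LBPWS:   WRITE SAME(16) with the UNMAP bit
            int ws10;       // LBPWS10: WRITE SAME(10) with the UNMAP bit
    };

    static void
    lbp_decode(const uint8_t *page, struct lbp_methods *m)
    {
            m->unmap = (page[5] & 0x80) != 0;
            m->ws16 = (page[5] & 0x40) != 0;
            m->ws10 = (page[5] & 0x20) != 0;
    }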
*/ dadeleteflag(softc, DA_DELETE_WS16, (lbp->flags & SVPD_LBP_WS16)); dadeleteflag(softc, DA_DELETE_WS10, (lbp->flags & SVPD_LBP_WS10)); dadeleteflag(softc, DA_DELETE_UNMAP, (lbp->flags & SVPD_LBP_UNMAP)); } else { int error; error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } /* * Failure indicates we don't support any SBC-3 * delete methods with UNMAP */ } } free(lbp, M_SCSIDA); xpt_release_ccb(done_ccb); softc->state = DA_STATE_PROBE_BLK_LIMITS; xpt_schedule(periph, priority); return; } case DA_CCB_PROBE_BLK_LIMITS: { struct scsi_vpd_block_limits *block_limits; block_limits = (struct scsi_vpd_block_limits *)csio->data_ptr; if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { uint32_t max_txfer_len = scsi_4btoul( block_limits->max_txfer_len); uint32_t max_unmap_lba_cnt = scsi_4btoul( block_limits->max_unmap_lba_cnt); uint32_t max_unmap_blk_cnt = scsi_4btoul( block_limits->max_unmap_blk_cnt); uint64_t ws_max_blks = scsi_8btou64( block_limits->max_write_same_length); if (max_txfer_len != 0) { softc->disk->d_maxsize = MIN(softc->maxio, (off_t)max_txfer_len * softc->params.secsize); } /* * We should already support UNMAP but we check lba * and block count to be sure */ if (max_unmap_lba_cnt != 0x00L && max_unmap_blk_cnt != 0x00L) { softc->unmap_max_lba = max_unmap_lba_cnt; softc->unmap_max_ranges = min(max_unmap_blk_cnt, UNMAP_MAX_RANGES); } else { /* * Unexpected UNMAP limits which means the * device doesn't actually support UNMAP */ dadeleteflag(softc, DA_DELETE_UNMAP, 0); } if (ws_max_blks != 0x00L) softc->ws_max_blks = ws_max_blks; } else { int error; error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } /* * Failure here doesn't mean UNMAP is not * supported as this is an optional page. */ softc->unmap_max_lba = 1; softc->unmap_max_ranges = 1; } } free(block_limits, M_SCSIDA); xpt_release_ccb(done_ccb); softc->state = DA_STATE_PROBE_BDC; xpt_schedule(periph, priority); return; } case DA_CCB_PROBE_BDC: { struct scsi_vpd_block_characteristics *bdc; bdc = (struct scsi_vpd_block_characteristics *)csio->data_ptr; if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { /* * Disable queue sorting for non-rotational media * by default. 
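The medium rotation rate read here is a 16-bit big-endian field from the Block Device Characteristics page (B1h): 0 means not reported, 1 means non-rotating (solid state), and larger values give the nominal rate in RPM; ATA IDENTIFY word 217, used by the DA_CCB_PROBE_ATA case further down, has the same encoding. A sketch of the classification (function name assumed):

    #include <stdint.h>

    // Classify a SCSI BDC / ATA IDENTIFY medium rotation rate value.
    static const char *
    rotation_class(uint16_t rate)
    {
            if (rate == 0)
                    return ("not reported");
            if (rate == 1)
                    return ("non-rotating (SSD)");
            return ("rotating");    // value is the nominal RPM
    }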
*/ u_int16_t old_rate = softc->disk->d_rotation_rate; softc->disk->d_rotation_rate = scsi_2btoul(bdc->medium_rotation_rate); if (softc->disk->d_rotation_rate == SVPD_BDC_RATE_NON_ROTATING) { softc->sort_io_queue = 0; } if (softc->disk->d_rotation_rate != old_rate) { disk_attr_changed(softc->disk, "GEOM::rotation_rate", M_NOWAIT); } } else { int error; error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } free(bdc, M_SCSIDA); xpt_release_ccb(done_ccb); softc->state = DA_STATE_PROBE_ATA; xpt_schedule(periph, priority); return; } case DA_CCB_PROBE_ATA: { int i; struct ata_params *ata_params; int16_t *ptr; ata_params = (struct ata_params *)csio->data_ptr; ptr = (uint16_t *)ata_params; if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { uint16_t old_rate; for (i = 0; i < sizeof(*ata_params) / 2; i++) ptr[i] = le16toh(ptr[i]); if (ata_params->support_dsm & ATA_SUPPORT_DSM_TRIM && (softc->quirks & DA_Q_NO_UNMAP) == 0) { dadeleteflag(softc, DA_DELETE_ATA_TRIM, 1); if (ata_params->max_dsm_blocks != 0) softc->trim_max_ranges = min( softc->trim_max_ranges, ata_params->max_dsm_blocks * ATA_DSM_BLK_RANGES); } /* * Disable queue sorting for non-rotational media * by default. */ old_rate = softc->disk->d_rotation_rate; softc->disk->d_rotation_rate = ata_params->media_rotation_rate; if (softc->disk->d_rotation_rate == ATA_RATE_NON_ROTATING) { softc->sort_io_queue = 0; } if (softc->disk->d_rotation_rate != old_rate) { disk_attr_changed(softc->disk, "GEOM::rotation_rate", M_NOWAIT); } } else { int error; error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } free(ata_params, M_SCSIDA); daprobedone(periph, done_ccb); return; } case DA_CCB_DUMP: /* No-op. We're polling */ return; case DA_CCB_TUR: { if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { if (daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA | SF_NO_RECOVERY | SF_NO_PRINT) == ERESTART) return; if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } xpt_release_ccb(done_ccb); cam_periph_release_locked(periph); return; } default: break; } xpt_release_ccb(done_ccb); } static void dareprobe(struct cam_periph *periph) { struct da_softc *softc; cam_status status; softc = (struct da_softc *)periph->softc; /* Probe in progress; don't interfere. 
*/ if (softc->state != DA_STATE_NORMAL) return; status = cam_periph_acquire(periph); KASSERT(status == CAM_REQ_CMP, ("dareprobe: cam_periph_acquire failed")); if (softc->flags & DA_FLAG_CAN_RC16) softc->state = DA_STATE_PROBE_RC16; else softc->state = DA_STATE_PROBE_RC; xpt_schedule(periph, CAM_PRIORITY_DEV); } static int daerror(union ccb *ccb, u_int32_t cam_flags, u_int32_t sense_flags) { struct da_softc *softc; struct cam_periph *periph; int error, error_code, sense_key, asc, ascq; periph = xpt_path_periph(ccb->ccb_h.path); softc = (struct da_softc *)periph->softc; /* * Automatically detect devices that do not support * READ(6)/WRITE(6) and upgrade to using 10 byte cdbs. */ error = 0; if ((ccb->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_INVALID) { error = cmd6workaround(ccb); } else if (scsi_extract_sense_ccb(ccb, &error_code, &sense_key, &asc, &ascq)) { if (sense_key == SSD_KEY_ILLEGAL_REQUEST) error = cmd6workaround(ccb); /* * If the target replied with CAPACITY DATA HAS CHANGED UA, * query the capacity and notify upper layers. */ else if (sense_key == SSD_KEY_UNIT_ATTENTION && asc == 0x2A && ascq == 0x09) { xpt_print(periph->path, "Capacity data has changed\n"); softc->flags &= ~DA_FLAG_PROBED; dareprobe(periph); sense_flags |= SF_NO_PRINT; } else if (sense_key == SSD_KEY_UNIT_ATTENTION && asc == 0x28 && ascq == 0x00) { softc->flags &= ~DA_FLAG_PROBED; disk_media_changed(softc->disk, M_NOWAIT); } else if (sense_key == SSD_KEY_UNIT_ATTENTION && asc == 0x3F && ascq == 0x03) { xpt_print(periph->path, "INQUIRY data has changed\n"); softc->flags &= ~DA_FLAG_PROBED; dareprobe(periph); sense_flags |= SF_NO_PRINT; } else if (sense_key == SSD_KEY_NOT_READY && asc == 0x3a && (softc->flags & DA_FLAG_PACK_INVALID) == 0) { softc->flags |= DA_FLAG_PACK_INVALID; disk_media_gone(softc->disk, M_NOWAIT); } } if (error == ERESTART) return (ERESTART); /* * XXX * Until we have a better way of doing pack validation, * don't treat UAs as errors. 
*/ sense_flags |= SF_RETRY_UA; if (softc->quirks & DA_Q_RETRY_BUSY) sense_flags |= SF_RETRY_BUSY; return(cam_periph_error(ccb, cam_flags, sense_flags, &softc->saved_ccb)); } static void damediapoll(void *arg) { struct cam_periph *periph = arg; struct da_softc *softc = periph->softc; if (!softc->tur && LIST_EMPTY(&softc->pending_ccbs)) { if (cam_periph_acquire(periph) == CAM_REQ_CMP) { softc->tur = 1; daschedule(periph); } } /* Queue us up again */ if (da_poll_period != 0) callout_schedule(&softc->mediapoll_c, da_poll_period * hz); } static void daprevent(struct cam_periph *periph, int action) { struct da_softc *softc; union ccb *ccb; int error; softc = (struct da_softc *)periph->softc; if (((action == PR_ALLOW) && (softc->flags & DA_FLAG_PACK_LOCKED) == 0) || ((action == PR_PREVENT) && (softc->flags & DA_FLAG_PACK_LOCKED) != 0)) { return; } ccb = cam_periph_getccb(periph, CAM_PRIORITY_NORMAL); scsi_prevent(&ccb->csio, /*retries*/1, /*cbcfp*/dadone, MSG_SIMPLE_Q_TAG, action, SSD_FULL_SIZE, 5000); error = cam_periph_runccb(ccb, daerror, CAM_RETRY_SELTO, SF_RETRY_UA | SF_NO_PRINT, softc->disk->d_devstat); if (error == 0) { if (action == PR_ALLOW) softc->flags &= ~DA_FLAG_PACK_LOCKED; else softc->flags |= DA_FLAG_PACK_LOCKED; } xpt_release_ccb(ccb); } static void dasetgeom(struct cam_periph *periph, uint32_t block_len, uint64_t maxsector, struct scsi_read_capacity_data_long *rcaplong, size_t rcap_len) { struct ccb_calc_geometry ccg; struct da_softc *softc; struct disk_params *dp; u_int lbppbe, lalba; int error; softc = (struct da_softc *)periph->softc; dp = &softc->params; dp->secsize = block_len; dp->sectors = maxsector + 1; if (rcaplong != NULL) { lbppbe = rcaplong->prot_lbppbe & SRC16_LBPPBE; lalba = scsi_2btoul(rcaplong->lalba_lbp); lalba &= SRC16_LALBA_A; } else { lbppbe = 0; lalba = 0; } if (lbppbe > 0) { dp->stripesize = block_len << lbppbe; dp->stripeoffset = (dp->stripesize - block_len * lalba) % dp->stripesize; } else if (softc->quirks & DA_Q_4K) { dp->stripesize = 4096; dp->stripeoffset = 0; } else { dp->stripesize = 0; dp->stripeoffset = 0; } /* * Have the controller provide us with a geometry * for this disk. The only time the geometry * matters is when we boot and the controller * is the only one knowledgeable enough to come * up with something that will make this a bootable * device. */ xpt_setup_ccb(&ccg.ccb_h, periph->path, CAM_PRIORITY_NORMAL); ccg.ccb_h.func_code = XPT_CALC_GEOMETRY; ccg.block_size = dp->secsize; ccg.volume_size = dp->sectors; ccg.heads = 0; ccg.secs_per_track = 0; ccg.cylinders = 0; xpt_action((union ccb*)&ccg); if ((ccg.ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { /* * We don't know what went wrong here- but just pick * a geometry so we don't have nasty things like divide * by zero. */ dp->heads = 255; dp->secs_per_track = 255; dp->cylinders = dp->sectors / (255 * 255); if (dp->cylinders == 0) { dp->cylinders = 1; } } else { dp->heads = ccg.heads; dp->secs_per_track = ccg.secs_per_track; dp->cylinders = ccg.cylinders; } /* * If the user supplied a read capacity buffer, and if it is * different than the previous buffer, update the data in the EDT. * If it's the same, we don't bother. This avoids sending an * update every time someone opens this device. 
*/ if ((rcaplong != NULL) && (bcmp(rcaplong, &softc->rcaplong, min(sizeof(softc->rcaplong), rcap_len)) != 0)) { struct ccb_dev_advinfo cdai; xpt_setup_ccb(&cdai.ccb_h, periph->path, CAM_PRIORITY_NORMAL); cdai.ccb_h.func_code = XPT_DEV_ADVINFO; cdai.buftype = CDAI_TYPE_RCAPLONG; cdai.flags = CDAI_FLAG_STORE; cdai.bufsiz = rcap_len; cdai.buf = (uint8_t *)rcaplong; xpt_action((union ccb *)&cdai); if ((cdai.ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(cdai.ccb_h.path, 0, 0, 0, FALSE); if (cdai.ccb_h.status != CAM_REQ_CMP) { xpt_print(periph->path, "%s: failed to set read " "capacity advinfo\n", __func__); /* Use cam_error_print() to decode the status */ cam_error_print((union ccb *)&cdai, CAM_ESF_CAM_STATUS, CAM_EPF_ALL); } else { bcopy(rcaplong, &softc->rcaplong, min(sizeof(softc->rcaplong), rcap_len)); } } softc->disk->d_sectorsize = softc->params.secsize; softc->disk->d_mediasize = softc->params.secsize * (off_t)softc->params.sectors; softc->disk->d_stripesize = softc->params.stripesize; softc->disk->d_stripeoffset = softc->params.stripeoffset; /* XXX: these are not actually "firmware" values, so they may be wrong */ softc->disk->d_fwsectors = softc->params.secs_per_track; softc->disk->d_fwheads = softc->params.heads; softc->disk->d_devstat->block_size = softc->params.secsize; softc->disk->d_devstat->flags &= ~DEVSTAT_BS_UNAVAILABLE; error = disk_resize(softc->disk, M_NOWAIT); if (error != 0) xpt_print(periph->path, "disk_resize(9) failed, error = %d\n", error); } static void dasendorderedtag(void *arg) { struct da_softc *softc = arg; if (da_send_ordered) { if (!LIST_EMPTY(&softc->pending_ccbs)) { if ((softc->flags & DA_FLAG_WAS_OTAG) == 0) softc->flags |= DA_FLAG_NEED_OTAG; softc->flags &= ~DA_FLAG_WAS_OTAG; } } /* Queue us up again */ callout_reset(&softc->sendordered_c, (da_default_timeout * hz) / DA_ORDEREDTAG_INTERVAL, dasendorderedtag, softc); } /* * Step through all DA peripheral drivers, and if the device is still open, * sync the disk cache to physical media. */ static void dashutdown(void * arg, int howto) { struct cam_periph *periph; struct da_softc *softc; union ccb *ccb; int error; CAM_PERIPH_FOREACH(periph, &dadriver) { softc = (struct da_softc *)periph->softc; if (SCHEDULER_STOPPED()) { /* If we paniced with the lock held, do not recurse. */ if (!cam_periph_owned(periph) && (softc->flags & DA_FLAG_OPEN)) { dadump(softc->disk, NULL, 0, 0, 0); } continue; } cam_periph_lock(periph); /* * We only sync the cache if the drive is still open, and * if the drive is capable of it.. */ if (((softc->flags & DA_FLAG_OPEN) == 0) || (softc->quirks & DA_Q_NO_SYNC_CACHE)) { cam_periph_unlock(periph); continue; } ccb = cam_periph_getccb(periph, CAM_PRIORITY_NORMAL); scsi_synchronize_cache(&ccb->csio, /*retries*/0, /*cbfcnp*/dadone, MSG_SIMPLE_Q_TAG, /*begin_lba*/0, /* whole disk */ /*lb_count*/0, SSD_FULL_SIZE, 60 * 60 * 1000); error = cam_periph_runccb(ccb, daerror, /*cam_flags*/0, /*sense_flags*/ SF_NO_RECOVERY | SF_NO_RETRY | SF_QUIET_IR, softc->disk->d_devstat); if (error != 0) xpt_print(periph->path, "Synchronize cache failed\n"); xpt_release_ccb(ccb); cam_periph_unlock(periph); } } #else /* !_KERNEL */ /* * XXX These are only left out of the kernel build to silence warnings. If, * for some reason these functions are used in the kernel, the ifdefs should * be moved so they are included both in the kernel and userland. 
*/ void scsi_format_unit(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int8_t tag_action, u_int8_t byte2, u_int16_t ileave, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int8_t sense_len, u_int32_t timeout) { struct scsi_format_unit *scsi_cmd; scsi_cmd = (struct scsi_format_unit *)&csio->cdb_io.cdb_bytes; scsi_cmd->opcode = FORMAT_UNIT; scsi_cmd->byte2 = byte2; scsi_ulto2b(ileave, scsi_cmd->interleave); cam_fill_csio(csio, retries, cbfcnp, /*flags*/ (dxfer_len > 0) ? CAM_DIR_OUT : CAM_DIR_NONE, tag_action, data_ptr, dxfer_len, sense_len, sizeof(*scsi_cmd), timeout); } void scsi_read_defects(struct ccb_scsiio *csio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint8_t tag_action, uint8_t list_format, uint32_t addr_desc_index, uint8_t *data_ptr, uint32_t dxfer_len, int minimum_cmd_size, uint8_t sense_len, uint32_t timeout) { uint8_t cdb_len; /* * These conditions allow using the 10 byte command. Otherwise we * need to use the 12 byte command. */ if ((minimum_cmd_size <= 10) && (addr_desc_index == 0) && (dxfer_len <= SRDD10_MAX_LENGTH)) { struct scsi_read_defect_data_10 *cdb10; cdb10 = (struct scsi_read_defect_data_10 *) &csio->cdb_io.cdb_bytes; cdb_len = sizeof(*cdb10); bzero(cdb10, cdb_len); cdb10->opcode = READ_DEFECT_DATA_10; cdb10->format = list_format; scsi_ulto2b(dxfer_len, cdb10->alloc_length); } else { struct scsi_read_defect_data_12 *cdb12; cdb12 = (struct scsi_read_defect_data_12 *) &csio->cdb_io.cdb_bytes; cdb_len = sizeof(*cdb12); bzero(cdb12, cdb_len); cdb12->opcode = READ_DEFECT_DATA_12; cdb12->format = list_format; scsi_ulto4b(dxfer_len, cdb12->alloc_length); scsi_ulto4b(addr_desc_index, cdb12->address_descriptor_index); } cam_fill_csio(csio, retries, cbfcnp, /*flags*/ CAM_DIR_IN, tag_action, data_ptr, dxfer_len, sense_len, cdb_len, timeout); } void scsi_sanitize(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int8_t tag_action, u_int8_t byte2, u_int16_t control, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int8_t sense_len, u_int32_t timeout) { struct scsi_sanitize *scsi_cmd; scsi_cmd = (struct scsi_sanitize *)&csio->cdb_io.cdb_bytes; scsi_cmd->opcode = SANITIZE; scsi_cmd->byte2 = byte2; scsi_cmd->control = control; scsi_ulto2b(dxfer_len, scsi_cmd->length); cam_fill_csio(csio, retries, cbfcnp, /*flags*/ (dxfer_len > 0) ? CAM_DIR_OUT : CAM_DIR_NONE, tag_action, data_ptr, dxfer_len, sense_len, sizeof(*scsi_cmd), timeout); } #endif /* _KERNEL */ Index: stable/10/sys/cam/scsi/scsi_pass.c =================================================================== --- stable/10/sys/cam/scsi/scsi_pass.c (revision 292347) +++ stable/10/sys/cam/scsi/scsi_pass.c (revision 292348) @@ -1,713 +1,2225 @@ /*- * Copyright (c) 1997, 1998, 2000 Justin T. Gibbs. * Copyright (c) 1997, 1998, 1999 Kenneth D. Merry. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions, and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. 
* * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); +#include "opt_kdtrace.h" + #include #include #include +#include #include #include -#include -#include -#include -#include +#include #include +#include +#include +#include #include +#include +#include +#include #include +#include +#include +#include +#include + #include #include #include #include +#include #include #include -#include #include +#include #include #include typedef enum { PASS_FLAG_OPEN = 0x01, PASS_FLAG_LOCKED = 0x02, PASS_FLAG_INVALID = 0x04, - PASS_FLAG_INITIAL_PHYSPATH = 0x08 + PASS_FLAG_INITIAL_PHYSPATH = 0x08, + PASS_FLAG_ZONE_INPROG = 0x10, + PASS_FLAG_ZONE_VALID = 0x20, + PASS_FLAG_UNMAPPED_CAPABLE = 0x40, + PASS_FLAG_ABANDONED_REF_SET = 0x80 } pass_flags; typedef enum { PASS_STATE_NORMAL } pass_state; typedef enum { - PASS_CCB_BUFFER_IO + PASS_CCB_BUFFER_IO, + PASS_CCB_QUEUED_IO } pass_ccb_types; #define ccb_type ppriv_field0 -#define ccb_bp ppriv_ptr1 +#define ccb_ioreq ppriv_ptr1 +/* + * The maximum number of memory segments we preallocate. + */ +#define PASS_MAX_SEGS 16 + +typedef enum { + PASS_IO_NONE = 0x00, + PASS_IO_USER_SEG_MALLOC = 0x01, + PASS_IO_KERN_SEG_MALLOC = 0x02, + PASS_IO_ABANDONED = 0x04 +} pass_io_flags; + +struct pass_io_req { + union ccb ccb; + union ccb *alloced_ccb; + union ccb *user_ccb_ptr; + camq_entry user_periph_links; + ccb_ppriv_area user_periph_priv; + struct cam_periph_map_info mapinfo; + pass_io_flags flags; + ccb_flags data_flags; + int num_user_segs; + bus_dma_segment_t user_segs[PASS_MAX_SEGS]; + int num_kern_segs; + bus_dma_segment_t kern_segs[PASS_MAX_SEGS]; + bus_dma_segment_t *user_segptr; + bus_dma_segment_t *kern_segptr; + int num_bufs; + uint32_t dirs[CAM_PERIPH_MAXMAPS]; + uint32_t lengths[CAM_PERIPH_MAXMAPS]; + uint8_t *user_bufs[CAM_PERIPH_MAXMAPS]; + uint8_t *kern_bufs[CAM_PERIPH_MAXMAPS]; + struct bintime start_time; + TAILQ_ENTRY(pass_io_req) links; +}; + struct pass_softc { - pass_state state; - pass_flags flags; - u_int8_t pd_type; - union ccb saved_ccb; - int open_count; - u_int maxio; - struct devstat *device_stats; - struct cdev *dev; - struct cdev *alias_dev; - struct task add_physpath_task; + pass_state state; + pass_flags flags; + u_int8_t pd_type; + union ccb saved_ccb; + int open_count; + u_int maxio; + struct devstat *device_stats; + struct cdev *dev; + struct cdev *alias_dev; + struct task add_physpath_task; + struct task shutdown_kqueue_task; + struct selinfo read_select; + TAILQ_HEAD(, pass_io_req) incoming_queue; + TAILQ_HEAD(, pass_io_req) active_queue; + TAILQ_HEAD(, pass_io_req) abandoned_queue; + TAILQ_HEAD(, pass_io_req) done_queue; + struct cam_periph *periph; + char zone_name[12]; + char io_zone_name[12]; + uma_zone_t pass_zone; + uma_zone_t pass_io_zone; + size_t io_zone_size; }; - static d_open_t passopen; static 
d_close_t passclose; static d_ioctl_t passioctl; static d_ioctl_t passdoioctl; +static d_poll_t passpoll; +static d_kqfilter_t passkqfilter; +static void passreadfiltdetach(struct knote *kn); +static int passreadfilt(struct knote *kn, long hint); static periph_init_t passinit; static periph_ctor_t passregister; static periph_oninv_t passoninvalidate; static periph_dtor_t passcleanup; -static void pass_add_physpath(void *context, int pending); +static periph_start_t passstart; +static void pass_shutdown_kqueue(void *context, int pending); +static void pass_add_physpath(void *context, int pending); static void passasync(void *callback_arg, u_int32_t code, struct cam_path *path, void *arg); +static void passdone(struct cam_periph *periph, + union ccb *done_ccb); +static int passcreatezone(struct cam_periph *periph); +static void passiocleanup(struct pass_softc *softc, + struct pass_io_req *io_req); +static int passcopysglist(struct cam_periph *periph, + struct pass_io_req *io_req, + ccb_flags direction); +static int passmemsetup(struct cam_periph *periph, + struct pass_io_req *io_req); +static int passmemdone(struct cam_periph *periph, + struct pass_io_req *io_req); static int passerror(union ccb *ccb, u_int32_t cam_flags, u_int32_t sense_flags); static int passsendccb(struct cam_periph *periph, union ccb *ccb, union ccb *inccb); static struct periph_driver passdriver = { passinit, "pass", TAILQ_HEAD_INITIALIZER(passdriver.units), /* generation */ 0 }; PERIPHDRIVER_DECLARE(pass, passdriver); static struct cdevsw pass_cdevsw = { .d_version = D_VERSION, .d_flags = D_TRACKCLOSE, .d_open = passopen, .d_close = passclose, .d_ioctl = passioctl, + .d_poll = passpoll, + .d_kqfilter = passkqfilter, .d_name = "pass", }; +static struct filterops passread_filtops = { + .f_isfd = 1, + .f_detach = passreadfiltdetach, + .f_event = passreadfilt +}; + +static MALLOC_DEFINE(M_SCSIPASS, "scsi_pass", "scsi passthrough buffers"); + static void passinit(void) { cam_status status; /* * Install a global async callback. This callback will * receive async callbacks like "new device found". */ status = xpt_register_async(AC_FOUND_DEVICE, passasync, NULL, NULL); if (status != CAM_REQ_CMP) { printf("pass: Failed to attach master async callback " "due to status 0x%x!\n", status); } } static void +passrejectios(struct cam_periph *periph) +{ + struct pass_io_req *io_req, *io_req2; + struct pass_softc *softc; + + softc = (struct pass_softc *)periph->softc; + + /* + * The user can no longer get status for I/O on the done queue, so + * clean up all outstanding I/O on the done queue. + */ + TAILQ_FOREACH_SAFE(io_req, &softc->done_queue, links, io_req2) { + TAILQ_REMOVE(&softc->done_queue, io_req, links); + passiocleanup(softc, io_req); + uma_zfree(softc->pass_zone, io_req); + } + + /* + * The underlying device is gone, so we can't issue these I/Os. + * The devfs node has been shut down, so we can't return status to + * the user. Free any I/O left on the incoming queue. + */ + TAILQ_FOREACH_SAFE(io_req, &softc->incoming_queue, links, io_req2) { + TAILQ_REMOVE(&softc->incoming_queue, io_req, links); + passiocleanup(softc, io_req); + uma_zfree(softc->pass_zone, io_req); + } + + /* + * Normally we would put I/Os on the abandoned queue and acquire a + * reference when we saw the final close. But, the device went + * away and devfs may have moved everything off to deadfs by the + * time the I/O done callback is called; as a result, we won't see + * any more closes. 
So, if we have any active I/Os, we need to put + * them on the abandoned queue. When the abandoned queue is empty, + * we'll release the remaining reference (see below) to the peripheral. + */ + TAILQ_FOREACH_SAFE(io_req, &softc->active_queue, links, io_req2) { + TAILQ_REMOVE(&softc->active_queue, io_req, links); + io_req->flags |= PASS_IO_ABANDONED; + TAILQ_INSERT_TAIL(&softc->abandoned_queue, io_req, links); + } + + /* + * If we put any I/O on the abandoned queue, acquire a reference. + */ + if ((!TAILQ_EMPTY(&softc->abandoned_queue)) + && ((softc->flags & PASS_FLAG_ABANDONED_REF_SET) == 0)) { + cam_periph_doacquire(periph); + softc->flags |= PASS_FLAG_ABANDONED_REF_SET; + } +} + +static void passdevgonecb(void *arg) { struct cam_periph *periph; struct mtx *mtx; struct pass_softc *softc; int i; periph = (struct cam_periph *)arg; mtx = cam_periph_mtx(periph); mtx_lock(mtx); softc = (struct pass_softc *)periph->softc; KASSERT(softc->open_count >= 0, ("Negative open count %d", softc->open_count)); /* * When we get this callback, we will get no more close calls from * devfs. So if we have any dangling opens, we need to release the * reference held for that particular context. */ for (i = 0; i < softc->open_count; i++) cam_periph_release_locked(periph); softc->open_count = 0; /* * Release the reference held for the device node, it is gone now. + * Accordingly, inform all queued I/Os of their fate. */ cam_periph_release_locked(periph); + passrejectios(periph); /* - * We reference the lock directly here, instead of using + * We reference the SIM lock directly here, instead of using * cam_periph_unlock(). The reason is that the final call to * cam_periph_release_locked() above could result in the periph * getting freed. If that is the case, dereferencing the periph * with a cam_periph_unlock() call would cause a page fault. */ mtx_unlock(mtx); + + /* + * We have to remove our kqueue context from a thread because it + * may sleep. It would be nice if we could get a callback from + * kqueue when it is done cleaning up resources. + */ + taskqueue_enqueue(taskqueue_thread, &softc->shutdown_kqueue_task); } static void passoninvalidate(struct cam_periph *periph) { struct pass_softc *softc; softc = (struct pass_softc *)periph->softc; /* * De-register any async callbacks. */ xpt_register_async(0, passasync, periph, periph->path); softc->flags |= PASS_FLAG_INVALID; /* * Tell devfs this device has gone away, and ask for a callback * when it has cleaned up its state. */ destroy_dev_sched_cb(softc->dev, passdevgonecb, periph); - - /* - * XXX Return all queued I/O with ENXIO. - * XXX Handle any transactions queued to the card - * with XPT_ABORT_CCB. - */ } static void passcleanup(struct cam_periph *periph) { struct pass_softc *softc; softc = (struct pass_softc *)periph->softc; + cam_periph_assert(periph, MA_OWNED); + KASSERT(TAILQ_EMPTY(&softc->active_queue), + ("%s called when there are commands on the active queue!\n", + __func__)); + KASSERT(TAILQ_EMPTY(&softc->abandoned_queue), + ("%s called when there are commands on the abandoned queue!\n", + __func__)); + KASSERT(TAILQ_EMPTY(&softc->incoming_queue), + ("%s called when there are commands on the incoming queue!\n", + __func__)); + KASSERT(TAILQ_EMPTY(&softc->done_queue), + ("%s called when there are commands on the done queue!\n", + __func__)); + devstat_remove_entry(softc->device_stats); cam_periph_unlock(periph); + + /* + * We call taskqueue_drain() for the physpath task to make sure it + * is complete. 
We drop the lock because this can potentially + * sleep. XXX KDM that is bad. Need a way to get a callback when + * a taskqueue is drained. + * + * Note that we don't drain the kqueue shutdown task queue. This + * is because we hold a reference on the periph for kqueue, and + * release that reference from the kqueue shutdown task queue. So + * we cannot come into this routine unless we've released that + * reference. Also, because that could be the last reference, we + * could be called from the cam_periph_release() call in + * pass_shutdown_kqueue(). In that case, the taskqueue_drain() + * would deadlock. It would be preferable if we had a way to + * get a callback when a taskqueue is done. + */ taskqueue_drain(taskqueue_thread, &softc->add_physpath_task); cam_periph_lock(periph); free(softc, M_DEVBUF); } static void +pass_shutdown_kqueue(void *context, int pending) +{ + struct cam_periph *periph; + struct pass_softc *softc; + + periph = context; + softc = periph->softc; + + knlist_clear(&softc->read_select.si_note, /*is_locked*/ 0); + knlist_destroy(&softc->read_select.si_note); + + /* + * Release the reference we held for kqueue. + */ + cam_periph_release(periph); +} + +static void pass_add_physpath(void *context, int pending) { struct cam_periph *periph; struct pass_softc *softc; + struct mtx *mtx; char *physpath; /* * If we have one, create a devfs alias for our * physical path. */ periph = context; softc = periph->softc; physpath = malloc(MAXPATHLEN, M_DEVBUF, M_WAITOK); - cam_periph_lock(periph); - if (periph->flags & CAM_PERIPH_INVALID) { - cam_periph_unlock(periph); + mtx = cam_periph_mtx(periph); + mtx_lock(mtx); + + if (periph->flags & CAM_PERIPH_INVALID) goto out; - } + if (xpt_getattr(physpath, MAXPATHLEN, "GEOM::physpath", periph->path) == 0 && strlen(physpath) != 0) { - cam_periph_unlock(periph); + mtx_unlock(mtx); make_dev_physpath_alias(MAKEDEV_WAITOK, &softc->alias_dev, softc->dev, softc->alias_dev, physpath); - cam_periph_lock(periph); + mtx_lock(mtx); } +out: /* * Now that we've made our alias, we no longer have to have a * reference to the device. */ - if ((softc->flags & PASS_FLAG_INITIAL_PHYSPATH) == 0) { + if ((softc->flags & PASS_FLAG_INITIAL_PHYSPATH) == 0) softc->flags |= PASS_FLAG_INITIAL_PHYSPATH; - cam_periph_unlock(periph); - dev_rel(softc->dev); - } - else - cam_periph_unlock(periph); -out: + /* + * We always acquire a reference to the periph before queueing this + * task queue function, so it won't go away before we run. + */ + while (pending-- > 0) + cam_periph_release_locked(periph); + mtx_unlock(mtx); + free(physpath, M_DEVBUF); } static void passasync(void *callback_arg, u_int32_t code, struct cam_path *path, void *arg) { struct cam_periph *periph; periph = (struct cam_periph *)callback_arg; switch (code) { case AC_FOUND_DEVICE: { struct ccb_getdev *cgd; cam_status status; cgd = (struct ccb_getdev *)arg; if (cgd == NULL) break; /* * Allocate a peripheral instance for * this device and start the probe * process. */ status = cam_periph_alloc(passregister, passoninvalidate, - passcleanup, NULL, "pass", + passcleanup, passstart, "pass", CAM_PERIPH_BIO, path, passasync, AC_FOUND_DEVICE, cgd); if (status != CAM_REQ_CMP && status != CAM_REQ_INPROG) { const struct cam_status_entry *entry; entry = cam_fetch_status_entry(status); printf("passasync: Unable to attach new device " "due to status %#x: %s\n", status, entry ? 
entry->status_text : "Unknown"); } break; } case AC_ADVINFO_CHANGED: { uintptr_t buftype; buftype = (uintptr_t)arg; if (buftype == CDAI_TYPE_PHYS_PATH) { struct pass_softc *softc; + cam_status status; softc = (struct pass_softc *)periph->softc; + /* + * Acquire a reference to the periph before we + * start the taskqueue, so that we don't run into + * a situation where the periph goes away before + * the task queue has a chance to run. + */ + status = cam_periph_acquire(periph); + if (status != CAM_REQ_CMP) + break; + taskqueue_enqueue(taskqueue_thread, &softc->add_physpath_task); } break; } default: cam_periph_async(periph, code, path, arg); break; } } static cam_status passregister(struct cam_periph *periph, void *arg) { struct pass_softc *softc; struct ccb_getdev *cgd; struct ccb_pathinq cpi; int no_tags; cgd = (struct ccb_getdev *)arg; if (cgd == NULL) { printf("%s: no getdev CCB, can't register device\n", __func__); return(CAM_REQ_CMP_ERR); } softc = (struct pass_softc *)malloc(sizeof(*softc), M_DEVBUF, M_NOWAIT); if (softc == NULL) { printf("%s: Unable to probe new device. " "Unable to allocate softc\n", __func__); return(CAM_REQ_CMP_ERR); } bzero(softc, sizeof(*softc)); softc->state = PASS_STATE_NORMAL; if (cgd->protocol == PROTO_SCSI || cgd->protocol == PROTO_ATAPI) softc->pd_type = SID_TYPE(&cgd->inq_data); else if (cgd->protocol == PROTO_SATAPM) softc->pd_type = T_ENCLOSURE; else softc->pd_type = T_DIRECT; periph->softc = softc; + softc->periph = periph; + TAILQ_INIT(&softc->incoming_queue); + TAILQ_INIT(&softc->active_queue); + TAILQ_INIT(&softc->abandoned_queue); + TAILQ_INIT(&softc->done_queue); + snprintf(softc->zone_name, sizeof(softc->zone_name), "%s%d", + periph->periph_name, periph->unit_number); + snprintf(softc->io_zone_name, sizeof(softc->io_zone_name), "%s%dIO", + periph->periph_name, periph->unit_number); + softc->io_zone_size = MAXPHYS; + knlist_init_mtx(&softc->read_select.si_note, cam_periph_mtx(periph)); bzero(&cpi, sizeof(cpi)); xpt_setup_ccb(&cpi.ccb_h, periph->path, CAM_PRIORITY_NORMAL); cpi.ccb_h.func_code = XPT_PATH_INQ; xpt_action((union ccb *)&cpi); if (cpi.maxio == 0) softc->maxio = DFLTPHYS; /* traditional default */ else if (cpi.maxio > MAXPHYS) softc->maxio = MAXPHYS; /* for safety */ else softc->maxio = cpi.maxio; /* real value */ + if (cpi.hba_misc & PIM_UNMAPPED) + softc->flags |= PASS_FLAG_UNMAPPED_CAPABLE; + /* * We pass in 0 for a blocksize, since we don't * know what the blocksize of this device is, if * it even has a blocksize. */ cam_periph_unlock(periph); no_tags = (cgd->inq_data.flags & SID_CmdQue) == 0; softc->device_stats = devstat_new_entry("pass", periph->unit_number, 0, DEVSTAT_NO_BLOCKSIZE | (no_tags ? DEVSTAT_NO_ORDERED_TAGS : 0), softc->pd_type | XPORT_DEVSTAT_TYPE(cpi.transport) | DEVSTAT_TYPE_PASS, DEVSTAT_PRIORITY_PASS); /* + * Initialize the taskqueue handler for shutting down kqueue. + */ + TASK_INIT(&softc->shutdown_kqueue_task, /*priority*/ 0, + pass_shutdown_kqueue, periph); + + /* + * Acquire a reference to the periph that we can release once we've + * cleaned up the kqueue. + */ + if (cam_periph_acquire(periph) != CAM_REQ_CMP) { + xpt_print(periph->path, "%s: lost periph during " + "registration!\n", __func__); + cam_periph_lock(periph); + return (CAM_REQ_CMP_ERR); + } + + /* * Acquire a reference to the periph before we create the devfs * instance for it. We'll release this reference once the devfs * instance has been freed. 
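 *
 * As an aside on the reference ledger (an editorial summary, not part
 * of the change text): the periph now holds one reference for the
 * kqueue shutdown task, dropped in pass_shutdown_kqueue(); the one
 * taken here is for the devfs node, dropped in passdevgonecb(); and
 * one more is taken below for the initial physpath task, dropped in
 * pass_add_physpath().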
*/ if (cam_periph_acquire(periph) != CAM_REQ_CMP) { xpt_print(periph->path, "%s: lost periph during " "registration!\n", __func__); cam_periph_lock(periph); return (CAM_REQ_CMP_ERR); } /* Register the device */ softc->dev = make_dev(&pass_cdevsw, periph->unit_number, UID_ROOT, GID_OPERATOR, 0600, "%s%d", periph->periph_name, periph->unit_number); /* - * Now that we have made the devfs instance, hold a reference to it - * until the task queue has run to setup the physical path alias. - * That way devfs won't get rid of the device before we add our - * alias. + * Hold a reference to the periph before we create the physical + * path alias so it can't go away. */ - dev_ref(softc->dev); + if (cam_periph_acquire(periph) != CAM_REQ_CMP) { + xpt_print(periph->path, "%s: lost periph during " + "registration!\n", __func__); + cam_periph_lock(periph); + return (CAM_REQ_CMP_ERR); + } cam_periph_lock(periph); softc->dev->si_drv1 = periph; TASK_INIT(&softc->add_physpath_task, /*priority*/0, pass_add_physpath, periph); /* * See if physical path information is already available. */ taskqueue_enqueue(taskqueue_thread, &softc->add_physpath_task); /* * Add an async callback so that we get notified if * this device goes away or its physical path * (stored in the advanced info data of the EDT) has * changed. */ xpt_register_async(AC_LOST_DEVICE | AC_ADVINFO_CHANGED, passasync, periph, periph->path); if (bootverbose) xpt_announce_periph(periph, NULL); return(CAM_REQ_CMP); } static int passopen(struct cdev *dev, int flags, int fmt, struct thread *td) { struct cam_periph *periph; struct pass_softc *softc; int error; periph = (struct cam_periph *)dev->si_drv1; if (cam_periph_acquire(periph) != CAM_REQ_CMP) return (ENXIO); cam_periph_lock(periph); softc = (struct pass_softc *)periph->softc; if (softc->flags & PASS_FLAG_INVALID) { cam_periph_release_locked(periph); cam_periph_unlock(periph); return(ENXIO); } /* * Don't allow access when we're running at a high securelevel. */ error = securelevel_gt(td->td_ucred, 1); if (error) { cam_periph_release_locked(periph); cam_periph_unlock(periph); return(error); } /* * Only allow read-write access. */ if (((flags & FWRITE) == 0) || ((flags & FREAD) == 0)) { cam_periph_release_locked(periph); cam_periph_unlock(periph); return(EPERM); } /* * We don't allow nonblocking access. */ if ((flags & O_NONBLOCK) != 0) { xpt_print(periph->path, "can't do nonblocking access\n"); cam_periph_release_locked(periph); cam_periph_unlock(periph); return(EINVAL); } softc->open_count++; cam_periph_unlock(periph); return (error); } static int passclose(struct cdev *dev, int flag, int fmt, struct thread *td) { struct cam_periph *periph; struct pass_softc *softc; struct mtx *mtx; periph = (struct cam_periph *)dev->si_drv1; if (periph == NULL) return (ENXIO); mtx = cam_periph_mtx(periph); mtx_lock(mtx); softc = periph->softc; softc->open_count--; + if (softc->open_count == 0) { + struct pass_io_req *io_req, *io_req2; + int need_unlock; + + need_unlock = 0; + + TAILQ_FOREACH_SAFE(io_req, &softc->done_queue, links, io_req2) { + TAILQ_REMOVE(&softc->done_queue, io_req, links); + passiocleanup(softc, io_req); + uma_zfree(softc->pass_zone, io_req); + } + + TAILQ_FOREACH_SAFE(io_req, &softc->incoming_queue, links, + io_req2) { + TAILQ_REMOVE(&softc->incoming_queue, io_req, links); + passiocleanup(softc, io_req); + uma_zfree(softc->pass_zone, io_req); + } + + /* + * If there are any active I/Os, we need to forcibly acquire a + * reference to the peripheral so that we don't go away + * before they complete. 
We'll release the reference when + * the abandoned queue is empty. + */ + io_req = TAILQ_FIRST(&softc->active_queue); + if ((io_req != NULL) + && (softc->flags & PASS_FLAG_ABANDONED_REF_SET) == 0) { + cam_periph_doacquire(periph); + softc->flags |= PASS_FLAG_ABANDONED_REF_SET; + } + + /* + * Since the I/O in the active queue is not under our + * control, just set a flag so that we can clean it up when + * it completes and put it on the abandoned queue. This + * will prevent our sending spurious completions in the + * event that the device is opened again before these I/Os + * complete. + */ + TAILQ_FOREACH_SAFE(io_req, &softc->active_queue, links, + io_req2) { + TAILQ_REMOVE(&softc->active_queue, io_req, links); + io_req->flags |= PASS_IO_ABANDONED; + TAILQ_INSERT_TAIL(&softc->abandoned_queue, io_req, + links); + } + } + cam_periph_release_locked(periph); /* * We reference the lock directly here, instead of using * cam_periph_unlock(). The reason is that the call to * cam_periph_release_locked() above could result in the periph * getting freed. If that is the case, dereferencing the periph * with a cam_periph_unlock() call would cause a page fault. * * cam_periph_release() avoids this problem using the same method, * but we're manually acquiring and dropping the lock here to * protect the open count and avoid another lock acquisition and * release. */ mtx_unlock(mtx); return (0); } + +static void +passstart(struct cam_periph *periph, union ccb *start_ccb) +{ + struct pass_softc *softc; + + softc = (struct pass_softc *)periph->softc; + + switch (softc->state) { + case PASS_STATE_NORMAL: { + struct pass_io_req *io_req; + + /* + * Check for any queued I/O requests that require an + * allocated slot. + */ + io_req = TAILQ_FIRST(&softc->incoming_queue); + if (io_req == NULL) { + xpt_release_ccb(start_ccb); + break; + } + TAILQ_REMOVE(&softc->incoming_queue, io_req, links); + TAILQ_INSERT_TAIL(&softc->active_queue, io_req, links); + /* + * Merge the user's CCB into the allocated CCB. + */ + xpt_merge_ccb(start_ccb, &io_req->ccb); + start_ccb->ccb_h.ccb_type = PASS_CCB_QUEUED_IO; + start_ccb->ccb_h.ccb_ioreq = io_req; + start_ccb->ccb_h.cbfcnp = passdone; + io_req->alloced_ccb = start_ccb; + binuptime(&io_req->start_time); + devstat_start_transaction(softc->device_stats, + &io_req->start_time); + + xpt_action(start_ccb); + + /* + * If we have any more I/O waiting, schedule ourselves again. + */ + if (!TAILQ_EMPTY(&softc->incoming_queue)) + xpt_schedule(periph, CAM_PRIORITY_NORMAL); + break; + } + default: + break; + } +} + +static void +passdone(struct cam_periph *periph, union ccb *done_ccb) +{ + struct pass_softc *softc; + struct ccb_scsiio *csio; + + softc = (struct pass_softc *)periph->softc; + + cam_periph_assert(periph, MA_OWNED); + + csio = &done_ccb->csio; + switch (csio->ccb_h.ccb_type) { + case PASS_CCB_QUEUED_IO: { + struct pass_io_req *io_req; + + io_req = done_ccb->ccb_h.ccb_ioreq; +#if 0 + xpt_print(periph->path, "%s: called for user CCB %p\n", + __func__, io_req->user_ccb_ptr); +#endif + if (((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) + && (done_ccb->ccb_h.flags & CAM_PASS_ERR_RECOVER) + && ((io_req->flags & PASS_IO_ABANDONED) == 0)) { + int error; + + error = passerror(done_ccb, CAM_RETRY_SELTO, + SF_RETRY_UA | SF_NO_PRINT); + + if (error == ERESTART) { + /* + * A retry was scheduled, so + * just return. + */ + return; + } + } + + /* + * Copy the allocated CCB contents back to the malloced CCB + * so we can give status back to the user when he requests it. 
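+	 * (The CCB itself carries everything the user will want to see
+	 * at completion time: the CCB status, any autosense data, and
+	 * the residual transfer count.)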
+ */ + bcopy(done_ccb, &io_req->ccb, sizeof(*done_ccb)); + + /* + * Log data/transaction completion with devstat(9). + */ + switch (done_ccb->ccb_h.func_code) { + case XPT_SCSI_IO: + devstat_end_transaction(softc->device_stats, + done_ccb->csio.dxfer_len - done_ccb->csio.resid, + done_ccb->csio.tag_action & 0x3, + ((done_ccb->ccb_h.flags & CAM_DIR_MASK) == + CAM_DIR_NONE) ? DEVSTAT_NO_DATA : + (done_ccb->ccb_h.flags & CAM_DIR_OUT) ? + DEVSTAT_WRITE : DEVSTAT_READ, NULL, + &io_req->start_time); + break; + case XPT_ATA_IO: + devstat_end_transaction(softc->device_stats, + done_ccb->ataio.dxfer_len - done_ccb->ataio.resid, + done_ccb->ataio.tag_action & 0x3, + ((done_ccb->ccb_h.flags & CAM_DIR_MASK) == + CAM_DIR_NONE) ? DEVSTAT_NO_DATA : + (done_ccb->ccb_h.flags & CAM_DIR_OUT) ? + DEVSTAT_WRITE : DEVSTAT_READ, NULL, + &io_req->start_time); + break; + case XPT_SMP_IO: + /* + * XXX KDM this isn't quite right, but there isn't + * currently an easy way to represent a bidirectional + * transfer in devstat. The only way to do it + * and have the byte counts come out right would + * mean that we would have to record two + * transactions, one for the request and one for the + * response. For now, so that we report something, + * just treat the entire thing as a read. + */ + devstat_end_transaction(softc->device_stats, + done_ccb->smpio.smp_request_len + + done_ccb->smpio.smp_response_len, + DEVSTAT_TAG_SIMPLE, DEVSTAT_READ, NULL, + &io_req->start_time); + break; + default: + devstat_end_transaction(softc->device_stats, 0, + DEVSTAT_TAG_NONE, DEVSTAT_NO_DATA, NULL, + &io_req->start_time); + break; + } + + /* + * In the normal case, take the completed I/O off of the + * active queue and put it on the done queue. Notify the + * user that we have a completed I/O. + */ + if ((io_req->flags & PASS_IO_ABANDONED) == 0) { + TAILQ_REMOVE(&softc->active_queue, io_req, links); + TAILQ_INSERT_TAIL(&softc->done_queue, io_req, links); + selwakeuppri(&softc->read_select, PRIBIO); + KNOTE_LOCKED(&softc->read_select.si_note, 0); + } else { + /* + * In the case of an abandoned I/O (final close + * without fetching the I/O), take it off of the + * abandoned queue and free it. + */ + TAILQ_REMOVE(&softc->abandoned_queue, io_req, links); + passiocleanup(softc, io_req); + uma_zfree(softc->pass_zone, io_req); + + /* + * Release the done_ccb here, since we may wind up + * freeing the peripheral when we decrement the + * reference count below. + */ + xpt_release_ccb(done_ccb); + + /* + * If the abandoned queue is empty, we can release + * our reference to the periph since we won't have + * any more completions coming. + */ + if ((TAILQ_EMPTY(&softc->abandoned_queue)) + && (softc->flags & PASS_FLAG_ABANDONED_REF_SET)) { + softc->flags &= ~PASS_FLAG_ABANDONED_REF_SET; + cam_periph_release_locked(periph); + } + + /* + * We have already released the CCB, so we can + * return. + */ + return; + } + break; + } + } + xpt_release_ccb(done_ccb); +} + static int +passcreatezone(struct cam_periph *periph) +{ + struct pass_softc *softc; + int error; + + error = 0; + softc = (struct pass_softc *)periph->softc; + + cam_periph_assert(periph, MA_OWNED); + KASSERT(((softc->flags & PASS_FLAG_ZONE_VALID) == 0), + ("%s called when the pass(4) zone is valid!\n", __func__)); + KASSERT((softc->pass_zone == NULL), + ("%s called when the pass(4) zone is allocated!\n", __func__)); + + if ((softc->flags & PASS_FLAG_ZONE_INPROG) == 0) { + + /* + * We're the first context through, so we need to create + * the pass(4) UMA zone for I/O requests. 
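+	 * Creation is serialized by a simple handshake: the first
+	 * caller sets PASS_FLAG_ZONE_INPROG, drops the periph lock
+	 * around the blocking uma_zcreate() calls, and then either
+	 * sets PASS_FLAG_ZONE_VALID and wakes any waiters, or clears
+	 * the in-progress flag on failure.  Concurrent callers
+	 * msleep() on &softc->pass_zone until one of those outcomes
+	 * is visible.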
+ */ + softc->flags |= PASS_FLAG_ZONE_INPROG; + + /* + * uma_zcreate() does a blocking (M_WAITOK) allocation, + * so we cannot hold a mutex while we call it. + */ + cam_periph_unlock(periph); + + softc->pass_zone = uma_zcreate(softc->zone_name, + sizeof(struct pass_io_req), NULL, NULL, NULL, NULL, + /*align*/ 0, /*flags*/ 0); + + softc->pass_io_zone = uma_zcreate(softc->io_zone_name, + softc->io_zone_size, NULL, NULL, NULL, NULL, + /*align*/ 0, /*flags*/ 0); + + cam_periph_lock(periph); + + if ((softc->pass_zone == NULL) + || (softc->pass_io_zone == NULL)) { + if (softc->pass_zone == NULL) + xpt_print(periph->path, "unable to allocate " + "IO Req UMA zone\n"); + else + xpt_print(periph->path, "unable to allocate " + "IO UMA zone\n"); + softc->flags &= ~PASS_FLAG_ZONE_INPROG; + goto bailout; + } + + /* + * Set the flags appropriately and notify any other waiters. + */ + softc->flags &= ~PASS_FLAG_ZONE_INPROG; + softc->flags |= PASS_FLAG_ZONE_VALID; + wakeup(&softc->pass_zone); + } else { + /* + * In this case, the UMA zone has not yet been created, but + * another context is in the process of creating it. We + * need to sleep until the creation is either done or has + * failed. + */ + while ((softc->flags & PASS_FLAG_ZONE_INPROG) + && ((softc->flags & PASS_FLAG_ZONE_VALID) == 0)) { + error = msleep(&softc->pass_zone, + cam_periph_mtx(periph), PRIBIO, + "paszon", 0); + if (error != 0) + goto bailout; + } + /* + * If the zone creation failed, no luck for the user. + */ + if ((softc->flags & PASS_FLAG_ZONE_VALID) == 0) { + error = ENOMEM; + goto bailout; + } + } +bailout: + return (error); +} + +static void +passiocleanup(struct pass_softc *softc, struct pass_io_req *io_req) +{ + union ccb *ccb; + u_int8_t **data_ptrs[CAM_PERIPH_MAXMAPS]; + int i, numbufs; + + ccb = &io_req->ccb; + + switch (ccb->ccb_h.func_code) { + case XPT_DEV_MATCH: + numbufs = min(io_req->num_bufs, 2); + + if (numbufs == 1) { + data_ptrs[0] = (u_int8_t **)&ccb->cdm.matches; + } else { + data_ptrs[0] = (u_int8_t **)&ccb->cdm.patterns; + data_ptrs[1] = (u_int8_t **)&ccb->cdm.matches; + } + break; + case XPT_SCSI_IO: + case XPT_CONT_TARGET_IO: + data_ptrs[0] = &ccb->csio.data_ptr; + numbufs = min(io_req->num_bufs, 1); + break; + case XPT_ATA_IO: + data_ptrs[0] = &ccb->ataio.data_ptr; + numbufs = min(io_req->num_bufs, 1); + break; + case XPT_SMP_IO: + numbufs = min(io_req->num_bufs, 2); + data_ptrs[0] = &ccb->smpio.smp_request; + data_ptrs[1] = &ccb->smpio.smp_response; + break; + case XPT_DEV_ADVINFO: + numbufs = min(io_req->num_bufs, 1); + data_ptrs[0] = (uint8_t **)&ccb->cdai.buf; + break; + default: + /* allow ourselves to be swapped once again */ + return; + break; /* NOTREACHED */ + } + + if (io_req->flags & PASS_IO_USER_SEG_MALLOC) { + free(io_req->user_segptr, M_SCSIPASS); + io_req->user_segptr = NULL; + } + + /* + * We only want to free memory we malloced. 
+ */ + if (io_req->data_flags == CAM_DATA_VADDR) { + for (i = 0; i < io_req->num_bufs; i++) { + if (io_req->kern_bufs[i] == NULL) + continue; + + free(io_req->kern_bufs[i], M_SCSIPASS); + io_req->kern_bufs[i] = NULL; + } + } else if (io_req->data_flags == CAM_DATA_SG) { + for (i = 0; i < io_req->num_kern_segs; i++) { + if ((uint8_t *)(uintptr_t) + io_req->kern_segptr[i].ds_addr == NULL) + continue; + + uma_zfree(softc->pass_io_zone, (uint8_t *)(uintptr_t) + io_req->kern_segptr[i].ds_addr); + io_req->kern_segptr[i].ds_addr = 0; + } + } + + if (io_req->flags & PASS_IO_KERN_SEG_MALLOC) { + free(io_req->kern_segptr, M_SCSIPASS); + io_req->kern_segptr = NULL; + } + + if (io_req->data_flags != CAM_DATA_PADDR) { + for (i = 0; i < numbufs; i++) { + /* + * Restore the user's buffer pointers to their + * previous values. + */ + if (io_req->user_bufs[i] != NULL) + *data_ptrs[i] = io_req->user_bufs[i]; + } + } + +} + +static int +passcopysglist(struct cam_periph *periph, struct pass_io_req *io_req, + ccb_flags direction) +{ + bus_size_t kern_watermark, user_watermark, len_copied, len_to_copy; + bus_dma_segment_t *user_sglist, *kern_sglist; + int i, j, error; + + error = 0; + kern_watermark = 0; + user_watermark = 0; + len_to_copy = 0; + len_copied = 0; + user_sglist = io_req->user_segptr; + kern_sglist = io_req->kern_segptr; + + for (i = 0, j = 0; i < io_req->num_user_segs && + j < io_req->num_kern_segs;) { + uint8_t *user_ptr, *kern_ptr; + + len_to_copy = min(user_sglist[i].ds_len -user_watermark, + kern_sglist[j].ds_len - kern_watermark); + + user_ptr = (uint8_t *)(uintptr_t)user_sglist[i].ds_addr; + user_ptr = user_ptr + user_watermark; + kern_ptr = (uint8_t *)(uintptr_t)kern_sglist[j].ds_addr; + kern_ptr = kern_ptr + kern_watermark; + + user_watermark += len_to_copy; + kern_watermark += len_to_copy; + + if (!useracc(user_ptr, len_to_copy, + (direction == CAM_DIR_IN) ? 
VM_PROT_WRITE : VM_PROT_READ)) { + xpt_print(periph->path, "%s: unable to access user " + "S/G list element %p len %zu\n", __func__, + user_ptr, len_to_copy); + error = EFAULT; + goto bailout; + } + + if (direction == CAM_DIR_IN) { + error = copyout(kern_ptr, user_ptr, len_to_copy); + if (error != 0) { + xpt_print(periph->path, "%s: copyout of %u " + "bytes from %p to %p failed with " + "error %d\n", __func__, len_to_copy, + kern_ptr, user_ptr, error); + goto bailout; + } + } else { + error = copyin(user_ptr, kern_ptr, len_to_copy); + if (error != 0) { + xpt_print(periph->path, "%s: copyin of %u " + "bytes from %p to %p failed with " + "error %d\n", __func__, len_to_copy, + user_ptr, kern_ptr, error); + goto bailout; + } + } + + len_copied += len_to_copy; + + if (user_sglist[i].ds_len == user_watermark) { + i++; + user_watermark = 0; + } + + if (kern_sglist[j].ds_len == kern_watermark) { + j++; + kern_watermark = 0; + } + } + +bailout: + + return (error); +} + +static int +passmemsetup(struct cam_periph *periph, struct pass_io_req *io_req) +{ + union ccb *ccb; + struct pass_softc *softc; + int numbufs, i; + uint8_t **data_ptrs[CAM_PERIPH_MAXMAPS]; + uint32_t lengths[CAM_PERIPH_MAXMAPS]; + uint32_t dirs[CAM_PERIPH_MAXMAPS]; + uint32_t num_segs; + uint16_t *seg_cnt_ptr; + size_t maxmap; + int error; + + cam_periph_assert(periph, MA_NOTOWNED); + + softc = periph->softc; + + error = 0; + ccb = &io_req->ccb; + maxmap = 0; + num_segs = 0; + seg_cnt_ptr = NULL; + + switch(ccb->ccb_h.func_code) { + case XPT_DEV_MATCH: + if (ccb->cdm.match_buf_len == 0) { + printf("%s: invalid match buffer length 0\n", __func__); + return(EINVAL); + } + if (ccb->cdm.pattern_buf_len > 0) { + data_ptrs[0] = (u_int8_t **)&ccb->cdm.patterns; + lengths[0] = ccb->cdm.pattern_buf_len; + dirs[0] = CAM_DIR_OUT; + data_ptrs[1] = (u_int8_t **)&ccb->cdm.matches; + lengths[1] = ccb->cdm.match_buf_len; + dirs[1] = CAM_DIR_IN; + numbufs = 2; + } else { + data_ptrs[0] = (u_int8_t **)&ccb->cdm.matches; + lengths[0] = ccb->cdm.match_buf_len; + dirs[0] = CAM_DIR_IN; + numbufs = 1; + } + io_req->data_flags = CAM_DATA_VADDR; + break; + case XPT_SCSI_IO: + case XPT_CONT_TARGET_IO: + if ((ccb->ccb_h.flags & CAM_DIR_MASK) == CAM_DIR_NONE) + return(0); + + /* + * The user shouldn't be able to supply a bio. + */ + if ((ccb->ccb_h.flags & CAM_DATA_MASK) == CAM_DATA_BIO) + return (EINVAL); + + io_req->data_flags = ccb->ccb_h.flags & CAM_DATA_MASK; + + data_ptrs[0] = &ccb->csio.data_ptr; + lengths[0] = ccb->csio.dxfer_len; + dirs[0] = ccb->ccb_h.flags & CAM_DIR_MASK; + num_segs = ccb->csio.sglist_cnt; + seg_cnt_ptr = &ccb->csio.sglist_cnt; + numbufs = 1; + maxmap = softc->maxio; + break; + case XPT_ATA_IO: + if ((ccb->ccb_h.flags & CAM_DIR_MASK) == CAM_DIR_NONE) + return(0); + + /* + * We only support a single virtual address for ATA I/O. 
+ */ + if ((ccb->ccb_h.flags & CAM_DATA_MASK) != CAM_DATA_VADDR) + return (EINVAL); + + io_req->data_flags = CAM_DATA_VADDR; + + data_ptrs[0] = &ccb->ataio.data_ptr; + lengths[0] = ccb->ataio.dxfer_len; + dirs[0] = ccb->ccb_h.flags & CAM_DIR_MASK; + numbufs = 1; + maxmap = softc->maxio; + break; + case XPT_SMP_IO: + io_req->data_flags = CAM_DATA_VADDR; + + data_ptrs[0] = &ccb->smpio.smp_request; + lengths[0] = ccb->smpio.smp_request_len; + dirs[0] = CAM_DIR_OUT; + data_ptrs[1] = &ccb->smpio.smp_response; + lengths[1] = ccb->smpio.smp_response_len; + dirs[1] = CAM_DIR_IN; + numbufs = 2; + maxmap = softc->maxio; + break; + case XPT_DEV_ADVINFO: + if (ccb->cdai.bufsiz == 0) + return (0); + + io_req->data_flags = CAM_DATA_VADDR; + + data_ptrs[0] = (uint8_t **)&ccb->cdai.buf; + lengths[0] = ccb->cdai.bufsiz; + dirs[0] = CAM_DIR_IN; + numbufs = 1; + break; + default: + return(EINVAL); + break; /* NOTREACHED */ + } + + io_req->num_bufs = numbufs; + + /* + * If there is a maximum, check to make sure that the user's + * request fits within the limit. In general, we should only have + * a maximum length for requests that go to hardware. Otherwise it + * is whatever we're able to malloc. + */ + for (i = 0; i < numbufs; i++) { + io_req->user_bufs[i] = *data_ptrs[i]; + io_req->dirs[i] = dirs[i]; + io_req->lengths[i] = lengths[i]; + + if (maxmap == 0) + continue; + + if (lengths[i] <= maxmap) + continue; + + xpt_print(periph->path, "%s: data length %u > max allowed %u " + "bytes\n", __func__, lengths[i], maxmap); + error = EINVAL; + goto bailout; + } + + switch (io_req->data_flags) { + case CAM_DATA_VADDR: + /* Map or copy the buffer into kernel address space */ + for (i = 0; i < numbufs; i++) { + uint8_t *tmp_buf; + + /* + * If for some reason no length is specified, we + * don't need to allocate anything. + */ + if (io_req->lengths[i] == 0) + continue; + + /* + * Make sure that the user's buffer is accessible + * to that process. + */ + if (!useracc(io_req->user_bufs[i], io_req->lengths[i], + (io_req->dirs[i] == CAM_DIR_IN) ? VM_PROT_WRITE : + VM_PROT_READ)) { + xpt_print(periph->path, "%s: user address %p " + "length %u is not accessible\n", __func__, + io_req->user_bufs[i], io_req->lengths[i]); + error = EFAULT; + goto bailout; + } + + tmp_buf = malloc(lengths[i], M_SCSIPASS, + M_WAITOK | M_ZERO); + io_req->kern_bufs[i] = tmp_buf; + *data_ptrs[i] = tmp_buf; + +#if 0 + xpt_print(periph->path, "%s: malloced %p len %u, user " + "buffer %p, operation: %s\n", __func__, + tmp_buf, lengths[i], io_req->user_bufs[i], + (dirs[i] == CAM_DIR_IN) ? "read" : "write"); +#endif + /* + * We only need to copy in if the user is writing. + */ + if (dirs[i] != CAM_DIR_OUT) + continue; + + error = copyin(io_req->user_bufs[i], + io_req->kern_bufs[i], lengths[i]); + if (error != 0) { + xpt_print(periph->path, "%s: copy of user " + "buffer from %p to %p failed with " + "error %d\n", __func__, + io_req->user_bufs[i], + io_req->kern_bufs[i], error); + goto bailout; + } + } + break; + case CAM_DATA_PADDR: + /* Pass down the pointer as-is */ + break; + case CAM_DATA_SG: { + size_t sg_length, size_to_go, alloc_size; + uint32_t num_segs_needed; + + /* + * Copy the user S/G list in, and then copy in the + * individual segments. + */ + /* + * We shouldn't see this, but check just in case. + */ + if (numbufs != 1) { + xpt_print(periph->path, "%s: cannot currently handle " + "more than one S/G list per CCB\n", __func__); + error = EINVAL; + goto bailout; + } + + /* + * We have to have at least one segment. 
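+	 * (Further down, the kernel-side buffer count works out to a
+	 * ceiling division of the total transfer length by
+	 * io_zone_size, which is normally MAXPHYS, 128KB on stable/10.
+	 * A 300KB transfer, for example, would be backed by three
+	 * kernel segments: two full 128KB buffers plus one 44KB
+	 * remainder.)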
+ */ + if (num_segs == 0) { + xpt_print(periph->path, "%s: CAM_DATA_SG flag set, " + "but sglist_cnt=0!\n", __func__); + error = EINVAL; + goto bailout; + } + + /* + * Make sure the user specified the total length and didn't + * just leave it to us to decode the S/G list. + */ + if (lengths[0] == 0) { + xpt_print(periph->path, "%s: no dxfer_len specified, " + "but CAM_DATA_SG flag is set!\n", __func__); + error = EINVAL; + goto bailout; + } + + /* + * We allocate buffers in io_zone_size increments for an + * S/G list. This will generally be MAXPHYS. + */ + if (lengths[0] <= softc->io_zone_size) + num_segs_needed = 1; + else { + num_segs_needed = lengths[0] / softc->io_zone_size; + if ((lengths[0] % softc->io_zone_size) != 0) + num_segs_needed++; + } + + /* Figure out the size of the S/G list */ + sg_length = num_segs * sizeof(bus_dma_segment_t); + io_req->num_user_segs = num_segs; + io_req->num_kern_segs = num_segs_needed; + + /* Save the user's S/G list pointer for later restoration */ + io_req->user_bufs[0] = *data_ptrs[0]; + + /* + * If the user's S/G list is too long to fit in the segments + * allocated by default in the I/O request, allocate a larger + * list; otherwise, use the statically allocated one. + */ + if (num_segs > PASS_MAX_SEGS) { + io_req->user_segptr = malloc(sizeof(bus_dma_segment_t) * + num_segs, M_SCSIPASS, M_WAITOK | M_ZERO); + io_req->flags |= PASS_IO_USER_SEG_MALLOC; + } else + io_req->user_segptr = io_req->user_segs; + + if (!useracc(*data_ptrs[0], sg_length, VM_PROT_READ)) { + xpt_print(periph->path, "%s: unable to access user " + "S/G list at %p\n", __func__, *data_ptrs[0]); + error = EFAULT; + goto bailout; + } + + error = copyin(*data_ptrs[0], io_req->user_segptr, sg_length); + if (error != 0) { + xpt_print(periph->path, "%s: copy of user S/G list " + "from %p to %p failed with error %d\n", + __func__, *data_ptrs[0], io_req->user_segptr, + error); + goto bailout; + } + + if (num_segs_needed > PASS_MAX_SEGS) { + io_req->kern_segptr = malloc(sizeof(bus_dma_segment_t) * + num_segs_needed, M_SCSIPASS, M_WAITOK | M_ZERO); + io_req->flags |= PASS_IO_KERN_SEG_MALLOC; + } else { + io_req->kern_segptr = io_req->kern_segs; + } + + /* + * Allocate the kernel S/G list. + */ + for (size_to_go = lengths[0], i = 0; + size_to_go > 0 && i < num_segs_needed; + i++, size_to_go -= alloc_size) { + uint8_t *kern_ptr; + + alloc_size = min(size_to_go, softc->io_zone_size); + kern_ptr = uma_zalloc(softc->pass_io_zone, M_WAITOK); + io_req->kern_segptr[i].ds_addr = + (bus_addr_t)(uintptr_t)kern_ptr; + io_req->kern_segptr[i].ds_len = alloc_size; + } + if (size_to_go > 0) { + printf("%s: size_to_go = %zu, software error!\n", + __func__, size_to_go); + error = EINVAL; + goto bailout; + } + + *data_ptrs[0] = (uint8_t *)io_req->kern_segptr; + *seg_cnt_ptr = io_req->num_kern_segs; + + /* + * We only need to copy data here if the user is writing. + */ + if (dirs[0] == CAM_DIR_OUT) + error = passcopysglist(periph, io_req, dirs[0]); + break; + } + case CAM_DATA_SG_PADDR: { + size_t sg_length; + + /* + * We shouldn't see this, but check just in case. + */ + if (numbufs != 1) { + printf("%s: cannot currently handle more than one " + "S/G list per CCB\n", __func__); + error = EINVAL; + goto bailout; + } + + /* + * We have to have at least one segment. + */ + if (num_segs == 0) { + xpt_print(periph->path, "%s: CAM_DATA_SG_PADDR flag " + "set, but sglist_cnt=0!\n", __func__); + error = EINVAL; + goto bailout; + } + + /* + * Make sure the user specified the total length and didn't + * just leave it to us to decode the S/G list. 
+ */ + if (lengths[0] == 0) { + xpt_print(periph->path, "%s: no dxfer_len specified, " + "but CAM_DATA_SG_PADDR flag is set!\n", __func__); + error = EINVAL; + goto bailout; + } + + /* Figure out the size of the S/G list */ + sg_length = num_segs * sizeof(bus_dma_segment_t); + io_req->num_user_segs = num_segs; + io_req->num_kern_segs = io_req->num_user_segs; + + /* Save the user's S/G list pointer for later restoration */ + io_req->user_bufs[0] = *data_ptrs[0]; + + if (num_segs > PASS_MAX_SEGS) { + io_req->user_segptr = malloc(sizeof(bus_dma_segment_t) * + num_segs, M_SCSIPASS, M_WAITOK | M_ZERO); + io_req->flags |= PASS_IO_USER_SEG_MALLOC; + } else + io_req->user_segptr = io_req->user_segs; + + io_req->kern_segptr = io_req->user_segptr; + + error = copyin(*data_ptrs[0], io_req->user_segptr, sg_length); + if (error != 0) { + xpt_print(periph->path, "%s: copy of user S/G list " + "from %p to %p failed with error %d\n", + __func__, *data_ptrs[0], io_req->user_segptr, + error); + goto bailout; + } + break; + } + default: + case CAM_DATA_BIO: + /* + * A user shouldn't be attaching a bio to the CCB. It + * isn't a user-accessible structure. + */ + error = EINVAL; + break; + } + +bailout: + if (error != 0) + passiocleanup(softc, io_req); + + return (error); +} + +static int +passmemdone(struct cam_periph *periph, struct pass_io_req *io_req) +{ + struct pass_softc *softc; + union ccb *ccb; + int error; + int i; + + error = 0; + softc = (struct pass_softc *)periph->softc; + ccb = &io_req->ccb; + + switch (io_req->data_flags) { + case CAM_DATA_VADDR: + /* + * Copy back to the user buffer if this was a read. + */ + for (i = 0; i < io_req->num_bufs; i++) { + if (io_req->dirs[i] != CAM_DIR_IN) + continue; + + error = copyout(io_req->kern_bufs[i], + io_req->user_bufs[i], io_req->lengths[i]); + if (error != 0) { + xpt_print(periph->path, "Unable to copy %u " + "bytes from %p to user address %p\n", + io_req->lengths[i], + io_req->kern_bufs[i], + io_req->user_bufs[i]); + goto bailout; + } + + } + break; + case CAM_DATA_PADDR: + /* Do nothing. The pointer is a physical address already */ + break; + case CAM_DATA_SG: + /* + * Copy back to the user buffer if this was a read. + * Restore the user's S/G list buffer pointer. + */ + if (io_req->dirs[0] == CAM_DIR_IN) + error = passcopysglist(periph, io_req, io_req->dirs[0]); + break; + case CAM_DATA_SG_PADDR: + /* + * Restore the user's S/G list buffer pointer. No need to + * copy. + */ + break; + default: + case CAM_DATA_BIO: + error = EINVAL; + break; + } + +bailout: + /* + * Reset the user's pointers to their original values and free + * allocated memory. + */ + passiocleanup(softc, io_req); + + return (error); +} + +static int passioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td) { int error; if ((error = passdoioctl(dev, cmd, addr, flag, td)) == ENOTTY) { error = cam_compat_ioctl(dev, cmd, addr, flag, td, passdoioctl); } return (error); } static int passdoioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td) { struct cam_periph *periph; struct pass_softc *softc; int error; uint32_t priority; periph = (struct cam_periph *)dev->si_drv1; if (periph == NULL) return(ENXIO); cam_periph_lock(periph); softc = (struct pass_softc *)periph->softc; error = 0; switch (cmd) { case CAMIOCOMMAND: { union ccb *inccb; union ccb *ccb; int ccb_malloced; inccb = (union ccb *)addr; /* * Some CCB types, like scan bus and scan lun can only go * through the transport layer device. 
*/ if (inccb->ccb_h.func_code & XPT_FC_XPT_ONLY) { xpt_print(periph->path, "CCB function code %#x is " "restricted to the XPT device\n", inccb->ccb_h.func_code); error = ENODEV; break; } /* Compatibility for RL/priority-unaware code. */ priority = inccb->ccb_h.pinfo.priority; if (priority <= CAM_PRIORITY_OOB) priority += CAM_PRIORITY_OOB + 1; /* * Non-immediate CCBs need a CCB from the per-device pool * of CCBs, which is scheduled by the transport layer. * Immediate CCBs and user-supplied CCBs should just be * malloced. */ if ((inccb->ccb_h.func_code & XPT_FC_QUEUED) && ((inccb->ccb_h.func_code & XPT_FC_USER_CCB) == 0)) { ccb = cam_periph_getccb(periph, priority); ccb_malloced = 0; } else { ccb = xpt_alloc_ccb_nowait(); if (ccb != NULL) xpt_setup_ccb(&ccb->ccb_h, periph->path, priority); ccb_malloced = 1; } if (ccb == NULL) { xpt_print(periph->path, "unable to allocate CCB\n"); error = ENOMEM; break; } error = passsendccb(periph, ccb, inccb); if (ccb_malloced) xpt_free_ccb(ccb); else xpt_release_ccb(ccb); break; } + case CAMIOQUEUE: + { + struct pass_io_req *io_req; + union ccb **user_ccb, *ccb; + xpt_opcode fc; + + if ((softc->flags & PASS_FLAG_ZONE_VALID) == 0) { + error = passcreatezone(periph); + if (error != 0) + goto bailout; + } + + /* + * We're going to do a blocking allocation for this I/O + * request, so we have to drop the lock. + */ + cam_periph_unlock(periph); + + io_req = uma_zalloc(softc->pass_zone, M_WAITOK | M_ZERO); + ccb = &io_req->ccb; + user_ccb = (union ccb **)addr; + + /* + * Unlike the CAMIOCOMMAND ioctl above, we only have a + * pointer to the user's CCB, so we have to copy the whole + * thing in to a buffer we have allocated (above) instead + * of allowing the ioctl code to malloc a buffer and copy + * it in. + * + * This is an advantage for this asynchronous interface, + * since we don't want the memory to get freed while the + * CCB is outstanding. + */ +#if 0 + xpt_print(periph->path, "Copying user CCB %p to " + "kernel address %p\n", *user_ccb, ccb); +#endif + error = copyin(*user_ccb, ccb, sizeof(*ccb)); + if (error != 0) { + xpt_print(periph->path, "Copy of user CCB %p to " + "kernel address %p failed with error %d\n", + *user_ccb, ccb, error); + uma_zfree(softc->pass_zone, io_req); + cam_periph_lock(periph); + break; + } + + /* + * Some CCB types, like scan bus and scan lun can only go + * through the transport layer device. + */ + if (ccb->ccb_h.func_code & XPT_FC_XPT_ONLY) { + xpt_print(periph->path, "CCB function code %#x is " + "restricted to the XPT device\n", + ccb->ccb_h.func_code); + uma_zfree(softc->pass_zone, io_req); + cam_periph_lock(periph); + error = ENODEV; + break; + } + + /* + * Save the user's CCB pointer as well as his linked list + * pointers and peripheral private area so that we can + * restore these later. + */ + io_req->user_ccb_ptr = *user_ccb; + io_req->user_periph_links = ccb->ccb_h.periph_links; + io_req->user_periph_priv = ccb->ccb_h.periph_priv; + + /* + * Now that we've saved the user's values, we can set our + * own peripheral private entry. + */ + ccb->ccb_h.ccb_ioreq = io_req; + + /* Compatibility for RL/priority-unaware code. */ + priority = ccb->ccb_h.pinfo.priority; + if (priority <= CAM_PRIORITY_OOB) + priority += CAM_PRIORITY_OOB + 1; + + /* + * Setup fields in the CCB like the path and the priority. + * The path in particular cannot be done in userland, since + * it is a pointer to a kernel data structure. 
+ */ + xpt_setup_ccb_flags(&ccb->ccb_h, periph->path, priority, + ccb->ccb_h.flags); + + /* + * Setup our done routine. There is no way for the user to + * have a valid pointer here. + */ + ccb->ccb_h.cbfcnp = passdone; + + fc = ccb->ccb_h.func_code; + /* + * If this function code has memory that can be mapped in + * or out, we need to call passmemsetup(). + */ + if ((fc == XPT_SCSI_IO) || (fc == XPT_ATA_IO) + || (fc == XPT_SMP_IO) || (fc == XPT_DEV_MATCH) + || (fc == XPT_DEV_ADVINFO)) { + error = passmemsetup(periph, io_req); + if (error != 0) { + uma_zfree(softc->pass_zone, io_req); + cam_periph_lock(periph); + break; + } + } else + io_req->mapinfo.num_bufs_used = 0; + + cam_periph_lock(periph); + + /* + * Everything goes on the incoming queue initially. + */ + TAILQ_INSERT_TAIL(&softc->incoming_queue, io_req, links); + + /* + * If the CCB is queued, and is not a user CCB, then + * we need to allocate a slot for it. Call xpt_schedule() + * so that our start routine will get called when a CCB is + * available. + */ + if ((fc & XPT_FC_QUEUED) + && ((fc & XPT_FC_USER_CCB) == 0)) { + xpt_schedule(periph, priority); + break; + } + + /* + * At this point, the CCB in question is either an + * immediate CCB (like XPT_DEV_ADVINFO) or it is a user CCB + * and therefore should be malloced, not allocated via a slot. + * Remove the CCB from the incoming queue and add it to the + * active queue. + */ + TAILQ_REMOVE(&softc->incoming_queue, io_req, links); + TAILQ_INSERT_TAIL(&softc->active_queue, io_req, links); + + xpt_action(ccb); + + /* + * If this is not a queued CCB (i.e. it is an immediate CCB), + * then it is already done. We need to put it on the done + * queue for the user to fetch. + */ + if ((fc & XPT_FC_QUEUED) == 0) { + TAILQ_REMOVE(&softc->active_queue, io_req, links); + TAILQ_INSERT_TAIL(&softc->done_queue, io_req, links); + } + break; + } + case CAMIOGET: + { + union ccb **user_ccb; + struct pass_io_req *io_req; + int old_error; + + user_ccb = (union ccb **)addr; + old_error = 0; + + io_req = TAILQ_FIRST(&softc->done_queue); + if (io_req == NULL) { + error = ENOENT; + break; + } + + /* + * Remove the I/O from the done queue. + */ + TAILQ_REMOVE(&softc->done_queue, io_req, links); + + /* + * We have to drop the lock during the copyout because the + * copyout can result in VM faults that require sleeping. + */ + cam_periph_unlock(periph); + + /* + * Do any needed copies (e.g. for reads) and revert the + * pointers in the CCB back to the user's pointers. + */ + error = passmemdone(periph, io_req); + + old_error = error; + + io_req->ccb.ccb_h.periph_links = io_req->user_periph_links; + io_req->ccb.ccb_h.periph_priv = io_req->user_periph_priv; + +#if 0 + xpt_print(periph->path, "Copying to user CCB %p from " + "kernel address %p\n", *user_ccb, &io_req->ccb); +#endif + + error = copyout(&io_req->ccb, *user_ccb, sizeof(union ccb)); + if (error != 0) { + xpt_print(periph->path, "Copy to user CCB %p from " + "kernel address %p failed with error %d\n", + *user_ccb, &io_req->ccb, error); + } + + /* + * Prefer the first error we got back, and make sure we + * don't overwrite bad status with good. + */ + if (old_error != 0) + error = old_error; + + cam_periph_lock(periph); + + /* + * At this point, if there was an error, we could potentially + * re-queue the I/O and try again. But why? The error + * would almost certainly happen again. We might as well + * not leak memory. 
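+	 *
+	 * To make the CAMIOQUEUE/CAMIOGET flow concrete, a minimal
+	 * userland consumer might look roughly like the sketch below.
+	 * This is an illustration only, not part of this change: the
+	 * device path, INQUIRY buffer size and timeout are arbitrary,
+	 * and error handling is omitted.  With a single outstanding
+	 * request, the CCB fetched back is necessarily our own.
+	 *
+	 *	#include <sys/types.h>
+	 *	#include <sys/event.h>
+	 *	#include <sys/ioctl.h>
+	 *	#include <fcntl.h>
+	 *	#include <camlib.h>
+	 *	#include <cam/scsi/scsi_message.h>
+	 *	#include <cam/scsi/scsi_pass.h>
+	 *
+	 *	struct cam_device *dev;
+	 *	union ccb *ccb;
+	 *	struct kevent kev;
+	 *	uint8_t inq_buf[256];
+	 *	int kq;
+	 *
+	 *	dev = cam_open_device("/dev/pass0", O_RDWR);
+	 *	ccb = cam_getccb(dev);
+	 *	scsi_inquiry(&ccb->csio, 1, NULL, MSG_SIMPLE_Q_TAG,
+	 *	    inq_buf, sizeof(inq_buf), 0, 0, SSD_FULL_SIZE, 5000);
+	 *
+	 *	Queue the request; the ioctl argument is the CCB pointer:
+	 *	ioctl(dev->fd, CAMIOQUEUE, ccb);
+	 *
+	 *	Sleep until the done queue becomes non-empty:
+	 *	kq = kqueue();
+	 *	EV_SET(&kev, dev->fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
+	 *	kevent(kq, &kev, 1, &kev, 1, NULL);
+	 *
+	 *	Copy the completed CCB back out and check its status:
+	 *	ioctl(dev->fd, CAMIOGET, ccb);
+	 *	if ((ccb->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP)
+	 *		... inq_buf now holds the INQUIRY data ...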
+ */ + uma_zfree(softc->pass_zone, io_req); + break; + } default: error = cam_periph_ioctl(periph, cmd, addr, passerror); break; } +bailout: cam_periph_unlock(periph); + return(error); } +static int +passpoll(struct cdev *dev, int poll_events, struct thread *td) +{ + struct cam_periph *periph; + struct pass_softc *softc; + int revents; + + periph = (struct cam_periph *)dev->si_drv1; + if (periph == NULL) + return (ENXIO); + + softc = (struct pass_softc *)periph->softc; + + revents = poll_events & (POLLOUT | POLLWRNORM); + if ((poll_events & (POLLIN | POLLRDNORM)) != 0) { + cam_periph_lock(periph); + + if (!TAILQ_EMPTY(&softc->done_queue)) { + revents |= poll_events & (POLLIN | POLLRDNORM); + } + cam_periph_unlock(periph); + if (revents == 0) + selrecord(td, &softc->read_select); + } + + return (revents); +} + +static int +passkqfilter(struct cdev *dev, struct knote *kn) +{ + struct cam_periph *periph; + struct pass_softc *softc; + + periph = (struct cam_periph *)dev->si_drv1; + if (periph == NULL) + return (ENXIO); + + softc = (struct pass_softc *)periph->softc; + + kn->kn_hook = (caddr_t)periph; + kn->kn_fop = &passread_filtops; + knlist_add(&softc->read_select.si_note, kn, 0); + + return (0); +} + +static void +passreadfiltdetach(struct knote *kn) +{ + struct cam_periph *periph; + struct pass_softc *softc; + + periph = (struct cam_periph *)kn->kn_hook; + softc = (struct pass_softc *)periph->softc; + + knlist_remove(&softc->read_select.si_note, kn, 0); +} + +static int +passreadfilt(struct knote *kn, long hint) +{ + struct cam_periph *periph; + struct pass_softc *softc; + int retval; + + periph = (struct cam_periph *)kn->kn_hook; + softc = (struct pass_softc *)periph->softc; + + cam_periph_assert(periph, MA_OWNED); + + if (TAILQ_EMPTY(&softc->done_queue)) + retval = 0; + else + retval = 1; + + return (retval); +} + /* * Generally, "ccb" should be the CCB supplied by the kernel. "inccb" * should be the CCB that is copied in from the user. */ static int passsendccb(struct cam_periph *periph, union ccb *ccb, union ccb *inccb) { struct pass_softc *softc; struct cam_periph_map_info mapinfo; xpt_opcode fc; int error; softc = (struct pass_softc *)periph->softc; /* * There are some fields in the CCB header that need to be * preserved, the rest we get from the user. */ xpt_merge_ccb(ccb, inccb); + + /* + * Set the completion routine. Whatever the user passed in for + * cbfcnp cannot be a valid kernel pointer, so we always override + * it with our own done routine. + */ + ccb->ccb_h.cbfcnp = passdone; /* * Let cam_periph_mapmem do a sanity check on the data pointer format. * Even if no data transfer is needed, it's a cheap check and it * simplifies the code. */ fc = ccb->ccb_h.func_code; if ((fc == XPT_SCSI_IO) || (fc == XPT_ATA_IO) || (fc == XPT_SMP_IO) || (fc == XPT_DEV_MATCH) || (fc == XPT_DEV_ADVINFO)) { bzero(&mapinfo, sizeof(mapinfo)); /* * cam_periph_mapmem calls into proc and vm functions that can * sleep as well as trigger I/O, so we can't hold the lock. * Dropping it here is reasonably safe. */ cam_periph_unlock(periph); error = cam_periph_mapmem(ccb, &mapinfo, softc->maxio); cam_periph_lock(periph); /* * cam_periph_mapmem returned an error, we can't continue. * Return the error to the user. */ if (error) return(error); } else /* Ensure that the unmap call later on is a no-op. */ mapinfo.num_bufs_used = 0; /* * If the user wants us to perform any error recovery, then honor * that request. Otherwise, it's up to the user to perform any * error recovery. */ cam_periph_runccb(ccb, passerror, /* cam_flags */ CAM_RETRY_SELTO, /* sense_flags */ ((ccb->ccb_h.flags & CAM_PASS_ERR_RECOVER) ? 
SF_RETRY_UA : SF_NO_RECOVERY) | SF_NO_PRINT, softc->device_stats); cam_periph_unmapmem(ccb, &mapinfo); ccb->ccb_h.cbfcnp = NULL; ccb->ccb_h.periph_priv = inccb->ccb_h.periph_priv; bcopy(ccb, inccb, sizeof(union ccb)); return(0); } static int passerror(union ccb *ccb, u_int32_t cam_flags, u_int32_t sense_flags) { struct cam_periph *periph; struct pass_softc *softc; periph = xpt_path_periph(ccb->ccb_h.path); softc = (struct pass_softc *)periph->softc; return(cam_periph_error(ccb, cam_flags, sense_flags, &softc->saved_ccb)); } Index: stable/10/sys/cam/scsi/scsi_pass.h =================================================================== --- stable/10/sys/cam/scsi/scsi_pass.h (revision 292347) +++ stable/10/sys/cam/scsi/scsi_pass.h (revision 292348) @@ -1,42 +1,50 @@ /*- * Copyright (c) 1997, 1999 Kenneth D. Merry. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _SCSI_PASS_H #define _SCSI_PASS_H 1 #include #include /* * Convert to using a pointer to a ccb in the next major version. * This should allow us to avoid an extra copy of the CCB data. */ #define CAMIOCOMMAND _IOWR(CAM_VERSION, 2, union ccb) #define CAMGETPASSTHRU _IOWR(CAM_VERSION, 3, union ccb) +/* + * These two ioctls take a union ccb *, but that is not explicitly declared + * to avoid having the ioctl handling code malloc and free their own copy + * of the CCB or the CCB pointer. + */ +#define CAMIOQUEUE _IO(CAM_VERSION, 4) +#define CAMIOGET _IO(CAM_VERSION, 5) + #endif Index: stable/10/sys/dev/md/md.c =================================================================== --- stable/10/sys/dev/md/md.c (revision 292347) +++ stable/10/sys/dev/md/md.c (revision 292348) @@ -1,1681 +1,1846 @@ /*- * ---------------------------------------------------------------------------- * "THE BEER-WARE LICENSE" (Revision 42): * wrote this file. As long as you retain this notice you * can do whatever you want with this stuff. If we meet some day, and you think * this stuff is worth it, you can buy me a beer in return. Poul-Henning Kamp * ---------------------------------------------------------------------------- * * $FreeBSD$ * */ /*- * The following functions are based in the vn(4) driver: mdstart_swap(), * mdstart_vnode(), mdcreate_swap(), mdcreate_vnode() and mddestroy(), * and as such under the following copyright: * * Copyright (c) 1988 University of Utah. 
* Copyright (c) 1990, 1993 * The Regents of the University of California. All rights reserved. * Copyright (c) 2013 The FreeBSD Foundation * All rights reserved. * * This code is derived from software contributed to Berkeley by * the Systems Programming Group of the University of Utah Computer * Science Department. * * Portions of this software were developed by Konstantin Belousov * under sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: Utah Hdr: vn.c 1.13 94/04/02 * * from: @(#)vn.c 8.6 (Berkeley) 4/1/94 * From: src/sys/dev/vn/vn.c,v 1.122 2000/12/16 16:06:03 */ #include "opt_geom.h" #include "opt_md.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include + #define MD_MODVER 1 #define MD_SHUTDOWN 0x10000 /* Tell worker thread to terminate. */ #define MD_EXITING 0x20000 /* Worker thread is exiting. */ #ifndef MD_NSECT #define MD_NSECT (10000 * 2) #endif static MALLOC_DEFINE(M_MD, "md_disk", "Memory Disk"); static MALLOC_DEFINE(M_MDSECT, "md_sectors", "Memory Disk Sectors"); static int md_debug; SYSCTL_INT(_debug, OID_AUTO, mddebug, CTLFLAG_RW, &md_debug, 0, "Enable md(4) debug messages"); static int md_malloc_wait; SYSCTL_INT(_vm, OID_AUTO, md_malloc_wait, CTLFLAG_RW, &md_malloc_wait, 0, "Allow malloc to wait for memory allocations"); #if defined(MD_ROOT) && !defined(MD_ROOT_FSTYPE) #define MD_ROOT_FSTYPE "ufs" #endif #if defined(MD_ROOT) && defined(MD_ROOT_SIZE) /* * Preloaded image gets put here. * Applications that patch the object with the image can determine * the size looking at the start and end markers (strings), * so we want them contiguous. 
*/ static struct { u_char start[MD_ROOT_SIZE*1024]; u_char end[128]; } mfs_root = { .start = "MFS Filesystem goes here", .end = "MFS Filesystem had better STOP here", }; #endif static g_init_t g_md_init; static g_fini_t g_md_fini; static g_start_t g_md_start; static g_access_t g_md_access; static void g_md_dumpconf(struct sbuf *sb, const char *indent, struct g_geom *gp, struct g_consumer *cp __unused, struct g_provider *pp); static struct cdev *status_dev = 0; static struct sx md_sx; static struct unrhdr *md_uh; static d_ioctl_t mdctlioctl; static struct cdevsw mdctl_cdevsw = { .d_version = D_VERSION, .d_ioctl = mdctlioctl, .d_name = MD_NAME, }; struct g_class g_md_class = { .name = "MD", .version = G_VERSION, .init = g_md_init, .fini = g_md_fini, .start = g_md_start, .access = g_md_access, .dumpconf = g_md_dumpconf, }; DECLARE_GEOM_CLASS(g_md_class, g_md); static LIST_HEAD(, md_s) md_softc_list = LIST_HEAD_INITIALIZER(md_softc_list); #define NINDIR (PAGE_SIZE / sizeof(uintptr_t)) #define NMASK (NINDIR-1) static int nshift; static int md_vnode_pbuf_freecnt; struct indir { uintptr_t *array; u_int total; u_int used; u_int shift; }; struct md_s { int unit; LIST_ENTRY(md_s) list; struct bio_queue_head bio_queue; struct mtx queue_mtx; struct mtx stat_mtx; struct cdev *dev; enum md_types type; off_t mediasize; unsigned sectorsize; unsigned opencount; unsigned fwheads; unsigned fwsectors; unsigned flags; char name[20]; struct proc *procp; struct g_geom *gp; struct g_provider *pp; int (*start)(struct md_s *sc, struct bio *bp); struct devstat *devstat; /* MD_MALLOC related fields */ struct indir *indir; uma_zone_t uma; /* MD_PRELOAD related fields */ u_char *pl_ptr; size_t pl_len; /* MD_VNODE related fields */ struct vnode *vnode; char file[PATH_MAX]; struct ucred *cred; /* MD_SWAP related fields */ vm_object_t object; }; static struct indir * new_indir(u_int shift) { struct indir *ip; ip = malloc(sizeof *ip, M_MD, (md_malloc_wait ? M_WAITOK : M_NOWAIT) | M_ZERO); if (ip == NULL) return (NULL); ip->array = malloc(sizeof(uintptr_t) * NINDIR, M_MDSECT, (md_malloc_wait ? M_WAITOK : M_NOWAIT) | M_ZERO); if (ip->array == NULL) { free(ip, M_MD); return (NULL); } ip->total = NINDIR; ip->shift = shift; return (ip); } static void del_indir(struct indir *ip) { free(ip->array, M_MDSECT); free(ip, M_MD); } static void destroy_indir(struct md_s *sc, struct indir *ip) { int i; for (i = 0; i < NINDIR; i++) { if (!ip->array[i]) continue; if (ip->shift) destroy_indir(sc, (struct indir*)(ip->array[i])); else if (ip->array[i] > 255) uma_zfree(sc->uma, (void *)(ip->array[i])); } del_indir(ip); } /* * This function does the math and allocates the top level "indir" structure * for a device of "size" sectors. */ static struct indir * dimension(off_t size) { off_t rcnt; struct indir *ip; int layer; rcnt = size; layer = 0; while (rcnt > NINDIR) { rcnt /= NINDIR; layer++; } /* * XXX: the top layer is probably not fully populated, so we allocate * too much space for ip->array in here. 
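 * As a worked example (editorial illustration): with 4KB pages and
 * 8-byte pointers, NINDIR is 512 and nshift is 9.  A 1GB device with
 * 512-byte sectors has 2^21 sectors; the loop above divides 2^21 by
 * 512 twice before it fits, so layer is 2, the top node gets
 * shift = 18, and lookups descend three levels (two indirect nodes
 * plus a leaf).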
*/ ip = malloc(sizeof *ip, M_MD, M_WAITOK | M_ZERO); ip->array = malloc(sizeof(uintptr_t) * NINDIR, M_MDSECT, M_WAITOK | M_ZERO); ip->total = NINDIR; ip->shift = layer * nshift; return (ip); } /* * Read a given sector */ static uintptr_t s_read(struct indir *ip, off_t offset) { struct indir *cip; int idx; uintptr_t up; if (md_debug > 1) printf("s_read(%jd)\n", (intmax_t)offset); up = 0; for (cip = ip; cip != NULL;) { if (cip->shift) { idx = (offset >> cip->shift) & NMASK; up = cip->array[idx]; cip = (struct indir *)up; continue; } idx = offset & NMASK; return (cip->array[idx]); } return (0); } /* * Write a given sector, prune the tree if the value is 0 */ static int s_write(struct indir *ip, off_t offset, uintptr_t ptr) { struct indir *cip, *lip[10]; int idx, li; uintptr_t up; if (md_debug > 1) printf("s_write(%jd, %p)\n", (intmax_t)offset, (void *)ptr); up = 0; li = 0; cip = ip; for (;;) { lip[li++] = cip; if (cip->shift) { idx = (offset >> cip->shift) & NMASK; up = cip->array[idx]; if (up != 0) { cip = (struct indir *)up; continue; } /* Allocate branch */ cip->array[idx] = (uintptr_t)new_indir(cip->shift - nshift); if (cip->array[idx] == 0) return (ENOSPC); cip->used++; up = cip->array[idx]; cip = (struct indir *)up; continue; } /* leafnode */ idx = offset & NMASK; up = cip->array[idx]; if (up != 0) cip->used--; cip->array[idx] = ptr; if (ptr != 0) cip->used++; break; } if (cip->used != 0 || li == 1) return (0); li--; while (cip->used == 0 && cip != ip) { li--; idx = (offset >> lip[li]->shift) & NMASK; up = lip[li]->array[idx]; KASSERT(up == (uintptr_t)cip, ("md screwed up")); del_indir(cip); lip[li]->array[idx] = 0; lip[li]->used--; cip = lip[li]; } return (0); } static int g_md_access(struct g_provider *pp, int r, int w, int e) { struct md_s *sc; sc = pp->geom->softc; if (sc == NULL) { if (r <= 0 && w <= 0 && e <= 0) return (0); return (ENXIO); } r += pp->acr; w += pp->acw; e += pp->ace; if ((sc->flags & MD_READONLY) != 0 && w > 0) return (EROFS); if ((pp->acr + pp->acw + pp->ace) == 0 && (r + w + e) > 0) { sc->opencount = 1; } else if ((pp->acr + pp->acw + pp->ace) > 0 && (r + w + e) == 0) { sc->opencount = 0; } return (0); } static void g_md_start(struct bio *bp) { struct md_s *sc; sc = bp->bio_to->geom->softc; if ((bp->bio_cmd == BIO_READ) || (bp->bio_cmd == BIO_WRITE)) { mtx_lock(&sc->stat_mtx); devstat_start_transaction_bio(sc->devstat, bp); mtx_unlock(&sc->stat_mtx); } mtx_lock(&sc->queue_mtx); bioq_disksort(&sc->bio_queue, bp); mtx_unlock(&sc->queue_mtx); wakeup(sc); } #define MD_MALLOC_MOVE_ZERO 1 #define MD_MALLOC_MOVE_FILL 2 #define MD_MALLOC_MOVE_READ 3 #define MD_MALLOC_MOVE_WRITE 4 #define MD_MALLOC_MOVE_CMP 5 static int -md_malloc_move(vm_page_t **mp, int *ma_offs, unsigned sectorsize, +md_malloc_move_ma(vm_page_t **mp, int *ma_offs, unsigned sectorsize, void *ptr, u_char fill, int op) { struct sf_buf *sf; vm_page_t m, *mp1; char *p, first; off_t *uc; unsigned n; int error, i, ma_offs1, sz, first_read; m = NULL; error = 0; sf = NULL; /* if (op == MD_MALLOC_MOVE_CMP) { gcc */ first = 0; first_read = 0; uc = ptr; mp1 = *mp; ma_offs1 = *ma_offs; /* } */ sched_pin(); for (n = sectorsize; n != 0; n -= sz) { sz = imin(PAGE_SIZE - *ma_offs, n); if (m != **mp) { if (sf != NULL) sf_buf_free(sf); m = **mp; sf = sf_buf_alloc(m, SFB_CPUPRIVATE | (md_malloc_wait ? 
0 : SFB_NOWAIT)); if (sf == NULL) { error = ENOMEM; break; } } p = (char *)sf_buf_kva(sf) + *ma_offs; switch (op) { case MD_MALLOC_MOVE_ZERO: bzero(p, sz); break; case MD_MALLOC_MOVE_FILL: memset(p, fill, sz); break; case MD_MALLOC_MOVE_READ: bcopy(ptr, p, sz); cpu_flush_dcache(p, sz); break; case MD_MALLOC_MOVE_WRITE: bcopy(p, ptr, sz); break; case MD_MALLOC_MOVE_CMP: for (i = 0; i < sz; i++, p++) { if (!first_read) { *uc = (u_char)*p; first = *p; first_read = 1; } else if (*p != first) { error = EDOOFUS; break; } } break; default: - KASSERT(0, ("md_malloc_move unknown op %d\n", op)); + KASSERT(0, ("md_malloc_move_ma unknown op %d\n", op)); break; } if (error != 0) break; *ma_offs += sz; *ma_offs %= PAGE_SIZE; if (*ma_offs == 0) (*mp)++; ptr = (char *)ptr + sz; } if (sf != NULL) sf_buf_free(sf); sched_unpin(); if (op == MD_MALLOC_MOVE_CMP && error != 0) { *mp = mp1; *ma_offs = ma_offs1; } return (error); } static int +md_malloc_move_vlist(bus_dma_segment_t **pvlist, int *pma_offs, + unsigned len, void *ptr, u_char fill, int op) +{ + bus_dma_segment_t *vlist; + uint8_t *p, *end, first; + off_t *uc; + int ma_offs, seg_len; + + vlist = *pvlist; + ma_offs = *pma_offs; + uc = ptr; + + for (; len != 0; len -= seg_len) { + seg_len = imin(vlist->ds_len - ma_offs, len); + p = (uint8_t *)(uintptr_t)vlist->ds_addr + ma_offs; + switch (op) { + case MD_MALLOC_MOVE_ZERO: + bzero(p, seg_len); + break; + case MD_MALLOC_MOVE_FILL: + memset(p, fill, seg_len); + break; + case MD_MALLOC_MOVE_READ: + bcopy(ptr, p, seg_len); + cpu_flush_dcache(p, seg_len); + break; + case MD_MALLOC_MOVE_WRITE: + bcopy(p, ptr, seg_len); + break; + case MD_MALLOC_MOVE_CMP: + end = p + seg_len; + first = *uc = *p; + /* Confirm all following bytes match the first */ + while (++p < end) { + if (*p != first) + return (EDOOFUS); + } + break; + default: + KASSERT(0, ("md_malloc_move_vlist unknown op %d\n", op)); + break; + } + + ma_offs += seg_len; + if (ma_offs == vlist->ds_len) { + ma_offs = 0; + vlist++; + } + ptr = (uint8_t *)ptr + seg_len; + } + *pvlist = vlist; + *pma_offs = ma_offs; + + return (0); +} + +static int mdstart_malloc(struct md_s *sc, struct bio *bp) { u_char *dst; vm_page_t *m; + bus_dma_segment_t *vlist; int i, error, error1, ma_offs, notmapped; off_t secno, nsec, uc; uintptr_t sp, osp; switch (bp->bio_cmd) { case BIO_READ: case BIO_WRITE: case BIO_DELETE: break; default: return (EOPNOTSUPP); } notmapped = (bp->bio_flags & BIO_UNMAPPED) != 0; + vlist = (bp->bio_flags & BIO_VLIST) != 0 ? 
+ (bus_dma_segment_t *)bp->bio_data : NULL; if (notmapped) { m = bp->bio_ma; ma_offs = bp->bio_ma_offset; dst = NULL; + KASSERT(vlist == NULL, ("vlists cannot be unmapped")); + } else if (vlist != NULL) { + ma_offs = bp->bio_ma_offset; + dst = NULL; } else { dst = bp->bio_data; } nsec = bp->bio_length / sc->sectorsize; secno = bp->bio_offset / sc->sectorsize; error = 0; while (nsec--) { osp = s_read(sc->indir, secno); if (bp->bio_cmd == BIO_DELETE) { if (osp != 0) error = s_write(sc->indir, secno, 0); } else if (bp->bio_cmd == BIO_READ) { if (osp == 0) { if (notmapped) { - error = md_malloc_move(&m, &ma_offs, + error = md_malloc_move_ma(&m, &ma_offs, sc->sectorsize, NULL, 0, MD_MALLOC_MOVE_ZERO); + } else if (vlist != NULL) { + error = md_malloc_move_vlist(&vlist, + &ma_offs, sc->sectorsize, NULL, 0, + MD_MALLOC_MOVE_ZERO); } else bzero(dst, sc->sectorsize); } else if (osp <= 255) { if (notmapped) { - error = md_malloc_move(&m, &ma_offs, + error = md_malloc_move_ma(&m, &ma_offs, sc->sectorsize, NULL, osp, MD_MALLOC_MOVE_FILL); + } else if (vlist != NULL) { + error = md_malloc_move_vlist(&vlist, + &ma_offs, sc->sectorsize, NULL, osp, + MD_MALLOC_MOVE_FILL); } else memset(dst, osp, sc->sectorsize); } else { if (notmapped) { - error = md_malloc_move(&m, &ma_offs, + error = md_malloc_move_ma(&m, &ma_offs, sc->sectorsize, (void *)osp, 0, MD_MALLOC_MOVE_READ); + } else if (vlist != NULL) { + error = md_malloc_move_vlist(&vlist, + &ma_offs, sc->sectorsize, + (void *)osp, 0, + MD_MALLOC_MOVE_READ); } else { bcopy((void *)osp, dst, sc->sectorsize); cpu_flush_dcache(dst, sc->sectorsize); } } osp = 0; } else if (bp->bio_cmd == BIO_WRITE) { if (sc->flags & MD_COMPRESS) { if (notmapped) { - error1 = md_malloc_move(&m, &ma_offs, + error1 = md_malloc_move_ma(&m, &ma_offs, sc->sectorsize, &uc, 0, MD_MALLOC_MOVE_CMP); i = error1 == 0 ? sc->sectorsize : 0; + } else if (vlist != NULL) { + error1 = md_malloc_move_vlist(&vlist, + &ma_offs, sc->sectorsize, &uc, 0, + MD_MALLOC_MOVE_CMP); + i = error1 == 0 ? sc->sectorsize : 0; } else { uc = dst[0]; for (i = 1; i < sc->sectorsize; i++) { if (dst[i] != uc) break; } } } else { i = 0; uc = 0; } if (i == sc->sectorsize) { if (osp != uc) error = s_write(sc->indir, secno, uc); } else { if (osp <= 255) { sp = (uintptr_t)uma_zalloc(sc->uma, md_malloc_wait ? 
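/*
 * Note on the encoding used by the loop above: the tree entry osp is
 * overloaded.  0 means the sector was never written (reads back as
 * zeros), 1..255 means every byte of the sector equals that value
 * (MD_COMPRESS stores the fill byte itself instead of a buffer), and
 * any larger value is the kernel pointer to a real sector buffer.  The
 * read-side decode is therefore:
 *
 *	if (osp == 0)
 *		bzero(dst, sc->sectorsize);		// hole
 *	else if (osp <= 255)
 *		memset(dst, osp, sc->sectorsize);	// one-byte run
 *	else
 *		bcopy((void *)osp, dst, sc->sectorsize); // stored sector
 */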
M_WAITOK : M_NOWAIT); if (sp == 0) { error = ENOSPC; break; } if (notmapped) { - error = md_malloc_move(&m, + error = md_malloc_move_ma(&m, &ma_offs, sc->sectorsize, (void *)sp, 0, MD_MALLOC_MOVE_WRITE); + } else if (vlist != NULL) { + error = md_malloc_move_vlist( + &vlist, &ma_offs, + sc->sectorsize, (void *)sp, + 0, MD_MALLOC_MOVE_WRITE); } else { bcopy(dst, (void *)sp, sc->sectorsize); } error = s_write(sc->indir, secno, sp); } else { if (notmapped) { - error = md_malloc_move(&m, + error = md_malloc_move_ma(&m, &ma_offs, sc->sectorsize, (void *)osp, 0, MD_MALLOC_MOVE_WRITE); + } else if (vlist != NULL) { + error = md_malloc_move_vlist( + &vlist, &ma_offs, + sc->sectorsize, (void *)osp, + 0, MD_MALLOC_MOVE_WRITE); } else { bcopy(dst, (void *)osp, sc->sectorsize); } osp = 0; } } } else { error = EOPNOTSUPP; } if (osp > 255) uma_zfree(sc->uma, (void*)osp); if (error != 0) break; secno++; - if (!notmapped) + if (!notmapped && vlist == NULL) dst += sc->sectorsize; } bp->bio_resid = 0; return (error); } +static void +mdcopyto_vlist(void *src, bus_dma_segment_t *vlist, off_t offset, off_t len) +{ + off_t seg_len; + + while (offset >= vlist->ds_len) { + offset -= vlist->ds_len; + vlist++; + } + + while (len != 0) { + seg_len = omin(len, vlist->ds_len - offset); + bcopy(src, (void *)(uintptr_t)(vlist->ds_addr + offset), + seg_len); + offset = 0; + src = (uint8_t *)src + seg_len; + len -= seg_len; + vlist++; + } +} + +static void +mdcopyfrom_vlist(bus_dma_segment_t *vlist, off_t offset, void *dst, off_t len) +{ + off_t seg_len; + + while (offset >= vlist->ds_len) { + offset -= vlist->ds_len; + vlist++; + } + + while (len != 0) { + seg_len = omin(len, vlist->ds_len - offset); + bcopy((void *)(uintptr_t)(vlist->ds_addr + offset), dst, + seg_len); + offset = 0; + dst = (uint8_t *)dst + seg_len; + len -= seg_len; + vlist++; + } +} + static int mdstart_preload(struct md_s *sc, struct bio *bp) { + uint8_t *p; + p = sc->pl_ptr + bp->bio_offset; switch (bp->bio_cmd) { case BIO_READ: - bcopy(sc->pl_ptr + bp->bio_offset, bp->bio_data, - bp->bio_length); + if ((bp->bio_flags & BIO_VLIST) != 0) { + mdcopyto_vlist(p, (bus_dma_segment_t *)bp->bio_data, + bp->bio_ma_offset, bp->bio_length); + } else { + bcopy(p, bp->bio_data, bp->bio_length); + } cpu_flush_dcache(bp->bio_data, bp->bio_length); break; case BIO_WRITE: - bcopy(bp->bio_data, sc->pl_ptr + bp->bio_offset, - bp->bio_length); + if ((bp->bio_flags & BIO_VLIST) != 0) { + mdcopyfrom_vlist((bus_dma_segment_t *)bp->bio_data, + bp->bio_ma_offset, p, bp->bio_length); + } else { + bcopy(bp->bio_data, p, bp->bio_length); + } break; } bp->bio_resid = 0; return (0); } static int mdstart_vnode(struct md_s *sc, struct bio *bp) { int error; struct uio auio; struct iovec aiov; + struct iovec *piov; struct mount *mp; struct vnode *vp; struct buf *pb; + bus_dma_segment_t *vlist; struct thread *td; - off_t end, zerosize; + off_t len, zerosize; + int ma_offs; switch (bp->bio_cmd) { case BIO_READ: + auio.uio_rw = UIO_READ; + break; case BIO_WRITE: case BIO_DELETE: + auio.uio_rw = UIO_WRITE; + break; case BIO_FLUSH: break; default: return (EOPNOTSUPP); } td = curthread; vp = sc->vnode; + pb = NULL; + piov = NULL; + ma_offs = bp->bio_ma_offset; /* * VNODE I/O * * If an error occurs, we set BIO_ERROR but we do not set * B_INVAL because (for a write anyway), the buffer is * still valid. 
*/ if (bp->bio_cmd == BIO_FLUSH) { (void) vn_start_write(vp, &mp, V_WAIT); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); error = VOP_FSYNC(vp, MNT_WAIT, td); VOP_UNLOCK(vp, 0); vn_finished_write(mp); return (error); } - bzero(&auio, sizeof(auio)); + auio.uio_offset = (vm_ooffset_t)bp->bio_offset; + auio.uio_resid = bp->bio_length; + auio.uio_segflg = UIO_SYSSPACE; + auio.uio_td = td; - /* - * Special case for BIO_DELETE. On the surface, this is very - * similar to BIO_WRITE, except that we write from our own - * fixed-length buffer, so we have to loop. The net result is - * that the two cases end up having very little in common. - */ if (bp->bio_cmd == BIO_DELETE) { + /* + * Emulate BIO_DELETE by writing zeros. + */ zerosize = ZERO_REGION_SIZE - (ZERO_REGION_SIZE % sc->sectorsize); - auio.uio_iov = &aiov; - auio.uio_iovcnt = 1; - auio.uio_offset = (vm_ooffset_t)bp->bio_offset; - auio.uio_segflg = UIO_SYSSPACE; - auio.uio_rw = UIO_WRITE; - auio.uio_td = td; - end = bp->bio_offset + bp->bio_length; - (void) vn_start_write(vp, &mp, V_WAIT); - vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); - error = 0; - while (auio.uio_offset < end) { - aiov.iov_base = __DECONST(void *, zero_region); - aiov.iov_len = end - auio.uio_offset; - if (aiov.iov_len > zerosize) - aiov.iov_len = zerosize; - auio.uio_resid = aiov.iov_len; - error = VOP_WRITE(vp, &auio, - sc->flags & MD_ASYNC ? 0 : IO_SYNC, sc->cred); - if (error != 0) - break; + auio.uio_iovcnt = howmany(bp->bio_length, zerosize); + piov = malloc(sizeof(*piov) * auio.uio_iovcnt, M_MD, M_WAITOK); + auio.uio_iov = piov; + len = bp->bio_length; + while (len > 0) { + piov->iov_base = __DECONST(void *, zero_region); + piov->iov_len = len; + if (len > zerosize) + piov->iov_len = zerosize; + len -= piov->iov_len; + piov++; } - VOP_UNLOCK(vp, 0); - vn_finished_write(mp); - bp->bio_resid = end - auio.uio_offset; - return (error); - } - - KASSERT(bp->bio_length <= MAXPHYS, ("bio_length %jd", - (uintmax_t)bp->bio_length)); - if ((bp->bio_flags & BIO_UNMAPPED) == 0) { - pb = NULL; - aiov.iov_base = bp->bio_data; - } else { + piov = auio.uio_iov; + } else if ((bp->bio_flags & BIO_VLIST) != 0) { + piov = malloc(sizeof(*piov) * bp->bio_ma_n, M_MD, M_WAITOK); + auio.uio_iov = piov; + vlist = (bus_dma_segment_t *)bp->bio_data; + len = bp->bio_length; + while (len > 0) { + piov->iov_base = (void *)(uintptr_t)(vlist->ds_addr + + ma_offs); + piov->iov_len = vlist->ds_len - ma_offs; + if (piov->iov_len > len) + piov->iov_len = len; + len -= piov->iov_len; + ma_offs = 0; + vlist++; + piov++; + } + auio.uio_iovcnt = piov - auio.uio_iov; + piov = auio.uio_iov; + } else if ((bp->bio_flags & BIO_UNMAPPED) != 0) { pb = getpbuf(&md_vnode_pbuf_freecnt); pmap_qenter((vm_offset_t)pb->b_data, bp->bio_ma, bp->bio_ma_n); - aiov.iov_base = (void *)((vm_offset_t)pb->b_data + - bp->bio_ma_offset); + aiov.iov_base = (void *)((vm_offset_t)pb->b_data + ma_offs); + aiov.iov_len = bp->bio_length; + auio.uio_iov = &aiov; + auio.uio_iovcnt = 1; + } else { + aiov.iov_base = bp->bio_data; + aiov.iov_len = bp->bio_length; + auio.uio_iov = &aiov; + auio.uio_iovcnt = 1; } - aiov.iov_len = bp->bio_length; - auio.uio_iov = &aiov; - auio.uio_iovcnt = 1; - auio.uio_offset = (vm_ooffset_t)bp->bio_offset; - auio.uio_segflg = UIO_SYSSPACE; - if (bp->bio_cmd == BIO_READ) - auio.uio_rw = UIO_READ; - else if (bp->bio_cmd == BIO_WRITE) - auio.uio_rw = UIO_WRITE; - else - panic("wrong BIO_OP in mdstart_vnode"); - auio.uio_resid = bp->bio_length; - auio.uio_td = td; /* * When reading set IO_DIRECT to try to avoid double-caching * 
the data. When writing IO_DIRECT is not optimal. */ - if (bp->bio_cmd == BIO_READ) { + if (auio.uio_rw == UIO_READ) { vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); error = VOP_READ(vp, &auio, IO_DIRECT, sc->cred); VOP_UNLOCK(vp, 0); } else { (void) vn_start_write(vp, &mp, V_WAIT); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); error = VOP_WRITE(vp, &auio, sc->flags & MD_ASYNC ? 0 : IO_SYNC, sc->cred); VOP_UNLOCK(vp, 0); vn_finished_write(mp); } - if ((bp->bio_flags & BIO_UNMAPPED) != 0) { + + if (pb) { pmap_qremove((vm_offset_t)pb->b_data, bp->bio_ma_n); relpbuf(pb, &md_vnode_pbuf_freecnt); } + + if (piov != NULL) + free(piov, M_MD); + bp->bio_resid = auio.uio_resid; return (error); } static int mdstart_swap(struct md_s *sc, struct bio *bp) { vm_page_t m; u_char *p; vm_pindex_t i, lastp; + bus_dma_segment_t *vlist; int rv, ma_offs, offs, len, lastend; switch (bp->bio_cmd) { case BIO_READ: case BIO_WRITE: case BIO_DELETE: break; default: return (EOPNOTSUPP); } p = bp->bio_data; - ma_offs = (bp->bio_flags & BIO_UNMAPPED) == 0 ? 0 : bp->bio_ma_offset; + ma_offs = (bp->bio_flags & (BIO_UNMAPPED|BIO_VLIST)) != 0 ? + bp->bio_ma_offset : 0; + vlist = (bp->bio_flags & BIO_VLIST) != 0 ? + (bus_dma_segment_t *)bp->bio_data : NULL; /* * offs is the offset at which to start operating on the * next (ie, first) page. lastp is the last page on * which we're going to operate. lastend is the ending * position within that last page (ie, PAGE_SIZE if * we're operating on complete aligned pages). */ offs = bp->bio_offset % PAGE_SIZE; lastp = (bp->bio_offset + bp->bio_length - 1) / PAGE_SIZE; lastend = (bp->bio_offset + bp->bio_length - 1) % PAGE_SIZE + 1; rv = VM_PAGER_OK; VM_OBJECT_WLOCK(sc->object); vm_object_pip_add(sc->object, 1); for (i = bp->bio_offset / PAGE_SIZE; i <= lastp; i++) { len = ((i == lastp) ? lastend : PAGE_SIZE) - offs; m = vm_page_grab(sc->object, i, VM_ALLOC_SYSTEM); if (bp->bio_cmd == BIO_READ) { if (m->valid == VM_PAGE_BITS_ALL) rv = VM_PAGER_OK; else rv = vm_pager_get_pages(sc->object, &m, 1, 0); if (rv == VM_PAGER_ERROR) { vm_page_xunbusy(m); break; } else if (rv == VM_PAGER_FAIL) { /* * Pager does not have the page. Zero * the allocated page, and mark it as * valid. Do not set dirty, the page * can be recreated if thrown out. 
*/ pmap_zero_page(m); m->valid = VM_PAGE_BITS_ALL; } if ((bp->bio_flags & BIO_UNMAPPED) != 0) { pmap_copy_pages(&m, offs, bp->bio_ma, ma_offs, len); + } else if ((bp->bio_flags & BIO_VLIST) != 0) { + physcopyout_vlist(VM_PAGE_TO_PHYS(m) + offs, + vlist, ma_offs, len); + cpu_flush_dcache(p, len); } else { physcopyout(VM_PAGE_TO_PHYS(m) + offs, p, len); cpu_flush_dcache(p, len); } } else if (bp->bio_cmd == BIO_WRITE) { if (len != PAGE_SIZE && m->valid != VM_PAGE_BITS_ALL) rv = vm_pager_get_pages(sc->object, &m, 1, 0); else rv = VM_PAGER_OK; if (rv == VM_PAGER_ERROR) { vm_page_xunbusy(m); break; } if ((bp->bio_flags & BIO_UNMAPPED) != 0) { pmap_copy_pages(bp->bio_ma, ma_offs, &m, offs, len); + } else if ((bp->bio_flags & BIO_VLIST) != 0) { + physcopyin_vlist(vlist, ma_offs, + VM_PAGE_TO_PHYS(m) + offs, len); } else { physcopyin(p, VM_PAGE_TO_PHYS(m) + offs, len); } m->valid = VM_PAGE_BITS_ALL; } else if (bp->bio_cmd == BIO_DELETE) { if (len != PAGE_SIZE && m->valid != VM_PAGE_BITS_ALL) rv = vm_pager_get_pages(sc->object, &m, 1, 0); else rv = VM_PAGER_OK; if (rv == VM_PAGER_ERROR) { vm_page_xunbusy(m); break; } if (len != PAGE_SIZE) { pmap_zero_page_area(m, offs, len); vm_page_clear_dirty(m, offs, len); m->valid = VM_PAGE_BITS_ALL; } else vm_pager_page_unswapped(m); } vm_page_xunbusy(m); vm_page_lock(m); if (bp->bio_cmd == BIO_DELETE && len == PAGE_SIZE) vm_page_free(m); else vm_page_activate(m); vm_page_unlock(m); if (bp->bio_cmd == BIO_WRITE) { vm_page_dirty(m); vm_pager_page_unswapped(m); } /* Actions on further pages start at offset 0 */ p += PAGE_SIZE - offs; offs = 0; ma_offs += len; } vm_object_pip_subtract(sc->object, 1); VM_OBJECT_WUNLOCK(sc->object); return (rv != VM_PAGER_ERROR ? 0 : ENOSPC); } static int mdstart_null(struct md_s *sc, struct bio *bp) { switch (bp->bio_cmd) { case BIO_READ: bzero(bp->bio_data, bp->bio_length); cpu_flush_dcache(bp->bio_data, bp->bio_length); break; case BIO_WRITE: break; } bp->bio_resid = 0; return (0); } static void md_kthread(void *arg) { struct md_s *sc; struct bio *bp; int error; sc = arg; thread_lock(curthread); sched_prio(curthread, PRIBIO); thread_unlock(curthread); if (sc->type == MD_VNODE) curthread->td_pflags |= TDP_NORUNNINGBUF; for (;;) { mtx_lock(&sc->queue_mtx); if (sc->flags & MD_SHUTDOWN) { sc->flags |= MD_EXITING; mtx_unlock(&sc->queue_mtx); kproc_exit(0); } bp = bioq_takefirst(&sc->bio_queue); if (!bp) { msleep(sc, &sc->queue_mtx, PRIBIO | PDROP, "mdwait", 0); continue; } mtx_unlock(&sc->queue_mtx); if (bp->bio_cmd == BIO_GETATTR) { if ((sc->fwsectors && sc->fwheads && (g_handleattr_int(bp, "GEOM::fwsectors", sc->fwsectors) || g_handleattr_int(bp, "GEOM::fwheads", sc->fwheads))) || g_handleattr_int(bp, "GEOM::candelete", 1)) error = -1; else error = EOPNOTSUPP; } else { error = sc->start(sc, bp); } if (error != -1) { bp->bio_completed = bp->bio_length; if ((bp->bio_cmd == BIO_READ) || (bp->bio_cmd == BIO_WRITE)) devstat_end_transaction_bio(sc->devstat, bp); g_io_deliver(bp, error); } } } static struct md_s * mdfind(int unit) { struct md_s *sc; LIST_FOREACH(sc, &md_softc_list, list) { if (sc->unit == unit) break; } return (sc); } static struct md_s * mdnew(int unit, int *errp, enum md_types type) { struct md_s *sc; int error; *errp = 0; if (unit == -1) unit = alloc_unr(md_uh); else unit = alloc_unr_specific(md_uh, unit); if (unit == -1) { *errp = EBUSY; return (NULL); } sc = (struct md_s *)malloc(sizeof *sc, M_MD, M_WAITOK | M_ZERO); sc->type = type; bioq_init(&sc->bio_queue); mtx_init(&sc->queue_mtx, "md bio queue", NULL, MTX_DEF); 
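/*
 * The queue_mtx initialized here is the hand-off point of the driver:
 * g_md_start() runs in the GEOM dispatch path and only enqueues, while
 * md_kthread() performs the actual I/O.  The pattern, condensed from
 * the two functions above:
 *
 *	// producer (g_md_start)
 *	mtx_lock(&sc->queue_mtx);
 *	bioq_disksort(&sc->bio_queue, bp);
 *	mtx_unlock(&sc->queue_mtx);
 *	wakeup(sc);
 *
 *	// consumer (md_kthread main loop)
 *	mtx_lock(&sc->queue_mtx);
 *	bp = bioq_takefirst(&sc->bio_queue);
 *	if (bp == NULL) {
 *		// PDROP releases queue_mtx while asleep
 *		msleep(sc, &sc->queue_mtx, PRIBIO | PDROP, "mdwait", 0);
 *		continue;
 *	}
 *	mtx_unlock(&sc->queue_mtx);
 *	error = sc->start(sc, bp);
 */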
mtx_init(&sc->stat_mtx, "md stat", NULL, MTX_DEF); sc->unit = unit; sprintf(sc->name, "md%d", unit); LIST_INSERT_HEAD(&md_softc_list, sc, list); error = kproc_create(md_kthread, sc, &sc->procp, 0, 0,"%s", sc->name); if (error == 0) return (sc); LIST_REMOVE(sc, list); mtx_destroy(&sc->stat_mtx); mtx_destroy(&sc->queue_mtx); free_unr(md_uh, sc->unit); free(sc, M_MD); *errp = error; return (NULL); } static void mdinit(struct md_s *sc) { struct g_geom *gp; struct g_provider *pp; g_topology_lock(); gp = g_new_geomf(&g_md_class, "md%d", sc->unit); gp->softc = sc; pp = g_new_providerf(gp, "md%d", sc->unit); pp->flags |= G_PF_DIRECT_SEND | G_PF_DIRECT_RECEIVE; pp->mediasize = sc->mediasize; pp->sectorsize = sc->sectorsize; switch (sc->type) { case MD_MALLOC: case MD_VNODE: case MD_SWAP: pp->flags |= G_PF_ACCEPT_UNMAPPED; break; case MD_PRELOAD: case MD_NULL: break; } sc->gp = gp; sc->pp = pp; g_error_provider(pp, 0); g_topology_unlock(); sc->devstat = devstat_new_entry("md", sc->unit, sc->sectorsize, DEVSTAT_ALL_SUPPORTED, DEVSTAT_TYPE_DIRECT, DEVSTAT_PRIORITY_MAX); } static int mdcreate_malloc(struct md_s *sc, struct md_ioctl *mdio) { uintptr_t sp; int error; off_t u; error = 0; if (mdio->md_options & ~(MD_AUTOUNIT | MD_COMPRESS | MD_RESERVE)) return (EINVAL); if (mdio->md_sectorsize != 0 && !powerof2(mdio->md_sectorsize)) return (EINVAL); /* Compression doesn't make sense if we have reserved space */ if (mdio->md_options & MD_RESERVE) mdio->md_options &= ~MD_COMPRESS; if (mdio->md_fwsectors != 0) sc->fwsectors = mdio->md_fwsectors; if (mdio->md_fwheads != 0) sc->fwheads = mdio->md_fwheads; sc->flags = mdio->md_options & (MD_COMPRESS | MD_FORCE); sc->indir = dimension(sc->mediasize / sc->sectorsize); sc->uma = uma_zcreate(sc->name, sc->sectorsize, NULL, NULL, NULL, NULL, 0x1ff, 0); if (mdio->md_options & MD_RESERVE) { off_t nsectors; nsectors = sc->mediasize / sc->sectorsize; for (u = 0; u < nsectors; u++) { sp = (uintptr_t)uma_zalloc(sc->uma, (md_malloc_wait ? M_WAITOK : M_NOWAIT) | M_ZERO); if (sp != 0) error = s_write(sc->indir, u, sp); else error = ENOMEM; if (error != 0) break; } } return (error); } static int mdsetcred(struct md_s *sc, struct ucred *cred) { char *tmpbuf; int error = 0; /* * Set credits in our softc */ if (sc->cred) crfree(sc->cred); sc->cred = crhold(cred); /* * Horrible kludge to establish credentials for NFS XXX. */ if (sc->vnode) { struct uio auio; struct iovec aiov; tmpbuf = malloc(sc->sectorsize, M_TEMP, M_WAITOK); bzero(&auio, sizeof(auio)); aiov.iov_base = tmpbuf; aiov.iov_len = sc->sectorsize; auio.uio_iov = &aiov; auio.uio_iovcnt = 1; auio.uio_offset = 0; auio.uio_rw = UIO_READ; auio.uio_segflg = UIO_SYSSPACE; auio.uio_resid = aiov.iov_len; vn_lock(sc->vnode, LK_EXCLUSIVE | LK_RETRY); error = VOP_READ(sc->vnode, &auio, 0, sc->cred); VOP_UNLOCK(sc->vnode, 0); free(tmpbuf, M_TEMP); } return (error); } static int mdcreate_vnode(struct md_s *sc, struct md_ioctl *mdio, struct thread *td) { struct vattr vattr; struct nameidata nd; char *fname; int error, flags; /* * Kernel-originated requests must have the filename appended * to the mdio structure to protect against malicious software. */ fname = mdio->md_file; if ((void *)fname != (void *)(mdio + 1)) { error = copyinstr(fname, sc->file, sizeof(sc->file), NULL); if (error != 0) return (error); } else strlcpy(sc->file, fname, sizeof(sc->file)); /* * If the user specified that this is a read only device, don't * set the FWRITE mask before trying to open the backing store. 
*/ flags = FREAD | ((mdio->md_options & MD_READONLY) ? 0 : FWRITE); NDINIT(&nd, LOOKUP, FOLLOW, UIO_SYSSPACE, sc->file, td); error = vn_open(&nd, &flags, 0, NULL); if (error != 0) return (error); NDFREE(&nd, NDF_ONLY_PNBUF); if (nd.ni_vp->v_type != VREG) { error = EINVAL; goto bad; } error = VOP_GETATTR(nd.ni_vp, &vattr, td->td_ucred); if (error != 0) goto bad; if (VOP_ISLOCKED(nd.ni_vp) != LK_EXCLUSIVE) { vn_lock(nd.ni_vp, LK_UPGRADE | LK_RETRY); if (nd.ni_vp->v_iflag & VI_DOOMED) { /* Forced unmount. */ error = EBADF; goto bad; } } nd.ni_vp->v_vflag |= VV_MD; VOP_UNLOCK(nd.ni_vp, 0); if (mdio->md_fwsectors != 0) sc->fwsectors = mdio->md_fwsectors; if (mdio->md_fwheads != 0) sc->fwheads = mdio->md_fwheads; sc->flags = mdio->md_options & (MD_FORCE | MD_ASYNC); if (!(flags & FWRITE)) sc->flags |= MD_READONLY; sc->vnode = nd.ni_vp; error = mdsetcred(sc, td->td_ucred); if (error != 0) { sc->vnode = NULL; vn_lock(nd.ni_vp, LK_EXCLUSIVE | LK_RETRY); nd.ni_vp->v_vflag &= ~VV_MD; goto bad; } return (0); bad: VOP_UNLOCK(nd.ni_vp, 0); (void)vn_close(nd.ni_vp, flags, td->td_ucred, td); return (error); } static int mddestroy(struct md_s *sc, struct thread *td) { if (sc->gp) { sc->gp->softc = NULL; g_topology_lock(); g_wither_geom(sc->gp, ENXIO); g_topology_unlock(); sc->gp = NULL; sc->pp = NULL; } if (sc->devstat) { devstat_remove_entry(sc->devstat); sc->devstat = NULL; } mtx_lock(&sc->queue_mtx); sc->flags |= MD_SHUTDOWN; wakeup(sc); while (!(sc->flags & MD_EXITING)) msleep(sc->procp, &sc->queue_mtx, PRIBIO, "mddestroy", hz / 10); mtx_unlock(&sc->queue_mtx); mtx_destroy(&sc->stat_mtx); mtx_destroy(&sc->queue_mtx); if (sc->vnode != NULL) { vn_lock(sc->vnode, LK_EXCLUSIVE | LK_RETRY); sc->vnode->v_vflag &= ~VV_MD; VOP_UNLOCK(sc->vnode, 0); (void)vn_close(sc->vnode, sc->flags & MD_READONLY ? 
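/*
 * mdsetcred()'s "horrible kludge" above is simply a throwaway
 * one-sector VOP_READ() issued under the new credentials so that NFS
 * caches them for the vnode.  The single-segment kernel-space uio it
 * builds is the same boilerplate the mapped path of mdstart_vnode()
 * uses:
 *
 *	aiov.iov_base = buf;			// kernel buffer
 *	aiov.iov_len = len;
 *	auio.uio_iov = &aiov;
 *	auio.uio_iovcnt = 1;
 *	auio.uio_offset = off;
 *	auio.uio_rw = UIO_READ;			// or UIO_WRITE
 *	auio.uio_segflg = UIO_SYSSPACE;		// kernel, not user, addresses
 *	auio.uio_resid = len;
 *	auio.uio_td = td;
 *	error = VOP_READ(vp, &auio, 0, cred);
 *
 * The BIO_DELETE and BIO_VLIST cases differ only in allocating an iovec
 * array and setting uio_iovcnt accordingly; the delete path's iovecs
 * all alias the read-only zero_region, so a single VOP_WRITE() zeros
 * the whole range.
 */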
FREAD : (FREAD|FWRITE), sc->cred, td); } if (sc->cred != NULL) crfree(sc->cred); if (sc->object != NULL) vm_object_deallocate(sc->object); if (sc->indir) destroy_indir(sc, sc->indir); if (sc->uma) uma_zdestroy(sc->uma); LIST_REMOVE(sc, list); free_unr(md_uh, sc->unit); free(sc, M_MD); return (0); } static int mdresize(struct md_s *sc, struct md_ioctl *mdio) { int error, res; vm_pindex_t oldpages, newpages; switch (sc->type) { case MD_VNODE: case MD_NULL: break; case MD_SWAP: if (mdio->md_mediasize <= 0 || (mdio->md_mediasize % PAGE_SIZE) != 0) return (EDOM); oldpages = OFF_TO_IDX(round_page(sc->mediasize)); newpages = OFF_TO_IDX(round_page(mdio->md_mediasize)); if (newpages < oldpages) { VM_OBJECT_WLOCK(sc->object); vm_object_page_remove(sc->object, newpages, 0, 0); swap_pager_freespace(sc->object, newpages, oldpages - newpages); swap_release_by_cred(IDX_TO_OFF(oldpages - newpages), sc->cred); sc->object->charge = IDX_TO_OFF(newpages); sc->object->size = newpages; VM_OBJECT_WUNLOCK(sc->object); } else if (newpages > oldpages) { res = swap_reserve_by_cred(IDX_TO_OFF(newpages - oldpages), sc->cred); if (!res) return (ENOMEM); if ((mdio->md_options & MD_RESERVE) || (sc->flags & MD_RESERVE)) { error = swap_pager_reserve(sc->object, oldpages, newpages - oldpages); if (error < 0) { swap_release_by_cred( IDX_TO_OFF(newpages - oldpages), sc->cred); return (EDOM); } } VM_OBJECT_WLOCK(sc->object); sc->object->charge = IDX_TO_OFF(newpages); sc->object->size = newpages; VM_OBJECT_WUNLOCK(sc->object); } break; default: return (EOPNOTSUPP); } sc->mediasize = mdio->md_mediasize; g_topology_lock(); g_resize_provider(sc->pp, sc->mediasize); g_topology_unlock(); return (0); } static int mdcreate_swap(struct md_s *sc, struct md_ioctl *mdio, struct thread *td) { vm_ooffset_t npage; int error; /* * Range check. Disallow negative sizes or any size less than the * size of a page. Then round to a page. */ if (sc->mediasize <= 0 || (sc->mediasize % PAGE_SIZE) != 0) return (EDOM); /* * Allocate an OBJT_SWAP object. * * Note the truncation. */ npage = mdio->md_mediasize / PAGE_SIZE; if (mdio->md_fwsectors != 0) sc->fwsectors = mdio->md_fwsectors; if (mdio->md_fwheads != 0) sc->fwheads = mdio->md_fwheads; sc->object = vm_pager_allocate(OBJT_SWAP, NULL, PAGE_SIZE * npage, VM_PROT_DEFAULT, 0, td->td_ucred); if (sc->object == NULL) return (ENOMEM); sc->flags = mdio->md_options & (MD_FORCE | MD_RESERVE); if (mdio->md_options & MD_RESERVE) { if (swap_pager_reserve(sc->object, 0, npage) < 0) { error = EDOM; goto finish; } } error = mdsetcred(sc, td->td_ucred); finish: if (error != 0) { vm_object_deallocate(sc->object); sc->object = NULL; } return (error); } static int mdcreate_null(struct md_s *sc, struct md_ioctl *mdio, struct thread *td) { /* * Range check. Disallow negative sizes or any size less than the * size of a page. Then round to a page.
*/ if (sc->mediasize <= 0 || (sc->mediasize % PAGE_SIZE) != 0) return (EDOM); return (0); } static int xmdctlioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags, struct thread *td) { struct md_ioctl *mdio; struct md_s *sc; int error, i; unsigned sectsize; if (md_debug) printf("mdctlioctl(%s %lx %p %x %p)\n", devtoname(dev), cmd, addr, flags, td); mdio = (struct md_ioctl *)addr; if (mdio->md_version != MDIOVERSION) return (EINVAL); /* * We assert the version number in the individual ioctl * handlers instead of out here because (a) it is possible we * may add another ioctl in the future which doesn't read an * mdio, and (b) the correct return value for an unknown ioctl * is ENOIOCTL, not EINVAL. */ error = 0; switch (cmd) { case MDIOCATTACH: switch (mdio->md_type) { case MD_MALLOC: case MD_PRELOAD: case MD_VNODE: case MD_SWAP: case MD_NULL: break; default: return (EINVAL); } if (mdio->md_sectorsize == 0) sectsize = DEV_BSIZE; else sectsize = mdio->md_sectorsize; if (sectsize > MAXPHYS || mdio->md_mediasize < sectsize) return (EINVAL); if (mdio->md_options & MD_AUTOUNIT) sc = mdnew(-1, &error, mdio->md_type); else { if (mdio->md_unit > INT_MAX) return (EINVAL); sc = mdnew(mdio->md_unit, &error, mdio->md_type); } if (sc == NULL) return (error); if (mdio->md_options & MD_AUTOUNIT) mdio->md_unit = sc->unit; sc->mediasize = mdio->md_mediasize; sc->sectorsize = sectsize; error = EDOOFUS; switch (sc->type) { case MD_MALLOC: sc->start = mdstart_malloc; error = mdcreate_malloc(sc, mdio); break; case MD_PRELOAD: /* * We disallow attaching preloaded memory disks via * ioctl. Preloaded memory disks are automatically * attached in g_md_init(). */ error = EOPNOTSUPP; break; case MD_VNODE: sc->start = mdstart_vnode; error = mdcreate_vnode(sc, mdio, td); break; case MD_SWAP: sc->start = mdstart_swap; error = mdcreate_swap(sc, mdio, td); break; case MD_NULL: sc->start = mdstart_null; error = mdcreate_null(sc, mdio, td); break; } if (error != 0) { mddestroy(sc, td); return (error); } /* Prune off any residual fractional sector */ i = sc->mediasize % sc->sectorsize; sc->mediasize -= i; mdinit(sc); return (0); case MDIOCDETACH: if (mdio->md_mediasize != 0 || (mdio->md_options & ~MD_FORCE) != 0) return (EINVAL); sc = mdfind(mdio->md_unit); if (sc == NULL) return (ENOENT); if (sc->opencount != 0 && !(sc->flags & MD_FORCE) && !(mdio->md_options & MD_FORCE)) return (EBUSY); return (mddestroy(sc, td)); case MDIOCRESIZE: if ((mdio->md_options & ~(MD_FORCE | MD_RESERVE)) != 0) return (EINVAL); sc = mdfind(mdio->md_unit); if (sc == NULL) return (ENOENT); if (mdio->md_mediasize < sc->sectorsize) return (EINVAL); if (mdio->md_mediasize < sc->mediasize && !(sc->flags & MD_FORCE) && !(mdio->md_options & MD_FORCE)) return (EBUSY); return (mdresize(sc, mdio)); case MDIOCQUERY: sc = mdfind(mdio->md_unit); if (sc == NULL) return (ENOENT); mdio->md_type = sc->type; mdio->md_options = sc->flags; mdio->md_mediasize = sc->mediasize; mdio->md_sectorsize = sc->sectorsize; if (sc->type == MD_VNODE) error = copyout(sc->file, mdio->md_file, strlen(sc->file) + 1); return (error); case MDIOCLIST: i = 1; LIST_FOREACH(sc, &md_softc_list, list) { if (i == MDNPAD - 1) mdio->md_pad[i] = -1; else mdio->md_pad[i++] = sc->unit; } mdio->md_pad[0] = i - 1; return (0); default: return (ENOIOCTL); }; } static int mdctlioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags, struct thread *td) { int error; sx_xlock(&md_sx); error = xmdctlioctl(dev, cmd, addr, flags, td); sx_xunlock(&md_sx); return (error); } static void 
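/*
 * For reference, a minimal userland caller of the MDIOCATTACH case
 * above looks roughly like what mdconfig(8) does (hypothetical
 * snippet; needs <sys/mdioctl.h> and <fcntl.h>, error handling
 * omitted):
 *
 *	struct md_ioctl mdio;
 *	int fd;
 *
 *	memset(&mdio, 0, sizeof(mdio));
 *	mdio.md_version = MDIOVERSION;
 *	mdio.md_type = MD_SWAP;
 *	mdio.md_mediasize = 1024 * 1024;	// 1 MB
 *	mdio.md_options = MD_AUTOUNIT;		// kernel picks the unit
 *	fd = open("/dev/" MDCTL_NAME, O_RDWR);
 *	ioctl(fd, MDIOCATTACH, &mdio);
 *	// on success, mdio.md_unit holds the allocated unit number
 */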
md_preloaded(u_char *image, size_t length, const char *name) { struct md_s *sc; int error; sc = mdnew(-1, &error, MD_PRELOAD); if (sc == NULL) return; sc->mediasize = length; sc->sectorsize = DEV_BSIZE; sc->pl_ptr = image; sc->pl_len = length; sc->start = mdstart_preload; #ifdef MD_ROOT if (sc->unit == 0) rootdevnames[0] = MD_ROOT_FSTYPE ":/dev/md0"; #endif mdinit(sc); if (name != NULL) { printf("%s%d: Preloaded image <%s> %zd bytes at %p\n", MD_NAME, sc->unit, name, length, image); } } static void g_md_init(struct g_class *mp __unused) { caddr_t mod; u_char *ptr, *name, *type; unsigned len; int i; /* figure out log2(NINDIR) */ for (i = NINDIR, nshift = -1; i; nshift++) i >>= 1; mod = NULL; sx_init(&md_sx, "MD config lock"); g_topology_unlock(); md_uh = new_unrhdr(0, INT_MAX, NULL); #ifdef MD_ROOT_SIZE sx_xlock(&md_sx); md_preloaded(mfs_root.start, sizeof(mfs_root.start), NULL); sx_xunlock(&md_sx); #endif /* XXX: are preload_* static or do they need Giant ? */ while ((mod = preload_search_next_name(mod)) != NULL) { name = (char *)preload_search_info(mod, MODINFO_NAME); if (name == NULL) continue; type = (char *)preload_search_info(mod, MODINFO_TYPE); if (type == NULL) continue; if (strcmp(type, "md_image") && strcmp(type, "mfs_root")) continue; ptr = preload_fetch_addr(mod); len = preload_fetch_size(mod); if (ptr != NULL && len != 0) { sx_xlock(&md_sx); md_preloaded(ptr, len, name); sx_xunlock(&md_sx); } } md_vnode_pbuf_freecnt = nswbuf / 10; status_dev = make_dev(&mdctl_cdevsw, INT_MAX, UID_ROOT, GID_WHEEL, 0600, MDCTL_NAME); g_topology_lock(); } static void g_md_dumpconf(struct sbuf *sb, const char *indent, struct g_geom *gp, struct g_consumer *cp __unused, struct g_provider *pp) { struct md_s *mp; char *type; mp = gp->softc; if (mp == NULL) return; switch (mp->type) { case MD_MALLOC: type = "malloc"; break; case MD_PRELOAD: type = "preload"; break; case MD_VNODE: type = "vnode"; break; case MD_SWAP: type = "swap"; break; case MD_NULL: type = "null"; break; default: type = "unknown"; break; } if (pp != NULL) { if (indent == NULL) { sbuf_printf(sb, " u %d", mp->unit); sbuf_printf(sb, " s %ju", (uintmax_t) mp->sectorsize); sbuf_printf(sb, " f %ju", (uintmax_t) mp->fwheads); sbuf_printf(sb, " fs %ju", (uintmax_t) mp->fwsectors); sbuf_printf(sb, " l %ju", (uintmax_t) mp->mediasize); sbuf_printf(sb, " t %s", type); if (mp->type == MD_VNODE && mp->vnode != NULL) sbuf_printf(sb, " file %s", mp->file); } else { sbuf_printf(sb, "%s%d\n", indent, mp->unit); sbuf_printf(sb, "%s%ju\n", indent, (uintmax_t) mp->sectorsize); sbuf_printf(sb, "%s%ju\n", indent, (uintmax_t) mp->fwheads); sbuf_printf(sb, "%s%ju\n", indent, (uintmax_t) mp->fwsectors); sbuf_printf(sb, "%s%ju\n", indent, (uintmax_t) mp->mediasize); sbuf_printf(sb, "%s%s\n", indent, (mp->flags & MD_COMPRESS) == 0 ? "off": "on"); sbuf_printf(sb, "%s%s\n", indent, (mp->flags & MD_READONLY) == 0 ? 
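/*
 * Worked example for the log2 loop at the top of g_md_init() above,
 *
 *	for (i = NINDIR, nshift = -1; i; nshift++)
 *		i >>= 1;
 *
 * with NINDIR == 512: i takes the values 512, 256, ..., 1 across ten
 * iterations, leaving nshift == 9 == log2(NINDIR), so each level of the
 * sector indirection tree (see s_read()/s_write()) consumes nine bits
 * of the sector number.
 */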
"read-write": "read-only"); sbuf_printf(sb, "%s%s\n", indent, type); if (mp->type == MD_VNODE && mp->vnode != NULL) { sbuf_printf(sb, "%s", indent); g_conf_printf_escaped(sb, "%s", mp->file); sbuf_printf(sb, "\n"); } } } } static void g_md_fini(struct g_class *mp __unused) { sx_destroy(&md_sx); if (status_dev != NULL) destroy_dev(status_dev); delete_unrhdr(md_uh); } Index: stable/10/sys/geom/geom_disk.c =================================================================== --- stable/10/sys/geom/geom_disk.c (revision 292347) +++ stable/10/sys/geom/geom_disk.c (revision 292348) @@ -1,810 +1,930 @@ /*- * Copyright (c) 2002 Poul-Henning Kamp * Copyright (c) 2002 Networks Associates Technology, Inc. * All rights reserved. * * This software was developed for the FreeBSD Project by Poul-Henning Kamp * and NAI Labs, the Security Research Division of Network Associates, Inc. * under DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the * DARPA CHATS research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The names of the authors may not be used to endorse or promote * products derived from this software without specific prior written * permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #include __FBSDID("$FreeBSD$"); #include "opt_geom.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include + struct g_disk_softc { struct mtx done_mtx; struct disk *dp; struct sysctl_ctx_list sysctl_ctx; struct sysctl_oid *sysctl_tree; char led[64]; uint32_t state; struct mtx start_mtx; }; static g_access_t g_disk_access; static g_start_t g_disk_start; static g_ioctl_t g_disk_ioctl; static g_dumpconf_t g_disk_dumpconf; static g_provgone_t g_disk_providergone; static struct g_class g_disk_class = { .name = G_DISK_CLASS_NAME, .version = G_VERSION, .start = g_disk_start, .access = g_disk_access, .ioctl = g_disk_ioctl, .providergone = g_disk_providergone, .dumpconf = g_disk_dumpconf, }; SYSCTL_DECL(_kern_geom); static SYSCTL_NODE(_kern_geom, OID_AUTO, disk, CTLFLAG_RW, 0, "GEOM_DISK stuff"); DECLARE_GEOM_CLASS(g_disk_class, g_disk); static void __inline g_disk_lock_giant(struct disk *dp) { if (dp->d_flags & DISKFLAG_NEEDSGIANT) mtx_lock(&Giant); } static void __inline g_disk_unlock_giant(struct disk *dp) { if (dp->d_flags & DISKFLAG_NEEDSGIANT) mtx_unlock(&Giant); } static int g_disk_access(struct g_provider *pp, int r, int w, int e) { struct disk *dp; struct g_disk_softc *sc; int error; g_trace(G_T_ACCESS, "g_disk_access(%s, %d, %d, %d)", pp->name, r, w, e); g_topology_assert(); sc = pp->private; if (sc == NULL || (dp = sc->dp) == NULL || dp->d_destroyed) { /* * Allow decreasing access count even if disk is not * available anymore. */ if (r <= 0 && w <= 0 && e <= 0) return (0); return (ENXIO); } r += pp->acr; w += pp->acw; e += pp->ace; error = 0; if ((pp->acr + pp->acw + pp->ace) == 0 && (r + w + e) > 0) { if (dp->d_open != NULL) { g_disk_lock_giant(dp); error = dp->d_open(dp); if (bootverbose && error != 0) printf("Opened disk %s -> %d\n", pp->name, error); g_disk_unlock_giant(dp); if (error != 0) return (error); } pp->mediasize = dp->d_mediasize; pp->sectorsize = dp->d_sectorsize; if (dp->d_maxsize == 0) { printf("WARNING: Disk drive %s%d has no d_maxsize\n", dp->d_name, dp->d_unit); dp->d_maxsize = DFLTPHYS; } if (dp->d_delmaxsize == 0) { if (bootverbose && dp->d_flags & DISKFLAG_CANDELETE) { printf("WARNING: Disk drive %s%d has no " "d_delmaxsize\n", dp->d_name, dp->d_unit); } dp->d_delmaxsize = dp->d_maxsize; } pp->stripeoffset = dp->d_stripeoffset; pp->stripesize = dp->d_stripesize; dp->d_flags |= DISKFLAG_OPEN; } else if ((pp->acr + pp->acw + pp->ace) > 0 && (r + w + e) == 0) { if (dp->d_close != NULL) { g_disk_lock_giant(dp); error = dp->d_close(dp); if (error != 0) printf("Closed disk %s -> %d\n", pp->name, error); g_disk_unlock_giant(dp); } sc->state = G_STATE_ACTIVE; if (sc->led[0] != 0) led_set(sc->led, "0"); dp->d_flags &= ~DISKFLAG_OPEN; } return (error); } static void g_disk_kerneldump(struct bio *bp, struct disk *dp) { struct g_kerneldump *gkd; struct g_geom *gp; gkd = (struct g_kerneldump*)bp->bio_data; gp = bp->bio_to->geom; g_trace(G_T_TOPOLOGY, "g_disk_kerneldump(%s, %jd, %jd)", gp->name, (intmax_t)gkd->offset, (intmax_t)gkd->length); if (dp->d_dump == NULL) { g_io_deliver(bp, ENODEV); return; } gkd->di.dumper = dp->d_dump; gkd->di.priv = dp; gkd->di.blocksize = dp->d_sectorsize; gkd->di.maxiosize = dp->d_maxsize; gkd->di.mediaoffset = gkd->offset; if ((gkd->offset + gkd->length) > dp->d_mediasize) gkd->length = dp->d_mediasize - gkd->offset; gkd->di.mediasize = gkd->length; g_io_deliver(bp, 0); } static void g_disk_setstate(struct bio *bp, struct g_disk_softc
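/*
 * Note on g_disk_access() above: in a GEOM access method, r, w and e
 * are deltas requested by the caller, not absolute counts.  After the
 * "r += pp->acr" adjustments they hold the would-be new totals, so
 * first-open and last-close detection reduces to comparing the old and
 * new sums:
 *
 *	r += pp->acr; w += pp->acw; e += pp->ace;
 *	if ((pp->acr + pp->acw + pp->ace) == 0 && (r + w + e) > 0)
 *		error = dp->d_open(dp);		// 0 -> nonzero: first open
 *	else if ((pp->acr + pp->acw + pp->ace) > 0 && (r + w + e) == 0)
 *		error = dp->d_close(dp);	// nonzero -> 0: last close
 */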
*sc) { const char *cmd; memcpy(&sc->state, bp->bio_data, sizeof(sc->state)); if (sc->led[0] != 0) { switch (sc->state) { case G_STATE_FAILED: cmd = "1"; break; case G_STATE_REBUILD: cmd = "f5"; break; case G_STATE_RESYNC: cmd = "f1"; break; default: cmd = "0"; break; } led_set(sc->led, cmd); } g_io_deliver(bp, 0); } static void g_disk_done(struct bio *bp) { struct bintime now; struct bio *bp2; struct g_disk_softc *sc; /* See "notes" for why we need a mutex here */ /* XXX: will witness accept a mix of Giant/unGiant drivers here ? */ bp2 = bp->bio_parent; sc = bp2->bio_to->private; bp->bio_completed = bp->bio_length - bp->bio_resid; binuptime(&now); mtx_lock(&sc->done_mtx); if (bp2->bio_error == 0) bp2->bio_error = bp->bio_error; bp2->bio_completed += bp->bio_completed; if ((bp->bio_cmd & (BIO_READ|BIO_WRITE|BIO_DELETE|BIO_FLUSH)) != 0) devstat_end_transaction_bio_bt(sc->dp->d_devstat, bp, &now); bp2->bio_inbed++; if (bp2->bio_children == bp2->bio_inbed) { mtx_unlock(&sc->done_mtx); bp2->bio_resid = bp2->bio_bcount - bp2->bio_completed; g_io_deliver(bp2, bp2->bio_error); } else mtx_unlock(&sc->done_mtx); g_destroy_bio(bp); } static int g_disk_ioctl(struct g_provider *pp, u_long cmd, void * data, int fflag, struct thread *td) { struct disk *dp; struct g_disk_softc *sc; int error; sc = pp->private; dp = sc->dp; if (dp->d_ioctl == NULL) return (ENOIOCTL); g_disk_lock_giant(dp); error = dp->d_ioctl(dp, cmd, data, fflag, td); g_disk_unlock_giant(dp); return (error); } +static off_t +g_disk_maxsize(struct disk *dp, struct bio *bp) +{ + if (bp->bio_cmd == BIO_DELETE) + return (dp->d_delmaxsize); + return (dp->d_maxsize); +} + +static int +g_disk_maxsegs(struct disk *dp, struct bio *bp) +{ + return ((g_disk_maxsize(dp, bp) / PAGE_SIZE) + 1); +} + static void +g_disk_advance(struct disk *dp, struct bio *bp, off_t off) +{ + + bp->bio_offset += off; + bp->bio_length -= off; + + if ((bp->bio_flags & BIO_VLIST) != 0) { + bus_dma_segment_t *seg, *end; + + seg = (bus_dma_segment_t *)bp->bio_data; + end = (bus_dma_segment_t *)bp->bio_data + bp->bio_ma_n; + off += bp->bio_ma_offset; + while (off >= seg->ds_len) { + KASSERT((seg != end), + ("vlist request runs off the end")); + off -= seg->ds_len; + seg++; + } + bp->bio_ma_offset = off; + bp->bio_ma_n = end - seg; + bp->bio_data = (void *)seg; + } else if ((bp->bio_flags & BIO_UNMAPPED) != 0) { + bp->bio_ma += off / PAGE_SIZE; + bp->bio_ma_offset += off; + bp->bio_ma_offset %= PAGE_SIZE; + bp->bio_ma_n -= off / PAGE_SIZE; + } else { + bp->bio_data += off; + } +} + +static void +g_disk_seg_limit(bus_dma_segment_t *seg, off_t *poffset, + off_t *plength, int *ppages) +{ + uintptr_t seg_page_base; + uintptr_t seg_page_end; + off_t offset; + off_t length; + int seg_pages; + + offset = *poffset; + length = *plength; + + if (length > seg->ds_len - offset) + length = seg->ds_len - offset; + + seg_page_base = trunc_page(seg->ds_addr + offset); + seg_page_end = round_page(seg->ds_addr + offset + length); + seg_pages = (seg_page_end - seg_page_base) >> PAGE_SHIFT; + + if (seg_pages > *ppages) { + seg_pages = *ppages; + length = (seg_page_base + (seg_pages << PAGE_SHIFT)) - + (seg->ds_addr + offset); + } + + *poffset = 0; + *plength -= length; + *ppages -= seg_pages; +} + +static off_t +g_disk_vlist_limit(struct disk *dp, struct bio *bp, bus_dma_segment_t **pendseg) +{ + bus_dma_segment_t *seg, *end; + off_t residual; + off_t offset; + int pages; + + seg = (bus_dma_segment_t *)bp->bio_data; + end = (bus_dma_segment_t *)bp->bio_data + bp->bio_ma_n; + residual = 
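/*
 * Worked example for g_disk_seg_limit() above: a segment's cost against
 * the controller's transfer budget is counted in pages,
 *
 *	pages = (round_page(addr + len) - trunc_page(addr)) >> PAGE_SHIFT;
 *
 * so with PAGE_SIZE 4096, a 100-byte segment starting at physical
 * address 0x1ff0 straddles a page boundary and costs two pages.  That
 * is also why g_disk_maxsegs() budgets maxsize / PAGE_SIZE + 1 segments
 * for a maximal transfer.
 */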
bp->bio_length; + offset = bp->bio_ma_offset; + pages = g_disk_maxsegs(dp, bp); + while (residual != 0 && pages != 0) { + KASSERT((seg != end), + ("vlist limit runs off the end")); + g_disk_seg_limit(seg, &offset, &residual, &pages); + seg++; + } + if (pendseg != NULL) + *pendseg = seg; + return (residual); +} + +static bool +g_disk_limit(struct disk *dp, struct bio *bp) +{ + bool limited = false; + off_t maxsz; + + maxsz = g_disk_maxsize(dp, bp); + + /* + * XXX: If we have a stripesize we should really use it here. + * Care should be taken in the delete case if this is done + * as deletes can be very sensitive to size given how they + * are processed. + */ + if (bp->bio_length > maxsz) { + bp->bio_length = maxsz; + limited = true; + } + + if ((bp->bio_flags & BIO_VLIST) != 0) { + bus_dma_segment_t *firstseg, *endseg; + off_t residual; + + firstseg = (bus_dma_segment_t*)bp->bio_data; + residual = g_disk_vlist_limit(dp, bp, &endseg); + if (residual != 0) { + bp->bio_ma_n = endseg - firstseg; + bp->bio_length -= residual; + limited = true; + } + } else if ((bp->bio_flags & BIO_UNMAPPED) != 0) { + bp->bio_ma_n = + howmany(bp->bio_ma_offset + bp->bio_length, PAGE_SIZE); + } + + return (limited); +} + +static void g_disk_start(struct bio *bp) { struct bio *bp2, *bp3; struct disk *dp; struct g_disk_softc *sc; int error; off_t off; sc = bp->bio_to->private; if (sc == NULL || (dp = sc->dp) == NULL || dp->d_destroyed) { g_io_deliver(bp, ENXIO); return; } error = EJUSTRETURN; switch(bp->bio_cmd) { case BIO_DELETE: if (!(dp->d_flags & DISKFLAG_CANDELETE)) { error = EOPNOTSUPP; break; } /* fall-through */ case BIO_READ: case BIO_WRITE: + KASSERT((dp->d_flags & DISKFLAG_UNMAPPED_BIO) != 0 || + (bp->bio_flags & BIO_UNMAPPED) == 0, + ("unmapped bio not supported by disk %s", dp->d_name)); off = 0; bp3 = NULL; bp2 = g_clone_bio(bp); if (bp2 == NULL) { error = ENOMEM; break; } - do { - off_t d_maxsize; + for (;;) { + if (g_disk_limit(dp, bp2)) { + off += bp2->bio_length; - d_maxsize = (bp->bio_cmd == BIO_DELETE) ? - dp->d_delmaxsize : dp->d_maxsize; - bp2->bio_offset += off; - bp2->bio_length -= off; - if ((bp->bio_flags & BIO_UNMAPPED) == 0) { - bp2->bio_data += off; - } else { - KASSERT((dp->d_flags & DISKFLAG_UNMAPPED_BIO) - != 0, - ("unmapped bio not supported by disk %s", - dp->d_name)); - bp2->bio_ma += off / PAGE_SIZE; - bp2->bio_ma_offset += off; - bp2->bio_ma_offset %= PAGE_SIZE; - bp2->bio_ma_n -= off / PAGE_SIZE; - } - if (bp2->bio_length > d_maxsize) { /* - * XXX: If we have a stripesize we should really - * use it here. Care should be taken in the delete - * case if this is done as deletes can be very - * sensitive to size given how they are processed. - */ - bp2->bio_length = d_maxsize; - if ((bp->bio_flags & BIO_UNMAPPED) != 0) { - bp2->bio_ma_n = howmany( - bp2->bio_ma_offset + - bp2->bio_length, PAGE_SIZE); - } - off += d_maxsize; - /* * To avoid a race, we need to grab the next bio * before we schedule this one. See "notes". 
*/ bp3 = g_clone_bio(bp); if (bp3 == NULL) bp->bio_error = ENOMEM; } bp2->bio_done = g_disk_done; bp2->bio_pblkno = bp2->bio_offset / dp->d_sectorsize; bp2->bio_bcount = bp2->bio_length; bp2->bio_disk = dp; mtx_lock(&sc->start_mtx); devstat_start_transaction_bio(dp->d_devstat, bp2); mtx_unlock(&sc->start_mtx); g_disk_lock_giant(dp); dp->d_strategy(bp2); g_disk_unlock_giant(dp); + + if (bp3 == NULL) + break; + bp2 = bp3; bp3 = NULL; - } while (bp2 != NULL); + g_disk_advance(dp, bp2, off); + } break; case BIO_GETATTR: /* Give the driver a chance to override */ if (dp->d_getattr != NULL) { if (bp->bio_disk == NULL) bp->bio_disk = dp; error = dp->d_getattr(bp); if (error != -1) break; error = EJUSTRETURN; } if (g_handleattr_int(bp, "GEOM::candelete", (dp->d_flags & DISKFLAG_CANDELETE) != 0)) break; else if (g_handleattr_int(bp, "GEOM::fwsectors", dp->d_fwsectors)) break; else if (g_handleattr_int(bp, "GEOM::fwheads", dp->d_fwheads)) break; else if (g_handleattr_off_t(bp, "GEOM::frontstuff", 0)) break; else if (g_handleattr_str(bp, "GEOM::ident", dp->d_ident)) break; else if (g_handleattr_uint16_t(bp, "GEOM::hba_vendor", dp->d_hba_vendor)) break; else if (g_handleattr_uint16_t(bp, "GEOM::hba_device", dp->d_hba_device)) break; else if (g_handleattr_uint16_t(bp, "GEOM::hba_subvendor", dp->d_hba_subvendor)) break; else if (g_handleattr_uint16_t(bp, "GEOM::hba_subdevice", dp->d_hba_subdevice)) break; else if (!strcmp(bp->bio_attribute, "GEOM::kerneldump")) g_disk_kerneldump(bp, dp); else if (!strcmp(bp->bio_attribute, "GEOM::setstate")) g_disk_setstate(bp, sc); else if (!strcmp(bp->bio_attribute, "GEOM::rotation_rate")) { uint64_t v; if ((dp->d_flags & DISKFLAG_LACKS_ROTRATE) == 0) v = dp->d_rotation_rate; else v = 0; /* rate unknown */ g_handleattr_uint16_t(bp, "GEOM::rotation_rate", v); break; } else error = ENOIOCTL; break; case BIO_FLUSH: g_trace(G_T_BIO, "g_disk_flushcache(%s)", bp->bio_to->name); if (!(dp->d_flags & DISKFLAG_CANFLUSHCACHE)) { error = EOPNOTSUPP; break; } bp2 = g_clone_bio(bp); if (bp2 == NULL) { g_io_deliver(bp, ENOMEM); return; } bp2->bio_done = g_disk_done; bp2->bio_disk = dp; mtx_lock(&sc->start_mtx); devstat_start_transaction_bio(dp->d_devstat, bp2); mtx_unlock(&sc->start_mtx); g_disk_lock_giant(dp); dp->d_strategy(bp2); g_disk_unlock_giant(dp); break; default: error = EOPNOTSUPP; break; } if (error != EJUSTRETURN) g_io_deliver(bp, error); return; } static void g_disk_dumpconf(struct sbuf *sb, const char *indent, struct g_geom *gp, struct g_consumer *cp, struct g_provider *pp) { struct bio *bp; struct disk *dp; struct g_disk_softc *sc; char *buf; int res = 0; sc = gp->softc; if (sc == NULL || (dp = sc->dp) == NULL) return; if (indent == NULL) { sbuf_printf(sb, " hd %u", dp->d_fwheads); sbuf_printf(sb, " sc %u", dp->d_fwsectors); return; } if (pp != NULL) { sbuf_printf(sb, "%s%u\n", indent, dp->d_fwheads); sbuf_printf(sb, "%s%u\n", indent, dp->d_fwsectors); if (dp->d_getattr != NULL) { buf = g_malloc(DISK_IDENT_SIZE, M_WAITOK); bp = g_alloc_bio(); bp->bio_disk = dp; bp->bio_attribute = "GEOM::ident"; bp->bio_length = DISK_IDENT_SIZE; bp->bio_data = buf; res = dp->d_getattr(bp); sbuf_printf(sb, "%s", indent); g_conf_printf_escaped(sb, "%s", res == 0 ? 
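/*
 * Shape of the restructured split loop in g_disk_start() above, with
 * devstat and error handling elided: each pass clamps the current clone
 * with g_disk_limit(), pre-clones the next chunk before dispatching
 * (see the race note in the source), and only then advances the
 * parent's cursor:
 *
 *	off = 0;
 *	bp3 = NULL;
 *	bp2 = g_clone_bio(bp);
 *	for (;;) {
 *		if (g_disk_limit(dp, bp2)) {	// clone was truncated
 *			off += bp2->bio_length;
 *			bp3 = g_clone_bio(bp);	// grab next chunk early
 *		}
 *		dp->d_strategy(bp2);		// hand clone to the driver
 *		if (bp3 == NULL)
 *			break;			// whole request dispatched
 *		bp2 = bp3;
 *		bp3 = NULL;
 *		g_disk_advance(dp, bp2, off);	// skip what was sent
 *	}
 */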
buf: dp->d_ident); sbuf_printf(sb, "\n"); bp->bio_attribute = "GEOM::lunid"; bp->bio_length = DISK_IDENT_SIZE; bp->bio_data = buf; if (dp->d_getattr(bp) == 0) { sbuf_printf(sb, "%s", indent); g_conf_printf_escaped(sb, "%s", buf); sbuf_printf(sb, "\n"); } bp->bio_attribute = "GEOM::lunname"; bp->bio_length = DISK_IDENT_SIZE; bp->bio_data = buf; if (dp->d_getattr(bp) == 0) { sbuf_printf(sb, "%s", indent); g_conf_printf_escaped(sb, "%s", buf); sbuf_printf(sb, "\n"); } g_destroy_bio(bp); g_free(buf); } else { sbuf_printf(sb, "%s", indent); g_conf_printf_escaped(sb, "%s", dp->d_ident); sbuf_printf(sb, "\n"); } sbuf_printf(sb, "%s", indent); g_conf_printf_escaped(sb, "%s", dp->d_descr); sbuf_printf(sb, "\n"); } } static void g_disk_resize(void *ptr, int flag) { struct disk *dp; struct g_geom *gp; struct g_provider *pp; if (flag == EV_CANCEL) return; g_topology_assert(); dp = ptr; gp = dp->d_geom; if (dp->d_destroyed || gp == NULL) return; LIST_FOREACH(pp, &gp->provider, provider) { if (pp->sectorsize != 0 && pp->sectorsize != dp->d_sectorsize) g_wither_provider(pp, ENXIO); else g_resize_provider(pp, dp->d_mediasize); } } static void g_disk_create(void *arg, int flag) { struct g_geom *gp; struct g_provider *pp; struct disk *dp; struct g_disk_softc *sc; char tmpstr[80]; if (flag == EV_CANCEL) return; g_topology_assert(); dp = arg; sc = g_malloc(sizeof(*sc), M_WAITOK | M_ZERO); mtx_init(&sc->start_mtx, "g_disk_start", NULL, MTX_DEF); mtx_init(&sc->done_mtx, "g_disk_done", NULL, MTX_DEF); sc->dp = dp; gp = g_new_geomf(&g_disk_class, "%s%d", dp->d_name, dp->d_unit); gp->softc = sc; pp = g_new_providerf(gp, "%s", gp->name); devstat_remove_entry(pp->stat); pp->stat = NULL; dp->d_devstat->id = pp; pp->mediasize = dp->d_mediasize; pp->sectorsize = dp->d_sectorsize; pp->stripeoffset = dp->d_stripeoffset; pp->stripesize = dp->d_stripesize; if ((dp->d_flags & DISKFLAG_UNMAPPED_BIO) != 0) pp->flags |= G_PF_ACCEPT_UNMAPPED; if ((dp->d_flags & DISKFLAG_DIRECT_COMPLETION) != 0) pp->flags |= G_PF_DIRECT_SEND; pp->flags |= G_PF_DIRECT_RECEIVE; if (bootverbose) printf("GEOM: new disk %s\n", gp->name); sysctl_ctx_init(&sc->sysctl_ctx); snprintf(tmpstr, sizeof(tmpstr), "GEOM disk %s", gp->name); sc->sysctl_tree = SYSCTL_ADD_NODE(&sc->sysctl_ctx, SYSCTL_STATIC_CHILDREN(_kern_geom_disk), OID_AUTO, gp->name, CTLFLAG_RD, 0, tmpstr); if (sc->sysctl_tree != NULL) { snprintf(tmpstr, sizeof(tmpstr), "kern.geom.disk.%s.led", gp->name); TUNABLE_STR_FETCH(tmpstr, sc->led, sizeof(sc->led)); SYSCTL_ADD_STRING(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), OID_AUTO, "led", CTLFLAG_RW | CTLFLAG_TUN, sc->led, sizeof(sc->led), "LED name"); } pp->private = sc; dp->d_geom = gp; g_error_provider(pp, 0); } /* * We get this callback after all of the consumers have gone away, and just * before the provider is freed. If the disk driver provided a d_gone * callback, let them know that it is okay to free resources -- they won't * be getting any more accesses from GEOM. 
*/ static void g_disk_providergone(struct g_provider *pp) { struct disk *dp; struct g_disk_softc *sc; sc = (struct g_disk_softc *)pp->private; dp = sc->dp; if (dp != NULL && dp->d_gone != NULL) dp->d_gone(dp); if (sc->sysctl_tree != NULL) { sysctl_ctx_free(&sc->sysctl_ctx); sc->sysctl_tree = NULL; } if (sc->led[0] != 0) { led_set(sc->led, "0"); sc->led[0] = 0; } pp->private = NULL; pp->geom->softc = NULL; mtx_destroy(&sc->done_mtx); mtx_destroy(&sc->start_mtx); g_free(sc); } static void g_disk_destroy(void *ptr, int flag) { struct disk *dp; struct g_geom *gp; struct g_disk_softc *sc; g_topology_assert(); dp = ptr; gp = dp->d_geom; if (gp != NULL) { sc = gp->softc; if (sc != NULL) sc->dp = NULL; dp->d_geom = NULL; g_wither_geom(gp, ENXIO); } g_free(dp); } /* * We only allow printable characters in disk ident, * the rest is converted to 'x'. */ static void g_disk_ident_adjust(char *ident, size_t size) { char *p, tmp[4], newid[DISK_IDENT_SIZE]; newid[0] = '\0'; for (p = ident; *p != '\0'; p++) { if (isprint(*p)) { tmp[0] = *p; tmp[1] = '\0'; } else { snprintf(tmp, sizeof(tmp), "x%02hhx", *(unsigned char *)p); } if (strlcat(newid, tmp, sizeof(newid)) >= sizeof(newid)) break; } bzero(ident, size); strlcpy(ident, newid, size); } struct disk * disk_alloc(void) { return (g_malloc(sizeof(struct disk), M_WAITOK | M_ZERO)); } void disk_create(struct disk *dp, int version) { if (version != DISK_VERSION) { printf("WARNING: Attempt to add disk %s%d %s", dp->d_name, dp->d_unit, " using incompatible ABI version of disk(9)\n"); printf("WARNING: Ignoring disk %s%d\n", dp->d_name, dp->d_unit); return; } if (version < DISK_VERSION_04) dp->d_flags |= DISKFLAG_LACKS_ROTRATE; KASSERT(dp->d_strategy != NULL, ("disk_create need d_strategy")); KASSERT(dp->d_name != NULL, ("disk_create need d_name")); KASSERT(*dp->d_name != 0, ("disk_create need d_name")); KASSERT(strlen(dp->d_name) < SPECNAMELEN - 4, ("disk name too long")); if (dp->d_devstat == NULL) dp->d_devstat = devstat_new_entry(dp->d_name, dp->d_unit, dp->d_sectorsize, DEVSTAT_ALL_SUPPORTED, DEVSTAT_TYPE_DIRECT, DEVSTAT_PRIORITY_MAX); dp->d_geom = NULL; g_disk_ident_adjust(dp->d_ident, sizeof(dp->d_ident)); g_post_event(g_disk_create, dp, M_WAITOK, dp, NULL); } void disk_destroy(struct disk *dp) { g_cancel_event(dp); dp->d_destroyed = 1; if (dp->d_devstat != NULL) devstat_remove_entry(dp->d_devstat); g_post_event(g_disk_destroy, dp, M_WAITOK, NULL); } void disk_gone(struct disk *dp) { struct g_geom *gp; struct g_provider *pp; gp = dp->d_geom; if (gp != NULL) { pp = LIST_FIRST(&gp->provider); if (pp != NULL) { KASSERT(LIST_NEXT(pp, provider) == NULL, ("geom %p has more than one provider", gp)); g_wither_provider(pp, ENXIO); } } } void disk_attr_changed(struct disk *dp, const char *attr, int flag) { struct g_geom *gp; struct g_provider *pp; gp = dp->d_geom; if (gp != NULL) LIST_FOREACH(pp, &gp->provider, provider) (void)g_attr_changed(pp, attr, flag); } void disk_media_changed(struct disk *dp, int flag) { struct g_geom *gp; struct g_provider *pp; gp = dp->d_geom; if (gp != NULL) { pp = LIST_FIRST(&gp->provider); if (pp != NULL) { KASSERT(LIST_NEXT(pp, provider) == NULL, ("geom %p has more than one provider", gp)); g_media_changed(pp, flag); } } } void disk_media_gone(struct disk *dp, int flag) { struct g_geom *gp; struct g_provider *pp; gp = dp->d_geom; if (gp != NULL) { pp = LIST_FIRST(&gp->provider); if (pp != NULL) { KASSERT(LIST_NEXT(pp, provider) == NULL, ("geom %p has more than one provider", gp)); g_media_gone(pp, flag); } } } int disk_resize(struct 
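/*
 * Example for g_disk_ident_adjust() above: each non-printable byte of
 * the ident is rewritten with snprintf(tmp, sizeof(tmp), "x%02hhx",
 * byte), so an ident containing "AB\x01C" is exported as "ABx01C".
 * The mapping is deliberately one-way: a literal "x01" already present
 * in an ident passes through unchanged and is indistinguishable from an
 * escaped 0x01.
 */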
disk *dp, int flag) { if (dp->d_destroyed || dp->d_geom == NULL) return (0); return (g_post_event(g_disk_resize, dp, flag, NULL)); } static void g_kern_disks(void *p, int flag __unused) { struct sbuf *sb; struct g_geom *gp; char *sp; sb = p; sp = ""; g_topology_assert(); LIST_FOREACH(gp, &g_disk_class.geom, geom) { sbuf_printf(sb, "%s%s", sp, gp->name); sp = " "; } sbuf_finish(sb); } static int sysctl_disks(SYSCTL_HANDLER_ARGS) { int error; struct sbuf *sb; sb = sbuf_new_auto(); g_waitfor_event(g_kern_disks, sb, M_WAITOK, NULL); error = SYSCTL_OUT(req, sbuf_data(sb), sbuf_len(sb) + 1); sbuf_delete(sb); return error; } SYSCTL_PROC(_kern, OID_AUTO, disks, CTLTYPE_STRING | CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, 0, sysctl_disks, "A", "names of available disks"); Index: stable/10/sys/geom/geom_io.c =================================================================== --- stable/10/sys/geom/geom_io.c (revision 292347) +++ stable/10/sys/geom/geom_io.c (revision 292348) @@ -1,997 +1,998 @@ /*- * Copyright (c) 2002 Poul-Henning Kamp * Copyright (c) 2002 Networks Associates Technology, Inc. * Copyright (c) 2013 The FreeBSD Foundation * All rights reserved. * * This software was developed for the FreeBSD Project by Poul-Henning Kamp * and NAI Labs, the Security Research Division of Network Associates, Inc. * under DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the * DARPA CHATS research program. * * Portions of this software were developed by Konstantin Belousov * under sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The names of the authors may not be used to endorse or promote * products derived from this software without specific prior written * permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static int g_io_transient_map_bio(struct bio *bp); static struct g_bioq g_bio_run_down; static struct g_bioq g_bio_run_up; static struct g_bioq g_bio_run_task; /* * Pace is a hint that we've had some trouble recently allocating * bios, so we should back off trying to send I/O down the stack * a bit to let the problem resolve. 
When pacing, we also turn * off direct dispatch to further reduce memory pressure from I/Os * there, at the expense of some added latency while the memory * pressures exist. See g_io_schedule_down() for more details * and limitations. */ static volatile u_int pace; static uma_zone_t biozone; /* * The head of the list of classifiers used in g_io_request. * Use g_register_classifier() and g_unregister_classifier() * to add/remove entries to the list. * Classifiers are invoked in registration order. */ static TAILQ_HEAD(g_classifier_tailq, g_classifier_hook) g_classifier_tailq = TAILQ_HEAD_INITIALIZER(g_classifier_tailq); #include static void g_bioq_lock(struct g_bioq *bq) { mtx_lock(&bq->bio_queue_lock); } static void g_bioq_unlock(struct g_bioq *bq) { mtx_unlock(&bq->bio_queue_lock); } #if 0 static void g_bioq_destroy(struct g_bioq *bq) { mtx_destroy(&bq->bio_queue_lock); } #endif static void g_bioq_init(struct g_bioq *bq) { TAILQ_INIT(&bq->bio_queue); mtx_init(&bq->bio_queue_lock, "bio queue", NULL, MTX_DEF); } static struct bio * g_bioq_first(struct g_bioq *bq) { struct bio *bp; bp = TAILQ_FIRST(&bq->bio_queue); if (bp != NULL) { KASSERT((bp->bio_flags & BIO_ONQUEUE), ("Bio not on queue bp=%p target %p", bp, bq)); bp->bio_flags &= ~BIO_ONQUEUE; TAILQ_REMOVE(&bq->bio_queue, bp, bio_queue); bq->bio_queue_length--; } return (bp); } struct bio * g_new_bio(void) { struct bio *bp; bp = uma_zalloc(biozone, M_NOWAIT | M_ZERO); #ifdef KTR if ((KTR_COMPILE & KTR_GEOM) && (ktr_mask & KTR_GEOM)) { struct stack st; CTR1(KTR_GEOM, "g_new_bio(): %p", bp); stack_save(&st); CTRSTACK(KTR_GEOM, &st, 3, 0); } #endif return (bp); } struct bio * g_alloc_bio(void) { struct bio *bp; bp = uma_zalloc(biozone, M_WAITOK | M_ZERO); #ifdef KTR if ((KTR_COMPILE & KTR_GEOM) && (ktr_mask & KTR_GEOM)) { struct stack st; CTR1(KTR_GEOM, "g_alloc_bio(): %p", bp); stack_save(&st); CTRSTACK(KTR_GEOM, &st, 3, 0); } #endif return (bp); } void g_destroy_bio(struct bio *bp) { #ifdef KTR if ((KTR_COMPILE & KTR_GEOM) && (ktr_mask & KTR_GEOM)) { struct stack st; CTR1(KTR_GEOM, "g_destroy_bio(): %p", bp); stack_save(&st); CTRSTACK(KTR_GEOM, &st, 3, 0); } #endif uma_zfree(biozone, bp); } struct bio * g_clone_bio(struct bio *bp) { struct bio *bp2; bp2 = uma_zalloc(biozone, M_NOWAIT | M_ZERO); if (bp2 != NULL) { bp2->bio_parent = bp; bp2->bio_cmd = bp->bio_cmd; /* * BIO_ORDERED flag may be used by disk drivers to enforce * ordering restrictions, so this flag needs to be cloned. - * BIO_UNMAPPED should be inherited, to properly indicate - * which way the buffer is passed. + * BIO_UNMAPPED and BIO_VLIST should be inherited, to properly + * indicate which way the buffer is passed. * Other bio flags are not suitable for cloning.
*/ - bp2->bio_flags = bp->bio_flags & (BIO_ORDERED | BIO_UNMAPPED); + bp2->bio_flags = bp->bio_flags & + (BIO_ORDERED | BIO_UNMAPPED | BIO_VLIST); bp2->bio_length = bp->bio_length; bp2->bio_offset = bp->bio_offset; bp2->bio_data = bp->bio_data; bp2->bio_ma = bp->bio_ma; bp2->bio_ma_n = bp->bio_ma_n; bp2->bio_ma_offset = bp->bio_ma_offset; bp2->bio_attribute = bp->bio_attribute; /* Inherit classification info from the parent */ bp2->bio_classifier1 = bp->bio_classifier1; bp2->bio_classifier2 = bp->bio_classifier2; bp->bio_children++; } #ifdef KTR if ((KTR_COMPILE & KTR_GEOM) && (ktr_mask & KTR_GEOM)) { struct stack st; CTR2(KTR_GEOM, "g_clone_bio(%p): %p", bp, bp2); stack_save(&st); CTRSTACK(KTR_GEOM, &st, 3, 0); } #endif return(bp2); } struct bio * g_duplicate_bio(struct bio *bp) { struct bio *bp2; bp2 = uma_zalloc(biozone, M_WAITOK | M_ZERO); - bp2->bio_flags = bp->bio_flags & BIO_UNMAPPED; + bp2->bio_flags = bp->bio_flags & (BIO_UNMAPPED | BIO_VLIST); bp2->bio_parent = bp; bp2->bio_cmd = bp->bio_cmd; bp2->bio_length = bp->bio_length; bp2->bio_offset = bp->bio_offset; bp2->bio_data = bp->bio_data; bp2->bio_ma = bp->bio_ma; bp2->bio_ma_n = bp->bio_ma_n; bp2->bio_ma_offset = bp->bio_ma_offset; bp2->bio_attribute = bp->bio_attribute; bp->bio_children++; #ifdef KTR if ((KTR_COMPILE & KTR_GEOM) && (ktr_mask & KTR_GEOM)) { struct stack st; CTR2(KTR_GEOM, "g_duplicate_bio(%p): %p", bp, bp2); stack_save(&st); CTRSTACK(KTR_GEOM, &st, 3, 0); } #endif return(bp2); } void g_io_init() { g_bioq_init(&g_bio_run_down); g_bioq_init(&g_bio_run_up); g_bioq_init(&g_bio_run_task); biozone = uma_zcreate("g_bio", sizeof (struct bio), NULL, NULL, NULL, NULL, 0, 0); } int g_io_getattr(const char *attr, struct g_consumer *cp, int *len, void *ptr) { struct bio *bp; int error; g_trace(G_T_BIO, "bio_getattr(%s)", attr); bp = g_alloc_bio(); bp->bio_cmd = BIO_GETATTR; bp->bio_done = NULL; bp->bio_attribute = attr; bp->bio_length = *len; bp->bio_data = ptr; g_io_request(bp, cp); error = biowait(bp, "ggetattr"); *len = bp->bio_completed; g_destroy_bio(bp); return (error); } int g_io_flush(struct g_consumer *cp) { struct bio *bp; int error; g_trace(G_T_BIO, "bio_flush(%s)", cp->provider->name); bp = g_alloc_bio(); bp->bio_cmd = BIO_FLUSH; bp->bio_flags |= BIO_ORDERED; bp->bio_done = NULL; bp->bio_attribute = NULL; bp->bio_offset = cp->provider->mediasize; bp->bio_length = 0; bp->bio_data = NULL; g_io_request(bp, cp); error = biowait(bp, "gflush"); g_destroy_bio(bp); return (error); } static int g_io_check(struct bio *bp) { struct g_consumer *cp; struct g_provider *pp; off_t excess; int error; cp = bp->bio_from; pp = bp->bio_to; /* Fail if access counters don't allow the operation */ switch(bp->bio_cmd) { case BIO_READ: case BIO_GETATTR: if (cp->acr == 0) return (EPERM); break; case BIO_WRITE: case BIO_DELETE: case BIO_FLUSH: if (cp->acw == 0) return (EPERM); break; default: return (EPERM); } /* if provider is marked for error, don't disturb. */ if (pp->error) return (pp->error); if (cp->flags & G_CF_ORPHAN) return (ENXIO); switch(bp->bio_cmd) { case BIO_READ: case BIO_WRITE: case BIO_DELETE: /* Zero sectorsize or mediasize is probably a lack of media. */ if (pp->sectorsize == 0 || pp->mediasize == 0) return (ENXIO); /* Reject I/O not on sector boundary */ if (bp->bio_offset % pp->sectorsize) return (EINVAL); /* Reject I/O not integral sector long */ if (bp->bio_length % pp->sectorsize) return (EINVAL); /* Reject requests before or past the end of media.
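 * Requests that start beyond mediasize are rejected with EIO; a
 * request that merely crosses the end is truncated instead (see just
 * below).  As a worked example (illustrative numbers only): on a 1 MB
 * provider, a 64 KB read at offset 992 KB gives
 * excess = 992 KB + 64 KB, which is 32 KB past mediasize, so
 * bio_length is trimmed from 64 KB down to 32 KB before the request
 * is passed to the provider.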
*/ if (bp->bio_offset < 0) return (EIO); if (bp->bio_offset > pp->mediasize) return (EIO); /* Truncate requests to the end of provider's media. */ excess = bp->bio_offset + bp->bio_length; if (excess > bp->bio_to->mediasize) { KASSERT((bp->bio_flags & BIO_UNMAPPED) == 0 || round_page(bp->bio_ma_offset + bp->bio_length) / PAGE_SIZE == bp->bio_ma_n, ("excess bio %p too short", bp)); excess -= bp->bio_to->mediasize; bp->bio_length -= excess; if ((bp->bio_flags & BIO_UNMAPPED) != 0) { bp->bio_ma_n = round_page(bp->bio_ma_offset + bp->bio_length) / PAGE_SIZE; } if (excess > 0) CTR3(KTR_GEOM, "g_down truncated bio " "%p provider %s by %d", bp, bp->bio_to->name, excess); } /* Deliver zero length transfers right here. */ if (bp->bio_length == 0) { CTR2(KTR_GEOM, "g_down terminated 0-length " "bp %p provider %s", bp, bp->bio_to->name); return (0); } if ((bp->bio_flags & BIO_UNMAPPED) != 0 && (bp->bio_to->flags & G_PF_ACCEPT_UNMAPPED) == 0 && (bp->bio_cmd == BIO_READ || bp->bio_cmd == BIO_WRITE)) { if ((error = g_io_transient_map_bio(bp)) >= 0) return (error); } break; default: break; } return (EJUSTRETURN); } /* * bio classification support. * * g_register_classifier() and g_unregister_classifier() * are used to add/remove a classifier from the list. * The list is protected using the g_bio_run_down lock, * because the classifiers are called in this path. * * g_io_request() passes bio's that are not already classified * (i.e. those with bio_classifier1 == NULL) to g_run_classifiers(). * Classifiers can store their result in the two fields * bio_classifier1 and bio_classifier2. * A classifier that updates one of the fields should * return a non-zero value. * If no classifier updates the field, g_run_classifiers() sets * bio_classifier1 = BIO_NOTCLASSIFIED to avoid further calls. */ int g_register_classifier(struct g_classifier_hook *hook) { g_bioq_lock(&g_bio_run_down); TAILQ_INSERT_TAIL(&g_classifier_tailq, hook, link); g_bioq_unlock(&g_bio_run_down); return (0); } void g_unregister_classifier(struct g_classifier_hook *hook) { struct g_classifier_hook *entry; g_bioq_lock(&g_bio_run_down); TAILQ_FOREACH(entry, &g_classifier_tailq, link) { if (entry == hook) { TAILQ_REMOVE(&g_classifier_tailq, hook, link); break; } } g_bioq_unlock(&g_bio_run_down); } static void g_run_classifiers(struct bio *bp) { struct g_classifier_hook *hook; int classified = 0; TAILQ_FOREACH(hook, &g_classifier_tailq, link) classified |= hook->func(hook->arg, bp); if (!classified) bp->bio_classifier1 = BIO_NOTCLASSIFIED; } void g_io_request(struct bio *bp, struct g_consumer *cp) { struct g_provider *pp; struct mtx *mtxp; int direct, error, first; KASSERT(cp != NULL, ("NULL cp in g_io_request")); KASSERT(bp != NULL, ("NULL bp in g_io_request")); pp = cp->provider; KASSERT(pp != NULL, ("consumer not attached in g_io_request")); #ifdef DIAGNOSTIC KASSERT(bp->bio_driver1 == NULL, ("bio_driver1 used by the consumer (geom %s)", cp->geom->name)); KASSERT(bp->bio_driver2 == NULL, ("bio_driver2 used by the consumer (geom %s)", cp->geom->name)); KASSERT(bp->bio_pflags == 0, ("bio_pflags used by the consumer (geom %s)", cp->geom->name)); /* * Remember consumer's private fields, so we can detect if they were * modified by the provider.
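 * (These _bio_caller1/_bio_caller2/_bio_cflags shadows exist only
 * under DIAGNOSTIC; g_io_deliver() compares them with the live fields
 * on completion and asserts if a provider modified consumer-owned
 * state.)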
*/ bp->_bio_caller1 = bp->bio_caller1; bp->_bio_caller2 = bp->bio_caller2; bp->_bio_cflags = bp->bio_cflags; #endif if (bp->bio_cmd & (BIO_READ|BIO_WRITE|BIO_GETATTR)) { KASSERT(bp->bio_data != NULL, ("NULL bp->data in g_io_request(cmd=%hhu)", bp->bio_cmd)); } if (bp->bio_cmd & (BIO_DELETE|BIO_FLUSH)) { KASSERT(bp->bio_data == NULL, ("non-NULL bp->data in g_io_request(cmd=%hhu)", bp->bio_cmd)); } if (bp->bio_cmd & (BIO_READ|BIO_WRITE|BIO_DELETE)) { KASSERT(bp->bio_offset % cp->provider->sectorsize == 0, ("wrong offset %jd for sectorsize %u", bp->bio_offset, cp->provider->sectorsize)); KASSERT(bp->bio_length % cp->provider->sectorsize == 0, ("wrong length %jd for sectorsize %u", bp->bio_length, cp->provider->sectorsize)); } g_trace(G_T_BIO, "bio_request(%p) from %p(%s) to %p(%s) cmd %d", bp, cp, cp->geom->name, pp, pp->name, bp->bio_cmd); bp->bio_from = cp; bp->bio_to = pp; bp->bio_error = 0; bp->bio_completed = 0; KASSERT(!(bp->bio_flags & BIO_ONQUEUE), ("Bio already on queue bp=%p", bp)); if ((g_collectstats & G_STATS_CONSUMERS) != 0 || ((g_collectstats & G_STATS_PROVIDERS) != 0 && pp->stat != NULL)) binuptime(&bp->bio_t0); else getbinuptime(&bp->bio_t0); #ifdef GET_STACK_USAGE direct = (cp->flags & G_CF_DIRECT_SEND) != 0 && (pp->flags & G_PF_DIRECT_RECEIVE) != 0 && !g_is_geom_thread(curthread) && ((pp->flags & G_PF_ACCEPT_UNMAPPED) != 0 || (bp->bio_flags & BIO_UNMAPPED) == 0 || THREAD_CAN_SLEEP()) && pace == 0; if (direct) { /* Block direct execution if less than half of stack left. */ size_t st, su; GET_STACK_USAGE(st, su); if (su * 2 > st) direct = 0; } #else direct = 0; #endif if (!TAILQ_EMPTY(&g_classifier_tailq) && !bp->bio_classifier1) { g_bioq_lock(&g_bio_run_down); g_run_classifiers(bp); g_bioq_unlock(&g_bio_run_down); } /* * The statistics collection is lockless, as such, but we * cannot update one instance of the statistics from more * than one thread at a time, so grab the lock first. */ mtxp = mtx_pool_find(mtxpool_sleep, pp); mtx_lock(mtxp); if (g_collectstats & G_STATS_PROVIDERS) devstat_start_transaction(pp->stat, &bp->bio_t0); if (g_collectstats & G_STATS_CONSUMERS) devstat_start_transaction(cp->stat, &bp->bio_t0); pp->nstart++; cp->nstart++; mtx_unlock(mtxp); if (direct) { error = g_io_check(bp); if (error >= 0) { CTR3(KTR_GEOM, "g_io_request g_io_check on bp %p " "provider %s returned %d", bp, bp->bio_to->name, error); g_io_deliver(bp, error); return; } bp->bio_to->geom->start(bp); } else { g_bioq_lock(&g_bio_run_down); first = TAILQ_EMPTY(&g_bio_run_down.bio_queue); TAILQ_INSERT_TAIL(&g_bio_run_down.bio_queue, bp, bio_queue); bp->bio_flags |= BIO_ONQUEUE; g_bio_run_down.bio_queue_length++; g_bioq_unlock(&g_bio_run_down); /* Pass it on down. */ if (first) wakeup(&g_wait_down); } } void g_io_deliver(struct bio *bp, int error) { struct bintime now; struct g_consumer *cp; struct g_provider *pp; struct mtx *mtxp; int direct, first; KASSERT(bp != NULL, ("NULL bp in g_io_deliver")); pp = bp->bio_to; KASSERT(pp != NULL, ("NULL bio_to in g_io_deliver")); cp = bp->bio_from; if (cp == NULL) { bp->bio_error = error; bp->bio_done(bp); return; } KASSERT(cp != NULL, ("NULL bio_from in g_io_deliver")); KASSERT(cp->geom != NULL, ("NULL bio_from->geom in g_io_deliver")); #ifdef DIAGNOSTIC /* * Some classes - GJournal in particular - can modify bio's * private fields while the bio is in transit; G_GEOM_VOLATILE_BIO * flag means it's an expected behaviour for that particular geom.
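 * A class that needs this behaviour would opt out of the check by
 * marking its geom at creation time, along the lines of (hypothetical
 * fragment, for illustration only):
 *
 *	gp = g_new_geomf(mp, "%s.journal", pp->name);
 *	gp->flags |= G_GEOM_VOLATILE_BIO;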
*/ if ((cp->geom->flags & G_GEOM_VOLATILE_BIO) == 0) { KASSERT(bp->bio_caller1 == bp->_bio_caller1, ("bio_caller1 used by the provider %s", pp->name)); KASSERT(bp->bio_caller2 == bp->_bio_caller2, ("bio_caller2 used by the provider %s", pp->name)); KASSERT(bp->bio_cflags == bp->_bio_cflags, ("bio_cflags used by the provider %s", pp->name)); } #endif KASSERT(bp->bio_completed >= 0, ("bio_completed can't be less than 0")); KASSERT(bp->bio_completed <= bp->bio_length, ("bio_completed can't be greater than bio_length")); g_trace(G_T_BIO, "g_io_deliver(%p) from %p(%s) to %p(%s) cmd %d error %d off %jd len %jd", bp, cp, cp->geom->name, pp, pp->name, bp->bio_cmd, error, (intmax_t)bp->bio_offset, (intmax_t)bp->bio_length); KASSERT(!(bp->bio_flags & BIO_ONQUEUE), ("Bio already on queue bp=%p", bp)); /* * XXX: the next two don't belong here */ bp->bio_bcount = bp->bio_length; bp->bio_resid = bp->bio_bcount - bp->bio_completed; #ifdef GET_STACK_USAGE direct = (pp->flags & G_PF_DIRECT_SEND) && (cp->flags & G_CF_DIRECT_RECEIVE) && !g_is_geom_thread(curthread); if (direct) { /* Block direct execution if less than half of stack left. */ size_t st, su; GET_STACK_USAGE(st, su); if (su * 2 > st) direct = 0; } #else direct = 0; #endif /* * The statistics collection is lockless, as such, but we * cannot update one instance of the statistics from more * than one thread at a time, so grab the lock first. */ if ((g_collectstats & G_STATS_CONSUMERS) != 0 || ((g_collectstats & G_STATS_PROVIDERS) != 0 && pp->stat != NULL)) binuptime(&now); mtxp = mtx_pool_find(mtxpool_sleep, cp); mtx_lock(mtxp); if (g_collectstats & G_STATS_PROVIDERS) devstat_end_transaction_bio_bt(pp->stat, bp, &now); if (g_collectstats & G_STATS_CONSUMERS) devstat_end_transaction_bio_bt(cp->stat, bp, &now); cp->nend++; pp->nend++; mtx_unlock(mtxp); if (error != ENOMEM) { bp->bio_error = error; if (direct) { biodone(bp); } else { g_bioq_lock(&g_bio_run_up); first = TAILQ_EMPTY(&g_bio_run_up.bio_queue); TAILQ_INSERT_TAIL(&g_bio_run_up.bio_queue, bp, bio_queue); bp->bio_flags |= BIO_ONQUEUE; g_bio_run_up.bio_queue_length++; g_bioq_unlock(&g_bio_run_up); if (first) wakeup(&g_wait_up); } return; } if (bootverbose) printf("ENOMEM %p on %p(%s)\n", bp, pp, pp->name); bp->bio_children = 0; bp->bio_inbed = 0; bp->bio_driver1 = NULL; bp->bio_driver2 = NULL; bp->bio_pflags = 0; g_io_request(bp, cp); pace = 1; return; } SYSCTL_DECL(_kern_geom); static long transient_maps; SYSCTL_LONG(_kern_geom, OID_AUTO, transient_maps, CTLFLAG_RD, &transient_maps, 0, "Total count of the transient mapping requests"); u_int transient_map_retries = 10; SYSCTL_UINT(_kern_geom, OID_AUTO, transient_map_retries, CTLFLAG_RW, &transient_map_retries, 0, "Max count of retries used before giving up on creating transient map"); int transient_map_hard_failures; SYSCTL_INT(_kern_geom, OID_AUTO, transient_map_hard_failures, CTLFLAG_RD, &transient_map_hard_failures, 0, "Failures to establish the transient mapping due to retry attempts " "exhausted"); int transient_map_soft_failures; SYSCTL_INT(_kern_geom, OID_AUTO, transient_map_soft_failures, CTLFLAG_RD, &transient_map_soft_failures, 0, "Count of retried failures to establish the transient mapping"); int inflight_transient_maps; SYSCTL_INT(_kern_geom, OID_AUTO, inflight_transient_maps, CTLFLAG_RD, &inflight_transient_maps, 0, "Current count of the active transient maps"); static int g_io_transient_map_bio(struct bio *bp) { vm_offset_t addr; long size; u_int retried; KASSERT(unmapped_buf_allowed, ("unmapped disabled")); size =
round_page(bp->bio_ma_offset + bp->bio_length); KASSERT(size / PAGE_SIZE == bp->bio_ma_n, ("Bio too short %p", bp)); addr = 0; retried = 0; atomic_add_long(&transient_maps, 1); retry: if (vmem_alloc(transient_arena, size, M_BESTFIT | M_NOWAIT, &addr)) { if (transient_map_retries != 0 && retried >= transient_map_retries) { CTR2(KTR_GEOM, "g_down cannot map bp %p provider %s", bp, bp->bio_to->name); atomic_add_int(&transient_map_hard_failures, 1); return (EDEADLK/* XXXKIB */); } else { /* * Naive attempt to quiesce the I/O to get more * in-flight requests completed and defragment * the transient_arena. */ CTR3(KTR_GEOM, "g_down retrymap bp %p provider %s r %d", bp, bp->bio_to->name, retried); pause("g_d_tra", hz / 10); retried++; atomic_add_int(&transient_map_soft_failures, 1); goto retry; } } atomic_add_int(&inflight_transient_maps, 1); pmap_qenter((vm_offset_t)addr, bp->bio_ma, OFF_TO_IDX(size)); bp->bio_data = (caddr_t)addr + bp->bio_ma_offset; bp->bio_flags |= BIO_TRANSIENT_MAPPING; bp->bio_flags &= ~BIO_UNMAPPED; return (EJUSTRETURN); } void g_io_schedule_down(struct thread *tp __unused) { struct bio *bp; int error; for(;;) { g_bioq_lock(&g_bio_run_down); bp = g_bioq_first(&g_bio_run_down); if (bp == NULL) { CTR0(KTR_GEOM, "g_down going to sleep"); msleep(&g_wait_down, &g_bio_run_down.bio_queue_lock, PRIBIO | PDROP, "-", 0); continue; } CTR0(KTR_GEOM, "g_down has work to do"); g_bioq_unlock(&g_bio_run_down); if (pace != 0) { /* * There has been at least one memory allocation * failure since the last I/O completed. Pause 1ms to * give the system a chance to free up memory. We only * do this once because a large number of allocations * can fail in the direct dispatch case and there's no * relationship between the number of these failures and * the length of the outage. If there's still an outage, * we'll pause again and again until it's * resolved. Older versions paused longer and once per * allocation failure. This was OK for a single threaded * g_down, but with direct dispatch would lead to a max of * 10 IOPs for minutes at a time when transient memory * issues prevented allocation for a batch of requests * from the upper layers. * * XXX This pacing is really lame. It needs to be solved * by other methods. This is OK only because the worst * case scenario is so rare. In the worst case scenario * all memory is tied up waiting for I/O to complete * which can never happen since we can't allocate bios * for that I/O. */ CTR0(KTR_GEOM, "g_down pacing self"); pause("g_down", min(hz/1000, 1)); pace = 0; } CTR2(KTR_GEOM, "g_down processing bp %p provider %s", bp, bp->bio_to->name); error = g_io_check(bp); if (error >= 0) { CTR3(KTR_GEOM, "g_down g_io_check on bp %p provider " "%s returned %d", bp, bp->bio_to->name, error); g_io_deliver(bp, error); continue; } THREAD_NO_SLEEPING(); CTR4(KTR_GEOM, "g_down starting bp %p provider %s off %ld " "len %ld", bp, bp->bio_to->name, bp->bio_offset, bp->bio_length); bp->bio_to->geom->start(bp); THREAD_SLEEPING_OK(); } } void bio_taskqueue(struct bio *bp, bio_task_t *func, void *arg) { bp->bio_task = func; bp->bio_task_arg = arg; /* * The taskqueue is actually just a second queue off the "up" * queue, so we use the same lock.
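 *
 * A caller would typically use this to defer completion work to the
 * g_up thread, roughly like this (hypothetical handler, for
 * illustration only); note that g_io_schedule_up() runs the task under
 * THREAD_NO_SLEEPING(), so the task itself must not sleep either:
 *
 *	static void
 *	my_done_task(void *arg)
 *	{
 *		struct bio *bp = arg;
 *
 *		... completion work, runs in the g_up thread ...
 *	}
 *
 *	bio_taskqueue(bp, my_done_task, bp);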
*/ g_bioq_lock(&g_bio_run_up); KASSERT(!(bp->bio_flags & BIO_ONQUEUE), ("Bio already on queue bp=%p target taskq", bp)); bp->bio_flags |= BIO_ONQUEUE; TAILQ_INSERT_TAIL(&g_bio_run_task.bio_queue, bp, bio_queue); g_bio_run_task.bio_queue_length++; wakeup(&g_wait_up); g_bioq_unlock(&g_bio_run_up); } void g_io_schedule_up(struct thread *tp __unused) { struct bio *bp; for(;;) { g_bioq_lock(&g_bio_run_up); bp = g_bioq_first(&g_bio_run_task); if (bp != NULL) { g_bioq_unlock(&g_bio_run_up); THREAD_NO_SLEEPING(); CTR1(KTR_GEOM, "g_up processing task bp %p", bp); bp->bio_task(bp->bio_task_arg); THREAD_SLEEPING_OK(); continue; } bp = g_bioq_first(&g_bio_run_up); if (bp != NULL) { g_bioq_unlock(&g_bio_run_up); THREAD_NO_SLEEPING(); CTR4(KTR_GEOM, "g_up biodone bp %p provider %s off " "%jd len %ld", bp, bp->bio_to->name, bp->bio_offset, bp->bio_length); biodone(bp); THREAD_SLEEPING_OK(); continue; } CTR0(KTR_GEOM, "g_up going to sleep"); msleep(&g_wait_up, &g_bio_run_up.bio_queue_lock, PRIBIO | PDROP, "-", 0); } } void * g_read_data(struct g_consumer *cp, off_t offset, off_t length, int *error) { struct bio *bp; void *ptr; int errorc; KASSERT(length > 0 && length >= cp->provider->sectorsize && length <= MAXPHYS, ("g_read_data(): invalid length %jd", (intmax_t)length)); bp = g_alloc_bio(); bp->bio_cmd = BIO_READ; bp->bio_done = NULL; bp->bio_offset = offset; bp->bio_length = length; ptr = g_malloc(length, M_WAITOK); bp->bio_data = ptr; g_io_request(bp, cp); errorc = biowait(bp, "gread"); if (error != NULL) *error = errorc; g_destroy_bio(bp); if (errorc) { g_free(ptr); ptr = NULL; } return (ptr); } int g_write_data(struct g_consumer *cp, off_t offset, void *ptr, off_t length) { struct bio *bp; int error; KASSERT(length > 0 && length >= cp->provider->sectorsize && length <= MAXPHYS, ("g_write_data(): invalid length %jd", (intmax_t)length)); bp = g_alloc_bio(); bp->bio_cmd = BIO_WRITE; bp->bio_done = NULL; bp->bio_offset = offset; bp->bio_length = length; bp->bio_data = ptr; g_io_request(bp, cp); error = biowait(bp, "gwrite"); g_destroy_bio(bp); return (error); } int g_delete_data(struct g_consumer *cp, off_t offset, off_t length) { struct bio *bp; int error; KASSERT(length > 0 && length >= cp->provider->sectorsize, ("g_delete_data(): invalid length %jd", (intmax_t)length)); bp = g_alloc_bio(); bp->bio_cmd = BIO_DELETE; bp->bio_done = NULL; bp->bio_offset = offset; bp->bio_length = length; bp->bio_data = NULL; g_io_request(bp, cp); error = biowait(bp, "gdelete"); g_destroy_bio(bp); return (error); } void g_print_bio(struct bio *bp) { const char *pname, *cmd = NULL; if (bp->bio_to != NULL) pname = bp->bio_to->name; else pname = "[unknown]"; switch (bp->bio_cmd) { case BIO_GETATTR: cmd = "GETATTR"; printf("%s[%s(attr=%s)]", pname, cmd, bp->bio_attribute); return; case BIO_FLUSH: cmd = "FLUSH"; printf("%s[%s]", pname, cmd); return; case BIO_READ: cmd = "READ"; break; case BIO_WRITE: cmd = "WRITE"; break; case BIO_DELETE: cmd = "DELETE"; break; default: cmd = "UNKNOWN"; printf("%s[%s()]", pname, cmd); return; } printf("%s[%s(offset=%jd, length=%jd)]", pname, cmd, (intmax_t)bp->bio_offset, (intmax_t)bp->bio_length); } Index: stable/10/sys/ia64/include/bus.h =================================================================== --- stable/10/sys/ia64/include/bus.h (revision 292347) +++ stable/10/sys/ia64/include/bus.h (revision 292348) @@ -1,820 +1,823 @@ /*- * Copyright (c) 2009 Marcel Moolenaar * All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* $NetBSD: bus.h,v 1.12 1997/10/01 08:25:15 fvdl Exp $ */ /*- * Copyright (c) 1996, 1997 The NetBSD Foundation, Inc. * All rights reserved. * * This code is derived from software contributed to The NetBSD Foundation * by Jason R. Thorpe of the Numerical Aerospace Simulation Facility, * NASA Ames Research Center. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ /*- * Copyright (c) 1996 Charles M. Hannum. All rights reserved. * Copyright (c) 1996 Christopher G. Demetriou. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. 
All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Christopher G. Demetriou * for the NetBSD Project. * 4. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* $FreeBSD$ */ #ifndef _MACHINE_BUS_H_ #define _MACHINE_BUS_H_ #include #include /* * I/O port reads with ia32 semantics. */ #define inb bus_space_read_io_1 #define inw bus_space_read_io_2 #define inl bus_space_read_io_4 #define outb bus_space_write_io_1 #define outw bus_space_write_io_2 #define outl bus_space_write_io_4 /* * Values for the ia64 bus space tag, not to be used directly by MI code. */ #define IA64_BUS_SPACE_IO 0 /* space is i/o space */ #define IA64_BUS_SPACE_MEM 1 /* space is mem space */ #define BUS_SPACE_BARRIER_READ 0x01 /* force read barrier */ #define BUS_SPACE_BARRIER_WRITE 0x02 /* force write barrier */ #define BUS_SPACE_MAXSIZE_24BIT 0xFFFFFF #define BUS_SPACE_MAXSIZE_32BIT 0xFFFFFFFF #define BUS_SPACE_MAXSIZE 0xFFFFFFFFFFFFFFFF #define BUS_SPACE_MAXADDR_24BIT 0xFFFFFF #define BUS_SPACE_MAXADDR_32BIT 0xFFFFFFFF #define BUS_SPACE_MAXADDR 0xFFFFFFFFFFFFFFFF #define BUS_SPACE_UNRESTRICTED (~0) +#ifdef _KERNEL /* * Map and unmap a region of device bus space into CPU virtual address space. */ int bus_space_map(bus_space_tag_t, bus_addr_t, bus_size_t, int, bus_space_handle_t *); void bus_space_unmap(bus_space_tag_t, bus_space_handle_t, bus_size_t size); /* * Get a new handle for a subregion of an already-mapped area of bus space. */ static __inline int bus_space_subregion(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, bus_size_t size __unused, bus_space_handle_t *nbshp) { *nbshp = bsh + ofs; return (0); } /* * Allocate a region of memory that is accessible to devices in bus space. */ int bus_space_alloc(bus_space_tag_t bst, bus_addr_t rstart, bus_addr_t rend, bus_size_t size, bus_size_t align, bus_size_t boundary, int flags, bus_addr_t *addrp, bus_space_handle_t *bshp); /* * Free a region of bus space accessible memory. */ void bus_space_free(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t size); /* * Bus read/write barrier method. */ static __inline void bus_space_barrier(bus_space_tag_t bst __unused, bus_space_handle_t bsh __unused, bus_size_t ofs __unused, bus_size_t size __unused, int flags __unused) { ia64_mf_a(); ia64_mf(); } /* * Read 1 unit of data from bus space described by the tag, handle and ofs * tuple. A unit of data can be 1 byte, 2 bytes, 4 bytes or 8 bytes. The * data is returned. 
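 *
 * A typical driver read looks like (hypothetical softc fields and
 * register offset, for illustration only):
 *
 *	uint32_t sts;
 *
 *	sts = bus_space_read_4(sc->sc_bst, sc->sc_bsh, MYDEV_REG_STATUS);
 *
 * where the tag selects I/O-port versus memory-mapped semantics and
 * the handle carries the mapped base address.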
*/ uint8_t bus_space_read_io_1(u_long); uint16_t bus_space_read_io_2(u_long); uint32_t bus_space_read_io_4(u_long); uint64_t bus_space_read_io_8(u_long); static __inline uint8_t bus_space_read_1(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs) { uint8_t val; val = (__predict_false(bst == IA64_BUS_SPACE_IO)) ? bus_space_read_io_1(bsh + ofs) : ia64_ld1((void *)(bsh + ofs)); return (val); } static __inline uint16_t bus_space_read_2(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs) { uint16_t val; val = (__predict_false(bst == IA64_BUS_SPACE_IO)) ? bus_space_read_io_2(bsh + ofs) : ia64_ld2((void *)(bsh + ofs)); return (val); } static __inline uint32_t bus_space_read_4(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs) { uint32_t val; val = (__predict_false(bst == IA64_BUS_SPACE_IO)) ? bus_space_read_io_4(bsh + ofs) : ia64_ld4((void *)(bsh + ofs)); return (val); } static __inline uint64_t bus_space_read_8(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs) { uint64_t val; val = (__predict_false(bst == IA64_BUS_SPACE_IO)) ? bus_space_read_io_8(bsh + ofs) : ia64_ld8((void *)(bsh + ofs)); return (val); } /* * Write 1 unit of data to bus space described by the tag, handle and ofs * tuple. A unit of data can be 1 byte, 2 bytes, 4 bytes or 8 bytes. The * data is passed by value. */ void bus_space_write_io_1(u_long, uint8_t); void bus_space_write_io_2(u_long, uint16_t); void bus_space_write_io_4(u_long, uint32_t); void bus_space_write_io_8(u_long, uint64_t); static __inline void bus_space_write_1(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint8_t val) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_io_1(bsh + ofs, val); else ia64_st1((void *)(bsh + ofs), val); } static __inline void bus_space_write_2(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint16_t val) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_io_2(bsh + ofs, val); else ia64_st2((void *)(bsh + ofs), val); } static __inline void bus_space_write_4(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint32_t val) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_io_4(bsh + ofs, val); else ia64_st4((void *)(bsh + ofs), val); } static __inline void bus_space_write_8(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint64_t val) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_io_8(bsh + ofs, val); else ia64_st8((void *)(bsh + ofs), val); } /* * Read count units of data from bus space described by the tag, handle and * ofs tuple. A unit of data can be 1 byte, 2 bytes, 4 bytes or 8 bytes. The * data is returned in the buffer passed by reference. 
*/ void bus_space_read_multi_io_1(u_long, uint8_t *, size_t); void bus_space_read_multi_io_2(u_long, uint16_t *, size_t); void bus_space_read_multi_io_4(u_long, uint32_t *, size_t); void bus_space_read_multi_io_8(u_long, uint64_t *, size_t); static __inline void bus_space_read_multi_1(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint8_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_read_multi_io_1(bsh + ofs, bufp, count); else { while (count-- > 0) *bufp++ = ia64_ld1((void *)(bsh + ofs)); } } static __inline void bus_space_read_multi_2(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint16_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_read_multi_io_2(bsh + ofs, bufp, count); else { while (count-- > 0) *bufp++ = ia64_ld2((void *)(bsh + ofs)); } } static __inline void bus_space_read_multi_4(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint32_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_read_multi_io_4(bsh + ofs, bufp, count); else { while (count-- > 0) *bufp++ = ia64_ld4((void *)(bsh + ofs)); } } static __inline void bus_space_read_multi_8(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint64_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_read_multi_io_8(bsh + ofs, bufp, count); else { while (count-- > 0) *bufp++ = ia64_ld8((void *)(bsh + ofs)); } } /* * Write count units of data to bus space described by the tag, handle and * ofs tuple. A unit of data can be 1 byte, 2 bytes, 4 bytes or 8 bytes. The * data is read from the buffer passed by reference. */ void bus_space_write_multi_io_1(u_long, const uint8_t *, size_t); void bus_space_write_multi_io_2(u_long, const uint16_t *, size_t); void bus_space_write_multi_io_4(u_long, const uint32_t *, size_t); void bus_space_write_multi_io_8(u_long, const uint64_t *, size_t); static __inline void bus_space_write_multi_1(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, const uint8_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_multi_io_1(bsh + ofs, bufp, count); else { while (count-- > 0) ia64_st1((void *)(bsh + ofs), *bufp++); } } static __inline void bus_space_write_multi_2(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, const uint16_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_multi_io_2(bsh + ofs, bufp, count); else { while (count-- > 0) ia64_st2((void *)(bsh + ofs), *bufp++); } } static __inline void bus_space_write_multi_4(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, const uint32_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_multi_io_4(bsh + ofs, bufp, count); else { while (count-- > 0) ia64_st4((void *)(bsh + ofs), *bufp++); } } static __inline void bus_space_write_multi_8(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, const uint64_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_multi_io_8(bsh + ofs, bufp, count); else { while (count-- > 0) ia64_st8((void *)(bsh + ofs), *bufp++); } } /* * Read count units of data from bus space described by the tag, handle and * ofs tuple. A unit of data can be 1 byte, 2 bytes, 4 bytes or 8 bytes. The * data is written to the buffer passed by reference and read from successive * bus space addresses. Access is unordered. 
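 *
 * Contrast this with the _multi_ variants above: _multi_ transfers
 * count units to or from the *same* bus address (a FIFO-style data
 * register), while _region_ walks successive addresses.  E.g.
 * (hypothetical offsets, for illustration):
 *
 *	bus_space_read_multi_4(bst, bsh, REG_FIFO, buf, n);   (one register)
 *	bus_space_read_region_4(bst, bsh, REG_BASE, buf, n);  (n registers)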
*/ void bus_space_read_region_io_1(u_long, uint8_t *, size_t); void bus_space_read_region_io_2(u_long, uint16_t *, size_t); void bus_space_read_region_io_4(u_long, uint32_t *, size_t); void bus_space_read_region_io_8(u_long, uint64_t *, size_t); static __inline void bus_space_read_region_1(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint8_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_read_region_io_1(bsh + ofs, bufp, count); else { uint8_t *bsp = (void *)(bsh + ofs); while (count-- > 0) *bufp++ = ia64_ld1(bsp++); } } static __inline void bus_space_read_region_2(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint16_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_read_region_io_2(bsh + ofs, bufp, count); else { uint16_t *bsp = (void *)(bsh + ofs); while (count-- > 0) *bufp++ = ia64_ld2(bsp++); } } static __inline void bus_space_read_region_4(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint32_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_read_region_io_4(bsh + ofs, bufp, count); else { uint32_t *bsp = (void *)(bsh + ofs); while (count-- > 0) *bufp++ = ia64_ld4(bsp++); } } static __inline void bus_space_read_region_8(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint64_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_read_region_io_8(bsh + ofs, bufp, count); else { uint64_t *bsp = (void *)(bsh + ofs); while (count-- > 0) *bufp++ = ia64_ld8(bsp++); } } /* * Write count units of data from bus space described by the tag, handle and * ofs tuple. A unit of data can be 1 byte, 2 bytes, 4 bytes or 8 bytes. The * data is read from the buffer passed by reference and written to successive * bus space addresses. Access is unordered. 
*/ void bus_space_write_region_io_1(u_long, const uint8_t *, size_t); void bus_space_write_region_io_2(u_long, const uint16_t *, size_t); void bus_space_write_region_io_4(u_long, const uint32_t *, size_t); void bus_space_write_region_io_8(u_long, const uint64_t *, size_t); static __inline void bus_space_write_region_1(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, const uint8_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_region_io_1(bsh + ofs, bufp, count); else { uint8_t *bsp = (void *)(bsh + ofs); while (count-- > 0) ia64_st1(bsp++, *bufp++); } } static __inline void bus_space_write_region_2(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, const uint16_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_region_io_2(bsh + ofs, bufp, count); else { uint16_t *bsp = (void *)(bsh + ofs); while (count-- > 0) ia64_st2(bsp++, *bufp++); } } static __inline void bus_space_write_region_4(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, const uint32_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_region_io_4(bsh + ofs, bufp, count); else { uint32_t *bsp = (void *)(bsh + ofs); while (count-- > 0) ia64_st4(bsp++, *bufp++); } } static __inline void bus_space_write_region_8(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, const uint64_t *bufp, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_write_region_io_8(bsh + ofs, bufp, count); else { uint64_t *bsp = (void *)(bsh + ofs); while (count-- > 0) ia64_st8(bsp++, *bufp++); } } /* * Write count units of data from bus space described by the tag, handle and * ofs tuple. A unit of data can be 1 byte, 2 bytes, 4 bytes or 8 bytes. The * data is passed by value. Writes are unordered. */ static __inline void bus_space_set_multi_1(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint8_t val, size_t count) { while (count-- > 0) bus_space_write_1(bst, bsh, ofs, val); } static __inline void bus_space_set_multi_2(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint16_t val, size_t count) { while (count-- > 0) bus_space_write_2(bst, bsh, ofs, val); } static __inline void bus_space_set_multi_4(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint32_t val, size_t count) { while (count-- > 0) bus_space_write_4(bst, bsh, ofs, val); } static __inline void bus_space_set_multi_8(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint64_t val, size_t count) { while (count-- > 0) bus_space_write_8(bst, bsh, ofs, val); } /* * Write count units of data from bus space described by the tag, handle and * ofs tuple. A unit of data can be 1 byte, 2 bytes, 4 bytes or 8 bytes. The * data is passed by value and written to successive bus space addresses. * Writes are unordered. 
*/ void bus_space_set_region_io_1(u_long, uint8_t, size_t); void bus_space_set_region_io_2(u_long, uint16_t, size_t); void bus_space_set_region_io_4(u_long, uint32_t, size_t); void bus_space_set_region_io_8(u_long, uint64_t, size_t); static __inline void bus_space_set_region_1(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint8_t val, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_set_region_io_1(bsh + ofs, val, count); else { uint8_t *bsp = (void *)(bsh + ofs); while (count-- > 0) ia64_st1(bsp++, val); } } static __inline void bus_space_set_region_2(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint16_t val, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_set_region_io_2(bsh + ofs, val, count); else { uint16_t *bsp = (void *)(bsh + ofs); while (count-- > 0) ia64_st2(bsp++, val); } } static __inline void bus_space_set_region_4(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint32_t val, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_set_region_io_4(bsh + ofs, val, count); else { uint32_t *bsp = (void *)(bsh + ofs); while (count-- > 0) ia64_st4(bsp++, val); } } static __inline void bus_space_set_region_8(bus_space_tag_t bst, bus_space_handle_t bsh, bus_size_t ofs, uint64_t val, size_t count) { if (__predict_false(bst == IA64_BUS_SPACE_IO)) bus_space_set_region_io_8(bsh + ofs, val, count); else { uint64_t *bsp = (void *)(bsh + ofs); while (count-- > 0) ia64_st8(bsp++, val); } } /* * Copy count units of data from bus space described by the tag and the first * handle and ofs pair to bus space described by the tag and the second handle * and ofs pair. A unit of data can be 1 byte, 2 bytes, 4 bytes or 8 bytes. * The data is read from successive bus space addresses and also written to * successive bus space addresses. Both reads and writes are unordered.
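 *
 * The memory-space implementations below choose the copy direction the
 * way memmove() does: when the source starts below the destination
 * they copy backwards from the last unit, so overlapping ranges within
 * the same handle are copied safely.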
*/ void bus_space_copy_region_io_1(u_long, u_long, size_t); void bus_space_copy_region_io_2(u_long, u_long, size_t); void bus_space_copy_region_io_4(u_long, u_long, size_t); void bus_space_copy_region_io_8(u_long, u_long, size_t); static __inline void bus_space_copy_region_1(bus_space_tag_t bst, bus_space_handle_t sbsh, bus_size_t sofs, bus_space_handle_t dbsh, bus_size_t dofs, size_t count) { uint8_t *dst, *src; if (__predict_false(bst == IA64_BUS_SPACE_IO)) { bus_space_copy_region_io_1(sbsh + sofs, dbsh + dofs, count); return; } src = (void *)(sbsh + sofs); dst = (void *)(dbsh + dofs); if (src < dst) { src += count - 1; dst += count - 1; while (count-- > 0) ia64_st1(dst--, ia64_ld1(src--)); } else { while (count-- > 0) ia64_st1(dst++, ia64_ld1(src++)); } } static __inline void bus_space_copy_region_2(bus_space_tag_t bst, bus_space_handle_t sbsh, bus_size_t sofs, bus_space_handle_t dbsh, bus_size_t dofs, size_t count) { uint16_t *dst, *src; if (__predict_false(bst == IA64_BUS_SPACE_IO)) { bus_space_copy_region_io_2(sbsh + sofs, dbsh + dofs, count); return; } src = (void *)(sbsh + sofs); dst = (void *)(dbsh + dofs); if (src < dst) { src += count - 1; dst += count - 1; while (count-- > 0) ia64_st2(dst--, ia64_ld2(src--)); } else { while (count-- > 0) ia64_st2(dst++, ia64_ld2(src++)); } } static __inline void bus_space_copy_region_4(bus_space_tag_t bst, bus_space_handle_t sbsh, bus_size_t sofs, bus_space_handle_t dbsh, bus_size_t dofs, size_t count) { uint32_t *dst, *src; if (__predict_false(bst == IA64_BUS_SPACE_IO)) { bus_space_copy_region_io_4(sbsh + sofs, dbsh + dofs, count); return; } src = (void *)(sbsh + sofs); dst = (void *)(dbsh + dofs); if (src < dst) { src += count - 1; dst += count - 1; while (count-- > 0) ia64_st4(dst--, ia64_ld4(src--)); } else { while (count-- > 0) ia64_st4(dst++, ia64_ld4(src++)); } } static __inline void bus_space_copy_region_8(bus_space_tag_t bst, bus_space_handle_t sbsh, bus_size_t sofs, bus_space_handle_t dbsh, bus_size_t dofs, size_t count) { uint64_t *dst, *src; if (__predict_false(bst == IA64_BUS_SPACE_IO)) { bus_space_copy_region_io_8(sbsh + sofs, dbsh + dofs, count); return; } src = (void *)(sbsh + sofs); dst = (void *)(dbsh + dofs); if (src < dst) { src += count - 1; dst += count - 1; while (count-- > 0) ia64_st8(dst--, ia64_ld8(src--)); } else { while (count-- > 0) ia64_st8(dst++, ia64_ld8(src++)); } } /* * Stream accesses are the same as normal accesses on ia64; there are no * supported bus systems with an endianness different from the host one.
*/ #define bus_space_read_stream_1 bus_space_read_1 #define bus_space_read_stream_2 bus_space_read_2 #define bus_space_read_stream_4 bus_space_read_4 #define bus_space_read_stream_8 bus_space_read_8 #define bus_space_write_stream_1 bus_space_write_1 #define bus_space_write_stream_2 bus_space_write_2 #define bus_space_write_stream_4 bus_space_write_4 #define bus_space_write_stream_8 bus_space_write_8 #define bus_space_read_multi_stream_1 bus_space_read_multi_1 #define bus_space_read_multi_stream_2 bus_space_read_multi_2 #define bus_space_read_multi_stream_4 bus_space_read_multi_4 #define bus_space_read_multi_stream_8 bus_space_read_multi_8 #define bus_space_write_multi_stream_1 bus_space_write_multi_1 #define bus_space_write_multi_stream_2 bus_space_write_multi_2 #define bus_space_write_multi_stream_4 bus_space_write_multi_4 #define bus_space_write_multi_stream_8 bus_space_write_multi_8 #define bus_space_read_region_stream_1 bus_space_read_region_1 #define bus_space_read_region_stream_2 bus_space_read_region_2 #define bus_space_read_region_stream_4 bus_space_read_region_4 #define bus_space_read_region_stream_8 bus_space_read_region_8 #define bus_space_write_region_stream_1 bus_space_write_region_1 #define bus_space_write_region_stream_2 bus_space_write_region_2 #define bus_space_write_region_stream_4 bus_space_write_region_4 #define bus_space_write_region_stream_8 bus_space_write_region_8 #define bus_space_set_multi_stream_1 bus_space_set_multi_1 #define bus_space_set_multi_stream_2 bus_space_set_multi_2 #define bus_space_set_multi_stream_4 bus_space_set_multi_4 #define bus_space_set_multi_stream_8 bus_space_set_multi_8 #define bus_space_set_region_stream_1 bus_space_set_region_1 #define bus_space_set_region_stream_2 bus_space_set_region_2 #define bus_space_set_region_stream_4 bus_space_set_region_4 #define bus_space_set_region_stream_8 bus_space_set_region_8 #define bus_space_copy_region_stream_1 bus_space_copy_region_1 #define bus_space_copy_region_stream_2 bus_space_copy_region_2 #define bus_space_copy_region_stream_4 bus_space_copy_region_4 #define bus_space_copy_region_stream_8 bus_space_copy_region_8 + +#endif /* _KERNEL */ #include #endif /* _MACHINE_BUS_H_ */ Index: stable/10/sys/kern/subr_bus_dma.c =================================================================== --- stable/10/sys/kern/subr_bus_dma.c (revision 292347) +++ stable/10/sys/kern/subr_bus_dma.c (revision 292348) @@ -1,542 +1,581 @@ /*- * Copyright (c) 2012 EMC Corp. * All rights reserved. * * Copyright (c) 1997, 1998 Justin T. Gibbs. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_bus.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* - * Load a list of virtual addresses. + * Load up data starting at offset within a region specified by a + * list of virtual address ranges until either length or the region + * are exhausted. */ static int _bus_dmamap_load_vlist(bus_dma_tag_t dmat, bus_dmamap_t map, bus_dma_segment_t *list, int sglist_cnt, struct pmap *pmap, int *nsegs, - int flags) + int flags, size_t offset, size_t length) { int error; error = 0; - for (; sglist_cnt > 0; sglist_cnt--, list++) { - error = _bus_dmamap_load_buffer(dmat, map, - (void *)(uintptr_t)list->ds_addr, list->ds_len, pmap, + for (; sglist_cnt > 0 && length != 0; sglist_cnt--, list++) { + char *addr; + size_t ds_len; + + KASSERT((offset < list->ds_len), + ("Invalid mid-segment offset")); + addr = (char *)(uintptr_t)list->ds_addr + offset; + ds_len = list->ds_len - offset; + offset = 0; + if (ds_len > length) + ds_len = length; + length -= ds_len; + KASSERT((ds_len != 0), ("Segment length is zero")); + error = _bus_dmamap_load_buffer(dmat, map, addr, ds_len, pmap, flags, NULL, nsegs); if (error) break; } return (error); } /* * Load a list of physical addresses. */ static int _bus_dmamap_load_plist(bus_dma_tag_t dmat, bus_dmamap_t map, bus_dma_segment_t *list, int sglist_cnt, int *nsegs, int flags) { int error; error = 0; for (; sglist_cnt > 0; sglist_cnt--, list++) { error = _bus_dmamap_load_phys(dmat, map, (vm_paddr_t)list->ds_addr, list->ds_len, flags, NULL, nsegs); if (error) break; } return (error); } /* * Load an mbuf chain. */ static int _bus_dmamap_load_mbuf_sg(bus_dma_tag_t dmat, bus_dmamap_t map, struct mbuf *m0, bus_dma_segment_t *segs, int *nsegs, int flags) { struct mbuf *m; int error; error = 0; for (m = m0; m != NULL && error == 0; m = m->m_next) { if (m->m_len > 0) { error = _bus_dmamap_load_buffer(dmat, map, m->m_data, m->m_len, kernel_pmap, flags | BUS_DMA_LOAD_MBUF, segs, nsegs); } } CTR5(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d nsegs %d", __func__, dmat, flags, error, *nsegs); return (error); } /* + * Load tlen data starting at offset within a region specified by a list of + * physical pages. + */ +static int +_bus_dmamap_load_pages(bus_dma_tag_t dmat, bus_dmamap_t map, + vm_page_t *pages, bus_size_t tlen, int offset, int *nsegs, int flags) +{ + vm_paddr_t paddr; + bus_size_t len; + int error, i; + + for (i = 0, error = 0; error == 0 && tlen > 0; i++, tlen -= len) { + len = min(PAGE_SIZE - offset, tlen); + paddr = VM_PAGE_TO_PHYS(pages[i]) + offset; + error = _bus_dmamap_load_phys(dmat, map, paddr, len, + flags, NULL, nsegs); + offset = 0; + } + return (error); +} + +/* * Load from block io. 
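 *
 * After this change a bio can describe its payload in three ways, and
 * the loader dispatches on the bio flags (summarizing the code below):
 *
 *	BIO_VLIST	bio_data points at an array of bus_dma_segment_t
 *			virtual ranges; loaded via _bus_dmamap_load_vlist().
 *	BIO_UNMAPPED	bio_ma is an array of vm_page_t; loaded via
 *			_bus_dmamap_load_pages().
 *	neither		bio_data is an ordinary kernel virtual buffer.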
*/ static int _bus_dmamap_load_bio(bus_dma_tag_t dmat, bus_dmamap_t map, struct bio *bio, int *nsegs, int flags) { - int error; - if ((bio->bio_flags & BIO_UNMAPPED) == 0) { - error = _bus_dmamap_load_buffer(dmat, map, bio->bio_data, - bio->bio_bcount, kernel_pmap, flags, NULL, nsegs); - } else { - error = _bus_dmamap_load_ma(dmat, map, bio->bio_ma, - bio->bio_bcount, bio->bio_ma_offset, flags, NULL, nsegs); + if ((bio->bio_flags & BIO_VLIST) != 0) { + bus_dma_segment_t *segs = (bus_dma_segment_t *)bio->bio_data; + return (_bus_dmamap_load_vlist(dmat, map, segs, bio->bio_ma_n, + kernel_pmap, nsegs, flags, bio->bio_ma_offset, + bio->bio_bcount)); } - return (error); + + if ((bio->bio_flags & BIO_UNMAPPED) != 0) + return (_bus_dmamap_load_pages(dmat, map, bio->bio_ma, + bio->bio_bcount, bio->bio_ma_offset, nsegs, flags)); + + return (_bus_dmamap_load_buffer(dmat, map, bio->bio_data, + bio->bio_bcount, kernel_pmap, flags, NULL, nsegs)); } int bus_dmamap_load_ma_triv(bus_dma_tag_t dmat, bus_dmamap_t map, struct vm_page **ma, bus_size_t tlen, int ma_offs, int flags, bus_dma_segment_t *segs, int *segp) { vm_paddr_t paddr; bus_size_t len; int error, i; error = 0; for (i = 0; tlen > 0; i++, tlen -= len) { len = min(PAGE_SIZE - ma_offs, tlen); paddr = VM_PAGE_TO_PHYS(ma[i]) + ma_offs; error = _bus_dmamap_load_phys(dmat, map, paddr, len, flags, segs, segp); if (error != 0) break; ma_offs = 0; } return (error); } /* * Load a cam control block. */ static int _bus_dmamap_load_ccb(bus_dma_tag_t dmat, bus_dmamap_t map, union ccb *ccb, int *nsegs, int flags) { struct ccb_hdr *ccb_h; void *data_ptr; int error; uint32_t dxfer_len; uint16_t sglist_cnt; error = 0; ccb_h = &ccb->ccb_h; switch (ccb_h->func_code) { case XPT_SCSI_IO: { struct ccb_scsiio *csio; csio = &ccb->csio; data_ptr = csio->data_ptr; dxfer_len = csio->dxfer_len; sglist_cnt = csio->sglist_cnt; break; } case XPT_CONT_TARGET_IO: { struct ccb_scsiio *ctio; ctio = &ccb->ctio; data_ptr = ctio->data_ptr; dxfer_len = ctio->dxfer_len; sglist_cnt = ctio->sglist_cnt; break; } case XPT_ATA_IO: { struct ccb_ataio *ataio; ataio = &ccb->ataio; data_ptr = ataio->data_ptr; dxfer_len = ataio->dxfer_len; sglist_cnt = 0; break; } default: panic("_bus_dmamap_load_ccb: Unsupported func code %d", ccb_h->func_code); } switch ((ccb_h->flags & CAM_DATA_MASK)) { case CAM_DATA_VADDR: error = _bus_dmamap_load_buffer(dmat, map, data_ptr, dxfer_len, kernel_pmap, flags, NULL, nsegs); break; case CAM_DATA_PADDR: error = _bus_dmamap_load_phys(dmat, map, (vm_paddr_t)(uintptr_t)data_ptr, dxfer_len, flags, NULL, nsegs); break; case CAM_DATA_SG: error = _bus_dmamap_load_vlist(dmat, map, (bus_dma_segment_t *)data_ptr, sglist_cnt, kernel_pmap, - nsegs, flags); + nsegs, flags, 0, dxfer_len); break; case CAM_DATA_SG_PADDR: error = _bus_dmamap_load_plist(dmat, map, (bus_dma_segment_t *)data_ptr, sglist_cnt, nsegs, flags); break; case CAM_DATA_BIO: error = _bus_dmamap_load_bio(dmat, map, (struct bio *)data_ptr, nsegs, flags); break; default: panic("_bus_dmamap_load_ccb: flags 0x%X unimplemented", ccb_h->flags); } return (error); } /* * Load a uio. 
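 * For a userspace uio the buffer pages are resolved through the owning
 * thread's pmap (vmspace_pmap() of its process); kernel uio's use
 * kernel_pmap.  Iovecs are walked until uio_resid is exhausted.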
*/ static int _bus_dmamap_load_uio(bus_dma_tag_t dmat, bus_dmamap_t map, struct uio *uio, int *nsegs, int flags) { bus_size_t resid; bus_size_t minlen; struct iovec *iov; pmap_t pmap; caddr_t addr; int error, i; if (uio->uio_segflg == UIO_USERSPACE) { KASSERT(uio->uio_td != NULL, ("bus_dmamap_load_uio: USERSPACE but no proc")); pmap = vmspace_pmap(uio->uio_td->td_proc->p_vmspace); } else pmap = kernel_pmap; resid = uio->uio_resid; iov = uio->uio_iov; error = 0; for (i = 0; i < uio->uio_iovcnt && resid != 0 && !error; i++) { /* * Now at the first iovec to load. Load each iovec * until we have exhausted the residual count. */ addr = (caddr_t) iov[i].iov_base; minlen = resid < iov[i].iov_len ? resid : iov[i].iov_len; if (minlen > 0) { error = _bus_dmamap_load_buffer(dmat, map, addr, minlen, pmap, flags, NULL, nsegs); resid -= minlen; } } return (error); } /* * Map the buffer buf into bus space using the dmamap map. */ int bus_dmamap_load(bus_dma_tag_t dmat, bus_dmamap_t map, void *buf, bus_size_t buflen, bus_dmamap_callback_t *callback, void *callback_arg, int flags) { bus_dma_segment_t *segs; struct memdesc mem; int error; int nsegs; if ((flags & BUS_DMA_NOWAIT) == 0) { mem = memdesc_vaddr(buf, buflen); _bus_dmamap_waitok(dmat, map, &mem, callback, callback_arg); } nsegs = -1; error = _bus_dmamap_load_buffer(dmat, map, buf, buflen, kernel_pmap, flags, NULL, &nsegs); nsegs++; CTR5(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d nsegs %d", __func__, dmat, flags, error, nsegs); if (error == EINPROGRESS) return (error); segs = _bus_dmamap_complete(dmat, map, NULL, nsegs, error); if (error) (*callback)(callback_arg, segs, 0, error); else (*callback)(callback_arg, segs, nsegs, 0); /* * Return ENOMEM to the caller so that it can pass it up the stack. * This error only happens when NOWAIT is set, so deferral is disabled. 
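 * (The converse case: without BUS_DMA_NOWAIT the load can instead
 * return EINPROGRESS, because _bus_dmamap_waitok() has queued the
 * callback to run later, once mapping resources become available.)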
*/ if (error == ENOMEM) return (error); return (0); } int bus_dmamap_load_mbuf(bus_dma_tag_t dmat, bus_dmamap_t map, struct mbuf *m0, bus_dmamap_callback2_t *callback, void *callback_arg, int flags) { bus_dma_segment_t *segs; int nsegs, error; M_ASSERTPKTHDR(m0); flags |= BUS_DMA_NOWAIT; nsegs = -1; error = _bus_dmamap_load_mbuf_sg(dmat, map, m0, NULL, &nsegs, flags); ++nsegs; segs = _bus_dmamap_complete(dmat, map, NULL, nsegs, error); if (error) (*callback)(callback_arg, segs, 0, 0, error); else (*callback)(callback_arg, segs, nsegs, m0->m_pkthdr.len, error); CTR5(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d nsegs %d", __func__, dmat, flags, error, nsegs); return (error); } int bus_dmamap_load_mbuf_sg(bus_dma_tag_t dmat, bus_dmamap_t map, struct mbuf *m0, bus_dma_segment_t *segs, int *nsegs, int flags) { int error; flags |= BUS_DMA_NOWAIT; *nsegs = -1; error = _bus_dmamap_load_mbuf_sg(dmat, map, m0, segs, nsegs, flags); ++*nsegs; _bus_dmamap_complete(dmat, map, segs, *nsegs, error); return (error); } int bus_dmamap_load_uio(bus_dma_tag_t dmat, bus_dmamap_t map, struct uio *uio, bus_dmamap_callback2_t *callback, void *callback_arg, int flags) { bus_dma_segment_t *segs; int nsegs, error; flags |= BUS_DMA_NOWAIT; nsegs = -1; error = _bus_dmamap_load_uio(dmat, map, uio, &nsegs, flags); nsegs++; segs = _bus_dmamap_complete(dmat, map, NULL, nsegs, error); if (error) (*callback)(callback_arg, segs, 0, 0, error); else (*callback)(callback_arg, segs, nsegs, uio->uio_resid, error); CTR5(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d nsegs %d", __func__, dmat, flags, error, nsegs); return (error); } int bus_dmamap_load_ccb(bus_dma_tag_t dmat, bus_dmamap_t map, union ccb *ccb, bus_dmamap_callback_t *callback, void *callback_arg, int flags) { bus_dma_segment_t *segs; struct ccb_hdr *ccb_h; struct memdesc mem; int error; int nsegs; ccb_h = &ccb->ccb_h; if ((ccb_h->flags & CAM_DIR_MASK) == CAM_DIR_NONE) { callback(callback_arg, NULL, 0, 0); return (0); } if ((flags & BUS_DMA_NOWAIT) == 0) { mem = memdesc_ccb(ccb); _bus_dmamap_waitok(dmat, map, &mem, callback, callback_arg); } nsegs = -1; error = _bus_dmamap_load_ccb(dmat, map, ccb, &nsegs, flags); nsegs++; CTR5(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d nsegs %d", __func__, dmat, flags, error, nsegs); if (error == EINPROGRESS) return (error); segs = _bus_dmamap_complete(dmat, map, NULL, nsegs, error); if (error) (*callback)(callback_arg, segs, 0, error); else (*callback)(callback_arg, segs, nsegs, error); /* * Return ENOMEM to the caller so that it can pass it up the stack. * This error only happens when NOWAIT is set, so deferral is disabled. */ if (error == ENOMEM) return (error); return (0); } int bus_dmamap_load_bio(bus_dma_tag_t dmat, bus_dmamap_t map, struct bio *bio, bus_dmamap_callback_t *callback, void *callback_arg, int flags) { bus_dma_segment_t *segs; struct memdesc mem; int error; int nsegs; if ((flags & BUS_DMA_NOWAIT) == 0) { mem = memdesc_bio(bio); _bus_dmamap_waitok(dmat, map, &mem, callback, callback_arg); } nsegs = -1; error = _bus_dmamap_load_bio(dmat, map, bio, &nsegs, flags); nsegs++; CTR5(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d nsegs %d", __func__, dmat, flags, error, nsegs); if (error == EINPROGRESS) return (error); segs = _bus_dmamap_complete(dmat, map, NULL, nsegs, error); if (error) (*callback)(callback_arg, segs, 0, error); else (*callback)(callback_arg, segs, nsegs, error); /* * Return ENOMEM to the caller so that it can pass it up the stack. 
* This error only happens when NOWAIT is set, so deferral is disabled. */ if (error == ENOMEM) return (error); return (0); } int bus_dmamap_load_mem(bus_dma_tag_t dmat, bus_dmamap_t map, struct memdesc *mem, bus_dmamap_callback_t *callback, void *callback_arg, int flags) { bus_dma_segment_t *segs; int error; int nsegs; if ((flags & BUS_DMA_NOWAIT) == 0) _bus_dmamap_waitok(dmat, map, mem, callback, callback_arg); nsegs = -1; error = 0; switch (mem->md_type) { case MEMDESC_VADDR: error = _bus_dmamap_load_buffer(dmat, map, mem->u.md_vaddr, mem->md_opaque, kernel_pmap, flags, NULL, &nsegs); break; case MEMDESC_PADDR: error = _bus_dmamap_load_phys(dmat, map, mem->u.md_paddr, mem->md_opaque, flags, NULL, &nsegs); break; case MEMDESC_VLIST: error = _bus_dmamap_load_vlist(dmat, map, mem->u.md_list, - mem->md_opaque, kernel_pmap, &nsegs, flags); + mem->md_opaque, kernel_pmap, &nsegs, flags, 0, SIZE_T_MAX); break; case MEMDESC_PLIST: error = _bus_dmamap_load_plist(dmat, map, mem->u.md_list, mem->md_opaque, &nsegs, flags); break; case MEMDESC_BIO: error = _bus_dmamap_load_bio(dmat, map, mem->u.md_bio, &nsegs, flags); break; case MEMDESC_UIO: error = _bus_dmamap_load_uio(dmat, map, mem->u.md_uio, &nsegs, flags); break; case MEMDESC_MBUF: error = _bus_dmamap_load_mbuf_sg(dmat, map, mem->u.md_mbuf, NULL, &nsegs, flags); break; case MEMDESC_CCB: error = _bus_dmamap_load_ccb(dmat, map, mem->u.md_ccb, &nsegs, flags); break; } nsegs++; CTR5(KTR_BUSDMA, "%s: tag %p tag flags 0x%x error %d nsegs %d", __func__, dmat, flags, error, nsegs); if (error == EINPROGRESS) return (error); segs = _bus_dmamap_complete(dmat, map, NULL, nsegs, error); if (error) (*callback)(callback_arg, segs, 0, error); else (*callback)(callback_arg, segs, nsegs, 0); /* * Return ENOMEM to the caller so that it can pass it up the stack. * This error only happens when NOWAIT is set, so deferral is disabled. */ if (error == ENOMEM) return (error); return (0); } Index: stable/10/sys/kern/subr_uio.c =================================================================== --- stable/10/sys/kern/subr_uio.c (revision 292347) +++ stable/10/sys/kern/subr_uio.c (revision 292348) @@ -1,570 +1,624 @@ /*- * Copyright (c) 1982, 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Copyright (c) 2014 The FreeBSD Foundation * * Portions of this software were developed by Konstantin Belousov * under sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. 
* * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)kern_subr.c 8.3 (Berkeley) 1/21/94 */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include + SYSCTL_INT(_kern, KERN_IOV_MAX, iov_max, CTLFLAG_RD, SYSCTL_NULL_INT_PTR, UIO_MAXIOV, "Maximum number of elements in an I/O vector; sysconf(_SC_IOV_MAX)"); static int uiomove_faultflag(void *cp, int n, struct uio *uio, int nofault); int copyin_nofault(const void *udaddr, void *kaddr, size_t len) { int error, save; save = vm_fault_disable_pagefaults(); error = copyin(udaddr, kaddr, len); vm_fault_enable_pagefaults(save); return (error); } int copyout_nofault(const void *kaddr, void *udaddr, size_t len) { int error, save; save = vm_fault_disable_pagefaults(); error = copyout(kaddr, udaddr, len); vm_fault_enable_pagefaults(save); return (error); } #define PHYS_PAGE_COUNT(len) (howmany(len, PAGE_SIZE) + 1) int physcopyin(void *src, vm_paddr_t dst, size_t len) { vm_page_t m[PHYS_PAGE_COUNT(len)]; struct iovec iov[1]; struct uio uio; int i; iov[0].iov_base = src; iov[0].iov_len = len; uio.uio_iov = iov; uio.uio_iovcnt = 1; uio.uio_offset = 0; uio.uio_resid = len; uio.uio_segflg = UIO_SYSSPACE; uio.uio_rw = UIO_WRITE; for (i = 0; i < PHYS_PAGE_COUNT(len); i++, dst += PAGE_SIZE) m[i] = PHYS_TO_VM_PAGE(dst); return (uiomove_fromphys(m, dst & PAGE_MASK, len, &uio)); } int physcopyout(vm_paddr_t src, void *dst, size_t len) { vm_page_t m[PHYS_PAGE_COUNT(len)]; struct iovec iov[1]; struct uio uio; int i; iov[0].iov_base = dst; iov[0].iov_len = len; uio.uio_iov = iov; uio.uio_iovcnt = 1; uio.uio_offset = 0; uio.uio_resid = len; uio.uio_segflg = UIO_SYSSPACE; uio.uio_rw = UIO_READ; for (i = 0; i < PHYS_PAGE_COUNT(len); i++, src += PAGE_SIZE) m[i] = PHYS_TO_VM_PAGE(src); return (uiomove_fromphys(m, src & PAGE_MASK, len, &uio)); } #undef PHYS_PAGE_COUNT + +int +physcopyin_vlist(bus_dma_segment_t *src, off_t offset, vm_paddr_t dst, + size_t len) +{ + size_t seg_len; + int error; + + error = 0; + while (offset >= src->ds_len) { + offset -= src->ds_len; + src++; + } + + while (len > 0 && error == 0) { + seg_len = MIN(src->ds_len - offset, len); + error = physcopyin((void *)(uintptr_t)(src->ds_addr + offset), + dst, seg_len); + offset = 0; + src++; + len -= seg_len; + dst += seg_len; + } + + return (error); +} + +int +physcopyout_vlist(vm_paddr_t src, bus_dma_segment_t *dst, off_t offset, + size_t len) +{ + size_t seg_len; + int error; + + error = 0; + while (offset >= dst->ds_len) { + offset -= dst->ds_len; + dst++; + } + + while (len > 0 && error == 0) { + seg_len = MIN(dst->ds_len - offset, len); + error = physcopyout(src, (void *)(uintptr_t)(dst->ds_addr + + offset), seg_len); + offset = 0; + dst++; + len -= 
seg_len; + src += seg_len; + } + + return (error); +} int uiomove(void *cp, int n, struct uio *uio) { return (uiomove_faultflag(cp, n, uio, 0)); } int uiomove_nofault(void *cp, int n, struct uio *uio) { return (uiomove_faultflag(cp, n, uio, 1)); } static int uiomove_faultflag(void *cp, int n, struct uio *uio, int nofault) { struct thread *td; struct iovec *iov; size_t cnt; int error, newflags, save; td = curthread; error = 0; KASSERT(uio->uio_rw == UIO_READ || uio->uio_rw == UIO_WRITE, ("uiomove: mode")); KASSERT(uio->uio_segflg != UIO_USERSPACE || uio->uio_td == td, ("uiomove proc")); if (!nofault) WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "Calling uiomove()"); /* XXX does it make sense to set TDP_DEADLKTREAT for UIO_SYSSPACE ? */ newflags = TDP_DEADLKTREAT; if (uio->uio_segflg == UIO_USERSPACE && nofault) { /* * Fail if a non-spurious page fault occurs. */ newflags |= TDP_NOFAULTING | TDP_RESETSPUR; } save = curthread_pflags_set(newflags); while (n > 0 && uio->uio_resid) { iov = uio->uio_iov; cnt = iov->iov_len; if (cnt == 0) { uio->uio_iov++; uio->uio_iovcnt--; continue; } if (cnt > n) cnt = n; switch (uio->uio_segflg) { case UIO_USERSPACE: maybe_yield(); if (uio->uio_rw == UIO_READ) error = copyout(cp, iov->iov_base, cnt); else error = copyin(iov->iov_base, cp, cnt); if (error) goto out; break; case UIO_SYSSPACE: if (uio->uio_rw == UIO_READ) bcopy(cp, iov->iov_base, cnt); else bcopy(iov->iov_base, cp, cnt); break; case UIO_NOCOPY: break; } iov->iov_base = (char *)iov->iov_base + cnt; iov->iov_len -= cnt; uio->uio_resid -= cnt; uio->uio_offset += cnt; cp = (char *)cp + cnt; n -= cnt; } out: curthread_pflags_restore(save); return (error); } /* * Wrapper for uiomove() that validates the arguments against a known-good * kernel buffer. Currently, uiomove accepts a signed (n) argument, which * is almost definitely a bad thing, so we catch that here as well. We * return a runtime failure, but it might be desirable to generate a runtime * assertion failure instead. */ int uiomove_frombuf(void *buf, int buflen, struct uio *uio) { size_t offset, n; if (uio->uio_offset < 0 || uio->uio_resid < 0 || (offset = uio->uio_offset) != uio->uio_offset) return (EINVAL); if (buflen <= 0 || offset >= buflen) return (0); if ((n = buflen - offset) > IOSIZE_MAX) return (EINVAL); return (uiomove((char *)buf + offset, n, uio)); } /* * Give next character to user as result of read.
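 * (For context: uiomove() and uiomove_frombuf() above are the usual way
 * for a driver to move data to and from a caller's buffers. A minimal,
 * hypothetical cdev read routine, with all names invented, might be:
 *
 *	static int
 *	foo_read(struct cdev *dev, struct uio *uio, int ioflag)
 *	{
 *		struct foo_softc *sc = dev->si_drv1;
 *
 *		return (uiomove_frombuf(sc->sc_data, sc->sc_len, uio));
 *	}
 *
 * uiomove_frombuf() clips the transfer against both the buffer length and
 * uio_offset, which is what makes it safer than a bare uiomove() for
 * exporting a fixed-size kernel buffer.)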
*/ int ureadc(int c, struct uio *uio) { struct iovec *iov; char *iov_base; WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "Calling ureadc()"); again: if (uio->uio_iovcnt == 0 || uio->uio_resid == 0) panic("ureadc"); iov = uio->uio_iov; if (iov->iov_len == 0) { uio->uio_iovcnt--; uio->uio_iov++; goto again; } switch (uio->uio_segflg) { case UIO_USERSPACE: if (subyte(iov->iov_base, c) < 0) return (EFAULT); break; case UIO_SYSSPACE: iov_base = iov->iov_base; *iov_base = c; break; case UIO_NOCOPY: break; } iov->iov_base = (char *)iov->iov_base + 1; iov->iov_len--; uio->uio_resid--; uio->uio_offset++; return (0); } int copyinfrom(const void * __restrict src, void * __restrict dst, size_t len, int seg) { int error = 0; switch (seg) { case UIO_USERSPACE: error = copyin(src, dst, len); break; case UIO_SYSSPACE: bcopy(src, dst, len); break; default: panic("copyinfrom: bad seg %d\n", seg); } return (error); } int copyinstrfrom(const void * __restrict src, void * __restrict dst, size_t len, size_t * __restrict copied, int seg) { int error = 0; switch (seg) { case UIO_USERSPACE: error = copyinstr(src, dst, len, copied); break; case UIO_SYSSPACE: error = copystr(src, dst, len, copied); break; default: panic("copyinstrfrom: bad seg %d\n", seg); } return (error); } int copyiniov(const struct iovec *iovp, u_int iovcnt, struct iovec **iov, int error) { u_int iovlen; *iov = NULL; if (iovcnt > UIO_MAXIOV) return (error); iovlen = iovcnt * sizeof (struct iovec); *iov = malloc(iovlen, M_IOV, M_WAITOK); error = copyin(iovp, *iov, iovlen); if (error) { free(*iov, M_IOV); *iov = NULL; } return (error); } int copyinuio(const struct iovec *iovp, u_int iovcnt, struct uio **uiop) { struct iovec *iov; struct uio *uio; u_int iovlen; int error, i; *uiop = NULL; if (iovcnt > UIO_MAXIOV) return (EINVAL); iovlen = iovcnt * sizeof (struct iovec); uio = malloc(iovlen + sizeof *uio, M_IOV, M_WAITOK); iov = (struct iovec *)(uio + 1); error = copyin(iovp, iov, iovlen); if (error) { free(uio, M_IOV); return (error); } uio->uio_iov = iov; uio->uio_iovcnt = iovcnt; uio->uio_segflg = UIO_USERSPACE; uio->uio_offset = -1; uio->uio_resid = 0; for (i = 0; i < iovcnt; i++) { if (iov->iov_len > IOSIZE_MAX - uio->uio_resid) { free(uio, M_IOV); return (EINVAL); } uio->uio_resid += iov->iov_len; iov++; } *uiop = uio; return (0); } struct uio * cloneuio(struct uio *uiop) { struct uio *uio; int iovlen; iovlen = uiop->uio_iovcnt * sizeof (struct iovec); uio = malloc(iovlen + sizeof *uio, M_IOV, M_WAITOK); *uio = *uiop; uio->uio_iov = (struct iovec *)(uio + 1); bcopy(uiop->uio_iov, uio->uio_iov, iovlen); return (uio); } /* * Map some anonymous memory in user space of size sz, rounded up to the page * boundary. */ int copyout_map(struct thread *td, vm_offset_t *addr, size_t sz) { struct vmspace *vms; int error; vm_size_t size; vms = td->td_proc->p_vmspace; /* * Map somewhere after heap in process memory. */ PROC_LOCK(td->td_proc); *addr = round_page((vm_offset_t)vms->vm_daddr + lim_max(td->td_proc, RLIMIT_DATA)); PROC_UNLOCK(td->td_proc); /* round size up to page boundary */ size = (vm_size_t)round_page(sz); error = vm_mmap(&vms->vm_map, addr, size, PROT_READ | PROT_WRITE, VM_PROT_ALL, MAP_PRIVATE | MAP_ANON, OBJT_DEFAULT, NULL, 0); return (error); } /* * Unmap memory in user space.
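 *
 * copyout_map() above and this function are normally used as a pair,
 * e.g. (a sketch only; "kbuf" and "len" are invented):
 *
 *	vm_offset_t uaddr;
 *
 *	error = copyout_map(td, &uaddr, len);
 *	if (error == 0) {
 *		error = copyout(kbuf, (void *)uaddr, len);
 *		if (error != 0)
 *			(void)copyout_unmap(td, uaddr, len);
 *	}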
*/ int copyout_unmap(struct thread *td, vm_offset_t addr, size_t sz) { vm_map_t map; vm_size_t size; if (sz == 0) return (0); map = &td->td_proc->p_vmspace->vm_map; size = (vm_size_t)round_page(sz); if (vm_map_remove(map, addr, addr + size) != KERN_SUCCESS) return (EINVAL); return (0); } #ifdef NO_FUEWORD /* * XXXKIB The temporary implementation of fue*() functions which do not * handle usermode -1 properly, mixing it with the fault code. Keep * this until MD code is written. Currently sparc64, mips and arm do * not have a proper implementation. */ int fueword(volatile const void *base, long *val) { long res; res = fuword(base); if (res == -1) return (-1); *val = res; return (0); } int fueword32(volatile const void *base, int32_t *val) { int32_t res; res = fuword32(base); if (res == -1) return (-1); *val = res; return (0); } #ifdef _LP64 int fueword64(volatile const void *base, int64_t *val) { int64_t res; res = fuword64(base); if (res == -1) return (-1); *val = res; return (0); } #endif int casueword32(volatile uint32_t *base, uint32_t oldval, uint32_t *oldvalp, uint32_t newval) { int32_t ov; ov = casuword32(base, oldval, newval); if (ov == -1) return (-1); *oldvalp = ov; return (0); } int casueword(volatile u_long *p, u_long oldval, u_long *oldvalp, u_long newval) { u_long ov; ov = casuword(p, oldval, newval); if (ov == -1) return (-1); *oldvalp = ov; return (0); } #else /* NO_FUEWORD */ int32_t fuword32(volatile const void *addr) { int rv; int32_t val; rv = fueword32(addr, &val); return (rv == -1 ? -1 : val); } #ifdef _LP64 int64_t fuword64(volatile const void *addr) { int rv; int64_t val; rv = fueword64(addr, &val); return (rv == -1 ? -1 : val); } #endif /* _LP64 */ long fuword(volatile const void *addr) { long val; int rv; rv = fueword(addr, &val); return (rv == -1 ? -1 : val); } uint32_t casuword32(volatile uint32_t *addr, uint32_t old, uint32_t new) { int rv; uint32_t val; rv = casueword32(addr, old, &val, new); return (rv == -1 ? -1 : val); } u_long casuword(volatile u_long *addr, u_long old, u_long new) { int rv; u_long val; rv = casueword(addr, old, &val, new); return (rv == -1 ? -1 : val); } #endif /* NO_FUEWORD */ Index: stable/10/sys/pc98/include/bus.h =================================================================== --- stable/10/sys/pc98/include/bus.h (revision 292347) +++ stable/10/sys/pc98/include/bus.h (revision 292348) @@ -1,642 +1,648 @@ /*- * Copyright (c) KATO Takenori, 1999. * * All rights reserved. Unpublished rights reserved under the copyright * laws of Japan. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer as * the first lines of this file unmodified. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /* $NecBSD: busio.h,v 3.25.4.2.2.1 2000/06/12 03:53:08 honda Exp $ */ /* $NetBSD: bus.h,v 1.12 1997/10/01 08:25:15 fvdl Exp $ */ /*- * [NetBSD for NEC PC-98 series] * Copyright (c) 1997, 1998 * NetBSD/pc98 porting staff. All rights reserved. * * [Ported for FreeBSD] * Copyright (c) 2001 * TAKAHASHI Yoshihiro. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE * DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ /* * Copyright (c) 1997, 1998 * Naofumi HONDA. All rights reserved. * * This module supports a generic bus address relocation mechanism. * To reduce function call overhead, we employ pascal call methods. */ #ifndef _PC98_BUS_H_ #define _PC98_BUS_H_ +#ifdef _KERNEL #include +#endif /* _KERNEL */ #include #include #define BUS_SPACE_MAXSIZE_24BIT 0xFFFFFF #define BUS_SPACE_MAXSIZE_32BIT 0xFFFFFFFF #define BUS_SPACE_MAXSIZE 0xFFFFFFFF #define BUS_SPACE_MAXADDR_24BIT 0xFFFFFF #define BUS_SPACE_MAXADDR_32BIT 0xFFFFFFFF #define BUS_SPACE_MAXADDR 0xFFFFFFFF #define BUS_SPACE_UNRESTRICTED (~0) +#ifdef _KERNEL + /* * address relocation table */ #define BUS_SPACE_IAT_MAXSIZE 33 typedef bus_addr_t *bus_space_iat_t; #define BUS_SPACE_IAT_SZ(IOTARRAY) (sizeof(IOTARRAY)/sizeof(bus_addr_t)) /* * Access methods for bus resources and address space.
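 *
 * In driver code these methods are used roughly as follows (a sketch
 * only; "sc" and the register offsets are invented):
 *
 *	uint8_t status;
 *
 *	status = bus_space_read_1(sc->sc_iot, sc->sc_ioh, 0x00);
 *	bus_space_write_1(sc->sc_iot, sc->sc_ioh, 0x02, status | 0x01);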
*/ struct resource; /* * bus space tag */ #define _PASCAL_CALL (void) #define _BUS_SPACE_CALL_FUNCS_TAB(NAME,TYPE,BWN) \ NAME##_space_read_##BWN, \ NAME##_space_read_multi_##BWN, \ NAME##_space_read_region_##BWN, \ NAME##_space_write_##BWN, \ NAME##_space_write_multi_##BWN, \ NAME##_space_write_region_##BWN, \ NAME##_space_set_multi_##BWN, \ NAME##_space_set_region_##BWN, \ NAME##_space_copy_region_##BWN #define _BUS_SPACE_CALL_FUNCS_PROTO(NAME,TYPE,BWN) \ TYPE NAME##_space_read_##BWN _PASCAL_CALL; \ void NAME##_space_read_multi_##BWN _PASCAL_CALL; \ void NAME##_space_read_region_##BWN _PASCAL_CALL; \ void NAME##_space_write_##BWN _PASCAL_CALL; \ void NAME##_space_write_multi_##BWN _PASCAL_CALL; \ void NAME##_space_write_region_##BWN _PASCAL_CALL; \ void NAME##_space_set_multi_##BWN _PASCAL_CALL; \ void NAME##_space_set_region_##BWN _PASCAL_CALL; \ void NAME##_space_copy_region_##BWN _PASCAL_CALL; #define _BUS_SPACE_CALL_FUNCS(NAME,TYPE,BWN) \ TYPE (* NAME##_read_##BWN) _PASCAL_CALL; \ void (* NAME##_read_multi_##BWN) _PASCAL_CALL; \ void (* NAME##_read_region_##BWN) _PASCAL_CALL; \ void (* NAME##_write_##BWN) _PASCAL_CALL; \ void (* NAME##_write_multi_##BWN) _PASCAL_CALL; \ void (* NAME##_write_region_##BWN) _PASCAL_CALL; \ void (* NAME##_set_multi_##BWN) _PASCAL_CALL; \ void (* NAME##_set_region_##BWN) _PASCAL_CALL; \ void (* NAME##_copy_region_##BWN) _PASCAL_CALL; struct bus_space_access_methods { /* 8 bits access methods */ _BUS_SPACE_CALL_FUNCS(bs,u_int8_t,1) /* 16 bits access methods */ _BUS_SPACE_CALL_FUNCS(bs,u_int16_t,2) /* 32 bits access methods */ _BUS_SPACE_CALL_FUNCS(bs,u_int32_t,4) }; /* * Access methods for bus resources and address space. */ struct bus_space_tag { #define BUS_SPACE_TAG_IO 0 #define BUS_SPACE_TAG_MEM 1 u_int bs_tag; /* bus space flags */ struct bus_space_access_methods bs_da; /* direct access */ struct bus_space_access_methods bs_ra; /* relocate access */ #if 0 struct bus_space_access_methods bs_ida; /* indexed direct access */ #endif }; /* * bus space handle */ struct bus_space_handle { bus_addr_t bsh_base; size_t bsh_sz; bus_addr_t bsh_iat[BUS_SPACE_IAT_MAXSIZE]; size_t bsh_maxiatsz; size_t bsh_iatsz; struct resource **bsh_res; size_t bsh_ressz; struct bus_space_access_methods bsh_bam; }; /* * Values for the pc98 bus space tag, not to be used directly by MI code. */ extern struct bus_space_tag SBUS_io_space_tag; extern struct bus_space_tag SBUS_mem_space_tag; #define X86_BUS_SPACE_IO (&SBUS_io_space_tag) #define X86_BUS_SPACE_MEM (&SBUS_mem_space_tag) /* * Allocate/Free bus_space_handle */ int i386_bus_space_handle_alloc(bus_space_tag_t t, bus_addr_t bpa, bus_size_t size, bus_space_handle_t *bshp); void i386_bus_space_handle_free(bus_space_tag_t t, bus_space_handle_t bsh, size_t size); /* * int bus_space_map (bus_space_tag_t t, bus_addr_t addr, * bus_size_t size, int flag, bus_space_handle_t *bshp); * * Map a region of bus space. */ int i386_memio_map(bus_space_tag_t t, bus_addr_t addr, bus_size_t size, int flag, bus_space_handle_t *bshp); #define bus_space_map(t, a, s, f, hp) \ i386_memio_map((t), (a), (s), (f), (hp)) /* * int bus_space_unmap (bus_space_tag_t t, * bus_space_handle_t bsh, bus_size_t size); * * Unmap a region of bus space. */ void i386_memio_unmap(bus_space_tag_t t, bus_space_handle_t bsh, bus_size_t size); #define bus_space_unmap(t, h, s) \ i386_memio_unmap((t), (h), (s)) /* * int bus_space_map_load (bus_space_tag_t t, bus_space_handle_t bsh, * bus_size_t size, bus_space_iat_t iat, u_int flags); * * Load I/O address table of bus space. 
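 *
 * For example, a pc98 driver with scattered register banks might map
 * the region and then load its address table (a sketch only; the table
 * contents and the zero flags value are invented):
 *
 *	static bus_addr_t foo_iat[] = { 0x0, 0x2, 0x4, 0x6 };
 *
 *	error = bus_space_map(t, base, BUS_SPACE_IAT_SZ(foo_iat), 0, &bsh);
 *	if (error == 0)
 *		error = bus_space_map_load(t, bsh,
 *		    BUS_SPACE_IAT_SZ(foo_iat), foo_iat, 0);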
*/ int i386_memio_map_load(bus_space_tag_t t, bus_space_handle_t bsh, bus_size_t size, bus_space_iat_t iat, u_int flags); #define bus_space_map_load(t, h, s, iat, f) \ i386_memio_map_load((t), (h), (s), (iat), (f)) /* * int bus_space_subregion (bus_space_tag_t t, * bus_space_handle_t bsh, bus_size_t offset, bus_size_t size, * bus_space_handle_t *nbshp); * * Get a new handle for a subregion of an already-mapped area of bus space. */ int i386_memio_subregion(bus_space_tag_t t, bus_space_handle_t bsh, bus_size_t offset, bus_size_t size, bus_space_handle_t *nbshp); #define bus_space_subregion(t, h, o, s, nhp) \ i386_memio_subregion((t), (h), (o), (s), (nhp)) /* * int bus_space_free (bus_space_tag_t t, * bus_space_handle_t bsh, bus_size_t size); * * Free a region of bus space. */ void i386_memio_free(bus_space_tag_t t, bus_space_handle_t bsh, bus_size_t size); #define bus_space_free(t, h, s) \ i386_memio_free((t), (h), (s)) /* * int bus_space_compare (bus_space_tag_t t1, bus_space_handle_t bsh1, * bus_space_tag_t t2, bus_space_handle_t bsh2); * * Compare two resources. */ int i386_memio_compare(bus_space_tag_t t1, bus_space_handle_t bsh1, bus_space_tag_t t2, bus_space_handle_t bsh2); #define bus_space_compare(t1, h1, t2, h2) \ i386_memio_compare((t1), (h1), (t2), (h2)) /* * Access methods for bus resources and address space. */ #define _BUS_ACCESS_METHODS_PROTO(TYPE,BWN) \ static __inline TYPE bus_space_read_##BWN \ (bus_space_tag_t, bus_space_handle_t, bus_size_t offset); \ static __inline void bus_space_read_multi_##BWN \ (bus_space_tag_t, bus_space_handle_t, \ bus_size_t, TYPE *, size_t); \ static __inline void bus_space_read_region_##BWN \ (bus_space_tag_t, bus_space_handle_t, \ bus_size_t, TYPE *, size_t); \ static __inline void bus_space_write_##BWN \ (bus_space_tag_t, bus_space_handle_t, bus_size_t, TYPE); \ static __inline void bus_space_write_multi_##BWN \ (bus_space_tag_t, bus_space_handle_t, \ bus_size_t, const TYPE *, size_t); \ static __inline void bus_space_write_region_##BWN \ (bus_space_tag_t, bus_space_handle_t, \ bus_size_t, const TYPE *, size_t); \ static __inline void bus_space_set_multi_##BWN \ (bus_space_tag_t, bus_space_handle_t, bus_size_t, TYPE, size_t);\ static __inline void bus_space_set_region_##BWN \ (bus_space_tag_t, bus_space_handle_t, bus_size_t, TYPE, size_t);\ static __inline void bus_space_copy_region_##BWN \ (bus_space_tag_t, bus_space_handle_t, bus_size_t, \ bus_space_handle_t, bus_size_t, size_t); _BUS_ACCESS_METHODS_PROTO(u_int8_t,1) _BUS_ACCESS_METHODS_PROTO(u_int16_t,2) _BUS_ACCESS_METHODS_PROTO(u_int32_t,4) /* * read methods */ #define _BUS_SPACE_READ(TYPE,BWN) \ static __inline TYPE \ bus_space_read_##BWN (bus_space_tag_t tag, bus_space_handle_t bsh, \ bus_size_t offset) \ { \ register TYPE result; \ \ __asm __volatile("call *%2" \ :"=a" (result), \ "=d" (offset) \ :"o" (bsh->bsh_bam.bs_read_##BWN), \ "b" (bsh), \ "1" (offset) \ ); \ \ return result; \ } _BUS_SPACE_READ(u_int8_t,1) _BUS_SPACE_READ(u_int16_t,2) _BUS_SPACE_READ(u_int32_t,4) /* * write methods */ #define _BUS_SPACE_WRITE(TYPE,BWN) \ static __inline void \ bus_space_write_##BWN (bus_space_tag_t tag, bus_space_handle_t bsh, \ bus_size_t offset, TYPE val) \ { \ \ __asm __volatile("call *%1" \ :"=d" (offset) \ :"o" (bsh->bsh_bam.bs_write_##BWN), \ "a" (val), \ "b" (bsh), \ "0" (offset) \ ); \ } _BUS_SPACE_WRITE(u_int8_t,1) _BUS_SPACE_WRITE(u_int16_t,2) _BUS_SPACE_WRITE(u_int32_t,4) /* * multi read */ #define _BUS_SPACE_READ_MULTI(TYPE,BWN) \ static __inline void \ bus_space_read_multi_##BWN 
(bus_space_tag_t tag, bus_space_handle_t bsh, \ bus_size_t offset, TYPE *buf, size_t cnt) \ { \ \ __asm __volatile("call *%3" \ :"=c" (cnt), \ "=d" (offset), \ "=D" (buf) \ :"o" (bsh->bsh_bam.bs_read_multi_##BWN), \ "b" (bsh), \ "0" (cnt), \ "1" (offset), \ "2" (buf) \ :"memory"); \ } _BUS_SPACE_READ_MULTI(u_int8_t,1) _BUS_SPACE_READ_MULTI(u_int16_t,2) _BUS_SPACE_READ_MULTI(u_int32_t,4) /* * multi write */ #define _BUS_SPACE_WRITE_MULTI(TYPE,BWN) \ static __inline void \ bus_space_write_multi_##BWN (bus_space_tag_t tag, bus_space_handle_t bsh, \ bus_size_t offset, const TYPE *buf, size_t cnt) \ { \ \ __asm __volatile("call *%3" \ :"=c" (cnt), \ "=d" (offset), \ "=S" (buf) \ :"o" (bsh->bsh_bam.bs_write_multi_##BWN), \ "b" (bsh), \ "0" (cnt), \ "1" (offset), \ "2" (buf) \ ); \ } _BUS_SPACE_WRITE_MULTI(u_int8_t,1) _BUS_SPACE_WRITE_MULTI(u_int16_t,2) _BUS_SPACE_WRITE_MULTI(u_int32_t,4) /* * region read */ #define _BUS_SPACE_READ_REGION(TYPE,BWN) \ static __inline void \ bus_space_read_region_##BWN (bus_space_tag_t tag, bus_space_handle_t bsh, \ bus_size_t offset, TYPE *buf, size_t cnt) \ { \ \ __asm __volatile("call *%3" \ :"=c" (cnt), \ "=d" (offset), \ "=D" (buf) \ :"o" (bsh->bsh_bam.bs_read_region_##BWN), \ "b" (bsh), \ "0" (cnt), \ "1" (offset), \ "2" (buf) \ :"memory"); \ } _BUS_SPACE_READ_REGION(u_int8_t,1) _BUS_SPACE_READ_REGION(u_int16_t,2) _BUS_SPACE_READ_REGION(u_int32_t,4) /* * region write */ #define _BUS_SPACE_WRITE_REGION(TYPE,BWN) \ static __inline void \ bus_space_write_region_##BWN (bus_space_tag_t tag, bus_space_handle_t bsh, \ bus_size_t offset, const TYPE *buf, size_t cnt) \ { \ \ __asm __volatile("call *%3" \ :"=c" (cnt), \ "=d" (offset), \ "=S" (buf) \ :"o" (bsh->bsh_bam.bs_write_region_##BWN), \ "b" (bsh), \ "0" (cnt), \ "1" (offset), \ "2" (buf) \ ); \ } _BUS_SPACE_WRITE_REGION(u_int8_t,1) _BUS_SPACE_WRITE_REGION(u_int16_t,2) _BUS_SPACE_WRITE_REGION(u_int32_t,4) /* * multi set */ #define _BUS_SPACE_SET_MULTI(TYPE,BWN) \ static __inline void \ bus_space_set_multi_##BWN (bus_space_tag_t tag, bus_space_handle_t bsh, \ bus_size_t offset, TYPE val, size_t cnt) \ { \ \ __asm __volatile("call *%2" \ :"=c" (cnt), \ "=d" (offset) \ :"o" (bsh->bsh_bam.bs_set_multi_##BWN), \ "a" (val), \ "b" (bsh), \ "0" (cnt), \ "1" (offset) \ ); \ } _BUS_SPACE_SET_MULTI(u_int8_t,1) _BUS_SPACE_SET_MULTI(u_int16_t,2) _BUS_SPACE_SET_MULTI(u_int32_t,4) /* * region set */ #define _BUS_SPACE_SET_REGION(TYPE,BWN) \ static __inline void \ bus_space_set_region_##BWN (bus_space_tag_t tag, bus_space_handle_t bsh, \ bus_size_t offset, TYPE val, size_t cnt) \ { \ \ __asm __volatile("call *%2" \ :"=c" (cnt), \ "=d" (offset) \ :"o" (bsh->bsh_bam.bs_set_region_##BWN), \ "a" (val), \ "b" (bsh), \ "0" (cnt), \ "1" (offset) \ ); \ } _BUS_SPACE_SET_REGION(u_int8_t,1) _BUS_SPACE_SET_REGION(u_int16_t,2) _BUS_SPACE_SET_REGION(u_int32_t,4) /* * copy */ #define _BUS_SPACE_COPY_REGION(BWN) \ static __inline void \ bus_space_copy_region_##BWN (bus_space_tag_t tag, bus_space_handle_t sbsh, \ bus_size_t src, bus_space_handle_t dbsh, bus_size_t dst, size_t cnt) \ { \ \ if (dbsh->bsh_bam.bs_copy_region_1 != sbsh->bsh_bam.bs_copy_region_1) \ panic("bus_space_copy_region: funcs mismatch (ENOSUPPORT)");\ \ __asm __volatile("call *%3" \ :"=c" (cnt), \ "=S" (src), \ "=D" (dst) \ :"o" (dbsh->bsh_bam.bs_copy_region_##BWN), \ "a" (sbsh), \ "b" (dbsh), \ "0" (cnt), \ "1" (src), \ "2" (dst) \ ); \ } _BUS_SPACE_COPY_REGION(1) _BUS_SPACE_COPY_REGION(2) _BUS_SPACE_COPY_REGION(4) /* * Bus read/write barrier methods. 
* * void bus_space_barrier(bus_space_tag_t tag, bus_space_handle_t bsh, * bus_size_t offset, bus_size_t len, int flags); * * * Note that BUS_SPACE_BARRIER_WRITE doesn't do anything other than * prevent reordering by the compiler; all Intel x86 processors currently * retire operations outside the CPU in program order. */ #define BUS_SPACE_BARRIER_READ 0x01 /* force read barrier */ #define BUS_SPACE_BARRIER_WRITE 0x02 /* force write barrier */ static __inline void bus_space_barrier(bus_space_tag_t tag, bus_space_handle_t bsh, bus_size_t offset, bus_size_t len, int flags) { if (flags & BUS_SPACE_BARRIER_READ) __asm __volatile("lock; addl $0,0(%%esp)" : : : "memory"); else __compiler_membar(); } #ifdef BUS_SPACE_NO_LEGACY #undef inb #undef outb #define inb(a) compiler_error #define inw(a) compiler_error #define inl(a) compiler_error #define outb(a, b) compiler_error #define outw(a, b) compiler_error #define outl(a, b) compiler_error #endif #include /* * Stream accesses are the same as normal accesses on i386/pc98; there are no * supported bus systems with an endianness different from the host one. */ #define bus_space_read_stream_1(t, h, o) bus_space_read_1((t), (h), (o)) #define bus_space_read_stream_2(t, h, o) bus_space_read_2((t), (h), (o)) #define bus_space_read_stream_4(t, h, o) bus_space_read_4((t), (h), (o)) #define bus_space_read_multi_stream_1(t, h, o, a, c) \ bus_space_read_multi_1((t), (h), (o), (a), (c)) #define bus_space_read_multi_stream_2(t, h, o, a, c) \ bus_space_read_multi_2((t), (h), (o), (a), (c)) #define bus_space_read_multi_stream_4(t, h, o, a, c) \ bus_space_read_multi_4((t), (h), (o), (a), (c)) #define bus_space_write_stream_1(t, h, o, v) \ bus_space_write_1((t), (h), (o), (v)) #define bus_space_write_stream_2(t, h, o, v) \ bus_space_write_2((t), (h), (o), (v)) #define bus_space_write_stream_4(t, h, o, v) \ bus_space_write_4((t), (h), (o), (v)) #define bus_space_write_multi_stream_1(t, h, o, a, c) \ bus_space_write_multi_1((t), (h), (o), (a), (c)) #define bus_space_write_multi_stream_2(t, h, o, a, c) \ bus_space_write_multi_2((t), (h), (o), (a), (c)) #define bus_space_write_multi_stream_4(t, h, o, a, c) \ bus_space_write_multi_4((t), (h), (o), (a), (c)) #define bus_space_set_multi_stream_1(t, h, o, v, c) \ bus_space_set_multi_1((t), (h), (o), (v), (c)) #define bus_space_set_multi_stream_2(t, h, o, v, c) \ bus_space_set_multi_2((t), (h), (o), (v), (c)) #define bus_space_set_multi_stream_4(t, h, o, v, c) \ bus_space_set_multi_4((t), (h), (o), (v), (c)) #define bus_space_read_region_stream_1(t, h, o, a, c) \ bus_space_read_region_1((t), (h), (o), (a), (c)) #define bus_space_read_region_stream_2(t, h, o, a, c) \ bus_space_read_region_2((t), (h), (o), (a), (c)) #define bus_space_read_region_stream_4(t, h, o, a, c) \ bus_space_read_region_4((t), (h), (o), (a), (c)) #define bus_space_write_region_stream_1(t, h, o, a, c) \ bus_space_write_region_1((t), (h), (o), (a), (c)) #define bus_space_write_region_stream_2(t, h, o, a, c) \ bus_space_write_region_2((t), (h), (o), (a), (c)) #define bus_space_write_region_stream_4(t, h, o, a, c) \ bus_space_write_region_4((t), (h), (o), (a), (c)) #define bus_space_set_region_stream_1(t, h, o, v, c) \ bus_space_set_region_1((t), (h), (o), (v), (c)) #define bus_space_set_region_stream_2(t, h, o, v, c) \ bus_space_set_region_2((t), (h), (o), (v), (c)) #define bus_space_set_region_stream_4(t, h, o, v, c) \ bus_space_set_region_4((t), (h), (o), (v), (c)) #define bus_space_copy_region_stream_1(t, h1, o1, h2, o2, c) \ bus_space_copy_region_1((t),
(h1), (o1), (h2), (o2), (c)) #define bus_space_copy_region_stream_2(t, h1, o1, h2, o2, c) \ bus_space_copy_region_2((t), (h1), (o1), (h2), (o2), (c)) #define bus_space_copy_region_stream_4(t, h1, o1, h2, o2, c) \ bus_space_copy_region_4((t), (h1), (o1), (h2), (o2), (c)) + +#endif /* _KERNEL */ #endif /* _PC98_BUS_H_ */ Index: stable/10/sys/sys/bio.h =================================================================== --- stable/10/sys/sys/bio.h (revision 292347) +++ stable/10/sys/sys/bio.h (revision 292348) @@ -1,158 +1,159 @@ /*- * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)buf.h 8.9 (Berkeley) 3/30/95 * $FreeBSD$ */ #ifndef _SYS_BIO_H_ #define _SYS_BIO_H_ #include /* bio_cmd */ #define BIO_READ 0x01 /* Read I/O data */ #define BIO_WRITE 0x02 /* Write I/O data */ #define BIO_DELETE 0x04 /* TRIM or free blocks, i.e. mark as unused */ #define BIO_GETATTR 0x08 /* Get GEOM attributes of object */ #define BIO_FLUSH 0x10 /* Commit outstanding I/O now */ #define BIO_CMD0 0x20 /* Available for local hacks */ #define BIO_CMD1 0x40 /* Available for local hacks */ #define BIO_CMD2 0x80 /* Available for local hacks */ /* bio_flags */ #define BIO_ERROR 0x01 /* An error occurred processing this bio. */ #define BIO_DONE 0x02 /* This bio is finished. */ #define BIO_ONQUEUE 0x04 /* This bio is in a queue & not yet taken. */ /* * This bio must be executed after all previous bios in the queue have been * executed, and before any successive bios can be executed. */ #define BIO_ORDERED 0x08 #define BIO_UNMAPPED 0x10 #define BIO_TRANSIENT_MAPPING 0x20 +#define BIO_VLIST 0x40 #ifdef _KERNEL struct disk; struct bio; struct vm_map; /* Empty classifier tag, to prevent further classification. 
*/ #define BIO_NOTCLASSIFIED (void *)(~0UL) typedef void bio_task_t(void *); /* * The bio structure describes an I/O operation in the kernel. */ struct bio { uint8_t bio_cmd; /* I/O operation. */ uint8_t bio_flags; /* General flags. */ uint8_t bio_cflags; /* Private use by the consumer. */ uint8_t bio_pflags; /* Private use by the provider. */ struct cdev *bio_dev; /* Device to do I/O on. */ struct disk *bio_disk; /* Valid below geom_disk.c only */ off_t bio_offset; /* Offset into file. */ long bio_bcount; /* Valid bytes in buffer. */ caddr_t bio_data; /* Memory, superblocks, indirect etc. */ struct vm_page **bio_ma; /* Or unmapped. */ int bio_ma_offset; /* Offset in the first page of bio_ma. */ int bio_ma_n; /* Number of pages in bio_ma. */ int bio_error; /* Errno for BIO_ERROR. */ long bio_resid; /* Remaining I/O in bytes. */ void (*bio_done)(struct bio *); void *bio_driver1; /* Private use by the provider. */ void *bio_driver2; /* Private use by the provider. */ void *bio_caller1; /* Private use by the consumer. */ void *bio_caller2; /* Private use by the consumer. */ TAILQ_ENTRY(bio) bio_queue; /* Disksort queue. */ const char *bio_attribute; /* Attribute for BIO_[GS]ETATTR */ struct g_consumer *bio_from; /* GEOM linkage */ struct g_provider *bio_to; /* GEOM linkage */ off_t bio_length; /* Like bio_bcount */ off_t bio_completed; /* Inverse of bio_resid */ u_int bio_children; /* Number of spawned bios */ u_int bio_inbed; /* Children safely home by now */ struct bio *bio_parent; /* Pointer to parent */ struct bintime bio_t0; /* Time request started */ bio_task_t *bio_task; /* Task_queue handler */ void *bio_task_arg; /* Argument to above */ void *bio_classifier1; /* Classifier tag. */ void *bio_classifier2; /* Classifier tag. */ #ifdef DIAGNOSTIC void *_bio_caller1; void *_bio_caller2; uint8_t _bio_cflags; #endif /* XXX: these go away when bio chaining is introduced */ daddr_t bio_pblkno; /* physical block number */ }; struct uio; struct devstat; struct bio_queue_head { TAILQ_HEAD(bio_queue, bio) queue; off_t last_offset; struct bio *insert_point; }; extern struct vm_map *bio_transient_map; extern int bio_transient_maxcnt; void biodone(struct bio *bp); void biofinish(struct bio *bp, struct devstat *stat, int error); int biowait(struct bio *bp, const char *wchan); void bioq_disksort(struct bio_queue_head *ap, struct bio *bp); struct bio *bioq_first(struct bio_queue_head *head); struct bio *bioq_takefirst(struct bio_queue_head *head); void bioq_flush(struct bio_queue_head *head, struct devstat *stp, int error); void bioq_init(struct bio_queue_head *head); void bioq_insert_head(struct bio_queue_head *head, struct bio *bp); void bioq_insert_tail(struct bio_queue_head *head, struct bio *bp); void bioq_remove(struct bio_queue_head *head, struct bio *bp); void bio_taskqueue(struct bio *bp, bio_task_t *fund, void *arg); int physio(struct cdev *dev, struct uio *uio, int ioflag); #define physread physio #define physwrite physio #endif /* _KERNEL */ #endif /* !_SYS_BIO_H_ */ Index: stable/10/sys/sys/uio.h =================================================================== --- stable/10/sys/sys/uio.h (revision 292347) +++ stable/10/sys/sys/uio.h (revision 292348) @@ -1,121 +1,126 @@ /*- * Copyright (c) 1982, 1986, 1993, 1994 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)uio.h 8.5 (Berkeley) 2/22/94 * $FreeBSD$ */ #ifndef _SYS_UIO_H_ #define _SYS_UIO_H_ #include #include #include #ifndef _SSIZE_T_DECLARED typedef __ssize_t ssize_t; #define _SSIZE_T_DECLARED #endif #ifndef _OFF_T_DECLARED typedef __off_t off_t; #define _OFF_T_DECLARED #endif #if __BSD_VISIBLE enum uio_rw { UIO_READ, UIO_WRITE }; /* Segment flag values. */ enum uio_seg { UIO_USERSPACE, /* from user data space */ UIO_SYSSPACE, /* from system space */ UIO_NOCOPY /* don't copy, already in object */ }; #endif #ifdef _KERNEL struct uio { struct iovec *uio_iov; /* scatter/gather list */ int uio_iovcnt; /* length of scatter/gather list */ off_t uio_offset; /* offset in target object */ ssize_t uio_resid; /* remaining bytes to process */ enum uio_seg uio_segflg; /* address space */ enum uio_rw uio_rw; /* operation */ struct thread *uio_td; /* owner */ }; /* * Limits * * N.B.: UIO_MAXIOV must be no less than IOV_MAX from * which in turn must be no less than _XOPEN_IOV_MAX from . If * we ever make this tunable (probably pointless), then IOV_MAX should be * removed from and applications would be expected to use * sysconf(3) to find out the correct value, or else assume the worst * (_XOPEN_IOV_MAX). Perhaps UIO_MAXIOV should be simply defined as * IOV_MAX. 
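 *
 * (From userland, the portable way to discover the limit is sysconf(3),
 * e.g.:
 *
 *	long iov_max = sysconf(_SC_IOV_MAX);
 *
 *	if (iov_max == -1)
 *		iov_max = _XOPEN_IOV_MAX;
 *
 * with _XOPEN_IOV_MAX as the worst-case fallback.)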
*/ #define UIO_MAXIOV 1024 /* max 1K of iov's */ struct vm_object; struct vm_page; +struct bus_dma_segment; struct uio *cloneuio(struct uio *uiop); int copyinfrom(const void * __restrict src, void * __restrict dst, size_t len, int seg); int copyiniov(const struct iovec *iovp, u_int iovcnt, struct iovec **iov, int error); int copyinstrfrom(const void * __restrict src, void * __restrict dst, size_t len, size_t * __restrict copied, int seg); int copyinuio(const struct iovec *iovp, u_int iovcnt, struct uio **uiop); int copyout_map(struct thread *td, vm_offset_t *addr, size_t sz); int copyout_unmap(struct thread *td, vm_offset_t addr, size_t sz); int physcopyin(void *src, vm_paddr_t dst, size_t len); int physcopyout(vm_paddr_t src, void *dst, size_t len); +int physcopyin_vlist(struct bus_dma_segment *src, off_t offset, + vm_paddr_t dst, size_t len); +int physcopyout_vlist(vm_paddr_t src, struct bus_dma_segment *dst, + off_t offset, size_t len); int uiomove(void *cp, int n, struct uio *uio); int uiomove_frombuf(void *buf, int buflen, struct uio *uio); int uiomove_fromphys(struct vm_page *ma[], vm_offset_t offset, int n, struct uio *uio); int uiomove_nofault(void *cp, int n, struct uio *uio); int uiomove_object(struct vm_object *obj, off_t obj_size, struct uio *uio); #else /* !_KERNEL */ __BEGIN_DECLS ssize_t readv(int, const struct iovec *, int); ssize_t writev(int, const struct iovec *, int); #if __BSD_VISIBLE ssize_t preadv(int, const struct iovec *, int, off_t); ssize_t pwritev(int, const struct iovec *, int, off_t); #endif __END_DECLS #endif /* _KERNEL */ #endif /* !_SYS_UIO_H_ */ Index: stable/10/usr.sbin/Makefile =================================================================== --- stable/10/usr.sbin/Makefile (revision 292347) +++ stable/10/usr.sbin/Makefile (revision 292348) @@ -1,363 +1,364 @@ # From: @(#)Makefile 5.20 (Berkeley) 6/12/93 # $FreeBSD$ .include SUBDIR= adduser \ arp \ binmiscctl \ bsdconfig \ + camdd \ cdcontrol \ chkgrp \ chown \ chroot \ ckdist \ clear_locks \ crashinfo \ cron \ ctladm \ ctld \ daemon \ dconschat \ devinfo \ digictl \ diskinfo \ dumpcis \ extattr \ extattrctl \ fifolog \ fstyp \ fwcontrol \ getfmac \ getpmac \ gstat \ i2c \ ifmcstat \ iostat \ kldxref \ mailwrapper \ makefs \ memcontrol \ mergemaster \ mfiutil \ mixer \ mlxcontrol \ mountd \ mptutil \ mtest \ ${_mtree} \ newsyslog \ nfscbd \ nfsd \ nfsdumpstate \ nfsrevoke \ nfsuserd \ nmtree \ nologin \ ${_pc_sysinstall} \ pciconf \ periodic \ powerd \ procctl \ pstat \ pw \ pwd_mkdb \ quot \ rarpd \ rmt \ rpcbind \ rpc.lockd \ rpc.statd \ rpc.umntall \ rtprio \ service \ services_mkdb \ sesutil \ setfib \ setfmac \ setpmac \ smbmsg \ snapinfo \ spray \ syslogd \ sysrc \ tcpdrop \ tcpdump \ traceroute \ trpt \ tzsetup \ uefisign \ ugidfw \ vipw \ wake \ watch \ watchdogd \ zic # NB: keep these sorted by MK_* knobs .if ${MK_ACCT} != "no" SUBDIR+= accton SUBDIR+= sa .endif .if ${MK_AMD} != "no" SUBDIR+= amd .endif .if ${MK_AUDIT} != "no" SUBDIR+= audit SUBDIR+= auditd .if ${MK_OPENSSL} != "no" SUBDIR+= auditdistd .endif SUBDIR+= auditreduce SUBDIR+= praudit .endif .if ${MK_AUTHPF} != "no" SUBDIR+= authpf .endif .if ${MK_AUTOFS} != "no" SUBDIR+= autofs .endif .if ${MK_BLUETOOTH} != "no" SUBDIR+= bluetooth .endif .if ${MK_BOOTPARAMD} != "no" SUBDIR+= bootparamd .endif .if ${MK_BSDINSTALL} != "no" SUBDIR+= bsdinstall .endif .if ${MK_BSNMP} != "no" SUBDIR+= bsnmpd .endif .if ${MK_CTM} != "no" SUBDIR+= ctm .endif .if ${MK_FLOPPY} != "no" SUBDIR+= fdcontrol SUBDIR+= fdformat SUBDIR+= fdread SUBDIR+= fdwrite 
.endif .if ${MK_FMTREE} != "no" SUBDIR+= mtree .endif .if ${MK_FREEBSD_UPDATE} != "no" SUBDIR+= freebsd-update .endif .if ${MK_GSSAPI} != "no" SUBDIR+= gssd .endif .if ${MK_GPIO} != "no" SUBDIR+= gpioctl .endif .if ${MK_INET6} != "no" SUBDIR+= faithd SUBDIR+= ip6addrctl SUBDIR+= mld6query SUBDIR+= ndp SUBDIR+= rip6query SUBDIR+= route6d SUBDIR+= rrenumd SUBDIR+= rtadvctl SUBDIR+= rtadvd SUBDIR+= rtsold SUBDIR+= traceroute6 .endif .if ${MK_INETD} != "no" SUBDIR+= inetd .endif .if ${MK_IPFW} != "no" SUBDIR+= ipfwpcap .endif .if ${MK_IPX} != "no" SUBDIR+= IPXrouted .endif .if ${MK_ISCSI} != "no" SUBDIR+= iscsid .endif .if ${MK_JAIL} != "no" SUBDIR+= jail SUBDIR+= jexec SUBDIR+= jls .endif # XXX MK_SYSCONS .if ${MK_LEGACY_CONSOLE} != "no" SUBDIR+= kbdcontrol SUBDIR+= kbdmap SUBDIR+= moused SUBDIR+= vidcontrol .endif .if ${MK_LIBTHR} != "no" || ${MK_LIBPTHREAD} != "no" .if ${MK_PPP} != "no" SUBDIR+= pppctl .endif .if ${MK_NS_CACHING} != "no" SUBDIR+= nscd .endif .endif .if ${MK_LPR} != "no" SUBDIR+= lpr .endif .if ${MK_MAN_UTILS} != "no" SUBDIR+= manctl .endif .if ${MK_NAND} != "no" SUBDIR+= nandsim SUBDIR+= nandtool .endif .if ${MK_NETGRAPH} != "no" SUBDIR+= flowctl SUBDIR+= lmcconfig SUBDIR+= ngctl SUBDIR+= nghook .endif .if ${MK_NIS} != "no" SUBDIR+= rpc.yppasswdd SUBDIR+= rpc.ypupdated SUBDIR+= rpc.ypxfrd SUBDIR+= ypbind SUBDIR+= yp_mkdb SUBDIR+= yppoll SUBDIR+= yppush SUBDIR+= ypserv SUBDIR+= ypset .endif .if ${MK_NTP} != "no" SUBDIR+= ntp .endif .if ${MK_OPENSSL} != "no" SUBDIR+= keyserv .endif .if ${MK_PC_SYSINSTALL} != "no" _pc_sysinstall= pc-sysinstall .endif .if ${MK_PF} != "no" SUBDIR+= ftp-proxy .endif .if ${MK_PKGBOOTSTRAP} != "no" SUBDIR+= pkg .endif .if ${MK_PKGTOOLS} != "no" SUBDIR+= pkg_install .endif # XXX MK_TOOLCHAIN? .if ${MK_PMC} != "no" SUBDIR+= pmcannotate SUBDIR+= pmccontrol SUBDIR+= pmcstat SUBDIR+= pmcstudy .endif .if ${MK_PORTSNAP} != "no" SUBDIR+= portsnap .endif .if ${MK_PPP} != "no" SUBDIR+= ppp .endif .if ${MK_QUOTAS} != "no" SUBDIR+= edquota SUBDIR+= quotaon SUBDIR+= repquota .endif .if ${MK_RCMDS} != "no" SUBDIR+= rwhod .endif .if ${MK_RCS} != "no" SUBDIR+= etcupdate .endif .if ${MK_SENDMAIL} != "no" SUBDIR+= editmap SUBDIR+= mailstats SUBDIR+= makemap SUBDIR+= praliases SUBDIR+= sendmail .endif .if ${MK_TCP_WRAPPERS} != "no" SUBDIR+= tcpdchk SUBDIR+= tcpdmatch .endif .if ${MK_TESTS} != "no" SUBDIR+= tests .endif .if ${MK_TIMED} != "no" SUBDIR+= timed .endif .if ${MK_TOOLCHAIN} != "no" SUBDIR+= config SUBDIR+= crunch .endif .if ${MK_UNBOUND} != "no" SUBDIR+= unbound .endif .if ${MK_USB} != "no" SUBDIR+= uathload SUBDIR+= uhsoctl SUBDIR+= usbconfig SUBDIR+= usbdump .endif .if ${MK_UTMPX} != "no" SUBDIR+= ac SUBDIR+= lastlogin SUBDIR+= utx .endif .if ${MK_WIRELESS} != "no" SUBDIR+= ancontrol SUBDIR+= wlandebug SUBDIR+= wpa .endif .include SUBDIR:= ${SUBDIR:O} SUBDIR_PARALLEL= .include Index: stable/10/usr.sbin/camdd/camdd.c =================================================================== --- stable/10/usr.sbin/camdd/camdd.c (nonexistent) +++ stable/10/usr.sbin/camdd/camdd.c (revision 292348) @@ -0,0 +1,3428 @@ +/*- + * Copyright (c) 1997-2007 Kenneth D. Merry + * Copyright (c) 2013, 2014, 2015 Spectra Logic Corporation + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions, and the following disclaimer, + * without modification. 
+ * 2. Redistributions in binary form must reproduce at minimum a disclaimer + * substantially similar to the "NO WARRANTY" disclaimer below + * ("Disclaimer") and any redistribution must be conditioned upon + * including a substantially similar Disclaimer requirement for further + * binary redistribution. + * + * NO WARRANTY + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING + * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGES. + * + * Authors: Ken Merry (Spectra Logic Corporation) + */ + +/* + * This is eventually intended to be: + * - A basic data transfer/copy utility + * - A simple benchmark utility + * - An example of how to use the asynchronous pass(4) driver interface. + */ +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +typedef enum { + CAMDD_CMD_NONE = 0x00000000, + CAMDD_CMD_HELP = 0x00000001, + CAMDD_CMD_WRITE = 0x00000002, + CAMDD_CMD_READ = 0x00000003 +} camdd_cmdmask; + +typedef enum { + CAMDD_ARG_NONE = 0x00000000, + CAMDD_ARG_VERBOSE = 0x00000001, + CAMDD_ARG_DEVICE = 0x00000002, + CAMDD_ARG_BUS = 0x00000004, + CAMDD_ARG_TARGET = 0x00000008, + CAMDD_ARG_LUN = 0x00000010, + CAMDD_ARG_UNIT = 0x00000020, + CAMDD_ARG_TIMEOUT = 0x00000040, + CAMDD_ARG_ERR_RECOVER = 0x00000080, + CAMDD_ARG_RETRIES = 0x00000100 +} camdd_argmask; + +typedef enum { + CAMDD_DEV_NONE = 0x00, + CAMDD_DEV_PASS = 0x01, + CAMDD_DEV_FILE = 0x02 +} camdd_dev_type; + +struct camdd_io_opts { + camdd_dev_type dev_type; + char *dev_name; + uint64_t blocksize; + uint64_t queue_depth; + uint64_t offset; + int min_cmd_size; + int write_dev; + uint64_t debug; +}; + +typedef enum { + CAMDD_BUF_NONE, + CAMDD_BUF_DATA, + CAMDD_BUF_INDIRECT +} camdd_buf_type; + +struct camdd_buf_indirect { + /* + * Pointer to the source buffer. + */ + struct camdd_buf *src_buf; + + /* + * Offset into the source buffer, in bytes. + */ + uint64_t offset; + /* + * Pointer to the starting point in the source buffer. + */ + uint8_t *start_ptr; + + /* + * Length of this chunk in bytes. + */ + size_t len; +}; + +struct camdd_buf_data { + /* + * Buffer allocated when we allocate this camdd_buf. This should + * be the size of the blocksize for this device. + */ + uint8_t *buf; + + /* + * The amount of backing store allocated in buf. Generally this + * will be the blocksize of the device. + */ + uint32_t alloc_len; + + /* + * The amount of data that was put into the buffer (on reads) or + * the amount of data we have put onto the src_list so far (on + * writes). 
+ */ + uint32_t fill_len; + + /* + * The amount of data that was not transferred. + */ + uint32_t resid; + + /* + * Starting byte offset on the reader. + */ + uint64_t src_start_offset; + + /* + * CCB used for pass(4) device targets. + */ + union ccb ccb; + + /* + * Number of scatter/gather segments. + */ + int sg_count; + + /* + * Set if we had to tack on an extra buffer to round the transfer + * up to a sector size. + */ + int extra_buf; + + /* + * Scatter/gather list used generally when we're the writer for a + * pass(4) device. + */ + bus_dma_segment_t *segs; + + /* + * Scatter/gather list used generally when we're the writer for a + * file or block device; + */ + struct iovec *iovec; +}; + +union camdd_buf_types { + struct camdd_buf_indirect indirect; + struct camdd_buf_data data; +}; + +typedef enum { + CAMDD_STATUS_NONE, + CAMDD_STATUS_OK, + CAMDD_STATUS_SHORT_IO, + CAMDD_STATUS_EOF, + CAMDD_STATUS_ERROR +} camdd_buf_status; + +struct camdd_buf { + camdd_buf_type buf_type; + union camdd_buf_types buf_type_spec; + + camdd_buf_status status; + + uint64_t lba; + size_t len; + + /* + * A reference count of how many indirect buffers point to this + * buffer. + */ + int refcount; + + /* + * A link back to our parent device. + */ + struct camdd_dev *dev; + STAILQ_ENTRY(camdd_buf) links; + STAILQ_ENTRY(camdd_buf) work_links; + + /* + * A count of the buffers on the src_list. + */ + int src_count; + + /* + * List of buffers from our partner thread that are the components + * of this buffer for the I/O. Uses src_links. + */ + STAILQ_HEAD(,camdd_buf) src_list; + STAILQ_ENTRY(camdd_buf) src_links; +}; + +#define NUM_DEV_TYPES 2 + +struct camdd_dev_pass { + int scsi_dev_type; + struct cam_device *dev; + uint64_t max_sector; + uint32_t block_len; + uint32_t cpi_maxio; +}; + +typedef enum { + CAMDD_FILE_NONE, + CAMDD_FILE_REG, + CAMDD_FILE_STD, + CAMDD_FILE_PIPE, + CAMDD_FILE_DISK, + CAMDD_FILE_TAPE, + CAMDD_FILE_TTY, + CAMDD_FILE_MEM +} camdd_file_type; + +typedef enum { + CAMDD_FF_NONE = 0x00, + CAMDD_FF_CAN_SEEK = 0x01 +} camdd_file_flags; + +struct camdd_dev_file { + int fd; + struct stat sb; + char filename[MAXPATHLEN + 1]; + camdd_file_type file_type; + camdd_file_flags file_flags; + uint8_t *tmp_buf; +}; + +struct camdd_dev_block { + int fd; + uint64_t size_bytes; + uint32_t block_len; +}; + +union camdd_dev_spec { + struct camdd_dev_pass pass; + struct camdd_dev_file file; + struct camdd_dev_block block; +}; + +typedef enum { + CAMDD_DEV_FLAG_NONE = 0x00, + CAMDD_DEV_FLAG_EOF = 0x01, + CAMDD_DEV_FLAG_PEER_EOF = 0x02, + CAMDD_DEV_FLAG_ACTIVE = 0x04, + CAMDD_DEV_FLAG_EOF_SENT = 0x08, + CAMDD_DEV_FLAG_EOF_QUEUED = 0x10 +} camdd_dev_flags; + +struct camdd_dev { + camdd_dev_type dev_type; + union camdd_dev_spec dev_spec; + camdd_dev_flags flags; + char device_name[MAXPATHLEN+1]; + uint32_t blocksize; + uint32_t sector_size; + uint64_t max_sector; + uint64_t sector_io_limit; + int min_cmd_size; + int write_dev; + int retry_count; + int io_timeout; + int debug; + uint64_t start_offset_bytes; + uint64_t next_io_pos_bytes; + uint64_t next_peer_pos_bytes; + uint64_t next_completion_pos_bytes; + uint64_t peer_bytes_queued; + uint64_t bytes_transferred; + uint32_t target_queue_depth; + uint32_t cur_active_io; + uint8_t *extra_buf; + uint32_t extra_buf_len; + struct camdd_dev *peer_dev; + pthread_mutex_t mutex; + pthread_cond_t cond; + int kq; + + int (*run)(struct camdd_dev *dev); + int (*fetch)(struct camdd_dev *dev); + + /* + * Buffers that are available for I/O. Uses links. 
+	 */
+	STAILQ_HEAD(,camdd_buf) free_queue;
+
+	/*
+	 * Free indirect buffers.  These are used for breaking a large
+	 * buffer into multiple pieces.
+	 */
+	STAILQ_HEAD(,camdd_buf) free_indirect_queue;
+
+	/*
+	 * Buffers that have been queued to the kernel.  Uses links.
+	 */
+	STAILQ_HEAD(,camdd_buf) active_queue;
+
+	/*
+	 * Will generally contain one of our buffers that is waiting for enough
+	 * I/O from our partner thread to be able to execute.  This will
+	 * generally happen when our per-I/O-size is larger than the
+	 * partner thread's per-I/O-size.  Uses links.
+	 */
+	STAILQ_HEAD(,camdd_buf) pending_queue;
+
+	/*
+	 * Number of buffers on the pending queue.
+	 */
+	int num_pending_queue;
+
+	/*
+	 * Buffers that are filled and ready to execute.  This is used when
+	 * our partner (reader) thread sends us blocks that are larger than
+	 * our blocksize, and so we have to split them into multiple pieces.
+	 */
+	STAILQ_HEAD(,camdd_buf) run_queue;
+
+	/*
+	 * Number of buffers on the run queue.
+	 */
+	int num_run_queue;
+
+	STAILQ_HEAD(,camdd_buf) reorder_queue;
+
+	int num_reorder_queue;
+
+	/*
+	 * Buffers that have been queued to us by our partner thread
+	 * (generally the reader thread) to be written out.  Uses
+	 * work_links.
+	 */
+	STAILQ_HEAD(,camdd_buf) work_queue;
+
+	/*
+	 * Buffers that have been completed by our partner thread.  Uses
+	 * work_links.
+	 */
+	STAILQ_HEAD(,camdd_buf) peer_done_queue;
+
+	/*
+	 * Number of buffers on the peer done queue.
+	 */
+	uint32_t num_peer_done_queue;
+
+	/*
+	 * A list of buffers that we have queued to our peer thread.  Uses
+	 * links.
+	 */
+	STAILQ_HEAD(,camdd_buf) peer_work_queue;
+
+	/*
+	 * Number of buffers on the peer work queue.
+	 */
+	uint32_t num_peer_work_queue;
+};
+
+static sem_t camdd_sem;
+static int need_exit = 0;
+static int error_exit = 0;
+static int need_status = 0;
+
+#ifndef min
+#define	min(a, b) (((a) < (b)) ? (a) : (b))
+#endif
+
+/*
+ * XXX KDM private copy of timespecsub().  This is normally defined in
+ * sys/time.h, but is only enabled in the kernel.  If that definition is
+ * enabled in userland, it breaks the build of libnetbsd.
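+ *
+ * The semantics match the kernel macro: the difference is computed in
+ * place into the first argument.  A typical use, as a sketch:
+ *
+ *	clock_gettime(CLOCK_MONOTONIC, &end_time);
+ *	timespecsub(&end_time, &start_time);
+ *	(end_time now holds the elapsed time)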
+ */
+#ifndef timespecsub
+#define	timespecsub(vvp, uvp)						\
+	do {								\
+		(vvp)->tv_sec -= (uvp)->tv_sec;				\
+		(vvp)->tv_nsec -= (uvp)->tv_nsec;			\
+		if ((vvp)->tv_nsec < 0) {				\
+			(vvp)->tv_sec--;				\
+			(vvp)->tv_nsec += 1000000000;			\
+		}							\
+	} while (0)
+#endif
+
+
+/* Generically useful offsets into the peripheral private area */
+#define	ppriv_ptr0 periph_priv.entries[0].ptr
+#define	ppriv_ptr1 periph_priv.entries[1].ptr
+#define	ppriv_field0 periph_priv.entries[0].field
+#define	ppriv_field1 periph_priv.entries[1].field
+
+#define	ccb_buf	ppriv_ptr0
+
+#define	CAMDD_FILE_DEFAULT_BLOCK	524288
+#define	CAMDD_FILE_DEFAULT_DEPTH	1
+#define	CAMDD_PASS_MAX_BLOCK		1048576
+#define	CAMDD_PASS_DEFAULT_DEPTH	6
+#define	CAMDD_PASS_RW_TIMEOUT		(60 * 1000)
+
+static int parse_btl(char *tstr, int *bus, int *target, int *lun,
+		     camdd_argmask *arglst);
+void camdd_free_dev(struct camdd_dev *dev);
+struct camdd_dev *camdd_alloc_dev(camdd_dev_type dev_type,
+				  struct kevent *new_ke, int num_ke,
+				  int retry_count, int timeout);
+static struct camdd_buf *camdd_alloc_buf(struct camdd_dev *dev,
+					 camdd_buf_type buf_type);
+void camdd_release_buf(struct camdd_buf *buf);
+struct camdd_buf *camdd_get_buf(struct camdd_dev *dev, camdd_buf_type buf_type);
+int camdd_buf_sg_create(struct camdd_buf *buf, int iovec,
+			uint32_t sector_size, uint32_t *num_sectors_used,
+			int *double_buf_needed);
+uint32_t camdd_buf_get_len(struct camdd_buf *buf);
+void camdd_buf_add_child(struct camdd_buf *buf, struct camdd_buf *child_buf);
+int camdd_probe_tape(int fd, char *filename, uint64_t *max_iosize,
+		     uint64_t *max_blk, uint64_t *min_blk, uint64_t *blk_gran);
+struct camdd_dev *camdd_probe_file(int fd, struct camdd_io_opts *io_opts,
+				   int retry_count, int timeout);
+struct camdd_dev *camdd_probe_pass(struct cam_device *cam_dev,
+				   struct camdd_io_opts *io_opts,
+				   camdd_argmask arglist, int probe_retry_count,
+				   int probe_timeout, int io_retry_count,
+				   int io_timeout);
+void *camdd_file_worker(void *arg);
+camdd_buf_status camdd_ccb_status(union ccb *ccb);
+int camdd_queue_peer_buf(struct camdd_dev *dev, struct camdd_buf *buf);
+int camdd_complete_peer_buf(struct camdd_dev *dev, struct camdd_buf *peer_buf);
+void camdd_peer_done(struct camdd_buf *buf);
+void camdd_complete_buf(struct camdd_dev *dev, struct camdd_buf *buf,
+			int *error_count);
+int camdd_pass_fetch(struct camdd_dev *dev);
+int camdd_file_run(struct camdd_dev *dev);
+int camdd_pass_run(struct camdd_dev *dev);
+int camdd_get_next_lba_len(struct camdd_dev *dev, uint64_t *lba, ssize_t *len);
+int camdd_queue(struct camdd_dev *dev, struct camdd_buf *read_buf);
+void camdd_get_depth(struct camdd_dev *dev, uint32_t *our_depth,
+		     uint32_t *peer_depth, uint32_t *our_bytes,
+		     uint32_t *peer_bytes);
+void *camdd_worker(void *arg);
+void camdd_sig_handler(int sig);
+void camdd_print_status(struct camdd_dev *camdd_dev,
+			struct camdd_dev *other_dev,
+			struct timespec *start_time);
+int camdd_rw(struct camdd_io_opts *io_opts, int num_io_opts,
+	     uint64_t max_io, int retry_count, int timeout);
+int camdd_parse_io_opts(char *args, int is_write,
+			struct camdd_io_opts *io_opts);
+void usage(void);
+
+/*
+ * Parse out a bus, or a bus, target and lun in the following
+ * format:
+ * bus
+ * bus:target
+ * bus:target:lun
+ *
+ * Returns the number of parsed components, or 0.
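+ *
+ * For example, parse_btl("1:2:0", &bus, &target, &lun, &args) sets
+ * bus = 1, target = 2 and lun = 0, ORs CAMDD_ARG_BUS, CAMDD_ARG_TARGET
+ * and CAMDD_ARG_LUN into the argument mask, and returns 3.  A bare
+ * "1" sets only the bus and CAMDD_ARG_BUS, and returns 1.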
+ */ +static int +parse_btl(char *tstr, int *bus, int *target, int *lun, camdd_argmask *arglst) +{ + char *tmpstr; + int convs = 0; + + while (isspace(*tstr) && (*tstr != '\0')) + tstr++; + + tmpstr = (char *)strtok(tstr, ":"); + if ((tmpstr != NULL) && (*tmpstr != '\0')) { + *bus = strtol(tmpstr, NULL, 0); + *arglst |= CAMDD_ARG_BUS; + convs++; + tmpstr = (char *)strtok(NULL, ":"); + if ((tmpstr != NULL) && (*tmpstr != '\0')) { + *target = strtol(tmpstr, NULL, 0); + *arglst |= CAMDD_ARG_TARGET; + convs++; + tmpstr = (char *)strtok(NULL, ":"); + if ((tmpstr != NULL) && (*tmpstr != '\0')) { + *lun = strtol(tmpstr, NULL, 0); + *arglst |= CAMDD_ARG_LUN; + convs++; + } + } + } + + return convs; +} + +/* + * XXX KDM clean up and free all of the buffers on the queue! + */ +void +camdd_free_dev(struct camdd_dev *dev) +{ + if (dev == NULL) + return; + + switch (dev->dev_type) { + case CAMDD_DEV_FILE: { + struct camdd_dev_file *file_dev = &dev->dev_spec.file; + + if (file_dev->fd != -1) + close(file_dev->fd); + free(file_dev->tmp_buf); + break; + } + case CAMDD_DEV_PASS: { + struct camdd_dev_pass *pass_dev = &dev->dev_spec.pass; + + if (pass_dev->dev != NULL) + cam_close_device(pass_dev->dev); + break; + } + default: + break; + } + + free(dev); +} + +struct camdd_dev * +camdd_alloc_dev(camdd_dev_type dev_type, struct kevent *new_ke, int num_ke, + int retry_count, int timeout) +{ + struct camdd_dev *dev = NULL; + struct kevent *ke; + size_t ke_size; + int retval = 0; + + dev = malloc(sizeof(*dev)); + if (dev == NULL) { + warn("%s: unable to malloc %zu bytes", __func__, sizeof(*dev)); + goto bailout; + } + + bzero(dev, sizeof(*dev)); + + dev->dev_type = dev_type; + dev->io_timeout = timeout; + dev->retry_count = retry_count; + STAILQ_INIT(&dev->free_queue); + STAILQ_INIT(&dev->free_indirect_queue); + STAILQ_INIT(&dev->active_queue); + STAILQ_INIT(&dev->pending_queue); + STAILQ_INIT(&dev->run_queue); + STAILQ_INIT(&dev->reorder_queue); + STAILQ_INIT(&dev->work_queue); + STAILQ_INIT(&dev->peer_done_queue); + STAILQ_INIT(&dev->peer_work_queue); + retval = pthread_mutex_init(&dev->mutex, NULL); + if (retval != 0) { + warnc(retval, "%s: failed to initialize mutex", __func__); + goto bailout; + } + + retval = pthread_cond_init(&dev->cond, NULL); + if (retval != 0) { + warnc(retval, "%s: failed to initialize condition variable", + __func__); + goto bailout; + } + + dev->kq = kqueue(); + if (dev->kq == -1) { + warn("%s: Unable to create kqueue", __func__); + goto bailout; + } + + ke_size = sizeof(struct kevent) * (num_ke + 4); + ke = malloc(ke_size); + if (ke == NULL) { + warn("%s: unable to malloc %zu bytes", __func__, ke_size); + goto bailout; + } + bzero(ke, ke_size); + if (num_ke > 0) + bcopy(new_ke, ke, num_ke * sizeof(struct kevent)); + + EV_SET(&ke[num_ke++], (uintptr_t)&dev->work_queue, EVFILT_USER, + EV_ADD|EV_ENABLE|EV_CLEAR, 0,0, 0); + EV_SET(&ke[num_ke++], (uintptr_t)&dev->peer_done_queue, EVFILT_USER, + EV_ADD|EV_ENABLE|EV_CLEAR, 0,0, 0); + EV_SET(&ke[num_ke++], SIGINFO, EVFILT_SIGNAL, EV_ADD|EV_ENABLE, 0,0,0); + EV_SET(&ke[num_ke++], SIGINT, EVFILT_SIGNAL, EV_ADD|EV_ENABLE, 0,0,0); + + retval = kevent(dev->kq, ke, num_ke, NULL, 0, NULL); + if (retval == -1) { + warn("%s: Unable to register kevents", __func__); + goto bailout; + } + + + return (dev); + +bailout: + free(dev); + + return (NULL); +} + +static struct camdd_buf * +camdd_alloc_buf(struct camdd_dev *dev, camdd_buf_type buf_type) +{ + struct camdd_buf *buf = NULL; + uint8_t *data_ptr = NULL; + + /* + * We only need to allocate data 
space for data buffers. + */ + switch (buf_type) { + case CAMDD_BUF_DATA: + data_ptr = malloc(dev->blocksize); + if (data_ptr == NULL) { + warn("unable to allocate %u bytes", dev->blocksize); + goto bailout_error; + } + break; + default: + break; + } + + buf = malloc(sizeof(*buf)); + if (buf == NULL) { + warn("unable to allocate %zu bytes", sizeof(*buf)); + goto bailout_error; + } + + bzero(buf, sizeof(*buf)); + buf->buf_type = buf_type; + buf->dev = dev; + switch (buf_type) { + case CAMDD_BUF_DATA: { + struct camdd_buf_data *data; + + data = &buf->buf_type_spec.data; + + data->alloc_len = dev->blocksize; + data->buf = data_ptr; + break; + } + case CAMDD_BUF_INDIRECT: + break; + default: + break; + } + STAILQ_INIT(&buf->src_list); + + return (buf); + +bailout_error: + if (data_ptr != NULL) + free(data_ptr); + + if (buf != NULL) + free(buf); + + return (NULL); +} + +void +camdd_release_buf(struct camdd_buf *buf) +{ + struct camdd_dev *dev; + + dev = buf->dev; + + switch (buf->buf_type) { + case CAMDD_BUF_DATA: { + struct camdd_buf_data *data; + + data = &buf->buf_type_spec.data; + + if (data->segs != NULL) { + if (data->extra_buf != 0) { + void *extra_buf; + + extra_buf = (void *) + data->segs[data->sg_count - 1].ds_addr; + free(extra_buf); + data->extra_buf = 0; + } + free(data->segs); + data->segs = NULL; + data->sg_count = 0; + } else if (data->iovec != NULL) { + if (data->extra_buf != 0) { + free(data->iovec[data->sg_count - 1].iov_base); + data->extra_buf = 0; + } + free(data->iovec); + data->iovec = NULL; + data->sg_count = 0; + } + STAILQ_INSERT_TAIL(&dev->free_queue, buf, links); + break; + } + case CAMDD_BUF_INDIRECT: + STAILQ_INSERT_TAIL(&dev->free_indirect_queue, buf, links); + break; + default: + err(1, "%s: Invalid buffer type %d for released buffer", + __func__, buf->buf_type); + break; + } +} + +struct camdd_buf * +camdd_get_buf(struct camdd_dev *dev, camdd_buf_type buf_type) +{ + struct camdd_buf *buf = NULL; + + switch (buf_type) { + case CAMDD_BUF_DATA: + buf = STAILQ_FIRST(&dev->free_queue); + if (buf != NULL) { + struct camdd_buf_data *data; + uint8_t *data_ptr; + uint32_t alloc_len; + + STAILQ_REMOVE_HEAD(&dev->free_queue, links); + data = &buf->buf_type_spec.data; + data_ptr = data->buf; + alloc_len = data->alloc_len; + bzero(buf, sizeof(*buf)); + data->buf = data_ptr; + data->alloc_len = alloc_len; + } + break; + case CAMDD_BUF_INDIRECT: + buf = STAILQ_FIRST(&dev->free_indirect_queue); + if (buf != NULL) { + STAILQ_REMOVE_HEAD(&dev->free_indirect_queue, links); + + bzero(buf, sizeof(*buf)); + } + break; + default: + warnx("Unknown buffer type %d requested", buf_type); + break; + } + + + if (buf == NULL) + return (camdd_alloc_buf(dev, buf_type)); + else { + STAILQ_INIT(&buf->src_list); + buf->dev = dev; + buf->buf_type = buf_type; + + return (buf); + } +} + +int +camdd_buf_sg_create(struct camdd_buf *buf, int iovec, uint32_t sector_size, + uint32_t *num_sectors_used, int *double_buf_needed) +{ + struct camdd_buf *tmp_buf; + struct camdd_buf_data *data; + uint8_t *extra_buf = NULL; + size_t extra_buf_len = 0; + int i, retval = 0; + + data = &buf->buf_type_spec.data; + + data->sg_count = buf->src_count; + /* + * Compose a scatter/gather list from all of the buffers in the list. + * If the length of the buffer isn't a multiple of the sector size, + * we'll have to add an extra buffer. This should only happen + * at the end of a transfer. 
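+	 *
+	 * For example, with a 512 byte sector size and a final fill_len
+	 * of 1000 bytes, extra_buf_len is 512 - (1000 % 512) = 24, so a
+	 * zeroed 24 byte buffer is appended and the padded transfer
+	 * covers exactly two sectors.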
+ */ + if ((data->fill_len % sector_size) != 0) { + extra_buf_len = sector_size - (data->fill_len % sector_size); + extra_buf = calloc(extra_buf_len, 1); + if (extra_buf == NULL) { + warn("%s: unable to allocate %zu bytes for extra " + "buffer space", __func__, extra_buf_len); + retval = 1; + goto bailout; + } + data->extra_buf = 1; + data->sg_count++; + } + if (iovec == 0) { + data->segs = calloc(data->sg_count, sizeof(bus_dma_segment_t)); + if (data->segs == NULL) { + warn("%s: unable to allocate %zu bytes for S/G list", + __func__, sizeof(bus_dma_segment_t) * + data->sg_count); + retval = 1; + goto bailout; + } + + } else { + data->iovec = calloc(data->sg_count, sizeof(struct iovec)); + if (data->iovec == NULL) { + warn("%s: unable to allocate %zu bytes for S/G list", + __func__, sizeof(struct iovec) * data->sg_count); + retval = 1; + goto bailout; + } + } + + for (i = 0, tmp_buf = STAILQ_FIRST(&buf->src_list); + i < buf->src_count && tmp_buf != NULL; i++, + tmp_buf = STAILQ_NEXT(tmp_buf, src_links)) { + + if (tmp_buf->buf_type == CAMDD_BUF_DATA) { + struct camdd_buf_data *tmp_data; + + tmp_data = &tmp_buf->buf_type_spec.data; + if (iovec == 0) { + data->segs[i].ds_addr = + (bus_addr_t) tmp_data->buf; + data->segs[i].ds_len = tmp_data->fill_len - + tmp_data->resid; + } else { + data->iovec[i].iov_base = tmp_data->buf; + data->iovec[i].iov_len = tmp_data->fill_len - + tmp_data->resid; + } + if (((tmp_data->fill_len - tmp_data->resid) % + sector_size) != 0) + *double_buf_needed = 1; + } else { + struct camdd_buf_indirect *tmp_ind; + + tmp_ind = &tmp_buf->buf_type_spec.indirect; + if (iovec == 0) { + data->segs[i].ds_addr = + (bus_addr_t)tmp_ind->start_ptr; + data->segs[i].ds_len = tmp_ind->len; + } else { + data->iovec[i].iov_base = tmp_ind->start_ptr; + data->iovec[i].iov_len = tmp_ind->len; + } + if ((tmp_ind->len % sector_size) != 0) + *double_buf_needed = 1; + } + } + + if (extra_buf != NULL) { + if (iovec == 0) { + data->segs[i].ds_addr = (bus_addr_t)extra_buf; + data->segs[i].ds_len = extra_buf_len; + } else { + data->iovec[i].iov_base = extra_buf; + data->iovec[i].iov_len = extra_buf_len; + } + i++; + } + if ((tmp_buf != NULL) || (i != data->sg_count)) { + warnx("buffer source count does not match " + "number of buffers in list!"); + retval = 1; + goto bailout; + } + +bailout: + if (retval == 0) { + *num_sectors_used = (data->fill_len + extra_buf_len) / + sector_size; + } + return (retval); +} + +uint32_t +camdd_buf_get_len(struct camdd_buf *buf) +{ + uint32_t len = 0; + + if (buf->buf_type != CAMDD_BUF_DATA) { + struct camdd_buf_indirect *indirect; + + indirect = &buf->buf_type_spec.indirect; + len = indirect->len; + } else { + struct camdd_buf_data *data; + + data = &buf->buf_type_spec.data; + len = data->fill_len; + } + + return (len); +} + +void +camdd_buf_add_child(struct camdd_buf *buf, struct camdd_buf *child_buf) +{ + struct camdd_buf_data *data; + + assert(buf->buf_type == CAMDD_BUF_DATA); + + data = &buf->buf_type_spec.data; + + STAILQ_INSERT_TAIL(&buf->src_list, child_buf, src_links); + buf->src_count++; + + data->fill_len += camdd_buf_get_len(child_buf); +} + +typedef enum { + CAMDD_TS_MAX_BLK, + CAMDD_TS_MIN_BLK, + CAMDD_TS_BLK_GRAN, + CAMDD_TS_EFF_IOSIZE +} camdd_status_item_index; + +static struct camdd_status_items { + const char *name; + struct mt_status_entry *entry; +} req_status_items[] = { + { "max_blk", NULL }, + { "min_blk", NULL }, + { "blk_gran", NULL }, + { "max_effective_iosize", NULL } +}; + +int +camdd_probe_tape(int fd, char *filename, uint64_t 
*max_iosize, + uint64_t *max_blk, uint64_t *min_blk, uint64_t *blk_gran) +{ + struct mt_status_data status_data; + char *xml_str = NULL; + unsigned int i; + int retval = 0; + + retval = mt_get_xml_str(fd, MTIOCEXTGET, &xml_str); + if (retval != 0) + err(1, "Couldn't get XML string from %s", filename); + + retval = mt_get_status(xml_str, &status_data); + if (retval != XML_STATUS_OK) { + warn("couldn't get status for %s", filename); + retval = 1; + goto bailout; + } else + retval = 0; + + if (status_data.error != 0) { + warnx("%s", status_data.error_str); + retval = 1; + goto bailout; + } + + for (i = 0; i < sizeof(req_status_items) / + sizeof(req_status_items[0]); i++) { + char *name; + + name = __DECONST(char *, req_status_items[i].name); + req_status_items[i].entry = mt_status_entry_find(&status_data, + name); + if (req_status_items[i].entry == NULL) { + errx(1, "Cannot find status entry %s", + req_status_items[i].name); + } + } + + *max_iosize = req_status_items[CAMDD_TS_EFF_IOSIZE].entry->value_unsigned; + *max_blk= req_status_items[CAMDD_TS_MAX_BLK].entry->value_unsigned; + *min_blk= req_status_items[CAMDD_TS_MIN_BLK].entry->value_unsigned; + *blk_gran = req_status_items[CAMDD_TS_BLK_GRAN].entry->value_unsigned; +bailout: + + free(xml_str); + mt_status_free(&status_data); + + return (retval); +} + +struct camdd_dev * +camdd_probe_file(int fd, struct camdd_io_opts *io_opts, int retry_count, + int timeout) +{ + struct camdd_dev *dev = NULL; + struct camdd_dev_file *file_dev; + uint64_t blocksize = io_opts->blocksize; + + dev = camdd_alloc_dev(CAMDD_DEV_FILE, NULL, 0, retry_count, timeout); + if (dev == NULL) + goto bailout; + + file_dev = &dev->dev_spec.file; + file_dev->fd = fd; + strlcpy(file_dev->filename, io_opts->dev_name, + sizeof(file_dev->filename)); + strlcpy(dev->device_name, io_opts->dev_name, sizeof(dev->device_name)); + if (blocksize == 0) + dev->blocksize = CAMDD_FILE_DEFAULT_BLOCK; + else + dev->blocksize = blocksize; + + if ((io_opts->queue_depth != 0) + && (io_opts->queue_depth != 1)) { + warnx("Queue depth %ju for %s ignored, only 1 outstanding " + "command supported", (uintmax_t)io_opts->queue_depth, + io_opts->dev_name); + } + dev->target_queue_depth = CAMDD_FILE_DEFAULT_DEPTH; + dev->run = camdd_file_run; + dev->fetch = NULL; + + /* + * We can effectively access files on byte boundaries. We'll reset + * this for devices like disks that can be accessed on sector + * boundaries. 
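+	 *
+	 * The net effect: regular files and pipes keep sector_size = 1,
+	 * disks use the DIOCGSECTORSIZE value (typically 512 or 4096),
+	 * and tapes use the blocksize itself so that whole blocks are
+	 * always transferred.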
+	 */
+	dev->sector_size = 1;
+
+	if ((fd != STDIN_FILENO)
+	 && (fd != STDOUT_FILENO)) {
+		int retval;
+
+		retval = fstat(fd, &file_dev->sb);
+		if (retval != 0) {
+			warn("Cannot stat %s", dev->device_name);
+			goto bailout_error;
+		}
+		if (S_ISREG(file_dev->sb.st_mode)) {
+			file_dev->file_type = CAMDD_FILE_REG;
+		} else if (S_ISCHR(file_dev->sb.st_mode)) {
+			int type;
+
+			if (ioctl(fd, FIODTYPE, &type) == -1)
+				err(1, "FIODTYPE ioctl failed on %s",
+				    dev->device_name);
+			else {
+				if (type & D_TAPE)
+					file_dev->file_type = CAMDD_FILE_TAPE;
+				else if (type & D_DISK)
+					file_dev->file_type = CAMDD_FILE_DISK;
+				else if (type & D_MEM)
+					file_dev->file_type = CAMDD_FILE_MEM;
+				else if (type & D_TTY)
+					file_dev->file_type = CAMDD_FILE_TTY;
+			}
+		} else if (S_ISDIR(file_dev->sb.st_mode)) {
+			errx(1, "cannot operate on directory %s",
+			    dev->device_name);
+		} else if (S_ISFIFO(file_dev->sb.st_mode)) {
+			file_dev->file_type = CAMDD_FILE_PIPE;
+		} else
+			errx(1, "Cannot determine file type for %s",
+			    dev->device_name);
+
+		switch (file_dev->file_type) {
+		case CAMDD_FILE_REG:
+			if (file_dev->sb.st_size != 0)
+				dev->max_sector = file_dev->sb.st_size - 1;
+			else
+				dev->max_sector = 0;
+			file_dev->file_flags |= CAMDD_FF_CAN_SEEK;
+			break;
+		case CAMDD_FILE_TAPE: {
+			uint64_t max_iosize, max_blk, min_blk, blk_gran;
+			/*
+			 * Check block limits and maximum effective iosize.
+			 * Make sure the blocksize is within the block
+			 * limits (and a multiple of the minimum blocksize)
+			 * and that the blocksize is <= maximum effective
+			 * iosize.
+			 */
+			retval = camdd_probe_tape(fd, dev->device_name,
+			    &max_iosize, &max_blk, &min_blk, &blk_gran);
+			if (retval != 0)
+				errx(1, "Unable to probe tape %s",
+				    dev->device_name);
+
+			/*
+			 * The blocksize needs to be <= the maximum
+			 * effective I/O size of the tape device.  Note
+			 * that this also takes into account the maximum
+			 * blocksize reported by READ BLOCK LIMITS.
+			 */
+			if (dev->blocksize > max_iosize) {
+				warnx("Blocksize %u too big for %s, limiting "
+				    "to %ju", dev->blocksize, dev->device_name,
+				    (uintmax_t)max_iosize);
+				dev->blocksize = max_iosize;
+			}
+
+			/*
+			 * The blocksize needs to be at least min_blk.
+			 */
+			if (dev->blocksize < min_blk) {
+				warnx("Blocksize %u too small for %s, "
+				    "increasing to %ju", dev->blocksize,
+				    dev->device_name, (uintmax_t)min_blk);
+				dev->blocksize = min_blk;
+			}
+
+			/*
+			 * And the blocksize needs to be a multiple of
+			 * the block granularity.
+			 */
+			if ((blk_gran != 0)
+			 && (dev->blocksize % (1 << blk_gran))) {
+				warnx("Blocksize %u for %s not a multiple of "
+				    "%d, adjusting to %d", dev->blocksize,
+				    dev->device_name, (1 << blk_gran),
+				    dev->blocksize & ~((1 << blk_gran) - 1));
+				dev->blocksize &= ~((1 << blk_gran) - 1);
+			}
+
+			if (dev->blocksize == 0) {
+				errx(1, "Unable to derive valid blocksize for "
+				    "%s", dev->device_name);
+			}
+
+			/*
+			 * For tape drives, set the sector size to the
+			 * blocksize so that we make sure not to write
+			 * less than the blocksize out to the drive.
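+			 *
+			 * As an example of the granularity check above, a
+			 * drive reporting blk_gran = 2 requires transfers
+			 * in multiples of 1 << 2 = 4 bytes, so a requested
+			 * blocksize of 524289 is rounded down to 524288.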
+			 */
+			dev->sector_size = dev->blocksize;
+			break;
+		}
+		case CAMDD_FILE_DISK: {
+			off_t media_size;
+			unsigned int sector_size;
+
+			file_dev->file_flags |= CAMDD_FF_CAN_SEEK;
+
+			if (ioctl(fd, DIOCGSECTORSIZE, &sector_size) == -1) {
+				err(1, "DIOCGSECTORSIZE ioctl failed on %s",
+				    dev->device_name);
+			}
+
+			if (sector_size == 0) {
+				errx(1, "DIOCGSECTORSIZE ioctl returned "
+				    "invalid sector size %u for %s",
+				    sector_size, dev->device_name);
+			}
+
+			if (ioctl(fd, DIOCGMEDIASIZE, &media_size) == -1) {
+				err(1, "DIOCGMEDIASIZE ioctl failed on %s",
+				    dev->device_name);
+			}
+
+			if (media_size == 0) {
+				errx(1, "DIOCGMEDIASIZE ioctl returned "
+				    "invalid media size %ju for %s",
+				    (uintmax_t)media_size, dev->device_name);
+			}
+
+			if (dev->blocksize % sector_size) {
+				errx(1, "%s blocksize %u not a multiple of "
+				    "sector size %u", dev->device_name,
+				    dev->blocksize, sector_size);
+			}
+
+			dev->sector_size = sector_size;
+			dev->max_sector = (media_size / sector_size) - 1;
+			break;
+		}
+		case CAMDD_FILE_MEM:
+			file_dev->file_flags |= CAMDD_FF_CAN_SEEK;
+			break;
+		default:
+			break;
+		}
+	}
+
+	if ((io_opts->offset != 0)
+	 && ((file_dev->file_flags & CAMDD_FF_CAN_SEEK) == 0)) {
+		warnx("Offset %ju specified for %s, but we cannot seek on %s",
+		    (uintmax_t)io_opts->offset, io_opts->dev_name,
+		    io_opts->dev_name);
+		goto bailout_error;
+	}
+#if 0
+	else if ((io_opts->offset != 0)
+		&& ((io_opts->offset % dev->sector_size) != 0)) {
+		warnx("Offset %ju for %s is not a multiple of the "
+		      "sector size %u", io_opts->offset,
+		      io_opts->dev_name, dev->sector_size);
+		goto bailout_error;
+	} else {
+		dev->start_offset_bytes = io_opts->offset;
+	}
+#endif
+
+bailout:
+	return (dev);
+
+bailout_error:
+	camdd_free_dev(dev);
+	return (NULL);
+}
+
+/*
+ * Need to implement this.  Do a basic probe:
+ * - Check the inquiry data, make sure we're talking to a device that we
+ *   can reasonably expect to talk to -- direct, RBC, CD, WORM.
+ * - Send a test unit ready, make sure the device is available.
+ * - Get the capacity and block size.
+ */
+struct camdd_dev *
+camdd_probe_pass(struct cam_device *cam_dev, struct camdd_io_opts *io_opts,
+		 camdd_argmask arglist, int probe_retry_count,
+		 int probe_timeout, int io_retry_count, int io_timeout)
+{
+	union ccb *ccb;
+	uint64_t maxsector;
+	uint32_t cpi_maxio, max_iosize, pass_numblocks;
+	uint32_t block_len;
+	struct scsi_read_capacity_data rcap;
+	struct scsi_read_capacity_data_long rcaplong;
+	struct camdd_dev *dev;
+	struct camdd_dev_pass *pass_dev;
+	struct kevent ke;
+	int scsi_dev_type;
+	int retval;
+
+	dev = NULL;
+
+	scsi_dev_type = SID_TYPE(&cam_dev->inq_data);
+	maxsector = 0;
+	block_len = 0;
+
+	/*
+	 * For devices that support READ CAPACITY, we'll attempt to get the
+	 * capacity.  Otherwise, we really don't support tape or other
+	 * devices via SCSI passthrough, so just return an error in that case.
+	 */
+	switch (scsi_dev_type) {
+	case T_DIRECT:
+	case T_WORM:
+	case T_CDROM:
+	case T_OPTICAL:
+	case T_RBC:
+		break;
+	default:
+		errx(1, "Unsupported SCSI device type %d", scsi_dev_type);
+		break; /*NOTREACHED*/
+	}
+
+	ccb = cam_getccb(cam_dev);
+
+	if (ccb == NULL) {
+		warnx("%s: error allocating ccb", __func__);
+		goto bailout;
+	}
+
+	bzero(&(&ccb->ccb_h)[1],
+	    sizeof(struct ccb_scsiio) - sizeof(struct ccb_hdr));
+
+	scsi_read_capacity(&ccb->csio,
+			   /*retries*/ probe_retry_count,
+			   /*cbfcnp*/ NULL,
+			   /*tag_action*/ MSG_SIMPLE_Q_TAG,
+			   &rcap,
+			   SSD_FULL_SIZE,
+			   /*timeout*/ probe_timeout ?
probe_timeout : 5000); + + /* Disable freezing the device queue */ + ccb->ccb_h.flags |= CAM_DEV_QFRZDIS; + + if (arglist & CAMDD_ARG_ERR_RECOVER) + ccb->ccb_h.flags |= CAM_PASS_ERR_RECOVER; + + if (cam_send_ccb(cam_dev, ccb) < 0) { + warn("error sending READ CAPACITY command"); + + cam_error_print(cam_dev, ccb, CAM_ESF_ALL, + CAM_EPF_ALL, stderr); + + goto bailout; + } + + if ((ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { + cam_error_print(cam_dev, ccb, CAM_ESF_ALL, CAM_EPF_ALL, stderr); + retval = 1; + goto bailout; + } + + maxsector = scsi_4btoul(rcap.addr); + block_len = scsi_4btoul(rcap.length); + + /* + * A last block of 2^32-1 means that the true capacity is over 2TB, + * and we need to issue the long READ CAPACITY to get the real + * capacity. Otherwise, we're all set. + */ + if (maxsector != 0xffffffff) + goto rcap_done; + + scsi_read_capacity_16(&ccb->csio, + /*retries*/ probe_retry_count, + /*cbfcnp*/ NULL, + /*tag_action*/ MSG_SIMPLE_Q_TAG, + /*lba*/ 0, + /*reladdr*/ 0, + /*pmi*/ 0, + (uint8_t *)&rcaplong, + sizeof(rcaplong), + /*sense_len*/ SSD_FULL_SIZE, + /*timeout*/ probe_timeout ? probe_timeout : 5000); + + /* Disable freezing the device queue */ + ccb->ccb_h.flags |= CAM_DEV_QFRZDIS; + + if (arglist & CAMDD_ARG_ERR_RECOVER) + ccb->ccb_h.flags |= CAM_PASS_ERR_RECOVER; + + if (cam_send_ccb(cam_dev, ccb) < 0) { + warn("error sending READ CAPACITY (16) command"); + + cam_error_print(cam_dev, ccb, CAM_ESF_ALL, + CAM_EPF_ALL, stderr); + + retval = 1; + goto bailout; + } + + if ((ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { + cam_error_print(cam_dev, ccb, CAM_ESF_ALL, CAM_EPF_ALL, stderr); + goto bailout; + } + + maxsector = scsi_8btou64(rcaplong.addr); + block_len = scsi_4btoul(rcaplong.length); + +rcap_done: + + bzero(&(&ccb->ccb_h)[1], + sizeof(struct ccb_scsiio) - sizeof(struct ccb_hdr)); + + ccb->ccb_h.func_code = XPT_PATH_INQ; + ccb->ccb_h.flags = CAM_DIR_NONE; + ccb->ccb_h.retry_count = 1; + + if (cam_send_ccb(cam_dev, ccb) < 0) { + warn("error sending XPT_PATH_INQ CCB"); + + cam_error_print(cam_dev, ccb, CAM_ESF_ALL, + CAM_EPF_ALL, stderr); + goto bailout; + } + + EV_SET(&ke, cam_dev->fd, EVFILT_READ, EV_ADD|EV_ENABLE, 0, 0, 0); + + dev = camdd_alloc_dev(CAMDD_DEV_PASS, &ke, 1, io_retry_count, + io_timeout); + if (dev == NULL) + goto bailout; + + pass_dev = &dev->dev_spec.pass; + pass_dev->scsi_dev_type = scsi_dev_type; + pass_dev->dev = cam_dev; + pass_dev->max_sector = maxsector; + pass_dev->block_len = block_len; + pass_dev->cpi_maxio = ccb->cpi.maxio; + snprintf(dev->device_name, sizeof(dev->device_name), "%s%u", + pass_dev->dev->device_name, pass_dev->dev->dev_unit_num); + dev->sector_size = block_len; + dev->max_sector = maxsector; + + + /* + * Determine the optimal blocksize to use for this device. + */ + + /* + * If the controller has not specified a maximum I/O size, + * just go with 128K as a somewhat conservative value. + */ + if (pass_dev->cpi_maxio == 0) + cpi_maxio = 131072; + else + cpi_maxio = pass_dev->cpi_maxio; + + /* + * If the controller has a large maximum I/O size, limit it + * to something smaller so that the kernel doesn't have trouble + * allocating buffers to copy data in and out for us. + * XXX KDM this is until we have unmapped I/O support in the kernel. + */ + max_iosize = min(cpi_maxio, CAMDD_PASS_MAX_BLOCK); + + /* + * If we weren't able to get a block size for some reason, + * default to 512 bytes. 
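+	 *
+	 * As a worked example: a controller reporting no maxio is capped
+	 * at 131072 bytes; with 512 byte blocks that is 256 blocks per
+	 * transfer, for a derived device blocksize of 256 * 512 = 131072.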
+	 */
+	block_len = pass_dev->block_len;
+	if (block_len == 0)
+		block_len = 512;
+
+	/*
+	 * Figure out how many blocksize chunks will fit in the
+	 * maximum I/O size.
+	 */
+	pass_numblocks = max_iosize / block_len;
+
+	/*
+	 * And finally, multiply the number of blocks by the LBA
+	 * length to get our maximum blocksize.
+	 */
+	dev->blocksize = pass_numblocks * block_len;
+
+	if (io_opts->blocksize != 0) {
+		if ((io_opts->blocksize % dev->sector_size) != 0) {
+			warnx("Blocksize %ju for %s is not a multiple of "
+			    "sector size %u", (uintmax_t)io_opts->blocksize,
+			    dev->device_name, dev->sector_size);
+			goto bailout_error;
+		}
+		dev->blocksize = io_opts->blocksize;
+	}
+	dev->target_queue_depth = CAMDD_PASS_DEFAULT_DEPTH;
+	if (io_opts->queue_depth != 0)
+		dev->target_queue_depth = io_opts->queue_depth;
+
+	if (io_opts->offset != 0) {
+		if (io_opts->offset > (dev->max_sector * dev->sector_size)) {
+			warnx("Offset %ju is past the end of device %s",
+			    (uintmax_t)io_opts->offset, dev->device_name);
+			goto bailout_error;
+		}
+#if 0
+		else if ((io_opts->offset % dev->sector_size) != 0) {
+			warnx("Offset %ju for %s is not a multiple of the "
+			      "sector size %u", io_opts->offset,
+			      dev->device_name, dev->sector_size);
+			goto bailout_error;
+		}
+		dev->start_offset_bytes = io_opts->offset;
+#endif
+	}
+
+	dev->min_cmd_size = io_opts->min_cmd_size;
+
+	dev->run = camdd_pass_run;
+	dev->fetch = camdd_pass_fetch;
+
+bailout:
+	cam_freeccb(ccb);
+
+	return (dev);
+
+bailout_error:
+	cam_freeccb(ccb);
+
+	camdd_free_dev(dev);
+
+	return (NULL);
+}
+
+void *
+camdd_worker(void *arg)
+{
+	struct camdd_dev *dev = arg;
+	struct camdd_buf *buf;
+	struct timespec ts, *kq_ts;
+
+	ts.tv_sec = 0;
+	ts.tv_nsec = 0;
+
+	pthread_mutex_lock(&dev->mutex);
+
+	dev->flags |= CAMDD_DEV_FLAG_ACTIVE;
+
+	for (;;) {
+		struct kevent ke;
+		int retval = 0;
+
+		/*
+		 * XXX KDM check the reorder queue depth?
+		 */
+		if (dev->write_dev == 0) {
+			uint32_t our_depth, peer_depth, peer_bytes, our_bytes;
+			uint32_t target_depth = dev->target_queue_depth;
+			uint32_t peer_target_depth =
+			    dev->peer_dev->target_queue_depth;
+			uint32_t peer_blocksize = dev->peer_dev->blocksize;
+
+			camdd_get_depth(dev, &our_depth, &peer_depth,
+			    &our_bytes, &peer_bytes);
+
+#if 0
+			while (((our_depth < target_depth)
+			     && (peer_depth < peer_target_depth))
+			    || ((peer_bytes + our_bytes) <
+				(peer_blocksize * 2))) {
+#endif
+			while (((our_depth + peer_depth) <
+				(target_depth + peer_target_depth))
+			    || ((peer_bytes + our_bytes) <
+				(peer_blocksize * 3))) {
+
+				retval = camdd_queue(dev, NULL);
+				if (retval == 1)
+					break;
+				else if (retval != 0) {
+					error_exit = 1;
+					goto bailout;
+				}
+
+				camdd_get_depth(dev, &our_depth, &peer_depth,
+				    &our_bytes, &peer_bytes);
+			}
+		}
+		/*
+		 * See if we have any I/O that is ready to execute.
+		 */
+		buf = STAILQ_FIRST(&dev->run_queue);
+		if (buf != NULL) {
+			while (dev->target_queue_depth > dev->cur_active_io) {
+				retval = dev->run(dev);
+				if (retval == -1) {
+					dev->flags |= CAMDD_DEV_FLAG_EOF;
+					error_exit = 1;
+					break;
+				} else if (retval != 0) {
+					break;
+				}
+			}
+		}
+
+		/*
+		 * We've reached EOF, or our partner has reached EOF.
+		 */
+		if ((dev->flags & CAMDD_DEV_FLAG_EOF)
+		 || (dev->flags & CAMDD_DEV_FLAG_PEER_EOF)) {
+			if (dev->write_dev != 0) {
+				if ((STAILQ_EMPTY(&dev->work_queue))
+				 && (dev->num_run_queue == 0)
+				 && (dev->cur_active_io == 0)) {
+					goto bailout;
+				}
+			} else {
+				/*
+				 * If we're the reader, and the writer
+				 * got EOF, he is already done.
If we got + * the EOF, then we need to wait until + * everything is flushed out for the writer. + */ + if (dev->flags & CAMDD_DEV_FLAG_PEER_EOF) { + goto bailout; + } else if ((dev->num_peer_work_queue == 0) + && (dev->num_peer_done_queue == 0) + && (dev->cur_active_io == 0) + && (dev->num_run_queue == 0)) { + goto bailout; + } + } + /* + * XXX KDM need to do something about the pending + * queue and cleanup resources. + */ + } + + if ((dev->write_dev == 0) + && (dev->cur_active_io == 0) + && (dev->peer_bytes_queued < dev->peer_dev->blocksize)) + kq_ts = &ts; + else + kq_ts = NULL; + + /* + * Run kevent to see if there are events to process. + */ + pthread_mutex_unlock(&dev->mutex); + retval = kevent(dev->kq, NULL, 0, &ke, 1, kq_ts); + pthread_mutex_lock(&dev->mutex); + if (retval == -1) { + warn("%s: error returned from kevent",__func__); + goto bailout; + } else if (retval != 0) { + switch (ke.filter) { + case EVFILT_READ: + if (dev->fetch != NULL) { + retval = dev->fetch(dev); + if (retval == -1) { + error_exit = 1; + goto bailout; + } + } + break; + case EVFILT_SIGNAL: + /* + * We register for this so we don't get + * an error as a result of a SIGINFO or a + * SIGINT. It will actually get handled + * by the signal handler. If we get a + * SIGINT, bail out without printing an + * error message. Any other signals + * will result in the error message above. + */ + if (ke.ident == SIGINT) + goto bailout; + break; + case EVFILT_USER: + retval = 0; + /* + * Check to see if the other thread has + * queued any I/O for us to do. (In this + * case we're the writer.) + */ + for (buf = STAILQ_FIRST(&dev->work_queue); + buf != NULL; + buf = STAILQ_FIRST(&dev->work_queue)) { + STAILQ_REMOVE_HEAD(&dev->work_queue, + work_links); + retval = camdd_queue(dev, buf); + /* + * We keep going unless we get an + * actual error. If we get EOF, we + * still want to remove the buffers + * from the queue and send the back + * to the reader thread. + */ + if (retval == -1) { + error_exit = 1; + goto bailout; + } else + retval = 0; + } + + /* + * Next check to see if the other thread has + * queued any completed buffers back to us. + * (In this case we're the reader.) + */ + for (buf = STAILQ_FIRST(&dev->peer_done_queue); + buf != NULL; + buf = STAILQ_FIRST(&dev->peer_done_queue)){ + STAILQ_REMOVE_HEAD( + &dev->peer_done_queue, work_links); + dev->num_peer_done_queue--; + camdd_peer_done(buf); + } + break; + default: + warnx("%s: unknown kevent filter %d", + __func__, ke.filter); + break; + } + } + } + +bailout: + + dev->flags &= ~CAMDD_DEV_FLAG_ACTIVE; + + /* XXX KDM cleanup resources here? */ + + pthread_mutex_unlock(&dev->mutex); + + need_exit = 1; + sem_post(&camdd_sem); + + return (NULL); +} + +/* + * Simplistic translation of CCB status to our local status. 
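+ *
+ * For example, CAM_REQ_CMP with a zero residual maps to CAMDD_STATUS_OK,
+ * a partial residual maps to CAMDD_STATUS_SHORT_IO, a residual covering
+ * the whole transfer maps to CAMDD_STATUS_EOF, and CHECK CONDITION and
+ * other SCSI errors map to CAMDD_STATUS_ERROR.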
+ */ +camdd_buf_status +camdd_ccb_status(union ccb *ccb) +{ + camdd_buf_status status = CAMDD_STATUS_NONE; + cam_status ccb_status; + + ccb_status = ccb->ccb_h.status & CAM_STATUS_MASK; + + switch (ccb_status) { + case CAM_REQ_CMP: { + if (ccb->csio.resid == 0) { + status = CAMDD_STATUS_OK; + } else if (ccb->csio.dxfer_len > ccb->csio.resid) { + status = CAMDD_STATUS_SHORT_IO; + } else { + status = CAMDD_STATUS_EOF; + } + break; + } + case CAM_SCSI_STATUS_ERROR: { + switch (ccb->csio.scsi_status) { + case SCSI_STATUS_OK: + case SCSI_STATUS_COND_MET: + case SCSI_STATUS_INTERMED: + case SCSI_STATUS_INTERMED_COND_MET: + status = CAMDD_STATUS_OK; + break; + case SCSI_STATUS_CMD_TERMINATED: + case SCSI_STATUS_CHECK_COND: + case SCSI_STATUS_QUEUE_FULL: + case SCSI_STATUS_BUSY: + case SCSI_STATUS_RESERV_CONFLICT: + default: + status = CAMDD_STATUS_ERROR; + break; + } + break; + } + default: + status = CAMDD_STATUS_ERROR; + break; + } + + return (status); +} + +/* + * Queue a buffer to our peer's work thread for writing. + * + * Returns 0 for success, -1 for failure, 1 if the other thread exited. + */ +int +camdd_queue_peer_buf(struct camdd_dev *dev, struct camdd_buf *buf) +{ + struct kevent ke; + STAILQ_HEAD(, camdd_buf) local_queue; + struct camdd_buf *buf1, *buf2; + struct camdd_buf_data *data = NULL; + uint64_t peer_bytes_queued = 0; + int active = 1; + int retval = 0; + + STAILQ_INIT(&local_queue); + + /* + * Since we're the reader, we need to queue our I/O to the writer + * in sequential order in order to make sure it gets written out + * in sequential order. + * + * Check the next expected I/O starting offset. If this doesn't + * match, put it on the reorder queue. + */ + if ((buf->lba * dev->sector_size) != dev->next_completion_pos_bytes) { + + /* + * If there is nothing on the queue, there is no sorting + * needed. + */ + if (STAILQ_EMPTY(&dev->reorder_queue)) { + STAILQ_INSERT_TAIL(&dev->reorder_queue, buf, links); + dev->num_reorder_queue++; + goto bailout; + } + + /* + * Sort in ascending order by starting LBA. There should + * be no identical LBAs. + */ + for (buf1 = STAILQ_FIRST(&dev->reorder_queue); buf1 != NULL; + buf1 = buf2) { + buf2 = STAILQ_NEXT(buf1, links); + if (buf->lba < buf1->lba) { + /* + * If we're less than the first one, then + * we insert at the head of the list + * because this has to be the first element + * on the list. + */ + STAILQ_INSERT_HEAD(&dev->reorder_queue, + buf, links); + dev->num_reorder_queue++; + break; + } else if (buf->lba > buf1->lba) { + if (buf2 == NULL) { + STAILQ_INSERT_TAIL(&dev->reorder_queue, + buf, links); + dev->num_reorder_queue++; + break; + } else if (buf->lba < buf2->lba) { + STAILQ_INSERT_AFTER(&dev->reorder_queue, + buf1, buf, links); + dev->num_reorder_queue++; + break; + } + } else { + errx(1, "Found buffers with duplicate LBA %ju!", + buf->lba); + } + } + goto bailout; + } else { + + /* + * We're the next expected I/O completion, so put ourselves + * on the local queue to be sent to the writer. We use + * work_links here so that we can queue this to the + * peer_work_queue before taking the buffer off of the + * local_queue. + */ + dev->next_completion_pos_bytes += buf->len; + STAILQ_INSERT_TAIL(&local_queue, buf, work_links); + + /* + * Go through the reorder queue looking for more sequential + * I/O and add it to the local queue. + */ + for (buf1 = STAILQ_FIRST(&dev->reorder_queue); buf1 != NULL; + buf1 = STAILQ_FIRST(&dev->reorder_queue)) { + /* + * As soon as we see an I/O that is out of sequence, + * we're done. 
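+			 *
+			 * For example, if the next expected completion
+			 * position is LBA 0 and one-block reads for LBAs
+			 * 0, 2 and 1 complete in that order, LBA 0 is
+			 * passed through immediately, LBA 2 is parked on
+			 * the reorder queue, and the completion of LBA 1
+			 * releases both 1 and 2 to the writer in order.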
+ */ + if ((buf1->lba * dev->sector_size) != + dev->next_completion_pos_bytes) + break; + + STAILQ_REMOVE_HEAD(&dev->reorder_queue, links); + dev->num_reorder_queue--; + STAILQ_INSERT_TAIL(&local_queue, buf1, work_links); + dev->next_completion_pos_bytes += buf1->len; + } + } + + /* + * Setup the event to let the other thread know that it has work + * pending. + */ + EV_SET(&ke, (uintptr_t)&dev->peer_dev->work_queue, EVFILT_USER, 0, + NOTE_TRIGGER, 0, NULL); + + /* + * Put this on our shadow queue so that we know what we've queued + * to the other thread. + */ + STAILQ_FOREACH_SAFE(buf1, &local_queue, work_links, buf2) { + if (buf1->buf_type != CAMDD_BUF_DATA) { + errx(1, "%s: should have a data buffer, not an " + "indirect buffer", __func__); + } + data = &buf1->buf_type_spec.data; + + /* + * We only need to send one EOF to the writer, and don't + * need to continue sending EOFs after that. + */ + if (buf1->status == CAMDD_STATUS_EOF) { + if (dev->flags & CAMDD_DEV_FLAG_EOF_SENT) { + STAILQ_REMOVE(&local_queue, buf1, camdd_buf, + work_links); + camdd_release_buf(buf1); + retval = 1; + continue; + } + dev->flags |= CAMDD_DEV_FLAG_EOF_SENT; + } + + + STAILQ_INSERT_TAIL(&dev->peer_work_queue, buf1, links); + peer_bytes_queued += (data->fill_len - data->resid); + dev->peer_bytes_queued += (data->fill_len - data->resid); + dev->num_peer_work_queue++; + } + + if (STAILQ_FIRST(&local_queue) == NULL) + goto bailout; + + /* + * Drop our mutex and pick up the other thread's mutex. We need to + * do this to avoid deadlocks. + */ + pthread_mutex_unlock(&dev->mutex); + pthread_mutex_lock(&dev->peer_dev->mutex); + + if (dev->peer_dev->flags & CAMDD_DEV_FLAG_ACTIVE) { + /* + * Put the buffers on the other thread's incoming work queue. + */ + for (buf1 = STAILQ_FIRST(&local_queue); buf1 != NULL; + buf1 = STAILQ_FIRST(&local_queue)) { + STAILQ_REMOVE_HEAD(&local_queue, work_links); + STAILQ_INSERT_TAIL(&dev->peer_dev->work_queue, buf1, + work_links); + } + /* + * Send an event to the other thread's kqueue to let it know + * that there is something on the work queue. + */ + retval = kevent(dev->peer_dev->kq, &ke, 1, NULL, 0, NULL); + if (retval == -1) + warn("%s: unable to add peer work_queue kevent", + __func__); + else + retval = 0; + } else + active = 0; + + pthread_mutex_unlock(&dev->peer_dev->mutex); + pthread_mutex_lock(&dev->mutex); + + /* + * If the other side isn't active, run through the queue and + * release all of the buffers. + */ + if (active == 0) { + for (buf1 = STAILQ_FIRST(&local_queue); buf1 != NULL; + buf1 = STAILQ_FIRST(&local_queue)) { + STAILQ_REMOVE_HEAD(&local_queue, work_links); + STAILQ_REMOVE(&dev->peer_work_queue, buf1, camdd_buf, + links); + dev->num_peer_work_queue--; + camdd_release_buf(buf1); + } + dev->peer_bytes_queued -= peer_bytes_queued; + retval = 1; + } + +bailout: + return (retval); +} + +/* + * Return a buffer to the reader thread when we have completed writing it. + */ +int +camdd_complete_peer_buf(struct camdd_dev *dev, struct camdd_buf *peer_buf) +{ + struct kevent ke; + int retval = 0; + + /* + * Setup the event to let the other thread know that we have + * completed a buffer. + */ + EV_SET(&ke, (uintptr_t)&dev->peer_dev->peer_done_queue, EVFILT_USER, 0, + NOTE_TRIGGER, 0, NULL); + + /* + * Drop our lock and acquire the other thread's lock before + * manipulating + */ + pthread_mutex_unlock(&dev->mutex); + pthread_mutex_lock(&dev->peer_dev->mutex); + + /* + * Put the buffer on the reader thread's peer done queue now that + * we have completed it. 
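+	 *
+	 * The reader's worker loop will wake up on the EVFILT_USER event
+	 * triggered below, pull this buffer off of its peer_done_queue,
+	 * and recycle it via camdd_peer_done().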
+ */ + STAILQ_INSERT_TAIL(&dev->peer_dev->peer_done_queue, peer_buf, + work_links); + dev->peer_dev->num_peer_done_queue++; + + /* + * Send an event to the peer thread to let it know that we've added + * something to its peer done queue. + */ + retval = kevent(dev->peer_dev->kq, &ke, 1, NULL, 0, NULL); + if (retval == -1) + warn("%s: unable to add peer_done_queue kevent", __func__); + else + retval = 0; + + /* + * Drop the other thread's lock and reacquire ours. + */ + pthread_mutex_unlock(&dev->peer_dev->mutex); + pthread_mutex_lock(&dev->mutex); + + return (retval); +} + +/* + * Free a buffer that was written out by the writer thread and returned to + * the reader thread. + */ +void +camdd_peer_done(struct camdd_buf *buf) +{ + struct camdd_dev *dev; + struct camdd_buf_data *data; + + dev = buf->dev; + if (buf->buf_type != CAMDD_BUF_DATA) { + errx(1, "%s: should have a data buffer, not an " + "indirect buffer", __func__); + } + + data = &buf->buf_type_spec.data; + + STAILQ_REMOVE(&dev->peer_work_queue, buf, camdd_buf, links); + dev->num_peer_work_queue--; + dev->peer_bytes_queued -= (data->fill_len - data->resid); + + if (buf->status == CAMDD_STATUS_EOF) + dev->flags |= CAMDD_DEV_FLAG_PEER_EOF; + + STAILQ_INSERT_TAIL(&dev->free_queue, buf, links); +} + +/* + * Assumes caller holds the lock for this device. + */ +void +camdd_complete_buf(struct camdd_dev *dev, struct camdd_buf *buf, + int *error_count) +{ + int retval = 0; + + /* + * If we're the reader, we need to send the completed I/O + * to the writer. If we're the writer, we need to just + * free up resources, or let the reader know if we've + * encountered an error. + */ + if (dev->write_dev == 0) { + retval = camdd_queue_peer_buf(dev, buf); + if (retval != 0) + (*error_count)++; + } else { + struct camdd_buf *tmp_buf, *next_buf; + + STAILQ_FOREACH_SAFE(tmp_buf, &buf->src_list, src_links, + next_buf) { + struct camdd_buf *src_buf; + struct camdd_buf_indirect *indirect; + + STAILQ_REMOVE(&buf->src_list, tmp_buf, + camdd_buf, src_links); + + tmp_buf->status = buf->status; + + if (tmp_buf->buf_type == CAMDD_BUF_DATA) { + camdd_complete_peer_buf(dev, tmp_buf); + continue; + } + + indirect = &tmp_buf->buf_type_spec.indirect; + src_buf = indirect->src_buf; + src_buf->refcount--; + /* + * XXX KDM we probably need to account for + * exactly how many bytes we were able to + * write. Allocate the residual to the + * first N buffers? Or just track the + * number of bytes written? Right now the reader + * doesn't do anything with a residual. + */ + src_buf->status = buf->status; + if (src_buf->refcount <= 0) + camdd_complete_peer_buf(dev, src_buf); + STAILQ_INSERT_TAIL(&dev->free_indirect_queue, + tmp_buf, links); + } + + STAILQ_INSERT_TAIL(&dev->free_queue, buf, links); + } +} + +/* + * Fetch all completed commands from the pass(4) device. + * + * Returns the number of commands received, or -1 if any of the commands + * completed with an error. Returns 0 if no commands are available. + */ +int +camdd_pass_fetch(struct camdd_dev *dev) +{ + struct camdd_dev_pass *pass_dev = &dev->dev_spec.pass; + union ccb ccb; + int retval = 0, num_fetched = 0, error_count = 0; + + pthread_mutex_unlock(&dev->mutex); + /* + * XXX KDM we don't distinguish between EFAULT and ENOENT. 
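+	 *
+	 * Each CAMIOGET call returns a single completed CCB, so the loop
+	 * below simply runs until the ioctl fails (no completed CCBs
+	 * left, or a copy error).  The camdd_buf that a completion
+	 * belongs to is recovered from the ccb_buf pointer that
+	 * camdd_pass_run() stashed in the CCB's peripheral private area.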
+ */ + while ((retval = ioctl(pass_dev->dev->fd, CAMIOGET, &ccb)) != -1) { + struct camdd_buf *buf; + struct camdd_buf_data *data; + cam_status ccb_status; + union ccb *buf_ccb; + + buf = ccb.ccb_h.ccb_buf; + data = &buf->buf_type_spec.data; + buf_ccb = &data->ccb; + + num_fetched++; + + /* + * Copy the CCB back out so we get status, sense data, etc. + */ + bcopy(&ccb, buf_ccb, sizeof(ccb)); + + pthread_mutex_lock(&dev->mutex); + + /* + * We're now done, so take this off the active queue. + */ + STAILQ_REMOVE(&dev->active_queue, buf, camdd_buf, links); + dev->cur_active_io--; + + ccb_status = ccb.ccb_h.status & CAM_STATUS_MASK; + if (ccb_status != CAM_REQ_CMP) { + cam_error_print(pass_dev->dev, &ccb, CAM_ESF_ALL, + CAM_EPF_ALL, stderr); + } + + data->resid = ccb.csio.resid; + dev->bytes_transferred += (ccb.csio.dxfer_len - ccb.csio.resid); + + if (buf->status == CAMDD_STATUS_NONE) + buf->status = camdd_ccb_status(&ccb); + if (buf->status == CAMDD_STATUS_ERROR) + error_count++; + else if (buf->status == CAMDD_STATUS_EOF) { + /* + * Once we queue this buffer to our partner thread, + * he will know that we've hit EOF. + */ + dev->flags |= CAMDD_DEV_FLAG_EOF; + } + + camdd_complete_buf(dev, buf, &error_count); + + /* + * Unlock in preparation for the ioctl call. + */ + pthread_mutex_unlock(&dev->mutex); + } + + pthread_mutex_lock(&dev->mutex); + + if (error_count > 0) + return (-1); + else + return (num_fetched); +} + +/* + * Returns -1 for error, 0 for success/continue, and 1 for resource + * shortage/stop processing. + */ +int +camdd_file_run(struct camdd_dev *dev) +{ + struct camdd_dev_file *file_dev = &dev->dev_spec.file; + struct camdd_buf_data *data; + struct camdd_buf *buf; + off_t io_offset; + int retval = 0, write_dev = dev->write_dev; + int error_count = 0, no_resources = 0, double_buf_needed = 0; + uint32_t num_sectors = 0, db_len = 0; + + buf = STAILQ_FIRST(&dev->run_queue); + if (buf == NULL) { + no_resources = 1; + goto bailout; + } else if ((dev->write_dev == 0) + && (dev->flags & (CAMDD_DEV_FLAG_EOF | + CAMDD_DEV_FLAG_EOF_SENT))) { + STAILQ_REMOVE(&dev->run_queue, buf, camdd_buf, links); + dev->num_run_queue--; + buf->status = CAMDD_STATUS_EOF; + error_count++; + goto bailout; + } + + /* + * If we're writing, we need to go through the source buffer list + * and create an S/G list. + */ + if (write_dev != 0) { + retval = camdd_buf_sg_create(buf, /*iovec*/ 1, + dev->sector_size, &num_sectors, &double_buf_needed); + if (retval != 0) { + no_resources = 1; + goto bailout; + } + } + + STAILQ_REMOVE(&dev->run_queue, buf, camdd_buf, links); + dev->num_run_queue--; + + data = &buf->buf_type_spec.data; + + /* + * pread(2) and pwrite(2) offsets are byte offsets. + */ + io_offset = buf->lba * dev->sector_size; + + /* + * Unlock the mutex while we read or write. + */ + pthread_mutex_unlock(&dev->mutex); + + /* + * Note that we don't need to double buffer if we're the reader + * because in that case, we have allocated a single buffer of + * sufficient size to do the read. This copy is necessary on + * writes because if one of the components of the S/G list is not + * a sector size multiple, the kernel will reject the write. This + * is unfortunate but not surprising. So this will make sure that + * we're using a single buffer that is a multiple of the sector size. 
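+	 *
+	 * For example, an iovec of 1000 + 1048 bytes totals 2048 bytes
+	 * (four 512 byte sectors), but neither element is itself a
+	 * sector multiple, so the pieces are first copied into tmp_buf
+	 * and written with a single pwrite().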
+ */ + if ((double_buf_needed != 0) + && (data->sg_count > 1) + && (write_dev != 0)) { + uint32_t cur_offset; + int i; + + if (file_dev->tmp_buf == NULL) + file_dev->tmp_buf = calloc(dev->blocksize, 1); + if (file_dev->tmp_buf == NULL) { + buf->status = CAMDD_STATUS_ERROR; + error_count++; + goto bailout; + } + for (i = 0, cur_offset = 0; i < data->sg_count; i++) { + bcopy(data->iovec[i].iov_base, + &file_dev->tmp_buf[cur_offset], + data->iovec[i].iov_len); + cur_offset += data->iovec[i].iov_len; + } + db_len = cur_offset; + } + + if (file_dev->file_flags & CAMDD_FF_CAN_SEEK) { + if (write_dev == 0) { + /* + * XXX KDM is there any way we would need a S/G + * list here? + */ + retval = pread(file_dev->fd, data->buf, + buf->len, io_offset); + } else { + if (double_buf_needed != 0) { + retval = pwrite(file_dev->fd, file_dev->tmp_buf, + db_len, io_offset); + } else if (data->sg_count == 0) { + retval = pwrite(file_dev->fd, data->buf, + data->fill_len, io_offset); + } else { + retval = pwritev(file_dev->fd, data->iovec, + data->sg_count, io_offset); + } + } + } else { + if (write_dev == 0) { + /* + * XXX KDM is there any way we would need a S/G + * list here? + */ + retval = read(file_dev->fd, data->buf, buf->len); + } else { + if (double_buf_needed != 0) { + retval = write(file_dev->fd, file_dev->tmp_buf, + db_len); + } else if (data->sg_count == 0) { + retval = write(file_dev->fd, data->buf, + data->fill_len); + } else { + retval = writev(file_dev->fd, data->iovec, + data->sg_count); + } + } + } + + /* We're done, re-acquire the lock */ + pthread_mutex_lock(&dev->mutex); + + if (retval >= (ssize_t)data->fill_len) { + /* + * If the bytes transferred is more than the request size, + * that indicates an overrun, which should only happen at + * the end of a transfer if we have to round up to a sector + * boundary. + */ + if (buf->status == CAMDD_STATUS_NONE) + buf->status = CAMDD_STATUS_OK; + data->resid = 0; + dev->bytes_transferred += retval; + } else if (retval == -1) { + warn("Error %s %s", (write_dev) ? 
"writing to" : + "reading from", file_dev->filename); + + buf->status = CAMDD_STATUS_ERROR; + data->resid = data->fill_len; + error_count++; + + if (dev->debug == 0) + goto bailout; + + if ((double_buf_needed != 0) + && (write_dev != 0)) { + fprintf(stderr, "%s: fd %d, DB buf %p, len %u lba %ju " + "offset %ju\n", __func__, file_dev->fd, + file_dev->tmp_buf, db_len, (uintmax_t)buf->lba, + (uintmax_t)io_offset); + } else if (data->sg_count == 0) { + fprintf(stderr, "%s: fd %d, buf %p, len %u, lba %ju " + "offset %ju\n", __func__, file_dev->fd, data->buf, + data->fill_len, (uintmax_t)buf->lba, + (uintmax_t)io_offset); + } else { + int i; + + fprintf(stderr, "%s: fd %d, len %u, lba %ju " + "offset %ju\n", __func__, file_dev->fd, + data->fill_len, (uintmax_t)buf->lba, + (uintmax_t)io_offset); + + for (i = 0; i < data->sg_count; i++) { + fprintf(stderr, "index %d ptr %p len %zu\n", + i, data->iovec[i].iov_base, + data->iovec[i].iov_len); + } + } + } else if (retval == 0) { + buf->status = CAMDD_STATUS_EOF; + if (dev->debug != 0) + printf("%s: got EOF from %s!\n", __func__, + file_dev->filename); + data->resid = data->fill_len; + error_count++; + } else if (retval < (ssize_t)data->fill_len) { + if (buf->status == CAMDD_STATUS_NONE) + buf->status = CAMDD_STATUS_SHORT_IO; + data->resid = data->fill_len - retval; + dev->bytes_transferred += retval; + } + +bailout: + if (buf != NULL) { + if (buf->status == CAMDD_STATUS_EOF) { + struct camdd_buf *buf2; + dev->flags |= CAMDD_DEV_FLAG_EOF; + STAILQ_FOREACH(buf2, &dev->run_queue, links) + buf2->status = CAMDD_STATUS_EOF; + } + + camdd_complete_buf(dev, buf, &error_count); + } + + if (error_count != 0) + return (-1); + else if (no_resources != 0) + return (1); + else + return (0); +} + +/* + * Execute one command from the run queue. Returns 0 for success, 1 for + * stop processing, and -1 for error. + */ +int +camdd_pass_run(struct camdd_dev *dev) +{ + struct camdd_buf *buf = NULL; + struct camdd_dev_pass *pass_dev = &dev->dev_spec.pass; + struct camdd_buf_data *data; + uint32_t num_blocks, sectors_used = 0; + union ccb *ccb; + int retval = 0, is_write = dev->write_dev; + int double_buf_needed = 0; + + buf = STAILQ_FIRST(&dev->run_queue); + if (buf == NULL) { + retval = 1; + goto bailout; + } + + /* + * If we're writing, we need to go through the source buffer list + * and create an S/G list. + */ + if (is_write != 0) { + retval = camdd_buf_sg_create(buf, /*iovec*/ 0,dev->sector_size, + §ors_used, &double_buf_needed); + if (retval != 0) { + retval = -1; + goto bailout; + } + } + + STAILQ_REMOVE(&dev->run_queue, buf, camdd_buf, links); + dev->num_run_queue--; + + data = &buf->buf_type_spec.data; + + ccb = &data->ccb; + bzero(&(&ccb->ccb_h)[1], + sizeof(struct ccb_scsiio) - sizeof(struct ccb_hdr)); + + /* + * In almost every case the number of blocks should be the device + * block size. The exception may be at the end of an I/O stream + * for a partial block or at the end of a device. + */ + if (is_write != 0) + num_blocks = sectors_used; + else + num_blocks = data->fill_len / pass_dev->block_len; + + scsi_read_write(&ccb->csio, + /*retries*/ dev->retry_count, + /*cbfcnp*/ NULL, + /*tag_action*/ MSG_SIMPLE_Q_TAG, + /*readop*/ (dev->write_dev == 0) ? SCSI_RW_READ : + SCSI_RW_WRITE, + /*byte2*/ 0, + /*minimum_cmd_size*/ dev->min_cmd_size, + /*lba*/ buf->lba, + /*block_count*/ num_blocks, + /*data_ptr*/ (data->sg_count != 0) ? 
+ (uint8_t *)data->segs : data->buf, + /*dxfer_len*/ (num_blocks * pass_dev->block_len), + /*sense_len*/ SSD_FULL_SIZE, + /*timeout*/ dev->io_timeout); + + /* Disable freezing the device queue */ + ccb->ccb_h.flags |= CAM_DEV_QFRZDIS; + + if (dev->retry_count != 0) + ccb->ccb_h.flags |= CAM_PASS_ERR_RECOVER; + + if (data->sg_count != 0) { + ccb->csio.sglist_cnt = data->sg_count; + ccb->ccb_h.flags |= CAM_DATA_SG; + } + + /* + * Store a pointer to the buffer in the CCB. The kernel will + * restore this when we get it back, and we'll use it to identify + * the buffer this CCB came from. + */ + ccb->ccb_h.ccb_buf = buf; + + /* + * Unlock our mutex in preparation for issuing the ioctl. + */ + pthread_mutex_unlock(&dev->mutex); + /* + * Queue the CCB to the pass(4) driver. + */ + if (ioctl(pass_dev->dev->fd, CAMIOQUEUE, ccb) == -1) { + pthread_mutex_lock(&dev->mutex); + + warn("%s: error sending CAMIOQUEUE ioctl to %s%u", __func__, + pass_dev->dev->device_name, pass_dev->dev->dev_unit_num); + warn("%s: CCB address is %p", __func__, ccb); + retval = -1; + + STAILQ_INSERT_TAIL(&dev->free_queue, buf, links); + } else { + pthread_mutex_lock(&dev->mutex); + + dev->cur_active_io++; + STAILQ_INSERT_TAIL(&dev->active_queue, buf, links); + } + +bailout: + return (retval); +} + +int +camdd_get_next_lba_len(struct camdd_dev *dev, uint64_t *lba, ssize_t *len) +{ + struct camdd_dev_pass *pass_dev; + uint32_t num_blocks; + int retval = 0; + + pass_dev = &dev->dev_spec.pass; + + *lba = dev->next_io_pos_bytes / dev->sector_size; + *len = dev->blocksize; + num_blocks = *len / dev->sector_size; + + /* + * If max_sector is 0, then we have no set limit. This can happen + * if we're writing to a file in a filesystem, or reading from + * something like /dev/zero. + */ + if ((dev->max_sector != 0) + || (dev->sector_io_limit != 0)) { + uint64_t max_sector; + + if ((dev->max_sector != 0) + && (dev->sector_io_limit != 0)) + max_sector = min(dev->sector_io_limit, dev->max_sector); + else if (dev->max_sector != 0) + max_sector = dev->max_sector; + else + max_sector = dev->sector_io_limit; + + + /* + * Check to see whether we're starting off past the end of + * the device. If so, we need to just send an EOF + * notification to the writer. + */ + if (*lba > max_sector) { + *len = 0; + retval = 1; + } else if (((*lba + num_blocks) > max_sector + 1) + || ((*lba + num_blocks) < *lba)) { + /* + * If we get here (but pass the first check), we + * can trim the request length down to go to the + * end of the device. + */ + num_blocks = (max_sector + 1) - *lba; + *len = num_blocks * dev->sector_size; + retval = 1; + } + } + + dev->next_io_pos_bytes += *len; + + return (retval); +} + +/* + * Returns 0 for success, 1 for EOF detected, and -1 for failure. + */ +int +camdd_queue(struct camdd_dev *dev, struct camdd_buf *read_buf) +{ + struct camdd_buf *buf = NULL; + struct camdd_buf_data *data; + struct camdd_dev_pass *pass_dev; + size_t new_len; + struct camdd_buf_data *rb_data; + int is_write = dev->write_dev; + int eof_flush_needed = 0; + int retval = 0; + int error; + + pass_dev = &dev->dev_spec.pass; + + /* + * If we've gotten EOF or our partner has, we should not continue + * queueing I/O. If we're a writer, though, we should continue + * to write any buffers that don't have EOF status. + */ + if ((dev->flags & CAMDD_DEV_FLAG_EOF) + || ((dev->flags & CAMDD_DEV_FLAG_PEER_EOF) + && (is_write == 0))) { + /* + * Tell the worker thread that we have seen EOF. 
+ */ + retval = 1; + + /* + * If we're the writer, send the buffer back with EOF status. + */ + if (is_write) { + read_buf->status = CAMDD_STATUS_EOF; + + error = camdd_complete_peer_buf(dev, read_buf); + } + goto bailout; + } + + if (is_write == 0) { + buf = camdd_get_buf(dev, CAMDD_BUF_DATA); + if (buf == NULL) { + retval = -1; + goto bailout; + } + data = &buf->buf_type_spec.data; + + retval = camdd_get_next_lba_len(dev, &buf->lba, &buf->len); + if (retval != 0) { + buf->status = CAMDD_STATUS_EOF; + + if ((buf->len == 0) + && ((dev->flags & (CAMDD_DEV_FLAG_EOF_SENT | + CAMDD_DEV_FLAG_EOF_QUEUED)) != 0)) { + camdd_release_buf(buf); + goto bailout; + } + dev->flags |= CAMDD_DEV_FLAG_EOF_QUEUED; + } + + data->fill_len = buf->len; + data->src_start_offset = buf->lba * dev->sector_size; + + /* + * Put this on the run queue. + */ + STAILQ_INSERT_TAIL(&dev->run_queue, buf, links); + dev->num_run_queue++; + + /* We're done. */ + goto bailout; + } + + /* + * Check for new EOF status from the reader. + */ + if ((read_buf->status == CAMDD_STATUS_EOF) + || (read_buf->status == CAMDD_STATUS_ERROR)) { + dev->flags |= CAMDD_DEV_FLAG_PEER_EOF; + if ((STAILQ_FIRST(&dev->pending_queue) == NULL) + && (read_buf->len == 0)) { + camdd_complete_peer_buf(dev, read_buf); + retval = 1; + goto bailout; + } else + eof_flush_needed = 1; + } + + /* + * See if we have a buffer we're composing with pieces from our + * partner thread. + */ + buf = STAILQ_FIRST(&dev->pending_queue); + if (buf == NULL) { + uint64_t lba; + ssize_t len; + + retval = camdd_get_next_lba_len(dev, &lba, &len); + if (retval != 0) { + read_buf->status = CAMDD_STATUS_EOF; + + if (len == 0) { + dev->flags |= CAMDD_DEV_FLAG_EOF; + error = camdd_complete_peer_buf(dev, read_buf); + goto bailout; + } + } + + /* + * If we don't have a pending buffer, we need to grab a new + * one from the free list or allocate another one. + */ + buf = camdd_get_buf(dev, CAMDD_BUF_DATA); + if (buf == NULL) { + retval = 1; + goto bailout; + } + + buf->lba = lba; + buf->len = len; + + STAILQ_INSERT_TAIL(&dev->pending_queue, buf, links); + dev->num_pending_queue++; + } + + data = &buf->buf_type_spec.data; + + rb_data = &read_buf->buf_type_spec.data; + + if ((rb_data->src_start_offset != dev->next_peer_pos_bytes) + && (dev->debug != 0)) { + printf("%s: WARNING: reader offset %#jx != expected offset " + "%#jx\n", __func__, (uintmax_t)rb_data->src_start_offset, + (uintmax_t)dev->next_peer_pos_bytes); + } + dev->next_peer_pos_bytes = rb_data->src_start_offset + + (rb_data->fill_len - rb_data->resid); + + new_len = (rb_data->fill_len - rb_data->resid) + data->fill_len; + if (new_len < buf->len) { + /* + * There are three cases here: + * 1. We need more data to fill up a block, so we put + * this I/O on the queue and wait for more I/O. + * 2. We have a pending buffer in the queue that is + * smaller than our blocksize, but we got an EOF. So we + * need to go ahead and flush the write out. + * 3. We got an error. + */ + + /* + * Increment our fill length. + */ + data->fill_len += (rb_data->fill_len - rb_data->resid); + + /* + * Add the new read buffer to the list for writing. + */ + STAILQ_INSERT_TAIL(&buf->src_list, read_buf, src_links); + + /* Increment the count */ + buf->src_count++; + + if (eof_flush_needed == 0) { + /* + * We need to exit, because we don't have enough + * data yet. + */ + goto bailout; + } else { + /* + * Take the buffer off of the pending queue. 
+ */ + STAILQ_REMOVE(&dev->pending_queue, buf, camdd_buf, + links); + dev->num_pending_queue--; + + /* + * If we need an EOF flush, but there is no data + * to flush, go ahead and return this buffer. + */ + if (data->fill_len == 0) { + camdd_complete_buf(dev, buf, /*error_count*/ 0); + retval = 1; + goto bailout; + } + + /* + * Put this on the next queue for execution. + */ + STAILQ_INSERT_TAIL(&dev->run_queue, buf, links); + dev->num_run_queue++; + } + } else if (new_len == buf->len) { + /* + * We have enough data to completely fill one block, + * so we're ready to issue the I/O. + */ + + /* + * Take the buffer off of the pending queue. + */ + STAILQ_REMOVE(&dev->pending_queue, buf, camdd_buf, links); + dev->num_pending_queue--; + + /* + * Add the new read buffer to the list for writing. + */ + STAILQ_INSERT_TAIL(&buf->src_list, read_buf, src_links); + + /* Increment the count */ + buf->src_count++; + + /* + * Increment our fill length. + */ + data->fill_len += (rb_data->fill_len - rb_data->resid); + + /* + * Put this on the next queue for execution. + */ + STAILQ_INSERT_TAIL(&dev->run_queue, buf, links); + dev->num_run_queue++; + } else { + struct camdd_buf *idb; + struct camdd_buf_indirect *indirect; + uint32_t len_to_go, cur_offset; + + idb = camdd_get_buf(dev, CAMDD_BUF_INDIRECT); + if (idb == NULL) { + retval = 1; + goto bailout; + } + indirect = &idb->buf_type_spec.indirect; + indirect->src_buf = read_buf; + read_buf->refcount++; + indirect->offset = 0; + indirect->start_ptr = rb_data->buf; + /* + * We've already established that there is more + * data in read_buf than we have room for in our + * current write request. So this particular chunk + * of the request should just be the remainder + * needed to fill up a block. + */ + indirect->len = buf->len - (data->fill_len - data->resid); + + camdd_buf_add_child(buf, idb); + + /* + * This buffer is ready to execute, so we can take + * it off the pending queue and put it on the run + * queue. + */ + STAILQ_REMOVE(&dev->pending_queue, buf, camdd_buf, + links); + dev->num_pending_queue--; + STAILQ_INSERT_TAIL(&dev->run_queue, buf, links); + dev->num_run_queue++; + + cur_offset = indirect->offset + indirect->len; + + /* + * The resulting I/O would be too large to fit in + * one block. We need to split this I/O into + * multiple pieces. Allocate as many buffers as needed. + */ + for (len_to_go = rb_data->fill_len - rb_data->resid - + indirect->len; len_to_go > 0;) { + struct camdd_buf *new_buf; + struct camdd_buf_data *new_data; + uint64_t lba; + ssize_t len; + + retval = camdd_get_next_lba_len(dev, &lba, &len); + if ((retval != 0) + && (len == 0)) { + /* + * The device has already been marked + * as EOF, and there is no space left.
+ */ + goto bailout; + } + + new_buf = camdd_get_buf(dev, CAMDD_BUF_DATA); + if (new_buf == NULL) { + retval = 1; + goto bailout; + } + + new_buf->lba = lba; + new_buf->len = len; + + idb = camdd_get_buf(dev, CAMDD_BUF_INDIRECT); + if (idb == NULL) { + retval = 1; + goto bailout; + } + + indirect = &idb->buf_type_spec.indirect; + + indirect->src_buf = read_buf; + read_buf->refcount++; + indirect->offset = cur_offset; + indirect->start_ptr = rb_data->buf + cur_offset; + indirect->len = min(len_to_go, new_buf->len); +#if 0 + if (((indirect->len % dev->sector_size) != 0) + || ((indirect->offset % dev->sector_size) != 0)) { + warnx("offset %ju len %ju not aligned with " + "sector size %u", (uintmax_t)indirect->offset, + (uintmax_t)indirect->len, dev->sector_size); + } +#endif + cur_offset += indirect->len; + len_to_go -= indirect->len; + + camdd_buf_add_child(new_buf, idb); + + new_data = &new_buf->buf_type_spec.data; + + if ((new_data->fill_len == new_buf->len) + || (eof_flush_needed != 0)) { + STAILQ_INSERT_TAIL(&dev->run_queue, + new_buf, links); + dev->num_run_queue++; + } else if (new_data->fill_len < buf->len) { + STAILQ_INSERT_TAIL(&dev->pending_queue, + new_buf, links); + dev->num_pending_queue++; + } else { + warnx("%s: too much data in new " + "buffer!", __func__); + retval = 1; + goto bailout; + } + } + } + +bailout: + return (retval); +} + +void +camdd_get_depth(struct camdd_dev *dev, uint32_t *our_depth, + uint32_t *peer_depth, uint32_t *our_bytes, uint32_t *peer_bytes) +{ + *our_depth = dev->cur_active_io + dev->num_run_queue; + if (dev->num_peer_work_queue > + dev->num_peer_done_queue) + *peer_depth = dev->num_peer_work_queue - + dev->num_peer_done_queue; + else + *peer_depth = 0; + *our_bytes = *our_depth * dev->blocksize; + *peer_bytes = dev->peer_bytes_queued; +} + +void +camdd_sig_handler(int sig) +{ + if (sig == SIGINFO) + need_status = 1; + else { + need_exit = 1; + error_exit = 1; + } + + sem_post(&camdd_sem); +} + +void +camdd_print_status(struct camdd_dev *camdd_dev, struct camdd_dev *other_dev, + struct timespec *start_time) +{ + struct timespec done_time; + uint64_t total_ns; + long double mb_sec, total_sec; + int error = 0; + + error = clock_gettime(CLOCK_MONOTONIC_PRECISE, &done_time); + if (error != 0) { + warn("Unable to get done time"); + return; + } + + timespecsub(&done_time, start_time); + + total_ns = done_time.tv_nsec + (done_time.tv_sec * 1000000000); + total_sec = total_ns; + total_sec /= 1000000000; + + fprintf(stderr, "%ju bytes %s %s\n%ju bytes %s %s\n" + "%.4Lf seconds elapsed\n", + (uintmax_t)camdd_dev->bytes_transferred, + (camdd_dev->write_dev == 0) ? "read from" : "written to", + camdd_dev->device_name, + (uintmax_t)other_dev->bytes_transferred, + (other_dev->write_dev == 0) ?
"read from" : "written to", + other_dev->device_name, total_sec); + + mb_sec = min(other_dev->bytes_transferred,camdd_dev->bytes_transferred); + mb_sec /= 1024 * 1024; + mb_sec *= 1000000000; + mb_sec /= total_ns; + fprintf(stderr, "%.2Lf MB/sec\n", mb_sec); +} + +int +camdd_rw(struct camdd_io_opts *io_opts, int num_io_opts, uint64_t max_io, + int retry_count, int timeout) +{ + char *device = NULL; + struct cam_device *new_cam_dev = NULL; + struct camdd_dev *devs[2]; + struct timespec start_time; + pthread_t threads[2]; + int unit = 0; + int error = 0; + int i; + + if (num_io_opts != 2) { + warnx("Must have one input and one output path"); + error = 1; + goto bailout; + } + + bzero(devs, sizeof(devs)); + + for (i = 0; i < num_io_opts; i++) { + switch (io_opts[i].dev_type) { + case CAMDD_DEV_PASS: { + camdd_argmask new_arglist = CAMDD_ARG_NONE; + int bus = 0, target = 0, lun = 0; + char name[30]; + int rv; + + if (isdigit(io_opts[i].dev_name[0])) { + /* device specified as bus:target[:lun] */ + rv = parse_btl(io_opts[i].dev_name, &bus, + &target, &lun, &new_arglist); + if (rv < 2) { + warnx("numeric device specification " + "must be either bus:target, or " + "bus:target:lun"); + error = 1; + goto bailout; + } + /* default to 0 if lun was not specified */ + if ((new_arglist & CAMDD_ARG_LUN) == 0) { + lun = 0; + new_arglist |= CAMDD_ARG_LUN; + } + } else { + if (cam_get_device(io_opts[i].dev_name, name, + sizeof name, &unit) == -1) { + warnx("%s", cam_errbuf); + error = 1; + goto bailout; + } + device = strdup(name); + new_arglist |= CAMDD_ARG_DEVICE |CAMDD_ARG_UNIT; + } + + if (new_arglist & (CAMDD_ARG_BUS | CAMDD_ARG_TARGET)) + new_cam_dev = cam_open_btl(bus, target, lun, + O_RDWR, NULL); + else + new_cam_dev = cam_open_spec_device(device, unit, + O_RDWR, NULL); + if (new_cam_dev == NULL) { + warnx("%s", cam_errbuf); + error = 1; + goto bailout; + } + + devs[i] = camdd_probe_pass(new_cam_dev, + /*io_opts*/ &io_opts[i], + CAMDD_ARG_ERR_RECOVER, + /*probe_retry_count*/ 3, + /*probe_timeout*/ 5000, + /*io_retry_count*/ retry_count, + /*io_timeout*/ timeout); + if (devs[i] == NULL) { + warn("Unable to probe device %s%u", + new_cam_dev->device_name, + new_cam_dev->dev_unit_num); + error = 1; + goto bailout; + } + break; + } + case CAMDD_DEV_FILE: { + int fd = -1; + + if (io_opts[i].dev_name[0] == '-') { + if (io_opts[i].write_dev != 0) + fd = STDOUT_FILENO; + else + fd = STDIN_FILENO; + } else { + if (io_opts[i].write_dev != 0) { + fd = open(io_opts[i].dev_name, + O_RDWR | O_CREAT, S_IWUSR |S_IRUSR); + } else { + fd = open(io_opts[i].dev_name, + O_RDONLY); + } + } + if (fd == -1) { + warn("error opening file %s", + io_opts[i].dev_name); + error = 1; + goto bailout; + } + + devs[i] = camdd_probe_file(fd, &io_opts[i], + retry_count, timeout); + if (devs[i] == NULL) { + error = 1; + goto bailout; + } + + break; + } + default: + warnx("Unknown device type %d (%s)", + io_opts[i].dev_type, io_opts[i].dev_name); + error = 1; + goto bailout; + break; /*NOTREACHED */ + } + + devs[i]->write_dev = io_opts[i].write_dev; + + devs[i]->start_offset_bytes = io_opts[i].offset; + + if (max_io != 0) { + devs[i]->sector_io_limit = + (devs[i]->start_offset_bytes / + devs[i]->sector_size) + + (max_io / devs[i]->sector_size) - 1; + devs[i]->sector_io_limit = + (devs[i]->start_offset_bytes / + devs[i]->sector_size) + + (max_io / devs[i]->sector_size) - 1; + } + + devs[i]->next_io_pos_bytes = devs[i]->start_offset_bytes; + devs[i]->next_completion_pos_bytes =devs[i]->start_offset_bytes; + } + + devs[0]->peer_dev = 
devs[1]; + devs[1]->peer_dev = devs[0]; + devs[0]->next_peer_pos_bytes = devs[0]->peer_dev->next_io_pos_bytes; + devs[1]->next_peer_pos_bytes = devs[1]->peer_dev->next_io_pos_bytes; + + sem_init(&camdd_sem, /*pshared*/ 0, 0); + + signal(SIGINFO, camdd_sig_handler); + signal(SIGINT, camdd_sig_handler); + + error = clock_gettime(CLOCK_MONOTONIC_PRECISE, &start_time); + if (error != 0) { + warn("Unable to get start time"); + goto bailout; + } + + for (i = 0; i < num_io_opts; i++) { + error = pthread_create(&threads[i], NULL, camdd_worker, + (void *)devs[i]); + if (error != 0) { + warnc(error, "pthread_create() failed"); + goto bailout; + } + } + + for (;;) { + if ((sem_wait(&camdd_sem) == -1) + || (need_exit != 0)) { + struct kevent ke; + + for (i = 0; i < num_io_opts; i++) { + EV_SET(&ke, (uintptr_t)&devs[i]->work_queue, + EVFILT_USER, 0, NOTE_TRIGGER, 0, NULL); + + devs[i]->flags |= CAMDD_DEV_FLAG_EOF; + + error = kevent(devs[i]->kq, &ke, 1, NULL, 0, + NULL); + if (error == -1) + warn("%s: unable to wake up thread", + __func__); + error = 0; + } + break; + } else if (need_status != 0) { + camdd_print_status(devs[0], devs[1], &start_time); + need_status = 0; + } + } + for (i = 0; i < num_io_opts; i++) { + pthread_join(threads[i], NULL); + } + + camdd_print_status(devs[0], devs[1], &start_time); + +bailout: + + for (i = 0; i < num_io_opts; i++) + camdd_free_dev(devs[i]); + + return (error + error_exit); +} + +void +usage(void) +{ + fprintf(stderr, +"usage: camdd <-i|-o pass=pass0,bs=1M,offset=1M,depth=4>\n" +" <-i|-o file=/tmp/file,bs=512K,offset=1M>\n" +" <-i|-o file=/dev/da0,bs=512K,offset=1M>\n" +" <-i|-o file=/dev/nsa0,bs=512K>\n" +" [-C retry_count][-E][-m max_io_amt][-t timeout_secs][-v][-h]\n" +"Option description\n" +"-i Specify input device/file and parameters\n" +"-o Specify output device/file and parameters\n" +"Input and Output parameters\n" +"pass=name Specify a pass(4) device like pass0 or /dev/pass0\n" +"file=name Specify a file or device, /tmp/foo, /dev/da0, /dev/null\n" +" or - for stdin/stdout\n" +"bs=blocksize Specify blocksize in bytes, or using K, M, G, etc. suffix\n" +"offset=len Specify starting offset in bytes or using K, M, G suffix\n" +" NOTE: offset cannot be specified on tapes, pipes, stdin/out\n" +"depth=N Specify a numeric queue depth. This only applies to pass(4)\n" +"mcs=N Specify a minimum cmd size for pass(4) read/write commands\n" +"Optional arguments\n" +"-C retry_cnt Specify a retry count for pass(4) devices\n" +"-E Enable CAM error recovery for pass(4) devices\n" +"-m max_io Specify the maximum amount to be transferred in bytes or\n" +" using K, G, M, etc. suffixes\n" +"-t timeout Specify the I/O timeout to use with pass(4) devices\n" +"-v Enable verbose error recovery\n" +"-h Print this message\n"); +} + + +int +camdd_parse_io_opts(char *args, int is_write, struct camdd_io_opts *io_opts) +{ + char *tmpstr, *tmpstr2; + char *orig_tmpstr = NULL; + int retval = 0; + + io_opts->write_dev = is_write; + + tmpstr = strdup(args); + if (tmpstr == NULL) { + warn("strdup failed"); + retval = 1; + goto bailout; + } + orig_tmpstr = tmpstr; + while ((tmpstr2 = strsep(&tmpstr, ",")) != NULL) { + char *name, *value; + + /* + * If the user creates an empty parameter by putting in two + * commas, skip over it and look for the next field. 
+ */ + if (*tmpstr2 == '\0') + continue; + + name = strsep(&tmpstr2, "="); + if (*name == '\0') { + warnx("Got empty I/O parameter name"); + retval = 1; + goto bailout; + } + value = strsep(&tmpstr2, "="); + if ((value == NULL) + || (*value == '\0')) { + warnx("Empty I/O parameter value for %s", name); + retval = 1; + goto bailout; + } + if (strncasecmp(name, "file", 4) == 0) { + io_opts->dev_type = CAMDD_DEV_FILE; + io_opts->dev_name = strdup(value); + if (io_opts->dev_name == NULL) { + warn("Error allocating memory"); + retval = 1; + goto bailout; + } + } else if (strncasecmp(name, "pass", 4) == 0) { + io_opts->dev_type = CAMDD_DEV_PASS; + io_opts->dev_name = strdup(value); + if (io_opts->dev_name == NULL) { + warn("Error allocating memory"); + retval = 1; + goto bailout; + } + } else if ((strncasecmp(name, "bs", 2) == 0) + || (strncasecmp(name, "blocksize", 9) == 0)) { + retval = expand_number(value, &io_opts->blocksize); + if (retval == -1) { + warn("expand_number(3) failed on %s=%s", name, + value); + retval = 1; + goto bailout; + } + } else if (strncasecmp(name, "depth", 5) == 0) { + char *endptr; + + io_opts->queue_depth = strtoull(value, &endptr, 0); + if (*endptr != '\0') { + warnx("invalid queue depth %s", value); + retval = 1; + goto bailout; + } + } else if (strncasecmp(name, "mcs", 3) == 0) { + char *endptr; + + io_opts->min_cmd_size = strtol(value, &endptr, 0); + if ((*endptr != '\0') + || ((io_opts->min_cmd_size > 16) + || (io_opts->min_cmd_size < 0))) { + warnx("invalid minimum cmd size %s", value); + retval = 1; + goto bailout; + } + } else if (strncasecmp(name, "offset", 6) == 0) { + retval = expand_number(value, &io_opts->offset); + if (retval == -1) { + warn("expand_number(3) failed on %s=%s", name, + value); + retval = 1; + goto bailout; + } + } else if (strncasecmp(name, "debug", 5) == 0) { + char *endptr; + + io_opts->debug = strtoull(value, &endptr, 0); + if (*endptr != '\0') { + warnx("invalid debug level %s", value); + retval = 1; + goto bailout; + } + } else { + warnx("Unrecognized parameter %s=%s", name, value); + } + } +bailout: + free(orig_tmpstr); + + return (retval); +} + +int +main(int argc, char **argv) +{ + int c; + camdd_argmask arglist = CAMDD_ARG_NONE; + int timeout = 0, retry_count = 1; + int error = 0; + uint64_t max_io = 0; + struct camdd_io_opts *opt_list = NULL; + + if (argc == 1) { + usage(); + exit(1); + } + + opt_list = calloc(2, sizeof(struct camdd_io_opts)); + if (opt_list == NULL) { + warn("Unable to allocate option list"); + error = 1; + goto bailout; + } + + while ((c = getopt(argc, argv, "C:Ehi:m:o:t:v")) != -1) { + switch (c) { + case 'C': + retry_count = strtol(optarg, NULL, 0); + if (retry_count < 0) + errx(1, "retry count %d is < 0", + retry_count); + arglist |= CAMDD_ARG_RETRIES; + break; + case 'E': + arglist |= CAMDD_ARG_ERR_RECOVER; + break; + case 'i': + case 'o': + if (((c == 'i') + && (opt_list[0].dev_type != CAMDD_DEV_NONE)) + || ((c == 'o') + && (opt_list[1].dev_type != CAMDD_DEV_NONE))) { + errx(1, "Only one input and output path " + "allowed"); + } + error = camdd_parse_io_opts(optarg, (c == 'o') ? 1 : 0, + (c == 'o') ?
&opt_list[1] : &opt_list[0]); + if (error != 0) + goto bailout; + break; + case 'm': + error = expand_number(optarg, &max_io); + if (error == -1) { + warn("invalid maximum I/O amount %s", optarg); + error = 1; + goto bailout; + } + break; + case 't': + timeout = strtol(optarg, NULL, 0); + if (timeout < 0) + errx(1, "invalid timeout %d", timeout); + /* Convert the timeout from seconds to ms */ + timeout *= 1000; + arglist |= CAMDD_ARG_TIMEOUT; + break; + case 'v': + arglist |= CAMDD_ARG_VERBOSE; + break; + case 'h': + default: + usage(); + exit(1); + break; /* NOTREACHED */ + } + } + + if ((opt_list[0].dev_type == CAMDD_DEV_NONE) + || (opt_list[1].dev_type == CAMDD_DEV_NONE)) + errx(1, "Must specify both -i and -o"); + + /* + * Set the timeout if the user hasn't specified one. + */ + if (timeout == 0) + timeout = CAMDD_PASS_RW_TIMEOUT; + + error = camdd_rw(opt_list, 2, max_io, retry_count, timeout); + +bailout: + free(opt_list); + + exit(error); +} Property changes on: stable/10/usr.sbin/camdd/camdd.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: stable/10/usr.sbin/camdd/Makefile =================================================================== --- stable/10/usr.sbin/camdd/Makefile (nonexistent) +++ stable/10/usr.sbin/camdd/Makefile (revision 292348) @@ -0,0 +1,11 @@ +# $FreeBSD$ + +PROG= camdd +SRCS= camdd.c +SDIR= ${.CURDIR}/../../sys +DPADD= ${LIBCAM} ${LIBMT} ${LIBSBUF} ${LIBBSDXML} ${LIBUTIL} ${LIBTHR} +LDADD= -lcam -lmt -lsbuf -lbsdxml -lutil -lthr +NO_WTHREAD_SAFETY= 1 +MAN= camdd.8 + +.include <bsd.prog.mk> Property changes on: stable/10/usr.sbin/camdd/Makefile ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: stable/10/usr.sbin/camdd/camdd.8 =================================================================== --- stable/10/usr.sbin/camdd/camdd.8 (nonexistent) +++ stable/10/usr.sbin/camdd/camdd.8 (revision 292348) @@ -0,0 +1,283 @@ +.\" +.\" Copyright (c) 2015 Spectra Logic Corporation +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions, and the following disclaimer, +.\" without modification. +.\" 2. Redistributions in binary form must reproduce at minimum a disclaimer +.\" substantially similar to the "NO WARRANTY" disclaimer below +.\" ("Disclaimer") and any redistribution must be conditioned upon +.\" including a substantially similar Disclaimer requirement for further +.\" binary redistribution. +.\" +.\" NO WARRANTY +.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +.\" "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR +.\" A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT +.\" HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, +.\" STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING +.\" IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +.\" POSSIBILITY OF SUCH DAMAGES. +.\" +.\" Authors: Ken Merry (Spectra Logic Corporation) +.\" +.\" $FreeBSD$ +.\" +.Dd November 11, 2015 +.Dt CAMDD 8 +.Os +.Sh NAME +.Nm camdd +.Nd CAM data transfer utility +.Sh SYNOPSIS +.Nm +.Aq Fl i|o Ar pass=pass_dev|file=filename,bs=blocksize,[...] +.Op Fl C Ar retry_count +.Op Fl E +.Op Fl m Ar max_io +.Op Fl t Ar timeout +.Op Fl v +.Op Fl h +.Sh DESCRIPTION +The +.Nm +utility is a sequential data transfer program that offers standard +.Xr read 2 +and +.Xr write 2 +operations, in addition to a mode that uses the asynchronous +.Xr pass 4 +API. +The asynchronous +.Xr pass 4 +API allows multiple requests to be queued to a device simultaneously. +.Pp +.Nm +collects performance information and will display it when the transfer +completes, when +.Nm +is terminated, or when it receives a SIGINFO signal. +.Pp +The following options are available: +.Bl -tag -width 12n +.It Fl i | o Ar args +Specify the input and output device or file. +Both +.Fl i +and +.Fl o +must be specified. +There are a number of parameters that can be specified. +One of the first two (file or pass) MUST be specified to indicate which I/O +method to use on the device in question. +.Bl -tag -width 9n +.It pass=dev +Specify a +.Xr pass 4 +device to operate on. +This requests that the device in question be accessed via the asynchronous +.Xr pass 4 +interface. +.Pp +The device name can be a +.Xr pass 4 +name and unit number, for instance +.Dq pass0 , +or a regular peripheral driver name and unit number, for instance +.Dq da5 . +It can also be the path of a +.Xr pass 4 +or other disk device, like +.Dq /dev/da5 . +It may also be a bus:target:lun, for example: +.Dq 0:5:0 . +.Pp +Only +.Xr pass 4 +devices for +.Tn SCSI +disk-like devices are supported. +.Tn ATA +devices are not currently supported, but support could be added later. +Specifically, +.Tn SCSI +Direct Access (type 0), WORM (type 4), CDROM (type 5), and RBC (Reduced +Block Command, type 14) devices are supported. +Tape drives, medium changers, enclosures, etc. are not supported. +.It file=path +Specify a file or device to operate on. +This requests that the file or device in question be accessed using the +standard +.Xr read 2 +and +.Xr write 2 +system calls. +The file interface does not support queueing multiple commands at a time. +It does support probing disk sector size and capacity information, and tape +blocksize and maximum transfer size information. +The file interface supports standard files, disks, tape drives, special +devices, pipes and standard input and output. +If the file is specified as a +.Dq - , +standard input or standard output is used. +For tape devices, the specified blocksize will be the size that +.Nm +attempts to use to write to or read from the tape. +When writing to a tape device, the blocksize is treated like a disk sector +size. +This means that +.Nm +will not write anything smaller than the sector size.
+At the end of a transfer, if there isn't sufficient data from the reader +to yield a full block, +.Nm +will add zeros on the end of the data from the reader to make up a full +block. +.It bs=N +Specify the blocksize to use for transfers. +.Nm +will attempt to read or write using the requested blocksize. +.Pp +Note that the blocksize given only applies to either the input or the +output path. +To use the same blocksize for the input and output transfers, you must +specify that blocksize with both the +.Fl i +and +.Fl o +arguments. +.Pp +The blocksize may be specified in bytes, or using any suffix (e.g. k, M, G) +supported by +.Xr expand_number 3 . +.It offset=N +Specify the starting offset for the input or output device or file. +The offset may be specified in bytes, or by using any suffix (e.g. k, M, G) +supported by +.Xr expand_number 3 . +.It depth=N +Specify a desired queue depth for the input or output path. +.Nm +will attempt to keep the requested number of requests of the specified +blocksize queued to the input or output device. +Queue depths greater than 1 are only supported for the asynchronous +.Xr pass 4 +output method. +The queue depth is maintained on a best effort basis, and may not be +possible to maintain for especially fast devices. +For writes, maintaining the queue depth also depends on a sufficiently +fast reading device. +.It mcs=N +Specify the minimum command size to use for +.Xr pass 4 +devices. +Some devices do not support 6 byte +.Tn SCSI +commands. +The +.Xr da 4 +device handles this restriction automatically, but the +.Xr pass 4 +device allows the user to specify the +.Tn SCSI +command used. +If a device does not accept 6 byte +.Tn SCSI +READ/WRITE commands (which is the default at lower LBAs), it will generally +accept 10 byte +.Tn SCSI +commands instead. +.It debug=N +Specify the debug level for this device. +There is currently only one debug level setting, so setting this to any +non-zero value will turn on debugging. +The debug facility may be expanded in the future. +.El +.It Fl C Ar count +Specify the retry count for commands sent via the asynchronous +.Xr pass 4 +interface. +This does not apply to commands sent via the file interface. +.It Fl E +Enable kernel error recovery for the +.Xr pass 4 +driver. +If error recovery is not enabled, unit attention conditions and other +transient failures may cause the transfer to fail. +.It Fl m Ar size +Specify the maximum amount of data to be transferred. +This may be specified in bytes, or by using any suffix (e.g. K, M, G) +supported by +.Xr expand_number 3 . +.It Fl t Ar timeout +Specify the command timeout in seconds to use for commands sent via the +.Xr pass 4 +driver. +.It Fl v +Enable verbose reporting of errors. +This is recommended to aid in debugging any +.Tn SCSI +issues that come up. +.It Fl h +Display the +.Nm +usage message. +.El +.Pp +If +.Nm +receives a SIGINFO signal, it will print the current input and output byte +counts, elapsed runtime and average throughput. +If +.Nm +receives a SIGINT signal, it will print the current input and output byte +counts, elapsed runtime and average throughput and then exit. +.Sh EXAMPLES +.Dl camdd -i pass=da8,bs=512k,depth=4 -o pass=da3,bs=512k,depth=4 +.Pp +Copy all data from da8 to da3 using a blocksize of 512k for both drives, +and attempt to maintain a queue depth of 4 on both the input and output +devices. +The transfer will stop when the end of either device is reached. 
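+.Pp +As a further illustration (the device name here is an example only), the error recovery, retry count, and transfer limit options described above can be combined with either I/O method: +.Pp +.Dl camdd -i pass=da8,bs=512k,depth=4 -o file=/dev/null,bs=512k -E -C 2 -m 10M +.Pp +Copy the first 10MB of disk da8 to /dev/null with kernel error recovery +enabled and up to 2 retries per command.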
+.Pp +.Dl camdd -i file=/dev/zero,bs=1M -o pass=da5,bs=1M,depth=4 -m 100M +.Pp +Read 1MB blocks of zeros from /dev/zero, and write them to da5 with a +desired queue depth of 4. +Stop the transfer after 100MB has been written. +.Pp +.Dl camdd -i pass=da8,bs=1M,depth=3 -o file=disk.img +.Pp +Copy disk da8 using a 1MB blocksize and desired queue depth of 3 to the +file disk.img. +.Pp +.Dl camdd -i file=/etc/rc -o file=- +.Pp +Read the file /etc/rc and write it to standard output. +.Pp +.Dl camdd -i pass=da10,bs=64k,depth=16 -o file=/dev/nsa0,bs=128k +.Pp +Copy 64k blocks from the disk da10 with a queue depth of 16, and write +to the tape drive sa0 with a 128k blocksize. +The copy will stop when either the end of the disk or tape is reached. +.Sh SEE ALSO +.Xr cam 3 , +.Xr cam 4 , +.Xr pass 4 , +.Xr camcontrol 8 +.Sh HISTORY +.Nm +first appeared in +.Fx 10.2 . +.Sh AUTHORS +.An Kenneth Merry Aq Mt ken@FreeBSD.org Property changes on: stable/10/usr.sbin/camdd/camdd.8 ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: stable/10 =================================================================== --- stable/10 (revision 292347) +++ stable/10 (revision 292348) Property changes on: stable/10 ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head:r291716,291724,291741-291742
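
For reference, the asynchronous pass(4) pattern that camdd_pass_run() exercises above (CAMIOQUEUE to submit a CCB, select/poll/kevent to wait, CAMIOGET to reap the completion) can be reduced to a minimal standalone sketch. This is an illustration only, not part of the change being merged: the /dev/pass0 path is a placeholder, error handling is abbreviated, and a simple TEST UNIT READY stands in for the READ/WRITE CCBs that camdd builds with scsi_read_write().

#include <sys/ioctl.h>
#include <sys/select.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <camlib.h>
#include <cam/scsi/scsi_message.h>
#include <cam/scsi/scsi_pass.h>

int
main(void)
{
	struct cam_device *dev;
	union ccb *ccb;
	fd_set rfds;

	/* Open the passthrough device and allocate a CCB for it. */
	if ((dev = cam_open_device("/dev/pass0", O_RDWR)) == NULL)
		errx(1, "%s", cam_errbuf);
	if ((ccb = cam_getccb(dev)) == NULL)
		errx(1, "cam_getccb failed");

	/* Build a TEST UNIT READY CCB and disable queue freezing. */
	scsi_test_unit_ready(&ccb->csio,
			     /*retries*/ 1,
			     /*cbfcnp*/ NULL,
			     /*tag_action*/ MSG_SIMPLE_Q_TAG,
			     /*sense_len*/ SSD_FULL_SIZE,
			     /*timeout*/ 5000);
	ccb->ccb_h.flags |= CAM_DEV_QFRZDIS;

	/* Submit the CCB; it completes asynchronously. */
	if (ioctl(dev->fd, CAMIOQUEUE, ccb) == -1)
		err(1, "CAMIOQUEUE");

	/* Wait until the pass(4) device reports a completed CCB. */
	FD_ZERO(&rfds);
	FD_SET(dev->fd, &rfds);
	if (select(dev->fd + 1, &rfds, NULL, NULL, NULL) == -1)
		err(1, "select");

	/* Reap the completed CCB and examine its status. */
	if (ioctl(dev->fd, CAMIOGET, ccb) == -1)
		err(1, "CAMIOGET");
	printf("CCB status %#x\n", ccb->ccb_h.status & CAM_STATUS_MASK);

	cam_freeccb(ccb);
	cam_close_device(dev);
	return (0);
}

Build with cc -o tur_async tur_async.c -lcam. camdd extends this same pattern by keeping several CCBs outstanding at once, using kevent(2) for notification, and matching completions back to its buffers via the ccb_buf pointer stored in the CCB header.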