
ufshci: Introduce the ufshci(4) driver
ClosedPublic

Authored by j_yoon.choi_samsung.com on May 16 2025, 12:36 AM.
Tags
None

Details

Summary

This commit adds a storage driver that supports the Universal Flash
Storage Host Controller Interface (UFSHCI) on FreeBSD.

Universal Flash Storage (UFS) is a flash-based mobile storage device
that replaces eMMC, aiming for high performance with low power. The UFS
Host Controller Interface (UFSHCI) is the host side controller and
connects UFS device to a system bus, such as PCIe.

The code targets the latest standards:

The ufshci(4) driver implements controller and device initialization,
interrupt handling, and single-doorbell (SDB) queue-based I/O requests.
Support for Multi-Circular Queue (MCQ) I/O requests is planned for a
later commit.

Implemented features:

  • PCIe bus support
  • legacy (INTx) interrupt handling
  • UIC command support
  • UTP Transfer Request (UTR) support
  • UTP Task Management Request (UTMR) support
  • single doorbell queue (SDB) with multiple queue depth
  • SCSI command set support
  • sysctl

Work in progress:

  • Multi-Circular Queue (per-CPU I/O queues)
  • MSI-X interrupt support
  • WriteBooster
  • write protect
  • Host Performance Booster (HPB)
  • interrupt aggregation
  • ARM based system bus support
  • ufs-utils port

Tests were performed on QEMU, which provides an emulated UFS device,
and on Intel-based laptops.

How to test on QEMU:

  1. Run QEMU:
       $ qemu-system-x86_64 ... -device ufs -drive file=blk1g.bin,format=raw,if=none,id=luimg -device ufs-lu,drive=luimg,lun=0
  2. Load and unload the ufshci module on QEMU:
       $ kldload /usr/obj/usr/src/amd64.amd64/sys/modules/ufshci/ufshci.ko
       $ kldunload ufshci

Testing on real hardware:

  • Samsung Galaxy Book S (Intel Lakefield) with UFS 3.0
  • Lenovo Duet 3 11IAN8 (Intel N100) with UFS 2.1

Sponsored by: Samsung Electronics

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

sys/cam/cam_ccb.h
1072

I modified it as shown below.
Items that follow CTS_UFSHCI_VALID_LINK are valid as a unit.

	/* 
	 * Ensure the validity of the information for the Unipro link
	 * (GEAR, SPEED, LANE)
	 */
#define CTS_UFSHCI_VALID_LINK	0x01
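
A minimal sketch of how a consumer might honor this flag, assuming the link fields are only meaningful when the bit is set. Only `CTS_UFSHCI_VALID_LINK` and its value come from the comment above; the struct `link_settings`, its field names, and the helper `link_settings_get()` are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTS_UFSHCI_VALID_LINK	0x01	/* value from the review comment */

/* Hypothetical layout: field names are illustrative, not from the patch. */
struct link_settings {
	uint32_t valid;		/* bitmask of validity flags */
	uint32_t gear;
	uint32_t speed;
	uint32_t lane;
};

/*
 * Copy the link fields out only when the producer marked them valid as a
 * unit. Returns false and leaves *out untouched otherwise.
 */
static bool
link_settings_get(const struct link_settings *in, struct link_settings *out)
{
	assert(in != NULL && out != NULL);
	if ((in->valid & CTS_UFSHCI_VALID_LINK) == 0)
		return (false);
	*out = *in;
	return (true);
}
```

Treating GEAR, SPEED, and LANE as one validity unit means a reader never has to reason about a partially filled link description.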
sys/dev/ufshci/ufshci.h
6

Would it be okay if I update it like this?

/*-
 * Copyright (c) 2025, Samsung Electronics Co., Ltd.
 * Written by Jaeyoon Choi
 *
 * SPDX-License-Identifier: BSD-2-Clause
 */
sys/dev/ufshci/ufshci_pci.c
196

I applied the goto intx change. ctrlr->msi_count is initialized to 0 when the softc is allocated.

sys/dev/ufshci/ufshci_private.h
59

Removed!

63

Removed. Thanks!

67

Removed!

118

This is a variable for Multi-Circular Queue, which is supported since UFS 4.0.
We don't need it now, so I'll remove it.

j_yoon.choi_samsung.com marked 21 inline comments as done.

Apply review comments

sys/dev/ufshci/ufshci_ctrlr.c
50–51

The main reason for asking is that it looks simple to move, and then the generic file would not need any PCI-specific headers.

The latest version of the UFS spec must be purchased, but version 2.1 can be downloaded for free. Except for MCQ, 4.1 and 2.1 are almost identical.
Please refer to it in your review.

Move the quirk check routine to PCI attach.

j_yoon.choi_samsung.com added inline comments.
sys/dev/ufshci/ufshci_ctrlr.c
50–51

I moved the quirk check routine to PCI attach and added the quirks bit field to softc.
Thanks!

Fix a bug in interrupt enable register, Add Lenovo Duet3 11ian device id

A number of bugs were found during testing on the Lenovo Duet 3:

  • Change pause_sbt() to DELAY() during UIC command transaction
  • Add register dump feature
  • Fix UIC power mode bugs
  • Fix big endian bug

Check the completion notification register when processing request completion

The UFSHCI driver now works on the Samsung Galaxy Book S and Lenovo Duet3.

The following fixes have been added

  • revert the completion notification check
  • fix task_tag bug

This patch is now ready for full review.
Thank you for your patience.

The following changes have been made:

  • Fixed several bugs
  • Applied clang-format to ensure consistent code style
  • Added a device quirk interface
  • Verified filesystem creation and read/write integrity on both QEMU and real hardware

The following features are not yet implemented and will be submitted as separate patches:

  • Multi-Circular Queue (per-CPU I/O queues)
  • MSI-X interrupt support
  • flushing in-flight I/Os
  • command timeout and retry
  • UFS power-management handling (suspend / resume)
  • task management request
  • WriteBooster
  • write protect
  • Host Performance Booster (HPB)
  • interrupt aggregation
  • ACPI/FDT support
  • man page
  • big endian support

I measured the key performance metrics on the real device and confirmed that the results meet my goals.
For writes, bandwidth should increase once WriteBooster is applied.
Here are the results I measured on each device.

<Samsung Galaxy Book S (UFS3.1 512GB)>

Workload                 QD   Write         Read
Sequential 1MB            1   407 MiB/s     749 MiB/s
                          2   407 MiB/s     1348 MiB/s
                          4   408 MiB/s     1350 MiB/s
                          8   407 MiB/s     1344 MiB/s
                         16   408 MiB/s     1324 MiB/s
                         32   409 MiB/s     1252 MiB/s
Random 4KB (1GB Range)    1   11.5 KIOPS    5468 IOPS
                          2   22.6 KIOPS    11.5 KIOPS
                          4   44.4 KIOPS    25.0 KIOPS
                          8   65.1 KIOPS    72.1 KIOPS
                         16   68.8 KIOPS    72.1 KIOPS
                         32   69.7 KIOPS    94.6 KIOPS

<Lenovo Duet3 (UFS2.0 128GB)>

Workload                 QD   Write         Read
Sequential 1MB            1   294 MiB/s     866 MiB/s
                          2   291 MiB/s     873 MiB/s
                          4   291 MiB/s     874 MiB/s
                          8   291 MiB/s     873 MiB/s
                         16   292 MiB/s     873 MiB/s
                         32   294 MiB/s     872 MiB/s
Random 4KB (1GB Range)    1   15.8 KIOPS    5716 IOPS
                          2   42.5 KIOPS    11.7 KIOPS
                          4   60.5 KIOPS    31.2 KIOPS
                          8   60.9 KIOPS    60.2 KIOPS
                         16   67.9 KIOPS    92.6 KIOPS
                         32   60.4 KIOPS    123 KIOPS
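
To compare the random 4KB figures with the sequential ones, an I/O rate can be converted to throughput. The sketch below assumes 1 KIOPS = 1000 IOPS and a 4096-byte block; the helper name `iops_to_mib_per_s()` is hypothetical. Under those assumptions, the Samsung random-write result at QD 32 (69.7 KIOPS) works out to roughly 272 MiB/s, compared with 409 MiB/s for sequential writes.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_MIB	(1024.0 * 1024.0)

/*
 * Convert an I/O rate to throughput in MiB/s.
 * iops is in operations per second (assumes 1 KIOPS = 1000 IOPS);
 * block_bytes is the request size in bytes.
 */
static double
iops_to_mib_per_s(double iops, uint32_t block_bytes)
{
	assert(iops >= 0.0 && block_bytes > 0);
	return (iops * (double)block_bytes / BYTES_PER_MIB);
}
```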

For the experiment, I used the following commands with fio.

# SEQ WRITE
fio --name=seq1m_qd1_write  --filename=/dev/da1 --rw=write  --bs=1M --iodepth=1 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd2_write  --filename=/dev/da1 --rw=write  --bs=1M --iodepth=2 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd4_write  --filename=/dev/da1 --rw=write  --bs=1M --iodepth=4 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd8_write  --filename=/dev/da1 --rw=write  --bs=1M --iodepth=8 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd16_write  --filename=/dev/da1 --rw=write  --bs=1M --iodepth=16 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd32_write  --filename=/dev/da1 --rw=write  --bs=1M --iodepth=32 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  

# SEQ READ
fio --name=seq1m_qd1_read  --filename=/dev/da1 --rw=read  --bs=1M --iodepth=1 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd2_read  --filename=/dev/da1 --rw=read  --bs=1M --iodepth=2 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd4_read  --filename=/dev/da1 --rw=read  --bs=1M --iodepth=4 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd8_read  --filename=/dev/da1 --rw=read  --bs=1M --iodepth=8 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd16_read  --filename=/dev/da1 --rw=read  --bs=1M --iodepth=16 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  
fio --name=seq1m_qd32_read  --filename=/dev/da1 --rw=read  --bs=1M --iodepth=32 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G  

# RAND WRITE (1GB Range)
fio --name=rand4k_qd1_write --filename=/dev/da1 --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd2_write --filename=/dev/da1 --rw=randwrite --bs=4k --iodepth=2 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd4_write --filename=/dev/da1 --rw=randwrite --bs=4k --iodepth=4 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd8_write --filename=/dev/da1 --rw=randwrite --bs=4k --iodepth=8 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd16_write --filename=/dev/da1 --rw=randwrite --bs=4k --iodepth=16 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd32_write --filename=/dev/da1 --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     

# RAND READ (1GB Range)
fio --name=rand4k_qd1_read --filename=/dev/da1 --rw=randread --bs=4k --iodepth=1 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd2_read --filename=/dev/da1 --rw=randread --bs=4k --iodepth=2 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd4_read --filename=/dev/da1 --rw=randread --bs=4k --iodepth=4 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd8_read --filename=/dev/da1 --rw=randread --bs=4k --iodepth=8 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd16_read --filename=/dev/da1 --rw=randread --bs=4k --iodepth=16 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G     
fio --name=rand4k_qd32_read --filename=/dev/da1 --rw=randread --bs=4k --iodepth=32 --numjobs=1 --time_based=1 --runtime=60 --direct=1 --ioengine=posixaio --size=1G

This looks great! I only had one comment, on the busy-wait loops. And even if that's not fixed, I think we're ready.

sys/dev/ufshci/ufshci.h
6

This is perfect. Thank you.

82

Yes. My comment was about consistency, not the need to follow something absolutely. clang-format will produce consistent results.

200–207

That's fine. Somewhere this restriction should be noted.

sys/dev/ufshci/ufshci_ctrlr.c
130

Thanks!

504

Need to fix this...

sys/dev/ufshci/ufshci_uic_cmd.c
40

Hmmm... this can be quite hard on the CPU. What's the typical timeline here, and what are the typical timeouts? I see they are in ms. Also, what's the point of backing off on the delay? Wouldn't that just lengthen startup without any benefit? Or is there something in the spec that requires this?

Finally, is there not an interrupt that could be used so we could convert this to a pause() instead of a DELAY()? The same questions and feedback apply to the similar functions below.

sys/modules/ufshci/Makefile
18

Looks like cam.h includes this...

This revision is now accepted and ready to land. (Wed, Jun 11, 1:21 PM)
This revision now requires review to proceed. (Wed, Jun 11, 11:47 PM)
j_yoon.choi_samsung.com marked 6 inline comments as done.

Apply review comment, Remove unnecessary functions

j_yoon.choi_samsung.com marked an inline comment as done. (Edited Thu, Jun 12, 2:59 AM)

Thank you for the kind review.
Your advice was a great help in preparing the patch. :)

sys/dev/ufshci/ufshci.h
82

Thank you, I'll use clang-format.

200–207

I added the following comment to the top of ufshci.h.

/*
 * Note: This driver currently assumes a little-endian architecture.
 * Big-endian support is not yet implemented.
 */
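
One common way to lift a restriction like this is to decode device-defined little-endian fields explicitly instead of reading them in host byte order. In the FreeBSD kernel, byteorder(9) provides le32toh() and htole32() for that purpose; the portable userland sketch below only illustrates the idea and is not the driver's code. The helper names `le32_decode()` and `le32_encode()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value from bytes, on any host endianness. */
static uint32_t
le32_decode(const uint8_t *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Encode a 32-bit host value as little-endian bytes. */
static void
le32_encode(uint8_t *p, uint32_t v)
{
	assert(p != NULL);
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}
```

Because the helpers work byte by byte, they behave identically on little-endian and big-endian hosts.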
sys/dev/ufshci/ufshci_uic_cmd.c
40

The timeout here is 500ms, and this logic only runs during driver initialization.
Backoff didn't seem necessary, so I've replaced it with a simple busy-wait using DELAY(10).

I tried using pause_sbt(), but it caused issues due to sleeping while holding a lock, so I reverted to DELAY().
UIC completion can be detected via interrupt, so eventually this should be converted to an interrupt-based pause. I will prepare this as a separate patch.
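
The fixed-interval busy-wait described above can be sketched in userland as follows. This is an illustration under stated assumptions, not the driver's code: the 10 µs interval and 500 ms timeout come from the comment, while `poll_reg_until()`, the callback types, and the simulated `fake_dev` are hypothetical stand-ins for a register read and DELAY().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POLL_INTERVAL_US	10		/* mirrors DELAY(10) */
#define POLL_TIMEOUT_US		(500 * 1000)	/* 500 ms */

/* Hypothetical callbacks standing in for a register read and DELAY(). */
typedef uint32_t (*read_reg_fn)(void *ctx);
typedef void (*delay_us_fn)(void *ctx, uint32_t usec);

/*
 * Poll until (reg & mask) == expected or timeout_us has elapsed.
 * Returns true on success and false on timeout. No backoff is applied.
 */
static bool
poll_reg_until(read_reg_fn read_reg, delay_us_fn delay_us, void *ctx,
    uint32_t mask, uint32_t expected, uint32_t timeout_us)
{
	uint32_t waited;

	assert(read_reg != NULL && delay_us != NULL);
	for (waited = 0; ; waited += POLL_INTERVAL_US) {
		if ((read_reg(ctx) & mask) == expected)
			return (true);
		if (waited >= timeout_us)
			return (false);
		delay_us(ctx, POLL_INTERVAL_US);
	}
}

/* Simulated device: the ready bit reads as set once now_us >= ready_at_us. */
struct fake_dev {
	uint32_t now_us;
	uint32_t ready_at_us;
};

static uint32_t
fake_read(void *ctx)
{
	struct fake_dev *d = ctx;

	return (d->now_us >= d->ready_at_us ? 0x1 : 0x0);
}

static void
fake_delay(void *ctx, uint32_t usec)
{
	struct fake_dev *d = ctx;

	d->now_us += usec;	/* advance simulated time instead of spinning */
}
```

A real busy-wait burns CPU for the whole interval, which is why the thread treats it as acceptable only for initialization and plans an interrupt-based wait later.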

sys/modules/ufshci/Makefile
18

Removed. Thanks!

Thanks for the quick updates!

sys/dev/ufshci/ufshci_uic_cmd.c
40

OK. Looks like you've added notes about this limitation. If it's just boot, it's OK for now, but we'll want a note for the future since it's important enough to optimize once we more widely deploy on real hardware.

sys/modules/ufshci/Makefile
18

Ah, I think I wasn't communicating well... It's still needed, but otherwise unreferenced in this change set. It was more of a note to others that it was needed. Sorry that I wasn't clear and created extra work for you.

j_yoon.choi_samsung.com marked 2 inline comments as done.

Apply review comments

Apply review comments.

sys/dev/ufshci/ufshci_uic_cmd.c
40

I agree, I plan to prepare a patch for this within this month.

sys/modules/ufshci/Makefile
18

I misunderstood the comment. I've reverted it.
Thank you!

Change UIC power mode ready time from 500ms to 2000ms

I think this is ready to commit (or at least to start to try)... What do you think?

In D50370#1159535, @imp wrote:

I think this is ready to commit (or at least to start to try)... What do you think?

Yes, I think this is ready to commit!


I don't understand the commit process. Is there anything I need to do to commit?


Since you don't yet have commit permissions, I'll need to do that. I'll download it to my tree, make sure it compiles everywhere (or is disabled) and then push it into the repo. I just wanted to make sure that you didn't have anything else planned before I started that process...


I don't have any additional changes planned, so feel free to proceed.
Please let me know if you have any issues. Thanks again!

This revision was not accepted when it landed; it landed in state Needs Review. (Sun, Jun 15, 6:09 AM)
This revision was automatically updated to reflect the committed changes.