
mmc: Ignore BADCRC errors in CMD13 when switching to HS200
ClosedPublic

Authored by ashafer_badland.io on May 6 2020, 9:45 PM.

Details

Summary

This fixes my Acer Aspire A114-32-P7E5, whose eMMC flash memory requires this
patch to work. Without it, CMD13 fails with result 2 (bad CRC), which prevents
bsdinstall from continuing. This is with a Gemini Lake sdhci.

successful dmesg: https://dmesgd.nycbug.org/index.cgi?do=view&id=5472

From reading around online, it seems that some cards cannot depend on CMD13 to
check for errors while switching to HS200. They generate BADCRC errors for a
short time before CMD13 succeeds. It's my understanding that CMD13 is used after
CMD6 to verify that switching has completed, and if these BADCRC errors are
observed we should ignore them and retry CMD13.

I've updated mmc_switch_status to take an argument to ignore bad CRC errors,
which the caller sets if we are switching to HS200. This seemed to be the best
place to do this.
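To illustrate the idea described above, here is a minimal user-space sketch, not the actual patch. It models a card whose first few status queries fail with a bad CRC and a polling function that skips those errors only when the caller asks it to. Every name here (sk_card, sk_send_status, sk_switch_status, the SK_ERR_* codes, max_polls) is hypothetical; the only detail taken from the report is that result 2 means bad CRC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error codes; only "2 == bad CRC" comes from the report. */
enum sk_err { SK_ERR_NONE = 0, SK_ERR_TIMEOUT = 1, SK_ERR_BADCRC = 2 };

/* Stand-in for a card that is briefly unreadable after a mode switch. */
struct sk_card {
	int crc_failures;	/* status queries that will still fail */
	int queries;		/* status queries issued so far */
};

/* Stand-in for issuing CMD13 (SEND_STATUS) to the card. */
static int
sk_send_status(struct sk_card *card)
{
	assert(card != NULL);
	card->queries++;
	if (card->crc_failures > 0) {
		card->crc_failures--;
		return (SK_ERR_BADCRC);
	}
	return (SK_ERR_NONE);
}

/*
 * Poll the card status after a switch. When ignore_badcrc is set, a bad
 * CRC is treated as "not ready yet" and the query is repeated, up to
 * max_polls queries in total. Any other result is returned immediately.
 */
static int
sk_switch_status(struct sk_card *card, bool ignore_badcrc, int max_polls)
{
	int err = SK_ERR_TIMEOUT;

	for (int i = 0; i < max_polls; i++) {
		err = sk_send_status(card);
		if (err == SK_ERR_BADCRC && ignore_badcrc)
			continue;	/* transient: ask again */
		return (err);
	}
	return (err);	/* last error seen, or timeout if never polled */
}
```

In this model, a caller switching to HS200 would pass ignore_badcrc as true and see success once the card settles, while every other caller keeps the old behavior of failing on the first bad CRC.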

Test Plan

Tested on the aforementioned Acer. My guess is that this fix will apply to other
cheap laptops, since they all seem to use equivalent hardware. I'm posting the
patch here to get some more eyes on it; please let me know about anything that
needs improving.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

ashafer_badland.io created this revision.

Having looked at this code, it's my belief that this is safe to retry when we get a CRC error.
The root cause is likely insufficient settle time, but my copies of the MMC/SD standards are a bit too old to pursue this directly. It may also be an erratum in the specific bridge chip for switching modes.
I'd be curious to see what @manu or @kibab has to say as well since it's been a while since I wrote this code.

sys/dev/mmc/mmc.c
958 ↗(On Diff #71472)

I'd drop this change.

sys/dev/mmc/mmc_subr.c
191

I'd drop this change.

193

I'd drop this change.

215

I'd change this comment to be simpler:

CRC errors indicate that the command wasn't accepted and executed due to a
communications error. Retry CRC errors a couple of times to cope with transient
failures.

218

I'd drop ignore_badcrc && from this line and fold with the line below.

Bonus points: add a statistical counter for how many times this fires that one can get off the mmc bus sysctl.
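As a rough sketch of what the two suggestions above could look like together (an unconditional, bounded retry on CRC errors plus a counter of how often it fires), consider the following user-space model. It is not the committed code: sk_bus_stats, sk_status_with_retry, sk_flaky_status and the retry limit of 3 are all assumptions made for illustration, and exporting the counter through sysctl is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; only "2 == bad CRC" comes from the review. */
enum sk_err { SK_ERR_NONE = 0, SK_ERR_TIMEOUT = 1, SK_ERR_BADCRC = 2 };

/* "A couple of times": the exact limit is an assumption. */
#define SK_CRC_RETRIES	3

/* Hypothetical per-bus statistics that a sysctl node could expose. */
struct sk_bus_stats {
	unsigned long crc_retries;	/* retries triggered by CRC errors */
};

/* Callback standing in for "send CMD13 and return the result". */
typedef int (*sk_status_fn)(void *arg);

/*
 * Issue the status command and retry on CRC errors, with no flag from
 * the caller. At most SK_CRC_RETRIES retries follow the first attempt;
 * each retry bumps the counter. Other results are returned as-is.
 */
static int
sk_status_with_retry(sk_status_fn send_status, void *arg,
    struct sk_bus_stats *stats)
{
	int err;

	assert(send_status != NULL && stats != NULL);
	for (int attempt = 0; ; attempt++) {
		err = send_status(arg);
		if (err != SK_ERR_BADCRC || attempt == SK_CRC_RETRIES)
			return (err);
		stats->crc_retries++;
	}
}

/* Test stub: fails with bad CRC while *arg (an int) is positive. */
static int
sk_flaky_status(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return (SK_ERR_BADCRC);
	}
	return (SK_ERR_NONE);
}
```

Dropping the flag keeps the function signature unchanged for all callers, and the counter gives a cheap way to see how often the workaround is actually exercised on a given machine.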

sys/dev/mmc/mmc_subr.h
66 ↗(On Diff #71472)

I'd drop this change.

I like it. Hopefully @manu or @kibab can also comment, but if not in a few days, I'll go ahead and commit.

This revision is now accepted and ready to land. Jun 2 2021, 3:38 PM

I chatted with @manu on IRC. We looked at section 4.3.10 from the physical layer simplified spec (8.00). The proper response to the CRC error on CMD6 should be to power cycle the card. However, this works. Can you try adding a DELAY(100); at the top of mmc_switch_status to see if that also solves the problem? That would help us know why this workaround works...
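The reasoning behind the DELAY(100) experiment can be sketched with a toy model that uses a virtual clock. If the card only needs some settle time after the switch, a fixed delay before the first status query should make that query succeed; if the first query still fails, the cause lies elsewhere. Everything below is hypothetical (sk_timed_card, sk_delay, sk_cmd13, and all the microsecond values); the only borrowed fact is that the kernel's DELAY() takes microseconds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; only "2 == bad CRC" comes from the review. */
enum sk_err { SK_ERR_NONE = 0, SK_ERR_BADCRC = 2 };

/* Toy card with a virtual clock, all times in microseconds. */
struct sk_timed_card {
	unsigned now_us;	/* virtual time since the switch command */
	unsigned settle_us;	/* time the card needs before it answers */
	unsigned cmd_cost_us;	/* virtual time one status query takes */
};

/* Stand-in for the kernel's DELAY(us): just advances the virtual clock. */
static void
sk_delay(struct sk_timed_card *card, unsigned us)
{
	assert(card != NULL);
	card->now_us += us;
}

/* Stand-in for CMD13: bad CRC until the card has settled. */
static int
sk_cmd13(struct sk_timed_card *card)
{
	int err;

	assert(card != NULL);
	err = (card->now_us < card->settle_us) ? SK_ERR_BADCRC : SK_ERR_NONE;
	card->now_us += card->cmd_cost_us;
	return (err);
}

/* Wait delay_us, then report the result of the first status query. */
static int
sk_first_status_after_delay(struct sk_timed_card *card, unsigned delay_us)
{
	sk_delay(card, delay_us);
	return (sk_cmd13(card));
}
```

In this model a card with an 80 microsecond settle time fails its first query with no delay and passes it after a 100 microsecond delay, which is the outcome that would point to settle time as the root cause on real hardware.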

I misread the review and thought that CMD6 was failing.
But if CMD13 is failing after CMD6, it's just that the card isn't ready, so this code makes sense to me.

OK. I'll land it later today then. thanks @manu