[v2,0/6] mmc: handle undervoltage events and prevent eMMC corruption

Message ID: 20250220074429.2906141-1-o.rempel@pengutronix.de

Oleksij Rempel Feb. 20, 2025, 7:44 a.m. UTC
This patch set introduces a framework for handling undervoltage events
in the MMC subsystem. The goal is to improve system reliability by
ensuring graceful handling of power fluctuations that could otherwise
lead to metadata corruption, potentially rendering the eMMC chip
unusable or causing significant data loss.

## Problem Statement

Power fluctuations and sudden power loss can leave eMMC devices in an
undefined state. In the worst case, metadata is corrupted and the
entire storage becomes inaccessible.
While some eMMC devices promise to handle such situations internally,
experience shows that some chip variants are still affected. This has
led vendors to take a more protective approach, implementing external
undervoltage handling as a precautionary measure to avoid costly field
failures and returns.

The existence of the "Power Off Notification" feature in the eMMC
standard itself serves as indirect evidence that this is a real-world
issue.  While some projects have already faced the consequences of
ignoring this problem (often at significant cost), specific cases cannot
be disclosed due to NDAs.

## Challenges and Implementation Approach

1. **Raising awareness of the problem**: While vendors have used
   proprietary solutions for years, a unified approach is needed upstream.
   This patch set is a first step in making that happen.

2. **Finding an acceptable implementation path**: There are multiple
   ways to handle undervoltage - either in the kernel or in user space,
   through a global shutdown mechanism, or using the regulator framework.
   This patch set takes the kernel-based approach but does not prevent
   future extensions, such as allowing user-space handoff once available.

3. **Preparing for vendor adoption and testing**: By providing a
   structured solution upstream, this patch set lowers the barrier for
   vendors to standardize their undervoltage handling instead of relying on
   fragmented, out-of-tree implementations.
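
As an illustration of the kernel-based path in item 2, the following is a
minimal user-space sketch of the intended flow: an undervoltage event
reaches a callback, the callback performs a one-time emergency shutdown
(standing in for the Power Off Notification), and later requests are
rejected. All names here (`mock_host`, `mock_undervoltage_notify`,
`mock_submit_request`, `MOCK_EVENT_UNDER_VOLTAGE`) and the `-ENODEV`
return value are hypothetical and not taken from the patches; the series
itself registers regulator notifiers and works inside the MMC core.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not kernel definitions. */
enum mock_event {
	MOCK_EVENT_NONE = 0,
	MOCK_EVENT_UNDER_VOLTAGE = 1,
};

struct mock_host {
	bool undervoltage;      /* set once an undervoltage event was handled */
	int poweroff_notified;  /* counts simulated Power Off Notifications */
};

/*
 * Callback a supply monitor would invoke on an event.
 * Returns 1 if the event triggered the emergency shutdown, 0 otherwise.
 */
static int mock_undervoltage_notify(struct mock_host *host, enum mock_event ev)
{
	assert(host != NULL);

	/* Ignore unrelated events and repeated undervoltage reports. */
	if (ev != MOCK_EVENT_UNDER_VOLTAGE || host->undervoltage)
		return 0;

	host->poweroff_notified++;  /* stands in for the Power Off Notification */
	host->undervoltage = true;  /* block all further commands */
	return 1;
}

/* Request entry point: refuses work once the shutdown has happened. */
static int mock_submit_request(const struct mock_host *host)
{
	assert(host != NULL);

	/* Error code chosen arbitrarily for this sketch. */
	return host->undervoltage ? -ENODEV : 0;
}
```

The single `undervoltage` flag is the key design point of the sketch: it
makes the shutdown idempotent and gives the request path one cheap check
for refusing work after the device has been notified.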

## Current Limitations

This patch set is an initial step and does not yet cover all possible
design restrictions or edge cases. Future improvements may include
better coordination with user space and enhancements based on broader
testing.

## Testing Details

The implementation was tested on an i.MX8MP-based system. The board had
approximately 100 ms of available power hold-up time. The Power Off
Notification was sent ~4 ms after the board was detached from the power
supply, which left roughly 96 ms of hold-up time, sufficient for the
eMMC to handle the event properly. Tests were conducted under both idle
conditions and active read/write operations.

Oleksij Rempel (6):
  mmc: core: Handle undervoltage events and register regulator notifiers
  mmc: core: make mmc_interrupt_hpi() global
  mmc: core: refactor _mmc_suspend() for undervoltage handling
  mmc: core: add undervoltage handler for MMC/eMMC devices
  mmc: block: abort requests and suppress errors after undervoltage
    shutdown
  mmc: sdhci: prevent command execution after undervoltage shutdown

 drivers/mmc/core/block.c     |   2 +-
 drivers/mmc/core/core.c      |  20 ++++++
 drivers/mmc/core/core.h      |   2 +
 drivers/mmc/core/mmc.c       | 101 ++++++++++++++++++++++++------
 drivers/mmc/core/mmc_ops.c   |   2 +-
 drivers/mmc/core/mmc_ops.h   |   1 +
 drivers/mmc/core/queue.c     |   2 +-
 drivers/mmc/core/regulator.c | 115 +++++++++++++++++++++++++++++++++++
 drivers/mmc/host/sdhci.c     |   9 +++
 include/linux/mmc/host.h     |   9 +++
 10 files changed, 241 insertions(+), 22 deletions(-)

--
2.39.5