diff mbox series

[v3,9/9] docs: add a doc about I/O hang

Message ID 20201022130303.1092-10-cenjiahui@huawei.com (mailing list archive)
State New, archived
Headers show
Series block-backend: Introduce I/O hang | expand

Commit Message

Jiahui Cen Oct. 22, 2020, 1:03 p.m. UTC
Give some details about the I/O hang and how to use it.

Signed-off-by: Jiahui Cen <cenjiahui@huawei.com>
Signed-off-by: Ying Fang <fangying1@huawei.com>
---
 docs/io-hang.rst | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)
 create mode 100644 docs/io-hang.rst
diff mbox series

Patch

diff --git a/docs/io-hang.rst b/docs/io-hang.rst
new file mode 100644
index 0000000000..53356ceed2
--- /dev/null
+++ b/docs/io-hang.rst
@@ -0,0 +1,45 @@ 
+========
+I/O hang
+========
+
+Introduction
+============
+
+I/O hang is a mechanism to automatically rehandle AIOs when an error occurs on
+the backend block device, which is unperceivable for Guest. If the error on the
+backend device is temporary, like NFS returns EIO due to network fluctuations,
+once the device is recovered, the AIOs will be resent automatically and done
+successfully. If the error on the device is unrecoverable, the failed AIOs will
+be returned to Guest after rehandle timeout.
+
+This mechanism can provide self-recovery and high availability to VM. From the
+view of Guest, AIOs sent to the device are hung for a time and the finished, and
+any other unrelated service in Guest can work as usual.
+
+Implementation
+==============
+
+Each BlockBackend will have a list to store AIOs which need be rehandled and a
+timer to monitor the timeout.
+
+If I/O hang is enabled, all returned AIOs will be checked and the failed ones
+will be inserted into BlockBackend's list. The timer will be started and the
+AIOs in the list will be resent to the backend device. Once the result of AIOs
+relevant to this backend device is success, the AIOs will be returned back to
+Guest. Otherwise, those AIOs will be rehandled periodically until timeout.
+
+I/O hang provides interfaces to drain all the AIOs in BlockBackend's list. There
+are some situations to drain the rehandling AIOs, eg. resetting virtio-blk.
+
+I/O hang also provides qapi events, which can be used for manager tools like
+libvirt to monitor.
+
+Examples
+========
+
+Enable I/O hang when launching QEMU::
+
+      $ qemu-system-x86_64                                      \
+          -machine pc,accel=kvm -smp 1 -m 2048                  \
+          -drive file=shared.img,format=raw,cache=none,aio=native,iohang-timeout=60
+