From patchwork Tue Dec 17 12:45:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Chen X-Patchwork-Id: 11297553 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5D7B31593 for ; Tue, 17 Dec 2019 12:58:48 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2CAEF20663 for ; Tue, 17 Dec 2019 12:58:47 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2CAEF20663 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:40058 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCR4-00068Y-GZ for patchwork-qemu-devel@patchwork.kernel.org; Tue, 17 Dec 2019 07:58:46 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:54022) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCMV-0000Ok-QF for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:05 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ihCMT-0001DG-TB for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:03 -0500 Received: from mga18.intel.com ([134.134.136.126]:49326) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1ihCMT-00016O-Kd for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:01 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Dec 2019 04:53:55 
-0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,325,1571727600"; d="scan'208";a="209689617" Received: from unknown (HELO localhost.localdomain) ([10.239.13.19]) by orsmga008.jf.intel.com with ESMTP; 17 Dec 2019 04:53:53 -0800 From: Zhang Chen To: Jason Wang , Paolo Bonzini , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , qemu-dev Subject: [PATCH V4 1/5] net/awd.c: Introduce Advanced Watch Dog module framework Date: Tue, 17 Dec 2019 20:45:50 +0800 Message-Id: <20191217124554.30818-2-chen.zhang@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191217124554.30818-1-chen.zhang@intel.com> References: <20191217124554.30818-1-chen.zhang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.126 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Zhang Chen Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Zhang Chen This patch introduces a new module named Advanced Watch Dog (AWD) and defines its input and output parameters. AWD uses a standard chardev to communicate with the outside world. To use it, pass "--enable-awd" when configuring QEMU. 
Demo command: -object advanced-watchdog,id=awd1,server=on,awd_node=d_node,notification_node=remote_server,opt_script=opt_script_path,iothread=awd_iothread,pulse_interval=1000,timeout=5000 Signed-off-by: Zhang Chen --- configure | 9 ++ net/Makefile.objs | 1 + net/awd.c | 261 ++++++++++++++++++++++++++++++++++++++++++++++ qemu-options.hx | 20 ++++ 4 files changed, 291 insertions(+) create mode 100644 net/awd.c diff --git a/configure b/configure index 6099be1d84..49d1830de4 100755 --- a/configure +++ b/configure @@ -383,6 +383,7 @@ vhost_scsi="" vhost_vsock="" vhost_user="" vhost_user_fs="" +awd="no" kvm="no" hax="no" hvf="no" @@ -1304,6 +1305,10 @@ for opt do ;; --enable-vhost-user-fs) vhost_user_fs="yes" ;; + --disable-awd) awd="no" + ;; + --enable-awd) awd="yes" + ;; --disable-opengl) opengl="no" ;; --enable-opengl) opengl="yes" @@ -1780,6 +1785,7 @@ disabled with --disable-FEATURE, default is enabled if available: vhost-crypto vhost-user-crypto backend support vhost-kernel vhost kernel backend support vhost-user vhost-user backend support + awd Advanced Watch Dog support spice spice rbd rados block device (rbd) libiscsi iscsi support @@ -7043,6 +7049,9 @@ fi if test "$vhost_user" = "yes" ; then echo "CONFIG_VHOST_USER=y" >> $config_host_mak fi +if test "$awd" = "yes" ; then + echo "CONFIG_AWD=y" >> $config_host_mak +fi if test "$vhost_user_fs" = "yes" ; then echo "CONFIG_VHOST_USER_FS=y" >> $config_host_mak fi diff --git a/net/Makefile.objs b/net/Makefile.objs index c5d076d19c..187e655443 100644 --- a/net/Makefile.objs +++ b/net/Makefile.objs @@ -19,6 +19,7 @@ common-obj-y += colo-compare.o common-obj-y += colo.o common-obj-y += filter-rewriter.o common-obj-y += filter-replay.o +common-obj-$(CONFIG_AWD) += awd.o tap-obj-$(CONFIG_LINUX) = tap-linux.o tap-obj-$(CONFIG_BSD) = tap-bsd.o diff --git a/net/awd.c b/net/awd.c new file mode 100644 index 0000000000..d42b4a7372 --- /dev/null +++ b/net/awd.c @@ -0,0 +1,261 @@ +/* + * Advanced Watch Dog + * + * COarse-grain 
LOck-stepping Virtual Machines for Non-stop Service (COLO) + * (a.k.a. Fault Tolerance or Continuous Replication) + * + * Copyright (c) 2019 Intel Corporation + * + * Author: Zhang Chen + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qemu/error-report.h" +#include "trace.h" +#include "qemu-common.h" +#include "qapi/error.h" +#include "net/net.h" +#include "qom/object_interfaces.h" +#include "qom/object.h" +#include "chardev/char-fe.h" +#include "qemu/sockets.h" +#include "sysemu/iothread.h" + +#define TYPE_AWD "advanced-watchdog" +#define AWD(obj) OBJECT_CHECK(AwdState, (obj), TYPE_AWD) + +#define AWD_READ_LEN_MAX NET_BUFSIZE +/* Default advanced watchdog pulse interval */ +#define AWD_PULSE_INTERVAL_DEFAULT 5000 +/* Default advanced watchdog timeout */ +#define AWD_TIMEOUT_DEFAULT 2000 + +typedef struct AwdState { + Object parent; + + bool server; + char *awd_node; + char *notification_node; + char *opt_script; + uint32_t pulse_interval; + uint32_t timeout; + IOThread *iothread; +} AwdState; + +typedef struct AwdClass { + ObjectClass parent_class; +} AwdClass; + +static char *awd_get_node(Object *obj, Error **errp) +{ + AwdState *s = AWD(obj); + + return g_strdup(s->awd_node); +} + +static void awd_set_node(Object *obj, const char *value, Error **errp) +{ + AwdState *s = AWD(obj); + + g_free(s->awd_node); + s->awd_node = g_strdup(value); +} + +static char *noti_get_node(Object *obj, Error **errp) +{ + AwdState *s = AWD(obj); + + return g_strdup(s->notification_node); +} + +static void noti_set_node(Object *obj, const char *value, Error **errp) +{ + AwdState *s = AWD(obj); + + g_free(s->notification_node); + s->notification_node = g_strdup(value); +} + +static char *opt_script_get_node(Object *obj, Error **errp) +{ + AwdState *s = AWD(obj); + + return g_strdup(s->opt_script); +} + +static void opt_script_set_node(Object *obj, const char 
*value, Error **errp) +{ + AwdState *s = AWD(obj); + + g_free(s->opt_script); + s->opt_script = g_strdup(value); +} + +static bool awd_get_server(Object *obj, Error **errp) +{ + AwdState *s = AWD(obj); + + return s->server; +} + +static void awd_set_server(Object *obj, bool value, Error **errp) +{ + AwdState *s = AWD(obj); + + s->server = value; +} + +static void awd_get_interval(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ + AwdState *s = AWD(obj); + uint32_t value = s->pulse_interval; + + visit_type_uint32(v, name, &value, errp); +} + +static void awd_set_interval(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ + AwdState *s = AWD(obj); + Error *local_err = NULL; + uint32_t value; + + visit_type_uint32(v, name, &value, &local_err); + if (local_err) { + goto out; + } + if (!value) { + error_setg(&local_err, "Property '%s.%s' requires a positive value", + object_get_typename(obj), name); + goto out; + } + s->pulse_interval = value; + +out: + error_propagate(errp, local_err); +} + +static void awd_get_timeout(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ + AwdState *s = AWD(obj); + uint32_t value = s->timeout; + + visit_type_uint32(v, name, &value, errp); +} + +static void awd_set_timeout(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ + AwdState *s = AWD(obj); + Error *local_err = NULL; + uint32_t value; + + visit_type_uint32(v, name, &value, &local_err); + if (local_err) { + goto out; + } + + if (!value) { + error_setg(&local_err, "Property '%s.%s' requires a positive value", + object_get_typename(obj), name); + goto out; + } + s->timeout = value; + +out: + error_propagate(errp, local_err); +} + +static void awd_complete(UserCreatable *uc, Error **errp) +{ + AwdState *s = AWD(uc); + + if (!s->awd_node || !s->iothread || + !s->notification_node || !s->opt_script) { + error_setg(errp, "advanced-watchdog needs 'awd_node', " + 
"'notification_node', 'opt_script' " + "and 'iothread' properties set"); + return; + } + + return; +} + +static void awd_class_init(ObjectClass *oc, void *data) +{ + UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc); + + ucc->complete = awd_complete; +} + +static void awd_init(Object *obj) +{ + AwdState *s = AWD(obj); + + object_property_add_str(obj, "awd_node", + awd_get_node, awd_set_node, + NULL); + + object_property_add_str(obj, "notification_node", + noti_get_node, noti_set_node, + NULL); + + object_property_add_str(obj, "opt_script", + opt_script_get_node, opt_script_set_node, + NULL); + + object_property_add_bool(obj, "server", + awd_get_server, + awd_set_server, NULL); + + object_property_add(obj, "pulse_interval", "uint32", + awd_get_interval, + awd_set_interval, NULL, NULL, NULL); + + object_property_add(obj, "timeout", "uint32", + awd_get_timeout, + awd_set_timeout, NULL, NULL, NULL); + + object_property_add_link(obj, "iothread", TYPE_IOTHREAD, + (Object **)&s->iothread, + object_property_allow_set_link, + OBJ_PROP_LINK_STRONG, NULL); +} + +static void awd_finalize(Object *obj) +{ + AwdState *s = AWD(obj); + + g_free(s->awd_node); + g_free(s->notification_node); +} + +static const TypeInfo awd_info = { + .name = TYPE_AWD, + .parent = TYPE_OBJECT, + .instance_size = sizeof(AwdState), + .instance_init = awd_init, + .instance_finalize = awd_finalize, + .class_size = sizeof(AwdClass), + .class_init = awd_class_init, + .interfaces = (InterfaceInfo[]) { + { TYPE_USER_CREATABLE }, + { } + } +}; + +static void register_types(void) +{ + type_register_static(&awd_info); +} + +type_init(register_types); diff --git a/qemu-options.hx b/qemu-options.hx index 65c9473b73..40417afab5 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -4589,6 +4589,26 @@ Dump the network traffic on netdev @var{dev} to the file specified by The file format is libpcap, so it can be analyzed with tools such as tcpdump or Wireshark. 
+@item -object advanced-watchdog,id=@var{id},awd_node=@var{chardevid},notification_node=@var{chardevid},server=@var{server},iothread=@var{id},opt_script=@var{path}[,pulse_interval=@var{time_ms},timeout=@var{time_ms}]
+
+Advanced Watch Dog (AWD) is a universal monitoring module on the VMM side. It can be
+used to detect network outages (VMM to guest, VMM to VMM, VMM to another remote
+server) and perform a previously configured operation. AWD uses the awd_node
+@var{chardevid} parameter to connect to a -chardev node for the heartbeat
+service, and the server @var{server} parameter selects whether AWD acts as the
+server side or the client side. The iothread @var{id} parameter attaches AWD to an
+iothread so that it runs independently of the main loop. The pulse_interval @var{time_ms}
+and timeout @var{time_ms} parameters are heartbeat service properties; the defaults are
+pulse_interval=5000 and timeout=2000. AWD uses the notification_node @var{chardevid}
+parameter to attach another -chardev socket node for the configured operation. The user
+puts the operation (user command) in the opt_script file; AWD reads this script
+and sends its contents to notification_node when a timeout occurs. This gives users a
+basic VM/host network monitoring tool and a basic fault tolerance and recovery solution.
+
+Use cases:
+Send a message to the admin, notify another VMM, send a QMP command to QEMU to perform
+an operation such as restarting the VM, build a VMM heartbeat system, etc.
+ @item -object colo-compare,id=@var{id},primary_in=@var{chardevid},secondary_in=@var{chardevid},outdev=@var{chardevid},iothread=@var{id}[,vnet_hdr_support][,notify_dev=@var{id}] Colo-compare gets packet from primary_in@var{chardevid} and secondary_in@var{chardevid}, than compare primary packet with From patchwork Tue Dec 17 12:45:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Chen X-Patchwork-Id: 11297543 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B5850112B for ; Tue, 17 Dec 2019 12:54:59 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 47A8E20663 for ; Tue, 17 Dec 2019 12:54:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 47A8E20663 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:39990 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCNN-0001fD-VU for patchwork-qemu-devel@patchwork.kernel.org; Tue, 17 Dec 2019 07:54:57 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:53963) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCMT-0000OP-Vs for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:03 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ihCMS-0001Bb-6a for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:01 -0500 Received: from mga18.intel.com ([134.134.136.126]:49328) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from 
) id 1ihCMR-00018a-Ty for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:00 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Dec 2019 04:53:56 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,325,1571727600"; d="scan'208";a="209689629" Received: from unknown (HELO localhost.localdomain) ([10.239.13.19]) by orsmga008.jf.intel.com with ESMTP; 17 Dec 2019 04:53:55 -0800 From: Zhang Chen To: Jason Wang , Paolo Bonzini , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , qemu-dev Subject: [PATCH V4 2/5] net/awd.c: Initialize input/output chardev Date: Tue, 17 Dec 2019 20:45:51 +0800 Message-Id: <20191217124554.30818-3-chen.zhang@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191217124554.30818-1-chen.zhang@intel.com> References: <20191217124554.30818-1-chen.zhang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.126 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Zhang Chen Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Zhang Chen Find and check the chardevs awd_node and notification_node. The awd_node is used to stay connected with the outside (for example a VM client, another host, or a remote server), and the notification_node is used to perform an operation when a disconnect event occurs. 
Signed-off-by: Zhang Chen --- net/awd.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/net/awd.c b/net/awd.c index d42b4a7372..ad3d39c982 100644 --- a/net/awd.c +++ b/net/awd.c @@ -42,6 +42,8 @@ typedef struct AwdState { char *opt_script; uint32_t pulse_interval; uint32_t timeout; + CharBackend chr_awd_node; + CharBackend chr_notification_node; IOThread *iothread; } AwdState; @@ -175,9 +177,30 @@ out: error_propagate(errp, local_err); } +static int find_and_check_chardev(Chardev **chr, + char *chr_name, + Error **errp) +{ + *chr = qemu_chr_find(chr_name); + if (*chr == NULL) { + error_setg(errp, "Device '%s' not found", + chr_name); + return 1; + } + + if (!qemu_chr_has_feature(*chr, QEMU_CHAR_FEATURE_RECONNECTABLE)) { + error_setg(errp, "chardev \"%s\" is not reconnectable", + chr_name); + return 1; + } + + return 0; +} + static void awd_complete(UserCreatable *uc, Error **errp) { AwdState *s = AWD(uc); + Chardev *chr; if (!s->awd_node || !s->iothread || !s->notification_node || !s->opt_script) { @@ -187,6 +210,20 @@ static void awd_complete(UserCreatable *uc, Error **errp) return; } + if (find_and_check_chardev(&chr, s->awd_node, errp) || + !qemu_chr_fe_init(&s->chr_awd_node, chr, errp)) { + error_prepend(errp, "advanced-watchdog awd_node '%s': ", + s->awd_node); + return; + } + + if (find_and_check_chardev(&chr, s->notification_node, errp) || + !qemu_chr_fe_init(&s->chr_notification_node, chr, errp)) { + error_prepend(errp, "advanced-watchdog " + "notification_node '%s': ", s->notification_node); + return; + } + return; } From patchwork Tue Dec 17 12:45:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Chen X-Patchwork-Id: 11297545 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2D604112B for ; Tue, 17 
Dec 2019 12:55:02 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0D8FD20663 for ; Tue, 17 Dec 2019 12:55:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0D8FD20663 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:39993 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCNR-0001hq-0O for patchwork-qemu-devel@patchwork.kernel.org; Tue, 17 Dec 2019 07:55:01 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:53978) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCMU-0000OR-0h for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:03 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ihCMS-0001Bp-La for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:01 -0500 Received: from mga18.intel.com ([134.134.136.126]:49328) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1ihCMS-00018a-Dc for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:00 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Dec 2019 04:53:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,325,1571727600"; d="scan'208";a="209689636" Received: from unknown (HELO localhost.localdomain) ([10.239.13.19]) by orsmga008.jf.intel.com with ESMTP; 17 Dec 2019 04:53:57 -0800 From: Zhang Chen To: Jason Wang , Paolo Bonzini , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , qemu-dev Subject: [PATCH V4 3/5] net/awd.c: Load advanced 
watch dog worker thread job Date: Tue, 17 Dec 2019 20:45:52 +0800 Message-Id: <20191217124554.30818-4-chen.zhang@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191217124554.30818-1-chen.zhang@intel.com> References: <20191217124554.30818-1-chen.zhang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.126 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Zhang Chen Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Zhang Chen This patch sets up pulse_timer and timeout_timer in the new iothread. The pulse timer sends pulse messages to awd_node, and the timeout timer checks for the reply pulse from awd_node. If a timeout occurs, AWD sends the opt_script data to the notification_node. Signed-off-by: Zhang Chen --- net/awd.c | 193 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 193 insertions(+) diff --git a/net/awd.c b/net/awd.c index ad3d39c982..04f40e7cc8 100644 --- a/net/awd.c +++ b/net/awd.c @@ -40,17 +40,137 @@ typedef struct AwdState { char *awd_node; char *notification_node; char *opt_script; + char *opt_script_data; uint32_t pulse_interval; uint32_t timeout; CharBackend chr_awd_node; CharBackend chr_notification_node; + SocketReadState awd_rs; + + QEMUTimer *pulse_timer; + QEMUTimer *timeout_timer; IOThread *iothread; + GMainContext *worker_context; } AwdState; typedef struct AwdClass { ObjectClass parent_class; } AwdClass; +static int awd_chr_send(AwdState *s, + const uint8_t *buf, + uint32_t size) +{ + int ret = 0; + uint32_t len = htonl(size); + + if (!size) { + return 0; + } + + ret = qemu_chr_fe_write_all(&s->chr_awd_node, (uint8_t *)&len, + sizeof(len)); + if (ret != sizeof(len)) { + goto err; + } + + ret = qemu_chr_fe_write_all(&s->chr_awd_node, (uint8_t *)buf, + size); + 
if (ret != size) { + goto err; + } + + return 0; + +err: + return ret < 0 ? ret : -EIO; +} + +static int awd_chr_can_read(void *opaque) +{ + return AWD_READ_LEN_MAX; +} + +static void awd_node_in(void *opaque, const uint8_t *buf, int size) +{ + AwdState *s = AWD(opaque); + int ret; + + ret = net_fill_rstate(&s->awd_rs, buf, size); + if (ret == -1) { + qemu_chr_fe_set_handlers(&s->chr_awd_node, NULL, NULL, NULL, NULL, + NULL, NULL, true); + error_report("advanced-watchdog get pulse error"); + } +} + +static void awd_send_pulse(void *opaque) +{ + AwdState *s = opaque; + char buf[] = "advanced-watchdog pulse"; + + awd_chr_send(s, (uint8_t *)buf, sizeof(buf)); +} + +static void awd_regular_pulse(void *opaque) +{ + AwdState *s = opaque; + + awd_send_pulse(s); + timer_mod(s->pulse_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + s->pulse_interval); +} + +static void awd_timeout(void *opaque) +{ + AwdState *s = opaque; + int ret = 0; + + ret = qemu_chr_fe_write_all(&s->chr_notification_node, + (uint8_t *)s->opt_script_data, + strlen(s->opt_script_data)); + if (ret != (int)strlen(s->opt_script_data)) { + error_report("advanced-watchdog notification failure"); + } +} + +static void awd_timer_init(AwdState *s) +{ + AioContext *ctx = iothread_get_aio_context(s->iothread); + + s->timeout_timer = aio_timer_new(ctx, QEMU_CLOCK_VIRTUAL, SCALE_MS, + awd_timeout, s); + + s->pulse_timer = aio_timer_new(ctx, QEMU_CLOCK_VIRTUAL, SCALE_MS, + awd_regular_pulse, s); + + if (!s->pulse_interval) { + s->pulse_interval = AWD_PULSE_INTERVAL_DEFAULT; + } + + if (!s->timeout) { + s->timeout = AWD_TIMEOUT_DEFAULT; + } + + timer_mod(s->pulse_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + s->pulse_interval); +} + +static void awd_timer_del(AwdState *s) +{ + if (s->pulse_timer) { + timer_del(s->pulse_timer); + timer_free(s->pulse_timer); + s->pulse_timer = NULL; + } + + if (s->timeout_timer) { + timer_del(s->timeout_timer); + timer_free(s->timeout_timer); + s->timeout_timer = NULL; + } + } + static char *awd_get_node(Object *obj, 
Error **errp) { AwdState *s = AWD(obj); @@ -177,6 +297,22 @@ out: error_propagate(errp, local_err); } +static void awd_rs_finalize(SocketReadState *awd_rs) +{ + AwdState *s = container_of(awd_rs, AwdState, awd_rs); + + if (!s->server) { + char buf[] = "advanced-watchdog reply pulse"; + + awd_chr_send(s, (uint8_t *)buf, sizeof(buf)); + } + + timer_mod(s->timeout_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + s->timeout); + + error_report("advanced-watchdog got message : %s", awd_rs->buf); +} + static int find_and_check_chardev(Chardev **chr, char *chr_name, Error **errp) @@ -197,6 +333,46 @@ static int find_and_check_chardev(Chardev **chr, return 0; } +static void awd_iothread(AwdState *s) +{ + object_ref(OBJECT(s->iothread)); + s->worker_context = iothread_get_g_main_context(s->iothread); + + qemu_chr_fe_set_handlers(&s->chr_awd_node, awd_chr_can_read, + awd_node_in, NULL, NULL, + s, s->worker_context, true); + + awd_timer_init(s); +} + +static int get_opt_script_data(AwdState *s) +{ + FILE *opt_fd; + long fsize; + + opt_fd = fopen(s->opt_script, "r"); + if (opt_fd == NULL) { + error_report("advanced-watchdog can't open " + "opt_script: %s", s->opt_script); + return -1; + } + + fseek(opt_fd, 0, SEEK_END); + fsize = ftell(opt_fd); + fseek(opt_fd, 0, SEEK_SET); + s->opt_script_data = g_malloc0(fsize + 1); + + if (!fread(s->opt_script_data, 1, fsize, opt_fd)) { + error_report("advanced-watchdog can't read " + "opt_script: %s", s->opt_script); + return -1; + } + + fclose(opt_fd); + + return 0; +} + static void awd_complete(UserCreatable *uc, Error **errp) { AwdState *s = AWD(uc); @@ -224,6 +400,16 @@ static void awd_complete(UserCreatable *uc, Error **errp) return; } + if (get_opt_script_data(s)) { + error_setg(errp, "advanced-watchdog can't get " + "opt script data: %s", s->opt_script); + return; + } + + net_socket_rs_init(&s->awd_rs, awd_rs_finalize, false); + + awd_iothread(s); + return; } @@ -272,6 +458,13 @@ static void awd_finalize(Object *obj) { AwdState *s = 
AWD(obj); + qemu_chr_fe_deinit(&s->chr_awd_node, false); + qemu_chr_fe_deinit(&s->chr_notification_node, false); + + if (s->iothread) { + awd_timer_del(s); + } + g_free(s->awd_node); g_free(s->notification_node); } From patchwork Tue Dec 17 12:45:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Chen X-Patchwork-Id: 11297547 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 00B68930 for ; Tue, 17 Dec 2019 12:55:03 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D37C621835 for ; Tue, 17 Dec 2019 12:55:02 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D37C621835 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:39994 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCNR-0001iz-Je for patchwork-qemu-devel@patchwork.kernel.org; Tue, 17 Dec 2019 07:55:01 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:53979) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCMU-0000OS-17 for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:03 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ihCMT-0001Cc-3n for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:01 -0500 Received: from mga18.intel.com ([134.134.136.126]:49328) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1ihCMS-00018a-SY for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:01 -0500 
X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Dec 2019 04:53:59 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,325,1571727600"; d="scan'208";a="209689641" Received: from unknown (HELO localhost.localdomain) ([10.239.13.19]) by orsmga008.jf.intel.com with ESMTP; 17 Dec 2019 04:53:58 -0800 From: Zhang Chen To: Jason Wang , Paolo Bonzini , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , qemu-dev Subject: [PATCH V4 4/5] vl.c: Make Advanced Watch Dog delayed initialization Date: Tue, 17 Dec 2019 20:45:53 +0800 Message-Id: <20191217124554.30818-5-chen.zhang@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191217124554.30818-1-chen.zhang@intel.com> References: <20191217124554.30818-1-chen.zhang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.126 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Zhang Chen Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Zhang Chen Advanced Watch Dog module needs chardev socket to initialize properly before running. Signed-off-by: Zhang Chen --- vl.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/vl.c b/vl.c index 6a65a64bfd..048fe458b9 100644 --- a/vl.c +++ b/vl.c @@ -2689,6 +2689,13 @@ static bool object_create_initial(const char *type, QemuOpts *opts) return false; } + /* + * Reason: Advanced Watch Dog property "chardev". + */ + if (g_str_equal(type, "advanced-watchdog")) { + return false; + } + /* Memory allocation by backends needs to be done * after configure_accelerator() (due to the tcg_enabled() * checks at memory_region_init_*()). 
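
The ordering rule that the vl.c change above relies on can be illustrated with a small standalone C sketch. This is not QEMU code: the helper name awd_sketch_create_early and the table awd_sketch_delayed_types are hypothetical, and only the "advanced-watchdog" type string is taken from the patch. The assumption is a two-pass creation scheme in which object types that look up a chardev by id are skipped in the early pass, so the chardev already exists when the object is completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of object types that reference a chardev by id. */
static const char *const awd_sketch_delayed_types[] = {
    "advanced-watchdog",
};

/*
 * Hypothetical helper: returns true if an object of this type may be
 * created in the early pass (before chardevs exist), and false if its
 * creation must be delayed until the late pass.
 */
bool awd_sketch_create_early(const char *type)
{
    size_t n = sizeof(awd_sketch_delayed_types) /
               sizeof(awd_sketch_delayed_types[0]);
    size_t i;

    assert(type != NULL);

    for (i = 0; i < n; i++) {
        if (strcmp(type, awd_sketch_delayed_types[i]) == 0) {
            return false;   /* needs its chardev: create later */
        }
    }
    return true;            /* no chardev dependency: create early */
}
```

With this sketch, awd_sketch_create_early("advanced-watchdog") returns false, while a type without a chardev dependency, such as "iothread", returns true.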
From patchwork Tue Dec 17 12:45:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Chen X-Patchwork-Id: 11297551 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0EC1F930 for ; Tue, 17 Dec 2019 12:57:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E274D2146E for ; Tue, 17 Dec 2019 12:57:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E274D2146E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:40052 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCQ3-0005FG-0X for patchwork-qemu-devel@patchwork.kernel.org; Tue, 17 Dec 2019 07:57:43 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:54011) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ihCMV-0000OZ-9V for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ihCMU-0001Di-1E for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:03 -0500 Received: from mga18.intel.com ([134.134.136.126]:49328) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1ihCMT-00018a-Mv for qemu-devel@nongnu.org; Tue, 17 Dec 2019 07:54:01 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Dec 2019 04:54:01 
-0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,325,1571727600"; d="scan'208";a="209689654" Received: from unknown (HELO localhost.localdomain) ([10.239.13.19]) by orsmga008.jf.intel.com with ESMTP; 17 Dec 2019 04:54:00 -0800 From: Zhang Chen To: Jason Wang , Paolo Bonzini , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , qemu-dev Subject: [PATCH V4 5/5] docs/awd.txt: Add doc to introduce Advanced WatchDog(AWD) module Date: Tue, 17 Dec 2019 20:45:54 +0800 Message-Id: <20191217124554.30818-6-chen.zhang@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20191217124554.30818-1-chen.zhang@intel.com> References: <20191217124554.30818-1-chen.zhang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.126 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhang Chen , Zhang Chen Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel"

From: Zhang Chen

Add a document describing the Advanced Watch Dog (AWD) module and its usage.

Signed-off-by: Zhang Chen
---
 docs/awd.txt | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 docs/awd.txt

diff --git a/docs/awd.txt b/docs/awd.txt
new file mode 100644
index 0000000000..0ce513be5a
--- /dev/null
+++ b/docs/awd.txt
@@ -0,0 +1,88 @@
+Advanced Watch Dog (AWD)
+========================
+Copyright (c) 2019 Intel Corporation.
+Author: Zhang Chen
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+Introduction
+------------
+
+Advanced Watch Dog is a universal monitoring module on the VMM side. It can be
+used to detect network issues (VMM to guest, VMM to VMM, VMM to a remote server)
+and perform a previously configured operation. AWD currently accepts any input
+as the signal to refresh the watchdog timer; a specific interactive protocol
+could also be defined here. Users can pre-write commands or messages in the
+AWD opt-script to serve as the notification output. We noticed that VMMs have
+no way to communicate with each other directly, and after engaging with real
+customers we found that they need a lightweight and efficient mechanism to solve
+practical problems such as Edge Computing cases (they consider high-level
+software too heavy for the Edge, or hard to manage and combine with VM
+instances). AWD gives users basic VM/host network monitoring tools and a basic
+fault tolerance and recovery solution.
+
+Use case
+--------
+
+1. Monitor local guest status.
+A simple application in the guest sends signals to the local AWD module. If a
+timeout occurs, AWD notifies the high-level admin or performs a preset operation,
+for example sending an exit command to the local QMP interface or QEMU monitor.
+
+2. Monitor other VMM.
+AWD modules can be connected to each other to build a heartbeat service.
+
+3. Monitor other remote service.
+In some cases, a remote service has a certain relationship with the current VM.
+If the network connection has an issue, AWD can perform an urgent operation such
+as rebooting the local VM.
+
+AWD usage
+---------
+
+Users must pass "--enable-awd" when configuring QEMU.
+
+1. Monitor local guest status.
+
+-chardev socket,id=detection,host=0.0.0.0,port=9009,server,nowait
+-chardev socket,id=notification,host=127.0.0.1,port=4445
+-object iothread,id=iothread1
+-object advanced-watchdog,id=awd1,server=on,awd_node=detection,notification_node=notification,opt_script=colo_opt_script,iothread=iothread1,pulse_interval=1000,timeout=5000
+-monitor tcp::4445,server,nowait
+
+qemu_opt_script:
+quit
+
+The guest service needs to connect to the detection node. The admin can check
+the notification node to get the message when a timeout occurs.
+
+2. Monitor other VMM.
+
+Demo usage (for the COLO heartbeat service):
+
+On the primary node:
+
+-chardev socket,id=h1,host=3.3.3.3,port=9009,server,nowait
+-chardev socket,id=heartbeat0,host=3.3.3.3,port=4445
+-object iothread,id=iothread1
+-object advanced-watchdog,id=heart1,server=on,awd_node=h1,notification_node=heartbeat0,opt_script=colo_primary_opt_script,iothread=iothread1,pulse_interval=1000,timeout=5000
+
+colo_primary_opt_script:
+x_colo_lost_heartbeat
+
+On the secondary node:
+
+-monitor tcp::4445,server,nowait
+-chardev socket,id=h1,host=3.3.3.3,port=9009,reconnect=1
+-chardev socket,id=heart1,host=3.3.3.8,port=4445
+-object iothread,id=iothread1
+-object advanced-watchdog,id=heart1,server=off,awd_node=h1,notification_node=heart1,opt_script=colo_secondary_opt_script,iothread=iothread1,timeout=10000
+
+colo_secondary_opt_script:
+nbd_server_stop
+x_colo_lost_heartbeat
+
+3. Monitor other remote service.
+
+Same as monitoring a local guest, except for the detection and notification nodes.
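
For the local guest monitoring case in docs/awd.txt, the guest-side service only has to connect to the detection node and write something periodically, since AWD currently treats any input as a refresh signal. A minimal sketch of such a sender in Python is shown here. The function name send_pulses, the single-newline payload, and the one-second default interval are illustrative assumptions and not part of this series. The doc does not state the unit of pulse_interval or timeout, so choose an interval comfortably shorter than the configured timeout.

```python
import socket
import time


def send_pulses(host, port, interval_s=1.0, count=None, payload=b"\n"):
    """Send `payload` to the AWD detection node every `interval_s` seconds.

    Hypothetical helper, not part of QEMU. AWD is documented as accepting
    any input as a refresh signal, so the payload content is arbitrary.
    With count=None the loop runs until the connection fails, in which
    case the socket error propagates to the caller.
    Returns the number of pulses sent.
    """
    sent = 0
    # The detection chardev is a listening TCP socket (server,nowait),
    # so the guest service acts as a plain TCP client.
    with socket.create_connection((host, port), timeout=5.0) as sock:
        while count is None or sent < count:
            sock.sendall(payload)
            sent += 1
            # Do not sleep after the final pulse of a bounded run.
            if count is None or sent < count:
                time.sleep(interval_s)
    return sent
```

Inside the guest this could be called as send_pulses(host_address, 9009), where host_address is whatever address the guest uses to reach the detection chardev from the example. If the sender stops or the connection drops, AWD is expected to reach its timeout and run the opt-script.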