
[Qemu-devel] vm performance degradation after kvm live migration or save-restore with EPT enabled

Message ID D3E216785288A145B7BC975F83A2ED103FEF6E4C@szxeml556-mbx.china.huawei.com (mailing list archive)
State New, archived

Commit Message

Zhanghaoyu (A) July 30, 2013, 9:04 a.m. UTC
>> >> hi all,
>> >> 
>> >> I met similar problem to these, while performing live migration or 
>> >> save-restore test on the kvm platform (qemu:1.4.0, host:suse11sp2, 
>> >> guest:suse11sp2), running tele-communication software suite in 
>> >> guest, 
>> >> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
>> >> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
>> >> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
>> >> https://bugzilla.kernel.org/show_bug.cgi?id=58771
>> >> 
>> >> After live migration or virsh restore [savefile], one process's CPU 
>> >> utilization went up by about 30%, resulting in throughput 
>> >> degradation of that process.
>> >> 
>> >> With EPT disabled, the problem goes away.
>> >> 
>> >> I suspect that the kvm hypervisor has something to do with this problem.
>> >> Based on that suspicion, I want to find two adjacent versions of 
>> >> kvm-kmod, one of which triggers this problem and one of which does not (e.g. 2.6.39, 3.0-rc1), 
>> >> and analyze the differences between these two versions, or apply the 
>> >> patches between these two versions by bisection, to finally find the key patches.
>> >> 
>> >> Any better ideas?
>> >> 
>> >> Thanks,
>> >> Zhang Haoyu
>> >
>> >I've attempted to duplicate this on a number of machines that are as similar to yours as I am able to get my hands on, and so far have not been able to see any performance degradation. And from what I've read in the above links, huge pages do not seem to be part of the problem.
>> >
>> >So, if you are in a position to bisect the kernel changes, that would probably be the best avenue to pursue in my opinion.
>> >
>> >Bruce
>> 
>> By git-bisecting the kvm kernel changes (downloaded from https://git.kernel.org/pub/scm/virt/kvm/kvm.git), I found the first bad 
>> commit that triggers this problem: [612819c3c6e67bac8fceaa7cc402f13b1b63f7e4] "KVM: propagate fault r/w information to gup(), allow read-only memory".
>> 
>> And,
>> git log 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 -n 1 -p > 
>> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log
>> git diff 
>> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1..612819c3c6e67bac8fceaa7cc4
>> 02f13b1b63f7e4 > 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff
>> 
>> Then I diffed 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log and 
>> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff,
>> and concluded that all of the differences between 
>> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1 and 
>> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4
>> come from 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 itself, so this commit is what directly or indirectly causes the degradation.
>> 
>> Does the map_writable flag passed to mmu_set_spte() affect the PTE's PAT flag, or increase the number of VMEXITs induced by the guest trying to write read-only memory?
>> 
>> Thanks,
>> Zhang Haoyu
>> 
>
>There should be no read-only memory maps backing guest RAM.
>
>Can you confirm map_writable = false is being passed to __direct_map? (this should not happen, for guest RAM).
>And if it is false, please capture the associated GFN.
>
I added the below check and printk at the start of __direct_map() at the first bad commit version:
--- kvm-612819c3c6e67bac8fceaa7cc402f13b1b63f7e4/arch/x86/kvm/mmu.c     2013-07-26 18:44:05.000000000 +0800
+++ kvm-612819/arch/x86/kvm/mmu.c       2013-07-31 00:05:48.000000000 +0800
@@ -2223,6 +2223,9 @@ static int __direct_map(struct kvm_vcpu
        int pt_write = 0;
        gfn_t pseudo_gfn;

+        if (!map_writable)
+                printk(KERN_ERR "%s: %s: gfn = %llu \n", __FILE__, __func__, gfn);
+
        for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
                if (iterator.level == level) {
                        unsigned pte_access = ACC_ALL;

I virsh-saved the VM and then virsh-restored it; so many GFNs were printed that you could absolutely describe it as flooding.
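When the log floods like this, a quick way to see whether the same GFNs repeat (suggesting repeated read-only faults on the same pages) or are all distinct is to aggregate the printk lines. A minimal sketch, assuming the printk format of the debug patch above (in practice the input would come from `dmesg | grep 'gfn ='`):

```shell
# Aggregate flooded "gfn = N" printk lines into per-GFN counts.
# The sample lines below mimic the debug printk format; real input
# would be piped in from dmesg.
summary=$(printf '%s\n' \
    'arch/x86/kvm/mmu.c: __direct_map: gfn = 2042600' \
    'arch/x86/kvm/mmu.c: __direct_map: gfn = 2797777' \
    'arch/x86/kvm/mmu.c: __direct_map: gfn = 2042600' |
  awk '{ count[$NF]++ } END { for (g in count) print g, count[g] }' |
  sort -n)
printf '%s\n' "$summary"
```

A GFN with a count greater than one was faulted in non-writable more than once.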

>Its probably an issue with an older get_user_pages variant (either in kvm-kmod or the older kernel). Is there any indication of a similar issue with upstream kernel?
I will test the upstream kvm host (https://git.kernel.org/pub/scm/virt/kvm/kvm.git) later; if the problem is still there,
I will revert the first bad commit (612819c3c6e67bac8fceaa7cc402f13b1b63f7e4) on the upstream and test again.

Also, I collected VMEXIT statistics in the pre-save and post-restore periods at the first bad commit version.
pre-save:
COTS-F10S03:~ # perf stat -e "kvm:*" -a sleep 30

 Performance counter stats for 'sleep 30':

           1222318 kvm:kvm_entry
                 0 kvm:kvm_hypercall
                 0 kvm:kvm_hv_hypercall
            351755 kvm:kvm_pio
              6703 kvm:kvm_cpuid
            692502 kvm:kvm_apic
           1234173 kvm:kvm_exit
            223956 kvm:kvm_inj_virq
                 0 kvm:kvm_inj_exception
             16028 kvm:kvm_page_fault
             59872 kvm:kvm_msr
                 0 kvm:kvm_cr
            169596 kvm:kvm_pic_set_irq
             81455 kvm:kvm_apic_ipi
            245103 kvm:kvm_apic_accept_irq
                 0 kvm:kvm_nested_vmrun
                 0 kvm:kvm_nested_intercepts
                 0 kvm:kvm_nested_vmexit
                 0 kvm:kvm_nested_vmexit_inject
                 0 kvm:kvm_nested_intr_vmexit
                 0 kvm:kvm_invlpga
                 0 kvm:kvm_skinit
            853020 kvm:kvm_emulate_insn
            171140 kvm:kvm_set_irq
            171534 kvm:kvm_ioapic_set_irq
                 0 kvm:kvm_msi_set_irq
             99276 kvm:kvm_ack_irq
            971166 kvm:kvm_mmio
             33722 kvm:kvm_fpu
                 0 kvm:kvm_age_page
                 0 kvm:kvm_try_async_get_page
                 0 kvm:kvm_async_pf_not_present
                 0 kvm:kvm_async_pf_ready
                 0 kvm:kvm_async_pf_completed
                 0 kvm:kvm_async_pf_doublefault

      30.019069018 seconds time elapsed

post-restore:
COTS-F10S03:~ # perf stat -e "kvm:*" -a sleep 30

 Performance counter stats for 'sleep 30':

           1327880 kvm:kvm_entry
                 0 kvm:kvm_hypercall
                 0 kvm:kvm_hv_hypercall
            375189 kvm:kvm_pio
              6925 kvm:kvm_cpuid
            804414 kvm:kvm_apic
           1339352 kvm:kvm_exit
            245922 kvm:kvm_inj_virq
                 0 kvm:kvm_inj_exception
             15856 kvm:kvm_page_fault
             39500 kvm:kvm_msr
                 1 kvm:kvm_cr
            179150 kvm:kvm_pic_set_irq
             98436 kvm:kvm_apic_ipi
            247430 kvm:kvm_apic_accept_irq
                 0 kvm:kvm_nested_vmrun
                 0 kvm:kvm_nested_intercepts
                 0 kvm:kvm_nested_vmexit
                 0 kvm:kvm_nested_vmexit_inject
                 0 kvm:kvm_nested_intr_vmexit
                 0 kvm:kvm_invlpga
                 0 kvm:kvm_skinit
            955410 kvm:kvm_emulate_insn
            182240 kvm:kvm_set_irq
            182562 kvm:kvm_ioapic_set_irq
                 0 kvm:kvm_msi_set_irq
            105267 kvm:kvm_ack_irq
           1113999 kvm:kvm_mmio
             37789 kvm:kvm_fpu
                 0 kvm:kvm_age_page
                 0 kvm:kvm_try_async_get_page
                 0 kvm:kvm_async_pf_not_present
                 0 kvm:kvm_async_pf_ready
                 0 kvm:kvm_async_pf_completed
                 0 kvm:kvm_async_pf_doublefault

      30.000779718 seconds time elapsed
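To compare the two runs, per-event deltas can be computed directly from the tables above; a small sketch over a few of the larger counters, with the counts copied verbatim from the pre-save and post-restore output:

```shell
# Delta of selected counters: post-restore minus pre-save.
# Columns: event, pre-save count, post-restore count (from the tables above).
deltas=$(awk '{ printf "%-22s %+d\n", $1, $3 - $2 }' <<'EOF'
kvm:kvm_entry 1222318 1327880
kvm:kvm_exit 1234173 1339352
kvm:kvm_mmio 971166 1113999
kvm:kvm_emulate_insn 853020 955410
kvm:kvm_page_fault 16028 15856
EOF
)
printf '%s\n' "$deltas"
```

This makes it easy to see which exit types grew after the restore: kvm_exit rises by roughly 105k over 30 seconds while kvm_page_fault barely moves.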

Thanks,
Zhang Haoyu
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Gleb Natapov Aug. 1, 2013, 6:16 a.m. UTC | #1
On Tue, Jul 30, 2013 at 09:04:56AM +0000, Zhanghaoyu (A) wrote:
> 
> >> >> hi all,
> >> >> 
> >> >> I met similar problem to these, while performing live migration or 
> >> >> save-restore test on the kvm platform (qemu:1.4.0, host:suse11sp2, 
> >> >> guest:suse11sp2), running tele-communication software suite in 
> >> >> guest, 
> >> >> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
> >> >> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
> >> >> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
> >> >> https://bugzilla.kernel.org/show_bug.cgi?id=58771
> >> >> 
> >> >> After live migration or virsh restore [savefile], one process's CPU 
> >> >> utilization went up by about 30%, resulted in throughput 
> >> >> degradation of this process.
> >> >> 
> >> >> If EPT disabled, this problem gone.
> >> >> 
> >> >> I suspect that kvm hypervisor has business with this problem.
> >> >> Based on above suspect, I want to find the two adjacent versions of 
> >> >> kvm-kmod which triggers this problem or not (e.g. 2.6.39, 3.0-rc1), 
> >> >> and analyze the differences between this two versions, or apply the 
> >> >> patches between this two versions by bisection method, finally find the key patches.
> >> >> 
> >> >> Any better ideas?
> >> >> 
> >> >> Thanks,
> >> >> Zhang Haoyu
> >> >
> >> >I've attempted to duplicate this on a number of machines that are as similar to yours as I am able to get my hands on, and so far have not been able to see any performance degradation. And from what I've read in the above links, huge pages do not seem to be part of the problem.
> >> >
> >> >So, if you are in a position to bisect the kernel changes, that would probably be the best avenue to pursue in my opinion.
> >> >
> >> >Bruce
> >> 
> >> I found the first bad 
> >> commit([612819c3c6e67bac8fceaa7cc402f13b1b63f7e4] KVM: propagate fault r/w information to gup(), allow read-only memory) which triggers this problem by git bisecting the kvm kernel (download from https://git.kernel.org/pub/scm/virt/kvm/kvm.git) changes.
> >> 
> >> And,
> >> git log 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 -n 1 -p > 
> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log
> >> git diff 
> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1..612819c3c6e67bac8fceaa7cc4
> >> 02f13b1b63f7e4 > 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff
> >> 
> >> Then, I diffed 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log and 
> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff,
> >> came to a conclusion that all of the differences between 
> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1 and 
> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4
> >> are contributed by no other than 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4, so this commit is the peace-breaker which directly or indirectly causes the degradation.
> >> 
> >> Does the map_writable flag passed to mmu_set_spte() function have effect on PTE's PAT flag or increase the VMEXITs induced by that guest tried to write read-only memory?
> >> 
> >> Thanks,
> >> Zhang Haoyu
> >> 
> >
> >There should be no read-only memory maps backing guest RAM.
> >
> >Can you confirm map_writable = false is being passed to __direct_map? (this should not happen, for guest RAM).
> >And if it is false, please capture the associated GFN.
> >
> I added below check and printk at the start of __direct_map() at the fist bad commit version,
> --- kvm-612819c3c6e67bac8fceaa7cc402f13b1b63f7e4/arch/x86/kvm/mmu.c     2013-07-26 18:44:05.000000000 +0800
> +++ kvm-612819/arch/x86/kvm/mmu.c       2013-07-31 00:05:48.000000000 +0800
> @@ -2223,6 +2223,9 @@ static int __direct_map(struct kvm_vcpu
>         int pt_write = 0;
>         gfn_t pseudo_gfn;
> 
> +        if (!map_writable)
> +                printk(KERN_ERR "%s: %s: gfn = %llu \n", __FILE__, __func__, gfn);
> +
>         for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
>                 if (iterator.level == level) {
>                         unsigned pte_access = ACC_ALL;
> 
> I virsh-save the VM, and then virsh-restore it, so many GFNs were printed, you can absolutely describe it as flooding.
> 
The flooding you see happens during the migrate-to-file stage because of dirty
page tracking. If you clear dmesg after virsh-save you should not see any
flooding after virsh-restore. I just checked with the latest tree, and I do not.


--
			Gleb.
Zhanghaoyu (A) Aug. 5, 2013, 8:35 a.m. UTC | #2
>> >> >> hi all,
>> >> >> 
>> >> >> I met similar problem to these, while performing live migration or 
>> >> >> save-restore test on the kvm platform (qemu:1.4.0, host:suse11sp2, 
>> >> >> guest:suse11sp2), running tele-communication software suite in 
>> >> >> guest, 
>> >> >> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
>> >> >> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
>> >> >> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
>> >> >> https://bugzilla.kernel.org/show_bug.cgi?id=58771
>> >> >> 
>> >> >> After live migration or virsh restore [savefile], one process's CPU 
>> >> >> utilization went up by about 30%, resulted in throughput 
>> >> >> degradation of this process.
>> >> >> 
>> >> >> If EPT disabled, this problem gone.
>> >> >> 
>> >> >> I suspect that kvm hypervisor has business with this problem.
>> >> >> Based on above suspect, I want to find the two adjacent versions of 
>> >> >> kvm-kmod which triggers this problem or not (e.g. 2.6.39, 3.0-rc1), 
>> >> >> and analyze the differences between this two versions, or apply the 
>> >> >> patches between this two versions by bisection method, finally find the key patches.
>> >> >> 
>> >> >> Any better ideas?
>> >> >> 
>> >> >> Thanks,
>> >> >> Zhang Haoyu
>> >> >
>> >> >I've attempted to duplicate this on a number of machines that are as similar to yours as I am able to get my hands on, and so far have not been able to see any performance degradation. And from what I've read in the above links, huge pages do not seem to be part of the problem.
>> >> >
>> >> >So, if you are in a position to bisect the kernel changes, that would probably be the best avenue to pursue in my opinion.
>> >> >
>> >> >Bruce
>> >> 
>> >> I found the first bad 
>> >> commit([612819c3c6e67bac8fceaa7cc402f13b1b63f7e4] KVM: propagate fault r/w information to gup(), allow read-only memory) which triggers this problem by git bisecting the kvm kernel (download from https://git.kernel.org/pub/scm/virt/kvm/kvm.git) changes.
>> >> 
>> >> And,
>> >> git log 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 -n 1 -p > 
>> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log
>> >> git diff 
>> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1..612819c3c6e67bac8fceaa7cc4
>> >> 02f13b1b63f7e4 > 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff
>> >> 
>> >> Then, I diffed 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log and 
>> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff,
>> >> came to a conclusion that all of the differences between 
>> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1 and 
>> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4
>> >> are contributed by no other than 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4, so this commit is the peace-breaker which directly or indirectly causes the degradation.
>> >> 
>> >> Does the map_writable flag passed to mmu_set_spte() function have effect on PTE's PAT flag or increase the VMEXITs induced by that guest tried to write read-only memory?
>> >> 
>> >> Thanks,
>> >> Zhang Haoyu
>> >> 
>> >
>> >There should be no read-only memory maps backing guest RAM.
>> >
>> >Can you confirm map_writable = false is being passed to __direct_map? (this should not happen, for guest RAM).
>> >And if it is false, please capture the associated GFN.
>> >
>> I added below check and printk at the start of __direct_map() at the first bad commit version,
>> --- kvm-612819c3c6e67bac8fceaa7cc402f13b1b63f7e4/arch/x86/kvm/mmu.c     2013-07-26 18:44:05.000000000 +0800
>> +++ kvm-612819/arch/x86/kvm/mmu.c       2013-07-31 00:05:48.000000000 +0800
>> @@ -2223,6 +2223,9 @@ static int __direct_map(struct kvm_vcpu
>>         int pt_write = 0;
>>         gfn_t pseudo_gfn;
>> 
>> +        if (!map_writable)
>> +                printk(KERN_ERR "%s: %s: gfn = %llu \n", __FILE__, __func__, gfn);
>> +
>>         for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
>>                 if (iterator.level == level) {
>>                         unsigned pte_access = ACC_ALL;
>> 
>> I virsh-save the VM, and then virsh-restore it, so many GFNs were printed, you can absolutely describe it as flooding.
>> 
>The flooding you see happens during migrate to file stage because of dirty
>page tracking. If you clear dmesg after virsh-save you should not see any
>flooding after virsh-restore. I just checked with latest tree, I do not.

I made a verification again.
I virsh-save the VM, during the saving stage, I run 'dmesg', no GFN printed, maybe the switching from running stage to pause stage takes so short time, 
no guest-write happens during this switching period.
After the completion of saving operation, I run 'dmesg -c' to clear the buffer all the same, then I virsh-restore the VM, so many GFNs are printed by running 'dmesg',
and I also run 'tail -f /var/log/messages' during the restoring stage, so many GFNs are flooded dynamically too.
I'm sure that the flooding happens during the virsh-restore stage, not the migration stage.

On VM's normal starting stage, only very few GFNs are printed, shown as below
gfn = 16
gfn = 604
gfn = 605
gfn = 606
gfn = 607
gfn = 608
gfn = 609

but on the VM's restoring stage, so many GFNs are printed, taking some examples shown as below,
2042600
2797777
2797778
2797779
2797780
2797781
2797782
2797783
2797784
2797785
2042602
2846482
2042603
2846483
2042606
2846485
2042607
2846486
2042610
2042611
2846489
2846490
2042614
2042615
2846493
2846494
2042617
2042618
2846497
2042621
2846498
2042622
2042625

Thanks,
Zhang Haoyu
Gleb Natapov Aug. 5, 2013, 8:43 a.m. UTC | #3
On Mon, Aug 05, 2013 at 08:35:09AM +0000, Zhanghaoyu (A) wrote:
> >> >> >> hi all,
> >> >> >> 
> >> >> >> I met similar problem to these, while performing live migration or 
> >> >> >> save-restore test on the kvm platform (qemu:1.4.0, host:suse11sp2, 
> >> >> >> guest:suse11sp2), running tele-communication software suite in 
> >> >> >> guest, 
> >> >> >> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
> >> >> >> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
> >> >> >> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
> >> >> >> https://bugzilla.kernel.org/show_bug.cgi?id=58771
> >> >> >> 
> >> >> >> After live migration or virsh restore [savefile], one process's CPU 
> >> >> >> utilization went up by about 30%, resulted in throughput 
> >> >> >> degradation of this process.
> >> >> >> 
> >> >> >> If EPT disabled, this problem gone.
> >> >> >> 
> >> >> >> I suspect that kvm hypervisor has business with this problem.
> >> >> >> Based on above suspect, I want to find the two adjacent versions of 
> >> >> >> kvm-kmod which triggers this problem or not (e.g. 2.6.39, 3.0-rc1), 
> >> >> >> and analyze the differences between this two versions, or apply the 
> >> >> >> patches between this two versions by bisection method, finally find the key patches.
> >> >> >> 
> >> >> >> Any better ideas?
> >> >> >> 
> >> >> >> Thanks,
> >> >> >> Zhang Haoyu
> >> >> >
> >> >> >I've attempted to duplicate this on a number of machines that are as similar to yours as I am able to get my hands on, and so far have not been able to see any performance degradation. And from what I've read in the above links, huge pages do not seem to be part of the problem.
> >> >> >
> >> >> >So, if you are in a position to bisect the kernel changes, that would probably be the best avenue to pursue in my opinion.
> >> >> >
> >> >> >Bruce
> >> >> 
> >> >> I found the first bad 
> >> >> commit([612819c3c6e67bac8fceaa7cc402f13b1b63f7e4] KVM: propagate fault r/w information to gup(), allow read-only memory) which triggers this problem by git bisecting the kvm kernel (download from https://git.kernel.org/pub/scm/virt/kvm/kvm.git) changes.
> >> >> 
> >> >> And,
> >> >> git log 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 -n 1 -p > 
> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log
> >> >> git diff 
> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1..612819c3c6e67bac8fceaa7cc4
> >> >> 02f13b1b63f7e4 > 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff
> >> >> 
> >> >> Then, I diffed 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log and 
> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff,
> >> >> came to a conclusion that all of the differences between 
> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1 and 
> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4
> >> >> are contributed by no other than 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4, so this commit is the peace-breaker which directly or indirectly causes the degradation.
> >> >> 
> >> >> Does the map_writable flag passed to mmu_set_spte() function have effect on PTE's PAT flag or increase the VMEXITs induced by that guest tried to write read-only memory?
> >> >> 
> >> >> Thanks,
> >> >> Zhang Haoyu
> >> >> 
> >> >
> >> >There should be no read-only memory maps backing guest RAM.
> >> >
> >> >Can you confirm map_writable = false is being passed to __direct_map? (this should not happen, for guest RAM).
> >> >And if it is false, please capture the associated GFN.
> >> >
> >> I added below check and printk at the start of __direct_map() at the fist bad commit version,
> >> --- kvm-612819c3c6e67bac8fceaa7cc402f13b1b63f7e4/arch/x86/kvm/mmu.c     2013-07-26 18:44:05.000000000 +0800
> >> +++ kvm-612819/arch/x86/kvm/mmu.c       2013-07-31 00:05:48.000000000 +0800
> >> @@ -2223,6 +2223,9 @@ static int __direct_map(struct kvm_vcpu
> >>         int pt_write = 0;
> >>         gfn_t pseudo_gfn;
> >> 
> >> +        if (!map_writable)
> >> +                printk(KERN_ERR "%s: %s: gfn = %llu \n", __FILE__, __func__, gfn);
> >> +
> >>         for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
> >>                 if (iterator.level == level) {
> >>                         unsigned pte_access = ACC_ALL;
> >> 
> >> I virsh-save the VM, and then virsh-restore it, so many GFNs were printed, you can absolutely describe it as flooding.
> >> 
> >The flooding you see happens during migrate to file stage because of dirty
> >page tracking. If you clear dmesg after virsh-save you should not see any
> >flooding after virsh-restore. I just checked with latest tree, I do not.
> 
> I made a verification again.
> I virsh-save the VM, during the saving stage, I run 'dmesg', no GFN printed, maybe the switching from running stage to pause stage takes so short time, 
> no guest-write happens during this switching period.
> After the completion of saving operation, I run 'demsg -c' to clear the buffer all the same, then I virsh-restore the VM, so many GFNs are printed by running 'dmesg',
> and I also run 'tail -f /var/log/messages' during the restoring stage, so many GFNs are flooded dynamically too.
> I'm sure that the flooding happens during the virsh-restore stage, not the migration stage.
> 
Interesting, is this with the upstream kernel? For me the situation is
exactly the opposite. What is your command line?
 
--
			Gleb.
Zhanghaoyu (A) Aug. 5, 2013, 9:09 a.m. UTC | #4
>> >> >> >> hi all,
>> >> >> >> 
>> >> >> >> I met similar problem to these, while performing live migration or 
>> >> >> >> save-restore test on the kvm platform (qemu:1.4.0, host:suse11sp2, 
>> >> >> >> guest:suse11sp2), running tele-communication software suite in 
>> >> >> >> guest, 
>> >> >> >> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
>> >> >> >> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
>> >> >> >> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
>> >> >> >> https://bugzilla.kernel.org/show_bug.cgi?id=58771
>> >> >> >> 
>> >> >> >> After live migration or virsh restore [savefile], one process's CPU 
>> >> >> >> utilization went up by about 30%, resulted in throughput 
>> >> >> >> degradation of this process.
>> >> >> >> 
>> >> >> >> If EPT disabled, this problem gone.
>> >> >> >> 
>> >> >> >> I suspect that kvm hypervisor has business with this problem.
>> >> >> >> Based on above suspect, I want to find the two adjacent versions of 
>> >> >> >> kvm-kmod which triggers this problem or not (e.g. 2.6.39, 3.0-rc1), 
>> >> >> >> and analyze the differences between this two versions, or apply the 
>> >> >> >> patches between this two versions by bisection method, finally find the key patches.
>> >> >> >> 
>> >> >> >> Any better ideas?
>> >> >> >> 
>> >> >> >> Thanks,
>> >> >> >> Zhang Haoyu
>> >> >> >
>> >> >> >I've attempted to duplicate this on a number of machines that are as similar to yours as I am able to get my hands on, and so far have not been able to see any performance degradation. And from what I've read in the above links, huge pages do not seem to be part of the problem.
>> >> >> >
>> >> >> >So, if you are in a position to bisect the kernel changes, that would probably be the best avenue to pursue in my opinion.
>> >> >> >
>> >> >> >Bruce
>> >> >> 
>> >> >> I found the first bad 
>> >> >> commit([612819c3c6e67bac8fceaa7cc402f13b1b63f7e4] KVM: propagate fault r/w information to gup(), allow read-only memory) which triggers this problem by git bisecting the kvm kernel (download from https://git.kernel.org/pub/scm/virt/kvm/kvm.git) changes.
>> >> >> 
>> >> >> And,
>> >> >> git log 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 -n 1 -p > 
>> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log
>> >> >> git diff 
>> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1..612819c3c6e67bac8fceaa7cc4
>> >> >> 02f13b1b63f7e4 > 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff
>> >> >> 
>> >> >> Then, I diffed 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.log and 
>> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4.diff,
>> >> >> came to a conclusion that all of the differences between 
>> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4~1 and 
>> >> >> 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4
>> >> >> are contributed by no other than 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4, so this commit is the peace-breaker which directly or indirectly causes the degradation.
>> >> >> 
>> >> >> Does the map_writable flag passed to mmu_set_spte() function have effect on PTE's PAT flag or increase the VMEXITs induced by that guest tried to write read-only memory?
>> >> >> 
>> >> >> Thanks,
>> >> >> Zhang Haoyu
>> >> >> 
>> >> >
>> >> >There should be no read-only memory maps backing guest RAM.
>> >> >
>> >> >Can you confirm map_writable = false is being passed to __direct_map? (this should not happen, for guest RAM).
>> >> >And if it is false, please capture the associated GFN.
>> >> >
>> >> I added below check and printk at the start of __direct_map() at the fist bad commit version,
>> >> --- kvm-612819c3c6e67bac8fceaa7cc402f13b1b63f7e4/arch/x86/kvm/mmu.c     2013-07-26 18:44:05.000000000 +0800
>> >> +++ kvm-612819/arch/x86/kvm/mmu.c       2013-07-31 00:05:48.000000000 +0800
>> >> @@ -2223,6 +2223,9 @@ static int __direct_map(struct kvm_vcpu
>> >>         int pt_write = 0;
>> >>         gfn_t pseudo_gfn;
>> >> 
>> >> +        if (!map_writable)
>> >> +                printk(KERN_ERR "%s: %s: gfn = %llu \n", __FILE__, __func__, gfn);
>> >> +
>> >>         for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
>> >>                 if (iterator.level == level) {
>> >>                         unsigned pte_access = ACC_ALL;
>> >> 
>> >> I virsh-save the VM, and then virsh-restore it, so many GFNs were printed, you can absolutely describe it as flooding.
>> >> 
>> >The flooding you see happens during migrate to file stage because of dirty
>> >page tracking. If you clear dmesg after virsh-save you should not see any
>> >flooding after virsh-restore. I just checked with latest tree, I do not.
>> 
>> I made a verification again.
>> I virsh-save the VM, during the saving stage, I run 'dmesg', no GFN printed, maybe the switching from running stage to pause stage takes so short time, 
>> no guest-write happens during this switching period.
>> After the completion of saving operation, I run 'demsg -c' to clear the buffer all the same, then I virsh-restore the VM, so many GFNs are printed by running 'dmesg',
>> and I also run 'tail -f /var/log/messages' during the restoring stage, so many GFNs are flooded dynamically too.
>> I'm sure that the flooding happens during the virsh-restore stage, not the migration stage.
>> 
>Interesting, is this with upstream kernel? For me the situation is
>exactly the opposite. What is your command line?
> 

I made the verification on the first bad commit (612819c3c6e67bac8fceaa7cc402f13b1b63f7e4), not on the upstream.
While building the upstream, I ran into a problem: I compiled and installed the upstream (commit e769ece3b129698d2b09811a6f6d304e4eaa8c29) on the sles11sp2 environment via the commands below,
cp /boot/config-3.0.13-0.27-default ./.config
yes "" | make oldconfig
make && make modules_install && make install
then I rebooted the host and selected the upstream kernel, but during boot the following problem happened:
Could not find /dev/disk/by-id/scsi-3600508e000000000864407c5b8f7ad01-part3 

I'm trying to resolve it.

The QEMU command line (/var/log/libvirt/qemu/[domain name].log),
LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ QEMU_AUDIO_DRV=none /usr/local/bin/qemu-system-x86_64 -name ATS1 -S -M pc-0.12 -cpu qemu32 -enable-kvm -m 12288 -smp 4,sockets=4,cores=1,threads=1 -uuid 0505ec91-382d-800e-2c79-e5b286eb60b5 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/ATS1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/opt/ne/vm/ATS1.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=20,id=hostnet0,vhost=on,vhostfd=21 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:e0:fc:00:0f:00,bus=pci.0,addr=0x3,bootindex=2 -netdev tap,fd=22,id=hostnet1,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:e0:fc:01:0f:00,bus=pci.0,addr=0x4 -netdev tap,fd=24,id=hostnet2,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:e0:fc:02:0f:00,bus=pci.0,addr=0x5 -netdev tap,fd=26,id=hostnet3,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet3,id=net3,mac=00:e0:fc:03:0f:00,bus=pci.0,addr=0x6 -netdev tap,fd=28,id=hostnet4,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet4,id=net4,mac=00:e0:fc:0a:0f:00,bus=pci.0,addr=0x7 -netdev tap,fd=30,id=hostnet5,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet5,id=net5,mac=00:e0:fc:0b:0f:00,bus=pci.0,addr=0x9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc *:0 -k en-us -vga cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0xb -watchdog-action poweroff -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa

Thanks,
Zhang Haoyu
Andreas Färber Aug. 5, 2013, 9:15 a.m. UTC | #5
Hi,

Am 05.08.2013 11:09, schrieb Zhanghaoyu (A):
> When I build the upstream, encounter a problem that I compile and install the upstream(commit: e769ece3b129698d2b09811a6f6d304e4eaa8c29) on sles11sp2 environment via below command
> cp /boot/config-3.0.13-0.27-default ./.config
> yes "" | make oldconfig
> make && make modules_install && make install
> then, I reboot the host, and select the upstream kernel, but during the starting stage, below problem happened,
> Could not find /dev/disk/by-id/scsi-3600508e000000000864407c5b8f7ad01-part3 
> 
> I'm trying to resolve it.

Possibly you need to enable loading unsupported kernel modules?
At least that's needed when testing a kmod with a SUSE kernel.
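For reference, the switch Andreas is describing lives in a modprobe configuration file on SLES 11 (the file path and option name are the ones mentioned later in this thread; treat the exact location as system-dependent):

```shell
# SLES 11 refuses to load modules without SUSE's "supported" tag by default.
# Setting the value to 1 permits loading a freshly built kvm-kmod.
conf=/etc/modprobe.d/unsupported-modules
grep allow_unsupported_modules "$conf"    # typically: allow_unsupported_modules 0
sed -i 's/^allow_unsupported_modules 0/allow_unsupported_modules 1/' "$conf"
```

After the change, the rebuilt modules can be loaded with insmod/modprobe as usual.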

Regards,
Andreas
Zhanghaoyu (A) Aug. 5, 2013, 9:22 a.m. UTC | #6
>Hi,
>
>Am 05.08.2013 11:09, schrieb Zhanghaoyu (A):
>> When I build the upstream, encounter a problem that I compile and 
>> install the upstream(commit: e769ece3b129698d2b09811a6f6d304e4eaa8c29) 
>> on sles11sp2 environment via below command cp 
>> /boot/config-3.0.13-0.27-default ./.config yes "" | make oldconfig 
>> make && make modules_install && make install then, I reboot the host, 
>> and select the upstream kernel, but during the starting stage, below 
>> problem happened, Could not find 
>> /dev/disk/by-id/scsi-3600508e000000000864407c5b8f7ad01-part3
>> 
>> I'm trying to resolve it.
>
>Possibly you need to enable loading unsupported kernel modules?
>At least that's needed when testing a kmod with a SUSE kernel.
>
I have tried to set " allow_unsupported_modules 1" in /etc/modprobe.d/unsupported-modules, but the problem still happened.
I replace the whole kernel with the kvm kernel, not only the kvm modules.

>Regards,
>Andreas
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov Aug. 5, 2013, 9:37 a.m. UTC | #7
On Mon, Aug 05, 2013 at 09:09:56AM +0000, Zhanghaoyu (A) wrote:
> The QEMU command line (/var/log/libvirt/qemu/[domain name].log),
> LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ QEMU_AUDIO_DRV=none /usr/local/bin/qemu-system-x86_64 -name ATS1 -S -M pc-0.12 -cpu qemu32 -enable-kvm -m 12288 -smp 4,sockets=4,cores=1,threads=1 -uuid 0505ec91-382d-800e-2c79-e5b286eb60b5 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/ATS1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/opt/ne/vm/ATS1.img,if=none,id=drive-virtio-disk0,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=20,id=hostnet0,vhost=on,vhostfd=21 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:e0:fc:00:0f:00,bus=pci.0,addr=0x3,bootindex=2 -netdev tap,fd=22,id=hostnet1,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:e0:fc:01:0f:00,bus=pci.0,addr=0x4 -netdev tap,fd=24,id=ho
 stnet2,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:e0:fc:02:0f:00,bus=pci.0,addr=0x5 -netdev tap,fd=26,id=hostnet3,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet3,id=net3,mac=00:e0:fc:03:0f:00,bus=pci.0,addr=0x6 -netdev tap,fd=28,id=hostnet4,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet4,id=net4,mac=00:e0:fc:0a:0f:00,bus=pci.0,addr=0x7 -netdev tap,fd=30,id=hostnet5,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet5,id=net5,mac=00:e0:fc:0b:0f:00,bus=pci.0,addr=0x9 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc *:0 -k en-us -vga cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0xb -watchdog-action poweroff -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa
> 
Which QEMU version is this? Can you try with e1000 NICs instead of
virtio?

--
			Gleb.
Zhanghaoyu (A) Aug. 6, 2013, 10:47 a.m. UTC | #8
>> The QEMU command line (/var/log/libvirt/qemu/[domain name].log), 
>> LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ QEMU_AUDIO_DRV=none 
>> /usr/local/bin/qemu-system-x86_64 -name ATS1 -S -M pc-0.12 -cpu qemu32 
>> -enable-kvm -m 12288 -smp 4,sockets=4,cores=1,threads=1 -uuid 
>> 0505ec91-382d-800e-2c79-e5b286eb60b5 -no-user-config -nodefaults 
>> -chardev 
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/ATS1.monitor,server,n
>> owait -mon chardev=charmonitor,id=monitor,mode=control -rtc 
>> base=localtime -no-shutdown -device 
>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
>> file=/opt/ne/vm/ATS1.img,if=none,id=drive-virtio-disk0,format=raw,cach
>> e=none -device 
>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id
>> =virtio-disk0,bootindex=1 -netdev 
>> tap,fd=20,id=hostnet0,vhost=on,vhostfd=21 -device 
>> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:e0:fc:00:0f:00,bus=pci.0
>> ,addr=0x3,bootindex=2 -netdev 
>> tap,fd=22,id=hostnet1,vhost=on,vhostfd=23 -device 
>> virtio-net-pci,netdev=hostnet1,id=net1,mac=00:e0:fc:01:0f:00,bus=pci.0
>> ,addr=0x4 -netdev tap,fd=24,id=hostnet2,vhost=on,vhostfd=25 -device 
>> virtio-net-pci,netdev=hostnet2,id=net2,mac=00:e0:fc:02:0f:00,bus=pci.0
>> ,addr=0x5 -netdev tap,fd=26,id=hostnet3,vhost=on,vhostfd=27 -device 
>> virtio-net-pci,netdev=hostnet3,id=net3,mac=00:e0:fc:03:0f:00,bus=pci.0
>> ,addr=0x6 -netdev tap,fd=28,id=hostnet4,vhost=on,vhostfd=29 -device 
>> virtio-net-pci,netdev=hostnet4,id=net4,mac=00:e0:fc:0a:0f:00,bus=pci.0
>> ,addr=0x7 -netdev tap,fd=30,id=hostnet5,vhost=on,vhostfd=31 -device 
>> virtio-net-pci,netdev=hostnet5,id=net5,mac=00:e0:fc:0b:0f:00,bus=pci.0
>> ,addr=0x9 -chardev pty,id=charserial0 -device 
>> isa-serial,chardev=charserial0,id=serial0 -vnc *:0 -k en-us -vga 
>> cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0xb 
>> -watchdog-action poweroff -device 
>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa
>> 
>Which QEMU version is this? Can you try with e1000 NICs instead of virtio?
>
This QEMU version is 1.0.0, but I also tested QEMU 1.5.2; the same problem exists, including the performance degradation and the read-only GFNs' flooding.
I also tried with e1000 NICs instead of virtio on QEMU 1.5.2; the performance degradation and the read-only GFNs' flooding still occurred.
With either e1000 or virtio NICs, the GFNs' flooding starts at the post-restore stage (i.e. the running stage): as soon as the restore completes, the flooding begins.

Thanks,
Zhang Haoyu

>--
>			Gleb.
Zhanghaoyu (A) Aug. 7, 2013, 1:34 a.m. UTC | #9
>>> The QEMU command line (/var/log/libvirt/qemu/[domain name].log), 
>>> LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ 
>>> QEMU_AUDIO_DRV=none
>>> /usr/local/bin/qemu-system-x86_64 -name ATS1 -S -M pc-0.12 -cpu 
>>> qemu32 -enable-kvm -m 12288 -smp 4,sockets=4,cores=1,threads=1 -uuid
>>> 0505ec91-382d-800e-2c79-e5b286eb60b5 -no-user-config -nodefaults 
>>> -chardev 
>>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/ATS1.monitor,server,
>>> n owait -mon chardev=charmonitor,id=monitor,mode=control -rtc 
>>> base=localtime -no-shutdown -device
>>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
>>> file=/opt/ne/vm/ATS1.img,if=none,id=drive-virtio-disk0,format=raw,cac
>>> h
>>> e=none -device
>>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,i
>>> d
>>> =virtio-disk0,bootindex=1 -netdev
>>> tap,fd=20,id=hostnet0,vhost=on,vhostfd=21 -device 
>>> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:e0:fc:00:0f:00,bus=pci.
>>> 0
>>> ,addr=0x3,bootindex=2 -netdev
>>> tap,fd=22,id=hostnet1,vhost=on,vhostfd=23 -device 
>>> virtio-net-pci,netdev=hostnet1,id=net1,mac=00:e0:fc:01:0f:00,bus=pci.
>>> 0
>>> ,addr=0x4 -netdev tap,fd=24,id=hostnet2,vhost=on,vhostfd=25 -device 
>>> virtio-net-pci,netdev=hostnet2,id=net2,mac=00:e0:fc:02:0f:00,bus=pci.
>>> 0
>>> ,addr=0x5 -netdev tap,fd=26,id=hostnet3,vhost=on,vhostfd=27 -device 
>>> virtio-net-pci,netdev=hostnet3,id=net3,mac=00:e0:fc:03:0f:00,bus=pci.
>>> 0
>>> ,addr=0x6 -netdev tap,fd=28,id=hostnet4,vhost=on,vhostfd=29 -device 
>>> virtio-net-pci,netdev=hostnet4,id=net4,mac=00:e0:fc:0a:0f:00,bus=pci.
>>> 0
>>> ,addr=0x7 -netdev tap,fd=30,id=hostnet5,vhost=on,vhostfd=31 -device 
>>> virtio-net-pci,netdev=hostnet5,id=net5,mac=00:e0:fc:0b:0f:00,bus=pci.
>>> 0
>>> ,addr=0x9 -chardev pty,id=charserial0 -device 
>>> isa-serial,chardev=charserial0,id=serial0 -vnc *:0 -k en-us -vga 
>>> cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0xb
>>> -watchdog-action poweroff -device
>>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa
>>> 
>>Which QEMU version is this? Can you try with e1000 NICs instead of virtio?
>>
>This QEMU version is 1.0.0, but I also test QEMU 1.5.2, the same problem exists, including the performance degradation and readonly GFNs' flooding.
>I tried with e1000 NICs instead of virtio, including the performance degradation and readonly GFNs' flooding, the QEMU version is 1.5.2.
>No matter e1000 NICs or virtio NICs, the GFNs' flooding is initiated at post-restore stage (i.e. running stage), as soon as the restoring completed, the flooding is starting.
>
>Thanks,
>Zhang Haoyu
>
>>--
>>			Gleb.

Should we focus on the first bad commit (612819c3c6e67bac8fceaa7cc402f13b1b63f7e4) and the surprising GFNs' flooding?

I applied the patch below to __direct_map(),
@@ -2223,6 +2223,8 @@ static int __direct_map(struct kvm_vcpu
        int pt_write = 0;
        gfn_t pseudo_gfn;

+        map_writable = true;
+
        for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
                if (iterator.level == level) {
                        unsigned pte_access = ACC_ALL;
and rebuilt the kvm-kmod, then re-insmodded it.
After I started a VM, the host became abnormal: many programs could not be started successfully, reporting segmentation faults.
In my opinion, with the above patch applied, commit 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 should have no effect, but the test result proved me wrong.
Does the way the map_writable value is obtained in hva_to_pfn() affect the result?

Thanks,
Zhang Haoyu
Gleb Natapov Aug. 7, 2013, 5:52 a.m. UTC | #10
On Wed, Aug 07, 2013 at 01:34:41AM +0000, Zhanghaoyu (A) wrote:
> >>> The QEMU command line (/var/log/libvirt/qemu/[domain name].log), 
> >>> LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ 
> >>> QEMU_AUDIO_DRV=none
> >>> /usr/local/bin/qemu-system-x86_64 -name ATS1 -S -M pc-0.12 -cpu 
> >>> qemu32 -enable-kvm -m 12288 -smp 4,sockets=4,cores=1,threads=1 -uuid
> >>> 0505ec91-382d-800e-2c79-e5b286eb60b5 -no-user-config -nodefaults 
> >>> -chardev 
> >>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/ATS1.monitor,server,
> >>> n owait -mon chardev=charmonitor,id=monitor,mode=control -rtc 
> >>> base=localtime -no-shutdown -device
> >>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
> >>> file=/opt/ne/vm/ATS1.img,if=none,id=drive-virtio-disk0,format=raw,cac
> >>> h
> >>> e=none -device
> >>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,i
> >>> d
> >>> =virtio-disk0,bootindex=1 -netdev
> >>> tap,fd=20,id=hostnet0,vhost=on,vhostfd=21 -device 
> >>> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:e0:fc:00:0f:00,bus=pci.
> >>> 0
> >>> ,addr=0x3,bootindex=2 -netdev
> >>> tap,fd=22,id=hostnet1,vhost=on,vhostfd=23 -device 
> >>> virtio-net-pci,netdev=hostnet1,id=net1,mac=00:e0:fc:01:0f:00,bus=pci.
> >>> 0
> >>> ,addr=0x4 -netdev tap,fd=24,id=hostnet2,vhost=on,vhostfd=25 -device 
> >>> virtio-net-pci,netdev=hostnet2,id=net2,mac=00:e0:fc:02:0f:00,bus=pci.
> >>> 0
> >>> ,addr=0x5 -netdev tap,fd=26,id=hostnet3,vhost=on,vhostfd=27 -device 
> >>> virtio-net-pci,netdev=hostnet3,id=net3,mac=00:e0:fc:03:0f:00,bus=pci.
> >>> 0
> >>> ,addr=0x6 -netdev tap,fd=28,id=hostnet4,vhost=on,vhostfd=29 -device 
> >>> virtio-net-pci,netdev=hostnet4,id=net4,mac=00:e0:fc:0a:0f:00,bus=pci.
> >>> 0
> >>> ,addr=0x7 -netdev tap,fd=30,id=hostnet5,vhost=on,vhostfd=31 -device 
> >>> virtio-net-pci,netdev=hostnet5,id=net5,mac=00:e0:fc:0b:0f:00,bus=pci.
> >>> 0
> >>> ,addr=0x9 -chardev pty,id=charserial0 -device 
> >>> isa-serial,chardev=charserial0,id=serial0 -vnc *:0 -k en-us -vga 
> >>> cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0xb
> >>> -watchdog-action poweroff -device
> >>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa
> >>> 
> >>Which QEMU version is this? Can you try with e1000 NICs instead of virtio?
> >>
> >This QEMU version is 1.0.0, but I also test QEMU 1.5.2, the same problem exists, including the performance degradation and readonly GFNs' flooding.
> >I tried with e1000 NICs instead of virtio, including the performance degradation and readonly GFNs' flooding, the QEMU version is 1.5.2.
> >No matter e1000 NICs or virtio NICs, the GFNs' flooding is initiated at post-restore stage (i.e. running stage), as soon as the restoring completed, the flooding is starting.
> >
> >Thanks,
> >Zhang Haoyu
> >
> >>--
> >>			Gleb.
> 
> Should we focus on the first bad commit(612819c3c6e67bac8fceaa7cc402f13b1b63f7e4) and the surprising GFNs' flooding?
> 
Not really. There is no point in debugging a very old version compiled
with kvm-kmod; there are too many variables in the environment. I cannot
reproduce the GFN flooding on upstream, so the problem may be gone, may
be a result of a kvm-kmod problem, or something different in how I invoke
qemu. So the best way to proceed is for you to reproduce with the upstream
version; then at least I will be sure that we are using the same code.

> I applied below patch to  __direct_map(), 
> @@ -2223,6 +2223,8 @@ static int __direct_map(struct kvm_vcpu
>         int pt_write = 0;
>         gfn_t pseudo_gfn;
> 
> +        map_writable = true;
> +
>         for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
>                 if (iterator.level == level) {
>                         unsigned pte_access = ACC_ALL;
> and rebuild the kvm-kmod, then re-insmod it.
> After I started a VM, the host seemed to be abnormal, so many programs cannot be started successfully, segmentation fault is reported.
> In my opinion, after above patch applied, the commit: 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 should be of no effect, but the test result proved me wrong.
> Dose the map_writable value's getting process in hva_to_pfn() have effect on the result?
> 
If hva_to_pfn() returns map_writable == false it means that page is
mapped as read only on primary MMU, so it should not be mapped writable
on secondary MMU either. This should not happen usually.

--
			Gleb.
Zhanghaoyu (A) Aug. 14, 2013, 9:05 a.m. UTC | #11
>> >>> The QEMU command line (/var/log/libvirt/qemu/[domain name].log), 
>> >>> LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ 
>> >>> QEMU_AUDIO_DRV=none
>> >>> /usr/local/bin/qemu-system-x86_64 -name ATS1 -S -M pc-0.12 -cpu 
>> >>> qemu32 -enable-kvm -m 12288 -smp 4,sockets=4,cores=1,threads=1 -uuid
>> >>> 0505ec91-382d-800e-2c79-e5b286eb60b5 -no-user-config -nodefaults 
>> >>> -chardev 
>> >>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/ATS1.monitor,server,
>> >>> n owait -mon chardev=charmonitor,id=monitor,mode=control -rtc 
>> >>> base=localtime -no-shutdown -device
>> >>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
>> >>> file=/opt/ne/vm/ATS1.img,if=none,id=drive-virtio-disk0,format=raw,cac
>> >>> h
>> >>> e=none -device
>> >>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,i
>> >>> d
>> >>> =virtio-disk0,bootindex=1 -netdev
>> >>> tap,fd=20,id=hostnet0,vhost=on,vhostfd=21 -device 
>> >>> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:e0:fc:00:0f:00,bus=pci.
>> >>> 0
>> >>> ,addr=0x3,bootindex=2 -netdev
>> >>> tap,fd=22,id=hostnet1,vhost=on,vhostfd=23 -device 
>> >>> virtio-net-pci,netdev=hostnet1,id=net1,mac=00:e0:fc:01:0f:00,bus=pci.
>> >>> 0
>> >>> ,addr=0x4 -netdev tap,fd=24,id=hostnet2,vhost=on,vhostfd=25 -device 
>> >>> virtio-net-pci,netdev=hostnet2,id=net2,mac=00:e0:fc:02:0f:00,bus=pci.
>> >>> 0
>> >>> ,addr=0x5 -netdev tap,fd=26,id=hostnet3,vhost=on,vhostfd=27 -device 
>> >>> virtio-net-pci,netdev=hostnet3,id=net3,mac=00:e0:fc:03:0f:00,bus=pci.
>> >>> 0
>> >>> ,addr=0x6 -netdev tap,fd=28,id=hostnet4,vhost=on,vhostfd=29 -device 
>> >>> virtio-net-pci,netdev=hostnet4,id=net4,mac=00:e0:fc:0a:0f:00,bus=pci.
>> >>> 0
>> >>> ,addr=0x7 -netdev tap,fd=30,id=hostnet5,vhost=on,vhostfd=31 -device 
>> >>> virtio-net-pci,netdev=hostnet5,id=net5,mac=00:e0:fc:0b:0f:00,bus=pci.
>> >>> 0
>> >>> ,addr=0x9 -chardev pty,id=charserial0 -device 
>> >>> isa-serial,chardev=charserial0,id=serial0 -vnc *:0 -k en-us -vga 
>> >>> cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0xb
>> >>> -watchdog-action poweroff -device
>> >>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa
>> >>> 
>> >>Which QEMU version is this? Can you try with e1000 NICs instead of virtio?
>> >>
>> >This QEMU version is 1.0.0, but I also test QEMU 1.5.2, the same problem exists, including the performance degradation and readonly GFNs' flooding.
>> >I tried with e1000 NICs instead of virtio, including the performance degradation and readonly GFNs' flooding, the QEMU version is 1.5.2.
>> >No matter e1000 NICs or virtio NICs, the GFNs' flooding is initiated at post-restore stage (i.e. running stage), as soon as the restoring completed, the flooding is starting.
>> >
>> >Thanks,
>> >Zhang Haoyu
>> >
>> >>--
>> >>			Gleb.
>> 
>> Should we focus on the first bad commit(612819c3c6e67bac8fceaa7cc402f13b1b63f7e4) and the surprising GFNs' flooding?
>> 
>Not really. There is no point in debugging very old version compiled
>with kvm-kmod, there are to many variables in the environment. I cannot
>reproduce the GFN flooding on upstream, so the problem may be gone, may
>be a result of kvm-kmod problem or something different in how I invoke
>qemu. So the best way to proceed is for you to reproduce with upstream
>version then at least I will be sure that we are using the same code.
>

Thanks, I will test combinations of the upstream kvm kernel and upstream qemu.
Also, the guest OS version I gave above was wrong; the currently running guest OS is SLES10SP4.

Thanks,
Zhang Haoyu

>> I applied below patch to  __direct_map(), 
>> @@ -2223,6 +2223,8 @@ static int __direct_map(struct kvm_vcpu
>>         int pt_write = 0;
>>         gfn_t pseudo_gfn;
>> 
>> +        map_writable = true;
>> +
>>         for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
>>                 if (iterator.level == level) {
>>                         unsigned pte_access = ACC_ALL;
>> and rebuild the kvm-kmod, then re-insmod it.
>> After I started a VM, the host seemed to be abnormal, so many programs cannot be started successfully, segmentation fault is reported.
>> In my opinion, after above patch applied, the commit: 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 should be of no effect, but the test result proved me wrong.
>> Dose the map_writable value's getting process in hva_to_pfn() have effect on the result?
>> 
>If hva_to_pfn() returns map_writable == false it means that page is
>mapped as read only on primary MMU, so it should not be mapped writable
>on secondary MMU either. This should not happen usually.
>
>--
>			Gleb.
Zhanghaoyu (A) Aug. 20, 2013, 1:33 p.m. UTC | #12
>>> >>> The QEMU command line (/var/log/libvirt/qemu/[domain name].log), 
>>> >>> LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ 
>>> >>> QEMU_AUDIO_DRV=none
>>> >>> /usr/local/bin/qemu-system-x86_64 -name ATS1 -S -M pc-0.12 -cpu
>>> >>> qemu32 -enable-kvm -m 12288 -smp 4,sockets=4,cores=1,threads=1 
>>> >>> -uuid
>>> >>> 0505ec91-382d-800e-2c79-e5b286eb60b5 -no-user-config -nodefaults 
>>> >>> -chardev 
>>> >>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/ATS1.monitor,ser
>>> >>> ver, n owait -mon chardev=charmonitor,id=monitor,mode=control 
>>> >>> -rtc base=localtime -no-shutdown -device
>>> >>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
>>> >>> file=/opt/ne/vm/ATS1.img,if=none,id=drive-virtio-disk0,format=raw
>>> >>> ,cac
>>> >>> h
>>> >>> e=none -device
>>> >>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-dis
>>> >>> k0,i
>>> >>> d
>>> >>> =virtio-disk0,bootindex=1 -netdev
>>> >>> tap,fd=20,id=hostnet0,vhost=on,vhostfd=21 -device 
>>> >>> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:e0:fc:00:0f:00,bus=pci.
>>> >>> 0
>>> >>> ,addr=0x3,bootindex=2 -netdev
>>> >>> tap,fd=22,id=hostnet1,vhost=on,vhostfd=23 -device 
>>> >>> virtio-net-pci,netdev=hostnet1,id=net1,mac=00:e0:fc:01:0f:00,bus=pci.
>>> >>> 0
>>> >>> ,addr=0x4 -netdev tap,fd=24,id=hostnet2,vhost=on,vhostfd=25 
>>> >>> -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:e0:fc:02:0f:00,bus=pci.
>>> >>> 0
>>> >>> ,addr=0x5 -netdev tap,fd=26,id=hostnet3,vhost=on,vhostfd=27 
>>> >>> -device virtio-net-pci,netdev=hostnet3,id=net3,mac=00:e0:fc:03:0f:00,bus=pci.
>>> >>> 0
>>> >>> ,addr=0x6 -netdev tap,fd=28,id=hostnet4,vhost=on,vhostfd=29 
>>> >>> -device virtio-net-pci,netdev=hostnet4,id=net4,mac=00:e0:fc:0a:0f:00,bus=pci.
>>> >>> 0
>>> >>> ,addr=0x7 -netdev tap,fd=30,id=hostnet5,vhost=on,vhostfd=31 
>>> >>> -device virtio-net-pci,netdev=hostnet5,id=net5,mac=00:e0:fc:0b:0f:00,bus=pci.
>>> >>> 0
>>> >>> ,addr=0x9 -chardev pty,id=charserial0 -device 
>>> >>> isa-serial,chardev=charserial0,id=serial0 -vnc *:0 -k en-us -vga 
>>> >>> cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0xb
>>> >>> -watchdog-action poweroff -device 
>>> >>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa
>>> >>> 
>>> >>Which QEMU version is this? Can you try with e1000 NICs instead of virtio?
>>> >>
>>> >This QEMU version is 1.0.0, but I also test QEMU 1.5.2, the same problem exists, including the performance degradation and readonly GFNs' flooding.
>>> >I tried with e1000 NICs instead of virtio, including the performance degradation and readonly GFNs' flooding, the QEMU version is 1.5.2.
>>> >No matter e1000 NICs or virtio NICs, the GFNs' flooding is initiated at post-restore stage (i.e. running stage), as soon as the restoring completed, the flooding is starting.
>>> >
>>> >Thanks,
>>> >Zhang Haoyu
>>> >
>>> >>--
>>> >>			Gleb.
>>> 
>>> Should we focus on the first bad commit(612819c3c6e67bac8fceaa7cc402f13b1b63f7e4) and the surprising GFNs' flooding?
>>> 
>>Not really. There is no point in debugging very old version compiled 
>>with kvm-kmod, there are to many variables in the environment. I cannot 
>>reproduce the GFN flooding on upstream, so the problem may be gone, may 
>>be a result of kvm-kmod problem or something different in how I invoke 
>>qemu. So the best way to proceed is for you to reproduce with upstream 
>>version then at least I will be sure that we are using the same code.
>>
>Thanks, I will test the combos of upstream kvm kernel and upstream qemu.
>And, the guest os version above I said was wrong, current running guest os is SLES10SP4.
>

I tested the following combinations of QEMU and kernel:
+-----------------+-----------------+-----------------+
|  kvm kernel     |      QEMU       |   test result   |
+-----------------+-----------------+-----------------+
|  kvm-3.11-2     |   qemu-1.5.2    |      GOOD       |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |   qemu-1.0.0    |      BAD        |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |   qemu-1.4.0    |      BAD        |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |   qemu-1.4.2    |      BAD        |
+-----------------+-----------------+-----------------+
|  SLES11SP2      | qemu-1.5.0-rc0  |      GOOD       |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |   qemu-1.5.0    |      GOOD       |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |   qemu-1.5.1    |      GOOD       |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |   qemu-1.5.2    |      GOOD       |
+-----------------+-----------------+-----------------+
NOTE:
1. kvm-3.11-2 in the table above is the whole tagged kernel, downloaded from https://git.kernel.org/pub/scm/virt/kvm/kvm.git
2. SLES11SP2's kernel version is 3.0.13-0.27

Then I git-bisected the QEMU changes between qemu-1.4.2 and qemu-1.5.0-rc0, marking the good versions as bad and the bad versions as good,
so the "first bad commit" reported by bisect is actually the patch which fixes the degradation problem.
+------------+-------------------------------------------+-----------------+-----------------+
| bisect No. |                  commit                   |  save-restore   |    migration    |
+------------+-------------------------------------------+-----------------+-----------------+
|      1     | 03e94e39ce5259efdbdeefa1f249ddb499d57321  |      BAD        |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
|      2     | 99835e00849369bab726a4dc4ceed1f6f9ed967c  |      GOOD       |       GOOD      |
+------------+-------------------------------------------+-----------------+-----------------+
|      3     | 62e1aeaee4d0450222a0ea43c713b59526e3e0fe  |      BAD        |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
|      4     | 9d9801cf803cdceaa4845fe27150b24d5ab083e6  |      BAD        |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
|      5     | d76bb73549fcac07524aea5135280ea533a94fd6  |      BAD        |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
|      6     | d913829f0fd8451abcb1fd9d6dfce5586d9d7e10  |      GOOD       |       GOOD      |
+------------+-------------------------------------------+-----------------+-----------------+
|      7     | d2f38a0acb0a1c5b7ab7621a32d603d08d513bea  |      BAD        |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
|      8     | e344b8a16de429ada3d9126f26e2a96d71348356  |      BAD        |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
|      9     | 56ded708ec38e4cb75a7c7357480ca34c0dc6875  |      BAD        |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
|      10    | 78d07ae7ac74bcc7f79aeefbaff17fb142f44b4d  |      BAD        |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
|      11    | 70c8652bf3c1fea79b7b68864e86926715c49261  |      GOOD       |       GOOD      |
+------------+-------------------------------------------+-----------------+-----------------+
|      12    | f1c72795af573b24a7da5eb52375c9aba8a37972  |      GOOD       |       GOOD      |
+------------+-------------------------------------------+-----------------+-----------------+
NOTE: above tests were made on SLES11SP2.

So, the commit f1c72795af573b24a7da5eb52375c9aba8a37972 is just the patch which fixes the degradation.

Then I replaced SLES11SP2's default kvm-kmod with kvm-kmod-3.6 and applied the patch below to __direct_map(),
@@ -2599,6 +2599,9 @@ static int __direct_map(struct kvm_vcpu
        int emulate = 0;
        gfn_t pseudo_gfn;

+        if (!map_writable)
+                printk(KERN_ERR "%s: %s: gfn = %llu \n", __FILE__, __func__, gfn);
+
        for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
                if (iterator.level == level) {
                        unsigned pte_access = ACC_ALL;
Then I rebuilt the kvm-kmod, re-insmodded it, and tested the adjacent commits again; the results are shown below:
+------------+-------------------------------------------+-----------------+-----------------+
| bisect No. |                  commit                   |  save-restore   |    migration    |
+------------+-------------------------------------------+-----------------+-----------------+
|      10    | 78d07ae7ac74bcc7f79aeefbaff17fb142f44b4d  |      BAD        |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
|      12    | f1c72795af573b24a7da5eb52375c9aba8a37972  |      GOOD       |       BAD       |
+------------+-------------------------------------------+-----------------+-----------------+
While testing commit 78d07ae7ac74bcc7f79aeefbaff17fb142f44b4d, as soon as the restoration/migration completed, the GFN flooding started;
some example GFNs are shown below:
2073462
2857203
2073463
2073464
2073465
3218751
2073466
2857206
2857207
2073467
2073468
2857210
2857211
3218752
2857214
2857215
3218753
2857217
2857218
2857221
2857222
3218754
2857225
2857226
3218755
2857229
2857230
2857232
2857233
3218756
2780393
2780394
2857236
2780395
2857237
2780396
2780397
2780398
2780399
2780400
2780401
3218757
2857240
2857241
2857244
3218758
2857247
2857248
2857251
2857252
3218759
2857255
2857256
3218760
2857289
2857290
2857293
2857294
3218761
2857297
2857298
3218762
3218763
3218764
3218765
3218766
3218767
3218768
3218769
3218770
3218771
3218772

But after a period of time, the flooding rate slowed down.

While testing commit f1c72795af573b24a7da5eb52375c9aba8a37972, no GFN was printed after restoration and there was no performance degradation,
but as soon as live migration completed, the GFN flooding started and the performance degradation also appeared.

NOTE: the test results for commit f1c72795af573b24a7da5eb52375c9aba8a37972 seemed to be unstable; I will verify them again.


>Thanks,
>Zhang Haoyu
>
>>> I applied below patch to  __direct_map(), @@ -2223,6 +2223,8 @@ 
>>> static int __direct_map(struct kvm_vcpu
>>>         int pt_write = 0;
>>>         gfn_t pseudo_gfn;
>>> 
>>> +        map_writable = true;
>>> +
>>>         for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
>>>                 if (iterator.level == level) {
>>>                         unsigned pte_access = ACC_ALL; and rebuild 
>>> the kvm-kmod, then re-insmod it.
>>> After I started a VM, the host seemed to be abnormal, so many programs cannot be started successfully, segmentation fault is reported.
>>> In my opinion, after above patch applied, the commit: 612819c3c6e67bac8fceaa7cc402f13b1b63f7e4 should be of no effect, but the test result proved me wrong.
>>> Dose the map_writable value's getting process in hva_to_pfn() have effect on the result?
>>> 
>>If hva_to_pfn() returns map_writable == false it means that page is 
>>mapped as read only on primary MMU, so it should not be mapped writable 
>>on secondary MMU either. This should not happen usually.
>>
>>--
>>			Gleb.

Patch

--- kvm-612819c3c6e67bac8fceaa7cc402f13b1b63f7e4/arch/x86/kvm/mmu.c     2013-07-26 18:44:05.000000000 +0800
+++ kvm-612819/arch/x86/kvm/mmu.c       2013-07-31 00:05:48.000000000 +0800
@@ -2223,6 +2223,9 @@  static int __direct_map(struct kvm_vcpu
        int pt_write = 0;
        gfn_t pseudo_gfn;

+        if (!map_writable)
+                printk(KERN_ERR "%s: %s: gfn = %llu \n", __FILE__, __func__, gfn);
+
        for_each_shadow_entry(vcpu, (u64)gfn << PAGE_SHIFT, iterator) {
                if (iterator.level == level) {
                        unsigned pte_access = ACC_ALL;