
List:       kvm-ppc
Subject:    Any comments? Re: [RFC][PATCH 0/12] KVM, x86, ppc, asm-generic: moving
From:       Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Date:       2010-05-24 7:05:29
Message-ID: 4BFA2539.3030709@oss.ntt.co.jp

(2010/05/17 18:06), Takuya Yoshikawa wrote:
>
>> User allocated bitmaps have the advantage of reducing pinned memory.
>> However we have plenty more pinned memory allocated in memory slots, so
>> by itself, user allocated bitmaps don't justify this change.

Sorry for pinging several times.

>
> In that sense, what do you think about the question I sent last week?
>
> === REPOST 1 ===
>  >>
>  >> mark_page_dirty is called with the mmu_lock spinlock held in set_spte.
>  >> Must find a way to move it outside of the spinlock section.

I am now trying to do something to solve this spinlock problem, but the
spinlock section looks too wide to handle with a simple workaround.
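
To show what I mean, here is a rough sketch of the problematic path
(simplified from the real code; the user-space part is purely hypothetical):

/*
 * Simplified sketch, not the actual KVM code: set_spte() calls
 * mark_page_dirty() while kvm->mmu_lock (a spinlock) is held.  With a
 * kernel-allocated bitmap a plain set_bit() is fine here, but a
 * user-allocated bitmap would need a user access that can fault and
 * sleep, which is not allowed inside a spinlock section.
 */
static void mark_page_dirty_sketch(struct kvm *kvm, gfn_t gfn)
{
        struct kvm_memory_slot *memslot = gfn_to_memslot(kvm, gfn);

        if (memslot && memslot->dirty_bitmap) {
                unsigned long rel_gfn = gfn - memslot->base_gfn;

                /* kernel bitmap: safe under mmu_lock */
                set_bit(rel_gfn, memslot->dirty_bitmap);

                /*
                 * user bitmap (hypothetical):
                 *     set_bit_user(rel_gfn, memslot->dirty_bitmap_user);
                 * may fault, so it first has to be moved outside of the
                 * mmu_lock section.
                 */
        }
}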

> Sorry, but I have to say that the mmu_lock spinlock problem was completely
> out of my mind. Although I looked through the code, it does not seem easy
> to move set_bit_user outside of the spinlock section without breaking the
> semantics of its protection.
>
> So this may take some time to solve.
>
> But personally, I want to do something about x86's "vmalloc() every time"
> problem even though moving dirty bitmaps to user space cannot be achieved
> soon.
>
> In that sense, do you mind if we do double buffering without moving the
> dirty bitmaps to user space?
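
To make the double buffering idea concrete, here is a rough sketch of what
I have in mind (the field and function names are made up for illustration;
this is not a patch):

/*
 * Hypothetical sketch of double buffering, not a patch.  Each slot keeps
 * two preallocated kernel bitmaps and an index saying which one vcpus
 * currently write into.  At GET_DIRTY_LOG time we only flip the index
 * and copy out the old bitmap, instead of vmalloc()ing a new bitmap
 * every time.
 */
struct dirty_log_buffers {
        unsigned long *bitmap[2];       /* two preallocated bitmaps */
        int active;                     /* the one vcpus write into */
};

static int switch_and_copy_dirty_log(struct kvm *kvm,
                                     struct dirty_log_buffers *buf,
                                     unsigned long __user *user_bitmap,
                                     unsigned long bytes)
{
        unsigned long *old;

        spin_lock(&kvm->mmu_lock);
        old = buf->bitmap[buf->active];
        buf->active ^= 1;               /* vcpus now dirty the other one */
        spin_unlock(&kvm->mmu_lock);

        /* write protection of the slot would also happen around here */

        if (copy_to_user(user_bitmap, old, bytes))
                return -EFAULT;

        memset(old, 0, bytes);          /* clean it for the next switch */
        return 0;
}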

So I would be happy if you could give me any comments about these kinds of
alternative options.

Thanks,
   Takuya


>
> I know that vmalloc() resources are precious on x86, but even now, at the
> time of get_dirty_log, we use the same amount of memory as double buffering
> would.
> === 1 END ===
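
(For reference, this is roughly what the current x86 path does at
KVM_GET_DIRTY_LOG time -- a from-memory pseudocode sketch, not the exact
code -- and why the peak memory usage already equals that of two bitmaps:)

        /* rough sketch of the current x86 behaviour, details omitted */
        if (is_dirty) {
                /* a second bitmap is allocated on every call ... */
                new_bitmap = vmalloc(n);
                memset(new_bitmap, 0, n);

                spin_lock(&kvm->mmu_lock);
                kvm_mmu_slot_remove_write_access(kvm, log->slot);
                spin_unlock(&kvm->mmu_lock);

                /* ... swapped into the slot, and the old one is handed
                 * to user space and then freed */
                old_bitmap = memslot->dirty_bitmap;
                memslot->dirty_bitmap = new_bitmap;
                if (copy_to_user(log->dirty_bitmap, old_bitmap, n))
                        return -EFAULT;
                vfree(old_bitmap);
        }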
>
>
>>
>> Perhaps if we optimize memory slot write protection (I have some ideas
>> about this) we can make the performance improvement more pronounced.
>>
>
> That would be really nice!
>
> Even now we can measure the performance improvement by introducing a switch
> ioctl when the guest is relatively idle, so the combination will be really
> effective!
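
(To clarify what I mean by "switch ioctl": the user-space side could look
roughly like the sketch below.  The ioctl name, request number and structure
are made up here purely for illustration; they are not part of the current
KVM ABI.)

#include <stdint.h>
#include <sys/ioctl.h>

struct kvm_switch_dirty_log {           /* made-up structure */
        uint32_t slot;
        uint32_t active;                /* buffer index the kernel uses next */
};

/* made-up request number, defined only so the sketch is self-contained */
#define KVM_SWITCH_DIRTY_LOG  _IOWR('k', 0xf0, struct kvm_switch_dirty_log)

/*
 * One migration pass: ask the kernel to switch to the other buffer, then
 * scan the buffer it just stopped writing into.  No allocation and no
 * bitmap copy happen in the kernel on this path.
 */
static int dirty_log_pass(int vm_fd, uint32_t slot)
{
        struct kvm_switch_dirty_log log = { .slot = slot };

        if (ioctl(vm_fd, KVM_SWITCH_DIRTY_LOG, &log) < 0)
                return -1;

        /* log.active tells us which user-space bitmap to scan now */
        return (int)log.active;
}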
>
> === REPOST 2 ===
>  >>
>  >> Can you post such a test, for an idle large guest?
>  >
>  > OK, I'll do!
>
>
> Results of the "low workload test" (running top during migration) first:
>
> 4GB guest
> picked up slots[1](len=3757047808) only
> *****************************************
> get.org  get.opt  switch.opt
>
> 1060875   310292      190335
> 1076754   301295      188600
>  655504   318284      196029
>  529769   301471         325
>  694796    70216      221172
>  651868   353073      196184
>  543339   312865      213236
> 1061938    72785      203090
>  689527   323901      249519
>  621364   323881         473
> 1063671    70703      192958
>  915903   336318      174008
> 1046462   332384         782
> 1037942    72783      190655
>  680122   318305      243544
>  688156   314935      193526
>  558658   265934      190550
>  652454   372135      196270
>  660140    68613         352
> 1101947   378642      186575
>     ...      ...         ...
> *****************************************
>
> As expected, the difference shows up more clearly here.
>
> In this case, switch.opt reduced the time by about one third (roughly
> 0.1 msec) per iteration compared to get.opt.
>
> And when the slot is cleaner, the ratio becomes even bigger.
> === 2 END ===
