List:       xen-users
Subject:    Re[2]: User domain starts with a crash loop when memory configured is above 500GB
From:       "Robert Polasek" <polasekr () gmail ! com>
Date:       2023-09-20 19:35:16
Message-ID: emdb128c6d-0c65-4d3b-8992-588c271a583a () c9023129 ! com

Thank you, Juergen, for the tip. It was dead on. I made those changes, and
I can now boot user domains larger than 500GB. The kernel crash messages
have gone away as well.

Cheers,
Robert


------ Original Message ------
From "Juergen Gross" <jgross@suse.com>
To "Robert Polasek" <polasekr@gmail.com>; xen-users@lists.xenproject.org
Date 2023-09-19 10:35:24
Subject Re: User domain starts with a crash loop when memory configured 
is above 500GB

> On 18.09.23 17:34, Robert Polasek wrote:
> > Hi everybody,
> > 
> > I have a server with 760GB of RAM. Only domain 0 is running there, with
> > 16GB of RAM assigned to it.
> > 
> > Here is the configuration for my user domain:
> > 
> > name = "node01"
> > kernel = "/boot/vmlinuz-5.15.0-82-generic"
> > root = "/dev/xvda"
> > memory = 614400
> > maxmem = 614400
> > vcpus = 32
> > maxvcpus = 32
> > disk = ['file:/vserver/images/node01.img,xvda,w']
> > vif = ['bridge=virbr0,mac=00:16:3e:01:01:02']
> > iommu = "soft"
> > swiotlb = "force"
> > pci_permissive = 1
> > pci = ['0000:3e:00.0','0000:3f:00.0','0000:40:00.0','0000:41:00.0','0000:b1:00.0','0000:b2:00.0']
> >  
> > nics = 1
> > dhcp = "off"
> > ip = "192.168.122.15"
> > netmask = "255.255.255.0"
> > gateway = "192.168.122.1"
> > hostname = "node01"
> > 
> > extra="3"
> > 
> > When I try to start the domain, it spins in a crash loop with the
> > following error messages:
> > 
> > [ 6864.140170] WARNING: CPU: 2 PID: 266 at arch/x86/xen/multicalls.c:102 xen_mc_flush+0x197/0x200
> > [ 6864.140183] Modules linked in:
> > [ 6864.140190] CPU: 2 PID: 266 Comm: xen-balloon Tainted: G      D W          5.15.0-82-generic #91-Ubuntu
> > [ 6864.140203] RIP: e030:xen_mc_flush+0x197/0x200
> > [ 6864.140212] Code: 77 65 89 c0 48 c1 e0 05 48 05 00 20 00 81 ff d0 0f 1f 00 49 89 45 18 48 85 c0 0f 89 17 ff ff ff 45 8b 4d 00 41 bf 01 00 00 00 <0f> 0b 48 c7 c7 f0 8e 5b 82 44 89 ca 44 89 fe 45 31 f6 65 8b 0d e8
> > [ 6864.140234] RSP: e02b:ffffc90041027b88 EFLAGS: 00010002
> > [ 6864.140243] RAX: 0000000000000001 RBX: 0000000000000040 RCX: 0000000000000000
> > [ 6864.140253] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff89009809e310
> > [ 6864.140264] RBP: ffffc90041027bb8 R08: ffff888168dc0000 R09: 0000000000000002
> > [ 6864.140275] R10: 0000000000000200 R11: ffff8900980b7690 R12: 0000000000000000
> > [ 6864.140286] R13: ffff89009809e300 R14: 0000000000000002 R15: 0000000000000001
> > [ 6864.140303] FS:  0000000000000000(0000) GS:ffff890098080000(0000) knlGS:0000000000000000
> > [ 6864.140315] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 6864.140324] CR2: 0000000000000000 CR3: 0000000002e10000 CR4: 0000000000050660
> > [ 6864.140339] Call Trace:
> > [ 6864.140344]  <TASK>
> > [ 6864.140349]  ? __raw_callee_save_xen_make_pte+0x15/0x27
> > [ 6864.140359]  xen_mc_issue+0x61/0x80
> > [ 6864.140367]  xen_alloc_pte+0xd8/0x290
> > [ 6864.140376]  pmd_populate_kernel.constprop.0+0x4b/0xa0
> > [ 6864.140387]  vmemmap_pmd_populate+0x69/0x79
> > [ 6864.140395]  vmemmap_populate_basepages+0x68/0xb3
> > [ 6864.140405]  vmemmap_populate+0x2a/0xa9
> > [ 6864.140412]  __populate_section_memmap+0x3c/0x57
> > [ 6864.140422]  sparse_add_section+0x12b/0x1dc
> > [ 6864.140431]  __add_pages+0xac/0x150
> > [ 6864.140440]  add_pages+0x17/0x70
> > [ 6864.140447]  arch_add_memory+0x45/0x60
> > [ 6864.140455]  add_memory_resource+0x12c/0x320
> > [ 6864.140467]  reserve_additional_memory+0x10f/0x160
> > [ 6864.140476]  balloon_thread+0x337/0x500
> > [ 6864.140483]  ? wait_woken+0x70/0x70
> > [ 6864.140492]  ? reserve_additional_memory+0x160/0x160
> > [ 6864.140501]  kthread+0x127/0x150
> > [ 6864.140509]  ? set_kthread_struct+0x50/0x50
> > [ 6864.140518]  ret_from_fork+0x1f/0x30
> > [ 6864.140528]  </TASK>
> > [ 6864.140533] ---[ end trace 3bca9737718a46b2 ]---
> > [ 6864.140541] 1 of 2 multicall(s) failed: cpu 2
> > [ 6864.140549]   call  2: op=26 arg=[ffff89009809eb10] result=-22
> > 
> > Any suggestions as to what I am doing wrong? There should be plenty of
> > RAM to start a 600GB domain. I can start a user domain with 500GB without
> > any problem. Thank you in advance for your help and suggestions.
> 
> I think your kernel has been configured with CONFIG_XEN_512GB.
> 
> You should try to add "xen_512gb_limit=0" to your guest's command line.
> 
> Even if this is fixing your boot issue, the guest shouldn't show the error
> you are seeing.
> 
> 
> Juergen
> 
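
For anyone landing on this thread later: the suggestion above might be
applied to the quoted xl configuration roughly as sketched below. This is
an illustration only. The thread does not show the configuration Robert
ended up with, and putting the parameter into the existing `extra` line
(which xl appends to the guest kernel command line) is an assumption. For
context, `memory` in an xl configuration is given in MiB, so 614400 is
600 GiB, which is above the 512 GiB mark that CONFIG_XEN_512GB refers to.

```
# Only this line differs from the configuration quoted above.
# "3" is the argument already present in the original config;
# xen_512gb_limit=0 is the parameter Juergen suggested.
# Placing it in `extra` is an assumption, not confirmed in the thread.
extra = "3 xen_512gb_limit=0"
```

After restarting the guest, `cat /proc/cmdline` inside it should show
whether the parameter actually reached the kernel.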

