List: debian-user
Subject: Re: Segfaults after upgrade to Debian 11.7 on virtualized systems with AMD Ryzen CPU
From: Andreas Haumer <andreas () xss ! co ! at>
Date: 2023-05-01 13:10:13
Message-ID: f8d126f4-99b1-7cf3-c93f-67a6a9fe4e74 () xss ! co ! at
Hi!
Thank you all for your replies!
Am 01.05.23 um 00:39 schrieb NetValue Operations Centre:
> I've tried downgrading libc (and related packages) to 2.31-13+deb11u5, but no
> success - still getting segmentation faults. Booting back to the 5.10.0-21 kernel
> seems the only solution at the moment.
I have now found that the 5.10.0-22 kernel boots fine if I set the CPU
model manually to "EPYC-Rome" in my VM configuration (I use "virt-manager").
In that case, "virsh dumpxml" reports the following for the VM's CPU:
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>EPYC-Rome</model>
<feature policy='require' name='x2apic'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='xsaves'/>
<feature policy='disable' name='svm'/>
<feature policy='require' name='topoext'/>
<feature policy='disable' name='npt'/>
<feature policy='disable' name='nrip-save'/>
</cpu>
On the other hand, if I set the CPU model to "copy from host",
booting the 5.10.0-22 kernel results in the reported segfaults.
In that case, "virsh dumpxml" tells me the following:
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>EPYC-Rome</model>
<vendor>AMD</vendor>
<feature policy='require' name='x2apic'/>
<feature policy='require' name='tsc-deadline'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='tsc_adjust'/>
<feature policy='require' name='erms'/>
<feature policy='require' name='invpcid'/>
<feature policy='require' name='pku'/>
<feature policy='require' name='vaes'/>
<feature policy='require' name='vpclmulqdq'/>
<feature policy='require' name='fsrm'/>
<feature policy='require' name='spec-ctrl'/>
<feature policy='require' name='stibp'/>
<feature policy='require' name='arch-capabilities'/>
<feature policy='require' name='ssbd'/>
<feature policy='require' name='xsaves'/>
<feature policy='require' name='cmp_legacy'/>
<feature policy='require' name='amd-ssbd'/>
<feature policy='require' name='virt-ssbd'/>
<feature policy='disable' name='lbrv'/>
<feature policy='disable' name='tsc-scale'/>
<feature policy='disable' name='vmcb-clean'/>
<feature policy='disable' name='pause-filter'/>
<feature policy='disable' name='pfthreshold'/>
<feature policy='require' name='rdctl-no'/>
<feature policy='require' name='skip-l1dfl-vmentry'/>
<feature policy='require' name='mds-no'/>
<feature policy='require' name='pschange-mc-no'/>
<feature policy='disable' name='svm'/>
<feature policy='require' name='topoext'/>
<feature policy='disable' name='npt'/>
<feature policy='disable' name='nrip-save'/>
</cpu>
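Since both definitions use the same model and differ only in their feature lists, the difference can be extracted mechanically instead of by eye. The following is a minimal sketch in Python, not part of the original report: the helper names feature_policies and diff_features are hypothetical, and it assumes each <cpu> element from "virsh dumpxml" has been copied into a string. The sample strings are abbreviated excerpts of the two definitions above.

```python
import xml.etree.ElementTree as ET


def feature_policies(cpu_xml):
    """Map each <feature name=...> inside a libvirt <cpu> element to its policy."""
    root = ET.fromstring(cpu_xml)
    return {f.get("name"): f.get("policy") for f in root.iter("feature")}


def diff_features(working_xml, failing_xml):
    """Return {feature: (policy_in_working, policy_in_failing)} where they differ.

    A policy of None means the feature is not listed in that definition.
    """
    working = feature_policies(working_xml)
    failing = feature_policies(failing_xml)
    diff = {}
    for name in sorted(set(working) | set(failing)):
        pair = (working.get(name), failing.get(name))
        if pair[0] != pair[1]:
            diff[name] = pair
    return diff


if __name__ == "__main__":
    # Abbreviated excerpts of the two definitions, for illustration only.
    working = """<cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>EPYC-Rome</model>
      <feature policy='require' name='x2apic'/>
      <feature policy='require' name='xsaves'/>
      <feature policy='disable' name='svm'/>
    </cpu>"""
    failing = """<cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>EPYC-Rome</model>
      <feature policy='require' name='x2apic'/>
      <feature policy='require' name='erms'/>
      <feature policy='require' name='fsrm'/>
      <feature policy='require' name='xsaves'/>
      <feature policy='disable' name='svm'/>
    </cpu>"""
    for name, (in_working, in_failing) in diff_features(working, failing).items():
        print(f"{name}: working={in_working} failing={in_failing}")
```

Run against the full definitions, this lists only the features that appear in the "copy from host" configuration but not in the manual "EPYC-Rome" one. Those are the candidates to toggle one at a time when trying to isolate the trigger.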
On the host, "lscpu" tells me:
root@pauli:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 48 bits physical, 48 bits virtual
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 25
Model: 33
Model name: AMD Ryzen 9 5950X 16-Core Processor
Stepping: 2
Frequency boost: enabled
CPU MHz: 2200.000
CPU max MHz: 5980,4678
CPU min MHz: 2200,0000
BogoMIPS: 8000.67
Virtualization: AMD-V
L1d cache: 512 KiB
L1i cache: 512 KiB
L2 cache: 8 MiB
L3 cache: 64 MiB
NUMA node0 CPU(s): 0-31
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb
rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl
pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c
rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch
osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb
cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep
bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec
xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf
xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean
flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif
v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
I still cannot tell exactly which system component is to blame, but it seems
there really is an issue with the 5.10.0-22 kernel in the current Debian 11.7.
Time to file a Debian bug report?
Regards
- andreas
--
Andreas Haumer
*x Software + Systeme | mailto:andreas@xss.co.at
Karmarschgasse 51/2/20 | https://www.xss.co.at/
A-1100 Vienna, Austria | Tel: +43-1-6060114-0