Kunpeng 920: kernel panic during MPAM functional verification, causing the machine to hang

Status: Done
Type: Bug
Created: 2020-03-26 11:05

Steps to reproduce:
After mounting:
mkdir p1
cd p1
echo f > cpus
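
For readers unfamiliar with the `cpus` file: assuming it takes a hexadecimal CPU bitmask, as the resctrl-style interface does, writing `f` selects CPUs 0 to 3, and the kernel then has to update each of those CPUs through an IPI, which matches the `handle_IPI` path in the trace below. A minimal Python sketch of that decoding (the helper name `cpus_from_hex_mask` is made up for illustration):

```python
def cpus_from_hex_mask(mask: str) -> list:
    """Return the CPU numbers selected by a hexadecimal cpumask string."""
    # Kernel cpumask strings may contain comma-separated 32-bit groups.
    value = int(mask.strip().replace(",", ""), 16)
    # Bit N set in the mask means CPU N is selected.
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


print(cpus_from_hex_mask("f"))
```
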
[  136.505555] kernel BUG at arch/arm64/kernel/traps.c:417!
[  136.509615] kernel BUG at arch/arm64/kernel/traps.c:417!
[  136.514377] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.95+ #13
[  136.548883] Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 1.08 12/14/2019
[  136.556528] pstate: 00400089 (nzcv daIf +PAN -UAO)
[  136.561471] pc : do_undefinstr+0x64/0x68
[  136.565508] lr : do_undefinstr+0x38/0x68
[  136.569548] sp : ffff00000806fd80
[  136.578994] x29: ffff00000806fd80 x28: ffff803fc46f6400
[  136.590408] x27: 0000000000000000 x26: 0000000000000001
[  136.601743] x25: 0000000000000003 x24: 0000000000000018
[  136.613024] x23: 0000000020400089 x22: ffff0000080a53a8
[  136.624121] x21: ffff00000806fee0 x20: 0000ffffffffffff
[  136.635071] x19: ffff00000806fda0 x18: 0000000000000000
[  136.646071] x17: 0000000000000000 x16: 0000000000000000
[  136.656788] x15: 0000000000000000 x14: 0000000000000000
[  136.667355] x13: 0000000000000000 x12: ffff805fffffe0c0
[  136.677761] x11: 0000000000000040 x10: 0000000000000b80

[  136.687954] x9 : ffff00000aeafe90 x8 : ffff803fc46f6fe0
[  136.697990] x7 : 0000000000000000 x6 : ffff00000806fd78
[  136.707844] x5 : 0000000000000000 x4 : ffff0000092434b0
[  136.717473] x3 : 000000000000001f x2 : 651ade7c9516f600
[  136.726900] x1 : 0000000000000000 x0 : 0000000020400089
[  136.736139] Call trace:
[  136.742498]  do_undefinstr+0x64/0x68
[  136.750011]  el1_undef+0x10/0x84
[  136.757109]  __mpam_sched_in+0x58/0xc8
[  136.764611]  update_cpu_closid_rmid+0x50/0x60
[  136.772654]  flush_smp_call_function_queue+0x84/0x178
[  136.781499]  generic_smp_call_function_single_interrupt+0x18/0x20
[  136.791530]  handle_IPI+0x410/0x598
[  136.798845]  gic_handle_irq+0x144/0x164
[  136.806386]  el1_irq+0xb8/0x140
[  136.813160]  arch_cpu_idle+0x3c/0x1c0
[  136.820297]  default_idle_call+0x20/0x38
[  136.827664]  cpuidle_idle_call+0x14c/0x190
[  136.835254]  do_idle+0xac/0xf0
[  136.841775]  cpu_startup_entry+0x2c/0x30
[  136.849212]  secondary_start_kernel+0x134/0x160
[  136.857276] Code: 94003965 f9400bf3 a8c27bfd d65f03c0 (d4210000)

[  136.867175] ---[ end trace 6ca0447735f94559 ]---
[  136.875614] Kernel panic - not syncing: Fatal exception in interrupt
[  136.885958] SMP: stopping secondary CPUs
[  137.965290] SMP: failed to stop secondary CPUs 0-3
[  137.974228] Kernel Offset: disabled
[  137.981923] CPU features: 0x12,a2200a38
[  137.990004] Memory Limit: none
[  137.997329] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
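
A note on reading the `Code:` line above: the word in parentheses is the instruction at the reported `pc`. The sketch below decodes it as an A64 `BRK` and compares the immediate with `0x800`, the value arm64 Linux uses for `BUG()`. That is consistent with the `kernel BUG at arch/arm64/kernel/traps.c:417` message: `pc` points at the BUG trap inside `do_undefinstr`, not at the undefined instruction itself, which lies in `__mpam_sched_in` according to the call trace. The helper names are illustrative only.

```python
import re

BRK_MASK = 0xFFE0001F   # fixed bits of the A64 BRK encoding
BRK_BASE = 0xD4200000   # encoding of BRK #0
BUG_BRK_IMM = 0x800     # immediate used by arm64 Linux for BUG()


def faulting_word(code_line: str) -> int:
    """Extract the parenthesised word, which is the instruction at pc."""
    match = re.search(r"\(([0-9a-fA-F]{8})\)", code_line)
    if match is None:
        raise ValueError("no parenthesised instruction word found")
    return int(match.group(1), 16)


def brk_immediate(word: int):
    """Return the 16-bit BRK immediate, or None if word is not a BRK."""
    if (word & BRK_MASK) != BRK_BASE:
        return None
    return (word >> 5) & 0xFFFF


line = "Code: 94003965 f9400bf3 a8c27bfd d65f03c0 (d4210000)"
imm = brk_immediate(faulting_word(line))
print(hex(imm), imm == BUG_BRK_IMM)
```
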

Comments (5)

haoxing990 created the bug
haoxing990 set the associated repository to openEuler/kernel

Hey @haoxing990, Welcome to openEuler Community.
All of the projects in openEuler Community are maintained by @openeuler-ci-bot.
That means the developers can comment below every pull request or issue to trigger Bot Commands.
Please follow instructions at https://gitee.com/openeuler/community/blob/master/en/sig-infrastructure/command.md to find the details.

haoxing990 edited the description

do_undefinstr: this is probably a compiler issue. It does not occur when building with openEuler's GCC 7.3 compiler.

@hanjun-guo
The GCC version is 9.2. Does MPAM not support GCC 9.2?

@haoxing990
I replied by email; I am posting the reply here as well for others' reference.
The vmcore information is as follows:

PID: 0 TASK: ffff8027dcd14ec0 CPU: 1 COMMAND: "swapper/1"

#0 [ffff00000800bc40] crash_kexec at ffff00000819df84

#1 [ffff00000800bc70] die at ffff00000808eb6c

#2 [ffff00000800bcb0] arm64_notify_die at ffff00000808eda8

#3 [ffff00000800bcc0] force_signal_inject at ffff00000808eff8

#4 [ffff00000800bd90] efi_header_end at ffff000008081044

#5 [ffff00000800bee0] el1_undef at ffff000008083214

 PC: ffff0000080a1c88  [__mpam_sched_in+104]

 LR: ffff0000080a1d48  [update_cpu_closid_rmid+88]

 SP: ffff00000800bef0  PSTATE: 80400089

X29: ffff00000800bef0  X28: ffff0000097a3dc0  X27: ffff8027dcd14ec0

X26: 0000000000000001  X25: 0000000000000001  X24: 0000000000000001

X23: 0000000000000000  X22: 0000000000000001  X21: 0000000000000000

X20: 0000000000000000  X19: ffff00002fa1b9e0  X18: 0000000000000000

X17: 0000000000000000  X16: 0000000000000000  X15: 0000000000000000

X14: 0000000000000000  X13: 0000000000000000  X12: 0000000000000001

X11: ffff000008b689f0  X10: 0000000000000bc0   X9: ffff0000080a1d48

 X8: ffff8027dcd15ae0   X7: 0000000000000018   X6: 0000000000000000

 X5: 0000000000000000   X4: 00008027d6c1d000   X3: ffff8027df9c27a8

 X2: 0000000000000001   X1: 0000000000000000   X0: ffff000008da57a8

#6 [ffff00000800bef0] __mpam_sched_in at ffff0000080a1c84

#7 [ffff00000800bf00] flush_smp_call_function_queue at ffff000008191e14

#8 [ffff00000800bf30] generic_smp_call_function_single_interrupt at ffff0000081 -- MORE -- forward:

crash> dis -l __mpam_sched_in+104
/mnt/new/kernel-4.19/arch/arm64/kernel/mpam.c: 1328
0xffff0000080a1c88 <__mpam_sched_in+104>: mrs x0, s3_0_c10_c5_1

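MPAM is a preview feature in openEuler and is not yet very mature. It requires BIOS support to complete the relevant register configuration. If the BIOS does not support it, EL1 has no permission to access the MPAM registers (such as s3_0_c10_c5_1 above), and the access raises an instruction exception.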
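
As a reading aid for the disassembly: the generic name `s3_0_c10_c5_1` spells out the system register encoding fields op0=3, op1=0, CRn=10, CRm=5, op2=1, which to my knowledge is the encoding of `MPAM0_EL1`. The following Python sketch, with made-up helper names, parses such a name and rebuilds the `mrs` instruction word. It assumes op0 is 2 or 3, the range the A64 `MRS` encoding covers.

```python
import re


def parse_sysreg(name: str) -> dict:
    """Split a generic name like 's3_0_c10_c5_1' into its encoding fields."""
    m = re.fullmatch(r"s(\d+)_(\d+)_c(\d+)_c(\d+)_(\d+)", name.lower())
    if m is None:
        raise ValueError(f"not a generic system register name: {name!r}")
    op0, op1, crn, crm, op2 = (int(g) for g in m.groups())
    return {"op0": op0, "op1": op1, "CRn": crn, "CRm": crm, "op2": op2}


def mrs_encoding(fields: dict, rt: int = 0) -> int:
    """Build the A64 'mrs x<rt>, <sysreg>' word (valid for op0 of 2 or 3)."""
    sreg = (fields["op0"] << 19 | fields["op1"] << 16 | fields["CRn"] << 12
            | fields["CRm"] << 8 | fields["op2"] << 5)
    return 0xD5200000 | sreg | rt


fields = parse_sysreg("s3_0_c10_c5_1")
print(fields, hex(mrs_encoding(fields)))
```
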

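Later we will add MPAM initialization based on information from the BIOS ACPI tables. If the BIOS does not provide the ACPI table, MPAM will not be initialized (and cannot be used), which avoids this class of problem.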

Gitee matched parts of the content pasted above against bug titles, so it is a bit messy. The key information is:
crash> dis -l __mpam_sched_in+104
/mnt/new/kernel-4.19/arch/arm64/kernel/mpam.c: 1328
0xffff0000080a1c88 <__mpam_sched_in+104>: mrs x0, s3_0_c10_c5_1

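MPAM is a preview feature in openEuler and is not yet very mature. It requires BIOS support to complete the relevant register configuration. If the BIOS does not support it, EL1 has no permission to access the MPAM registers (such as s3_0_c10_c5_1 above), and the access raises an instruction exception.

Later we will add MPAM initialization based on information from the BIOS ACPI tables. If the BIOS does not provide the ACPI table, MPAM will not be initialized (and cannot be used), which avoids this class of problem.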
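
The planned guard can be illustrated from user space. This is only a sketch of the idea, not the kernel change itself: it assumes the firmware describes MPAM in an ACPI table with the signature `MPAM`, and it relies on Linux exposing ACPI tables by signature under `/sys/firmware/acpi/tables`. The function names `mpam_table_present` and `init_mpam` are hypothetical.

```python
from pathlib import Path

ACPI_TABLES_DIR = "/sys/firmware/acpi/tables"


def mpam_table_present(tables_dir: str = ACPI_TABLES_DIR) -> bool:
    """Check whether firmware exposes a table named 'MPAM' (assumed signature)."""
    return (Path(tables_dir) / "MPAM").exists()


def init_mpam(tables_dir: str = ACPI_TABLES_DIR) -> str:
    """Hypothetical init step: refuse to enable MPAM without the ACPI table."""
    if not mpam_table_present(tables_dir):
        # No firmware description: leave MPAM off, so no MPAM register
        # is ever touched from EL1.
        return "disabled"
    return "enabled"


print(init_mpam())
```

On a machine whose firmware lacks the table this prints `disabled`, which corresponds to the behaviour described above: MPAM stays unavailable instead of trapping on a register access.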

yanzh_h set the assignee to wangxiongfeng
yanzh_h added Xie XiuQi as a collaborator
wangxiongfeng changed the task status from To Do to Done
