openEuler / kernel

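[20.03-LTS-SP2] On an ARM physical machine, after the long-term stability test cases run, memory allocation fails and no command can be executed; after a reboot, running the long-term stability tests again produces a core file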

Accepted
Defect
Created on 2021-05-20 11:31

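[Environment]
Environment: ARM physical machine
OS version: 20.03-SP2-round2
Kernel: 4.19.90-2105.2.0.0086.oe1.aarch64
[Steps to reproduce]
1. Install from the ISO, selecting the minimal installation mode.
2. Run the long-term stability test cases.
[Expected result]
The tests run without errors.
[Actual result]
1. After the long-term stability test cases have run for some time, the following error appears:
libgcc_s.so.1 must be installed for pthread_cancel to work
In addition, even ordinary commands fail with "fork: Cannot allocate memory".
2. After the machine is rebooted, running the long-term stability test cases again produces a core file.
[screenshot]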

Comments (8)

6++ created the defect
6++ set the linked repository to openEuler/kernel

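openeuler-ci-bot added the sig/Kernel label
6++ uploaded the attachment messages
6++ deleted the attachment messages
6++ set the assignee to 成坚 (CHENG Jian)
6++ set the planned due date to 2021-05-21
6++ set the planned start date to 2021-05-20
6++ changed the planned due date from 2021-05-21 to 2021-05-25
6++ set the priority to Major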

Run the following commands to start the long-term stability test:

cd /home/dfx_long_stress
sh startrunall.sh -p EulerOS_arm

[Problem 1] On the first run, after nearly a full day, the following error was reported:

[screenshot]

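[Problem 2] After a reboot, running the long-term stability test command again crashes the kernel.

Start with the crash in [Problem 2].
The vmcore file is located here and can be opened with crash: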

cd /var/crash/127.0.0.1-2021-05-20-10:09:11
crash ./vmcore /usr/lib/debug/usr/lib/modules/4.19.90-2105.2.0.0086.oe1.aarch64/vmlinux

[screenshot]

Cause of the hang:

crash> dmesg | grep "and no killable processes"
[ 2987.293464] Out of memory and no killable processes...
crash> dmesg | grep "System is deadlocked on memory"
[ 2987.318681] Kernel panic - not syncing: System is deadlocked on memory

[screenshot]

成坚 (CHENG Jian) edited the description

The dump_header output is as follows:

[ 2982.448041] kworker/u193:4 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[ 2982.466182] kworker/u193:4 cpuset=/ mems_allowed=0-3
[ 2982.490155] CPU: 11 PID: 1072757 Comm: kworker/u193:4 Kdump: loaded Tainted: G        W  OEL    4.19.90-2105.2.0.0086.oe1.aarch64 #1
[ 2982.513329] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDDA, BIOS 1.06 10/29/2019
[ 2982.537764] Call trace:
[ 2982.555335]  dump_backtrace+0x0/0x198
[ 2982.555337]  show_stack+0x24/0x30
[ 2982.579787]  dump_stack+0xa4/0xe8
[ 2982.579791]  dump_header+0x6c/0x240
[ 2982.604778]  out_of_memory+0x4cc/0x520
[ 2982.604780]  __alloc_pages_nodemask+0xd10/0xd88
[ 2982.637035]  alloc_pages_current+0x88/0xf0
[ 2982.661907]  __page_cache_alloc+0x8c/0xd8
[ 2982.680514]  generic_file_buffered_read+0x3b8/0xae8
[ 2982.680516]  generic_file_read_iter+0x114/0x190
[ 2982.713812]  ext4_file_read_iter+0x5c/0x140 [ext4]
[ 2982.738425]  __vfs_read+0x130/0x1a0
[ 2982.738427]  vfs_read+0x94/0x150
[ 2982.756217]  kernel_read+0x68/0xc8
[ 2982.756219]  prepare_binprm+0xc8/0x1a0
[ 2982.781542]  __do_execve_file.isra.13+0x56c/0x7c0
[ 2982.781544]  do_execve+0x48/0x58
[ 2982.813656]  call_usermodehelper_exec_async+0x200/0x230
[ 2982.813659]  ret_from_fork+0x10/0x18
[ 2982.847043] Mem-Info:
[ 2982.871440] active_anon:0 inactive_anon:0 isolated_anon:0
                active_file:0 inactive_file:120 isolated_file:0
                slab_reclaimable:55402 slab_unreclaimable:123071868
                mapped:0 shmem:1 pagetables:208 bounce:0
                free:9408 free_pcp:1510 free_cma:0
[ 2983.000464] Node 0 DMA32 free:503424kB min:256kB low:1664kB high:3072kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2092864kB managed:1453440kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 2983.026154] lowmem_reserve[]: 0 8055 8055
[ 2983.052162] Node 0 Normal free:25152kB min:27776kB low:159744kB high:291712kB active_anon:0kB inactive_anon:0kB active_file:4992kB inactive_file:448kB unevictable:128kB writepending:0kB present:132120576kB managed:131984704kB mlocked:128kB kernel_stack:22656kB pagetables:4096kB bounce:0kB free_pcp:38144kB local_pcp:384kB free_cma:0kB
[ 2983.077660] lowmem_reserve[]: 0 0 0
[ 2983.103343] Node 1 Normal free:26304kB min:28224kB low:162304kB high:296384kB active_anon:0kB inactive_anon:0kB active_file:1280kB inactive_file:768kB unevictable:320kB writepending:0kB present:134217728kB managed:134081984kB mlocked:320kB kernel_stack:17664kB pagetables:4928kB bounce:0kB free_pcp:26368kB local_pcp:0kB free_cma:0kB
[ 2983.128907] lowmem_reserve[]: 0 0 0
[ 2983.146757] Node 2 Normal free:20672kB min:28224kB low:162304kB high:296384kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:3904kB unevictable:64kB writepending:0kB present:134217728kB managed:134081856kB mlocked:64kB kernel_stack:12224kB pagetables:1024kB bounce:0kB free_pcp:14656kB local_pcp:0kB free_cma:0kB
[ 2983.179350] lowmem_reserve[]: 0 0 0
[ 2983.204603] Node 3 Normal free:26560kB min:27968kB low:160960kB high:293952kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:768kB unevictable:192kB writepending:0kB present:134217728kB managed:133032384kB mlocked:192kB kernel_stack:12800kB pagetables:3264kB bounce:0kB free_pcp:17472kB local_pcp:0kB free_cma:0kB
[ 2983.229132] lowmem_reserve[]: 0 0 0
[ 2983.254007] Node 0 DMA32: 214*64kB (UME) 50*128kB (UME) 30*256kB (UME) 15*512kB (UME) 3*1024kB (UE) 1*2048kB (U) 9*4096kB (U) 6*8192kB (U) 5*16384kB (U) 3*32768kB (U) 3*65536kB (U) 0*131072kB 0*262144kB 0*524288kB = 503424kB
[ 2983.279676] Node 0 Normal: 218*64kB (UME) 84*128kB (U) 19*256kB (UE) 6*512kB (UE) 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB 0*65536kB 0*131072kB 0*262144kB 0*524288kB = 32640kB
[ 2983.303505] Node 1 Normal: 153*64kB (UME) 102*128kB (U) 19*256kB (UE) 4*512kB (UE) 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB 0*65536kB 0*131072kB 0*262144kB 0*524288kB = 29760kB
[ 2983.328372] Node 2 Normal: 173*64kB (UME) 47*128kB (UE) 10*256kB (U) 2*512kB (U) 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB 0*65536kB 0*131072kB 0*262144kB 0*524288kB = 20672kB
[ 2983.361194] Node 3 Normal: 172*64kB (UME) 23*128kB (UME) 18*256kB (UE) 0*512kB 0*1024kB 1*2048kB (U) 0*4096kB 0*8192kB 0*16384kB 0*32768kB 0*65536kB 0*131072kB 0*262144kB 0*524288kB = 20608kB
[ 2983.385730] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 2983.411746] Node 0 hugepages_total=32 hugepages_free=32 hugepages_surp=0 hugepages_size=524288kB
[ 2983.438515] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 2983.465321] Node 1 hugepages_total=32 hugepages_free=32 hugepages_surp=0 hugepages_size=524288kB
[ 2983.491285] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 2983.517541] Node 2 hugepages_total=32 hugepages_free=32 hugepages_surp=0 hugepages_size=524288kB
[ 2983.543259] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 2983.569428] Node 3 hugepages_total=32 hugepages_free=32 hugepages_surp=0 hugepages_size=524288kB
[ 2983.595198] 136 total pagecache pages
[ 2983.620208] 1 pages in swap cache
[ 2983.646231] Swap cache stats: add 12094, delete 12093, find 218/861
[ 2983.671146] Free swap  = 3674048kB
[ 2983.696728] Total swap = 4194240kB
[ 2983.722016] 8388541 pages RAM
[ 2983.746464] 0 pages HighMem/MovableOnly
[ 2983.780152] 34879 pages reserved
[ 2983.806168] 0 pages cma reserved
[ 2983.830938] 0 pages hwpoisoned
[ 2983.857909] Unreclaimable slab info:
[ 2983.884074] Name                      Used          Total
[ 2983.911180] nf_conntrack            3570KB       3570KB
[ 2983.938443] scsi_sense_cache        3776KB       3776KB
[ 2983.966307] RAWv6                 921353KB    1043751KB
[ 2983.992661] UDPv6                   3780KB       3780KB
[ 2984.019764] tw_sock_TCPv6           1976KB       1976KB
[ 2984.045856] request_sock_TCPv6       1855KB       1855KB
[ 2984.070352] TCPv6                   3825KB       3825KB
[ 2984.097124] mqueue_inode_cache       5992KB       5992KB
[ 2984.123039] kioctx                  1086KB       1086KB
[ 2984.149231] dnotify_struct           256KB        256KB
[ 2984.174402] dio                     2295KB       2295KB
[ 2984.199602] ip4-frags               5886KB       5886KB
[ 2984.231531] secpath_cache           2176KB       2176KB
[ 2984.255824] RAW                  1444428KB    1445220KB
[ 2984.289226] UDP                    12941KB      15172KB
[ 2984.315216] tw_sock_TCP              127KB        127KB
[ 2984.341669] request_sock_TCP         191KB        191KB
[ 2984.367155] TCP                    10573KB      10573KB
[ 2984.393542] hugetlbfs_inode_cache        635KB        635KB
[ 2984.418881] eventpoll_pwq           3519KB       3519KB
[ 2984.444327] inotify_inode_mark       3519KB       3519KB
[ 2984.471361] request_queue           3415KB       3415KB
[ 2984.496839] blkdev_ioc              6132KB       6132KB
[ 2984.524153] biovec-max             17152KB      17152KB
[ 2984.542790] biovec-128               320KB        320KB
[ 2984.568914] biovec-64               3776KB       3776KB
[ 2984.596409] user_namespace          2542KB       2542KB
[ 2984.622926] uid_cache               1662KB       1662KB
[ 2984.649190] skbuff_head_cache      14912KB      14912KB
[ 2984.674543] file_lock_cache         5428KB       5428KB
[ 2984.700636] net_namespace          68557KB      68557KB
[ 2984.725303] shmem_inode_cache      11423KB      11636KB
[ 2984.760256] taskstats               3824KB       3824KB
[ 2984.795337] proc_dir_entry         85406KB      98271KB
[ 2984.821298] pde_opener              6142KB       6142KB
[ 2984.846185] sigqueue               16935KB      16935KB
[ 2984.880311] kernfs_node_cache     113455KB     113455KB
[ 2984.906537] mnt_cache              41587KB      42330KB
[ 2984.932374] filp                41020846KB   41020864KB
[ 2984.966103] names_cache            12288KB      12288KB
[ 2984.991877] ebitmap_node           14400KB      14400KB
[ 2985.017212] avc_xperms_data           64KB         64KB
[ 2985.042341] selinux_file_security       6208KB       6208KB
[ 2985.067113] uts_namespace           5787KB       5787KB
[ 2985.091452] vm_area_struct         50537KB      50920KB
[ 2985.115000] mm_struct               7686KB       7686KB
[ 2985.138794] files_cache             8439KB       8439KB
[ 2985.162519] signal_cache          801752KB     802053KB
[ 2985.200203] sighand_cache          21631KB      23017KB
[ 2985.224356] task_struct          3606334KB    3606520KB
[ 2985.247874] cred_jar              193091KB     193091KB
[ 2985.271417] anon_vma_chain         68350KB      68352KB
[ 2985.304605] anon_vma               11572KB      11572KB
[ 2985.328945] pid                   121280KB     121280KB
[ 2985.361416] Acpi-Operand            6718KB       6718KB
[ 2985.387699] Acpi-Parse              6142KB       6142KB
[ 2985.413742] Acpi-State               511KB        511KB
[ 2985.438434] Acpi-Namespace       5859497KB    5859497KB
[ 2985.471561] numa_policy            14844KB      14844KB
[ 2985.496360] ftrace_event_field        639KB        639KB
[ 2985.521536] pool_workqueue          9600KB       9600KB
[ 2985.546279] task_group             16540KB      16540KB
[ 2985.577567] vmap_area              28144KB      28452KB
[ 2985.594470] pgd_cache               6400KB       6400KB
[ 2985.617543] kmalloc-131072         36864KB      36864KB
[ 2985.639836] kmalloc-65536          66560KB      66560KB
[ 2985.663799] kmalloc-32768          12800KB      12800KB
[ 2985.686106] kmalloc-16384          43008KB      43008KB
[ 2985.709893] kmalloc-8192          127208KB     140288KB
[ 2985.733110] kmalloc-4096          360600KB     421248KB
[ 2985.766290] kmalloc-2048         3309216KB   19472320KB
[ 2985.797229] kmalloc-1024          190339KB     190720KB
[ 2985.831312] kmalloc-512          2979124KB   38064256KB
[ 2985.862184] kmalloc-256          1055561KB    1055616KB
[ 2985.883821] kmalloc-128         19880015KB   27862080KB
[ 2985.903620] kmem_cache_node          488KB        640KB
[ 2985.927888] kmem_cache               448KB        448KB
[ 2985.951597] Tasks state (memory values in pages):
[ 2985.975405] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 2985.998959] [   1393]     0  1393      565        0   393216      142         -1000 systemd-udevd
[ 2986.030619] [   2352]     0  2352      313        0   393216       55         -1000 auditd
[ 2986.062050] [   2472]     0  2472      264        0   327680       74         -1000 sshd
[ 2986.095132] [  45142]     0 45142     3350        0   393216        0             0 sh
[ 2986.127844] [  45145]     0 45145     3350        0   393216        0             0 sh
[ 2986.162165] [  45147]     0 45147     3350        0   393216        0             0 sh
[ 2986.188082] [  45153]     0 45153     3350        0   393216        0             0 sh
[ 2986.223741] [  45156]     0 45156     3350        0   393216        0             0 sh
[ 2986.259117] [  45895]     0 45895      961        4   393216       23         -1000 stress-ng
[ 2986.285294] [  45898]     0 45898      961        4   393216       23         -1000 stress-ng
[ 2986.310534] [  45904]     0 45904      965        4   393216       24         -1000 stress-ng
[ 2986.337019] [  45906]     0 45906      961        4   393216       24         -1000 stress-ng
[ 2986.370611] [  46061]     0 46061      961        4   393216       23         -1000 stress-ng
[ 2986.396730] [  57385]     0 57385      579        0   393216        0             0 systemd-udevd
[ 2986.422850] [  61917]     0 61917       49        0   327680        0             0 hugemmap05
[ 2986.447635] [  61990]     0 61990       43        0   393216        0             0 acct01
[ 2986.473596] [ 138251]     0 138251      121        0   393216        0             0 modprobe
[ 2986.499075] [ 140314]     0 140314      960        4   327680       24         -1000 stress-ng
[ 2986.522841] [ 140636]     0 140636     1477        0   327680       31         -1000 stress-ng-procf
[ 2986.556299] [ 140637]     0 140637     1477        0   327680       31         -1000 stress-ng-procf
[ 2986.580233] [ 143396]     0 143396      102        0   393216        0             0 rmmod
[ 2986.613510] [ 145493]     0 145493     3330        0   393216        0             0 losetup
[ 2986.638019] [ 147632]     0 147632     3366        0   458752        0             0 umount
[ 2986.663073] [ 147687]     0 147687      102        0   327680        0             0 lsmod
[ 2986.687384] [ 148032]     0 148032     3327        0   458752        0             0 cat
[ 2986.711944] [ 149908]     0 149908      102        0   327680        0             0 lsmod
[ 2986.736086] [ 152735]     0 152735     3327        0   393216        0             0 cat
[ 2986.760547] [ 153186]     0 153186      107        0   393216        0             0 insmod
[ 2986.784584] [ 153334]     0 153334      105        0   393216        0             0 insmod
[ 2986.816854] [ 176391]     0 176391      106        0   327680        0             0 insmod
[ 2986.841537] [ 191954]     0 191954      102        0   393216        0             0 lsmod
[ 2986.864550] [ 191960]     0 191960     3327        0   458752        0             0 cat
[ 2986.898367] [ 199937]     0 199937      105        0   393216        0             0 insmod
[ 2986.916287] [ 212083]     0 212083     3327        0   458752        0             0 cat
[ 2986.941061] [ 259584]     0 259584      961        0   393216       25         -1000 stress-ng-dentr
[ 2986.973345] [ 342615]     0 342615     1477        0   393216       29         -1000 stress-ng-memba
[ 2986.999177] [ 352989]     0 352989      965        0   393216       27         -1000 stress-ng-clone
[ 2987.032810] [ 353003]     0 353003      965        0   393216        0          1000 stress-ng-clone
[ 2987.056177] [ 353004]     0 353004      965        0   393216        0          1000 stress-ng-clone
[ 2987.088045] [ 410453]     0 410453     1479        4   393216       28         -1000 stress-ng-mcont
[ 2987.113215] [ 703930]     0 703925       53        0   393216        0             0 growfiles
[ 2987.137961] [ 769680]     0 769680      960        4   327680       24         -1000 stress-ng
[ 2987.163882] [ 781670]     0 781670       36        0   327680        0             0 socket_create
[ 2987.190032] [ 782136]     0 782136      961        4   393216       24         -1000 stress-ng
[ 2987.214821] [ 788409]     0 788409      960        0   327680       28         -1000 stress-ng-sendf
[ 2987.240824] [1011793]     0 1011793      961        0   393216       33         -1000 stress-ng-sched
[ 2987.266984] [1021955]     0 1021955      961        0   393216       27         -1000 stress-ng-kill
[ 2987.293464] Out of memory and no killable processes...
[ 2987.318681] Kernel panic - not syncing: System is deadlocked on memory
               
[ 2987.336876] CPU: 11 PID: 1072757 Comm: kworker/u193:4 Kdump: loaded Tainted: G        W  OEL    4.19.90-2105.2.0.0086.oe1.aarch64 #1
[ 2987.356199] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDDA, BIOS 1.06 10/29/2019
[ 2987.410573] Call trace:
[ 2987.467274]  dump_backtrace+0x0/0x198
[ 2987.467276]  show_stack+0x24/0x30
[ 2987.487276]  dump_stack+0xa4/0xe8
[ 2987.487278]  panic+0x130/0x318
[ 2987.508194]  out_of_memory+0x51c/0x520
[ 2987.508196]  __alloc_pages_nodemask+0xd10/0xd88
[ 2987.527304]  alloc_pages_current+0x88/0xf0
[ 2987.585674]  __page_cache_alloc+0x8c/0xd8
[ 2987.585676]  generic_file_buffered_read+0x3b8/0xae8
[ 2987.606928]  generic_file_read_iter+0x114/0x190
[ 2987.606958]  ext4_file_read_iter+0x5c/0x140 [ext4]
[ 2987.627318]  __vfs_read+0x130/0x1a0
[ 2987.690198]  vfs_read+0x94/0x150
[ 2987.711608]  kernel_read+0x68/0xc8
[ 2987.733368]  prepare_binprm+0xc8/0x1a0
[ 2987.733370]  __do_execve_file.isra.13+0x56c/0x7c0
[ 2987.752640]  do_execve+0x48/0x58
[ 2987.752642]  call_usermodehelper_exec_async+0x200/0x230
[ 2987.819808]  ret_from_fork+0x10/0x18
[ 2987.819845] SMP: stopping secondary CPUs
[ 2988.874804] SMP: failed to stop secondary CPUs 0-10,12-95
[ 2988.914748] ------------[ cut here ]------------
[ 2988.933388] Some CPUs may be stale, kdump will be unreliable.
[ 2988.952305] WARNING: CPU: 11 PID: 1072757 at arch/arm64/kernel/machine_kexec.c:160 machine_kexec+0x58/0x3e8
[ 2988.980516] Modules linked in: async_memcpy async_xor xor xor_neon async_tx raid6_pq(-) sm4_generic authenc cmac ansi_cprng vmac sha3_generic ccm seed cts aes_ce_ccm fcrypt pcbc anubis khazad tea michael_mic arc4 salsa20_generic camellia_generic cast6_generic cast5_generic cast_common serpent_generic twofish_generic xts twofish_common blowfish_generic blowfish_common lrw tgr192 des_generic wp512 rmd320 rmd256 rmd160 rmd128 md4 sha512_generic binfmt_misc loop jprob(OE) ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_filter ebtable_nat ebtable_broute bridge stp llc ebtables ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
[ 2989.026861] CPU: 11 PID: 1072757 Comm: kworker/u193:4 Kdump: loaded Tainted: G        W  OEL    4.19.90-2105.2.0.0086.oe1.aarch64 #1
[ 2989.046410] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDDA, BIOS 1.06 10/29/2019
[ 2989.065568] pstate: 60c00089 (nZCv daIf +PAN +UAO)
[ 2989.093791] pc : machine_kexec+0x58/0x3e8
[ 2989.093793] lr : machine_kexec+0x58/0x3e8
[ 2989.112705] sp : ffff0002a4b2f660
[ 2989.112707] x29: ffff0002a4b2f660 x28: 0000000000000000 
[ 2989.132645] x27: 00000000006200ca x26: ffff0000092d37c8 
[ 2989.159281] x25: ffff0000092d57f0 x24: ffff0000092d5000 
[ 2989.159282] x23: ffff0000095f0000 x22: ffff803fcab50c00 
[ 2989.177933] x21: 0000000000000000 x20: ffff00000931a000 
[ 2989.177934] x19: ffff803fcab50c00 x18: ffffffffffffffff 
[ 2989.196549] x17: 0000000008000000 x16: 0000000000003200 
[ 2989.196550] x15: 0000000000000001 x14: ffff000008b4ebc8 
[ 2989.214748] x13: 0000000000000000 x12: 0000000000000000 
[ 2989.214756] x11: 00000000ffffffff x10: ffff0000092d57f0 
[ 2989.242367] x9 : ffff000008fb0018 x8 : ffff000008663d80 
[ 2989.242368] x7 : 0000000000000001 x6 : 0000000000000015 
[ 2989.260413] x5 : 001fffffffffffff x4 : 0000000000000000 
[ 2989.260414] x3 : 0000000000000000 x2 : ffffffffffffffff 
[ 2989.279219] x1 : 175ef75c185fd800 x0 : 0000000000000000 
[ 2989.279222] Call trace:
[ 2989.279225]  machine_kexec+0x58/0x3e8
[ 2989.294751]  __crash_kexec+0x84/0x138
[ 2989.294753]  panic+0x140/0x318
[ 2989.311918]  out_of_memory+0x51c/0x520
[ 2989.311921]  __alloc_pages_nodemask+0xd10/0xd88
[ 2989.329640]  alloc_pages_current+0x88/0xf0
[ 2989.345293]  __page_cache_alloc+0x8c/0xd8
[ 2989.360886]  generic_file_buffered_read+0x3b8/0xae8
[ 2989.377318]  generic_file_read_iter+0x114/0x190
[ 2989.396451]  ext4_file_read_iter+0x5c/0x140 [ext4]
[ 2989.396454]  __vfs_read+0x130/0x1a0
[ 2989.412465]  vfs_read+0x94/0x150
[ 2989.428499]  kernel_read+0x68/0xc8
[ 2989.443447]  prepare_binprm+0xc8/0x1a0
[ 2989.460897]  __do_execve_file.isra.13+0x56c/0x7c0
[ 2989.460900]  do_execve+0x48/0x58
[ 2989.474193]  call_usermodehelper_exec_async+0x200/0x230
[ 2989.489611]  ret_from_fork+0x10/0x18
[ 2989.505674] ---[ end trace 8a510d7db2828faf ]---
[ 2989.521381] Bye!
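
The "Unreclaimable slab info" table in the dump above is long, so it can help to rank the caches by size. The following is a minimal sketch, assuming the dmesg text has been saved and split into lines; the function name `top_slab_caches` and the regular expression are illustrative and not part of any kernel or crash tooling. The sample rows are copied from the dump above.

```python
import re

# Matches rows such as "[ 2984.932374] filp   41020846KB   41020864KB".
# The header and title lines of the table do not match and are skipped.
SLAB_ROW = re.compile(r"\]\s+(\S+)\s+(\d+)KB\s+(\d+)KB\s*$")


def top_slab_caches(dmesg_lines, count=5):
    """Return (name, used_kb, total_kb) tuples, largest total first."""
    rows = []
    for line in dmesg_lines:
        match = SLAB_ROW.search(line)
        if match:
            rows.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return sorted(rows, key=lambda row: row[2], reverse=True)[:count]


sample = [
    "[ 2983.857909] Unreclaimable slab info:",
    "[ 2984.932374] filp                41020846KB   41020864KB",
    "[ 2985.224356] task_struct          3606334KB    3606520KB",
    "[ 2985.831312] kmalloc-512          2979124KB   38064256KB",
    "[ 2985.883821] kmalloc-128         19880015KB   27862080KB",
]

for name, used_kb, total_kb in top_slab_caches(sample, count=3):
    print(f"{name:<16} used={used_kb / 1048576:.1f} GiB total={total_kb / 1048576:.1f} GiB")
```

Sorting by the Total column rather than the Used column is a choice made for this sketch, since Total reflects the memory the cache actually holds.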

1 Problem 1 (fork fails with ENOMEM)


1.1 Kernel parameter overcommit_memory


This parameter selects the kernel's memory overcommit policy.

#define OVERCOMMIT_GUESS		0
#define OVERCOMMIT_ALWAYS		1
#define OVERCOMMIT_NEVER		2

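Allowed values: 0, 1, 2.

enum (value): description
OVERCOMMIT_GUESS (0): the kernel estimates whether enough memory is available for the requesting process; if so, the allocation is allowed, otherwise it fails and the error is returned to the process.
OVERCOMMIT_ALWAYS (1): the kernel always allows the allocation, regardless of the current memory state.
OVERCOMMIT_NEVER (2): the kernel does not overcommit; the total committed address space may not exceed swap plus a configurable share of physical RAM (set through overcommit_ratio or overcommit_kbytes).

When the problem occurred, this parameter was set to 0 in the kernel.

[screenshot]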
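
The value of this parameter can be read on a running system from /proc/sys/vm/overcommit_memory. The following is a minimal sketch that maps the value to the constant names listed above; the helper name `overcommit_mode_name` is hypothetical, and the file is only read if it exists.

```python
from pathlib import Path

# Constant names as defined by the kernel macros quoted above.
OVERCOMMIT_MODES = {
    0: "OVERCOMMIT_GUESS",
    1: "OVERCOMMIT_ALWAYS",
    2: "OVERCOMMIT_NEVER",
}


def overcommit_mode_name(raw_value):
    """Translate the text read from the sysctl file into the constant name."""
    return OVERCOMMIT_MODES.get(int(raw_value.strip()), "unknown")


sysctl_file = Path("/proc/sys/vm/overcommit_memory")
if sysctl_file.exists():
    print(overcommit_mode_name(sysctl_file.read_text()))
```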

1.2 __vm_enough_memory


__vm_enough_memory performs the check that matches the value of overcommit_memory.

In the OVERCOMMIT_GUESS case it checks whether enough memory is currently available for the process; if not, it returns -ENOMEM.

copy_process
-=> copy_mm
   -=> dup_mm
     -=> dup_mmap
        -=> security_vm_enough_memory_mm
           -=> __vm_enough_memory

When the check fails, the call chain returns immediately, so fork fails and reports "Cannot allocate memory".
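
To make the failure path concrete, the following is a simplified model of the heuristic check, loosely based on how the OVERCOMMIT_GUESS branch behaves in 4.19-era kernels. It is a sketch under assumptions, not the kernel code: the function name, the parameter names, and the exact set of terms are illustrative, details such as the extra reserve for non-root callers are omitted, and the numbers in the usage lines are made up.

```python
import errno


def vm_enough_memory_guess(requested_pages, free_pages, file_pages, shmem_pages,
                           free_swap_pages, reclaimable_slab_pages, reserved_pages):
    """Return 0 if the request fits the estimate of available pages, else -ENOMEM."""
    # Page cache can be dropped, but shmem pages cannot be reclaimed the same way.
    available = free_pages + (file_pages - shmem_pages)
    # Free swap and reclaimable slab are also counted as potentially available.
    available += free_swap_pages + reclaimable_slab_pages
    # Pages the kernel keeps in reserve are not handed to user processes.
    available -= reserved_pages
    return 0 if available > requested_pages else -errno.ENOMEM


# Hypothetical numbers: a large request against a nearly exhausted system.
result = vm_enough_memory_guess(
    requested_pages=5000, free_pages=1000, file_pages=200, shmem_pages=50,
    free_swap_pages=0, reclaimable_slab_pages=300, reserved_pages=800,
)
print("fork would fail" if result == -errno.ENOMEM else "fork would proceed")
```

In this model a fork of a process with a large address space fails as soon as the estimate drops below the size being duplicated, which matches the symptom where even ordinary commands could not be started.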

2 Problem 2 (OOM)


[ 2983.052162] Node 0 Normal free:25152kB min:27776kB low:159744kB high:291712kB active_anon:0kB inactive_anon:0kB active_file:4992kB inactive_file:448kB unevictable:128kB writepending:0kB present:132120576kB managed:131984704kB mlocked:128kB kernel_stack:22656kB pagetables:4096kB bounce:0kB free_pcp:38144kB local_pcp:384kB free_cma:0kB
[ 2983.077660] lowmem_reserve[]: 0 0 0
[ 2983.103343] Node 1 Normal free:26304kB min:28224kB low:162304kB high:296384kB active_anon:0kB inactive_anon:0kB active_file:1280kB inactive_file:768kB unevictable:320kB writepending:0kB present:134217728kB managed:134081984kB mlocked:320kB kernel_stack:17664kB pagetables:4928kB bounce:0kB free_pcp:26368kB local_pcp:0kB free_cma:0kB
[ 2983.128907] lowmem_reserve[]: 0 0 0
[ 2983.146757] Node 2 Normal free:20672kB min:28224kB low:162304kB high:296384kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:3904kB unevictable:64kB writepending:0kB present:134217728kB managed:134081856kB mlocked:64kB kernel_stack:12224kB pagetables:1024kB bounce:0kB free_pcp:14656kB local_pcp:0kB free_cma:0kB
[ 2983.179350] lowmem_reserve[]: 0 0 0
[ 2983.204603] Node 3 Normal free:26560kB min:27968kB low:160960kB high:293952kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:768kB unevictable:192kB writepending:0kB present:134217728kB managed:133032384kB mlocked:192kB kernel_stack:12800kB pagetables:3264kB bounce:0kB free_pcp:17472kB local_pcp:0kB free_cma:0kB
[ 2983.229132] lowmem_reserve[]: 0 0 0

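On every node, free memory in the Normal zone is already below the min watermark (for example, Node 2 Normal has free:20672kB against min:28224kB), so pages can no longer be allocated.
However, the processes left in the system are all stress processes whose oom_score_adj is OOM_SCORE_ADJ_MIN (-1000), so the OOM killer cannot kill them.
As a result the OOM killer finds no killable process and the kernel panics.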
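
The watermark comparison above can be reproduced mechanically from the log. The following is a minimal sketch, assuming the zone lines are available as text; the function name `zones_below_min` is hypothetical, and the sample lines are shortened copies of the zone lines quoted above.

```python
import re

# Matches the per-zone summary lines, e.g. "Node 0 Normal free:25152kB min:27776kB".
ZONE_LINE = re.compile(r"Node (\d+) (\w+) free:(\d+)kB min:(\d+)kB")


def zones_below_min(dmesg_lines):
    """Return (node, zone, free_kb, min_kb) for zones whose free memory is under min."""
    below = []
    for line in dmesg_lines:
        match = ZONE_LINE.search(line)
        if match:
            node, zone, free_kb, min_kb = match.groups()
            if int(free_kb) < int(min_kb):
                below.append((int(node), zone, int(free_kb), int(min_kb)))
    return below


sample = [
    "[ 2983.000464] Node 0 DMA32 free:503424kB min:256kB low:1664kB",
    "[ 2983.052162] Node 0 Normal free:25152kB min:27776kB low:159744kB",
    "[ 2983.103343] Node 1 Normal free:26304kB min:28224kB low:162304kB",
    "[ 2983.146757] Node 2 Normal free:20672kB min:28224kB low:162304kB",
    "[ 2983.204603] Node 3 Normal free:26560kB min:27968kB low:160960kB",
]

for node, zone, free_kb, min_kb in zones_below_min(sample):
    print(f"Node {node} {zone}: free {free_kb}kB is {min_kb - free_kb}kB below min")
```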

成坚 (CHENG Jian) changed the task status from To Do to In Progress

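The two problems are really the same kind of problem: both are caused by running out of memory, and only the triggering scenario differs.

1-- In the first problem, the check during fork finds that there is not enough memory to allocate, so it reports the error directly and fork fails.
2-- In the second problem, fork has already completed and the process was created normally, but when physical pages are actually allocated for the process the watermark is too low, the allocation fails, and the OOM path is entered, which then cannot kill the stress-ng processes.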
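
To see how many tasks in a dump are exempt from the OOM killer, the "Tasks state" table can be grouped by oom_score_adj. The following is a minimal sketch; the function names and the regular expression are illustrative, and the sample rows are copied from the task table in the dump.

```python
import re
from collections import Counter

OOM_SCORE_ADJ_MIN = -1000  # tasks at this value are never selected by the OOM killer

# Columns after the [pid] field: uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
TASK_ROW = re.compile(
    r"\[\s*(\d+)\]\s+\d+\s+\d+\s+\d+\s+(\d+)\s+\d+\s+\d+\s+(-?\d+)\s+(\S+)"
)


def parse_tasks(dmesg_lines):
    """Return (pid, rss_pages, oom_score_adj, name) for every task row found."""
    tasks = []
    for line in dmesg_lines:
        match = TASK_ROW.search(line)
        if match:
            pid, rss, adj, name = match.groups()
            tasks.append((int(pid), int(rss), int(adj), name))
    return tasks


def protected_tasks(tasks):
    """Names of tasks that the OOM killer is not allowed to kill."""
    return [name for _, _, adj, name in tasks if adj == OOM_SCORE_ADJ_MIN]


sample = [
    "[ 2986.062050] [   2472]     0  2472      264        0   327680       74         -1000 sshd",
    "[ 2986.259117] [  45895]     0 45895      961        4   393216       23         -1000 stress-ng",
    "[ 2986.095132] [  45142]     0 45142     3350        0   393216        0             0 sh",
    "[ 2987.032810] [ 353003]     0 353003      965        0   393216        0          1000 stress-ng-clone",
]

tasks = parse_tasks(sample)
print(Counter(adj for _, _, adj, _ in tasks))
print(protected_tasks(tasks))
```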

成坚 (CHENG Jian) changed the task status from In Progress to Completed
6++ changed the task status from Completed to Accepted
