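Lately the test servers I maintain have been running into OOM more and more often. Each time I tweak a few kernel parameters, which seems to help a little, but that only treats the symptom; the root cause has not been found.
I started by learning some of the basics. After every OOM, the messages log contains entries like the following: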
Jun 18 17:10:23 free-72-222 kernel: oom-killer: gfp_mask=0xd0
Jun 18 17:10:23 free-72-222 kernel: Mem-info:
Jun 18 17:10:23 free-72-222 kernel: DMA per-cpu:
Jun 18 17:10:23 free-72-222 kernel: cpu 0 hot: low 2, high 6, batch 1
Jun 18 17:10:23 free-72-222 kernel: cpu 0 cold: low 0, high 2, batch 1
Jun 18 17:10:23 free-72-222 kernel: cpu 1 hot: low 2, high 6, batch 1
Jun 18 17:10:23 free-72-222 kernel: cpu 1 cold: low 0, high 2, batch 1
Jun 18 17:10:23 free-72-222 kernel: cpu 2 hot: low 2, high 6, batch 1
Jun 18 17:10:27 free-72-222 kernel: cpu 2 cold: low 0, high 2, batch 1
Jun 18 17:10:27 free-72-222 kernel: cpu 3 hot: low 2, high 6, batch 1
Jun 18 17:10:27 free-72-222 kernel: cpu 3 cold: low 0, high 2, batch 1
Jun 18 17:10:27 free-72-222 kernel: Normal per-cpu:
Jun 18 17:10:27 free-72-222 kernel: cpu 0 hot: low 32, high 96, batch 16
Jun 18 17:10:27 free-72-222 kernel: cpu 0 cold: low 0, high 32, batch 16
Jun 18 17:10:27 free-72-222 kernel: cpu 1 hot: low 32, high 96, batch 16
Jun 18 17:10:27 free-72-222 kernel: cpu 1 cold: low 0, high 32, batch 16
Jun 18 17:10:27 free-72-222 kernel: cpu 2 hot: low 32, high 96, batch 16
Jun 18 17:10:27 free-72-222 kernel: cpu 2 cold: low 0, high 32, batch 16
…
Jun 20 14:46:44 free-72-222 kernel: cpu 2 cold: low 0, high 32, batch 16
Jun 20 14:46:44 free-72-222 kernel: cpu 1 cold: low 0, high 32, batch 16
Jun 20 14:46:44 free-72-222 kernel: cpu 2 hot: low 32, high 96, batch 16
Jun 20 14:46:44 free-72-222 kernel: cpu 2 cold: low 0, high 32, batch 16
Jun 20 14:46:44 free-72-222 kernel: cpu 3 hot: low 32, high 96, batch 16
Jun 20 14:46:44 free-72-222 kernel: cpu 3 cold: low 0, high 32, batch 16
Jun 20 14:46:44 free-72-222 kernel:
Jun 20 14:46:44 free-72-222 kernel: Free pages: 35748kB (24320kB HighMem)
Jun 20 14:46:44 free-72-222 kernel: protections[]: 0 0 0
Jun 20 14:46:44 free-72-222 kernel: protections[]: 0 0 0
Jun 20 14:46:44 free-72-222 kernel: protections[]: 0 0 0
Jun 20 14:46:44 free-72-222 kernel: protections[]: 0 0 0
Jun 20 14:46:44 free-72-222 kernel: Normal free:3304kB min:3336kB low:6672kB high:10008kB active:617956kB inactive:0kB present:729088kB pages_scanned:1293 all_unreclaimable? no
Jun 20 14:46:44 free-72-222 kernel: protections[]: 0 0 0
Jun 20 14:46:44 free-72-222 kernel: HighMem free:24320kB min:512kB low:1024kB high:1536kB active:2836904kB inactive:486976kB present:3358720kB pages_scanned:0 all_unreclaimable? no
Jun 20 14:46:44 free-72-222 kernel: protections[]: 0 0 0
Jun 20 14:46:44 free-72-222 kernel: DMA: 3*4kB 2*8kB 6*16kB 4*32kB 5*64kB 1*128kB 1*256kB 2*512kB 2*1024kB 2*2048kB 0*4096kB = 8124kB
Jun 20 14:46:44 free-72-222 kernel: Normal: 0*4kB 1*8kB 0*16kB 1*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3304kB
Jun 20 14:46:44 free-72-222 kernel: HighMem: 5942*4kB 5*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 24320kB
Jun 20 14:46:44 free-72-222 kernel: 428935 pagecache pages
Jun 20 14:46:44 free-72-222 kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Jun 20 14:46:44 free-72-222 kernel: 0 bounce buffer pages
Jun 20 14:46:44 free-72-222 kernel: Free swap: 0kB
Jun 20 14:46:44 free-72-222 kernel: 1026048 pages of RAM
Jun 20 14:46:44 free-72-222 kernel: 839680 pages of HIGHMEM
Jun 20 14:46:44 free-72-222 kernel: 10594 reserved pages
Jun 20 14:46:44 free-72-222 kernel: 413640 pages shared
Jun 20 14:46:44 free-72-222 kernel: 0 pages swap cached
Jun 20 14:46:44 free-72-222 kernel: Out of Memory: Killed process 19148 (java).
That is the kind of log I keep seeing. As for what is in it
...
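As a rough aid for reading a dump like the one above, here is a minimal Python sketch that pulls the free/min/low/high values out of the per-zone lines and re-adds the per-order free-list line. The helper names (`parse_zone`, `zones_below_min`, `buddy_total`) are my own and purely illustrative, and the sample lines are copied from the log with the syslog prefix removed and the trailing fields trimmed. It only compares numbers; it does not by itself identify the root cause.

```python
import re

# Zone summary lines from the log above (syslog prefix and trailing fields trimmed).
ZONE_LINES = [
    "Normal free:3304kB min:3336kB low:6672kB high:10008kB",
    "HighMem free:24320kB min:512kB low:1024kB high:1536kB",
]

# Per-order free-list line for the Normal zone, as "count*blocksize" terms.
NORMAL_BUDDY = (
    "Normal: 0*4kB 1*8kB 0*16kB 1*32kB 1*64kB 1*128kB "
    "0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3304kB"
)

ZONE_RE = re.compile(
    r"(?P<zone>\w+) free:(?P<free>\d+)kB min:(?P<min>\d+)kB "
    r"low:(?P<low>\d+)kB high:(?P<high>\d+)kB"
)


def parse_zone(line):
    """Return the zone name and its values in kB, or None if the line does not match."""
    m = ZONE_RE.search(line)
    if m is None:
        return None
    return {k: (v if k == "zone" else int(v)) for k, v in m.groupdict().items()}


def zones_below_min(lines):
    """Names of zones whose free memory is under their 'min' value."""
    parsed = (parse_zone(line) for line in lines)
    return [z["zone"] for z in parsed if z and z["free"] < z["min"]]


def buddy_total(line):
    """Sum the 'count*sizekB' terms of a free-list line, in kB."""
    return sum(int(n) * int(size) for n, size in re.findall(r"(\d+)\*(\d+)kB", line))


if __name__ == "__main__":
    print(zones_below_min(ZONE_LINES))
    print(buddy_total(NORMAL_BUDDY))
```

Run against the two zone lines here, only Normal is reported (free 3304kB against min 3336kB), while HighMem is still well above its min; the free-list terms for Normal add up to the same 3304kB that the log prints after the equals sign.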