jjzjj

Kubernetes(K8S) Node NotReady 节点资源不足 Pod无法运行

VipSoft 2023-03-28 原文

k8s 线上集群中 Node 节点状态变成 NotReady 状态,导致整个 Node 节点中容器停止服务。

一个 Node 节点中是可以运行多个 Pod 容器,每个 Pod 容器可以运行多个实例 App 容器。Node 节点不可用,就会直接导致 Node 节点中所有的容器不可用,Node 节点是否健康,直接影响该节点下所有的实例容器的健康状态,直至影响整个 K8S 集群

kubectl top node NotFound

# 查看节点的资源情况
[root@k8smaster ~]# kubectl top node
NAME        CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
k8smaster   269m         13%    1699Mi          22%       
k8snode1    1306m        65%    9705Mi          82%       
k8snode2    288m         14%    8100Mi          68%

# 查看节点状态
[root@k8smaster ~]# kubectl get nodes
NAME        STATUS     ROLES    AGE   VERSION
k8smaster   Ready      master   33d   v1.18.19
k8snode1    NotReady   <none>   33d   v1.18.19
k8snode2    Ready      <none>   33d   v1.18.19
# 查看节点日志
[root@k8smaster ~]# kubectl describe nodes k8snode1
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                1 (50%)       7100m (355%)
  memory             7378Mi (95%)  14556Mi (188%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
Events:
  Type     Reason                   Age                   From     Message
  ----     ------                   ----                  ----     -------
  Warning  SystemOOM                30m                   kubelet  System OOM encountered, victim process: java, pid: 29417
  Warning  SystemOOM                30m                   kubelet  System OOM encountered, victim process: java, pid: 29418
  Warning  SystemOOM                30m                   kubelet  System OOM encountered, victim process: java, pid: 29430
  Warning  SystemOOM                30m                   kubelet  System OOM encountered, victim process: erl_child_setup, pid: 26391
  Warning  SystemOOM                30m                   kubelet  System OOM encountered, victim process: beam.smp, pid: 26134
  Warning  SystemOOM                30m                   kubelet  System OOM encountered, victim process: 1_scheduler, pid: 26392
  Warning  SystemOOM                29m                   kubelet  System OOM encountered, victim process: java, pid: 28855
  Warning  SystemOOM                29m                   kubelet  System OOM encountered, victim process: java, pid: 28637
  Warning  SystemOOM                28m                   kubelet  System OOM encountered, victim process: java, pid: 29348
  Normal   NodeHasSufficientMemory  24m (x5 over 3h11m)   kubelet  Node k8snode1 status is now: NodeHasSufficientMemory
  Normal   NodeHasSufficientPID     24m (x5 over 3h11m)   kubelet  Node k8snode1 status is now: NodeHasSufficientPID
  Normal   NodeHasNoDiskPressure    24m (x5 over 3h11m)   kubelet  Node k8snode1 status is now: NodeHasNoDiskPressure
  Warning  SystemOOM                9m57s (x26 over 28m)  kubelet  (combined from similar events): System OOM encountered, victim process: java, pid: 30289
  Normal   NodeReady                5m38s (x9 over 30m)   kubelet  Node k8snode1 status is now: NodeReady
# 查看 pod 分在哪些节点上,发现 都在node1 上,【这是问题所在】
[root@k8smaster ~]# kubectl get pod,svc -n thothehp-test -o wide 
NAME                          READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
pod/basic-67ffd66f55-zjrx5     1/1     Running   13         45h     10.244.1.89   k8snode1   <none>           <none>
pod/c-api-69c786b7d7-m5brp   1/1     Running   11         3h53m   10.244.1.78   k8snode1   <none>           <none>
pod/d-api-6f8948ccd7-7p6pb    1/1     Running   12         139m    10.244.1.82   k8snode1   <none>           <none>
pod/gateway-5c84bc8775-pk86m   1/1     Running   7          25h     10.244.1.84   k8snode1   <none>           <none>
pod/im-5fc6c47d75-dl9g4        1/1     Running   8          83m     10.244.1.86   k8snode1   <none>           <none>
pod/medical-5f55855785-qr7r5   1/1     Running   12         83m     10.244.1.90   k8snode1   <none>           <none>
pod/pay-5d98658dbc-ww4sg       1/1     Running   11         83m     10.244.1.88   k8snode1   <none>           <none>
pod/elasticsearch-0            1/1     Running   0          80m     10.244.2.66   k8snode2   <none>           <none>
pod/emqtt-54b6f4497c-s44jz     1/1     Running   5          83m     10.244.1.83   k8snode1   <none>           <none>
pod/nacos-0                    1/1     Running   0          80m     10.244.2.67   k8snode2   <none>           <none>

NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                 AGE     SELECTOR
service/nacos-headless          ClusterIP   None             <none>        8848/TCP,7848/TCP       45h     app=nacos
service/service-basic           ClusterIP   None             <none>        80/TCP                  45h     app=ehp-basic
service/service-c-api           ClusterIP   None             <none>        80/TCP                  3h53m   app=ehp-cms-api
service/service-d-api           ClusterIP   None             <none>        80/TCP                  139m    app=ehp-ds-api
service/service-gateway         NodePort    10.101.194.234   <none>        80:30180/TCP            25h     app=ehp-gateway
service/service-im              ClusterIP   None             <none>        80/TCP                  129m    app=ehp-im
service/service-medical         ClusterIP   None             <none>        80/TCP                  111m    app=ehp-medical
service/service-pay             ClusterIP   10.111.162.80    <none>        80/TCP                  93m     app=ehp-pay
service/service-elasticsearch   ClusterIP   10.111.74.111    <none>        9200/TCP,9300/TCP       2d3h    app=elasticsearch
service/service-emqtt           NodePort    10.106.201.96    <none>        61613:31616/TCP,8083:30804/TCP    2d5h  app=emqtt
service/service-nacos           NodePort    10.106.166.59    <none>        8848:30848/TCP,7848:31176/TCP     45h   app=nacos
[root@k8smaster ~]# 

加大内存,重启,内存加大后,会自动分配一些到 Node2 上面,也可以能过 label 指定某个 POD 选择哪个 Node 节点

# 需要重启docker
[root@k8snode1 ~]# systemctl restart docker

# 需要重启kubelet
[root@k8snode1 ~]# sudo systemctl restart kubelet

kubectl top node NotFound

# 查看节点的资源情况
[root@k8smaster ~]# kubectl top node
NAME        CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
k8smaster   269m         13%    1699Mi          22%       
k8snode1    1306m        65%    9705Mi          82%       
k8snode2    288m         14%    8100Mi          68%

有关Kubernetes(K8S) Node NotReady 节点资源不足 Pod无法运行的更多相关文章

  1. ruby-on-rails - 由于 "wkhtmltopdf",PDFKIT 显然无法正常工作 - 2

    我在从html页面生成PDF时遇到问题。我正在使用PDFkit。在安装它的过程中,我注意到我需要wkhtmltopdf。所以我也安装了它。我做了PDFkit的文档所说的一切......现在我在尝试加载PDF时遇到了这个错误。这里是错误:commandfailed:"/usr/local/bin/wkhtmltopdf""--margin-right""0.75in""--page-size""Letter""--margin-top""0.75in""--margin-bottom""0.75in""--encoding""UTF-8""--margin-left""0.75in""-

  2. ruby-on-rails - 无法使用 Rails 3.2 创建插件? - 2

    我对最新版本的Rails有疑问。我创建了一个新应用程序(railsnewMyProject),但我没有脚本/生成,只有脚本/rails,当我输入ruby./script/railsgeneratepluginmy_plugin"Couldnotfindgeneratorplugin.".你知道如何生成插件模板吗?没有这个命令可以创建插件吗?PS:我正在使用Rails3.2.1和ruby​​1.8.7[universal-darwin11.0] 最佳答案 随着Rails3.2.0的发布,插件生成器已经被移除。查看变更日志here.现在

  3. ruby - 无法运行 Rails 2.x 应用程序 - 2

    我尝试运行2.x应用程序。我使用rvm并为此应用程序设置其他版本的ruby​​:$rvmuseree-1.8.7-head我尝试运行服务器,然后出现很多错误:$script/serverNOTE:Gem.source_indexisdeprecated,useSpecification.Itwillberemovedonorafter2011-11-01.Gem.source_indexcalledfrom/Users/serg/rails_projects_terminal/work_proj/spohelp/config/../vendor/rails/railties/lib/r

  4. ruby-on-rails - 无法在centos上安装therubyracer(V8和GCC出错) - 2

    我正在尝试在我的centos服务器上安装therubyracer,但遇到了麻烦。$geminstalltherubyracerBuildingnativeextensions.Thiscouldtakeawhile...ERROR:Errorinstallingtherubyracer:ERROR:Failedtobuildgemnativeextension./usr/local/rvm/rubies/ruby-1.9.3-p125/bin/rubyextconf.rbcheckingformain()in-lpthread...yescheckingforv8.h...no***e

  5. ruby - 无法让 RSpec 工作—— 'require' : cannot load such file - 2

    我花了三天的时间用头撞墙,试图弄清楚为什么简单的“rake”不能通过我的规范文件。如果您遇到这种情况:任何文件夹路径中都不要有空格!。严重地。事实上,从现在开始,您命名的任何内容都没有空格。这是我的控制台输出:(在/Users/*****/Desktop/LearningRuby/learn_ruby)$rake/Users/*******/Desktop/LearningRuby/learn_ruby/00_hello/hello_spec.rb:116:in`require':cannotloadsuchfile--hello(LoadError) 最佳

  6. ruby - 无法覆盖 irb 中的 to_s - 2

    我在pry中定义了一个函数:to_s,但我无法调用它。这个方法去哪里了,怎么调用?pry(main)>defto_spry(main)*'hello'pry(main)*endpry(main)>to_s=>"main"我的ruby版本是2.1.2看了一些答案和搜索后,我认为我得到了正确的答案:这个方法用在什么地方?在irb或pry中定义方法时,会转到Object.instance_methods[1]pry(main)>defto_s[1]pry(main)*'hello'[1]pry(main)*end=>:to_s[2]pry(main)>defhello[2]pry(main)

  7. ruby - 无法在 60 秒内获得稳定的 Firefox 连接 (127.0.0.1 :7055) - 2

    我使用的是Firefox版本36.0.1和Selenium-Webdrivergem版本2.45.0。我能够创建Firefox实例,但无法使用脚本继续进行进一步的操作无法在60秒内获得稳定的Firefox连接(127.0.0.1:7055)错误。有人能帮帮我吗? 最佳答案 我遇到了同样的问题。降级到firefoxv33后一切正常。您可以找到旧版本here 关于ruby-无法在60秒内获得稳定的Firefox连接(127.0.0.1:7055),我们在StackOverflow上找到一个类

  8. ruby - 安装 Ruby 时遇到问题(无法下载资源 "readline--patch") - 2

    当我尝试安装Ruby时遇到此错误。我试过查看this和this但无济于事➜~brewinstallrubyWarning:YouareusingOSX10.12.Wedonotprovidesupportforthispre-releaseversion.Youmayencounterbuildfailuresorotherbreakages.Pleasecreatepull-requestsinsteadoffilingissues.==>Installingdependenciesforruby:readline,libyaml,makedepend==>Installingrub

  9. ruby-on-rails - 无法让 rspec、spork 和调试器正常运行 - 2

    GivenIamadumbprogrammerandIamusingrspecandIamusingsporkandIwanttodebug...mmm...let'ssaaay,aspecforPhone.那么,我应该把“require'ruby-debug'”行放在哪里,以便在phone_spec.rb的特定点停止处理?(我所要求的只是一个大而粗的箭头,即使是一个有挑战性的程序员也能看到:-3)我已经尝试了很多位置,除非我没有正确测试它们,否则会发生一些奇怪的事情:在spec_helper.rb中的以下位置:require'rubygems'require'spork'

  10. ruby-on-rails - Rails 3,嵌套资源,没有路由匹配 [PUT] - 2

    我真的为这个而疯狂。我一直在搜索答案并尝试我找到的所有内容,包括相关问题和stackoverflow上的答案,但仍然无法正常工作。我正在使用嵌套资源,但无法使表单正常工作。我总是遇到错误,例如没有路线匹配[PUT]"/galleries/1/photos"表格在这里:/galleries/1/photos/1/edit路线.rbresources:galleriesdoresources:photosendresources:galleriesresources:photos照片Controller.rbdefnew@gallery=Gallery.find(params[:galle

随机推荐