OpenVZ / OVZ-5778

042stab079.4 load average zero'd

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: OpenVZ-legacy
    • Component/s: Containers::Kernel
    • Security Level: Public
    • Environment:
      Operating System: RHEL/CentOS 6
      Platform: x86_64 (AMD64)

      Description

      Hi,

      We upgraded an OpenVZ server earlier this week to 042stab079.4 in an attempt to resolve an odd issue we have experienced with the kipmi0 thread.

      Once the server has been booted and online for approximately 5 minutes, the node's load average reaches 0.00/0.00/0.00 and stays there from that time onward. The load does not increase regardless of process or IO activity. Attached is a text file with a few lines from top.

      Most of the time we see individual container load averages that are steady and at normal levels (0.00 <= load < ~1.00):

      # vzlist -o ctid,laverage
            CTID LAVERAGE
            4851 0.01/0.07/0.03
            4875 0.00/0.01/0.00
            5154 0.00/0.00/0.00
            5237 0.01/0.09/0.06
            5312 0.00/0.00/0.00
      .. snip ..

      However, there are times when vzlist reports extremely high load averages on these containers for no apparent reason. We have monitoring software on this node that dumps a process listing of these containers during this condition. At the same time that vzlist reports the extremely high load, the process dump shows the same load average from within that container.

      Here is an example of top from an individual container which experienced the problem:

      --
      top - 04:07:19 up 6:56, 0 users, load average: 4447.13, 448500469.55, 228660381686.36
      Tasks: 43 total, 2 running, 41 sleeping, 0 stopped, 0 zombie
      Cpu(s): 7.4%us, 0.4%sy, 0.1%ni, 92.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.2%st
      Mem: 786432k total, 492816k used, 293616k free, 0k buffers
      Swap: 1048576k total, 0k used, 1048576k free, 175400k cached
      --

      While this container reported a load > 4000, the node itself was underloaded (plenty of idle CPU, IO, etc.) and reporting a load of 0.00.

      This behavior in the individual container loads seems to last only a few seconds before disappearing, and it appears to roll through a few random containers in sequence. I have not been able to determine what triggers this or how to reproduce it on demand.
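
      Because the spikes last only seconds, a polling check may be the easiest way to catch them. The following is a minimal sketch, not the monitoring software mentioned above: it parses the output of the `vzlist -o ctid,laverage` command shown earlier and flags containers whose load average is implausibly high. The function names and the threshold of 1000 are assumptions chosen for illustration.

```python
import subprocess


def parse_laverage(text):
    """Parse `vzlist -o ctid,laverage` output into {ctid: (load1, load5, load15)}."""
    loads = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything not shaped like "CTID a/b/c".
        if len(fields) != 2 or not fields[0].isdigit():
            continue
        parts = fields[1].split("/")
        if len(parts) != 3:
            continue
        try:
            loads[int(fields[0])] = tuple(float(p) for p in parts)
        except ValueError:
            continue
    return loads


def flag_implausible(loads, threshold=1000.0):
    """Return the sorted CTIDs whose highest load figure exceeds the threshold.

    The default threshold is an arbitrary assumption; tune it per node.
    """
    return sorted(ctid for ctid, avg in loads.items() if max(avg) > threshold)


def read_vzlist():
    """Run vzlist on the node and return its raw text output."""
    result = subprocess.run(
        ["vzlist", "-o", "ctid,laverage"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

      On the node, `flag_implausible(parse_laverage(read_vzlist()))` could be called every second or so, logging a timestamp whenever the returned list is non-empty, to help correlate the spikes with other activity.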

      I thought this issue was due to it being a testing kernel, but I noticed today that 79.4 was pushed to stable. We did not see this issue with 78.28.

      Linux **** 2.6.32-042stab079.4 #1 SMP Thu Jul 18 18:57:29 MSK 2013 x86_64 x86_64 x86_64 GNU/Linux


      --


      Note: The issue that prompted us to upgrade the kernel from 78.28 is that the kipmi0 process is constantly sitting at 100% CPU on this server. At the same time we see poor network performance intermittently. This process/behavior can be seen in the attached file. I am not certain of the cause of the kipmi0 issue, but we see it on both 78.28 and 79.4, which leads me to conclude it is not related to the load average topic of this bug report.

      Regards,

      Bryon

              People

              Assignee:
              khorenko Konstantin Khorenko
              Reporter:
              bryon@x10hosting.com Bryon Elston
              Votes:
              0
              Watchers:
              5