  OpenVZ / OVZ-7488

openvz-7.0.21-55: vcmmd-8.0.88-1.vz7 broken!

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: Vz7.0-Update-next
    • Component/s: VCMMD (Memory Manager)
    • Security Level: Public
    • Environment:
      OpenVZ 7, fully YUM updated
      Hardware: Various systems

      Description

      >Description of problem:

      After a "yum update" on any of our various OpenVZ 7 nodes we were unable to start or restart VMs or CTs:

      [root@node ~]# prlctl start 727
      WARNING: You are using a deprecated CLI component that won't be installed by default in the next major release. Please use virsh instead
      Starting the CT...
      Failed to start the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: vcmmd: failed to register Container: Failed to get VCMMD D-Bus name
      vcmmd: failed to unregister Container: Failed to get VCMMD D-Bus name
      vcmmd: failed to unregister Container: Failed to get VCMMD D-Bus name
      Failed to start the Container
      )

      Service "vcmmd" was reported as stopped and could not be restarted:

      [root@node ~]# systemctl restart vcmmd
      Job for vcmmd.service failed because the control process exited with error code. See "systemctl status vcmmd.service" and "journalctl -xe" for details.

      Attempts to manually run "vcmmd" in interactive mode produced this output:

      [root@node ~]# vcmmd -i
      2024-01-25 16:00:46 INFO vcmmd: Started
      2024-01-25 16:00:46 INFO vcmmd.config: Loading config from file '/etc/vz/vcmmd.conf'
      2024-01-25 16:00:46 INFO vcmmd.host: [redacted]: 67101446144 bytes available for VEs
      2024-01-25 16:00:46 ERROR vcmmd.host: [redacted]: Memory cgroup vstorage.slice does not exist
      2024-01-25 16:00:46 ERROR vcmmd.ldmgr: Failed to load policy "density": Policy not found
      2024-01-25 16:00:46 INFO vcmmd.ldmgr: Switch to fallback policy
      2024-01-25 16:00:46 INFO vcmmd.ldmgr: Loaded policy "NoOpPolicy"
      2024-01-25 16:00:46 CRITICAL vcmmd: Terminating program due to unhandled exception:
      2024-01-25 16:00:46 CRITICAL vcmmd: Traceback (most recent call last):
      2024-01-25 16:00:46 CRITICAL vcmmd: File "/usr/lib/python3.6/site-packages/vcmmd/util/threading.py", line 43, in run_with_except_hook
      2024-01-25 16:00:46 CRITICAL vcmmd: run_original(*args2, **kwargs2)
      2024-01-25 16:00:46 CRITICAL vcmmd: File "/usr/lib64/python3.6/threading.py", line 864, in run
      2024-01-25 16:00:46 CRITICAL vcmmd: self._target(*self._args, **self._kwargs)
      2024-01-25 16:00:46 CRITICAL vcmmd: File "/usr/lib/python3.6/site-packages/vcmmd/ldmgr/policy.py", line 82, in wrapper
      2024-01-25 16:00:46 CRITICAL vcmmd: sleep_timeout = f(self, *args, **kwargs)
      2024-01-25 16:00:46 CRITICAL vcmmd: File "/usr/lib/python3.6/site-packages/vcmmd/ldmgr/policy.py", line 323, in ksm_controller
      2024-01-25 16:00:46 CRITICAL vcmmd: params = self.get_ksm_params()
      2024-01-25 16:00:46 CRITICAL vcmmd: File "/usr/lib/python3.6/site-packages/vcmmd/ldmgr/policies/NoOpPolicy.py", line 54, in get_ksm_params
      2024-01-25 16:00:46 CRITICAL vcmmd: if self.active_vm < ksm_vms_active_threshold or \
      2024-01-25 16:00:46 CRITICAL vcmmd: AttributeError: 'NoOpPolicy' object has no attribute 'active_vm'
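
      The traceback makes the chain of events fairly clear: loading the "density" policy fails ("Policy not found"), vcmmd falls back to "NoOpPolicy", and that fallback's get_ksm_params() reads self.active_vm, an attribute the fallback apparently never initializes. The KSM controller thread then dies with the AttributeError above and takes the daemon down with it. Whether the "density" policy module even shipped in vcmmd-8.0.88-1.vz7 can be checked against the paths from the traceback; note that the file name "density.py" is our assumption:

      # The traceback places the policy modules here:
      ls /usr/lib/python3.6/site-packages/vcmmd/ldmgr/policies/

      # Cross-check against the package manifest ("density.py" is an
      # assumed file name; the path is taken from the traceback):
      rpm -ql vcmmd | grep -i density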

      CTs and VMs that had been running during the yum update continued to run, but failed to start again if restarted.

      The only remedy was to roll back to the last known-good version of "vcmmd":

      rpm -hUv --force https://download.openvz.org/virtuozzo/releases/openvz-7.0.20-147/x86_64/os/Packages/v/vcmmd-8.0.77-1.vz7.noarch.rpm
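
      To keep the next "yum update" from pulling the broken build right back in, the downgraded package can also be locked until a fixed build is out (assuming the versionlock plugin is acceptable on the node):

      yum install -y yum-plugin-versionlock
      yum versionlock add vcmmd

      # Once a fixed vcmmd build is released, drop the lock again:
      yum versionlock clear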

      >How reproducible:

      yum clean all
      yum update
      prlctl restart <vpsid>
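
      Before restarting anything, one can confirm that the broken build actually landed on the node (version string as per this report):

      rpm -q vcmmd
      # vcmmd-8.0.88-1.vz7.noarch on an affected node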

      >Host OS:

      Fully YUM updated OpenVZ 7. Various kernel versions, as some nodes hadn't been rebooted in a while: the youngest had 17 days of uptime, the oldest 983 days.

      >Guest OS:

      Any; it doesn't matter. All VMs and CTs are affected regardless of guest OS. We run EL7 and EL8 as well as various Debian versions as guest OSes.

      >Additional info:

      We had a dozen affected nodes in house, plus a whole bunch more at client sites that we still need to reach out to.

      Do you by chance test YUM updates before release? If not, why not? If yes, your test procedure could probably use an overhaul.


      People

      Assignee: Sergej Parshikov (svarog)
      Reporter: Michael Stauber (mstauber)
