  1. OpenVZ
  2. OVZ-7496

logind sessions stuck in closing state cause slow session creation within AlmaLinux 9 containers, seemingly because systemd is being passed the -z parameter

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: Vz7.0-Update-next
    • Component/s: Containers::Kernel
    • Security Level: Public
    • Environment:

      Description

      >Description of problem:

      This is similar to this report: https://bugs.openvz.org/browse/OVZ-7216

      However, the cause of the delay in this instance is that loginctl list-sessions shows hundreds or even thousands of sessions sitting in the 'closing' state, and they are not closed until you run systemctl daemon-reload.

      As the sessions build up, communications via dbus get slower and slower, ultimately resulting in the inability to connect via SSH, slower cgroups creation, etc - anything that creates a new session slows down while it waits for dbus to respond.

      When the number of sessions stuck in the closing state grows too large (usually into the thousands), even running loginctl list-sessions can hang or return with a timeout.
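To see how many sessions sit in each state, a small tally like the following can help. This is only a sketch: the function names `count_states` and `session_states` are made up for illustration, and the 30s/10s `timeout` values are arbitrary guards against the loginctl hangs described above, not tuned numbers.

```shell
#!/bin/sh
# Tally "State=..." lines read from stdin, printing e.g. "closing 3".
count_states() {
    sed -n 's/^State=//p' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Print one "State=..." line per logind session.
# Each loginctl call is bounded with timeout(1) because loginctl itself
# may hang once the backlog of sessions is large (assumed limits).
session_states() {
    timeout 30 loginctl list-sessions --no-legend |
    awk '{ print $1 }' |
    while read -r id; do
        timeout 10 loginctl show-session "$id" -p State </dev/null
    done
}

# Usage (not executed here):
#   session_states | count_states
```

With thousands of sessions this loop issues one dbus query per session, so it will itself be slow on an affected container.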

      As a workaround, we run the following from cron every minute to clear the sessions before they reach a critical mass:

      systemctl | grep abandoned | grep -e "[[:digit:]]" | sed "s/.scope.*/.scope/" | xargs systemctl stop
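A slightly more defensive variant of that one-liner might look like the sketch below. It assumes the abandoned units are logind session scopes named `session-<number>.scope`, which is narrower than the original grep. The function names are hypothetical. `xargs -r` (a GNU extension) keeps `systemctl stop` from being invoked with no arguments when nothing matches.

```shell
#!/bin/sh
# Read `systemctl list-units`-style lines on stdin and print the names of
# session scopes on lines that mention "abandoned".
# Assumption: the units of interest are named session-<number>.scope.
abandoned_session_scopes() {
    awk '/abandoned/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^session-[0-9]+\.scope$/) print $i
    }'
}

# Stop every abandoned session scope; does nothing if none are found.
stop_abandoned_sessions() {
    systemctl list-units --type=scope --no-legend --no-pager |
    abandoned_session_scopes |
    xargs -r systemctl stop
}

# Usage (not executed here), e.g. from a cron job:
#   stop_abandoned_sessions
```

Matching on the unit-name field instead of trimming the line with sed avoids picking up unrelated text from the description column.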

      This appears to have been fixed long ago in systemd: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1591411

      This has me wondering whether this is a vz-specific version of that bug, stemming from changes to either cgroups or logind session handling between the host node's v3 kernel and the v5 kernel the container expects.

      >How reproducible:

      The creation and closing of thousands of sessions in a short span of time needs to be replicated. On our systems this occurs by having Plesk cgroups configured for customer accounts such that every php-fpm process launched is opened within a cgroup, thus creating numerous cgroup sessions and then trying to close them once the php-fpm process isn't needed any longer.

      I'm not certain if this can be replicated outside of this environment as I don't have enough experience with creating cgroups manually.

      >Actual results:

      With enough logind sessions stuck in closing state:
      - initd and/or logind can sit at 100% CPU on one core indefinitely
      - loginctl list-sessions can hang
      - SSH logins can hang for 30s or longer, same goes for su once you get logged in
      - system load increases as various operations (like cgroups creation) are slowed down
      - We've also seen a significant increase in kernel memory usage recently that might be connected to this.

      >Expected results:

      In our AlmaLinux 8 containers with Plesk and the same cgroups configuration on the same host nodes, logind sessions average around 20-30 and do not get stuck in the 'closing' state, which is what I would expect with AlmaLinux 9 as well.

      >Host OS:

      Virtuozzo Linux 7.9 / 3.10.0-1160.80.1.vz7.191.4 / OpenVZ release 7.0.21 (55)

      >Guest OS:

      AlmaLinux release 9.3 / 5.14.0 / systemd-219-78.vl7.7.1.x86_64 / dbus-1.10.24-15.vl7.1.x86_64

            People

            Assignee:
            aleksandr.leskin Aleksandr Leskin
            Reporter:
            websavers Jordan