  1. OpenVZ
  2. OVZ-5587

Incorrect bind() syscall behavior when the address is already in use and the container was resumed

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Fix Version/s: OpenVZ-legacy
    • Component/s: Containers::Kernel
    • Security Level: Public
    • Environment:
      Operating System: RHEL/CentOS 6
      Platform: x86_64 (AMD64)

      Description

      Steps to Reproduce:

      1. Install the latest CentOS 6.x.
      2. Install the latest OpenVZ for CentOS 6.x.
      3. Create a container (32-bit or 64-bit; for example, CT 123) and enter it.
      4. Install the latest stable nginx inside this container (container IP: 172.22.22.123).
      5. Configure nginx with "listen 172.22.22.123:80 default_server backlog=1024".
      6. Start nginx and enable nginx service autostart (chkconfig nginx on).
      7. Restart the entire server (command "restart" in an SSH session on the hardware node).
      8. After the restart, log in to the container with nginx and run "strace nginx -t".

      Actual Results:

      In the "nginx -t" instance, bind() succeeds and EADDRINUSE is reported only later, by listen():

      # netstat -tunlp | grep :80
      tcp 0 0 172.22.22.123:80 0.0.0.0:* LISTEN 596/nginx
      tcp 0 0 127.0.0.1:80 0.0.0.0:* LISTEN 589/httpd

      # strace nginx -t
      bind(12, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("172.22.22.123")}, 16) = 0
      listen(12, 1024) = -1 EADDRINUSE (Address already in use)
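
The signature in the strace excerpt above (bind() returns 0, then listen() on the same descriptor fails with EADDRINUSE) can be checked for mechanically. The following is only a minimal sketch: the function name `find_late_eaddrinuse` and the regular expression are illustrative assumptions, and it handles only plain `strace` lines without `[pid N]` prefixes or timestamps.

```python
import re

# Matches plain strace lines for bind()/listen(), e.g.
#   listen(12, 1024) = -1 EADDRINUSE (Address already in use)
# Groups: syscall name, file descriptor, return value, optional errno name.
SYSCALL_RE = re.compile(r'^(bind|listen)\((\d+),.*\)\s*=\s*(-?\d+)(?:\s+(\w+))?')


def find_late_eaddrinuse(strace_lines):
    """Return descriptors where bind() returned 0 but a later listen()
    on the same descriptor failed with EADDRINUSE."""
    bound_ok = set()
    hits = []
    for line in strace_lines:
        match = SYSCALL_RE.match(line.strip())
        if not match:
            continue  # not a bind()/listen() line in the expected format
        name = match.group(1)
        fd = int(match.group(2))
        ret = int(match.group(3))
        err = match.group(4)
        if name == "bind":
            if ret == 0:
                bound_ok.add(fd)
            else:
                bound_ok.discard(fd)
        elif ret == -1 and err == "EADDRINUSE" and fd in bound_ok:
            hits.append(fd)
    return hits
```

Feeding it the output of "strace nginx -t" (for example, read from a file written with strace's `-o` option) returns the affected descriptors, or an empty list when bind() itself rejects the address.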

      Expected Results:

      bind() fails with EADDRINUSE, because address 172.22.22.123:80 is already in use by 596/nginx:

      # netstat -tunlp | grep :80
      tcp 0 0 172.22.22.123:80 0.0.0.0:* LISTEN 596/nginx
      tcp 0 0 127.0.0.1:80 0.0.0.0:* LISTEN 589/httpd

      # strace nginx -t
      bind(12, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("172.22.22.123")}, 16) = -1 EADDRINUSE (Address already in use)
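
The expected behaviour can also be probed without nginx. The sketch below is an assumption-laden stand-in, not the reporter's test: it opens a listener on a kernel-chosen port, then tries to bind() and listen() a second socket on the same address and reports which call fails. The function name `probe_bind_listen`, the default host 127.0.0.1, and the optional `reuse_addr` flag are all hypothetical; inside the affected container the host would be the container address (172.22.22.123).

```python
import errno
import socket


def probe_bind_listen(host="127.0.0.1", backlog=1024, reuse_addr=False):
    """Compete with an existing listener and report where the conflict shows.

    Returns ("bind", name) or ("listen", name) for the failing call,
    or ("none", None) if both calls unexpectedly succeed.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_addr:
            # Optional: set SO_REUSEADDR on both sockets before binding.
            first.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            second.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        first.bind((host, 0))          # port 0: the kernel picks a free port
        first.listen(backlog)          # this socket plays the running server
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # same address and port as the listener
        except OSError as exc:
            return ("bind", errno.errorcode.get(exc.errno, str(exc.errno)))
        try:
            second.listen(backlog)
        except OSError as exc:
            return ("listen", errno.errorcode.get(exc.errno, str(exc.errno)))
        return ("none", None)
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print(probe_bind_listen())
```

A result whose first element is "bind" corresponds to the expected results above, while "listen" would correspond to the behaviour described under Actual Results. Note that this probe creates both sockets in one process after any resume, so it is not guaranteed to reproduce the bug, which involves a listener that existed before the container was suspended.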

      Additional Information:

      A temporary workaround exists: after restarting the container ("vzctl restart 123"), bind() inside the container works as expected, but only until the next full server reboot. During a reboot, the vz service suspends and resumes all containers, and the bind() bug appears again.

      vzctl.log:
      ...
      vzctl : CT 123 : Checkpointing completed successfully
      ...
      vzctl : CT 123 : Dump file /vz/dump/Dump.123 exists, trying to restore from it
      vzctl : CT 123 : Restoring container ...
      vzctl : CT 123 : Container is mounted
      vzctl : CT 123 : Adding IP address(es): 172.22.22.123
      vzctl : CT 123 : Setting CPU units: 1000
      vzctl : CT 123 : Container start in progress...
      vzctl : CT 123 : Restoring completed successfully
      ...
      ============ now BUG can be observed ===================
      ...
      vzctl : CT 123 : Restarting container
      vzctl : CT 123 : Stopping container ...
      vzctl : CT 123 : Container was stopped
      vzctl : CT 123 : Container is unmounted
      vzctl : CT 123 : Starting container...
      vzctl : CT 123 : Container is mounted
      vzctl : CT 123 : Adding IP address(es): 172.22.22.123
      vzctl : CT 123 : Setting CPU units: 1000
      vzctl : CT 123 : Container start in progress...
      ...
      ============== no more BUG, bind() works as expected =============

      This bug first appeared with recent 2.6.32-042stabXXXX kernels, after vzctl changed its behaviour to checkpoint/restore containers during a server restart.

      With the old vzctl (without VE_STOP_MODE=suspend), the bind() syscall bug was not observed (suspend was not used at all).

              People

              Assignee:
              khorenko Konstantin Khorenko
              Reporter:
              gmm@csdoc.com Gena Makhomed
              Votes:
              1
              Watchers:
              9