Details
-
Type: Bug
-
Status: Open
-
Priority: Minor
-
Resolution: Unresolved
-
Fix Version/s: OpenVZ-legacy
-
Component/s: Containers::Kernel
-
Security Level: Public
-
Environment: Operating System: RHEL/CentOS 6
Platform: x86_64 (AMD64)
-
External issue URL:
-
External issue ID: 2470
Description
Steps to Reproduce:
1. Install the latest CentOS 6.x.
2. Install the latest OpenVZ for CentOS 6.x.
3. Create a container (32-bit or 64-bit; for example, CT 123) and enter it.
4. Install the latest stable nginx inside this container (the container has IP 172.22.22.123).
5. Configure nginx with "listen 172.22.22.123:80 default_server backlog=1024".
6. Start nginx and enable nginx service autostart (chkconfig nginx on).
7. Restart the entire server (command "restart" on the hardware node over ssh).
8. After the restart, log in to the nginx container and run "strace nginx -t".
Actual Results:
The "nginx -t" instance gets EADDRINUSE only from listen(), after bind() has already succeeded:
# netstat -tunlp | grep :80
tcp 0 0 172.22.22.123:80 0.0.0.0:* LISTEN 596/nginx
tcp 0 0 127.0.0.1:80 0.0.0.0:* LISTEN 589/httpd
# strace nginx -t
bind(12, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("172.22.22.123")}, 16) = 0
listen(12, 1024) = -1 EADDRINUSE (Address already in use)
Expected Results:
EADDRINUSE from bind(), because address 172.22.22.123 is already in use by 596/nginx:
# netstat -tunlp | grep :80
tcp 0 0 172.22.22.123:80 0.0.0.0:* LISTEN 596/nginx
tcp 0 0 127.0.0.1:80 0.0.0.0:* LISTEN 589/httpd
# strace nginx -t
bind(12, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("172.22.22.123")}, 16) = -1 EADDRINUSE (Address already in use)
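The difference between the two results can be checked without nginx. The following is a minimal sketch, not the nginx code path: it opens a listening socket, then tries to bind() and listen() a second socket on the same address and reports which call returns EADDRINUSE. The function names, the use of 127.0.0.1 with an ephemeral port, and the SO_REUSEADDR setting (assumed here to mirror what nginx does) are illustrative assumptions. On an unaffected kernel the expected answer is "bind"; inside a restored container, calling first_eaddrinuse_call("172.22.22.123", 80) while nginx is running would presumably report "listen".

```python
import errno
import socket


def first_eaddrinuse_call(addr, port, backlog=1024):
    """Return 'bind' or 'listen' for the call that fails with EADDRINUSE,
    or None if both calls succeed. Hypothetical helper for illustration."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Assumption: mirror a typical server setup by setting SO_REUSEADDR.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((addr, port))
        except OSError as e:
            if e.errno == errno.EADDRINUSE:
                return "bind"
            raise
        try:
            s.listen(backlog)
        except OSError as e:
            if e.errno == errno.EADDRINUSE:
                return "listen"
            raise
        return None
    finally:
        s.close()


def demo():
    """Hold a listening socket on loopback, then probe the same address."""
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        holder.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        holder.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        holder.listen(1024)
        port = holder.getsockname()[1]
        return first_eaddrinuse_call("127.0.0.1", port)
    finally:
        holder.close()


if __name__ == "__main__":
    print(demo())
```

Running the probe before and after "vzctl restart 123" would show whether the container is currently in the affected state.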
Additional Information:
A temporary workaround exists: after restarting the container ("vzctl restart 123"), bind() inside the container works as expected, but only until the next full server reboot. During a reboot, the vz service suspends and resumes all containers, and the bind() bug appears again.
vzctl.log:
...
vzctl : CT 123 : Checkpointing completed successfully
...
vzctl : CT 123 : Dump file /vz/dump/Dump.123 exists, trying to restore from it
vzctl : CT 123 : Restoring container ...
vzctl : CT 123 : Container is mounted
vzctl : CT 123 : Adding IP address(es): 172.22.22.123
vzctl : CT 123 : Setting CPU units: 1000
vzctl : CT 123 : Container start in progress...
vzctl : CT 123 : Restoring completed successfully
...
============ now BUG can be observed ===================
...
vzctl : CT 123 : Restarting container
vzctl : CT 123 : Stopping container ...
vzctl : CT 123 : Container was stopped
vzctl : CT 123 : Container is unmounted
vzctl : CT 123 : Starting container...
vzctl : CT 123 : Container is mounted
vzctl : CT 123 : Adding IP address(es): 172.22.22.123
vzctl : CT 123 : Setting CPU units: 1000
vzctl : CT 123 : Container start in progress...
...
============== no more BUG, bind() works as expected =============
This bug first appeared with recent 2.6.32-042stabXXXX kernels, after vzctl changed its behaviour to checkpoint/restore containers during a server restart.
With the old vzctl (without VE_STOP_MODE=suspend), the bind() bug was not observed, because suspend was not used at all.
Attachments
Issue Links
- is duplicated by: OVZ-5807 nginx fail to restart after suspend/restore container (Open)