  1. OpenVZ
  2. OVZ-5623

Unstoppable container: Child 170622 exited with status 7

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: OpenVZ-legacy
    • Component/s: Containers::Kernel
    • Security Level: Public
    • Environment:
      Operating System: Other
      Platform: Other

      Description

      Hello,

      I have a very strange issue with OpenVZ: I can't stop a container.

      uname -a
      Linux ovz63.fastvps.ru 2.6.32-042stab072.10 #1 SMP Wed Jan 16 18:54:05 MSK 2013 x86_64 x86_64 x86_64 GNU/Linux

      vzlist |grep 6318
            6318 8 running 88.198.252.41 yyy.ru

      I tried this command many times:
      vzctl stop 6318
      Stopping container ...
      Child 171771 exited with status 7
      Killing container ...
      Child 171773 exited with status 7
      Unable to stop container

      But the container is still running:
      vzlist|grep 6318
            6318 8 running 88.198.252.41 pupik.ru

      I also tried killing the CT's processes one by one, but that did not help either:


       for i in `ps aux|awk '{print $2}'` ; do vzpid $i | grep -v CTID | awk '{print $1 " " $2}' ;done|grep 6318 | awk '{print $1}'
      16584
      16585
      16586
      330166
      550504
      670235
      747985
      952680
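
A minimal sketch of another way to collect the same PID list without calling vzpid once per process. It assumes the OpenVZ kernel exposes an "envID:" line in /proc/<pid>/status; the function name `ct_pids` and its second argument (an alternative proc root, useful for dry runs) are hypothetical and not part of any OpenVZ tool.

```shell
# List host PIDs whose status file reports the given container ID.
# Assumption: /proc/<pid>/status contains a line of the form "envID: <CTID>".
# $1 = container ID, $2 = proc root (defaults to /proc).
ct_pids() {
    ctid=$1
    proc_root=${2:-/proc}
    for status in "$proc_root"/[0-9]*/status; do
        # Processes can exit while we iterate, so skip unreadable entries.
        [ -r "$status" ] || continue
        env_id=$(awk '$1 == "envID:" { print $2; exit }' "$status" 2>/dev/null)
        if [ "$env_id" = "$ctid" ]; then
            pid_dir=${status%/status}
            echo "${pid_dir##*/}"
        fi
    done
}
```

On the hardware node this would be called as `ct_pids 6318`.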

      You can see all the processes running in the VE below:
      ps aux|grep 16584
      root 16584 0.0 0.0 0 0 ? Ss Feb09 0:01 [init]
      root 186944 0.0 0.0 103260 924 pts/0 S+ 08:48 0:00 grep 16584

      [root@ovz63 ~]# ps aux|grep 16585
      root 16585 0.0 0.0 0 0 ? S Feb09 0:00 [kthreadd/6318]
      root 186952 0.0 0.0 103260 920 pts/0 S+ 08:48 0:00 grep 16585

      [root@ovz63 ~]# ps aux|grep 16586
      root 16586 0.0 0.0 0 0 ? S Feb09 0:00 [khelper/6318]
      root 186955 0.0 0.0 103260 924 pts/0 S+ 08:48 0:00 grep 16586

      [root@ovz63 ~]# ps aux|grep 550504
      root 187002 0.0 0.0 103256 928 pts/0 S+ 08:49 0:00 grep 550504
      500 550504 0.0 0.0 0 0 ? Z Feb12 0:48 [php] <defunct>

      [root@ovz63 ~]# ps aux|grep 330166
      root 187046 0.0 0.0 103260 928 pts/0 S+ 08:49 0:00 grep 330166
      33 330166 0.0 0.0 152980 5588 ? D Feb10 0:02 /usr/sbin/apache2 -k start

      [root@ovz63 ~]# ps aux|grep 550504
      root 187063 0.0 0.0 103256 924 pts/0 S+ 08:49 0:00 grep 550504
      500 550504 0.0 0.0 0 0 ? Z Feb12 0:48 [php] <defunct>

      [root@ovz63 ~]# ps aux|grep 747985
      root 187095 0.0 0.0 103256 924 pts/0 S+ 08:49 0:00 grep 747985
      503 747985 0.0 0.0 0 0 ? Z Feb10 0:01 [php] <defunct>

      [root@ovz63 ~]# ps aux|grep 952680
      root 187105 0.0 0.0 103256 924 pts/0 S+ 08:49 0:00 grep 952680
      503 952680 0.0 0.0 0 0 ? Z Feb10 0:01 [php] <defunct>


      I tried kill -9 on these processes many, many times, one by one, but they were unkillable!

      kill -9 330166
      [root@ovz63 ~]# ps aux|grep 330166
      root 170314 0.0 0.0 4388 640 pts/0 T 08:43 0:00 strace -f -p 330166
      root 194986 0.0 0.0 103260 928 pts/0 S+ 08:51 0:00 grep 330166
      33 330166 0.0 0.0 152980 5588 ? D Feb10 0:02 /usr/sbin/apache2 -k start


      As far as I can see, the problem is the apache2 process in D state:


      echo w > /proc/sysrq-trigger
      [314733.332512] Show Blocked State
      [314733.332574] task taskaddr stack pid father veid
      [314733.332906] apache2 D ffff880c29ead320 0 330166 16584 6318 0x00000084
      [314733.334019] ffff880bb4141e08 0000000000000082 ffff880c29ead320 ffff880c29ead320
      [314733.334147] ffff880bb4141ec8 ffff880c29ead320 ffffffff8100bc4e ffff880bb4141e08
      [314733.334273] 0000000000000212 0000000000000000 ffff880c29ead8e8 000000000001e9c0
      [314733.334399] Call Trace:
      [314733.334460] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20
      [314733.334526] [<ffffffff8104e1ad>] ? mutex_spin_on_owner+0x8d/0xc0
      [314733.334592] [<ffffffff814f175e>] __mutex_lock_slowpath+0x13e/0x180
      [314733.334657] [<ffffffff814f15fb>] mutex_lock+0x2b/0x50
      [314733.334720] [<ffffffff811a94d9>] do_unlinkat+0xa9/0x1b0
      [314733.334784] [<ffffffff810e63b7>] ? audit_syscall_entry+0x1d7/0x200
      [314733.334848] [<ffffffff811a95f6>] sys_unlink+0x16/0x20
      [314733.334912] [<ffffffff8100b102>] system_call_fastpath+0x16/0x1b
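
The "Show Blocked State" dump above can be reduced to one line per blocked task with a small filter. This is a sketch only: it assumes the eight-column task line shown above (name, state, address, stack, pid, father, veid, flags) and a command name without spaces; the function name `blocked_tasks` is hypothetical.

```shell
# Read kernel log text on stdin and print one summary line per task
# reported in state D by the sysrq "w" dump.
blocked_tasks() {
    # Strip the leading "[seconds.micros]" timestamp, then keep only the
    # eight-field task lines whose second field is the state letter D.
    sed 's/^ *\[ *[0-9.]*\] *//' |
        awk 'NF == 8 && $2 == "D" {
            print "pid=" $5, "parent=" $6, "veid=" $7, "comm=" $1
        }'
}
```

For example, `dmesg | blocked_tasks` after triggering the dump.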


      ps -eo ppid,pid,user,stat,pcpu,comm,wchan:32 |grep 330166
        16584 330166 33 D 0.0 apache2 unlinkat
       330166 550504 500 Z 0.0 php <defunct> exit
       330166 670235 503 Z 0.0 php <defunct> exit
       330166 747985 503 Z 0.0 php <defunct> exit
       330166 952680 503 Z 0.0 php <defunct> exit
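
The listing above shows that every defunct php process has the blocked apache2 (PID 330166) as its parent. A zombie has already exited and only disappears once its parent reaps it, and a task in D state does not act on signals until it leaves the uninterruptible sleep, which would explain why kill -9 has no effect on either. The following sketch finds such parent/child pairs automatically; the function name `zombies_of_blocked` is hypothetical, and the input is assumed to be "PPID PID STAT COMM" columns without a header.

```shell
# Read "PPID PID STAT COMM" lines on stdin and report every zombie (Z)
# whose parent is in uninterruptible sleep (D).
zombies_of_blocked() {
    awk '
        { ppid[$2] = $1; stat[$2] = $3; comm[$2] = $4 }
        END {
            for (pid in stat) {
                parent = ppid[pid]
                if (stat[pid] ~ /^Z/ && (parent in stat) && stat[parent] ~ /^D/)
                    print pid, "zombie child of", parent, "(" comm[parent] ", state D)"
            }
        }
    '
}
```

A possible invocation is `ps -eo ppid=,pid=,stat=,comm= | zombies_of_blocked`; the order of the output lines is not defined.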

      I was forced to reboot the hardware node (HWN), but that is not an acceptable way to fix the issue!

      Maybe this bug is related to OpenVZ?

        Attachments

        1. dmesg.txt
          248 kB
        2. vmcore-dmesg.txt
          512 kB

            People

            Assignee:
            vdavydov Vladimir Davydov
            Reporter:
            pavel.odintsov@gmail.com Pavel Odintsov
            Votes:
            0
            Watchers:
            10
