  1. OpenVZ
  2. OVZ-6734

Container process enters D state, with an ext4 backtrace in dmesg

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Fix Version/s: Vz7.0-Update-next
    • Component/s: Containers::Kernel
    • Security Level: Public
    • Environment:
      3.10.0-327.3.1.vz7.10.15
      ploop-7.0.21-1.vz7.x86_64
      vzctl-7.0.87-1.vz7.x86_64

      Description

      >Description of problem:
      A process inside a container occasionally gets stuck in D state (uninterruptible sleep). When this happens, a backtrace appears in dmesg.
      >How reproducible:
      I can reproduce it by running a Java Minecraft server. Many other processes (mysql, apt-get, nginx, etc.) have hit this problem, but the Minecraft server is the only way I have found to reproduce it reliably (possibly due to its particular I/O workload).
      >Steps to Reproduce:
      1. vzctl start 101
      2. vzctl enter 101
      3. apt-get install openjdk-7-jre
      4. java -jar server.jar
      5. play with the minecraft server
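
The backtrace in this report ends in SyS_unlink, waiting on a page inside truncate_inode_pages_range. A minimal sketch of a workload that exercises the same path (write a file, then unlink it while its pages may still be dirty) is given here. It is an assumption that this resembles the Minecraft server's I/O pattern; it is not confirmed to trigger the hang, and the function name, file names, and sizes are hypothetical.

```python
import os


def write_and_unlink(directory, rounds=100, size=1 << 20):
    """Write `size` bytes to a fresh file and unlink it right away, `rounds` times.

    Returns the number of files that were created and removed.
    """
    payload = os.urandom(size)
    removed = 0
    for i in range(rounds):
        path = os.path.join(directory, "stress_%d.tmp" % i)
        with open(path, "wb") as f:
            # No fsync: the data normally stays as dirty pages in the page cache.
            f.write(payload)
        # Removing the last link evicts the inode, which truncates its pages.
        os.unlink(path)
        removed += 1
    return removed
```

Run inside the container against a directory on the container's root filesystem, for example `write_and_unlink("/tmp")`, and watch dmesg on the host.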


      Both ploop-based and simfs-based containers reproduce it.
      With ploop, a process entering D state causes the entire container to hang.
      With simfs, the stuck process does not affect other processes.
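
To check which processes are currently in D state, one option is to read the state field of /proc/<pid>/stat on the host. The following is a minimal sketch; the function names are hypothetical, and it only reports the state at the moment of the scan, so a process that is briefly in D state during normal I/O will also show up.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    left, right = stat_line.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    return int(pid_str), comm, right.split()[0]


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited during the scan, or the line was unparsable
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling `d_state_processes()` repeatedly and keeping only the PIDs that appear in every scan separates hung tasks from tasks that are merely busy with I/O.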

      backtrace:

      Apr 22 00:49:46 scgyshell-1 kernel: INFO: task java:177716 blocked for more than 120 seconds.
      Apr 22 00:49:46 scgyshell-1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Apr 22 00:49:46 scgyshell-1 kernel: java D ffff880111448000 0 177716 169372 7 0x00000004
      Apr 22 00:49:46 scgyshell-1 kernel: ffff88010ddd7af0 0000000000000086 ffff880111448000 ffff88010ddd7fd8
      Apr 22 00:49:46 scgyshell-1 kernel: ffff88010ddd7fd8 ffff88010ddd7fd8 0000000000000007 ffff8804495d6580
      Apr 22 00:49:46 scgyshell-1 kernel: ffff88045fcdec40 0000000000000000 7fffffffffffffff ffffffff8117ae20
      Apr 22 00:49:46 scgyshell-1 kernel: Call Trace:
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8117ae20>] ? wait_on_page_read+0x60/0x60
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff81635d69>] schedule+0x29/0x70
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff81633b39>] schedule_timeout+0x239/0x2d0
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffffa0470e5c>] ? __ext4_journal_stop+0x3c/0xb0 [ext4]
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffffa043fcc9>] ? ext4_da_write_end+0x139/0x2e0 [ext4]
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8117ae20>] ? wait_on_page_read+0x60/0x60
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8163544e>] io_schedule_timeout+0xae/0x130
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff816354e8>] io_schedule+0x18/0x1a
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8117ae2e>] sleep_on_page+0xe/0x20
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff81633c90>] __wait_on_bit+0x60/0x90
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8117abb6>] wait_on_page_bit+0x86/0xb0
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff810a8790>] ? wake_atomic_t_function+0x40/0x40
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8118c80b>] truncate_inode_pages_range+0x3bb/0x740
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff810b2c04>] ? __wake_up+0x44/0x50
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffffa017325a>] ? jbd2_journal_stop+0x1ea/0x3d0 [jbd2]
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffffa0449f94>] ? ext4_unlink+0x304/0x390 [ext4]
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8125ddda>] ? __dquot_initialize+0x3a/0x1c0
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8118cc0e>] truncate_inode_pages_final+0x5e/0x90
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffffa043f40c>] ext4_evict_inode+0x10c/0x520 [ext4]
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff812163d7>] evict+0xa7/0x170
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff81216cbb>] iput+0x18b/0x1f0
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8120b26e>] do_unlinkat+0x1ae/0x2b0
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff811ffa0e>] ? SYSC_newstat+0x3e/0x60
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff8120c276>] SyS_unlink+0x16/0x20
      Apr 22 00:49:46 scgyshell-1 kernel: [<ffffffff81640e49>] system_call_fastpath+0x16/0x1b
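
Frames marked with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain. Dropping them leaves, read bottom-up: SyS_unlink -> do_unlinkat -> iput -> evict -> ext4_evict_inode -> truncate_inode_pages_final -> truncate_inode_pages_range -> wait_on_page_bit. The sketch below shows one way to extract that list from the log lines; the regular expression and function names are assumptions tailored to the format of this particular log.

```python
import re

# Matches e.g. "[<ffffffff81635d69>] schedule+0x29/0x70" or
# "[<ffffffffa0470e5c>] ? __ext4_journal_stop+0x3c/0xb0 [ext4]".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(lines):
    """Return (symbol, module, reliable) for each stack frame found in lines."""
    frames = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m:
            frames.append((m.group("symbol"), m.group("module"),
                           m.group("unreliable") is None))
    return frames


def reliable_frames(lines):
    """Keep only the frames that are not marked with '?'."""
    return [(sym, mod) for sym, mod, ok in parse_frames(lines) if ok]
```

Passing the dmesg lines above to `reliable_frames` gives the frames in the order printed, innermost first.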

      PS: I use a Ceph RBD disk to hold the container's data. I am testing whether the problem can also be reproduced when the container is stored on a local disk.

      >Host OS:
      CentOS 7
      >Guest OS:
      Ubuntu 14.04
      >Additional info (see https://openvz.org/Reporting_OpenVZ_problem):

            People

            Assignee:
            khorenko Konstantin Khorenko
            Reporter:
            alkaid alkaid