OpenVZ / OVZ-6824

Failure to checkpoint cPanel container

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Fix Version/s: Vz7.0-Update3
    • Component/s: CRIU
    • Security Level: Public
    • Environment:
      Linux 3.10.0-327.36.1.vz7.18.7 #1 SMP Tue Oct 11 15:39:22 MSK 2016 x86_64 x86_64 x86_64 GNU/Linux

      Description

      We run several OVZ host nodes with many cPanel VEs on them. We set up
      a few new OVZ7 host nodes, intending to transition to them at some
      point in the future. At first I used the ovztransfer.sh script to
      transfer a cPanel VE from old OVZ to OVZ7, which worked fine: the VE
      started and operated as expected. But when I went to
      snapshot/checkpoint the VE, I started running into problems. I cannot
      get OVZ7 to snapshot this cPanel VE.

      I thought perhaps it was a botched copy or something related to the
      ovztransfer.sh script, since it is not technically an official script
      (though it is mentioned in the OVZ documentation). So I created a
      brand new CentOS 7 VE on the OVZ7 host node, did a fresh cPanel
      install, and then migrated the cPanel users from the old server to
      the new one in the usual cPanel way (the transfer tool). Everything
      worked as expected, but when I tried to snapshot the new container it
      errored out again. I have another container on the same host node
      that I can snapshot without a problem.

      There definitely seems to be a bug that I am hitting when
      snapshotting this particular cPanel container under OVZ7. It is a
      total roadblock: I cannot continue the transition to OVZ7 until I
      know that I can checkpoint VEs reliably.

      Here are some details:

      Hostnode: Virtuozzo Linux release 7.2 / Linux 3.10.0-327.36.1.vz7.18.7

      VE: CentOS Linux release 7.2.1511 (Core), running a brand new install
      of the latest version of cPanel with about 600 active users recently
      migrated to it using the cPanel transfer tool (about 250 GB of data).

      # prlctl snapshot 1035
      Creating the snapshot...
      PRL_ERR_VZCTL_OPERATION_FAILED (Details: Failed to checkpoint the Container
      All dump files and logs were saved to
      /vz/private/1035/dump/{ce2b2c58-00ac-4e2f-b4da-e0a0dc594ff4}.fail
      Failed tp dump the Container, status pipe unexpectedly closed
      Failed to dump Container
      Failed to resume Container
      Failed to create snapshot
      )
      Failed to create the snapshot: Unknown


      What I think is the relevant info from the dump.log file:

      (04.143824) Error (criu/sk-inet.c:158): In-flight connection (l) for 924d83
      (04.143831) Error (criu/sk-inet.c:160): In-flight connections can be ignored with the --skip-in-flight option.
      (04.143868) Error (criu/cr-dump.c:1322): Dump files (pid: 256864) failed with -1
      (04.168423) Error (criu/cr-dump.c:1634): Dumping FAILED.


      I ran the snapshot again and got a different error on the second pass,
      but it still seems to be related to sockets in some way:

      (04.005642) fdinfo: type: 0x4 flags: 02000002/01 pos: 0 fd: 5
      (04.005657) 287702 fdinfo 6: pos: 0 flags: 2000002/0x1
      (04.005665) Searching for socket 9888dd (family 2.6)
      (04.005675) Error (criu/sk-inet.c:202): Name resolved on unconnected socket
      (04.005680) ----------------------------------------
      (04.005693) Error (criu/cr-dump.c:1322): Dump files (pid: 287702) failed with -1
      (04.005726) Waiting for 287702 to trap
      (04.005753) Daemon 287702 exited trapping
      (04.005761) Sent msg to daemon 5 0 0
      pie: 20365: __fetched msg: 5 0 0
      pie: 20365: 20365: new_sp=0x7f27094c8008 ip 0x7f271180c20e
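
      For anyone triaging a similar dump.log, the two excerpts above share
      the same line shape: a timestamp in parentheses, the word "Error", a
      source file and line number, then the message. The following is a
      minimal sketch, not part of CRIU or vzctl, that pulls only those Error
      lines out of a log. The names `CriuError` and `extract_criu_errors`
      are made up for illustration, and the regular expression is inferred
      solely from the lines quoted in this ticket, so other CRIU versions
      may format their logs differently.

```python
import re
from typing import List, NamedTuple


class CriuError(NamedTuple):
    """One 'Error' line from a CRIU dump.log (hypothetical helper type)."""
    timestamp: str   # e.g. "04.143824", kept as text exactly as logged
    source: str      # e.g. "criu/sk-inet.c"
    line: int        # source line number reported by CRIU
    message: str     # text after the "(file:line):" prefix


# Pattern inferred from the excerpts in this ticket (an assumption):
#   (04.143824) Error (criu/sk-inet.c:158): In-flight connection (l) for 924d83
_ERROR_RE = re.compile(
    r"^\s*\((\d+\.\d+)\)\s+Error\s+\(([^():]+):(\d+)\):\s*(.*)$"
)


def extract_criu_errors(log_text: str) -> List[CriuError]:
    """Return every line of log_text that matches the Error pattern, in order."""
    errors = []
    for raw in log_text.splitlines():
        match = _ERROR_RE.match(raw)
        if match:
            ts, src, lineno, msg = match.groups()
            errors.append(CriuError(ts, src, int(lineno), msg.strip()))
    return errors
```

      Feeding it the contents of the dump.log saved under the .fail
      directory would list which source locations reported the failure,
      which makes it easier to compare one dump attempt with the next.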

      I attached the first dump attempt.

      >Host OS: Virtuozzo Linux release 7.2

      >Guest OS: CentOS Linux release 7.2.1511 (Core)

      >Additional Info: The VE has about 6 IP addresses assigned to it in veid.conf. If I comment out the IPs and start the VE, I can dump it. If I comment out the IPs and replace them with 6 different IPs, I can snapshot the container as well, although there may not be any active connections on those IPs at the time of the snapshot.

      It could possibly be related to open connections/sockets for the Dovecot mail server.

            People

            Assignee:
            kgorkunov.int Kirill Gorkunov
            Reporter:
            jfall James F