Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Fix Version/s: Vz7.0-Update8
- Component/s: CRIU
- Security Level: Public
Description
>Description of problem:
https://forum.openvz.org/index.php?t=msg&th=13504
[root@vz03 ~]# vzmigrate -vvv --online --require-realtime vz04 222
...
2018-05-23 10:30:10.732: Live migration stage started
2018-05-23 10:30:36.320: Io multiplexer aborted
2018-05-23 10:30:36.320: 2018-05-23 10:30:36.321: Phaul service failed to live migrate CT
2018-05-23 10:30:36.320: 2018-05-23 10:30:36.321: error [-73] : Phaul service failed to live migrate CT
2018-05-23 10:30:36.321: Phaul service failed to live migrate CT
2018-05-23 10:30:36.321: Phaul failed to live migrate CT (/var/log/phaul.log)
2018-05-23 10:30:36.322: 2018-05-23 10:30:36.322: cleaning : destroy CT 222
2018-05-23 10:30:36.372: 2018-05-23 10:30:36.372: cleaning : 'rm' dir : /vz/private/222
2018-05-23 10:30:36.372: 2018-05-23 10:30:36.372: can not rename : [/vz/private/222] -> [/vz/private/222.ss6sKg]
2018-05-23 10:30:36.372: 2018-05-23 10:30:36.373: cleaning : 'rmdir' dir : /vz/root/222
2018-05-23 10:30:36.372: 2018-05-23 10:30:36.373: can not find entry for delete : [/vz/root/222]
2018-05-23 10:30:37.373: 2018-05-23 10:30:37.373: unlocking 222
[root@vz03 ~]# tail -20 /var/log/phaul.log
10:30:33.214: 285170: Notify (post-network-lock)
10:30:35.283: 285170: Final FS and images sync
10:30:35.522: 285170: Sending images to target
10:30:35.524: 285170: Pack
10:30:35.561: 285170: Add htype images
10:30:35.812: 285170: Asking target host to restore
10:30:36.271: 285170: Remote exception
10:30:36.271: 285170: I/O operation on closed file
Traceback (most recent call last):
File "/usr/libexec/phaul/p.haul", line 9, in <module>
load_entry_point('phaul==0.1', 'console_scripts', 'p.haul')()
File "/usr/lib/python2.7/site-packages/phaul/shell/phaul_client.py", line 49, in main
worker.start_migration()
File "/usr/lib/python2.7/site-packages/phaul/iters.py", line 161, in start_migration
self.__start_live_migration()
File "/usr/lib/python2.7/site-packages/phaul/iters.py", line 232, in __start_live_migration
self.target_host.restore_from_images()
File "/usr/lib/python2.7/site-packages/phaul/xem_rpc_client.py", line 26, in __call__
raise Exception(resp[1])
Exception: I/O operation on closed file
Destination server logs:
[root@vz04 ~]# tail -20 /var/log/phaul-service.log
10:30:35.562: 817892: Waiting for images to unpack
10:30:35.813: 817892: Restoring from images
10:30:35.827: 817892: Starting vzctl restore
10:30:36.269: 817892: > Restoring the Container ...
10:30:36.269: 817892: > Mount image: /vz/private/222/root.hdd
10:30:36.269: 817892: > Container is mounted
10:30:36.269: 817892: > Setting permissions for image=/vz/private/222/root.hdd
10:30:36.269: 817892: > (00.000283) Error (criu/util.c:694): Can't read link of fd -404: No such file or directory
10:30:36.270: 817892: > (00.000295) Error (criu/protobuf.c:77): Unexpected EOF on (null)
10:30:36.270: 817892: > The restore log was saved in /vz/dump/222/rst-_cQGWZ-18.05.23-10.30/criu_restore.9.log
10:30:36.270: 817892: > criu exited with rc=17
10:30:36.270: 817892: > Unmount image: /vz/private/222/root.hdd
[root@vz04 ~]# tail -20 /vz/dump/222/rst-_cQGWZ-18.05.23-10.30/criu_restore.9.log
(00.000142) Version: 3.8 (gitid 0)
(00.000188) Running on vz04.boardreader.com Linux 3.10.0-693.21.1.vz7.47.4 #1 SMP Sat Apr 28 11:48:07 MSK 2018 x86_64
(00.000237) No inventory.img image
(00.000283) Error (criu/util.c:694): Can't read link of fd -404: No such file or directory
(00.000295) Error (criu/protobuf.c:77): Unexpected EOF on (null)
The problem was reported to the CRIU project on GitHub:
https://github.com/checkpoint-restore/criu/issues/494
The logs are attached to that issue.
Dmitry Safonov believes this is not a CRIU issue:
"
This one looks suspicious and might be the reason of the fail:
(02.037229) Error (criu/page-xfer.c:379): No parent image found, though parent directory is set: No such file or directory
Probably, it's a bug in criu integration. libvzctl or something?
"
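For triage, the two symptoms in the logs above ("No inventory.img image" on restore, and "No parent image found, though parent directory is set" on dump) can be checked directly on the images directory before restore is attempted. The following is only a minimal sketch, not part of p.haul or libvzctl: the function name check_criu_images_dir is hypothetical, and it assumes the usual CRIU layout in which the images directory holds inventory.img and, for incremental dumps, a symlink named "parent" pointing to the previous images directory.

```python
import os


def check_criu_images_dir(images_dir):
    """Return a list of problems found in a CRIU images directory.

    Hypothetical helper for illustration only; it inspects the layout
    and does not call criu itself.
    """
    problems = []

    # criu restore reads inventory.img first; the restore log in this
    # report shows "No inventory.img image".
    if not os.path.isfile(os.path.join(images_dir, "inventory.img")):
        problems.append("inventory.img is missing")

    # Assumption: for incremental dumps the previous images directory
    # is referenced through a symlink named "parent".
    parent = os.path.join(images_dir, "parent")
    if os.path.islink(parent) and not os.path.isdir(parent):
        # The link exists but its target is gone (dangling link).
        problems.append(
            "parent link is set but does not resolve: %s" % os.readlink(parent)
        )

    return problems
```

Calling check_criu_images_dir() on the unpacked images directory on the destination host returns an empty list when neither problem is present. A non-empty result would suggest that the images were not transferred or linked as expected, which is consistent with the integration-side explanation quoted above, although it does not prove it.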