On Fri, Jun 21, 2019 at 11:36:36AM -0400, Matt Heon wrote:
On 2019-06-21 17:10, Adrian Reber wrote:
> The current container migration implementation in Podman cannot handle
> changes to the file-system.
>
> If a container changes a file the recommendation is to mount that
> directory as a tmpfs and then the changed file will be correctly
> migrated to the destination system. If something changes a file in /tmp,
> for example, the following steps are currently necessary:
>
> # podman run -d --tmpfs /tmp <container>
> # podman container checkpoint -l -e /tmp/chkpt.tar.gz
> # scp /tmp/chkpt.tar.gz destination-host:/tmp
>
> On the destination host of the migration:
>
> # podman container restore -i /tmp/chkpt.tar.gz
>
> Files changed in /tmp in the container will also be in the restored
> container on the destination host, because CRIU automatically handles
> tmpfs directories.
>
> So that users do not have to mark every changed directory as --tmpfs,
> I would like to include changed files in the checkpoint
> archive (/tmp/chkpt.tar.gz from my example).
>
> One possible implementation could use
> vendor/github.com/containers/storage/store.go:
>
> // Diff returns the tarstream which would specify the changes returned
> // by Changes. If options are passed in, they can override default
> // behaviors.
> Diff(from, to string, options *DiffOptions) (io.ReadCloser, error)
>
> This sounds exactly like what I need. I get a tarstream which I can
> embed into the checkpoint archive and which can then be used with
> ApplyDiff() before restoring the container.
>
> Does this sound like the right approach to also migrate file-system
> changes during container migration?
>
Can we commit the container, generating a new image from it (including
all the diffs that `Diff` would show), and then change the exported
container to use that new image, instead of the one it originally
used?

Using 'commit' internally was also my initial idea, but I am not sure
how I can export just the single layer that holds the file-system
changes made since the container was started from its original layer.
Is there already an interface or API to get just that one layer?

The reason I was looking at 'diff' is that with 'commit' I would need to
change the image ID the container is running from. The restored
container would have the same content, but it would no longer be based
on the image ID used during 'podman run' but on the image ID created
during 'commit'. It would still be the same content but different
metadata. I am not sure how important that is. This might get
interesting (complicated) when migrating a file-system-modifying
container multiple times: it would be necessary to track the original
image ID and all the commits and changes.

I was also not sure how 'commit' behaves with different storage
backends. I guess it all just works, but I was not sure.

One reason I am not convinced that 'diff' is the right approach is the
case where only 1 byte is changed in a large file. I hope that overlay
is smart enough to only store the diff of the changed block (if that is
how overlay works). With 'diff', the complete large file probably has to
be transferred even if only 1 byte changes.
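
To make the whole-file concern concrete, here is a rough, self-contained
Go sketch of a file-level diff. It does not use containers/storage at
all: snapshot, fileDiff and applyDiff are names made up for this
illustration, and an in-memory map stands in for a layer. It is only
meant to show that a tar stream built per file carries the complete file
as soon as a single byte differs; deleted files are ignored.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"sort"
)

// snapshot is a hypothetical stand-in for a layer: path -> file content.
type snapshot map[string][]byte

// fileDiff writes every file that is new or changed in upper (compared to
// lower) into a tar stream. It works at file granularity, so a one-byte
// change puts the whole file into the archive.
func fileDiff(lower, upper snapshot) (*bytes.Buffer, error) {
	names := make([]string, 0, len(upper))
	for name, data := range upper {
		if old, ok := lower[name]; !ok || !bytes.Equal(old, data) {
			names = append(names, name)
		}
	}
	sort.Strings(names) // deterministic archive order

	buf := new(bytes.Buffer)
	tw := tar.NewWriter(buf)
	for _, name := range names {
		data := upper[name]
		hdr := &tar.Header{Name: name, Mode: 0o644, Size: int64(len(data))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(data); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf, nil
}

// applyDiff unpacks the tar stream on top of base, overwriting or adding
// the files it contains. This mimics the restore side of the migration.
func applyDiff(base snapshot, r io.Reader) error {
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		data, err := io.ReadAll(tr)
		if err != nil {
			return err
		}
		base[hdr.Name] = data
	}
}

func main() {
	big := make([]byte, 1<<20) // 1 MiB file in the original layer
	changed := make([]byte, len(big))
	copy(changed, big)
	changed[0] = 1 // exactly one byte differs

	lower := snapshot{"data/big.bin": big, "etc/conf": []byte("x")}
	upper := snapshot{"data/big.bin": changed, "etc/conf": []byte("x")}

	diff, err := fileDiff(lower, upper)
	if err != nil {
		panic(err)
	}
	fmt.Printf("changed bytes: 1, archive bytes: %d\n", diff.Len())

	if err := applyDiff(lower, diff); err != nil {
		panic(err)
	}
	fmt.Println("restored first byte:", lower["data/big.bin"][0])
}
```

In this sketch the unchanged etc/conf is left out of the archive, but
the modified 1 MiB file goes in completely.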

So both ideas ('diff' based and 'commit' based) are not perfect (as far
as I understand it).

If I could get the content of 'commit' without writing that new image
into the checkpoint archive, I could later apply these changes to the
restored container (again without creating a new image ID). That way the
image the container is based on would not change unexpectedly, but I
could still benefit from overlay being smart about only providing the
changed blocks and not the complete files.

The main reason I stopped following the 'commit' approach is the changed
image ID of the restored container.

Adrian