On 10/24/24 11:25, Ganeshar, Puvi wrote:

Dan,

 

Thanks for coming back to me on this.

 

If I use an NFS store (mounted read-write) as Podman's storage, do you anticipate any race conditions when multiple podman processes read and write at the same time?  Do I need to implement any locking mechanism, like what relational databases do?

 

Yum and DNF should not be a big issue, as we don't run those builds every day; we use distroless images for the Go microservices, and the Java builds use a custom base image with all dependencies already included.

 

Thanks again.

I don't think so.  We already have locking built into the Podman database and NFS storage.  Once the container is running, Podman is not going to do anything; in detach mode, podman exits.

Podman only writes to storage and takes locks when a container is created and when images are pulled.
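For example (the container name and image below are hypothetical, just to illustrate the detach behavior):

    podman run -d --name web docker.io/library/nginx   # podman exits once the container starts
    podman ps --filter name=web                        # conmon keeps the container running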

The key issue with NFS and storage is that if Podman needs to create a file with a UID other than the user's own, the NFS server will not allow Podman to do the chown.

From its point of view, it sees dwalsh chowning a file to a non-dwalsh UID.
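One way to check whether a given export behaves this way (the mount point is a placeholder; run this as whatever user Podman runs as, which is root here since the info below shows rootless: false):

    touch /mnt/fsx/uid-test
    chown 12345 /mnt/fsx/uid-test   # EPERM here means the server refuses cross-UID chown (e.g. root squash)
    rm -f /mnt/fsx/uid-test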

 

Puvi Ganeshar | @pg925u
Principal, Platform Engineer
CICD - Pipeline Express | Toronto


 

From: Daniel Walsh <dwalsh@redhat.com>
Date: Wednesday, October 23, 2024 at 11:10 AM
To: podman@lists.podman.io <podman@lists.podman.io>
Subject: [Podman] Re: Speeding up podman by using cache

On 10/22/24 11:04, Ganeshar, Puvi wrote:

Hello Podman team,

 

I am about to explore this option, so I just wanted to check with you all first in case I am wasting my time.

 

I am in the Platform Engineering team at DirecTV, and we run Go and Java pipelines on Jenkins using Amazon EKS as the workers.  The process is that when a Jenkins build runs, it asks EKS for a worker (a Kubernetes pod); the cluster spawns one, and the new pod communicates back to the Jenkins controller.  We use the Jenkins Kubernetes pod template to configure that communication (a sketch of such a template follows the podman info output below).  We are currently running the latest LTS of Podman, v5.2.2, though still on cgroups v1 for now; we plan to migrate in early 2025 by upgrading the cluster to Amazon Linux 2023, which defaults to cgroups v2.  Here are the Podman configuration details that we use:

 

host:
  arch: arm64
  buildahVersion: 1.37.2
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - blkio
  - memory
  - devices
  - freezer
  - net_cls
  - perf_event
  - net_prio
  - hugetlb
  - pids
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.12-1.el9.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: f174c390e4760883511ab6b5c146dcb244aeb647'
  cpuUtilization:
    idlePercent: 99.22
    systemPercent: 0.37
    userPercent: 0.41
  cpus: 16
  databaseBackend: sqlite
  distribution:
    distribution: centos
    version: "9"
  eventLogger: file
  freeLocks: 2048
  hostname: podmanv5-arm
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.10.225-213.878.amzn2.aarch64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 8531066880
  memTotal: 33023348736
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.1-1.el9.aarch64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.1
    package: netavark-1.12.2-1.el9.aarch64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.16.1-1.el9.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.16.1
      commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240806.gee36266-2.el9.aarch64
    version: |
      pasta 0^20240806.gee36266-2.el9.aarch64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.3.1-1.el9.aarch64
    version: |-
      slirp4netns version 1.3.1
      commit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 0
  swapTotal: 0
  uptime: 144h 6m 15.00s (Approximately 6.00 days)
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 107352141824
  graphRootUsed: 23986397184
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "true"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 5.2.2
  Built: 1724331496
  BuiltTime: Thu Aug 22 12:58:16 2024
  GitCommit: ""
  GoVersion: go1.22.5 (Red Hat 1.22.5-2.el9)
  Os: linux
  OsArch: linux/arm64
  Version: 5.2.2
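For context, here is a minimal sketch of the kind of Jenkins Kubernetes pod template mentioned above (the label, image, and security settings are illustrative placeholders, not our exact values):

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        jenkins/label: podman-builder   # hypothetical agent label
    spec:
      containers:
      - name: podman
        image: quay.io/podman/stable    # placeholder build image
        command: ["sleep", "infinity"]  # keep the agent container alive for the build
        securityContext:
          privileged: true              # podman-in-Kubernetes typically needs elevated privileges or a tuned profile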

We migrated to Podman when Kubernetes deprecated Docker and have been using it for about two years.  It is working well; however, since we run over 500 builds a day, I am trying to explore whether I can speed up the podman build process with image caching.  I wanted to see whether using an NFS file system (Amazon FSx) as the storage for Podman's overlay filesystem would improve performance, with builds completing much faster because the required images are already present on the NFS share.  Currently, Podman in each pod on the EKS cluster downloads all the required images every time, so it takes no advantage of previously cached images.

 

These are my concerns:

  1. Any race conditions, i.e. podman processes colliding with each other during reads and writes.
  2. Performance of I/O operations, since all NFS traffic goes over the network.

 

Have any of you tried this method before?  If so, can you share any pitfalls that you’ve faced?

 

Any comments or advice would be helpful, as I need to weigh the pros and cons before spending time on this.  Also, if it causes an outage due to storage failures, it would block all our developers, so I will have to design this in a way that lets us recover quickly.

 

Thanks very much in advance and have a great day.

 

Puvi Ganeshar | @pg925u
Principal, Platform Engineer
CICD - Pipeline Express | Toronto




_______________________________________________
Podman mailing list -- podman@lists.podman.io
To unsubscribe send an email to podman-leave@lists.podman.io

You can set up an additional store on an NFS share, preloaded with images, which should work fine.
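A minimal sketch of what that could look like in /etc/containers/storage.conf on the build pods, assuming the FSx share is mounted at /mnt/fsx (a placeholder path):

    [storage]
    driver = "overlay"
    graphroot = "/var/lib/containers/storage"
    runroot = "/run/containers/storage"

    [storage.options]
    # Read-only, shared image store; Podman only pulls images it
    # cannot already find here.
    additionalimagestores = [
      "/mnt/fsx/containers/storage",
    ]

Since additional stores are treated as read-only by the consuming Podman, a separate job would populate the share (for example with podman --root /mnt/fsx/containers/storage pull <image>), which also sidesteps most of the write-contention concerns.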

Whether this improves performance or not is probably something you need to discover.

If you are dealing with YUM and DNF, you might also want to play with sharing the rpm database with the build system (a sketch of the idea follows the links below).

https://www.redhat.com/en/blog/speeding-container-buildah

https://www.youtube.com/watch?v=qsh7NL8H4GQ
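As one illustration of the idea (not necessarily the exact technique the blog post uses), podman build supports cache mounts in a Containerfile, which lets dnf's package and metadata cache persist across builds on the same node; the base image and package below are placeholders:

    # Containerfile sketch: keep dnf's cache out of the image layers
    # but persistent across builds.
    FROM registry.access.redhat.com/ubi9
    RUN --mount=type=cache,target=/var/cache/dnf,sharing=locked \
        dnf -y --setopt=keepcache=1 install golang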