Thursday, October 13, 2022

Bitnami Improves Container Catalog Size

Authored by Alejandro Gómez, R&D Manager at VMware

Introduction

In a previous post, we shared details of an analysis we conducted to optimize our container image sizes and improve our end users' experience. After some work, on September 9, 2022, we enabled the --squash option when building the container images, and we are already seeing improvements.

Changes at DockerHub

After enabling the --squash option, we are seeing a significant change in our DockerHub repositories. For example, before enabling the --squash option, the compressed sizes of Pytorch and Magento were 616.64MB (2.14GB decompressed) and 391.61MB (1.35GB decompressed), respectively.




On the other hand, checking the squashed images, we can see that the image sizes are now smaller for both products: Pytorch has a compressed size of 327.58MB (1.12GB decompressed) and Magento a compressed size of 269.51MB (906MB decompressed).





Here are the size reductions we observed for both assets:
  • Pytorch: 46.9% reduction for the compressed size and 47.7% for the decompressed size
  • Magento: 32.2% reduction for the compressed size and 34.64% for the decompressed size


How will end users see the benefits of this change?

End users can easily see the benefits of this improvement as soon as they pull the container images. We ran a quick test that pulls the non-squashed images mentioned above and then the squashed ones, measuring how long each set takes. The script used for the test was this simple one:

#!/bin/bash

docker system prune -a -f

start_time=$SECONDS
non_squashed_images=(bitnami/magento:2.4.5-debian-11-r9 bitnami/pytorch:1.12.1-debian-11-r10)
for image in "${non_squashed_images[@]}"; do
    docker pull "$image"
done
non_squashed_elapsed=$(( SECONDS - start_time ))

docker system prune -a -f

start_time=$SECONDS
squashed_images=(bitnami/magento:2.4.5-debian-11-r10 bitnami/pytorch:1.12.1-debian-11-r11)
for image in "${squashed_images[@]}"; do
    docker pull "$image"
done
squashed_elapsed=$(( SECONDS - start_time ))

echo "Time pulling non-squashed images: $non_squashed_elapsed s"
echo "Time pulling squashed images: $squashed_elapsed s"



After running the script, the output was really interesting, not only in terms of time, but also because of what it showed about the cached base image layer:

Total reclaimed space: 0B

2.4.5-debian-11-r9: Pulling from bitnami/magento
3b5e91f25ce6: Pulling fs layer
… … …
… … …
Digest: sha256:bbdde3cea27eaec4264f0464d8491600e24d5b726365d63c24a92ba156344024
Status: Downloaded newer image for bitnami/magento:2.4.5-debian-11-r9
docker.io/bitnami/magento:2.4.5-debian-11-r9

1.12.1-debian-11-r10: Pulling from bitnami/pytorch
3b5e91f25ce6: Already exists
Digest: sha256:1a238c5f74fe29afb77a08b5fa3aefd8d22c3ca065bbd1d8a278baf93585814d
Status: Downloaded newer image for bitnami/pytorch:1.12.1-debian-11-r10
docker.io/bitnami/pytorch:1.12.1-debian-11-r10

Deleted Images:
untagged: bitnami/magento:2.4.5-debian-11-r9
untagged: bitnami/magento@sha256:bbdde3cea27eaec4264f0464d8491600e24d5b726365d63c24a92ba156344024
… … …
… … …
deleted: sha256:7ec26d70ae9c46517aedc0931c2952ea9e5f30a50405f9466cb1f614d52bbff7
deleted: sha256:d745f418fc70bf8570f4b4ebefbd27fb868cda7d46deed2278b9749349b00ce2

Total reclaimed space: 3.415GB

2.4.5-debian-11-r10: Pulling from bitnami/magento
3b5e91f25ce6: Pulling fs layer
Digest: sha256:7775f3bc1cfb81c0b39597a044d28602734bf0e04697353117f7973739314b9c
Status: Downloaded newer image for bitnami/magento:2.4.5-debian-11-r10
docker.io/bitnami/magento:2.4.5-debian-11-r10

1.12.1-debian-11-r11: Pulling from bitnami/pytorch
3b5e91f25ce6: Already exists
Digest: sha256:3273861a829d49e560396aa5d935476ab6131dc4080b4f9f6544ff1053a36035
Status: Downloaded newer image for bitnami/pytorch:1.12.1-debian-11-r11
docker.io/bitnami/pytorch:1.12.1-debian-11-r11

Total reclaimed space: 1.947GB

Time pulling non-squashed images: 165 seconds
Time pulling squashed images: 91 seconds


As we can observe in the script output:
  • The uncompressed size dropped from 3.415GB to 1.947GB (43% less).
  • Pulling the squashed images was 45% faster than pulling the non-squashed ones.
  • The base image layer with digest 3b5e91f25ce6 was always taken from the cache when pulling the second image of each set (for both the non-squashed and the squashed images), as the check below illustrates.
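
As a quick, illustrative check (not part of the test itself, and assuming both squashed images are still present locally), you can list the layer digests of the two squashed images and verify that they share the same first (base) layer:

#!/bin/bash

# Illustrative only: print the layer digests of the two squashed images used
# in the test above; the first entry of each list should be identical.
for image in bitnami/magento:2.4.5-debian-11-r10 bitnami/pytorch:1.12.1-debian-11-r11; do
    echo "== ${image} =="
    docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "${image}"
done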
So, an end user who pulls, for example, Magento or Pytorch would see those savings. The previous test used some of the biggest images in the catalog with the largest reductions, such as Pytorch, but what about other solutions that typical end users run? We ran another test with a product like WordPress (a solution that combines WordPress and MariaDB containers). For this test, we used the WordPress docker-compose.yml that exists in the Bitnami containers repository, replacing the existing MariaDB and WordPress images with the following tags (a sketch of one way to prepare these compose files follows the list):
  • Non-squashed:
    • bitnami/wordpress:6.0.2-debian-11-r2
    • bitnami/mariadb:10.6.9-debian-11-r7
  • Squashed:
    • bitnami/wordpress:6.0.2-debian-11-r3
    • bitnami/mariadb:10.6.9-debian-11-r8
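
This is a minimal, illustrative sketch of how the two compose files could be produced from a local copy of the Bitnami WordPress docker-compose.yml; it is not the exact procedure we followed, and the file name and sed patterns are assumptions:

#!/bin/bash

# Illustrative sketch: derive the two compose files used in the test below
# from a local copy of the Bitnami WordPress docker-compose.yml by pinning
# the non-squashed and squashed image tags.
cp docker-compose.yml wordpress-docker-compose-non-squashed.yml
sed -i -e 's|bitnami/wordpress:[^[:space:]]*|bitnami/wordpress:6.0.2-debian-11-r2|' \
       -e 's|bitnami/mariadb:[^[:space:]]*|bitnami/mariadb:10.6.9-debian-11-r7|' \
       wordpress-docker-compose-non-squashed.yml

cp docker-compose.yml wordpress-docker-compose-squashed.yml
sed -i -e 's|bitnami/wordpress:[^[:space:]]*|bitnami/wordpress:6.0.2-debian-11-r3|' \
       -e 's|bitnami/mariadb:[^[:space:]]*|bitnami/mariadb:10.6.9-debian-11-r8|' \
       wordpress-docker-compose-squashed.yml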

For this test, we used the following script (which uses the two docker-compose files created with the images mentioned above):

#!/bin/bash

docker system prune -a -f

start_time=$SECONDS
docker-compose -f wordpress-docker-compose-non-squashed.yml up -d
non_squashed_elapsed=$(( SECONDS - start_time ))
docker-compose -f wordpress-docker-compose-non-squashed.yml down

docker system prune -a -f

start_time=$SECONDS
docker-compose -f wordpress-docker-compose-squashed.yml up -d
squashed_elapsed=$(( SECONDS - start_time ))
docker-compose -f wordpress-docker-compose-squashed.yml down

docker system prune -a -f

echo "Time pulling and starting non-squashed WordPress: $non_squashed_elapsed s"
echo "Time pulling and starting squashed WordPress: $squashed_elapsed s"


With this test, the squashed solution used 974MB compared to the 1.161GB used by the non-squashed one (16.11% savings), and it took 52 seconds to pull and start compared to 60 seconds for the non-squashed one (14.43% savings). Even in this final, basic test the improvement is useful, although the major benefits come from data solutions like Pytorch or Spark. When we analyzed the main use cases, we saw that users mostly benefit from this improvement, though we are aware that in some use cases it would make sense to preserve the container layers. Based on our research, however, it is beneficial for the majority of users.

Will this be the last thing we do to improve the catalog? Definitely not! We have some things in the oven that we are cooking up to keep pushing the catalog improvements forward. And remember: our catalog is open source, so feel free to contribute to our containers repository too, as explained in our previous post.

Tuesday, October 4, 2022

Analyzing the Bitnami Container Catalog Size: Findings and Next Steps

Authored by Alejandro Gómez, R&D Manager at VMware


Introduction

At Bitnami, we are always thinking of new ways to improve our catalog and provide the best experience to our users. An example of this is the creation of a new source of truth for Bitnami containers, which you can read about in this blog post. These are some of the actions the Bitnami engineering team has taken to reduce the size of the Bitnami container catalog:
  • Analyzing distroless and docker-slim as options to decrease the containers' base image size
  • Getting insights from Dive and considering the use of multi-stage builds to improve the efficiency of the images
  • Testing docker build --squash as an option to reduce the container image size
In this post, we share our findings as well as the option the team finally adopted to boost the performance of our containers by reducing the image size.

Distroless and docker-slim

With the aim of improving the catalog, our team ran an analysis to determine the best options currently available for reducing the size of container images. We considered either moving to distroless as the base image for our containers or using docker-slim as a tool to shrink the container images.

We found it very difficult to embrace the distroless approach since some applications, like Apache, required more dependencies to be compiled, which would add more complexity to the catalog.

On the other hand, the docker-slim analysis did not go well for some assets, like PostgreSQL 14: it did not produce good results in our tests, partly due to the limited configuration we applied on our side, as shown below:

docker-slim build --compose-file docker-compose.yml --target-compose-svc postgresql \
  --include-path /opt/bitnami --http-probe-off --include-shell \
  --include-bin /bin/sh --include-bin /usr/lib/x86_64-linux-gnu/libedit.so.2 \
  --include-bin /usr/bin/touch --include-bin /bin/mktemp \
  --exclude-pattern "/bitnami/postgresql/**" --exclude-pattern "/opt/bitnami/postgresql/tmp/**" \
  --exclude-pattern "/tmp/**" --pull --path-perms "/opt/bitnami/postgresql/tmp:775#0#0"

We got a container image reduction of 33.4%:

bitnami/postgresql.slim   14   0ce6542042ff   1 second ago   181MB
bitnami/postgresql        14   0336c8e4fba4   23 hours ago   272MB

The container worked without issue with the default helm install values, but it failed with the architecture=replication configuration because some binaries were missing. We also could not evaluate docker-slim properly because we were only using a basic configuration without tuning it. Even though we saw significant reductions in the cases where it applied, and some assets such as Pytorch or Apache worked perfectly, docker-slim failed during the detection phase in most cases, which would require a substantial amount of case-by-case manual configuration. This would mean spending a lot of effort tuning all our containers to make sure they are properly configured to pass these tests.

Dive and multi-stage builds

To explore how to shrink the size of a Docker image, we used dive, a tool for inspecting a Docker/OCI image and its layer contents and for discovering ways to reduce its size by calculating an efficiency ratio. This ratio is calculated by checking how many files are repeated across layers (for example, as a result of chmod operations). For this reason, we decided to run this analysis on the heaviest assets we have in the catalog, such as Pytorch. In the case of Pytorch, we measured an efficiency of 51.13% because a chmod operation running in a separate layer duplicates the contents of one of the layers.
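
For reference, this is how an image can be inspected with dive; the interactive command is the standard usage, while the CI-mode flag shown is based on dive's documented options and may vary between versions:

# Explore the layers and efficiency score of an image interactively:
dive bitnami/pytorch:1.12.1-debian-11-r10

# Non-interactive (CI) mode, which can fail if the efficiency drops below a
# threshold (flag name taken from dive's documentation; check your version):
CI=true dive bitnami/pytorch:1.12.1-debian-11-r10 --lowestEfficiency=0.95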

The Bitnami engineering team also analyzed the use of multi-stage builds to check whether the steps above could help. First, we ran a basic test using the WordPress container: we took its whole Dockerfile as a reference and converted it into a wordpress-builder stage. Then, we built the final Dockerfile by installing only the dependencies and copying from the wordpress-builder stage the resulting /opt/bitnami folder, where all the binaries are installed. The result of that test was not as good as expected because an error occurred when running the WordPress container:

wordpress_1  | cp: cannot create regular file '/bitnami/wordpress/wp-config.php': Permission denied

COPY --from" does not preserve the permissions in our test:

Although there was a previous work at Multi-stage COPY --from should preserve ownership/permissions, that option was not working as expected and extra execution steps were needed to, again, adjust the file permissions, which could lead to the same problem we had before running the tool.

You might wonder, "Why not optimize the Dockerfile directly?" The answer is not that simple. At Bitnami, we don't create the Dockerfiles manually; we have a system in which each asset is modeled and its files are generated automatically. This is why we avoid tuning each asset's Dockerfile by hand: we must think about future maintenance and ensure that when an update of the asset arrives, it keeps working.

docker build --squash analysis

The last option we evaluated was adding the --squash flag when running the docker build phase. This squashes the newly built layers into a single new layer while preserving the base image layer. Thus, we would not change the container's contents or remove any files; it would only merge all the build layers into one, which sounded reasonable, but we had to verify whether it actually made sense.
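
As an illustration (this is not the exact command run by the Bitnami build system), squashing is enabled by passing the flag at build time; note that --squash requires the Docker daemon to have experimental features enabled, and the image name below is a placeholder:

#!/bin/bash

# Illustrative only: build an image with the newly created layers squashed
# into a single layer on top of the base image (daemon experimental features
# required). Image name and build context are placeholders.
docker build --squash -t myorg/myapp:latest .

# Review the resulting layer list:
docker history myorg/myapp:latest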

For this analysis, we built the whole catalog with and without the --squash option, then we checked the container image sizes again. This was the output data we got:
  • 312 container images built (remember that some products have several versions, like MariaDB)
  • 36 container images (11.54% of the Bitnami Application Catalog) had a size reduction of at least 20%:
    • 3 of them, more than 50%
    • 9 of them, between 40% and 50%
    • 6 of them, between 30% and 40%
    • 18 of them, between 20% and 30%
  • 135 container images (43.27%) had a size reduction smaller than 20% but larger than 1%
  • The rest (141 container images, 45.19%) had a reduction smaller than 1% (most of them are built from scratch, so it makes sense not to see any benefit)
For the entire Bitnami Application Catalog, we got a benefit of 25,219MB (roughly 25GB) saved in raw mode; remember that when we push the container images to a registry, they are usually compressed. This first result meant we were going to save a significant amount of traffic and storage per container release.

Bitnami builds its containers using VMware Image Builder, so we needed to check whether the vulnerability scans would keep working. We did this using some non-squashed and squashed container images. The Aqua Trivy scanner worked as expected (VMware Image Builder uses both Trivy and Anchore Grype as vulnerability scanners), producing the same output in the tests.
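
As an illustration of the kind of check performed (the exact invocations inside VMware Image Builder may differ), both scanners can be pointed at a squashed image directly:

#!/bin/bash

# Illustrative only: scan one of the squashed images locally with Trivy and
# Grype, and compare the findings with the non-squashed counterpart.
trivy image bitnami/magento:2.4.5-debian-11-r10
grype bitnami/magento:2.4.5-debian-11-r10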

After confirming that the container images worked as expected, we also checked the impact on repository storage to verify whether all the steps mentioned above were worthwhile for end users. To confirm that point, we pushed the container images with the biggest reductions in both flavors (squashed and non-squashed). After pushing them, we confirmed a storage reduction of more than 40% (when an image is pushed it is compressed first, but we still observed the benefit after compression).

On the other hand, we also found that squashing container images improves the speed when an end user pulls more than one Bitnami Docker image. Based on several tests and use cases, the download of the squashed images was 32% faster than that of the regular ones. This can be explained because, given our release cadence, the base image is usually the only shared layer, and although layers can be downloaded in parallel, Docker does not extract them in parallel.

Conclusions and next steps

After doing the analysis and weighing the pros and cons of each option, we concluded that preserving layers is not critical for the main use cases of Bitnami users: the chance of reusing layers is really low, and the base image layer, which is the one that stays cached, already carries the up-to-date OS changes. We have decided to implement the docker build --squash option in our catalog to reduce the container image sizes and help our end users, reducing the time you have to wait for a container image to be ready to use and alleviating repository storage for anyone who wants to use our images as a base for their own containers.

We are aware there could be other use cases where preserving the container layers could be required or beneficial, but based on our data, the approach that we have adopted would be beneficial in the majority of cases.

This will not be our only action to improve the catalog. We will keep working hard on it, and we will share the usage numbers of this implementation with the community. Read about how we implemented changes based on this analysis to improve the Bitnami end user experience in our follow-up post.

Support and resources

Looking to learn more or have more questions? Check out the new Bitnami GitHub repository for containers, and if you need to get a resolution on issues or want to send us your feedback, please open an issue. A markdown template is provided by default to open new issues, with certain information requested to help us prioritize and respond as soon as possible.

And if you want to contribute to the project, feel free to send us a pull request, and the team will check it and guide you in the process for a successful merge.

Boost your knowledge about Bitnami containers and Helm charts by checking our latest tutorials.