Date: Tue, 2 Nov 2021 05:33:31 -0500 (CDT)
From: Ariadne Conill
To: Éloi Rivard
cc: ~alpine/users@lists.alpinelinux.org
Subject: Re: Alpine Linux general performances
In-Reply-To: <6df8863e77b970b466dbfc9a3a5c2bcec3199f48.camel@aquilenet.fr>
Message-ID: <4dcedd5d-e2ce-e8e-e231-874997bbe9f6@dereferenced.org>
References: <6df8863e77b970b466dbfc9a3a5c2bcec3199f48.camel@aquilenet.fr>

Hi,

On Tue, 2 Nov 2021, Éloi Rivard wrote:

> Hi.
>
> My company has a cluster of LXC/LXD hypervisors that run Alpine and
> Debian containers, mostly for classical web services (nginx + uwsgi +
> Python apps + ZEO/ZODB database). We did some investigating when we
> felt a "general slowness" on our Alpine containers. With some testing
> we could rule out most of the I/O factors. In the end we found a
> significant difference on CPU-heavy operations, such as compiling
> Python, between Debian and Alpine containers.
>
> I ran a simple experiment: compile Python 3.10.0 on a fresh Alpine
> container, and then again on a fresh Debian container. I did this
> twice, once on my personal computer (Ryzen 7 1800X, 16 cores, 16 GB of
> RAM), and once on our production hypervisor (Xeon E5-2609 0 @ 2.40 GHz,
> 8 cores, 64 GB of RAM). Here are the results for the 'make' command
> (produced with 'time'):
>
> # hypervisor
>
> # alpine
> real 5m13
> user 4m40
> sys 0m33
>
> # debian
> real 3m01
> user 2m47
> sys 0m13
>
> # my personal computer
>
> # alpine
> real 3m50
> user 3m27
> sys 0m20
>
> # debian
> real 2m17
> user 2m07
> sys 0m07
>
> In both cases the Alpine containers take 66% more time to compile
> Python than the Debian ones. We also compiled Python directly on one
> of the hypervisors, and we observed results really close to the Debian
> container. [1]
>
> I wonder if this experiment correlates with the "general slowness" we
> felt on our production Alpine containers.
>
> Some people have written interesting things [2] about memory
> allocation in musl being slower than in glibc. This is quite technical
> and I am not sure what to think of it, as memory allocation is not
> really in my field of competence. In the end I am not even sure that
> poor memory allocation performance is the cause of the slow Python
> compilation, or of that "slowness feeling" I mentioned.
>
> So I wanted to ask here what you think of all this.
>
> - Is this situation well-known?

Yes.
It is known that some workloads, mostly those involving malloc and
heavy amounts of C string operations, benchmark poorly versus glibc.
This is largely because the security hardening features of musl and
Alpine are not zero-cost, but also because musl does not contain
micro-architecture-specific optimizations, meaning that on glibc you
might get strlen/strcpy-type functions that are hand-tuned for the
exact CPU you are using. In practice, though, performance is adequate
for most workloads.

> - Is the musl memory allocation a good lead to explain performance
>   differences?

It is known that some memory allocation patterns lead to bad
performance with the new hardened malloc. However, the security
benefits of the hardened malloc to some extent justify the performance
costs, in my opinion. We are still working to optimize the hardened
malloc.

One workaround might be to use jemalloc instead, which is available as
a package. I am investigating a way to make it possible to always use
jemalloc instead of the hardened malloc for performance-critical
workloads, but that will require some discussion with the musl author
which I haven't gotten to yet.

> - Can memory allocation explain the whole thing?

About 70%, I would say.

> - Is my experiment flawed because of some other factors?

Your experiment is valid, but your assumption that raw performance in
benchmarks is the metric that matters is flawed. The benefits of
Alpine are not raw performance, but reliability, memory efficiency and
security hardening.

Ariadne
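[Editor's note] One user-space way to probe the malloc angle without
recompiling anything is an allocation-heavy Python microbenchmark run
in both an Alpine and a Debian container: CPython requests larger than
512 bytes bypass its internal pymalloc small-object allocator and go
straight to the C library's malloc, so the gap between musl's hardened
allocator and glibc's shows through. This is a hypothetical sketch
(the sizes and iteration counts are arbitrary), not a rigorous
benchmark:

```python
# Hypothetical sketch: time allocation-heavy work in an Alpine (musl)
# container vs. a Debian (glibc) one.  Sizes/counts are arbitrary.
import timeit

def churn(n=1000):
    # Objects over 512 bytes skip CPython's pymalloc small-object
    # allocator and are serviced directly by the C library's malloc,
    # so this loop exercises musl/glibc malloc rather than pymalloc.
    blobs = [bytes(700) for _ in range(n)]
    return len(blobs)

# 2000 runs x 1000 allocations = 2,000,000 malloc-backed allocations.
elapsed = timeit.timeit(churn, number=2000)
print(f"2,000,000 allocations in {elapsed:.3f} s")
```

Comparing the printed time between the two containers gives a rough
signal for how much of the compile-time gap is malloc-bound rather
than due to string routines or other factors.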