~alpine/users

Alpine Linux general performances

Message-ID: <6df8863e77b970b466dbfc9a3a5c2bcec3199f48.camel@aquilenet.fr>
Hi.

My company has a cluster of LXC/LXD hypervisors that run Alpine and Debian
containers, mostly for classical web services (nginx + uwsgi + Python apps +
ZEO/ZODB database). We started investigating when we felt a "general
slowness" on our Alpine containers. With some testing we could rule out most
of the I/O factors. In the end we found a significant difference on CPU-heavy
operations, like for instance the compilation of Python, between Debian and
Alpine containers.

I made a simple experiment: compile Python 3.10.0 on a fresh Alpine container,
and then again on a fresh Debian container. I made this twice, once on my
personal computer (Ryzen 7 1800x 16 cores - 16Gb of RAM), and once on our
production hypervisor (Xeon(R) E5-2609 0 @ 2.40GHz 8 cores - 64Gb of RAM). Here
are the results for the 'make' command (produced with 'time'):

# hypervisor

   # alpine
   real     5m13
   user     4m40
   sys      0m33

   # debian
   real     3m01
   user     2m47
   sys      0m13

# my personal computer

   # alpine
   real    3m50
   user    3m27
   sys     0m20

   # debian
   real    2m17
   user    2m07
   sys     0m07

In both cases the Alpine containers take roughly 70% more time to compile
Python than the Debian ones. We also compiled Python directly on one of the
hypervisors, and we observed results really close to the Debian container. [1]
I wonder whether this experiment is related to the "general slowness" we felt
on our production Alpine containers.
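
For reference, the build itself was nothing special; roughly the following,
run inside each fresh container (a sketch - the dependency lists are
approximate, the point is only to time the same 'make' on both systems):

   # alpine:  apk add build-base zlib-dev libffi-dev openssl-dev
   # debian:  apt install build-essential zlib1g-dev libffi-dev libssl-dev
   wget https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tgz
   tar xzf Python-3.10.0.tgz && cd Python-3.10.0
   ./configure
   time make     # if you use -j, keep the same value on both systems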

Some people have written interesting things [2] about memory allocation in
musl being slower than in glibc. This is quite technical and I am not sure
what to make of it, as memory allocation is not really my field of expertise.
In the end I am not even sure that poor memory allocation performance is the
cause of the slow Python compilation, or of the "slowness feeling" I mentioned.

So I wanted to ask here what you think of all this.

- Is this situation well-known?
- Is the musl memory allocation a good lead to explain performance differences?
- Can memory allocation explain the whole thing?
- Is my experiment flawed because of some other factors?

Éloi

[1]
https://discuss.linuxcontainers.org/t/performance-problem-container-slower-than-host-x1-2/12291/4
[2]
https://www.linkedin.com/pulse/testing-alternative-c-memory-allocators-pt-2-musl-mystery-gomes/
Message-ID: <4dcedd5d-e2ce-e8e-e231-874997bbe9f6@dereferenced.org>
In-Reply-To: <6df8863e77b970b466dbfc9a3a5c2bcec3199f48.camel@aquilenet.fr>
Hi,

On Tue, 2 Nov 2021, Éloi Rivard wrote:

> Hi.
>
> My company has a cluster of LXC/LXD hypervisors that run Alpine and Debian
> containers, mostly for classical web services (nginx + uwsgi + Python apps +
> ZEO/ZODB database). We started investigating when we felt a "general
> slowness" on our Alpine containers. With some testing we could rule out most
> of the I/O factors. In the end we found a significant difference on CPU-heavy
> operations, like for instance the compilation of Python, between Debian and
> Alpine containers.
>
> I made a simple experiment: compile Python 3.10.0 on a fresh Alpine container,
> and then again on a fresh Debian container. I made this twice, once on my
> personal computer (Ryzen 7 1800x 16 cores - 16Gb of RAM), and once on our
> production hypervisor (Xeon(R) E5-2609 0 @ 2.40GHz 8 cores - 64Gb of RAM). Here
> are the results for the 'make' command (produced with 'time'):
>
> # hypervisor
>
>   # alpine
>   real     5m13
>   user     4m40
>   sys      0m33
>
>   # debian
>   real     3m01
>   user     2m47
>   sys      0m13
>
> # my personal computer
>
>   # alpine
>   real    3m50
>   user    3m27
>   sys     0m20
>
>   # debian
>   real    2m17
>   user    2m07
>   sys     0m07
>
> In both cases the Alpine containers take roughly 70% more time to compile
> Python than the Debian ones. We also compiled Python directly on one of the
> hypervisors, and we observed results really close to the Debian container. [1]
> I wonder whether this experiment is related to the "general slowness" we felt
> on our production Alpine containers.
>
> Some people have written interesting things [2] about memory allocation in
> musl being slower than in glibc. This is quite technical and I am not sure
> what to make of it, as memory allocation is not really my field of expertise.
> In the end I am not even sure that poor memory allocation performance is the
> cause of the slow Python compilation, or of the "slowness feeling" I mentioned.
>
> So I wanted to ask here what you think of all this.
>
> - Is this situation well-known?

Yes.  It is known that some workloads, mostly those involving malloc and
heavy amounts of C string operations, benchmark poorly versus glibc.

This is largely because the security hardening features of musl and 
Alpine are not zero cost, but also because musl does not contain 
micro-architecture specific optimizations, meaning that on glibc you 
might get strlen/strcpy type functions that are hand-tuned for the exact 
CPU you are using.

In practice though, performance is adequate for most workloads.

> - Is the musl memory allocation a good lead to explain performance differences?

It is known that some memory allocation patterns lead to bad performance 
with the new hardened malloc.  However, the security benefits of the 
hardened malloc to some extent justify the performance costs, in my 
opinion.  We are still working to optimize the hardened malloc.

One workaround might be to use jemalloc instead, which is available as a 
package.  I am investigating a way to make it possible to always use 
jemalloc instead of the hardened malloc for performance-critical 
workloads, but that will require some discussion with the musl author 
which I haven't gotten to yet.
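
In the meantime, if you want to measure how much the allocator alone
contributes on your workload, you can preload jemalloc for a single process
and rerun your benchmark.  A rough sketch (the library path may differ;
check the actual filename with 'apk info -L jemalloc'):

   apk add jemalloc
   # run the same build with jemalloc instead of musl's hardened malloc
   LD_PRELOAD=/usr/lib/libjemalloc.so.2 make

If the gap shrinks a lot, the allocator is the dominant factor; if not,
other things (e.g. string functions, -Os) are involved as well.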

> - Can memory allocation explain the whole thing?

About 70%, I would say.

> - Is my experiment flawed because of some other factors?

Your experiment is valid, but your assumption that raw performance in 
benchmarks is the metric that matters is flawed.  The benefits of Alpine 
are not raw performance, but reliability, memory efficiency and security 
hardening.

Ariadne
Message-ID: <ef1b404ddc3298adc2981a1453f7ff2e@aquilenet.fr>
In-Reply-To: <4dcedd5d-e2ce-e8e-e231-874997bbe9f6@dereferenced.org>
Thank you for your valuable answer. I wasn't aware of that 
security-performance trade-off.

> One workaround might be to use jemalloc instead, which is available as
> a package.  I am investigating a way to make it possible to always use
> jemalloc instead of the hardened malloc for performance-critical
> workloads, but that will require some discussion with the musl author
> which I haven't gotten to yet.

Interesting. Would that take the form of a musl-jemalloc package for a 
system-wide usage?

Do you have thoughts on mimalloc and the Emerson Gomes benchmark 
blog post?

Help booting Alpine

Graham Bentley <admin@cpcnw.co.uk>
Message-ID: <95eaadfe-6add-6dfb-9f40-beb5b28d9052@cpcnw.co.uk>
In-Reply-To: <ef1b404ddc3298adc2981a1453f7ff2e@aquilenet.fr>
Hi - I've burnt the image to a USB stick using Rufus as per the wiki and
am having no luck getting it to boot.

Tried front/back USB ports and a different USB stick, as well as burning
the .iso to DVD, yet still no joy. The system is a Lenovo M93.

This is the error message:

---
Mounting boot media failed
initramfs emergency recover shell launched
Type exit to continue to boot
sh: cant access tty; job control turned off
/#
---
I get this whether I boot via UEFI or BIOS.

Any pointers?

Thanks!

Re: Help booting Alpine

Message-ID: <cbc7b581-4130-f023-f27b-41a359a41c8b@vincentbentley.co.uk>
In-Reply-To: <95eaadfe-6add-6dfb-9f40-beb5b28d9052@cpcnw.co.uk>
I usually try an old 1GB or 2GB USB stick if I experience a problem 
booting a new distro. If that doesn't work, I try burning a CD or DVD 
and booting from USB optical media. One of them usually works.
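
If you have another Linux machine available, it can also be worth writing
the stick with plain dd instead of Rufus, to rule the imaging tool out.
A sketch (replace the iso name with the one you downloaded and /dev/sdX
with your actual stick - this erases the device, so double-check with lsblk):

   dd if=alpine-standard-x.y.z-x86_64.iso of=/dev/sdX bs=4M
   sync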

On 02/11/2021 20:33, Graham Bentley wrote:
> Hi - I've burnt the image to a USB stick using Rufus as per the wiki and
> am having no luck getting it to boot.
> 
> Tried front/back USB ports and a different USB stick, as well as burning
> the .iso to DVD, yet still no joy. The system is a Lenovo M93.
> 
> This is the error message:
> 
> ---
> Mounting boot media failed
> initramfs emergency recover shell launched
> Type exit to continue to boot
> sh: cant access tty; job control turned off
> /#
> ---
> I get this whether I boot via UEFI or BIOS.
> 
> Any pointers?
> 
> Thanks!
Jakub Jirutka <jakub@jirutka.cz>
Message-ID: <104c22a5-a0e7-bd1a-9506-a66222fef718@jirutka.cz>
In-Reply-To: <ef1b404ddc3298adc2981a1453f7ff2e@aquilenet.fr>
Hi,

about the allocator: another option is mimalloc [1] which performs even better than jemalloc, can be built in secure mode (adding guard pages, randomized allocation, encrypted free lists, etc. to protect against various heap vulnerabilities) and IIRC it’s also smaller. I’ve already packaged it, but it’s still in testing and not used by any package.

There are two versions of mimalloc, each packaged in two variants:

mimalloc1 [2] – stable release of mimalloc built in secure mode
mimalloc1-insecure [3] - stable release of mimalloc built in insecure (default) mode
mimalloc2 [4] - beta release of mimalloc built in secure mode
mimalloc2-insecure [5] - beta release of mimalloc built in insecure (default) mode

Upstream considers the insecure mode the default; I’ve swapped it, so the default and preferred variant is the secure mode. According to the docs, secure mode has a ~10 % performance penalty.
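
For context, the secure/insecure split corresponds to a build-time switch in mimalloc itself. Building it from source looks roughly like this (a sketch – not necessarily exactly how the Alpine packages are built):

   git clone https://github.com/microsoft/mimalloc.git
   cd mimalloc && mkdir -p build && cd build
   # MI_SECURE=ON enables guard pages, randomized allocation and encrypted
   # free lists; OFF gives the faster "insecure" default
   cmake -DMI_SECURE=ON ..
   make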

About performance in general: the other cause of worse performance is that most packages are built with -Os (the gcc/clang option to optimize for size – it enables all -O2 optimizations except those that often increase code size) instead of the -O2 or -O3 that everyone else uses. However, this most likely doesn’t affect your benchmark of compiling Python in any way. The problem is that -Os is the default option on Alpine; we use -O2 only in some arbitrary packages – when the maintainer or one of the devs comes to the conclusion that it truly makes no sense to sacrifice performance for a very small size difference in a package that is tens of megabytes in size (e.g. PostgreSQL), and/or some users complained about bad performance (e.g. Node.js). And there are more problems related to -Os and the fact that almost no upstream takes this option into account.

Personally, I’d like to change this default, but I haven’t started a discussion on this topic yet – I’ll do it later, outside the ML first; right now I’m too busy with preparations for the v3.15 release.
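
To make it concrete, the default comes from abuild’s configuration and individual packages override it in their APKBUILD. Roughly (a sketch – the exact default line may differ between releases):

   # /etc/abuild.conf – system-wide default flags used when building packages
   export CFLAGS="-Os -fomit-frame-pointer"

   # inside an APKBUILD that opts out of -Os (later -O flags win in gcc/clang)
   export CFLAGS="$CFLAGS -O2"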

Jakub

[1]: https://github.com/microsoft/mimalloc
[2]: https://pkgs.alpinelinux.org/packages?name=mimalloc1&branch=edge
[3]: https://pkgs.alpinelinux.org/packages?name=mimalloc1-insecure&branch=edge
[4]: https://pkgs.alpinelinux.org/packages?name=mimalloc2&branch=edge
[5]: https://pkgs.alpinelinux.org/packages?name=mimalloc2-insecure&branch=edge

On 11/2/21 8:10 PM, Éloi Rivard wrote:
> Thank you for your valuable answer. I wasn't aware of that security-performance trade-off.
> 
>> One workaround might be to use jemalloc instead, which is available as
>> a package.  I am investigating a way to make it possible to always use
>> jemalloc instead of the hardened malloc for performance-critical
>> workloads, but that will require some discussion with the musl author
>> which I haven't gotten to yet.
> 
> Interesting. Would that take the form of a musl-jemalloc package for a system-wide usage?
> 
> Do you have thoughts on mimalloc and the Emerson Gomes benchmark blog post?