Just to update: we've managed to find the issue. The musl-libc semaphore
implementation defaults to a limit of 256 semaphores per process, which was
insufficient when we're spawning quite a few sub-processes and queues from
one application. In particular the sem_open() function uses
#define _POSIX_SEM_NSEMS_MAX 256
as the limit (see git.musl-libc.org/cgit/musl/tree/include/limits.h#n63).
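A quick way to see a per-process semaphore ceiling like this from Python is to keep creating multiprocessing locks while holding on to every one of them. The following is only a rough sketch, not the script from this thread: the function name and the cap of 300 are made up for illustration, and it assumes each multiprocessing.Lock() holds one sem_open() semaphore for as long as the object is alive.

```python
import multiprocessing


def count_creatable_locks(cap=300):
    """Create multiprocessing locks until creation fails or cap is reached.

    Hypothetical helper: the name and the default cap are illustrative only.
    """
    held = []  # keep references so the underlying semaphores stay open
    try:
        for _ in range(cap):
            held.append(multiprocessing.Lock())
    except OSError as exc:
        # On a libc with a low per-process limit this is where creation stops.
        print("creation failed after %d locks: %s" % (len(held), exc))
    return len(held)


if __name__ == "__main__":
    print("locks created: %d" % count_creatable_locks())
```

On a system without a low limit the function should simply return the cap. On a stock musl build one would expect it to stop at or just below 256, though that has not been verified here.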
My colleague Toby found this and patched musl-libc to increase the limit to
4096 in our Alpine build - which now runs the required number of processes
and queues using the Python multiprocessing library. It now runs like a
Full details here:
From: Alex Butler <alex_at_ewinkle.com>
Sent: 14 August 2018 18:23
Subject: [alpine-user] Alpine limit on file descriptors?
We've been having some issues with what looks like some kind of limitation
on the maximum number of file descriptors (or a /dev/shm semaphore
limitation). Our application is in Python and uses the standard Python
multiprocessing library to create processes and associated queues for
communication (typically creating between 20 and 200 processes at start-up
depending on hardware configuration). It runs fine on Raspbian/Debian with
any number of processes we choose (within reason!) and runs fine under
Alpine when we run with low numbers of processes.
It however always barfs for larger numbers of processes under Alpine -
suggesting (from the reported OSError) that it is running out of file
descriptors. [Might be a red herring but might be related to the use of OS
semaphore management in /dev/shm. Just not sure!]
Anyway, after trying quite a few things we've narrowed it down to failing in
every stock flavour of Alpine we've tried (x64, Raspberry Pi etc) but which
just doesn't happen at all in the different flavours of
Is there some Alpine setting/limit which we haven't yet found which sets the
maximum number of file descriptors (or some other subtle Alpine difference)?
We've tried all the "obvious" Linux file descriptor changes like ulimit,
sysctl type changes etc.
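For anyone repeating these checks, one way to confirm from inside the Python process that a ulimit change actually took effect is to read RLIMIT_NOFILE and count the descriptors the process really has open. This is a small sketch under assumptions: the helper names are invented here, and the /proc/self/fd lookup only works on Linux with /proc mounted.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count():
    """Count this process's open descriptors via /proc (Linux only).

    Returns None when /proc/self/fd is not available.
    """
    try:
        return len(os.listdir("/proc/self/fd"))
    except OSError:
        return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("RLIMIT_NOFILE soft=%s hard=%s" % (soft, hard))
    print("open descriptors: %s" % open_fd_count())
```

If the open count is nowhere near the soft limit when the error appears, the descriptor limit itself is probably not what is being hit.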
To help recreate this we've created a simple Python script (attached).
Under Alpine (Raspberry Pi) it fails after the 85th process pair. If
MAX_PAIRS is set to 85 it works fine, i.e. no exceptions. Put in anything
bigger for MAX_PAIRS and we always get the following error message at the
data for 83 was [83001, 83002, 83003, 83004, 83005, 83006, 83007, 83008,
data for 84 was [84001, 84002, 84003, 84004, 84005, 84006, 84007, 84008,
data for 85 was [85001, 85002, 85003, 85004, 85005, 85006, 85007, 85008,
Traceback (most recent call last):
  File "queue_test.py", line 41, in <module>
    q = Queue()
  File "/usr/lib/python2.7/multiprocessing/__init__.py", line 218, in Queue
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 68, in __init__
    self._wlock = Lock()
  File "/usr/lib/python2.7/multiprocessing/synchronize.py", line 147, in
    SemLock.__init__(self, SEMAPHORE, 1, 1)
  File "/usr/lib/python2.7/multiprocessing/synchronize.py", line 75, in
    sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
OSError: [Errno 24] No file descriptors available
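Read against the 256 limit described in the follow-up at the top of this message, the point of failure is consistent with simple arithmetic. The sketch below rests on assumptions not confirmed in this thread: that the test script creates one Queue per process pair, that nothing else in the process holds semaphores, and that each Python 2.7 multiprocessing Queue creates three semaphores in the order _rlock, _wlock, _sem. The function name is made up for illustration.

```python
SEM_LIMIT = 256  # the musl limit quoted in the follow-up

# Assumption: semaphores created by one multiprocessing.Queue, in order.
QUEUE_SEMAPHORES = ["_rlock", "_wlock", "_sem"]


def first_failure(limit, attrs):
    """Return (queue_number, attribute) of the first semaphore past the limit.

    Queues are numbered from 1 and each creates len(attrs) semaphores in order.
    """
    n = limit + 1  # 1-based index of the first semaphore that cannot be created
    queue_number = (n - 1) // len(attrs) + 1
    attr = attrs[(n - 1) % len(attrs)]
    return queue_number, attr


if __name__ == "__main__":
    print(first_failure(SEM_LIMIT, QUEUE_SEMAPHORES))  # -> (86, '_wlock')
```

Under those assumptions, 85 queues use 255 semaphores, the 86th queue's _rlock is number 256, and its _wlock is the 257th, which matches both "fails after the 85th process pair" and the `self._wlock = Lock()` frame in the traceback.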
As I said - on other Linux distros this code runs fine. We'd _really_ like
to use Alpine for a variety of obvious reasons. It's not obvious what is
going on and not being able to run multiprocessing to the level of
parallelism we need might be a deal-breaker.
Incidentally, at MAX_PAIRS = 85 (when the test code runs fine), doing a
"lsof | wc -l" reveals about 29991 file descriptors (~30k).
I've attached a copy of the test Python code for ease of replication. We
just run it as root using "/usr/bin/python queue_test.py"
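The attachment itself is not reproduced in this archive. For readers who want to try something similar, here is a minimal sketch of a script with the same general shape. It is not the original queue_test.py: it is written for Python 3, the worker function, the list contents and the small default MAX_PAIRS are guesses based on the output quoted above, and it assumes one Queue per process with all queues kept alive until the end.

```python
import multiprocessing

MAX_PAIRS = 5  # hypothetical small default; the report used values around 85


def worker(index, queue):
    # Send back a short list derived from the index (contents are a guess).
    queue.put([index * 1000 + n for n in range(1, 9)])


def run_pairs(max_pairs):
    """Start max_pairs process/queue pairs and collect one result from each."""
    # Use the "fork" start method explicitly (Unix only) for this sketch.
    ctx = multiprocessing.get_context("fork")
    pairs = []  # keep every queue alive so its semaphores stay open
    results = {}
    for index in range(1, max_pairs + 1):
        q = ctx.Queue()
        p = ctx.Process(target=worker, args=(index, q))
        p.start()
        pairs.append((p, q))
        results[index] = q.get()
    for p, _ in pairs:
        p.join()
    return results


if __name__ == "__main__":
    for index, data in sorted(run_pairs(MAX_PAIRS).items()):
        print("data for %d was %s" % (index, data))
```

Raising MAX_PAIRS is the knob to turn when looking for the point at which queue creation starts to fail.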
Any help or suggestions as to what might be going on gratefully received!
Received on Thu Aug 16 2018 - 16:00:04 GMT