~alpine/devel

Use of supervise-daemon in Alpine

Message-ID: <3LLUI2KOULSYM.359WA6HATX45B@8pit.net>

Hello,

OpenRC ships a program called supervise-daemon(8) which is capable of
starting daemons and restarting them if they crash. Unlike
start-stop-daemon, it does not rely on PID files; instead, the started
daemon runs as a child process of supervise-daemon.
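
To make the model concrete, here is a minimal Python sketch of supervision
by parenthood. It is not supervise-daemon's actual code: the function name
`supervise`, the `max_respawns` and `delay` parameters, and the policy of
restarting only on a non-zero exit status are all assumptions made for
illustration.

```python
import subprocess
import sys
import time


def supervise(argv, max_respawns=3, delay=0.0):
    """Run argv as a direct child and restart it when it exits abnormally.

    Returns the list of exit codes observed. No PID file is involved:
    the supervisor learns about each exit by waiting on its own child.
    """
    exit_codes = []
    while True:
        child = subprocess.Popen(argv)  # the daemon runs as our child process
        code = child.wait()             # blocks until the child exits and reaps it
        exit_codes.append(code)
        if code == 0:
            break                       # clean exit: stop supervising (sketch policy)
        if len(exit_codes) > max_respawns:
            break                       # initial start plus max_respawns restarts used up
        time.sleep(delay)               # optional pause before the next restart
    return exit_codes
```

With a hypothetical "daemon" that always fails,
`supervise([sys.executable, "-c", "import sys; sys.exit(1)"], max_respawns=2)`
starts the child three times (one start plus two respawns) and returns
`[1, 1, 1]`.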

Some Alpine OpenRC services already use supervise-daemon(8) (e.g.
unbound, xdm, wpa_supplicant, …). I was recently wondering if we want to
migrate busybox-initscripts to using supervise-daemon too and was
pointed to some comments in the GitLab issue tracker that criticize the use
of supervise-daemon for busybox-initscripts because of concerns over
memory usage [0]. In further discussion on IRC, some people also
expressed discomfort with the automatic restarting of crashed
services (“you don't want to mask crashes”).

However, the primary benefit I personally see with widespread use of
supervise-daemon is that it would allow us to get rid of racy PID files.
I would therefore propose that we enable supervise-daemon whenever
possible in existing OpenRC services (including busybox-initscripts).
In any case, it would be nice to clarify when using supervise-daemon is
encouraged (see existing examples above) and when it isn't.
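
The PID file problem can be illustrated with a short Python sketch (all
helper names here are hypothetical). Once a daemon tracked through a PID
file has exited, the file keeps naming a PID that the kernel may later
assign to an unrelated process, whereas a parent that waits on its own
child is told about the exit directly.

```python
import os
import subprocess
import sys
import tempfile


def start_with_pidfile(argv, pidfile):
    """Start argv and record its PID in a file, as a PID-file-based manager would."""
    child = subprocess.Popen(argv)
    with open(pidfile, "w") as f:
        f.write(str(child.pid))
    return child


def read_pidfile(pidfile):
    """Return the PID stored in pidfile as an integer."""
    with open(pidfile) as f:
        return int(f.read().strip())


def demo():
    """Show that the PID file outlives the process it describes."""
    with tempfile.TemporaryDirectory() as tmp:
        pidfile = os.path.join(tmp, "daemon.pid")
        child = start_with_pidfile([sys.executable, "-c", "pass"], pidfile)
        child.wait()  # the "daemon" has exited and has been reaped
        stale_pid = read_pidfile(pidfile)
        # The file still names the old PID. Nothing ties that number to our
        # daemon any more, so signalling it later would be a guess.
        return stale_pid == child.pid, child.returncode
```

Here `demo()` returns `(True, 0)`: the process is gone, yet the file still
reports its PID as if nothing had happened.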

Thoughts?

Greetings,
Sören

[0]: https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/1363#note_56289

Message-ID: <f7b0aab5-4dc6-69f1-048d-f90f59891b45@dereferenced.org>
In-Reply-To: <3LLUI2KOULSYM.359WA6HATX45B@8pit.net>

Hello,

On 2020-08-20 11:04, Sören Tempel wrote:
> Hello,
> 
> OpenRC ships a program called supervise-daemon(8) which is capable of
> starting daemons and restarting them if they crash. Contrary to
> start-stop-daemon, it does not rely on PID files instead the started
> daemon is a child process of supervise-daemon.
> 
> Some Alpine OpenRC services already use supervise-daemon(8) (e.g.
> unbound, xdm, wpa_supplicant, …). I was recently wondering if we want to
> migrate busybox-initscripts to using supervise-daemon too and was
> pointed to some comments in the GitLab issue tracker which critique use
> of supervise-daemon for busybox-initscripts because of concern over
> memory usage [0]. Upon further discussion in the IRC some people also
> expressed discomfort in regards to the automatic restarting of crashed
> services (“you don't want to mask crashes”).

I would rather mask crashes than deal with a 3 AM phone call.  As long 
as the crashes are logged (which supervise-daemon does), I don't see the 
problem.

> However, the primary benefit I personally see with widespread use of
> supervise-daemon is that it would allow us to get rid of racy PID files.
> I would therefore propose that we enable supervise-daemon whenever
> possible in existing OpenRC services (including busybox-initscripts).
> In any case it would be nice to clarify when using supervisor-daemon is
> encouraged (see existing examples above) and when it isn't.

Yes, let's finish moving to supervise-daemon for 3.13.

Ariadne

From: Laurent Bercot <ska-devel@skarnet.org>
Message-ID: <ema98f4fba-1095-4a5f-90fc-07f8ffdec5e8@elzian>
In-Reply-To: <f7b0aab5-4dc6-69f1-048d-f90f59891b45@dereferenced.org>

  Back in 2015, I was pushing Alpine to move to a supervision system,
I packaged s6 for Alpine, and unless something happened that I'm not
aware of, it's still working out of the box. I spent a lot of time on
IRC trying to convince developers of the virtues of supervision, and
of s6 in particular.

  What I gathered from the conversations was that there *was* theoretical
interest, but moving to a supervision system was not work that the devs
wanted to prioritize; if I wanted the move to happen, I would have to
put in the work myself.

  I had other obligations at the time, so I did not prioritize the work
either - so the subject remained unresolved. I was not made to feel,
though, that the door had been closed.

  I am glad to see that Alpine is finally coming around and embracing
the supervision model. However, I wonder why you are choosing
supervise-daemon, which is technically inferior, when s6 has been
available on Alpine for years and when I have always signalled my
desire to help migrate.

  I have to admit that it feels slightly unpleasant to have been pushing
for a solution for *years* and always been met with very tepid
enthusiasm, and today, at the first mention of supervise-daemon, there
is immediate approval and you are going to switch right away.
If for some reason Alpine was never going to switch to s6, had an issue
not with the concept, but with the implementation, or with me, which is
what this suggestion seems to imply, then it would have been good
expectation management to be honest and say it from the get go.

  If, however, my interpretation is wrong, I am *still* available to
help with a migration to s6, and will actually have time starting in
September to do the grunt work with the Alpine init scripts.

--
  Laurent

From: Andy Postnikov <apostnikov@gmail.com>
Message-ID: <CAM0T6JfoGv8fqF9vGcDBkHUh8v3HRt3qxFonSkcNZo=21y4cmg@mail.gmail.com>
In-Reply-To: <ema98f4fba-1095-4a5f-90fc-07f8ffdec5e8@elzian>

Just wanted to mention that I've been using s6 a lot inside alpine:*
containers for the last few years,
and other supervisors still have not convinced me to switch.

Laurent, thank you a lot for keeping s6 alive and stable!

On Fri, 21 Aug 2020 at 00:12, Laurent Bercot <ska-devel@skarnet.org> wrote:

>
>   Back in 2015, I was pushing Alpine to move to a supervision system,
> I packaged s6 for Alpine, and unless something happened that I'm not
> aware of, it's still working out of the box. I spent a lot of time on
> IRC trying to convince developers of the virtues of supervision, and
> of s6 in particular.
>
>   What I gathered from the conversations was that there *was* theoretical
> interest, but moving to a supervision system was not work that the devs
> wanted to prioritize; if I wanted the move to happen, I would have to
> put in the work myself.
>
>   I had other obligations at the time, so I did not prioritize the work
> either - so the subject remained unresolved. I was not made to feel,
> though, that the door had been closed.
>
>   I am glad to see that Alpine is finally coming around and embracing
> the supervision model. However, I wonder why you are choosing
> supervision-daemon, which is technically inferior, when s6 has been
> available on Alpine for years and when I have always signalled my
> desire to help migrate.
>
>   I have to admit that it feels slightly unpleasant to have been pushing
> for a solution for *years* and always been met with very tepid
> enthusiasm, and today, at the first mention of supervise-daemon, there
> is immediate approval and you are going to switch right away.
> If for some reason Alpine was never going to switch to s6, had an issue
> not with the concept, but with the implementation, or with me, which is
> what this suggestion seems to impliy, then it would have been good
> expectation management to be honest and say it from the get go.
>
>   If, however, my interpretation is wrong, I am *still* available to
> help with a migration to s6, and will actually have time starting in
> September to do the grunt work with the Alpine init scripts.
>
> --
>   Laurent
>


-- 
*Andy Postnikov*, drupal consultant

dgo.to/@andypost
skype:andypost2005

Message-ID: <5b2db58f-ad11-e4ac-eede-66dc86b3ddcf@dereferenced.org>
In-Reply-To: <ema98f4fba-1095-4a5f-90fc-07f8ffdec5e8@elzian>

Hello,

On 2020-08-20 15:12, Laurent Bercot wrote:
> 
>   Back in 2015, I was pushing Alpine to move to a supervision system,
> I packaged s6 for Alpine, and unless something happened that I'm not
> aware of, it's still working out of the box. I spent a lot of time on
> IRC trying to convince developers of the virtues of supervision, and
> of s6 in particular.
> 
>   What I gathered from the conversations was that there *was* theoretical
> interest, but moving to a supervision system was not work that the devs
> wanted to prioritize; if I wanted the move to happen, I would have to
> put in the work myself.
> 
>   I had other obligations at the time, so I did not prioritize the work
> either - so the subject remained unresolved. I was not made to feel,
> though, that the door had been closed.

The main difference is that the other approach to switching to a 
supervision model uses a program we already ship in Alpine -- 
supervise-daemon(8).

>   I am glad to see that Alpine is finally coming around and embracing
> the supervision model. However, I wonder why you are choosing
> supervision-daemon, which is technically inferior, when s6 has been
> available on Alpine for years and when I have always signalled my
> desire to help migrate.

While s6 may be superior to supervise-daemon, the fact that 
supervise-daemon is already in the base system *and* is explicitly 
designed to integrate with OpenRC are the reasons it was selected.

I also remember discussing s6 openrc integration with you personally, 
and you describing it as flawed.  At the end of the day, Alpine is, at 
least for the foreseeable future, tied to OpenRC in some way.  While 
there are proposals to replace OpenRC in various stages of development, 
we simply aren't there yet.  It is possible s6 could be selected as a 
component of one of those proposals.

>   I have to admit that it feels slightly unpleasant to have been pushing
> for a solution for *years* and always been met with very tepid
> enthusiasm, and today, at the first mention of supervise-daemon, there
> is immediate approval and you are going to switch right away.

This is not an accurate characterization of events.  We have allowed 
individual maintainers to use supervise-daemon in their packages for 
some time, and have been migrating individual system services to 
supervise-daemon since Jakub Jirutka began doing it in 2016.  This 
proposal is about finishing that migration -- which makes it easier for 
us to jump to another service manager later, as more initscripts are 
moved to declarative style.  Such a step would be necessary to adopt a 
new service manager, which may or may not use s6.

> If for some reason Alpine was never going to switch to s6, had an issue
> not with the concept, but with the implementation, or with me, which is
> what this suggestion seems to impliy, then it would have been good
> expectation management to be honest and say it from the get go.
> 
>   If, however, my interpretation is wrong, I am *still* available to
> help with a migration to s6, and will actually have time starting in
> September to do the grunt work with the Alpine init scripts.

We see s6 as a post-openrc option, because openrc already has a 
supervisor that is integrated into openrc, but we are not quite ready to 
make a jump away from OpenRC yet.  Other initiatives, such as 
ifupdown-ng, are intended, in part, to demonstrate the architecture we 
would want for a service manager.

Ariadne

From: Laurent Bercot <ska-devel@skarnet.org>
Message-ID: <emdc86f515-08e2-457b-a19c-2fd31e8dca72@elzian>
In-Reply-To: <5b2db58f-ad11-e4ac-eede-66dc86b3ddcf@dereferenced.org>

>The main difference is that the other approach to switching to a supervision model uses a program we already ship in Alpine -- supervise-daemon(8).

  s6 is also shipped in Alpine. It may not be part of the base system,
but that is a purely political decision. And despite my efforts to
satisfy the Alpine requirements (reducing the s6 disk footprint,
reducing the number of binaries, making the execline dependency optional,
etc.) the political will has constantly been lacking. So, forgive me
for not being convinced by this argument.


>While s6 may be superior to supervise-daemon, the fact that supervise-daemon is already in the base system *and* is explicitly designed to integrate with OpenRC are the reasons it was selected.
>
>I also remember discussing s6 openrc integration with you personally, and you describing it as flawed.  At the end of the day, Alpine is, at least for the foreseeable future, tied to OpenRC in some way.  While there are proposals to replace OpenRC in various stages of development, we simply aren't there yet.  It is possible s6 could be selected as a component of one of those proposals.

  But that's the thing: supervise-daemon does not integrate with OpenRC
any better than s6 does (barring, obviously, potential specific
interface extensions where the devs have put in the effort, which is
akin to vendor lock-in), and the flaws that exist with the s6 support as
currently implemented in OpenRC are present *in the exact same manner*
with supervise-daemon. The problem is not tied to a supervision
implementation in particular, it is tied to the way the service manager
uses the supervision system. supervise-daemon brings no architectural
advantage here; the only advantage it brings is that it is homemade by
the OpenRC team, so it reduces the number of support contact points for
you. Which is a double-edged sword.

  I know that Alpine is too reliant on OpenRC at the moment to consider
a change of service managers in the near future; that is not what I'm
talking about. What I'm saying is that there *are* ways to use s6 and
OpenRC together:
  - with the s6 support that is provided with OpenRC: it's flawed, but
not any more than the supervise-daemon support.
  - or with a small architectural redesign that ties a supervision
suite and a service manager in the correct way. The work has already
been done: this is how Adélie Linux operates. It would be a minimal
amount of effort to port this to Alpine.


>This is not an accurate characterization of events.  We have allowed individual maintainers to use supervise-daemon in their packages for some time, and have been migrating individual system services to supervise-daemon since Jakub Jirutka began doing it in 2016.  This proposal is about finishing that migration -- which makes it easier for us to jump to another service manager later, as more initscripts are moved to declarative style.  Such a step would be necessary to adopt a new service manager, which may or may not use s6.

  Again, I am not talking about adopting a new service manager, which is
indeed heavier work. I am talking about using s6 as a supervision suite
together with the OpenRC service manager. It's what I have been saying
from the start, but for some reason, it was never heard, and the
argument you are giving "we cannot replace the service manager at the
moment" is a strawman one. Yes, there is an s6-rc service manager and
I am working on a complete init sequence integration with friendlier
user commands, but this does not mean that you need the whole shebang
in order to use s6: s6 and OpenRC can work together in a satisfactory
way, as proven by Adélie.


>We see s6 as a post-openrc option, because openrc already has a supervisor that is integrated into openrc, but we are not quite ready to make a jump away from OpenRC yet.  Other initiatives, such as ifupdown-ng, are intended in part, to demonstrate the architecture we would want for a service manager.

  What I am reading is "we don't want to mix-and-match software, so as
long as we're tied to OpenRC, we prefer using the supervision
implementation that they are providing". Which is a reasonable argument
that I can hear, but it is the first time you are making it, and I am
left wondering why nobody has made it since 2016, or why you are still
also answering with "we're not ready to switch service managers" which
is irrelevant for now.

  In any case, you've (finally) made it clear that an s6+OpenRC solution
is definitely not what you want, so I'll stop pushing, and keep working
on the integrated init system instead. It's coming along slowly, but not
any more slowly than the Alpine zeitgeist change, so it will be ready
in time. :P

--
  Laurent

Message-ID: <4d10c39a-33cd-17b9-f6f7-9ca91b8d2e14@dereferenced.org>
In-Reply-To: <emdc86f515-08e2-457b-a19c-2fd31e8dca72@elzian>

Hello,

On 2020-08-21 02:54, Laurent Bercot wrote:
>> The main difference is that the other approach to switching to a 
>> supervision model uses a program we already ship in Alpine -- 
>> supervise-daemon(8).
> 
>   s6 is also shipped in Alpine. It may not be part of the base system,
> but that is a purely political decision. And despite my efforts to
> satisfy the Alpine requirements (reducing the s6 disk footprint,
> reducing the number of binaries, making the execline dependency optional,
> etc.) the political will has constantly been lacking. So, forgive me
> for not being convinced by this argument.

supervise-daemon(8) already exists on every image we ship.  Adding s6 
simply for OpenRC to consume is not useful to us, as we still ship 
supervise-daemon(8) in every image.  So it makes sense to use what we 
are already shipping.

>> While s6 may be superior to supervise-daemon, the fact that 
>> supervise-daemon is already in the base system *and* is explicitly 
>> designed to integrate with OpenRC are the reasons it was selected.
>>
>> I also remember discussing s6 openrc integration with you personally, 
>> and you describing it as flawed.  At the end of the day, Alpine is, at 
>> least for the foreseeable future, tied to OpenRC in some way.  While 
>> there are proposals to replace OpenRC in various stages of 
>> development, we simply aren't there yet.  It is possible s6 could be 
>> selected as a component of one of those proposals.
> 
>   But that's the thing: supervise-daemon does not integrate with OpenRC
> any better than s6 does (barring, obviously, potential specific
> interface extensions where the devs have put in the effort, which is
> akin to vendor lock-in), and the flaws that exist with the s6 support as
> currently implemented in OpenRC are present *in the exact same manner*
> with supervise-daemon. The problem is not tied to a supervision
> implementation in particular, it is tied to the way the service manager
> uses the supervision system. supervise-daemon brings no architectural
> advantage here; the only advantage it brings is that it is homemade by
> the OpenRC team, so it reduces the number of support contact points for
> you. Which is a double-edged sword.

If you would like to elaborate on this in depth, I would like to hear an 
explanation.  But Alpine is unlikely to ever provide a DJB-style 
supervision tree as a primary management interface, so hopefully your 
argument is not based around s6's use of a DJB-style supervision tree.

>   I know that Alpine is too reliant on OpenRC at the moment to consider
> a change of service managers in the near future; that is not what I'm
> talking about. What I'm saying is that there *are* ways to use s6 and
> OpenRC together:
>   - with the s6 support that is provided with OpenRC: it's flawed, but
> not any more than the supervise-daemon support.

It is more flawed to us because it is an additional dependency we have 
to ship in the image.

>   - or with a small architectural redesign that ties a supervision
> suite and a service manager in the correct way. The work has already
> been done: this is how Adélie Linux operates. It would be a minimal
> amount of effort to port this to Alpine.

I would like to hear more about this, but mostly in the context of 
better integrating supervise-daemon(8).

>> This is not an accurate characterization of events.  We have allowed 
>> individual maintainers to use supervise-daemon in their packages for 
>> some time, and have been migrating individual system services to 
>> supervise-daemon since Jakub Jirutka began doing it in 2016.  This 
>> proposal is about finishing that migration -- which makes it easier 
>> for us to jump to another service manager later, as more initscripts 
>> are moved to declarative style.  Such a step would be necessary to 
>> adopt a new service manager, which may or may not use s6.
> 
>   Again, I am not talking about adopting a new service manager, which is
> indeed heavier work. I am talking about using s6 as a supervision suite
> together with the OpenRC service manager. It's what I have been saying
> from the start, but for some reason, it was never heard, and the
> argument you are giving "we cannot replace the service manager at the
> moment" is a strawman one. Yes, there is a s6-rc service manager and
> I am working on a complete init sequence integration with friendlier
> user commands, but this does not mean that you need the whole shebang
> in order to use s6: s6 and OpenRC can work together in a satisfactory
> way, as proven by Adélie.
> 
> 
>> We see s6 as a post-openrc option, because openrc already has a 
>> supervisor that is integrated into openrc, but we are not quite ready 
>> to make a jump away from OpenRC yet.  Other initiatives, such as 
>> ifupdown-ng, are intended in part, to demonstrate the architecture we 
>> would want for a service manager.
> 
>   What I am reading is "we don't want to mix-and-match software, so as
> long as we're tied to OpenRC, we prefer using the supervision
> implementation that they are providing".

Correct.  We don't see any value in the s6+OpenRC combination over 
supervise-daemon.  Also, many initscripts in Adélie are using 
supervise-daemon too, as they are shared with their Alpine counterparts.

> Which is a reasonable argument that I can hear, but it is the first time
> you are making it, and I am left wondering why nobody has made it since
> 2016, or why you are still also answering with "we're not ready to switch
> service managers" which is irrelevant for now.

We have been making this argument since 2016.  I distinctly remember 
bringing it up when discussing s6+OpenRC with you.

>   In any case, you've (finally) made it clear that a s6+OpenRC solution
> is definitely not what you want, so I'll stop pushing, and keep working
> on the integrated init system instead. It's coming along slowly, but not
> any more slowly than the Alpine zeitgeist change, so it will be ready
> in time. :P

I cannot guarantee that we will select s6-rc as a replacement for OpenRC.
We have very specific requirements for the new service manager, and
the requirements list is evolving over time.  The key non-negotiable
things we would like to see are:

* Native support for network namespaces, VRFs and seccomp.
* Support for consuming systemd units without having to jump through hoops.
* An event triggered architecture -- plugging in a new device should 
start any services needed to manage that device.
* Backwards-compatibility with OpenRC runscripts, so we don't have to 
port everything to systemd-style units right away.
* Users interact with a friendly set of CLI tools that are similar to 
OpenRC, no editing DJB-style supervision trees, etc.

If s6-rc can provide these features, we will look at it.  That would 
interest us.

Ariadne

From: Konstantin Kulikov <k.kulikov2@gmail.com>
Message-ID: <CAD+eXGTbxeg1hCUfFBHYdpTPKnxkb98fgzyK2WPx6DGG6vmaCA@mail.gmail.com>
In-Reply-To: <4d10c39a-33cd-17b9-f6f7-9ca91b8d2e14@dereferenced.org>

How can anyone seriously consider switching globally to a supervision
service that cannot even properly redirect a daemon's stderr to a file?
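
For reference, this is what redirecting a supervised child's stderr to a
file amounts to at the process level. It is a generic Python sketch, not
supervise-daemon's code, and it makes no claim about how supervise-daemon
behaves; the names `run_with_stderr_log` and `demo` are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_with_stderr_log(argv, log_path):
    """Run argv as a child with its stderr appended to log_path.

    Returns the child's exit code.
    """
    with open(log_path, "ab") as log:
        # The child inherits the open file as its stderr (file descriptor 2).
        return subprocess.run(argv, stderr=log).returncode


def demo():
    """Run a child that writes to stderr and fails, then read the log back."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "daemon.err")
        child = [sys.executable, "-c",
                 "import sys; sys.stderr.write('boom'); sys.exit(3)"]
        code = run_with_stderr_log(child, log_path)
        with open(log_path, "rb") as f:
            logged = f.read().decode().strip()
        return code, logged
```

Here `demo()` returns `(3, "boom")`: the exit status reaches the parent and
the error text ends up in the log file.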

Message-ID: <20200821191507.7857010b@ncopa-macbook.copa.dup.pw>
In-Reply-To: <3LLUI2KOULSYM.359WA6HATX45B@8pit.net>

On Thu, 20 Aug 2020 19:04:45 +0200
Sören Tempel <soeren@soeren-tempel.net> wrote:

> Hello,
> 
> OpenRC ships a program called supervise-daemon(8) which is capable of
> starting daemons and restarting them if they crash. Contrary to
> start-stop-daemon, it does not rely on PID files instead the started
> daemon is a child process of supervise-daemon.
> 
> Some Alpine OpenRC services already use supervise-daemon(8) (e.g.
> unbound, xdm, wpa_supplicant, …). I was recently wondering if we want to
> migrate busybox-initscripts to using supervise-daemon too and was
> pointed to some comments in the GitLab issue tracker which critique use
> of supervise-daemon for busybox-initscripts because of concern over
> memory usage [0]. Upon further discussion in the IRC some people also
> expressed discomfort in regards to the automatic restarting of crashed
> services (“you don't want to mask crashes”).

I think it would be nice if we could have the autorestart be
configurable, and let it be off by default.

> However, the primary benefit I personally see with widespread use of
> supervise-daemon is that it would allow us to get rid of racy PID files.
> I would therefore propose that we enable supervise-daemon whenever
> possible in existing OpenRC services (including busybox-initscripts).
> In any case it would be nice to clarify when using supervisor-daemon is
> encouraged (see existing examples above) and when it isn't.

Is there any history of problems with racy PIDs in the busybox
initscripts?

The idea with Alpine has been that the requirements for a minimal
install should be *minimal*, but users can opt in to more
convenient tools and services that consume more resources.

So when there is a request for a change that affects the minimal
install, I think we should have really good cost vs benefit reasons for
enabling them.

So for busybox-initscripts I'd still prefer to avoid supervise-daemon,
unless it is a common problem that busybox syslog, cron and friends
often crash or have problems with racy PIDs.

For other, bigger, add-on services, I don't mind the extra KBs.


> Thoughts?

I'm thinking: if it ain't broken, don't fix it. At least for the
busybox-initscripts.

> Greetings,
> Sören
> 
> [0]: https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/1363#note_56289

Message-ID: <20200821191640.5238557d@ncopa-macbook.copa.dup.pw>
In-Reply-To: <f7b0aab5-4dc6-69f1-048d-f90f59891b45@dereferenced.org>

On Thu, 20 Aug 2020 14:33:53 -0600
Ariadne Conill <ariadne@dereferenced.org> wrote:

> Hello,
> 
> On 2020-08-20 11:04, Sören Tempel wrote:
> > Hello,
> > 
> > OpenRC ships a program called supervise-daemon(8) which is capable of
> > starting daemons and restarting them if they crash. Contrary to
> > start-stop-daemon, it does not rely on PID files instead the started
> > daemon is a child process of supervise-daemon.
> > 
> > Some Alpine OpenRC services already use supervise-daemon(8) (e.g.
> > unbound, xdm, wpa_supplicant, …). I was recently wondering if we want to
> > migrate busybox-initscripts to using supervise-daemon too and was
> > pointed to some comments in the GitLab issue tracker which critique use
> > of supervise-daemon for busybox-initscripts because of concern over
> > memory usage [0]. Upon further discussion in the IRC some people also
> > expressed discomfort in regards to the automatic restarting of crashed
> > services (“you don't want to mask crashes”).  
> 
> I would rather mask crashes than deal with a 3 AM phone call.  As long 
> as the crashes are logged (which supervise-daemon does), I don't see the 
> problem.

I think many agree with you, but not everyone. So it should be
configurable.

-nc

Message-ID: <dea709f7-94b7-f02c-929a-f7368f05bf6d@gmail.com>
In-Reply-To: <20200821191507.7857010b@ncopa-macbook.copa.dup.pw>

Hi,

On 8/21/20 8:15 PM, Natanael Copa wrote:
> On Thu, 20 Aug 2020 19:04:45 +0200
> Sören Tempel <soeren@soeren-tempel.net> wrote:
>
>> Hello,
>>
>> OpenRC ships a program called supervise-daemon(8) which is capable of
>> starting daemons and restarting them if they crash. Contrary to
>> start-stop-daemon, it does not rely on PID files instead the started
>> daemon is a child process of supervise-daemon.
>>
>> Some Alpine OpenRC services already use supervise-daemon(8) (e.g.
>> unbound, xdm, wpa_supplicant, …). I was recently wondering if we want to
>> migrate busybox-initscripts to using supervise-daemon too and was
>> pointed to some comments in the GitLab issue tracker which critique use
>> of supervise-daemon for busybox-initscripts because of concern over
>> memory usage [0]. Upon further discussion in the IRC some people also
>> expressed discomfort in regards to the automatic restarting of crashed
>> services (“you don't want to mask crashes”).
> I think it would be nice if we could have the autorestart be
> configurable, and let it be off by default.


supervise-daemon does not offer such a setting ("respawn_max" can be set
to "0", but that means "unlimited"), perhaps because it would rather
defeat the purpose of supervision.
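
The distinction can be sketched in Python: an on/off switch for automatic
restarts is a separate knob from a respawn limit whose value 0 means
"unlimited". The class and field names below are hypothetical and do not
correspond to any existing OpenRC option; only the "0 means unlimited"
convention is taken from the description of "respawn_max" above.

```python
from dataclasses import dataclass


@dataclass
class RestartPolicy:
    enabled: bool = False   # hypothetical switch, off by default as suggested in the thread
    max_respawns: int = 0   # 0 means "unlimited", as described for respawn_max

    def should_restart(self, respawns_so_far: int) -> bool:
        """Decide whether a crashed daemon should be started again."""
        if not self.enabled:
            return False    # restarts disabled: a crash stays visible
        if self.max_respawns == 0:
            return True     # no limit configured
        return respawns_so_far < self.max_respawns
```

With the defaults, `RestartPolicy().should_restart(0)` is `False`, while
`RestartPolicy(enabled=True, max_respawns=2)` allows exactly two respawns.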

So what should be the approach? I see:

1. let the developer choose between supervised/unsupervised daemon

2. provide two init scripts, supervised and unsupervised

3. provide a "hybrid" init script which has a configurable user option 
to choose between supervised/unsupervised daemon

4. other?

I'm asking these questions because I have, for the first time, received
an MR that adopts solution 2, which I had never seen before. It seems to
me that solution 1 has been the norm so far.

Or if there's no single solution, which should be avoided?

Thanks

/eo

From: Francesco Colista <fcolista@alpinelinux.org>
Message-ID: <ff2f9139bf743abd7303b89f10fc9549@alpinelinux.org>
In-Reply-To: <dea709f7-94b7-f02c-929a-f7368f05bf6d@gmail.com>

On 27 August 2020 at 14:49, "Leonardo" <rnalrd@gmail.com> wrote:

 
> So what should be the approach? I see:
> 
> 1. let the developer choose between supervised/unsupervised daemon
> 
> 2. provide two init scripts, supervised and unsupervised
> 
> 3. provide an "hybrid" init script which has a configurable user option
> to choose between supervised/unsupervised daemon
> 
> 4. other?
> 
> I'm asking these questions because I got for the first time a MR which
> adopts solution 2, which I never saw so far. It seems to me that
> solution 1 was adopted so far.
> 
> Or if there's no single solution, which should be avoided?

I would go for the second option, because:

1. I'm the author of the MR :)
2. It is the most flexible solution:

Option 1 is a limitation; option 3 is difficult to maintain if/when we implement another supervisor, such as s6.
The second option also allows the two init scripts to coexist, which might be wanted in some corner cases.


.: Francesco Colista
.: Alpine Linux Core Dev Team

From: Rasmus Thomsen <oss@cogitri.dev>
Message-ID: <799e151a9764838b5b0e273da3626e471976edb7.camel@cogitri.dev>
In-Reply-To: <ff2f9139bf743abd7303b89f10fc9549@alpinelinux.org>

On Thu, 2020-08-27 at 13:20 +0000, Francesco Colista wrote:
> On 27 August 2020 at 14:49, "Leonardo" <rnalrd@gmail.com> wrote:
> 
>  
> > So what should be the approach? I see:
> > 
> > 1. let the developer choose between supervised/unsupervised daemon
> > 
> > 2. provide two init scripts, supervised and unsupervised
> > 
> > 3. provide an "hybrid" init script which has a configurable user
> > option
> > to choose between supervised/unsupervised daemon
> > 
> > 4. other?
> > 
> > I'm asking these questions because I got for the first time a MR
> > which
> > adopts solution 2, which I never saw so far. It seems to me that
> > solution 1 was adopted so far.
> > 
> > Or if there's no single solution, which should be avoided?
> 
> I would go for the second option, because:
> 
> 1. I'm the author of the MR :)
> 2. It is the most flexible solution:
> 
> Option 1 is a limitation; option 3 is difficult to maintain
> if/when we implement another supervisor, like s6.
> The second option also allows the two init scripts to coexist, which
> might be wanted in some corner cases.

I think in the majority of cases we don't need an unsupervised init
script (what's the rationale for having it in that MR, and which MR are
we talking about? :D), since supervision has many advantages, as
mentioned in the thread already. IMHO it'd be best to just switch as
much as possible over to supervise-daemon before 3.13. I'd rather not
maintain two different sets of init scripts if possible.

Regards,

Rasmus Thomsen

> 
> .: Francesco Colista
> .: Alpine Linux Core Dev Team
Details
Message ID
<20200827171314.5bca06cf@ncopa-desktop.lan>
In-Reply-To
<799e151a9764838b5b0e273da3626e471976edb7.camel@cogitri.dev> (view parent)
On Thu, 27 Aug 2020 15:27:22 +0200
Rasmus Thomsen <oss@cogitri.dev> wrote:

> On Thu, 2020-08-27 at 13:20 +0000, Francesco Colista wrote:
> > 27 August 2020 14:49, "Leonardo" <rnalrd@gmail.com> wrote:
> > 
> >    
> > > So what should be the approach? I see:
> > > 
> > > 1. let the developer choose between supervised/unsupervised daemon
> > > 
> > > 2. provide two init scripts, supervised and unsupervised
> > > 
> > > 3. provide an "hybrid" init script which has a configurable user
> > > option
> > > to choose between supervised/unsupervised daemon
> > > 
> > > 4. other?
> > > 
> > > I'm asking these questions because I got for the first time a MR
> > > which
> > > adopts solution 2, which I never saw so far. It seems to me that
> > > solution 1 was adopted so far.
> > > 
> > > Or if there's no single solution, which should be avoided?  
> > 
> > I would go for the second option, because:
> > 
> > 1. I'm the author of the MR :)
> > 2. It is the most flexible solution:
> > 
> > Option 1 is a limitation; option 3 is difficult to maintain
> > if/when we implement another supervisor, like s6.
> > The second option also allows the two init scripts to coexist, which
> > might be wanted in some corner cases.
> 
> I think in the majority of cases we don't need an unsupervised init
> script (what's the rationale for having it in that MR, and which MR are
> we talking about? :D), since supervision has many advantages, as
> mentioned in the thread already. IMHO it'd be best to just switch as
> much as possible over to supervise-daemon before 3.13.

But that would not give sysadmin/user the choice to die on error, which
I fear will lead to nobody caring if the services are buggy or not. The
"fix" is to restart the service.

> I'd rather not
> maintain two different sets of init scripts if possible.

I agree with this.

How about we fix supervise-daemon to accept an option or env var that
controls whether it respawns?

-nc
Laurent Bercot <ska-devel@skarnet.org>
Details
Message ID
<em57aef832-b49c-4f20-b082-b7cd986daede@elzian>
In-Reply-To
<20200827171314.5bca06cf@ncopa-desktop.lan> (view parent)
>But that would not give sysadmin/user the choice to die on error, which
>I fear will lead to nobody caring if the services are buggy or not. The
>"fix" is to restart the service.

  That's a classic administration mistake, and it is absolutely on the
sysadmin or ops person, not on the supervision infrastructure.

  A supervision system does not exist so that services can restart when
they die and the admin can continue napping because who cares, the
service is up.
  A supervision system exists so that services can restart when they
die so they're still kinda functional in an ever-imperfect world while
the admin actually analyzes the error and finds a real fix for the
service.

  The goal of a supervision system is to maximize the uptime. It is not
to enable laziness in fixing bugs. If nobody cares that a service is
buggy, you can lay the full blame on the people who do not care; not on
the supervision system. Not supervising daemons by default puts more
of the burden on competent admins in order to cater to the others,
and madness lies down this path.

  Of course, services should be configured so that if they crash,
appropriate notifications are sent to the admin, so problems will not
be silently ignored. supervise-daemon should have a hook you can use
to take some action depending on the exit code (or signal) of the 
daemon.

  Longruns should be supervised, but if some admin does not want to
supervise a given service, there should be an interface allowing them
to tell the supervisor not to restart the service next time it dies.
supervise-daemon should have such an interface, you shouldn't need to
patch it.
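
  The behaviour described above (restart on exit, a hook that sees the
exit status, and a switch that tells the supervisor not to restart) can
be sketched roughly as follows. This is a minimal illustration in
Python, not code from supervise-daemon or s6. The function name
supervise and the parameters respawn, max_restarts, delay and on_exit
are all hypothetical, and restarting only after a non-zero exit status
is a choice made for this sketch.

```python
import subprocess
import sys
import time
from typing import Callable, List, Optional


def supervise(
    argv: List[str],
    respawn: bool = True,          # hypothetical switch: False means "let it die"
    max_restarts: int = 3,         # hypothetical cap on the number of restarts
    delay: float = 0.0,            # seconds to wait before each restart
    on_exit: Optional[Callable[[int], None]] = None,  # hypothetical hook
) -> List[int]:
    """Run argv as a child process and restart it after abnormal exits.

    Returns every exit status seen, in order. On POSIX a negative
    status means the child was killed by that signal number.
    """
    statuses: List[int] = []
    restarts = 0
    while True:
        # The daemon is a direct child of the supervisor, so its exit is
        # observed directly and no PID file is needed to track it.
        status = subprocess.call(argv)
        statuses.append(status)
        if on_exit is not None:
            on_exit(status)  # e.g. notify the admin about the crash
        if status == 0:
            break  # clean exit: nothing to restart in this sketch
        if not respawn or restarts >= max_restarts:
            break  # the admin opted out, or the restart budget is spent
        restarts += 1
        time.sleep(delay)
    return statuses


if __name__ == "__main__":
    # A stand-in "daemon" that always crashes with exit status 3.
    crashing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    supervise(crashing, respawn=False,
              on_exit=lambda s: print("exit status:", s))
```

  With respawn=False the child runs once and the hook still fires, so
the crash is reported rather than masked. The flag could just as well
be read from a configuration file or an environment variable.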

  *cough* Needless to say, s6 provides all of this. *cough*

--
  Laurent
Milan P. Stanić <mps@arvanta.net>
Details
Message ID
<20200827194923.GA29924@arya.arvanta.net>
In-Reply-To
<3LLUI2KOULSYM.359WA6HATX45B@8pit.net> (view parent)
On Thu, 2020-08-20 at 19:04, Sören Tempel wrote:
> Hello,
> 
> OpenRC ships a program called supervise-daemon(8) which is capable of
> starting daemons and restarting them if they crash. Contrary to
> start-stop-daemon, it does not rely on PID files instead the started
> daemon is a child process of supervise-daemon.
> 
> Some Alpine OpenRC services already use supervise-daemon(8) (e.g.
> unbound, xdm, wpa_supplicant, …). I was recently wondering if we want to
> migrate busybox-initscripts to using supervise-daemon too and was
> pointed to some comments in the GitLab issue tracker which critique use
> of supervise-daemon for busybox-initscripts because of concern over
> memory usage [0]. Upon further discussion in the IRC some people also
> expressed discomfort in regards to the automatic restarting of crashed
> services (“you don't want to mask crashes”).
> 
> However, the primary benefit I personally see with widespread use of
> supervise-daemon is that it would allow us to get rid of racy PID files.
> I would therefore propose that we enable supervise-daemon whenever
> possible in existing OpenRC services (including busybox-initscripts).
> In any case it would be nice to clarify when using supervisor-daemon is
> encouraged (see existing examples above) and when it isn't.
> 
> Thoughts?

I'm strongly against supervisors being enabled by default.
'Fail early, fail hard', and don't hide bugs.

I do use supervisors, but I, as admin/user, decide for which programs
and where I need them.

-- 
Kind regards
 
> Greetings,
> Sören
> 
> [0]: https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/1363#note_56289
Details
Message ID
<20200830135108.k4z4euzt4r6hrmn3@wolfsden.cz>
In-Reply-To
<20200827194923.GA29924@arya.arvanta.net> (view parent)
Hello,

On 2020-08-27 21:49:23 +0200, Milan P. Stanić wrote:
> I'm strongly against supervisors being enabled by default.
> 'Fail early, fail hard', and don't hide bugs.
> 
> I do use supervisors, but I, as admin/user, decide for which programs
> and where I need them.

Assuming we are able to convince upstream (or patch it ourselves) to
provide an option *not to* autorestart, that seems like a sensible default.

Having two scripts, one old-style, one for supervise-daemon, seems
cleanest but increases the maintenance cost, which might not be
desirable.

W.

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.