From: Jude Nelson
To: Laurent Bercot
Cc: alpine-devel@lists.alpinelinux.org
Date: Tue, 12 Jan 2016 22:47:45 -0500
Subject: Re: [alpine-devel] udev replacement on Alpine Linux
In-Reply-To: <56958E22.90806@skarnet.org>

Hi Laurent, thank you as always for your input.

On Tue, Jan 12, 2016 at 6:37 PM, Laurent Bercot <ska-devel@skarnet.org> wrote:
> On 12/01/2016 21:06, Jude Nelson wrote:
>> I've been using vdev and libudev-compat on my production machine
>> for several months.
>
> Sure, but since you're the author, it's certainly easier for you
> than for other people. ;)

Agreed; I was just pointing out that the system has been seeing some real-world use :)



>> I use it heavily with Chromium (YouTube and
>> Google Hangouts work) and udev-enabled Xorg (hotplugged input devices
>> work as expected). My encrypted swap partition's device-mapped nodes
>> and directories show up where they should, and my Android development
>> tools work with my Android phone when I plug it in.
>
> That's neat, and very promising.
> I doubt you're the right person to ask, but do you have any
> experience running libudev-compat with a different hotplug
> manager than vdev? I'd like to stick with (s)mdev as long as
> I can make it work.

I haven't tried this myself, but it should be doable. Vdev's event-propagation mechanism is a small program that constructs a uevent string from environment variables passed to it by vdev and writes the string to the appropriate place. The vdev daemon isn't aware of its existence; it simply executes it as it would any other matching device-event action. Another device manager could supply the same program with the right environment variables and use it for the same purpose.
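
As an illustration of the kind of helper described above, here is a minimal Python sketch that builds a uevent string from environment variables. It is not vdev's actual program: the variable list in `UEVENT_KEYS` is hypothetical, and the layout (an `ACTION@DEVPATH` header followed by NUL-terminated `KEY=VALUE` fields) is assumed to follow the kernel's uevent message style rather than whatever format libudev-compat really expects.

```python
import os

# Hypothetical variable names; a real device manager may export a different set.
UEVENT_KEYS = ("ACTION", "DEVPATH", "SUBSYSTEM", "DEVNAME", "MAJOR", "MINOR", "SEQNUM")


def build_uevent(env):
    """Return 'ACTION@DEVPATH' followed by NUL-terminated KEY=VALUE fields."""
    fields = [f"{env['ACTION']}@{env['DEVPATH']}"]
    # Only variables that are actually set end up in the message.
    fields += [f"{key}={env[key]}" for key in UEVENT_KEYS if key in env]
    return b"".join(field.encode() + b"\0" for field in fields)


if __name__ == "__main__":
    # A device manager would run this helper with the variables already set.
    if "ACTION" in os.environ and "DEVPATH" in os.environ:
        print(build_uevent(os.environ))
```

Because the helper only reads its environment, any hotplug manager that can run a program per event, (s)mdev included, could in principle drive it.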


>> I wouldn't say it's ready for prime time just yet, though. In
>> particular, because libudev-compat uses (dev)tmpfs to record and
>> distribute event messages as regular files (under
>> /dev/metadata/udev/events), a program can leak files and directories
>> simply by exiting without shutting down libudev (i.e. failing to free
>> the struct udev_device).
>
> That may be OOT, but I'm interested in hearing the rationale for
> that choice. An event is ephemeral, a file is (relatively) permanent;
> recording events as regular files does not sound like a good match,
> unless you have a reference counting process/thread somewhere that
> cleans up an event as soon as it's consumed.

Tmpfs and devtmpfs are already designed for holding ephemeral state, so I'm not sure why the fact that they expose data as regular files is a concern?

I went with a file-oriented model specifically because hard links make reference-counting simple and easy. The aforementioned event-propagation tool writes the uevent into a scratch area under /dev, hard-links it into each libudev-compat monitor directory under /dev/metadata/udev, and unlinks the scratch file (there is a directory in /dev/metadata/udev for each struct udev_monitor created by each libudev-compat program). When the libudev-compat client next wakes up, it consumes any new event-files (in delivery order) and unlinks them, thereby ensuring that once every libudev-compat client has "received" the event, the event's resources are fully reclaimed.
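
The hard-link scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the directory arguments, and the zero-padded sequence-number file names are all hypothetical, and the real tool operates under /dev rather than on arbitrary directories.

```python
import os
import tempfile


def publish(scratch_dir, monitor_dirs, seqnum, payload):
    """Write the event once, hard-link it into every monitor directory,
    then drop the scratch name."""
    name = f"{seqnum:020d}"  # zero-padded so lexical order equals delivery order
    scratch_path = os.path.join(scratch_dir, name)
    with open(scratch_path, "wb") as f:
        f.write(payload)
    for monitor in monitor_dirs:
        os.link(scratch_path, os.path.join(monitor, name))
    os.unlink(scratch_path)  # the data now lives only through the monitors' links


def consume(monitor_dir):
    """Read pending events in delivery order, unlinking each after reading it."""
    events = []
    for name in sorted(os.listdir(monitor_dir)):
        path = os.path.join(monitor_dir, name)
        with open(path, "rb") as f:
            events.append(f.read())
        os.unlink(path)  # the last monitor to unlink frees the data
    return events


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    scratch = os.path.join(root, "scratch")
    monitors = [os.path.join(root, "monitor-a"), os.path.join(root, "monitor-b")]
    for d in [scratch] + monitors:
        os.mkdir(d)
    publish(scratch, monitors, 1, b"add@/devices/example\0")
    print(consume(monitors[0]), consume(monitors[1]))
```

The filesystem's link count acts as the reference count: the payload is stored once, and it disappears when the last monitor directory drops its link. Nothing in this sketch handles a client that dies without consuming its events, which is the leak under discussion.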


> Anyway, unless I'm misunderstanding the architecture completely,
> it sounds like leaks could be prevented by wrapping programs you're
> not sure of.

I couldn't think of a simpler way that was also as robust. Unless I'm misunderstanding something, wrapping an arbitrary program to clean up the files it created would, in the extreme, require coming up with a way to do so on SIGKILL. I'd love to know if there is a simple way to do this, though.



>> My plan is to have libudev-compat store
>> its events to a special-purpose FUSE filesystem called eventfs [1]
>> that automatically removes orphaned files and denies all future
>> access to them.
>
> Unfortunately, FUSE is a deal breaker for the project I'm working on.
>
> I'm under the impression that you're slightly overengineering this;
> you shouldn't need a specific filesystem to distribute events. My
> s6-ftrig-* set of tools distribute events to arbitrary subscribers
> without needing anything specific - the mechanism is just directories
> and named pipes.
> But I don't know the details of libudev, so I may be missing
> something, and I'm really interested in learning more.

I went with a specialized filesystem for two reasons, both of which were to fulfill libudev's API contract:
* Efficient, reliable event multicasting. By using hard-links as described above, the event only needs to be written out once, and the OS only needs to store one copy.
* Automatic multicast channel cleanup. Eventfs would ensure that no matter how a process dies, its multicast state would become inaccessible and be reclaimed once it is dead (i.e. a subsequent filesystem operation on the orphaned state, no matter how soon after the process's exit, will fail).
Both of the above are implicitly guaranteed by libudev, since it relies on a netlink multicast group shared with the udevd process to achieve them.

It is my understanding (please correct me if I'm wrong) that with s6-ftrig-*, I would need to write out the event data to each listener's pipe (i.e. once per struct udev_monitor instance), and I would still be responsible for cleaning up the fifodir every now and then if the libudev-compat client failed to do so itself. Is my understanding correct?

Again, I would love to know of a simpler approach that is just as robust.



>> Instead, I've been running a script
>> every now and then that clears out orphaned directories in
>> /dev/metadata/udev/events.
>
> A polling cleaner script works if you have no sensitive data.
> A better design, though, is a notification-based cleaner, that
> is triggered as soon as a reference expires. And I'm almost
> certain you don't need eventfs for this :)

I agree that a notification-based cleaner could be just as effective, but I wonder whether the machinery needed to track all libudev-compat processes reliably and efficiently would be simpler than eventfs. Would love to know what you had in mind :)
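
For reference, a polling cleaner of the kind mentioned in the quoted text might look like the Python sketch below. It is not the actual script: it assumes, purely for illustration, that each monitor directory is named `<pid>-<suffix>` after the process that owns it, and the function names are hypothetical.

```python
import os
import shutil


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without delivering anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def clean_orphans(events_root):
    """Remove monitor directories whose owning process is gone; return their names."""
    removed = []
    for name in os.listdir(events_root):
        pid_text = name.split("-", 1)[0]
        if not pid_text.isdigit():
            continue  # not a directory following the assumed naming scheme
        if not pid_alive(int(pid_text)):
            shutil.rmtree(os.path.join(events_root, name), ignore_errors=True)
            removed.append(name)
    return sorted(removed)
```

A poller like this leaves orphaned state readable until the next sweep and can be fooled by PID reuse, which is why it only suits event data that is not sensitive.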

Thanks for your feedback,
Jude

> --
> Laurent

---
Unsubscribe:  alpine-devel+unsubscribe@lists.alpinelinux.org
Help:         alpine-devel+help@lists.alpinelinux.org
---