[alpine-devel] Parsing all APKBUILD files

Message ID
<CAOGGbnO7TvRMvJCwmtwSBVb+KhcwXPrjAWW6nEO0+ZP6ecnsJQ@mail.gmail.com>
Sender timestamp
1508216465
Hi there,

I just sent some minor bug fixes here:

https://github.com/alpinelinux/aports/pull/2506

and wanted to give some context.

I've been playing with Alpine for a few weeks and I like it!  After being
deep in the bowels of Debian a few years ago, the code is a breath of fresh
air!

I'm writing a new shell, and as a torture test for the parser, I parsed
~700K lines of shell scripts I have on my hard drive.  This includes the
entire aports repository, which is ~5000 APKBUILD files in ~250K lines.

Results here:

http://www.oilshell.org/git-branch/master/21552818/wild.wwz/distro/alpine-aports/index.html

Only two scripts failed, because of the Unicode whitespace issue I noted in
the pull request.  (I only noticed this because of a bug in my shell.
However, no shell I tested, including busybox ash, accepts this code, so
these are real bugs.)
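
A minimal sketch of how such stray characters could be found, assuming the
problem is a non-ASCII space such as U+00A0 (the exact character is not
stated here, and find_unicode_whitespace is a hypothetical helper, not part
of OSH or abuild):

```python
import unicodedata


def find_unicode_whitespace(text):
    """Return (line, column, character name) for each non-ASCII whitespace character."""
    hits = []
    # Split on "\n" only: str.splitlines() would also break on some Unicode separators.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            is_space = ch.isspace() or unicodedata.category(ch) == "Zs"
            if ord(ch) > 127 and is_space:
                hits.append((line_no, col, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical APKBUILD fragment with a no-break space between "foo" and "bar".
sample = 'pkgname=demo\ndepends="foo\u00a0bar"\n'
for hit in find_unicode_whitespace(sample):
    print(hit)
```

Zero-width characters (category Cf) are not whitespace by this definition,
so a real check might want a wider net.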

-----

The latest post on my blog has a little background, including why I'm
interested in Alpine:

http://www.oilshell.org/blog/2017/10/06.html

From my perspective it seems as functional as Debian in many ways, but the
build scripts are more sane.

One big caveat is that my shell is very slow right now.  It is implemented
in Python, and it takes a couple of implementation shortcuts that I plan to
remove.  However, the algorithm is principled and fast in the algorithmic
sense [1].  I did the hard part of making it correct; now I have to make it
fast in practice :)

So what I'll probably be doing in the near future is speeding up the
parser, so I can build packages with abuild running under OSH.  (It takes a
few seconds to even parse abuild before it runs right now, especially in a
VM.)

[1] http://www.oilshell.org/blog/tags.html?tag=parsing-shell#parsing-shell

-----

I also have some longer term goals for improving shell.  As mentioned in
the blog post, I think Linux distros are the projects that really push
shell to its limits.  They're almost all built on shell in some way that is
not very satisfactory.  For example, Debian is a big mess of auto-generated
Makefiles and shell scripts.   Red Hat embeds shell snippets in a text file
with metadata.

Alpine has some of the cleaner shell code I've seen, but I have been
lurking and noticed a couple of threads where shell is brought up:


2017/06/30 - Script to parse APKBUILDS and output table of arch support

http://lists.alpinelinux.org/alpine-devel/5741.html

"I agree. The way to parse the APKBUILDs is to execute them. "

I looked at the read_apkbuild code here, which pipes a small program to bash
that sources an APKBUILD and echoes its variables:

https://git.alpinelinux.org/cgit/user/ncopa/aports-cache/

I'm not sure if there is any benefit to this right now, but this can be
done in my shell by writing a Python program that pokes into the
interpreter state to eval and fetch variables.

I would be interested in use cases for querying APKBUILD metadata.
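
For reference, the source-and-echo approach can be sketched roughly as
follows.  This is not the actual read_apkbuild code: read_apkbuild_vars is a
hypothetical helper, it uses sh rather than bash by default, and it assumes
that sourcing the file is safe and has no side effects.

```python
import os
import re
import subprocess
import tempfile

# Shell program run in a child process: $1 is the file to source,
# the remaining arguments are the variable names to print, NUL-terminated.
_SCRIPT = r'''
. "$1" >/dev/null
shift
for _n in "$@"; do
    eval "_v=\${$_n}"
    printf '%s\000' "$_v"
done
'''


def read_apkbuild_vars(path, names, shell="sh"):
    """Source `path` in a child shell and return the requested variables as a dict."""
    for name in names:
        # Names end up inside eval, so accept plain identifiers only.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
            raise ValueError(f"not a shell variable name: {name!r}")
    result = subprocess.run(
        # An absolute path keeps "." from searching $PATH for the file.
        [shell, "-c", _SCRIPT, shell, os.path.abspath(path), *names],
        check=True,
        stdout=subprocess.PIPE,
    )
    values = result.stdout.decode("utf-8").split("\0")[:-1]
    return dict(zip(names, values))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "APKBUILD")
        with open(demo, "w") as f:
            f.write('pkgname=demo\npkgver=1.2.3\n')
            f.write('source="https://example.org/$pkgname-$pkgver.tar.gz"\n')
        print(read_apkbuild_vars(demo, ["pkgname", "pkgver", "source"]))
```

Because the shell does the expansion, values like $pkgname-$pkgver come back
fully resolved, which is the main argument for executing the file.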

----

2017/06/12 - A few questions about Alpine

http://lists.alpinelinux.org/alpine-devel/5689.html

"The major issue with APKBUILD is that it is not possible to specify the
metadata for subpackages (pkgdesc, depends, arch etc) in global scope,
and regardless of how we define it in shell, it will be clumsy at best. "

I didn't quite understand this, but I plan to add structured metadata to
Oil, which seems to be a major thing missing from shell.  In other words,
you should be able to parse scripts and extract data without executing.
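
To illustrate what "extract data without executing" could look like today,
here is a deliberately naive sketch.  It is not Oil's design; the helper name
and the rules are my own assumptions.  It only accepts top-level NAME=value
lines that need no expansion, and it reports everything else as skipped
rather than guessing.

```python
import re
import shlex

# An assignment that starts in column 0; indented function bodies are ignored.
ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def extract_static_metadata(text):
    """Return (metadata, skipped) for top-level assignments with literal values."""
    metadata, skipped = {}, []
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if not match:
            continue
        name, raw = match.groups()
        if "$" in raw or "`" in raw:
            skipped.append(name)  # would need variable or command expansion
            continue
        try:
            words = shlex.split(raw, comments=True)
        except ValueError:
            skipped.append(name)  # e.g. a quoted value continuing on the next line
            continue
        if len(words) > 1:
            skipped.append(name)  # e.g. "VAR=x command", not a plain assignment
            continue
        metadata[name] = words[0] if words else ""
    return metadata, skipped


# Hypothetical APKBUILD fragment.
sample = (
    'pkgname=demo\n'
    'pkgver=1.2.3\n'
    'pkgdesc="A demo package"\n'
    'source="https://example.org/$pkgname-$pkgver.tar.gz"\n'
    '\n'
    'build() {\n'
    '\tmake\n'
    '}\n'
)
metadata, skipped = extract_static_metadata(sample)
print(metadata)
print(skipped)
```

The skipped list shows the limit of this approach: anything that depends on
expansion still needs a real shell parser or evaluator.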

There are tons of tools that need to mix declarative metadata with shell
snippets, like essentially every package manager, Dockerfiles,
Chef/Puppet-type configuration, systemd config files, Upstart config files,
every Makefile ever written, etc.

----

Also on IRC a few weeks ago, foxkit proposed a lint tool for APKBUILD:

"repoman uses python to static read the ebuild files in gentoo tree and
find common mistakes. rpmlint uses perl to do same with redhat rpms"

OSH has a very detailed AST so it may be possible to write lint tools on
top of it.  Here is an example:

http://www.oilshell.org/git-branch/master/21552818/wild.wwz/distro/alpine-aports/main/bash/APKBUILD__ast.html

However, as mentioned, I have to speed things up a lot first.

-----

None of these ideas are particularly urgent, but I thought I'd mention
them.  If there is something that you want from a shell, I'm interested.  I
suspect Alpine is not going to switch shells any time soon, but having a
second shell implementation to validate/analyze the large amount of code
seems like it could be useful.

Let me know what you think!

thanks,
Andy