For discussion of Alpine Linux development and developer support

[alpine-devel] rethinking the building infra

From: Natanael Copa
Message-ID: <20180206013944.7fa393b6@ncopa-macbook.copa.dup.pw>
Sender timestamp: 1517877584

Hi!

I think we need to rethink the building infrastructure. The current
build scripts were written as a quick and dirty way to get started and
have lived way longer than originally planned. They simply do not
scale and are very fragile.

I guess I don't need to go any deeper into why we need to replace them...

I'd like to discuss what we need from the build infra and why, before we
start talking about how to do it, what implementation to use, etc.

Here are some things I think a new building infra should be able to
do:

- automatic build on git push

  Nothing more than a git push should be needed to get the package
  built and uploaded. Like we have today, but with better error
  reporting.

- isolated environment for each build

  Each build should set up an isolated environment and destroy it when
  the build is done. This could be a container, but it would be nice to
  be able to set up a disposable build env in a VM in case we want to
  hook it into GitHub PRs or similar. It should also kill everything
  after build and test are done so we don't get any remains of test
  suites that do not clean up after themselves (like redis and epmd).

- support multi architectures

  We need to support x86_64, x86, armhf, aarch64, ppc64le and s390x. It
  would be nice if it's not too complicated to add new architectures.

- support parallel building

  It would be nice to be able to distribute the workload over available
  build servers. It should be possible and relatively easy to add new
  hardware to the pool, or to remove or replace old hardware, without
  taking everything down.

- support cross compile

  It would be nice to cross compile packages that can (easily) be cross
  compiled. For example, we could let a big x86_64 or ppc64le machine
  build the Linux kernel for armhf instead of doing that on the slow
  armhf server. Packages that cannot be cross compiled should be built
  on native hardware.

- separate out signing process of packages and index

  Would be nice if we could give access to build servers to more
  people without giving those people access to the private signing keys.

- build infra should be able to be used as CI

  We need to do automatic compile checks of contributions, for example
  via GitHub pull requests or something equivalent.

- efficient caching

  It would be nice not to need to git clone the entire repo for every
  build on every server. It would be nice if we could check out a shared
  git repo or do something similar, so data does not go over the wire
  more than necessary. The same goes for source and apk packages.

Anything else we need from the building infra?

-nc


---
Unsubscribe:  alpine-devel+unsubscribe@lists.alpinelinux.org
Help:         alpine-devel+help@lists.alpinelinux.org
---
From: Daniel Isaksen
Message-ID: <CAFWK1CDG0Vncg2axfwSUy7jgTszHgw8SFu244q8Y7SzgyzNVrw@mail.gmail.com>
In-Reply-To: <20180206013944.7fa393b6@ncopa-macbook.copa.dup.pw>
Sender timestamp: 1517877838

A small addition - this is in development, and some hands on it would be
nice: https://github.com/kaniini/abuildd

- Daniel

From: A. Wilcox
Message-ID: <192872b3-068a-0d7a-d618-e00e543fff3d@adelielinux.org>
In-Reply-To: <20180206013944.7fa393b6@ncopa-macbook.copa.dup.pw>
Sender timestamp: 1517941882

Hi there Natanael,

I thought I could go through and say what we (Adélie) would like to see too.

On 02/05/18 18:39, Natanael Copa wrote:
> I'd like to discuss what we need from the build infra and why, before we
> start talking about how to do it, what implementation to use, etc.
> 
> Here are some things I think a new building infra should be able to
> do:
> 
> - automatic build on git push
> 
>   Nothing more than a git push should be needed to get the package
>   built and uploaded. Like we have today, but with better error
>   reporting.


That would be good.  One important thing is that it may be desirable to
have a "[build skip]" or such for commits that deal with non-build
things (like git hooks or such) or a series with interdependencies.
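
A minimal sketch of how such a marker could be honoured, assuming the
build trigger can read the commit message.  Only the "[build skip]"
string comes from the paragraph above; the should_build name and the
sample messages are made up for illustration.

```sh
# should_build MESSAGE
# Succeeds (exit 0) unless the commit message contains the literal
# marker "[build skip]". Hypothetical helper, not part of abuild.
should_build() {
    case "$1" in
        *"[build skip]"*) return 1 ;;   # marker found: skip this commit
        *) return 0 ;;
    esac
}

# Example: decide for two made-up commit messages.
for msg in "main/unzip: security upgrade" "hooks: fix typo [build skip]"; do
    if should_build "$msg"; then
        echo "build: $msg"
    else
        echo "skip:  $msg"
    fi
done
```

In a real trigger the message could come from something like
`git log -1 --format=%B <commit>`.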


> - isolated environment for each build
> 
>   Each build should set up an isolated environment and destroy it when
>   the build is done. This could be a container, but it would be nice to
>   be able to set up a disposable build env in a VM in case we want to
>   hook it into GitHub PRs or similar. It should also kill everything
>   after build and test are done so we don't get any remains of test
>   suites that do not clean up after themselves (like redis and epmd).


It should not kill everything in cases of error.  I can't tell you how
much time abuild has saved me by leaving src/ alone when it fails to
compile or test.


> - support multi architectures
> 
>   We need to support x86_64, x86, armhf, aarch64, ppc64le and s390x. It
>   would be nice if it's not too complicated to add new architectures.


+1


> - support parallel building
> 
>   would be nice to be able to distribute the workload over available
>   build servers. Should be possible and relatively easy to add new
>   hardware to the pool or remove or replace old without taking
>   everything down.


That would be a very nice thing.  It would also be nice for it to track
what builders are idle so it can pick one.  I am just thinking of a
scenario:

Two builders: ppc64-1, ppc64-2
Four packages are modified close together: llvm4, which, unzip, acl

If it just goes in sequence, 'unzip' will be held up on llvm4, when
ppc64-2 could easily build it.

Maybe that is too much of an "implementation detail", but I think it is
an important requirement.
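
To make the scenario concrete, here is a rough sketch that hands each
job to whichever of the two builders becomes free first.  The builder
names and packages are the ones above; the build durations (in minutes)
and the assign_jobs name are invented, and this greedy rule is only one
possible way to do it.

```sh
# assign_jobs
# Reads "package minutes" pairs on stdin and assigns each job to the
# builder that is free earliest. Durations are made-up numbers.
assign_jobs() {
    free1=0   # minute at which ppc64-1 becomes idle
    free2=0   # minute at which ppc64-2 becomes idle
    while read -r pkg minutes; do
        if [ "$free1" -le "$free2" ]; then
            echo "$pkg -> ppc64-1 (starts at minute $free1)"
            free1=$((free1 + minutes))
        else
            echo "$pkg -> ppc64-2 (starts at minute $free2)"
            free2=$((free2 + minutes))
        fi
    done
}

# The four packages from the scenario, with invented durations.
printf '%s\n' "llvm4 90" "which 1" "unzip 1" "acl 2" | assign_jobs
```

With these numbers llvm4 occupies ppc64-1 while which, unzip and acl all
go through ppc64-2, so unzip starts after one minute instead of waiting
behind llvm4.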


> - support cross compile
> 
>   It would be nice to cross compile packages that can (easily) be cross
>   compiled. For example, we could let a big x86_64 or ppc64le machine
>   build the Linux kernel for armhf instead of doing that on the slow
>   armhf server. Packages that cannot be cross compiled should be built
>   on native hardware.


The APKBUILD format doesn't do very well with cross.

Sure, some packages can cross easily, but many cannot.  Please see the
coreutils mess in the Adélie repo for an example:

https://code.foxkit.us/adelie/aports/blob/master/main/coreutils/APKBUILD

I don't really recommend cross to 'speed up' builds, especially since it
isn't possible to run tests then.  Much better to wait for real
hardware.  Only use it for things in bootstrap.sh.


> - separate out signing process of packages and index
> 
>   Would be nice if we could give access to build servers to more
>   people without giving those people access to the private signing keys.


Yes, when a builder is done, it should give the built package to an
internal, secure system for signing.  I have always agreed with this,
and the fact that the abuild signing key has to be present and readable
by abuild has prevented me from giving people access to our 32-bit x86
builder.


> - build infra should be able to be used as CI
> 
>   We need to do automatic compile checks of contributions, for example
>   via GitHub pull requests or something equivalent.


I note here that GitLab has CI too, and Jenkins is open source, so
Alpine doesn't always need to be tied to GitHub to still have CI :)


> - efficient caching
> 
>   It would be nice not to need to git clone the entire repo for every
>   build on every server. It would be nice if we could check out a shared
>   git repo or do something similar, so data does not go over the wire
>   more than necessary. The same goes for source and apk packages.


Well, you could use NFS or such if the build servers are on the same
local network.  That is what we did at Galapagos Linux.  But I'm not
sure if Alpine's build systems are all located in the same network.


> Anything else we need from the building infra?


Adélie would strongly benefit from having build logs kept in a
centralised place.  Something like logrotate could be used to keep it
from overflowing, so log maintenance is not something that needs to be
thought about.


The ability to have hooks (like git) for "build started", "build
success", and "build failure" would be useful.  Preferably it would
allow custom ones written in shell or another scripting language, not
web hooks.  Then we could have IRC poking if builds fail, and something
like #alpine-commits for all build statuses.
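
As a sketch of what I mean, a build driver could run every executable
it finds in a per-event directory, much like git does with its hooks.
The HOOK_DIR variable, its default path, the "EVENT.d" layout and the
run_hooks name are all hypothetical, picked only for this example.

```sh
# run_hooks EVENT PACKAGE
# Runs every executable file in $HOOK_DIR/EVENT.d/, passing the event
# name and package name as arguments. HOOK_DIR and the "EVENT.d" layout
# are hypothetical, chosen only for this sketch.
run_hooks() {
    event=$1
    pkg=$2
    for hook in "${HOOK_DIR:-/etc/buildinfra/hooks}/$event.d"/*; do
        # Skip directories, non-executables and an unmatched glob.
        [ -f "$hook" ] && [ -x "$hook" ] || continue
        "$hook" "$event" "$pkg" ||
            echo "hook $hook failed for $event $pkg" >&2
    done
}

# Example call sites in a (hypothetical) build driver:
#   run_hooks build-started llvm4
#   run_hooks build-failure llvm4
```

A hook in build-failure.d could then be a small script that pokes IRC,
and a failing hook only prints a warning instead of aborting the build.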


Best to you and yours,
--arw


> 
> -nc

-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
http://adelielinux.org
From: Natanael Copa
Message-ID: <20180206151919.5e12b8ae@ncopa-macbook.copa.dup.pw>
In-Reply-To: <CAFWK1CDG0Vncg2axfwSUy7jgTszHgw8SFu244q8Y7SzgyzNVrw@mail.gmail.com>
Sender timestamp: 1517926759

On Tue, 6 Feb 2018 01:43:58 +0100
Daniel Isaksen <d@duniel.no> wrote:

> A small addition - this is in development, and some hands on it would be
> nice: https://github.com/kaniini/abuildd

I know, I just wanted to list the requirements before discussing
implementation.

-nc



From: A. Wilcox
Message-ID: <290b7bc4-203d-dad4-21e5-3892537a2a76@adelielinux.org>
In-Reply-To: <7676a963-d2b0-bf15-4f51-f5aa0d034e9a@bitmessage.ch>
Sender timestamp: 1518035622

On 02/07/18 13:58, Oliver Smith wrote:
> Hey Natanael,
> 
> the list sounds good to me, especially the increased security
> features!
> 
> While I agree with A. Wilcox that cross-compiling has the
> disadvantage of not running tests (or not properly if done through
> QEMU), I think it *does* make sense at least for kernels (there
> aren't any tests to execute for them). In my opinion, this is rather
> a detail, and the important thing would be getting everything else
> implemented first. With that being said: We have cross-compiling
> binary packages in the postmarketOS repo since we do lots of
> cross-compiling: gcc-armhf, binutils-armhf, musl-armhf etc. The
> aports for these are automatically generated from the upstream ones
> in Alpine (basically hardcoded variables from the bootstrap script on
> top). Unless the new build system would approach cross-compiling in a
> completely different way, you would also need such binary packages to
> do it, so maybe this feature can be upstreamed properly (that is the
> long-term plan anyway, if you guys want to have this - maybe with
> subpackages?).


I'm not sure what feature you are talking about.  There is already
bootstrap.sh to generate a working cross environment.  I suppose you
could make a bootstrap-lite.sh or such that only makes the toolchain and
none of the other components, which would work for a kernel build.


> Regarding caching: Maybe ccache makes sense, if it was separated for
> each package, and incoming pull-requests could only read from that
> cache, not write to it.


I think the cache point is more for the actual git data and distfiles,
not the compilation.


> Finally, I am not sure if the idea is to replace abuild entirely, or
> just abuild-rootbld.


Who said abuild was being replaced?  This is about `buildrepo` and
friends, not abuild.


> In case abuild should be replaced, I hope to see
> the following features preserved (since we make use of them in
> pmbootstrap, which wraps abuild):
>
> - (passing through environment variables)

This isn't a wise idea.  It's better to export them in the build() or
package() functions.

> - abuild -r (install depends with abuild)

This is currently how all Adélie packages are built, but it is a bit
buggy.  It is useful for end users making single contributions, but
honestly, I think that this new system is going to be much better than
`abuild -r`.

> - abuild undeps 

Due to the way abuild works right now, `apk del .makedepends-$pkgname`
has the same effect, if you ever need to do that without abuild.  That's
an implementation detail and is not guaranteed to always work, but it
does work right now if you need it.

> - abuild menuconfig

???  This isn't even an abuild feature.  It (ab)uses the fact that
abuild phases are shell functions by calling a menuconfig() function in
the kernel APKBUILD.  Honestly, I would be much in favour of /not/
abusing that fact, and making abuild more hardened against that, but
that is just me.


Best,
--arw

-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
http://adelielinux.org
From: A. Wilcox
Message-ID: <45c16fdd-4091-0a50-e1b6-b025f949ae37@adelielinux.org>
In-Reply-To: <5e9111f1-2985-8597-b804-ddcb102a24fc@bitmessage.ch>
Sender timestamp: 1518046753

On 02/07/18 16:53, Oliver Smith wrote:
> What I mean is having packages directly in the binary repository, so
> one can do: $ apk add gcc-armhf
> 
> Just like it's possible to install gcc-avr right now. But gcc-avr
> package needs to be manually synced with the gcc aport. Instead of
> that it would be nice, if we had cross-compilers automatically built
> for all architectures, without running bootstrap.sh. Possibly as
> subpackage of gcc (but that's probably not the desired solution,
> since that will blow up the build time of GCC drastically). I think
> that this might be relevant to cross-compiling in the binary
> repository, because once such packages in the binary repo exist, it
> would be a clean way to install the cross-compiler in the compiling
> VM/container with apk.


We (Adélie) do this on x86_64 and ppc64 arches already (both of them
have various gcc-* binutils-* etc).  It would be cool if Alpine wanted
to do that.  And necessary, if kernels are cross-built.


> For package specific variables, we export them in the APKBUILD. We
> only pass them through abuild for package independent variables, such
> as: CARCH, CROSS_COMPILE, CC, CCACHE_PREFIX, CCACHE_PATH,
> CCACHE_COMPILERCHECK, DISTCC_HOSTS


CARCH and CROSS_COMPILE make sense, since they are used by abuild.  CC,
CCACHE_*, and DISTCC_* make more sense in /etc/abuild.conf imo.


> Alpine's linux-vanilla APKBUILD used to have a menuconfig() function
> with a comment on top saying something like "# This is so we can use
> 'abuild menuconfig'". But I just realized that this was removed.
> Well, we still use that feature for that purpose. But if that did not
> work anymore in abuild, we could call menuconfig directly.


Yes, I personally think this is a bad idea and abuses the fact that
APKBUILD files are `source`d by abuild.  It is better to remove from the
APKBUILD.  I would even be in favour of having a little shell script in
the linux-vanilla package or such that takes CARCH and CROSS_COMPILE
like abuild does, and calls menuconfig properly for you.  But not
directly from the APKBUILD.

Best,
--arw

-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
http://adelielinux.org
From: Oliver Smith
Message-ID: <7676a963-d2b0-bf15-4f51-f5aa0d034e9a@bitmessage.ch>
In-Reply-To: <20180206013944.7fa393b6@ncopa-macbook.copa.dup.pw>
Sender timestamp: 1518033480

Hey Natanael,

the list sounds good to me, especially the increased security features!

While I agree with A. Wilcox that cross-compiling has the disadvantage of not running tests (or not properly if done through QEMU), I think it *does* make sense at least for kernels (there aren't any tests to execute for them). In my opinion, this is rather a detail, and the important thing would be getting everything else implemented first. With that being said: We have cross-compiling binary packages in the postmarketOS repo since we do lots of cross-compiling: gcc-armhf, binutils-armhf, musl-armhf etc. The aports for these are automatically generated from the upstream ones in Alpine (basically hardcoded variables from the bootstrap script on top). Unless the new build system would approach cross-compiling in a completely different way, you would also need such binary packages to do it, so maybe this feature can be upstreamed properly (that is the long-term plan anyway, if you guys want to have this - maybe with subpackages?).

Regarding caching: Maybe ccache makes sense, if it was separated for each package, and incoming pull-requests could only read from that cache, not write to it.

Finally, I am not sure if the idea is to replace abuild entirely, or just abuild-rootbld. In case abuild should be replaced, I hope to see the following features preserved (since we make use of them in pmbootstrap, which wraps abuild):
- (passing through environment variables)
- abuild -r (install depends with abuild)
- abuild -d (do not install depends with abuild)
- abuild -f (force)
- abuild undeps
- abuild menuconfig
- abuild checksum
- abuild unpack
- abuild prepare

Thanks,
Oliver

From: Oliver Smith
Message-ID: <5e9111f1-2985-8597-b804-ddcb102a24fc@bitmessage.ch>
In-Reply-To: <290b7bc4-203d-dad4-21e5-3892537a2a76@adelielinux.org>
Sender timestamp: 1518043980

A. Wilcox:
> On 02/07/18 13:58, Oliver Smith wrote:
>> Hey Natanael,
>>
>> the list sounds good to me, especially the increased security
>> features!
>>
>> While I agree with A. Wilcox that cross-compiling has the
>> disadvantage of not running tests (or not properly if done through
>> QEMU), I think it *does* make sense at least for kernels (there
>> aren't any tests to execute for them). In my opinion, this is rather
>> a detail, and the important thing would be getting everything else
>> implemented first. With that being said: We have cross-compiling
>> binary packages in the postmarketOS repo since we do lots of
>> cross-compiling: gcc-armhf, binutils-armhf, musl-armhf etc. The
>> aports for these are automatically generated from the upstream ones
>> in Alpine (basically hardcoded variables from the bootstrap script on
>> top). Unless the new build system would approach cross-compiling in a
>> completely different way, you would also need such binary packages to
>> do it, so maybe this feature can be upstreamed properly (that is the
>> long-term plan anyway, if you guys want to have this - maybe with
>> subpackages?).
> 
> 
> I'm not sure what feature you are talking about.  There is already
> bootstrap.sh to generate a working cross environment.  I suppose you
> could make a bootstrap-lite.sh or such that only makes the toolchain and
> none of the other components, which would work for a kernel build.

What I mean is having packages directly in the binary repository, so one can do:
$ apk add gcc-armhf

Just like it's possible to install gcc-avr right now. But gcc-avr package needs to be manually synced with the gcc aport. Instead of that it would be nice, if we had cross-compilers automatically built for all architectures, without running bootstrap.sh. Possibly as subpackage of gcc (but that's probably not the desired solution, since that will blow up the build time of GCC drastically). I think that this might be relevant to cross-compiling in the binary repository, because once such packages in the binary repo exist, it would be a clean way to install the cross-compiler in the compiling VM/container with apk.

> 
> 
>> Regarding caching: Maybe ccache makes sense, if it was separated for
>> each package, and incoming pull-requests could only read from that
>> cache, not write to it.
> 
> 
> I think the cache point is more for the actual git data and distfiles,
> not the compilation.
> 

I understood that, this was just me throwing a different idea in to reduce compilation times for commits which only have small changes (configure flag changed etc.).

> 
>> Finally, I am not sure if the idea is to replace abuild entirely, or
>> just abuild-rootbld.
> 
> 
> Who said abuild was being replaced?  This is about `buildrepo` and
> friends, not abuild.

Thanks for clearing that up.

> 
> 
>> In case abuild should be replaced, I hope to see
>> the following features preserved (since we make use of them in
>> pmbootstrap, which wraps abuild):
>>
>> - (passing through environment variables)
> 
> This isn't a wise idea.  It's better to export them in the build() or
> package() functions.
> 

For package specific variables, we export them in the APKBUILD. We only pass them through abuild for package independent variables, such as:
CARCH, CROSS_COMPILE, CC, CCACHE_PREFIX, CCACHE_PATH, CCACHE_COMPILERCHECK, DISTCC_HOSTS

>> - abuild -r (install depends with abuild)
> 
> This is currently how all Adélie packages are built, but it is a bit
> buggy.  It is useful for end users making single contributions, but
> honestly, I think that this new system is going to be much better than
> `abuild -r`.
> 
>> - abuild undeps 
> 
> Due to the way abuild works right now, `apk del .makedepends-$pkgname`
> has the same effect, if you ever need to do that without abuild.  That's
> an implementation detail and is not guaranteed to always work, but it
> does work right now if you need it.

Right, thanks!

> 
>> - abuild menuconfig
> 
> ???  This isn't even an abuild feature.  It (ab)uses the fact that
> abuild phases are shell functions by calling a menuconfig() function in
> the kernel APKBUILD.  Honestly, I would be much in favour of /not/
> abusing that fact, and making abuild more hardened against that, but
> that is just me.
> 

Alpine's linux-vanilla APKBUILD used to have a menuconfig() function with a comment on top saying something like "# This is so we can use 'abuild menuconfig'". But I just realized that this was removed. Well, we still use that feature for that purpose. But if that did not work anymore in abuild, we could call menuconfig directly.

Thanks,
Oliver

> 
> Best,
> --arw
> 


