Subject: Re: [alpine-devel] force compile flag for musl?
From: Jakub Jirutka
Date: Wed, 25 Oct 2017 21:49:43 +0200
To: Alpine-devel
Cc: ska@skarnet.org, Shiz
In-Reply-To: <15f542f8688.cf7ee9da1704.8790472834519953285@zoho.com>
References: <20171025164614.14c4c57e@ncopa-desktop.copa.dup.pw> <15f542f8688.cf7ee9da1704.8790472834519953285@zoho.com>

Uh, I’ve accidentally sent my response to just specific people and not to the ML. :( So once again, and sorry for the mess.

>> --- a/main/musl/APKBUILD
>> +++ b/main/musl/APKBUILD
>> @@ -54,6 +54,8 @@ build() {
>>   fi
>>
>>   # note: not autotools
>> + # force -O2 compile flag for better performance
>> + CFLAGS="-O2" \
>>   LDFLAGS="$LDFLAGS -Wl,-soname,libc.musl-${CARCH}.so.1" \
>>   ./configure \
>>     --build=$CBUILD \

This is IMO not correct: CFLAGS does not define just -Os, but also other flags (-fomit-frame-pointer). Here you replace all the default flags with -O2. I suggest using `CFLAGS="${CFLAGS/-Os/-O2}"` instead; that’s what I’ve already used in some abuilds.

> OTOH higher optimization levels for x86-64 usually tend to give better results than on other archs.

x86_64 is the most used platform, especially when you need performance (and can’t afford IBM’s proprietary architectures). I’d bet that it applies even to Alpine, but I don’t have any numbers.

> There were similar changes in aports for various applications over recent months, but I haven’t seen even one proof behind them.
>
> Performance improvements are important, and they may come from simply bumping optimization level, but it should be verified, not blindly assumed.

Technically you’re right, it’d indeed be nice to have some real proof. Unfortunately it’s quite hard to make a good benchmark and probably none of us has time to do that. :(

However, there’s another reason to prefer -O2. It’s the default optimization level that almost all upstream projects use, and even downstream (other distros). So it’s the most tested variant, and if a project cares about performance, they typically optimize for -O2 (or -O3). But I must mention that I’m really not an expert in this field; I can only say what I see as the most used, not whether it really makes sense technically.
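
The difference between overwriting CFLAGS and the `${CFLAGS/-Os/-O2}` substitution suggested above can be sketched as follows. The default flags here are an assumption for illustration only (the real ones come from /etc/abuild.conf and may differ), and the variable names `overwritten` and `substituted` are hypothetical. The `${var/pattern/replacement}` form is not POSIX sh; it works in bash and, as far as I know, in BusyBox ash.

```sh
# Assumed default flags, standing in for whatever /etc/abuild.conf sets.
CFLAGS="-Os -fomit-frame-pointer"

# Overwriting the variable drops every other default flag.
overwritten="-O2"

# Pattern substitution swaps only the optimization level and keeps the rest.
substituted="${CFLAGS/-Os/-O2}"

echo "overwritten: $overwritten"
echo "substituted: $substituted"
```

With the assumed defaults, the substituted value still carries -fomit-frame-pointer, while the overwritten one contains only -O2.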

I’m the one who has changed -Os to -O2 in _some_ specific abuilds where performance is important, e.g. x264, opus, qemu, postgresql… I actually tried to find some proof for PostgreSQL, but the only good performance comparison I found [1] compares just -O2, -O3, -O4, -march=native and -flto, not -Os.

Just a note: we already compile many aports with -O2 and most of you don’t even know about it and/or don’t care. These are almost all aports built by CMake. CMake by default doesn’t log what it is really executing, unless you set `-DCMAKE_VERBOSE_MAKEFILE=ON` (that’s why I usually enable it). If you enabled verbose mode, you would see that -Os is passed to gcc, but it’s followed by -O2 added by CMake. That’s what `-DCMAKE_BUILD_TYPE=Release` does (among other things); it’s a profile that predefines various flags including -O and it has higher priority than CFLAGS from the environment. If you want -Os, there’s a built-in profile MinSizeRel. You can look at which aports use this flag… ;)

However, as I have discovered over time, MinSizeRel is not always usable. Many projects have very bad CMakeLists; they fully or partially ignore CMAKE_BUILD_TYPE or foolishly assume that the only release profile is Release, so the build sometimes even fails with MinSizeRel.

So, to sum it up, -O2 is the default and most used optimization level in most upstream projects and even other distributions, and we already use it for many aports, whether you’re aware of it or not. So I’d support changing the default to -O2, at least for x86_64, and changing it to -Os in specific abuilds where it makes sense (especially small static binaries).

As a (partial) academic person, I must agree with Przemysław about benchmarking vs. believing, but it’s unfortunately not realistic, at least not for all aports.
It’d be really great and useful if someone could measure the difference at least for very core components like musl libc.

I’d like to ask Skarnet and Shiz about their opinion and expertise in this topic.

Jakub

[1]: https://blog.pgaddict.com/posts/compiler-optimization-vs-postgresql

> On 25. Oct 2017, at 17:38, Przemysław Pawełczyk wrote:
>
> ---- On Wed, 25 Oct 2017 16:46:14 +0200 Natanael Copa wrote ----
>> I wonder what you think about overriding the -Os compile flag for musl,
>> and hardcode it to -O2.
>
> I would be very careful with such changes.
>
> There is a misconception that the higher the optimization level, the
> faster the generated code. That is the general -Olevel idea, but not
> what is seen in practice. Gains (or losses) from higher optimization
> levels vary between archs and obviously depend on the code that is
> being optimized.
>
> Smaller code, beside being smaller, is also more cache-friendly, so -Os
> can be faster than -O2 and often is. OTOH higher optimization levels
> for x86-64 usually tend to give better results than on other archs.
>
> There is no rule. It all depends on:
> - source code,
> - compiler,
> - platform.
>
>>
>> I think this makes sense since the functions in libc are so often used
>> that we want to trade better performance at the cost of slightly bigger
>> binary.
>
> This makes sense if we really get better performance with -O2 on all
> platforms AL supports. And to be able to confirm that, it has to be
> measured.
>
>>
>> This means that we override whatever the user has set CFLAGS to
>> in /etc/abuild.conf
>>
>> We already do this with zlib.
>
> zlib is a different beast, because it's computational software. It's
> much more natural to see gains from higher -Olevel in that kind of apps.
>
>>
>> What do you think?
>
> There were similar changes in aports for various applications over
> recent months, but I haven't seen even one proof behind them.
>
> Performance improvements are important, and they may come from simply
> bumping optimization level, but it should be verified, not blindly
> assumed.
>
> Regards,
> Przemek
>
>>
>> diff --git a/main/musl/APKBUILD b/main/musl/APKBUILD
>> index 1938bbb3ca..193002186d 100644
>> --- a/main/musl/APKBUILD
>> +++ b/main/musl/APKBUILD
>> @@ -54,6 +54,8 @@ build() {
>>   fi
>>
>>   # note: not autotools
>> + # force -O2 compile flag for better performance
>> + CFLAGS="-O2" \
>>   LDFLAGS="$LDFLAGS -Wl,-soname,libc.musl-${CARCH}.so.1" \
>>   ./configure \
>>     --build=$CBUILD \
>>
>> -nc
>
> ---
> Unsubscribe: alpine-devel+unsubscribe@lists.alpinelinux.org
> Help: alpine-devel+help@lists.alpinelinux.org
> ---