X-Original-To: alpine-user@lists.alpinelinux.org
Received: from sdaoden.eu (sdaoden.eu [217.144.132.164])
	by lists.alpinelinux.org (Postfix) with ESMTP id 593895C4EB5
	for ; Fri, 16 Mar 2018 15:37:45 +0000 (GMT)
Received: by sdaoden.eu (Postfix, from userid 1000)
	id 815EF16045; Fri, 16 Mar 2018 16:37:44 +0100 (CET)
Date: Fri, 16 Mar 2018 16:37:41 +0100
From: Steffen Nurpmeso
To: Natanael Copa
Cc: alpine-user@lists.alpinelinux.org
Subject: Re: [alpine-user] FYI: community/zstd binary much (up to 4x) slower than necessary
Message-ID: <20180316153741.sbhOT%steffen@sdaoden.eu>
References: <20180313180648.kXWsR%steffen@sdaoden.eu> <20180316091207.3ad9dd48@ncopa-desktop.copa.dup.pw>
In-Reply-To: <20180316091207.3ad9dd48@ncopa-desktop.copa.dup.pw>
Mail-Followup-To: Natanael Copa, alpine-user@lists.alpinelinux.org
User-Agent: s-nail v14.9.9-19-gf907977e
OpenPGP: id=EE19E1C1F2F7054F8D3954D8308964B51883A0DD; url=https://ftp.sdaoden.eu/steffen.asc
BlahBlahBlah: Any stupid boy can crush a beetle. But all the professors in the world can make no bugs.
X-Mailinglist: alpine-user
Precedence: list
List-Id: Alpine Development
List-Unsubscribe:
List-Post:
List-Help:
List-Subscribe:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=l8LlDT3ho1bzEhCf3SDVThyuRFCxCkkCPBYr=-="

This is a multi-part message in MIME format.

--=-=l8LlDT3ho1bzEhCf3SDVThyuRFCxCkkCPBYr=-=
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-ID: <20180316153741.C2Ttr%steffen@sdaoden.eu>

Hello.

Natanael Copa wrote:
 |On Tue, 13 Mar 2018 19:06:48 +0100
 |Steffen Nurpmeso wrote:
 |
 |> Hello, for your possible interest.
 |>
 |> In a thread for the LUGA(ustria) i eventually had to time some
 |> compression algorithms and wondered why zstd is so slow, but
 |> especially so the decompressing stage, which is a key feature of
 |> this one.  It turns out that the -Os compilation causes, well,
 |> dramatic performance degradation.
 |> I compiled my own with -O3 and the difference is up to a factor
 |> of four.  Just one example:
 ...
 |Are you compressing the same file? I see x4.txt, x5.txt vs. x1.txt.
 |File content may make a difference too.

Yes, it was all the same.  It was just an excerpt of that LUGA
message, sorry.

 |> That makes me actually wonder how ports should deal with CFLAGS.
 |> Is it acceptable for a port to watch for compiler flags and set
 |> them, my MUA would go for PIE, relro and all that, then?
 |
 |I think if the difference is 4x then, yes, I think we should explicitly
 |set CFLAGS from the aport, with a note on why. I do prefer -O2 over -O3
 |though, so it would be nice to see the numbers with -O2 and also what
 |the numbers are on different platforms.
 |
 |We already explicitly set -O2 for zlib, because it's a case where we do
 |want to trade size for more speed.

I see.  I only have access to x86 (with Linux) for now, i really
have to do something about that some day...
With -O2:

#?0[steffen@essex zstd]$ CFLAGS=-O2 make zstd
...
#?0[steffen@essex zstd]$ ll zstd
-rwxr-x--- 1 steffen steffen 582392 Mar 16 16:11 zstd*
#?0[steffen@essex zstd]$ ldd zstd
        /lib/ld-musl-x86_64.so.1 (0x7fc87972c000)
        libz.so.1 => /lib/libz.so.1 (0x7fc879291000)
        libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7fc87972c000)
#?0[steffen@essex zstd]$ time ./zstd -c < C165.txt > .t1
0m00.40s real 0m00.27s user 0m00.09s system
#?0[steffen@essex zstd]$ time ./zstd -c < C165.txt > .t1
0m00.31s real 0m00.23s user 0m00.07s system
#?0[steffen@essex zstd]$ time ./zstd -19 -c < C165.txt > .t1
0m12.50s real 0m12.35s user 0m00.13s system
#?0[steffen@essex zstd]$ time ./zstd -19 -c < C165.txt > .t1
0m12.32s real 0m12.14s user 0m00.15s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.17s real 0m00.11s user 0m00.06s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.13s real 0m00.09s user 0m00.03s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.12s real 0m00.09s user 0m00.02s system

No difference with -O3, actually:

#?0[steffen@essex zstd]$ CFLAGS=-O3 make zstd
...
#?0[steffen@essex zstd]$ ll zstd
-rwxr-x--- 1 steffen steffen 619296 Mar 16 16:17 zstd*
#?0[steffen@essex zstd]$ ldd zstd
        /lib/ld-musl-x86_64.so.1 (0x7f423a622000)
        libz.so.1 => /lib/libz.so.1 (0x7f423a17e000)
        libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f423a622000)
#?0[steffen@essex zstd]$ time ./zstd -c < C165.txt > .t1
0m00.33s real 0m00.26s user 0m00.06s system
#?0[steffen@essex zstd]$ time ./zstd -c < C165.txt > .t1
0m00.28s real 0m00.23s user 0m00.04s system
#?0[steffen@essex zstd]$ time ./zstd -19 -c < C165.txt > .t1
0m12.45s real 0m12.19s user 0m00.21s system
#?0[steffen@essex zstd]$ time ./zstd -19 -c < C165.txt > .t1
0m12.97s real 0m12.82s user 0m00.14s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.13s real 0m00.07s user 0m00.06s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.13s real 0m00.08s user 0m00.05s system

But lots of difference for /usr/bin/zstd:

#?0[steffen@essex zstd]$ ll /usr/bin/zstd
-rwxr-xr-x 1 root root 382792 Dec 27 15:17 /usr/bin/zstd*
#?0[steffen@essex zstd]$ ldd /usr/bin/zstd
        /lib/ld-musl-x86_64.so.1 (0x7f2255a3d000)
        libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f2255a3d000)
#?0[steffen@essex zstd]$ time /usr/bin/zstd -c < C165.txt > .t1
0m00.53s real 0m00.44s user 0m00.07s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -c < C165.txt > .t1
0m00.52s real 0m00.44s user 0m00.07s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -19 -c < C165.txt > .t1
0m15.16s real 0m15.06s user 0m00.09s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -19 -c < C165.txt > .t1
0m15.35s real 0m15.19s user 0m00.14s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -d -c < .t1 >/dev/null
0m00.40s real 0m00.27s user 0m00.12s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -d -c < .t1 >/dev/null
0m00.36s real 0m00.30s user 0m00.05s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -d -c < .t1 >/dev/null
0m00.40s real 0m00.27s user 0m00.14s system

Quick PDF with
Steven-Levy_Hackers-Heroes-Computer-Revolution.pdf; the difference is
not so big here, but decompression is still nearly a factor of two
faster:

#?0[steffen@essex zstd]$ ll slhhcr.pdf
-rw-r----- 1 steffen steffen 2761072 Mar 16 16:24 slhhcr.pdf
#?0[steffen@essex zstd]$ time ./zstd -c < slhhcr.pdf >.t1
0m00.13s real 0m00.06s user 0m00.06s system
#?0[steffen@essex zstd]$ time ./zstd -19 -c < slhhcr.pdf >.t1
0m01.58s real 0m01.50s user 0m00.08s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.03s real 0m00.02s user 0m00.01s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.04s real 0m00.01s user 0m00.02s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.05s real 0m00.02s user 0m00.03s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -c < slhhcr.pdf >.t1
0m00.18s real 0m00.11s user 0m00.07s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -19 -c < slhhcr.pdf >.t1
0m01.82s real 0m01.74s user 0m00.07s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -d < .t1 >/dev/null
0m00.07s real 0m00.03s user 0m00.04s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -d < .t1 >/dev/null
0m00.09s real 0m00.04s user 0m00.04s system

And finally the Guide_to_Digital_Signal_Processing (a directory of
PDFs) as a tar file; decompression differs by a factor of three to
four:

#?0[steffen@essex zstd]$ ll gtdsp.tar
-rw-r----- 1 steffen steffen 16537600 Mar 16 16:29 gtdsp.tar
#?0[steffen@essex zstd]$ time ./zstd -c < gtdsp.tar >.t1
0m00.36s real 0m00.22s user 0m00.13s system
#?0[steffen@essex zstd]$ time ./zstd -19 -c < gtdsp.tar >.t1
0m06.78s real 0m06.62s user 0m00.14s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.10s real 0m00.06s user 0m00.04s system
#?0[steffen@essex zstd]$ time ./zstd -d -c < .t1 >/dev/null
0m00.10s real 0m00.05s user 0m00.04s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -c < gtdsp.tar >.t1
0m00.62s real 0m00.43s user 0m00.18s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -19 -c < gtdsp.tar >.t1
0m07.43s
real 0m07.16s user 0m00.23s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -d -c < .t1 >/dev/null
0m00.37s real 0m00.21s user 0m00.15s system
#?0[steffen@essex zstd]$ time /usr/bin/zstd -d -c < .t1 >/dev/null
0m00.33s real 0m00.29s user 0m00.04s system

Since i have no chance to test other architectures i leave arch=
unmodified, though i wonder whether that is right, since the Makefile
has explicit ARM flags.

Ciao,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

--=-=l8LlDT3ho1bzEhCf3SDVThyuRFCxCkkCPBYr=-=
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="zstd.diff"
Content-ID: <20180316153741.26PGx%steffen@sdaoden.eu>

diff --git a/community/zstd/APKBUILD b/community/zstd/APKBUILD
index 1113f6ff..517dea8a 100644
--- a/community/zstd/APKBUILD
+++ b/community/zstd/APKBUILD
@@ -14,6 +14,8 @@ builddir="$srcdir/$pkgname-$pkgver"
 
 build() {
 	cd "$builddir"
+	# we trade size for a little more speed.
+	CFLAGS="$CFLAGS -O2"
 	make -j1
 }
 

--=-=l8LlDT3ho1bzEhCf3SDVThyuRFCxCkkCPBYr=-=--

---
Unsubscribe: alpine-user+unsubscribe@lists.alpinelinux.org
Help: alpine-user+help@lists.alpinelinux.org
---