Modern DWARF debugging sections can be compressed with zlib to save
storage. This is done using the gcc -gz or objcopy
--compress-debug-sections options.
Currently, Alpine does not compress debug sections. I think we should
consider doing so.
== Benefits ==
Compressing debug sections saves a lot of installed disk space. For
example, I took mesa-dbg (the largest dbg apk currently), ran
    find . -name \*.debug -exec objcopy --compress-debug-sections {} \;
and reduced the installed size from 452M to 181M.
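To repeat this kind of measurement on another -dbg package, the steps above can be sketched in Python. This is only a rough sketch: the function names are made up for illustration, it assumes binutils objcopy is on PATH, and it sums apparent file sizes rather than disk blocks, so the totals will not match `du` exactly.

```python
import os
import subprocess

def debug_files(root):
    """Return sorted paths of files under root whose name ends in .debug."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".debug"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)

def total_size(paths):
    """Sum of file sizes in bytes (apparent size, not disk blocks)."""
    return sum(os.path.getsize(p) for p in paths)

def objcopy_command(path):
    """The objcopy invocation from the message, applied to one file in place."""
    return ["objcopy", "--compress-debug-sections", path]

def compress_tree(root):
    """Compress every .debug file under root; return (bytes_before, bytes_after)."""
    paths = debug_files(root)
    before = total_size(paths)
    for path in paths:
        # Assumes binutils objcopy is installed; check=True aborts on failure.
        subprocess.run(objcopy_command(path), check=True)
    return before, total_size(paths)
```

Calling `compress_tree(".")` inside an unpacked -dbg package would return the two byte counts to compare.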
== Drawbacks ==
=== Compatibility ===
Some programs may not be compatible with compressed DWARF sections.
While gdb and lldb support it just fine, libunwind does not support it
until the not-yet-released v1.5. libexecinfo doesn't support compressed
DWARF sections, but it doesn't support split debug either, so it's a
moot point. It also seems that libexecinfo is broken now anyways: [1].
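To see whether a given file actually contains compressed sections (and so whether a consumer needs to cope with them), the section headers can be inspected directly. The sketch below assumes a little-endian ELF64 file and only looks for the SHF_COMPRESSED flag (0x800) from the ELF gABI; it does not detect the older GNU `.zdebug_*` naming convention. The function name is made up for illustration.

```python
import struct

SHF_COMPRESSED = 0x800  # ELF gABI section flag: contents are compressed

def compressed_section_count(elf_bytes):
    """Count sections flagged SHF_COMPRESSED in a little-endian ELF64 image."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf_bytes[4] != 2 or elf_bytes[5] != 1:
        raise ValueError("this sketch handles little-endian ELF64 only")
    # ELF64 header: e_shoff at 0x28, e_shentsize at 0x3A, e_shnum at 0x3C.
    (e_shoff,) = struct.unpack_from("<Q", elf_bytes, 0x28)
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf_bytes, 0x3A)
    count = 0
    for i in range(e_shnum):
        # sh_flags is the 8-byte field at offset 8 of each section header.
        (sh_flags,) = struct.unpack_from("<Q", elf_bytes, e_shoff + i * e_shentsize + 8)
        if sh_flags & SHF_COMPRESSED:
            count += 1
    return count
```

For example, `compressed_section_count(open(path, "rb").read())` returns 0 for a file whose debug sections are not compressed in the gABI style.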
=== Speed ===
Obviously, compressed debug sections take somewhat longer to load. I
haven't tested it, but based on my experience using Gentoo, up to a few
extra seconds loading very large debug symbol files (e.g. mesa, firefox)
can be expected. This probably also applies to applications taking their
own backtraces using libunwind, but I'm not sure.
=== Size ===
In theory, compressing debug sections should slightly increase APK size.
However, in practice, I found this not to be the case. gzip uses a
fairly small window, so in fact double-compressing large, highly
compressible files can sometimes reduce the size. In this case,
musl-dbg tar size (i.e. without apk headers) decreased from 179984559
bytes to 177951179 bytes. Smaller dbg apks will likely see a slight
increase.
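The general effect, that a second deflate pass can shrink already-deflated data, is easy to show on a deliberately extreme toy input. This does not reproduce the -dbg measurement above, and the input (a run of zero bytes) is my own choice for illustration; it only shows that one pass of deflate can leave redundancy behind when the input is very compressible.

```python
import zlib

def double_deflate(data, level=9):
    """Return (size after one zlib pass, size after a second pass over that output)."""
    once = zlib.compress(data, level)
    twice = zlib.compress(once, level)
    return len(once), len(twice)

if __name__ == "__main__":
    # 10 MB of zero bytes: far more compressible than real debug info.
    one_pass, two_pass = double_deflate(bytes(10_000_000))
    print("one pass: ", one_pass, "bytes")
    print("two passes:", two_pass, "bytes")
```

On typical, less redundant input the second pass gains nothing or adds a few bytes, which matches the expectation that smaller dbg apks see a slight increase.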
Unfortunately, this reverses completely with zstd. Using zstd -9,
compressing debug sections causes the tar size to increase from
134414806 to 178504848. This makes sense, because at high levels, zstd
compression is much more powerful than gzip. Unfortunately, there seems
to be no support for zstd or even LZMA DWARF compression at this time,
so unless we implement something like precomp [2], which seems overly
complicated for a simple `apk add`, it would increase the resulting apk
size.
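The same reversal can be sketched with synthetic data. Here lzma from the Python standard library stands in for zstd as the stronger outer compressor, and the input is a low-entropy block repeated at a distance larger than deflate's 32 KiB window; both choices are my assumptions for illustration, not a model of real DWARF data.

```python
import lzma
import random
import zlib

def sample(block_size=256 * 1024, copies=8, seed=0):
    """A 16-symbol random block repeated several times (long-range redundancy)."""
    rng = random.Random(seed)
    block = bytes(rng.choices(range(16), k=block_size))
    return block * copies

def outer_sizes(raw):
    """Return (outer-compressed size of raw, outer-compressed size of zlib'd raw)."""
    pre = zlib.compress(raw, 9)          # stands in for compressed debug sections
    return len(lzma.compress(raw)), len(lzma.compress(pre))

if __name__ == "__main__":
    plain, precompressed = outer_sizes(sample())
    print("outer compressor on raw data:      ", plain, "bytes")
    print("outer compressor on zlib'd data:   ", precompressed, "bytes")
```

The outer compressor can exploit the repeats in the raw data, but after the zlib pass they are hidden in a bitstream it can no longer match against, so the pre-compressed input ends up larger.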
== Conclusions ==
Overall, I think that once libunwind is bumped to 1.5, we should start
compressing debug sections. We shouldn't support old bundled versions of
gdb or libunwind, and the installed size benefit is substantial. In my
opinion, the speed difference is negligible compared to the size
benefits. If we switch to zstd apk format, then that will unfortunately
likely need to be reverted.
[1] https://gitlab.alpinelinux.org/alpine/aports/-/issues/10896
[2] https://github.com/schnaader/precomp-cpp
Hello,
On Thursday, July 2, 2020 11:09:19 AM MDT Alex Xu (Hello71) wrote:
> Modern DWARF debugging sections can be compressed with zlib to save
> storage. This is done using the gcc -gz or objcopy
> --compress-debug-sections options.
>
> Currently, Alpine does not compress debug sections. I think we should
> consider doing so.
>
> == Benefits ==
>
> Compressing debug sections saves a lot of installed disk space. For
> example, I took mesa-dbg (the largest dbg apk currently), ran
>
> find . -name \*.debug -exec objcopy --compress-debug-sections {} \;
>
> and reduced the installed size from 452M to 181M.
This sounds really great, we should do it.
> == Drawbacks ==
>
> === Compatibility ===
>
> Some programs may not be compatible with compressed DWARF sections.
> While gdb and lldb support it just fine, libunwind does not support it
> until the not-yet-released v1.5. libexecinfo doesn't support compressed
> DWARF sections, but it doesn't support split debug either, so it's a
> moot point. It also seems that libexecinfo is broken now anyways: [1].
I don't think these are really a problem. We don't expect reliable symbolic
backtraces out of these libraries; we just provide them (well, libexecinfo)
for applications which demand them.
Besides, the worst case is the symbol in the (probably useless) backtrace
output being "??" instead of the symbol name. I don't see a huge problem with
that.
> === Speed ===
>
> Obviously, compressed debug sections take somewhat longer to load. I
> haven't tested it, but based on my experience using Gentoo, up to a few
> extra seconds loading very large debug symbol files (e.g. mesa, firefox)
> can be expected. This probably also applies to applications taking their
> own backtraces using libunwind, but I'm not sure.
This does not seem like a huge problem either.
> === Size ===
>
> In theory, compressing debug sections should slightly increase APK size.
> However, in practice, I found this not to be the case. gzip uses a
> fairly small window, so in fact double-compressing large, highly
> compressible files can sometimes reduce the size. In this case,
> musl-dbg tar size (i.e. without apk headers) decreased from 179984559
> bytes to 177951179 bytes. Smaller dbg apks will likely see a slight
> increase.
>
> Unfortunately, this reverses completely with zstd. Using zstd -9,
> compressing debug sections causes the tar size to increase from
> 134414806 to 178504848. This makes sense, because at high levels, zstd
> compression is much more powerful than gzip. Unfortunately, there seems
> to be no support for zstd or even LZMA DWARF compression at this time,
> so unless we implement something like precomp [2], which seems overly
> complicated for a simple `apk add`, it would increase the resulting apk
> size.
I think the main benefit is on-disk install size. I am willing to accept a
little bit of storage increase for -dbg packages if we can reduce their
install size at large. We compress man pages for the same reason.
> == Conclusions ==
>
> Overall, I think that once libunwind is bumped to 1.5, we should start
> compressing debug sections. We shouldn't support old bundled versions of
> gdb or libunwind, and the installed size benefit is substantial. In my
> opinion, the speed difference is negligible compared to the size
> benefits. If we switch to zstd apk format, then that will unfortunately
> likely need to be reverted.
I don't think it will need to be reverted. We can make it possible to
override the compression method used on a per-package basis and default -dbg
packages to using zlib compression, or better yet no compression at all.
Ariadne
On Thu, 02 Jul 2020 13:09:19 -0400
"Alex Xu (Hello71)" <alex_y_xu@yahoo.ca> wrote:
> Modern DWARF debugging sections can be compressed with zlib to save
> storage. This is done using the gcc -gz or objcopy
> --compress-debug-sections options.
>
> Currently, Alpine does not compress debug sections. I think we should
> consider doing so.
...
> == Conclusions ==
>
> Overall, I think that once libunwind is bumped to 1.5, we should start
> compressing debug sections. We shouldn't support old bundled versions of
> gdb or libunwind, and the installed size benefit is substantial. In my
> opinion, the speed difference is negligible compared to the size
> benefits. If we switch to zstd apk format, then that will unfortunately
> likely need to be reverted.
I think that makes sense. Let's bring it up again once libunwind 1.5 is
out and our package is updated.
> [1] https://gitlab.alpinelinux.org/alpine/aports/-/issues/10896
> [2] https://github.com/schnaader/precomp-cpp
On Tue, 7 Jul 2020 14:10:20 +0200
Natanael Copa <ncopa@alpinelinux.org> wrote:
> On Thu, 02 Jul 2020 13:09:19 -0400
> "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca> wrote:
>
> > Modern DWARF debugging sections can be compressed with zlib to save
> > storage. This is done using the gcc -gz or objcopy
> > --compress-debug-sections options.
> >
> > Currently, Alpine does not compress debug sections. I think we
> > should consider doing so.
> ...
>
> > == Conclusions ==
> >
> > Overall, I think that once libunwind is bumped to 1.5, we should
> > start compressing debug sections. We shouldn't support old bundled
> > versions of gdb or libunwind, and the installed size benefit is
> > substantial. In my opinion, the speed difference is negligible
> > compared to the size benefits. If we switch to zstd apk format,
> > then that will unfortunately likely need to be reverted.
>
> I think that makes sense. Lets bring it up again once libunwind 1.5 is
> out and our package is updated.
If libunwind needs backtracing for something real, it should first use
.eh_frame, which contains the unwinding info.
The debug symbols are really needed for debugging only. And then we'd
likely be using gdb or something better anyway.
I think we can just go ahead and do this in edge if not done already.
Timo