I have a running armel port of an alpine base system on an old
Dell/Kace M300 SBC appliance. It seems to be running very well except
that I can't install any packages with apk and I'm completely out of
ideas on how to fix it.
I _think_ there's something going on with apk's ability to locate
packages based on their short names. For example, on a working system
I can do:
sodco:/home/sodface# apk list file
file-5.38-r0 x86_64 {file} (BSD-2-Clause) [installed]
file-5.37-r1 x86_64 {file} (BSD-2-Clause)
and get results. On the armel system, the same command produces no output:
m300-01:/home/sodface# apk list file
m300-01:/home/sodface#
However if I add a wildcard I get results:
m300-01:/home/sodface# apk list file*
file-5.37-r1 armel {file} (BSD-2-Clause)
file-dev-5.37-r1 armel {file} (BSD-2-Clause)
file-doc-5.37-r1 noarch {file} (BSD-2-Clause)
apk update seems to be aware of all the packages:
m300-01:/home/sodface# apk update
OK: 317 distinct packages available
I see the same sort of thing with other apk operations, like fetch and
list. So when I try to apk add a package, it always fails with an
"unsatisfiable constraints" error that reports the package as missing
and displays the package short name.
Any help would be very much appreciated.
Carl
On Thu, 13 Feb 2020 22:55:37 -0500
Carl Chave <online@chave.us> wrote:
> I have a running armel port of an alpine base system on an old
> Dell/Kace M300 SBC appliance. It seems to be running very well except
> that I can't install any packages with apk and I'm completely out of
> ideas on how to fix it.
>
> I _think_ there's something going on with apk's ability to locate
> packages based on their short names. For example, on a working system
> I can do:
>
> sodco:/home/sodface# apk list file
> file-5.38-r0 x86_64 {file} (BSD-2-Clause) [installed]
> file-5.37-r1 x86_64 {file} (BSD-2-Clause)
>
> and get results. On the armel system, the same command produces no
> output
> m300-01:/home/sodface# apk list file
> m300-01:/home/sodface#
>
> However if I add a wildcard I get results:
> m300-01:/home/sodface# apk list file*
> file-5.37-r1 armel {file} (BSD-2-Clause)
> file-dev-5.37-r1 armel {file} (BSD-2-Clause)
> file-doc-5.37-r1 noarch {file} (BSD-2-Clause)
This sounds weird. What is the apk-tools version, and which Alpine
branch (edge or some stable) is this?
> apk update seems to be aware of all the packages:
> m300-01:/home/sodface# apk update
> OK: 317 distinct packages available
This indicates that no remote repositories are configured, and the
package count is pretty low. Perhaps you are missing some repositories
from the /etc/apk/repositories config?
> I see the same sort of thing with other apk operations, like fetch and
> list. So when I try to apk add a package it always comes back with
> unsatisfiable constraints that the package is missing and displays the
> package short name.
Yeah, sounds like you are missing repositories.
Timo
Timo,
I'll try to keep this short but I'm out of ideas and hoping you might
see where I'm going wrong.
-- used aports/abuild and bootstrap.sh on alpine x86_64 to cross
compile an armel base package set, all but kernel and initramfs
-- compiled kernel separately with appended .dtb for the armel board
-- modified the genrootfs.sh script to build root file system image
with just alpine-base
-- added kernel modules to root file system
-- boot armel board from included uboot with root= kernel argument (no
initramfs)
All of the above seems to work just fine. I get no errors during
boot, I can login, network is good etc. The only issue so far is with
apk always returning unsatisfiable constraints missing (alpine-base)
and alpine-base is the only entry in world. My public key is in
/etc/apk/keys and I don't see anything that would indicate trust is an
issue. I rebuilt the whole repo again last night and the problem
remains.
So I decided to take the same armel repo and modified the
alpine-chroot script to use it to build an armel chroot on my Fedora
machine with QEMU static. This worked fine and apk worked as expected
in the chroot! I diff'd the /lib/apk/db/installed file between the
armel board load and the chroot and they were identical. I even moved
the file between machines just to see and either file works in the
chroot but neither works on the armel board load. apk still complains
of unsatisfiable constraints.
Next I tar'd up the chroot on the fedora machine, wiped the armel
board / partition and restored it with the chroot image. Did a few
changes in etc for securetty, inittab, fstab, shadow and symlinks in
/etc/runlevels but that's it. Rebooted, everything came up fine but
apk remains broken the same as it was.
Can you think of anything that I'm doing wrong that would break or
confuse apk? A kernel config option? Lack of initramfs? Tar option?
Timestamps? Locale? Timezone? fstab mount option? I ran strace at one
point but didn't see anything obviously different between a working
system and the non-working system. I couldn't get ltrace to compile.
I just don't know how to proceed from here.
Thanks,
Carl
On 16/02/2020 13:01, Carl Chave wrote:
> Timo,
>
> I'll try to keep this short but I'm out of ideas and hoping you might
> see where I'm going wrong.
>
> -- used aports/abuild and bootstrap.sh on alpine x86_64 to cross
> compile an armel base package set, all but kernel and initramfs
> -- compiled kernel separately with appended .dtb for the armel board
> -- modified the genrootfs.sh script to build root file system image
> with just alpine-base
> -- added kernel modules to root file system
> -- boot armel board from included uboot with root= kernel argument (no
> initramfs)
>
> All of the above seems to work just fine. I get no errors during
> boot, I can login, network is good etc. The only issue so far is with
> apk always returning unsatisfiable constraints missing (alpine-base)
> and alpine-base is the only entry in world. My public key is in
> /etc/apk/keys and I don't see anything that would indicate trust is an
> issue. I rebuilt the whole repo again last night and the problem
> remains.
>
> So I decided to take the same armel repo and modified the
> alpine-chroot script to use it to build an armel chroot on my Fedora
> machine with QEMU static. This worked fine and apk worked as expected
> in the chroot! I diff'd the /lib/apk/db/installed file between the
> armel board load and the chroot and they were identical. I even moved
> the file between machines just to see and either file works in the
> chroot but neither works on the armel board load. apk still complains
> of unsatisfiable constraints.
>
> Next I tar'd up the chroot on the fedora machine, wiped the armel
> board / partition and restored it with the chroot image. Did a few
> changes in etc for securetty, inittab, fstab, shadow and symlinks in
> /etc/runlevels but that's it. Rebooted, everything came up fine but
> apk remains broken the same as it was.
>
> Can you think of anything that I'm doing wrong that would break or
> confuse apk? A kernel config option? Lack of initramfs? Tar option?
> Timestamps? Locale? Timezone? fstab mount option? I ran strace at one
> point but didn't see anything obviously different between a working
> system and the non-working system. I couldn't get ltrace to compile.
>
> I just don't know how to proceed from here.
>
> Thanks,
> Carl
It sounds like you may be missing an /etc/apk/repositories file.
If not, maybe you are using HTTPS and the date is not set properly.
This could cause a certificate validation failure, and then apk would
ignore the repository.
Best,
--arw
--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org
> It sounds like you may be missing an /etc/apk/repositories file.
>
> If not, maybe you are using HTTPS and the date is not set properly.
> This could cause a certificate validation failure, and then apk would
> ignore the repository.
The repositories file exists, and it's been my experience that you
don't even need any entries in it for apk to function and return
information for commands like "apk info", as long as you pass it the
name of a package that's already installed. I believe it returns
information found in /lib/apk/db/installed. For example, in a working
chroot with only alpine-base in /etc/apk/world I get:
sodpro:~# apk add
OK: 8 MiB in 20 packages
sodpro:~# apk info musl
musl-1.1.24-r0 description:
the musl c library (libc) implementation
musl-1.1.24-r0 webpage:
http://www.musl-libc.org/
musl-1.1.24-r0 installed size:
643072
Note that "apk add" with no arguments returns the number of installed
packages and "apk info musl" returns information about musl, one of the
installed packages. The repositories file has only one local
filesystem entry (there are no official armel repos), and that entry is
commented out and the repo is not even mounted. On the system that's
not working, which is actually booting from a tar'd image of the
chroot above, this is what I get with the same commands. Note that I
have to add a wildcard, as in "apk info musl*", to get output from the
command:
localhost:/# apk add
ERROR: unsatisfiable constraints:
alpine-base (missing):
required by: world[alpine-base]
localhost:/# apk info musl
localhost:/# apk info musl*
musl-1.1.24-r0 description:
the musl c library (libc) implementation
musl-1.1.24-r0 webpage:
http://www.musl-libc.org/
musl-1.1.24-r0 installed size:
643072
musl-utils-1.1.24-r0 description:
the musl c library (libc) implementation
musl-utils-1.1.24-r0 webpage:
http://www.musl-libc.org/
musl-utils-1.1.24-r0 installed size:
110592
Carl
The odd results of a few more tests and then I'll stop spamming the
list for help on an unsupported architecture:
First, apk info with no arguments, which gives me the correct output
as compared to my working chroot instance:
localhost:~# apk info
musl
busybox
alpine-baselayout
openrc
alpine-conf
libcrypto1.1
libssl1.1
ca-certificates-cacert
libtls-standalone
ssl_client
zlib
apk-tools
busybox-suid
busybox-initscripts
scanelf
musl-utils
libc-utils
alpine-keys
alpine-base
So far, so good. Run it through xargs and apk list which should give
me more detailed info for each package:
localhost:~# apk info | xargs apk list
alpine-conf-3.8.3-r4 armel {alpine-conf} (MIT) [installed]
Ok, well only for alpine-conf for some reason. Let's apk list that
one directly:
localhost:~# apk list alpine-conf
localhost:~#
Nothing. How about xargs one at a time:
localhost:~# apk info | xargs -n 1 apk list
zlib-1.2.11-r3 armel {zlib} (Zlib) [installed]
Ok, now it's just zlib, since xargs did them one at a time I should be
able to do that one manually:
localhost:~# apk list zlib
zlib-1.2.11-r3 armel {zlib} (Zlib) [installed]
Ok, that worked, how about xargs two at a time:
localhost:~# apk info | xargs -n 2 apk list
alpine-conf-3.8.3-r4 armel {alpine-conf} (MIT) [installed]
libssl1.1-1.1.1d-r3 armel {openssl} (OpenSSL) [installed]
And here's incrementing by one each time:
localhost:~# apk info | xargs -n 3 apk list
alpine-conf-3.8.3-r4 armel {alpine-conf} (MIT) [installed]
localhost:~# apk info | xargs -n 4 apk list
libssl1.1-1.1.1d-r3 armel {openssl} (OpenSSL) [installed]
localhost:~# apk info | xargs -n 5 apk list
localhost:~# apk info | xargs -n 6 apk list
alpine-conf-3.8.3-r4 armel {alpine-conf} (MIT) [installed]
libssl1.1-1.1.1d-r3 armel {openssl} (OpenSSL) [installed]
localhost:~# apk info | xargs -n 7 apk list
localhost:~# apk info | xargs -n 8 apk list
libssl1.1-1.1.1d-r3 armel {openssl} (OpenSSL) [installed]
localhost:~# apk info | xargs -n 9 apk list
localhost:~# apk info | xargs -n 10 apk list
localhost:~# apk info | xargs -n 11 apk list
zlib-1.2.11-r3 armel {zlib} (Zlib) [installed]
localhost:~# apk info | xargs -n 12 apk list
libssl1.1-1.1.1d-r3 armel {openssl} (OpenSSL) [installed]
localhost:~# apk info | xargs -n 13 apk list
localhost:~# apk info | xargs -n 14 apk list
localhost:~# apk info | xargs -n 15 apk list
localhost:~# apk info | xargs -n 16 apk list
libssl1.1-1.1.1d-r3 armel {openssl} (OpenSSL) [installed]
localhost:~# apk info | xargs -n 17 apk list
alpine-conf-3.8.3-r4 armel {alpine-conf} (MIT) [installed]
localhost:~# apk info | xargs -n 18 apk list
alpine-conf-3.8.3-r4 armel {alpine-conf} (MIT) [installed]
localhost:~# apk info | xargs -n 19 apk list
alpine-conf-3.8.3-r4 armel {alpine-conf} (MIT) [installed]
uname -a for comparison:
Working chroot: Linux sodpro 5.3.9-300.fc31.x86_64 #1 SMP Wed Nov 6
16:13:19 UTC 2019 armv7l Linux
Bare metal: Linux localhost 5.4.18-m300-alps-5 #1 PREEMPT Sat Feb 8
01:59:41 UTC 2020 armv5tel Linux
Hi,
On Mon, 17 Feb 2020 00:59:10 -0500
Carl Chave <online@chave.us> wrote:
> The odd results of a few more tests and then I'll stop spamming the
> list for help on an unsupported architecture:
>
> [snip]
Those sound really weird.
> uname -a for comparison:
> Working chroot: Linux sodpro 5.3.9-300.fc31.x86_64 #1 SMP Wed Nov 6
> 16:13:19 UTC 2019 armv7l Linux
> Bare metal: Linux localhost 5.4.18-m300-alps-5 #1 PREEMPT Sat Feb 8
> 01:59:41 UTC 2020 armv5tel Linux
But it does seem to be related to the target hardware. The one you have
is definitely an older box (armv5tel). Perhaps you could show the
/proc/cpuinfo on it too?
It does start to sound like this is one of the older kind of ARM boxes
that does not support unaligned memory access, and that it gives
undefined behaviour instead of e.g. a trap.
Reviewing the code in apk, it seems that the murmur hashing code is not
properly accounting for alignment. So the symptoms do match. The hash
lookup does not work (well, works randomly based on a few things), but
when enumerating all packages (by using the wildcard in lookups) it'll
show everything.
We have not really supported these old ARM boxes. But perhaps it's
better to fix the murmur hash anyway, or just use some other hash if
unaligned access does not work.
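To illustrate why the lookups can appear to work at random, here is a minimal C sketch. It assumes the pre-ARMv6 behaviour, as I understand it, in which an unaligned word load returns the enclosing aligned word rotated by the byte offset. The function names (load_le32, legacy_arm_ldr) are hypothetical, and this is a model for illustration only, not code from apk-tools.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable little-endian 32-bit load: reads one byte at a time,
 * so the address never needs to be 4-byte aligned. */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Rotate right; n is reduced modulo 32 and 0 is handled explicitly
 * to avoid an undefined shift by 32. */
uint32_t rotr32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v >> n) | (v << (32 - n)) : v;
}

/* ASSUMPTION: model of a legacy (pre-ARMv6) unaligned word load.
 * The CPU fetches the aligned word that contains the address and
 * rotates it right by 8 bits per byte of misalignment. `base` must
 * point at a 4-byte-aligned buffer; `off` is the byte offset read. */
uint32_t legacy_arm_ldr(const uint8_t *base, size_t off)
{
    uint32_t word = load_le32(base + (off & ~(size_t)3));
    return rotr32(word, 8 * (unsigned)(off & 3));
}
```

With the bytes 00 11 22 33 44 55 66 77, a correct load at offset 1 yields 0x44332211, while the modelled legacy load yields 0x00332211. A hash built on such loads would depend on where the string happens to sit in memory, which would fit the symptom of lookups succeeding for only some names.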
See: https://www.kernel.org/doc/Documentation/arm/mem_alignment
So to verify this, you could enable the kernel fixup of unaligned
accesses:
echo 2 > /proc/cpu/alignment
If things start to work magically, it's the alignment issue. To verify
that no other program is suffering from this, you probably want initial
testing runs to be done using:
echo 4 > /proc/cpu/alignment
That causes the application to be killed with SIGBUS when it tries to
access unaligned data.
Cheers,
Timo
> Can you try the below patch if it fixes things?
>
> Cheers,
> Timo
Timo, I think you nailed it! Seems to be working. I will recreate
the root filesystem with the patched apk-tools and follow your advice
about using:
echo 4 > /proc/cpu/alignment
Thanks a lot for the help.
Carl
> Timo, I think you nailed it! Seems to be working. I will recreate
> the root filesystem with the patched apk-tools and follow your advice
> about using:
> echo 4 > /proc/cpu/alignment
>
> Thanks a lot for the help.
>
> Carl
Hi Timo, I did some more testing tonight with the patched apk-tools
and while things have improved, there still appear to be some issues
lurking:
localhost:~# cat /proc/cpu/alignment
User: 259
System: 0 (0x0)
Skipped: 0
Half: 0
Word: 0
DWord: 0
Multi: 0
User faults: 0 (ignored)
localhost:~# echo 4 > /proc/cpu/alignment
localhost:~# cat /proc/cpu/alignment
User: 259
System: 0 (0x0)
Skipped: 0
Half: 0
Word: 0
DWord: 0
Multi: 0
User faults: 4 (signal)
localhost:~# apk list
Bus error
Hello,
February 17, 2020 4:33 AM, "Timo Teras" <timo.teras@iki.fi> wrote:
> On Mon, 17 Feb 2020 11:12:00 +0200
> Timo Teras <timo.teras@iki.fi> wrote:
>
>> Reviewing the code in apk, it seems that the murmur hashing code is
>> not properly accounting for alignment. So the symptoms do match. The
>> hash lookup does not work (well, works randomly based on few things),
>> but when enumerating all packages (by using the wildcard in lookups)
>> it'll show everything.
>
> Seems this is a common complaint about original murmur3 and there's
> been other projects affected with this too.
I wonder if we may be better off adopting a simpler hash function
for our hashtables, such as FNV-1? I have had much success over
the years using FNV hashes in various projects in a similar role
as what apk uses murmur3 for.
Ariadne
Hi,
On Tue, 18 Feb 2020 18:47:21 -0500
Carl Chave <online@chave.us> wrote:
> > Timo, I think you nailed it! Seems to be working. I will recreate
> > the root filesystem with the patched apk-tools and follow your
> > advice about using:
> > echo 4 > /proc/cpu/alignment
>
> Hi Timo, I did some more testing tonight with the patched apk-tools
> and while things have improved, there still appears to be some issues
> lurking:
>
> localhost:~# echo 4 > /proc/cpu/alignment
>
> localhost:~# cat /proc/cpu/alignment
> User: 259
> System: 0 (0x0)
> Skipped: 0
> Half: 0
> Word: 0
> DWord: 0
> Multi: 0
> User faults: 4 (signal)
>
> localhost:~# apk list
> Bus error
Now you have a signal sent to the process. Could you use gdb to get a
backtrace? That would help identify where the problem is.
Thanks
On Wed, 19 Feb 2020 00:58:31 +0000
"Ariadne Conill" <ariadne@dereferenced.org> wrote:
> February 17, 2020 4:33 AM, "Timo Teras" <timo.teras@iki.fi> wrote:
>
> > On Mon, 17 Feb 2020 11:12:00 +0200
> > Timo Teras <timo.teras@iki.fi> wrote:
> >
> >> Reviewing the code in apk, it seems that the murmur hashing code is
> >> not properly accounting for alignment. So the symptoms do match.
> >> The hash lookup does not work (well, works randomly based on few
> >> things), but when enumerating all packages (by using the wildcard
> >> in lookups) it'll show everything.
> >
> > Seems this is a common complaint about original murmur3 and there's
> > been other projects affected with this too.
>
> I wonder if we may be better off adopting a simpler hash function
> for our hashtables, such as FNV-1? I have had much success over
> the years using FNV hashes in various projects in a similar role
> as what apk uses murmur3 for.
It used to be the DJB hash, and was later switched to murmur for better
speed and hash quality. FNV-1 seems to be slightly better in quality and
speed than DJB, but below Murmur3. Though, that probably depends greatly
on input length.
See also:
https://aras-p.info/blog/2016/08/09/More-Hash-Function-Tests/
In the current code base, which needs to intern large amounts of even
long strings, I'd prefer to keep Murmur3 or even go to something better
like xxHash.
Though, perhaps this is becoming less important. That is, the
interning of most package data will not be needed, as we will mmap the
files in the future. If the result is that hashing is needed for short
strings only, such as the package name, it might be worth looking at
simplifying things by going to FNV-1.
Timo
Hello,
February 19, 2020 2:47 AM, "Timo Teras" <timo.teras@iki.fi> wrote:
> On Wed, 19 Feb 2020 00:58:31 +0000
> "Ariadne Conill" <ariadne@dereferenced.org> wrote:
>
>> February 17, 2020 4:33 AM, "Timo Teras" <timo.teras@iki.fi> wrote:
>>
>>> On Mon, 17 Feb 2020 11:12:00 +0200
>>> Timo Teras <timo.teras@iki.fi> wrote:
>>>
>>>> Reviewing the code in apk, it seems that the murmur hashing code is
>>>> not properly accounting for alignment. So the symptoms do match.
>>>> The hash lookup does not work (well, works randomly based on few
>>>> things), but when enumerating all packages (by using the wildcard
>>>> in lookups) it'll show everything.
>>>
>>> Seems this is a common complaint about original murmur3 and there's
>>> been other projects affected with this too.
>>
>> I wonder if we may be better off adopting a simpler hash function
>> for our hashtables, such as FNV-1? I have had much success over
>> the years using FNV hashes in various projects in a similar role
>> as what apk uses murmur3 for.
>
> It used to be DJB hash. And later switched to murmur for better speed
> and hash qualities. FNV-1 seems to be slightly better quality and speed
> than DJB, but below Murmur3. Though, that probably greatly depends on
> input length.
>
> See also:
> https://aras-p.info/blog/2016/08/09/More-Hash-Function-Tests/
>
> In the current code base that needs to intern large amounts of even
> long strings, I'd prefer to keep Murmur3 or go even something better
> like xxHash.
>
> Though, perhaps this is becoming less important. That is, the
> interning of most package data is not needed as we mmap the files in
> future. If the result is that hashing is needed for short strings only,
> such as the package name, it might be worth looking at simplifying
> things by going to FNV-1.
Yeah, that is what I mean. As the current design is moving away
from the use of interned strings, it may make more sense to use a
simpler byte-for-byte hashing algorithm. That way we don't have
to worry about alignment.
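For reference, a byte-for-byte hash of the kind discussed here could look like the sketch below. It uses the standard 32-bit FNV-1 offset basis and prime; the function name fnv1_32 is hypothetical and nothing here is taken from apk-tools. Because it only ever reads single bytes, the alignment of the input cannot affect the result.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1: multiply by the FNV prime, then XOR in the next byte.
 * Only single-byte reads are used, so no alignment requirement exists. */
uint32_t fnv1_32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;        /* FNV-1 32-bit offset basis */

    for (size_t i = 0; i < len; i++) {
        h *= 16777619u;              /* FNV 32-bit prime */
        h ^= p[i];
    }
    return h;
}
```

Hashing a package name such as "musl" gives the same value whether the bytes start at an aligned address or in the middle of a larger buffer. The trade-off raised in the thread is one multiplication per input byte.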
Ariadne
On Wed, 19 Feb 2020 18:17:37 +0000
"Ariadne Conill" <ariadne@dereferenced.org> wrote:
> > Though, perhaps this is becoming less important. That is, the
> > interning of most package data is not needed as we mmap the files in
> > future. If the result is that hashing is needed for short strings
> > only, such as the package name, it might be worth looking at
> > simplifying things by going to FNV-1.
>
> Yeah, that is what I mean. As the current design is moving away
> from the use of interned strings, it may make more sense to use a
> simpler byte-for-byte hashing algorithm. That way we don't have
> to worry about alignment.
Right. Though, the fix was simple (and committed now): just load the 4
bytes by separate byte loads. It should still be better quality
and faster, as the major speed-up comes from batching the mix operations
every 4 bytes, instead of having a multiplication per byte.
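As a rough illustration of that approach, the sketch below follows the publicly documented MurmurHash3 x86_32 algorithm but assembles each 4-byte block from separate byte loads. It is not the actual apk-tools patch, and the names load_le32, rotl32 and murmur3_32 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a 32-bit little-endian value from four single-byte loads,
 * so the input pointer may have any alignment. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Rotate left; callers only pass n in the range 1..31. */
uint32_t rotl32(uint32_t v, unsigned n)
{
    return (v << n) | (v >> (32 - n));
}

uint32_t murmur3_32(const void *data, size_t len, uint32_t seed)
{
    const unsigned char *p = data;
    const uint32_t c1 = 0xcc9e2d51u, c2 = 0x1b873593u;
    uint32_t h = seed, k;

    /* Body: one mix step per 4-byte block (the batching mentioned above). */
    for (size_t i = len / 4; i > 0; i--, p += 4) {
        k = load_le32(p);
        k *= c1; k = rotl32(k, 15); k *= c2;
        h ^= k; h = rotl32(h, 13); h = h * 5 + 0xe6546b64u;
    }

    /* Tail: the remaining 0..3 bytes. */
    k = 0;
    switch (len & 3) {
    case 3: k ^= (uint32_t)p[2] << 16; /* fallthrough */
    case 2: k ^= (uint32_t)p[1] << 8;  /* fallthrough */
    case 1: k ^= p[0];
            k *= c1; k = rotl32(k, 15); k *= c2;
            h ^= k;
    }

    /* Finalization: mix in the length and avalanche the bits. */
    h ^= (uint32_t)len;
    h ^= h >> 16; h *= 0x85ebca6bu;
    h ^= h >> 13; h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}
```

The multiplications still happen once per 4-byte block rather than once per byte, and the result no longer depends on whether the string starts on a word boundary.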
But yeah, we can re-evaluate things with the new formats.
Timo
Timo,
I'm not very familiar with using gdb. See below and please let me
know how to get you better information.
localhost:~# gdb /usr/lib/debug/sbin/apk.debug
GNU gdb (GDB) 8.3.1
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "armv5-alpine-linux-musleabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/debug/sbin/apk.debug...
(gdb) run
Starting program: /usr/lib/debug/sbin/apk.debug
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
Warning:
Cannot insert breakpoint -1.
Cannot access memory at address 0x46ac
(gdb) backtrace
#0 0x76fb58e0 in ?? ()
#1 0x00000000 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
On Wed, Feb 19, 2020 at 3:25 AM Timo Teras <timo.teras@iki.fi> wrote:
> Hi,
>
> On Tue, 18 Feb 2020 18:47:21 -0500
> Carl Chave <online@chave.us> wrote:
>
> > > Timo, I think you nailed it! Seems to be working. I will recreate
> > > the root filesystem with the patched apk-tools and follow your
> > > advice about using:
> > > echo 4 > /proc/cpu/alignment
> >
> > Hi Timo, I did some more testing tonight with the patched apk-tools
> > and while things have improved, there still appears to be some issues
> > lurking:
> >
> > localhost:~# echo 4 > /proc/cpu/alignment
> >
> > localhost:~# cat /proc/cpu/alignment
> > User: 259
> > System: 0 (0x0)
> > Skipped: 0
> > Half: 0
> > Word: 0
> > DWord: 0
> > Multi: 0
> > User faults: 4 (signal)
> >
> > localhost:~# apk list
> > Bus error
>
> Now you have signal sent to the process. Could you use gdb to get a
> backtrace? That would help identify where the problem is.
>
> Thanks