Date: Wed, 19 Feb 2020 10:47:52 +0200
From: Timo Teras
To: "Ariadne Conill"
Cc: "Carl Chave", ~alpine/apk-tools@lists.alpinelinux.org
Subject: Re: APK Package Name Issue on armel port
Message-ID: <20200219104752.01d68e90@vostro.wlan>
In-Reply-To: <340209c209104e20dd01b7b6aa71251c@dereferenced.org>
References: <20200217123334.031047d1@vostro.wlan>
 <20200214105215.6873e402@vostro.wlan>
 <62f21c96-ff43-83e4-5b7f-2e65e6d72729@adelielinux.org>
 <20200217111200.08405b9c@vostro.wlan>
 <340209c209104e20dd01b7b6aa71251c@dereferenced.org>

On Wed, 19 Feb 2020 00:58:31 +0000
"Ariadne Conill" wrote:

> February 17, 2020 4:33 AM, "Timo Teras" wrote:
>
> > On Mon, 17 Feb 2020 11:12:00 +0200
> > Timo Teras wrote:
> >
> >> Reviewing the code in apk, it seems that the murmur hashing code is
> >> not properly accounting for alignment. So the symptoms do match.
> >> The hash lookup does not work (well, works randomly based on few
> >> things), but when enumerating all packages (by using the wildcard
> >> in lookups) it'll show everything.
> >
> > Seems this is a common complaint about original murmur3 and there's
> > been other projects affected with this too.
>
> I wonder if we may be better off adopting a simpler hash function
> for our hashtables, such as FNV-1? I have had much success over
> the years using FNV hashes in various projects in a similar role
> as what apk uses murmur3 for.

It used to be the DJB hash, and was later switched to Murmur for better
speed and hash quality. FNV-1 seems to be slightly better in quality and
speed than DJB, but below Murmur3. Though, that probably depends greatly
on input length.
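
For comparison, here is a minimal sketch of the two byte-at-a-time hashes
named above, assuming the 32-bit FNV-1 variant and the common "djb2" form
of the DJB hash. The function names are illustrative only and are not
taken from the apk-tools source.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1: multiply by the FNV prime, then XOR in the next byte.
 * (Hypothetical helper name; not an apk-tools function.) */
static uint32_t fnv1_32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;           /* FNV 32-bit offset basis */

    for (size_t i = 0; i < len; i++) {
        h *= 16777619u;                 /* FNV 32-bit prime */
        h ^= p[i];
    }
    return h;
}

/* DJB hash, "djb2" form: h = h * 33 + byte, starting from 5381.
 * (Hypothetical helper name; not an apk-tools function.) */
static uint32_t djb2_32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 5381u;

    for (size_t i = 0; i < len; i++)
        h = h * 33u + p[i];
    return h;
}
```

Both loops read the input one byte at a time, so they never perform the
wider-than-byte loads that raise the alignment question for murmur3. The
cost is one multiply per input byte, which is why input length matters
for the speed comparison.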
See also:
https://aras-p.info/blog/2016/08/09/More-Hash-Function-Tests/

In the current code base, which needs to intern large amounts of even
long strings, I'd prefer to keep Murmur3 or even go to something better
like xxHash.

Though, perhaps this is becoming less important. That is, the interning
of most package data will not be needed once we mmap the files in the
future. If the result is that hashing is needed for short strings only,
such as the package name, it might be worth looking at simplifying
things by going to FNV-1.

Timo