Date: Thu, 2 Jan 2020 22:46:43 +0200
From: Timo Teras
To: Ariadne Conill
Cc: ~alpine/devel@lists.alpinelinux.org
Subject: Re: new package format and repository layout changes
Message-ID: <20200102224643.1f2ace2e@vostro.lan>
References: <20191230145542.1a7ca9cf@vostro>

On Mon, 30 Dec 2019 14:47:38 -0600
Ariadne Conill wrote:

> On Mon, Dec 30, 2019 at 6:56 AM Timo Teras wrote:
> >
> > Hi all,
> >
> > I am currently going through the list of data that goes to a package
> > and a repository index, as well as the installed db. And trying to
> > draft the first schema version of what data goes where. So I am now
> > having some issues I'd like to discuss here.
> >
> > 1. Repository pinning
> >
> > There's one fundamental issue in the current installed-db that
> > causes pain - especially when we do strong signing. This is how the
> > package pinning is done (the "@edge" tagging to enable specific
> > repositories for specific dependencies only).
> >
> > Main problem is that the package origin repository needs to be
> > tracked for detecting pinning changes and it's not in the package
> > meta data currently. The current workaround is to have the origin
> > tag in installed-db which means there's data that cannot be signed
> > ahead of time. There's also some other subtle issues.
> >
> > My thinking is to start putting the repository meta data
> > (distribution name, branch, component etc.) in the package. This
> > way the package origin is known and signed.
>
> The problem with this is that it makes it difficult to pack a new repo
> with apk fetch.
> I would prefer to retain that capability, as I need
> it for work. While it is possible that we could repack the fetched
> packages with new metadata, it seems wasteful to me.

Is this the same issue as what Drew mentioned? But I'm rather curious
about this. Perhaps we should instead allow building mixed repos like
this? How about re-signing the packages? You need a key for the new
repo anyway...

But I am actually thinking now that from both the security and the
maintenance side it's better to ship the repository as part of the
package. We can then see in what environment the package was built.

Also, there are issues with supporting "apk add http://path/to/pkg.apk",
and for that to work the package needs to contain the repository data.
Though, if we are going more and more in the "follow the repository"
direction, supporting this might not be feasible.

> > 2. repositories list format
> >
> > If the above happens, we might need to do some changes how the tags
> > are specified.
> >
> > There was some discussion earlier if we should support more debian
> > style definition of listing the distro repositories. E.g.
> > http://dl-cdn.alpinelinux.org/alpine edge main community
>
> I think this is only worth it if we add support for multiple types of
> repository (for example deb-src). Otherwise, we should keep things
> simple.

Noted.

> > Where the first word is the base URL (or perhaps even some $MIRROR
> > variable). The second word the distribution branch. And remaining
> > words would be the list of enabled repositories.
> >
> > I think the package naming could then be:
> > $base_url/$branch/$repo/$arch/$pkgname-$pkgrel.$uniqueid.$arch.apk
> > and automatically constructed from the package metadata.
>
> I don't follow. There's no guarantee that a generated URI will
> actually point to a package that still exists. Packages are added and
> removed from repos all the time.

That was mostly to indicate that I'm planning to change the visible
package naming scheme.
And yes, the repo will still get updated. Perhaps we add some options
to keep packages in the repo for a grace period before deletion, so
that using a stale index would still work during that grace period.

> > (Also wondering if the $uniqueid should be just random generated
> > uuid, or some sort of hash calculated from the package metadata and
> > contents. The requirement is that it can be used to identify if two
> > packages are the same or not.)
>
> If we really need a uniqueid (I'm skeptical), then I would suggest
> using a truncated hash for it, or perhaps simply a CRC32 or similar.

Yes, I suppose that makes sense. Especially when considering
reproducible builds.

> > 3. 'noarch' handling
> >
> > When implementing the above, I would finally like to properly
> > implement the 'noarch'. Currently the sources set 'noarch' and
> > build subpackage properly. But they are put to the target
> > architecture's storage and when creating index the arch is
> > rewritten to the target arch always. The plan is to start creating
> > real 'noarch' repository and put the built package there. I'm
> > wonder if we'd put separate index there, or include the noarch
> > packages also in the target arch index.
>
> We should use a separate index for noarch. I would prefer to see it
> work in a way where we could move Alpine to being a fully multi-arch
> distro in the future, where we can use qemu-user to run binaries for
> other archs. This mostly solves cross-compiling in a clean way, too.

That means a few more downloads, but I guess that's acceptable. It
probably simplifies things on the repository management side too.

Reminds me, I still have not decided whether the new format will be
fixed to one endianness, or whether I'll try to generate a file in the
target's endianness. In the latter case we'd need two copies of noarch:
a little-endian and a big-endian one.

> > 4. version handling
> >
> > Sort of unrelated, but something I'd like to also bring up once
> > again.
> > Since now that if we do proper distribution / branch
> > tracking. And the package downgrades happen at times. I'm wondering
> > if we should make the package version "informative" only. And use
> > the build_time to decide which package is the "preferred version".
> > In most cases it is the latest built package from the repository we
> > want to be using.
>
> If a package (or admin) declares a specific version dependency, we
> should prefer it unless we supply --available. A problem with using
> build_time as the automatic preference occurs in the case where a user
> mixes repositories without correctly pinning them. In the present
> case, the highest version matching the declared dependencies will
> always be preferred. If we switch to build_time, this case may result
> in versions being mixed if a security update happens in an old release
> that isn't simply a version bump.

The intent was not to override a versioned world dependency or pinning.
But yes, it would be a problem to add multiple different repositories
without a pinning tag.

I wonder if we should make pinning an explicit requirement when mixing
branches. Allowing mixing makes people assume that it's supported, and
we've already got several bug reports on this. It's becoming a FAQ
that yes, pinning might work between edge/latest-stable for a while.
But it's more intended for the testing repo. If the ABI of a library
changes during development, you cannot mix the produced packages
anymore.

> > Alternative would be introduce some sort of concept similar to
> > debian/pacman package "epoch".
> >
> > Though another way to look at it that the buildtime is the
> > automatically generated epoch number. :)
>
> I suspect the epoch concept is actually *not* the right way to go,
> which is why I have never been in favor of it. I suspect what we need
> to do is have repository weighting, and use the weightings to
> determine which version should be selected instead of blindly taking
> the highest version.
> That way we can say edge has highest preference,
> but 3.11 has moderate preference, when calculating the upgrade
> transaction(s).

Ok, this is another interesting approach. We could embed metadata
about a repository's preference in its index. Perhaps even in the
packages.

Perhaps we first need to formulate a little better how we want the
repository and package preference to work, before we go and figure out
what to put in the packages.

Timo
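
To make the repository weighting idea from the end of the thread a bit
more concrete, here is a minimal sketch in Python. It is not how apk's
solver works, only an illustration. The names Candidate, REPO_WEIGHT
and pick() are hypothetical, the weight values are made up, and
versions are simplified to integer tuples rather than real apk version
ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple  # simplified stand-in for a real apk version
    repo: str       # repository (branch) the package comes from


# Hypothetical weights: a higher number means a more preferred repository.
REPO_WEIGHT = {"edge": 100, "v3.11": 50}


def pick(candidates, weights=REPO_WEIGHT):
    """Prefer the highest-weighted repository; version is only a tie-break."""
    return max(candidates, key=lambda c: (weights.get(c.repo, 0), c.version))


candidates = [
    Candidate("openssl", (1, 1, 1, 5), "v3.11"),  # higher version, lower weight
    Candidate("openssl", (1, 1, 1, 4), "edge"),   # lower version, higher weight
]

chosen = pick(candidates)
print(chosen.repo, chosen.version)
```

With these assumed weights the edge candidate is selected even though
the 3.11 one carries the higher version, which is the difference from
blindly taking the highest version. Unknown repositories get weight 0
here, which is just one possible default.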