Received: from mx1.tetrasec.net (mx1.tetrasec.net [74.117.189.118])
    by nld3-dev1.alpinelinux.org (Postfix) with ESMTPS id 61FBD781B4D
    for <~alpine/devel@lists.alpinelinux.org>; Thu, 16 Jan 2020 14:19:59 +0000 (UTC)
Received: from mx1.tetrasec.net (mail.local [127.0.0.1])
    by mx1.tetrasec.net (Postfix) with ESMTP id C4E392DE409E;
    Thu, 16 Jan 2020 14:19:56 +0000 (UTC)
Received: from ncopa-desktop.copa.dup.pw (67.63.200.37.customer.cdi.no [37.200.63.67])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256)
    (No client certificate requested)
    (Authenticated sender: alpine@tanael.org)
    by mx1.tetrasec.net (Postfix) with ESMTPSA id 0D04A2DE3B3B;
    Thu, 16 Jan 2020 14:19:54 +0000 (UTC)
Date: Thu, 16 Jan 2020 15:19:47 +0100
From: Natanael Copa
To: Timo Teras
Cc: ~alpine/devel@lists.alpinelinux.org
Subject: Re: apk-tools plans
Message-ID: <20200116151947.63f7ade8@ncopa-desktop.copa.dup.pw>
In-Reply-To: <20191229181150.7a0dcace@vostro>
References: <20191203180717.0016af8a@vostro>
    <20191229181150.7a0dcace@vostro>
X-Mailer: Claws Mail 3.17.4 (GTK+ 2.24.32; x86_64-alpine-linux-musl)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Hi!

On Sun, 29 Dec 2019 18:11:50 +0200
Timo Teras wrote:

> Hi,
>
> On Tue, 3 Dec 2019 18:07:17 +0200
> Timo Teras wrote:
>
> > Another thing, I really want to improve is the security of 'apk audit'
> > and the system integrity checking. The concept is to create and store
> > signed file manifests in the DB that can be used to establish strong
> > trust in a system. (The current 'apk audit' was designed for 'lbu
> > commit'.)
> >
> > Slightly related is also changing the file formats so that signature
> > checking can be done first without much parsing to make the attack
> > surface smaller.
> > (The old design was motivated by restrictions from
> > the original shell script based Alpine package manager; the signatures
> > were not considered as first class citizen back then.)
> >[snip]
> > Unfortunately the above changes cannot be fixed easily without
> > changing to binary file formats.
> >
> > The primary target for me is to redo the binary apk package and index
> > formats for next Alpine release. We can discuss the exact details in
> > this or another thread later in the coming weeks. Main design being
> > security and speed. The idea is that index will be mmap:able; and the
> > structures from .apk can be directly copied to installed-db.
>
> So I've made quite a bit progress with this. The work-in-progress
> branch is now available at:
>   https://gitlab.alpinelinux.org/alpine/apk-tools/tree/v3.0-wip
>
> This is still early preview code. The intent is to show at this point
> on how I'm planning the file format to be like and how the signing is
> to work. There's still open ends and the "schemas" are still volatile.
>
> So far the code contains the basic encoding and decoding of the file
> format as well as signing and verification of it. It's made pretty
> generic and the signing can already do RSA/ECC/etc based on what type of
> private key is given.

Speaking of signing and verification: one of the features that I
appreciate in the current design is that we calculate the checksum
in-flight, while waiting for the network (or disk) IO.

I recently learned about blake3[1], which I find very interesting. As I
understand it, it supports streaming, which means that it can detect a
hash mismatch before it has received all the data. And it is *fast*.

[1]: https://www.infoq.com/news/2020/01/blake3-fast-crypto-hash/

> From technical point of view the format is first a container layer
> basically Tag-Length-Value blobs. The main blocks are to be the
> "database", "signatures", and for packages the "files" section.
>
> The "database" section is mostly resembling flat buffers format.
> Basically it's a hierarchical object tree. The intent is to have enough
> information to make deep copies without schema, but to pretty-print it
> you'd need the schema. This is a trade of chosen to keep the repetitive
> field type encodings out, but to allow some generic functionality to be
> written without schema knowledge.
>
> As said, the main motivate for this work is to allow mmap() access, and
> fairly trivial signature verification code. Additionally the format is
> designed so that the packages' signed data dump can be just trivially
> copied into the installed database. So there will be support to copy
> "signed database" blobs to be inside another database. This is the key
> to strong audit trail so that installed-db is just copy of these blobs.
> And to do all this without much parsing so that accesses are fast
> enough.

This means that the index should not be stored compressed locally.
Maybe we could decompress it while fetching it from the network.

> I will be working to finish up the schemas of the "package", "index"
> and "installed database" formats next. And then start working on the
> new format tools to create packages.
>
> Plan is to move from 'abuild' to 'apk-tools' the intelligence to
> construct packages, and manage repositories. So in future 'abuild'
> would just call 'apk mkpkg' or similar with a description file what
> needs to go in. This should be helpful if some other distributions
> choose to use apk but want to integrate it to their build scripts.

This is very good. I guess it makes sense to split the mkpkg
functionality out into a separate binary, so we can separate the
build-time tools from the runtime tools and keep the runtime smaller.

> Additionally, we are hoping to put all the repository management code
> to apk-tools. There it will be more simpler to implement features such
> as "keep superceded packages in repository for at least X days before
> deleting them".

This would be awesome. Or keep N versions of packages from each aport
(origin). That way we could have the current and previous versions.
Useful for rolling back kernel updates, for example.

> Feedback welcome. Though, better to concentrate to the architectural
> and overview of how things - instead of nitpicking style and/or minor
> issues on the commit as it's still pretty volatile.

I think it has been mentioned before, but it would be nice if we could
have 2 operational install modes:

- quick: in-the-air extraction/verification of packages (current style)
- safe: store all packages locally and verify before trying to extract
  them

Safe mode is useful when the network connection is unreliable. In case
of a network error it could continue where it left off last time,
rather than try to fetch it all from scratch. Something like
`apk upgrade --fetch-first` or `apk upgrade --safe`.

I also think it may be a good time to look over the options and try to
make things more consistent. For example, `apk info` operates on both
locally installed packages and the cached index. Maybe we could have an
applet that only works on installed packages (for example, list
contents) and a separate one that only queries the cached index(es). Or
maybe a flag for it.

Also, `apk info` uses `--subcmd`, while `apk cache` does not have double
dashes: `apk cache subcmd`.

Some of the tools are designed for scripting, while some are designed
for human friendly output. It would be nice to have some common flag for
that. Personally I prefer script friendly output by default, with a flag
for human friendly output, like `--pretty`, which could be the default
if stdout is a tty or similar.

Thank you for working on this! I find it very exciting!

>
> Timo
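
PS: To make the `--pretty` idea above a bit more concrete, here is a
rough sketch in Python. It is only an illustration and not apk-tools
code: the function names, the "pretty"/"script" mode strings and the
sample package are all made up for this example. An explicit flag wins;
without one, the output is human friendly only when it goes to a tty.

```python
import io
import sys
from typing import Optional, TextIO


def choose_output_mode(pretty: Optional[bool] = None,
                       stream: Optional[TextIO] = None) -> str:
    """Pick "pretty" or "script" output (hypothetical mode names).

    pretty=True/False models an explicit flag such as --pretty;
    pretty=None means no flag was given, so fall back to tty detection.
    """
    if stream is None:
        stream = sys.stdout
    if pretty is not None:
        return "pretty" if pretty else "script"
    return "pretty" if stream.isatty() else "script"


def format_package(name: str, version: str, mode: str) -> str:
    """Render one package line; the layout is an assumption for the demo."""
    if mode == "pretty":
        # Aligned columns for humans.
        return f"{name:<20} {version}"
    # Tab-separated fields, easy to split in scripts.
    return f"{name}\t{version}"


if __name__ == "__main__":
    # "example-pkg" and "1.0-r0" are sample values, not real data.
    mode = choose_output_mode()
    print(format_package("example-pkg", "1.0-r0", mode))
    # A non-tty stream (pipe, file, StringIO) selects script output.
    print(choose_output_mode(stream=io.StringIO()))
```

With something like this, piping the output into another program gives
the script friendly format automatically, and an explicit flag can still
force either format.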