From: "Ariadne Conill"
To: ~alpine/devel@lists.alpinelinux.org
Date: Thu, 23 Jan 2020 15:13:43 +0000
Subject: Lets talk about apk-tools 3, and apk-tools in 2020 in general
Message-ID: <1c4796e0cda2248c2de159d4d467421c@dereferenced.org>

Hello,

I am writing this email today to discuss the proposed changes to
apk-tools, in the context of trying to include all potential
stakeholders so that we can have a discussion about the future
of apk-tools, capture the conclusions made and drive them forward
in the sense of actionable changes.

Timo announced back in December that he was pursuing some new
development on apk-tools, which would become the apk-tools 3
branch.
The proposed changes are bold and forward-thinking,
intended to allow apk-tools to scale to the growth we expect
Alpine and other APK-based distributions to have in this decade.

To be clear: absolutely nothing is set in stone. The apk-tools
3 tree may be published and yet no distribution, including
Alpine, may actually use it. Many people have raised concerns,
off the record and amongst themselves, about the scope and depth
of the proposed apk-tools 3 changes, so I believe it is important
to step back and have a conversation that identifies all
stakeholders, so that we may understand the full requirements and
use cases for apk-tools. This will allow us to ensure that
apk-tools 3 is a success for everyone involved.

In order to make actionable decisions, I believe it prudent to
approach this with a little bit of background and discussion of
the pros and cons of the proposed apk-tools changes, so that we
can come to a conclusion as to what we want to do in order to
move forward.

First, some background: there are two primary kinds of data that
apk-tools manipulates: the package databases (installed db and
indices) and packages themselves. Package databases are
presently stored in a compressed tar stream, as are packages.
Tar streams are good for packages, but as presently used by
APK, not very good for databases, because the APKINDEX.tar.gz
and friends only contain a couple of files instead of storing
the object tree directly in the tar stream. What Timo is
proposing in the v3.0-wip branch is to replace the tar streams
with a unified container format that is sufficient for storing
both packages and databases. This would also change the way
data is stored in the database so that the database is serialized
directly into the container.
However, it is important to
realize that we could accomplish that same serialization,
including mmap-based random access, with tar streams.

There are some pros to the approach taken in the v3.0-wip tree:

* A truly unified database and package format means that we
  ultimately have less code to audit and maintain.

* mmap-based random access will significantly improve
  performance, especially for embedded systems.

There are also some cons to this approach:

* Changing the format in such a radical way brings significant
  risk. The tar streams code has already been audited and a
  few CVEs have been fixed over the years. Throwing that out
  means we start over again, possibly reintroducing variations
  of bugs we have already fixed. Many stakeholders have said
  privately that they would rather not have exposure to this
  risk and would prefer a more conservative approach.

* Compression of data will have to happen *inside* the container
  for mmap-based random access to work efficiently.

* Building on the last point, exposure of the container in a way
  that allows it to be used for mmap-based random access makes
  it a desirable target for tampering. The current signature
  verification scheme of signing only the control section will
  be insufficient here, as an attacker could trivially generate
  a modified container that explicitly attacks the parsing code.
  Work will most certainly need to be done in the area of tamper
  resistance before people will be enthusiastic about mmapping
  data they fetched from the internet. At the very least,
  use of HTTPS for all package fetches will become a hard
  requirement, whereas the current format is tamper-resistant
  and its tamper resistance has been improved over the past
  decade.

* Usage of a unified container format for package data and
  database data removes transparency from the current package
  format.
  Right now, an APK package can be manipulated with
  the tar command if a user wishes to know its contents. Using
  the package manager is not even required.

There are other changes that people are concerned about, such
as being able to compose new repositories from pre-existing
ones. While those are important discussions to have as well,
we are not really discussing them here, as those concerns can
easily be overcome. Overall governance of the apk-tools
project itself is also not necessarily being discussed here.
We do need to have that discussion, as there are many
non-Alpine stakeholders at this point, but we can do that later.

Ultimately, from the perspective of apk-tools maintenance,
we need to come to a conclusion on how to improve the
scalability of the package manager as our consumers
(distributions like Alpine, Adelie, Abyss and possibly soon
Yocto and other opkg consumers who are looking at switching)
are faced with growing package indices. The motivation is
to improve the package manager so that it can work well
with the requirements we believe distributions will have in
this decade.

In 2010, Alpine had a few thousand packages. Nowadays, we
have almost 20k. That is starting to approach the size of
distributions like Debian, which have roughly 50-60k. It
is clear that we need to re-evaluate some scalability
choices we are quickly outgrowing. Timo should be applauded
for starting that process.

I look forward to hearing everyone's thoughts on this, so
we can decide how to move forward for this development
cycle and beyond!

Ariadne
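
To illustrate the transparency point above (a tar-based package can be
inspected with ordinary tar tooling, without the package manager), here
is a minimal Python sketch. It builds a small gzip-compressed tar
archive in memory as a stand-in for a package and lists its members
with the standard tarfile module. The helper names, and the file names
and contents placed inside the archive, are assumptions made purely for
illustration and are not taken from apk-tools.

```python
import io
import tarfile


def build_demo_package():
    """Build a tiny gzip-compressed tar archive in memory.

    This is only a stand-in for a package; the member names and
    contents are illustrative placeholders.
    """
    members = [
        (".PKGINFO", b"pkgname = demo\npkgver = 1.0-r0\n"),
        ("usr/bin/demo", b"#!/bin/sh\necho demo\n"),
    ]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(blob):
    """Return the member names of a gzip-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    for name in list_members(build_demo_package()):
        print(name)
```

The same listing is what a user would get from the tar command on a
tar-based package; a custom container format would need its own
tooling to offer an equivalent view.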