Date: Thu, 30 Apr 2020 16:20:51 +0200
From: Natanael Copa <ncopa@alpinelinux.org>
To: Rasmus Thomsen <oss@cogitri.dev>
Cc: ~alpine/devel@lists.alpinelinux.org
Subject: Re: Enforcing patch headers for patches
Message-ID: <20200430162051.10a7d0ee@ncopa-desktop.copa.dup.pw>
In-Reply-To: <37d426d6246dbf7702e86eac43fd393407ec6e28.camel@cogitri.dev>
References: <4ac4d3aad3bf0f050070166b27a2344efe32e9b7.camel@cogitri.dev>
	<20200430102823.11d1ed66@ncopa-desktop.copa.dup.pw>
	<37d426d6246dbf7702e86eac43fd393407ec6e28.camel@cogitri.dev>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Thu, 30 Apr 2020 12:45:40 +0200
Rasmus Thomsen <oss@cogitri.dev> wrote:

> On Thu, 2020-04-30 at 10:28 +0200, Natanael Copa wrote:
> > What should we set it to if the patch is suitable for upstreaming but
> > not yet submitted/reported upstream?  
> 
> Then you should upstream it and add the Url to the patch. The "I'll
> upstream it sometime later" thing often times doesn't work out and we
> end up having yet another downstream patch.

I agree that the best thing is to upstream patches, but...

I'll be honest with you. I have upstreamed patches for a decade. I am
tired. Specially when I have something that needs to get done
relatively quick (lets say a security patch). But someone else broke
the builders. It is not uncommon that I fix whatever is blocking to get
my own work done, and I don't have time or energy to find out how to
upstream (eg register to yet another mailing list or create another
user in bugzilla, google how to use mercurial) and prepare the patch in
an upstream friendly way (eg make sure it does not change behavior on
glibc) and test that it didnt break BSDs and glibc and then spend a
week on following up discussions with upstream. If this is enforced,
then I will revert stuff that breaks. (which does not always work
either, because old version may not build either in case breakage was
caused by other component such as compiler).

This does not mean that I never upstream patches. I often do. I'm just
tired and would appreciate help with upstreaming patches.

I just don't want it to be a hard requirement.

> > What if we are unsure if its suitable for upstreaming?  
> 
> Then you should ask upstream (or someone else who knows it, e.g. the
> reviewers of your MR) if it's suitable for upstreaming.

A lot of patches solves immediate issues for alpine or musl, but need
special care to be portable and not break GNU libc. And it is not
uncommon that upstream don't understand their own code or the problem
at hand. For example ruby main stack size calculation patch. their
comments indicate that they don't understand the problem nor the fix,
so I neeed to rearrange the patch to make it more *visible* that it
does not change behavior on glibc.

https://bugs.ruby-lang.org/issues/14387 (I have already prepared an
updated patch but I dont think I have taken the time to submit it yet).

What I'm trying to say is that sometimes even small and simple patches
that does not cost us much to carry can require significant effort to
upstream.


> > What happens if it was proposed upstream but later rejected or
> > reverted (eg. upstream finds ou they don't want support musl at all)  
> 
> Then we'll have to keep it downstream, but because we have a "Url"
> field we can track what happened to the thing, e.g. because upstream
> commented on the issue we linked that they don't want to support musl.
> The patch doesn't have to be updated then, that'd be a lot of effort,
> but I don't think that this will happen too often.
> 
> > What if we lift patches from debian or fedora and they have different
> > syntax for similar do we need to convert it to our format?  
> 
> Hm, although I think it'd be best if everything followed that format,
> so you don't have to spend too much time figuring out where to find
> information about a patch as reviewer/future maintainer/etc. I
> understand that that might be too much effort. I think if the
> Debian/Fedora patches are similiar enough (as in they provide the
> information we need, namely upstream status, an URL to track progress
> and an explanation) we can use them too.
> 
> > I am afraid of needing to change the upstreaming state of the patch
> > when it changes or that the specified upstreaming state information
> > getting outdated. This info becomes useless if it cannot be trusted.  
> 
> That's what the URL is for - if you want more info you can easily look
> it up by the URL value. Here is an example:
> 
> Upstream: Yes, pending
> Url: https://github.com/ximion/appstream-generator/pull/75
> Reason: Add support for scraping .apk packages for .desktop and
> .appdata.xml files
> 
> Once this PR gets merged we don't have to touch the patch - interested
> parties can just check the Url. After thinking about it for a bit,
> maybe we could live with just an Upstream and a Reason field, like so:
> 
> Upstream: Yes, pending: 
> https://github.com/ximion/appstream-generator/pull/75
> Reason: Add support for scraping .apk packages for .desktop and
> .appdata.xml files
> 
> The "Upstream" field isn't necessarily meant to indicate that a patch
> has been accepted upstream, it's just an indicator that upstream knows
> about the patch. This is the case when we opened an MR about it or we
> cherry picked a commit from upstream etc.

What I don't want is keep track of the upstream state and update the
patch file everytime upstream state changes. For example if the pending
patch gets applied, then I don't want be expected to go back to the
patch in our aports tree and change "Upstream: pending" to "Upstream:
applied". Rather, I just want let it be til next time I update the
package at which time the patch fails to apply. I can then look at the
upstream state (git log, issue tracker whatever) and remove the patch
in our tree.

So I don't think "pending" is useful info. But an URL to upstream state
is definitively useful.
 
> > May be good to have some indication for patches that are not
> > upstreamable and alpine specific. Maybe this is what you mean with
> > "Upstream: no"  
> 
> Yes, although it should be "Upstream: No, because $reason" :)

Is "Upstream: No, because I don't have the energy to deal with upstream
this week" acceptable? :)
 
> > There may be various different URLs. Where the patch comes from, URL
> > to
> > bug tracker, URL to other distros bugtracker or mailing list
> > discussions.  
> 
> I think the URL most relevant to the patch should be used. E.g. if the
> patch has been submitted upstream but not yet merged, link to the MR.
> If the patch is cherry picked from upstream you can link to the
> commit. If it's from a mailing list, link to that, etc.
> The Url: field isn't meant to inform the reader about why a patch is
> applied (that's what Reason: is for), it's meant so people looking at
> the patch at a later point can check what the progress of getting the
> patch upstreamed is or if there are new versions fo the patch.

What I mean is that we may want to have:

Upstream-bug: ...
Upstream-commit: ...

Instead of: 

URL: url-to-bugtracker
URL: url-to-commit


> > I agree that having some reference where the patch comes from, and
> > where to find upstream status is useful. I try (tried?) add that to
> > patches whenever I can.  
> 
> Yup, I've seen a few patches which already use a similiar scheme, but
> since we don't enforce it currently we still have loads of patches
> which don't have this and it's a lot of work to search for the
> information that is easily accessible from the patch header later on.

-nc