Page MenuHomeFreeBSD

Improved logic to generate a set of patches from *.orig files under ${PATCH_WRKSRC}
AbandonedPublic

Authored by marino on Aug 11 2014, 10:37 AM.

Details

Reviewers
danfe
antoine
bapt
bdrewery
Group Reviewers
portmgr
Summary

Several committers are unhappy with current state of affairs WRT makepatch target. Particularly, it was hardcoding double underscore (__) as path separator which was making patch file names unnecessary long and particularly ugly.

Another problem was the patch contents itself: timestamps were recorded relative to local timezone, which resulted in gratuitous changes (repo churn). Proposed implementation addresses all these problems.

By default, we propose '-' (dash, minus, hyphen) be to used to replace path separators in patch file names; it can be changed by setting PATCH_PATH_SEPARATOR variable, but we only allow single [-+_] characters.

If a file name happens to contain character which is also a separator replacement character, it will be doubled in the resulting patch name.

To minimize gratuitous patch renames, newly generated patches will be written under existing file names if they use any of the allowed path separators ([-+_]) or legacy double underscore (__).

This is version 10 of the patch. Previous (buggy) versions are available at here under makepatchX.diff names.

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
danfe updated this revision to Diff 1033.Aug 11 2014, 10:37 AM
marino accepted this revision.Aug 11 2014, 11:23 AM
marino added a reviewer: marino.

I vote for a default separator of a single underscore rather than a hyphen, but I'm fine going with what the majority decides. I was involved in the review of the code so I approve. :)

This revision is now accepted and ready to land.Aug 11 2014, 11:23 AM
antoine requested changes to this revision.Aug 11 2014, 12:38 PM
antoine added a reviewer: antoine.

patch fails to apply

This revision now requires changes to proceed.Aug 11 2014, 12:38 PM
danfe updated this revision to Diff 1042.Aug 11 2014, 1:59 PM
danfe edited edge metadata.

r363693 interfered with the patch; I've rerolled.

mat added a comment.Aug 11 2014, 3:04 PM

I think the default separator should be the underscore too.

danfe updated this revision to Diff 1044.EditedAug 11 2014, 3:19 PM
danfe edited edge metadata.

Due to some demand, provide an updated version which is using underscore instead of dash as a default path separator. Remaining logic is the same.

However, for the record: let's find out what is more common to see in file names, - or _ ?

$ find /usr/ports -type f -name patch-\* | xargs sed -n \
  '/^+++/s|\([[:blank:]][-0-9:.+]*\)*$||p' | grep -c -- -
2594

$ find . -type f -name patch-\* | xargs sed -n \
  '/^+++/s|\([[:blank:]][-0-9:.+]*\)*$||p' | grep -c -- _
4844

These are not perfectly accurate numbers, but they are close enough to the reality for our needs. It looks like '_' is more common, that's why I think that '-' (dash) is a better choice for PATCH_PATH_SEPARATOR. That said, let's listen for others' opinions.

@marino had also mentioned that '_' is common in pkgsrc, but AFAICT both Gentoo and Red Hat use '-' (dashes) for their patches. Since we sometimes pull patches from them, it looks like a point to consider as well.

As an update, I'm using "make makeplist" locally right now and I'm really pleased with the results. I'll definitely be using this as soon as it's committed.

bdrewery requested changes to this revision.Aug 15 2014, 2:19 PM
bdrewery added a reviewer: bdrewery.
bdrewery added a subscriber: bdrewery.

You quote timestamps as causing repo churn, yet want to change the naming scheme for all patches for no good reason. "It is ugly" is not good enough for causing more busy work for all porters and yet another gratuitous change in policy that everyone must be aware of.

This revision now requires changes to proceed.Aug 15 2014, 2:19 PM

"You quote timestamps as causing repo churn, yet want to change the naming scheme for all patches for no good reason.

This means you didn't read the changes closely enough.
Existing patches will not be renamed. The existing names will be used. We were adamant that this be a requirement. So the above criticism is invalid.

"It is ugly" is not good enough for causing more busy work for all porters and yet another gratuitous change in policy that everyone must be aware of."

To what extra work are you referring?
What policy has to be known? The average person uses "make makepatch" exactly as before and doesn't have to be aware of anything new.

I don't understand either of these criticisms really.

The existing ones not changing does not change that with this patch we will have an arbitrary separator of , _, +, -, depending on what the committer likes. Inevitably someone will go "fix" all patches to use the new default of _, or complain when someone commits a file with because it does not match the new default of _, or a bikeshed will ensue when someone uses +.

Not everyone is even aware of makepatch, so yes there is a naming convention/policy.

Phabric ate the double underscore in my comment.

"The existing ones not changing does not change that with this patch we will have an arbitrary separator of , _, +, -, depending on what the committer likes."

The truth is most people won't know there is an option. The default (currently an underscore) will be the one vastly used. I proposed only have a single one to remove "choice" essentially to address this concern, but it was though that an alternative was needed for specific ports.

Inevitably someone will go "fix" all patches to use the new default of _,

We specifically do that want that. Presumably somebody would look for a blanket to do that, portmgr should deny that. We know who the committers that like to sweep the tree are, we can tell them directly.

or complain when someone commits a file with because it does not match the new default of _, or a bikeshed will ensue when someone uses +.

Maybe a blurb in documentation will establish any of the three are legal and within the same port a mixture can be used. Then point to documentation and policy over.

Not everyone is even aware of makepatch, so yes there is a naming convention/policy.

The conversation started a year ago because there was no written policy. With this, you baseline one.

Talking the naming out of the equation, the timestamp changes alone are worthy of modifying makepatch.

danfe added a comment.Aug 15 2014, 4:25 PM

I concur with @marino on all accounts. There won't be diversity in path separators: most people simply do not care and will use whatever the default is. All supported separators (three of them) are already documented in PHB (section 4.4.1): we're not breaking any rules here. Last but not least, ability to alter separator for a specific port can (and often is, per my experience) handy.

Lots of folks make useless, gratuitous commits systematically, all the time, but very few of us happen to be concerned. Proposed makepatch implementation is definitely an improvement, it is more or less well designed, it went through a round of iterative reviews, me and John are already using it and works as expected. Previous makepatch was simply not good enough to be widely used. Now I expect it will gradually improve situation with our patches, without provoking massive patch rename sprees across the tree, even amongst those who love making sweeping changes when they have a chance.

Try it out, it is really not as scary as it might seem. ;-)

tijl added a subscriber: tijl.Aug 17 2014, 3:31 PM

If a file name happens to contain character which is also a separator replacement character, it will be doubled in the resulting patch name.

Can makepatch be made smart enough to select a different separator in this case? Doubling characters in the file name is worse than doubling characters in the path separator in my opinion.

In D582#29, @tijl wrote:

Can makepatch be made smart enough to select a different separator in this case? Doubling characters in the file name is worse than doubling characters in the path separator in my opinion.

This is where we suffer for not having the several iterations for discussions from before.

The doubling is not arbitrary. It is a good attempt at minimizing ambiguity, e.g. multiple paths that have the same patch name if you don't do it.

E.g. ${WRKSRC}/alpha/bravo.c and ${WRKSRC}/alpha_bravo.c would have the same patchname without the doubling. It's there for a reason.

It's impossible to eliminate the possibility of a multi-to-same patchname due to the fact all separators can be used the the original filenames, but it make the possibility remote. This would be a good reason to choose a different separator actually.

tijl added a comment.Aug 17 2014, 7:10 PM
In D582#31, @marino wrote:
In D582#29, @tijl wrote:

Can makepatch be made smart enough to select a different separator in this case? Doubling characters in the file name is worse than doubling characters in the path separator in my opinion.

This is where we suffer for not having the several iterations for discussions from before.
The doubling is not arbitrary. It is a good attempt at minimizing ambiguity, e.g. multiple paths that have the same patch name if you don't do it.
E.g. ${WRKSRC}/alpha/bravo.c and ${WRKSRC}/alpha_bravo.c would have the same patchname without the doubling. It's there for a reason.
It's impossible to eliminate the possibility of a multi-to-same patchname due to the fact all separators can be used the the original filenames, but it make the possibility remote. This would be a good reason to choose a different separator actually.

I understand why you need doubling. I was asking if it's feasible to make makepatch a bit smarter. If it detects that doubling is needed with the current separator, let it try another separator first. If that too requires doubling, try yet another. Only if that too requires doubling, use the original separator with doubling.

Actually I don't understand why you are ok with doubling in file names but not with path separators. Doubling in file names is worse. I think I would prefer that makepatch simply tried "_" and "-" as separator and used whichever avoids ambiguities. If neither works use "__" or "--" as separator. Doubling of the separator is less confusing. I mean it's a separator, it separates two components. Adding more space to a separator makes more sense than adding space inside a component.

In D582#32, @tijl wrote:

I understand why you need doubling. I was asking if it's feasible to make makepatch a bit smarter. If it detects that doubling is needed with the current separator, let it try another separator first. If that too requires doubling, try yet another. Only if that too requires doubling, use the original separator with doubling.

This patch is already more complicated than we had hoped.

  1. we need to support legacy separates to avoid gratuitous renames and bikesheds
  2. we need to support 3 legal separates
  3. we need this "last seen" mechanism to do that
  4. We need to use a non-default legal filename if we find it.

In my opinion this "smarter" algorithm is diminishing returns and just makes debugging all that much harder. Plus if it's on a per-patch basis (rather than per port) it could cause confusion with the user (why did it use those ugly "+" symbols for that patch, I don't want that, I want it to look like the others!

Actually I don't understand why you are ok with doubling in file names but not with path separators. Doubling in file names is worse.

How do you figure? There are likely more subdirectories that characters in the filenames that have an underscore in them. With the current way with a 5 level deep filename, you are guaranteed 10 underscores (5 doubles). With this way, chances are it's less than 5 doubles. So on the average, not worse.

I think I would prefer that makepatch simply tried "_" and "-" as separator and used whichever avoids ambiguities.

It can't know that. Any given path could have dozens of combinations that could also have the same patchname, now or in the future. And putting in AI for that is way complex.

If neither works use "__" or "--" as separator. Doubling of the separator is less confusing. I mean it's a separator, it separates two components. Adding more space to a separator makes more sense than adding space inside a component.

How does that solve the issue? The ambiguity issue exists now too with the double underscores. You could make each separator 5 underscores and I could contrive two paths where they would produce the same patchname with the current makepatch.

tijl added a comment.Aug 17 2014, 7:56 PM
In D582#33, @marino wrote:

Actually I don't understand why you are ok with doubling in file names but not with path separators. Doubling in file names is worse.

How do you figure? There are likely more subdirectories that characters in the filenames that have an underscore in them. With the current way with a 5 level deep filename, you are guaranteed 10 underscores (5 doubles). With this way, chances are it's less than 5 doubles. So on the average, not worse.

Worse in the sense that if I see foo_bar__baz, I tend to combine bar with foo and not with baz.

I think I would prefer that makepatch simply tried "_" and "-" as separator and used whichever avoids ambiguities.

It can't know that. Any given path could have dozens of combinations that could also have the same patchname, now or in the future. And putting in AI for that is way complex.

Of course it knows. The logic is dead simple.

If no path name contains _, use _
Else if no path name contains -, use -
Else if no path name contains , use
Else if no path name contains --, use --
Else if no path name contains ___, use ___
Else if no path name contains ---, use ---
(etc, just loop)

It's extremely unlikely that you will ever need ___. If you add in +, it's even unlikely that you will ever need __

In D582#34, @tijl wrote:

Worse in the sense that if I see foo_bar__baz, I tend to combine bar with foo and not with baz.

Like anything, you get used to it. I see your point but I don't mentally misgroup them because I'm used to it.

Of course it knows. The logic is dead simple.
If no path name contains _, use _
Else if no path name contains -, use -
Else if no path name contains , use
Else if no path name contains --, use --
Else if no path name contains ___, use ___
Else if no path name contains ---, use ---
(etc, just loop)

You're not thinking through to the ramifications. There are logical problems with this (beyond the unnecessary additional of complexity). I'd say its also less intuitive than what's proposed from a qualitative point of view.

tijl added a comment.Aug 17 2014, 8:20 PM
In D582#35, @marino wrote:

Of course it knows. The logic is dead simple.
If no path name contains _, use _
Else if no path name contains -, use -
Else if no path name contains , use
Else if no path name contains --, use --
Else if no path name contains ___, use ___
Else if no path name contains ---, use ---
(etc, just loop)

You're not thinking through to the ramifications. There are logical problems with this (beyond the unnecessary additional of complexity). I'd say its also less intuitive than what's proposed from a qualitative point of view.

Please elaborate. It's impossible to counter arguments that you don't present.

tijl added a comment.Aug 18 2014, 9:38 AM
In D582#34, @tijl wrote:

If no path name contains _, use _
Else if no path name contains -, use -
Else if no path name contains , use
Else if no path name contains --, use --
Else if no path name contains ___, use ___
Else if no path name contains ---, use ---
(etc, just loop)
It's extremely unlikely that you will ever need ___. If you add in +, it's even unlikely that you will ever need __

I did some simple analysis. There are 32597 path names (lines matching ^+++) in 25559 patch files (files matching ^+++). I then put all path names in one string per port.

1485 ports match _
1034 ports match -
35 ports match +

So + would be the most obvious choice for path separator but assuming that nobody likes that - is the next best choice.

287 ports match both - and _
6 ports match all of -, _ and +
1 port matches -- (it can use _)
24 ports match __ (they can use either - or +)

With only 6 ports that would require doubling I don't want to spend much more time and energy on this. My preference is that doubling (and tripling etc) happens on the path separator and not on file names because it matches the intuitive interpretation of foo_bar__baz that says that bar belongs with foo and not with baz.

I've put the file with patch paths (one line per port) online at http://people.freebsd.org/~tijl/patchpathsperport.xz in case someone wants to run some of their own tests (e.g. awk '/--/ && /_/ && /\+/' < patchpathsperport | wc -l)

Can you live with either underscore or hyphen as the default separator and doubling when separator is part of the filename. I am *very* much against a morphing/iterative name algorithm.

I skipped the "?". That was a question not that statement.

tijl added a comment.Aug 18 2014, 10:36 AM
In D582#38, @marino wrote:

Can you live with either underscore or hyphen as the default separator and doubling when separator is part of the filename. I am *very* much against a morphing/iterative name algorithm.

Again, arguments, you dislike it because?

I don't want to argue. I have technical reasons, I said the implications were obvious if you thought about it but you clearly want to counter everything I say ("Please elaborate. It's impossible to counter arguments that you don't present."). You want me line it up so you can shoot it down.

let's just skip all of that.

  1. the technical changes to the timestamps are badly needed . I guess you agree since aren't talking about that.
  2. the name has turned into the bikeshed everyone feared. So I ask again, Can you live with the proposal as is? For my part, I can't live with the morphing/iterative thing and yes, I have my reasons.
tdb added a subscriber: tdb.Aug 18 2014, 10:45 AM

Just throwing my comments in here. Firstly, I'm not too fussed what the separator is, but personally I think __ is clear enough to read and I don't think "it looks ugly" is really enough justification to change it.

The timezone fix is a no brainer, of course.

In D582#28, @danfe wrote:

All supported separators (three of them) are already documented in PHB (section 4.4.1): we're not breaking any rules here.

Slightly confused by this though. I read 4.4.1 as clearly saying __ must be used. The fourth bullet point is just saying what characters can be used in a filename, not what should be used as a separator.

Slightly confused by this though. I read 4.4.1 as clearly saying __ must be used. The fourth bullet point is just saying what characters can be used in a filename, not what should be used as a separator.

I beg to disagree. What it says right now is:

  • When creating names for patch files, replace slashes (/) with two underscores (__).
  • Only use characters [-+._a-zA-Z0-9] for naming patches.

Notice: no explicit must. Yes, we essentially propose changes to the 1st item. We respect (and not amending) allowed charset (2nd item). I do read naming patches as including the character used as path separator.

danfe added a comment.EditedAug 18 2014, 11:22 AM
In D582#40, @tijl wrote:
In D582#38, @marino wrote:

Can you live with either underscore or hyphen as the default separator and doubling when separator is part of the filename. I am *very* much against a morphing/iterative name algorithm.

Again, arguments, you dislike it because?

I believe John is worrying about making it overly complex, speaking of heuristic logic of adjusting path separator, potentially overriding user's choice and making code more cryptic.

Try to look at it from this angle: makepatch is a tool that should do simple work for you, since 80% of work is simple. There is no need to try to teach it to do remaining 20%, as it will very likely still suck at that even if you do teach it after all.

Even most sophisticated (or better, intricate) naming rules won't cover all cases. There are sometimes specific patches that are not mapped to a single filename (patch-gcc4x-fixes, for example); there are extra-* patches that you would have to deal manually after makepatch round, etc.

Current implementation is good enough. I'm sure everyone can name a port or imagine a situation when it will not shine, and either require change of separator character, or manual intervention of some kind. Oh, wait, but now you know what to do: I've just named it. :-)

In D582#39, @marino wrote:

I skipped the "?". That was a question not that statement.

You can edit your comments. Just click that little gear on the right and it'll pull down. ;-)

By the way, had this version of makepatch been in the system, maybe dmitry wouldn't be doing this gratuitous crap:

Author: amdmi3
Date: Mon Aug 18 13:37:34 2014
New Revision: 365311
URL: http://svnweb.freebsd.org/changeset/ports/365311
QAT: https://qat.redports.org/buildarchive/r365311/

Log:

  • Use canonical patch filenames
  • Fix SONAMEs and simplify libraries installation

Added:

head/devel/sfml1/files/patch-src__SFML__Audio__Makefile
   - copied, changed from r365260, head/devel/sfml1/files/patch-src-SFML-Audio-Makefile
head/devel/sfml1/files/patch-src__SFML__Graphics__Makefile
   - copied, changed from r365260, head/devel/sfml1/files/patch-src-SFML-Graphics-Makefile
head/devel/sfml1/files/patch-src__SFML__Makefile
   - copied unchanged from r365260, head/devel/sfml1/files/patch-src-SFML-Makefile
head/devel/sfml1/files/patch-src__SFML__Network__Makefile
   - copied, changed from r365260, head/devel/sfml1/files/patch-src-SFML-Network-Makefile
head/devel/sfml1/files/patch-src__SFML__System__Makefile
   - copied, changed from r365260, head/devel/sfml1/files/patch-src-SFML-System-Makefile
head/devel/sfml1/files/patch-src__SFML__Window__Makefile
   - copied, changed from r365260, head/devel/sfml1/files/patch-src-SFML-Window-Makefile

Deleted:

head/devel/sfml1/files/patch-src-SFML-Audio-Makefile
head/devel/sfml1/files/patch-src-SFML-Graphics-Makefile
head/devel/sfml1/files/patch-src-SFML-Makefile
head/devel/sfml1/files/patch-src-SFML-Network-Makefile
head/devel/sfml1/files/patch-src-SFML-System-Makefile
head/devel/sfml1/files/patch-src-SFML-Window-Makefile

Modified:

head/devel/sfml1/Makefile
head/devel/sfml1/pkg-plist
danfe added a subscriber: AMDmi3.Aug 19 2014, 2:52 AM

This bogus rename by @AMDmi3 is just atrocious. I find it ironic that Bryan and others worry about that proposed patch would inevitably cause gratuitous renames while those renames are already happening (and will keep happening because no one is stopping them) regardless of this particular patch we're discussing here.

I honestly don't see why is it easier to keep working code from landing than to keep people from making needless, stupid commits.

tijl added a comment.Aug 19 2014, 11:15 AM

I've reworked the patch last night and made some final touch ups this morning: http://people.freebsd.org/~tijl/b.p.m.makepatch.patch

I admit the awk script may be a little dense, but it certainly doesn't involve AI or heuristics. It builds up two lists, a list of .orig files and a list of paths from existing patch-* files. While it does that it determines an appropriate separator, i.e. a character sequence (-,_,+,--,__,++,---,etc) that does not appear in any of the paths. In the END section it removes existing patches that will be regenerated and for each .orig file it prints out the path without .orig suffix and the path of an existing or a new patch-* file. With this patch existing patch-* files are always reused, no matter what the name is and even if there are multiple diffs inside. There's one possible glitch if you create a patch-path-to-some-file that does not contain a diff for path/to/some/file. If you then try to create a patch for this file it will be appended to patch-path-to-some-file. I'm not sure if this is incorrect behaviour though.

AMDmi3 added a comment.EditedAug 19 2014, 11:33 AM

gratuitous crap
bogus rename is just atrocious

Please choose your wording. The rename was done according to 4.4.1. It declares patch naming style, my patches will be that style. If you don't like style, change handbook (I'd prefer patch-dir-dir-file.c style myself).

danfe added a comment.Aug 19 2014, 4:05 PM

@AMDmi3⇒ nothing (no rules) should contradict common sense; and PHB does not really mandate a particular patch naming scheme, only the charset: any sane patch naming is fine as long as it does not use explicitly prohibited characters. :: was forbidden for a reason: it prevented checkouts on some filesystems and required escaping in certain cases, and that's why it is not allowed. But as long as patch names are OK, renaming them is needless and gratuitous change that does not bring any good, especially if you like dashed names (like I do as well). I would probably not say a word if you had brought diversely named patches under common scheme, but merely changing one consistent and valid separator for another is beyond my understanding, sorry.

@tijl⇒ do you realize how complicated your code is? I'm not saying it does not work, you're one of our renowned developers, but I got lost in it. I don't think that it either 1) brings significant improvements over the proposed solution, or 2) will be accepted by the reviewers.

During the development of the patch I at one point had also come up with somewhat complicated (although not as far as yours) version that attempted to do conditional escaping (one can think of it as adaptive naming scheme), and eventually had to admit that added complexity is just not worth it. Current (proposed) solution hits the desired balance pretty well. I'm not saying it cannot be made smarter, but the curve bends so steeply it becomes impractical.

tijl added a comment.Aug 19 2014, 6:11 PM
In D582#52, @danfe wrote:

@tijl⇒ do you realize how complicated your code is? I'm not saying it does not work, you're one of our renowned developers, but I got lost in it.

Yes, it got more complicated than I anticipated, but let's see if I can break it down.

find -s * -type f -name '*.orig' | awk

This feeds a list of *.orig paths to stdin of awk. The awk command looks like this:

awk '...' - `find -s ${PATCHDIR} -name 'patch-*'` | while read f p

The arguments to awk are the awk program and then a list of files it will parse, starting with - (stdin) and then existing patch-* files. I had to use find instead of just patch-* because there may not be any existing patches and then patch-* doesn't expand correctly. The output of awk is sent to the while loop. The output consists of pairs <f,p> with f a file to generate a diff for and p a patch file e.g. path/to/file patch-path-to-file. The implementation of the loop is similar to what you already had.

The awk program consists of four parts BEGIN (just initialises the variable sep), FILENAME == "-" (parsing stdin), FILENAME != "-" (parsing an existing patch file) and END (producing output of pairs <f,p>).

The FILENAME == "-" case (reading a line from stdin) consists of two actions. The first one chops of ".orig" and stores the resulting path in the "list" array. The second one only executes if the path contains the current separator sep. In that case it iterates sep (any - becomes _, any _ becomes +, any + becomes - with an extra - appended).

The FILENAME != "-" case (reading a line from a patch file) consists of one action. It executes when the line starts with +++ and it stores the path on that line in the array "patches" together with the FILENAME of the patch. So for a path p, patches[p] returns the name of the patch that contains that path (awk arrays are associative).

The END section loops over "list" twice, once to remove existing patches and once to print pairs <f,p>. The patches have to be removed before any output is printed because the while loop starts producing patches as soon as it reads input. The second for loop prints each path from the "list" array together with an existing patch name if the path appeared in any, or with a new patch name that is the path with / replaced with sep.

I don't think that it either 1) brings significant improvements over the proposed solution, or 2) will be accepted by the reviewers.

My critique on the current patch is that it only solves the problem of __ if people set PATCH_PATH_SEPARATOR correctly. I'm not convinced that will be the case. I don't think people will know about it, won't forget about it and will care enough about the __ problem to do that.

If the current patch is committed at least change the default separator to -. It is a less common character.

Also, an advantage of my implementation is that it allows grouping multiple diffs in one file. It correctly regenerates the patches in security/libbeid for instance.

marino added a comment.EditedAug 19 2014, 7:12 PM
In D582#53, @tijl wrote:

My critique on the current patch is that it only solves the problem of __ if people set PATCH_PATH_SEPARATOR correctly. I'm not convinced that will be the case. I don't think people will know about it, won't forget about it and will care enough about the __ problem to do that.
If the current patch is committed at least change the default separator to -. It is a less common character.

The "problem" was that "__" was used for every separator. As a doubler, it was ok. I prefer double underscore over double hyphen for aethestic reasons. However, if the default character is the dealbreaker on the proposed code, I'll gladly agree to a hyphen as the default.

Also, an advantage of my implementation is that it allows grouping multiple diffs in one file. It correctly regenerates the patches in security/libbeid for instance.

This wasn't lost of me. I know you are a proponent of grouping patches for the same problem in one file and thus motivated by this. The proposed version separates those apart more or less intentionally which I believe the current philosophy (one patch, one file). Nobody has agreed that combining patches in a single file as a rule is a good idea, and it's extending the scope of the review.

danfe added a comment.EditedAug 19 2014, 7:54 PM

My critique on the current patch is that it only solves the problem of __ if people set PATCH_PATH_SEPARATOR correctly. I'm not convinced that will be the case. I don't think people will know about it, won't forget about it and will care enough about the __ problem to do that.

The idea is that most of the times people won't have to (re)set PATCH_PATH_SEPARATOR at all. Just like they do now: they simply run make makepatch without much thinking. The ability to change it is reserved for ports that already widely use different character (in this case committer should notice that many patches are getting renamed and adjust it to minimize the repo churn), or if a particular set of patch files would actually, really benefit from a non-default character.

tijl added a comment.Aug 19 2014, 8:10 PM
In D582#55, @danfe wrote:

My critique on the current patch is that it only solves the problem of __ if people set PATCH_PATH_SEPARATOR correctly. I'm not convinced that will be the case. I don't think people will know about it, won't forget about it and will care enough about the __ problem to do that.

The idea is that most of the times people won't have to (re)set PATCH_PATH_SEPARATOR at all. Just like they do now: they simply run make makepatch without much thinking. The ability to change it is reserved for ports that already widely use different character (in this case committer should notice that many patches are getting renamed and adjust it to minimize the repo churn), or if a particular set of patch files would actually, really benefit from a non-default character.

But then you will get doubling of _ in file and directory names. I mean if one group of committers has the right to complain about __ in the path separator certainly another group of committers has the right to complain about __ in the path components.

In D582#48, @danfe wrote:

This bogus rename by @AMDmi3 is just atrocious. I find it ironic that Bryan and others worry about that proposed patch would inevitably cause gratuitous renames while those renames are already happening (and will keep happening because no one is stopping them) regardless of this particular patch we're discussing here.
I honestly don't see why is it easier to keep working code from landing than to keep people from making needless, stupid commits.

Adam fixed the patches to follow the proper convention and makepatch output. There is nothing bogus about it.

This patch is the epitome of bikesheds. Please focus your energy on things that matter.

In D582#57, @bdrewery wrote:

Adam fixed the patches to follow the proper convention and makepatch output. There is nothing bogus about it.

Dirty. He changed "::" which needed changing. Danfe wasn't talking about that, he was talking about dmitri (as an example) and everyone knows it.

This patch is the epitome of bikesheds. Please focus your energy on things that matter.

Dirty. First, you have NO RIGHT to tell people want to focus THEIR energy on. Secondly, since you don't care, concede to those that do and end it. You guys keep saying you don't care.

In D582#58, @marino wrote:
In D582#57, @bdrewery wrote:

Adam fixed the patches to follow the proper convention and makepatch output. There is nothing bogus about it.

Dirty. He changed "::" which needed changing. Danfe wasn't talking about that, he was talking about dmitri (as an example) and everyone knows it.

I'm not familiar with it.

I still stand by my argument that changing the default separator (and also allowing 3 others as OK) will lead to massive churn and more debate, wasted time, and eventually someone "fixing" it all.

This patch is the epitome of bikesheds. Please focus your energy on things that matter.

Dirty. First, you have NO RIGHT to tell people want to focus THEIR energy on. Secondly, since you don't care, concede to those that do and end it. You guys keep saying you don't care.

I believe I do. I'm a committer and must follow what we decide the convention is. I point back at removing indefinite articles from COMMENT. I was not around for that debate, but it's something we're still cleaning up and something all committers are supposed to remember and follow. It's more busy work for no benefit. It even goes against the literal line from upstream. Another pointless debate is needing / on URLs. These things are not productive. These things always end up getting thrown at new committers and giving them a hard time.

I support this patch minus 1. changing the default separator and 2. giving an ability to use a different separator depending what you like.

Please, let's stick to 1 separator. Yes, some people (and new committers) don't realize that it is double underscore and we have inconsistency, but why make it worse?

Adding in -p and setting TZ are very welcome to me.

In D582#59, @bdrewery wrote:

I still stand by my argument that changing the default separator (and also allowing 3 others as OK) will lead to massive churn and more debate, wasted time, and eventually someone "fixing" it all.

I don't know why, fixing it all it hasn't happened yet despite tons of disparity. The churn has already started and this is clearly a debate. Moreover the simple solution is explicitly state that portmgr will not condone regenerating patch names en-masse or as the only change to a patch that has a legal name. so it won't happen because it would be against the rules.

I believe I do. I'm a committer and must follow what we decide the convention is.

obviously this started because there is no convention other than what makepatch had in it. And who know how many people were involved in that. The discussion started because this defacto convention is terrible in some opinions. We want to establish a better convention.

I point back at removing indefinite articles from COMMENT. I was not around for that debate, but it's something we're still cleaning up and something all committers are supposed to remember and follow. It's more busy work for no benefit.

If you are talking about en-masse commits, I agree. If you are talking about "while we are here, fix the comment", I'm fine with it. But no, as the only reason for the commit it's not a good one.

It even goes against the literal line from upstream. Another pointless debate is needing / on URLs. These things are not productive. These things always end up getting thrown at new committers and giving them a hard time.

Same -- I'm fine with it as long as it's not the only reason for the commit.

I support this patch minus 1. changing the default separator and 2. giving an ability to use a different separator depending what you like.
Please, let's stick to 1 separator. Yes, some people (and new committers) don't realize that it is double underscore and we have inconsistency, but why make it worse?

Actually I am pro "one separator". I just don't want that separator to be double underscores (or plus signs). I can live with 3 but truthfully I prefer only one.

Adding in -p and setting TZ are very welcome to me.

it's a very good addition.

danfe added a comment.Aug 20 2014, 8:15 AM
In D582#56, @tijl wrote:

But then you will get doubling of _ in file and directory names. I mean if one group of committers has the right to complain about __ in the path separator certainly another group of committers has the right to complain about __ in the path components.

Everyone can complain, but as you can see, most of the folks do not care. They are happy with whatever character is picked, and even if they don't quite like it, like Dmitry, they still prefer to obey the rules/PHB rather than try to improve or change them.

Leaving aside your awk-based patch and technical comments (things which I consider constructive and fair), other counter arguments received so far are not convincing and IMHO are clearly outspoken by @marino. Let me summarize them again here, concisely:

  • You're changing rules, go read what PHB tells. We're not changing legal character set, and existing double-underscored patches are preserved. PHB is not your Bible, it's a moving target and should accurately reflect the reality of the framework instead of preventing changes because "things are already documented". More to it, many people (surprise!) simply do not (re)read the PHB.
  • Changing default separator (and also allowing 3 others as OK)... Three others are already OK: they are explicitly allowed in the patch files names. If I remember the history correctly, rigid charset was defined to as a remedy to :: spread few years ago, which long predates this crazy "always __, no thinking required" convention.
  • ... will lead to massive churn... Proposal implements seat belts to prevent unneeded renames. Patches (en masse) are volatile thing: they tend to come and go. I believe that once the new code lands, patch names will start to slowly drift towards new scheme, without creating massive churn that we all hate and would rather avoid.
  • ... and more debate, wasted time, ... People who want to participate are welcome to do it here. Discussion started and the first version of the patch appeared on the public lists; I later advertized this phabric page on public lists as well to drag more attention to matured version of the patch. Those who wanted to talk about it are already here; others obviously do not care and they actually had said so on a number of occasions.
  • ... and eventually someone "fixing" it all. Freezing and slushing the tree, requiring approval for sweeping changes works (I am talking about administrative, not technical means). Make a bald statement. Repeatedly punish violators. After some time, people will get it (or not, but keep reading).

This whole discussion's starting to get ridiculous. Most of the valid, technical issues had been raised and addressed during internal discussion between me, John, Mathieu, and Antoine two weeks ago. I'm pretty sure if we had simply got it committed, updated the PHB, and posted an notice on the lists we would already be getting benefits from it without any or little complaints. People in their majority rarely ask questions. People are using the tools they are offered, often even without a deep understanding of how or why they work. So give them the tool, and tell them not to press the red button. We don't really think that all of our fellow committers are mangalores, do we? :-)

Now that we've had a nice breather, how do we progress? We need to get makepatch to the masses.

antoine added inline comments.Sep 1 2014, 7:57 AM

.

bsd.port.mk
1216

i can't edit my comment, i meant -p

So, still churning:

`Author: des
Date: Thu Sep 11 12:15:46 2014
New Revision: 367923
URL: http://svnweb.freebsd.org/changeset/ports/367923
QAT: https://qat.redports.org/buildarchive/r367923/

Log:

Rename the patches as per the guidelines in the porter's handbook.

Added:

head/www/mod_dnssd/files/patch-src__Makefile.in
   - copied unchanged from r367854, head/www/mod_dnssd/files/patch-src_Makefile.in
head/www/mod_dnssd/files/patch-src__mod_dnssd.c
   - copied unchanged from r367854, head/www/mod_dnssd/files/patch-src-mod_dnssd.c

Deleted:

head/www/mod_dnssd/files/patch-src-mod_dnssd.c
head/www/mod_dnssd/files/patch-src_Makefile.in`

Does the porters handbook really say to rename patches?

Can we please progress on this ticket?

tijl added a comment.Sep 11 2014, 1:35 PM

I've been using my version for a while now (link) and I tend to use it like this:

  1. go to wrksrc and apply one or more patches
  2. make a few changes
  3. run "make makepatch" which recreates the one or more patches

If a new patch needs to be created "make makepatch" chooses a sensible name, but then the maintainer can rename the patch if he wants or put multiple patches in one file and the next "make makepatch" will respect that. It does not enforce any naming or grouping policy and avoids all the bikeshed involved in trying to determine such a policy. It gives the freedom to the maintainer to organise the patches the way that's best for him, which is a good thing, because well, he's the one that has to maintain them. If others need to make changes then the above workflow is pretty easy no matter how the maintainer organised things.

I realize the implementation is a bit complicated but that's what comments are for.

bapt requested changes to this revision.Oct 16 2014, 7:53 AM
bapt added a reviewer: bapt.
bapt added a subscriber: bapt.

Please only accept one single separator and we will change the doc accordingly

All the rest of the patch looks ok
Concerning the diff -p is ugly but it saved me tons of time in the past to update patches.

So for the sake of consistence and simpleness if we can go for the one-true-separator and that separator being '_' then we can ship this patch

danfe updated this revision to Diff 2035.Oct 18 2014, 4:22 PM
danfe edited edge metadata.

Updated patch against fresh -CURRENT and one-true-separator, with that separator set to _ per request from @bapt.

I must say that I do not like one-true-separator idea, and I believe I've provided sufficient reasoning in the past on why all three alternatives are just fine and should be allowed to (co)exist. However, I also do realize that missing this chance and being stubborn when the patch is almost welcomed and ready to land is stupid, and would not help to solve the original problem, and gratuitous renames will keep happen, so here we go.

marino added a comment.EditedOct 19 2014, 9:44 AM

I see two issues with this when reviewing it against the previously submitted version:

  1. diff -p option was removed. we want to keep that (it changed fro -udp => -ud, but bapt just confirmed -p is desired)

regarding this:

PATCH_PATH_SEPARATOR?=  _
makepatch:
.if ${PATCH_PATH_SEPARATOR} != _
  @${ECHO_MSG} "Error (${.TARGET}): PATCH_PATH_SEPARATOR must be underscore (_)."
  @${FALSE}

I don't think this accomplishes anything.
just set "PATCH_PATH_SEPARATOR=_" and remove the .if statement. right?

danfe updated this revision to Diff 2049.EditedOct 20 2014, 2:12 AM
danfe edited edge metadata.

Reroll to address the latest concerns from @marino. Note that removing the check also implicitly allows to override PATCH_PATH_SEPARATOR from command line. I guess it's not a problem though.

In D582#74, @danfe wrote:

Reroll to address the latest concerns from @marino. Note that removing the check also implicitly allows to override PATCH_PATH_SEPARATOR from command line. I guess it's not a problem though.

perhaps changing
PATCH_PATH_SEPARATOR= _
to
PATCH_PATH_SEPARATOR:= _

would prevent that?

Everything else looks good to me.

danfe added a comment.Oct 20 2014, 2:36 PM

perhaps changing
PATCH_PATH_SEPARATOR= _
to
PATCH_PATH_SEPARATOR:= _
would prevent that?

No, it won't (per my testing here). And := should not be overused. Needless to say, if someone really wants to override _ as separator character, they will do it (ir)regardless of whatever checks we put there.

strange, I don't know how a variable defined with := gets overridden.
Obviously you could just mass replace "|${PATCH_PATH_SEPARATOR}" with "_" and it couldn't be overridden except by editing the file itself. I didn't suggest that before because having it as a (constant) variable make the logic more easily read.

danfe added a comment.Oct 20 2014, 4:46 PM

I didn't suggest that before because having it as a (constant) variable make the logic more easily read.

Yes, this is precisely the reason why I do not want to optimize away PATCH_PATH_SEPARATOR variable completely. It would also help shall we ever want to augment/improve the logic (or even support multiple separators again, who knows).

I'm fine with this. Someone would have to read this makefile to figure out how to override it, so it would be something intentionally done. I don't see a submitter doing that, and a committer would be bound by the documentation, so the scenario you bring up shouldn't happen in practice. It would take a "rogue" committer to do it, intentionally. It's not worth worrying about.

bapt accepted this revision.Oct 23 2014, 12:45 PM
bapt edited edge metadata.
bapt added a comment.Oct 30 2014, 10:28 AM

accepted == you can commit :)

I don't know why this didn't auto-close but it's been committed:

Author: marino
Date: Thu Oct 30 23:04:03 2014
New Revision: 371776
URL: https://svnweb.freebsd.org/changeset/ports/371776
QAT: https://qat.redports.org/buildarchive/r371776/

Log:

bsd.port.mk: Finish update to make makepatch

A portion of this patch to upgrade makepatch was committed almost 2
months ago; this is the rest of it.  It changes the directory separator
to "_" and it will transform "_" in the filename to "__" to avoid
ambiguous file names (e.g. A/B/C.c and A_B/C.c won't have the same patch
name).

The new logic will not rename an existing patch that used previously
standard separators of "-", "+", or "__" in its name.  It is desireable
to avoid commits that only change the filename of the patch, so that's
why existing filenames are re-used if previously legal.

The diff command is also pass the -p argument for additional useful
context.

Differential Revision:	https://reviews.freebsd.org/D582
Approved by:		portmgr (bapt)

Modified:

head/Mk/bsd.port.mk
danfe added a comment.Nov 25 2014, 1:38 AM
In D582#67, @bapt wrote:

Please only accept one single separator and we will change the doc accordingly

TPH Section 4.4 still contains old text that advises to use __ (yuck). When can we hope to get it updated? Who should/assigned to take care of it? Any help/patch is expected on our behalf?

mat added a comment.Nov 25 2014, 8:19 AM
In D582#84, @danfe wrote:
In D582#67, @bapt wrote:

Please only accept one single separator and we will change the doc accordingly

TPH Section 4.4 still contains old text that advises to use __ (yuck). When can we hope to get it updated? Who should/assigned to take care of it? Any help/patch is expected on our behalf?

Nah, it's on my todo list, I've been procrastinating it by working on other things lately, is all.

In D582#85, @mat wrote:

Nah, it's on my todo list, I've been procrastinating it by working on other things lately, is all.

Would it help if we submit a patch via PR?
Or has this already been done in a staged new version?

John

mat added a comment.Jan 21 2015, 2:02 PM
In D582#86, @marino wrote:
In D582#85, @mat wrote:

Nah, it's on my todo list, I've been procrastinating it by working on other things lately, is all.

Would it help if we submit a patch via PR?
Or has this already been done in a staged new version?
John

I started working on something, then got distracted, a lot, and then forgot about it, I'll get back on it.

mat added a comment.Jan 22 2015, 12:14 PM

Committed the doc update in rD46223.

danfe accepted this revision.Jan 23 2015, 1:20 PM
danfe added a reviewer: danfe.
This comment was removed by danfe.
marino commandeered this revision.Jan 23 2015, 9:43 PM
marino removed a reviewer: marino.

danfe can't seem to close this revision, so let me try.

marino abandoned this revision.Jan 23 2015, 9:59 PM

My guess is that "close" is not an option because antoine and bdrewery have rejected disposition.
Everything has been accomplished so the only practical option is just abandon this revision. ( i assume we don't want to add a dummy patch to all approve just to close this)

AMDmi3 removed a subscriber: AMDmi3.Jan 23 2015, 11:56 PM