Page MenuHomeFreeBSD

[new port] biology/py-macs2: Analysis of chromatin immunoprecipitation (ChIP) sequences
ClosedPublic

Authored by jwb on Dec 1 2017, 4:55 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Oct 24, 7:33 PM
Unknown Object (File)
Fri, Oct 24, 1:21 AM
Unknown Object (File)
Thu, Oct 23, 10:22 PM
Unknown Object (File)
Tue, Oct 21, 8:23 AM
Unknown Object (File)
Tue, Oct 14, 4:59 AM
Unknown Object (File)
Sat, Oct 11, 3:41 PM
Unknown Object (File)
Thu, Oct 9, 4:35 AM
Unknown Object (File)
Thu, Oct 9, 4:25 AM
Subscribers

Details

Summary

[new port] biology/py-macs2: Analysis of chromatin immunoprecipitation (ChIP) sequences
Approved by: jrm (mentor) or wen (mentor)
Differential to be added to commit message

Test Plan

portlint -C
WARN: Makefile: no port directory /usr/ports/math/py-numpy@py27 found, even though it is listed in BUILD_DEPENDS.
0 fatal errors and 1 warning found.
Passed poudriere on {10.3,11.1}-{amd64,i386}

Diff Detail

Repository
rP FreeBSD ports repository
Lint
No Lint Coverage
Unit
No Test Coverage
Build Status
Buildable 13275
Build 13512: arc lint + arc unit

Event Timeline

I think @koobs will suggest to using Use MASTER_SITES=CHEESESHOP based on his past comments, "Use MASTER_SITES=CHEESESHOP by default for Python packages that provide distribution files via PyPI, unless there is a compelling (temporary) reason not to, such as test/test data files not included in the source distribution (sdist)." However, I see that it lags behind the last commit on GitHub. If those more recent commits are compelling enough, I think this is preferred.

PORTNAME=	macs2
DISTVERSION=	2015.4.20
DISTVERSIONSUFFIX=	-46-g6946b2f
CATEGORIES=	biology python
PKGNAMEPREFIX=	${PYTHON_PKGNAMEPREFIX}
PKGNAMEPREFIX=	${PYTHON_PKGNAMEPREFIX}

and then you can remove GH_TAGNAME=....
This is based on

% git describe --tags 6946b2f6571cd0a7cfbb882c79aaa25a05a5c48d
2015.4.20-46-g6946b2f

The name of the last GitHub tag is unfortunate, but I think if they make a new tag following past patterns, we should be OK.

% pkg version -t 2.1.2 2015.4.20-46-g6946b2f
>

I think @koobs will also ask about USE_PYTHON=concurrent.

Normally I would use pypi, but the last pypi release is from spring 2016 and there are important new features since then.

I misunderstood the concurrent flagm somewhat. Yes, it's a good idea to add it.

I think using GH_TAGNAME is cleaner and more descriptive, although I see that abbreviated tagnames are the norm.

Thanks...

Clean up per mentor comments

The usual problem, which is a little different here, isn't with the short/long hashes. It's basing the port's version on something that isn't an upstream version. When using GH_TAGNAME it is/was common to set PORTVERSION to the date of the commit, like 20171201. In the future, if you want the port's version to match an upstream version, things can get messy. For example, if that future release is something like version 1.1, then you will have to bump PORTEPOCH. It is usually best to set DISTVERSION to an actual release, and if you need a commit after a tagged release, you could set DISTVERSIONSUFFIX as above. This way the port's versions follows a predictable sequence. Things here are a little different here, because upstream actually set the last tag to something weird (not following the usually release tags like 2.1.0).

I see your point... How about using DISTVERSION=2.1.0.20170505? Then whether or not the developer ever gets back on a release schedule, we can generate new versions that will compare correctly, e.g. 2.1.1 or 2.1.0.20180701 would both qualify as later than 2.1.0.20170505, I think.

What pypi did in choosing 2.1.1.20160309 doesn't make sense to me, as the next release would logically be 2.1.1, which would be considered earlier than 2.1.1.20160309, correct? Or am I missing something...

At any rate, I'd like to come up with something that makes sense to

dbaio added inline comments.
biology/py-macs2/Makefile
25–26

You can use PYSETUP here.

Mk/Uses/python.mk:

# PYSETUP       - Name of the setup script used by the distutils
#             package.
#             default: setup.py
In D13322#278250, @jwb wrote:

How about using DISTVERSION=2.1.0.20170505?

DISTVERSION has to match an upstream version. You can cheat a little by adding the DISTVERSIONSUFFIX to pull a tarball from GitHub that is created from commits after an upstream release (make the FreeBSD package version based on an upstream version plus some extra commits). I think we are stuck with the aberrant version.

What pypi did in choosing 2.1.1.20160309 doesn't make sense to me, as the next release would logically be 2.1.1, which would be considered earlier than 2.1.1.20160309, correct? Or am I missing something...

It wouldn't make sense for FreeBSD package versions, but I'm not familiar with pypi's versioning system.

pkg version -t 2.1.1.2010309 2.1.1
>

Sorry to drag this out, but it's important to me to establish a clean and rational pattern that's also intuitive to the average researcher. As you're probably aware, there's a lot of nascent software in the bioinformatics field, which also happens to be our fastest growing domain in HPC. Many of the developers are not consistent about maintaining a release schedule, so the MACS2 situation (long time since last release and important new features added since) is common. I often get requests from researchers for a non-release version of a package.

I carefully reviewed the guide and there appear to be a few options that would be palatable from the FreeBSD ports perspective. One of my main concerns is what the user sees when running "pkg info".

I think examples 5.4 and 5.5 (https://www.freebsd.org/doc/en_US.ISO8859-1/books/porters-handbook/makefile-naming.html#idp56060264) best illustrate how to approach this situation.

Of those two, I prefer the latter, using PORTVERSION=<last release>p<commit-date>. This causes "pkg info" to produce something that I think is intuitive:

Name : py27-macs2
Version : 2.1.0p20170505

I think the average user would correctly read this as 2.1.0 + patches.

Following example 5.4, we could use DISTVERSION=<++last version>-pre<date>, but I think the generated PORTVERSION would be counterintuitive to the average user:

Name : py27-macs2
Version : 2.1.1.p20170505

I don't think the average user is going to understand that ".p" means pre-release here and they would likely misinterpret this as 2.1.1 + patches. This could cause communication problems between collaborators, which could lead to delays in research when there are important differences between versions.

This might also violate the convention that DISTVERSION should match the upstream version.

Finally, if we separate the version and pre/patch level, DISTVERSIONSUFFIX is not translated into PORTVERSION, so we get misleading information from "pkg info":

Name : py27-macs2
Version : 2.1.0

It is *not* version 2.1.0 that's installed here, but the latest commit.

Thanks for all your feedback so far and let me know what you think.

Use PYSETUP per dbaio suggestion

Try another approach to versioning problem

Forgot to switch back to PORTVERSION on the last one...

jrm accepted this revision.EditedDec 2 2017, 3:39 PM
In D13322#278385, @jwb wrote:

I think examples 5.4 and 5.5 (https://www.freebsd.org/doc/en_US.ISO8859-1/books/porters-handbook/makefile-naming.html#idp56060264) best illustrate how to approach this situation.

Of those two, I prefer the latter, using PORTVERSION=<last release>p<commit-date>. This causes "pkg info" to produce something that I think is intuitive:

Name : py27-macs2
Version : 2.1.0p20170505

Agreed. But my main gripe/confusion is that upsream's latest release version is '2015.4.20'. I interpret this to be a pre-release for 2.1.0 (he/she uses 2.1.0 and rc in the name of the release). I assume the latest commits to the master branch are also intended to be part of some upcoming 2.1.0 release. Since you are basing the FreeBSD port on something that will eventually be 2.1.0, I think what you have now, PORTVERSION=2.1.0p20170505 makes the best of an unfortunate situation. UPDATED: Sorry, I meant PORTVERSION=2.1.0.p20170505 (pre release) and not PORVERSION=2.1.0p20170505 (patch level).

It's common for researchers to release a version, then add important commits that other researchers want to pull in. In that case, you can just do the DISTVERSION, DISTVERSIONSUFFIX dance and you get something that makes sense. This case is a little different because they threw in a release version that makes little sense, so setting PORTVERSION directly is probably the best thing to do.

I'm also sorry to drag this out. I was a little uncertain what the best approach was, but I'm feeling more confident with what you have now.

This revision is now accepted and ready to land.Dec 2 2017, 3:39 PM

Ah, yeah. I forgot that 2.1.0 is still a pre-release. Juggling too many things and getting them mixed up...

In light of that, I think I'd have to favor following example 5.4, DISTVERSION=2.1.0-pre20170505. I think it's less intuitive than a patch suffix in the "pkg info" output, but the most important thing in my mind is for "pkg info" to indicate the snapshot used as clearly as possible. It would be nicer if github used commit tags that would fit into the versioning scheme, like serial numbers instead of garbled hashes. Often there are multiple commits on a given date. But it appears to me that the best we can do here is use the last commit of the day and show the date.

Thanks for seeing this through with me. I think having this sort of discussion before making decisions is at the core of the whole idea of using package managers, i.e. investing the time now to save work for ourselves down the road.

FYI, there are numerous examples of this sort of approach in existing ports, but I found none that use <version>p<date>. They just use <version>.<date>. audio/amsynth and benchmarks/clpeak address a similar situation using 'g' as a separator, which I have not seen a reference to in the guide. I'll use the examples in the guide as a standard, though.

FreeBSD imacbsd.acadix bacon ~ 402: port-grep 'VERSION.*2017'
Searching accessibility...
Searching arabic...
Searching archivers...
Searching astro...
Searching audio...
audio/amsynth/Makefile:DISTVERSION= 1.8.0.g20171103
audio/caps-lv2/Makefile:PORTVERSION= g20170626
audio/logitechmediaserver/Makefile:PORTVERSION= 7.9.1.g2017.07.11
audio/oss/Makefile:DISTVERSION= 4.2-build2017
audio/pms-devel/Makefile:PORTVERSION= 0.0.20171103
audio/sayonara/Makefile:DISTVERSIONSUFFIX= -git2-20170509
audio/schismtracker/Makefile:PORTVERSION= 20170910
audio/synthpod-lv2/Makefile:PORTVERSION= g20170810
audio/x42-plugins-lv2/Makefile:PORTVERSION= 20170428
Searching base...
Searching benchmarks...
benchmarks/clpeak/Makefile:PORTVERSION= 1.0g20170524
benchmarks/flamegraph/Makefile:PORTVERSION= g20170824
Searching biology...
biology/phyml/Makefile:DISTVERSION= 3.3.20170530
Searching cad...
cad/elmerfem/Makefile:PORTVERSION= 8.3.20170519
cad/linuxcnc-devel/Makefile:PORTVERSION= 20170708
Searching chinese...
Searching comms...
comms/chirp/Makefile:PORTVERSION= 0.4.1.2017.02.22
comms/librs232/Makefile:PORTVERSION= 20170805

Change DISTVERSION to reflect the fact that 2.1.0 is a pre-release

This revision now requires review to proceed.Dec 2 2017, 4:20 PM

Or maybe

PORTVERSION= 2.1.0-pre20170505

would be better than blindly mimicking the example...

In D13322#278416, @jwb wrote:

In light of that, I think I'd have to favor following example 5.4, DISTVERSION=2.1.0-pre20170505.

If you want to set DISTVERSION, I think your only option is DISTVERSION=2015.4.20 and to grab the latest commit set DISTVERSIONSUFFIX=-46-g6946b2f. I think in this case, things still work when you set DISTVERSION to something else only because you have set GH_TAGNAME. Besides, DISTVERSION=2.1.0-pre20170505 gives a PORTVERSION with the same value you just had.

make -VDISTVERSION -VPORTVERSION
2.1.0-pre20170505
2.1.0.p20170505

Thanks for seeing this through with me. I think having this sort of discussion before making decisions is at the core of the whole idea of using package managers, i.e. investing the time now to save work for ourselves down the road.

Agreed. It's good (for me too) to have a consisentent solution for corner cases like these.

FYI, there are numerous examples of this sort of approach in existing ports, but I found none that use <version>p<date>. They just use <version>.<date>. audio/amsynth and benchmarks/clpeak address a similar situation using 'g' as a separator, which I have not seen a reference to in the guide. I'll use the examples in the guide as a standard, though.

There's multimedia/vdr-plugin-eepg/Makefile:PORTVERSION=0.0.6p2011071921.

How do you feel about proceeding with one of these options?

  1. Set PORTVERSION directly to 2.1.0.p20170505
  2. Wait a bit and hope that @mat chimes in.
  3. Submit an issue with upsream asking them to tag a new release based on the latest commit with something like 2.1.10rc4, 2.1.10pre20170505, or 2.1.10rc20170505.
In D13322#278422, @jwb wrote:

Or maybe

PORTVERSION= 2.1.0-pre20170505

would be better than blindly mimicking the example...

That's not compatible with FreeBSD's package versioning scheme. From bsd.port.mk,

.if ${PORTVERSION:M*[-_,]*}x != x
IGNORE=			PORTVERSION ${PORTVERSION} may not contain '-' '_' or ','
.endif

I agree with PORTVERSION=2.1.0.p20170505. That would have been the next thing I came up with myself, all seemingly reasonable alternatives eliminated.

I was planning to contact upstream as well, without holding my breath given the lack of activity there...

DISTVERSION=2015.4.20 + DISTVERSIONSUFFIX would cause misleading output from "pkg info". It also just occurred to me that "pkg upgrade" might not function if only DISTVERSIONSUFFIX is changed, because the generated PORTVERSION will still be 2015.4.20.

I also noticed the 0.0.<date> examples. I envision myself using this approach for some ports that have never had a release.

Thanks!

Settled on a usable PORTVERSION

In D13322#278456, @jwb wrote:

DISTVERSION=2015.4.20 + DISTVERSIONSUFFIX would cause misleading output from "pkg info". It also just occurred to me that "pkg upgrade" might not function if only DISTVERSIONSUFFIX is changed, because the generated PORTVERSION will still be 2015.4.20.

Agreed. ..driving home the point about DISTVERSION==version.

It also just occurred to me that "pkg upgrade" might not function if only DISTVERSIONSUFFIX is changed, because the generated PORTVERSION will still be 2015.4.20.

Yeah, I guess you would have to bump PORTREVISION if you updated to a new commit.

This revision is now accepted and ready to land.Dec 2 2017, 7:29 PM
This revision was automatically updated to reflect the committed changes.
head/biology/py-macs2/Makefile
14 ↗(On Diff #36121)

Note that this is better written as:

BUILD_DEPENDS= ${PYNUMPY}

Thanks for the tips. I hadn't thought to use bulk -t instead of testport. My plan was to manually run testport again with @py36 when needed. I see that numpy gets the VIP treatment in python.mk. I'll keep an eye on that section for additions in the future...

In D13322#279150, @jwb wrote:

Thanks for the tips. I hadn't thought to use bulk -t instead of testport. My plan was to manually run testport again with @py36 when needed. I see that numpy gets the VIP treatment in python.mk. I'll keep an eye on that section for additions in the future...

Well, you should probably set BUILD_ALL_PYTHON_FLAVORS= in your make.conf and use @all so that you do test everything your port say it should support :-)