License Guide: Recommend SPDX Identifier come before copyright
AbandonedPublic

Authored by jrm on Jul 8 2024, 3:41 PM.

Details

Reviewers
imp
fuz
Summary

Be consistent with style(9) and recommend the SPDX identifier come
first.

Diff Detail

Repository
R9 FreeBSD doc repository
Lint
No Lint Coverage
Unit
No Test Coverage
Build Status
Buildable 58574
Build 55462: arc lint + arc unit

Event Timeline

jrm requested review of this revision.Jul 8 2024, 3:41 PM
jrm created this revision.

Thanks @mhorne. There are a bunch of other things I have to update here too. I'll try to make things consistent in the next update here.

documentation/content/en/articles/license-guide/_index.adoc
61

Hmmm. I've gone round and round on this. I put it in this order, IIRC, to match how these are used in the wider FOSS community. Let me double-check that this memory is correct.

The current SPDX documentation is somewhat vague and says, "The SPDX-License-Identifier tag declares the license the file is under and should be placed at or near the top of the file in a comment." [1]

In both our sources and the Linux sources, the SPDX identifier almost always comes first. Here is part of the output from running fd ".*\.c" -x head -1 under the Linux sources.

// SPDX-License-Identifier: GPL-2.0
// SPDX-License-Identifier: GPL-2.0
// SPDX-License-Identifier: GPL-2.0-only
// SPDX-License-Identifier: GPL-2.0
// SPDX-License-Identifier: GPL-2.0
// SPDX-License-Identifier: GPL-2.0
// SPDX-License-Identifier: GPL-2.0
// SPDX-License-Identifier: GPL-2.0
// SPDX-License-Identifier: GPL-2.0-only
...

[1] https://spdx.github.io/spdx-spec/v2.3/using-SPDX-short-identifiers-in-source-files/
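The first-line survey above can be approximated without fd. The following is a minimal sketch, not the command that was actually run: the function names are hypothetical, and it assumes a strict "*.c" suffix match, whereas the unanchored fd regex ".*\.c" also matches names such as foo.cc. Unlike fd, it does not skip hidden or ignored files.

```python
from collections import Counter
from pathlib import Path


def first_line(path: Path) -> str:
    """Return the first line of a file without its newline, like `head -1`."""
    with path.open(encoding="utf-8", errors="replace") as fh:
        return fh.readline().rstrip("\n")


def tally_first_lines(lines) -> Counter:
    """Count how often each distinct first line occurs."""
    return Counter(lines)


def survey(root: str, pattern: str = "*.c") -> Counter:
    """Tally the first line of every regular file under root matching pattern."""
    return tally_first_lines(
        first_line(p) for p in Path(root).rglob(pattern) if p.is_file()
    )
```

Calling survey("/path/to/linux").most_common(5) would list the most frequent first lines, which makes it easier to see how dominant the SPDX-first form is than scanning raw output.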

Update related text and tweak some wording.

That page probably should be shortened so it refers to this page. What do you think?

In D45915#1046849, @jrm wrote:

That page probably should be shortened so it refers to this page. What do you think?

Yeah. I think I did that at one point, but maybe I've missed something.

documentation/content/en/articles/license-guide/_index.adoc
61

This chunk should not be landed, nor should the others that move the copyright.
It's a mixed bag elsewhere, honestly. Linux has it as the first line, but that's a historical accident: they retroactively added the identifiers as the first line after 20-odd years of existing without them.

jrm marked an inline comment as not done.

I'll leave others to sort this out then and share what I found in case it's helpful. Some of this is repeated, but it might be easier to have it all in one place so if we do decide that the Copyright should go first, these are the places we need to fix.

  • I estimate that about 95% of our src files with both an SPDX identifier and a copyright statement have the SPDX identifier first [1].
  • All of our other documentation/examples [2] (that I'm now aware of) recommend the SPDX tag first.
  • People here reviewed (in private conversations stemming from D43770) the license text in the Foundation's contracts. That text had the SPDX identifier first.

[1]

~ % rg -lU '(?s)SPDX.*Copyright \(c\)' /usr/src | wc -l
   11315

~ % rg -lU '(?s)Copyright \(c\).*SPDX' /usr/src | wc -l
     920

As @imp says, projects other than Linux may tend to put the copyright first.

~ % rg -lU '(?s)SPDX.*Copyright \(c\)' /usr/src/contrib | wc -l
     251
~ % rg -lU '(?s)Copyright \(c\).*SPDX' /usr/src/contrib | wc -l
     705
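One caveat with the two patterns above is that a single file can match both, for example when an SPDX line sits between two copyright lines, so the counts are not mutually exclusive. The sketch below is an illustration under an assumption that differs from the rg commands: it classifies each file by whichever marker occurs first, so every file lands in exactly one bucket. The function name and bucket labels are hypothetical.

```python
import re

# Same literals the rg patterns look for.
SPDX_RE = re.compile(r"SPDX")
COPYRIGHT_RE = re.compile(r"Copyright \(c\)")


def header_order(text: str) -> str:
    """Classify a file by which marker appears first.

    Returns "spdx-first", "copyright-first", or "incomplete" when
    either marker is missing.
    """
    spdx = SPDX_RE.search(text)
    copyright_ = COPYRIGHT_RE.search(text)
    if spdx is None or copyright_ is None:
        return "incomplete"
    if spdx.start() < copyright_.start():
        return "spdx-first"
    return "copyright-first"
```

With this definition, a header of the form "Copyright (c) ... / SPDX ... / Copyright (c) ..." counts only as copyright-first, whereas it would be counted by both rg invocations.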

[2]

Re [1]: We have a lot of SPDX-License-Identifier files that ALSO have the boilerplate. Those show up as false positives in how [1] was computed. The vast majority of these files won't have the boilerplate removed ever. I think we only have about 1000 files that have only the SPDX lines w/o the license statement.

In D45915#1046880, @imp wrote:

Re [1]: We have a lot of SPDX-License-Identifier files that ALSO have the boilerplate. Those show up as false positives in how [1] was computed. The vast majority of these files won't have the boilerplate removed ever. I think we only have about 1000 files that have only the SPDX lines w/o the license statement.

I guess you are saying that in files such as sbin/nvmecontrol/nc_util.c, it's off the table to move the SPDX identifier below the copyright/license text.

In that case, here are lightly tested estimates, the first including and the second excluding contrib/.

% rg -lU '(?s)SPDX.*Copyright \(c\)' /usr/src | xargs rg --files-without-match 'THIS SOFTWARE IS PROVIDED' | wc -l
    2242
% rg -g '!contrib' -lU '(?s)SPDX.*Copyright \(c\)' /usr/src | xargs rg --files-without-match 'THIS SOFTWARE IS PROVIDED' | wc -l
     518
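For anyone who wants to reproduce these estimates without rg and xargs, the filter can be sketched as follows. This is a rough approximation, not the pipeline used above: the function names are hypothetical, and unlike rg it does not honor .gitignore, skip hidden files, or skip binary files, so its counts may differ somewhat.

```python
from pathlib import Path

BOILERPLATE = "THIS SOFTWARE IS PROVIDED"


def spdx_precedes_copyright(text: str) -> bool:
    """True if some "SPDX" is followed later by "Copyright (c)".

    Mirrors the multiline pattern '(?s)SPDX.*Copyright \\(c\\)': if any
    SPDX precedes a copyright line, the first SPDX does too.
    """
    i = text.find("SPDX")
    return i != -1 and text.find("Copyright (c)", i + len("SPDX")) != -1


def is_candidate(text: str) -> bool:
    """SPDX before copyright, and no full license boilerplate in the file."""
    return spdx_precedes_copyright(text) and BOILERPLATE not in text


def count_candidates(root: str, exclude: tuple = ()) -> int:
    """Count candidate files under root, skipping directories named in exclude."""
    count = 0
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if any(part in exclude for part in path.relative_to(root).parts):
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip rather than abort the survey
        if is_candidate(text):
            count += 1
    return count
```

Under these assumptions, count_candidates("/usr/src") corresponds to the first command and count_candidates("/usr/src", exclude=("contrib",)) to the second.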