Repository licensing options that are appropriate for non-code content #5

Closed
dhimmel opened this issue Jul 16, 2016 · 9 comments

@dhimmel
Member

dhimmel commented Jul 16, 2016

Currently, this repository is licensed under a BSD 3-Clause License, which is the default license for Project Cognoma repositories. This license was chosen for its permissive and open nature as well as its compatibility with other Greene Lab software products. However, the BSD 3-Clause License is intended for code. Rather than referencing all content, the license specifically mentions "source code".

As a data repository, however, cognoma/cancer-data will hold much more than source code, including data, visualizations, writing, documentation, and notebooks, which combine all of the aforementioned content types into a single file.

As far as openly releasing content that is not only code, there's currently a short list of recommended licenses. My favorite is CC0 because it waives all copyrights and effectively places work in the public domain. This means anyone can use CC0 content without agonizing over license compatibility.

Despite being one of the most liberating licensing options available, CC0 is not an OSI-approved open source license for complicated reasons. We're still investigating the license of TCGA and Xena Browser data (see #3), but it's possible our upstream data providers may impose restrictions on our use that may require attribution, in which case CC BY is an option. However, Creative Commons recommends against using its licenses (besides CC0) for software.

Hence, switching the entire repository to a Creative Commons license doesn't make sense, since we want to remain OSI conformant. However, we also need a licensing arrangement that is appropriate for content that is not "source code". One option I can think of is multi-licensing where we apply both a CC0/CC BY License and a BSD 3-Clause License to the repository. Users would then be able to choose either license based on their use case. I want to make sure that this is a legally valid approach. What do people think (also asked on Twitter)?

@dhimmel
Member Author

dhimmel commented Jul 17, 2016

It appears that at least some Wikipedia articles are dual-licensed under CC BY-SA and GFDL. See the following prompt to save your changes:

[Image: wiki-dual-license]

So the dual-licensing of content under multiple open licenses does have some precedent.

@dhimmel
Member Author

dhimmel commented Oct 27, 2016

I was talking with @jakevdp at a Data Science Summit, and he mentioned he was also dealing with how to license repositories with notebooks, which contain code, prose, figures, and data.

@jakevdp — any advice you have here (now or later) would be appreciated. Also let us know how you end up licensing your notebooks with diverse content types.

@jakevdp

jakevdp commented Oct 27, 2016

I'm not sure what is best in this case. I asked about this on twitter a while ago, and the responses were pretty informative: thread.

@dhimmel
Member Author

dhimmel commented Oct 27, 2016

Thanks @jakevdp, here are some of the tweets grouped by category:

Dual license the entire repo

Tweet by @minrk:

@jakevdp perhaps simpler is to dual-license the whole thing, so people can use appropriate license for the downstream work.

Tweet by @minrk:

@jakevdp the reason I like dual license for everything (if available) is that it's the downstream work that this matters to, not the source.

Tweet by @SethMichaelLarson:

@jakevdp I've seen LICENSE-MIT and LICENSE-APACHE2 in the same repo with an explanation in the README. That's probably how I would do it.

License by type

Tweet by @minrk:

@jakevdp in answer to the actual question, I might use a single LICENSE (or LICENSES) file with both and the explanation of when to use each

Tweet by @earthorguk:

I am told that outputs are special @minrk @jakevdp https://git.io/vXUFI

Tweet by @benjaminrose:

@jakevdp @arfon @github notebooks have different cell types. Can you use that to indicate what is text and what is code?

Tweet by @labarba:

@jakevdp @rasbt In #numericalmooc, we unite CC-BY and MIT in the license file, and add a message in all notebooks https://github.com/numerical-mooc/numerical-mooc

Tweet by @jakevdp:

@_benjaminrose @arfon @github My best idea now is to include LICENSE_CODE and LICENSE_TEXT and describe the intent in the README.
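
The cell-type suggestion from @benjaminrose could be sketched roughly as follows. This is only an illustration, not something anyone in the thread implemented: it assumes the notebook is stored in the nbformat 4 JSON layout (a top-level `cells` list whose entries carry a `cell_type`), and the `LICENSE_BY_CELL_TYPE` mapping and the `licenses_by_cell` helper are hypothetical names. It does not address cell outputs, which the tweet from @earthorguk suggests may need separate treatment.

```python
import json

# Hypothetical mapping from notebook cell type to a license label.
# The specific licenses are assumptions chosen for illustration.
LICENSE_BY_CELL_TYPE = {
    "code": "BSD-3-Clause",
    "markdown": "CC-BY-4.0",
    "raw": "CC-BY-4.0",
}


def licenses_by_cell(notebook_json):
    """Return a (cell_index, cell_type, license_label) tuple for each cell."""
    notebook = json.loads(notebook_json)
    rows = []
    for index, cell in enumerate(notebook.get("cells", [])):
        cell_type = cell.get("cell_type", "unknown")
        label = LICENSE_BY_CELL_TYPE.get(cell_type, "unspecified")
        rows.append((index, cell_type, label))
    return rows


# A tiny in-memory notebook in the nbformat 4 JSON layout.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "source": ["print(1)"],
         "outputs": [], "execution_count": None},
    ],
})

for row in licenses_by_cell(example):
    print(row)
```

A listing like this could back a per-notebook notice, but it still leaves the grey areas (snippets quoted in prose, prose copied into comments) that the other tweets raise.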

In between

Tweet by @rasbt:

@jakevdp @github sth. like -> & having 2 separate license files + texts. Not sure if it's ideal though

[Image attached to the tweet]

@minrk

minrk commented Oct 28, 2016

The reason I like to do dual licenses for the whole repo is letting the inheritor decide what's appropriate. Using BSD for code and CC-BY for non-code causes unnecessary headaches in cases like:

  • including a BSD code snippet in an otherwise non-code CC work (blog post, book)
  • bringing CC prose into downstream BSD code comments or documentation

Dual licensing also eliminates grey areas where people need to ask which license applies to a particular piece.

I'm not sure any benefit is achieved by separating which parts are under which license, but I could be wrong about that.

@dhimmel
Member Author

dhimmel commented Oct 29, 2016

@minrk your assessment makes a lot of sense to me. I think dual licensing is the way to go.

> I'm not sure any benefit is achieved by separating which parts are under which license, but I could be wrong about that.

It seems like you could always derive the separate licensing scheme from the dual licensing scheme. Therefore, I don't think there should be any major disadvantages to dual rather than separately licensing the content.
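
The derivation argument can be illustrated with a small sketch. It only models which license options a downstream user has for each kind of content; the content types, license labels, and the `is_derivable` helper are hypothetical, and the sketch assumes a "separate" scheme that assigns exactly one of the two licenses to each content type.

```python
# Hypothetical content types, for illustration only.
CONTENT_TYPES = ["code", "prose", "figures", "data"]

# Dual licensing: every content type is offered under both licenses.
dual = {kind: {"BSD-3-Clause", "CC0-1.0"} for kind in CONTENT_TYPES}

# Separate licensing: each content type is offered under exactly one license.
separate = {
    "code": {"BSD-3-Clause"},
    "prose": {"CC0-1.0"},
    "figures": {"CC0-1.0"},
    "data": {"CC0-1.0"},
}


def is_derivable(offered, wanted):
    """True if every license wanted for a content type is already offered."""
    return all(wanted[kind] <= offered.get(kind, set()) for kind in wanted)


print(is_derivable(dual, separate))  # True: separate is a subset of dual
print(is_derivable(separate, dual))  # False: the reverse does not hold
```

In this model, anyone who prefers the separate scheme can simply pick the matching license per content type from the dual offering, while the reverse is not possible.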

@dhimmel
Member Author

dhimmel commented Oct 29, 2016

Permission from contributors to license this repository as CC0

@clairemcleod & @stephenshank (the two other people with contributions to this repository): are you okay with releasing your past work under CC0 1.0 (the public domain dedication) as well as the current BSD 3-Clause license?

@clairemcleod
Member

@dhimmel fine by me!

@stephenshank
Member

@dhimmel Sounds good to me!

dhimmel added a commit to dhimmel/cancer-data that referenced this issue Oct 31, 2016
Dual license this repository by adding a CC0 1.0 License as discussed in cognoma#5.

Permission to release this repository under CC0 was received from Claire McLeod
(https://git.io/vXmQn) and Stephen Shank (https://git.io/vXmQl). The only
remaining contributor thus far, Daniel Himmelstein, is the author of this
commit.

Closes cognoma#5.
dhimmel closed this as completed in 1d961fb Nov 1, 2016
dhimmel added a commit to dhimmel/biorxiv-licenses that referenced this issue Nov 28, 2016
See cognoma/cancer-data#5 for the motivations
to dual-license data science repositories.

5 participants