data sharing:
social and normative

              kaitlin thaney
 program manager, science commons
  chantilly, va - ISWC - 25 oct 2009


 This presentation is licensed under the Creative Commons Attribution 3.0 license.
make sharing easy, legal and scalable

        integrated approach

building part of the infrastructure for
          knowledge sharing
I am not a lawyer.
  (first things first)
social and normative issues

human involvement, added roadblocks
       implications of FLOSS
  interoperability, design decisions
         how to navigate?
the data web
needs to be legally and technically
              accessible
“By open access to the literature, we mean its free
availability on the public internet, permitting users
 to read, download, copy, distribute, print, search, or link
     to the full texts of the articles, crawl them for
indexing, pass them as data to software, or use them for
   any other lawful purpose, without financial, legal or
  technical barriers other than those inseparable from
           gaining access to the internet itself.”



           Image from the Public Library of Science, licensed to the public under CC-BY-3.0
as a means to achieve Open Access
      but what about data?
knowledge?

             journal articles
                 data
               ontologies
              annotations
         plasmids and cell lines

... how to treat? like content? software?
early days of WWW

no licenses (even free)
  debate over code
   CERN’s decision
   view/edit source
   network effects
the data web

       still in its infancy ...
“the future is here ...
just unevenly distributed”
                      - william gibson
(e.g., linked data, W3C, neurocommons...)
(social) implications
   of FLOSS toggles
free/libre license ethos

notion of licensing to make
        more free
©
“creative expression”
is it creative?
is it creative?
is it creative?
category errors
the problem of...
   Non-Commercial


   for data
Non-Commercial


what’s a commercial use
   of the data web?
the problem of...
  Share Alike


   for data
issue of license proliferation

   whatever you do to the least of the
databases, you do to the integrated system

       (the most restrictive wins)

    risk for unintended consequences
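
A minimal sketch of "the most restrictive wins", assuming license conditions simply accumulate when databases are integrated. The source names and condition labels below are hypothetical, and real license compatibility is messier than a set union (some conditions conflict outright rather than stack).

```python
# Hypothetical source databases and the conditions attached to each.
# Names and labels are illustrative only, not taken from any real catalog.
SOURCES = {
    "db_a": set(),                               # all rights waived
    "db_b": {"attribution"},
    "db_c": {"attribution", "non-commercial"},
    "db_d": {"share-alike"},
}

def effective_conditions(source_names):
    """Return the union of every condition carried by any integrated source."""
    conditions = set()
    for name in source_names:
        conditions |= SOURCES[name]
    return conditions

# Integrating an unrestricted database with a restricted one
# leaves the whole system restricted.
integrated = effective_conditions(["db_a", "db_b", "db_c"])
print(sorted(integrated))  # ['attribution', 'non-commercial']
```

Under this assumption, adding one non-commercial source is enough to put that condition on every downstream use of the integrated system.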
the problem of...
   Attribution


   for data
social aspect of semantics
agreement
  is hard.
espresso
  coffee
             cafe
                    kopi
                             cafezinho

latte               koffee

           mocha             americano
“choice” or interoperability.

         (pick one)
converge on common names

    “coffee”


    “cafe”              coffee

    “kopi”      http://ontology.foo.org/1234567
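
One way to picture converging on common names: each local label resolves to a single shared identifier. This is only a sketch. The URI is the placeholder from the slide, and the lookup table and function names are assumptions made for illustration.

```python
# Placeholder identifier taken from the slide; not a real ontology term.
SHARED_COFFEE_URI = "http://ontology.foo.org/1234567"

# Hypothetical mapping from local labels to the shared identifier.
LABEL_TO_URI = {
    "coffee": SHARED_COFFEE_URI,
    "cafe": SHARED_COFFEE_URI,
    "kopi": SHARED_COFFEE_URI,
}

def resolve(label):
    """Return the shared identifier for a local label, or None if unmapped."""
    return LABEL_TO_URI.get(label.strip().lower())

def same_concept(label_a, label_b):
    """True only when both labels are mapped and map to the same identifier."""
    uri_a, uri_b = resolve(label_a), resolve(label_b)
    return uri_a is not None and uri_a == uri_b

print(same_concept("Cafe", "kopi"))     # True
print(same_concept("coffee", "latte"))  # False: "latte" has no agreed mapping
```

The code is the easy part; agreeing on what goes into the table is the social problem.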
national law hurdles

             sui generis,
        “sweat of the brow”
          Crown copyright
           “level of skill”

how are international data sharing
         efforts affected?
protection instinct / culture of control

  quality control, integrity concerns

  “my data”, interpretation issues

     fear, uncertainty, doubt (FUD)
<mosquitos><transmit><malaria>

        validation, provenance
    relationship mapping, citation?
             what rights?

        still not fully automated
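
A rough sketch of why provenance matters for a single assertion: the triple says nothing about where it came from unless a source travels with it. The `Statement` type, its `source` field, and the example source string are hypothetical, not part of any standard.

```python
from typing import NamedTuple, Optional

class Statement(NamedTuple):
    """A subject-predicate-object assertion with optional provenance."""
    subject: str
    predicate: str
    obj: str
    source: Optional[str] = None  # hypothetical field: where the claim came from

def citable(statement):
    """A statement can only be cited or validated if its source is recorded."""
    return statement.source is not None

bare = Statement("mosquitos", "transmit", "malaria")
sourced = bare._replace(source="example-curated-database")  # made-up source name

print(citable(bare))     # False
print(citable(sourced))  # True
```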
a norms approach
a non-license means to request
        certain behavior

      community norms
 best practices, terms of use
attribution vs. citation

which one applies? which is best fit?
      what’s the difference?


 “credit where credit is due”
attribution
 “the requirement to acknowledge or
 credit the author of a work which is
  used or appears in another work”

                citation
     “reference to a published or
unpublished source” ... prime purpose is
         “intellectual honesty”
                            (via wikipedia)
attribution:
        (legal entity)

“triggered by making of a copy”
      does it apply to facts?
  how? (papers, ontologies, data)

 “in a manner specified by ...”
      attribution stacking
how to perform?
  how much is enough?
unintentional infringement
  (example, ontologies)

      does it apply?
citation:
(gentle(wo)man’s club)

    legal requirement?
     interoperability?
credit where credit is due
entrenched scientific norm
we shouldn’t use the law to make it
   hard to do the wrong thing ...
need for a legally accurate and
            simple solution

reducing or eliminating the need to make
   the distinction of what’s protected

  requires modular, standards-based
          approach to licensing
calls for data providers to waive all rights
necessary for data extraction and re-use

  requires that providers place no additional
    obligations (like share-alike) that limit
              downstream use

 request behavior (like attribution) through
        norms and terms of use
... must promote legal predictability and certainty.

             ... must be easy to use and understand.

... must impose the lowest possible transaction costs on
                         users.

full text:
http://sciencecommons.org/projects/publishing/open-access-data-protocol/
set of principles (not license)

open, accessible, interoperable

      know it’s safe to play
impose “toggles” through norms,
         terms of use

   best fit for the discipline

 doesn’t limit downstream use
at best, we’re partially right.

at worst, we’re really wrong.
resist the temptation to treat
              as property

embrace the potential to treat instead
      as a network resource
scalability is key

“get law out of the way”

build + allow for network effects
the right to fix our mistakes.
(remember Prodigy and AOL?)
thank you.

kaitlin@creativecommons.org
      sciencecommons.org
     creativecommons.org
   slideshare.net/kaythaney
