Events. Science Commons

Policy and Technology for e-Science

July 16-17, 2008
Barcelona, Spain

collaborating for the future of Open Science

This workshop will look to discuss and set basic principles of “Open Science”, taking a closer look at the common tenets necessary for a system to be identified as an “Open Science” system.

The event is co-sponsored by Science Commons, the Scholarly Publishing and Academic Resources Coalition (SPARC), the Center for the Study of the Public Domain at Duke University (CSPD) and the Institut d’Estudis Catalans.


These conferences aim to bring together a diverse audience of scientists, policy makers, and commons advocates who are actively interested in designing ways to make access to scientific knowledge more widely available and more transparent across all scientific disciplines. For more information, visit the event’s Web site.

Original text: Science Commons

The Health Commons

It’s time to bring the same efficiencies to human health that the network brought to commerce and culture. And to do that, it takes a Commons.

“An Introduction to Health Commons” by John Wilbanks

Science Commons’ John Wilbanks lays out the argument for the Health Commons - how the existing drug discovery process is broken, and where to look for inspiration in how to fix it.

For our complete vision, please read Health Commons: Therapy Development in a Networked World - an introduction and overview, by John Wilbanks and Marty Tenenbaum.

The Health Commons: Solving the Health Research Puzzle

The pharmaceutical industry is at a crossroads. Despite revolutionary advances in molecular biology that have made genetic decoding routine, the time from gene to cure still stands at 17 years. High-throughput screening methods allow us to test the efficacy of millions of compounds against a molecular target in a single week; but the odds of one of those compounds making it through the development pipeline and becoming a drug are less than 1/1,000,000. A well-funded group starting today, using the traditional model of drug development, has a very slim chance at getting a drug to market by 2025.

The time has come to change the way we cure disease. We are no longer asking whether a gene or a molecule is critical to a particular biological process; rather, we are discovering whole networks of molecular and cellular interactions that contribute to disease. And soon, we will have such information about individuals, rather than the population as a whole. Biomedical knowledge is exploding, and yet the system to capture that knowledge and translate it into saving human lives still relies on an antiquated and risky strategy of focusing the vast resources of a few pharmaceutical companies on just a handful of disease targets.

The Health Commons Vision

Imagine a virtual marketplace or ecosystem where participants share data, knowledge, materials and services to accelerate research. The components might include databases on the results of chemical assays, toxicity screens, and clinical trials; libraries of drugs and chemical compounds; repositories of biological materials (tissue samples, cell lines, molecules), computational models predicting drug efficacies or side effects, and contract services for high-throughput genomics and proteomics, combinatorial drug screening, animal testing, biostatistics, and more. The resources offered through the Commons might not necessarily be free, though many could be. However, all would be available under standard pre-negotiated terms and conditions and with standardized data formats that eliminate the debilitating delays, legal wrangling and technical incompatibilities that frustrate scientific collaboration today.

We envision a Commons where a researcher will be able to order everything needed to replicate a published experiment as easily as ordering DVDs from Amazon. A Commons where one can create a workflow to exploit replicated results on an industrial scale – searching the world’s biological repositories for relevant materials; routing them to the best labs for molecular profiling; forwarding the data to a team of bioinformaticians for collaborative analysis of potential drug targets; and finally hiring top service providers to run drug screens against those targets; with everything – knowledge, data, and materials – moving smoothly from one provider to the next, monitored and tracked with FedEx precision; where the workflow scripts themselves can become part of the Commons, for others to reuse and improve. Health Commons’ marketplace will slash the time, cost, and risk of developing treatments for diseases. Individual researchers, institutions, and companies will be able to publish information about their expertise and resources so that others in the community can readily discover and use them. Core competencies, from clinical trial design to molecular profiling, will be packaged as turnkey services and made available over the Net. The Commons will serve as the public-domain, non-profit hub, with third parties providing value-added services that facilitate information access, communication, and collaboration.

What is Health Commons?

Health Commons is a coalition of parties interested in changing the way basic science is translated into the understanding and improvement of human health. Coalition members agree to share data, knowledge, and services under standardized terms and conditions by committing to a set of common technologies, digital information standards, research materials, contracts, workflows, and software. These commitments ensure that knowledge, data, materials and tools can move seamlessly from partner to partner across the entire drug discovery chain. They enable participants to offer standardized services, ranging from simple molecular assays to complex drug synthesis solutions, that others can discover in directories and integrate into their own processes to expedite development — or assemble like LEGO blocks to create new services.

The Health Commons is too complex for any one organization or company to create. It requires a coalition of partners across the spectrum. It is also too complex for public, private, or non-profit organizations alone - reinventing therapy development for the networked world requires, from the beginning, a commitment to public-private partnership. Only through a public-private partnership can the key infrastructure of the Commons be created: the investments in the public domain of information and materials will only be realized if that public domain is served by a private set of systems integrators and materials, tools and service providers motivated by profit. And in turn, the long-term success of the private sector depends on a growing, robust, and self-replenishing public domain of data, research tools, and open source software.


The Neurocommons. Open Source knowledge management

open source knowledge management

The Neurocommons project is creating an Open Source knowledge management platform for biological research. The first phase, a pilot project to organize and structure knowledge by applying text mining and natural language processing to open biomedical abstracts, was released to alpha testers in February 2007. The second phase is the development of a data analysis software system. The software will be released by Science Commons under the BSD Open Source License. These two elements together represent a viable open source platform based on open content and open Web standards.

We are launching this effort in neuroscience – thus, calling it the Neurocommons – to create network effects within a single therapeutic area and to leverage the connections we have developed with neurodegenerative disease funders through our MTA project. The long-term elements of the Neurocommons revolve around the mixture of commons-based peer editing and annotation of the pilot knowledge project and the creation of an open source software community around the analytics platform.

(To read more about the technical work behind the Neurocommons, visit the Neurocommons Technical Overview. Also, for information on how to format your database legally to declare freedom to integrate, read our Open Access Data Protocol and accompanying FAQ.)

The reason behind all of this …
The Neurocommons project comes out of Creative Commons’ history with the Semantic Web. Executive Director John Wilbanks founded the Semantic Web for Life Sciences project at the World Wide Web Consortium and led a semantics-driven bioinformatics startup company to acquisition in 2003. Science Commons Fellows Jonathan Rees and Alan Ruttenberg play key roles in the Semantic Web development efforts for science.

The scope of knowledge in the public and private domains has led many experts in the field of pharmaceutical knowledge management to embrace commons-based production methods and pre-competitive knowledge sharing. No one company, even one such as Pfizer, can capture, represent, and leverage all the available knowledge on the Web.

Our research meshed with emergent efforts and interest from the pharmaceutical industry in technology to harvest common terms and relationships from unstructured text and databases to provide a “map” of the implicit semantics shared in and across domains. Pfizer and Biogen both contributed significant input to early discussions, and the system is modeled in part on a service already in use at Novartis, though that service is proprietary.

Where we are now …
Currently, our Neurocommons team is working to release, improve and extend an open knowledgebase of annotations to the biomedical abstracts (in RDF), debugging and tailoring an open-source codebase for computational biology, and gradually integrating major neuroscience databases into the annotation graph. All the while, we are using these efforts to bring the neuroscience community together around open approaches to systems biology.

With this system, scientists will be able to load in lists of genes that come off the lab robots, and get back those lists of genes with relevant information around them based on the public knowledge. They’ll be able to find the papers and pieces of data where that information came from, much faster and with more relevant results than Google or a full-text literature search, because for all the content in our system, we’ve got links back to the underlying sources. And they’ve each got an incentive to put their own papers into the system, or to make their corner of the system more accurate: the better the system models their research, the better results they’ll get.
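As a rough illustration of the lookup described above, here is a minimal Python sketch. It is not the Neurocommons codebase: the gene symbols, predicates, source identifiers, and the function names `build_index` and `annotate` are all hypothetical, and plain tuples stand in for RDF statements. The point it shows is that every returned fact carries a link back to the source it came from.

```python
from collections import defaultdict

# Hypothetical annotations: (subject, predicate, object, source).
# Every name below is invented for illustration; real data would be RDF.
ANNOTATIONS = [
    ("GENE_A", "associated_with", "process_x", "abstract:1001"),
    ("GENE_A", "interacts_with", "GENE_B", "abstract:1002"),
    ("GENE_B", "associated_with", "process_y", "database:demo/42"),
]


def build_index(annotations):
    """Group statements by subject so each gene lookup is one dictionary access."""
    index = defaultdict(list)
    for subject, predicate, obj, source in annotations:
        index[subject].append(
            {"predicate": predicate, "object": obj, "source": source}
        )
    return index


def annotate(gene_list, index):
    """Return each input gene with the statements known about it and their sources."""
    return {gene: list(index.get(gene, [])) for gene in gene_list}


index = build_index(ANNOTATIONS)
result = annotate(["GENE_A", "GENE_C"], index)
for gene, facts in result.items():
    print(gene, facts)
```

A gene with no annotations (here the made-up `GENE_C`) comes back with an empty list, which is where a researcher contributing their own papers would improve the results.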

We’ll be inviting the bioinformatics community to work on both the content and the analytic software. Neither one can easily reach its potential in a single organization. The scale of the knowledge-mapping effort is vast, and for the foreseeable future it’s going to require human input at some level to make it as accurate as possible (text mining is necessary but not sufficient). The model here is a machine-seeded Wikipedia, not unlike the way translations sometimes work for Wikipedia content, where humans come in and tend to the patches of knowledge they care about. Because it’s all in RDF, it all hooks up into a single usable network, and decentralized, incremental edits turn into a really accurate knowledge model.

Looking into the future …
In the short term this is most valuable to people who already know how to use it. The skill set is rare and still considered a specialty. But over time the use of machine-annotation should evolve into a mainstream part of biology, just as the use of machine-generated data has evolved.

The longer-term social goal is to bring the kind of advanced techniques that are common in pharma to more researchers and improve the quality of the information that’s available across the entire space of research, from pharma to university to industry to government. Right now, pharma can try to cobble together all of the information across its own enterprise, but that is so expensive it’s not available to the other three stakeholders. It doesn’t draw on the other three stakeholders in any meaningful way, much less an automated way. This will allow all the researchers involved in high-throughput systems biology efforts to ask good questions, using all the information available, regardless of financial position. That means more knowledge moving back into the canon, faster, which will lead to a more systemic understanding of disease and cell activity for industry to call on.

The second phase is the development of a data analysis software system to be released by Science Commons under the BSD Open Source License. Without the software, we’d have the current state: a hodgepodge of software that lets you view networks, software tied to a specific protein network, and a couple of expensive closed platforms. A web without a browser and a search engine. This is akin to Mozilla for the life sciences Semantic Web, letting normal non-bioinformatics researchers input massive data sets and get back a sense of what’s really going on in the data, what pieces of the cell are activated or deactivated, when, and where. The result is better hypotheses coming out of experiments, which translates to more good experiments, and more papers to feed back into the system. And since it’s open and RDF, it again all hooks up, and feeds directly into the modern pharma IT systems to make decisions better there, too.

How SC-Data Works

SC-Data is guided by a group of expert advisers from both the sciences and the law and by the scientific community. We build “requirements” through public listserv discussion and the Data Working Group – in much the same spirit as the functional specifications for software are developed.

See also:

Frequently Asked Questions about Databases and Creative Commons

Neurocommons Technical pages

Please read the Neurocommons Project Background Briefing for the issues driving Science Commons’ work in this area.


Biological Materials Transfer Project (MTA)

building the clearinghouse for research tools

The Biological Materials Transfer Agreement Project (MTA) develops and deploys standard, modular contracts to lower the costs of transferring physical biological materials such as DNA, cell lines, model animals, antibodies and more. The MTA project covers transfers among non-profit institutions as well as between non-profit and for-profit institutions. It integrates existing standard agreements into a Web-deployed suite alongside new Science Commons contracts, and allows for the emergence of a transaction system along the lines of Amazon or eBay by using the licensing as a discovery mechanism for materials.

Our MTA project has a simple goal: to make scientists’ work easier and to allow them to start their work sooner. That means we need to help scientists better locate and order the materials needed, rather than letting months pass and thus jeopardizing research plans. The cumulative impact of such an innovation over time is enormous – systematically accelerated discovery, eventually leading to earlier cures and useful applications from science. It would not be a stretch to imagine that the course of many lives – especially those waiting for cures – would be very different in an alternative universe where ordering these kinds of research materials is fast and effortless.

The reason behind all of this …
The licensing problem that exists in the transfer of biological materials was brought to our attention as a result of a multidisciplinary workshop we held in the fall of 2005. While patents are popularly identified as impediments to scientific progress, our interview-based research yielded remarkable consensus that slow, or non-existent, materials transfer among entities was a far more significant drag on the basic research cycle.

Materials represent tacit knowledge – generating a bacterial vector or an antibody can take months or years, and replicating the work is time-consuming and difficult. Gaining access to those materials is subject to secrecy, competition, lack of resources to manufacture materials, lack of time, legal transaction costs and delays, and more.

We thus decided to take on the goal of addressing not only the legal transaction costs, through a contract suite, but the entire cycle. In order to explore this further, we assembled a working group consisting of funders of neurodegenerative disease research, technology transfer officials, materials repositories, legal theorists, and other experts. This mix ensured that the contract would meet at least the initial requirements of a transaction system in which funders encouraged the use of the agreements, scientists deposited the materials in repositories for fulfillment and technology managers were comfortable binding themselves to the agreements.

(We have also compiled empirical data and other findings about materials transfer problems. To see that information, click here.)

Initial work …
With the help of our working group, we drafted a set of contracts and revised them, then made early presentations of the work through our social network to other technology transfer offices, organizations working to optimize university-industry innovation, efforts to write software for MTA workflows, materials repositories, and more, to refine our initial release. This work led us to not only build our own agreements to address university-industry transfer but to incorporate two key existing university-university agreements – the Uniform Biological Materials Transfer Agreement (UBMTA) and the Simple Letter Agreement (SLA).

We then began work on porting those contracts into the Creative Commons methodology. The contracts launch with a Web-enabled, question-driven interface, human-readable deeds, and metadata, and will work with the Creative Commons software infrastructure for search and relationship tracking.

This metadata-driven approach, based on the success of integrating Creative Commons licensing into search engines, further allows for the integration of materials licensing directly into the research literature and databases, so that scientists can “one-click” order inline as they perform typical research. And like Creative Commons licensing, we can leverage the ccHost platform to track materials propagation and reuse, creating new data points for the impact of scientific research that are more dimensional than simple citation indices, tying specific materials to related peer-reviewed articles and data sets.

Where we are now …
Our efforts in building a system to meet these needs build upon the best of what is already available (for example, the standard UBMTA). We have added more options to address unmet needs, and have done so in ways that will lead to greater standardization, not more burdens upon technology transfer offices and lawyers who are already stretched too thin to cope with the volume of work created by the current way of doing things. We have made the agreements easier to understand and identify for the (non-lawyer) scientist. We continue to work closely with universities, institutions and the private sector to ensure that scientists are fully empowered to exchange and re-use materials with less paperwork.

The iBridge Network will host the deployment of our beta test of the software infrastructure, which includes the UBMTA, Simple Letter Agreement, and a new suite of flexible, modular MTAs that will be offered for the first time by Science Commons. We believe that this suite offers enough choices to standardize the vast majority of scenarios involving materials transfer.

Our beta test, with the iBridge Network, will allow participating institutions to lower the friction of sharing materials among them. We will integrate into the iBridge Network a portal through which providers of materials can use a simple interface, with drop-down menus and standard questions, to choose an MTA. By answering simple questions, they will be offered a choice between one of the standard academic MTAs and the ability to customize an MTA based on the Science Commons MTA suite. At the end of this process, they will receive a copy of an MTA (the actual legal text) generated from their choices and responses, accompanied by metadata that can be used to tag the offering on the Web in ways that make it easy for search engines to find and categorize, and a “human-readable deed” whose purpose is to “brand” each distinct MTA in ways that make its relevant attributes immediately obvious to human beings.
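The question-driven flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the iBridge or Science Commons software: the two questions, the selection rule, the agreement labels, and the function names `choose_mta` and `build_offer` are all hypothetical, and the legal text is a placeholder string. It only shows the shape of the output: one set of answers yields legal text, machine-readable metadata, and a human-readable deed.

```python
def choose_mta(recipient_is_nonprofit, wants_custom_terms):
    """Pick an agreement family from two yes/no answers.

    The rule here is invented for illustration; the real question set
    is not described in the text.
    """
    if recipient_is_nonprofit and not wants_custom_terms:
        return "standard-academic"  # for example, the UBMTA or the SLA
    return "sc-custom"  # an agreement customized from the Science Commons suite


def build_offer(material_name, answers):
    """Return the three artifacts the provider receives at the end of the process."""
    agreement = choose_mta(**answers)
    return {
        # Placeholder only; a real system would generate the full contract.
        "legal_text": f"[legal text of the {agreement} agreement]",
        # Machine-readable tags that a search engine could index.
        "metadata": {"material": material_name, "agreement": agreement},
        # Short summary meant for human readers.
        "deed": f"{material_name}: offered under {agreement} terms",
    }


offer = build_offer(
    "example plasmid",
    {"recipient_is_nonprofit": True, "wants_custom_terms": False},
)
print(offer["deed"])
```

Keeping the selection rule in one small function means the questions can change without touching the code that assembles the legal text, metadata, and deed.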

We have taken full advantage of Web technology to build a technology infrastructure that can support powerful searching and tracking of available materials. By putting all of these pieces together, we envision our materials transfer system to be one day as efficient as eBay for auctions, Amazon for ordering products, or Google for searching for content. Why have these examples of success in e-Commerce not been fully exploited for solving the problem of getting materials? Why are we still exchanging materials more or less the same way we did before the Web was born?

By making materials easy to find on the Web and easy to acquire via standard contracts, any such system runs the risk of creating burdens on laboratories and scientists. Manufacturing and distributing materials is not the top priority for most labs.

That’s why, from the beginning, we involved partners like Addgene - organizations that serve the scientific community by providing the distribution and manufacturing as a service. The entire cycle of materials transfer requires not just contracts and technology, but also the presence of service organizations that can ship and track the materials under standard contract.

Part of the answer is that no one can do this alone. It takes many acting together to make this vision work. It will take policy changes and new ways of doing things. Our MTA project is one piece of that vision.

How SC-Licensing Works

SC-Licensing is guided by a group of expert advisers from both the sciences and the law and by the scientific community. We build “requirements” through public listserv discussion and the Licensing Working Group – in much the same spirit as the functional specifications for software are developed.

For more information on the issues driving Science Commons’ work in this area, see our Biological Materials Transfer Project Background Briefing. You can also read our empirical data and findings about materials transfer problems.


Scholar’s Copyright Project. Open Access Data Protocol

Open Access Data Protocol

The Science Commons Open Access Data Protocol is a method for ensuring that scientific databases can be legally integrated with one another. The protocol is not a license or legal tool, but instead a methodology and best practices document for creating such legal tools in the future, and marking data in the public domain for machine-assisted discovery.

For more information about the Protocol and our stance on Databases, visit our FAQ page, where you can read our new FAQ on the Database Protocol. Also, click here to read the official announcement of the Protocol, issued in concordance with the 5th anniversary of Creative Commons’ licenses.


freedom to archive and reuse scholarly works on the Internet

Applying traditional copyright metaphors to digital scientific communication restricts the number of opportunities afforded by the networked world. Translations of language and file format are banned. Intrepid scientists can’t aggregate a set of interesting articles into a PDF file for distribution to their colleagues. And new technologies like text mining can’t help scientists understand the 16,000,000 articles currently indexed for biomedical science.

Increased Open Access to peer-reviewed scholarly literature is essential. The network culture opens up enormous possibilities for discovery and research - more knowledge, distributed across the world at the speed of fiber, stored in digital formats accessible to machines for indexing, search, and innovative research such as our Neurocommons project. The benefits of Open Access are many, and have been well documented. The question that remains, though, is how.

There are two ways to go OA - publishing in an Open Access journal or self-archiving. Science Commons provides resources in support of both.


Through the Scholar’s Copyright project, Science Commons offers a spectrum of tools and resources catering to both methods of achieving Open Access. Here is a glimpse of our efforts and successes to date.

OA Publishing and CC Licensing

One road to Open Access is to publish in an OA journal. More than 250 peer-reviewed scholarly journals implement their OA philosophy using Creative Commons licensing. Key adopters include the Public Library of Science (PLoS), Hindawi, and BioMed Central.

Each uses the Creative Commons Attribution License, which is a legal implementation of the Open Access vision as laid out by the Budapest Open Access Initiative. The result - more information is made freely available for public consumption, and without unintended consequences of applying at least 70 years’ worth of control to scientific knowledge.

For more information about Creative Commons licenses, visit our FAQ or Creative Commons’ licenses page. And if you have a journal under CC licensing, send us an e-mail so we can add you to our list.

Scholar’s Copyright Addendum Engine (SCAE)

(For more background information on SCAE, visit the project page.)

Another road to OA is to put an archive copy of the peer-reviewed article on the Web (“self-archiving”) after publication in a peer-reviewed journal. This approach is gaining significant traction as well, with the European Union, US National Institutes of Health, and the Wellcome Trust adopting policy initiatives based on archives.

Although most journals support some form of self-archiving, the number of variables in their policies creates real confusion among authors. The Science Commons Author Addenda help scholars negotiate the rights they need to use and distribute their work via self-archiving, eliminating confusion and doubt as to when, where, and how authors can make their work available to the world.

The Scholar’s Copyright Addendum Engine (SCAE) is a simple interface for generating a signature-ready Addendum. The SCAE generates the one-page document, which amends the copyright transfer agreements issued by publishers. This ensures that the author can make their work freely available on the public Internet whether upon publication, pre-publication (in the form of the author’s final manuscript), or after a certain period of time. Our FAQ walks you through how to do this, step by step.

For more information on how you can integrate the Addendum Engine into your Web site, visit this page.

Open Access Law Program

Over 35 law journals have committed to the Open Access Law program since launch.

The Open Access Law (OAL) Program provides a comprehensive set of resources promoting open access in legal scholarship. The program relies on self-assessment and self-reporting, arming the editorial boards of law journals with the means to go OA.

The OAL program consists of a set of principles of Open Access, committing both author and journal to basic tenets of OA, and a free model agreement between authors and journals that implements the principles in contract.

Visit our Web site to see what journals are already on board.


What is Science Commons

There are petabytes of research data being produced in laboratories around the world, but the best web search tools available can’t help us make sense of it. Why? Because more stands between basic research and meaningful discovery than the problem of search.

Many scientists today work in relative isolation, left to follow blind alleys and duplicate existing research. Data is balkanized — trapped behind firewalls, locked up by contracts or lost in databases that can’t be accessed or integrated. Materials are hard to get — universities are overwhelmed with transfer requests that ought to be routine, while grant cycles pass and windows of opportunity close. It’s not uncommon for research sponsors to invest hundreds of millions of dollars in critically important efforts like drug discovery, only to see them fail.

The consequences in many cases are no less than tragic. The time it takes to go from identifying a gene to developing a drug currently stands at 17 years — forever, for people suffering from disease.

Science Commons has three interlocking initiatives designed to accelerate the research cycle — the continuous production and reuse of knowledge that is at the heart of the scientific method. Together, they form the building blocks of a new collaborative infrastructure to make scientific discovery easier by design.

Making scientific research “re-useful” — We help people and organizations open and mark their research and data sets for reuse. Learn more.

Enabling “one-click” access to research materials — We help streamline the materials-transfer process so researchers can easily replicate, verify and extend research. Learn more.

Integrating fragmented information sources — We help researchers find, analyze and use data from disparate sources by marking and integrating the information with a common, computer-readable language. Learn more.

Science Commons in action
We implement all three elements of our approach in the Neurocommons, our “proof-of-concept” project within the field of neuroscience. The Neurocommons is a beta open source knowledge management system for biomedical research that anyone can use, and anyone can build on.

