Brief: In open source philosophy, you share source code. Why not share data along the same lines? That’s what the Linux Foundation’s Community Data License Agreement tries to address.
I am here on the first day of Open Source Summit Europe 2017, in Prague. Things have just started. Most of the buzz I hear is around containers, but amid all that, one of the major new announcements today is the Community Data License Agreement (CDLA).
In his keynote this morning, Jim Zemlin, head of the Linux Foundation, introduced this new open source license for sharing data for mass collaboration. The idea is similar to the open source philosophy of sharing source code.
Jim states that the “CDLA licenses are an effort to define a licensing framework to support collaborative communities built around curating and sharing ‘open’ data”.
You have probably heard about Big Data. It plays an important role in machine learning, artificial intelligence and so on. Now imagine vast amounts of data being available for the community to analyze and use to create new machine learning and AI projects.
For example, self-driving cars rely heavily on AI systems, and they need massive volumes of data to function properly. A single car could generate nearly a gigabyte of data every second on the road. For the average car, that means two petabytes of sensor, audio, video and other data each year. If automakers can share data, they may be able to improve safety and the overall experience, thanks to community projects using that data in their AI projects.
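As a rough sanity check on those two figures, here is a small back-of-the-envelope sketch. It assumes decimal units (1 petabyte = 1,000,000 gigabytes) and rounds "nearly a gigabyte" up to exactly one gigabyte per second. The driving time it prints is only what the two quoted numbers imply together, not a figure from the keynote, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the quoted figures.
# Assumptions: decimal units (1 PB = 1,000,000 GB) and a flat 1 GB/s rate.
GB_PER_SECOND = 1.0      # "nearly a gigabyte of data every second"
PB_PER_YEAR = 2.0        # "two petabytes ... each year"
GB_PER_PB = 1_000_000


def implied_driving_hours_per_year(gb_per_second: float, pb_per_year: float) -> float:
    """Hours on the road per year needed to reach the yearly data volume."""
    seconds_on_road = pb_per_year * GB_PER_PB / gb_per_second
    return seconds_on_road / 3600


hours_per_year = implied_driving_hours_per_year(GB_PER_SECOND, PB_PER_YEAR)
hours_per_day = hours_per_year / 365

print(f"{hours_per_year:.0f} hours per year, about {hours_per_day:.1f} hours per day")
# prints: 556 hours per year, about 1.5 hours per day
```

So the two numbers fit together if the average car spends roughly an hour and a half on the road each day.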
The CDLA licenses will help individuals and organizations to share data as easily as they share open source software code at present. Carefully drafted licensing models can help “people form communities to assemble, curate and maintain vast amounts of data, measured in petabytes and exabytes, to bring new value to communities of all types, to build new business opportunities and to power new applications that promise to enhance safety and services”.