

#DataCommons


"The Report

The EuroStack: Europe Innovating on Its Own Terms

EuroStack is our rallying cry to shape Europe’s digital destiny. By prioritizing strategic autonomy and harnessing breakthrough tech—AI, chips, quantum computing, advanced cloud solutions—we’re preparing Europe for tomorrow’s challenges and new opportunities.

It’s time to build a digital future that mirrors our democratic values, economic ambitions, and environmental commitments.

Seize the Opportunity

EuroStack isn’t about idealism—it’s hard realism. The alternative? A Europe reduced to digital colonialism, with industries hollowed out, citizens under foreign surveillance, and climate goals held hostage by monopolies.

We have the tools, we have the talent. Now we need the will to innovate on our own terms."

euro-stack.info/

www.euro-stack.info · EuroStack: Building a European alternative for technological sovereignty

Interesting data from a new edition of the Foundation Model Transparency Index, collected six months after the initial index was released.

Overall, there's a big improvement, with the average score jumping from 37 to 58 points (out of 100). That's a lot!

One interesting fact is that the researchers contacted developers and solicited data: interactions count.

More importantly, there is little improvement, and little overall transparency, in a category that the researchers describe as "upstream": the data, labour and compute that go into training. And "data access" gets the lowest score of all the parameters.

More at Tech Policy Press: techpolicy.press/the-foundatio

Tech Policy Press · The Foundation Model Transparency Index: What Changed in 6 Months? · Fourteen model developers provided transparency reports on each of 100 indicators devised by Stanford, Princeton, and Harvard researchers.

#AI #GenerativeAI #AITraining #Copyright #FairUse #IP #DataCommons #Books: "This paper is a snapshot of an idea that is as underexplored as it is rooted in decades of existing work. The concept of mass digitization of books, including to support text and data mining, of which AI is a subset, is not new. But AI training is newly of the zeitgeist, and its transformative use makes questions about how we digitize, preserve, and make accessible knowledge and cultural heritage salient in a distinct way.

As such, efforts to build a books data commons need not start from scratch; there is much to glean from studying and engaging existing and previous efforts. Those learnings might inform substantive decisions about how to build a books data commons for AI training. For instance, looking at the design decisions of HathiTrust may inform how the technical infrastructure and data management practices for AI training might be designed, as well as how to address challenges to building a comprehensive, diverse, and useful corpus. In addition, learnings might inform the process by which we get to a books data commons — for example, illustrating ways to attend to the interests of those likely to be impacted by the dataset’s development." openfuture.pubpub.org/pub/towa

Open Future · Towards a Books Data Commons for AI Training · This paper, which has been informed by a series of workshop discussions, maps possible paths to building a books data commons and outlines key questions relevant to developers, repositories, and other potential stakeholders.

Open Future's newest white paper, authored by @zwarso and myself, addresses the governance of data sets used for #AI training.

Over the past two years, it has become evident that shared datasets are necessary to create a level playing field and support AI solutions in the public interest. Without these shared datasets, companies with vast proprietary data reserves will always have the winning hand.

However, data sharing in the era of AI poses new challenges. Thus, we need to build upon established methods like #opendata, refining them and integrating innovative ideas for data governance.

Our white paper proposes that data sets should be governed as commons, shared and responsibly managed collectively. We outline six principles for commons-based governance, complemented by real-life examples of these principles in action.

openfuture.eu/publication/comm

Open Future · Commons-based Data Set Governance · In this white paper, we propose an approach to sharing data sets for AI training as a public good governed as a commons.