Begun, the open source AI wars have

Opinion The Open Source Initiative (OSI) and its allies are getting closer to a definition of open source AI. If all goes well, Stefano Maffulli, the OSI's executive director, expects to announce the OSI open source AI definition at All Things Open in late October. But some open source leaders already want nothing to do with it.

Let's start with some background. Lots of companies – I'm looking at you, Meta – have been claiming that their AI models are open source. They're not. They're not even close.

So the OSI and a host of other companies and groups have been working on creating a comprehensive open source AI definition. After all, the OSI is the same organization that defines open source software with the Open Source Definition.

The latest draft, Open Source AI Definition v. 0.0.9, which was announced at KubeCon and Open Source Summit Asia in Hong Kong, made two significant changes that grated on the nerves of some open source supporters. These are:

  • Role of training data: Training data is beneficial but not required for modifying AI systems. This decision reflects the complexities of sharing data, including legal and privacy concerns. The draft categorizes training data into open, public, and unshareable non-public data, each with specific guidelines to enhance transparency and understanding of AI system biases.
  • Separation of checklist: The license evaluation checklist has been separated from the main definition document, aligning with the Model Openness Framework (MOF). This separation allows for a focused discussion on identifying open source AI while maintaining general principles in the definition.

As Linux Foundation executive director Jim Zemlin detailed at the KubeCon and Open Source Summit China, the MOF "is a way to help evaluate if a model is open or not open. It allows people to grade models."

Within the MOF, Zemlin added, there are three tiers of openness. "The highest level, level one, is an open science definition where the data, every component used, and all of the instructions must go and create your model the same way. Level two is a subset where not everything is open, but most are. Then, on level three, you have areas where the data may not be available, and the data that describe the data sets would be available. And you can understand that – even though the model is open – not all the data is available."

This doesn't fly with some people. Tara Tarakiyee, FOSS Technologist for the Sovereign Tech Fund, writes: "A system that can only be built on proprietary data can only be proprietary. It doesn't get simpler than this self-evident axiom."

Tarakiyee adds: "The new definition contains so many weasel words that you can start a zoo... These words provide a barn-sized backdoor for what are essentially proprietary AI systems to call themselves open source."

Open source leader julia ferraioli agrees: "The Open Source AI Definition in its current draft dilutes the very definition of what it means to be open source. I am absolutely astounded that more proponents of open source do not see this very real, looming risk."

AWS principal open source technical strategist Tom Callaway said before the latest draft appeared: "It is my strong belief (and the belief of many, many others in open source) that the current Open Source AI Definition does not accurately ensure that AI systems preserve the unrestricted rights of users to run, copy, distribute, study, change, and improve them."

Afterwards, in a more sorrowful than angry statement, Callaway wrote: "I am deeply disappointed in the OSI's decision to choose a flawed definition. I had hoped they would be capable of being aspirational. Instead, we get the same excuses and the same compromises wrapped in a facade of an open process."

Chris Short, an AWS senior developer advocate, Open Source Strategy & Marketing, agreed. He responded to Callaway: "100 percent believe in my soul that adopting this definition is not in the best interests of not only OSI but open source at large will get completely diluted."

Steve Pousty, a developer advocacy consultant, commented on the OSI AI draft: "This definition does not grant the freedom to modify and is unacceptable as an Open Source Definition. With AI models, the weights are the user interface. I can use them directly as a user. They are what is typically distributed to everyone."

That's all well and good, but Maffulli doesn't feel a purely idealistic approach to the open source AI definition will work, because no one would be able to meet such a definition. Hence the OSI's support for the MOF's tiered approach to openness.

Callaway concluded: "They had a chance to lead, and they chose not to. I suppose the question is now: who will choose to lead in their place?"

That is indeed the question. Or will the community decide that the OSI AI Definition is the best practical way forward? Stay tuned. I fear this debate is going to last for years.

The real question to my mind is whether this will become a meaningless tech argument, such as vi vs EMACS (the answer's vi, by the way), while AI goes its merry way without referencing "open source" except as a marketing term. ®

Source: go.theregister.com
