
Today, the LF AI & Data Foundation unveils the Generative AI Commons, dedicated to advancing open source generative AI technologies, including Large Language Models (LLMs), through trusted neutral governance and transparent collaboration.

LF AI & Data has experienced significant growth this year, marked by a substantial rise in both projects and new members. We are thrilled to welcome AWS, Intel, and IBM as Premier members, along with Ant Group, Fujitsu, CloudGeometry, and New Relic as General members. In line with our expansion efforts, we are proud to introduce the Generative AI Commons, dedicated to advancing open source generative AI technologies, including Large Language Models (LLMs), through transparent collaboration and trusted neutral governance.

Furthermore, we are excited to announce that RWKV (Receptance Weighted Key Value) will be joining the Foundation in incubation as the first technical project of the Generative AI Commons. The project began as an EleutherAI community project led by Bo Peng and is now being donated to the Linux Foundation. Compute for training the original RWKV models was sponsored by StabilityAI.

RWKV is a Recurrent Neural Network (RNN) model that combines the strengths of both Transformers and RNNs. It offers parallelizable training akin to Transformers and efficient inference similar to RNNs. This blend delivers strong performance, swift inference, VRAM optimization, fast training, extended context, and sentence embeddings, all achieved without the use of attention mechanisms. A paper introducing the model was recently released and is currently under review. You can access the preprint here.
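For readers curious how one model can train like a Transformer yet run like an RNN, the following is a minimal, simplified sketch of the weighted-average ("WKV") recurrence described in the RWKV preprint, reduced to a single channel. The function names and the scalar parameters `w` (decay) and `u` (bonus for the current token) are illustrative assumptions, not the project's actual code, and the sketch omits the numerical stabilization a real implementation would need. It computes the same outputs two ways: step by step with a small fixed-size state, and directly from the full history for each position.

```python
import math

def wkv_recurrent(k, v, w, u):
    """RNN-style: carry only two running sums from token to token."""
    num, den = 0.0, 0.0  # decayed weighted sums over past tokens
    out = []
    for k_t, v_t in zip(k, v):
        bonus = math.exp(u + k_t)  # extra weight given to the current token
        out.append((num + bonus * v_t) / (den + bonus))
        decay = math.exp(-w)       # past contributions shrink at every step
        num = decay * num + math.exp(k_t) * v_t
        den = decay * den + math.exp(k_t)
    return out

def wkv_direct(k, v, w, u):
    """History-based: each position is computed independently of the others,
    so all positions could be evaluated in parallel during training."""
    out = []
    for t in range(len(k)):
        num = sum(math.exp(-(t - 1 - i) * w + k[i]) * v[i] for i in range(t))
        den = sum(math.exp(-(t - 1 - i) * w + k[i]) for i in range(t))
        bonus = math.exp(u + k[t])
        out.append((num + bonus * v[t]) / (den + bonus))
    return out

# Hypothetical toy inputs: per-token "key" and "value" scalars.
keys = [0.1, -0.3, 0.5, 0.2]
values = [1.0, 2.0, -1.0, 0.5]
step_by_step = wkv_recurrent(keys, values, w=0.5, u=0.3)
from_history = wkv_direct(keys, values, w=0.5, u=0.3)
```

Both functions return the same sequence up to floating-point rounding. The recurrent form needs only constant memory per step at inference time, while the direct form shows why no position has to wait for the previous one during training.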

Demonstrating our thriving ecosystem, the LF AI & Data Foundation has seen a remarkable 142% increase in contributors and now counts over 55 projects, more than 60 members, and a community of over 27,000 active contributors. This growth is a testament to our commitment to innovation and excellence as LF AI & Data continues to evolve. The introduction of the Generative AI Commons is a natural progression in our pursuit of these goals.

Generative AI, particularly Large Language Models (LLMs), is rapidly becoming key to enterprise and cross-industry applications. LLMs have versatile applications, including text generation, natural language understanding, conversational dialog, code generation, and emerging multi-modal capabilities. However, their popularity and limited accessibility raise concerns about ethics, data privacy, transparency, and intellectual property.

The Generative AI Commons promotes open source and open-science principles, fostering collaborative development while ensuring fair and transparent governance. This initiative aims to create a neutral and inclusive community where organizations collaborate and contribute to developing enterprise-ready platforms, filling a critical gap in the generative AI landscape. With thousands of members and collaborations across various industry sectors, it advances leading open source technologies.

In this complex landscape, the Generative AI Commons aims to establish a framework for evaluating open source generative AI projects based on their components’ availability and degree of openness. This framework helps differentiate projects with permissive OSI-approved licenses from those with usage restrictions. It provides clarity and transparency for generative models and datasets, advancing open science and responsible AI for generative AI adoption.

Dr. Ibrahim Haddad, Executive Director of LF AI & Data, states, “Generative AI technologies, especially LLMs, hold immense potential for innovation, collaboration, and transparency. LF AI & Data embraces these advancements, hosting them in a vendor-neutral environment with open governance, fostering accessibility, transparency, and democratizing NLP technology.”

Generative AI Commons: Key Focus Areas

  1. Trusted Large Language Models: We host trusted and ethical Large Language Models (LLMs) with vetted open source components for high security and privacy standards, enabling access to trusted open source AI models.
  2. AI Life Cycle Tools: Our comprehensive tools support AI model management from training to inference, aiding developers and data scientists in deploying their models and fine-tuning them on domain-specific datasets.
  3. Open Data Resources: We curate and host ethical open source datasets and tools to help developers and data scientists prepare and fine-tune datasets for multinational and multilingual use.
  4. Openness Evaluation Framework: We are developing a framework to assess generative AI projects’ openness, evaluating open science components using OSI-approved open source licensing to enhance transparency and inform users of commercial usage implications.
  5. Ethical AI Governance and Education: We aim to influence responsible AI legislation and promote AI literacy through partnerships and community engagement. We’ll offer guidelines, training, and best practices for widespread adoption.

————————

To learn more about the Generative AI Commons, see the wiki.


About LF AI & Data Foundation

The LF AI & Data Foundation is a leading organization dedicated to accelerating and promoting the adoption of artificial intelligence and data technologies. With a diverse community of members and contributors, LF AI & Data drives innovation and fosters collaboration in these transformative fields.