LF AI & Data Blog

Open Model Initiative Enters Phase II: Strengthening Open Multimodal AI Development

February 2, 2026

Author: Zhipeng Huang, LF AI & Data Board Member, TSC Chair of Open Model Initiative

The Open Model Initiative (OMI), an LF AI & Data project, has made strong progress in advancing open source AI development. As the project moves into Phase II, OMI is refining its technical priorities, strengthening governance, and introducing new community-led efforts focused on open evaluation, multimodal models, and reproducible AI development.

Solidifying the Foundation: Phase I Recap

OMI’s first phase laid essential groundwork for a broad, cooperative effort in open AI development. The project established strong working groups around Machine Learning and Data, setting clear focus areas that include multimodal model design, scalable training pipelines, and ethical data curation practices.

These efforts solidified OMI’s vision to create openly licensed AI models, promote responsible and transparent AI development, and enable cross-community collaboration, establishing a foundation for the next phase of technical growth.

Expanding Technical Leadership and Governance

As OMI enters Phase II, the project has established a Technical Steering Committee (TSC) to guide its technical direction and roadmap. The TSC includes multimodal development leads from industry organizations such as Tencent, BAAI, vLLM, and SGLang; research experts from LMU Munich, LMMS-Labs, and NYU; and independent open source researchers such as Swayam Bhanded and Jiacheng You. The Phase II committee brings experience across multimodal model development, training/inference/RL infrastructure, training pipelines, and evaluation tooling.

The TSC is responsible for setting technical priorities, coordinating cross-project work, and reviewing proposals related to model architecture, data practices, and evaluation standards. OMI is continuing to expand the committee to include additional contributors from diverse regions and technical backgrounds to ensure balanced representation and sustained technical oversight as the project grows.

Launching a Multi-Modal SpeedRun Initiative

OMI has launched the ImageGen SpeedRun, a first-of-its-kind, community-driven global project to accelerate progress on model architectures and training techniques for multi-modal models. Standing on the shoulders of efforts like the Modded-gpt SpeedRun and the Marin SpeedRun, the ImageGen SpeedRun is a starting point for open, fast-iteration research on diffusion models and multi-modal generation tasks, ultimately paving the way toward a true omni-modal architecture.

Through this effort, OMI is establishing public best practices, shared evaluation standards, and reference implementations that enable reproducible, openly accessible benchmarking for modern LLM and multimodal architectures.

Phase II: Technical Focus Areas and Participation

Phase II of the Open Model Initiative centers on three directions: 

  • Multi-modal SpeedRun: driving faster iteration on multi-modal model architectures and training techniques, curating high-quality datasets for the speedrun effort, and incubating projects that implement emerging best practices.
  • Omni-modal Recipes: Publishing reproducible, end-to-end recipes covering data preparation, multimodal training stages, instruction tuning, reinforcement learning workflows, inference optimization, quantization, and serving across common open inference frameworks.
  • General Evaluation and Benchmarking: Establishing standardized evaluation practices and a shared benchmark registry for multimodal capabilities, including vision-language reasoning, document understanding, video temporal reasoning, OCR, and agentic tasks. Evaluation recipes emphasize reproducibility, controlled task selection, and transparent reporting.

Phase II execution includes monthly TSC coordination, charter updates as needed, and quarterly releases of Cookbook content. Participation is open to contributors across research, industry, infrastructure, benchmarking, and data curation, with contributions reviewed through a structured, reproducibility-focused workflow.

Get Involved

OMI welcomes contributions from developers, researchers, and practitioners working on open multimodal AI.

You can participate by:
