About
Pluralis carries out foundational research on Protocol Learning: multi-participant training of foundation models where no single participant has, or can ever obtain, a full copy of the model. The purpose of Protocol Learning is to facilitate the creation of community-trained and community-owned frontier models with self-sustaining economics.
Research
S. Ramasinghe, T. Ajanthan, G. Avraham, Y. Zuo, A. Long | NeurIPS 2025
This is the first work to show that model-parallel training over low-bandwidth networks is possible. Specifically, it demonstrates an 8B LLaMA model trained on par with centralized training while the devices holding consecutive transformer blocks sit in four different locations, connected only via standard internet connections, a setting considered completely impossible prior to this work.
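A minimal conceptual sketch of the setting (not the paper's method): a transformer's blocks are partitioned into contiguous pipeline stages, one per location, so only activations and activation gradients ever cross the network. The names `Stage` and `split_into_stages` below are illustrative, not from the paper.

```python
# Conceptual sketch only: partitioning a transformer's blocks into pipeline
# stages hosted on separate machines connected by slow links.
import torch.nn as nn

class Stage(nn.Module):
    """One pipeline stage: a contiguous slice of transformer blocks, hosted on one machine."""
    def __init__(self, blocks):
        super().__init__()
        self.blocks = nn.Sequential(*blocks)

    def forward(self, x):
        return self.blocks(x)

def split_into_stages(blocks, num_stages):
    """Assign contiguous groups of blocks to stages (one stage per location)."""
    per_stage = (len(blocks) + num_stages - 1) // num_stages
    return [Stage(blocks[i:i + per_stage]) for i in range(0, len(blocks), per_stage)]

# Example: 32 decoder blocks split across 4 locations. Each location holds only
# its own blocks; activations are all that travels over the links between them.
blocks = [nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True) for _ in range(32)]
stages = split_into_stages(blocks, num_stages=4)
```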
T. Ajanthan, S. Ramasinghe, Y. Zuo, G. Avraham, A. Long | ICML 2025
Pipeline parallelism allows large models to be trained across many small devices by slicing the network into stages, but it introduces a “bubble” during which devices sit idle. This bubble slows both centralized and decentralized training, and the effect is more pronounced in the decentralized case because communication lag widens it. We solve this, outperforming all existing asynchronous techniques and even the synchronous baseline; a back-of-envelope sketch of the bubble follows below.
▶︎ Code
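For intuition, a back-of-envelope model we add here (not the paper's analysis): with p stages, m micro-batches, per-stage compute time t, and per-hop communication delay c, a synchronous pipeline idles each device for roughly (p - 1)(t + c) during fill and drain, so wide-area latency inflates the bubble directly.

```python
# Rough illustration only; the formula and numbers are a simplification, not the paper's model.
def bubble_overhead(p, m, t, c):
    """Idle (bubble) time per device as a fraction of its useful compute m * t."""
    return (p - 1) * (t + c) / (m * t)

# Datacenter-like link (c << t) vs. wide-area link (c comparable to t):
print(bubble_overhead(p=8, m=32, t=1.0, c=0.01))  # ~0.22
print(bubble_overhead(p=8, m=32, t=1.0, c=1.00))  # ~0.44 -- same pipeline, twice the bubble
```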
A. Long*, C. Koneputugodage*, S. Ramasinghe, T. Ajanthan, G. Avraham, Y. Zuo | NeurIPS 2025
UPMs facilitate decentralized training while ensuring that a full weight set is never available to any single participant. UPMs thus enable collaborative training while making the model unextractable in practice.
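A minimal sketch of one ingredient only, the partitioning constraint: each participant holds a disjoint shard of the layer stack, so no single party ever stores the full weight set. The paper's unextractability guarantees go well beyond simple sharding; the helper below is hypothetical.

```python
from typing import Dict, List

def assign_layers(num_layers: int, participants: List[str]) -> Dict[str, List[int]]:
    """Round-robin layer ownership: every participant sees only its own shard."""
    ownership = {p: [] for p in participants}
    for layer_idx in range(num_layers):
        ownership[participants[layer_idx % len(participants)]].append(layer_idx)
    return ownership

print(assign_layers(12, ["alice", "bob", "carol"]))
# {'alice': [0, 3, 6, 9], 'bob': [1, 4, 7, 10], 'carol': [2, 5, 8, 11]}
```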
S. Ramasinghe, T. Ajanthan, H. Dolatabadi, G. Avraham, V. Shevchenko, Y. Zuo, C. Koneputugodage, A. Long | NeurIPS 2025
We propose a compression method for communication-efficient context parallelism in decentralized settings, achieving over 95% compression with negligible overhead and no loss in convergence. The key insight is to exploit the intrinsic low-rank structure of activations by dynamically constraining them to learned mixtures of subspaces via efficient reparameterizations. This allows scaling billion-parameter decentralized models to context lengths exceeding 100K tokens on networks as slow as 300 Mbps, matching the wall-clock convergence of centralized models on 100 Gbps interconnects.
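An illustrative sketch of the underlying idea, assuming a single fixed orthonormal basis in place of the paper's learned mixtures of subspaces: only the low-rank coefficients are transmitted between participants, and the receiver reconstructs the activations from them.

```python
# Sketch only: a fixed random basis stands in for the learned subspaces.
# Real activations are (per the paper) constrained to such subspaces during
# training, which is what makes the projection nearly lossless; the random
# data used here would not reconstruct faithfully.
import torch

d_model, rank, seq_len = 4096, 128, 2048
basis, _ = torch.linalg.qr(torch.randn(d_model, rank))  # orthonormal columns

activations = torch.randn(seq_len, d_model)
coeffs = activations @ basis        # (seq_len, rank): the only tensor sent over the network
reconstruction = coeffs @ basis.T   # receiver side: back to (seq_len, d_model)

print(f"payload reduced by {1 - rank / d_model:.1%}")  # 96.9% fewer values on the wire
```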