These startups are building cutting-edge AI models without the need for a data center
A new approach spreads model training across machines connected over the internet, and it could shake up the AI industry later this year with a giant trillion-parameter model.
Researchers used GPUs scattered around the world, together with private and public data, to train a new kind of large language model (LLM). The effort suggests that the dominant way of building artificial intelligence could be disrupted.
Flower AI and Vana, two startups pursuing unconventional approaches to building AI, worked together to create the new model, called Collective-1.
Flower built technology that allows training to be spread across hundreds of computers connected over the internet; several organizations already use it to train AI models without needing to pool compute or data centrally. Vana supplied sources of data, including private messages from platforms such as X, Reddit, and Telegram.
By modern standards Collective-1 is small, with 7 billion parameters (the values that together give a model its abilities), compared with the hundreds of billions of parameters in today's most advanced models, such as those behind ChatGPT, Claude, and Gemini.
Nic Lane, a computer scientist at the University of Cambridge and cofounder of Flower AI, said the distributed approach is meant to scale far beyond Collective-1. Flower AI is currently training a 300-billion-parameter model using conventional data, Lane added, and plans to train a 1-trillion-parameter model later this year, approaching the scale offered by industry leaders. "This could fundamentally change how people view AI, so we are going all in," Lane said. He added that the startup is also incorporating images and audio into training to build multimodal models.
Distributed model development could also shake up the power dynamics shaping the AI industry.
Today, AI companies build their models by combining vast quantities of training data with huge amounts of compute concentrated in data centers packed with advanced GPUs networked together over ultra-fast fiber-optic cables. They also rely heavily on datasets created by scraping publicly accessible, though sometimes copyrighted, material such as websites and books.
This approach means that only the wealthiest companies, and countries with access to large numbers of powerful chips, can feasibly develop the most capable and valuable models. Even open-source models, such as Meta's Llama and DeepSeek's R1, are built by companies with access to large data centers. A distributed approach could let smaller companies and universities build advanced AI by pooling disparate resources, or let countries that lack conventional infrastructure network together several data centers to build a more powerful model.
Lane believes the AI industry will increasingly look toward training methods that let the work break out of individual data centers. The distributed approach "allows you to scale your compute in a more elegant way than a data-center model," he said.
Helen Toner, an expert on AI governance at the Center for Security and Emerging Technology, said Flower AI's approach is "intriguing and potentially highly relevant" to AI competition and governance. "It may be challenging to keep up with the cutting edge, but it could be an interesting fast-follower approach," Toner said.
Divide and Conquer
Distributed AI training involves rethinking how the computation used to build powerful AI systems is divided up. Creating an LLM means feeding huge amounts of text into a model that adjusts its parameters in order to produce useful responses to prompts. Inside a data center, the training process is divided so that parts run on different GPUs and are then periodically consolidated into a single master model.
The new approach allows work normally done inside a large data center to be performed on hardware that may be many miles apart, connected over relatively slow or unreliable internet links.
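To make the idea concrete, here is a minimal sketch of the periodic-consolidation loop described above: each worker trains a local copy of the model on its own shard of data and only occasionally contributes its parameters to an averaged master model, so little traffic needs to cross the slow links between machines. This is an illustrative toy in Python, not the actual code used by Flower, Photon, or DiPaCo; all names and numbers are invented for the example.

```python
import numpy as np

# Illustrative only: a toy version of local training with infrequent parameter
# averaging, not the real Flower/Photon/DiPaCo implementation.
rng = np.random.default_rng(0)
w_true = rng.normal(size=4)  # the "ideal" weights the workers try to recover

def make_local_dataset(n=256):
    """Each worker holds its own private shard of data."""
    X = rng.normal(size=(n, 4))
    y = X @ w_true
    return X, y

def local_training(w, X, y, steps=50, lr=0.01):
    """Run many gradient steps locally before any communication happens."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

num_workers = 8          # e.g. GPUs scattered across the internet
num_rounds = 20          # how often parameters are consolidated
global_w = np.zeros(4)   # the shared "master" model

shards = [make_local_dataset() for _ in range(num_workers)]

for _ in range(num_rounds):
    # Each worker trains independently on its own shard (no fast interconnect needed).
    local_models = [local_training(global_w.copy(), X, y) for X, y in shards]
    # Infrequent synchronization: average the parameters into the master model.
    global_w = np.mean(local_models, axis=0)

print("error vs. true weights:", np.linalg.norm(global_w - w_true))
```

In a real system each worker would be a separate machine and the averaging step would travel over the network, which is why keeping synchronization rounds rare relative to local training steps matters when the links are slow.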
Some large companies are also exploring distributed training. Last year, researchers at Google demonstrated a new scheme called DIstributed PAth COmposition (DiPaCo) for dividing up and consolidating computations so that distributed training runs more efficiently.
To build Collective-1 and other LLMs, Lane worked with academic collaborators in the UK and China to develop a new tool called Photon that makes distributed training more efficient. Photon improves on Google's approach, Lane said, with a more efficient way of representing the data in a model and a scheme for sharing and consolidating training. The process is slower than conventional training but more flexible, allowing new hardware to be added to speed things up, he said.
Photon was developed in collaboration with researchers at Beijing University of Posts and Telecommunications and Zhejiang University. The team released the tool under an open-source license last month, allowing anyone to use the approach.
Flower AI's partner in the effort to build Collective-1, Vana, is developing new ways for users to share personal data with AI builders. Vana's software lets people contribute private data from platforms such as X and Reddit to the training of large language models, potentially specifying the end uses that are permitted and even benefiting financially from their contributions.
Anna Kazlauskas, cofounder of Vana, said the idea is to make untapped data available for AI training while giving users more control over how their information is used. "This data usually can't be included in AI models because it isn't public," Kazlauskas said. "This is the first time user-contributed data has been used to train a foundation model, and users own the AI model created from their data."
Mirco Musolesi, a computer scientist at University College London, said a key benefit of distributed approaches to AI training may be that they unlock new kinds of data. "Extending this to cutting-edge models would enable the AI industry to leverage vast amounts of distributed and privacy-sensitive data, such as in healthcare and finance, for training without the risks of centralization," he said.