AI & Blockchain: Transformative Technologies in Distributed AI Solutions

AI and Blockchain: A New Paradigm and Its Expression in Distributed AI

Blockchain and Trust

In recent years, the excitement surrounding blockchain technology has been overshadowed by the surge of interest in artificial intelligence (AI). Both technologies are often treated as novel, yet each has a long lineage: AI draws on ideas that stretch back to ancient concepts such as the Golem, while blockchain grew out of innovations in hashing and distributed programming. Leslie Lamport's pioneering work on distributed systems established the timeline and trust mechanisms that are essential for decentralized trust and, by extension, for blockchain. Counted this way, blockchain has been developing for over four decades, and AI in its current forms has a history of more than 80 years.

The core challenge of distributed computing is solving problems cooperatively, which requires both a temporal order of events and a consensus on a single version of truth among potentially faulty or malicious computers. Decentralization depends on the independent governance of these distributed systems, but even Bitcoin struggles with true decentralization: a few mining pools dominate the mining process, major institutions control the pathways into the Bitcoin ecosystem, and a small number of entities, often referred to as "whales," possess 93% of all Bitcoin.
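The "temporal order" mentioned above is exactly what Lamport's logical clocks provide: a way for independent machines to agree on an ordering of events without a shared physical clock. A minimal illustrative sketch (the `Process` class and its methods are hypothetical, not taken from any particular library):

```python
# Minimal sketch of Lamport logical clocks: each process keeps a
# counter, and a message receipt pushes the receiver's counter past
# the sender's timestamp, preserving "send happens before receive".

class Process:
    def __init__(self, name: str):
        self.name = name
        self.clock = 0

    def local_event(self) -> int:
        self.clock += 1
        return self.clock

    def send(self) -> int:
        self.clock += 1
        return self.clock          # this timestamp travels with the message

    def receive(self, msg_time: int) -> int:
        # jump past the sender's timestamp, then tick once
        self.clock = max(self.clock, msg_time) + 1
        return self.clock

a, b = Process("A"), Process("B")
a.local_event()        # A's clock becomes 1
t = a.send()           # A's clock becomes 2; message carries t = 2
b.receive(t)           # B's clock becomes max(0, 2) + 1 = 3
```

Consensus protocols build on orderings like this to let mutually distrustful machines agree on one version of truth, which is the property blockchain generalizes.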

Challenges of AI

The well-documented issues surrounding AI include the risk of private data leakage, the high energy consumption of continuous training, and the reliance on siloed data when tailoring specific solutions. Some of these challenges could potentially be addressed by integrating blockchain technology. An exploration of these issues leads to a startup known as Modelx.ai. Most of the insights into its operations come from an interview with CEO Jamiel Sheikh, who describes his company as part of the distributed AI movement, with a focus on federated AI.

Traditionally, AI has been dominated by singular entities. When we refer to AI, we usually mean advanced deep learning models: large language models (LLMs) such as ChatGPT, as well as image and audio generation technologies. Looking ahead, AI may evolve into a "nation of geniuses" operating within data centers. Current training processes require enormous datasets that include nearly all digitized human output, which helps prevent overfitting, the condition in which a model fits its limited training data too closely and therefore fails to predict accurately on new data. Open-source models have begun to challenge this conventional wisdom about data requirements.

The data-intensive approach also has drawbacks. If the training data itself consists of AI-generated outputs, it may lack the depth and diversity of original human-created content. This self-referential cycle, described by the term "autophagy," poses a genuine risk: as AI-generated content proliferates, models trained on it can develop inherent biases and lose effectiveness.
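Overfitting is easy to demonstrate on a small scale. In this illustrative sketch (a toy curve-fitting example, not related to LLM training specifics), a polynomial with enough free parameters memorizes nine noisy samples almost perfectly, yet misses the underlying function at points it never saw:

```python
import numpy as np

# Toy overfitting demo: fit a degree-8 polynomial to 9 noisy samples
# of sin(x). With as many parameters as data points, the model can
# memorize the training set, but it generalizes poorly in between.
rng = np.random.default_rng(0)
x_train = np.linspace(0, 3, 9)
y_train = np.sin(x_train) + rng.normal(0, 0.1, 9)

coeffs = np.polyfit(x_train, y_train, deg=8)   # enough freedom to memorize
x_test = np.linspace(0.1, 2.9, 50)             # unseen points in between

train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_err = np.mean((np.polyval(coeffs, x_test) - np.sin(x_test)) ** 2)

print(f"train MSE: {train_err:.2e}, test MSE: {test_err:.2e}")
```

The training error is near zero while the test error is orders of magnitude larger; more (and more diverse) data is one of the standard remedies, which is why frontier models consume so much of it.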

DeepSeek: an open AI model

Innovations like DeepSeek demonstrate that comparable performance can be achieved without vast datasets or extensive computational resources. Although inference (the actual application of the AI) with DeepSeek may demand more time and computation, it is an open model: the source code and model weights are accessible, so anyone can modify and refine the model with their own data. This definition of openness has faced criticism, however, as some argue that a model whose training data is not shared cannot truly be considered open source. The Open Source Initiative (OSI) has defended its stance on the matter despite the pushback.

Modelx.ai

Modelx.ai operates under a framework that keeps model data confidential while the model and its weights remain open. This addresses the challenge of enhancing AI capabilities in specific fields while adhering to privacy laws; in the healthcare sector, for instance, HIPAA regulations prevent hospitals from sharing sensitive patient data. X-ray data offers a practical example: training a public AI model on private X-ray images from a single hospital enhances its performance, but leveraging data from multiple hospitals would yield even greater improvements.

Modelx.ai has developed a method for sharing these improvements within a federated system that preserves privacy. Each hospital trains a mature, pre-existing open model on its private data and then passes the refined model on to the next hospital in the federation. The cycle continues until every participating hospital has contributed its training, and the resulting model, refined through the contributions of all members, remains exclusive to the federation.

The blockchain component verifies the improvements, facilitates compensation for contributors, and helps maintain data privacy. Each hospital receives tokens based on its contribution, and the enhanced model's quality is assessed after each training round. When hospitals later use the model, they pay for access with these tokens.

By late 2024, open-source models were often criticized as ineffective, but with the introduction of DeepSeek in early 2025 these criticisms began to lose their validity. At the same time, concerns about the potential extraction of private training data from AI models underscore the need for stringent safety measures, including the removal of identifiable information and protections against deanonymization, as outlined in the controversial EU AI Act.
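The round-robin cycle described in this section can be sketched as follows. Everything here is a hypothetical stand-in: the class names, the quality metric, and the token amounts are illustrative and do not reflect Modelx.ai's actual system or API.

```python
from dataclasses import dataclass

# Illustrative sketch of round-robin federated training: each hospital
# fine-tunes the shared open model on data that never leaves its
# premises, quality is scored after each contribution, and contributors
# earn tokens proportional to the measured improvement.

@dataclass
class SharedModel:
    weights: float = 0.0          # stand-in for real model weights
    quality: float = 0.5          # score from a held-out evaluation

    def fine_tune(self, private_data: list) -> float:
        """Train on private data; the data itself is never shared."""
        before = self.quality
        self.weights += sum(private_data) / len(private_data)
        self.quality = min(1.0, self.quality + 0.05 * len(private_data))
        return self.quality - before   # measured improvement

@dataclass
class Hospital:
    name: str
    private_xrays: list            # stays on-premises
    tokens: int = 0

def federation_round(model: SharedModel, hospitals: list) -> None:
    # the refined model is passed from one hospital to the next
    for h in hospitals:
        improvement = model.fine_tune(h.private_xrays)
        # a blockchain layer (not shown) would verify and record this
        h.tokens += round(improvement * 1000)

hospitals = [
    Hospital("A", [0.2, 0.4, 0.6]),
    Hospital("B", [0.1, 0.9]),
]
model = SharedModel()
federation_round(model, hospitals)
for h in hospitals:
    print(h.name, h.tokens, "tokens")
```

In a real deployment the verification and token accounting would live on-chain, and hospitals would later spend the tokens they earned to run inference against the federation's model.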