How a Global Hunt for Crystal Structures is Powering the Next Scientific Revolution
In the world of artificial intelligence, a quiet revolution is underway, one that is shifting focus from the words we speak to the very atoms that make up our physical world. Just as large language models like GPT have transformed how we work with text, a new class of artificial intelligence known as Large Atomic Models (LAMs) is emerging to reshape our understanding of the molecular universe.
At the forefront of this movement is the OpenLAM Initiative, an ambitious, community-driven project that aims to "Conquer the Periodic Table" by developing open-source foundation models capable of simulating and designing materials at the atomic level.
The significance of this endeavor extends far beyond academic curiosity. The development of new materials, whether for more efficient batteries, smarter pharmaceuticals, or advanced semiconductors, has traditionally been a slow, expensive process of trial and error. OpenLAM seeks to shorten that process by creating AI infrastructure that can dramatically accelerate scientific discovery and materials design.
Two ideas drive this effort: dramatically reducing the time needed for materials development, from years to months or weeks, and harnessing global collective intelligence through open challenges and shared datasets.
Large Atomic Models (LAMs) are sophisticated AI systems designed to understand and predict the behavior of atomic systems. Just as large language models learn the patterns and relationships between words, LAMs learn the fundamental physical principles that govern how atoms interact with each other.
These models approximate the universal potential energy surface: essentially, the mathematical description of how the energy of a system depends on the arrangement of its atoms [4].
Traditional simulation methods are computationally intensive, taking days or weeks for complex systems. LAMs can perform similar calculations in a fraction of the time with remarkable accuracy [8].
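To make the idea of a potential energy surface concrete, here is a minimal sketch in Python. It assumes a toy Lennard-Jones pair potential in place of a trained LAM; the function names and parameters are illustrative and are not part of any OpenLAM tool. The point it shows is general: once a model returns an energy for any arrangement of atoms, the forces follow as the negative gradient of that energy.

```python
import numpy as np


def lj_energy(positions, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy of a set of atoms (a toy energy surface)."""
    energy = 0.0
    n_atoms = len(positions)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            r = np.linalg.norm(positions[i] - positions[j])
            sr6 = (sigma / r) ** 6
            energy += 4.0 * epsilon * (sr6 ** 2 - sr6)
    return energy


def numerical_forces(positions, h=1e-5):
    """Forces as the negative gradient of the energy (central differences)."""
    forces = np.zeros_like(positions)
    for i in range(positions.shape[0]):
        for k in range(positions.shape[1]):
            plus = positions.copy()
            minus = positions.copy()
            plus[i, k] += h
            minus[i, k] -= h
            forces[i, k] = -(lj_energy(plus) - lj_energy(minus)) / (2.0 * h)
    return forces


# Two atoms at the Lennard-Jones equilibrium distance 2^(1/6) * sigma:
# the energy is at its minimum (-epsilon) and the forces vanish.
r_min = 2.0 ** (1.0 / 6.0)
dimer = np.array([[0.0, 0.0, 0.0], [r_min, 0.0, 0.0]])
print("energy:", lj_energy(dimer))
print("forces:", numerical_forces(dimer))
```

A LAM plays the role of `lj_energy` here, but with a learned function meant to cover many elements and bonding environments rather than a single hand-written formula.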
The development of LAMs is part of the broader AI for Science (AI4Science) movement, which applies advanced machine learning techniques to long-standing scientific challenges.
This transformation is already visible across multiple domains, from AI systems that can predict weather patterns with unprecedented accuracy to models that are helping astronomers analyze cosmic data thousands of times faster than previously possible [8].
The OpenLAM Initiative was formally launched by the Deep Potential team in early 2024, though its roots trace back to 2022, when the team began actively pretraining LAMs [2, 3]. The project's ambitious slogan, "Conquer the Periodic Table!", reflects its comprehensive scope: to create an open-source ecosystem around large atomic models that can span the entire periodic table [5].
The initiative operates on a simple but powerful premise: that open collaboration will accelerate the development of more robust and capable atomic models. By sharing curated datasets, algorithms, and relevant workflows, the project aims to democratize access to cutting-edge AI tools for scientific discovery.
The initiative targets three core capabilities: universal property learning, universal cross-modal capability, and target-oriented, atomic-scale universal generation and planning [5].
OpenLAM represents more than just a research project: it is a growing ecosystem of models, datasets, tools, and community competitions.
At the heart of the OpenLAM Challenges is the LAM Crystal Philately competition, an innovative approach to building a comprehensive database of crystal structures. The competition's name evokes the practice of philately (stamp collecting), but instead of stamps, participants "collect" unique atomic configurations with arbitrary chemical compositions [2, 3].
The competition mechanics are elegantly designed. Participants submit proposed crystal structures, which are then validated by a LAM based on energy and force criteria. The stability of these structures is assessed using the OpenLAM convex hull, a mathematical construct that identifies the most thermodynamically stable configurations from all structures within the database [2, 3].
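The convex hull test can be illustrated with a small Python sketch. It assumes a hypothetical two-element system A-B with made-up formation energies; the real OpenLAM hull spans many elements and is built with the project's own tooling, which is not reproduced here. Each candidate is a pair (fraction of B, formation energy per atom), and a structure is "on the hull" when its energy above the hull is zero.

```python
import numpy as np


def lower_hull(points):
    """Lower convex hull of (composition, energy) points, left to right."""
    # Keep only the lowest-energy entry at each composition.
    best = {}
    for x, e in points:
        if x not in best or e < best[x]:
            best[x] = e
    hull = []
    for p in sorted(best.items()):
        # Drop the last vertex while it lies on or above the segment to p.
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross > 0:
                break
            hull.pop()
        hull.append(p)
    return hull


def energy_above_hull(points):
    """Energy of each point relative to the hull at the same composition."""
    hull_x, hull_e = zip(*lower_hull(points))
    return [e - float(np.interp(x, hull_x, hull_e)) for x, e in points]


# Hypothetical candidates: (fraction of B, formation energy in eV/atom).
candidates = [
    (0.00, 0.00),   # pure A
    (1.00, 0.00),   # pure B
    (0.50, -0.40),
    (0.25, -0.10),
    (0.75, -0.30),
    (0.50, -0.25),  # a second, less stable structure at the same composition
]

for (x, e), above in zip(candidates, energy_above_hull(candidates)):
    status = "on hull" if abs(above) < 1e-9 else f"{above:.2f} eV/atom above hull"
    print(f"x_B = {x:.2f}, E_f = {e:+.2f} eV/atom: {status}")
```

In this toy data the candidates at 25% B and the second structure at 50% B sit above the hull, so they would be predicted to decompose into neighboring hull phases rather than count as stable.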
The first round of the Crystal Philately competition has yielded extraordinary results, collecting over 19.8 million valid structures, including approximately 350,000 on the OpenLAM convex hull [2, 3].
By mid-2024, the competition database had grown to contain over 13 million crystal structures, with more than 5 million contributions coming directly from participants [7]. All structure information in the database is open-source, accessible either through a Python API or via a dedicated application called CrystalCraft that supports multiple search functions and structure analysis [7].
The OpenLAM initiative brings together a sophisticated collection of computational tools and frameworks that enable researchers to participate in this scientific frontier.
| Tool/Component | Function | Significance |
|---|---|---|
| DeePMD-kit | Software for performing molecular dynamics simulations | Provides the foundation for training and running Deep Potential models [7] |
| DPA-2 Architecture | Neural network design for large atomic models | Incorporates three-body encoding information for improved accuracy [7] |
| LAMBench | Benchmarking system for evaluating LAMs | Enables standardized comparison of different models across domains [4] |
| OpenLAM API | Programming interface for accessing structure data | Allows researchers to programmatically query the competition database [7] |
| CrystalCraft App | Application for visualizing and analyzing crystal structures | Provides user-friendly access to the growing database of structures [7] |
As the field of Large Atomic Models has expanded, the need for standardized evaluation has become increasingly important. The LAMBench benchmarking system was developed to address this need, providing a comprehensive framework for evaluating LAMs [4].
LAMBench assesses models across three critical dimensions: generalizability, adaptability, and applicability.
The OpenLAM Initiative has demonstrated steady progress in improving the accuracy and efficiency of its models. The 2024 Q3 report highlighted substantial gains in both accuracy and speed for the DPA-2 model [7].
| Metric | DPA-2-b3 (Previous) | DPA-2-b4-medium (New) | Improvement |
|---|---|---|---|
| Energy Weighted RMSE | 18.5 meV/atom | 13.1 meV/atom | ~30% improvement |
| Force Weighted RMSE | 130.8 meV/Å | 113.1 meV/Å | ~14% improvement |
| Training Speed (100 steps) | 15.9 seconds | 8.4 seconds | ~47% faster |
| Inference Speed (100 runs) | 6.3 seconds | 3.7 seconds | ~41% faster |
These improvements are particularly significant because they demonstrate that the OpenLAM team is successfully navigating the trade-off between accuracy and computational efficiency—a critical challenge in the development of practical AI tools for scientific research.
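The percentages in the table can be checked directly from the reported values. The short Python sketch below recomputes each one as a relative reduction, (old - new) / old; the dictionary keys and the function name are illustrative.

```python
# (previous, new) values as reported for DPA-2-b3 and DPA-2-b4-medium.
metrics = {
    "Energy weighted RMSE (meV/atom)": (18.5, 13.1),
    "Force weighted RMSE (meV/Å)": (130.8, 113.1),
    "Training time, 100 steps (s)": (15.9, 8.4),
    "Inference time, 100 runs (s)": (6.3, 3.7),
}


def relative_reduction(old, new):
    """Percentage by which `new` is lower than `old`."""
    return (old - new) / old * 100.0


for name, (old, new) in metrics.items():
    print(f"{name}: {relative_reduction(old, new):.1f}% lower")
```

Note that for the two timing rows, "faster" in the table means less wall-clock time. Expressed as a speedup ratio (old time divided by new time), the same numbers correspond to roughly 1.9x for training and 1.7x for inference.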
The ultimate test of any scientific tool lies in its ability to solve real-world problems, and here, the OpenLAM Initiative and related AI4Science approaches are already showing remarkable promise.
A research team in China successfully trained an AI system for catalyst screening. From an initial pool of more than 14,000 potential candidates, they identified four molecular formulas that yielded highly satisfactory outcomes.
In the pharmaceutical industry, companies like MindRank have leveraged AI drug discovery platforms to identify preclinical drug candidates in record time.
Their system identified a promising molecule for treating obesity and type 2 diabetes from nearly 100 candidates in just eight months, with the resulting drug MDR-001 now receiving clinical trial approvals in both China and the United States [8].
OpenLAM approaches are also being applied to fundamental scientific challenges, such as the growth of lithium dendrites in batteries—a phenomenon that can render lithium batteries inoperative and has remained poorly understood.
The OpenLAM Challenges represent more than just a series of technical competitions: they embody a fundamental shift in how scientific research is conducted and who gets to participate in the process. By creating an open, collaborative ecosystem around Large Atomic Models, the initiative is democratizing access to cutting-edge research tools that were previously available only to well-funded institutions.
The progress achieved through the Crystal Philately competition and related efforts demonstrates the power of this approach. With millions of validated structures added to community databases and consistent improvements in model performance, the project is building momentum toward its ambitious goal of "conquering the periodic table."
As the initiative continues to evolve, its focus on openness, standardization, and community engagement provides a compelling model for how we might approach other complex scientific challenges. In the words of the OpenLAM team's vision, the ultimate goal is to achieve "Large Atom Embodied Intelligence" for atomic-scale intelligent scientific discovery and synthetic design within 5-10 years [5].
If the current pace of progress is any indication, this vision may be closer than we think—promising a future where AI-powered discovery unlocks new materials, medicines, and technologies that today exist only in our imagination.