AI Projects – Lawrence Berkeley National Laboratory

Berkeley Lab scientists are developing new AI models to push the boundaries of science, and applying AI to make discoveries in biology, physics, clean energy, climate, materials, and more.

A Foundation Model for Atomistic Materials Chemistry

Digital 3D illustration of a molecular structure with gold and blue hues set against a dark background.

Demonstrating the power of the MACE-MP-0 model and its qualitative and quantitative accuracy on a diverse set of problems in the physical sciences, including the properties of solids, liquids, gases, chemical reactions, interfaces, and even the dynamics of a small protein.

Accelerator Advancements Through Machine Learning

Training neural networks to enable a novel feed-forward scheme that improves source size stability by up to an order of magnitude compared with conventional physics-model-based approaches.

AI Foundation Model for Proteins toward Automated Function Enhancement

Digital green abstract background with a grid pattern overlaid with various illuminating white dots and lines.

Developing a generative pre-trained AI model to enhance the functional properties of proteins for biomanufacturing and to advance self-driving labs for synthetic biology.

Center for Advanced Mathematics for Energy Research Applications (CAMERA)

A virus particle shell (left) and a 2-D slice through its center (right) depicting various densities (red, high; yellow and green, medium; blue, low). The image of the virus, called PBCV-1, was reconstructed from fluctuation X-ray scattering data using M-TIP, an algorithm developed as part of the CAMERA project.

CAMERA is an integrated, cross-disciplinary center that aims to invent, develop, and deliver the fundamental new mathematics required to capitalize on experimental investigations at scientific facilities.

Combining Data-driven and Science-based Generative Models

Digital illustration of a luminous, blue energy core with binary code streaming towards the center from all sides against a dark background.

This project investigates the many connections between data-driven and science-driven generative models.

Domain-Aware, Physics-Constrained Autonomous Experimentation

Digital illustration of a blue circuit board with glowing lines.

Next-generation Gaussian (and Gaussian-related) process engine for flexible, domain-informed and HPC-ready stochastic function approximation.

Exa.TrkX

Simulated data modeled for the ATLAS detector. The image shows numerous spiraling and straight lines emanating from a central point on a navy blue background.

A collaboration of data scientists and computational physicists developing graph neural network models aimed at reconstructing millions of particle trajectories per second from petabytes of raw data produced by the next generation of particle tracking detectors at the energy and intensity frontiers.

Foundation Models for Scientific Machine Learning

Neural network model composed of red, yellow, blue, green, and purple colors.

Exploring how pre-trained ML models could be used for scientific ML (SciML) applications, specifically in the context of transfer learning.

FourCastNet

A depiction of digital twin Earth adapted from the EU's Destination Earth project.

FourCastNet, short for Fourier Forecasting Neural Network, is a global data-driven weather forecasting model that provides accurate short- to medium-range global predictions at high resolution.

From Molecules to Materials

Edited TEM image of Aluminum Copper Precipitates.

Developing AI models that generalize across different chemical systems and are trained on large datasets, aiding in more accurate and efficient predictions in the field of materials science.

gpCAM for Domain-Aware Autonomous Experimentation

Green and blue fractal digital image.

The gpCAM project consists of an API and software designed to make autonomous data acquisition and analysis for experiments and simulations faster, simpler, and more widely available by leveraging active learning.

HGDL for Hybrid Global Deflated Local Optimization

Abstract digital art featuring 3-dimensional undulated shapes in neon green and blue shades on a black background.

An optimization algorithm specialized in finding a diverse set of optima, alleviating challenges of non-uniqueness that are common in modern applications.

Hidden Activity Signal and Trajectory Anomaly Characterization (HAYSTAC)

Evening traffic in downtown Los Angeles.

The goal of HAYSTAC is to develop a generative model that produces complete trajectories of stay locations given sparse Location-Based Service (LBS) data.

Image across Domains, Experiments, Algorithms and Learning (IDEAL)

Digital abstract background featuring multiple layers of binary code in a central circular pattern.

Supported by a U.S. DOE Early Career Award, IDEAL focuses on computer vision and ML algorithms and software to enable timely interpretation of experimental data recorded as 2D or multispectral images.

Large-scale, Self-driving 5G Network for Science

Using AI combined with network virtualization to support complex end-to-end network connectivity from edge 5G sensors to supercomputing facilities.

Learning Continuous Models for Continuous Physics

A digital abstract background featuring interconnected lines and nodes in a network pattern, predominantly in shades of blue against a dark backdrop.

Developing principled numerical analysis methods to validate models for science and engineering applications.

Learning Differentiable Solvers for Systems with Hard Constraints

Aerial view of a densely populated city at night, illuminated with orange and white lights, overlaid with a complex mesh network pattern.

Designing a differentiable neural network layer that enforces physical laws and demonstrating that it can solve many problem instances of parameterized partial differential equations (PDEs) efficiently and accurately.

MetFish: Protein Sequence to Dynamic Structures

Digital illustration of solution X-ray scattering (SAXS), which enables high-throughput structural characterizations of gene products in solution.

Developing new AI methods to integrate small-angle X-ray scattering (SAXS) data from the Advanced Light Source (ALS) with AlphaFold’s AI-based protein structure prediction to identify physiologically representative protein conformations.

Mitigation via Analytics for Inverter-Grid Cybersecurity (MAGIC)

Developing secure AI/ML tools to both detect and mitigate cyber attacks on aggregations of Distributed Energy Resources (DER) in electric power distribution systems and microgrids.

ML for Traumatic Brain Injury Research

Digital illustration of a human brain depicted with glowing blue and orange nodes and interconnected lines, set against a dark background.

Collaboration shows how machine learning methods can enhance the prognosis and understanding of traumatic brain injury (TBI).

Mobiliti

Map of the SF Bay Area with yellow and orange lines running through the region, simulating the movement of the population through a region's road networks.

Cutting-edge software system that accurately simulates the movement of an entire population through a region’s road networks.

New Battery Designs and Quality Control with Deep Learning

A blue and orange electric vehicle lithium-ion battery pack.

New deep learning methods based on U-Net, Y-Net, and vision transformers for detection and segmentation of defects in lithium metal batteries to expand the e-vehicle fleet.

Python-based Surrogate Modeling Objects (PySMO)

Abstract digital network illustration with interconnected nodes and lines on a dark background.

An open-source tool for generating accurate algebraic surrogates that are directly integrated with an equation-oriented optimization platform, providing a breadth of capabilities suitable for a variety of engineering applications.

Science Search

Digital illustration of a multicolored polygonal human brain connected to circuit lines on a dark background.

Developing innovative machine learning tools to pull contextual information from scientific datasets and automatically generate metadata tags for each file.

Statistical Mechanics for Interpretable Learning Algorithms

Spirals of blue lines against a black background.

Using statistical mechanics to interpret how popular machine learning algorithms behave, give users more control over these systems, and enable them to reach the results faster.

Topological Optimization

Abstract blue background with flowing wave patterns in neon yellow and blue tones.

Enabling faster and more precise topological regularization.

Transformers for Topic Modeling and Recommendation

Digital illustration of a circuit board shaped like a human brain, set against a dark background with green binary code.

Turning text data into information that helps to identify key topics within certain science domains.

TRI: Charge Balance Predictor, an ML Model for Synthesis Validation

Scientist holding a microtiter plate with digital chemical structures in the background.

Developing an ML model that predicts, based on composition, whether a newly proposed chemical synthesis will be charge balanced, to assist researchers in validating their synthesis plans.

Visualizing Scientific ML Functions

Abstract digital image showing numerous interconnected lines and dots in various colors on a dark blue background.

Developing novel visualization methods to improve our understanding of scientific ML models.

WaveCastNet

A small brown wooden model of a house sits on cracked concrete.

WaveCastNet is a novel AI-enabled framework for forecasting ground motions from large earthquakes.

4DCamera Distillery

Electron detector known as the 4D Camera.

Developing and deploying AI- and ML-based methods and tools to analyze electron scattering information from the data streams of fast direct electron detectors.

A-Lab

Dark-haired scientist in the center of the frame looks toward the camera. They are standing behind the clear glass of an encased automated lab.

To accelerate development of useful new materials, researchers have developed a new kind of automated lab that uses robots guided by artificial intelligence.

An ML Approach to Better Batteries

Digital illustration of batteries against a black and dark blue background.

Applying ML to atomic-scale images to extract the relationship between strain and composition in a battery material, paving the way for more durable batteries.

AI and ML for Accelerator Technologies

Harnessing the game-changing power of AI/ML for both modeling and control of particle accelerators.

AI and ML for Nuclear Physics

Abstract atoms arranged into a crystal network on an abstract background.

Powering the next generation of nuclear physics discoveries with ML.

AI for Energy

Opportunities for a modern grid and clean energy economy through the power of AI.

AI for Physics Breakthroughs

Approaching fundamental physics challenges through the lens of modern ML.

AR1K: Engineering Agriculture through ML in BioEPIC

Bringing together molecular biology, biogeochemistry, environmental sensing technologies, and ML to help revolutionize agriculture and create sustainable farming practices that benefit both the environment and farms.

Assessing Factors Underpinning PV Degradation through Data Analysis

Solar panels installed in a field with a view of mountains in the background at sunset.

DuraMAT uses advanced data analytics to more accurately pinpoint photovoltaic (PV) module degradation and isolate its causes.

Automating Data Acquisition & Analysis

An artistic illustration of a mixture of Gaussian processes and a light or particle beam passing through.

This project aims to develop new stochastic process-based mathematical and computational methods to achieve high-quality, domain-aware function approximation, uncertainty quantification, and, by extension, autonomous experimentation.

Automating High-Throughput Electron Microscopy at the Molecular Foundry

A flexible pipeline-based system for high-throughput acquisition of atomic-resolution structural data using an all-piezo sample stage applied to large-scale imaging of nanoparticles and multimodal data acquisition.

Berkeley Biomedical Data Science Center

Futuristic artistic representation of a shield surrounded by viruses and pathogens.

Berkeley Biomedical Data Science Center (BBDS) is a central hub of research at Lawrence Berkeley National Laboratory designed to facilitate and nurture data-intensive biomedical science.

Calibrated and Systematic Characterization, Attribution, and Detection of Extremes (CASCADE)

View of a hurricane on Earth from space.

Developing AI-based methods for predicting the occurrence of low-likelihood, high-impact climate extremes that are missed by traditional weather predictions.

Codesign of Ultra-Low-Voltage, Beyond CMOS Microelectronics

Close-up of an electronic circuit board with circuitry and microchip details with golden and blue color tones.

Exploring new physics leading to higher energy efficiency in computing.

Cosmological Hydrodynamic Modeling with Deep Learning

Using deep neural networks to reconstruct important hydrodynamical quantities from coarse or N-body-only simulations, vastly reducing the amount of compute resources required to generate high-fidelity realizations while still providing accurate estimates with realistic statistical properties.

Data Analytics for Commercial Buildings

Modern cityscape at dusk with tall commercial buildings and traffic below. A digital overlay of white binary code streams toward the city.

Developing automated approaches to determine building characteristics, and retrofit and operational efficiency opportunities.

Data Driven Synthesis Science Program

Abstract collage of code overlaid on data centers.

Developing a data-driven approach to synthesis science by combining text mining and ML, in situ and ex situ characterization of experimental synthesis, and large-scale first-principles modeling.

Deepot: A Deep Learning Approach for Parking Lot Detection Using Low-Resolution Satellite Imagery

Deep learning approaches to detect parking lot locations using satellite imagery datasets.

Earth AI & Data

A digital illustration of Earth with glowing blue network lines and dots across a dark space background.

Using ML, data sciences, informatics, and data management to advance state-of-the-art Earth science observations, modeling, and theory.

Enhancing Utility Operations during Heat Waves through Large-Scale Sensing and Data Fusion

Enhancing utility operations during heat waves by developing new models to estimate hours-ahead electricity demand, the flexibility of aggregated building stocks, and overheating risks for vulnerable communities.

ExaEpi

Creative artwork featuring colorized 3D prints of influenza virus (surface glycoprotein hemagglutinin is blue and neuraminidase is orange; the viral membrane is a darker orange).

Developing an exascale-ready agent-based epidemiological model that can speed predictions of disease spread.

ExaSheds

Green mountains and river in the East River catchment in Crested Butte, Colorado.

Using leadership-class computers, big data, and machine learning – combined in learning-assisted physics-based simulation tools – to fundamentally change how watershed function is understood and predicted.

FAIR Universe

The FAIR Universe project is developing and sharing datasets, training frameworks, and data challenges and benchmarks to facilitate common development and standardization, all with a focus on uncertainty-aware training.

Feedstock to Function (F2FT)

Improving bio-based product and fuel development through adaptive technoeconomic and performance modeling.

Fire Spread Simulator and Understanding Fire Behavior

Smoke-filled forest in the aftermath of a wildfire.

An open-source fire spread simulation framework that uses ML to learn from the output data of semi-empirical fire behavior models and feeds the learned logic into a cellular automata simulator to simulate fire spread.

Harnessing ML to Accelerate the Discovery of New Upconverting Nanoparticles

Schematic of an upconverting nanoparticle heterostructure.

Using ML to accelerate the discovery of novel upconverting nanoparticles (UCNPs) while domain-specific knowledge is being developed.

How Scientists Are Accelerating Chemistry Discoveries With Automation

Illustration of a computer, robotic arm, and various chemical flasks.

New statistical-modeling workflow may help advance drug discovery and synthetic chemistry.

Institute for the Design of Advanced Energy Systems (IDAES)

A next-generation multi-scale modeling and optimization framework to support the U.S. power industry.

La Silla Schmidt Southern Survey (LS4)

Over the next decade, the La Silla Schmidt Southern Survey (LS4) will leverage an automated pipeline to uncover transient sky events in the Southern Hemisphere.

Leverage Large Language Models for Particle Physics

Cover image for the 2023 P5 Report. An illustration of a blue and purple light coming out of a black hole. Two light beams are jutting out from the center toward the edges of the frame. The beam on the left is filled with moving blue orbs and the beam on the right is filled with two larger orbs containing small galaxy depictions.

Enhancing particle reconstruction by harnessing the power of language models.

Macroscopic Traffic Modeling Using Probe Vehicle Data: An ML Approach

Traffic management system concept image.

Applying ML methods to predict the macroscopic fundamental diagrams (MFD) across U.S. urban areas and capture the impacts of location-specific input features on the network flow-density relationships at a large scale.

ML Informed Parameterization for Household Vehicle Microsimulation

A hand plugs in a charging cable in an electric car.

Developing a dynamic vehicle transaction model to fully evolve households and their vehicle fleet composition and usage over time for forecasting vehicle technology adoptions in the U.S.

ML Tackles Long COVID

Digital illustration of human lungs overlaid with various medical and technological symbols set against a blue background.

AI software gleans insights from health records to shed light on chronic COVID symptoms.

ML Takes on Synthetic Biology: Algorithms Can Bioengineer Cells for You

Two people working on a whiteboard in a small office.

Berkeley Lab scientists developed a new tool that adapts ML algorithms to the needs of synthetic biology to guide development systematically.

ML Web Interfaces across Light Sources

Two scientists prepare a sample tube by filling it with standardized sand and sealing the ends with glue.

Developing a suite of tools aimed at lowering the barriers of access to advanced data processing for all users.

MLExchange

MLExchange is a shared platform that lowers the barrier to entry by leveraging advances in ML methods across user facilities, thus empowering domain scientists and data scientists to discover new information using existing and new data with novel tools.

MLPerf HPC

A person with glasses stands holding a laptop next to a large digital screen featuring data and graphs in a blue, modern, high-tech environment.

MLPerf HPC is a machine learning performance benchmark suite for scientific ML workloads on large supercomputers.

NAWI: Direct Electrochemical Reduction of Selenium to Achieve A-PRIME Water Treatment

Predicting optimal electrode materials with high activity for aqueous electrochemical selenite and selenate reduction.

NAWI Water Treatment Model Development (WaterTAP)

Close-up of a water faucet with a drop of water dripping out.

Providing computational and modeling solutions to optimize the performance, energy use, and economic cost of existing and developing water treatment processes and infrastructures.

Normalizing Flows for Statistical Data Analysis

Dynamic display of blue dots formed into a wave pattern, set against a dark background.

Developing fast Bayesian statistical analysis methods for scientific data analysis that can be applied to a wide range of scientific domains and problems.

Privacy-Preserving, Collective Cyberattack Defense of Distributed Energy Resources

Green padlock on binary code background.

Developing a software platform to allow utilities to share relevant cybersecurity information with one another in a manner that does not compromise the privacy of customers in their service territories.

Process Optimization and Modeling for Minerals Sustainability (PrOMMiS)

Artist’s illustration of pyrite mineral crystals.

A resource for accelerating the identification, design, scaleup, and integration of innovative rare earth element and critical mineral processes.

Produced Water Application for Beneficial Reuse, Environmental Impact and Treatment Optimization (PARETO)

An open-source, optimization-based, downloadable, and executable decision-support application for produced water management and beneficial reuse.

Reliable Edge ML Hardware for Science

Blue wormhole with light streaks, geometric shapes, and binary code.

This project explores approaches for developing and validating reliable algorithms for real-time computing at the scientific edge.

Rhizonet

A purple gloved hand holding a transparent box with a plant inside.

Harnessing the power of AI to study plant roots, offering new insights into root behavior under various environmental conditions.

Self-Supervised Learning for Cosmological Surveys

12 galaxy images from the Dark Energy Spectroscopic Instrument.

Applying self-supervised learning to sky survey data for downstream tasks such as morphology classification, redshift estimation, similarity search, and detection of rare events, paving new pathways for scientific discovery.

Supercomputing-Scale AI on the Perlmutter System at NERSC

Decorative panels on the exterior of the computer cabinets for the Perlmutter NERSC-9.

The Perlmutter system is a world-leading AI supercomputer consisting of over 6,000 NVIDIA A100 GPUs, an all-flash filesystem, and a novel high-speed network.

Using ML to Disentangle Strain Maps in Electron Microscopy

Examples of simulated diffraction patterns for crystals of different thicknesses and orientations.

FCU-Net, a Fourier-space, complex-valued deep neural network that inverts highly nonlinear electron diffraction patterns into the corresponding quantitative structure factor images.

Artificial intelligence is bringing transformative solutions to complex scientific challenges. Through advanced computation, network facilities, and data integration, Berkeley Lab is advancing the foundations of powerful new AI capabilities and using AI for discoveries in materials, energy, chemistry, physics, biology, climate science, and more.

Machine Learning for Science