Using Generative AI and Foundation Models to Predict Above Ground Biomass for Nature Based Carbon Sequestration
11-02, 11:40–12:20 (America/New_York), Music Box (Room 5411)

A key challenge in training AI models is the lack of labeled or ground-truth data. This is especially true in remote sensing, where seasonal changes and differences in label characteristics make it difficult to create a common labeled dataset. With the emergence of self-supervised learning, the requirements on the amount and quality of labeled data can be relaxed, but model performance remains of paramount importance. This is especially true for quantifying the sources and sinks of the greenhouse gases that drive climate change. In this talk, we present how state-of-the-art AI technologies such as generative AI and Foundation Models can be used to estimate changes in Above Ground Biomass Density (AGBD) due to the extraction of CO2 from the atmosphere by vegetation. We demonstrate how these tools can be used by companies with net-zero pledges to quantify, monitor, validate, and report their offsetting methodologies and sustainability practices.


This talk will first introduce a fully automated approach to processing satellite data to quantify the amount of carbon trapped in vegetation, starting from sparse ground-truth data. The gaps in the data are filled using machine learning techniques ranging from classical methods, such as Random Forest, to the Vision Transformers used in Foundation Models. This automated, remote-sensing-based approach enables an assessment of carbon pools across the globe.
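As a minimal sketch of the classical gap-filling step described above, the snippet below trains a Random Forest regressor on the small fraction of pixels that carry a ground-truth biomass label and then predicts biomass for every pixel. All data here is synthetic stand-in data (random per-pixel "bands" and a made-up biomass target); the real framework would use optical/radar features and GEDI-derived labels instead.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Stand-in for per-pixel features (e.g. optical/radar bands): 1000 pixels x 6 bands
features = rng.random((1000, 6))

# Synthetic "true" biomass; in practice this comes from sparse ground truth
biomass = features @ rng.random(6) * 100

# Only ~5% of pixels carry a ground-truth label (sparse training data)
labeled = rng.random(1000) < 0.05

# Train on the labeled pixels only
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(features[labeled], biomass[labeled])

# Fill the gaps: predict biomass for every pixel, labeled or not
predicted = model.predict(features)
print(predicted.shape)  # (1000,)
```

The same fit/predict pattern carries over when the regressor is replaced by a deep model; only the feature extraction and training loop change.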
The goal of this talk is to give the audience a good understanding of:
- Assembling and aggregating the data sets used to train models for predicting carbon sequestration: we will give an overview of the optical, radar, and aerial imagery data sets and the labeled data set used in the AI framework, specifically NASA's Global Ecosystem Dynamics Investigation (GEDI) data set.
- Challenges in training AI models with sparse data, including quantifying the effect of training-data sparsity on model performance.
- A semi-automated data processing pipeline and AI framework, built using standard Python and PyTorch packages, that accommodates several AI models: Random Forest, deep-learning architectures such as U-Net, and a generative AI approach built on IBM's Geo-Spatial Foundation Model.
- The emerging role of generative AI and geo-spatial foundation model (GFM) based approaches in the framework.
- The performance of the various models and a demo of results (predicted AGBD) in different regions of the globe, including changes in AGBD over time.
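The foundation-model approach in the list above typically follows a standard fine-tuning pattern: freeze a pretrained encoder and train only a lightweight regression head that maps encoder features to per-pixel AGBD. The PyTorch sketch below illustrates that pattern; the `StandInEncoder` is a placeholder we define for illustration, not IBM's actual GFM, and the data is random.

```python
import torch
import torch.nn as nn

class StandInEncoder(nn.Module):
    """Placeholder for a pretrained encoder (stands in for a real GFM)."""
    def __init__(self, in_channels=6, embed_dim=64):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=1)

    def forward(self, x):
        return self.proj(x)

class AGBDHead(nn.Module):
    """Lightweight head regressing per-pixel biomass from encoder features."""
    def __init__(self, embed_dim=64):
        super().__init__()
        self.head = nn.Conv2d(embed_dim, 1, kernel_size=1)

    def forward(self, feats):
        return self.head(feats)

encoder = StandInEncoder()
for p in encoder.parameters():      # freeze the pretrained weights
    p.requires_grad = False
head = AGBDHead()

opt = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(4, 6, 32, 32)        # batch of 6-band image tiles
y = torch.rand(4, 1, 32, 32) * 300  # synthetic AGBD targets (e.g. Mg/ha)

for _ in range(5):                  # a few training steps on the head only
    opt.zero_grad()
    pred = head(encoder(x))
    loss = loss_fn(pred, y)
    loss.backward()
    opt.step()

print(pred.shape)  # torch.Size([4, 1, 32, 32])
```

Freezing the encoder keeps training cheap and lets the sparse labels go toward fitting only the small head, which is one reason this pattern suits the sparse-label setting discussed earlier.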
We expect the audience to have a working knowledge of machine learning and Python, and a basic understanding of the self-supervised learning models that form the basis of Foundation Models.


Prior Knowledge Expected

No previous knowledge expected

Harini Srinivasan is a Senior Technical Staff Member in the IBM Sustainability Software organization. She currently manages a team of data scientists and machine learning engineers and focuses on building AI solutions using weather data, satellite imagery, and other geo-spatial data. Over her tenure at IBM, Harini has worked with many clients, whether analyzing and fixing performance problems in enterprise applications or bringing innovative solutions in technical areas such as deployment, social media data analysis, and B2B solutions using weather and other geo-spatial data. Her current focus is the application of AI models (including generative AI) that use environmental data, such as satellite imagery and weather, for sustainability solutions such as outage prediction, vegetation management, and carbon sequestration.

Data Scientist at IBM