This R&D initiative enhances an existing Random Forest–based workflow for predicting sedimentological genetic elements in uncored wells by integrating Large Language Models (LLMs). A multi-agent Retrieval-Augmented Generation (RAG) framework addresses the main limitation of point-by-point prediction by bringing vertical context, geological principles, and expert input into the process. The methodology begins with a supervised machine learning model trained on cored well data, followed by an unsupervised approach that accounts for variations in uncored intervals. LLMs then refine the vertical stacking patterns in each well and enforce consistency at field scale, drawing on a vector database of geological knowledge.
Preliminary testing shows a 15–25% improvement in accuracy over standalone Random Forest models, with enhanced alignment to established stratigraphic principles and a significant reduction in manual interpretation time. By bridging the gap between purely quantitative models and qualitative geological expertise, this approach yields more robust and efficient reservoir characterisations. Future work aims to expand the geological principles agent’s capabilities, incorporate larger LLM context windows, and validate the system across diverse geological settings.
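As an illustration of the retrieval step described above, the following minimal Python sketch collapses point-by-point predictions into a vertical stacking summary, retrieves the most relevant geological principles, and assembles a prompt for an LLM. It is not the authors' implementation: the bag-of-words similarity stands in for a learned embedding model and vector database, the knowledge snippets are illustrative placeholders, and every function name (`embed`, `retrieve`, `stacking_summary`, `build_prompt`) is hypothetical.

```python
import math
from collections import Counter
from itertools import groupby

# Illustrative placeholder snippets; the actual knowledge base is not described here.
PRINCIPLES = [
    "channel fill commonly fines upward and overlies an erosional base",
    "mouth bar deposits commonly coarsen upward above prodelta mudstone",
    "overbank fines are laterally associated with channel fill",
]


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k snippets most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def stacking_summary(labels):
    """Collapse per-sample labels (top to bottom) into (label, n_samples) runs."""
    return [(label, len(list(run))) for label, run in groupby(labels)]


def build_prompt(labels, docs, k=2):
    """Assemble an LLM prompt from the predicted succession and retrieved context."""
    runs = stacking_summary(labels)
    query = " ".join(label for label, _ in runs)
    context = retrieve(query, docs, k)
    succession = "\n".join(f"- {label}: {n} samples" for label, n in runs)
    principles = "\n".join(f"- {c}" for c in context)
    return (
        "Predicted vertical succession (top to bottom):\n"
        f"{succession}\n\n"
        "Relevant geological principles:\n"
        f"{principles}\n\n"
        "Flag any intervals that look geologically implausible."
    )


# Hypothetical Random Forest output for six consecutive depth samples.
rf_labels = ["channel fill"] * 3 + ["overbank fines"] + ["channel fill"] * 2
prompt = build_prompt(rf_labels, PRINCIPLES)
```

Passing the LLM a run-length summary rather than individual depth samples is one simple way to expose vertical context that a point-by-point classifier never sees; in a full workflow the returned prompt would be sent to an LLM agent, a step omitted here.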
Seksaf, M.A., Clay, D., Kostic, B., Smith, R. and Charlaftis, D., 2025, September. Enhancing ML-Based Reservoir Characterisation with Large Language Models: A Multi-Agent RAG Workflow for Improved Sedimentological Prediction. In Sixth EAGE Borehole Geology Workshop (Vol. 2025, No. 1, pp. 1-4). European Association of Geoscientists & Engineers.