Session Manager: Joanne Suffert (Cegal)
12:30 Turning Unstructured Subsurface Data into Trusted AI: A Secure RAG Approach
Raphael Peltzer, Senior Data Scientist, Cegal AS
Abstract:
The oil and gas industry continues to digitise subsurface data, yet a significant portion of high-value information enriched by subject matter experts remains trapped in unstructured formats such as scanned reports, well documents, and legacy PDFs. While the OSDU® Data Platform standardises structured datasets, unlocking value from unstructured content requires both semantic enrichment and rigorous, enterprise-grade security.
This presentation introduces a scalable, entitlement-first Retrieval Augmented Generation (RAG) architecture developed in collaboration with a major operator, transforming unstructured OSDU-referenced content into actionable, context-aware intelligence. The approach combines document reconstruction logic and header-aware chunking with a hybrid retrieval strategy, integrating vector search and keyword-based methods to improve retrieval quality. Evaluation on a curated set of subsurface-specific questions is used to guide the tuning of key system parameters and quantify performance improvements.
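The abstract does not say how the vector and keyword results are combined. As an illustration only, one common way to merge two ranked lists is reciprocal rank fusion (RRF). The sketch below assumes each retriever returns an ordered list of chunk identifiers; the function name, the identifiers, and the constant k=60 are assumptions for the example, not details of the presented system.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into a single ranking.

    Each list contributes 1 / (k + rank) to a chunk's score, so chunks
    that rank well in more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id to keep output stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from a vector index and a keyword index.
vector_hits = ["c3", "c1", "c7"]
keyword_hits = ["c1", "c9", "c3"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Chunks found by both retrievers ("c1" and "c3") end up ahead of chunks found by only one, which is the behaviour a hybrid strategy is usually after.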
A key architectural principle is the enforcement of access control and watermarking by mapping user identity to OSDU Access Control Lists (ACLs), ensuring that only authorised data is retrieved and that both outputs and source documents remain traceable. The solution is designed for seamless integration into existing subsurface workflows, allowing users to access OSDU-sourced insights directly within their daily tools without introducing additional friction, thereby reducing the time spent on searching and validating data.
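The entitlement-first idea can be sketched as a filter applied to retrieved chunks before any text reaches the language model. This is a minimal sketch, assuming each chunk carries the viewer groups of its source record and that the user's group memberships have already been resolved; the `Chunk` class, the field names, and the group names are hypothetical, and watermarking is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    viewers: frozenset  # groups allowed to read the source record (assumed field)


def filter_by_entitlement(chunks, user_groups):
    """Keep only chunks whose viewer groups overlap the user's groups."""
    user_groups = set(user_groups)
    return [c for c in chunks if c.viewers & user_groups]


# Hypothetical chunks and group names, for illustration only.
chunks = [
    Chunk("c1", "Well report summary", frozenset({"data.wells.viewers"})),
    Chunk("c2", "Restricted licence note", frozenset({"data.legal.viewers"})),
    Chunk("c3", "Core description", frozenset({"data.wells.viewers", "data.geo.viewers"})),
]
allowed = filter_by_entitlement(chunks, ["data.wells.viewers"])
print([c.chunk_id for c in allowed])
```

Filtering before generation, rather than after, means unauthorised text never enters the prompt. A chunk with an empty viewer set is excluded for every user, which is the safer default.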
Beyond retrieval, the architecture establishes a foundation for future agent-based workflows, where AI systems can reason over both structured and unstructured data while maintaining full compliance with enterprise security requirements. The presentation highlights key lessons learned, challenges encountered, and the practical considerations required to scale the solution in an enterprise setting.
13:30 Bridging Legacy and Cloud – Preparing Petrel project data for OSDU
Adam Watt, Product Manager, Cegal
Abstract:
As energy companies modernise their data estates, one challenge remains: unlocking value from legacy Petrel project data. While the OSDU™ Data Platform provides a cloud-native, vendor-agnostic foundation for data access and reuse, Petrel data is often locked inside proprietary project structures that are difficult to search, govern, and integrate beyond the desktop.
This presentation shows how Petrel projects can be transformed into OSDU-ready data through automated extraction, enrichment, and metadata normalisation. Using domain-aware tools such as Blueback Project Tracker and purpose-built extraction pipelines, large volumes of Petrel projects can be scanned, assessed, and prepared for ingestion. The session highlights common challenges, including version inconsistencies, incomplete legacy archives, embedded unstructured content, and missing spatial metadata, and explains how validation, enrichment, and reference-data mapping improve quality and alignment with OSDU domain models.
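To make the validation and reference-data mapping step concrete, here is a small sketch of normalising one extracted metadata record. It assumes a record is a plain dictionary with `name`, `depth_unit`, and `crs` keys; the keys, the alias table, and the canonical unit codes are hypothetical and do not reflect the actual extraction pipeline or OSDU reference values.

```python
# Hypothetical alias table; real reference values would come from the
# target platform's reference data, not from this example.
UNIT_ALIASES = {
    "m": "m", "meter": "m", "metres": "m",
    "ft": "ft", "feet": "ft",
}


def normalise_record(raw):
    """Return a cleaned copy of the record and a list of validation issues."""
    issues = []
    unit_key = str(raw.get("depth_unit", "")).strip().lower()
    unit = UNIT_ALIASES.get(unit_key)
    if unit is None:
        issues.append(f"unknown depth unit: {raw.get('depth_unit')!r}")
    if raw.get("crs") in (None, ""):
        issues.append("missing spatial reference (CRS)")
    cleaned = {
        "name": str(raw.get("name", "")).strip(),
        "depth_unit": unit,
        "crs": raw.get("crs") or None,
    }
    return cleaned, issues


record, problems = normalise_record({"name": " Well A ", "depth_unit": "Feet"})
print(record, problems)
```

Returning the issues alongside the cleaned record, instead of raising on the first problem, lets a scan over many projects report all gaps (such as missing spatial metadata) in one pass.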
A central theme is the shift from treating Petrel projects as static containers to viewing them as structured sources of reusable domain data. By breaking projects into independently managed data objects, organisations can enable lineage, governance, cross-project comparison, and more collaborative cloud workflows.
The talk shares practical lessons from real-world deployments, including archive scanning, project prioritisation, ingestion integration, and quality assurance feedback loops. Attendees will leave with a clearer view of how to approach legacy-to-cloud transformation at scale and turn trapped project data into governed, discoverable, cloud-ready assets.
09:00 From Data Chaos to Insight: Agent-Driven Subsurface Workflows
Thomas Meldahl Olsen, Product Owner, Cegal AS
Abstract:
Recent advances in agent-based artificial intelligence are transforming how subsurface professionals interact with enterprise data ecosystems, information models, and technical workflows. Instead of manually discovering, validating, and integrating data across multiple systems, intelligent agents act as context-aware assistants that interpret user intent, reason across distributed data sources, and orchestrate end-to-end data management processes.
This work presents a practical demonstration of an agent-driven approach applied to a subsurface use case using an open field dataset. Starting from a high-level objective, the agent discovers and evaluates available data assets across enterprise repositories, summarizes field and reservoir context, inventories wells and associated datasets, and assesses data completeness, consistency, and quality. The agent integrates information from both internal data platforms and external regulatory and public data services to identify and resolve gaps, including missing interpretation data.
Building on this foundation, the agent generates reproducible workflows for data ingestion, transformation, conditioning, and enrichment within a unified environment aligned with enterprise data models. It demonstrates how heterogeneous data can be harmonized through metadata-driven approaches and interoperable services. As part of the workflow, the agent constructs an analytical pipeline, including a machine learning model to estimate missing subsurface properties.
By interacting with data catalogs, services, orchestration frameworks, and computational platforms, the agent highlights the importance of interoperability and standardization. This work shows how agent-based systems bridge the gap between domain intent and governed, reproducible workflows, improving data accessibility, strengthening data quality, and accelerating integrated digital subsurface workflows.
09:30 From Data Visibility to Action: Practical Workflows for Geoscience Data Management at Scale
Xingyu Zhang Espedal, Software analyst Geoscience, Cegal AS
Abstract:
The growing volume and complexity of geoscience data require scalable solutions for efficient data governance and workflows. This paper demonstrates how data-driven approaches and digital transformation principles can be applied in practice through a data management platform, using Blueback Project Tracker as a case study.
Blueback Project Tracker scans and structures geoscience data, enabling users to identify data duplication, mispositioned assets, and large datasets. Automated project size calculation and integration with visualization tools such as Power BI support data-driven decision-making.
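The abstract does not describe how duplicates are detected. As a generic illustration of the idea, and not a description of Blueback Project Tracker itself, duplicated data can be found by grouping files on a hash of their content. The sketch assumes file contents are already available as bytes; the paths are made up.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files):
    """Group paths whose content is byte-for-byte identical.

    `files` maps a path to its content as bytes. Returns a list of
    groups, each a sorted list of paths sharing the same SHA-256 digest.
    """
    by_digest = defaultdict(list)
    for path, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


# Hypothetical scan result, for illustration only.
scan = {
    "projA/seismic/cube1.segy": b"trace-data-1",
    "projB/seismic/cube1_copy.segy": b"trace-data-1",
    "projA/wells/w1.las": b"log-data",
}
print(find_duplicates(scan))
```

For large seismic volumes, hashing whole files is expensive; a practical scan would likely compare sizes first and hash only the candidates that match.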
Beyond monitoring, the platform is used in data assessment and consultancy workflows to extract Petrel metadata and support data rationalization. Using command-line (CLI) functionality, standardized workflows are applied, including consolidating data from multiple projects into a single environment, distributing data across projects, and reconnecting or externalizing seismic data to reduce redundancy.
The platform also supports key data management tasks such as project upgrading, archiving, and deletion. Batch processing with optional backup, rule-based validation, and flexible filtering improves efficiency and reduces operational risk. User activity monitoring and access control further support governance and data security.
Overall, this work demonstrates how systematic workflows powered by Blueback Project Tracker can turn data chaos into strategic advantage, eliminating redundancy, unlocking hidden storage savings, and supporting efficient collaboration across subsurface teams.