AI and machine learning in upstream oil & gas
The upstream industry has always run on inference under uncertainty — estimating what lies kilometres underground from sparse, expensive, indirect measurements. That is precisely the kind of problem machine learning is built for. But the subsurface is also where naive machine learning fails most spectacularly: tiny datasets, brutal extrapolation, and physics that punishes a model for ignoring it. This is a systematic tour of where AI and ML genuinely add value across the upstream value chain, how the methods actually work, where they break, and why the field is converging on physics-informed approaches rather than pure black boxes.
Why upstream is both ideal and hostile for ML
Upstream generates enormous volumes of data: seismic surveys measured in terabytes, continuous sensor streams from drilling rigs and producing wells, decades of production history, and millions of feet of well logs. The promise is obvious — patterns too subtle or too high-dimensional for a human to see, surfaced automatically and at scale. Yet the same domain is unusually hostile to off-the-shelf machine learning. Labelled data is scarce and expensive (every label may cost a multi-million-dollar well). The systems are governed by hard physics — mass, momentum, and energy balances — that a correlation-only model will happily violate. And the cost of a confident wrong answer is measured in dry holes and abandoned facilities, not a mis-served advertisement. Everything that follows is shaped by that tension.
A practical precondition for any of this is a usable data foundation. The industry’s move toward standardized, vendor-neutral data platforms — the Open Subsurface Data Universe (OSDU) being the most prominent — exists because the single biggest blocker to upstream ML is not algorithms but siloed, inconsistent, poorly governed data. No model survives contact with a spreadsheet whose units nobody documented.
A map of the value chain
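AI/ML is not one thing applied once; it is many techniques applied at distinct stages, each with different data, different stakes, and different maturity. The cleanest way to organize the field is by where in the asset lifecycle the model lives: seismic and exploration, petrophysics and well logs, reservoir characterization and modeling, drilling, and production and surveillance.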
The spectrum from black box to physics
Before the applications, one organizing idea earns its place above all others: every upstream model sits somewhere on a spectrum between a pure data-driven model that knows only correlations and a pure physics simulator that encodes the governing equations. Pure ML is fast and flexible but extrapolates dangerously and ignores conservation laws. Pure simulation is faithful but slow and hungry for parameters you cannot measure. The fertile middle is physics-informed machine learning — models that learn from data while being constrained to respect the physics. This is the single most important trend in technical upstream AI, and it is where the field is heading.
Domain by domain
1 · Seismic and exploration
Seismic interpretation was among the first upstream domains transformed by deep learning, because a seismic volume is essentially a 3D image and convolutional neural networks (CNNs) excel at images. The landmark example is automated fault detection: where interpreters once hand-picked faults slice by slice, a 3D CNN trained on synthetic seismic now outputs a fault-probability volume in minutes (Wu et al.’s FaultSeg3D being the canonical reference). Related tasks include salt-body and channel segmentation, seismic facies classification, and noise attenuation. A deeper frontier is velocity inversion — recovering the subsurface velocity model from raw waveforms — where physics-informed networks encode the wave equation directly so the inversion respects wave physics rather than merely fitting amplitudes.
2 · Petrophysics and well logs
Logs are the highest-resolution direct window into the rock, and ML serves several roles. Log prediction reconstructs missing or degraded curves (synthesizing a sonic or a density log from the others). Lithofacies classification, which assigns each depth a rock type from log responses, is the textbook supervised-learning task in petrophysics, popularized by the 2016 SEG machine-learning contest (Hall, 2016), whose dataset is still a teaching staple. ML also automates formation-top picking and feeds directly into rock typing, where clustering and classification group the reservoir into flow units. The recurring caution: logs from one field rarely transfer to another without recalibration, because the same tool reading means different things in different rocks.
3 · Reservoir characterization and modeling
Full-physics reservoir simulation is accurate but expensive — a single run can take hours, and uncertainty studies need thousands. ML answers with surrogate (proxy) models: a fast statistical emulator trained on a limited set of full simulations that then predicts outcomes across the parameter space in milliseconds, enabling optimization and uncertainty quantification that would otherwise be intractable. ML also accelerates assisted history matching — tuning a model to reproduce observed production — and underpins data-driven reservoir modeling, where field behaviour is learned largely from data with physics as a guide rather than starting from a fully built geological model.
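The surrogate idea can be sketched in a few lines, under heavy simplifying assumptions. In the example below, `expensive_simulator` is a hypothetical stand-in for a full-physics run (a toy formula, not a reservoir model), the single input is an assumed normalized permeability multiplier, and the emulator is the posterior mean of a Gaussian process with a squared-exponential kernel. The length scale and noise level are illustrative choices, not recommended values.

```python
import numpy as np

def expensive_simulator(perm_multiplier):
    """Hypothetical stand-in for a full-physics run: recovery factor
    as a smooth function of a normalized permeability multiplier."""
    return 0.35 + 0.15 * np.tanh(2.0 * (perm_multiplier - 0.5))

def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two 1D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

class GaussianProcessSurrogate:
    """Gaussian-process posterior mean used as a fast emulator."""

    def __init__(self, length_scale=0.3, noise=1e-6):
        self.length_scale = length_scale
        self.noise = noise

    def fit(self, x_train, y_train):
        self.x_train = np.asarray(x_train, dtype=float)
        y_train = np.asarray(y_train, dtype=float)
        # Centre the targets so the zero-mean prior is a reasonable assumption.
        self.y_mean = y_train.mean()
        k = rbf_kernel(self.x_train, self.x_train, self.length_scale)
        k += self.noise * np.eye(self.x_train.size)
        self.alpha = np.linalg.solve(k, y_train - self.y_mean)
        return self

    def predict(self, x_new):
        x_new = np.asarray(x_new, dtype=float)
        k_star = rbf_kernel(x_new, self.x_train, self.length_scale)
        return self.y_mean + k_star @ self.alpha

# Eight "full simulations" train the emulator ...
x_runs = np.linspace(0.0, 1.0, 8)
surrogate = GaussianProcessSurrogate().fit(x_runs, expensive_simulator(x_runs))

# ... which is then queried densely across the parameter range.
x_query = np.linspace(0.0, 1.0, 201)
rf_mean = surrogate.predict(x_query)
max_error = np.max(np.abs(rf_mean - expensive_simulator(x_query)))
```

In a real study the training runs would come from the simulator itself, the input space would have many dimensions, and the emulator's predictive variance (omitted here) would feed the uncertainty quantification.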
4 · Drilling
Drilling is a real-time, sensor-dense activity, which suits ML well. Models optimize rate of penetration (ROP) by recommending weight-on-bit and rotary speed; predict drilling dysfunctions such as stuck pipe, kicks, and washouts before they escalate; and support geosteering by interpreting logging-while-drilling data on the fly. Here the payoff is immediate and measurable — non-productive time avoided is money saved the same day — which is why drilling analytics has been one of the faster areas to reach production deployment.
5 · Production and surveillance
This is where data is richest and ML is most operationally embedded. Production forecasting extends classical decline-curve analysis with sequence models (LSTMs and, increasingly, graph networks that capture well-to-well interference). Virtual flow metering infers rates from pressure and temperature when physical meters are absent or unreliable. And predictive maintenance — most famously electric submersible pump (ESP) failure prediction — flags equipment degradation from sensor trends so an intervention can be planned before an unplanned, production-killing failure. Surveillance dashboards increasingly fold these models in so that the wells needing attention rise automatically to the top of the queue.
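For reference, the classical decline-curve baseline that these sequence models extend is the Arps family, with rate q(t) = qi / (1 + b·Di·t)^(1/b) for the hyperbolic case. The sketch below is one simple way to evaluate and fit it, assuming monthly data and a coarse grid search over b and Di. The function names, grids, and synthetic history are illustrative assumptions; a production workflow would normally use a proper nonlinear optimizer.

```python
import numpy as np

def arps_rate(t, qi, di, b):
    """Arps decline rate: exponential for b == 0, hyperbolic for b > 0."""
    t = np.asarray(t, dtype=float)
    if b == 0:
        return qi * np.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)

def arps_cumulative(t, qi, di, b):
    """Cumulative production from time 0 to t (valid for 0 < b < 1)."""
    t = np.asarray(t, dtype=float)
    return qi / ((1.0 - b) * di) * (1.0 - (1.0 + b * di * t) ** (1.0 - 1.0 / b))

def fit_arps(t, q, b_grid, di_grid):
    """Grid search over (b, di); for each pair the best qi has a closed form."""
    best = None
    for b in b_grid:
        for di in di_grid:
            shape = arps_rate(t, 1.0, di, b)           # decline shape for qi = 1
            qi = np.dot(q, shape) / np.dot(shape, shape)
            sse = np.sum((q - qi * shape) ** 2)
            if best is None or sse < best[0]:
                best = (sse, qi, di, b)
    return best[1:]                                    # (qi, di, b)

# Synthetic, noise-free monthly history (rates in arbitrary units, di per month).
months = np.arange(0, 36)
observed = arps_rate(months, qi=1000.0, di=0.10, b=0.5)

qi_fit, di_fit, b_fit = fit_arps(
    months, observed,
    b_grid=np.linspace(0.1, 0.9, 9),
    di_grid=np.linspace(0.02, 0.20, 10),
)
```

With only three parameters the fit cannot chase individual noisy points, which is why it makes a useful benchmark for any more flexible forecaster.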
| Domain | Representative task | Typical technique | Output |
|---|---|---|---|
| Seismic | Fault / salt detection | 3D CNN (U-Net family) | Probability volume |
| Seismic | Velocity inversion | Physics-informed NN / FWI | Velocity model |
| Petrophysics | Facies / log prediction | Gradient boosting, CNN, RNN | Classified / synthesized curve |
| Reservoir | Surrogate modeling | Neural net / Gaussian process | Fast forecast emulator |
| Drilling | ROP / dysfunction | Tree ensembles, time-series NN | Recommendation / alarm |
| Production | Forecast / interference | LSTM, graph neural network | Rate and EUR (estimated ultimate recovery) forecast |
| Facilities | ESP / equipment failure | Anomaly detection, survival models | Time-to-failure / alert |
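As a rough illustration of the anomaly-detection entry in the last row of the table, the sketch below flags samples that deviate strongly from a trailing baseline. It is a deliberately simple rolling z-score, not a survival model or a field-proven ESP algorithm, and the signal (a synthetic motor current), window length, and threshold are all assumptions chosen for the example.

```python
import numpy as np

def rolling_zscore_alerts(signal, window=24, threshold=4.0):
    """Flag samples far from the mean of the preceding `window` samples,
    measured in units of that window's standard deviation."""
    signal = np.asarray(signal, dtype=float)
    alerts = np.zeros(signal.size, dtype=bool)
    for i in range(window, signal.size):
        baseline = signal[i - window:i]
        spread = baseline.std()
        if spread > 0:
            alerts[i] = abs(signal[i] - baseline.mean()) > threshold * spread
    return alerts

# Synthetic hourly motor current (amps): a steady oscillation, then a step change.
hours = np.arange(200)
motor_current = 40.0 + 0.5 * np.sin(hours)
motor_current[150:] += 5.0

alerts = rolling_zscore_alerts(motor_current)
first_alert_hour = int(np.argmax(alerts))
```

A detector like this only says that something changed. Turning that into a time-to-failure estimate needs labelled failure histories and a model built for that purpose.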
The physics-informed turn
Why has the industry converged on physics-informed methods? Because a model that scores well on a random test split can still produce physically absurd results — negative saturations, mass that appears from nowhere, pressure responses that violate diffusion. The fix is to make the physics part of the training objective. A physics-informed neural network (PINN) minimizes a composite loss: a data term that fits observations plus a physics term that penalizes violations of the governing partial differential equation, evaluated by automatic differentiation of the network itself.
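$$\mathcal{L} = \mathcal{L}_{\text{data}} + \lambda\,\mathcal{L}_{\text{physics}}$$
Here $\mathcal{L}_{\text{data}}$ is the misfit to measurements, $\mathcal{L}_{\text{physics}}$ is the residual of the governing PDE (mass / momentum / energy), and the weight $\lambda$ balances trusting the data against trusting the physics.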
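The effect of the physics term can be seen in a small sketch that deliberately swaps out the neural network. Here a degree-5 polynomial stands in for the network, its second derivative is written analytically instead of being obtained by automatic differentiation, and the governing equation is assumed to be the 1D steady-state form d²p/dx² = 0 (single-phase, incompressible flow in a homogeneous medium). The pressure values are made up for illustration. Because the model is linear in its coefficients, minimizing the data term plus λ times the physics term reduces to one least-squares solve.

```python
import numpy as np

DEGREE = 5  # polynomial order of the stand-in model (an assumption)

def basis(x):
    """Columns x**0 .. x**DEGREE evaluated at x."""
    return np.vander(x, DEGREE + 1, increasing=True)

def basis_second_derivative(x):
    """Second derivative of each basis column: k*(k-1)*x**(k-2)."""
    cols = [np.zeros_like(x), np.zeros_like(x)]
    cols += [k * (k - 1) * x ** (k - 2) for k in range(2, DEGREE + 1)]
    return np.stack(cols, axis=1)

def fit_pressure(x_obs, p_obs, x_col, lam):
    """Minimize L_data + lam * L_physics as one linear least-squares problem."""
    a = np.vstack([basis(x_obs), np.sqrt(lam) * basis_second_derivative(x_col)])
    rhs = np.concatenate([p_obs, np.zeros(x_col.size)])
    coeffs, *_ = np.linalg.lstsq(a, rhs, rcond=None)
    return coeffs

def physics_loss(coeffs, x_col):
    """Sum of squared PDE residuals d2p/dx2 at the collocation points."""
    return float(np.sum((basis_second_derivative(x_col) @ coeffs) ** 2))

# Four noisy pressure measurements (bar) along a normalized distance x.
x_obs = np.array([0.0, 0.3, 0.7, 1.0])
p_obs = np.array([302.0, 267.0, 233.0, 198.0])
x_col = np.linspace(0.0, 1.0, 21)   # collocation points where the PDE is enforced

loss_data_only = physics_loss(fit_pressure(x_obs, p_obs, x_col, lam=0.0), x_col)
loss_informed = physics_loss(fit_pressure(x_obs, p_obs, x_col, lam=1e4), x_col)
```

With `lam=0.0` the polynomial threads every noisy point and curves freely in between. With a large `lam` the fit is pulled toward the straight-line pressure profile the assumed equation requires, at the cost of a small data misfit.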
The workflow nobody photographs
The glamour is in the model; the value is in the workflow around it. A deployable upstream ML system is a loop, not a one-off script: assemble and clean data, engineer features with domain meaning, split correctly, train, validate, deploy, and — critically — monitor for drift as the field changes, then retrain. Skip the loop and a model that dazzled in a notebook quietly rots in production as new wells, new operating conditions, and instrument changes pull the live data away from what it was trained on.
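The monitoring step of that loop can start very simply. The sketch below computes a population stability index (PSI) for one input feature, comparing the sample the model was trained on with a recent live sample. The feature name, synthetic data, and bin count are assumptions for illustration, and PSI is only one of several drift measures that could be used here.

```python
import numpy as np

def population_stability_index(reference, live, n_bins=10, floor=1e-6):
    """PSI between a training-time sample and a live sample of one feature."""
    reference = np.asarray(reference, dtype=float)
    live = np.asarray(live, dtype=float)
    # Interior bin edges from reference quantiles, so each bin holds
    # roughly 1/n_bins of the reference sample.
    edges = np.quantile(reference, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    ref_frac = np.bincount(np.searchsorted(edges, reference), minlength=n_bins) / reference.size
    live_frac = np.bincount(np.searchsorted(edges, live), minlength=n_bins) / live.size
    # Floor the fractions so empty bins do not produce log(0).
    ref_frac = np.clip(ref_frac, floor, None)
    live_frac = np.clip(live_frac, floor, None)
    return float(np.sum((live_frac - ref_frac) * np.log(live_frac / ref_frac)))

rng = np.random.default_rng(0)
training_pressure = rng.normal(100.0, 10.0, 5000)  # synthetic tubing-head pressure, bar
live_stable = rng.normal(100.0, 10.0, 5000)        # same operating conditions
live_drifted = rng.normal(120.0, 10.0, 5000)       # e.g. after a choke or sensor change

psi_stable = population_stability_index(training_pressure, live_stable)
psi_drifted = population_stability_index(training_pressure, live_drifted)
```

A commonly quoted rule of thumb treats a PSI above roughly 0.2 as a material shift worth investigating, though any threshold should be calibrated against the asset's own history before it triggers a retrain.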
Subsurface-specific pitfalls
Generic ML advice is necessary but not sufficient here; the subsurface adds failure modes of its own.
- Data leakage through spatial correlation. Splitting train and test by random row lets samples from the same well sit on both sides, so the model memorizes the well rather than learning the physics — and scores beautifully until it meets a new well. Always split by well, field, or time.
- Tiny labelled datasets. Tens of wells, not millions of rows of independent examples. This is why synthetic training data (for seismic CNNs) and physics constraints (for everything) matter so much.
- Extrapolation, not interpolation. The questions that matter — undrilled locations, future pressures — lie outside the training range, exactly where pure ML is least trustworthy.
- Non-stationarity. Reservoirs deplete and operating conditions change, so the relationship the model learned last year may not hold this year.
- Overfitting where a classical method would do. An ML decline curve that bends to honour every noisy point forecasts worse than a disciplined Arps fit. Flexibility is not free.
- Interpretability and trust. A black-box recommendation that an engineer cannot interrogate will not (and should not) drive a multi-million-dollar decision.
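The first pitfall in the list, leakage from random row splits, has a mechanical fix: hold out whole wells. The following is a minimal sketch of a well-level split, assuming each row of a log table carries a well identifier. The well names and sample counts are hypothetical. Libraries such as scikit-learn provide equivalent group-aware splitters (for example `GroupKFold` and `GroupShuffleSplit`).

```python
import numpy as np

def split_by_well(well_ids, test_fraction=0.25, seed=0):
    """Return boolean (train_mask, test_mask) with every well on one side only."""
    well_ids = np.asarray(well_ids)
    wells = np.unique(well_ids)
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_fraction * wells.size)))
    # Sample whole wells, not rows, for the held-out set.
    test_wells = rng.choice(wells, size=n_test, replace=False)
    test_mask = np.isin(well_ids, test_wells)
    return ~test_mask, test_mask

# Hypothetical log table: 8 wells, 50 depth samples each.
well_ids = np.repeat([f"W-{i:02d}" for i in range(8)], 50)
train_mask, test_mask = split_by_well(well_ids)

train_wells = set(well_ids[train_mask])
test_wells = set(well_ids[test_mask])
```

The same pattern applies with field names or time windows as the grouping key when the question is transfer to a new field or to future conditions.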
The generative and agentic frontier
The newest wave extends beyond prediction into generation and autonomy. Generative models now synthesize plausible geological realizations and fill data gaps, supporting uncertainty studies that need many equiprobable scenarios. Foundation and large language models are being pointed at the mountain of unstructured upstream knowledge — well reports, end-of-well summaries, historical interpretations — to retrieve and synthesize what used to take an engineer days to dig out. And AI agents operating over unified data platforms can chain tasks: pull a well’s data, run a diagnostic, draft a summary, flag an anomaly. The promise is real, but so are the cautions that run through this whole article — physical consistency, honest validation, interpretability, and a human in the loop for any decision that spends real capital. The trajectory is clear: not AI replacing subsurface judgment, but AI compressing the distance between a question and a defensible, physics-consistent answer.
Closing
AI and ML are now woven through the upstream value chain — CNNs reading seismic, classifiers typing rock, surrogates accelerating simulation, sequence models forecasting production, and anomaly detectors saving pumps. What separates durable value from hype is discipline that the subsurface demands more than most domains: respect the physics, guard against leakage, test on unseen wells, keep the human in the loop, and never confuse a good test score with a good decision. Get those right, and machine learning becomes what it should be in upstream — not a replacement for reservoir engineering and geoscience, but a powerful amplifier of both.
References
LeCun, Y., Bengio, Y., Hinton, G. (2015). Deep Learning. Nature, 521, 436–444.
Raissi, M., Perdikaris, P., Karniadakis, G. E. (2019). Physics-Informed Neural Networks. Journal of Computational Physics, 378, 686–707.
Karniadakis, G. E., Kevrekidis, I. G., Lu, L., Perdikaris, P., Wang, S., Yang, L. (2021). Physics-Informed Machine Learning. Nature Reviews Physics, 3, 422–440.
Wu, X., Liang, L., Shi, Y., Fomel, S. (2019). FaultSeg3D: Using Synthetic Datasets to Train an End-to-End CNN for 3D Seismic Fault Segmentation. Geophysics, 84(3), IM35–IM45.
Bergen, K. J., Johnson, P. A., de Hoop, M. V., Beroza, G. C. (2019). Machine Learning for Data-Driven Discovery in Solid Earth Geoscience. Science, 363(6433).
Hall, B. (2016). Facies Classification Using Machine Learning. The Leading Edge, 35(10), 906–909.
Mohaghegh, S. D. (2017). Data-Driven Reservoir Modeling. Society of Petroleum Engineers.
Latrach, A., et al. (2024). A Critical Review of Physics-Informed Machine Learning Applications in Subsurface Energy Systems. (Preprint / review).
The Open Group. Open Subsurface Data Universe (OSDU) Data Platform.