Lean Informatics — Field Manual Upstream Oil & Gas for the AI-First Engineer
v1 · May 2026
A FIELD MANUAL

Upstream Oil & Gas, From Zero

A working primer on petroleum exploration and operations, written for the AI-first engineer building Lean Informatics. Every concept is paired with the vocabulary you'll hear in the field and a brief note on where it shows up in our pipelines.

Twenty chapters. Roughly four hours end-to-end. Use the sidebar to jump around — each section is self-contained.

Chapter 00·~5 min

How to use this manual

This manual exists for a specific person: the engineer or operator at Lean Informatics who is technically fluent in machine learning and data infrastructure but has never worked in the petroleum industry. By the end you should be able to sit across the table from a landman, a petrophysicist, or an integrity inspector and follow the conversation — and know what would be a useful question to ask back.

It is not a substitute for years of fieldwork. It is a fast on-ramp. The goal is fluency in the vocabulary and a working mental model of the workflows, so that when a client says “we need offset logs and completions for our SCOOP bolt-on,” you know roughly what they want, where the data comes from, and why the dataset is harder to assemble than it sounds.

A note on register

The oilfield speaks plainly. Where the industry uses jargon, it's almost always because the jargon is more precise than English. PUD means something specific to an auditor; “undeveloped reserve” in casual conversation does not. The vocabulary in this manual is the vocabulary used in the field — not academic.

How each chapter is built

  • The concept in a paragraph or two.
  • The vocabulary as a definition list — these are the words you'll hear.
  • Why it matters here — a brief connection back to what Lean Informatics is building.
  • Things to remember — three to five anchor facts to internalize.

The chapters build on each other but each one stands alone. If you find yourself lost, jump back to 02 (geology) and 04 (well lifecycle); almost everything else hangs off those two.

Chapter 01·Orientation

The upstream value chain

The petroleum industry is usually described in three segments: upstream (finding and producing hydrocarbons), midstream (moving them from wellhead to refinery — pipelines, gathering, storage, processing), and downstream (refining and selling the finished products). Lean Informatics serves upstream, with NVI-style integrity work reaching into midstream.

Inside upstream, the work breaks roughly into:

Exploration
Geoscience, seismic, prospect generation. Finding where the hydrocarbons are.
Land & legal
Securing the right to drill — leases, title, surface use, regulatory permits.
Drilling
Getting a hole in the ground to the target formation.
Completion
Preparing the well to flow — casing, perforating, fracturing.
Production
Running the well, lifting the fluid, maintaining the equipment, hauling the product.
P&A
Plug and abandonment — closing out depleted wells safely.

Who does what

The industry is built on specialization. An operator (the E&P company whose name is on the lease) rarely does most of the work itself. It contracts out drilling to a drilling contractor, completions to a service company like Halliburton or SLB, integrity inspection to firms like NVI, and land work to firms like BETA Land Services. The operator's full-time staff is typically geoscientists, engineers, landmen, and an operations team. Everything else is outsourced.

Where this shows up

Almost every Lean Informatics engagement involves working with the operator's internal team and at least one of these contractor types as data sources or consumers. A scout ticket comes from the operator. A mud log comes from the drilling crew. An inspection report comes from NVI. Knowing who produced a document tells you a lot about how trustworthy and structured it will be.

Remember

  • Upstream = finding and producing. Midstream = moving. Downstream = refining and selling.
  • The operator's name is on the lease. Almost everyone else on a well is a contractor.
  • “E&P” (exploration and production) and “upstream operator” mean the same thing.
  • Different document types come from different parties — that affects format, accuracy, and where they end up filed.
Chapter 02·Foundational

Petroleum geology basics

Crude oil and natural gas form from organic matter — algae, plankton, plant debris — buried in sediment, cooked by heat and pressure over millions of years. The cooking process is called thermal maturation. The temperature window where oil forms (roughly 60–120°C) is called the oil window; hotter than that and the molecules crack further into gas.

For commercial production you need four geological conditions to coincide. Without any one of them, no oilfield.

Source rock
The original organic-rich sediment (usually a marine shale) where the hydrocarbons formed.
Reservoir rock
A porous, permeable rock the hydrocarbons can migrate into and accumulate in — typically sandstone, limestone, or fractured shale.
Trap
A geometric arrangement that stops the hydrocarbons from migrating further — an anticline, fault block, salt dome, or stratigraphic pinch-out.
Seal
An impermeable layer (often a shale or evaporite) above the trap that keeps the hydrocarbons from escaping upward.

Conventional vs. unconventional

Conventional production targets reservoirs that hold and flow hydrocarbons on their own — sandstones and carbonates with measurable permeability. Drill into them and they produce; no stimulation required.

Unconventional production, the basis of the entire shale revolution, targets the source rock itself. Shales hold huge volumes of hydrocarbons in nano-scale pores but have practically zero natural permeability: the rock is often less permeable than the cement used to seal the well casing. Hydraulic fracturing creates artificial permeability by cracking the rock open and propping the cracks with sand. Without fracturing, an unconventional well produces almost nothing.

Formations and tops

Rocks are stratified into layers, and the layers are named — the Wolfcamp, the Bakken, the Marcellus, the Eagle Ford. The depth at which a given formation begins is called the formation top. A list of formation tops for a given well, written down at known depths, is one of the most reusable pieces of geological data — operators correlate well logs by snapping them to a common set of tops.

Formation nomenclature is inconsistent. The same rock unit might be called one name in Texas and a different name in New Mexico. Operators within a basin will standardize on their own preferred names, often with internal sub-divisions (“Wolfcamp A”, “Wolfcamp B”, etc.) that are not strictly geological but reflect drilling practice.

Where this shows up

Entity resolution on formations is a real problem. The same physical rock shows up under different names across operators, eras, and basins. Our formation synonym graph is one of the assets we maintain per client.
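
The lookup side of a formation synonym table can be sketched in a few lines. This is a minimal illustration, assuming a hand-maintained alias table; the alias spellings and the `canonical_formation` helper are hypothetical, not the actual synonym graph.

```python
import re
from typing import Optional

# Hypothetical alias table: canonical name -> spellings seen in source documents.
# The entries are illustrative only.
ALIASES = {
    "Wolfcamp A": ["wolfcamp a", "wolfcamp-a", "wfmp a"],
    "Bone Spring": ["bone spring", "bonespring", "bone springs"],
}

def _key(name: str) -> str:
    """Lowercase, turn punctuation into spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", name.lower()).split())

# Invert the table once so each lookup is a single dict access.
LOOKUP = {_key(alias): canon for canon, aliases in ALIASES.items() for alias in aliases}

def canonical_formation(raw: str) -> Optional[str]:
    """Return the canonical formation name, or None if the label is unknown."""
    return LOOKUP.get(_key(raw))
```

Returning None for an unknown label, rather than guessing, keeps unresolved names visible for manual review.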

Remember

  • Four conditions for a field: source, reservoir, trap, seal.
  • Conventional = the rock flows. Unconventional = the rock has to be fractured before it flows.
  • Formation tops are the geological skeleton of a well — everything correlates against them.
  • Formation names are not standardized. They are basin-, operator-, and era-specific.
Chapter 03·Orientation

The major US plays

A basin is a regional geologic feature — a structurally subsided area that filled with sediment over time and where source, reservoir, trap, and seal coincide. A play is a specific commercial target inside a basin: usually one or a few stacked formations that operators currently develop with a similar drilling and completion design.

These are the plays that come up most often in upstream conversation. You don't need to remember the geology of each. You do need to recognize the names and have a rough sense of where they are and what they produce.

Play | Where | Mostly produces | Notes
Permian Basin | West Texas, southeast New Mexico | Oil + associated gas | The largest US producing basin. Stacked Wolfcamp and Bone Spring targets. Sub-basins: Midland and Delaware.
Eagle Ford | South Texas | Oil, condensate, gas (varies by depth) | One of the first shale plays. Maturity zones run from oil-rich shallow to gas-rich deep.
SCOOP / STACK | Central Oklahoma | Oil + wet gas | SCOOP = South Central Oklahoma Oil Province. STACK = Sooner Trend, Anadarko (basin), Canadian and Kingfisher (counties). Stacked targets: Woodford, Meramec, Osage, Springer.
Bakken / Three Forks | North Dakota, eastern Montana | Oil | Williston Basin. The play that put unconventional oil on the map.
Marcellus | Pennsylvania, West Virginia, Ohio, NY | Dry gas | The largest gas play in the US. Heavily regulated by PA and WV DEP.
Haynesville | NW Louisiana, NE Texas | Dry gas | Deep, high-pressure, high-temperature. A “tier-1” gas play that came back to life with LNG demand.
Niobrara / DJ Basin | Colorado, Wyoming | Oil + gas | Denver-Julesburg basin. Wattenberg field. Regulated by CO ECMC.
Anadarko Basin | Western Oklahoma, Texas Panhandle | Gas + oil | Hosts SCOOP/STACK. Long conventional history; the Hunton Lime trend is here.
Powder River Basin | Wyoming, Montana | Oil | Niobrara, Frontier, Mowry, Turner targets. Less mature than Permian and Bakken.
Gulf of Mexico | Federal offshore | Oil + gas | Leased by BOEM, with operational oversight by BSEE. Conventional, deepwater. Long lead times. Mostly major-operator territory.
Cotton Valley | NW Louisiana, East Texas | Gas | Conventional and tight-gas sand. Heavy land services work historically.
Tuscaloosa Marine Shale | Central Louisiana / Mississippi | Oil | Mixed results. A play that has come and gone with oil price.

Sub-basin geography matters

Saying “the Permian” is like saying “the West Coast” — too vague to be useful at the operator level. The Permian splits into the Midland Basin (east) and the Delaware Basin (west), with different formation characteristics and economics. When you hear “a Permian operator,” the next question to ask is which sub-basin and what formations.

Where this shows up

Our state adapter framework is keyed by state, but our model fine-tuning is keyed by play. A vintage Schlumberger gamma-ray log run in the Bakken looks different from a modern color log run in the Permian, and the curve tracer needs to know which one it's looking at.

Remember

  • Basin = the geological container. Play = the commercial target inside.
  • Permian, SCOOP/STACK, Bakken, Marcellus, Haynesville, Eagle Ford. Recognize all six on sight.
  • “Permian” means Midland or Delaware. Ask which.
  • Plays produce different fluid mixes — oil, wet gas (with NGLs), or dry gas — and that drives the economics.
Chapter 04·Foundational

The well lifecycle

A well goes through a predictable sequence of stages from idea to abandonment. Each stage produces a characteristic set of documents and data — knowing the sequence lets you predict what data should exist for any given well.

The eight stages

1. Prospect
The geoscience team identifies a target — a piece of acreage where source, reservoir, trap, and seal are believed to coincide. Output: a prospect map, an internal economic estimate, a recommendation to lease.
2. Lease
The land team acquires the legal right to drill. This is the BETA Land world. Output: leases, title abstracts, curative documents, ROW agreements. (See chapter 07.)
3. Permit
The operator files an Application for Permit to Drill (APD) with the relevant state regulator. The APD includes location, target formation, casing program, and surface plan. Output: a permit and an assigned API number. (See chapters 17 and 18.)
4. Drill
The hole is drilled to the target depth, often horizontally. The drilling crew produces a daily drilling report, a mud log (lithology from cuttings), and MWD/LWD data (measurement while drilling and logging while drilling). Casing is run and cemented as the hole deepens. (See chapter 08.)
5. Log
Wireline tools are run down the hole to measure rock properties — the logging suite. Output: digital LAS files and paper-format prints. (See chapter 09.)
6. Complete
The well is prepared to flow. For unconventionals: cement the casing in place, perforate at the target intervals, hydraulically fracture each stage, then drill out the plugs. Output: completion report. (See chapter 11.)
7. Produce
The well makes oil and gas. Volumes are reported monthly to the regulator. Decline curve analysis predicts future production. Output: production records, monthly state filings. (See chapter 12.)
8. P&A
Plug and abandonment. When the well is depleted, it is cemented closed and surface facilities are removed. Output: a plugging report.

The first-production milestone

Within those eight stages, the most important moment economically is first production (sometimes “first oil” or “FOO”). Everything before it is cost; everything after it is revenue. The spud date (when the bit first turns into the ground) and the first-production date together define the cycle time the operator is trying to minimize. Modern unconventional operators target a spud-to-first-production cycle of 60–120 days.

Where this shows up

The structured well master we deliver to clients is keyed by these stages. Every well has a permit document, a drilling report, a completion report, and ongoing production records — each one a distinct doc type for our router. Document type + stage tells the system what schema to apply.
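
The idea that a missing document class points at a specific stage can be sketched as a simple lookup. This is an illustration only: the stage names follow this chapter, while the document-class labels and the `missing_stages` helper are hypothetical.

```python
from typing import Iterable, List

# Stage -> document class expected once that stage is complete.
# The doc-class labels are hypothetical placeholders.
EXPECTED_DOCS = {
    "permit": "permit",
    "drill": "drilling_report",
    "log": "well_log",
    "complete": "completion_report",
    "produce": "production_record",
    "p&a": "plugging_report",
}

def missing_stages(doc_classes_on_file: Iterable[str], latest_stage: str) -> List[str]:
    """List stages up to `latest_stage` whose expected document is not on file."""
    on_file = set(doc_classes_on_file)
    stages = list(EXPECTED_DOCS)  # dicts preserve insertion order
    reached = stages[: stages.index(latest_stage) + 1]
    return [s for s in reached if EXPECTED_DOCS[s] not in on_file]

# A completed well with only a permit and a completion report on file.
print(missing_stages({"permit", "completion_report"}, "complete"))
```

For that well the gaps are the drilling report and the logs, so those are the two stages to chase.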

Remember

  • Prospect → Lease → Permit → Drill → Log → Complete → Produce → P&A. Eight stages, predictable order.
  • The API number is assigned at the permit stage. It is the well's identity from then on.
  • Spud-to-first-production is the operator's headline cycle-time metric.
  • Every stage produces specific documents. If a doc class is missing for a well, you can name what stage failed.
Chapter 05·Land & Position

The Public Land Survey System

The PLSS is the federal grid system that describes land across most of the United States west of the original thirteen colonies. Established by the Land Ordinance of 1785, it covers more than 1.5 billion acres across 30 states and remains the legal basis for mineral leases, oil and gas permits, and federal land management. If you work in upstream, you live in PLSS coordinates.

The hierarchy

The system divides land into a nested grid anchored on a principal meridian (a north-south reference line) and a baseline (east-west). Their intersection is the local point of origin for a survey region.

Township
A 6 mile × 6 mile square (36 square miles, ~23,000 acres). The primary unit of the grid. Identified by a township number north or south of the baseline and a range number east or west of the principal meridian.
Section
One square mile, 640 acres. Each township contains 36 sections, numbered 1–36 starting in the northeast corner and snaking back and forth across the township.
Quarter section
160 acres. Identified by compass direction — NE¼, NW¼, SE¼, SW¼.
Quarter-quarter
40 acres. The typical smallest meaningful unit for leasing. Written compounded: “NW¼ NE¼” is the northwest quarter of the northeast quarter — 40 acres.

Section numbering inside a township

Sections are numbered boustrophedon-style, the way an ox plows a field, reversing direction on each row. Section 1 is in the northeast corner; section 6 is in the northwest corner; section 7 sits directly south of section 6; section 36 is in the southeast corner.

 6   5   4   3   2   1
 7   8   9  10  11  12
18  17  16  15  14  13
19  20  21  22  23  24
30  29  28  27  26  25
31  32  33  34  35  36
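
The snaking pattern reduces to a small function. The sketch below assumes a standard, regular 6×6 township with north at the top; the `section_number` helper and its row/column convention are illustrative choices.

```python
def section_number(row: int, col: int) -> int:
    """Section number for a grid cell in a standard 6x6 township.

    row 0 is the northern row, col 0 is the western column.
    """
    if not (0 <= row < 6 and 0 <= col < 6):
        raise ValueError("row and col must be in 0..5")
    if row % 2 == 0:
        # Rows 0, 2, 4 count east to west, so the highest number is in the west.
        return row * 6 + (6 - col)
    # Rows 1, 3, 5 count west to east.
    return row * 6 + col + 1

# Rebuild the whole grid, north row first.
grid = [[section_number(r, c) for c in range(6)] for r in range(6)]
```

Irregular townships, with fractional or missing sections, do not follow this clean pattern and need a lookup instead.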

Reading a legal description

PLSS land descriptions are read inside-out, smallest unit first. A typical legal description looks like this:

NW¼ of the SE¼ of Section 14, T-2-S, R-3-W, 5th Principal Meridian

Decoded:

  • NW¼ of the SE¼ — northwest quarter of the southeast quarter (40 acres)
  • Section 14 — section 14 of the township (1 sq mi)
  • T-2-S — Township 2 South of the baseline
  • R-3-W — Range 3 West of the principal meridian
  • 5th Principal Meridian — which survey region we're in

Always include the principal meridian. Without it the description is technically ambiguous — “T-2-S, R-3-W” exists in multiple PLSS regions across the country.
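
A minimal parser for the exact notation shown above might look like the following. It assumes the "T-2-S, R-3-W" style and the ¼ glyph; real descriptions vary widely in punctuation and abbreviation, so both the pattern and the `parse_legal_description` helper are illustrative.

```python
import re

# Matches only the notation used in this chapter's example.
DESC_RE = re.compile(
    r"^(?P<aliquots>.*?)\s*(?:of\s+)?Section\s+(?P<section>\d+),\s*"
    r"T-(?P<twp>\d+)-(?P<twp_dir>[NS]),\s*"
    r"R-(?P<rng>\d+)-(?P<rng_dir>[EW]),\s*"
    r"(?P<meridian>.+)$"
)

def parse_legal_description(text: str) -> dict:
    m = DESC_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized legal description: {text!r}")
    # Aliquot parts such as NW¼ and SE¼, in the order written (smallest first).
    aliquots = re.findall(r"\b([NS][EW])¼", m.group("aliquots"))
    return {
        "aliquots": aliquots,
        "section": int(m.group("section")),
        "township": f"{int(m.group('twp'))}{m.group('twp_dir')}",
        "range": f"{int(m.group('rng'))}{m.group('rng_dir')}",
        "meridian": m.group("meridian").strip(),
        # Each quarter divides the nominal 640-acre section by four.
        "acres": 640 / 4 ** len(aliquots),
    }

parsed = parse_legal_description(
    "NW¼ of the SE¼ of Section 14, T-2-S, R-3-W, 5th Principal Meridian"
)
```

The acreage is nominal: it assumes a regular 640-acre section, which surveyed sections do not always match.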

Where the PLSS doesn't apply

The PLSS does not cover the 13 original colonies or a handful of other states (including Texas and Hawaii), and parts of Louisiana follow older French and Spanish grant lines. Much of that land is described by metes and bounds, a much older system that describes parcels by walking the perimeter: distance, compass bearing, and physical landmarks. This is one reason Texas land records are notoriously harder to crawl than Oklahoma or North Dakota.

Important nuance

Texas runs its own land survey system based on original Spanish-era grants — labels like “Survey 24, Block 31, T&P RR Co.” replace township-range-section. When working a Permian project, expect Texas descriptions to look entirely different from New Mexico descriptions in the same basin.

Where this shows up

Our map georeferencing pipeline has three modes — PLSS-based, lat/long-tick, and feature-based. The PLSS mode is the workhorse because it works on most US onshore acreage. Pulling the PLSS labels off a paper map and snapping them to known section corner coordinates is what turns a scanned image into a georeferenced GeoTIFF.

Remember

  • Township = 6×6 mile box. Section = 1 sq mi. Quarter = 160 ac. Quarter-quarter = 40 ac.
  • Sections number 1–36 in a snaking pattern from the northeast corner.
  • Read legal descriptions inside-out, smallest first.
  • Always cite the principal meridian.
  • Texas uses its own survey system, not PLSS. Plan for it.
Chapter 06·Land & Position

The API well number

Every well drilled in the United States receives a unique permanent identifier called the API well number (more formally, the US Well Number). It is assigned by the state regulator at the permit stage and follows the well through its entire life. If you remember one identifier in upstream data, this is the one.

The full format is 14 digits; the 10- and 12-digit forms drop the trailing segments. It is conventionally written in hyphen-separated segments:

42-501-20130-03-00
(state 42, county 501, well 20130, bore 03, event 00)
State (2 digits)
State code: 42 = Texas, 30 = New Mexico, 05 = Colorado, etc. Based on a 1952 IBM standard, alphabetical (mostly). Do not confuse with FIPS codes: FIPS 05 is Arkansas, whose API code is 03.
County (3 digits)
County within the state. 501 in Texas = Yoakum County.
Well (5 digits)
Unique well identifier within the county. Assigned sequentially by the state regulator at permit time.
Sidetrack (2 digits)
Directional sidetrack code. 00 = the original wellbore. 01, 02 … = subsequent sidetracks drilled from the same surface location to different bottomhole locations.
Event (2 digits)
Event sequence code. Tracks recompletions or other physical configuration changes within an existing wellbore. 00 = original completion.

API-10, API-12, API-14

Three lengths are in common use and they don't mean the same thing:

  • API-10 — just state, county, and well-within-county. Describes the surface location. Two sidetracks from the same surface location share an API-10.
  • API-12 — adds the sidetrack code. Describes a specific wellbore.
  • API-14 — adds the event sequence. Describes a specific completion event on a wellbore.

For most upstream data work, API-10 is the right level of granularity to identify a well. API-14 is needed when you care about which completion event you're looking at (matters for production allocation and for some types of regulatory filings).
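
Splitting an API number into its segments is mechanical. The sketch below accepts 10-, 12-, or 14-digit input with or without hyphens. Padding a short number with zeros is an assumption made here for illustration (it treats the record as the original wellbore and completion), and the `ApiNumber` and `parse_api` names are hypothetical.

```python
import re
from typing import NamedTuple

class ApiNumber(NamedTuple):
    state: str
    county: str
    well: str
    sidetrack: str  # "00" when the input had only 10 digits (assumption)
    event: str      # "00" when the input had fewer than 14 digits (assumption)

    @property
    def api10(self) -> str:
        return self.state + self.county + self.well

    @property
    def api12(self) -> str:
        return self.api10 + self.sidetrack

    @property
    def api14(self) -> str:
        return self.api12 + self.event

def parse_api(raw: str) -> ApiNumber:
    """Parse an API number given as a string, with or without hyphens."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) not in (10, 12, 14):
        raise ValueError(f"expected 10, 12, or 14 digits, got {len(digits)}: {raw!r}")
    digits = digits.ljust(14, "0")  # pad missing sidetrack/event with zeros
    return ApiNumber(digits[0:2], digits[2:5], digits[5:10], digits[10:12], digits[12:14])

api = parse_api("42-501-20130-03-00")
```

Keep API numbers as strings end to end. A number stored as an integer loses the leading zero of state codes such as 05 and will fail the length check.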

Operator habits

Most state portals serve API numbers in 10-digit form. Operators internally often track API-14. When loading a vendor extract into an operator's master database, mismatched lengths are a common source of pain. Decide which version is your primary key per engagement.

Where this shows up

API-10 is our default blocking key for entity resolution on wells. Two filings that share an API-10 plus a surface location plus an operator are almost certainly the same well. The hard work is everything else: operator name changes, well-name typos, and reformulated leases that re-label the same physical hole.
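
Blocking on API-10 means grouping records by their first ten digits before any finer comparison. A minimal sketch, assuming each record is a dict with an "api" string field; the field names and sample records are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

def api10_key(raw_api: str) -> str:
    """First ten digits of an API number, ignoring hyphens and spaces."""
    digits = re.sub(r"\D", "", raw_api)
    if len(digits) < 10:
        raise ValueError(f"too few digits for an API-10: {raw_api!r}")
    return digits[:10]

def block_by_api10(records: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group records into candidate-match blocks keyed by API-10."""
    blocks: Dict[str, List[dict]] = defaultdict(list)
    for rec in records:
        blocks[api10_key(rec["api"])].append(rec)
    return dict(blocks)

# Hypothetical filings: two lengths and formats for the same surface well.
records = [
    {"api": "42-501-20130-00-00", "source": "permit"},
    {"api": "42501201300100", "source": "completion report"},
    {"api": "4250120131", "source": "production filing"},
]
blocks = block_by_api10(records)
```

Records within a block still need the finer checks described above (surface location, operator) before they are merged.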

Remember

  • Format: SS-CCC-WWWWW-BB-EE. State, county, well, sidetrack, event.
  • API-10 = surface location. API-12 = wellbore. API-14 = wellbore + event.
  • Assigned by the state regulator at permit time, immutable thereafter.
  • State codes are not FIPS codes. Use the real API state-code lookup.
Chapter 07·Land & Position

Leases, title, and land services

Before an operator can drill, it needs the legal right to extract minerals from the ground under a specific piece of acreage. That right comes from a lease — a contract between the mineral rights owner (the lessor) and the operator (the lessee). The work of finding mineral owners, negotiating leases, researching title, and resolving conflicting claims is land services. This is the BETA Land Services world.

Surface vs. mineral estate

In the US, mineral rights can be severed from surface rights. The person who owns the surface (the farmer, the rancher) may not own what's under it. That severance creates two separate legal estates:

  • Surface estate — the right to use the land surface. Owned by the surface owner.
  • Mineral estate — the right to extract subsurface minerals. May belong to a different party, possibly broken into fractional interests across dozens of heirs.

A mineral estate that has been split among multiple parties is called fractionated. Working out who owns what fraction, often across decades of inheritances and divorces, is the bulk of what a landman does.

The lease document

An oil and gas lease typically specifies:

Lessor / lessee
Who owns the minerals; who is leasing them.
Legal description
The specific acreage covered, in PLSS or metes-and-bounds terms.
Bonus
Upfront cash payment to the lessor for signing the lease.
Primary term
How long the operator has to drill before the lease expires. Usually 3–5 years.
Royalty
The fraction of production revenue paid to the mineral owner. Historically 12.5% (one-eighth); modern leases run 18.75%–25% in competitive basins.
Held by production (HBP)
The clause that extends the lease indefinitely beyond the primary term as long as the well keeps producing in paying quantities.
Pooling clause
Permits combining multiple leases into a drilling unit (often a section) for the purpose of drilling a single well.
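
To see how fractional ownership and royalty combine, here is a deliberately simplified worked example. It assumes the owner's revenue share is just their mineral fraction times the lease royalty rate, ignoring pooling proportions, taxes, and deductions; the revenue figure and the `royalty_share` helper are hypothetical.

```python
from fractions import Fraction

def royalty_share(mineral_fraction: Fraction, royalty_rate: Fraction) -> Fraction:
    """Owner's share of gross revenue under the simplifying assumptions above."""
    return mineral_fraction * royalty_rate

# An heir holding 1/16 of the minerals under a one-eighth royalty lease.
share = royalty_share(Fraction(1, 16), Fraction(1, 8))

monthly_revenue = 100_000  # hypothetical gross revenue in dollars
payment = float(share) * monthly_revenue
```

Exact fractions avoid rounding drift when many small interests have to sum to the full royalty.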

The land services workflow

This is the work BETA Land Services and firms like it deliver to operators:

Abstracting
Walking the chain of title for a piece of acreage all the way back through county records — sometimes 150 years or more — and producing an abstract: a summary of every recorded document affecting ownership. The raw material for everything else.
Title research / opinion
An attorney's reading of the abstract: who owns what fraction of the minerals, with what burdens, and is title clear enough to drill on?
Curative
Fixing defects in title — finding missing heirs, getting affidavits signed, clearing old liens — so the title opinion can be re-issued clean.
Leasing
The landman in the field negotiating and signing the actual leases with mineral owners.
Division order
The instrument that tells the operator exactly which fraction of revenue is owed to which party. Issued after first production based on the cleared title.
Right of way (ROW)
Easements across surface land for pipelines, roads, power, and gathering infrastructure. A separate negotiation from the lease itself.
The BETA Way

A land services firm wins on speed and procedural discipline. BETA Land Services maintains a 48–72 hour deployment promise across North America, has built the firm to 51–200 employees across 37 states, and has touched 4.3 million acres and thousands of wells. The deliverable — a clean abstract, a curative file, a ready-to-execute lease — is reliable and defensible. That's the model Lean Informatics applies to upstream data.

Remember

  • Surface estate and mineral estate are legally separate. Most leasing work concerns the mineral estate.
  • A lease has a primary term and is held by production thereafter — that's why operators rush to drill before the primary term ends.
  • Abstracting → title opinion → curative → leasing → division order. That's the land services workflow.
  • The landman is the operator's most expensive non-engineer because their work is the most legally exposed.
Chapter 08·Operations

Drilling

Drilling is the part of the lifecycle that converts permits and plans into a physical hole in the ground. A single modern unconventional well takes 10–30 days to drill and runs to depths of 8,000–14,000 feet vertical with a horizontal section (the lateral) of 5,000–15,000 feet beyond that.

Vertical, directional, horizontal

  • Vertical wells drill straight down. Dominant historically; still used for shallow conventional targets and for certain monitoring/disposal wells.
  • Directional wells intentionally deviate from vertical. Used to reach a bottomhole location offset from the surface — common offshore and on tight surface pads.
  • Horizontal wells deviate to roughly 90° and run a long lateral inside the target formation. Standard for unconventional plays — the lateral exposes hundreds of times more reservoir rock to the wellbore than a vertical hole would.

Modern unconventional drilling design

A typical horizontal well has three sections: the vertical section, the curve (where it builds angle), and the lateral (the horizontal portion within the target). Operators plan and steer to a specific landing zone within the target formation — typically a 10–30 foot vertical window of the highest-quality rock.

Lateral lengths have grown steadily over the last decade. Many laterals today run 7,500–10,000 feet, with some operators experimenting with 15,000+ foot designs and even “horseshoe” geometries that turn the lateral 180° to honor lease boundaries. Longer laterals expose more rock per well and amortize the vertical drilling cost across more producing length.

Casing and cementing

A drilled hole is not stable on its own. As the bit goes deeper, the operator runs casing — concentric steel pipe — and cements it in place to isolate the wellbore from surrounding formations. A typical well has multiple casing strings:

  • Conductor casing — the outermost, near surface. Stops the hole from collapsing in the first few hundred feet.
  • Surface casing — set below the deepest fresh water aquifer. Protects groundwater. Cemented to surface.
  • Intermediate casing — isolates problem zones (pressure changes, lost circulation, weak formations).
  • Production casing — the innermost string, set across the producing interval. The one that gets perforated.

Geosteering

While drilling the lateral, the operator's geosteering team watches real-time log data from tools mounted near the bit (LWD — Logging While Drilling) and adjusts the trajectory to stay inside the target zone. Modern geosteering uses gamma ray, resistivity, and inclination measurements every few feet. A well that drifts out of the target zone produces dramatically less.

Where this shows up

The drilling stage produces some of the densest data on a well: daily drilling reports, mud logs, LWD log files. Operators integrate these into their well master and into Petrel for next-well planning. Our pipelines pull and structure all of it.

Remember

  • Vertical, directional, horizontal. Horizontal + frac is the unconventional pattern.
  • A horizontal well = vertical section + curve + lateral. The lateral is the productive part.
  • Casing is run and cemented in stages as the hole deepens. Each string has a purpose.
  • Geosteering keeps the bit inside the target zone using real-time log data.
Chapter 09·Operations

Well logs

A well log is a continuous depth-by-depth record of physical measurements taken inside the wellbore by lowering instruments down the hole. The data is plotted as parallel curves on a multi-track strip — depth on the vertical axis, measurement values on the horizontal axes. Reading well logs is one of the foundational skills of petroleum geoscience.

Logs are run by service companies (Schlumberger, Halliburton, Weatherford, and others) and delivered to the operator as both digital data (LAS files) and plotted images (paper, PDF, or TIFF). For decades, only the plotted image was standard. That is why so much legacy log data exists only as raster — and why digitizing it is hard.

The standard track layout

By convention, logs are presented on a three-track format: Track 1 sits to the left of the depth column and holds the gamma ray, SP, and caliper curves. Track 2 typically holds resistivity. Track 3 holds porosity measurements — density and neutron. Once you orient on one well log, you can orient on almost any other.

[Track layout, left to right: GR / CAL / SP | DEPTH | RESISTIVITY | NEUTRON / DENSITY]

The basic curve suite

Gamma ray (GR)
Measures natural radioactivity. Shales and clays contain naturally occurring radioactive elements (mostly potassium, uranium, thorium) and read high; clean sandstones and carbonates read low. The first curve geologists look at — it separates reservoir rock from non-reservoir.
Spontaneous potential (SP)
Measures the natural voltage between the mud in the wellbore and the formation fluid. Deflects from the shale baseline (usually in the negative direction) opposite permeable beds; useful for separating permeable sands from shale when GR is ambiguous.
Caliper (CAL)
Mechanical arms measure the diameter of the borehole. A diameter larger than the bit size means washout (a weak rock); smaller means swelling clays or mudcake (a sign of permeability — a positive indicator).
Resistivity (deep, medium, shallow)
How resistive the rock is to electrical current. Hydrocarbon-bearing zones read high; water-bearing zones read low. Multiple depths of investigation reveal mud filtrate invasion. The most important hydrocarbon indicator on a basic log.
Neutron porosity (NPHI)
Measures the rock's response to neutron radiation, which is dominated by hydrogen content. Hydrogen lives in water, in oil, and in clays — so neutron porosity must be interpreted alongside density.
Bulk density (RHOB)
Measures the rock's bulk density via gamma backscatter. Together with neutron, density gives an unambiguous porosity reading — and the density-neutron crossover pattern is a classic gas signature.
Sonic (DT)
Measures the time for a sound wave to travel a foot through the rock. Used for porosity and for tying the well log to seismic data.

The LAS file

Digital log data is delivered in LAS (Log ASCII Standard) format — a plain-text file with a header section describing the well and the curves, followed by a data section of depth-indexed numerical values. LAS 2.0 is the most widely supported version. LAS 3.0 adds some metadata structure but adoption has been mixed.

~Version Information
 VERS.    2.0  : CWLS Log ASCII Standard
 WRAP.    NO   : One line per depth step
~Well Information
 STRT.F   8000.0           : First reference value
 STOP.F   8500.0           : Last reference value
 STEP.F   0.5              : Step
 NULL.    -999.25          : Missing value
 WELL.    POPE STATE 4H    : Well name
 API .    42501201300300   : API number
~Curve Information
 DEPT.F                    : Depth
 GR  .GAPI                 : Gamma ray
 RES .OHMM                 : Deep resistivity
 NPHI.V/V                  : Neutron porosity
 RHOB.G/C3                 : Bulk density
~Ascii
 8000.0  45.2  12.5  0.18  2.42
 8000.5  47.1  13.8  0.17  2.43
 ...
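
A file shaped like the one above can be read with a few lines of standard-library Python. This is a minimal sketch that handles only unwrapped LAS 2.0 with the sections shown; production work would more likely use an established LAS reader. The `parse_las` helper and the trimmed `SAMPLE` string are illustrative.

```python
from typing import Dict, List, Optional, Tuple

def parse_las(text: str) -> Tuple[Dict[str, str], List[str], List[List[Optional[float]]]]:
    """Parse an unwrapped LAS 2.0 string into (header, curve_names, rows)."""
    header: Dict[str, str] = {}
    curves: List[str] = []
    rows: List[List[float]] = []
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("~"):
            section = line[1:2].upper()  # V, W, C, A, ...
            continue
        if section == "A":
            rows.append([float(v) for v in line.split()])
            continue
        # Header lines look like: MNEM.UNIT  VALUE : DESCRIPTION
        left = line.rpartition(":")[0] if ":" in line else line
        mnem, _, rest = left.partition(".")
        _unit, _, value = rest.partition(" ")
        if section == "C":
            curves.append(mnem.strip())
        elif section in ("V", "W"):
            header[mnem.strip()] = value.strip()
    # Replace the declared null value with None so gaps are explicit.
    null = float(header.get("NULL", "-999.25"))
    cleaned = [[None if v == null else v for v in row] for row in rows]
    return header, curves, cleaned

SAMPLE = """\
~Version Information
 VERS.    2.0  : CWLS Log ASCII Standard
 WRAP.    NO   : One line per depth step
~Well Information
 STRT.F   8000.0           : First reference value
 NULL.    -999.25          : Missing value
 WELL.    POPE STATE 4H    : Well name
~Curve Information
 DEPT.F                    : Depth
 GR  .GAPI                 : Gamma ray
 RHOB.G/C3                 : Bulk density
~Ascii
 8000.0  45.2     2.42
 8000.5  -999.25  2.43
"""

header, curves, rows = parse_las(SAMPLE)
```

Header values are kept as strings because fields such as the well name and API number are not numeric data.
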
Where this shows up

A huge fraction of legacy well-log data — anything pre-1990 and a stubborn tail of more recent vintage — exists only as raster (TIFF, PDF, or paper). Petrel won't read it. Kingdom won't read it. The work of turning those rasters into clean LAS files is exactly what the curve tracer does. This is the bespoke moat described in the business plan.

Remember

  • Three tracks: GR/SP/CAL on the left, resistivity in the middle, neutron/density on the right.
  • Gamma ray separates reservoir from non-reservoir.
  • Resistivity separates hydrocarbon-bearing from water-bearing.
  • Neutron + density together give porosity, and their crossover signals gas.
  • LAS 2.0 is the de facto digital interchange. The raster legacy is where the curve tracer earns its keep.
Chapter 10·Operations

Petrophysics #

Petrophysics is the discipline of turning log measurements into quantitative answers about the reservoir: how much oil and gas is in place, how much can flow, and how producible it actually is. It sits between the raw geophysics and the reservoir engineering.

The three quantities a petrophysicist wants

Porosity (ϕ)
The fraction of the rock that is pore space. Computed from density, neutron, and sometimes sonic logs. Reported as a fraction (0.08 = 8%) or a percentage. Without porosity, nothing fits inside the rock.
Water saturation (Sw)
The fraction of the pore space filled with water (the rest is hydrocarbon). Computed mainly from resistivity using Archie's equation (see below). The lower Sw, the more hydrocarbon in place.
Permeability (k)
How readily fluid flows through the rock, measured in darcies or millidarcies. Logs estimate permeability poorly — it's usually measured directly from cores or inferred from production. In unconventional shales it's measured in nanodarcies.

Archie's equation

The single most cited equation in petrophysics is Archie's equation, which relates rock resistivity to water saturation:

Sw^n = (a · Rw) / (ϕ^m · Rt)

where:

  • Sw = water saturation (the unknown we want)
  • Rw = resistivity of the formation water
  • Rt = true resistivity of the formation, from deep resistivity log
  • ϕ = porosity
  • a, m, n = empirical constants for the specific rock type — usually a≈1, m≈2, n≈2 for clean sandstones.

The equation is named after Gus Archie, who published it in 1942 while at Shell. It is an empirical relation that holds only in clean (shale-free) rock; modified versions exist for shaly sands and for unconventional plays where Archie's assumptions break down.
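
As a worked example, the equation can be solved for Sw in a few lines. This is a sketch under the clean-sandstone defaults quoted above (a=1, m=2, n=2). The function name `archie_sw` and the input values are illustrative assumptions, not measurements from a real well.

```python
def archie_sw(rt, rw, phi, a=1.0, m=2.0, n=2.0):
    """Water saturation from Archie's equation: Sw^n = (a * Rw) / (phi^m * Rt)."""
    if rt <= 0 or rw <= 0 or not 0 < phi <= 1:
        raise ValueError("rt and rw must be positive; phi must be in (0, 1]")
    sw = ((a * rw) / (phi ** m * rt)) ** (1.0 / n)
    # The formula can exceed 1 in wet or tight rock; cap at full saturation.
    return min(sw, 1.0)


# Hypothetical inputs: 18% porosity, 20 ohm-m deep resistivity, 0.05 ohm-m water.
sw = archie_sw(rt=20.0, rw=0.05, phi=0.18)
print(f"Sw = {sw:.3f}, hydrocarbon saturation = {1 - sw:.3f}")
```

With these inputs Sw works out to 5/18, about 0.28, so roughly 72% of the pore space would be read as hydrocarbon.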

Gardner and Wyllie equations

Two other equations come up alongside Archie, tying sonic, density, and porosity measurements together. The Gardner equation relates bulk density to seismic (P-wave) velocity; the Wyllie time-average equation relates sonic travel time to porosity. Both are workhorses for synthetic well log generation, which is why our synthetic data factory uses them.
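
A minimal sketch of both relations follows. It uses the commonly quoted Gardner coefficients (0.23 and 0.25, valid for velocity in ft/s and density in g/cm³) and textbook sandstone and fresh-water travel times for Wyllie. These constants are assumptions for illustration and would be calibrated per formation in practice; the function names are hypothetical.

```python
def gardner_density(velocity_ft_s, alpha=0.23, beta=0.25):
    """Gardner: bulk density (g/cm3) from P-wave velocity (ft/s)."""
    return alpha * velocity_ft_s ** beta


def wyllie_porosity(dt_log, dt_matrix=55.5, dt_fluid=189.0):
    """Wyllie time-average: porosity from sonic travel time (microseconds/ft)."""
    phi = (dt_log - dt_matrix) / (dt_fluid - dt_matrix)
    return max(0.0, min(phi, 1.0))  # clamp to the physical range


def velocity_from_dt(dt_log):
    """Convert travel time (microseconds/ft) to velocity (ft/s)."""
    return 1.0e6 / dt_log


dt = 80.0  # hypothetical sonic reading, microseconds per foot
print(wyllie_porosity(dt), gardner_density(velocity_from_dt(dt)))
```

Run in the other direction (lithology to porosity to travel time to density), the same two relations let a generator produce mutually consistent sonic and density curves.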

Pay zones

A pay zone is an interval of rock that's commercially worth producing. The petrophysicist's job is to find the pay zones by combining the curves: clean (low GR), porous (high porosity from density-neutron), and hydrocarbon-saturated (high resistivity). Each pay zone gets a top, bottom, and net-pay thickness — the “feet of pay” that drive economic estimates.
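
The three-way screen described above can be sketched as a per-sample cutoff test. The cutoff values here (GR at most 60 API, porosity at least 8%, resistivity at least 10 ohm-m) are hypothetical placeholders; real cutoffs are set per formation by the petrophysicist. The sketch also assumes a single precomputed porosity curve.

```python
def net_pay(gr, phi, rt, step_ft, gr_max=60.0, phi_min=0.08, rt_min=10.0):
    """Flag each depth sample as pay or not, and total the net-pay thickness."""
    flags = [
        g <= gr_max and p >= phi_min and r >= rt_min
        for g, p, r in zip(gr, phi, rt)
    ]
    return sum(flags) * step_ft, flags


# Four hypothetical samples at a 0.5 ft step.
gr = [45.2, 47.1, 95.0, 50.3]    # gamma ray, API units
phi = [0.18, 0.17, 0.05, 0.12]   # porosity, fraction
rt = [12.5, 13.8, 4.0, 3.0]      # deep resistivity, ohm-m

thickness_ft, flags = net_pay(gr, phi, rt, step_ft=0.5)
print(thickness_ft, flags)
```

The third sample fails on gamma ray and porosity (shaly, tight) and the fourth fails on resistivity (clean and porous but water-bearing), so only the first two count toward feet of pay.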

Where this shows up

Our synthetic data factory generates raster log images with petrophysically plausible curves — using Archie, Gardner, and Wyllie to drive realistic resistivity, density, and sonic responses from synthetic lithology sequences. That gives us millions of perfectly labeled training pages to pretrain the curve tracer on, with no real-data labeling cost.

Remember

  • Petrophysics extracts porosity, water saturation, and permeability from logs.
  • Archie's equation is the foundational saturation calculation.
  • Gardner relates bulk density to velocity; Wyllie relates sonic travel time to porosity.
  • Pay zones are clean, porous, and hydrocarbon-saturated. Net pay thickness drives economic estimates.
Chapter 11·Operations

Completions and frac #

Drilling makes a hole. Completion makes it produce. In an unconventional well, completion is where most of the cost lives — often 60–70% of total well capital. It's also where the operator's design choices most visibly affect economics.

Plug and perf — the standard unconventional completion

The dominant completion method in US shales is plug and perf. The lateral is divided into stages — typically 30–80 stages on a modern long lateral, each 100–250 feet long. Each stage is fractured independently, in sequence, starting from the toe (the far end of the lateral) and working back to the heel (where it meets the curve).

  1. The production casing is cemented in place across the lateral.
  2. A wireline crew runs a perforating gun down to the toe stage and shoots a cluster of holes through casing, cement, and into the formation rock.
  3. A frac crew pumps water, sand, and chemicals at high pressure through the perforations, cracking open the rock — this is the frac job.
  4. A bridge plug is set above the just-fractured stage to isolate it.
  5. The next stage's perforating gun is run down, perforating above the plug. Frac job, plug, perforate, frac job — and so on, stage by stage, back toward the heel.
  6. When all stages are done, a coiled tubing unit drills out all the plugs and the well is ready to flow.

What the frac job actually does

A single frac stage pumps roughly 10,000–25,000 barrels of fluid (water plus chemicals — friction reducers, biocides, scale inhibitors) and 200,000–500,000 pounds of proppant (sand or ceramic beads). Fluid pressure cracks the rock open; proppant flows into the cracks; when pressure is released, the proppant holds the cracks open, giving fluid a path to flow back to the wellbore. A modern long lateral well consumes 10–25 million pounds of sand total.

The fractures propagate in the direction of the maximum horizontal stress in the rock. Operators typically drill laterals along the minimum horizontal stress direction, so the fractures grow roughly perpendicular to the wellbore. By the end of 2018, about 96% of US crude oil production from tight oil formations came from horizontal wells. The combination of long horizontal laterals and many frac stages is what unlocked the entire shale revolution.

Cluster spacing and intensity

Within each stage, the perforations are clustered. Cluster spacing (the distance between clusters) and proppant intensity (pounds of sand per foot of lateral) are the two knobs operators have tuned hardest over the last decade. Spacing has generally tightened and intensity has generally risen: more clusters and more proppant per foot, producing more initial production but at higher cost.
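
The arithmetic behind these design numbers is simple enough to sketch. The well below is hypothetical, with values chosen from inside the ranges quoted in this chapter; `completion_summary` is an illustrative name.

```python
def completion_summary(lateral_ft, stages, proppant_lb_per_stage):
    """Derive stage length, total proppant, and proppant intensity."""
    total_proppant_lb = stages * proppant_lb_per_stage
    return {
        "stage_length_ft": lateral_ft / stages,
        "total_proppant_lb": total_proppant_lb,
        "intensity_lb_per_ft": total_proppant_lb / lateral_ft,
    }


# Hypothetical design: 10,000 ft lateral, 50 stages, 400,000 lb per stage.
summary = completion_summary(10_000, 50, 400_000)
print(summary)
```

That design works out to 200 ft stages, 20 million pounds of sand, and 2,000 lb/ft, which is the kind of per-foot figure operators benchmark against competitors' completion reports.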

Completion variants

  • Sliding sleeve — instead of plugs, ball-activated sleeves open each stage in sequence. Faster but less flexible. Common in some Canadian plays.
  • Open-hole — the lateral is not cased. Frac goes directly into the open formation. Cheaper but less control.
  • Acid jobs — instead of (or alongside) hydraulic fracturing, acid is pumped to dissolve carbonate rock. Common in conventional carbonate plays.

The completion report

Once a well is completed, the operator files a completion report with the regulator. It documents the stages, fluid volumes, proppant tonnage, perforation depths, and initial test rates. Operators care intensely about competitors' completion reports — they reveal design choices and let them benchmark.

Where this shows up

Completion reports are one of the most valuable document classes in our pipeline. They're filed publicly with state regulators but vary enormously in format. Our extractors pull stage tables, fluid volumes, and proppant totals into structured records that operators can compare across competitors offset-by-offset.

Remember

  • Plug and perf is the dominant completion method in US shale.
  • A modern lateral has 30–80 stages, fractured toe-to-heel.
  • Each stage pumps thousands of barrels of fluid and hundreds of thousands of pounds of proppant.
  • Cluster spacing and proppant intensity are the design knobs. The trend has been toward tighter spacing and higher intensity.
  • Competitors' completion reports are among the most closely read public data in the industry.
Chapter 12·Operations

Production and decline #

Once a well starts producing, the operator's job is to keep it producing as economically as possible. Reservoir pressure depletes over time; flow rates decline; equipment fails; operating cost stays fixed per well. The decline curve is the mental model that organizes all of this.

The shape of production

A new unconventional well typically produces a flush burst of oil and gas in its first 30–90 days — the initial production rate or IP30/IP90 — then declines steeply for the first year or two, then flattens to a slower exponential decline that can last a decade or more. Roughly 50–70% of a typical unconventional well's lifetime production happens in its first three years.

Decline curve analysis

Decline curve analysis (DCA) fits a parametric curve to measured production data and extrapolates future production. The classical Arps equations give three flavors:

  • Exponential — constant percentage decline per unit time. Simplest. Reasonable for mature conventional wells.
  • Hyperbolic — decline rate itself declines. The standard for unconventional plays.
  • Harmonic — a special case of hyperbolic with the b-factor set to 1.

Fitting a decline curve gives the operator a forecast — usually expressed as estimated ultimate recovery (EUR), the total volume the well will produce over its life. EUR per well, divided by lateral length, gives EUR-per-foot, which is one of the standard productivity metrics for comparing operators and basins.
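
The Arps family reduces to one rate equation with a b-factor, plus a closed-form cumulative. The sketch below assumes a nominal decline rate per day and rates in barrels per day. The parameter values, the 30-year horizon, and the lateral length are invented for illustration; a real forecast would fit them to production data and usually cut off at an economic limit.

```python
import math


def arps_rate(qi, di, b, t):
    """Rate at time t. b=0 exponential, 0<b<1 hyperbolic, b=1 harmonic."""
    if b == 0:
        return qi * math.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)


def arps_cumulative(qi, di, b, t):
    """Cumulative volume produced from time 0 to t."""
    if b == 0:
        return qi / di * (1.0 - math.exp(-di * t))
    if b == 1:
        return qi / di * math.log(1.0 + di * t)
    return qi / ((1.0 - b) * di) * (1.0 - (1.0 + b * di * t) ** (1.0 - 1.0 / b))


# Hypothetical well: 1,000 bbl/d initial rate, 0.3%/day initial decline, b = 0.9.
qi, di, b = 1000.0, 0.003, 0.9
horizon_days = 30 * 365
eur_bbl = arps_cumulative(qi, di, b, horizon_days)
eur_per_ft = eur_bbl / 10_000  # assumed 10,000 ft lateral

print(round(arps_rate(qi, di, b, 365)), round(eur_bbl), round(eur_per_ft, 1))
```

Setting b to 0 or 1 in the same functions gives the exponential and harmonic cases from the list above.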

Type curves

A type curve is an averaged decline curve representing a class of wells — e.g., “Permian Wolfcamp B 7,500-foot lateral, 2024 vintage.” Operators build type curves from offset wells and use them as templates for forecasting new wells in similar conditions. A&D buyers use type curves to estimate the value of acreage they don't operate yet.

Allocation and reporting

Production volumes are reported monthly to the regulator. Allocation splits the produced volume across the wells and the leases that share a surface facility (a tank battery serving multiple wells, a saltwater disposal facility, etc.). Bad allocation produces wells with implausible production profiles — a known source of pain in public data.
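
One common allocation approach splits the facility's measured volume in proportion to each well's most recent test rate. The sketch below assumes that method; operators and regulators use several variants, and the well names and rates are invented.

```python
def allocate(facility_volume, test_rates):
    """Split a facility volume across wells in proportion to their test rates."""
    total_rate = sum(test_rates.values())
    if total_rate <= 0:
        raise ValueError("at least one well needs a positive test rate")
    return {
        well: facility_volume * rate / total_rate
        for well, rate in test_rates.items()
    }


# Hypothetical tank battery: 900 bbl measured for the month, two wells.
allocated = allocate(900.0, {"WELL A": 100.0, "WELL B": 200.0})
print(allocated)
```

If a test rate is stale or wrong, the error lands directly in that well's reported production, which is how implausible public profiles arise.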

Operators also report production in three streams:

Oil
Liquid hydrocarbons, measured in barrels (bbl). One barrel = 42 US gallons.
Gas
Methane and other gaseous hydrocarbons, measured in standard cubic feet (scf) or thousand cubic feet (mcf).
Water
Produced water, in barrels. In some basins (Permian especially), water cuts exceed 80%, and water handling is a major cost center.

The BOE

To talk about oil and gas together, operators convert them to a common unit: barrel of oil equivalent (BOE), with gas converted to oil at an energy ratio of 6,000 cubic feet of gas per barrel of oil (6 mcf/bbl). The conversion is energy-equivalent, not value-equivalent. At illustrative prices of $70 per barrel and $3 per mcf, a BOE of gas sells for $18, roughly a quarter of the price of a barrel of oil, and the gap widens further when gas is cheap.
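
The conversion itself is a one-liner, sketched here with invented monthly volumes. `to_boe` is an illustrative name.

```python
MCF_PER_BOE = 6.0  # energy-equivalent ratio quoted above


def to_boe(oil_bbl, gas_mcf):
    """Combine oil and gas volumes into barrels of oil equivalent."""
    return oil_bbl + gas_mcf / MCF_PER_BOE


# Hypothetical month: 1,000 bbl of oil and 6,000 mcf of gas.
print(to_boe(1000.0, 6000.0))
```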

Where this shows up

Monthly production records are public in most major states. We crawl them and feed them to clients alongside the well master. Production + completion design + log data together let an operator answer the question that matters most: which acreage is worth drilling next?

Remember

  • Unconventional wells decline steeply for 1–2 years, then flatten. Most production comes early.
  • Decline curves are usually hyperbolic for unconventional, exponential for mature conventional.
  • EUR per well and EUR per foot of lateral are the headline productivity metrics.
  • BOE converts oil and gas to a common energy unit. 6 mcf = 1 BOE.
  • Allocation errors are a frequent source of dirty public production data.
Chapter 13·Commercial

Reserves and reserve reporting #

A reserve is a volume of hydrocarbons in the ground that is expected to be recovered under current economic conditions. Reserves are the primary asset on an E&P company's balance sheet. The way they're categorized and disclosed is regulated by the SEC for public companies, and drives bank borrowing bases, M&A valuations, and analyst coverage.

The reserve categories

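Reserves are classified as proved, probable, or possible. The SEC requires public companies to disclose proved reserves and, since its reserve-reporting rules were modernized (effective 2010), permits optional disclosure of probable and possible reserves; SPE definitions recognize all three. The proved category is further subdivided by whether the well is drilled and producing.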

Category | Code | Definition | Confidence
Proved Developed Producing | PDP | Existing well, currently producing | Highest
Proved Developed Non-Producing | PDNP | Existing well, not currently producing (shut-in, behind-pipe zones) | High
Proved Undeveloped | PUD | Location not yet drilled; recovery expected with reasonable certainty | Lower
Probable | P2 | Unproved; ≥50% probability of recovery | Lower still
Possible | P3 | Unproved; ≥10% probability of recovery | Lowest

1P, 2P, 3P

The shorthand combinations roll up these categories:

  • 1P = proved (PDP + PDNP + PUD). The number SEC filers are required to disclose.
  • 2P = proved + probable. Used by SPE and most non-US regulators.
  • 3P = proved + probable + possible. The most aggressive view.
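
The roll-up is plain addition, sketched below with invented volumes in thousand BOE. `rollup` is an illustrative name.

```python
def rollup(pdp, pdnp, pud, probable, possible):
    """Roll category volumes up into the 1P / 2P / 3P shorthand."""
    proved = pdp + pdnp + pud
    return {
        "1P": proved,
        "2P": proved + probable,
        "3P": proved + probable + possible,
    }


# Hypothetical company, volumes in thousand BOE.
totals = rollup(pdp=500, pdnp=50, pud=300, probable=400, possible=250)
print(totals)
```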

Reserve reports

A reserve report is the engineering document that quantifies a company's reserves at a point in time. Independent firms (Cawley Gillespie, Netherland Sewell, DeGolyer and MacNaughton, Ryder Scott, and others) audit reserves for SEC filers. The reserve report includes:

  • Well-by-well or location-by-location forecasts.
  • Decline curve assumptions for each well.
  • Operating cost assumptions.
  • Pricing assumptions (SEC pricing is the unweighted average of first-day-of-the-month prices over the prior 12 months).
  • PV-10 — the present value of future net cash flows discounted at 10%, before tax. The single most cited reserve-value number.
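
The PV-10 arithmetic in the last bullet can be sketched as a discounted sum. This version assumes annual pre-tax net cash flows received at the end of each year; reserve engineers often use monthly or mid-period discounting instead. The cash flows are invented.

```python
def pv10(net_cash_flows, rate=0.10):
    """Present value of end-of-year net cash flows at a 10% discount rate."""
    return sum(
        cash_flow / (1.0 + rate) ** year
        for year, cash_flow in enumerate(net_cash_flows, start=1)
    )


# Hypothetical pre-tax net cash flows in $ millions, years 1 through 5.
flows = [40.0, 28.0, 20.0, 15.0, 12.0]
print(round(pv10(flows), 2))
```

The declining cash flows mirror a decline curve: most of the value sits in the early years, which is why PV-10 is so sensitive to the first few years of a forecast.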

Borrowing bases

Much E&P bank debt is structured as a reserve-based loan (RBL) secured by proved reserves, with PDP carrying most of the weight. The lender's engineering team independently values the proved reserves and sets a borrowing base, typically a fraction of PV-10 on the PDP, with smaller fractions on PDNP and PUD. The borrowing base is typically redetermined twice a year. When prices fall, borrowing bases fall, and that drives a lot of distressed-asset deal flow.

Where this shows up

A&D evaluations rely on reserve reports plus the underlying log and completion data. The faster a buyer can recompute reserves on the target acreage using their own assumptions, the more confident their bid. Our pipelines feed the data layer that recomputation runs on.

Remember

  • PDP, PDNP, PUD are the proved categories. Probable and possible are unproved.
  • 1P = proved. 2P = proved + probable. 3P = all three.
  • PV-10 is the headline reserve-value number, discounted at 10% pre-tax.
  • Reserve-based loans drive a lot of corporate behavior. A falling oil price triggers borrowing-base redeterminations and distressed asset sales.
Chapter 15·Commercial

Midstream and pipeline integrity #

Midstream is everything between the wellhead and the refinery — gathering, processing, transportation, storage. It is also where pipeline integrity work lives, and where NVI and firms like it operate.

The flow

A typical onshore production stream goes through:

  1. Wellhead — the surface equipment at the well.
  2. Tank battery — separator, tanks for oil and water, gas line. Usually shared by multiple wells on a pad or lease.
  3. Gathering system — small-diameter pipelines that move product from tank batteries to a central processing or sales point.
  4. Processing plant — removes impurities (water, CO₂, H₂S) from raw gas and separates out the NGLs.
  5. Transmission pipeline — large-diameter long-distance pipe to refineries, LNG terminals, or storage.
  6. Storage — salt caverns, depleted reservoirs, above-ground tanks.

Pipeline integrity

Pipelines are highly regulated. The Pipeline and Hazardous Materials Safety Administration (PHMSA) requires integrity management programs covering inspection, monitoring, and risk assessment. Failure consequences are large — spills, fires, environmental damage, civil and criminal liability — so operators spend heavily on inspection.

NDT methods

Non-destructive testing (NDT) — also called NDE (examination) — covers techniques to inspect pipe and equipment without damaging it. The common ones:

Ultrasonic testing (UT)
High-frequency sound waves measure wall thickness and detect internal flaws. The workhorse of pipe inspection.
Radiography (RT)
X-ray or gamma-ray imaging of welds. The image is interpreted by a certified technician. This is what people mean by “pipe x-ray.”
Magnetic particle (MT)
Magnetic flux plus iron particles reveal surface and near-surface cracks in ferromagnetic materials.
Dye penetrant (PT)
Liquid dye plus developer reveals surface-breaking flaws. Cheap and simple.
Visual inspection (VT)
A trained inspector's eye, often with magnification and lighting. Underestimated; essential.
Inline inspection (ILI) — “pigging”
A robotic tool (a “smart pig”) is run through the pipe with the flow, recording wall-thickness, geometry, and crack data along the entire length.

Why the deliverable matters

An NDT inspection produces a written report that goes into the operator's integrity management file. NVI alone has inspected over 100,000 miles of pipeline over 30 years, with certified technicians and procedures that translate into defensible inspection records. When a regulator audits the operator, or when an incident requires forensic reconstruction, those records are the evidence. The procedural discipline behind them is what an integrity firm sells.

Where this shows up

Lean Informatics applies the same defensibility standard to data delivery. Every datum we hand a client carries source, page, bounding box, extractor version, and model confidence. When an auditor or counterparty challenges a fact, we can defend it the way an NVI inspector can defend a weld report.
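
One way to picture that standard is a record type in which the provenance fields travel with every value. The sketch below is hypothetical: the class names, field names, file name, and validation rules are assumptions for illustration, not the schema our pipelines actually use.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Provenance:
    source: str                                # document identifier or URL
    page: int                                  # 1-based page number
    bbox: Tuple[float, float, float, float]    # (x0, y0, x1, y1) in page pixels
    extractor_version: str
    confidence: float                          # model confidence in [0, 1]

    def __post_init__(self):
        x0, y0, x1, y1 = self.bbox
        if self.page < 1:
            raise ValueError("page numbers start at 1")
        if x1 <= x0 or y1 <= y0:
            raise ValueError("bounding box must have positive area")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


@dataclass(frozen=True)
class Datum:
    name: str
    value: Any
    provenance: Provenance


datum = Datum(
    name="proppant_total_lb",
    value=20_000_000,
    provenance=Provenance(
        source="completion_report_example.pdf",   # hypothetical file
        page=3,
        bbox=(112.0, 540.0, 388.0, 562.0),
        extractor_version="0.0.0-example",
        confidence=0.97,
    ),
)
print(datum)
```

Freezing the records makes a delivered datum immutable, so a challenged value can always be traced back to the exact region of the page it came from.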

Remember

  • Midstream is the connective tissue between wellhead and refinery.
  • Pipeline integrity is heavily regulated and inspection is mandatory.
  • UT, RT, MT, PT, VT are the standard NDT methods.
  • Inline inspection (smart pigs) covers long lengths of pipe at once.
  • The integrity firm's product is a defensible record. That standard applies to data too.
Chapter 16·Data Layer

Maps in oil and gas #

Maps are one of the densest information artifacts in the industry. A single structure map can encode formation top depth, fault geometry, productive fairway boundaries, well locations, lease outlines, and operator interpretation of the geology — all on one sheet of paper. The catch is that an enormous fraction of historically valuable maps has no embedded coordinate system, so modern GIS can't read them without manual georeferencing.

Map types you'll encounter

Structure map
Contour map of the top of a formation. Shows highs and lows of the geological surface. Foundational for trap identification.
Isopach
Contour map of formation thickness. Shows where the productive interval thickens and thins.
Plat
A surveyed map of lease boundaries or units. Drawn against the PLSS grid or against surveyed metes-and-bounds. The primary land artifact.
Scout map
An informal industry-produced map of which operator holds which acreage, with well locations and recent permits annotated. Historically passed around in print form; the basis for competitive intelligence.
Fault map
Map showing the locations and traces of subsurface faults. Critical for completion design and for understanding compartmentalization.
Net pay map
Contour map of net hydrocarbon-bearing thickness. Combines structure, isopach, and petrophysical interpretation. Often the deliverable from a reservoir study.

The georeferencing problem

A digital map needs two things before GIS software can place it: a coordinate reference system (CRS), which defines what real-world coordinates mean, and a transform that maps image pixels into that CRS. A modern GIS export carries both as metadata. A scanned plat from 1962 carries neither: you have the image and the PLSS labels printed on it, but no georeferencing.

The three modes for fixing this:

  • PLSS-based. OCR the township/range/section labels, look up known coordinates for section corners from the BLM's PLSS reference data, solve an affine or projective transform. Works for most US onshore land.
  • Geographic tick. If the map has visible latitude/longitude tick marks with values, OCR the values and solve directly.
  • Feature-based. If neither of the above is present, identify recognizable named features — towns, rivers, highways — geocode them, and solve the transform. The fallback mode.
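
All three modes end in the same step: given control points that pair pixel positions with known coordinates, solve for the transform. Below is a minimal least-squares affine fit in pure Python. It treats longitude and latitude as planar, which is only reasonable over a small map extent. The function names and control points are invented for illustration and are not our pipeline's implementation.

```python
def _det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def _solve3(m, v):
    """Solve a 3x3 linear system by Cramer's rule."""
    det = _det3(m)
    if abs(det) < 1e-12:
        raise ValueError("control points are collinear or too few")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = v[r]
        solution.append(_det3(replaced) / det)
    return solution


def fit_affine(pixels, world):
    """Fit x = a*col + b*row + c and y = d*col + e*row + f by least squares."""
    if len(pixels) < 3 or len(pixels) != len(world):
        raise ValueError("need at least three matched control points")
    rows = [(float(col), float(row), 1.0) for col, row in pixels]
    # Normal equations: (A^T A) coeffs = A^T target, solved once per axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in range(2):
        atb = [sum(r[i] * w[axis] for r, w in zip(rows, world)) for i in range(3)]
        coeffs.append(_solve3(ata, atb))
    return coeffs


def apply_affine(coeffs, col, row):
    (a, b, c), (d, e, f) = coeffs
    return a * col + b * row + c, d * col + e * row + f


# Hypothetical section corners: pixel (col, row) paired with (lon, lat).
pixels = [(0, 0), (1000, 0), (0, 1000), (1000, 1000)]
world = [(-102.0, 32.0), (-101.9, 32.0), (-102.0, 31.9), (-101.9, 31.9)]
coeffs = fit_affine(pixels, world)
print(apply_affine(coeffs, 500, 500))
```

An affine fit handles scale, rotation, and skew. A scan with perspective distortion would need the projective transform mentioned above, which requires at least four control points.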

The output is a GeoTIFF (a raster image with embedded CRS metadata) plus, ideally, extracted vector features — well symbols, contour lines, lease polygons — that come out of dedicated detection and segmentation models.

Where this shows up

Our map georeferencing pipeline implements all three modes, in priority order. PLSS-based mode dominates the volume because most US onshore maps carry section labels. The output drops directly into ArcGIS and QGIS, no manual control-point work required.

Remember

  • Structure maps, isopachs, plats, scout maps, fault maps, net pay maps. Recognize the names.
  • A digital map without a CRS is essentially a photograph. Georeferencing fixes it.
  • PLSS-based georeferencing is the workhorse for US onshore data.
  • GeoTIFF is the output format that drops into modern GIS without further work.
Chapter 17·Data Layer

State regulators and their portals #

Oil and gas regulation in the US is overwhelmingly a state-level activity. Each major producing state runs its own regulator with its own forms, its own submission norms, its own portal. There is no federal one-stop. The federal agencies — BLM for onshore federal land, BOEM for offshore — handle a smaller but important slice. Together, the eleven sources below cover roughly 95% of US onshore unconventional activity plus most offshore.

Priority sources

Agency | Jurisdiction | Plays covered
TX RRC — Railroad Commission of Texas | Texas | Permian (TX side), Eagle Ford, Haynesville (TX side), East Texas, Panhandle
OK OCC — Oklahoma Corporation Commission | Oklahoma | SCOOP, STACK, Anadarko, Arkoma
NM OCD — Oil Conservation Division | New Mexico | Permian (NM side, Delaware Basin), San Juan Basin
ND IC — North Dakota Industrial Commission | North Dakota | Bakken, Three Forks
CO ECMC — Energy and Carbon Management Commission | Colorado | DJ Basin, Piceance, Niobrara
WY OGCC — Wyoming Oil and Gas Conservation Commission | Wyoming | Powder River, Greater Green River
LA SONRIS — Strategic Online Natural Resources Information System | Louisiana | Haynesville (LA), Tuscaloosa Marine Shale, Gulf Coast onshore, Cotton Valley
PA DEP | Pennsylvania | Marcellus, Utica
WV DEP | West Virginia | Marcellus, Utica
BLM — Bureau of Land Management (AFMSS) | Federal onshore | Federal lands in every play above; large fraction of NM, WY, CO production
BOEM — Bureau of Ocean Energy Management | Federal offshore | Gulf of Mexico, Alaska, Pacific

Why each portal is a snowflake

The portals share a similar conceptual scope — permits, well status, production, plugging — but the implementation details differ on every dimension:

  • Forms: Texas's W-1 permit form differs structurally from Oklahoma's Form 1000 or New Mexico's C-101. Same conceptual document, different fields, different naming.
  • Access: some portals offer programmatic search; others require navigating an HTML form; some have rate-limited APIs; some don't have APIs at all.
  • Change detection: a few publish RSS feeds. Most don't. Many require periodic re-fetching of an index page to detect new filings.
  • Submission format: more recent filings are typically searchable PDFs; older filings are scanned images of paper forms; very old filings are microfilm scans of typewriter-era paper.
  • Layout drift: every portal occasionally updates its layout. Adapters built last quarter may break this quarter without notice.
No shortcuts

There is no single commercial dataset that solves this cleanly. Vendor products that claim to are reselling someone's scraped, often-stale extract. The only durable solution is per-state adapters maintained by a team that monitors layout drift and re-extracts on a schedule.

Where this shows up

The state adapter framework is the single largest reusable asset in Lean Informatics' technical estate. One framework. Eleven adapters. Each one a Python module implementing list_changes, fetch, and doc_types. Nightly regression tests against the live portals catch layout drift within 24 hours. This is the crawl flywheel.
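
To make the shape of an adapter concrete, here is a hypothetical sketch of the three-method interface. Only the method names `list_changes`, `fetch`, and `doc_types` come from the description above. The signatures, the `Filing` record, the toy in-memory adapter, and the `crawl` helper are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Protocol


@dataclass(frozen=True)
class Filing:
    filing_id: str
    doc_type: str
    filed_on: date


class StateAdapter(Protocol):
    def doc_types(self) -> List[str]: ...
    def list_changes(self, since: date) -> Iterable[Filing]: ...
    def fetch(self, filing: Filing) -> bytes: ...


class InMemoryAdapter:
    """Toy adapter backed by a dict; a real one would talk to a state portal."""

    def __init__(self, documents: Dict[Filing, bytes]):
        self._documents = documents

    def doc_types(self) -> List[str]:
        return sorted({f.doc_type for f in self._documents})

    def list_changes(self, since: date) -> Iterable[Filing]:
        return [f for f in self._documents if f.filed_on >= since]

    def fetch(self, filing: Filing) -> bytes:
        return self._documents[filing]


def crawl(adapter: StateAdapter, since: date) -> Dict[str, bytes]:
    """Fetch every filing the adapter reports as new since a given date."""
    return {f.filing_id: adapter.fetch(f) for f in adapter.list_changes(since)}


old = Filing("F-001", "permit", date(2026, 1, 5))
new = Filing("F-002", "completion", date(2026, 4, 20))
adapter = InMemoryAdapter({old: b"%PDF old", new: b"%PDF new"})
fetched = crawl(adapter, since=date(2026, 4, 1))
print(adapter.doc_types(), sorted(fetched))
```

Because the crawl loop depends only on the interface, the same code can drive every state adapter, and a regression test can swap in a stub like the one above.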

Remember

  • US oil and gas regulation is state-level. Eleven sources cover most activity.
  • Texas's RRC, Oklahoma's OCC, New Mexico's OCD, North Dakota's IC are the four heaviest.
  • Federal lands report to BLM (onshore) or BOEM (offshore).
  • Every portal is a snowflake. Adapter maintenance is permanent work, not a one-time build.
Chapter 18·Data Layer

Document types #

A handful of document classes account for most of the value in upstream public data. If you can extract these cleanly, you can answer most of an operator's or buyer's questions about a well or a piece of acreage. The class names differ across states; the conceptual contents do not.

Class | Texas form | Oklahoma form | New Mexico form | What it tells you
Permit / APD | W-1 | 1000 | C-101 | Operator's intent to drill: location, target formation, casing program
Completion report | W-2 / G-1 | 1002A | C-105 | Stages, fluid, proppant, perforation intervals, initial test rates
Production report | PR (formerly P-1 / P-2) | 1004A | C-115 | Monthly oil, gas, water volumes
Plugging record | W-3 | 1003A | C-103 (final) | How and when the well was plugged; mechanical history
Well log / LAS | (via API) | (via portal) | (via portal) | Continuous depth-by-depth measurements; the geophysical signal
Scout ticket | n/a | n/a | n/a | Industry-shared informal report of a competitor's well status; not a regulatory filing
Mud log | n/a | n/a | n/a | Operator's daily lithology log from drilling cuttings; internal, sometimes shared

The W-1 in detail (since Texas comes up a lot)

The W-1 — Application for Permit to Drill, Deepen, Plug Back, or Re-enter — is the canonical Texas drilling permit. A W-1 carries:

  • Operator name and TX RRC operator number.
  • Lease name and well number.
  • Surface location (lat/long + Texas survey description).
  • Bottom-hole location for directional/horizontal wells.
  • Target formation and proposed total depth.
  • Casing program: surface, intermediate, production casing depths and weights.
  • API number, once assigned.

Once approved, the W-1 becomes the basis for the assigned API number and the well's regulatory identity. Variations on the form (W-1A through W-1G) handle specific cases: amendments, re-entry, deepening.
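
The field list above maps naturally onto a structured record. The sketch below is a hypothetical schema: the class and field names are illustrative choices, not the RRC's field labels or our extractor's actual output, and the example values are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CasingString:
    kind: str                 # "surface", "intermediate", or "production"
    depth_ft: float
    weight_lb_per_ft: float


@dataclass
class W1Permit:
    operator_name: str
    operator_number: str
    lease_name: str
    well_number: str
    surface_lat: float
    surface_lon: float
    survey_description: str
    target_formation: str
    proposed_total_depth_ft: float
    casing: List[CasingString] = field(default_factory=list)
    bottom_hole_lat: Optional[float] = None   # directional/horizontal wells only
    bottom_hole_lon: Optional[float] = None
    api_number: Optional[str] = None          # filled in once assigned

    def has_bottom_hole_location(self) -> bool:
        return self.bottom_hole_lat is not None and self.bottom_hole_lon is not None


permit = W1Permit(
    operator_name="EXAMPLE OPERATING LLC",    # hypothetical operator
    operator_number="000000",
    lease_name="POPE STATE",
    well_number="4H",
    surface_lat=33.0,
    surface_lon=-102.8,
    survey_description="(survey text as printed on the form)",
    target_formation="(target formation)",
    proposed_total_depth_ft=8500.0,
    casing=[CasingString("surface", 1800.0, 36.0)],
)
print(permit.has_bottom_hole_location(), permit.api_number)
```

Keeping the API number and bottom-hole fields optional reflects the list above: they are absent on some filings and only populated later or for certain well types.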

The long tail

Beyond the headline doc types, the long tail of upstream documents includes: well status reports, plugging affidavits, casing diagrams, directional surveys, hydraulic fracturing chemical disclosures (often via FracFocus), incident reports, NOIs (Notices of Intent), pooling orders, division orders, transfer-of-operatorship filings, and seismic survey notifications. Each one matters in some workflow.

Where this shows up

Our document router classifies each fetched page into one of roughly 40 doc types. The first six types — permit, completion, production, plugging, log, and scout — cover most engagement scope. The long tail is added per client as engagements expand. Each new doc type adds an extractor and a schema, and trains the router slightly better.

Remember

  • Permit, completion, production, plugging, log. Five doc classes carry most of the load.
  • Form names differ by state but the conceptual content maps cleanly.
  • Scout tickets and mud logs are industry-internal, not regulatory — different sources, different reliability.
  • The W-1 in Texas, the Form 1000 in Oklahoma, the C-101 in New Mexico are the same conceptual document.
Chapter 19·Data Layer

Operator software #

Operators run on a stack of specialized desktop and server software with decades of inertia behind it. Most operators have spent millions licensing, integrating, and training on this stack. Asking them to leave it is a non-starter. Asking them to feed it better data is everything.

The core stack

Petrel (Schlumberger / SLB)
The dominant geoscience interpretation platform. Used for log interpretation, structural modeling, reservoir simulation, well planning. Most large operators standardize on Petrel; many mid-sized ones do too. Petrel's Ocean SDK lets you build plugins (in C#) that can read and write project data — the integration surface for our outputs.
Kingdom (S&P Global / IHS Markit)
An alternative to Petrel for seismic and log interpretation. Strong among smaller operators and consulting shops. Different SDK, same general integration pattern.
GeoGraphix (LMKR; formerly Landmark / Halliburton)
Another mid-tier interpretation platform. Strong in some independent operator pockets, especially for well planning and mapping workflows.
ArcGIS Pro (Esri)
The dominant GIS platform across upstream. Lease maps, well locations, infrastructure overlays. ArcGIS Pro has a Python toolbox API that's the standard integration target for georeferenced data.
Spotfire (TIBCO)
The de facto upstream analytics platform. Production data analysis, decline curve fitting, type curve work, basic A&D dashboards. Spotfire data functions let you call out to external services (or pull from databases) in workflows.
Power BI / Tableau
Business analytics across the organization. Petroleum-specific extensions are common.
ARIES (Landmark / Halliburton) / PHDwin (TRC Consultants)
Specialized reserves and economics platforms. Where decline curve analysis and reserve reports get built.
Excel
Despite the above, an enormous fraction of operator analysis still happens in Excel. Analysts paste data in, run macros, model offline. Useful to remember: a workflow that breaks “the spreadsheet” will not be adopted.

The well master

Most operators maintain a central database — the well master — that is the single source of truth for which wells the operator considers its own, plus metadata (operator, API number, surface location, status). It might be Postgres, SQL Server, a vendor product (Enverus, IHS, OneSpot), or a homegrown system. It is the integration point that matters most. Every well-indexed deliverable should resolve through the well master.
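
In practice, resolving through the well master usually starts with normalizing the API number, because the same well appears as 10, 12, or 14 digits and with or without dashes depending on the source. A minimal sketch follows, assuming the standard layout (2-digit state, 3-digit county, 5-digit unique well code, then optional 2-digit sidetrack and 2-digit event codes). The function name and the choice of the 10-digit form as the join key are illustrative.

```python
import re


def parse_api(raw):
    """Normalize an API number string and split it into its components."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) not in (10, 12, 14):
        raise ValueError(f"expected 10, 12, or 14 digits, got {len(digits)}")
    digits = digits.ljust(14, "0")  # treat missing suffixes as 00
    return {
        "state": digits[0:2],
        "county": digits[2:5],
        "unique": digits[5:10],
        "sidetrack": digits[10:12],
        "event": digits[12:14],
        "api10": digits[:10],       # wellbore-independent join key
        "api14": digits,
    }


parsed = parse_api("42-501-20130-03-00")
print(parsed["api10"], parsed["sidetrack"])
```

Joining on the 10-digit form groups all sidetracks of a well together; joining on the 14-digit form keeps each wellbore distinct. Which one is right depends on the deliverable.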

Where this shows up

The Workflow Embedding chapter of the business plan is largely about these integration surfaces. Petrel plugin, Spotfire data functions, ArcGIS toolbox, well master sync. Every plugin shipped is reusable across operators on the same tool stack — that's the integration flywheel.

Remember

  • Petrel dominates geoscience interpretation. Most operators have it.
  • ArcGIS Pro dominates GIS. Spotfire dominates production analytics.
  • The well master is the operator's source of truth for “our wells.”
  • Excel still runs much of the actual analytical work, despite everything above. Plan for it.
Chapter 20·Reference

Glossary #

Quick alphabetical reference. Cross-references point to the chapter where each term is treated in depth.

A&D
Acquisition and divestiture. Buying and selling oil and gas properties. 14.
Abstract
Chain-of-title summary for a piece of acreage. 07.
APD
Application for Permit to Drill. Filed before drilling. 18.
API number
The unique identifier assigned to every US well, in the standard format defined by the American Petroleum Institute. 06.
Archie's equation
The foundational saturation equation relating resistivity to water content. 10.
Bakken
The oil-producing shale of the Williston Basin, North Dakota. 03.
Basin
A regional geological feature where source, reservoir, trap, and seal coincide. 02.
BHL
Bottom-hole location — where the wellbore ends in 3D space, as opposed to the surface location.
BLM
Bureau of Land Management. Federal onshore minerals regulator. 17.
BOE
Barrel of oil equivalent. 6 mcf of gas = 1 BOE on energy basis. 12.
BOEM
Bureau of Ocean Energy Management. Federal offshore minerals regulator.
Borrowing base
The reserves-collateralized credit limit set by a bank under an RBL. 13.
Casing
Steel pipe run and cemented inside a wellbore to stabilize and isolate it. 08.
CIM
Confidential information memorandum. The seller's full pitch document in an A&D process.
Completion
The stage that prepares a drilled well to flow. 11.
CRS
Coordinate reference system. The math that maps an image to real-world coordinates. 16.
Curative
Fixing defects in title so a lease can be cleared. 07.
DCA
Decline curve analysis. 12.
Division order
The instrument that defines who gets paid what fraction of a well's revenue. 07.
E&P
Exploration and production. Synonym for “upstream operator.”
EUR
Estimated ultimate recovery. Total lifetime production for a well. 12.
Formation top
The depth at which a named formation begins. 02.
Frac (fracking)
Hydraulic fracturing of the rock to create permeability. 11.
Gamma ray (GR)
Natural radioactivity log. Separates shale from clean rock. 09.
Geosteering
Real-time trajectory adjustment of a horizontal well using LWD data. 08.
GeoTIFF
Raster image format with embedded CRS metadata. 16.
HBP
Held by production. Lease status in which production keeps the lease in force past its primary term, for as long as the well produces. 07.
Horizontal well
A well with a lateral section that runs nearly parallel to the bedding plane of the target formation. 08.
IP (IP30, IP90)
Initial production rate, often averaged over the first 30 or 90 days. 12.
Isopach
Contour map of formation thickness. 16.
Landman
The land services professional who negotiates leases and researches title. 07.
LAS
Log ASCII Standard. The standard file format for digital well log data. 09.
Lateral
The horizontal portion of a horizontal well, inside the target. 08.
Lease
Contract granting an operator the right to extract minerals from a piece of acreage. 07.
LWD / MWD
Logging While Drilling / Measurement While Drilling. Real-time log data from tools near the bit. 08.
Marcellus
The dry-gas shale of Pennsylvania, West Virginia, Ohio, and New York. 03.
Mineral estate
The legal right to extract subsurface minerals. Separable from surface ownership. 07.
NDT / NDE
Non-destructive testing / examination. Inspecting equipment without damaging it. 15.
Net pay
The thickness of rock that is commercially productive in a given well or interval. 10.
NGL
Natural gas liquids — ethane, propane, butanes, pentanes. Separated from raw gas in processing.
NOI
Notice of intent. A class of regulatory pre-filing.
OCC
Oklahoma Corporation Commission. The state's oil and gas regulator. 17.
OCD
Oil Conservation Division. New Mexico's oil and gas regulator. 17.
Offset well
A well near a subject well, used for analog data when planning a new one.
Operator
The E&P company whose name is on the lease and the regulatory filings.
P&A
Plug and abandonment. Closing out a depleted well safely. 04.
Pay zone
An interval of rock that is commercially productive. 10.
PDP / PDNP / PUD
Reserve categories. 13.
Permian Basin
West Texas / southeast New Mexico. The largest US producing basin. 03.
PLSS
Public Land Survey System. Federal grid for land descriptions. 05.
Plug and perf
The dominant unconventional completion method. 11.
Pooling
Combining multiple leases into a drilling unit. 07.
Porosity (ϕ)
Fraction of rock that is pore space. 10.
Proppant
Sand or ceramic beads pumped into hydraulic fractures to keep them open. 11.
PV-10
Present value of reserve cash flows discounted at 10% pre-tax. 13.
RBL
Reserve-based loan. Bank financing secured by proved reserves. 13.
Reservoir rock
Porous, permeable rock that can hold and flow hydrocarbons. 02.
Resistivity
How strongly the rock opposes the flow of electric current. Hydrocarbon indicator. 09.
ROW
Right of way. Easement for pipelines, roads, or infrastructure across surface land. 07.
RRC
Railroad Commission of Texas. State oil and gas regulator. 17.
SAM2
Segment Anything Model v2. The image segmentation foundation model our curve tracer extends.
SCOOP / STACK
Major shale plays in central Oklahoma. 03.
Scout ticket
Informal industry-shared report on competitor well status. 18.
Seal
An impermeable rock layer that traps hydrocarbons in the reservoir. 02.
Section
One square mile (640 acres) under the PLSS grid. 05.
SI
Shut-in. A well that's capable of production but temporarily closed.
SONRIS
Louisiana's regulatory portal. 17.
Source rock
The organic-rich sediment in which hydrocarbons formed. 02.
SP
Spontaneous potential. A log measurement that helps identify permeable beds. 09.
Spud
The moment drilling begins on a well.
Stage
A discrete section of lateral that is fractured as a unit. 11.
Surface estate
The legal right to use the land surface, separable from minerals. 07.
Title opinion
An attorney's reading of who owns what fraction of the minerals. 07.
Township
6×6 mile box under the PLSS grid. 05.
Type curve
Averaged decline curve for a class of wells. 12.
Wellhead
The surface equipment at the top of the wellbore.
Well master
The operator's central database of all wells it considers its own. 19.
Wolfcamp
Major productive formation in the Permian Basin.
Workover
Maintenance operation on an existing producing well.
W-1
Texas's drilling permit form. 18.
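
A few glossary entries carry arithmetic that is easy to get wrong in a pipeline: the BOE conversion, the PV-10 discount rate, and the PLSS acreage figures. The sketch below is a minimal illustration of those numbers, not a reserves-grade calculation. The function names and the year-end cash flow timing are assumptions made for the example; real reserve reports often discount monthly or mid-year.

```python
from typing import Sequence

MCF_PER_BOE = 6.0            # glossary: 6 mcf of gas = 1 BOE on an energy basis
PV10_DISCOUNT_RATE = 0.10    # glossary: PV-10 discounts at 10%, pre-tax
ACRES_PER_SECTION = 640      # glossary: a section is one square mile
SECTIONS_PER_TOWNSHIP = 6 * 6  # glossary: a township is a 6 x 6 mile box


def to_boe(oil_bbl: float, gas_mcf: float) -> float:
    """Combine oil (bbl) and gas (mcf) volumes into barrels of oil equivalent."""
    return oil_bbl + gas_mcf / MCF_PER_BOE


def pv10(annual_cash_flows: Sequence[float]) -> float:
    """Discount pre-tax cash flows at 10% per year.

    Assumption for this sketch: the first cash flow arrives at the end
    of year 1, the second at the end of year 2, and so on.
    """
    return sum(
        cash_flow / (1.0 + PV10_DISCOUNT_RATE) ** year
        for year, cash_flow in enumerate(annual_cash_flows, start=1)
    )


def township_acres() -> int:
    """Nominal acreage of a full township (36 sections of 640 acres)."""
    return SECTIONS_PER_TOWNSHIP * ACRES_PER_SECTION


if __name__ == "__main__":
    print(to_boe(oil_bbl=100.0, gas_mcf=600.0))
    print(pv10([110.0, 121.0]))
    print(township_acres())
```

Keeping the conversion factors as named constants makes it obvious which convention a dataset uses. Township and section acreages are nominal; surveyed sections in the field can differ from 640 acres.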