Upstream Oil & Gas, From Zero
A working primer on petroleum exploration and operations, written for the AI-first engineer building Lean Informatics. Every concept is paired with the vocabulary you'll hear in the field and a brief note on where it shows up in our pipelines.
Twenty chapters. Roughly four hours end-to-end. Use the sidebar to jump around — each section is self-contained.
How to use this manual
This manual exists for a specific person: the engineer or operator at Lean Informatics who is technically fluent in machine learning and data infrastructure but has never worked in the petroleum industry. By the end you should be able to sit across the table from a landman, a petrophysicist, or an integrity inspector and follow the conversation — and know what would be a useful question to ask back.
It is not a substitute for years of fieldwork. It is a fast on-ramp. The goal is fluency in the vocabulary and a working mental model of the workflows, so that when a client says “we need offset logs and completions for our SCOOP bolt-on,” you know roughly what they want, where the data comes from, and why the dataset is harder to assemble than it sounds.
The oilfield speaks plainly. Where the industry uses jargon, it's almost always because the jargon is more precise than English. PUD (proved undeveloped) means something specific to an auditor; “undeveloped reserve” in casual conversation does not. The vocabulary in this manual is the vocabulary used in the field, not the academic one.
How each chapter is built
- The concept in a paragraph or two.
- The vocabulary as a definition list — these are the words you'll hear.
- Why it matters here — a brief connection back to what Lean Informatics is building.
- Things to remember — three to five anchor facts to internalize.
The chapters build on each other but each one stands alone. If you find yourself lost, jump back to 02 (geology) and 04 (well lifecycle); almost everything else hangs off those two.
The upstream value chain
The petroleum industry is usually described in three segments: upstream (finding and producing hydrocarbons), midstream (moving them from wellhead to refinery — pipelines, gathering, storage, processing), and downstream (refining and selling the finished products). Lean Informatics serves upstream, with NVI-style integrity work reaching into midstream.
Inside upstream, the work breaks roughly into:
Who does what
The industry is built on specialization. An operator (the E&P company whose name is on the lease) rarely does most of the work itself. It contracts out drilling to a drilling contractor, completions to a service company like Halliburton or SLB, integrity inspection to firms like NVI, and land work to firms like BETA Land Services. The operator's full-time staff is typically geoscientists, engineers, landmen, and an operations team. Everything else is outsourced.
Almost every Lean Informatics engagement involves working with the operator's internal team and at least one of these contractor types as data sources or consumers. A scout ticket comes from the operator. A mud log comes from the drilling crew. An inspection report comes from NVI. Knowing who produced a document tells you a lot about how trustworthy and structured it will be.
Remember
- Upstream = finding and producing. Midstream = moving. Downstream = refining and selling.
- The operator's name is on the lease. Almost everyone else on a well is a contractor.
- “E&P” (exploration and production) and “upstream operator” mean the same thing.
- Different document types come from different parties — that affects format, accuracy, and where they end up filed.
Petroleum geology basics
Crude oil and natural gas form from organic matter — algae, plankton, plant debris — buried in sediment, cooked by heat and pressure over millions of years. The cooking process is called thermal maturation. The temperature window where oil forms (roughly 60–120°C) is called the oil window; hotter than that and the molecules crack further into gas.
For commercial production you need four geological conditions to coincide. Without any one of them, no oilfield.
- Source rock
- The original organic-rich sediment (usually a marine shale) where the hydrocarbons formed.
- Reservoir rock
- A porous, permeable rock the hydrocarbons can migrate into and accumulate in — typically sandstone, limestone, or fractured shale.
- Trap
- A geometric arrangement that stops the hydrocarbons from migrating further — an anticline, fault block, salt dome, or stratigraphic pinch-out.
- Seal
- An impermeable layer (often a shale or evaporite) above the trap that keeps the hydrocarbons from escaping upward.
Conventional vs. unconventional
Conventional production targets reservoirs that hold and flow hydrocarbons on their own — sandstones and carbonates with measurable permeability. Drill into them and they produce; no stimulation required.
Unconventional production — the entire shale revolution — targets the source rock itself. Shales have huge volumes of hydrocarbons trapped in nano-scale pores but practically zero natural permeability. The rock is so tight that unconventional rocks often have permeability less than the cement used to seal the well casing. Hydraulic fracturing creates artificial permeability by cracking the rock open and propping the cracks with sand. Without fracturing, an unconventional well produces almost nothing.
Formations and tops
Rocks are stratified into layers, and the layers are named — the Wolfcamp, the Bakken, the Marcellus, the Eagle Ford. The depth at which a given formation begins is called the formation top. A list of formation tops for a given well, written down at known depths, is one of the most reusable pieces of geological data — operators correlate well logs by snapping them to a common set of tops.
Formation nomenclature is inconsistent. The same rock unit might be called one name in Texas and a different name in New Mexico. Operators within a basin will standardize on their own preferred names, often with internal sub-divisions (“Wolfcamp A”, “Wolfcamp B”, etc.) that are not strictly geological but reflect drilling practice.
Entity resolution on formations is a real problem. The same physical rock shows up under different names across operators, eras, and basins. Our formation synonym graph is one of the assets we maintain per client.
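As a rough illustration of the lookup side of a synonym graph, the sketch below resolves variant spellings to one canonical formation name. The alias table, the abbreviations in it, and the function `canonical_formation` are hypothetical; a real graph would be built per client and per basin.

```python
import re
from typing import Optional

# Hypothetical alias table: variant spelling -> canonical formation name.
# Every entry here is illustrative, not taken from a real client graph.
ALIASES = {
    "wolfcamp a": "Wolfcamp A",
    "wfmp a": "Wolfcamp A",
    "wc-a": "Wolfcamp A",
    "bone spring": "Bone Spring",
    "bone springs": "Bone Spring",
}


def _key(name: str) -> str:
    """Lowercase, trim, and collapse whitespace so lookups tolerate sloppy input."""
    return re.sub(r"\s+", " ", name.strip().lower())


def canonical_formation(name: str) -> Optional[str]:
    """Return the canonical name, or None when the alias is unknown."""
    return ALIASES.get(_key(name))


print(canonical_formation("  WFMP  A "))    # Wolfcamp A
print(canonical_formation("Bone Springs"))  # Bone Spring
print(canonical_formation("Spraberry"))     # None
```

Returning `None` for unknown names, rather than guessing, keeps unresolved formations visible so they can be reviewed and added to the table.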
Remember
- Four conditions for a field: source, reservoir, trap, seal.
- Conventional = the rock flows. Unconventional = the rock has to be fractured before it flows.
- Formation tops are the geological skeleton of a well — everything correlates against them.
- Formation names are not standardized. They are basin-, operator-, and era-specific.
The major US plays
A basin is a regional geologic feature — a structurally subsided area that filled with sediment over time and where source, reservoir, trap, and seal coincide. A play is a specific commercial target inside a basin: usually one or a few stacked formations that operators currently develop with a similar drilling and completion design.
These are the plays that come up most often in upstream conversation. You don't need to remember the geology of each. You do need to recognize the names and have a rough sense of where they are and what they produce.
| Play | Where | Mostly produces | Notes |
|---|---|---|---|
| Permian Basin | West Texas, southeast New Mexico | Oil + associated gas | The largest US producing basin. Stacked Wolfcamp and Bone Spring targets. Sub-basins: Midland and Delaware. |
| Eagle Ford | South Texas | Oil, condensate, gas (varies by depth) | One of the first shale plays. Maturity zones run from oil-rich shallow to gas-rich deep. |
| SCOOP / STACK | Central Oklahoma | Oil + wet gas | SCOOP: South Central Oklahoma Oil Province. STACK: Sooner Trend (oil field), Anadarko (basin), Canadian and Kingfisher (counties). Stacked plays: Woodford, Meramec, Osage, Springer. |
| Bakken / Three Forks | North Dakota, eastern Montana | Oil | Williston Basin. The play that put unconventional oil on the map. |
| Marcellus | Pennsylvania, West Virginia, Ohio, NY | Dry gas | The largest gas play in the US. Heavily regulated by PA and WV DEP. |
| Haynesville | NW Louisiana, NE Texas | Dry gas | Deep, high-pressure, high-temperature. A “tier-1” gas play that came back to life with LNG demand. |
| Niobrara / DJ Basin | Colorado, Wyoming | Oil + gas | Denver-Julesburg basin. Wattenberg field. Regulated by CO ECMC. |
| Anadarko Basin | Western Oklahoma, Texas Panhandle | Gas + oil | Hosts SCOOP/STACK. Long conventional history — the Hunton Lime trend is here. |
| Powder River Basin | Wyoming, Montana | Oil | Niobrara, Frontier, Mowry, Turner targets. Less mature than Permian and Bakken. |
| Gulf of Mexico | Federal offshore | Oil + gas | Leasing regulated by BOEM; operations and safety by BSEE. Conventional, deepwater. Long lead times. Mostly major-operator territory. |
| Cotton Valley | NW Louisiana, East Texas | Gas | Conventional and tight-gas sand. Heavy land services work historically. |
| Tuscaloosa Marine Shale | Central Louisiana / Mississippi | Oil | Mixed results. A play that has come and gone with oil price. |
Sub-basin geography matters
Saying “the Permian” is like saying “the West Coast” — too vague to be useful at the operator level. The Permian splits into the Midland Basin (east) and the Delaware Basin (west), with different formation characteristics and economics. When you hear “a Permian operator,” the next question to ask is which sub-basin and what formations.
Our state adapter framework is keyed by state, but our model fine-tuning is keyed by play. A vintage Schlumberger gamma-ray log run in the Bakken looks different from a modern color log run in the Permian, and the curve tracer needs to know which one it's looking at.
Remember
- Basin = the geological container. Play = the commercial target inside.
- Permian, SCOOP/STACK, Bakken, Marcellus, Haynesville, Eagle Ford. Recognize all six on sight.
- “Permian” means Midland or Delaware. Ask which.
- Plays produce different fluid mixes — oil, wet gas (with NGLs), or dry gas — and that drives the economics.
The well lifecycle
A well goes through a predictable sequence of stages from idea to abandonment. Each stage produces a characteristic set of documents and data — knowing the sequence lets you predict what data should exist for any given well.
The eight stages
- 1. Prospect
- The geoscience team identifies a target — a piece of acreage where source, reservoir, trap, and seal are believed to coincide. Output: a prospect map, an internal economic estimate, a recommendation to lease.
- 2. Lease
- The land team acquires the legal right to drill. This is the BETA Land world. Output: leases, title abstracts, curative documents, ROW agreements. (See chapter 07.)
- 3. Permit
- The operator files an Application for Permit to Drill (APD) with the relevant state regulator. The APD includes location, target formation, casing program, and surface plan. Output: a permit and an assigned API number. (See chapters 17 and 18.)
- 4. Drill
- The hole is drilled to the target depth, often horizontally. The drilling crew produces a daily drilling report, a mud log (lithology from cuttings), and LWD/MWD data (logging while drilling and measurement while drilling). Casing is run and cemented as the hole deepens. (See chapter 08.)
- 5. Log
- Wireline tools are run down the hole to measure rock properties — the logging suite. Output: digital LAS files and paper-format prints. (See chapter 09.)
- 6. Complete
- The well is prepared to flow. For unconventionals: cement the casing in place, perforate at the target intervals, hydraulically fracture each stage, then drill out the plugs. Output: completion report. (See chapter 11.)
- 7. Produce
- The well makes oil and gas. Volumes are reported monthly to the regulator. Decline curve analysis predicts future production. Output: production records, monthly state filings. (See chapter 12.)
- 8. P&A
- Plug and abandonment. When the well is depleted, it is cemented closed and surface facilities are removed. Output: a plugging report.
The first-production milestone
Within those eight stages, the most important moment economically is first production (sometimes “first oil” or “FOO”). Everything before it is cost; everything after it is revenue. The spud date (when the bit first turns into the ground) and the first-production date together define the cycle time the operator is trying to minimize. Modern unconventional operators target a spud-to-first-production cycle of 60–120 days.
The structured well master we deliver to clients is keyed by these stages. Every well has a permit document, a drilling report, a completion report, and ongoing production records — each one a distinct doc type for our router. Document type + stage tells the system what schema to apply.
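One way to picture the stage-to-document relationship is a simple lookup that reports which stages have no document on file. This is a sketch only: the doc-type labels and the `missing_stages` helper are hypothetical names, not the router's actual schema.

```python
# Hypothetical mapping from lifecycle stage to the document class it should leave behind.
STAGE_DOCS = {
    "Permit": "permit",
    "Drill": "drilling_report",
    "Log": "well_log",
    "Complete": "completion_report",
    "Produce": "production_record",
    "P&A": "plugging_report",
}


def missing_stages(doc_types_on_file):
    """List stages whose expected document class is absent, in lifecycle order."""
    return [stage for stage, doc in STAGE_DOCS.items() if doc not in doc_types_on_file]


on_file = {"permit", "drilling_report", "well_log", "production_record"}
print(missing_stages(on_file))  # ['Complete', 'P&A']
```

A gap is a prompt for investigation rather than proof of a problem: a well that is still producing legitimately has no plugging report, while a producing well with no completion report points to a missing filing.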
Remember
- Prospect → Lease → Permit → Drill → Log → Complete → Produce → P&A. Eight stages, predictable order.
- The API number is assigned at the permit stage. It is the well's identity from then on.
- Spud-to-first-production is the operator's headline cycle-time metric.
- Every stage produces specific documents. If a doc class is missing for a well, you can name what stage failed.
The Public Land Survey System
The PLSS is the federal grid system that describes land across most of the United States west of the original thirteen colonies. Established by the Land Ordinance of 1785, it covers more than 1.5 billion acres across 30 states and remains the legal basis for mineral leases, oil and gas permits, and federal land management. If you work in upstream, you live in PLSS coordinates.
The hierarchy
The system divides land into a nested grid anchored on a principal meridian (a north-south reference line) and a baseline (east-west). Their intersection is the local point of origin for a survey region.
- Township
- A 6 mile × 6 mile square (36 square miles, ~23,000 acres). The primary unit of the grid. Identified by a township number north or south of the baseline and a range number east or west of the principal meridian.
- Section
- One square mile, 640 acres. Each township contains 36 sections, numbered 1–36 starting in the northeast corner and snaking back and forth across the township.
- Quarter section
- 160 acres. Identified by compass direction — NE¼, NW¼, SE¼, SW¼.
- Quarter-quarter
- 40 acres. The typical smallest meaningful unit for leasing. Written compounded: “NW¼ NE¼” is the northwest quarter of the northeast quarter — 40 acres.
Section numbering inside a township
Sections are numbered like an ox-plow boustrophedon. Section 1 is northeast; section 6 is northwest; section 7 is one row south of section 6; section 36 is southeast.
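The snaking pattern is easy to get wrong by hand, so here is a minimal sketch that maps a section number to its position in the township grid. The function name `section_to_grid` and the (row, col) convention are choices made for this example.

```python
def section_to_grid(section):
    """Map a section number (1-36) to (row, col).

    Row 0 is the northernmost row; col 0 is the westernmost column.
    """
    if not 1 <= section <= 36:
        raise ValueError("section must be between 1 and 36")
    row, offset = divmod(section - 1, 6)
    # Even rows (0, 2, 4) are numbered east to west; odd rows west to east.
    col = 5 - offset if row % 2 == 0 else offset
    return row, col


# Rebuild the whole township as a 6 x 6 grid, north at the top, west at the left.
grid = [[0] * 6 for _ in range(6)]
for s in range(1, 37):
    r, c = section_to_grid(s)
    grid[r][c] = s

for line in grid:
    print(" ".join(f"{s:2d}" for s in line))
```

The printed grid starts with 6 in the top-left (northwest) corner and 1 in the top-right (northeast), and ends with 36 in the bottom-right (southeast).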
Reading a legal description
PLSS land descriptions are read inside-out, smallest unit first. A typical legal description looks like this:
NW¼ of the SE¼ of Section 14, T-2-S, R-3-W, 5th Principal Meridian
Decoded:
- NW¼ of the SE¼: northwest quarter of the southeast quarter (40 acres)
- Section 14: section 14 of the township (1 sq mi)
- T-2-S: Township 2 South of the baseline
- R-3-W: Range 3 West of the principal meridian
- 5th Principal Meridian: which survey region we're in
Always include the principal meridian. Without it the description is technically ambiguous — “T-2-S, R-3-W” exists in multiple PLSS regions across the country.
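To make the decoding concrete, here is a minimal parser for descriptions written exactly in the layout of the example above. The function `parse_plss` and its output keys are hypothetical, and real-world descriptions vary far more than these regular expressions allow. The acreage follows from each quartering dividing the 640-acre section by four.

```python
import re


def parse_plss(desc):
    """Parse a legal description in the 'NW¼ of the SE¼ of Section 14, T-2-S, R-3-W, ...' layout."""
    quarters = re.findall(r"\b([NS][EW])¼", desc)        # aliquot parts, smallest first
    section = re.search(r"Section\s+(\d+)", desc)
    township = re.search(r"T-(\d+)-([NS])", desc)
    rng = re.search(r"R-(\d+)-([EW])", desc)
    meridian = re.search(r",\s*([^,]*Meridian)\s*$", desc)
    if not (section and township and rng):
        raise ValueError("could not find section, township, and range")
    return {
        "aliquot": quarters,
        "section": int(section.group(1)),
        "township": (int(township.group(1)), township.group(2)),
        "range": (int(rng.group(1)), rng.group(2)),
        "meridian": meridian.group(1) if meridian else None,
        # Each quartering divides the 640-acre section by 4.
        "acres": 640 / 4 ** len(quarters),
    }


parsed = parse_plss("NW¼ of the SE¼ of Section 14, T-2-S, R-3-W, 5th Principal Meridian")
print(parsed["acres"])     # 40.0
print(parsed["meridian"])  # 5th Principal Meridian
```

A missing meridian comes back as `None` rather than raising, so a caller can flag the description as ambiguous instead of silently accepting it.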
Where the PLSS doesn't apply
The 13 original colonies and a handful of other states (among them Texas, parts of Hawaii, and parts of Louisiana) fall outside the PLSS grid. Most of that land is described by a much older system called metes and bounds, which defines a parcel by walking its perimeter: distance, compass bearing, and physical landmarks. This is one reason Texas land records are notoriously harder to crawl than Oklahoma or North Dakota.
Texas runs its own land survey system based on original Spanish-era grants — labels like “Survey 24, Block 31, T&P RR Co.” replace township-range-section. When working a Permian project, expect Texas descriptions to look entirely different from New Mexico descriptions in the same basin.
Our map georeferencing pipeline has three modes — PLSS-based, lat/long-tick, and feature-based. The PLSS mode is the workhorse because it works on most US onshore acreage. Pulling the PLSS labels off a paper map and snapping them to known section corner coordinates is what turns a scanned image into a georeferenced GeoTIFF.
Remember
- Township = 6×6 mile box. Section = 1 sq mi. Quarter = 160 ac. Quarter-quarter = 40 ac.
- Sections number 1–36 in a snaking pattern from the northeast corner.
- Read legal descriptions inside-out, smallest first.
- Always cite the principal meridian.
- Texas uses its own survey system, not PLSS. Plan for it.
The API well number
Every well drilled in the United States receives a unique permanent identifier called the API well number (more formally, the US Well Number). It is assigned by the state regulator at the permit stage and follows the well through its entire life. If you remember one identifier in upstream data, this is the one.
The number is written with 10, 12, or 14 digits, broken into segments by hyphens:
- State (2 digits)
- State code: 42 = Texas, 30 = New Mexico, 05 = Colorado, etc. Based on a 1952 IBM standard, alphabetical (mostly). Do not confuse with FIPS codes.
- County (3 digits)
- County within the state. 501 in Texas = Yoakum County.
- Well (5 digits)
- Unique well identifier within the county. Assigned sequentially by the state regulator at permit time.
- Sidetrack (2 digits)
- Directional sidetrack code. 00 = the original wellbore. 01, 02… = subsequent sidetracks drilled from the same surface location to different bottomhole locations.
- Event (2 digits)
- Event sequence code. Tracks recompletions or other physical configuration changes within an existing wellbore. 00 = original completion.
API-10, API-12, API-14
Three lengths are in common use and they don't mean the same thing:
- API-10 — just state, county, and well-within-county. Describes the surface location. Two sidetracks from the same surface location share an API-10.
- API-12 — adds the sidetrack code. Describes a specific wellbore.
- API-14 — adds the event sequence. Describes a specific completion event on a wellbore.
For most upstream data work, API-10 is the right level of granularity to identify a well. API-14 is needed when you care about which completion event you're looking at (matters for production allocation and for some types of regulatory filings).
Most state portals serve API numbers in 10-digit form. Operators internally often track API-14. When loading a vendor extract into an operator's master database, mismatched lengths are a common source of pain. Decide which version is your primary key per engagement.
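A minimal sketch of normalizing mixed-length API numbers before a join. The function `parse_api` is a hypothetical helper, and padding missing sidetrack and event codes with `00` is an assumption that should be confirmed for each data source.

```python
import re


def parse_api(raw):
    """Split an API number into segments; accepts 10, 12, or 14 digits, hyphenated or not."""
    digits = re.sub(r"\D", "", raw)  # drop hyphens, spaces, and other separators
    if len(digits) not in (10, 12, 14):
        raise ValueError(f"expected 10, 12, or 14 digits, got {len(digits)}")
    # Assumption: a missing sidetrack or event code means the original (00).
    padded = digits.ljust(14, "0")
    return {
        "state": padded[0:2],
        "county": padded[2:5],
        "well": padded[5:10],
        "sidetrack": padded[10:12],
        "event": padded[12:14],
        "api10": padded[:10],
        "api12": padded[:12],
        "api14": padded,
    }


a = parse_api("42-501-20130-03-00")
b = parse_api("4250120130")
print(a["api10"] == b["api10"])  # True: same well at API-10 granularity
print(a["api12"], b["api12"])    # 425012013003 425012013000
```

Keep API numbers as strings end to end. Stored as integers, codes with a leading zero (state `05`, for example) lose a digit and no longer match.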
API-10 is our default blocking key for entity resolution on wells. Two filings that share an API-10 plus a surface location plus an operator are almost certainly the same well. The hard work is everything else: operator name changes, well-name typos, and reformulated leases that re-label the same physical hole.
Remember
- Format: SS-CCC-WWWWW-BB-EE. State, county, well, sidetrack, event.
- API-10 = surface location. API-12 = wellbore. API-14 = wellbore + event.
- Assigned by the state regulator at permit time, immutable thereafter.
- State codes are not FIPS codes. Use the real API state-code lookup.
Leases, title, and land services
Before an operator can drill, it needs the legal right to extract minerals from the ground under a specific piece of acreage. That right comes from a lease — a contract between the mineral rights owner (the lessor) and the operator (the lessee). The work of finding mineral owners, negotiating leases, researching title, and resolving conflicting claims is land services. This is the BETA Land Services world.
Surface vs. mineral estate
In the US, mineral rights can be severed from surface rights. The person who owns the surface (the farmer, the rancher) may not own what's under it. That severance creates two separate legal estates:
- Surface estate — the right to use the land surface. Owned by the surface owner.
- Mineral estate — the right to extract subsurface minerals. May belong to a different party, possibly broken into fractional interests across dozens of heirs.
A mineral estate that has been split among multiple parties is called fractionated. Working out who owns what fraction, often across decades of inheritances and divorces, is the bulk of what a landman does.
The lease document
An oil and gas lease typically specifies:
- Lessor / lessee
- Who owns the minerals; who is leasing them.
- Legal description
- The specific acreage covered, in PLSS or metes-and-bounds terms.
- Bonus
- Upfront cash payment to the lessor for signing the lease.
- Primary term
- How long the operator has to drill before the lease expires. Usually 3–5 years.
- Royalty
- The fraction of production revenue paid to the mineral owner. Historically 12.5% (one-eighth); modern leases run 18.75%–25% in competitive basins.
- Held by production (HBP)
- The clause that extends the lease indefinitely beyond the primary term as long as the well keeps producing in paying quantities.
- Pooling clause
- Permits combining multiple leases into a drilling unit (often a section) for the purpose of drilling a single well.
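The royalty and pooling terms combine into simple arithmetic for what an owner is paid from a pooled well. The sketch below assumes the common simplification that a tract shares in unit revenue in proportion to its acreage and that there are no other burdens on the interest. The function name and the example numbers are illustrative.

```python
from fractions import Fraction


def decimal_interest(mineral_fraction, tract_acres, unit_acres, royalty):
    """Owner's share of unit revenue = (net mineral acres / unit acres) * royalty."""
    net_mineral_acres = Fraction(mineral_fraction) * tract_acres
    return net_mineral_acres / unit_acres * Fraction(royalty)


# An heir owns 1/4 of the minerals under a 160-acre tract, leased at a 1/5 (20%)
# royalty and pooled into a 640-acre unit.
share = decimal_interest(Fraction(1, 4), 160, 640, Fraction(1, 5))
print(share, float(share))  # 1/80 0.0125
```

Using `Fraction` keeps the result exact, which matters when dozens of fractional owners must sum to exactly the leased interest.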
The land services workflow
This is the work BETA Land Services and firms like it deliver to operators:
- Abstracting
- Walking the chain of title for a piece of acreage all the way back through county records — sometimes 150 years or more — and producing an abstract: a summary of every recorded document affecting ownership. The raw material for everything else.
- Title research / opinion
- An attorney's reading of the abstract: who owns what fraction of the minerals, with what burdens, and is title clear enough to drill on?
- Curative
- Fixing defects in title — finding missing heirs, getting affidavits signed, clearing old liens — so the title opinion can be re-issued clean.
- Leasing
- The landman in the field negotiating and signing the actual leases with mineral owners.
- Division order
- The instrument that tells the operator exactly which fraction of revenue is owed to which party. Issued after first production based on the cleared title.
- Right of way (ROW)
- Easements across surface land for pipelines, roads, power, and gathering infrastructure. A separate negotiation from the lease itself.
A land services firm wins on speed and procedural discipline. BETA Land Services maintains a 48–72 hour deployment promise across North America, has built the firm to 51–200 employees across 37 states, and has touched 4.3 million acres and thousands of wells. The deliverable — a clean abstract, a curative file, a ready-to-execute lease — is reliable and defensible. That's the model Lean Informatics applies to upstream data.
Remember
- Surface estate and mineral estate are legally separate. Most leasing work concerns the mineral estate.
- A lease has a primary term and is held by production thereafter — that's why operators rush to drill before the primary term ends.
- Abstracting → title opinion → curative → leasing → division order. That's the land services workflow.
- The landman is the operator's most expensive non-engineer because their work is the most legally exposed.
Drilling
Drilling is the part of the lifecycle that converts permits and plans into a physical hole in the ground. A single modern unconventional well takes 10–30 days to drill and runs to depths of 8,000–14,000 feet vertical with a horizontal section (the lateral) of 5,000–15,000 feet beyond that.
Vertical, directional, horizontal
- Vertical wells drill straight down. Dominant historically; still used for shallow conventional targets and for certain monitoring/disposal wells.
- Directional wells intentionally deviate from vertical. Used to reach a bottomhole location offset from the surface — common offshore and on tight surface pads.
- Horizontal wells deviate to roughly 90° and run a long lateral inside the target formation. Standard for unconventional plays — the lateral exposes hundreds of times more reservoir rock to the wellbore than a vertical hole would.
Modern unconventional drilling design
A typical horizontal well has three sections: the vertical section, the curve (where it builds angle), and the lateral (the horizontal portion within the target). Operators plan and steer to a specific landing zone within the target formation — typically a 10–30 foot vertical window of the highest-quality rock.
Lateral lengths have grown steadily over the last decade. Many laterals today run 7,500–10,000 feet, with some operators experimenting with 15,000+ foot designs and even “horseshoe” geometries that turn the lateral 180° to honor lease boundaries. Longer laterals expose more rock per well and amortize the vertical drilling cost across more producing length.
Casing and cementing
A drilled hole is not stable on its own. As the bit goes deeper, the operator runs casing — concentric steel pipe — and cements it in place to isolate the wellbore from surrounding formations. A typical well has multiple casing strings:
- Conductor casing — the outermost, near surface. Stops the hole from collapsing in the first few hundred feet.
- Surface casing — set below the deepest fresh water aquifer. Protects groundwater. Cemented to surface.
- Intermediate casing — isolates problem zones (pressure changes, lost circulation, weak formations).
- Production casing — the innermost string, set across the producing interval. The one that gets perforated.
Geosteering
While drilling the lateral, the operator's geosteering team watches real-time log data from tools mounted near the bit (LWD — Logging While Drilling) and adjusts the trajectory to stay inside the target zone. Modern geosteering uses gamma ray, resistivity, and inclination measurements every few feet. A well that drifts out of the target zone produces dramatically less.
The drilling stage produces some of the densest data on a well: daily drilling reports, mud logs, LWD log files. Operators integrate these into their well master and into Petrel for next-well planning. Our pipelines pull and structure all of it.
Remember
- Vertical, directional, horizontal. Horizontal + frac is the unconventional pattern.
- A horizontal well = vertical section + curve + lateral. The lateral is the productive part.
- Casing is run and cemented in stages as the hole deepens. Each string has a purpose.
- Geosteering keeps the bit inside the target zone using real-time log data.
Well logs
A well log is a continuous depth-by-depth record of physical measurements taken inside the wellbore by lowering instruments down the hole. The data is plotted as parallel curves on a multi-track strip — depth on the vertical axis, measurement values on the horizontal axes. Reading well logs is one of the foundational skills of petroleum geoscience.
Logs are run by service companies (Schlumberger, Halliburton, Weatherford, and others) and delivered to the operator as both digital data (LAS files) and plotted images (paper, PDF, or TIFF). For decades, only the plotted image was standard. That is why so much legacy log data exists only as raster — and why digitizing it is hard.
The standard track layout
By convention, logs are presented on a three-track format: Track 1 sits to the left of the depth column and holds the gamma ray, SP, and caliper curves. Track 2 typically holds resistivity. Track 3 holds porosity measurements — density and neutron. Once you orient on one well log, you can orient on almost any other.
The basic curve suite
- Gamma ray (GR)
- Measures natural radioactivity. Shales and clays contain naturally occurring radioactive elements (mostly potassium, uranium, thorium) and read high; clean sandstones and carbonates read low. The first curve geologists look at — it separates reservoir rock from non-reservoir.
- Spontaneous potential (SP)
- Measures the natural voltage that develops in the wellbore between the drilling mud and the formation water. Deflects (usually negative) opposite permeable beds and sits on a flat baseline in shales; useful for picking permeable intervals when GR is ambiguous.
- Caliper (CAL)
- Mechanical arms measure the diameter of the borehole. A diameter larger than the bit size means washout (weak rock). A smaller diameter means swelling clays or mudcake buildup; mudcake forms against permeable rock, so it is a positive indicator.
- Resistivity (deep, medium, shallow)
- How resistive the rock is to electrical current. Hydrocarbon-bearing zones read high; water-bearing zones read low. Multiple depths of investigation reveal mud filtrate invasion. The most important hydrocarbon indicator on a basic log.
- Neutron porosity (NPHI)
- Measures the rock's response to neutron radiation, which is dominated by hydrogen content. Hydrogen lives in water, in oil, and in clays — so neutron porosity must be interpreted alongside density.
- Bulk density (RHOB)
- Measures the rock's bulk density via gamma-ray scattering. Together with neutron, density gives a far more reliable porosity reading than either curve alone, and the density-neutron crossover pattern is a classic gas signature.
- Sonic (DT)
- Measures the time for a sound wave to travel a foot through the rock. Used for porosity and for tying the well log to seismic data.
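The density-to-porosity conversion and the gas crossover check can be sketched in a few lines. The sketch uses the standard density-porosity relation, porosity = (matrix density − bulk density) / (matrix density − fluid density). The default matrix density (2.65 g/cc, quartz sandstone), the fluid density (1.0 g/cc), and the 0.03 separation threshold are assumptions chosen for this example.

```python
def density_porosity(rhob, rho_matrix=2.65, rho_fluid=1.0):
    """Porosity (fraction) from bulk density in g/cc."""
    return (rho_matrix - rhob) / (rho_matrix - rho_fluid)


def gas_crossover(rhob, nphi, min_separation=0.03, rho_matrix=2.65, rho_fluid=1.0):
    """True when density porosity exceeds neutron porosity by at least min_separation.

    Assumes NPHI is already expressed on a matrix scale compatible with rho_matrix.
    """
    return density_porosity(rhob, rho_matrix, rho_fluid) - nphi >= min_separation


print(round(density_porosity(2.42), 3))  # 0.139
print(gas_crossover(2.42, 0.18))         # False: neutron reads above density porosity
print(gas_crossover(2.20, 0.10))         # True: density porosity well above neutron
```

Gas lowers the bulk density (raising density porosity) and lowers the hydrogen count (dropping neutron porosity), which is why the two curves cross.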
The LAS file
Digital log data is delivered in LAS (Log ASCII Standard) format — a plain-text file with a header section describing the well and the curves, followed by a data section of depth-indexed numerical values. LAS 2.0 is the most widely supported version. LAS 3.0 adds some metadata structure but adoption has been mixed.
    ~Version Information
    VERS.   2.0 : CWLS Log ASCII Standard
    WRAP.   NO  : One line per depth step
    ~Well Information
    STRT.F  8000.0 : First reference value
    STOP.F  8500.0 : Last reference value
    STEP.F  0.5 : Step
    NULL.   -999.25 : Missing value
    WELL.   POPE STATE 4H : Well name
    API .   42501201300300 : API number
    ~Curve Information
    DEPT.F    : Depth
    GR  .GAPI : Gamma ray
    RES .OHMM : Deep resistivity
    NPHI.V/V  : Neutron porosity
    RHOB.G/C3 : Bulk density
    ~Ascii
    8000.0 45.2 12.5 0.18 2.42
    8000.5 47.1 13.8 0.17 2.43
    ...
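Because LAS is plain text, the structure is easy to see in code. The following is a minimal reader for unwrapped LAS 2.0 only, written for illustration: `read_las` and the `SAMPLE` text are hypothetical, and wrapped files, LAS 3.0, and malformed headers are deliberately out of scope.

```python
SAMPLE = """\
~Version Information
VERS.   2.0 : CWLS Log ASCII Standard
WRAP.   NO  : One line per depth step
~Well Information
STRT.F  8000.0 : First reference value
STOP.F  8001.0 : Last reference value
STEP.F  0.5 : Step
NULL.   -999.25 : Missing value
~Curve Information
DEPT.F    : Depth
GR  .GAPI : Gamma ray
RHOB.G/C3 : Bulk density
~Ascii
8000.0 45.2 2.42
8000.5 47.1 -999.25
8001.0 51.3 2.44
"""


def read_las(text):
    """Parse unwrapped LAS 2.0 text into (curve names, {curve name: values})."""
    section, curves, rows, null = None, [], [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("~"):
            section = line[1:2].upper()  # V, W, C, or A
            continue
        if section == "A":
            rows.append([float(v) for v in line.split()])
            continue
        mnemonic, rest = line.split(".", 1)
        mnemonic = mnemonic.strip()
        if section == "C":
            curves.append(mnemonic)
        elif section == "W" and mnemonic == "NULL":
            # The value sits between the unit field and the first colon.
            null = float(rest.split(":", 1)[0].split()[-1])
    # Replace the file's null marker with None so it cannot be mistaken for data.
    data = {
        name: [None if row[i] == null else row[i] for row in rows]
        for i, name in enumerate(curves)
    }
    return curves, data


curves, data = read_las(SAMPLE)
print(curves)        # ['DEPT', 'GR', 'RHOB']
print(data["RHOB"])  # [2.42, None, 2.44]
```

Mapping the `NULL` value to `None` at read time is the important design choice: a stray -999.25 left in a density curve will quietly wreck any downstream average or porosity calculation.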
A huge fraction of legacy well-log data — anything pre-1990 and a stubborn tail of more recent vintage — exists only as raster (TIFF, PDF, or paper). Petrel won't read it. Kingdom won't read it. The work of turning those rasters into clean LAS files is exactly what the curve tracer does. This is the bespoke moat described in the business plan.
Remember
- Three tracks: GR/SP/CAL on the left, resistivity in the middle, neutron/density on the right.
- Gamma ray separates reservoir from non-reservoir.
- Resistivity separates hydrocarbon-bearing from water-bearing.
- Neutron + density together give porosity, and their crossover signals gas.
- LAS 2.0 is the de facto digital interchange. The raster legacy is where the curve tracer earns its keep.
Petrophysics
Petrophysics is the discipline of turning log measurements into quantitative answers about the reservoir: how much oil and gas is in place, how much can flow, and how producible it actually is. It sits between the raw geophysics and the reservoir engineering.
The three quantities a petrophysicist wants
- Porosity (ϕ)
- The fraction of the rock that is pore space. Computed from density, neutron, and sometimes sonic logs. Reported as a fraction (0.08 = 8%) or a percentage. Without porosity, nothing fits inside the rock.
- Water saturation (Sw)
- The fraction of the pore space filled with water (the rest is hydrocarbon). Computed mainly from resistivity using Archie's equation (see below). The lower Sw, the more hydrocarbon in place.
- Permeability (k)
- How readily fluid flows through the rock, measured in darcies or millidarcies. Logs estimate permeability poorly — it's usually measured directly from cores or inferred from production. In unconventional shales it's measured in nanodarcies.
Archie's equation
The single most cited equation in petrophysics is Archie's equation, which relates rock resistivity to water saturation:
Sw^n = (a · Rw) / (ϕ^m · Rt)
where:
- Sw = water saturation (the unknown we want)
- Rw = resistivity of the formation water
- Rt = true resistivity of the formation, from the deep resistivity log
- ϕ = porosity
- a, m, n = empirical constants for the specific rock type, usually a ≈ 1, m ≈ 2, n ≈ 2 for clean sandstones.
The equation is named after Gus Archie, who derived it at Shell in 1942. It only works in clean (shale-free) rock; modified versions exist for shaly sands and for unconventional plays where Archie's assumptions break down.
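As a quick sanity check on the equation, here is a minimal sketch that solves it for Sw. The function name and the input values are illustrative assumptions, not data from a real well; the defaults are the clean-sandstone constants quoted above.

```python
def archie_sw(rt, rw, phi, a=1.0, m=2.0, n=2.0):
    """Water saturation from Archie's equation: Sw^n = (a * Rw) / (phi^m * Rt)."""
    if rt <= 0 or rw <= 0 or not 0 < phi <= 1:
        raise ValueError("rt and rw must be positive; phi must be in (0, 1]")
    sw = ((a * rw) / (phi ** m * rt)) ** (1.0 / n)
    return min(sw, 1.0)  # clip: a saturation above 1 is not physical


# Illustrative inputs (assumed): 20 ohm-m deep resistivity, 0.05 ohm-m
# formation water, 20% porosity.
sw = archie_sw(rt=20.0, rw=0.05, phi=0.20)
print(f"Sw = {sw:.2f}")  # Sw = 0.25
```

With these inputs a quarter of the pore space holds water and the rest holds hydrocarbon. Drop Rt toward 1 ohm-m with everything else fixed and the same rock computes as fully wet, which is why resistivity is the curve that separates pay from water.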
Gardner and Wyllie equations
Two other equations come up alongside Archie, both tying sonic and density measurements to rock properties. The Gardner equation relates bulk density to seismic (P-wave) velocity; the Wyllie time-average equation relates sonic travel time to porosity. Both are workhorses for synthetic well log generation, which is why our synthetic data factory uses them.
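Both relations are short enough to write out. The sketch below uses the commonly quoted textbook forms: Gardner as density = 0.23 × V^0.25 (V in ft/s, density in g/cm³) and Wyllie as a linear mix of matrix and fluid travel times. The default matrix and fluid travel times (55.5 and 189 µs/ft, typical for sandstone and fresh water) are assumptions that have to be calibrated per formation, and the function names are illustrative.

```python
def wyllie_dt(phi, dt_matrix=55.5, dt_fluid=189.0):
    """Sonic travel time (us/ft) for a given porosity, Wyllie time-average."""
    return phi * dt_fluid + (1.0 - phi) * dt_matrix


def wyllie_porosity(dt_log, dt_matrix=55.5, dt_fluid=189.0):
    """Invert the Wyllie time-average: porosity from a sonic log reading."""
    return (dt_log - dt_matrix) / (dt_fluid - dt_matrix)


def gardner_density(velocity_ft_s, alpha=0.23, beta=0.25):
    """Bulk density (g/cm3) from P-wave velocity (ft/s), Gardner's relation."""
    return alpha * velocity_ft_s ** beta


# Forward-model a synthetic 15%-porosity sandstone sample.
dt = wyllie_dt(0.15)              # sonic travel time, us/ft
velocity = 1.0e6 / dt             # convert us/ft to ft/s
rho = gardner_density(velocity)   # bulk density, g/cm3
print(f"dt = {dt:.1f} us/ft, velocity = {velocity:.0f} ft/s, rho = {rho:.2f} g/cm3")
```

Run forward, as here, the pair turns a synthetic porosity into plausible sonic and density responses. Run in reverse, `wyllie_porosity` recovers porosity from a measured sonic curve.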
Pay zones
A pay zone is an interval of rock that's commercially worth producing. The petrophysicist's job is to find the pay zones by combining the curves: clean (low GR), porous (high porosity from density-neutron), and hydrocarbon-saturated (high resistivity). Each pay zone gets a top, bottom, and net-pay thickness — the “feet of pay” that drive economic estimates.
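The pay logic reduces to a set of cutoffs applied sample by sample. The sketch below is a simplification under stated assumptions: the cutoff values are placeholders (real cutoffs are set per play), and the high-resistivity test is expressed through water saturation already computed from resistivity.

```python
def is_pay(gr_api, porosity, sw, gr_max=75.0, phi_min=0.06, sw_max=0.50):
    """True when a sample is clean, porous, and hydrocarbon-saturated."""
    return gr_api <= gr_max and porosity >= phi_min and sw <= sw_max


def net_pay_ft(samples, step_ft=0.5, **cutoffs):
    """Net pay: count of samples passing every cutoff times the sample step."""
    return step_ft * sum(is_pay(gr, phi, sw, **cutoffs) for _, gr, phi, sw in samples)


# (depth_ft, gamma ray in API units, porosity fraction, water saturation)
samples = [
    (8000.0, 110.0, 0.03, 0.95),  # shale: fails the gamma ray cutoff
    (8000.5, 45.0, 0.12, 0.30),   # pay
    (8001.0, 40.0, 0.14, 0.28),   # pay
    (8001.5, 42.0, 0.13, 0.80),   # clean and porous, but wet
]
print(net_pay_ft(samples))  # 1.0
```

Two of the four half-foot samples pass, so this interval carries one foot of pay.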
Our synthetic data factory generates raster log images with petrophysically plausible curves — using Archie, Gardner, and Wyllie to drive realistic resistivity, density, and sonic responses from synthetic lithology sequences. That gives us millions of perfectly labeled training pages to pretrain the curve tracer on, with no real-data labeling cost.
Remember
- Petrophysics extracts porosity, water saturation, and permeability from logs.
- Archie's equation is the foundational saturation calculation.
- Gardner relates bulk density to velocity; Wyllie relates sonic travel time to porosity.
- Pay zones are clean, porous, and hydrocarbon-saturated. Net pay thickness drives economic estimates.
Completions and frac #
Drilling makes a hole. Completion makes it produce. In an unconventional well, completion is where most of the cost lives — often 60–70% of total well capital. It's also where the operator's design choices most visibly affect economics.
Plug and perf — the standard unconventional completion
The dominant completion method in US shales is plug and perf. The lateral is divided into stages — typically 30–80 stages on a modern long lateral, each 100–250 feet long. Each stage is fractured independently, in sequence, starting from the toe (the far end of the lateral) and working back to the heel (where it meets the curve).
- The production casing is cemented in place across the lateral.
- A wireline crew runs a perforating gun down to the toe stage and shoots a cluster of holes through casing, cement, and into the formation rock.
- A frac crew pumps water, sand, and chemicals at high pressure through the perforations, cracking open the rock — this is the frac job.
- A bridge plug is set above the just-fractured stage to isolate it.
- The next stage's perforating gun is run down, perforating above the plug. Frac job, plug, perforate, frac job — and so on, stage by stage, back toward the heel.
- When all stages are done, a coiled tubing unit drills out all the plugs and the well is ready to flow.
What the frac job actually does
A single frac stage pumps roughly 10,000–25,000 barrels of fluid (water plus chemicals — friction reducers, biocides, scale inhibitors) and 200,000–500,000 pounds of proppant (sand or ceramic beads). Fluid pressure cracks the rock open; proppant flows into the cracks; when pressure is released, the proppant holds the cracks open, giving fluid a path to flow back to the wellbore. A modern long lateral well consumes 10–25 million pounds of sand total.
The fractures propagate in the direction of the maximum horizontal stress in the rock. Laterals are usually drilled along the minimum horizontal stress, so the fractures grow roughly perpendicular to the wellbore. By the end of 2018, about 96% of US crude oil production from tight oil formations came from horizontal wells. The combination of long horizontal laterals + many frac stages is what unlocked the entire shale revolution.
Cluster spacing and intensity
Within each stage, the perforations are clustered. Cluster spacing (the distance between clusters) and proppant intensity (pounds of sand per foot of lateral) are the two knobs operators have tuned hardest over the last decade. Spacing has generally tightened and intensity has generally increased: more clusters and more sand per foot of lateral, producing more initial production but at higher cost.
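The design numbers hang together arithmetically, and it helps to see them computed from one set of inputs. This is a sketch with illustrative inputs chosen from inside the ranges quoted in this chapter; the function name and the clusters-per-stage value are assumptions.

```python
def completion_summary(lateral_ft, stage_length_ft, proppant_lb_per_stage, clusters_per_stage):
    """Derive the headline completion metrics from basic design inputs."""
    stages = round(lateral_ft / stage_length_ft)
    total_proppant_lb = stages * proppant_lb_per_stage
    return {
        "stages": stages,
        "total_proppant_lb": total_proppant_lb,
        "proppant_lb_per_ft": total_proppant_lb / lateral_ft,
        "cluster_spacing_ft": stage_length_ft / clusters_per_stage,
    }


# A 10,000 ft lateral with 200 ft stages, 400,000 lb of sand per stage,
# and 8 perforation clusters per stage (all assumed values).
summary = completion_summary(10_000, 200, 400_000, 8)
for key, value in summary.items():
    print(f"{key}: {value:,}")
```

That design works out to 50 stages, 20 million pounds of sand, 2,000 lb/ft, and 25 ft between clusters. These are the same derived fields worth computing when comparing completion reports across offset wells.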
Completion variants
- Sliding sleeve — instead of plugs, ball-activated sleeves open each stage in sequence. Faster but less flexible. Common in some Canadian plays.
- Open-hole — the lateral is not cased. Frac goes directly into the open formation. Cheaper but less control.
- Acid jobs — instead of (or alongside) hydraulic fracturing, acid is pumped to dissolve carbonate rock. Common in conventional carbonate plays.
The completion report
Once a well is completed, the operator files a completion report with the regulator. It documents the stages, fluid volumes, proppant tonnage, perforation depths, and initial test rates. Operators care intensely about competitors' completion reports — they reveal design choices and let them benchmark.
Completion reports are one of the most valuable document classes in our pipeline. They're filed publicly with state regulators but vary enormously in format. Our extractors pull stage tables, fluid volumes, and proppant totals into structured records that operators can compare across competitors offset-by-offset.
Remember
- Plug and perf is the dominant completion method in US shale.
- A modern lateral has 30–80 stages, fractured toe-to-heel.
- Each stage pumps thousands of barrels of fluid and hundreds of thousands of pounds of proppant.
- Cluster spacing and proppant intensity are the design knobs. Spacing has generally tightened; intensity has generally trended up.
- Completion design from competitor reports is one of the most read pieces of public data in the industry.
Production and decline #
Once a well starts producing, the operator's job is to keep it producing as economically as possible. Reservoir pressure depletes over time; flow rates decline; equipment fails; operating cost stays fixed per well. The decline curve is the mental model that organizes all of this.
The shape of production
A new unconventional well typically produces a flush burst of oil and gas in its first 30–90 days — the initial production rate or IP30/IP90 — then declines steeply for the first year or two, then flattens to a slower exponential decline that can last a decade or more. Roughly 50–70% of a typical unconventional well's lifetime production happens in its first three years.
Decline curve analysis
Decline curve analysis (DCA) fits a parametric curve to measured production data and extrapolates future production. The classical Arps equations give three flavors:
- Exponential — constant percentage decline per unit time. Simplest. Reasonable for mature conventional wells.
- Hyperbolic — decline rate itself declines. The standard for unconventional plays.
- Harmonic — a special case of hyperbolic with the b-factor set to 1.
Fitting a decline curve gives the operator a forecast — usually expressed as estimated ultimate recovery (EUR), the total volume the well will produce over its life. EUR per well, divided by lateral length, gives EUR-per-foot, which is one of the standard productivity metrics for comparing operators and basins.
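The three Arps flavors share one formula family, so a single pair of functions covers them. The sketch below assumes consistent monthly units (rate in bbl/month, decline per month, time in months) and uses made-up parameters. Production forecasts usually also switch to a terminal exponential decline late in life, which is left out here.

```python
import math


def arps_rate(t, qi, di, b):
    """Rate at time t. b = 0 exponential, 0 < b < 1 hyperbolic, b = 1 harmonic."""
    if b == 0:
        return qi * math.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)


def arps_cumulative(t, qi, di, b):
    """Cumulative volume from time 0 to t (closed-form integral of the rate)."""
    if b == 0:
        return qi / di * (1.0 - math.exp(-di * t))
    if b == 1:
        return qi / di * math.log(1.0 + di * t)
    return qi / ((1.0 - b) * di) * (1.0 - (1.0 + b * di * t) ** (-(1.0 - b) / b))


# Hypothetical well: 25,000 bbl/month initial rate, 10%/month initial decline,
# b = 0.5, forecast over 30 years on a 10,000 ft lateral.
eur_bbl = arps_cumulative(360, qi=25_000, di=0.10, b=0.5)
print(f"EUR = {eur_bbl:,.0f} bbl, or {eur_bbl / 10_000:.1f} bbl per lateral foot")
```

The fitted parameters (qi, di, b) are what an analyst actually estimates from production history; EUR and EUR-per-foot fall out of them.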
Type curves
A type curve is an averaged decline curve representing a class of wells — e.g., “Permian Wolfcamp B 7,500-foot lateral, 2024 vintage.” Operators build type curves from offset wells and use them as templates for forecasting new wells in similar conditions. A&D buyers use type curves to estimate the value of acreage they don't operate yet.
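One simple way to build a type curve is to align each offset well on its first production month, scale its volumes to a reference lateral length, and average month by month. The sketch below does exactly that. Linear scaling by lateral length is an assumption made for illustration, and the well data is invented.

```python
def type_curve(wells, ref_lateral_ft=7500):
    """Average monthly volumes across wells, scaled to a reference lateral.

    wells: list of (lateral_ft, [monthly volumes counted from first production]).
    Wells with shorter histories simply drop out of the later months.
    """
    n_months = max(len(volumes) for _, volumes in wells)
    curve = []
    for month in range(n_months):
        scaled = [
            volumes[month] * ref_lateral_ft / lateral_ft
            for lateral_ft, volumes in wells
            if month < len(volumes)
        ]
        curve.append(sum(scaled) / len(scaled))
    return curve


wells = [
    (7_500, [30_000, 24_000, 20_000]),
    (10_000, [40_000, 32_000]),  # longer lateral, shorter history
]
print(type_curve(wells))  # [30000.0, 24000.0, 20000.0]
```

The later months of any type curve rest on fewer wells than the early months, which is worth keeping in mind when a forecast leans on the tail.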
Allocation and reporting
Production volumes are reported monthly to the regulator. Allocation splits the produced volume across the wells and the leases that share a surface facility (a tank battery serving multiple wells, a saltwater disposal facility, etc.). Bad allocation produces wells with implausible production profiles — a known source of pain in public data.
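A common allocation approach splits the measured facility volume in proportion to each well's most recent test rate. The sketch below shows that approach only; the well names and numbers are invented, and real allocation procedures vary by operator and regulator.

```python
def allocate(facility_volume, test_rates):
    """Split a measured facility volume across wells by test-rate share."""
    total_rate = sum(test_rates.values())
    if total_rate <= 0:
        raise ValueError("at least one well needs a positive test rate")
    return {
        well: facility_volume * rate / total_rate
        for well, rate in test_rates.items()
    }


# One tank battery measured 9,000 bbl for the month; three wells feed it.
monthly = allocate(9_000, {"Well A": 100, "Well B": 200, "Well C": 0})
print(monthly)  # {'Well A': 3000.0, 'Well B': 6000.0, 'Well C': 0.0}
```

The failure mode follows directly from the formula: a stale or mistyped test rate shifts volume between wells even though the facility total is right, and that is how implausible single-well profiles end up in public data.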
Operators also report production in three streams:
- Oil
- Liquid hydrocarbons, measured in barrels (bbl). One barrel = 42 US gallons.
- Gas
- Methane and other gaseous hydrocarbons, measured in standard cubic feet (scf) or thousand cubic feet (mcf).
- Water
- Produced water, in barrels. In some basins (Permian especially), water cuts exceed 80%, and water handling is a major cost center.
The BOE
To talk about oil and gas together, operators convert them to a common unit: barrel of oil equivalent (BOE), with gas converted to oil at an energy ratio of 6,000 cubic feet of gas per barrel of oil (6 mcf/bbl). The conversion is energy-equivalent, not value-equivalent: in recent years a barrel of oil has usually sold for well over six times the price of an mcf of gas, so a BOE of gas is worth considerably less than a barrel of oil.
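The conversion itself is one line. A minimal sketch, with an invented monthly volume for illustration:

```python
MCF_PER_BOE = 6.0  # energy-equivalence ratio: 6 mcf of gas per barrel of oil


def to_boe(oil_bbl, gas_mcf):
    """Combine oil and gas volumes into barrels of oil equivalent."""
    return oil_bbl + gas_mcf / MCF_PER_BOE


oil_bbl, gas_mcf = 10_000, 30_000
total_boe = to_boe(oil_bbl, gas_mcf)
print(f"{total_boe:,.0f} BOE, {oil_bbl / total_boe:.0%} oil")  # 15,000 BOE, 67% oil
```

Reporting the oil fraction next to the BOE total is a useful habit, because two wells with the same BOE can have very different revenue.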
Monthly production records are public in most major states. We crawl them and feed them to clients alongside the well master. Production + completion design + log data together let an operator answer the question that matters most: which acreage is worth drilling next?
Remember
- Unconventional wells decline steeply for 1–2 years, then flatten. Most production comes early.
- Decline curves are usually hyperbolic for unconventional, exponential for mature conventional.
- EUR per well and EUR per foot of lateral are the headline productivity metrics.
- BOE converts oil and gas to a common energy unit. 6 mcf = 1 BOE.
- Allocation errors are a frequent source of dirty public production data.
Reserves and reserve reporting #
A reserve is a volume of hydrocarbons in the ground that is expected to be recovered under current economic conditions. Reserves are the primary asset on an E&P company's balance sheet. The way they're categorized and disclosed is regulated by the SEC for public companies, and drives bank borrowing bases, M&A valuations, and analyst coverage.
The reserve categories
Reserves are classified as proved, probable, or possible. The SEC requires public companies to disclose proved reserves and also permits optional disclosure of probable and possible reserves; SPE definitions recognize all three. The proved category is further subdivided by whether the well is drilled and producing.
| Category | Code | Definition | Confidence |
|---|---|---|---|
| Proved Developed Producing | PDP | Existing well, currently producing | Highest |
| Proved Developed Non-Producing | PDNP | Existing well, not currently producing (shut-in, behind-pipe zones) | High |
| Proved Undeveloped | PUD | Location not yet drilled; recovery expected with reasonable certainty | Lower |
| Probable | P2 | Unproved; ≥50% probability of recovery | Lower still |
| Possible | P3 | Unproved; ≥10% probability of recovery | Lowest |
1P, 2P, 3P
The shorthand combinations roll up these categories:
- 1P = proved (PDP + PDNP + PUD). The SEC-reportable number.
- 2P = proved + probable. Used by SPE and most non-US regulators.
- 3P = proved + probable + possible. The most aggressive view.
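The rollup is plain addition, shown here with invented volumes in million BOE:

```python
def reserve_rollup(pdp, pdnp, pud, probable, possible):
    """Roll reserve categories up into the 1P / 2P / 3P shorthand."""
    proved = pdp + pdnp + pud
    return {
        "1P": proved,
        "2P": proved + probable,
        "3P": proved + probable + possible,
    }


# Hypothetical company, volumes in million BOE.
print(reserve_rollup(pdp=40, pdnp=5, pud=25, probable=30, possible=20))
# {'1P': 70, '2P': 100, '3P': 120}
```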
Reserve reports
A reserve report is the engineering document that quantifies a company's reserves at a point in time. Independent firms (Cawley Gillespie, Netherland Sewell, DeGolyer and MacNaughton, Ryder Scott, and others) audit reserves for SEC filers. The reserve report includes:
- Well-by-well or location-by-location forecasts.
- Decline curve assumptions for each well.
- Operating cost assumptions.
- Pricing assumptions (SEC pricing is the unweighted average of first-day-of-the-month prices over the trailing 12 months).
- PV-10 — the present value of future net cash flows discounted at 10%, before tax. The single most cited reserve-value number.
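PV-10 is an ordinary discounted cash flow sum at a fixed 10% rate. The sketch below assumes annual cash flows discounted at year end; reserve software often uses mid-period or monthly discounting instead, so treat the timing convention as an assumption. The cash flows are invented.

```python
def pv10(annual_net_cash_flows, rate=0.10):
    """Present value of pre-tax net cash flows, discounted at year end."""
    return sum(
        cash_flow / (1.0 + rate) ** year
        for year, cash_flow in enumerate(annual_net_cash_flows, start=1)
    )


# Hypothetical well: declining pre-tax net cash flow, in $ thousands.
cash_flows = [1_100, 605, 400, 300, 220]
print(f"Undiscounted: {sum(cash_flows):,}  PV-10: {pv10(cash_flows):,.0f}")
```

Because production is front-loaded, most of a well's PV-10 comes from its first few years, which is why small changes to the early decline assumptions move the number so much.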
Borrowing bases
Most E&P debt is structured as a reserve-based loan (RBL) against the PDP reserves. The lender's engineering team independently values the proved reserves and sets a borrowing base — typically a fraction of PV-10 on the PDP, with smaller fractions on PDNP and PUD. The borrowing base is redetermined twice a year. When prices fall, borrowing bases fall — and that drives a lot of distressed-asset deal flow.
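The borrowing-base mechanics can be sketched as a weighted sum. The advance rates below are placeholders invented for illustration, not market figures; each lender sets its own rates and price deck.

```python
# Illustrative placeholder advance rates only; real lenders set their own.
ADVANCE_RATES = {"PDP": 0.65, "PDNP": 0.40, "PUD": 0.25}


def borrowing_base(pv10_by_category, advance_rates=ADVANCE_RATES):
    """Credit limit as a risk-weighted fraction of PV-10 per proved category."""
    return sum(
        pv10_by_category[category] * advance_rates[category]
        for category in pv10_by_category
    )


# Hypothetical PV-10 values in $ millions.
spring = {"PDP": 100.0, "PDNP": 10.0, "PUD": 40.0}
fall = {category: value * 0.7 for category, value in spring.items()}  # lower prices

print(f"Spring: {borrowing_base(spring):.1f}  Fall: {borrowing_base(fall):.1f}")
```

If the operator has already drawn more than the fall figure, the shortfall has to be repaid or covered, and that is the pressure that pushes assets onto the market.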
A&D evaluations rely on reserve reports plus the underlying log and completion data. The faster a buyer can recompute reserves on the target acreage using their own assumptions, the more confident their bid. Our pipelines feed the data layer that recomputation runs on.
Remember
- PDP, PDNP, PUD are the proved categories. Probable and possible are unproved.
- 1P = proved. 2P = proved + probable. 3P = all three.
- PV-10 is the headline reserve-value number, discounted at 10% pre-tax.
- Reserve-based loans drive a lot of corporate behavior. A falling oil price triggers borrowing-base redeterminations and distressed asset sales.
A&D and deal flow #
Acquisition and divestiture (A&D) is the business of buying and selling oil and gas properties. It is the part of the industry where data quality matters most acutely — a single misread completion report or a mis-loaded production curve can shift a bid by millions of dollars. It is also where most of Lean Informatics' early urgency lives.
The shapes of a deal
A&D transactions come in three main flavors:
- Asset deal
- The buyer acquires specific oil and gas properties — leases, wells, related infrastructure — but not the corporate entity. The most common form. Most published deal lists are asset deals.
- Corporate (M&A)
- The buyer acquires the entire company. Brings the operating team, the contracts, the liabilities. Big deals at the major and large-independent level.
- Joint venture / drilling partnership
- The buyer takes a working interest in undrilled acreage in exchange for funding part of the drilling program. Common in plays where acreage is held but capital is scarce.
The deal process
The standard A&D process runs about 60–120 days and follows a fixed sequence:
- Teaser — anonymized one-pager from the seller's banker describing the asset at a high level.
- NDA + CIM — the buyer signs an NDA and receives the confidential information memorandum: detailed description of the assets, reserves summary, financial highlights.
- Data room — the buyer gets read access to the seller's data: leases, wells, logs, reserve reports, financials, regulatory history. This is where data quality wins or loses deals.
- Indicative bid — buyer submits a non-binding bid based on data-room review.
- Confirmatory diligence — narrower buyer pool gets deeper access, runs site visits, tests assumptions.
- Binding bid → PSA — final bid, purchase and sale agreement, close.
How buyers actually evaluate
A buyer's analyst team takes the seller's data and rebuilds the asset value from scratch, using the buyer's own assumptions. Typical workstreams:
- Reserves — re-fit decline curves on every PDP well; build type curves for the PUD inventory.
- Operating cost — model LOE (lease operating expense) per well; identify integration synergies if the buyer already operates the basin.
- Title — sample-check the lease file; confirm leases are held by production where claimed.
- Regulatory — pull state filings on every well; check for plugging liability, environmental notices, lapsed permits.
- Offset performance — pull all available log, completion, and production data for wells around the asset to validate the seller's claims.
All of this happens under time pressure. A 500-well asset evaluation that takes a full month is moving fast. The buyer who can move from CIM to confident bid in three weeks instead of six has a real edge — and that edge is gated by data quality and tooling.
The worked engagement in the business plan — 500 SCOOP/STACK wells in eight weeks — is an A&D engagement. Our role is to compress the data-room and confirmatory diligence work from weeks to days. Clean logs, structured completion data, current regulatory filings, all integrated into the buyer's Petrel and Spotfire — that's the deliverable.
Remember
- Asset deals, corporate deals, and JVs. Asset deals are the most common.
- Data room access is where data quality determines bid confidence.
- Buyers rebuild reserves, opex, title, and regulatory pictures from scratch in 60–120 days.
- Speed and data quality both compound. A buyer who moves twice as fast with cleaner data wins deals.
Midstream and pipeline integrity #
Midstream is everything between the wellhead and the refinery — gathering, processing, transportation, storage. It is also where pipeline integrity work lives, and where NVI and firms like it operate.
The flow
A typical onshore production stream goes through:
- Wellhead — the surface equipment at the well.
- Tank battery — separator, tanks for oil and water, gas line. Usually shared by multiple wells on a pad or lease.
- Gathering system — small-diameter pipelines that move product from tank batteries to a central processing or sales point.
- Processing plant — removes impurities (water, CO₂, H₂S) from raw gas and strips out natural gas liquids (NGLs) for separate sale.
- Transmission pipeline — large-diameter long-distance pipe to refineries, LNG terminals, or storage.
- Storage — salt caverns, depleted reservoirs, above-ground tanks.
Pipeline integrity
Pipelines are highly regulated. The Pipeline and Hazardous Materials Safety Administration (PHMSA) requires integrity management programs covering inspection, monitoring, and risk assessment. Failure consequences are large — spills, fires, environmental damage, civil and criminal liability — so operators spend heavily on inspection.
NDT methods
Non-destructive testing (NDT) — also called NDE (examination) — covers techniques to inspect pipe and equipment without damaging it. The common ones:
- Ultrasonic testing (UT)
- High-frequency sound waves measure wall thickness and detect internal flaws. The workhorse of pipe inspection.
- Radiography (RT)
- X-ray or gamma-ray imaging of welds. The image is interpreted by a certified technician. This is the “pipe x-ray” world directly.
- Magnetic particle (MT)
- Magnetic flux + iron particles reveal surface and near-surface cracks in ferromagnetic materials.
- Dye penetrant (PT)
- Liquid dye plus developer reveals surface-breaking flaws. Cheap and simple.
- Visual inspection (VT)
- A trained inspector's eye, often with magnification and lighting. Underestimated; essential.
- Inline inspection (ILI) — “pigging”
- A robotic tool (a “smart pig”) is run through the pipe with the flow, recording wall-thickness, geometry, and crack data along the entire length.
Why the deliverable matters
An NDT inspection produces a written report that goes into the operator's integrity management file. NVI alone has inspected over 100,000 miles of pipeline over 30 years, with certified technicians and procedures that translate into defensible inspection records. When a regulator audits the operator, or when an incident requires forensic reconstruction, those records are the evidence. The procedural discipline behind them is what an integrity firm sells.
Lean Informatics applies the same defensibility standard to data delivery. Every datum we hand a client carries source, page, bounding box, extractor version, and model confidence. When an auditor or counterparty challenges a fact, we can defend it the way an NVI inspector can defend a weld report.
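One way to picture that record is a small immutable structure that travels with every extracted value. This is a sketch only: the class names, field names, and example values are assumptions for illustration, not the production schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    source: str                               # document the value came from
    page: int                                 # 1-based page number
    bbox: tuple[float, float, float, float]   # x0, y0, x1, y1 on that page
    extractor_version: str
    confidence: float                         # model confidence, 0 to 1

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


@dataclass(frozen=True)
class Datum:
    field: str
    value: object
    provenance: Provenance


# Hypothetical extracted value with its audit trail attached.
datum = Datum(
    field="total_proppant_lb",
    value=20_000_000,
    provenance=Provenance("completion_report.pdf", 3, (72.0, 410.5, 310.0, 428.0), "0.0.1", 0.97),
)
print(datum.provenance.source, datum.provenance.page)
```

Freezing the dataclasses means a value and its provenance cannot drift apart after extraction, which is the property an audit depends on.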
Remember
- Midstream is the connective tissue between wellhead and refinery.
- Pipeline integrity is heavily regulated and inspection is mandatory.
- UT, RT, MT, PT, VT are the standard NDT methods.
- Inline inspection (smart pigs) covers long lengths of pipe at once.
- The integrity firm's product is a defensible record. That standard applies to data too.
Maps in oil and gas #
Maps are one of the densest information artifacts in the industry. A single structure map can encode formation top depth, fault geometry, productive fairway boundaries, well locations, lease outlines, and operator interpretation of the geology — all on one sheet of paper. The catch is that an enormous fraction of historically valuable maps has no embedded coordinate system, so modern GIS can't read them without manual georeferencing.
Map types you'll encounter
- Structure map
- Contour map of the top of a formation. Shows highs and lows of the geological surface. Foundational for trap identification.
- Isopach
- Contour map of formation thickness. Shows where the productive interval thickens and thins.
- Plat
- A surveyed map of lease boundaries or units. Drawn against the PLSS grid or against surveyed metes-and-bounds. The primary land artifact.
- Scout map
- An informal industry-produced map of which operator holds which acreage, with well locations and recent permits annotated. Historically passed around in print form; the basis for competitive intelligence.
- Fault map
- Map showing the locations and traces of subsurface faults. Critical for completion design and for understanding compartmentalization.
- Net pay map
- Contour map of net hydrocarbon-bearing thickness. Combines structure, isopach, and petrophysical interpretation. Often the deliverable from a reservoir study.
The georeferencing problem
A digital map needs a coordinate reference system (CRS) — the mathematical mapping that converts image pixels to real-world coordinates. A modern GIS export carries a CRS as metadata. A scanned plat from 1962 does not — you have the image, and the PLSS labels printed on it, but no georeferencing.
The three modes for fixing this:
- PLSS-based. OCR the township/range/section labels, look up known coordinates for section corners from the BLM's PLSS reference data, solve an affine or projective transform. Works for most US onshore land.
- Geographic tick. If the map has visible latitude/longitude tick marks with values, OCR the values and solve directly.
- Feature-based. If neither of the above is present, identify recognizable named features — towns, rivers, highways — geocode them, and solve the transform. The fallback mode.
The output is a GeoTIFF (a raster image with embedded CRS metadata) plus, ideally, extracted vector features — well symbols, contour lines, lease polygons — that come out of dedicated detection and segmentation models.
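Whichever mode supplies the control points, the solve step is the same: fit a transform from pixel coordinates to world coordinates. Below is a minimal pure-Python sketch of the affine case using least squares through the normal equations. The function names and control points are invented for illustration; a projective fit and the GeoTIFF write are not shown.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        if abs(a[i][i]) < 1e-12:
            raise ValueError("need at least 3 non-collinear control points")
        for r in range(i + 1, 3):
            factor = a[r][i] / a[i][i]
            for c in range(i, 4):
                a[r][c] -= factor * a[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (a[i][3] - sum(a[i][c] * x[c] for c in range(i + 1, 3))) / a[i][i]
    return x


def fit_affine(pixels, world):
    """Least-squares affine fit. Returns [[a, b, c], [d, e, f]] such that
    X = a*col + b*row + c and Y = d*col + e*row + f."""
    rows = [(col, row, 1.0) for col, row in pixels]
    ata = [[sum(p[i] * p[j] for p in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in (0, 1):
        atb = [sum(p[i] * w[axis] for p, w in zip(rows, world)) for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs


def pixel_to_world(coeffs, col, row):
    return tuple(k[0] * col + k[1] * row + k[2] for k in coeffs)


# Four hypothetical section corners: pixel position and known map coordinates.
pixels = [(0, 0), (100, 0), (0, 100), (100, 100)]
world = [(500_000, 4_000_000), (500_200, 4_000_000),
         (500_000, 3_999_800), (500_200, 3_999_800)]
coeffs = fit_affine(pixels, world)
print(pixel_to_world(coeffs, 50, 50))
```

Using more than three control points lets least squares average out OCR and lookup errors, and the residuals at the control points give a natural quality score for the fit.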
Our map georeferencing pipeline implements all three modes, in priority order. PLSS-based mode dominates the volume because most US onshore maps carry section labels. The output drops directly into ArcGIS and QGIS, no manual control-point work required.
Remember
- Structure maps, isopachs, plats, scout maps, fault maps, net pay maps. Recognize the names.
- A digital map without a CRS is essentially a photograph. Georeferencing fixes it.
- PLSS-based georeferencing is the workhorse for US onshore data.
- GeoTIFF is the output format that drops into modern GIS without further work.
State regulators and their portals #
Oil and gas regulation in the US is overwhelmingly a state-level activity. Each major producing state runs its own regulator with its own forms, its own submission norms, its own portal. There is no federal one-stop. The federal agencies — BLM for onshore federal land, BOEM for offshore — handle a smaller but important slice. Together, the eleven sources below cover roughly 95% of US onshore unconventional activity plus most offshore.
Priority sources
| Agency | Jurisdiction | Plays covered |
|---|---|---|
| TX RRC — Railroad Commission of Texas | Texas | Permian (TX side), Eagle Ford, Haynesville (TX side), East Texas, Panhandle |
| OK OCC — Oklahoma Corporation Commission | Oklahoma | SCOOP, STACK, Anadarko, Arkoma |
| NM OCD — Oil Conservation Division | New Mexico | Permian (NM side, Delaware Basin), San Juan Basin |
| ND IC — North Dakota Industrial Commission | North Dakota | Bakken, Three Forks |
| CO ECMC — Energy and Carbon Management Commission | Colorado | DJ Basin, Piceance, Niobrara |
| WY OGCC — Wyoming Oil and Gas Conservation Commission | Wyoming | Powder River, Greater Green River |
| LA SONRIS — Strategic Online Natural Resources Information System | Louisiana | Haynesville (LA), Tuscaloosa Marine Shale, Gulf Coast onshore, Cotton Valley |
| PA DEP | Pennsylvania | Marcellus, Utica |
| WV DEP | West Virginia | Marcellus, Utica |
| BLM — Bureau of Land Management (AFMSS) | Federal onshore | Federal lands in every play above; large fraction of NM, WY, CO production |
| BOEM — Bureau of Ocean Energy Management | Federal offshore | Gulf of Mexico, Alaska, Pacific |
Why each portal is a snowflake
The portals share a similar conceptual scope — permits, well status, production, plugging — but the implementation details differ on every dimension:
- Forms: Texas's W-1 permit form differs structurally from Oklahoma's 1002A or New Mexico's C-101. Same conceptual document, different fields, different naming.
- Access: some portals offer programmatic search; others require navigating an HTML form; some have rate-limited APIs; some don't have APIs at all.
- Change detection: a few publish RSS feeds. Most don't. Many require periodic re-fetching of an index page to detect new filings.
- Submission format: more recent filings are typically searchable PDFs; older filings are scanned images of paper forms; very old filings are microfilm scans of typewriter-era paper.
- Layout drift: every portal occasionally updates its layout. Adapters built last quarter may break this quarter without notice.
There is no single commercial dataset that solves this cleanly. Vendor products that claim to are reselling someone's scraped, often-stale extract. The only durable solution is per-state adapters maintained by a team that monitors layout drift and re-extracts on a schedule.
The state adapter framework is the single largest reusable asset in Lean Informatics' technical estate. One framework. Eleven adapters. Each one a Python module implementing list_changes, fetch, and doc_types. Nightly regression tests against the live portals catch layout drift within 24 hours. This is the crawl flywheel.
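A minimal sketch of what that adapter contract could look like. The three method names come from the description above; the signatures, the `FilingRef` type, and the in-memory stand-in are assumptions made for illustration, not the framework's actual code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Protocol


@dataclass(frozen=True)
class FilingRef:
    """Hypothetical pointer to one filing on a regulator portal."""
    doc_id: str
    doc_type: str
    url: str
    filed_on: date


class StateAdapter(Protocol):
    def doc_types(self) -> list[str]: ...
    def list_changes(self, since: date) -> Iterable[FilingRef]: ...
    def fetch(self, ref: FilingRef) -> bytes: ...


class InMemoryAdapter:
    """Toy adapter over a fixed set of filings; stands in for a portal client."""

    def __init__(self, filings: dict[FilingRef, bytes]):
        self._filings = filings

    def doc_types(self) -> list[str]:
        return sorted({ref.doc_type for ref in self._filings})

    def list_changes(self, since: date) -> list[FilingRef]:
        return [ref for ref in self._filings if ref.filed_on >= since]

    def fetch(self, ref: FilingRef) -> bytes:
        return self._filings[ref]


def crawl(adapter: StateAdapter, since: date) -> dict[str, bytes]:
    """Fetch every filing that changed since the given date."""
    return {ref.doc_id: adapter.fetch(ref) for ref in adapter.list_changes(since)}


old = FilingRef("a1", "permit", "https://example.invalid/a1", date(2024, 1, 5))
new = FilingRef("b2", "completion", "https://example.invalid/b2", date(2024, 3, 9))
adapter = InMemoryAdapter({old: b"old permit", new: b"new completion"})
print(crawl(adapter, since=date(2024, 2, 1)))  # {'b2': b'new completion'}
```

Keeping the contract this narrow means the crawl loop never needs to know which portal it is talking to, and a fixture adapter like the one above can back regression tests that do not touch the network.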
Remember
- US oil and gas regulation is state-level. Eleven sources cover most activity.
- Texas's RRC, Oklahoma's OCC, New Mexico's OCD, North Dakota's IC are the four heaviest.
- Federal lands report to BLM (onshore) or BOEM (offshore).
- Every portal is a snowflake. Adapter maintenance is permanent work, not a one-time build.
Document types #
A handful of document classes account for most of the value in upstream public data. If you can extract these cleanly, you can answer most of an operator's or buyer's questions about a well or a piece of acreage. The class names differ across states; the conceptual contents do not.
| Class | Texas form | Oklahoma form | New Mexico form | What it tells you |
|---|---|---|---|---|
| Permit / APD | W-1 | 1002A | C-101 | Operator's intent to drill — location, target formation, casing program |
| Completion report | W-2 / G-1 | 1002A re-files | C-103 | Stages, fluid, proppant, perforation intervals, initial test rates |
| Production report | P-1 | 1004A | C-115 | Monthly oil, gas, water volumes |
| Plugging record | W-3 | 1003A | C-103 (final) | How and when the well was plugged; mechanical history |
| Well log / LAS | (via API) | (via portal) | (via portal) | Continuous depth-by-depth measurements; the geophysical signal |
| Scout ticket | — | — | — | Industry-shared informal report of a competitor's well status — not a regulatory filing |
| Mud log | — | — | — | Operator's daily lithology log from drilling cuttings — internal, sometimes shared |
The W-1 in detail (since Texas comes up a lot)
The W-1 — Application for Permit to Drill, Deepen, Plug Back, or Re-enter — is the canonical Texas drilling permit. A W-1 carries:
- Operator name and TX RRC operator number.
- Lease name and well number.
- Surface location (lat/long + Texas survey description).
- Bottom-hole location for directional/horizontal wells.
- Target formation and proposed total depth.
- Casing program: surface, intermediate, production casing depths and weights.
- API number, once assigned.
Once approved, the W-1 becomes the basis for the assigned API number and the well's regulatory identity. Variations on the form (W-1A through W-1G) handle specific cases: amendments, re-entry, deepening.
The long tail
Beyond the headline doc types, the long tail of upstream documents includes: well status reports, plugging affidavits, casing diagrams, directional surveys, hydraulic fracturing chemical disclosures (often via FracFocus), incident reports, NOIs (Notices of Intent), pooling orders, division orders, transfer-of-operatorship filings, and seismic survey notifications. Each one matters in some workflow.
Our document router classifies each fetched page into one of roughly 40 doc types. The first six types — permit, completion, production, plugging, log, and scout — cover most engagement scope. The long tail is added per client as engagements expand. Each new doc type adds an extractor and a schema, and trains the router slightly better.
Remember
- Permit, completion, production, plugging, log. Five doc classes carry most of the load.
- Form names differ by state but the conceptual content maps cleanly.
- Scout tickets and mud logs are industry-internal, not regulatory — different sources, different reliability.
- The W-1 in Texas, the 1002A in Oklahoma, the C-101 in New Mexico are the same conceptual document.
Operator software #
Operators run on a stack of specialized desktop and server software with decades of inertia behind it. Most operators have spent millions licensing, integrating, and training on this stack. Asking them to leave it is a non-starter. Asking them to feed it better data is everything.
The core stack
- Petrel (Schlumberger / SLB)
- The dominant geoscience interpretation platform. Used for log interpretation, structural modeling, reservoir simulation, well planning. Most large operators standardize on Petrel; many mid-sized ones do too. Petrel's Ocean SDK lets you build plugins (in C#) that can read and write project data — the integration surface for our outputs.
- Kingdom (S&P Global / IHS Markit)
- An alternative to Petrel for seismic and log interpretation. Strong among smaller operators and consulting shops. Different SDK, same general integration pattern.
- GeoGraphix (LMKR; formerly Landmark / Halliburton)
- Another mid-tier interpretation platform. Strong in some independent operator pockets, especially for well planning and mapping workflows.
- ArcGIS Pro (Esri)
- The dominant GIS platform across upstream. Lease maps, well locations, infrastructure overlays. ArcGIS Pro has a Python toolbox API that's the standard integration target for georeferenced data.
- Spotfire (TIBCO)
- The de facto upstream analytics platform. Production data analysis, decline curve fitting, type curve work, basic A&D dashboards. Spotfire data functions let you call out to external services (or pull from databases) in workflows.
- Power BI / Tableau
- Business analytics across the organization. Petroleum-specific extensions are common.
- Aries / PHDWin / ARIES Tools (Halliburton)
- Specialized reserves and economics platforms. Where decline curve analysis and reserve reports get built.
- Excel
- Despite the above, an enormous fraction of operator analysis still happens in Excel. Analysts paste data in, run macros, model offline. Useful to remember: a workflow that breaks “the spreadsheet” will not be adopted.
The well master
Most operators maintain a central database — the well master — that is the single source of truth for which wells the operator considers its own, plus metadata (operator, API number, surface location, status). It might be Postgres, SQL Server, a vendor product (Enverus, IHS, OneSpot), or a homegrown system. It is the integration point that matters most. Every well-indexed deliverable should resolve through the well master.
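In practice, resolving through the well master mostly means normalizing the API number before the lookup, because the same well shows up as 10, 12, or 14 digits, with or without dashes. A minimal sketch, assuming the well master is keyed on the 10-digit form (state, county, unique well); the dictionary and the well in it are invented.

```python
import re


def api10(raw):
    """Normalize an API number to its 10-digit well-level key.

    Digits 1-2 are the state code, 3-5 the county code, 6-10 the unique
    well number. Digits 11-14, when present, identify sidetracks and events.
    """
    digits = re.sub(r"\D", "", raw)
    if len(digits) not in (10, 12, 14):
        raise ValueError(f"unexpected API number length: {raw!r}")
    return digits[:10]


def resolve(raw, well_master):
    """Look a well up in the well master; None means it is not one of ours."""
    return well_master.get(api10(raw))


# Hypothetical well master keyed by 10-digit API number.
well_master = {"4230112345": {"name": "EXAMPLE UNIT 1H", "status": "producing"}}

print(resolve("42-301-12345-00-00", well_master))
print(resolve("4230112345", well_master))
print(resolve("35-017-00001", well_master))  # None
```

Whether to key on 10 or 14 digits is a design choice: the 10-digit key groups every sidetrack and recompletion under one surface hole, which suits most deliverables but loses wellbore-level detail.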
The Workflow Embedding chapter of the business plan is largely about these integration surfaces. Petrel plugin, Spotfire data functions, ArcGIS toolbox, well master sync. Every plugin shipped is reusable across operators on the same tool stack — that's the integration flywheel.
Remember
- Petrel dominates geoscience interpretation. Most operators have it.
- ArcGIS Pro dominates GIS. Spotfire dominates production analytics.
- The well master is the operator's source of truth for “our wells.”
- Excel still runs much of the actual analytical work, despite everything above. Plan for it.
Glossary #
Quick alphabetical reference. Cross-references point to the chapter where each term is treated in depth.
- A&D
- Acquisition and divestiture. Buying and selling oil and gas properties. 14.
- Abstract
- Chain-of-title summary for a piece of acreage. 07.
- APD
- Application for Permit to Drill. Filed before drilling. 18.
- API number
- The unique identifier for every US well, in the format defined by the American Petroleum Institute: state code, county code, then a well number. 06.
- Archie's equation
- The foundational saturation equation relating resistivity to water content. 10.
- Bakken
- The oil-producing shale of the Williston Basin, North Dakota. 03.
- Basin
- A regional geological feature where source, reservoir, trap, and seal coincide. 02.
- BHL
- Bottom-hole location — where the wellbore ends in 3D space, as opposed to the surface location.
- BLM
- Bureau of Land Management. Federal onshore minerals regulator. 17.
- BOE
- Barrel of oil equivalent. 6 mcf of gas = 1 BOE on an energy basis. 12.
- BOEM
- Bureau of Ocean Energy Management. Federal offshore minerals regulator.
- Borrowing base
- The reserves-collateralized credit limit set by a bank under an RBL. 13.
- Casing
- Steel pipe run and cemented inside a wellbore to stabilize and isolate it. 08.
- CIM
- Confidential information memorandum. The seller's full pitch document in an A&D process.
- Completion
- The stage that prepares a drilled well to flow. 11.
- CRS
- Coordinate reference system. The datum and projection that tie coordinates to real-world positions. Georeferencing an image means assigning it a CRS. 16.
- Curative
- Fixing defects in title so a lease can be cleared. 07.
- DCA
- Decline curve analysis. 12.
- Division order
- The instrument that defines who gets paid what fraction of a well's revenue. 07.
- E&P
- Exploration and production. Synonym for “upstream operator.”
- EUR
- Estimated ultimate recovery. Total lifetime production for a well. 12.
- Formation top
- The depth at which a named formation begins. 02.
- Frac (fracking)
- Hydraulic fracturing of the rock to create permeability. 11.
- Gamma ray (GR)
- Natural radioactivity log. Separates shale from clean rock. 09.
- Geosteering
- Real-time trajectory adjustment of a horizontal well using LWD data. 08.
- GeoTIFF
- Raster image format with embedded CRS metadata. 16.
- HBP
- Held by production. The status of a lease kept in force past its primary term for as long as a well on it produces. 07.
- Horizontal well
- A well with a lateral section that runs nearly parallel to the bedding plane of the target formation. 08.
- IP (IP30, IP90)
- Initial production rate, often averaged over the first 30 or 90 days. 12.
- Isopach
- Contour map of formation thickness. 16.
- Landman
- The land services professional who negotiates leases and researches title. 07.
- LAS
- Log ASCII Standard. The standard file format for digital well log data. 09.
- Lateral
- The horizontal portion of a horizontal well, inside the target. 08.
- Lease
- Contract granting an operator the right to extract minerals from a piece of acreage. 07.
- LWD / MWD
- Logging While Drilling / Measurement While Drilling. Real-time log data from tools near the bit. 08.
- Marcellus
- The dry-gas shale of Pennsylvania, West Virginia, Ohio, and New York. 03.
- Mineral estate
- The legal right to extract subsurface minerals. Separable from surface ownership. 07.
- NDT / NDE
- Non-destructive testing / examination. Inspecting equipment without damaging it. 15.
- Net pay
- The thickness of rock that is commercially productive in a given well or interval. 10.
- NGL
- Natural gas liquids — ethane, propane, butanes, pentanes. Separated from raw gas in processing.
- NOI
- Notice of intent. A class of regulatory pre-filing.
- OCC
- Oklahoma Corporation Commission. The state's oil and gas regulator. 17.
- OCD
- Oil Conservation Division. New Mexico's oil and gas regulator. 17.
- Offset well
- A well near a subject well, used for analog data when planning a new one.
- Operator
- The E&P company whose name is on the lease and the regulatory filings.
- P&A
- Plug and abandonment. Closing out a depleted well safely. 04.
- Pay zone
- An interval of rock that is commercially productive. 10.
- PDP / PDNP / PUD
- Reserve categories. 13.
- Permian Basin
- West Texas / southeast New Mexico. The largest US producing basin. 03.
- Plug and perf
- The dominant unconventional completion method. 11.
- PLSS
- Public Land Survey System. Federal grid for land descriptions. 05.
- Pooling
- Combining multiple leases into a drilling unit. 07.
- Porosity (ϕ)
- Fraction of rock that is pore space. 10.
- Proppant
- Sand or ceramic beads pumped into hydraulic fractures to keep them open. 11.
- PV-10
- Present value of reserve cash flows discounted at 10% pre-tax. 13.
- RBL
- Reserve-based loan. Bank financing secured by proved reserves. 13.
- Reservoir rock
- Porous, permeable rock that can hold and flow hydrocarbons. 02.
- Resistivity
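- How strongly the rock resists electric current. Hydrocarbons are resistive and saline formation water is conductive, so high resistivity in porous rock is a hydrocarbon indicator. 09.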
- ROW
- Right of way. Easement for pipelines, roads, or infrastructure across surface land. 07.
- RRC
- Railroad Commission of Texas. State oil and gas regulator. 17.
- SAM2
- Segment Anything Model v2. The image segmentation foundation model our curve tracer extends.
- SCOOP / STACK
- Major shale plays in central Oklahoma. 03.
- Scout ticket
- Informal industry-shared report on competitor well status. 18.
- Seal
- An impermeable rock layer that traps hydrocarbons in the reservoir. 02.
- Section
- One square mile (640 acres) under the PLSS grid. 05.
- SI
- Shut-in. A well that's capable of production but temporarily closed.
- SONRIS
- Louisiana's regulatory portal. 17.
- Source rock
- The organic-rich sediment in which hydrocarbons formed. 02.
- SP
- Spontaneous potential. A log measurement that helps identify permeable beds. 09.
- Spud
- The moment drilling begins on a well.
- Stage
- A discrete section of lateral that is fractured as a unit. 11.
- Surface estate
- The legal right to use the land surface, separable from minerals. 07.
- Title opinion
- An attorney's reading of who owns what fraction of the minerals. 07.
- Township
- 6×6 mile box under the PLSS grid. 05.
- Type curve
- Averaged decline curve for a class of wells. 12.
- Wellhead
- The surface equipment at the top of the wellbore.
- Well master
- The operator's central database of all wells it considers its own. 19.
- Wolfcamp
- Major productive formation in the Permian Basin.
- Workover
- Maintenance operation on an existing producing well.
- W-1
- Texas's drilling permit form. 18.
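A few of the entries above carry arithmetic worth seeing once. The sketch below uses hypothetical function names. It applies the 6 mcf = 1 BOE convention from the BOE entry and the nominal PLSS sizes from the Section and Township entries. Real sections are often irregular, so treat the acreage as nominal rather than surveyed.

```python
MCF_PER_BOE = 6.0           # glossary convention: 6 mcf of gas = 1 BOE (energy basis)
ACRES_PER_SECTION = 640     # one square mile under the PLSS grid
SECTIONS_PER_TOWNSHIP = 36  # a 6 x 6 mile township, one section per square mile


def to_boe(oil_bbl: float, gas_mcf: float) -> float:
    """Combine oil (barrels) and gas (mcf) volumes into barrels of oil equivalent."""
    return oil_bbl + gas_mcf / MCF_PER_BOE


def township_acres() -> int:
    """Nominal acreage of a full township."""
    return ACRES_PER_SECTION * SECTIONS_PER_TOWNSHIP


if __name__ == "__main__":
    print(to_boe(oil_bbl=1000, gas_mcf=6000))  # 1000 bbl of oil plus 6000 mcf of gas
    print(township_acres())
```

Note that the 6:1 ratio is an energy equivalence, not a price equivalence. A BOE of gas and a BOE of oil rarely sell for the same amount.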