Inflection Point Engineering Scripts & Utilities

P&ID Tag Extractor

Extracts equipment tags, line numbers, and ISA-5.1 instrument tags from raw P&ID text using regex pattern matching, then classifies each instrument by its first letter (measured variable) and succeeding letters (function or modifier).
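As a rough illustration of the classification step, here is a minimal Python sketch. The lookup tables are a deliberately abbreviated subset of the ISA-5.1 identification letters, and the names FIRST_LETTER, SUCCEEDING_LETTERS, and classify_instrument are hypothetical, not taken from the actual script. It also assumes every letter after the first is a function letter, so first-letter modifiers (for example D for differential, or S for safety in PSV) are not handled.

```python
# Abbreviated, illustrative subset of ISA-5.1 identification letters.
# The real tool may use a fuller table; these names are hypothetical.
FIRST_LETTER = {
    "A": "Analysis",
    "F": "Flow",
    "H": "Hand",
    "L": "Level",
    "P": "Pressure",
    "T": "Temperature",
}

SUCCEEDING_LETTERS = {
    "A": "Alarm",
    "C": "Control",
    "E": "Element",
    "H": "High",
    "I": "Indicate",
    "L": "Low",
    "R": "Record",
    "S": "Switch",
    "T": "Transmit",
    "V": "Valve",
}


def classify_instrument(letters):
    """Split a tag's letter group into (variable, [functions]).

    Assumption: the first letter is the measured variable and every
    remaining letter is a function letter; modifiers are not resolved.
    """
    variable = FIRST_LETTER.get(letters[0], "Unknown")
    functions = [SUCCEEDING_LETTERS.get(ch, "Unknown") for ch in letters[1:]]
    return variable, functions


print(classify_instrument("FIC"))  # flow indicating controller
print(classify_instrument("LAH"))  # level alarm high
```

Letters missing from the tables come back as "Unknown" rather than raising, which keeps the sketch tolerant of noisy OCR text.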

Server-side limitation: the full Python tool runs Tesseract OCR on PDF and image P&IDs to produce raw text, a step this page cannot perform in-browser without a server component. The page ports only the regex extraction and ISA classification logic. Paste OCR output, a plain-text P&ID annotation export, or any free-text blob containing tags, and the tool will extract and categorize them. For OCR-from-image workflows, use the Python script under 05_Scripts/pid_tag_extractor/.

Patterns:
Equipment: ([A-Z]{1,3})-(\d{2,5})([A-Z](?:/[A-Z])?)?
Lines: (\d{1,2})"-([A-Z]{1,4})-(\d{2,5})(?:-(\d{2,5}))?
Instruments: ([A-Z]{2,5})-(\d{2,5})([A-Z])?

Instruments are classified by ISA-5.1 first letter (variable) and subsequent letters (function / modifier). An equipment prefix must match a known list (P, C, K, V, D, E, F, H, R, T, TK, A, B, G, M, X, EJ, FI, MX, AG) to reduce false positives.
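The three patterns and the prefix filter can be exercised with a short Python sketch like the one below. The pattern bodies and the prefix list are copied from the text above; everything else is an assumption made for illustration, including the function name extract_tags and the lookaround guards, which are added here so that a tag is not matched inside a longer token such as a line number. The text does not say how the tool treats two-letter equipment prefixes (TK, FI, and so on) that also satisfy the instrument pattern, so this sketch simply reports such tags in both lists.

```python
import re

# Assumption: guards that stop a match from starting or ending mid-token.
# They are not part of the patterns quoted in the text.
GUARD_LEFT = r'(?<![A-Za-z0-9"-])'
GUARD_RIGHT = r'(?![A-Za-z0-9])'

EQUIPMENT_RE = re.compile(
    GUARD_LEFT + r'([A-Z]{1,3})-(\d{2,5})([A-Z](?:/[A-Z])?)?' + GUARD_RIGHT
)
LINE_RE = re.compile(
    r'(?<![0-9])(\d{1,2})"-([A-Z]{1,4})-(\d{2,5})(?:-(\d{2,5}))?' + GUARD_RIGHT
)
INSTRUMENT_RE = re.compile(
    GUARD_LEFT + r'([A-Z]{2,5})-(\d{2,5})([A-Z])?' + GUARD_RIGHT
)

# Known equipment prefixes, as listed in the text.
KNOWN_EQUIPMENT_PREFIXES = {
    "P", "C", "K", "V", "D", "E", "F", "H", "R", "T", "TK",
    "A", "B", "G", "M", "X", "EJ", "FI", "MX", "AG",
}


def extract_tags(text):
    """Return equipment, line, and instrument tags found in raw text."""
    equipment = [
        m.group(0)
        for m in EQUIPMENT_RE.finditer(text)
        if m.group(1) in KNOWN_EQUIPMENT_PREFIXES  # drop unknown prefixes
    ]
    lines = [m.group(0) for m in LINE_RE.finditer(text)]
    # No disambiguation against equipment: overlapping tags appear in both.
    instruments = [m.group(0) for m in INSTRUMENT_RE.finditer(text)]
    return {"equipment": equipment, "lines": lines, "instruments": instruments}


sample = 'Pump P-101A/B feeds V-200 via 6"-PW-1001-150. FIC-101 and PSV-2001A protect TK-301.'
for kind, tags in extract_tags(sample).items():
    print(kind, tags)
```

In this sample, TK-301 shows up under both equipment and instruments, which is the overlap described above; a caller would need its own precedence rule to settle such cases.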