Transaction Forensics — Enterprise Process Analysis

Structured data tells you what happened.
Unstructured text tells you why.

Transaction Forensics is a process mining engine that combines ERP transaction logs with the unstructured text that surrounds them — emails, Slack messages, progress reports, call transcripts, and order notes — to surface the discrepancies between what organizations report and what actually happened. Built as a Stanford TECH 41 project.

The Core Insight

Every enterprise system generates two kinds of data. Structured transactions — timestamps, amounts, stage changes, user IDs — tell you the official story. Unstructured text — the emails, Slack threads, meeting notes, timesheets, SOWs, and progress reports that surround those transactions — tell you what actually happened. The gap between them is where fraud, waste, and dysfunction hide.

Structured Data Says

CRM OPPORTUNITY

"Deal in Negotiation for 6 months"

SAP P2P

"Purchase Order created 03/15"

TIMESHEET

"40 hours billed to Project Alpha"

PROJECT STATUS

"Phase 2: On Track, Green"

Unstructured Text Reveals

SLACK THREAD

"Customer said not ready — but Sales moved it forward anyway. No documented sign-off."

EMAIL CHAIN

"Requisition wasn't approved yet. Create the PO now, we'll get the paperwork later."

PROGRESS REPORT

"Assigned to Project Alpha but worked on Beta all week. SOW deliverables not started."

MEETING TRANSCRIPT

"We're 3 weeks behind. Tell the client we're on track while we figure this out."

Evidence From Our Analysis

SAP IDES: Retroactive Documentation

Structured data shows PO and PR both exist. Timestamps reveal the PO was created before the PR — approval was documented after the fact. Only detectable by cross-referencing temporal sequence.

See IDES Compliance tab →

HERB: Approvals in Slack, Not Systems

37,064 enterprise documents analyzed (32.8K Slack messages, 3.6K pull requests, 400 docs, 321 transcripts). 1,226 "LGTM/Approved" messages found in Slack channels — informal approvals with no audit trail. Structured approval workflows show no record of these decisions.

See NLP Patterns tab →

BPI: 57K Payment Blocks — Why?

Structured data shows 22.7% of POs hit payment blocks. The event log can't explain why — that answer lives in vendor correspondence, invoice discrepancy notes, and buyer emails that aren't in the event log.

See BPI Challenge tab →

20 years of ERP consulting taught me this: In every engagement where something went wrong — billing disputes, project failures, compliance gaps — the structured transaction data looked clean. The truth was always in the unstructured layer: the email where someone said "skip the approval," the timesheet that didn't match the progress report, the SOW deliverable that was marked complete but never started. This tool automates finding those discrepancies at scale.

System Architecture

Data Sources

SAP ERP (IDES/ECC)

Salesforce CRM

BPI Challenge (XES)

Slack / HERB Comms

NetSuite ERP

CSV / Custom

→

7 Adapters

IDataAdapter
interface
Normalize to
unified event log

BPI, CSV, ECC,
S4, SALT, SFDC,
Synthetic

→

Analysis Engines

Conformance Checker

Token-based replay
van der Aalst algorithms
Process model builder

Temporal Analyzer

Throughput times
Bottleneck detection
Delay probability

Pattern Clustering

TF-IDF + K-Means
Effect sizing (Cohen's d)
Stability bootstrap

Cross-System Resolver

Entity matching
Levenshtein + proximity
Gap detection

→

Forensic Output

Compliance
Violations

Bottleneck
Reports

Pattern
Cards

Evidence
Ledger

TypeScript (MCP Server)

Python 3.11 (Pattern Engine)

834 tests (602 TS + 232 Py)

Deterministic (seed=42)

Zero frontend dependencies

Four Forensic Lenses

8,800

CRM Pipeline Forensics

Sales opportunity analysis — win rates, velocity patterns, agent performance, quarter-end compression. Kaggle real-world CRM data.

Explore →

251K

BPI Challenge 2019

Real purchase-to-pay from a multinational. 1.6M events, payment blocks, process variability, resource concentration risks.

Explore →

SAP IDES Compliance

Compliance violations in SAP's own demo system. Maverick buying, retroactive documentation, segregation of duties risks.

Explore →

37K

NLP Pattern Analysis

Salesforce HERB — 37K documents (Slack, PRs, transcripts) clustered into communication patterns. Network graphs, bridge users, team dynamics.

Explore →

Real-World Client Data

3 anonymized engagements — 3M+ ERP records, $103K savings, ITGC violations, 28.6% RMA rate

License audit + ticket forensics + high-growth hardware company ERP forensics with credit hold overrides and SOD violations

View Cases →

Design Principles

Adapter Pattern

Every data source implements IDataAdapter — normalize once, analyze everywhere. Adding a new ERP means writing one adapter, not rewriting analysis logic. Currently: SAP, Salesforce, NetSuite, BPI (XES/OCEL), CSV, synthetic.

Deterministic Reproducibility

All analysis uses seed=42. Every pattern card, every cluster, every statistical test can be reproduced exactly. Run make demo and get identical output. No non-determinism in the forensic chain.

Evidence-Based Findings

Every claim links to an evidence ledger entry with source files, row counts, timestamps, and reproducibility parameters. Effect sizes use Cohen's d with 95% CI. Weak results are labeled as exploratory — no overclaiming.

How AI Was Used — Honest Accounting

What I Did (Christopher Bailey)

• Defined the problem space and research questions
• Selected data sources and licensed datasets
• Designed the adapter architecture and analysis pipeline
• Chose conformance algorithms (van der Aalst token replay)
• Interpreted findings and wrote forensic narratives
• Determined what's a real finding vs. a statistical artifact
• Real-world client engagement and domain expertise (20 yrs ERP)

What Claude Code Did (AI Pair-Programmer)

• Implemented data adapters and parsers (TypeScript)
• Built pattern engine and clustering pipeline (Python)
• Wrote conformance checking engine
• Generated test suites (834 tests)
• Built this dashboard (vanilla HTML/CSS/JS)
• Statistical computations (effect sizing, CI, p-values)
• All code visible in git history with co-author tags

The honest version: Claude Code is a force multiplier. The 834-test, 7-adapter, 4-engine system you see here was built in weeks, not months. But the AI doesn't know what's worth finding — it doesn't know that a PO-before-PR is a Sarbanes-Oxley risk, or that 22.7% payment block rates are 4x industry norms. Domain expertise decides what to look for; AI makes looking fast.

Data Sources

Kaggle CRM Sales Opportunities — Apache 2.0
BPI Challenge 2019 — 4TU.ResearchData, CC BY 4.0
SAP IDES — sap-extractor, MIT License
Salesforce HERB — HuggingFace, CC-BY-NC-4.0
Client data — anonymized, used with permission

Reproduce This

git clone github.com/chrbailey/SAP-Transaction-Forensics
make demo # one-command bootstrap
make test # 834 tests
make demo-kaggle # real Kaggle CRM data
View on GitHub →

Pipeline Overview

63.2%

Win Rate

4,238 Won / 6,711 Closed

$2,361

Avg Deal Size

Across all closed-won deals

57 days

Median Velocity

Time from open to close

2,089

Open Pipeline

Opportunities in progress

Conformance Analysis

New Business Pipeline Model

Expected stage sequence for opportunity progression

Prospecting

→

Qualification

→

Needs Analysis

→

Value Proposition

→

Id. Decision Makers

→

Perception Analysis

→

Proposal / Price Quote

→

Negotiation / Review

→

Closed Won

0.05

Avg Fitness Score

Average fitness 0.05 — expected when real-world CRM data is measured against an aspirational 8-stage model. Most organizations skip stages; low conformance is typical, not alarming.

53,862

Deviations Detected

Stage skips, reversals, and out-of-sequence transitions detected across 8,300 opportunities (closed + open) with stage history.

8,300

Cases Analyzed

All opportunities with stage history (closed + open) analyzed for sequence conformance against the defined process model.

Quarter-End Compression

Monthly close distribution — QE months highlighted in orange

Jan

—

Feb

—

Mar

647

Apr

586

May

805

Jun

641

Jul

627

Aug

785

Sep

635

Oct

566

Nov

768

Dec

651

38.4% close in quarter-end months (Mar/Jun/Sep/Dec) — 1.15× the expected baseline of 33.3% (4 of 12 months). Slight concentration but within normal range for B2B sales cycles.

Deal Velocity Distribution

Time-to-close buckets — closed won opportunities only

0–30 days

1,817

42.9%

31–60 days

377

8.9%

61–90 days

1,078

25.4%

91–180 days

966

22.8%

Velocity Insights

Median Days

4,238

Deals Measured

The bimodal distribution — 43% closing in under 30 days, 23% taking 91–180 days — suggests two distinct deal types: quick transactional sales and extended enterprise negotiations.

Top Agents by Win Rate

Minimum 50 closed opportunities — top 10 performers

Agent	Win Rate	Won	Closed	Revenue
1Hayden Neloms	70.4%	107	152	$272K
avg deal size $2,542 · deals/month: ~8.9 · win rate 5.4 pts above team avg
2Maureen Marcano	70.0%	149	213	$350K
avg deal size $2,349 · deals/month: ~12.4 · win rate 5.0 pts above team avg
3Wilburn Farren	69.6%	55	79	$158K
avg deal size $2,873 · deals/month: ~4.6 · win rate 4.6 pts above team avg
4Cecily Lampkin	66.9%	107	160	$230K
avg deal size $2,150 · deals/month: ~9.4 · win rate 1.9 pts above team avg
5Versie Hillebrand	66.7%	176	264	$188K
avg deal size $1,068 · deals/month: ~15.5 · high volume, lower avg deal
6Moses Frase	66.2%	129	195	$207K
avg deal size $1,605 · deals/month: ~11.5 · win rate 1.2 pts above team avg
7Boris Faz	66.0%	101	153	$262K
avg deal size $2,594 · deals/month: ~9.0 · win rate 1.0 pts above team avg
8James Ascencio	65.5%	135	206	$414K
avg deal size $3,067 · deals/month: ~12.1 · highest avg deal in top 10
9Corliss Cosme	65.5%	150	229	$421K
avg deal size $2,807 · deals/month: ~13.5 · strong volume + high deal size
10Reed Clapper	65.4%	155	237	$438K
avg deal size $2,826 · deals/month: ~13.9 · top revenue generator in the table

Product Revenue Mix

Closed-won revenue by product — total $10,005K

GTXPro

$3,510K

35.1%

GTX Plus Pro

$2,630K

26.3%

MG Advanced

$2,216K

22.2%

GTX Plus Basic

$705K

7.1%

GTX Basic

$499K

5.0%

GTK 500

$401K

4.0%

MG Special

0.4%

Concentration

Top 3 products (GTXPro, GTX Plus Pro, MG Advanced) account for 83.6% of total revenue. MG Special at $44K represents tail inventory with minimal impact.

Account Concentration

Top Accounts by Revenue

Closed-won revenue across 85 accounts

Kan-code

$341K

Konex

$269K

Condax

$206K

Cheers

$198K

Hottechi

$195K

Diversification Assessment

CR5 concentration ratio (top-5 revenue share)

12.1%

Top 5 Share

Total Accounts

Healthy distribution. Top 5 accounts represent only 12.1% of revenue — well below typical key-account concentration risk thresholds. No single customer dependency.

Key Findings

Quarter-End Concentration

Low

38.4%

38.4% of closed deals close in quarter-end months (Mar/Jun/Sep/Dec), vs. a 33.3% uniform baseline (4 of 12 months). The 1.15× ratio suggests mild concentration — typical in B2B sales where fiscal quarters influence buyer and seller timing. Not a strong anomaly signal on its own.

Stage Conformance Gap (Expected)

Expected

Avg fitness: 0.05

This is an expected result, not an anomaly. The 8-stage New Business pipeline (Prospecting → Qualification → Needs Analysis → ... → Negotiation/Review → Closed Won) is an aspirational reference model, not an operational requirement. Real-world CRM data typically records only 2-4 stages per deal. The 53,862 "missing_activity" deviations across 8,300 opportunities (closed + open) quantify the gap between prescribed process and actual practice — useful for process improvement, but not indicative of control failures. Average fitness score: 0.05.

Agent Performance Spread

Low

15-pt spread

Win rates among active agents range from approximately 55% to 70.4%, a 15-point spread. Top performers (Hayden Neloms, Maureen Marcano) sustain 70%+ win rates across 150–200+ closed deals, suggesting reproducible behavioral patterns worth codifying as playbooks.

Account Diversification

Low

12.1% top-5

The top 5 accounts (Kan-code, Konex, Condax, Cheers, Hottechi) represent only 12.1% of total closed-won revenue across 85 accounts. Revenue is broadly distributed, reducing customer concentration risk. No single account exceeds 3.4% of total revenue.

CSV Ingest

→

SFDC Normalize

→

Event Log Build

→

Conformance Check

→

Pattern Detection

→

Report

Stage 1: CSV Ingest

Source: Kaggle CRM Sales Opportunities (innocentmfa)
Files: sales_pipeline.csv (8,800 rows), accounts.csv (85), products.csv (7), sales_teams.csv (35)
Converter: convert_kaggle_crm.py maps CSV fields to SFDC JSON schema

Stage 2: SFDC Normalization

Adapter: SFDCSyntheticAdapter implements IDataAdapter (8 methods)
Field mapper: Opportunity.Id → VBELN, Account.Id → KUNNR, Amount → NETWR
Pipeline models: Prospecting → Qualification → Needs Analysis → ... → Closed Won
Stage mapping: "Engaging" → "Qualification", "Won" → "Closed Won", "Lost" → "Closed Lost"

Stage 3: Event Log Construction

Records: 49,408 events from 8,800 opportunities
Stage transitions: 31,787 entries (with synthetic intermediate stages for Won deals)
Activities: 17,621 task records (product-related subjects)
Format: case_id, activity, timestamp, resource, attributes

Stage 4: Conformance Checking

Algorithm: Token-based replay (van der Aalst, 2016)
Model: sfdc_new_business — 8-stage pipeline with mandatory transitions
Cases: 8,300 analyzed | Fitness range: 0.00 – 0.10 (real data lacks full stage coverage)
Deviations: 53,862 total (45,386 missing_activity, 8,476 skipped_activity)

Stage 5: Pattern Detection

Quarter-end compression: 38.4% of closes in QE months (1.15x the 33.3% baseline)
Deal velocity bimodal: 42.9% close within 30 days, 22.8% take 91-180 days
Agent spread: 15-point win rate range (55%–70%) across 30 agents
Account concentration: Top 5 = 12.1% of revenue (healthy diversification)

Technology Stack

MCP Server: TypeScript (ESM, strict mode), 7 data adapters
Pattern Engine: Python 3.11, scikit-learn, scipy
Conformance: Token-based replay, ProcessModelBuilder, van der Aalst algorithms
Cross-System: Entity resolver (Levenshtein + proximity), unified event log
Tests: 834 passing (602 TypeScript + 232 Python)
Data: Kaggle CRM Sales Opportunities (Apache 2.0 license)
Frontend: Vanilla HTML/CSS/JS (zero dependencies)
Deployment: Vercel (static)
Source: github.com/chrbailey/SAP-Transaction-Forensics

Cluster Quality Note: Global silhouette score is 0.028 (KMeans) / 0.09 (BERTopic), indicating weak cluster separation. Findings should be treated as exploratory signals, not confirmed patterns. See Pipeline Transparency for methodology details. HERB dataset timestamps extend to 2027 (synthetic); temporal patterns reflect relative timing.

⏳

Loading forensic analysis...

View Analysis Code on GitHub

This is an analysis tool, not a hosted service. Clone the repo to run the pattern engine on your own Slack exports, case comments, or CRM data with text fields and timestamps.

View on GitHub →

Data Source: BPI Challenge 2019 — 4TU.ResearchData (CC BY 4.0). Real purchase-to-pay event log from a multinational coatings and paints company. 251,734 purchase order items, 1,595,923 events, 628 unique resources. Period: Jan 2018 — Jan 2019.

Process Overview

64.3

Median Throughput (days)

Average 72.3 days; max 25,670 days (stale POs)

32%

Top 2 Variants Coverage

Only 2 of hundreds of paths cover a third of cases

57,136

Payment Blocks

22.7% of all POs required manual block removal

2,835

Incomplete Cases

POs that never progressed past creation

Most Common Process Paths

#	Process Path	Cases	%
1	Create PO → Vendor Invoice → Goods Receipt → Invoice Receipt → Clear Invoice	50,286	20.0%
2	Create PO → Goods Receipt → Vendor Invoice → Invoice Receipt → Clear Invoice	30,798	12.2%
3	Create PO → Goods Receipt (incomplete)	9,443	3.8%
4	Create PO → Vendor Invoice → GR → IR → Remove Payment Block → Clear	6,931	2.8%
5	Create PO (abandoned — never progressed)	2,835	1.1%

Key finding: Variants 1 and 2 differ only in whether the vendor invoice arrives before or after goods receipt — a classic "3-way match" ordering question in P2P. Variant 4 shows 2.8% of cases require payment block removal, indicating invoice discrepancies.

Anomalies Detected

Process Issues

Payment blocks requiring intervention57,136

Quantity changes after PO creation21,449

Price changes after PO creation12,423

Deleted / cancelled orders5,298

Abandoned POs (single event)2,835

Resource Concentration

Top 10 resources handle 63% of all 1.6M events, creating operational risk if key personnel are unavailable.

System

25.0%

user_002

10.4%

user_001

6.0%

batch_001

4.6%

user_002 alone handles 166,353 events (10.4%). Single-person bottleneck risk.

Document & Matching Types

Purchase Order Types

Standard PO152,562 (60.6%)

Framework Order62,543 (24.9%)

Consignment36,629 (14.6%)

Invoice Matching Strategy

3-way match, invoice after GR164,874 (65.5%)

3-way match, invoice before GR67,583 (26.9%)

2-way match19,277 (7.7%)

3-way matching compares PO, goods receipt, and invoice. The 26.9% "invoice before GR" cases are where vendors bill before physical delivery — common but higher fraud risk.

Forensic Insight

This dataset comes from a real multinational company (anonymized for BPI Challenge 2019). The process mining community uses it as a benchmark for purchase-to-pay analysis. Our forensic engine processed the full 1.6M event log and surfaced several structural concerns:

Payment blocks (22.7%) — Nearly 1 in 4 purchase orders hit a payment block requiring human intervention. This signals systematic issues in invoice matching or vendor master data quality. In a healthy P2P process, payment block rates should be under 5%.

Process variability — With the top 2 variants covering only 32% of cases, the remaining 68% follow hundreds of different paths. This "spaghetti process" pattern makes it difficult to automate, audit, or optimize. Industry benchmarks target 80%+ coverage in the top 5 variants.

Resource concentration — user_002 processes 10.4% of all events. If this person is unavailable (vacation, resignation), the bottleneck could cascade across the entire P2P process. This is a classic single-point-of-failure that process mining can identify but traditional audits miss.

Compliance Alert: 7 process violations detected in SAP's own IDES demo system. IDES (Internet Demonstration and Evaluation System) is SAP's official reference environment used for training and certification. These violations exist in SAP's reference data — our engine found them through automated conformance checking.

Violations Detected

Total Violations

In SAP's own reference data

Missing Purchase Requisition

PO created without prior approval workflow

Retroactive Documentation

PO created before PR — approval was backdated

Segregation of Duties Risk

Same resource created both PR and PO

Violation Breakdown

HIGH MISSING_PR — Purchase Order Without Purchase Requisition 6 cases

Six purchase orders were created directly without an associated Purchase Requisition. In a compliant Procure-to-Pay process, every PO must originate from an approved PR to ensure proper authorization and budget control.

Root Cause

Maverick buying — users bypassing requisition approval workflow to create POs directly

Business Risk

Unauthorized spending, budget overruns, audit findings. Bypasses approval controls designed to prevent fraud.

Remediation

Enforce system control: block PO creation without linked PR. Add validation rule in SAP transaction ME21N.

CRITICAL PO_BEFORE_PR — Retroactive Approval Documentation 1 case

One Purchase Order was created before its associated Purchase Requisition. This means the order was placed first and the approval was documented after the fact — a clear compliance violation where the authorization workflow was bypassed and retroactively papered over.

Root Cause

Approval workflow bypassed — order placed first, requisition created afterward to satisfy documentation requirements

Business Risk

Fraudulent documentation, Sarbanes-Oxley violations, audit failure. Represents intentional control circumvention.

Remediation

Implement sequential control: system must enforce PR approval timestamp < PO creation timestamp. Add automated alert.

MEDIUM SOD_VIOLATION — Segregation of Duties Risk 1 case

One case where the same resource created both the Purchase Requisition (request) and the Purchase Order (fulfillment). Proper segregation of duties requires different individuals to request and approve purchases to prevent self-dealing.

Root Cause

Missing role separation — user has authorization for both ME51N (create PR) and ME21N (create PO)

Business Risk

Self-dealing, fictitious vendor schemes. One person can request, approve, and fulfill without oversight.

Remediation

Review SAP role assignments (PFCG). Ensure PR creator and PO creator roles are mutually exclusive.

IDES Process Comparison — O2C vs P2P

Order-to-Cash (O2C)

Sales side — from customer order to payment

Cases analyzed646

Total events5,708

Unique activities8

Process variants158

Median duration2.7 days

Max duration6,578 days

Normal completion rate83.6%

Invoice cancellation delay17.7 days avg

Key finding: 158 unique process variants for just 8 activities and 646 cases — extreme variability. The max duration of 6,578 days (18 years) indicates stale/orphaned orders in the demo system that were never closed.

Procure-to-Pay (P2P)

Purchasing side — from requisition to vendor payment

Cases analyzed2,486

Total events7,420

Unique activities20

Process variants142

Average duration45.2 days

Max duration1,027 days

Compliance violations7

Batch processing outlier2,181 events

Key finding: 7 compliance violations in SAP's own demo data. The "PO before PR" case is particularly notable — it represents retroactive documentation, a pattern that in production systems is a red flag for fraud investigators.

Why This Matters

SAP IDES is not production data — it's SAP's official demo and training environment. Thousands of consultants learn SAP using this system. Yet our automated conformance checker found 7 compliance violations that exist in the reference data itself.

This demonstrates two things: (1) Automated process mining catches what manual review misses, even in well-known systems. (2) If reference data contains these patterns, production systems — with real users under real deadline pressure — almost certainly contain more.

The conformance checking engine uses token-based replay (van der Aalst algorithm) to compare actual event sequences against expected process models. For P2P, the expected model requires: PR → PO → Goods Receipt → Invoice → Payment. Any deviation is flagged, measured, and classified by severity.

The O2C analysis reveals a different problem: 158 process variants from just 8 activities. This is a "spaghetti process" — technically functional but impossible to audit or optimize at scale. Combined with the 6,578-day max duration (stale orders from the 1990s still open), it paints a picture of a system that works but accumulates technical debt in its process layer.

Analysis Methodology

Conformance Engine: Token-based replay, ProcessModelBuilder, van der Aalst algorithms
Data Adapters: BPI (XES/OCEL), SAP IDES (sap-extractor, MIT), Synthetic (seed=42)
Pattern Engine: Python 3.11, scikit-learn, scipy (TF-IDF + K-Means + effect sizing)
Temporal Analysis: Throughput time, bottleneck detection, delay probability
Tests: 834 passing (602 TypeScript + 232 Python)
Source: github.com/chrbailey/SAP-Transaction-Forensics

Client Data Notice: All data on this page comes from real consulting engagements. Company names, individual names, and email addresses have been anonymized. Financial figures, ticket counts, and category distributions are actual. Used with permission for educational purposes.

Case 1 — Healthcare Company: NetSuite License Optimization

Engagement: ERP User License Audit

289-user NetSuite environment — automated classification found $103,896 in annual savings

14.4x

ROI

289

Total Users

Eliminable

$103,896

Annual Savings

0.8 mo

Payback Period

Savings by Category

Dormant full-access (8 users, no login 90+ days)$46,464

Departed employee center (est. 53)$31,800

Approval-only users (4, replace w/ SuiteFlow)$23,232

Deprecated integrations (est. 4 of 8)$2,400

What Structured Data Shows vs. What We Found

Structured: NetSuite user list shows 289 active users with assigned roles. Looks clean.

Unstructured signals: Login timestamps reveal 8 full-access users ($5,808/yr each) haven't logged in for 90+ days. Cross-referencing with HR termination dates shows ~53 Employee Center users are departed employees still consuming licenses. 4 users' entire activity consists of clicking "Approve" on purchase orders — replaceable by an email-based workflow that costs nothing.

The gap: $103,896/year in waste invisible to anyone looking at the user list alone.

Case 2 — MedTech Manufacturer: Help Desk Ticket Forensics During Acquisition

Engagement: NetSuite Implementation + Post-Acquisition Support

2,525 help desk tickets reveal organizational stress invisible in ERP transaction data

Diagnostics manufacturer acquired by Fortune 500. Structured data showed normal operations. Tickets told a different story.

2,525

Help Tickets

Categories

38%

Uncategorized

3,992

ERP Users

1,423

Inventory Items

Ticket Category Distribution

Uncategorized

956

Finance

469

Access

257

Procurement

215

Inventory

119

Manufacturing

107

Warehouse

103

Cost Accounting

Quality

Order Mgmt

What Ticket Text Reveals — Unstructured Signals from Real Tickets

DATA INTEGRITY

"How did 20413 turn into 20433?"

Inventory team can't explain item number mutation. Structured data shows both items exist. The ticket reveals someone doesn't trust the data — and they're right to question it.

SYSTEM WORKAROUNDS

"Explore creating dummy transactions for MRP"

Manufacturing is building fake transactions to work around MRP limitations. Structured data will show these as real — auditors would never know.

ESCALATION CULTURE

"Bill Payment Email Notification for Vendors — URGENT"

Multiple "URGENT" tickets for routine vendor payments. Finance team is under pressure. Transaction data shows payments made on time — the stress is invisible.

ACQUISITION CHAOS

257 "Request for NetSuite Access" tickets

10% of all tickets are access requests — many from acquiring company email domains. IT is drowning in onboarding during the acquisition. ERP data shows users; tickets show the churn.

ERP Transaction Data

• 3,992 employees in system
• 1,044 active customers
• 1,423 inventory items tracked
• 307 bills of materials
• 465 GL accounts
• 5,035 warehouse bin locations
Status: Operational

Ticket Text Analysis

• 38% of tickets uncategorized (overwhelmed)
• Dummy transactions created as workarounds
• Item numbers mutating unexplainably
• "URGENT" escalation culture in Finance
• Acquiring company flooding access requests
• Lot traceability questions (FDA compliance)
Status: Organization under stress

Available data for forensic analysis: 2,525 help desk tickets (with summary, assignee, priority, category, timestamps, response/close times) + complete NetSuite master data (employees, customers, vendors, items, BOM, chart of accounts, inventory, financial statements). The combination of structured ERP data with unstructured ticket text is exactly the dual-layer forensic approach this tool is designed for.

Case 3 — Connected Hardware Manufacturer: High-Growth ERP Forensics

Engagement: ERP Migration Assessment + ITGC Audit + International Expansion (multi-year engagement)

3M+ ERP records forensically analyzed — credit hold overrides, 28.6% return rate, SOD violations, approval chain complexity

High-growth hardware manufacturer during rapid scaling. Legacy ERP → enterprise ERP migration. Structured transaction data + ITGC audit findings + process documentation.

3M+

CSV Rows

102K

Sales Orders

97K

RMA Returns

43K

Vendors

10K

Customers

28.6%

RMA Rate

Data Sources Analyzed

Master Data
10K customers, 43K vendors
8.7K fixed assets, 5K contacts

Transaction Data
102K sales orders, 1M+ EDI lines
97K RMAs, 164K credit memos

Governance / Text
ITGC audit, SOD analysis
7.6K deductions, PES call notes

Forensic Findings — What the Data Revealed

ITGC VIOLATIONS (Deloitte Audit)

7 users with Administrator role. Terminated employee still active.

153 active users, 40 unique roles. SOD violations at both role and user level. 4 generic shared accounts. No formalized change management policy. Admin access to both dev and prod environments. No post-implementation review process. Critical gaps for a publicly traded company.

Source: Deloitte SOD Role Definition Analysis

CREDIT HOLD OVERRIDES

Sales orders shipped despite "Customer On Credit Hold" flag

Sales order headers contain both "Customer On Credit Hold" and "Shipment Hold Released by Finance" fields. Cross-referencing reveals orders where credit holds were manually overridden — Finance releasing shipments to customers already flagged for credit risk. The structured status says "shipped." The override field tells you it shouldn't have been.

Source: Sales Order Header — 102K records

RETURN RATE ANOMALY

28.62% of customer accounts had return events — only 67.5% on-time delivery

Forensic case analysis of 1,090 customer accounts: 312 had at least one RMA event (28.62%). 97K total RMA line items in the extract across 6 types: Open Box, Closed Box, Destroyed in Field, Stock Rotation, Warranty, Error Shipment. Transaction data shows the returns; memo fields and reason codes hint at systemic quality or logistics failures the structured data can't explain.

Source: Case Outcome Analysis (1,090 accounts, 21,099 events) + RMA Extract (97K line items)

APPROVAL CHAIN COMPLEXITY

7,610 customer deductions with multi-approver routing and rerouting

Marketing deductions (MDF) routed through "Next Approver" and "Set Rerouted Next Approver" chains. Multiple email notification flags. Deductions linked to DFI invoices, credit memos, and proof-of-performance documents. The approval chain is so complex that the rerouting field exists specifically because the normal chain fails regularly.

Source: Customer Deductions — 7,610 records, 50+ columns

Structured ERP Data Says

• 102K sales orders processed
• 97K returns authorized
• 43K vendors in master data
• 153 active users, roles assigned
• Orders shipped, invoiced, cleared
• International entities operational
Status: ERP Functioning

Governance + Text Layer Reveals

• Credit holds overridden to ship anyway
• 28.6% returns — systemic product/logistics issue
• 7 users with admin (SOX risk, public co.)
• Terminated employee still accessing system
• Approval chains so broken a "reroute" field exists
• 144K PO changes — constant purchasing churn
Status: Controls Gap, SOX Exposure

Engagement scope: Multi-year consulting engagement spanning ERP vendor selection, enterprise ERP migration assessment, ITGC audit (Big Four SOD analysis), international tax restructuring (European entity), regional expansion (APAC), and PCI/SOX compliance programs. Data sources: legacy ERP production extracts, audit firm findings, change management logs, 24+ project status call notes, SOW/FRD documentation. All company names, customer IDs, employee names, and identifying details anonymized.

The Pattern Across All Three Cases

Case 1 (license audit): Structured user data hides $103K in waste — login timestamps and role assignments alone told the story. Case 2 (acquisition tickets): When structured data looks normal, 2,525 help desk tickets reveal an organization under stress — workarounds, data trust issues, and IT drowning in access requests. Case 3 (high-growth ERP): 3 million rows of clean-looking transaction data mask credit hold overrides, a 28.6% return rate, SOD violations in a rapidly scaling company, and approval chains so dysfunctional that a "reroute approver" field was built into the system. In every case, the structured data said "operational." The unstructured layer — tickets, audit findings, override fields, memo text — told the real story.