Methodology

Five phases. Inherited from ERP. Re-tuned for AI.

Most AI projects fail because they're run like software pilots. We run them like ERP implementations — discovery, fit-gap, configure, train, hypercare — because that's the playbook that's already known to work in the room AI has to fit into. The differences are in what gets measured.

Phase 01 · Week 1 of Assessment

Discovery. The week where we stop assuming.

A round of 1:1 interviews with the people who actually do the work — finance, ops, IT, and whoever owns the chart of accounts. The output is a system inventory, a process map for every candidate workflow, and a list of the disagreements between what your software is supposed to do and what your team actually does.

Inputs

Org chart · ERP access · prior pilot post-mortems

Activities

Interviews · Inventory · Process maps

Output

A landscape document. Not a deck — a working artifact the Build team references for the entire engagement.

Owned by

Firmcraft principal · with your CFO or COO as the lead sponsor

Discovery schedule · wk 1 · 5 of 7 scheduled

CFO · J. Reyes

Org thesis · funding · audit posture

Done

Controller · R. Tanaka

Close cadence · AP/AR pain · COA hygiene

Done

Dir. Ops · M. Brennan

Work order flow · field-to-ERP gaps

Done

IT lead · S. Park

Stack · access · sovereignty constraints

Thu

AP team · A. Nguyen +2

Invoice intake · coding patterns

Fri

Phase 02 · Week 2 of Assessment

Fit-gap. Where ERP discipline meets AI scope.

For every candidate workflow, three questions: does the existing system already do this (Fit), does it almost-but-not-quite (Gap), or is it a true greenfield (Missing). Only Gaps and Missings go forward into the scorecard. The Fits get a one-line "don't build this" — that answer alone usually pays for the Assessment.

Inputs

Discovery landscape · ERP capability matrix · sovereignty rules

Activities

Capability matrix · Scoring · Sovereignty pass

Output

The scorecard. Every candidate use case, ranked, with a feasibility × ROI × sovereign-fit score and a Build / Defer / Don't decision.

Owned by

Firmcraft principal · review with CFO + IT lead

Fit-gap scorecard · wk 2 · 18 candidates · 7 greenlit

Use case

Fit

Gap

Missing

AP invoice triage

High ROI · on-prem

Work-order draft

High ROI · on-prem

Vendor master mgmt.

BC already covers

Voice agent (sched.)

Hybrid sovereign-fit

Month-end commentary

Defer · low feasibility

3-way match

BC handles natively
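
The scorecard logic above can be sketched in a few lines. This is a minimal illustration, assuming each factor is scored from 0 to 1 and the three are multiplied; the `Candidate` class, the 0.4 Build threshold, the coverage labels, and the sample numbers are all hypothetical, not the weights used on a real engagement.

```python
from dataclasses import dataclass

BUILD_THRESHOLD = 0.4  # hypothetical cut-off, not taken from the methodology


@dataclass
class Candidate:
    name: str
    coverage: str         # "fit", "gap", or "missing"
    feasibility: float    # assumed 0.0-1.0 scale
    roi: float            # assumed 0.0-1.0 scale
    sovereign_fit: float  # assumed 0.0-1.0 scale

    @property
    def score(self) -> float:
        # feasibility x ROI x sovereign-fit, as named on the scorecard
        return self.feasibility * self.roi * self.sovereign_fit


def decide(c: Candidate) -> str:
    """Return Build, Defer, or Don't for one candidate."""
    if c.coverage == "fit":
        return "Don't"  # the existing system already does this
    return "Build" if c.score >= BUILD_THRESHOLD else "Defer"


def scorecard(candidates):
    """Rank gaps and missings by score; list fits last."""
    rows = [(c.name, decide(c), round(c.score, 3)) for c in candidates]
    return sorted(rows, key=lambda r: (r[1] == "Don't", -r[2]))


# Sample values are made up for illustration.
candidates = [
    Candidate("3-way match", "fit", 0.9, 0.8, 0.9),
    Candidate("Month-end commentary", "gap", 0.3, 0.7, 0.9),
    Candidate("AP invoice triage", "gap", 0.9, 0.9, 0.9),
]
for name, decision, score in scorecard(candidates):
    print(f"{name}: {decision} ({score})")
```

Multiplying the factors, rather than adding them, means a near-zero on any one dimension sinks the candidate, which matches the "low feasibility" deferral in the sample scorecard.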

Phase 03 · Weeks 1–10 of Build

Configure. The Build, run like an implementation.

The Foundation goes in first — Hermes deployed, retrieval indexed, the messaging gateway wired, Langfuse observing. Then the vertical workflows stack on top, one at a time, each shipped to production behind a feature flag, each with its eval suite in place before traffic. The discipline is straight out of ERP go-lives.

Inputs

Scorecard · roadmap · vendor matrix · TCO model

Activities

Foundation install · RAG indexing · Workflow build · Integration

Output

A running system, behind feature flags, with the eval suite green on each workflow before it sees a real user.

Owned by

Same principal as Discovery · with the Build team behind the scenes

Workflow · ap.triage.v3 · flag · canary · 5%
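
A 5% canary behind a feature flag is commonly implemented with deterministic hash bucketing, so the same invoice always takes the same path. The sketch below assumes that approach; the function name `in_canary` and the invoice ID format are hypothetical, and the actual flagging mechanism may differ.

```python
import hashlib


def in_canary(workflow: str, unit_id: str, percent: int) -> bool:
    """Deterministically place a unit (e.g. an invoice ID) in or out of the canary."""
    digest = hashlib.sha256(f"{workflow}:{unit_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket 0-99
    return bucket < percent


# Hypothetical invoice IDs; roughly 5% should route to the new workflow.
ids = [f"INV-{i}" for i in range(10_000)]
share = sum(in_canary("ap.triage.v3", i, 5) for i in ids) / len(ids)
print(f"canary share: {share:.1%}")
```

Including the workflow name in the hash keeps canary populations independent across workflows, so the same invoices are not always the test subjects.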

Phase 04 · Weeks 8–12 of Build

Train & eval. People and models, in that order.

Two parallel tracks. Your team learns the operator — what it can and can't do, how to handle reviews, where the audit log lives. Meanwhile the eval suite is locked in: a regression set for every workflow, scored on accuracy, latency, and cost, gating each release. We don't ship a workflow without it.

Inputs

Production workflows · golden test set from Discovery · audit requirements

Activities

Team training · Eval harness · Regression set · Cost budget

Output

A scored, regression-gated system and a team that can run it without us being on every standup.

Owned by

Firmcraft principal + your AP or Ops lead as champion

Eval run · ap.triage · v3.2 · passed · 18 / 20

Accuracy

94%

p95 latency

410ms

$ / run

$0.018

vendor.match.exact · 312ms · pass

coa.code.lookup · 288ms · pass

3way.match.partial · 510ms · warn

duplicate.detect · 204ms · pass

currency.convert · 198ms · pass

approval.threshold · 410ms · pass
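
A release gate on accuracy, latency, and cost can be as simple as comparing one eval run against a budget. The sketch below is a hedged illustration: the run figures are the ones shown in the sample eval card, while the `Budget` thresholds and all class and function names are assumptions, not the real harness.

```python
from dataclasses import dataclass


@dataclass
class EvalRun:
    accuracy: float          # fraction correct on the regression set
    p95_latency_ms: float
    cost_per_run_usd: float


@dataclass
class Budget:
    min_accuracy: float
    max_p95_latency_ms: float
    max_cost_per_run_usd: float


def gate(run: EvalRun, budget: Budget) -> list[str]:
    """Return the reasons a release is blocked; an empty list means ship."""
    failures = []
    if run.accuracy < budget.min_accuracy:
        failures.append("accuracy below floor")
    if run.p95_latency_ms > budget.max_p95_latency_ms:
        failures.append("p95 latency over budget")
    if run.cost_per_run_usd > budget.max_cost_per_run_usd:
        failures.append("cost per run over budget")
    return failures


run = EvalRun(accuracy=0.94, p95_latency_ms=410, cost_per_run_usd=0.018)
budget = Budget(0.90, 500, 0.02)  # hypothetical thresholds
print(gate(run, budget) or "release allowed")
```

Returning the list of failed checks, rather than a bare boolean, gives the reviewer something to put in the audit log when a release is held back.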

Phase 05 · First 90 days post-launch

Hypercare. The first 90 days, run by us.

After cutover, we join every standup for 30 days, stay on the channel for 90, and remain the named owners on the runbook throughout. Hypercare ends when the system has cleared an eval regression cycle without our intervention. After that, you're on an Operate retainer, or off our books entirely.

Inputs

Production system · runbook · eval & cost dashboards

Activities

Daily standup · Runbook drills · Eval regressions · Incident review

Output

A handoff packet, a passing eval cycle, and a named on-call rotation — yours or ours, depending on which Operate tier you signed.

Owned by

Firmcraft principal · then handoff to retained AI lead

Runbook · day 14 · on-track

Daily eval regression run

scheduled · 06:00 CT

live

Cost budget review

weekly · finance lead

Mon

Audit-log spot-check

5 random invoices · controller

d 12

Incident drill · vendor outage

runbook §3.2

d 10

Retrospective with CFO + ops lead

end of week 2

Fri

Eval regression · full suite

unattended · gates handoff

d 28

06 · Negative space

The methodology is also what we won't do.

Most of what makes an AI engagement go off the rails happens in the gaps between the steps above. So we name the disciplines we don't carry over from the rest of the consulting industry.

No 90-day "discoveries."

The Assessment is two to three weeks. If we need longer, scope is wrong — and the wrong scope kills a project faster than the wrong model.

No demo-driven sales.

We won't build a pretty PoC to win the room. The Assessment is the sales conversation, and it's billed.

No quiet routing to frontier models.

If a workflow leaves your walls for a frontier API, you'll know — it'll be in the architecture diagram, the contract, and the audit log.

No account handoffs.

The principal who scoped your engagement is the one who's on the hypercare standup. We're sized around that constraint — not around scaling headcount.

Start here

Five phases, run by the same person, from week one.

Start with the Assessment. Three weeks, fixed-fee, refundable against Build.

Book the call →