Building Resilient B2B Supply Chain Workflows: Why We Replaced LangChain/CrewAI with LangGraph.js for Human-in-the-Loop Operations

When we built our first enterprise supply chain automation in 2023, we used LangChain. It was the obvious choice — large community, extensive integrations, rapid development. The prototype looked impressive. The production system was a disaster.
Not because LangChain is bad software. It isn't. But because supply chain workflows in a mid-market B2B business have properties that LangChain's chain-based execution model fundamentally cannot handle well. The same turned out to be true of CrewAI when we evaluated it as an alternative.
This post documents the specific technical problems we encountered, how LangGraph.js solved them, and what the actual architecture looks like for a production purchase order routing system serving a manufacturing client with $180M in annual procurement spend.
The Supply Chain Problem Space
A B2B supply chain workflow is not a simple question-and-answer loop. A typical purchase order routing flow involves:
- Receiving a purchase request from a department
- Validating the request against approved vendor lists, budget codes, and spending limits
- Performing a three-way match when goods are being received against an existing contract
- Routing to the appropriate approver based on amount, category, and organizational hierarchy
- Pausing for human approval — which may take minutes or days
- Resuming after approval with execution steps: creating the PO in the ERP, notifying the vendor, updating inventory projections
- Handling rejections, modifications, and exception escalations
The key property here is that the workflow is long-running, stateful, and interrupted by humans. A purchase order might sit waiting for a VP's approval for three business days. During that time, the system needs to remember everything it knew when it paused. It needs to be restartable. And critically, it needs to be auditable: your finance team needs to know exactly what the agent knew, what it decided, and why.
Why LangChain Failed Us
LangChain's execution model is sequential chain execution. You define a chain of operations, LLM calls, and tool invocations that run in order. It works well for single-turn tasks with a defined start and end.
The problems we encountered with supply chain workflows:
No native state persistence. LangChain has no built-in mechanism for persisting execution state across process restarts. We built custom state management around Redis, which worked until it didn't — cache eviction policies and edge cases around process crashes meant we had orphaned workflows and state loss in production.
No first-class human interruption. Implementing a "wait for human approval" step in LangChain meant building a webhook system, external state management, and complex resume logic. Every resume was effectively a new chain execution that had to reconstruct context from scratch. This made the implementation brittle and the debug experience terrible.
Linear execution, no conditional routing. Approval routing in procurement is inherently conditional. Different amounts route to different approvers. Different categories have different compliance checks. LangChain's chain model required us to either hard-code all routing logic in a monolithic chain or build a meta-chain that selected sub-chains — both approaches were fragile and hard to modify when business rules changed.
No built-in audit trail. LangChain offers callbacks for logging, but constructing a complete, structured audit record of every decision in a workflow required significant custom instrumentation.
Why CrewAI Also Fell Short
CrewAI addresses the multi-agent coordination problem. For supply chain, the appeal was defining specialist agents — a validation agent, a compliance agent, an approvals agent — that work together. In theory, this maps well to the domain.
In practice, the issues were different but equally blocking:
Non-deterministic execution order. CrewAI's agent coordination is LLM-driven. Agents decide what to do next based on model output. In a regulated procurement workflow, non-deterministic execution order is a compliance problem. The three-way match must happen before approval routing. The compliance check must happen before the PO is created. These are hard sequencing requirements that LLM-driven coordination cannot guarantee.
Shared state is a shared problem. CrewAI agents share a global task context that all agents can read and write. In a long-running workflow with multiple agents touching the same purchase order, we observed state corruption — agents reading partially-updated state written by another agent mid-execution.
Human-in-the-loop is still bolted on. Like LangChain, CrewAI does not have a native mechanism for pausing crew execution for human input and resuming deterministically. The implementation pattern is the same: external state management, webhooks, and reconstruction logic.
LangGraph.js: What Changed
LangGraph.js models agent workflows as directed graphs where each node is a discrete function and edges define routing logic. State flows through the graph as a typed object, and the framework handles checkpointing that state to a persistent store at every node transition.
This is architecturally different from both LangChain and CrewAI in ways that matter for supply chain:
Typed State Schema
The purchase order workflow state in LangGraph.js is a TypeScript interface:
interface POWorkflowState {
  purchaseRequestId: string;
  requestedBy: string;
  // ...
}
Every node in the graph reads from this typed state and returns a partial state update. TypeScript's type system catches entire categories of bugs at compile time. And because the state schema is the single source of truth for what the workflow knows, adding new information (a new compliance field, a new approval tier) means updating one schema definition, not hunting through chain logic.
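To make the "partial state update" idea concrete, here is a minimal, framework-free sketch. It deliberately does not use the LangGraph.js API (which declares state through its own helpers such as `Annotation.Root`); it only illustrates the pattern. `purchaseRequestId`, `requestedBy`, and `totalAmount` come from this post, while `vendorId`, `vendorApproved`, `auditLog`, `validateVendorNode`, and `applyUpdate` are hypothetical names chosen for illustration.

```typescript
// purchaseRequestId, requestedBy, and totalAmount appear in this post;
// the remaining fields are hypothetical, added only for illustration.
interface POWorkflowState {
  purchaseRequestId: string;
  requestedBy: string;
  totalAmount: number;
  vendorId: string;               // hypothetical
  vendorApproved: boolean | null; // hypothetical; null means "not checked yet"
  auditLog: string[];             // hypothetical
}

// A node reads the full state and returns only the fields it changed.
type WorkflowNode = (state: POWorkflowState) => Partial<POWorkflowState>;

// Hypothetical approved-vendor list standing in for a real lookup.
const APPROVED_VENDORS = new Set<string>(["VEND-001", "VEND-002"]);

const validateVendorNode: WorkflowNode = (state) => ({
  vendorApproved: APPROVED_VENDORS.has(state.vendorId),
  auditLog: [...state.auditLog, `vendor check: ${state.vendorId}`],
});

// Merge a partial update into the previous state without mutating it.
function applyUpdate(
  state: POWorkflowState,
  update: Partial<POWorkflowState>,
): POWorkflowState {
  return { ...state, ...update };
}

const initialState: POWorkflowState = {
  purchaseRequestId: "PR-1001",
  requestedBy: "maintenance",
  totalAmount: 4200,
  vendorId: "VEND-001",
  vendorApproved: null,
  auditLog: [],
};

const afterValidation = applyUpdate(initialState, validateVendorNode(initialState));
```

Because the node returns a `Partial<POWorkflowState>`, a typo in a field name or a wrong value type fails at compile time rather than in production.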
Checkpoint Persistence
LangGraph.js has a built-in checkpointer interface. In our production deployment we use a PostgreSQL-backed checkpointer (with Azure Database for PostgreSQL). Every time a node completes, the framework automatically serializes the current state and writes it to the checkpoint store with a thread ID and timestamp.
This means:
- If the Node.js process crashes mid-workflow, the next process that picks up the thread ID continues from the last checkpoint
- Long-running workflows that span days (waiting for VP approval) survive process restarts, deployments, and infrastructure maintenance windows
- Every state in the workflow history is queryable for audit and debugging
For our $180M procurement client, this meant zero lost purchase requests over six months of production operation, including two planned infrastructure maintenance windows and one unplanned process crash.
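The crash-recovery behavior described above can be illustrated with a small simulation. This is a sketch of the general checkpoint-and-resume technique, not the LangGraph.js checkpointer interface: `InMemoryCheckpointStore`, `runThread`, and the two example nodes are hypothetical, and an in-memory array stands in for the PostgreSQL table.

```typescript
type State = Record<string, unknown>;

interface GraphNode {
  name: string;
  run: (state: State) => State;
}

interface Checkpoint {
  threadId: string;
  step: number;    // index of the node that just completed
  nodeName: string;
  state: State;
  savedAt: string; // ISO timestamp
}

// In-memory stand-in for a durable store such as a PostgreSQL table.
class InMemoryCheckpointStore {
  private rows: Checkpoint[] = [];

  save(checkpoint: Checkpoint): void {
    // Deep-copy so later mutations cannot alter recorded history.
    this.rows.push(JSON.parse(JSON.stringify(checkpoint)) as Checkpoint);
  }

  history(threadId: string): Checkpoint[] {
    return this.rows.filter((row) => row.threadId === threadId);
  }

  latest(threadId: string): Checkpoint | undefined {
    const rows = this.history(threadId);
    return rows[rows.length - 1];
  }
}

// Run (or resume) a thread, skipping every node that already has a checkpoint.
function runThread(
  threadId: string,
  nodes: GraphNode[],
  store: InMemoryCheckpointStore,
  initial: State,
  crashAfterStep?: number, // test hook: simulate a crash after this step
): State {
  const last = store.latest(threadId);
  let state: State = last ? last.state : initial;
  for (let step = last ? last.step + 1 : 0; step < nodes.length; step++) {
    const node = nodes[step];
    if (!node) break;
    state = node.run(state);
    store.save({ threadId, step, nodeName: node.name, state, savedAt: new Date().toISOString() });
    if (step === crashAfterStep) throw new Error("simulated process crash");
  }
  return state;
}

const store = new InMemoryCheckpointStore();
const executed: string[] = [];
const nodes: GraphNode[] = [
  { name: "validate", run: (s) => { executed.push("validate"); return { ...s, validated: true }; } },
  { name: "route", run: (s) => { executed.push("route"); return { ...s, approver: "department_head" }; } },
];

try {
  runThread("po-1", nodes, store, {}, 0); // "crash" right after the first checkpoint
} catch {
  // A new process would start here knowing only the thread ID.
}
const finalState = runThread("po-1", nodes, store, {});
```

The second call re-runs nothing that already completed: `validate` executes once, and `store.history("po-1")` holds one queryable record per completed node.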
Native Human-in-the-Loop with Interrupt
LangGraph.js has first-class support for interrupting graph execution to wait for external input. The interrupt() function pauses execution at the current node and returns control to the calling process. When the human provides input (via Slack, via web UI, via API), the workflow resumes from the exact checkpoint where it paused.
In our PO routing system, the approval node looks like this conceptually:
async function humanApprovalNode(state: POWorkflowState) {
  // Send Slack notification to approver
  await slackClient.postMessage({ /* ... */ });
  // ...
}
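The Slack message includes Approve and Reject buttons. When the approver clicks one, Slack sends an interaction callback to our backend, which resumes the paused thread with the approval decision: by calling the graph's updateState() when the pause comes from a static breakpoint, or by invoking the graph with a resume Command when the node called interrupt(). The graph continues from the checkpoint, and the approver's decision becomes part of the immutable state history.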
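The pause-and-resume contract can be sketched without the framework. The code below is a simulation of the pattern, not the LangGraph.js API: `ApprovalGate`, `POState`, and `ApprovalDecision` are hypothetical names, and a `Map` stands in for the durable checkpoint store. One detail worth modeling: as we understand the LangGraph documentation, a node that calls `interrupt()` is re-executed from its start on resume, so any side effect placed before the call (such as the Slack notification) should be idempotent. The sketch guards the notification for that reason.

```typescript
interface ApprovalDecision {
  approved: boolean;
  decidedBy: string;
}

interface POState {
  purchaseRequestId: string;
  totalAmount: number;
  status: "pending" | "awaiting_approval" | "approved" | "rejected";
  decision?: ApprovalDecision;
}

type Notifier = (threadId: string, state: POState) => void;

class ApprovalGate {
  // Stand-in for the checkpoint store; a real system would persist this.
  private paused = new Map<string, POState>();
  private notify: Notifier;

  constructor(notify: Notifier) {
    this.notify = notify;
  }

  // Called when the workflow reaches the approval step.
  pause(threadId: string, state: POState): POState {
    const existing = this.paused.get(threadId);
    if (existing) return existing; // already waiting: do not notify twice
    const waiting: POState = { ...state, status: "awaiting_approval" };
    this.paused.set(threadId, waiting);
    this.notify(threadId, waiting);
    return waiting;
  }

  // Called by the backend when the approver clicks Approve or Reject.
  resume(threadId: string, decision: ApprovalDecision): POState {
    const waiting = this.paused.get(threadId);
    if (!waiting) throw new Error(`No paused workflow for thread ${threadId}`);
    this.paused.delete(threadId);
    return { ...waiting, decision, status: decision.approved ? "approved" : "rejected" };
  }
}

const sentNotifications: string[] = [];
const gate = new ApprovalGate((threadId) => {
  sentNotifications.push(threadId);
});

const request: POState = { purchaseRequestId: "PR-1001", totalAmount: 250000, status: "pending" };
gate.pause("po-42", request);
gate.pause("po-42", request); // a retry or re-execution must not re-notify
const resumed = gate.resume("po-42", { approved: true, decidedBy: "vp.operations" });
```

The decision travels inside the state, so whatever records the state history also records who approved and when the workflow moved on.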
Conditional Routing
LangGraph.js edges can be conditional functions. Our approval routing looks like:
function routeApproval(state: POWorkflowState): string {
  if (state.totalAmount < 10000) return "auto_approve";
  if (state.totalAmount < 100000) return "department_head_approval";
  // ...
}
This is a plain TypeScript function. It's unit-testable in isolation. When business rules change — the CFO approval threshold moves from $500K to $750K — it's a one-line change in a clearly defined function, not a hunt through prompt engineering or chain configuration.
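A fuller sketch of such a routing function, with the thresholds pulled into a config object so a rule change stays a one-line edit. The $10K and $100K boundaries and the $500K CFO threshold come from this post; the `vp_approval` and `cfo_approval` route names, the `ApprovalThresholds` shape, and the choice to send everything between $100K and the CFO threshold to a VP are assumptions made for illustration.

```typescript
type ApprovalRoute =
  | "auto_approve"
  | "department_head_approval"
  | "vp_approval"   // hypothetical route name
  | "cfo_approval"; // hypothetical route name

interface ApprovalThresholds {
  autoApproveBelow: number;
  departmentHeadBelow: number;
  cfoAtOrAbove: number;
}

const DEFAULT_THRESHOLDS: ApprovalThresholds = {
  autoApproveBelow: 10_000,
  departmentHeadBelow: 100_000,
  cfoAtOrAbove: 500_000,
};

// Pure function: the route depends only on the amount and the thresholds.
function routeApproval(
  state: { totalAmount: number },
  thresholds: ApprovalThresholds = DEFAULT_THRESHOLDS,
): ApprovalRoute {
  if (state.totalAmount < thresholds.autoApproveBelow) return "auto_approve";
  if (state.totalAmount < thresholds.departmentHeadBelow) return "department_head_approval";
  if (state.totalAmount < thresholds.cfoAtOrAbove) return "vp_approval";
  return "cfo_approval";
}

// Moving the CFO threshold from $500K to $750K touches one value.
const raisedCfoThreshold: ApprovalThresholds = { ...DEFAULT_THRESHOLDS, cfoAtOrAbove: 750_000 };
const routeBefore = routeApproval({ totalAmount: 600_000 });
const routeAfter = routeApproval({ totalAmount: 600_000 }, raisedCfoThreshold);
```

Boundary cases are where routing bugs tend to hide, so the unit tests worth writing first are the ones that sit exactly on each threshold.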
ERPNext Integration: The Data Layer
Every data read and write in this workflow goes through ERPNext. The vendor compliance status comes from the ERPNext Supplier master. The budget code validation comes from the ERPNext Cost Center and Budget document. The three-way match pulls the Purchase Order, Purchase Receipt, and Purchase Invoice documents from ERPNext and compares quantities and amounts.
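The three-way match itself is a plain comparison once the three documents are loaded. The following is a simplified sketch under stated assumptions: each document is reduced to lines with an item code, a quantity, and an amount (the `DocLine` shape is hypothetical, not the ERPNext schema), quantities must agree across all three documents, and the invoiced amount is compared with the ordered amount within a small tolerance.

```typescript
interface DocLine {
  itemCode: string;
  qty: number;
  amount: number;
}

interface MatchResult {
  matched: boolean;
  discrepancies: string[];
}

function indexByItem(lines: DocLine[]): Map<string, DocLine> {
  const byItem = new Map<string, DocLine>();
  for (const line of lines) byItem.set(line.itemCode, line);
  return byItem;
}

function threeWayMatch(
  order: DocLine[],
  receipt: DocLine[],
  invoice: DocLine[],
  amountTolerance = 0.01, // assumed rounding tolerance
): MatchResult {
  const ordered = indexByItem(order);
  const received = indexByItem(receipt);
  const invoiced = indexByItem(invoice);
  const itemCodes = Array.from(
    new Set(order.concat(receipt, invoice).map((line) => line.itemCode)),
  );

  const discrepancies: string[] = [];
  for (const code of itemCodes) {
    const o = ordered.get(code);
    const r = received.get(code);
    const i = invoiced.get(code);
    if (!o || !r || !i) {
      discrepancies.push(`${code}: not present on all three documents`);
      continue;
    }
    if (r.qty !== o.qty) {
      discrepancies.push(`${code}: received qty ${r.qty} differs from ordered qty ${o.qty}`);
    }
    if (i.qty !== r.qty) {
      discrepancies.push(`${code}: invoiced qty ${i.qty} differs from received qty ${r.qty}`);
    }
    if (Math.abs(i.amount - o.amount) > amountTolerance) {
      discrepancies.push(`${code}: invoiced amount ${i.amount} differs from ordered amount ${o.amount}`);
    }
  }
  return { matched: discrepancies.length === 0, discrepancies };
}

const orderLines: DocLine[] = [{ itemCode: "BOLT-M8", qty: 10, amount: 50 }];
const cleanMatch = threeWayMatch(orderLines, orderLines, orderLines);
const shortReceipt = threeWayMatch(
  orderLines,
  [{ itemCode: "BOLT-M8", qty: 8, amount: 40 }],
  orderLines,
);
```

Returning the list of discrepancies, rather than a bare boolean, gives the exception-escalation path something specific to show the human reviewer.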
When the workflow completes, the create_po_node calls the ERPNext REST API to create and submit the Purchase Order document. The response includes the ERPNext PO number, which is stored in the workflow state and written to the audit log.
This means the ERPNext database is always the system of record. The LangGraph.js workflow is the process engine. They are cleanly separated and independently auditable.
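For readers unfamiliar with ERPNext, the write at the end of the workflow is an ordinary HTTP call. The sketch below only builds the request and parses the response, so it runs without a server. It assumes the Frappe REST conventions as we understand them (`POST /api/resource/<DocType>`, token authentication with an API key and secret, and a response wrapped in a `data` object whose `name` is the document number). The helper names, the example host, and the item field set are hypothetical, and submitting the created draft is a separate step that is not shown.

```typescript
interface PurchaseOrderItem {
  item_code: string;
  qty: number;
  rate: number;
  schedule_date: string; // required-by date, YYYY-MM-DD
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the request that creates a draft Purchase Order.
function buildCreatePoRequest(
  baseUrl: string,
  apiKey: string,
  apiSecret: string,
  supplier: string,
  items: PurchaseOrderItem[],
): HttpRequest {
  const root = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${root}/api/resource/${encodeURIComponent("Purchase Order")}`,
    method: "POST",
    headers: {
      Authorization: `token ${apiKey}:${apiSecret}`,
      "Content-Type": "application/json",
      Accept: "application/json",
    },
    body: JSON.stringify({ supplier, items }),
  };
}

// Pull the PO number out of the response so it can be stored in workflow state.
function extractPoNumber(responseJson: unknown): string {
  const data = (responseJson as { data?: { name?: unknown } } | null)?.data;
  if (!data || typeof data.name !== "string") {
    throw new Error("Unexpected ERPNext response: missing data.name");
  }
  return data.name;
}

const createPoRequest = buildCreatePoRequest(
  "https://erp.example.com/", // hypothetical host
  "key",
  "secret",
  "SUP-0001",
  [{ item_code: "BOLT-M8", qty: 10, rate: 5, schedule_date: "2025-01-31" }],
);
```

Keeping request construction separate from the network call makes the node easy to unit-test, and the same `HttpRequest` can be written to the audit log (with the Authorization header redacted) before it is sent.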
Results in Production
After six months in production:
- Average PO processing time: reduced from 4.2 days to 11 hours
- Exception escalation rate: down from 23% of POs (previously all manual) to 6% (legitimate exceptions only)
- Three-way match accuracy: 99.6% (vs 96.1% manual)
- Approver response time: improved 34% because Slack notifications with full context replaced email chains with attachments
- Zero lost workflow state across 2,847 PO workflows processed
The supply chain team's comment after 90 days: "We stopped thinking about it. It just works."
That's what production-grade AI automation should feel like.
Ready to deploy AI agents that actually work in production? Book a Strategy Session with Techseria — we'll map your supply chain workflows and show you exactly what LangGraph.js can automate in your environment.
[Book a Strategy Session](https://techseria.com/contact)