COBOL Migration Accelerator
From Zero to Knowledge Graph in One Sprint
AWS CardDemo · 200+ Mainframe Artifacts · March 2026
The Business Problem
Mainframe migrations fail because teams cannot understand what the system does before they change it.
Operationally Opaque
COBOL programs encode decades of business logic in formats that few modern tools can read. Knowledge lives in tribal memory, not documentation.
No Dependency Map
Programs CALL other programs, COPY copybooks, EXEC CICS, and access VSAM files. No single artifact reveals the full dependency web.
No Interface Catalog
COMMAREA and LINKAGE SECTION structures define inter-program contracts but are buried in source code with no API registry equivalent.
Weeks of Manual Discovery
Traditional discovery requires mainframe SMEs to manually trace call chains, map data flows, and document business rules. Slow and error-prone.
Architecture Context
Mainframe Source
Evidence Parsers
Neo4j
Schemas, APIs, UI
What Was Built (Business View)
Complete Automated Discovery
Every mainframe artifact type parsed and cataloged: COBOL source, copybooks, JCL, BMS maps, DB2 schemas, CICS resources, IMS definitions, and scheduler jobs.
Financial Services Vocabulary
252 pre-loaded domain terms covering banking, payments, accounts, and compliance. The knowledge graph speaks the business language from day one.
Migration Planning Outputs
Auto-generated PostgreSQL schemas from VSAM/DB2, service boundary recommendations, UI scaffolding from BMS maps, and OpenAPI contracts from COMMAREA.
Platform Ready
COB project live in Blaze with Jira integration (8 epics), GitHub repository, and full onboarding wizard. Customer can start immediately.
End-to-End Flow
Mainframe Source
COBOL, JCL, BMS, DDL, CSD, DBD, PSB, CA-7
Evidence Parsers
7 new parsers, 16 existing
Knowledge Graph
Ontology + relationships
Migration Engine
Schema, arch, codegen
Target Architecture
PostgreSQL, FastAPI, React
Outcomes & Evidence
Evidence-First Methodology
CDD Compliance
Evidence collected at every phase of the SDLC. Every parser, every test, every design decision has a traceable audit trail from requirement to deployment.
Real-World Validation
All 352 tests run against real AWS CardDemo files, not synthetic test data. Parser outputs verified against known mainframe program structures and data layouts.
Complete Audit Trail
From Jira epic to GitHub PR to deployed parser, every artifact is linked. Compliance managers can trace any output back to its originating requirement.
What This Enables (Next 90 Days)
Full CardDemo Ingestion
Ingest all 200+ artifacts from AWS CardDemo. Populate knowledge graph with complete program dependency map, data flow catalog, and business rule inventory.
- Run all 7 parsers against full CardDemo corpus
- Resolve cross-program CALL chains
- Map VSAM file access patterns
- Catalog all CICS transaction IDs
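Resolving cross-program CALL chains amounts to a reachability query over the call graph. The sketch below is a minimal illustration in plain Python, assuming the parsers have already produced a caller-to-callees mapping; the function name and the program names are hypothetical and not taken from CardDemo.

```python
from collections import deque
from typing import Dict, List, Set


def transitive_callees(call_graph: Dict[str, List[str]], root: str) -> Set[str]:
    """Return every program reachable from `root` via CALL edges (breadth-first)."""
    seen: Set[str] = set()
    queue = deque(call_graph.get(root, []))
    while queue:
        program = queue.popleft()
        if program in seen:
            continue  # already visited; this also guards against cycles
        seen.add(program)
        queue.extend(call_graph.get(program, []))
    return seen


# Hypothetical call graph: each key CALLs the programs in its list.
calls = {
    "PROGA": ["PROGB", "PROGC"],
    "PROGB": ["PROGC", "PROGD"],
    "PROGD": ["PROGA"],  # a cycle back to the root
}

reachable = transitive_callees(calls, "PROGA")
```

Because PROGD calls PROGA, the root itself appears in the result. In a populated graph the same question would more likely be asked as a Neo4j path query.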
Auto-Generate Boundaries
Use knowledge graph to identify natural service boundaries. Generate BPMN process models from JCL batch flows and CICS transaction sequences.
- Service boundary recommendations from call graph analysis
- BPMN generation from JCL procedures
- Data ownership mapping per service
- API contract drafts from COMMAREA structures
Code Generation Begins
PostgreSQL schemas from VSAM/DB2 layouts. FastAPI stubs from service boundaries. React component scaffolding from BMS screen maps.
- PostgreSQL DDL from VSAM DEFINE CLUSTER
- Alembic migrations with type-mapped columns
- OpenAPI specs from interface contracts
- React forms from BMS map definitions
Strategic Value
Platform Asset
7 parsers + 252 domain terms are reusable for every COBOL customer. Built once, deployed to every future mainframe engagement. The investment compounds.
Amortized cost per engagement drops with each new customer onboarded.
Competitive Moat
The knowledge graph is permanent institutional memory. Competitors deliver a migration. Blaze delivers a migration plus an auditable knowledge base that persists beyond the project.
Knowledge graph becomes a living asset the customer keeps forever.
Accelerated Economics
Next engagement starts with 80% of discovery pre-built. Only customer-specific artifacts need parsing. Time-to-value drops from months to weeks.
Sprint 1 for customer N+1 delivers what took 3 sprints for customer 1.
Technical Deep Dive
Parser architecture, knowledge graph ontology, data flow pipelines, and migration module internals.
Parser Architecture
KMFlow's parser factory was extended with 7 new COBOL-specific parsers that plug into the existing 16-parser framework.
KMFlow Parser Factory
register() / create() / parse()
Existing Parsers (16)
New COBOL Parsers (7)
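The register() / create() / parse() flow can be sketched as an extension-keyed registry. This is an illustrative sketch only: the class layout, the `extensions` attribute, and the fragment dictionary shape are assumptions, not KMFlow's actual API.

```python
from pathlib import Path
from typing import Dict, List, Tuple


class BaseParser:
    """Hypothetical parser interface; the real KMFlow base class may differ."""

    extensions: Tuple[str, ...] = ()

    def parse(self, path: str) -> List[dict]:
        raise NotImplementedError


class ParserFactory:
    """Maps file extensions to parser classes."""

    def __init__(self) -> None:
        self._registry: Dict[str, type] = {}

    def register(self, parser_cls: type) -> type:
        # Usable as a class decorator: each declared extension points at the class.
        for ext in parser_cls.extensions:
            self._registry[ext.lower()] = parser_cls
        return parser_cls

    def create(self, path: str) -> BaseParser:
        ext = Path(path).suffix.lower()
        if ext not in self._registry:
            raise ValueError(f"no parser registered for {ext!r}")
        return self._registry[ext]()


factory = ParserFactory()


@factory.register
class JclParser(BaseParser):
    extensions = (".jcl", ".prc")

    def parse(self, path: str) -> List[dict]:
        # Placeholder body: a real parser would read the file and emit fragments.
        return [{"type": "PROCESS", "source": path}]


parser = factory.create("example.jcl")
fragments = parser.parse("example.jcl")
```

Keying the registry on file extension is one way for new parsers to plug in without changes to the existing ones.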
Parser Detail Cards
CobolSourceParser
94% coverage
Db2SchemaParser
98% coverage
CicsResourceParser
95% coverage
JclParser
97% coverage
BmsMapParser
96% coverage
ImsDefinitionParser
97% coverage
SchedulerParser
97% coverage
Full Coverage Table
| Parser | Extensions | Fragment Types | Coverage |
|---|---|---|---|
| CobolSourceParser | .cbl .cob .cpy .asm | PROCESS, ENTITY, TEXT, REL, TABLE | 94% |
| Db2SchemaParser | .ddl .dcl .ctl | ENTITY, REL, TABLE | 98% |
| CicsResourceParser | .csd | PROCESS, ENTITY, REL | 95% |
| JclParser | .jcl .prc | PROCESS, ENTITY, REL, TEXT | 97% |
| BmsMapParser | .bms | PROCESS, TABLE, ENTITY, TEXT | 96% |
| ImsDefinitionParser | .dbd .psb | ENTITY, REL | 97% |
| SchedulerParser | .ca7 .controlm | PROCESS, REL, TABLE | 97% |
Knowledge Graph & Ontology
The knowledge graph captures all mainframe artifacts and their relationships in a queryable Neo4j graph. Five node types form the ontology core.
Process
COBOL programs, batch jobs, CICS transactions. Every executable unit in the mainframe ecosystem.
- COBOL programs (.cbl, .cob)
- JCL job steps
- CICS transaction IDs
DataObject
VSAM files, DB2 tables, IMS segments. Every persistent data store the mainframe reads or writes.
- VSAM KSDS/RRDS/ESDS files
- DB2 tables and views
- IMS database segments
InterfaceContract (new)
COMMAREA and LINKAGE SECTION structures that define inter-program contracts. The missing API catalog.
- COMMAREA field layouts
- LINKAGE SECTION definitions
- Parameter passing conventions
Activity
CICS commands, SQL operations, IMS DL/I calls. Every action a program performs against an external resource.
- EXEC CICS SEND/RECEIVE/READ/WRITE
- EXEC SQL SELECT/INSERT/UPDATE
- CBLTDLI calls (IMS)
BusinessRule
EVALUATE blocks, 88-level conditions, decision tables. The encoded business logic that governs program behavior.
- EVALUATE WHEN decision trees
- 88-level condition names
- PERFORM UNTIL iteration rules
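As one illustration of BusinessRule extraction, 88-level condition names can be pulled out with a regular expression. The sketch below is a simplification, not the shipped parser: it assumes single-line 88 entries with sequence columns already stripped, and the field names in the sample are hypothetical.

```python
import re
from typing import Dict, List

# Matches entries such as:  88  ACCT-ACTIVE  VALUE 'Y'.
LEVEL_88 = re.compile(
    r"^\s*88\s+(?P<name>[A-Z0-9-]+)\s+VALUES?\s+(?:ARE\s+|IS\s+)?(?P<values>.+?)\.\s*$",
    re.IGNORECASE,
)


def extract_conditions(source: str) -> List[Dict[str, str]]:
    """Return one dict per 88-level condition name found in `source`."""
    rules = []
    for line in source.splitlines():
        match = LEVEL_88.match(line)
        if match:
            rules.append(
                {"name": match.group("name").upper(), "values": match.group("values")}
            )
    return rules


# Hypothetical copybook fragment.
sample = """
       05  ACCT-STATUS        PIC X.
           88  ACCT-ACTIVE    VALUE 'Y'.
           88  ACCT-CLOSED    VALUES ARE 'N' 'C'.
"""

rules = extract_conditions(sample)
```

Multi-line VALUE clauses and THRU ranges would need a tokenizer rather than a per-line pattern.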
4-Phase Ingestion Pipeline
Phase 1: Foundations
Copybooks, DDL, DBD, BMS
Phase 2: Programs
COBOL, DCLGEN, PSB
Phase 3: Orchestration
JCL, CSD, ASM
Phase 4: Scheduling
CA-7, Control-M
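The phase ordering can be expressed as a sort key over file extensions. The sketch below assumes the extension-to-phase mapping implied by the parser coverage table (for example, DCLGEN as .dcl); extensions the phase list does not mention, such as .prc and .ctl, are deliberately left unmapped and sort last.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed mapping from file extension to ingestion phase.
PHASE_BY_EXTENSION = {
    # Phase 1: Foundations
    ".cpy": 1, ".ddl": 1, ".dbd": 1, ".bms": 1,
    # Phase 2: Programs
    ".cbl": 2, ".cob": 2, ".dcl": 2, ".psb": 2,
    # Phase 3: Orchestration
    ".jcl": 3, ".csd": 3, ".asm": 3,
    # Phase 4: Scheduling
    ".ca7": 4, ".controlm": 4,
}

UNASSIGNED = 5  # unmapped extensions are ingested after all known phases


def phase_of(path: str) -> int:
    return PHASE_BY_EXTENSION.get(Path(path).suffix.lower(), UNASSIGNED)


def ingestion_order(paths: Iterable[str]) -> List[str]:
    """Sort artifacts by phase; sorted() is stable, so input order is kept within a phase."""
    return sorted(paths, key=phase_of)


# Hypothetical file names.
ordered = ingestion_order(["run.jcl", "acct.cbl", "acct.cpy", "nightly.ca7"])
```

Ingesting foundations first means a program's COPY references can resolve against copybooks already in the graph.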
Data Flow Pipeline
Tracing a single COBOL program through the entire pipeline, from source to migration output.
Migration Module
Schema Derivation
Type mapping from mainframe to PostgreSQL with full fidelity.
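A minimal sketch of what such a type mapping might look like for common PICTURE clauses. The rules here are assumptions for illustration, not the module's actual mapping table: alphanumeric fields become CHAR(n), integers pick the smallest PostgreSQL integer type that holds the digit count, and fields with an implied decimal point become NUMERIC. USAGE clauses such as COMP-3 are ignored.

```python
import re


def _expand(pic: str) -> str:
    """Expand repeat counts: '9(3)' -> '999', 'X(2)' -> 'XX'."""
    return re.sub(r"(.)\((\d+)\)", lambda m: m.group(1) * int(m.group(2)), pic)


def pic_to_postgres(pic: str) -> str:
    """Map a COBOL PICTURE string to a PostgreSQL column type (illustrative rules)."""
    expanded = _expand(pic.upper().replace(" ", ""))

    if re.fullmatch(r"X+", expanded):
        return f"CHAR({len(expanded)})"

    numeric = re.fullmatch(r"S?(9+)(?:V(9+))?", expanded)
    if numeric is None:
        raise ValueError(f"unsupported PICTURE clause: {pic!r}")

    int_digits = len(numeric.group(1))
    frac_digits = len(numeric.group(2) or "")
    if frac_digits:
        return f"NUMERIC({int_digits + frac_digits}, {frac_digits})"
    if int_digits <= 4:
        return "SMALLINT"
    if int_digits <= 9:
        return "INTEGER"
    if int_digits <= 18:
        return "BIGINT"
    return f"NUMERIC({int_digits})"


amount_type = pic_to_postgres("S9(7)V99")
```

A production mapping would also need to handle edited pictures, OCCURS, and REDEFINES, which this sketch rejects or ignores.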
Architecture Recommender
Pattern mapping from mainframe constructs to cloud-native equivalents.
Codegen Orchestrator
Multi-output code generation from knowledge graph queries.
- OpenAPI 3.1 specs from interface contracts
- Alembic migration scripts from schema derivation
- FastAPI route stubs from CICS transaction mapping
- React form components from BMS map definitions
- pytest-bdd feature files from business rules
- AsyncIO workers from JCL batch flows
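To make the first output concrete, the sketch below turns a flattened COMMAREA field list into a JSON Schema object of the kind OpenAPI 3.1 accepts under `components.schemas`. The input tuple shape, the function name, and the field names are hypothetical; the real orchestrator presumably reads InterfaceContract nodes from the graph.

```python
from typing import Dict, List, Tuple

Field = Tuple[str, str, int]  # (COBOL field name, "alpha" or "numeric", length)


def commarea_to_schema(schema_name: str, fields: List[Field]) -> Dict[str, dict]:
    """Build a JSON Schema object for one interface contract."""
    properties: Dict[str, dict] = {}
    for cobol_name, kind, length in fields:
        prop = cobol_name.lower().replace("-", "_")  # ACCT-ID -> acct_id
        if kind == "alpha":
            properties[prop] = {"type": "string", "maxLength": length}
        elif kind == "numeric":
            # Unsigned integer with `length` digits.
            properties[prop] = {
                "type": "integer",
                "minimum": 0,
                "maximum": 10 ** length - 1,
            }
        else:
            raise ValueError(f"unknown field kind: {kind!r}")
    return {
        schema_name: {
            "type": "object",
            "properties": properties,
            "required": list(properties),
        }
    }


# Hypothetical COMMAREA layout.
schema = commarea_to_schema(
    "AccountLookupRequest",
    [("ACCT-ID", "numeric", 11), ("USER-ID", "alpha", 8)],
)
```

Every field is marked required because a COMMAREA is a fixed-layout buffer; whether blank fields should become optional is a design choice left open here.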
Epic Roadmap
8 epics spanning 9 sprints. COB-1 and COB-2 completed in Sprint 1. COB-3 is next.
Epic Dependencies
Parsers
Ontology
Ingestion
Migration
Codegen
BPMN
Testing
Pilot
Platform Onboarding
The COBOL Migration customer (COB project) was onboarded through the standard Blaze provisioning wizard. Five steps from zero to a fully configured workspace.
Deliverables & Next Steps
Files Delivered
| Category | Files | Lines |
|---|---|---|
| Parser implementations | 7 | ~3,200 |
| Unit test suites | 7 | ~4,100 |
| Test fixtures (CardDemo) | 35+ | ~2,800 |
| Ontology module | 4 | ~1,500 |
| Migration module | 5 | ~1,800 |
| Domain vocabulary | 1 | 252 terms |
| Configuration & setup | 6 | ~400 |
Total: ~65 files, ~14,000 lines of code and tests
Next Steps
| Action | Owner | Target |
|---|---|---|
| Ingest full CardDemo corpus into Neo4j | Engineering | Sprint 2 |
| Validate knowledge graph completeness | Engineering | Sprint 2 |
| Generate architecture blueprint | Engineering | Sprint 3 |
| PostgreSQL schema generation from VSAM | Engineering | Sprint 4 |
| OpenAPI contracts from COMMAREA | Engineering | Sprint 4 |
| Customer demo with populated graph | Delivery | Sprint 3 |
| Production pilot planning | Management | Sprint 6 |