CFDI Extractor API: Parse Mexican Invoices Automatically

Extract structured data from CFDI XML invoices with an API. RFC, total, IVA, emisor, receptor, UUID — all parsed in milliseconds. Python and Node.js examples.

March 17, 2026 · 5 min read

CFDI (Comprobante Fiscal Digital por Internet) is Mexico's mandatory digital invoice format. Businesses issue one for sales, payroll payments, and other taxable transactions. If you're building accounting software, an ERP, a fintech platform, or a tax compliance tool, you need to parse CFDIs programmatically. This guide shows you how.

What is a CFDI?

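A CFDI is a digitally signed XML file that businesses in Mexico issue for sales, payroll payments, and expenses. It contains the issuer (emisor) and recipient (receptor) with their RFCs, the line items (conceptos), the subtotal, IVA, and total, and a UUID that identifies the invoice.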

Manually parsing CFDI XML is tedious and error-prone. The Ocilar CFDI extractor handles the parsing and returns clean JSON.
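To give a sense of what manual parsing involves, here is a minimal sketch using only Python's standard library. It assumes a CFDI 4.0 document and reads a handful of attributes; the sample XML is made up and heavily trimmed, and the output keys simply mirror the field names used in this guide. A real parser also has to deal with CFDI 3.3, taxes, line items, and complementos.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Namespaces for CFDI 4.0 and the fiscal stamp (Timbre Fiscal Digital).
NS = {
    "cfdi": "http://www.sat.gob.mx/cfd/4",
    "tfd": "http://www.sat.gob.mx/TimbreFiscalDigital",
}

# Made-up, trimmed sample; a real CFDI carries many more attributes.
SAMPLE_XML = """<cfdi:Comprobante xmlns:cfdi="http://www.sat.gob.mx/cfd/4"
    xmlns:tfd="http://www.sat.gob.mx/TimbreFiscalDigital"
    Version="4.0" Fecha="2026-03-17T10:00:00" Moneda="MXN"
    SubTotal="10000.00" Total="11600.00" TipoDeComprobante="I">
  <cfdi:Emisor Rfc="AAA010101AAA" Nombre="EMPRESA DEMO SA DE CV" RegimenFiscal="601"/>
  <cfdi:Receptor Rfc="JUAJ850101ABC" Nombre="JULIO JUAREZ" UsoCFDI="G03"/>
  <cfdi:Complemento>
    <tfd:TimbreFiscalDigital UUID="6128a3d4-1234-5678-abcd-ef0123456789"/>
  </cfdi:Complemento>
</cfdi:Comprobante>"""


def parse_cfdi(xml_text: str) -> dict:
    """Pull a few header fields out of a CFDI 4.0 XML string."""
    root = ET.fromstring(xml_text)
    emisor = root.find("cfdi:Emisor", NS)
    receptor = root.find("cfdi:Receptor", NS)
    timbre = root.find("cfdi:Complemento/tfd:TimbreFiscalDigital", NS)
    return {
        # The UUID lives in the stamp, which is absent on unstamped drafts.
        "uuid": timbre.get("UUID") if timbre is not None else None,
        "emisor_rfc": emisor.get("Rfc"),
        "receptor_rfc": receptor.get("Rfc"),
        "subtotal": Decimal(root.get("SubTotal")),
        "total": Decimal(root.get("Total")),
        "moneda": root.get("Moneda"),
        "fecha": root.get("Fecha"),
    }


parsed = parse_cfdi(SAMPLE_XML)
print(parsed["uuid"], parsed["emisor_rfc"], parsed["total"])
```

Even this short version needs two namespaces and a guard for a missing stamp, which is the kind of detail that makes hand-rolled parsers fragile.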

Extracting CFDI data with Ocilar

cURL — from XML file

curl -X POST https://api.ocilar.com/api/v1/extract/cfdi \
  -H "X-API-Key: sk-your_key" \
  -F "file=@factura.xml"

# Response
{
  "uuid": "6128a3d4-1234-5678-abcd-ef0123456789",
  "emisor_rfc": "AAA010101AAA",
  "emisor_nombre": "EMPRESA DEMO SA DE CV",
  "receptor_rfc": "JUAJ850101ABC",
  "receptor_nombre": "JULIO JUAREZ",
  "subtotal": 10000.00,
  "iva": 1600.00,
  "total": 11600.00,
  "moneda": "MXN",
  "tipo_comprobante": "I",
  "fecha": "2026-03-17T10:00:00",
  "forma_pago": "03",
  "metodo_pago": "PUE",
  "uso_cfdi": "G03",
  "regimen_fiscal_emisor": "601",
  "conceptos": [
    {
      "descripcion": "Servicios de desarrollo",
      "cantidad": 1,
      "valor_unitario": 10000.00,
      "importe": 10000.00,
      "clave_prod_serv": "81161700"
    }
  ],
  "credits_used": 1
}
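Once you have the JSON, a quick sanity check on the amounts can catch mapping mistakes early. The sketch below assumes the simple case shown in the sample response, where subtotal plus IVA equals the total. Invoices with discounts, withholdings, or other taxes will not satisfy this check, so treat it as an illustration and not a general rule. The function name is hypothetical.

```python
from decimal import Decimal


def totals_are_consistent(doc: dict, tolerance: Decimal = Decimal("0.01")) -> bool:
    """Return True when subtotal + iva matches total within a tolerance."""
    # Go through str() so floats from JSON do not carry binary rounding noise.
    subtotal = Decimal(str(doc["subtotal"]))
    iva = Decimal(str(doc["iva"]))
    total = Decimal(str(doc["total"]))
    return abs(subtotal + iva - total) <= tolerance


sample = {"subtotal": 10000.00, "iva": 1600.00, "total": 11600.00}
print(totals_are_consistent(sample))
```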

Python

from ocilar import OcilarClient

client = OcilarClient(api_key="sk-your_key")

# From file path
result = client.extract_cfdi(file_path="factura.xml")

print(result.uuid)           # "6128a3d4-..."
print(result.emisor_rfc)     # "AAA010101AAA"
print(result.total)          # 11600.00
print(result.iva)            # 1600.00

# From bytes (e.g. downloaded from SAT portal)
with open("factura.xml", "rb") as f:
    result = client.extract_cfdi(file_bytes=f.read())

# Access line items
for concepto in result.conceptos:
    print(concepto.descripcion, concepto.importe)

Node.js

import { OcilarClient } from '@ocilar/sdk'
import { readFileSync } from 'fs'

const client = new OcilarClient({ apiKey: 'sk-your_key' })

const result = await client.extractCfdi({
  file: readFileSync('factura.xml')
})

console.log(result.uuid)         // "6128a3d4-..."
console.log(result.total)        // 11600.00
console.log(result.emisorRfc)    // "AAA010101AAA"

// Line items
result.conceptos.forEach(c => {
  console.log(c.descripcion, c.importe)
})

Bulk processing — multiple CFDIs

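import asyncio
import glob

from ocilar import AsyncOcilarClient

async def process_cfdi_folder(folder: str):
    client = AsyncOcilarClient(api_key="sk-your_key")
    xml_files = sorted(glob.glob(f"{folder}/*.xml"))

    # Start one extraction per file and wait for all of them.
    # return_exceptions=True keeps one bad file from aborting the whole batch.
    tasks = [client.extract_cfdi(file_path=f) for f in xml_files]
    results = await asyncio.gather(*tasks, return_exceptions=True)

    for path, result in zip(xml_files, results):
        if isinstance(result, Exception):
            print(f"{path}: failed ({result})")
            continue
        print(f"{result.uuid} | {result.emisor_rfc} | ${result.total} MXN")

asyncio.run(process_cfdi_folder("./facturas"))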
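For large folders you may want to cap how many requests are in flight at once. The following is a minimal sketch of that idea using asyncio.Semaphore. The helper name gather_limited and the stand-in fake_extract are hypothetical and only there to make the example runnable without network access; nothing here is part of the Ocilar SDK.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")


async def gather_limited(
    worker: Callable[[str], Awaitable[T]],
    paths: Iterable[str],
    limit: int = 5,
) -> List[T]:
    """Run worker(path) for every path with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(path: str) -> T:
        async with semaphore:
            return await worker(path)

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(run_one(p) for p in paths))


# Hypothetical stand-in for a real extraction call.
async def fake_extract(path: str) -> str:
    await asyncio.sleep(0.01)
    return path.upper()


print(asyncio.run(gather_limited(fake_extract, ["a.xml", "b.xml", "c.xml"], limit=2)))
```

With the async client from this guide, the worker could be something like `lambda p: client.extract_cfdi(file_path=p)`. A sensible limit depends on your plan's rate limits, which this guide does not specify.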

Common use cases

Accounting software

Automatically import and categorize expenses from CFDI XMLs received from suppliers. Eliminate manual data entry for accountants.
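As a rough illustration of categorization, the sketch below groups line items by the first two digits of clave_prod_serv and sums their amounts. The prefix-to-category mapping is invented for this example; your chart of accounts will need its own rules.

```python
from collections import defaultdict
from decimal import Decimal

# Illustrative mapping only; these category names are not an official catalog.
CATEGORY_BY_PREFIX = {
    "81": "Professional services",
    "43": "IT equipment",
    "44": "Office supplies",
}


def categorize(conceptos: list) -> dict:
    """Sum line-item amounts per category, keyed on the product code prefix."""
    totals = defaultdict(Decimal)
    for concepto in conceptos:
        prefix = str(concepto["clave_prod_serv"])[:2]
        category = CATEGORY_BY_PREFIX.get(prefix, "Uncategorized")
        totals[category] += Decimal(str(concepto["importe"]))
    return dict(totals)


conceptos = [
    {"descripcion": "Servicios de desarrollo", "importe": 10000.00, "clave_prod_serv": "81161700"},
]
print(categorize(conceptos))
```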

ERP integration

Sync vendor invoices directly into SAP, Oracle, or custom ERPs by parsing CFDIs and mapping fields to your data model.
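Field mapping might look something like the sketch below, which converts the JSON response into a typed record. The VendorInvoice class and its field names are hypothetical placeholders for whatever your ERP expects; only the input keys come from the sample response in this guide.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class VendorInvoice:
    """Hypothetical ERP-side record; rename the fields to fit your data model."""
    external_id: str
    vendor_tax_id: str
    vendor_name: str
    issued_at: datetime
    currency: str
    net_amount: Decimal
    tax_amount: Decimal
    gross_amount: Decimal


def to_vendor_invoice(doc: dict) -> VendorInvoice:
    """Map an extraction response (as a dict) onto the ERP record."""
    return VendorInvoice(
        external_id=doc["uuid"],
        vendor_tax_id=doc["emisor_rfc"],
        vendor_name=doc["emisor_nombre"],
        issued_at=datetime.fromisoformat(doc["fecha"]),
        currency=doc["moneda"],
        net_amount=Decimal(str(doc["subtotal"])),
        tax_amount=Decimal(str(doc["iva"])),
        gross_amount=Decimal(str(doc["total"])),
    )


response = {
    "uuid": "6128a3d4-1234-5678-abcd-ef0123456789",
    "emisor_rfc": "AAA010101AAA",
    "emisor_nombre": "EMPRESA DEMO SA DE CV",
    "fecha": "2026-03-17T10:00:00",
    "moneda": "MXN",
    "subtotal": 10000.00,
    "iva": 1600.00,
    "total": 11600.00,
}
invoice = to_vendor_invoice(response)
print(invoice.vendor_tax_id, invoice.issued_at.date(), invoice.gross_amount)
```

Using the UUID as the external ID gives the ERP a natural key for rejecting invoices it has already imported.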

Tax compliance validation

Validate that received CFDIs contain the correct RFC, fiscal regime, and CFDI usage code before paying invoices.
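A pre-payment check along those lines could be sketched as follows. The function name and the allowed-value sets are assumptions for this example. The regular expression only checks the general shape of an RFC (three or four letters, a six-digit date, a three-character homoclave); it does not verify the check digit or whether the RFC is registered with SAT.

```python
import re

# Shape check only; not proof that the RFC exists or is active.
RFC_PATTERN = re.compile(r"^[A-ZÑ&]{3,4}[0-9]{6}[A-Z0-9]{3}$")


def find_problems(doc: dict, our_rfc: str, allowed_uso_cfdi: set, allowed_regimenes: set) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not RFC_PATTERN.match(doc.get("emisor_rfc", "")):
        problems.append("emisor_rfc is not a well-formed RFC")
    if doc.get("receptor_rfc") != our_rfc:
        problems.append("receptor_rfc does not match our RFC")
    if doc.get("uso_cfdi") not in allowed_uso_cfdi:
        problems.append("uso_cfdi is not in the allowed list")
    if doc.get("regimen_fiscal_emisor") not in allowed_regimenes:
        problems.append("regimen_fiscal_emisor is not in the allowed list")
    return problems


received = {
    "emisor_rfc": "AAA010101AAA",
    "receptor_rfc": "JUAJ850101ABC",
    "uso_cfdi": "G03",
    "regimen_fiscal_emisor": "601",
}
print(find_problems(received, "JUAJ850101ABC", {"G01", "G03"}, {"601", "612"}))
```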

Expense management platforms

Let employees upload CFDIs as expense receipts and auto-extract the amount, vendor, date, and tax breakdown.

Pricing

CFDI extraction costs $0.05 per document on pay-as-you-go (PAYG), or as little as $0.020 per document on volume plans.

Plan         Documents/month   Price/doc
PAYG         Unlimited         $0.05
Starter      1,000             $0.049
Pro          5,000             $0.040
Business     25,000            $0.020
Enterprise   Custom            Custom
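To compare options for a given volume, a back-of-the-envelope calculation like the one below can help. It rests on an assumption this guide does not confirm: that each plan is billed as a flat monthly fee equal to its document quota times its per-document price, with no overage billing. Check the actual plan terms before relying on the numbers.

```python
from decimal import Decimal

PAYG_PRICE = Decimal("0.05")

# (monthly quota, price per document), taken from the table above.
PLANS = {
    "Starter": (1000, Decimal("0.049")),
    "Pro": (5000, Decimal("0.040")),
    "Business": (25000, Decimal("0.020")),
}


def monthly_options(docs_per_month: int) -> dict:
    """Monthly cost of every option whose quota covers the volume.

    Assumes a plan costs quota * price regardless of actual usage.
    """
    options = {"PAYG": docs_per_month * PAYG_PRICE}
    for name, (quota, price) in PLANS.items():
        if docs_per_month <= quota:
            options[name] = quota * price
    return options


def cheapest(docs_per_month: int) -> tuple:
    """Return (option name, monthly cost) for the lowest-cost option."""
    options = monthly_options(docs_per_month)
    return min(options.items(), key=lambda item: item[1])


for volume in (900, 3000, 4500):
    print(volume, cheapest(volume))
```

Under that assumption, PAYG stays cheaper at 3,000 documents a month (150 vs. 200 for Pro), while Pro wins at 4,500 (200 vs. 225).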

FAQ

What CFDI versions are supported?

Ocilar supports CFDI 3.3 and CFDI 4.0 (current SAT standard). Both are handled automatically.
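If you want to know which version a file uses before sending it, for logging or routing for example, the root element's Version attribute tells you. This is a small standard-library sketch, independent of the Ocilar SDK; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def cfdi_version(xml_text: str) -> str:
    """Return "3.3" or "4.0" based on the root Comprobante element."""
    root = ET.fromstring(xml_text)
    # Both CFDI 3.3 and 4.0 use a capitalized "Version" attribute.
    version = root.get("Version")
    if version not in {"3.3", "4.0"}:
        raise ValueError(f"unsupported or missing CFDI version: {version!r}")
    return version


v33 = '<cfdi:Comprobante xmlns:cfdi="http://www.sat.gob.mx/cfd/3" Version="3.3"/>'
v40 = '<cfdi:Comprobante xmlns:cfdi="http://www.sat.gob.mx/cfd/4" Version="4.0"/>'
print(cfdi_version(v33), cfdi_version(v40))
```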

Can I extract data from CFDI PDFs (not just XML)?

Yes. If you only have the PDF representation of a CFDI, use the Generic OCR endpoint. For best results, use the original XML whenever it is available.

Does this validate the CFDI signature with SAT?

The extractor parses the document's structure and data. It does not verify the digital signature or the fiscal stamp (timbre) with SAT, which is a separate process. Contact us if you need it.

Can I extract addenda and complementos?

Standard complementos (nomina, pagos, comercio exterior) are extracted automatically. Custom addenda are returned as raw XML in the response.
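To see which complementos a given XML carries, you can list the children of its Complemento node. The sketch below uses only the standard library and a made-up, trimmed sample; it is not how the extractor works internally, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up, trimmed sample with a payments complement and a fiscal stamp.
SAMPLE_WITH_COMPLEMENTOS = """<cfdi:Comprobante xmlns:cfdi="http://www.sat.gob.mx/cfd/4"
    xmlns:pago20="http://www.sat.gob.mx/Pagos20"
    xmlns:tfd="http://www.sat.gob.mx/TimbreFiscalDigital" Version="4.0">
  <cfdi:Complemento>
    <pago20:Pagos Version="2.0"/>
    <tfd:TimbreFiscalDigital UUID="6128a3d4-1234-5678-abcd-ef0123456789"/>
  </cfdi:Complemento>
</cfdi:Comprobante>"""


def list_complementos(xml_text: str) -> list:
    """Return the local names of the elements inside cfdi:Complemento."""
    root = ET.fromstring(xml_text)
    names = []
    for element in root.iter():
        # ElementTree tags look like "{namespace}LocalName".
        if element.tag.endswith("}Complemento"):
            for child in element:
                names.append(child.tag.rsplit("}", 1)[-1])
    return names


print(list_complementos(SAMPLE_WITH_COMPLEMENTOS))
```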

Try Ocilar free

1,000 free solves. No credit card required.

Get API Key