CFDI (Comprobante Fiscal Digital por Internet) is Mexico's mandatory digital invoice format. Businesses issue one for sales, payroll payments, and other taxable transactions. If you're building accounting software, an ERP, a fintech platform, or a tax compliance tool, you will need to parse CFDIs programmatically. This guide shows you how.
What is a CFDI?
A CFDI is a digitally signed XML file issued by businesses for every sale, payroll payment, or expense in Mexico. It contains:
- UUID — unique fiscal identifier (folio fiscal)
- Emisor RFC — seller's tax ID
- Receptor RFC — buyer's tax ID
- Total, subtotal, IVA — amounts and taxes
- Fecha — invoice date and time
- Conceptos — line items
- Sello SAT — SAT digital signature for validation
Manually parsing CFDI XML is tedious and error-prone. The Ocilar CFDI extractor handles the parsing and returns clean JSON.
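For a sense of what manual parsing involves, here is a minimal sketch that uses only Python's standard library. It assumes a CFDI 4.0 document and reads just a few of the fields listed above. The sample XML is made up and heavily trimmed, and `parse_cfdi` is an illustrative helper, not part of any SDK. A real CFDI also carries tax nodes, certificates, seals, and complementos that this sketch ignores.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Namespaces for CFDI 4.0 and the SAT stamp (TimbreFiscalDigital).
NS = {
    "cfdi": "http://www.sat.gob.mx/cfd/4",
    "tfd": "http://www.sat.gob.mx/TimbreFiscalDigital",
}

# Made-up, heavily trimmed CFDI 4.0 document (illustration only, not a valid invoice).
SAMPLE_XML = """\
<cfdi:Comprobante xmlns:cfdi="http://www.sat.gob.mx/cfd/4"
    xmlns:tfd="http://www.sat.gob.mx/TimbreFiscalDigital"
    Version="4.0" Fecha="2026-03-17T10:00:00" Moneda="MXN"
    SubTotal="10000.00" Total="11600.00" TipoDeComprobante="I">
  <cfdi:Emisor Rfc="AAA010101AAA" Nombre="EMPRESA DEMO SA DE CV" RegimenFiscal="601"/>
  <cfdi:Receptor Rfc="JUAJ850101ABC" Nombre="JULIO JUAREZ" UsoCFDI="G03"/>
  <cfdi:Conceptos>
    <cfdi:Concepto ClaveProdServ="81161700" Cantidad="1"
        Descripcion="Servicios de desarrollo"
        ValorUnitario="10000.00" Importe="10000.00"/>
  </cfdi:Conceptos>
  <cfdi:Complemento>
    <tfd:TimbreFiscalDigital UUID="6128a3d4-1234-5678-abcd-ef0123456789"/>
  </cfdi:Complemento>
</cfdi:Comprobante>
"""


def parse_cfdi(xml_text: str) -> dict:
    """Extract a handful of header fields and the line items from CFDI 4.0 XML."""
    root = ET.fromstring(xml_text)
    emisor = root.find("cfdi:Emisor", NS)
    receptor = root.find("cfdi:Receptor", NS)
    timbre = root.find("cfdi:Complemento/tfd:TimbreFiscalDigital", NS)
    return {
        # The UUID lives in the stamp, which is absent on unstamped documents.
        "uuid": timbre.get("UUID") if timbre is not None else None,
        "emisor_rfc": emisor.get("Rfc"),
        "receptor_rfc": receptor.get("Rfc"),
        "fecha": root.get("Fecha"),
        "subtotal": Decimal(root.get("SubTotal")),
        "total": Decimal(root.get("Total")),
        "conceptos": [
            {
                "descripcion": c.get("Descripcion"),
                "importe": Decimal(c.get("Importe")),
            }
            for c in root.findall("cfdi:Conceptos/cfdi:Concepto", NS)
        ],
    }


parsed = parse_cfdi(SAMPLE_XML)
print(parsed["uuid"], parsed["emisor_rfc"], parsed["total"])
```

Even this small version has to know two namespaces, the exact attribute casing, and where the stamp is nested, which is why hand-rolled parsers tend to break on real-world documents.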
Extracting CFDI data with Ocilar
cURL — from XML file
curl -X POST https://api.ocilar.com/api/v1/extract/cfdi \
  -H "X-API-Key: sk-your_key" \
  -F "file=@factura.xml"
# Response
{
  "uuid": "6128a3d4-1234-5678-abcd-ef0123456789",
  "emisor_rfc": "AAA010101AAA",
  "emisor_nombre": "EMPRESA DEMO SA DE CV",
  "receptor_rfc": "JUAJ850101ABC",
  "receptor_nombre": "JULIO JUAREZ",
  "subtotal": 10000.00,
  "iva": 1600.00,
  "total": 11600.00,
  "moneda": "MXN",
  "tipo_comprobante": "I",
  "fecha": "2026-03-17T10:00:00",
  "forma_pago": "03",
  "metodo_pago": "PUE",
  "uso_cfdi": "G03",
  "regimen_fiscal_emisor": "601",
  "conceptos": [
    {
      "descripcion": "Servicios de desarrollo",
      "cantidad": 1,
      "valor_unitario": 10000.00,
      "importe": 10000.00,
      "clave_prod_serv": "81161700"
    }
  ],
  "credits_used": 1
}
Python
from ocilar import OcilarClient
client = OcilarClient(api_key="sk-your_key")
# From file path
result = client.extract_cfdi(file_path="factura.xml")
print(result.uuid)        # "6128a3d4-..."
print(result.emisor_rfc)  # "AAA010101AAA"
print(result.total)       # 11600.00
print(result.iva)         # 1600.00
# From bytes (e.g. downloaded from the SAT portal)
with open("factura.xml", "rb") as f:
    result = client.extract_cfdi(file_bytes=f.read())
# Access line items
for concepto in result.conceptos:
    print(concepto.descripcion, concepto.importe)
Node.js
import { OcilarClient } from '@ocilar/sdk'
import { readFileSync } from 'fs'
const client = new OcilarClient({ apiKey: 'sk-your_key' })
const result = await client.extractCfdi({
  file: readFileSync('factura.xml')
})
console.log(result.uuid)      // "6128a3d4-..."
console.log(result.total)     // 11600.00
console.log(result.emisorRfc) // "AAA010101AAA"
// Line items
result.conceptos.forEach(c => {
  console.log(c.descripcion, c.importe)
})
Bulk processing — multiple CFDIs
import asyncio
import glob
from ocilar import AsyncOcilarClient
async def process_cfdi_folder(folder: str):
    client = AsyncOcilarClient(api_key="sk-your_key")
    xml_files = glob.glob(f"{folder}/*.xml")
    # Start one extraction per file; asyncio.gather runs them concurrently
    # and returns the results in the same order as xml_files.
    tasks = [client.extract_cfdi(file_path=f) for f in xml_files]
    results = await asyncio.gather(*tasks)
    for result in results:
        print(f"{result.uuid} | {result.emisor_rfc} | ${result.total} MXN")
asyncio.run(process_cfdi_folder("./facturas"))
Common use cases
Accounting software
Automatically import and categorize expenses from CFDI XMLs received from suppliers. Eliminate manual data entry for accountants.
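As a rough illustration of the import step, the sketch below totals supplier invoices per issuer RFC. It assumes extraction results are available as plain dicts with the same keys as the sample response shown earlier. The `extracted` list and the `totals_by_supplier` helper are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical extraction results; keys follow the sample response shown earlier.
extracted = [
    {"emisor_rfc": "AAA010101AAA", "tipo_comprobante": "I", "total": 11600.00},
    {"emisor_rfc": "AAA010101AAA", "tipo_comprobante": "I", "total": 2320.00},
    {"emisor_rfc": "BBB020202BB1", "tipo_comprobante": "I", "total": 580.00},
    {"emisor_rfc": "BBB020202BB1", "tipo_comprobante": "P", "total": 0},
]


def totals_by_supplier(cfdis):
    """Sum invoice totals per issuer RFC, counting only type "I" (Ingreso) CFDIs."""
    totals = defaultdict(Decimal)
    for cfdi in cfdis:
        if cfdi["tipo_comprobante"] != "I":
            continue  # skip payment receipts, payroll, credit notes, etc.
        # Go through str() so binary float noise does not leak into the Decimal.
        totals[cfdi["emisor_rfc"]] += Decimal(str(cfdi["total"]))
    return dict(totals)


for rfc, total in totals_by_supplier(extracted).items():
    print(f"{rfc}: {total:.2f} MXN")
```

Filtering on `tipo_comprobante` keeps payment complements (type "P") from being counted as expenses a second time.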
ERP integration
Sync vendor invoices directly into SAP, Oracle, or custom ERPs by parsing CFDIs and mapping fields to your data model.
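The field mapping can be sketched as a small translation function. The `VendorInvoice` record below is a hypothetical ERP schema, not something SAP or Oracle defines, and the input is assumed to be a dict with the keys from the sample response shown earlier.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass
class VendorInvoice:
    """Hypothetical ERP record; rename fields to match your own schema."""
    external_id: str
    vendor_tax_id: str
    vendor_name: str
    issued_at: datetime
    currency: str
    net_amount: Decimal
    tax_amount: Decimal
    gross_amount: Decimal


def to_vendor_invoice(cfdi: dict) -> VendorInvoice:
    """Map an extraction result (as a dict) onto the ERP record."""
    return VendorInvoice(
        external_id=cfdi["uuid"],
        vendor_tax_id=cfdi["emisor_rfc"],
        vendor_name=cfdi["emisor_nombre"],
        issued_at=datetime.fromisoformat(cfdi["fecha"]),
        currency=cfdi["moneda"],
        # Convert via str() to avoid carrying float rounding into Decimal.
        net_amount=Decimal(str(cfdi["subtotal"])),
        tax_amount=Decimal(str(cfdi["iva"])),
        gross_amount=Decimal(str(cfdi["total"])),
    )


sample = {
    "uuid": "6128a3d4-1234-5678-abcd-ef0123456789",
    "emisor_rfc": "AAA010101AAA",
    "emisor_nombre": "EMPRESA DEMO SA DE CV",
    "fecha": "2026-03-17T10:00:00",
    "moneda": "MXN",
    "subtotal": 10000.00,
    "iva": 1600.00,
    "total": 11600.00,
}
invoice = to_vendor_invoice(sample)
print(invoice)
```

Using the UUID as the external ID gives the ERP a natural key for detecting duplicate imports.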
Tax compliance validation
Validate that received CFDIs contain the correct RFC, fiscal regime, and CFDI usage code before paying invoices.
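A pre-payment check along these lines might look like the sketch below. It runs on a dict with the keys from the sample response shown earlier. `validate_cfdi` and its parameters are illustrative names. The RFC pattern checks structure only, and the arithmetic check assumes an invoice with IVA and no discounts or withholdings.

```python
import re
from decimal import Decimal

# Structural RFC check only: 3 letters (companies) or 4 (individuals),
# a 6-digit date, and a 3-character suffix. It does not verify the check
# digit or whether the RFC is registered with SAT.
RFC_PATTERN = re.compile(r"^[A-ZÑ&]{3,4}[0-9]{6}[A-Z0-9]{3}$")


def validate_cfdi(cfdi, expected_receptor_rfc, allowed_uso_cfdi):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not RFC_PATTERN.match(cfdi.get("emisor_rfc", "")):
        problems.append("emisor_rfc is not a well-formed RFC")
    if cfdi.get("receptor_rfc") != expected_receptor_rfc:
        problems.append("receptor_rfc does not match our RFC")
    if not cfdi.get("regimen_fiscal_emisor"):
        problems.append("regimen_fiscal_emisor is missing")
    if cfdi.get("uso_cfdi") not in allowed_uso_cfdi:
        problems.append("uso_cfdi is not one we accept")
    # Simplified arithmetic check: ignores discounts, withholdings, other taxes.
    subtotal, iva, total = (
        Decimal(str(cfdi[key])) for key in ("subtotal", "iva", "total")
    )
    if subtotal + iva != total:
        problems.append("subtotal + iva does not equal total")
    return problems


sample = {
    "emisor_rfc": "AAA010101AAA",
    "receptor_rfc": "JUAJ850101ABC",
    "regimen_fiscal_emisor": "601",
    "uso_cfdi": "G03",
    "subtotal": 10000.00,
    "iva": 1600.00,
    "total": 11600.00,
}
print(validate_cfdi(sample, "JUAJ850101ABC", {"G03"}))
```

Returning a list of problems rather than raising on the first failure lets an accounts-payable reviewer see every issue with an invoice at once.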
Expense management platforms
Let employees upload CFDIs as expense receipts and auto-extract the amount, vendor, date, and tax breakdown.
Pricing
CFDI extraction costs $0.05 per document on the pay-as-you-go (PAYG) plan, or from $0.020 per document on volume plans.
| Plan | Documents/month | Price/doc |
|---|---|---|
| PAYG | Unlimited | $0.05 |
| Starter | 1,000 | $0.049 |
| Pro | 5,000 | $0.040 |
| Business | 25,000 | $0.020 |
| Enterprise | Custom | Custom |
FAQ
What CFDI versions are supported?
Ocilar supports CFDI 3.3 and CFDI 4.0 (current SAT standard). Both are handled automatically.
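If you want to check a file's version yourself before sending it, the root element carries both signals. The sketch below is a standalone helper with a hypothetical name, not a description of how Ocilar detects versions. It relies on the root `Comprobante` namespace and its `Version` attribute.

```python
import xml.etree.ElementTree as ET

# Root-element namespaces mapped to the CFDI version expected under each.
SUPPORTED = {
    "http://www.sat.gob.mx/cfd/3": "3.3",
    "http://www.sat.gob.mx/cfd/4": "4.0",
}


def detect_cfdi_version(xml_text: str) -> str:
    """Return "3.3" or "4.0", or raise ValueError for anything else."""
    root = ET.fromstring(xml_text)
    # ElementTree renders a namespaced tag as "{namespace}LocalName".
    namespace, _, local_name = root.tag.lstrip("{").partition("}")
    if local_name != "Comprobante" or namespace not in SUPPORTED:
        raise ValueError(f"not a CFDI root element: {root.tag}")
    version = root.get("Version")
    if version != SUPPORTED[namespace]:
        raise ValueError(f"unsupported CFDI version: {version!r}")
    return version


v4 = '<cfdi:Comprobante xmlns:cfdi="http://www.sat.gob.mx/cfd/4" Version="4.0"/>'
v3 = '<cfdi:Comprobante xmlns:cfdi="http://www.sat.gob.mx/cfd/3" Version="3.3"/>'
print(detect_cfdi_version(v4), detect_cfdi_version(v3))
```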
Can I extract data from CFDI PDFs (not just XML)?
Yes. If you only have the PDF representation of a CFDI, use the Generic OCR endpoint. The XML is the authoritative record, so prefer it whenever it is available.
Does this validate the CFDI signature with SAT?
No. The extractor parses the document structure and data. It does not verify the digital seals applied during stamping (timbrado) or check the CFDI's status with SAT. That validation is a separate process; contact us if you need it.
Can I extract addenda and complementos?
Standard complementos (nomina, pagos, comercio exterior) are extracted automatically. Custom addenda are returned as raw XML in the response.