Mexico's regulatory landscape requires businesses to process several types of government-issued documents: voter IDs (INE), digital invoices (CFDI), tax certificates (CSF), and population registry documents (CURP). Manually extracting data from these documents is slow and error-prone. This guide covers how to automate extraction for all of them with a single API.
Supported Mexican documents
| Document | Spanish name | Key fields extracted | Common use case |
|---|---|---|---|
| INE / IFE Voter ID | Credencial para votar | Nombre, CURP, fecha nacimiento, domicilio, vigencia | KYC, identity verification |
| CFDI Invoice (XML) | Comprobante Fiscal Digital | UUID, emisor RFC, receptor RFC, total, IVA, conceptos | Accounting, ERP, expense management |
| CSF Certificate | Constancia de Situación Fiscal | RFC, régimen fiscal, domicilio fiscal, actividades | Vendor onboarding, supplier validation |
| CURP Card | Cédula de CURP | CURP, nombre, fecha nacimiento, entidad de nacimiento | HR onboarding, benefits registration |
| Mexican Passport | Pasaporte mexicano | Nombre, número de pasaporte, fecha vencimiento, MRZ | International KYC, travel |
Extract INE voter ID data
Python
from ocilar import OcilarClient
client = OcilarClient(api_key="sk-your_key")
result = client.extract_ine(
    front="ine_front.jpg",
    back="ine_back.jpg",
)
print(result.nombre) # "JULIO JUAREZ MARTINEZ"
print(result.curp) # "JUAJ850101HMCRRL09"
print(result.fecha_nacimiento) # "1985-01-01"
print(result.domicilio) # "CALLE INDEPENDENCIA 123..."
print(result.vigencia) # "2029"
# Check if expired
from datetime import datetime
is_valid = int(result.vigencia) >= datetime.now().year
cURL
curl -X POST https://api.ocilar.com/api/v1/extract/ine \
  -H "X-API-Key: sk-your_key" \
  -F "front=@ine_front.jpg" \
  -F "back=@ine_back.jpg"
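The expiry check above casts `vigencia` straight to `int`, which only works when the field is a bare four-digit year. As a minimal sketch, assuming the value could also arrive as a range such as "2019-2029" (an assumption, not documented API behaviour), the hypothetical helpers below take the last year in the string before comparing. Neither function is part of the SDK.

```python
import re
from datetime import date
from typing import Optional


def vigencia_end_year(vigencia: str) -> int:
    """Return the last four-digit year found in a vigencia string."""
    years = re.findall(r"\d{4}", vigencia)
    if not years:
        raise ValueError(f"No year found in vigencia: {vigencia!r}")
    return int(years[-1])


def ine_is_current(vigencia: str, today: Optional[date] = None) -> bool:
    """True if the card's final validity year has not passed yet."""
    today = today or date.today()
    return vigencia_end_year(vigencia) >= today.year
```

Passing `today` explicitly keeps the check deterministic in unit tests. In production you would call `ine_is_current(result.vigencia)` and let it default to the current date.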
Extract CFDI invoice data
Python
result = client.extract_cfdi(file_path="factura.xml")
print(result.uuid) # "6128a3d4-1234-..."
print(result.emisor_rfc) # "AAA010101AAA"
print(result.total) # 11600.00
print(result.iva) # 1600.00
for concepto in result.conceptos:
    print(concepto.descripcion, concepto.importe)
From bytes (e.g. downloaded from SAT portal)
with open("factura.xml", "rb") as f:
    result = client.extract_cfdi(file_bytes=f.read())
# Supports CFDI 3.3 and 4.0 automatically
Extract CSF (Constancia de Situación Fiscal)
Python
result = client.extract_csf(file_path="csf.pdf")
print(result.rfc)               # "JUAJ850101ABC"
print(result.nombre)            # "JULIO JUAREZ MARTINEZ"
print(result.regimen_fiscal)    # "612 - Personas Físicas con Actividades Empresariales"
print(result.domicilio_fiscal)  # "CALLE INDEPENDENCIA 123..."
print(result.codigo_postal)     # "44100"
print(result.actividades)       # ["Desarrollo de software", ...]
print(result.fecha_inicio_ops)  # "2018-03-01"
Node.js
import { OcilarClient } from '@ocilar/sdk'
import { readFileSync } from 'fs'
const client = new OcilarClient({ apiKey: 'sk-your_key' })
const result = await client.extractCsf({
  file: readFileSync('csf.pdf')
})
console.log(result.rfc)
console.log(result.regimenFiscal)
console.log(result.domicilioFiscal)
Extract CURP card
from ocilar import OcilarClient
client = OcilarClient(api_key="sk-your_key")
result = client.extract_curp(file_path="curp.pdf")
print(result.curp)              # "JUAJ850101HMCRRL09"
print(result.nombre_completo)   # "JULIO JUAREZ MARTINEZ"
print(result.fecha_nacimiento)  # "1985-01-01"
print(result.entidad)           # "JALISCO"
print(result.sexo)              # "H"
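Once you have the extracted `curp` and `fecha_nacimiento`, a cheap local sanity check can catch OCR slips before the data reaches your database. The sketch below is a simplified structural check, not an official validator: the pattern reflects the commonly described 18-character layout, the function name is hypothetical, and it does not verify the final check digit.

```python
import re
from datetime import date

# Assumed layout: 4 letters, YYMMDD, sex letter, 2-letter state code,
# 3 internal consonants, 1 alphanumeric differentiator, 1 check digit.
CURP_PATTERN = re.compile(
    r"^[A-Z]{4}(\d{2})(\d{2})(\d{2})[HMX][A-Z]{2}"
    r"[B-DF-HJ-NP-TV-Z]{3}[A-Z0-9]\d$"
)


def curp_matches_birth_date(curp: str, fecha_nacimiento: str) -> bool:
    """Check CURP structure and that its YYMMDD digits match an ISO date."""
    match = CURP_PATTERN.match(curp.strip().upper())
    if match is None:
        return False  # wrong length or characters in the wrong positions
    born = date.fromisoformat(fecha_nacimiento)  # raises ValueError if malformed
    expected = (f"{born.year % 100:02d}", f"{born.month:02d}", f"{born.day:02d}")
    return match.groups() == expected
```

With the sample values shown above, `curp_matches_birth_date("JUAJ850101HMCRRL09", "1985-01-01")` returns `True`, while a date that is off by one day returns `False`.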
Multi-document onboarding flow
A complete KYC onboarding flow for a Mexican fintech typically requires INE + CSF (for business customers) or INE + CURP (for individuals). Here's a complete example:
from ocilar import OcilarClient
from datetime import datetime

client = OcilarClient(api_key="sk-your_key")

def kyc_onboard_individual(ine_front: str, ine_back: str, curp_pdf: str) -> dict:
    """Full KYC extraction for individual customers."""
    # Extract INE
    ine = client.extract_ine(front=ine_front, back=ine_back)
    # Validate INE not expired
    if int(ine.vigencia) < datetime.now().year:
        raise ValueError(f"INE expired in {ine.vigencia}")
    # Extract CURP to cross-validate
    curp = client.extract_curp(file_path=curp_pdf)
    # Cross-check: CURP on INE should match CURP document
    if ine.curp != curp.curp:
        raise ValueError("CURP mismatch between INE and CURP document")
    return {
        "nombre": ine.nombre,
        "curp": ine.curp,
        "fecha_nacimiento": ine.fecha_nacimiento,
        "domicilio": ine.domicilio,
        "estado": ine.estado,
        "ine_vigencia": ine.vigencia,
        "verified": True
    }

def kyc_onboard_business(ine_front: str, ine_back: str, csf_pdf: str) -> dict:
    """Full KYC extraction for business customers (persona moral/fisica con actividad empresarial)."""
    rep_legal = client.extract_ine(front=ine_front, back=ine_back)
    empresa = client.extract_csf(file_path=csf_pdf)
    return {
        "representante_legal": rep_legal.nombre,
        "rfc_empresa": empresa.rfc,
        "razon_social": empresa.nombre,
        "regimen_fiscal": empresa.regimen_fiscal,
        "domicilio_fiscal": empresa.domicilio_fiscal,
        "actividades": empresa.actividades,
        "verified": True
    }
Industry use cases
Fintech & lending (KYC)
Mexican regulation requires identity verification for any credit product. Extract INE data automatically during the loan application flow — no manual form filling for the borrower, no manual review for your team.
B2B vendor onboarding
Extract and validate supplier RFC and fiscal regime from their CSF before adding them to your accounts payable system. Ensure you're paying the correct RFC and catching mismatches early.
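To make "catching mismatches early" concrete, here is a minimal sketch of a local RFC check you could run on the extracted value before touching accounts payable. It assumes the usual RFC shape (3 letters for a persona moral or 4 for a persona física, then a six-digit date and a three-character homoclave). The function names are hypothetical and the check digit is not verified.

```python
import re
from typing import Optional

# 3 or 4 leading letters (Ñ and & are allowed), YYMMDD, 3-char homoclave.
RFC_PATTERN = re.compile(r"^([A-ZÑ&]{3,4})(\d{6})([A-Z0-9]{3})$")


def rfc_kind(rfc: str) -> Optional[str]:
    """Return 'moral' (12 chars), 'fisica' (13 chars), or None if malformed."""
    match = RFC_PATTERN.match(rfc.strip().upper())
    if match is None:
        return None
    return "moral" if len(match.group(1)) == 3 else "fisica"


def rfc_matches_record(extracted_rfc: str, rfc_on_file: str) -> bool:
    """True only if the extracted RFC is well-formed and equals the one on file."""
    extracted = extracted_rfc.strip().upper()
    on_file = rfc_on_file.strip().upper()
    return rfc_kind(extracted) is not None and extracted == on_file
```

A typical use is `rfc_matches_record(empresa.rfc, vendor_record_rfc)`, where `vendor_record_rfc` stands in for whatever your vendor master stores. A `False` result is a signal to hold the payment, not proof of fraud.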
Accounting software & ERPs
Parse CFDIs received from vendors automatically. Extract UUID, amounts, line items, and fiscal data directly into your accounting module without copy-pasting from PDFs.
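Before posting an extracted invoice to the ledger, it can help to confirm that the numbers add up. The sketch below assumes the simplest case, where line-item amounts plus IVA equal the total, with no discounts, withholdings, or other taxes. The function name and the one-cent tolerance are illustrative choices, not part of the API.

```python
from decimal import Decimal
from typing import Iterable, Union

Number = Union[int, float, str, Decimal]


def cfdi_totals_reconcile(
    importes: Iterable[Number],
    iva: Number,
    total: Number,
    tolerance: Decimal = Decimal("0.01"),
) -> bool:
    """True if sum(importes) + iva equals total within the tolerance."""
    # Convert through str() so float inputs such as 1600.0 do not carry
    # binary rounding noise into the Decimal arithmetic.
    subtotal = sum((Decimal(str(x)) for x in importes), Decimal("0"))
    expected = subtotal + Decimal(str(iva))
    return abs(expected - Decimal(str(total))) <= tolerance
```

With the extraction result you would call it as `cfdi_totals_reconcile([c.importe for c in result.conceptos], result.iva, result.total)`. Invoices that fail the check are good candidates for manual review.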
HR & payroll platforms
Onboard new employees by scanning their INE and CURP. Feed data directly into IMSS and INFONAVIT systems without manual data entry.
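When two documents are scanned for the same employee, the names rarely match character for character: accents get dropped, spacing varies, and surnames may come first on one document. A minimal sketch of a tolerant comparison follows. The helpers are hypothetical, and ignoring token order is a design assumption you may want to tighten.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Uppercase, strip accents, and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFD", name)
    # Dropping combining marks turns "Á" into "A" (and also "Ñ" into "N").
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())


def same_person_name(a: str, b: str) -> bool:
    """Compare two names ignoring accents, case, spacing, and token order."""
    return sorted(normalize_name(a).split()) == sorted(normalize_name(b).split())
```

For example, `same_person_name(ine.nombre, curp.nombre_completo)` would treat "JUAREZ MARTINEZ JULIO" and "Julio Juárez Martínez" as the same name, while a missing surname still counts as a mismatch.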
Sharing economy & gig platforms
Verify driver, delivery agent, and contractor identities with INE extraction as part of onboarding. Pair with IMSS verification to confirm employment history.
Pricing
| Document type | Price/doc | Notes |
|---|---|---|
| INE extraction | $0.05–$0.10 | Front + back count as 1 document |
| CFDI extraction | $0.05 | XML preferred; PDF also supported |
| CSF extraction | $0.08 | PDF only |
| CURP extraction | $0.04 | PDF or image |
| All document types | From $0.020/doc | Volume plans available |
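
For budgeting, the per-document prices in the table translate into a simple low/high estimate. The sketch below hard-codes the list prices shown above and does not model volume plans. The dictionary keys and function name are made up for illustration.

```python
from decimal import Decimal
from typing import Dict, Tuple

# (low, high) list price per document, copied from the pricing table.
PRICE_RANGE: Dict[str, Tuple[Decimal, Decimal]] = {
    "ine": (Decimal("0.05"), Decimal("0.10")),
    "cfdi": (Decimal("0.05"), Decimal("0.05")),
    "csf": (Decimal("0.08"), Decimal("0.08")),
    "curp": (Decimal("0.04"), Decimal("0.04")),
}


def estimate_cost(volumes: Dict[str, int]) -> Tuple[Decimal, Decimal]:
    """Return (low, high) cost for a mapping of document type to count."""
    low = sum((PRICE_RANGE[doc][0] * n for doc, n in volumes.items()), Decimal("0"))
    high = sum((PRICE_RANGE[doc][1] * n for doc, n in volumes.items()), Decimal("0"))
    return low, high
```

Onboarding 1,000 individuals with one INE and one CURP each gives `estimate_cost({"ine": 1000, "curp": 1000})`, which works out to between 90.00 and 140.00 at list price.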
FAQ
What image quality is required for INE?
Minimum 300 DPI equivalent. The card must be fully visible, well-lit, no blur, no glare. Mobile photos work well when the card fills most of the frame.
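To turn "300 DPI equivalent" into a number you can check on the client, the sketch below assumes the INE follows the standard ID-1 card size (85.60 mm × 53.98 mm) and reads the requirement as the card itself spanning at least that many pixels. Both points are assumptions rather than documented thresholds, and the function names are hypothetical.

```python
import math
from typing import Tuple

MM_PER_INCH = 25.4
ID1_WIDTH_MM = 85.60   # assumed card width (ID-1 format)
ID1_HEIGHT_MM = 53.98  # assumed card height (ID-1 format)


def min_card_pixels(dpi: int = 300) -> Tuple[int, int]:
    """Minimum (width, height) in pixels for the card area at the given DPI."""
    width = math.ceil(ID1_WIDTH_MM / MM_PER_INCH * dpi)
    height = math.ceil(ID1_HEIGHT_MM / MM_PER_INCH * dpi)
    return width, height


def meets_min_resolution(width_px: int, height_px: int, dpi: int = 300) -> bool:
    """Check a cropped card image in either orientation against the minimum."""
    need_long, need_short = min_card_pixels(dpi)
    have_long, have_short = max(width_px, height_px), min(width_px, height_px)
    return have_long >= need_long and have_short >= need_short
```

At 300 DPI this works out to roughly 1012 × 638 pixels for the card region, so a photo where the card fills most of a 1280 × 720 frame would pass, while an 800 × 600 crop would not.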
Does CFDI extraction support CFDI 3.3 and 4.0?
Yes. Both versions are handled automatically — you don't need to specify the version.
Can I extract data from scanned CSF PDFs?
Yes. The CSF extractor handles both digital PDFs (from SAT portal) and scanned copies, though digital PDFs produce higher accuracy.
Is there an SDK for Go or PHP?
Python and Node.js SDKs are available. REST API works with any language. See the full documentation.