Gorzen Engineering offers two main production-ready ingestion pipelines: the Advanced Engine and Gorzen Ingestion.
Advanced Engine: This pipeline is designed for workloads that demand enterprise-grade precision, such as complex PDFs, tables, formulas, and scanned content that needs OCR. It focuses on maximum extraction fidelity and retrieval precision, using tools such as Docling 2.70+ and EasyOCR for parsing and OCR. GPU acceleration and cross-encoder reranking are supported as optional features.
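To make the reranking step concrete, here is a minimal sketch of second-stage reranking in Python. It is not the Advanced Engine's implementation: the names `rerank`, `PairScorer`, and `token_overlap_scorer` are hypothetical, and the toy scorer only stands in for a real cross-encoder so that the example runs without downloading a model. The source does not say which reranking model the pipeline uses.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer type: takes (query, passage) pairs and returns
# one relevance score per pair, as a cross-encoder model would.
PairScorer = Callable[[Sequence[Tuple[str, str]]], Sequence[float]]


def rerank(
    query: str,
    passages: List[str],
    score_pairs: PairScorer,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Re-order first-stage candidates by a pairwise relevance score."""
    if not passages:
        return []
    # A cross-encoder scores the query and each passage jointly,
    # so every candidate is paired with the query before scoring.
    pairs = [(query, passage) for passage in passages]
    scores = score_pairs(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked[:top_k]]


def token_overlap_scorer(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Toy stand-in for a cross-encoder: share of query tokens found in the passage."""
    scores = []
    for query, passage in pairs:
        query_tokens = set(query.lower().split())
        passage_tokens = set(passage.lower().split())
        scores.append(len(query_tokens & passage_tokens) / max(len(query_tokens), 1))
    return scores


if __name__ == "__main__":
    candidates = [
        "Shipping policy overview.",
        "Total amount due",
        "The invoice total amount is listed on page 3.",
    ]
    for passage, score in rerank("invoice total amount", candidates, token_overlap_scorer, top_k=2):
        print(f"{score:.2f}  {passage}")
```

Because the scorer is passed in as a function, a real model can replace the toy one without changing `rerank`; any callable that maps a list of (query, passage) pairs to one score per pair would fit.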
Gorzen Ingestion: This pipeline is tailored for rapid, cloud-first deployments. It relies on managed APIs and ready-made components such as LangChain loaders and GPT-4o Vision, and it prioritizes fast deployment with low operational overhead.
Both pipelines share a unified vector backbone in Pinecone and write compatible records into the same index configuration, which keeps them interoperable and lets the same index scale across different use cases. The two differ in how they handle extraction: the Advanced Engine extracts deterministically and does not generate AI image descriptions, so code and formulas are extracted verbatim.
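One way to picture "compatible records" is a single record builder that both pipelines call before upserting. The sketch below is an assumption, not the documented Gorzen schema: `make_record`, `EMBEDDING_DIM`, and the metadata field names are hypothetical. Only the outer shape (`id`, `values`, `metadata`) follows the dictionary format that Pinecone accepts for upserts.

```python
from typing import Any, Dict, List

# Hypothetical dimension; a real index dimension depends on the embedding model.
EMBEDDING_DIM = 4


def make_record(
    doc_id: str,
    chunk_index: int,
    text: str,
    embedding: List[float],
    pipeline: str,
) -> Dict[str, Any]:
    """Build one vector record in the id/values/metadata shape used for upserts."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} dimensions, got {len(embedding)}"
        )
    return {
        # Deterministic ID: re-ingesting the same chunk overwrites the old record.
        "id": f"{doc_id}#{chunk_index}",
        "values": [float(x) for x in embedding],
        "metadata": {
            "doc_id": doc_id,
            "chunk_index": chunk_index,
            "text": text,
            # Hypothetical field recording which pipeline produced the record.
            "pipeline": pipeline,
        },
    }


if __name__ == "__main__":
    advanced = make_record("report-01", 0, "E = mc^2", [0.1, 0.2, 0.3, 0.4], "advanced-engine")
    managed = make_record("memo-07", 2, "Quarterly summary", [0.4, 0.3, 0.2, 0.1], "gorzen-ingestion")
    # Both records expose the same keys, so they can live in one index.
    print(advanced["metadata"].keys() == managed["metadata"].keys())
```

Keeping the embedding dimension and metadata keys identical across pipelines is what lets a single query, optionally filtered on a field such as `pipeline`, search records from both.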