Experience: 8 to 12 years
Location: Bangalore
Job Description:
Key Responsibilities
• Design, deploy and maintain automated invoice extraction pipelines using GCP Document AI.
• Develop custom model training workflows for documents with non-standard formats.
• Preprocess and upload document datasets to Cloud Storage.
• Label documents using DocAI Workbench or JSONL for training.
• Train and evaluate custom processors using AutoML or custom schema definitions.
• Integrate extracted data into downstream tools (e.g., BigQuery, ERP systems).
• Write robust, production-grade Python code for end-to-end orchestration.
• Maintain CI/CD deployment pipelines.
• Ensure secure document handling and compliance with data policies.
________________________________________
Required Skills & Experience
• Strong hands-on experience with Google Cloud Platform, especially:
  ◦ Document AI
  ◦ Cloud Storage
  ◦ IAM
  ◦ Vertex AI (preferred)
  ◦ Cloud Functions / Cloud Run
• Proficiency in Python, including Google Cloud SDK libraries
• Familiarity with OCR and schema-based information extraction
• Understanding of security best practices for handling financial documents
________________________________________
Preferred Qualifications
• Previous projects involving invoice or document parsing
• Familiarity with BigQuery for analytics/reporting