
ML Engineer
Software Engineers – Data & Accuracy (OCR, Embeddings, Recommender Systems)
About ApprovalMax
ApprovalMax is a fast-growing B2B SaaS company that helps businesses automate their approval workflows and financial controls. With a global team spanning the UK, Europe, North America, Australia, and South Africa, we build software that matters—scaling quickly to serve 20,000+ subscribers.
The Role: Domain-Driven Accuracy O&S Engineer
Our Capture product automates financial approvals by extracting structured data from thousands of invoices, bills, and procurement orders monthly using an OCR pipeline. Your mission: drastically improve the zero-touch rate—the percentage of documents requiring no manual correction—by merging forensic data science, model engineering, and production-scale pipeline fixes.
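As a rough illustration of the zero-touch rate described above, the metric can be sketched as a simple ratio. The `ProcessedDocument` record and its `manual_corrections` field are assumptions made for this example, not part of the Capture product's actual schema.

```python
from dataclasses import dataclass


@dataclass
class ProcessedDocument:
    doc_id: str
    manual_corrections: int  # hypothetical: number of fields a user had to fix


def zero_touch_rate(documents):
    """Return the share of documents that needed no manual correction."""
    if not documents:
        return 0.0
    untouched = sum(1 for d in documents if d.manual_corrections == 0)
    return untouched / len(documents)


docs = [
    ProcessedDocument("inv-001", 0),
    ProcessedDocument("inv-002", 2),
    ProcessedDocument("inv-003", 0),
    ProcessedDocument("inv-004", 1),
]
# Two of the four documents were untouched.
print(f"zero-touch rate: {zero_touch_rate(docs):.0%}")
```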
Your Impact Areas
Your work splits into:
- Accuracy Investigation & Measurement (~70%) – Systematically diagnosing root causes of errors at scale.
- ML Engineering & Model Development (~30%, scaling to 50%) – Building and deploying AI systems to correct OCR, matching, and logic errors.
Core Challenges (Initiative Backlog)
We’re focused on four systemic error categories:
- Entity Matching (~50% of fixable errors)
  - Fields extracted correctly but misaligned with accounts, suppliers, or tax codes.
  - Plans: Embedding-based similarity search, recommender systems, consensus-based deduplication.
- Pipeline Logic (~25%)
  - Tax misclassification, rounding errors, or spurious adjustments from deterministic logic.
  - Plans: Forensic tracing and design of rule-based fixes.
- OCR Extraction (~25%)
  - False positives, phantom line items, foreign characters misread.
  - Plans: Add a correction layer using LLMs, alternative OCR engines, or document correction models (HuggingFace).
- User Overrides
  - Domain ownership decisions (unfixable via automated systems).
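To make the entity-matching idea concrete, here is a minimal sketch of similarity search between an extracted supplier name and a list of known suppliers. It uses character-trigram counts as a toy stand-in for learned embeddings (a real system would more likely use a sentence-embedding model and a vector index). The supplier names and function names are invented for illustration.

```python
import math
from collections import Counter


def trigram_vector(text):
    """Toy stand-in for an embedding: counts of character trigrams."""
    padded = f"  {text.lower().strip()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(extracted_name, known_suppliers):
    """Return (score, supplier) for the most similar known supplier."""
    query = trigram_vector(extracted_name)
    scored = [(cosine(query, trigram_vector(s)), s) for s in known_suppliers]
    return max(scored)


# Hypothetical supplier list and a misspelled OCR extraction.
suppliers = ["Acme Office Supplies Ltd", "Northwind Traders", "Globex Logistics"]
score, name = best_match("ACME Office Suplies Limited", suppliers)
print(name, round(score, 2))
```

In practice a similarity threshold would decide whether the top match is accepted automatically or routed for review.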
Remote work only available to candidates based in the UK, Serbia, or Moldova.
Key Responsibilities
Accuracy Investigation & Measurement (~70% of Time)
- Forensics at Scale
  - Investigate root causes across hundreds of thousands of documents simultaneously.
  - Use production data comparisons to identify statistical outliers that explain broad failure patterns.
- Own the Accuracy Framework
  - Monitor the post-deployment impact of fixes, track false positives, and verify uplift metrics independently.
  - Detect subtle bias and validate results with stratified sampling.
- Document Tooling Augmentation
  - Improve the current LLM+SQL hybrid error-analysis methodology using analytical domain expertise.
- Post-Processing Pipeline Logic
  - Diagnose post-processing pipeline logic issues (tax misclassification, rounding, adjustments), either via C# source tracing or SQL queries into raw document history.
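The stratified sampling mentioned above can be sketched as follows: documents are grouped by a stratum key and a fixed number is drawn from each group, so rare document types are not drowned out by common ones. The `doc_type` field and the counts are assumptions for this example.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, per_stratum, seed=0):
    """Draw up to `per_stratum` records from each group defined by `key`."""
    rng = random.Random(seed)  # fixed seed keeps the audit sample reproducible
    strata = defaultdict(list)
    for record in records:
        strata[record[key]].append(record)
    sample = []
    for name in sorted(strata):
        group = strata[name]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


# Hypothetical population: heavily skewed towards invoices.
records = (
    [{"doc_type": "invoice", "id": i} for i in range(900)]
    + [{"doc_type": "bill", "id": i} for i in range(90)]
    + [{"doc_type": "purchase_order", "id": i} for i in range(10)]
)
sample = stratified_sample(records, key="doc_type", per_stratum=20)
print(len(sample))
```

A plain random sample of the same size would contain almost no purchase orders, which is why a stratified draw is the usual choice when validating uplift across document types.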
ML Engineering & Model Development (~30%, Scaling to 50%)
- Embeddings & Impact
  - Deploy supplier/vendor matching using sentence-transformers, HuggingFace, and a REST-served retrieval layer.
  - Evaluate retrieval accuracy using pgvector, FAISS, or a self-hosted vector DB optimised for financial domains.
- OCR Correction Layer
  - Test and validate OCR errors with LLM-provided confidence scores and/or structured output parsers.
  - Explore LLM grounding, MLP-driven corrected extraction, or document-processing capabilities (e.g., Docling, PDF Europe indices).
- MLOps Orchestration
  - Ship optimised Airflow + MLflow pipelines for evaluation, experiment tracking, and production rollouts.
- Recommendations from Past User Behaviour
  - Build datasets from manual override history to train pattern-recognition and explainable recommendation models.
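One common way to evaluate retrieval accuracy for supplier matching is recall@k: the fraction of queries whose correct entity appears among the top k retrieved candidates. The sketch below is independent of any particular vector store. The query and supplier IDs are made up for illustration.

```python
def recall_at_k(ranked_results, gold, k):
    """Fraction of queries whose correct entity is in the top-k results."""
    if not gold:
        return 0.0
    hits = sum(
        1
        for query, answer in gold.items()
        if answer in ranked_results.get(query, [])[:k]
    )
    return hits / len(gold)


# Hypothetical ranked candidates per query, best first.
ranked = {
    "q1": ["s3", "s1", "s7"],
    "q2": ["s2", "s9", "s4"],
    "q3": ["s8", "s5", "s6"],
}
# Hypothetical ground truth, e.g. taken from confirmed user corrections.
gold = {"q1": "s1", "q2": "s2", "q3": "s6"}

for k in (1, 2, 3):
    print(k, round(recall_at_k(ranked, gold, k), 2))
```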
Collaboration Scope
- Strategic Work with the Capture Team
  - Standup ownership, sprint compromises where needed.
- Accomplished Machine Learning Team
  - Final model architecture decisions, deployment commitments.
- Backend Systems (C# Team)
  - You design and implement; they bridge with downstream teams.
Essential Skills (Experience Requirements)


Data Investigation & Measurement
- Data Love and Truth
  - Proven rigour in building data pipelines (e.g., fraud analysis, payment reconciliation, invoice matching).
- Query Tooling
  - Real-world PostgreSQL/BigQuery experience, coupled with Python analysis and plotting (Pandas, NumPy).
ML Architecture (Structured Docs)
- Natural Text to Accuracy
  - Experience with end-to-end structured banking doc analysis, manual inspection of OCR engines, or bridging between LLM-based extraction and quantifiable validation systems.
- Modeling from Domain Tools
  - Proven delivery using OCRL, PeerProp housing (FAISS/PR-logic), ophysical constraints by context (e.g., financial-specific taxed units).
LLM Integration
- LLM+DS as a Combo Plan
  - Not notebook-based hustle; instead, structured prompt iterations and self-audits (e.g., structured output via Pydantic).
- Autonomy with Certainty
  - Can evaluate alternative training objectives, discuss sugary effects before notional alignment.
ML Infrastructure
- Deployment Hour CTF
- Productionised models for medium (on-premises or cloud, e.g. Azure Pipelines over notebook storage).
Nice to Have Proficiencies
- Premier Finance Domain Knowledge
  - Understanding of invoices, bill visions, or Quicken per domestic accounts.
- OCR APIs
  - Experience with Azure/Google Cloud, AWS Textract, or open-source models (Tesseract).
- FGE BootingOutils (e.g., LayoutLMv2, Donut model families)
- Corollary Inquiry Tooling
  - Know or build inspection pipelines (e.g., Dapcon, YaDL, csv-correction-hub).
What We Offer
- Fast-moving global startup: 20,000+ subscribers, 150+ employees worldwide across 5+ regions.
- Compensation is reviewed quarter by quarter based on owned impact metrics.
- 26 days paid time off + 1-day “celebrate ability” leave bonus.
- Office setup reimbursement & hybrid bag upgrades.
- Recognition programme (financial rewards) tied to service years & engineered success.