Insurance AI Automation

Custom, self-hosted AI for an insurance SaaS leveraging Courier.

We worked with a small business running a successful insurance SaaS. When they contacted us, they were paying ~$2,500/mo for AI and wanted to 4x their usage, so lowering their AI costs was urgent. We self-hosted a small open-source model that performed the task very well and eliminated their token-based spending.

Self-hosted AI

We self-hosted Qwen2 14B for document parsing, analysis, and data extraction. The model performed well and, with batching, runs at full precision on ~48GB of GPU memory.
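As a rough check on the memory figure, here is a minimal sketch of the arithmetic. It assumes "full precision" means 16-bit weights (2 bytes per parameter) and that the ~48GB figure is GPU memory. The 2GB runtime overhead is a hypothetical placeholder, not a measured value.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def kv_cache_budget_gb(total_gb: float, weights_gb: float,
                       overhead_gb: float = 2.0) -> float:
    """Memory left for the KV cache (i.e. for batching) after weights and overhead.

    overhead_gb is an assumed allowance for activations and runtime buffers.
    """
    return total_gb - weights_gb - overhead_gb


weights_gb = weight_memory_gb(14e9)               # 14B parameters at 16 bits
batch_budget_gb = kv_cache_budget_gb(48, weights_gb)

print(f"weights: {weights_gb:.1f} GB, left for batching: {batch_budget_gb:.1f} GB")
```

Under these assumptions the weights take about 28GB, which leaves roughly 18GB for the KV cache that batched requests share.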

Scalability

We calculated batching requirements for the client's current workload and for the planned 4x increase in usage.
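One common way to size batching is Little's law: the average number of in-flight requests equals the arrival rate times the time each request spends in the system. The sketch below illustrates that estimate. The request volume and latency are hypothetical, not the client's actual numbers, and the function name is our own.

```python
import math


def required_concurrency(requests_per_hour: float,
                         seconds_per_request: float) -> int:
    """Average concurrent requests the server must sustain (Little's law)."""
    arrival_rate = requests_per_hour / 3600.0  # requests per second
    return math.ceil(arrival_rate * seconds_per_request)


# Hypothetical workload: 1,800 requests/hour at 8 seconds per request.
current = required_concurrency(1800, 8)
scaled = required_concurrency(4 * 1800, 8)

print(f"current batch size: {current}, at 4x: {scaled}")
```

This gives an average, so a real deployment would add headroom for peak traffic and confirm that the resulting batch size fits within the available KV cache memory.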

Results

They eliminated a $2,500/mo cost, maintained performance on a core feature their users loved, and quadrupled their usage without any increase in cost.

Fine Tuning

The client wanted to optimize even further, so we fine-tuned the model with LoRA to perform the task without explicit instructions. This increased speed, significantly reduced prompt overhead, and simplified the API. It is worth noting that fine-tuning carries a significant cost and is rarely necessary with modern models.

Tech Stack

TensorRT-LLM, OpenAI-Compatible, Qwen2 14B, Python, LoRA