Authors: Shivanshu Shekhar, Tanishq Dubey, Koyel Mukherjee, Apoorv Saxena, Atharv Tyagi, Nishanth Kotla
Published on: January 29, 2024
Impact Score: 8.22
Arxiv code: arXiv:2402.01742
Summary
- What is new: A method for optimizing LLM usage costs that predicts each model's output quality and runs an optimization routine to select an LLM per query, jointly accounting for quality, latency, and cost.
- Why this is important: LLMs vary widely in both cost and performance on document processing tasks, so routing every request to the most capable model is needlessly expensive.
- What the research proposes: A predictive model for LLM output quality, an LP rounding algorithm to optimize LLM selection (sketched below), and sentence simplification plus deterministic heuristics for token reduction.
- Results: Cost reductions of 40%-90% together with quality improvements of 4%-7% on both enterprise and open-source datasets.
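The paper's exact formulation isn't reproduced in this summary; the sketch below is a minimal illustration of the LP-rounding idea, assuming a predicted quality score and a per-call cost for each (query, model) pair, a total cost budget, and SciPy's linprog solver. The function name, inputs, and rounding rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def select_llms(quality, cost, budget):
    """LP relaxation + rounding for picking one LLM per query.

    quality, cost: (n_queries, n_models) arrays of predicted output
    quality and per-call cost; budget: total cost cap.
    Returns one model index per query. Illustrative sketch only.
    """
    n, m = quality.shape
    # Maximize total predicted quality == minimize its negation.
    c_vec = -quality.ravel()
    # Single budget constraint: sum of selected costs <= budget.
    A_ub = cost.ravel()[None, :]
    b_ub = [budget]
    # Each query must pick exactly one model: sum_j x[i, j] = 1.
    A_eq = np.zeros((n, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    res = linprog(c_vec, A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=np.ones(n),
                  bounds=(0.0, 1.0), method="highs")
    x = res.x.reshape(n, m)
    # Naive rounding: keep the largest fractional weight per query.
    # A production rounding step would also repair any budget overrun.
    return x.argmax(axis=1)
```

Latency could enter the same program as an additional row of A_ub; the argmax rounding here is the simplest choice and may slightly exceed the budget, which a repair pass would correct.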
Technical Details
Technological frameworks used: LP rounding optimization, sentence simplification models, deterministic heuristics (a token-reduction sketch follows this list)
Models used: Predictive models for LLM output quality
Data used: Enterprise datasets, open-source datasets annotated for this study
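The deterministic token-reduction heuristics are not spelled out in this summary; as a rough sketch, one such heuristic might normalize whitespace and strip low-information filler words before the prompt reaches the selected model. The filler list and function name below are assumptions for illustration, not the paper's method.

```python
import re

# Illustrative filler words; the paper's actual heuristics may differ.
_FILLER = {"actually", "basically", "just", "quite", "really",
           "simply", "very"}

def shrink_prompt(text: str) -> str:
    """Deterministic token reduction: collapse whitespace runs and
    drop filler words that rarely change an LLM's answer."""
    text = re.sub(r"\s+", " ", text).strip()
    pattern = r"\b(?:" + "|".join(sorted(_FILLER)) + r")\b\s*"
    return re.sub(pattern, "", text, flags=re.IGNORECASE)

# Example: "Please  just summarize this really long report"
# becomes "Please summarize this long report".
```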
Potential Impact
Companies heavily utilizing LLMs for document processing, AI service providers, enterprises focusing on cost-efficient AI deployments
Want to implement this idea in a business?
We have generated a startup concept here: AI CostOptim.