Authors: Wanqin Ma, Chenyang Yang, Christian Kästner
Published on: November 18, 2023
Impact Score: 8.22
arXiv code: arXiv:2311.11123
Summary
- What is new: Focuses on the unique challenges of regression testing for large language model (LLM) APIs, a topic that has received little attention so far.
- Why this is important: LLM APIs are frequently updated without prior notice, which can silently degrade application performance and forces developers to constantly adapt.
- What the research proposes: Re-examines regression testing and proposes a new approach tailored to the unique characteristics of LLM APIs.
- Results: A case study on toxicity detection shows that current regression testing practices are inadequate, underscoring the need for revised testing methods.
Technical Details
Technological frameworks used: Regression testing methodology tailored for LLM APIs
Models used: Not specified, but involves analysis of LLM APIs used for toxicity detection
Data used: Case study data from testing LLM API updates on toxicity detection applications
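The regression-testing idea described above can be sketched in a few lines: run the same prompts against two versions of an LLM API and report any cases whose labels changed after the update. The `call_llm_api` function below is a hypothetical stand-in (not the paper's implementation); it is stubbed with a keyword rule so the example runs offline, whereas a real test would call the provider's endpoint before and after an update.

```python
# Minimal sketch of regression testing an LLM API across versions.
# `call_llm_api` is a hypothetical stand-in for a real LLM API call,
# stubbed here so the example runs without network access.

def call_llm_api(text: str, version: str) -> str:
    """Hypothetical toxicity classifier; a real test would query the API."""
    toxic_words = {"v1": {"idiot"}, "v2": {"idiot", "stupid"}}[version]
    return "toxic" if any(w in text.lower() for w in toxic_words) else "non-toxic"

def regression_report(cases, old="v1", new="v2"):
    """Compare labels between two API versions and collect behavioral drift."""
    changed = []
    for text in cases:
        before, after = call_llm_api(text, old), call_llm_api(text, new)
        if before != after:
            changed.append((text, before, after))
    return changed

cases = ["You are an idiot", "That was stupid", "Have a nice day"]
drift = regression_report(cases)
# Each entry records a prompt whose label flipped after the API update.
```

In this sketch only the second prompt changes label between versions, which is exactly the kind of silent behavioral drift the paper argues current practices fail to catch.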
Potential Impact
Software development companies, especially those integrating LLM APIs into their products, and providers of LLM API services