Authors: Chulin Xie, Zinan Lin, Arturs Backurs, Sivakanth Gopi, Da Yu, Huseyin A Inan, Harsha Nori, Haotian Jiang, Huishuai Zhang, Yin Tat Lee, Bo Li, Sergey Yekhanin
Published on: March 04, 2024
Impact Score: 8.4
arXiv ID: arXiv:2403.01749
Summary
- What is new: Aug-PE, an algorithm that generates differentially private (DP) synthetic text using only API access to large language models (LLMs), with no model training required.
- Why this is important: Existing methods for generating DP synthetic text rely on DP finetuning of LLMs, which is infeasible for proprietary models and demands significant compute resources.
- What the research proposes: Aug-PE adapts the Private Evolution (PE) algorithm to text, producing DP synthetic text through LLM API calls alone, so no model has to be trained or modified.
- Results: Aug-PE achieves utility competitive with state-of-the-art DP finetuning baselines across three benchmark datasets, showing that high-quality DP synthetic text can be generated via API access alone.
Technical Details
Technological frameworks used: Private Evolution (PE) algorithm adapted to text (Aug-PE)
Models used: API access to large language models
Data used: Three benchmark datasets
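To make the PE idea more concrete, here is a minimal sketch of one selection round, written under stated assumptions rather than taken from the paper's code. It assumes that private and synthetic texts have already been mapped to embedding vectors, that each private sample votes for its nearest synthetic sample, and that Gaussian noise is added to the vote counts before resampling. The function names, the toy 2-D embeddings, and the noise scale `sigma` are all hypothetical, and `sigma` is not calibrated to any privacy budget here.

```python
import random

def nearest_index(point, candidates):
    """Index of the candidate closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(candidates)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, candidates[i])),
    )

def noisy_vote_histogram(private_embs, synthetic_embs, sigma, rng):
    """Each private embedding casts one vote for its nearest synthetic
    embedding; Gaussian noise is then added to every count."""
    counts = [0.0] * len(synthetic_embs)
    for emb in private_embs:
        counts[nearest_index(emb, synthetic_embs)] += 1.0
    return [c + rng.gauss(0.0, sigma) for c in counts]

def select_survivors(noisy_counts, num_keep, rng):
    """Resample synthetic indices in proportion to the clipped noisy votes."""
    weights = [max(c, 0.0) for c in noisy_counts]
    if sum(weights) == 0.0:
        # Fall back to uniform sampling if noise wiped out every vote.
        weights = [1.0] * len(noisy_counts)
    return rng.choices(range(len(noisy_counts)), weights=weights, k=num_keep)

# Toy data (hypothetical): 2-D stand-ins for text embeddings.
rng = random.Random(0)
synthetic = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
private = [[0.9, 1.1], [1.2, 0.8], [0.1, -0.1], [1.0, 0.9]]

hist = noisy_vote_histogram(private, synthetic, sigma=0.5, rng=rng)
survivors = select_survivors(hist, num_keep=4, rng=rng)
print(hist, survivors)
```

In a full pipeline, the surviving synthetic samples would then be sent back to the LLM API to produce variations for the next round; that step is omitted because it depends on the specific API in use.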
Potential Impact
This work could affect data-privacy markets and companies that specialize in data generation, particularly in the AI, ML, and data analytics sectors.