Authors: Yihao Wang, Ruiqi Song, Ru Zhang, Jianyi Liu, Lingxiao Li
Published on: January 28, 2024
Impact Score: 8.12
arXiv: 2401.15656
Summary
- What is new: This paper introduces the first use of a Large Language Model (LLM) for Linguistic Steganography, specifically fine-tuning LLaMA2 to generate steganographic text with specific and controllable discourse characteristics.
- Why this is important: Existing linguistic steganography schemes lack controllability and produce text that is easily detectable due to poor incorporation of discourse characteristics like style.
- What the research proposes: The paper proposes LLsM, a method that fine-tunes a Large Language Model with a dataset containing rich discourse characteristics, allowing for the generation of steganographic text that is not easily detected.
- Results: LLsM outperforms existing methods in text quality, statistical analysis, discourse matching, and resistance to steganalysis, with notable improvements in the MAUVE metric and in anti-steganalysis performance.
Technical Details
Technological frameworks used: LLsM, a fine-tuning pipeline built on the LLaMA2 Large Language Model
Models used: LLaMA2 (Large Language Model)
Data used: Large-scale constructed dataset with rich discourse characteristics
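To make the mechanism concrete, here is a minimal sketch of the general idea behind LLM-based linguistic steganography: secret bits are hidden by choosing among the model's top-ranked next-token candidates at each generation step, and recovered by re-ranking the same candidates on the receiver's side. This is an illustrative toy, not the paper's LLsM procedure; the `next_token_probs` stand-in model, its vocabulary, and all function names are assumptions made for the example (a real system would query the fine-tuned LLaMA2).

```python
def next_token_probs(context):
    # Hypothetical stand-in for an LLM: a fixed candidate ranking, rotated
    # by context length so the "distribution" varies from step to step.
    vocab = ["the", "a", "quiet", "river", "flows", "slowly", "here", "now"]
    shift = len(context) % len(vocab)
    return vocab[shift:] + vocab[:shift]  # most- to least-likely order

BITS_PER_TOKEN = 2  # choose among the top 2**2 = 4 candidates each step

def embed(bits, steps):
    """Encode a bit string: each 2-bit chunk indexes into the top-k
    candidate list; pad with zeros if the message is short."""
    bits = bits.ljust(steps * BITS_PER_TOKEN, "0")
    context, out = [], []
    for i in range(steps):
        chunk = bits[i * BITS_PER_TOKEN:(i + 1) * BITS_PER_TOKEN]
        ranked = next_token_probs(context)
        token = ranked[int(chunk, 2)]  # pick the candidate the bits select
        out.append(token)
        context.append(token)
    return " ".join(out)

def extract(text):
    """Recover the bits by re-ranking the candidates at each step and
    reading off each chosen token's index."""
    context, bits = [], []
    for token in text.split():
        ranked = next_token_probs(context)
        bits.append(format(ranked.index(token), f"0{BITS_PER_TOKEN}b"))
        context.append(token)
    return "".join(bits)
```

Because both sides share the same model, the receiver reproduces the candidate ranking exactly and inverts the selection; the paper's contribution is making that generated cover text match controllable discourse characteristics so it resists steganalysis.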
Potential Impact
Cybersecurity companies focusing on steganalysis, privacy-focused communication platforms, companies seeking advanced data hiding techniques