Authors: Nick Baumann, Alexander Brinkmann, Christian Bizer
Published on: March 04, 2024
Impact Score: 7.8
arXiv code: arXiv:2403.02130
Summary
- What is new: The paper uses large language models, specifically OpenAI’s GPT-3.5 and GPT-4, to extract and normalize attribute values from product titles and descriptions, and reports that they outperform existing methods based on pre-trained language models (PLMs).
- Why this is important: E-commerce features such as faceted filtering and content-based recommendation depend on structured attribute-value pairs, but extracting and normalizing those pairs from unstructured product descriptions is difficult.
- What the research proposes: Utilizing GPT-3.5 and GPT-4 to perform attribute-value pair extraction and normalization, including operations like name expansion, generalization, unit normalization, and string wrangling.
- Results: GPT-4 outperformed the PLM-based extraction methods by 10%, achieving an F1-Score of 91%. It was particularly effective at string wrangling and name expansion.
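For readers unfamiliar with the metric, the sketch below shows one common way to compute an F1-Score over extracted attribute-value pairs. It assumes exact-match comparison of (product id, attribute, value) triples; the function name `pair_f1` and the sample data are hypothetical, and the paper's own evaluation protocol may differ.

```python
def pair_f1(predicted, gold):
    """F1 over (product_id, attribute, value) triples, using exact match.

    Assumption: a prediction counts as correct only if the whole triple
    appears in the gold annotations.
    """
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example data: two of three predictions match the gold annotations.
gold = [
    ("p1", "Brand", "Dell"),
    ("p1", "Screen Size", "27"),
    ("p2", "Brand", "Lenovo"),
]
predicted = [
    ("p1", "Brand", "Dell"),
    ("p1", "Screen Size", "24"),
    ("p2", "Brand", "Lenovo"),
]
score = pair_f1(predicted, gold)
```

Here precision and recall are both 2/3, so `score` is also 2/3.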
Technical Details
Technological frameworks used: Large Language Models (LLMs)
Models used: OpenAI GPT-3.5, GPT-4
Data used: WDC Product Attribute-Value Extraction (WDC PAVE) dataset
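To make the setup more concrete, here is a minimal sketch of how a prompt for joint extraction and normalization might be assembled and how a JSON reply might be parsed. Everything in it is an assumption for illustration: the guideline wording, the "n/a" placeholder, and the helper names `build_prompt` and `parse_response` are hypothetical and not taken from the paper, and the call to the model itself is left out.

```python
import json

# Hypothetical guideline texts paraphrasing the four normalization
# operations named in the summary; not the paper's actual wording.
NORMALIZATION_RULES = {
    "name expansion": "expand abbreviations to the full name",
    "generalization": "map specific values to a broader category",
    "unit normalization": "convert measurements to a single target unit",
    "string wrangling": "reformat the value string into a fixed pattern",
}


def build_prompt(title, description, attributes):
    """Assemble one prompt asking for normalized attribute values as JSON."""
    rules = "\n".join(
        f"- {name}: {rule}" for name, rule in NORMALIZATION_RULES.items()
    )
    return (
        "Extract the following attributes from the product offer and "
        "normalize the values. Respond with a JSON object; use \"n/a\" "
        "for attributes that are not mentioned.\n"
        f"Attributes: {', '.join(attributes)}\n"
        f"Normalization guidelines:\n{rules}\n"
        f"Title: {title}\n"
        f"Description: {description}"
    )


def parse_response(raw, attributes):
    """Parse the model's reply, keeping only the requested attributes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}  # treat malformed replies as "nothing extracted"
    if not isinstance(data, dict):
        data = {}
    return {attr: str(data.get(attr, "n/a")) for attr in attributes}


prompt = build_prompt(
    "Dell 27in Monitor", "Full HD display", ["Brand", "Screen Size"]
)
# A reply string would normally come from the model; this one is made up.
parsed = parse_response('{"Brand": "Dell"}', ["Brand", "Screen Size"])
```

Falling back to "n/a" for missing or malformed output keeps downstream scoring simple, since every requested attribute always has a value to compare.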
Potential Impact
This approach might affect the e-commerce and retail sectors by improving product recommendation systems and search filters, which would benefit companies whose platforms depend on accurate product attribute extraction.