Authors: Izzeddin Gur, Hiroki Furuta, Austin Huang, Mustafa Safdari, Yutaka Matsuo, Douglas Eck, Aleksandra Faust
Published on: July 24, 2023
Impact Score: 7.8
Arxiv code: arXiv:2307.12856
Summary
- What is new: WebAgent combines two models, HTML-T5 and Flan-U-PaLM, to improve web automation by addressing three obstacles: open-ended task domains, limited context length, and the lack of an inductive bias for HTML.
- Why this is important: Existing large language models struggle with web automation on real websites because the range of tasks is open-ended, HTML documents often exceed the context window, and the models have no built-in inductive bias for HTML structure.
- What the research proposes: WebAgent, a system that decomposes an instruction into sub-instructions, summarizes long HTML documents into task-relevant snippets, and acts on the website by generating Python programs.
- Results: WebAgent achieved a 50% improvement in task success on real websites and set new state-of-the-art results on the MiniWoB and Mind2Web evaluations.
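The plan, summarize, and act loop described in the summary can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `run_episode`, the callback signatures, the `"DONE"` stop marker, and the toy stand-ins are all hypothetical names, and the two language models are replaced by simple rule-based functions so the example runs on its own.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    sub_instruction: str  # what the planner decided to do next
    snippet: str          # task-relevant part of the page
    program: str          # generated program text for this step


def run_episode(
    instruction: str,
    get_html: Callable[[], str],
    plan: Callable[[str, List[str], str], str],
    summarize: Callable[[str, str], str],
    write_code: Callable[[str, str], str],
    execute: Callable[[str], None],
    max_steps: int = 5,
) -> List[Step]:
    """Loop: plan a sub-instruction, shrink the HTML, generate and run code."""
    history: List[str] = []
    steps: List[Step] = []
    for _ in range(max_steps):
        html = get_html()
        sub = plan(instruction, history, html)
        if sub == "DONE":  # hypothetical stop marker
            break
        snippet = summarize(html, sub)
        program = write_code(sub, snippet)
        execute(program)  # a real system would drive a browser here
        history.append(sub)
        steps.append(Step(sub, snippet, program))
    return steps


# Toy stand-ins for the models (assumptions for illustration only).
TOY_HTML = '<input id="q" type="text">\n<button id="go">search</button>\n<footer>about</footer>'


def toy_plan(instruction: str, history: List[str], html: str) -> str:
    todo = ["type the query", "click search"]
    return todo[len(history)] if len(history) < len(todo) else "DONE"


def toy_summarize(html: str, sub: str) -> str:
    # Keep only HTML lines that share a word with the sub-instruction.
    words = set(sub.lower().split())
    keep = [ln for ln in html.splitlines() if words & set(re.findall(r"[a-z]+", ln.lower()))]
    return "\n".join(keep)


def toy_write_code(sub: str, snippet: str) -> str:
    return f"# sub-instruction: {sub}\n# grounded on: {snippet}"
```

Passing the models in as callbacks keeps the control flow separate from any particular model, so the toy functions can be swapped for real model calls without changing the loop.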
Technical Details
Technological frameworks used: Flan-U-PaLM for grounded code generation, HTML-T5 for planning and HTML document summarization
Models used: Pre-trained large language models; HTML-T5 uses local and global attention mechanisms and is pre-trained with a mixture of long-span denoising objectives
Data used: HTML documents, MiniWoB web automation benchmark, Mind2Web offline task planning evaluation
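To make the "long-span denoising" idea concrete, here is a minimal sketch of T5-style span corruption applied to HTML tokens. It is an illustration under assumptions: the function name `corrupt_spans` is hypothetical, the spans are chosen by hand rather than sampled, and the span lengths used in the paper are not given in this summary. The `<extra_id_N>` sentinel naming follows the T5 convention; the closing sentinel that T5 appends to targets is omitted for brevity.

```python
from typing import List, Tuple


def corrupt_spans(
    tokens: List[str], spans: List[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Replace each [start, end) span with a sentinel token.

    Returns (inputs, targets): inputs keep the unmasked tokens with one
    sentinel per masked span; targets list each sentinel followed by the
    tokens it hides. Spans are assumed not to overlap.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])  # copy tokens before the span
        inputs.append(sentinel)              # mask the span in the input
        targets.append(sentinel)             # the model must reproduce the span
        targets.extend(tokens[start:end])
        cursor = end
    inputs.extend(tokens[cursor:])           # copy the tail
    return inputs, targets


html_tokens = "<div> <label> Name </label> <input> </div>".split()
masked_input, target = corrupt_spans(html_tokens, [(1, 4)])
```

Masking a longer span hides a whole element such as `<label> Name </label>` rather than a fragment of a tag, which is the intuition behind using longer spans on HTML.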
Potential Impact
Web automation services, companies that rely on web scraping, and businesses seeking to automate interactions with web interfaces could be disrupted by, or benefit from, WebAgent’s advances.