Authors: Fudan Zheng, Mengfei Li, Ying Wang, Weijiang Yu, Ruixuan Wang, Zhiguang Chen, Nong Xiao, Yutong Lu
Published on: February 06, 2024
Impact Score: 8.45
Arxiv code: arXiv:2402.03754
Summary
- What is new: A Globally-intensive Attention (GIA) module that simulates multi-view vision perception, and a Visual Knowledge-guided Decoder (VKGD) that combines visual information with text more effectively for radiology report generation.
- Why this is important: Existing approaches to automatic radiology report generation do not adequately exploit multi-view imaging information or reason over context with multi-modal information, which limits the accuracy and comprehensiveness of the generated reports.
- What the research proposes: A new model, the Intensive Vision-guided Network (IVGN), which integrates a GIA-guided Visual Encoder for richer image feature extraction with a VKGD that combines visual and textual information for more accurate report generation (a rough sketch of this structure follows the summary list below).
- Results: On the IU X-Ray and MIMIC-CXR datasets, the IVGN framework outperformed other state-of-the-art approaches for radiology report generation.
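To make the described architecture concrete, here is a minimal PyTorch-style sketch of how a GIA-guided visual encoder and a VKGD-style decoder could be wired together. The layer sizes, the fusion of multi-view features via a single self-attention block, and the transformer-decoder wiring are illustrative assumptions, not the authors' exact design; `GloballyIntensiveAttention`, `VisualKnowledgeGuidedDecoder`, and `IntensiveVisionGuidedNetwork` are hypothetical class names used only for this sketch.

```python
# Minimal sketch of the IVGN idea, assuming a PyTorch-style implementation.
# Layer sizes and the exact fusion/decoding scheme are guesses for illustration.
import torch
import torch.nn as nn


class GloballyIntensiveAttention(nn.Module):
    """Sketch of a GIA-style block: attends jointly over patch features from
    several views so each view can borrow global context from the others."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views * num_patches, dim) -- patch features
        # from all views concatenated along the sequence axis.
        fused, _ = self.attn(views, views, views)
        return self.norm(views + fused)


class VisualKnowledgeGuidedDecoder(nn.Module):
    """Sketch of a VKGD-style decoder: a transformer decoder whose
    cross-attention consumes the fused visual features while generating
    report tokens autoregressively."""

    def __init__(self, vocab_size: int = 10000, dim: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerDecoderLayer(dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=3)
        self.lm_head = nn.Linear(dim, vocab_size)

    def forward(self, tokens: torch.Tensor, visual: torch.Tensor) -> torch.Tensor:
        seq_len = tokens.size(1)
        # Causal mask so each position only attends to earlier report tokens.
        causal_mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf")), diagonal=1
        )
        hidden = self.decoder(self.embed(tokens), visual, tgt_mask=causal_mask)
        return self.lm_head(hidden)  # (batch, seq_len, vocab_size) logits


class IntensiveVisionGuidedNetwork(nn.Module):
    """End-to-end sketch: GIA-guided visual encoding followed by
    visual-knowledge-guided report decoding."""

    def __init__(self):
        super().__init__()
        self.gia = GloballyIntensiveAttention()
        self.vkgd = VisualKnowledgeGuidedDecoder()

    def forward(self, view_features: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        return self.vkgd(tokens, self.gia(view_features))
```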
Technical Details
- Technological frameworks used: Intensive Vision-guided Network (IVGN)
- Models used: Globally-intensive Attention (GIA) module, Visual Knowledge-guided Decoder (VKGD)
- Data used: IU X-Ray and MIMIC-CXR datasets (illustrated with dummy multi-view inputs in the usage sketch below)
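Building on the sketch above, the snippet below shows how multi-view inputs such as the frontal and lateral chest X-rays in IU X-Ray could be fed through the model. The patch count, feature dimension, and vocabulary size are assumptions, and the random tensors stand in for features produced by a real image backbone, which is omitted here.

```python
# Illustrative usage of the IVGN sketch with dummy tensors shaped like patch
# features from two views (e.g. frontal and lateral chest X-rays in IU X-Ray).
import torch

model = IntensiveVisionGuidedNetwork()
num_views, num_patches, dim = 2, 49, 512            # e.g. a 7x7 patch grid per view (assumed)
view_features = torch.randn(1, num_views * num_patches, dim)
tokens = torch.randint(0, 10000, (1, 20))           # 20 already-generated report tokens
logits = model(view_features, tokens)               # (1, 20, 10000) next-token logits
```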
Potential Impact
Healthcare industry, particularly radiology departments and companies developing medical imaging software
Want to implement this idea in a business?
We have generated a startup concept here: Visionary Diagnostics.