Authors: Zhenwei Wang, Qiule Sun, Bingbing Zhang, Pengfei Wang, Jianxin Zhang, Qiang Zhang
Published on: April 13, 2024
Impact Score: 7.2
arXiv ID: arXiv:2404.08915
Summary
- What is new: PM2, a multi-modal model paradigm for medical image classification that takes supplementary text inputs alongside images. It relies on prompt engineering and combines two classification heads to improve performance.
- Why this is important: Annotated medical images are scarce, so training on the image modality alone is often insufficient for accurate medical image classification.
- What the research proposes: A prompting multi-modal model, PM2, that uses supplementary text inputs alongside images and applies linear classification to the feature distribution, using global covariance pooling.
- Results: PM2 significantly outperforms existing models in few-shot learning on three medical datasets, achieving state-of-the-art performance.
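To make the "two classification heads" idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`prompt_head`, `linear_head`, `fuse`), the temperature of 0.07, and the fusion rule (a weighted average of the two heads' probabilities with weight `alpha`) are all assumptions chosen for illustration. The summary only states that two heads are combined, not how.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def prompt_head(image_feat, text_feats, temperature=0.07):
    """Head 1 (assumed form): score each class by the similarity between
    the image feature and that class's text-prompt feature."""
    return [cosine(image_feat, t) / temperature for t in text_feats]

def linear_head(feat, weights, biases):
    """Head 2 (assumed form): an ordinary linear classifier on a feature vector."""
    return [sum(w * x for w, x in zip(row, feat)) + b
            for row, b in zip(weights, biases)]

def fuse(logits_a, logits_b, alpha=0.5):
    """Hypothetical fusion rule: weighted average of the two heads' probabilities."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    return [alpha * a + (1 - alpha) * b for a, b in zip(pa, pb)]

# Toy two-class example with made-up features.
text_feats = [[1.0, 0.0], [0.0, 1.0]]          # one prompt embedding per class
logits_1 = prompt_head([1.0, 0.0], text_feats)
logits_2 = linear_head([2.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
probs = fuse(logits_1, logits_2)
predicted_class = probs.index(max(probs))
```

In a few-shot setting, only the linear head's weights would typically be trained, while the prompt head can work without any training on the target classes. That division of labor is one plausible reason to combine the two.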
Technical Details
Technological frameworks used: Not specified
Models used: PM2 (prompting multi-modal model), with global covariance pooling and linear probing
Data used: Three medical datasets
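Global covariance pooling can be illustrated with a short sketch. It summarizes a set of token features by their covariance matrix (second-order statistics) instead of a plain average, and the result can then be fed to a linear classifier. The function name `global_covariance_pooling` and the choice to divide by N are assumptions for illustration. Published variants often also normalize the matrix (for example with a matrix square root); that step is left out here, and the summary does not say which variant PM2 uses.

```python
def global_covariance_pooling(tokens):
    """Pool N token vectors of dimension d into the flattened upper
    triangle of their d x d covariance matrix.

    tokens: list of N lists, each of length d.
    """
    n = len(tokens)
    d = len(tokens[0])
    # Per-dimension mean over all tokens.
    mean = [sum(t[j] for t in tokens) / n for j in range(d)]
    # Center each token around the mean.
    centered = [[t[j] - mean[j] for j in range(d)] for t in tokens]
    # Covariance entry (i, j): average product of centered dimensions i and j.
    cov = [[sum(row[i] * row[j] for row in centered) / n for j in range(d)]
           for i in range(d)]
    # The matrix is symmetric, so the upper triangle holds all the information.
    return [cov[i][j] for i in range(d) for j in range(i, d)]

# Two toy tokens of dimension 2.
pooled = global_covariance_pooling([[1.0, 2.0], [3.0, 6.0]])
# pooled == [1.0, 2.0, 4.0]
```

For d-dimensional tokens the pooled vector has d(d+1)/2 entries, so in practice the token dimension is usually reduced before pooling to keep the linear classifier small.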
Potential Impact
Healthcare technology providers, medical imaging software companies, AI-driven diagnostic services