Authors: Soham Sinha, Shekhar Dwivedi, Mahdi Azizian
Published on: February 06, 2024
Impact Score: 8.3
arXiv ID: 2402.04466
Summary
- What is new: A system design optimized for heterogeneous GPU workloads in medical devices, leveraging CUDA MPS for spatial partitioning and isolating compute and graphics processing onto separate GPUs.
- Why this is important: Running several AI applications concurrently on a single platform makes end-to-end latency in medical devices unpredictable.
- What the research proposes: A system design that reduces latency and improves performance predictability for concurrent and heterogeneous GPU workloads on NVIDIA’s Holoscan platform.
- Results: Compared with a single-GPU baseline, the design reduces maximum latency by 21-30% and improves latency distribution flatness by 17-25% for up to five concurrent applications. Compared with a default multi-GPU setup, it decreases maximum latency by 35% and improves GPU utilization by 42% for up to six concurrent applications.
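The two ideas in the summary, spatial partitioning through CUDA MPS and keeping compute work on its own GPU, can be sketched as per-process environment settings. This is a minimal illustration, not the authors' configuration: the helper name `mps_client_envs`, the equal split of GPU threads across applications, and the default device index are assumptions. `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE` and `CUDA_VISIBLE_DEVICES` are standard CUDA environment variables, and the sketch assumes an MPS control daemon has already been started (for example with `nvidia-cuda-mps-control -d`).

```python
import os


def mps_client_envs(n_apps, compute_gpu="0", base_env=None):
    """Build one environment dict per concurrent CUDA application.

    Hypothetical helper: each application gets an equal share of the
    GPU's threads under MPS and is pinned to the compute GPU, leaving
    the other GPU free for graphics work.
    """
    if n_apps < 1:
        raise ValueError("n_apps must be at least 1")

    base = dict(os.environ if base_env is None else base_env)
    share = 100 // n_apps  # equal split is an assumption, not from the paper

    envs = []
    for _ in range(n_apps):
        env = dict(base)
        # Restrict this process to the GPU reserved for compute. How the
        # index maps to a physical device depends on how the MPS daemon
        # itself was started.
        env["CUDA_VISIBLE_DEVICES"] = compute_gpu
        # Upper bound on the portion of GPU threads this MPS client may use.
        env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(share)
        envs.append(env)
    return envs


if __name__ == "__main__":
    for i, env in enumerate(mps_client_envs(5, compute_gpu="0", base_env={})):
        print(i, env)
```

Each returned dict could be passed as `env=` to `subprocess.Popen` when launching an application, so that every process starts with its own partition. An equal split is only the simplest policy; shares tuned to each workload may behave differently.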
Technical Details
Technological frameworks used: CUDA MPS
Models used: Not specified
Data used: Real-world Holoscan medical device applications
Potential Impact
Medical device manufacturing, healthcare diagnostics, edge-computing solutions