18 April 2024

On the Scalability of GNNs for Molecular Graphs

Written by Startup Idea

Authors: Maciej Sypetkowski, Frederik Wenkel, Farimah Poursafaei, Nia Dickson, Karush Suri, Philip Fradkin, Dominique Beaini

Published on: April 17, 2024

Impact Score: 7.6

arXiv ID: arXiv:2404.11568

Summary

What is new: The first observation of GNNs showing significant improvements from scaling up, using the largest collection of 2D molecular graphs to date.
Why this is important: GNNs have struggled to demonstrate the benefits of scaling because sparse operations are inefficient and it is unclear which architectures are effective.
What the research proposes: Evaluating different GNN architectures, including message-passing networks, graph Transformers, and hybrids, on a massive dataset to study scaling effects.
Results: Scaling to 1 billion parameters improved performance by 30.25%, and an eightfold increase in dataset size improved it by 28.98%. The models also showed strong finetuning scaling behavior on 38 tasks.
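
For readers unfamiliar with the message-passing networks mentioned above, the following is a minimal sketch of a single message-passing step on a tiny molecular graph. It is illustrative only and is not the architecture or code from the paper: the function name `message_passing_step`, the sum aggregation, the absence of learned weights, and the one-hot atom encoding are all assumptions made for this example.

```python
# Minimal sketch of one message-passing step on a 2D molecular graph.
# Illustrative only: not the architecture or code used in the paper.

def message_passing_step(node_features, edges):
    """Return new features: each node's vector plus the sum of its neighbors' vectors."""
    dim = len(node_features[0])
    aggregated = [[0.0] * dim for _ in node_features]
    for u, v in edges:
        # Bonds are undirected, so messages flow in both directions.
        for k in range(dim):
            aggregated[u][k] += node_features[v][k]
            aggregated[v][k] += node_features[u][k]
    # Combine each node's own features with its aggregated messages.
    return [
        [h + m for h, m in zip(node_features[i], aggregated[i])]
        for i in range(len(node_features))
    ]

# Heavy atoms of ethanol (C-C-O), one-hot encoded as [is_carbon, is_oxygen].
ethanol_nodes = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
ethanol_bonds = [(0, 1), (1, 2)]

updated = message_passing_step(ethanol_nodes, ethanol_bonds)
print(updated)  # [[2.0, 0.0], [2.0, 1.0], [1.0, 1.0]]
```

After one step, the middle carbon's vector reflects both of its neighbors. Practical models stack many such layers with learned transformations, and graph Transformers replace or supplement this neighbor-only aggregation with attention across all atoms.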

Technological frameworks used: Not specified

Models used: GNN architectures, including message-passing networks, graph Transformers, and hybrid models

Data used: The largest public collection of 2D molecular graphs

Pharmaceutical companies, particularly those involved in drug discovery, stand to benefit from these insights or be disrupted by them.

We have generated a startup concept here: MolecuNet.