Cross-modal contrastive learning of microscopy image and structure-based representations of molecules
International Conference on Machine Learning (ICML 2022), 3rd Women in Machine Learning Un-Workshop
Contrastive learning for self-supervised representation learning has brought strong improvements to many application areas, such as computer vision and natural language processing. With large collections of unlabeled vision and language data available, contrastive learning of language and image representations has shown impressive results. The contrastive learning methods CLIP and CLOOB have demonstrated that representations learned from multi-modal data spanning two different domains transfer well to a large set of diverse tasks. In drug discovery, similarly large multi-modal datasets comprising both cell-based microscopy images and chemical structures of molecules are available. However, contrastive learning has not yet been applied to this type of multi-modal data in drug discovery, although transferable representations could remedy the time-consuming and costly acquisition of labels in this domain. In this work, we present a contrastive learning method for image- and structure-based representations of small molecules for drug discovery. Our method, Contrastive Leave-One-Out boost for Molecule Encoders (CLOOME), comprises an encoder for microscopy data, an encoder for chemical structures, and a contrastive learning objective. On the benchmark dataset Cell Painting, we demonstrate that our method learns proficient representations by performing linear probing on activity prediction tasks.
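To make the idea of a contrastive objective over paired image and molecule embeddings concrete, the following is a minimal NumPy sketch. It is not the CLOOME implementation: the function names, the temperature value, and the random stand-in embeddings are assumptions made for illustration. The default branch is a CLIP-style symmetric InfoNCE loss; the `leave_one_out` flag removes the matched pair from the denominator, in the spirit of the leave-one-out objective that CLOOB is named after. Any additional components of CLOOB or CLOOME are omitted here.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def contrastive_loss(img_emb, mol_emb, temperature=0.1, leave_one_out=False):
    """Symmetric contrastive loss for a batch of matched (image, molecule) pairs.

    Row i of img_emb and row i of mol_emb are assumed to describe the same
    molecule; all other rows in the batch serve as negatives.
    """
    if img_emb.shape != mol_emb.shape or img_emb.shape[0] < 2:
        raise ValueError("need two equally shaped batches with at least 2 pairs")
    img = l2_normalize(img_emb)
    mol = l2_normalize(mol_emb)
    logits = img @ mol.T / temperature      # pairwise similarities, shape (N, N)
    pos = np.diag(logits)                   # similarities of the matched pairs
    denom = logits.copy()
    if leave_one_out:
        # Exclude the matched pair from the denominator (leave-one-out variant).
        np.fill_diagonal(denom, -np.inf)
    loss_img_to_mol = np.mean(logsumexp(denom, axis=1) - pos)
    loss_mol_to_img = np.mean(logsumexp(denom, axis=0) - pos)
    return 0.5 * (loss_img_to_mol + loss_mol_to_img)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-ins for the outputs of the two encoders.
    image_embeddings = rng.normal(size=(8, 16))
    molecule_embeddings = image_embeddings + 0.1 * rng.normal(size=(8, 16))
    print(contrastive_loss(image_embeddings, molecule_embeddings))
    print(contrastive_loss(image_embeddings, molecule_embeddings, leave_one_out=True))
```

In this sketch, a batch whose pairs are correctly aligned yields a lower loss than the same batch with the molecule rows shuffled, which is the signal that drives both encoders toward a shared embedding space.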