A new press release reports, “Recursion, a Fast Company ‘Most Innovative Company’ and leader in the artificial intelligence for drug discovery movement, today announced it will open-source a glimpse of the massive biological dataset the company has been building for more than five years. At more than two petabytes, and across more than 10 million different biological contexts, Recursion’s data is the world’s largest image-based dataset designed specifically for the development of machine learning algorithms in experimental biology and drug discovery. The announcement was made at the global machine learning conference, ICLR 2019, and will be accompanied by a competition available through the NeurIPS 2019 Competition Track and co-sponsored by NVIDIA and Google Cloud. The goal of the competition is to inspire the development of effective machine learning methods that can identify representations of biology from the complex experimental dataset, called RxRx1.”
Chris Gibson, Ph.D., CEO of Recursion, commented, “To answer fundamental questions facing biology and disease, and reimagine the drug discovery paradigm, we’re building the world’s largest, relatable, empirical biological dataset… The RxRx1 dataset we’re announcing today represents an important resource for the machine learning community, with more than 100,000 images and 300-plus gigabytes of data representing diverse biological contexts. Yet despite the massive scale of this dataset, it represents just 0.4 percent of what we generate at Recursion on a weekly basis. We expect that the richness of this dataset, combined with the context surrounding the scale of our efforts, will inspire the world’s machine learning and AI community to help us in our mission to decode biology to radically improve lives.”
Read more at Business Wire.