May 30, 2017
A few weeks ago, I had an interesting conversation at a conference. The person I was talking to asked me about my work. I mentioned that I worked on DeepChem, an open source library facilitating deep learning for drug discovery. I explained that the tools we build help make the drug discovery process a little easier. Nothing revolutionary yet, but given the powers of deep learning and our promising early results, we had cause to be hopeful. My questioner was a little surprised to hear my answer. He asked, “Since pharma companies have a lot of money, why does it make sense to make your tools open source? I can see the point of open source for projects where there aren’t resources available, but why bother making free tools given that your biggest users have plenty of cash?” This question stumped me a bit.
I have to admit that I’d never seriously considered the question before. I started working on DeepChem because I wanted to build a toolchain that would help me write research papers. I put it online because that’s what everyone in our research group does. (The Pande research group has a long history of creating high-quality open source software as products of our research.) The package was initially released under the GPL, a more restrictive license that limits the flexibility of commercial users. However, I chatted with a number of open source developers who convinced me that an open source package is most useful when it has a larger community of users. After getting permission from existing DeepChem contributors, I relicensed the package under the MIT license, which lets companies use the library with basically no restrictions.
From a research standpoint, this stance makes plenty of sense. The more open our software is, the more users we get. The more users we get, the more people cite our papers. The more people cite our papers, the more we succeed as researchers. But the question remains. Why continue developing an open source toolchain if our target users have plenty of money? Why not let them take on the task once the interesting papers in the space are cleared out? Or, put more bluntly, why are open source tools for drug discovery worth building?
To explain why, let’s take a short detour into the structure of the pharmaceutical industry. Needless to say, the pharmaceutical industry makes money by selling (often expensive) medication to doctors and patients. The high costs of such treatments are partially justified by the large research expenditures and numerous failed compounds that litter the path to just one successful drug. The industry is challenging from multiple perspectives. The underlying biology and chemistry of many diseases are still poorly understood, side effects are impossible to completely banish, and regulatory approval is difficult to obtain. Given this difficult state of affairs, pharmaceutical and biotech companies often choose to focus on the diseases with the greatest potential for profit: cancer, autoimmune disorders, and neurological conditions. These diseases are widespread in the developed world, and sufferers include those with the money to afford expensive treatments. As a result, the vast majority of pharmaceutical research and development is done for developed markets and their associated diseases. I’d argue that this research is on the whole a good thing; cancer and Alzheimer’s are terrible diseases, and anything that brings us closer to cures is worthwhile.
However, the limited availability of resources means that far less effort is allocated to antibiotic resistance, tropical diseases, or malnutrition. Put another way, far less effort goes to the conditions that afflict the developing world and the poor. The profit motive is a powerful and dynamic force that has led to important innovations, but it doesn’t necessarily create the greatest human good. Given the very high cost of drug discovery, researchers seeking treatments for these critical illnesses are dependent on limited funds from governmental agencies or philanthropic organizations such as the Gates Foundation.
This isn’t an ideal state of affairs. On the one hand, there’s no fighting the economic laws of supply and demand. If there’s not much money in tropical diseases and viruses, it may well not make sense to build companies attempting to solve these problems (not entirely true; there are some successful companies here, but the slog is much harder). But on the other hand, as scientists and as human beings, we want to find treatments for all diseases and not only those that afflict the wealthy.
Open source is a solution to problems like this. By working with pharmaceutical companies on commercially valuable projects, researchers can extract general know-how about building algorithms useful for drug discovery. By contributing the fruits of these collaborations back into open source packages like DeepChem, researchers can share that wealth with the common pool of knowledge. With each collaboration, paper, and GitHub pull request, we hope that the process of drug discovery becomes easier and cheaper. Open source software is free and easily available. The tools we build today can be used by researchers tomorrow to treat conditions we might never have considered. In some ways, open source drug discovery is a Robin Hood ploy. Open source researchers use resources from the wealthy to build tools for the poor. But even better, open source solves two problems for the price of one: helping treat diseases like cancer is in itself a worthwhile goal; the fact that the tools we build can one day help treat neglected diseases as well is simply icing on the cake.
Acknowledgements: Thanks to Karim Galil for the conversation that spurred this post.