Project Reflection
Overview
My project, Mollykill, aims to apply what I learned in PIC16B and beyond to build a basic generative molecular model, and to give a basic idea of the computational methods used in drug design.
Project Achievements
Currently, my project provides a basic pipeline for generating new molecules that may share features with those in the original dataset.
There are two aspects of my project that I particularly like. First, the GAN structure seems to work and can generate molecules that at least look viable. Second, the featurizer for the model produces a somewhat reasonable graph representation of the molecules. More generally, I enjoy seeing that computational approaches can be combined with other fields to produce great outcomes.
Future Work
- Currently, molecule generation is limited to a fixed length that is shorter than most real-world molecules. If users try to generate longer molecules, the output is not syntactically recognizable by the defeaturizer. Future work might employ concepts from Conditional GANs to ensure the viability of the generated molecules.
- The dimension of the discriminator might be improved by adding more features of the molecules (e.g., hybridization, stereochemistry).
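To make the two points above more concrete, here is a minimal sketch of a fixed-size graph featurization in plain Python. It is not Mollykill's actual featurizer: `MAX_ATOMS`, `ATOM_TYPES`, `HYBRIDIZATIONS`, and `featurize` are hypothetical names, and the molecule is passed in as hand-written atom and bond lists instead of being parsed from a real format. It only illustrates why a fixed graph size caps the molecule length, and how an extra per-atom feature such as hybridization would widen the feature matrix.

```python
# Hypothetical constants for illustration; not taken from the Mollykill source.
MAX_ATOMS = 6                           # fixed graph size the model is built around
ATOM_TYPES = ["C", "N", "O"]            # one-hot vocabulary for element symbols
HYBRIDIZATIONS = ["sp", "sp2", "sp3"]   # optional extra per-atom feature


def one_hot(value, vocabulary):
    """Return a one-hot list for `value` over `vocabulary`."""
    if value not in vocabulary:
        raise ValueError(f"{value!r} is not in {vocabulary}")
    return [1 if value == item else 0 for item in vocabulary]


def featurize(atoms, bonds, use_hybridization=False):
    """Turn a molecule into a padded (adjacency, features) pair.

    atoms: list of (symbol, hybridization) tuples
    bonds: list of (i, j) atom-index pairs, treated as undirected
    """
    if len(atoms) > MAX_ATOMS:
        # This is the length limit: a larger molecule does not fit the fixed shape.
        raise ValueError(f"molecule has {len(atoms)} atoms, limit is {MAX_ATOMS}")

    width = len(ATOM_TYPES) + (len(HYBRIDIZATIONS) if use_hybridization else 0)
    features = [[0] * width for _ in range(MAX_ATOMS)]  # unused rows stay all-zero
    for idx, (symbol, hybridization) in enumerate(atoms):
        row = one_hot(symbol, ATOM_TYPES)
        if use_hybridization:
            row += one_hot(hybridization, HYBRIDIZATIONS)
        features[idx] = row

    adjacency = [[0] * MAX_ATOMS for _ in range(MAX_ATOMS)]
    for i, j in bonds:
        adjacency[i][j] = 1
        adjacency[j][i] = 1
    return adjacency, features


# Heavy atoms of ethanol: C-C-O
ethanol_atoms = [("C", "sp3"), ("C", "sp3"), ("O", "sp3")]
ethanol_bonds = [(0, 1), (1, 2)]

adjacency, features = featurize(ethanol_atoms, ethanol_bonds)
adjacency_h, features_h = featurize(ethanol_atoms, ethanol_bonds,
                                    use_hybridization=True)
```

In a setup like this, turning on `use_hybridization` changes the width of every feature row, so any network that consumes the features (the discriminator, for example) would need its input shape adjusted to match.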
Proposal & Outcome
The rationale of the project has changed somewhat since the original proposal. Initially, I planned to build a classification model and apply it to a larger database; in other words, to create a computational virtual screening pipeline. However, generating actual novel molecules from preliminary wet-lab assay information is very important in drug discovery, and computer-generated molecules might even perform better than those in the initial molecule database. Thus, I ultimately decided to build a generative model for molecules. Other technical details, such as the final presentation format, have not changed much.
Lessons Learned!
Thanks to the help from Professor Chodrow and Erin, I’ve learned a lot about GitHub, including how to fork others’ repositories, use git clone, and so on. While building my deep learning model, I also gained a better understanding of the underlying math by working out what the TensorFlow layers actually do. Furthermore, I’ve become more used to consulting the API documentation of the functions I use.
Beyond and Above
The specific skills I learned during this project, such as TensorFlow and git, will definitely help me adapt more efficiently when I encounter related topics in the future. Moreover, an important skill that this course has taught me is how to self-learn. I’ve gradually gotten used to browsing YouTube videos, tutorial blog posts, GitHub source code, and API documentation for whatever I need. Most importantly, through crafting this project, I’ve found my future career interest: I would like to examine to what extent computational methods can aid discovery in the traditional health sciences.
Overall, special thanks to Professor Chodrow and TA Erin for supporting us emotionally and technically this quarter, on this project, in this course, and beyond. Even though I didn’t have a groupmate, I would also like to thank all my classmates, because I always felt motivated by the thought that we were progressing together as one team.
