Predicting the energy of small molecules is an important problem in chemistry, with far-reaching applications in biological simulations of large systems. Traditional computational methods based on molecular mechanics are fast but unreliable. Quantum mechanics (QM) based methods are accurate but time-consuming and do not scale to large systems. Faster methods with QM-level accuracy are therefore in high demand. In this task, you are provided with QM energies (calculated at the B3LYP/6-31G(2df,p) level of theory) for a set of molecules, along with their atomic coordinates and molecular structures (SMILES strings). Train a model on this data to predict the QM energy of each molecule in the test set from its coordinates and structure.
- Use atomic coordinates and/or SMILES strings to determine QM energies.
- Possible atoms: Carbon (C), Nitrogen (N), Hydrogen (H), Oxygen (O).
- A molecule contains at most 8 heavy atoms (atoms other than hydrogen).
- Check the resources tab to download the data.
- train.csv: File_name, SMILES string, QM energy
- test.csv: Same columns as train.csv. In the provided sample test.csv, replace the dummy energies with your predicted QM energies.
- xyzs/ : Coordinate files in .xyz format corresponding to molecules in train and test sets.
- Submit the test.csv file in the Submissions Tab.
- The evaluation metric is Root Mean Squared Error (RMSE).
- Code used for obtaining the results should also be submitted on Moodle.
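
As a starting point, the two mechanical pieces of the task (reading a coordinate file and scoring predictions) can be sketched as follows. This is a minimal sketch, not a reference implementation: it assumes the files in xyzs/ follow the common .xyz layout (atom count on the first line, a comment on the second, then one `element x y z` line per atom), which should be checked against the actual data. The function names `parse_xyz`, `count_heavy_atoms`, and `rmse` are hypothetical helpers introduced here for illustration. RMSE is computed as the square root of the mean squared difference between true and predicted energies.

```python
import math
from typing import List, Sequence, Tuple

# One atom: (element symbol, x, y, z)
Atom = Tuple[str, float, float, float]


def parse_xyz(text: str) -> List[Atom]:
    """Parse the contents of a .xyz file into a list of atoms.

    Assumes the common layout: line 0 holds the atom count, line 1 is a
    free-form comment, and the next `count` lines hold `element x y z`.
    Any extra columns or trailing lines are ignored.
    """
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])
    atoms: List[Atom] = []
    for line in lines[2:2 + n_atoms]:
        fields = line.split()
        x, y, z = (float(v) for v in fields[1:4])
        atoms.append((fields[0], x, y, z))
    return atoms


def count_heavy_atoms(atoms: Sequence[Atom]) -> int:
    """Count atoms other than hydrogen."""
    return sum(1 for atom in atoms if atom[0] != "H")


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative input only: a made-up water geometry, not taken from the dataset.
EXAMPLE_XYZ = """3
water (illustrative)
O 0.000 0.000 0.117
H 0.000 0.757 -0.469
H 0.000 -0.757 -0.469
"""
```

Calling `parse_xyz(EXAMPLE_XYZ)` returns three `(element, x, y, z)` tuples, which can then be turned into model features (for example atom counts per element or interatomic distances). Computing `rmse` on a held-out portion of train.csv gives a local estimate of the score before submitting.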