
Learning to Smell

Where to start? 5 ways to learn 2 smell!

We have written a notebook that explores 5 ways to attempt this challenge.


Hi everyone!


@rohitmidha23 and I are undergrad students studying computer science, and we found this challenge a particularly interesting way to explore the applications of ML in chemistry. We have written a notebook that explores 5 ways to attempt this challenge. It includes baselines for

  • ChemBERTa
  • Graph Conv Networks
  • MultiTaskClassifier using Molecular Fingerprints
  • Sklearn Classifiers (Random Forest etc.) using Molecular Fingerprints
  • Chemception (2D representation of molecules)

Check it out @ https://colab.research.google.com/drive/1-RedHEQSAVKUowOx2p-QoKthxayRshUa?usp=sharing

The most difficult task in this challenge is getting good representations of SMILES that ML algorithms can understand, and we have tried to give examples of how that has been done in the past for these kinds of tasks.

We hope that this notebook helps out other beginners like ourselves.

As always we are open to any feedback, suggestions and criticism!

If you found our work helpful, do drop us a :heart:!

AICrowd Learning To Smell Challenge

What is the challenge exactly?

This challenge is about predicting the different smells associated with a molecule. The only information we have to predict from is the SMILES string of the molecule. Each molecule is labelled with multiple smells, and there are 109 distinct smells in total.

What is a SMILES string?

SMILES (Simplified Molecular Input Line Entry System) is a chemical notation that represents a chemical structure in a form a computer can work with. A SMILES string describes the structure of a chemical species as a short ASCII string.
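To make the notation concrete, here is a minimal sketch that splits a SMILES string into atom-level tokens with a regular expression. The pattern and the function name `tokenize_smiles` are illustrative assumptions; this is not ChemBERTa's tokenizer, and the pattern covers only a common subset of SMILES syntax.

```python
import re

# Simplified pattern: bracket atoms, two-letter halogens, two-digit ring closures,
# common organic atoms (upper case = aliphatic, lower case = aromatic),
# bond/branch symbols, and single-digit ring closures.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOPSFI]|[bcnops]|[=#/\\\-+.@()]|\d"
)

def tokenize_smiles(smiles):
    """Split a SMILES string into tokens; raise if any character is not covered."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError("unsupported characters in SMILES: {!r}".format(smiles))
    return tokens

# A molecule from the training data, labelled "resinous,animalic"
print(tokenize_smiles("Cc1cc2c([nH]1)cccc2"))
```

Lower-case letters mark aromatic atoms, digits open and close rings, parentheses mark branches, and the bracketed `[nH]` is a single token for an aromatic nitrogen carrying a hydrogen.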

What is the most important task in this challenge?

The most important task here is gaining a meaningful representation of each SMILES string. There are several ways to do this, and this notebook gives you a few pathways to a representation that can then be used in an ML pipeline. The approaches discussed here are:

  • Tokenizing SMILES and using ChemBERTa
  • Graph Conv
  • Molecular Fingerprints
  • 2D representation of molecules (Chemception)

Download the Data

Install Required Libraries

In [ ]:
import sys
import os
import requests
import subprocess
import shutil
from logging import getLogger, StreamHandler, INFO

logger = getLogger(__name__)
logger.addHandler(StreamHandler())
logger.setLevel(INFO)

def install(
        url_base,         # base URL of the Miniconda installer (value not shown in the source)
        file_name,        # installer file name (value not shown in the source)
        chunk_size=4096,  # assumed download chunk size
        conda_path=os.path.expanduser(os.path.join("~", "miniconda")),
        rdkit_version=None,
        add_python_path=True,
        force=False):
    """install rdkit from miniconda"""

    python_path = os.path.join(
        conda_path, "lib",
        "python{0}.{1}".format(*sys.version_info),
        "site-packages")

    if add_python_path and python_path not in sys.path:
        logger.info("add {} to PYTHONPATH".format(python_path))
        sys.path.append(python_path)

    if os.path.isdir(os.path.join(python_path, "rdkit")):
        logger.info("rdkit is already installed")
        if not force:
            return

        logger.info("force re-install")

    url = url_base + file_name
    python_version = "{0}.{1}.{2}".format(*sys.version_info)

    logger.info("python version: {}".format(python_version))

    # Clear out anything already sitting at the install location
    if os.path.isdir(conda_path):
        logger.warning("remove current miniconda")
        shutil.rmtree(conda_path)
    elif os.path.isfile(conda_path):
        logger.warning("remove {}".format(conda_path))
        os.remove(conda_path)

    logger.info('fetching installer from {}'.format(url))
    res = requests.get(url, stream=True)
    res.raise_for_status()
    with open(file_name, 'wb') as f:
        for chunk in res.iter_content(chunk_size):
            f.write(chunk)

    logger.info('installing miniconda to {}'.format(conda_path))
    subprocess.check_call(["bash", file_name, "-b", "-p", conda_path])

    logger.info("installing rdkit")
    subprocess.check_call([
        os.path.join(conda_path, "bin", "conda"),
        "install", "--yes",
        "-c", "rdkit",
        "rdkit" if rdkit_version is None else "rdkit=={}".format(rdkit_version)])

    import rdkit
    logger.info("rdkit-{} installation finished!".format(rdkit.__version__))
add /root/miniconda/lib/python3.6/site-packages to PYTHONPATH
python version: 3.6.9

ChemBERTa is a collection of BERT-like models applied to chemical SMILES data for drug design, chemical modelling, and property prediction. We fine-tune this existing model for our application.

First we visualize the attention heads using the bert-viz library. We can use this tool to see whether the model in fact understands the SMILES it is processing.

We will be using the pretrained tokenizer. If we trained our own tokenizer, the results would probably be better.

I plan on implementing this soon, but I have included a link in the References section of this notebook, if you want to have a crack at this.

In [ ]:
%%javascript
require.config({
  paths: {
      d3: '//cdnjs.cloudflare.com/ajax/libs/d3/3.4.8/d3.min',
      jquery: '//ajax.googleapis.com/ajax/libs/jquery/2.0.0/jquery.min',
  }
});
In [ ]:
def call_html():
  import IPython
  display(IPython.core.display.HTML('''
        <script src="/static/components/requirejs/require.js"></script>
        <script>
          requirejs.config({
            paths: {
              base: '/static/base',
              "d3": "https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.8/d3.min",
              jquery: '//ajax.googleapis.com/ajax/libs/jquery/2.0.0/jquery.min',
            },
          });
        </script>
        '''))

Let's load the training data, look at a few molecules that share the same label, and pass them to the pretrained RoBERTa model (trained on the ZINC 250k dataset).

In [ ]:
import pandas as pd
import numpy as np
train_df = pd.read_csv("data/train.csv")
train_df.head()
Out[ ]:
SMILES SENTENCE
0 C/C=C/C(=O)C1CCC(C=C1C)(C)C fruity,rose
1 COC(=O)OC fresh,ethereal,fruity
2 Cc1cc2c([nH]1)cccc2 resinous,animalic
3 C1CCCCCCCC(=O)CCCCCCC1 powdery,musk,animalic
4 CC(CC(=O)OC1CC2C(C1(C)CC2)(C)C)C coniferous,camphor,fruity
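In [ ]:
# Assumed filter (the cell body is missing): rows labelled exactly "resinous,animalic"
train_df[train_df["SENTENCE"] == "resinous,animalic"]
Out[ ]: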
2 Cc1cc2c([nH]1)cccc2 resinous,animalic
1108 Cc1nc2c(o1)cccc2 resinous,animalic
3183 Cc1ccc2c(n1)cccc2 resinous,animalic
In [ ]:
import torch
import rdkit
import rdkit.Chem as Chem
from rdkit.Chem import rdFMCS
from matplotlib import colors
from rdkit.Chem import Draw
from rdkit.Chem.Draw import MolToImage
m = Chem.MolFromSmiles('Cc1nc2c(o1)cccc2')
fig = Draw.MolToMPL(m, size=(200, 200))
In [ ]:
m = Chem.MolFromSmiles('Cc1ccc2c(n1)cccc2')
fig = Draw.MolToMPL(m, size=(200,200))
In [ ]:
!git clone https://github.com/jessevig/bertviz.git
In [ ]:
import sys
sys.path.append("bertviz")  # assumed step: make the cloned repository importable
In [ ]:
from transformers import RobertaModel, RobertaTokenizer
from bertviz import head_view

model_version = 'seyonec/ChemBERTa_zinc250k_v2_40k'
model = RobertaModel.from_pretrained(model_version, output_attentions=True)
tokenizer = RobertaTokenizer.from_pretrained(model_version)

sentence_a = "Cc1cc2c([nH]1)cccc2"
sentence_b = "Cc1ccc2c(n1)cccc2"
inputs = tokenizer.encode_plus(sentence_a, sentence_b, return_tensors='pt', add_special_tokens=True)
input_ids = inputs['input_ids']
attention = model(input_ids)[-1]
input_id_list = input_ids[0].tolist() # Batch index 0
tokens = tokenizer.convert_ids_to_tokens(input_id_list)


head_view(attention, tokens)

This is a pretty cool visualization of the attention heads. Do explore the bert-viz library for similar visualizations, such as the model view and the neuron view!

Now we fine-tune ChemBERTa for our application!

For this we will use the simple transformers library. Of course, we might get better results with some hyperparameter tuning, but for a baseline we stick to the default parameters.

In [ ]:
from simpletransformers.classification import MultiLabelClassificationModel
import pandas as pd
import logging

# Uncomment for logging
# logging.basicConfig(level=logging.INFO)
# transformers_logger = logging.getLogger("transformers")
# transformers_logger.setLevel(logging.WARNING)

model = MultiLabelClassificationModel(
    'roberta', 'seyonec/ChemBERTa_zinc250k_v2_40k', num_labels=109,
    args={'num_train_epochs': 10, 'auto_weights': True,
          'reprocess_input_data': True, 'overwrite_output_dir': True,
          'use_cuda': True})  # add 'wandb_project': 'l2s' to args to log to wandb
# You can set class weights by using the optional weight argument
Some weights of the model checkpoint at seyonec/ChemBERTa_zinc250k_v2_40k were not used when initializing RobertaForMultiLabelSequenceClassification: ['lm_head.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.decoder.bias']
- This IS expected if you are initializing RobertaForMultiLabelSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing RobertaForMultiLabelSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForMultiLabelSequenceClassification were not initialized from the model checkpoint at seyonec/ChemBERTa_zinc250k_v2_40k and are newly initialized: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
In [ ]:
with open("/content/data/vocabulary.txt") as f:
  vocab = f.read().split('\n')
In [ ]:
def get_ohe_label(sentence):
  sentence = sentence.split(',')
  ohe_sent = len(vocab)*[0]
  for i,x in enumerate(vocab):
    if x in sentence:
      ohe_sent[i] = 1
  return ohe_sent
In [ ]:
labels = []
for x in train_df.SENTENCE.tolist():
  labels.append(get_ohe_label(x))  # one multi-hot vector per molecule
data_df = pd.DataFrame(train_df["SMILES"].tolist(),columns=["text"])# pd.DataFrame(labels,columns = vocab)
data_df["labels"]= labels
data_df.head()

Out[ ]:
text labels
0 C/C=C/C(=O)C1CCC(C=C1C)(C)C [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
1 COC(=O)OC [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
2 Cc1cc2c([nH]1)cccc2 [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ...
3 C1CCCCCCCC(=O)CCCCCCC1 [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ...
4 CC(CC(=O)OC1CC2C(C1(C)CC2)(C)C)C [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
In [ ]:
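# Split into train and test sets, 90-10 (random_state=42 is an assumed seed)
train_size = 0.9
train_dataset = data_df.sample(frac=train_size, random_state=42)
test_dataset = data_df.drop(train_dataset.index).reset_index(drop=True)
train_dataset = train_dataset.reset_index(drop=True)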
In [ ]:
# check if our train and evaluation dataframes are setup properly. There should only be two columns for the SMILES string and its corresponding label.

print("FULL Dataset: {}".format(data_df.shape))
print("TRAIN Dataset: {}".format(train_dataset.shape))
print("TEST Dataset: {}".format(test_dataset.shape))
FULL Dataset: (4316, 2)
TRAIN Dataset: (3884, 2)
TEST Dataset: (432, 2)
In [ ]:
# !wandb login  ## Log into wandb if you want to keep an eye on how your model is training
In [ ]:
!rm -rf outputs

Time to Train!

In [ ]:
model.train_model(train_dataset)
Out[ ]:
(4860, 0.11029124964534501)
In [ ]:
import sklearn
result, model_outputs, wrong_predictions = model.eval_model(test_dataset)
print(result)
print(model_outputs)

{'LRAP': 0.48465526156536565, 'eval_loss': 0.0850957441661093}
[[0.00508118 0.00619125 0.01182556 ... 0.0051384  0.01132965 0.03271484]
 [0.0078125  0.00818634 0.05340576 ... 0.00867462 0.00963593 0.025177  ]
 [0.00346947 0.01490021 0.00288963 ... 0.00933838 0.006073   0.22290039]
 [0.00340271 0.01069641 0.00712204 ... 0.00698471 0.00468063 0.04803467]
 [0.00547028 0.0050621  0.01963806 ... 0.00720596 0.00772476 0.03271484]
 [0.00400543 0.02442932 0.00223351 ... 0.00769424 0.00413513 0.58496094]]

Generate the test predictions

In [ ]:
test_df = pd.read_csv("/content/data/test.csv")
Out[ ]:
2 CC(=O)/C=C/C1=CCC[C@H](C1(C)C)C
In [ ]:
predictions, raw_outputs = model.predict(test_df["SMILES"].tolist())

In [ ]:
from tqdm.notebook import tqdm
final_preds = []
for i,row in tqdm(test_df.iterrows(),total=len(test_df)):
    #predictions, raw_outputs = model.predict([row["SMILES"]])
    # Indices of the 15 highest-scoring labels, best first
    order = np.argsort(raw_outputs[i])[::-1][:15]
    labelled_preds = [vocab[k] for k in order]
    # Group the 15 smells into 5 sentences of 3 smells each
    sents = []
    for sent in range(0,15,3):
      sents.append(",".join(labelled_preds[sent:sent+3]))
    pred = ";".join(sents)
    final_preds.append(pred)
print(len(final_preds), len(test_df))
1079 1079
In [ ]:
final = pd.DataFrame({"SMILES":test_df.SMILES.tolist(),"PREDICTIONS":final_preds})
final.head()
Out[ ]:
0 CCC(C)C(=O)OC1CC2CCC1(C)C2(C)C camphor,resinous,fruity;woody,earthy,coniferou...
1 CC(C)C1CCC(C)CC1OC(=O)CC(C)O fruity,sweet,mint;woody,herbal,green;spicy,van...
2 CC(=O)/C=C/C1=CCC[C@H](C1(C)C)C woody,fruity,floral;herbal,resinous,fresh;bals...
3 CC(=O)OCC(COC(=O)C)OC(=O)C fruity,apple,sweet;ethereal,fresh,burnt;herbal...
4 CCCCCCCC(=O)OC/C=C(/CCC=C(C)C)\C fruity,oily,fresh;floral,herbal,fatty;citrus,g...
In [ ]:
final.tail()
Out[ ]:
1074 CC(=CCCC(C)CC=O)C citrus,floral,fresh;aldehydic,lemon,green;herb...
1075 CC(=O)c1ccc(c(c1)OC)O sweet,spicy,phenolic;floral,vanilla,woody;resi...
1076 C[C@@H]1CC[C@H]2[C@@H]1C1[C@H](C1(C)C)CC[C@]2(C)O woody,camphor,earthy;herbal,resinous,musk;gree...
1077 C=C1C=CCC(C)(C)C21CCC(C)O2 woody,floral,green;herbal,sweet,rose;balsamic,...
1078 CCC/C=C/C(OC)OC fruity,green,herbal;apple,fresh,ethereal;sweet...
In [ ]:

This gives a score of ~0.29 on the leaderboard.

Using Graph Networks

deepchem is an amazing library that provides a high quality open-source toolchain that democratizes the use of deep-learning in drug discovery, materials science, quantum chemistry, and biology.

We will be using deepchem to make our GraphConvModel. Molecules naturally lend themselves to being viewed as graphs. Graph Convolutions are one of the most powerful deep learning tools for working with molecular data.

Graph convolutions are similar to CNNs that are used to process images, but they operate on a graph. They begin with a data vector for each node of the graph (for example, the chemical properties of the atom that node represents). Convolutional and pooling layers combine information from connected nodes (for example, atoms that are bonded to each other) to produce a new data vector for each node.
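One round of this neighbour aggregation can be sketched in plain Python. This is a simplified illustration and not DeepChem's GraphConvModel: the function name `graph_conv_step`, the mean aggregation, and the toy feature values are assumptions chosen for clarity, and there are no learned weights.

```python
def graph_conv_step(features, bonds):
    """Return new per-atom vectors: the mean of each atom's vector and its neighbours'."""
    n = len(features)
    neighbours = {i: [i] for i in range(n)}  # every atom also sees itself
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    dim = len(features[0])
    updated = []
    for i in range(n):
        group = neighbours[i]
        updated.append(
            [sum(features[j][d] for j in group) / len(group) for d in range(dim)]
        )
    return updated

# Toy molecule: ethanol, SMILES "CCO" (heavy atoms only).
# Hypothetical feature vector per atom: [is_carbon, is_oxygen]
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = [(0, 1), (1, 2)]

print(graph_conv_step(features, bonds))
```

After one step, the middle carbon's vector already carries information about the oxygen it is bonded to. Stacking several such steps lets information travel further across the molecule.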

In [ ]:
!pip install --pre deepchem
Successfully installed deepchem-2.4.0rc1.dev20201022163145

The following code finds the 15 most frequent smells in the dataset. We will use these smells to pad the results our model predicts, since the submission requires 5 sentences of 3 smells each.
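The padding and grouping step can be sketched as follows. The function name `format_submission`, the example smells, and the stand-in list of frequent smells are hypothetical. Only the output layout (smells joined by commas, sentences joined by semicolons) follows the prediction format used elsewhere in this notebook.

```python
def format_submission(predicted, frequent_smells, n_sentences=5, per_sentence=3):
    """Pad `predicted` with frequent smells (skipping duplicates) and group into sentences."""
    needed = n_sentences * per_sentence
    smells = list(dict.fromkeys(predicted))[:needed]  # de-duplicate, keep order
    for smell in frequent_smells:
        if len(smells) == needed:
            break
        if smell not in smells:
            smells.append(smell)
    if len(smells) < needed:
        raise ValueError("not enough smells to fill the submission")
    sentences = [
        ",".join(smells[i:i + per_sentence]) for i in range(0, needed, per_sentence)
    ]
    return ";".join(sentences)

# Hypothetical stand-in for the list of most frequent smells
frequent = ["fruity", "green", "sweet", "floral", "woody"]
print(format_submission(["mint", "green"], frequent, n_sentences=2, per_sentence=3))
```

The model's own predictions stay first, so padding only fills the slots the model left empty.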

In [ ]:
from sklearn.preprocessing import MultiLabelBinarizer
def make_sentence_list(sent):
  return sent.split(",")
train_df = pd.read_csv("/content/data/train.csv")
train_df["SENTENCE_LIST"] = train_df.SENTENCE.apply(make_sentence_list)
multilabel_binarizer = MultiLabelBinarizer()
# fit_transform learns the label set and returns the multi-hot matrix in one step
Y = multilabel_binarizer.fit_transform(train_df.SENTENCE_LIST)
d = {}
for x,y in zip(multilabel_binarizer.classes_,Y.sum(axis=0)):
  d[x] = y  # number of molecules carrying each smell

d = sorted(d.items(), key=lambda x: x[1], reverse=True)
top_15 = [x[0] for x in d[:15]]
top_15
Out[ ]:

The GraphConvModel accepts the output of the ConvMolFeaturizer, so we preprocess our SMILES with this featurizer. The DeepChem model cheatsheet lists the featurizer each model expects.

In [ ]:
import deepchem as dc
mols = [Chem.MolFromSmiles(smile) for smile in data_df["text"].tolist()]
feat = dc.feat.ConvMolFeaturizer()
arr = feat.featurize(mols)
wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
In [ ]:
labels = []
train_df = pd.read_csv("data/train.csv")
for x in train_df.SENTENCE.tolist():
  labels.append(get_ohe_label(x))  # one multi-hot vector per molecule
labels = np.array(labels)
print(labels.shape)
(4316, 109)

Create a Train and Validation set

In [ ]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(arr,labels, test_size=.1, random_state=42)
In [ ]:
train_dataset = dc.data.NumpyDataset(X=X_train, y=y_train)
val_dataset = dc.data.NumpyDataset(X=X_test, y=y_test)
print(train_dataset)
<NumpyDataset X.shape: (3884,), y.shape: (3884, 109), w.shape: (3884, 1), task_names: [  0   1   2 ... 106 107 108]>
In [ ]:
print(val_dataset)
<NumpyDataset X.shape: (432,), y.shape: (432, 109), w.shape: (432, 1), ids: [0 1 2 ... 429 430 431], task_names: [  0   1   2 ... 106 107 108]>
In [ ]:
model = dc.models.GraphConvModel(n_tasks=109, mode='classification',dropout=0.2)
model.fit(train_dataset, nb_epoch=110)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/indexed_slices.py:432: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Out[ ]:

The metric we will use to evaluate our model is the Jaccard score, similar to the one used in the competition.
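For reference, the Jaccard score between a predicted and a true label set is the size of their intersection divided by the size of their union. Here is a minimal sketch; the function name `jaccard` and the example labels are illustrative, and the competition's exact scoring rule may differ in detail.

```python
def jaccard(predicted, actual):
    """Jaccard score between two collections of labels, treated as sets."""
    predicted, actual = set(predicted), set(actual)
    union = predicted | actual
    if not union:
        return 1.0  # assumption: two empty label sets count as a perfect match
    return len(predicted & actual) / len(union)

# One shared smell out of two distinct smells overall
print(jaccard(["camphor", "mint"], ["mint"]))  # 0.5
```

A score of 1.0 means the predicted and true smells match exactly, and every extra or missing smell lowers the score.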

In [ ]:
metric = dc.metrics.Metric(dc.metrics.jaccard_score)
print('training set score:', model.evaluate(train_dataset, [metric]))
training set score: {'jaccard_score': 0.3169358218239942}

The following is the ROC AUC score for each of the smell classes on the validation set.

In [ ]:
y_true = val_dataset.y
y_pred = model.predict(val_dataset)
metric = dc.metrics.roc_auc_score
for i in range(109):
    for gt,prediction in zip(y_true[:,i],y_pred[:,i]):
      assert round(prediction[0]+prediction[1])==1,prediction[0]-prediction[1]
    score = metric(dc.metrics.to_one_hot(y_true[:,i]), y_pred[:,i])
    print(vocab[i], score)
alcoholic 0.9597902097902098
aldehydic 0.7968215994531784
alliaceous 0.9701421800947867
almond 0.980327868852459
ambergris 0.9151162790697674
ambery 0.8611374407582939
ambrette 0.9604651162790698
ammoniac 0.9930394431554523
animalic 0.7099206349206348
anisic 0.6938616938616939
apple 0.8399680255795363
balsamic 0.7959270984854967
banana 0.984375
berry 0.7504396482813749
blackcurrant 0.8998247663551402
body 0.8761682242990655
bread 0.9410046728971964
burnt 0.8518450523223793
butter 0.7444964871194379
cacao 0.9045667447306791
camphor 0.9093196314670446
caramellic 0.9230769230769231
cedar 0.732903981264637
cheese 0.8755668322176635
chemical 0.7866715623278868
cherry 0.9867909867909869
cinnamon 0.9945609945609946
citrus 0.8318850654617078
clean 0.7103174603174602
clove 0.6744366744366744
coconut 0.9970794392523364
coffee 0.9812646370023419
coniferous 0.9625292740046838
cooked 0.8358287365379564
cooling 0.6711492564714522
cucumber 0.994199535962877
dairy 0.812675448067372
dry 0.7238095238095238
earthy 0.8255126868265554
ester 0.7302325581395349
ethereal 0.8611782071926999
fatty 0.8433014354066986
fermented 0.602436974789916
floral 0.7697652483756026
fresh 0.7470848300635535
fruity 0.7830530354645534
geranium 0.6523255813953488
gourmand 0.954225352112676
grape 0.8097447795823666
grapefruit 0.9848837209302326
grass 0.7258411580594679
green 0.6896101240514383
herbal 0.6987903225806451
honey 0.822429906542056
hyacinth 0.9622950819672131
jasmin 0.9174491392801252
lactonic 0.6418026418026417
leaf 0.7687796208530806
leather 0.9984459984459985
lemon 0.77380695314187
lily 0.9566627358490567
liquor 0.6599531615925058
meat 0.9577294685990339
medicinal 0.9913928012519562
melon 0.8732311320754718
metallic 0.49393583724569634
mint 0.7727088402270883
mushroom 0.855154028436019
musk 0.88490606780393
musty 0.6343577620173365
nut 0.8486328125
odorless 0.8408018867924529
oily 0.8976111778846154
orange 0.8517214397496088
pear 0.9222748815165877
pepper 0.780885780885781
phenolic 0.9590361445783133
plastic 0.7710280373831775
plum 0.8174418604651162
powdery 0.8420168067226891
pungent 0.666218487394958
rancid 0.6252100840336134
resinous 0.7452023674299038
ripe 0.888631090487239
roasted 0.8001984126984126
rose 0.845460827043691
seafood 0.8552325581395348
sharp 0.6262626262626263
smoky 0.9907192575406032
sour 0.8715966386554622
spicy 0.70349609375
sulfuric 0.9550457652647433
sweet 0.610172321667112
syrup 0.9761124121779859
terpenic 0.839907192575406
tobacco 0.7579225352112676
tropicalfruit 0.8007441530829199
vanilla 0.9428794992175273
vegetable 0.7854594112399643
violetflower 0.6685011709601874
watery 0.919392523364486
waxy 0.8652912621359223
whiteflower 0.8607888631090487
wine 0.6337236533957845
woody 0.7921620091606196

Let's compare the model's predictions with the ground truth. Each output line shows the predicted smells followed by the true smells.

Note: y_pred is of the shape (n_samples,n_tasks,n_classes) with y_pred[:,:,1] corresponding to the probabilities for class 1.

In [ ]:
y_true = val_dataset.y
y_pred = model.predict(val_dataset)
# print(y_true.shape,y_pred.shape)
for i in range(y_true.shape[0]):
  final_pred = []
  prob_val = []
  for y in range(109):
    prediction = y_pred[i,y]
    if prediction[1] > 0.37:  # keep a smell when its class-1 probability exceeds the threshold
      final_pred.append(1)
      prob_val.append(prediction[1])
    else:
      final_pred.append(0)
  smell_ids = np.where(np.array(final_pred)==1)
  smells = [vocab[k] for k in smell_ids[0]]
  smells = [smells[k] for k in np.argsort(np.array(prob_val))] # order by ascending probability
  gt_smell_ids = np.where(np.array(y_true[i])==1)
  gt_smells = [vocab[k] for k in gt_smell_ids[0]]
  print(smells, gt_smells)
['sweet', 'resinous', 'cinnamon', 'berry', 'fruity', 'balsamic'] ['balsamic', 'cinnamon', 'fruity', 'powdery', 'sweet']
['spicy', 'woody'] ['dry', 'herbal']
['sweet', 'resinous', 'fruity', 'phenolic'] ['floral', 'fruity', 'resinous', 'sweet']
['herbal', 'ethereal', 'citrus', 'fresh', 'floral'] ['citrus', 'fresh', 'lily']
['green', 'waxy', 'tropicalfruit', 'butter', 'banana', 'fruity', 'cognac'] ['green', 'pear']
['fresh', 'floral'] ['geranium', 'leaf', 'spicy']
['resinous', 'fruity', 'geranium', 'rose', 'honey'] ['berry', 'geranium', 'honey', 'powdery', 'waxy']
['fermented', 'floral', 'balsamic', 'resinous', 'honey', 'rose'] ['floral', 'herbal', 'lemon', 'rose', 'spicy']
['waxy', 'seafood', 'musty', 'ammoniac', 'chemical', 'cognac', 'aldehydic', 'ethereal', 'pungent', 'oily', 'terpenic'] ['chemical']
['fresh', 'fermented', 'ester', 'ethereal', 'cognac', 'apple', 'banana', 'fruity'] ['apple', 'fermented']
['meat', 'dairy', 'cooked', 'vegetable', 'alliaceous', 'sulfuric'] ['alliaceous', 'fresh', 'gourmand', 'metallic', 'spicy', 'sulfuric', 'vegetable']
['fresh', 'floral', 'fruity', 'resinous'] ['green', 'herbal', 'jasmin']
['lily', 'lemon', 'terpenic', 'floral', 'resinous'] ['coniferous', 'floral', 'resinous']
['ethereal', 'floral', 'rose', 'fresh', 'herbal', 'berry', 'fruity', 'apple'] ['fruity', 'herbal']
['camphor', 'mint'] ['mint']
['resinous', 'chemical', 'medicinal', 'spicy', 'leather', 'phenolic'] ['medicinal', 'phenolic', 'resinous', 'spicy']
['cacao', 'lemon', 'cognac', 'butter', 'cheese', 'fresh', 'fruity', 'mushroom', 'ethereal', 'fermented', 'alcoholic'] ['burnt', 'ethereal', 'fermented', 'fresh']
['mint', 'floral', 'fresh'] ['melon', 'tobacco']
['terpenic', 'woody'] ['chemical', 'resinous', 'rose']
['grapefruit', 'fresh', 'terpenic', 'vanilla', 'woody', 'dairy', 'musk', 'gourmand', 'ambery', 'camphor'] ['woody']
['gourmand', 'burnt', 'caramellic', 'phenolic', 'sour'] ['syrup']
['fermented', 'floral', 'lily', 'herbal', 'ethereal', 'fresh'] ['clean', 'floral', 'fresh', 'grass', 'lily', 'resinous', 'spicy', 'sweet']
['waxy', 'fresh', 'oily', 'rose', 'ethereal'] ['citrus', 'fresh', 'rose', 'waxy']
['woody', 'ethereal', 'earthy', 'cedar', 'camphor'] ['camphor', 'clean', 'cooling', 'green', 'resinous']
['oily', 'lily', 'grapefruit', 'leaf', 'fresh', 'citrus', 'aldehydic', 'lemon', 'herbal'] ['fruity', 'grass', 'green', 'herbal']
['herbal', 'floral', 'fruity'] ['animalic', 'green', 'pear', 'tropicalfruit']
['dairy', 'vanilla'] ['cedar', 'fresh', 'leaf', 'mint']
['fresh', 'oily', 'fruity', 'pear'] ['floral', 'fresh', 'fruity', 'herbal']
['caramellic', 'gourmand', 'meat', 'sulfuric', 'bread'] ['bread', 'fermented', 'gourmand', 'meat', 'nut', 'sulfuric']
['blueberry', 'sweet', 'fresh', 'liquor', 'plum', 'apple', 'fruity', 'ethereal'] ['chemical', 'ethereal', 'fruity', 'sweet']
['butter', 'banana', 'wine', 'dairy', 'fermented', 'sour', 'burnt', 'cheese', 'caramellic', 'fruity'] ['burnt', 'ester', 'fruity', 'pungent']
['nut', 'earthy', 'gourmand', 'vegetable'] ['meat', 'roasted', 'vegetable']
['fresh', 'citrus', 'aldehydic', 'floral'] ['orange']
['mint', 'leather', 'burnt', 'cedar', 'body', 'cheese', 'sour'] ['balsamic', 'pungent', 'sour']
['waxy', 'woody', 'floral'] ['balsamic', 'rose']
['fruity', 'whiteflower', 'balsamic', 'medicinal'] ['cooling', 'medicinal', 'sweet']
['leather', 'nut', 'earthy', 'coffee'] ['coffee', 'earthy', 'nut']
['caramellic', 'nut', 'fruity'] ['fruity', 'tobacco']
['mint', 'fresh', 'citrus'] ['mint']
['ethereal', 'woody', 'ambery'] ['ambery', 'cedar', 'dry', 'ethereal', 'herbal']
['camphor', 'woody', 'musk', 'earthy'] ['ambery', 'rancid', 'woody']
['dairy', 'butter', 'cheese', 'sour'] ['odorless']
['herbal', 'floral', 'lily', 'aldehydic', 'resinous', 'fresh'] ['fresh']
['resinous', 'mint', 'fresh', 'spicy', 'camphor'] ['camphor', 'cedar']
['gourmand', 'burnt', 'sulfuric', 'roasted', 'chemical', 'smoky', 'seafood', 'meat', 'alliaceous'] ['alliaceous', 'fruity', 'green']
['herbal', 'caramellic', 'fruity', 'ethereal', 'fresh'] ['ethereal', 'fresh', 'fruity']
['ethereal', 'tobacco', 'herbal', 'fruity'] ['floral', 'fruity']
['fruity', 'herbal', 'pungent', 'fresh', 'fatty', 'aldehydic'] ['citrus', 'dairy', 'floral', 'fruity']
['chemical', 'herbal', 'resinous', 'mint', 'pungent', 'terpenic'] ['roasted']
['ethereal', 'fresh', 'ester', 'pear', 'fruity', 'apple', 'banana'] ['fruity']
['chemical', 'banana', 'caramellic', 'burnt', 'sharp', 'butter', 'fruity'] ['burnt', 'caramellic', 'fruity', 'resinous']
['banana', 'fresh', 'ester', 'fruity'] ['fatty', 'fruity']
['alliaceous', 'metallic', 'vegetable', 'cheese', 'sulfuric', 'tropicalfruit'] ['alliaceous', 'tropicalfruit']
['fruity', 'coconut', 'jasmin'] ['fruity', 'jasmin']
['floral', 'waxy', 'aldehydic', 'green', 'leaf', 'fresh', 'rose', 'hyacinth'] ['green', 'hyacinth']
['oily', 'lily', 'grapefruit', 'leaf', 'fresh', 'citrus', 'aldehydic', 'lemon', 'herbal'] ['grass', 'green', 'tobacco']
['cheese', 'sour'] ['herbal', 'resinous']
['ethereal', 'wine', 'liquor', 'caramellic', 'fruity', 'apple'] ['animalic', 'caramellic', 'fruity', 'herbal']
['nut', 'dairy', 'spicy', 'fruity', 'mint', 'coconut'] ['chemical', 'fruity', 'herbal', 'pungent']
['fruity', 'fresh', 'mint'] ['mint']
['ethereal', 'tobacco', 'herbal', 'fruity'] ['herbal', 'tropicalfruit']
['ethereal', 'herbal', 'fruity', 'fresh', 'mushroom'] ['green', 'liquor', 'mushroom', 'sweet']
['vanilla', 'resinous', 'grass', 'medicinal', 'coconut', 'phenolic'] ['floral', 'fruity', 'resinous', 'sweet']
['fresh', 'musk', 'mint'] ['chemical', 'mint', 'musk']
['alcoholic', 'fresh', 'ethereal', 'fruity'] ['apple', 'fermented']
['earthy', 'fruity'] ['herbal', 'spicy']
['alliaceous', 'sulfuric'] ['burnt', 'resinous', 'sulfuric']
['fresh', 'coffee', 'burnt'] ['coffee', 'woody']
['fresh', 'melon', 'rancid', 'waxy', 'citrus', 'aldehydic', 'cucumber'] ['aldehydic', 'citrus']
['phenolic', 'sour', 'balsamic', 'spicy', 'vanilla'] ['odorless']
['pungent', 'meat', 'sulfuric', 'camphor', 'animalic', 'almond', 'vegetable', 'earthy', 'roasted', 'alcoholic', 'rancid', 'seafood', 'chemical'] ['ammoniac']
['fresh', 'balsamic', 'resinous', 'cinnamon', 'honey', 'fermented', 'alcoholic', 'bread', 'rose'] ['balsamic', 'hyacinth']
['herbal', 'lily', 'floral', 'alcoholic', 'fresh'] ['aldehydic', 'floral', 'grass', 'leaf']
['tobacco', 'burnt', 'animalic', 'sulfuric', 'blackcurrant'] ['fermented', 'green', 'metallic', 'sulfuric', 'tropicalfruit', 'woody']
['fatty', 'fruity', 'waxy', 'pear'] ['fruity', 'green', 'oily']
['body', 'cheese', 'sour'] ['cheese', 'fruity']
['sweet', 'dairy', 'cognac', 'grape', 'sour', 'fruity', 'fermented', 'ethereal', 'alcoholic'] ['alcoholic', 'chemical', 'ethereal']
['caramellic'] ['odorless']
['cheese', 'alcoholic', 'fermented', 'fresh', 'burnt', 'chemical', 'ethereal', 'fruity', 'banana'] ['chemical', 'dairy', 'fruity', 'green', 'herbal', 'sharp']
['fruity', 'mushroom'] ['melon', 'mushroom', 'violetflower']
['blueberry', 'fruity', 'citrus', 'rose'] ['citrus', 'fatty', 'floral']
['fresh', 'herbal'] ['fresh', 'mint']
['smoky', 'sulfuric', 'seafood', 'roasted', 'meat', 'gourmand', 'alliaceous'] ['meat', 'plastic']
['burnt', 'tobacco', 'resinous', 'earthy', 'animalic', 'leather'] ['animalic', 'earthy', 'leather', 'tobacco', 'woody']
['berry', 'violetflower', 'cucumber', 'grass', 'chemical', 'fruity'] ['nut']
['fresh', 'fruity', 'blackcurrant'] ['balsamic', 'herbal', 'rancid']
['citrus', 'sulfuric'] ['animalic', 'berry', 'green', 'mint', 'sulfuric']
['earthy', 'tropicalfruit', 'smoky', 'vegetable', 'seafood', 'gourmand', 'nut', 'roasted', 'alliaceous', 'meat'] ['alliaceous', 'cooked', 'fatty', 'roasted', 'vegetable']
['lily', 'lemon', 'terpenic', 'floral', 'resinous'] ['coniferous', 'floral', 'resinous']
['gourmand', 'caramellic', 'nut', 'meat', 'bread'] ['ethereal', 'sweet']
['banana', 'fresh', 'ethereal', 'fruity'] ['fruity', 'vegetable']
['woody'] ['citrus', 'resinous', 'spicy']
['herbal', 'floral', 'fruity', 'resinous'] ['floral', 'green', 'herbal', 'plum']
['lily', 'honey', 'resinous', 'floral', 'rose', 'balsamic'] ['herbal', 'rose']
['spicy', 'vanilla', 'leather', 'medicinal', 'smoky', 'phenolic'] ['medicinal', 'phenolic', 'spicy']
['anisic', 'phenolic', 'cinnamon', 'sweet', 'cherry', 'vanilla', 'almond'] ['almond', 'balsamic', 'cherry', 'floral', 'sweet']
['cooling', 'mint'] ['cooling', 'sweet']
['green', 'terpenic', 'camphor', 'citrus'] ['citrus', 'earthy', 'green', 'mint', 'spicy', 'sweet', 'woody']
['citrus', 'orange', 'violetflower', 'fruity', 'resinous', 'floral', 'lily'] ['floral', 'lemon', 'orange']
['lactonic', 'fruity'] ['floral', 'fruity', 'woody']
['herbal', 'lemon', 'citrus'] ['aldehydic', 'dry', 'green', 'lemon', 'orange']
['cooked', 'meat', 'gourmand'] ['cheese', 'gourmand', 'meat']
['honey', 'green', 'fruity', 'sweet', 'herbal', 'rose', 'cherry', 'almond', 'hyacinth'] ['chemical', 'floral', 'pungent', 'resinous']
['medicinal', 'vanilla', 'phenolic'] ['floral', 'herbal', 'sweet']
['camphor', 'woody', 'earthy'] ['ambery', 'tobacco', 'woody']
['cedar', 'floral', 'ambergris', 'violetflower', 'woody'] ['balsamic', 'woody']
['berry', 'violetflower', 'cucumber', 'grass', 'chemical', 'fruity'] ['dairy', 'mushroom']
['body', 'fruity', 'pungent', 'sour', 'cheese'] ['meat', 'oily', 'roasted', 'sour']
['sweet', 'cherry', 'almond'] ['almond', 'cherry', 'spicy', 'sweet']
['sweet', 'phenolic', 'blackcurrant', 'cinnamon'] ['phenolic', 'spicy']
['fermented', 'cheese', 'cognac', 'musty', 'apple', 'banana', 'butter', 'fruity'] ['banana', 'meat', 'ripe']
['burnt', 'bread', 'fruity', 'berry', 'syrup', 'butter', 'caramellic'] ['berry', 'caramellic', 'clean']
['fresh', 'ethereal', 'body', 'burnt', 'fruity'] ['pungent', 'vegetable']
['sulfuric', 'spicy'] ['herbal', 'spicy', 'sulfuric', 'vegetable']
['fresh', 'floral', 'fruity', 'resinous'] ['floral', 'fresh', 'fruity', 'resinous']
['metallic', 'tropicalfruit', 'cacao', 'alliaceous', 'resinous', 'syrup', 'green', 'leaf', 'roasted', 'meat', 'vegetable'] ['cooked', 'meat', 'nut']
['fruity'] ['fatty', 'floral', 'sweet']
['odorless', 'waxy', 'cognac', 'fruity', 'oily'] ['oily']
['oily', 'fruity'] ['liquor', 'oily', 'wine']
['honey', 'apple', 'mint', 'fruity'] ['fresh', 'herbal', 'woody']
['fruity', 'woody', 'mint'] ['camphor', 'musty', 'woody']
['sharp', 'body', 'pungent', 'lemon', 'pepper', 'fresh', 'chemical', 'ethereal', 'terpenic'] ['chemical', 'ethereal']
['sweet', 'phenolic', 'resinous', 'balsamic'] ['balsamic', 'floral', 'herbal']
['lemon', 'fresh', 'herbal'] ['aldehydic', 'floral', 'herbal']
['pear', 'pungent', 'green', 'tropicalfruit', 'butter', 'banana', 'fruity', 'apple', 'cognac'] ['apple', 'banana', 'fatty']
['sweet', 'pungent', 'pear', 'blueberry', 'ethereal', 'ester', 'fruity', 'wine', 'fermented', 'banana', 'apple'] ['apple', 'banana']
['waxy', 'cognac', 'fruity', 'rose'] ['citrus', 'fatty', 'floral', 'fresh']
['phenolic', 'medicinal'] ['phenolic', 'plastic']
['alliaceous', 'spicy', 'vegetable'] ['spicy', 'sulfuric']
['leaf', 'waxy', 'oily', 'vegetable', 'melon', 'cucumber'] ['cucumber', 'green', 'melon', 'mushroom', 'oily', 'seafood']
['mint', 'phenolic'] ['burnt', 'herbal', 'phenolic', 'sweet']
['butter', 'fruity', 'wine', 'apple'] ['apple']
['grass', 'woody', 'camphor', 'liquor', 'ethereal', 'fresh'] ['camphor', 'woody']
['woody', 'fruity', 'tobacco', 'musk', 'berry'] ['syrup', 'woody']
['green', 'banana', 'apple', 'pear'] ['green', 'pear', 'resinous', 'tropicalfruit']
['resinous', 'animalic', 'musk', 'earthy'] ['balsamic', 'phenolic']
['camphor'] ['alliaceous', 'roasted']
['mint', 'resinous', 'cedar', 'camphor'] ['camphor', 'fresh']
['fruity', 'waxy', 'oily', 'pear'] ['fruity', 'oily']
['herbal', 'fruity', 'green', 'fresh'] ['ethereal', 'fatty', 'liquor']
['musk', 'mint'] ['herbal']
['fruity', 'resinous', 'cinnamon', 'leather', 'cacao', 'almond', 'bread', 'caramellic', 'coffee', 'burnt'] ['bread', 'caramellic', 'phenolic', 'woody']
['mushroom', 'waxy', 'fatty', 'fruity', 'pear'] ['green', 'melon', 'pear', 'tropicalfruit', 'waxy']
['resinous', 'apple', 'ester', 'butter', 'blackcurrant', 'sour', 'sweet', 'berry', 'cherry', 'fruity', 'cinnamon', 'fermented', 'balsamic', 'blueberry'] ['balsamic', 'fruity']
['dairy', 'fruity', 'oily', 'musk', 'woody', 'animalic'] ['ambergris', 'animalic', 'anisic', 'clean', 'fatty', 'musk', 'powdery']
['fresh', 'fruity'] ['floral', 'green', 'herbal']
['fruity', 'mint'] ['cooling', 'tropicalfruit', 'woody']
['fruity', 'coniferous', 'resinous', 'camphor'] ['camphor', 'nut', 'resinous']
['spicy', 'phenolic'] ['floral', 'mint', 'sweet', 'violetflower', 'woody']
['oily', 'fruity', 'dairy', 'melon', 'aldehydic', 'cucumber'] ['fruity']
['resinous'] ['rose']
['cherry', 'sweet', 'anisic'] ['anisic']
['sweet', 'camphor', 'mint'] ['mint']
['roasted', 'berry', 'cheese', 'sour', 'grass', 'mint', 'coconut', 'syrup', 'phenolic', 'caramellic', 'bread', 'burnt'] ['body', 'gourmand', 'nut', 'spicy', 'syrup']
['orange', 'resinous'] ['aldehydic', 'citrus', 'green']
['rose', 'tobacco', 'fruity', 'honey'] ['honey', 'rose']
['spicy', 'cinnamon', 'sweet', 'balsamic'] ['balsamic', 'spicy']
['medicinal', 'balsamic', 'earthy', 'resinous', 'camphor'] ['camphor', 'coniferous', 'cooling']
['ester', 'fermented', 'cognac', 'butter', 'cheese', 'apple', 'fruity', 'banana'] ['butter', 'fruity']
['fruity', 'hyacinth', 'rose'] ['balsamic', 'floral']
['citrus', 'woody', 'spicy', 'fresh', 'camphor'] ['fresh', 'mint']
['fermented', 'sweet', 'banana', 'apple', 'ethereal', 'fruity'] ['banana', 'cheese', 'tropicalfruit']
['woody', 'herbal', 'spicy', 'fresh', 'resinous'] ['aldehydic', 'fresh', 'herbal']
['medicinal', 'coffee', 'meat', 'burnt', 'seafood', 'sulfuric', 'alliaceous'] ['burnt', 'caramellic', 'coffee']
['resinous', 'sweet'] ['balsamic', 'musk']
['woody'] ['clean', 'fresh', 'herbal']
['fruity', 'fresh', 'mint'] ['camphor']
['lily', 'fresh', 'ester', 'woody', 'citrus', 'ethereal', 'fruity', 'lemon', 'floral'] ['ethereal', 'floral', 'lemon']
['waxy', 'rose', 'pear', 'fruity', 'ester', 'banana'] ['banana', 'body', 'fermented', 'green', 'meat', 'melon']
['herbal', 'burnt', 'ethereal', 'fresh', 'fruity'] ['berry', 'cheese', 'green', 'herbal']
['herbal', 'cooling', 'mint', 'fruity'] ['cooling', 'fruity']
['musty', 'bread', 'meat', 'seafood', 'roasted', 'smoky', 'phenolic', 'vegetable', 'earthy', 'caramellic', 'burnt', 'cacao', 'coffee', 'body', 'nut'] ['cacao', 'caramellic', 'dry', 'musty', 'nut', 'vanilla']
['herbal', 'balsamic', 'resinous', 'fruity', 'woody'] ['cedar', 'fresh', 'sharp']
['aldehydic', 'metallic', 'ethereal', 'citrus', 'lemon'] ['herbal', 'lemon']
['terpenic', 'fresh', 'camphor'] ['camphor', 'resinous']
['herbal', 'woody', 'blackcurrant', 'camphor', 'leaf', 'tobacco', 'cooling'] ['pepper', 'woody']
['resinous', 'balsamic', 'honey', 'rose', 'fruity'] ['herbal', 'sweet', 'wine']
['tropicalfruit', 'fruity'] ['burnt', 'fruity', 'herbal']
['berry', 'cedar', 'woody', 'ambergris'] ['ambery', 'woody']
['fruity', 'apple', 'wine'] ['tropicalfruit', 'wine']
['tropicalfruit', 'fruity'] ['apple', 'berry']
['cinnamon', 'chemical', 'phenolic', 'bread', 'blackcurrant', 'camphor', 'anisic'] ['floral', 'orange', 'resinous', 'sweet']
['sweet', 'nut', 'tobacco', 'caramellic', 'ethereal', 'fruity'] ['burnt', 'caramellic', 'fresh', 'fruity']
['tropicalfruit', 'meat', 'alliaceous', 'berry', 'blackcurrant', 'sulfuric'] ['floral', 'grapefruit', 'lemon']
['fresh', 'alcoholic', 'ethereal', 'fermented'] ['citrus', 'floral', 'fresh', 'oily', 'sweet']
['sulfuric', 'tropicalfruit', 'alliaceous', 'vegetable'] ['cheese', 'sulfuric', 'tropicalfruit', 'vegetable']
['clove', 'medicinal', 'vanilla', 'leather', 'phenolic'] ['phenolic', 'spicy']
['gourmand', 'burnt', 'fruity', 'roasted', 'meat', 'cooked'] ['caramellic', 'dairy']
['green', 'woody'] ['anisic', 'floral', 'fruity', 'woody']
[] ['herbal', 'spicy']
['ambergris', 'ethereal', 'floral', 'woody', 'violetflower'] ['floral', 'fruity', 'woody']
['floral', 'geranium', 'fresh', 'citrus', 'fruity', 'rose'] ['floral', 'fresh', 'fruity']
[] ['aldehydic', 'fresh', 'herbal', 'woody']
['fruity', 'coconut'] ['coconut', 'dairy', 'fruity']
['earthy', 'camphor', 'anisic', 'green', 'spicy', 'woody'] ['clove', 'herbal', 'rose', 'woody']
['animalic', 'gourmand', 'bread', 'earthy', 'terpenic'] ['balsamic', 'earthy', 'green', 'musk']
['fermented', 'alcoholic', 'oily', 'ethereal', 'cheese', 'fruity', 'mushroom'] ['fresh', 'oily']
['resinous', 'floral', 'aldehydic', 'lily', 'fresh'] ['clean', 'fresh']
['honey', 'balsamic', 'fruity', 'rose'] ['dry', 'herbal', 'rose']
['ethereal', 'fruity', 'apple'] ['fruity', 'green', 'musty', 'pungent']
['fresh', 'herbal', 'woody', 'terpenic'] ['fresh', 'herbal', 'waxy']
['woody', 'ambery'] ['ambery', 'herbal', 'woody']
['waxy', 'cognac', 'fruity', 'rose'] ['rose', 'tropicalfruit']
['grapefruit', 'cooling', 'mint'] ['dry', 'herbal']
['anisic', 'chemical', 'apple', 'alliaceous', 'liquor', 'ethereal', 'fruity'] ['chemical', 'ethereal', 'fresh', 'fruity', 'plastic']
['floral', 'lemon'] ['floral', 'green', 'lemon']
['waxy', 'oily', 'pear', 'rose', 'fruity'] ['citrus', 'fatty', 'waxy']
['fatty', 'geranium', 'fermented', 'ethereal', 'fresh', 'citrus', 'rose'] ['fatty', 'floral', 'mint']
['vanilla', 'syrup', 'coconut', 'mint'] ['coconut', 'cooling', 'fruity']
['overripe', 'dairy', 'ethereal', 'fermented', 'fruity', 'cheese', 'alcoholic', 'mushroom'] ['mushroom', 'musty', 'oily']
['cognac', 'body', 'fresh', 'ripe', 'fruity', 'pear', 'apple', 'ethereal', 'banana'] ['apple', 'balsamic', 'banana', 'pear', 'woody']
['nut', 'earthy'] ['herbal', 'nut']
['coffee', 'cooked', 'vegetable', 'sulfuric', 'gourmand', 'meat', 'alliaceous'] ['cooked', 'dairy', 'roasted']
['butter', 'gourmand', 'bread', 'caramellic', 'syrup'] ['clean', 'syrup']
['cedar', 'fruity', 'berry', 'woody', 'violetflower'] ['berry', 'green', 'sweet', 'woody']
['tropicalfruit', 'cheese', 'dairy', 'fruity', 'alliaceous', 'vegetable', 'meat', 'sulfuric'] ['animalic', 'grape', 'rancid']
['phenolic', 'coffee', 'animalic', 'leather', 'earthy', 'burnt'] ['floral', 'resinous']
['jasmin', 'phenolic', 'fruity', 'fatty', 'balsamic', 'animalic'] ['balsamic', 'nut', 'vanilla']
['camphor', 'cooling', 'terpenic', 'sulfuric'] ['citrus', 'resinous', 'sulfuric']
['cooked', 'alliaceous'] ['alliaceous', 'earthy', 'spicy', 'sulfuric', 'vegetable']
['sweet', 'smoky', 'resinous', 'green', 'cherry', 'fruity', 'balsamic'] ['balsamic', 'berry', 'cacao', 'green', 'herbal', 'whiteflower', 'woody']
['butter', 'caramellic', 'sour'] ['sour', 'spicy', 'vegetable']
['tropicalfruit', 'green', 'pear', 'fruity', 'apple', 'banana'] ['dairy', 'fresh', 'fruity', 'green']
['melon', 'fruity', 'earthy', 'fresh', 'ethereal', 'grass', 'alcoholic', 'leaf', 'vegetable'] ['fresh', 'leaf']
['floral', 'cinnamon', 'fruity', 'balsamic'] ['balsamic', 'fruity']
['fresh', 'balsamic', 'honey', 'resinous', 'floral', 'rose', 'lily'] ['rose']
['leaf', 'waxy', 'oily', 'vegetable', 'melon', 'cucumber'] ['floral', 'melon', 'oily', 'vegetable']
['woody', 'resinous', 'fruity'] ['ester', 'fruity', 'woody']
['fresh', 'fruity'] ['aldehydic', 'ambery', 'green', 'lemon', 'waxy']
['fresh', 'herbal', 'coconut', 'woody', 'rancid', 'fruity'] ['floral', 'fresh', 'fruity', 'mint']
['ambrette', 'dry', 'musk'] ['fruity', 'leaf', 'violetflower']
['herbal', 'fruity', 'coconut'] ['coconut', 'dairy', 'herbal', 'lactonic']
['herbal', 'jasmin', 'sweet', 'fruity', 'woody'] ['earthy', 'herbal']
['fruity'] ['apple']
['grapefruit', 'fresh', 'floral'] ['balsamic', 'woody']
['nut', 'phenolic', 'meat'] ['caramellic', 'cooked', 'meat']
['terpenic', 'oily'] ['waxy']
['herbal', 'fresh', 'resinous', 'pungent', 'camphor', 'coniferous', 'terpenic'] ['camphor']
['gourmand', 'nut', 'green', 'earthy', 'alliaceous', 'vegetable', 'meat'] ['mint', 'vegetable']
['fruity', 'cinnamon', 'caramellic', 'leather', 'coffee', 'burnt'] ['resinous', 'sweet']
['alcoholic', 'earthy', 'musk'] ['earthy', 'musk']
['lily', 'lemon', 'terpenic', 'floral', 'resinous'] ['floral']
['jasmin', 'resinous'] ['aldehydic', 'ethereal', 'fruity', 'jasmin']
['oily', 'fruity'] ['jasmin', 'oily', 'watery']
['butter', 'plum', 'lactonic', 'berry', 'fruity'] ['apple', 'fruity', 'green', 'resinous']
['sweet', 'almond', 'cherry', 'balsamic', 'cinnamon'] ['blackcurrant', 'cinnamon']
['chemical', 'terpenic', 'green', 'oily', 'cucumber'] ['green', 'resinous']
['banana', 'cheese', 'butter', 'fruity'] ['apple', 'banana', 'berry']
['fermented', 'pungent', 'butter', 'fruity', 'cheese', 'sour'] ['body', 'cheese', 'sour']
['cheese', 'sulfuric', 'cooked', 'dairy', 'meat', 'gourmand', 'alliaceous'] ['sulfuric', 'sweet']
['rose', 'green'] ['earthy', 'floral', 'sweet']
['rose', 'seafood', 'meat', 'roasted', 'sulfuric', 'ammoniac', 'plastic', 'burnt', 'chemical', 'alliaceous'] ['burnt', 'earthy', 'sulfuric']
['blueberry', 'waxy', 'fruity', 'floral', 'citrus', 'lily', 'fresh'] ['citrus', 'earthy', 'floral']
['odorless', 'fruity'] ['balsamic', 'clean', 'green', 'plastic']
['fermented', 'alcoholic', 'oily', 'ethereal', 'cheese', 'fruity', 'mushroom'] ['floral', 'grass', 'lemon', 'sweet']
['herbal', 'balsamic', 'mint', 'spicy', 'woody'] ['herbal', 'spicy', 'woody']
['gourmand', 'roasted', 'meat', 'alliaceous', 'sulfuric', 'burnt'] ['meat', 'roasted', 'sulfuric']
['cinnamon', 'resinous', 'body', 'balsamic', 'tobacco', 'leather', 'animalic', 'sour'] ['balsamic', 'burnt', 'sour']
['resinous', 'chemical', 'animalic', 'floral'] ['floral', 'jasmin', 'seafood']
['sweet', 'vanilla', 'medicinal', 'phenolic'] ['medicinal', 'phenolic', 'sweet']
['vegetable', 'nut', 'bread', 'earthy'] ['earthy', 'floral', 'nut', 'pepper']
['grape', 'caramellic', 'burnt', 'fruity'] ['caramellic', 'fruity']
['burnt', 'alliaceous', 'sulfuric'] ['caramellic']
['earthy', 'musty', 'coffee', 'burnt', 'caramellic', 'cacao', 'nut'] ['green', 'resinous', 'roasted']
['earthy', 'nut', 'coffee'] ['butter', 'cooked', 'earthy']
['fresh', 'aldehydic', 'lily', 'floral', 'citrus'] ['aldehydic', 'hyacinth', 'lily', 'watery', 'waxy']
['fresh', 'fermented', 'alcoholic', 'odorless'] ['odorless']
['burnt', 'leather', 'coffee', 'medicinal', 'smoky', 'phenolic'] ['phenolic', 'spicy']
['violetflower', 'woody'] ['ambergris', 'ambery', 'animalic', 'dry', 'fresh', 'metallic']
['fresh', 'ethereal', 'caramellic', 'fruity'] ['alcoholic', 'ethereal', 'musty', 'nut']
['hyacinth', 'fruity', 'honey'] ['honey', 'resinous']
['resinous', 'balsamic', 'rose', 'fruity'] ['balsamic', 'fruity']
['fruity'] ['floral', 'fresh', 'fruity']
['burnt', 'alliaceous', 'sulfuric', 'roasted', 'meat'] ['meat']
['sweet', 'ethereal', 'fruity', 'banana'] ['banana', 'melon', 'pear']
['bread', 'berry', 'syrup', 'caramellic'] ['bread', 'burnt', 'syrup']
['violetflower', 'floral', 'woody'] ['woody']
['resinous', 'grape', 'bread', 'caramellic', 'ethereal', 'burnt', 'fruity', 'mushroom'] ['berry', 'wine']
['cooked'] ['odorless']
['banana', 'ester', 'body', 'fruity', 'nut', 'sour', 'butter', 'cheese'] ['fruity']
['rose', 'floral', 'fresh', 'aldehydic', 'citrus'] ['aldehydic', 'floral', 'watery', 'waxy']
['fresh', 'ethereal', 'banana', 'fruity'] ['pear', 'rose']
['aldehydic', 'pungent', 'cucumber', 'chemical', 'ethereal'] ['aldehydic', 'ethereal', 'fresh']
['clean', 'fresh', 'herbal'] ['fresh', 'green', 'mint', 'rose']
['woody', 'balsamic', 'fruity', 'floral'] ['fruity', 'resinous', 'roasted']
['vegetable', 'body', 'burnt', 'caramellic', 'earthy', 'cacao', 'coffee', 'nut'] ['cacao', 'earthy', 'nut']
['mint', 'fresh', 'camphor'] ['cooling', 'musty', 'spicy']
['coniferous', 'resinous', 'camphor'] ['camphor', 'rancid', 'resinous']
['mint', 'fresh'] ['clean', 'floral', 'fresh', 'grass', 'lily', 'resinous', 'spicy', 'sweet']
['fruity', 'resinous'] ['fruity', 'herbal', 'resinous', 'rose']
['earthy', 'mint', 'fruity', 'jasmin'] ['herbal', 'jasmin', 'oily']
['herbal', 'leaf', 'hyacinth', 'rose'] ['leaf', 'vegetable']
['geranium', 'citrus', 'apple', 'fruity', 'rose'] ['apple', 'floral']
['vegetable', 'meat', 'sulfuric', 'alliaceous'] ['green', 'sulfuric', 'sweet']
['almond', 'smoky', 'chemical', 'medicinal', 'coffee', 'leather', 'phenolic'] ['earthy', 'leather', 'medicinal', 'phenolic']
['fresh', 'fruity'] ['green', 'mushroom', 'oily', 'sweet']
['fresh', 'rose', 'citrus', 'fruity', 'floral'] ['green', 'pear', 'rose', 'tropicalfruit', 'waxy', 'woody']
['fresh', 'coniferous', 'resinous', 'camphor'] ['herbal']
['waxy', 'geranium', 'floral', 'banana', 'fruity', 'rose'] ['cooling', 'green', 'herbal', 'rose', 'waxy']
['mint'] ['odorless']
['body', 'ethereal', 'nut', 'burnt', 'seafood', 'roasted'] ['bread', 'nut', 'vegetable', 'woody']
['phenolic', 'syrup', 'fruity', 'earthy', 'coconut'] ['mint', 'spicy']
['cooling', 'mint'] ['mint']
['fruity', 'butter', 'coconut'] ['coconut', 'dairy', 'fruity', 'resinous']
['fatty', 'dairy'] ['waxy', 'woody']
['chemical', 'fruity', 'caramellic', 'ethereal'] ['burnt', 'fresh', 'fruity', 'sweet']
[] ['odorless']
['apple', 'balsamic', 'wine', 'geranium', 'caramellic', 'herbal', 'fruity'] ['fruity', 'herbal']
['green', 'chemical', 'waxy', 'vegetable', 'mushroom', 'fruity', 'pear'] ['body', 'cooked', 'green', 'pear']
['cinnamon', 'vanilla'] ['vanilla']
['berry', 'meat', 'tropicalfruit', 'alliaceous', 'blackcurrant', 'sulfuric'] ['roasted', 'sweet', 'vegetable']
['pear', 'liquor', 'burnt', 'grape', 'alliaceous', 'ethereal', 'apple', 'caramellic', 'chemical', 'fruity'] ['chemical', 'fruity']
['fruity', 'apple'] ['cheese', 'green', 'spicy', 'tropicalfruit', 'woody']
['dairy', 'burnt', 'earthy', 'cheese', 'metallic', 'vegetable', 'sulfuric', 'alliaceous'] ['blackcurrant', 'burnt', 'sulfuric', 'vegetable']
['fruity', 'woody', 'floral'] ['ambery', 'ambrette', 'dry']
['woody', 'terpenic', 'camphor'] ['woody']
['floral'] ['clean', 'fresh']
['green', 'apple', 'banana', 'fruity'] ['apple', 'herbal']
['herbal', 'floral', 'fresh', 'woody', 'fruity'] ['apple', 'herbal', 'woody']
['alliaceous', 'cooked', 'burnt', 'ethereal', 'seafood', 'ammoniac', 'chemical'] ['alliaceous', 'cheese', 'cooked', 'mushroom']
['resinous', 'floral', 'phenolic'] ['earthy', 'spicy']
['caramellic', 'odorless'] ['odorless']
['fruity', 'floral', 'lily'] ['citrus', 'fresh', 'lily']
['lily', 'spicy', 'floral', 'fresh', 'resinous'] ['floral', 'fresh', 'fruity', 'green', 'musty']
['fermented', 'cooked', 'blackcurrant', 'butter', 'sour', 'gourmand', 'burnt', 'caramellic', 'berry', 'phenolic', 'bread', 'syrup'] ['caramellic', 'fruity', 'resinous']
['sweet', 'phenolic', 'smoky', 'vanilla'] ['vanilla']
['musk', 'herbal', 'mint'] ['mint', 'musk', 'spicy']
['leather', 'sweet', 'coconut', 'dairy', 'fruity', 'almond', 'vanilla', 'medicinal'] ['floral', 'medicinal', 'phenolic']
['grapefruit', 'floral', 'fruity', 'resinous'] ['balsamic', 'fruity', 'green', 'resinous', 'woody']
['floral', 'fresh', 'lily'] ['citrus', 'floral', 'green', 'melon']
['fermented', 'fruity', 'fresh', 'apple', 'banana', 'alcoholic', 'ethereal'] ['ethereal', 'fresh', 'fruity']
['vegetable', 'chemical', 'alliaceous'] ['ethereal', 'sulfuric']
['fatty', 'fruity', 'sour'] ['butter']
['floral', 'sweet', 'resinous', 'honey', 'balsamic', 'fruity'] ['fruity', 'musty', 'resinous']
['fermented', 'sour', 'caramellic', 'fruity'] ['berry', 'green', 'herbal', 'honey', 'tropicalfruit']
['medicinal', 'phenolic', 'leather'] ['leather', 'phenolic', 'terpenic']
['citrus', 'green', 'woody', 'fresh', 'lily', 'herbal', 'lemon', 'fruity', 'floral'] ['citrus', 'lemon']
['floral', 'lily'] ['ethereal', 'lily']
['woody'] ['earthy', 'woody']
['medicinal', 'vanilla', 'resinous', 'fruity', 'spicy', 'phenolic'] ['fruity', 'herbal', 'phenolic', 'spicy']
['fruity', 'pungent', 'sour'] ['earthy']
['waxy', 'floral'] ['balsamic', 'floral']
['roasted', 'coffee', 'plastic', 'phenolic', 'medicinal', 'sulfuric', 'seafood', 'alliaceous', 'meat', 'chemical'] ['meat', 'metallic', 'phenolic', 'roasted', 'sulfuric']
['musk', 'camphor', 'earthy'] ['balsamic', 'earthy', 'green', 'liquor', 'musk', 'resinous']
['lemon', 'fresh', 'herbal', 'fruity', 'banana'] ['floral', 'fresh', 'fruity', 'herbal']
['floral', 'berry', 'citrus', 'fruity', 'rose', 'geranium'] ['floral', 'fresh', 'fruity']
['herbal', 'leaf', 'hyacinth', 'rose'] ['hyacinth', 'rose']
['waxy', 'rancid', 'orange', 'fatty', 'aldehydic', 'citrus', 'oily'] ['green', 'herbal', 'orange']
['sweet', 'fresh', 'fruity', 'alcoholic', 'butter'] ['alcoholic', 'cacao', 'cheese', 'liquor', 'vegetable']
['earthy', 'woody', 'cooling', 'camphor'] ['camphor', 'cooling', 'resinous']
['aldehydic', 'lemon', 'floral', 'citrus'] ['citrus', 'fresh', 'herbal']
['vanilla'] ['floral', 'vanilla']
['terpenic', 'coniferous', 'resinous', 'camphor'] ['camphor']
['seafood', 'earthy', 'vegetable', 'meat', 'alliaceous', 'sulfuric', 'bread', 'gourmand', 'nut', 'coffee'] ['vegetable']
['spicy', 'vanilla', 'clove', 'medicinal', 'phenolic', 'smoky'] ['clove', 'sweet', 'woody']
['ethereal', 'woody'] ['animalic', 'fresh', 'fruity', 'green', 'herbal', 'sour', 'spicy', 'woody']
['phenolic', 'spicy', 'dairy', 'vanilla'] ['resinous', 'woody']
['resinous', 'camphor'] ['camphor', 'coniferous', 'resinous']
['oily', 'rose', 'fruity'] ['leaf', 'rose', 'waxy']
['tropicalfruit'] ['apple', 'tropicalfruit', 'waxy']
['green', 'resinous', 'earthy', 'vegetable', 'pepper'] ['green', 'woody']
['floral', 'sweet', 'resinous', 'phenolic', 'fruity'] ['balsamic', 'berry', 'powdery']
['woody', 'fruity', 'herbal', 'resinous'] ['oily', 'resinous']
['leather', 'vanilla', 'smoky', 'medicinal', 'phenolic'] ['phenolic']
['fruity', 'mint'] ['fruity', 'mint']
['resinous', 'herbal', 'woody', 'floral', 'fruity'] ['fruity', 'rose']
['herbal', 'burnt', 'ethereal', 'fresh', 'fruity'] ['berry', 'cheese', 'sweet']
['leaf', 'fruity', 'green'] ['leaf', 'mushroom', 'violetflower']
['herbal', 'camphor', 'spicy'] ['woody']
['burnt', 'vegetable', 'earthy', 'nut', 'meat'] ['coffee', 'earthy', 'meat', 'nut']
['powdery', 'sweet', 'cherry', 'cinnamon', 'almond', 'vanilla'] ['dairy', 'herbal', 'phenolic', 'powdery', 'vanilla']
['rose', 'fresh', 'citrus', 'lily', 'floral'] ['aldehydic', 'waxy']
['ethereal', 'bread', 'floral', 'lily', 'resinous'] ['clean', 'fruity', 'green', 'oily', 'rose']
['fresh', 'mint'] ['herbal', 'sweet', 'tobacco']
['woody', 'alcoholic', 'musk', 'earthy'] ['earthy', 'rancid', 'watery', 'woody']
['chemical', 'spicy', 'smoky', 'medicinal', 'leather', 'phenolic'] ['camphor', 'cooling', 'smoky']
['resinous', 'camphor'] ['balsamic', 'chemical', 'mint', 'resinous', 'sweet']
['floral', 'fruity', 'citrus'] ['green', 'metallic', 'orange', 'rancid', 'sulfuric', 'waxy']
['earthy', 'woody'] ['ambery', 'animalic', 'musk']
['apple', 'oily', 'butter', 'banana', 'plum', 'fruity', 'pear'] ['berry', 'clove', 'herbal']
['oily', 'cognac', 'fruity'] ['fruity', 'green', 'musty', 'waxy']
['tobacco', 'burnt', 'animalic', 'sulfuric', 'blackcurrant'] ['blackcurrant', 'powdery', 'spicy', 'sulfuric']
['alliaceous', 'fruity', 'ethereal'] ['cacao', 'resinous']
['ripe', 'vegetable', 'fresh', 'apple', 'fruity', 'ethereal'] ['aldehydic', 'ethereal', 'fresh', 'fruity']
['mushroom', 'fresh', 'fruity', 'ethereal', 'fermented'] ['green']
['fruity', 'mushroom', 'green', 'sweet', 'alcoholic', 'ethereal', 'resinous', 'leaf'] ['hyacinth', 'leaf', 'mushroom', 'nut']
['dairy', 'meat', 'sulfuric', 'mushroom', 'earthy', 'alliaceous', 'vegetable'] ['meat']
['ethereal', 'oily', 'woody', 'ester', 'banana', 'cinnamon'] ['camphor', 'floral', 'fruity', 'woody']
['resinous', 'fruity', 'cinnamon', 'balsamic'] ['cherry', 'cooling', 'floral', 'green', 'spicy', 'sweet']
['floral', 'fruity', 'geranium', 'rose', 'fresh', 'citrus'] ['citrus', 'floral', 'leaf', 'sweet']
[] ['herbal', 'vegetable']
['cognac', 'fermented', 'body', 'sour', 'cheese', 'fruity', 'apple', 'ethereal', 'chemical', 'banana'] ['butter', 'fruity']
['fatty', 'odorless', 'waxy', 'fruity', 'oily'] ['fruity', 'oily']
['green', 'sharp', 'fresh', 'pungent', 'ethereal', 'fruity'] ['almond', 'fruity', 'green', 'herbal', 'sweet']
['green', 'camphor', 'woody', 'anisic'] ['floral', 'woody']
['wine', 'berry', 'plum', 'cooling', 'honey', 'blueberry', 'rose', 'fruity', 'fermented'] ['balsamic', 'fruity', 'rose']
['oily', 'fruity'] ['butter', 'floral', 'wine']
['oily', 'animalic', 'fruity', 'waxy', 'musk'] ['animalic', 'clean', 'dry', 'metallic', 'musk', 'powdery', 'tropicalfruit', 'waxy']
['meat', 'sulfuric'] ['fatty', 'gourmand', 'meat', 'sulfuric']
['hyacinth', 'floral', 'cinnamon'] ['cinnamon']
['fresh', 'powdery', 'camphor', 'mint', 'resinous', 'apple', 'violetflower', 'ethereal', 'chemical'] ['ambery', 'chemical', 'rancid', 'violetflower', 'woody']
['green', 'woody', 'citrus', 'fresh', 'herbal', 'resinous', 'floral'] ['citrus', 'floral', 'resinous']
['orange', 'fresh', 'citrus', 'floral', 'lily'] ['fruity', 'green', 'lily', 'woody']
['floral', 'herbal', 'grapefruit', 'fresh', 'resinous'] ['lily', 'resinous']
['butter', 'sour'] ['caramellic', 'dairy', 'fruity']
['nut', 'vegetable', 'earthy'] ['earthy', 'floral', 'green', 'pepper', 'resinous']
['pungent', 'almond', 'fresh', 'cucumber', 'leaf', 'vegetable'] ['almond', 'cheese']
['phenolic', 'smoky', 'balsamic', 'almond', 'resinous', 'powdery', 'sweet', 'vanilla', 'anisic'] ['almond', 'floral', 'spicy']
['nut', 'earthy', 'caramellic', 'burnt', 'cacao', 'coffee'] ['coffee', 'cooked', 'meat', 'nut', 'sulfuric']
['floral', 'woody', 'berry'] ['berry', 'cedar', 'floral', 'lactonic']
['odorless', 'fruity'] ['sour']
['leaf', 'fruity', 'green'] ['green']
[] ['animalic', 'green', 'woody']
['berry', 'mint', 'blackcurrant'] ['blackcurrant', 'mint']
['fruity', 'rose', 'violetflower'] ['citrus', 'floral']
['fresh', 'mushroom', 'green', 'chemical', 'seafood', 'terpenic'] ['citrus', 'fresh']
['fresh', 'geranium', 'fruity', 'citrus', 'rose'] ['apple', 'dry', 'fatty', 'lemon', 'pear', 'rose', 'waxy']
['honey', 'balsamic', 'resinous', 'rose', 'floral', 'lily'] ['fresh', 'rose']
['tropicalfruit', 'fruity', 'honey', 'rose'] ['balsamic', 'fruity']
['floral', 'woody'] ['lactonic', 'plum', 'tropicalfruit']
['medicinal', 'balsamic', 'earthy', 'resinous', 'camphor'] ['camphor', 'coniferous', 'cooling']
['orange', 'woody', 'grapefruit'] ['grapefruit']
['animalic', 'musk'] ['ambrette', 'animalic', 'musk', 'vegetable']
['blackcurrant', 'cooked', 'alliaceous', 'roasted', 'meat', 'gourmand'] ['alliaceous', 'meat', 'roasted']
['rose', 'woody', 'fruity'] ['woody']
['fatty', 'powdery', 'seafood', 'musk', 'chemical'] ['dry', 'fatty', 'musk', 'sweet', 'waxy']
['grass', 'cucumber', 'cooked', 'rancid', 'fruity'] ['fatty', 'fruity', 'mushroom']
['sweet', 'fresh', 'fruity', 'alcoholic', 'butter'] ['ethereal', 'fermented', 'fresh']
['metallic', 'ripe', 'vegetable', 'sulfuric', 'tropicalfruit'] ['berry', 'dry', 'pungent']
['chemical', 'coffee', 'meat', 'alliaceous'] ['alliaceous', 'gourmand', 'meat', 'sharp']
['fruity', 'berry', 'plum'] ['floral', 'woody']
['vegetable', 'alliaceous'] ['alliaceous', 'cooked']

Generating test predictions

In [ ]:
test_df = pd.read_csv("/content/data/test.csv")
mols = [Chem.MolFromSmiles(smile) for smile in test_df["SMILES"].tolist()]
feat = dc.feat.ConvMolFeaturizer()#dc.feat.CircularFingerprint(size=1024)
test_arr = feat.featurize(mols)
In [ ]:
test_dataset = dc.data.NumpyDataset(X=test_arr, y=np.zeros((len(test_df),109)))
<NumpyDataset X.shape: (1079,), y.shape: (1079, 109), w.shape: (1079, 1), task_names: [  0   1   2 ... 106 107 108]>
In [ ]:
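y_pred = model.predict(test_dataset)
# Assumes vocab (the 109 smell names) and top_15 (fallback smells) are defined earlier in the notebook.
top_5_preds = []  # one formatted prediction string per test molecule (assumed initialisation)
c = 0             # molecules with no smell above the threshold (assumed initialisation)
# print(y_true.shape,y_pred.shape)
for i in range(y_pred.shape[0]):
  final_pred = []
  prob_val = []
  for y in range(109):
    prediction = y_pred[i,y]
    # prediction[1] is the probability that smell y is present
    if prediction[1]>0.30:
      final_pred.append(1)
      prob_val.append(prediction[1])
    else:
      final_pred.append(0)
  smell_ids = np.where(np.array(final_pred)==1)
  smells = [vocab[k] for k in smell_ids[0]]
  #smells = [smells[k] for k in np.argsort(np.array(prob_val))] #to further order based on probability

  if len(smells)==0:
    c += 1
  if len(smells)>15:
    smells = smells[:15]
  else:
    # pad up to 15 labels with fallback smells that were not already predicted
    new_smells = [x for x in top_15 if x not in smells]
    smells.extend(new_smells[:15-len(smells)])
  assert len(smells)==15
  # a submission row is 5 "sentences" of 3 smells: commas within, semicolons between
  sents = []
  for sent in range(0,15,3):
    sents.append(",".join(smells[sent:sent+3]))
  pred = ";".join(sents)
  top_5_preds.append(pred)
print("[info] did not predict for ",c)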
[info] did not predict for  3
In [ ]:
final = pd.DataFrame({"SMILES":test_df.SMILES.tolist(),"PREDICTIONS":top_5_preds})
final.head()
Out[ ]:
0 CCC(C)C(=O)OC1CC2CCC1(C)C2(C)C balsamic,camphor,cedar;coniferous,fruity,resin...
1 CC(C)C1CCC(C)CC1OC(=O)CC(C)O cooling,fruity,mint;floral,woody,herbal;green,...
2 CC(=O)/C=C/C1=CCC[C@H](C1(C)C)C berry,cedar,fruity;powdery,violetflower,woody;...
3 CC(=O)OCC(COC(=O)C)OC(=O)C apple,banana,ester;fruity,odorless,floral;wood...
4 CCCCCCCC(=O)OC/C=C(/CCC=C(C)C)\C cognac,floral,fruity;geranium,rose,waxy;woody,...

This submission gives a score of 0.277 on the leaderboard.
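The padding and formatting step in the cell above can be isolated into a small helper. This is only a sketch: `format_submission` and the `fallback` list are hypothetical names and values for illustration (the notebook's real fallback list is `top_15`, whose contents are not shown here).

```python
def format_submission(smells, fallback, n_sentences=5, per_sentence=3):
    """Pad or truncate `smells` to n_sentences * per_sentence labels and join them."""
    total = n_sentences * per_sentence
    padded = list(smells[:total])
    # Fill the remaining slots with fallback smells that are not already present.
    for smell in fallback:
        if len(padded) == total:
            break
        if smell not in padded:
            padded.append(smell)
    if len(padded) != total:
        raise ValueError("fallback list too short to pad the prediction")
    # Commas separate smells inside a sentence, semicolons separate sentences.
    sentences = [
        ",".join(padded[start:start + per_sentence])
        for start in range(0, total, per_sentence)
    ]
    return ";".join(sentences)


# Hypothetical fallback list standing in for the notebook's `top_15`.
fallback = ["fruity", "floral", "woody", "herbal", "green",
            "fresh", "sweet", "resinous", "spicy", "balsamic",
            "rose", "earthy", "ethereal", "citrus", "oily"]

example = format_submission(["cooling", "mint"], fallback)
print(example)
```

Keeping this logic in one function makes it easy to reuse for the other models in the notebook, since only the list of predicted smells changes.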


Next, we use DeepChem's MultitaskClassifier with the CircularFingerprint featurizer.

The model can be trained on features from any of the following featurizers:

  • CircularFingerprint
  • RDKitDescriptors
  • CoulombMatrixEig
  • RdkitGridFeaturizer
  • BindingPocketFeaturizer
  • ElementPropertyFingerprint

Feel free to substitute any one of these, and do report back your results and findings!

In [ ]:
mols = [Chem.MolFromSmiles(smile) for smile in data_df["text"].tolist()]
feat = dc.feat.CircularFingerprint(size=1024)
arr = feat.featurize(mols)
(4316, 1024)
In [ ]:
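labels = []
train_df = pd.read_csv("data/train.csv")
smell_to_id = {smell: k for k, smell in enumerate(vocab)}  # vocab: the 109 smell names defined earlier
for x in train_df.SENTENCE.tolist():
  row = np.zeros(len(vocab))  # multi-hot row; assumes SENTENCE holds comma-separated smells
  row[[smell_to_id[s] for s in x.split(",")]] = 1
  labels.append(row)
labels = np.array(labels)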
(4316, 109)
In [ ]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(arr,labels, test_size=.1, random_state=42)
In [ ]:
train_dataset = dc.data.NumpyDataset(X=X_train, y=y_train)
val_dataset = dc.data.NumpyDataset(X=X_test, y=y_test)

In [ ]:
train_dataset
<NumpyDataset X.shape: (3884, 1024), y.shape: (3884, 109), w.shape: (3884, 1), task_names: [  0   1   2 ... 106 107 108]>
In [ ]:
val_dataset
<NumpyDataset X.shape: (432, 1024), y.shape: (432, 109), w.shape: (432, 1), ids: [0 1 2 ... 429 430 431], task_names: [  0   1   2 ... 106 107 108]>

The MultitaskClassifier is just a stack of dense layers, but that still leaves a lot of options. How many layers should there be, and how wide should each one be? What dropout rate should we use? What learning rate?

These are called hyperparameters. DeepChem provides a selection of hyperparameter optimization algorithms, which are found in the dc.hyper package. We use GridHyperparamOpt, which is the most basic method. We just give it a list of options for each hyperparameter and it exhaustively tries all combinations of them.
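To see how many models an exhaustive grid search trains, the combinations can be enumerated with the standard library. This is a sketch of the idea only, not DeepChem's implementation; `expand_grid` is a hypothetical helper, and the grid values mirror the ones used in this notebook.

```python
from itertools import product

params_dict = {
    "layer_sizes": [[500], [1000], [1000, 1000]],
    "dropouts": [0.2, 0.5],
    "learning_rate": [0.001, 0.0001],
    "n_tasks": [109],
    "n_features": [1024],
}


def expand_grid(grid):
    """Return one dict per combination of the listed hyperparameter values."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


combinations = expand_grid(params_dict)
print(len(combinations))  # 3 * 2 * 2 * 1 * 1 = 12
print(combinations[0])
```

Each added option multiplies the number of training runs, so it is worth keeping the grid small at first.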

In [ ]:
import numpy as np
import numpy.random

params_dict = {'layer_sizes': [[500], [1000], [1000, 1000]],
               'dropouts': [0.2, 0.5],
               'learning_rate': [0.001, 0.0001],
               'n_tasks': [109],
               'n_features': [1024]}

optimizer = dc.hyper.GridHyperparamOpt(dc.models.MultitaskClassifier)
metric = dc.metrics.Metric(dc.metrics.jaccard_score)
best_model, best_hyperparams, all_results = optimizer.hyperparam_search(
        params_dict, train_dataset, val_dataset, [], metric)
We again evaluate the model using the Jaccard score.

In [ ]:
metric = dc.metrics.Metric(dc.metrics.jaccard_score)
print('training set score:', best_model.evaluate(train_dataset, [metric]))
training set score: {'jaccard_score': 0.19314449838336573}

This seems to perform worse than the GraphConvModel that we made.
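As a reminder of what the metric measures, the Jaccard score of two label sets is the size of their intersection divided by the size of their union. The sketch below computes it for a single molecule. `jaccard` is a hypothetical helper, and how `dc.metrics` aggregates the score over tasks and samples is not shown in this notebook, so the numbers are not directly comparable.

```python
def jaccard(predicted, actual):
    """Intersection over union of two collections of smell labels."""
    predicted, actual = set(predicted), set(actual)
    union = predicted | actual
    if not union:
        # Convention chosen for this sketch: two empty sets are a perfect match.
        return 1.0
    return len(predicted & actual) / len(union)


# One shared smell ("cooling") out of three distinct smells overall.
score = jaccard(["cooling", "mint"], ["cooling", "sweet"])
print(score)
```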

Let's look at the individual ROC AUC scores for each of the 109 classes.

In [ ]:
y_true = val_dataset.y
y_pred = best_model.predict(val_dataset)
metric = dc.metrics.roc_auc_score
for i in range(109):
    for gt,prediction in zip(y_true[:,i],y_pred[:,i]):
      assert round(prediction[0]+prediction[1])==1,prediction[0]-prediction[1]
    score = metric(dc.metrics.to_one_hot(y_true[:,i]), y_pred[:,i])
    print(vocab[i], score)
alcoholic 0.8372183372183372
aldehydic 0.869958988380041
alliaceous 0.9765402843601896
almond 0.9185011709601874
ambergris 0.9686046511627906
ambery 0.9554502369668245
ambrette 0.8825581395348838
ammoniac 0.14849187935034802
animalic 0.7011904761904761
anisic 0.7451437451437453
apple 0.8836930455635492
balsamic 0.815435954479336
banana 0.9537146226415094
berry 0.8086330935251798
blackcurrant 0.8536799065420562
body 0.7429906542056075
bread 0.959696261682243
burnt 0.8098035615935377
butter 0.8072599531615925
cacao 0.8653395784543325
camphor 0.9113394755492559
caramellic 0.8869102258123738
cedar 0.8997658079625293
cheese 0.8958108399913625
chemical 0.8017257205801358
cherry 0.9898989898989898
cinnamon 0.9945609945609946
citrus 0.8515815085158152
clean 0.6396825396825397
clove 0.8461538461538461
coconut 0.9970794392523364
coffee 0.9896955503512881
coniferous 0.9414519906323184
cooked 0.9054373522458629
cooling 0.7304020561777125
cucumber 0.9918793503480279
dairy 0.8420427553444181
dry 0.6830853174603174
earthy 0.7827598192561696
ester 0.8069767441860465
ethereal 0.8718800322061191
fatty 0.8026315789473684
fermented 0.7780672268907562
floral 0.7303500314399496
fresh 0.742967670627245
fruity 0.7546078818087281
geranium 0.8011627906976744
gourmand 0.9295774647887325
grape 0.9187935034802784
grapefruit 0.9976744186046511
grass 0.5271909233176839
green 0.6414890716590177
herbal 0.6965725806451613
honey 0.8119158878504673
hyacinth 0.9285714285714286
jasmin 0.9086463223787168
lactonic 0.8764568764568765
leaf 0.753850710900474
leather 0.9766899766899768
lemon 0.8345389764629669
lily 0.9640330188679245
liquor 0.5306791569086651
meat 0.9614868491680086
medicinal 0.9851330203442881
melon 0.9112617924528302
metallic 0.7642801251956182
mint 0.7878577221642915
mushroom 0.8380331753554504
musk 0.9106024616713453
musty 0.7339112161807197
nut 0.8315054086538461
odorless 0.8048349056603774
oily 0.839393028846154
orange 0.8658059467918622
pear 0.9210308056872037
pepper 0.8935508935508936
phenolic 0.944578313253012
plastic 0.7593457943925234
plum 0.7709302325581395
powdery 0.7356302521008404
pungent 0.6957983193277311
rancid 0.5603361344537815
resinous 0.7472649010581693
ripe 0.888631090487239
roasted 0.8646825396825397
rose 0.7945412990326353
seafood 0.775
sharp 0.6573426573426574
smoky 0.9860788863109049
sour 0.893109243697479
spicy 0.748515625
sulfuric 0.8901633646159195
sweet 0.638592038471814
syrup 0.9395784543325527
terpenic 0.9489559164733179
tobacco 0.861697965571205
tropicalfruit 0.8147413182140325
vanilla 0.8568075117370892
vegetable 0.7777175990824519
violetflower 0.6930913348946135
watery 0.9626168224299065
waxy 0.8731492718446601
whiteflower 0.962877030162413
wine 0.8351288056206089
woody 0.7860377746899285
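
The listing above is easier to act on once it is sorted. The following is a minimal sketch, not part of the original notebook, that parses `label score` lines and picks out the weakest labels. It assumes the numbers are per-label validation scores where higher is better. The helper names `parse_scores` and `weakest_labels` are hypothetical, and only a small excerpt of the listing is pasted in.

```python
# Excerpt of the per-label scores printed above, copied verbatim.
# Only a few rows are included here; paste the full listing to rank every label.
raw_scores = """\
clean 0.6396825396825397
coconut 0.9970794392523364
grapefruit 0.9976744186046511
grass 0.5271909233176839
liquor 0.5306791569086651
rancid 0.5603361344537815
sweet 0.638592038471814
"""


def parse_scores(text):
    """Turn 'label score' lines into a {label: float} dict (hypothetical helper)."""
    scores = {}
    for line in text.strip().splitlines():
        label, value = line.rsplit(" ", 1)
        scores[label] = float(value)
    return scores


def weakest_labels(scores, k=3):
    """Return the k labels with the lowest score, lowest first (hypothetical helper)."""
    return sorted(scores, key=scores.get)[:k]


scores = parse_scores(raw_scores)
print(weakest_labels(scores))                # labels that most need attention
print(sum(scores.values()) / len(scores))    # unweighted mean over the excerpt
```

Within this excerpt, `grass`, `liquor` and `rancid` score lowest, so they are natural candidates for a closer look at how many positive examples each label has.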

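Let's compare the model predictions with the ground truth. A smell counts as predicted when its positive-class probability exceeds 0.1; each output line shows the predicted smells first, followed by the ground-truth smells.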

In [ ]:
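# np (numpy), vocab, val_dataset and best_model are defined in earlier cells.
y_true = val_dataset.y
y_pred = best_model.predict(val_dataset)
# print(y_true.shape,y_pred.shape)
for i in range(y_true.shape[0]):
  final_pred = []
  for y in range(109):
      # prediction[1] is the probability that smell y is present
      prediction = y_pred[i,y]
      if prediction[1]>0.1:
          final_pred.append(1)
      else:
          final_pred.append(0)
  # map the predicted and ground-truth label indices back to smell names
  smell_ids = np.where(np.array(final_pred)==1)
  smells = [vocab[k] for k in smell_ids[0]]
  gt_smell_ids = np.where(np.array(y_true[i])==1)
  gt_smells = [vocab[k] for k in gt_smell_ids[0]]
  print(smells, gt_smells)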
['balsamic', 'berry', 'fruity', 'spicy', 'sweet'] ['balsamic', 'cinnamon', 'fruity', 'powdery', 'sweet']
['mint', 'spicy', 'woody'] ['dry', 'herbal']
['floral', 'fresh', 'fruity', 'resinous', 'sweet'] ['floral', 'fruity', 'resinous', 'sweet']
['citrus', 'earthy', 'floral', 'fresh', 'herbal', 'pepper', 'rose', 'sweet', 'woody'] ['citrus', 'fresh', 'lily']
['butter', 'fruity', 'green', 'oily', 'tropicalfruit', 'waxy'] ['green', 'pear']
['dry', 'ethereal', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'oily', 'sweet', 'woody'] ['geranium', 'leaf', 'spicy']
['balsamic', 'floral', 'fresh', 'fruity', 'geranium', 'honey', 'leaf', 'rose'] ['berry', 'geranium', 'honey', 'powdery', 'waxy']
['balsamic', 'floral', 'fresh', 'honey', 'oily', 'resinous', 'rose'] ['floral', 'herbal', 'lemon', 'rose', 'spicy']
['oily'] ['chemical']
['apple', 'banana', 'floral', 'fruity'] ['apple', 'fermented']
['alliaceous', 'cooked', 'earthy', 'fruity', 'sulfuric', 'tropicalfruit'] ['alliaceous', 'fresh', 'gourmand', 'metallic', 'spicy', 'sulfuric', 'vegetable']
['floral', 'fresh', 'fruity', 'herbal', 'jasmin', 'resinous'] ['green', 'herbal', 'jasmin']
['floral', 'lemon', 'oily', 'resinous'] ['coniferous', 'floral', 'resinous']
['apple', 'berry', 'fresh', 'herbal', 'rose', 'sweet'] ['fruity', 'herbal']
['camphor', 'cooling', 'fresh', 'mint', 'sweet', 'woody'] ['mint']
['earthy', 'leather', 'medicinal', 'phenolic', 'resinous', 'spicy', 'sweet'] ['medicinal', 'phenolic', 'resinous', 'spicy']
['ethereal', 'floral', 'fresh', 'fruity'] ['burnt', 'ethereal', 'fermented', 'fresh']
['berry', 'burnt', 'citrus', 'floral', 'fresh', 'fruity', 'grass', 'green', 'sulfuric', 'tropicalfruit'] ['melon', 'tobacco']
['earthy', 'ethereal', 'oily', 'sweet', 'terpenic', 'woody'] ['chemical', 'resinous', 'rose']
['ethereal', 'floral', 'fresh', 'green', 'sweet'] ['woody']
['burnt', 'fresh', 'phenolic', 'woody'] ['syrup']
['floral', 'fresh', 'herbal'] ['clean', 'floral', 'fresh', 'grass', 'lily', 'resinous', 'spicy', 'sweet']
['ethereal', 'fatty', 'fresh', 'fruity', 'oily', 'rose', 'waxy'] ['citrus', 'fresh', 'rose', 'waxy']
['camphor', 'coniferous', 'dry', 'earthy', 'fresh', 'lemon', 'resinous', 'sweet'] ['camphor', 'clean', 'cooling', 'green', 'resinous']
['citrus', 'earthy', 'fresh', 'herbal', 'spicy'] ['fruity', 'grass', 'green', 'herbal']
['citrus', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'oily'] ['animalic', 'green', 'pear', 'tropicalfruit']
['clove', 'earthy', 'floral', 'fruity', 'resinous', 'spicy', 'sweet'] ['cedar', 'fresh', 'leaf', 'mint']
['apple', 'chemical', 'ethereal', 'floral', 'fresh', 'fruity', 'green', 'oily'] ['floral', 'fresh', 'fruity', 'herbal']
['earthy', 'green', 'meat', 'nut', 'resinous', 'sulfuric'] ['bread', 'fermented', 'gourmand', 'meat', 'nut', 'sulfuric']
['ethereal', 'floral', 'fresh', 'fruity', 'sweet'] ['chemical', 'ethereal', 'fruity', 'sweet']
['burnt', 'caramellic', 'dairy', 'earthy', 'fruity', 'wine'] ['burnt', 'ester', 'fruity', 'pungent']
['alliaceous', 'coffee', 'earthy', 'gourmand', 'meat', 'nut', 'sulfuric', 'vegetable'] ['meat', 'roasted', 'vegetable']
['aldehydic', 'citrus', 'ethereal', 'fatty', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'lemon', 'pungent', 'sweet'] ['orange']
['cheese', 'floral', 'fresh', 'green', 'herbal', 'sour'] ['balsamic', 'pungent', 'sour']
['citrus', 'floral', 'fresh', 'green', 'herbal', 'lemon', 'rose', 'sweet', 'waxy', 'woody'] ['balsamic', 'rose']
['balsamic', 'floral', 'medicinal', 'sweet'] ['cooling', 'medicinal', 'sweet']
['animalic', 'coffee', 'earthy', 'floral', 'green', 'leather', 'nut', 'tobacco'] ['coffee', 'earthy', 'nut']
['caramellic', 'fruity', 'green', 'nut', 'resinous', 'tobacco'] ['fruity', 'tobacco']
['camphor', 'floral', 'fresh', 'mint', 'spicy', 'woody'] ['mint']
['ambery', 'citrus', 'ethereal', 'green', 'herbal', 'woody'] ['ambery', 'cedar', 'dry', 'ethereal', 'herbal']
['camphor', 'earthy', 'green', 'musty', 'woody'] ['ambery', 'rancid', 'woody']
['caramellic', 'floral', 'sweet'] ['odorless']
['aldehydic', 'floral', 'fresh', 'green', 'herbal', 'lily', 'resinous'] ['fresh']
['camphor', 'fruity', 'mint', 'resinous', 'spicy', 'woody'] ['camphor', 'cedar']
['alliaceous', 'chemical', 'earthy', 'gourmand', 'green', 'meat', 'roasted', 'sulfuric', 'vegetable'] ['alliaceous', 'fruity', 'green']
['ethereal', 'fresh', 'fruity', 'green', 'herbal', 'vegetable'] ['ethereal', 'fresh', 'fruity']
['animalic', 'apple', 'berry', 'ethereal', 'floral', 'fruity', 'herbal', 'liquor', 'spicy', 'tropicalfruit', 'wine', 'woody'] ['floral', 'fruity']
['aldehydic', 'animalic', 'citrus', 'ethereal', 'fatty', 'floral', 'fresh', 'green', 'musty'] ['citrus', 'dairy', 'floral', 'fruity']
['chemical', 'fresh', 'green', 'herbal', 'mint', 'terpenic'] ['roasted']
['apple', 'banana', 'berry', 'fresh', 'sweet'] ['fruity']
['burnt', 'butter', 'caramellic', 'chemical', 'floral', 'fresh', 'fruity', 'sharp'] ['burnt', 'caramellic', 'fruity', 'resinous']
['apple', 'citrus', 'fatty', 'floral', 'fresh', 'fruity', 'green', 'oily', 'rose', 'sweet', 'wine', 'woody'] ['fatty', 'fruity']
['earthy', 'metallic', 'sulfuric', 'sweet', 'tropicalfruit', 'vegetable'] ['alliaceous', 'tropicalfruit']
['coconut', 'dairy', 'fruity', 'green', 'jasmin', 'tropicalfruit', 'woody'] ['fruity', 'jasmin']
['aldehydic', 'floral', 'fresh', 'fruity', 'green', 'hyacinth', 'lily', 'resinous', 'rose'] ['green', 'hyacinth']
['citrus', 'earthy', 'fresh', 'herbal', 'spicy'] ['grass', 'green', 'tobacco']
['citrus', 'green', 'resinous'] ['herbal', 'resinous']
['apple', 'berry', 'floral', 'fruity', 'herbal', 'liquor', 'wine'] ['animalic', 'caramellic', 'fruity', 'herbal']
['floral', 'fruity', 'resinous'] ['chemical', 'fruity', 'herbal', 'pungent']
['fresh', 'herbal', 'mint', 'sweet'] ['mint']
['animalic', 'apple', 'berry', 'ethereal', 'floral', 'fruity', 'herbal', 'liquor', 'spicy', 'tropicalfruit', 'wine', 'woody'] ['herbal', 'tropicalfruit']
['earthy', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'mushroom'] ['green', 'liquor', 'mushroom', 'sweet']
['coconut', 'floral', 'grass', 'phenolic', 'resinous', 'sweet'] ['floral', 'fruity', 'resinous', 'sweet']
['fresh', 'mint', 'musk'] ['chemical', 'mint', 'musk']
['ethereal', 'floral', 'fresh', 'green', 'herbal'] ['apple', 'fermented']
['earthy', 'ethereal', 'fruity', 'oily'] ['herbal', 'spicy']
['earthy', 'fresh', 'green', 'tropicalfruit'] ['burnt', 'resinous', 'sulfuric']
['ambery', 'berry', 'burnt', 'caramellic', 'coffee', 'earthy', 'floral', 'musk', 'nut', 'roasted', 'spicy'] ['coffee', 'woody']
['aldehydic', 'citrus', 'cucumber', 'green', 'melon', 'waxy'] ['aldehydic', 'citrus']
['balsamic', 'clove', 'dairy', 'floral', 'phenolic', 'smoky', 'sweet', 'vanilla'] ['odorless']
['chemical', 'sulfuric', 'sweet'] ['ammoniac']
['balsamic', 'cinnamon', 'floral', 'fruity', 'honey', 'resinous', 'rose'] ['balsamic', 'hyacinth']
['floral', 'fresh', 'green', 'lily'] ['aldehydic', 'floral', 'grass', 'leaf']
['alliaceous', 'blackcurrant', 'ethereal', 'floral', 'spicy', 'sulfuric'] ['fermented', 'green', 'metallic', 'sulfuric', 'tropicalfruit', 'woody']
['citrus', 'floral', 'fruity', 'green', 'oily', 'pear', 'waxy'] ['fruity', 'green', 'oily']
['cheese', 'green', 'sour'] ['cheese', 'fruity']
['chemical', 'ethereal', 'fresh'] ['alcoholic', 'chemical', 'ethereal']
['apple', 'berry', 'dry', 'musty', 'nut', 'rose', 'sour', 'vegetable'] ['odorless']
['apple', 'burnt', 'fruity', 'green', 'sweet'] ['chemical', 'dairy', 'fruity', 'green', 'herbal', 'sharp']
['earthy', 'floral', 'green', 'mushroom', 'oily'] ['melon', 'mushroom', 'violetflower']
['berry', 'citrus', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'rose'] ['citrus', 'fatty', 'floral']
['camphor', 'citrus', 'floral', 'fresh', 'green', 'herbal', 'woody'] ['fresh', 'mint']
['earthy', 'meat', 'sulfuric'] ['meat', 'plastic']
['animalic', 'earthy', 'ethereal', 'floral', 'green', 'leather', 'resinous', 'tobacco', 'woody'] ['animalic', 'earthy', 'leather', 'tobacco', 'woody']
['berry', 'chemical', 'fruity', 'grass', 'green', 'sweet'] ['nut']
['earthy', 'fruity', 'green', 'mint', 'resinous'] ['balsamic', 'herbal', 'rancid']
['citrus', 'floral', 'green', 'sulfuric', 'vegetable'] ['animalic', 'berry', 'green', 'mint', 'sulfuric']
['alliaceous', 'earthy', 'meat', 'nut'] ['alliaceous', 'cooked', 'fatty', 'roasted', 'vegetable']
['floral', 'lemon', 'oily', 'resinous'] ['coniferous', 'floral', 'resinous']
[] ['ethereal', 'sweet']
['citrus', 'ethereal', 'floral', 'fresh', 'fruity', 'green', 'herbal'] ['fruity', 'vegetable']
['earthy', 'fresh', 'green', 'herbal', 'pepper', 'woody'] ['citrus', 'resinous', 'spicy']
['floral', 'fresh', 'fruity', 'green', 'herbal', 'jasmin', 'lactonic', 'resinous', 'rose'] ['floral', 'green', 'herbal', 'plum']
['balsamic', 'floral', 'fresh', 'honey', 'oily', 'resinous', 'rose'] ['herbal', 'rose']
['medicinal', 'phenolic', 'smoky', 'spicy', 'sweet', 'woody'] ['medicinal', 'phenolic', 'spicy']
['blackcurrant', 'phenolic', 'spicy', 'sweet', 'vanilla'] ['almond', 'balsamic', 'cherry', 'floral', 'sweet']
['cooling', 'mint'] ['cooling', 'sweet']
['camphor', 'citrus', 'cooling', 'floral', 'fresh', 'green', 'herbal', 'spicy', 'woody'] ['citrus', 'earthy', 'green', 'mint', 'spicy', 'sweet', 'woody']
['citrus', 'floral', 'fresh', 'lemon', 'rose', 'sweet'] ['floral', 'lemon', 'orange']
['floral', 'fruity', 'green', 'herbal', 'woody'] ['floral', 'fruity', 'woody']
['citrus', 'floral', 'fresh', 'green', 'herbal', 'lemon', 'woody'] ['aldehydic', 'dry', 'green', 'lemon', 'orange']
['dairy', 'earthy', 'meat', 'spicy', 'sulfuric', 'vegetable'] ['cheese', 'gourmand', 'meat']
['fresh', 'green', 'herbal', 'rose', 'sweet'] ['chemical', 'floral', 'pungent', 'resinous']
['balsamic', 'floral', 'leather', 'medicinal', 'phenolic', 'smoky', 'sweet', 'vanilla'] ['floral', 'herbal', 'sweet']
['balsamic', 'dry', 'earthy', 'musk', 'sweet'] ['ambery', 'tobacco', 'woody']
['ambergris', 'earthy', 'floral', 'violetflower', 'woody'] ['balsamic', 'woody']
['berry', 'chemical', 'fruity', 'grass', 'green', 'sweet'] ['dairy', 'mushroom']
['body', 'caramellic', 'cheese', 'fruity', 'green', 'pungent', 'sour'] ['meat', 'oily', 'roasted', 'sour']
['almond', 'chemical', 'cherry', 'green', 'spicy', 'sweet'] ['almond', 'cherry', 'spicy', 'sweet']
['chemical', 'earthy', 'floral', 'phenolic', 'spicy', 'sweet'] ['phenolic', 'spicy']
['apple', 'banana', 'butter', 'cheese', 'fruity'] ['banana', 'meat', 'ripe']
['berry', 'caramellic', 'syrup'] ['berry', 'caramellic', 'clean']
['burnt', 'chemical', 'ethereal', 'floral', 'fresh', 'fruity', 'liquor', 'sweet'] ['pungent', 'vegetable']
['floral', 'green', 'herbal', 'rose', 'spicy'] ['herbal', 'spicy', 'sulfuric', 'vegetable']
['floral', 'fresh', 'fruity', 'herbal', 'jasmin', 'resinous'] ['floral', 'fresh', 'fruity', 'resinous']
['earthy', 'ethereal', 'meat', 'nut', 'roasted'] ['cooked', 'meat', 'nut']
['fatty', 'fresh', 'fruity', 'green', 'herbal'] ['fatty', 'floral', 'sweet']
['fatty', 'fruity', 'oily'] ['oily']
['fatty', 'fruity', 'herbal', 'oily', 'rose', 'tropicalfruit'] ['liquor', 'oily', 'wine']
['berry', 'floral', 'fresh', 'herbal', 'mint'] ['fresh', 'herbal', 'woody']
['fresh', 'fruity', 'mint'] ['camphor', 'musty', 'woody']
['chemical', 'ethereal', 'fresh', 'sweet'] ['chemical', 'ethereal']
['balsamic', 'cinnamon', 'floral', 'fruity', 'resinous', 'sweet'] ['balsamic', 'floral', 'herbal']
['cacao', 'floral', 'fresh', 'green', 'herbal'] ['aldehydic', 'floral', 'herbal']
['apple', 'fruity', 'green', 'musty', 'tropicalfruit'] ['apple', 'banana', 'fatty']
['apple', 'fruity', 'green', 'sweet', 'wine'] ['apple', 'banana']
['berry', 'butter', 'floral', 'fruity', 'green', 'rose', 'tropicalfruit', 'waxy'] ['citrus', 'fatty', 'floral', 'fresh']
['floral', 'medicinal', 'phenolic'] ['phenolic', 'plastic']
['green', 'herbal', 'spicy'] ['spicy', 'sulfuric']
['fresh', 'melon'] ['cucumber', 'green', 'melon', 'mushroom', 'oily', 'seafood']
['clean', 'nut', 'phenolic', 'spicy', 'sweet', 'tobacco'] ['burnt', 'herbal', 'phenolic', 'sweet']
['apple', 'ethereal', 'fruity', 'wine'] ['apple']
['earthy', 'fresh', 'green', 'herbal', 'sweet', 'woody'] ['camphor', 'woody']
['berry', 'earthy', 'lactonic', 'mint', 'musk', 'vanilla', 'woody'] ['syrup', 'woody']
['apple', 'floral', 'fruity', 'green', 'pear', 'tropicalfruit', 'wine'] ['green', 'pear', 'resinous', 'tropicalfruit']
['animalic', 'floral', 'green', 'herbal', 'resinous', 'woody'] ['balsamic', 'phenolic']
['fresh', 'green'] ['alliaceous', 'roasted']
['camphor', 'earthy', 'fresh', 'resinous'] ['camphor', 'fresh']
['apple', 'fruity', 'green', 'oily', 'pear', 'sweet', 'waxy'] ['fruity', 'oily']
['ethereal', 'floral', 'fresh', 'green', 'sweet'] ['ethereal', 'fatty', 'liquor']
['fresh', 'herbal', 'mint', 'musk'] ['herbal']
['burnt', 'fresh', 'fruity', 'resinous', 'spicy', 'sweet'] ['bread', 'caramellic', 'phenolic', 'woody']
['fatty', 'fruity', 'green', 'herbal', 'pear', 'tropicalfruit'] ['green', 'melon', 'pear', 'tropicalfruit', 'waxy']
['apple', 'balsamic', 'berry', 'butter', 'cinnamon', 'fruity', 'green', 'spicy', 'sweet', 'wine'] ['balsamic', 'fruity']
['animalic', 'fresh', 'mint', 'musk', 'sweet', 'woody'] ['ambergris', 'animalic', 'anisic', 'clean', 'fatty', 'musk', 'powdery']
['fatty', 'fresh', 'fruity', 'green', 'herbal'] ['floral', 'green', 'herbal']
['cooling', 'fruity', 'green', 'herbal', 'mint', 'sweet'] ['cooling', 'tropicalfruit', 'woody']
['balsamic', 'camphor', 'coniferous', 'earthy', 'fruity', 'green', 'resinous', 'spicy', 'woody'] ['camphor', 'nut', 'resinous']
['chemical', 'floral', 'phenolic', 'resinous', 'spicy', 'sweet'] ['floral', 'mint', 'sweet', 'violetflower', 'woody']
['aldehydic', 'green'] ['fruity']
['camphor', 'earthy', 'mint', 'woody'] ['rose']
['anisic', 'earthy', 'floral', 'sweet'] ['anisic']
['camphor', 'dry', 'fresh', 'mint', 'sweet', 'woody'] ['mint']
['burnt', 'fresh', 'sweet'] ['body', 'gourmand', 'nut', 'spicy', 'syrup']
['citrus', 'floral', 'fresh', 'grape', 'lily', 'resinous', 'spicy', 'sweet'] ['aldehydic', 'citrus', 'green']
['balsamic', 'cacao', 'fruity', 'green', 'honey', 'rose'] ['honey', 'rose']
['balsamic', 'floral', 'fruity', 'green', 'powdery', 'spicy', 'sweet', 'vanilla'] ['balsamic', 'spicy']
['balsamic', 'camphor', 'resinous', 'woody'] ['camphor', 'coniferous', 'cooling']
['berry', 'butter', 'cheese', 'fruity', 'tropicalfruit'] ['butter', 'fruity']
['earthy', 'green'] ['balsamic', 'floral']
['camphor', 'earthy', 'floral', 'herbal', 'spicy', 'tropicalfruit', 'woody'] ['fresh', 'mint']
['apple', 'caramellic', 'ethereal', 'fruity', 'green', 'musty', 'sweet'] ['banana', 'cheese', 'tropicalfruit']
['floral', 'fresh', 'herbal', 'lemon', 'sweet', 'woody'] ['aldehydic', 'fresh', 'herbal']
['alliaceous', 'chemical', 'coffee', 'green', 'meat', 'roasted', 'seafood', 'sulfuric'] ['burnt', 'caramellic', 'coffee']
['balsamic', 'floral', 'fruity'] ['balsamic', 'musk']
['aldehydic', 'citrus', 'earthy', 'green', 'resinous', 'woody'] ['clean', 'fresh', 'herbal']
['fresh', 'herbal', 'mint', 'sweet'] ['camphor']
['citrus', 'ethereal', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'lemon', 'resinous', 'sweet', 'woody'] ['ethereal', 'floral', 'lemon']
['apple', 'banana', 'berry', 'earthy', 'fresh', 'fruity', 'green', 'melon', 'pear', 'rose', 'waxy'] ['banana', 'body', 'fermented', 'green', 'meat', 'melon']
['berry', 'chemical', 'fresh', 'fruity'] ['berry', 'cheese', 'green', 'herbal']
['cooling', 'fresh', 'fruity', 'herbal', 'mint', 'rose'] ['cooling', 'fruity']
['earthy', 'nut'] ['cacao', 'caramellic', 'dry', 'musty', 'nut', 'vanilla']
['dry', 'ethereal', 'fruity', 'mint', 'oily', 'spicy', 'woody'] ['cedar', 'fresh', 'sharp']
['aldehydic', 'chemical', 'citrus', 'ethereal', 'floral', 'fresh', 'herbal', 'lemon', 'metallic'] ['herbal', 'lemon']
['earthy', 'floral', 'herbal', 'musty', 'spicy', 'sweet'] ['camphor', 'resinous']
['earthy', 'floral', 'herbal', 'spicy', 'sweet', 'woody'] ['pepper', 'woody']
['balsamic', 'floral', 'fruity', 'green', 'herbal', 'honey', 'resinous', 'rose', 'sweet'] ['herbal', 'sweet', 'wine']
['chemical', 'fruity', 'green', 'herbal', 'spicy', 'tropicalfruit'] ['burnt', 'fruity', 'herbal']
['ambergris', 'cedar', 'chemical', 'dry', 'phenolic', 'woody'] ['ambery', 'woody']
['apple', 'fruity', 'green', 'sweet', 'wine'] ['tropicalfruit', 'wine']
['apple', 'fatty', 'green', 'oily', 'tropicalfruit'] ['apple', 'berry']
['berry', 'floral', 'green', 'sweet'] ['floral', 'orange', 'resinous', 'sweet']
['ethereal', 'fresh', 'fruity', 'green', 'liquor', 'sweet', 'wine'] ['burnt', 'caramellic', 'fresh', 'fruity']
['blackcurrant', 'floral', 'sulfuric'] ['floral', 'grapefruit', 'lemon']
['ethereal', 'fermented', 'musty', 'sweet'] ['citrus', 'floral', 'fresh', 'oily', 'sweet']
['alliaceous', 'earthy', 'sulfuric', 'vegetable'] ['cheese', 'sulfuric', 'tropicalfruit', 'vegetable']
['balsamic', 'clove', 'dairy', 'earthy', 'floral', 'phenolic', 'powdery', 'smoky', 'vanilla'] ['phenolic', 'spicy']
['cooked', 'floral', 'fruity', 'meat', 'roasted'] ['caramellic', 'dairy']
['fruity', 'green', 'herbal', 'melon', 'woody'] ['anisic', 'floral', 'fruity', 'woody']
['citrus', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'resinous', 'rose'] ['herbal', 'spicy']
['earthy', 'ethereal', 'floral', 'woody'] ['floral', 'fruity', 'woody']
['citrus', 'clean', 'floral', 'fresh', 'fruity', 'herbal', 'lemon', 'rose'] ['floral', 'fresh', 'fruity']
['citrus', 'fatty', 'floral', 'fresh', 'green', 'herbal', 'oily', 'waxy', 'woody'] ['aldehydic', 'fresh', 'herbal', 'woody']
['coconut', 'dairy', 'fresh', 'fruity', 'green', 'jasmin', 'lactonic', 'sweet'] ['coconut', 'dairy', 'fruity']
['earthy', 'floral', 'green', 'musk', 'resinous', 'sweet', 'woody'] ['clove', 'herbal', 'rose', 'woody']
['earthy', 'green', 'sweet', 'woody'] ['balsamic', 'earthy', 'green', 'musk']
['ethereal', 'floral', 'fresh', 'fruity', 'green', 'mushroom', 'oily'] ['fresh', 'oily']
['aldehydic', 'floral', 'fresh', 'herbal', 'lily', 'resinous'] ['clean', 'fresh']
['balsamic', 'citrus', 'floral', 'fruity', 'green', 'herbal', 'honey', 'rose'] ['dry', 'herbal', 'rose']
['apple', 'berry', 'fruity', 'sweet'] ['fruity', 'green', 'musty', 'pungent']
['fresh', 'herbal', 'mint', 'spicy', 'woody'] ['fresh', 'herbal', 'waxy']
['ambery', 'camphor', 'ethereal', 'herbal', 'woody'] ['ambery', 'herbal', 'woody']
['floral', 'fruity', 'green', 'herbal', 'rose', 'tropicalfruit', 'waxy'] ['rose', 'tropicalfruit']
['camphor', 'cooling', 'fresh', 'green', 'herbal', 'mint', 'sweet', 'woody'] ['dry', 'herbal']
['apple', 'chemical', 'ethereal', 'fruity', 'liquor', 'sweet', 'tropicalfruit'] ['chemical', 'ethereal', 'fresh', 'fruity', 'plastic']
['citrus', 'ethereal', 'floral', 'fresh', 'herbal', 'lemon', 'metallic'] ['floral', 'green', 'lemon']
['fruity', 'pear', 'waxy'] ['citrus', 'fatty', 'waxy']
['citrus', 'clean', 'fatty', 'fresh', 'green', 'rose'] ['fatty', 'floral', 'mint']
['coconut', 'dairy', 'floral', 'mint', 'powdery', 'rose', 'sweet', 'tobacco', 'vanilla'] ['coconut', 'cooling', 'fruity']
['ethereal', 'floral', 'fresh', 'fruity', 'green', 'mushroom', 'oily'] ['mushroom', 'musty', 'oily']
['fresh', 'fruity'] ['apple', 'balsamic', 'banana', 'pear', 'woody']
['chemical', 'earthy', 'floral', 'green', 'nut', 'roasted', 'woody'] ['herbal', 'nut']
['alliaceous', 'earthy', 'green', 'nut', 'spicy', 'sulfuric', 'vegetable'] ['cooked', 'dairy', 'roasted']
['berry', 'caramellic', 'fruity', 'gourmand', 'syrup'] ['clean', 'syrup']
['balsamic', 'berry', 'cedar', 'floral', 'fruity', 'herbal', 'powdery', 'tobacco', 'violetflower', 'woody'] ['berry', 'green', 'sweet', 'woody']
['meat', 'sulfuric', 'tropicalfruit', 'vegetable'] ['animalic', 'grape', 'rancid']
['animalic', 'burnt', 'coffee', 'earthy', 'floral', 'green', 'leather', 'sweet'] ['floral', 'resinous']
['animalic', 'balsamic', 'fatty', 'floral', 'fruity', 'jasmin', 'oily', 'phenolic', 'whiteflower'] ['balsamic', 'nut', 'vanilla']
['camphor', 'coniferous', 'cooling', 'earthy', 'sulfuric', 'terpenic', 'woody'] ['citrus', 'resinous', 'sulfuric']
['alliaceous', 'green', 'sulfuric', 'vegetable'] ['alliaceous', 'earthy', 'spicy', 'sulfuric', 'vegetable']
['balsamic', 'floral', 'fruity', 'green', 'herbal', 'resinous'] ['balsamic', 'berry', 'cacao', 'green', 'herbal', 'whiteflower', 'woody']
['ethereal', 'floral', 'fresh', 'fruity', 'sour'] ['sour', 'spicy', 'vegetable']
['apple', 'fruity', 'green', 'pear', 'tropicalfruit'] ['dairy', 'fresh', 'fruity', 'green']
['earthy', 'fresh', 'grass', 'green', 'melon', 'vegetable'] ['fresh', 'leaf']
['ambery', 'balsamic', 'floral', 'fresh', 'fruity', 'powdery', 'resinous', 'sweet'] ['balsamic', 'fruity']
['floral', 'fresh', 'honey', 'lily', 'rose'] ['rose']
['fresh', 'melon'] ['floral', 'melon', 'oily', 'vegetable']
['camphor', 'floral', 'fruity', 'resinous', 'woody'] ['ester', 'fruity', 'woody']
['citrus', 'earthy', 'floral', 'green', 'oily'] ['aldehydic', 'ambery', 'green', 'lemon', 'waxy']
['camphor', 'coconut', 'floral', 'fresh', 'fruity', 'herbal', 'woody'] ['floral', 'fresh', 'fruity', 'mint']
['ambrette', 'dry', 'floral', 'leather', 'musk', 'musty', 'powdery', 'sweet'] ['fruity', 'leaf', 'violetflower']
['coconut', 'dairy', 'fresh', 'fruity', 'green', 'jasmin', 'lactonic', 'sweet'] ['coconut', 'dairy', 'herbal', 'lactonic']
['floral', 'fresh', 'green', 'lily'] ['earthy', 'herbal']
['floral', 'fruity', 'sweet', 'tropicalfruit'] ['apple']
['floral', 'fresh', 'green', 'herbal', 'resinous', 'woody'] ['balsamic', 'woody']
['floral', 'medicinal', 'nut', 'phenolic', 'vanilla'] ['caramellic', 'cooked', 'meat']
['citrus', 'fatty', 'floral', 'fresh', 'fruity', 'green', 'oily'] ['waxy']
['earthy'] ['camphor']
['earthy', 'green', 'meat', 'nut', 'pepper', 'vegetable'] ['mint', 'vegetable']
['burnt', 'caramellic', 'cinnamon', 'earthy', 'floral', 'fresh', 'fruity', 'herbal', 'nut', 'spicy'] ['resinous', 'sweet']
['earthy', 'floral', 'musk', 'woody'] ['earthy', 'musk']
['floral', 'lemon', 'oily', 'resinous'] ['floral']
['apple', 'floral', 'jasmin', 'resinous', 'sweet'] ['aldehydic', 'ethereal', 'fruity', 'jasmin']
['animalic', 'clean', 'floral', 'fruity', 'jasmin', 'lactonic', 'oily', 'rose'] ['jasmin', 'oily', 'watery']
['balsamic', 'berry', 'butter', 'cooling', 'floral', 'fruity', 'green', 'jasmin', 'lactonic', 'resinous', 'sour', 'tropicalfruit'] ['apple', 'fruity', 'green', 'resinous']
['balsamic', 'cinnamon', 'floral', 'resinous', 'sweet'] ['blackcurrant', 'cinnamon']
['citrus', 'fatty', 'fruity', 'green', 'oily', 'resinous', 'woody'] ['green', 'resinous']
['cheese', 'floral', 'fruity', 'sweet', 'woody'] ['apple', 'banana', 'berry']
['apple', 'cacao', 'cheese', 'ethereal', 'fruity', 'sour'] ['body', 'cheese', 'sour']
['alliaceous', 'cheese', 'dairy', 'earthy', 'gourmand', 'meat', 'sour', 'spicy', 'sulfuric', 'vegetable'] ['sulfuric', 'sweet']
['floral', 'green', 'herbal', 'spicy'] ['earthy', 'floral', 'sweet']
['chemical', 'sulfuric', 'sweet'] ['burnt', 'earthy', 'sulfuric']
['citrus', 'floral', 'fresh', 'green', 'herbal', 'resinous', 'sweet', 'woody'] ['citrus', 'earthy', 'floral']
['balsamic', 'floral', 'fruity', 'odorless', 'sweet'] ['balsamic', 'clean', 'green', 'plastic']
['ethereal', 'floral', 'fresh', 'fruity', 'green', 'mushroom', 'oily'] ['floral', 'grass', 'lemon', 'sweet']
['floral', 'herbal', 'oily', 'spicy', 'woody'] ['herbal', 'spicy', 'woody']
['alliaceous', 'coffee', 'cooked', 'gourmand', 'green', 'meat', 'roasted', 'sulfuric', 'vegetable'] ['meat', 'roasted', 'sulfuric']
['burnt'] ['balsamic', 'burnt', 'sour']
['animalic', 'earthy', 'floral'] ['floral', 'jasmin', 'seafood']
['balsamic', 'earthy', 'floral', 'medicinal', 'phenolic', 'sweet'] ['medicinal', 'phenolic', 'sweet']
['earthy', 'floral', 'green', 'nut', 'pepper', 'vegetable'] ['earthy', 'floral', 'nut', 'pepper']
['berry', 'burnt', 'caramellic', 'chemical', 'ethereal', 'fruity', 'grape', 'plum', 'ripe', 'spicy'] ['caramellic', 'fruity']
['alliaceous', 'chemical', 'cooked', 'gourmand', 'meat', 'roasted', 'seafood', 'sulfuric', 'vegetable'] ['caramellic']
['cacao', 'caramellic', 'nut', 'sweet'] ['green', 'resinous', 'roasted']
['cacao', 'earthy', 'nut'] ['butter', 'cooked', 'earthy']
['aldehydic', 'citrus', 'floral', 'fresh', 'green', 'lily', 'rose', 'sweet'] ['aldehydic', 'hyacinth', 'lily', 'watery', 'waxy']
['earthy', 'ethereal', 'fermented', 'odorless', 'oily'] ['odorless']
['chemical', 'earthy', 'leather', 'medicinal', 'phenolic', 'smoky', 'sweet'] ['phenolic', 'spicy']
['ambergris', 'fresh', 'herbal', 'woody'] ['ambergris', 'ambery', 'animalic', 'dry', 'fresh', 'metallic']
['fruity', 'green', 'herbal', 'sweet'] ['alcoholic', 'ethereal', 'musty', 'nut']
['balsamic', 'cacao', 'floral', 'fruity', 'honey', 'resinous', 'rose', 'sweet'] ['honey', 'resinous']
['balsamic', 'berry', 'floral', 'fruity', 'green', 'resinous', 'rose', 'sweet'] ['balsamic', 'fruity']
['berry', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'liquor', 'sweet', 'tropicalfruit'] ['floral', 'fresh', 'fruity']
['chemical', 'floral', 'roasted', 'sulfuric'] ['meat']
['berry', 'floral', 'fresh', 'fruity', 'rose'] ['banana', 'melon', 'pear']
['berry', 'burnt', 'caramellic', 'coconut', 'earthy', 'spicy', 'syrup'] ['bread', 'burnt', 'syrup']
['citrus', 'green', 'woody'] ['woody']
['chemical', 'fresh', 'fruity', 'resinous', 'sweet'] ['berry', 'wine']
['green'] ['odorless']
['apple', 'banana', 'butter', 'cheese', 'fresh', 'fruity', 'musty', 'wine'] ['fruity']
['aldehydic', 'citrus', 'floral', 'fresh', 'green', 'lemon', 'lily', 'rose', 'watery'] ['aldehydic', 'floral', 'watery', 'waxy']
['ethereal', 'floral', 'fruity', 'green', 'mint'] ['pear', 'rose']
['ethereal', 'floral', 'fresh'] ['aldehydic', 'ethereal', 'fresh']
['earthy', 'floral', 'fresh', 'grass', 'whiteflower'] ['fresh', 'green', 'mint', 'rose']
['balsamic', 'berry', 'floral', 'fruity', 'honey', 'powdery', 'resinous', 'rose', 'sweet'] ['fruity', 'resinous', 'roasted']
['body', 'cacao', 'caramellic', 'coffee', 'earthy', 'green', 'musty', 'nut', 'vegetable'] ['cacao', 'earthy', 'nut']
['camphor', 'fresh', 'mint', 'woody'] ['cooling', 'musty', 'spicy']
['camphor', 'coniferous', 'earthy'] ['camphor', 'rancid', 'resinous']
['floral', 'fresh', 'herbal'] ['clean', 'floral', 'fresh', 'grass', 'lily', 'resinous', 'spicy', 'sweet']
['camphor', 'dry', 'fresh', 'fruity', 'green', 'resinous', 'woody'] ['fruity', 'herbal', 'resinous', 'rose']
['earthy', 'fruity', 'green', 'jasmin', 'resinous'] ['herbal', 'jasmin', 'oily']
['earthy', 'floral', 'green', 'herbal', 'hyacinth', 'resinous', 'rose'] ['leaf', 'vegetable']
['apple', 'berry', 'fatty', 'floral', 'fruity', 'green', 'rose', 'tropicalfruit', 'waxy', 'woody'] ['apple', 'floral']
['alliaceous', 'cooked', 'floral', 'green', 'sulfuric'] ['green', 'sulfuric', 'sweet']
['chemical', 'earthy', 'phenolic', 'woody'] ['earthy', 'leather', 'medicinal', 'phenolic']
['fatty', 'fresh', 'fruity', 'oily', 'woody'] ['green', 'mushroom', 'oily', 'sweet']
['citrus', 'floral', 'fresh', 'fruity', 'green', 'rose', 'sweet'] ['green', 'pear', 'rose', 'tropicalfruit', 'waxy', 'woody']
['camphor', 'cooling', 'earthy', 'fresh', 'resinous'] ['herbal']
['berry', 'floral', 'fresh', 'fruity', 'rose'] ['cooling', 'green', 'herbal', 'rose', 'waxy']
['cooling', 'herbal', 'mint', 'woody'] ['odorless']
['coffee', 'earthy', 'meat', 'nut'] ['bread', 'nut', 'vegetable', 'woody']
['coconut', 'earthy', 'fatty', 'floral', 'fruity', 'green', 'woody'] ['mint', 'spicy']
['cooling', 'mint'] ['mint']
['butter', 'coconut', 'dairy', 'fruity', 'jasmin', 'sweet'] ['coconut', 'dairy', 'fruity', 'resinous']
['fatty'] ['waxy', 'woody']
['burnt', 'caramellic', 'ethereal', 'fresh', 'fruity', 'sweet'] ['burnt', 'fresh', 'fruity', 'sweet']
['ambery', 'floral', 'fruity', 'resinous', 'sweet'] ['odorless']
['balsamic', 'fruity', 'green', 'herbal'] ['fruity', 'herbal']
['fruity', 'green', 'pear', 'sweet', 'waxy'] ['body', 'cooked', 'green', 'pear']
['cinnamon', 'clove', 'dairy', 'floral', 'phenolic', 'powdery', 'smoky', 'spicy', 'vanilla'] ['vanilla']
['blackcurrant', 'floral', 'sulfuric'] ['roasted', 'sweet', 'vegetable']
['apple', 'caramellic', 'chemical', 'ethereal', 'fruity', 'grape', 'liquor', 'pear', 'sweet'] ['chemical', 'fruity']
['apple', 'berry', 'green', 'tropicalfruit'] ['cheese', 'green', 'spicy', 'tropicalfruit', 'woody']
['alliaceous', 'earthy', 'metallic', 'spicy', 'sulfuric', 'tropicalfruit', 'vegetable'] ['blackcurrant', 'burnt', 'sulfuric', 'vegetable']
['ambery', 'balsamic', 'dry', 'floral', 'fruity', 'herbal', 'sweet', 'woody'] ['ambery', 'ambrette', 'dry']
['camphor', 'fresh', 'herbal', 'resinous', 'rose', 'woody'] ['woody']
['citrus', 'earthy', 'floral', 'green', 'rose'] ['clean', 'fresh']
['apple', 'fresh', 'fruity', 'green', 'herbal', 'vegetable'] ['apple', 'herbal']
['citrus', 'floral', 'fresh', 'fruity', 'geranium', 'green', 'herbal', 'musk', 'spicy'] ['apple', 'herbal', 'woody']
['chemical', 'ethereal', 'fresh', 'fruity'] ['alliaceous', 'cheese', 'cooked', 'mushroom']
['coconut', 'earthy', 'floral', 'herbal', 'phenolic', 'woody'] ['earthy', 'spicy']
['apple', 'butter', 'floral', 'odorless'] ['odorless']
['floral', 'lily', 'woody'] ['citrus', 'fresh', 'lily']
['floral', 'fresh', 'lily', 'resinous', 'spicy', 'sweet'] ['floral', 'fresh', 'fruity', 'green', 'musty']
['burnt', 'caramellic', 'nut', 'phenolic', 'sweet', 'syrup'] ['caramellic', 'fruity', 'resinous']
['balsamic', 'clove', 'floral', 'phenolic', 'powdery', 'smoky', 'sweet', 'vanilla'] ['vanilla']
['herbal', 'mint', 'oily', 'sweet'] ['mint', 'musk', 'spicy']
['floral', 'fruity', 'leather', 'medicinal', 'phenolic', 'powdery', 'vanilla'] ['floral', 'medicinal', 'phenolic']
['balsamic', 'floral', 'grapefruit', 'lily', 'oily', 'woody'] ['balsamic', 'fruity', 'green', 'resinous', 'woody']
['clean', 'ethereal', 'floral', 'fresh', 'lily'] ['citrus', 'floral', 'green', 'melon']
['ethereal', 'fresh', 'fruity', 'herbal', 'liquor', 'sweet'] ['ethereal', 'fresh', 'fruity']
['alliaceous', 'chemical', 'earthy', 'floral', 'green', 'sulfuric', 'vegetable'] ['ethereal', 'sulfuric']
['dairy', 'fatty', 'green', 'rose', 'sour'] ['butter']
['balsamic', 'ethereal', 'floral', 'honey', 'hyacinth', 'jasmin', 'rose'] ['fruity', 'musty', 'resinous']
['apple', 'ethereal', 'floral', 'fruity', 'tropicalfruit'] ['berry', 'green', 'herbal', 'honey', 'tropicalfruit']
['earthy', 'floral', 'fresh', 'leather', 'medicinal', 'musk', 'phenolic'] ['leather', 'phenolic', 'terpenic']
['citrus', 'ethereal', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'lemon', 'lily', 'woody'] ['citrus', 'lemon']
['ethereal', 'floral', 'fresh', 'lily'] ['ethereal', 'lily']
['earthy', 'oily', 'sweet', 'terpenic', 'woody'] ['earthy', 'woody']
['chemical', 'floral', 'fruity', 'grass', 'herbal', 'medicinal', 'phenolic', 'resinous', 'spicy', 'sweet', 'woody'] ['fruity', 'herbal', 'phenolic', 'spicy']
['berry', 'fruity', 'herbal', 'pungent', 'sour'] ['earthy']
['citrus', 'floral', 'fresh', 'green', 'rose', 'sweet', 'woody'] ['balsamic', 'floral']
['chemical', 'meat', 'medicinal', 'phenolic', 'spicy', 'sulfuric'] ['meat', 'metallic', 'phenolic', 'roasted', 'sulfuric']
['camphor', 'earthy', 'ethereal', 'musk', 'woody'] ['balsamic', 'earthy', 'green', 'liquor', 'musk', 'resinous']
['ethereal', 'floral', 'fresh', 'fruity', 'herbal', 'woody'] ['floral', 'fresh', 'fruity', 'herbal']
['berry', 'floral', 'fresh', 'fruity', 'geranium', 'rose', 'spicy'] ['floral', 'fresh', 'fruity']
['earthy', 'floral', 'green', 'herbal', 'hyacinth', 'resinous', 'rose'] ['hyacinth', 'rose']
['aldehydic', 'citrus', 'fatty', 'floral', 'green', 'oily', 'waxy'] ['green', 'herbal', 'orange']
['ethereal'] ['alcoholic', 'cacao', 'cheese', 'liquor', 'vegetable']
['camphor', 'cooling', 'earthy', 'green'] ['camphor', 'cooling', 'resinous']
['citrus', 'floral', 'fresh', 'herbal', 'lemon', 'rose'] ['citrus', 'fresh', 'herbal']
['clove', 'dairy', 'earthy', 'floral', 'phenolic', 'powdery', 'smoky', 'sweet', 'vanilla'] ['floral', 'vanilla']
['camphor', 'earthy', 'green', 'musk'] ['camphor']
['caramellic', 'coffee', 'earthy', 'green', 'nut', 'sulfuric', 'vegetable'] ['vegetable']
['balsamic', 'chemical', 'clove', 'floral', 'phenolic', 'smoky', 'vanilla', 'woody'] ['clove', 'sweet', 'woody']
['ethereal', 'grapefruit', 'musk', 'woody'] ['animalic', 'fresh', 'fruity', 'green', 'herbal', 'sour', 'spicy', 'woody']
['balsamic', 'clove', 'dairy', 'phenolic', 'smoky', 'sweet', 'vanilla'] ['resinous', 'woody']
['camphor', 'resinous'] ['camphor', 'coniferous', 'resinous']
['fatty', 'floral', 'fruity', 'oily', 'rose', 'tropicalfruit', 'waxy'] ['leaf', 'rose', 'waxy']
['floral', 'fruity', 'green', 'tropicalfruit'] ['apple', 'tropicalfruit', 'waxy']
['earthy', 'floral', 'green', 'herbal', 'pepper', 'resinous'] ['green', 'woody']
['balsamic', 'floral', 'fruity', 'green', 'resinous', 'sweet'] ['balsamic', 'berry', 'powdery']
['aldehydic', 'cooling', 'fruity', 'green', 'herbal', 'resinous', 'sweet', 'woody'] ['oily', 'resinous']
['earthy', 'green', 'leather', 'liquor', 'medicinal', 'phenolic', 'sweet', 'woody'] ['phenolic']
['fruity', 'green', 'herbal', 'mint', 'sweet'] ['fruity', 'mint']
['citrus', 'floral', 'fruity', 'herbal', 'resinous', 'woody'] ['fruity', 'rose']
['berry', 'chemical', 'fresh', 'fruity'] ['berry', 'cheese', 'sweet']
['apple', 'ethereal', 'fresh', 'fruity', 'grass', 'green', 'herbal', 'leaf', 'liquor', 'tropicalfruit', 'vegetable'] ['leaf', 'mushroom', 'violetflower']
['floral', 'herbal', 'mint', 'phenolic', 'spicy'] ['woody']
['meat', 'nut', 'sulfuric', 'vegetable'] ['coffee', 'earthy', 'meat', 'nut']
['sweet', 'vanilla'] ['dairy', 'herbal', 'phenolic', 'powdery', 'vanilla']
['citrus', 'floral', 'fresh', 'green', 'herbal', 'lily', 'rose', 'woody'] ['aldehydic', 'waxy']
['floral', 'fresh', 'green', 'lily', 'resinous', 'sweet'] ['clean', 'fruity', 'green', 'oily', 'rose']
['earthy', 'fresh', 'herbal', 'woody'] ['herbal', 'sweet', 'tobacco']
['earthy', 'fresh'] ['earthy', 'rancid', 'watery', 'woody']
['chemical', 'floral', 'green', 'herbal', 'leather', 'medicinal', 'phenolic', 'resinous', 'smoky', 'spicy', 'sweet', 'woody'] ['camphor', 'cooling', 'smoky']
['camphor', 'cooling', 'fresh', 'green', 'herbal', 'resinous', 'woody'] ['balsamic', 'chemical', 'mint', 'resinous', 'sweet']
['citrus', 'floral', 'fruity', 'rose', 'woody'] ['green', 'metallic', 'orange', 'rancid', 'sulfuric', 'waxy']
['earthy'] ['ambery', 'animalic', 'musk']
['apple', 'fatty', 'fruity', 'green', 'melon', 'pear', 'wine'] ['berry', 'clove', 'herbal']
['apple', 'fatty', 'fruity', 'green', 'oily', 'tropicalfruit', 'waxy'] ['fruity', 'green', 'musty', 'waxy']
['alliaceous', 'blackcurrant', 'ethereal', 'floral', 'spicy', 'sulfuric'] ['blackcurrant', 'powdery', 'spicy', 'sulfuric']
['earthy', 'ethereal', 'floral', 'fresh', 'fruity'] ['cacao', 'resinous']
['ethereal', 'floral', 'fresh', 'fruity', 'green', 'herbal', 'liquor', 'tropicalfruit'] ['aldehydic', 'ethereal', 'fresh', 'fruity']
['ethereal', 'fresh', 'fruity', 'oily', 'rose'] ['green']
['ethereal', 'floral', 'green', 'leaf', 'rose', 'woody'] ['hyacinth', 'leaf', 'mushroom', 'nut']
['alliaceous', 'earthy', 'fresh', 'green', 'herbal', 'spicy', 'tropicalfruit', 'vegetable'] ['meat']
['banana', 'ethereal', 'floral', 'fresh', 'green', 'sweet', 'woody'] ['camphor', 'floral', 'fruity', 'woody']
['balsamic', 'cinnamon', 'fruity', 'spicy', 'sweet', 'wine'] ['cherry', 'cooling', 'floral', 'green', 'spicy', 'sweet']
['citrus', 'floral', 'fresh', 'fruity', 'geranium', 'green', 'herbal', 'leaf', 'rose', 'woody'] ['citrus', 'floral', 'leaf', 'sweet']
['earthy', 'herbal', 'metallic', 'musk', 'powdery', 'resinous', 'smoky', 'spicy', 'vanilla'] ['herbal', 'vegetable']
['apple', 'banana', 'burnt', 'caramellic', 'chemical', 'ethereal', 'fresh', 'fruity', 'sweet'] ['butter', 'fruity']
['fatty', 'fresh', 'fruity', 'musty', 'oily', 'waxy'] ['fruity', 'oily']
['chemical', 'ethereal', 'fresh', 'green', 'herbal', 'pungent'] ['almond', 'fruity', 'green', 'herbal', 'sweet']
['earthy', 'green', 'musk', 'oily', 'woody'] ['floral', 'woody']
['floral', 'fruity', 'green', 'honey', 'resinous', 'rose', 'sweet', 'wine'] ['balsamic', 'fruity', 'rose']
['fatty', 'fruity', 'herbal', 'oily', 'rose', 'tropicalfruit'] ['butter', 'floral', 'wine']
['animalic', 'fruity', 'musk', 'sweet', 'waxy'] ['animalic', 'clean', 'dry', 'metallic', 'musk', 'powdery', 'tropicalfruit', 'waxy']
['alliaceous', 'meat', 'sulfuric'] ['fatty', 'gourmand', 'meat', 'sulfuric']
['berry', 'cinnamon', 'floral', 'green', 'sweet'] ['cinnamon']
['chemical', 'floral', 'fresh', 'fruity'] ['ambery', 'chemical', 'rancid', 'violetflower', 'woody']
['citrus', 'earthy', 'floral', 'fresh', 'fruity', 'herbal', 'lemon', 'resinous', 'woody'] ['citrus', 'floral', 'resinous']
['citrus', 'earthy', 'ethereal', 'floral', 'fresh', 'lily'] ['fruity', 'green', 'lily', 'woody']
['earthy', 'floral', 'green', 'herbal', 'lily', 'resinous'] ['lily', 'resinous']
['chemical', 'fruity', 'sour'] ['caramellic', 'dairy', 'fruity']
['cacao', 'coffee', 'earthy', 'green', 'mint', 'musty', 'nut', 'pepper', 'resinous', 'tobacco', 'vegetable', 'woody'] ['earthy', 'floral', 'green', 'pepper', 'resinous']
['earthy', 'ethereal', 'fresh', 'green', 'vegetable'] ['almond', 'cheese']
['anisic', 'balsamic', 'floral', 'green', 'powdery', 'resinous', 'sweet', 'vanilla'] ['almond', 'floral', 'spicy']
['bread', 'burnt', 'cacao', 'caramellic', 'coffee', 'earthy', 'meat', 'nut', 'sulfuric'] ['coffee', 'cooked', 'meat', 'nut', 'sulfuric']
['berry', 'ethereal', 'floral', 'rose', 'woody'] ['berry', 'cedar', 'floral', 'lactonic']
['fatty', 'fruity', 'odorless', 'oily', 'waxy'] ['sour']
['apple', 'ethereal', 'fresh', 'fruity', 'grass', 'green', 'herbal', 'leaf', 'liquor', 'tropicalfruit', 'vegetable'] ['green']
['earthy', 'floral', 'green', 'nut', 'woody'] ['animalic', 'green', 'woody']
['blackcurrant', 'herbal', 'mint', 'sulfuric', 'tropicalfruit'] ['blackcurrant', 'mint']
['citrus', 'floral', 'grape', 'green', 'herbal', 'honey', 'lily', 'rose', 'violetflower'] ['citrus', 'floral']
['chemical', 'ethereal', 'fruity', 'oily'] ['citrus', 'fresh']
['citrus', 'clean', 'floral', 'fresh', 'fruity', 'herbal', 'lemon', 'rose'] ['apple', 'dry', 'fatty', 'lemon', 'pear', 'rose', 'waxy']
['floral', 'green', 'herbal', 'lily', 'rose'] ['fresh', 'rose']
['balsamic', 'berry', 'floral', 'fruity', 'green', 'honey', 'rose', 'spicy', 'sweet', 'tropicalfruit'] ['balsamic', 'fruity']
['coconut', 'dairy', 'floral', 'fruity', 'jasmin', 'lily', 'rose', 'woody'] ['lactonic', 'plum', 'tropicalfruit']
['balsamic', 'camphor', 'resinous', 'woody'] ['camphor', 'coniferous', 'cooling']
['ethereal', 'grapefruit', 'musk', 'woody'] ['grapefruit']
['animalic', 'earthy', 'fresh', 'musk', 'sweet'] ['ambrette', 'animalic', 'musk', 'vegetable']
['floral', 'fruity', 'green', 'meat', 'roasted', 'sulfuric'] ['alliaceous', 'meat', 'roasted']
['balsamic', 'berry', 'fruity', 'green', 'nut', 'rose', 'sweet', 'tropicalfruit', 'woody'] ['woody']
['chemical', 'dry', 'earthy', 'fatty', 'floral', 'musk', 'powdery'] ['dry', 'fatty', 'musk', 'sweet', 'waxy']
['floral', 'fresh', 'fruity', 'grass'] ['fatty', 'fruity', 'mushroom']
['ethereal'] ['ethereal', 'fermented', 'fresh']
['fatty', 'fresh', 'fruity', 'musty', 'sulfuric', 'sweet', 'tropicalfruit', 'wine'] ['berry', 'dry', 'pungent']
['alliaceous', 'chemical', 'green', 'meat', 'sulfuric'] ['alliaceous', 'gourmand', 'meat', 'sharp']
['fresh', 'woody'] ['floral', 'woody']
['alliaceous', 'green', 'sulfuric'] ['alliaceous', 'cooked']

Generating the test predictions

In [ ]:
test_df = pd.read_csv("/content/data/test.csv")
mols = [Chem.MolFromSmiles(smile) for smile in test_df["SMILES"].tolist()]
feat = dc.feat.CircularFingerprint(size=1024)
test_arr = feat.featurize(mols)
(1079, 1024)
In [ ]:
test_dataset = dc.data.NumpyDataset(X=test_arr, y=np.zeros((len(test_df),109)))
<NumpyDataset X.shape: (1079, 1024), y.shape: (1079, 109), w.shape: (1079, 1), task_names: [  0   1   2 ... 106 107 108]>
In [ ]:
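y_pred = best_model.predict(test_dataset)
top_5_preds = []
c = 0  # number of molecules for which no smell passed the threshold
for i in range(y_pred.shape[0]):
  final_pred = []
  for y in range(109):
    prediction = y_pred[i,y]
    # prediction[1] is the probability of the positive class for task y
    if prediction[1]>0.30:
      final_pred.append(1)
    else:
      final_pred.append(0)
  smell_ids = np.where(np.array(final_pred)==1)
  smells = [vocab[k] for k in smell_ids[0]]
  if len(smells)==0:
    c += 1
  if len(smells)>15:
    smells = smells[:15]
  else:
    # pad with labels from top_15 until we have exactly 15 smells
    new_smells = [x for x in top_15 if x not in smells]
    smells.extend(new_smells[:15-len(smells)])
  assert len(smells)==15
  # 5 sentences of 3 smells each: "a,b,c;d,e,f;..."
  sents = []
  for sent in range(0,15,3):
    sents.append(",".join(smells[sent:sent+3]))
  pred = ";".join(sents)
  top_5_preds.append(pred)
print("[info] did not predict for ",c)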
[info] did not predict for  182
In [ ]:
final = pd.DataFrame({"SMILES":test_df.SMILES.tolist(),"PREDICTIONS":top_5_preds})
Out[ ]:
0 CCC(C)C(=O)OC1CC2CCC1(C)C2(C)C camphor,resinous,woody;fruity,floral,herbal;gr...
1 CC(C)C1CCC(C)CC1OC(=O)CC(C)O cooling,fruity,floral;woody,herbal,green;fresh...
2 CC(=O)/C=C/C1=CCC[C@H](C1(C)C)C fresh,fruity,floral;woody,herbal,green;sweet,r...
3 CC(=O)OCC(COC(=O)C)OC(=O)C fruity,floral,woody;herbal,green,fresh;sweet,r...
4 CCCCCCCC(=O)OC/C=C(/CCC=C(C)C)\C fruity,rose,floral;woody,herbal,green;fresh,sw...

This submission gives a score of ~0.253 on the leaderboard.

Using skmultilearn

We can also feed the molecular fingerprints to classical ML algorithms such as Random Forest, and give problem-transformation methods such as Label Powerset a try.

Any model available in sklearn can serve as the base classifier for this multi-label problem; I have used just these two approaches to demonstrate a possible attempt with sklearn.

In [ ]:
!pip install -q scikit-multilearn --upgrade

This function should approximate the evaluation metric used in the challenge.

In [ ]:
def in_top_5(top_5_sents,target):
  ll = top_5_sents.split(";")
  max_scr = 0 
  for x in ll:
    smells = x.split(',')
    # count the target smells that also appear in this sentence
    c = 0 
    for y in target.split(','):
      if y in smells:
        c += 1
    # Jaccard index: |intersection| / |union|
    scr = c/(len(target.split(','))+len(smells)-c)
    if scr>max_scr:
      max_scr = scr

  return max_scr
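
To make the metric concrete, here is a minimal, self-contained sketch of the same idea: score each of the five candidate sentences against the target with the Jaccard index and keep the best one. The helper names `jaccard` and `best_of_five` are hypothetical (they are not part of the notebook), and the sketch uses sets, so it assumes no smell is repeated within a sentence.

```python
def jaccard(pred, target):
    """Jaccard index between two comma-separated smell strings."""
    p, t = set(pred.split(",")), set(target.split(","))
    return len(p & t) / len(p | t)


def best_of_five(top_5_sents, target):
    """Best Jaccard score over the ';'-separated candidate sentences."""
    return max(jaccard(sent, target) for sent in top_5_sents.split(";"))


# Illustrative submission string: 5 sentences of 3 smells each.
preds = ("balsamic,fruity,floral;woody,herbal,green;fresh,sweet,resinous;"
         "rose,earthy,ethereal;citrus,oily,mint")

# First sentence shares 2 smells with the target; the union has 4 smells.
score = best_of_five(preds, "balsamic,cinnamon,fruity")
print(score)  # 0.5
```

Only the best-matching sentence counts, which is why padding a prediction with common smells in the remaining sentences cannot lower the score.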
In [ ]:
def get_submission(y_pred,labels = vocab):
  top_5_preds = []
  for i in range(y_pred.shape[0]):
    smell_ids = np.where(y_pred[i]==1)
    smells = [labels[k] for k in smell_ids[0]]
    if len(smells)>15:
      smells = smells[:15]
    else:
      # pad with labels from top_15 until we have exactly 15 smells
      new_smells = [x for x in top_15 if x not in smells]
      smells.extend(new_smells[:15-len(smells)])
    # 5 sentences of 3 smells each: "a,b,c;d,e,f;..."
    sents = []
    for sent in range(0,15,3):
      sents.append(",".join(smells[sent:sent+3]))
    pred = ";".join(sents)
    top_5_preds.append(pred)
  return top_5_preds
In [ ]:
mols = [Chem.MolFromSmiles(smile) for smile in data_df["text"].tolist()]
feat = dc.feat.CircularFingerprint(size=1024)
arr = feat.featurize(mols)
(4316, 1024)
In [ ]:
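labels = []
train_df = pd.read_csv("data/train.csv")
for x in train_df.SENTENCE.tolist():
  # multi-hot encode the comma-separated smells against the 109-smell vocab
  smells = x.split(",")
  labels.append([1 if v in smells else 0 for v in vocab])
labels = np.array(labels)
labels.shape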
(4316, 109)
In [ ]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(arr,labels, test_size=.1, random_state=42)
In [ ]:
from skmultilearn.problem_transform import BinaryRelevance
from sklearn.ensemble import RandomForestClassifier
import time

# Binary Relevance: one independent Random Forest per smell label
classifier = BinaryRelevance(
    classifier = RandomForestClassifier(),
    require_dense = [False, True]
)

start = time.time()
classifier.fit(X_train, y_train)

print('training time taken: ',round(time.time()-start,0),'seconds')
training time taken:  87.0 seconds
In [ ]:
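# predict on the held-out split and format the predictions as 5 sentences
predictions = classifier.predict(X_test)
top_5_predictions = get_submission(predictions.toarray())
gt = []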
for i in range(y_test.shape[0]):
    smell_ids = np.where(y_test[i]==1)
    smells = [vocab[k] for k in smell_ids[0]]
    gt.append(",".join([x for x in smells[:min(len(smells),3)]])) 
val_dict = {"top_5_sents":top_5_predictions,"target":gt}
df_to_eval = pd.DataFrame(val_dict)
top_5_scr = []
for pred,y in zip(df_to_eval.top_5_sents.tolist(),df_to_eval.target.tolist()):
    top_5_scr.append(in_top_5(pred,y))
df_to_eval["top_5_score"] = top_5_scr
print("OVERALL SCORE:",np.mean(df_to_eval.top_5_score.tolist()))
top_5_sents target top_5_score
0 balsamic,fruity,floral;woody,herbal,green;fres... balsamic,cinnamon,fruity 0.50
1 fruity,floral,woody;herbal,green,fresh;sweet,r... dry,herbal 0.25
2 resinous,fruity,floral;woody,herbal,green;fres... floral,fruity,resinous 1.00
3 citrus,fresh,lily;rose,fruity,floral;woody,her... citrus,fresh,lily 1.00
4 balsamic,fatty,fruity;green,sweet,tropicalfrui... green,pear 0.25

Generating Test Predictions

In [ ]:
test_df = pd.read_csv("/content/data/test.csv")
mols = [Chem.MolFromSmiles(smile) for smile in test_df["SMILES"].tolist()]
feat = dc.feat.CircularFingerprint(size=1024)
test_arr = feat.featurize(mols)
(1079, 1024)
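In [ ]:
predictions = classifier.predict(test_arr)
top_5_predictions = get_submission(predictions.toarray())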
In [ ]:
final = pd.DataFrame({"SMILES":test_df.SMILES.tolist(),"PREDICTIONS":top_5_predictions})
Out[ ]:
0 CCC(C)C(=O)OC1CC2CCC1(C)C2(C)C camphor,resinous,fruity;floral,woody,herbal;gr...
1 CC(C)C1CCC(C)CC1OC(=O)CC(C)O fruity,floral,woody;herbal,green,fresh;sweet,r...
2 CC(=O)/C=C/C1=CCC[C@H](C1(C)C)C fruity,floral,woody;herbal,green,fresh;sweet,r...
3 CC(=O)OCC(COC(=O)C)OC(=O)C odorless,fruity,floral;woody,herbal,green;fres...
4 CCCCCCCC(=O)OC/C=C(/CCC=C(C)C)\C cognac,floral,fruity;woody,herbal,green;fresh,...

This submission gives a score of ~0.24 on the leaderboard.

Label PowerSet

Label Powerset is a problem-transformation approach to multi-label classification: it turns the multi-label problem into a single multi-class problem, training one multi-class classifier on all the unique label combinations found in the training data.
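
As a rough illustration of that transformation (not skmultilearn's actual implementation), the sketch below maps every distinct row of a multi-hot label matrix to one class id and keeps the reverse mapping so a predicted class can be decoded back into labels. The function name `label_powerset_transform` and the toy matrix `y` are made up for this example.

```python
def label_powerset_transform(label_rows):
    """Map each distinct label combination to a single multi-class id."""
    combo_to_class = {}
    class_ids = []
    for row in label_rows:
        combo = tuple(row)
        if combo not in combo_to_class:
            # first time we see this combination: give it the next free id
            combo_to_class[combo] = len(combo_to_class)
        class_ids.append(combo_to_class[combo])
    return class_ids, combo_to_class


# Toy multi-hot labels for 4 molecules and 3 smells.
y = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1],
     [1, 1, 0]]

class_ids, combo_to_class = label_powerset_transform(y)
class_to_combo = {cls: combo for combo, cls in combo_to_class.items()}

print(class_ids)          # [0, 1, 0, 2]
print(class_to_combo[2])  # (1, 1, 0)
```

One consequence of this design is that the classifier can only ever predict label combinations it has seen during training, which matters when many molecules have a unique set of smells.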

In [ ]:
# using Label Powerset
from skmultilearn.problem_transform import LabelPowerset
from sklearn.naive_bayes import GaussianNB

# initialize Label Powerset multi-label classifier
# with a gaussian naive bayes base classifier
classifier = LabelPowerset(GaussianNB())

# train
classifier.fit(X_train, y_train)

# predict
predictions = classifier.predict(X_test)

top_5_predictions = get_submission(predictions.toarray())
gt = []
for i in range(y_test.shape[0]):
    smell_ids = np.where(y_test[i]==1)
    smells = [vocab[k] for k in smell_ids[0]]
    gt.append(",".join([x for x in smells[:min(len(smells),3)]])) 
val_dict = {"top_5_sents":top_5_predictions,"target":gt}
df_to_eval = pd.DataFrame(val_dict)
top_5_scr = []
for pred,y in zip(df_to_eval.top_5_sents.tolist(),df_to_eval.target.tolist()):
    top_5_scr.append(in_top_5(pred,y))
df_to_eval["top_5_score"] = top_5_scr
print("OVERALL SCORE:",np.mean(df_to_eval.top_5_score.tolist()))
OVERALL SCORE: 0.23996913580246917
top_5_sents target top_5_score
0 balsamic,fruity,floral;woody,herbal,green;fres... balsamic,cinnamon,fruity 0.50
1 spicy,fruity,floral;woody,herbal,green;fresh,s... dry,herbal 0.25
2 odorless,fruity,floral;woody,herbal,green;fres... floral,fruity,resinous 0.50
3 citrus,fresh,lily;rose,fruity,floral;woody,her... citrus,fresh,lily 1.00
4 balsamic,fatty,fruity;green,sweet,tropicalfrui... green,pear 0.25

Generating Test Predictions

In [ ]:
start = time.time()
predictions = classifier.predict(test_arr)
print('prediction time taken: ',round(time.time()-start,0),'seconds')
top_5_predictions = get_submission(predictions.toarray())
prediction time taken:  10.0 seconds
In [ ]:
final = pd.DataFrame({"SMILES":test_df.SMILES.tolist(),"PREDICTIONS":top_5_predictions})
Out[ ]:
0 CCC(C)C(=O)OC1CC2CCC1(C)C2(C)C balsamic,resinous,fruity;floral,woody,herbal;g...
1 CC(C)C1CCC(C)CC1OC(=O)CC(C)O mint,fruity,floral;woody,herbal,green;fresh,sw...
2 CC(=O)/C=C/C1=CCC[C@H](C1(C)C)C odorless,fruity,floral;woody,herbal,green;fres...
3 CC(=O)OCC(COC(=O)C)OC(=O)C odorless,fruity,floral;woody,herbal,green;fres...
4 CCCCCCCC(=O)OC/C=C(/CCC=C(C)C)\C cognac,floral,fruity;woody,herbal,green;fresh,...

This submission gives a score of ~0.26 on the leaderboard.


Chemception is named after the Inception modules used in its neural network. This method is based on this paper. The SMILES strings are encoded as 2D images.

In [ ]:
import pandas as pd
import rdkit
from rdkit import Chem
from rdkit.Chem import AllChem
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline
print("RDKit: %s"%rdkit.__version__)
RDKit: 2020.09.1

Preprocessing the Data

The first step is to encode the molecule into an “image”.

The function below takes an RDKit mol and encodes the molecular graph as an image with 4 channels.

After reading in the molecule, the Gasteiger charges are calculated and the 2D drawing coordinates are computed. These coordinates are usually computed just before generating a depiction of the molecule, but we are going to need them “raw”, so they are extracted to the coords matrix.

The vect matrix is defined and filled with zeros (vacuum) and is of the shape (image_width, image_height,4). Each layer is then used to encode different information from the molecule.

Layer zero holds the bonds and encodes the bond order. Layers one to three encode the atomic number, the hybridization and the Gasteiger charges, in that order.
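
To show how 2D coordinates end up as pixels, here is a small pure-Python sketch of the mapping used for the bond channel: coordinates are shifted by `embed`, divided by `res`, and a bond is sampled at a few points between its two atoms. The helper names `to_pixel` and `rasterize_bond` are hypothetical, and the sketch uses plain tuples rather than RDKit conformer positions.

```python
def to_pixel(x, y, embed=16.0, res=0.5):
    """Shift a 2D coordinate by `embed` and scale by `res` to get a pixel index."""
    return int(round((x + embed) / res)), int(round((y + embed) / res))


def rasterize_bond(begin, end, embed=16.0, res=0.5):
    """Pixels touched by a straight bond between two 2D atom positions."""
    n = int(1 / res * 2)  # number of sample points along the bond
    pixels = []
    for k in range(n):
        f = k / (n - 1)  # fraction running from 0 to 1
        x = f * begin[0] + (1 - f) * end[0]
        y = f * begin[1] + (1 - f) * end[1]
        pixels.append(to_pixel(x, y, embed, res))
    return pixels


dims = int(16.0 * 2 / 0.5)   # image is dims x dims pixels
print(dims)                  # 64
print(to_pixel(0.0, 0.0))    # (32, 32): the origin lands in the image centre
print(rasterize_bond((0.0, 0.0), (1.5, 0.0)))
```

With `embed=16` and `res=0.5`, the image covers coordinates from roughly -16 to +16 in each direction, at two pixels per coordinate unit.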

More features can be added as extra channels. For now, however, we feed only the first 3 channels into the CNN of our choice.

I am working on the implementation that will use the Inception model mentioned in the original paper, that takes into account all 4 channels so stay tuned!

In [ ]:
import pandas as pd 
import numpy as np

embed = 20 
res = 0.5 
train = pd.read_csv("data/train.csv")
0 C/C=C/C(=O)C1CCC(C=C1C)(C)C fruity,rose
1 COC(=O)OC fresh,ethereal,fruity
2 Cc1cc2c([nH]1)cccc2 resinous,animalic
3 C1CCCCCCCC(=O)CCCCCCC1 powdery,musk,animalic
4 CC(CC(=O)OC1CC2C(C1(C)CC2)(C)C)C coniferous,camphor,fruity
In [ ]:
from rdkit import Chem 
train["mol"] = train["SMILES"].apply(Chem.MolFromSmiles)
In [ ]:
mol = train["mol"][1]
dims = int(embed*2/res)
cmol = Chem.Mol(mol.ToBinary())
In [ ]:
def chemcepterize_mol(mol, embed=20.0, res=0.5):
    dims = int(embed*2/res)
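    cmol = Chem.Mol(mol.ToBinary())
    # Gasteiger charges and 2D drawing coordinates must exist before they are read below
    AllChem.ComputeGasteigerCharges(cmol)
    AllChem.Compute2DCoords(cmol)
    coords = cmol.GetConformer(0).GetPositions()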
    vect = np.zeros((dims,dims,4))
    #Bonds first
    for i,bond in enumerate(mol.GetBonds()):
        bondorder = bond.GetBondTypeAsDouble()
        bidx = bond.GetBeginAtomIdx()
        eidx = bond.GetEndAtomIdx()
        bcoords = coords[bidx]
        ecoords = coords[eidx]
        frac = np.linspace(0,1,int(1/res*2)) #
        for f in frac:
            c = (f*bcoords + (1-f)*ecoords)
            idx = int(round((c[0] + embed)/res))
            idy = int(round((c[1]+ embed)/res))
            #Save in the vector first channel
            vect[ idx , idy ,0] = bondorder
    #Atom Layers
    for i,atom in enumerate(cmol.GetAtoms()):
            idx = int(round((coords[i][0] + embed)/res))
            idy = int(round((coords[i][1]+ embed)/res))
            #Atomic number
            vect[ idx , idy, 1] = atom.GetAtomicNum()
            #Gasteiger Charges
            charge = atom.GetProp("_GasteigerCharge")
            vect[ idx , idy, 3] = charge
            hyptype = atom.GetHybridization().real
            vect[ idx , idy, 2] = hyptype
    return vect
data = pd.read_csv("data/train.csv")
data["mol"] = data["SMILES"].apply(Chem.MolFromSmiles)

Let's try to “chemcepterize” a molecule and show it as an image. Matplotlib only supports RGB, so only the first three channels are used.

In [ ]:
mol = data["mol"][3]
v = chemcepterize_mol(mol, embed=18, res=0.5)
plt.imshow(v[:, :, :3])  # show only the first three channels as RGB
plt.title("Chemceptionized Molecule")

Some molecules are really long and it's difficult to embed them into such a small image, so we drop these molecules from our dataset.
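
A quick way to see why long molecules do not fit is to check whether every atom's pixel index falls inside the grid. The sketch below is an illustration under assumptions: `fits_in_grid` is a hypothetical helper, the coordinates are invented, and the check is stricter than simply catching an indexing error, because NumPy silently wraps small negative indices instead of raising.

```python
def fits_in_grid(coords, embed=16.0, res=0.5):
    """Return True if every (x, y) coordinate maps to a pixel inside the image."""
    dims = int(embed * 2 / res)
    for x, y in coords:
        idx = int(round((x + embed) / res))
        idy = int(round((y + embed) / res))
        if not (0 <= idx < dims and 0 <= idy < dims):
            return False
    return True


# A compact, made-up three-atom layout near the origin.
small = [(-1.3, 0.0), (0.0, 0.75), (1.3, 0.0)]

# A made-up 41-atom chain laid out in a straight line, about 52 units long.
long_chain = [(1.3 * i - 26.0, 0.0) for i in range(41)]

print(fits_in_grid(small))       # True
print(fits_in_grid(long_chain))  # False
```

Increasing `embed` makes room for longer molecules at the cost of a larger, mostly empty image.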

In [ ]:
chemcepterize_mols = []
idxs_to_drop = []
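for i,mol in enumerate(data["mol"].tolist()):
    try:
        chemcepterize_mols.append(chemcepterize_mol(mol, embed=16, res=0.5))
    except Exception:
        # molecule does not fit in the image: report it and mark it for dropping
        print(i, mol, data["SMILES"][i], len(data["SMILES"][i]))
        idxs_to_drop.append(i)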
142 <rdkit.Chem.rdchem.Mol object at 0x7f45c1976030> CC1CCc2c(C1)occ2C.CC1CCC(C(C1)OC(=O)C)C(C)C.CC1CCC(=C(C)C)C(=O)C1.CC1CCC(C(=O)C1)C(C)C.CC1CCC(C(C1)O)C(C)C.CC1CCC2(CC1)OCC2C 124
195 <rdkit.Chem.rdchem.Mol object at 0x7f45c1993120> CCCCCCC(C/C=C\CCCCCCCC(=O)OC(COC(=O)CCCCCCC/C=C\CC(CCCCCC)O)COC(=O)CCCCCCC/C=C\CC(CCCCCC)O)O 92
248 <rdkit.Chem.rdchem.Mol object at 0x7f45c196f210> OC1C[C@H]2C([C@]1(C)CC2)(C)C.C=CC(CCC=C(C)C)C.C=CCc1ccc(c(c1)OC)OC.OC/C=C(\CCC=C(C)C)/C.O=C/C=C(\CCC=C(C)C)/C 109
2318 <rdkit.Chem.rdchem.Mol object at 0x7f4606823850> Cn1cnc2c1c(=O)n(C)c(=O)n2C.Cn1cnc2c1c(=O)[nH]c(=O)n2C.Oc1cc2OC(c3ccc(c(c3)O)O)C(Cc2c(c1)O)O 91
2716 <rdkit.Chem.rdchem.Mol object at 0x7f46068247b0> C/C(=C\C=C\C=C(\C=C\C=C(\C=C\C1=C(C)C[C@H](CC1(C)C)O)/C)/C)/C=C/C=C(/C=C/C(=O)[C@]1(C)C[C@H](CC1(C)C)O)\C 105
2880 <rdkit.Chem.rdchem.Mol object at 0x7f4606832c10> CC(=O)CCC/C=C/C=C/C=C.CC(=O)CCC/C=C\C=C\C=C.CCC(=O)CC/C=C/C=C/C=C.CCC(=O)CC/C=C\C=C\C=C.CCCC(=O)C/C=C/C=C/C=C.CCCC(=O)C/C=C\C=C\C=C 131
2892 <rdkit.Chem.rdchem.Mol object at 0x7f4606831030> C/C/1=C\CCC(=C2/C(=C\C1)/CC2)C.CC(=CCC1=C(O)C(C(=O)C(=C1O)C(=O)CC(C)C)(CC=C(C)C)CC=C(C)C)C.CC(=CCC1=C(O)C(C(=O)C(=C1O)C(=C)CC(C)C)(O)CC=C(C)C)C 143

Data Augmentation, Let's Double the dataset!

A SMILES string is not a unique representation of a molecule, so we use RDKit to create different versions of the same SMILES and add them to our dataset.

In [ ]:
import random

def randomSmiles(m1):
    m1.SetProp("_canonicalRankingNumbers", "True")
    idxs = list(range(0,m1.GetNumAtoms()))
    random.shuffle(idxs)  # a random atom ordering yields a different, equally valid SMILES
    for i,v in enumerate(idxs):
        m1.GetAtomWithIdx(i).SetProp("_canonicalRankingNumber", str(v))
    return Chem.MolToSmiles(m1)

m1 = Chem.MolFromSmiles(data["SMILES"][3])
v1 = chemcepterize_mol(m1, embed=16, res=0.5)
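s = set()
for i in range(1000):
  smiles = randomSmiles(m1)
  s.add(smiles)  # collect the distinct randomized SMILES


for new_mol in s:
  mol = Chem.MolFromSmiles(new_mol)
  v = chemcepterize_mol(mol, embed=16, res=0.5)
  diff = (v1-v).astype(int)
  # show the first variant whose image differs from the original
  if len(np.unique(diff)) > 4:
    plt.imshow(v[:, :, :3])
    plt.title("New Chemceptionized Molecule")
    break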

Process and save the entire train set

In [ ]:
from tqdm.notebook import tqdm
new_smiles = []
new_labels = []
did_not_add = 0
for smile,label in tqdm(zip(data["SMILES"].tolist(),data["SENTENCE"].tolist()),total = len(data)):
    m1 = Chem.MolFromSmiles(smile)
    v1 = chemcepterize_mol(m1, embed=16, res=0.5)
    s = set()
    flag = 0 
    for i in range(1000):
      smiles = randomSmiles(m1)
      s.add(smiles)

    for new_mol in s:
      mol = Chem.MolFromSmiles(new_mol)
      v = chemcepterize_mol(mol, embed=16, res=0.5)
      diff = (v1-v).astype(int)
      # keep the first variant whose image differs from the original
      if len(np.unique(diff)) > 4:
        new_smiles.append(new_mol)
        new_labels.append(label)
        flag = 1
        break
    if flag == 0:
      did_not_add += 1
print("[INFO] Did not add for",did_not_add,"/",len(data),"i.e",did_not_add/len(data)*100,"%")
[INFO] Did not add for 382 / 4309 i.e 8.865165931770711 %
In [ ]:
len(new_smiles), len(new_labels)
Out[ ]:
(3927, 3927)
In [ ]:
orig_smiles = data["SMILES"].tolist()
orig_labels = data["SENTENCE"].tolist()
In [ ]:
filenames = [str(x)+".png" for x in range(len(orig_smiles))]
In [ ]:
new_data = pd.DataFrame({"SMILES":orig_smiles,"SENTENCE":orig_labels,"FILENAME":filenames})
new_data["mol"] = new_data["SMILES"].apply(Chem.MolFromSmiles)
In [ ]:
Out[ ]:
0 C/C=C/C(=O)C1CCC(C=C1C)(C)C fruity,rose 0.png <rdkit.Chem.rdchem.Mol object at 0x7f4606622f30>
1 COC(=O)OC fresh,ethereal,fruity 1.png <rdkit.Chem.rdchem.Mol object at 0x7f4606622440>
2 Cc1cc2c([nH]1)cccc2 resinous,animalic 2.png <rdkit.Chem.rdchem.Mol object at 0x7f4606622760>
3 C1CCCCCCCC(=O)CCCCCCC1 powdery,musk,animalic 3.png <rdkit.Chem.rdchem.Mol object at 0x7f46066224e0>
4 CC(CC(=O)OC1CC2C(C1(C)CC2)(C)C)C coniferous,camphor,fruity 4.png <rdkit.Chem.rdchem.Mol object at 0x7f46066220d0>
In [ ]:
!mkdir chemception_data
!mkdir chemception_data/train
!mkdir chemception_data/test
In [ ]:
import PIL
def save_png(mol,filename,mode):
    v = chemcepterize_mol(mol, embed=16,res=0.5)
for i,r in new_data.iterrows():

Process and save the entire test set

In [ ]:
test = pd.read_csv("data/test.csv")
test["mol"] = test["SMILES"].apply(Chem.MolFromSmiles)
test_filenames = [str(x)+".png" for x in range(len(test))]
test["FILENAME"] = test_filenames
Out[ ]:
0 CCC(C)C(=O)OC1CC2CCC1(C)C2(C)C <rdkit.Chem.rdchem.Mol object at 0x7f460a8731c0> 0.png
1 CC(C)C1CCC(C)CC1OC(=O)CC(C)O <rdkit.Chem.rdchem.Mol object at 0x7f460a8733a0> 1.png
2 CC(=O)/C=C/C1=CCC[C@H](C1(C)C)C <rdkit.Chem.rdchem.Mol object at 0x7f460a873210> 2.png
3 CC(=O)OCC(COC(=O)C)OC(=O)C <rdkit.Chem.rdchem.Mol object at 0x7f460a873530> 3.png
4 CCCCCCCC(=O)OC/C=C(/CCC=C(C)C)\C <rdkit.Chem.rdchem.Mol object at 0x7f460a873580> 4.png
In [ ]:
chemcepterize_mols = []
idxs_to_drop = []
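for i,mol in enumerate(test["mol"].tolist()):
    try:
        chemcepterize_mols.append(chemcepterize_mol(mol, embed=16, res=0.5))
    except Exception:
        # molecule does not fit in the image: report it and mark it for dropping
        print(i, mol, test["SMILES"][i], len(test["SMILES"][i]))
        idxs_to_drop.append(i)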
405 <rdkit.Chem.rdchem.Mol object at 0x7f460a87d5d0> CCCCCCCCCCCCOC(=O)CCC(=O)OCCCCCCCCCCCC 38
470 <rdkit.Chem.rdchem.Mol object at 0x7f460a87ea80> CCCCCCCCCc1ccc(cc1)OP(Oc1ccc(cc1)CCCCCCCCC)Oc1ccc(cc1)CCCCCCCCC 63
755 <rdkit.Chem.rdchem.Mol object at 0x7f460a8845d0> CC(=CCC/C(=C/C=C/C(=C/C=C/C(=C/C=C/C=C(\C)/C=C/C=C(\C)/C=C/C=C(\C)/CCC=C(C)C)/C)/C)/C)C 87
In [ ]:
for i,r in test.iterrows():
In [ ]:

Using Fastai and first 3 channels of Chemceptionized Molecule as Input

In [ ]:
import sys
import os
import gc
import warnings
import torch

import torch.nn as nn
import numpy as np
import pandas as pd 
import torch.nn.functional as F

from fastai.script import *
from fastai.vision import *
from fastai.callbacks import *
from fastai.distributed import *
from fastprogress import fastprogress
from torchvision.models import *
from tqdm.notebook import tqdm
In [ ]:
np.random.seed(42) # set random seed so we always get the same validation set
src = (ImageList.from_df(path="/content/chemception_data/train", df=new_data, cols=["FILENAME"])
       .split_by_rand_pct(0.2)
       .label_from_df(label_delim=',')
       .databunch(num_workers=0, bs=64)
       .normalize(imagenet_stats))

Let's visualize the data

In [ ]:
src.show_batch(rows=3, figsize=(12, 9))

Defining the model. Feel free to switch to any CNN; I am using VGG19 with batch normalization.

In [ ]:
# create metrics
acc_02 = partial(accuracy_thresh, thresh=0.2)
f_score = partial(fbeta, thresh=0.2)
# create cnn with the vgg19_bn architecture
learn = cnn_learner(src, models.vgg19_bn, metrics=[acc_02, f_score])
Downloading: "https://download.pytorch.org/models/vgg19_bn-c79401a0.pth" to /root/.cache/torch/hub/checkpoints/vgg19_bn-c79401a0.pth

Find optimal Learning Rate

In [ ]:
learn.lr_find() # find learning rate
learn.recorder.plot() # plot learning rate

Stage 1 Training

In [ ]:
lr = 0.001 # chosen learning rate
learn.fit_one_cycle(4, lr) # train model for 4 epochs

learn.save('chemception-stage-1') # save model

Unfreeze the model and find the new optimal learning rate.

In [ ]:
learn.unfreeze() # unfreeze all layers

learn.lr_find() # find learning rate
learn.recorder.plot() # plot learning rate

Stage 2 Training

In [ ]:
learn.fit_one_cycle(15, slice(1e-5, lr/5)) # fit model with differential learning rates

learn.save('chemception-stage-2') # save model

Export the model for future use

In [ ]:
learn.export() # save the trained learner for inference

Let's see how the model performs on the Val set

In [ ]:
val_fns = [str(x) for x in learn.data.valid_ds.items]
In [ ]:
val_fns_y=[str(x) for x in learn.data.valid_ds.y]
In [ ]:
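for img,gt in zip(val_fns,val_fns_y):
  pred_class, pred_idx, preds=learn.predict(open_image(img))
  thresh = 0.1
  # keep every class whose predicted probability exceeds the threshold
  labelled_preds = [' '.join([learn.data.classes[i] for i,p in enumerate(preds) if p > thresh])]
  print("GT:", gt)
  print("PREDS:", labelled_preds)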
PREDS: ['fresh fruity herbal mint']
GT: melon;pear;tropicalfruit;earthy
PREDS: ['fatty floral fresh fruity green herbal']
GT: apple
PREDS: ['floral fresh fruity green herbal']
GT: sulfuric;meat;seafood;metallic;roasted
PREDS: ['fruity meat nut']
GT: vanilla
PREDS: ['floral phenolic resinous spicy sweet woody']
GT: violetflower;waxy;pear;melon;vegetable
PREDS: ['fatty floral fresh fruity green herbal']
GT: green;green;fruity;herbal
PREDS: ['balsamic floral fresh fruity green herbal resinous sweet woody']
GT: apple;rose;herbal
PREDS: ['floral fresh fruity green herbal rose sweet woody']
GT: resinous;camphor;balsamic
PREDS: ['camphor coniferous earthy herbal resinous woody']
GT: woody;earthy;camphor
PREDS: ['camphor dry earthy floral herbal musk resinous spicy woody']
GT: balsamic;spicy
PREDS: ['balsamic floral fruity resinous spicy sweet']
GT: fruity;herbal;apple;herbal
PREDS: ['fresh fruity green herbal sweet']
GT: apple;herbal
PREDS: ['fresh fruity green herbal']
GT: fruity;jasmin;woody;balsamic
PREDS: ['floral fresh fruity green herbal sweet']
GT: sulfuric
PREDS: ['alliaceous ethereal fresh fruity meat sulfuric']
GT: woody;floral;oily
PREDS: ['ambery fresh fruity woody']
GT: lily;orange;coconut;fruity;jasmin
PREDS: ['floral fresh fruity green herbal']
GT: balsamic;spicy
PREDS: ['floral fruity spicy sweet']
GT: pungent;terpenic;citrus;herbal
PREDS: ['floral fresh fruity green herbal sweet']
GT: sweet;lemon
PREDS: ['floral fresh fruity green herbal sweet woody']
GT: earthy;bread;nut
PREDS: ['burnt ethereal fresh fruity']
GT: sharp;mint;phenolic
PREDS: ['floral fresh fruity green herbal mint sweet woody']
GT: violetflower;woody;fresh;clean
PREDS: ['floral fresh fruity green herbal woody']
GT: oily;herbal;citrus
PREDS: ['fatty floral fresh fruity green herbal rose woody']
GT: fresh;camphor;coniferous
PREDS: ['camphor floral fresh fruity green herbal resinous woody']
GT: mint
PREDS: ['fresh herbal mint woody']
GT: dry;rose;green;resinous;musk;orange
PREDS: ['balsamic floral resinous spicy sweet woody']
GT: vanilla;spicy;floral;clove;powdery;dairy;burnt;balsamic
PREDS: ['floral fruity green herbal spicy sweet']
GT: lactonic
PREDS: ['floral fruity green herbal rose']
GT: meat
PREDS: ['floral fruity green resinous sweet']
GT: lactonic;plum;green;dry
PREDS: ['balsamic floral fruity green herbal resinous rose sweet']
GT: rose;woody
PREDS: ['floral fruity herbal woody']
GT: fresh;honey;grass;woody;spicy
PREDS: ['floral fruity green herbal woody']
GT: sweet;woody;green;floral
PREDS: ['citrus fatty floral fresh fruity green herbal rose waxy']
GT: sweet;green;fruity
PREDS: ['balsamic floral fresh fruity green herbal resinous rose sweet']
GT: lily;green;fruity;woody
PREDS: ['floral fresh fruity green herbal rose sweet woody']
GT: dry;grass;waxy;rose
PREDS: ['floral fresh fruity green herbal sweet']
GT: floral
PREDS: ['fresh fruity herbal mint']
GT: sweet;anisic;almond;floral
PREDS: ['floral nut phenolic resinous spicy sweet']
GT: woody;floral;balsamic
PREDS: ['floral fruity herbal woody']
GT: geranium;spicy;leaf
PREDS: ['floral fresh fruity green herbal sweet woody']
GT: hyacinth;green
PREDS: ['floral nut spicy sweet']
GT: mushroom;musty;oily
PREDS: ['ethereal fresh fruity green']
GT: cinnamon;aldehydic
PREDS: ['balsamic floral fruity green herbal resinous rose sweet']
GT: balsamic;berry;whiteflower;cacao;green;herbal;woody;balsamic
PREDS: ['balsamic floral fruity green herbal resinous sweet']
GT: woody;floral;fruity
PREDS: ['floral fresh fruity green herbal sweet woody']
GT: apple;banana
PREDS: ['fresh fruity green']
GT: green;fruity
PREDS: ['floral fresh fruity green herbal sweet']
GT: leather;animalic;earthy;ethereal
PREDS: ['floral fruity green herbal resinous sweet woody']
GT: resinous;coniferous;camphor
PREDS: ['fresh fruity herbal mint']
GT: berry;lactonic;cedar;floral
PREDS: ['floral fruity green herbal woody']
GT: gourmand;cooked;roasted
PREDS: ['floral fresh fruity green herbal sweet woody']
GT: caramellic;berry;liquor
PREDS: ['floral fresh fruity green herbal mint woody']
GT: floral;woody
PREDS: ['floral fruity green herbal resinous woody']
GT: bread;clean;caramellic
PREDS: ['ethereal fresh fruity']
GT: resinous;caramellic
PREDS: ['fresh fruity green herbal sweet']
GT: musty;coffee;cacao
PREDS: ['floral fruity green sweet']
GT: seafood;whiteflower;jasmin
PREDS: ['floral fruity green nut']
GT: ethereal;fruity
PREDS: ['floral fresh fruity green herbal']
GT: earthy
PREDS: ['nut phenolic spicy sweet']
GT: tropicalfruit
PREDS: ['burnt ethereal fresh fruity sulfuric']
GT: woody;floral;pepper
PREDS: ['citrus floral fresh fruity green herbal woody']
GT: grapefruit
PREDS: ['floral fresh fruity green herbal']
GT: banana;fatty;apple
PREDS: ['floral fresh fruity green herbal sweet']
GT: woody;balsamic
PREDS: ['balsamic floral fruity resinous sweet']
GT: ripe;dry;tropicalfruit;liquor;green;pear;caramellic
PREDS: ['fresh fruity green herbal']
GT: dry;green;ester;green;fruity;ethereal;grapefruit
PREDS: ['balsamic floral fruity green herbal resinous sweet woody']
GT: syrup;camphor;coniferous
PREDS: ['camphor coniferous fresh green resinous']
GT: lily
PREDS: ['citrus floral fresh fruity green herbal woody']
GT: earthy;woody
PREDS: ['camphor earthy floral fruity green herbal resinous woody']
GT: fruity;clove
PREDS: ['balsamic floral fruity resinous spicy sweet woody']
GT: tobacco;ambery;woody
PREDS: ['ambery earthy herbal musk resinous spicy woody']
GT: fresh;banana;burnt;dairy
PREDS: ['ethereal fresh fruity green']
GT: grapefruit;resinous;orange;oily;woody
PREDS: ['floral fresh fruity herbal woody']
GT: fruity;chemical
PREDS: ['ethereal fresh fruity herbal']
GT: fruity;woody
PREDS: ['floral fruity herbal woody']
GT: mint;fruity
PREDS: ['floral fresh fruity green herbal sweet woody']
GT: earthy
PREDS: ['fresh fruity green']
GT: resinous;fruity;tobacco
PREDS: ['balsamic floral fruity green herbal resinous spicy sweet woody']
GT: tropicalfruit;berry;sulfuric;earthy;fatty;waxy
PREDS: ['fresh fruity green herbal sweet']
GT: fresh;floral;woody
PREDS: ['floral fresh fruity green herbal sweet']
GT: honey
PREDS: ['floral fruity herbal woody']
GT: nut
PREDS: ['balsamic floral fruity resinous spicy sweet woody']
GT: sweet;violetflower;orange;powdery
PREDS: ['floral fruity green herbal resinous sweet']
GT: woody;floral;balsamic
PREDS: ['floral fruity herbal woody']
GT: fruity;tropicalfruit;lactonic;coconut;dairy;vanilla
PREDS: ['floral fruity green herbal woody']
GT: citrus;lemon;floral
PREDS: ['floral fresh fruity green herbal woody']
GT: aldehydic;herbal;lily;woody;watery
PREDS: ['balsamic camphor fresh green resinous']
GT: fresh;rose;green
PREDS: ['floral nut phenolic sweet']
GT: resinous;fruity;rose;herbal
PREDS: ['floral fresh fruity green herbal woody']
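To put a rough number on how well the predicted labels match the ground truth in the listing above, one option is a per-molecule Jaccard overlap between the two label sets. The sketch below is only an illustration: the helper names `parse_gt`, `parse_preds` and `jaccard` are made up here, and this is not the challenge's official scoring code. The three GT/PREDS pairs are copied from the output above.

```python
def parse_gt(gt):
    """Split a ground-truth string such as 'green;fruity' into a set of labels."""
    return set(gt.split(";"))

def parse_preds(preds):
    """Split a space-separated prediction string into a set of labels."""
    return set(preds.split())

def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# (ground truth, prediction) pairs taken from the printed output.
pairs = [
    ("green;fruity", "floral fresh fruity green herbal sweet"),
    ("fruity;woody", "floral fruity herbal woody"),
    ("earthy", "fresh fruity green"),
]
scores = [jaccard(parse_gt(gt), parse_preds(p)) for gt, p in pairs]
for (gt, p), s in zip(pairs, scores):
    print(f"{s:.2f}  GT={gt}  PREDS={p}")
```

The model tends to predict the most common labels (floral, fruity, green, herbal) for almost every molecule, so rare ground-truth labels such as "earthy" get no overlap at all.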
An example test prediction

In [ ]:
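# preds holds one probability per class; take the indices of the 15 highest.
order = preds.argsort(descending=True)[:15].cpu().numpy()
labelled_preds = [learn.data.classes[i] for i in order]
# Group the 15 labels into 5 "sentences" of 3 labels each.
sents = []
for sent in range(0,15,3):
  sents.append(",".join(labelled_preds[sent:sent+3]))
# Labels are comma-separated within a sentence; sentences are joined by ";".
pred = ";".join(sents)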
Out[ ]:
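The cell above packs the 15 highest-scoring labels into the submission string: five "sentences" of three labels each, with labels joined by commas and sentences joined by semicolons. A minimal standalone sketch of just that formatting step, using a hypothetical helper `format_prediction` and a made-up label ranking rather than real model output:

```python
def format_prediction(labels, n_sentences=5, per_sentence=3):
    """Join ranked labels into 'a,b,c;d,e,f;...' (hypothetical helper)."""
    needed = n_sentences * per_sentence
    if len(labels) < needed:
        raise ValueError(f"need at least {needed} labels, got {len(labels)}")
    sentences = [
        ",".join(labels[start:start + per_sentence])
        for start in range(0, needed, per_sentence)
    ]
    return ";".join(sentences)

# Made-up ranking, for illustration only.
ranked = ["woody", "herbal", "floral", "fruity", "resinous", "green",
          "fresh", "sweet", "spicy", "rose", "citrus", "mint",
          "balsamic", "earthy", "camphor"]
example = format_prediction(ranked)
print(example)
```

Pulling the formatting into one function avoids repeating the same loop for the model predictions and for the fallback labels.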

Generating test predictions

In [ ]:
with open("/content/data/vocabulary.txt") as f:
  # splitlines() avoids a trailing empty entry if the file ends with a newline.
  vocab = f.read().splitlines()
In [ ]:
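from sklearn.preprocessing import MultiLabelBinarizer
def make_sentence_list(sent):
  # Each training "sentence" is a comma-separated list of smell labels.
  return sent.split(",")
train_df = pd.read_csv("/content/data/train.csv")
train_df["SENTENCE_LIST"] = train_df.SENTENCE.apply(make_sentence_list)
multilabel_binarizer = MultiLabelBinarizer()
# The binarizer must be fitted before it can transform, so use fit_transform.
Y = multilabel_binarizer.fit_transform(train_df.SENTENCE_LIST)
# Count how many training molecules carry each label.
d = {}
for x,y in zip(multilabel_binarizer.classes_,Y.sum(axis=0)):
  d[x] = y
# Keep the 15 most frequent labels as a fallback prediction.
d = sorted(d.items(), key=lambda x: x[1], reverse=True)
top_15 = [x[0] for x in d[:15]]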
Out[ ]:
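The label counting in the cell above can also be done without a binarizer. As an alternative sketch (not the notebook's method), `collections.Counter` from the standard library counts labels directly. The toy sentences below are invented for illustration and stand in for the `SENTENCE` column of `train.csv`.

```python
from collections import Counter

# Invented stand-in for train_df.SENTENCE: comma-separated labels per molecule.
toy_sentences = ["fruity,green", "woody,floral,fruity", "green,fruity,sweet"]

# Count every label occurrence across all sentences.
counts = Counter(label for s in toy_sentences for label in s.split(","))

# most_common(k) returns the k labels with the highest counts.
top_2 = [label for label, _ in counts.most_common(2)]
print(counts)
print(top_2)
```

One difference to keep in mind: the binarizer counts a label at most once per molecule, while `Counter` counts every occurrence, so the two only agree when no sentence repeats a label.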
In [ ]:
final_preds = []
test_df = pd.read_csv("/content/data/test.csv")
new_test = pd.read_csv("/content/chemception_data/test_data.csv")
for i,row in tqdm(test_df.iterrows(),total=len(test_df)):
  if i in [405,470,755]:
    # These rows skip the model and fall back to the 15 most frequent training labels.
    for x in top_15:
      assert x in vocab,x
    labels = top_15
  else:
    # Look up the image generated for this SMILES and run the model on it.
    fn = new_test.loc[new_test["SMILES"]==row["SMILES"]]["FILENAME"].values[0]
    pred_class, pred_idx, preds = learn.predict(open_image("/content/chemception_data/test/"+fn))
    order = preds.argsort(descending=True)[:15].cpu().numpy()
    labelled_preds = [learn.data.classes[i] for i in order]
    for x in labelled_preds:
      assert x in vocab
    labels = labelled_preds
  # Format as 5 sentences of 3 labels: "a,b,c;d,e,f;...".
  sents = []
  for sent in range(0,15,3):
    sents.append(",".join(labels[sent:sent+3]))
  final_preds.append(";".join(sents))
# Sanity check: one prediction per test row.
print(len(final_preds), len(test_df))
1079 1079
In [ ]:
final = pd.DataFrame({"SMILES":test_df.SMILES.tolist(),"PREDICTIONS":final_preds})
final.head()
Out[ ]:
   SMILES                            PREDICTIONS
0  CCC(C)C(=O)OC1CC2CCC1(C)C2(C)C    woody,herbal,floral;fruity,resinous,green;fres...
1  CC(C)C1CCC(C)CC1OC(=O)CC(C)O      woody,fruity,floral;herbal,sweet,fresh;spicy,g...
2  CC(=O)/C=C/C1=CCC[C@H](C1(C)C)C   woody,herbal,fruity;floral,fresh,green;sweet,s...
3  CC(=O)OCC(COC(=O)C)OC(=O)C        fruity,floral,green;herbal,fresh,rose;citrus,s...
4  CCCCCCCC(=O)OC/C=C(/CCC=C(C)C)\C  fruity,floral,rose;green,herbal,fatty;oily,wax...
This submission gives a score of ~0.247 on the leaderboard.


👾 Shraddhaa Mohan

🚀 Rohit Midha

If you found this notebook helpful, drop us a ❤!


