De-Shuffling Text

Solution for submission 148568

In [ ]:

!nvidia-smi

Sat Jun 26 19:08:43 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.27       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   37C    P0    27W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

In [ ]:

!pip install aicrowd-cli
!rm -rf data
!mkdir data

Requirement already satisfied: aicrowd-cli in /usr/local/lib/python3.7/dist-packages (0.1.7)
Requirement already satisfied: click<8,>=7.1.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (7.1.2)
Collecting tqdm<5,>=4.56.0
  Using cached https://files.pythonhosted.org/packages/b4/20/9f1e974bb4761128fc0d0a32813eaa92827309b1756c4b892d28adfb4415/tqdm-4.61.1-py2.py3-none-any.whl
Requirement already satisfied: requests-toolbelt<1,>=0.9.1 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (0.9.1)
Requirement already satisfied: requests<3,>=2.25.1 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (2.25.1)
Requirement already satisfied: gitpython<4,>=3.1.12 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (3.1.18)
Requirement already satisfied: toml<1,>=0.10.2 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (0.10.2)
Requirement already satisfied: rich<11,>=10.0.0 in /usr/local/lib/python3.7/dist-packages (from aicrowd-cli) (10.4.0)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (3.0.4)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.10)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (1.24.3)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.25.1->aicrowd-cli) (2021.5.30)
Requirement already satisfied: gitdb<5,>=4.0.1 in /usr/local/lib/python3.7/dist-packages (from gitpython<4,>=3.1.12->aicrowd-cli) (4.0.7)
Requirement already satisfied: typing-extensions>=3.7.4.0; python_version < "3.8" in /usr/local/lib/python3.7/dist-packages (from gitpython<4,>=3.1.12->aicrowd-cli) (3.7.4.3)
Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (2.6.1)
Requirement already satisfied: colorama<0.5.0,>=0.4.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (0.4.4)
Requirement already satisfied: commonmark<0.10.0,>=0.9.0 in /usr/local/lib/python3.7/dist-packages (from rich<11,>=10.0.0->aicrowd-cli) (0.9.1)
Requirement already satisfied: smmap<5,>=3.0.1 in /usr/local/lib/python3.7/dist-packages (from gitdb<5,>=4.0.1->gitpython<4,>=3.1.12->aicrowd-cli) (4.0.0)
ERROR: datasets 1.8.0 has requirement tqdm<4.50.0,>=4.27, but you'll have tqdm 4.61.1 which is incompatible.
Installing collected packages: tqdm
  Found existing installation: tqdm 4.49.0
    Uninstalling tqdm-4.49.0:
      Successfully uninstalled tqdm-4.49.0
Successfully installed tqdm-4.61.1
API Key valid
Saved API Key successfully!
val.csv:   0% 0.00/714k [00:00<?, ?B/s]
train.csv:   0% 0.00/7.00M [00:00<?, ?B/s]

val.csv: 100% 714k/714k [00:00<00:00, 1.07MB/s]


test.csv: 100% 1.83M/1.83M [00:00<00:00, 2.08MB/s]

train.csv: 100% 7.00M/7.00M [00:01<00:00, 5.97MB/s]

In [ ]:

!pip install datasets transformers

Requirement already satisfied: datasets in /usr/local/lib/python3.7/dist-packages (1.8.0)
Requirement already satisfied: transformers in /usr/local/lib/python3.7/dist-packages (4.8.1)
Requirement already satisfied: multiprocess in /usr/local/lib/python3.7/dist-packages (from datasets) (0.70.12.2)
Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (from datasets) (1.1.5)
Requirement already satisfied: importlib-metadata; python_version < "3.8" in /usr/local/lib/python3.7/dist-packages (from datasets) (4.5.0)
Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.7/dist-packages (from datasets) (2.25.1)
Requirement already satisfied: dill in /usr/local/lib/python3.7/dist-packages (from datasets) (0.3.4)
Collecting tqdm<4.50.0,>=4.27
  Using cached https://files.pythonhosted.org/packages/73/d5/f220e0c69b2f346b5649b66abebb391df1a00a59997a7ccf823325bd7a3e/tqdm-4.49.0-py2.py3-none-any.whl
Requirement already satisfied: xxhash in /usr/local/lib/python3.7/dist-packages (from datasets) (2.0.2)
Requirement already satisfied: pyarrow<4.0.0,>=1.0.0 in /usr/local/lib/python3.7/dist-packages (from datasets) (3.0.0)
Requirement already satisfied: huggingface-hub<0.1.0 in /usr/local/lib/python3.7/dist-packages (from datasets) (0.0.12)
Requirement already satisfied: fsspec in /usr/local/lib/python3.7/dist-packages (from datasets) (2021.6.1)
Requirement already satisfied: packaging in /usr/local/lib/python3.7/dist-packages (from datasets) (20.9)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.7/dist-packages (from datasets) (1.19.5)
Requirement already satisfied: tokenizers<0.11,>=0.10.1 in /usr/local/lib/python3.7/dist-packages (from transformers) (0.10.3)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/dist-packages (from transformers) (3.13)
Requirement already satisfied: sacremoses in /usr/local/lib/python3.7/dist-packages (from transformers) (0.0.45)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.7/dist-packages (from transformers) (2019.12.20)
Requirement already satisfied: filelock in /usr/local/lib/python3.7/dist-packages (from transformers) (3.0.12)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas->datasets) (2.8.1)
Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist-packages (from pandas->datasets) (2018.9)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata; python_version < "3.8"->datasets) (3.4.1)
Requirement already satisfied: typing-extensions>=3.6.4; python_version < "3.8" in /usr/local/lib/python3.7/dist-packages (from importlib-metadata; python_version < "3.8"->datasets) (3.7.4.3)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->datasets) (3.0.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->datasets) (1.24.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->datasets) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->datasets) (2021.5.30)
Requirement already satisfied: pyparsing>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging->datasets) (2.4.7)
Requirement already satisfied: click in /usr/local/lib/python3.7/dist-packages (from sacremoses->transformers) (7.1.2)
Requirement already satisfied: joblib in /usr/local/lib/python3.7/dist-packages (from sacremoses->transformers) (1.0.1)
Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from sacremoses->transformers) (1.15.0)
ERROR: aicrowd-cli 0.1.7 has requirement tqdm<5,>=4.56.0, but you'll have tqdm 4.49.0 which is incompatible.
Installing collected packages: tqdm
  Found existing installation: tqdm 4.61.1
    Uninstalling tqdm-4.61.1:
      Successfully uninstalled tqdm-4.61.1
Successfully installed tqdm-4.49.0
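
The two pip logs above pull tqdm in opposite directions: aicrowd-cli 0.1.7 asks for `tqdm<5,>=4.56.0`, while datasets 1.8.0 asks for `tqdm<4.50.0,>=4.27`. A minimal sketch of why each install undoes the other, using plain tuple comparison rather than pip's real resolver (the `parse` and `satisfies` helpers are hypothetical and only handle simple dotted versions):

```python
def parse(version: str) -> tuple:
    """Turn '4.49.0' into (4, 49, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def satisfies(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper."""
    return parse(lower) <= parse(version) < parse(upper)

# (lower bound inclusive, upper bound exclusive), copied from the pip errors
aicrowd_cli_req = ("4.56.0", "5")     # tqdm<5,>=4.56.0
datasets_req = ("4.27", "4.50.0")     # tqdm<4.50.0,>=4.27

for candidate in ("4.49.0", "4.61.1"):
    print(candidate,
          "aicrowd-cli ok:", satisfies(candidate, *aicrowd_cli_req),
          "datasets ok:", satisfies(candidate, *datasets_req))
```

Neither installed version satisfies both ranges, which matches the ERROR lines in the logs. The notebook still ran with tqdm 4.49.0, so the conflict appears to have been harmless here.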

In [ ]:

import pandas as pd
import numpy as np
import os

import torch
import datasets
from datasets import load_dataset
from transformers import EncoderDecoderModel, EncoderDecoderConfig, BertTokenizerFast, Seq2SeqTrainingArguments, Seq2SeqTrainer, BertConfig

In [ ]:

train_dataset = pd.read_csv("data/train.csv")
validation_dataset = pd.read_csv("data/val.csv")
test_dataset = pd.read_csv("data/test.csv")

In [ ]:

dataset = load_dataset('csv', data_files={"train"     : ["data/train.csv"], 
                                          "validation": ["data/val.csv"], 
                                          "test"      : ["data/test.csv"]})

Using custom data configuration default-9c3a67fe7c873bf0

Downloading and preparing dataset csv/default (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to /root/.cache/huggingface/datasets/csv/default-9c3a67fe7c873bf0/0.0.0/2dc6629a9ff6b5697d82c25b73731dd440507a69cbce8b425db50b751e8fcfd0...

Dataset csv downloaded and prepared to /root/.cache/huggingface/datasets/csv/default-9c3a67fe7c873bf0/0.0.0/2dc6629a9ff6b5697d82c25b73731dd440507a69cbce8b425db50b751e8fcfd0. Subsequent calls will reuse this data.

In [ ]:

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

In [ ]:

MAX_TEXT_LENGTH = 150
MAX_LABEL_LENGTH = 150

def preprocess_function(sample):

    # Getting the shuffled text and the target label
    text = sample["text"]
    label = sample["label"]

    # Tokenizing the text and label, padding or truncating both to a fixed length
    inputs = tokenizer(text, padding="max_length", truncation=True, max_length=MAX_TEXT_LENGTH)
    outputs = tokenizer(label, padding="max_length", truncation=True, max_length=MAX_LABEL_LENGTH)

    sample["input_ids"] = inputs.input_ids
    sample["attention_mask"] = inputs.attention_mask
    sample["decoder_input_ids"] = outputs.input_ids
    sample["decoder_attention_mask"] = outputs.attention_mask
    sample["labels"] = outputs.input_ids

    # The labels are used to calculate the loss while training. Because padding was added to make all sequences the same length,
    # the padding token id ( 0 ) is replaced with ( -100 ) so that padded positions are ignored when the loss is calculated.
    # Why specifically -100? It is the default ignore_index of PyTorch's cross-entropy loss, which the model uses for the labels.
    sample["labels"] = [[-100 if token == tokenizer.pad_token_id else token for token in labels] for labels in sample["labels"]]

    return sample
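
To make the label masking step concrete, here is a small standalone sketch on a toy batch. It assumes the bert-base-uncased ids (0 for `[PAD]`, 101 for `[CLS]`, 102 for `[SEP]`); the `mask_padding` helper and the other token ids are made up for illustration and are not part of the notebook.

```python
PAD_TOKEN_ID = 0      # assumption: [PAD] id in bert-base-uncased
IGNORE_INDEX = -100   # default ignore_index of torch.nn.CrossEntropyLoss

def mask_padding(label_ids, pad_token_id=PAD_TOKEN_ID, ignore_index=IGNORE_INDEX):
    """Replace padding ids with ignore_index in a batch of token-id lists."""
    return [
        [ignore_index if token == pad_token_id else token for token in row]
        for row in label_ids
    ]

# Toy batch: 101 and 102 stand in for [CLS] and [SEP]; trailing zeros are padding.
batch = [
    [101, 2057, 2556, 102, 0, 0],
    [101, 2023, 102, 0, 0, 0],
]

masked = mask_padding(batch)
print(masked)
```

The original lists are left untouched, which matters here because `decoder_input_ids` must keep the real padding id while only `labels` gets the -100 entries.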

In [ ]:

BATCH_SIZE = 16
      
tokenized_datasets = dataset.map(preprocess_function, batch_size=BATCH_SIZE, batched=True)

In [ ]:

tokenized_datasets.set_format(
    type="torch", columns=["input_ids", "attention_mask", "decoder_input_ids", "decoder_attention_mask", "labels"],
)

In [ ]:

model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "bert-base-uncased")

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertLMHeadModel: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertLMHeadModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertLMHeadModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertLMHeadModel were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['bert.encoder.layer.11.crossattention.self.key.weight', 'bert.encoder.layer.10.crossattention.self.query.bias', 'bert.encoder.layer.7.crossattention.output.dense.bias', 'bert.encoder.layer.5.crossattention.self.query.weight', 'bert.encoder.layer.10.crossattention.self.query.weight', 'bert.encoder.layer.6.crossattention.output.dense.bias', 'bert.encoder.layer.3.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.1.crossattention.self.value.bias', 'bert.encoder.layer.3.crossattention.self.query.weight', 'bert.encoder.layer.6.crossattention.self.query.weight', 'bert.encoder.layer.11.crossattention.self.value.weight', 'bert.encoder.layer.4.crossattention.self.value.bias', 'bert.encoder.layer.7.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.2.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.10.crossattention.self.key.weight', 'bert.encoder.layer.9.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.11.crossattention.self.query.bias', 'bert.encoder.layer.6.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.0.crossattention.output.dense.weight', 'bert.encoder.layer.1.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.1.crossattention.self.key.weight', 'bert.encoder.layer.10.crossattention.output.dense.weight', 'bert.encoder.layer.5.crossattention.self.key.weight', 'bert.encoder.layer.7.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.8.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.6.crossattention.self.value.weight', 'bert.encoder.layer.2.crossattention.self.value.bias', 'bert.encoder.layer.7.crossattention.output.dense.weight', 'bert.encoder.layer.5.crossattention.output.dense.bias', 'bert.encoder.layer.9.crossattention.self.key.weight', 'bert.encoder.layer.8.crossattention.self.key.weight', 'bert.encoder.layer.4.crossattention.self.key.weight', 
'bert.encoder.layer.2.crossattention.output.dense.bias', 'bert.encoder.layer.3.crossattention.output.dense.bias', 'bert.encoder.layer.2.crossattention.self.value.weight', 'bert.encoder.layer.6.crossattention.self.query.bias', 'bert.encoder.layer.0.crossattention.self.query.bias', 'bert.encoder.layer.9.crossattention.self.query.bias', 'bert.encoder.layer.9.crossattention.output.dense.weight', 'bert.encoder.layer.5.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.2.crossattention.self.query.bias', 'bert.encoder.layer.6.crossattention.output.dense.weight', 'bert.encoder.layer.9.crossattention.self.value.weight', 'bert.encoder.layer.10.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.1.crossattention.self.key.bias', 'bert.encoder.layer.9.crossattention.self.value.bias', 'bert.encoder.layer.4.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.3.crossattention.self.value.weight', 'bert.encoder.layer.1.crossattention.self.value.weight', 'bert.encoder.layer.6.crossattention.self.value.bias', 'bert.encoder.layer.3.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.11.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.4.crossattention.self.value.weight', 'bert.encoder.layer.0.crossattention.self.value.bias', 'bert.encoder.layer.4.crossattention.self.query.bias', 'bert.encoder.layer.0.crossattention.self.value.weight', 'bert.encoder.layer.9.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.6.crossattention.self.key.bias', 'bert.encoder.layer.10.crossattention.self.key.bias', 'bert.encoder.layer.3.crossattention.self.query.bias', 'bert.encoder.layer.4.crossattention.output.dense.weight', 'bert.encoder.layer.0.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.10.crossattention.self.value.bias', 'bert.encoder.layer.9.crossattention.self.query.weight', 'bert.encoder.layer.2.crossattention.self.key.weight', 'bert.encoder.layer.4.crossattention.output.dense.bias', 
'bert.encoder.layer.7.crossattention.self.query.weight', 'bert.encoder.layer.2.crossattention.self.query.weight', 'bert.encoder.layer.11.crossattention.self.key.bias', 'bert.encoder.layer.3.crossattention.self.key.bias', 'bert.encoder.layer.8.crossattention.self.value.bias', 'bert.encoder.layer.1.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.8.crossattention.self.key.bias', 'bert.encoder.layer.2.crossattention.self.key.bias', 'bert.encoder.layer.11.crossattention.output.dense.weight', 'bert.encoder.layer.9.crossattention.output.dense.bias', 'bert.encoder.layer.1.crossattention.output.dense.bias', 'bert.encoder.layer.5.crossattention.output.dense.weight', 'bert.encoder.layer.4.crossattention.self.query.weight', 'bert.encoder.layer.10.crossattention.self.value.weight', 'bert.encoder.layer.1.crossattention.self.query.weight', 'bert.encoder.layer.4.crossattention.self.key.bias', 'bert.encoder.layer.0.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.10.crossattention.output.dense.bias', 'bert.encoder.layer.8.crossattention.output.dense.bias', 'bert.encoder.layer.0.crossattention.self.query.weight', 'bert.encoder.layer.7.crossattention.self.key.weight', 'bert.encoder.layer.2.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.0.crossattention.self.key.bias', 'bert.encoder.layer.0.crossattention.output.dense.bias', 'bert.encoder.layer.1.crossattention.self.query.bias', 'bert.encoder.layer.8.crossattention.self.value.weight', 'bert.encoder.layer.2.crossattention.output.dense.weight', 'bert.encoder.layer.3.crossattention.output.dense.weight', 'bert.encoder.layer.8.crossattention.output.dense.weight', 'bert.encoder.layer.5.crossattention.self.value.weight', 'bert.encoder.layer.7.crossattention.self.value.bias', 'bert.encoder.layer.5.crossattention.self.value.bias', 'bert.encoder.layer.5.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.9.crossattention.self.key.bias', 'bert.encoder.layer.7.crossattention.self.value.weight', 
'bert.encoder.layer.8.crossattention.self.query.bias', 'bert.encoder.layer.4.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.5.crossattention.self.query.bias', 'bert.encoder.layer.3.crossattention.self.key.weight', 'bert.encoder.layer.11.crossattention.self.value.bias', 'bert.encoder.layer.10.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.7.crossattention.self.query.bias', 'bert.encoder.layer.3.crossattention.self.value.bias', 'bert.encoder.layer.8.crossattention.self.query.weight', 'bert.encoder.layer.7.crossattention.self.key.bias', 'bert.encoder.layer.6.crossattention.self.key.weight', 'bert.encoder.layer.6.crossattention.output.LayerNorm.weight', 'bert.encoder.layer.11.crossattention.output.LayerNorm.bias', 'bert.encoder.layer.0.crossattention.self.key.weight', 'bert.encoder.layer.11.crossattention.output.dense.bias', 'bert.encoder.layer.11.crossattention.self.query.weight', 'bert.encoder.layer.1.crossattention.output.dense.weight', 'bert.encoder.layer.5.crossattention.self.key.bias', 'bert.encoder.layer.8.crossattention.output.LayerNorm.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

In [ ]:

model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.eos_token_id = tokenizer.sep_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.vocab_size = model.config.encoder.vocab_size

In [ ]:

N_EPOCHS = 10

args = Seq2SeqTrainingArguments(
    "Scambled Text",
    evaluation_strategy = "epoch",
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=BATCH_SIZE,
    num_train_epochs=N_EPOCHS,
    fp16=True,
    save_strategy="epoch",
    save_total_limit=5,
)

In [ ]:

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['validation'],
)

Using amp fp16 backend

In [15]:

trainer.train()

The following columns in the training set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running training *****
  Num examples = 40001
  Num Epochs = 10
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 25010
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:1299: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  args.max_grad_norm,

The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-2501
Configuration saved in Scambled Text/checkpoint-2501/config.json
Model weights saved in Scambled Text/checkpoint-2501/pytorch_model.bin

[25010/25010 5:18:56, Epoch 10/10]

Epoch	Training Loss	Validation Loss
1	1.272700	0.987479
2	0.788000	0.736117
3	0.541200	0.642489
4	0.367600	0.612808
5	0.247600	0.612991
6	0.170100	0.618677
7	0.114400	0.624014
8	0.073700	0.639887
9	0.049200	0.650224
10	0.034800	0.648488
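
Reading the table: training loss keeps falling, but validation loss bottoms out around epoch 4 and drifts up afterwards, which suggests the later epochs mostly overfit. A quick sketch that picks the best epoch from the numbers above (values copied from the table; nothing here touches the trainer):

```python
# (epoch, training loss, validation loss) copied from the table above
history = [
    (1, 1.272700, 0.987479),
    (2, 0.788000, 0.736117),
    (3, 0.541200, 0.642489),
    (4, 0.367600, 0.612808),
    (5, 0.247600, 0.612991),
    (6, 0.170100, 0.618677),
    (7, 0.114400, 0.624014),
    (8, 0.073700, 0.639887),
    (9, 0.049200, 0.650224),
    (10, 0.034800, 0.648488),
]

# Epoch with the lowest validation loss
best_epoch, _, best_val = min(history, key=lambda row: row[2])

# Later epochs whose validation loss is worse than the best one
worse_after = [epoch for epoch, _, val in history if epoch > best_epoch and val > best_val]

print("best epoch:", best_epoch, "validation loss:", best_val)
print("later epochs with higher validation loss:", worse_after)
```

Note that with `save_total_limit=5` the epoch-4 checkpoint (step 4 × 2501 = 10004) is deleted later in the log, so predictions come from the final epoch. One option, not used in this notebook, would be `load_best_model_at_end=True` in the training arguments so the trainer keeps and reloads the best checkpoint.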

The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-5002
Configuration saved in Scambled Text/checkpoint-5002/config.json
Model weights saved in Scambled Text/checkpoint-5002/pytorch_model.bin
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:1299: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  args.max_grad_norm,
The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-7503
Configuration saved in Scambled Text/checkpoint-7503/config.json
Model weights saved in Scambled Text/checkpoint-7503/pytorch_model.bin
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:1299: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  args.max_grad_norm,
The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-10004
Configuration saved in Scambled Text/checkpoint-10004/config.json
Model weights saved in Scambled Text/checkpoint-10004/pytorch_model.bin
Deleting older checkpoint [Scambled Text/checkpoint-1251] due to args.save_total_limit
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:1299: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  args.max_grad_norm,
The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-12505
Configuration saved in Scambled Text/checkpoint-12505/config.json
Model weights saved in Scambled Text/checkpoint-12505/pytorch_model.bin
Deleting older checkpoint [Scambled Text/checkpoint-2502] due to args.save_total_limit
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:1299: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  args.max_grad_norm,
The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-15006
Configuration saved in Scambled Text/checkpoint-15006/config.json
Model weights saved in Scambled Text/checkpoint-15006/pytorch_model.bin
Deleting older checkpoint [Scambled Text/checkpoint-2501] due to args.save_total_limit
The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-17507
Configuration saved in Scambled Text/checkpoint-17507/config.json
Model weights saved in Scambled Text/checkpoint-17507/pytorch_model.bin
Deleting older checkpoint [Scambled Text/checkpoint-5002] due to args.save_total_limit
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:1299: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  args.max_grad_norm,
The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-20008
Configuration saved in Scambled Text/checkpoint-20008/config.json
Model weights saved in Scambled Text/checkpoint-20008/pytorch_model.bin
Deleting older checkpoint [Scambled Text/checkpoint-7503] due to args.save_total_limit
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:1299: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  args.max_grad_norm,
The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-22509
Configuration saved in Scambled Text/checkpoint-22509/config.json
Model weights saved in Scambled Text/checkpoint-22509/pytorch_model.bin
Deleting older checkpoint [Scambled Text/checkpoint-10004] due to args.save_total_limit
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:1299: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  args.max_grad_norm,
The following columns in the evaluation set  don't have a corresponding argument in `EncoderDecoderModel.forward` and have been ignored: text.
***** Running Evaluation *****
  Num examples = 4001
  Batch size = 16
Saving model checkpoint to Scambled Text/checkpoint-25010
Configuration saved in Scambled Text/checkpoint-25010/config.json
Model weights saved in Scambled Text/checkpoint-25010/pytorch_model.bin
Deleting older checkpoint [Scambled Text/checkpoint-12505] due to args.save_total_limit


Training completed. Do not forget to share your model on huggingface.co/models =)

Out[15]:

TrainOutput(global_step=25010, training_loss=0.46864897358279284, metrics={'train_runtime': 19137.0824, 'train_samples_per_second': 20.902, 'train_steps_per_second': 1.307, 'total_flos': 1.7810609046094797e+17, 'train_loss': 0.46864897358279284, 'epoch': 10.0})
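
The step counts and throughput figures in the log can be cross-checked with a little arithmetic. This is only a consistency check on the numbers printed above, assuming one optimization step per batch (gradient accumulation is 1) and a partial last batch in each epoch:

```python
import math

num_examples = 40001          # "Num examples" from the training log
batch_size = 16               # BATCH_SIZE
num_epochs = 10               # N_EPOCHS
train_runtime_s = 19137.0824  # 'train_runtime' from TrainOutput

# 40001 / 16 is not a whole number, so the last batch of each epoch is partial
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs

steps_per_second = total_steps / train_runtime_s
samples_per_second = num_examples * num_epochs / train_runtime_s

print(steps_per_epoch, total_steps)
print(round(steps_per_second, 3), round(samples_per_second, 3))
```

These agree with the reported 25010 total optimization steps, 1.307 steps per second and 20.902 samples per second, and 2501 steps per epoch explains the checkpoint names (checkpoint-2501, checkpoint-5002, and so on).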

In [16]:

def generate_predictions(batch):

    # Tokenizing the shuffled test text
    inputs = tokenizer(batch["text"], padding="max_length", truncation=True, max_length=MAX_TEXT_LENGTH, return_tensors="pt")

    # Sending the tensors to the device the model is on (the GPU after training)
    input_ids = inputs.input_ids.to(model.device)
    attention_mask = inputs.attention_mask.to(model.device)

    # Generating the predicted token ids. max_length is passed explicitly because generate()
    # otherwise falls back to model.config.max_length, which defaults to 20 tokens and would cut predictions short.
    outputs = model.generate(input_ids, attention_mask=attention_mask, max_length=MAX_LABEL_LENGTH)

    # Converting the token ids back to sentences
    output_str = tokenizer.batch_decode(outputs, skip_special_tokens=True)

    batch["predictions"] = output_str

    return batch

In [17]:

results = dataset['test'].map(generate_predictions, batched=True, batch_size=16)

In [18]:

test_dataset

Out[18]:

	id	text	label
0	0	safely objects move. that system images detect...	system Using approach, move. safely skip of th...
1	1	We detectors popular influence of confidences ...	confidences popular different in detectors We ...
2	2	compact coding present supervised We a approac...	We a supervised approach, compact present codi...
3	3	study high-throughput vital of quantitative be...	is for and individuals study of of collective ...
4	4	on data sets. We evaluate method many challeng...	sets. challenging the data We method on many e...
...	...	...	...
9995	9995	particular i.e. problem, of to due However, na...	the i.e. of problem, However, particular to th...
9996	9996	Simulation methods. proposed outperforming met...	state-of-the-art the that of Simulation demons...
9997	9997	in This view introduces a scenarios. paper tec...	used label introduces in placement scenarios. ...
9998	9998	valve. water from water are and pipeline sourc...	source noise interference of The are valve. pi...
9999	9999	localization leak analyzed. The leak the compu...	The leak analyzed. leak distance against actua...

10000 rows × 3 columns

In [19]:

test_dataset['label'] = results['predictions']
test_dataset

Out[19]:

	id	text	label
0	0	safely objects move. that system images detect...	using this approach, skip images of objects th...
1	1	We detectors popular influence of confidences ...	we had studied the influence of three popular ...
2	2	compact coding present supervised We a approac...	we present a compact coding approach, supervis...
3	3	study high-throughput vital of quantitative be...	quantitative and high - throughput measurement...
4	4	on data sets. We evaluate method many challeng...	we evaluate the proposed method on many challe...
...	...	...	...
9995	9995	particular i.e. problem, of to due However, na...	however, due to the nature of the particular p...
9996	9996	Simulation methods. proposed outperforming met...	simulation results demonstrate that the propos...
9997	9997	in This view introduces a scenarios. paper tec...	this paper introduces a technique for street v...
9998	9998	valve. water from water are and pipeline sourc...	the pipeline of water, motor, water and interf...
9999	9999	localization leak analyzed. The leak the compu...	the accuracy computed from the actual leak dis...

10000 rows × 3 columns
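
Since the task is de-shuffling, a prediction should ideally reuse exactly the words of the shuffled input. Below is a rough sanity check, not the competition metric: it measures how much of the input's word multiset shows up in the prediction. The `word_overlap` helper and the example sentences are hypothetical.

```python
from collections import Counter

def word_overlap(shuffled: str, prediction: str) -> float:
    """Fraction of the shuffled input's words (as a multiset) found in the prediction."""
    source = Counter(shuffled.lower().split())
    predicted = Counter(prediction.lower().split())
    shared = sum((source & predicted).values())  # multiset intersection
    total = sum(source.values())
    return shared / total if total else 0.0

# Toy examples, lowercased on both sides because bert-base-uncased output is lowercase
full = word_overlap("present We approach a", "we present a approach")
partial = word_overlap("present We approach a", "we present a method")
print(full, partial)
```

Whitespace splitting is a simplification: the decoded predictions above contain forms like "high - throughput", which would not match "high-throughput" in the input, so low scores from this check need a manual look before drawing conclusions.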

In [20]:

!mkdir assets
test_dataset.to_csv(os.path.join("assets", "submission.csv"), index=False)

