Loading

AGEPR

[Getting Started Notebook] AGEPR Challange

This is a Baseline Code to get you started with the challenge.

gauransh_k

You can use this code to start understanding the data and create a baseline model for further imporvments.

Starter Code for AGEPR Practice Challange

Note : Create a copy of the notebook and use the copy for submission. Go to File > Save a Copy in Drive to create a new copy

Downloading Dataset

Installing aicrowd-cli

In [1]:
!pip install aicrowd-cli
%load_ext aicrowd.magic
Requirement already satisfied: aicrowd-cli in /home/gauransh/anaconda3/lib/python3.8/site-packages (0.1.10)
Requirement already satisfied: click<8,>=7.1.2 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from aicrowd-cli) (7.1.2)
Requirement already satisfied: requests-toolbelt<1,>=0.9.1 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from aicrowd-cli) (0.9.1)
Requirement already satisfied: requests<3,>=2.25.1 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from aicrowd-cli) (2.26.0)
Requirement already satisfied: GitPython==3.1.18 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from aicrowd-cli) (3.1.18)
Requirement already satisfied: toml<1,>=0.10.2 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from aicrowd-cli) (0.10.2)
Requirement already satisfied: tqdm<5,>=4.56.0 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from aicrowd-cli) (4.62.2)
Requirement already satisfied: pyzmq==22.1.0 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from aicrowd-cli) (22.1.0)
Requirement already satisfied: rich<11,>=10.0.0 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from aicrowd-cli) (10.15.2)
Requirement already satisfied: gitdb<5,>=4.0.1 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from GitPython==3.1.18->aicrowd-cli) (4.0.9)
Requirement already satisfied: smmap<6,>=3.0.1 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from gitdb<5,>=4.0.1->GitPython==3.1.18->aicrowd-cli) (5.0.0)
Requirement already satisfied: idna<4,>=2.5 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from requests<3,>=2.25.1->aicrowd-cli) (3.1)
Requirement already satisfied: charset-normalizer~=2.0.0 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from requests<3,>=2.25.1->aicrowd-cli) (2.0.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from requests<3,>=2.25.1->aicrowd-cli) (1.26.6)
Requirement already satisfied: certifi>=2017.4.17 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from requests<3,>=2.25.1->aicrowd-cli) (2021.5.30)
Requirement already satisfied: colorama<0.5.0,>=0.4.0 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from rich<11,>=10.0.0->aicrowd-cli) (0.4.4)
Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from rich<11,>=10.0.0->aicrowd-cli) (2.10.0)
Requirement already satisfied: commonmark<0.10.0,>=0.9.0 in /home/gauransh/anaconda3/lib/python3.8/site-packages (from rich<11,>=10.0.0->aicrowd-cli) (0.9.1)
In [2]:
%aicrowd login
Please login here: https://api.aicrowd.com/auth/tK8jq9FaBgEDD9GACzX7OUfbHwY5nggWZgRT9hIvBqI
Opening in existing browser session.
API Key valid
Saved API Key successfully!
In [2]:
!rm -rf data
!mkdir data
%aicrowd ds dl -c agepr -o data

Importing Libraries

In this baseline, we will be using skleanr library to train the model and generate the predictions

In [31]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import normalize
from sklearn.linear_model import LinearRegression 
import os
from IPython.display import display

Reading the dataset

Here, we will read the train.csv which contains both training samples & labels, and test.csv which contains testing samples.

In [2]:
# Reading the CSV
train_data_df = pd.read_csv("data/train.csv")
test_data_df = pd.read_csv("data/test.csv")

# train_data.shape, test_data.shape
display(train_data_df.head())
display(test_data_df.head())
print(train_data_df.shape)
0 1 2 3 4 5 6 7 8
0 3 0.325 0.240 0.075 0.1520 0.0650 0.0305 0.0450 6
1 3 0.530 0.380 0.125 0.6160 0.2920 0.1130 0.1850 8
2 2 0.670 0.530 0.205 1.4015 0.6430 0.2465 0.4160 12
3 3 0.370 0.285 0.095 0.2260 0.1135 0.0515 0.0675 8
4 3 0.575 0.450 0.145 0.7950 0.3640 0.1505 0.2600 10
0 1 2 3 4 5 6 7
0 2 0.470 0.360 0.110 0.4965 0.2370 0.1270 0.130
1 1 0.455 0.350 0.105 0.4010 0.1575 0.0830 0.135
2 3 0.395 0.290 0.095 0.3000 0.1580 0.0680 0.078
3 1 0.585 0.465 0.150 0.9800 0.4315 0.2545 0.247
4 2 0.630 0.485 0.185 1.1670 0.5480 0.2485 0.340
(3340, 9)
In [17]:
train_data_df.describe()
Out[17]:
0 1 2 3 4 5 6 7 8
count 3340.000000 3340.000000 3340.000000 3340.000000 3340.000000 3340.000000 3340.000000 3340.000000 3340.000000
mean 1.950000 0.524409 0.408424 0.139409 0.830479 0.359726 0.181365 0.239702 9.966467
std 0.826459 0.120586 0.099743 0.042123 0.490222 0.221377 0.110196 0.139798 3.245207
min 1.000000 0.075000 0.055000 0.000000 0.002000 0.001000 0.000500 0.001500 1.000000
25% 1.000000 0.450000 0.350000 0.115000 0.441000 0.186375 0.093500 0.130000 8.000000
50% 2.000000 0.545000 0.425000 0.140000 0.802500 0.335500 0.171000 0.235000 10.000000
75% 3.000000 0.615000 0.480000 0.165000 1.157875 0.504125 0.255125 0.330000 11.000000
max 3.000000 0.815000 0.650000 1.130000 2.779500 1.488000 0.760000 1.005000 29.000000

Data Preprocessing

In [32]:
# Separating data from the dataframe for final training and normalising it for better predictions
X = normalize(train_data_df.drop("8", axis=1).to_numpy())
y = train_data_df["8"].to_numpy()
print(X.shape, y.shape)
(3340, 8) (3340,)

Splitting the data

In [33]:
# Splitting the training set, and training & validation
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)
print(X_train.shape)
print(y_train.shape)
(2672, 8)
(2672,)
In [34]:
X_train[0], y_train[0]
Out[34]:
(array([0.55504738, 0.34967985, 0.28029893, 0.09435805, 0.60583422,
        0.25615437, 0.1476426 , 0.16651421]),
 9)

Training the Model

In [35]:
model = LinearRegression()
model.fit(X_train, y_train)
Out[35]:
LinearRegression()

Validation

In [36]:
model.score(X_val, y_val)
Out[36]:
0.5508491200549271

So, we are done with the baseline let's test with real testing data and see how we submit it to challange.

Predictions

In [37]:
# Separating data from the dataframe for final testing
X_test = test_data_df.to_numpy()
print(X_test.shape)
(836, 8)
In [38]:
# Predicting the labels
predictions = model.predict(X_test)
predictions.shape
Out[38]:
(836,)
In [39]:
# Converting the predictions array into pandas dataset
submission = pd.DataFrame({"age":predictions})
submission
Out[39]:
age
0 25.001010
1 13.558783
2 34.245491
3 21.192935
4 38.389748
... ...
831 31.141963
832 32.304479
833 35.316612
834 34.156687
835 25.725141

836 rows × 1 columns

In [40]:
# Saving the pandas dataframe
!rm -rf assets
!mkdir assets
submission.to_csv(os.path.join("assets", "submission.csv"), index=False)

Submitting our Predictions

Note : Please save the notebook before submitting it (Ctrl + S)

In [41]:
!!aicrowd submission create -c agepr -f assets/submission.csv
Out[41]:
['submission.csv ━━━━━━━━━━━━━━━━━━━━━━ 100.0% • 17.0/15.4 KB • 3.9 MB/s • 0:00:00',
 '                                  ╭─────────────────────────╮                                  ',
 '                                  │ Successfully submitted! │                                  ',
 '                                  ╰─────────────────────────╯                                  ',
 '                                        Important links                                        ',
 '┌──────────────────┬──────────────────────────────────────────────────────────────────────────┐',
 '│  This submission │ https://www.aicrowd.com/challenges/agepr/submissions/167674              │',
 '│                  │                                                                          │',
 '│  All submissions │ https://www.aicrowd.com/challenges/agepr/submissions?my_submissions=true │',
 '│                  │                                                                          │',
 '│      Leaderboard │ https://www.aicrowd.com/challenges/agepr/leaderboards                    │',
 '│                  │                                                                          │',
 '│ Discussion forum │ https://discourse.aicrowd.com/c/agepr                                    │',
 '│                  │                                                                          │',
 '│   Challenge page │ https://www.aicrowd.com/challenges/agepr                                 │',
 '└──────────────────┴──────────────────────────────────────────────────────────────────────────┘',
 "{'submission_id': 167674, 'created_at': '2021-12-12T18:33:37.748Z'}"]
In [ ]:


Comments

You must login before you can post a comment.

Execute