simon_coulombe

Location: CA
3 Followers · 2 Following


Challenges Entered

Insurance Pricing Game: play in a realistic insurance market, compete for profit!

Insurance pricing game

Access to the test data?

Over 4 years ago

Hey @alfarzan & friends :slight_smile:

I was wondering if it would be possible to have access to the test data now that the competition is over.

I'd like to have the opportunity to score the shared solutions and simulate my own little market to get better feedback on what happened, without asking you to do more work.

cheers

1st place solution

Over 4 years ago

congrats! and thanks for sharing!

Uncertainty quantification type models

Over 4 years ago

:heart_eyes:
(post must be at least 20 characters)

Ideas for an eventual next edition

Over 4 years ago

Hey all,

I'm sure lots of people have ideas for a future edition. I thought now might be the last chance to discuss them.

Here are mine:

  1. Change the way we are given the training data so that we are always tested "in the future". This would involve gradually feeding us a larger training set. It would look like this: say the whole dataset covers 5 years and is split into folds A, B, C, D, E and F (a policy_id is always in the same fold).
    Week 1: we train on year 1 for folds A, B, C and D. We are tested on year 2 for folds A, B and E.
    Week 2: same training data, but we are tested on year 2 for folds C, D and F.
    Week 3: NEW TRAINING DATA: we now have access to years 1 and 2 for folds A, B, C, D, and we are tested on year 3 for folds A, B and E.
    Week 4: same training data, tested on year 3 for folds C, D and F.
    Week 5: new training data: we now have access to years 1-3 for folds A, B, C, D, tested on year 4 for folds A, B and E.
    Week 6: same training data, tested on year 4 for folds C, D and F.
    Week 7: new training data: we now have the full training set (years 1-4), tested on year 5 for folds A, B and E.
    Final week: same training data, tested on year 5 for folds C, D and F.

A big con is that inactive people would need to at least refit their models on weeks 3, 5 and 7. A solution would be to have Ali & crew refit inactive people's models on the new training set using the fit_model() function.
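A rough sketch of that schedule, purely illustrative (the function and names are mine):

```python
# Purely illustrative sketch of the proposed schedule.
# Folds A-D are always available for training; E and F only ever
# appear in the test sets.

def build_schedule(n_years=5):
    """Return, per week, the training years and the (test year, test folds)."""
    schedule = []
    for test_year in range(2, n_years + 1):      # always tested "in the future"
        train_years = list(range(1, test_year))  # every year before the test year
        for test_folds in (["A", "B", "E"], ["C", "D", "F"]):
            schedule.append({"train_years": train_years,
                             "test_year": test_year,
                             "test_folds": test_folds})
    return schedule

for week, spec in enumerate(build_schedule(), start=1):
    print(f"Week {week}: train on years {spec['train_years']}, "
          f"test on year {spec['test_year']}, folds {spec['test_folds']}")
```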

  2. I wouldn't do a "cumulative" profit approach, because a bad week would disqualify people and make them create a new account to start from scratch, which wouldn't be fun and would also be hell to monitor. However, a "championship" where you accumulate points like in Formula 1 could be interesting. A "crash" simply means you earn 0 points. I'd only start accumulating points about halfway through the championship so that early adopters don't have too big an advantage. I'd also give more points for the last week to keep the suspense.

  3. Provide a bootstrapped estimate of the variance of the leaderboard by generating a bunch of "small test sets" sampled from the "big test set" (see the sketch after this list).

  4. We really need to find a way to give better feedback, but I can't think of a non-hackable way.

  5. We need to find a way to make sure that "selling 1 policy and hoping it doesn't make a claim" is no longer a strategy that can secure a spot in the top 12. A simple fix is disqualifying companies with less than 1/5 of a normal market share (in our case 10% / 5 = 2%), but I'd rather find something less arbitrary.
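For idea 3, a minimal sketch of what I mean, assuming each submission's result could be recomputed on arbitrary subsets (`profit_per_policy` is a hypothetical per-policy profit vector):

```python
import numpy as np

rng = np.random.default_rng(42)

def leaderboard_spread(profit_per_policy, n_boot=1000, frac=0.2):
    """Bootstrap small test sets out of the big one and return the
    mean and standard deviation of total profit across resamples."""
    size = int(frac * len(profit_per_policy))
    totals = np.array([
        rng.choice(profit_per_policy, size=size, replace=True).sum()
        for _ in range(n_boot)
    ])
    return totals.mean(), totals.std()

# Toy example: a book of 10,000 policies with noisy per-policy profits.
profits = rng.normal(5, 50, size=10_000)
mean, sd = leaderboard_spread(profits)
print(f"profit on a small test set: {mean:.0f} +/- {sd:.0f}")
```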

One more post about bugs

Over 4 years ago

I'm sorry to see this happened to you, this sucks. (And I'm guessing the folks at AIcrowd feel terrible.)

The heatmap is weird: shouldn't you be at "100% often" since you got a 100% market share?

It's (almost) over! sharing approaches

Over 4 years ago

this thread in a picture

A thought for people with frequency-severity models

Over 4 years ago

more reading on the pile - thanks for the link (and the summary)!

A thought for people with frequency-severity models

Over 4 years ago

I mean adding a feature named "predicted_frequency" to the severity model and checking if that improves the severity model.
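A minimal sketch of what I mean, with made-up column names and sklearn's GLMs standing in for whatever models you actually use:

```python
# Sketch only: the data shapes are assumed, and sklearn GLMs stand in
# for the real frequency/severity models.
import numpy as np
from sklearn.linear_model import PoissonRegressor, GammaRegressor

def fit_freq_then_sev(X, n_claims, exposure, total_claim_amount):
    # 1. Frequency model on the whole book.
    freq = PoissonRegressor(alpha=1e-4)
    freq.fit(X, n_claims / exposure, sample_weight=exposure)
    # In practice you'd want out-of-fold frequency predictions here,
    # to avoid leaking the training labels into the severity feature.
    pred_freq = freq.predict(X)

    # 2. Severity model on claimants only, with the predicted frequency
    #    appended as an extra feature.
    claimed = n_claims > 0
    X_sev = np.column_stack([X[claimed], pred_freq[claimed]])
    sev = GammaRegressor(alpha=1e-4)
    sev.fit(X_sev, total_claim_amount[claimed] / n_claims[claimed],
            sample_weight=n_claims[claimed])
    return freq, sev
```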

I thought about this because back in the day I was interested in a different type of discrete-continuous model: "which type of heating system do people have in their house" and "how much gas do they use if they picked gas?", where the predicted probabilities for all the other systems would work their way into the gas consumption model to correct some bias (Dubin and McFadden 1984, don't read it): https://econ.ucsb.edu/~tedb/Courses/GraduateTheoryUCSB/mcfaddendubin.pdf

An example explanation then would be: "if you picked gas (higher cost up front than electricity, lower cost per energy unit, so only economical when you need a lot of energy) despite having a really small house (measured), then you probably have really crappy insulation (not measured) and will probably consume more energy than would have been predicted from your small square footage alone."

In that case, the estimate of the coefficient for the relation between "square footage" and "gas consumption" would be biased downward, since all big houses get gas but only badly insulated small houses do.

It's not the same purpose, but maybe there's some signal left.

I didn't think about this for long - this might be one of my 111 out of 112 ideas that are useless :slight_smile:

A thought for people with frequency-severity models

Over 4 years ago

Did you try adding the predicted frequency to the severity model?

Maybe "being unlikely to make a claim" means you only call your insurer after a disaster?

Legal / ethical aspects and other obligation

Over 4 years ago

I've also heard that European actuaries eat babies. I'm not asking for confirmation, that one has been confirmed.

Legal / ethical aspects and other obligation

Over 4 years ago

thanks mate! I'm a big believer in the value of "learning in public". Sometimes I look like a fool, but much more often I get some really cool insight from knowledgeable people that I wouldn't have received otherwise. Overall, it's totally worth it :slight_smile:

edit: also, this:

Legal / ethical aspects and other obligation

Over 4 years ago

I've heard rumors of European insurers charging different prices depending on the day of the week you were born (Monday, Thursday…) to allow them to calculate price elasticities. Would love to have it confirmed or denied though.

Sharing of industrial practice

Over 4 years ago

Not an actuary, but from what Iโ€™ve seen in 3 insurance companies, the pricing is also mostly GLMs in Canada.

This is mostly due to the regulators, which vary by province. Non-pricing models such as fraud detection, churn probability or conversion rate can lean more towards machine learning and less towards interpretability.

Who else got a call from Ali?

Over 4 years ago

Haha!

I meant to joke that I had won, but summoning you from a star pattern painted on the floor with burning candles at each point works too.

I had forgotten about the debugging-everyone part - sorry you're having a terrible week.

Who else got a call from Ali?

Over 4 years ago

just kidding, carry on
:slight_smile:

"Asymmetric" loss function?

Over 4 years ago

wow… wow!

Thanks for taking the time to dig into this and share your results. I've only been cheerleading so far, but all the work you and @Calico have shared is really interesting :slight_smile: Super cool to have Calico's initial idea of a linear increase show up in @guillaume_bs's simulation.

I'm trying to come up with a simulation where there are 2 insurers and 2 groups of clients. Groups A and B have the same average claim, but group B is much more variable. One insurer is aware of that, but the other is not. How badly is the unaware insurer going to get hurt? I'll let this marinate for a bit :slight_smile:
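The setup I have in mind might look something like this (all the numbers, prices and distribution choices are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
n, years = 100, 10_000        # policies per group, simulated years
mean_claim = 100.0

# Same expected claim for both groups; group B is far more variable.
# Gamma(shape, scale): mean = shape*scale, variance = shape*scale**2.
claims_a = rng.gamma(shape=10.0, scale=mean_claim / 10.0, size=(years, n))
claims_b = rng.gamma(shape=0.5, scale=mean_claim / 0.5, size=(years, n))

# The aware insurer loads the volatile group; the unaware one charges
# everyone the same. Cheapest price wins each group, so the aware
# insurer writes all of group A and the unaware insurer all of group B.
price_aware_a, price_aware_b, price_unaware = 105.0, 115.0, 110.0

profit_aware = price_aware_a * n - claims_a.sum(axis=1)
profit_unaware = price_unaware * n - claims_b.sum(axis=1)

for name, p in [("aware (writes group A)", profit_aware),
                ("unaware (writes group B)", profit_unaware)]:
    print(f"{name}: mean {p.mean():,.0f}, sd {p.std():,.0f}, "
          f"share of losing years {(p < 0).mean():.1%}")
```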

"Asymmetric" loss function?

Over 4 years ago

really nice work!
12% seems pretty low compared to what most folks ended up charging. I'll go back to the solution-sharing thread to see if you posted your final % :slight_smile:

"Asymmetric" loss function?

Over 4 years ago

Thatโ€™s super cool!

It doesn't really have to be an "asymmetric loss function", just something that reflects the fact that I'm more careful when giving rebates than when charging more.

Your model ended up loading a higher percentage onto policies with lower predicted claims, and that fits the spirit of what I'm looking for :slight_smile:

As a percentage of the premium, what did 2 times the standard deviation typically represent on an average premium of $100?

"Asymmetric" loss function?

Over 4 years ago

Thanks for sharing!

I also had an "other" category in my model for cars that were less frequent, but there's a lot of different cars in that category, many of which probably have wildly different expected claims.

I'm also very interested in something to account for the uncertainty around a single premium.

"Asymmetric" loss function?

Over 4 years ago

Interesting!
What did you do with your premium once you had your final prediction and a couple more predictions to assess variance?

Here's how I imagine this:
I have a model trained on 100% of the population and 5 other models trained on 40% samples of the population, drawn with replacement. (I could also just reuse the 5 models I already trained for cross-validation.)

Person A is predicted a $200 claim by the "main" model and $180, $190, $200, $210 and $220 by the small models.

Person B is also predicted a $200 claim by the "main" model, but $140, $170, $200, $230 and $260 by the small models.

Clearly, I want to charge person B more. My understanding from your comments is that there is no set formula in the actuarial world for that. Guess we'll have to simulate it… :slight_smile:
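One arbitrary rule to simulate, as a sketch: load the premium by some multiple of the spread of the small models (the multiplier k below is made up):

```python
import numpy as np

def premium_with_spread_loading(main_pred, subsample_preds, k=0.5):
    """Load the premium by k times the spread of the subsample models.
    `main_pred` is the full-data model's predicted claim; `subsample_preds`
    has one row per policy, one column per 40%-sample model.
    The loading rule (k * std) is just one arbitrary choice to try."""
    spread = subsample_preds.std(axis=1)
    return main_pred + k * spread

# The two people from the post: same central prediction, different spread.
main = np.array([200.0, 200.0])
small = np.array([
    [180.0, 190.0, 200.0, 210.0, 220.0],   # person A
    [140.0, 170.0, 200.0, 230.0, 260.0],   # person B
])
print(premium_with_spread_loading(main, small))  # B gets the bigger premium
```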

data tinkerer | cloud shoveller