
Novartis DSAI Challenge


Reimagine Probability of Success



Which of pharma’s biggest challenges can we tackle with data science & AI if we put our collective minds together? Can you help find one of pharma’s holy grails? Today, predicting the Probability of Success (PoS) of a development program is based on simple industry benchmarks. Probability of Success is used to support decision making at the highest level: should we invest potentially hundreds of millions of dollars in a compound and run a Phase 3 trial? Data can guide those decisions, and there are several approaches to doing so. This challenge is of such importance that we have teamed up with one of the world’s foremost experts, Prof. Andrew Lo of MIT’s Sloan School of Management.


  • The objective: There are several questions for you to work on, details here. You are expected to submit your programs and a presentation (details coming).

Data Sources

  • Data sources: This challenge focuses on publicly available data that contains historical information at the trial and program level. The data will be available in Aridhia. You may want to connect or upload new data. If you wish to make this available for the leaderboard evaluation, please contact Nick Kelley.

Data dictionaries, some query files and lookup table names, and future API instructions for data models and linking (for which we have just been approved) will be in the Resources section. The master variables description table will be kept and updated here.

Getting Started

Installing software: new guidance is available on the Resources page; see "README - Installing Software 01.pdf".

Note: live platform support from Aridhia is available in MS Teams.

In summary, this is what you’ll need to do at least once per team:

  1. Set up your workspace environment based on how you work
  2. Get the GitLab project starter code into your workspace
  3. Have some fun: use our RStudio and/or Jupyter (Python) tutorials from the starter kit to load data, train, and save a model
  4. Get on the leaderboard by making your first test submission!

  • Platforms: Aridhia supplies the environment and compute used for the data challenge; AIcrowd provides the leaderboard and discussions.


  • Leaderboard: The leaderboard will focus on one of the questions: predicting the probability of approval. Details and rules can be found here. The evaluation committee will evaluate all other challenges outside of the leaderboard.


  • Prizes: There will be five prize categories: Leaderboard Performance, Data Wrangling, Innovation, and Insights: general, trial, and program level insights. Note that your presentation is the face of your work, present yourself well!


  • Timeframe: The challenge will end on 20 December 2019 at 18:00 CET.


  • Questions & Discussion: If you have additional questions, please submit them in our forums.


Note: please use our forums where possible. Direct contacts:

  • Novartis: nicholas.kelley@novartis.com
  • Aridhia platform: rodrigo.barnes@aridhia.com
  • AIcrowd: mohanty@aicrowd.com

Detailed Evaluation Criteria

Data challenge awards, challenges, and evaluation criteria

Teams may attempt any or all of the challenges below, which are grouped into the following categories:

  • The Leaderboard and most predictive model
  • General insights and learning from the model, including specific questions the business team has for clinical trials and over-arching clinical programs
  • Data wrangling: enabling better predictions and insights by linking more data

The core dataset is composed of two historical trial and drug data sets from Informa which have been linked together (details in resources). Please note that your solutions to the challenges may be improved by leveraging more information than is contained in the core dataset. Teams are encouraged to explore the impact of adding information to the core dataset, and can make this available centrally to all participants for leaderboard evaluations (see rules).

The «Leaderboard»: Model Performance and Predictive Power

  • The challenge: Predict the chance of obtaining regulatory approval while imagining you are planning your Phase 3 program. Regulatory outcomes for programs are contained in the “outcome” column of the data frame, which is mapped from the variable Dev Status.

  • Evaluation criterion: We want to emulate a real-world decision-making scenario after a Phase 2 trial: whether or not to invest in a Phase 3 program. To that end, we will train on only Phase 2 data before 2015 and try to predict the Regulatory Approval outcome as described above. The performance of your algorithm on a hold-out dataset will be measured using the log-loss metric and used to rank your solution.
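
For orientation, log-loss for a binary outcome averages -[y log(p) + (1 - y) log(1 - p)] over all programs, so a confident but wrong probability is penalized heavily. The Python sketch below illustrates the metric only; it is not the official leaderboard scorer, and the clipping constant `eps` and the sample outcomes and probabilities are assumptions made for the example.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy.

    y_true: 0/1 outcomes (1 = approved); y_prob: predicted P(approval).
    eps is an assumed clipping constant that keeps log() finite.
    """
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Made-up outcomes and predictions, for illustration only.
outcomes = [1, 0, 0, 1]
flat_benchmark = [0.5, 0.5, 0.5, 0.5]   # same guess for every program
informed_model = [0.9, 0.1, 0.2, 0.8]   # leans the right way each time

print(log_loss(outcomes, flat_benchmark))
print(log_loss(outcomes, informed_model))
```

A constant prediction of 0.5 scores log(2) regardless of the outcomes, which makes it a convenient reference point: a lower score means the probabilities carry information.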

Data wrangling / engineering

  • The challenge: Bring and link new data sets that allow additional features to be incorporated in algorithms, either to provide new scientific insights (see the challenges listed below) or for the Leaderboard competition of predicting the chance of obtaining regulatory approval. External publicly available data or data which is provided can be used. In order to use such data for the Leaderboard and methods competition, you will need to publish the new data and a high-level description to the central workspace for all participants. Participants using new data will be asked to demonstrate the performance boost of including it.

  • Evaluation criteria:

    • Gains in performance (as assessed by your team or by the Leaderboard) made by using the new data to train the algorithm
    • Innovation of your data wrangling / engineering approach
    • Novelty of your approach for validating your own algorithm
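
As one possible starting point for linking, the Python sketch below left-joins an external table onto core records and reports how many core rows found a match. It is a minimal illustration under stated assumptions: the key `drug_id` and the column `n_prior_approvals` are hypothetical names, not fields of the core dataset, and the rows are made up.

```python
def left_join(core_rows, extra_rows, key):
    """Attach columns from extra_rows to core_rows, matching on `key`.

    Every core row is kept; unmatched rows get None for the new columns.
    Returns (joined_rows, match_rate).
    """
    lookup = {row[key]: row for row in extra_rows}
    extra_cols = sorted({c for row in extra_rows for c in row if c != key})
    joined, matched = [], 0
    for row in core_rows:
        extra = lookup.get(row[key])
        if extra is not None:
            matched += 1
        merged = dict(row)
        for col in extra_cols:
            merged[col] = extra.get(col) if extra is not None else None
        joined.append(merged)
    match_rate = matched / len(core_rows) if core_rows else 0.0
    return joined, match_rate


# Hypothetical rows; the real core data uses its own identifiers and columns.
core = [
    {"drug_id": "D1", "outcome": 1},
    {"drug_id": "D2", "outcome": 0},
    {"drug_id": "D3", "outcome": 0},
]
external = [
    {"drug_id": "D1", "n_prior_approvals": 3},
    {"drug_id": "D3", "n_prior_approvals": 0},
]

joined, match_rate = left_join(core, external, "drug_id")
print(f"match rate: {match_rate:.2f}")
for row in joined:
    print(row)
```

Reporting the match rate alongside any change in model score helps show whether a gain comes from the new features or only from the subset of rows that could be linked.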

Presentation and visualization

  • The challenge: Communicate your data, insights and results to non-data-scientists

  • Evaluation criteria:

    • Clarity, transparency, scientific integrity of information
    • Teams will be shortlisted based on their submission and invited to give a presentation to the evaluation committee, including leadership representation from portfolio strategy, biostatistics, the Digital Office and of course Prof. Andrew Lo.

General insights:

  • The challenge:
    • Explain which drivers are most important for predictions, and/or predictions in certain areas
    • Explore and visualise how various features relate to one another
    • Show for a given trial/program prediction which specific features were most influential in its prediction
    • Bonus – try to give an example of how the insights above would support portfolio decision making and help to balance risk across programs / trials
  • Additional evaluation criteria:
    • Communicate your insights in as intuitive and accessible a way as possible
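
One simple way to show which features drove a single prediction, assuming the model is a logistic regression, is to split its log-odds into per-feature contributions relative to a baseline program. The Python sketch below does this; the feature names, weights, intercept and baseline values are all hypothetical, and other model types would need a different attribution method.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def explain_prediction(weights, intercept, baseline, x):
    """Split a logistic model's log-odds into per-feature contributions.

    Each contribution is weight * (value - baseline value), so the
    contributions sum to the log-odds shift away from the baseline program.
    Returns (baseline_prob, predicted_prob, contributions ranked by size).
    """
    contributions = {
        name: weights[name] * (x[name] - baseline[name]) for name in weights
    }
    base_logit = intercept + sum(weights[n] * baseline[n] for n in weights)
    logit = base_logit + sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return sigmoid(base_logit), sigmoid(logit), ranked


# Hypothetical fitted model and program; none of these values come from the data.
weights = {"orphan_designation": 0.8, "n_prior_trials": 0.1, "oncology": -0.6}
intercept = -1.0
baseline = {"orphan_designation": 0.2, "n_prior_trials": 4.0, "oncology": 0.3}
program = {"orphan_designation": 1.0, "n_prior_trials": 6.0, "oncology": 1.0}

base_p, pred_p, ranked = explain_prediction(weights, intercept, baseline, program)
print(f"baseline PoS {base_p:.3f} -> predicted PoS {pred_p:.3f}")
for name, value in ranked:
    print(f"{name:>20}: {value:+.2f} log-odds")
```

Because the contributions are additive in log-odds, they can be shown to non-specialists as a ranked bar chart of what pushed a program's predicted PoS up or down.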

Scientific program level insights:

  • The challenge:
    • Can predicting trial success improve your prediction of regulatory approval?
    • What is the chance of obtaining regulatory approval in the following scenarios, and what are the drivers for successful regulatory approval for each of them?
      • without Phase 2, with Phase 3
      • with Phase 2, without Phase 3
      • without Phase 2, without Phase 3
    • Can you predict the reason(s) for failure of the program, as provided by the variable “Trial Outcome”?
    • Can you predict regulatory approval, imagining you are prior to starting Phase 2? What are the key drivers?
  • Evaluation criteria:
    • Innovation of your ML approach and interpretability
    • Novelty of your approach for validating your own algorithm
    • Approach for handling additional or missing data

Scientific trial level insights

  • The challenge
    • Identify the most important drivers of success for phase 2 (phase 3) trials and describe their impact. These drivers should be known at the time of planning phase 2 (phase 3). [variable: trial outcome]
    • What features predict that a trial will be terminated due to safety/adverse effects?
    • How does the importance of different features in predicting trial success vary across phases?
    • What are the recommendations for designing dose-finding studies (phase 2 trial with 3 and more doses) which would ultimately increase the chance of approval?
    • Imagine you are planning a phase 3 trial. Can you predict the duration of the trial? What are the key drivers of Phase 3 trial duration, as provided by the column “duration” in the core data?
    • Can you predict reason(s) for failure in a study (efficacy, safety, operational, …) as outlined in the Trial Outcome column/variable?
  • Evaluation criteria
    • Innovation of your ML approach and interpretability
    • Novelty of your approach for validating your own algorithm
    • Approach for handling additional or missing data

Outside the box thinking and Innovation:

  • The challenge: Surprise us!
  • The evaluation: There are no predetermined criteria for this prize. An evaluation team will identify the winner.