
Leckofunny 157

Name

Marco Pleines

Organization

TU Dortmund

Location

DE

Badges: Gold 0 · Silver 0 · Bronze 3




Challenges Entered

NeurIPS 2020: Procgen Competition
Measure sample efficiency and generalization in reinforcement learning using procedurally generated environments

Latest submissions: graded 68908

Unity Obstacle Tower Challenge
A new benchmark for Artificial Intelligence (AI) research in Reinforcement Learning

Latest submissions: graded 9149, graded 9146, graded 9142
Badges

  • Trustable — May 16, 2020
  • Newtonian — May 16, 2020
  • Newtonian — May 16, 2020
  • Has filled their profile page — May 16, 2020
  • Kudos! You've won a bronze badge in this challenge. Keep up the great work!
    Challenge: Unity Obstacle Tower Challenge
    May 16, 2020
  • Kudos! You've won a bronze badge in this challenge. Keep up the great work!
    Challenge: Unity Obstacle Tower Challenge
    May 16, 2020
Leckofunny has not joined any teams yet...

NeurIPS 2020: Procgen Competition

Problems of using rllib for a research competition

About 1 year ago

Because of the aforementioned constraints, I decided not to participate in this competition either. I'll continue using Procgen with my established workflow and code, though.

Selecting seeds during training

About 1 year ago

My auto-curriculum algorithm only alters the way seeds are sampled, so that the agent receives much more useful data and thus trains more sample-efficiently. In my opinion, using more than 100 or 200 seeds doesn't even help.
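The post only says that seeds are sampled non-uniformly; it does not describe the actual algorithm. A minimal sketch of such a scheme, assuming a learning-progress heuristic (the class name, sliding-window size, and scoring rule are all illustrative):

```python
import random
from collections import defaultdict

class SeedCurriculum:
    """Samples training seeds in proportion to recent learning progress
    instead of uniformly. Everything here is an illustrative assumption,
    not the algorithm from the post."""

    def __init__(self, seeds, temperature=1.0):
        self.seeds = list(seeds)
        self.temperature = temperature
        self.returns = defaultdict(list)  # seed -> recent episode returns

    def report(self, seed, episode_return):
        """Record the return of an episode played on `seed`."""
        history = self.returns[seed]
        history.append(episode_return)
        if len(history) > 10:          # keep a short sliding window
            history.pop(0)

    def _progress(self, seed):
        """Absolute change between the oldest and newest recent returns;
        seeds where the agent still improves (or regresses) score high."""
        history = self.returns[seed]
        if len(history) < 2:
            return 1.0                 # unseen seeds get a high prior
        return abs(history[-1] - history[0])

    def sample(self):
        """Draw the next training seed, weighted by learning progress."""
        weights = [self._progress(s) ** (1.0 / self.temperature) + 1e-6
                   for s in self.seeds]
        return random.choices(self.seeds, weights=weights, k=1)[0]
```

The training loop would call `sample()` before each episode and `report()` afterwards; seeds the agent has mastered (flat returns) gradually lose probability mass.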

Selecting seeds during training

About 1 year ago

I guess my assumption is correct, since nobody has contradicted it.
It is a pity that curriculum learning cannot be done during this challenge.

Selecting seeds during training

About 1 year ago

@mohanty
I'd like to explicitly set a distinct seed for each worker during training, because I have a concept for sampling seeds.
The implementation would probably look similar to this:
https://docs.ray.io/en/master/rllib-training.html#curriculum-learning

As far as I know, the Procgen environment has to be closed and instantiated again to apply a distinct seed (num_levels = 1, start_level = my_desired_seed), because I cannot enforce a new seed during the reset() call.

So I assume that the 200 seeds will be sampled uniformly and that it will not be possible to inject my own logic to alter the sampling strategy.
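The close-and-recreate pattern described above could be wrapped as sketched below. The wrapper and the factory callable are illustrative assumptions; only the idea that Procgen accepts a seed solely at construction time (via `num_levels=1, start_level=seed`) comes from the post.

```python
class ReseedableEnv:
    """Minimal sketch: wraps an environment factory so each reset() can
    target a specific level seed. Since Procgen takes a seed only at
    construction time, the old instance is closed and a fresh one built.
    In practice the factory might be something like
        lambda seed: gym.make("procgen:procgen-coinrun-v0",
                              num_levels=1, start_level=seed)
    """

    def __init__(self, make_env):
        self._make_env = make_env   # callable: seed -> env instance
        self._env = None

    def reset(self, seed):
        # Close the previous instance, if any, then rebuild with the new seed.
        if self._env is not None:
            self._env.close()
        self._env = self._make_env(seed)
        return self._env.reset()

    def step(self, action):
        return self._env.step(action)

    def close(self):
        if self._env is not None:
            self._env.close()
```

A curriculum could then pick the seed per episode and pass it to `reset(seed)`, at the cost of paying the environment start-up time on every seed switch.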

Selecting seeds during training

About 1 year ago

Any info about this @mohanty ?

Selecting seeds during training

About 1 year ago

Hi!

How are you enforcing the usage of the 200 training seeds once an agent is submitted?
I'm planning a submission that contains some logic for sampling certain seeds for each environment.
And as far as Procgen is implemented, I'd have to close and re-instantiate the environment to apply the designated seed.

FAQ: Regarding rllib based approach for submissions

About 1 year ago

Is there any kind of interface that could be used to dynamically tell each environment instance which seed to use? I've got some curriculum concepts for sampling seeds during training.

At first sight, this looks way too cumbersome to do with RLlib.

Multi-Task Challenge?

About 1 year ago

According to this image, each environment is trained and evaluated separately.
In the end, the agent gets to train on the unknown environments as well, right?

And does this image mean that both training and evaluation are done on your side?

Multi-Task Challenge?

About 1 year ago

Hi!

I’m wondering whether this competition challenges us with a multi-task setting.

To my understanding, one agent is trained on 16 environments. So this agent/model should be able to play each of those environments as well as the 4 unseen ones, right?

Unity Obstacle Tower Challenge

Good testing environment that does not need X?

Almost 2 years ago

Unfortunately, I haven't found one yet.

Release of the evaluation seeds?

Almost 2 years ago

1001, 1002, 1003, 1004, 1005 are the evaluation seeds.
The environment’s source is finally available.

Submissions are stuck

About 2 years ago

Thanks, the fix works!

Submissions are stuck

About 2 years ago

@mohanty

Are you going to fix the bug in the "show post-challenge submissions" button?
Pressing this button does not change the leaderboard.

Good testing environment that does not need X?

About 2 years ago

Hey,

do you guys know of an environment that would be suitable for testing DRL features?

Obstacle Tower takes too much time, and its dependency on an X server is daunting.
I'm working on two clusters, one in Jülich and one in Dortmund, and neither of them has a suitable strategy for making X available. Their major issue is basically that X needs root privileges to be started.

If X were not an issue, I would build myself a Unity environment.

Does anybody know whether the Unreal Engine depends on X as well?

Release of the evaluation seeds?

About 2 years ago

Hey @arthurj
when could we get an OT build with the evaluation seeds?
I guess we all would love to see what our agents are capable of doing.

Thanks for the great challenge!

Submissions are stuck

About 2 years ago

same issue over here

Submissions are stuck

About 2 years ago

Monday is a bad choice for a deadline, because debugging the submission process over the weekend does not sound feasible.

Is the evaluation seed truly random?

About 2 years ago

Due to the evaluation's very stochastic nature, I think more trials on the evaluation seeds would mitigate the strongly varying results. As it is, simply resubmitting the same agent may yield much better or much worse performance.
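The averaging argument can be illustrated numerically: with a fixed per-episode score variance, the spread of the mean over n evaluation episodes shrinks roughly as 1/√n, so a single small-sample evaluation can swing widely between resubmissions. All numbers below are invented purely for illustration.

```python
import random
import statistics

def evaluation_spread(n_trials, episode_noise=2.0, true_score=10.0,
                      n_repeats=2000, seed=0):
    """Simulate repeatedly evaluating the same agent with n_trials
    episodes each time and averaging; return the standard deviation of
    that average across repeats, i.e. how much a resubmission could swing.
    true_score and episode_noise are made-up illustration values."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_repeats):
        scores = [rng.gauss(true_score, episode_noise)
                  for _ in range(n_trials)]
        means.append(statistics.fmean(scores))
    return statistics.stdev(means)
```

With these toy numbers, `evaluation_spread(1)` is roughly the full per-episode noise, while `evaluation_spread(25)` is around a fifth of it, which is why averaging over more trials would make the leaderboard less lottery-like.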

Evaluation perspective config?

About 2 years ago

I assume that the circumstances of the evaluation will be the same for every participant.

Evaluation perspective config?

About 2 years ago

The evaluation config uses the default values such as 3rd person view.

Deep Reinforcement Learning PhD
