
Leckofunny 157

Name

Marco Pleines

Organization

TU Dortmund

Location

DE

Badges

Gold 0 · Silver 0 · Bronze 3

Connect

GitHub

Activity

[Activity heatmap: contributions over the past 12 months]

Ratings Progression


Challenge Categories


Challenges Entered

NeurIPS 2020: Procgen Competition — Measure sample efficiency and generalization in reinforcement learning using procedurally generated environments

Latest submissions: graded #68908

Unity Obstacle Tower Challenge — A new benchmark for Artificial Intelligence (AI) research in Reinforcement Learning

Latest submissions: graded #9149, graded #9146, graded #9142

Gold 0 · Silver 0 · Bronze 3
Trustable
May 16, 2020
Newtonian
May 16, 2020
Newtonian
May 16, 2020

Badges

  • Has filled their profile page
    May 16, 2020

  • Ten further badges, awarded May 16, 2020
  • Kudos! You've won a bronze badge in this challenge. Keep up the great work!
    Challenge: Unity Obstacle Tower Challenge
    May 16, 2020
  • Kudos! You've won a bronze badge in this challenge. Keep up the great work!
    Challenge: Unity Obstacle Tower Challenge
    May 16, 2020
Participant Rating
Leckofunny has not joined any teams yet...

NeurIPS 2020: Procgen Competition

Problems of using rllib for a research competition

5 months ago

I decided not to participate in this competition either, due to the aforementioned constraints. I’ll continue using Procgen, though, with my established workflow and code.

Selecting seeds during training

5 months ago

My auto-curriculum algorithm just alters the way seeds are sampled, to provide much more useful data to the agent and hence improve sample efficiency. Having more than 100 or 200 seeds doesn’t even help, in my opinion.
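Roughly sketched, it boils down to replacing uniform seed sampling with a weighted one. A hypothetical sketch, not my actual algorithm; the scoring signal and all names are made up:

import numpy as np

# Curriculum-driven seed sampling: instead of drawing training seeds
# uniformly, each seed is weighted by a score (e.g. recent learning
# progress), so more useful levels come up more often.
class SeedSampler:
    def __init__(self, num_seeds=200):
        self.seeds = np.arange(num_seeds)
        self.scores = np.ones(num_seeds)  # uniform prior over all seeds

    def sample(self):
        probs = self.scores / self.scores.sum()
        return int(np.random.choice(self.seeds, p=probs))

    def update(self, seed, learning_progress):
        # Exponential moving average of a per-seed progress signal.
        self.scores[seed] = 0.9 * self.scores[seed] + 0.1 * learning_progress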

Selecting seeds during training

5 months ago

I guess my assumption is correct, since nobody has contradicted it.
It is a pity that curriculum learning cannot be done during this challenge.

Selecting seeds during training

5 months ago

@mohanty
I’d like to explicitly set a distinct seed for each worker during training, because I’ve got a concept for sampling seeds.
The implementation would probably look similar to this:
https://docs.ray.io/en/master/rllib-training.html#curriculum-learning

As far as I know, the Procgen environment has to be closed and instantiated again to apply a distinct seed (num_levels = 1, start_level = my_desired_seed), because a new seed cannot be enforced during the reset() call.

So I assume that the 200 seeds will be sampled uniformly and that it will not be possible to inject my own logic to alter the sampling strategy over those 200 seeds.
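For reference, this is roughly what that close-and-recreate pattern looks like. A minimal sketch; the environment name is just an example:

import gym

def make_env_with_seed(seed, env_name="procgen:procgen-coinrun-v0"):
    # Procgen fixes the level set at construction time via
    # num_levels/start_level, so switching the seed means closing the
    # old instance and creating a fresh one.
    return gym.make(env_name, num_levels=1, start_level=seed)

env = make_env_with_seed(seed=42)
obs = env.reset()
# ... collect a rollout on this single level ...
env.close()
env = make_env_with_seed(seed=1337)  # switch to a different level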

Selecting seeds during training

5 months ago

Any info on this, @mohanty?

Selecting seeds during training

5 months ago

Hi!

How are you enforcing the usage of the 200 training seeds once submitted?
I’m planning a submission that includes some logic to sample certain seeds for each environment.
And as far as Procgen is implemented, I’d have to close and re-instantiate the environment to apply the designated seed.

FAQ: Regarding rllib based approach for submissions

5 months ago

Is there any kind of interface that could be used to dynamically tell each environment instance which seed to use? I’ve got some curriculum concepts for sampling seeds during training.

At first sight, I think this is way too cumbersome using RLlib.

Multi-Task Challenge?

5 months ago

According to this image, each environment is trained and evaluated separately.
After all, the agent gets to train on the unknown environments as well, right?

And does this image mean that the training and the evaluation are done on your side?

Multi-Task Challenge?

6 months ago

Hi!

I’m wondering whether this competition challenges us with a multi-task setting.

To my understanding, one agent shall train on 16 environments. So this agent/model should be able to play each of those environments plus the 4 unseen ones, right?

Unity Obstacle Tower Challenge

Good testing environment that does not need X?

Over 1 year ago

Unfortunately, I have not found one yet.

Release of the evaluation seeds?

Over 1 year ago

1001, 1002, 1003, 1004, 1005 are the evaluation seeds.
The environment’s source is finally available.
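For anyone who wants to try them locally, a minimal evaluation loop over these seeds could look like this. A sketch assuming the public obstacle_tower_env wrapper, where seed() takes effect on the next reset; the binary path and the random agent are placeholders:

from obstacle_tower_env import ObstacleTowerEnv

EVAL_SEEDS = [1001, 1002, 1003, 1004, 1005]

env = ObstacleTowerEnv("./ObstacleTower/obstacletower", retro=False)
for seed in EVAL_SEEDS:
    env.seed(seed)  # takes effect on the next reset
    obs = env.reset()
    done, episode_reward = False, 0.0
    while not done:
        action = env.action_space.sample()  # stand-in for a trained agent
        obs, reward, done, info = env.step(action)
        episode_reward += reward
    print("seed {}: reward {:.1f}".format(seed, episode_reward))
env.close()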

Submissions are stuck

Over 1 year ago

Thanks, the fix works!

Submissions are stuck

Over 1 year ago

@mohanty

Are you going to fix the bug in the “show post-challenge submissions” button?
Pressing this button does not change the leaderboard.

Good testing environment that does not need X?

Over 1 year ago

Hey,

do you guys know of an environment that would be suitable for testing DRL features?

Obstacle Tower takes too much time, and the dependency on an X server is daunting.
I’m working on two clusters, one in Jülich and one in Dortmund, and neither of them has a suitable strategy for making X available. X needs root privileges to be started, and that’s basically their major issue.

If X were not an issue, I would build myself a Unity environment.

Does anybody know if Unreal Engine is dependent on X as well?

Release of the evaluation seeds?

Over 1 year ago

Hey @arthurj,
when could we get an OT build with the evaluation seeds?
I guess we would all love to see what our agents are capable of.

Thanks for the great challenge!

Submissions are stuck

Over 1 year ago

Same issue over here.

Submissions are stuck

Over 1 year ago

Monday is kind of a bad choice for a deadline, because debugging the submission process over a weekend does not sound feasible.

Is the evaluation seed truly random?

Over 1 year ago

Due to the very stochastic nature of the environment, I think more trials on the evaluation seeds would mitigate the strongly varying results. As it stands, just resubmitting the same agent may yield much better or much worse performance.

Evaluation perspective config?

Over 1 year ago

I assume that the circumstances of the evaluation will be the same for every participant.

Evaluation perspective config?

Over 1 year ago

The evaluation config uses the default values, such as the 3rd-person view.

Is the new v2.2 used for scoring?

Over 1 year ago

What’s the logic behind the videos? Is only the weakest episode recorded?

Episode 5 of the evaluation

Over 1 year ago

Thanks for checking this seed!

Since the default configuration is used, the Sokoban puzzle and other level designs start at floor 10, right?

Episode 5 of the evaluation

Over 1 year ago

It looks like my trained model always fails on floor 2 in the 5th episode of the evaluation.
Did anybody else encounter such low performance on that particular episode?

Successful submissions do not appear on the leaderboard

Over 1 year ago

It doesn’t work for me either.
I pushed the tag “submission-v2.0i”.

Architectures of Round 1 winners

Over 1 year ago

I’d say a lot of the submissions that achieved a mean floor of 5-6 used the Dopamine Rainbow DQN tutorial.

Some of the Round 1 tricks were about limiting the agent’s action space to 6-8 actions.

Successful submissions do not appear on the leaderboard

Over 1 year ago

Did you score a mean floor that is lower than 5?

I’m wondering if there is a threshold.

Vector Observation contents

Over 1 year ago

Does anybody have any idea on how to debug my issue?

GitHub repo gone?

Over 1 year ago

The environment’s repo is down as well.

Vector Observation contents

Over 1 year ago

I checked the path of obstacle_tower_env.py and replaced the file with the one from the repository.
Same issue.

I tested this on two Ubuntu and two Windows machines. All show the same behavior.

Vector Observation contents

Over 1 year ago

I just upgraded to v2.1 and completely reinstalled my Python environment, and the bug still exists.

Also, the value of the current floor is always equal to 0.

I observed this on Windows and Ubuntu.

Vector Observation contents

Over 1 year ago

So how is it possible that I only get 3 values for the vector observation and not 8?
Like I said before, I’m using the latest build and obstacle_tower_env.

V2.0 performance drop

Over 1 year ago

Hi @arthurj

Yes, I’m using the latest obstacle_tower_env.py.

Vector Observation contents

Over 1 year ago

I just printed the contents of the vector observation.
It has a shape of (4,).
The first element is the visual observation (168, 168, 3).
The remaining items have a shape of () and hold the values key, time, and current floor.

How does this relate to a vector observation size of 8?

Just to clarify: I’m on v2.0 and retro mode is disabled.
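For reference, this is roughly how I inspected it. A sketch assuming the 4-tuple observation described above; the binary path is a placeholder:

from obstacle_tower_env import ObstacleTowerEnv

env = ObstacleTowerEnv("./ObstacleTower/obstacletower", retro=False)
obs = env.reset()
visual, key, time_left, floor = obs  # the 4 components reported above
print(visual.shape)                  # (168, 168, 3)
print(key, time_left, floor)         # scalar values, each of shape ()
env.close()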

V2.0 performance drop

Over 1 year ago

Like I said, I cannot pass the config to reset() the way you do in your example.

It throws a TypeError:
TypeError: reset() takes 1 positional argument but 2 were given

V2.0 performance drop

Over 1 year ago

Thanks for the hint.

Setting total-floors to a much smaller number solved this.

However, I cannot pass a config to reset(). @arthurj
At least it worked for instantiating the game instances.
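So the workaround appears to be passing the reset parameters once at construction. A sketch assuming the constructor accepts a config dict, which is what worked for me; the path and values are examples:

from obstacle_tower_env import ObstacleTowerEnv

# Reset parameters passed once at construction instead of via reset().
config = {"total-floors": 10}
env = ObstacleTowerEnv("./ObstacleTower/obstacletower",
                       retro=False, config=config)
obs = env.reset()  # the constructor config is applied here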

Vector Observation contents

Over 1 year ago

Hi @arthurj

What information is now stored in the vector observation?

In v1.3, I think it was the remaining time and whether the agent holds a key.

Now it looks like there are 8 values:

Vector Observation space size (per agent): 8

V2.0 performance drop

Over 1 year ago

Did anybody else encounter performance drops in v2.0?

I observed with my PPO implementation that one data sampling cycle takes longer.
Before, I observed a mean cycle duration of 53.3 seconds with a deviation of 1.1.

Now I observe a mean of 146 seconds with a deviation of 24.2.

I tested varying parameters, CPU/GPU training, and 3 different machines (Windows and Ubuntu). All point to the same conclusion: training takes considerably longer in v2.0, with strong variation.

v1.3: 56, 54, 53, 53, 53, 53, 54, 53, 52, 54 seconds
v2.0: 91, 128, 133, 136, 144, 160, 145, 166, 156, 165, 182 seconds

Has the submission process changed for round 2?

Over 1 year ago

@arthurj

Did the evaluation of submissions change for round 2?

For example, what reset parameters are used and how many random seeds?

Am I allowed to join from Round 2?

Over 1 year ago

The finalists are declared here.

What reward does the agent receive for collecting a key?

Over 1 year ago

How strong is the reward signal for the agent to pick up a key?

Testing agent in local with docker

Over 1 year ago

Is the environment’s worker_id supposed to be 0 during the evaluation process?

Using dopamine trained model

Over 1 year ago

I really dislike Dopamine, Baselines, and Spinning Up. These implementations are way too large by now and severely lack features such as exporting and loading models for inference. It seems like no software engineer was involved at all.

RL-Adventure-2 is really nice, because it is much more accessible.

Using dopamine trained model

Over 1 year ago

Dopamine has an evaluation mode. To run it, set the training steps to 1 and the evaluation steps to something reasonable. There is no explicit functionality that exports the trained model for inference.
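Since Dopamine is configured through gin, this could be sketched as follows. Assuming the standard Runner bindings; the step counts are examples:

import gin

# Eval-only run: shrink the training phase to a single step and give
# the evaluation phase a reasonable budget (values are examples).
gin.parse_config("""
Runner.training_steps = 1
Runner.evaluation_steps = 500
""")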

How to use the model trained in Dopamine

Over 1 year ago

Does anybody know of a suitable solution to run Dopamine models in inference mode for the evaluation?
It looks like Dopamine does not provide a solution out of the box, which is quite a shame for Google.

Tutorial Deep Reinforcement Learning to try with PyTorch

Over 1 year ago

In my opinion, a good start would be to take an existing PPO, SAC, or Rainbow DQN implementation.
The initial challenges would be to prepare the model’s input and especially the model’s output, which must support multi-discrete actions. The ML-Agents toolkit solves this by creating so-called action branches (I think they are called policy branches in its PPO code).
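To illustrate the action-branch idea, here is a minimal PyTorch sketch. All sizes are made up, and this is not the ML-Agents implementation:

import torch
import torch.nn as nn
from torch.distributions import Categorical

class MultiDiscretePolicy(nn.Module):
    def __init__(self, in_features=512, branch_sizes=(3, 3, 2, 3)):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_features, 256), nn.ReLU())
        # One categorical head ("branch") per action dimension.
        self.branches = nn.ModuleList(nn.Linear(256, n) for n in branch_sizes)

    def forward(self, x):
        h = self.body(x)
        dists = [Categorical(logits=branch(h)) for branch in self.branches]
        actions = [d.sample() for d in dists]
        # The joint log-prob is the sum over the independent branches.
        log_prob = sum(d.log_prob(a) for d, a in zip(dists, actions))
        return torch.stack(actions, dim=-1), log_prob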

Build.sh throws Syntax error

Almost 2 years ago

This was tested on my home computer and on one at my university.

Build.sh throws Syntax error

Almost 2 years ago

I’m able to ping these addresses and to use wget, so I have no clue why Repo2Docker is not doing its job.

Build.sh throws Syntax error

Almost 2 years ago

The host system does not have any issues resolving the URLs. The Docker container that is being built struggles to resolve them. As I don’t have access to the images being built, I cannot think of any way to approach this.

Build.sh throws Syntax error

Almost 2 years ago

I still didn’t get build.sh to work. Any idea why Repo2Docker cannot resolve the sources?

Step 29/33 : RUN apt-get update && apt-get install --yes --no-install-recommends curl git xvfb ffmpeg && apt-get purge && apt-get clean && rm -rf /var/lib/apt/lists/*
 ---> Running in f3979d51b4f6
Err:1 http://security.ubuntu.com/ubuntu xenial-security InRelease
  Temporary failure resolving 'security.ubuntu.com'
Err:2 http://archive.ubuntu.com/ubuntu xenial InRelease
  Temporary failure resolving 'archive.ubuntu.com'
Err:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64  InRelease
  Could not resolve host: developer.download.nvidia.com
Err:4 https://deb.nodesource.com/node_10.x bionic InRelease
  Could not resolve host: deb.nodesource.com
Err:5 http://archive.ubuntu.com/ubuntu xenial-updates InRelease
  Temporary failure resolving 'archive.ubuntu.com'
Err:6 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64  InRelease
  Could not resolve host: developer.download.nvidia.com
Err:7 http://archive.ubuntu.com/ubuntu xenial-backports InRelease
  Temporary failure resolving 'archive.ubuntu.com'
Reading package lists...
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial/InRelease  Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-updates/InRelease  Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-backports/InRelease  Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/xenial-security/InRelease  Temporary failure resolving 'security.ubuntu.com'
W: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/InRelease  Could not resolve host: developer.download.nvidia.com
W: Failed to fetch https://deb.nodesource.com/node_10.x/dists/bionic/InRelease  Could not resolve host: deb.nodesource.com
W: Failed to fetch https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/InRelease  Could not resolve host: developer.download.nvidia.com
W: Some index files failed to download. They have been ignored, or old ones used instead.
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package curl
E: Unable to locate package git
E: Unable to locate package xvfb
E: Unable to locate package ffmpeg
Removing intermediate container f3979d51b4f6
The command '/bin/sh -c apt-get update && apt-get install --yes --no-install-recommends curl git xvfb ffmpeg && apt-get purge && apt-get clean && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100

Build.sh throws Syntax error

Almost 2 years ago

Thanks, I’ll switch to Ubuntu 18.04, as 16.04 keeps installing Python 3.5.2 (which cannot parse the f-strings used by repo2docker).

Build.sh throws Syntax error

Almost 2 years ago

Hi,

I’m getting this syntax error while trying to run build.sh on an Ubuntu 16.04 VM:

Traceback (most recent call last):
  File "/usr/local/bin/aicrowd-repo2docker", line 7, in <module>
    from repo2docker.__main__ import main
  File "/usr/local/lib/python3.5/dist-packages/repo2docker/__main__.py", line 1, in <module>
    from .app import Repo2Docker
  File "/usr/local/lib/python3.5/dist-packages/repo2docker/app.py", line 592
    self.log.info(f'Successfully pushed {self.output_image_spec}', extra=dict(phase='pushing'))
                                                                ^
SyntaxError: invalid syntax

This is how I set up the VM.

Problem with 1.1_windows

Almost 2 years ago

We just need a fixed Linux build as well.

Unable to set up environment

Almost 2 years ago

Which Python version are you using? 3.6.8 works, but anything from 3.7 upwards does not seem to work.

Problem with 1.1_windows

Almost 2 years ago

Same issue here on a Windows 10 desktop.

My steps:

  • Clone the obstacle-tower-challenge repository
  • Create a conda environment (Python 3.6.8)
  • Install git
  • Install numpy 1.15.4
  • pip install -r requirements.txt
  • Download the build and extract it into the repository path

This issue occurs on both versions of OTC (v1, v1.1).

Failure: Undelivered E-Mail (OTC@unity3d.com)

Almost 2 years ago

Upon sending an e-mail to you to register my team, I received a delivery status notification (failure).

Hello x@x.com, We’re writing to let you know that the group you tried to contact (otc) may not exist, or you may not have permission to post messages to the group. A few more details on why you weren’t able to post: * You might have spelled or formatted the group name incorrectly. * The owner of the group may have removed this group. * You may need to join the group before receiving permission to post. * This group may not be open to posting. If you have questions related to this or any other Google Group, visit the Help Center at https://support.google.com/a/unity3d.com/bin/topic.py?topic=25838. Thanks, unity3d.com admins

Potential way to cheat in OTC?

Over 1 year ago

Due to the stochastic nature of the task (environment and agent behaviors), submitted results have a high variance. I have submitted the same tag 4 times by now and scored 8.2, 7.8, 9.2, and 8.6.
I assume this holds for other submissions as well.

The winners of Round 2 are selected based on the leaderboard, and of course only if they qualified after Round 1, right?

Potential way to cheat in OTC?

Over 1 year ago

Isn’t it the case that the very last successful submission is shown on the leaderboard?

Potential way to cheat in OTC?

Over 1 year ago

Hi mohanty,

let’s assume somebody wants to improve his score but does not want to risk his current score, due to the non-deterministic behavior of OT.
Wouldn’t it be possible to make the agent program crash on purpose if the desired result is not achieved?

Best
Marco

Quick Question :D

Over 1 year ago

Wait, I actually see now what you meant.

I did not do anything to trigger all the issues that are tagged as failed 1 day ago.

So this is really weird behavior.

Quick Question :D

Over 1 year ago

Hi,

after Arthur fixed the skipping bug, I wanted to test a few trained models to see how the fix affected performance on episode 5 of the evaluation.

There is no particular issue to worry about, except this one: Vector Observation contents.

Best
Marco

Using dopamine trained model

Over 1 year ago

I’m getting back to work.
My plan is to write some logic to convert the checkpoints to a model file, to avoid the huge file sizes.
After that, my goal is to write an independent evaluation script that makes use of the newly created model.
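The conversion step could be sketched like this. A rough TF1-style sketch under the assumption that the checkpoint contains the full graph; all paths are placeholders:

import tensorflow as tf  # TF1-style API, as used by Dopamine back then

with tf.Session() as sess:
    # Restore the full training checkpoint (paths are placeholders).
    saver = tf.train.import_meta_graph("checkpoints/tf_ckpt-199.meta")
    saver.restore(sess, "checkpoints/tf_ckpt-199")
    # Re-save only the trainable variables (no optimizer state) to get
    # a much smaller file for the evaluation script.
    slim_saver = tf.train.Saver(var_list=tf.trainable_variables())
    slim_saver.save(sess, "inference/model")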

Deep Reinforcement Learning PhD