
Challenges Entered
Measure sample efficiency and generalization in reinforcement learning using procedurally generated environments
Latest submissions
| Status | Submission ID |
|---|---|
| graded | 94646 |
| failed | 94180 |
| graded | 93850 |
Self-driving RL on DeepRacer cars - From simulation to real world
Sample-efficient reinforcement learning in Minecraft
Multi-agent RL in game environment. Train your Derklings, creatures with a neural network brain, to fight for you!
Sample-efficient reinforcement learning in Minecraft
Multi-agent reinforcement learning on trains.
Robots that learn to interact with the environment autonomously
NeurIPS 2020: MineRL Competition
Is there any team that still has a seat?
About 5 years ago
Hi everyone.
I just finished the Procgen competition, where I ranked 3rd in round 2. I'd like to participate in MineRL too, but I don't have many resources for this competition: one GPU and $300+ in AWS credit. Also, I think working with people would be a lot more interesting than working alone! So I want to ask whether any team is still open to new members.
I'd like to join a team that's active and eager to improve its agent, and I promise I'll do my best for the remainder of the competition.
NeurIPS 2020: Procgen Competition
Was it known in advance that we should select one of our existing submissions for the final evaluation?
About 5 years ago
The final evaluation measures generalization, but I did not use any regularization such as batch normalization or data augmentation in my previous submissions. Also, in my latest few submissions, I chose to experiment with a newly introduced hyperparameter instead of using the value that performed well on my local machine.
Was it known in advance that we should select one of our existing submissions for the final evaluation?
About 5 years ago
Hi @vrv,
Thank you for the response. Yes, I realize that was my mistake after thoroughly reviewing the overview page and the answer I linked before. However, we did not always follow these rules to the letter, right? For example, in the second round we used 6 public and 4 private test environments instead of the 4 and 1 described on the overview page. Also, this answer said we would get to pick 3 submissions, but in the end we only picked one.
Maybe I should have asked about this earlier instead of wishfully assuming a new submission would be allowed. At this point, I don't know which submission I should use because, as I said before, none of them was made for the final evaluation.
I posted this to see whether anyone else is facing a similar situation. If I'm the only one, I'll accept it.
Although some of the above may sound like complaining, that is not my intent. I've learned a lot during this competition and received a lot of help from you all. Thank you.
Was it known in advance that we should select one of our existing submissions for the final evaluation?
About 5 years ago
Hi everyone,
I'm wondering whether I'm the only one who just learned that we should select one of our previous submissions for the final evaluation. I cannot find any official statement about this, and the only clue I can find now is this answer, which I had read before without paying much attention to the word "existing". That was my mistake, but I humbly don't think an answer in the forum counts as a formal statement.
It's really frustrating to learn this now, as none of my previous solutions was prepared for the final evaluation. I thought the challenge was to find a good solution, but in the end I found myself trapped in a word game. I don't mean to complain, as I am certainly responsible for the mistake above. However, if anyone feels the same way, please say something. Maybe together we can make the game more interesting.
Is it possible to run the evaluation on all environments in the final week of round 2?
About 5 years ago
In the final week of round 2, is it possible to run the evaluation on all environments? To reduce the computation cost, maybe the submission quota could be reduced a bit.
Number of environments in round 2
About 5 years ago
I humbly disagree with Feiyang. Not everyone has that amount of computational resources to try out their ideas. A reasonable daily submission limit is helpful.
On the other hand, I agree with @jurgisp and @quang_tran that we should relax the 2-hour limit. I think it biases the competition toward on-policy algorithms: they can take advantage of large batches and therefore need fewer training iterations, while off-policy algorithms usually work with a much smaller batch size and require more training iterations, and hence more time, to train.
Round 2 is open for submissions
About 5 years ago
Hi @jyotish,
What's the use of the blind reward?
Running the evaluation worker during evaluations is now optional
About 5 years ago
Thank you @jyotish, I see now.
Running the evaluation worker during evaluations is now optional
About 5 years ago
Hi @jyotish,
How should I change train.py to disable the evaluation worker locally?
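For local runs, one plausible approach (my assumption, not an official answer) is to switch off RLlib's standard evaluation settings in the experiment config:
# Hypothetical tweak, assuming the starter kit passes standard RLlib
# trainer config keys through: with no evaluation scheduled, RLlib
# should not start the dedicated evaluation worker.
config = {
    "evaluation_interval": None,  # never run periodic evaluation
    "evaluation_num_workers": 0,  # no separate evaluation worker process
}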
Inform the agent a new episode starts
About 5 years ago
Is there a way to inform the agent that a new episode has started when defining a trainer using build_trainer?
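In case it helps anyone later, here is a minimal sketch of one way to get that signal, assuming a Ray version that ships DefaultCallbacks; the class and flag names are mine, not from the competition code.
from ray.rllib.agents.callbacks import DefaultCallbacks

class EpisodeStartCallbacks(DefaultCallbacks):
    # Invoked by each rollout worker whenever a new episode begins.
    def on_episode_start(self, worker, base_env, policies, episode, **kwargs):
        # Flag the fresh episode so a custom model or policy can reset
        # whatever per-episode state it keeps.
        episode.user_data["new_episode"] = True

config = {"callbacks": EpisodeStartCallbacks}  # merged into the trainer config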
TF2 is by default not enabled?
About 5 years ago
I see. But why is it restricted to TF1.x even though I've set framework=tfe in the yaml file?
TF2 is by default not enabled?
About 5 years ago
I added the following code at this line of train.py:
from tensorflow.python import tf2
print(tf2.enabled())  # report whether TF2 behavior is active
assert False          # stop immediately so the flag is easy to spot in the logs
and it prints False. Is there any way to enable TF2?
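One workaround worth trying (an untested assumption on my part): request v2 behavior through TensorFlow's compat layer before Ray/RLlib import TensorFlow, since the switch only takes effect before the first graph is built.
import tensorflow.compat.v1 as tf1
tf1.enable_v2_behavior()  # must run before anything builds a TF1 graph

from tensorflow.python import tf2
print(tf2.enabled())  # should now print True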
About Ray trainer and workers
About 5 years ago
This file may contain what you're looking for.
Rllib custom env
Over 5 years ago
Hi @jyotish,
Could you please answer my question above?
If we use frame skipping, does the framework count the number of frames correctly?
For example, if we use frame_skip=2, the number of interactions between the agent and the environment is 8e6/2 = 4e6 when using only 8M frames. If we use the standard configuration, which sets timesteps_total=8000000, will training stop correctly?
Rllib custom env
Over 5 years ago
If we use frame skipping, does the framework count the number of frames correctly?
For example, if we use frame_skip=2, the number of interactions between the agent and the environment is 8e6/2 = 4e6 when using only 8M frames. If we use the standard configuration, which sets timesteps_total=8000000, will training stop correctly?
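My reading, stated as an assumption rather than fact: if the skipping happens inside an environment wrapper like the sketch below, Tune only counts the wrapper's step() calls, so timesteps_total=8000000 would actually consume 16M raw frames, and an 8M-frame budget would instead need timesteps_total=4000000.
import gym

class FrameSkip(gym.Wrapper):
    # Illustrative wrapper, not the competition's own code: repeat each
    # action for `skip` underlying frames and sum the rewards, so one
    # wrapper step() consumes `skip` raw frames.
    def __init__(self, env, skip=2):
        super().__init__(env)
        self._skip = skip

    def step(self, action):
        total_reward, done, info = 0.0, False, {}
        for _ in range(self._skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, info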
Unusually large tensor in the starter code
Over 5 years ago
Thank you so much, and sorry for the late response.
Unusually large tensor in the starter code
Over 5 years ago
I receive an OOM error when running the starter code. Here's the error message:
== Status ==
Memory usage on this node: 3.6/62.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 7/10 CPUs, 0.9/1 GPUs, 0.0/13.96 GiB heap, 0.0/6.4 GiB objects
Result logdir: /home/aptx4869/ray_results/procgen-ppo
Number of trials: 1 (1 RUNNING)
+--------------------------------+---------+------+
| Trial name                     | status  | loc  |
|--------------------------------+---------+------|
| PPO_procgen_env_wrapper_00000  | RUNNING |      |
+--------------------------------+---------+------+
(pid=5272) 2020-06-24 09:26:36,869 INFO trainer.py:421 -- Tip: set "eager": true or the --eager flag to enable TensorFlow eager execution
(pid=5272) 2020-06-24 09:26:36,870 INFO trainer.py:580 -- Current log_level is WARN. For more information, set "log_level": "INFO" / "DEBUG" or use the -v and -vv flags.
(pid=5272) 2020-06-24 09:26:44,889 INFO trainable.py:217 -- Getting current IP.
(pid=5272) 2020-06-24 09:26:44,889 WARNING util.py:37 -- Install gputil for GPU system monitoring.
Traceback (most recent call last):
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 467, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 431, in fetch_result
result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/worker.py", line 1515, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ResourceExhaustedError): ray::PPO.train() (pid=21577, ip=192.168.1.102)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[2048,32,32,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node default_policy_1/tower_1/gradients_1/default_policy_1/tower_1/model_1/max_pooling2d_1/MaxPool_grad/MaxPoolGrad}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:
ray::PPO.train() (pid=21577, ip=192.168.1.102)
File "python/ray/_raylet.pyx", line 459, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 462, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 463, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 417, in ray._raylet.execute_task.function_executor
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 498, in train
raise e
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 484, in train
result = Trainable.train(self)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/tune/trainable.py", line 261, in train
result = self._train()
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 151, in _train
fetches = self.optimizer.step()
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/rllib/optimizers/multi_gpu_optimizer.py", line 212, in step
self.per_device_batch_size)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/rllib/optimizers/multi_gpu_impl.py", line 257, in optimize
return sess.run(fetches, feed_dict=feed_dict)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 958, in run
run_metadata_ptr)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1181, in _run
feed_dict_tensor, options, run_metadata)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[2048,32,32,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node default_policy_1/tower_1/gradients_1/default_policy_1/tower_1/model_1/max_pooling2d_1/MaxPool_grad/MaxPoolGrad (defined at /.conda/envs/procgen/lib/python3.7/site-packages/ray/rllib/agents/ppo/ppo_tf_policy.py:195) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
…
== Status ==
Memory usage on this node: 17.9/62.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/12 CPUs, 0.0/1 GPUs, 0.0/13.96 GiB heap, 0.0/6.4 GiB objects
Result logdir: /home/aptx4869/ray_results/procgen-ppo
Number of trials: 1 (1 ERROR)
+--------------------------------+---------+------+
| Trial name                     | status  | loc  |
|--------------------------------+---------+------|
| PPO_procgen_env_wrapper_00000  | ERROR   |      |
+--------------------------------+---------+------+
Number of errored trials: 1
+--------------------------------+------------+--------------------------------------------------------------------------------------------------------+
| Trial name                     | # failures | error file                                                                                             |
|--------------------------------+------------+--------------------------------------------------------------------------------------------------------|
| PPO_procgen_env_wrapper_00000  | 1          | /home/aptx4869/ray_results/procgen-ppo/PPO_procgen_env_wrapper_0_2020-06-24_09-18-39m1tpdxon/error.txt |
+--------------------------------+------------+--------------------------------------------------------------------------------------------------------+

== Status ==
Memory usage on this node: 17.9/62.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/12 CPUs, 0.0/1 GPUs, 0.0/13.96 GiB heap, 0.0/6.4 GiB objects
Result logdir: /home/aptx4869/ray_results/procgen-ppo
Number of trials: 1 (1 ERROR)
+--------------------------------+---------+------+
| Trial name                     | status  | loc  |
|--------------------------------+---------+------|
| PPO_procgen_env_wrapper_00000  | ERROR   |      |
+--------------------------------+---------+------+
Number of errored trials: 1
+--------------------------------+------------+--------------------------------------------------------------------------------------------------------+
| Trial name                     | # failures | error file                                                                                             |
|--------------------------------+------------+--------------------------------------------------------------------------------------------------------|
| PPO_procgen_env_wrapper_00000  | 1          | /home/aptx4869/ray_results/procgen-ppo/PPO_procgen_env_wrapper_0_2020-06-24_09-18-39m1tpdxon/error.txt |
+--------------------------------+------------+--------------------------------------------------------------------------------------------------------+

Traceback (most recent call last):
File "train.py", line 235, in <module>
run(args, parser)
File "train.py", line 229, in run
concurrent=True)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/tune/tune.py", line 411, in run_experiments
return_trials=True)
File "/home/aptx4869/.conda/envs/procgen/lib/python3.7/site-packages/ray/tune/tune.py", line 347, in run
raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_procgen_env_wrapper_00000])
It seems the error occurs because of the unusually large tensor with shape [2048, 32, 32, 32], but I have no idea where it comes from. My GPU has 12 GB of memory. The only thing I changed is the run.sh file, in which I increased the memory and number of CPUs used by Ray:
export RAY_MEMORY_LIMIT=15000000000
export RAY_CPUS=12
export RAY_STORE_MEMORY=10000000000
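For context, a back-of-the-envelope reading of the shape, with a possible mitigation; both are my assumptions, not verified against the starter kit. A [2048, 32, 32, 32] float tensor is 2048 x 32 x 32 x 32 x 4 bytes = 256 MiB, several such activations stay alive at once during the backward pass, and the leading 2048 looks like the per-device SGD minibatch passing through a conv layer.
# Hypothetical knobs in the experiment config, assuming standard RLlib
# PPO keys: shrink the minibatch that produces the [2048, ...] activation
# and avoid fractional-GPU tower splits.
config = {
    "sgd_minibatch_size": 1024,
    "num_gpus": 1,
}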
Several questions about the competition
Over 5 years ago
Hi @jyotish and @mohanty. If we set num_levels=0 for training, which configuration should we use for evaluation on our local machines? And I'm wondering whether the difficulty is set to "easy" throughout the competition.
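While waiting for an official answer, here is the setup I would assume, using the public procgen keyword arguments: with num_levels=0 the level space is effectively unbounded, so a freshly seeded environment with the same setting already behaves like a held-out test set.
import gym

train_env = gym.make("procgen:procgen-coinrun-v0",
                     num_levels=0, start_level=0,
                     distribution_mode="easy")  # unlimited training levels
eval_env = gym.make("procgen:procgen-coinrun-v0",
                    num_levels=0, start_level=0,
                    distribution_mode="easy")   # fresh draws act as a held-out set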
Error when downloading dataset
About 5 years ago
Hello,
I'm trying to download the MineRL dataset, but I keep receiving errors like the following one.
Is there another way to download the dataset? By the way, I'm in China.
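For reference, the route I have been trying is the package's own downloader; the target directory below is just a placeholder of mine.
import minerl

# Fetch the demonstration dataset into a local folder; pointing
# MINERL_DATA_ROOT at the same folder lets the data loaders find it.
minerl.data.download(directory="./minerl_data")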