- May 16, 2020
our report will probably follow the lines of the presentation, explaining the motivation of the competition, its organization and the results and main challenges that emerged during the first edition.
Given the page limits (8 pages), I don’t think we can report in detail the architectures of the finalists
as with 1 page each it would take 6 pages just for that - basically a whole article in itself.
While the results were not great from the scoring perspective, it might still be worth writing a report on them. I know one of the finalists will attempt that, given that he accumulated quite some data while trying many approaches. I do not know what the reviewers' take on this will be, though (they did say “outstanding solutions” in the call).
we have received the following Call for Papers on the NeurIPS2019 Competitions.
As organizers, we will certainly submit a report on the competition.
Participants may submit too, as the call says:
"We invite organizers and participants of the competition and demonstration track at
NeurIPS2019 to submit papers to be considered for this volume. Submissions reporting
the design of demos and competitions, results and summaries of challenges, and
outstanding solutions to competitions are encouraged. Submissions on topics related to
the demonstrations and competitions features at NeurIPS2019 can be considered too."
Deadline for paper submission is February 24.
See more details on the pdf of the call here:
here are the answers to the feedback:
The task is far beyond the state of the art.
Indeed the task proved to be too hard in the current version.
There were numerous challenges to be faced (some of them highlighted in the presentation) and all of them had to be tackled together, which then also needed quite a software engineering effort.
Next year, we will probably offer more Rounds with tuned down difficulties at first.
As an example, we might have Rounds where there is only one object on the table or where the positions of the object(s) are available.
You cannot train vision in unsupervised manner when objects are static.
This is true. As I said in the presentation, “To learn to move objects, you need to know they exist. To know that they exist, you need to move them.”.
So you have some kind of a circular problem here, where you first need to learn about objects to learn to control them, but to do that you first need to move them to achieve the required variety of images for your autoencoder, so you would also need control first.
This is an important part of why this competition is relevant for autonomous open-ended learning and this challenge has to be faced.
However, in the next edition, we might have one or more Rounds where this one challenge is avoided (see above suggestion of having positions available).
It is not possible to train visuomotor space when the end-effector is not presented in the visual field during initial steps. You need to wait for convergence that is not granted.
I am not sure if I understood the issue completely (the convergence part) but indeed having a very wide workspace for the arm (partially off-view) was another big issue.
On one side, this contains an issue that we would like to keep (random exploration does not work, you have to devise some intrinsically-motivated focused exploration) but on the other hand we could simplify this a bit.
As an example, next year we might constrain the robot workspace to be more on the table (and in view), at least during the simplified Rounds.
The 2.5D and 3D tasks are not solvable from the top camera view.
Given that the higher part of the table is well confined on the far end of the table, the lack of z-axis/depth from the top camera should not matter
(in the sense that the higher position is still discriminated by its y-position).
On the other hand, I have not yet seen solutions for learning to grasp, the lack of which, independently of having a top camera view, might make the 2.5D and 3D tasks impossible.
The scoring system was poorly designed as it does not reflect the progress in the task.
Yes, this was a critical issue, along with the fact that we did not provide a baseline code for the task.
The score rewarded doing nothing more than doing random or inaccurate movements.
This created a situation where the score actually “punished” systems that were starting to learn something but had not yet achieved a good enough control to exceed the baseline of “doing nothing”.
Being unable to monitor their progress was clearly unrewarding and discouraging for participants, on top of a very hard challenge.
We will definitely change this in the next edition.
The score could be changed in many ways. As an example, one might ignore in the score the objects that are already in place (which might be a bit extreme, as the agent should try not to move them out of place too much), or the score might have multiple numbers tracking separately the objects put in place from out of place and those accidentally put out of place from being in place.
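As a minimal sketch of the second option (separate counts for objects put in place and objects knocked out of place), one could compute something like the following. The function name, the 2D position format and the 10 cm threshold are assumptions made for illustration only; none of them come from real_robots.

```python
import math


def placement_counts(start, final, goal, threshold=0.10):
    """Count objects put in place and objects knocked out of place.

    start, final, goal: dicts mapping object name -> (x, y) position in metres.
    threshold: hypothetical distance below which an object counts as "in place".
    """
    put_in_place = 0
    knocked_out = 0
    for name, goal_pos in goal.items():
        was_in_place = math.dist(start[name], goal_pos) <= threshold
        is_in_place = math.dist(final[name], goal_pos) <= threshold
        if is_in_place and not was_in_place:
            put_in_place += 1
        elif was_in_place and not is_in_place:
            knocked_out += 1
    return put_in_place, knocked_out
```

With two counters like these, an agent that stands still gets (0, 0), so any object actually moved into place shows up as progress instead of being hidden by the objects that were already in place.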
Several solutions can be envisioned, we welcome suggestions on that.
It is not professional to announce the absence of prize money one week before the deadline.
We never wrote or announced anything about prize money simply because there was not going to be any prize money.
Then, a couple of weeks before the deadline another participant asked if perhaps there was going to be any and we simply answered that there wasn’t.
We’re sorry if you have participated with the assumption that there was going to be money prizes, it must have been very disappointing in the end.
Next year we hope to have sponsors so that we can offer at least travel grants.
Talking about rewards (for research groups at least), I have received confirmation that there will be a volume of Proceedings of Machine Learning Research (http://proceedings.mlr.press/) associated with the NeurIPS competitions and, as I understand it, participants might submit an article with their solutions there.
I do not know the details of how this will work exactly, I will let you know as soon as I receive more info from NeurIPS organization.
Did you already finish the analysis of the architectures? When do you plan to publish the final leaderboard?
Yes, we have checked the submissions and none was excluded, so the final leaderboard is the one in the presentation (i.e. the same as the one on the website except my submission which did not count).
[…] In the background of this competition, angle control made the problem unnecessarily difficult.
Clarify which kind of information can be used in the training phase or not (ex: end-effector position)
While dealing with angle control and doing the appropriate transformations needed for the task (maybe learning them) is an interesting problem, I agree that in the context of the many challenges we have to face to solve this competition, it is… well, too much (i.e. I agree it is unnecessarily difficult).
Next year we could go with actions as end-effector positions, or maybe just explicitly allow pybullet’s cartesian control (or other end-effector control) to be used, leaving it to participants to choose whether they want control to be in angular space or to simplify it with an intermediate translation (engineered, not learned).
Reduce the action space size (# of joint, limit the range, limit the speed)
As replied above, we might have one or more Rounds where the arm workspace is limited so that initial contacts with the objects can occur more often.
The speed also has to be reduced somehow, because with the current setup the arm can go very very fast (to give a sense of the scale: the timesteps are 5 milliseconds!) and this causes the objects to “fly around” and display behavior that is hard to reproduce.
The solution is actually to move the arm more slowly, but this has the side-effect of having to wait even more to get some results.
Given that the simulation already takes quite some time, a different solution has to be found.
One possibility is to modify the environment so that internally it still runs 5 ms timesteps (to keep the simulation accurate) but the camera rendering and the gym steps happen only every 20 ms, so that the simulation runs faster and we can afford to move the arm slowly (possibly even enforcing some kind of maximum speed).
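The idea of keeping 5 ms physics steps while rendering only every 20 ms can be sketched as below. This is only an illustration: ToySim is a stand-in that counts calls, not pybullet or the real environment, and the class and method names are assumptions.

```python
class ToySim:
    """Stand-in for a physics simulator (hypothetical, not pybullet)."""

    def __init__(self):
        self.physics_steps = 0
        self.renders = 0

    def physics_step(self, action):
        self.physics_steps += 1

    def render(self):
        self.renders += 1
        return "image"


class SubsampledEnv:
    PHYSICS_DT = 0.005  # 5 ms internal timestep, as in the current setup
    CONTROL_DT = 0.020  # 20 ms between agent steps (the proposed value)

    def __init__(self, sim):
        self.sim = sim
        # Number of physics steps run for each agent step (4 here).
        self.substeps = round(self.CONTROL_DT / self.PHYSICS_DT)

    def step(self, action):
        # Advance the physics at full resolution...
        for _ in range(self.substeps):
            self.sim.physics_step(action)
        # ...but render the camera only once per agent step.
        return self.sim.render()
```

Since the camera is rendered once per four physics steps, the rendering cost per simulated second drops to a quarter, while the accuracy of the physics is unchanged.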
increase/decrease the number of objects (that don’t affect the main goal: autonomous interact)
Agreed, we might have simplified Rounds with one object (see above).
Lack of a baseline (for tasks in this environment, without the intrinsic setting).
Long development-testing cycles, in part because of the computationally expensive simulation of the environment and lack of parallelization.
Engineering challenges: complex setup with many modules required.
As mentioned above, I think that the lack of a baseline was one of the main issues, and it acted as a big barrier to getting more people involved (along with the score).
Our aim for next edition is to provide a baseline code, so that it will be easier for new participants to start to try new things and see how they affect the final result, without having to pay a big cost in terms of software engineering first.
The baseline code should also provide a way to parallelize some of the computations.
As an example, in the presentation I divided the challenge into three parts: exploration, object recognition and (extrinsic) control.
Each of these parts could be a module, and each module needs to be trained on data.
This training can actually happen in parallel while the robot is interacting with the environment: for example, one might want to train an autoencoder for the object recognition part every 10000 steps; instead of stopping the interaction every 10k steps and waiting for the autoencoder training to end, the robot could go on with the interaction (maybe up to a maximum number of timesteps) and use the newly trained autoencoder only when it is ready.
The baseline should provide architectural solutions to achieve module parallelization without having each participant code this from scratch.
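A minimal sketch of this pattern, using only the Python standard library, could look like the following. The train_autoencoder function is a placeholder and the buffer of integers stands in for observations; both are assumptions for illustration, not part of any baseline.

```python
from concurrent.futures import ThreadPoolExecutor


def train_autoencoder(batch):
    """Placeholder for a slow training job; returns a 'model' (hypothetical)."""
    return {"trained_on": len(batch)}


def interact(total_steps, train_every, executor):
    model = None    # model currently used by the controller
    pending = None  # training job running in the background, if any
    buffer = []
    swaps = 0
    for step in range(1, total_steps + 1):
        buffer.append(step)  # stand-in for storing an observation
        # Adopt the new model as soon as the background job is done.
        if pending is not None and pending.done():
            model = pending.result()
            pending = None
            swaps += 1
        # Launch training without blocking the interaction loop.
        if step % train_every == 0 and pending is None:
            pending = executor.submit(train_autoencoder, list(buffer))
    # Collect the last job, if one is still running.
    if pending is not None:
        model = pending.result()
        swaps += 1
    return model, swaps
```

It might be used as `with ThreadPoolExecutor(max_workers=1) as ex: model, swaps = interact(30000, 10000, ex)`. Threads are enough when the training library releases the interpreter lock during heavy computation; for pure Python training code a process pool would probably be needed instead.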
I want to thank again all of you for your amazing effort and your detailed and constructive feedback.
Overall, despite the flaws highlighted above, the competition received a lot of positive comments and gained attention, also during the NeurIPS conference.
The issue at hand, autonomous open-ended learning, applied to a robotic scenario, is certainly an interesting, important and challenging one.
I hope you will join us again for the next edition next year and we are open to more discussion and suggestions for improvement!
here is the video of the presentation of REAL Competition at the Deep RL workshop @ Neurips2019.
Our presentation starts around 29:30.
Thanks everyone for the detailed feedback!
We had the presentation this morning at the Competition Track in NeurIPS.
I think it went well, with quite a few questions in the end.
Unfortunately, the presentation was not recorded due to a logistical problem.
However, tomorrow we will present REAL again at the Deep RL workshop and the video staff told us that there should be recording there as well (hopefully).
Here is the presentation:
I will also reply later to all the feedback you sent us (probably on Monday), so that we can improve the competition for next year's edition. As you will see, some of the feedback has been incorporated into the presentation as well.
Dear participant of the REAL competition,
We are writing you as you participated in the second final round of the competition.
We, the organisers of the competition, are now at NeurIPS 2019 conference and would like to announce the outcome of the competition.
We are at the moment analysing the results of the second round and in particular if the participating Teams met the spirit of the competition before officially declaring the outcome.
Next Friday (13/12), we will present the outcome of the competition at NeurIPS, and on that occasion we would like to mention the approaches used by the 6 Teams that made it to the second part of the competition.
If you would like that we mention your approach, we thus ask you to urgently send us:
A few sentences on the approach you used
The main challenges that, according to your view, the competition involved
And if you like (we will announce this publicly):
- The names of the people of your Team
- Their affiliation
Emilio Cartoni (firstname.lastname@example.org)
Gianluca Baldassarre (email@example.com)
I sent you an email a few days ago, reposting it here as you might have missed it
all submissions have completed, except one which is stuck:
Can you check it (and restart I guess), please?
Also, I have noticed that we have 9 VM instances running for only 1 active submission, and at least 4 of them seem to be from the submissions pool (VMs with V100) so there might be some stuck instance too (i.e. VMs that did not properly shutdown after evaluation finished).
I am not familiar with mpi.set_start_method, but I see that the reason to use it in main is to ensure it is called only once.
So you might have a workaround by using an empty file as a lock.
i.e. before you invoke set_start_method you check if the file is present, if it is not, you create it and use set_start_method; otherwise, if the file is present, another process has already called set_start_method.
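The lock-file check described above could be sketched like this. The helper name run_once and the lock path are assumptions; the sketch uses os.open with O_CREAT | O_EXCL so that checking for the file and creating it happen in a single atomic operation, which avoids two processes both finding the file missing.

```python
import os


def run_once(lock_path, action):
    """Run `action` only in the first process that creates `lock_path`.

    os.O_CREAT | os.O_EXCL makes the creation atomic: if the file already
    exists, os.open raises FileExistsError instead of opening it.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another process already did the one-time setup
    os.close(fd)
    action()
    return True
```

The one-time call would then be passed in as `action`, for example a small function that invokes set_start_method. The lock file has to be deleted before the next run, otherwise the setup would be skipped there too.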
Even we can use multiprocessing, real_robots (0.1.16) never calls end_extrinsic_phase() method, so we have no idea when to terminate our subprocesses. (Maybe bug?)
Yes, it is a bug. I see now that at the end of the extrinsic phase, end_extrinsic_trial() is called again instead of end_extrinsic_phase().
It is too late now to release a fix (it might be disruptive with just a few hours for the final submissions), however you can still catch the end of the extrinsic_phase by detecting when end_extrinsic_trial is called two times in a row
The controller code might be like this:
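    def start_extrinsic_trial(self):
        # A new trial has started: reset the count of consecutive end calls.
        self.trial_ends = 0

    def end_extrinsic_trial(self):
        self.trial_ends += 1
        # Two end calls in a row mean that the extrinsic phase is over.
        if self.trial_ends > 1:
            self.end_extrinsic_phase()

    def end_extrinsic_phase(self):
        print("Extrinsic phase has ended!")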
Maybe it would be reasonable to make Round 1 rules the final rules as the problem was already challenging enough (e.g. no one gets better than “Doing nothing” policy).
Indeed the challenge proved to be too hard.
However, changing the rules now and allowing any solution would not be fair to those who have been trying to succeed while following the spirit of Round 2.
Instead, we can keep two classifications and allow people to submit both solutions that follow Round 2 rules and those which would be valid only under Round 1 rules.
A single team may have different submissions following either ruleset and will be ranked in both.
Also, it would be beneficial to know the prizes for winning places.
Initially, we had planned to have some travel grants to NeurIPS as prizes but we weren’t able to provide them in the end.
So the main prizes are the recognition and glory
Talking about recognition, we will have a 20-minute presentation of the competition @ NeurIPS on December 13th.
If nobody gets any significant result (i.e. above the no-action baseline) I will probably focus on the challenges of the competition and how hard it is, mentioning some possible approaches.
However, if we get a result above the baseline, I will devote a part of the presentation to that winning approach above the baseline (with preference to those who followed Round 2 rules).
If the authors of that solution are present at NeurIPS and are willing to explain their solution in person, I think we may be able to arrange for them to present that part.
We have been told that there will be a volume of Proceedings of Machine Learning Research (http://proceedings.mlr.press/) associated to NeurIPS competitions, so we will recognize notable solutions in there as well when describing the results of our competition.
@tky, I did run the simulation up to 10M steps on a 64GB RAM machine many weeks ago.
I didn’t notice if the simulation itself had increasing memory usage, however I did notice that if you store all the observations (images in particular) you are likely to have memory problems.
As an example, retina images for just 100k steps result in a 23GB file when saved as .npy.
Contacts and joints are easier to handle but for 10M steps still result in 720MB and 320MB files.
Did you run just RandomPolicy as it is, without saving anything?
(i.e. just downloaded the repository and ran the local evaluation for 10M)
EDIT: I have launched now a 10M local evaluation with a new copy of the repository… I will let you know how it goes.
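As a rough check of the storage figures above, the 23GB for 100k steps is what one would expect from uncompressed 8-bit RGB images. The sketch below assumes a 240 x 320 x 3 retina image with one byte per value; that shape is an assumption used for the arithmetic and is not stated in this thread.

```python
def npy_size_gb(n_steps, shape, bytes_per_value=1):
    """Approximate size in GB (10**9 bytes) of an uncompressed array."""
    values_per_step = 1
    for dim in shape:
        values_per_step *= dim
    return n_steps * values_per_step * bytes_per_value / 1e9


# Assumed retina shape: 240 x 320 RGB, one byte per channel.
print(npy_size_gb(100_000, (240, 320, 3)))     # 23.04
print(npy_size_gb(10_000_000, (240, 320, 3)))  # 2304.0
```

Under that assumption, keeping every image of a 10M-step run would take over 2 TB, so images would have to be subsampled, compressed or discarded after use.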
On the amount of timesteps of the intrinsic phase: yes, 10M is the correct figure.
We gave quite some time to the agents to try things and learn
An agent which does nothing is evaluated at about 50 steps/s, so it takes 2 days and 8 hours just to run the environment for the intrinsic phase.
Note that the environment uses only 1 CPU and basically no GPU, so if one makes computations in parallel (e.g. updating/training neural networks) there’s a lot of room (7 CPUs and the GPU) to spare.
If training is not done in parallel, yes, it could take a week.
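The timing above follows directly from the step count and the measured rate, as in this small calculation (the one-week rate at the end is just the same arithmetic run backwards):

```python
from datetime import timedelta

steps = 10_000_000     # length of the intrinsic phase
steps_per_second = 50  # rate measured for an agent that does nothing

duration = timedelta(seconds=steps / steps_per_second)
print(duration)  # 2 days, 7:33:20

# Average rate at which the whole phase would take one week.
week_rate = steps / timedelta(days=7).total_seconds()
print(round(week_rate, 1))  # 16.5
```

In other words, if computation done in series slows the loop down from 50 to about 16.5 steps/s on average, the intrinsic phase stretches from a bit more than two days to a full week.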
Round 1 has been completed.
To compete in Round 2, the requirement was to be in the Top 20 with a valid submission.
Given that we have fewer than 20 teams on the leaderboard, I am glad to announce that everyone who has submitted can now participate in Round 2!
Final leaderboard for Round 1:
On the other hand, we expected a wider participation (given that 158 people subscribed to the competition), so we will probably send a survey to all subscribers to understand why only a few subscriptions turned into submissions.
One of the main reasons is probably that this challenge is very hard and complex to tackle!
There are many problems to face at once, such as:
- Learning to properly abstract the environment
i.e. recognizing objects in some way
- Building models of the world
i.e. learning how the state of the environment evolves by acting on it
- Learning reliable skills
i.e. learning appropriate actions that consistently achieve certain states
i.e. how to reach extrinsic goals by applying the learned skills
… and all these pieces of knowledge interact with each other and must be learned together.
Indeed so far the highest score (0.235) has been achieved by controllers which just stand still, since in many of the goals that we test it is advantageous to just leave objects as they are rather than pushing them around randomly.
An agent doing random movements scores about 0.100 instead.
I would like to make a mention here for teams AutoLearingMPI and isi who got their scores without resorting to these two baseline controllers.
The isi team also got up to 0.134 in their latest submission (better than random), although it was submitted after the deadline had expired.
Moving on to Round 2!
So we have one month left to improve our controllers and beat both the random and the static controllers!
What are your expectations?
Will you beat the random and static controller?
What have been the main challenges for your team so far?
Round 2 will soon open.
Good luck to everyone for this last month of the challenge!
Was it planned that the object can go inside the shelf and never go back?
Do you have more details on when it occurred? (to try to reproduce it)
Can we compute end-effector position from joint angles as part of the state-representation?
Well, if you only use joint angles and knowledge of the robot structure, you are not giving it any information about the environment (in the sense of the environment outside the robot itself), nor are you giving it information about the task, so I think this can be allowed.
(A stricter interpretation might include the robot itself as part of the environment and thus part of the task but…that would be harsh :))
Also, is it ok to sample goals from representation space (“learned” with CV for Round 1 and fully unsupervisedly learned for Round 2)? I’m asking because the word “explicitly” is not completely clear to me in this quote:
during the intrinsic phase the robot is not explicitly given any task to learn and it does not know of the future extrinsic tasks,
Yes, it is ok.
Indeed one of the articles we suggested (Visual Reinforcement Learning with Imagined Goals) samples goals from the learned representation space.
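To give an idea of what sampling goals from a representation space can look like, here is a deliberately simplified sketch. It fits an independent Gaussian to each dimension of the latent codes collected so far and samples a goal from it. This is a stand-in chosen for brevity, not the method of the article, and the function names are assumptions.

```python
import random
import statistics


def fit_diagonal_gaussian(latents):
    """Per-dimension mean and standard deviation of the observed latent codes."""
    dims = list(zip(*latents))
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return means, stds


def sample_goal(means, stds, rng=random):
    """Draw one goal in latent space from the fitted distribution."""
    return [rng.gauss(m, s) for m, s in zip(means, stds)]
```

The latent codes would come from whatever encoder the agent learned during the intrinsic phase, so no information about the extrinsic tasks is involved.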
I think you can read that quote without the word explicitly.
to facilitate some aspects of the competition we are going to upload some scripts
in the Resource section of the challenge page.
Today we have uploaded a script collect_data.py and two zip files with data.
The script collects observations from the environment for a certain number of steps
following a “RandomPolicy” policy and then saves them into 4 different files
(one for each type of observation: joints, contacts, retina images plus a file
for all the actions done).
One of the zip files (2.55GB) contains data from running the script for 100k steps.
When unzipped, it produces 23GB of data (mostly images).
The other zip file contains data from running the script for 10M steps.
When unzipped, it produces about 1.7 GB of data (without images).
New participants may find these files especially useful to get a feel for what the
data might look like when interacting with the environment (randomly) and to run
some training tests of neural networks or other approaches.
given that submissions were opened a lot later than initially planned, and given that activity has only recently been picking up, we have decided to extend Round 1 up to October 25th.
The deadline for Round 2 has also been moved back to November 25th.
- 25th October 2019 - End of Round 1
- 25th November 2019 - End of Round 2.
- 6th December, 2019 - Competition results are announced.
- 8-14th December - Competition results are presented at NeurIPS.
" we might consider moving the deadline(s) a bit given that submissions were opened a lot later than initially expected and activity is picking up now."
When should we expect the decision regarding this point?
I will have a meeting tomorrow with the other organizers, so most likely tomorrow evening.
the deadline is still Sep 30; the announcement quoted by BrandonHoughton belongs to another competition (MineRL).
So yes, the timer is not correct, as it shows the days to the end of Round 2.
On the other hand, we might consider moving the deadline(s) a bit given that submissions were opened a lot later than initially expected and activity is picking up now.
no, the score is the average of the three scores.
Some submissions have a 0 score because they are “debug” submissions where only a few trials of 2D goals are run and then the score is automatically set to 0.
See more about debug submissions at https://github.com/AIcrowd/neurips_goal_real_robots_starter_kit
- Yes, you can read files.
- Yes, I think the current time limit is set to 12 hours for the extrinsic phase (Round 1), 60 hours for the intrinsic + extrinsic phase combined (Round 2).
You can pause between steps to do processing.
- Running the environment step alone takes a substantial amount of time, so it is a very good idea to do parallel processing and train a neural network (or do other computations) in the meantime.
- We currently launch on a machine with 4 CPUs, 26 GB RAM and a K80 GPU (Google’s n1-standard-4 machine + GPU).
We plan to up that to 8 CPU, 30GB, V100 GPU machine for Round 2 (Google’s n1-standard-8 machine + GPU).
(We might increase to that already for Round 1 but I’d have to check).
to participate in Round 2 you have to be ranked in the Top 20 (not the Top 10) of Round 1 with an algorithm complying with the spirit of the rules.
You can have multiple submissions, some complying and some not complying with the spirit of the rules. Only those complying are valid to enter Top 20.
the environment should be used “as it is” to stay within the spirit of the rules.
However, given the difficulty of the challenge, we allow some exceptions for Round 1.
As long as the Golden Rule is not violated (see Rules), all submissions will be considered valid and ranked for Round 1.
However, only submissions fully complying with the spirit of the rules will access Round 2 and take part to the final ranking.
It is possible to submit multiple submissions on Round 1.
So, for your questions:
- No resets allowed. The environment should be kept as it is. The only “reset” available is that objects which go out of bounds are automatically placed back on the table.
- In theory, the agent should learn within 10M steps. While this is not checked by the automatic evaluation in Round 1, an agent which needs 1000M steps to learn something will probably fail Round 2.
- There are no episodes (at least from an external point of view - the agent can split the 10M step experience as it wishes).
- No, except for debugging purposes (i.e. not when submitting).
- No. But I would like to mention that the real_robots package contains a second environment with only the cube in it as a way to simplify things while debugging your algorithm.
Use env = gym.make('REALRobotSingleObj-v0') to create it.
- Given the above, it is not possible to make a curriculum by progressively increasing the difficulty of the environment.
It is possible of course for the agent to build the curriculum by itself by focusing on some aspects first.
i.e. it would be fine if the agent concentrates on controlling its arm first.
- In general, no. This is because one of the difficulties of open-ended autonomous learning is dealing with large action spaces which often contain only a very “tiny subspace” of useful action for the task at hand.
So when you restrict the action space you sidestep a part of this challenge and also indirectly give away some information about the task which the agent is not supposed to have.
However, one thing we noticed is that the environment is hard to predict when the arm moves at full speed, since very small differences in the starting position result in big differences in the outcome of collisions with objects.
Due to this we may allow restrictions on the speed of the robot in both Round 1 and 2, but not restrictions on the range of movements (i.e. restricting the joints so that the arm is always over the table).
You may still restrict action space to make a Round 1 only submission (as explained above).
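A speed restriction of the kind mentioned above, which does not restrict the range of movements, could be sketched as follows. It assumes that actions are target joint angles in radians, and the function name and the max_delta parameter are hypothetical.

```python
def limit_speed(current, target, max_delta):
    """Move each joint towards `target` by at most `max_delta` radians per step."""
    limited = []
    for cur, tgt in zip(current, target):
        # Clip the requested change, not the target itself, so that every
        # position stays reachable: it just takes more steps to get there.
        step = max(-max_delta, min(max_delta, tgt - cur))
        limited.append(cur + step)
    return limited
```

For example, `limit_speed([0.0, 0.0], [1.0, -0.01], 0.05)` gives `[0.05, -0.01]`: the first joint is slowed down, the second reaches its target immediately.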
See also: Use of Pre-Trained Models
Have a look at this example, which visualizes in a GUI the effects of the 8th and 9th components:

    import numpy as np
    import real_robots
    import gym

    env = gym.make('REALRobot-v0')
    env.render('human')
    env.reset()

    # Using the 4th component (index 3) to put the arm in better view
    for _ in range(100):
        obs, _, _, _ = env.step(np.array([0, 0, 0, -1.5, 0, 0, 0, 0, 0]))

    # The 8th component moves the bottom part of both fingers at once
    # from 0 to 90 degrees.
    # See the gripper opening keeping its fingers symmetrical.
    for _ in range(100):
        obs, _, _, _ = env.step(np.array([0, 0, 0, -1.5, 0, 0, 0, np.pi/2, 0]))

    # The 9th component moves the upper part of both fingers at once.
    # See the gripper widening by moving the upper parts of the fingers.
    for _ in range(100):
        obs, _, _, _ = env.step(np.array([0, 0, 0, -1.5, 0, 0, 0, np.pi/2, np.pi/2]))

    # The 9th component can go from 0 to half of what the 8th component is.
    # Now see that while the gripper closes (8th component to zero), both
    # the upper and the bottom part of the fingers close.
    # This is because the 9th component is automatically reduced to its maximum
    # value, i.e. half of the current 8th component.
    for _ in range(100):
        obs, _, _, _ = env.step(np.array([0, 0, 0, -1.5, 0, 0, 0, 0, np.pi/2]))
each of the first 7 components acts on a single joint of the arm.
The remaining two components act on the two-finger gripper and they move it so that the fingers keep a symmetrical position.
The 8th component moves the bottom part of both fingers at once from 0 to 90 degrees.
The 9th component moves the upper part of both fingers at once - it can go from 0 to half of what the 8th component is.
This somewhat simplifies gripping for an agent that starts learning with random movements, since the gripper will always behave in a coherent fashion that usually permits grasping objects.
On the other hand, some degrees of freedom are taken away since fingers cannot be moved independently any more.
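The coupling between the two gripper components can be written down as a small function. This is only my reading of the description above (8th component between 0 and 90 degrees, 9th component between 0 and half of the 8th), not the code used inside real_robots, and the function name is an assumption.

```python
import math


def effective_gripper(c8, c9):
    """Return the gripper angles actually reached for commands c8 and c9.

    c8 (bottom part of the fingers) is limited to [0, pi/2];
    c9 (upper part of the fingers) is limited to [0, c8 / 2].
    """
    c8 = max(0.0, min(math.pi / 2, c8))
    c9 = max(0.0, min(c8 / 2, c9))
    return c8, c9
```

So a command of (pi/2, pi/2) would result in (pi/2, pi/4), and closing the 8th component to 0 forces the 9th to 0 as well, whatever value is requested for it.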
I would like to add that the competition is proving very hard at the moment, so even having a good score on Round 1 using pre-trained nets could be a very worthy submission.
To advance to Round 2 the pre-trained net should be then substituted by something else though, e.g. some autoencoder trained during the intrinsic phase.
from the rules:
Participants should give as little information as possible to the robot, rather the system should learn from scratch […] given the difficulty of the competition and the many challenges that it contains and to encourage a wide participation, in Round 1 it will be possible to violate in part the aspects of the spirit of the competition, except the Golden Rule above. For example, it will be possible to use hardwired or pre-trained models for recognising the identity of objects and their position in space.
So you can use them for Round 1 but the submission wouldn’t be valid to advance to Round 2.
Sorry for the long wait.
Submissions are opening…now!
I am happy to announce that the submissions for the NeurIPS 2019 - Robot open-Ended Autonomous Learning challenge are finally open!
For the instructions on how to make a submission, go to the new starting kit repository:
You will have to clone that repository and put your controller there to make a submission.
The environment is now installed separately from the starting kit by doing:
pip install real_robots
For any issues about the submissions, please open issues here:
For any issues about the environment:
Feel free also to post here any questions you have.
I’m looking forward to all your submissions
we have just updated the competition page.
In particular, we added to the Rules:
Competition structure The competition will be divided into two rounds.
- Round 1: During the first round, submissions will be evaluated by running only the extrinsic phase. Participants will have to pre-train their robot controllers on their machines before submission. Top 20 ranked participants whose submissions follow the spirit of the rules will be able to participate to Round 2 (see also Spirit of the Rules and Code inspection below).
- Round 2: during the second round, submissions will be evaluated by running both the intrinsic and extrinsic phase. All final submissions will be checked for coherence with the spirit of the rules.
Spirit of the rules As also explained above, the spirit of the rules is that during the intrinsic phase the robot is not explicitly given any task to learn and it does not know of the future extrinsic tasks, but it rather learns in a fully autonomous way.
As such, the Golden Rule is that it is explicitly forbidden to use the scoring function of the extrinsic phase or variants of it as a reward function to train the agent. Participants should give as little information as possible to the robot, rather the system should learn from scratch to interact with the objects using curiosity, intrinsic motivations, self-generated goals, etc.
However, given the difficulty of the competition and the many challenges that it contains and to encourage a wide participation, in Round 1 it will be possible to violate in part the aspects of the spirit of the competition, except the Golden Rule above. For example, it will be possible to use hardwired or pre-trained models for recognising the identity of objects and their position in space. All submissions, except those violating the Golden Rule, will be considered valid and ranked for Round 1. However, only submissions fully complying with the spirit of the rules will access Round 2 and take part to the final ranking.
We are still working on opening online submissions - it will take us a few more days still.
We have also extended the “Beta” period to 31st July, to allow some more feedback on the Rules and the Starting Kit.
We are still a bit behind with some developments, so submissions won’t start yet.
Hopefully we will be able to enable submissions early next week.
Meanwhile, the Starting Kit has been updated.
Main changes have been:
change orange into a cube
We replaced the orange object with a cube object - this will simplify things as the cube doesn’t roll around.
We have updated the goal dataset, which now contains 350 goals.
There are 150 “2D” goals, 150 “2.5D” goals and 50 “3D” goals.
The “2D” goals involve only moving objects on the first part of the table, while “2.5D” goals require moving objects from and to the shelf.
The first two types of goals can each be split into 3 categories, depending on whether 1, 2 or 3 objects change position between the starting state and the goal state.
Objects in “2D” and “2.5D” goals always have a fixed orientation and are kept at a minimum distance (15 cm) from each other.
The “3D” goals are the most general (and challenging) type of goals: objects can have any orientation and they can be in any part of the table, even on top of each other.
updated scoring function
We have updated the scoring function for the extrinsic phase.
The score is 1 if each object is positioned as shown in the goal image, and it decreases exponentially towards 0 the further the object is from its goal position.
In 3D goals, the orientation also matters.
For each goal, the score is the average of the score for each object.
The final score is the average of the scores obtained on each type of goal (where the score for each type is the average over all goals of that type).
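To make the averaging order concrete, here is a minimal sketch of how such a score could be computed. It is not the Starting Kit's actual implementation: the function names, the `DECAY` constant and the use of a plain distance are assumptions made for illustration, and the orientation term used for 3D goals is left out.

```python
import math
from statistics import mean

# Hypothetical decay length (in metres). The announcement only says the score
# decreases exponentially with distance; the actual constant is not stated.
DECAY = 0.1


def object_score(distance, decay=DECAY):
    """Score for one object: 1.0 at the goal position, tending to 0 far from it."""
    return math.exp(-distance / decay)


def goal_score(distances):
    """Score for one goal: the average of the per-object scores."""
    return mean(object_score(d) for d in distances)


def final_score(goals_by_type):
    """Average of the per-type scores.

    goals_by_type maps a goal type (e.g. "2D") to a list of goals, where each
    goal is a list of object-to-goal distances (an assumed representation).
    """
    type_scores = [
        mean(goal_score(goal) for goal in goals)
        for goals in goals_by_type.values()
    ]
    return mean(type_scores)
```

Note that, because the per-type averages are taken first, the 50 “3D” goals weigh as much in the final score as the 150 “2D” goals.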
increased extrinsic trial length
We have increased the time available to achieve each goal from 1000 to 2000 timesteps.
Notice that the extrinsic phase is now very long (350 goals x 2000 timesteps = 700k timesteps!).
When testing your model locally you may want to modify demo.py so that only a subset of the goals is tested (e.g. only 2D goals or only 10 goals from each type).
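As a rough illustration of how much a subset shortens a local run, the sketch below computes the extrinsic-phase length and samples a fixed number of goals per type. It does not reflect the real structure of demo.py or of the goal dataset: the helper names and the dictionary-of-lists goal representation are assumptions.

```python
import random

TIMESTEPS_PER_GOAL = 2000
# Goal counts per type, as given in this announcement.
GOALS_PER_TYPE = {"2D": 150, "2.5D": 150, "3D": 50}


def extrinsic_timesteps(goal_counts, timesteps_per_goal=TIMESTEPS_PER_GOAL):
    """Total simulation timesteps needed to test the given number of goals."""
    return sum(goal_counts.values()) * timesteps_per_goal


def sample_goals(goals_by_type, per_type, seed=0):
    """Pick up to `per_type` goals from each type, reproducibly.

    goals_by_type maps a goal type to a list of goals (assumed representation).
    """
    rng = random.Random(seed)
    return {
        goal_type: rng.sample(goals, min(per_type, len(goals)))
        for goal_type, goals in goals_by_type.items()
    }


if __name__ == "__main__":
    # Placeholder goals (plain indices) standing in for the real dataset.
    all_goals = {t: list(range(n)) for t, n in GOALS_PER_TYPE.items()}
    subset = sample_goals(all_goals, per_type=10)
    print(extrinsic_timesteps(GOALS_PER_TYPE))
    print(extrinsic_timesteps({t: len(g) for t, g in subset.items()}))
```

With 10 goals from each type, the local test covers 30 goals instead of 350.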
This file is identical to demo.py, but it has a length 0 intrinsic phase.
This is useful to test an already trained controller and it will be used to test submissions during Round 1 of the competition (see below).
fixed touch sensors
There was a bug that prevented touch sensors beyond the first from activating.
More announcements to follow on Monday
Beta phase extended to July 19th
We collected some nice feedback on the competition from participants during the IMOL 2019 workshop and the related Summer School, where the REAL Challenge was presented.
We are still implementing that feedback, so expect the Starting Kit and Rules to undergo some changes in the next few days and to be “finalized” by July 19th.
Online Evaluations starting on July 19th
The Starting Kit now provides a score when you run demo.py, so you can start to evaluate your solutions locally on your computer (2D tasks only at the moment).
Unfortunately, due to some unexpected circumstances, we haven’t been able to start the online evaluations yet.
We expect the online submission procedure to be available by July 19th.
I would also like to take this occasion to publicly thank Google, which is going to provide the computational resources for the evaluations through its GCP research credits program.
Welcome to the NeurIPS 2019 - Robot open-Ended Autonomous Learning!
This is the place to post any questions you have about the competition, and also to discuss your results and the best approaches to the challenge with other participants.
Questions about the starting kit, rules, evaluation metrics, etc. are all on-topic.
Details about the challenge can be found at the REAL competition page.
If you have technical difficulties running the Starting Kit, you can also post on the issues page on GitHub.