Hello, I do not understand how the ranking works in this challenge.
The last submission I made had an F1 score of 0.153, yet it was “graded with a score of 0.0”.
I was also “congratulated for my first submission”, but it was actually my sixth.
Do I need to select a submission for the ranking? Is there a bug? Or is the F1 score simply irrelevant for the ranking?
My idea is to build something like an “assisted AI”. Instead of waiting millions of steps for the agent to learn not to climb the entrance stairs, eat kobold corpses, or walk into walls, I would like the actor to choose only among not-obviously-stupid actions.
To do this, I would like to attach a state to the actor or the environment (where I would store things like “the stairs down are there even if a bow is dropped on them”, the intrinsics, etc.) and alter the “call to inference”. Obviously this prevents the model from learning about some actions (for example, when asked to answer y/n at “Beware, there will be no return”, I force Direction.SE while the actor would happily sample randomly over the action space), but I am not sure there is any value, in this case, in letting it try something else.
So, both while training and while testing, I want to be able to “help choose sensible options”.
Note that while training, overriding the environment’s “step()” function does not do what I want, because I want the model to learn the forced action, not the original one.
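To make the point concrete, here is a minimal sketch of the kind of wrapper I have in mind. Everything here is illustrative: `ToyEnv` is a stand-in for the real NLE environment, `forced_action_for` is a hypothetical helper, and the key point is only that `step()` reports the action actually executed so the training loop can store *that* one in the rollout, not the action the policy sampled.

```python
class ToyEnv:
    """Stand-in for the real NLE environment (not the NLE API)."""
    def reset(self):
        return {"obs": 0}

    def step(self, action):
        # reward action 3 just so the example does something observable
        return {"obs": action}, float(action == 3), False, {}

class ForcedActionWrapper:
    """Substitutes a forced action when the assistant state dictates one.

    step() puts the action actually taken into the info dict, so the
    learner can record the forced action instead of the sampled one.
    """
    def __init__(self, env, forced_action_for):
        self.env = env
        # forced_action_for: callable, obs -> forced action or None
        self.forced_action_for = forced_action_for
        self._last_obs = None

    def reset(self):
        self._last_obs = self.env.reset()
        return self._last_obs

    def step(self, action):
        forced = self.forced_action_for(self._last_obs)
        taken = forced if forced is not None else action
        obs, reward, done, info = self.env.step(taken)
        info["action_taken"] = taken  # the learner should train on this
        self._last_obs = obs
        return obs, reward, done, info

# usage: force action 3 on the initial observation, otherwise pass through
env = ForcedActionWrapper(ToyEnv(), lambda obs: 3 if obs["obs"] == 0 else None)
obs = env.reset()
obs, reward, done, info = env.step(7)  # policy sampled 7, wrapper forces 3
```

In a real setup this would subclass `gym.Wrapper`, but the substitution logic would be the same.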
I understand I should use a gym wrapper around the environment; this seems to be an allowed method. I still have to check whether this lets me add one more key to the observations (for example a UUID computed at reset() time).
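A sketch of that idea, under the assumption that observations are plain dicts (again, `ToyEnv` is a stand-in, not the NLE API). I keep the id as an integer so it could in principle be packed into an observation tensor later:

```python
import uuid

class ToyEnv:
    """Stand-in for the real NLE environment (not the NLE API)."""
    def reset(self):
        return {"obs": 0}

    def step(self, action):
        return {"obs": action}, 0.0, False, {}

class EpisodeIdWrapper:
    """Adds an 'episode_uid' key to every observation dict.

    The uid is regenerated at reset(), so each running episode (and thus
    each actor) can be identified later, e.g. inside batched inference.
    """
    def __init__(self, env):
        self.env = env
        self._uid = None

    def reset(self):
        # integer id, so it could be stored in a tensor if needed
        self._uid = uuid.uuid4().int & (2**31 - 1)
        obs = self.env.reset()
        obs["episode_uid"] = self._uid
        return obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        obs["episode_uid"] = self._uid
        return obs, reward, done, info

# usage
env = EpisodeIdWrapper(ToyEnv())
first = env.reset()
stepped, _, _, _ = env.step(1)
```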
The screen_descriptions observation key is not accessible in the NethackChallenge class, and it’s not possible to enable it for submissions (or is it?). Also, the HELP (“?”) command is disabled even though it’s available in the game, and it’s the only way to know whether the dog is tame and whether a monster is peaceful.
Is it possible to enable the key for the challenge?
With the torchbeast system, I understand that during learning I have access, at each step, to the environment and actor state in the inference function, but the batch size is dynamic and I have found no identifier that would tell me which environments/actors are in the batch.
It seems everything behind batch.get_inputs() is managed by the torchbeast library, and I would rather not recompile it.
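One possible workaround, combining the two ideas above: if each observation carried an integer `episode_uid` (added by a wrapper at reset time), the inference function could use it to look up per-actor assistant state without knowing how torchbeast assembled the batch. The batch layout below (a list of observation dicts) is an assumption for illustration, not the real `batch.get_inputs()` format:

```python
# Hypothetical sketch: per-actor state keyed by an 'episode_uid' that the
# wrapper injected into observations. None of this is torchbeast API.

actor_state = {}  # episode_uid -> assistant state dict

def inference(batch_obs):
    """batch_obs: list of observation dicts, one per actor in the batch."""
    actions = []
    for obs in batch_obs:
        uid = obs["episode_uid"]
        state = actor_state.setdefault(uid, {"steps": 0})
        state["steps"] += 1
        # here one would consult 'state' to veto/force actions;
        # 0 is a placeholder action
        actions.append(0)
    return actions
```

This sidesteps the dynamic batch size entirely: the state travels with the uid, not with the batch position.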
Does anyone have a smart hack?