Please feel free to leave criticisms and suggestions if you found it hard to get started with the competition, as well as any other feedback that might help us improve it.
Thank you!
Posted by: RaghuSpaceRajan @ June 9, 2022, 10:38 a.m.
This is a copy of the message sent to all the participants:
Dear DAC4RL participant,
We would like to thank the participant vu for reaching out and pointing out that DAC4RL agents are a bit harder to code for than those in the DAC4SGD track. The many hyperparameters that can be set in the DAC4RL track may seem intimidating at first sight, but it is not necessary to set all of them at once. For example, a submission could decide to set only the learning rate and pass it as the action to the DAC environment at each step. We allow setting multiple hyperparameters at each step to retain the flexibility to design more advanced approaches if the need arises. However, setting the learning rate alone could be considered novel in itself if it is well motivated and leads to new insights.
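To make this concrete, here is a minimal sketch of such a single-hyperparameter submission. The reset/act interface and the "learning_rate" key are assumptions modelled on the general shape of the starter-kit policies, not the exact competition API, so please check them against the provided examples before submitting.

```python
# Minimal sketch: a DAC policy that only controls the learning rate.
# The reset()/act() interface and the hyperparameter name are assumptions;
# adapt them to the actual interface in the competition starter kit.


class ConstantLRPolicy:
    """Sets only the learning rate; all other hyperparameters keep their defaults."""

    def __init__(self, lr: float = 3e-4):
        self.lr = lr

    def reset(self, instance):
        # Called once per environment instance; nothing to adapt here.
        pass

    def act(self, state):
        # Return only the hyperparameter(s) we want to set at this step.
        return {"learning_rate": self.lr}
```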
Secondly, the competition pipeline does not allow methods like PB2 to maintain populations of agents and tune their hyperparameters online. This is also by design: we are already at the limit of our compute budget when setting the hyperparameters of a single agent online. Maintaining even a small population of, say, 4 agents (with which PB2 can get excellent results) would have exceeded the compute budget allocated to the competition workers by a large margin. Apologies that this is not explicitly mentioned in the instructions. We believe that approaches which can be designed in the current problem setting can already provide very valuable insights to the RL practitioner. For instance, here are a few agent designs that can work in the competition:
1. Meta-learn schedules of hyperparameters on a training distribution of environments and use them on an unseen environment at test time.
2. Use PB2 to learn transferable schedules offline and then use these schedules at test time (possibly as a function of the environment features).
3. Set different schedules of gamma depending on the test environment presented to the agent.
4. If you also participated in the SGD track, adapt your SGD-track submission to set the learning rates for the DAC policy in the RL track (with the limitation that the learning rate is only set at 10 points throughout training, so it is more of a step schedule); see the sketch after this list.
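As an illustration of item 4 above, here is a sketch of a step-schedule policy that changes the learning rate at each of the 10 decision points mentioned above. The class name, the reset/act interface, and the decay values are illustrative assumptions, not the competition's prescribed API or a recommended schedule.

```python
# Sketch of a step schedule over the ~10 decision points mentioned above.
# The reset()/act() interface is assumed from the starter kit; the initial
# learning rate and decay factor are illustrative values only.


class StepLRSchedulePolicy:
    """Decays the learning rate by a fixed factor at each decision point."""

    def __init__(self, initial_lr: float = 1e-3, decay: float = 0.7):
        self.initial_lr = initial_lr
        self.decay = decay
        self.step = 0

    def reset(self, instance):
        # New environment instance: restart the schedule.
        self.step = 0

    def act(self, state):
        # Learning rate shrinks geometrically as training progresses.
        lr = self.initial_lr * (self.decay ** self.step)
        self.step += 1
        return {"learning_rate": lr}
```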
Looking forward to your suggestions and submissions,
DAC4AutoML Organising Team.
Dear team,
There are 4 baselines, but 2 of them are broken, and the other 2 get much lower scores compared to the "same" ones on the leaderboard (LB).
Any hints on what could make the difference?
Tak
Posted by: tak @ July 3, 2022, 6:47 a.m.
Dear tak,
Thank you very much for pointing this out!
Those 2 baselines (pb2_piac_for_dac and ac_for_dac) should not be in the main branch. The person working on them had to take a leave of absence, and we did not intend to submit them to the main competition leaderboard as they were incomplete, so they are currently broken. Apologies for any inconvenience caused; we have removed them from the main branch.
By "2 of them have much lower scores", do you mean the other 2 baselines? We have not noticed big differences on re-running the baselines. Did you run them locally and did they have much lower scores, or are you referring to your submission on the leaderboard which does not seem to have big differences from our own baseline submission?
Posted by: RaghuSpaceRajan @ July 4, 2022, 1:11 p.m.
Hello,
I submitted the zoo example but got much lower scores compared to baseline-ZooHyperparameters on the LB.
You could submit it yourself if you want to confirm the issue.
Posted by: tak @ July 4, 2022, 4:14 p.m.
Dear Tak,
Do you mean for the Acrobot env? We are currently looking into why that score is relatively low for your submission, but for the other 4 envs we don't see a big difference. Could you please provide numbers to clarify why you think your submission is getting much lower scores?
Posted by: RaghuSpaceRajan @ July 4, 2022, 7:50 p.m.