Revision Notes

We thank the reviewers for their constructive comments and suggestions for improvement. We first want to comment on two suggestions made by Reviewer #1 and explain why we did not include them in this revision:

  1. Theoretical extensions to the model, e.g. a non-parametric model for K: This work showcases how hierarchical Bayesian models can be used to incorporate high-level expert knowledge during model design. The prior knowledge we assume includes the correct number of modes required. Note that inferring the correct number of modes is a hard problem because the data association problem is ill-posed. It is crucial to formulate a strong prior over acceptable associations to obtain interpretable solutions. We agree that this observation is not obvious and have therefore added a new experiment that showcases the effect of other choices of K.

  2. Experiments on other benchmarks, e.g. the Industrial Benchmark: As this work is specifically about formulating a Bayesian model tailored to a problem and the available knowledge, adding experiments on another benchmark would require significant changes to the paper, including the formulation of a completely new model. The Industrial Benchmark is not multi-modal. Instead, its difficulties lie in its high dimensionality and latent information. We consider such a comparison out of scope for this submission.

Second, as Reviewer #1 expressed concern about the amount of novel material when compared to the ESANN submission, we give an explicit list of additions here:

  • (Section 1) Extended introduction and related work
  • (Section 3) Addition of a description of the inference scheme employed to train the transition model and policies
  • (Section 4) A considerably more detailed analysis of the formulated transition model, insights obtained from data, and discussion of the model's interpretability
  • (Section 4) An extension of the original experiment with a comparison to an additional model (BNN+LV) as suggested by Reviewer #1
  • (Section 4) A new experiment on how the interpretable model can be used for reward shaping
  • (Section 4) A new experiment on the effects of model misspecification on data efficiency, as suggested by Reviewer #1