Add reviews

Markus Kaiser 3 years ago
parent 1e40ba5d1f
commit 97a40388b3

Ref.: Ms. No. NEUCOM-D-19-02781
Bayesian Decomposition of Multi-Modal Dynamical Systems for Reinforcement Learning
Dear Mr. Kaiser,
Please find below the referee reports. Based on these and the corresponding recommendation of the associate editor, I have to inform you that your paper
Bayesian Decomposition of Multi-Modal Dynamical Systems for Reinforcement Learning with manuscript number: NEUCOM-D-19-02781
in its present form cannot be accepted for publication in Neurocomputing.
However, I would very much like to invite you to revise your paper, seriously taking into account the comments of the reviewers, and to resubmit your revised version by Oct 30, 2019. Any revision received after that may be treated as a new submission.
To submit your revision, go to and log in as an Author. You will see a menu item called Submission Needing Revision. Here you will also find your submission record.
The revised material should consist of
- your response to the reviewers' comments (to be uploaded as "Revision notes"),
- the revised PDF of the manuscript,
- the source files that have been used to prepare it (source files in LaTeX or Word, as well as separate figure files; these will be used for the eventual typesetting of the paper)
- and finally, biographies and pictures of all authors.
*** Please note: while submitting the revised manuscript, please double-check the author names provided in the submission and make sure to indicate any authorship-related changes in the revision. Once a paper is accepted, we do not accept any changes to the author list unless explicit approval is given by the co-authors and the respective editor handling the submission; this may cause a significant delay in publishing your manuscript. Therefore, please make sure that you include the correct author list in the revised text of your manuscript. ***
Other journal-related information is included below, following the reviewers' comments.
MethodsX file (optional)
If you have customized (a) research method(s) for the project presented in your Neurocomputing article, you are invited to submit this part of your work as a MethodsX article alongside your revised research article. MethodsX is an independent journal that publishes the work you have done to develop research methods for your specific needs or setting. This is an opportunity to get full credit for the time and money you may have spent on developing research methods, and to increase the visibility and impact of your work.
How does it work?
1) Fill in the MethodsX article template:
2) Place all MethodsX files (including graphical abstract, figures and other relevant files) into a .zip file and upload this as a 'Method Details (MethodsX) ' item alongside your revised Neurocomputing manuscript. Please ensure all of your relevant MethodsX documents are zipped into a single file.
3) If your Neurocomputing research article is accepted, your MethodsX article will automatically be transferred to MethodsX, where it will be reviewed and published as a separate article upon acceptance. MethodsX is a fully Open Access journal, the publication fee is only 520 US$.
Questions? Please contact the MethodsX team at . Example MethodsX articles can be found here:
Include interactive data visualizations in your publication and let your readers interact and engage more closely with your research. Follow the instructions here: to find out about available data visualization options and how to include them with your article.
Kind regards,
Zidong Wang, PhD
Editor in Chief
Data in Brief (optional):
We invite you to convert your supplementary data (or a part of it) into an additional journal publication in Data in Brief, a multi-disciplinary open access journal. Data in Brief articles are a fantastic way to describe supplementary data and associated metadata, or full raw datasets deposited in an external repository, which are otherwise unnoticed. A Data in Brief article (which will be reviewed, formatted, indexed, and given a DOI) will make your data easier to find, reproduce, and cite.
You can submit to Data in Brief via the Neurocomputing submission system when you upload your revised Neurocomputing manuscript. To do so, complete the template and follow the co-submission instructions found here: If your Neurocomputing manuscript is accepted, your Data in Brief submission will automatically be transferred to Data in Brief for editorial review and publication.
Please note: an open access Article Publication Charge (APC) is payable by the author or research funder to cover the costs associated with publication in Data in Brief and ensure your data article is immediately and permanently free to access by all. For the current APC see:
Please contact the Data in Brief editorial office or visit the Data in Brief homepage if you have questions or need further information.
Editor's and reviewers' comments:
Reviewer #1: The paper discusses a model-based approach to RL that exploits a Bayesian formulation of the dynamical transition system, which can be optimized in batch before being deployed in the dynamic environment.
The review cannot but take into account the fact that this manuscript was invited as an extended version of an original ESANN conference paper. The authors do not reference this anywhere in the paper, nor do they discuss the differences from the original conference paper. Extensions of previous work should be discussed and the differences accounted for.
This said, I fetched the original paper from the ESANN proceedings, and it is strikingly clear that the novel material introduced in the present manuscript is insufficient in both extent and quality:
- The introduction has been extended, which is good, but no deeper analysis of the state of the art and background is provided. The paper has exactly the same bibliographic references as the original one, apart from the very general reference no. [1]. I would have expected the authors to take the opportunity of the extended paper to compare against VERY related works such as:
- The model described in Section 3 is identical to the one in the original paper. No effort has been made to provide more insight (apart from making some passages longer and moving equations out of line). I would have expected, for instance, some extension in the direction of making the model even less parametric, e.g. by reducing the need for the hyperparameter K, or at least a discussion of the effect of different choices of K on model performance (in other words, how much a priori knowledge about the modes in the data is needed?).
- Section 2 is reproduced from the original conference paper. Only a minor extension of the empirical analysis is provided (i.e. Section 4.3). Figure 4 is another minimal addition which does not provide more insight. I would have expected the extended paper to report results on more benchmarks, possibly less toy-like than the Wet-Chicken. For instance, one might have considered the IB dataset discussed in one of the authors' papers:
Or, even better, to compare empirically against the very related models referenced a couple of points above, which provide results on other datasets as well.
Note that the copy-pasting from the previous article has produced defects which could have been avoided by careful proofreading, e.g. the caption of Figure 1 (cut and pasted from the original article) references blue nodes which do not exist in the plot. Besides, colours cannot be distinguished when the article is printed: why not use the standard graphical-model notation of empty, shaded and seeded nodes?
Concluding, this is still a nice and well-motivated short conference paper. Too bad it has already been published. Given its current shape and content, I am not convinced this manuscript should be published in Neurocomputing.
Reviewer #2: This paper describes a Bayesian reinforcement learning approach based on Gaussian process priors tailored to a specific problem setting, demonstrated by using the approach to find a policy. The model is well presented and has suitable figures and explanations of the concept. The method employs a Markovian transition model based on interpretable parameter values to identify the model constraints. The interesting aspect is the specific ability to apply expert knowledge within the transition model for policy generation. This is suitable for several research areas that require an exploration of optimal parameter choices for a specific constraint domain and therefore of interest to the readers of Neurocomputing.
I couldn't see any grammatical or spelling mistakes.
Reviewer #3: The paper presents a Bayesian Reinforcement Learning model based on nonparametric Gaussian process priors. The priors are introduced to limit the hypothesis space in a manner coherent with the (domain-expert) knowledge of the system. As a consequence, the data-efficiency is increased significantly, i.e. the agent's policy can be found with fewer data compared to other approaches like NFQ.
As an additional positive feature, the model used to represent the system dynamics is human-interpretable, so that experts can assess and build trust in policies that have to be deployed. Moreover, as the authors show, this interpretability makes it possible to easily influence policy behaviours (e.g. to find a more conservative/safer policy).
Overall, the paper is reasonably well written and accessible even to non-experts. The results on the Wet-Chicken benchmark support the conclusions of the paper about the increased data-efficiency and interpretability.
# Major concerns
# Minor concerns
- In §1, the introduction of the "GP" acronym should be moved earlier, to the sentence "Gaussian processes are stochastic processes"
- In §3, the use of the \gamma symbol in the return function J is common. Nonetheless, I suggest that the authors define it formally.
- In Figure 1, the caption says "variational parameters are blue" but there is no blue in the picture.
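For reference, the formal definition the reviewer asks for would typically be the standard discounted expected return (a conventional formulation, not quoted from the manuscript):

```latex
% Standard discounted expected return under policy \pi;
% \gamma is the discount factor, r_t the reward at step t.
J(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r_{t} \right],
\qquad \gamma \in [0, 1).
```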
# Grammar mistakes & Typos
## Abstract
- "system means we can learn" > means *that* we can learn
- "hetroscedastic" > heteroscedastic
- "To show the benefits of the approach we use" / comma before "we"
- "Comparing our model to NFQ we show" / comma before "we"
## Introduction
- "of what a desirable behaviours is." > a desirable *behaviour* is
- "To learn a policy any RL system needs" / comma before "any"
- "main distinctions to different approaches to RL" > distinctions *among* different
- "In model based RL" > In *model-based* RL
- "In model based RL the dynamic model" / comma before "the"
- "part of the system while in the model-free counterpart" / comma before "while"
- "it is unlikely that we will be able deploy" > we will be able *to* deploy
- "In the literature this scenario is referred to as" / comma before "this"
- "In order to be able to derive an efficient policy in this scenario we need to" / comma before "we"
- "While a GP specifies a distribution with support for all functions it efficiently" / comma before "it"
- "concentrates probability mass to functions" > concentrates *its/a/the/…* probability mass to
- "while still placing significant structure" > still placing *a* significant structure
- "the authors propose a model based RL method" > a *model-based* RL method
- "In this work we will show how we can alleviate" / comma before "we"
## The Wet-Chicken Benchmark
- "However, if the canoeist falls down the waterfall he has to start over" / comma before "he"
- "The higher y_t the faster the river flows but also the less turbulent it becomes." > The higher y_t *is,* the faster the river flows, but also the less turbulent it becomes
## Probabilistic Policy Search
- "reformulated to optimizing the expected return" > reformulated to *optimize*
- "consists of two key parts [4]: First" / full stop instead of the colon
- "to build trust in derived policies: Since experts can" / full stop instead of the colon
- "These likelihood describe" > These *likelihoods* describe
- "by the problem respectively: In our case, we use K = 2 modes," / full stop instead of the colon
- "which mode the data point belongs to as we assume this separation can not be" / comma before as
- "State transitions can e?ciently be sampled" > can *be efficiently sampled*
- "The P roll-outs can trivially be parallelized." > can *be trivially parallelized*
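The roll-out-based policy evaluation the reviewer quotes here (sampling state transitions, P parallelizable roll-outs) can be sketched as follows. This is an illustrative toy sketch only: the dynamics, reward, and all names (`transition`, `policy`, `P`, `T`, `gamma`) are hypothetical placeholders, not the authors' model.

```python
# Toy Monte Carlo estimate of the expected return J via P independent
# roll-outs of a stochastic transition model. Everything here is a
# made-up stand-in for the quantities discussed in the review.
import random

def transition(state, action, rng):
    # Hypothetical stochastic dynamics standing in for the learned model.
    return state + action + rng.gauss(0.0, 0.1)

def policy(state):
    # Hypothetical deterministic policy: pull the state towards zero.
    return -0.5 * state

def estimate_return(s0, P=100, T=20, gamma=0.9, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(P):  # independent roll-outs; trivially parallelizable
        s, ret = s0, 0.0
        for t in range(T):
            a = policy(s)
            s = transition(s, a, rng)
            ret += (gamma ** t) * (-s * s)  # toy reward: stay near zero
        total += ret
    return total / P  # Monte Carlo estimate of J
```

Because each roll-out only reads the shared model and writes its own accumulator, the outer loop can be distributed across workers without synchronization, which is the parallelization the text refers to.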
Additional journal-related information:
Please proceed to the following link to update your personal classifications and keywords, if necessary:
Please note that this journal offers a new, free service called AudioSlides: brief, webcast-style presentations that are shown next to published articles on ScienceDirect. If your paper is accepted for publication, you will automatically receive an invitation to create an AudioSlides presentation.
Neurocomputing features the Interactive Plot Viewer, see: Interactive Plots provide easy access to the data behind plots. To include one with your article, please prepare a .csv file with your plot data and test it online at before submission as supplementary material.
For further assistance, please visit our customer support site at Here you can search for solutions on a range of topics, find answers to frequently asked questions and learn more about EES via interactive tutorials. You will also find our 24/7 support contact details should you need any further assistance from one of our customer support representatives.