
Wording and small fixes

arxiv-v2
Markus Kaiser 3 months ago
commit 89a7fde86f
2 changed files with 11 additions and 9 deletions
  1. BIN  dynamic_dirichlet_deep_gp.pdf
  2. +11 −9  dynamic_dirichlet_deep_gp.tex

BIN  dynamic_dirichlet_deep_gp.pdf


+11 −9  dynamic_dirichlet_deep_gp.tex

@@ -39,7 +39,7 @@
 ]
 
 \begin{abstract}
-    The data association problem is concearned with separating data coming from different generating processes, for example when data come from different data sources, contain significant noise, or exhibit multimodality.
+    The data association problem is concerned with separating data coming from different generating processes, for example when data come from different data sources, contain significant noise, or exhibit multimodality.
     We present a fully Bayesian approach to this problem.
     Our model is capable of simultaneously solving the data association problem and the induced supervised learning problems.
     Underpinning our approach is the use of Gaussian process priors to encode the structure of both the data and the data associations.
@@ -50,10 +50,10 @@
 \section{Introduction}
 \label{sec:introduction}
 Real-world data often include multiple operational regimes of the considered system, for example a wind turbine or gas turbine~\parencite{hein_benchmark_2017}.
-As an example, consider a model describing the lift resulting from airflow around a wing profile as a function of attack angle.
+As an example, consider a model describing the lift resulting from airflow around the wing profile of an airplane as a function of attack angle.
 At a low angle the lift increases linearly with attack angle until the wing stalls and the characteristic of the airflow fundamentally changes.
 Building a truthful model of such data requires learning two separate models and correctly associating the observed data to each of the dynamical regimes.
-A similar example would be if our sensors that measure the lift are faulty in a manner such that we either get a accurate reading or a noisy one.
+A similar example would be if our sensors that measure the lift are faulty in a manner such that we either get an accurate reading or a noisy one.
 Estimating a model in this scenario is often referred to as a \emph{data association problem}~\parencite{Bar-Shalom:1987, Cox93areview}, where we consider the data to have been generated by a mixture of processes and we are interested in factorising the data into these components.
 
 \Cref{fig:choicenet_data} shows an example of faulty sensor data, where sensor readings are disturbed by uncorrelated and asymmetric noise.
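The generative story this hunk describes — each observation produced either by the true signal process or by an independent noise process, with the association latent — can be sketched in a few lines. This is a made-up toy setup for illustration, not the paper's actual data: the signal function, noise scale, and outlier rate are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical faulty-sensor data: each reading comes either from the
# smooth signal process or from an uninformative noise process.
n = 200
x = rng.uniform(-1.0, 1.0, size=n)
signal = np.sinc(2.0 * x)                     # stand-in for the true "lift" curve
noise = rng.normal(0.0, 1.0, size=n)          # corrupted sensor output

outlier_rate = 0.4
assoc = rng.random(n) < outlier_rate          # latent association: True = noisy process
y = np.where(assoc, noise, signal + rng.normal(0.0, 0.01, size=n))

# The data association problem: recover `assoc` and both generating
# processes given only the pairs (x, y).
```

The point of the sketch is that `assoc` is discarded at training time; a data association model must infer it jointly with the two processes.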
@@ -316,7 +316,7 @@ We collect the latent multi-layer function values as $\mat{F^\prime} = \Set{\mat
 \begin{align}
 \begin{split}
     \label{eq:deep_variational_distribution}
-    \Variat*{\mat{F^\prime}, \mat{\alpha}, \mat{U^\prime}}
+    \MoveEqLeft\Variat*{\mat{F^\prime}, \mat{\alpha}, \mat{U^\prime}} = \\
     = &\Variat*{\mat{\alpha}, \Set*{\mat{u_\alpha^{\pix{k}}}}_{k=1}^K, \Set*{\mat{F_l^{\prime\pix{k}}}, \mat{u_l^{\prime\pix{k}}}}_{k=1,l=1}^{K,L}} \\
     = &\prod_{k=1}^K\prod_{n=1}^N \Prob*{\mat{\alpha_n^{\pix{k}}} \given \mat{u_\alpha^{\pix{k}}}, \mat{x_n}}\Variat*{\mat{u_\alpha^{\pix{k}}}} \\
     \MoveEqLeft\prod_{k=1}^K \prod_{l=1}^L \prod_{n=1}^N \Prob*{\mat{f_{n,l}^{\prime\pix{k}}} \given \mat{u_l^{\prime\pix{k}}}, \mat{x_n}}\Variat*{\mat{u_l^{\prime\pix{k}}}},
@@ -366,7 +366,8 @@ This extended bound thus has complexity $\Fun*{\Oh}{NM^2LK}$ to evaluate in the
     \newcolumntype{Z}{>{\columncolor{sStone!33}\centering\arraybackslash}X}%
     \begin{tabularx}{\linewidth}{rY|YZZZZZZ}
         \toprule
-        Outliers & DAGP (MLL) & DAGP (RMSE) & CN & MDN & MLP & GPR & LGPR & RGPR \\
+        Outliers & DAGP & DAGP & CN & MDN & MLP & GPR & LGPR & RGPR \\
+        & \scriptsize MLL & \scriptsize RMSE & \scriptsize RMSE & \scriptsize RMSE & \scriptsize RMSE & \scriptsize RMSE & \scriptsize RMSE & \scriptsize RMSE \\
        \midrule
         0\,\% & 2.86 & \textbf{0.008} & 0.034 & 0.028 & 0.039 & \textbf{0.008} & 0.022 & 0.017 \\
        20\,\% & 2.71 & \textbf{0.008} & 0.022 & 0.087 & 0.413 & 0.280 & 0.206 & 0.013 \\
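The reworked table header distinguishes the two metrics DAGP reports: RMSE for point predictions and mean test log likelihood for the predictive distribution. As a reference for how these are conventionally computed under Gaussian predictive marginals (a sketch, not the paper's evaluation code; the example values are made up):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error of point predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mean_log_likelihood(y_true, mu, var):
    """Mean test log likelihood under Gaussian predictive marginals N(mu, var)."""
    return float(np.mean(-0.5 * np.log(2.0 * np.pi * var)
                         - 0.5 * (y_true - mu) ** 2 / var))

# Toy predictions: accurate means with small predictive variance.
y_true = np.array([0.0, 1.0, 2.0])
mu = np.array([0.1, 0.9, 2.1])
var = np.full(3, 0.04)

err = rmse(y_true, mu)               # approximately 0.1
mll = mean_log_likelihood(y_true, mu, var)
```

Note the asymmetry the table reflects: RMSE only scores the mean `mu`, while MLL also rewards a well-calibrated variance, which is why it is reported for the fully Bayesian model.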
@@ -453,7 +454,8 @@ This extended bound thus has complexity $\Fun*{\Oh}{NM^2LK}$ to evaluate in the
         \label{fig:semi_bimodal:c}
         Normalized samples from the assignment process $\mat{\alpha}$ of the model shown in \cref{fig:semi_bimodal}.
         The assignment process is used to weigh the predictive distributions of the different modes depending on the position in the input space.
-        The model has learned that the mode $k = 2$ is irrelevant, that the mode $k = 1$ is only relevant around the interval $[0, 5]$ and the outside this interval, the mode $k = 3$ is twice as likely as the mode $k = 4$.
+        The model has learned that the mode $k = 2$ is irrelevant, that the mode $k = 1$ is only relevant around the interval $[0, 5]$.
+        Outside this interval, the mode $k = 3$ is twice as likely as the mode $k = 4$.
     }
 \end{figure}
 %
@@ -481,7 +483,7 @@ To avoid pathological solutions for high outlier ratios, we add a prior to the l
 The model proposed in~\parencite{choi_choicenet_2018}, called ChoiceNet (CN), is a specific neural network structure and inference algorithm to deal with corrupted data.
 In their work, they compare their approach to a standard multi-layer perceptron (MLP), a mixture density network (MDN), standard Gaussian process regression (GPR), leveraged Gaussian process regression (LGPR)~\parencite{choi_robust_2016}, and infinite mixtures of Gaussian processes (RGPR)~\parencite{rasmussen_infinite_2002}.
 \Cref{tab:choicenet} shows results for outlier rates varied from 0\,\% to 80\,\%.
-Besides the root mean squared error (RMSE), we also report the mean test log likelihood (MLL) of the process representing the target function in our model.
+Besides the root mean squared error (RMSE), we also report the mean test log likelihood (MLL) of the process representing the signal in our model.
 
 Up to an outlier rate of 40\,\%, our model correctly identifies the outliers and ignores them, resulting in a predictive posterior of the signal equivalent to standard GP regression without outliers.
 In the special case of 0\,\% outliers, DAGP correctly identifies that the process modelling the noise is not necessary and disables it, thereby simplifying itself to standard GP regression.
@@ -553,7 +555,7 @@ Our third experiment is based on the cart-pole benchmark for reinforcement learn
 In this benchmark, the objective is to apply forces to a cart moving on a frictionless track to keep a pole, which is attached to the cart via a joint, in an upright position.
 We consider the regression problem of predicting the change of the pole's angle given the current state of the cart and the action applied.
 The current state of the cart consists of the cart's position and velocity and the pole's angular position and velocity.
-To simulate a dynamical system with changing states of operation our experimental setup is to sample trajectories from two different cart-pole systems and merging the resulting data into one training set.
+To simulate a dynamical system with changing system characteristics our experimental setup is to sample trajectories from two different cart-pole systems and merging the resulting data into one training set.
 The task is not only to learn a model which explains this data well, but to solve the association problem introduced by the different system configurations.
 This task is important in reinforcement learning settings where we study systems with multiple operational regimes.
 
@@ -606,7 +608,7 @@ The data association problem is inherently ill-constrained and requires signific
 In this paper, we make use of interpretable Gaussian process priors allowing global a priori information to be included into the model.
 Importantly, our model is able to exploit information both about the underlying functions and the association structure.
 We have derived a principled approximation to the marginal likelihood which allows us to perform inference for flexible hierarchical processes.
-In future work, we would like to incorporate the proposed model in a reinforcement learning scenario where we study a dynamical system with state changes.
+In future work, we would like to incorporate the proposed model in a reinforcement learning scenario where we study a dynamical system with different operational regimes.
 
 
 \printbibliography
