""
],
"text/plain": [
" *P(X2):*\n",
"\n",
"X2 | value |\n",
"---|---|\n",
"Living Room | 0 |\n",
"Kitchen | 0 |\n",
"Office | 0.2 |\n",
"Hallway | 0.8 |\n",
"Dining Room | 0 |\n",
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"values = VARIABLES.assignment({X[1]: \"Office\", A[1]: \"R\"})\n",
"pretty(motion_model.choose(values))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This idea can be extended to arbitrarily many actions by merely adding action and state nodes for each\n",
"time step $k$. The code below creates a controlled Markov chain with three state variables."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"A = VARIABLES.discrete_series(\"A\", range(1, N), vacuum.action_space) # actions for times 1...N-1\n",
"bayesNet = gtsam.DiscreteBayesNet()\n",
"bayesNet.add(state_prior)\n",
"for k in range(1, N):\n",
" bayesNet.add(X[k+1], [X[k], A[k]], vacuum.action_spec) # add creates conditional and adds"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#| caption: Concatenating Bayes net fragments into a controlled Markov chain.\n",
"#| label: fig:controlled_markov_chain\n",
"show(bayesNet, hints={\"A\":2, \"X\":1}, boxes={A[1][0],A[2][0]})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above is simply the graphical model produced by the code fragment. It is not yet obvious how to obtain the posterior over $X_3$ in this two-step example: we explore that below.\n",
"\n",
"### Exercise\n",
"\n",
"Try varying the value of $N$ to see other examples."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Forward Simulation\n",
"\n",
"> Sampling from Markov chains.\n",
"\n",
"We can use simulation in a graphical model to explore what a sequence of\n",
"actions might yield as an outcome. \n",
"Because the (controlled) Markov chains above are specified as a set of conditional distributions, and we have already seen how to sample from those, we just need to know in what order to sample them.\n",
"Indeed, to sample from a conditional distribution $p(X|Y)$ we need to make\n",
"sure we sample the variable $Y$ *beforehand*; we then simply\n",
"select the appropriate PMF (probability mass function) depending on the value of $Y$, and\n",
"proceed as before using the inverse transform sampling method.\n",
"\n",
"Forward sampling in a Markov chain simply repeats these steps in\n",
"succession, proceeding from left to right, because that ensures that we always have the condition before we attempt sampling from the conditional. \n",
"Simulating a robot *given* an initial state $X_1$ and \n",
"sequence of actions $a_{1}, a_{2},\\ldots$ is then equivalent to sampling\n",
"from this Markov chain:\n",
"\n",
"1. Set $k=1$, and we assume $X_1$ is known.\n",
"\n",
"2. Simulate the effect of the next action $A_k$ by sampling the next state\n",
" $x_{k+1}$ from \n",
" \n",
" $$x_{k+1} \\sim P(X_{k+1}|X_k=x_k,A_k=a_k).$$\n",
"\n",
"3. Increment $k$ and return to step $2$."
]
},
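To make step 2 concrete, here is a minimal pure-Python sketch of inverse transform sampling from a single row of the motion model. The names (`pmf`, `sample_pmf`) are illustrative, not the notebook's GTSAM objects; the probabilities are the *Office*/`R` row from the table above.

```python
import random

# Hypothetical plain-Python row of the motion model: the distribution
# P(X2 | X1=Office, A1=R), copied from the transition table above.
pmf = {"Living Room": 0.0, "Kitchen": 0.0, "Office": 0.2,
       "Hallway": 0.8, "Dining Room": 0.0}

def sample_pmf(pmf, u=None):
    """Inverse transform sampling: draw u ~ U[0,1) and return the first
    outcome whose cumulative probability exceeds u."""
    u = random.random() if u is None else u
    cumulative = 0.0
    for outcome, p in pmf.items():
        cumulative += p
        if u < cumulative:
            return outcome
    return outcome  # guard against floating-point round-off

print(sample_pmf(pmf))  # 'Hallway' with probability 0.8, 'Office' with 0.2
```

Passing an explicit `u` makes the draw reproducible, which is handy for testing: any `u` below 0.2 lands in the *Office* bin, anything above it in the *Hallway* bin.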
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below we code this up and show four different **rollouts** obtained by simulating in this way. \n",
"Rollouts will be discussed in more detail in Section 3.5. \n",
"After that, we can approximate the PMF of the final state by constructing a\n",
"histogram over the possible values of the state:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['Office', 'Office', 'Office']\n",
"['Office', 'Hallway', 'Hallway']\n",
"['Office', 'Hallway', 'Living Room']\n",
"['Office', 'Office', 'Office']\n"
]
}
],
"source": [
"def sample(x1, a1, a2):\n",
" values = VARIABLES.assignment({X[1]: x1, A[1]: a1}) # initial state and action\n",
" x2 = vacuum.rooms[bayesNet.at(1).sample(values)]\n",
" values = VARIABLES.assignment({X[2]: x2, A[2]: a2}) # next state and action\n",
" x3 = vacuum.rooms[bayesNet.at(2).sample(values)]\n",
" return [x1, x2, x3]\n",
"\n",
"\n",
"for i in range(4):\n",
" print(sample(\"Office\", \"R\", \"U\"))"
]
},
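The histogram idea can be sketched in pure Python as well, without GTSAM. A minimal sketch, assuming the two relevant transition rows from the motion-model table above (`P_R`, `P_U`, and `draw` are hypothetical helper names):

```python
import random
from collections import Counter

# Transition rows copied from the motion-model table above:
# applying R in the Office, then U in the resulting room.
P_R = {"Office": {"Office": 0.2, "Hallway": 0.8}}
P_U = {"Office": {"Office": 1.0},
       "Hallway": {"Living Room": 0.8, "Hallway": 0.2}}

def draw(pmf):
    """Inverse transform sampling from a discrete PMF given as a dict."""
    u, cumulative = random.random(), 0.0
    for state, p in pmf.items():
        cumulative += p
        if u < cumulative:
            return state
    return state

random.seed(0)
counts = Counter()
for _ in range(10_000):
    x2 = draw(P_R["Office"])  # effect of action R in the Office
    x3 = draw(P_U[x2])        # effect of action U afterwards
    counts[x3] += 1
print({s: n / 10_000 for s, n in counts.items()})
# exact marginal: Living Room 0.64, Office 0.20, Hallway 0.16
```

The tallied frequencies approximate the exact marginal $P(X_3)$, which can be computed by hand here: $0.8 \times 0.8 = 0.64$ for the Living Room, $0.2$ for the Office, and $0.8 \times 0.2 = 0.16$ for the Hallway.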
{
"cell_type": "markdown",
"metadata": {},
"source": [
"While simple, simulating from a forward model is a rather important\n",
"technique, and underlies some of the recent successes in deep\n",
"reinforcement learning, as well as the success of DeepMind in beating\n",
"the world’s best players of the game of Go.\n",
"\n",
"### Exercise\n",
"\n",
"Generalize the above forward sampling algorithm to an arbitrary number of actions. Hint: you will have to make sure the variables are defined, and the graphical model is extended properly."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Bayes Nets\n",
"\n",
"> Bayes nets provide a graphical language to string together conditional probabilities into a generative world model.\n",
"\n",
"A Markov chain is a special case of the more general **Bayes network** or **Bayes net**.\n",
"A Bayes net is a directed *acyclic* graph (DAG) describing a factored\n",
"probability distribution over a set of random variables. \n",
"By the term *factored probability distribution*, we mean that the\n",
"probability distribution is expressed as a product of factors, \n",
"which in Bayes nets are always conditional distributions."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following code creates a Bayes net using GTSAM, which we can then display; we formally define Bayes nets below."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"wxyz = gtsam.DiscreteBayesNet()\n",
"W1 = VARIABLES.binary(\"W\")\n",
"X1 = VARIABLES.binary(\"X\")\n",
"Y1 = VARIABLES.binary(\"Y\")\n",
"Z1 = VARIABLES.binary(\"Z\")\n",
"wxyz.add(W1, [X1, Z1], \"1/1 1/1 1/1 1/1\")\n",
"wxyz.add(X1, [Y1, Z1], \"1/1 1/1 1/1 1/1\")\n",
"wxyz.add(Y1, [Z1], \"1/1 1/1\")\n",
"wxyz.add(Z1, \"1/1\")\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#| caption: A general Bayes net on four discrete variables.\n",
"#| label: fig:general_bayes_net\n",
"show(wxyz)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The example Bayes net we created is shown in Figure 5,\n",
"and it is simply a graphical representation of which random variable's\n",
"CPT (conditional probability table) depend on which other variables. Formally, in Bayes nets the joint probability distribution $P(X_1, \\dots , X_n)$ over a set\n",
"of random variables, $X_1, \\dots, X_n$ can be computed as the product of\n",
"$n$ conditional probabilities associated with each of the individual variables $X_i$\n",
"as\n",
"\n",
"$$P(X_1, \\dots , X_n)=\\prod_{i=1}^{n}P(X_{i}|\\Pi_{i})$$ \n",
"\n",
"where $n$ is the number\n",
"of variables, and $\\Pi_{i}$ denotes the set of parents for variable\n",
"$X_{i}$ in the directed graph. \n",
"In the example the joint distribution can then be read off as\n",
"\n",
"$$\n",
"P(W,X,Y,Z)=P(W|X,Z)P(X|Y,Z)P(Y|Z)P(Z).\n",
"$$\n",
"\n",
"Note that the order in which\n",
"we multiply the conditionals does not matter, but we typically prefer to put the parents more towards the right, as above. \n",
"Finally, note that the above graph has cycles, but the cycles are only in the underlying *undirected* graph, not in the directed graph. Directed cycles, i.e. cycles with a consistent direction, are not allowed."
]
},
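Reading the joint off the graph can be sketched in plain Python (not GTSAM). With the "1/1" specs used above every CPT entry is 0.5, so each of the 16 binary assignments has probability $0.5^4$; the function names below are illustrative stand-ins for the four CPTs.

```python
# The four CPTs of the example net, each uniform because of the "1/1" specs.
def P_Z(z): return 0.5
def P_Y(y, z): return 0.5
def P_X(x, y, z): return 0.5
def P_W(w, x, z): return 0.5

def joint(w, x, y, z):
    """P(W,X,Y,Z) = P(W|X,Z) P(X|Y,Z) P(Y|Z) P(Z), read off the DAG."""
    return P_W(w, x, z) * P_X(x, y, z) * P_Y(y, z) * P_Z(z)

# Sanity check: the 16 assignment probabilities must sum to one.
total = sum(joint(w, x, y, z)
            for w in (0, 1) for x in (0, 1)
            for y in (0, 1) for z in (0, 1))
print(joint(0, 0, 0, 0), total)  # 0.0625 1.0
```

The order of the four factors in `joint` does not matter, mirroring the remark above; only the parent sets do.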
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Bayes nets can be very efficient representations of complex\n",
"probability distributions, as they encode the dependence and especially\n",
"independence relationships between the variables.\n",
"The Bayes net above was created assuming binary variables, and hence did not take\n",
"a lot of effort to specify, but as we saw above, even for relatively small state spaces \n",
"the complexity of specifying CPTs can be daunting.\n",
"\n",
"If we were to construct a full table of probabilities for each possible\n",
"outcome of the variables $W$, $X$, $Y$, and $Z$, the table could be quite long. \n",
"For example, if we assume they all have 10 possible values, \n",
"then the table required to represent the full joint distribution will have $10^{4}$ entries,\n",
"i.e., $10,000$ unique values. You\n",
"can save a tiny bit, because they have to sum up to 1, so strictly\n",
"speaking we need only $9,999$ values. In contrast, we can tally how many\n",
"entries all four CPT tables have for the Bayes net above. \n",
"This is shown in the following table.\n",
"\n",
"| CPT | \\# entries |\n",
"|-------------|------------|\n",
"| *P(Z)* | 9 |\n",
"| *P(Y\\|Z)* | 90 |\n",
"| *P(X\\|Y,Z)* | 900 |\n",
"| *P(W\\|X,Z)* | 900 |"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, $P(X|Y,Z)$ has 900 entries, i.e., 9\n",
"(independent) entries for each of 100 possible combinations of $Y$ and\n",
"$Z$. Hence, the total number of parameters we need is only $1,899$,\n",
"which is significantly less than $9,999$.\n",
"\n",
"### Exercise\n",
"\n",
"What are the savings for the binary version of the graph that we created? You'll find it is much less, of course."
]
},
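The counting argument above is easy to mechanize. A small sketch, assuming ten values per variable as in the text (`cpt_params` is a hypothetical helper, not a GTSAM function):

```python
# A CPT over a k-valued child needs (k - 1) independent entries
# per combination of parent values.
def cpt_params(num_parents, k=10):
    return (k - 1) * k ** num_parents

cpt_sizes = {
    "P(Z)": cpt_params(0),      # 9
    "P(Y|Z)": cpt_params(1),    # 90
    "P(X|Y,Z)": cpt_params(2),  # 900
    "P(W|X,Z)": cpt_params(2),  # 900
}
print(sum(cpt_sizes.values()), 10 ** 4 - 1)  # 1899 9999
```

Changing the default `k` answers the exercise for other cardinalities as well.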
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Factored State Representations*\n",
"\n",
"> We can save even more by factoring the state.\n",
"\n",
"Factored state representations are useful if the state of the robot can\n",
"be described using features that are relatively independent of each\n",
"other. Continuing our example, the robot vacuum cleaner might also run\n",
"out of battery power, so we could associate a different variable with\n",
"its battery status, e.g., `empty`, `half`, or `full`.\n",
"The state of the robot would then be specified by the combination of\n",
"two variables: the room the robot is in, and its battery status. \n",
"Note that now the total number of possible states\n",
"is combinatorial: if there are five rooms and three different battery\n",
"levels, we have a total of 15 possible states for the robot.\n",
"\n",
"A possible model for battery life could be the following transition table, which is independent of which action was taken, and will always progress from `full` to `half`, then from `half` to `empty`, and of course once the battery is empty it will stay empty:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" *P(B2|B1):*\n",
"\n",
"B1 | full | half | empty |\n",
"---|---|---|---|\n",
"full | 0.9 | 0.1 | 0 |\n",
"half | 0 | 0.9 | 0.1 |\n",
"empty | 0 | 0 | 1 |\n",
""
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"battery_states = [\"full\", \"half\", \"empty\"]\n",
"B = VARIABLES.discrete_series(\"B\", range(1,N+1), battery_states)\n",
"spec_b = \"9/1/0 0/9/1 0/0/1 \"\n",
"pretty(gtsam.DiscreteConditional(B[2], [B[1]], spec_b))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The graphical model approach allows us to easily extend probabilistic\n",
"actions to factored state representations. The code below shows one way to do it: "
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"factored = gtsam.DiscreteBayesNet()\n",
"factored.add(state_prior)\n",
"factored.add(B[1], [], \"1/0/0\") # initial battery state\n",
"for k in range(1, N):\n",
" factored.add(X[k+1], [X[k], A[k]], vacuum.action_spec) # motion model for location\n",
" factored.add(B[k+1], [A[k], B[k]], \"\".join([spec_b]*4)) # battery evolution model"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"\n",
"\n",
"\n",
"\n",
"\n"
],
"text/plain": [
""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#| caption: A factored Bayes net for the vacuum cleaner problem.\n",
"#| label: fig:factored_bayes_net\n",
"show(factored, hints={\"A\":2, \"X\":1, \"B\":0}, boxes={A[1][0],A[2][0]})\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can see that under the transition models chosen, there are now two Markov chains, and these are independent when the action sequence is *given*."
]
},
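Because the battery chain evolves on its own once the actions are given, we can propagate its marginal forward exactly. A minimal sketch in plain Python, assuming the same transition probabilities as `spec_b`:

```python
# Battery transition table "9/1/0 0/9/1 0/0/1", written out as a dict.
T = {
    "full":  {"full": 0.9, "half": 0.1, "empty": 0.0},
    "half":  {"full": 0.0, "half": 0.9, "empty": 0.1},
    "empty": {"full": 0.0, "half": 0.0, "empty": 1.0},
}
p = {"full": 1.0, "half": 0.0, "empty": 0.0}  # P(B_1): battery starts full

for _ in range(5):  # five transitions: P(B_6) = P(B_1) applied through T
    p = {s2: sum(p[s1] * T[s1][s2] for s1 in T) for s2 in T}
print(p)  # P(full) = 0.9**5, P(half) = 5 * 0.9**4 * 0.1
```

After $k$ transitions the battery is still full with probability $0.9^k$, and at half charge with probability $k \cdot 0.9^{k-1} \cdot 0.1$, which the propagated marginal reproduces.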
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise\n",
"Note that the above model makes a number of strong statements about the nature of the world:\n",
"1. The next battery state does not depend on where we are in the world. This seems like an OK assumption. Still, can you think of situations where this is not a realistic assumption?\n",
"2. The next state does not depend on the battery life. Maybe this is not so defensible: clearly, if the battery is empty the robot cannot move, and the next state is the same as the previous state. It is worthwhile to think about what you would change above to make a more realistic model of the world."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## GTSAM 101\n",
"\n",
"> The GTSAM concepts used in this section, explained.\n",
"\n",
"As in Chapter 2, we once again used a `gtsam.DiscreteConditional`, this time to specify a motion model for the controlled Markov chain above. "
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" *P(X2|X1,A1):*\n",
"\n",
"X1 | A1 | Living Room | Kitchen | Office | Hallway | Dining Room |\n",
"---|---|---|---|---|---|---|\n",
"Living Room | L | 1 | 0 | 0 | 0 | 0 |\n",
"Living Room | R | 0.2 | 0.8 | 0 | 0 | 0 |\n",
"Living Room | U | 1 | 0 | 0 | 0 | 0 |\n",
"Living Room | D | 0.2 | 0 | 0 | 0.8 | 0 |\n",
"Kitchen | L | 0.8 | 0.2 | 0 | 0 | 0 |\n",
"Kitchen | R | 0 | 1 | 0 | 0 | 0 |\n",
"Kitchen | U | 0 | 1 | 0 | 0 | 0 |\n",
"Kitchen | D | 0 | 0.2 | 0 | 0 | 0.8 |\n",
"Office | L | 0 | 0 | 1 | 0 | 0 |\n",
"Office | R | 0 | 0 | 0.2 | 0.8 | 0 |\n",
"Office | U | 0 | 0 | 1 | 0 | 0 |\n",
"Office | D | 0 | 0 | 1 | 0 | 0 |\n",
"Hallway | L | 0 | 0 | 0.8 | 0.2 | 0 |\n",
"Hallway | R | 0 | 0 | 0 | 0.2 | 0.8 |\n",
"Hallway | U | 0.8 | 0 | 0 | 0.2 | 0 |\n",
"Hallway | D | 0 | 0 | 0 | 1 | 0 |\n",
"Dining Room | L | 0 | 0 | 0 | 0.8 | 0.2 |\n",
"Dining Room | R | 0 | 0 | 0 | 0 | 1 |\n",
"Dining Room | U | 0 | 0.8 | 0 | 0 | 0.2 |\n",
"Dining Room | D | 0 | 0 | 0 | 0 | 1 |\n",
""
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pretty(motion_model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Above, we use the `gtsam.DiscreteBayesNet` class, and in particular these methods:\n",
"\n",
"- `add(self, key: Tuple[int, int], parents: List[Tuple[int, int]], spec: str) -> None`: adds a conditional with the same arguments as the `gtsam.DiscreteConditional` constructor.\n",
"- `at(self, i: int) -> gtsam.DiscreteConditional`: retrieves the $i^{th}$ conditional added."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We used a sleight of hand above to extend the battery depletion model to depend on the navigation action. The following bit of Python code replicates our specification four times:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'9/1/0 0/9/1 0/0/1 9/1/0 0/9/1 0/0/1 9/1/0 0/9/1 0/0/1 9/1/0 0/9/1 0/0/1 '"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"\".join([spec_b]*4)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We then made sure to specify the action *before* the previous battery state, so that the string above works out. Below we pretty-print to verify that this came out right:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" *P(B2|A2,B1):*\n",
"\n",
"A2 | B1 | full | half | empty |\n",
"---|---|---|---|---|\n",
"L | full | 0.9 | 0.1 | 0 |\n",
"L | half | 0 | 0.9 | 0.1 |\n",
"L | empty | 0 | 0 | 1 |\n",
"R | full | 0.9 | 0.1 | 0 |\n",
"R | half | 0 | 0.9 | 0.1 |\n",
"R | empty | 0 | 0 | 1 |\n",
"U | full | 0.9 | 0.1 | 0 |\n",
"U | half | 0 | 0.9 | 0.1 |\n",
"U | empty | 0 | 0 | 1 |\n",
"D | full | 0.9 | 0.1 | 0 |\n",
"D | half | 0 | 0.9 | 0.1 |\n",
"D | empty | 0 | 0 | 1 |\n",
""
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pretty(gtsam.DiscreteConditional(B[2], [A[2], B[1]], \"\".join([spec_b]*4)))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Of course, it is entirely possible to make the battery model *dependent* on the action chosen."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, a word about the graphs above. You might wonder why these graphs come out so beautifully positioned, e.g., to indicate time from left to right. This was accomplished with the `hints` argument, which positions variable series at an appropriate height. Similarly, the `boxes` argument (which takes `gtsam.Keys`, not tuples) indicates which variables should be considered as given.\n",
"\n",
"These arguments are handled in the [`gtbook` library](https://gtbook.github.io/gtbook/), and are passed on in the appropriate format to the underlying GTSAM `dot` methods."
]
}
],
"metadata": {
"colab": {
"authorship_tag": "ABX9TyOWJ1do7sX2VVOgPTc45upi",
"include_colab_link": true,
"name": "S32_vacuum_actions.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3.8.12 ('gtbook')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
},
"latex_metadata": {
"affiliation": "Georgia Institute of Technology",
"author": "Frank Dellaert and Seth Hutchinson",
"title": "Introduction to Robotics"
},
"vscode": {
"interpreter": {
"hash": "9f7376ced4243bb13dfcffa8a3ba834e0602aa8334cd3a1d8ba8d285f4628083"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}
