This summer I am working in the Robotics Locomotion group at CSAIL (MIT’s Computer Science and Artificial Intelligence Laboratory). I’ve decided to start a blog to exposit on the ideas involved. This ranges from big theoretical ideas (like general system identification techniques) to problem-specific ideas (specific learning strategies for the system we’re interested in) to useful information on using computational tools (how to make MATLAB’s ode45 do what you want it to).
To start with, I’m going to describe the problem that I’m working on, together with John (a grad student in mechanical engineering).
Last spring, I took 6.832 (Underactuated Robotics) at MIT. In that class, we learned multiple incredibly powerful techniques for nonlinear control. After taking it, I was more or less convinced that we could solve, at least off-line, pretty much any control problem once it was posed properly. After coming to the Locomotion group, I realized that this wasn’t quite right. What is actually true is that we can solve any control problem where we have a good model and a reasonable objective function (we can also run into problems in high dimensions, but even there you can make progress if the objective function is nice enough).
So, we can (almost) solve any problem once we have a good model. That means the clear next thing to do is to come up with really good modelling techniques. Again, this is sort of true. There are actually three steps to constructing a good controller: experimental design, system identification, and a control policy.
System identification is the process of building a good model for your system given physical data from it. But to build a good model, you need good data. That’s where experimental design comes in. Many quick and dirty ways of collecting data (like measuring the response of a sinusoidal input) introduce flaws into the model (which cannot be fixed except by collecting more data). I will explain these issues in more detail in a later post. For simple systems, you can get away with still-quick but slightly-less-dirty methods (such as a chirp input), but for more general systems you need better techniques. Ian (a research scientist in our lab) has a nice paper on this topic that involves semidefinite optimization.
Designing a control policy we have already discussed. It is the process of, once we have collected data and built a model, designing an algorithm that will guide our system to its desired end state.
So, we have three tasks — experimental design, system identification, and control policy. If we can do the first two well, then we already know how to do the third. So one solution is to do a really good job on the experimental design and system identification so that we can use our sophisticated control techniques. This is what Michael and Zack are working on with a walking robot. Zack has spent months building the robot in such a way that it will obey all of our idealized models and behave nicely enough that we can run LQR-trees on it.
Another solution is to give up on having a perfect model and have a system identification algorithm that, instead of returning a single model, returns an entire space of models (for example, by giving an uncertainty on each parameter). Then, as long as we can build a controller that works for every system in this space, we’ll be good to go. This can be done by using techniques from robust control.
A final idea is to give up on models entirely and try to build a controller that relies only on the qualitative behaviour of the system in question (for example, by using model-free learning techniques). This is what I am working on. More specifically, I’m working on control in fluids with reasonably high Reynold’s number. Unless you can solve the Navier-Stokes equations, you can’t hope to get a model for this system, so you’ll have to do something more clever.
The first system we’re working with is the underwater cart-pole system. This involves a pendulum attached to a cart. The pendulum itself is not actuated (meaning there is no motor to swing it up). Instead, the only way to control the pendulum is indirectly, by moving the cart around (the cart is constrained to move in a line). The fluid dynamics enter when we put the pendulum in a water tunnel and replace the arm of the pendulum with a water foil.
When the pendulum is in a constant-velocity stream, the system becomes a cart and pendulum with non-linear damping. However, when we add objects to the stream, the objects shed vortices and the dynamics become too complicated to model with an ordinary differential equation. Instead, we need to simulate the solution to a partial differential equation, which is significantly more difficult computationally.
Our first goal is to design a controller that will stabilize the pendulum at the top in the case of a constant stream (we have already done this — more about that in a later post). Our next goal is to design a controller to swing the pendulum up to the top, again in a constant stream (this is what I hope to finish tomorrow — again, more details in a later post). Once we have these finished, the more interesting work begins — to accomplish the same tasks in the presence of vortices. If we were to use existing ideas, we would design a robust version of the controller for a constant stream, and treat the vortices as disturbances. And this will probably be the first thing we do, so that we have a standard of comparison. But there are many examples in nature of animals using vortices to their advantage. So our ultimate goal is to do the same in this simple system. Since vortices represent lots of extra energy in the water, our hope is to actually pull energy out of the vortex to aid in the swing-up task, thus actually using less energy than would be needed without vortices (if this sounds crazy, consider that dead trout can swim upstream using vortices).
Hopefully this gives you a good overview of my project. This is my first attempt at maintaining a research blog, so if you have any comments to help improve the exposition, please give them to me. Also, if you’d like me to elaborate further on anything, let me know. I’ll hopefully have a post or two going into more specific details this weekend.