Recent posts

Policy Gradient Methods: REINFORCE, Actor-Critic, A3C, and A2C

The previous post derived DQN, DDQN, and Dueling DQN. These value-based methods learn a Q-function and follow a policy that maximizes it. This approach works well for discrete action spaces but cannot be applied when actions are continuous, since the maximization step is no longer proces...

Balancing a Double Pendulum with DQN and MuJoCo

A double pendulum consists of two pendulums attached to each other and is a classic physical system that exhibits complex, chaotic motion. Balancing a double pendulum using only a single motor on the first joint is a well-known benchmark in control theory and robotics. This po...

Getting Started with MuJoCo on macOS

MuJoCo (Multi-Joint dynamics with Contact) is a physics engine mainly developed by Emo Todorov and maintained by Google DeepMind. It is widely used in robotics, reinforcement learning research, and biomechanics. This post covers installation on macOS, importing a model, and running simulations i...

Derivation of the Particle Filter from Bayesian Filter

The Particle Filter is a sequential Monte Carlo implementation of the Bayesian Filter. Unlike the Kalman Filter, it does not assume linearity or Gaussian noise, making it suitable for highly nonlinear systems and non-Gaussian distributions. Instead of representing the belief with a mean...

Derivation of the Kalman Filter from Bayesian Filter

The Kalman Filter is a specialized implementation of the Bayesian Filter. It is widely used across engineering fields because it is optimal and computationally efficient for linear systems with Gaussian noise. Below is a step-by-step mathematical derivation of its update equatio...