python

Policy Gradient Methods: REINFORCE, Actor-Critic, A3C, and A2C

The previous post derived DQN, DDQN, and Dueling DQN. These value-based methods learn a Q-function and follow a policy that maximizes it. This approach works well for discrete action spaces but cannot be applied directly when actions are continuous, since the maximization step is no longer proces...

Balancing a Double Pendulum with DQN and MuJoCo

A double pendulum consists of two pendulums attached end to end and is a classic physical system that exhibits complex, chaotic motion. Balancing a double pendulum using only a single motor on the first joint is a well-known benchmark in control theory and robotics. This po...

Getting Started with MuJoCo on macOS

MuJoCo (Multi-Joint dynamics with Contact) is a physics engine originally developed by Emo Todorov and now maintained by Google DeepMind. It is widely used in robotics, reinforcement learning research, and biomechanics. This post covers installation on macOS, importing a model, and running simulations i...