Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra - 2015

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion and car driving. Our algorithm is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain and its derivatives. We further demonstrate that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
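The abstract names the pieces but not the mechanics, so here is a minimal sketch of three ingredients of a DDPG-style update, written in plain Python on a toy problem. The function names, the linear policy `mu(s) = theta * s`, and the hand-written critic `Q(s, a) = -(a - 2s)^2` are assumptions made purely for illustration; a real implementation would use neural networks, a replay buffer, and learned critic values.

```python
# Toy sketch of three DDPG ingredients in plain Python (no deep networks).
# All names and the toy critic below are illustrative assumptions.


def critic_target(reward, next_q, done, gamma=0.99):
    """TD target y = r + gamma * Q'(s', mu'(s')); no bootstrapping at terminal states."""
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.001):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


def actor_gradient(theta, states):
    """Deterministic policy gradient for mu(s) = theta * s and Q(s, a) = -(a - 2s)^2.

    Applies the chain rule dQ/dtheta = dQ/da * da/dtheta, averaged over the batch.
    """
    total = 0.0
    for s in states:
        a = theta * s                    # action chosen by the deterministic policy
        dq_da = -2.0 * (a - 2.0 * s)     # gradient of the toy critic w.r.t. the action
        da_dtheta = s                    # gradient of the policy w.r.t. its parameter
        total += dq_da * da_dtheta
    return total / len(states)


def train_actor(states, steps=200, lr=0.1, theta=0.0):
    """Gradient ascent on Q; for this toy critic the optimum is theta = 2."""
    for _ in range(steps):
        theta += lr * actor_gradient(theta, states)
    return theta
```

Calling `train_actor([-1.0, 0.5, 1.0])` drives `theta` toward 2, the value at which the toy critic is maximised for every state. The small default `tau` in `soft_update` reflects the design choice of letting target parameters trail the online ones slowly, which tends to stabilise the bootstrapped critic target.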

Implementation of a Deep Deterministic Policy Gradient (DDPG) agent with neural networks for the OpenAI Gym Pendulum environment.

This project is an exercise in reinforcement learning as part of the Machine Learning Engineer Nanodegree from Udacity. The idea behind this project is to teach a simulated quadcopter how to perform some activities.

Continuous control with deep reinforcement learning - Deep Deterministic Policy Gradient (DDPG) algorithm implemented in OpenAI Gym environments

Two Deep Reinforcement Learning agents that collaborate so as to learn to play a game of tennis.

A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)

This tool scrapes Twitter data, processes it, and then either builds an unsupervised network to identify interesting patterns or is configured to verify a specific concept or idea.

Reinforcement learning algorithms implemented for TensorFlow 2.0+ [DQN, DDPG, AE-DDPG]

Implementation of Deep Deterministic Policy Gradients using TensorFlow and OpenAI Gym

DDPG implementation for collaboration and competition for a Tennis environment.

This repository contains: 1. Unofficial code for paper "The Cross Entropy Method for Fast Policy Search" 2. Unofficial code for paper "Continuous control with deep reinforcement learning" 3. Unofficial code for paper "Deep Reinforcement Learning with Double Q-learning"

Distributed TensorFlow implementation of "Continuous control with deep reinforcement learning" (DDPG)

My solution to Collaboration and Competition, the third project of Udacity's Deep RL Nanodegree, using the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

Implementation of Deep Deterministic Policy Gradient algorithm in Unity environment

A biologically inspired, hierarchical bipedal locomotion controller for robots, trained using deep reinforcement learning.

Repository for a planar bipedal walking robot in the Gazebo environment, trained with Deep Deterministic Policy Gradient (DDPG) using TensorFlow.

An implementation of the Normalized Advantage Function Reinforcement Learning Algorithm with Prioritized Experience Replay

This is a TensorFlow implementation of DeepMind's "A Distributional Perspective on Reinforcement Learning" (C51-DDPG).

Deep Reinforcement Learning Agent that solves a continuous control task using Deep Deterministic Policy Gradients (DDPG)

Deep Reinforcement Learning Nanodegree project on continuous control, based on the DDPG algorithm.

Reimplementation of DDPG ("Continuous Control with Deep Reinforcement Learning") based on OpenAI Gym and TensorFlow

Reinforcement learning practice, including Q-learning, policy gradient, deterministic policy gradient, and deep deterministic policy gradient

Two agents cooperating to avoid losing the ball, using Deep Deterministic Policy Gradient in a Unity environment

PyTorch deep reinforcement learning library focusing on reproducibility and readability.

Examples of published reinforcement learning algorithms in recent literature implemented in TensorFlow

Multi-Agent Deep Deterministic Policy Gradient applied in Unity Tennis environment

Simple scripts for a continuous-action DQN agent in the V-REP simulation domain

Implementation of DDPG (modified from the work of Patrick Emami) - TensorFlow (no TFLearn dependency), Ornstein-Uhlenbeck noise function, reward discounting, works on discrete and continuous action spaces
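
The Ornstein-Uhlenbeck noise function mentioned in the last entry is typically added to the actor's output to produce temporally correlated exploration. The sketch below is one possible discretisation, not the code of any repository listed here; the class name, the `dt` and `seed` parameters, and the default values of `theta` and `sigma` are assumptions chosen for illustration.

```python
import math
import random


class OrnsteinUhlenbeckNoise:
    """Discretised OU process: x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.size = size      # number of action dimensions
        self.mu = mu          # long-run mean the process is pulled toward
        self.theta = theta    # strength of the pull toward mu
        self.sigma = sigma    # scale of the random perturbation
        self.dt = dt          # time step of the discretisation
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Restart the process at its mean, e.g. at the start of each episode."""
        self.state = [self.mu] * self.size

    def sample(self):
        """Advance the process one step and return a copy of the new state."""
        scale = self.sigma * math.sqrt(self.dt)
        self.state = [
            x + self.theta * (self.mu - x) * self.dt + scale * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)
```

In use, an agent would add `noise.sample()` element-wise to the action proposed by the policy and clip the result to the environment's action bounds. Because each sample depends on the previous one, consecutive perturbations push in similar directions, which can suit physical control tasks with inertia better than independent noise.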