Deep reinforcement learning hands-on : apply modern RL methods, with deep Q-networks, value iteration, policy gradients, TRPO, AlphaGo Zero and more / Maxim Lapan.

By: Lapan, Maxim
Material type: Text
Series: Expert insight
Publisher: Birmingham : Packt Publishing Ltd, 2018
Copyright date: ©2018
Description: xvi, 523 pages : illustrations ; 24 cm
Content type: text
Media type: unmediated
Carrier type: volume
ISBN: 1788834240; 9781788834247
Subject(s): Reinforcement learning | Machine learning | Natural language processing (Computer science) | Artificial intelligence
DDC classification: 006.31
LOC classification: Q325.6 | .L299 2018

Contents:

What is reinforcement learning? -- OpenAI Gym -- Deep learning with PyTorch -- The cross-entropy method -- Tabular learning and the Bellman equation -- Deep Q-networks -- DQN extensions -- Stocks trading using RL -- Policy gradients : an alternative -- The actor-critic method -- Asynchronous advantage actor-critic -- Chatbots training with RL -- Web navigation -- Continuous action space -- Trust regions : TRPO, PPO, and ACKTR -- Black-box optimization in RL -- Beyond model-free : imagination -- AlphaGo Zero.

Item type	Current library	Collection	Call number	Copy number	Status	Date due	Barcode
Book	University of Macedonia Library Βιβλιοστάσιο Α (Stack Room A)	Main Collection	Q325.6.L299 2018	1	Available		0013156859

Includes bibliographical references (page 512) and index.
