OpenAI Evaluation Filter page 5

June 20, 2020 07:00

Procgen and MineRL Competitions

We’re excited to announce that OpenAI is co-organizing two NeurIPS 2020 competitions with AIcrowd, Carnegie Mellon University, and DeepMind, using Procgen Benchmark and MineRL.

June 17, 2020 07:00

Image GPT

We find that, just as a large transformer model trained on language can generate coherent text, the exact same model trained on pixel sequences can generate coherent image completions and samples. By establishing a correlation between sample quality and...

December 03, 2019 08:00

Procgen Benchmark

We’re releasing Procgen Benchmark, 16 simple-to-use, procedurally generated environments which provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.

March 21, 2019 07:00

Implicit generation and generalization methods for energy-based models

We’ve made progress towards stable and scalable training of energy-based models (EBMs) resulting in better sample quality and generalization ability than existing models. Generation in EBMs spends more compute to continually refine its answers and doing so...

February 14, 2019 08:00

Better language models and their implications

We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question...

August 06, 2018 07:00

OpenAI Five Benchmark: Results

Yesterday, OpenAI Five won a best-of-three against a team of 99.95th percentile Dota players: Blitz, Cap, Fogged, Merlini, and MoonMeander—four of whom have played Dota professionally—in front of a live audience and 100,000 concurrent livestream viewers.

July 18, 2018 07:00

OpenAI Five Benchmark

The OpenAI Five Benchmark match is now over!

April 10, 2018 07:00

Gotta Learn Fast: A new benchmark for generalization in RL

In this report, we present a new reinforcement learning (RL) benchmark based on the Sonic the Hedgehog™ video game franchise. This benchmark is intended to measure the performance of transfer learning and few-shot learning algorithms in the RL domain. We...

March 24, 2017 07:00

Evolution strategies as a scalable alternative to reinforcement learning

We’ve discovered that evolution strategies (ES), an optimization technique that’s been known for decades, rivals the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks (e.g. Atari/MuJoCo), while overcoming many of RL’s...

August 29, 2016 07:00

Infrastructure for deep learning

Deep learning is an empirical science, and the quality of a group’s infrastructure is a multiplier on progress. Fortunately, today’s open-source ecosystem makes it possible for anyone to build great deep learning infrastructure.