OpenAI Evaluation Filter

GPT-4

We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits...

Benchmarks OpenAI

Hugging Face Evaluation Filter

Announcing Evaluation on the Hub

OpenAI Evaluation Filter

CLIP: Connecting text and images

We’re introducing a neural network called CLIP which efficiently learns visual concepts from natural language supervision. CLIP can be applied to any visual classification benchmark by simply providing the names of the visual categories to be recognized,...

Benchmarks

OpenAI Evaluation Filter

Procgen and MineRL Competitions

We’re excited to announce that OpenAI is co-organizing two NeurIPS 2020 competitions with AIcrowd, Carnegie Mellon University, and DeepMind, using Procgen Benchmark and MineRL.

Benchmarks OpenAI

OpenAI Evaluation Filter

Image GPT

We find that, just as a large transformer model trained on language can generate coherent text, the same exact model trained on pixel sequences can generate coherent image completions and samples. By establishing a correlation between sample quality and...

LLM Evaluation

OpenAI Evaluation Filter

Procgen Benchmark

We’re releasing Procgen Benchmark, 16 simple-to-use procedurally-generated environments which provide a direct measure of how quickly a reinforcement learning agent learns generalizable skills.

Benchmarks
