Exclusive: This new benchmark could expose AI’s biggest weakness (Fast Company)