GPT-4 scores well on a variety of common academic benchmarks, but I am most intrigued—though not surprised—by where it falls comparatively short:
AP English Language and Composition (Rhetoric) and AP English Literature and Composition.
These are the most humane benchmarks it has encountered.