I got some context on what was happening, but it wasn't definitive. I still have a bunch of capacity testing to do. In a nutshell, we were/are generating enough traffic to cause Munchkin v1 Corona to queue and ultimately fail, but not enough traffic to qualify for Orion, which would (in theory) be able to handle the throughput.
The suggested course of action was to use GTM to be even more selective about when Munchkin loads; that was something we were already doing, so it wasn't much of an answer.
Hit me up offline. I'm working on a blog post about other techniques for reducing load that you might be interested in previewing.