Down for about ten minutes. I’m not sure why. I will continue to monitor the situation.
Update: Now down for 50 minutes. I’ve filed a trouble ticket.
Further update: Total downtime: 75 minutes, then back up.
Yet further update: Explanation by the host:
This issue was related to semaphore limits on the machines themselves, and not all users were affected; we’ve fixed this and are currently investigating the root cause of the semaphore issue.
All it takes is one process to throw off the rhythm for the entire cluster.