The original problem
I’ve recently spent some time working on an OpenStack deployment. I ran into a
problem in which the compute service would frequently stop communicating
with the AMQP message broker (qpidd).
In order to gather some data on the problem, I ran the following simple test:
- Wait n minutes
- Run nova boot ... to create an instance
- Wait a minute and see if the new instance becomes ACTIVE
- If it works, delete the instance, set n=2n and repeat
This demonstrated that communication was failing after about an hour, which correlates rather nicely with the idle connection timeout on the firewall.
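
The test procedure above can be sketched roughly as follows. This is only an illustration, not the script I actually ran: the function name find_idle_failure, the starting value of one minute, the max_minutes cap, and the boot / is_active / delete callables (which would wrap nova boot ..., a status check, and an instance delete) are all assumptions made for the example.

```python
import time


def find_idle_failure(boot, is_active, delete, sleep=time.sleep,
                      start_minutes=1, max_minutes=240):
    """Double the idle wait until a freshly booted instance fails to go ACTIVE.

    boot, is_active and delete are hypothetical callables supplied by the
    caller (for example thin wrappers around the nova command line client).
    Returns the idle time in minutes at which a boot first failed, or None
    if every attempt up to max_minutes succeeded.
    """
    n = start_minutes
    while n <= max_minutes:
        sleep(n * 60)        # leave the AMQP connection idle for n minutes
        instance = boot()    # e.g. a wrapper around `nova boot ...`
        sleep(60)            # give the new instance a minute to come up
        if not is_active(instance):
            return n         # the compute service did not pick up the request
        delete(instance)     # clean up before the next, longer wait
        n *= 2               # n = 2n, then repeat
    return None
```

Because the wait doubles each round, the result only brackets the failure point: a failure first seen at n=64 after a success at n=32 says the timeout lies somewhere between 32 and 64 minutes, which is consistent with "about an hour".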