Performance profiling OpenStack services with repoze.profile

As OpenStack services mature and see ever larger workloads in production, we have an increasing need to optimize them for performance. repoze.profile (official documentation) has always been my go-to tool for profiling WSGI applications. Profiling a WSGI application is only really different from profiling a regular function call in that the entry point is a WSGI interface. But as a perk, you get live profiling statistics in your web browser, where you can change profiling modes, sort, and filter the results. Super convenient!

repoze.profile provides a WSGI middleware component that aggregates profiling data across all requests to the WSGI application, and it exposes a web GUI for viewing that data.
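To illustrate what the middleware actually does before we wire it into Paste, here's a minimal, self-contained sketch of wrapping a plain WSGI app directly in Python. This assumes repoze.profile 2.x, where the middleware is importable as repoze.profile.ProfileMiddleware; the toy app, port, and file paths are placeholders, and the keyword arguments mirror the paste.ini options we'll configure below.

# Minimal sketch: wrap any WSGI application with repoze.profile
# directly, outside of Paste. Assumes repoze.profile >= 2.0.
from wsgiref.simple_server import make_server

from repoze.profile import ProfileMiddleware


def app(environ, start_response):
    # A trivial WSGI application, just so there's something to profile.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, world!\n']


profiled_app = ProfileMiddleware(
    app,
    log_filename='/tmp/example.profile',
    cachegrind_filename='/tmp/cachegrind.out.example',
    discard_first_request=True,
    flush_at_shutdown=True,
    path='/__profile__',
)

if __name__ == '__main__':
    # Make a few requests to http://localhost:8080/, then browse to
    # http://localhost:8080/__profile__ to see the accumulated stats.
    make_server('', 8080, profiled_app).serve_forever()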

Configuration

In this tutorial, I'm going to assume you're running DevStack, and I'm specifically going to use keystone as an example, but the same approach applies to any service using Paste Deployment.

To get started, install repoze.profile from PyPI:

$ pip install repoze.profile

Add the middleware configuration to the service's paste.ini file (/etc/keystone/keystone-paste.ini in keystone's case) before the first pipeline definition. This is the configuration I recommend:

# "filter" means we're configuring a unit of middleware.
# "profile" is the nickname for that middleware configuration
# that we'll include in the pipeline definitions later.
# Include this section before your `[pipeline:*]` sections.
[filter:profile]

# The entry point for this middleware.
use = egg:repoze.profile

# The name of the file to which the accumulated profiler
# statistics are logged. The service's process needs write
# access here.
log_filename = /opt/stack/keystone/keystone.profile

# The optional name of the file to which the accumulated
# profiler statistics are logged in the KCachegrind format.
# The service's process needs write access here.
cachegrind_filename = /opt/stack/keystone/cachegrind.out.keystone

# Discard the statistics for the first request in case there
# are lazy initializations which distort measurement of the
# application's normal performance.
discard_first_request = true

# The URL path to the profiler UI. This is the default.
path = /__profile__

# Profiling data will be deleted when the middleware instance
# disappears. Convenient when restarting the service between
# benchmark runs, anyway.
flush_at_shutdown = true

Now, you can include the profile middleware in any application pipeline you want to profile. Be sure to include repoze.profile's dependencies, cgitb and httpexceptions (both shipped by the Paste distribution), ahead of it:

[pipeline:api_v3]
pipeline = egg:Paste#cgitb egg:Paste#httpexceptions profile [...]

(where [...] represents the rest of the existing pipeline configuration.)
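Before restarting anything, you can optionally sanity-check the new section in-process. This is a hedged sketch using paste.deploy's loadfilter, which resolves just the [filter:profile] section (including the egg:repoze.profile entry point) without loading the whole keystone application:

# Resolve only the [filter:profile] section from the paste config;
# keystone itself is never imported or started.
from paste.deploy import loadfilter

profile_filter = loadfilter(
    'config:/etc/keystone/keystone-paste.ini', name='profile')
print(profile_filter)  # a callable that wraps a WSGI application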

Finally, restart the service's process so that it picks up the new paste configuration (under DevStack, keystone runs under Apache):

$ sudo service apache2 restart

Usage

And you're ready to run your benchmarks, Tempest, or whatever else you have handy to generate some API traffic!
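If you don't have a benchmark handy, even a simple loop will do. Here's a hypothetical load generator for the Fernet token validation scenario profiled below; the endpoint, token, and request count are all assumptions you'd substitute for your own deployment:

import requests

KEYSTONE = 'http://localhost:35357'
# Substitute a real token, e.g. one minted with `openstack token issue`.
TOKEN = '<a valid token for your deployment>'

for _ in range(500):
    # GET /v3/auth/tokens validates the token named in X-Subject-Token.
    response = requests.get(
        KEYSTONE + '/v3/auth/tokens',
        headers={'X-Auth-Token': TOKEN, 'X-Subject-Token': TOKEN})
    response.raise_for_status()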

To view the profiling results, open the service in a web browser. In keystone's case, we deploy the api_v3 pipeline on the /v3 endpoint, and so repoze.profile will be accessible at /v3/__profile__, based on the path configuration above. So the full URL might look like:

http://localhost:35357/v3/__profile__

You should see results like this, along with a few menus to control the output (this is a profile of hundreds of Fernet token validation requests in keystone, sorted by cumulative time spent per callee):
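The web UI isn't your only option for digging into the data, either. Since the middleware writes log_filename with the profiler's dump_stats(), you should be able to load it offline with the standard library's pstats module. A quick sketch:

import pstats

# Load the statistics dump accumulated by the middleware and print
# the top 20 entries by cumulative time.
stats = pstats.Stats('/opt/stack/keystone/keystone.profile')
stats.sort_stats('cumulative').print_stats(20)

And the cachegrind_filename output, of course, can be opened directly in KCachegrind.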