OpenStack Mitaka Design Summit outcomes for keystone

This is a summary of the discussions, design decisions, goals, and direction that came out of the OpenStack Mitaka Design Summit in Tokyo, Japan (fall 2015) with regard to keystone.

Token formats

Priti Desai and Brad Pokorny kicked off the technical discussions with an absolutely fantastic deep dive on keystone's four token formats available as of the Liberty release: UUID, PKI, PKIZ and Fernet. They started with an overview of how each token format is constructed in keystone, proceeded to discuss how each token's validation and revocation processes behave, and concluded with the impact of each token type on Horizon.

To top it off, they also included a few links to code reviews that will introduce domain-scoped token support to Horizon.

So, the hottest question of them all: should you be using Fernet? As with all things in engineering:

It depends.

Deprecations

The farther back you can look, the farther forward you are likely to see.

Winston Churchill

Step one of every summit should be shedding baggage. Never mind the obvious lowering of maintenance costs over the next development cycle; even conversations at the summit move faster when we can disregard the complications on the deprecation list.

Following OpenStack's standard deprecation policy, we reached a consensus to immediately deprecate the following items as part of the Mitaka release cycle:

  • LDAP write support: In the real world, operators almost never want keystone writing to their LDAP servers, so the existing write support receives little usage or maintenance. Once it is removed, deployers will be able neither to manage LDAP users through keystone nor to use LDAP as their project backend, a backend that keystone must be able to manage for it to be useful.

  • PKI and PKIZ: Keystone core reviewers strongly recommend that operators avoid PKI and PKIZ in production. PKI and PKIZ provide no additional security over UUID and Fernet, but induce several additional headaches such as incredibly large header sizes, increased strain on the token persistence backend, bloated caches, and ultimately a poor user experience. Several releases ago, we believed that we could resolve these issues and deliver better distributed token validation performance, but today, UUID and Fernet both provide more viable deployment options than PKI and PKIZ.

Several additional items were proposed for deprecation, but for each of them we came away with outstanding action items that must be completed to justify removal:

  • eventlet: Despite the learning curve required to properly tune a new application server, operators have reported success and happiness after switching to non-eventlet deployment methods such as Apache httpd with mod_wsgi or nginx with uWSGI. Still, we aren't ready to remove support for eventlet until we have a stronger understanding and better documentation of the real-world impact of using an alternative web server.

  • v2.0 API: Deprecating stable APIs is a long and slow process. At the summit, the broader OpenStack community agreed that it's okay to deprecate HTTP APIs, but only to discourage their usage. Support for public HTTP APIs with any sort of usage cannot be removed. So, instead of proposing the v2.0 API for immediate deprecation, we took the chance to review the state of the union regarding the remaining v2.0 API usage within OpenStack. In summary, although we have an experimental, non-voting, v3-only gate job called gate-tempest-dsvm-neutron-identity-v3-only-full passing as of October 8, 2015, a few peripheral services still depend on the v2.0 API. We also identified a short list of APIs for which support will never be removed, although new development will likely be required to ease the cost of long-term maintenance. That list includes the end-user-facing authentication flows for both OpenStack and EC2 tokens:

    • POST /v2.0/tokens: Create an unscoped token.

    • GET :5000/v2.0/tenants: List the tenants to which I have access.

    • POST /v2.0/tokens: Create a scoped token.

    • EC2 API
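As a rough illustration of the end-user flow that the first three calls make up, the sketch below only builds the requests; it does not send them. The request bodies follow the v2.0 Identity API as I understand it, while the endpoint host, the credentials, and the helper function names are placeholders invented for this example.

```python
import json

# Hypothetical public endpoint; port 5000 matches the listing above.
KEYSTONE = "http://keystone.example.com:5000"
JSON_HEADERS = {"Content-Type": "application/json"}


def unscoped_token_request(username, password):
    """Step 1: authenticate without naming a tenant (unscoped token)."""
    body = {"auth": {"passwordCredentials": {"username": username,
                                             "password": password}}}
    return "POST", KEYSTONE + "/v2.0/tokens", dict(JSON_HEADERS), json.dumps(body)


def list_tenants_request(unscoped_token):
    """Step 2: ask which tenants the token's owner can access."""
    headers = {"X-Auth-Token": unscoped_token}
    return "GET", KEYSTONE + "/v2.0/tenants", headers, None


def scoped_token_request(unscoped_token, tenant_id):
    """Step 3: exchange the unscoped token for one scoped to a tenant."""
    body = {"auth": {"token": {"id": unscoped_token},
                     "tenantId": tenant_id}}
    return "POST", KEYSTONE + "/v2.0/tokens", dict(JSON_HEADERS), json.dumps(body)
```

Each helper returns a (method, URL, headers, body) tuple that could be handed to whatever HTTP client you prefer. Note that the first and third calls hit the same URL and differ only in the request body.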

Library news

On the topic of keystone's secondary projects (keystonemiddleware, keystoneauth, and python-keystoneclient), it seems that we're finally approaching a tipping point towards some major version bumps where we can drop the long-deprecated CLI support for the v2.0 API in favor of python-openstackclient, remove auth_token from keystoneclient in favor of keystonemiddleware.auth_token, and offer a selection of authentication plugins which are ready for production. The end result will be lighter dependency trees and a better user experience for everyone. In the meantime, it's absolutely necessary that we deliver comprehensive documentation for consumers of our new authentication plugins and sessions.

It was also valuable to reiterate that absolutely no changes are necessary to the clients (including keystonemiddleware) to support Fernet. From a client's perspective, they can be handled just like UUID tokens!

Killing service users

Service users have been a totally unnecessary burden on deployers since they were first introduced around the Diablo or Essex release (basically as soon as we had service-to-service communication). They must be persisted in an identity backend, which has led to such complexity as "domain-specific identity drivers" (because, of course, deployers wanted to use LDAP to manage their real users, not keystone, and you shouldn't need to write to LDAP to deploy OpenStack). All of this was originally caused by our single implementation of bearer tokens.

So, perhaps we should kill service "users" in favor of something effectively simpler, like TLS mutual authentication (you're already deploying with HTTPS, right?), something which a few large deployments have already implemented themselves to various degrees. The effect is that, for example, glance can simply trust that "nova" can make "this subset" of API calls, regardless of tenancy, based on nova's identity alone. Do we really need more granular authorization than that for services? With deployers assigning all their service users the "admin" role, I suspect not. In one fell swoop, we could be eliminating traditional service users, the service tenant, and overly complex authorization management for services (and that painful abuse that is service users carrying the all-too-powerful "admin" role).
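To make the idea a little more concrete, here is a minimal sketch of how a service such as glance might authorize a caller from its TLS client certificate alone. Everything in it is hypothetical: the allow-list, the routes, and the function names are invented for illustration and do not describe anything in keystone or glance. The only real-world shape it relies on is the dictionary returned by Python's ssl.SSLSocket.getpeercert(), whose "subject" entry is a tuple of relative distinguished names.

```python
# Hypothetical allow-list: the subset of calls each peer service may make,
# regardless of tenancy. Service names and routes are illustrative only.
ALLOWED_CALLS = {
    "nova": {
        ("GET", "/v2/images"),
        ("GET", "/v2/images/{image_id}/file"),
    },
}


def peer_common_name(peercert):
    """Extract commonName from a getpeercert()-style dictionary."""
    for rdn in peercert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def is_allowed(peercert, method, route):
    """Authorize a call based on the caller's certificate identity alone."""
    service = peer_common_name(peercert)
    return (method, route) in ALLOWED_CALLS.get(service, set())
```

In a real deployment the certificate would already have been verified by the TLS layer (for example, with the server requiring client certificates signed by a private CA) before a check like this ever runs; the sketch covers only the authorization decision that follows.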

Although this topic popped up a few times in Tokyo, no one stepped up to own the issue, and so I suspect it will still be on our wishlist come Austin.

Introducing "shadow" users, sort of

This is a topic that I went into the summit not expecting to discuss, much less to have it color nearly every subsequent discussion around keystone.

Attempting to describe this topic at a high level is quite difficult, because it really amounts to a major internal refactor on the backend that will not have any obvious API impact, and yet will deliver a unified user experience across every authentication method of today and tomorrow. It also opens the door for account linking (where one user can prove they own credentials from two or more domains) and practical multi-factor authentication as a first-class citizen.

The short version is that we've historically implemented each authentication method slightly differently, from SQL, to LDAP, to external authentication, to federation, to keystone-to-keystone federation. To unify them all together, all we need to do is map any authentication method into an internal identity for use within OpenStack (basically, a user_id and nothing more). And it's this distinction that is critically important to understanding the concept: identities are separated from the credentials they own (including a traditional user's domain_id + name + password), such that one identity can have multiple credentials in, potentially, multiple domains or handled by multiple external identity providers.
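One way to picture that separation is the toy model below. It is only a sketch of the concept, not the design in the spec: the class names, the (method, key) lookup, and the in-memory dictionary are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Identity:
    """Internal identity: basically a user_id and nothing more."""
    user_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    enabled: bool = True


class IdentityMap:
    """Hypothetical mapping from credentials to internal identities."""

    def __init__(self):
        # (auth method, method-specific key) -> Identity
        self._by_credential = {}

    def resolve(self, method, key):
        """Return the identity behind a credential, creating it on first use."""
        if (method, key) not in self._by_credential:
            self._by_credential[(method, key)] = Identity()
        return self._by_credential[(method, key)]

    def link(self, identity, method, key):
        """Attach another credential to an existing identity."""
        self._by_credential[(method, key)] = identity


# A SQL credential (domain_id + name) and a federated credential
# end up pointing at the same internal identity.
identities = IdentityMap()
alice = identities.resolve("sql", ("default", "alice"))
identities.link(alice, "federated", ("corp-idp", "alice@example.com"))
```

Because both credentials resolve to the same Identity object, flipping its enabled flag affects every source of authentication for that user at once.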

The biggest win is for federation, but let's look at some of the fantastic consequences:

  • "Federated tokens" are replaced by local tokens, referencing local identities (not federated ones). As a bonus, this makes Fernet tokens noticeably smaller (and PKI/Z tokens slightly smaller too)!

  • Federated users become targets for local user-based role assignments. Authorization management is no longer dependent on groups alone.

  • Federated users can be easily disabled, just like SQL users. In fact, all sources of authentication for a single identity can be disabled at once.

  • LDAP users no longer have a reason to emulate keystone's "enabled" attribute.

  • Auditing is simplified, because it separates the user's identity (whose actions are being audited) from their source of authentication (which is only really a concern in keystone).

  • (I'm sure there are more benefits to discover.)

But, it's "not all roses" just yet. We have some tough questions to answer in such a migration (and yes, there will be significant database migrations!), such as:

  • Is it (still) possible to own an identity, as in domains? Who has the authorization to disable an identity? Domains essentially become a namespace for username + password combinations, but the identity they represent is not bound to a single authentication method. This may be a part of the unification that we'll have to accept, but one which we'll have to carefully think through and document.

  • Do domains become identity providers? Is a domain just an identity provider?

  • (I'm sure there are more caveats to discover.)

For far more information, read the initial spec!