OpenStack Newton design summit outcomes for keystone

This is a summary of the discussions, design decisions, goals, and direction that came out of the OpenStack Newton Design Summit in Austin, Texas (spring 2016) with regard to keystone.

By Happywaffle at en.wikipedia (Public domain), via Wikimedia Commons

The Newton release was named after the Newton House, a 7,077 square foot historic building constructed in 1874, and located just a few blocks northeast of the conference venue. As I understand it, the second-level porch pictured above makes it pretty swanky for a single-family home.

OpenStack Innovation Center (OSIC)

I kicked off my time at the summit with a fantastic experience representing the OpenStack Innovation Center in the Rackspace Cantina.

Rackspace Cantina

If the Rackspace Cantina makes another appearance at the Ocata summit in Barcelona, then I expect to see more mariachis.

It was particularly refreshing to break the stereotypical expectations conference attendees have when visiting a booth. The most frequent questions and surprises we fielded included:

  1. The OpenStack Innovation Center is a mouthful, and so is oh-es-eye-see. We actually pronounce it as oh-sick.

  2. No, we're not building a vendored product for OpenStack, and no, there's no special distribution. OSIC is a rather unique partnership between Intel and Rackspace, and we're honestly here to improve the simplicity, upgradeability, reliability, and scalability of OpenStack upstream, in the community, for everyone's benefit.

  3. Yes, we actually have a big cloud for any developer in the community (including you) to use for testing and experimenting at scale. Just need a few VMs? We've got you covered. Need 242 bare metal nodes (11 cabinets!) and IPMI access for a couple weeks? We'll get you scheduled, as long as you share your learnings with the community!

If you didn't visit the Cantina, you can always apply to access the developer cloud online.

Fernet tokens

Matt Fischer, Lance Bragstad, and I presented a deep dive into Fernet tokens. Together, we discussed the challenges that led us to Fernet, how Fernet solves those problems compared to existing token formats, how to operate Fernet at scale, and how to upgrade your cloud to use Fernet.

Making Fernet the default token provider

One of the last challenges on the road to making Fernet the default token provider in keystone is how to handle Fernet keys out of the box; today they are stored on disk and read by keystone for every token operation. The problem is that if two keystone processes do not share the same set of primary and secondary Fernet keys, then tokens created by one process will not be decipherable by the other, and satisfying that constraint requires orchestration that keystone cannot perform itself.
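
To make the constraint concrete, here is a minimal Python sketch using the cryptography library's Fernet implementation (the key names and payload are made up for illustration; this is not keystone's actual token code):

    from cryptography.fernet import Fernet, InvalidToken, MultiFernet

    # Two keystone nodes that were each allowed to generate their own key.
    node_a_key = Fernet(Fernet.generate_key())
    node_b_key = Fernet(Fernet.generate_key())

    # A "token" issued by node A.
    token = node_a_key.encrypt(b"token payload")

    # Node B cannot validate it, because it holds a different key.
    try:
        node_b_key.decrypt(token)
    except InvalidToken:
        print("node B rejected a token issued by node A")

    # With a shared key repository, every node encrypts with the same primary
    # key and can still decrypt tokens issued under older (secondary) keys.
    shared_keys = MultiFernet([node_a_key, node_b_key])
    assert shared_keys.decrypt(token) == b"token payload"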

That's an easy problem for real deployers to solve with real orchestration tools, but how do we make "simpler" deployments, such as a multi-process keystone install or a multi-node devstack install, work out of the box with Fernet as a reasonable default? The question we brought to the summit was representative of one of these use cases:

How do we utilize Fernet tokens in a multi-node devstack install?

The user running keystone cannot be expected to have write access to /etc/keystone/fernet-keys/, and even if it could, we'd still have to solve a race condition in which multiple processes compete to initialize the Fernet key repository on startup.

The most obvious solution is to move the Fernet keys off disk and store them in a centralized database, but that also requires encrypting them (and you certainly don't want to store encryption keys in the database in plaintext, so now you have to share the encryption keys for your encryption keys across the cluster).

We could have keystone depend on a more formal, centralized secret storage solution (potentially Barbican); however, that would increase the deployment complexity of a default keystone install by another order of magnitude.

The solution we settled on at the summit was to simply default to Fernet and have keystone fail critically on startup if the Fernet key repository has not already been initialized (today, keystone instead fails non-fatally during authentication requests with 500 errors). The implication is that by refusing to start at all, keystone raises a much more obvious red flag to deployers that they have some work to do to bootstrap it, whether it's part of a cluster or not.

Additionally, we could ease the operator experience further by making keystone-manage fernet_setup a part of Mitaka's keystone-manage bootstrap, and by raising red flags before you even start keystone as part of keystone-manage doctor.
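
As a rough sketch, the fail-fast check could look something like the following (illustrative Python only, not keystone's actual startup code; the repository path is keystone's conventional default):

    import os
    import sys

    KEY_REPOSITORY = "/etc/keystone/fernet-keys/"

    def validate_key_repository(path=KEY_REPOSITORY):
        """Refuse to start if the Fernet key repository is missing or empty."""
        if not os.path.isdir(path) or not os.listdir(path):
            sys.exit(
                "Fernet key repository %s has not been initialized; run "
                "'keystone-manage fernet_setup' before starting keystone." % path
            )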

Shadowing federated users

We made some great progress on shadow users in the Mitaka release (thanks, Ron de Rose!). We now create shadow identities, with locally persisted user IDs, for federated users on their first authentication, which opens the door for all sorts of interesting new workflows, such as locally-managed, user-specific role assignments. By extending the shadow user concept to other externally-managed authentication sources (such as LDAP), we gain the ability to unify our authorization model regardless of the actual source of identity.

To take things a step further, we discussed the problem of managing per-user authorization for users who have not yet presented federated credentials to keystone, and thus have not yet been shadowed. Today, the mapping engine supports mapping federated users into preexisting user groups that carry preexisting, locally-managed authorization.
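
For reference, a rule of that kind looks roughly like the following (shown as a Python dict, equivalent to the JSON keystone accepts; the remote attribute and group ID are illustrative placeholders):

    # Map any federated user into a preexisting group; the group's
    # locally-managed role assignments then determine the user's authorization.
    mapping = {
        "rules": [
            {
                "remote": [
                    {"type": "REMOTE_USER"},
                ],
                "local": [
                    {"user": {"name": "{0}"}},
                    {"group": {"id": "federated-users-group-id"}},
                ],
            },
        ],
    }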

What we don't support is being able to say that "this shadowed user should be granted a new, unique role on a new, unique project." Most of the solutions we've discussed involve letting external admin-capable systems do that kind of provisioning against keystone. For example, Time Warner uses a proprietary Horizon plugin to do the heavy lifting, and Rackspace's public cloud implements a dedicated "signup" service. Similarly, Red Hat has pursued an implementation that requires "predicting" user IDs in Keystone (which are supposed to be UUIDs!) — a fragile assumption with serious security ramifications if you ask me.

However, with shadow users in place and the mapping engine now able to produce concrete authorization data in OpenStack, we're essentially just a step away from automatically provisioning projects and resource quotas during the authentication process as a trivial consequence of mapping.

Expect to see a new spec shortly!

PCI-compliant password management

While keystone can (and usually does) store passwords, it does nothing to manage those passwords with any notion of PCI compliance. Regardless of where your deployment's primary identity data rests (hopefully behind a PCI-compliant federated identity provider!), we still have the problem of "service users."

For large and small deployments alike, we typically have service users stored in SQL, and a truly PCI-compliant deployment would require service user credentials to be held to the same standards as real users. For some operators, it might be viable to stand up a second, PCI-compliant LDAP deployment just to handle service users, but it still makes sense to implement basic PCI-compliant password management directly in keystone.

Multifactor authentication (MFA)

Multifactor authentication (MFA) is essentially the process of making a stronger assertion about the identity of a particular user. MFA achieves that by testing multiple authentication factors of an end user: something the user "knows" (for example, a password), something the user "has" (for example, an RSA token), or something the user "is" (for example, a fingerprint).

Going into the summit, we were on course to introduce true MFA directly into keystone during the Newton release cycle, but conversations at the summit made us realize we already had a much more powerful MFA story in place, thanks entirely to identity federation and keystone's mapping engine. Aside from special use cases such as service users and instance users, identity federation should ultimately become the only first-class source of identity in keystone; that idea led us to the following alternative solution:

  1. Multifactor authentication happens to already be handled by external identity providers.

  2. An assertion that multifactor authentication has been performed can be communicated to keystone via a federated user's SAML assertions (as an authentication level), or simply assumed by keystone, depending on the identity provider.

  3. That assertion can then be used in the mapping engine to escalate the authorization received by federated users by granting additional roles.

Facepalm. Identity federation always seems to be the answer.

All that we're missing in keystone is a documented example illustrating how to configure the keystone mapping engine to grant authorization based on the assurance level presented by a SAML assertion.

For example, an assurance level of 1 might be sufficient to grant the member role, whereas an assurance level of 10 (implying the presence of MFA) might be required to grant the admin role. By the way, can the mapping engine handle integer comparisons? Probably not, but that's not actually critical. We decided to file a bug and call it a day :)
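
To sketch what that documented example might eventually look like, here is a hypothetical pair of rules (the ASSURANCE_LEVEL attribute name, group IDs, and level values are all made up; because the mapping engine does string matching rather than integer comparison, each rule lists its acceptable values explicitly):

    # Hypothetical: users asserting a high assurance level land in an
    # admin-capable group; users asserting only a basic assurance level
    # land in a member-level group.
    mapping = {
        "rules": [
            {
                "remote": [
                    {"type": "REMOTE_USER"},
                    {"type": "ASSURANCE_LEVEL", "any_one_of": ["10"]},
                ],
                "local": [
                    {"user": {"name": "{0}"}},
                    {"group": {"id": "admin-group-id"}},
                ],
            },
            {
                "remote": [
                    {"type": "REMOTE_USER"},
                    {"type": "ASSURANCE_LEVEL", "any_one_of": ["1"]},
                ],
                "local": [
                    {"user": {"name": "{0}"}},
                    {"group": {"id": "member-group-id"}},
                ],
            },
        ],
    }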

Public & private clouds

I was on a panel hosted by Brad Topol, alongside Matt Fischer, Monty Taylor, Jessie Keating, and Steve Martinelli. Together, we represented deployers, developers, and end users of both public & private clouds. We answered audience questions regarding:

  • managing authorization at scale,
  • bootstrapping new users and tenants in the cloud,
  • the state and future of identity federation, including the CLI, shadow users, and deployments,
  • options for instance users, and
  • role-based access controls.