Async GraphQL with Rust: AuthN and AuthZ

June 21, 2022

Welcome back to my series covering Async GraphQL in Rust! This series is aimed at intermediate Rust developers who have grasped the basics and are ready to build production-ready server applications. Today's entry covers authentication and authorization using the biscuit and Oso libraries. The full code for the example project can be found on GitHub.

Though they are often lumped together under the abstract label of "auth", authentication and authorization are two distinct topics. Authentication is the process of verifying the identity of the requestor and decoding the details about them provided by the identity service. Authorization is the process of deciding which operations the requestor should be allowed to perform on the requested resources. Both concerns are very important to data security, and almost every API needs a strong solution to manage them.

Authentication

A very common pattern for authentication decodes a JWT issued by an authentication server using an OAuth or OpenID Connect flow, and uses "claims" within the payload to identify the user. The reserved "sub" claim, which refers to the "subject" (or user) of the request, is often used as a unique id for the user, and it is frequently stored as an external id on the internal user data model in an application. Other claims, such as the "scope" claim (defined in the OAuth Assertion Framework) or custom claims like "roles", are often used for authorization decisions.
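
For illustration, a decoded JWT payload carrying these claims might look like the following. The values are hypothetical, using the same username format seen later in this article:

{
    "sub": "test-user-sub",
    "iss": "https://auth.example.com/",
    "aud": "example-audience",
    "iat": 1655812800,
    "exp": 1655899200,
    "scope": "openid profile email"
}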

JWT and JWKS

To see a sample JWT, visit jwt.io and look at the "Encoded" field in the Debugger. It should be pre-populated with an example encoded JWT. You will often see values like this included in the "Authorization" HTTP header prefixed with the word "Bearer ", since that denotes the type of token they are. To prevent manipulation, tokens are typically signed using a public/private key pair, with the authentication server having sole access to the secret private key. The public key in each pair is typically shared with an API server using JSON Web Key Sets (JWKS) along with a key id, allowing the API server to use the public key to validate the cryptographic signature generated with the authentication server's private key.
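
As a hypothetical example, an abbreviated JWKS document with a single RSA signing key might look like this (the actual modulus is elided):

{
    "keys": [
        {
            "kty": "RSA",
            "use": "sig",
            "alg": "RS256",
            "kid": "example-key-id",
            "n": "...base64url-encoded modulus...",
            "e": "AQAB"
        }
    ]
}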

Because of this validation, an API server can trust that the payload and the claims within have been issued directly from the authentication server, and haven't been tampered with by a malicious user. Though the user can see the claims that the authentication server makes about them, they can't change the claims because the cryptographic signature would no longer match. This is why they are trustworthy for use within authorization decisions.

JOSE with Biscuit

The biscuit crate is a Rust library that handles JSON Object Signing and Encryption (JOSE), which will allow us to use a special key set URL on the authentication server to validate the signature of the JWT sent by the requestor. Following OpenID conventions, this URL typically ends in .well-known/jwks.json, and it should contain one or more public keys identified by a key id. The JWT header should contain a key id (usually presented in the "kid" parameter) that matches up with one of the keys in the authentication server's JWKS, indicating that the matching key can be used to verify the payload signature.
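
For example, the header of a JWT signed with the hypothetical key from the JWKS above would reference it like this:

{
    "alg": "RS256",
    "typ": "JWT",
    "kid": "example-key-id"
}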

My GraphQL API is made available via the Rust warp crate, and in my caster_api::router::create_routes function I pull in caster_auth::authenticate from the "libs" folder so that I can use the with_auth() filter to extract the Subject from the request based on the "sub" claim on their JWT token.

let graphql_post = graphql(schema)
    // Add the Subject to the request handler
    .and(with_auth(jwks))

With the .and() method from warp I can add additional arguments to the HTTP request handler. The graphql() method from async_graphql_warp adds the async_graphql::Schema and async_graphql::Request as the first argument within a tuple. Calling .and(with_auth(jwks)) uses the with_auth() filter to add the Subject as the second argument, and I'll show in the next section how the UsersService is injected as well.

async fn with_context(
    (schema, request): (
        Schema<Query, Mutation, EmptySubscription>,
        async_graphql::Request,
    ),
    sub: Subject,
    users: Arc<dyn UsersService>,
) -> Result<GraphQLResponse, Infallible> {
    // ...
}

This request handler returns a standard Rust Result that should always successfully return an async_graphql_warp::GraphQLResponse since it is designated as Infallible.

JWKS Retrieval

The jwks value is provided by the function that calls create_routes(), which is the run() function at the top level of the caster_api library, in the apps/api/src/lib.rs file. This is called both by the main.rs file in the same crate, and also by the test_utils.rs file that I'll talk about in the next article. In this application, the well-known JWKS file is read in at startup and shared across threads as an immutable static reference using Tokio's OnceCell implementation. This easily prevents the server from repeatedly hammering the authentication server for the keys, but the tradeoff is that if the key set is rotated the API server will need to be restarted to pick up the changes.
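
As a minimal sketch of that pattern - the init_jwks name and the free-standing get_key_set() call are stand-ins for the project's actual implementation, which is shown below - the OnceCell usage might look like this:

use biscuit::{jwk::JWKSet, Empty};
use tokio::sync::OnceCell;

static JWKS: OnceCell<JWKSet<Empty>> = OnceCell::const_new();

/// Retrieve the key set on the first call, then hand out a &'static reference
pub async fn init_jwks(config: &'static Config) -> &'static JWKSet<Empty> {
    JWKS.get_or_init(|| async {
        // Fail fast at startup if the authentication server is unreachable
        get_key_set(config)
            .await
            .expect("Unable to retrieve the JWKS")
    })
    .await
}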

Various biscuit tools and types are used to work with the key set data, which is retrieved using a hyper HTTP request that is deserialized using serde.

pub async fn get_key_set(&self) -> anyhow::Result<JWKSet<Empty>> {
    let url = format!("{}/.well-known/jwks.json", &self.config.auth.url);

    let req = Request::builder()
        .method(Method::GET)
        .uri(url)
        .body(Body::empty())?;

    let response = self.client.request(req).await?;

    let body = to_bytes(response.into_body()).await?;

    let jwks = serde_json::from_slice::<JWKSet<Empty>>(&body)?;

    Ok(jwks)
}

The Empty type passed into the biscuit JWKSet generic type above represents that there are no expected additional parameters on each JWK. The Request type comes from hyper, as does the to_bytes() function that transforms the response into something serde can read from.

JWKS Verification

Now that the authentication server's key set is retrieved, the with_auth() filter uses the warp::filters::header::headers_cloned() filter to pull the headers so that the "Authorization" header can be read and decoded:

pub fn with_auth(
    jwks: &'static JWKS,
) -> impl Filter<Extract = (Subject,), Error = Rejection> + Clone {
    headers_cloned().and_then(move |headers: HeaderMap<HeaderValue>| authenticate(jwks, headers))
}

The funky &'static notation indicates a borrowed reference with a static lifetime, since the key set doesn't ever change during the execution of the program. The Filter type comes from warp and specifies the value that the filter extracts and what the rejection type looks like. The + Clone modifier at the end indicates that the return type also implements the Clone trait and is able to be duplicated with .clone().

The .and_then() method comes from the Filter trait, and it takes the headers and passes them to the authenticate() function along with the JWKS keys. This function uses the jwt_from_header() function to retrieve the "Authorization" header and uses match to handle it in different ways depending on whether it was successful and had a result, was successful but empty, or resulted in an error.

async fn authenticate(
    jwks: &'static JWKS,
    headers: HeaderMap<HeaderValue>,
) -> Result<Subject, Rejection> {
    match jwt_from_header(&headers) {
        Ok(Some(jwt)) => {
            // ...
        }
        Ok(None) => Ok(Subject(None)),
        Err(e) => Err(warp::reject::custom(e)),
    }
}
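
The jwt_from_header() function isn't shown in the article's snippets, but a minimal sketch might look like the following, assuming a hypothetical AuthError type for the failure cases:

use warp::http::{header::AUTHORIZATION, HeaderMap, HeaderValue};

const BEARER: &str = "Bearer ";

/// A hypothetical error type for authentication failures
#[derive(Debug)]
pub enum AuthError {
    InvalidAuthHeaderError,
    JWKSError,
    JWTTokenError,
}

fn jwt_from_header(headers: &HeaderMap<HeaderValue>) -> Result<Option<String>, AuthError> {
    // A missing "Authorization" header is an anonymous request, not an error
    let header = match headers.get(AUTHORIZATION) {
        Some(value) => value,
        None => return Ok(None),
    };

    let auth_header = std::str::from_utf8(header.as_bytes())
        .map_err(|_| AuthError::InvalidAuthHeaderError)?;

    if !auth_header.starts_with(BEARER) {
        return Err(AuthError::InvalidAuthHeaderError);
    }

    Ok(Some(auth_header.trim_start_matches(BEARER).to_string()))
}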

If a token is found, authenticate() identifies the key id, uses the get_secret_from_key_set() function to find the matching public key, and verifies the payload signature using the into_decoded() method provided by the biscuit Compact struct.
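
Here's a hedged sketch of that verification step, assuming RS256 signing keys and reusing the hypothetical AuthError from the previous sketch; the biscuit calls reflect my reading of its API rather than the project's exact code:

use biscuit::{
    jwa::SignatureAlgorithm,
    jwk::{AlgorithmParameters, JWKSet},
    Empty, JWT,
};

fn decode_jwt(jwks: &JWKSet<Empty>, jwt: &str) -> Result<JWT<Empty, Empty>, AuthError> {
    let token = JWT::<Empty, Empty>::new_encoded(jwt);

    // Read the key id from the (not yet verified) header
    let key_id = token
        .unverified_header()
        .map_err(|_| AuthError::JWTTokenError)?
        .registered
        .key_id
        .ok_or(AuthError::JWKSError)?;

    // Find the matching public key in the key set
    let jwk = jwks.find(&key_id).ok_or(AuthError::JWKSError)?;
    let secret = match &jwk.algorithm {
        AlgorithmParameters::RSA(rsa) => rsa.jws_public_key_secret(),
        _ => return Err(AuthError::JWKSError),
    };

    // Verify the signature and decode the payload in one step
    token
        .into_decoded(&secret, SignatureAlgorithm::RS256)
        .map_err(|_| AuthError::JWTTokenError)
}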

JWT Subject

Finally, the "sub" claim is pulled from the verified payload because it's the only claim that this application cares about. Roles and Permissions are managed inside the app so no additional claims are needed. This claim is wrapped in a newtype called Subject that contains an Option<String>:

/// The token's Subject claim, which corresponds with the username
#[derive(Clone)]
pub struct Subject(pub Option<String>);

This means that the Subject can be None, which indicates that there was no token or no "sub" claim and the request should be treated as anonymous. In the GraphQL resolvers this is handled with a match check:

// Otherwise, check for a username so that it can be created
let username = match subject {
    Subject(Some(username)) => Ok(username),
    _ => Err(graphql_error("Unauthorized", StatusCode::UNAUTHORIZED)),
}?;

Rust's pattern matching is a powerful feature that allows you to "reach into" multiple layers, like you see above where I match only Subjects with an inner Some value.

User Injection

Having access to the Subject in my resolvers can be useful, especially when dealing with User operations. In most cases, though, what I really want is the User object that matches the Subject. To make this easily available to my resolvers I add an async operation to attempt to retrieve a User with a username value that matches the Subject inside my with_context() request handler:

let graphql_post = graphql(schema)
    // Add the Subject to the request handler
    .and(with_auth(jwks))
    // Add the UsersService to the request handler
    .and(warp::any().map(move || context.users.clone()))
    // Add details to the GraphQL request context
    .and_then(with_context);

As you can see, I use warp::any() to add the arbitrary context of the UsersService instance to the request handler, so that it's available to look the User up in the database by username.

async fn with_context(
    (schema, request): (
        Schema<Query, Mutation, EmptySubscription>,
        async_graphql::Request,
    ),
    sub: Subject,
    users: Arc<dyn UsersService>,
) -> Result<GraphQLResponse, Infallible> {
    // Retrieve the request User, if username is present
    let user = if let Subject(Some(ref username)) = sub {
        users.get_by_username(username, &true).await.unwrap_or(None)
    } else {
        None
    };

    // Add the Subject and optional User to the context
    let request = request.data(sub).data(user);

    let response = schema.execute(request).await;

    Ok::<_, Infallible>(GraphQLResponse::from(response))
}

The users.get_by_username() method takes the username found within the Subject wrapper, and uses .unwrap_or(None) to simply default to None if something goes wrong. The request.data() method on the async-graphql Request adds the information to the request context, where it can be retrieved like this:

/// Update the current User based on the current token username (the "sub" claim)
async fn update_current_user(
    &self,
    ctx: &Context<'_>,
    input: UpdateUserInput,
) -> Result<MutateUserResult> {
    let user = ctx.data_unchecked::<Option<User>>();
    // ...
}

With this information, you should now be confident in who the User is based on their "sub" identifier claim. Next you need to decide what that User can do.

Authorization

The complexity of an authorization layer can vary widely depending on the business domain and the use cases involved. Some of the simplest APIs can meet their requirements with basic logic in the controller or resolver layer. Other applications, such as a large collection of microservices with complex relationships, might opt instead for a central authorization server or even a cluster of servers that sit in front of the API endpoints.

I chose Oso, a free open-source authorization library that also integrates with an optional managed cloud platform. I don't need the cloud platform right now, but it's interesting to know that it's there if I need to scale to that level. The library is written in Rust and is embedded in many other languages, so I can share my authorization strategy with other applications that aren't in Rust in a consistent way.

Polar Bear Logic

I use Oso's Polar language to define a set of rules in authorization.polar files that I keep in each "libs" crate where authorization is needed. In libs/users/src/authorization.polar, I define User as an "actor", and define an "allow" rule:

allow(actor, action, resource) if
    has_permission(actor, action, resource);

actor User {}

The actor is the one who wants to perform a particular action (such as "update" or "delete") on a particular resource (such as a Show or an Episode). The allow rule says that in order to allow an action, the has_permission() check needs to pass for that actor, action, and resource. This works together with Oso's RBAC shorthand system to implement Role-based access control. There are no Roles or Permissions defined in the caster_users crate, so let's switch over to the caster_shows crate at libs/shows/src/authorization.polar.

Skip down to the resource Show definition first to get an idea of the Roles and Permissions available for this resource:

resource Show {
    permissions = [
        # Update details about a Show
        "update",
        # Delete a Show
        "delete",
        # Create, update, and delete any Episodes for a Show
        "manage_episodes",
        # Grant or revoke Profile Roles for a Show
        "manage_roles"
    ];

    roles = [
        # Able to update a Show and manage Episodes
        "manager",
        # Able to fully control a Show
        "admin"
    ];

    "update" if "manager";
    "manage_episodes" if "manager";

    "delete" if "admin";
    "manage_roles" if "admin";
    "manager" if "admin";
}

This says that if an actor has the "manager" Role, then they have the "update" and "manage_episodes" Permissions. If the actor has the "admin" Role, then they have the "delete" and "manage_roles" Permissions, as well as all of the Permissions granted to the "manager" Role.

Now, take a look at the check at the top of the file:

has_role(user: User, role_key: String, show: Show) if
    role in user.roles and
    role.role_key = role_key and
    role.resource_table = "shows" and
    role.resource_id = show.id;

This is the has_role check for the Show resource. By explicitly specifying Show as the type of the "show" parameter, along with the other concrete types specified here, I'm saying that this rule applies only when the arguments match the given types - it will only be invoked for User actors wanting to take action on a Show resource. It is called automatically by the has_permission rule that Oso derives behind the scenes from the RBAC shorthand; you never write has_permission yourself. Oso calls has_role with the arguments you see above to determine which Permissions the actor has been granted, and then determines whether the requested action should be permitted. Actions match up with Permissions, so Oso looks for the "update" Permission in order to allow the "update" action.

For this check to work, inside my Rust code I need to pass it a user instance that has a roles property. This property should contain a list of objects with a role_key property, a resource_table property, and a resource_id property. Later on, I'll show how we store these generic relationships in the database. For now, understand that the resource_table points to an actual table in the database, and the resource_id points to the primary key of a row within that table. The role_key is the unique key for a Role that should match up with the Roles defined in the Polar files.
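
As a hedged sketch, the role objects Oso receives might be modeled like this - the field names come straight from the Polar rules, though the real model in the project also derives the SeaORM traits:

use oso::PolarClass;

#[derive(Clone, PolarClass)]
pub struct RoleGrant {
    /// The unique key for the granted Role, like "admin"
    #[polar(attribute)]
    pub role_key: String,

    /// The table of the resource the Role applies to, like "shows"
    #[polar(attribute)]
    pub resource_table: String,

    /// The primary key of the resource row within that table
    #[polar(attribute)]
    pub resource_id: String,
}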

Making a Decision

In my GraphQL resolver (or in the controller layer for a REST API), I use oso.is_allowed() to run checks against these Polar rules that I've defined. For example, when I want to check to see if the requesting User can "update" a particular Show:

if let Some(user) = user {
    if !oso.is_allowed(user.clone(), "update", existing)? {
        return Err(graphql_error("Forbidden", StatusCode::FORBIDDEN));
    }
} else {
    return Err(graphql_error("Unauthorized", StatusCode::UNAUTHORIZED));
}

I first check to see if a User was found within the Warp filter. If there is a User, I call .is_allowed() with a duplicate of the User (since the value inside the Some is actually a reference, and Oso needs to take ownership), the requested "action" that they need a Permission for, and the existing Show record retrieved from the database.

Since Polar knows that the user value is a User instance and the existing value is a Show instance (I'll share more on how it knows that in a moment), it knows to call the has_role() function I defined above in the authorization.polar file. The User object it receives includes the related roles because of the &true value provided for the with_roles flag when the UsersService was called. The resulting object looks like this:

{ "id": "test-user-id", "username": "test-user-sub", "roles": [ { "resource_table": "shows", "resource_id": "test-show-id", "role_key": "admin" } ], "created_at": "...", "updated_at": "..." }

In this example, the existing value is an instance of Show, the resource_table has a value of "shows", and the role_key has a value of "admin". As long as the existing Show has an id that matches the "test-show-id" set as the resource_id above, Oso knows the actor can perform the "update" action, because the "update" Permission is granted to the "manager" Role, which is in turn included within the "admin" Role. The .is_allowed() check passes!

Wiring it all Together

To bring all of this together and make it available to my resolver (or controller) code, I perform a bit of initialization logic when I put together the Context for my application. Inside the caster_api crate in apps/api/src/lib.rs, I add some logic to the init() method:

impl Context {
    pub async fn init(config: &'static Config) -> Result<Self> {
        // ...

        // Set up authorization
        let mut oso = Oso::new();

        oso.register_class(User::get_polar_class_builder().name("User").build())?;
        oso.register_class(Profile::get_polar_class_builder().name("Profile").build())?;
        oso.register_class(Show::get_polar_class_builder().name("Show").build())?;
        oso.register_class(Episode::get_polar_class_builder().name("Episode").build())?;

        oso.load_str(&[PROFILES_AUTHZ, SHOWS_AUTHZ].join("\n"))?;

        Ok(Self {
            // ...
            oso,
            // ...
        })
    }
}

This first creates a new Oso struct instance, and then uses .register_class() to load it up with information about each of my data entities. This works because of the #[derive(PolarClass)] macro that I apply to each Model struct - the same struct that acts as both the GraphQL Object type and the SeaORM Model. It pulls triple duty as a Polar authorization data class as well.
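
As a hedged sketch with the SeaORM and async-graphql derives elided (and with illustrative fields), the Polar side of the Show Model might look like this:

use oso::PolarClass;

#[derive(Clone, PolarClass)]
pub struct Model {
    /// Exposed to Polar rules like `role.resource_id = show.id`
    #[polar(attribute)]
    pub id: String,

    #[polar(attribute)]
    pub title: String,
}

The derive is what generates the get_polar_class_builder() method used in the init() logic above.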

The oso.load_str() call loads the two authorization.polar files as simple concatenated strings. These are exported as part of the lib.rs in each "libs" crate:

/// Authorization rules
pub const AUTHORIZATION: &str = include_str!("authorization.polar");

Then they're renamed as they are pulled into the caster_api lib.rs module:

use caster_shows::AUTHORIZATION as SHOWS_AUTHZ;

This makes them a part of the source code, so if the rules change the code must be re-compiled. This could also be done by retrieving text files stored in a particular location, but strong security precautions should be taken if that approach is used.

Finally, the oso instance is added to the Context so that it can be injected into the GraphQL context and used within the resolvers.
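
That last step isn't shown here, but a minimal sketch of it using async-graphql's schema builder might look like this (the Query and Mutation defaults are assumptions):

let schema = Schema::build(Query::default(), Mutation::default(), EmptySubscription)
    // Make the Oso instance available to resolvers via ctx.data_unchecked::<Oso>()
    .data(context.oso.clone())
    .finish();

Resolvers can then borrow the instance from the request context, as the .is_allowed() example above does.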

Roles and Permissions

To track Roles granted to a particular User for a given resource, a "role_grants" table is used in the database. This table has role_key, resource_table, and resource_id columns to match up with what Oso expects.

When a Show is created, the "admin" Role is granted to the creator like this:

role_grants
    .create(&CreateRoleGrantInput {
        role_key: "admin".to_string(),
        user_id: user.id.clone(),
        resource_table: "shows".to_string(),
        resource_id: show.id.clone(),
    })
    .await
    .map_err(as_graphql_error(
        "Error while granting the admin role for a Show",
        StatusCode::INTERNAL_SERVER_ERROR,
    ))?;

This creates a new row in the "role_grants" table with a role_key of "admin", a resource_table of "shows", a resource_id matching the id of the Show row, and a user_id matching the id of the User that created the Show.

To retrieve those RoleGrants for a particular user, the get_by_username() method in the UsersService is used:

async fn get_by_username(&self, username: &str, with_roles: &bool) -> Result<Option<User>> {
    let query = user_model::Entity::find()
        .filter(user_model::Column::Username.eq(username.to_owned()));

    let user: UserOption = if *with_roles {
        query
            .find_with_related(role_grant_model::Entity)
            .all(&*self.db)
            .await?
            .first()
            .map(|t| t.to_owned())
            .into()
    } else {
        query.one(&*self.db).await?.into()
    };

    Ok(user.into())
}

The .find_with_related() call here tells SeaORM to find all rows in the "role_grants" table with a user_id that matches the id of the User that is found by username.

Next Time

Now we have a strong embedded authentication and authorization layer that uses the "sub" claim on a JWT to match the requestor up with a User, who can have one or more Roles granted to them, with decisions made using the Polar logic language provided by Oso.

Next time, I'll show how I test my API at the unit and integration level. I'll cover testing services, resolvers, and full HTTP requests. Though Rust has a strict and sound type system, I believe that strong test coverage is still important to check my assumptions and prove that my code will work the way I think it will. It also serves as great documentation for maintainers who come along after you, showing how you intend your code to be used.

I hope you're enjoying my series and you're excited to begin building high-performance GraphQL applications with the sound type safety and memory protection that Rust is famous for! You can find me on Discord at bkonkle#0217 - I lurk around several Rust and TypeScript Discord servers, and I'd love to hear from you! You can also find me on Twitter @bkonkle. Thanks for reading!

This content originally appeared at: https://konkle.us/async-graphql-with-rust-part-three/
