Chris Allen - Obtaining happiness by using Diesel Async in anger

I've been getting some questions from people about how to use Diesel and particularly diesel-async for interacting with SQL databases in Rust. I thought I'd write up a quick post with some patterns and examples.

The example project on GitHub for this post is located at: https://github.com/bitemyapp/better-living-through-petroleum/tree/blog/diesel-async-in-anger

The blog/diesel-async-in-anger Git tag is so you can see the version of the code that I'm using for this post.

Why Diesel Async?

The most efficient and type-safe library for integrating with databases in Rust is Diesel. Diesel can take a little to get going with and the default is synchronous rather than asynchronous. However, Diesel has sister library named diesel-async (AQE) which permits executing queries and processing the results asynchronously.

For an example, I had a heavy database load operation in a work project that took a few minutes to complete. The servers would perform a bulk load of the entire database on boot into memory. When I rewrote the load function to use diesel-async's streaming API it took a couple seconds to complete.

The reason for using Diesel more generally that's going to matter most for developers is type-safety. I understand why SQLx is popular. A colleague of mine uses Cornucopia for heavier analytical queries that Diesel is less suited to. For ordinary OLTP queries and operations, Diesel really is the best show in town. I do recommend using Cornucopia for complicated analytical queries, you still get type-safety it's just "query-first" instead of "model-first."

My preference for diesel-async is undoubtedly influenced by the years I spent using Persistent and Esqueleto (I am not the author) in Haskell. Esqueleto is more expressive in terms of what SQL it can handle and the type-safety it provides. However, Diesel is easier to learn and become comfortable with.

OK but why do I care about type-safety?

If your data is bad your program is bad.

I prefer model-first database integration libraries (not "ORMs") that have an expressive SQL DSL. More kinetically, I arrive at working results faster when I can change the definitions of my models/tables and then fix all the type errors that arise from having done so.

This isn't different from how I write Haskell and Rust programs more generally. I define types that circumscribe the domain/business problem I'm working on and then I write functions that operate on those types. When the problem changes, the domain types change, and then the compiler tells me what conflicts with those changes. This is so that I can work faster, not distract myself with architecture astronomy.

What's the real reason you wrote this?

It causes me psychic pain when programmers use a programming language with a nice type system and then huck it overboard for the principal integration of their application because the superior choice wasn't well documented.

Newtypes will save you, newtypes will betray you

You can leverage the newtype pattern profitably in Rust just as in Haskell.

I do this quite a bit with my database models in Diesel as well, but there's a caveat which could be due to my own ignorance.

This will preserve type-safety with a newtype for a foreign key:

/// User ID
#[derive(DieselNewType)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct UserId(pub i32);

// Pretend the User model is defined here

impl User {
    fn get_user(user_id: UserId, db_conn: &PgConnection) -> User {
        let user = users::table.find(user_id).first(db_conn);
    }
}

However, you can pass in a raw i32 to that users::table.find(user_id).first(db_conn) expression and it will compile. There are some traits automatically lowering the UserId to an i32 and the default i32 trait impl can still be invoked.

I don't consider this fatal and avoid this problem by not writing ad-hoc database code outside of a helper method/function in my database model modules.

The author of the diesel-derive-newtype crate has a commit that explains this issue: https://github.com/quodlibetor/diesel-derive-newtype/commit/c4896379cac321894176a1c2e32ffce5d071409c

Integrating `diesel-async` into your application

I'm not writing this intending it to be your primary introduction to using Diesel. I'm just sharing a working example and a few tips that I've learned along the way.

Use `bb8` as your async connection pool

I generally use bb8 as my connection pool. I've used bb8-postgres and bb8-diesel before. You ask for bb8 via the bb8 feature in diesel-async:

diesel-async = { version = "0.5.0", features = ["postgres", "bb8", "async-connection-wrapper"] }

Cf. https://github.com/bitemyapp/better-living-through-petroleum/blob/ad963b6313f2fa5f4f53d459ef5bcd61513f09a5/Cargo.toml#L14

I maintain the schema.rs by hand

From my diesel.toml:

[print_schema]
file = "src/db/schema_printed.rs"

I will copy things over from src/schema_printed.rs into src/schema.rs but because I use things like diesel-derive-newtype, diesel-derive-enum and other libraries it's less trouble to just maintain the schema.rs by hand after it generates the first pass. The auto-generator doesn't have a way to know to include the domain-specific types in the column types.

Here's the schema.rs for the very basic User model: https://github.com/bitemyapp/better-living-through-petroleum/blob/ad963b6313f2fa5f4f53d459ef5bcd61513f09a5/src/db/schema.rs

This doesn't demonstrate why I insisted on hand-maintaining the schema.rs, but I'll write a follow-up post that elaborates on this example and shows why it's helpful.

Connection pools and connections

Generally speaking you want the web::Data type from Actix to be furnishing your request handlers with a connection pool, not a bare connection. I'm not even sure the latter works given the connection has to be &mut. Actix's web::Data type is a wrapper around a std::sync::Arc<T> and the T is the type you provide.

You can see an example of me getting a connection from the connection pool in a request handler here: https://github.com/bitemyapp/better-living-through-petroleum/blob/ad963b6313f2fa5f4f53d459ef5bcd61513f09a5/src/main.rs#L15

This code is just the Actix example slimmed down and modified to use diesel-async instead of diesel's synchronous API.

I always define type aliases for the database connection pool and the database connection. I use PgPool and PgConn for the connection pool and connection. Cf. https://github.com/bitemyapp/better-living-through-petroleum/blob/ad963b6313f2fa5f4f53d459ef5bcd61513f09a5/src/db/pool.rs#L11-L13

You don't want to have to change a bunch of application/DAL code later on just because your database integration changed, be kind to yourself.

Done for now

This example is available at https://github.com/bitemyapp/better-living-through-petroleum and works on my machine. If you have any questions please feel free to hit me up on Twitter or at my email address. Both are listed on this website under the "Contact" link. If there's something specific you don't know how to do with diesel-async please let me know and I'll try to help.

What's next?

Newtypes other than foreign keys
Ephemeral databases for test isolation