GraphQL for SQL Developers - Part 1

The Schema

Featured on Hashnode

The question "Why should I use GraphQL when I can just use SQL?" popped up in my twitter feed a few weeks ago. Like many posts designed to stir controversy, it was quickly followed by a number of comments about the relative superiority of one technology over the other. Since this was twitter many posts seemed, shall we say, uninformed. This blog post aims to clarify the distinct yet complimentary role that each of these "QLs" play in a modern application stack.

Let's get past the semantics first. SQL stands for Structured Query Language. While the language itself is structured the result of a SQL query is always exactly one thing: a table i.e. the structure of the result is fixed. GraphQL stands for Graph Query Language - the graph is the structure that is being queried, specifically a directed acyclic graph or DAG. The result of a GraphQL query is always a JSON object. This gives GraphQL tremendous flexibility, making it ideal for building rich user experiences. Confusingly however, GraphQL does not include key elements of what many would consider a query language: there are no WHERE or HAVING clauses, no LIMIT, no ORDER BY and, blasphemy, no JOINs! While GraphQL permits a client to request data in a particular shape and does support query parameters, it doesn't specify where or how data is stored nor how it maps to results.

SQL and GraphQL really are two sides of a coin. SQL is tied to a storage model but GraphQL is tied to a presentation model.

6to1nl.jpg

When working with a SQL database, developers create a mental model of how the database is structured and how to get various pieces of data in and out of it. I would argue that SQL developers have a particular mindset about working with data. Working with GraphQL requires a new and different mindset even if you are using GraphQL as an API layer above a SQL database.

The Schema

Let's start with something that GraphQL and SQL have in common, a schema! A schema describes the entities that make up a data model along with their relationships. When working with SQL databases we often visualize the model using an entity relationship diagram or ERD. In this series of posts we will be using a demo database which models airline travel. The ERD for this database is shown below:

Screen Shot 2022-09-16 at 12.18.25 PM.png This database is relatively simple, with only 8 tables however it leverages some postgres extensions including JSONB fields and geospatial datatypes so some additional complexity is hiding in that simple diagram.

Note: this demo database comes from Russia. It includes data in both Cyrillic and English. All the flights are internal Russian flights from the summer of 2017. No endorsement of Russia is intended. slava-ukraini.webp

A GraphQL schema is primarily made up of types - you can think of these as being entities but they are most definitely not tables. They are not classes in the OOP sense either since while they have properties - a.k.a. fields, they don't have behaviors.

The code for this example is here on github. If you don't have postgres running somewhere you'll also need to get that setup (with the postgis geospatial extension) then you can load the sample data. The small dataset should be sufficient. The demo is built in typeScript/nodejs and uses the Apollo GraphQL server. Note that each post in this series uses a different branch of the repository - this post is tied to the intro branch.

The Booking Type

Let's illustrate by defining the type for the top left table in the ERD: the booking:

type Booking {
      id: ID!
      bookedAt: DateTime!
      reference: String!
      totalAmount: Float!
      tickets: [Ticket]
}

This looks a lot like the definition of the Bookings table in our SQL database with a few exceptions:

  • I've arbitrarily added an id field of type ID - more on this later
  • There is an extra tickets field which is of type [Ticket], what is this?

You may also be wondering why there are exclamation points after some of the types?

A GraphQL type definition looks a lot like a javascript object: curly braces are used to surround a set of key-value pairs. In a GraphQL type the keys are called fields and the values are the type of each field. The exclamation point ! after the field types indicates that a field is non-nullable i.e. NOT NULL in SQL terms. Unlike javascript, commas are not used as delimiters.

The last field in this definition is the tickets field which references the Ticket type. The square brackets around the Ticket type indicate multiple of the related entity. In plain-English: a booking can have many tickets. In the SQL schema we have a foreign key in the Tickets table that points to the the Bookings table but in GraphQL we can use both forward and backward references.

Let's define the Ticket type - note that you can define types in any order - a type does not already have to be defined for you to use it in a different type. I personally like to organize my types alphabetically. A common convention is also to capitalize the name of a type and to make it singular. A plural field name (tickets in the above case) will be followed by an array of some other type.

A field in a type definition can refer to another complex type or to a scalar-type. The built-in GraphQL scalar types are Int, Float, String, Boolean and ID. In the demo code I'm using the graphql-scalars package to extend the core scalar types with a large number of other useful types, including DateTime. More of these will show up as we extend this model.

The Ticket Type

"""
A ticket is issued to an individual Passenger.
"""
type Ticket {
      id: ID!
      booking: Booking!
      passenger: Passenger!
      ticketNo: String!
}

Here we've introduced a documentation comment for the type. This is the text surrounded on both sides by triple-double-quotes """, akin to the COMMENT statement in SQL which I'm sure many of you use frequently. The documentation comment refers to the item immediately below it, in this case a type. You can add documentation comments on fields as well.

The ticket type has a back-reference to the Booking that it is a part of. It also references a single Passenger type.

If we look at the ERD for the database we can see that the Flights table has passenger_id, passenger_name, and contact_data columns. In the type definition here I've arbitrarily decided to define a Passenger type because (a) it allows me to think of it as a different kind of entity and (b) I suspect that I might need to extend its definition later.

"""
A person who is traveling
"""
type Passenger {
      id: ID!
      email: EmailAddress
      name: String!
      phone: PhoneNumber
}

Here again I've added an id field where a field of that name does not exist in the database. The database uses ticketNo as the primary key for the Tickets table. I've decided to have a separate id field because I found it's simpler to code when your types typically have an id field - you don't have to think so much about what the primary identifier might be called (it's a habit I've carried over from working with MongoDB). My personal convention is to start type definition with the id field followed by the remaining fields in alphabetical order. Another common GraphQL convention is that field names are camelCase.

Remaining Types

We can go through the remaining types a bit quicker now that we've got a bunch of definitions out of the way.

The Airport Type

"""
An airport is the source and destination for flights
"""
type Airport {
      """
      For the purpose of this demo the id is the same as the airportCode
      """
      id: ID!
      """
      The 3 letter airport code, globally unique
      """
      airportCode: String
      """
      The name of the airport in the language requested by the client
      """
      airportName: String
      """
      The name of the city that the airport serves in the language requested by the client
      """
      city: String
      coordinates: Point
      timezone: String
}

Here we have an example of a type where most fields have documentation comments. This type does have an important detail: in the SQL table the coordinates column is defined as being of type POINT - i.e. a latitude/longitude pair in postgres. In the database one of these looks like (37.347301483154,45.002101898193) i.e. 37.347N, 45.002E. To handle this in GraphQL we'll define a custom Point type:

"""
The coordinates of a place on earth
"""
type Point {
      x: Longitude
      y: Latitude
}

Furthermore in the postgres database the airport name and city columns are actually represented as JSONB fields with a one key-value pair for each supported language, e.g.:

{
  "en": "Anapa Vityazevo Airport",
  "ru": "Витязево"
}

We don't need to send our clients all possible names so I defined the airportName and city fields as String types. The server can use the client's language preferences to serve the correct string.

TimeZone, Latitude and Longitude are custom scalar types that are also available in the graphql-scalars package.

If we were particularly thorough we could define our own custom scalar type for airport codes - three uppercase letters.

The Flight type

"""
A flight represents an aircraft flying between two airports on a particular date
"""
type Flight {
      id: ID!
      actual: TimeInterval
      aircraft: Aircraft!
      arrivalAirport: Airport!
      departureAirport: Airport!
      """
      The flight number is a string where the first 2 characters represent the airline and the remaining digits represent the route
      """
      flightNo: String!
      scheduled: TimeInterval!
      status: FlightStatus!
}

Here we have another new situation where the type refers to another type more than once. arrivalAirport and departureAirport both refer to the Airport type but clearly different instances of that type. Same for actual and scheduled which refer to the TimeInterval type. The latter is required hence the TimeInterval!

"""
The TimeInterval type is a pair of departure and arrival times.
It can be used for either scheduled or actual time pairs.
"""
type TimeInterval {
      depart: DateTime
      arrive: DateTime
}

Aircraft and Seat Types

"""
Represents a *kind* of aircraft (not to be confused with a particular *instance* of an aircraft)
"""
type Aircraft {
      """
      For the purpose of this demo the id is the aircraftCode
      """
      id: ID!
      """
      A string representing the type of aircraft
      """
      aircraftCode: String
      """
      The model of the aircraft, in the language requested by the client
      """
      model: String
      """
      The range of this type of aircraft in miles
      """
      range: NonNegativeInt
}

"""
An aircraft has a number of seats each with their own number-letter code
"""
type Seat {
      seatNo: String
      aircraft: Aircraft
      fareConditions: FareConditions!
}

Now if you've been paying really close attention you might have noticed a couple types that we haven't described: FareConditions and FlightStatus. These turn out to be enums just like we might define in SQL.

"""
Classes of service
"""
enum FareConditions {
      Business
      Comfort
      Economy
}

"""
The *current* status of a flight
"""
enum FlightStatus {
      Departed
      Arrived
      On
      Time
      Cancelled
      Delayed
      Scheduled
}

The `ticket_flights` and `boarding_passes` tables are also not represented (yet) in our GraphQL schema. We'll approach these later once we see what kind of queries we want to run that involve them.

Exploring the Schema

If you have installed the demo code and are running it, navigating to the server url (typically localhost:4000) and explore your schema using Apollo Studio. You should see something like:

Screen Shot 2022-09-20 at 12.23.04 PM.png

All the documentation comments you add to your schema show up in studio. This becomes increasingly useful as more developers work on larger models.

Summary

All the types above are defined in the project's typeDefs file.

Starting with 9 database entities a.k.a. tables in the original postgres demo I've created 9 distinct GraphQL types. I could have created more entities and also less, depending on what I eventually want to do with the data. We skipped the ticket_flights and boarding_passes tables altogether; we'll return to these in part 3.

Take-Aways

  • SQL tables do not have to map 1:1 to GraphQL types. In practice it will be common for many tables to be hidden from clients.
  • Columns in SQL tables do not need to map 1:1 to fields in GraphQL types. Foreign keys for example rarely appear in GraphQL type definitions. Secrets (ex: a user's hashed password) are also usually omitted from the GraphQL schema.
  • Relations are described by inclusion in GraphQL instead of by foreign keys
  • Relations in GraphQL can be forward or backward (pointing in the 1:many direction or the many:1 direction)
  • The names used in the GraphQL schema don't have to be the same as the names in the SQL schema
  • You can include fields in your GraphQL schema that don't exist in your SQL schema (e.g. computed fields)
  • GraphQL has a very small number of built-in types relative to SQL. In nodejs, use the npm graphql-scalars package to extend the types. Similar packages are available for other languages.
  • The GraphQL schema does not specify at all how data is stored or even if there is a database underneath or a flat file or a REST endpoint
  • A GraphQL schema does not need or benefit from any kind of normalization. The more important concern is "How does the client want to see the data?"

Up Next

In part 2 of this series we'll start connecting our GraphqQL schema to our postgres demo database and accessing data.

Spoiler alert:

6tr7ph.jpg