GraphQL for SQL Developers - Part 3

When we need to JOIN

In part 1 and part 2 we've gone through some basic data modeling in GraphQL and written our first simple queries. However we have yet to try to write a useful query driven by user experience.

One of the basic user experiences in flying is looking at an itinerary. If you're flying by yourself one-way this is very simple: your flight leaves at a particular time from a particular airport and arrives at a different one sometime later. If you're flying with your family and making multiple stops things get more complicated. As the person who booked the flights you want to see an itinerary that shows who is flying where and when. You probably also want to make sure that your family is sitting together.

The demo data model we're using is flexible enough that some travelers on the same booking might be taking different flights. For example everyone is flying together on the outbound legs but one family member is taking a detour on the way back to stop in a different city for a few days.

This part uses the firstJoin branch of the repo. Assuming you've got the project already setup locally, you can follow along with:

$ git checkout firstJoin
$ npm install
$ npm start

Let's lay out what we want to show the user on screen or in an email:

booking reference: AXQS8I
Passengers:
   Alice
   Bob
   Ted

Flight XY101
    Leaving DME 11:06am October 8, 2022
    Arriving AER 3:02pm October 8, 2022
Aircraft type: Airbus A319-100
    Seat Assignments:
        Alice 11C comfort class
        Bob 11A comfort class
        Ted  11B comfort class

Flight XY102 
    Leaving AER 6:12am October 14, 2022 
    Arriving VVO 5:32pm October 14, 2022
Aircraft type: Airbus A319-100
    Seat Assignments
        Alice 15F economy class
        Bob 15D economy class
        Ted 15E economy class

Flight XY259 
    Leaving VVO October 21, 2022 8:57am 
    Arriving DME 11:19am October 21, 2022
Aircraft type: Boeing 777-300
    Seat Assignments
        Alice 2C business class
        Bob 2A check-in at airport

Booked: September 23, 2022 Total Price: 29,154.

If we look at our GraphQL schema as it stands however we don't have an obvious way to return such a structure. In part 1 we said we would defer the modeling of the ticket_flights and boarding_passes table. It's time to make use of these.

Let's define a couple new types.

"""
A ticketed passenger holds a ticket on a particular flight. They may or may not have a boarding pass.
"""
 type TicketedPassenger {
      ticket: Ticket!
      fareConditions: String!
      amount: Float!
      boardingPass: BoardingPass
}
"""
A boarding pass permits a passenger to board a particular flight
"""
type BoardingPass {
      boardingNo: NonNegativeInt
      flight: Flight!
      seatNo: String
      ticket: Ticket!
}

We also need to extend the Flight type with a ticketedPassengers field:

type Flight {
    …existing fields
    ticketedPassengers(bookRef: String): [TicketedPassenger]
}

Note here that the new field has its own query parameters (bookRef: String) There is no exclamation point at the end of String so this parameter is optional.

Using our new types and field we can now define a query in our typeDefs:

"""
Fetch a particular Itinerary based on a booking reference
"""
fetchItinerary(bookRef: String!): [Flight]

To make this query executable requires we write resolvers in javascript. Let's look at fetchItinerary:

export const fetchItinerary = (_: object, { bookRef }: tTicket) =>
  database('flights')
    .select(['*'])
    .whereIn('flightId', function () {
      void this.select('ticketFlights.flightId')
        .from('ticketFlights')
        .join('tickets', 'tickets.ticketNo', 'ticketFlights.ticketNo')
        .where({ bookRef });
    })
    .orderBy('flights.scheduledDeparture');

This also requires additional type resolvers to go resolve all the fields we will have access to. We won't repeat that code here.

Now let's try to actually run a query in Apollo Studio that fetches an itinerary from the database.

query fi($reference: String!) {
  fetchItinerary(bookRef: $reference) {
    flightNo
    scheduled {
      depart
      arrive
    }
    departureAirport {
      airportCode
      airportName
      city
    }
    arrivalAirport {
      airportCode
      airportName
    }
    aircraft {
      model
    }
    ticketedPassengers(bookRef: $reference) {
      ticket {
        passenger {
          name
        }
      }
      boardingPass {
        seatNo
        boardingNo
      }
      fareConditions
      amount
    }
  }
}

This query takes a $reference parameter. Note that this parameter is passed to both the fetchItinerary query and the ticketedPassengers field. This type of construct allows us to pass query parameters into individual field resolvers to narrow down the results returned by that resolver.

Here we'll run it with the most expensive booking in the sample database, 3B54BB:

Screen Shot 2022-09-23 at 4.21.57 PM.png

Running the query returns the following results:

Note that I've switched the accept-languages header back to en to show results in English.

Screen Shot 2022-09-25 at 6.25.48 PM.png

It looks like we have most, but not all, of the information we need to display an itinerary. We're missing the information about the Booking that we want to display at the top of the itinerary along with the list of passengers that are flying.

Fortunately GraphQL permits us to multiple queries together. Let's do:

query fi($reference: String!) {
  oneBooking(reference: $reference) {
    id
    reference
    bookedAt
    totalAmount
    tickets {
      passenger {
        name
      }
    }
  }

  fetchItinerary(bookRef: $reference) {
    flightNo
    scheduled {
      depart
      arrive
    }
    …rest of the fields we used before

You can see here that we've defined our query in Apollo Studio as query fi($reference: String!) {…} and inside that query we have two queries that refer to the same $reference parameter: oneBooking(reference: $reference) {…} and fetchItinerary(reference: $reference) {…}. We had defined the oneBooking query in part 2 of this series and now we're reusing it in the context of a larger query.

The query name fi is not in our schema - this is just a local name we're using in our client (in this case Apollo Studio) to refer to the actual queries that are in our typeDefs.

With the extra oneBooking query added, our results now look like:

Screen Shot 2022-09-25 at 5.16.20 PM.png

In case you're curious, the total amount of 1,204,500 is in Rubles and corresponds to a little over USD 20,000.

If you're running the demo application and have DEBUG_KNEX=true in your .env file (as shown in the template) you'll see a large number of queries, summarized below:

select "book_ref" as "id", "book_ref" as "reference",… (2x)
select * from "flights" where "flight_id"… (2x)
select "ticket_no" as "id", * from "tickets" where "book_ref" = ?… (2x)
select * from "airports_data" where "airport_code" = ? limit ?… (16x)
select * from "aircrafts" where "aircraft_code" = ? limit ?… (8x)
select "ticket_flights".*, "boarding_passes".*… (8x)
select * from "tickets" where "ticket_no" = ?… (24x)

For a total of 62 SQL queries. The bulk of these lookup a single record based on the primary key.

Although our initial fetchItinerary query uses a SELECT … WHERE IN a JOIN to find the 4 ticket_flight records involved in this booking, our GraphQL server needs to resolve all the objects we're looking for. This is driven primarily by the TicketedPassenger type resolver. This resolver does a 4-way join.

ticketedPassengers: ({ flightId }: tFlight, { bookRef }: tBooking) => {
    const whereClause = { 'ticketFlights.flightId': flightId };
    if (bookRef) whereClause['bookRef'] = bookRef;
    return database('ticketFlights')
      .join('tickets', 'tickets.ticketNo', 'ticket_flights.ticketNo')
      .join('flights', 'ticketFlights.flightId', 'flights.flightId')
      .leftJoin('boardingPasses', function () {
        this.on('boardingPasses.ticketNo', 'ticketFlights.ticketNo').on(
          'boardingPasses.flightId',
          'ticketFlights.flightId'
        );
      })
      .where(whereClause)
      .orderBy(['flights.scheduledDeparture', 'tickets.passengerName'])
      .select(['ticketFlights.*', 'boardingPasses.*']);
  },

You can see that the ticketedPassenger resolver is looking for a bookRef key on the second parameter. If that is specified it is added to the whereClause. If you ran the query without specifying that parameter you'd get a list of all the ticketed passengers.

The type resolvers for the ticket and boardingPass fields on the ticketedPassenger type get invoked to resolve individual fields that are not part of the underlying table. Then the passenger field is resolved under the ticket type. The result is a cascade of simple queries to get individual records from the various tables in our database.

Benefits

With respect to creating an itinerary, we can run a single GraphQL query and get all the data we need in one network roundtrip (between the client and our GraphQL server). We get exactly the data we asked for, no more, no less. The data is JSON which means we can use it in any language on any operating system over any wired or wireless network.

Imagine now that we want to show a flight manifest or get a list of open seats. we can easily extend the GraphQL schema with new types that refer to my existing ones and with a very small amount of additional javascript get the data I want in the shape that I need it in. This carries a very low risk of collateral damage to existing types and queries.

However just like in SQL, dropping types and fields carries risks due to already deployed applications that may be referencing them.

Tradeoffs

A 2 way join followed by a number of 4-way joins were required. A total of 62 queries were run, many of them asking for exactly the same data. However we avoided doing the 7-way join that would have been required to get all the data in a single SQL query. If we asked for fewer fields we could also reduce the number of SQL queries generated. For example if we wanted to omit the aircraft type that would eliminate all the repeated queries to the aircrafts_data table.

Recap

In part 1 I described how to map a SQL schema into a GraphQL one. In part 2 I showed how to implement a GraphQL query. In part 3 above I've illustrated a use case where one might need to use a SQL JOIN even though joins aren't a feature of GraphQL. The repo that supports these posts has 3 branches off of main, one per part. You should be able to clone the repo and run any branch. The main branch is the rollup of the 3 other branches.

Up Next

In part 4 we'll write a mutation to insert or update some data.