What is GraphQL?

May 29, 2019
Aaron Crootof
Technical Director

Drupal, GraphQL, DevOps

Background on GraphQL

For our less technical readers, there's a glossary of definitions located at the bottom of the article.

GraphQL is an application query language, meaning it makes queries of databases. It was created by Facebook in 2012 and open-sourced in 2015. It allows for API consumers to ask for the exact data they need and get back that exact data using a single endpoint. This allows the consumer to update their request without having to get a backend developer involved (assuming the data already exists).

GraphQL relies on a schema (Schema Definition Language) that defines data types available. This allows the consumer to know exactly the type of data they will get back.

GraphQL supports fetching data (Queries), creating/updating/deleting data (Mutations), and realtime updates (Subscriptions). Without GraphQL, developers would need to create a custom API for every screen of an app. Instead, GraphQL automatically matches the data needed by an app with the data that service developers supply.

It is also language and platform agnostic, and is currently implemented in several major languages, some of which are: JS, C#/.NET, PHP, Python, Ruby, Go, Java, Swift.

What makes it different from a REST API?

The following examples were taken and modified from an Apollo blog post.

A typical REST endpoint will look something like the example below. The endpoint couples the type/shape of the resource and the way it's fetched.

GET /books/1

{

  "title": "Black Hole Blues",

  "author": 6,

  "categories": [

    12,

    8

  ],

  "price": 22.25,

  "type": "paperback"

  "condition": "new"

  "image": "https://example.com/images/book/black_hole_blues.jpg"

}

You would then need to make another call to an author endpoint to get the author name.

GET /authors/6

{

  "firstName": "Janna",

  "lastName": "Levin",

  "age": 45,

  "gender": "female"

}

And then another for categories...

GET /categories/12+8

{

  [

    "title": "astrophysics",

    ...

  ],

  [

    "title": "science",

    ...

  ],

}

With GraphQL you would first need to define the types in your schema

type Book {

  id: ID

  title: String

  type: String

  price: Float

  condition: String

  author: Author

  categories: [Category]

}

type Author {

  id: ID

  firstName: String

  lastName: String

  books: [Book]

  age: Integer

  gender: String

}

type Category {

  id: ID

  title: String

}

Then you can query for the book and author all at once. And only call for the fields you want

query Books($id: ID) {

  bookQuery(filter: {id: $id}){

    title

    author {

      firstName

      lastName

    }

    categories {

      title

    }

  }

}

That query would return

{

  "title": "Black Hole Blues",

  "author": {

    "firstName": "Janna",

    "lastName": "Levin",

  },

  "categories": [

    {"title": "astrophysics"},

    {"title": "science"}

  ],

}

Pros of GraphQL

No over-fetching or under-fetching

As shown in the example above, a traditional REST API requires a second call to get information about the author. Add an associated taxonomy and more calls might be need to a taxonomy endpoint. This can result in making many calls back to the server for that information, slowing down the site/app. Not only will GraphQL prevent this, it will also never return a field you do not ask for in a query. Both of these go a long way in speeding up requests and keeping the payloads as small as possible.

It uses a single endpoint

There is not a need for multiple endpoints since the consumer asks for the data needed. This can help speed up development and allow your development team to easily integrate new features.

GraphQL is strongly typed

GraphQL has a strongly typed schema structure which allows the API to be self documenting, and allows the consumer to know exactly the type of data to expect back.

Cons of GraphQL

Caching is hard with GraphQL

There is no standard for caching GraphQL. It is possible to rely on CDNs (the Drupal GraphQL module handles it through Drupal's caching system), but when implementing a custom GraphQL server a custom solution is needed.

It's hard to optimize for all use cases

A traditional RESTful endpoint will always know the query run to return the data. That query can be improved upon over time as long as it returns the same data. In GraphQL the consumer is in change of what gets returned, which creates a dynamic database query.

It has an exposed schema

Since GraphQL requires the consumer to be aware of the schema to query it, it allows possibly malicious actors to create expensive join queries that could potentially lead to a DoS attack.

Monitoring and error reporting are limited

Errors in the query are returned in the response body, but still give a 200 response from the server.

When should GraphQL be used?

  • When more flexibility and rapid prototyping/feature-building is needed
  • When dynamic fields are needed in responses
  • When content has reference fields
  • When there may be a need to migrate the backend (developers can simply bring the schema with them to new platform, and no consumer changes would be needed)

What companies are using GraphQL?

GraphQL is used by a bevy of leading tech and media companies. The below list is just a partial selection of the biggest sites currently using GraphQL:

  • Facebook
  • Github
  • Atlassian
  • NBC News
  • Drift
  • Intuit
  • Paypal
  • Shopify
  • NY Times
  • Twitter
  • Wayfair
  • Yelp

Additionally, any site using GatsbyJS also uses GraphQL.

GraphQL in Drupal

There is a Drupal GraphQL module that automatically exposes all content and configuration entities in the GraphQL schema (nodes, blocks, users, products, etc). Getting the module operational is easy: simply install it and enable it. That's it! Now you have a fully functional GraphQL server built right in.

There's also a built in, in-browser IDE for exploring and testing GraphQL available at /graphql/explorer once the module is enabled. This will allow you to craft queries in a test environment.

The GraphQL module also works with Drupal user permissions right out of the box. It uses the Drupal entity access system to check if the user has access to what they are requesting. So if a user logs into Drupal and navigates to a page using GraphQL, it is aware of that userr and their role(s) and uses that to manage their access.

A limitation to be aware of is that the module does not support any mutations out of the box due to the size and complexity of Drupal entities. According to one of the main companies behind the module, Amazee, supporting them ballooned out the size of the schema for mutations that were rarely used. Thankfully, mutations are a plugin, so they are easy to implement in custom code.

Finally, be aware that the Drupal module also does not support subscriptions. The module has a PHP backend that does not support websockets. To get around this and implement subscriptions with your GraphQL module, we recommend using a NodeJS backend. 

Example GraphQL queries

This example will query for all nodes on the site

query Nodes {

  nodeQuery {

    entities {

      entityId

      entityLabel

    }

  }

}

You can also add filters, such as a specific node id

query Nodes($id: [String]) {

  nodeQuery (

    filter: {

      conditions: [

        {

          operator: EQUAL,

          field: "nid",

          value: $id

        }

      ]

    }

  ) {

    entities {

      entityLabel

    }

  }

}

 

// Query Variables

{

  "id": "5"

}

The filters are very versatile, and allow for limiting, sorting, grouping, and choosing the conjunction.

Drupal GraphQL 8.x-4.x

Version 4 of the Drupal GraphQL module will be one big major change. Notably, it's doing away with automatic schema generation. This will help remove some of the "Drupalisms" in the query construction and response. It may also help obfuscate the data layer and make it easier to become backend agnostic.

How ADK Group is using GraphQL on Drupal

We recently launched an eCommerce-enabled Drupal website for Hancock National Resources Group. Designed by a partner agency, ADK Group engineers brought the front and back end experience to life to drive millions in revenue out of the gate. The interactive map on the homepage is built using React and queries for data from Drupal using GraphQL to populate the available products (parcels of land). It also uses mutations to allow users to create orders and place bids directly through the front end react app and add the data to Drupal. Finally, a layer of sophisticated custom event tracking built with Google Analytics allows the client to monitor and continuously improve the performance of the website.

Learn more about GraphQL

  • The #graphql channel on the Drupal Slack channel has many of the maintainers of the module helping people and answering questions.
  • Amazee Labs (one of the maintainers of the library) has several detailed blog posts on the module. Start here.

Definitions of common terms and phrases

Here are some quick definitions of commonly used terms that may not be known to our less-technical audience:

Query language

A programming language that sends queries to a database in order to request and retrieve data from it.

Endpoint

Endpoints are the points of communication between an API and the system it is interacting with. 

API consumer

API consumers are any developers that create apps or web sites that use APIs. 

Schema

Simply put, schemas are a way to structure data. Similar to grammatical rules for languages.

Strongly typed

Strongly typed programming languages limits usage of data types to those predefined by the language. This article presents an interesting point of view on strongly typed vs. weakly typed and static vs dynamic.

Self documenting

Generally any code that is self documenting has a clear name that follows naming and structured programming conventions. The goal is to increase how easy it is for a new developer to jump into code created by someone else.  

CDNs

Content delivery networks are proxy servers and data centers distributed in different locations to increase availability and performance (speed) for users.

Caching

Caches are relatively small data storage layers that stores frequently accessed data. 

Dynamic database query

These are queries built dynamically by Drupal instead of being provided by an explicit query string.

DoS attack

DoS stands for "denial-of-service." It generally involves barraging the target machine with a huge volume of requests in a short period of time in order to overload systems. Wikipedia has a solid analogy: "A DoS or DDoS attack is analogous to a group of people crowding the entry door of a shop, making it hard for legitimate customers to enter, disrupting trade."

In-browser IDE

An "IDE" is an integrated development environment. The basic components are a source code editor, tools to automate builds, and a debugger. These are commonly built on a software, but there are also some versions that work in browsers.

Mutations

Mutations are fields that allow you to modify server-side data with GraphQL.

 

Return to Top
Interested in Working Together?