Production readiness checklist
We recommend that you read through this checklist and identify the features that are critical for your team before your supergraph begins handling production traffic.
GraphOS Studio
- Ensure that you've created multiple variants to represent the different environments where your supergraph runs (production, staging, and so on).
- Protect your production variant to avoid accidental changes while working in Studio.
Apollo Router
- Ensure that you've correctly configured managed federation and GraphOS schema usage reporting.
- For security, disable introspection for all production routers (by default, Router disables introspection, but make sure you are not using --dev mode).
- You can continue to view and fetch your GraphQL schemas from GraphOS and run operations from GraphOS Studio Explorer.
- Configure the Router traffic shaping features:
- Set request and subgraph level timeouts and rate limits
- Deduplicate subgraph requests
- Communicate with subgraphs using APQ
- Enable operation limits to block large and malicious requests
- Configure additional tracing, metrics, and logging through OpenTelemetry or Prometheus
- Enable the operation and query plan distributed cache
- Optionally, enable any other features deemed critical for your deployment of Apollo Router
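To make the "deduplicate subgraph requests" item above more concrete, the following is a minimal sketch of the general idea of in-flight request deduplication: while one request for a given key is still pending, identical requests share its result instead of triggering a second call. It is not the router's implementation. The class name, the `run` method, and the choice of key are all hypothetical; in practice a key would need to capture everything that makes two requests equivalent (for example subgraph name, query text, variables, and relevant headers).

```typescript
// Illustrative sketch only; all names here are hypothetical.
class InFlightDeduplicator<T> {
  // Requests that have started but not yet settled, indexed by key.
  private inFlight = new Map<string, Promise<T>>();

  // Return the pending promise for `key` if one exists; otherwise start
  // the request and remember it until it settles.
  run(key: string, start: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) return existing;

    // Forget the entry once the request resolves or rejects, so that
    // later calls with the same key start a fresh request.
    const pending = start().finally(() => {
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, pending);
    return pending;
  }
}
```

Because entries are removed as soon as the request settles, this only coalesces concurrent requests; it is not a response cache.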
Subgraphs/Servers
- For security, disable introspection for all production GraphQL subgraphs.
- You can continue to view and fetch your GraphQL schemas from GraphOS and run operations from GraphOS Studio Explorer.
- Ensure that you've integrated rover subgraph check and rover subgraph publish into your CI/CD pipeline.
- If your subgraph servers are listed as compatible with FEDERATED TRACING, ensure that you've enabled federated traces, and that you can view operation metrics as expected in Apollo Studio.
- Enable fractional trace sampling via fieldLevelInstrumentation to reduce performance hits due to tracing.
- Ensure that you've load-tested your graph.
- Test loads should be representative of your current traffic (both in terms of volume and in terms of the actual operations you execute in the test).
- To investigate performance issues, use Apollo Studio to identify which operations are performing slowly.
- Look at resolver execution times to identify slow areas of execution.
- Whenever possible, avoid making multiple calls to data sources within a single resolver.
- Understand query plan execution to help understand slow operations and optimize your supergraph to avoid them.
- Consider adding caching layers.
- Apollo Server supports automatic persisted queries (APQ) out of the box.
- If using Apollo Server, ensure that you use a distributed caching system for APQ in production to avoid cache inconsistency across server instances.
- Optionally use the @cacheControl directive to enable your CDN to cache APQ GET requests using the Cache-Control header.
- Optionally add full response caching to improve performance.
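As background for the APQ items above: with automatic persisted queries, a client identifies a query by the SHA-256 hash of its text and can send only that hash, which keeps GET URLs short enough to be cached by a CDN. The sketch below shows how the hash and the request parameters for a hash-only GET request might be built. The function names are hypothetical, and the sketch omits the fallback step in which the client resends the full query text when the server does not yet recognize the hash.

```typescript
import { createHash } from "node:crypto";

// Compute the SHA-256 hex digest that identifies a query string in APQ.
function apqHash(query: string): string {
  return createHash("sha256").update(query, "utf8").digest("hex");
}

// Build the query-string parameters for a hash-only APQ GET request.
// The query text itself is intentionally left out.
function apqGetParams(
  query: string,
  variables: Record<string, unknown> = {},
): URLSearchParams {
  const extensions = {
    persistedQuery: { version: 1, sha256Hash: apqHash(query) },
  };
  return new URLSearchParams({
    variables: JSON.stringify(variables),
    extensions: JSON.stringify(extensions),
  });
}
```

Appending `apqGetParams(query).toString()` to the endpoint URL yields a URL that is stable for a given query and set of variables, which is what allows an intermediary cache to key on it.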
Clients
- Ensure that your clients identify themselves by name and version.
- If you're using an Apollo Client library, you can add a client name and version to the constructor.
- For example, the React client uses the name and version attributes in the constructor options.
- If you're using a third-party GraphQL client, set the apollographql-client-name and apollographql-client-version HTTP headers for each request to identify your client.
- For an example of enforcing client identification in your gateway, see this technote for Client ID enforcement.
- Consider adding caching layers.
- Enable Persisted Queries and/or Automatic Persisted Queries (APQ) support for request-size savings.
- Enable and configure the client-side normalized cache.
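For the client identification item above, a third-party client only needs to attach the two headers to every request. The following is a minimal sketch of one way to do that. The header names come from this checklist, while the client name, version, and helper function names are hypothetical placeholders.

```typescript
// Hypothetical identity values; substitute your own client's name and version.
const CLIENT_NAME = "web-storefront";
const CLIENT_VERSION = "1.4.2";

// Add the two identification headers to any caller-supplied headers.
function withClientIdentity(
  headers: Record<string, string> = {},
): Record<string, string> {
  return {
    ...headers,
    "apollographql-client-name": CLIENT_NAME,
    "apollographql-client-version": CLIENT_VERSION,
  };
}

// Build the options for a GraphQL POST request (nothing is sent here).
function buildRequestInit(
  query: string,
  variables: Record<string, unknown> = {},
) {
  return {
    method: "POST",
    headers: withClientIdentity({ "content-type": "application/json" }),
    body: JSON.stringify({ query, variables }),
  };
}
```

The returned object can be passed as the second argument to `fetch` along with your graph's endpoint URL. Centralizing the headers in one helper makes it harder for an individual call site to forget them.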