Reference Architecture
Reference for Enterprise Deployment of GraphOS
While you can run the Apollo Router regardless of your Apollo plan, connecting the router to GraphOS requires an Enterprise plan. If your organization doesn't currently have an Enterprise plan, you can test out this functionality by signing up for a free Enterprise trial.
In this guide, learn about the fundamental concepts and configuration underlying Apollo's reference architecture for enterprise deployment of GraphOS with a self-hosted router. Use this guide as a companion reference to the Apollo reference architecture repositories.
About Apollo's reference architecture
In a modern cloud-native stack, your components must be scalable with high availability. The Apollo Router is built with this in mind. The router is much faster and less resource-intensive than the Apollo Gateway, Apollo's original runtime.
Apollo provides a reference architecture for self-hosting the router and subgraphs in an enterprise cloud environment using Kubernetes and Helm. This reference architecture includes autoscaling to suit the needs of a modern enterprise application. Continue reading to learn more about the reference architecture and how to use it.
💡 TIP
Check out our blog post, Deploying the Apollo Router at Apollo, to learn about Apollo's internal use of the router, including the performance improvements and resource utilization reductions.
Reference architecture repositories
The reference architecture consists of the following GitHub repositories:
| Repository | Description |
|---|---|
| build-a-supergraph | The main repository that contains a step-by-step guide to utilizing the architecture and deploying it to AWS or GCP |
| build-a-supergraph-infra | The template repository for Kubernetes deployment of the Apollo Router, OTel collector, Grafana, performance tests, and Zipkin |
| build-a-supergraph-subgraph-a | Template repository for a subgraph used in the reference architecture |
| build-a-supergraph-subgraph-b | Template repository for a subgraph used in the reference architecture |
Getting started
To get started with the reference architecture, follow the README in the main build-a-supergraph repository. The README provides a how-to guide that walks you through building a supergraph with the reference architecture. It gives step-by-step instructions for setup, CI/CD, load testing, and more.
While you work through the steps of deploying the reference architecture, you can use this guide as a complementary reference of the architecture's organization and configuration.
Architecture overview
The reference architecture uses two Kubernetes clusters, one for a development environment and the other for a production environment. Each cluster has pods for:
- Hosting the router
- Hosting subgraphs
- Collecting traces
- Load testing with K6 and viewing results in Grafana
For both environments, GraphOS serves as a schema registry. Each subgraph publishes schema updates to the registry via CI/CD, and GraphOS validates and composes them into a supergraph schema. The router regularly polls an endpoint called Apollo Uplink to get the latest supergraph schema and routing configurations from GraphOS.
The router also pushes performance and utilization metrics to GraphOS so you can analyze them in GraphOS Studio.
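To connect to GraphOS, a self-hosted router is typically given a graph API key and graph ref through the APOLLO_KEY and APOLLO_GRAPH_REF environment variables. The following is a minimal sketch of that wiring in a Kubernetes container spec; the image tag, secret name, and graph ref shown here are placeholders rather than values taken from the reference architecture.

```yaml
# Sketch: environment wiring for a self-hosted router connecting to GraphOS.
# The graph ref and secret name are placeholders.
containers:
  - name: router
    image: ghcr.io/apollographql/router:v1.33.2
    env:
      - name: APOLLO_GRAPH_REF
        value: my-supergraph@main          # hypothetical graph ref (graph@variant)
      - name: APOLLO_KEY                   # graph API key used to authenticate Uplink polling
        valueFrom:
          secretKeyRef:
            name: apollo-router-secrets    # hypothetical Kubernetes Secret
            key: apollo-key
```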
Development environment
The development environment consists of the router and subgraphs hosted in a Kubernetes cluster in either AWS or GCP. GraphOS validates subgraph schemas using schema checks and makes them available to the router via Uplink. The router also reports usage metrics back to GraphOS.
Production environment
The production environment is similar to the development environment with some additions.
- The router and subgraphs send their OpenTelemetry data to a collector. You can then view the data in Zipkin.
- A K6 load tester sends traffic to the router and stores load test results in InfluxDB for viewing in Grafana.
Components
This section summarizes the various services and runtimes that comprise the reference architecture. These include the router, subgraphs, OpenTelemetry, Zipkin, K6, and Grafana.
Router
The router is deployed via GitHub Actions to Kubernetes using the Helm charts provided for router deployments.
```yaml
dependencies:
  - name: router
    version: 1.33.2
    repository: oci://ghcr.io/apollographql/helm-charts
```
The values.yaml file provides the router's runtime configuration under the router.router.configuration key:
```yaml
router:
  router:
    configuration:
      health_check:
        listen: 0.0.0.0:8080
      sandbox:
        enabled: true
      homepage:
        enabled: false
```
This approach lets you run the router in Kubernetes with minimal effort. The schema and configurations the router receives from Uplink have already been composed and validated.
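As a rough illustration of that deployment step, a GitHub Actions job could render and apply the umbrella chart with Helm, passing the GraphOS credentials as chart values. This is only a sketch: the job name, chart path, namespace, and secret names are assumptions, and it presumes the router chart's managedFederation values for the API key and graph ref; cluster authentication steps are omitted.

```yaml
# Sketch of a deploy job for the router umbrella chart. Names, paths,
# and namespaces are assumptions, not the reference architecture's
# exact workflow; cluster credential setup is omitted.
jobs:
  deploy-router:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy router chart
        run: |
          helm upgrade --install router ./router-chart \
            --namespace router --create-namespace \
            --values ./router-chart/values.yaml \
            --set-string router.managedFederation.apiKey="${{ secrets.APOLLO_KEY }}" \
            --set-string router.managedFederation.graphRef="${{ secrets.APOLLO_GRAPH_REF }}"
```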
Subgraphs
Each subgraph is deployed via GitHub Actions to Kubernetes using a Helm chart, and each subgraph also uses GitHub Actions to publish its schema updates to the schema registry. Each subgraph is deployed to its own Kubernetes namespace. The subgraphs use a HorizontalPodAutoscaler to automatically scale service instances based on CPU or memory utilization.
```yaml
autoscaling:
  enabled: false
  targetCPUUtilizationPercentage: 80
  minReplicas: 1
  maxReplicas: 100
```
```yaml
minReplicas: {{ .Values.autoscaling.minReplicas }}
maxReplicas: {{ .Values.autoscaling.maxReplicas }}
metrics:
  {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
  {{- end }}
  {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
  {{- end }}
```
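Autoscaling is disabled in the default values above. To turn it on for a subgraph, a team might override the values along these lines (the thresholds and replica counts here are illustrative); when both a CPU and a memory percentage are set, the template renders an HPA with both metrics.

```yaml
# Illustrative override enabling the HPA for one subgraph.
# Thresholds and replica counts are example values only.
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 75
```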
The router queries subgraphs directly.
OpenTelemetry and Zipkin
The OpenTelemetry Collector is deployed via GitHub Actions to Kubernetes using the otel/opentelemetry-collector-contrib Docker image.
```yaml
image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: '0.59.0'
```
The collector is configured to export trace data to Zipkin's spans endpoint:
```yaml
exporters:
  zipkin:
    endpoint: 'http://zipkin.zipkin.svc.cluster.local:9411/api/v2/spans'
```
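The exporter entry only tells the collector where to send spans; the collector also needs a receiver and a traces pipeline connecting the two. A minimal sketch of that wiring, assuming the default OTLP receiver ports (the repository's full collector configuration may include additional processors):

```yaml
# Minimal collector wiring (sketch): receive OTLP traces from the router
# and subgraphs, export them to Zipkin. Ports are the OTLP defaults.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [zipkin]
```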
Zipkin is deployed via GitHub Actions to Kubernetes using the Zipkin Helm chart from https://openzipkin.github.io/zipkin:
```yaml
dependencies:
  - name: zipkin
    version: 0.3.0
    repository: https://openzipkin.github.io/zipkin
```
K6 and Grafana
The K6 Operator is deployed via GitHub Actions to Kubernetes. The operator exports test results to InfluxDB for viewing in Grafana.
```yaml
spec:
  parallelism: ${{ inputs.parallelism }}
  arguments: '--out influxdb=http://influxdb.monitoring:8086/db'
  script:
    configMap:
      name: tests
      file: ${{ inputs.test }}.js
```
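The spec fragment above belongs to a k6-operator custom resource. A minimal sketch of the full manifest it might sit in, assuming the operator's k6.io/v1alpha1 K6 kind, a monitoring namespace, and placeholder input values:

```yaml
# Sketch of the full k6-operator resource the spec above is taken from.
# The apiVersion/kind, namespace, and concrete values are assumptions.
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: load-test
  namespace: monitoring
spec:
  parallelism: 4
  arguments: '--out influxdb=http://influxdb.monitoring:8086/db'
  script:
    configMap:
      name: tests
      file: basic.js   # placeholder for ${{ inputs.test }}.js
```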
Grafana is configured with InfluxDB as a data source:

```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: InfluxDB
        type: influxdb
        access: proxy
        url: http://influxdb.monitoring:8086
```
CI/CD
The reference architecture uses GitHub Actions for its CI/CD. These actions include:
- PR-level schema checks
- Building containers using Docker
- Publishing subgraph schemas to Apollo Uplink
- Router deployment to the Kubernetes cluster
- OTel collector deployment
- Grafana deployment
- Running load tests
Development actions
When a PR is submitted to one of the subgraphs, GitHub Actions uses GraphOS schema checks to validate the proposed schema changes.
When the PR is merged, GitHub Actions publishes schema updates to Uplink, and GraphOS validates them using schema checks before making them available to the router. Additionally, the subgraph service is deployed.
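As a hedged sketch, the check and publish steps in those workflows could be expressed with Rover roughly as follows; the graph ref, subgraph name, schema path, routing URL, and secret names are placeholders, and the step that installs Rover is omitted.

```yaml
# Sketch of PR-level schema checks and on-merge publishing with Rover.
# Graph ref, subgraph name, schema path, routing URL, and secrets are
# placeholders; the Rover installation step is omitted.
name: subgraph-a-schema
on:
  pull_request:
  push:
    branches: [main]
jobs:
  check:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Schema check against the dev variant
        env:
          APOLLO_KEY: ${{ secrets.APOLLO_KEY }}
        run: |
          rover subgraph check my-supergraph@dev \
            --name subgraph-a --schema ./schema.graphql
  publish:
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Publish schema to the dev variant
        env:
          APOLLO_KEY: ${{ secrets.APOLLO_KEY }}
        run: |
          rover subgraph publish my-supergraph@dev \
            --name subgraph-a --schema ./schema.graphql \
            --routing-url http://subgraph-a.subgraph-a.svc.cluster.local:4001
```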
Production deploy
When you manually trigger a production deployment, GitHub Actions publishes schema updates to Uplink and GraphOS validates them using schema checks before making them available to the router. Additionally, the subgraph service is deployed.
Deploy router
When you manually trigger a router deployment, GitHub Actions deploys the router to the Kubernetes cluster.
Deploy OpenTelemetry collector
When you manually trigger an OpenTelemetry deployment, GitHub Actions deploys the OpenTelemetry Collector and Zipkin to the Kubernetes cluster.
Deploy load test infrastructure
When you manually trigger a load test infrastructure deployment, GitHub Actions deploys the K6 Load Tester, Grafana, the load tests, and InfluxDB to the Kubernetes cluster.
Run load tests
When you manually trigger a load test run, GitHub Actions triggers the K6 Load Tester to pull the load tests from the environment, run them against the router, and store the results in InfluxDB.
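The ${{ inputs.parallelism }} and ${{ inputs.test }} placeholders in the K6 manifest shown earlier suggest a manually triggered workflow that accepts those values as inputs. A sketch of that trigger, with the descriptions and defaults as assumptions:

```yaml
# Sketch of the workflow_dispatch trigger feeding the K6 resource.
# Input descriptions and defaults are assumptions.
on:
  workflow_dispatch:
    inputs:
      test:
        description: 'Test script name (without .js) in the tests ConfigMap'
        required: true
        default: 'basic'
      parallelism:
        description: 'Number of parallel K6 runner pods'
        required: true
        default: '4'
```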