Reference Architecture
Reference for Enterprise Deployment of GraphOS
While you can run the Apollo Router regardless of your Apollo plan, connecting the router to GraphOS requires an Enterprise plan. If your organization doesn't currently have an Enterprise plan, you can test out this functionality by signing up for a free Enterprise trial.
In this guide, learn about the fundamental concepts and configuration underlying Apollo's reference architecture for enterprise deployment of GraphOS with a self-hosted router. Use this guide as a companion reference to the Apollo reference architecture repositories.
About Apollo's reference architecture
In a modern cloud-native stack, your components must be scalable with high availability. The Apollo Router is built with this in mind. The router is much faster and less resource-intensive than the Apollo Gateway, Apollo's original runtime.
Apollo provides a reference architecture for self-hosting the router and subgraphs in an enterprise cloud environment using Kubernetes and Helm. This reference architecture includes autoscaling to suit the needs of a modern enterprise application. Continue reading to learn more about the reference architecture and how to use it.
💡 TIP
Check out our blog post, Deploying the Apollo Router at Apollo, to learn about Apollo's internal use of the router, including the performance improvements and resource utilization reductions.
Reference architecture repositories
The reference architecture consists of the following GitHub repositories:
| Repository | Description |
|---|---|
| build-a-supergraph | The main repository that contains a step-by-step guide to utilizing the architecture and deploying it to AWS or GCP |
| build-a-supergraph-infra | The template repository for Kubernetes deployment of the Apollo Router, OTel collector, Grafana, performance tests, and Zipkin |
| build-a-supergraph-subgraph-a | Template repository for a subgraph used in the reference architecture |
| build-a-supergraph-subgraph-b | Template repository for a subgraph used in the reference architecture |
Getting started
To get started with the reference architecture, follow the README in the main build-a-supergraph repository. The README provides a how-to guide that walks you through building a supergraph with the reference architecture. It gives step-by-step instructions for setup, CI/CD, load testing, and more.
While you work through the steps of deploying the reference architecture, you can use this guide as a complementary reference of the architecture's organization and configuration.
Architecture overview
The reference architecture uses two Kubernetes clusters, one for a development environment and the other for a production environment. Each cluster has pods for:
- Hosting the router
- Hosting subgraphs
- Collecting traces
- Load testing with K6 and viewing results in Grafana
For both environments, GraphOS serves as a schema registry. Each subgraph publishes schema updates to the registry via CI/CD, and GraphOS validates and composes them into a supergraph schema. The router regularly polls an endpoint called Apollo Uplink to get the latest supergraph schema and routing configurations from GraphOS.
The router also pushes performance and utilization metrics to GraphOS so you can analyze them in GraphOS Studio.
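To connect to GraphOS, a self-hosted router is typically given a graph API key and graph ref through the APOLLO_KEY and APOLLO_GRAPH_REF environment variables. The following is a minimal sketch of that wiring in a Kubernetes container spec; the image tag, secret name, and graph ref shown here are placeholders rather than values taken from the reference architecture.

```yaml
# Sketch: environment wiring for a self-hosted router connecting to GraphOS.
# The graph ref and secret name are placeholders.
containers:
  - name: router
    image: ghcr.io/apollographql/router:v1.33.2
    env:
      - name: APOLLO_GRAPH_REF
        value: my-supergraph@main          # hypothetical graph ref (graph@variant)
      - name: APOLLO_KEY                   # graph API key used to authenticate Uplink polling
        valueFrom:
          secretKeyRef:
            name: apollo-router-secrets    # hypothetical Kubernetes Secret
            key: apollo-key
```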
Development environment
The development environment consists of the router and subgraphs hosted in a Kubernetes cluster in either AWS or GCP. GraphOS validates subgraph schemas using schema checks and makes them available to the router via Uplink. The router also reports usage metrics back to GraphOS.
Production environment
The production environment is similar to the development environment with some additions.
- The router and subgraphs send their OpenTelemetry data to a collector. You can then view the data in Zipkin.
- A K6 load tester sends traffic to the router and stores load test results in InfluxDB for viewing in Grafana.
Components
This section summarizes the various services and runtimes that comprise the reference architecture. These include the router, subgraphs, OpenTelemetry, Zipkin, K6, and Grafana.
Router
The router is deployed via GitHub Actions to Kubernetes using the Helm charts provided for router deployments.
```yaml
dependencies:
  - name: router
    version: 1.33.2
    repository: oci://ghcr.io/apollographql/helm-charts
```
The values.yaml file provides the router's runtime configuration under the router.router.configuration key:
```yaml
router:
  router:
    configuration:
      health_check:
        listen: 0.0.0.0:8080
      sandbox:
        enabled: true
      homepage:
        enabled: false
```
This approach lets you run the router in Kubernetes with minimal effort. The schema and configurations the router receives from Uplink have already been composed and validated.
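As a rough illustration of that deployment step, a GitHub Actions job could render and apply the umbrella chart with Helm, passing the GraphOS credentials as chart values. This is only a sketch: the job name, chart path, namespace, and secret names are assumptions, and it presumes the router chart's managedFederation values for the API key and graph ref; cluster authentication steps are omitted.

```yaml
# Sketch of a deploy job for the router umbrella chart. Names, paths,
# and namespaces are assumptions, not the reference architecture's
# exact workflow; cluster credential setup is omitted.
jobs:
  deploy-router:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy router chart
        run: |
          helm upgrade --install router ./router-chart \
            --namespace router --create-namespace \
            --values ./router-chart/values.yaml \
            --set-string router.managedFederation.apiKey="${{ secrets.APOLLO_KEY }}" \
            --set-string router.managedFederation.graphRef="${{ secrets.APOLLO_GRAPH_REF }}"
```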
Subgraphs
Each subgraph is deployed via GitHub Actions to Kubernetes using a Helm chart, and each subgraph also uses GitHub Actions to publish its schema updates to the schema registry. Each subgraph is deployed to its own Kubernetes namespace. The subgraphs use a HorizontalPodAutoscaler to automatically scale service instances based on CPU or memory utilization.
```yaml
autoscaling:
  enabled: false
  targetCPUUtilizationPercentage: 80
  minReplicas: 1
  maxReplicas: 100
```
```yaml
minReplicas: {{ .Values.autoscaling.minReplicas }}
maxReplicas: {{ .Values.autoscaling.maxReplicas }}
metrics:
  {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
  {{- end }}
  {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
  {{- end }}
```
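Autoscaling is disabled in the default values above. To turn it on for a subgraph, a team might override the values along these lines (the thresholds and replica counts here are illustrative); when both a CPU and a memory percentage are set, the template renders an HPA with both metrics.

```yaml
# Illustrative override enabling the HPA for one subgraph.
# Thresholds and replica counts are example values only.
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 75
```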
The router queries subgraphs directly.
OpenTelemetry and Zipkin
The OpenTelemetry Collector is deployed via GitHub Actions to Kubernetes using the otel/opentelemetry-collector-contrib Docker image.
```yaml
image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: '0.59.0'
```
The collector is configured to export trace data to Zipkin's spans endpoint:
```yaml
exporters:
  zipkin:
    endpoint: 'http://zipkin.zipkin.svc.cluster.local:9411/api/v2/spans'
```
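The exporter entry only tells the collector where to send spans; the collector also needs a receiver and a traces pipeline connecting the two. A minimal sketch of that wiring, assuming the default OTLP receiver ports (the repository's full collector configuration may include additional processors):

```yaml
# Minimal collector wiring (sketch): receive OTLP traces from the router
# and subgraphs, export them to Zipkin. Ports are the OTLP defaults.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [zipkin]
```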
Zipkin is deployed via GitHub Actions to Kubernetes using the Zipkin Helm chart from https://openzipkin.github.io/zipkin:
```yaml
dependencies:
  - name: zipkin
    version: 0.3.0
    repository: https://openzipkin.github.io/zipkin
```
K6 and Grafana
The K6 Operator is deployed via GitHub Actions to Kubernetes. The operator exports test results to InfluxDB for viewing in Grafana.
```yaml
spec:
  parallelism: ${{ inputs.parallelism }}
  arguments: '--out influxdb=http://influxdb.monitoring:8086/db'
  script:
    configMap:
      name: tests
      file: ${{ inputs.test }}.js
```
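The spec fragment above belongs to a k6-operator custom resource. A minimal sketch of the full manifest it might sit in, assuming the operator's k6.io/v1alpha1 K6 kind, a monitoring namespace, and placeholder input values:

```yaml
# Sketch of the full k6-operator resource the spec above is taken from.
# The apiVersion/kind, namespace, and concrete values are assumptions.
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: load-test
  namespace: monitoring
spec:
  parallelism: 4
  arguments: '--out influxdb=http://influxdb.monitoring:8086/db'
  script:
    configMap:
      name: tests
      file: basic.js   # placeholder for ${{ inputs.test }}.js
```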
Grafana is configured with InfluxDB as a data source:

```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: InfluxDB
        type: influxdb
        access: proxy
        url: http://influxdb.monitoring:8086
```
CI/CD
The reference architecture uses GitHub Actions for its CI/CD. These actions include:
- PR-level schema checks
- Building containers using Docker
- Publishing subgraph schemas to Apollo Uplink
- Router deployment to the Kubernetes cluster
- OTel collector deployment
- Grafana deployment
- Running load tests
Development actions
When a PR is submitted to one of the subgraphs, GitHub Actions uses GraphOS schema checks to validate the proposed schema changes.
When the PR is merged, GitHub Actions publishes schema updates to Uplink, and GraphOS validates them using schema checks before making them available to the router. Additionally, the subgraph service is deployed.
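As a hedged sketch, the check and publish steps in those workflows could be expressed with Rover roughly as follows; the graph ref, subgraph name, schema path, routing URL, and secret names are placeholders, and the step that installs Rover is omitted.

```yaml
# Sketch of PR-level schema checks and on-merge publishing with Rover.
# Graph ref, subgraph name, schema path, routing URL, and secrets are
# placeholders; the Rover installation step is omitted.
name: subgraph-a-schema
on:
  pull_request:
  push:
    branches: [main]
jobs:
  check:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Schema check against the dev variant
        env:
          APOLLO_KEY: ${{ secrets.APOLLO_KEY }}
        run: |
          rover subgraph check my-supergraph@dev \
            --name subgraph-a --schema ./schema.graphql
  publish:
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Publish schema to the dev variant
        env:
          APOLLO_KEY: ${{ secrets.APOLLO_KEY }}
        run: |
          rover subgraph publish my-supergraph@dev \
            --name subgraph-a --schema ./schema.graphql \
            --routing-url http://subgraph-a.subgraph-a.svc.cluster.local:4001
```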
Production deploy
When you manually trigger a production deployment, GitHub Actions publishes schema updates to Uplink and GraphOS validates them using schema checks before making them available to the router. Additionally, the subgraph service is deployed.
Deploy router
When you manually trigger a router deployment, GitHub Actions deploys the router to the Kubernetes cluster.
Deploy OpenTelemetry collector
When you manually trigger an OpenTelemetry deployment, GitHub Actions deploys the OpenTelemetry Collector and Zipkin to the Kubernetes cluster.
Deploy load test infrastructure
When you manually trigger a load test infrastructure deployment, GitHub Actions deploys the K6 Load Tester, Grafana, the load tests, and InfluxDB to the Kubernetes cluster.
Run load tests
When you manually trigger a load test run, GitHub Actions triggers the K6 Load Tester to pull the load tests from the environment, run them against the router, and store the results in InfluxDB.
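The ${{ inputs.parallelism }} and ${{ inputs.test }} placeholders in the K6 manifest shown earlier suggest a manually triggered workflow that accepts those values as inputs. A sketch of that trigger, with the descriptions and defaults as assumptions:

```yaml
# Sketch of the workflow_dispatch trigger feeding the K6 resource.
# Input descriptions and defaults are assumptions.
on:
  workflow_dispatch:
    inputs:
      test:
        description: 'Test script name (without .js) in the tests ConfigMap'
        required: true
        default: 'basic'
      parallelism:
        description: 'Number of parallel K6 runner pods'
        required: true
        default: '4'
```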