Introduction
Cloud Composer is a fully managed workflow orchestration service that lets users create, schedule, monitor, and manage workflow pipelines across multiple cloud environments and on-premises data centres. It is built on the popular open-source project Apache Airflow, operates using the Python programming language, and offers the benefits of Airflow without the need for installation or infrastructure management.
One of the main benefits of using Cloud Composer instead of a local instance of Apache Airflow is that you get the best of Airflow with no installation or management overhead. Cloud Composer creates managed Airflow environments quickly and uses Airflow-native tools, such as the Airflow web interface and command-line tools, so you can focus on your workflows and not on the infrastructure needed to support them.
For each environment, the following components are created in the background when Cloud Composer is spun up:
- A GKE cluster that runs the Airflow schedulers, workers, and Redis queue as GKE workloads on a single cluster. These workloads are responsible for processing and executing DAGs.
- A web server that runs the Apache Airflow web interface.
- A database that holds the Apache Airflow metadata.
- A Cloud Storage bucket that stores the DAGs, logs, custom plugins, and data for the environment. For details about the storage bucket, see "Data Stored in Cloud Storage" in the Cloud Composer documentation.
Objective
In this blog post, we will discuss the detailed steps to implement a private Cloud Composer cluster within a shared Virtual Private Cloud (VPC). This setup ensures enhanced security and isolation while leveraging the features of Cloud Composer and shared VPC networking capabilities.
Configuration Steps
Configuring Shared VPC for Private Cloud Composer
Shared VPC is a feature that allows you to share a single VPC across multiple projects. This can be useful for organizations that want to centralize their network resources.
To set up a private Cloud Composer cluster in a shared VPC, some specific IP ranges need to be provisioned in the host project and shared with the service project.
These IP ranges include:
- A primary IP range for the GKE nodes
- A secondary IP range for Services
- A secondary IP range for Pods
Image source: Configuring Shared VPC for Cloud Composer
It is crucial that these ranges do not overlap with any existing secondary IP ranges in the host VPC. The secondary IP ranges should also be sized to accommodate both the current cluster size and future growth. In environments where a separate foundations or platform team manages the shared services in GCP, that team must pre-provision these subnets in the host project and share them with a service project before the application team in that service project can use them.
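The overlap requirement can be checked before anything is provisioned. The following is a minimal sketch using Python's standard `ipaddress` module; the range names and CIDR values are hypothetical and only illustrate the check, they are not recommended values.

```python
import ipaddress
from itertools import combinations


def find_overlaps(ranges):
    """Return (name_a, name_b) pairs whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical ranges: three planned for Composer plus one that already exists
# in the host VPC.
planned = {
    "composer-nodes": "10.10.0.0/24",
    "composer-pods": "10.20.0.0/21",
    "composer-services": "10.30.0.0/27",
    "existing-pods": "10.20.4.0/22",
}

for name_a, name_b in find_overlaps(planned):
    print(f"Overlap: {name_a} and {name_b}")
```

In this made-up plan the new Pods range collides with a range that is already in use, so it would have to be moved before the subnet is created.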
Key Considerations while choosing Primary and Secondary IP ranges:
Primary IP Range for GKE Nodes:
- The primary IP range of the subnet used by GKE nodes should be determined based on anticipated growth and reserved IP addresses.
- The network prefix of the subnet’s primary IP range should not be longer than /29 (that is, the range should be at least as large as a /29).
Secondary IP Ranges for GKE Services and Pods: For the secondary IP ranges dedicated to GKE Services and Pods, it is recommended to follow these guidelines:
- Pods: Use a network prefix of /21 or lower (that is, a range at least as large as a /21).
- Services: Use a network prefix of /27 or lower (that is, a range at least as large as a /27).
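As a rough illustration of these sizing guidelines, the sketch below flags any range whose prefix is longer than the limit listed above. It assumes that "/21 or lower" means a prefix length of 21 or less; the role names and CIDR values are hypothetical.

```python
import ipaddress

# Longest prefix length allowed for each range, taken from the guidelines above.
MAX_PREFIX_LENGTH = {"nodes": 29, "pods": 21, "services": 27}


def check_range_sizes(ranges):
    """Return a message for every range that is smaller than its guideline."""
    problems = []
    for role, cidr in ranges.items():
        network = ipaddress.ip_network(cidr)
        limit = MAX_PREFIX_LENGTH[role]
        if network.prefixlen > limit:
            problems.append(
                f"{role}: {cidr} is smaller than the /{limit} guideline "
                f"({network.num_addresses} addresses)"
            )
    return problems


# Hypothetical plan: the Pods range is deliberately too small.
plan = {"nodes": "10.10.0.0/24", "pods": "10.20.0.0/24", "services": "10.30.0.0/27"}
for problem in check_range_sizes(plan):
    print(problem)
```

For reference, a /21 holds 2,048 addresses, a /27 holds 32, and a /29 holds 8.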
A key recommendation is to use IP masquerading when applicable to conserve IP address space, especially for non-routable pods and services.
IP masquerading is a form of source network address translation (SNAT) used to perform many-to-one IP address translations. GKE can use IP masquerading to change the source IP addresses of packets sent from Pods. When IP masquerading applies to a packet emitted by a Pod, GKE changes the packet's source address from the Pod IP to the underlying node's IP address. Masquerading a packet's source is useful when a recipient is configured to receive packets only from the cluster's node IP addresses.
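The effect of masquerading can be modelled with a toy function. This is only a sketch of the idea, not how GKE implements it: the real translation happens in the node's network stack. The list of exempt destination ranges is an assumption added for illustration, as are all the addresses.

```python
import ipaddress


def source_ip_seen_by_destination(pod_ip, node_ip, destination_ip, exempt_cidrs):
    """Return the source address the destination sees for a packet from a Pod.

    Packets to destinations inside exempt_cidrs keep the Pod IP; all other
    packets are masqueraded and appear to come from the node IP.
    """
    destination = ipaddress.ip_address(destination_ip)
    if any(destination in ipaddress.ip_network(cidr) for cidr in exempt_cidrs):
        return pod_ip
    return node_ip


# Hypothetical addresses: traffic inside 10.0.0.0/8 is not masqueraded.
exempt = ["10.0.0.0/8"]
print(source_ip_seen_by_destination("10.20.0.15", "10.10.0.4", "10.30.0.10", exempt))
print(source_ip_seen_by_destination("10.20.0.15", "10.10.0.4", "203.0.113.7", exempt))
```

The first call keeps the Pod address because the destination is in an exempt range; the second returns the node address, which is what a recipient that only accepts node IPs would expect.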
Host Project Configuration:
Before Cloud Composer in a service project can use a subnet, a few configurations need to be made for the network in the host project.
- When setting up a Shared VPC and configuring networking resources in the host project for Cloud Composer, one of three scenarios applies:
  - You need to create a new VPC network, subnet, and secondary IP ranges:
    - Create a new VPC network and define a subnet with a primary IP range that follows the guidelines mentioned earlier.
    - Specify two secondary IP ranges during subnet creation: one for Pods and another for Services.
  - A VPC already exists and you need to create a subnet and secondary IP ranges in it:
    - Create a subnet within the existing VPC and set the primary IP range accordingly.
    - Specify two secondary IP ranges during subnet creation: one for Pods and another for Services.
  - A VPC and subnet already exist and you only need to create the secondary IP ranges:
    - Define two secondary IP ranges for Pods and Services within the existing subnet, ensuring there are no conflicts with existing secondary ranges.
- After configuring the Shared VPC in the host project, attach the required subnet to the service project that will host the Cloud Composer environments.
Note: In the following steps, keep each account's existing roles as they are and add the new role alongside them instead of replacing an existing role.
- Google APIs service account permissions in the host project (note that the permissions are granted in the host project to the service account of the service project):
  - Edit the permissions for the Google APIs service account, which has the format <service-project_number>@cloudservices.gserviceaccount.com.
  - Add the “compute.networkUser” role at the project level. This role is necessary for managing instance groups used with Shared VPC to perform tasks such as instance creation.
- GKE service account permissions for the service project (again, the permissions are granted in the host project to the service account of the service project):
  - In the host project, edit the permissions for the GKE service account, which has the format service-<service-project_number>@container-engine-robot.iam.gserviceaccount.com, and perform the following actions:
    - Grant the “compute.networkUser” role at the network level to enable the VPC peering architecture that Cloud Composer requires.
    - Add the “Kubernetes Engine Host Service Agent User” role (roles/container.hostServiceAgentUser) to allow the service project’s GKE service account to configure shared network resources using the GKE service account of the host project.
- GKE service account permissions for the host project:
  - In the host project, edit the permissions for the GKE service account, which has the format service-<host-project_number>@container-engine-robot.iam.gserviceaccount.com, and perform the following action:
    - Grant the “compute.networkUser” role at the network level to enable the VPC peering architecture that Cloud Composer requires.
- DNS configurations: To ensure that Cloud Composer works properly with VPC Service Controls, configure the DNS settings as follows:
  - Create a DNS mapping from *.googleapis.com to restricted.googleapis.com.
  - Set up a new DNS zone and configure the CNAME and A records for *.gcr.io, *.pkg.dev, and *.composer.cloud.google.com so that they resolve to the IP addresses 199.36.153.4, 199.36.153.5, 199.36.153.6, and 199.36.153.7.
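| Service | DNS | Resolves to |
|---|---|---|
| Container Registry | *.gcr.io | 199.36.153.4, 199.36.153.5, 199.36.153.6, 199.36.153.7 |
| Artifact Registry | *.pkg.dev | 199.36.153.4, 199.36.153.5, 199.36.153.6, 199.36.153.7 |
| Cloud Composer | *.composer.cloud.google.com | 199.36.153.4, 199.36.153.5, 199.36.153.6, 199.36.153.7 |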
- Composer Agent Service Account: If this is the first Cloud Composer environment, create a Composer Agent Service Account using the command:
gcloud beta services identity create --service=composer.googleapis.com
The service account that is created is named service-<project_number>@cloudcomposer-accounts.iam.gserviceaccount.com.
- After creating the Composer Agent service account, assign it the following permissions:
  - Edit the permissions for the Composer Agent service account service-<project_number>@cloudcomposer-accounts.iam.gserviceaccount.com.
  - Assign the appropriate role based on the type of Cloud Composer environment:
    - For a Private IP environment, add the “Composer Shared VPC Agent” role.
    - For a Public IP environment, add the “Compute Network User” role.
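The service accounts and roles from the steps above can be collected into a simple checklist. The sketch below only builds strings from the formats given in this post and does not call any Google Cloud API. The function name and project numbers are hypothetical, and it assumes the Composer Agent service account belongs to the service project, which the steps above do not state explicitly.

```python
def shared_vpc_role_plan(service_project_number, host_project_number, private_ip=True):
    """Return (service account, role) pairs to add, as listed in the steps above."""
    google_apis = f"{service_project_number}@cloudservices.gserviceaccount.com"
    service_gke = (
        f"service-{service_project_number}"
        "@container-engine-robot.iam.gserviceaccount.com"
    )
    host_gke = (
        f"service-{host_project_number}"
        "@container-engine-robot.iam.gserviceaccount.com"
    )
    # Assumption: the Composer Agent account is the one in the service project.
    composer_agent = (
        f"service-{service_project_number}"
        "@cloudcomposer-accounts.iam.gserviceaccount.com"
    )
    # Role names are written exactly as they appear in the steps above.
    composer_role = "Composer Shared VPC Agent" if private_ip else "Compute Network User"
    return [
        (google_apis, "compute.networkUser"),
        (service_gke, "compute.networkUser"),
        (service_gke, "roles/container.hostServiceAgentUser"),
        (host_gke, "compute.networkUser"),
        (composer_agent, composer_role),
    ]


# Hypothetical project numbers, for illustration only.
for account, role in shared_vpc_role_plan("111111111111", "222222222222"):
    print(f"{role:40} {account}")
```

Printing the plan before making changes gives a quick way to confirm that every account receives an additional role rather than having an existing one replaced.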
Conclusion
By following the detailed steps outlined in this blog post, you can successfully configure a private Cloud Composer cluster within a shared VPC. This setup offers improved security, isolation, and streamlined management of workflow pipelines. With Cloud Composer and shared VPC networking, you can focus on building and orchestrating your workflows without the complexities of infrastructure management.
https://cloud.google.com/composer/docs/how-to/managing/configuring-shared-vpc
https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters
https://cloud.google.com/composer/docs/how-to/managing/configuring-private-ip
Note: The provided references contain additional documentation and guidelines that can assist you in implementing and managing private Cloud Composer clusters and shared VPC environments.



