In this article we will take things a step further: uploading an object to a GCS bucket will trigger a DLP inspection of the object, and if any preconfigured info types (such as credit card numbers or API credentials) are present in the object, a Slack notification will be generated.
Because DLP scans are “jobs”, meaning they run asynchronously, we need two separate Cloud Functions: one to trigger a scan [gcs-dlp-scan-trigger] and one to inspect the results of the scan [gcs-dlp-evaluate-results]. A Cloud PubSub topic [dlp-scan-topic] holds the reference to the DLP job.
The process is described using the sequence diagram below:
The gcs-dlp-scan-trigger Cloud Function fires when a new object is created in a specified GCS bucket. This function configures the DLP scan to be executed, including the DLP info types to look for (for instance CREDIT_CARD_NUMBER, EMAIL_ADDRESS, ETHNIC_GROUP, PHONE_NUMBER, etc.) and the minimum likelihood of a match that should be reported (for instance LIKELY). DLP scans determine the probability of an info type occurring in the data; they do not scan every object in its entirety, as this would be too expensive.
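As a rough illustration of the configuration described above, the sketch below builds a dictionary shaped like a DLP inspect configuration. The helper name build_inspect_config and its defaults are hypothetical and are not taken from the Cloud Function itself; the field names follow the DLP InspectConfig message as I understand it, so check them against the client library version you use.

```python
# Hypothetical helper: builds a dict shaped like a DLP InspectConfig.
# The function name and defaults are illustrative assumptions.

VALID_LIKELIHOODS = (
    "VERY_UNLIKELY",
    "UNLIKELY",
    "POSSIBLE",
    "LIKELY",
    "VERY_LIKELY",
)


def build_inspect_config(info_types, min_likelihood="LIKELY", max_findings=0):
    """Return an inspect configuration for the given info type names.

    max_findings=0 is intended to mean "no explicit cap" in this sketch.
    """
    if min_likelihood not in VALID_LIKELIHOODS:
        raise ValueError(f"unknown likelihood: {min_likelihood}")
    return {
        # each info type is referenced by name
        "info_types": [{"name": name} for name in info_types],
        # only findings at or above this likelihood are reported
        "min_likelihood": min_likelihood,
        "limits": {"max_findings_per_request": max_findings},
    }


config = build_inspect_config(["CREDIT_CARD_NUMBER", "EMAIL_ADDRESS"])
print(config)
```

A dictionary like this would typically be passed as the inspect configuration when the job is created.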
The primary function executed in the gcs-dlp-scan-trigger Cloud Function is named inspect_gcs_file. This function configures and submits the DLP job, supplying a PubSub topic to which the DLP Job Name will be written. The code for inspect_gcs_file is shown here:
At this stage the DLP job is created and running asynchronously. The next Cloud Function, gcs-dlp-evaluate-results, fires when a message is sent to the PubSub topic defined in the DLP job. The gcs-dlp-evaluate-results function reads the DLP Job Name from the PubSub topic, connects to the DLP service and queries the job status. When the job is complete, this function checks the results of the scan, and if the min_likelihood threshold is met for any of the specified info types, a Slack message is generated. The code for the main method in the gcs-dlp-evaluate-results function is shown here:
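The threshold check at the heart of this step can be sketched on its own. This is an illustrative sketch rather than the function's actual code: the findings are represented as plain dictionaries with hypothetical info_type and likelihood keys, and the helper names are assumptions. The likelihood ordering is the one used by the DLP service.

```python
# Likelihood levels from least to most likely.
LIKELIHOOD_ORDER = [
    "VERY_UNLIKELY",
    "UNLIKELY",
    "POSSIBLE",
    "LIKELY",
    "VERY_LIKELY",
]


def findings_meeting_threshold(findings, min_likelihood="LIKELY"):
    """Keep only the findings at or above the minimum likelihood."""
    threshold = LIKELIHOOD_ORDER.index(min_likelihood)
    return [
        f for f in findings
        if LIKELIHOOD_ORDER.index(f["likelihood"]) >= threshold
    ]


def build_alert_text(object_name, findings, min_likelihood="LIKELY"):
    """Return a notification message, or None if nothing meets the threshold."""
    hits = findings_meeting_threshold(findings, min_likelihood)
    if not hits:
        return None
    info_types = sorted({f["info_type"] for f in hits})
    return f"DLP scan of {object_name} found: {', '.join(info_types)}"


# Hypothetical scan results for a single uploaded object.
sample = [
    {"info_type": "EMAIL_ADDRESS", "likelihood": "POSSIBLE"},
    {"info_type": "CREDIT_CARD_NUMBER", "likelihood": "VERY_LIKELY"},
]
print(build_alert_text("uploads/customers.csv", sample))
```

Returning None when nothing meets the threshold keeps the decision about whether to notify separate from the code that actually sends the message.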
Finally, a Slack webhook is used to send the message to a specified Slack channel in a workspace. This is done using the send_slack_notification function shown here:
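As a rough sketch of what a function like this might look like (not the original implementation), the example below posts a JSON payload with a text field to an incoming webhook using only the standard library. The build_slack_request helper, the timeout value and the placeholder webhook URL are assumptions for illustration.

```python
import json
import urllib.request


def build_slack_request(webhook_url, message):
    """Build (but do not send) the HTTP POST for a Slack incoming webhook."""
    payload = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )


def send_slack_notification(webhook_url, message, timeout=10):
    """Send the message and return the HTTP status code."""
    request = build_slack_request(webhook_url, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Placeholder URL: substitute the webhook URL issued for your workspace.
request = build_slack_request(
    "https://hooks.slack.com/services/T000/B000/XXXX",
    "DLP scan found sensitive data",
)
print(request.get_method(), request.data)
```

Splitting request construction from sending makes the payload easy to inspect without calling Slack. In a Cloud Function the webhook URL would normally come from configuration rather than being hard coded.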
In this article we will build a program in Golang to parse a JSON file containing a collection held in a named key – without knowing the structure of this object, we will expose the schema for the object including data types and recurse the object for its values.
This example uses a great Go package called tablewriter to render the output of these operations using a table style result set.
The program has describe and select verbs as operation types; describe shows the column names in the collection and their respective data types, select prints the keys and values as a tabular result set with column headers for the keys and rows containing their corresponding values.
Starting with this:
We will end up with this when performing a describe operation:
And this when performing a select operation:
Now let’s talk about how we got there…
The JSON package
Support for JSON in Go is provided by the encoding/json package, which needs to be imported in your program. You will also need to import the reflect package (more on this later). io/ioutil is required to read the data from a file input. Other packages used in the program have been omitted for brevity:
Reading the data…
We will read the data from the JSON file into a variable called body, note that we are not attempting to deserialize the data at this point. This is also a good opportunity to handle any runtime or IO errors that occur here as well.
We will declare an empty interface called data, which will be used to decode the JSON object (the structure of which is not known). We will also declare an interface called colldata to hold the contents of the collection contained inside the JSON object that we are specifically looking for:
Next we need to validate that the input is valid JSON. We can use the json.Valid(body) function to do this:
Now for the interesting bits: we will deserialize the JSON object into the empty data interface we created earlier using the json.Unmarshal() function:
Note that this operation is another opportunity to catch unexpected errors and handle them accordingly.
Checking the type of the object using reflection…
Now that we have deserialized the JSON object into the data interface, there are several ways we can inspect the type of the object (which could be a map or an array). One such way is to use reflection. Reflection is the ability of a program to inspect itself at runtime. An example is shown here:
This instruction would produce the following output for our zones.json file:
The type switch…
Another way to determine the type of the data object (and of any objects nested as elements or keys within it) is to use a type switch. An example of a type switch function is shown here:
Finding the nested collection and recursing it…
The aim of the program is to find a collection (an array of maps) nested in a JSON object. The structure of the maps in each element of the array is not known in advance and is discovered through recursion.
If we are performing a describe operation, we only need to parse the first element of the collection to get the key names and the data type of the values (for which we will use the same getObjectType function to perform a type switch).
If we are performing a select operation, we need to parse the first element to get the column names (the keys in the map) and then we need to recurse each element to get the values for each key.
If the element contains a key named id or name, we will place this at the beginning of the resultant record, as maps are unordered by definition.
As mentioned, we are using the tablewriter package to render the output of the collection as a pretty printed table in our terminal. As wrap around can get pretty ugly, an additional maxfieldlen argument is provided to truncate the values if needed.
Although it is a bit more involved than some other languages, once you get your head around processing JSON in Go, the possibilities are endless!
This article demonstrates creating a site to site IPSEC VPN connection between a GCP VPC network and an Azure Virtual Network, enabling private RFC1918 network connectivity between virtual networks in both clouds. This is done using a single PowerShell script leveraging Azure PowerShell and gcloud commands in the Google SDK.
Additionally, we will use Azure Private DNS to enable private access between Azure hosts and GCP APIs (such as Cloud Storage or Big Query).
An overview of the solution is provided here:
One note before starting: site to site VPN connections between GCP and Azure currently do not support dynamic routing using BGP. However, creating some simple routes on either end of the connection will be enough to get going.
Let’s go through this step by step:
Step 1 : Authenticate to Azure
Azure’s account equivalent is a subscription. The following command from Azure PowerShell is used to authenticate a user to one or more subscriptions.
This command will open a browser window prompting you for Microsoft credentials. Once authenticated you will be returned to the command line.
Step 2 : Create a Resource Group (Azure)
A resource group is roughly equivalent to a project in GCP. You will need to supply a Location (equivalent to a GCP region):
Step 3 : Create a Virtual Network with Subnets and Routes (Azure)
An Azure Virtual Network is the equivalent of a VPC network in GCP (or AWS). You must define subnets before creating a Virtual Network. In this example we will create two subnets: one Gateway subnet (which needs to be named accordingly) where the VPN gateway will reside, and one subnet named ‘default’ where we will host VMs which will connect to GCP services over the private VPN connection.
Before defining the default subnet we must create and attach a Route Table (equivalent of a Route in GCP). This particular route will be used to route ‘private’ requests to services in GCP (such as Big Query).
Network Security Groups in Azure are stateful firewalls much like Firewall Rules in VPC networks in GCP. As in GCP, rules with a lower priority number take precedence over rules with a higher priority number.
In the example we will create several rules to allow inbound ICMP, TCP and UDP traffic from our Google VPC and RDP traffic from the Internet (which we will use to logon to a VM in Azure to test private connectivity between the two clouds):
We need to create two Public IP Addresses (equivalent of an External IP in GCP) which will be used for our VPN gateway and for the VM we will create:
# create public IP address for VM
# -AllocationMethod is mandatory; Dynamic is assumed here
$vmpip = New-AzPublicIpAddress `
    -Name "vm-ip" `
    -ResourceGroupName "azure-to-gcp" `
    -Location "Australia Southeast" `
    -AllocationMethod Dynamic
# create public IP address for NW gateway
# -AllocationMethod is mandatory; Dynamic is assumed here
$ngwpip = New-AzPublicIpAddress `
    -Name "ngw-ip" `
    -ResourceGroupName "azure-to-gcp" `
    -Location "Australia Southeast" `
    -AllocationMethod Dynamic
Step 6 : Create Virtual Network Gateway (Azure)
The Virtual Network Gateway is Azure’s equivalent of a VPN Gateway, and will be used to create a VPN tunnel between Azure and a GCP VPN Gateway. This gateway will be placed in the Gateway subnet created previously, and one of the Public IP addresses created in the previous step will be assigned to it.
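# create virtual network gateway
# the gateway public IP created earlier is assigned here (parameter assumed from the description above)
$ngwipconfig = New-AzVirtualNetworkGatewayIpConfig `
    -Name "ngw-ipconfig" `
    -SubnetId $gatewaySubnet.Id `
    -PublicIpAddressId $ngwpip.Id
# use the AsJob switch as this is a long running process
$job = New-AzVirtualNetworkGateway -Name "vnet-gateway" `
    -ResourceGroupName "azure-to-gcp" `
    -Location "Australia Southeast" `
    -IpConfigurations $ngwipconfig `
    -GatewayType "Vpn" `
    -VpnType "RouteBased" `
    -GatewaySku "VpnGw1" `
    -VpnGatewayGeneration "Generation1" `
    -AsJob
# retrieve the gateway once the job has completed (resource group assumed to match the one used above)
$vnetgw = Get-AzVirtualNetworkGateway `
    -Name "vnet-gateway" `
    -ResourceGroupName "azure-to-gcp"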
Step 7 : Create a VPC Network and Subnetwork(s) (GCP)
A VPC network and subnet need to be created in GCP. The subnet defines the VPC address space, which must not overlap with the Azure Virtual Network CIDR. For all GCP steps it is assumed that the project is set for client config (e.g. gcloud config set project <>) so it does not need to be specified for each operation. Private Google access should be enabled on all subnets created.
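The no-overlap requirement is easy to check before creating anything. The sketch below uses Python's standard ipaddress module. The two CIDR ranges are hypothetical examples, not the ranges used in this article.

```python
import ipaddress


def cidrs_overlap(cidr_a, cidr_b):
    """Return True if the two CIDR ranges share any addresses."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


# Hypothetical address spaces for the two clouds.
azure_vnet_cidr = "10.1.0.0/16"
gcp_subnet_cidr = "10.2.0.0/16"

if cidrs_overlap(azure_vnet_cidr, gcp_subnet_cidr):
    print("Address spaces overlap: choose a different range")
else:
    print("Address spaces are disjoint")
```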
Now we will create the GCP side of our VPN tunnel using the Public IP Address of the Azure Virtual Network Gateway created in a previous step. As this example uses a route based VPN the traffic selector values need to be set at 0.0.0.0/0. A PSK (Pre Shared Key) needs to be supplied which will be the same key used when we configure a VPN Connection on the Azure side of the tunnel.
As we are using static routing (as opposed to dynamic routing) we will need to define all of the specific routes on the GCP side. We will need to set up routes for both outgoing traffic to the Azure network as well as incoming traffic for the restricted Google API range (199.36.153.4/30).
Now we can set up the Azure side of the VPN Connection, which is accomplished by associating the Azure Virtual Network Gateway with the Local Network Gateway. A PSK (Pre Shared Key) needs to be supplied, which is the same key used for the GCP VPN Tunnel in step 10.
Perform an nslookup to ensure that calls to googleapis.com resolve to the restricted API range (e.g. nslookup storage.googleapis.com). You should see a response showing the A records from the googleapis.com Private DNS Zone created in step 14.
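The same check can be scripted. The sketch below resolves a hostname and confirms that every returned IPv4 address falls inside 199.36.153.4/30, the range Google documents for restricted.googleapis.com. The function names are illustrative assumptions, and resolves_privately only gives a meaningful answer when run from a host that uses the Private DNS Zone.

```python
import ipaddress
import socket

# Range documented by Google for restricted.googleapis.com.
RESTRICTED_RANGE = ipaddress.ip_network("199.36.153.4/30")


def in_restricted_range(address):
    """Return True if the IP address is inside the restricted API range."""
    return ipaddress.ip_address(address) in RESTRICTED_RANGE


def resolves_privately(hostname):
    """Resolve the hostname and check that every A record is in the range."""
    results = socket.getaddrinfo(hostname, 443, family=socket.AF_INET)
    addresses = {result[4][0] for result in results}
    return bool(addresses) and all(in_restricted_range(a) for a in addresses)


print(in_restricted_range("199.36.153.5"))
```

On the Azure VM, calling resolves_privately("storage.googleapis.com") should return True once the Private DNS Zone is in place.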
Now test connectivity to Google APIs, for example by accessing Google Cloud Storage using gsutil or Big Query using the bq command.
In the previous post in this series Spark in the Google Cloud Platform Part 1, we started to explore the various ways in which we could deploy Apache Spark applications in GCP. The first option we looked at was deploying Spark using Cloud DataProc, a managed Hadoop cluster with various ecosystem components included.
In this post, we will look at another option for deploying Spark in GCP – a Spark Standalone cluster running on GKE.
Spark Standalone refers to the in-built cluster manager provided with each Spark release. Standalone can be a bit of a misnomer as it sounds like a single instance, which it is not. Standalone simply refers to the fact that it is not dependent upon any other projects or components such as Apache YARN, Mesos, etc.
A Spark Standalone cluster consists of a Master node or instance and one or more Worker nodes. The Master node serves as both a master and a cluster manager in the Spark runtime architecture.
The Master process is responsible for marshalling resource requests on behalf of applications and monitoring cluster resources.
The Worker nodes host one or many Executor instances which are responsible for carrying out tasks.
Deploying a Spark Standalone cluster on GKE is reasonably straightforward. In the example provided in this post we will set up a private network (VPC), create a GKE cluster, and deploy a Spark Master pod and two Spark Worker pods (in a real scenario you would typically have many Worker pods).
Once the network and GKE cluster have been deployed, the first step is to create Docker images for both the Master and Workers.
The Dockerfile below can be used to create an image capable of running either the Worker or Master daemons:
Note the shell scripts included in the Dockerfile: spark-master and spark-worker. These will be used later on by K8S deployments to start the respective Master and Worker daemon processes in each of the pods.
Next, we will use Cloud Build to build an image using the Dockerfile and store this in GCR (Google Container Registry). From the Cloud Build directory in our project we will run:
Now from within the k8s-deployments\deploy folder of our project we will use the kubectl command to deploy the Master pod, service and the Worker pods.
Starting with the Master deployment, this will deploy our Spark Standalone image into a container running the Master daemon process:
To deploy the Master, run the following:
kubectl create -f spark-master-deployment.yaml
The Master will expose a web UI on port 8080 and an RPC service on port 7077, so we will need to deploy a K8S service for this. The YAML required to do this is shown here:
To deploy the Master service, run the following:
kubectl create -f spark-master-service.yaml
Now that we have a Master pod and service up and running, we need to deploy our Workers which are preconfigured to communicate with the Master service.
The YAML required to deploy the two Worker pods is shown here:
To deploy the Worker pods, run the following:
kubectl create -f spark-worker-deployment.yaml
You can now inspect the Spark processes running on your GKE cluster.
kubectl get deployments
NAME           READY   UP-TO-DATE   AVAILABLE   AGE
spark-master   1/1     1            1           7m45s
spark-worker   2/2     2            2           9s
kubectl get pods
NAME                            READY   STATUS    RESTARTS   AGE
spark-master-f69d7d9bc-7jgmj    1/1     Running   0          8m
spark-worker-55965f669c-rm59p   1/1     Running   0          24s
spark-worker-55965f669c-wsb2f   1/1     Running   0          24s
Next, as we need to expose the Web UI for the Master process we will create a LoadBalancer resource. The YAML used to do this is provided here:
To deploy the LB, you would run the following:
kubectl create -f spark-ui-lb.yaml
NOTE: This is just an example. For simplicity we are creating an external LoadBalancer with a public IP; this configuration is unlikely to be appropriate in most real scenarios. Alternatives would include an internal LoadBalancer, restricting access using Authorized Networks, a jump host, SSH tunnelling or IAP.
Now you’re up and running!
You can access the Master web UI from the Google Console link shown here:
The Spark Master UI should look like this:
Next we will exec into a Worker pod to get a shell:
kubectl exec -it spark-worker-55965f669c-rm59p -- sh
Now from within the shell environment of a Worker – which includes all of the Spark client libraries, we will submit a simple Spark application:
You can see the results in the shell, as shown here:
Additionally, as all of the container logs go to Stackdriver, you can view the application logs there as well:
This is a simple way to get a Spark cluster running. It is not without its downsides and shortcomings however, which include the limited security mechanisms available (SASL, network security, shared secrets).
In the final post in this series we will look at Spark on Kubernetes, using Kubernetes as the Spark cluster manager and interacting with Spark using the Kubernetes API and control plane, see you then.
The infrastructure coding for this example uses Powershell and Terraform, and is deployed as follows:
I have been an avid Spark enthusiast since 2014 (the early days..). Spark has featured heavily in every project I have been involved with from data warehousing, ETL, feature extraction, advanced analytics to event processing and IoT applications. I like to think of it as a Swiss army knife for distributed processing.
Curiously enough, the first project I had been involved with for some years that did not feature the Apache Spark project was a green field GCP project which got me thinking… where does Spark fit into the GCP landscape?
Unlike the other major providers, who use Spark as the backbone of their managed distributed ETL services (examples include AWS Glue and the Spark integration runtime option in Azure Data Factory), Google’s managed ETL solution is Cloud DataFlow. Cloud DataFlow, which is a managed Apache Beam service, does not use a Spark runtime (there is a Spark Runner, however this is not an option when using CDF). So where does this leave Spark?
My summation is that although Spark is not a first-class citizen in GCP (as far as managed ETL), it is not a second-class citizen either. This article will discuss the various ways Spark clusters and applications can be deployed within the GCP ecosystem.
Quick Primer on Spark
Every Spark application contains several components regardless of deployment mode, the components in the Spark runtime architecture are:
the Driver
the Master
the Cluster Manager
the Executor(s), which run on worker nodes or Workers
Each component has a specific role in executing a Spark program and all of the Spark components run in Java virtual machines (JVMs).
Cluster Managers schedule and manage distributed resources (compute and memory) across the nodes of the cluster. Cluster Managers available for Spark include:
Spark on DataProc
This is perhaps the simplest and most integrated approach to using Spark in the GCP ecosystem.
DataProc is GCP’s managed Hadoop Service (akin to AWS EMR or HDInsight on Azure). DataProc uses Hadoop/YARN as the Cluster Manager. DataProc clusters can be deployed on a private network (VPC using RFC1918 address space), support encryption at rest using Google Managed or Customer Managed Keys in KMS, support autoscaling and the use of Preemptible Workers, and can be deployed in a HA config.
Furthermore, DataProc clusters can enforce strong authentication using Kerberos which can be integrated into other directory services such as Active Directory through the use of cross realm trusts.
DataProc clusters can be deployed using the gcloud dataproc clusters create command or using IaC solutions such as Terraform. For this article I have included an example in the source code using the gcloud command to deploy a DataProc cluster on a private network which was created using Terraform.
The beauty of DataProc is its native integration into IAM and the GCP service plane. Having been a long-time user of AWS EMR, I have found that the usability and integration are in many ways superior in GCP DataProc. Let’s look at some examples…
IAM and IAP (TCP Forwarding)
DataProc is integrated into Cloud IAM using various coarse grained permissions such as dataproc.clusters.use and simplified IAM Roles such as dataproc.editor or dataproc.admin. Members with bindings to these roles can perform tasks such as submitting jobs and creating workflow templates (which we will discuss shortly), as well as accessing instances such as the master node instance or instances in the cluster using IAP (TCP Forwarding) without requiring a public IP address or a bastion host.
DataProc Jobs and Workflows
Spark jobs can be submitted using the console or via gcloud dataproc jobs submit as shown here:
Cluster logs are natively available in StackDriver and standard out from the Spark Driver is visible from the console as well as via gcloud commands.
Complex Workflows can be created by adding Jobs as Steps in Workflow Templates using the following command:
gcloud dataproc workflow-templates add-job spark
Optional Components and the Component Gateway
DataProc provides you with a Hadoop cluster including YARN and HDFS, and a Spark runtime which includes Spark SQL and SparkR. DataProc also supports several optional components including Anaconda, Jupyter, Zeppelin, Druid, Presto, and more.
Web interfaces to some of these components as well as the management interfaces such as the Resource Manager UI or the Spark History Server UI can be accessed through the Component Gateway.
This is a Cloud IAM integrated gateway (much like IAP) which can allow access through an authenticated and authorized console session to web UIs in the cluster – without the need for SSH tunnels, additional firewall rules, bastion hosts, or public IPs. Very cool.
Links to the component UIs as well as built in UIs like the YARN Resource Manager UI are available directly through the console.
Jupyter is a popular notebook application in the data science and analytics communities used for reproducible research. DataProc’s Jupyter component provides a ready-made Spark application vector using PySpark. If you have also installed the Anaconda component you will have access to the full complement of scientific and mathematical Python packages such as Pandas and NumPy which can be used in Jupyter notebooks as well. Using the Component Gateway, Jupyter notebooks can be accessed directly from the Google console as shown here:
From this example you can see that I accessed source data from a GCS bucket and used HDFS as local scratch space.
Furthermore, notebooks are automagically saved in your integrated Cloud Storage DataProc staging bucket and can be shared amongst analysts or accessed at a later time. These notebooks also persist beyond the lifespan of the cluster.
Next up we will look at deploying a Spark Standalone cluster on a GKE cluster, see you then!
This article demonstrates Cloud SQL federated queries for Big Query, a neat and simple to use feature.
Connecting to Cloud SQL
One of the challenges presented when using Cloud SQL on a private network (VPC) is providing access to users. There are several ways to accomplish this which include:
open the database port on the VPC Firewall (5432 for example for Postgres) and let users access the database using a command line or locally installed GUI tool (may not be allowed in your environment)
provide a web based interface deployed on your VPC such as PGAdmin deployed on a GCE instance or GKE pod (adds security and management overhead)
use the Cloud SQL proxy (requires additional software to be installed and configured)
In addition, all of the above solutions require direct IP connectivity to the instance, which may not always be available. Furthermore, each of these options requires the user to present some form of authentication, in many cases a database user and password which must then be managed at an individual level.
Enter Cloud SQL federated queries for Big Query…
Big Query Federated Queries for Cloud SQL
Big Query allows you to query tables and views in Cloud SQL (currently MySQL and Postgres) using the Federated Queries feature. The queries could be authorized views in Big Query datasets for example.
This has the following advantages:
Allows users to authenticate and use the GCP console to query Cloud SQL
Does not require direct IP connectivity to the user or additional routes or firewall rules
Leverages Cloud IAM as the authorization mechanism, rather than unmanaged db user accounts and object level permissions
External queries can be executed against a read replica of the Cloud SQL instance to offload query IO from the master instance
Setting it up
Setting up Big Query federated queries for Cloud SQL is exceptionally straightforward. A summary of the steps is provided below:
Step 1. Enable a Public IP on the Cloud SQL instance
This sounds bad, but it isn’t really that bad. You need to enable a public interface for Big Query to be able to establish a connection to Cloud SQL, however this is not accessed through the actual public internet – rather it is accessed through the Google network using the back end of the front end if you will.
Furthermore, you configure an empty list of authorized networks, which effectively shields the instance from the public network. This can be configured in Terraform as shown here:
This configuration change can be made to a running instance as well as during the initial provisioning of the instance.
As shown below you will get a warning dialog in the console saying that you have no authorized networks – this is by design.
Step 2. Create a Big Query dataset which will be used to execute the queries to Cloud SQL
Connections to Cloud SQL are defined in a Big Query dataset. This can also be used to control access to Cloud SQL using authorized views controlled by IAM roles.
Step 3. Create a connection to Cloud SQL
To create a connection to Cloud SQL from Big Query you must first enable the BigQuery Connection API, this is done at a project level.
As this is a fairly recent feature there isn’t great coverage with either the bq tool or any of the Big Query client libraries to do this so we will need to use the console for now…
Under the Resources -> Add Data link in the left hand panel of the Big Query console UI, select Create Connection. You will see a side info panel with a form to enter connection details for your Cloud SQL instance.
In this example I will setup a connection to a Cloud SQL read replica instance I have created:
In this post we will look at read replicas as an additional method to achieve multi zone availability for Cloud SQL, which gives us – in turn – the ability to offload (potentially expensive) IO operations such as user created backups or read operations without adding load to the master instance.
In the previous post in this series we looked at Regional availability for PostgreSQL HA using Cloud SQL:
Recall that this option was simple to implement and worked relatively seamlessly and transparently with respect to zonal failover.
Now let’s look at read replicas in Cloud SQL as an additional measure for availability.
Deploying Read Replica(s)
Deploying read replicas is slightly more involved than simple regional (high) availability, as you will need to define each replica as a separate Cloud SQL instance which is a slave to the primary instance (the master instance).
An example using Terraform is provided here, starting by creating the master instance:
Next you would specify one or more read replicas (typically in a zone other than the zone the master is in):
Note that several of the options supplied are omitted when creating a read replica database instance, such as the backup and maintenance options – as these operations cannot be performed on a read replica as we will see later.
Voila! You have just set up a master instance (the primary instance your application and/or users will connect to) along with a read replica in a different zone which will be asynchronously updated as changes occur on the master instance.
Read Replicas in Action
Now that we have created a read replica, let’s see it in action. After connecting to the read replica (like you would any other instance), attempt to access a table that has not been created on the master as shown here:
Now create the table and insert some data on the master instance:
Now try the select operation on the replica instance:
Some Points to Note about Cloud SQL Read Replicas
Users connect to a read replica as a normal database connection (as shown above)
Google managed backups (using the console or gcloud sql backups create .. ) can NOT be performed against replica instances
Read replicas can be used to offload IO intensive operations from the master instance, such as user managed backup operations (e.g. pg_dump)
BE CAREFUL Despite their name, read replicas are NOT read only, updates can be made which will NOT propagate back to the master instance – you could get yourself in an awful mess if you allow users to perform INSERT, UPDATE, DELETE, CREATE or DROP operations against replica instances.
Promoting a Read Replica
If required a read replica can be promoted to a standalone Cloud SQL instance, which is another DR option. Keep in mind however as the read replica is updated in an asynchronous manner, promotion of a read replica may result in a loss of data (hopefully not much but a loss nonetheless). Your application RPO will dictate if this is acceptable or not.
Promotion of a read replica is reasonably straightforward as demonstrated here using the console:
Once you click on the Promote Replica button you will see the following warning:
This simply states that once you promote the replica instance your instance will become an independent instance with no further relationship with the master instance. Once accepted and the promotion process is complete, you can see that you now have two independent Cloud SQL instances (as advertised!):
Some of the options you would normally configure with a master instance would need to be configured on the promoted replica instance – such as high availability, maintenance and scheduled backups – but in the event of a zonal failure you would be back up and running with virtually no data loss!
Full source code for this article is available at:
In this multi-part blog we will explore the features available in Google Cloud SQL for High Availability, Backup and Recovery, Replication and Failover and Security (at rest and in transit) for the PostgreSQL DBMS engine. Some of these features are relatively hot off the press and in Beta, which still makes them available for general use.
We will start by looking at the High Availability (HA) options available to you when using the PostgreSQL engine in Google Cloud SQL.
Most of you would be familiar with the concepts of High Availability, Redundancy, Fault Tolerance, etc but let’s start with a definition of HA anyway:
High availability (HA) is a characteristic of a system, which aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period.
“Higher than normal” is quite subjective. Typically it is quantified as a percentage expressed as a number of “9s”. For example, 99.99% (which would be quoted as “four nines”) would allot you 52.60 minutes of downtime over a one-year period.
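The arithmetic behind that figure is simple enough to sketch. The function below is illustrative only, and it assumes a year of 365.25 days, which is what yields the 52.60 minute figure for four nines.

```python
def allowed_downtime_minutes(availability_percent, days=365.25):
    """Minutes of downtime permitted over the period at a given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


for nines in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(nines)
    print(f"{nines}% allows {minutes:.2f} minutes of downtime per year")
```

Each additional nine cuts the allowance by a factor of ten, which is why the number of 9s required has such a strong influence on the architecture you choose.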
Essentially the number of 9’s required will drive your bias towards the options available to you for Cloud SQL HA.
We will start with Cloud SQL HA in its simplest form, Regional Availability.
Knowing what we know about the Google Cloud Platform, regional availability means that our application or service (in this case Cloud SQL) should be resilient to a failure of any one zone in our region. In fact, as all GCP regions have at least 3 zones – two zones could fail, and our application would still be available.
Regional availability for Cloud SQL (which Google refers to as High Availability), creates a standby instance in addition to the primary instance and uses a regional Persistent Disk resource to store the database instance data, transaction log and other state files, which is synchronously replicated to a Persistent Disk resource local to the zones that the primary and standby instances are located in.
A shared IP address (like a Virtual IP) is used to serve traffic to the healthy (normally primary) Cloud SQL instance.
An overview of Cloud SQL HA is shown here:
Implementing High Availability for Cloud SQL
Implementing Regional Availability for Cloud SQL is dead simple; it is a single argument:
availability_type = "REGIONAL"
Using the gcloud command line utility, this would be:
The gcloud failover command also accepts an --async option, which returns immediately and runs the failover operation asynchronously.
Failover can also be invoked from the Cloud Console using the Failover button shown previously. As an example, I have created a connection to a regionally available Cloud SQL instance and started a command which runs a loop and prints out a counter:
Now using the gcloud command shown earlier, I have invoked a manual failover of the Cloud SQL instance.
Once the failover is initiated, the client connection is dropped (as the server is momentarily unavailable):
The connection can be re-established immediately afterwards. The state of the running query is lost but, importantly, no data is lost. If your application clients have retry logic in their code and are not executing a long-running query, chances are no one will notice! Once reconnected, normal database activities can be resumed:
A quick check of the instance logs will show that the failover event has occurred:
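To illustrate the kind of client-side retry logic mentioned above, here is a minimal sketch of a retry helper with exponential backoff. It is not tied to any particular PostgreSQL driver: the helper name, the defaults and the use of the built-in ConnectionError are assumptions for illustration, and a real client would retry on whatever exception its driver raises when the connection drops.

```python
import time


def with_retries(operation, attempts=5, base_delay=0.5,
                 retry_on=(ConnectionError,), sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on connection errors.

    operation is any zero-argument callable, for example one that opens a
    fresh connection and runs a short query. The last error is re-raised
    once all attempts have been used.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            # Wait 0.5s, 1s, 2s, ... before trying again.
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated failover: the first two calls fail, the third succeeds.
state = {"calls": 0}


def run_query():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("server closed the connection unexpectedly")
    return "query result"


print(with_retries(run_query, base_delay=0.01))
```

Because the state of an in-flight query is lost during failover, the operation being retried should reconnect and re-run the whole statement or transaction, so it needs to be safe to repeat.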
Now when you return to the instance page in the console you will see a Failback button, which indicates that your instance is being served by the standby:
Note that there may be a slight delay in the availability of this option as the replica is still being synched.
It is worth noting that nothing comes for free! When you run in REGIONAL (High Availability) mode you are effectively paying double the cost of running in ZONAL mode. However, availability and cost have always been trade-offs against one another; you get what you pay for.
Next up we will look at read replicas (and their ability to be promoted) as another high availability alternative in Cloud SQL.
There are many posts available which map analogous services between the different cloud providers, but this post attempts to go a step further and map additional concepts, terms, and configuration options to be the definitive thesaurus for cloud practitioners familiar with AWS looking to fast track their familiarisation with GCP.
It should be noted that AWS and GCP are fundamentally different platforms; nowhere is this more apparent than in the way networking is implemented by the two providers, see:
This post is focused on the core infrastructure, networking and security services offered by the two major cloud providers. I will do a future post on higher-level services such as the ML/AI offerings from the respective providers.
Furthermore, this will be a living post which I will continue to update. I encourage comments from readers on additional mappings, which I will incorporate into the post as well.
I have broken this down into sections based upon the layout of the AWS Console.
The prerequisite steps to configure Slack are provided here:
1. First you will need to create a Slack app (assuming you have already set up an account and a workspace). The following screenshots demonstrate this process:
2. Next you need to enable and activate Incoming Webhooks for your app and add the app to your workspace. The following screenshots demonstrate this process:
3. Next you need to specify a channel for notifications generated from object events.
4. Now you need to copy the webhook URL provided; you will use this later in your Cloud Function.
Treat your webhook URL as a secret; do not upload it to a public source code repository.
Next you need to create your Cloud Function. This example uses Python, but you can use an alternative runtime such as Node.js or Go.
This example templates the source code using the Terraform template_file data source. The function source code is shown here:
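A minimal sketch of what such a templated function could look like follows. It is an illustration under stated assumptions rather than the exact listing: the function names and message format are hypothetical, the standard library urllib is used in place of any third-party HTTP package, and the ${slack_webhook_url} placeholder is assumed to be the variable substituted when Terraform renders the template.

```python
import json
import urllib.request

# Placeholder replaced with the real webhook URL when Terraform renders the template.
SLACK_WEBHOOK_URL = "${slack_webhook_url}"


def build_slack_message(event: dict) -> dict:
    """Build the Slack payload from a GCS object event (hypothetical helper)."""
    bucket = event.get("bucket", "unknown-bucket")
    name = event.get("name", "unknown-object")
    return {"text": f"New object created: gs://{bucket}/{name}"}


def notify_slack(event, context):
    """Cloud Function entry point (hypothetical name), invoked once per bucket event."""
    payload = json.dumps(build_slack_message(event)).encode("utf-8")
    request = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    # POST the JSON payload to the Slack incoming webhook.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping the message construction in its own function makes it easy to test locally without calling Slack.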
Within your Terraform code you need to render your Cloud Function code, substituting the slack_webhook_url for its value, which you will supply as a Terraform variable. The rendered template file is then placed in a local directory along with a requirements.txt file and zipped up. The resulting Zip archive is uploaded to a specified bucket, from which it will be sourced to create the Cloud Function.
Now you need to create the Cloud Function; the following HCL snippet demonstrates this:
The event_trigger block in particular specifies which GCS bucket to watch and what events will trigger invocation of the function. Bucket events include:
google.storage.object.finalize (the creation of a new object)
You could add additional logic to the Cloud Function code to look for specific object names or naming patterns, but keep in mind the function will fire upon every event matching the event_type and resource criteria.
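As a rough sketch of that kind of filtering, the function could check the object name against a list of glob patterns before sending a notification. The patterns and helper name below are hypothetical; note that the function is still invoked (and billed) for every matching event, and the filter only decides whether to act on it.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for the objects we want to be notified about.
WATCHED_PATTERNS = ("uploads/*.csv", "reports/*.pdf")


def should_notify(object_name: str, patterns=WATCHED_PATTERNS) -> bool:
    """Return True if the object name matches any watched glob pattern.

    fnmatchcase is case-sensitive, and "*" also matches "/" characters,
    so "uploads/*.csv" matches objects in nested "folders" as well.
    """
    return any(fnmatchcase(object_name, pattern) for pattern in patterns)


print(should_notify("uploads/customers.csv"))
print(should_notify("logs/app.log"))
```

In the Cloud Function itself you would call should_notify(event["name"]) at the top of the entry point and return early when it is False.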