Release Review - Pub/Sub Message Filtering

Tue Feb 02, 2021 in optimize, pubsub, release review

In November 2020, Google released¹ a Pub/Sub feature called Message Filtering². This feature can help reduce unnecessary invocations of an HTTP endpoint and reduce egress traffic, both of which can help you reduce infrastructure costs.

In this article, we’ll take a closer look at this feature, with an aim to better understand its cost implications and to help you determine if it will be helpful to your cost-reduction efforts.

Feature Overview

Google’s Pub/Sub is described as follows:

Pub/Sub is a fully-managed real-time messaging service that allows you to send and receive messages between independent applications.

Messages are sent to a Pub/Sub Topic. A Topic has one or more Subscriptions, each of which routes messages to a consuming service. Each Subscription is associated with an application, service, or component that is interested in receiving messages sent to that Topic. A Subscription may be consumed by an application that polls to pull messages, or it may push messages to a service endpoint/URL via an HTTP POST.

Once the application receives a message, it acknowledges it. This ensures the message is not delivered again on that Subscription.

When a message is written to a Pub/Sub Topic, it carries two primary payloads: the message body, and attributes made up of key:value pairs.

A Message Filter, as provided by this new feature, is configured on a Pub/Sub Subscription. It is a matching rule that is evaluated against each message’s defined attributes. Only messages that match the filter are sent to the subscribing service. The others are immediately auto-acknowledged. Auto-acknowledged messages never get sent to the service endpoint or pulled by a consuming service or job.

For Filtering to work, the publishing service must be applying helpful attributes to the messages as they are published to the topic.
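To make the matching behavior concrete, here is a minimal Python sketch that simulates it locally. It does not call Pub/Sub. The message shape, the `event_type` attribute, and the `route` and `is_order_created` helpers are hypothetical and only illustrate the split between delivered and auto-acknowledged messages. On a real Subscription, a comparable rule would be written in the Pub/Sub filter syntax, for example `attributes.event_type = "order_created"`.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical message shape: {"data": bytes, "attributes": {str: str}}
Message = Dict[str, Any]


def route(
    messages: List[Message],
    matches: Callable[[Dict[str, str]], bool],
) -> Tuple[List[Message], List[Message]]:
    """Split messages into (delivered, auto_acknowledged) based on attributes only."""
    delivered: List[Message] = []
    auto_acked: List[Message] = []
    for msg in messages:
        # The rule is evaluated against the attributes, never the message body.
        if matches(msg["attributes"]):
            delivered.append(msg)
        else:
            auto_acked.append(msg)
    return delivered, auto_acked


def is_order_created(attrs: Dict[str, str]) -> bool:
    """Local stand-in for the filter: attributes.event_type = "order_created"."""
    return attrs.get("event_type") == "order_created"


messages = [
    {"data": b"order 1", "attributes": {"event_type": "order_created"}},
    {"data": b"order 1 shipped", "attributes": {"event_type": "order_shipped"}},
    {"data": b"no attributes", "attributes": {}},
]

delivered, auto_acked = route(messages, is_order_created)
print(len(delivered), "delivered;", len(auto_acked), "auto-acknowledged")
```

Note that the message without an `event_type` attribute cannot match, which is why the publisher has to set useful attributes for filtering to help.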

Cost Implications

A common use case for Pub/Sub Subscriptions is to invoke an HTTP function. If you are using a Serverless technology such as Cloud Run or Cloud Functions, you are billed based on your usage of that service, and unnecessary invocations can lead to unnecessary costs.

You may also be billed for egress traffic related to messages on this Subscription if they are pushed to, or pulled by, a service that is outside of the GCP network.

We’ll look more closely at both of these scenarios to quantify the potential cost impact of message filtering.

Example Scenario

As an example scenario, we will look at a simple Pub/Sub Topic with a single Subscription that receives 10 million messages per month. We will assume the message size is 8 KB. The cost of this Pub/Sub configuration will be $2.26/month³, based on that message volume. Filtering will have no impact on this cost because you still pay for auto-acknowledged messages.

This cost will be billed under the GCP service “Cloud Pub/Sub” and under the SKU of “Message Delivery Basic”. Depending on your message retention rules as well as your snapshot use as defined on the Subscription, you may incur additional costs.

None of these costs are affected by the use of message filtering. The cost impact is on downstream consumption:

Reduced Serverless Invocation (Cloud Function)

Let’s consider a scenario where you have a Cloud Function that is allocated 1GB RAM and has a 600ms execution time. The function is invoked based on the previously described scenario, 10 million times per month. This function does not send any egress traffic outside of the GCP region, so there is no cost for network traffic.

As defined, this function will cost $99.20/month⁴. It will be billed under the GCP service “Cloud Functions”, under SKUs such as the following:

sku.description	sku.id
CPU Time	C024-9C10-2A5B
Memory Time	F01C-3EA0-06CD
Invocations	8E10-82EB-6917
Network Egress from us-central1	B068-6B82-C017

If the function is filtering in code, and is therefore invoked on messages it does not need, we could move that logic into a message filter on the Subscription. If the filter reduces, for example, the message volume and invocations to 25% of the original, that would reduce our Cloud Functions bill to roughly 25%, or $24.80/month.
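The arithmetic behind that estimate can be sketched in a few lines of Python. This is a rough model, not a pricing calculator: it assumes the bill scales linearly with the number of invocations and ignores any free tier, and the `filtered_monthly_cost` helper is a hypothetical name used only for illustration.

```python
def filtered_monthly_cost(unfiltered_cost: float, delivered_fraction: float) -> float:
    """Scale a usage-based monthly cost by the fraction of messages still delivered.

    Assumes cost is directly proportional to invocation count (no free tier).
    """
    if not 0.0 <= delivered_fraction <= 1.0:
        raise ValueError("delivered_fraction must be between 0 and 1")
    return round(unfiltered_cost * delivered_fraction, 2)


# $99.20/month unfiltered, with the filter letting 25% of messages through.
cloud_function_cost = filtered_monthly_cost(99.20, 0.25)
print(f"${cloud_function_cost:.2f}/month")  # $24.80/month
```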

A similar scenario could be used for any other metered, pay-per-invocation Serverless service, e.g. Cloud Run, and provide similar results.

Reduced Egress (Pull or Push)

If the consuming service is located outside of the GCP region then you are billed for egress costs for the network traffic. Building on the previous scenario, with the Subscription consumer receiving 10 million messages a month, you are billed approximately 80GB of egress traffic.

There are a number of different SKUs for egress traffic related to the Cloud Pub/Sub service, and each has its own cost³. Examples include:

sku.description	sku.id
Inter-region data delivery from North America to North America	A8B0-3AAC-6BBD
Internet data delivery from North America to North America	4A21-6BF7-331E
Intra-region data delivery	460A-7826-0471
Internet data delivery from North America to EMEA	08DA-836B-A5A6
Inter-region data delivery from North America to EMEA	EF6B-3E74-36DA

For this example, calculating based on “Internet data delivery from North America to North America” with a rate of $0.12/GB, our costs without message filtering would be $9.60/month.

If the consuming service is receiving unnecessary messages from the Topic, applying a message filter could reduce the egress costs. A 75% decrease in delivered messages would reduce the cost to $2.40/month.
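As a worked version of this egress estimate, the following Python sketch reproduces the numbers from the scenario. It assumes decimal units (1 GB = 1,000,000 KB) and a flat per-GB rate with no tiering; the `monthly_egress_cost` helper is hypothetical and exists only to show the calculation.

```python
def monthly_egress_cost(
    messages: int,
    message_kb: float,
    rate_per_gb: float,
    delivered_fraction: float = 1.0,
) -> float:
    """Estimate monthly egress cost for messages delivered outside the region.

    Assumes decimal units (1 GB = 1,000,000 KB) and a flat per-GB rate.
    """
    total_gb = messages * message_kb / 1_000_000
    return round(total_gb * delivered_fraction * rate_per_gb, 2)


# 10 million messages of 8 KB each at $0.12/GB.
unfiltered = monthly_egress_cost(10_000_000, 8, 0.12)
# A 75% decrease means 25% of the messages are still delivered.
filtered = monthly_egress_cost(10_000_000, 8, 0.12, delivered_fraction=0.25)

print(f"unfiltered: ${unfiltered:.2f}/month")  # unfiltered: $9.60/month
print(f"filtered:   ${filtered:.2f}/month")    # filtered:   $2.40/month
```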

Summary

Message Filters are a helpful new Pub/Sub feature that increases the service’s utility and flexibility, even if your primary purpose isn’t cost savings.

Use the examples in this article to determine whether message filters will help you reduce your infrastructure costs.

Even if it is not worthwhile to modify existing subscriptions, it is probably a good idea to share this new Pub/Sub feature with your engineers so that it can be used on a go-forward basis. It is a helpful new function that makes Pub/Sub more powerful, improves design options for the service, and makes its use more efficient.