Protobuf Field Masks

August 19, 2019

When you design an API to a service, it’s nice if your API is robust enough to return just the data that the user is interested in.

For example, take the Google Maps Place Details API. There are about 20 fields that you can query, such as name, formatted_phone_number, website, and rating.

You can choose exactly which fields are returned by passing a fields parameter. This limits responses to just the data you actually want. It can also help reduce costs, as Google bills based on the returned fields.

The key idea here is that the client should be able to specify the fields that they want to read or update through your service. In this post, we’ll look at how this can be done using protobuf field masks.

An Example Service

Let’s say we have a service for tracking store inventory. Our service manages a bunch of items, each of which has a name, stock keeping unit (SKU) [1], quantity, and price. To keep things simple, our prices will always be integral amounts in your unit of choice.

Our service’s API communicates using protocol buffers, or protobufs, and specifically uses proto3. We’ll assume you already have some basic familiarity with protobufs. If you’re new to them, the main thing to keep in mind is that they’re a structured data format (think JSON or XML).

Item Definition

Our item definition looks something like this:

message Item {
  string sku = 1;
  string name = 2;
  uint32 quantity = 3;
  uint32 price = 4;
}

Read

Our service has two API endpoints that we’d like to improve.

The first lets you fetch a single item, specified by a SKU.

message FetchItemRequest {
  string sku = 1;
}

message FetchItemResponse {
  Item item = 1;
}

Right now, it returns every field in our item. We’d like to be able to specify some subset of these fields to return.

Update

The second lets you update a single item, again identified by its SKU.

message UpdateItemRequest {
  Item item = 1;
}

Just as our fetch endpoint always returns every field, our update endpoint always requires every field: you must fully specify the item to be updated. Even if you only want to change the quantity, you must still provide the name and price as well.

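It might seem odd that you can’t update only specific fields. If you’ve used proto3 yourself, you might have an idea of why we have this quirk: proto3 does not have has* methods for scalar types. That means the server cannot tell a field that was left unset apart from one that was explicitly set to its default value, such as a quantity of 0.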

To simplify our example, we just require specifying all fields in our request. Other ways of dealing with this issue are to use proto2 or to define a flag on each field to indicate whether or not the field was updated. Each of these approaches has its own problems. As we’ll see later on, field masks will help us fix this.

Field Masks

As mentioned, we’ll be improving our endpoints with field masks. Let’s look at how they work.

Masking

Field masks are similar to any other kind of mask. You might already be familiar with bitmasks (for bitwise operations) or layer masks (for image editing). A mask lets you indicate which parts of an object you’re interested in.

A field mask lets us indicate fields in a protobuf message. The mask itself is just a message, defined as a list of string paths:

message FieldMask {
  repeated string paths = 1;
}

The actual path values are just the field names defined in your message. If you have nested objects, you can specify the nested fields with a ., e.g. foo.bar.

For example, let’s say we wanted to get just the SKU and name from our items. We could use this mask:

FieldMask fieldMask =
    FieldMask.newBuilder()
        .addPaths("sku")
        .addPaths("name")
        .build();

If we applied our mask to this item…

Item hersheys =
    Item.newBuilder()
        .setSku("CHOC-0001")
        .setName("Hershey's Bar")
        .setQuantity(23)
        .setPrice(2)
        .build();

…we’d end up with this:

{
  sku: "CHOC-0001"
  name: "Hershey's Bar"
}
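To make the masking semantics concrete, here is a minimal plain-Java sketch that models a message as a nested map and a mask as a list of dotted paths. It is only an illustration of the idea, not how the protobuf library implements it; the class name MaskModel and the nested supplier field are hypothetical and do not appear in our Item definition.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of field masking over nested maps; not the protobuf implementation.
final class MaskModel {

    /** Returns a new map holding only the masked paths; "a.b" descends into a nested map. */
    static Map<String, Object> applyMask(List<String> paths, Map<String, Object> source) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (String path : paths) {
            copyPath(path.split("\\."), 0, source, result);
        }
        return result;
    }

    @SuppressWarnings("unchecked")
    private static void copyPath(String[] segments, int index,
                                 Map<String, Object> source, Map<String, Object> target) {
        String key = segments[index];
        if (!source.containsKey(key)) {
            return; // The path names a field this record lacks: skip it in this model.
        }
        Object value = source.get(key);
        if (index == segments.length - 1) {
            target.put(key, value); // Last segment: copy the value itself.
        } else if (value instanceof Map) {
            // Intermediate segment: descend into the nested record.
            Map<String, Object> child = (Map<String, Object>)
                    target.computeIfAbsent(key, k -> new LinkedHashMap<String, Object>());
            copyPath(segments, index + 1, (Map<String, Object>) value, child);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> supplier = new LinkedHashMap<>(); // hypothetical nested field
        supplier.put("name", "Hershey");
        supplier.put("country", "US");

        Map<String, Object> item = new LinkedHashMap<>();
        item.put("sku", "CHOC-0001");
        item.put("name", "Hershey's Bar");
        item.put("quantity", 23);
        item.put("price", 2);
        item.put("supplier", supplier);

        System.out.println(applyMask(List.of("sku", "name", "supplier.country"), item));
    }
}
```

Running main keeps sku, name, and only the country entry of the nested supplier record; quantity and price are dropped because the mask does not name them.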

Usage

To actually write the code to do this, we can use FieldMaskUtil, which lives in the com.google.protobuf.util package (the protobuf-java-util artifact).

The merge method will apply a field mask to a message for us. merge takes in a field mask, a source message, and a destination message builder. It sets fields in the destination builder, according to the field mask and source.

The code for our example above would look like this:

Item.Builder filteredItem = Item.newBuilder();
FieldMaskUtil.merge(fieldMask, hersheys, filteredItem);

Now that we’ve seen what field masks are and how to use them in code, let’s update our service’s endpoints.

Read

To update our fetch endpoint, we first need to let our callers specify a field mask for the fields they want returned.

import "google/protobuf/field_mask.proto";

message FetchItemRequest {
  string sku = 1;
  google.protobuf.FieldMask field_mask = 2;
}

Our endpoint would then construct its response as before, but apply the given field mask before returning.

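public FetchItemResponse fetchItem(FetchItemRequest request) {
    // loadItem is a hypothetical helper standing in for "fetch item as before".
    Item item = loadItem(request.getSku());
    // merge needs a builder as its destination, so start from an empty one.
    Item.Builder filteredItem = Item.newBuilder();
    FieldMaskUtil.merge(request.getFieldMask(), item, filteredItem);
    return FetchItemResponse.newBuilder()
        .setItem(filteredItem)
        .build();
}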

There are a few things to note about this change.

First, if this endpoint is already in use, this will actually be a breaking change. Existing requests will be sending the default, empty field mask. This would then result in nothing being returned. We’d need to either migrate our callers or explicitly check if a field mask has been set.

Second, this change does not substantially affect anything server-side. Even though we’re only returning a subset of our Item fields, we are still fetching and constructing a full Item on the server. Our improvement is just the reduced amount of data that we have to send back to the client.
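For the first point, one way to avoid breaking existing callers is to treat an empty mask as "return everything". The sketch below models that choice in plain Java, with the mask reduced to a list of path strings; the class name MaskDefaults and the hard-coded field list are assumptions made for illustration. With generated protobuf classes, the equivalent check would probably be something like request.hasFieldMask() or request.getFieldMask().getPathsCount() == 0.

```java
import java.util.List;

// Plain-Java model of a backward-compatible default for read masks.
final class MaskDefaults {

    // Assumed to mirror the fields of our Item message.
    static final List<String> ALL_ITEM_FIELDS = List.of("sku", "name", "quantity", "price");

    /** Treats an empty (unset) mask as a request for every field, as before the change. */
    static List<String> effectivePaths(List<String> requestedPaths) {
        return requestedPaths.isEmpty() ? ALL_ITEM_FIELDS : requestedPaths;
    }

    public static void main(String[] args) {
        System.out.println(effectivePaths(List.of()));              // old caller, no mask
        System.out.println(effectivePaths(List.of("sku", "name"))); // new caller
    }
}
```

The trade-off of this design is that a caller can no longer ask for "no fields at all", which is rarely a useful request for a read endpoint anyway.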

Update

Now, for our update endpoint. Again, we add a field mask to our request.

import "google/protobuf/field_mask.proto";

message UpdateItemRequest {
  Item item = 1;
  google.protobuf.FieldMask update_mask = 2;
}

We’ll merge our field mask slightly differently here. We want to create an updated Item, using the fields from our request.

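public void updateItem(UpdateItemRequest request) {
    // loadItem and saveItem are hypothetical helpers for fetching and persisting items.
    Item current = loadItem(request.getItem().getSku());
    Item.Builder updated = current.toBuilder();
    // Overwrite only the fields named in the update mask.
    FieldMaskUtil.merge(request.getUpdateMask(), request.getItem(), updated);
    saveItem(updated.build());
}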

We use our existing item to create our destination builder. merge will only update the fields specified by our field mask, resulting in our builder having the final state of our item.
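The effect of starting from the existing item can be modeled in a few lines of plain Java. This is a simplified sketch over flat maps, not the library's merge logic, and it only handles top-level paths; the class name UpdateModel is hypothetical. The request map deliberately carries zero and empty values for the unmasked fields, which is roughly what a proto3 message looks like when the caller sets only the quantity.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a masked update over flat records (top-level paths only).
final class UpdateModel {

    /** Returns a copy of current in which only the masked fields are replaced from update. */
    static Map<String, Object> mergeUpdate(List<String> updateMask,
                                           Map<String, Object> update,
                                           Map<String, Object> current) {
        Map<String, Object> merged = new LinkedHashMap<>(current);
        for (String path : updateMask) {
            if (update.containsKey(path)) {
                merged.put(path, update.get(path));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> current = new LinkedHashMap<>();
        current.put("sku", "CHOC-0001");
        current.put("name", "Hershey's Bar");
        current.put("quantity", 23);
        current.put("price", 2);

        // The caller only cares about quantity; the other fields hold default values.
        Map<String, Object> update = new LinkedHashMap<>();
        update.put("sku", "CHOC-0001");
        update.put("name", "");
        update.put("quantity", 0);
        update.put("price", 0);

        System.out.println(mergeUpdate(List.of("quantity"), update, current));
    }
}
```

Because the mask names only quantity, the sold-out quantity of 0 is applied while the default name and price in the request are ignored, which is exactly the distinction the unmasked endpoint could not make.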

Again, as with our changes to our fetch endpoint, the changes to our update endpoint will break existing behavior.

A Caveat

Because field masks identify fields by name rather than by field number, they break the backward and forward compatibility that protobufs normally provide when a field is renamed. If you want to rename your fields, you need to either update all of your callers or build some kind of versioning into your server.
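One lightweight form of server-side versioning is to translate old path names into their current names before applying a mask. The sketch below is a hedged illustration of that idea in plain Java: the rename from quantity to stock_count is invented for the example, the class name PathAliases is hypothetical, and only top-level paths are handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of rewriting renamed mask paths on the server.
final class PathAliases {

    // Hypothetical rename: the field once called "quantity" is now "stock_count".
    static final Map<String, String> RENAMED = Map.of("quantity", "stock_count");

    /** Maps each old path name to its current name; unknown paths pass through unchanged. */
    static List<String> canonicalize(List<String> paths) {
        List<String> canonical = new ArrayList<>();
        for (String path : paths) {
            canonical.add(RENAMED.getOrDefault(path, path));
        }
        return canonical;
    }

    public static void main(String[] args) {
        System.out.println(canonicalize(List.of("sku", "quantity")));
    }
}
```

Old callers can keep sending quantity while new callers send stock_count, at the cost of the server carrying the alias table for as long as old clients exist.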


  1. A SKU is just a unique identifier. It’s jargon used when talking about stuff you can sell.