Zero Trust Kubernetes and the Service Mesh

If you’re at all involved with technology, chances are good that you’re already working in the cloud, and that you’ve at least heard of zero trust. The topic became top of mind for many after the White House issued guidance calling for federal agencies to implement zero trust by the end of fiscal year 2024. Let’s take a closer look at what it really is.

What is Zero Trust?

Zero trust is a model for helping us reason about some questions that have been central to security for a very long time: who do we trust with what, and how do we make decisions about trust? The core concepts of zero trust are that trust must be explicit rather than inferred, and that you check every access, every time, from everyone.

We can contrast zero trust with the much older perimeter security model, in which we protected important things by putting walls around them, assuming that whoever is inside the wall is safe and keeping anyone scary outside. Here, trust is inferred based on your location: are you inside the wall, or outside? Additionally, once you are inside the wall, typically no further checks are made.

Perimeter security worked in the pre-cloud days, where we were usually running a monolith on machines and networks we controlled. That world has ready-made perimeters defined by the edge of the monolith, the edge of the computer(s), and the edge of the network. But in the cloud-native world, it falls apart (though using a firewall around your cluster is still a good idea for defense in depth).

Today, applications are a collection of microservices running on CPUs and networks that we don’t own, managed by a layer of code that we didn’t write. Worse, that hardware we don’t own may also be running code for our competitors at the same time as it’s running ours, relying on virtualization and containers to keep everything separate. Kubernetes enters as an incredibly powerful tool, and trusting its guarantees can be far more cost-effective than building out our own data centers, but it does not allow perimeter security.

Cloud-Native Zero Trust

To make zero trust work in the cloud-native world, we need to start by understanding the core components for security: identity, policy, and enforcement.

Identity: who is trying to take an action

In the pre-cloud world, we often inferred identity from location: is the requester inside the perimeter or not? In the cloud-native world, though, Kubernetes controls the network, so we need to associate identity with the workload, not the network.

Policy: what is the requester allowed to do

Again, this was typically very simple in the pre-cloud days: anyone inside the perimeter could do anything, while those outside had more limitations. Zero trust, though, requires following the principle of least privilege, which paradoxically can require very complex descriptions of policy.

Enforcement: how are bad requests handled

This one is as simple in the cloud as before: bad requests must not be allowed to complete.
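To make the contrast concrete, here is a minimal sketch of the two models side by side. Everything in it is hypothetical and chosen only for illustration: the `10.0.0.0/8` perimeter, the `billing` identity, and the action names. It is not how any particular mesh implements these checks.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical values, for illustration only.
PERIMETER = ipaddress.ip_network("10.0.0.0/8")
GRANTS = {("billing", "read-invoices")}  # explicit (identity, action) pairs


@dataclass(frozen=True)
class Request:
    source_ip: str  # where the request came from
    identity: str   # verified workload identity ("" if unauthenticated)
    action: str     # what the requester wants to do


def perimeter_allows(req: Request) -> bool:
    """Perimeter model: trust is inferred from network location alone."""
    return ipaddress.ip_address(req.source_ip) in PERIMETER


def zero_trust_allows(req: Request) -> bool:
    """Zero trust: the identity must hold an explicit grant for the action."""
    return (req.identity, req.action) in GRANTS


inside_but_unknown = Request("10.1.2.3", "", "delete-invoices")
outside_but_known = Request("203.0.113.7", "billing", "read-invoices")

print(perimeter_allows(inside_but_unknown))   # True
print(zero_trust_allows(inside_but_unknown))  # False
print(perimeter_allows(outside_but_known))    # False
print(zero_trust_allows(outside_but_known))   # True
```

Note that `zero_trust_allows` never looks at `source_ip`: location plays no part in the decision, and anything not explicitly granted is denied.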

Where Should This Happen?

While it’s possible to explicitly write code for all this into your application, it’s expensive and fragile. It means that every application developer has to get it perfectly right every time, and it means rebuilding the application whenever you want to update policies.

A smarter way to get everything you need is to install a service mesh.

What is a Service Mesh?

A service mesh is a layer of software that adds security, observability, and reliability features to your application at the platform level, letting your application developers focus on the business needs of your application. There are many service meshes to choose from, both open-source and commercial, but they all have this same purpose.

Meshes work by mediating and monitoring all communications between your application’s workloads, usually by using Kubernetes to route traffic through proxies inserted next to each of your application containers (which lets the proxies fit neatly into Kubernetes’ security guarantees). Typical capabilities provided by a service mesh include:

Automatic mTLS for all communications
Authentication and authorization of requests between services
Per-request load balancing (instead of Kubernetes’ native per-connection load balancing)
Automatic retry of failed requests
Automatic metric collection
and more.

A mesh can offer these capabilities without requiring you to modify or configure your application. Service meshes are powerful tools precisely because of their broad, low-level access to communications.
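The difference between per-connection and per-request load balancing is easy to miss, so here is a small simulation of one long-lived connection carrying nine requests. The replica names are made up, and plain round robin stands in for whatever algorithm a real mesh uses.

```python
from collections import Counter
from itertools import cycle

BACKENDS = ["pod-a", "pod-b", "pod-c"]  # hypothetical replica names


def per_connection(num_requests: int) -> Counter:
    """Pick a backend once, when the connection opens, and reuse it."""
    backend = BACKENDS[0]  # chosen at connect time, never revisited
    return Counter(backend for _ in range(num_requests))


def per_request(num_requests: int) -> Counter:
    """Pick a backend for every request (simple round robin)."""
    picker = cycle(BACKENDS)
    return Counter(next(picker) for _ in range(num_requests))


print(per_connection(9))  # Counter({'pod-a': 9})
print(per_request(9))     # Counter({'pod-a': 3, 'pod-b': 3, 'pod-c': 3})
```

With per-connection balancing, a client that keeps its connection open sends all of its traffic to one replica; balancing each request spreads the same traffic across all three.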

Service Meshes and Cloud-Native Trust

Different meshes approach the core components of cloud-native trust a bit differently. I’ll use Linkerd, the open source, CNCF-graduated service mesh, for specific examples, though all of the meshes provide these functions.

Identity

Every pod running in Kubernetes has an associated ServiceAccount, which meshes often use as the basis for identity. Linkerd, for example, uses the ServiceAccount’s unique token as the basis for issuing a Transport Layer Security (TLS) certificate that provides a safe identity tied only to the workload, not to the network. This TLS certificate allows Linkerd to use industry-standard mTLS to verify the identity of both ends of every connection and to protect data in transit.
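As a rough illustration of what "identity tied to the workload" looks like, the sketch below builds an identity name from a ServiceAccount and its namespace. The naming pattern follows Linkerd's scheme as I understand it, with `linkerd` and `cluster.local` assumed as defaults. The `workload_identity` helper and the `api-gateway` and `bank` names are hypothetical, so check the pattern against your own installation.

```python
def workload_identity(
    service_account: str,
    namespace: str,
    mesh_namespace: str = "linkerd",      # assumed default install namespace
    trust_domain: str = "cluster.local",  # assumed default trust domain
) -> str:
    """Build an identity name from the ServiceAccount, not from an IP address."""
    return (
        f"{service_account}.{namespace}"
        f".serviceaccount.identity.{mesh_namespace}.{trust_domain}"
    )


print(workload_identity("api-gateway", "bank"))
# api-gateway.bank.serviceaccount.identity.linkerd.cluster.local
```

Nothing in the name depends on which node or IP address the pod lands on, which is the point: the workload can be rescheduled anywhere and keep the same identity.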

Policy

The most common mechanism here is mesh-specific Kubernetes policy resources. With Linkerd, for example, you can use an AuthorizationPolicy resource to define policy down to the level of individual HTTP verbs and paths, allowing policies like “the API gateway can list the user’s bank accounts, but it can’t try to transfer funds.”

Enforcement

Most meshes, including Linkerd, handle a policy violation by the simple expedient of refusing the request or summarily dropping the entire connection.
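The bank example can be modeled in a few lines. This is a conceptual sketch rather than Linkerd's actual policy format: the `Rule` type, the `handle` function, and the paths are invented for illustration, and a 403 status stands in for "refusing the request".

```python
from typing import NamedTuple, Tuple


class Rule(NamedTuple):
    identity: str  # workload identity of the caller
    method: str    # HTTP verb
    path: str      # exact request path (real meshes also match patterns)


# Hypothetical allowlist: the gateway may list accounts and nothing else.
ALLOWED = {Rule("api-gateway", "GET", "/accounts")}


def handle(identity: str, method: str, path: str) -> Tuple[int, str]:
    """Check policy first; only authorized requests reach the application."""
    if Rule(identity, method, path) not in ALLOWED:
        return 403, "forbidden"  # enforcement: the request is refused
    return 200, f"{method} {path} ok"


print(handle("api-gateway", "GET", "/accounts"))    # (200, 'GET /accounts ok')
print(handle("api-gateway", "POST", "/transfers"))  # (403, 'forbidden')
print(handle("unknown", "GET", "/accounts"))        # (403, 'forbidden')
```

Because the allowlist is data rather than application code, it can change without rebuilding the application, which is what moving policy to the platform level buys you.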

Mesh Limitations

Service meshes aren’t silver bullets, of course. The most important thing to be aware of is that identity in the mesh is not the same thing as identity in your application: identity in the mesh is associated with a workload, rather than an end user. It’s sadly common to see systems that only worry about the user, but effective security means authenticating both the user and the workloads.
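One way to picture the two layers working together is a check that needs both identities before it says yes. The names here (`TRUSTED_CALLERS`, `may_view_account`) are hypothetical; in practice the workload identity would come from the mesh and the user identity from your application's own authentication.

```python
from typing import Optional

TRUSTED_CALLERS = {"api-gateway"}  # hypothetical workload identities


def may_view_account(workload: Optional[str], user: Optional[str], owner: str) -> bool:
    """Require a trusted workload (mesh layer) AND the right user (app layer)."""
    workload_ok = workload in TRUSTED_CALLERS
    user_ok = user is not None and user == owner
    return workload_ok and user_ok


print(may_view_account("api-gateway", "alice", owner="alice"))  # True
print(may_view_account("api-gateway", "bob", owner="alice"))    # False
print(may_view_account("rogue-pod", "alice", owner="alice"))    # False
```

A valid user arriving through an untrusted workload is refused, and so is a trusted workload acting for the wrong user.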

Zero Trust Kubernetes and the Service Mesh

Rethinking security for a cloud-native world is a tall order. We’re talking about changing how we manage identity, looking at policy separate from any application, and managing it all at the platform level so that the application developers don’t have to worry about it. This might be happening under deadline (US federal agencies, for example, have to get this done by 2024), and it will always be happening in a world where it’s essential to keep costs down and avoid interrupting critical services. Fortunately, in the world of Kubernetes, you can solve a lot of zero trust issues quickly and easily by adding a service mesh.

About the Author: This article was written by Flynn, Technology Evangelist at Buoyant.
