Yet Another Kubernetes Intro - Part 4 - Services

Ok… So far in this series I have covered “What is Kubernetes?”, Pods and ReplicaSets. This means that we should be able to get a basic Kubernetes cluster up and running (at least locally using Docker for Desktop), be able to define our pods, and finally get them up and running in a replicated manner using ReplicaSets. So that is kind of cool. However, we still haven’t covered any form of communication with, or between, the pods. Except for doing port forwarding using kubectl port-forward.

In K8s, pods are attached to an internal network inside the cluster. This means that, by default, we can’t really communicate with the pods from the outside. They are sort of “protected” inside the cluster. On top of that, pods get assigned pseudo random IP addresses during creation. This means that even talking between pods inside the cluster becomes a bit complicated, as we can’t depend on their IP addresses for communication. Instead, we need some other way to locate and communicate with our pods.

For quite obvious reasons, there is already a resource type available in Kubernetes to solve this particular issue. It is called a Service.

Services come in a few different flavors, but the general gist of a service is a fixed location in the cluster that we can use to communicate with the pods we require, in a load balanced fashion. And to make it even better, it even gives us a DNS entry, so that we can locate pods using DNS lookups instead of random IP addresses. The fixed endpoint is available for as long as the service resource exists. So even if you remove all pods “behind” the service, and then re-create them, the endpoint is still there for you to use.

The high level, technical overview

To me at least, it is important to understand a bit of how services work. When learning about services, it took me a while before I properly grasped that creating a Service resource in the cluster doesn’t actually mean that some form of virtual piece of infrastructure is being set up. Instead, when you create a service resource, Kubernetes will see this, and figure out what pods the requests should be forwarded to when the service is being targeted. This is done using a label selector, just as you have already seen with ReplicaSets. Once it has figured out what pods to “use”, it creates a set of resources called Endpoints. An endpoint is a resource that contains the IP address of an individual pod that is “used” by a specific service.

On the worker nodes, where the pods run, there is a piece of code running called kube-proxy. This code constantly monitors the cluster (using the REST API) for changes in the list of available services. Whenever it sees a change, it looks at the services, and their configuration, to figure out what routing needs to be set up. It then creates iptables rules to make sure that any traffic directed at the service is redirected to one of the pods in the service’s endpoints list, picked more or less at random.

Note: This is the default way that networking works in Kubernetes. However, as with a lot of parts in K8s, it can be replaced and modified. Another option is to use a service mesh like for example Istio. This replaces the implementation of the internal networking to enable a bunch of more advanced features in something called a service mesh.

On top of that, it also sets up a DNS name for the service. The full DNS name for a service looks like this: <SERVICE_NAME>.<NAMESPACE>.svc.<DOMAIN>. As you can see, it contains a few different parts. Let’s have a look at the individual pieces of the address:

SERVICE_NAME - the name of the service we are trying to reach

NAMESPACE - the name of the namespace in which the service is running. So far, I haven’t covered namespaces, but it is basically a way to separate and group resources in the cluster. The default namespace is called default

svc - defines that it is a service

DOMAIN - the domain name set for the cluster internally in the cluster. The default is cluster.local

So, let’s say that you have a service called my-svc, in the default namespace, in a cluster using the default domain. That ends up being my-svc.default.svc.cluster.local.

This gives us a very precise address to the service. However, it also gives us an annoyingly long address. Luckily, we can ignore a big part of it in most cases. The cluster already knows the name of the domain, so we can skip that. And it knows that it is a service we are looking for, so we can ignore svc. And as long as you want to talk to a service in the same namespace as the pod making the request, you can skip the namespace as well. So in most cases it is ok to just use my-svc, the name of the service we want to call.

If you do want to call a service in a different namespace, you can just go ahead and append the namespace to the address. So imagine that you wanted to call a service called customer-service in a namespace called e-commerce, but you were in a pod in the default namespace. Then you could just use the service name and the namespace name. So that would end up being customer-service.e-commerce.

Different types of services

Kubernetes has a few different types of services, all offering the idea of a central point to access a set of pods. The main difference between them is the way you can access the service.

ClusterIP service

The default service type is a ClusterIP service. This is a purely internal resource that gives the service a fixed, virtual IP address in the cluster. It also generates a DNS record, as mentioned above, allowing us to access the service through either a fixed IP address or through a DNS look-up. The service then proxies the requests to the selected pods in a load balanced way, using the rules set up by the kube-proxy.

To create a ClusterIP service, you can use a YAML-file that looks something like this

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

As you can see, this creates a service called my-service. It will proxy TCP traffic on port 80 to port 8080 on pods that match the label selector app=my-app.

Note: If you want to map several different ports (and/or protocols) you can just add more port definitions to the spec.
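For example, a hypothetical service exposing both HTTP and HTTPS traffic could look something like this (note that each port needs a unique name as soon as you define more than one):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
    - name: http       # names are required when defining multiple ports
      protocol: TCP
      port: 80
      targetPort: 8080
    - name: https
      protocol: TCP
      port: 443
      targetPort: 8443
```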

Note 2: If you do not specifically assign a spec.type attribute to the service (as seen later), it defaults to ClusterIP

Note 3: The fact that it uses label selectors makes it VERY flexible. It even allows you to send traffic to different types of pods using the same service, which could be really useful. But it could also be really bad if you didn’t intend it…

If you use the kubectl describe command, you can see more information about the service, including its endpoints

kubectl describe service my-service

Name:              my-service
Namespace:         default
Labels:            <none>
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"my-service","namespace":"default"},"spec":{"ports":[{"port":80,"p...
Selector:          app=my-app
Type:              ClusterIP
IP:                10.100.247.21
Port:              <unset>  80/TCP
TargetPort:        8080/TCP
Endpoints:         10.1.0.72:8080,10.1.0.73:8080,10.1.0.74:8080
Session Affinity:  None
Events:            <none>

As you can see, the service is assigned a pseudo random cluster IP address from a defined range of addresses. In this case, it ended up being 10.100.247.21. If you want to, you can specify the cluster IP in your config. But if you do, it has to be inside the defined address range, and you have to make sure that your services’ cluster IP addresses don’t collide.
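Just as a sketch, a config with a manually specified cluster IP could look like this (the address is only an example, and has to fall inside your cluster’s service address range):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  clusterIP: 10.100.247.21   # must be inside the cluster's service address range
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```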

As mentioned previously, any request to the cluster IP is redirected to one of the endpoints in a load balanced way.

In the above terminal output, you can also see 3 endpoints (10.1.0.72-10.1.0.74), representing the 3 running pods that match the label selector. If any pods are added or removed, the list of endpoints is updated automatically by a Kubernetes controller.

If you want to, you can even kubectl port-forward to the service using

kubectl port-forward service/my-service 8080:80

It’s worth noting that even though this works, it won’t actually load-balance across the pods. Instead, the port-forward command will take the first (I think) address from the endpoints list, and port-forward to that pod. So, if that pod goes down, the port-forwarding will fail, and you have to restart the command.

However, besides ClusterIP, we also have NodePort, LoadBalancer and ExternalName services. So, what do they actually do? Well, let’s have a very quick look at each one of them.

NodePort service

A NodePort service still gets a virtual cluster IP address, but on top of that it gets allocated a random port (between 30000 and 32767 by default) on each of the nodes in the cluster. This port is then “bound” to the service. This means that any request to any of the nodes in the cluster, on that port, is forwarded to one of the pods with a matching label selector.

Here is an example

apiVersion: v1
kind: Service
metadata:
  name: my-nodeport-service
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80

As you can see, all that needs to change to turn it into a NodePort service is the spec.type property, which is set to NodePort.

If I add this to my cluster, and then look at the service

kubectl get services

NAME                  TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes            ClusterIP   10.96.0.1      <none>        443/TCP        52d
my-nodeport-service   NodePort    10.104.1.179   <none>        80:30237/TCP   91s

we can see in the PORT(S) column that it has allocated port 30237. This means that I can reach this service by making calls to any of the nodes in the cluster on that port.

As I am running Docker for Desktop, I only have a single node, my Linux VM. So I should be able to make a request to that node on port 30237. And since Docker for Desktop does some magic that binds that port on localhost, I can make a request to http://localhost:30237/ and reach one of the pods behind my NodePort service.
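If you don’t want a random port, you can also pin the port yourself using the nodePort property (30080 below is just an example, and has to be inside the configured node port range):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-nodeport-service
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 30080   # must be inside the node port range (default 30000-32767)
```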

LoadBalancer service

The LoadBalancer service type is a very special type of service that actually requires external help. It is intended to be used in a cloud environment, where the cloud provider helps out and does some of the work to make it function.

When you create a LoadBalancer service (in a cloud environment), it creates a service in the cluster that is bound to an external address. So instead of binding it to the nodes in the cluster, like a NodePort, it gets a load balanced public address provisioned by the cloud provider. This means that we don’t need to manually set up a load balancer that load balances traffic to the nodes in the cluster, as we would if we used a NodePort. Instead, the cloud provider of choice can plug into K8s and do the work for you. So when you set up a LoadBalancer service, the provider goes ahead and creates all the virtual infrastructure you need to get a public, load balanced endpoint set up.

The config for a LoadBalancer service looks something like this

apiVersion: v1
kind: Service
metadata:
  name: my-loadbalanced-service
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80

The only real change is once again the spec.type property. Just as with the NodePort service.

This type of service is hard to try out on a local machine for obvious reasons. But if you are using AWS EKS or Azure Kubernetes Service for example, creating one of these services will (after a bit of a delay) give you a public IP address or a pseudo random DNS name that you can use to reach your service.

Note: There is additional configuration that can be added depending on the cloud provider you are using.
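As a sketch of what that can look like, on Azure you can for example request an internal (private) load balancer through an annotation. The annotation below is Azure-specific; other providers use their own annotations:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-loadbalanced-service
  annotations:
    # Azure-specific: provision an internal load balancer instead of a public one
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
```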

Note 2: It is a much better idea to use an Ingress than a service to front your application though. A LoadBalancer service targets a single set of pods, even if it is able to map multiple ports to them. An Ingress allows you to be MUCH more flexible, mapping different paths, headers etc to different services. However, Ingress is out of scope for this post. More to come in this area in the future!

ExternalName service

An ExternalName service allows you to use an internal service, but have it map to an external address. This allows you to configure whether a service is hosted internally in the cluster, or as an external resource. This could for example be useful when you run an external database server in production, but an internal pod in development. By using an ExternalName service, the containers using the database can just rely on the service to get hold of it, and won’t need to know where it is actually running.

The config for an ExternalName service looks like this

apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: prod
spec:
  type: ExternalName
  externalName: my.database.external.com

Once again, we switch the spec.type. This time we set it to ExternalName. However, for this service we also set it up to send requests to my.database.external.com when connecting to my-service. And if we ever wanted to change it to an internal pod, we could just set up a service called database fronting a database pod, and then change the externalName to database.

Note: The externalName property cannot be set to an IP address. It should be set to a DNS name.
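As a sketch of that switch (assuming a ClusterIP service called database exists in the prod namespace), the updated config could look like this. Since externalName effectively works as a DNS CNAME, the fully qualified in-cluster name is the safest target:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: prod
spec:
  type: ExternalName
  # Now points at an in-cluster service instead of the external host
  externalName: database.prod.svc.cluster.local
```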

Services without label selectors

It is possible to create a service without defining a label selector. However, a service like this will not do anything at all by default. As it has no selector, it does not have any pods to target. Because of this, it cannot have any endpoint resources added automatically. But… you can add some manually, allowing you to redirect traffic to any IP address you want, for example addresses belonging to resources that do not live inside the cluster. It could look like this

apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
---
apiVersion: v1
kind: Endpoints
metadata:
  name: my-svc
subsets:
  - addresses:
      - ip: 192.168.10.100
    ports:
      - port: 80
        protocol: TCP

where the name of the Endpoints resource corresponds with the name of the service to which it should be connected.

Note: You can add multiple addresses to an endpoint. This allows you to get some load balancing between the endpoints.
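A manually managed Endpoints resource with more than one address could look something like this (the IP addresses are just examples):

```yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: my-svc
subsets:
  - addresses:
      - ip: 192.168.10.100
      - ip: 192.168.10.101   # traffic is spread across the listed addresses
    ports:
      - port: 80
        protocol: TCP
```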

Headless services

When you set up a service, the default is to create a ClusterIP. This kind of service will, as mentioned before, give you a fixed, virtual IP address in the cluster with a DNS record attached. It will also load-balance your calls to the backing pods. However, if you want to go a little more low level, skip the load balancing, and get direct access to the pods, you can use a “headless service”.

A “headless service” gets a DNS name, and has endpoints assigned by a label selector. However, it doesn’t get a cluster IP that points to the kube-proxy, enabling load-balancing. Instead, when you call the DNS server, it returns the list of endpoint addresses, leaving it up to the client to select the address to use.

This allows us to find the IP addresses for all the pods, instead of communicating with a randomly selected pod out of the set. This can be really useful in scenarios where, for example, all pods need to communicate with each other.

Warning: A lot of DNS clients just use the first address returned. So if you want to use a headless service, make sure you don’t end up just sending traffic to a single pod…

You create a headless service by setting the spec.clusterIP setting to None in your configuration. Like this

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  clusterIP: None    # <-- This causes a headless service
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Service discovery

In almost all cases, we locate our services through DNS look-ups. DNS look-ups are a standardized way to locate things, and they work very well. On top of that, they keep our applications Kubernetes agnostic, which in turn makes them a bit more flexible. On the other hand, if you for some reason don’t want to, or can’t, use DNS look-ups to find the IP address of your service, Kubernetes actually adds the service information to your pods as environment variables.

For each active service in the cluster, the kubelet maps a set of variables starting with the format {SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT into your pod. The SVCNAME is the name of the service in upper case, with dashes replaced with underscores. So, for example, a service called my-service, mapping TCP port 80, would end up with a set of environment variables that look like this

MY_SERVICE_SERVICE_HOST=10.0.0.11
MY_SERVICE_SERVICE_PORT=80
MY_SERVICE_PORT=tcp://10.0.0.11:80
MY_SERVICE_PORT_80_TCP=tcp://10.0.0.11:80
MY_SERVICE_PORT_80_TCP_PROTO=tcp
MY_SERVICE_PORT_80_TCP_PORT=80
MY_SERVICE_PORT_80_TCP_ADDR=10.0.0.11

Note: The service has to be up and running before the pod is created. Any service created after the pod will not be available as environment variables.
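The naming transform is easy to reproduce. Here is a small shell sketch of how a service name turns into its environment variable prefix:

```shell
# Upper-case the service name and replace dashes with underscores
svc_name="my-service"
prefix=$(echo "$svc_name" | tr 'a-z' 'A-Z' | tr '-' '_')
echo "${prefix}_SERVICE_HOST"   # -> MY_SERVICE_SERVICE_HOST
```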

If you want to manually look up the current set of endpoints backing the service, you can do so by querying the Kubernetes API.

As the API is always available at http://kubernetes/ inside the cluster, making requests to it is fairly simple. Getting the endpoints would be a request using the format http://kubernetes/api/v1/namespaces/default/endpoints/my-service, where default is the namespace that I want to look in, and my-service is the name of the service that I am looking for. The response is a JSON document that looks something like this

{
    "kind": "Endpoints",
    "apiVersion": "v1",
    "metadata": {
        "name": "my-nodeport-service",
        "namespace": "default",
        "selfLink": "/api/v1/namespaces/default/endpoints/my-nodeport-service",
        "uid": "52f152c5-3a1b-11ea-8235-00155d0aa703",
        "resourceVersion": "948447",
        "creationTimestamp": "2020-01-18T17:52:41Z"
    },
    "subsets": [
        {
            "addresses": [
                {
                    "ip": "10.1.0.80",
                    "nodeName": "docker-desktop",
                    "targetRef": {
                        "kind": "Pod",
                        "namespace": "default",
                        "name": "hello-world-v1-2rnp4",
                        "uid": "967c3d46-3a1a-11ea-8235-00155d0aa703",
                        "resourceVersion": "948437"
                    }
                }
            ],
            "ports": [
                {
                    "port": 80,
                    "protocol": "TCP"
                }
            ]
        }
    ]
}

The addresses in the document can then be used to make requests straight to the pods. But I would still argue that this is an edge case, and that almost all of our communication can be handled by using the DNS version! Doing it manually means that we have to refresh the list of endpoints manually to make sure that pod endpoints haven’t changed etc…

Readiness probes

The final thing to cover in this post is readiness probes. A readiness probe is very similar to the liveness probes I mentioned in part 2. However, instead of causing a container to restart when the probe fails, its endpoint is removed from any service that forwards traffic to it. This allows us to keep pods out of the load balancing rotation during start-up, or have them removed when under too heavy load.

A readiness probe is defined in the pod definition like this

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: my-world
    version: '1.0'
spec:
  containers:
  - name: my-container
    image: zerokoll/helloworld
    readinessProbe:
      httpGet:
        path: /ready
        port: 80
      initialDelaySeconds: 15
      timeoutSeconds: 1

This sets up a readiness probe that will keep the pod out of load balancing rotation for at least the first 15 seconds. After that it starts sending HTTP GET requests to /ready to determine whether or not it is ready to accept traffic. If the response has a status code between 200 and 399, it is considered ready, and its address is added as an endpoint to any service where the label selector corresponds to the pod’s label set.

I think that was all I had to say about Kubernetes services right now. I hope it explained the subject in a way that made sense. I know there was a lot of detail in some areas, but I think it was details that were important. At least it was to me when learning about the subject.

To me, going a bit deeper, and understanding how things work under the hood, often makes it a lot easier to understand why something is not working, and gives you a better ability to figure out what you can actually do with the technology.

The fifth part of this series is available here.

zerokoll

Chris

Developer-Badass-as-a-Service at your service