• Kubernetes API Concepts
    • Standard API terminology
    • Efficient detection of changes
      • Watch bookmarks
    • Retrieving large result sets in chunks
    • Receiving resources as Tables
    • Alternate representations of resources
      • Protobuf encoding
    • Resource deletion
    • Dry run
      • Make a dry run request
      • Generated values
    • Server Side Apply
      • Introduction
      • Field Management
      • Conflicts
      • Managers
      • Apply and Update
      • Merge strategy
      • Custom Resources
      • Using Server-Side Apply in a controller
      • Comparison with Client Side Apply
      • API Endpoint
      • Clearing ManagedFields
      • Disabling the feature

    Kubernetes API Concepts

    This page describes common concepts in the Kubernetes API.

    The Kubernetes API is a resource-based (RESTful) programmatic interface provided via HTTP. It supports retrieving, creating, updating, and deleting primary resources via the standard HTTP verbs (POST, PUT, PATCH, DELETE, GET), includes additional subresources for many objects that allow fine-grained authorization (such as binding a pod to a node), and can accept and serve those resources in different representations for convenience or efficiency. It also supports efficient change notifications on resources via “watches” and consistent lists to allow other components to effectively cache and synchronize the state of resources.

    Standard API terminology

    Most Kubernetes API resource types are “objects” - they represent a concrete instance of a concept on the cluster, like a pod or namespace. A smaller number of API resource types are “virtual” - they often represent operations rather than objects, such as a permission check (use a POST with a JSON-encoded body of SubjectAccessReview to the subjectaccessreviews resource). All objects will have a unique name to allow idempotent creation and retrieval, but virtual resource types may not have unique names if they are not retrievable or do not rely on idempotency.

    Kubernetes generally leverages standard RESTful terminology to describe the API concepts:

    • A resource type is the name used in the URL (pods, namespaces, services)
    • All resource types have a concrete representation in JSON (their object schema) which is called a kind
    • A list of instances of a resource type is known as a collection
    • A single instance of the resource type is called a resource

    All resource types are either scoped by the cluster (/apis/GROUP/VERSION/) or to a namespace (/apis/GROUP/VERSION/namespaces/NAMESPACE/). A namespace-scoped resource type will be deleted when its namespace is deleted and access to that resource type is controlled by authorization checks on the namespace scope. The following paths are used to retrieve collections and resources:

    • Cluster-scoped resources:
      • GET /apis/GROUP/VERSION/RESOURCETYPE - return the collection of resources of the resource type
      • GET /apis/GROUP/VERSION/RESOURCETYPE/NAME - return the resource with NAME under the resource type
    • Namespace-scoped resources:
      • GET /apis/GROUP/VERSION/RESOURCETYPE - return the collection of all instances of the resource type across all namespaces
      • GET /apis/GROUP/VERSION/namespaces/NAMESPACE/RESOURCETYPE - return collection of all instances of the resource type in NAMESPACE
      • GET /apis/GROUP/VERSION/namespaces/NAMESPACE/RESOURCETYPE/NAME - return the instance of the resource type with NAME in NAMESPACE

    Since a namespace is a cluster-scoped resource type, you can retrieve the list of all namespaces with GET /api/v1/namespaces and details about a particular namespace with GET /api/v1/namespaces/NAME.

    Almost all object resource types support the standard HTTP verbs - GET, POST, PUT, PATCH, and DELETE. Kubernetes uses the term list to describe returning a collection of resources to distinguish from retrieving a single resource which is usually called a get.

    Some resource types will have one or more sub-resources, represented as sub paths below the resource:

    • Cluster-scoped subresource: GET /apis/GROUP/VERSION/RESOURCETYPE/NAME/SUBRESOURCE
    • Namespace-scoped subresource: GET /apis/GROUP/VERSION/namespaces/NAMESPACE/RESOURCETYPE/NAME/SUBRESOURCE

    The verbs supported for each subresource will differ depending on the object - see the API documentation for more information. It is not possible to access sub-resources across multiple resources - generally a new virtual resource type would be used if that becomes necessary.
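    The path patterns above can be sketched as a small helper. This is only an illustration: the function name resource_path and its parameters are hypothetical, and it assumes that the core group (empty group name) is served under /api, as in the /api/v1/namespaces example, while named groups are served under /apis/GROUP.

```python
from typing import Optional


def resource_path(
    group: str,
    version: str,
    resource_type: str,
    namespace: Optional[str] = None,
    name: Optional[str] = None,
    subresource: Optional[str] = None,
) -> str:
    """Build the URL path for a collection, a resource, or a subresource."""
    if subresource and not name:
        raise ValueError("a subresource requires a resource name")
    # Assumption: the core group ("") lives under /api, named groups under /apis.
    parts = ["api", version] if group == "" else ["apis", group, version]
    if namespace:
        # Namespace-scoped paths insert namespaces/NAMESPACE before the type.
        parts += ["namespaces", namespace]
    parts.append(resource_type)
    if name:
        parts.append(name)
    if subresource:
        parts.append(subresource)
    return "/" + "/".join(parts)
```

    Leaving out namespace for a namespace-scoped type yields the path that lists instances across all namespaces.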

    Efficient detection of changes

    To enable clients to build a model of the current state of a cluster, all Kubernetes object resource types are required to support consistent lists and an incremental change notification feed called a watch. Every Kubernetes object has a resourceVersion field representing the version of that resource as stored in the underlying database. When retrieving a collection of resources (either namespace or cluster scoped), the response from the server will contain a resourceVersion value that can be used to initiate a watch against the server. The server will return all changes (creates, deletes, and updates) that occur after the supplied resourceVersion. This allows a client to fetch the current state and then watch for changes without missing any updates. If a client's watch is disconnected, the client can start a new watch from the last returned resourceVersion, or perform a new collection request and begin again.

    For example:

    • List all of the pods in a given namespace.

        GET /api/v1/namespaces/test/pods
        ---
        200 OK
        Content-Type: application/json
        {
          "kind": "PodList",
          "apiVersion": "v1",
          "metadata": {"resourceVersion":"10245"},
          "items": […]
        }

    • Starting from resource version 10245, receive notifications of any creates, deletes, or updates as individual JSON objects.

        GET /api/v1/namespaces/test/pods?watch=1&resourceVersion=10245
        ---
        200 OK
        Transfer-Encoding: chunked
        Content-Type: application/json
        {
          "type": "ADDED",
          "object": {"kind": "Pod", "apiVersion": "v1", "metadata": {"resourceVersion": "10596", …}, …}
        }
        {
          "type": "MODIFIED",
          "object": {"kind": "Pod", "apiVersion": "v1", "metadata": {"resourceVersion": "11020", …}, …}
        }
        …

    A given Kubernetes server will only preserve a historical list of changes for a limited time. Clusters using etcd3 preserve changes in the last 5 minutes by default. When the requested watch operations fail because the historical version of that resource is not available, clients must handle the case by recognizing the status code 410 Gone, clearing their local cache, performing a list operation, and starting the watch from the resourceVersion returned by that new list operation. Most client libraries offer some form of standard tool for this logic. (In Go this is called a Reflector and is located in the k8s.io/client-go/tools/cache package.)
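    The recovery steps for 410 Gone can be sketched as a loop. This is a minimal sketch, not how any particular client library implements it: sync, list_fn, watch_fn, and the Gone exception are hypothetical names, the two callables stand in for real HTTP requests, and objects are assumed to carry metadata.name so they can be keyed in the cache.

```python
from typing import Callable, Dict, Iterable, Tuple


class Gone(Exception):
    """Stands in for an HTTP 410 Gone response to a watch request."""


def sync(
    list_fn: Callable[[], Tuple[str, Dict[str, dict]]],
    watch_fn: Callable[[str], Iterable[dict]],
    max_relists: int = 3,
) -> Tuple[Dict[str, dict], str]:
    """List, then watch from the list's resourceVersion; re-list on 410 Gone."""
    relists = 0
    while True:
        # A fresh list replaces whatever was cached before.
        resource_version, cache = list_fn()
        try:
            for event in watch_fn(resource_version):
                obj = event["object"]
                name = obj["metadata"]["name"]
                if event["type"] == "DELETED":
                    cache.pop(name, None)
                else:  # ADDED or MODIFIED
                    cache[name] = obj
                resource_version = obj["metadata"]["resourceVersion"]
            # The sketch stops when the stream ends; a long-running client
            # would instead start a new watch from resource_version.
            return cache, resource_version
        except Gone:
            # The requested history is no longer available: drop the cache
            # (by looping back to a new list) and start over.
            relists += 1
            if relists > max_relists:
                raise
```

    Passing the two operations in as callables keeps the retry logic separate from the transport, so it can be exercised with in-memory fakes.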

    Watch bookmarks

    FEATURE STATE: Kubernetes v1.16 beta. This feature is currently in a beta state, meaning:

    • The version names contain beta (e.g. v2beta3).
    • Code is well tested. Enabling the feature is considered safe. Enabled by default.
    • Support for the overall feature will not be dropped, though details may change.
    • The schema and/or semantics of objects may change in incompatible ways in a subsequent beta or stable release. When this happens, we will provide instructions for migrating to the next version. This may require deleting, editing, and re-creating API objects. The editing process may require some thought. This may require downtime for applications that rely on the feature.
    • Recommended for only non-business-critical uses because of potential for incompatible changes in subsequent releases. If you have multiple clusters that can be upgraded independently, you may be able to relax this restriction.
    • Please do try our beta features and give feedback on them! After they exit beta, it may not be practical for us to make more changes.

    To mitigate the impact of the short history window, Kubernetes introduced the concept of a bookmark watch event. It is a special kind of event that signals that all changes up to a given resourceVersion that the client is requesting have already been sent. The object returned in that event is of the type requested by the request, but only its resourceVersion field is set, e.g.:

        GET /api/v1/namespaces/test/pods?watch=1&resourceVersion=10245&allowWatchBookmarks=true
        ---
        200 OK
        Transfer-Encoding: chunked
        Content-Type: application/json
        {
          "type": "ADDED",
          "object": {"kind": "Pod", "apiVersion": "v1", "metadata": {"resourceVersion": "10596", ...}, ...}
        }
        ...
        {
          "type": "BOOKMARK",
          "object": {"kind": "Pod", "apiVersion": "v1", "metadata": {"resourceVersion": "12746"} }
        }

    Bookmark events can be requested by setting the allowWatchBookmarks=true option in watch requests, but clients shouldn’t assume that bookmarks are returned at any specific interval, nor may they assume that the server will send any bookmark event at all. Since version 1.16, the watch bookmarks feature is enabled by default.
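    A client might treat a bookmark as a checkpoint only: it advances the resourceVersion to resume from, but does not touch the cached objects. The sketch below assumes the watch stream arrives as one JSON event per line and that real objects carry metadata.name; consume_watch and its parameters are hypothetical names.

```python
import json
from typing import Dict, Iterable


def consume_watch(lines: Iterable[str], cache: Dict[str, dict], resource_version: str) -> str:
    """Apply watch events to cache and return the resourceVersion to resume from."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between events
        event = json.loads(line)
        obj = event["object"]
        if event["type"] == "BOOKMARK":
            # Only metadata.resourceVersion is set; there is nothing to cache.
            pass
        elif event["type"] == "DELETED":
            cache.pop(obj["metadata"]["name"], None)
        else:  # ADDED or MODIFIED
            cache[obj["metadata"]["name"]] = obj
        resource_version = obj["metadata"]["resourceVersion"]
    return resource_version
```

    Because the returned version can move forward even when no relevant object changed, a later reconnect is less likely to ask for history that the server has already discarded.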

    Retrieving large result sets in chunks

    On large clusters, retrieving the collection of some resource types may result in very large responses that can impact the server and client. For instance, a cluster may have tens of thousands of pods, each of which is 1-2 KB of encoded JSON. Retrieving all pods across all namespaces may result in a very large response (10-20MB) and consume a large amount of server resources. Starting in Kubernetes 1.9 the server supports the ability to break a single large collection request into many smaller chunks while preserving the consistency of the total request. Each chunk can be returned sequentially, which both reduces the size of each response and allows user-oriented clients to display results incrementally to improve responsiveness.

    To retrieve a single list in chunks, two new parameters limit and continue are supported on collection requests and a new field continue is returned from all list operations in the list metadata field. A client should specify the maximum results they wish to receive in each chunk with limit and the server will return up to limit resources in the result and include a continue value if there are more resources in the collection. The client can then pass this continue value to the server on the next request to instruct the server to return the next chunk of results. By continuing until the server returns an empty continue value the client can consume the full set of results.

    Like a watch operation, a continue token will expire after a short amount of time (by default 5 minutes) and return a 410 Gone if more results cannot be returned. In this case, the client will need to start from the beginning or omit the limit parameter.

    For example, if there are 1,253 pods on the cluster and the client wants to receive chunks of 500 pods at a time, they would request those chunks as follows:

    • List all of the pods on a cluster, retrieving up to 500 pods each time.

        GET /api/v1/pods?limit=500
        ---
        200 OK
        Content-Type: application/json
        {
          "kind": "PodList",
          "apiVersion": "v1",
          "metadata": {
            "resourceVersion":"10245",
            "continue": "ENCODED_CONTINUE_TOKEN",
            …
          },
          "items": […] // returns pods 1-500
        }

    • Continue the previous call, retrieving the next set of 500 pods.

        GET /api/v1/pods?limit=500&continue=ENCODED_CONTINUE_TOKEN
        ---
        200 OK
        Content-Type: application/json
        {
          "kind": "PodList",
          "apiVersion": "v1",
          "metadata": {
            "resourceVersion":"10245",
            "continue": "ENCODED_CONTINUE_TOKEN_2",
            …
          },
          "items": […] // returns pods 501-1000
        }

    • Continue the previous call, retrieving the last 253 pods.

        GET /api/v1/pods?limit=500&continue=ENCODED_CONTINUE_TOKEN_2
        ---
        200 OK
        Content-Type: application/json
        {
          "kind": "PodList",
          "apiVersion": "v1",
          "metadata": {
            "resourceVersion":"10245",
            "continue": "", // continue token is empty because we have reached the end of the list
            …
          },
          "items": […] // returns pods 1001-1253
        }

    Note that the resourceVersion of the list remains constant across each request, indicating the server is showing us a consistent snapshot of the pods. Pods that are created, updated, or deleted after version 10245 would not be shown unless the user makes a list request without the continue token. This allows clients to break large requests into smaller chunks and then perform a watch operation on the full set without missing any updates.
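    The chunked retrieval above amounts to a loop that follows the continue value until it comes back empty. The sketch below is illustrative only: list_all is a hypothetical helper, fetch_page stands in for the HTTP request, and make_fake_server is an in-memory stand-in whose tokens are plain offsets (real continue tokens are opaque). Handling of 410 Gone for an expired token is left out.

```python
from typing import Callable, List, Tuple


def list_all(fetch_page: Callable[[int, str], dict], limit: int) -> Tuple[List[dict], str]:
    """Collect every chunk of a list and return (items, resourceVersion)."""
    items: List[dict] = []
    token = ""
    while True:
        page = fetch_page(limit, token)
        items.extend(page["items"])
        token = page["metadata"].get("continue", "")
        if not token:
            # An empty continue value means the last chunk has been read.
            return items, page["metadata"]["resourceVersion"]


def make_fake_server(total: int):
    """Return (fetch_page, requests) for an in-memory collection of pods."""
    pods = [{"metadata": {"name": f"pod-{i}"}} for i in range(1, total + 1)]
    requests: List[str] = []

    def fetch_page(limit: int, token: str) -> dict:
        requests.append(token)
        start = int(token) if token else 0
        end = min(start + limit, total)
        return {
            "kind": "PodList",
            "apiVersion": "v1",
            "metadata": {
                "resourceVersion": "10245",
                "continue": str(end) if end < total else "",
            },
            "items": pods[start:end],
        }

    return fetch_page, requests
```

    With 1,253 pods and a limit of 500, the loop issues three requests, matching the example above; the returned resourceVersion is the one a client would then use to start a watch.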

    Receiving resources as Tables

    kubectl get produces a simple tabular representation of one or more instances of a particular resource type. In the past, clients were required to reproduce the tabular and describe output implemented in kubectl to perform simple lists of objects. A few limitations of that approach include non-trivial logic when dealing with certain objects. Additionally, types provided by API aggregation or third party resources are not known at compile time. This means that generic implementations had to be in place for types unrecognized by a client.

    In order to avoid potential limitations as described above, clients may request the Table representation of objects, delegating specific details of printing to the server. The Kubernetes API implements standard HTTP content type negotiation: passing an Accept header containing a value of application/json;as=Table;g=meta.k8s.io;v=v1beta1 with a GET call will request that the server return objects in the Table content type.

    For example:

    • List all of the pods on a cluster in the Table format.
        GET /api/v1/pods
        Accept: application/json;as=Table;g=meta.k8s.io;v=v1beta1
        ---
        200 OK
        Content-Type: application/json
        {
          "kind": "Table",
          "apiVersion": "meta.k8s.io/v1beta1",
          ...
          "columnDefinitions": [
            ...
          ]
        }

    For API resource types that do not have a custom Table definition on the server, a default Table response is returned by the server, consisting of the resource’s name and creationTimestamp fields.

        GET /apis/crd.example.com/v1alpha1/namespaces/default/resources
        ---
        200 OK
        Content-Type: application/json
        ...
        {
          "kind": "Table",
          "apiVersion": "meta.k8s.io/v1beta1",
          ...
          "columnDefinitions": [
            {
              "name": "Name",
              "type": "string",
              ...
            },
            {
              "name": "Created At",
              "type": "date",
              ...
            }
          ]
        }

    Table responses are available beginning in version 1.10 of the kube-apiserver. As such, not all API resource types will support a Table response, specifically when using a client against older clusters. Clients that must work against all resource types, or can potentially deal with older clusters, should specify multiple content types in their Accept header to support fallback to non-Tabular JSON:

        Accept: application/json;as=Table;g=meta.k8s.io;v=v1beta1, application/json
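    A client that sends this fallback header has to cope with either shape of response. The sketch below is one possible way to do that; to_rows is a hypothetical helper, and it assumes that a Table body carries its data in a rows list whose entries each have a cells list (those two field names do not appear in the examples above). For a plain list it falls back to the same two columns as the default Table.

```python
from typing import List, Tuple

# Accept header from the text: prefer a Table, fall back to plain JSON.
TABLE_ACCEPT = "application/json;as=Table;g=meta.k8s.io;v=v1beta1, application/json"


def to_rows(body: dict) -> Tuple[List[str], List[list]]:
    """Return (header, rows) for either a Table or a plain list response."""
    if body.get("kind") == "Table":
        header = [column["name"] for column in body["columnDefinitions"]]
        # Assumption: each entry of "rows" holds its values under "cells".
        rows = [row["cells"] for row in body.get("rows", [])]
        return header, rows
    # Fallback for servers or resource types without Table support.
    header = ["Name", "Created At"]
    rows = [
        [item["metadata"]["name"], item["metadata"].get("creationTimestamp")]
        for item in body.get("items", [])
    ]
    return header, rows
```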

    Alternate representations of resources

    By default Kubernetes returns objects serialized to JSON with content type application/json. This is the default serialization format for the API. However, clients may request the more efficient Protobuf representation of these objects for better performance at scale. The Kubernetes API implements standard HTTP content type negotiation: passing an Accept header with a GET call will request that the server return objects in the provided content type, while a PUT or POST call that sends an object in Protobuf to the server must declare that format in the Content-Type header. The server will return a Content-Type header if the requested format is supported, or the 406 Not Acceptable error if an invalid content type is provided.

    See the API documentation for a list of supported content types for each API.

    For example:

    • List all of the pods on a cluster in Protobuf format.
        GET /api/v1/pods
        Accept: application/vnd.kubernetes.protobuf
        ---
        200 OK
        Content-Type: application/vnd.kubernetes.protobuf
        ... binary encoded PodList object
    • Create a pod by sending Protobuf encoded data to the server, but request a response in JSON.
        POST /api/v1/namespaces/test/pods
        Content-Type: application/vnd.kubernetes.protobuf
        Accept: application/json
        ... binary encoded Pod object
        ---
        200 OK
        Content-Type: application/json
        {
          "kind": "Pod",
          "apiVersion": "v1",
          ...
        }

    Not all API resource types will support Protobuf, specifically those defined via Custom Resource Definitions or those that are API extensions. Clients that must work against all resource types should specify multiple content types in their Accept header to support fallback to JSON:

        Accept: application/vnd.kubernetes.protobuf, application/json

    Protobuf encoding

    Kubernetes uses an envelope wrapper to encode Protobuf responses. That wrapper starts with a 4 byte magic number to help identify content on disk or in etcd as Protobuf (as opposed to JSON), and then is followed by a Protobuf encoded wrapper message, which describes the encoding and type of the underlying object and then contains the object.

    The wrapper format is:

        A four byte magic number prefix:
          Bytes 0-3: "k8s\x00" [0x6b, 0x38, 0x73, 0x00]

        An encoded Protobuf message with the following IDL:
          message Unknown {
            // typeMeta should have the string values for "kind" and "apiVersion" as set on the JSON object
            optional TypeMeta typeMeta = 1;

            // raw will hold the complete serialized object in protobuf. See the protobuf definitions in the client libraries for a given kind.
            optional bytes raw = 2;

            // contentEncoding is encoding used for the raw data. Unspecified means no encoding.
            optional string contentEncoding = 3;

            // contentType is the serialization method used to serialize 'raw'. Unspecified means application/vnd.kubernetes.protobuf and is usually
            // omitted.
            optional string contentType = 4;
          }

          message TypeMeta {
            // apiVersion is the group/version for this type
            optional string apiVersion = 1;
            // kind is the name of the object schema. A protobuf definition should exist for this object.
            optional string kind = 2;
          }

    Clients that receive a response in application/vnd.kubernetes.protobuf that does not match the expected prefix should reject the response, as future versions may need to alter the serialization format in an incompatible way and will do so by changing the prefix.
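    The prefix check described above could look like the following sketch. The helper name strip_envelope_prefix is hypothetical; it only validates and removes the four magic bytes and leaves decoding of the Unknown message to a Protobuf library.

```python
# Bytes 0-3 of the envelope: "k8s\x00" [0x6b, 0x38, 0x73, 0x00]
MAGIC = b"k8s\x00"


def strip_envelope_prefix(data: bytes) -> bytes:
    """Return the encoded wrapper message that follows the magic number.

    Raises ValueError when the prefix does not match, so that a response in
    an unknown (possibly newer) serialization format is rejected.
    """
    if data[: len(MAGIC)] != MAGIC:
        raise ValueError("response does not start with the expected k8s prefix")
    return data[len(MAGIC):]
```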

    Resource deletion

    Resources are deleted in two phases: 1) finalization, and 2) removal.

        {
          "kind": "ConfigMap",
          "apiVersion": "v1",
          "metadata": {
            "finalizers": ["url.io/neat-finalization", "other-url.io/my-finalizer"],
            "deletionTimestamp": null
          }
        }

    When a client first deletes a resource, the .metadata.deletionTimestamp is set to the current time. Once the .metadata.deletionTimestamp is set, external controllers that act on finalizers may start performing their cleanup work at any time, in any order. Order is NOT enforced because it introduces significant risk of stuck .metadata.finalizers. .metadata.finalizers is a shared field; any actor with permission can reorder it. If the finalizer list is processed in order, then this can lead to a situation in which the component responsible for the first finalizer in the list is waiting for a signal (field value, external system, or other) produced by a component responsible for a finalizer later in the list, resulting in a deadlock. Without enforced ordering finalizers are free to order amongst themselves and are not vulnerable to ordering changes in the list.

    Once the last finalizer is removed, the resource is actually removed from etcd.
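    The two phases can be modelled with a small in-memory store. This is a toy model for illustration, not the API server's implementation: TinyStore and its methods are hypothetical names, and storage is just a dictionary.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable


class TinyStore:
    """Toy model of two-phase deletion: finalization, then removal."""

    def __init__(self) -> None:
        self.objects: Dict[str, dict] = {}

    def create(self, name: str, finalizers: Iterable[str]) -> None:
        self.objects[name] = {
            "metadata": {
                "name": name,
                "finalizers": list(finalizers),
                "deletionTimestamp": None,
            }
        }

    def delete(self, name: str) -> None:
        """Phase 1: mark the object as being deleted."""
        meta = self.objects[name]["metadata"]
        if meta["deletionTimestamp"] is None:
            meta["deletionTimestamp"] = datetime.now(timezone.utc).isoformat()
        self._remove_if_finalized(name)

    def remove_finalizer(self, name: str, finalizer: str) -> None:
        """Called by a controller once its cleanup is done, in any order."""
        meta = self.objects[name]["metadata"]
        if finalizer in meta["finalizers"]:
            meta["finalizers"].remove(finalizer)
        self._remove_if_finalized(name)

    def _remove_if_finalized(self, name: str) -> None:
        """Phase 2: drop the object once it is marked and has no finalizers."""
        meta = self.objects[name]["metadata"]
        if meta["deletionTimestamp"] is not None and not meta["finalizers"]:
            del self.objects[name]
```

    Note that removing a finalizer from an object that has not been deleted only edits the list; removal from storage happens only after deletionTimestamp is set.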

    Dry run

    FEATURE STATE: Kubernetes v1.13 beta

    In version 1.13, the dry run beta feature is enabled by default. The modifying verbs (POST, PUT, PATCH, and DELETE) can accept requests in a dry run mode. Dry run mode helps to evaluate a request through the typical request stages (admission chain, validation, merge conflicts) up until persisting objects to storage. The response body for the request is as close as possible to a non dry run response. The system guarantees that dry run requests will not be persisted in storage or have any other side effects.

    Make a dry run request

    Dry run is triggered by setting the dryRun query parameter. This parameter is a string, working as an enum, and in 1.13 the only accepted values are:

    • All: Every stage runs as normal, except for the final storage stage. Admission controllers are run to check that the request is valid, mutating controllers mutate the request, merge is performed on PATCH, fields are defaulted, and schema validation occurs. The changes are not persisted to the underlying storage, but the final object which would have been persisted is still returned to the user, along with the normal status code. If the request would trigger an admission controller which would have side effects, the request will be failed rather than risk an unwanted side effect. All built in admission control plugins support dry run. Additionally, admission webhooks can declare in their configuration object that they do not have side effects by setting the sideEffects field to “None”. If a webhook actually does have side effects, then the sideEffects field should be set to “NoneOnDryRun”, and the webhook should also be modified to understand the DryRun field in AdmissionReview, and prevent side effects on dry run requests.
    • Leave the value empty, which is also the default: Keep the default modifying behavior.

    For example:

        POST /api/v1/namespaces/test/pods?dryRun=All
        Content-Type: application/json
        Accept: application/json

    The response would look the same as for a non-dry-run request, but the values of some generated fields may differ.

    Generated values

    Some values of an object are typically generated before the object is persisted. It is important not to rely upon the values of these fields set by a dry run request, since these values will likely be different in dry run mode from when the real request is made. Some of these fields are:

    • name: if generateName is set, name will have a unique random name
    • creationTimestamp/deletionTimestamp: records the time of creation/deletion
    • UID: uniquely identifies the object and is randomly generated (non-deterministic)
    • resourceVersion: tracks the persisted version of the object
    • Any field set by a mutating admission controller
    • For the Service resource: Ports or IPs that kube-apiserver assigns to v1.Service objects
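    A client that compares a dry run response with a later real response might first add the dryRun parameter to the request URL and then drop the generated metadata fields before comparing. The sketch below is one way to do both; with_dry_run, without_generated, and GENERATED_METADATA are hypothetical names, and fields set by mutating admission controllers or assigned to Services are not covered.

```python
import copy
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Metadata fields from the list above that are generated on persistence.
GENERATED_METADATA = ("creationTimestamp", "deletionTimestamp", "uid", "resourceVersion")


def with_dry_run(url: str) -> str:
    """Return url with dryRun=All set, keeping other query parameters."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dryRun"
    ]
    query.append(("dryRun", "All"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def without_generated(obj: dict) -> dict:
    """Return a copy of obj without generated metadata values."""
    result = copy.deepcopy(obj)
    meta = result.get("metadata", {})
    for key in GENERATED_METADATA:
        meta.pop(key, None)
    if "generateName" in meta:
        # The name is random when generateName is set.
        meta.pop("name", None)
    return result
```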

    Server Side Apply

    FEATURE STATE: Kubernetes v1.16 beta

    Introduction

    Server Side Apply helps users and controllers manage their resources via declarative configurations. It allows them to create and/or modify their objects declaratively, simply by sending their fully specified intent.

    A fully specified intent is a partial object that only includes the fields and values for which the user has an opinion. That intent either creates a new object or is combined, by the server, with the existing object.

    The system supports multiple appliers collaborating on a single object.

    This model of specifying intent makes it difficult to remove existing fields. When a field is removed from one’s config and applied, the value will be kept (the system assumes that you don’t care about that value anymore). If an item is removed from a list or a map, it will be removed if no other appliers care about its presence.

    Changes to an object’s fields are tracked through a “field management” mechanism. When a field’s value changes, ownership moves from its current manager to the manager making the change. When trying to apply an object, fields that have a different value and are owned by another manager will result in a conflict. This is done in order to signal that the operation might undo another collaborator’s changes. Conflicts can be forced, in which case the value will be overridden, and the ownership will be transferred.

    Server Side Apply is meant both as a replacement for the original kubectl apply and as a simpler mechanism to write controllers.

    Field Management

    Compared to the last-applied annotation managed by kubectl, Server Side Apply uses a more declarative approach, which tracks a user’s field management, rather than a user’s last applied state. This means that as a side effect of using Server Side Apply, information about which field manager manages each field in an object also becomes available.

    For a user to manage a field, in the Server Side Apply sense, means that the user relies on and expects the value of the field not to change. The user who last made an assertion about the value of a field will be recorded as the current field manager. This can be done either by changing the value with POST, PUT, or non-apply PATCH, or by including the field in a config sent to the Server Side Apply endpoint. When using Server-Side Apply, trying to change a field which is managed by someone else will result in a rejected request (if not forced, see Conflicts).

    Field management is stored in a newly introduced managedFields field that is part of an object’s metadata.

    A simple example of an object created by Server Side Apply could look like this:

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: test-cm
          namespace: default
          labels:
            test-label: test
          managedFields:
          - manager: kubectl
            operation: Apply
            apiVersion: v1
            time: "2010-10-10T00:00:00Z"
            fieldsType: FieldsV1
            fieldsV1:
              f:metadata:
                f:labels:
                  f:test-label: {}
              f:data:
                f:key: {}
        data:
          key: some value

    The above object contains a single manager in metadata.managedFields. The manager consists of basic information about the managing entity itself, like operation type, API version, and the fields managed by it.
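    To see which fields an entry claims, the nested fieldsV1 structure can be flattened into paths. The sketch below handles only the f: (field name) keys that appear in the example above and ignores any other kind of key; managed_paths is a hypothetical helper, and the dotted path notation is just for display.

```python
from typing import List


def managed_paths(fields: dict, prefix: str = "") -> List[str]:
    """Flatten a fieldsV1 mapping into dotted paths of managed fields."""
    paths: List[str] = []
    for key, child in fields.items():
        if not key.startswith("f:"):
            continue  # key kinds other than "f:" are not handled in this sketch
        path = f"{prefix}.{key[2:]}"
        nested = managed_paths(child, path) if child else []
        # A field with no handled children is itself the managed leaf.
        paths.extend(nested or [path])
    return paths
```

    Applied to the fieldsV1 value of the example object, this yields .metadata.labels.test-label and .data.key.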

    Note: This field is managed by the apiserver and should not be changed by the user.

    Nevertheless it is possible to change metadata.managedFields through an Update operation. Doing so is highly discouraged, but might be a reasonable option to try if, for example, the managedFields get into an inconsistent state (which clearly should not happen).

    The format of the managedFields is described in the API.

    Conflicts

    A conflict is a special status error that occurs when an Apply operation tries to change a field, which another user also claims to manage. This prevents an applier from unintentionally overwriting the value set by another user. When this occurs, the applier has 3 options to resolve the conflicts:

    • Overwrite value, become sole manager: If overwriting the value was intentional (or if the applier is an automated process like a controller) the applier should set the force query parameter to true and make the request again. This forces the operation to succeed, changes the value of the field, and removes the field from all other managers’ entries in managedFields.
    • Don’t overwrite value, give up management claim: If the applier doesn’t care about the value of the field anymore, they can remove it from their config and make the request again. This leaves the value unchanged, and causes the field to be removed from the applier’s entry in managedFields.
    • Don’t overwrite value, become shared manager: If the applier still cares about the value of the field, but doesn’t want to overwrite it, they can change the value of the field in their config to match the value of the object on the server, and make the request again. This leaves the value unchanged, and causes the field’s management to be shared by the applier and all other field managers that already claimed to manage it.
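    The three options can be tried out on a toy model of field ownership. This sketch is a deliberate simplification and not the server's algorithm: ToyObject and Conflict are hypothetical names, the object is a flat mapping of field names to values, and a field that its only manager stops applying is removed, in the way described for items of a list or map.

```python
from typing import Dict, Set


class Conflict(Exception):
    """Raised when an apply would change a field managed by someone else."""

    def __init__(self, fields: Set[str]) -> None:
        super().__init__(f"conflict on: {sorted(fields)}")
        self.fields = fields


class ToyObject:
    def __init__(self) -> None:
        self.values: Dict[str, object] = {}
        self.managers: Dict[str, Set[str]] = {}  # manager -> managed fields

    def _owned_by_others(self, field: str, manager: str) -> bool:
        return any(
            field in owned
            for other, owned in self.managers.items()
            if other != manager
        )

    def apply(self, manager: str, config: Dict[str, object], force: bool = False) -> None:
        conflicts = {
            field
            for field, value in config.items()
            if field in self.values
            and self.values[field] != value
            and self._owned_by_others(field, manager)
        }
        if conflicts and not force:
            raise Conflict(conflicts)
        for field, value in config.items():
            if field in conflicts:
                # Forced: take the field away from every other manager.
                for other, owned in self.managers.items():
                    if other != manager:
                        owned.discard(field)
            self.values[field] = value
        # Fields this manager applied before but has now dropped.
        for field in self.managers.get(manager, set()) - set(config):
            if not self._owned_by_others(field, manager):
                self.values.pop(field, None)
        self.managers[manager] = set(config)
```

    Applying the same value as the live object makes the applier a shared manager, applying without the field gives up the claim, and force=True makes the applier the sole manager of the conflicting fields.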

    Managers

    Managers identify distinct workflows that are modifying the object (especially useful on conflicts!), and can be specified through the fieldManager query parameter as part of a modifying request. It is required for the apply endpoint, though kubectl will default it to kubectl. For other updates, its default is computed from the user-agent.

    Apply and Update

    The two operation types considered by this feature are Apply (PATCH with content type application/apply-patch+yaml) and Update (all other operations which modify the object). Both operations update the managedFields, but behave a little differently.

    For instance, only the apply operation fails on conflicts while update does not. Also, apply operations are required to identify themselves by providing a fieldManager query parameter, while the query parameter is optional for update operations. Finally, when using the apply operation you cannot have managedFields in the object that is being applied.
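    An Apply request therefore differs from other modifying requests in its verb, content type, and query parameters. The sketch below only builds such a request with the standard library and does not send it; build_apply_request is a hypothetical helper, and the server address used with it is a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_apply_request(
    server: str,
    path: str,
    body: str,
    field_manager: str,
    force: bool = False,
) -> Request:
    """Build (but do not send) a Server Side Apply PATCH request."""
    if not field_manager:
        raise ValueError("apply requests must provide a fieldManager")
    query = {"fieldManager": field_manager}
    if force:
        # Only needed to override conflicts with other managers.
        query["force"] = "true"
    return Request(
        url=f"{server}{path}?{urlencode(query)}",
        data=body.encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/apply-patch+yaml"},
    )
```

    The body would be the applied config without a managedFields entry, for example the ConfigMap from the earlier example with only the fields the applier has an opinion about.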

    An example object with multiple managers could look like this:

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: test-cm
          namespace: default
          labels:
            test-label: test
          managedFields:
          - manager: kubectl
            operation: Apply
            apiVersion: v1
            fieldsType: FieldsV1
            fieldsV1:
              f:metadata:
                f:labels:
                  f:test-label: {}
          - manager: kube-controller-manager
            operation: Update
            apiVersion: v1
            time: '2019-03-30T16:00:00.000Z'
            fieldsType: FieldsV1
            fieldsV1:
              f:data:
                f:key: {}
        data:
          key: new value

    In this example, a second operation was run as an Update by the manager called kube-controller-manager. The update changed a value in the data field which caused the field’s management to change to the kube-controller-manager.

    Note: If this update had been an Apply operation, it would have failed due to conflicting ownership.
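
    To see who manages a given field, a client can walk the managedFields entries. The sketch below assumes the object has already been decoded into a Python dictionary and uses the fields layout shown in the example; managers_of is a hypothetical helper.

```python
def managers_of(obj, path):
    """Return (manager, operation) pairs whose entry contains the field path.

    path is a tuple of keys such as ("f:data", "f:key").
    """
    owners = []
    for entry in obj["metadata"].get("managedFields", []):
        node = entry.get("fields", {})
        for key in path:
            if not isinstance(node, dict) or key not in node:
                break
            node = node[key]
        else:
            # Every key on the path was found in this manager's entry.
            owners.append((entry["manager"], entry["operation"]))
    return owners


config_map = {
    "metadata": {
        "name": "test-cm",
        "managedFields": [
            {"manager": "kubectl", "operation": "Apply",
             "fields": {"f:metadata": {"f:labels": {"f:test-label": {}}}}},
            {"manager": "kube-controller-manager", "operation": "Update",
             "fields": {"f:data": {"f:key": {}}}},
        ],
    },
    "data": {"key": "new value"},
}

print(managers_of(config_map, ("f:data", "f:key")))
```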

    Merge strategy

    The merging strategy implemented by Server Side Apply provides a generally more stable object lifecycle. Server Side Apply tries to merge fields based on who manages them, instead of overruling based on values alone. This is intended to make it easier and more stable for multiple actors to update the same object, by causing less unexpected interference.

    When a user sends a “fully-specified intent” object to the Server Side Apply endpoint, the server merges it with the live object favoring the value in the applied config if it is specified in both places. If the set of items present in the applied config is not a superset of the items applied by the same user last time, each missing item not managed by any other appliers is removed. For more information about how an object’s schema is used to make decisions when merging, see sigs.k8s.io/structured-merge-diff.
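
    The merge rule can be illustrated with a deliberately simplified model. This is a toy, not the structured-merge-diff algorithm: objects are flat dictionaries, ownership is tracked per top-level key, and the names apply and Conflict are invented for this sketch.

```python
class Conflict(Exception):
    """Raised when an apply would change a field managed by someone else."""


def apply(live, owners, manager, config, force=False):
    """Merge config into live; return the new (live, owners) pair."""
    live = dict(live)
    owners = {field: set(names) for field, names in owners.items()}

    # A conflict is a changed value on a field that another manager owns.
    conflicts = [f for f, v in config.items()
                 if f in live and live[f] != v and owners.get(f, set()) - {manager}]
    if conflicts and not force:
        raise Conflict(f"conflicting fields: {conflicts}")

    # The applied config wins for every field it specifies.
    for field, value in config.items():
        if field in conflicts:
            owners[field] = {manager}  # forced: become sole manager
        else:
            owners.setdefault(field, set()).add(manager)
        live[field] = value

    # Fields this manager applied before but omitted now are released,
    # and removed from the object if no other manager still owns them.
    for field in list(owners):
        if field not in config and manager in owners[field]:
            owners[field].discard(manager)
            if not owners[field]:
                del owners[field]
                live.pop(field, None)
    return live, owners


live, owners = apply({}, {}, "alice", {"a": 1, "b": 2})
live, owners = apply(live, owners, "alice", {"a": 1})
print(live)
```

    In this model the second apply by alice omits b, and because no other manager owns it, b is removed from the object.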

    Custom Resources

    By default, Server Side Apply treats custom resources as unstructured data. All keys are treated the same as struct fields, and all lists are considered atomic. If the validation field is specified in the Custom Resource Definition, it is used when merging objects of this type.

    Using Server-Side Apply in a controller

    As a developer of a controller, you can use server-side apply as a way to simplify the update logic of your controller. The main differences with a read-modify-write and/or patch are the following:

    • the applied object must contain all the fields that the controller cares about.
    • there is no way to remove fields that haven’t been applied by the controller before (the controller can still send a PATCH/UPDATE for these use cases).
    • the object doesn’t have to be read beforehand, and resourceVersion doesn’t have to be specified.

    It is strongly recommended for controllers to always “force” conflicts, since they might not be able to resolve or act on these conflicts.
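
    A controller's apply call might be assembled as follows. This is a sketch under stated assumptions: the API server address and the manager name are placeholders, authentication and TLS setup are omitted, the request is built but never sent, and the body is JSON (which is also valid YAML) describing every field the controller cares about.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_SERVER = "https://kubernetes.example:6443"  # placeholder address


def build_apply_request(path, intent, field_manager):
    """Build a forced server-side apply PATCH; no prior read is needed."""
    query = urlencode({"fieldManager": field_manager, "force": "true"})
    return Request(
        f"{API_SERVER}{path}?{query}",
        data=json.dumps(intent).encode("utf-8"),
        headers={"Content-Type": "application/apply-patch+yaml"},
        method="PATCH",
    )


# The full intent: no resourceVersion and no managedFields.
intent = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "test-cm", "namespace": "default"},
    "data": {"key": "new value"},
}
req = build_apply_request(
    "/api/v1/namespaces/default/configmaps/test-cm", intent, "my-controller"
)
print(req.get_method(), req.full_url)
```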

    Comparison with Client Side Apply

    A consequence of the conflict detection and resolution implemented by Server Side Apply is that an applier always has up to date field values in their local state. If they don’t, they get a conflict the next time they apply. Any of the three options to resolve conflicts results in the applied config being an up to date subset of the object on the server’s fields.

    This is different from Client Side Apply, where outdated values which have been overwritten by other users are left in an applier’s local config. These values only become accurate when the user updates that specific field, if ever, and an applier has no way of knowing whether their next apply will overwrite other users’ changes.

    Another difference is that an applier using Client Side Apply is unable to change the API version they are using, but Server Side Apply supports this use case.

    API Endpoint

    With the Server Side Apply feature enabled, the PATCH endpoint accepts the additional application/apply-patch+yaml content type. Users of Server Side Apply can send partially specified objects to this endpoint. An applied config should always include every field that the applier has an opinion about.

    Clearing ManagedFields

    It is possible to strip all managedFields from an object by overwriting them using MergePatch, StrategicMergePatch, JSONPatch or Update, that is, any non-apply operation. This can be done by overwriting the managedFields field with an empty entry. Two examples are:

    PATCH /api/v1/namespaces/default/configmaps/example-cm
    Content-Type: application/merge-patch+json
    Accept: application/json
    Data: {"metadata":{"managedFields": [{}]}}

    PATCH /api/v1/namespaces/default/configmaps/example-cm
    Content-Type: application/json-patch+json
    Accept: application/json
    Data: [{"op": "replace", "path": "/metadata/managedFields", "value": [{}]}]

    This will overwrite the managedFields with a list containing a single empty entry that then results in the managedFields being stripped entirely from the object. Note that just setting the managedFields to an empty list will not reset the field. This is on purpose, so managedFields never get stripped by clients not aware of the field.
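
    The two reset bodies can be produced programmatically, for instance when scripting a cleanup. The helper names below are hypothetical; the payloads are the same as in the requests shown above, and is_reset_value only restates the rule that a single empty entry resets the field while an empty list does not.

```python
import json


def merge_patch_reset():
    """Body for Content-Type: application/merge-patch+json."""
    return json.dumps({"metadata": {"managedFields": [{}]}})


def json_patch_reset():
    """Body for Content-Type: application/json-patch+json."""
    return json.dumps(
        [{"op": "replace", "path": "/metadata/managedFields", "value": [{}]}]
    )


def is_reset_value(managed_fields):
    # Only a list holding one empty entry triggers the reset; [] does not.
    return managed_fields == [{}]


print(merge_patch_reset())
print(json_patch_reset())
```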

    In cases where the reset operation is combined with changes to other fields than the managedFields, this will result in the managedFields being reset first and the other changes being processed afterwards. As a result the applier takes ownership of any fields updated in the same request.

    Disabling the feature

    Server Side Apply is a beta feature, so it is enabled by default. To turn this feature gate off, you need to include the --feature-gates ServerSideApply=false flag when starting kube-apiserver. If you have multiple kube-apiserver replicas, all should have the same flag setting.
