November 2, 2022

OpenShift 4.6 Automation and Integration: Recovering Failed Worker Nodes

Node Status

$ oc get nodes <NODE>

$ oc adm top node <NODE>

$ oc describe node <NODE> | grep -i taint

OpenShift Taint Effects

3.6.1. Understanding taints and tolerations
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/nodes/index#nodes-scheduler-taints-tolerations-about_nodes-scheduler-taints-tolerations

  • PreferNoSchedule
  • NoSchedule
  • NoExecute

apiVersion: v1
kind: Node
metadata:
  annotations:
    machine.openshift.io/machine: openshift-machine-api/ci-ln-62s7gtb-f76d1-v8jxv-master-0
    machineconfiguration.openshift.io/currentConfig: rendered-master-cdc1ab7da414629332cc4c3926e6e59c
...
spec:
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
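
Taints can also be added to or removed from a node with oc adm taint; a quick sketch (node name, key, and value are placeholders, the trailing dash removes the taint):

$ oc adm taint nodes worker01 key1=value1:NoSchedule

$ oc adm taint nodes worker01 key1=value1:NoSchedule-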

Worker Node Not Ready

$ oc describe node/worker01
...output omitted...
Taints:             node.kubernetes.io/not-ready:NoExecute
                    node.kubernetes.io/not-ready:NoSchedule
...
Ready       False   ...     KubeletNotReady        [container runtime is down...
$ ssh core@worker01 "sudo systemctl is-active crio"

$ ssh core@worker01 "sudo systemctl start crio"

$ oc describe node/worker01 | grep -i taints

Worker Node Storage Exhaustion

3.6.1. Understanding taints and tolerations
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/nodes/index#nodes-scheduler-taints-tolerations-about_nodes-scheduler-taints-tolerations

node.kubernetes.io/disk-pressure: The node has disk pressure issues. This corresponds to the node condition DiskPressure=True.

$ oc describe node/worker01 
...
Taints:             node.kubernetes.io/disk-pressure:NoSchedule
                    node.kubernetes.io/disk-pressure:NoExecute
...

Worker Node Capacity

$ oc get pod -o wide
NAME             READY   STATUS    ...  NODE      ...
diskuser-4cfdd   0/1     Pending   ...  <none>    ...
diskuser-ck4df   0/1     Evicted   ...  worker02  ...

$ oc describe node/worker01
...output omitted...
Taints:             node.kubernetes.io/not-ready:NoSchedule
...
Conditions:
  Type             Status  ...   Reason                       ...
  ----             ------  ...   ------                       ...
  DiskPressure     True    ...   KubeletHasDiskPressure       ...

Worker Node Unreachable

3.6.1. Understanding taints and tolerations
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/nodes/index#nodes-scheduler-taints-tolerations-about_nodes-scheduler-taints-tolerations

node.kubernetes.io/unreachable: The node is unreachable from the node controller. This corresponds to the node condition Ready=Unknown.

$ ssh core@worker02 "sudo systemctl is-active kubelet" 

$ ssh core@worker02 "sudo systemctl start kubelet" 

OpenShift 4.6 Automation and Integration: Kibana

Filtering Queries

12.3. Kubernetes exported fields
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/logging/index#cluster-logging-exported-fields-kubernetes_cluster-logging-exported-fields

These are the Kubernetes fields exported by the OpenShift Container Platform cluster logging available for searching from Elasticsearch and Kibana.

hostname The hostname of the OpenShift node that generated the message.
kubernetes.flat_labels The labels of the pod that generated the message. Format: key=value
kubernetes.container_name The name of the container in Kubernetes.
kubernetes.namespace_name The name of the namespace in Kubernetes.
kubernetes.pod_name The name of the pod that generated the log message.
level The log level of the message.
message The actual log message.

Example Lucene query:

+kubernetes.namespace_name:"openshift-etcd" +message:elected

Finding OpenShift Event Logs

kubernetes.event  
kubernetes.event.involvedObject.name The name of the resource involved in the event.
kubernetes.event.involvedObject.namespace The namespace of the resource involved in the event.
kubernetes.event.reason The reason for the event. Corresponds to the values in the REASON column in the output of the oc get events command.
kubernetes.event.type The type of message, e.g. kubernetes.event.type:warning
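
A hypothetical Lucene query combining these fields, showing only warning events for one namespace (the namespace name is just an example):

+kubernetes.event.involvedObject.namespace:"openshift-monitoring" +kubernetes.event.type:warning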

Visualizing Time Series with Timelion

Timelion Tutorial – From Zero to Hero
https://www.elastic.co/blog/timelion-tutorial-from-zero-to-hero

.es('+kubernetes.namespace_name:logging-query +message:200'),
.es('+kubernetes.namespace_name:logging-query +message:404'),
.es('+kubernetes.namespace_name:logging-query +message:500')

.es('+kubernetes.container_name:logger +message:500')
.divide(.es('+kubernetes.container_name:logger +message:*'))
.multiply(100)

.es('+kubernetes.container_name:logger +message:500').label(current),
.es(q='+kubernetes.container_name:logger +message:500', offset=-5m).label(previous)

Troubleshooting cluster logging

Chapter 10. Troubleshooting cluster logging
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/logging/index#troubleshooting-cluster-logging

$ oc get -n openshift-logging clusterlogging instance -o yaml

apiVersion: logging.openshift.io/v1
kind: ClusterLogging
....
status:  
...
  logstore:
    elasticsearchStatus:
    - ShardAllocationEnabled:  all
      cluster:
        activePrimaryShards:    5
        activeShards:           5
        initializingShards:     0
        numDataNodes:           1
        numNodes:               1
        pendingTasks:           0
        relocatingShards:       0
        status:                 green
        unassignedShards:       0
      clusterName:             elasticsearch
...

Using Grafana

Monitoring -> Dashboards:

Dashboards: Kubernetes / Compute Resources / Node (Pods)
Namespace: openshift-logging

Using Kibana

Infra index
+kubernetes.namespace_name:openshift-logging +kubernetes.container_name:

OpenShift 4.6 Automation and Integration: Cluster Logging

Overview

1.1.8. About cluster logging components
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/logging/index#cluster-logging-about-components_cluster-logging

The major components of cluster logging are:

LogStore

The logStore is the Elasticsearch cluster that

  • Stores the logs into indexes.
  • Provides RBAC access to the logs.
  • Provides data redundancy.

Collection

Implemented with Fluentd. By default, the log collector uses the following sources:

  • journald for all system logs
  • /var/log/containers/*.log for all container logs

The logging collector is deployed as a daemon set that deploys pods to each OpenShift Container Platform node.

Visualization

This is the UI component you can use to view logs, graphs, charts, and so forth. The current implementation is Kibana.

Event Routing

The Event Router is a pod that watches OpenShift Container Platform events so they can be collected by cluster logging. The Event Router collects events from all projects and writes them to STDOUT. Fluentd collects those events and forwards them into the OpenShift Container Platform Elasticsearch instance. Elasticsearch indexes the events to the infra index.

You must manually deploy the Event Router.

Installing cluster logging

Chapter 2. Installing cluster logging https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/logging/index#cluster-logging-deploying

Install the OpenShift Elasticsearch Operator

namespace: openshift-operators-redhat

Install the Cluster Logging Operator

namespace: openshift-logging

Deploying a Cluster Logging Instance

This default cluster logging configuration should support a wide array of environments.

apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  logStore:
    type: "elasticsearch"
    retentionPolicy:
      application:
        maxAge: 1d
      infra:
        maxAge: 7d
      audit:
        maxAge: 7d
    elasticsearch:
      nodeCount: 3
      storage:
        storageClassName: "<storage-class-name>"
        size: 200G
      resources:
        limits:
          memory: "16Gi"
        requests:
          memory: "16Gi"
      proxy:
        resources:
          limits:
            memory: 256Mi
          requests:
            memory: 256Mi
      redundancyPolicy: "SingleRedundancy"
  visualization:
    type: "kibana"
    kibana:
      replicas: 1
  curation:
    type: "curator"
    curator:
      schedule: "30 3 * * *"
  collection:
    logs:
      type: "fluentd"
      fluentd: {}

Verify

$ oc get clusterlogging -n openshift-logging instance -o yaml

Install the Event Router

7.1. Deploying and configuring the Event Router
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/logging/index#cluster-logging-eventrouter-deploy_cluster-logging-curator

Creating Kibana Index Patterns

Index Pattern: app-*
Time Filter Field Name: @timestamp

Index Pattern: infra-*
Time Filter Field Name: @timestamp

Index Pattern: audit-*
Time Filter Field Name: @timestamp

OpenShift 4.6 Automation and Integration: Cluster Monitoring and Metrics

Overview Monitoring

1.2. Understanding the monitoring stack
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/monitoring/index#understanding-the-monitoring-stack_monitoring-overview

Alertmanager

5.7. Applying a custom Alertmanager configuration
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/monitoring/index#applying-custom-alertmanager-configuration_managing-alerts

$ oc extract secret/alertmanager-main --to /tmp/ -n openshift-monitoring --confirm

OCP Web Console

Navigate to Administration -> Cluster Settings -> Global Configuration -> Alertmanager -> YAML.

global:
  resolve_timeout: 5m
route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: default
  routes:
  - match:
      alertname: Watchdog
    repeat_interval: 5m
    receiver: watchdog
receivers:
- name: default
- name: watchdog

Sending Alerts to Email

global:
  resolve_timeout: 5m
  smtp_smarthost: "mail.mkk.se:25"
  smtp_from: alerts@ocp4.mkk.se
  smtp_auth_username: mail_username
  smtp_auth_password: mail_password
  smtp_require_tls: false
route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: default
  routes:
    - match:
        alertname: Watchdog
      repeat_interval: 5m
      receiver: watchdog
    - match:
        severity: critical
      receiver: email-notification
receivers:
  - name: default
  - name: watchdog
  - name: email-notification
    email_configs:
      - to: ocp-admins@mkk.se

$ oc set data secret/alertmanager-main -n openshift-monitoring --from-file=/tmp/alertmanager.yaml

$ oc logs -f -n openshift-monitoring alertmanager-main-0 -c alertmanager

Grafana

Grafana includes the following default dashboards:

etcd Information on etcd in cluster.
Kubernetes / Compute Resources / Cluster High-level view of cluster resources.
Kubernetes / Compute Resources / Namespace (Pods) Resource usage for pods per namespace.
Kubernetes / Compute Resources / Namespace (Workloads) Resource usage per namespace and then by workload type, such as deployment, daemonset, and statefulset.
Kubernetes / Compute Resources / Node (Pods) Resource usage per node.
Kubernetes / Compute Resources / Pod Resource usage for individual pods.
Kubernetes / Compute Resources / Workload Resources usage per namespace, workload, and workload type.
Kubernetes / Networking / Cluster Network usage in the cluster.
Prometheus Information about prometheus-k8s pods running in the openshift-monitoring namespace.
USE Method / Cluster Utilization, Saturation, and Errors (USE) for the cluster.

Persistent Storage

Configuring Prometheus Persistent Storage

2.8.2. Configuring a local persistent volume claim
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/monitoring/index#configuring-a-local-persistent-volume-claim_configuring-the-monitoring-stack

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 15d
      volumeClaimTemplate:
        spec:
          storageClassName: local-storage
          volumeMode: Filesystem
          resources:
            requests:
              storage: 40Gi

Configuring Alert Manager Persistent Storage

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    alertmanagerMain:
      volumeClaimTemplate:
        spec:
          storageClassName: local-storage
          volumeMode: Filesystem
          resources:
            requests:
              storage: 20Gi

$ oc exec -it prometheus-k8s-0 -c prometheus -n openshift-monitoring -- ls -l /prometheus

$ oc exec -it prometheus-k8s-0 -c prometheus -n openshift-monitoring -- df -h /prometheus

OpenShift 4.6 Automation and Integration: Storage

Overview

3.1. Persistent storage overview
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/storage/index#persistent-storage-overview_understanding-persistent-storage

The OpenShift storage architecture has three primary components:

  • Storage Classes
  • Persistent Volumes
  • Persistent Volume Claims

Persistent Volume Claims (pvc)

The project defines a PVC with the following (see the example manifest after this list):

  • Storage Size: [G|Gi...]
  • Storage Class:
  • Access Mode: [ReadWriteMany|ReadWriteOnce|ReadOnlyMany]
  • Volume Mode: [Filesystem|Block|Object]
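
A minimal PVC manifest illustrating these fields; the claim name is hypothetical, and the storage class matches the NFS persistent volume example in the next section:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  storageClassName: nfs-storage
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 5Gi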

Persistent Volume (pv)

4.11. Persistent storage using NFS
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/storage/index#persistent-storage-using-nfs

Example Persistent Volume

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 5Gi
  storageClassName: nfs-storage
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  nfs:
    path: /tmp
    server: 172.17.0.2
  persistentVolumeReclaimPolicy: Retain

This persistent volume uses the NFS volume plug-in. The nfs section defines parameters that the NFS volume plug-in requires to mount the volume on a node. This section includes sensitive NFS configuration information.

Provisioning and Binding Persistent Volumes

  • Install a storage operator
  • Write and use Ansible Playbooks

Persistent Volume Reclaim Policy

3.2.6. Reclaim policy for persistent volumes
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/storage/index#reclaiming_understanding-persistent-storage

  • Delete: reclaim policy deletes both the PersistentVolume object from OpenShift Container Platform and the associated storage asset in external infrastructure, such as AWS EBS or VMware vSphere. All dynamically-provisioned persistent volumes use a Delete reclaim policy.
  • Retain: Reclaim policy allows manual reclamation of the resource for those volume plug-ins that support it.
  • Recycle: Reclaim policy recycles the volume back into the pool of unbound persistent volumes once it is released from its claim.

Supported access modes for PVs

Table 3.2. Supported access modes for PVs
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/storage/index#pv-access-modes_understanding-persistent-storage

Available dynamic provisioning plug-ins

7.2. Available dynamic provisioning plug-ins
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/storage/index#available-plug-ins_dynamic-provisioning

Setting a Default Storage Class

7.3.2. Storage class annotations
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/storage/index#storage-class-annotations_dynamic-provisioning

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
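
The annotation can also be applied to an existing storage class with oc patch; the storage class name is a placeholder:

$ oc patch storageclass <storage-class-name> \
     -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'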

Restricting Access to Storage Resources

5.1.1. Resources managed by quotas
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/applications/index#quotas-resources-managed_quotas-setting-per-project

requests.storage The sum of storage requests across all persistent volume claims in any state cannot exceed this value.
persistentvolumeclaims The total number of persistent volume claims that can exist in the project.
<storage-class-name>.storageclass.storage.k8s.io/requests.storage The sum of storage requests across all persistent volume claims in any state that have a matching storage class cannot exceed this value.
<storage-class-name>.storageclass.storage.k8s.io/persistentvolumeclaims The total number of persistent volume claims with a matching storage class that can exist in the project.
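
A sketch of a ResourceQuota using these fields, assuming a project limit of five claims, 50Gi of total requests, and tighter limits for an nfs-storage class (all names and values are examples):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
spec:
  hard:
    persistentvolumeclaims: "5"
    requests.storage: "50Gi"
    nfs-storage.storageclass.storage.k8s.io/requests.storage: "20Gi"
    nfs-storage.storageclass.storage.k8s.io/persistentvolumeclaims: "2"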

Block Volume

3.5.1. Block volume examples
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/storage/index#block-volume-examples_understanding-persistent-storage

apiVersion: v1
kind: PersistentVolume
metadata:
  name: block-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  persistentVolumeReclaimPolicy: Retain
  fc:
    targetWWNs: ["50060e801049cfd1"]
    lun: 0
    readOnly: false
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-block-volume
spec:
  containers:
    - name: fc-container
      image: fedora:26
      command: ["/bin/sh", "-c"]
      args: [ "tail -f /dev/null" ]
      volumeDevices: 
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: block-pvc

Persistent storage using iSCSI

4.9. Persistent storage using iSCSI
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/storage/index#persistent-storage-using-iscsi

PersistentVolume object definition

apiVersion: v1
kind: PersistentVolume
metadata:
  name: iscsi-pv
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  storageClassName: iscsi-blk
  accessModes:
    - ReadWriteOnce
  iscsi:
    targetPortal: 10.0.0.1:3260
    iqn: iqn.2016-04.test.com:storage.target00
    lun: 0
    initiatorName: iqn.2016-04.test.com:custom.iqn
    fsType: ext4
    readOnly: false

Persistent storage using local volumes

Installing the Local Storage Operator

4.10.1. Installing the Local Storage Operator
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/storage/index#local-storage-install_persistent-storage-local

$ oc debug node/worker06 -- lsblk
...
vdb    252:16   0   20G  0 disk

$ oc adm new-project openshift-local-storage

$ OC_VERSION=$(oc version -o yaml | grep openshiftVersion | \
    grep -o '[0-9]*[.][0-9]*' | head -1)

apiVersion: operators.coreos.com/v1alpha2
kind: OperatorGroup
metadata:
  name: local-operator-group
  namespace: openshift-local-storage
spec:
  targetNamespaces:
    - openshift-local-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: local-storage-operator
  namespace: openshift-local-storage
spec:
  channel: "${OC_VERSION}"
  installPlanApproval: Automatic
  name: local-storage-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace

$ oc apply -f openshift-local-storage.yaml

Verify installation

$ oc -n openshift-local-storage get pods

$ oc get csv -n openshift-local-storage
NAME                                         DISPLAY         VERSION               REPLACES   PHASE
local-storage-operator.4.2.26-202003230335   Local Storage   4.2.26-202003230335              Succeeded

Provisioning local volumes by using the Local Storage Operator

$ export CSV_NAME=$(oc get csv -n openshift-local-storage -o name)

$ oc get ${CSV_NAME} -o jsonpath='{.spec.customresourcedefinitions.owned[*].kind}{"\n"}'
LocalVolume LocalVolumeSet LocalVolumeDiscovery LocalVolumeDiscoveryResult

$ oc get ${CSV_NAME} -o jsonpath='{.metadata.annotations.alm-examples}{"\n"}'
[
  {
    "apiVersion": "local.storage.openshift.io/v1",
    "kind": "LocalVolume",
    "metadata": {
      "name": "example"
    },
    "spec": {
      "storageClassDevices": [
        {
          "devicePaths": [
              "/dev/vde",
              "/dev/vdf"
          ],
          "fsType": "ext4",
          "storageClassName": "foobar",
          "volumeMode": "Filesystem"
        }
      ]
    }
  }
  ...
]
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-storage
spec:
  storageClassDevices:
  - devicePaths:
    - /dev/vdb
    fsType: ext4
    storageClassName: local-blk
    volumeMode: Filesystem
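
After applying the LocalVolume (the file name below is an assumption), the operator should create the storage class and one PV per matching device; a quick check:

$ oc apply -f local-volume.yaml -n openshift-local-storage

$ oc get sc local-blk

$ oc get pv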

OpenShift 4.6 Automation and Integration: Machine Config Pool and Machine Config

Introduction

1.4. About Red Hat Enterprise Linux CoreOS (RHCOS) and Ignition
1.2. About the control plane
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/architecture/index#coreos-and-ignition

Red Hat discourages directly manipulating a RHCOS configuration. Instead, provide initial instance configuration in the form of Ignition files.

After the instance is provisioned, changes to RHCOS are managed by the Machine Config Operator.

7.2.2. Creating a machine set
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/machine_management/index#machineset-creating_creating-infrastructure-machinesets

4.2.7. Customization
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/security_and_compliance/index#customization-2

Example MachineConfig (mc)

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: infra
  name: 50-foo-config
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,LS0t...LQo=
        filesystem: root
        mode: 0644
        path: /etc/foo-config
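
The base64 payload in source is just the file content encoded into a data URI; a rough sketch of producing it from a local file (the path is hypothetical):

$ base64 -w0 ./foo-config

Paste the output after data:text/plain;charset=utf-8;base64, in the MachineConfig source field.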

7.2.4. Creating a machine config pool for infrastructure machines
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/machine_management/index#creating-infra-machines_creating-infrastructure-machinesets

Example MachineConfigPool (mcp)

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: infra
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,infra]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/infra: ""

$ oc get mcp

$ oc get mc --show-labels

$ oc get mc --selector=machineconfiguration.openshift.io/role=infra

Label Nodes

Add a label to worker node

$ oc label node/worker03 node-role.kubernetes.io/infra=

Remove label from worker node

$ oc label node/worker03 node-role.kubernetes.io/infra-

Configuring Pod Scheduling

7.4. Moving resources to infrastructure machine sets
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/machine_management/index#moving-resources-to-infrastructure-machinesets

3.7. Placing pods on specific nodes using node selectors
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/nodes/index#nodes-scheduler-node-selectors

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
    spec:
      nodeSelector:
        node-role.kubernetes.io/infra: ""
      containers:
...

4.1.2. Creating daemonsets
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/nodes/index

If you fail to debug a node, it could be because a defaultNodeSelector is defined; in that case, override the default by creating a project with an empty node selector and debugging from there.

$ oc adm new-project debug --node-selector=""
$ oc debug node/master03 -n debug

Observing Machine Config Pool Updates

https://github.com/openshift/machine-config-operator/blob/master/docs/MachineConfigController.md

The following annotations on the node object are used by the UpdateController to coordinate node updates with the MachineConfigDaemon:

  • machine-config-daemon.v1.openshift.com/currentConfig: defines the current MachineConfig applied by the MachineConfigDaemon.
  • machine-config-daemon.v1.openshift.com/desiredConfig: defines the desired MachineConfig that needs to be applied by the MachineConfigDaemon.
  • machine-config-daemon.v1.openshift.com/state: defines the state of the MachineConfigDaemon; it can be done, working, or degraded.

$ oc describe node/worker03

OpenShift 4.6 Automation and Integration: Adding Working Nodes

Installer-Provisioned Infrastructure

3.2. Scaling a machine set manually
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/machine_management/index#machineset-manually-scaling_manually-scaling-machineset

In an installer-provisioned OCP cluster, the Machine API performs scaling operations automatically: just modify the number of replicas in a machine set, and OCP communicates with the provider to provision or deprovision instances (see the example below).
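
Assuming an installer-provisioned cluster, the machine sets live in the openshift-machine-api namespace and can be scaled directly; the machine set name and replica count are placeholders:

$ oc get machinesets -n openshift-machine-api

$ oc scale machineset <machineset-name> --replicas=3 -n openshift-machine-api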

User-Provisioned Infrastructure

Adding compute machines to bare metal

10.4. Adding compute machines to bare metal
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/machine_management/index#adding-bare-metal-compute-user-infra

Here you must create the new machines yourself. You can create new Red Hat Enterprise Linux CoreOS (RHCOS) machines either from an ISO image or by using Preboot eXecution Environment (PXE) boot.

PXE relies on a set of very basic technologies:

  • Dynamic Host Configuration Protocol (DHCP) for locating instances.
  • Trivial File Transfer Protocol (TFTP) for serving the PXE files.
  • HTTP for the ISO images and configuration files.

Example PXE configuration. Note: the APPEND parameters must be on a single line.

DEFAULT pxeboot
TIMEOUT 20
PROMPT 0
LABEL pxeboot
  KERNEL http://<HTTP_server>/rhcos-<version>-live-kernel-<architecture>
  APPEND initrd=http://<HTTP_server>/rhcos-<version>-live-initramfs.<architecture>.img 
    coreos.inst.install_dev=/dev/sda 
    coreos.inst.ignition_url=http://<HTTP_server>/worker.ign 
    coreos.live.rootfs_url=http://<HTTP_server>/rhcos-<version>-live-rootfs.<architecture>.img
    coreos.inst=yes
    console=tty0 
    console=ttyS0  
    ip=dhcp rd.neednet=1 

The coreos.inst.ignition_url param points to a working ignition file.

5.1.10. Creating the Kubernetes manifest and Ignition config files
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/installing/index#installation-user-infra-generate-k8s-manifest-ignition_installing-bare-metal

The OpenShift Container Platform installation program ($ ./openshift-install create ignition-configs --dir <installation_directory>) generates

  • bootstrap.ign
  • master.ign
  • worker.ign

Example worker.ign

{
  "ignition": {
    "config": {
      "merge": [
        {
          "source": "https://api-int.mkk.example.com:22623/config/worker",
          "verification": {}
        }
      ]
    },
    "security": {
      "tls": {
        "certificateAuthorities": [
          {
            "source": "data:text/plain;charset=utf-8;base64,XXX...XX",
            "verification": {}
          }
        ]
      }
    },
    "version": "3.1.0"
  }
}

certificateAuthorities contains the custom truststore for the internal CA. You can inspect an HTTPS endpoint's certificate chain with openssl; for the endpoint above:

$ openssl s_client -connect api-int.mkk.example.com:22623 -showcerts

You can then verify that the root CA embedded in worker.ign is the same with:

$ echo "XXX...XX" | base64 -d | openssl -text -noout

Red Hat OpenStack Platform HAProxy

Chapter 5. Using HAProxy
https://access.redhat.com/documentation/fr-fr/red_hat_openstack_platform/10/html-single/understanding_red_hat_openstack_platform_high_availability/index#haproxy

On Red Hat OpenStack Platform, you must then update the HAProxy configuration (/etc/haproxy/haproxy.cfg) with the new nodes.

Approving the certificate signing requests for your machines

10.4.3. Approving the certificate signing requests for your machines
https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/machine_management/index#installation-approve-csrs_adding-bare-metal-compute-user-infra

$ oc get csr -A

$ oc adm certificate approve csr-abc
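
Each new node typically raises a client CSR followed by a server CSR; one way to approve all pending requests in a single pass is a go-template filter (a sketch, verify against your cluster before bulk-approving):

$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve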

Verify

You should now see the new worker nodes, but it will take some time for them to reach Ready state.

$ oc get nodes

October 29, 2022

RH Satellite 6.11: Registering Hosts to Satellite

Fix DNS in Test Environment

Manually set name resolution on every server:

# cat /etc/hosts 
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.122.22    satellite.mkk.se satellite
192.168.122.33    rhel90-01.mkk.se rhel90-01

Prepare Hosts for Satellite Registration

# timedatectl
               Local time: Sat 2022-07-02 00:46:59 UTC
           Universal time: Sat 2022-07-02 00:46:59 UTC
                 RTC time: Sat 2022-07-02 00:46:59
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

# chronyc -n tracking

Registering Hosts to Satellite

Chapter 3. Registering Hosts to Satellite
https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/managing_hosts/index#Registering_Hosts_to_Server_managing-hosts

Clear any previous subscription and registration.

[root@rhel90-01 ~]# subscription-manager remove --all
[root@rhel90-01 ~]# subscription-manager unregister
[root@rhel90-01 ~]# subscription-manager clean

Download and install the consumer RPM.

[root@rhel90-01 tmp]# yum localinstall http://satellite.mkk.se/pub/katello-ca-consumer-latest.noarch.rpm

Register the host with a Satellite admin account

# subscription-manager register --org <Organization> --environment <Lifecycle Environment Name>/<Content View>
[root@rhel90-01 tmp]# subscription-manager register --org MKK --environment Development/Base
Registering to: satellite.mkk.se:443/rhsm
Username: admin
Password: 
The system has been registered with ID: a2bdbfbb-13f3-430f-9a11-431da3a43b87
The registered system name is: rhel90-01.mkk.se

[root@rhel90-01 tmp]# subscription-manager status
+-------------------------------------------+
   System Status Details
+-------------------------------------------+
Overall Status: Disabled
Content Access Mode is set to Simple Content Access. This host has access to content, regardless of subscription status.

System Purpose Status: Disabled

[root@rhel90-01 tmp]# subscription-manager repos --list
+----------------------------------------------------------+
    Available Repositories in /etc/yum.repos.d/redhat.repo
+----------------------------------------------------------+
Repo ID:   rhel-9-for-x86_64-baseos-rpms
Repo Name: Red Hat Enterprise Linux 9 for x86_64 - BaseOS (RPMs)
Repo URL:  https://satellite.mkk.se/pulp/content/MKK/Development/Base/content/dist/rhel9/$releasever/x86_64/baseos/os
Enabled:   1

Repo ID:   rhel-9-for-x86_64-appstream-rpms
Repo Name: Red Hat Enterprise Linux 9 for x86_64 - AppStream (RPMs)
Repo URL:  https://satellite.mkk.se/pulp/content/MKK/Development/Base/content/dist/rhel9/$releasever/x86_64/appstream/os
Enabled:   1

[root@rhel90-01 tmp]# cat /etc/yum.repos.d/redhat.repo 
#
# Certificate-Based Repositories
# Managed by (rhsm) subscription-manager
#
# *** This file is auto-generated.  Changes made here will be over-written. ***
# *** Use "subscription-manager repo-override --help" if you wish to make changes. ***
#
# If this file is empty and this system is subscribed consider
# a "yum repolist" to refresh available repos
#

[rhel-9-for-x86_64-appstream-rpms]
name = Red Hat Enterprise Linux 9 for x86_64 - AppStream (RPMs)
baseurl = https://satellite.mkk.se/pulp/content/MKK/Development/Base/content/dist/rhel9/$releasever/x86_64/appstream/os
enabled = 1
gpgcheck = 1
gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
sslverify = 1
sslcacert = /etc/rhsm/ca/katello-server-ca.pem
sslclientkey = /etc/pki/entitlement/7011859186933837834-key.pem
sslclientcert = /etc/pki/entitlement/7011859186933837834.pem
metadata_expire = 1
enabled_metadata = 1

[rhel-9-for-x86_64-baseos-rpms]
name = Red Hat Enterprise Linux 9 for x86_64 - BaseOS (RPMs)
baseurl = https://satellite.mkk.se/pulp/content/MKK/Development/Base/content/dist/rhel9/$releasever/x86_64/baseos/os
enabled = 1
gpgcheck = 1
gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
sslverify = 1
sslcacert = /etc/rhsm/ca/katello-server-ca.pem
sslclientkey = /etc/pki/entitlement/7011859186933837834-key.pem
sslclientcert = /etc/pki/entitlement/7011859186933837834.pem
metadata_expire = 1
enabled_metadata = 1

Verify the registered host on Satellite

https://satellite.mkk.se/ -> <Select Organization> -> Hosts -> Content Hosts

RH Satellite 6.11: Managing Red Hat Subscription, Satellite Repository, Lifecycle Environment and Content View

Manage Subscriptions and Content

Create a Subscription Manifest

Chapter 5. Managing Red Hat Subscriptions
https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/managing_content/index#Managing_Red_Hat_Subscriptions_content-management

https://access.redhat.com/ -> Subscriptions -> Subscription Allocations -> Create New subscription allocation

Name: mkk-satellite
Type (select a type and version of the subscription management application that you are using): Satellite 6.11

Subscription Allocations -> mkk-satellite -> Subscriptions -> Add Subscriptions

You enable Simple Content Access (SCA) separately for each organization, allowing you to maintain some organizations with the SCA behavior and others without the behavior.

Export a Subscription Manifest from the Customer Portal

Subscription Allocations -> mkk-satellite -> Subscriptions -> Export Manifest

Import a Subscription Manifest to Satellite Server

3.9. Importing a Red Hat Subscription Manifest into Satellite Server
https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/installing_satellite_server_in_a_connected_network_environment/index#Importing_a_Red_Hat_Subscription_Manifest_into_Server_satellite

Select the organization to associate the subscription manifest with:

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Subscriptions -> Manage Manifest

CDN Configuration
CDN Configuration for Red Hat Content
Red Hat CDN
URL: https://cdn.redhat.com

Manifest
Subscription Manifest
Import New Manifest

Verify the Organization, the Location, and their Association

[root@satellite ~]# hammer organization list
---|-------|------|-------------|------
ID | TITLE | NAME | DESCRIPTION | LABEL
---|-------|------|-------------|------
1  | MKK   | MKK  |             | MKK  
---|-------|------|-------------|------

[root@satellite ~]# hammer location list
---|-----------|-----------|------------
ID | TITLE     | NAME      | DESCRIPTION
---|-----------|-----------|------------
2  | Stockholm | Stockholm |            
---|-----------|-----------|------------

Verify that Location is associated with Organization

[root@satellite ~]# hammer location list --organization MKK
---|-----------|-----------|------------
ID | TITLE     | NAME      | DESCRIPTION
---|-----------|-----------|------------
2  | Stockholm | Stockholm |            
---|-----------|-----------|------------

Synchronize Red Hat Content

To Enable a Repository in an Organization

6.5. Enabling Red Hat Repositories
https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/managing_content/index#Enabling_Red_Hat_Repositories_content-management

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Red Hat Repositories -> <Repository Name>

To Manually Synchronize Red Hat Product Repositories

6.6. Synchronizing Repositories
https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/managing_content/index#Synchronizing_Repositories_content-management

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Products -> <Product Name> -> <Repository Name> -> Sync Now

To Schedule Synchronization of Red Hat Product Repositories

Create Sync Plan

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Sync Plans -> Create Sync Plan

Name: rhel8
Interval: hourly
Start Date: Today
Start Time: Now

Products -> Add -> <Select Product Name> -> Add Selected
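
Roughly the same can be done from the CLI with hammer; a sketch (the flag names and the sync date value are assumptions, check hammer sync-plan create --help):

# hammer sync-plan create --organization MKK --name rhel8 \
    --interval hourly --enabled true --sync-date "2022-10-29 00:00:00"

# hammer sync-plan list --organization MKK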

Define Download Policies

6.7. Download Policies Overview
https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/managing_content/index#Download_Policies_Overview_content-management

https://satellite.mkk.se/ -> <Select Organization> -> Administer -> Settings -> Content

Default Custom Repository download policy: immediate

Default Red Hat Repository download policy: on_demand

Verify Content from the CLI

[root@satellite ~]# hammer repository list 
---|----------------------------------------------------------|-------------------------------------|--------------|----------------------------------------------------------------
ID | NAME                                                     | PRODUCT                             | CONTENT TYPE | URL                                                            
---|----------------------------------------------------------|-------------------------------------|--------------|----------------------------------------------------------------
3  | Red Hat Enterprise Linux 8 for x86_64 - AppStream RPMs 8 | Red Hat Enterprise Linux for x86_64 | yum          | https://cdn.redhat.com/content/dist/rhel8/8/x86_64/appstream/os
1  | Red Hat Enterprise Linux 8 for x86_64 - BaseOS RPMs 8    | Red Hat Enterprise Linux for x86_64 | yum          | https://cdn.redhat.com/content/dist/rhel8/8/x86_64/baseos/os   
---|----------------------------------------------------------|-------------------------------------|--------------|----------------------------------------------------------------

[root@satellite ~]# hammer repository synchronize --id 1

Verify Content from the Web UI

https://satellite.mkk.se/ -> <Select Organization> -> Monitor -> Recurring Logics

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Products -> <Product Name> -> <Repository Name>

Create Software Lifecycles

7.3. Creating a Life Cycle Environment Path
https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/managing_content/index#Creating_a_Life_Cycle_Environment_Path_content-management

The first environment in every environment path is the Library environment that receives synced content from available sources. The Library environment is continuously syncing with source repositories, which are configured by the sync plans that you assigned to each repository.

Create a lifecycle environment path:

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Lifecycle Environments -> Create Environment Path

Name: Development
Label: Development
Description: Development environment

Add New Environment

Name: Production
Label: Production
Description: Production environment
Prior Environment: Development 

Verify Content from the CLI

[root@satellite ~]# hammer lifecycle-environment list --organization MKK
---|-------------|------------
ID | NAME        | PRIOR      
---|-------------|------------
2  | Development | Library    
1  | Library     |            
3  | Production  | Development
---|-------------|------------

[root@satellite ~]# hammer lifecycle-environment paths --organization MKK
------------------------------------
LIFECYCLE PATH                      
------------------------------------
Library >> Development >> Production
------------------------------------

Publish and Promote Content Views

Chapter 8. Managing Content Views
https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/managing_content/index#Managing_Content_Views_content-management

Content Views

A content view is a customized content repository to define the software that a specific environment uses.

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Content Views -> Create New View

Name: Base
Label: Base
Description: Base packages

Manage Repositories in Content Views

Add a repository to a content view:

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Content Views -> <Content View Name> -> Repositories -> <Select Checkbox Repository> -> Add Repositories

Content View Filters

Publish a Content View

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Content Views -> <Content View Name> -> Publish New Version

Description: Added RHEL 8 BaseOS and AppStream RPM repo

Promote a Content View

After publishing a content view version to the library, you can promote the content view to the first lifecycle environment in the environment path.

To promote a content view:

https://satellite.mkk.se/ -> <Select Organization> -> Content -> Content Views -> <Content View Name> -> Versions -> vertical ellipsis menu -> Promote

Description: RHEL 8 BaseOS and AppStream RPM repo
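
The publish and promote steps also have hammer equivalents; a sketch assuming the content view is named Base and version 1.0 is promoted (check hammer content-view --help for exact options):

# hammer content-view publish --organization MKK --name Base \
    --description "Added RHEL 8 BaseOS and AppStream RPM repo"

# hammer content-view version promote --organization MKK --content-view Base \
    --version 1.0 --to-lifecycle-environment Development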

Content View Scenarios

In an all-in-one content view scenario, the content view contains all the needed content for all of your hosts.

In the host-specific content view scenario, dedicated content views exist for each host type.

October 27, 2022

RH Satellite 6.11: Install Standalone in a Connected Network Environment on RHEL 8.6

Install RHEL 8.6

# subscription-manager register

# subscription-manager list --all --available
...
Subscription Name:   Red Hat Enterprise Linux
...
Pool ID:             XXX
...

# subscription-manager attach --pool=XXX

# cat /etc/os-release 
NAME="Red Hat Enterprise Linux"
VERSION="8.6 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.6"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.6 (Ootpa)"
...

Install vim and bash-completion

RH Satellite 6.11 System Requirements

https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/installing_satellite_server_in_a_disconnected_network_environment/index#system-requirements_satellite

The following requirements apply to the networked base operating system:

  • x86_64 architecture
  • The latest version of Red Hat Enterprise Linux 8 or Red Hat Enterprise Linux 7 Server
  • 4-core 2.0 GHz CPU at a minimum
  • A minimum of 20 GB RAM is required for Satellite Server to function. In addition, a minimum of 4 GB RAM of swap space is also recommended. Satellite running with less RAM than the minimum value might not operate correctly.
  • A unique host name, which can contain lower-case letters, numbers, dots (.) and hyphens (-)
  • A current Red Hat Satellite subscription
  • Administrative user (root) access
  • A system umask of 0022
  • Full forward and reverse DNS resolution using a fully-qualified domain name

# getenforce
Enforcing

# hostnamectl status 
   Static hostname: satellite.mkk.se
...
  Operating System: Red Hat Enterprise Linux 8.6 (Ootpa)
...
      Architecture: x86-64

# echo '192.168.122.22    satellite.mkk.se satellite' >> /etc/hosts 

# ping -c 1 localhost

# ping -c 1 $(hostname -s)

# ping -c 1 $(hostname -f)

# dnf install bind-utils -y

# dig -x 192.168.122.22
...
;; ANSWER SECTION:
22.122.168.192.in-addr.arpa. 0	IN	PTR	satellite.
...

# systemctl status firewalld.service 
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2022-10-27 14:54:36 CEST; 25min ago

# firewall-cmd --get-services | grep -i satellite
RH-Satellite-6 RH-Satellite-6-capsule ...

# firewall-cmd --info-service=RH-Satellite-6
RH-Satellite-6
  ports: 5000/tcp 5646-5647/tcp 5671/tcp 8000/tcp 8080/tcp 9090/tcp
  protocols: 
  source-ports: 
  modules: 
  destination: 
  includes: foreman
  helpers: 

# firewall-cmd --permanent --add-service=RH-Satellite-6

# firewall-cmd --reload

# free -g

Install Satellite 6.11

https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/installing_satellite_server_in_a_connected_network_environment/index#attaching-infrastructure-subscription_satellite

# subscription-manager list --all --available
...
Subscription Name:   Red Hat Satellite
...
Pool ID:             XXX
...

# subscription-manager attach --pool=XXX

# dnf update

# systemctl reboot

https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/installing_satellite_server_in_a_connected_network_environment/index#anchor_xml_id_repositories_rhel_8_xreflabel_repositories_rhel_8_red_hat_enterprise_linux_8

# subscription-manager repos --disable "*"

# subscription-manager repos --enable=rhel-8-for-x86_64-baseos-rpms \
--enable=rhel-8-for-x86_64-appstream-rpms \
--enable=satellite-6.11-for-rhel-8-x86_64-rpms \
--enable=satellite-maintenance-6.11-for-rhel-8-x86_64-rpms

# dnf module enable satellite:el8

# dnf repolist 
Updating Subscription Management repositories.
repo id                                                                                  repo name
rhel-8-for-x86_64-appstream-rpms                                                         Red Hat Enterprise Linux 8 for x86_64 - AppStream (RPMs)
rhel-8-for-x86_64-baseos-rpms                                                            Red Hat Enterprise Linux 8 for x86_64 - BaseOS (RPMs)
satellite-6.11-for-rhel-8-x86_64-rpms                                                    Red Hat Satellite 6.11 for RHEL 8 x86_64 (RPMs)
satellite-maintenance-6.11-for-rhel-8-x86_64-rpms                                        Red Hat Satellite Maintenance 6.11 for RHEL 8 x86_64 (RPMs)

# dnf module list
...
Red Hat Satellite 6.11 for RHEL 8 x86_64 (RPMs)
Name                 Stream          Profiles Summary                                                                                                                                       
satellite            el8 [e]                  Satellite module         

# dnf install satellite

Configure Satellite Installation

https://access.redhat.com/documentation/en-us/red_hat_satellite/6.11/html-single/installing_satellite_server_in_a_connected_network_environment/index#Configuring_Installation_satellite

# dnf install chrony -y

# systemctl enable --now chronyd

# systemctl status chronyd.service 
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2022-10-27 15:12:21 CEST; 17s ago
...

# dnf install sos

# satellite-installer --list-scenarios
Available scenarios
  Capsule (use: --scenario capsule)
        Install a stand-alone Satellite Capsule.
  Satellite (use: --scenario satellite)
        Install Satellite server

# satellite-installer --scenario satellite --help

# satellite-installer --scenario satellite \
--foreman-initial-organization "MKK" \
--foreman-initial-location "Stockholm" \
--foreman-initial-admin-username admin \
--foreman-initial-admin-password redhat123
...
  Success!
  * Satellite is running at https://satellite.mkk.se
      Initial credentials are admin / redhat123

  * To install an additional Capsule on separate machine continue by running:

      capsule-certs-generate --foreman-proxy-fqdn "$CAPSULE" --certs-tar "/root/$CAPSULE-certs.tar"
  * Capsule is running at https://satellite.mkk.se:9090

  The full log is at /var/log/foreman-installer/satellite.log
Package versions are being locked.

Validate a Satellite Server Installation

[root@satellite ~]# satellite-maintain --help

# satellite-maintain health check
Running ForemanMaintain::Scenario::FilteredScenario
================================================================================
Check number of fact names in database:                               [OK]
--------------------------------------------------------------------------------
Check whether all services are running:                               [OK]
--------------------------------------------------------------------------------
Check whether all services are running using the ping call:           [OK]
--------------------------------------------------------------------------------
Check for paused tasks:                                               [OK]
--------------------------------------------------------------------------------
Check whether system is self-registered or not:                       [OK]
--------------------------------------------------------------------------------

[root@satellite ~]# satellite-maintain service list
Running Service List
================================================================================
List applicable services: 
dynflow-sidekiq@.service                   indirect
foreman-proxy.service                      enabled 
foreman.service                            enabled 
httpd.service                              enabled 
postgresql.service                         enabled 
pulpcore-api.service                       enabled 
pulpcore-content.service                   enabled 
pulpcore-worker@.service                   indirect
redis.service                              enabled 
tomcat.service                             enabled

All services listed                                                   [OK]
--------------------------------------------------------------------------------

[root@satellite ~]# cat .hammer/cli.modules.d/foreman.yml 
:foreman:
  # Credentials. You'll be asked for them interactively if you leave them blank here
  :username: 'admin'
  :password: 'redhat123'

[root@satellite ~]# hammer ping
database:         
    Status:          ok
    Server Response: Duration: 1ms
candlepin:        
    Status:          ok
    Server Response: Duration: 38ms
candlepin_auth:   
    Status:          ok
    Server Response: Duration: 24ms
candlepin_events: 
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
katello_events:   
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
pulp3:            
    Status:          ok
    Server Response: Duration: 75ms
pulp3_content:    
    Status:          ok
    Server Response: Duration: 59ms
foreman_tasks:    
    Status:          ok
    Server Response: Duration: 4ms

# hammer organization list
---|-------|------|-------------|------
ID | TITLE | NAME | DESCRIPTION | LABEL
---|-------|------|-------------|------
1  | MKK   | MKK  |             | MKK  
---|-------|------|-------------|------

# hammer location list
---|-----------|-----------|------------
ID | TITLE     | NAME      | DESCRIPTION
---|-----------|-----------|------------
2  | Stockholm | Stockholm |            
---|-----------|-----------|------------

Open https://satellite.mkk.se in browser and login as admin with password redhat123.

October 20, 2022

Install Visual Studio Code (VSCode) on RHEL, Fedora, or CentOS

Installation

https://code.visualstudio.com/docs/setup/linux#_rhel-fedora-and-centos-based-distributions

sudo rpm --import https://packages.microsoft.com/keys/microsoft.asc

sudo sh -c 'echo -e "[code]\nname=Visual Studio Code\nbaseurl=https://packages.microsoft.com/yumrepos/vscode\nenabled=1\ngpgcheck=1\ngpgkey=https://packages.microsoft.com/keys/microsoft.asc" > /etc/yum.repos.d/vscode.repo'

dnf check-update

sudo dnf install code

Install Python3 on RHEL, Fedora, or CentOS

sudo dnf install python3

The built-in Python 3 installation on Linux works well, but to install other Python packages you must install pip with get-pip.py

https://pip.pypa.io/en/stable/installation/#get-pip-py
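
A short sketch of the get-pip.py route described at that link, installing pip for the current user:

$ curl -O https://bootstrap.pypa.io/get-pip.py

$ python3 get-pip.py --user

$ python3 -m pip --version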

Install Python Extension in VS Code

https://code.visualstudio.com/docs/python/python-tutorial

Start VSCode.

$ code

Open View -> Extensions, search for python, and install the Python extension.

Set up the Python interpreter in VSCode

  1. Open the Command Palette (Ctrl+Shift+P).
  2. Type "Python: Select Interpreter".
  3. Select the recommended Python interpreter.

Write and Execute Simple Python Script

Create a new file hello.py

#!/usr/bin/env python3
print("Hello World!")

Run from VSCode by pressing Ctrl+F5.

Execute from command line.

$ chmod +x tmp/hello.py
$ tmp/hello.py

OpenShift 4.6 Automation and Integration: Configure trusted TLS Certificates

3.1. Replacing the default ingress certificate

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/security_and_compliance/index#replacing-default-ingress

subjectAltName: DNS:*.apps.<cluster_name>.<base_domain>

$ cat ingress.pem ca.pem > ingress-chain.pem

Create a config map that includes only the root CA certificate used to sign the wildcard certificate:

$ oc create configmap custom-ca \
     --from-file=ca-bundle.crt=ca.pem \
     -n openshift-config

Update the cluster-wide proxy configuration with the newly created config map:

$ oc patch proxy/cluster \
     --type=merge \
     --patch='{"spec":{"trustedCA":{"name":"custom-ca"}}}'

Create a secret that contains the wildcard certificate chain and key:

$ oc create secret tls custom-ingress \
     --cert=ingress-chain.pem \
     --key=</path/to/cert.key> \
     -n openshift-ingress

Update the Ingress Controller configuration with the newly created secret:

$ oc patch ingresscontroller.operator default \
     --type=merge -p \
     '{"spec":{"defaultCertificate": {"name": "custom-ingress"}}}' \
     -n openshift-ingress-operator

$ watch oc get pods -n openshift-ingress

Adding API server certificates

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/security_and_compliance/index#api-server-certificates

subjectAltName: DNS:api.<cluster_name>.<base_domain>

$ cat api.pem ca.pem > api-chain.pem

Create a secret that contains the certificate chain and private key in the openshift-config namespace.

$ oc create secret tls custom-api \
     --cert=api-chain.pem \
     --key=</path/to/cert.key> \
     -n openshift-config

Update the API server to reference the created secret.

$ oc patch apiserver cluster \
     --type=merge -p \
     '{"spec":{"servingCerts": {"namedCertificates":
     [{"names": ["<FQDN>"], 1
     "servingCertificate": {"name": "custom-api"}}]}}}'

$ oc get clusteroperators kube-apiserver

$ oc get events --sort-by='.lastTimestamp' -n openshift-kube-apiserver

Replacing the CA Bundle certificate

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/security_and_compliance/index#ca-bundle-replacing_updating-ca-bundle

See above "Update the cluster-wide proxy configuration with the newly created config map:"

Certificate injection using Operators

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/networking/index#certificate-injection-using-operators_configuring-a-custom-pki

$ oc create configmap trusted-ca -n my-example-custom-ca-ns

$ oc label configmap trusted-ca \
  config.openshift.io/inject-trusted-cabundle=true -n my-example-custom-ca-ns

Add the volumeMounts and volumes entries shown below so that the pod mounts the certificate bundle at /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-example-custom-ca-deployment
  namespace: my-example-custom-ca-ns
spec:
  ...
    spec:
      ...
      containers:
        - name: my-container-that-needs-custom-ca
          volumeMounts:
          - name: trusted-ca
            mountPath: /etc/pki/ca-trust/extracted/pem
            readOnly: true
      volumes:
      - name: trusted-ca
        configMap:
          name: trusted-ca
          items:
            - key: ca-bundle.crt
              path: tls-ca-bundle.pem

October 19, 2022

OpenShift 4.6 Automation and Integration: Enterprise Authentication

Introduction

$ kinit admin

$ ipa -vv user-show admin
...
"result": {
"dn": "uid=admin,cn=users,cn=accounts,dc=mkk,dc=example,dc=com",
...

$ ipa group-find
...
  Group name: admins
...

$ ipa -vv group-show admins
...
"result": {
"dn": "cn=admins,cn=groups,cn=accounts,dc=mkk,dc=example,dc=com ",
...

Configuring the LDAP Identity Provider

$ oc explain OAuth.spec.identityProviders.ldap
...
FIELDS:
...
   bindPassword	<Object>
     bindPassword is an optional reference to a secret by name containing a
     password to bind with during the search phase. The key "bindPassword" is
     used to locate the data. If specified and the secret or expected key is not
     found, the identity provider is not honored. The namespace for this secret
     is openshift-config.

   ca	<Object>
     ca is an optional reference to a config map by name containing the
     PEM-encoded CA bundle. It is used as a trust anchor to validate the TLS
     certificate presented by the remote server. The key "ca.crt" is used to
     locate the data. If specified and the config map or expected key is not
     found, the identity provider is not honored. If the specified ca data is
     not valid, the identity provider is not honored. If empty, the default
     system roots are used. The namespace for this config map is
     openshift-config.
...

Administration -> Cluster Settings -> Configuration -> OAuth

$ curl -o /tmp/ca.crt http://idm.mkk.example.com/ipa/config/ca.crt

bindDN: "uid=admin,cn=users,cn=accounts,dc=mkk,dc=example,dc=com"

url: "ldaps://idm.mkk.example.com/cn=users,cn=accounts,dc=mkk,dc=example,dc=com?uid"

Troubleshooting

  • Authentication Operator logs (examples below)
  • OAuth pod status
  • oc get pods -n openshift-authentication
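
For example (the operator runs in openshift-authentication-operator and the OAuth pods in openshift-authentication by default):

$ oc logs deployment/authentication-operator -n openshift-authentication-operator

$ oc get clusteroperator authentication

$ oc get events -n openshift-authentication --sort-by='.lastTimestamp'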

Synchronizing LDAP Groups

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/authentication_and_authorization/index#ldap-auto-syncing_ldap-syncing-groups

ldap-sync-config-map.yaml

kind: LDAPSyncConfig
apiVersion: v1
url: ldaps://idm.mkk.example.com/
insecure: false
bindDN: uid=admin,cn=users,cn=accounts,dc=mkk,dc=example,dc=com
bindPassword: redhat123
ca: /tmp/ca.crt
rfc2307:
  groupsQuery:
    baseDN: "cn=groups,cn=accounts,dc=mkk,dc=example,dc=com"
    scope: sub
    derefAliases: never
    pageSize: 0
    filter: "(objectClass=ipausergroup)"
  groupUIDAttribute: dn
  groupNameAttributes: [ cn ]
  groupMembershipAttributes: [ member ]
  usersQuery:
    baseDN: "cn=users,cn=accounts,dc=mkk,dc=example,dc=com"
    scope: sub
    derefAliases: never
    pageSize: 0
  userUIDAttribute: dn
  userNameAttributes: [ uid ]
  tolerateMemberNotFoundErrors: false
  tolerateMemberOutOfScopeErrors: false

Verify configuration, connectivity, username, password, etc

$ oc adm groups sync --sync-config /tmp/ldap-sync.yml

Create new namespace to store everything

$ oc new-project ldap-group-sync

Modify LDAPSyncConfig and save to /tmp/ldap-group-sync.yaml

...
bindPassword:
  file: "/etc/secrets/bindPassword"
ca: /etc/config/ca.crt
...

$ oc create secret generic ldap-secret --from-literal bindPassword=redhat123 -n ldap-group-sync

$ oc create configmap ldap-config --from-file ldap-group-sync.yaml=/tmp/ldap-group-sync.yaml,ca.crt=/tmp/ca.crt -n ldap-group-sync

kind: ServiceAccount
apiVersion: v1
metadata:
  name: ldap-group-sync-sa
  namespace: ldap-group-sync
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ldap-group-sync-cr
rules:
  - apiGroups:
      - ''
      - user.openshift.io
    resources:
      - groups
    verbs:
      - get
      - list
      - create
      - update
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ldap-group-sync-crb
subjects:
  - kind: ServiceAccount
    name: ldap-group-sync-sa
    namespace: ldap-group-sync
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ldap-group-sync-cr   
---
kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: ldap-group-sync-cj
  namespace: ldap-group-sync
spec:
  schedule: "*/30 * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        spec:
          containers:
            - name: ldap-group-sync
              image: "registry.redhat.io/openshift4/ose-cli:latest"
              command:
                - "/bin/bash"
                - "-c"
                - "oc adm groups sync --sync-config=/etc/config/ldap-group-sync.yaml --confirm"
              volumeMounts:
                - mountPath: "/etc/config"
                  name: "ldap-sync-volume"
                - mountPath: "/etc/secrets"
                  name: "ldap-bind-password"
          volumes:
            - name: "ldap-sync-volume"
              configMap:
                name: "ldap-config"
            - name: "ldap-bind-password"
              secret:
                secretName: "ldap-secret"
          restartPolicy: "Never"
          terminationGracePeriodSeconds: 30
          activeDeadlineSeconds: 500
          dnsPolicy: "ClusterFirst"
          serviceAccountName: "ldap-group-sync-sa"

$ oc logs pod/ldap-group-sync-...

$ oc get groups

$ oc adm policy add-cluster-role-to-group cluster-admin admins
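
To confirm the synchronization and the new binding, a couple of quick checks (the group name admins comes from the IdM output above):

$ oc get group admins -o yaml

$ oc get clusterrolebinding | grep cluster-admin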

October 18, 2022

OpenShift 4.6 Automation and Integration: Jenkins

Introduction

A Jenkinsfile is a text file written in Groovy, a scripting language whose syntax is very close to Java.

https://www.jenkins.io/doc/book/pipeline/syntax/

https://www.jenkins.io/doc/book/pipeline/syntax/#scripted-pipeline

There are two possible styles for writing a Jenkinsfile:

Declarative Pipeline

Pipelines that start with a pipeline directive and define declarative scripts using a special-purpose domain-specific language (DSL) that is a subset of Groovy.

pipeline {
    /* insert Declarative Pipeline here */
}

Scripted Pipeline

Pipelines that start with a node directive and define imperative scripts using the full Groovy programming language.

node {
    stage('Example') {
        if (env.BRANCH_NAME == 'master') {
            echo 'I only execute on the master branch'
        } else {
            echo 'I execute elsewhere'
        }
    }
}

Declarative Pipeline Example

pipeline {
    agent any
    triggers {
        cron('H */4 * * 1-5')
    } 
    stages {
        stage('Example Build') {
            agent { docker 'maven:3.8.1-adoptopenjdk-11' } 
            steps {
                echo 'Hello, Maven'
                sh 'mvn --version'
            }
        }
        stage('Example Test') {
            agent { docker 'openjdk:8-jre' } 
            steps {
                echo 'Hello, JDK'
                sh 'java -version'
            }
        }
    }
}

The two most common project types are:

Pipeline

Runs a pipeline taking as input a single branch from a version control system repository.

Multibranch pipeline

Automatically creates new projects when new branches are detected in a version control system repository. All these projects share the same pipeline definition that must be flexible enough to avoid conflicts between builds in different branches.

Jenkins agent images

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/images/index#images-other-jenkins-agent

Jenkins images are available through the Red Hat Registry:

$ docker pull registry.redhat.io/openshift4/ose-jenkins:<v4.5.0>

$ docker pull registry.redhat.io/openshift4/jenkins-agent-nodejs-10-rhel7:<v4.5.0>

$ docker pull registry.redhat.io/openshift4/jenkins-agent-nodejs-12-rhel7:<v4.5.0>

$ docker pull registry.redhat.io/openshift4/ose-jenkins-agent-maven:<v4.5.0>

$ docker pull registry.redhat.io/openshift4/ose-jenkins-agent-base:<v4.5.0>
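
The OpenShift Jenkins image ships Kubernetes pod templates for these agent images, typically labeled maven and nodejs. Assuming those default labels, a pipeline stage can request an agent like this (a minimal sketch, not a complete build):

pipeline {
    agent { node { label 'maven' } }
    stages {
        stage('Build') {
            steps {
                sh 'mvn --version'
            }
        }
    }
}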

Installing Jenkins on OCP

$ oc get templates -A | grep jenkins
openshift   jenkins-ephemeral                               Jenkins service, without persistent storage....
openshift   jenkins-ephemeral-monitored                     Jenkins service, without persistent storage. ...
openshift   jenkins-persistent                              Jenkins service, with persistent storage....
openshift   jenkins-persistent-monitored                    Jenkins service, with persistent storage. ...

$ oc describe -n openshift template jenkins-persistent

$ oc new-project gitops-deploy

$ oc new-app --template jenkins-persistent -p JENKINS_IMAGE_STREAM_TAG=jenkins:v4.8

$ oc adm policy add-cluster-role-to-user self-provisioner -z jenkins -n gitops-deploy
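
After the template finishes deploying, the Jenkins URL comes from the route the template creates (the route name jenkins is the template default):

$ oc get route jenkins -n gitops-deploy -o jsonpath='{.spec.host}'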

October 17, 2022

OpenShift 4.6 Automation and Integration: Operator

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/operators/index#olm-what-operators-are

$ oc get packagemanifests

$ oc describe packagemanifests file-integrity-operator

$ oc get csv -A

$ oc get subs -A

$ oc describe deployment.apps/file-integrity-operator

$ oc get crd | grep -i fileintegrity

$ oc describe crd fileintegrities.fileintegrity.openshift.io

$ oc get all -n openshift-file-integrity

$ oc logs deployment.apps/file-integrity-operator

Updating an Operator from the OLM Using the CLI

$ oc apply -f file-integrity-operator-subscription.yaml

Deleting Operators

$ oc delete sub <subscription-name>
$ oc delete csv <currentCSV>

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/logging/index#cluster-logging-deploy-cli_cluster-logging-deploying

openshift-file-integrity

apiVersion: v1
kind: Namespace
metadata:
  labels:
    openshift.io/cluster-monitoring: "true"
  name: openshift-file-integrity

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: file-integrity-operator
  namespace: openshift-file-integrity
spec:
  targetNamespaces:
    - openshift-file-integrity
    
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: file-integrity-operator-sub
  namespace: openshift-file-integrity
spec:
  channel: "4.6"
  name: file-integrity-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace    
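
After applying the Namespace, OperatorGroup, and Subscription manifests, the OLM installation can be tracked in the target namespace:

$ oc get installplan,csv,sub -n openshift-file-integrity

$ oc get pods -n openshift-file-integrity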

Cluster Operator

$ oc get clusteroperator

AVAILABLE
The cluster operator is working correctly.

PROGRESSING
The Cluster Version Operator is making changes to this operator.

DEGRADED
The cluster operator has detected a problem and it may not be working correctly.

Cluster Version Operator

Desired release image:

$ oc get clusterversion version -o jsonpath='{.status.desired.image}'

$ relimg=$(oc get clusterversion version -o jsonpath='{.status.desired.image}')
$ oc adm release extract --from=$relimg --to=/tmp
$ ll /tmp/*samples*clusteroperator.yaml
-rw-r-----. 1 magnuskkarlsson magnuskkarlsson 778 Apr 21 15:24 /tmp/0000_50_cluster-samples-operator_07-clusteroperator.yaml

OpenShift 4.6 Automation and Integration: Getting Resources Information, Scripts, Rollout, Job, CronJob, Ansible

Getting Resource Information

$ oc get nodes -o wide

$ oc get nodes -o name

$ oc api-resources

$ oc explain route.spec

$ oc get -n openshift-authentication deployment oauth-openshift -o json

$ oc get -n openshift-authentication deployment oauth-openshift -o jsonpath='{.status.availableReplicas}'

$ oc get -n openshift-authentication deployment oauth-openshift -o jsonpath='{.status.conditions[*].type}'

$ oc get -n openshift-authentication deployment oauth-openshift -o jsonpath='{.spec.template.spec.containers[0].name}'

$ oc get -n openshift-authentication deployment oauth-openshift -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'

$ oc get -n openshift-monitoring route -o jsonpath='{.items[*].spec.host}'

$ oc get pods -A -o custom-columns=NAME:.metadata.name,STATUS:.status.phase,IMAGE:.spec.containers[*].image

$ cat /tmp/not_ready_pods.jsonpath
{range .items[*]}
  {.metadata.name}
  {range .status.conditions[?(@.status=="False")]}
    {.type}{"="}{.status} {.message}
  {end}
{end}

$ oc get nodes -o jsonpath-file=/tmp/not_ready_pods.jsonpath

Labels

$ oc get nodes --show-labels

$ oc get -n openshift-authentication deployment oauth-openshift --show-labels

$ oc get nodes -l node-role.kubernetes.io/worker= -o name

Creating Scripts for Automation

$ oc wait -h
...
Examples:
  # Wait for the pod "busybox1" to contain the status condition of type "Ready"
  oc wait --for=condition=Ready pod/busybox1
  
  # The default value of status condition is true; you can set it to false
  oc wait --for=condition=Ready=false pod/busybox1
  
  # Wait for the pod "busybox1" to contain the status phase to be "Running".
  oc wait --for=jsonpath='{.status.phase}'=Running pod/busybox1
  
  # Wait for the pod "busybox1" to be deleted, with a timeout of 60s, after having issued the "delete" command
  oc delete pod/busybox1
  oc wait --for=delete pod/busybox1 --timeout=60s
...

$ oc rollout status -h
...
Examples:
  # Watch the status of the latest rollout
  oc rollout status dc/nginx
...
$ cat add-user.sh

#!/bin/bash
username=$1
password=$2

echo "$username:$password"

secretname=$(oc get oauth cluster -o jsonpath='{.spec.identityProviders[?(@.name=="htpasswd")].htpasswd.fileData.name}')

secretfile=$(oc extract secret/$secretname -n openshift-config --confirm)

cut -d : -f 1 $secretfile

htpasswd -B -b $secretfile $username $password 

cat $secretfile

oldpods=$(oc get pods -n openshift-authentication -o name)

oc set data secret/$secretname -n openshift-config --from-file=$secretfile

oc wait co/authentication --for condition=Progressing --timeout=90s

oc rollout status -n openshift-authentication deployment oauth-openshift --timeout=90s

oc wait $oldpods -n openshift-authentication --for delete --timeout=90s

rm -f $secretfile
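
A usage sketch for the script above (user name and password are examples only):

$ chmod +x add-user.sh

$ ./add-user.sh testuser redhat123

$ oc login -u testuser -p redhat123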

ServiceAccount, Role, RoleBinding, Job and CronJob

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/nodes/index#nodes-nodes-jobs

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.6/html-single/authentication_and_authorization/index#ldap-auto-syncing_ldap-syncing-groups

$ oc get pods -A -o jsonpath='{.items[*].spec.containers[*].image}' | sed 's/ /\n/g' | sort | uniq

$ oc new-project audit

$ oc create serviceaccount audit-sa

$ oc create clusterrole audit-cr --verb=get,list,watch --resource=pods

$ oc create clusterrolebinding audit-crb --clusterrole=audit-cr --serviceaccount=audit:audit-sa

apiVersion: batch/v1
kind: Job
metadata:
  name: audit-job
  namespace: audit
spec:
  parallelism: 1
  completions: 1
  activeDeadlineSeconds: 1800
  backoffLimit: 6
  template:
    metadata:
      name: audit-job
    spec:
      serviceAccount: audit-sa
      serviceAccountName: audit-sa
      restartPolicy: "Never"
      containers:
        - name: audit-job
          image: "registry.redhat.io/openshift4/ose-cli:latest"
          command:
            - "/bin/bash"
            - "-c"
            - "oc get pods --all-namespaces -o jsonpath='{.items[*].spec.containers[*].image}' | sed 's/ /\\\n/g' | sort | uniq"
        
$ echo "Hello from OCP $(date +'%F %T')"

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello-cr
  namespace: audit
spec:
  schedule: "*/1 * * * *"  
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        metadata:
          name: "hello-cr"
          labels:
            parent: "hello-cr"
        spec:
          serviceAccount: audit-sa
          serviceAccountName: audit-sa
          restartPolicy: "Never"
          containers:
            - name: hello-cr
              image: "registry.redhat.io/openshift4/ose-cli:latest"
              command:
                - "/bin/bash"
                - "-c"
                - echo "Hello from OCP $(date +'%F %T')"

Ansible Playbooks

$ sudo dnf install -y ansible ansible-collection-community-kubernetes jq

$ pip install openshift

https://docs.ansible.com/ansible/2.9/modules/list_of_clustering_modules.html#k8s

- name: Demo k8s modules
  hosts: localhost
  become: false
  vars:
    namespace: automation-hello
  module_defaults:
    group/k8s:
      namespace: "{{ namespace }}"
      # ca_cert: "/etc/pki/tls/certs/ca-bundle.crt"
      validate_certs: false
  tasks:
    - name: Create project
      k8s:
        api_version: project.openshift.io/v1
        kind: Project
        name: "{{ namespace }}"
        state: present
        namespace: ""

    - name: Create deployment, service and route
      k8s:
        state: present
        src: "/tmp/hello.yaml"

    - name: Get a pod info
      k8s_info:
        kind: Pod

#    - name: Scale deployment
#      k8s_scale:
#        kind: Deployment
#        name: hello
#        replicas: 3

    - name: Get hostname from the route
      k8s_info:
        kind: Route
        name: hello
      register: route

    - name: Test access
      uri:
        url: "http://{{ route.resources[0].spec.host }}"
        return_content: yes
      register: response
      until: response.status == 200
      retries: 10
      delay: 5

    - name: Display response
      debug:
        var: response.content

/tmp/hello.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hello
  name: hello
  namespace: automation-hello
spec:
  replicas: 1
  selector:
    matchLabels:
      deployment: hello
  template:
    metadata:
      labels:
        deployment: hello
    spec:
      containers:
      - image: quay.io/redhattraining/versioned-hello:v1.0
        name: hello
        ports:
        - containerPort: 8080
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: hello
  name: hello
  namespace: automation-hello
spec:
  ports:
  - name: 8080-tcp
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    deployment: hello
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  labels:
    app: hello
  name: hello
  namespace: automation-hello
spec:
  port:
    targetPort: 8080-tcp
  to:
    kind: Service
    name: hello

$ ansible-playbook /tmp/k8s.yml
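
After the playbook run, the created resources can be listed with:

$ oc get deployment,svc,route -n automation-hello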