K8s data collectors
The Splunk Operator for K8s deploys Splunk Enterprise custom resources across a single namespace or multiple namespaces. The helper scripts k8s-splunk-collector.sh and k8s-systeminfo-collector.sh in the tools directory collect data from a K8s cluster that runs the Splunk Operator for K8s.
Splunk Collector for K8s - k8s-splunk-collector.sh
The Splunk collector for K8s collects data from multiple
The script:
- Collects extensive data on your K8s cluster. If there is any data you'd like to keep private, either avoid using the script or modify it to suit your needs.
- If diags are requested, generates a Splunk diag on every Splunk instance running inside the Splunk Enterprise CR pods deployed by the Splunk Operator for K8s. The diag generated on each Splunk instance is deleted from that instance after extraction. If any of the above is not desired, collect the data manually.
Requirements to run the script
- Kubeconfig context set to the cluster/namespace running the Splunk Operator for K8s
- Access to kubectl commands to get data
- Access rights on the host file system to create and delete directories (at least within the directory where you want to run the script and store data)
- Enough space to collect data in a target folder
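The requirements above can be checked before a run. The following is a minimal sketch and is not part of the shipped tools: the function names check_kubectl_access and check_target_dir are hypothetical, and the 1 GB default for free space is an arbitrary assumption that you should adjust to the size of your deployment.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; not part of the Splunk Operator tools.

# Succeeds if kubectl is on the PATH and a kubeconfig context is selected.
check_kubectl_access() {
    if ! command -v kubectl >/dev/null 2>&1; then
        echo "kubectl not found on PATH" >&2
        return 1
    fi
    if ! kubectl config current-context >/dev/null 2>&1; then
        echo "no current kubeconfig context is set" >&2
        return 1
    fi
}

# Usage: check_target_dir <dir> [min_free_kb]
# Succeeds if <dir> exists, is writable, and has at least min_free_kb free.
check_target_dir() {
    target_dir=$1
    min_free_kb=${2:-1048576}   # assumed default: 1 GB

    if [ ! -d "$target_dir" ] || [ ! -w "$target_dir" ]; then
        echo "$target_dir is not a writable directory" >&2
        return 1
    fi

    # df -P guarantees one line per file system; column 4 is available KB.
    free_kb=$(df -Pk "$target_dir" | awk 'NR==2 {print $4}')
    if [ "$free_kb" -lt "$min_free_kb" ]; then
        echo "only ${free_kb} KB free in $target_dir" >&2
        return 1
    fi
}
```

For example, `check_kubectl_access && check_target_dir "$PWD"` returns success only when both sets of requirements hold for the current directory.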
Script run instructions
- Run the script using the following command:
  sh k8s-splunk-collector.sh -d <flag_to_collect_splunk_diags> -t <target_folder> -l <flag_to_limit_output_by_avoid_kubectl_describe> -s <flag_to_collect_secret_object_metadata>
  The script accepts four options:
  - The -d option specifies whether Splunk diags are collected. Splunk diags are not collected by default. Set it to true to collect them.
  - The -t option specifies a <target folder>. This option is not mandatory. The script stores the collected data in one of two ways:
    - If the -t option is not used, a timestamped folder tmp-<timestamp> is created in the present working directory, and the data is written there. Example: sh k8s-splunk-collector.sh -d true
    - If the -t option is used with a valid full path, a timestamped folder tmp-<timestamp> is created inside that path, and the data is written there. Note: If the folder provided doesn't exist, it is created, provided at least one of the preceding paths exists. If none of the preceding paths exist, the script runs to completion without writing data to disk. In either case, make sure the target folder has enough space (for reference, see the Performance section).
  - The -l option specifies whether to limit data collection by skipping kubectl describe commands. The kubectl describe command outputs are collected by default. There is a known issue in K8s with creating too many clients for describe commands (https://github.com/kubernetes/kubernetes/issues/91913). In internal testing, the resulting warning messages have not caused any issues. However, to avoid the warning messages and to protect your network bandwidth if it is limited, set the -l option to true. Example of a warning message from the K8s cluster:
    W0419 14:46:10.239590 21927 exec.go:203] constructing many client instances from the same exec auth config can cause performance problems during cert rotation and can exhaust available network connections; 1478 clients constructed calling "aws-iam-authenticator"
  - The -s option specifies whether K8s secret object metadata is collected. Secret object metadata is not collected by default. Set it to true if you want the script to collect secret object metadata. Note: The sensitive secret data is NOT collected.
- After you run the script, wait until you see the message: All data required collected under folder <target_folder>
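The run instructions above can be wrapped so that the target folder always exists before the script starts and the completion message is verified afterwards. This is a sketch under stated assumptions, not part of the tools directory: run_collector, collection_completed, and the log file name collector-run.log are hypothetical, and the wrapper assumes k8s-splunk-collector.sh is in the current directory.

```shell
#!/bin/sh
# Hypothetical wrapper around k8s-splunk-collector.sh.

# Final message printed by the collector, as documented above.
COMPLETION_MSG="All data required collected under folder"

# Succeeds only if the completion message appears in the given log file.
collection_completed() {
    grep -q "$COMPLETION_MSG" "$1"
}

# Usage: run_collector <target_dir> [extra collector options...]
run_collector() {
    target_dir=$1
    shift

    # Create the full path up front, so the script never runs into the case
    # where none of the preceding paths exist and nothing is written to disk.
    mkdir -p "$target_dir" || return 1

    # Keep a copy of the console output next to the collected data
    # (the log file name is an assumption made for this sketch).
    log_file="$target_dir/collector-run.log"
    sh k8s-splunk-collector.sh -t "$target_dir" "$@" 2>&1 | tee "$log_file"

    # Report success only if the collector printed its final message.
    collection_completed "$log_file"
}
```

For example, `run_collector /tmp/splunk-data -d true -l true` collects diags, skips the kubectl describe outputs, and returns a non-zero status if the final message never appears.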
Example script run:
# sh k8s-splunk-collector.sh -d "true"
Starting to collect data with diag true in folder /Users/akondur/Desktop/operator_training/Data_collection_debug/collect_data_k8s/tmp-2021-04-19-10-37
Setting up directories
Done setting up directories
Started collecting logs and diags
Done collecting logs and diags
Started collecting cluster info
Done collecting cluster info
Started collecting kubectl get command outputs
Done collecting kubectl get command outputs
Started collecting kubectl describe command outputs
Done collecting kubectl describe command outputs
All data required collected under folder /Users/akondur/Desktop/operator_training/Data_collection_debug/collect_data_k8s/tmp-2021-04-19-10-37
Target folder breakdown
Note: All logs are appropriately named with proper prefixes. The folder k8s_data contains outputs of kubectl get and describe commands for multiple K8s resources. The folder pods_data contains logs for all pods in the K8s namespace and also diags from all Splunk pods if requested.
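A quick way to sanity-check a finished run is to count the files and measure the size of the two folders described above. The sketch below is illustrative only: summarize_collection is a hypothetical helper, and it assumes that k8s_data and pods_data sit directly under the timestamped tmp-<timestamp> folder.

```shell
#!/bin/sh
# Hypothetical helper; not part of the Splunk Operator tools.
# Usage: summarize_collection <path to the tmp-<timestamp> folder>
summarize_collection() {
    run_dir=$1
    for sub in k8s_data pods_data; do
        if [ -d "$run_dir/$sub" ]; then
            # Count regular files at any depth and report disk usage in KB.
            count=$(find "$run_dir/$sub" -type f | wc -l | tr -d ' ')
            size_kb=$(du -sk "$run_dir/$sub" | awk '{print $1}')
            echo "$sub: $count files, $size_kb KB"
        else
            echo "$sub: missing"
        fi
    done
}
```

A folder reported as missing, or pods_data with very few files, suggests that the run did not complete or that the data was written elsewhere.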
Performance
Splunk deployments on the K8s cluster: SHC (3 search heads, 1 deployer), IDXC (3 indexers, 1 cluster manager), 1 standalone, 1 license manager
kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
splunk-default-monitoring-console-0   1/1     Running   0          17m
splunk-example-license-manager-0      1/1     Running   0          18m
splunk-operator-cb8d66765-tl6z2       1/1     Running   0          6h6m
splunk-test-cluster-manager-0         1/1     Running   0          19m
splunk-test-deployer-0                1/1     Running   0          6h3m
splunk-test-indexer-0                 1/1     Running   0          17m
splunk-test-indexer-1                 1/1     Running   0          17m
splunk-test-indexer-2                 1/1     Running   0          17m
splunk-test-search-head-0             1/1     Running   0          6h3m
splunk-test-search-head-1             1/1     Running   0          6h3m
splunk-test-search-head-2             1/1     Running   0          6h3m
splunk-test2-standalone-0             1/1     Running   0          6h5m
For performance testing, the script:
- Collected diags, i.e. the -d option set to true
- Collected kubectl describe command outputs, i.e. the -l option not supplied
- Collected secret object metadata, i.e. the -s option set to true
Performance Metrics:
Time taken - 6 mins 40 seconds
Memory - 774.4 MB
System info collector for K8s - k8s-systeminfo-collector.sh
The system info collector for K8s collects the required K8s-related system information from a node. Only the K8s admin for the K8s cluster should be allowed to collect data via the script.
Requirements to run the script
- The script must be run on a K8s node, so SSH access to the node is required. Setting up SSH access to the node is outside the scope of the script.
- Admin access to the K8s cluster
Script run instructions
- Run the script using the following command:
  sh k8s-systeminfo-collector.sh --ignore_introspection <ignore_introspection_flag> --ignore_metrics <ignore_metrics_flag>
  The script accepts two options:
  - The ignore_introspection option specifies whether the script skips collecting introspection data. Set it to true to skip. By default, the script collects introspection data.
  - The ignore_metrics option specifies whether the script skips collecting metrics data. Set it to true to skip. By default, the script collects metrics data.
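Because both options default to collecting data, a flag only needs to be passed when something should be skipped. The following sketch prints the command line for a given combination so it can be reviewed before it is run on the node; build_systeminfo_cmd is a hypothetical helper, not part of the script.

```shell
#!/bin/sh
# Hypothetical helper; not part of the Splunk Operator tools.
# Usage: build_systeminfo_cmd <skip_introspection> <skip_metrics>
# Each argument is "true" to skip that data; any other value keeps the default.
build_systeminfo_cmd() {
    cmd="sh k8s-systeminfo-collector.sh"
    if [ "$1" = "true" ]; then
        cmd="$cmd --ignore_introspection true"
    fi
    if [ "$2" = "true" ]; then
        cmd="$cmd --ignore_metrics true"
    fi
    echo "$cmd"
}
```

For example, `build_systeminfo_cmd false true` prints a command that collects introspection data but skips metrics data.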
Example script run:
sh-4.2$ sudo tools/k8s-log-collector.sh
This is version 0.0.1. New versions can be found at https://github.com/splunk/splunk-operator/tools
Trying to collect common operating system logs...
Trying to collect kernel logs...
Trying to collect mount points and volume information...
Trying to collect SELinux status...
Trying to collect iptables information...
Trying to collect installed packages...
Trying to collect active system services...
Trying to Collect Containerd daemon information...
Trying to collect Docker daemon information...
Trying to collect kubelet information...
Trying to collect L-IPAMD introspection information... Trying to collect L-IPAMD prometheus metrics... Trying to collect L-IPAMD checkpoint... cp: cannot stat '/var/run/k8s-node/ipam.json': No such file or directory
Trying to collect sysctls information...
Trying to collect networking infomation... conntrack v1.4.4 (conntrack-tools): 253 flow entries have been shown.
Trying to collect CNI configuration information...
Trying to collect Docker daemon logs...
Trying to archive gathered information...
Done... your bundled logs are located in /var/log/k8s__2022-03-10_1857-UTC_0.0.1.tar.gz
Target folder breakdown
The script creates a tar file in the present working directory, i.e. the folder from which the script is executed. After you extract the tar file, the following information can be found:
- Kernel logs at /kernel
- Mount points and volume information at /storage
- SELinux status at /system
- IPtables at /networking
- Installed packages at /system
- System services at /system
- Containerd at /containerd
- Dockerd at /docker
- Kubelet at /kubelet
- IPAMD at /ipamd
- sysctls at /sysctls
- Networking (conntrack, ifconfig, routes, etc.) at /networking
- CNI at /cni
- Docker logs for system at /var_log
Example target folder:
drwxr-xr-x 4 root root 4096 Mar 16 22:26 var_log
drwxr-xr-x 2 root root 137 Mar 16 22:26 system
drwxr-xr-x 2 root root 86 Mar 16 22:26 storage
drwxr-xr-x 2 root root 89 Mar 16 22:26 kernel
drwxr-xr-x 2 root root 61 Mar 16 22:26 containerd
drwxr-xr-x 2 root root 28 Mar 16 22:26 sysctls
drwxr-xr-x 2 root root 249 Mar 16 22:26 networking
drwxr-xr-x 2 root root 75 Mar 16 22:26 kubelet
drwxr-xr-x 2 root root 153 Mar 16 22:26 ipamd
drwxr-xr-x 2 root root 143 Mar 16 22:26 docker
drwxr-xr-x 2 root root 29 Mar 16 22:26 cni
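To confirm that a bundle contains the folders listed above, the archive can be extracted and checked in one step. This is a minimal sketch: check_bundle and EXPECTED_DIRS are hypothetical names, the folder list is copied from the example target folder, and the search goes two levels deep on the assumption that the archive may or may not wrap its contents in a single top-level folder.

```shell
#!/bin/sh
# Hypothetical helper; not part of the Splunk Operator tools.

# Folder names taken from the example target folder listing.
EXPECTED_DIRS="kernel storage system networking containerd docker kubelet ipamd sysctls cni var_log"

# Usage: check_bundle <bundle.tar.gz> <dest_dir>
# Extracts the bundle and prints one "missing: <name>" line per absent folder.
check_bundle() {
    bundle=$1
    dest=$2
    mkdir -p "$dest" || return 1
    tar -xzf "$bundle" -C "$dest" || return 1

    missing=0
    for name in $EXPECTED_DIRS; do
        # Look at most two levels below dest for a directory with this name.
        found=$(find "$dest" -maxdepth 2 -type d -name "$name")
        if [ -z "$found" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}
```

A missing folder is not necessarily an error. For instance, a node without Docker may have no Docker data to collect.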