|
| 1 | +# EFK (Elasticsearch - Fluentd - Kibana) |
| 2 | + |
| 3 | +### Elasticsearch |
| 4 | + |
| 5 | +> Elasticsearch is a document-oriented database designed to store, retrieve, and manage semi-structured data. You store data as JSON documents and then query them for retrieval. |
| 6 | +
|
| 7 | +### Fluentd |
| 8 | + |
| 9 | +> Fluentd is a popular open-source data collector that runs on a machine to tail log files, filter and transform the log data, and deliver it to the Elasticsearch cluster, where it is indexed and stored. |
| 10 | +
|
| 11 | +### Kibana |
| 12 | + |
| 13 | +> Kibana is an open source analytics and visualization platform designed to work with Elasticsearch. You use Kibana to search, view, and interact with data stored in Elasticsearch indices. You can easily perform advanced data analysis and visualize your data in a variety of charts, tables, and maps. |
| 14 | +
|
| 15 | +## Steps to install the EFK stack on a Kubernetes cluster |
| 16 | + |
| 17 | +## Prerequisites |
| 18 | + |
| 19 | +> Since EFK is a heavy application, the cluster needs at least 6 CPUs and 10 GB of memory, with 30 GB of storage. The EFK stack is a good example for understanding the concepts of Deployment, StatefulSet and DaemonSet. Let's start installing the EFK stack on Kubernetes. |
| 20 | +
|
| 21 | +* Create the namespace to install the stack |
| 22 | + |
| 23 | +` kubectl create ns kube-logging ` |
| 24 | + |
| 25 | +``` |
| 26 | +kubectl get ns kube-logging |
| 27 | +NAME STATUS AGE |
| 28 | +kube-logging Active 11s |
| 29 | +``` |
| 30 | + |
| 31 | +* Create persistent volumes and persistent volume claims |
| 32 | + |
| 33 | +> Elasticsearch needs a persistent volume and a corresponding claim for each of the 3 replicas that we will create. The files pv.yaml and pvc.yaml contain the definitions of the persistent volumes and persistent volume claims respectively. |
| 34 | +
|
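The contents of pv.yaml and pvc.yaml are not shown in this guide. As a rough sketch only, the functions below print manifests whose names, sizes, access mode and reclaim policy mirror the `kubectl get pv,pvc` output in this step. The `hostPath` backing store, its `/mnt/es-data-N` path, and the empty `storageClassName` are assumptions for illustration; the real files may differ.

```shell
# Print a PersistentVolume manifest for replica index $1.
# ASSUMPTION: hostPath storage under /mnt/es-data-N (hypothetical path).
render_pv() {
  idx="$1"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-pv-${idx}
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/es-data-${idx}
EOF
}

# Print the matching PersistentVolumeClaim for replica index $1.
# ASSUMPTION: an empty storageClassName, so the claim binds to a
# pre-created volume instead of triggering dynamic provisioning.
render_pvc() {
  idx="$1"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-pvc-es-cluster-${idx}
  namespace: kube-logging
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF
}

# Emit all six objects as one multi-document YAML stream.
render_all() {
  for i in 0 1 2; do
    echo "---"; render_pv "$i"
    echo "---"; render_pvc "$i"
  done
}
```

If the sketch matches your environment, `render_all | kubectl create -f -` would create the objects from standard input.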
| 35 | +` kubectl create -f pv.yaml -f pvc.yaml -n kube-logging ` |
| 36 | + |
| 37 | +> The output will show that 3 PVCs are **BOUND** to 3 PVs. |
| 38 | +
|
| 39 | +~~~ |
| 40 | +kubectl get pv,pvc -n kube-logging |
| 41 | +NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE |
| 42 | +persistentvolume/es-pv-0 10Gi RWO Retain Bound kube-logging/es-pvc-es-cluster-0 9s |
| 43 | +persistentvolume/es-pv-1 10Gi RWO Retain Bound kube-logging/es-pvc-es-cluster-1 9s |
| 44 | +persistentvolume/es-pv-2 10Gi RWO Retain Bound kube-logging/es-pvc-es-cluster-2 9s |
| 45 | +
|
| 46 | +NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE |
| 47 | +persistentvolumeclaim/es-pvc-es-cluster-0 Bound es-pv-0 10Gi RWO 9s |
| 48 | +persistentvolumeclaim/es-pvc-es-cluster-1 Bound es-pv-1 10Gi RWO 9s |
| 49 | +persistentvolumeclaim/es-pvc-es-cluster-2 Bound es-pv-2 10Gi RWO 9s |
| 50 | +
|
| 51 | +~~~ |
| 52 | + |
| 53 | +* Create the Elasticsearch StatefulSet |
| 54 | + |
| 55 | +> As Elasticsearch is the backend for the logs that Fluentd aggregates, it's important that we deploy Elasticsearch as an application that maintains state. Fluentd will continuously push data to Elasticsearch. To reduce latency and let Fluentd reach the Elasticsearch replicas directly, we use a headless service. With a headless service, the DNS name of each Elasticsearch pod is *STATEFULSET-NAME-STICKYIDENTIFIER.HEADLESS-SERVICE-NAME*, e.g. **es-cluster-0.elasticsearch** |
| 56 | +
|
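To make the naming rule concrete, here is a small sketch that builds the fully qualified pod names from the StatefulSet name, the ordinal, the headless service name and the namespace. The helper names are made up for this example, and the `cluster.local` suffix is the common default cluster domain, assumed here.

```shell
# Build the DNS name of one StatefulSet pod behind a headless service.
# Arguments: statefulset name, ordinal, service name, namespace,
# and optionally the cluster domain (ASSUMPTION: defaults to cluster.local).
pod_fqdn() {
  printf '%s-%s.%s.%s.svc.%s\n' "$1" "$2" "$3" "$4" "${5:-cluster.local}"
}

# Print the names for ordinals 0..replicas-1.
# Arguments: statefulset name, replica count, service name, namespace.
list_pod_fqdns() {
  sts="$1"; replicas="$2"; svc="$3"; ns="$4"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    pod_fqdn "$sts" "$i" "$svc" "$ns"
    i=$((i + 1))
  done
}
```

For this guide, `list_pod_fqdns es-cluster 3 elasticsearch kube-logging` prints one name per replica, starting with `es-cluster-0.elasticsearch.kube-logging.svc.cluster.local`.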
| 57 | +> Let's install the Elasticsearch headless service first: |
| 58 | +
|
| 59 | +` kubectl create -f elasticsearch_svc.yaml` |
| 60 | + |
| 61 | +``` |
| 62 | +kubectl get svc -n kube-logging |
| 63 | +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE |
| 64 | +elasticsearch ClusterIP None <none> 9200/TCP,9300/TCP 7s |
| 65 | +``` |
| 66 | + |
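The file elasticsearch_svc.yaml is not reproduced here. A headless service is an ordinary Service with `clusterIP: None`, so a manifest along the following lines would produce the output above. The `app: elasticsearch` selector label and the port names `rest` and `inter-node` are assumptions; the ports 9200 and 9300 come from the `kubectl get svc` output.

```shell
# Print a headless Service manifest for the Elasticsearch pods.
# ASSUMPTION: the pods carry the label app=elasticsearch; the port
# names are illustrative only.
render_headless_svc() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: kube-logging
spec:
  clusterIP: None
  selector:
    app: elasticsearch
  ports:
    - name: rest
      port: 9200
    - name: inter-node
      port: 9300
EOF
}
```

Because no cluster IP is allocated, DNS lookups for the service return the pod addresses directly rather than a single virtual IP.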
| 67 | +> Install the Elasticsearch StatefulSet: |
| 68 | +
|
| 69 | +` kubectl create -f elasticsearch_statefulset.yaml` |
| 70 | + |
| 71 | +``` |
| 72 | +kubectl get pods -n kube-logging |
| 73 | +NAME READY STATUS RESTARTS AGE |
| 74 | +es-cluster-0 1/1 Running 0 21s |
| 75 | +es-cluster-1 1/1 Running 0 14s |
| 76 | +es-cluster-2 1/1 Running 0 8s |
| 77 | +``` |
| 78 | + |
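| 79 | +> Using port-forward, verify the status of the StatefulSet deployment. The port-forward command keeps running, so run it in one terminal and the curl command in another: |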
| 80 | +
|
| 81 | +` kubectl port-forward es-cluster-0 9200:9200 --namespace=kube-logging` |
| 82 | + |
| 83 | +` curl 'http://localhost:9200/_cluster/state?pretty' ` |
| 84 | + |
| 85 | +> The output should look like the following (truncated here after the node list): |
| 86 | +
|
| 87 | +``` |
| 88 | +curl 'http://localhost:9200/_cluster/state?pretty' |
| 89 | +{ |
| 90 | + "cluster_name" : "k8s-logs", |
| 91 | + "compressed_size_in_bytes" : 351, |
| 92 | + "cluster_uuid" : "fDRfwLflQjuKeOLAXuPwLg", |
| 93 | + "version" : 3, |
| 94 | + "state_uuid" : "NkdqNF34SKq0bmIMHrG96Q", |
| 95 | + "master_node" : "28Vbx-gdR7CKje0oT1PFhA", |
| 96 | + "blocks" : { }, |
| 97 | + "nodes" : { |
| 98 | + "4FNwm6qBS6qBZDDpMg4x9g" : { |
| 99 | + "name" : "es-cluster-2", |
| 100 | + "ephemeral_id" : "s182JiZdSHCYG8Ja-swyuA", |
| 101 | + "transport_address" : "192.168.1.192:9300", |
| 102 | + "attributes" : { } |
| 103 | + }, |
| 104 | + "VwgBprBNTA6kDP1BUJs_Zg" : { |
| 105 | + "name" : "es-cluster-0", |
| 106 | + "ephemeral_id" : "IQmaLDsJRzWU9tY7JDiUQg", |
| 107 | + "transport_address" : "192.168.1.191:9300", |
| 108 | + "attributes" : { } |
| 109 | + }, |
| 110 | + "28Vbx-gdR7CKje0oT1PFhA" : { |
| 111 | + "name" : "es-cluster-1", |
| 112 | + "ephemeral_id" : "lJFv0XwaShm_y8eIjuMf-g", |
| 113 | + "transport_address" : "192.168.2.178:9300", |
| 114 | + "attributes" : { } |
| 115 | + } |
| 116 | + }, |
| 117 | +``` |
| 118 | + |
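Instead of reading the JSON by eye, the node entries can be counted. The helper below is a sketch that counts lines of the form `"name" : "es-cluster-N"` on standard input; it relies on the pretty-printed layout shown above and on the `es-cluster-` pod name prefix from this guide, and the function name is made up for this example.

```shell
# Count Elasticsearch node entries in pretty-printed cluster state JSON
# read from standard input. Matches only "name" keys whose value is
# es-cluster-<ordinal>, so "cluster_name" is not counted.
count_es_nodes() {
  grep -c '"name" *: *"es-cluster-[0-9][0-9]*"'
}
```

With the port-forward still running, `curl -s 'http://localhost:9200/_cluster/state?pretty' | count_es_nodes` should print `3` once all replicas have joined the cluster.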
| 119 | +* Install Kibana |
| 120 | + |
| 121 | +` kubectl create -f kibana.yaml ` |
| 122 | + |
| 123 | +> The output should now be as below: |
| 124 | +
|
| 125 | +~~~ |
| 126 | +kubectl get pods,svc -n kube-logging |
| 127 | +NAME READY STATUS RESTARTS AGE |
| 128 | +pod/es-cluster-0 1/1 Running 0 5m13s |
| 129 | +pod/es-cluster-1 1/1 Running 0 5m6s |
| 130 | +pod/es-cluster-2 1/1 Running 0 5m |
| 131 | +pod/kibana-bd6f49775-zmt4g 1/1 Running 0 22s |
| 132 | +
|
| 133 | +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE |
| 134 | +service/elasticsearch ClusterIP None <none> 9200/TCP,9300/TCP 6m14s |
| 135 | +service/kibana NodePort 10.99.16.215 <none> 5601:32182/TCP 22s |
| 136 | +~~~ |
| 137 | + |
| 138 | +> Get the NodePort from the kibana service (32182 in the output above) and open the Kibana dashboard in your browser at http://EXTERNAL_IP:nodeport, where EXTERNAL_IP is the external IP address of any cluster node. Kibana is empty at this point because no logs are being pushed to Elasticsearch yet. |
| 139 | +
|
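As a small illustration of how the URL is put together, the sketch below extracts the NodePort from a `PORT(S)` value such as `5601:32182/TCP` and combines it with a node address. The function names are hypothetical, and the IP address in the usage note is a documentation placeholder.

```shell
# Extract the NodePort from a PORT(S) column value.
# Example input: "5601:32182/TCP"
node_port_of() {
  mapping="$1"
  port="${mapping#*:}"          # drop the service port and the colon
  printf '%s\n' "${port%%/*}"   # drop the protocol suffix
}

# Build the Kibana URL from a node address and a PORT(S) value.
kibana_url() {
  printf 'http://%s:%s\n' "$1" "$(node_port_of "$2")"
}
```

For example, `kibana_url 203.0.113.10 '5601:32182/TCP'` prints `http://203.0.113.10:32182`. The port can also be read directly with `kubectl get svc kibana -n kube-logging -o jsonpath='{.spec.ports[0].nodePort}'`.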
| 140 | +* Install the Fluentd DaemonSet |
| 141 | + |
| 142 | +> Fluentd is installed as a DaemonSet because we need one instance of Fluentd running on every node. To run it on the master node as well, the corresponding tolerations have to be added to the Fluentd YAML definition. The Fluentd DaemonSet looks for the Elasticsearch service to push the logs to. As part of the environment variables, we define the headless service DNS name (elasticsearch.kube-logging.svc.cluster.local) and the port 9200 so that Fluentd can push all logs to the Elasticsearch backend. |
| 143 | +
|
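The relevant part of the DaemonSet pod spec might look like the fragment printed below. This is a sketch, not the contents of fluentd_daemonset.yaml: the variable names `FLUENT_ELASTICSEARCH_HOST` and `FLUENT_ELASTICSEARCH_PORT` are the ones commonly read by the fluentd-kubernetes-daemonset images and are assumed here, as is the `node-role.kubernetes.io/master` taint key (newer clusters taint control-plane nodes with `node-role.kubernetes.io/control-plane` instead).

```shell
# Print an illustrative pod spec fragment for the Fluentd DaemonSet.
# Arguments (both optional): Elasticsearch host, Elasticsearch port.
# ASSUMPTIONS: env variable names, taint key and container name.
render_fluentd_pod_fragment() {
  es_host="${1:-elasticsearch.kube-logging.svc.cluster.local}"
  es_port="${2:-9200}"
  cat <<EOF
serviceAccountName: fluentd
tolerations:
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule
containers:
  - name: fluentd
    env:
      - name: FLUENT_ELASTICSEARCH_HOST
        value: "${es_host}"
      - name: FLUENT_ELASTICSEARCH_PORT
        value: "${es_port}"
EOF
}
```

The port is quoted in the output because environment variable values in a pod spec must be strings.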
| 144 | +> Fluentd aggregates logs from all pods running in all namespaces. To give Fluentd the corresponding privileges, we have to create an RBAC policy that lets it read the "pods" and "namespaces" resources across the cluster. The file clusterrole-fluentd.yaml provides the necessary ClusterRole definition. The file clusterrolebinding-fluentd.yaml binds the ClusterRole to the ServiceAccount used to run the Fluentd DaemonSet. |
| 145 | +
|
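For orientation, the three RBAC objects could be written roughly as follows. The object name `fluentd` and the namespace come from this guide; the verbs `get`, `list` and `watch` are an assumption about what read-only access to pods and namespaces requires, and the actual files may grant more or less.

```shell
# Print illustrative ServiceAccount, ClusterRole and ClusterRoleBinding
# manifests for Fluentd as one multi-document YAML stream.
# ASSUMPTION: read-only verbs (get, list, watch) on pods and namespaces.
render_fluentd_rbac() {
  cat <<EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-logging
EOF
}
```

A ClusterRole is used rather than a namespaced Role because Fluentd has to read pod metadata in every namespace, not only in kube-logging.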
| 146 | +` kubectl create -f sa-fluentd.yaml -f clusterrole-fluentd.yaml -f clusterrolebinding-fluentd.yaml ` |
| 147 | + |
| 148 | +The output should be as below: |
| 149 | + |
| 150 | +~~~ |
| 151 | +kubectl create -f sa-fluentd.yaml -f clusterrole-fluentd.yaml -f clusterrolebinding-fluentd.yaml |
| 152 | +serviceaccount/fluentd created |
| 153 | +clusterrole.rbac.authorization.k8s.io/fluentd created |
| 154 | +clusterrolebinding.rbac.authorization.k8s.io/fluentd created |
| 155 | +~~~ |
| 156 | + |
| 157 | +> Deploy the Fluentd DaemonSet: |
| 158 | +
|
| 159 | +` kubectl create -f fluentd_daemonset.yaml ` |
| 160 | + |
| 161 | +> The pods in the kube-logging namespace should now look as below: |
| 162 | +~~~ |
| 163 | +kubectl get pods -n kube-logging |
| 164 | +NAME READY STATUS RESTARTS AGE |
| 165 | +es-cluster-0 1/1 Running 0 16m |
| 166 | +es-cluster-1 1/1 Running 0 16m |
| 167 | +es-cluster-2 1/1 Running 0 15m |
| 168 | +fluentd-dcstb 1/1 Running 0 20s |
| 169 | +fluentd-kqmcd 1/1 Running 0 20s |
| 170 | +fluentd-xr987 1/1 Running 0 20s |
| 171 | +kibana-bd6f49775-zmt4g 1/1 Running 0 11m |
| 172 | +~~~ |
| 173 | + |
| 174 | + |
| 175 | +* Refresh the Kibana dashboard to see whether the logstash-* indices are being created. |
| 176 | + |
| 177 | +> In the Discover section, use logstash-* as the index pattern, with the timestamp field as the time filter, to view all the logs. |
| 178 | +
|
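The same check can be made from the command line through the Elasticsearch `_cat/indices` API. The sketch below counts indices whose name starts with `logstash-` in that API's default column layout, where the index name is the third column; the function name is made up for this example.

```shell
# Count logstash-* indices in _cat/indices output read from standard input.
# Relies on the default column order: health, status, index, ...
count_logstash_indices() {
  awk '$3 ~ /^logstash-/ { n++ } END { print n + 0 }'
}
```

With the port-forward to es-cluster-0 still running, `curl -s 'http://localhost:9200/_cat/indices?v' | count_logstash_indices` prints a number greater than zero once Fluentd has started shipping logs.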
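| 179 | +* Cleanup. Deleting the namespace removes the namespaced objects; the PersistentVolumes, ClusterRole and ClusterRoleBinding are cluster-scoped and have to be deleted separately (for example `kubectl delete -f pv.yaml`). |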
| 180 | + |
| 181 | +` kubectl delete ns kube-logging` |
| 182 | + |
| 183 | + |
| 184 | + |
| 185 | + |
| 186 | + |
| 187 | + |
| 188 | + |
| 189 | + |
| 190 | + |
| 191 | + |
| 192 | + |
| 193 | + |
| 194 | + |
| 195 | + |
| 196 | + |