"This website is not affiliated with Splunk, Inc. and is not an authorized seller of Splunk products or services."
  • Home - Splunk Tutorial
  • Splunk training videos
  • Splunk interview questions
  • Contact US
  • About Us
  • Privacy Policy
  • Splunk Jobs

Implementing a Splunk Indexer Cluster:

What is a cluster?
A computer cluster consists of a set of loosely or tightly connected computers that work together so that, in many respects, they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.

What is an indexer cluster in Splunk?

An indexer cluster is a specially configured group of Splunk Enterprise indexers that replicate external data, so that they maintain multiple copies of it. Indexer clusters promote high availability and disaster recovery, and they feature automatic failover from one indexer to the next: if one or more indexers fail, incoming data continues to get indexed and indexed data continues to be searchable.

Why implement an indexer cluster?

  •  Data availability - An indexer is always available to handle incoming data, and the indexed data is available for searching.
  •  Data fidelity - You never lose any data: the data sent to the cluster is exactly the data that gets stored in the cluster, and a search can later access it.
  •  Data recovery - Your system can tolerate downed indexers without losing data or losing access to data.
  •  Disaster recovery - With multisite clustering, your system can tolerate the failure of an entire data center.
  •  Search affinity - With multisite clustering, search heads can access the entire set of data through their local sites, greatly reducing long-distance network traffic.
 
What are the components of an indexer cluster?
  1. The master node manages the cluster. It coordinates the replicating activities of the peer nodes and tells the search head where to find data. It also helps manage the configuration of peer nodes and orchestrates remedial activities if a peer goes down.
  2. The peer nodes receive and index incoming data, just like non-clustered, stand-alone indexers. Unlike stand-alone indexers, however, peer nodes also replicate data from other nodes in the cluster. A peer node can index its own incoming data while simultaneously storing copies of data from other nodes. You must have at least as many peer nodes as the replication factor; that is, to support a replication factor of 3, you need a minimum of three peer nodes.
  3. The search head runs searches across the set of peer nodes. You must use a search head to manage searches across indexer clusters.
[Diagram: a basic, single-site indexer cluster, containing three peer nodes and supporting a replication factor of 3.]

Prerequisites for Splunk indexer cluster implementation:

These are the main issues to note:
  •  Each cluster node (master, peer, or search head) must reside on a separate Splunk Enterprise instance.
  •  Each node instance must run the same Splunk Enterprise version.
  •  Each node instance must run on a separate machine or virtual machine, and each machine must be running the same operating system.
  •  All nodes must be connected over a network.
For example, to deploy a cluster consisting of three peers, one master, and one search head, you need five Splunk Enterprise instances running on five machines connected over a network. All instances must be at the same Splunk Enterprise version level (for example, 5.0.3), and all machines must be running the same operating system.
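A quick way to confirm the version requirement (a minimal sketch; $SPLUNK_HOME is assumed to be the installation directory, for example /opt/splunk) is to run the version command from the Splunk CLI on every machine and compare the output:

-------------------------------------------------
# Run on each of the five machines (master, peers, and search head)
cd $SPLUNK_HOME/bin
./splunk version
-------------------------------------------------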
These are some additional issues to be aware of:
  •  Compared to a non-clustered deployment, clusters require more storage, to accommodate the multiple copies of data.
  •  Index replication, in and of itself, does not increase your licensing needs.
  •  You cannot use a deployment server to distribute updates to peers.
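Because a deployment server cannot be used, peer configurations are instead pushed from the master using the configuration bundle method. A minimal sketch, assuming a default installation layout and an admin login (credentials are placeholders):

-------------------------------------------------
# On the master node, place any apps or configuration intended for the
# peers under $SPLUNK_HOME/etc/master-apps/, then push the bundle:
cd $SPLUNK_HOME/bin
./splunk apply cluster-bundle --answer-yes -auth admin:changeme
-------------------------------------------------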
 
Reference hardware:

When sizing your Splunk Enterprise environment's hardware needs, a reference machine helps you understand when it is time to scale and distribute the deployment. Following is an example of such a machine; refer to this configuration as the standard for the remainder of this section.
The reference machine described below produces the following index and search performance metrics for a given sample of data:
Indexing performance
  •  Up to 20 megabytes per second (1700 GB per day) of raw indexing performance, provided no other Splunk activity is occurring.
Search performance
  •  Up to 50,000 events per second for dense searches
  •  Up to 5,000 events per second for sparse searches
  •  Up to 2 seconds per index bucket for super-sparse searches
  •  From 10 to 50 buckets per second for rare searches with bloom filters
To find out more about the types of searches and how they affect Splunk Enterprise performance, read "How search types affect Splunk Enterprise performance" in the Splunk documentation.
Bare-metal hardware
  •  Intel x86 64-bit chip architecture
  •  2 CPUs, 6 cores per CPU (12 cores total), at least 2 GHz per core
  •  12 GB RAM
  •  Standard 1 Gb Ethernet NIC, optional second NIC for a management network
  •  Standard 64-bit Linux or Windows distribution
Disk subsystem
The reference computer's disk subsystem should be capable of handling a high number of average Input/Output Operations Per Second (IOPS).
IOPS measure how many read and write operations a hard drive can perform per second. Because a hard drive reads and writes at different speeds, there are separate IOPS numbers for disk reads and writes; the average IOPS is the blend of those two figures.
The more average IOPS a hard drive can produce, the more data it can index and search in a given period of time. While many variables factor into the number of IOPS a hard drive can produce, the three most important elements are:
  •  its rotational speed (in revolutions per minute)
  •  its average latency (the amount of time it takes to spin its platters half a rotation)
  •  its average seek time (the amount of time it takes to retrieve a requested block of data)
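To get a rough idea of the IOPS your index volume can actually deliver, a disk benchmark can help. The sketch below assumes the open-source fio tool is installed and that Splunk indexes will live under /opt/splunk/var/lib/splunk (adjust the directory to your own layout):

-------------------------------------------------
# Random 4 KB read/write test for 60 seconds against the index volume
fio --name=splunk-iops-check --directory=/opt/splunk/var/lib/splunk \
    --rw=randrw --bs=4k --size=1g --direct=1 --ioengine=libaio \
    --iodepth=32 --runtime=60 --time_based --group_reporting
-------------------------------------------------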
 
Implementing the Indexer Cluster:
  1. Be ready with the prerequisites mentioned above.
  2. Decide on your replication factor and the number of instances you are going to deploy.
  3. Install a Splunk Enterprise instance on every host that will be part of the indexer cluster.
  4. Enable indexer cluster master mode.
  5. Configure the master with server.conf.
 
The following example shows the basic settings that you must configure when enabling a master node. The configuration attributes correspond to fields on the Enable clustering page of Splunk Web.
 
-------------------------------------------------
[clustering]
mode = master
replication_factor = 4
search_factor = 3
pass4SymmKey = whatever
 -------------------------------------------------
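If you prefer the CLI over editing server.conf directly, the same settings can be applied with the splunk edit cluster-config command (a sketch; the secret and admin credentials are placeholders, and a restart is required for the change to take effect):

-------------------------------------------------
# Run on the master node; mirrors the server.conf example above.
# Note: replication_factor must not exceed the number of peer nodes.
./splunk edit cluster-config -mode master -replication_factor 4 -search_factor 3 -secret whatever -auth admin:changeme
./splunk restart
-------------------------------------------------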
  6. Configure the peer nodes with server.conf.
The following example shows the basic settings that you must configure when enabling a peer node. The configuration attributes shown here correspond to fields on the Enable clustering page of Splunk Web.

-----------------------------------------------------------------------------
[replication_port://9887]
 
[clustering]
master_uri = https://10.152.31.202:8089
mode = slave
pass4SymmKey = whatever
 
 -----------------------------------------------------------
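The equivalent CLI sketch for a peer (again, the secret and credentials are placeholders; restart afterwards):

-------------------------------------------------
# Run on each peer node; mirrors the server.conf example above.
./splunk edit cluster-config -mode slave -master_uri https://10.152.31.202:8089 -replication_port 9887 -secret whatever -auth admin:changeme
./splunk restart
-------------------------------------------------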
  7. Enable the cluster search head using server.conf.
The following example shows the basic settings that you must configure when enabling a search head node. The configuration attributes shown here correspond to fields on the Enable clustering page of Splunk Web.

----------------------------------------------------------------------
[clustering]
master_uri = https://10.152.31.202:8089
mode = searchhead
pass4SymmKey = whatever
------------------------------------------------------------------------
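And the equivalent CLI sketch for the search head (placeholders as above; restart afterwards):

-------------------------------------------------
# Run on the search head; mirrors the server.conf example above.
./splunk edit cluster-config -mode searchhead -master_uri https://10.152.31.202:8089 -secret whatever -auth admin:changeme
./splunk restart
-------------------------------------------------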
  8. Add all your indexers as search peers on the search head.
  9. Point your forwarders to forward data to your indexer cluster (see the sketch below).
  10. Complete the peer node configuration. If you use only the default set of indexes and default configurations, you can start replicating data right away.
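As a sketch of the forwarding step, enable a receiving port on each peer node and point each forwarder's outputs.conf at the peers. The port (9997) and the peer IP addresses below are placeholders; substitute your own:

-------------------------------------------------
# inputs.conf on each peer node: enable a receiving port
[splunktcp://9997]
disabled = 0

# outputs.conf on each forwarder: send data to the peer nodes
[tcpout]
defaultGroup = indexer_cluster_peers

[tcpout:indexer_cluster_peers]
server = 10.152.31.203:9997,10.152.31.204:9997,10.152.31.205:9997
-------------------------------------------------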
Your indexer cluster is ready.
