Introduction to Splunk: Start getting more from your data
The concept behind Splunk is simple: Google made it possible for users to search billions of pages of Web content, which is nothing but data, so why couldn’t Splunk do the same for machine data across data centers, the cloud, logs, networks, or even smartphones in a practical amount of time? This is exactly what Splunk does. Let me introduce you to this powerful tool and give you a general understanding of its capabilities. The best way to start understanding Splunk is to divide it into four logical functions.
1. Search Head
This is the Web server and app-interpreting engine that provides the primary, Web-based user interface. Since most of the data interpretation happens as needed at search time, the role of the search head is to translate user and app requests into actionable searches for its indexer(s) and display the results. The Splunk Web UI is highly customizable, either through Splunk’s own view and app system or by embedding Splunk searches in your own web apps via includes or Splunk’s API.
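To make the API route a little more concrete, here is a minimal Python sketch that builds, but does not send, an HTTP request to start a search job. It assumes the REST endpoint /services/search/jobs on the management port; the host name, port, credentials, and query are placeholders, and the helper build_search_request is a hypothetical name used only for this illustration.

```python
import base64
import urllib.parse
import urllib.request


def build_search_request(base_url, username, password, query):
    """Build (but do not send) a POST request that would start a search job."""
    # Assumption: the management API is reachable at base_url and exposes
    # the search/jobs endpoint. Adjust both for your own deployment.
    url = base_url.rstrip("/") + "/services/search/jobs"
    body = urllib.parse.urlencode({
        "search": "search " + query,  # ad hoc queries begin with the "search" command
        "output_mode": "json",
    }).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", "Basic " + token)
    return request


# Placeholder host, credentials, and query, for illustration only.
req = build_search_request(
    "https://splunk.example.com:8089", "admin", "changeme",
    "index=main error | head 5",
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would actually submit it. That step is left out here because it needs a live instance and valid credentials.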
2. Indexer
An indexer does two things. It accepts and processes new data, adding it to the index and compressing it. It also services search requests, looking through the data it has via its indices and returning the appropriate results to the searcher over a compressed communication channel. Indexers scale out almost limitlessly with almost no degradation in overall performance, allowing Splunk to scale from small single-instance deployments to truly massive Big Data challenges.
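The division of labor between search head and indexers can be pictured with a toy model. The Python sketch below is not how Splunk is implemented internally. It only illustrates the idea: each indexer compresses what it stores and keeps a term index, and a search head fans a query out to every indexer and merges the answers. The class names and sample events are made up for the example.

```python
import zlib
from collections import defaultdict


class ToyIndexer:
    """Stores compressed events and keeps a term -> event-id index."""

    def __init__(self):
        self.raw = []                   # compressed event payloads
        self.index = defaultdict(set)   # term -> ids of events containing it

    def add(self, event):
        event_id = len(self.raw)
        self.raw.append(zlib.compress(event.encode("utf-8")))
        for term in event.lower().split():
            self.index[term].add(event_id)

    def search(self, term):
        ids = sorted(self.index.get(term.lower(), ()))
        return [zlib.decompress(self.raw[i]).decode("utf-8") for i in ids]


class ToySearchHead:
    """Sends a search to every indexer and merges the results."""

    def __init__(self, indexers):
        self.indexers = indexers

    def search(self, term):
        results = []
        for indexer in self.indexers:
            results.extend(indexer.search(term))
        return sorted(results)


# Sample events, invented for the illustration.
a, b = ToyIndexer(), ToyIndexer()
a.add("2024-01-01 ERROR disk full")
a.add("2024-01-02 INFO backup ok")
b.add("2024-01-03 ERROR login failed")

head = ToySearchHead([a, b])
print(head.search("error"))
# ['2024-01-01 ERROR disk full', '2024-01-03 ERROR login failed']
```

Adding a third ToyIndexer to the list is all it takes to spread the data further, which is the scale-out idea in miniature.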
3. Forwarder
Splunk forwarders come in two types: the full Splunk distribution or a dedicated “Universal Forwarder.” The full distribution can be configured to filter data before transmitting it and to execute scripts locally. The universal forwarder is an ultra-lightweight agent designed to collect data in the smallest possible footprint. Both flavors of forwarder come with automatic load balancing, SSL encryption, data compression, and the ability to route data to multiple Splunk instances or third-party systems.
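As a rough illustration of what a forwarder does with each event, the Python sketch below filters, compresses, and spreads events across several targets. It is a toy, not Splunk's implementation: plain round-robin stands in for the automatic load balancing, the indexer names are hypothetical, and nothing is sent over the network.

```python
import itertools
import zlib


class ToyForwarder:
    """Filters events, compresses them, and rotates across targets."""

    def __init__(self, targets, keep=lambda event: True):
        self.targets = targets                  # hypothetical indexer names
        self._next = itertools.cycle(targets)   # round-robin stand-in for load balancing
        self.keep = keep                        # optional filter applied before sending
        self.sent = {name: [] for name in targets}

    def forward(self, event):
        if not self.keep(event):
            return None                         # filtered out, nothing is sent
        target = next(self._next)
        self.sent[target].append(zlib.compress(event.encode("utf-8")))
        return target


fwd = ToyForwarder(["idx1", "idx2"], keep=lambda e: "DEBUG" not in e)
for line in ["INFO start", "DEBUG noise", "ERROR boom", "INFO stop"]:
    fwd.forward(line)

print({name: len(batch) for name, batch in fwd.sent.items()})
# {'idx1': 2, 'idx2': 1}
```

The DEBUG line never leaves the forwarder, which is the kind of filtering before transmission described above.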
4. Deployment Server
Finally, to manage your distributed environments, there is the Deployment Server. It helps you synchronize the configuration of your search heads for distributed searching across your data sources, and the configuration of your forwarders for centrally managed data collection. Splunk has a simple flat-file configuration system, so if you already have config management tools with which you’re comfortable, you can still use them.
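One way to picture central configuration management is as a mapping from groups of machines to the configuration bundles they should receive. The Python sketch below is only an illustration under that assumption. The group names, hostname patterns, and bundle names are all hypothetical, and the matching logic is not taken from Splunk.

```python
from fnmatch import fnmatch

# Hypothetical groups: each maps hostname patterns to the bundles they receive.
GROUPS = {
    "web_forwarders": {"patterns": ["web-*"], "bundles": ["nginx_inputs"]},
    "all_forwarders": {"patterns": ["*"], "bundles": ["outputs_base"]},
}


def bundles_for(hostname, groups=GROUPS):
    """Return the config bundles a given client should receive."""
    selected = []
    for name in sorted(groups):  # fixed order keeps the result deterministic
        group = groups[name]
        if any(fnmatch(hostname, pattern) for pattern in group["patterns"]):
            selected.extend(group["bundles"])
    return selected


print(bundles_for("web-01"))  # ['outputs_base', 'nginx_inputs']
print(bundles_for("db-01"))   # ['outputs_base']
```

Because the mapping is plain data, the same idea carries over to whatever config management tool you already use.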
- Provides an engine to search, alert, monitor, and report.
- Founded in 2004 with its first software release in 2006; it currently has more than 11,000 customers in 110 countries.
- Captures, indexes, and stores data in searchable Hot, Warm, and Cold repositories that can be used to create custom reports, alerts, dashboards, and visualizations.
- Comes in several product flavors, such as Splunk Enterprise, Splunk Cloud, and Splunk Light.
- In addition, offers a range of products and plug-ins to work with cutting-edge technologies like VMware, Cisco, and Mobile Apps.
- A platform for Operational Intelligence.
- Easy to start: Hassle-free installations and configurations.
- Range of products: Handy Apps and plug-ins offer plug and play with the same configuration and operational guidelines.
- Minimal operational overheads: Easy to maintain and troubleshoot.
- Learning: Configurations, searching, monitoring, and creating reports.
- Help: Product features are self-explanatory and easy to build.
- Search Head: Distributed agent for search. A Splunk instance that sends the search to several indexers and merges the results for the user.
- Indexer: Centralized data store. A Splunk instance that can be configured to receive log data feeds from the forwarders.
- Forwarders: Agents that collect logs. Splunk instances that can be deployed on event source systems to collect the logs.
- Deployment Server: Centralized configuration manager. A Splunk instance that manages the configurations of the different components within the overall deployment.
- Conf files: Splunk depends on conf files. All configuration goes into conf files.
- Index: Pointer to all data events. Indexing tags each event with fields.
- Source type: Format of the data input. It determines how Splunk formats event data.
- Source: Name of the input, such as the file from which the data is generated.
- Host: Domain/Host/IP from which the event originates.
- Events of the same source type can come from different sources and hosts.
- Events from different source types can go to the same Index.
- Defining source type is very important as source type plays a key role in searching.
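The relationships among host, source, source type, and index are easier to see with a few sample events. The Python sketch below is purely illustrative: the Event class, the select helper, and every value in the sample data are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    raw: str         # the event text itself
    host: str        # machine the event originates from
    source: str      # input it was read from, here a file path
    sourcetype: str  # format of the data
    index: str       # index the event is stored in


# Sample data, invented for the illustration.
events = [
    Event("GET /home 200", "web-01", "/var/log/nginx/access.log", "access_log", "web"),
    Event("GET /cart 500", "web-02", "/var/log/nginx/access.log", "access_log", "web"),
    Event("disk full", "web-01", "/var/log/messages", "syslog", "web"),
]


def select(events, **criteria):
    """Return events whose metadata fields equal all given criteria."""
    return [e for e in events
            if all(getattr(e, key) == value for key, value in criteria.items())]


# One source type, several hosts.
print(sorted(e.host for e in select(events, sourcetype="access_log")))
# ['web-01', 'web-02']

# One index, several source types.
print(sorted({e.sourcetype for e in select(events, index="web")}))
# ['access_log', 'syslog']
```

Narrowing by source type, as in the first query, is what makes a well-chosen source type so useful at search time.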
How It Works
- Install: Splunk provides OS-specific installation packages in a compressed format. Unzip to explore and run the installation.
- Getting data into Splunk: Forwarders, inputs configuration.
- Manage Splunk licenses: A license specifies and controls how much data a Splunk instance can index.
- Manage data inputs: Define Index, source types, and Indexers.
- Scale Splunk: Distributed architecture and distributed searches, using search heads and Deployment Server configuration.
- Secure Splunk: Role-based access [LDAP], Secure Authentication and Encryption [SSL].
- Troubleshoot: Splunk logs, found under the Splunk home directory in var/log/splunk/, and support teams from Splunk.
- Enterprise: Enables all features, including distributed search and authentication.
- Free License: Limited index volume—disables authentication.
- Beta License: Limited to Beta Versions—enables enterprise features.
Splunk can be configured by any of these methods:
- Splunk Web: Configuration through the Splunk URL. By default it runs on port 8000 of the host where Splunk is installed.
- Edit conf files: Located under the Splunk home directory in etc/system.
- Splunk CLI: The command line interface provides many options, which you can list with ./splunk help.
- Set up screens for an app: Allows users to configure for the app without touching config files.
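Because conf files are flat text with stanzas and key = value settings, they are easy to inspect with ordinary tools. The Python sketch below parses two small conf-style texts and layers them so that a local setting overrides a default one. The stanza contents are sample values, load_layers is a hypothetical helper, and the simple last-one-wins merge is an assumption for illustration, not a full model of how Splunk resolves settings.

```python
import configparser

# Sample conf-style text, invented for the illustration.
DEFAULT_CONF = """
[monitor:///var/log/messages]
sourcetype = syslog
index = main
"""

LOCAL_CONF = """
[monitor:///var/log/messages]
index = os_logs
"""


def load_layers(*texts):
    """Parse conf-style texts in order; later layers override earlier ones."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep setting names exactly as written
    for text in texts:
        parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


settings = load_layers(DEFAULT_CONF, LOCAL_CONF)
print(settings["monitor:///var/log/messages"])
# {'sourcetype': 'syslog', 'index': 'os_logs'}
```

The local layer changes only the index and leaves the source type from the default layer in place.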
Choose among forwarders as per need:
- Universal: Streamlined Version—dedicated Splunk instance with essential components to forward data.
- Heavy: Full version—Splunk enterprise instance with some features disabled.
- Light: Splunk enterprise instance with most of the features disabled.