3 Splunk: an operational intelligence platform for every kind of IT equipment in the enterprise
- No need to predefine data fields (schema), no custom connectors, no database, and no need to pre-filter data
- Collects data from the data center and from devices beyond it

Example data sources:
- Click-stream data, shopping cart data, online transaction data
- Manufacturing and logistics data, CDRs & IPDRs, power consumption, RFID data, GPS data
- Log files, configs, messages, traps, alerts, metrics, scripts, changes, tickets

Platforms and what Splunk reads from each:
- Windows: registry, event logs, file system, sysinternals
- Linux/Unix: configurations, syslog, file system, ps, iostat, top
- Virtual & cloud: hypervisor, guest OS, guest apps, cloud services
- Applications: web logs, Log4J, JMS, JMX, .NET events, code and scripts
- Databases: configurations, audit/query logs, tables, schemas
- Networking: configurations, syslog, SNMP, netflow

Splunk is a data engine for your machine data. It gives you real-time visibility and intelligence into what is happening across your IT infrastructure, whether it is physical, virtual, or in the cloud. Everybody now recognizes the value of this data; the problem up to now has been getting to it. At Splunk we applied the search-engine paradigm to rapidly harnessing any and all machine data, wherever it originates. The "no predefined schema" design means you can point Splunk at any of your data, regardless of format, source, or location. There is no need to build custom parsers or connectors, there is no traditional RDBMS, and there is no need to filter and forward. Here we see just a sample of the kinds of data Splunk can "eat".

A reminder of why machine data is a big deal: it holds a categorical record of user transactions, customer behavior, machine behavior, security threats, and fraudulent activity. A single user transaction can span many systems and sources of this data, and a single service relies on many underlying systems. Splunk gives you one place to search, report on, analyze, and visualize all of it.
4 The Splunk product has four main components: Search Head, Indexer, Forwarder, Deployment Server
Today's introductory course covers the architecture used when installing the software: Search Head, Indexing Server, log files, and Splunk Forwarders.

Data is spread out everywhere, and getting it all to one place is often harder than expected. Splunk makes that job easy, with both agentless data gathering and Splunk forwarders. Splunk forwarders collect, process, and forward data to a central Splunk indexer. Forwarders can be load-balanced, are fault-tolerant, can be centrally managed by either Splunk's Deployment Server or your own configuration-management system, and come in several footprint options.
5 The first of Splunk's two main processes: splunkd
The full-text indexing and search service: it indexes all incoming data, handles queries, and returns results.
- Accesses, processes, and indexes incoming data
- Processes all search requests and returns results
- Runs a web server on port 8089 by default
- Speaks SSL by default
- Splunk helpers run as dependent processes of splunkd and run external scripts, for example:
  - Scripted inputs
  - Scripted alerts
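As an illustration of the bullets about port 8089 and SSL, the sketch below builds a Basic-auth HTTPS request against splunkd's REST endpoint /services/server/info. The hostname and credentials here are placeholders, and actually sending the request requires a running splunkd (the default self-signed certificate may also need relaxed verification):

```python
import base64
import urllib.request

# splunkd's management interface listens on https://<host>:8089 by default.
# Host and credentials below are placeholders for illustration only.
SPLUNKD = "https://localhost:8089"

def server_info_request(username="admin", password="changeme"):
    """Build an authenticated GET request for splunkd's /services/server/info."""
    creds = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{SPLUNKD}/services/server/info",
        headers={"Authorization": f"Basic {creds}"},
    )

req = server_info_request()
# urllib.request.urlopen(req)  # uncomment against a live splunkd instance
```

Note that this only demonstrates the shape of a splunkd REST call; production code would use a session token rather than Basic auth on every request.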
6 The second of Splunk's two main processes: Splunk Web
- Python-based web server, built on the CherryPy framework
- Provides both the search and management web front end for the splunkd process
- Runs on port 8000 by default
- Sets the initial login to user: admin, password: changeme
10 Using the Splunk Web interface as an administrator: configuring data inputs
- Setting up inputs in Manager is easy
- Useful for learning about inputs and their settings
- Not typically used for setting up production inputs, but can be used to create an example inputs.conf
11 Categories of data inputs (all OSes)
- Files and directories: Splunk monitors text-based log files
- Network inputs (TCP and UDP): Splunk listens on a specified port for data feeds
- Scripts: Splunk runs a script and indexes its output
- HTTP Event Collector: applications and devices send events directly over HTTP
12 HTTP Event Collector (HEC)
Supports DevOps and IoT data analysis needs at scale:
1. A standard API and logging libraries send events directly to Splunk
2. Libraries are integrated into popular platforms and services
- Scales to millions of events per second
- Serves DevOps and developers, as well as IoT devices and applications

Now you can onboard data directly from any application or device, opening up new types of machine data to the benefits of Splunk analysis. The Event Collector makes it simple and efficient to collect this data, scaling to millions of events per second, using a developer-friendly, standard HTTP/JSON API and logging libraries, with no forwarders required.

Today it is possible to send data directly to Splunk using modular inputs or a TCP connection, but this is not an efficient or scalable solution. While log files and forwarders provide an efficient mechanism for typical log and syslog files, files and forwarders are not always possible, or even desirable, as a collection method in the world of custom applications, DevOps, Docker, and other packaged application environments. The same is true for IoT event data, where devices and apps may have no local storage, and where even intermediate event-collection systems and partners would prefer a real-time interface to Splunk over creating specific log files and using forwarders.

The HTTP Event Collector uses a standard API and a high-volume Splunk endpoint so that events can be sent and collected directly at extreme velocity. The HTTP/JSON API is a developer standard whose simple but powerful functionality will be attractive to DevOps teams, custom application developers, and operations managers. Without requiring new system configuration, log creation, or administration support, developers can instrument their applications to understand usage flows, performance, error conditions, and more.

The interface and functionality are also a fit for IoT software developers who want to connect their devices either directly or via intermediate collection services. The data volumes supported by Splunk are ideal for the transactional and diagnostic data of devices such as point-of-sale systems, vending machines, gaming consoles, automobiles, and other devices and systems, opening up a new world of machine data to the benefits of Splunk analysis.
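The notes above can be made concrete with a short Python sketch that posts one JSON event to HEC's /services/collector/event endpoint. The host and token below are placeholders (HEC listens on port 8088 by default, and each client authenticates with an "Authorization: Splunk <token>" header):

```python
import json
import urllib.request

# Placeholder collector host and token; substitute your own values.
HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"

def build_hec_request(event, sourcetype="myapp:json", index=None):
    """Build a POST request carrying one event in HEC's JSON envelope."""
    payload = {"event": event, "sourcetype": sourcetype}
    if index:
        payload["index"] = index
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        HEC_URL,
        data=body,
        headers={
            "Authorization": f"Splunk {HEC_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_hec_request({"action": "purchase", "status": 200})
# urllib.request.urlopen(req)  # uncomment to actually send to a live HEC
```

In practice a batching client would keep the connection open and send many events per request; this sketch only shows the envelope and the token header.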
14 Choosing the file or directory to input => source
- Specify a file or directory for ongoing monitoring
- Or upload a copy of a file, which is useful for testing and development
15 Specifying the host for a data input => host
- Specify a constant value if all monitored files in an input are from the same host
16 Choosing the source type for a data input => sourcetype
- Sourcetype is Splunk's way of identifying the type of data
- Default and custom data processing during indexing relies heavily on sourcetype
- Also used heavily in searches, reports, dashboards, apps -- basically the rest of Splunk as well!
17 The actual data-input file: inputs.conf settings
- Each input gets its own stanza
- The first line, enclosed in square brackets [ ], sets the type of input and its location
- Subsequent lines are "attribute = value" pairs
- See $SPLUNK_HOME/etc/system/README/inputs.conf.spec for detailed syntax

Example stanzas:

[monitor:///logs/secure]
disabled = false
host_segment = 3
sourcetype = linux_secure
index = security

[monitor:///opt/tradelog.log]
disabled = 1
sourcetype = trade_entries
host = tradesrv.mycompany.com

[udp://514]
connection_host = dns
sourcetype = syslog
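Because each inputs.conf stanza is plain INI-style "attribute = value" text, a quick way to sanity-check a stanza before deploying it is to parse it with Python's standard configparser. This is a sketch for illustration only; Splunk itself uses its own layered .conf resolution, not configparser:

```python
import configparser

# The first example stanza from the slide above, as INI-style text.
STANZA = """\
[monitor:///logs/secure]
disabled = false
host_segment = 3
sourcetype = linux_secure
index = security
"""

parser = configparser.ConfigParser()
parser.read_string(STANZA)

# The bracketed line becomes the section name; the rest become key/value pairs.
secure = parser["monitor:///logs/secure"]
print(secure["sourcetype"], secure["index"])  # linux_secure security
```

A check like this catches missing brackets or malformed attribute lines before the file ever reaches a forwarder.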
44 Search examples
- Count successful purchases in the Buttercup Games store:
  sourcetype=access_* status=200 action=purchase
- Find log records where errors occurred:
  (error OR fail* OR severe) OR (status=404 OR status=500 OR status=503)
- Count how many simulation games were purchased yesterday:
  sourcetype=access_* status=200 action=purchase categoryId=simulation
46 An example of the search language
This diagram represents a search broken into its syntax components:

sourcetype=access_* status=503 | stats sum(price) as lost_revenue | fieldformat lost_revenue = "$" + tostring(lost_revenue, "commas")

- Search for this: sourcetype=access_* status=503
- PIPE: take these events and...
- COMMAND: stats (get some statistics)
- FUNCTION: sum (get a sum)
- ARGUMENT: price (get a sum of the price field)
- CLAUSE: as lost_revenue (call that sum "lost_revenue")
- PIPE: take these stats and...
- COMMAND: fieldformat (format values for the lost_revenue field)
- FUNCTION: tostring (create a string)
- ARGUMENT: format the string from the values in the lost_revenue field, inserting commas
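For readers unfamiliar with tostring(lost_revenue, "commas"), the fieldformat step does roughly what this Python formatting sketch does to a number (SPL's "commas" option may also round to two decimal places):

```python
# A sample value standing in for the sum(price) result of the search above.
lost_revenue = 1234567.89

# Python's "," format spec inserts thousands separators, roughly
# mirroring tostring(<value>, "commas") in SPL; prepend "$" as the
# fieldformat expression does.
formatted = "$" + format(lost_revenue, ",")
print(formatted)  # $1,234,567.89
```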
47 How a search is processed (narrow the scope => compute => present)

sourcetype=syslog ERROR | top user | fields - percent

- sourcetype=syslog ERROR: fetch the matching events from disk
- | top user: summarize them into a table of the top 10 users (an intermediate results table)
- | fields - percent: remove the column showing the percentage (the final results table)
48 Applying statistics to search results with advanced syntax: top (popularity ranking)

sourcetype=access_* status=200 action=purchase | top categoryId
50 stats -- count by (counting separately by field)
- The by clause returns a count for each value of the named field
- This example counts the number of events where action=purchase for each productId
- Question: how many of each product were purchased?

sourcetype=access_* action=purchase | stats count by productId
51 stats -- sparkline (generating a mini time-series chart)
- Used in conjunction with the stats and chart commands
- Creates a mini-timeline in a report
- Represents the same time span as the search, in this case "last 7 days"
- Not to be confused with timechart, which creates a standalone visualization
- Question: what is the purchase trend for each product ID over the last 7 days?

sourcetype=access* action=purchase | stats sparkline count by productId | sort -count

Note: chart and timechart are covered later in this course.
52 Using eval to compute the difference between values
- Question: how do our prices compare to the competition?
- You can perform mathematical functions against fields with numeric values
- This example compares the flower shop's price against the competitor's price: subtract the value of flowersrus_price from price
- flowersrus_price is another field, made available via a lookup!

sourcetype=access_combined product_name=*
| eval difference = price - flowersrus_price
| table product_name, price, flowersrus_price, difference
53 Are any hosts throwing a lot of errors? 圖表指令： chartThis example shows a basic chartThe count function counts the number of events for each http statusAre any hosts throwing a lot of errors?sourcetype=access_* | chart count by status
54 Time-trend analysis: timechart
- Question: what's the overall usage trend for the last 24 hours?
- This example displays the usage categories bucketed into 1-hour periods
- Splitting by the usage field, each line represents a unique value of the field
- The y-axis represents the count for each field value

sourcetype=cisco_w* | timechart count by usage