Splunk Beginner's Introductory Course. Jerry Lee, RSTN, Taiwan, jerry.lee@rstn.com.tw (2015 edition, v6.3)
Course outline
Splunk basic architecture
Key Splunk terminology
Hands-on practice
Download and installation
Data Input
Introduction to the Search App
Basic search syntax (full-text search, exact field search)
Saving a Report
Lookup Table
Advanced search syntax (top, stats, chart, timechart, eval)
Dashboard
Map
Data Model and Pivot Analysis
Splunk: the operational intelligence platform for all of an enterprise's IT machines and devices
No need to define data fields in advance, no custom connectors, no database, and no need to filter data beforehand.
Customer-facing data: Click-stream data, Shopping cart data, Online transaction data
Data from other devices outside the data center: Manufacturing, logistics…, CDRs & IPDRs, Power consumption, RFID data, GPS data
Logfiles, Configs, Messages, Traps, Alerts, Metrics, Scripts, Changes, Tickets
Windows: Registry, Event logs, File system, sysinternals
Linux/Unix: Configurations, syslog, File system, ps, iostat, top
Virtual & Cloud: Hypervisor, Guest OS, Apps, Cloud
Applications: Web logs, Log4J, JMS, JMX, .NET events, Code and scripts
Databases: Configurations, Audit/query logs, Tables, Schemas
Networking: Configurations, syslog, SNMP, netflow
Splunk is a data engine for your machine data. It gives you real-time visibility and intelligence into what's happening across your IT infrastructure, whether it's physical, virtual or in the cloud. Everybody now recognizes the value of this data; the problem up to now has been getting to it. At Splunk we applied the search engine paradigm to rapidly harness any and all machine data wherever it originates. The "no predefined schema" design means you can point Splunk at any of your data, regardless of format, source or location. There is no need to build custom parsers or connectors, there's no traditional RDBMS, and there's no need to filter and forward. Here we see just a sample of the kinds of data Splunk can 'eat'.
Reminder: what's the 'big deal' about machine data? It holds a categorical record of user transactions, customer behavior, machine behavior, security threats, and fraudulent activity. A single user transaction can span many systems and sources of this data, and a single service can rely on many underlying systems. Splunk gives you one place to search, report on, analyze and visualize all this data.
The Splunk product has four main components: Search Head, Indexer, Forwarder, Deployment Server
Architecture of the software installed in today's introductory course: Search Head, Indexing Server, Log File
Splunk Forwarders: Data is spread out everywhere, and getting it all to one place is often harder than expected. Splunk makes that job easy, with both agent-less data gathering and Splunk forwarders. Splunk forwarders collect, process, and forward data to a central Splunk indexer. Forwarders can be load-balanced, are fault tolerant, are centrally managed by either Splunk's Deployment Server or your own config management system, and come in several footprint options.
Splunk's two main processes, part 1: splunkd
The full-text search service: it handles queries, returns results, and indexes all incoming data.
Accesses, processes, and indexes incoming data
Processes all search requests and returns results
Runs a web server on port 8089 by default
Speaks SSL by default
Splunk helpers run as dependent process(es) of splunkd
Splunk helpers run outside scripts, for example: scripted inputs, scripted alerts
Splunk's two main processes, part 2: Splunk Web
Python-based web server, based on the CherryPy framework
Provides both the search and the management web front end for the splunkd process
Runs on port 8000 by default
Sets the initial login to user: admin, password: changeme
Directory structure of the Splunk product
$SPLUNK_HOME
  bin (executables)
  etc (licenses, configuration)
    system
    apps (installed Apps)
      search
      launcher
      <proprietary app>
    users
  var
    lib
      splunk (the indexes built by full-text processing)
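Key Splunk terminology
Data Input: how data gets in (for example: files, TCP, UDP, WMI, Script, Forwarder, Stream, API…)
Source type: the category of the data (for example: apache logs, security log, network log, sensor log…)
Host: the host the data comes from (for example: apache1, apache2, apserver1, firewall1, 10.1.1.2, …)
Source: where the data comes from (for example: /opt/apache/log/*.*, udp:514, /bin/current_status.sh)
Field: a value extracted from the data with a regular expression
Search Language: the search syntax (concept: narrow the scope -> compute -> present the results)
Saved Search: search criteria stored so that they can be reused directly next time
Alert: a warning raised (in real time or on a schedule) when a specific keyword is found or a statistic reaches a configured threshold
Report: a graphical report produced from the results of a saved search
Dashboard: a collection of panels that brings different reports together
Data Model: a data structure that virtualizes machine data
Pivot Analysis: lets ordinary users build reports and dashboards by drag and drop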
Using Manager in the Splunk Web interface: configuring Data Inputs
Setting up inputs in Manager is easy
Useful for learning inputs and their settings
Not typically used for setting production inputs, but can be used to create an example inputs.conf
Types of data input (all OSs)
Files & directories - Splunk monitors text-based log files
Network inputs (TCP and UDP) - Splunk listens on a specified port for data feeds
Scripts - Splunk runs a script and indexes the output
HTTP Event Collector
HTTP Event Collector
Supports DevOps and IoT data analysis needs at scale
1. Standard API and logging libraries send events directly to Splunk
2. Libraries integrated into popular platforms and services
Scales to millions of events per second
DevOps & developers; IoT devices & applications
Now you can onboard data directly from any application or device, opening up new types of machine data to the benefits of Splunk analysis. The new Event Collector makes it simple and efficient to collect this data, scaling to millions of events per second, using a developer-friendly, standard HTTP/JSON API and logging libraries. And no forwarders.
Today it is possible to send data directly to Splunk using Modular Inputs or a TCP connection; however, this is not an efficient or scalable solution. Log files and forwarders provide an efficient mechanism for typical log and syslog files, but they are not always a possible or desirable collection method for custom applications, DevOps, Docker, and other packaged application environments. The same is true for IoT event data, where devices and apps may have no local storage, and where even intermediate event collection systems and partners would prefer a real-time interface to Splunk over creating specific log files and using forwarders.
The HTTP Event Collector (EC) uses a standard API and a high-volume Splunk endpoint so that events can be sent and collected directly at extreme velocity. The HTTP/JSON API is a developer standard whose simple but powerful functionality will be attractive to DevOps teams, custom application developers, and operations managers. Without requiring new system configuration, log creation or administration support, developers can instrument their applications to understand usage flows, performance, error conditions and more.
The interface is also a fit for IoT software developers who connect their devices either directly or via intermediate collection services. The data volumes supported by Splunk are ideal for the transactional and diagnostic data of devices such as point-of-sale systems, vending machines, gaming consoles, automobiles and other devices and systems, opening up a new world of machine data to the benefits of Splunk analysis.
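As a rough illustration of the HTTP/JSON interface described above, the following Python sketch builds (but does not send) a request for the Event Collector. It assumes the commonly documented endpoint path /services/collector/event, port 8088, and an "Authorization: Splunk <token>" header; the host name, token, and event fields are hypothetical placeholders, so check them against your own deployment.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype=None):
    """Build (but do not send) an HTTP Event Collector request."""
    payload = {"event": event}
    if sourcetype is not None:
        payload["sourcetype"] = sourcetype
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Splunk " + token,  # assumed token header scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical host, token, and event, for illustration only.
req = build_hec_request(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"action": "purchase", "productId": "demo-1"},
    sourcetype="_json",
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the sketch runnable without a Splunk server.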
Specifying files and directories for data input: add new input, edit existing input
Choosing the file or directory location to input => Source. Specify a file or directory for ongoing monitoring, or upload a copy of a file (useful for testing and development).
Choosing the host for the data input => host. Specify a constant value if all monitored files in an input are from the same host.
Choosing the source type for the data input => sourcetype. Sourcetype is Splunk's way of identifying the type of data. Default and custom data processing during indexing relies heavily on sourcetype. It is also used heavily in searches, reports, dashboards, and Apps: basically the rest of Splunk as well!
The actual data input file: inputs.conf settings
Each input gets its own stanza
The first line, enclosed in square brackets [ ], sets the type of input and its location
Subsequent lines are "attribute = value"
See $SPLUNK_HOME/etc/system/README/inputs.conf.spec for detailed syntax

[monitor:///logs/secure]
disabled = false
host_segment = 3
sourcetype = linux_secure
index = security

[monitor:///opt/tradelog.log]
disabled = 1
sourcetype = trade_entries
host = tradesrv.mycompany.com

[udp://514]
connection_host = dns
sourcetype = syslog
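To make the stanza layout concrete, here is a minimal Python sketch that reads the three stanzas from this slide with the standard-library configparser and splits each header into input type and location. This is only an illustration of the "[type://location]" plus "attribute = value" shape; it is not how Splunk itself parses the file, and real .conf files can contain constructs that configparser rejects.

```python
import configparser

SAMPLE = """
[monitor:///logs/secure]
disabled = false
host_segment = 3
sourcetype = linux_secure
index = security

[monitor:///opt/tradelog.log]
disabled = 1
sourcetype = trade_entries
host = tradesrv.mycompany.com

[udp://514]
connection_host = dns
sourcetype = syslog
"""


def summarize_inputs(text):
    """Return one summary dict per stanza: type, location, sourcetype, disabled."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(text)
    rows = []
    for stanza in parser.sections():
        # "monitor:///logs/secure" -> ("monitor", "/logs/secure")
        kind, _, location = stanza.partition("://")
        rows.append({
            "type": kind,
            "location": location,
            "sourcetype": parser[stanza].get("sourcetype"),
            # configparser accepts both "false" and "1" as boolean spellings
            "disabled": parser[stanza].getboolean("disabled", fallback=False),
        })
    return rows


inputs = summarize_inputs(SAMPLE)
for row in inputs:
    print(row)
```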
Source types recognized by default; others can be added by downloading Apps or by defining your own. http://docs.splunk.com/Documentation/Splunk/latest/Data/Listofpretrainedsourcetypes
The Splunk home page after login
The basic search app (Search App)
The Settings management interface: Knowledge, Data, System administration, Access control
Splunkbase has 800 Apps that can be downloaded and installed for free. Popular downloads: Splunk App for Windows, Splunk for Unix and Linux, DB Connect, Splunk for Cisco Firewall, Splunk for F5, Splunk for Nagios, Splunk for Web Intelligence… http://apps.splunk.com
For frequently asked questions, search or post on Splunkbase Answers: http://answers.splunk.com/
Learn advanced techniques from more Splunk ninjas: http://wiki.splunk.com/ http://blogs.splunk.com/
Operating system and browser requirements for installation
Splunk works on Windows, Linux, Solaris, FreeBSD, MacOS X, AIX, and HP-UX
Firefox 3, 4, and 8; IE 7, 8, and 9; latest Safari and Chrome
docs.splunk.com/Documentation/Splunk/latest/Installation/Systemrequirements
Splunk is a free download (register an account first; you can download after logging in)
Download Splunk from www.splunk.com/download (login required)
Make sure you get the right version for your platform
You might be able to install the wrong version, but it won't run
Hands-on materials
Today's materials (please get them from the USB drive): the hands-on slide deck (PDF file)
Detailed introductory tutorials:
English, Search Tutorial: http://docs.splunk.com/Documentation/Splunk/latest/SearchTutorial/WelcometotheSearchTutorial
English, Data Model and Pivot Tutorial: http://docs.splunk.com/Documentation/Splunk/latest/PivotTutorial/WelcometothePivotTutorial
Chinese: http://docs.splunk.com/Documentation/Splunk/6.2.0/Translated/TraditionalChinesemanuals
Sample data
Sample data: http://docs.splunk.com/images/Tutorial/tutorialdata.zip
Contents: Apache 1 Log, Apache 2 Log, Apache 3 Log, Mail Servers, Vendor Sales
Unzip it into a directory of your choice on your computer and browse through it.
Online lookup table: http://docs.splunk.com/images/d/db/Prices.csv.zip
A table that maps Product ID to product name and price.
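Adding a data input (1)
Adding a data input (2)
Getting started with the Splunk 6 user interface
Getting started with the Search & Reporting app
The Data Summary explained
How search results are presented
Time range picker and search mode: choose from presets, relative, real-time, date range, advanced, and so on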
Search by keyword; keywords can be combined with OR and NOT, and you can click the timeline to narrow the time range. Example: buttercupgames (error OR fail* OR severe)
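The Boolean logic of the example search can be sketched in Python as follows. This is a simplified model that assumes events are split into alphanumeric tokens, matched case-insensitively, with * as a trailing wildcard; Splunk's real segmentation and matching rules are richer, and the sample events are invented for illustration.

```python
import re
from fnmatch import fnmatchcase


def tokens(event):
    """Split raw event text into lower-case alphanumeric tokens (a simplification)."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9_]+", event)]


def has_term(event_tokens, pattern):
    """True if any token matches the term; '*' acts as a wildcard."""
    return any(fnmatchcase(tok, pattern.lower()) for tok in event_tokens)


def matches(event):
    """buttercupgames (error OR fail* OR severe): adjacent terms are ANDed."""
    toks = tokens(event)
    return has_term(toks, "buttercupgames") and any(
        has_term(toks, term) for term in ("error", "fail*", "severe")
    )


# Invented sample events, for illustration only.
sample_events = [
    "GET http://www.buttercupgames.com/cart.do status=500 Internal Server Error",
    "sshd: Failed password for root",
    "GET http://www.buttercupgames.com/product.screen status=200",
    "www.buttercupgames.com checkout failed",
]
hits = [e for e in sample_events if matches(e)]
print(hits)
```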
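Result tabs: Events, Patterns, Statistics, Visualization. 1. Event results of the search 2. Statistical analysis 3. Visualization
Result tabs: Patterns => performs a preliminary pattern analysis of the data
Search for: sourcetype="access_*"
Selecting which fields to display
Selecting which fields to display
Viewing the field summary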
Search examples
Find the number of successful purchases at the Buttercup Games store: sourcetype=access_* status=200 action=purchase
Find the events where errors occurred: (error OR fail* OR severe) OR (status=404 OR status=500 OR status=503)
Find how many simulation games were purchased yesterday: sourcetype=access_* status=200 action=purchase categoryId=simulation
Use the pipe character | to post-process the data
Search language example
This diagram represents a search, broken into its syntax components:
sourcetype=access_* status=503 | stats sum(price) as lost_revenue | fieldformat lost_revenue = "$" + tostring(lost_revenue, "commas")
Search terms (sourcetype=access_* status=503): search for this
PIPE: take these events and…
COMMAND (stats): get some stats
FUNCTION (sum): get a sum
ARGUMENT (price): get a sum of the price field
CLAUSE (as lost_revenue): call that sum "lost_revenue"
PIPE: take these stats and…
COMMAND (fieldformat): format values for the lost_revenue field
FUNCTION (tostring): create a string
ARGUMENT (lost_revenue, "commas"): format the string from values in the lost_revenue field, insert commas
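The same three stages can be imitated on a handful of in-memory events. The Python sketch below is only an analogy for what each part of the search does (filter, then sum, then format for display); the events and prices are made up, and the two-decimal formatting is an assumption about how the "commas" option renders decimals.

```python
from fnmatch import fnmatchcase

# Made-up events, for illustration only.
events = [
    {"sourcetype": "access_combined", "status": "503", "price": 499.5},
    {"sourcetype": "access_combined", "status": "503", "price": 725.25},
    {"sourcetype": "access_combined", "status": "200", "price": 100.0},
    {"sourcetype": "syslog", "status": "503", "price": 50.0},
]

# Search terms: sourcetype=access_* status=503
matched = [
    e for e in events
    if fnmatchcase(e["sourcetype"], "access_*") and e["status"] == "503"
]

# | stats sum(price) as lost_revenue
stats = {"lost_revenue": sum(e["price"] for e in matched)}

# | fieldformat lost_revenue = "$" + tostring(lost_revenue, "commas")
# Only the displayed string changes; the numeric value in `stats` is kept.
display = "$" + format(stats["lost_revenue"], ",.2f")
print(display)  # $1,224.75
```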
How a search is processed (narrow the scope => compute => present)
sourcetype=syslog ERROR | top user | fields - percent
1. sourcetype=syslog ERROR: fetch events from disk that match (Disk -> intermediate results table)
2. top user: summarize into a table of the top 10 users (-> intermediate results table)
3. fields - percent: remove the column showing percentage (-> final results table)
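A small Python sketch can mirror how each pipe hands a results table to the next command. The helper names (fetch, top, fields_minus) and the sample events are hypothetical; the sketch only imitates the flow of intermediate tables, not Splunk's implementation.

```python
from collections import Counter


def fetch(events, sourcetype, keyword):
    """Stage 1: keep only the events that match the search terms."""
    return [
        e for e in events
        if e["sourcetype"] == sourcetype and keyword in e["_raw"]
    ]


def top(rows, field, limit=10):
    """Stage 2: summarize into the most common values with count and percent."""
    counts = Counter(r[field] for r in rows if field in r)
    total = sum(counts.values())
    return [
        {field: value, "count": n, "percent": 100.0 * n / total}
        for value, n in counts.most_common(limit)
    ]


def fields_minus(rows, name):
    """Stage 3: drop one column from every row."""
    return [{k: v for k, v in r.items() if k != name} for r in rows]


# Made-up events, for illustration only.
events = [
    {"sourcetype": "syslog", "_raw": "ERROR disk full", "user": "alice"},
    {"sourcetype": "syslog", "_raw": "ERROR timeout", "user": "alice"},
    {"sourcetype": "syslog", "_raw": "ERROR timeout", "user": "bob"},
    {"sourcetype": "syslog", "_raw": "ERROR denied", "user": "alice"},
    {"sourcetype": "syslog", "_raw": "INFO login ok", "user": "bob"},
    {"sourcetype": "access_combined", "_raw": "ERROR 500", "user": "carol"},
]

final = fields_minus(top(fetch(events, "syslog", "ERROR"), "user"), "percent")
print(final)
```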
Run statistics on the search results with advanced syntax. top (ranking of the most common values): sourcetype=access_* status=200 action=purchase | top categoryId
Choose the visualization style
stats – count by (counts per field value)
The by clause returns a count for each field value of a named field
This example counts the number of events when action=purchase for each productId
How many of each product were purchased?
sourcetype=access_* action=purchase | stats count by productId
stats – sparkline (generates an inline time-trend chart)
Used in conjunction with the stats and chart commands
Creates a mini-timeline in a report
Represents the same time span as the search, in this case "last 7 days"
Not to be confused with timechart, which creates a standalone visualization
What is the purchase trend for each product ID over the last 7 days?
sourcetype=access* action=purchase | stats sparkline count by productId | sort -count
Note: chart and timechart are covered later in this course
Using eval to calculate the difference between values
How do our prices compare to the competition?
You can perform mathematical functions against fields with numeric field values
This example compares the flowershop price against the competitor's price
Subtract the value of flowersrus_price from price
flowersrus_price is another field available via a lookup!
sourcetype=access_combined product_name=* | eval difference = price - flowersrus_price | table product_name, price, flowersrus_price, difference
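In plain Python terms, eval adds a computed field to each result row. The sketch below uses made-up product names and prices and assumes that a row missing either input field simply gets no difference value.

```python
def eval_difference(row):
    """Add difference = price - flowersrus_price when both fields are present."""
    out = dict(row)
    price = row.get("price")
    other = row.get("flowersrus_price")
    if price is not None and other is not None:
        out["difference"] = price - other
    return out


# Made-up rows, for illustration only; the second row has no competitor price.
rows = [
    {"product_name": "Bouquet A", "price": 30.0, "flowersrus_price": 27.5},
    {"product_name": "Bouquet B", "price": 20.0},
]

table = [eval_difference(r) for r in rows]
for r in table:
    print(r)
```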
Chart command: chart
Are any hosts throwing a lot of errors?
This example shows a basic chart
The count function counts the number of events for each HTTP status
sourcetype=access_* | chart count by status
Time trend analysis: timechart
What's the overall usage trend for the last 24 hours?
This example displays the usage categories over a 1 hour period
Splitting by the usage field, each line represents a unique value of the field
The y-axis represents the count for each field value
sourcetype=cisco_w* | timechart count by usage
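Conceptually, timechart count by usage puts events into time buckets and counts each value of the split-by field per bucket. The following Python sketch assumes one-hour buckets and epoch-second timestamps; the usage values and events are hypothetical.

```python
from collections import defaultdict


def timechart_count(events, by, span_seconds=3600):
    """Count events per time bucket, split by the values of one field."""
    buckets = defaultdict(lambda: defaultdict(int))
    for e in events:
        bucket_start = e["_time"] - e["_time"] % span_seconds
        buckets[bucket_start][e[by]] += 1
    # One entry per bucket, in time order; each inner key is one chart series.
    return {start: dict(series) for start, series in sorted(buckets.items())}


# Made-up events with epoch-second timestamps, for illustration only.
events = [
    {"_time": 36005, "usage": "Business"},
    {"_time": 36900, "usage": "Personal"},
    {"_time": 39599, "usage": "Business"},
    {"_time": 39600, "usage": "Business"},
]

chart = timechart_count(events, by="usage")
print(chart)
```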
Save As Report for easy reuse
Afterwards, you can click Searches & Reports and open it directly
Using a subsearch
sourcetype=access_* status=200 action=purchase [search sourcetype=access_* status=200 action=purchase | top limit=1 clientip | table clientip] | stats count, dc(productId), values(productId) by clientip
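The subsearch in square brackets runs first and its result (the single most active clientip) becomes a filter for the outer search. A minimal Python sketch of that two-step logic, using invented events and documentation-range IP addresses:

```python
from collections import Counter


def vip_purchases(events):
    """Inner step: find the top purchasing clientip. Outer step: summarize its purchases."""
    purchases = [
        e for e in events
        if e["status"] == 200 and e["action"] == "purchase"
    ]
    if not purchases:
        return None
    # [search ... | top limit=1 clientip | table clientip]
    top_ip, _ = Counter(e["clientip"] for e in purchases).most_common(1)[0]
    # Outer search filtered by that clientip, then stats ... by clientip
    vip = [e for e in purchases if e["clientip"] == top_ip]
    products = sorted({e["productId"] for e in vip})
    return {
        "clientip": top_ip,
        "count": len(vip),
        "dc(productId)": len(products),
        "values(productId)": products,
    }


# Invented events, for illustration only.
events = [
    {"clientip": "192.0.2.10", "status": 200, "action": "purchase", "productId": "P-2"},
    {"clientip": "192.0.2.10", "status": 200, "action": "purchase", "productId": "P-1"},
    {"clientip": "192.0.2.10", "status": 200, "action": "purchase", "productId": "P-2"},
    {"clientip": "198.51.100.7", "status": 200, "action": "purchase", "productId": "P-3"},
    {"clientip": "198.51.100.7", "status": 200, "action": "view", "productId": "P-1"},
]

summary = vip_purchases(events)
print(summary)
```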
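Using a lookup table
Adding a new lookup table
Setting the lookup table permissions
Setting the lookup definition and its permissions
Setting up an automatic lookup
Confirm that the automatic lookup works => the price and productName fields appear: sourcetype=access_*
Select the price and productName fields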
Using a subsearch together with the automatic lookup
sourcetype=access_* status=200 action=purchase [search sourcetype=access_* status=200 action=purchase | top limit=1 clientip | table clientip] | stats count AS "Total Purchased", dc(productId) AS "Total Products", values(productName) AS "Product Names" by clientip | rename clientip AS "VIP Customer"
Please save this as a report named "VIP Customer"
Other advanced search syntax (1)
1. Compare the number of views with the number of purchases (view => add to cart => purchase)
sourcetype=access_* status=200 | chart count AS views count(eval(action="addtocart")) AS addtocart count(eval(action="purchase")) AS purchases by productName | rename productName AS "Product Name", views AS "Total Views", addtocart AS "Total Added to Cart", purchases AS "Total Purchases"
Advanced version (view => add to cart (percentage) => purchase (percentage))
sourcetype=access_* status=200 | stats count AS views count(eval(action="addtocart")) AS addtocart count(eval(action="purchase")) AS purchases by productName | eval viewsToPurchase=(purchases/views)*100 | eval cartToPurchase=(purchases/addtocart)*100 | table productName views addtocart purchases viewsToPurchase cartToPurchase | rename productName AS "Product Name", views AS "Total Views", addtocart AS "Total Added to Cart", purchases AS "Total Purchases"
Save as a report named [Comparison of Product Views and Purchases]
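The arithmetic behind the two eval steps can be checked on a few rows. This Python sketch counts views, cart additions, and purchases per product and then derives the two percentages; the events are made up, and returning None when a product has no cart additions is an assumption about how to treat division by zero.

```python
from collections import defaultdict


def conversion_by_product(events):
    """Per product: views, addtocart, purchases, and the two conversion percentages."""
    counts = defaultdict(lambda: {"views": 0, "addtocart": 0, "purchases": 0})
    for e in events:
        if e.get("status") != 200 or "productName" not in e:
            continue
        row = counts[e["productName"]]
        row["views"] += 1  # count: every matching event for the product
        if e.get("action") == "addtocart":
            row["addtocart"] += 1  # count(eval(action="addtocart"))
        if e.get("action") == "purchase":
            row["purchases"] += 1  # count(eval(action="purchase"))
    result = {}
    for name, row in counts.items():
        out = dict(row)
        out["viewsToPurchase"] = out["purchases"] / out["views"] * 100
        out["cartToPurchase"] = (
            out["purchases"] / out["addtocart"] * 100 if out["addtocart"] else None
        )
        result[name] = out
    return result


# Made-up events, for illustration only.
events = [
    {"status": 200, "productName": "Game A", "action": "view"},
    {"status": 200, "productName": "Game A", "action": "addtocart"},
    {"status": 200, "productName": "Game A", "action": "addtocart"},
    {"status": 200, "productName": "Game A", "action": "purchase"},
    {"status": 500, "productName": "Game A", "action": "purchase"},
    {"status": 200, "productName": "Game B", "action": "view"},
    {"status": 200, "productName": "Game B", "action": "purchase"},
]

report = conversion_by_product(events)
for name, row in report.items():
    print(name, row)
```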
Other advanced search syntax (1): save as a report named [Comparison of Product Views and Purchases]
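Other advanced search syntax (2)
2. Analysis of the products purchased within a given period
sourcetype=access_* | timechart count(eval(action="purchase")) by productName usenull="f" useother="f"
Choose [Line Chart], [Area Chart] or [Column Chart] for the visualization
Save as a report named [Purchases by Product]
Other advanced search syntax (3)
3. Purchase trend
sourcetype=access_* status=200 action=purchase | chart sparkline(count) AS "Purchases Trend" count AS Total by categoryId | rename categoryId AS "Category"
Save as a report named [Purchase Trend]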
Other advanced search syntax (4)-1
sourcetype=access_* status=200 | stats count AS views count(eval(action="addtocart")) AS addtocart count(eval(action="purchase")) AS purchases by productName | eval viewsToPurchase=(purchases/views)*100 | eval cartToPurchase=(purchases/addtocart)*100 | table productName views addtocart purchases viewsToPurchase cartToPurchase | rename productName AS "Product Name", views AS "Views", addtocart AS "Adds To Cart", purchases AS "Purchases"
Other advanced search syntax (4)-2: Chart Overlay options
Building a Dashboard (1): create the dashboard
1. Run the following search: sourcetype=access_* status=200 action=purchase | top categoryId
2. For the visualization, choose Pie Chart
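Building a Dashboard (1): create the dashboard. 3. Save As Dashboard Panel
Building a Dashboard (1): create the dashboard
Building a Dashboard (2): edit the panel
Building a Dashboard (2): edit the panel
Building a Dashboard (3): add an input
Building a Dashboard (3): add an input, turning the dashboard into a search form
Building a Dashboard (3): add a panel
Building a Dashboard (4): add other panels; the final dashboard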
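Map feature
Command that maps an IP address to its location: iplocation [external IP address field]
sourcetype="access_*" | iplocation clientip
This produces the following fields: Country, City, lon (longitude), lat (latitude)
Map feature
Statistics command: geostats count by [field to aggregate by]
sourcetype="access_*" | iplocation clientip | geostats count by productName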
Data Model and Pivot Analysis
Congratulations! You have completed your training and become a Level 1 Splunk Ninja. Keep practicing and keep leveling up!