3 Splunk: an operational intelligence platform for every kind of IT equipment in the enterprise
- No need to predefine data fields (schema), no custom connectors, no database, and no need to pre-filter data
- Collects data from the data center and from devices beyond it

Example data sources:
- Click-stream data, shopping cart data, online transaction data
- Manufacturing and logistics data, CDRs & IPDRs, power consumption, RFID data, GPS data
- Log files, configs, messages, traps, alerts, metrics, scripts, changes, tickets

Platforms and what Splunk reads from each:
- Windows: registry, event logs, file system, sysinternals
- Linux/Unix: configurations, syslog, file system, ps, iostat, top
- Virtual & cloud: hypervisor, guest OS, guest apps, cloud services
- Applications: web logs, Log4J, JMS, JMX, .NET events, code and scripts
- Databases: configurations, audit/query logs, tables, schemas
- Networking: configurations, syslog, SNMP, netflow

Splunk is a data engine for your machine data. It gives you real-time visibility and intelligence into what is happening across your IT infrastructure, whether it is physical, virtual, or in the cloud. Everybody now recognizes the value of this data; the problem up to now has been getting to it. At Splunk we applied the search-engine paradigm to rapidly harnessing any and all machine data, wherever it originates. The "no predefined schema" design means you can point Splunk at any of your data, regardless of format, source, or location. There is no need to build custom parsers or connectors, there is no traditional RDBMS, and there is no need to filter and forward. Here we see just a sample of the kinds of data Splunk can "eat".

A reminder of why machine data is a big deal: it holds a categorical record of user transactions, customer behavior, machine behavior, security threats, and fraudulent activity. A single user transaction can span many systems and sources of this data, and a single service relies on many underlying systems. Splunk gives you one place to search, report on, analyze, and visualize all of it.
4 The Splunk product has four main components: Search Head, Indexer, Forwarder, Deployment Server
Today's introductory course covers the architecture used when installing the software: Search Head, Indexing Server, log files, and Splunk Forwarders.

Data is spread out everywhere, and getting it all to one place is often harder than expected. Splunk makes that job easy, with both agentless data gathering and Splunk forwarders. Splunk forwarders collect, process, and forward data to a central Splunk indexer. Forwarders can be load-balanced, are fault-tolerant, can be centrally managed by either Splunk's Deployment Server or your own configuration-management system, and come in several footprint options.
5 The first of Splunk's two main processes: splunkd
The full-text indexing and search service: it indexes all incoming data, handles queries, and returns results.
- Accesses, processes, and indexes incoming data
- Processes all search requests and returns results
- Runs a web server on port 8089 by default
- Speaks SSL by default
- Splunk helpers run as dependent processes of splunkd and run external scripts, for example:
  - Scripted inputs
  - Scripted alerts
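As an illustration of the bullets about port 8089 and SSL, the sketch below builds a Basic-auth HTTPS request against splunkd's REST endpoint /services/server/info. The hostname and credentials here are placeholders, and actually sending the request requires a running splunkd (the default self-signed certificate may also need relaxed verification):

```python
import base64
import urllib.request

# splunkd's management interface listens on https://<host>:8089 by default.
# Host and credentials below are placeholders for illustration only.
SPLUNKD = "https://localhost:8089"

def server_info_request(username="admin", password="changeme"):
    """Build an authenticated GET request for splunkd's /services/server/info."""
    creds = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{SPLUNKD}/services/server/info",
        headers={"Authorization": f"Basic {creds}"},
    )

req = server_info_request()
# urllib.request.urlopen(req)  # uncomment against a live splunkd instance
```

Note that this only demonstrates the shape of a splunkd REST call; production code would use a session token rather than Basic auth on every request.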
6 The second of Splunk's two main processes: Splunk Web
- Python-based web server, built on the CherryPy framework
- Provides both the search and management web front end for the splunkd process
- Runs on port 8000 by default
- Sets the initial login to user: admin, password: changeme
10 Using the Splunk Web interface as an administrator: configuring data inputs
- Setting up inputs in Manager is easy
- Useful for learning about inputs and their settings
- Not typically used for setting up production inputs, but can be used to create an example inputs.conf
11 Categories of data inputs (all OSes)
- Files and directories: Splunk monitors text-based log files
- Network inputs (TCP and UDP): Splunk listens on a specified port for data feeds
- Scripts: Splunk runs a script and indexes its output
- HTTP Event Collector: applications and devices send events directly over HTTP
12 HTTP Event Collector (HEC)
Supports DevOps and IoT data analysis needs at scale:
1. A standard API and logging libraries send events directly to Splunk
2. Libraries are integrated into popular platforms and services
- Scales to millions of events per second
- Serves DevOps and developers, as well as IoT devices and applications

Now you can onboard data directly from any application or device, opening up new types of machine data to the benefits of Splunk analysis. The Event Collector makes it simple and efficient to collect this data, scaling to millions of events per second, using a developer-friendly, standard HTTP/JSON API and logging libraries, with no forwarders required.

Today it is possible to send data directly to Splunk using modular inputs or a TCP connection, but this is not an efficient or scalable solution. While log files and forwarders provide an efficient mechanism for typical log and syslog files, files and forwarders are not always possible, or even desirable, as a collection method in the world of custom applications, DevOps, Docker, and other packaged application environments. The same is true for IoT event data, where devices and apps may have no local storage, and where even intermediate event-collection systems and partners would prefer a real-time interface to Splunk over creating specific log files and using forwarders.

The HTTP Event Collector uses a standard API and a high-volume Splunk endpoint so that events can be sent and collected directly at extreme velocity. The HTTP/JSON API is a developer standard whose simple but powerful functionality will be attractive to DevOps teams, custom application developers, and operations managers. Without requiring new system configuration, log creation, or administration support, developers can instrument their applications to understand usage flows, performance, error conditions, and more.

The interface and functionality are also a fit for IoT software developers who want to connect their devices either directly or via intermediate collection services. The data volumes supported by Splunk are ideal for the transactional and diagnostic data of devices such as point-of-sale systems, vending machines, gaming consoles, automobiles, and other devices and systems, opening up a new world of machine data to the benefits of Splunk analysis.
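The notes above can be made concrete with a short Python sketch that posts one JSON event to HEC's /services/collector/event endpoint. The host and token below are placeholders (HEC listens on port 8088 by default, and each client authenticates with an "Authorization: Splunk <token>" header):

```python
import json
import urllib.request

# Placeholder collector host and token; substitute your own values.
HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"

def build_hec_request(event, sourcetype="myapp:json", index=None):
    """Build a POST request carrying one event in HEC's JSON envelope."""
    payload = {"event": event, "sourcetype": sourcetype}
    if index:
        payload["index"] = index
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        HEC_URL,
        data=body,
        headers={
            "Authorization": f"Splunk {HEC_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_hec_request({"action": "purchase", "status": 200})
# urllib.request.urlopen(req)  # uncomment to actually send to a live HEC
```

In practice a batching client would keep the connection open and send many events per request; this sketch only shows the envelope and the token header.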
14 Choosing the file or directory to input => source
- Specify a file or directory for ongoing monitoring
- Or upload a copy of a file, which is useful for testing and development
15 Specifying the host for a data input => host
- Specify a constant value if all monitored files in an input are from the same host
16 Choosing the source type for a data input => sourcetype
- Sourcetype is Splunk's way of identifying the type of data
- Default and custom data processing during indexing relies heavily on sourcetype
- Also used heavily in searches, reports, dashboards, apps -- basically the rest of Splunk as well!
17 The actual data-input file: inputs.conf settings
- Each input gets its own stanza
- The first line, enclosed in square brackets [ ], sets the type of input and its location
- Subsequent lines are "attribute = value" pairs
- See $SPLUNK_HOME/etc/system/README/inputs.conf.spec for detailed syntax

Example stanzas:

[monitor:///logs/secure]
disabled = false
host_segment = 3
sourcetype = linux_secure
index = security

[monitor:///opt/tradelog.log]
disabled = 1
sourcetype = trade_entries
host = tradesrv.mycompany.com

[udp://514]
connection_host = dns
sourcetype = syslog
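Because each inputs.conf stanza is plain INI-style "attribute = value" text, a quick way to sanity-check a stanza before deploying it is to parse it with Python's standard configparser. This is a sketch for illustration only; Splunk itself uses its own layered .conf resolution, not configparser:

```python
import configparser

# The first example stanza from the slide above, as INI-style text.
STANZA = """\
[monitor:///logs/secure]
disabled = false
host_segment = 3
sourcetype = linux_secure
index = security
"""

parser = configparser.ConfigParser()
parser.read_string(STANZA)

# The bracketed line becomes the section name; the rest become key/value pairs.
secure = parser["monitor:///logs/secure"]
print(secure["sourcetype"], secure["index"])  # linux_secure security
```

A check like this catches missing brackets or malformed attribute lines before the file ever reaches a forwarder.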
44 Search examples
- Count successful purchases in the Buttercup Games store:
  sourcetype=access_* status=200 action=purchase
- Find log records where errors occurred:
  (error OR fail* OR severe) OR (status=404 OR status=500 OR status=503)
- Count how many simulation games were purchased yesterday:
  sourcetype=access_* status=200 action=purchase categoryId=simulation
46 An example of the search language
This diagram represents a search broken into its syntax components:

sourcetype=access_* status=503 | stats sum(price) as lost_revenue | fieldformat lost_revenue = "$" + tostring(lost_revenue, "commas")

- Search for this: sourcetype=access_* status=503
- PIPE: take these events and...
- COMMAND: stats (get some statistics)
- FUNCTION: sum (get a sum)
- ARGUMENT: price (get a sum of the price field)
- CLAUSE: as lost_revenue (call that sum "lost_revenue")
- PIPE: take these stats and...
- COMMAND: fieldformat (format values for the lost_revenue field)
- FUNCTION: tostring (create a string)
- ARGUMENT: format the string from the values in the lost_revenue field, inserting commas
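For readers unfamiliar with tostring(lost_revenue, "commas"), the fieldformat step does roughly what this Python formatting sketch does to a number (SPL's "commas" option may also round to two decimal places):

```python
# A sample value standing in for the sum(price) result of the search above.
lost_revenue = 1234567.89

# Python's "," format spec inserts thousands separators, roughly
# mirroring tostring(<value>, "commas") in SPL; prepend "$" as the
# fieldformat expression does.
formatted = "$" + format(lost_revenue, ",")
print(formatted)  # $1,234,567.89
```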
47 How a search is processed (narrow the scope => compute => present)

sourcetype=syslog ERROR | top user | fields - percent

- sourcetype=syslog ERROR: fetch the matching events from disk
- | top user: summarize them into a table of the top 10 users (an intermediate results table)
- | fields - percent: remove the column showing the percentage (the final results table)
48 Applying statistics to search results with advanced syntax: top (popularity ranking)

sourcetype=access_* status=200 action=purchase | top categoryId
50 stats -- count by (counting separately by field)
- The by clause returns a count for each value of the named field
- This example counts the number of events where action=purchase for each productId
- Question: how many of each product were purchased?

sourcetype=access_* action=purchase | stats count by productId
51 stats -- sparkline (generating a mini time-series chart)
- Used in conjunction with the stats and chart commands
- Creates a mini-timeline in a report
- Represents the same time span as the search, in this case "last 7 days"
- Not to be confused with timechart, which creates a standalone visualization
- Question: what is the purchase trend for each product ID over the last 7 days?

sourcetype=access* action=purchase | stats sparkline count by productId | sort -count

Note: chart and timechart are covered later in this course.
52 Using eval to compute the difference between values
- Question: how do our prices compare to the competition?
- You can perform mathematical functions against fields with numeric values
- This example compares the flower shop's price against the competitor's price: subtract the value of flowersrus_price from price
- flowersrus_price is another field, made available via a lookup!

sourcetype=access_combined product_name=*
| eval difference = price - flowersrus_price
| table product_name, price, flowersrus_price, difference
53 Are any hosts throwing a lot of errors? 圖表指令： chartThis example shows a basic chartThe count function counts the number of events for each http statusAre any hosts throwing a lot of errors?sourcetype=access_* | chart count by status
54 Time-trend analysis: timechart
- Question: what's the overall usage trend for the last 24 hours?
- This example displays the usage categories bucketed into 1-hour periods
- Splitting by the usage field, each line represents a unique value of the field
- The y-axis represents the count for each field value

sourcetype=cisco_w* | timechart count by usage