Computer Vision Chapter 4 Statistical Pattern Recognition Presenter: 小彼得 E-mail: itspeter@gmail.com Digital Camera and Computer Vision Laboratory Department of Computer Science and Information Engineering National Taiwan University, Taipei, Taiwan, R.O.C.
Overview of this chapter Introduction Bayes Decision Rules Economic Gain Matrix Maximin Decision Rule Decision Rule Error Reserving Judgment Nearest Neighbor Rule Binary Decision Tree Classifier Neural Networks (Note: this chapter does not cover any real-world recognition system.)
Introduction Units: image regions and projected segments Each unit has an associated measurement vector A decision rule is used to assign each unit to a class or category optimally DC & CV Lab. CSIE NTU
Introduction (Cont.) Feature selection and extraction techniques Decision rule construction techniques Techniques for estimating decision rule error DC & CV Lab. CSIE NTU
Simple Pattern Discrimination Also called the pattern identification process A unit is observed or measured A category assignment is made that names or classifies the unit as a type of object The category assignment is made only from the observed measurement (pattern); this is called the fair game assumption The unit is the thing that actually occurs; the measurement is what the instrument records DC & CV Lab. CSIE NTU
Overview of this chapter Introduction Bayes Decision Rules Economic Gain Matrix Maximin Decision Rule Decision Rule Error Reserving Judgment Nearest Neighbor Rule Binary Decision Tree Classifier Neural Networks
Simple Pattern Discrimination (cont.) a: assigned category from a set of categories C t: true category identification from C d: observed measurement from a set of measurements D (t, a, d): event of classifying the observed unit P(t, a, d): probability of the event (t, a, d) We now enter the statistical framework; first we define some terms. DC & CV Lab. CSIE NTU
Economic Gain Matrix e(t, a): economic gain/utility with true category t and assigned category a A mechanism to evaluate a decision rule Identity gain matrix We will use this repeatedly later; it expresses the consequence that each event brings in a given situation. DC & CV Lab. CSIE NTU
Identity Gain Matrix
True \ Assigned:   A   B   C   D
A:                 1   0   0   0
B:                 0   1   0   0
C:                 0   0   1   0
D:                 0   0   0   1
DC & CV Lab. CSIE NTU
An Instance DC & CV Lab. CSIE NTU
Another Instance P(g, g): probability of true good, assigned good; P(g, b): probability of true good, assigned bad; ... e(g, g): economic consequence for event (g, g); ... e positive: profit consequence e negative: loss consequence There are four cases for each; the next two tables explain them. DC & CV Lab. CSIE NTU
Another Instance (cont.) DC & CV Lab. CSIE NTU
Another Instance (cont.) Note that we have two tables here: one gives the probabilities, the other the economic consequences. DC & CV Lab. CSIE NTU
Another Instance (cont.) Fraction of good objects manufactured: P(g) = P(g, g) + P(g, b); fraction of bad objects: P(b) = P(b, g) + P(b, b) Expected profit per object: E = P(g, g)e(g, g) + P(g, b)e(g, b) + P(b, g)e(b, g) + P(b, b)e(b, b) Conditional probability is discussed on the next page. DC & CV Lab. CSIE NTU
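As a concrete illustration of this formula, a minimal Python sketch that computes the expected profit per object from a joint-probability table and an economic gain matrix. The numbers below are hypothetical, not the book's tables.

```python
# Expected profit per object, E = sum over (t, a) of P(t, a) * e(t, a).
# The probabilities and gains below are hypothetical, not the book's tables.

P = {('g', 'g'): 0.90, ('g', 'b'): 0.05,   # joint probabilities P(true, assigned)
     ('b', 'g'): 0.02, ('b', 'b'): 0.03}

e = {('g', 'g'): 10, ('g', 'b'): -2,       # economic gain e(true, assigned)
     ('b', 'g'): -50, ('b', 'b'): -1}

E = sum(P[ta] * e[ta] for ta in P)         # expected profit per object
print(E)

# Marginal priors, as on the slide: P(g) = P(g,g) + P(g,b), P(b) = P(b,g) + P(b,b)
P_g = P[('g', 'g')] + P[('g', 'b')]
P_b = P[('b', 'g')] + P[('b', 'b')]
print(P_g, P_b)
```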
Conditional Probability DC & CV Lab. CSIE NTU
Conditional Probability (cont.) P(b | g): false-alarm rate; P(g | b): misdetection rate Another formula for the expected profit per object Next we substitute the conditional probabilities into the expected-profit formula and work through an example. DC & CV Lab. CSIE NTU
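Written out with the conditional probabilities, using the standard factoring of the joint probabilities (for example, P(g, b) = P(g) P(b | g)), the expected profit per object becomes:

```latex
E = P(g)\,\bigl[P(g\mid g)\,e(g,g) + P(b\mid g)\,e(g,b)\bigr]
  + P(b)\,\bigl[P(g\mid b)\,e(b,g) + P(b\mid b)\,e(b,b)\bigr]
```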
Example 4.1 P(g) = 0.95, P(b) = 0.05 DC & CV Lab. CSIE NTU
Example 4.1 (cont.) DC & CV Lab. CSIE NTU (Table 4.5 is on the previous page.)
Example 4.2 P(g) = 0.95, P(b) = 0.05 Table 4.7: the economic gain matrix is the same, and the prior is the same. DC & CV Lab. CSIE NTU
Example 4.2 (cont.) These two examples show that, with the prior and the economic gain matrix fixed, the expected profit changes with the performance of the detection machine. DC & CV Lab. CSIE NTU
Decision Rule Construction P(t, a): obtained by summing P(t, a, d) over every measurement d Therefore, the average (expected) economic gain is E[e] = Σ_t Σ_a P(t, a) e(t, a) The next slide illustrates this expected value. DC & CV Lab. CSIE NTU
Decision Rule Construction (cont.) The measurement-d axis has been collapsed into the figure on the right. In this book, "average" usually means expected value. DC & CV Lab. CSIE NTU
Decision Rule Construction (cont.) We can use the identity matrix as the economic gain matrix to compute the probability of correct assignment:
True \ Assigned:   A   B   C   D
A:                 1   0   0   0
B:                 0   1   0   0
C:                 0   0   1   0
D:                 0   0   0   1
DC & CV Lab. CSIE NTU
Fair Game Assumption Decision rule uses only measurement data in assignment; the nature and the decision rule are not in collusion In other words, P(a| t, d) = P(a| d) DC & CV Lab. CSIE NTU
Fair Game Assumption (cont.) From the definition of conditional probability DC & CV Lab. CSIE NTU
Fair Game Assumption (cont.) Derive P(t, a | d) = P(a | d) P(t | d):
By definition, P(t, a | d) = P(t, a, d) / P(d)  ----(1)
By definition, P(a | t, d) = P(t, a, d) / P(t, d)  ----(2)
By the fair game assumption, P(a | t, d) = P(a | d)  ----(3)
From (1) and (2): P(t, a | d) P(d) = P(a | t, d) P(t, d)
Substituting (3): P(t, a | d) P(d) = P(a | d) P(t, d), hence P(t, a | d) = P(a | d) P(t, d) / P(d) = P(a | d) P(t | d). Q.E.D.
DC & CV Lab. CSIE NTU
Estimated time for this part: 20 minutes. DC & CV Lab. CSIE NTU
Deterministic Decision Rule We use the notation f(a | d) to completely define a decision rule; f(a | d) gives all the conditional probabilities associated with the decision rule A deterministic decision rule assigns each measurement to exactly one category, so f(a | d) is either 0 or 1 Decision rules that are not deterministic are called probabilistic/nondeterministic/stochastic We now derive the deterministic decision rule; it is the intuitive choice, but not necessarily the best. Let the function f be this classification function. DC & CV Lab. CSIE NTU
Expected Value on f(a|d) Starting from the previous formula and substituting P(t, a, d) = f(a | d) P(t, d) (by the fair game assumption), the expected gain becomes a function of f: E[e; f] = Σ_d Σ_t Σ_a e(t, a) f(a | d) P(t, d) Here e(t, a) is fixed and P(t, d) comes from the prior (how many good and bad objects are produced); what we actually get to choose is f, the function that decides how to classify. DC & CV Lab. CSIE NTU
Expected Value on f(a|d) (cont.) Looking at the formula above, intuitively f is either 0 or 1: for each measurement d one could simply set f = 1 for the category with the largest P(t, d) and 0 for the others. But that ignores the effect of e(t, a), so we rearrange into the formula below. DC & CV Lab. CSIE NTU
Bayes Decision Rules A Bayes decision rule maximizes the expected economic gain: it satisfies E[e; f] ≥ E[e; g] for every decision rule g Constructing f We now formally define the Bayes decision rule: among all possible decision functions g, we look for a best one (there may be more than one). Examples follow. DC & CV Lab. CSIE NTU
Bayes Decision Rules (cont.) Work column by column: if measurement d1 is assigned to category c1, what does the bracketed term evaluate to? (0.12 and 0.2; see the note on p. 106.) Next we see how to compute the false-positive and misidentification rates. DC & CV Lab. CSIE NTU
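A minimal Python sketch of the construction rule above: for each measurement d, assign a category a that maximizes Σ_t e(t, a) P(t, d). The P(t, d) table and gain matrix here are made up for illustration; they are not the values in the book's tables.

```python
# Bayes decision rule construction: for each measurement d,
# assign the category a that maximizes sum_t e(t, a) * P(t, d).
# The tables below are hypothetical, not the ones in the book.

categories = ['c1', 'c2']
measurements = ['d1', 'd2', 'd3']

P_td = {            # joint probabilities P(t, d)
    ('c1', 'd1'): 0.2, ('c1', 'd2'): 0.1, ('c1', 'd3'): 0.1,
    ('c2', 'd1'): 0.1, ('c2', 'd2'): 0.3, ('c2', 'd3'): 0.2,
}
e = {               # economic gain e(t, a); identity gain here
    ('c1', 'c1'): 1, ('c1', 'c2'): 0,
    ('c2', 'c1'): 0, ('c2', 'c2'): 1,
}

def bayes_rule(d):
    # expected gain of assigning 'a' for this measurement column
    gain = lambda a: sum(e[(t, a)] * P_td[(t, d)] for t in categories)
    return max(categories, key=gain)

f = {d: bayes_rule(d) for d in measurements}
print(f)                          # {'d1': 'c1', 'd2': 'c2', 'd3': 'c2'} here
expected_gain = sum(e[(t, f[d])] * P_td[(t, d)]
                    for t in categories for d in measurements)
print(expected_gain)              # 0.7 with these made-up numbers
```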
Bayes Decision Rules (cont.) Next we extend this to continuous measurements. DC & CV Lab. CSIE NTU
Continuous Measurement For the same example, try continuous density functions for the measurements, one of them being P(x | t1) = 3x^2 Measurements lie in the closed interval [0, 1] Prove that they are indeed density functions DC & CV Lab. CSIE NTU
Continuous Measurement (cont.) Suppose that the prior probability of t1 is 1/3 and the prior probability of t2 is 2/3 A Bayes decision rule assigns an observed unit with measurement x to t1 whenever P(t1) P(x | t1) ≥ P(t2) P(x | t2), which implies a threshold on x x: measurement DC & CV Lab. CSIE NTU
Continuous Measurement (cont.) Since 0.805 > 0.68, the continuous measurement gives a larger expected economic gain than the discrete one DC & CV Lab. CSIE NTU
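For the density that is visible on the slide, P(x | t1) = 3x^2, the check that it is a proper density on [0, 1], together with the general form of the Bayes condition under the identity gain matrix. The second density is not reproduced on the slide text, so the threshold itself is left unevaluated here:

```latex
\int_0^1 3x^2 \, dx = \left[ x^3 \right]_0^1 = 1,
\qquad
\text{assign } t_1 \iff P(t_1)\,P(x \mid t_1) \ge P(t_2)\,P(x \mid t_2)
\iff \tfrac{1}{3}\cdot 3x^2 \ge \tfrac{2}{3}\,P(x \mid t_2).
```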
Break (the previous part took about 15 minutes) DC & CV Lab. CSIE NTU
Prior Probability The Bayes rule: replace P(t, d) with P(d | t) P(t) The Bayes rule can then be determined by assigning any category a that maximizes Σ_t e(t, a) P(d | t) P(t) Why replace the perfectly usable P(t, d) with P(d | t)? Look at the formula: sometimes even P(t) is unknown, let alone P(t, d); in general P(d | t) is the easier quantity to measure. DC & CV Lab. CSIE NTU
Equal-probability of ignorance Equal-probability-of-ignorance assumption: all priors P(t) are taken to be equal Combined with the identity gain matrix, the Bayes rule reduces to the maximum likelihood decision rule: assign the category t that maximizes P(d | t) DC & CV Lab. CSIE NTU
Overview of this chapter Introduction Bayes Decision Rules Economic Gain Matrix Maximin Decision Rule Decision Rule Error Reserving Judgment Nearest Neighbor Rule Binary Decision Tree Classifier Neural Networks (Note: this chapter does not cover any real-world recognition system.)
Economic Gain Matrix Do these two gain matrices lead to different decision rules? (Two example gain matrices e(t, a) over categories A, B, C are shown: an identity gain matrix, and one with mixed positive and negative entries such as -30, 70, -20, -10.) DC & CV Lab. CSIE NTU
Economic Gain Matrix Identity matrix An alternative in which every incorrect assignment loses 1 A more balanced instance Why do we discuss this? When constructing decision rules we used the economic gain matrix e(t, a); now we examine how e affects which rule is chosen. DC & CV Lab. CSIE NTU
Economic Gain Matrix Suppose e and e' are two different economic gain matrices related by e'(t, a) = λ e(t, a) + c with λ > 0 According to the construction rule, given a measurement d, the Bayes rule assigns the category a that maximizes Σ_t e(t, a) P(t, d) Because Σ_t e'(t, a) P(t, d) = λ Σ_t e(t, a) P(t, d) + c P(d) and λ > 0, we then get the same maximizing category for e and e' DC & CV Lab. CSIE NTU
Economic Gain Matrix Summary: multiplying the economic gain matrix by a positive constant and/or adding a constant does not change the optimal rule. DC & CV Lab. CSIE NTU
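The invariance argument in compact form: for constants λ > 0 and c, the transformed gain matrix e'(t, a) = λ e(t, a) + c selects the same category for every measurement d:

```latex
\arg\max_a \sum_t \bigl(\lambda\, e(t,a) + c\bigr) P(t,d)
= \arg\max_a \Bigl( \lambda \sum_t e(t,a) P(t,d) + c\,P(d) \Bigr)
= \arg\max_a \sum_t e(t,a) P(t,d), \qquad \lambda > 0 .
```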
Economic Gain Matrix In some cases a non-regular gain matrix can still yield the optimal rule Revisit the continuous example: let the economic gain matrix be arbitrary (its entries are given on the slide) The optimal rule is to assign t1 when the expected gain of assigning t1 is at least that of assigning t2 Substituting the densities for P(t1, x) and P(t2, x) and simplifying, we finally reach a threshold on x above which the rule assigns category t1 DC & CV Lab. CSIE NTU
Overview of this chapter Introduction Bayes Decision Rules Economic Gain Matrix Maximin Decision Rule Decision Rule Error Reserving Judgment Nearest Neighbor Rule Binary Decision Tree Classifier Neural Networks Next: the maximin decision rule, another kind of rule aimed at a different notion of optimality.
Maximin Decision Rule Recall what we have done so far: finding a function f(a | d) such that, given d, we pick the a that maximizes the expected gain What if we do not know P(t, d), and it is not reasonable to assume the priors P(t) are all equal? Then the equal-probability-of-ignorance assumption does not hold DC & CV Lab. CSIE NTU
Maximin Decision Rule Maximizes the average gain over the worst-case prior probability: choose an f, then consider every possible prior and judge f by the worst of them DC & CV Lab. CSIE NTU
Maximin Decision Rule Simplify using the fact that the expected gain is linear in the prior: the worst prior concentrates all its probability on a single category t, while the other categories have zero probability The formula therefore becomes a minimum over categories t of the gain conditioned on t, instead of a minimum over all possible priors Even with this simplification, finding the maximin rule is generally difficult, and the rule found is often counter-intuitive (it is stochastic) DC & CV Lab. CSIE NTU
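A rough Python sketch of the simplified search described above: evaluate each candidate deterministic rule by its worst per-category conditional gain, and keep the rule whose worst case is largest. The numbers are hypothetical, and, as the slide notes, the true general maximin rule may have to be a stochastic mixture of such rules, which this sketch does not construct.

```python
# Simplified maximin search over deterministic rules:
# score(f) = min over true categories t of E[gain | t, f],
# where E[gain | t, f] = sum_d e(t, f(d)) * P(d | t).
# All numbers below are hypothetical.
from itertools import product

categories = ['c1', 'c2']
measurements = ['d1', 'd2']

P_d_given_t = {('d1', 'c1'): 0.8, ('d2', 'c1'): 0.2,   # P(d | t)
               ('d1', 'c2'): 0.3, ('d2', 'c2'): 0.7}
e = {('c1', 'c1'): 1, ('c1', 'c2'): 0,                 # identity gain e(t, a)
     ('c2', 'c1'): 0, ('c2', 'c2'): 1}

def conditional_gain(rule, t):
    # expected gain given true category t, under deterministic rule {d: a}
    return sum(e[(t, rule[d])] * P_d_given_t[(d, t)] for d in measurements)

best_rule, best_worst = None, float('-inf')
for assignment in product(categories, repeat=len(measurements)):
    rule = dict(zip(measurements, assignment))
    worst = min(conditional_gain(rule, t) for t in categories)
    if worst > best_worst:
        best_rule, best_worst = rule, worst

print(best_rule, best_worst)   # {'d1': 'c1', 'd2': 'c2'} with worst-case gain 0.7
```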
Example 4.3 DC & CV Lab. CSIE NTU
Example 4.3 (cont.) DC & CV Lab. CSIE NTU
Example 4.3 (cont.) DC & CV Lab. CSIE NTU
Example 4.3 (cont.) The lowest Bayes gain is achieved at the prior shown above; the lowest gain is 0.6714 DC & CV Lab. CSIE NTU
Example 4.3 (cont.) DC & CV Lab. CSIE NTU Note that any convex combination of f1 through f7 lies inside this convex hull.
Example 4.3 (cont.) The “general” maximin decision rule is a “stochastic” one and is “guaranteed” to achieve the worst possible Bayes gain. To determine a decision rule f that maximize 畫虛線至交點處 下一頁的例子是說最佳解也可以是deterministic DC & CV Lab. CSIE NTU
Example 4.4 DC & CV Lab. CSIE NTU
Example 4.4 (cont.) DC & CV Lab. CSIE NTU
Example 4.4 (cont.) DC & CV Lab. CSIE NTU
Example 4.4 (cont.) The next example is more complicated. DC & CV Lab. CSIE NTU
Example 4.5 DC & CV Lab. CSIE NTU This example illustrates three points: 1. The Bayes rule is the best (of course, since the prior is known). 2. The general maximin rule can always do as well as the worst-case Bayes gain. 3. The general maximin rule is not the same as the Bayes decision rule.
Example 4.5 (cont.) DC & CV Lab. CSIE NTU
Example 4.5 (cont.) f1 and f4 form the lowest Bayes gain Find a p that eliminates the dependence on P(c1): p = -0.3103 DC & CV Lab. CSIE NTU
Break DC & CV Lab. CSIE NTU
Overview of this chapter Introduction Bayes Decision Rules Economic Gain Matrix Maximin Decision Rule Decision Rule Error Reserving Judgment Nearest Neighbor Rule Binary Decision Tree Classifier Neural Networks
Decision Rule Error (Type I) The misidentification error αk: a unit whose true category is k is assigned to some other category (Type II) The false-identification error βk: a unit whose true category is something else is identified as category k Alpha: mistakenly condemning a good one as bad. DC & CV Lab. CSIE NTU
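A small Python sketch under the usual conventions, assuming αk conditions on the true category and βk on the assigned category (the book's exact conditioning may differ slightly); the joint probabilities are hypothetical.

```python
# Type I (misidentification) and Type II (false-identification) error rates
# from a joint table P(true, assigned). Conventions assumed here:
#   alpha_k = P(assigned != k | true = k)
#   beta_k  = P(true != k | assigned = k)
# The joint probabilities are hypothetical.

categories = ['g', 'b']
P = {('g', 'g'): 0.90, ('g', 'b'): 0.05,
     ('b', 'g'): 0.02, ('b', 'b'): 0.03}

def alpha(k):
    p_true_k = sum(P[(k, a)] for a in categories)
    return sum(P[(k, a)] for a in categories if a != k) / p_true_k

def beta(k):
    p_assigned_k = sum(P[(t, k)] for t in categories)
    return sum(P[(t, k)] for t in categories if t != k) / p_assigned_k

for k in categories:
    print(k, alpha(k), beta(k))
```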
An Instance DC & CV Lab. CSIE NTU
Reserving Judgment The decision rule may withhold judgment for some measurements The decision rule is then characterized by the fraction of the time it withholds judgment and by the error rate on the measurements it does assign This is an important technique for controlling the error rate DC & CV Lab. CSIE NTU
Reserving Judgment Let a maximum tolerable Type I error be given for each category k, and a maximum tolerable Type II error for each category k Measurements that will not be rejected form the acceptance region DC & CV Lab. CSIE NTU
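One common simplified way to reserve judgment, shown as a Python sketch: withhold judgment whenever the best posterior falls below a confidence threshold. This is not necessarily the acceptance-region construction on the slide, whose Type I/II error bounds are not reproduced here; the threshold and posteriors are illustrative only.

```python
# Reserve-judgment variant of a decision rule (a simplified sketch):
# assign the maximum-posterior category only if that posterior exceeds a
# threshold; otherwise withhold judgment. The threshold and posteriors
# here are illustrative, not derived from the book's alpha/beta bounds.

def decide_or_reserve(posteriors, threshold=0.9):
    """posteriors: dict mapping category -> P(category | measurement)."""
    best = max(posteriors, key=posteriors.get)
    if posteriors[best] >= threshold:
        return best          # confident enough: assign the category
    return None              # withhold judgment ("reserve")

print(decide_or_reserve({'g': 0.97, 'b': 0.03}))   # 'g'
print(decide_or_reserve({'g': 0.60, 'b': 0.40}))   # None -> judgment reserved
```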
Break DC & CV Lab. CSIE NTU
Nearest Neighbor Rule Assign pattern x to the class of the closest vector in the training set "Closest" means minimum distance under a metric on the measurement space Chief difficulty: the brute-force nearest-neighbor algorithm has computational complexity proportional to the number of patterns in the training set Brute-force nearest neighbor: exhaustive search. Note that many different distance metrics can be chosen here. DC & CV Lab. CSIE NTU
Nearest Neighbor Rule Cover and Hart (1967) showed that the nearest neighbor rule has an error rate no worse than twice the Bayes error rate Speed-up: use eigenvalue decomposition DC & CV Lab. CSIE NTU
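A brute-force nearest-neighbor sketch in Python using the Euclidean metric (any metric on the measurement space could be substituted); the training vectors and labels below are made up for illustration.

```python
# Brute-force nearest-neighbor rule: assign x the label of the closest
# training vector under a chosen metric (Euclidean here). Complexity is
# proportional to the number of training patterns, as noted on the slide.
import math

train = [((0.0, 0.0), 'A'), ((1.0, 0.2), 'A'),     # (vector, label); hypothetical data
         ((5.0, 5.1), 'B'), ((6.0, 4.8), 'B')]

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def nearest_neighbor(x, training_set, metric=euclidean):
    _, label = min(training_set, key=lambda item: metric(x, item[0]))
    return label

print(nearest_neighbor((0.9, 0.0), train))   # 'A'
print(nearest_neighbor((5.5, 5.0), train))   # 'B'
```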
Nearest Neighbor Rule kNN algorithm DC & CV Lab. CSIE NTU
Nearest Neighbor Rule kNN algorithm An open-source C++ library is available for this and is fast. DC & CV Lab. CSIE NTU
Binary Decision Tree Classifier Assign by hierarchical decision procedure DC & CV Lab. CSIE NTU
Major Problems Choosing tree structure Choosing features used at each non-terminal node Choosing decision rule at each non-terminal node DC & CV Lab. CSIE NTU
Decision Rules at the Non-terminal Node Thresholding the measurement component Fisher’s linear decision rule Bayes quadratic decision rule Bayes linear decision rule Linear decision rule from the first principal component DC & CV Lab. CSIE NTU
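A minimal sketch of the first option in the list above (thresholding one measurement component at each non-terminal node). The tree structure and thresholds below are hand-built and hypothetical, not learned by any of the listed procedures.

```python
# Binary decision tree with threshold tests at non-terminal nodes.
# Each internal node tests one component of the measurement vector
# against a threshold; leaves carry the assigned category.
# The tree below is hand-built for illustration only.

class Node:
    def __init__(self, feature=None, threshold=None,
                 left=None, right=None, category=None):
        self.feature, self.threshold = feature, threshold
        self.left, self.right = left, right     # left: <= threshold, right: >
        self.category = category                # set only at leaves

def classify(node, x):
    if node.category is not None:               # terminal node
        return node.category
    branch = node.left if x[node.feature] <= node.threshold else node.right
    return classify(branch, x)

tree = Node(feature=0, threshold=2.5,
            left=Node(category='c1'),
            right=Node(feature=1, threshold=1.0,
                       left=Node(category='c2'),
                       right=Node(category='c3')))

print(classify(tree, (1.0, 3.0)))   # 'c1'
print(classify(tree, (4.0, 0.5)))   # 'c2'
print(classify(tree, (4.0, 2.0)))   # 'c3'
```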
Error Estimation An important way to characterize the performance of a decision rule Training data set: must be independent of the testing data set Hold-out method (a common technique): construct the decision rule with half the data set and test with the other half A more general method: divide the data set into N parts, train on N-1 of them, and test on the remaining part If we test only on the data used for training, over-fitting can make the estimate misleading DC & CV Lab. CSIE NTU
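A Python sketch of the two splitting schemes above (hold-out and the more general N-fold idea), independent of any particular decision rule; the toy data and the idea of plugging in your own rule construction and error counting are assumptions for illustration.

```python
# Hold-out and N-fold splits for decision-rule error estimation.
# `data` is any list of (measurement, true_category) pairs; shuffling
# first keeps the halves/folds independent of the original ordering.
import random

def holdout_split(data, seed=0):
    d = data[:]
    random.Random(seed).shuffle(d)
    half = len(d) // 2
    return d[:half], d[half:]          # (training half, testing half)

def nfold_splits(data, n, seed=0):
    d = data[:]
    random.Random(seed).shuffle(d)
    folds = [d[i::n] for i in range(n)]
    for i in range(n):
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        yield train, folds[i]          # train on N-1 parts, test on the Nth

data = [((i, i % 3), 'c%d' % (i % 2)) for i in range(10)]   # toy data
train, test = holdout_split(data)
for train, test in nfold_splits(data, n=5):
    pass  # construct the rule on `train`, estimate its error on `test`
```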
Neural Network A set of units, each of which takes a linear combination of values from either an input vector or the outputs of other units It is built from linear combinations: the edges represent linear functions with weights, while the nodes apply non-linear functions; there can be many hidden layers DC & CV Lab. CSIE NTU
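A minimal forward-pass sketch matching the description above: each unit takes a weighted linear combination of its inputs (edge weights) and passes it through a non-linear function at the node. The weights here are random placeholders, not trained values; training (back-propagation) is the subject of the next slide.

```python
# One hidden layer of units: linear combination on the edges (weights),
# non-linearity at the nodes. Weights here are arbitrary placeholders.
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # each row of `weights` feeds one unit: dot product + bias, then nonlinearity
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

rng = random.Random(0)
x = [0.5, -1.2, 3.0]                                         # input measurement vector
W1 = [[rng.uniform(-1, 1) for _ in x] for _ in range(4)]     # 4 hidden units
b1 = [0.0] * 4
W2 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]  # 2 output units
b2 = [0.0] * 2

hidden = layer(x, W1, b1)
output = layer(hidden, W2, b2)
print(output)    # class scores from an untrained network
```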
Neural Network (cont.) Has a training algorithm: responses are observed, and reinforcement algorithms or back-propagation are used to change the weights DC & CV Lab. CSIE NTU
Summary Bayesian approach Maximin decision rule Misidentification and false-alarm error rates Nearest neighbor rule Construction of decision trees Estimation of decision rule error Neural network DC & CV Lab. CSIE NTU