Machine Learning Prof. Jehn-Ruey Jiang, National Central University, Taiwan Teaching him to fish is better than giving him a fish: Give a man a fish, and you feed him for a day; teach him how to fish, and you feed him for a lifetime. Give a computer (machine) a program, and you make it useful for a time; teach it how to program, and you make it useful for a lifetime.
A Few Quotes “A breakthrough in machine learning would be worth ten Microsofts” (Bill Gates, Chairman, Microsoft) “Machine learning is the next Internet” (Tony Tether, Director, DARPA) “Machine learning is the hot new thing” (John Hennessy, President, Stanford) “Web rankings today are mostly a matter of machine learning” (Prabhakar Raghavan, Dir. Research, Yahoo) “Machine learning is going to result in a real revolution” (Greg Papadopoulos, CTO, Sun) Source: Slides of Dr. Pedro Domingos
So What Is Machine Learning? Wiki: Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed. Arthur Samuel, an American pioneer in the field of computer gaming and artificial intelligence, coined the term "Machine Learning" in 1959 while at IBM. (Arthur Samuel said in 1959: “How can computers learn to solve problems without being explicitly programmed?”)
So What Is Machine Learning? Pedro Domingos: Getting computers to program themselves. Let the data do the work instead! Yi-Fan Chang: A branch of artificial intelligence, concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data. As intelligence requires knowledge, it is necessary for the computers to acquire knowledge.
Traditional Programming: Data + Program -> Computer -> Output. Machine Learning: Data + Output -> Computer -> Program. Source: Slides of Dr. Pedro Domingos
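The contrast between traditional programming and machine learning can be sketched in a few lines of Python. This is only an illustrative toy and is not taken from the slides: the temperature data, the names `handwritten_rule` and `learn_threshold`, and the choice of a single-threshold model are all assumptions made for the example.

```python
# Traditional programming: a person writes the program (the rule) by hand.
def handwritten_rule(temperature_c):
    """Data goes in, the hand-written program produces the output."""
    return "hot" if temperature_c >= 30 else "not hot"


# Machine learning: data plus desired outputs go in, a program comes out.
def learn_threshold(temperatures, labels):
    """Pick a threshold from labeled examples and return the learned rule.

    Tries every midpoint between neighboring distinct temperatures and keeps
    the one that makes the fewest mistakes on the training examples.
    """
    values = sorted(set(temperatures))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def mistakes(threshold):
        return sum(
            ("hot" if t >= threshold else "not hot") != y
            for t, y in zip(temperatures, labels)
        )

    best = min(candidates, key=mistakes)

    def learned_rule(temperature_c):
        return "hot" if temperature_c >= best else "not hot"

    return learned_rule, best


# Hypothetical training data, made up for illustration.
temps = [18, 22, 25, 28, 31, 33, 36]
labels = ["not hot", "not hot", "not hot", "not hot", "hot", "hot", "hot"]

learned_rule, threshold = learn_threshold(temps, labels)
print(threshold)         # 29.5, chosen from the data rather than typed in
print(learned_rule(35))  # hot
```

In the first function the threshold 30 is typed in by the programmer; in the second, the threshold is an output of the computation, which is the sense in which the computer produces the program.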
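COMMENT: from Lin Chih-Chieh (林志傑), commonly known online as Fukuball. From human learning to machine learning: People learn in order to acquire a skill, for example learning to tell males from females. We learn to do this by accumulating experience from observation. This is the human learning process: observation -> accumulating experience, learning -> acquired skill. How does a machine learn? It is somewhat similar. To acquire a skill, for example the same task of telling males from females, a computer observes data and computes a model from it. This is the machine learning process: data -> computation, learning a model -> acquired skill. Source: Slides of Dr. Hsuan-Tien Lin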
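In machine learning, a skill means improving some measurable performance measure, such as prediction accuracy, by computing on collected data. For example, we can collect stock trading data and ask whether the computation and predictions of machine learning yield higher investment returns. If prediction accuracy improves, we can say that the computer has acquired the skill of predicting stock trades through machine learning. Source: Slides of Dr. Hsuan-Tien Lin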
Sample ML Applications Go games Pattern (image, voice, etc.) recognition Computational biology Finance E-commerce Space exploration Robotics Information extraction ….
Magic? No, more like gardening Seeds = ML Algorithms Nutrients = Data Gardener = You (Trainer) Plants = Programs (Model) Source: Slides of Dr. Pedro Domingos
Sample Applications Web search Computational biology Finance E-commerce Space exploration Robotics Information extraction Social networks …. Source: Slides of Dr. Pedro Domingos
ML in a Nutshell Tens of thousands of machine learning algorithms Hundreds new every year Every machine learning algorithm has three components: Representation Evaluation Optimization Source: Slides of Dr. Pedro Domingos
Representation Decision trees (forest) Graphical models (Bayes/Markov nets) Support vector machines Neural networks … Source: Slides of Dr. Pedro Domingos
Evaluation Squared error Accuracy Precision and recall Likelihood Posterior probability Cost / Utility Entropy K-L divergence …
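A few of the evaluation measures listed above are simple enough to compute by hand. The sketch below is a minimal illustration, not code from the slides: the function names and the small label lists are assumptions, and squared error is averaged over the examples (mean squared error), which is one common convention.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def mean_squared_error(y_true, y_pred):
    """Average of squared differences, used for numeric predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and predictions, made up for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(accuracy(y_true, y_pred))          # 0.75
print(precision_recall(y_true, y_pred))  # (0.75, 0.75)
print(mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 3.0]))
```

Accuracy counts every kind of mistake equally, while precision and recall separate false positives from false negatives, which matters when one kind of error is more costly than the other.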
Optimization Convex optimization E.g.: Gradient descent Combinatorial optimization E.g.: Greedy search Constrained optimization E.g.: Linear programming
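The three components (representation, evaluation, optimization) can be seen together in one small example. The sketch below is an assumption-laden illustration rather than anything from the slides: it picks a straight line as the representation, mean squared error as the evaluation, and plain gradient descent as the optimization. The data, the learning rate, and the step count are made up for the example.

```python
def predict(w, b, x):
    """Representation: a straight line y = w * x + b."""
    return w * x + b


def squared_error_loss(w, b, xs, ys):
    """Evaluation: mean squared error of the line on the data."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Optimization: repeatedly step against the gradient of the loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))  # 2.0 1.0
```

Swapping any one component gives a different learner: a decision tree instead of a line changes the representation, accuracy instead of squared error changes the evaluation, and greedy search instead of gradient descent changes the optimization.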
Types of Machine Learning Supervised (inductive) learning: training data includes desired outputs. Unsupervised learning: training data does not include desired outputs. Semi-supervised learning: training data includes a few desired outputs. Reinforcement learning: rewards from a sequence of actions.
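The difference between the first two types comes down to whether the training data carries labels. The sketch below illustrates this with two deliberately tiny methods chosen for the example, a 1-nearest-neighbor classifier (supervised) and a two-cluster k-means on one-dimensional data (unsupervised). The height data and function names are assumptions, and semi-supervised and reinforcement learning are not covered here.

```python
def nearest_neighbor_predict(train_points, train_labels, query):
    """Supervised: training data includes the desired outputs (labels)."""
    distances = [abs(p - query) for p in train_points]
    return train_labels[distances.index(min(distances))]


def two_means(points, iterations=10):
    """Unsupervised: no labels; split 1-D points into two groups.

    Assumes the points contain at least two distinct values.
    """
    low, high = min(points), max(points)  # initial cluster centers
    for _ in range(iterations):
        group_low = [p for p in points if abs(p - low) <= abs(p - high)]
        group_high = [p for p in points if abs(p - low) > abs(p - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


# Hypothetical heights in centimeters, made up for illustration.
heights = [150, 152, 155, 178, 180, 183]
labels = ["short", "short", "short", "tall", "tall", "tall"]

print(nearest_neighbor_predict(heights, labels, 176))  # tall
print(two_means(heights))  # ([150, 152, 155], [178, 180, 183])
```

The supervised function needs `labels` to answer anything, while `two_means` finds the same two groups from the numbers alone but cannot name them.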
Term Project: Hello, Digit! Goal: To train a deep learning model to recognize handwritten digits (i.e., 0, 1, 2, 3, 4, 5, 6, 7, 8, 9) from the MNIST dataset using the Google TensorFlow Python package, so that the model can recognize your own handwritten digits. Background knowledge: Deep Learning (DNN)
Term Project Preliminaries Deep Learning Background Knowledge: DNN, MLP, RNN, CNN Tools and Data: MNIST, TortoiseGit, Anaconda (Python language), Jupyter Notebook, Theano or Keras + TensorFlow
Q&A