算法设计与分析 叶 德 仕 yedeshi@gmail.com
课程基本信息 课程编号: 21190120 上课时间:2007年 夏学期 上课地点:紫金港 东1B-206 周一、周三的晚上 6:30 ~ 上课地点:紫金港 东1B-206 考试时间:7月6日 14:00 – 16:00 考试形式:书面闭卷考试 学时/学分:4-2/周 学时/ 2-1 学分 11/18/2018
Office 办公室地点:曹光彪主楼 524 Tel.: 13957173448 11/18/2018
$1,000,000 problem P = NP ? http://www.claymath.org/millennium/ Solved???!!!! 11/18/2018
Examination Scores: 1. Final examination (50%) 2. Two projects (40% ) 3. Attendances & Questions (10%) 11/18/2018
想成为计算机科学家? 11/18/2018
Erdős project Paul Erdős(1913-1996) has an Erdős number of zero. If the lowest Erdős number of a coauthor is X, then the author's Erdős number is X + 1. 11/18/2018
Nevanlinna Prize winners NAME YEAR COUNTRY ERDÖS NUMBER Robert Tarjan 1982 USA 2 Leslie Valiant 1986 Hungary/Gt Brtn 3 Alexander Razborov 1990 Russia 2 Avi Wigderson 1994 Israel 2 Peter Shor 1998 USA 2 Madhu Sudan 2002 India/USA 2 Jon Kleinberg 2006 USA 3 11/18/2018
Other famous people Albert Einstein 1921 Physics 2 Chen Ning Yang 1957 Physics 4 Tsung-dao Lee 1957 Physics 5 John F. Nash 1994 Economics 4 Edmund S. Phelps 2006 Economics 4 Shing-Tung Yau 1982 China 2 Shiing Shen Chern 1983-84 China 2 Alan Turing computer science 5 John von Neumann mathematics 3 David Hilbert mathematics 4 Donald E. Knuth 2 11/18/2018
还是想成为一名 技术人员? 11/18/2018
领导者或决策者? 11/18/2018
老板布置任务了 11/18/2018
你的答案: Um? 告诉我写那些代码? 11/18/2018
还是随便的应付? 11/18/2018
你的答案: 我学过这个算法,但想不起来了 图书馆里有现成 11/18/2018
你的答案: 我可以设计一个新的算法 11/18/2018
A Microsoft interview? 11/18/2018
未来是属于 谁能掌握基本问题和解决方法 有能力解决新出现或者不熟悉的问题 11/18/2018
What is algorithm? (Oxford Dict.)Algorithm: From Math world A set of rules that must be followed when solving a particular problem. From Math world A specific set of instructions for carrying out a procedure or solving a problem, usually with the requirement that the procedure terminate at some point. An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. 11/18/2018
Algorithm Problem definition 问题 Objective 目标 (very important) Methods 方法 Performance 算法评价 11/18/2018
Perspective Algorithms we can find everywhere They have been developed to easy our daily life Train/Airplane timetable schedule Routing We live in the age of information Text, numbers, images, video, audio Human Genome project Internet Electronic commerce Manufacturing 11/18/2018
Algorithm in daily life Clothes: strip packing Cooking: menu scheduling Accommodation: facility location Traffic: traffic lights 11/18/2018
Pricing Water, electrical power pricing: Step pricing Promotion: Hangbai et al., buy items with total price >= 300, then 60 bonus, but each invoice uses only once. Bin covering, Bin packing, Open – End bin packing problem. 11/18/2018
Selfish routing Pigou's Example Suburb s, a nearby train station t. Assuming that all drivers aim to minimize the driving time from s to t C(x) = 1 Suburb: s t C(x) = x, with x in [0, 1] 11/18/2018
Selfish routing We have good reason to expect all traffic to follow the lower road Social optimal? ½ to the long, wide highway, ½ to the lower road. selfish behavior need not produce a socially optimal outcome 11/18/2018
Braess's Paradox v C(x) = x C(x) = 1 s t C(x) = 1 C(x) = x w 11/18/2018
Braess's Paradox v C(x) = x C(x) = 1 C(x) = 0 s t C(x) = 1 C(x) = x w 11/18/2018
Braess's Paradox Paradox thus shows that the intuitively helpful action of adding a new zero-cost link can negatively impact all of the traffic! With selfish routing, network improvements can degrade network performance. 11/18/2018
History of Algorithm The word algorithm comes from the name of the 9th century Persian mathematician Abu Abdullah Muhammad ibn Musa al-Khwarizmi whose works introduced Arabic numerals and algebraic concepts. The word algorism originally referred only to the rules of performing arithmetic using Arabic numerals but evolved into algorithm by the 18th century. The word has now evolved to include all definite procedures for solving problems or performing tasks. 11/18/2018
History – con. The first case of an algorithm written for a computer was Ada Byron's notes on the analytical engine written in 1842, for which she is considered by many to be the world's first programmer. However, since Charles Babbage never completed his analytical engine the algorithm was never implemented on it. This problem was largely solved with the description of the Turing machine, an abstract model of a computer formulated by Alan Turing, and the demonstration that every method yet found for describing "well-defined procedures" advanced by other mathematicians could be emulated on a Turing machine (a statement known as the Church-Turing thesis). 11/18/2018
课程内容 1. 数学基础 2. 基本算法 2.1 分治 (Divide-and-Conquer)* 1.1 算法基础 1.1 算法基础 1.2 和 (SUMS) 集合运算 (Sets) 1.3 特殊数 (Stirling numbers, Harmonic numbers, Eulerian numbers et al.) 2. 基本算法 2.1 分治 (Divide-and-Conquer)* 2.1.1 Mergesort * 2.1.2 自然数相乘(Multiplication)* 2.1.3 矩阵相乘(Matrix multiplication) 2.1.4 Discrete Fourier transform and Fast Fourier transform 11/18/2018
课程内容 2.2 动态规划 (Dynamic Programming) 2.2.1 背包问题(Knapsack problem) 2.2.2 最长递增子序列(Longest increasing subsequence) 2.2.3 Sequence alignment 2.2.4 最长相同子序列(Longest common subsequence) 2.3.5 Matrix-chain multiplication 2.3.6 树上的独立集 (Max Independent set in tree) 11/18/2018
课程内容 2.3 贪婪算法 (Greedy) 2.4 NP 问题 (NP-completeness) 2.3.1 区间规划(Interval scheduling) 2.3.2 集合覆盖(Set cover) 2.3.3 拟阵(Matroids) 2.4 NP 问题 (NP-completeness) 2.4.1 The classes P and NP 2.4.2 NP-completeness and reducibility 2.4.3 NP-complete problems * 11/18/2018
课程内容 2.5 近似算法 (Approximate Algorithm) 2.5.1 顶点覆盖问题 (Vertex cover) 2.5.2 负载平衡问题 (Load balancing) 2.5.3 旅行商问题 (Traveling salesman problem) 2.5.4 子集和问题 (Subset sum problem) 11/18/2018
课程内容 3. 算法的应用 3.1 局部搜索 (Local Search) 3.1.1 The Metropolis Algorithm and Simulated Annealing 3.1.2 Local Search to Hopfield Neural Networks(Nash Equilibria) 3.1.3 Maximum Cut Approximation via Local Search 11/18/2018
课程内容 3.2 图论 (Graph Theorem) 3.3计算几何学 (Computational Geometry) 3.2.1 图论的基本知识 (Fundamental) 3.2.2 线性规划 (Linear Programming) 网络流(Network Flow),二分图,完全图的匹配 3.3计算几何学 (Computational Geometry) 3.3.1 基本概念与折线段的性质 (Line-segment ) 3.3.2 线段的一些性质 (Segments intersects ) 3.3.3 凸包问题 (Convex Hull ) 3.3.4 最近点对问题 (The closet pair of points) 3.3.5 多边形三角剖分 (Polygon Triangulation) 11/18/2018
课程内容 3.5 随机算法 (Randomized Algorithm) 3.4 字符串匹配 (String Matching)* 3.4.1 字符串匹配的简单算法 3.4.2 The Karp-Rabin algorithm 3.4.3 Finite automata, The Knuth-Morris-Pratt algorithm 3.5 随机算法 (Randomized Algorithm) 3.5.1 随机变量与期望 3.5.2 A Randomized MAX-3-SAT 3.5.3 Randomized Divide-and-Conquer 11/18/2018
课程内容 3.6 组合数学与数论 (Number-Theoretic) 3.7 高精度 (BigNums) 3.8 数据结构 (Data Structure)* 跳跃链表(Skip list), Fibonacci Heaps, Data Structures for Disjoint Sets 3.9 矩阵运算/计算数学 (Matrix operations)* 矩阵性质, 矩阵相乘, 解线性方程组, 高斯消去法 *:备选内容 11/18/2018
主要参考教材 Introduction to algorithms, Second Edition. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. The MIT Press, 2001. ISBN: 0262032937. Algorithm Design. Jon Kleinberg, Éva Tardos. Addison Wesley, 2005. ISBN: 0-321-29535-8. Clifford Stein: Columbia University. Thomas H. Cormen: darmouth. Charles Leiserson, Ronald L. Rivest :MIT. Rolf Nevanlinna prize starts from 1982 for each four year. Rolf Nevanlinna Prize, 06 11/18/2018
11/18/2018
参考教材 Algorithms. S. Dasgupta, C.H. Papadimitriou, and U. V. Vazirani. May 2006. Combinatorial Algorithms. Jeff Erickson. University of Illinois, Urbana-Champaign. Lecture Notes. Fall 2002. Concrete Mathematics. Ronald L. Graham, Donald E. Knuth, Oren Patashnik. Addison-Wesley Publishing Company, 2005. ISBN: o-201-14236-8. 数据结构与算法 – Data structures and algorithms. A. V. Aho J. E. Hopcroft, J. D. Ullman. 清华大学出版社, 2003. 算法艺术与信息学竞赛.刘汝佳 黄亮. ISBN 7-302-07800-9, 清华大学出版社,2004. 计算几何—算法分析与设计.周培德. 清华大学出版社ISBN 7-302-03801-5,2000. 11/18/2018
课程目标: 实践算法设计和分析的技能。 培养各种证明能力,如:归纳法证明。 培养寻找最坏情况的例子。 培养对算法的直觉。能够对陌生问题提出算法,寻找解决问题的途径。 能判断什么情况下已找到最优解?或者能设计更好的算法。 11/18/2018
Tools? 11/18/2018
Requirement Come to the class (*) Ask questions Thinking: Why it is ok now? How about other methods? 11/18/2018
Algorithms in Computer Science P = NP ? Can we solve a problem efficiently? Tradeoff between quality of solution and the running time Solve a problem with optimal solution, but it might cost long time Solve a problem approximately in short time 11/18/2018
Example Trucking company with a central warehouse Each day, it loads up the truck at the warehouse and sends it around to several locations to make deliveries. At the end of the day, the truck must end up back at the warehouse so that it ready to be loaded for the next day. To reduce the costs, the company wants to select an order of delivery stops that yields the lowest overall distance traveled by the truck. 11/18/2018
Algorithm as technology Suppose computers were infinitely fast and computer memory was free. Do we have any reason to study algorithms? The answer is: Yes 11/18/2018
Lost cow problem A short-sighted cow (or assume it’s dark, or foggy, or ...) is standing in front of a fence and does not know in which direction the only gate in the fence might be. How can the cow find the gate without walking too great a detour? How can two soldiers get together when lost in battlefield ? 11/18/2018
The Ski problem The Ski problem [Karp 92]: A skier must decide every day she goes skiing whether to rent or buy skis, unless or until she decides to buy them. The skiier doesn’t know how many days she will go on skiing before she gets tired of this hobbie. The cost to rent skis for a day is 1 unit, while the cost to buy the skis is B units. How can she save money? 11/18/2018
Pizza delivery One can give a call or via internet to order a pizza for dinner We want the hot, fresh and tasty pizzas How should they delivery the pizzas upon the reception of orders?? Immediately or wait some minutes for next orders in the near places? 11/18/2018
Computing time is a bounded resource and so is space in meomery. 11/18/2018
Efficiency Computer A is 100 times faster than computer B Sort n numbers Computer A requires instructions Computer B requires 50nlgn instructions n = 1,000, 000 Computer A: 2(10^6)^2/10^9 = 2000 seconds Computer B: 50*10^6 lg 10^6/10^7 ~ 100 seconds 11/18/2018
Running time 10 < 1 s < 1s 4 s 100 1 s 18 min year 1,000 Very long 10,000 2 min 12 day 20 s 12 days 31710 year n 2 n n l o g n 3 2 n n ! 1 2 5 1 6 11/18/2018
Sorting < a a … > a , , , Input: 8 2 4 9 3 6 Output: 2 3 4 6 8 9 输入:A sequence of n number 输出:排列(permutation ) < a a … > 1 2 a , , , n < a a … a > 1 , 2 , , n 使得: <= <= <= a a ... a 1 2 Example: n Input: 8 2 4 9 3 6 Output: 2 3 4 6 8 9 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 2 8 4 9 3 6 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 2 8 4 9 3 6 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 2 8 4 9 3 6 2 4 8 9 3 6 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 2 8 4 9 3 6 2 4 8 9 3 6 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 2 8 4 9 3 6 2 4 8 9 3 6 2 4 8 9 3 6 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 2 8 4 9 3 6 2 4 8 9 3 6 2 4 8 9 3 6 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 2 8 4 9 3 6 2 4 8 9 3 6 2 4 8 9 3 6 2 3 4 8 9 6 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 2 8 4 9 3 6 2 4 8 9 3 6 2 4 8 9 3 6 2 3 4 8 9 6 11/18/2018
EX. of insertion sort 8 2 4 9 3 6 2 8 4 9 3 6 2 4 8 9 3 6 2 4 8 9 3 6 2 3 4 8 9 6 2 3 4 6 8 9 done 11/18/2018
Insertion sort “pseudocode” A: key sorted INSERTION-SORT (A, n) ⊳ A[1 . . n] for j ← 2 to n do key ← A[ j] i ← j – 1 while i > 0 and A[i] > key do A[i+1] ← A[i] i ← i – 1 A[i+1] = key “pseudocode” 1 i j n A: key sorted 11/18/2018
Running time Running time: The running time of an algorithm on a particular input is the number of primitive operations or “steps” executed. Input size: number of items, the total number of bits. Example: The input size of sorting problem is n. Worst-case running time of Insert sort is Q(n2). 11/18/2018
Running time The running time depends on the input: an already sorted sequence is easier to sort. Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones. Generally, we seek upper bounds on the running time, because everybody likes a guarantee. 11/18/2018
Kinds of analyses Worst-case: (usually) Average-case: (sometimes) T(n) = maximum time of algorithm on any input of size n. Average-case: (sometimes) T(n) = expected time of algorithm over all inputs of size n. Need assumption of statistical distribution of inputs. Best-case: (bogus) Cheat with a slow algorithm that works fast on some input. 11/18/2018