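Applications of Bayesian Methods in Political Science
Wu Jing, Department of Statistics, Tianjin University of Finance and Economics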
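Background and development
Bayesian principles and computation
Applications of Bayesian methods in political science
Research on Bayesian methods in political science
Future development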
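Background and Development
Statistics, as a scientific methodology, is widely applied in research across the natural, social, and economic domains. After World War II, with the rise of behavioralism in political science, using statistics and other quantitative methods to study political behavior became fashionable, and a large body of research appeared.
Beginning in the 1970s, statistical methods were adopted as tools for the quantitative study of political activity and gradually became an important part of political science methodology, giving rise to a system of political statistics.
Political science research in China still relies mainly on normative analysis, with almost no statistical empirical analysis.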
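To many who work in empirical political science, Bayesian statistics may seem like an odd branch that shows up only occasionally in journals and books without occupying a central place. This view appears to be changing, and changing quickly.
Bayesian statistical thinking began to enter political science in the 1980s.
In the 12 years from 2000 to 2012, 176 papers involving Bayesian methods appeared in the journal Political Analysis, which shows how widely Bayesian methods have come to be used.
King, G. (1990). On political methodology. Political Analysis, 1-29.
Gill, J. (2004). Introduction to the special issue [Bayesian methods]. Political Analysis, 12, 323–337.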
Selected surveys and monographs on the study and application of Bayesian methods in political science:
Western B, Jackman S. (1994). Bayesian inference for comparative research[J]. American Political Science Review, 1994: 412-423.
Jackman, S. (2004). Bayesian analysis for political research. Annual Review of Political Science, 7, 483-505.
Martin, A. D. (2004). Bayesian Inference and Computation in Political Science. http://www.polmeth.wustl.edu/media/Paper/berger.pdf
Chen, M. H., Dey, D. K., Müller, P., Sun, D., & Ye, K. (2010). Bayesian Inference in Political Science, Finance, and Marketing Research. In Frontiers of Statistical Decision Making and Bayesian Analysis (pp. 377-417). Springer New York.
Gill J. (2012). Bayesian Methods in Political Science: Introduction to the Virtual Issue. http://www.oxfordjournals.org/our_journals/polana/pa_bayes2.pdf
Gary King
Gary King is the Albert J. Weatherhead III University Professor at Harvard University (one of 24 with the title of University Professor, Harvard's most distinguished faculty position). He is based in the Department of Government (in the Faculty of Arts and Sciences) and serves as Director of the Institute for Quantitative Social Science.
King develops and applies empirical methods in many areas of social science research, focusing on innovations that span the range from statistical theory to practical application (software).
King has been elected Fellow in 6 honorary societies.
Andrew Gelman
Professor of statistics and political science and director of the Applied Statistics Center at Columbia University. He has received the Outstanding Statistical Application award from the American Statistical Association, the award for best article published in the American Political Science Review, and the Council of Presidents of Statistical Societies award for outstanding contributions by a person under the age of 40.
His books include Bayesian Data Analysis (with John Carlin, Hal Stern, and Don Rubin), Teaching Statistics: A Bag of Tricks (with Deb Nolan), Data Analysis Using Regression and Multilevel/Hierarchical Models (with Jennifer Hill), Red State, Blue State, Rich State, Poor State: Why Americans Vote the Way They Do (with David Park, Boris Shor, Joe Bafumi, and Jeronimo Cortina), and A Quantitative Tour of the Social Sciences (co-edited with Jeronimo Cortina).
Andrew has done research on a wide range of topics, including voting, elections, democracy, policing, social network structure, toxicology, medical imaging, and methods in surveys, experimental design, statistical inference, computation, and graphics.
Jeff Gill
Professor, Washington University. (BA UCLA, MBA Georgetown, Ph.D. American University, Post-Doc Harvard).
Major areas of research and interest are:
[Methodology and Statistics] Bayesian approaches, Markov chain Monte Carlo, queueing theory, nonparametrics, missing data, generalized linear model theory, model selection, circular data, and general problems in statistical computing;
[Epidemiology] mental health outcomes for children exposed to war, foot-and-mouth disease, containment policy, and measurement/data issues;
[Medicine] pediatric traumatic brain injury, linkages between obesity and cancer (including human energetics and mouse models), models of Warfarin dosage, psychiatric trauma, physiological effects of stress;
[Political Science] voting, terrorism, Scottish politics, expert elicitation, bureaucracy.
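Why have researchers suddenly become more interested in applying Bayesian models in political science? One clear reason is that Bayesian model specification has distinctive advantages over traditional models: results are expressed as probability statements, and prior information is incorporated through a built-in mechanism.
The second reason concerns computation. The problem of intractable multidimensional integrals was finally overcome by MCMC techniques, which led to the current revival of the Bayesian school.
Bayesian statistics became more popular in the early twenty-first century.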
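Bayesian Methods and Computation
The core philosophical foundation of Bayesian inference is to treat unknown quantities and parameters as random variables. All observed values are treated as fixed and conditioned on, while every unobserved quantity is assumed to have a distribution and is treated as a random variable.
Bayesian inference: a prior distribution for the unknown quantity is obtained from experience, qualitative description, statistics, or intuition, and the posterior distribution is then obtained from the prior distribution together with the observed data.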
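The prior-to-posterior step can be illustrated with a small conjugate example. The sketch below is not taken from the slides: it assumes a Beta prior on an unknown proportion (say, support for a candidate) and binomial survey data, and the function names and numbers are hypothetical.

```python
from fractions import Fraction


def beta_binomial_update(alpha, beta, successes, failures):
    """Return the Beta posterior parameters after observing binomial data.

    With a Beta(alpha, beta) prior and a binomial likelihood, the posterior
    is Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, as an exact fraction."""
    return Fraction(alpha, alpha + beta)


# Hypothetical prior belief: proportion is around 0.5, held weakly.
prior = (2, 2)
# Hypothetical data: 7 of 10 respondents support the candidate.
posterior = beta_binomial_update(*prior, successes=7, failures=3)

prior_mean = beta_mean(*prior)
posterior_mean = beta_mean(*posterior)
print("posterior parameters:", posterior)
print("prior mean:", float(prior_mean), "posterior mean:", float(posterior_mean))
```

The posterior mean falls between the prior mean and the observed proportion, which is the sense in which the posterior combines prior information with the data.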
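Computation
Marginal posterior distributions are obtained by stochastic simulation with MCMC methods.
MCMC is Monte Carlo integration using Markov chains. The basic idea is to construct a Markov chain whose stationary distribution is the posterior distribution of the parameters to be estimated, use this chain to generate samples from the posterior, and carry out Monte Carlo integration using the samples drawn once the chain has reached its stationary distribution (the usable samples).
One of the most common ways to generate the Markov chain is the Gibbs sampler (the default mechanism of the WinBUGS package), which repeatedly samples from the full conditional distribution of each parameter to obtain an empirical estimate of the marginal posterior distributions.
Software: MCMCpack and WinBUGS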
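As a rough illustration of the Gibbs idea, the sketch below samples the mean and precision of a normal model by alternating draws from their full conditional distributions. It is a minimal standalone example, not the WinBUGS or MCMCpack implementation. The semi-conjugate priors (normal on the mean, gamma on the precision), the hyperparameter values, the simulated data, and the function name `gibbs_normal` are all assumptions made for illustration.

```python
import random
import statistics


def gibbs_normal(y, n_iter=5000, burn_in=1000,
                 mu0=0.0, tau0=1e-4, a0=0.01, b0=0.01, seed=0):
    """Gibbs sampler for y_i ~ Normal(mu, 1/tau).

    Assumed priors: mu ~ Normal(mu0, 1/tau0), tau ~ Gamma(a0, rate=b0).
    Returns the (mu, tau) draws kept after the burn-in period.
    """
    rng = random.Random(seed)
    n = len(y)
    sum_y = sum(y)
    mu, tau = statistics.fmean(y), 1.0  # starting values
    draws = []
    for t in range(n_iter):
        # Full conditional of mu given tau: a normal distribution.
        precision = tau0 + n * tau
        mean = (tau0 * mu0 + tau * sum_y) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # Full conditional of tau given mu: a gamma distribution.
        sse = sum((yi - mu) ** 2 for yi in y)
        rate = b0 + 0.5 * sse
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        tau = rng.gammavariate(a0 + n / 2, 1.0 / rate)
        if t >= burn_in:
            draws.append((mu, tau))
    return draws


# Simulated data (hypothetical): true mean 3, true standard deviation 2.
data_rng = random.Random(42)
y = [data_rng.gauss(3.0, 2.0) for _ in range(200)]

draws = gibbs_normal(y)
# Monte Carlo integration: posterior expectations are averages over the draws.
post_mu = statistics.fmean([m for m, _ in draws])
post_sd = statistics.fmean([t ** -0.5 for _, t in draws])
print("posterior mean of mu:", round(post_mu, 3))
print("posterior mean of sigma:", round(post_sd, 3))
```

Discarding the first `burn_in` iterations is one simple way to keep only draws made after the chain has approached its stationary distribution. In practice, convergence would also be checked with diagnostics and multiple chains.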
Applications of Bayesian Methods in Political Science
Bayesian methods have been widely applied in political science. Below are some important, highly cited works.
Bartels L M. Messages received: The political impact of media exposure[J]. American Political Science Review, 1993: 267-285.
Gelman, A. (2012). How Bayesian analysis cracked the red-state, blue-state problem.
Beck, Nathaniel, Gary King, and Langche Zeng. 2000. Improving quantitative studies of international conflict: A conjecture. American Political Science Review 94(1): 21-35.
Katz J N, King G. A statistical model for multiparty electoral data[J]. American Political Science Review, 1999: 15-32.
King G, Murray C J L, Salomon J A, et al. Enhancing the validity and cross-cultural comparability of measurement in survey research[J]. American Political Science Review, 2003, 97(4): 567-584.
Hill J L, Kriesi H. An extension and test of Converse's "black-and-white" model of response stability[J]. American Political Science Review, 2001, 95(2): 397-414.
Barabas J. How deliberation affects policy opinions[J]. American Political Science Review, 2004, 98(04): 687-701.
Bartels B L. The constraining capacity of legal doctrine on the US Supreme Court[J]. American Political Science Review, 2009, 103(3): 474-95.
Shih V, Adolph C, Liu M. Getting ahead in the communist party: explaining the advancement of central committee members in China[J]. American Political Science Review, 2012, 106(01): 166-187.
Shor B, McCarty N. The ideological mapping of American legislatures[J]. American Political Science Review, 2011, 105(3): 530-51.
A very important and highly contested area within Bayesian methods is the specification of prior distributions, which A. Gelman (2009) summarizes.
Gelman A. Prior distributions for Bayesian data analysis in political science. 2009. http://www.polmeth.wustl.edu/media/Paper/berger.pdf
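Research on Bayesian Methods in Political Science
Some scholars focus on Bayesian methodology within political science, covering problems of measurement, specification, dimensionality, and estimation. Where other methods make a problem difficult or impossible to solve, or are theoretically inappropriate, Bayesian methods are often the better choice.
Papers of this kind appear in Political Analysis and in other journals.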
Papers on Bayesian Methods in Political Analysis
Ward M D, Gleditsch K S. Location, location, location: An MCMC approach to modeling spatial context with categorical variables in the study and prediction of war[C]//Political Analysis. 2000.
Beck N, Katz J N. Random coefficient models for time-series–cross-section data: Monte Carlo experiments[J]. Political Analysis, 2007, 15(2): 182-195.
Stegmueller D. Modeling dynamic preferences: a Bayesian robust dynamic latent ordered probit model[J]. Political Analysis, 2013.
Buckley J. Simple Bayesian inference for qualitative political research[J]. Political Analysis, 2004, 12(4): 386-399.
Lock K, Gelman A. Bayesian combination of state polls and election forecasts[J]. Political Analysis, 2010, 18(3): 337-348.
Williams J T. Dynamic change, specification uncertainty, and Bayesian vector autoregression analysis[J]. Political Analysis, 1992: 97-125.
Spirling A. Bayesian approaches for limited dependent variable change point problems[J]. Political Analysis, 2007, 15(4): 387-405.
Imai K, Lu Y, Strauss A. Bayesian and likelihood inference for 2×2 ecological tables: an incomplete-data approach[J]. Political Analysis, 2008, 16(1): 41-69.
Western B, Kleykamp M. A Bayesian change point model for historical time series analysis[J]. Political Analysis, 2004, 12(4): 354-374.
Montgomery J M, Nyhan B. Bayesian model averaging: Theoretical developments and practical applications[J]. Political Analysis, 2010, 18(2): 245-270.
Grimmer J. A Bayesian hierarchical topic model for political texts: Measuring expressed agendas in Senate press releases[J]. Political Analysis, 2010, 18(1): 1-35.
Park D K, Gelman A, Bafumi J. Bayesian multilevel estimation with poststratification: state-level estimates from national polls[J]. Political Analysis, 2004, 12(4): 375-385.
Quinn K M. Bayesian factor analysis for mixed ordinal and continuous responses[J]. Political Analysis, 2004, 12(4): 338-353.
Bafumi J, Gelman A, Park D K, et al. Practical issues in implementing and understanding Bayesian ideal point estimation[J]. Political Analysis, 2005, 13(2): 171-187.
Jackman S. Estimation and inference are missing data problems: Unifying social science statistics via Bayesian simulation[J]. Political Analysis, 2000, 8(4): 307-332.
Jackman S. Multidimensional analysis of roll call data via Bayesian simulation: identification, estimation, inference, and model checking[J]. Political Analysis, 2001, 9(3): 227-241.
Shor B, Bafumi J, Keele L, et al. A Bayesian multilevel modeling approach to time-series cross-sectional data[J]. Political Analysis, 2007, 15(2): 165-181.
Papers on Bayesian methods published in other venues:
Martin A D, Saunders K L. Bayesian Inference for Political Science Panel Data[C]//American Political Science Association. 2002.
Darmofal D. Bayesian spatial survival models for political event processes[J]. American Journal of Political Science, 2009, 53(1): 241-257.
Gill J, Walker L D. Elicited priors for Bayesian model specifications in political science research[J]. Journal of Politics, 2005, 67(3): 841-872.
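Future Development
Bayesian methods provide a more flexible foundation for probability modeling and inference than any other known approach.
Future work should strengthen Bayesian research on time series, adding structural features and simultaneity to error structures and multilevel components. Models with an unknown number of change points have not yet been fully resolved. Because language is itself multilevel, text analysis should extend Bayesian multilevel specifications to improve results. Another exciting area is Bayesian nonparametrics.
A further general area that deserves more attention is the specification of prior distributions, whether from information about the substantive context or from suitable mathematical properties. In the first case, some disciplines, such as medicine, have successfully turned previous knowledge into prior distributions that improve the quality of the posterior. In the second case, the group known as "objective Bayesians" has advanced research on alternatives to low-information flat priors.
Gill J. Bayesian Methods in Political Science: Introduction to the Virtual Issue. http://www.oxfordjournals.org/our_journals/polana/pa_bayes2.pdf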
Comments and corrections are welcome!