How and why (NOT) should we embrace big data

Presentation transcript:

How and why (NOT) should we embrace big data? A reflection from the aspect of epistemology. Prof. Chengshan (Frank) Liu, Institute of Political Science, NSYSU. 2018.5.3 @Dept. of Political Science, NCKU

What is a PhD for? Fact, truth, reality, knowledge, or…?

What is “big data”? 5Vs: Big volume, velocity, variety, veracity, and value. Honestly, this term has gone out of fashion.

What do scholars mean by “big data”? In our field, “data-driven” and “method-driven” research is labelled as “big data” studies. Methods associated with “big data”: text mining, data mining, automatic content analysis, computer-assisted text analysis, automatic annotation, sentiment analysis, geographic information systems, network analysis, and so on.
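
To make one of the listed methods concrete, here is a minimal sketch of dictionary-based sentiment analysis. It is an illustration only, not a method taken from the talk: the word lists, the `sentiment_score` function, and the sample posts are all hypothetical, and real studies would rely on validated lexicons or trained models.

```python
import re

# Hypothetical toy lexicons (assumption); real work needs validated dictionaries.
POSITIVE = {"good", "great", "support", "hope", "win"}
NEGATIVE = {"bad", "corrupt", "fail", "fear", "lose"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return (positive hits - negative hits) / token count, or 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


# Made-up example posts, for illustration only.
posts = [
    "We hope the new policy will win support",
    "We fear they will fail",
]
for post in posts:
    print(f"{sentiment_score(post):+.2f}  {post}")
```

A score above zero means positive words outnumber negative ones. What the score means for a research question is still left to the researcher.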

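Publications abroad on applying big data to political science research are led by Gary King. Most of the other literature has also been influenced and inspired, to a greater or lesser degree, by the research group King leads, to the point that it has effectively become a Gary King school. At Harvard University's Institute for Quantitative Social Science (IQSS), King studies how different research methods and quantitative tools can advance social science research. Check out his upcoming talks May 29-30 @NTU. Image source: http://ppt.cc/Aqutw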

King’s purposes for embracing big data: evaluate public policy; understand what social media posts say; estimate the causes of death; ensure fair legislative redistricting; reverse engineer the Chinese government’s censorship program; forecast elections and international conflict.

Topic 1: An overview of information tools applied in the social sciences (political science)
2010. “A Method of Automated Nonparametric Content Analysis for Social Science.”
2012. “Social Science Research Methods in Internet Time.”
2014. “Restructuring the Social Sciences: Reflections from Harvard’s Institute for Quantitative Social Science.”
2015. “Computer-Assisted Text Analysis for Comparative Politics.”
2015. “No! Formal Theory, Causal Inference, and Big Data Are Not Contradictory Trends in Political Science.”
2015. “We Are All Social Scientists Now: How Big Data, Machine Learning, and Causal Inference Work Together.”
2015. “Is Bigger Always Better? Potential Biases of Big Data Derived from Social Network Sites.”
2016. “Machine Translation: Mining Text for Social Theory.”

Topic 2: Identifying or tracking trends in public discourse
2008. “Recognizing Citations in Public Comments.”
2008. “Parsing, Semantic Networks, and Political Authority Using Syntactic Analysis to Extract Semantic Relations from Dutch Newspaper Articles.”
2008. “Good News or Bad News? Conducting Sentiment Analysis on Dutch Text to Distinguish Between Positive and Negative Relations.”
2008. “Media Monitoring by Means of Speech and Language Indexing for Political Analysis.”
2012. “Media Coverage in Times of Political Crisis: A Text Mining Approach.”
2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.”
2014. “Echo Chamber or Public Sphere? Predicting Political Orientation and Measuring Political Homophily in Twitter Using Big Data.”
2017. “Critical News Reading with Twitter? Exploring Data-mining Practices and their Impact on Societal Discourse.”

Other topics (3 to 5)
Topic 3: Identifying or tracking political positions
2003. “Extracting Policy Positions from Political Texts Using Words as Data.”
2008. “A Scaling Model for Estimating Time-series Party Positions from Texts.”
2014. “Scaling Politically Meaningful Dimensions Using Texts and Votes.”
2015. “Quantifying Social Media’s Political Space: Estimating Ideology from Publicly Revealed Preferences on Facebook.”
Topic 4: Strategies for regulating political speech
2013. “How Censorship in China Allows Government Criticism but Silences Collective Expression.”
2013. Media Commercialization & Authoritarian Rule in China.
2017. “How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument.”
Topic 5: Examining the formation of public policy
2005. “Using Geographic Information Systems to Study Interstate Competition.”
2014. “‘Big Data’ in Research on Social Policy.”
2015. “Analyzing Big Data: Social Choice and Measurement.”

Other topics (6 to 8)
Topic 6: Semantic analysis of political speech
2008. “Automatic Annotation of Semantic Fields for Political Science Research.”
2015. “Uncovering Social Semantics from Textual Traces: A Theory Driven Approach and Evidence from Public Statements of US Members of Congress.”
Topic 7: Applications in elections
2014. “Political Campaigns and Big Data.”
2017. “The Pulse of the People: Can internet data outdo costly and unreliable polls in predicting election outcomes?”
Topic 8: International relations research
2012. “Richardson in the Information Age: Geographic Information Systems and Spatial Data in International Studies.”

Why (not) big data? Your epistemological and methodological stances, and your attitudes toward methods, decide how you evaluate (if not disdain) “big data”.

From Big data to Data science “Data science is an interdisciplinary field of scientific methods, processes, algorithms and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to data mining.” ~ Wikipedia

How do positivists look at “big data”? Evans & Aceves (2016), “Machine Translation: Mining Text for Social Theory.”

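Let’s look at the whole thing from the right angle: data-assisted meaning netting. The practice of big data tells us that, since the purpose of knowledge is exploration, we should focus on discovery and not (necessarily) on verification. Data can be used to find associations and, beyond that, to mine meaning. Start by identifying the concepts or dimensions you are interested in (which values, which behaviors, which attitudes?), then explore them through data. Identify the possible relationships among different values, attitudes, and behaviors while keeping them in dialogue with the relationships you expected. Only then interpret the meaning. Let’s make our exploration DAMN right.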
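
The exploratory steps on this slide (pick concepts, look for relationships in the data, compare them with what you expected) could be sketched roughly as below. This is only an illustration under stated assumptions: the respondent records, the variable names `value` and `attitude`, and the helpers `crosstab` and `row_shares` are all hypothetical and do not come from the talk or from any real survey.

```python
from collections import Counter


def crosstab(rows, a, b):
    """Count how often each (value of a, value of b) pair occurs."""
    return Counter((r[a], r[b]) for r in rows)


def row_shares(table):
    """Convert pair counts into shares within each value of the first variable."""
    totals = Counter()
    for (x, _), n in table.items():
        totals[x] += n
    return {(x, y): n / totals[x] for (x, y), n in table.items()}


# Made-up records (assumption): one stated value and one attitude per respondent.
respondents = [
    {"value": "stability", "attitude": "status quo"},
    {"value": "stability", "attitude": "status quo"},
    {"value": "stability", "attitude": "change"},
    {"value": "autonomy", "attitude": "change"},
    {"value": "autonomy", "attitude": "change"},
    {"value": "autonomy", "attitude": "status quo"},
    {"value": "autonomy", "attitude": "change"},
]

shares = row_shares(crosstab(respondents, "value", "attitude"))
for (value, attitude), share in sorted(shares.items()):
    print(f"{value:>9} -> {attitude:<10} {share:.2f}")
```

The table only surfaces a pattern. Setting it against the relationship you expected, and interpreting what it means, remains the researcher's job.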

Data science for extracting facts and discovering meaning: fact vs. truth vs. reality vs. knowledge

March 2016. Google watched how people use a phone in a van for over an hour at a time. Goal: to complete interviews with 500 people.

Reflections from the Humanities
Holmes, J. (2015). Nonsense: The Power of Not Knowing (First Edition). New York: Crown Publishers. 《無知的力量》
Lindstrom, M. (2016). Small Data: The Tiny Clues That Uncover Huge Trends. New York City: St. Martin’s Press. 《小數據獵人》
Madsbjerg, C. (2017). Sensemaking: The Power of the Humanities in the Age of the Algorithm. New York, NY: Hachette Books.

Meaning netting
Blackburn, S. (2012). What Do We Really Know? The Big Questions in Philosophy. London: Quercus.
Cohen, L. H. (2013). I Don’t Know: In Praise of Admitting Ignorance. New York: Riverhead Books.
Holmes, J. (2015). Nonsense: The Power of Not Knowing (First Edition). New York: Crown Publishers.
Madsbjerg, C. (2017). Sensemaking: The Power of the Humanities in the Age of the Algorithm. New York, NY: Hachette Books.
Sesno, F., & Blitzer, W. (2017). Ask More: The Power of Questions to Open Doors, Uncover Solutions, and Spark Change. New York: AMACOM.
Zarkadakis, G. (2016). In Our Own Image: Savior or Destroyer? The History and Future of Artificial Intelligence (1st edition). Pegasus Books.

DamN Methods

Data: Taiwan Election and Democracy Studies 2016. Data collection period: 2017.1.17 ~ 4.28. N = 1,690. Cost: > NTD 1,000,000

A profile of people with no party leaning

A profile of Blue and Green supporters

Conclusion: It is not a matter of quantitative vs. qualitative means, nor of big data vs. thick data. It is the dialogue between data and meaning in the researcher's mind.

How do I re-evaluate the “survey”?

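Have you ever considered that people in Taiwan define “independence” in many different ways, and quite possibly share no consensus on it?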

Smilepoll.tw: a quali-quantitative platform for collecting preferences, patterns, and values, for netting data and meaning.

Conclusion: How and why (NOT) should we embrace big data?
Exploring new patterns via big data is the spirit of data science. (So think again about what political science means.)
Different epistemological camps see different uses for big data. (Which side will you take?)
“Meaning mining with data” is the consequence of the above way of thinking.
Data size matters much less than the purposes of using data.
Learning new data-analysis tools will help you get connected to the world of exploring patterns and facts via data. But be fully aware that we should locate our purposes first.