Long-Sheng Chen, Cheng-Hsiang Liu, Hui-Ju Chiu


A neural network based approach for sentiment classification in the blogosphere. Long-Sheng Chen, Cheng-Hsiang Liu, Hui-Ju Chiu. Journal of Informetrics 5 (2011) 313–322. Report: Yi-Hsiang Hsieh

Outline: Introduction, Methodology, Experiments, Conclusion

Introduction(1/3) Recognizing emotion is extremely important for a text-based communication tool such as a blog. On commercial blogs, bloggers' evaluative comments about a product can spread at an explosive rate in cyberspace. Lately, researchers have been paying much attention to sentiment classification. Semantic orientation (SO) indexes and machine learning (ML) methods are usually employed to achieve this goal.

Introduction(2/3) To combine the advantages of these two methods, this study proposes a neural network (NN) based approach. The proposed NN based method combines the back-propagation neural network (BPN) and SO indexes to classify bloggers' sentiment. The NN based method can reduce training time when classifying textual data. According to the experimental results, the NN based method outperforms the traditional sentiment classification methods, BPN and SO indexes.

Introduction(3/3) Our method uses the results of the SO indexes as the inputs for the BPN. Several cases collected from real world blogs or databases are provided to demonstrate the effectiveness of our method. The experimental results indicate that our method can efficiently increase the performance of sentiment classification and save a substantial amount of training time compared with traditional IR and ML techniques, respectively.

Methodology(1/8) Back-propagation neural networks
Step 1. For each training pattern (presented in random order):
Step 1.1. Apply the inputs to the network.
Step 1.2. Calculate the output for every neuron from the input layer, through the hidden layer(s), to the output layer.
Step 1.3. Calculate the error at the outputs.
Step 1.4. Use the output error to compute error signals for pre-output layers.
Step 1.5. Use the error signals to compute weight adjustments.
Step 1.6. Apply the weight adjustments.
Step 2. Periodically evaluate the network performance.
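The back-propagation steps above can be sketched in a few lines of plain Python. This is only an illustration under assumptions that the slides do not state: one hidden layer, sigmoid activations, a squared-error loss, and a made-up toy data set of two SO-like scores per example. The names `TinyBPN`, `train`, and all parameter values are hypothetical and are not taken from the paper.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyBPN:
    """One-hidden-layer network trained with online back-propagation (illustrative only)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row carries one extra entry for the bias term.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # Steps 1.1 and 1.2: apply the inputs and propagate through the layers.
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        hb = hidden + [1.0]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hb)))
        return xb, hb, out

    def train_pattern(self, x, target, lr):
        xb, hb, out = self.forward(x)
        # Step 1.3: error at the output.
        error = target - out
        # Step 1.4: error signals for the output neuron and the hidden layer.
        delta_out = error * out * (1.0 - out)
        delta_hidden = [delta_out * self.w_out[j] * hb[j] * (1.0 - hb[j])
                        for j in range(len(self.w_hidden))]
        # Steps 1.5 and 1.6: compute and apply the weight adjustments.
        self.w_out = [w + lr * delta_out * h for w, h in zip(self.w_out, hb)]
        for j, row in enumerate(self.w_hidden):
            self.w_hidden[j] = [w + lr * delta_hidden[j] * v
                                for w, v in zip(row, xb)]

    def mse(self, data):
        return sum((t - self.forward(x)[2]) ** 2 for x, t in data) / len(data)

def train(net, data, epochs=500, lr=0.5, eval_every=100, seed=0):
    rng = random.Random(seed)
    history = []
    for epoch in range(1, epochs + 1):
        order = data[:]
        rng.shuffle(order)            # Step 1: random presentation order.
        for x, t in order:
            net.train_pattern(x, t, lr)
        if epoch % eval_every == 0:   # Step 2: periodic evaluation.
            history.append(net.mse(data))
    return history

# Made-up toy data: two SO-like scores per example, label 1.0 = positive sentiment.
data = [([1.0, 0.8], 1.0), ([0.9, 0.2], 1.0),
        ([-0.7, -0.9], 0.0), ([-0.3, -1.0], 0.0)]
net = TinyBPN(n_in=2, n_hidden=3)
initial_error = net.mse(data)
history = train(net, data)
```

Here `history` holds the mean squared error recorded at each periodic evaluation, so it can be compared with `initial_error` to check that training is making progress.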

Methodology(2/8) Semantic orientation indexes The general SO index infers semantic orientation from semantic association (SO-A). The SO-A index is defined in Eq. (1). A word is classified as having a positive (negative) semantic orientation when SO-A(word) is positive (negative). The magnitude (absolute value) of SO-A(word) can be considered the strength of the semantic orientation.
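Eq. (1) itself does not appear in this transcript. As a hedged reconstruction, the general form of SO-A given by Turney and Littman (2003) is shown below. The symbols are assumptions for illustration: Pwords and Nwords are sets of positive and negative paradigm words, and A(·,·) is any measure of association between two words.

```latex
\[
\text{SO-A}(word) = \sum_{pword \in Pwords} A(word, pword) \;-\; \sum_{nword \in Nwords} A(word, nword)
\]
```

Under this form, a word that is more strongly associated with the positive paradigm words than with the negative ones gets a positive score, which matches the sign rule described on the slide.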

Methodology(3/8) The second index calculates the semantic orientation from the PMI and is called the SO-PMI index. This index is extended from SO-A and is widely used in practice (Abbasi et al., 2008; Chaovalit & Zhou, 2005; Turney, 2002; Turney & Littman, 2003). Unlike SO-A, SO-PMI uses PMI-IR (pointwise mutual information and information retrieval) to estimate the semantic orientation of a phrase (Church & Hanks, 1989; Turney, 2002). The PMI between two words, word1 and word2, is defined first, and the SO-PMI is then calculated from it.
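The two equations on this slide are not preserved in the transcript. The sketch below assumes the standard definitions from the cited literature: PMI(word1, word2) = log2[p(word1 & word2) / (p(word1) p(word2))], and SO-PMI(word) as the sum of PMI with positive paradigm words minus the sum of PMI with negative paradigm words. The toy corpus, the paradigm words, the smoothing constant, and the function names are all made up for illustration. Probabilities are estimated from document co-occurrence, which roughly corresponds to an AND-style operator.

```python
import math

def doc_prob(docs, *words, smoothing=0.01):
    """Smoothed fraction of documents that contain all the given words."""
    hits = sum(1 for d in docs if all(w in d for w in words))
    return (hits + smoothing) / len(docs)

def pmi(docs, w1, w2):
    """Pointwise mutual information between two words, in bits."""
    joint = doc_prob(docs, w1, w2)
    return math.log2(joint / (doc_prob(docs, w1) * doc_prob(docs, w2)))

def so_pmi(docs, word, pwords, nwords):
    """Association with positive paradigm words minus association with negative ones."""
    positive = sum(pmi(docs, word, p) for p in pwords)
    negative = sum(pmi(docs, word, n) for n in nwords)
    return positive - negative

# Made-up toy corpus: each document is a set of tokens.
docs = [
    {"great", "plot", "excellent"},
    {"excellent", "acting", "great"},
    {"boring", "plot", "poor"},
    {"poor", "acting", "dull"},
]
score_great = so_pmi(docs, "great", ["excellent"], ["poor"])
score_dull = so_pmi(docs, "dull", ["excellent"], ["poor"])
```

In this toy corpus "great" co-occurs only with "excellent" and "dull" only with "poor", so the first score comes out positive and the second negative. The smoothing term keeps the logarithm defined when two words never co-occur.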

Methodology(4/8) Thus, using two different operators, we have two SO-PMI indexes in this study: SO-PMI(AND) and SO-PMI(NEAR). The last index is SO-LSA, which calculates the strength of the semantic association between words using latent semantic analysis (LSA).

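Methodology(5/8) This section introduces the proposed neural network based approach. As shown in Fig. 2, the implementation of our method can be divided into four steps, which are described as follows.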

Methodology(6/8) Step 1: prepare data

Methodology(7/8) Step 2: calculate the SO indexes In this study, we use four SO indexes, SO-A, SO-PMI(AND), SO-PMI(NEAR), and SO-LSA, as the input neurons of the BPN. Therefore, the second step of our method is to calculate these SO indexes. Step 3: train the neural network The experimental data set is divided into training and test sets. We systematically tried different proportions (50–90%) of all examples as the training data set. Then, we begin the training process of the BPN using the training data set.
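The data handling in Steps 2 and 3 can be sketched as follows. Each example is represented by its four SO index values plus a sentiment label, and the data set is split at several training proportions. The 10% step between 50% and 90%, the random shuffling, the feature values, and the name `split_examples` are assumptions made for illustration; the slides only state the 50–90% range.

```python
import random

def split_examples(examples, train_fraction, seed=0):
    """Shuffle a copy of the examples and cut it into training and test sets."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Each example: ([SO-A, SO-PMI(AND), SO-PMI(NEAR), SO-LSA], label).
# The numeric values are placeholders, not real index scores.
examples = [([0.1 * i, -0.05 * i, 0.2, 0.3], i % 2) for i in range(20)]

# Assumed sweep: 50%, 60%, ..., 90% of all examples used for training.
splits = {f: split_examples(examples, f) for f in (0.5, 0.6, 0.7, 0.8, 0.9)}
sizes = {f: (len(train), len(test)) for f, (train, test) in splits.items()}
```

Each entry of `splits` could then be passed to a BPN trainer, with the four-element feature vector feeding the four input neurons.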

Methodology(8/8) Step 4: performance evaluation In this step, we use the test data to evaluate the performance of our NN based approach, the BPN, and the four SO indexes.

Experiments(1/8) Data preparation

Experiments(2/8) Performance evaluation The performance evaluation metrics overall accuracy (OA) and F1 have been used. In short, the common way of evaluating the performance of classifiers is based on the confusion matrix shown in Table 3.

Experiments(3/8) In general, the performance of a sentiment classifier is evaluated by the OA, the number of correctly classified test cases divided by the total number of test cases. Another popular index is F1, whose formula combines Precision and Recall: F1 = 2 × Precision × Recall / (Precision + Recall).
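As a small worked example, the function below computes OA, Precision, Recall, and F1 from the four cells of a binary confusion matrix using the standard definitions. The layout of Table 3 is not reproduced in this transcript, so the cell names (`tp`, `fp`, `fn`, `tn`) and the counts are assumptions for illustration.

```python
def evaluate(tp, fp, fn, tn):
    """Compute OA, Precision, Recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    oa = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"OA": oa, "Precision": precision, "Recall": recall, "F1": f1}

# Made-up counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
metrics = evaluate(tp=40, fp=10, fn=20, tn=30)
```

With these counts, OA is 0.70, Precision is 0.80, Recall is about 0.667, and F1 is about 0.727, which shows how F1 can differ from OA when the two kinds of error are unbalanced.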

Experiments(4/8) Experimental results This section provides the implementation results. First, we compared the effectiveness of the four SO indexes: SO-A, SO-PMI(NEAR), SO-PMI(AND), and SO-LSA. Table 4 summarizes the results of these four indexes.

Experiments(5/8) However, the performance of SO-LSA is not good enough. Therefore, we next implemented the BPN and our method. To find the best performance of the BPN and our method, we systematically tried different proportions (50–90%) of all examples as the training data set, with the rest of the samples as the test set. After the experiments, we picked the best performance.

Experiments(6/8)

Experiments(7/8) From Fig. 3 and Table 5, we found that the proposed method, including quantitative and qualitative representation, has the best OAs in Movie-1, Movie-2, EC and Blog data sets. Compared with the original BPN, the NN based method can increase the classification performance by 4–6% in these 4 data sets.

Experiments(8/8) Table 6 summarizes the average processing time of the BPN and NN based methods.

Conclusion(1/2) This study proposed an NN based approach to classify sentiment in blogospheres by combining the advantages of the BPN and SO indexes. Compared with traditional techniques such as BPN and SO indexes, the proposed approach shows its superiority not only in classification accuracy but also in training time. To obtain better or more robust results, additional experiments using different ML approaches such as Support Vector Machines (SVM) and Naïve Bayes are necessary in future research.

Conclusion(2/2) It should also be noted that our proposed method is not specific to blogs; it can be employed to classify sentiment in any text-based communication tool. We used blogs only as an example in this study. Readers can apply the proposed method to new media such as Twitter, Plurk, and Facebook. However, to test the limitations of the proposed method, future work could use different data sets or data types.

Thanks for your attention