HIN2Vec: Explore Meta-paths in Heterogeneous Information Networks for Representation Learning
Source: CIKM 2017
Advisor: JIA-LING KOH
Speaker: YI-CHENG HUNG
Date: 2018/07/03
Outline Introduction Method Experiment Conclusion
Introduction-heterogeneous network
Introduction-meta path
Introduction-goal: representation of nodes; meta paths
Outline Introduction Method Experiment Conclusion
Method-CNN
Method-CNN: Vocabulary of ICD-10 conditions
Method-CNN: Word embedding [28]. Overall architecture: settings and description of each layer. Each condition is mapped to a D-dimensional embedding vector.
Method-CNN: Convolution layer. Overall architecture: settings and description of each layer. A convolution operator with bias slides a kernel of size k over the D-dimensional embeddings to produce feature maps H. The kernel sizes of the convolution layers are set to 3, 5, and 7. Activation function: Tanh or ReLU.
Method-CNN: Pooling layer. Overall architecture: settings and description of each layer. Max pooling is the common choice in image recognition; the candidates here are max pooling and average pooling.
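A minimal PyTorch sketch of this embedding-convolution-pooling pipeline (class and hyperparameter names are hypothetical; the embedding dimension D = 128 and kernel sizes 3/5/7 follow the slides, and ReLU is picked for brevity):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """Sketch: embedding -> parallel convolutions (k = 3, 5, 7) -> max pooling.
    Names are hypothetical; D = 128 follows the slides."""

    def __init__(self, vocab_size, num_classes,
                 embed_dim=128, kernel_sizes=(3, 5, 7), num_filters=100):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # One Conv1d per kernel size; padding keeps short records usable
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
             for k in kernel_sizes]
        )
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, x):                       # x: (batch, seq_len) token ids
        e = self.embedding(x).transpose(1, 2)   # (batch, embed_dim, seq_len)
        # ReLU activation, then max pooling over the sequence dimension
        pooled = [F.relu(conv(e)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))
```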
Method-CNN: Batch normalization. Advantages: fast learning, less dependence on initial values, and control of overfitting (reducing the need for Dropout). γ and β are learnable parameters, initialized to γ = 1 and β = 0 and adjusted to appropriate values through training.
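In PyTorch, γ and β correspond to the `weight` and `bias` parameters of a BatchNorm layer and are indeed initialized to 1 and 0; a quick illustration (not from the paper):

```python
import torch.nn as nn

bn = nn.BatchNorm1d(num_features=100)
print(bn.weight[:3])  # gamma: learnable scale, initialized to 1
print(bn.bias[:3])    # beta: learnable shift, initialized to 0
```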
Method-CNN: Dropout. Purpose: to reduce overfitting. [Figure: training vs. test error curves.]
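A toy illustration of Dropout's train-time vs. eval-time behavior (not from the paper):

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # each activation is zeroed with probability 0.5
x = torch.ones(1, 6)
print(drop(x))             # training mode: survivors are scaled by 1 / (1 - p)
drop.eval()
print(drop(x))             # evaluation mode: identity, nothing is dropped
```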
Configurations of CNN
The model is built with PyTorch. Variants evaluated: Static, Dynamic, ES eval.
Parameter settings:
embedding dimension: 128
kernel sizes of the three convolution layers: 3, 5, 7
dropout probability: 0.5
maximum norm (L2 norm): 3.0
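The paper does not show code; as an illustration, one common way to enforce a max L2-norm constraint of 3.0 is to rescale weight rows after each optimizer step (the `clip_weight_norm` helper and the `config` dict are hypothetical):

```python
import torch

# Hyperparameters taken from the slide
config = dict(embed_dim=128, kernel_sizes=(3, 5, 7), dropout=0.5, max_norm=3.0)

def clip_weight_norm(weight: torch.Tensor, max_norm: float = 3.0) -> None:
    """Rescale each row of a weight matrix so its L2 norm stays within max_norm;
    typically called after every optimizer step."""
    with torch.no_grad():
        norms = weight.norm(p=2, dim=1, keepdim=True)
        weight.mul_((max_norm / norms).clamp(max=1.0))
```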
How to dynamically build a neural network? Static computation graphs (TensorFlow) vs. dynamic computation graphs (PyTorch, Chainer).
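In a dynamic framework such as PyTorch, the graph is rebuilt on every forward pass, so plain Python control flow can change the computation per input; a minimal sketch (the `DynamicNet` class is hypothetical):

```python
import torch
import torch.nn as nn

class DynamicNet(nn.Module):
    """With a dynamic graph, ordinary Python control flow can change the
    computation per input, because the graph is rebuilt on every forward pass."""

    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(16, 16)

    def forward(self, x):
        # The number of times the layer is applied depends on the input itself
        depth = int(x.abs().mean().item() * 3) + 1
        for _ in range(depth):
            x = torch.relu(self.layer(x))
        return x

net = DynamicNet()
print(net(torch.randn(4, 16)).shape)  # torch.Size([4, 16])
```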
Early stopping
Advantage: saves considerable training time while preserving performance.
First, set aside a small part of the training set as a development set and train on the rest.
At the end of each epoch, compute the accuracy on the development set.
Once performance on the development set keeps getting worse and the degradation exceeds a preset threshold, we conclude the model is probably overfitting and terminate training.
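A sketch of this early-stopping rule in Python (`run_epoch` and `evaluate` are hypothetical helpers, and the patience value is an assumed parameter):

```python
def train_with_early_stopping(model, train_loader, dev_loader, optimizer,
                              max_epochs=50, patience=3):
    """Sketch of the early-stopping rule above. run_epoch (trains one epoch)
    and evaluate (returns dev-set accuracy) are hypothetical helpers."""
    best_acc, bad_epochs = 0.0, 0
    for epoch in range(max_epochs):
        run_epoch(model, train_loader, optimizer)  # hypothetical helper
        dev_acc = evaluate(model, dev_loader)      # hypothetical helper
        if dev_acc > best_acc:
            best_acc, bad_epochs = dev_acc, 0
        else:
            bad_epochs += 1
        if bad_epochs >= patience:  # dev accuracy kept degrading: stop
            break
    return model
```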
Outline Introduction Method Experiment Conclusion
Experiment
Dataset overview
Baseline method
Experiment settings
Evaluation metrics
Experiment results
Parameter analysis
Analyzing embeddings of medical conditions
Experiment-Dataset overview
About 2 million death certificates in the U.S. from 2014.
After removing identical records and filtering out records with fewer than 3 conditions, 1,499,128 records remain.
1,610 distinct input conditions; 1,180 possible classes as the cause of death.
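As an illustration only (the file and column names are assumptions, not from the paper), the preprocessing could look like:

```python
import pandas as pd

# File and column names are assumptions for illustration only
records = pd.read_csv("death_certificates_2014.csv")
records = records.drop_duplicates()  # remove identical records
# Keep only records that list at least 3 conditions
records = records[records["conditions"].str.split().str.len() >= 3]
print(len(records))  # 1,499,128 after filtering, according to the slide
```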
Experiment-Baseline method
Feature extraction → classifiers:
BoW (bag of words) → Naive Bayes, Logistic Regression
Word embeddings → shallow neural classifiers
Architectures of the shallow classifiers:
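A hedged sketch of the BoW baselines with scikit-learn (the toy records and pipeline choices are assumptions, not the paper's exact setup):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy records: each is a sequence of ICD-10 codes, labeled with a cause of death
texts = ["I10 E11 J44", "C34 J44 I10", "E11 I10 N18"]
labels = ["I10", "C34", "N18"]

bow_nb = make_pipeline(CountVectorizer(), MultinomialNB())
bow_lr = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
bow_nb.fit(texts, labels)
bow_lr.fit(texts, labels)
print(bow_lr.predict(["I10 E11 J44"]))
```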
Experiment-Settings
Dataset split (training : development : test) = 7.9 : 0.1 : 1
Hardware: BoW baselines on a CPU with 60 GB RAM; CNN and shallow models on an NVIDIA K80 GPU
Mini-batch size: 64; number of epochs: 2
Experiment-Evaluation Metrics
Accuracy (ACC), cross-entropy loss, F1 score, and Cohen's kappa (κ = 1 indicates perfect agreement).
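All four metrics are available in scikit-learn (assumed here for illustration; the paper does not state its evaluation tooling):

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score, log_loss

# Toy labels, just to show the calls
y_true = [0, 1, 2, 1, 0]
y_pred = [0, 1, 1, 1, 0]
proba = [[.8, .1, .1], [.1, .8, .1], [.2, .5, .3], [.1, .7, .2], [.6, .2, .2]]

print(accuracy_score(y_true, y_pred))             # ACC
print(log_loss(y_true, proba, labels=[0, 1, 2]))  # cross-entropy loss
print(f1_score(y_true, y_pred, average="macro"))  # F1 score
print(cohen_kappa_score(y_true, y_pred))          # Cohen's kappa
```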
Experiment-Cohen’s kappa
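For reference, the standard definition of Cohen's kappa, where p_o is the observed agreement between the two label assignments and p_e is the agreement expected by chance:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
```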
Classification Results
Experiment-Parameter Analysis The base model is the standard static version of the CNN
Experiment-Analyzing Embeddings of Medical Conditions (a side product of the model)
CONCLUSION
This paper showed how a modern deep learning architecture (CNN) can be adapted to identify the cause of death.
The model shows significant improvement over the traditional baselines.
It can handle even larger-scale datasets than traditional methods.
It provides human-understandable interpretations of the model.
The modern CNN architecture adapts well to this dataset.