R Tutorial: Graph 2 (Plotting 2). Instructor: 羅琪

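Bar plot
A bar plot (also called a bar chart) is commonly used to show the distribution of categorical data. It uses bars of equal width, and the length of each bar represents the frequency of the corresponding category. The horizontal axis shows the categories and the vertical axis shows the frequencies or percentages. Use the barplot() function to create a bar plot.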

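Motor Trend car data (Motor Trend Car Road Tests)
The data were taken from the 1974 US magazine Motor Trend and cover fuel consumption and 10 aspects of automobile design and performance for 32 cars (models).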

Motor Trend car data: the data set has 11 variables and 32 observations
mpg   Miles/(US) gallon (miles per gallon)
cyl   Number of cylinders
disp  Displacement (cu.in.)
hp    Gross horsepower
drat  Rear axle ratio
wt    Weight (1000 lbs)
qsec  1/4 mile time
vs    Engine type (0 = V-shaped, 1 = straight)
am    Transmission (0 = automatic, 1 = manual)
gear  Number of forward gears
carb  Number of carburetors

Motor Trend Car Road Tests

Simple Bar Plot
> attach(mtcars)
> counts <- table(gear)   # frequency table of the number of forward gears
> counts
> barplot(counts, main="Car Distribution", xlab="Number of Gears")

Simple Bar Plot (bar borders)
> barplot(counts, main="Car Distribution", xlab="Number of Gears", border="red")
# border="red" draws the bar borders in red
# border=NA omits the bar borders

Simple Bar Plot (bar colors and labels)
> barplot(counts, main="Car Distribution", col=c("red", "blue", "green"),
+         names.arg=c("3 Gears", "4 Gears", "5 Gears"))
# col: the color of each bar
# names.arg: the label of each bar

Simple Bar Plot (horizontal)
> barplot(counts, main="Car Distribution", horiz=TRUE,
+         names.arg=c("3 Gears", "4 Gears", "5 Gears"))
# horiz=TRUE draws horizontal bars, horiz=FALSE draws vertical bars
# names.arg: the label of each bar

> count1 <- counts/32*100   # convert the counts to percentages (32 cars in total)
> count1
> barplot(count1, main="Car Distribution", horiz=TRUE,
+         names.arg=c("3 Gears", "4 Gears", "5 Gears"), xlim=c(0,50))
# horiz=TRUE draws horizontal bars, horiz=FALSE draws vertical bars
# names.arg: the label of each bar
# xlim=c(0,50) sets the x-axis range to 0 to 50
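The percentages above divide by the hard-coded sample size 32. As a minimal sketch of the same calculation, prop.table() derives the total from the table itself; the variable name pct and the axis label are chosen here only for illustration.

```r
# Frequency table of forward gears from the built-in mtcars data
counts <- table(mtcars$gear)

# prop.table() divides every cell by the table total,
# so the sample size does not have to be typed in by hand
pct <- prop.table(counts) * 100

barplot(pct, main = "Car Distribution", horiz = TRUE,
        names.arg = c("3 Gears", "4 Gears", "5 Gears"),
        xlim = c(0, 50), xlab = "Percent")
```

Using mtcars$gear directly also avoids relying on attach(mtcars).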

Grouped Bar Plot
> counts <- table(vs, gear)   # cross-tabulation of vs by gear
> counts
> barplot(counts, main="Car Distribution by Gears and VS", xlab="Number of Gears",
+         col=c("darkblue","red"), legend=rownames(counts), beside=TRUE)
# col: the color of each bar within a group
# legend=rownames(counts) uses the row names of counts (the vs levels) in the legend
# beside=TRUE places the bars of each group side by side

> count1 <- counts
> count1[1,] <- counts[1,]/sum(counts[1,])   # conditional proportions of gear given vs=0
> count1[2,] <- counts[2,]/sum(counts[2,])   # conditional proportions of gear given vs=1
> count1
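The row-by-row division above can also be written in one step. The sketch below uses prop.table() with margin = 1, which divides each row by its own total; the name cond is hypothetical and not part of the slides.

```r
# Cross-tabulation of engine type (vs) by number of forward gears
counts <- table(mtcars$vs, mtcars$gear, dnn = c("vs", "gear"))

# margin = 1 divides each row by its row total,
# giving the distribution of gear conditional on vs
cond <- prop.table(counts, margin = 1)

barplot(cond, main = "Car Distribution of Gears by VS",
        xlab = "Number of Gears", col = c("darkblue", "red"),
        legend.text = c("V", "S"), beside = TRUE)
```

Each row of cond sums to 1, so the bars of one color together account for all cars with that engine type.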

> barplot(count1, main="Car Distribution of Gears by VS", xlab="Number of Gears",
+         col=c("darkblue","red"), legend=c("V", "S"), beside=TRUE)
# col: the color of each bar within a group
# legend=c("V", "S") supplies our own legend labels
# beside=TRUE places the bars of each group side by side

Stacked Bar Plot
> counts <- table(vs, gear)
> counts
> barplot(counts, main="Car Distribution by Gears and VS", xlab="Number of Gears",
+         col=c("darkblue","red"), legend=rownames(counts))
# col: the color of each segment within a bar
# legend=rownames(counts) uses the row names of counts in the legend
# beside=FALSE (the default when beside is not given) stacks the bars of each group on top of each other

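> par(las=2)             # make the axis labels perpendicular to the axis
> par(mar=c(5,8,4,2))    # widen the left (y-axis) margin
> counts <- table(gear)
> barplot(counts, main="Car Distribution", horiz=TRUE,
+         names.arg=c("3 Gears", "4 Gears", "5 Gears"), cex.names=0.8)
# names.arg: custom label for each bar
# cex.names=0.8 draws the bar labels at 0.8 times the default size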

las (the style of axis labels)
las=0: always parallel to the axis [default]
las=1: always horizontal
las=2: always perpendicular to the axis
las=3: always vertical
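To compare the four las settings on one page, the following sketch draws the same bar plot once per value. It is only an illustration; the panel titles and the loop variable s are not from the slides.

```r
counts <- table(mtcars$gear)

# 2 x 2 grid of plots; op keeps the previous settings so they can be restored
op <- par(mfrow = c(2, 2))
for (s in 0:3) {
  # las is passed on to the axis drawing and controls the label orientation
  barplot(counts, las = s, main = paste("las =", s),
          names.arg = c("3 Gears", "4 Gears", "5 Gears"))
}
par(op)  # restore the previous layout
```

Passing las to barplot() affects only that plot, whereas par(las=2) changes the setting for every later plot on the device.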

Figure: the same plot drawn with las=0, las=1, las=2, and las=3.

mar A numerical vector of the form c(bottom, left, top, right) which gives the number of lines of margin to be specified on the four sides of the plot. The default is c(5, 4, 4, 2) + 0.1
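Because par() changes stay in effect for later plots, it can help to save and restore the old values. A minimal sketch, assuming longer bar labels than in the slides (the label texts and the names default_mar and op are hypothetical):

```r
default_mar <- par("mar")   # remember the margins currently in effect

# 8 lines of margin on the left leave room for long horizontal labels;
# par() returns the old values of the parameters it changes
op <- par(mar = c(5, 8, 4, 2) + 0.1, las = 1)

counts <- table(mtcars$gear)
barplot(counts, horiz = TRUE, main = "Car Distribution",
        names.arg = c("3 forward gears", "4 forward gears", "5 forward gears"),
        cex.names = 0.8)

par(op)                     # restore the previous margins and label style
```

After par(op) the margins are back to what they were before the plot, so later figures are not affected.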

Figure: the same plot drawn with mar=c(5, 4, 4, 2) and with mar=c(5, 8, 4, 2).

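Pie Chart
A pie chart is commonly used to show the percentage composition of a categorical variable: the pie is divided into sectors according to the relative frequency of each group.
It is also often used to show the proportion between the parts and the whole of the data. The area of the full circle represents the whole, and the areas of the sectors obtained by dividing the circle represent the parts. Use the pie(x, labels=) function to create a pie chart.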

Simple Pie Chart
> slices <- c(10, 12, 4, 16, 8)                                 # count for each slice
> lbls <- c("US", "UK", "Australia", "Germany", "France")       # label for each slice
> pie(slices, labels=lbls, main="Pie Chart of Countries")       # draw the pie chart

Simple Pie Chart (with percentages)
> slices <- c(10, 12, 4, 16, 8)                                 # count for each slice
> lbls <- c("US", "UK", "Australia", "Germany", "France")       # label for each slice
> pct <- round(slices/sum(slices)*100)                          # percentage of each slice
> lbls <- paste(lbls, pct)                                      # append the percentage to each label
> lbls <- paste(lbls, "%", sep="")                              # append the % sign
> pie(slices, labels=lbls, col=rainbow(length(lbls)), main="Pie Chart of Countries")
# col=rainbow(length(lbls)) uses rainbow colors; length(lbls), the number of labels, sets how many colors are generated

Simple Pie Chart (larger text)
> pie(slices, labels=lbls, cex=2, col=rainbow(length(lbls)),
+     main="Pie Chart of Countries", cex.main=2)
# cex=2 draws the labels at twice the default size
# cex.main=2 draws the title at twice the default size

Simple Pie Chart (labels on two lines)
> slices <- c(10, 12, 4, 16, 8)                                 # count for each slice
> lbls <- c("US", "UK", "Australia", "Germany", "France")
> pct <- round(slices/sum(slices)*100)                          # percentage of each slice, rounded to an integer
> lbls <- paste(lbls, "\n", pct, sep="")                        # label, then a line break, then the percentage
> lbls <- paste(lbls, "%", sep="")                              # append the % sign
> pie(slices, labels=lbls, col=rainbow(length(lbls)),
+     main="Pie Chart of Countries \n (with percentage)")
# \n starts a new line

3D Pie Chart
> install.packages("plotrix")                                   # install the plotrix package
> library(plotrix)                                              # load the plotrix package
> slices <- c(10, 12, 4, 16, 8)                                 # count for each slice
> lbls <- c("US", "UK", "Australia", "Germany", "France")       # label for each slice
> pie3D(slices, labels=lbls, explode=0.1, main="Pie Chart of Countries ")
# explode=0.1 pulls the 3D slices apart by 0.1

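Donut plot (ring chart)
> slices <- c(10, 12, 4, 16, 8)                                 # count for each slice
> lbls <- c("US", "UK", "Australia", "Germany", "France")
> pct <- round(slices/sum(slices)*100)                          # percentage of each slice, rounded to an integer
> lbls <- paste(lbls, pct)                                      # append the percentage to each label
> lbls <- paste(lbls, "%", sep="")                              # append the % sign
> doughnut(slices, lbls, main="Pie Chart of Countries\n (with percentage)")   # draw the ring chart
# doughnut() is not a base R function and the slides do not show where it is defined; it must be defined or loaded before this call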
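Since the slides do not show the definition of doughnut(), here is one possible way to get a ring chart with base graphics only. This is a sketch of a substitute technique, not the function used in the slides: it draws an ordinary pie() and then covers the centre with a white circle.

```r
slices <- c(10, 12, 4, 16, 8)
lbls <- c("US", "UK", "Australia", "Germany", "France")
pct <- round(slices / sum(slices) * 100)   # percentage of each slice
lbls <- paste0(lbls, " ", pct, "%")        # e.g. "US 20%"

pie(slices, labels = lbls, col = rainbow(length(lbls)), border = "white",
    main = "Donut Chart of Countries\n(with percentage)")

# pie() draws the disc with radius 0.8 by default, centred at (0, 0);
# a white circle of radius 0.4 (in plot coordinates) hides the centre
symbols(0, 0, circles = 0.4, inches = FALSE, add = TRUE,
        bg = "white", fg = "white")
```

The hole radius 0.4 is an arbitrary choice; any value below 0.8 leaves a visible ring.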

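Histograms
The histogram is the most common graphical display of quantitative data. Before a histogram is constructed, the data must first be grouped into classes.
Do not use too many or too few classes: the number of classes should be between 5 and 20, depending on the amount of data. After grouping, build the frequency distribution, relative frequency distribution, or percent frequency distribution.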

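Histograms
The variable of interest is placed on the horizontal axis of the histogram, and the frequency, relative frequency, or percent frequency is placed on the vertical axis.
The frequency, relative frequency, or percent frequency of each class is drawn as a rectangle whose width is the class width and whose height is the corresponding frequency, relative frequency, or percent frequency. Use the following functions to create a histogram:
hist(x)
hist(x, nclass=n)       # number of classes
hist(x, breaks=b, ...)  # class boundaries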

Difference between a histogram and a bar plot (figure: a bar plot next to a histogram)

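Histograms
One of the most important activities of ecologists is estimating population density from the recorded distances to observed organisms (Buckland et al., 2001). The way the distance data are grouped affects how the data are adjusted and, in turn, the population density later used for inference. For endangered and rare species, the choice of grouping can influence decisions such as conservation actions and court rulings. Circular plots are used to census birds: an observer sits at the center of an imaginary circle of radius r and records the distance and the species detected. In a study of bird density in the Sierra Nevada, the data are for the Nashville warbler (黄喉蟲森鶯).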

Histograms
Below are 84 observations from 20 such replicates, each recorded within a radius of 50 m:

Histograms
> distance <- c(15, 16, 10, 8, 4, 2, 35, 7, 5, 14, 14, 0, 35, 31, 0, 10, 36, 16, 5, 3, 22,
+               7, 55, 24, 42, 29, 2, 4, 14, 29, 17, 1, 3, 17, 0, 10, 45, 10, 9, 22, 11, 16,
+               10, 22, 48, 18, 41, 4, 43, 13, 7, 7, 8, 9, 18, 2, 5, 6, 48, 28, 9, 0, 54,
+               14, 21, 23, 24, 35, 14, 4, 10, 18, 14, 21, 8, 14, 10, 6, 11, 22, 1, 18, 30, 39)
# enter the data

Histograms
> par(mfrow=c(2,2))   # arrange 4 plots on one page in 2 rows and 2 columns
> hist(distance, xlab=" ", ylab="frequency", col="gray90")
> hist(distance, xlab=" ", main="breaks =22", ylab="", breaks=22, col="darkgreen")
> hist(distance, xlab="distance(m)", ylab="frequency", breaks=seq(0,60,5), col="gray90")
> hist(distance, xlab="distance(m)", main="breaks =6", ylab="", breaks=6, col="gray90", las=2)
By default hist uses Sturges' rule to compute the number of bins automatically; the breaks argument lets us specify the grouping ourselves.

Histograms
breaks is one of:
a vector giving the breakpoints between histogram cells,
a function to compute the vector of breakpoints,
a single number giving the number of cells for the histogram,
a character string naming an algorithm to compute the number of cells (see 'Details'),
a function to compute the number of cells.
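The different forms of breaks can be tried without drawing anything by setting plot = FALSE. The sketch below uses mtcars$mpg instead of the distance data so that it runs on its own; the object names h_vec, h_num, h_str, and h_fun and the square-root rule in the last call are illustrative choices, not from the slides.

```r
x <- mtcars$mpg   # 32 values between 10.4 and 33.9

# 1. A vector of breakpoints: used exactly as given
h_vec <- hist(x, breaks = seq(10, 35, by = 5), plot = FALSE)

# 2. A single number: treated as a suggestion, since hist() picks
#    "pretty" breakpoints close to the requested number of cells
h_num <- hist(x, breaks = 12, plot = FALSE)

# 3. A character string naming an algorithm
h_str <- hist(x, breaks = "Scott", plot = FALSE)

# 4. A function that computes the number of cells from the data
h_fun <- hist(x, breaks = function(v) ceiling(sqrt(length(v))), plot = FALSE)

# The default rule (Sturges): ceiling(log2(n) + 1) cells
n_sturges <- nclass.Sturges(x)

# Number of cells actually used in each case
sapply(list(vector = h_vec, number = h_num, string = h_str, fun = h_fun),
       function(h) length(h$counts))
```

Only the vector form guarantees the exact class boundaries, which matters when several histograms need to share the same bins.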

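Histograms
The following data are on plant growth (Dobson, 1983). The data set compares the yield, measured as the dried weight of the plants, under three conditions (one control group and two treatment groups). Weights in the control group mostly lie between 5 and 5.5, weights in treatment group 1 mostly lie between 4 and 5, and weights in treatment group 2 mostly lie between 5.25 and 5.75.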

Histograms
> data(PlantGrowth)
> head(PlantGrowth)   # show the first 6 rows (columns weight and group; all six rows are in group ctrl)
> dim(PlantGrowth)
[1] 30  2
> attach(PlantGrowth)

Histograms
> par(mfrow = c(1, 3))   # arrange 3 plots on one page in 1 row and 3 columns
> a <- hist(weight[group == 'ctrl'], xlim = c(3, 6.5), ylim = c(0, 4),
+           xlab = '', ylab = 'frequency', main = 'control', col = 'gray90')
> b <- hist(weight[group == 'trt1'], xlab = 'weight', ylab = '', main = 'treatment 1')
> c <- hist(weight[group == 'trt2'], xlab = '', ylab = '', main = 'treatment 2')
# The calls for b and c are cut off after main= in the slides; any further arguments are not shown.

a is a list
> a
$breaks     # break points
[1] 4.0 4.5 5.0 5.5 6.0 6.5
$counts     # frequencies (values not shown)
$density    # probability densities (values not shown)
$mids       # midpoint of each bar (values not shown)
$xname
[1] "weight[group == \"ctrl\"]"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"

Histograms (y-axis is probability density)
> densa <- density(weight[group == 'ctrl'])   # kernel density estimate for the control group
> densa   # prints the call, the data (10 obs.), the bandwidth 'bw', and summaries of x and y
> hist(weight[group == 'ctrl'], xlim=range(densa$x), xlab='', main='control',
+      ylab='density', col='gray90', probability=TRUE)
> lines(densa)   # overlay the density curve

> densb <- density(weight[group == 'trt1'])   # kernel density estimate for treatment 1
> densb   # 10 obs.; x: Min. 2.748, Median 4.810, Mean 4.810, Max. 6.872 (the other printed values are not shown)
> hist(weight[group == 'trt1'], xlim=range(densb$x), xlab='weight', ylab='',
+      main='treatment 1', col='gray90', probability=TRUE)
> lines(densb)   # overlay the density curve

> densc <- density(weight[group == 'trt2'])   # kernel density estimate for treatment 2
> densc   # 10 obs.; x: Min. 4.326, Median 5.615, Mean 5.615, Max. 6.904 (the other printed values are not shown)
> hist(weight[group == 'trt2'], xlim=range(densc$x), xlab='', ylab='',
+      main='treatment 2', col='gray90', probability=TRUE)
> lines(densc)   # overlay the density curve

The area of the gray region equals 1.
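The claim that the gray area equals 1 can be checked from the object that hist() returns: each bar has height density and width equal to the difference between consecutive breaks. A short sketch (the names w, h, and area are chosen for illustration):

```r
# Weights of the control group from the built-in PlantGrowth data
w <- PlantGrowth$weight[PlantGrowth$group == "ctrl"]

# probability = TRUE puts density, not counts, on the y-axis
h <- hist(w, probability = TRUE, col = "gray90",
          main = "control", xlab = "weight", ylab = "density")

# Area of the histogram = sum over bars of (height x width)
area <- sum(h$density * diff(h$breaks))
area
```

The same check works for any histogram drawn on the density scale, including ones with unequal class widths.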

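Histograms (with a normal curve)
> mtcars                                        # Motor Trend car data
> attach(mtcars)                                # attach the Motor Trend car data
> hist(mpg)                                     # simple histogram
> hist(mpg, breaks=12, col="red")               # red histogram with about 12 bars
> x <- mpg
> h <- hist(x, breaks=10, col="red", xlab="Miles Per Gallon",
+           main="Histogram with Normal Curve")   # draw the histogram
> xfit <- seq(min(x), max(x), length=40)        # 40 points from the minimum to the maximum of x
> yfit <- dnorm(xfit, mean=mean(x), sd=sd(x))   # normal density at those points
> yfit <- yfit*diff(h$mids[1:2])*length(x)      # density times bin width is a probability; times n converts it to a count
> lines(xfit, yfit, col="blue", lwd=2)          # add the normal curve in blue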

> h   # a list with components $breaks, $counts, $density, $mids, $xname ("x"), $equidist (TRUE), and class "histogram"
> h$mids[1:2]   # midpoints of the first two bars; their difference is the bin width
[1] 11 13
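The rescaling by bin width and sample size is needed only because the histogram above shows counts. As an alternative sketch (not the approach used in the slides), the histogram can be drawn on the density scale with freq = FALSE, in which case the normal density can be overlaid without rescaling. The names mpg, m, s, and bin_width are local to this example.

```r
mpg <- mtcars$mpg
m <- mean(mpg)
s <- sd(mpg)

# freq = FALSE puts density on the y-axis
h <- hist(mpg, breaks = 10, freq = FALSE, col = "red",
          xlab = "Miles Per Gallon",
          main = "Histogram with Normal Curve (density scale)")

# curve() evaluates the expression over a grid of x values;
# m and s are computed beforehand so they refer to the data, not the grid
curve(dnorm(x, mean = m, sd = s), add = TRUE, col = "blue", lwd = 2)

# Relation between the two scales for equal-width bins:
# count = density x bin width x n
bin_width <- diff(h$breaks[1:2])
h$density * bin_width * length(mpg)
```

The last line reproduces h$counts, which is the same conversion applied to yfit in the count-scale version.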

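Five-number summary
After obtaining a data set, the statistics most often used to describe the observations quickly are the five-number summary, which summarizes the data with the following five numbers:
Minimum
First quartile (Q1) (25th percentile)
Median (second quartile, Q2) (50th percentile)
Third quartile (Q3) (75th percentile)
Maximum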
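In R the five numbers can be obtained directly. The sketch below shows two common routes for mtcars$mpg; the names five and q are illustrative.

```r
x <- mtcars$mpg

# Tukey's five-number summary:
# minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(x)
five

# Percentile-based version: 0%, 25%, 50%, 75%, 100%
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
q

# summary() reports the same quartiles plus the mean
summary(x)
```

The hinges returned by fivenum() and the quartiles returned by quantile() use slightly different definitions, so Q1 and Q3 can differ a little between the two.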

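Box plot
A box plot is a graph drawn from the five-number summary.
The box plot is also called a box-and-whisker plot; it takes its name from its box-like shape. The key to drawing a box plot is the interquartile range, IQR = Q3 - Q1. The box plot was invented in 1977 by the well-known American statistician John Tukey. It displays the maximum, minimum, median, lower quartile, and upper quartile of a data set.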

Box plot example
Labels in the figure: minimum and maximum excluding extreme values, extreme outlier, mild outlier, Q1, Q3, median.
The data show:
Minimum = 0.5
Lower quartile (Q1) = 7
Median = 8.5
Upper quartile (Q3) = 9
Maximum = 10
Mean = 8 (at the center of the box)
Interquartile range = IQR = Q3 - Q1 = 9 - 7 = 2
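The slide labels mild and extreme outliers but does not state the cut-offs. A common convention, assumed here, places the inner fences 1.5 x IQR and the outer fences 3 x IQR beyond the quartiles. The sketch applies that convention to the quartiles of the example; the function name classify is hypothetical.

```r
q1 <- 7
q3 <- 9
iqr <- q3 - q1                                   # interquartile range

# Assumed convention: 1.5 x IQR for mild, 3 x IQR for extreme outliers
inner <- c(q1 - 1.5 * iqr, q3 + 1.5 * iqr)       # inner fences
outer <- c(q1 - 3 * iqr, q3 + 3 * iqr)           # outer fences

classify <- function(v) {
  if (v < outer[1] || v > outer[2]) {
    "extreme outlier"
  } else if (v < inner[1] || v > inner[2]) {
    "mild outlier"
  } else {
    "not an outlier"
  }
}

classify(0.5)   # the minimum of the example
classify(10)    # the maximum of the example
```

Under this convention the fences are 4 and 12 (inner) and 1 and 15 (outer), so the minimum 0.5 falls outside the outer fence while the maximum 10 lies inside the inner fences.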

Box plot
> par(mfrow=c(1,1))
> mtcars           # Motor Trend car data
> attach(mtcars)   # attach the Motor Trend car data
> boxplot(mpg, horizontal=TRUE, xlab="Miles Per Gallon")

Box plot
> boxplot(mpg, horizontal=TRUE, xlab="Miles Per Gallon", col="pink")
# horizontal=TRUE draws a horizontal box plot

Grouped box plot (side-by-side box plot)
> boxplot(mpg~cyl, data=mtcars, main="Car Mileage Data",
+         xlab="Number of Cylinders", ylab="Miles Per Gallon")

> boxplot(mpg~cyl, data=mtcars, main="Car Mileage Data",
+         xlab="Number of Cylinders", ylab="Miles Per Gallon", col="gold")

> boxplot(mpg~cyl, main="Mileage by Car Weight", yaxt="n", xlab="Mileage",
+         horizontal=TRUE, col=terrain.colors(3))   # 3 colors from the terrain palette
> legend("topright", inset=.05, title="Number of Cylinders", c("4","6","8"),
+        fill=terrain.colors(3), horiz=TRUE)
# the legend is placed at the top right, inset from the edges by 0.05, with its entries laid out horizontally (horiz=TRUE)

> boxplot(mpg~cyl*am, data=mtcars, col=(c("gold","darkgreen","red")),
+         main="Miles Per Gallon", xlab="cylinders and transmission")
# box plots grouped by number of cylinders and transmission (0 = automatic, 1 = manual)

Normal probability plot
> rad <- read.csv("c:/RData/radiation.csv")   # microwave oven radiation data
> rad                                         # view the data
> names(rad)                                  # variable names
> attach(rad)                                 # attach the data
> par(mfrow=c(1,2))                           # arrange 2 plots on one page in 1 row and 2 columns
> qqnorm(rad[,1], main="Normal Q-Q Plot(door closed)")   # Q-Q plot for the radiation data (door closed)
> qqline(rad[,1])                             # add a reference line through the first and third quartiles
> qqnorm(rad[,2], main="Normal Q-Q Plot(door open)")     # Q-Q plot for the radiation data (door open)
> qqline(rad[,2])                             # add a reference line through the first and third quartiles

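Normal probability plot: most of the points do not lie close to a straight line, so the assumption of a normal distribution is not satisfied.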

Normal probability plot
> srad <- sqrt(rad)   # apply a square-root transformation to both variables
> qqnorm(srad[,1], ylab="sqrt(door closed radiation) quantiles",
+        main="Normal Q-Q Plot(door closed)")
> qqline(srad[,1])
> qqnorm(srad[,2], ylab="sqrt(door open radiation) quantiles",
+        main="Normal Q-Q Plot(door open)")
> qqline(srad[,2])

Normal probability plot after applying a square-root transformation to both variables
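The radiation file is not distributed with R, so the steps cannot be rerun as given. The sketch below repeats the same before-and-after comparison on simulated data; the values are hypothetical and are not the microwave measurements. The data are constructed so that their square root is normally distributed.

```r
set.seed(1)                            # make the simulated values reproducible
z <- rnorm(42, mean = 5, sd = 1)       # hypothetical values on the square-root scale
x <- z^2                               # simulated right-skewed "raw" measurements
sx <- sqrt(x)                          # square-root transformation

op <- par(mfrow = c(1, 2))             # two plots side by side
qqnorm(x, main = "Normal Q-Q Plot (raw)")
qqline(x)                              # line through the first and third quartiles
qqnorm(sx, main = "Normal Q-Q Plot (sqrt)")
qqline(sx)
par(op)

# Correlation between theoretical and sample quantiles:
# values close to 1 mean the points lie close to a straight line
qq <- qqnorm(sx, plot.it = FALSE)
r <- cor(qq$x, qq$y)
r
```

The correlation is only an informal summary of straightness; the plot itself remains the main diagnostic.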

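iPlots
The iplots package provides interactive mosaic plots, bar plots, box plots, parallel plots, scatter plots, and histograms that can be linked and color-brushed.
iplots runs through the Java GUI for R and, according to the slides, works only in 32-bit.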

iPlots
> install.packages("iplots", dep=TRUE)   # install iplots
# Create some linked plots
> library(iplots)
> cyl.f <- factor(mtcars$cyl)            # convert the variable to a factor
> gear.f <- factor(mtcars$gear)          # convert the variable to a factor
> attach(mtcars)
> ihist(mpg)                             # histogram
> ibar(carb)                             # bar plot
> iplot(mpg, wt)                         # scatter plot
> ibox(mtcars[c("qsec","disp","hp")])    # box plots
> ipcp(mtcars[c("mpg","wt","hp")])       # parallel coordinates plot
> imosaic(cyl.f, gear.f)                 # mosaic plot

Those who give the most are also those who gain the most. ~Let us encourage one another~
