1
Tesseract OCR
Department of Computer Science, 4th year (資科四), 李昱安
2
About Tesseract OCR
3
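About Tesseract OCR
Tesseract OCR is an OCR engine developed by researchers at HP. At the time, it placed in the top three of the OCR accuracy competition held by the University of Nevada, Las Vegas (UNLV). HP released it as open source in 2005, and Google has maintained it since 2006. Google claims that Tesseract OCR is the most accurate open source OCR engine.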
4
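About Tesseract OCR: supports more than 30 scripts/languages; can analyze page layout; supports vertical text.
Input images must be: uncompressed TIF format; white background; text may be in full color.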
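The uncompressed-TIF requirement above can be checked before a file is handed to the engine. The following is a minimal sketch and not part of Tesseract itself: it reads the Compression tag (tag 259) from the first image directory of a classic TIFF byte string, where the value 1 means "no compression". The function names `tiff_compression` and `is_uncompressed_tiff` are hypothetical, and BigTIFF and multi-page files are deliberately not handled.

```python
import struct

COMPRESSION_TAG = 259   # TIFF 6.0 "Compression" tag number
NO_COMPRESSION = 1      # tag value meaning "uncompressed"


def tiff_compression(data: bytes) -> int:
    """Return the Compression value of the first image in a classic TIFF."""
    # The first two bytes give the byte order of every later field.
    if data[:2] == b"II":
        endian = "<"    # little-endian
    elif data[:2] == b"MM":
        endian = ">"    # big-endian
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")

    # Next come the magic number 42 and the offset of the first directory.
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file: bad magic number")

    # A directory is a 2-byte entry count followed by 12-byte entries:
    # tag (2), field type (2), value count (4), value or offset (4).
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        entry_offset = ifd_offset + 2 + 12 * i
        (tag,) = struct.unpack_from(endian + "H", data, entry_offset)
        if tag == COMPRESSION_TAG:
            # Compression is a single SHORT stored in the value field itself.
            (value,) = struct.unpack_from(endian + "H", data, entry_offset + 8)
            return value

    # The TIFF specification defaults to "no compression" if the tag is absent.
    return NO_COMPRESSION


def is_uncompressed_tiff(data: bytes) -> bool:
    """True if the first image in the TIFF data is stored uncompressed."""
    return tiff_compression(data) == NO_COMPRESSION
```

For a file on disk, the bytes can be obtained with `open(path, "rb").read()` and passed to `is_uncompressed_tiff`.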
5
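About Tesseract OCR
The outline of each character is approximated by a polygon, and the polygon's x-position, y-position, direction, and length are then used as a four-dimensional feature vector.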
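As a rough illustration of the feature described above, the sketch below turns a polygon into one (x-position, y-position, direction, length) vector per edge. The details are assumptions, since the slide does not specify them: the position is taken as the edge midpoint, the direction as an angle in radians from `math.atan2`, and no normalization is applied. The function name `segment_features` is hypothetical and this is not Tesseract's actual implementation.

```python
import math


def segment_features(polygon):
    """Return one (x, y, direction, length) tuple per edge of a closed polygon.

    `polygon` is a list of (x, y) vertices; the last vertex is joined back
    to the first one.
    """
    features = []
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]   # wrap around to close the outline
        dx, dy = x1 - x0, y1 - y0
        features.append((
            (x0 + x1) / 2,              # x-position: edge midpoint (assumed)
            (y0 + y1) / 2,              # y-position: edge midpoint (assumed)
            math.atan2(dy, dx),         # direction in radians (assumed)
            math.hypot(dx, dy),         # length of the edge
        ))
    return features
```

For a square with vertices `[(0, 0), (2, 0), (2, 2), (0, 2)]`, this gives four vectors of length 2 whose directions differ by a quarter turn.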
6
About Tesseract OCR
7
We've already sorted through the specs, and laid our hands on its rather sexy frame, now Fujifilm's offering up a more palatable price tag than we expected for its throwback X10 shooter. Starting sometime in early October, the X100's more affordable little brother will set nostalgic point-and-shooters back $ - about $100 bones less than the estimated $715 to $860 ballpark we threw out back in September. If you'll recall, the X10 packs a 12 megapixel EXR CMOS sensor, f/2-2.8, mm manual zoom lens, up to 12,800 ISO sensitivity, 1080p video, an optical viewfinder, and pop-up flash. No word yet on a final release date. Full PR after the break.
8
weve almay 591191 u1¢911g\1 me 91995, ma had 9111 mas
weve almay u1¢911g\1 me 91995, ma had 9111 mas num my mme, 119wn1j;aLm~s 9ef9119g up P1199 mg um we f91 19 u1¢9w\199k x19 =\199¢91. sm-ang an may me x199's me \119u191 wan =\199¢91s\199k $ _ s199\19119=1& um me sm 19 sa99\19up9¢kw9 u1¢9w99¢ s9p¢ \191. 1fy911'u Emu, me x nxncmos , f/29.9, 2s»112mm , up 19 12,999 xso sensiavify, v;a99, 99 9p¢1<=1 v;9ws9a91, ma P99911 mash. N9w91ay919111s91\191m9a=¢9. n1u1m9n91u19\1m11_
9
We‘ve alrmdy sorted through the specs, and laid our hands on is rather sexy fimlne, now Fujifi.l.m‘soffering up 1 more palatable price mg than we expected for is Lhrawback X10 shooter. Starlingsometime in mrly October, the X1oo's more afifordable little brother will set uostz.lp'c p0inl~and—shoolelsback s _ about $1ooboues 1§ than the eflimaled $115 me $86oba.l.lparkwe Lhrewoulbackin Seplbet. Ifyo\|'l.l lemll, the X10 packs 1 12 megppixel EXRCMOS sensor, f/2fl.B, 2E>n2m|:umm zoom lens, up to 12, sensitivity, 10801; video, an optiml viewfinder, and popllp flash.Nowordyetouzfinallelmsedale. F\|.|.lPRafien.heb!m.k.
10
Adapting the Tesseract Open Source OCR Engine forMultilingual OCRRay Smith Daria Antonova Dar-Shyang LeeGoogle |nc., 1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA.AbstractWe describe eforts to adapt the Tesseract open source OCRengine for multiple scripts and languages. Eflort has beenconcentrated on enabling genmic niulti-lingual operation suchthat negligible eust0nti:ati0n is required far a new languagebevmrrlprorirling a cmpus aftert.
11
為了落實國民教育的精砷 ,也為了提昇國家人力素質促進競爭力教育部多年以來 一 直致力於推動教育普及化及延長國民教育 c
12
The (quick) [brown] {fox} jumps. Over the $43,456
The (quick) [brown] {fox} jumps!Over the $43, <lazy> #90 dog& duck/goose, as 12.5% of from is spam.Der ,,schnelle” braune Fuchs springtfiber den faulen Hund. Le renard brun<<rapide» saute par-dessus le chienparesseux. La volpe marrone rapidasalta sopra il cane pigro. El zorromarrén répido salta sobre el perroperezoso. A raposa marrom rzipidasalta sobre 0 cfio preguieoso.
13
就在十月初的時候 7 這間 日本著名的相機製造商總算肯透露其定價將訂在 US$599_99 (約 N丁$18,300 、
HK$4,700) 之譜 7 坦白靚還直 我們,D中所想像的價位啊 ! (感覺至少會比 GRD 貴些吧 ? 直是 想不到 XD) 預計將於十一月初開賣 7 不過包括美國及中港台地區目前都仍未公佈確切的發售日期 7 但可 以確定的是您將會有更多預算可以先準偏好它的皮套 、 背帶等相闆周暹 7 讓呈晝台復古囷格的 X10 更有味 道 ° 透過引用來源可看到完整的新間稿 、 台灠宮網介紹以及日本宮網的賣拍樣本 °
14
Execution time
15
Accuracy
16
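Conclusion
Recognition accuracy for English and Western European characters is very high: once the font reaches a certain size, accuracy is above 99% (counted per character).
Recognition with the officially provided Traditional Chinese model tends to produce many wrong characters, and punctuation, symbols, and digits are also often misrecognized, even when the font is made very large.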
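The slides do not say how the per-character accuracy was computed. One common way, shown here only as a sketch under that assumption, is to take the Levenshtein edit distance between the reference text and the OCR output and divide it by the reference length. The function names `edit_distance` and `character_accuracy` are hypothetical.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # previous[j] is the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Fraction of reference characters recognized correctly (assumed metric)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    errors = edit_distance(reference, hypothesis)
    # Clamp at zero: a very long wrong output can have more errors than
    # the reference has characters.
    return max(0.0, 1.0 - errors / len(reference))
```

For example, comparing the reference word "already" with the OCR output "alrmdy" from the sample above gives an edit distance of 2 (one substitution and one deletion), so the accuracy for that word is 1 - 2/7.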
17
試看超立方暟光學文字辨識中文是否準確成式戎戍戌戒找我或咸