Tesseract OCR 97703036 Computer Science, 4th year 李昱安
About Tesseract OCR
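Tesseract OCR is an OCR engine developed by researchers at HP between 1985 and 1994; in 1995 it placed among the top three engines in the OCR accuracy test run by the University of Nevada, Las Vegas (UNLV). HP released it as open source in 2005, and Google has sponsored its maintenance since 2006. According to Google, Tesseract OCR is the most accurate open source OCR engine.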
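Supports more than 30 scripts and languages. Can analyze page layout and handles vertical text. Input images must be in uncompressed TIF format with a white background; the text itself may be in full color.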
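The outline of each character is approximated by a polygon, and the four-dimensional vectors (x-position, y-position, direction, length) of the polygon's segments are used as the character's features.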
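The feature extraction described above can be sketched roughly as follows. This is only an illustration, not Tesseract's actual code: it uses Douglas-Peucker simplification as a stand-in for the engine's own polygonal approximation, skips any normalization, and every name in it (`simplify`, `approximate_outline`, `segment_features`, the `tolerance` parameter) is hypothetical. It assumes the outline is given as a closed list of (x, y) points without the first point repeated at the end.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, worst = max(
        ((i, _point_segment_distance(points[i], first, last))
         for i in range(1, len(points) - 1)),
        key=lambda item: item[1],
    )
    if worst <= tolerance:
        return [first, last]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point


def approximate_outline(outline, tolerance):
    """Approximate a closed outline by a polygon (list of vertices)."""
    start = outline[0]
    # Split the closed curve at the point farthest from the start,
    # then simplify the two open halves independently.
    far = max(range(len(outline)), key=lambda i: math.dist(outline[i], start))
    first_half = simplify(outline[: far + 1], tolerance)
    second_half = simplify(outline[far:] + [start], tolerance)
    return first_half[:-1] + second_half[:-1]


def segment_features(polygon):
    """One (x, y, direction, length) vector per polygon segment.

    x and y are taken at the segment midpoint (an assumption of this
    sketch); direction is the segment angle in radians.
    """
    features = []
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        features.append((
            (x0 + x1) / 2,
            (y0 + y1) / 2,
            math.atan2(y1 - y0, x1 - x0),
            math.hypot(x1 - x0, y1 - y0),
        ))
    return features
```

For a square outline sampled at eight points, `approximate_outline` keeps only the four corners, and `segment_features` then returns four vectors, one per side.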
We‘ve already sorted through the specs, and laid our hands on its rather sexy frame, now Fujifilm'soffering up a more palatable price tag than we expected for its throwback X10 shooter. Startingsometime in early October, the X100's more affordable little brother will set nostalgic point-and-shooters back $599.99 — about $100 bones less than the estimated $715 to $860 ballpark we threw outback in September. If you'll recall, the X10 packs a 12 megapixel EXR CMOS sensor, f/2-2.8, 28-112mmmanual zoom lens, up to 12,800 ISO sensitivity, 1080p video, an optical viewfinder, and pop-up flash. Noword yet on a final release date. Full PR after the break.
weve almay 591191 u1¢911g\1 me 91995, ma had 9111 mas weve almay 591191 u1¢911g\1 me 91995, ma had 9111 mas .111 19 num my mme, 119wn1j;aLm~s 9ef9119g up 1 19919 1191991119 P1199 mg um we 91999191 f91 19 u1¢9w\199k x19 =\199¢91. sm-ang 5919911919 an may 09191199 me x199's 19919 959199919 me \119u191 wan 591 1199111919 11919199117 =\199¢91s\199k $599.99 _ 1119111 s199\19119=1& um me 9919191911 sm 19 sa99\19up9¢kw9 u1¢9w99¢ 9991119 s9p¢ \191. 1fy911'u Emu, me x19 119919 1 12 19991911191 nxncmos 591591, f/29.9, 2s»112mm 91991111 19919 19.15, up 19 12,999 xso sensiavify, 199911 v;a99, 99 9p¢1<=1 v;9ws9a91, ma P99911 mash. N9w91ay919111s91\191m9a=¢9. n1u1m9n91u19\1m11_
We‘ve alrmdy sorted through the specs, and laid our hands on is rather sexy fimlne, now Fujifi.l.m‘soffering up 1 more palatable price mg than we expected for is Lhrawback X10 shooter. Starlingsometime in mrly October, the X1oo's more afifordable little brother will set uostz.lp'c p0inl~and—shoolelsback s599.99 _ about $1ooboues 1§ than the eflimaled $115 me $86oba.l.lparkwe Lhrewoulbackin Seplbet. Ifyo\|'l.l lemll, the X10 packs 1 12 megppixel EXRCMOS sensor, f/2fl.B, 2E>n2m|:umm zoom lens, up to 12,800 150 sensitivity, 10801; video, an optiml viewfinder, and popllp flash.Nowordyetouzfinallelmsedale. F\|.|.lPRafien.heb!m.k.
Adapting the Tesseract Open Source OCR Engine forMultilingual OCRRay Smith Daria Antonova Dar-Shyang LeeGoogle |nc., 1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA.AbstractWe describe eforts to adapt the Tesseract open source OCRengine for multiple scripts and languages. Eflort has beenconcentrated on enabling genmic niulti-lingual operation suchthat negligible eust0nti:ati0n is required far a new languagebevmrrlprorirling a cmpus aftert.
為了落實國民教育的精砷 ,也為了提昇國家人力素質促進競爭力教育部多年以來 一 直致力於推動教育普及化及延長國民教育 c
The (quick) [brown] {fox} jumps. Over the $43,456 The (quick) [brown] {fox} jumps!Over the $43,456.78 <lazy> #90 dog& duck/goose, as 12.5% of E-mailfrom aspammer@website.com is spam.Der ,,schnelle” braune Fuchs springtfiber den faulen Hund. Le renard brun<<rapide» saute par-dessus le chienparesseux. La volpe marrone rapidasalta sopra il cane pigro. El zorromarrén répido salta sobre el perroperezoso. A raposa marrom rzipidasalta sobre 0 cfio preguieoso.
就在十月初的時候 7 這間 日本著名的相機製造商總算肯透露其定價將訂在 US$599_99 (約 N丁$18,300 、 HK$4,700) 之譜 7 坦白靚還直 我們,D中所想像的價位啊 ! (感覺至少會比 GRD 貴些吧 ? 直是 想不到 XD) 預計將於十一月初開賣 7 不過包括美國及中港台地區目前都仍未公佈確切的發售日期 7 但可 以確定的是您將會有更多預算可以先準偏好它的皮套 、 背帶等相闆周暹 7 讓呈晝台復古囷格的 X10 更有味 道 ° 透過引用來源可看到完整的新間稿 、 台灠宮網介紹以及日本宮網的賣拍樣本 °
Execution time
Accuracy
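Conclusion
Recognition of English and Western European characters is very accurate: once the font reaches a certain size, accuracy is above 99% (counted per character). With the official Traditional Chinese model, recognition produces many wrong characters, and punctuation, symbols, and digits are also frequently misrecognized, even when the font is made very large.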
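The slides do not say how the per-character accuracy was computed. One common way, shown here purely as an assumption, is to take the Levenshtein edit distance between the ground-truth text and the OCR output and divide by the length of the ground truth. The function names `edit_distance` and `character_accuracy` are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete ref_char
                current[j - 1] + 1,      # insert hyp_char
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference, hypothesis):
    """Fraction of reference characters recognized correctly,
    under the edit-distance definition assumed here."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return 1.0 - edit_distance(reference, hypothesis) / len(reference)


# A fragment of the English sample and one OCR reading of it.
truth = "We've already sorted"
ocr = "We've alrmdy sorted"
print(edit_distance(truth, ocr), character_accuracy(truth, ocr))
```

Under this definition a result can drop below zero when the OCR output contains far more characters than the reference, so some evaluations clamp it at 0.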
試看超立方暟光學文字辨識中文是否準確成式戎戍戌戒找我或咸
http://code.google.com/p/tesseract-ocr