circle.ch weblog by Urs Gehrig
22. April 2008
How to OCR multipage PDF files

For simplicity, the TIF files p00.tif to pXY.tif are concatenated into a single TIF file that has the width of a single page and the height of XY pages. This at least preserves the order of the text, that is, the text flow. One could also arrange all the TIF files into a mosaic instead. A density of 150 dpi gives reasonable results with tesseract.
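
The steps described above could be sketched roughly as follows. This is only an illustration under several assumptions: that ImageMagick's `convert` and the `tesseract` command-line tool are installed, that the PDF has fewer than 100 pages, and that the helper names (`split_cmd`, `append_cmd`, `ocr_cmd`, `ocr_pdf`) and the file names `all.tif` and `out` are made up for this example. On newer ImageMagick installations the command may be called `magick` instead of `convert`.

```python
import glob
import subprocess

DENSITY_DPI = 150  # the density the post reports as giving reasonable results


def split_cmd(pdf_path, dpi=DENSITY_DPI):
    """Command that renders each PDF page to p00.tif, p01.tif, ..."""
    return ["convert", "-density", str(dpi), pdf_path, "p%02d.tif"]


def append_cmd(page_files, out_tif="all.tif"):
    """Command that stacks the pages top to bottom into one tall TIF.

    ImageMagick's -append joins images vertically, so the result is one
    page wide and as tall as all pages together, which keeps the text order.
    """
    return ["convert", *sorted(page_files), "-append", out_tif]


def ocr_cmd(tif_path, out_base="out"):
    """Command that runs tesseract; it writes its text to <out_base>.txt."""
    return ["tesseract", tif_path, out_base]


def ocr_pdf(pdf_path):
    """Run all three steps in the current directory (hypothetical helper)."""
    subprocess.run(split_cmd(pdf_path), check=True)
    # Assumes fewer than 100 pages, so every page file has two digits.
    pages = glob.glob("p[0-9][0-9].tif")
    subprocess.run(append_cmd(pages), check=True)
    subprocess.run(ocr_cmd("all.tif"), check=True)
    return "out.txt"
```

Calling `ocr_pdf("document.pdf")` would leave the recognized text in `out.txt`. For the mosaic variant mentioned above, ImageMagick's `montage` tool could take the place of the `-append` step, at the cost of a less predictable reading order.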