author | Nick White <git@njw.me.uk> | 2011-10-30 12:29:45 +0000 |
---|---|---|
committer | Nick White <git@njw.me.uk> | 2011-10-30 12:29:45 +0000 |
commit | cd0c0a821361f5ee7c52ee60fb0ed5b758e53620 | |
tree | d9bc0c72dc2447c9ac3b4edc68081348b36ba64e /TODO | |
parent | 82908257a64d4fd67785c76ab33b0392bc9d9724 |
Add ocr pdf script
Diffstat (limited to 'TODO')
-rw-r--r-- | TODO | 3 |
1 files changed, 0 insertions, 3 deletions
@@ -4,9 +4,6 @@ before 1.0: create bn tool, fix http bugs, be unicode safe, package for osx & wi
 
 # other todos
 
-improve 2pdf script to use ocr; use tesseract to output hocr & hocr2pdf (from exact-image pkg)
- see http://www.exactcode.de/site/open_source/exactimage/hocr2pdf/ https://tfischernet.wordpress.com/2008/11/26/searchable-pdfs-with-linux/ http://code.google.com/p/tesseract-ocr/
-
 create 2epub script if simple
 
 use the correct file extension depending on the image type (for google and amazon the first page is a jpg, all the others are png)