From cd0c0a821361f5ee7c52ee60fb0ed5b758e53620 Mon Sep 17 00:00:00 2001
From: Nick White
Date: Sun, 30 Oct 2011 12:29:45 +0000
Subject: Add ocr pdf script

---
 TODO | 3 ---
 1 file changed, 3 deletions(-)

(limited to 'TODO')

diff --git a/TODO b/TODO
index 4c79489..43c7b19 100644
--- a/TODO
+++ b/TODO
@@ -4,9 +4,6 @@ before 1.0: create bn tool, fix http bugs, be unicode safe, package for osx & wi
 
 # other todos
 
-improve 2pdf script to use ocr; use tesseract to output hocr & hocr2pdf (from exact-image pkg) - see http://www.exactcode.de/site/open_source/exactimage/hocr2pdf/ https://tfischernet.wordpress.com/2008/11/26/searchable-pdfs-with-linux/ http://code.google.com/p/tesseract-ocr/
-
 create 2epub script if simple
-
 use the correct file extension depending on the image type (for google and amazon the first page is a jpg, all the others are png)
 
-- 
cgit v1.2.3