Re: [Sheflug] Convert JPEG to ASCII



> If you want to OCR, look at using gocr
> (http://jocr.sourceforge.net/) ... I use this within my mail
> delivery chain to remove the image-based spam. Works quite well :-)

Hmmmm.... how to decode the help file ;) .....

$ gocr -h
 Optical Character Recognition --- gocr 0.41
 using: gocr [options] pnm_file_name  # use - for stdin
 options (see gocr manual pages for more details):
 -h        - get this help
 -i name   - input image file (pnm,pgm,pbm,ppm,pcx,...)
 -o name   - output file  (redirection of stdout)
 -e name   - logging file (redirection of stderr)
 -x name   - progress output to fifo (see manual)
 -p name   - database path including final slash (default is ./db/)
 -f fmt    - output format (ISO8859_1 TeX HTML XML UTF8 ASCII)
 -l num    - threshold grey level 0<160<=255 (0 = autodetect)
 -d num    - dust_size (remove small clusters, -1 = autodetect)
 -s num    - spacewidth/dots (0 = autodetect)
 -v num    - verbose (see manual page)
 -c string - list of chars (debugging, see manual)
 -C string - char filter (ex. hexdigits: 0-9A-Fx, only ASCII)
 -m num    - operation modes (bitpattern, see manual)
 -a num      value of certainty (in percent, 0..100, default=95)
 examples:
        gocr -m 4 text1.pbm                   # do layout analyzis
        gocr -m 130 -p ./database/ text1.pbm  # extend database
        djpeg -pnm -gray text.jpg | gocr -    # use jpeg-file via pipe


Not sure about output file formats.  Have to have a think.
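
Going by the help text above, something along these lines might be a
starting point. It is only a sketch: the function name jpeg_to_ascii and
the file names are made up for illustration, and the only flags used are
the ones listed by gocr -h (-f, -o and "-" for stdin) plus the djpeg
pipe from the last example line.

```shell
# Hypothetical helper; the name jpeg_to_ascii is invented for this sketch.
# Usage: jpeg_to_ascii input.jpg output.txt
jpeg_to_ascii() {
    # Expect exactly two arguments: the JPEG to read, the text file to write.
    if [ "$#" -ne 2 ]; then
        echo "usage: jpeg_to_ascii input.jpg output.txt" >&2
        return 2
    fi
    # djpeg writes the JPEG as a greyscale PNM stream on stdout;
    # gocr reads that stream from stdin ("-"), "-f ASCII" selects the
    # output format and "-o" names the output file (as per gocr -h).
    djpeg -pnm -gray "$1" | gocr -f ASCII -o "$2" -
}
```

Calling it as "jpeg_to_ascii text.jpg text.txt" should leave whatever
gocr recognises in text.txt. How good the result is will presumably
depend on the scan and on the -l / -d settings, which are left at their
defaults here.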


-- 
Richard

_______________________________________________
        Sheffield Linux User's Group
  http://www.sheflug.org.uk/mailfaq.html
 GNU - The choice of a complete generation