Diffusion FreeBSD ports repository rP506461

New port: textproc/py-ocrmypdf
rP506461

Description

New port: textproc/py-ocrmypdf

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be
searched or copy-pasted.

Main features:

Generates a searchable PDF/A file from a regular PDF
Places OCR text accurately below the image to ease copy / paste
Keeps the exact resolution of the original embedded images
When possible, inserts OCR information as a "lossless" operation without disrupting any other content
Optimizes PDF images, often producing files smaller than the input file
If requested, deskews and/or cleans the image before performing OCR
Validates input and output files
Distributes work across all available CPU cores
Uses Tesseract OCR engine to recognize more than 100 languages
Scales properly to handle files with thousands of pages
Battle-tested on millions of PDFs
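
As a rough illustration of how several of the features above map onto an invocation of the installed tool, the sketch below builds an argument list for the ocrmypdf command line. It is not part of this commit. The helper name build_ocrmypdf_command and the file names are hypothetical. The options used (--output-type, -l, --deskew, --clean, --jobs) are upstream command-line options, but check them against the installed version before relying on them.

```python
import shlex
from typing import List, Optional, Sequence


def build_ocrmypdf_command(
    input_pdf: str,
    output_pdf: str,
    languages: Sequence[str] = ("eng",),
    deskew: bool = False,
    clean: bool = False,
    jobs: Optional[int] = None,
) -> List[str]:
    """Build an argv list for ocrmypdf (hypothetical helper, for illustration only)."""
    # Ask for PDF/A output, matching the "searchable PDF/A" feature.
    cmd = ["ocrmypdf", "--output-type", "pdfa"]
    # Tesseract language codes are joined with '+', e.g. "eng+deu".
    cmd += ["-l", "+".join(languages)]
    if deskew:
        cmd.append("--deskew")  # straighten pages before OCR
    if clean:
        cmd.append("--clean")  # clean pages before OCR
    if jobs is not None:
        cmd += ["--jobs", str(jobs)]  # cap the number of parallel workers
    cmd += [input_pdf, output_pdf]
    return cmd


if __name__ == "__main__":
    # Assumed file names; nothing is executed, the command is only printed.
    argv = build_ocrmypdf_command(
        "scan.pdf", "scan-ocr.pdf", languages=("eng", "deu"), deskew=True, jobs=4
    )
    print(shlex.join(argv))
```

Returning a list rather than a single string lets the result be passed directly to subprocess.run without shell quoting concerns.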

WWW: https://github.com/jbarlow83/OCRmyPDF

Reviewed by: 0mp, koobs
Differential Revision: https://reviews.freebsd.org/D20927

Details

Provenance

kai	Authored on

Reviewer

Differential Revision

D20927: New port: textproc/py-ocrmypdf: Adds an OCR text layer to scanned PDF files

Parents

rP506460: mail/dovecot, mail/dovecot-pigeonhole: Update to 2.3.7 and 0.5.7 respectively.

Event Timeline

kai committed rP506461: New port: textproc/py-ocrmypdf. Jul 12 2019, 3:08 PM

kai added an edge: D20927: New port: textproc/py-ocrmypdf: Adds an OCR text layer to scanned PDF files.

Changes (5)

head/textproc/Makefile
head/textproc/py-ocrmypdf/
head/textproc/py-ocrmypdf/Makefile
head/textproc/py-ocrmypdf/distinfo
head/textproc/py-ocrmypdf/pkg-descr