TEXTfromPDF is a text extraction tool for Windows XP/2000 that automates the conversion of Adobe PDF documents to text files.
One of a company's greatest assets is its intellectual capital. An important form of intellectual capital is the documentation created by its employees. This documentation is saved in a variety of file formats on computers throughout the enterprise.
How can a company tap into all of this intellectual capital? TEXTfromPDF gives a company access to the text content contained in PDF documents without requiring any Adobe product. It can automatically extract the text from thousands of Adobe PDF documents in a matter of minutes. The extracted content is saved to text files, where it can be easily searched or archived. Developers can also import the text file contents into a variety of databases for content management purposes.
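As an illustration of the database import mentioned above, here is a minimal sketch in Python. It is not part of TEXTfromPDF: it assumes the tool has already written UTF-8 `.txt` files into a folder, and the function names, the `documents` table, and the choice of SQLite are all hypothetical choices made for this example.

```python
import sqlite3
from pathlib import Path


def import_text_files(folder, db_path, encoding="utf-8"):
    """Load every .txt file in `folder` into a SQLite table; return the file count."""
    files = sorted(Path(folder).glob("*.txt"))
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            # Hypothetical schema: one row per extracted text file.
            conn.execute(
                "CREATE TABLE IF NOT EXISTS documents ("
                "name TEXT PRIMARY KEY, content TEXT NOT NULL)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO documents (name, content) VALUES (?, ?)",
                ((f.name, f.read_text(encoding=encoding)) for f in files),
            )
    finally:
        conn.close()
    return len(files)


def search(db_path, term):
    """Return the names of documents whose content contains `term`."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM documents WHERE content LIKE ? ORDER BY name",
            (f"%{term}%",),
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

Calling `import_text_files("extracted", "docs.db")` and then `search("docs.db", "invoice")` would list the files that mention the term. A plain `LIKE` query is enough for a small archive; a larger collection would probably call for a full-text index instead.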
Without a tool such as TEXTfromPDF, gathering the text from a PDF document requires manual selection and copy/paste operations. This is slow, laborious, costly, and prone to error.
Features:
Command line execution.
Batch conversion of multiple PDF documents to text files.
Supports drag and drop of files and folders.
Retain original layout or change to "reading order".
Text files can be DOS/Windows, Unix, or Mac compatible.
Convert entire document or only a specific page range.
PDFs can be on local drives or on the Internet.
Works with password protected documents.
Does not require any Adobe program.
Maximum line length of output can be user-defined.
Three text encodings supported: UTF-8, Latin1, and ASCII.
Supports PDF 1.5.
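
To make the line-ending and encoding options in the list above concrete, the following Python sketch shows what such a conversion amounts to when applied to text that has already been extracted. It does not use or describe TEXTfromPDF's own options. The `reencode` function and the style names are hypothetical, and "Mac compatible" is assumed here to mean the classic Mac OS carriage-return convention.

```python
# Line terminators for the three conventions (the "mac" entry is an assumption:
# classic Mac OS used a bare carriage return).
NEWLINES = {"dos": "\r\n", "unix": "\n", "mac": "\r"}


def reencode(text, style="unix", encoding="utf-8"):
    """Return `text` as bytes with the chosen line endings and encoding."""
    # Normalize any mix of CRLF, CR, and LF to LF first.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    converted = normalized.replace("\n", NEWLINES[style])
    # Characters the target encoding cannot represent become "?".
    return converted.encode(encoding, errors="replace")
```

For example, `reencode(text, "dos", "latin-1")` would produce bytes suitable for a Windows text file in Latin1. With `"ascii"`, accented characters are replaced rather than raising an error, which is one possible policy among several.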
