gituser/production/: pdfminer-20140328 metadata and description

PDF parser and analyzer

author Yusuke Shinyama
author_email yusuke at cs dot nyu dot edu
classifiers
  • Development Status :: 4 - Beta
  • Environment :: Console
  • Intended Audience :: Developers
  • Intended Audience :: Science/Research
  • License :: OSI Approved :: MIT License
  • Topic :: Text Processing
keywords pdf parser,pdf converter,layout analysis,text mining
license MIT/X
File pdfminer-20140328-py2-none-any.whl
Size 111 KB
Type Python Wheel
Python 2

PDFMiner is a tool for extracting information from PDF documents. Unlike other PDF-related tools, it focuses entirely on extracting and analyzing text data. PDFMiner lets you obtain the exact location of text on a page, as well as other information such as fonts or lines. It includes a PDF converter that can transform PDF files into other text formats (such as HTML). It also has an extensible PDF parser that can be used for purposes other than text analysis.
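
As a rough illustration of obtaining text locations, here is a minimal sketch. It assumes the layout-analysis API as shipped in this release (`PDFParser`, `PDFDocument`, `PDFPage.create_pages`, `PDFPageAggregator`, `LTTextBox`). The helper names `describe_box` and `iter_text_boxes` are hypothetical and not part of PDFMiner, and the input path is whatever PDF you pass on the command line.

```python
import sys


def describe_box(pageno, bbox, text):
    """Format one text box as 'page N: (x0, y0)-(x1, y1) text' (hypothetical helper)."""
    x0, y0, x1, y1 = bbox
    return "page {}: ({:.1f}, {:.1f})-({:.1f}, {:.1f}) {}".format(
        pageno, x0, y0, x1, y1, text.strip())


def iter_text_boxes(path):
    """Yield (page number, bbox, text) for each text box in the PDF at `path`.

    Hypothetical helper. The pdfminer imports live inside the function so
    that the module can be loaded even where PDFMiner is not installed.
    """
    from pdfminer.converter import PDFPageAggregator
    from pdfminer.layout import LAParams, LTTextBox
    from pdfminer.pdfdocument import PDFDocument
    from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer.pdfpage import PDFPage
    from pdfminer.pdfparser import PDFParser

    with open(path, "rb") as fp:
        document = PDFDocument(PDFParser(fp))
        rsrcmgr = PDFResourceManager()
        # The aggregator collects each page's layout tree instead of writing output.
        device = PDFPageAggregator(rsrcmgr, laparams=LAParams())
        interpreter = PDFPageInterpreter(rsrcmgr, device)
        for pageno, page in enumerate(PDFPage.create_pages(document), 1):
            interpreter.process_page(page)
            for obj in device.get_result():
                if isinstance(obj, LTTextBox):
                    # bbox is (x0, y0, x1, y1) in PDF points, origin at bottom-left.
                    yield pageno, obj.bbox, obj.get_text()


if __name__ == "__main__" and len(sys.argv) > 1:
    for pageno, bbox, text in iter_text_boxes(sys.argv[1]):
        print(describe_box(pageno, bbox, text))
```

Run as a script with a PDF path as its only argument, this prints one line per text box. For the conversion use case (for example PDF to HTML), the `pdf2txt.py` command-line script bundled with the package is the more direct route.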