Installing textract for Python 3

I came across textract, a library for extracting text from a variety of file formats. I wanted to use it to extract text from HTML files. Here is what I did to get it to install on my machine.

The installation documentation outlines the steps you need to perform. It can be found at the following URL:

http://textract.readthedocs.org/en/latest/installation.html

I wanted to install it for Python 3.4 on Ubuntu 14.04, but it seemed to support only Python 2.x. Here is what I did to get it to install for Python 3.4.

Get your virtualenv set up first:

virtualenv -p /usr/bin/python3.4 /usr/local

1. install the required libraries for Linux as outlined on the installation page.

2. download the source file for textract from

https://pypi.python.org/pypi/textract

3. untar the downloaded file

4. cd into the directory and look for the Python 2 style exception handlers, such as:

except ShellError,  e:

and change it to

except ShellError as e:

5. edit the requirements/python file and comment out

pdfminer==20140328

6. install the Python 3 equivalent of pdfminer

pip install pdfminer3k

7. finally run

python3.4 setup.py install

Everything should install at this point.
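
If there are many files to patch, steps 4 and 5 can be scripted instead of edited by hand. The following is only a rough sketch of how I would approach it, not something that ships with textract. The function names fix_except_syntax and comment_out_requirement are my own, and the regular expression assumes the simple one-line form shown in step 4 (a single exception name followed by a variable), so review the result before running the install.

```python
import re

# Matches Python 2 handlers of the form "except ShellError, e:".
# Assumption: a single exception name, all on one line. Tuple forms
# such as "except (A, B):" are deliberately left untouched.
OLD_EXCEPT = re.compile(
    r"^([ \t]*except[ \t]+)([\w.]+)[ \t]*,[ \t]*(\w+)[ \t]*:",
    re.MULTILINE,
)


def fix_except_syntax(source):
    """Rewrite 'except Name, var:' as 'except Name as var:' (step 4)."""
    return OLD_EXCEPT.sub(r"\1\2 as \3:", source)


def comment_out_requirement(text, package="pdfminer"):
    """Prefix requirement lines that pin `package` with '# ' (step 5)."""
    pinned = re.compile(r"\s*" + re.escape(package) + r"\s*==")
    lines = []
    for line in text.splitlines():
        if pinned.match(line):
            line = "# " + line
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    print(fix_except_syntax("except ShellError,  e:"))
    print(comment_out_requirement("pdfminer==20140328"))
```

To apply it to the unpacked source, read each .py file, pass its contents through fix_except_syntax, and write the result back. Do the same for requirements/python with comment_out_requirement.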

WordPress

My name is Tyson Maly. I am a computer engineer and have worked in the financial services industry for the past 9 years. I have also moonlighted as a consultant building web applications for businesses. This site has been around since 2003. I have been programming in Perl since 1996, and I have been developing websites since 1995. I have a wide variety of skills and can program the full stack, from the devops side to the user interface.

I recently had to upgrade a task management system that has been running since 2006. It was based on the dotproject project management system, which is written in PHP. After a move to a newer version of PHP, the system had some issues. While patching the system, I realized that the latest version of PHP had quite a few new features.

I have worked with PHP quite a bit over the years. I wrote my own CMS that supported both MySQL and PostgreSQL back in 2004. This system supported clients I was consulting for at the time.

WordPress has grown quite a bit since I began using it, when it was just a simple blog system. I am working on a few plugins for it. If your business needs a feature for its WordPress site, feel free to contact me. I can discuss what is possible given your requirements.

automating difficult business processes