High Performance Text Processing

I realised that the program I wrote in the first week, which imports and processes the letters and prints a simple histogram-like visualisation of words and their distribution, was pretty slow. It worked well with the unit tests I had set up, because they mostly used short texts to probe whether the individual functions did what I wanted them to do. When I ran the program on the more than 850 letters, however, it took 'forever' to get a response (up to a minute). After searching online for a solution, I found a few good guidelines on how to make the program faster.
On the Python homepage there is a section on performance tips, which was one of my starting points and contains many useful tips. To monitor the performance of functions, Python's cProfile and profile modules are very good and easy to use. They gave me a good idea of where I could optimise my program. I also found a video by Daniel Krasner very useful, as it gives an easy and simple introduction to the problem and a few useful solutions.
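To give an idea of what profiling with cProfile looks like, here is a minimal sketch. It is not the code from my program: `count_words`, `profile_call` and the sample `letters` list are hypothetical stand-ins made up for illustration, and the real letters are of course longer and more varied.

```python
import cProfile
import io
import pstats
from collections import Counter


def count_words(texts):
    """Hypothetical stand-in for the word-counting step of the program."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    # Collect the report in a string instead of printing it directly.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep only the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


# Made-up sample data standing in for the 850+ letters.
letters = ["Dear Sir, I write to you about the weather"] * 850
word_counts, report = profile_call(count_words, letters)
```

Calling `print(report)` shows a table with the number of calls and the time spent per function. The functions with the largest cumulative time are usually the first candidates for optimisation.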
