Oli Movie R

I've been interested in machine learning ever since I learned about the concept last year. After watching a lot of videos and reading a bunch of posts, I trained my first machine learning model a few days ago. It was so exciting that I can't wait to share my experience.

If you want to check out the code directly, please visit my GitHub repository.

Find a problem

The first and most important step in training a model is to find a problem that is currently handled by humans but could be solved by a machine. If you are one of those humans, even better, because you will be motivated to save yourself a lot of time.

I have used EMS (Express Mail Service) a lot since my baby was born, because I need to track milk powder shipped from Germany (big thanks to my best friend in Germany) and goods from Amazon Japan. The annoying website asks users to solve a captcha before showing mail information. So I had found my problem: recognizing the captcha on their website using machine learning. The captcha looks like this:

EMS captcha

For us humans it's an easy task. Believe it or not, it's also a piece of cake for a computer, thanks to machine learning.

Analyze the problem

The captcha is always composed of six digits, and my goal is to recognize all of them from the captcha image, which can be treated as an image classification problem. I had gone through TensorFlow for Poets from Google Developers Codelabs. The tutorial teaches you, step by step, how to train a simple classifier to classify images of flowers. Maybe my problem could be solved the same way if I followed the tutorial correctly. I wanted to try.

There is a question to answer before I can train my model: what are the categories? Categories are all the possible results of a specific image classification problem. It seems reasonable to label them from 000000 to 999999, but that means one million categories, and I would need a huge amount of training data. Instead I chose to cut the captcha into six pieces, each containing a single digit. A segment looks like this:

Segment after cutting

Thus my categories become 0 to 9. I only need to recognize each piece of the captcha and combine the results to get the final result.
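To make the cutting step concrete, here is a minimal sketch of how the six crop boxes could be computed. It assumes the digits are evenly spaced across the image, which may not match the real EMS captcha, and the function name `segment_boxes` and the 120x40 image size are made up for illustration. Each box is in the (left, top, right, bottom) form that an image library such as Pillow accepts for cropping.

```python
def segment_boxes(width, height, digits=6):
    """Return one (left, top, right, bottom) crop box per digit.

    Assumption: the digits are evenly spaced across the image width.
    """
    step = width / digits
    return [
        (round(i * step), 0, round((i + 1) * step), height)
        for i in range(digits)
    ]


# Labeling whole captchas needs one category per six-digit value,
# while labeling single digits needs only ten.
whole_captcha_categories = 10 ** 6
single_digit_categories = 10

# 120x40 is a hypothetical captcha size, not the real EMS one.
boxes = segment_boxes(120, 40)
for box in boxes:
    print(box)
```

With evenly spaced digits a simple division is enough. A captcha with uneven spacing would need a smarter way of finding the digit boundaries.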

Prepare training data

My machine learning model needs to be fed with data. How can I get data? It has always been a difficult problem for machine learning newbies. I used the most naive method: download a lot of captchas from the EMS website and save each one using its value as the file name.

As analyzed above, I need to preprocess these labeled captchas and organize them correctly before training. According to the tutorial, I need to create ten directories named 0 to 9 and put each segment into its corresponding directory.
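The directory layout can be derived from the file name alone, since the name is the label. The sketch below only computes where each of the six segments would go and does not touch the disk. The root directory `training_data`, the function name `segment_destinations` and the `<label>_<position>` naming scheme are all assumptions for illustration.

```python
from pathlib import PurePosixPath


def segment_destinations(captcha_filename, root="training_data"):
    """Map a labeled captcha file to one destination path per digit.

    Assumption: the file stem is the six-digit captcha value,
    e.g. "302715.png". Segment i goes into the directory named
    after the i-th digit.
    """
    name = PurePosixPath(captcha_filename)
    label = name.stem
    if len(label) != 6 or not label.isdigit():
        raise ValueError(f"expected a six-digit label, got {label!r}")
    return [
        PurePosixPath(root) / digit / f"{label}_{i}{name.suffix}"
        for i, digit in enumerate(label)
    ]


for path in segment_destinations("302715.png"):
    print(path)
```

Including the original label and the position in each segment's file name keeps segments from different captchas from overwriting each other inside the same digit directory.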

Train model

With the training data prepared, I can train my model simply by running a script from the tutorial. After executing the Python script I came across an error. It turned out that the images in my training data were too small, so I needed to scale the segments. I scaled them to 128 pixels wide and 197 pixels high. Why 128x197? Because 128 is the smallest size supported by the script, and 197 is calculated from the original width-to-height ratio. After scaling, a segment looks like this:

Segment after scaling
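The height calculation is just the original aspect ratio applied to the new width. Here is a small sketch of that arithmetic. The 26x40 input is a made-up segment size chosen for illustration, not the real dimensions of an EMS segment, and `scaled_size` is a hypothetical helper name.

```python
def scaled_size(orig_width, orig_height, target_width=128):
    """Scale to target_width while keeping the width-to-height ratio."""
    target_height = round(orig_height * target_width / orig_width)
    return target_width, target_height


# Hypothetical 26x40 segment: 40 * 128 / 26 is about 196.9.
print(scaled_size(26, 40))  # (128, 197)
```

The resulting pair can then be passed to whatever image library does the actual resizing.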

Then I can train again, and this time the script should run without any problem. The whole training process should take only a few dozen seconds.

Use trained model

After training, I can use my model to predict the digit in a segment simply by running a prediction script. Predicting a whole captcha is then just a for loop.

To make things easier, I wrote a Python script to query mail status and integrated it into my IM robot system. I only need to send a mail number to my IM robot, which will crawl the EMS website, download a captcha, predict its value, submit the result, parse the response and send the result back to me.
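The for loop mentioned above might look roughly like this. The classifier is passed in as `predict_digit`, a hypothetical callable standing in for the trained model, and the stub used here simply reads the digit from a fake segment name so the sketch runs without TensorFlow.

```python
def predict_captcha(segments, predict_digit):
    """Predict each segment in order and join the digits into one string.

    predict_digit is a stand-in for the trained model: any callable
    that takes one segment and returns its digit.
    """
    result = ""
    for segment in segments:
        result += str(predict_digit(segment))
    return result


# Stub classifier for demonstration only: the "segments" are fake
# names whose last character is the digit they contain.
fake_segments = ["seg-3", "seg-0", "seg-2", "seg-7", "seg-1", "seg-5"]
print(predict_captcha(fake_segments, lambda name: name[-1]))  # 302715
```

Keeping the classifier as a parameter makes the loop easy to test separately from the model.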

Conclusion