While developing my offline reader, I wanted news articles to appear cleanly formatted, without the publishers’ headers, site styling, ads, comments, and so on. If you have an iPhone, you are probably familiar with an app called Instapaper, which does exactly that.
So, for my app, I was looking for a web-based REST API that takes the URL of a news article and returns a nicely formatted string containing the article text, with a minimal amount of extra shmutz. I reached out to Instapaper, but they could not accommodate my request because their infrastructure is not yet scalable enough to handle many generic requests. They did suggest a great tool called Diffbot instead. Go ahead and try it, it’s really cool. You get back a nice JSON-formatted response, which you can parse and present the way you want. Their prices start at $20/month, so I have not committed to it yet. But their API is clean and they do an awesome job at parsing.
I also found another tool, which I recently incorporated into the codebase. Readability provides an API similar to Diffbot’s, and so far I have not had to pay for it. So for now, I am using it.
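To make the idea concrete, here is a rough sketch of how a client might talk to this kind of article-extraction service. Everything specific in it is an assumption for illustration: the endpoint `https://api.example.com/v1/article`, the `token` and `url` query parameters, and the `title` and `text` fields in the JSON response are all made up, so check the provider’s documentation for the real names.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; not a real provider's URL.
API_BASE = "https://api.example.com/v1/article"


def build_request_url(article_url, token):
    """Build the GET URL asking the service to extract one article.

    The parameter names `token` and `url` are assumptions.
    """
    query = urlencode({"token": token, "url": article_url})
    return f"{API_BASE}?{query}"


def parse_article(response_body):
    """Pull the title and plain text out of a JSON response body.

    The field names `title` and `text` are assumptions; missing
    fields fall back to an empty string.
    """
    data = json.loads(response_body)
    return {
        "title": data.get("title", ""),
        "text": data.get("text", "").strip(),
    }


# A made-up response body, shaped the way such a service might answer.
sample = '{"title": "Hello", "text": "  Clean article body.  ", "ads": []}'
article = parse_article(sample)
print(article["title"])
print(article["text"])
```

The actual HTTP call is left out on purpose: you would fetch the URL from `build_request_url` with whatever HTTP client your app already uses and hand the response body to `parse_article`. Keeping the parsing separate also makes it easy to swap one provider for another later.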