Matthew Russell's 21 Recipes for Mining Twitter

Posted by Ricardo Bánffy at Apr 25, 2011 06:25 PM

This is a very short, very practical book that gets you started exploring the Twitter APIs on your own. It offers a decent amount of code that no experienced developer should have much trouble understanding and applying to his or her own needs. The cost per page is not exactly attractive, and some readers may want a more in-depth, less cookbook-like experience. If you are in a hurry to extract data from Twitter, though, this book may be for you: at less than $20 for the electronic edition, it will spare you more than that in time spent figuring out libraries and APIs. Plus, it introduces many other interesting libraries that can be applied to a lot of problems besides Twitter mining.

You can buy the book from Amazon or directly from O'Reilly.

Python 2.6, PIL, Django 1.3 and MySQLdb on CentOS 5.5

Posted by Ricardo Bánffy at Apr 11, 2011 05:05 PM

Installing a reasonably modern Django environment on CentOS 5.5 (and, presumably, on Red Hat 5.x) may not be a trivial task: it involves, for example, some decisions that need to be made and, sometimes, a sacrifice or two that needs to be accepted.

The fainting surgeon

Posted by Ricardo Bánffy at Mar 24, 2011 06:09 PM

This morning I came across a funny comment.

On a post that highlights a subtle implication of using Java as a teaching tool (picked up from another article, this one about Scheme), a guy named Stephen Fraser dropped this: "As I see it, a software engineer who hasn’t worked through SICP with Scheme, a basic editor and command line, is like a surgeon who has never dissected a frog and faints at the sight of blood."

I happen to agree with that. It's not so much a virtue of Scheme and SICP (both outstanding teaching tools) as a major conceptual failure of IDEs. By shielding the programmer from the complexities (and we can endlessly argue whether those are needed or not) of the typical Java framework or build tool, they make that complexity tolerable and thus create fertile ground for adding new complexity on top of the already high pile. That new layer will, eventually, be mitigated by the next iteration of the IDE (or, if this specific layer of complexity fails to get much traction, by an IDE plugin).

And so, as we add layer upon layer of complexity, many of today's software engineers grow accustomed to being so removed from whatever they are actually doing (or, more precisely, what their IDEs are doing for them) that they risk being unable not only to see, but also to accurately grasp, the full depth of the stack they are standing upon. They become surgeons who can't say where the patient's lungs are located or what they do.

Mining the Social Web, by Matthew A. Russell

This book covers a lot of ground. It is, at times, a bit vertiginous in the number of subjects and technologies it touches per chapter, and is not always easy to follow. It can also introduce so many interesting things that, by the time you have become familiar with all of them, after wandering for hours on the web and jumping from interesting technology to interesting technology, you may have forgotten what took you to these places and wonder where you were in the book. Time spent reading it is, however, time very well spent. When you finish it, you will have at least a cursory familiarity with tools like OAuth, CouchDB, Redis, MapReduce, NumPy (and the Python programming language, although it will help you a lot if you know your way around Python before you start the book), Graphviz, SIMILE widgets, NLTK, and various service APIs and data formats, and you will be well equipped to explore those rich datasets on your own. The chapters are well compartmentalized, and it's easy to pick chapters to read according to your needs. I know that, when I face the problems they tackle, I will do exactly that.

If you do any kind of analysis and visualization of socially generated data from the web, this book is a good pick. Even if your datasets are not from the web, you may find the parts on analysis and visualization very interesting.

You can find this book at the O'Reilly website or on Amazon.

Disclosure: I reviewed this book for the O'Reilly Blogger Review Program. If you have a blog and love to read, you should take a look at it. It's fun.
