Blog para proyectantes de Dani Gayo

Monday, February 22, 2010, 04:32 PM
Last week I finished "Superfreakonomics" by Levitt and Dubner. If you haven't read "Freakonomics" by the same authors, skip this review and read that book instead. If you have read it and didn't like it, skip this review too.

Still here? OK, here are my opinions on the book. First of all, I recommend it, although I think Freakonomics is much better because, to me, Superfreakonomics is just a second installment of the first book. I mean, I enjoyed S-F, but in the same way I'd enjoy a good sequel to a great movie.

As with Freakonomics, S-F has a rather zigzagging narrative which, in a few chapters, makes it hard to follow (or even find) the authors' argument. However, I think this is a rather common style nowadays among some authors, Malcolm Gladwell for instance (I'm not sure if this is good, bad, or just the opposite).

Again, the authors have chosen some topics which they (probably) thought could be controversial (aka sales-boosting), like prostitution and fighting global warming. Needless to say, neither of them is that controversial, and the global cooling measures described in the book are, at most, intriguing.

There are, however, two chapters that I think are really worth reading: those on terrorism and altruism. The first one, "Why should suicide bombers buy life insurance?", will probably appeal to those of you with data-mining inclinations (although the authors provide just a few glimpses of the topic). The latter, "Unbelievable stories about apathy and altruism", is, well, really interesting, and helps to understand the problems with controlled experiments in sociology and psychology (after all, any measurement changes the thing measured, even when the instrument is just the researcher).

Additionally, I found the epilogue, "Monkeys are people too", really funny, which is a kind of bonus for the book :)

My recommendation? Get the book and read it. Is it going to be useful for you? I don't really know whether, or in what way, it will be useful. However, if you are a researcher somewhat related to sociology/psychology/human interaction, it can help you to be a little "freakier", encourage you to ask tougher questions, and push you to think outside the box. To me, this alone makes Superfreakonomics worth the money.




Saturday, December 26, 2009, 10:23 PM
The list of accepted papers for WSDM 2010 is available (thanks to @mstrohm). As usual, I've prepared my list of papers to read (with their corresponding PDFs); the papers I'd like to read but which are not yet available appear at the end of the list :(




Thursday, December 17, 2009, 05:26 PM
INQ has just released a Twitterati list (their name); that is, a list of the most influential twitterers. I learned of this thanks to Zee M Kane who, according to that list, is the 7th most influential Briton.

I was curious enough to find out how influence was computed, and I eventually reached the actual report (PDF). There, in a footnote, you can find that INQ did not actually compute the twitterers' influence but, instead, relied on the Twitalyzer.com service.

Needless to say, Twitalyzer is one of the not-so-many systems that compute a twitterer's influence. Another one is TunkRank, which is an implementation of an idea by Daniel Tunkelang. TunkRank is a kind of PageRank for Twitter and, thus, I find it much more appealing: I mean, there are no (or at least not many) ad hoc decisions, and its results can be replicated (provided you have a Twitter graph).

Actually, I have assembled such a subgraph (about 1.5 million users) and have applied PageRank (not yet TunkRank) to it, and this is my list of "most influential British Twitter users". But first, a word of warning: I've only looked for the users appearing in the Telegraph's list, and added Richard Branson, who seems to be an odd omission. Now, the list:
  1. stephenfry
  2. mashable
  3. rustyrockets
  4. richardbranson
  5. imogenheap
  6. richardpbacon
  7. calvinharris
  8. andy_murray
  9. mayoroflondon
  10. sarahbrown10
  11. mrpeterandre
  12. tommcfly
  13. zee
  14. suziperry
  15. johnprescott
  16. tom_watson
  17. campbellclaret
  18. dougiemcfly

As you can see, apart from including Richard Branson (and not including some users who do not appear in my sample, such as Eddie Izzard), there are only minor, but interesting, differences: for instance, Stephen Fry is more influential than Pete Cashmore from Mashable, and the Mayor of London is more influential than the PM's wife, Sarah Brown; with regard to Tom and Dougie from McFly, Tom is much more influential than Dougie (remember, there are other twitterers in between these celebrities).
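For readers curious about how a ranking like the one above can be obtained, here is a minimal sketch of plain PageRank over a "who follows whom" graph. It is not the code I ran on the 1.5-million-user subgraph, and it is not TunkRank; the function name `pagerank`, the toy graph `toy_follows`, and the user names in it are made up for illustration, and the damping factor of 0.85 is just the customary default.

```python
def pagerank(follows, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    `follows` maps each user to the list of users they follow, so
    influence flows from a follower to the accounts they follow.
    """
    # Collect every user that appears as a follower or as a followed account.
    users = set(follows)
    for followed in follows.values():
        users.update(followed)
    users = sorted(users)
    n = len(users)

    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Every user receives the same small "teleport" share.
        new_rank = {u: (1.0 - damping) / n for u in users}
        for u in users:
            followed = follows.get(u, [])
            if followed:
                # Split this user's rank evenly among the accounts they follow.
                share = damping * rank[u] / len(followed)
                for v in followed:
                    new_rank[v] += share
            else:
                # A user who follows nobody spreads their rank over everyone.
                share = damping * rank[u] / n
                for v in users:
                    new_rank[v] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: alice, bob and dave follow carol; carol follows alice.
toy_follows = {
    "alice": ["carol"],
    "bob": ["carol", "alice"],
    "carol": ["alice"],
    "dave": ["carol"],
}
scores = pagerank(toy_follows)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph carol ends up first and alice second, simply because carol is followed by everybody else and passes all of her rank on to alice. On a real Twitter sample the only differences are the size of the graph and the need for a sparse representation.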

Oh, and what about that "contest" between Ashton Kutcher and CNN? It seems that aplusk and cnnbrk share first place among Twitter influencers. Yup...




Tuesday, December 8, 2009, 12:27 PM
I learned of "The Numerati" by Stephen Baker a couple of weeks ago in El Pais (a Spanish newspaper). He wrote the short piece "Nos vigilan" (They are watching us), which depicts a scenario where "scientists" (mainly mathematicians and engineers) apply different tools and techniques in order to detect, track, and predict users' behaviour in different situations (e.g. in the supermarket, during elections, when using the Internet, etc.).

The tone of the story is a bit Big-Brother-esque for my taste and, in my opinion, it fails to provide a realistic, albeit simplified, picture of the actual state of the art.

Because of that, I was reluctant to buy the book and, thus, I searched for some reviews of it. This one, by Jeffrey Shallit, was pretty useful because it helped me to set my expectations for the book, which I eventually bought and read.

After reading it, I must say that I mostly agree with Shallit's review: "The Numerati" is not a book for the technically savvy, and it probably also fails at introducing the field to the lay person. However, I haven't found it totally unenjoyable: despite the title and the short story in the newspaper, the book is not the mixture of Big Brother paranoia and Dan Brown I was afraid of. It (tries to) describe several fields where data mining and machine learning can be applied, the challenges researchers are facing, and the goals they are trying to reach.

All in all, I really recommend this book to those interested in data mining; hopefully, some of you will write a kind of version 2 of "The Numerati" that provides a more accurate picture while still appealing to the general public.




Wednesday, November 25, 2009, 02:05 AM
Last Friday I learned of Wowd; it is, in their words, "a real-time search engine for discovering what's popular on the web right now". Wowd exploits crowd intelligence in a really smart way: first, there is no crawler, and the pages visited by the users are what gets submitted to the index; second, ranking is determined by the attention the users pay to each page.

All of this is somewhat related to a paper we have under review at this moment and, thus, we decided to release a draft report as a preprint:

Making the road by searching - A search engine based on Swarm Information Foraging, by Daniel Gayo-Avello and David J. Brenes.

Abstract: Search engines are nowadays one of the most important entry points for Internet users and a central tool to solve most of their information needs. Still, there exist a substantial amount of users' searches which obtain unsatisfactory results. Needless to say, several lines of research aim to increase the relevancy of the results users retrieve. In this paper the authors frame this problem within the much broader (and older) one of information overload. They argue that users' dissatisfaction with search engines is a currently common manifestation of such a problem, and propose a different angle from which to tackle with it. As it will be discussed, their approach shares goals with a current hot research topic (namely, learning to rank for information retrieval) but, unlike the techniques commonly applied in that field, their technique cannot be exactly considered machine learning and, additionally, it can be used to change the search engine's response in real-time, driven by the users behavior. Their proposal adapts concepts from Swarm Intelligence (in particular, Ant Algorithms) from an Information Foraging point of view. It will be shown that the technique is not only feasible, but also an elegant solution to the stated problem; what's more, it achieves promising results, both increasing the performance of a major search engine for informational queries, and substantially reducing the time users require to answer complex information needs.
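To give a flavour of what "ant algorithms" can mean for search, here is a toy sketch of pheromone-style re-ranking: each click on a result deposits some "pheromone" on the (query, URL) pair, trails evaporate over time, and results are reordered by trail strength. This is only an illustration of the general idea under my own simplifying assumptions, not the technique described in the paper; the class `PheromoneRanker`, its parameters, and the example URLs are all hypothetical.

```python
from collections import defaultdict


class PheromoneRanker:
    """Toy pheromone-based re-ranker (illustrative only)."""

    def __init__(self, evaporation=0.1, deposit=1.0):
        self.evaporation = evaporation  # fraction of each trail lost per step
        self.deposit = deposit          # pheromone added per click
        # trails[query][url] -> current pheromone level
        self.trails = defaultdict(lambda: defaultdict(float))

    def record_click(self, query, url):
        # A user who clicks a result reinforces the trail for that query.
        self.trails[query][url] += self.deposit

    def evaporate(self):
        # Old trails fade, so the ranking can follow changes in user behaviour.
        for urls in self.trails.values():
            for url in urls:
                urls[url] *= 1.0 - self.evaporation

    def rerank(self, query, baseline_results):
        # Stronger trails first; sorted() is stable, so results without any
        # pheromone keep the order given by the underlying search engine.
        trail = self.trails.get(query, {})
        return sorted(baseline_results, key=lambda url: -trail.get(url, 0.0))


# Hypothetical usage with made-up URLs.
ranker = PheromoneRanker()
baseline = ["a.example", "b.example", "c.example"]
ranker.record_click("ant algorithms", "c.example")
ranker.record_click("ant algorithms", "c.example")
ranker.record_click("ant algorithms", "b.example")
reranked = ranker.rerank("ant algorithms", baseline)
untouched = ranker.rerank("some other query", baseline)
ranker.evaporate()
print(reranked)
```

Note that nothing here is "learned" in the machine learning sense: the ordering changes as soon as users click, which is the real-time flavour the abstract alludes to.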


Curiously enough, a few hours after the preprint became available I received an interview request from a French journalist, which resulted in this description of our research (in French).



