Category Archives: science

Why am I jealous of poster board?

Because my new poster gets to go to Istanbul, Turkey for next week’s EvoStar Evolutionary Computation conference without me. I just shipped it off this morning.

Combining puzzle-making with biologically-inspired computer algorithms. Because life isn’t already crazy enough.

Click the above image to see the large version of the poster. Or, you can get a copy of the original PDF. And download the entire 10-page paper if you really want to bore yourself.

We decided not to travel to Istanbul for the conference because of the expense and difficulty in getting there. Alas, I will not be able to experience the surreal juxtaposition of the geeky Mario A.I. competition amidst the majestic ruins of the ancient Roman, Byzantine, and Ottoman empires.


My other posters: Evolutionary art (GECCO 2007), Cracking substitution ciphers (GECCO 2008)

Einstein on map making, and the connection of art and science.

“One of the strongest motives that lead persons to art or science is a flight from the everyday life. With this negative motive goes a positive one. Man seeks to form for himself, in whatever manner is suitable for him, a simplified and lucid image of the world, and so to overcome the world of experience by striving to replace it to some extent by this image. This is what the painter does, and the poet, the speculative philosopher, the natural scientist, each in his own way. Into this image and its formation, he places the center of gravity of his emotional life, in order to attain the peace and serenity that he cannot find within the narrow confines of swirling personal experience.”

– Albert Einstein

“I’m struck again by the irony that spaceflight – conceived in the cauldron of nationalistic rivalries and hatreds – brings with it a stunning transnational vision. You spend even a little time contemplating the Earth from orbit and the most deeply ingrained nationalisms begin to erode. They seem the squabbles of mites on a plum.”

– Carl Sagan, Pale Blue Dot

United we stand

You enter a contest. A million dollars is at stake. Forty-one thousand teams from 186 different countries are clamoring for the prize and the glory. You edge into the top 5 contestants, but there is only one prize, and one winner. Second place is the first loser. What do you do?

Team up with the winners, of course.

The Netflix Prize is a competition that is awarding $1,000,000 to whoever can come up with the best improvement to their movie recommendation engine. Their system looks at the massive amounts of movie rental data to try to predict how well users will like other movies. For example, if you like Coraline, you may also like Sweeney Todd. But Netflix’s recommendation engine isn’t great at making predictions, so they decided to offer a bounty to anyone who could come up with a system that delivers a verifiable 10% improvement in Netflix’s prediction accuracy.

The contest recently ended with two teams jockeying for the prize. During the two and a half years the contest was active, several individuals and small groups dominated the leaderboard, with competition among 41,000 teams from 186 different countries. The competition became fierce, and coalitions formed. The team “BellKor’s Pragmatic Chaos” formed from the separate teams “BellKor” (part of the Statistics Research Group in AT&T labs), “BigChaos” (a group of folks who specialize in building recommender systems), and “PragmaticTheory” (two Canadian engineers with no formal machine learning or mathematics training). Another conglomerate team, “The Ensemble”, is made up of “Grand Prize Team” (itself a coalition of members combining strategies to win the prize), “Vandelay Industries” (another mish-mash of volunteers), and “Opera Solutions”.


At first, it looked like BellKor’s Pragmatic Chaos won. But now it looks like The Ensemble won. Netflix says it will verify and announce the winner in a few weeks.

Who the hell cares? Why is this interesting in the slightest? Ten percent seems so insignificant.

Well, predicting human behavior seems impossible. But this contest has clearly shown that some amount of improvement in prediction of complicated human behavior is indeed possible. And what’s really interesting about the winning teams is that no single machine learning or statistical technique dominates by itself. Each of the winning teams “blends” a lot of different approaches into a single prediction engine.

Artificial neural networks. Singular value decomposition. Restricted Boltzmann Machines. K-Nearest Neighbor Algorithms. Nonnegative matrix factorization. These are all important algorithms and techniques, but they aren’t best in isolation. Blending is key. Even the teams in the contest were blended together.

United we stand.

Each technique has its strengths and weaknesses. Where one predictor fails, another can take up the slack with its own unique take on the problem.

BellKor, in their 2008 paper describing their approach, made the following conclusions about what was important in making predictions:

  • Movies are selected deliberately by users to be rated. The movies are not randomly selected.
  • Temporal effects:
    • Movies go in and out of popularity over time.
    • User biases change. For example, a user may rate average movies “4 stars”, but later on decide to rate them “3 stars”.
    • User preferences change. For example, a user may like thrillers one year, then a year later become a fan of science fiction.
  • Not all data features are useful. For example, details about descriptions of movies were significant, and explained some user behaviors, but did not improve prediction accuracy.
  • Matrix factorization models were very popular in the contest. Variations of these models were very accurate compared to other models.
  • Neighborhood models and their variants were also popular.
  • For this problem, increasing the number of parameters in the models resulted in more accuracy. This is interesting, because usually when you add more parameters, you risk over-fitting the data. For example, a naive algorithm that has “shoe color” as an input parameter might see a bank that was robbed by someone wearing red shoes, and conclude that anyone wearing red shoes was a potential bank robber. For another classic example of over-fitting, see the Hindenburg Omen.
  • To make a great predictive system, use a few well-selected models. But to win a contest, small incremental improvements are needed, so you need to blend many models to refine the results.


RMSE (error) goes down as the number of blended predictors goes up. But the steepest reduction in error happens with only a handful of predictors — the rest of them only gradually draw down the error rate.
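
To make the blending idea a bit more concrete, here is a minimal Python sketch. It is not any team's actual method: the ratings and the two "predictors" are made up for illustration, and the blend is just a weighted average whose weight is picked by brute force. A real entry would fit the weights on held-out ratings rather than on the same data it is scored on.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def blend(pred_a, pred_b, weight):
    """Weighted average of two predictors; weight is the share given to pred_a."""
    return [weight * a + (1 - weight) * b for a, b in zip(pred_a, pred_b)]

# Made-up data: true star ratings and two imperfect predictors.
actual = [4, 3, 5, 2, 1]
pred_a = [5, 3, 4, 3, 1]   # misses on some movies
pred_b = [3, 4, 5, 2, 2]   # misses on different ones

# Try weights 0.0, 0.1, ..., 1.0 and keep the one with the lowest RMSE.
best_weight = min((w / 10 for w in range(11)),
                  key=lambda w: rmse(blend(pred_a, pred_b, w), actual))
blended = blend(pred_a, pred_b, best_weight)

print("predictor A:", rmse(pred_a, actual))
print("predictor B:", rmse(pred_b, actual))
print("blend:      ", rmse(blended, actual))
```

Because the two toy predictors make their mistakes on different movies, averaging them cancels part of each one's error, which is the "take up the slack" effect in miniature.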

Yehuda Koren, one of the members of BellKor’s Pragmatic Chaos and a researcher for Yahoo! Israel, went on to publish another paper that goes into more juicy details about their team’s techniques.

I hope to see more contests like this. The KDD Cup is the most similar one that comes to mind. But where is the ginormous cash prize???

(previously)

CTRL-ALT-DNA

You’re sitting at your computer, writing your next awesome computer program. You think, “I want to run my new program. But the computer I have is too slow and too boring to run it on.”

You glance over at the petri dish in your biology lab. “What if I could deploy my program as DNA, and the outcome of my program gets expressed as proteins and genes in a real cell?”

Sounds kind of crazy. But Microsoft is researching this.



An Escherichia coli predator-prey system implemented with a synthetic biology programming language developed by Microsoft researchers.

In their paper Towards programming languages for genetic engineering of living cells, Microsoft UK researchers Michael Pedersen and Andrew Phillips have developed a programming language that translates logical concepts into models of biological reactions in simulators. Reactions that have favorable results have the potential to be synthesized into DNA for insertion into real cells, achieving some level of cyborgian awesomeness that we can only just begin to imagine. (Insert obligatory Blue Screen of Death joke here.)

More info here. And be sure to check out the full paper here.

links for 2009-03-06: Pile o’ toys

This impressive augmented reality demo from GE inserts computer-generated 3D objects into live video. First, watch the short video. Then, try it yourself.
Israeli musician “Kutiman” took a big pile of seemingly random YouTube video clips and used them as instruments in his own musical compositions. I could not stop listening to these. My favorites are tracks 2 and 3. His site is overloaded at the time of this post; for now you can see samples here, here, and here.
Can you be an awesome DJ using nothing but a web browser and your computer’s keyboard? Yes you can.
A curious programmer, inspired by Roger Alsing’s evolution of the Mona Lisa, asks if the technique could be a good way to compress images. Also take a look at the nice online version of the image evolver he wrote, in which you can set your own target image.
Hilarious Livejournal diary done in the style of Rorschach from the Watchmen comic book series.
The Crisis of Credit, Visualized – An extremely well-produced video describing the credit crisis in simple terms.
instantwatcher.com – “Netflix for impatient people”. A remix of the Netflix site that is “about a quadrillion times easier to browse than Netflix’s own site”.
$timator: How much is your web site worth?
Cursebird. A real time feed of people swearing on Twitter. THANK YOU, INTERNET!
Leapfish. An interesting new meta-search engine with a clean interface. “It’s OK, you’re not cheating on Google.”
Twittersheep. “Enter your twitter username to see a tag cloud from the ‘bios’ of your twitter flock.”
PWN! YouTube. This is a great idea. You just type “pwn” in front of “youtube” in the URL, and voilà: instant links for downloading and saving the videos.

Simulated evolution parlor tricks

Here are some interesting tidbits of evolutionary computing to honor Darwin’s birthday yesterday:

Evolution of Mona Lisa

(youtube link)

Roger Alsing’s idea is to start with a random pile of polygons. Random mutations are applied to the polygons. The result is compared to the Mona Lisa source image, and mutations resulting in improvements are kept. Over many generations, the evolved image begins to resemble the Mona Lisa.
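
The mutate, compare, keep loop is simple enough to sketch in a few lines of Python. This is a toy under loud assumptions, not Alsing's program: instead of polygons and a painting, the "image" is a short list of grayscale values, and the names `fitness`, `mutate`, and `evolve` are my own.

```python
import random

def fitness(candidate, target):
    """Lower is better: total squared difference from the target."""
    return sum((c - t) ** 2 for c, t in zip(candidate, target))

def mutate(candidate, rng):
    """Copy the candidate and nudge one randomly chosen value."""
    child = list(candidate)
    i = rng.randrange(len(child))
    child[i] = min(255, max(0, child[i] + rng.randint(-20, 20)))
    return child

def evolve(target, generations, seed=0):
    """Start from random values and keep only mutations that improve the match."""
    rng = random.Random(seed)
    best = [rng.randint(0, 255) for _ in target]
    start_error = fitness(best, target)
    for _ in range(generations):
        child = mutate(best, rng)
        if fitness(child, target) < fitness(best, target):
            best = child
    return best, start_error

target = [0, 64, 128, 192, 255, 192, 128, 64]  # stand-in for a source image
result, start_error = evolve(target, generations=5000)
print("error at start:", start_error)
print("error at end:  ", fitness(result, target))
```

Since a mutation is only kept when it lowers the error, the error can never go up from one generation to the next. The polygon version is the same idea with a much fancier `mutate` and a pixel-by-pixel `fitness`.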

This particular application of genetic algorithms is very popular. See what many other people have tried.

Evolectronica

This site evolves music by generating loops randomly from sounds and effects. Listeners to the site’s audio streams rank the results, and the genetic algorithm creates “baby loops” for the listeners to rank.

CSS Evolve

This site shows you variations of a web site’s cascading style sheets. You pick the best results, and their genetic algorithm breeds them to create new styles for the web site.
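
I don't know how CSS Evolve does its breeding internally, but the general idea of crossing two favorites can be sketched in Python like this. Everything here is hypothetical: the `OPTIONS` table, the `breed` and `to_css` helpers, and the two "favorite" styles are made up for illustration.

```python
import random

# Hypothetical pool of values each style property may take.
OPTIONS = {
    "color": ["black", "navy", "maroon"],
    "font-size": ["12px", "14px", "16px"],
    "background": ["white", "ivory", "lightgray"],
}

def breed(parent_a, parent_b, rng, mutation_rate=0.1):
    """Build a child style: each property comes from one parent at random,
    and is occasionally replaced by a random value (mutation)."""
    child = {}
    for prop in OPTIONS:
        child[prop] = rng.choice([parent_a[prop], parent_b[prop]])
        if rng.random() < mutation_rate:
            child[prop] = rng.choice(OPTIONS[prop])
    return child

def to_css(selector, style):
    """Render a style dictionary as a one-line CSS rule."""
    body = " ".join(f"{prop}: {value};" for prop, value in style.items())
    return f"{selector} {{ {body} }}"

rng = random.Random(1)
# Pretend the visitor picked these two styles as favorites.
favorite_a = {"color": "black", "font-size": "12px", "background": "white"}
favorite_b = {"color": "navy", "font-size": "16px", "background": "ivory"}
children = [breed(favorite_a, favorite_b, rng) for _ in range(4)]
for child in children:
    print(to_css("body", child))
```

Repeat the pick-and-breed step a few times and the styles drift toward whatever the person doing the picking happens to like.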

Every letter is powerful

A fun nugget from my new favorite blog, Futility Closet:

Show this bold Prussian that praises slaughter, slaughter brings rout. Teach this slaughter-lover his fall nears.

Grim, no? But remove the first letter of each word and the mood changes:

How his old Russian hat raises laughter — laughter rings out! Each, his laughter over, is all ears.
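
If you'd rather let a computer do the beheading, here is a small Python sketch. The `behead` helper is my own name for it. It treats any run of letters as a word, so “slaughter-lover” counts as two words, and it makes no attempt to fix up capitalization or punctuation.

```python
import re

def behead(text):
    """Drop the first letter of every run of letters."""
    return re.sub(r"[A-Za-z]+", lambda m: m.group()[1:], text)

grim = ("Show this bold Prussian that praises slaughter, slaughter brings rout. "
        "Teach this slaughter-lover his fall nears.")
print(behead(grim))
```

Note that a one-letter word such as “a” would vanish entirely, so not every sentence survives the treatment.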

Check out Futility Closet for more fascinating curiosities tinted with language, math, science, antiquity, puzzles, and amusement. I especially enjoy The Random Item Button.

links for 2008-10-30

MTV Music – huge archive of linkable and embeddable music videos. Finally! MTV does something music-related!

Roanoke Robotics Society and Club – We actually have a Robotics club in little ol’ Roanoke? Very surprising! In fact, they are hosting a robotics competition event at our science museum this Saturday. We’re definitely taking Iris to this!

Terrorist ‘tweets’? US Army warns of Twitter dangers – Microblogging: A platform for Jihadists?

Paris Hilton In Space – “Paris Hilton will be among the passengers on Richard Branson’s first Virgin spaceflight.” OK, this article just made space less cool. We send enough junk into space as it is.

Sarah Palin Cabbage Patch dolls. *Wink*

links for 2008-10-24

My del.icio.us auto-posting doodad no longer works. Automation has failed me. *Cry*.

So, here are some hand-cranked, slow-cooked links for today.

X-rays detected from Scotch tape. Incredible. Maybe somebody will exploit this to take a peek inside of Christmas presents.

How the Weird Mars Science Laboratory Floating Sky Crane Works. Very cool video showing the landing procedure of an upcoming Mars mission. GET YOUR ASS TO MARS!

ALIPR – Automatic Photo Tagging and Visual Image Search. Free photo auto-tagging service. I had very mixed results on the few tests I attempted.

Cockroach inspired robot from CWRU’s biorobotics lab. More roachbots are coming. And your sprays will not help you.


Housing prices infographic.

The bursting housing bubble is painfully clear in this graph. I’ve heard some projections that the decline in housing prices won’t bottom out until 2011.