JSM Day 4, almost there

(by Julien Cornebise)

The intense pace and social life (on which I’ll dwell in tomorrow’s post) are starting to take their toll: this morning’s sessions were far less attended than those of the previous two days. The dance party yesterday probably did not help. Not claiming to be better than anyone else, I shamelessly skipped the early sessions so as to be fully focused for Chris Holmes’ Medallion Lecture at 10:30, preceded by a luxurious breakfast of M&M’s cookies from 7/11, yay! Long live the US and their food(ish).

Chris’s lecture was a model of clarity. With an unhurried pace leaving no room for boredom (which could have hit after this many sessions), he took the audience all the way from hidden Markov models (HMM) to specially designed Loss Functions to his applications in genome-based oncology. Ideas were flowing smoothly and naturally, striking a perfect balance between formalism and intuition.

He started by showing the kind of changepoint-detection/HMM problems he was facing, which actually stem from a hybrid supervised/unsupervised problem: you know the number of classes and their characteristics, but have no training data. He then showed how the off-the-shelf Viterbi maximum a posteriori estimate and the forward-backward marginal maximum perform well but are not flexible enough: they offer no way to tune the trade-off between false-positive and false-negative discoveries. He then explained what the ideal loss function on the whole path would be for his problem, just to shatter it to pieces with an algorithmic complexity analysis, which made his following solution pitch-perfect: use a k-th order Markovian loss function.
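To make the contrast between the two off-the-shelf decoders concrete, here is a minimal sketch on a toy two-state HMM. Everything in it is an assumption made for illustration: the parameters `pi`, `A`, `B`, the observation sequence, and the function names are invented, and `weighted_calls` only implements a pointwise (order-zero) loss with separate false-positive and false-negative costs. It is not the k-th order Markovian loss from the lecture, just the simplest way to show what "tuning" the two error types can mean.

```python
# Toy two-state HMM: state 0 = "normal", state 1 = "aberrant".
# All numbers below are made up for illustration only.

def viterbi(obs, pi, A, B):
    """Most probable whole state path (MAP over paths)."""
    n = len(pi)
    delta = [pi[k] * B[k][obs[0]] for k in range(n)]
    back = []
    for o in obs[1:]:
        prev = delta
        # Best predecessor of each state k at this step.
        ptr = [max(range(n), key=lambda j: prev[j] * A[j][k]) for k in range(n)]
        delta = [prev[ptr[k]] * A[ptr[k]][k] * B[k][o] for k in range(n)]
        back.append(ptr)
    state = max(range(n), key=lambda k: delta[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def posterior_marginals(obs, pi, A, B):
    """Forward-backward: P(state_t = k | all observations) for each t."""
    n, T = len(pi), len(obs)
    alpha = []
    for t in range(T):
        if t == 0:
            cur = [pi[k] * B[k][obs[0]] for k in range(n)]
        else:
            cur = [sum(alpha[-1][j] * A[j][k] for j in range(n)) * B[k][obs[t]]
                   for k in range(n)]
        s = sum(cur)
        alpha.append([c / s for c in cur])  # rescale to avoid underflow
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        raw = [sum(A[k][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
               for k in range(n)]
        s = sum(raw)
        beta[t] = [r / s for r in raw]
    gamma = []
    for a, b in zip(alpha, beta):
        w = [x * y for x, y in zip(a, b)]
        s = sum(w)
        gamma.append([x / s for x in w])
    return gamma


def weighted_calls(gamma, fp_cost=1.0, fn_cost=1.0):
    """Pointwise Bayes decision: call state 1 when its expected loss is lower.

    Calling 1 costs fp_cost * P(state 0); calling 0 costs fn_cost * P(state 1).
    """
    threshold = fp_cost / (fp_cost + fn_cost)
    return [1 if g[1] > threshold else 0 for g in gamma]


pi = [0.9, 0.1]
A = [[0.95, 0.05], [0.10, 0.90]]
B = [[0.8, 0.2], [0.3, 0.7]]  # B[state][symbol]
obs = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0]

map_path = viterbi(obs, pi, A, B)
gamma = posterior_marginals(obs, pi, A, B)

print("Viterbi path      :", map_path)
print("Marginal maximum  :", weighted_calls(gamma))
print("Cheap false alarms:", weighted_calls(gamma, fp_cost=1.0, fn_cost=5.0))
print("Costly false alarms:", weighted_calls(gamma, fp_cost=5.0, fn_cost=1.0))
```

With equal costs, `weighted_calls` reduces to the usual marginal maximum; changing the cost ratio moves the threshold on the posterior probability, which is exactly the knob that neither plain Viterbi nor the plain marginal maximum provides.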

All of this was progressively illustrated on a running-example dataset with simple yet informative graphs, much better than any final recapitulating table would have done. He then spent the last third of his lecture on his real-world applications in colon cancer genomics, going from the algorithmic considerations to the more classical statistical questions of which distributions to use and how to deal with the mixture of population subtypes, and extending to a sequential (longitudinal) model that tracks the changes in the same patient over time and treatment.

On a side note, I especially appreciated his introductory memories of finding the theoretical chapters of Bernardo and Smith’s book mysterious and too abstract when he first came to Imperial, only to gain, years later, more and more appreciation for those very aspects of decision theory that led him to attack these challenging problems from this winning angle. As more of a theoretician/algorithmician than an applied statistician, I am convinced that, once a practical problem is defined, taking a step back to a more abstract level can bring tremendous gains: first focus on the problem, then step back to an abstract, bigger picture, develop in this generic setting, and finally zoom back in on the concrete problem, and then on all its neighbors that can now be tackled with the same tools!
Really a lecture that I was glad to attend.

I then headed to the afternoon session Beyond Pharmacokinetics: Recent Advances in Science and Methodology, a great follow-up to last summer’s SAMSI program on Pharmacokinetics/Pharmacodynamics, with a non-empty intersection of speakers! Although I missed the first talk and a half (didn’t see the time pass while in a discussion), I heard enough of the second talk to realize the amount of work that went into it: when the speaker thanks this and that person for “setting up the robotics that were used in the sequencing”, you know you just missed a good talk! The third and last talk was on familiar topics, and I was struck by its links to earlier talks of the week, especially Sylvia Richardson’s Medallion Lecture yesterday: just like her, Michele Guindani aims at clustering the profiles of the patients, but does so by using nonparametric Bayes (NPB) and Dirichlet processes. Of course, now that he mentions it, I recall that Sylvia also mentioned NPB in the part of her talk about mixtures, and I also remember Peter Mueller’s tutorial about using NPB to cluster patients into subpopulations, but it only just now clicked into place. That’s an advantage of having all those talks concentrated over 3 days: it’s never far from one to the next!

A disheartening point in the Q&A session, though: while, during the whole congress, I have been impressed by the way the usual controversies were smoothed out or, better, bridged, this carebear-spree ended when one of the attendees tried to pry his own method into the talk, with what seemed the biggest oratory forceps ever known to academics. The method seemed to me so prehistoric and limited in its applications that I am truly wondering which of the two of us totally missed the point! Anyway, this is anecdotal in view of the great talks and debates I attended this week, and I am still impressed.

Tomorrow is the last day of this JSM 2011. The Exhibit Hall has already been dismantled, and it is unclear if the free wifi will still be there! That is indeed one of the biggest anguishes I can live through, but I’ll nonetheless end up posting eventually, even if that means having to go steal wifi on the beach near the big hotels… Life is hard 😉


The Statistics Forum, brought to you by the American Statistical Association and CHANCE magazine, provides everyone the opportunity to participate in discussions about probability and statistics and their role in important and interesting topics.

The views expressed here are those of the individual authors and not necessarily those of the ASA, its officers, or its staff. The Statistics Forum is edited by Andrew Gelman.
