More advice on giving talks

I’d like to point out Alex Dimakis’s great post from last week on how to give a great ISIT talk (or any talk, really), and to add a few more bits about a replicable process for writing talks that might help students trying to write their first (or second, or third…) talk. As with any form of writing or communication, an unclear talk is usually the product of unclear thought on the part of the presenter. There’s no shortage of advice out there on how to give talks, but I figured I’d write down the process I use, in the hope that it streamlines things.

Before I get into the list, I have to issue a disclaimer: it’s not that I think I am super-great at giving talks. I am pretty good at critiquing talks to death, however. It must be the theater critic in me. What I have done is write down a process that has worked for me…

  1. Who is your audience? This is the place to start. If you are giving a talk at a conference, you may have a sense of what the interests of the audience are, but that is only part of it. Do you want your talk to be accessible only to faculty in your sub-area? (Hint: NO). Graduate students? If so, how much background should they have? Most of your talk should be accessible to a certain group — identify that group and target the bulk of the material to them. If you are giving a job talk, the audience is a different group than at a conference or a specialized workshop.
  2. What do you want them to know? You have to be able to summarize what you want to say in one sentence. It’s not going to encapsulate everything, but it should be what you want your target audience to learn from your talk. Write this down and think about it. It’s not easy to summarize your work in a sound bite.
  3. Outlining the talk. Make an outline of the talk that contains one sentence per slide. Each slide should have a single main point that you can write down before adding any content to the slide. Read the sentences in order. It should be a story — does the story flow well? Does it make sense? Do you really need a table of contents slide for a 20-minute talk?
  4. Filling it in. For each sentence, think about how best to explain that point. Pictures are great, loads of equations are not. How can you, in a minute or two, make that point while talking, and what do you need visually to help emphasize that point? This is like storyboarding for a film.
  5. Balancing the content. Now that you have a plan for each slide, check to see that it’s balanced. Too many slides with text will fatigue people. Too many slides with just pictures may lead to confusion.
  6. Fill it in. Fill in the text and make the figures. This will take time, but since you have the sentence on each slide and the plan you should be able to think clearly through what you need to put down on each slide.
  7. Practice. Practice giving the talk once, out loud (not just in your head). You will find mistakes on the slides. Fix those mistakes. Practice again to find more mistakes. You might discover that the story doesn’t flow as well as you thought so you have to go back and retweak things. But always keep in mind the story.
  8. Presentation = Invitation. People say a talk is an invitation to read your paper, but that is not really true, I think. Chances are that more than half the audience will not read your paper anyway, but you still need to teach them something. Of the remaining half, most may skim your paper in the proceedings; you have to make that process easier for them. For the hardcore few who will really spend time reading your paper (because they are reviewing it), you want them to be excited by that prospect.
  9. Planning for contingencies. People get derailed by things like having backup slides with all the details of the proofs. Making backup slides, which are seen 1% of the time, takes away time from making the main presentation, which is seen 100% of the time. Focus on making the main presentation good.

ISIT: plenaries and thoughts

Just a few brief notes on the plenaries. Prakash Narayan gave a nice talk on his work on secrecy generation and related problems. It was nice because it tied together a number of different models in one talk so that if you were someone who had only looked at wiretap problems you could see a more unified approach to these problems. It was a little technical for my pre-breakfast brain though. Ueli Maurer gave an overview of his new approach to cryptography — I had seen a version of this before, and it was full of pictures to illustrate the reductions and interfaces he was trying to create. I think if I had more of a background in formal CS-style cryptography I might have understood it a bit better. It feels like trying to build a different style of bridge between theory (formal reasoning about security) and practice.

Abbas El Gamal gave a rather personal Shannon Lecture, taking us through a number of stages in his research life, together with some perspectives on his new book with Young-Han Kim on network information theory. He ended by calling for the IT community to really go and tackle new problems and develop new tools and models to do that. One of the things that came across more sharply for me in this ISIT, partly due to the Cover memorial, is that information theory really is a research community. Of course, there are groups and cliques and politics and problems, but each ISIT is a real coming together that reinforces that sense of community. That’s valuable.

ISIT 2012: more talks

Since I am getting increasingly delayed by post-ISIT and pre-SPCOM business, I am going to have to keep the rest of blogging about ISIT a little short. This post will mention some talks, and I’ll keep the other stuff for a (final) post.

Efficient Tracking of Large Classes of Experts
András György, Tamás Linder, Gábor Lugosi
This paper was on expanding the reference class against which one is competing in a “prediction with experts” problem. Instead of doing well against the best single expert chosen in hindsight, you compete against the best meta-expert, which can switch between the existing experts. This leads to a transition diagram that is kind of complicated, but they propose a unifying approach which traces along branches — the key is that every transition path can be well approximated, so the space of possibilities one is tracking does not blow up tremendously.
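The classical baseline for competing against a switching meta-expert is the fixed-share variant of exponential weights (due to Herbster and Warmuth), where a small fraction of weight is redistributed to all experts each round so the forecaster can follow switches. This is background, not the algorithm in the paper; the function name and parameters below are my own. A minimal sketch:

```python
import math

def fixed_share(expert_losses, eta=0.5, alpha=0.05):
    """Track the best switching expert via the fixed-share update.

    expert_losses: list of rounds, each a list of per-expert losses in [0, 1].
    Returns the weight vector the forecaster uses at each round.
    """
    n = len(expert_losses[0])
    w = [1.0 / n] * n
    history = []
    for losses in expert_losses:
        history.append(w[:])
        # Exponential-weights update on the observed losses.
        v = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
        z = sum(v)
        v = [vi / z for vi in v]
        # Fixed-share step: mix a little weight back to every expert,
        # so the forecaster can recover when the best expert changes.
        w = [(1 - alpha) * vi + alpha / n for vi in v]
    return history
```

With two experts whose roles swap halfway through, the weights track expert 0 in the first half and shift to expert 1 in the second — exactly the “best meta-expert that switches” behavior.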

Information-Theoretically Optimal Compressed Sensing via Spatial Coupling and Approximate Message Passing
David Donoho, Adel Javanmard, Andrea Montanari
What a trendy title! Basically this looks at the compressed sensing problem when the sensing matrix is banded (this is what “spatially coupled” means), and solves it using Bayesian approximate message passing to do progressive decoding and elimination. The optimality is in the sense of matching the Rényi dimension of the signal class. I alas did not take notes for the next talk, which also seemed related: Hybrid Generalized Approximate Message Passing with Applications to Structured Sparsity (Sundeep Rangan, Alyson Fletcher, Vivek Goyal, Philip Schniter)

Quantized Stochastic Belief Propagation: Efficient Message-Passing for Continuous State Spaces
Nima Noorshams, Martin Wainwright
This problem was on BP when the state space is continuous — instead of passing the whole belief distribution, nodes pass along samples from the distribution and the receiving node does a kind of interpolation/estimate of the density. They show that this process converges on trees. This is related to a problem I’ve been thinking about for decentralized inference, but with a different approach.

Synchrony Amplification
Ueli Maurer, Björn Tackmann
This was a cool talk on a framework for thinking about synchrony in clocks — the model is pretty formal, and it’s something I never really think about but it seemed like a fun way to think about these problems. Basically they want to formalize how you can take a given clock (a sequence of ticks) and convert it into another clock. The goal is to not throw out too many ticks (which equals slowdown), while achieving synchrony.

Non-coherent Network Coding: An Arbitrarily Varying Channel Approach
Mahdi Jafari Siavoshani, Shenghao Yang, Raymond Yeung
Of course I have to go to a talk with AVC in the title. This looks at the same operator channel for network coding, but they assume the network matrix may be arbitrarily varying (with known rank). In this model they can define all the usual AVC concepts, and they get the same sorts of results you see for AVCs, like dichotomies between deterministic coding with average error and randomized coding.

Alternating Markov Chains for Distribution Estimation in the Presence of Errors
Farzad Farnoud, Narayana Prasad Santhanam, Olgica Milenkovic
This talk was on the repetition channel and bounding the redundancy of alternating patterns. They show upper and lower bounds. The idea is that you start with a word like abccd, which goes through a repetition channel to yield, for example, aaabbcccdddd; you then look instead at the alternating pattern abcd obtained by merging repeated letters.
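The “merge repeated letters” operation is a one-liner with `itertools.groupby`; this is just my illustration of the setup, not anything from the paper:

```python
from itertools import groupby

def merge_repeats(word):
    """Collapse runs of repeated letters, e.g. 'aaabbcccdddd' -> 'abcd'.

    This is the alternating pattern of a word: what survives a repetition
    channel, since run lengths are no longer trustworthy.
    """
    return "".join(ch for ch, _ in groupby(word))
```

Note that the original word abccd also merges to abcd: the double c in the input is indistinguishable from channel repetitions, which is why one works with alternating patterns in the first place.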

On Optimal Two Sample Homogeneity Tests for Finite Alphabets
Jayakrishnan Unnikrishnan
A two-sample test means you have two strings x^n and y^n and you want to know if they are from the same distribution. He looked at the weak convergence of the asymptotically optimal test to get bounds on the false alarm probability.
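A standard statistic in this finite-alphabet setting (not necessarily the test analyzed in the talk) is the GLRT-style comparison of each empirical distribution against the pooled one; a sketch, with hypothetical function names:

```python
import math
from collections import Counter

def kl(p, q):
    """KL divergence D(p || q) in nats over a finite alphabet (dicts)."""
    return sum(pi * math.log(pi / q[a]) for a, pi in p.items() if pi > 0)

def two_sample_stat(x, y):
    """GLRT-style homogeneity statistic for two strings over a finite
    alphabet: n D(p_x || p_pool) + m D(p_y || p_pool), where p_pool is
    the empirical distribution of the pooled sample. Large values
    suggest the samples came from different distributions."""
    n, m = len(x), len(y)
    p_x = {a: c / n for a, c in Counter(x).items()}
    p_y = {a: c / m for a, c in Counter(y).items()}
    pool = {a: c / (n + m) for a, c in Counter(x + y).items()}
    return n * kl(p_x, pool) + m * kl(p_y, pool)
```

The statistic is zero when the two empirical distributions coincide, and the test declares “different distributions” when it exceeds a threshold; the talk’s contribution concerns how to set that threshold to control the false alarm probability.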

Hypothesis testing via a comparator
Yury Polyanskiy
This was on a model where two nodes get to observe X^n and Y^n drawn i.i.d. from either P_{XY} or Q_{XY}, and they separately compress their observations into messages W_1 and W_2. The decision rule is to decide P_{XY} if W_1 = W_2. What’s the best exponent?

The Supermarket Game
Jiaming Xu, Bruce Hajek
This was on queueing. Customers come in, sample the loads of L queues, and then pick one to join. Their strategies may differ, so there is a game between the customers, and this can affect the distribution of queue sizes. As a flavor of the weird stuff that can happen, suppose all customers but one sample only one queue and join it. Then the remaining customer will experience less delay by sampling two queues and joining the shorter one. However, if all but one sample two and join the shorter, then the remaining customer does better by sampling just one. At least, that’s how I understood it. I’m not really a queueing guy.
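The benefit of sampling two queues instead of one is the classic “power of two choices” effect. Here is a toy discrete-time sketch of it — my own illustrative parameters and dynamics, not the model or game from the paper:

```python
import random

def simulate(num_queues, num_steps, sample_size, service_prob=0.5, seed=0):
    """Toy discrete-time queueing sketch: each step one customer arrives,
    samples `sample_size` queues uniformly at random, and joins the
    shortest sampled one; each nonempty queue then completes a service
    with probability `service_prob`. Returns the time-averaged mean
    queue length. (Illustrative only -- not the paper's model.)
    """
    rng = random.Random(seed)
    queues = [0] * num_queues
    total = 0.0
    for _ in range(num_steps):
        sampled = rng.sample(range(num_queues), sample_size)
        shortest = min(sampled, key=lambda i: queues[i])
        queues[shortest] += 1
        for i in range(num_queues):
            if queues[i] > 0 and rng.random() < service_prob:
                queues[i] -= 1
        total += sum(queues) / num_queues
    return total / num_steps
```

Run near heavy load (5 queues, per-queue service probability 0.25, so utilization around 0.8) and the average queue length with `sample_size=2` comes out well below that with `sample_size=1`, which is the intuition behind the one deviating customer’s incentive.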

Collaborative paper filtering?

At ISIT 2012, there were posters up for a site called ShareRI.org: Share Research Ideas, an initiative of a student at UIUC named Quan Geng. It’s a platform for posting and discussing papers — sort of like creating a mini-forum around arXiv posts. It seems to be just starting out, but I figured I would post the link to see if others take it up. I imagine that as things scale up it might run into problems similar to Wikipedia’s, with trolling etc., but it’s an interesting idea which has come up before in discussions with the IT Society Online Committee, for example.

Tracking the origin of genies

Lalitha Sankar asked Gerhard Kramer about my earlier question about genies. Gerhard wrote:

I got the name from Jim Massey who had suggested it as part of a title for the thesis of another doctoral student I know.

I have heard this attributed to Gallager, but the word “genie” might even come up in the Wozencraft-Jacobs book from the mid-60s (not sure!). I suspect that it goes back even further.

A little further searching along those directions turned up some more hits. On page 366 of Viterbi and Omura’s 1978 text Principles of Digital Communication and Coding, while discussing the distribution of computation in convolutional codes they write “[W]e begin by considering a sequential decoder aided by a benevolent genie who oversees the decoder action on each incorrect subset.”

But indeed, as Gerhard indicates, there is a reference in Wozencraft and Jacobs (1965). In their paper on rate splitting, Rimoldi and Urbanke write “[C]onceptually, we follow the lead of Wozencraft and Jacobs [29, p. 419] and postulate a genie who always knows the codeword of user 2…” Following up on that pointer, Wozencraft and Jacobs, discussing the decoding of convolutional codes, write “… assume initially that a magic genie directs the decoder to the correct starting node for determining each \hat{x}_h…”

In the bibliographic notes in Viterbi and Omura, they write

As was noted previously, the original sequential decoding algorithm was proposed and analyzed by Wozencraft [1957]. The Fano algorithm [1963], with various minor modifications, has been analyzed by Yudkin [1964], Wozencraft and Jacobs [1965], Gallager [1968], and Jelinek [1968a]. Two versions of stack algorithms and their performance analyses are due to Zigangirov [1966] and Jelinek [1969a]. The precise form of the Pareto distribution on computation emerged from the works of Savage [1966] for the upper bound, and of Jacobs and Berlekamp [1967] for the lower bound.

So if the argument is due to Wozencraft, its source in this context is probably the Wozencraft and Jacobs book, but the credit for the genie analogy itself is probably lost to time.

Linkage

I’m being lazy about more ISIT blogging because my brain is full. So here are some links as a distraction.

Via John, George Boolos’s talk entitled Gödel’s Second Incompleteness Theorem Explained in Words of One Syllable.

D’Angelo is back!

This short video about a subway stair in New York is great, especially the music.

Crooked Timber is on a tear about workplace coercion and its proponents.

Luca’s thoughts on the Turing Centennial are touching.

Converse genie etymology

In his Shannon Lecture, Abbas mentioned that his work with Costa on deterministic interference channels was the first to use “genie-aided” converse arguments (essentially assuming the decoder has more information), but that they “lacked the imagination” to use that name. The question is, who did come up with the term “genie” in connection with converses?

ISIT 2012: ah, the memories

I am at MIT for ISIT 2012. All of the sessions are in the student center (W20) and in Kresge, so I am having serious flashbacks to my days of doing theater as an undergrad. Also some vague flashbacks to sneaking into sessions at ISIT 1998. I think I saw Jack Wolf give a great talk on group testing in W20-407 back then.

The program feels pretty packed this year – 9 parallel sessions! I will blog about some of the talks, but as usual I may not be super timely. Perhaps there is a Twitter hashtag for ISIT, but honestly, how many information theorists use Twitter?