### August 2011

Monthly Archive

August 29, 2011

This is my last concert in San Diego, what a shame! It should be a good one!

**In Memoriam: Marking the tenth anniversary of 9/11**

W.A. Mozart: *Requiem in D minor, K. 626 (Levin completion)*

Bach Collegium San Diego

Ruben Valenzuela, conductor

Claire Fedoruk (Soprano)

Angela Young Smucker (Alto)

Pablo Corá (Tenor)

Mischa Bouvier (Bass)

St. James by-the-Sea Episcopal Church

San Diego, California

Friday, 16 September 2011

7:30 PM

Pt. Loma Nazarene University: Crill Hall

San Diego, California

Saturday, 17 September 2011

7:30 PM

Tickets available online.

August 19, 2011

Posted by Anand Sarwate under Uncategorized | Tags: statistics, war
A while back I heard this NPR story about a method for predicting attacks that was published in Science. The news story left me a bit perplexed — where was the prediction? It turns out to be yet another version of attack of the power laws. After a bit of curve fitting, the meat of the thing seems to be this:

Our broad-brush theory does not require knowledge of specific adaptation or counteradaptation mechanisms, and hence bypasses issues such as changes in insurgent membership, technology, learning, or skill set, as well as a need to know the hearts and minds of local residents. We regard the escalation of hostilities as representing adaptation and counteradaptation in a way that is analogous to the Red Queen hypothesis of evolutionary biology. (Johnson et al. 2011)

Hearts and minds: pish-posh, what nonsense. Bring on the Red Queen!

This continual arms race of adaptation and counter-adaptation suggests similarities with Darwinian selection in nature. As noted by Rafe Sagarin, “a fundamental tenet of evolutionary biology is that organisms must constantly adapt just to stay in the same strategic position relative to their enemies—who are constantly changing as well. For example, to protect its DNA against viruses, a host organism must continually change the access code to its genetic material” (Sagarin, 2003, p. 69). Meeting the Red Queen in Alice in Wonderland, Alice finds that however fast she runs, she always stays in the same place. The “Red Queen” concept has become widely used in evolutionary biology to describe how competing individuals can become locked in an “arms race” of strategies, machinery, or weapons. However impressive one side becomes, it may never come any closer to defeating its opponent. (Johnson 2009)

Unfortunately, a Red Queen-like equilibrium is not possible. Since they’ve already thrown out the idea of modeling the underlying mechanisms, they’re left with fitting the parameters of a stochastic model:

However, instantaneous and perfect counteradaptation is unrealistic; indeed, complex adaptation-counteradaptation dynamics generated by sporadic changes in circumstances imply that R’s temporal evolution is likely to be so complex that it can be modeled as a stochastic process. We do not need to know exactly why R changes at any specific moment, nor do the changes in R have to have the same value, because each change is the net result of a mix of factors [such as learning by experience or changes in personnel and technology] for each opponent. (Johnson et al. 2011)

They then go on to say that once you fit the parameters, you could estimate the momentum from a few fatal days and thereby estimate the timing of the next fatal day. But they don’t actually do this (say, by holding out some data and doing a little validation), apart from an example of a single prediction being more accurate than one might expect.
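The “momentum” idea is just curve fitting plus extrapolation. Here is a minimal sketch of what that looks like (the `intervals` data below are made up for illustration, not the paper’s data or code): fit the progress-curve model $\tau_n = \tau_1 n^{-b}$ by linear regression on a log-log scale, then extrapolate one step.

```python
import numpy as np

# Made-up intervals (days) between successive fatal days -- illustrative only.
intervals = np.array([100.0, 60.0, 45.0, 38.0, 30.0, 27.0, 24.0])
n = np.arange(1, len(intervals) + 1)

# Progress-curve model: tau_n = tau_1 * n**(-b).
# Taking logs gives a line: log tau_n = log tau_1 - b * log n.
slope, log_tau1 = np.polyfit(np.log(n), np.log(intervals), 1)
b, tau1 = -slope, np.exp(log_tau1)

# The "prediction": extrapolate the fitted curve one step ahead.
next_interval = tau1 * (len(intervals) + 1) ** (-b)
print(f"fitted b = {b:.2f}, tau_1 = {tau1:.1f} days")
print(f"extrapolated interval to next fatal day: {next_interval:.1f} days")
```

Without holding out data to validate such extrapolations, though, this remains a description of the past rather than a prediction of the future.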

So in the end, what do we have? This is a paper about *modeling* and not about *prediction*, but prediction sounds a lot more sexy, and so NPR ran with that aspect of it. In order to build a predictor, you would need to build a good model, which presumably means sitting around and waiting for fatal attacks to happen because you can’t make an ~~omelette~~ predictive model without ~~breaking some eggs~~ killing some people. Finally, “ZOMG P0W3R L4WZ” is so the aughts.

August 12, 2011

I also attended HealthSec ’11 this week, and the program was a little different than what I had expected. There was a mix of technical talks and policy/framework proposals around a couple of themes:

- security in medical devices
- auditing in electronic medical records
- medical record dissemination and privacy

In particular, a key challenge coming up in healthcare is how patient information will be handled in the health insurance exchanges (HIEs) to be created as part of the Affordable Care Act. The real question is the threat model for health information: hackers who want to steal health records wholesale, or third parties (e.g. insurance companies) selling the data. Matthew Green from Dartmouth discussed implications of the PCAST report on Health Information Technology, which I will have to read.

The most interesting part of the workshop was the panel on de-identification and whether it is a relevant or useful framework moving forward. The panelists were Sean Nolan from Microsoft, Kelly Edwards from the University of Washington, Arvind Narayanan from Stanford, and Lee Tien from the EFF. Sean Nolan talked a bit about how HIPAA acts as an impediment to exploratory research, which I have worked on a little, but also raised the thorny ethical issue of public good versus privacy, which is key to understanding the debate over health records in clinical research. Edwards is a bioethicist and made some very important points about how informed consent is an opportunity not only to educate patients about their (potential) role in medical research, but also to make them feel like informed participants in the process. How we phrase the tradeoff Nolan mentioned is really a question of ethics in how we communicate it to patients. Narayanan (famous for his Netflix deanonymization) argued that the relationship between technology and policy has to be rethought, turned into more of a dialogue than a blame-shifting or challenge-posing exercise. Lee Tien made a crucial point: if we do not understand how patient data moves about in our existing system, then we have no hope of reform or regulation, and no stakeholder in the system now has that “bird’s-eye view” of these data flows.

I hope that in the future I can contribute to this in some way, but in the meantime I’ve been left with a fair bit to chew on. Although the conference was perhaps a bit less technical than I would have liked, I think it was quite valuable as a starting point for future work.

August 11, 2011

This week I attended EVT/WOTE ’11, a workshop on voting, technology, and trustworthiness co-located with the USENIX Security conference. I phased in and out of the workshop, which had a number of different session themes:

- “E2E”, or end-to-end voting
- empirical studies of real elections and of direct-recording electronic voting machines (DREs), forensics for them, and their impact on the reliability of election outcomes
- studies of accessibility issues, either to polling places or for different voting technologies
- new proposals for voting systems
- auditing and metrics for existing election systems

I pretty much work on the last one, so while some of the panels were quite interesting, some technical talks were a little beyond me. Dana DeBeauvoir, the Travis County Clerk (where Austin is), gave a keynote about how she thinks technologists and elections officials can work together, and rather bravely put forward a proposal for an electronic voting system to be used at the county level. There were lots of comments about that, of course.

Theron Ji gave a nice talk about how write-in marks are (or are not) properly counted by optical scan machines. People often forget to fill in the bubble saying they are doing a write-in, which can have pretty disastrous effects, as San Diego politician Donna Frye found out. Gillian Piner reported on a survey she did of vision-impaired voters, asking them what they want for accessibility technologies. Imagine that, asking them what they want!

The two talks of most interest to me were by David Cary and Tom Magrino, both on the margin of victory in IRV elections. Cary presented a method for estimating the margin based only on the tabulation of first choices in each round, whereas Magrino presented an exact calculation that involves solving many integer linear programs. The scaling with the number of candidates is not so great (exponential), but for the kind of IRV elections we see in the US it is definitely doable. Margin calculations are important for developing auditing algorithms (on which I will write more once my paper is done). Philip Stark gave a plenary lecture on auditing, part of which I missed due to a conflict with a parallel workshop.
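To see what IRV tabulation involves, here is a toy round-by-round count (my own sketch for illustration; it is not Cary’s estimator or Magrino’s ILP formulation, and ties are broken arbitrarily):

```python
from collections import Counter

def irv_winner(ballots):
    """Instant-runoff tabulation: each ballot ranks candidates in
    preference order; the last-place candidate is eliminated each round."""
    remaining = {c for ballot in ballots for c in ballot}
    while True:
        # Count each ballot for its top-ranked remaining candidate.
        tallies = Counter({c: 0 for c in remaining})
        for ballot in ballots:
            for choice in ballot:
                if choice in remaining:
                    tallies[choice] += 1
                    break
        total = sum(tallies.values())
        leader, votes = tallies.most_common(1)[0]
        if votes * 2 > total:
            return leader
        # No majority: eliminate the candidate with the fewest votes.
        remaining.remove(min(remaining, key=lambda c: tallies[c]))

ballots = [["A", "B"], ["A", "C"], ["B", "C"], ["B", "C"], ["C", "B"]]
print(irv_winner(ballots))  # C is eliminated first; B then beats A 3-2
```

The margin of victory asks how many ballots must change to alter the outcome, and because changed ballots can reorder the eliminations, computing it exactly is much harder than the tabulation itself.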

There were also some interesting panels. The most contentious one was on internet voting; I missed much of it, but the discussion ran over by an hour, so I think I got the gist. Some people are afraid of voting over the internet, but the crypto people think it can be made safe. The panel on the Sarasota House race in 2006 tried to home in on the reason for the undervote problems in that contest. A lot can be explained by the design of the ballot, proving once again that user interface and graphic design really matter!

The rump session was, as always, a mixture of amusing and technical and dry. The real highlight was probably David Bismark, who seems to have antagonized someone who has a new voting system involving moon projections. Wow.

August 11, 2011

From Section 15360 of the California Elections Code:

(a) During the official canvass of every election in which a voting system is used, the official conducting the election shall conduct a public manual tally of the ballots tabulated by those devices, including vote by mail voters’ ballots, cast in 1 percent of the precincts chosen at random by the elections official. If 1 percent of the precincts is less than one whole precinct, the tally shall be conducted in one precinct chosen at random by the elections official.

In addition to the 1 percent manual tally, the elections official shall, for each race not included in the initial group of precincts, count one additional precinct. The manual tally shall apply only to the race not previously counted.

Additional precincts for the manual tally may be selected at the discretion of the elections official.

Clearly this was not written by a statistician. “Chosen at random” is hardly a precise sampling procedure, and counting 1% of precincts doesn’t tell you much about how many ballots you will end up counting.
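Here is a minimal sketch of one reading of the selection step (the rounding rule and uniform sampling without replacement are my assumptions; the statute pins down neither, and it says nothing about precinct sizes):

```python
import random

def one_percent_tally_sample(num_precincts, seed=None):
    """One reading of Section 15360: select 1% of precincts
    (at least one) uniformly at random for a public manual tally."""
    rng = random.Random(seed)
    k = max(1, num_precincts // 100)  # assumed rounding; statute is silent
    return sorted(rng.sample(range(num_precincts), k))

# A county with 250 precincts audits 2 of them, whether those two
# precincts hold 50 ballots or 5,000 -- the number of *ballots*
# audited is left entirely to chance.
print(one_percent_tally_sample(250, seed=1))
```

Whether the sample counts enough ballots to confirm the outcome depends entirely on the margin of the race, which the statute ignores.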

August 4, 2011

I started reading *Convex Games in Banach Spaces* by Sridharan and Tewari, and then got interested in the work of G. Pisier on martingales in Banach spaces, which studies how probability and geometry interact. Some of the definitions differ a bit between these papers (the former works in terms of martingale difference sequences).

A Banach space $\mathcal{B}$ has Martingale-type $p$ (or M-type $p$) if there is a constant $C$ such that for any $T$ and any $\mathcal{B}$-valued martingale $(f_n)_{n=1}^{T}$ (with $f_0 = 0$), we have

$\sup_{n \le T} \mathbb{E}\left[ \|f_n\|^p \right] \le C^p \sum_{n=1}^{T} \mathbb{E}\left[ \|f_n - f_{n-1}\|^p \right]$

A Banach space $\mathcal{B}$ has Martingale-cotype $q$ (or M-cotype $q$) if there is a constant $C$ such that for any $T$ and any $\mathcal{B}$-valued martingale $(f_n)_{n=1}^{T}$, we have

$\sum_{n=1}^{T} \mathbb{E}\left[ \|f_n - f_{n-1}\|^q \right] \le C^q \sup_{n \le T} \mathbb{E}\left[ \|f_n\|^q \right]$
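As a quick sanity check on these definitions: in a Hilbert space (take the scalars $\mathbb{R}$) martingale increments are orthogonal, so $\mathbb{E}[\|f_T\|^2]$ equals the sum of $\mathbb{E}[\|f_n - f_{n-1}\|^2]$ exactly, giving M-type 2 (and M-cotype 2) with $C = 1$. A small Monte Carlo illustration (my own, not from the papers):

```python
import random

random.seed(0)
T, trials = 8, 20000
lhs = rhs = 0.0
for _ in range(trials):
    # A simple real-valued martingale: partial sums of independent
    # mean-zero steps; the martingale differences are the steps.
    steps = [random.choice([-1.0, 1.0]) * random.uniform(0, 1) for _ in range(T)]
    lhs += sum(steps) ** 2             # |f_T|^2
    rhs += sum(s ** 2 for s in steps)  # sum of |df_n|^2
lhs /= trials
rhs /= trials
print(f"E|f_T|^2 = {lhs:.3f},  sum E|df_n|^2 = {rhs:.3f}")  # nearly equal
```

In a general Banach space this orthogonality is lost, and the constant $C$ in the M-type/M-cotype inequalities measures how badly it can fail.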

So far these are statements about how martingales behave in these spaces, and you can see a little geometry in the exponents $p$ and $q$ on the norms, but how does this relate to the overall geometric properties of the space?

A Banach space $\mathcal{B}$ is *reflexive* if the canonical embedding of $\mathcal{B}$ into its double dual (the dual of its dual space) is an isomorphism. A space is *reflexive* if and only if its dual is reflexive. A space $\mathcal{B}$ is *finitely representable* in a space $\mathcal{X}$ if for any finite-dimensional subspace $E \subset \mathcal{B}$ and any $\lambda > 1$ there exists an isomorphism $T : E \to T(E) \subset \mathcal{X}$ such that

$\lambda^{-1} \|x\| \le \|T x\| \le \lambda \|x\| \quad \text{for all } x \in E$

A Banach space $\mathcal{B}$ is *super-reflexive* if no non-reflexive Banach space is finitely representable in $\mathcal{B}$. A result of James shows that a space is super-reflexive if and only if its dual is super-reflexive. Reflexivity means a space is “nice,” and super-reflexivity is a kind of “uniform niceness.”

The norm of a Banach space is *uniformly convex* if for every $\epsilon > 0$ there exists a $\delta > 0$ such that for any two unit vectors $x$ and $y$,

$\left\| \frac{x + y}{2} \right\| > 1 - \delta$

implies

$\|x - y\| < \epsilon$

For example, for $1 < p < \infty$ the spaces $L_p$ and $\ell_p$ are uniformly convex. Another way to picture it: if the midpoint of the chord between $x$ and $y$ is close to the surface of the unit ball, then the chord has to be short. Per Enflo showed that if a Banach space is super-reflexive, it has an equivalent norm which is uniformly convex.
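As a concrete instance, the parallelogram law shows that every Hilbert space is uniformly convex with a quadratic modulus. For unit vectors $x$ and $y$:

```latex
% Parallelogram law for unit vectors x, y:
\left\| \tfrac{x+y}{2} \right\|^2
  = \tfrac{1}{2}\|x\|^2 + \tfrac{1}{2}\|y\|^2 - \left\| \tfrac{x-y}{2} \right\|^2
  = 1 - \left\| \tfrac{x-y}{2} \right\|^2 ,
% hence \|x - y\| \ge \epsilon forces
\left\| \tfrac{x+y}{2} \right\|
  \le \sqrt{1 - \tfrac{\epsilon^2}{4}}
  \le 1 - \tfrac{\epsilon^2}{8} .
```

So the best $\delta$ satisfies $\delta(\epsilon) \ge \epsilon^2/8$: for Hilbert spaces the modulus of convexity is quadratic in $\epsilon$.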

It turns out that the M-type is related to the degree of convexity of the space, namely how uniformly convex the norm is (which is in turn connected to super-reflexivity). A space is $q$-*uniformly convex* if the best $\delta$ in the definition above satisfies

$\delta(\epsilon) \ge c \epsilon^q$

for some constant $c > 0$.

Gilles Pisier made the connection to M-types: a Banach space is super-reflexive if and only if it has M-type $p$ for some $p > 1$ (equivalently, M-cotype $q$ for some $q < \infty$). Furthermore, a Banach space has M-cotype $q$ if and only if it has an equivalent $q$-uniformly convex norm.

So the big picture is this: super-reflexivity is enough to guarantee an equivalent uniformly convex norm, and the M-cotype characterizes the degree of convexity of that norm.

**References:**

Sridharan, Kartik and Tewari, Ambuj, *Convex Games in Banach Spaces*, Proceedings of COLT, 2010.

Enflo, Per, *Banach spaces which can be given an equivalent uniformly convex norm*, Israel Journal of Mathematics 13 (1972), 281–288.

James, Robert C., *Super-reflexive Banach spaces*, Canadian Journal of Mathematics 24 (1972), 896–904.

Pisier, Gilles, *Probabilistic Methods in the Geometry of Banach Spaces*, in G. Letta and M. Pratelli, eds., *Probability and Analysis*, v. 1206 of *Lecture Notes in Mathematics*, pp. 167–241, Springer, 1986.

Pisier, Gilles, *Martingales with values in uniformly convex spaces*, Israel Journal of Mathematics 20 (1975), no. 3-4, 326–350.

Pisier, Gilles, *Martingales in Banach Spaces (in connection with Type and Cotype)*, IHP Winter Course online lecture notes, 2011.

Lovaglia, A. R., *Locally uniformly convex Banach spaces*, Transactions of the American Mathematical Society 78 (1955), 225–238.