Domingos on what you should know about machine learning

Posted on October 4, 2012 by Anand Sarwate

Dhruv Batra forwarded this Communications of the ACM article by Pedro Domingos, entitled “A Few Useful Things to Know about Machine Learning” [free version]. The main point from the abstract is:

However, developing successful machine learning applications requires a substantial amount of “black art” that is hard to find in textbooks. This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions.

The article focuses on the classification problem to illustrate these “key lessons.” It’s well worth reading, especially for people who don’t work on machine learning, because it explains a number of important issues.

It illustrates the gap between what the theory/research works on and the nitty-gritty of applying these algorithms to real data.
It gives people who want to implement an ML method important fundamental questions to ask before starting: How do I represent my data? How do I evaluate performance? How do I do things efficiently? These have to get squared away first.
Domain knowledge and feature engineering are the keys to success.

Since I’m guessing there are 2 machine learners who read this blog, go read it (unless you are one of my friends who doesn’t care about all of these technical posts).

Sita tries to send a message to Rama using a digital certificate

Posted on October 2, 2012 by Anand Sarwate

Via Erin (via Bruce Schneier’s blog), I found out about S. Parthasarathy’s proposal to replace Alice and Bob with Sita and Rama. I have been known to use Alice and Bob on occasion (unlike some people, I find the anthropomorphizing to be good, on balance), but perhaps I should develop some cultural pride and make the switch to “a smarter alternative to these characters.” According to Parthasarathy, there is greater literary relevance to the scenario where Sita wants to send a message to Rama. The dramatis personae in this version are:

Sita : kidnapped maiden who wishes to send a message
Rama : brave prince who is to receive the message
Hanuman : the honest broker who relays the message
Ravana : the rogue-in-the-middle who acts as the adversary. To avoid confusing first letters, let’s rename him Badmash.

There are a number of other appealing allusions in this scenario.

I think it’s a fun exercise — can one come up with other settings? Perhaps based on Gilgamesh, or Star Wars. I’m sure at least one reader of this blog could come up with a Battlestar Galactica scenario. Adama to Baltar?

Also, I couldn’t help but point to this chestnut, the real story of Alice and Bob (h/t to my father).

Daniel Spielman wins 2012 MacArthur Award

Posted on October 1, 2012 by Anand Sarwate

The NY Times only mentions him in passing and the Yale CS department hasn’t issued a press release, but that’s pretty awesome news. You can read all about his research on his homepage. (h/t Kevin Chen).

DIMACS Workshop on Differential Privacy

Posted on September 19, 2012 by Anand Sarwate

Via Kamalika, I heard about the DIMACS Workshop on differential privacy at the end of October:

DIMACS Workshop on Differential Privacy across Computer Science
October 24-26, 2012
(immediately after FOCS 2012)

Call for Abstracts — Short Presentations

The upcoming DIMACS workshop on differential privacy will feature invited talks by experts from a range of areas in computer science as well as short talks (5 to 10 minutes) by participants.

Participants interested in giving a short presentation should send an email to asmith+dimacs@psu.edu containing a proposed talk title, abstract, and the speaker’s name and affiliation. We will try to accommodate as many speakers as possible, but

a) requests received before October 1 will get full consideration
b) priority will be given to junior researchers, so students and postdocs should indicate their status in the email.

More information about the workshop:

The last few years have seen an explosion of results concerning differential privacy across many distinct but overlapping communities in computer science: Theoretical Computer Science, Databases, Programming Languages, Machine Learning, Data Mining, Security, and Cryptography. Each of these different areas has different priorities and techniques, and despite very similar interests, motivations, and choice of problems, it has become difficult to keep track of this large literature across so many different venues. The purpose of this workshop is to bring researchers in differential privacy across all of these communities together under one roof to discuss recent results and synchronize our understanding of the field. The first day of the workshop will include tutorials, representing a broad cross-section of research across fields. The remaining days will be devoted to talks on the exciting recent results in differential privacy across communities, discussion and formation of interesting open problems, and directions for potential inter-community collaborations.

The workshop is being organized by Aaron Roth (blog) and Adam Smith (blog).

Linkage

Posted on July 6, 2012 by Anand Sarwate

I’m being lazy about more ISIT blogging because my brain is full. So here are some links as a distraction.

Via John, George Boolos’s talk entitled Gödel’s Second Incompleteness Theorem Explained in Words of One Syllable.

D’Angelo is back!

This short video about a subway stair in New York is great, especially the music.

Crooked Timber is on a tear about workplace coercion and its proponents.

Luca’s thoughts on the Turing Centennial are touching.

Linkage

Posted on April 17, 2012 by Anand Sarwate

Via Brandy, Kenji breaks down perfect hard-boiled eggs. See also sauceome.

Bret Victor talks about Inventing on Principle. The first half is a lot of demos of some pretty amazing applications of his major driving principle, which is that creators should be able to see what they are creating in real time. He sometimes waxes a little TED-like, but overall it is quite inspiring.

My high school history teacher, Chris Butler, has turned his award-winning lecture notes and flowcharts into an iPad app which is available on the App Store.

Queen, live at Wembley. (via MeFi)

Some pretty cool visualizations of sorting. (via logistic aggression)

R package for differentially private logistic regression

Posted on March 29, 2012 by Anand Sarwate

My collaborator Staal Vinterbo has written an implementation in R of differentially private logistic regression and put it into a package on the CRAN archive. It implements the objective perturbation method described in this paper.
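For readers who want a feel for what objective perturbation does, here is a rough Python sketch. It is not the R package's code, and it assumes the linked paper is the Chaudhuri, Monteleoni, and Sarwate formulation: feature vectors with norm at most 1, labels in {-1, +1}, and a random linear term added to the regularized logistic objective before minimizing. The function names, the plain gradient-descent solver, and the default step count are all made up for illustration.

```python
import math
import random


def sample_noise(dim, eps_prime, rng):
    """Draw b with density proportional to exp(-(eps_prime / 2) * ||b||)."""
    # The norm is Gamma(shape=dim, scale=2/eps_prime); the direction is uniform.
    norm = rng.gammavariate(dim, 2.0 / eps_prime)
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    g_norm = math.sqrt(sum(v * v for v in g))
    return [norm * v / g_norm for v in g]


def neg_sigmoid(margin):
    """Numerically stable 1 / (1 + exp(margin))."""
    if margin > 0:
        e = math.exp(-margin)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(margin))


def private_logistic_regression(xs, ys, eps, lam, rng, steps=2000):
    """Objective perturbation sketch (assumes ||x|| <= 1, y in {-1, +1})."""
    n, dim = len(xs), len(xs[0])
    c = 0.25  # bound on the second derivative of the logistic loss
    eps_prime = eps - math.log(1 + 2 * c / (n * lam) + (c / (n * lam)) ** 2)
    delta = 0.0
    if eps_prime <= 0:
        # Too little regularization for this eps: add extra, use half the budget.
        delta = c / (n * (math.exp(eps / 4) - 1)) - lam
        eps_prime = eps / 2
    b = sample_noise(dim, eps_prime, rng)

    # Minimize (1/n) sum log(1 + exp(-y w.x)) + ((lam + delta)/2) ||w||^2 + b.w/n.
    # Gradient descent only approximates the minimizer that the privacy
    # analysis is stated for, so treat the output as illustrative.
    lr = 1.0 / (lam + delta + c)  # 1 / (smoothness bound) when ||x|| <= 1
    w = [0.0] * dim
    for _ in range(steps):
        grad = [(lam + delta) * w[j] + b[j] / n for j in range(dim)]
        for x, y in zip(xs, ys):
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            coeff = -y * neg_sigmoid(margin) / n
            for j in range(dim):
                grad[j] += coeff * x[j]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w
```

A call might look like `private_logistic_regression(xs, ys, eps=1.0, lam=0.1, rng=random.Random(0))`. Smaller values of `eps` mean more noise in the returned weights, which is the price of the stronger privacy guarantee.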

Linkage

Posted on March 2, 2012 by Anand Sarwate

It’s been a busy week, deadline-wise, but I did see a few cool things on the interwebs which seemed worth sharing:

Tarantulas molting, courtesy of my high school biology teacher and ExploraVision coach extraordinaire, Mr. Stone (his blog is cool too).

Keeping with the nature theme, find the cuttlefish! The octopus video is cool too. Thanks to my commute being a bit longer, I listen to Science Friday podcasts as well as Story Collider, which is a pretty cool Moth-meets-science storytelling podcast.

Sometimes papers use pretty strong words in their titles (see for more context). On that note, some letters from John Nash (see also) were recently declassified by the NSA wherein he seems to predict fundamentals of cryptography and computational complexity. In more Rivest news, he coded up the cryptosystem.

In sadder news (also not so recent now), De Bruijn passed away. I’ve started a bioinformatics project recently (maybe more like “started”) and De Bruijn graphs are a pretty useful tool for making sense of data from next-generation sequencing technologies. Here are some animations describing how Illumina and 454 sequencing work.

Maybe when it gets warmer I will put together a worm bin — I miss the curbside composting of Berkeley.

I get a lot of positive comments about this shirt, but Topatoco are discontinuing it. Speaking of potatoes, Lav has a nice post with some links to papers on the importance and history of potatoes.

Call for Papers : ICITS 2012

Posted on February 10, 2012 by Anand Sarwate

I am on the PC for this conference, so I figured I would advertise the CFP here for those readers who would be interested.

6th International Conference on Information-Theoretic Security
Montreal, Quebec, Canada
August 15–17, 2012

This is the sixth in a series of conferences that aims to bring together the leading researchers in the areas of information theory, quantum information theory, and cryptography. ICITS covers all aspects of information-theoretic security, from relevant mathematical tools to theoretical modeling to implementation. Papers on all technical aspects of these topics are solicited for submission.

Note that this year there will be two distinct tracks for submission.

Important Dates:

Conference Track Submission: Monday, March 12, 2012
Conference Track Notification: Friday, May 4, 2012
Proceedings version: Tuesday, May 29, 2012
Workshop Track Submissions: Monday, April 9, 2012
Workshop Track Notification: Monday, May 28, 2012

Note: ICITS (Aug. 15-17, Montreal) is the week before CRYPTO 2012 (Aug. 20–23, Santa Barbara).

Two Tracks: Conference and Workshop

The goal of ICITS is to bring together researchers on all aspects of information-theoretic security. To this end, ICITS 2012 will consist of two types of contributed presentations. The conference track will act as a traditional conference (original papers with published proceedings). The workshop track will operate more like an informal workshop, with papers that have appeared elsewhere or that consist of work in progress.

Conference Track (with proceedings): Submissions to this track must be original papers that have not previously appeared in published form. Accepted papers will be presented at the conference and will also be published in the conference proceedings (which will appear in Springer’s Lecture Notes in Computer Science series). We note that simultaneous submission to journals is acceptable, but simultaneous submission to other conferences with published proceedings is not.
Workshop Track (no proceedings): To encourage presentation of work from a variety of fields (especially those where conference publication is unusual or makes journal publication difficult), the committee also solicits “workshop track” papers. Accepted papers will be presented orally at the conference but will not appear in the proceedings. Submissions to this track that have previously appeared (or are currently submitted elsewhere) are acceptable, as long as they first appeared after January 1, 2011. Papers that describe work in progress are also welcome. We note that the same standards of quality will apply to conference and workshop papers.

Conference Organization:

Program Chair: Adam Smith (Pennsylvania State University)
Program Committee:

Anne Broadbent (University of Waterloo)
Thomas Holenstein (ETH Zurich)
Yuval Ishai (Technion)
Sidharth Jaggi (CU Hong Kong)
Bhavana Kanukurthi (UCLA)
Ashish Khisti (University of Toronto)
Yingbin Liang (Syracuse University)
Prakash Narayan (University of Maryland)
Louis Salvail (Universite de Montreal)
Anand Sarwate (TTI Chicago)
Christian Schaffner (University of Amsterdam)
Adam Smith (Pennsylvania State University)
Stephanie Wehner (National University of Singapore)
Daniel Wichs (IBM Research)
Juerg Wullschleger (Universite de Montreal)
Aylin Yener (Pennsylvania State University)

General Chair: Juerg Wullschleger (Universite de Montreal)
Local Co-Chairs: Claude Crepeau (McGill University) and Alain Tapp (Universite de Montreal)

Detailed instructions for authors can be found in the full CFP, available on the website.

Linkage

Posted on July 29, 2011 by Anand Sarwate

Via Jay P., a pretty amazing dance video.

Via 530nm330Hz, a very interesting tidbit on the history of the one-time pad. A free tech report version is available too. The one-time pad XORs the bits of a message with an i.i.d. random bitstring of the same length, and is credited to Gilbert Vernam and Joseph Mauborgne. However, as Steven Bellovin’s paper shows,

In 1882, a California banker named Frank Miller published Telegraphic Code to Insure Privacy and Secrecy in the Transmission of Telegrams. In it, he describes the first one-time pad system, as a superencipherment mechanism for his telegraph code. If used properly, it would have had the same property of absolute security.

Although in theory Miller can claim priority, reality is more complex. As will be explained below, it is quite unlikely that either he or anyone else ever used his system for real messages; in fact, it is unclear if anyone other than he and his friends and family ever knew of its existence. That said, there are some possible links to Mauborgne. It thus remains unclear who should be credited with effectively inventing the one-time pad.

Another fun tidbit: apparently the mother’s maiden name was used for security purposes way back in 1882!
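The XOR description of the one-time pad above fits in a few lines of Python. This is only a minimal sketch of the bitwise version, not a reconstruction of Miller's system (which was a superencipherment of a telegraph code). The function names are invented for the example, and the standard-library `secrets` module stands in for a source of i.i.d. uniform random bits.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings, bit by bit."""
    if len(a) != len(b):
        raise ValueError("the pad must be exactly as long as the message")
    return bytes(x ^ y for x, y in zip(a, b))


def otp_encrypt(message: bytes):
    """Return (pad, ciphertext), with a fresh random pad for this message."""
    pad = secrets.token_bytes(len(message))
    return pad, xor_bytes(message, pad)


def otp_decrypt(ciphertext: bytes, pad: bytes) -> bytes:
    """XORing with the same pad a second time recovers the message."""
    return xor_bytes(ciphertext, pad)


pad, ciphertext = otp_encrypt(b"meet at the bank")
recovered = otp_decrypt(ciphertext, pad)
```

The “used properly” caveat in the quote matters: the pad has to be random, as long as the message, kept secret, and never reused, since XORing two ciphertexts made with the same pad cancels the pad and leaks the XOR of the two messages.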

I really like shiso leaves and their cousins. I had a shiso plant but it did not survive the California sun / I have a black thumb. One of my favorite meals at ISIT 2009 was with Bobak Nazer: we found an out-of-the-way BBQ joint where they brought us a long box filled with 7 varieties of leaves, including perilla leaves. It makes me hungry just writing about it.

Kudos to Adrienne for the amazing photo.

There’s Only One Sun, a short sci-fi film by Wong Kar-Wai.


An Ergodic Walk

a process whose average over time converges to the true average

Tag Archives: computer science
