Linkage

Cheating: The List Of Things I Never Want To Hear Again. This is an almost definitive list of plagiarism/cheating excuses. I both love and loathe the idea of making students sign a pledge, but there’s that saying about a horse and water… (h/t Daniel Hsu)

This note on data journalism comes with a longer report about how to integrate data journalism into curricula. It strikes me that many statistics and CS departments are missing the boat here on creating valuable pedagogical material for improving data analytics in journalism. (h/t Meredith Broussard)

Speaking of which, ProPublica has launched version 2.0 of its Data Store!

Of course, data isn’t everything: The Perils of Using Technology to Solve Other People’s Problems.

DARPA just launched a podcast series, Voices from DARPA, where DARPA PMs talk about what they’re doing and what they’re interested in. The first one is on molecular synthesis. It’s more for a popular audience than a technical one, but also seems like a smart public-facing move by DARPA.

My friend Steve Severinghaus won The Metropolitan Society of Natural Historians Photo Contest!

My friend (acquaintance?) Yvonne Lai co-authored this nice article on teaching high school math teachers and the importance of “mathematical knowledge for teaching.”

Data: what is it good for? (Absolutely Something): the first few weeks

So Waheed Bajwa and I have been teaching this Byrne Seminar on “data science.” At Allerton some people asked me how it was going and what we were covering in the class. These seminars are meant to be more discussion-based. This is a bit tough for us in particular:

  • engineering classes are generally NOT discussion-based, neither in the US nor in Pakistan
  • it’s been more than a decade since we were undergraduates, let alone 18
  • the students in our class are fresh out of high school and also haven’t had discussion-based classes

My one experience in leading discussion was covering for a theater class approximately 10 years ago, but that was a junior-level elective as I recall, and the dynamics were quite a bit different. So getting a discussion going and getting all of the students to participate is, on top of being tough in general, particularly challenging for us. What has helped is that a number of the students in the class are pretty engaged with the ideas and material, and we do in the end get to collectively think a bit differently about the technologies around us and the role that data plays.

What I wanted to talk about in this post was what we’ve covered in the first few weeks. If we offer this class again it would be good to revisit some of the decisions we’ve made along the way, as this is as much a learning process for us as it is for them. A Byrne Seminar meets 10 times during the semester, so that it will end well before finals. We had some overflow from one topic to the next, but roughly speaking the class went in the following order:

  • Introduction: what is data?
  • Potentials and perils of data science
  • The importance of modeling
  • Statistical considerations
  • Machine learning and algorithms
  • Data and society: ethics and privacy
  • Data visualization
  • Project Presentations

I’ll talk a bit more on the blog about this class, what we covered, what readings/videos we ended up choosing, and how it went. I think it would be fun to offer this course again, assuming our evaluations pass muster. But in the meantime, the class is still on, so it’s a bit hard to pass retrospective judgement.

Detection and Estimation: book recommendations?

It’s confirmed that I will be teaching Detection and Estimation next semester so I figured I would use the blog to conjure up some book recommendations (or even debate, if I can be so hopeful). Some of the contenders:

  • Steven M. Kay, Fundamentals of Statistical Signal Processing – Estimation Theory (Vol. 1), Prentice Hall, 1993.
  • H. Vincent Poor, An Introduction to Signal Detection and Estimation, 2nd Edition, Springer, 1998.
  • Harry L. Van Trees, Detection, Estimation, and Modulation Theory (in 4 parts), Wiley, 2001 (a reprint).
  • M.D. Srinath, P.K. Rajasekaran, and R. Viswanathan, Introduction to Statistical Signal Processing with Applications, Prentice Hall, 1996.

Detection and estimation is a fundamental class for the ECE graduate curriculum, but these “standard” textbooks are around 20 years old, and I can’t help but think there might be a more “modern” take on the subject (no I’m not volunteering). Venu Veeravalli’s class doesn’t use a book, but just has notes. However, I think the students at Rutgers (majority MS students) would benefit from a textbook, at least as a grounding.

Srinath et al. is what my colleague Narayan Mandyam uses. Kay is what I was leaning toward before (because it seems to be the most widely used), but Poor’s book is the one I read. I think I am putting up the Van Trees as a joke, mostly. I mean, it’s a great book but I think a bit much for a textbook. So what do the rest of you use? Also, if you are teaching this course next semester, perhaps we can share some ideas. I think the curriculum might be ripe for some shaking up. If not in core material, at least in the kinds of examples we use. For example, I’m certainly going to cover differential privacy as a connection to hypothesis testing.
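
To make that last connection a bit more concrete, here is a sketch of the standard hypothesis-testing reading of differential privacy. The notation is my own assumption for illustration, not taken from any of the books above. Suppose a randomized mechanism M is (ε, δ)-differentially private, and an adversary who sees its output tests H_0 (the input was D) against H_1 (the input was a neighboring D'). For any rejection region S, the false-alarm and missed-detection probabilities must satisfy:

```latex
\begin{align}
  P_{\mathrm{FA}} &= \Pr[M(D) \in S], \qquad
  P_{\mathrm{MD}} = \Pr[M(D') \notin S], \\
  P_{\mathrm{FA}} + e^{\varepsilon} P_{\mathrm{MD}} &\ge 1 - \delta, \\
  e^{\varepsilon} P_{\mathrm{FA}} + P_{\mathrm{MD}} &\ge 1 - \delta.
\end{align}
```

Both inequalities come from applying the privacy definition to S and to its complement. When ε and δ are small, no test can make both error probabilities small at once, which is the privacy guarantee restated in detection language.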

Teaching bleg: articles on “data” suitable for first-year undergraduates

My colleague Waheed Bajwa and I are teaching a Rutgers Byrne Seminar for first-year undergraduates this fall. The title of the course is Data: What is it Good For? (Absolutely Something), a reference which I am sure will be completely lost on the undergrads. The point of the course is to talk about “data” (what is it, exactly?), how it gets turned into “information,” and then perhaps even “knowledge,” with all of the pitfalls along the way. So it’s a good opportunity to talk about philosophy (e.g. epistemology), mathematics/statistics (e.g. undersampling, bias, analysis), engineering (e.g. storage, transmission), science (e.g. replication, retraction), and policy (e.g. privacy). It’s supposed to be a seminar class with lots of discussion, and the students can be expected to do a little reading outside of class. We have a full roster of 20 signed up, so managing the discussion might be a bit tricky, of course.

We’re in the process of collecting reading materials (magazine articles, book chapters, blog posts, etc.) for the students to read. We explicitly didn’t want it to be for “technical” students only. Do any readers of the blog have great articles suitable for first-year undergrads across all majors?

As the class progresses I will post materials here, as well as some snapshots of the discussion. It’s my first time teaching a class of this type (or indeed any undergraduates at Rutgers) so I’m excited (and perhaps a bit nervous).

On a side note, Edwin Starr’s shirt is awesome and I want one.

Teaching technical (re-)writing

I think it would be great to have a more formal way of teaching technical writing for graduate students in engineering. It’s certainly not being taught at (most) undergraduate institutions, and the mistakes are so common across the examples that I’ve seen that there must be a way to formalize the process for students. Since we tend to publish smaller things a lot earlier in our graduate career, having a “checklist” approach to writing/editing could be very helpful to first-time authors. There are several coupled problems here:

  • students often don’t have a clear line of thought before they write,
  • they don’t think of who their audience is,
  • they don’t know how to rewrite, or indeed how important it is.

Adding to all of this is that they don’t know how to read a paper. In particular, they don’t know what to be reading for in terms of content or form. This makes the experience of reading “related work” sections incredibly frustrating.

What I was thinking was a class where students learn to write a literature review (a small one) on a topic of their choosing. The first part will be how to read papers and make connections between them. What is the point of a literature review, anyway? The first objective is to develop a more systematic way of reading and processing papers. I think everyone I know professionally, myself included, learned how to do this in an ad-hoc way. I believe that developing a formula would help improve my own literature surveying. The second part of the course would be teaching about rewriting (rather than writing). That is, instead of providing rules like “don’t use the passive voice so much” we could focus on “how to revise your sentences to be more active.” I would also benefit from a systematic approach to this for my own writing.

I was thinking of a kind of once-a-week writing seminar style class. Has anyone seen a class like this in engineering programs? Are there tips/tricks from other fields/departments which do have such classes that could be useful in such a class? Even though it is “for social scientists”, Howard Becker’s book is a really great resource.

“Cascading Style Sheets are a cryptic language developed by the Freemasons to obscure the visual nature of reality”

Via Cynthia, here is a column by James Mickens about how horrible the web is right now:

Computer scientists often look at Web pages in the same way that my friend looked at farms. People think that Web browsers are elegant computation platforms, and Web pages are light, fluffy things that you can edit in Notepad as you trade ironic comments with your friends in the coffee shop. Nothing could be further from the truth. A modern Web page is a catastrophe. It’s like a scene from one of those apocalyptic medieval paintings that depicts what would happen if Galactus arrived: people are tumbling into fiery crevasses and lamenting various lamentable things and hanging from playground equipment that would not pass OSHA safety checks.

It’s a fun read, but also a sentiment that may echo with those who truly believe in “clean slate networking.” I remember going to a tutorial on LTE and having a vision of what 6G systems will look like. One thing that is not present, though, is the sense that the system is unstable, and that the introduction of another feature in communication systems will cause the house of cards to collapse. Mickens seems to think the web is nearly there. The reason I thought of this is the recent fracas over the US ceding control of ICANN, and the sort of doomsdaying around that. From my perspective, network operators are sufficiently conservative that they can’t/won’t willy-nilly introduce new features that are only half-supported, like in the Web. The result is a (relatively) stable networking world that appears to detractors as somewhat Jurassic.

I’d argue (with less hyperbole) that some of our curriculum ideas also suffer from the accretion of old ideas. When I took DSP oh-so-long ago (13 years, really?) we learned all of this Direct Form II Transposed blah blah which I’m sure was useful for DSP engineers at TI to know at some point, but has no place in a curriculum now. And yet I imagine there are many places that are still teaching it. If anyone reads this still, what are the dinosaurs in your curriculum?

A proposal for restructuring tenure

An Op-Ed from the NY Times (warning: paywall) suggests creating research and teaching tenure tracks and hiring people for one or the other. This is an interesting proposal, and while the author Adam Grant marshals empirical evidence showing that the two skills are largely uncorrelated, as well as research on designing incentives, it seems that the social and economic barriers to implementing such a scheme are quite high.

Firstly, the economic. Grant-funded research faculty bring in big bucks (sometimes more modest bucks for pen-and-paper types) to the university. The overheads (55% at Rutgers, I think) on those grants help keep the university afloat, especially at places which don’t have huge endowments. Research in technology areas can also generate patents, startups, and other vehicles that bring money to the university coffers. This is an incentive for the university to push the research agenda first. Grant funding may be drying up, but it’s still a big money maker.

On the social barriers, it’s simply true in the US that as a society we don’t value teaching very highly. Sure, we complain about the quality of education and its price and so on, but the taxpayers and politicians are not willing to put their money where their mouth is. We see this in the low pay for K-12 teachers and the rise of the $5k-per-class adjunct at the university level. If a university finds that it’s doing well on research but poorly on teaching, the solution-on-the-cheap is to hire more adjuncts.

Of course, the proposal also represents a change, and institutionalized professionals hate change. For what it’s worth, I think it’s a good idea to have more tenure-track teaching positions. However, forcing a choice — research or teaching — is a terrible idea. I do like research, but part of the reason I want to be at a university is to engage with students through the classroom. I may not be the best teacher now, but I want to get better. A better, and more feasible, short-term solution would be to create more opportunities and support for teacher development within the university. This would strengthen the correlation between research and teaching success.