My colleague Waheed Bajwa and I are teaching a Rutgers Byrne Seminar for first-year undergraduates this fall. The title of the course is Data: What is it Good For? (Absolutely Something), a reference which I am sure will be completely lost on the undergrads. The point of the course is to talk about “data” (what is it, exactly?), how it gets turned into “information,” and then perhaps even “knowledge,” with all of the pitfalls along the way. So it’s a good opportunity to talk about philosophy (e.g. epistemology), mathematics/statistics (e.g. undersampling, bias, analysis), engineering (e.g. storage, transmission), science (e.g. replication, retraction), and policy (e.g. privacy). It’s supposed to be a seminar class with lots of discussion, and the students can be expected to do a little reading outside of class. We have a full roster of 20 signed up, so managing the discussion might be a bit tricky.
We’re in the process of collecting reading materials (magazine articles, book chapters, blog posts, etc.) for the students. We explicitly didn’t want it to be for “technical” students only. Do any readers of the blog have great articles suitable for first-year undergrads across all majors?
As the class progresses I will post materials here, as well as some snapshots of the discussion. It’s my first time teaching a class of this type (or indeed any undergraduates at Rutgers) so I’m excited (and perhaps a bit nervous).
On a side note, Edwin Starr’s shirt is awesome and I want one.
http://mimno.infosci.cornell.edu/b/articles/carpentry
OReilly.Doing.Data.Science.Oct.2013.pdf
http://www.nytimes.com/2014/08/11/opinion/sentencing-by-the-numbers.html
http://www.ssireview.org/articles/entry/big_data_for_social_innovation
http://www.huffingtonpost.com/travis-korte/how-data-and-analytics-ca_b_5609411.html
Thanks Kush! I had totally forgotten about the sentencing one.
You’re welcome. Here are a few more which might be relevant:
http://www.dynamicyield.com/2014/07/startup-data-science
http://jvns.ca/blog/2014/06/19/machine-learning-isnt-kaggle-competitions
http://www.informit.com/articles/article.aspx?p=2217274
http://cacm.acm.org/magazines/2012/10/155531-a-few-useful-things-to-know-about-machine-learning/fulltext
Reading ‘Big Data’ by Viktor Mayer-Schonberger and Kenneth Cukier gave me an appreciation for what Big Data is, as somebody who doesn’t work in that field. I kept hearing the term bandied about without understanding what people meant when using that term. There are also discussions about privacy and policy implications, including unforeseen consequences of so-called data exhaust.
Cukier is a journalist and you should be able to find some of his stories that led to the creation of the book.
They had an article in Foreign Affairs which I think we’ll assign as reading. The hard part has been finding material which will be digestible to 18-year-olds. The first chapter of the O’Reilly book is great for this, and I might try to excerpt a chapter or two from other books.
You could peruse this book, written by an economist. It’s all about data as it relates to risks to the fetus during pregnancy. It’s a complete book, but the chapter on alcohol and caffeine use in pregnancy isn’t too cumbersome, and I imagine it would spark some interesting discussion. http://www.amazon.com/Expecting-Better-Conventional-Pregnancy-Wisdom/dp/1594204756
PS Not pregnant, just picked it up since I’m a doc.
I am really glad this follow-up comment is here.
For the record, if and when Jenn does get pregnant, I want to find out before a comment is left on Anand’s blog about it. :-p