ResearchGate: spam scam, or…?

Posted on August 18, 2014 by Anand Sarwate

I’ve been getting fairly regular automated emails lately from ResearchGate, which has pull-quotes from Forbes and NPR saying it’s changing the way we do research blah blah blah. However, all empirical reports I have heard indicate that once you join, it repeatedly spams all of your co-authors with requests to join, which makes it feel a bit more like Heaven’s Gate.

On a less grim note, the site’s promise to make your research “more visible” sounds a bit like SEO spam. Given the existence of Google Scholar, which is run by the SE that one would like to O, it seems slightly implausible.

Any readers want to weigh in on whether ResearchGate has been useful to them? Or is this mostly for people who don’t know how to make their own homepage with their papers on it (which is probably most faculty).

B-log on IT

Posted on January 3, 2013 by Anand Sarwate

Via Tara Javidi I heard about a new blog on information theory: the Information Theory b-log, which has been going for a few months now but I guess in more “stealth mode.” It’s mostly posts by Sergio Verdú, with some initial posting by Thomas Courtade, but the most recent post is by Tara on how to compare random variables from a decision point of view. However, as Max noted:

All researchers working on information theory are invited to participate by posting items to the blog. Both original material and pointers to the web are welcome.

Banff LiveCam

Posted on April 4, 2012 by Anand Sarwate

It’s not as risqué as ChatRoulette, but the Banff International Research Station now has a live stream for talks (assuming the speaker wants to be recorded). This week you can learn about Stochastic PDEs (talk schedule here).

Linkage

Posted on March 30, 2012 by Anand Sarwate

The variation of the human body across sports is fascinating (via Matt Tong).

The films of Andrei Tarkovsky (1932-1986) are available for free online (via Zhenya Tumanova).

A paper arguing that systems-CS conference reviews are bad. (via Manu Sridharan)

Downton Arby’s.

A watch that’s on Indian Time. Works for other cultures too! (via Harbeer)

Linkage part deux

Posted on April 18, 2011 by Anand Sarwate

Most of these are stolen from MetaFilter.

Welcome back to public blogging, Dan.

All about time zones.

Musical instrument samples. My first UROP at MIT was at the Media Lab, where I helped record instrumentalists as part of a musical instrument identification system. Paris Smaragdis was there at the time, and now he is at UIUC where he has a lot of cool audio demos. There are also some great clips Inside the Music Library at the BBC.

Ridiculous computer interfaces from movies.

Linkage

Posted on April 17, 2011 by Anand Sarwate

I’m blogging from Chicago, where it is a balmy 42 degrees but sunny. Whither spring, I ask! Actually, I’m not blogging so much as linking to a bunch of stuff.

For San Diegans, the SD Asian Film Festival Spring Showcase is going on. It looks like I’ll miss a lot of it but I might try to catch something at the end of the week.

Less Pretentious & More Accurate Titles For Literary Masterworks — funny but possibly NSFW.

This home-scanning program seems creepy, regardless of the constitutionality issues.

Unfortunate headlines strike again.

I really like scallion pancakes. I’ll have to try this out when I get back to San Diego.

I agree that this video is awesome. Yo-Yo Ma and Lil Buck. I think that dude is made of rubber. And steel.

Tom Waits was induced into the Rock and Roll Hall of Fame. I just hope I get to see him live some day.

Some things to skim or read from ArXiV when I get the chance:
Sequential Analysis in High Dimensional Multiple Testing and Sparse Recovery (Matt Malloy, Robert Nowak)
Differential Privacy: on the trade-off between Utility and Information Leakage (Mário S. Alvim, Miguel E. Andrés, Konstantinos Chatzikokolakis, Pierpaolo Degano, Catuscia Palamidessi)
Capacity of Byzantine Consensus with Capacity-Limited Point-to-Point Links (Guanfeng Liang, Nitin Vaidya)
Settling the feasibility of interference alignment for the MIMO interference channel: the symmetric square case (Guy Bresler, Dustin Cartwright, David Tse)
Decentralized Online Learning Algorithms for Opportunistic Spectrum Access (Yi Gai, Bhaskar Krishnamachari)
Online and Batch Learning Algorithms for Data with Missing Features (Afshin Rostamizadeh, Alekh Agarwal, Peter Bartlett)
Nonuniform Coverage Control on the Line (Naomi Ehrich Leonard, Alex Olshevsky)
Degree Fluctuations and the Convergence Time of Consensus Algorithms (Alex Olshevsky, John Tsitsiklis)

Privacy and entropy (needs improvement)

Posted on January 21, 2011 by Anand Sarwate

A while ago, Alex Dimakis sent me an EFF article on information theory and privacy, which starts out with an observation of Latanya Sweeney’s that gender, ZIP code, birthdate are uniquely identifying for a large portion of the population (an updated observation was made in 2006).

What’s weird is that the article veers into “how many bits of do you need to uniquely identify someone” based on self-information or surprisal calculations. It paints a bit of a misleading picture about how to answer the question. I’d probably start with taking $\log_2(6.625 \times 10^9)$ and then look at the variables in question.

However, the mere existence of this article raises a point : here is a situation where ideas from information theory and probability/statistics can be made relevant to a larger population. It’s a great opportunity to popularize our field (and demonstrate good ways of thinking about it). Why not do it ourselves?

A new scam?

Posted on December 29, 2010 by Anand Sarwate

I received a scam email recently asking me if I wanted to “take up a job” for a few months while someone is goes on leave. The salary is not huge but pretty decent ($3.5k/month + some sort of commission) and I’m instructed to contact someone in Taiwan to make the arrangements. I assume the next step is that I will give them my bank account etc (for direct deposit), which then will get cleared out.

This seems like a particularly pernicious type of internet fraud because it doesn’t involve egregious amounts of money (vs. Nigerian bank scams) and there is so much unemployment in the US right now that people would be happy to take that job for 3 months.

Is this a new kind of scam? It seems different from the “work from home and make $70k” things I’ve seen before.

Linkage

Posted on October 19, 2010 by Anand Sarwate

The fog in San Francisco (h/t Erin Rhode).

A general approach to privacy/utility tradeoffs, with metric spaces! A new preprint/note by Robert Kleinberg and Katrina Ligett.

Max breaks it down for you on how to use the divergence to get tight convergence for Markov chains.

The San Diego Asian Film Festival starts on Thursday!

Apparently China = SF Chinatown. Who knew? Maybe the fog confused them.

Privacy and Google Web History

Posted on March 18, 2010 by Anand Sarwate

Posted on ArXiV last night: Private Information Disclosure from Web Searches. (The case of Google Web History), by Claude Castelluccia, Emiliano De Cristofaro, Daniele Perito.

Our report was sent to Google on February 23rd, 2010. Google is investigating the problem and has decided to temporarily suspend search suggestions from Search History. Furthermore, Google Web History page is now offered over HTTPS only. Updated information about this project is available at: this http URL

The link above has some more details of their back and forth with Google on the matter, and at least it looks like Google’s on the losing end of it.

Search histories have a lot of information in them, since searches correlated with local events, such as disease spread (related and interesting is Twitter’s tracking of earthquakes). Since user sessions can be compromised by someone hijacking the cookies that maintain the session, Google requires HTTPS for many services, like GMail, but not for the “automatic suggestion” for searches. The authors implemented an attack called The Historiographer:

The Historiographer uses the fact that users signed in any Google service receive personalized suggestions for their search queries based on previously-searched keywords. Since Google Web Search transmits authentication cookies in clear, the Historiographer monitoring the network can capture such a cookie and exploit the search suggestions to reconstruct a user’s search history.

This attack is not looking at a short time-window of browsing history, but essentially the entire search history as stored by Google. They did real experiments, and found:

Results show that almost one third of monitored users were signed in their Google accounts and, among them, a half had Web History enabled, thus being vulnerable to our attack. Finally, we show that data from several other Google services can be collected with a simple session hijacking attack.

So how does it work? The program hijacks the SID cookie from the user by eavesdropping, and then issues prefixes to the suggestion services; that is, it simulates a user typing in the first few letters of a search query. Prefixes have to be at least 2-3 letters to trigger the suggestion, and the top 3 completions are given. Of course 26^3 is a lot of prefixes to try, so the system has to sample effectively. The system just queries the top 10% of most frequent 3-letter prefixes (based on the statistics of English), which amounts to 121 queries to the system. If a particular 2-letter prefix (e.g. “pr”) is a prefix for many 3-letter prefixes (e.g. “pre”, “pra”, “pro”) which result in 3 completions, they will proceed greedily to look at longer prefixes in that direction. Note that this is the same principle behind Dasher (or arithmetic coding, really).

Based on this, the system can reconstruct the search history for the hijacked user. By using Google’s personalized results service, they can also get more information about the user’s preferences. A little more worrying is this observation:

In fact, a malicious entity could set up a Tor exit node to hijack cookies and reconstruct search histories. The security design underlying the Tor network guarantees that the malicious Tor exit node, although potentially able to access unencrypted traffic, is not able to learn the origin of such traffic. However, it may take the malicious node just one Google SID cookie to reconstruct a user’s search history, the searched locations, the default location, etc., thus signicantly increasing the probability of identifying a user.

It’s an interesting paper, and worth a read if you are interested in these issues.

	Zonghong Liu on A story about Canvas
	anonymousskimmer on “The needs of the many,…
	Chanterelle Recipes… on Broiled shrimp with chanterell…
	kvarsh on ICML 2019 encouraged code subm…
	Pulkit Grover on gender inclusivity in communic…

An Ergodic Walk

a process whose average over time converges to the true average

Tag Archives: web

ResearchGate: spam scam, or…?

B-log on IT

Banff LiveCam

Linkage

Linkage part deux

Linkage

Privacy and entropy (needs improvement)

A new scam?

Linkage

Privacy and Google Web History