Some LaTeX hacks for writing proposals

I just submitted my CAREER proposal on Monday, and now that the endless revision process is over, I wanted to write a post about what I learned about proposal writing. But that seems like hubris, since I have no idea if the thing will get funded, so what do I know? Besides, everything I learned was from the valuable feedback I got from those who very kindly read my previous drafts. So instead of writing about How To Sell Your Idea, I figured I would write about some cute LaTeX hacks that I came across, most of them via the very useful TeX-LaTeX Stack Exchange.

If anyone else has some useful hacks, feel free to leave them in the comments!

Saving Space

One of the big problems in NSF proposal writing is that there’s a hard limit on the number of pages (not the number of words), so if you’re at the edge, there’s a lot of “oops, two lines over” hacking to be done towards the end.

  • \usepackage{times} \usepackage{mathptmx}: The typeface for your proposal makes a big difference in space. Computer Modern is a bit easier to read since there’s more whitespace, but Times shaved a whole page off of my proposal. The NSF Grant Proposal Guidelines has the list of approved formatting. It seems standard for NIH proposals to use 11pt Arial but that makes me want to gouge my eyes out. Know thy reviewers, is what I would say: keep in mind what’s standard for the solicitation and don’t make the proposal so dense as to be unreadable. NB: Apparently the times package is deprecated (see comments).
  • \usepackage{titlesec}. This package lets you control the spacing around your titles and subtitles like this:

    \titlespacing\section{0pt}{10pt plus 2pt minus 2pt}{2pt plus 2pt minus 2pt}
    \titlespacing\subsection{0pt}{8pt plus 2pt minus 2pt}{2pt plus 2pt minus 2pt}

    See this post for more details, but basically it’s \titlespacing{command}{left spacing}{before spacing}{after spacing}. This is handy because there’s a lot of empty space around titles/subtitles and it’s an easy way to trim a few lines while making sure things don’t get too cramped/ugly.

  • \usepackage{enumitem}: This package lets you control the spacing around your enumerate lists. The package has a lot of options but one that may be handy is \setlist{nosep} which removes the space around the list items. This actually makes things a little ugly, I think, but bulleted lists are helpful to the reviewer and they also take a little more space, so this lets you control the tradeoff. Another thing that is handy to control is the left margin: \setlist[itemize,1]{leftmargin=20pt}.
  • \usepackage{savetrees}: Prasad says it’s great, but I didn’t really use it. YMMV.


  • Sometimes it’s handy to have a new theorem environment for Specific Aims or Open Problems or what-have-you. The problem is (as usual) that the theorem environment by itself puts in extra space and isn’t particularly customizable. So one option is to define a new theorem style:

    \newtheoremstyle{mystyle}% name
    {5pt}%Space above
    {5pt}%Space below
    {\itshape}% Body font
    {5pt}%Indent amount
    {\bfseries}% Theorem head font
    {:}%Punctuation after theorem head
    {4pt}%Space after theorem head 2
    {}%Theorem head spec (can be left empty, meaning ‘normal’)

    \newtheorem{specaim}{Specific Aim}

  • Another handy hack is to make a different citation command to use for your own work that will then appear in a different color than normal citations if you use \usepackage[colorlinks]{hyperref}. I learned how to do this by asking a question on the stack exchange.


  • The hyperref package also creates internal links to equations and Figures (if you label them) and so on, but the link is usually just the number of the label, so you have to click on “1” instead of “Figure 1” being the link. One way to improve this is to make a custom reference command:

    \newcommand{\fref}[2]{\hyperref[#2]{#1 \ref*{#2}}}

    So now you can write \fref{Figure}{fig:myfig} to get “Figure 1” to be clickable.

  • You can also customize the colors for hyperlinks:


  • Depending on your SRO, they may ask you to deactivate URLs in the references section. I had to ask to figure this out, but basically putting \let\url\nolinkurl before the bibliography seemed to work…

ISIT 2014: how many samples do we need?

Due to jetlag, my CAREER proposal deadline, and perhaps a bit of general laziness, I didn’t take as many notes at ISIT as I would have, so my posting will be somewhat light (in addition to being almost a month delayed). If someone else took notes on some talks and wants to guest-post on it, let me know!

Strong Large Deviations for Composite Hypothesis Testing
Yen-Wei Huang (Microsoft Corporation, USA); Pierre Moulin (University of Illinois at Urbana-Champaign, USA)
This talk was actually given by Vincent Tan since neither of the authors could make it (this seems to be a theme of talks I’ve attended this summer. The paper was about testing a simple hypothesis H_1 versus a composite hypothesis H_0 where under H_0 the observations are i.i.d. with respect to one of possibly k different distributions. There are therefore k different errors and the goal is to characterize these errors when we ask for the probability of true detection to be greater than 1 - \epsilon. This is a sort of generalized Neyman-Pearson setup. They look at the vector of log-likelihood ratios and show that a threshold test is nearly optimal. At the time, I understood the idea of the proof, but I think it’s one of things where you need to really read the paper.

Randomized Sketches of Convex Programs with Sharp Guarantees
Mert Pilanci (University of California, Berkeley, USA); Martin J. Wainwright (University of California, Berkeley, USA)
This talk was about using random projections to lower the complexity of solving a convex program. Suppose we want to minimize \| Ax - y \|^2 over x given y. A sketch would be to solve \| SAx - Sy \|^2 where S is a random projection. One question is how to choose A. They show that choosing S to be a randomized Hadamard matrix (the paper studies Gaussian matrices), then the objective value of the sketched program is at most (1 + \epsilon)^2 times the value of the original program as long as the the number of rows of S is larger than O( \epsilon^{-2} \mathbb{W}^2(A \mathcal{K})), where \mathbb{W}(A \mathcal{K}) is the Gaussian width of the tangent cone of the contraint set at the optimum value. For more details look at their preprint on ArXiV.

On Efficiency and Low Sample Complexity in Phase Retrieval
Youssef Mroueh (MIT-IIT, USA); Lorenzo Rosasco (DIBRIS, Unige and LCSL – MIT, IIT, USA)
This was another talk not given by the authors. The problem is recovery of a complex vector x_0 \in \mathbb{C}^n from phaseless measurements of the form b_i = |\langle a_i, x_0 \rangle|^2 where a_i are complex spherically symmetric Gaussian vectors. Recovery from such measurements is nonconvex and tricky, but an alternating minimizing algorithm can reach a local optimum, and if you start it in a “good” initial position, it will find a global optimum. The contribution of this paper is provide such a smart initialization. The idea is to “pair” the measurements to create new measurements y_i = \mathrm{sign}( b_i^{(1)} - b_i^{(2)} ). This leads to a new problem (with half as many measurements) which is still hard, so they find a convex relaxation of that. I had thought briefly about such sensing setups a long time ago (and by thought, I mean puzzled over it at a coffeshop once), so it was interesting to see what was known about the problem.

Sorting with adversarial comparators and application to density estimation
Jayadev Acharya (University of California, San Diego, USA); Ashkan Jafarpour (University of California, San Diego, USA); Alon Orlitsky (University of California, San Diego, USA); Ananda Theertha Suresh (University of California, San Diego, USA)
Ashkan gave this talk on a problem where you have m samples from an unknown distribution p and a set of distributions \{q_1, q_2, \ldots, q_n\} to compare against. You want to find the distribution that is closest in \ell_1. One way to do this is via Scheffe tournament tht compares all pairs of distributions — this runs in time n^2 time. They show a method that runs in O(n) time by studying the structure of the comparators used in the sorting method. The motivation is that running comparisons can be expensive (especially if they involve human decisions) so we want to minimize the number of comparisons. The paper is significantly different than the talk, but I think it would definitely be interesting to those interested in discrete algorithms. The density estimation problem is really just a motivator — the sorting problem is far more general.