During a research conversation this week I realized that we really should model paper writing as a partially observed Markov decision problem (POMDP). The writer's action space consists of two kinds of statements: SIMPLE and FANCY. A SIMPLE statement follows very clearly from the previous statement and so should be easy to follow. A FANCY statement is a bit more challenging but might sound more impressive to the reviewer. The reviewer has three states: BORED, IMPRESSED, and CONFUSED. Depending on whether the input is SIMPLE or FANCY, the reviewer changes state according to a Markov chain on this state space.
The writer incurs a cost (or gains a reward, depending on how we want to model it) based on the current action and reviewer state. For example, issuing a FANCY statement when the reviewer is CONFUSED might incur a higher cost than issuing one when the reviewer is IMPRESSED. The goal is to minimize the expected discounted cost. Discounting makes sense since reviewers become more fatigued over time, so what happens early in the paper should count for more than what happens late.
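To pin down the objective, here is one way it could be written. The notation is my own assumption rather than anything standard for this problem: s_t is the reviewer's state after t statements, a_t is the writer's t-th statement, c is the per-statement cost, P is the reviewer's transition kernel, and γ is the discount factor.

```latex
\min_{\pi}\; J(\pi) = \mathbb{E}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right],
\qquad s_{t+1} \sim P(\,\cdot \mid s_t, a_t), \quad 0 < \gamma < 1,
\\
s_t \in \{\text{BORED}, \text{IMPRESSED}, \text{CONFUSED}\}, \qquad
a_t \in \{\text{SIMPLE}, \text{FANCY}\}.
```

Since the writer does not see s_t directly, the policy π can only choose a_t based on past statements and whatever observations of the reviewer are available, not on the state itself.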
Now, conventional wisdom might suggest that clarity is best, so a pure-SIMPLE policy should be a dominant strategy for this problem. However, depending on the cost and transition structure, I am not sure this is true.
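To see how pure-SIMPLE could fail to be optimal, here is a rough sketch with entirely made-up numbers. The transition probabilities, costs, and discount factor below are hypothetical, chosen only for illustration. The code compares writing SIMPLE forever against a plan that opens with one FANCY statement and then writes SIMPLE forever. That second plan needs no feedback from the reviewer, so it stays feasible even when the reviewer's state is hidden.

```python
STATES = ("BORED", "IMPRESSED", "CONFUSED")
GAMMA = 0.9  # hypothetical discount factor

# Hypothetical transitions: P[action][state] lists the probabilities of the
# next reviewer state, in the same order as STATES.
P = {
    "SIMPLE": {"BORED": (0.9, 0.1, 0.0),
               "IMPRESSED": (0.5, 0.5, 0.0),
               "CONFUSED": (0.6, 0.1, 0.3)},
    "FANCY": {"BORED": (0.2, 0.5, 0.3),
              "IMPRESSED": (0.0, 0.7, 0.3),
              "CONFUSED": (0.0, 0.1, 0.9)},
}

# Hypothetical per-statement costs: COST[action][state].
COST = {
    "SIMPLE": {"BORED": 1.0, "IMPRESSED": 0.2, "CONFUSED": 1.5},
    "FANCY": {"BORED": 0.5, "IMPRESSED": 0.0, "CONFUSED": 3.0},
}


def backup(action, state, values):
    """Cost of `action` in `state` now, plus the discounted expected `values` afterwards."""
    future = sum(p * values[nxt] for p, nxt in zip(P[action][state], STATES))
    return COST[action][state] + GAMMA * future


def evaluate_always(action, sweeps=500):
    """Expected discounted cost of repeating `action` forever, for each starting state."""
    values = {s: 0.0 for s in STATES}
    for _ in range(sweeps):  # fixed-point iteration; converges since GAMMA < 1
        values = {s: backup(action, s, values) for s in STATES}
    return values


always_simple = evaluate_always("SIMPLE")
# One FANCY statement first, then SIMPLE forever.
fancy_then_simple = {s: backup("FANCY", s, always_simple) for s in STATES}

for s in STATES:
    print(f"{s:10s} always SIMPLE: {always_simple[s]:6.3f}   "
          f"FANCY then SIMPLE: {fancy_then_simple[s]:6.3f}")
```

With these particular numbers, opening with a FANCY statement lowers the expected discounted cost when the reviewer starts out BORED or IMPRESSED and raises it when the reviewer starts out CONFUSED. So under this toy model pure-SIMPLE is not optimal from every starting state, though a different choice of costs and transitions could easily reverse that.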
And now, back to paper writing for me…