I got the name from Jim Massey who had suggested it as part of a title for the thesis of another doctoral student I know.
I have heard this attributed to Gallager, but the word “genie” might even come up in the Wozencraft-Jacobs book from the mid-60s (not sure!). I suspect that it goes back even further.
A little further searching along those directions turned up some more hits. On page 366 of Viterbi and Omura’s 1978 text Principles of Digital Communication and Coding, while discussing the distribution of computation in convolutional codes they write “[W]e begin by considering a sequential decoder aided by a benevolent genie who oversees the decoder action on each incorrect subset.”
But indeed, as Gerhard indicates, there is a reference in Wozencraft and Jacobs (1965). From Rimoldi and Urbanke’s paper on rate splitting, they write “[C]onceptually, we follow the lead of Wozencraft and Jacobs [29, p. 419] and postulate a genie who always knows the codeword of user 2…” Following up on that reference, in reference to the decoding of convolutional codes, Wozencraft and Jacobs write “… assume initially that a magic genie directs the decoder to the correct starting node for determining each …”
In the bibliographic notes in Viterbi and Omura, they write
As was noted previously, the original sequential decoding algorithm was proposed and analyzed by Wozencraft . The Fano algorithm , with various minor modifications, has been analyzed by Yudkin , Wozencraft and Jacobs , Gallager , and Jelinek [1968a]. Two versions of stack algorithms and their performance analyses are due to Zigangirov  and Jelinek [1969a]. The precise form of the Pareto distribution on computation emerged from the works of Savage  for the upper bound, and of Jacobs and Berlekamp  for the lower bound.
So it seems that if the argument is due to Wozencraft, the source of the genie argument in this context is probably due to the Wozencraft and Jacobs book, but the credit for the analogy to genies is probably lost in time to us.