Because warehouse doesn't send a charset, the response text can be decoded as something other than UTF-8, and re-encoding that text as UTF-8 does not give back the same bytes. For hashing especially, just use the original bytes.

(cherry picked from commit 28d5c007)
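
A minimal sketch of the failure mode, not the code touched by this commit. The payload and the ISO-8859-1 fallback are assumptions for illustration; the actual fallback encoding depends on the HTTP client in use.

```python
import hashlib

# Hypothetical payload: UTF-8 bytes containing one non-ASCII character.
original = "naïve".encode("utf-8")

# Assumption: with no charset in the Content-Type header, the client
# guesses ISO-8859-1 when turning the body into text.
text = original.decode("iso-8859-1")

# Re-encoding the mis-decoded text as UTF-8 produces different bytes.
round_tripped = text.encode("utf-8")


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


hash_of_original = sha256_hex(original)
hash_of_round_trip = sha256_hex(round_tripped)

# The digests differ, so any hash check must use the original bytes.
print(original, round_tripped)
print(hash_of_original == hash_of_round_trip)
```

Hashing the raw bytes avoids the problem because no decoding step is involved at all.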