Implement content sniffing for HTML parsing #808

Instead of looking for an "html" substring, actually parse the MIME type string. Don't use mime.ParseMediaType, though, as it returns an error on invalid duplicate parameters (e.g. "text/html; charset=UTF-8; charset=utf-8") that occur in the wild.
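
For illustration, a lenient check along these lines could look like the sketch below. It is not necessarily the code in this change: the helper name isHTMLContentType is hypothetical, and the sketch assumes that only the type/subtype part before the first ";" matters, so parameters (duplicated or not) are ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// isHTMLContentType is a hypothetical helper (not part of Colly's API).
// It reports whether a Content-Type header value names text/html,
// ignoring any parameters, including invalid duplicate ones.
func isHTMLContentType(contentType string) bool {
	mediaType := contentType
	// Keep only the type/subtype part before the first ';'.
	if i := strings.IndexByte(mediaType, ';'); i >= 0 {
		mediaType = mediaType[:i]
	}
	// MIME types are case-insensitive and may be padded with whitespace.
	mediaType = strings.ToLower(strings.TrimSpace(mediaType))
	return mediaType == "text/html"
}

func main() {
	fmt.Println(isHTMLContentType("text/html; charset=UTF-8; charset=utf-8")) // true
	fmt.Println(isHTMLContentType("TEXT/HTML"))                               // true
	fmt.Println(isHTMLContentType("application/json"))                        // false
}
```

Unlike a substring search, this does not match unrelated values that merely contain "html", such as a parameter value.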

Web pages can be served without a Content-Type header, in which case browsers employ content sniffing. Do the same here, in Colly.
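
As a rough sketch of what sniffing can look like in Go, the standard library's http.DetectContentType inspects at most the first 512 bytes of a body and returns a MIME type. Whether this change uses that function is not stated here, and the helper name sniffIsHTML is hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// sniffIsHTML is a hypothetical helper (not part of Colly's API).
// It guesses whether a response body is HTML when no Content-Type
// header is available.
func sniffIsHTML(body []byte) bool {
	// DetectContentType looks at no more than the first 512 bytes and
	// falls back to a generic type when it cannot recognize the data.
	detected := http.DetectContentType(body)
	// For HTML the result carries a charset parameter, so compare the prefix.
	return strings.HasPrefix(detected, "text/html")
}

func main() {
	fmt.Println(sniffIsHTML([]byte("<!DOCTYPE html><html><body>hi</body></html>"))) // true
	fmt.Println(sniffIsHTML([]byte(`{"key": "value"}`)))                            // false
}
```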