Skip to content

Parsing headers fails if trailing newline is missing #90

Open
@liclac

Description

@liclac

I'm using mail_parser to parse an email containing a Delivery Status Notification. They come in a message part that looks something like this:

--=_360a803c773b6a010a411a7b=0d5cefb0-e4ef-5aa9-80ba-ac9ecbbeedf1_=
Content-Type: message/delivery-status

Arrival-Date: Mon, 26 Jul 2021 10:19:42 +0200 (CEST)
Reporting-MTA: dns; example.com

Final-Recipient: rfc822; [email protected]
Status: 2.0.0
Original-Recipient: rfc822; [email protected]
--=_360a803c773b6a010a411a7b=0d5cefb0-e4ef-5aa9-80ba-ac9ecbbeedf1_=--

So the part itself has a single Content-Type: message/delivery-status header, followed by a group of "per-message" headers, followed by a group (or more) of "per-recipient" headers, all of which ends up in the "body" of the part. So I grab the part, run MessageParser on the body, grab the headers, run it again on the body of the new message, and so on.

It all works swimmingly right up until the end: because Delivery Status Notifications generally don't contain a trailing newline, msg.headers() will contain the last header, but with HeaderValue::Empty. Now, this is admittedly a bit of an abuse of the parser, and headers do normally contain a trailing CLRF, but I feel like if the message from the parser's perspective appears to end inside of a header value, the more expected behaviour would be to return what it has as the value.

Here's a test to reproduce this:

     #[test]
     fn test_headers_no_trailing_crlf() {
         use mail_parser::{Header, HeaderName, HeaderValue, MessageParser};
 
         let expected = &[
             Header {
                 name: HeaderName::Other("Foo".into()),
                 value: HeaderValue::Text("foo".into()),
                 offset_field: 0,
                 offset_start: 4,
                 offset_end: 10,
             },
             Header {
                 name: HeaderName::Other("Bar".into()),
                 value: HeaderValue::Text("bar".into()),
                 offset_field: 10,
                 offset_start: 14,
                 offset_end: 20,
             },
         ];
 
         // With trailing newline.
         let msg = MessageParser::new()
             .parse("Foo: foo\r\nBar: bar\r\n")
             .unwrap();
         assert_eq!(expected, msg.headers());
 
         // Without trailing newline.
         let msg = MessageParser::new().parse("Foo: foo\r\nBar: bar").unwrap();
         assert_eq!(expected, msg.headers());
     }

With the pretty_assertions::assert_eq! macro, this produces:

Diff < left / right > :
 [
     Header {
         name: Other(
             "Foo",
         ),
         value: Text(
             "foo",
         ),
         offset_field: 0,
         offset_start: 4,
         offset_end: 10,
     },
     Header {
         name: Other(
             "Bar",
         ),
<        value: Text(
<            "bar",
<        ),
>        value: Empty,
         offset_field: 10,
         offset_start: 14,
<        offset_end: 20,
>        offset_end: 18,
     },
 ]

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions