(Since this is my first SO question, let me just say I hope it's not too Zend-specific. As far as I can tell this shouldn't be a problem. Although I could have posted it in a Zend-specific forum, I feel like I'm at least as likely to get a good answer here, especially since the answer might involve MIME-related issues that transcend Zend Framework. I'm basically trying to understand whether the issue I'm facing should be considered a ZF bug, or if I'm misunderstanding something or misusing it.)
I've been using Zend_Mail to build up a MIME message that gets sent through SendGrid, an email distribution service. Their platform allows you to send emails through their SMTP server, but gives added features when you use a special header (X-SMTPAPI) whose value is a JSON-encoded string of proprietary parameters, which can get quite long.
Eventually, the header I was passing got too long (I think >1000 chars), and I got errors. I was confused because I knew that it was getting passed through PHP's native wordwrap() function before I passed the value to Zend_Mail::addHeader(), so I thought line length should never be a problem.
It turns out that addHeader() strips newlines very deliberately, and with no particular explanation by way of comments.
    // In Zend_Mail::addHeader()
    $value = $this->_filterOther($value);

    // In Zend_Mail::_filterOther()
    $rule = array("\r" => '',
                  "\n" => '',
                  "\t" => '',
                  );
    return strtr($data, $rule);
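To make the effect concrete, here is a minimal standalone sketch of what happens to a pre-wrapped value. The `filterOther()` function is a hypothetical stand-in that reuses the rule shown above, the payload is made up, and the `wordwrap()` parameters (72 columns, `"\n"` breaks, cutting long words) are assumptions, since the exact call isn't shown here.

```php
<?php
// Hypothetical stand-in for Zend_Mail::_filterOther(), using the rule shown above.
function filterOther($data)
{
    return strtr($data, array("\r" => '', "\n" => '', "\t" => ''));
}

// Sample payload; the key and the addresses are placeholders.
$json = json_encode(array('to' => array_fill(0, 100, 'user@example.com')));

// Assumed wrapping parameters: 72 columns, "\n" breaks, cutting long words.
$wrapped  = wordwrap($json, 72, "\n", true);
$filtered = filterOther($wrapped);

echo max(array_map('strlen', explode("\n", $wrapped))), "\n"; // longest wrapped line
echo strlen($filtered), "\n";                                  // length after filtering
var_dump($filtered === $json);                                 // was the wrapping undone?
```

Because this sample payload contains no spaces, the filter removes exactly the characters that `wordwrap()` inserted, so the value is back to one unbroken line well over 1000 characters.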
Ok, this seemed reasonable at first -- maybe ZF wants full control of formatting and line-wrapping. The next method called in Zend_Mail::addHeader() is
$value = $this->_encodeHeader($value);
This method encodes the value (either quoted-printable or base64 as appropriate) and chunks it into lines of appropriate length, but only if it contains "non-printable characters", as determined by Zend_Mime::isPrintable($value).
Looking into that method, newlines (\n) are indeed considered non-printable characters! So if only they hadn't been stripped out of the string in the previous method call, the long header would get encoded as QP and chunked into 72-char lines, and everything would work fine. In fact, I did a test where I commented out the call to _filterOther(), and the long header gets encoded and goes through with no problem. But now I've just made a careless hack to ZF without really understanding the purpose behind the line I removed, so this can't be a long-term solution.
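The interaction can be sketched without ZF at all. In the snippet below, `looksPrintable()` is a rough approximation of the check described above, not Zend_Mime's actual code: it assumes "printable" means every byte is in the ASCII range 0x20 to 0x7E. The sample value and the wrapping width are also made up.

```php
<?php
// Rough approximation of the printable check, NOT Zend_Mime's implementation:
// anything outside printable ASCII (0x20-0x7E) counts as non-printable.
function looksPrintable($value)
{
    return preg_match('/[^\x20-\x7E]/', $value) === 0;
}

$long     = str_repeat('{"k":"v"},', 150);   // 1500 printable characters, no spaces
$wrapped  = wordwrap($long, 72, "\n", true);  // now contains "\n"
$filtered = strtr($wrapped, array("\r" => '', "\n" => '', "\t" => ''));

var_dump(looksPrintable($wrapped));   // bool(false): this value would be encoded and chunked
var_dump(looksPrintable($filtered));  // bool(true): this value is passed through as one long line
```

So the only thing that would have triggered encoding (the newline) is removed one step before the check runs.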
My medium-term solution has been to extend Zend_Mail and create a new method, addHeaderForceEncode(), which will always encode the value of the header, and thus always chunk it into short lines. But I'm still not satisfied because I don't understand why that _filterOther() call was necessary in the first place -- maybe I shouldn't be working around it at all.
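For illustration, here is a standalone sketch of the "always encode" idea. It is not my actual subclass and it does not use Zend_Mime; `forceEncodeHeaderValue()` is a hypothetical helper that emits RFC 2047 base64 encoded-words, one per line, whether or not the value looks printable. It assumes single-byte text such as ASCII JSON, because it splits the value by bytes.

```php
<?php
// Hypothetical helper: always emit the value as base64 encoded-words
// (RFC 2047), one per line, regardless of whether it looks printable.
function forceEncodeHeaderValue($value, $charset = 'UTF-8')
{
    $words = array();
    // 45 raw bytes become 60 base64 characters, which keeps each
    // encoded-word under the 75-character limit. Splitting by bytes is
    // only safe for single-byte text such as ASCII JSON.
    foreach (str_split($value, 45) as $chunk) {
        $words[] = '=?' . $charset . '?B?' . base64_encode($chunk) . '?=';
    }
    // Join with CRLF + space so every continuation line is a valid fold.
    return implode("\r\n ", $words);
}

// Sample payload; the key and the addresses are placeholders.
$json    = json_encode(array('to' => array_fill(0, 100, 'user@example.com')));
$encoded = forceEncodeHeaderValue($json);

// Longest physical line in the encoded header value.
echo max(array_map('strlen', explode("\r\n", $encoded))), "\n";
```

The trade-off is that the receiving side has to decode the encoded-words before it can read the JSON.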
Can anyone explain to me why newlines are stripped like this? It seems to inevitably lead to situations where a header can get too long if it doesn't contain any "non-printable characters" other than newlines.
I've done a bunch of different searches on this subject and looked through some ZF bug reports, but haven't seen anyone talking about this. Surprisingly it seems to be a really obscure issue. FYI I'm working with ZF 1.11.11.
Update: In case anyone wants to follow the ZF issue I opened about this, here it is: Zend_Mail::addHeader() UNfolds long headers, then throws exception
Source: Tips4all
You're probably running into a few things. Per RFC 2821, text lines in SMTP can't exceed 1000 characters:
text line
The maximum total length of a text line including the <CRLF> is
1000 characters (not counting the leading dot duplicated for
transparency). This number may be increased by the use of SMTP
Service Extensions.
A header value can't contain bare newlines, so that's probably why Zend is stripping them. The legal way to break a long header across lines is to "wrap" it: insert a line break (CRLF in SMTP) followed by whitespace, commonly a tab.
Excerpt from RFC 822:
Each header field can be viewed as a single, logical line of
ASCII characters, comprising a field-name and a field-body.
For convenience, the field-body portion of this conceptual
entity can be split into a multiple-line representation; this
is called "folding". The general rule is that wherever there
may be linear-white-space (NOT simply LWSP-chars), a CRLF
immediately followed by AT LEAST one LWSP-char may instead be
inserted.
I would say that the _encodeHeader() function should possibly look at line length, and if the header is longer than some magic value, do the "wrap and tab" to have it span multiple lines.
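As a rough sketch of that "wrap and tab" idea, outside of Zend_Mail: `foldHeader()` below is a hypothetical helper, the width of 76 is an arbitrary choice, and the payload is made up. Folding needs whitespace to break at, so the sketch first adds a space after each comma in the JSON, which is only safe here because no string value in the sample contains a comma.

```php
<?php
// Hypothetical helper (not part of Zend_Mail): fold a header at whitespace,
// starting each continuation line with a tab, as RFC 822 folding permits.
function foldHeader($name, $value, $width = 76)
{
    // wordwrap() swaps the space at each break point for CRLF + tab.
    // With $cut = false, a single token longer than $width is left unbroken.
    return wordwrap($name . ': ' . $value, $width, "\r\n\t", false);
}

// Sample payload; the key and the addresses are placeholders.
$params = array('to' => array_fill(0, 100, 'user@example.com'));

// Naive way to create fold points: add a space after every comma.
// Only safe because no string value in this sample contains a comma.
$value = str_replace(',', ', ', json_encode($params));

$folded = foldHeader('X-SMTPAPI', $value);

// Unfolding (dropping each CRLF that precedes whitespace) leaves valid JSON,
// since a tab between JSON tokens is insignificant whitespace.
$unfolded = preg_replace('/\r\n(?=[ \t])/', '', $folded);
$decoded  = json_decode(substr($unfolded, strlen('X-SMTPAPI: ')), true);

var_dump($decoded === $params); // does the payload survive folding and unfolding?
```

A real implementation would also need a fallback for values that contain no whitespace at all, since there is then nowhere to fold.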