Never, ever (ever) use the HTML 3.2 <center> tag in emails

SanfordWhiteman
Level 10 - Community Moderator

As we all know, cross-client-compatible email design means HTML 4-era <table> elements, a smattering of CSS2 styles (usually inlined, with <style> sporadically supported), and conditional VML for Outlook.

 

But don’t get that twisted and think other old-school HTML elements are safe to use. Sure, the <center> element achieves the alignment you want in all mail clients, if the person ever sees the email! The problem is that perfectly valid HTML inside a <center> can get the whole email flagged as spam due to longstanding antispam rules.

 

<center> was bad from the start

The <center> element is ancient in internet years. (Even in human years, it’ll soon be picking out outfits for its 10-year reunion.) It was only reluctantly accepted into HTML 3.2 because a prominent browser had invented it in the interim:

CENTER was introduced by Netscape before they added support for the HTML 3.0 DIV element. It is retained in HTML 3.2 on account of its widespread deployment.

 

So we’re talking about an element that was already frowned upon 25 years ago!

 

As the standard says, <center> is exactly the same as <div align="center"> and you should always use the latter.
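
If you have existing templates to clean up, the swap can be sketched in a few lines of Python. This is only a rough sketch, not a vetted migration tool: the helper name `replace_center` is invented for illustration, and it assumes plain, well-formed <center> tags that a regular expression can match safely (a real HTML parser would be the safer choice for messy markup).

```python
import re

# Hypothetical helper: swaps <center> for the equivalent <div align="center">.
# Assumes simple, well-formed <center> tags; any attributes on the opening
# tag are carried over to the <div>.
OPEN_CENTER = re.compile(r"<center(\s[^>]*)?>", re.IGNORECASE)
CLOSE_CENTER = re.compile(r"</center\s*>", re.IGNORECASE)

def replace_center(markup: str) -> str:
    """Return markup with every <center>...</center> pair rewritten as a div."""
    markup = OPEN_CENTER.sub(r'<div align="center"\1>', markup)
    return CLOSE_CENTER.sub("</div>", markup)

print(replace_center("<center><p>Hello</p></center>"))
# prints: <div align="center"><p>Hello</p></div>
```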

 

SpamAssassin has also been around (in a good way)

SpamAssassin is the still-beating open-source heart of antispam technology, protecting hundreds of millions of mailboxes.

 

If you...

  • don’t want your antispam layer in the cloud for security reasons
  • lack budget for a commercial app with thousands of licenses
  • want fine-grained control over every spam check and score

... SpamAssassin is the way to go. (We use it on our company’s IceWarp server to protect local accounts.)

 

SA is an antispam command center that does way more than just scan content for suspicious patterns. It also checks DNS block lists, verifies DKIM/SPF/DMARC, has its own Bayesian (loosely, “AI”) classifying filter, and passes content to external plugins that can do even more.

 

But it still does basic pattern matching. One built-in rule is called HTML_TAG_BALANCE_CENTER and it’s heavily weighted by default. Here’s an example of a legitimate message sent by Marketo that tripped HTML_TAG_BALANCE_CENTER and, primarily because of that, landed in the Spam folder:

185.28.196.182  [0EF8] 15:22:13 202302101522081207 From '<solutions@email.example.com>' (MAIL FROM '<123-KVM-456.0.101280.0.0.36620.9.506557589-11@envelope.example.com>') RCPT TO '<inboundtest+01@figureone.com>' score 5.10 required 5.00 reason [SpamAssassin=5.10:(HTML_MESSAGE=0.00,BAYES_00=-1.00,DKIM_VALID=-0.10,SPF_PASS=0.00,SPF_HELO_PASS=0.00,URIBL_BLOCKED=0.00,HTML_TAG_BALANCE_CENTER=3.60,THIS_AD=1.40,URI_TRY_3LD=1.20),Other=0.00] action SPAM

Here HTML_TAG_BALANCE_CENTER gave the message 3.60 points (in the spammy direction) toward the default spam threshold of 5.00 points. Had the rule not fired, the remaining 1.50 points would have left the message well under the threshold, and it would’ve sailed smoothly to the Inbox.
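
If you want to run the same breakdown on your own logs, here is a small Python sketch. The rule list is copied from the log line above; the regular expression and the names `parse_scores`, `REASON` and `THRESHOLD` are my own assumptions about this particular log layout, not a documented format.

```python
import re

# Rule scores copied from the "reason" field of the log line above.
REASON = (
    "HTML_MESSAGE=0.00,BAYES_00=-1.00,DKIM_VALID=-0.10,SPF_PASS=0.00,"
    "SPF_HELO_PASS=0.00,URIBL_BLOCKED=0.00,HTML_TAG_BALANCE_CENTER=3.60,"
    "THIS_AD=1.40,URI_TRY_3LD=1.20"
)
THRESHOLD = 5.00  # the "required" value shown in the same log line

def parse_scores(reason: str) -> dict[str, float]:
    """Split NAME=score pairs into a dict (assumes the layout shown above)."""
    pairs = re.findall(r"([A-Z0-9_]+)=(-?\d+\.\d+)", reason)
    return {name: float(value) for name, value in pairs}

scores = parse_scores(REASON)
total = round(sum(scores.values()), 2)
top_rule = max(scores, key=scores.get)            # rule adding the most points
without_top = round(total - scores[top_rule], 2)  # score if it had not fired

print(total, top_rule, without_top, without_top < THRESHOLD)
# prints: 5.1 HTML_TAG_BALANCE_CENTER 1.5 True
```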

 

What HTML_TAG_BALANCE_CENTER checks (vs. what it’s supposed to)

Even if HTML_TAG_BALANCE_CENTER worked properly, <center> would still be something to avoid. But it’s even worse because there’s a bug in SpamAssassin that causes messages to be scored even when there isn’t a violation.

 

The rule is supposed to check for unbalanced tags inside a <center> tag. That is, if you have an open <span> tag there must be a closing </span> tag at the right level, and so on for every level.[1]
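
To make the intended check concrete, here is a rough Python sketch of what “balanced inside <center>” means. To be clear, this is not how SpamAssassin implements the rule. The names `CenterBalanceChecker` and `unbalanced_inside_center` are invented for illustration, and the sketch is stricter than HTML itself, since it treats optional end tags like </td> as required.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they can't be "unbalanced".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

class CenterBalanceChecker(HTMLParser):
    """Illustrative only: tracks open tags while inside a <center> element."""

    def __init__(self):
        super().__init__()
        self.stack = []          # tags opened since the outermost <center>
        self.depth = 0           # number of <center> elements currently open
        self.unbalanced = False

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag == "center":
            self.depth += 1
        if self.depth:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if not self.depth or tag in VOID_TAGS:
            return               # outside <center>: not this check's concern
        if not self.stack or self.stack[-1] != tag:
            self.unbalanced = True   # closing tag doesn't match the open one
            return
        self.stack.pop()
        if tag == "center":
            self.depth -= 1

def unbalanced_inside_center(markup: str) -> bool:
    checker = CenterBalanceChecker()
    checker.feed(markup)
    checker.close()
    # Anything still on the stack was opened inside <center> and never closed.
    return checker.unbalanced or bool(checker.stack)

balanced = "<center><table><tr><td>Some stuff</td></tr></table></center>"
unclosed = "<center><table><tr><td>Some stuff</tr></table></center>"
no_center = "<table><tr><td>Some stuff</tr></table>"

print(unbalanced_inside_center(balanced))   # False
print(unbalanced_inside_center(unclosed))   # True
print(unbalanced_inside_center(no_center))  # False: nothing is inside a <center>
```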

 

Problem is it’s broken. Somewhere deep in the HTML parser code[2] — I haven’t found exactly where — it loses track of whether it’s inside or outside an element. So if you have markup like this:

<html>
    <body>
        <center>
            <table>
                <tr>
                    <td>
                        <table>
                            <tr>
                                <td>Some stuff</td>
                            </tr>
                        </table>
                        <table>
                            <tr>
                                <td>Other stuff</td>
                            </tr>
                        </table>                        
                    </td>
                </tr>
            </table>
        </center>
    </body>
</html>

SpamAssassin will claim there’s an unbalanced tag inside the <center>. But of course you can see that the <table> tags and all the other markup are fine. There just happen to be two sibling <table>s inside a <td>, which is perfectly valid.

 

If you don’t have the second <table> it’ll pass the test:

<html>
    <body>
        <center>
            <table>
                <tr>
                    <td>
                        <table>
                            <tr>
                                <td>Some stuff</td>
                            </tr>
                        </table>                       
                    </td>
                </tr>
            </table>
        </center>
    </body>
</html>

 

More notably, if you eschew the <center> tag and actually have unbalanced tags, you’ll also pass the test! It only cares if they’re inside a <center>:

<html>
    <body>
        <table>
            <tr>
                <td>
                    <table>
                        <tr>
                          <td>Some stuff
                        </tr>
                    </table>                       
                </td>
            </tr>
        </table>
    </body>
</html>

 

So now you know. Don’t use <center> because (a) you’ll inevitably mess up and leave unbalanced tags and (b) even if you didn’t mess up at all, SpamAssassin can think you did.

 
NOTES

[1] The reasons that unbalanced tags signify spamminess are lost to time, by the way. Presumably it had something to do with spamware that routinely created malformed emails, which a human — ha! — would not.

 

[2] It starts here, if you feel like reading Perl:

sub html_tag_balance {
  my ($self, $pms, undef, $rawtag, $rawexpr) = @_;

  return 0 if $rawtag !~ /^([a-zA-Z0-9]+)$/;
  my $tag = $1;

  return 0 if $rawexpr !~ /^([\<\>\=\!\-\+ 0-9]+)$/;
  my $expr = untaint_var($1);

  foreach my $html (@{$pms->{html_all}}) {
    next unless exists $html->{inside}{$tag};
    $html->{inside}{$tag} =~ /^([\<\>\=\!\-\+ 0-9]+)$/;
    my $val = untaint_var($1);
    return 1 if eval "\$val $expr";
  }

  return 0;
}
from SpamAssassin’s HTMLEval.pm