Debugging SMTP Conversations Part 1: How to Speak SMTP
Amazon SES strives to make your email sending as simple and quick as possible, which means that users of our HTTP API don’t even have to worry about what an SMTP conversation is or how to capture one. Even a lot of our SMTP interface users outsource the problem to software like Microsoft Outlook or PHP Mailer that takes care of these details for them. But if you’re experiencing an issue sending mail that can’t be explained by a recent code change or an error message from SES, understanding how an SMTP conversation works can be helpful. Or maybe you’re just curious as to what those bits flying around look like. In today’s blog post, the first in a series to help you debug these issues, we’ll go over the basics of an SMTP conversation.
The Simple Mail Transfer Protocol (SMTP) was first officially put into writing in 1982 in RFC 821 as a way to “transfer mail reliably and efficiently”, but the protocol that the majority of ISPs use today is described in RFC 5321. The protocol includes a basic handshake, supported commands and responses at each step of the conversation, and finally transmission of message content. Here is a summary of the various steps of a typical conversation, without any extensions involved:
- An SMTP client opens a connection with an SMTP server. Generally, this is on port 25 or 587.
- The SMTP server responds with a 220 code and may follow that with a header that describes the server. It could also respond with a 554 status code to reject the connection, and then the client’s only option would be the QUIT command.
- The SMTP client sends either an EHLO or HELO command to the server. Either command must be followed by a space and then the domain of the client. Contemporary clients and servers should support EHLO, so EHLO is what we’ll use in this example.
- The SMTP server should respond to the EHLO with the 250 status code, its domain name, and a server greeting, and one line for every SMTP extension it supports.
- Now the SMTP client is in business and can start defining the mail to be sent, starting with what’s commonly referred to as the “envelope from” header. The client sends “MAIL FROM:” followed by a reverse-path address, which defines where bounce messages are sent if the message can’t be delivered after it’s accepted (receiving MTAs add this to incoming mail with the Return-Path header). If the mail being sent is a bounce message, this address should be empty (i.e., “<>”). The reverse-path address can optionally be followed by mail parameters defined by a supported SMTP extension (as advertised in the EHLO reply). The conversation cannot proceed until this MAIL FROM command is sent and accepted.
- The SMTP server must accept the MAIL FROM with a “250 OK” reply if the format is good and the address is deemed acceptable. Otherwise, the server typically responds with a 550 or 553 error response to indicate whether the failure is temporary or permanent. Other acceptable error status codes are 552, 451, 452, 503, 455, and 555 (see RFC 5321 for their definitions).
- The SMTP client can now define whom the email is for using the RCPT TO command. The syntax is very similar to MAIL FROM: it must be the literal “RCPT TO:” followed by the forward-path address surrounded by angle brackets. This can also optionally be followed by any parameters necessary to use an SMTP extension advertised in the EHLO reply. The RCPT TO command can only define a single recipient. If there are multiple recipients, the command can be issued multiple times, but the client needs to wait for a response from the server each time before supplying another destination.
- The SMTP server usually validates that the address is deliverable and responds with “250 OK” if it is. Otherwise, it typically returns a 550 reply. Other acceptable error status codes are 551, 552, 553, 450, 451, 452, 503, 455, and 555 (see RFC 5321 for their definitions). Some servers will accept all mail and only validate the destination after the SMTP conversation has completed.
- Finally, the SMTP client can initiate sending the body of the email by issuing the DATA command with no other text after it.
- The SMTP server responds with a 354 reply if it is ready to accept the message, or else a 503 or 554 if there was no valid MAIL FROM or RCPT TO command sent.
- If the SMTP server responded with a 354 reply, the client submits the message text, followed by the end of mail data indicator, which is a line containing only a period. The message generally starts with headers (one per line, e.g., “Header-name: header-value”) and then is followed by the body of the message.
- If the message is accepted, the SMTP server replies with a “250 OK” reply.
- The client can now initiate a new conversation with the server or send the “QUIT” command to politely close out the connection.
- If the SMTP server receives a QUIT, it is supposed to send a “221 OK” reply and then close the connection.
For deeper details of SMTP, please refer to RFC 5321. You can communicate with SES via SMTP over a number of clients including Microsoft Outlook – see the developer guide for more details. If you’d like to see the conversation yourself, you can also use telnet. There are a few additional things worth noting:
- The server must not intentionally close the connection unless it sees a QUIT command except in the case of a timeout or if the server has to go down.
- At any time during the conversation, the SMTP client can send the RSET command (just “RSET”, no additional parameters) to abort the current mail transaction so that the conversation can start fresh.
- Other supported commands are VRFY to verify that a string represents a valid user or mailbox, EXPN to confirm that a string identifies a mailing list and returns membership of that list, NOOP to get a “250 OK” reply from the server, and HELP to get help information.
- One notable extension that Amazon SES supports for secure communication on ports 25, 587, and 2587 is STARTTLS, as detailed in RFC 3207. After the EHLO, the client sends the command “STARTTLS”, the server replies with “220 Ready to start TLS” (or 501 or 454 error codes), and then TLS negotiation occurs to set up encryption keys. The conversation is then reset and must start with EHLO all over again with all transactions from here on out encrypted. Amazon SES also supports TLS wrapper mode on ports 465 and 2465.
- Another notable extension that Amazon SES requires for authentication is AUTH PLAIN LOGIN, which is where you would enter your SMTP credentials. This is explained in some detail in the Developer’s Guide, but a recent blog post also goes into authentication in general.
Here is a sample SMTP conversation with SES in TLS wrapper mode with the conversation contents decrypted. The blue text comes from Amazon SES and the red text comes from a sample client:
PROXY TCP4 188.8.131.52 10.44.15.76 14659 465
220 email-smtp.amazonaws.com ESMTP SimpleEmailService-793939519 iqAfLvOj6BjiiCjSnD6S
250-AUTH PLAIN LOGIN
235 Authentication successful.
354 End data with <CR><LF>.<CR><LF>
Subject: Test message
From: Senior Tester <firstname.lastname@example.org>
Content-Type: text/html; charset=”UTF-8″
<b>Cool email body</b>
250 Ok 0000012345678e09-123a4cdc-b56c-78dd-b90e-d123be456789-000000
We hope that you feel better informed now on how email works at a high level! In the next post of this series we’ll go over how you can easily capture a live SMTP conversation. Thanks for being a customer of Amazon SES!