AWS Messaging & Targeting Blog

A Guide to Optimizing SMS Delivery and Best Practices

by Tyler Holmes, Casey Forrest, and Patrick Viker

If you’re sending critical SMS messages to your users, including one-time passwords (OTPs), appointment reminders, urgent alerts, and marketing messages, you know how important reliable delivery is. SMS has become the backbone of modern business communications, and for good reason. Its ubiquitous nature and high engagement rates make it the go-to channel for reaching users globally.

But here’s the thing. As your messaging volumes grow and you rely more heavily on SMS for critical communications, you might notice that not every message gets delivered exactly as planned. And that’s normal across the entire telecommunications industry. In fact, expecting 100% delivery rates for SMS messages is like expecting every flight to arrive exactly on time. It’s simply not realistic given the inherent complexity of global telecommunications networks and systems. This isn’t unique to any particular messaging provider. It’s the nature of SMS delivery itself. Don’t worry, you’re not alone. Let’s dive into why this happens and show you how to build a rock-solid messaging strategy that works for your business.

In this post, we’ll explore what really happens when you send an SMS, share practical monitoring techniques that go beyond basic delivery receipts, and show you how to build redundancy into your messaging architecture. By the end, you’ll have the knowledge and tools to optimize your messaging operations using AWS End User Messaging.

Understanding SMS Delivery Complexities

Have you ever wondered what happens when you hit send on that crucial OTP message or important customer alert? The journey your message takes is more complex than you might think. Let’s break it down.

Your message begins its journey through a network ecosystem that involves multiple carriers, systems, and devices before reaching your end user. While this might sound daunting, AWS End User Messaging works continuously to optimize and simplify these delivery paths and maintain strong relationships within the messaging ecosystem.

So what makes SMS delivery so complex? Think of it like air travel – even with the best airlines and optimal conditions, various factors can affect whether a flight arrives exactly on time. The same is true for SMS messages.

Network infrastructure plays a crucial role. Just as flights navigate through different airspaces, your messages traverse carrier networks to reach their destination. AWS End User Messaging actively participates with SMS providers, carriers, and regulators to ensure up-to-date compliance and optimal delivery performance. However, just like air traffic control might need to redirect flights, message routing isn’t always straightforward. Network congestion, carrier maintenance (both scheduled and unplanned), country regulatory changes, and handset network availability can occasionally affect message routing.

Carriers add another layer of complexity. Each carrier has its own set of rules and policies – think of them as different airports with their own specific regulations. They implement various filtering and anti-spam policies, handle message queuing differently, and may occasionally block messages without proactive notice if they detect suspicious patterns or unusual volumes. This is actually a good thing – it helps protect users from fraud, even though it can sometimes affect legitimate messages.

Then there’s the final destination: the end user’s device. Even if your message successfully navigates the network and carrier challenges, the recipient’s phone might be turned off, in an area with poor coverage, or simply out of storage space. It’s similar to a passenger missing their flight because their vehicle broke down on the way to the airport. In the same way, an SMS can be lost to purely local conditions on the recipient’s side, even when everything upstream worked.

This is why focusing solely on delivery reports doesn’t tell the whole story. For instance, you might receive a successful delivery receipt from a carrier, but the end user’s phone could be in airplane mode. Or conversely, a message might show as undelivered in carrier reports, but the user actually received it after a slight delay.

Understanding these complexities helps explain why achieving 100% delivery rates isn’t realistic. Instead of pursuing perfect delivery rates, successful messaging strategies focus on multiple factors:

  • Building comprehensive monitoring systems
  • Following messaging content best practices, such as using Global System for Mobile Communications (GSM) characters
  • Maintaining appropriate message lengths
  • Following URL best practices
  • Ensuring compliance with carrier requirements
  • Providing alternative delivery channels
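
To make the content points above concrete, here is a minimal sketch that checks whether a message fits the GSM 7-bit character set and estimates how many message parts it needs. The function name `sms_segments` is hypothetical, and the limits (160/153 characters for GSM-7, 70/67 for UCS-2) are the commonly cited SMS sizes. The sketch ignores the edge case where a two-septet character lands on a part boundary, so treat the result as an estimate.

```python
# Characters in the GSM 03.38 basic set (one septet each).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension-table characters cost two septets (escape + character).
GSM7_EXTENDED = "^{}\\[~]|€\f"


def sms_segments(text: str) -> tuple[str, int]:
    """Return (encoding, estimated number of message parts) for an SMS body."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        septets = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        if septets <= 160:
            return "GSM-7", 1
        return "GSM-7", -(-septets // 153)  # ceiling division
    # Anything outside GSM-7 forces UCS-2; count UTF-16 code units.
    units = len(text.encode("utf-16-be")) // 2
    if units <= 70:
        return "UCS-2", 1
    return "UCS-2", -(-units // 67)
```

A single emoji or accented character outside the GSM set switches the whole message to UCS-2, which is why a short-looking message can still be billed as several parts.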

Next, we’ll dive deeper into some of these factors and explore how to set up effective monitoring practices that give you real insight into your message delivery success.

The Reality of Delivery Rates

Let’s address something important: if you’re expecting 100% delivery rates for your SMS messages, you’ll need to adjust those expectations, as this is not the reality of the industry. This is true regardless of which messaging provider you use – it’s simply the nature of how SMS works within global telecommunications networks. Even in optimal conditions, various factors can affect delivery:

  • Network conditions in different countries
  • Carrier policies and filtering systems
  • End-user device states
  • Local regulations and requirements
  • Natural or infrastructure disruptions (for example: cable cuts, wildfires, tsunamis, or other environmental events)

Think of it like this: even the world’s most reliable airline can’t guarantee every flight will arrive exactly on time. Weather patterns change. Airports face congestion. Maintenance needs arise unexpectedly. SMS delivery navigates similar real-world challenges.

What matters is understanding what “good” looks like for your specific use case. Just as an 85% on-time arrival rate might be excellent for flights in winter conditions but below average in clear weather, a 95% SMS delivery rate might be excellent in one country but below average in another. This is why establishing baseline metrics for different regions and message types is so crucial.
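
One way to establish those baselines is to group your own send history by country and message type. The sketch below assumes a hypothetical record shape of `(country, message_type, delivered)` tuples; your real data will likely come from event logs and look different.

```python
from collections import defaultdict


def delivery_baselines(records):
    """Compute the delivery rate per (country, message_type).

    `records` is an iterable of (country, message_type, delivered) tuples,
    a hypothetical shape chosen for this sketch.
    """
    sent = defaultdict(int)
    delivered = defaultdict(int)
    for country, message_type, was_delivered in records:
        key = (country, message_type)
        sent[key] += 1
        if was_delivered:
            delivered[key] += 1
    # Every key in `sent` has at least one record, so division is safe.
    return {key: delivered[key] / sent[key] for key in sent}
```

Keeping the baseline keyed by both dimensions lets you compare OTP traffic to one country against its own history rather than against a global average.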

Strategies for Reliable Delivery

Now that we understand why 100% delivery isn’t realistic, let’s talk about strategies to maximize your success rates.

The Art of the Retry

When a message doesn’t get through, having a retry strategy is crucial. But it’s not as simple as “try, try again.” You need to be thoughtful about:

  • How long to wait between attempts
  • How many times to retry
  • When to switch to a different channel
  • The cost implications of your retry strategy

Think of it like following up on an important email – you wouldn’t send the same email every 5 minutes, but you might try different approaches over time.

Important Anti-Abuse Note: Always implement reasonable limits on your retry features. This prevents both intentional and unintentional abuse of the system, ensuring fair usage and maintaining the integrity of the service for all users.
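
As a rough sketch of these ideas, the helpers below build a capped exponential backoff schedule and decide when to stop retrying and move to another channel. The names and default values (`retry_delays`, `next_action`, a 30-second base, four retries) are illustrative assumptions, not recommendations from AWS.

```python
def retry_delays(base_seconds=30.0, factor=2.0, max_delay=600.0, max_retries=4):
    """Return the wait (in seconds) before each retry, capped in size and count."""
    delays = []
    delay = base_seconds
    for _ in range(max_retries):
        delays.append(min(delay, max_delay))
        delay *= factor
    return delays


def next_action(retries_made, delays):
    """After a failed attempt, either wait and retry or switch channel."""
    if retries_made < len(delays):
        return ("retry", delays[retries_made])
    # The retry budget is spent: stop resending on this channel.
    return ("switch_channel", None)
```

The hard cap on the number of retries is what enforces the anti-abuse limit, and it also bounds the cost of any single message.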

This retry strategy forms just one part of a comprehensive approach to reliable message delivery. Later in this post, we’ll explore how to build additional resilience through multi-channel messaging strategies that give you multiple paths to reach your users.

Establishing Effective Monitoring Practices

Let’s talk about what really matters: knowing if your messages are actually reaching your users. Sure, carrier delivery receipts are useful, but they’re just one piece of the puzzle. Just as airlines don’t rely solely on flight trackers to measure success, you need a more comprehensive view of your messaging performance.

So how do you get the full picture? It starts with understanding what “normal” looks like for your messaging patterns.

Getting to Know Your Baseline

Just like you know your typical website traffic patterns or customer service volume, you need to understand your typical message delivery patterns. What’s a normal delivery rate for messages to India versus messages to the United States? How do your success rates vary between weekdays and weekends? What about during peak shopping seasons?

This baseline knowledge becomes your compass, helping you quickly spot when something’s not quite right. But how do you build this understanding? This is where the AWS End User Messaging Message Feedback API comes in handy.

Putting the Message Feedback API to Work

Here’s the thing about carrier delivery receipts: they can take up to 72 hours to arrive, and timing varies by country. That’s like waiting three days to know if your customer got their one-time password! Instead of playing this waiting game, you can use the Message Feedback API to gain real-time insights into message delivery.

Let’s say you’re sending OTP codes. When a user successfully enters their code, that’s a clear signal they received your message. With the Message Feedback API, you can record this action, marking the message as successfully delivered. Not only does this give you immediate feedback, but it also helps build a more accurate picture of your actual delivery success rates.

But what about messages that don’t get a response? After an hour without user interaction, the Message Feedback API will mark these messages as failed. This helps you maintain accurate metrics and quickly identify potential delivery issues.
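
Here is a minimal sketch of recording that feedback from your OTP verification code path. It assumes `client` behaves like `boto3.client("pinpoint-sms-voice-v2")` and that the operation is exposed as `put_message_feedback` with `MessageId` and `MessageFeedbackStatus` parameters; check the current API reference before relying on those names. The helper `record_otp_result` is hypothetical.

```python
def record_otp_result(client, message_id, code_accepted):
    """Report whether the recipient acted on a message.

    `client` is assumed to behave like boto3.client("pinpoint-sms-voice-v2").
    The method and parameter names below are assumptions to verify against
    the API reference.
    """
    status = "RECEIVED" if code_accepted else "FAILED"
    client.put_message_feedback(
        MessageId=message_id,
        MessageFeedbackStatus=status,
    )
    return status
```

Passing the client in as an argument keeps the function easy to exercise with a stub in unit tests, without sending any real traffic.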

Building a Complete Monitoring Strategy

Your monitoring strategy should be like a flight operations center, combining multiple data sources and ready to respond to changing conditions.

Message Feedback Data: This is your real-time insight into user interactions. Are recipients completing the actions your messages are meant to trigger? Are OTP codes being used? Are links being clicked?

CloudWatch Metrics: Set up alerts that make sense for your business. If your typical OTP conversion rate is 85%, you might want to know if it suddenly drops below 80%. Remember, these aren’t perfect numbers, and they’re not meant to be. Different messages might need different thresholds. What’s acceptable for a marketing message might not be acceptable for a security verification code. The key is understanding your normal delivery rates and monitoring for significant deviations from that baseline. The AWS End User Messaging documentation has more information on setting up CloudWatch.

User Behavior Patterns: Pay attention to how users interact with your messages. Are certain types of messages more successful than others? Do some regions consistently show different patterns? This information is gold for optimizing your messaging strategy.

The key is to look for patterns. Maybe your delivery rates dip at certain times of day, or perhaps specific types of messages have lower success rates. These patterns help you adapt and improve your messaging strategy over time.
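
The threshold idea from the CloudWatch paragraph (a typical 85% rate, with an alert below 80%) can be sketched as a simple deviation check. The function name, the per-type tolerances, and the minimum sample size are all hypothetical values for illustration; in practice you would express the same rule as an alarm on your own metrics.

```python
# Hypothetical tolerances: tighter for OTP than for marketing traffic.
MAX_DROP = {"OTP": 0.05, "MARKETING": 0.15}


def needs_attention(current_rate, baseline_rate, max_drop, sample_size, min_samples=100):
    """True when the rate has fallen more than `max_drop` below its baseline.

    Small samples are ignored so a handful of failures does not trigger
    a noisy alert.
    """
    if sample_size < min_samples:
        return False
    return current_rate < baseline_rate - max_drop
```

For example, with an 85% baseline and the OTP tolerance above, a measured rate of 78% over 500 messages would be flagged, while 83% would not.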

Remember, monitoring isn’t just about catching problems. It’s about understanding your messaging ecosystem and continuously improving it. When you do spot an issue, you’ll need to know how to investigate and resolve it quickly.

Investigation and Troubleshooting Strategy

Even with the best monitoring in place, you’ll occasionally run into delivery challenges. Just as airlines have procedures for investigating flight delays, you need a systematic approach to investigate and resolve messaging issues quickly.

Spotting the Signs

Just as air traffic controllers monitor multiple indicators for potential issues, your messaging system has key indicators that signal when something needs attention. The most reliable ones come directly from your customers’ experiences:

  • A sudden drop in OTP conversion rates
  • An uptick in customer complaints about missing messages
  • Unusual patterns in your message feedback data
  • Spikes in failed delivery attempts

Customer-driven signals are your most accurate indicators of messaging health. When these metrics change significantly, particularly in one-time password (OTP) conversion rates and customer complaints, it’s crucial to investigate the underlying causes and understand their impact on user experience.

Playing Detective

When you notice something’s off, start by narrowing down the scope. AWS End User Messaging provides detailed event data that helps you investigate delivery issues. Let’s look at what information you have at your fingertips:

Message Events contain crucial investigation data like:

  • Country (isoCountryCode): Which countries are affected?
  • Carrier information (carrierName): Is this specific to certain carriers?
  • Timing (eventTimestamp): Are issues occurring at particular times?
  • Message status and description: What’s happening with the message?
  • Message type and encoding: Could content formatting be a factor?

Some of the most important things to configure in End User Messaging are event destinations, which deliver these message events to you for analysis. A separate in-depth post covers how to configure them. Delivery events carry the fields listed above, and reviewing them side by side helps paint the picture.
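
As a rough illustration, the sketch below defines a sample event and groups non-delivered events by country and carrier to narrow the scope. Only `isoCountryCode`, `carrierName`, and `eventTimestamp` are field names taken from the list above; the other fields, every value, and the helper `failures_by_country_and_carrier` are assumptions for this example, so compare them with the events you actually receive.

```python
import json
from collections import Counter

# Illustrative event only. isoCountryCode, carrierName, and eventTimestamp are
# named in the text; the remaining fields and all values are assumptions.
SAMPLE_EVENT = """
{
  "eventType": "TEXT_DELIVERED",
  "eventTimestamp": 1700000000000,
  "isoCountryCode": "US",
  "carrierName": "ExampleCarrier",
  "messageId": "example-message-id",
  "messageStatus": "DELIVERED",
  "messageStatusDescription": "Message has been accepted by phone",
  "messageType": "TRANSACTIONAL",
  "messageEncoding": "GSM"
}
"""


def failures_by_country_and_carrier(events, ok_statuses=("DELIVERED",)):
    """Count events whose status is not in `ok_statuses`, per (country, carrier)."""
    counts = Counter()
    for event in events:
        if event.get("messageStatus") not in ok_statuses:
            key = (event.get("isoCountryCode"), event.get("carrierName"))
            counts[key] += 1
    return counts


event = json.loads(SAMPLE_EVENT)
print(failures_by_country_and_carrier([event]))  # no failures in this sample
```

If one country or carrier dominates the counts, that tells you where to focus before looking at message content or timing.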

Understanding these events helps identify patterns. Maybe you recently changed your message templates, or perhaps you’re sending higher volumes than usual. These could be important clues to investigate.

When to Call in AWS Support

Time is crucial when investigating SMS issues. For ongoing problems, carriers need recent examples – ideally messages sent within the last 48 hours. This allows them to investigate current network conditions and message flows.

Even for historical issues that are no longer occurring, fresh data is still required for investigations. If you’re reporting a past problem, try to provide the most recent examples possible. Be aware that if an issue is too old, providers may be unable to conduct a root cause analysis due to log retention policies and other limitations.

The SMS ecosystem involves multiple third parties, each playing a role in message delivery. Investigating issues often requires coordinating with these various entities, which can extend the time needed to determine the root cause. In some cases, if the issue is old enough, a complete analysis may not be possible.

Prompt reporting is key. The sooner you alert us to an issue, the better chance we have of gathering relevant data and working with carriers to resolve the problem or provide meaningful insights.

If you spot significant issues and you have AWS Premium Support (a paid service that provides additional assistance), don’t hesitate to contact them. But here’s the key to getting quick results: provide comprehensive information. Remember that “my message didn’t deliver” isn’t nearly as helpful as “we’ve seen a 20% drop in OTP conversion rates for messages to Country X over the past 4 hours, affecting approximately 1,000 messages. Here are message IDs to investigate.”

What is Required by Support to Help You:

  • Country having SMS issues
  • Clear data showing the scope of the issue
  • Multiple example message IDs and phone numbers that reflect the scale of the problem:
    • For widespread issues affecting thousands of messages, provide dozens of examples
    • For regional issues, include examples from different affected areas
    • For carrier-specific problems, include examples across affected carriers
  • Dates and times for when the issue started and any patterns you’ve noticed
  • Relevant message feedback data

The affected downstream carrier and our support team require detailed information to help resolve delivery problems. A few example numbers aren’t enough if you’re seeing a widespread issue. The scale of your evidence should match the scale of the problem.
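
When you gather examples for a support case, it helps to spread them across the affected carriers rather than sending the first few you find. The sketch below does that round-robin; the function name `pick_examples` and the event keys `carrierName` and `messageId` are assumptions about how you store failed events.

```python
from collections import defaultdict
from itertools import zip_longest


def pick_examples(failed_events, max_examples):
    """Pick message IDs round-robin across carriers so each one is represented."""
    by_carrier = defaultdict(list)
    for event in failed_events:
        by_carrier[event["carrierName"]].append(event["messageId"])
    picked = []
    # zip_longest walks all carriers in step, padding shorter lists with None.
    for batch in zip_longest(*by_carrier.values()):
        for message_id in batch:
            if message_id is not None and len(picked) < max_examples:
                picked.append(message_id)
    return picked
```

Raise `max_examples` as the incident grows, so the evidence you attach reflects the scale of what you are seeing.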

Investigating Issues Without Premium Support

Even without Premium Support, you have powerful tools at your disposal to investigate and resolve many issues:

  • Leverage CloudWatch Metrics: Set up detailed alarms to catch issues early. Monitor trends in delivery rates, user engagement, and error types.
  • Analyze Message Feedback Data: Use the Message Feedback API to gather real-time data on message delivery and user interactions. This can help you pinpoint where in the delivery process issues are occurring.
  • Review AWS End User Messaging Documentation: Check out our Best Practices guide for proactive measures you can take.
  • Use AWS Forums and Communities: Connect with other AWS users who might have encountered similar issues. Our community forums are a great place to share experiences and solutions.
  • Implement Logging: Detailed application logs can be invaluable for tracking down the root cause of issues. Ensure you’re logging key events in your messaging workflow.
  • Test with Simulator Numbers: Use our simulator numbers to test your messaging flows in a controlled environment, helping you isolate issues.

For particularly complex or persistent problems, Premium Support does offer additional resources and expert assistance. You can learn more about these services here: https://aws.amazon.com/premiumsupport/.

Learning from Each Investigation

Each investigation is an opportunity to improve your messaging strategy. Keep track of what you learn:

  • Which monitoring alerts helped you catch the issue?
  • What investigation steps were most effective?
  • How could you spot similar issues faster next time?

But what if you could prevent some of these issues in the first place? That’s where building a resilient messaging strategy comes in, and that’s exactly what we’ll explore next.

Building a Resilient Messaging Strategy

Earlier, we discussed how retry logic helps handle immediate delivery challenges. Now, let’s expand our reliability toolkit with multi-channel approaches…

Just as passengers don’t have to rely on a single airline to reach a destination, you shouldn’t depend solely on SMS for critical communications. While SMS is fantastic, using just one channel is like depending on a single flight path. When that path becomes unavailable, you need alternatives.

Understanding Single Points of Failure

Here’s something crucial to consider: dedicated SMS phone numbers are provisioned through a single carrier partner in each region and country. This creates a potential single point of failure if that carrier partner experiences problems. Think of it like relying on a single airline for all your routes: if that airline experiences issues, you need alternative routes.

This makes building redundancy into your messaging strategy not just beneficial but essential for business-critical communications. You can create this redundancy through:

  • Using multiple channels like WhatsApp, push notifications, or voice calls
  • Implementing email notifications as a backup, failover, or just as an additional channel that handles specific types of messages
  • In countries that support both dedicated numbers and sender IDs, planning to use either option if your use case allows for it
  • Using phone pools to quickly adjust your sending strategy if specific originators experience problems

Remember, just as major airports maintain multiple airlines and routes to ensure reliable travel options, your messaging strategy needs multiple paths to reach your users reliably.

The Multi-Channel Advantage

Think of your messaging strategy as an international airport hub serving multiple carriers. AWS End User Messaging gives you several channels to work with, including SMS, WhatsApp, push notifications, and voice.

But it’s not just about having multiple channels – it’s about using them strategically. Pick the right channel for the message you are delivering. Not every message belongs on every channel.

Smart Failover: Your Messaging Safety Net

Imagine you’re sending a critical security alert. Here’s how a smart failover strategy might work:

  1. Start with an SMS – it’s quick and widely accessible
  2. If you don’t get confirmation within a few minutes, try WhatsApp
  3. Still no response? Send a push notification if they have your app
  4. For truly critical messages, you might even escalate to a voice call, or send via email and SMS at the same time
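
The steps above can be sketched as a loop over an ordered list of channels. Everything here is a hypothetical hook supplied by your application: `channels` is a list of `(name, send_function)` pairs, and `wait_for_confirmation` returns True if the user confirmed on that channel within your chosen window.

```python
def send_with_failover(message, channels, wait_for_confirmation):
    """Try each channel in order until one is confirmed.

    Returns (confirmed_channel_or_None, list_of_channels_attempted).
    """
    attempted = []
    for name, send in channels:
        send(message)
        attempted.append(name)
        if wait_for_confirmation(name):
            return name, attempted
    # No channel was confirmed; the caller decides how to escalate.
    return None, attempted
```

Returning the list of attempted channels makes it easy to log how often each fallback is used, which feeds back into your monitoring.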

Putting Users in the Driver’s Seat

Just as frequent flyers have preferred airlines and routes, your users likely have preferred ways to receive messages. Some might want WhatsApp messages during the day but SMS for urgent notifications. Others might prefer push notifications while using your app but SMS for critical alerts.

Let your users choose their preferred channels, but be smart about it:

  • Make preference updates simple and straightforward
  • Consider message urgency when applying preferences
  • Remember that preferences might vary by message type
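
A minimal sketch of applying those rules might look like the following. The preference format (a dictionary keyed by message type, with an optional `urgent` entry) and the function name `choose_channel` are assumptions made for illustration.

```python
def choose_channel(preferences, message_type, urgent, default="sms", urgent_channel="sms"):
    """Pick a channel from per-message-type preferences, letting urgency override."""
    if urgent:
        # Urgent messages use the user's urgent preference, if they set one.
        return preferences.get("urgent", urgent_channel)
    return preferences.get(message_type, default)


prefs = {"marketing": "whatsapp", "urgent": "sms"}
print(choose_channel(prefs, "marketing", urgent=False))
print(choose_channel(prefs, "marketing", urgent=True))
```

Keeping the urgency override in one place means a user’s everyday preference never delays a critical alert.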

Testing: Your Safety Check

Just as pilots run through a pre-flight checklist, regularly test your messaging setup. AWS End User Messaging makes this easier with SMS simulator numbers – a powerful tool that lets you test your messaging flows without sending messages over carrier networks.

With simulator numbers, you can:

  • Test your messaging flows in a controlled environment
  • Receive realistic event records
  • Validate your application’s handling of SMS events
  • Do all of this without incurring carrier costs (you still pay for volume based on the country you are sending to) or affecting production traffic

Your testing strategy should include:

  • Using simulator numbers to verify basic message flows
  • Checking that messages render correctly across different channels (especially if you support multiple languages in your messaging)
  • Confirming that your retry logic performs as designed
  • Validating failover mechanisms work as expected
  • Monitoring the performance of each channel

Think of simulator numbers as your messaging test lab – a controlled environment where you can experiment, validate, and fine-tune your implementation before sending to real phone numbers. You can find more details about using simulator numbers in the AWS End User Messaging documentation.

The Goal: Reliable Communication

Remember, the goal isn’t perfect delivery, but reliable communication with your users. By building redundancy into your system and offering choices, you create a robust messaging strategy that handles real-world challenges.

Just as airlines maintain multiple hubs and routes to ensure reliable service, your messaging strategy should provide dependable communication, even when individual channels face challenges.

Bringing It All Together: Your Path to Messaging Success

We’ve covered a lot of ground, so let’s wrap up with the big picture. Successful message delivery isn’t about achieving perfect numbers. It’s about building a robust system that reliably connects you with your users, even when conditions aren’t ideal.

Key Lessons to Take Away

Think of what we’ve learned as your messaging strategy toolkit:

First, we discovered why SMS delivery isn’t as straightforward as pushing a button. Just like a flight plan, your message navigates through various networks and systems before reaching its destination. Understanding these complexities helps set realistic expectations and guides better decision making.

Next, we learned that comprehensive monitoring is like having a reliable air traffic control system. It’s not just about watching flight trackers. It’s about actively monitoring passenger experiences through tools like the Message Feedback API. Remember, knowing if your passenger reached their final destination tells you more than a simple landing confirmation ever could.

We also explored how to identify and thoroughly investigate issues when they arise. Time is crucial. Those first 48 hours are golden when investigating delivery problems, and when you need help from AWS Support, detailed evidence is your best asset.

Finally, we looked at building resilience through multiple channels. Just as airlines maintain various routes to key destinations, your messaging strategy should have backup plans ready when needed.

Taking Action

Ready to improve your messaging strategy? Here are your next steps:

  1. Start with Monitoring
    Review your current monitoring setup. Are you just looking at delivery receipts, or are you tracking actual user interactions? Implement the Message Feedback API to get better visibility into your real delivery success rates.
  2. Set Up Smart Alerts
    Configure CloudWatch alerts that make sense for your business. Remember, different messages might need different thresholds: what’s acceptable for a marketing message is likely not acceptable for a security alert.
  3. Build Your Safety Nets
    Begin implementing multi-channel capabilities. You don’t need to do everything at once. Start with one alternative channel and expand from there. A workshop on basic failover between SMS and email is available.
  4. Test and Learn
    Regularly test your messaging flows and monitor their performance. Use what you learn to continuously refine your strategy.

Need More Help?

We’re here to support your messaging journey. Check out these resources to dive deeper:

  • AWS End User Messaging Documentation
  • Message Feedback API Guide
  • AWS End User Messaging Best Practices
  • AWS Premium Support

The Future of Your Messaging Strategy

The messaging landscape will continue to evolve, but the fundamentals we’ve discussed will serve you well: monitor effectively, investigate thoroughly, and build in redundancy. With AWS End User Messaging, you have a partner who’s continuously working to optimize message delivery and provide the tools you need for success.

Remember, the goal isn’t perfection. It’s building a reliable communication system that your users can count on. Start implementing these practices today, and you’ll be well on your way to more effective user communications.

What’s your next step? Whether it’s implementing the Message Feedback API or designing a multi-channel strategy, the time to start is now. Your users are waiting to hear from you.

Tyler Holmes

Casey Forrest

Patrick Viker