AWS Startups Blog

Why Avatars are Usually Awful, and How Snappr Fixed It

Guest post by Matt Schiller, Co-founder & CEO, Snappr


Avatars and profile pictures are everywhere on the web. Aside from being pretty, they serve two important functions: first, they're easier than written names for the human eye to identify; second, they take up less real estate on cluttered screens. Getting users to upload an avatar, however, is tricky; even advanced platforms like LinkedIn struggle with an abundance of picture-less accounts.

Getting users to upload a good profile picture, where the face isn't too obscured, is even more difficult. Platform designers face the additional challenge of needing to consider what dozens, if not hundreds or thousands, of profiles will look like when laid out side by side. For something as individualistic as Facebook, mismatched profile pictures are fine. But for commercial uses, such as comparing service providers side by side, it's not ideal.

Given these considerations, the key to creating a good avatar comes down to the cropping. But if you ask your users to do their own cropping, you are unlikely to get much of a response. The next best thing is to build a self-service cropping tool into your GUI, but that adds time to account setup and can be annoying for users. As a photography platform, we wanted to solve this issue for our users once and for all. Our solution was to crop the pictures for them, but not manually. Rather, we used robots!

[Image: Snappr profile avatar comparison]

How We Did It

Let’s start by breaking our process down into steps, before we explain how we tied each step together to solve the problem.

  1. Upload and store the master image
  2. Find the face
  3. Crop the master image
  4. Store an optimized and pre-cropped image

We are going to assume that you're familiar with getting files in and out of Amazon S3. In the first step, we upload the avatar to an S3 bucket called user-media-avatar that we created to store the source images that are ready to be processed. User 27 has provided an image they want to use as their profile picture. It might be an attractive photo, but if it gets reduced in size enough it will be impossible to see who is in the picture. After it's uploaded, it will be stored as "27-master."
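If you need a refresher, a minimal upload might look something like the sketch below. It assumes the image has already reached your server as a buffer (fileBuffer here is a stand-in for whatever your upload handler produces), and it uses the same region/credentials shorthand as the snippets later in this post:

const AWS = require("aws-sdk");
const S3 = new AWS.S3({ region, accessKeyId, secretAccessKey });

// fileBuffer is a placeholder for the bytes from your upload handler
S3.upload({
    Bucket: "user-media-avatar",
    Key: "27-master",
    Body: fileBuffer,
    ContentType: "image/jpeg"
}, (err, data) => {
    if(err) {
        throw err;
    }
});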

Enter Amazon Rekognition, stage right (for more information about Rekognition, see the blurb at the end of this post). Thanks to Rekognition, most of the hard work has already been done for us. Now we make a call to the DetectFaces API to return anything it can find:

// The aws-sdk package provides both the S3 and Rekognition clients
const AWS = require("aws-sdk");
const lwip = require("lwip");

const rekognition = new AWS.Rekognition({ region, accessKeyId, secretAccessKey });

// Point Rekognition at the master image we just stored in S3
const params = {
    Image: {
        S3Object: {
            Bucket: "user-media-avatar",
            Name: "27-master"
        }
    }
};

rekognition.detectFaces(params, (err, data) => {
    /* callback response */
});

Hopefully, the user uploaded a picture with a face! For brevity, we won't cover the alternative case here, but suffice it to say that it needs to be accommodated. If everything went well, the data parameter will contain an object that tells us everything we need to know about the face:

{
    "FaceDetails": [{
        "BoundingBox": {
            "Height": 0.18000000715255737,
            "Left": 0.5555555820465088,
            "Top": 0.33666667342185974,
            "Width": 0.23999999463558197
        },
        "Confidence": 100,
        "Landmarks": [],
        "Pose": {},
        "Quality": {}
    }]
}

The FaceDetails property will contain an array of any faces that were detected. For our example, we'll assume that there's only one. But if you're building something for production, you should take the time to handle multiple faces and go with one of the following options: (a) make a best guess; (b) throw an error so that the user has to upload a more suitable image; or (c) implement a GUI that lets the user interactively select the correct face. We suggest (b).
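As a rough sketch of option (b), a guard like the following could sit alongside the other checks in the callback below (the error message is our own invention):

    if(data.FaceDetails.length > 1) {
        // More than one face: ask the user for a clearer, single-face photo
        throw new Error('Multiple faces detected. Please upload a photo with a single face.');
    }

For now, though, let's just grab the first face in the array: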

rekognition.detectFaces(params, (err, data) => {
    if(err) {
        throw err;
    }
    if(!data) {
        throw new Error('Nothing returned from Rekognition');
    }
    if(!data.FaceDetails || data.FaceDetails.length === 0) {
        throw new Error('No faces returned in Rekognition response.');
    }

    const [face] = data.FaceDetails;

Now that we have the face, it's time to get the bounding box. The face.BoundingBox property will have a set of four values that specify the left, width, top, and height of the face. These values are normalized to the range zero to one, with zero being far left/top and one being far right/bottom. So before we can use them, we need to convert them into absolute (pixel) values. For example, in a 1,000-pixel-wide image, the Left value of roughly 0.556 in the response above corresponds to an x-coordinate of about 556 pixels.

This takes us to the next step. We have the position of the face, and now we need to crop out the unwanted parts of the image. We'll write our own cropping routine in Node.js to do what we want. Traditionally, you would use a third-party library like ImageMagick to do the heavy lifting. We prefer a relatively new library called lwip because it doesn't have any external runtime dependencies and can be installed quickly via npm. The bonus of using lwip is that it can open an image straight from an in-memory buffer, so we can pull the data directly from S3 into memory and do the image processing without needing to store it to disk first!

Start by using the AWS SDK to fetch the object from S3 into memory, and then use the returned buffer to create an lwip image object:

    const S3 = new AWS.S3({ region, accessKeyId, secretAccessKey });
    const Bucket = "user-media-avatar";
    const Key = "27-master";

    // data.Body is a Buffer containing the raw image bytes
    S3.getObject({ Bucket, Key }, (err, data) => {
        if(err) {
            throw err;
        }

        // lwip takes the encoding format ("jpg") rather than a MIME type
        lwip.open(data.Body, "jpg", (err, image) => {
            if(err) {
                throw err;
            }

Now that we have the image data in memory, we can use it to find out the width and height, so we can convert the normalized values from Rekognition into pixel values. Because lwip's crop expects an absolute bounding box, we'll convert the width and height values into right and bottom respectively while we're denormalizing:

        const [imageWidth, imageHeight] = [image.width(), image.height()];

        const absoluteBox = {
            left:   imageWidth * face.BoundingBox.Left,
            right:  imageWidth * (face.BoundingBox.Left + face.BoundingBox.Width),
            top:    imageHeight * face.BoundingBox.Top,
            bottom: imageHeight * (face.BoundingBox.Top + face.BoundingBox.Height)
        };

Now that the box is calculated, we are almost ready to do the crop. The Rekognition bounding box is designed for analysis rather than as the basis of a user-facing avatar (it crops too tightly on the face for that purpose). So to keep things simple, you can just scale the box to twice its size. That will give you a nice crop to just below the shoulders:

        const halfWidth  = (imageWidth * face.BoundingBox.Width) / 2;
        const halfHeight = (imageHeight * face.BoundingBox.Height) / 2;
        absoluteBox.left   -= halfWidth;
        absoluteBox.right  += halfWidth;
        absoluteBox.top    -= halfHeight;
        absoluteBox.bottom += halfHeight;
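
One thing to keep in mind before cropping: if the scaled-up box extends outside the image, lwip will fill the dead space with black. A minimal guard against that, added here as our own sketch, is to clamp the box to the image bounds (rounding to whole pixels at the same time, to be safe):

        // Clamp the box to the image bounds and round to whole pixels
        absoluteBox.left   = Math.round(Math.max(0, absoluteBox.left));
        absoluteBox.top    = Math.round(Math.max(0, absoluteBox.top));
        absoluteBox.right  = Math.round(Math.min(imageWidth - 1, absoluteBox.right));
        absoluteBox.bottom = Math.round(Math.min(imageHeight - 1, absoluteBox.bottom));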

Now that the box is scaled up and clamped, let's bring in lwip to do the actual work. Using the batch feature, we can avoid some deep nesting of callbacks. Time to crop! There's a lot going on in this chain, so let's break it down method by method:

        const avatarHeight = 600;
        const avatarWidth = 600;

        image.batch()
            .crop(  absoluteBox.left,
                    absoluteBox.top,
                    absoluteBox.right,
                    absoluteBox.bottom )
            .sharpen(0.25)
            .cover( avatarWidth,
                    avatarHeight,
                    'nearest-neighbor')
            .toBuffer("jpg", { quality: 80 }, storeImage);

  - crop – does what it says on the box: crops the image to the dimensions we specified.

  - sharpen – when the image is reduced in size, it looks nicer if we artificially sharpen it a little. Tweak this value to taste.

  - cover – the fanciest call in the chain. It resizes and crops the image to the supplied dimensions so that our optimized image is always the same size and aspect ratio.

  - toBuffer – converts the lwip image object to a buffer object that can be passed to S3.

Once we have the buffer, we just send it to S3, storing the optimized image ready to be served to the application for display to users:

        function storeImage(err, buffer) {
            if(err) {
                throw err;
            }
            const OptimisedKey = "27-small";
            // Store the optimized image under its own key, next to the master;
            // ContentType lets browsers render the image directly
            S3.upload({ Bucket, Key: OptimisedKey, Body: buffer, ContentType: "image/jpeg" }, (err, data) => {
                if(err) {
                    throw err;
                }
            });
        }

Conclusion

If you’re running any kind of platform that involves user profile images, why not give a cropping method like this a try?

Cropping is important, but there are so many other things that go into a good profile picture. In fact, Snappr has built a tool that analyzes all the critical elements for you! It’s called the Snappr Photo Analyzer. You can try it for free right here. We are now in the process of building a tool that analyzes dating photos, so stay tuned for more on that.

Never underestimate the importance of photography in creating amazing software that people love. The Airbnb story is a perfect example: the company didn't reach breakout traction until it started a program of taking free professional photos of its hosts' homes. If you need great pro photography for your platform or your users, Snappr operates in both the US and Australia, with amazing, pre-vetted photographers available to book with as little as 30 minutes' notice.


About Snappr

Snappr allows you to book a professional photographer for any occasion as easily as you might book an Uber, at affordable fixed prices. They also run the Snappr Photo Analyzer, a free AI tool that assesses your professional profile picture and tells you how you can improve it. Snappr believes that amazing photography should be within reach for everyone.

About Amazon Rekognition
Amazon Rekognition makes it easy to add image analysis to apps. Rekognition allows you to detect objects, scenes, and faces in images. You can even search and compare faces. The Rekognition API lets you easily build powerful visual search and discovery into your applications. For more information, see Introducing Amazon Rekognition.