AWS Marketplace

Using Seller Data Feed Delivery Service, Amazon Athena, and Amazon QuickSight to create seller reports

by Varun Krishnakumar, Taylor St. Clair, Vincent Larchet, Ankit Marwaha, Nimish Amlathe, and Alexandru Stroe

AWS Marketplace enables more than 1,500 sellers to sell their software and data products to over 260,000 customers worldwide. As sellers growing your businesses using AWS Marketplace, you require timely, secure, and easy access to business, sales, and financial data. To address this need, AWS Marketplace launched the Seller Data Feed Delivery Service that delivers business insights directly into sellers’ Amazon Simple Storage Service (Amazon S3) buckets. This enables you to meet unique reporting and visualization needs and obtain business insights on a daily basis.

Until now, AWS Marketplace sellers received business reports daily, weekly, or monthly, depending on the data points involved. These downloadable reports were previously available in the AWS Marketplace Management Portal (AMMP). However, users had to do their own analysis of the data to meet business needs, such as sales compensation or quarterly financial reporting. In this blog post, I show you how to access and use the new data feeds, including how to:

  • Set up your private Amazon S3 bucket
  • Receive the new data reports from AWS Marketplace
  • Extract and load report insights into your internal BI tools using Amazon Athena
  • Build simple dashboards for visualization with Amazon QuickSight.

Prerequisites

Before beginning, make sure that you:

  • Are using your registered AWS Marketplace seller account
  • Are familiar with the AWS Management Console
  • Are familiar with basic SQL principles (only required if you want to build customized reports using the data feeds)

Step 1: Setting up your AWS account to receive data feeds

Here is how to set up your AWS account to receive these data feeds.

  1. Set up your secure S3 bucket to receive the data feeds. To simplify this process, AWS Marketplace has provided this AWS CloudFormation template, which sets up the AWS resources required to receive the data feed. Specifically, the template helps to set up an encrypted S3 bucket, AWS Key Management Service key, and an optional Simple Notification Service topic under your AWS account ID. Refer to the following screenshot.

How to set up an encrypted S3 bucket, AWS Key Management Service key, and an optional Simple Notification Service topic under your AWS account I

  1. To configure the data feed feature from the Set up customer data storage page, enter the Amazon Resource Names (ARNs) and select Submit. If you choose to use existing resources, you must create an Identity and Access Management (IAM) role that grants access to AWS Marketplace services.

You are now ready to start receiving data feeds from AWS Marketplace. You may access these reports within the Amazon S3 console, and more information about each data feed can be accessed there. You can also browse the folders within the S3 bucket to download the CSV file containing any dataset you’d like to analyze. Any data feeds that are made available in future will automatically be created in your Data Catalog (and available via Amazon Athena).

Step 2: Setting up your AWS Glue Data Catalog

To build customized queries and reports, you must ingest these data feeds into a SQL client, such as Amazon Athena. To be able to query the data feeds on Amazon Athena, you must first set up the AWS Glue Data Catalog using an AWS Glue crawler.

Follow these step-by-step instructions for setting up your AWS Glue crawler:

  1. Add a crawler. To add a crawler, in the side navigation panel, select Crawler and then Add Crawler. Then follow these steps:
    1. Name your crawler and connect to your data stores. To do this:
    2. Enter a name for your crawler. Under Custom classifiers, select the classifier you set up in step 1 Create an AWS Glue classifier, and then choose Add.
    3. On the Specify crawler source type page, choose Data stores, and then select Next.
    4. On the Add a data store page, select S3. Choose Specified path in my account. Then enter the path to your S3 bucket. The S3 path usually follows this format: s3://<bucket-name>. You can find it by navigating to a file within your S3 folder structure and selecting copy path. The first portion of your copy path is the path to your S3 bucket. For Exclude patterns, enter .deleteme. Select Next.
    5. When asked if you want to Add another data store, choose No.
  2. Set IAM roles and schedule your crawler. To do this:
      1. Either choose an existing IAM role or create new one. If you create a new IAM role, attach the AWS managed policy You must also attach an inline policy that allows read access to the bucket that you set up in Step 1: Setting up your AWS account to receive data feeds. Select Next.
      2. On the scheduling page, choose Daily as the frequency as the schedule for this crawler. Select On the database page, select Create database. Enter a database name and select Create.
      3. On the Configure the crawler’s output page, ensure you have unselected Create a single schema for each S3 path under Grouping behavior for S3 data. Then select Next, review your inputs, and select Finish.
      4. In the AWS Management Console, go to Key Management Service. The Region is us-east N. Virginia. Under the Customer Managed Keys, look for a key with prefix AWSMarketplaceDataFeeds. Copy the ARN for the key. It should be something like arn:aws:kms:us-east-1:XXXXXXXXXXXX:key/1234abcd-12ab-34cd-56ef-1234567890ab.
      5. Go to IAM. On the left, choose the Roles tab. Look for the appropriate role being used for AWS Glue you specified in Step 2.2.1 and choose it.
      6. In the Permissions tab to the right of Attach Policies, choose Add Inline policy.Choose the Json tab. Paste the following policy: {
        "Version": "2012-10-17",
        "Statement": [
        {
        "Sid": "VisualEditor0",
        "Effect": "Allow",
        "Action": [
        "kms:Decrypt",
        "kms:DescribeKey" ],
        "Resource": "<provide the arn noted in step 6"
        }
        ]
        }
      7. Enter a name and create the policy. Confirm this new policy is part of the role.

3. Run your new crawler. To make the database available on Amazon Athena, run your new AWS Glue crawler now.

Step 3: Building a SQL query on Amazon Athena

To start building your own SQL on Amazon Athena, navigate to the Amazon Athena console. For the purposes of this exercise, I am building a query that summarizes the total disbursements by products over the past 12 months.

In the Amazon Athena console left sidebar, navigate to Database column. Then choose your newly created Data Catalog from Step 2. This query can be built by joining the billingeventfeed, accountfeed, and addressfeed tables. Enter the following SQL query in the query editor:

/* SQL to extract total disbursed amount by customer*/
WITH account_data AS
(
  SELECT *,
         COALESCE(LEAD(valid_from) OVER (PARTITION BY account_id ORDER BY valid_from ASC),'2020-05-01T00:00:00Z') AS valid_to
  FROM accountfeed
)
SELECT bi.invoice_date,
       bi.from_account_id,
       ac.mailing_address_id,
       ad.address_id,
       ac.tax_legal_name,
       ad.company_name,
       SUM(bi.amount) AS disbursed_amt
FROM BILLINGEVENTFEED bi
  LEFT JOIN account_data ac
         ON bi.from_account_id = ac.account_id
        AND bi.invoice_date BETWEEN ac.valid_from AND ac.valid_to
  LEFT JOIN addressfeed ad ON ac.mailing_address_id = ad.address_id
WHERE
-- Only look for billing events that have been disbursed
bi.action = 'DISBURSED'
-- Only look at line items that pertain to seller accounts
AND   bi.transaction_type IN ('SELLER_REV_SHARE','SELLER_REV_SHARE_REFUND')
-- Only look for transactions that have occurred this year
AND   bi.invoice_date BETWEEN '2020-01-01' AND '2020-04-30'
GROUP BY bi.invoice_date,
         bi.from_account_id,
         -- The fields below have a 1-1 mapping with from_account_id, but added here for readability
         ac.mailing_address_id,
         ad.address_id,
         ac.tax_legal_name,
         ad.company_name;

Run the query by selecting Ctrl + Enter. The results of your query show up in the Results box below the query tab, and your query tab now shows a green check mark. Refer to the following screenshot:

Successful results of SQL query built in Amazon Athena

Step 4: Building dashboards using Amazon QuickSight

Now you have the data ready in S3 for analysis. You can now start building visualizations, performing one-time analyses, and getting business insights from your AWS Marketplace data.

  1. Navigate to the QuickSight homepage. To do this, on the AWS Management Console home page, select Services from the global navigation bar. Enter QuickSight into the search bar. Select QuickSight from the search results.
    1. If you are new to Amazon QuickSight, you must set up your account. Follow the instructions on the portal to sign up for the QuickSight service. During the sign-up process for your new QuickSight account, provide access the QuickSight service with access to your private S3 bucket. To do so, on the Create Your QuickSight Account page, select the S3 Access option.
    2. To provide access to the specific S3 bucket that has your data, select Choose S3 buckets. You can now explore your seller Data Catalog using QuickSight.
  2. On upper left of the QuickSight home page, select New Analysis and then S3. Enter a name for your Data Catalog. Then add a manifest file. To read more about manifest files and how to create one, review the documentation on Amazon S3 manifest files. If you want to add data from multiple sources and perform join operations, ensure that the manifest file contains Uniform Resource Identifiers (URIs) needed for it. The following is a template manifest file that you can use to get started quickly.
{
    "fileLocations": [
        {
            "URIs": [
                "s3:// bucket-name /query-output-folder/datafeedoutput_2020-01-15.csv"
            ]
        }
    ],
    "globalUploadSettings": {
        "format": "CSV"
    }
}

To load your Data Catalog, select Connect.

You can now create custom visualizations on your AWS Marketplace seller data, identify patterns, and get actionable insights. For more details on how to build customized analyses, review the documentation.

Conclusion

With Seller Data Feed Delivery Service, you now have access to critical financial, tax, and buyer information within one day. You can seamlessly ingest the data into AWS services, including Amazon Athena and Amazon QuickSight, or your internal reporting tools. In this blog post, we showed you how to start receiving AWS Marketplace business insights data feeds directly to your private S3 bucket. We also showed how to configure an AWS Glue crawler to parse the data feeds and load onto the Amazon Athena SQL client. Finally, we showed how to use Amazon QuickSight to build analyses and visualizations. Now your organization can quickly and easily access AWS Marketplace information needed for activities such as sales compensation, financial planning, and tax audits.

Next steps

The following resources provide more information about the tools described in this post:

About the authors

Varun Krishnakumar is a Principal Product Manager on the AWS Marketplace team based in Seattle, WA. Outside of work, he loves cooking new foods, golfing, and hiking.

 

 

 

 

 

taylor st clairTaylor St Clair is a Senior Product Marketing Manager on the AWS Marketplace team based in Seattle, WA. Outside of work, she loves testing out new recipes, learning Spanish, and exploring the world with her goldendoodle, Fusilli.

 

 

 

 

Vincent Larchet is a Senior Software Development Engineer on the AWS Marketplace team based in Vancouver, BC. Outside of work, he is a passionate wood worker and DIYer.

 

 

 

 

 

Ankit Marwaha is a Software Development Manager on the AWS Marketplace team based in Seattle, WA. Outside of work, you will find him walking around local neighborhood parks with his two-year-old.

 

 

 

 

Nimish Amlathe is a Senior Product Manager on the AWS Marketplace team based in Vancouver, BC. Outside of work, you are likely to see him at a local comedy club.

 

 

 

 

Alexandru StoeAlexandru Stroe is a Software Development Engineer on the AWS Marketplace team based in Vancouver, BC. Outside of work, he enjoys spending time with his family, discovering new places to visit, and reading.