The most Competent Web Crawling Service I've used

  • By Justin W.
  • on 02/03/2023

What do you like best about the product?
Overall, Diffbot's tools are simple to use and understand outside of more complex use cases. We use several of their features to deliver content insights to our clients. I would recommend Diffbot to any person or organization that needs to pull large amounts of data from arbitrary web sources.

The first tool we use is Crawlbot, which we appreciate for being configurable and extremely capable. In most of our use cases, we just need to point it at a URL and have it repeat the crawl every so often to discover new content. After crawling, the data is available via an easy-to-parse JSON file.

We also use the Diffbot Knowledge Graph API. DQL, its powerful query language, allows us to query a massive amount of data to find articles and entities. DQL is simple to use, and the GUI allows easy testing and iteration.

Diffbot's customer service is also exceptional. Our contact has been very attentive in helping us learn how to properly use Diffbot's services to meet our needs. He has organized one-off Zoom meetings to walk us through the appropriate method for creating DQL queries and has expedited bug fixes required for our use cases.

What do you dislike about the product?
Diffbot is a powerful tool, and with its numerous capabilities, it can be difficult for those unfamiliar with it to understand how to use it properly. Fortunately, Diffbot provides excellent customer service, which can help guide you through the process of determining the best practices for your use case.

What problems is the product solving and how is that benefiting you?
Diffbot offloads the complex and difficult process of web crawling, scraping and analysis/parsing. Rather than writing our own in-house web crawler, we can spend our time elsewhere building features for our clients.

Diffbot's Knowledge Graph allows us to find relationships between articles and entities across the web in near real-time. This feature has been invaluable in providing insightful information to our clients.