Amazon AWS Official Blog
Real-time data infrastructure for next-gen web3 AI research platform
RisingWave provides a streaming data warehouse to power AI-based search products for Kaito, a leading web3 information platform indexing content not accessible via traditional search engines.
Overview
Crypto traders and investors need solid, real-time data to identify trends, possible fraud, and nuanced risk. However, much of the data that feeds that need sits in resources that traditional search engines simply do not index, such as blockchains and private information sources. Kaito, a fintech company, has taken on this challenge to provide the crypto industry with an authoritative crypto data access engine with powerful AI capabilities to guide informed decision making. To do this, Kaito needs real-time stream processing rather than traditional batch processing, delivered in a way that minimizes or eliminates the usual challenges of stream processing. Amazon Web Services (AWS) Partner RisingWave has just such a solution and is playing a key role in Kaito’s emergence as the source for solid, real-time crypto data.
Unique value proposition meets batch processing inefficiency
Kaito is a fintech company with the mission to create the industry’s most powerful search engine based on a Large Language Model (LLM) to meet the cryptocurrency community’s data access needs. In particular, it indexes data scattered across various private information sources and blockchains, which is invisible to traditional search engines. The key is combining the real-time data of the company’s proprietary financial search engine with advanced LLM capabilities to provide a revolutionary information access experience for 300 million users. The value proposition is to deliver real-time data that informs the social intelligence of crypto traders and investors, so they can identify short-term trends as indicators of FOMO (fear of missing out) opportunities or risks; follow medium- to long-term trends to assess crypto project maturity and community; detect fraud based on bot patterns; and monitor the engagement activities of Key Opinion Leaders (KOLs).
The operative phrase is real-time. However, data from blockchains, private information sources, and their surrounding ecosystems is often messy. As data is extracted, it must undergo cleaning, transformation, enrichment, aggregation, and indexing before it reaches users or is fed into Kaito’s in-house AI model. The value of the information, and with it Kaito’s value proposition, is proportional to the speed of this end-to-end process. On one hand, conventional batch processing requires waiting for data to accumulate in batches before processing, which results in unacceptably high latency. On the other hand, stream processing handles an unending stream of data and provides immediate, continuously updated results that feed the next second’s insights, with latencies in the sub-second range. Even before partnering with RisingWave, Kaito knew it needed a next-level stream processing solution that could deliver these benefits affordably, with consistency and fault tolerance.
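The contrast between the two processing models can be illustrated with a minimal Python sketch. It is not Kaito’s or RisingWave’s implementation; the event shape (a token symbol plus a mention count) and the function names are assumptions made only for illustration. The batch function returns one result after it has seen all the data, while the streaming function updates its state per event and emits a fresh result each time.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (token_symbol, mention_count). Not Kaito's schema.
Event = Tuple[str, int]


def batch_totals(events: Iterable[Event]) -> Dict[str, int]:
    """Batch style: consume the whole batch, then return totals once."""
    totals: Dict[str, int] = defaultdict(int)
    for token, mentions in events:
        totals[token] += mentions
    return dict(totals)


def streaming_totals(events: Iterable[Event]) -> Iterator[Dict[str, int]]:
    """Streaming style: update state per event and emit a result each time."""
    totals: Dict[str, int] = defaultdict(int)
    for token, mentions in events:
        totals[token] += mentions
        # A snapshot of the current totals is available right after each event.
        yield dict(totals)


if __name__ == "__main__":
    sample = [("BTC", 3), ("ETH", 1), ("BTC", 2)]
    print("batch:", batch_totals(sample))
    for snapshot in streaming_totals(sample):
        print("stream:", snapshot)
```

Once every event has arrived, both approaches reach the same totals. The difference is that the streaming version exposes an up-to-date answer after every event instead of only at the end of the batch.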
RisingWave: Realizing the full benefits of stream processing
| Data freshness: | Second-level |
| Number of data pipelines: | Several hundred |
| Number of supported cryptos: | Several thousand |
Initially, Kaito sought a stream processing system to read and transform data from upstream sources and then forward the results to a downstream system responsible for ‘materializing’ the data (marrying multiple sources into a single source) and handling user queries. However, introducing a second system presented several challenges. Any change to queries in one system must be automatically mirrored in the other to meet evolving business needs. Data transfer between systems can become a bottleneck, adding serialization/deserialization and network costs. Syncing upstream to downstream systems requires engineers to reread the data, which causes latency and increases costs. And, importantly, when multiple streaming jobs sync data into different tables in the downstream system, maintaining consistency between tables is difficult.
RisingWave distinguishes itself, and became the stream processing solution of choice for Kaito, by combining stream processing and query serving within a single system. As a result, users in RisingWave can define two types of streaming jobs. ‘CREATE SINK’ continuously processes input data and updates the downstream system. ‘CREATE MATERIALIZED VIEW’ continuously reflects updates in a materialized view so that all views accurately reflect changes brought about by the same volume of upstream data. Additionally, RisingWave is PostgreSQL-compatible, which shortens the learning curve so that Kaito can speed up development and onboard new users quickly. RisingWave also offers Java, Python, and Rust User-Defined Functions (UDFs) to handle the 5% of cases where SQL may fall short. Furthermore, third-party tools and client libraries can connect to RisingWave as if it were PostgreSQL, which is crucial given Kaito’s reliance on a variety of managed services and open-source tools to develop products.
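As a rough illustration of the two job types, the following Python sketch builds one ‘CREATE MATERIALIZED VIEW’ statement and one ‘CREATE SINK’ statement and, optionally, sends them over a standard PostgreSQL driver. It is a sketch under stated assumptions, not Kaito’s pipeline: the table, view, sink, topic, and broker names are hypothetical, the Kafka sink options should be checked against the RisingWave documentation for the version in use, and psycopg2 is a third-party package assumed to be installed only if the statements are actually executed.

```python
from typing import List

# All table, view, sink, topic, and broker names below are hypothetical.


def materialized_view_sql(view: str = "mentions_per_token",
                          source: str = "social_posts") -> str:
    """SQL for a continuously maintained per-token mention count."""
    return (
        f"CREATE MATERIALIZED VIEW {view} AS\n"
        f"SELECT token, COUNT(*) AS mentions\n"
        f"FROM {source}\n"
        f"GROUP BY token;"
    )


def sink_sql(sink: str = "mentions_sink",
             view: str = "mentions_per_token",
             broker: str = "broker:9092",
             topic: str = "mentions") -> str:
    """SQL that forwards changes in the view to a downstream Kafka topic.

    The option names follow the Kafka sink syntax as the editor understands
    it; treat them as assumptions and verify them before use.
    """
    return (
        f"CREATE SINK {sink} FROM {view}\n"
        f"WITH (\n"
        f"    connector = 'kafka',\n"
        f"    properties.bootstrap.server = '{broker}',\n"
        f"    topic = '{topic}',\n"
        f"    primary_key = 'token'\n"
        f") FORMAT UPSERT ENCODE JSON;"
    )


def apply_statements(dsn: str, statements: List[str]) -> None:
    """Run the statements through a PostgreSQL driver (needs a live server)."""
    import psycopg2  # third-party; imported lazily so the sketch loads without it

    conn = psycopg2.connect(dsn)
    conn.autocommit = True
    try:
        with conn.cursor() as cur:
            for statement in statements:
                cur.execute(statement)
    finally:
        conn.close()


if __name__ == "__main__":
    for statement in (materialized_view_sql(), sink_sql()):
        print(statement, end="\n\n")
```

Because the connection goes through an ordinary PostgreSQL client library, the same pattern would apply to other PostgreSQL-compatible tools. A call such as `apply_statements("host=localhost port=4566 dbname=dev user=root", [...])` assumes a local RisingWave instance with default-style connection settings.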
In short, RisingWave offers Kaito a stream processing solution that is critical to its unique value proposition. The force multiplier for RisingWave is its partnership with Amazon Web Services (AWS), which aligns with RisingWave’s cloud-native architecture. AWS provides robust services for storage, compute, and networking, enabling RisingWave to deliver low-latency, high-performance real-time analytics at scale. The partnership also allows RisingWave to integrate seamlessly with the AWS ecosystem, giving customers a streamlined deployment experience and access to a wide range of cloud-native tools.
Ultimately, this partnership helps RisingWave scale elastically to meet real-time processing needs for organizations of all sizes, meet high standards of security and compliance via the AWS enterprise-grade security framework, and deliver a seamless streaming data experience through native connections to AWS resources such as Amazon Managed Streaming for Apache Kafka and Amazon Redshift.
Delivering continuous, real-time stream processing for Kaito
In all, Kaito uses RisingWave as the streaming data warehouse that powers its AI-based search platform for crypto traders and investors, as well as its user-facing analytical dashboards, internal operational workloads, and real-time alerts, among other applications. By consolidating its real-time data infrastructure for these applications into a single system, RisingWave, Kaito significantly reduces processing costs and improves data freshness; provides analytics and operational insights to clients and the in-house operations team with near-instantaneous latency; and drastically lowers development costs and shortens the learning curve, because engineers can focus exclusively on implementing new features using Postgres-compatible SQL and existing ecosystems.
About AWS Partner RisingWave
RisingWave, based in San Francisco, delivers fresh, low-latency insights from real-time streams, database CDC, and time-series data. With support from the AWS Global Startup Program, it brings streaming and batch together, letting users join and analyze both live and historical data, and persist results in managed Apache Iceberg tables. The mission of the company is to empower every organization to build real-time, event-driven applications with simplicity, scalability, and cost-effectiveness. This mission reflects a commitment to providing a cloud-native, SQL-based streaming database that integrates seamlessly into modern data architectures. By leveraging technologies like Apache Iceberg and offering features such as compute isolation and cross-database querying, RisingWave aims to support enterprises in developing and deploying real-time applications efficiently and at scale.
To learn more, visit risingwave.com
“Running RisingWave on AWS gives us the elasticity and reliability we need to power real-time workloads at scale. AWS cloud-native services allow RisingWave to deliver sub-second insights without operational overhead.”
Yingjun Wu, Founder and CEO, RisingWave Labs
About Kaito
Kaito, based in Seattle, Washington, is a next-generation web3 information platform that indexes a wide range of web3 content that is not easily accessible through traditional search engines. This includes sources like social media, governance forums, research, news, podcasts, conference transcripts, and more. By leveraging advanced AI technologies, Kaito redefines how users discover and interact with blockchain-related information. Our enterprise platform empowers users across the web3 ecosystem, offering revolutionary information access and data-driven insights. www.kaito.ai