RapidsDB ASEAN

Company profile

RapidsDB Pte Ltd - ASEAN HQ

Rapids Data was founded in 2014 and is dual-headquartered in Beijing, China and Silicon Valley, USA.

The company is an industry leader in the research and development of big data real-time processing and analysis, providing advanced and innovative big data technology, products, services and total solutions to the global market.

The RapidsDB Unified Analytics Platform is an integrated real-time, AI-based, big data analytics platform that can be deployed on premises or in the Cloud.

RapidsDB offers a fully parallel, distributed, in-memory federated query system designed to support complex analytical SQL queries across multiple data stores, producing integrated results at ultra-fast speed.

A pluggable connector-based framework, streaming data processing engine, and AI-enabled computing engine help build future-oriented data pipelines with high performance and cost effectiveness.

Trust signals

Market recognition, patents, clients, and certifications

USA Registered IP Patent

Intellectual property coverage presented through USPTO filing evidence and patent registry details.

Reference Clients

Reference client logos and customer proof from the original company profile.

Gartner Magic Quadrant

Market positioning reference included in the source presentation.

CMM Certification & INCITS

Certification and standards material carried over from the PPT.

Rapids Data Platform

Architecture for real-time big data analytics

The Rapids Data Platform focuses on big data real-time processing and provides real-time analytics solutions through an integrated product system.

Solutions

Big Data analysis solutions (AI-In-A-Box)

Big Data storage solutions (platform/all-in-one-Box)

Product system

Rapids ParallelAI Lib, in-database distributed R engine

StreamDB
In-memory stream data analysis engine

RapidsDB
Full in-memory data processing engine

Rapids Manager
Management control platform

Rapids Federation
Connector-based integration of disparate data sources

Maintenance

Annual maintenance, dynamic data maintenance, new feature upgrades

RapidsDB distributed memory real-time analysis engine diagram

RapidsDB engine

Distributed, in-memory, real-time big data analysis

RapidsDB accesses and processes data directly in memory. Query requests are broken into smaller tasks, distributed intelligently, and executed in parallel across nodes for real-time processing and analysis.

Distributed, MPP, shared-nothing memory database
Unified ANSI SQL query support for multiple data sources
Adaptive query pushdown and dynamic query optimization
Multiple table joins across nodes
Distributed in-memory data storage
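The scatter-and-gather pattern described above (split a query into tasks, run them in parallel on each node's in-memory data, then merge) can be sketched in a few lines. This is an illustration only, not RapidsDB code: the partition data and the names NODE_PARTITIONS, local_sum_by_region, merge_partials, and run_query are hypothetical, and threads on one machine stand in for cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for rows held in one node's memory.
NODE_PARTITIONS = [
    [("SG", 120.0), ("MY", 80.0)],
    [("SG", 30.0), ("TH", 55.0)],
    [("MY", 20.0), ("TH", 45.0), ("SG", 50.0)],
]


def local_sum_by_region(rows):
    """Task run per node: partial SUM(amount) GROUP BY region."""
    partial = {}
    for region, amount in rows:
        partial[region] = partial.get(region, 0.0) + amount
    return partial


def merge_partials(partials):
    """Coordinator step: combine the per-node partial results."""
    total = {}
    for partial in partials:
        for region, amount in partial.items():
            total[region] = total.get(region, 0.0) + amount
    return total


def run_query(partitions):
    # One task per partition, executed concurrently (threads simulate nodes).
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(local_sum_by_region, partitions))
    return merge_partials(partials)


result = run_query(NODE_PARTITIONS)
print(result)
```

Because each task touches only its own partition, no data is shared between workers during the scan, which is the property a shared-nothing design relies on.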

Performance comparison

TPC-H 100 GB data processing time

Testing environment: 5-node server cluster. Each server has 2 CPU cores and 256 GB memory.

RapidsDB 186.23s

Greenplum 3376.09s

Spark on YARN 1528.67s

Spark Standalone 1543.63s

Hive on Tez 8184.33s

Hive on Tez (partition) 4378.26s

8x to 44x faster

RapidsDB test result compared with mainstream database providers in the source profile.
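The speedup range can be checked directly from the published timings. The short script below only divides each system's elapsed time by the RapidsDB time; the numbers are copied from the comparison above, and the variable names are illustrative.

```python
# Elapsed times in seconds, copied from the comparison above.
TIMES = {
    "RapidsDB": 186.23,
    "Greenplum": 3376.09,
    "Spark on YARN": 1528.67,
    "Spark Standalone": 1543.63,
    "Hive on Tez": 8184.33,
    "Hive on Tez (partition)": 4378.26,
}

baseline = TIMES["RapidsDB"]

# Speedup = other system's time divided by the RapidsDB time.
speedups = {
    name: seconds / baseline
    for name, seconds in TIMES.items()
    if name != "RapidsDB"
}

for name, ratio in sorted(speedups.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.1f}x")
```

The smallest ratio comes from Spark on YARN and the largest from Hive on Tez, which is where the rounded 8x and 44x figures come from.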

Rapids Federation

Integrated query from all kinds of data sources

The embedded federated connector system enables users to access various data sources through industry-standard SQL and JDBC interfaces.

Dispenses with the traditional Extract-Transform-Load process and unnecessary data migration
Enables seamless integrated queries across all kinds of data sources
Separates hot, warm, and cold data analytics and management
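To make the idea of one SQL statement spanning several data stores concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. It is not the Rapids Federation connector API: two separate in-memory SQLite databases play the role of two data sources, and the table and column names are hypothetical.

```python
import sqlite3

# Two separate in-memory SQLite databases stand in for two data stores.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS archive")

# "Hot" recent orders live in one store, customer master data in the other.
conn.execute("CREATE TABLE main.orders_hot (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE archive.customers (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO main.orders_hot VALUES (?, ?)",
    [(1, 40.0), (2, 15.5), (1, 9.5)],
)
conn.executemany(
    "INSERT INTO archive.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

# One SQL statement joins across both stores; no data is copied beforehand.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM main.orders_hot AS o
    JOIN archive.customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
print(rows)
```

The point of the sketch is the query shape: the join is expressed once in standard SQL, and the engine resolves which store each table lives in.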

Big Data current environment diagram

Big Data with RapidsDB Federation diagram

Product system

Streaming, AI, Hadoop, and deployment-ready analytics

Rapids StreamDB

An ISO standard-based, in-memory streaming data processing engine that continuously analyzes streaming data within milliseconds.

Millisecond-level real-time data processing and computing
Fully compatible with ANSI SQL and window functions
Incremental data refresh
Multiple data source integration
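Windowed aggregation with incremental refresh, as listed above, can be illustrated with a trailing time window whose running sum is updated as events arrive and expire instead of being recomputed. This is a generic sketch, not StreamDB syntax; sliding_window_avg and the sample readings are hypothetical.

```python
from collections import deque


def sliding_window_avg(events, window_seconds):
    """Yield (timestamp, average) over the trailing window for each event.

    events: iterable of (timestamp_seconds, value) in non-decreasing time order.
    window_seconds: positive window length; an event stays in the window
    while its timestamp is greater than (current timestamp - window_seconds).
    """
    window = deque()
    running_sum = 0.0
    for ts, value in events:
        # Incremental refresh: add the new event to the running state.
        window.append((ts, value))
        running_sum += value
        # Evict events that have fallen out of the window.
        while window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            running_sum -= old_value
        yield ts, running_sum / len(window)


readings = [(0, 10.0), (1, 20.0), (2, 30.0), (5, 40.0)]
averages = list(sliding_window_avg(readings, window_seconds=3))
print(averages)
```

Each event is added once and removed once, so the work per event stays small regardless of how long the stream runs.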

Rapids ParallelAI

AI-enabled analytics with an in-memory, distributed, parallel implementation of the R language integrated within a RapidsDB cluster.

AI-enabled analytics directly against data managed by RapidsDB
Distributed R computing beyond single-machine restrictions
In-memory R module computing without complex upload or cleansing processes
20 popular algorithms in 6 categories for complex modeling
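One common way to run a model fit across partitions without gathering all rows on one machine is to have each partition compute small summary statistics and combine them centrally. The sketch below does this for simple linear regression. It is written in Python for illustration and is not the ParallelAI or R interface; partial_stats, fit_from_partials, and the sample partitions are hypothetical.

```python
def partial_stats(rows):
    """Per-partition sufficient statistics for simple linear regression."""
    n = len(rows)
    sx = sum(x for x, _ in rows)
    sy = sum(y for _, y in rows)
    sxx = sum(x * x for x, _ in rows)
    sxy = sum(x * y for x, y in rows)
    return n, sx, sy, sxx, sxy


def fit_from_partials(partials):
    """Combine partition statistics and solve for slope and intercept."""
    # Column-wise sum of the per-partition tuples.
    n, sx, sy, sxx, sxy = (sum(column) for column in zip(*partials))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical partitions; all points lie on y = 2x + 1.
partitions = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]

slope, intercept = fit_from_partials([partial_stats(p) for p in partitions])
print(slope, intercept)
```

Only five numbers per partition travel to the coordinator, so the fit is not limited by the memory of a single machine.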

Rapids Hadoop

Enterprise-grade SQL-on-Hadoop based on open source Apache Hadoop technology, helping enterprises build data lakes through strictly size-controlled installation packages.

Batch processing and interactive SQL queries
Real-time analysis of heterogeneous big data
SQL-on-Hadoop analytical application tools
IaaS, YARN, and Mesos cloud computing configurations
Open source ETL and BI application integration

AI-In-A-Box

Appliance-ready AI modelling for business units

Quick start to AI modelling within a week
Low-cost monthly subscription model as OPEX
Millisecond-level result reporting for decision making
High-performance, ultra-fast data management
Data migration through Rapids Connectors to Rapids Data in-memory database
20 industry standard models including GLM, Random Forest, GBM, and Word2Vec
Use cases for fraud detection, loan assessment, and risk management

Data flow and AI workflow

From IoT data to drag-and-drop AI analytics

RapidsDB connects edge and cloud data flow patterns with AI workflow tooling for regression, classification, and clustering use cases.

Model evaluation metrics: R2, Logloss, AUC, Gini, MSE, RMSE, mean per-class error

Regression: Power Meter
Classification: Fraud Detection
Clustering: Telco Users
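For readers less familiar with the evaluation metrics named above, the following sketch computes several of them from their textbook definitions. It does not reproduce how the RapidsDB AI workflow calculates or reports them; the function names and sample values are illustrative, and Gini is taken here as the usual normalized form, 2 * AUC - 1.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def log_loss(labels, probs, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for label, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return -total / len(labels)


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Normalized Gini coefficient, defined here as 2 * AUC - 1."""
    return 2.0 * auc(labels, scores) - 1.0


# Regression example (hypothetical values).
print(mse([3, 5, 7], [2, 5, 9]), rmse([3, 5, 7], [2, 5, 9]), r2([3, 5, 7], [2, 5, 9]))

# Classification example (hypothetical labels and scores).
labels, scores = [0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores), gini(labels, scores), log_loss(labels, scores))
```

R2, MSE, and RMSE apply to regression tasks such as the power meter case, while Logloss, AUC, and Gini apply to classification tasks such as fraud detection.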

IoT Data Flow to RapidsDB diagram

RapidsDB AIworkflow drag-and-drop UI slide

Applications

Rapids Data Platform applications

RapidsDB Unified Analytics Platform supports high-volume, low-latency analytics across industries.

Financial Services

High throughput and high concurrency
High-volume ad hoc queries on current trading data
Real-time trading risk evaluation

Energy

High throughput and high concurrency
High-speed fee charging
Ad hoc queries on detailed billing data
Real-time billing fraud detection and alerting

Telecommunications

High throughput and high concurrency
Real-time fee charging
Ad hoc queries on detailed billing data
Real-time billing fraud detection and alerting

Retail E-commerce

Real-time billing fraud detection and alerting
User behavior data collection in real time

Online Gaming

Handles a high volume of concurrent online users
Low latency to guarantee game performance

Traffic

Real-time traffic stream analytics within seconds
Real-time acquisition of high-volume traffic data

Platform recap

Unified analytics platform recap

Distributed architecture for scalability and parallel processing
Unified SQL query support for multiple data sources
In-memory computing technology for lightning-fast execution
Dynamic query optimization for high-performance queries
AI-in-database for automated data analytics
State-of-the-art simplicity for cost savings and faster time to value
Enterprise-level solutions for elasticity, scalability, cost efficiency, and agility
Seamless integration with the big data analytics ecosystem for future-oriented data pipeline creation