DIVE – Revolutionising Data Inflow in the Trading Data Ecosystem

Blog 5 Nov 2024

Read time:
3 minutes

Meabh McMahon

At the heart of the Trading Data Ecosystem lies the seamless integration of high-volume, high-frequency data from diverse sources, as depicted in our ecosystem model. This crucial layer is what we refer to as “Data Inflow” in our Trading Data Ecosystem. It ensures efficient ingestion of both real-time and historical market data, which is then connected across the ecosystem to drive accurate, timely decision-making.

We recognised that maintaining the quality of these inflows is vital to the success of the entire system, particularly in the fast-moving capital markets sector. To address this, we developed DIVE (Data Intellect Validation Engine)—a powerful, kdb+ native validation tool. 

DIVE was designed to tackle one of the biggest issues in trading data ecosystems: ensuring the integrity and accuracy of data inflow while enabling organisations to rely confidently on their data without manual intervention or unnecessary overhead. By identifying errors in data and offering scalable, real-time insights, DIVE keeps the entire ecosystem running smoothly.

What is DIVE?

DIVE operates directly within kdb+ systems, providing real-time insights and validating both real-time and historical data inflows. By integrating natively into the core data systems, it avoids the latency of external processing. This is crucial in highly demanding financial environments where performance is essential.

Key components of DIVE include: 

  • Core Models: Pre-built models serve as the backbone for rapid quality checks and are easily adaptable to client systems. They are available out of the box or as core elements of more complex, custom models (a minimal sketch of what such a check might look like follows this list). 
  • Customisable UI: DIVE’s user interface allows for the creation of validation rules without complex coding, making the system accessible to users beyond kdb+ specialists, while retaining an extensible core. 
  • Horizontal Scalability: With built-in worker support, DIVE seamlessly scales to handle high-volume validations across multiple datasets. 
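
To make that concrete, here is a minimal sketch in q of what a core-model check could look like. DIVE’s actual rule format is not shown in this post, so the toy table, the rule fields, and the runner below are purely illustrative.

    / toy table with one null price, standing in for a real-time feed
    trade:([] sym:`AAPL`MSFT; price:101.2 0n);
    / a hypothetical rule: a name, a target table, and a predicate to run
    rule:`name`table`check!(`noNullPrices;`trade;{not any null exec price from x});
    runRule:{[r] `rule`passed!(r`name; r[`check] value r`table)};
    runRule rule    / `rule`passed!(`noNullPrices;0b) - the check fails here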

Why we created DIVE 

In today’s complex trading data ecosystems, data inflows are key to day-to-day business operations. Ensuring their quality and integrity is critical for making sound, real-time business decisions. We developed DIVE to meet the growing challenges of: 

  • Data discrepancies caused by inconsistent vendor datasets. 
  • Increased data volumes during periods of market volatility. 
  • The rising need for automation in data governance and quality assurance. 

By providing organisations with the tools to maintain high standards of data quality without heavy reliance on infrastructure or complex code, DIVE serves as a cornerstone for modern financial data ecosystems. 

Key Benefits

  1. Native kdb+ Integration: DIVE operates natively within kdb+ environments, avoiding costly data conversions and ensuring swift performance. It can easily access HDB data or integrate through kdb gateways. 
  2. Scalable Performance: Built to scale horizontally, DIVE can run simultaneous validations, making it ideal for high-frequency data environments. As the data volume grows, DIVE’s performance remains uncompromised. 
  3. Actionable Insights: With its report-driven framework, DIVE provides ongoing metrics for data quality and can hook into existing alert systems, enabling real-time responses to any detected anomalies (a rough sketch of this pattern follows this list). 
  4. User-Friendly Validation: Designed with usability in mind, DIVE’s interface allows non-kdb experts to define rules, track performance, and improve governance, eliminating the need for extensive coding expertise. 
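
As a rough illustration of the report-driven idea, the q below stores each validation result and pushes failures to an alert handler. Every name here is invented for the sketch; DIVE’s actual reporting and alerting interfaces may look quite different.

    / results table plus a stand-in alert hook; both names are hypothetical
    results:([] time:`timestamp$(); rule:`symbol$(); passed:`boolean$());
    onAlert:{0N!"ALERT: rule failed - ",string x};
    record:{[r;p] `results insert (.z.p;r;p); if[not p; onAlert r]};
    record[`noNullPrices;0b]    / stores the row and fires the alert hook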

Use Cases 

Why is my data late? A distressingly common issue in real-time data is silent feed failure. Live data becomes stale, and as seconds turn to minutes, trading actions quickly become out of date. With our real-time data staleness model, deviations from normal data flow can be detected within milliseconds. 
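
As a minimal sketch of the idea (not the staleness model itself), the q below flags a feed whose latest tick is older than a fixed tolerance. The trade table, its columns, and the five-second threshold are all illustrative.

    / toy feed: the last tick arrived eight seconds ago
    trade:([] time:.z.p - 0D00:00:12 0D00:00:08; sym:`AAPL`MSFT; price:101.2 330.1);
    staleThreshold:0D00:00:05;    / tolerance would be tuned per feed
    isStale:{staleThreshold < .z.p - exec max time from trade};
    isStale[]    / 1b: the feed has gone quiet beyond the tolerance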

What happened to my time zones? Data vendors may change their time zone unexpectedly. We’ve seen this happen due to daylight saving time, vendor shifts from UTC to local time (and vice versa), and incorrect upstream time zone processing. Our inbuilt model checks data files against history to determine whether the time range has moved unexpectedly, and alerts accordingly. 
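
Here is a minimal sketch of the comparison at the heart of such a check, assuming a quote table with date and intraday time columns; the trailing window and threshold are placeholders for configured values.

    / today's ticks start an hour later than the two prior days
    quote:([] date:.z.d - 0 1 2; time:09:30:00.000 08:30:00.000 08:30:00.000);
    dayStart:{exec `long$min time from quote where date=x};    / ms since midnight
    histMed:med dayStart each .z.d - 1 2;    / median start over prior days
    shifted:1800000 < abs dayStart[.z.d] - histMed    / 1b: range moved by an hour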

What happened to my data universe? Partial data is pernicious; it looks like the data is there, but perhaps a sector is missing. Our basic models can check distinct symbology counts against manual thresholds or expected historical norms, alerting quickly and effectively. 
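
A minimal sketch of the count-based check, assuming a trade table with a date column; the one-day window and 10% threshold stand in for configured values.

    / yesterday had three symbols, today only two
    trade:([] date:.z.d - 0 0 1 1 1; sym:`AAPL`MSFT`AAPL`MSFT`IBM);
    symCount:{exec count distinct sym from trade where date=x};
    histAvg:avg symCount each .z.d - 1 + til 1;    / widen til for a real window
    universeDrop:symCount[.z.d] < 0.9 * histAvg    / 1b: the universe has shrunk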

We developed DIVE as a solution to the industry’s pressing need for accurate, scalable, and user-friendly data validation. By integrating DIVE into the trading data ecosystem, organisations can ensure the seamless flow of high-quality data, driving more reliable decision-making and a smoother, more efficient operational workflow. 
