VAST Data is a company that develops a unified data infrastructure platform for artificial intelligence (AI), high-performance computing (HPC), and enterprise data management. Founded in 2016, it is headquartered in New York, USA. Its platform departs from traditional tiered storage architectures by managing all data types within a single tier, with the aim of delivering high performance at low cost. The underlying architecture is called DASE (Disaggregated, Shared-Everything).
Founders
- Renen Hallak serves as CEO (Chief Executive Officer). Previously involved in the XtremIO project at EMC, Hallak has shaped the company’s technical vision with expertise in scalability, resilience, and data performance.
- Shachar Fienblit, co-founder and the first CTO (Chief Technology Officer), has extensive engineering experience and designed the software infrastructure of VAST’s architecture.
- Jeff Denworth, another co-founder, is responsible for marketing strategies and has prior experience in marketing and product management at firms like DataDirect Networks (DDN).
DASE Architecture
DASE was designed as an alternative to the "shared-nothing" architecture that Google popularized roughly two decades ago (the term itself dates back to the 1980s). It decouples processing logic from system state: state lives in data structures shared by the whole cluster, and every compute node can access it transactionally. As a result, compute and storage components scale independently while data access remains "shared-everything."
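The separation of processing logic from state can be illustrated with a toy model. The sketch below is not VAST's implementation: `SharedState` and `ComputeNode` are hypothetical names, and an in-memory dictionary guarded by a lock stands in for the shared storage layer. It shows only that stateless nodes can be added or removed without moving data, because none of them owns any.

```python
import threading


class SharedState:
    """Stand-in for the shared storage layer: all state lives here."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)


class ComputeNode:
    """Stateless worker: holds a reference to shared state, owns no data."""

    def __init__(self, name, state):
        self.name = name
        self.state = state

    def write(self, key, value):
        self.state.put(key, value)

    def read(self, key):
        return self.state.get(key)


state = SharedState()
node_a = ComputeNode("a", state)
node_a.write("/projects/run1", b"results")

# A node added later sees the same data immediately; nothing is rebalanced.
node_b = ComputeNode("b", state)
value_seen_by_b = node_b.read("/projects/run1")

# Removing node_a loses nothing, because it never held any state.
del node_a
value_after_removal = node_b.read("/projects/run1")
```

In a shared-nothing design, by contrast, each node would own a partition of the data, so adding or removing a node would require moving that partition.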
Core Components and Technologies
- VAST DataStore: A high-performance, multi-protocol (NFS, SMB, S3) data storage layer that offers ACID (Atomicity, Consistency, Isolation, Durability) guarantees and supports file, object, and table formats.
- VAST DataSpace: A global namespace spanning edge to cloud environments, enabling low-latency multi-site clusters without distributed locking.
- VAST DataBase: A data layer that combines OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing), enabling simultaneous transaction and analytics workloads.
- VAST DataEngine: Provides real-time data processing through event triggers, user-defined functions (UDFs), and logical queries.
- Gemini: A consumption-based software licensing model that allows VAST software to run independently of hardware, which can be procured directly from manufacturing partners.
Performance with DASE
DASE is designed to eliminate the east-west (node-to-node) traffic that typically limits performance in traditional systems, so that processors can serve requests without coordinating with one another. According to the company, this allows a cluster to scale from petabytes to exabytes.
VAST Element
Data in the VAST platform is stored in units called "VAST Elements." Each element can be defined as a file, object, table, or function, and is enriched with extensive metadata. This enables simultaneous multi-protocol and multi-format access to the same data.
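A minimal sketch of the idea that one stored element can be reached through more than one protocol-style name. The `Element` and `Namespace` classes, their fields, and the path-to-bucket mapping are all assumptions made for illustration; they do not describe VAST's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One stored item plus its metadata (hypothetical structure)."""
    kind: str                      # "file", "object", "table" or "function"
    payload: bytes
    metadata: dict = field(default_factory=dict)


class Namespace:
    """Maps several protocol-style names onto the same Element."""

    def __init__(self):
        self._elements = {}

    def store(self, path, element):
        self._elements[path] = element

    def read_as_file(self, path):
        # File-style access: a slash-separated path.
        return self._elements[path].payload

    def read_as_object(self, bucket, key):
        # Object-style access: bucket + key resolve to the same path.
        return self._elements[f"/{bucket}/{key}"].payload


ns = Namespace()
ns.store("/genomics/sample1.bam",
         Element("file", b"reads", {"owner": "lab-a"}))

file_view = ns.read_as_file("/genomics/sample1.bam")
object_view = ns.read_as_object("genomics", "sample1.bam")
```

Both views return the same bytes because there is only one copy of the element; the protocols differ only in how they name it.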
VAST provides atomic, consistent data structures that enable real-time snapshots without performance degradation. A single VAST cluster can support up to one million snapshots simultaneously.
Instead of conventional Reed-Solomon coding, VAST uses LDEC (Locally Decodable Erasure Codes), which is reported to reconstruct data significantly faster while adding only 2.7% capacity overhead, improving durability and shortening recovery time.
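An overhead of about 2.7% implies very wide stripes. The stripe geometry is not stated here, so the short calculation below assumes a layout of 146 data strips plus 4 parity strips purely for illustration, and compares it with a hypothetical narrow 8+2 stripe. Overhead is taken as parity capacity divided by data capacity. The calculation covers only the capacity cost; it does not model the locally decodable property that speeds up rebuilds.

```python
def parity_overhead(data_strips, parity_strips):
    """Extra capacity used for parity, as a fraction of the user data stored."""
    return parity_strips / data_strips


wide = parity_overhead(146, 4)    # assumed wide stripe, for illustration
narrow = parity_overhead(8, 2)    # hypothetical narrow stripe for comparison

print(f"146+4 stripe: {wide:.1%} overhead")    # 146+4 stripe: 2.7% overhead
print(f"8+2 stripe:   {narrow:.1%} overhead")  # 8+2 stripe:   25.0% overhead
```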
The platform also includes a data reduction algorithm called Similarity, which combines global deduplication with local compression. The company reports that it can reduce data volume even for pre-compressed or encrypted data, with an average data reduction ratio of 3:1.
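The general idea of similarity-based reduction can be sketched as follows: find a stored block that resembles the new one, then compress the new block relative to it. This is not VAST's algorithm. The Jaccard comparison over byte shingles, the helper names, and the use of a zlib preset dictionary as the delta step are all assumptions chosen to keep the example small. The test data is pseudo-random, so ordinary compression gains almost nothing on it.

```python
import hashlib
import zlib


def pseudo_random_block(seed, size=4096):
    """Deterministic, poorly compressible bytes (stands in for compressed data)."""
    return b"".join(
        hashlib.sha256(f"{seed}-{i}".encode()).digest() for i in range(size // 32)
    )


def shingles(block, width=16):
    """Set of all overlapping byte windows of the given width."""
    return {block[i:i + width] for i in range(len(block) - width + 1)}


def jaccard(a, b):
    """Fraction of shingles the two blocks have in common."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def compress_against(block, reference):
    """Compress `block` using `reference` as a preset zlib dictionary."""
    comp = zlib.compressobj(zdict=reference)
    return comp.compress(block) + comp.flush()


def decompress_against(blob, reference):
    """Inverse of compress_against; needs the same reference block."""
    decomp = zlib.decompressobj(zdict=reference)
    return decomp.decompress(blob) + decomp.flush()


references = [pseudo_random_block("a"), pseudo_random_block("b")]

# New block: a copy of the first reference with eight bytes changed.
edited = bytearray(references[0])
edited[100:108] = b"CHANGED!"
new_block = bytes(edited)

# Pick the most similar stored block, then store only the difference.
best = max(references, key=lambda ref: jaccard(new_block, ref))
standalone = zlib.compress(new_block)
delta = compress_against(new_block, best)
restored = decompress_against(delta, best)

print(len(new_block), len(standalone), len(delta))
```

Exact deduplication would miss this block because eight bytes differ, and standalone compression cannot shrink it, but the delta against the similar reference is far smaller than the original. A production system would use compact similarity hashes rather than comparing full shingle sets against every stored block.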
Use Cases
VAST supports a wide range of applications, including AI training, genomic analysis, video processing, and financial modeling. It is used by institutions such as NASA, Zoom, Pixar, the National Institutes of Health (NIH), the Allen Institute, the U.S. Air Force, and CoreWeave.
Efficiency
The platform is optimized for low-cost QLC (Quad-Level Cell) SSDs, which are covered by 10-year warranties, and it is designed to mitigate flash endurance problems such as write amplification. By eliminating multi-tiered architectures, VAST keeps all data accessible in real time from a single storage layer.
Future Outlook
IDC has described DASE as "the architecture of the future." Designed for the AI era, VAST simplifies data management and overcomes scalability, performance, and cost limitations of traditional systems. The DASE architecture is expected to play a central role in future developments, particularly in distributed learning systems, edge-cloud collaboration, and automated AI data processing pipelines.