Data processing

These case studies show how to integrate vineyard with existing data-intensive tasks. In complex workflows that span multiple computing engines, vineyard can improve both performance and ease of use.

Share Python objects between processes through vineyard.

Use vineyard as an alternative to multiprocessing.shared_memory in Python.

See how vineyard makes distributed machine learning training workflows more efficient by integrating with the computing engines they involve.

Vineyard serves as the DataSet backend for Kedro pipelines, enabling efficient data sharing between tasks without intrusive code modification, even when the pipeline is deployed to Kubernetes.

Vineyard supports sharing GPU memory in a zero-copy manner, enabling efficient data sharing between GPU-accelerated tasks.
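
For context on the multiprocessing.shared_memory case study mentioned above, the following is a minimal sketch of the standard-library baseline that vineyard is positioned as an alternative to. It uses only Python's `multiprocessing.shared_memory` module and does not call any vineyard API. The helper names `write_block` and `read_block` are hypothetical, and the second handle opened in the same process merely stands in for a separate consumer process.

```python
from multiprocessing import shared_memory


def write_block(payload: bytes) -> shared_memory.SharedMemory:
    """Create a named shared-memory block and copy the payload into it."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    return shm


def read_block(name: str, size: int) -> bytes:
    """Attach to an existing block by name and copy out `size` bytes."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:size])
    finally:
        # Release this handle only; the creator is responsible for unlink().
        shm.close()


payload = b"hello from the producer"
producer = write_block(payload)
try:
    # A consumer needs both the block name and the payload size,
    # and must interpret the raw bytes on its own.
    received = read_block(producer.name, len(payload))
finally:
    producer.close()
    producer.unlink()  # free the block once every reader is done
```

With this baseline, the caller passes the block name and size around, handles serialization of anything beyond raw bytes, and manages the block's lifetime explicitly.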

Vineyard on Kubernetes

Vineyard can be deployed on Kubernetes, managed by the Vineyard operator, to improve big-data workflows through its data-aware scheduling policy. This policy orchestrates shared objects and routes jobs to where their input data resides. The following tutorials show how to deploy vineyard and integrate it with Kubernetes.

The Vineyard operator is the central component for integrating vineyard with Kubernetes.

Vineyard serves as efficient intermediate data storage for machine learning pipelines on Kubernetes.

Extending vineyard

Vineyard offers a collection of efficient data structures tailored for data-intensive tasks, such as tensors, data frames, tables, and graphs. These data types can be extended to accommodate custom requirements. Once user-defined types are registered in the vineyard type registry, computing engines built on top of vineyard can use these custom data structures directly.

Write builders and resolvers for custom Python data types.

Implement and register custom data types in C++ so that they work with the rest of vineyard’s ecosystem.
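
To illustrate the idea of registering builders and resolvers for a custom type, here is a self-contained sketch of the general pattern. It is a toy registry written in plain Python and is not vineyard's API: `StoredObject`, `TypeRegistry`, `Point`, and the `example::Point` typename are all hypothetical names introduced only for this illustration. A builder turns a Python value into a stored form tagged with a typename, and a resolver turns that stored form back into a Python value.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class StoredObject:
    """Hypothetical stand-in for a stored object: a typename plus a payload."""
    typename: str
    payload: Dict[str, Any]


class TypeRegistry:
    """Toy registry: builders are keyed by Python type, resolvers by typename."""

    def __init__(self) -> None:
        self._builders: Dict[type, Callable[[Any], StoredObject]] = {}
        self._resolvers: Dict[str, Callable[[StoredObject], Any]] = {}

    def register_builder(self, py_type: type, builder: Callable[[Any], StoredObject]) -> None:
        self._builders[py_type] = builder

    def register_resolver(self, typename: str, resolver: Callable[[StoredObject], Any]) -> None:
        self._resolvers[typename] = resolver

    def build(self, value: Any) -> StoredObject:
        # Dispatch on the exact Python type of the value.
        return self._builders[type(value)](value)

    def resolve(self, obj: StoredObject) -> Any:
        # Dispatch on the typename recorded when the object was built.
        return self._resolvers[obj.typename](obj)


@dataclass
class Point:
    """A custom user-defined type to be shared."""
    x: float
    y: float


registry = TypeRegistry()
registry.register_builder(
    Point, lambda p: StoredObject("example::Point", {"x": p.x, "y": p.y})
)
registry.register_resolver(
    "example::Point", lambda o: Point(o.payload["x"], o.payload["y"])
)

stored = registry.build(Point(1.0, 2.0))
restored = registry.resolve(stored)
```

Keying resolvers by typename rather than by Python class is what lets a different process or engine reconstruct a value it did not create, as long as it has registered a resolver for that typename.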