I/O Drivers
As shown in the getting-started guide, the open function in vineyard can open a local
file as a stream for consumption. Note that the path of the local file is prefixed
with the scheme file://.
In fact, vineyard supports several types of data sources, e.g., kafka://
for Kafka topics. The functions that open different data sources as vineyard
streams are called drivers. Each driver is registered to open for a
specific scheme, so that when open is invoked, it dispatches to the corresponding
driver according to the scheme of the path.
The following sample code demonstrates the dispatching logic in open, along with
an example of driver registration.
>>> @registerize
>>> def open(path, *args, **kwargs):
>>>     scheme = urlparse(path).scheme
>>>     for reader in open._factory[scheme][::-1]:
>>>         r = reader(path, *args, **kwargs)
>>>         if r is not None:
>>>             return r
>>>     raise RuntimeError('Unable to find a proper IO driver for %s' % path)
>>>
>>> # different driver functions are registered as follows
>>> open.register('file', local_driver)
Most importantly, the registration design allows users to register their own drivers
to registerized vineyard methods using .register, which avoids major revisions
to the processing code when customized computation requirements arise.
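To make the registration mechanism concrete, here is a minimal, self-contained sketch of how such a decorator could work. It is not vineyard's actual implementation: the registerize below is a hypothetical stand-in, the dispatcher is named open_stream to avoid shadowing Python's built-in open, and both drivers (local_driver and the user-defined gzip_driver) are placeholders that return tuples rather than real vineyard streams.

```python
from collections import defaultdict
from urllib.parse import urlparse


def registerize(func):
    """Hypothetical stand-in for vineyard's decorator: attach a per-scheme
    driver registry and a ``register`` method to ``func``."""
    func._factory = defaultdict(list)

    def register(scheme, driver):
        func._factory[scheme].append(driver)

    func.register = register
    return func


@registerize
def open_stream(path, *args, **kwargs):
    scheme = urlparse(path).scheme
    # Try the most recently registered driver first; a driver declines
    # a path by returning None, so the next one gets a chance.
    for reader in open_stream._factory[scheme][::-1]:
        r = reader(path, *args, **kwargs)
        if r is not None:
            return r
    raise RuntimeError('Unable to find a proper IO driver for %s' % path)


def local_driver(path, *args, **kwargs):
    # Placeholder: a real driver would return a vineyard stream.
    return ('local', urlparse(path).path)


def gzip_driver(path, *args, **kwargs):
    # Hypothetical user-defined driver: handles only ``.gz`` files.
    if path.endswith('.gz'):
        return ('gzip', urlparse(path).path)
    return None


open_stream.register('file', local_driver)
open_stream.register('file', gzip_driver)  # registered later, tried first

print(open_stream('file:///tmp/data.csv'))     # ('local', '/tmp/data.csv')
print(open_stream('file:///tmp/data.csv.gz'))  # ('gzip', '/tmp/data.csv.gz')
```

Because drivers for a scheme are tried in reverse registration order, a user-registered driver takes precedence over the built-in one, and it can fall through to the default simply by returning None.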