File system store (version 1.0)#
Editor’s draft 26 July 2019
- Specification URI:
- Issue tracking:
Copyright 2019 Zarr core development team. This work is licensed under a Creative Commons Attribution 3.0 Unported License.
This specification defines an implementation of the Zarr abstract store API using a file system.
Status of this document#
This document is a draft for review and subject to changes. It will become final when the Zarr Enhancement Proposal (ZEP) 1 is approved via the ZEP process.
Comments, questions or contributions to this document are very welcome. Comments and questions should be raised via GitHub issues. When raising an issue, please add the label “stores-filesystem-v1.0”.
This document was produced by the Zarr core development team.
Notes about design decisions for the native File System Store#
The original file system store is designed for simplicity and for easy manipulation
and transfer by external tools that are not aware of the store structure. In particular,
gsutil can be used to transfer a local directory store to cloud-based
storage, so the choice of keys is preserved.
Conformance requirements are expressed with a combination of descriptive assertions and [RFC2119] terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in [RFC2119]. However, for readability, these words do not appear in all uppercase letters in this specification.
All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. Examples in this specification are introduced with the words “for example”.
Native storage operations#
Here we consider a file system to be any system comprised of files and directories, where:
Each file has a name (sequence of characters) and contents (sequence of bytes).
Each directory has a name (sequence of characters) and children (set of zero or more files and/or directories).
Each file or directory can be addressed by a path, comprised of its name and the names of all ancestor directories, which uniquely identifies it within the file system.
… and where the following native operations are supported:
Create a file.
Write the contents of a file.
Read the contents of a file.
Delete a file.
Create a directory.
List the children of a directory, returning the name and type (file or directory) of each child.
Delete a directory.
The Zarr store interface is defined in terms of keys and values, where a key is a sequence of characters and a value is a sequence of bytes. A file system store translates keys into file system paths. This translation assumes that the store has been defined relative to a base directory. The translation is as follows:
Replace any forward slash characters (‘/’) in the key with the native directory separator for the file system.
Join the result to the base directory path, using the native directory separator.
For example, if the file system is a POSIX file system, and the base directory path is “/data”, then the key “foo/bar” is translated to the file system path “/data/foo/bar”.
For example, if the file system is a Windows file system, and the base directory path is “C:\data”, then the key “foo/bar” is translated to the file system path “C:\data\foo\bar”.
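The two translation steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `base_dir` and `sep` parameters are assumptions made here so that both the POSIX and the Windows example can be reproduced on any platform.

```python
def key_to_fspath(key: str, base_dir: str, sep: str = "/") -> str:
    """Translate a store key into a file system path (illustrative sketch)."""
    # Step 1: replace forward slashes in the key with the native separator.
    native = key.replace("/", sep)
    # Step 2: join the result to the base directory path, using the native
    # separator (a trailing separator on base_dir is dropped to avoid doubling).
    return base_dir.rstrip(sep) + sep + native


posix_path = key_to_fspath("foo/bar", "/data")
windows_path = key_to_fspath("foo/bar", "C:\\data", sep="\\")
```

Here `posix_path` and `windows_path` reproduce the two file system paths given in the examples above.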
When returning information about available keys, a file system store performs the reverse translation from file system paths to keys. This translation is as follows:
Replace any native directory separator characters with the forward slash character.
Strip the base directory path from the beginning of the path.
For example, if the file system is a POSIX file system, and the base directory path is “/data”, then the file system path “/data/foo/bar” is translated to the key “foo/bar”.
For example, if the file system is a Windows file system, and the base directory path is “C:\data”, then the file system path “C:\data\foo\bar” is translated to the key “foo/bar”.
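The reverse translation can be sketched in the same way. As before, this is only an illustration: the `base_dir` and `sep` parameters and the `ValueError` raised for paths outside the base directory are assumptions of this sketch, not requirements of the specification.

```python
def fspath_to_key(fspath: str, base_dir: str, sep: str = "/") -> str:
    """Translate a file system path into a store key (illustrative sketch)."""
    # Step 1: replace native separators with forward slashes.
    normalised = fspath.replace(sep, "/")
    base = base_dir.replace(sep, "/").rstrip("/") + "/"
    # Step 2: strip the base directory path from the beginning of the path.
    if not normalised.startswith(base):
        raise ValueError(f"{fspath!r} is not inside {base_dir!r}")
    return normalised[len(base):]


posix_key = fspath_to_key("/data/foo/bar", "/data")
windows_key = fspath_to_key("C:\\data\\foo\\bar", "C:\\data", sep="\\")
```

Both calls yield the same key, matching the examples above.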
Store API implementation#
The section below defines an implementation of the Zarr
abstract store interface in terms of the native operations of this
storage system. Below,
fspath_to_key() is a function that
translates file system paths to store keys, and
key_to_fspath() is a function that translates store keys to file system paths, as defined
in the section above.
get(key) -> value: Read and return the contents of the file at file system path key_to_fspath(key).
set(key, value): Write value as the contents of the file at file system path key_to_fspath(key).
delete(key): Delete the file or directory at file system path key_to_fspath(key).
list(): Recursively walk the file system from the base directory, returning an iterator over keys obtained by calling fspath_to_key(fp) for each descendant file path fp.
list_prefix(prefix): Obtain a file system path by calling key_to_fspath(prefix). If the result is a directory path, recursively walk the file system from this directory, returning an iterator over keys obtained by calling fspath_to_key(fp) for each descendant file path fp.
list_dir(prefix): Obtain a file system path by calling key_to_fspath(prefix). If the result is a directory path, list the directory children. Return a set of keys obtained by calling fspath_to_key(fp) for each child file path fp, and a set of prefixes obtained by calling fspath_to_key(dp) for each child directory path dp.
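The operations above could be sketched in Python as follows. This is a hedged illustration rather than the Zarr implementation: the class name `FileSystemStore` is hypothetical, and creating missing parent directories in `set` and returning empty results for non-directory prefixes are assumptions of this sketch that the text above does not mandate.

```python
import os
import shutil
from pathlib import Path


class FileSystemStore:
    """Illustrative file system store rooted at a base directory."""

    def __init__(self, base_dir):
        self.base = Path(base_dir)

    def key_to_fspath(self, key):
        # Split on "/" and join with the native separator under the base.
        return self.base.joinpath(*key.split("/"))

    def fspath_to_key(self, fp):
        # Strip the base directory and use forward slashes.
        return Path(fp).relative_to(self.base).as_posix()

    def get(self, key):
        return self.key_to_fspath(key).read_bytes()

    def set(self, key, value):
        fp = self.key_to_fspath(key)
        # Assumption: ancestor directories are created on demand.
        fp.parent.mkdir(parents=True, exist_ok=True)
        fp.write_bytes(value)

    def delete(self, key):
        fp = self.key_to_fspath(key)
        if fp.is_dir():
            shutil.rmtree(fp)  # delete a directory and its descendants
        else:
            fp.unlink()

    def list(self):
        return self.list_prefix("")

    def list_prefix(self, prefix):
        root = self.key_to_fspath(prefix)
        if root.is_dir():
            for dirpath, _dirnames, filenames in os.walk(root):
                for name in filenames:
                    yield self.fspath_to_key(Path(dirpath) / name)

    def list_dir(self, prefix):
        keys, prefixes = set(), set()
        directory = self.key_to_fspath(prefix)
        if directory.is_dir():
            for child in directory.iterdir():
                target = prefixes if child.is_dir() else keys
                target.add(self.fspath_to_key(child))
        return keys, prefixes
```

For a store containing the keys "foo/bar" and "foo/baz/qux", calling `list_dir("foo")` in this sketch returns the key set `{"foo/bar"}` and the prefix set `{"foo/baz"}`.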
The canonical URI format for this store follows the file URI scheme of the base directory path, as defined in [RFC8089]. For a Windows base directory path “c:\my data” the canonical URI would be “file:///c:/my%20data”; for a POSIX base directory path “/my data” it would be “file:///my%20data”.
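One way to derive such a URI from a base directory path is sketched below. The function name `base_dir_to_file_uri` and its `windows` flag are hypothetical, introduced only for illustration. The sketch handles absolute local paths like the two in the example and does not attempt to cover every case in [RFC8089], such as UNC paths or non-local hosts.

```python
from urllib.parse import quote


def base_dir_to_file_uri(base_dir: str, windows: bool = False) -> str:
    """Build a file URI for an absolute base directory path (sketch)."""
    # On Windows, backslashes are directory separators; on POSIX they are
    # ordinary file name characters and are percent-encoded below instead.
    path = base_dir.replace("\\", "/") if windows else base_dir
    # A drive-letter path such as "c:/my data" needs a leading slash.
    if not path.startswith("/"):
        path = "/" + path
    # Percent-encode everything except path separators and the drive colon.
    return "file://" + quote(path, safe="/:")


windows_uri = base_dir_to_file_uri("c:\\my data", windows=True)
posix_uri = base_dir_to_file_uri("/my data")
```

The two results correspond to the Windows and POSIX URIs quoted above.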
When a URI string is expected but no scheme is present, implementations may assume a file system store and use the string as the base directory path.
The following limitations for this store are known:
[RFC2119] S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[RFC8089] M. Kerwin. The “file” URI Scheme. February 2017. Proposed Standard. URL: https://tools.ietf.org/html/rfc8089