.. highlight:: rst

.. _capssds:

#######
capssds
#######

**Virtual overlay file system presenting a CAPS archive directory as a
read-only SDS archive.**


Description
===========

:program:`capssds` is a virtual overlay file system presenting a CAPS archive
directory as a read-only :term:`SDS` archive with no extra disk space
requirement.

CAPS directory and file names are mapped to their SDS counterparts. An
application reading from a file will only see :term:`miniSEED` records ordered
by record start time. You may connect to the virtual SDS archive using the
RecordStream SDS or read individual :term:`miniSEED` files directly. Other
seismological software such as ObsPy or Seisan may read directly from the SDS
archive or the files therein.


.. _sec-capssds-usage:

Usage
=====

The virtual file system may be mounted by an unprivileged system user like
`sysop` or configured by the `root` user to be automatically mounted on machine
startup via an `/etc/fstab` entry or a systemd mount unit.

The following sections assume that the CAPS archive is located under
`/home/sysop/seiscomp/var/lib/caps/archive` and the SDS archive should appear
under `/tmp/sds` with all files and directories being owned by the
`sysop` user.

Regardless of which of the following mount strategies is chosen, make sure to
create the target directory first:

.. code-block:: sh

   mkdir -p /tmp/sds


.. _sec-capssds-usage-unpriv:

Unprivileged user
-----------------

Mount the archive:

.. code-block:: sh

   capssds ~/seiscomp/var/lib/caps/archive /tmp/sds


Unmount the archive:

.. code-block:: sh

   fusermount -u /tmp/sds


.. _sec-capssds-usage-fstab:

System administrator - /etc/fstab
---------------------------------

Create the /etc/fstab entry:

.. code-block:: plaintext

   /home/sysop/seiscomp/var/lib/caps/archive /tmp/sds fuse.capssds defaults 0 0


Alternatively you may define mount options, e.g., to deactivate the automatic
mount, allow users to mount the directory themselves or use the sloppy_size
feature:

.. code-block:: plaintext

   /home/sysop/seiscomp/var/lib/caps/archive /tmp/sds fuse.capssds noauto,sloppy_size,user 0 0


Mount the archive:

.. code-block:: sh

   mount /tmp/sds


Unmount the archive:

.. code-block:: sh

   umount /tmp/sds


.. _sec-capssds-usage-systemd:

System administrator - systemd
------------------------------

Create the following file under `/etc/systemd/system/tmp-sds.mount`.
Please note that the file name must match the path specified under `Where` with
the leading slash removed and all remaining slashes replaced by dashes:

.. code-block:: ini

   [Unit]
   Description=Mount CAPS archive as readonly miniSEED SDS
   After=network.target

   [Mount]
   What=/home/sysop/seiscomp/var/lib/caps/archive
   Where=/tmp/sds
   Type=fuse.capssds
   Options=defaults,allow_other

   [Install]
   WantedBy=multi-user.target


Mount the archive:

.. code-block:: sh

   systemctl start tmp-sds.mount


Unmount the archive:

.. code-block:: sh

   systemctl stop tmp-sds.mount


Automatic startup:

.. code-block:: sh

   systemctl enable tmp-sds.mount
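
The naming rule for the unit file can be sketched as a small shell function.
This is a simplified illustration only: the function name `mount_unit_name` is
hypothetical and it does not apply the escaping systemd requires for special
characters such as dashes within path components. For such mount points
`systemd-escape -p --suffix=mount PATH` should give the authoritative name.

.. code-block:: sh

   # Hypothetical helper: derive the mount unit file name from a mount point.
   # Simplification: special characters in path components are not escaped.
   mount_unit_name() (
       path=${1#/}    # remove the leading slash
       # replace the remaining slashes by dashes and append the unit suffix
       printf '%s.mount\n' "$(printf '%s' "$path" | tr '/' '-')"
   )

   mount_unit_name /tmp/sds    # prints: tmp-sds.mount
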


.. _sec-capssds-impl:

Implementation Details
======================

:program:`capssds` makes use of FUSE :cite:p:`fuse`, a userspace
filesystem framework provided by the Linux kernel, as well as the libfuse
:cite:p:`libfuse` user space library.

The file system provides only read access to the data files and implements only
the :ref:`basic operations <sec-capssds-impl-ops>` required to list and read
data files. It has to fulfill two main tasks, the
:ref:`sec-capssds-impl-pathmap` of CAPS and SDS directory tree entries and the
:ref:`sec-capssds-impl-conv`. :ref:`Caches <sec-capssds-impl-perf>` are used to
improve performance.

.. _sec-capssds-impl-ops:

Supported operations
--------------------

* `init` - initialize the file system
* `getattr` - get file and directory attributes such as size and access rights
* `access` - check for specific access rights
* `open` - open a file
* `read` - read data at a specific file position
* `readdir` - list directory entries
* `release` - release a file handle
* `destroy` - shut down the file system

Please refer to
`fuse.h <https://github.com/libfuse/libfuse/blob/master/include/fuse.h>`_
for a complete list of FUSE operations.


.. _sec-capssds-impl-pathmap:

Path mapping
------------

CAPS uses a :ref:`directory structure <sec-archive>` comparable to SDS with
three differences:

* The channel does not use the `.D` suffix.
* The day of year index is zero-based (0-365) whereas SDS uses an index
  starting with 1 (1-366).
* CAPS data files use the extension `.data`.

The following example shows the translation from a CAPS data file path to an SDS
file path for the stream AM.R0F05.00.SHZ for data on January 1st 2025:

`2025/AM/R0F05/SHZ/AM.R0F05.00.SHZ.2025.000.data -> 2025/AM/R0F05/SHZ.D/AM.R0F05.00.SHZ.D.2025.001`

Directories and file names not fulfilling the :term:`miniSEED` format
specification are not listed.
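
The three differences listed above can be illustrated with a short shell
sketch that translates a CAPS data file path, given relative to the archive
root, into the corresponding SDS path. The function name `caps_to_sds` is
hypothetical and the code is not taken from :program:`capssds`. It assumes a
well-formed input path and performs no validation.

.. code-block:: sh

   # Hypothetical helper: map a CAPS data file path to its SDS counterpart.
   caps_to_sds() (
       dir=$(dirname "$1")            # 2025/AM/R0F05/SHZ
       file=$(basename "$1" .data)    # AM.R0F05.00.SHZ.2025.000
       doy=${file##*.}                # zero-based day of year, e.g. 000
       rest=${file%.*}                # AM.R0F05.00.SHZ.2025
       year=${rest##*.}               # 2025
       stem=${rest%.*}                # AM.R0F05.00.SHZ
       # Append .D to the channel directory and to the stream ID and shift
       # the day of year to the one-based SDS index. expr reads the value
       # as a decimal number, so leading zeros are harmless.
       printf '%s.D/%s.D.%s.%03d\n' "$dir" "$stem" "$year" "$(expr "$doy" + 1)"
   )

   caps_to_sds 2025/AM/R0F05/SHZ/AM.R0F05.00.SHZ.2025.000.data
   # prints: 2025/AM/R0F05/SHZ.D/AM.R0F05.00.SHZ.D.2025.001
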


.. _sec-capssds-impl-conv:

Data file conversion
--------------------

A :ref:`CAPS data file <sec-caps-archive-file-format>` contains records of
certain types in the order of their arrival together with a record index for
record lookup and sorting. If a process reads data, only :term:`miniSEED` records
contained in the CAPS data file are returned, ordered by record start time
rather than by arrival. Likewise only :term:`miniSEED` records are counted
for the reported file size unless the `-o sloppy_size` option is specified.


.. _sec-capssds-impl-perf:

Performance optimization
------------------------

When a file is opened all :term:`miniSEED` records are copied to a memory
buffer. This allows fast index-based data access at the cost of main memory
consumption. The number of simultaneously opened data files can be configured
through the `-o cached_files` option and must be chosen to fit the available
memory. If an application tries to open more files than available, the action
will fail.

To obtain the mapped SDS file size the CAPS data file must be scanned for
:term:`miniSEED` records. Although only the header data is read, this is still an
expensive operation for hundreds of files. A file size cache is used containing
up to `-o cached_file_sizes` entries each consuming 56 bytes of memory. File
sizes recently accessed are pushed to the front of the cache. A cache item is
invalidated if the modification time of the CAPS data file is more recent than
the entry creation time.
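
As a rough worked example, the upper bound of the memory used by the file size
cache follows from the number of entries and the size per entry. The sketch
below uses the documented default of 100000 entries. The variable names are
chosen for illustration only, and any bookkeeping overhead of the cache itself
is not considered.

.. code-block:: sh

   # Values taken from the option description of cached_file_sizes.
   entries=100000        # default number of cached file sizes
   bytes_per_entry=56    # memory consumed by one cache entry

   cache_bytes=$((entries * bytes_per_entry))
   echo "file size cache: ${cache_bytes} bytes"
   # prints: file size cache: 5600000 bytes (about 5.6 MB)
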

If your use case does not require the listing of the exact file size, you may
use the `-o sloppy_size` option which skips the computation of the
:term:`miniSEED` file size and returns the size of the CAPS data file instead.


Command-Line Options
====================

:program:`capssds [options] [capsdir] mountpoint`

.. _File-system specific options:


File-system specific options
----------------------------

.. option:: -o caps_dir=DIR

   Default: ``Current working directory``

   Path to the CAPS archive directory.

.. option:: -o sloppy_size

   Return the size of the CAPS data file instead of summing
   up the size of all MSEED records. Although there is a
   cache for the MSEED file size calculating the real size is
   an expensive operation. If your use case does not depend
   on the exact size you may activate this flag for speedup.

.. option:: -o cached_file_sizes=int

   Default: ``100000``

   Type: *int*

   Number of file sizes to cache. Used when sloppy_size is
   off to avoid unnecessary recomputation of MSEED sizes. A
   cache entry is valid as long as neither the mtime nor
   size of the CAPS data file changed. Each entry consumes
   56 bytes of memory.

.. option:: -o cached_files=int

   Default: ``100``

   Type: *int*

   Number of CAPS data files to cache \(100\). The file
   handle for each cached file will be kept open to speed
   up data access.


.. _FUSE Options:


FUSE Options
------------

.. option:: -h, --help

   Print this help text.

.. option:: -V, --version

   Print version.

.. option:: -d

   Enable debug output \(implies \-f\).

.. option:: -o debug

   Enable debug output \(implies \-f\).

.. option:: -f

   Enable foreground operation.

.. option:: -s

   Disable multi\-threaded operation.

.. option:: -o clone_fd

   Use separate fuse device fd for each thread \(may improve performance\).

.. option:: -o max_idle_threads=int

   Default: ``-1``

   Type: *int*

   The maximum number of idle worker threads allowed.

.. option:: -o max_threads=int

   Default: ``10``

   Type: *int*

   The maximum number of worker threads allowed.

.. option:: -o kernel_cache

   Cache files in kernel.

.. option:: -o [no]auto_cache

   Enable caching based on modification times.

.. option:: -o no_rofd_flush

   Disable flushing of read\-only fd on close.

.. option:: -o umask=M

   Type: *octal*

   Set file permissions.

.. option:: -o uid=N

   Set file owner.

.. option:: -o gid=N

   Set file group.

.. option:: -o entry_timeout=T

   Default: ``1``

   Unit: *s*

   Type: *float*

   Cache timeout for names.

.. option:: -o negative_timeout=T

   Default: ``0``

   Unit: *s*

   Type: *float*

   Cache timeout for deleted names.

.. option:: -o attr_timeout=T

   Default: ``1``

   Unit: *s*

   Type: *float*

   Cache timeout for attributes.

.. option:: -o ac_attr_timeout=T

   Default: ``attr_timeout``

   Unit: *s*

   Type: *float*

   Auto cache timeout for attributes.

.. option:: -o noforget

   Never forget cached inodes.

.. option:: -o remember=T

   Default: ``0``

   Unit: *s*

   Type: *float*

   Remember cached inodes for T seconds.

.. option:: -o modules=M1[:M2...]

   Names of modules to push onto filesystem stack.

.. option:: -o allow_other

   Allow access by all users.

.. option:: -o allow_root

   Allow access by root.

.. option:: -o auto_unmount

   Auto unmount on process termination.


.. _Options for subdir module:


Options for subdir module
-------------------------

.. option:: -o subdir=DIR

   Prepend this directory to all paths \(mandatory\).

.. option:: -o [no]rellinks

   Transform absolute symlinks to relative.


.. _Options for iconv module:


Options for iconv module
------------------------

.. option:: -o from_code=CHARSET

   Default: ``UTF-8``

   Original encoding of file names.

.. option:: -o to_code=CHARSET

   Default: ``UTF-8``

   New encoding of the file names.