[installation] Init with inital config for global

2025-10-30 15:08:17 +01:00
commit 7640b452ed
3678 changed files with 2200095 additions and 0 deletions

@ -0,0 +1,473 @@
.. highlight:: rst
.. _capssds:
#######
capssds
#######
**Virtual overlay file system presenting a CAPS archive directory as a
read-only SDS archive.**
Description
===========
:program:`capssds` is a virtual overlay file system presenting a CAPS archive
directory as a read-only :term:`SDS` archive with no extra disk space
requirement.
CAPS directory and file names are mapped to their SDS counterparts. An
application reading from a file will only see :term:`miniSEED` records ordered
by record start time. You may connect to the virtual SDS archive using the
RecordStream SDS or read individual :term:`miniSEED` files directly. Other
seismological software such as ObsPy or Seisan may read directly from the SDS
archive or the files therein.
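As an illustration of reading from the mounted archive with third-party
software, the following minimal sketch uses the SDS client shipped with ObsPy.
The helper name `read_from_virtual_sds` is hypothetical, and the sketch assumes
that ObsPy is installed and that the archive is already mounted.

```python
def read_from_virtual_sds(sds_root, network, station, location, channel,
                          start, end):
    """Read waveforms from an SDS directory tree using ObsPy's SDS client.

    `start` and `end` are anything accepted by obspy.UTCDateTime, e.g.
    ISO 8601 strings.
    """
    # Imported lazily so this helper can be defined without ObsPy installed.
    from obspy import UTCDateTime
    from obspy.clients.filesystem.sds import Client

    client = Client(sds_root)
    return client.get_waveforms(network, station, location, channel,
                                UTCDateTime(start), UTCDateTime(end))
```

With the archive mounted under `/tmp/sds`, a call such as
`read_from_virtual_sds("/tmp/sds", "AM", "R0F05", "00", "SHZ", "2025-01-01T00:00:00", "2025-01-01T01:00:00")`
should return an ObsPy stream, provided the archive contains data for that
stream and time window.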
.. _sec-capssds-usage:
Usage
=====
The virtual file system may be mounted by an unprivileged system user like
`sysop` or configured by the `root` user to be automatically mounted on machine
startup via an `/etc/fstab` entry or a systemd mount unit.
The following sections assume that the CAPS archive is located under
`/home/sysop/seiscomp/var/lib/caps/archive` and the SDS archive should appear
under `/tmp/sds` with all files and directories being owned by the
`sysop` user.
Regardless of which of the following mount strategies is chosen, make sure to
create the target directory first:
.. code-block:: sh
mkdir -p /tmp/sds
.. _sec-capssds-usage-unpriv:
Unprivileged user
-----------------
Mount the archive:
.. code-block:: sh
capssds ~/seiscomp/var/lib/caps/archive /tmp/sds
Unmount the archive:
.. code-block:: sh
fusermount -u /tmp/sds
.. _sec-capssds-usage-fstab:
System administrator - /etc/fstab
---------------------------------
Create the /etc/fstab entry:
.. code-block:: plaintext
/home/sysop/seiscomp/var/lib/caps/archive /tmp/sds fuse.capssds defaults 0 0
Alternatively, you may define mount options, e.g., to deactivate the automatic
mount, allow the user to mount the directory themselves, or use the
`sloppy_size` feature:
.. code-block:: plaintext
/home/sysop/seiscomp/var/lib/caps/archive /tmp/sds fuse.capssds noauto,sloppy_size,user 0 0
Mount the archive:
.. code-block:: sh
mount /tmp/sds
Unmount the archive:
.. code-block:: sh
umount /tmp/sds
.. _sec-capssds-usage-systemd:
System administrator - systemd
------------------------------
Create the file `/etc/systemd/system/tmp-sds.mount` with the following content.
Please note that the file name must match the path specified under `Where`, with
the leading slash removed and all remaining slashes replaced by dashes:
.. code-block:: ini
[Unit]
Description=Mount CAPS archive as readonly miniSEED SDS
After=network.target
[Mount]
What=/home/sysop/seiscomp/var/lib/caps/archive
Where=/tmp/sds
Type=fuse.capssds
Options=defaults,allow_other
[Install]
WantedBy=multi-user.target
Mount the archive:
.. code-block:: sh
systemctl start tmp-sds.mount
Unmount the archive:
.. code-block:: sh
systemctl stop tmp-sds.mount
Automatic startup:
.. code-block:: sh
systemctl enable tmp-sds.mount
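The unit file naming rule mentioned above can be sketched in a few lines. This
is a simplified approximation of what `systemd-escape --path --suffix=mount`
produces, not a replacement for it: path normalization (e.g. `..` components)
is not handled, and the function name `mount_unit_name` is hypothetical.

```python
import string

# Characters that may appear unescaped in a unit name component.
_PLAIN = set(string.ascii_letters + string.digits + ":_.")


def _escape_component(component: str) -> str:
    """Escape every character outside _PLAIN as \\xXX per UTF-8 byte."""
    return "".join(
        ch if ch in _PLAIN
        else "".join(f"\\x{byte:02x}" for byte in ch.encode("utf-8"))
        for ch in component
    )


def mount_unit_name(where: str) -> str:
    """Derive the mount unit file name for an absolute mount point path."""
    if not where.startswith("/"):
        raise ValueError("mount point must be an absolute path")
    components = [c for c in where.split("/") if c]
    if not components:
        return "-.mount"  # the root directory maps to a single dash
    name = "-".join(_escape_component(c) for c in components)
    if name.startswith("."):
        name = "\\x2e" + name[1:]  # a leading dot is escaped as well
    return name + ".mount"
```

For the mount point used in this section the function returns `tmp-sds.mount`.
A dash inside a path component is escaped, so `/mnt/my-data` becomes
`mnt-my\x2ddata.mount`.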
.. _sec-capssds-impl:
Implementation Details
======================
:program:`capssds` makes use of FUSE :cite:p:`fuse`, a userspace
filesystem framework provided by the Linux kernel, as well as the libfuse
:cite:p:`libfuse` user space library.
The file system provides only read access to the data files and implements only
:ref:`basic operations <sec-capssds-impl-ops>` required to list and read data files.
It has to fulfill two main tasks: the :ref:`sec-capssds-impl-pathmap`
of CAPS and SDS directory tree entries and the :ref:`sec-capssds-impl-conv`.
:ref:`Caches <sec-capssds-impl-perf>` are used to improve the performance.
.. _sec-capssds-impl-ops:
Supported operations
--------------------
* `init` - initializes the file system
* `getattr` - get file and directory attributes such as size and access rights
* `access` - check for specific access rights
* `open` - open a file
* `read` - read data at a specific file position
* `readdir` - list directory entries
* `release` - release a file handle
* `destroy` - shutdown the file system
Please refer to
`fuse.h <https://github.com/libfuse/libfuse/blob/master/include/fuse.h>`_
for a complete list of FUSE operations.
.. _sec-capssds-impl-pathmap:
Path mapping
------------
CAPS uses a :ref:`comparable directory structure <sec-archive>` to SDS with
three differences:
* The channel does not use the `.D` suffix.
* The day of year index is zero-based (0-365) whereas SDS uses an index
starting with 1 (1-366).
* CAPS data files use the extension `.data`.
The following example shows the translation from a CAPS data file path to an SDS
file path for the stream AM.R0F05.00.SHZ for data on January 1st 2025:
`2025/AM/R0F05/SHZ/AM.R0F05.00.SHZ.2025.000.data -> 2025/AM/R0F05/SHZ.D/AM.R0F05.00.SHZ.D.2025.001`
Directories and file names not fulfilling the :term:`miniSEED` format
specification are not listed.
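The three differences listed above can be expressed as a small translation
function. This is a minimal sketch of the mapping for a single data file path,
not the code used by :program:`capssds`; the function name `caps_to_sds_path`
is hypothetical and only relative paths of the form shown in the example are
handled.

```python
def caps_to_sds_path(caps_path: str) -> str:
    """Translate a relative CAPS data file path into the SDS file path."""
    # Expected layout: YEAR/NET/STA/CHA/NET.STA.LOC.CHA.YEAR.DOY.data
    year, net, sta, cha, file_name = caps_path.split("/")

    stem, extension = file_name.rsplit(".", 1)
    if extension != "data":
        raise ValueError(f"not a CAPS data file: {file_name}")

    f_net, f_sta, f_loc, f_cha, f_year, f_doy = stem.split(".")

    # CAPS counts days from 0, SDS counts days from 1.
    sds_doy = int(f_doy) + 1

    # SDS appends the data type 'D' to the channel directory and file name
    # and uses no file extension.
    sds_file = f"{f_net}.{f_sta}.{f_loc}.{f_cha}.D.{f_year}.{sds_doy:03d}"
    return f"{year}/{net}/{sta}/{cha}.D/{sds_file}"
```

Applied to the example above,
`caps_to_sds_path("2025/AM/R0F05/SHZ/AM.R0F05.00.SHZ.2025.000.data")` returns
`2025/AM/R0F05/SHZ.D/AM.R0F05.00.SHZ.D.2025.001`. Paths that do not match the
expected layout raise a `ValueError`, which loosely corresponds to such entries
not being listed.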
.. _sec-capssds-impl-conv:
Data file conversion
--------------------
A :ref:`CAPS data file <sec-caps-archive-file-format>` contains records of
certain types in the order of their arrival together with a record index for
record lookup and sorting. If a process reads data, only :term:`miniSEED` records
contained in the CAPS data file are returned, in order of record start time
rather than order of arrival. Likewise, only :term:`miniSEED` records are counted
for the reported file size unless the `-o sloppy_size` option is specified.
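The conversion can be pictured with a small in-memory model. The sketch below
is purely illustrative: the `Record` class, the record type names and the
helper functions are hypothetical and do not reflect the actual CAPS file
format or record index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    kind: str          # record type, e.g. "miniSEED" (names are assumptions)
    start_time: float  # record start time in epoch seconds
    payload: bytes     # raw record bytes


def sds_view(records):
    """Return only the miniSEED records, ordered by record start time."""
    mseed = [r for r in records if r.kind == "miniSEED"]
    return sorted(mseed, key=lambda r: r.start_time)


def sds_file_size(records):
    """Size of the virtual SDS file: the sum of all miniSEED record sizes."""
    return sum(len(r.payload) for r in sds_view(records))


def sds_read(records, offset, length):
    """Read `length` bytes at `offset` from the virtual SDS file."""
    data = b"".join(r.payload for r in sds_view(records))
    return data[offset:offset + length]
```

Given records that arrived out of order, `sds_view` yields them sorted by start
time, and records of other types contribute neither to the returned data nor to
the reported size.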
.. _sec-capssds-impl-perf:
Performance optimization
------------------------
When a file is opened, all :term:`miniSEED` records are copied to a memory
buffer. This allows fast index-based data access at the cost of main memory
consumption. The number of simultaneously opened data files can be configured
through the `-o cached_files` option and should be chosen according to the
available memory. If an application tries to open more files than available,
the action will fail.
To obtain the mapped SDS file size, the CAPS data file must be scanned for
:term:`miniSEED` records. Although only the header data is read, this is still
an expensive operation for hundreds of files. A file size cache is used containing
up to `-o cached_file_sizes` entries each consuming 56 bytes of memory. File
sizes recently accessed are pushed to the front of the cache. A cache item is
invalidated if the modification time of the CAPS data file is more recent than
the entry creation time.
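The behaviour of such a cache can be sketched as follows. This is a minimal
model under stated assumptions, not the actual implementation: the class name
`FileSizeCache` and the injected callables `compute_size`, `get_mtime` and
`now` are hypothetical and exist only to keep the example self-contained.

```python
import time
from collections import OrderedDict


class FileSizeCache:
    """Bounded cache of file sizes, most recently used entry at the front."""

    def __init__(self, capacity, compute_size, get_mtime, now=time.time):
        self._capacity = capacity
        self._compute_size = compute_size  # expensive scan of the data file
        self._get_mtime = get_mtime        # modification time of the data file
        self._now = now                    # clock used for entry creation time
        self._entries = OrderedDict()      # path -> (size, creation time)

    def __len__(self):
        return len(self._entries)

    def get(self, path):
        entry = self._entries.get(path)
        # An entry is invalid if the file was modified after entry creation.
        if entry is None or self._get_mtime(path) > entry[1]:
            entry = (self._compute_size(path), self._now())
            self._entries[path] = entry
        # Recently accessed entries are pushed to the front of the cache.
        self._entries.move_to_end(path, last=False)
        # Drop the least recently used entries from the back.
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=True)
        return entry[0]
```

With 56 bytes per entry, a cache holding 100000 entries would account for
roughly 5.6 MB of memory.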
If your use case does not require the listing of the exact file size, you may
use the `-o sloppy_size` option, which skips computing the :term:`miniSEED`
file size and returns the size of the CAPS file instead.
Command-Line Options
====================
:program:`capssds [options] [capsdir] mountpoint`
.. _File-system specific options:
File-system specific options
----------------------------
.. option:: -o caps_dir=DIR
Default: ``Current working directory``
Path to the CAPS archive directory.
.. option:: -o sloppy_size
Return the size of the CAPS data file instead of summing
up the size of all MSEED records. Although there is a
cache for the MSEED file size, calculating the real size is
an expensive operation. If your use case does not depend
on the exact size, you may activate this flag for a speedup.
.. option:: -o cached_file_sizes=int
Default: ``100000``
Type: *int*
Number of file sizes to cache. Used when sloppy_size is
off to avoid unnecessary recomputation of MSEED sizes. A
cache entry is valid as long as neither the mtime nor
size of the CAPS data file changed. Each entry consumes
56 bytes of memory.
.. option:: -o cached_files=int
Default: ``100``
Type: *int*
Number of CAPS data files to cache. The file
handle for each cached file will be kept open to speed
up data access.
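To show how the file-system specific options combine with the synopsis above,
the following sketch assembles an argument list suitable for
`subprocess.run`. The helper names are hypothetical; only options documented in
this section are used, and each is passed with its own `-o` flag.

```python
def capssds_command(caps_dir, mountpoint, sloppy_size=False,
                    cached_file_sizes=None, cached_files=None):
    """Build the capssds argument list: capssds [options] capsdir mountpoint."""
    options = []
    if sloppy_size:
        options.append("sloppy_size")
    if cached_file_sizes is not None:
        options.append(f"cached_file_sizes={int(cached_file_sizes)}")
    if cached_files is not None:
        options.append(f"cached_files={int(cached_files)}")

    command = ["capssds"]
    for option in options:
        command += ["-o", option]
    command += [caps_dir, mountpoint]
    return command


def size_cache_memory(entries, bytes_per_entry=56):
    """Approximate memory used by the file size cache in bytes."""
    return entries * bytes_per_entry
```

For example, `capssds_command("/home/sysop/seiscomp/var/lib/caps/archive", "/tmp/sds", cached_files=50)`
yields `capssds -o cached_files=50 /home/sysop/seiscomp/var/lib/caps/archive /tmp/sds`
when joined with spaces, and `size_cache_memory(100000)` evaluates to 5600000
bytes for the default cache size.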
.. _FUSE Options:
FUSE Options
------------
.. option:: -h, --help
Print this help text.
.. option:: -V, --version
Print version.
.. option:: -d
Enable debug output \(implies \-f\).
.. option:: -o debug
Enable debug output \(implies \-f\).
.. option:: -f
Enable foreground operation.
.. option:: -s
Disable multi\-threaded operation.
.. option:: -o clone_fd
Use separate fuse device fd for each thread \(may improve performance\).
.. option:: -o max_idle_threads=int
Default: ``-1``
Type: *int*
The maximum number of idle worker threads allowed.
.. option:: -o max_threads=int
Default: ``10``
Type: *int*
The maximum number of worker threads allowed.
.. option:: -o kernel_cache
Cache files in kernel.
.. option:: -o [no]auto_cache
Enable caching based on modification times.
.. option:: -o no_rofd_flush
Disable flushing of read\-only fd on close.
.. option:: -o umask=M
Type: *octal*
Set file permissions.
.. option:: -o uid=N
Set file owner.
.. option:: -o gid=N
Set file group.
.. option:: -o entry_timeout=T
Default: ``1``
Unit: *s*
Type: *float*
Cache timeout for names.
.. option:: -o negative_timeout=T
Default: ``0``
Unit: *s*
Type: *float*
Cache timeout for deleted names.
.. option:: -o attr_timeout=T
Default: ``1``
Unit: *s*
Type: *float*
Cache timeout for attributes.
.. option:: -o ac_attr_timeout=T
Default: ``attr_timeout``
Unit: *s*
Type: *float*
Auto cache timeout for attributes.
.. option:: -o noforget
Never forget cached inodes.
.. option:: -o remember=T
Default: ``0``
Unit: *s*
Type: *float*
Remember cached inodes for T seconds.
.. option:: -o modules=M1[:M2...]
Names of modules to push onto filesystem stack.
.. option:: -o allow_other
Allow access by all users.
.. option:: -o allow_root
Allow access by root.
.. option:: -o auto_unmount
Auto unmount on process termination.
.. _Options for subdir module:
Options for subdir module
-------------------------
.. option:: -o subdir=DIR
Prepend this directory to all paths \(mandatory\).
.. option:: -o [no]rellinks
Transform absolute symlinks to relative.
.. _Options for iconv module:
Options for iconv module
------------------------
.. option:: -o from_code=CHARSET
Default: ``UTF-8``
Original encoding of file names.
.. option:: -o to_code=CHARSET
Default: ``UTF-8``
New encoding of the file names.