You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

304 lines
7.2 KiB
ReStructuredText

.. highlight:: rst
.. _scardac:
#######
scardac
#######
**Waveform archive data availability collector.**
Description
===========
scardac scans an :term:`SDS waveform archive <SDS>` , e.g.,
created by :ref:`slarchive` or :ref:`scart` for
available :term:`miniSEED <miniSeed>` data. It will
collect information about
* Data extents - the absolute earliest and latest times data is available of a
particular channel,
* Data segments - continuous data segments sharing the same quality and sampling
rate attributes.
scardac is intended to be executed periodically, e.g., as a cronjob.
The data availability information is stored in the SeisComP database under the
root element :ref:`DataAvailability <api-datamodel-python>`. Access to the data
availability is provided by the :ref:`fdsnws` module via the services:
* :ref:`/fdsnws/station <sec-station>` (extent information only, see
``matchtimeseries`` and ``includeavailability`` request parameters).
* :ref:`/fdsnws/ext/availability <sec-avail>` (extent and segment information
provided in different formats)
.. _scarcac_non-sds:
Non-SDS archives
----------------
scardac can be extended by plugins to scan non-SDS archives. For example the
*daccaps* plugin provided by :cite:t:`caps` allows scanning archives generated
by a CAPS server. Plugins are added to global module configuration, e.g.:
.. code-block:: properties
plugin = xyz
.. _scarcac_workflow:
Workflow
--------
#. Read existing ``Extents`` from database
#. Scan the SDS archive for new channel IDs and create new ``Extents``
#. Subsequently process the ``Extents`` using ``threads`` number of parallel
threads. For each ``Extent``:
#. Find all available daily data files
#. Sort the file list according date
#. For each data file
* remove ``DataSegments`` that do longer exists
* update or create ``DataSegments`` that changed or are new
* a segment is split if
* the ``jitter`` (difference between previous records end time and
current records start time) is exceeded
* the quality or sampling rate changed
* merge segment information into ``DataAttributeExtents`` (``Extents``
sharing the same quality and sample rate information)
* merge segment start and end time into overall ``Extent``
Examples
--------
#. Get command line help or execute scardac with default parameters and informative
debug output:
.. code-block:: sh
scardac -h
scardac --debug
#. Update the availability of waveform data files existing in the standard
:term:`SDS` archive to the seiscomp database and create an XML file using
:ref:`scxmldump`:
.. code-block:: sh
scardac -d mysql://sysop:sysop@localhost/seiscomp -a $SEISCOMP_ROOT/var/lib/archive --debug
scxmldump -Yf -d mysql://sysop:sysop@localhost/seiscomp -o availability.xml
#. Update the availability of waveform data files existing in the standard
:term:`SDS` archive to the seiscomp database. Use :ref:`fdsnws` to fetch a flat file containing a list
of periods of available data from stations of the CX network sharing the same
quality and sampling rate attributes:
.. code-block:: sh
scardac -d mysql://sysop:sysop@localhost/seiscomp -a $SEISCOMP_ROOT/var/lib/archive
wget -O availability.txt 'http://localhost:8080/fdsnws/ext/availability/1/query?network=CX'
.. note::
The |scname| module :ref:`fdsnws` must be running for executing this
example.
.. _scardac_configuration:
Module Configuration
====================
| :file:`etc/defaults/global.cfg`
| :file:`etc/defaults/scardac.cfg`
| :file:`etc/global.cfg`
| :file:`etc/scardac.cfg`
| :file:`~/.seiscomp/global.cfg`
| :file:`~/.seiscomp/scardac.cfg`
scardac inherits :ref:`global options<global-configuration>`.
.. confval:: archive
Default: ``@SEISCOMP_ROOT@/var/lib/archive``
Type: *string*
Path to MiniSeed waveform archive where all data is stored. The SDS archive
structure is defined as
YEAR\/NET\/STA\/CHA\/NET.STA.LOC.CHA.YEAR.DATEOFYEAR, e.g.
2018\/GE\/APE\/BHZ.D\/GE.APE..BHZ.D.2018.125
.. confval:: threads
Default: ``1``
Type: *int*
Number of threads scanning the archive in parallel.
.. confval:: batchSize
Default: ``100``
Type: *int*
Batch size of database transactions used when updating data
availability segments. Allowed range: [1,1000].
.. confval:: jitter
Default: ``0.5``
Type: *float*
Acceptable derivation of end time and start time of successive
records in multiples of sample time.
.. confval:: maxSegments
Default: ``1000000``
Type: *int*
Maximum number of segments per stream. If the limit is reached
no more segments are added to the database and the corresponding
extent is flagged as to fragmented. Use a negative value to
disable any limit.
Command-Line Options
====================
.. program:: scardac
:program:`scardac [OPTION]...`
Generic
-------
.. option:: -h, --help
Show help message.
.. option:: -V, --version
Show version information.
.. option:: --config-file arg
Use alternative configuration file. When this option is
used the loading of all stages is disabled. Only the
given configuration file is parsed and used. To use
another name for the configuration create a symbolic
link of the application or copy it. Example:
scautopick \-> scautopick2.
.. option:: --plugins arg
Load given plugins.
Verbosity
---------
.. option:: --verbosity arg
Verbosity level [0..4]. 0:quiet, 1:error, 2:warning, 3:info,
4:debug.
.. option:: -v, --v
Increase verbosity level \(may be repeated, eg. \-vv\).
.. option:: -q, --quiet
Quiet mode: no logging output.
.. option:: --print-component arg
For each log entry print the component right after the
log level. By default the component output is enabled
for file output but disabled for console output.
.. option:: --component arg
Limit the logging to a certain component. This option can
be given more than once.
.. option:: -s, --syslog
Use syslog logging backend. The output usually goes to
\/var\/lib\/messages.
.. option:: -l, --lockfile arg
Path to lock file.
.. option:: --console arg
Send log output to stdout.
.. option:: --debug
Execute in debug mode.
Equivalent to \-\-verbosity\=4 \-\-console\=1 .
.. option:: --trace
Execute in trace mode.
Equivalent to \-\-verbosity\=4 \-\-console\=1 \-\-print\-component\=1
\-\-print\-context\=1 .
.. option:: --log-file arg
Use alternative log file.
Collector
---------
.. option:: -a, --archive arg
Overrides configuration parameter :confval:`archive`.
.. option:: --threads arg
Overrides configuration parameter :confval:`threads`.
.. option:: -b, --batch-size arg
Overrides configuration parameter :confval:`batchsize`.
.. option:: -j, --jitter arg
Overrides configuration parameter :confval:`jitter`.
.. option:: --generate-test-data arg
Do not scan the archive but generate test data for each
stream in the inventory. Format:
days,gaps,gapslen,overlaps,overlaplen. E.g. the following
parameter list would generate test data for 100 days
\(starting from now\(\)\-100\) which includes 150 gaps with a
length of 2.5s followed by 50 overlaps with an overlap of
5s: \-\-generate\-test\-data\=100,150,2.5,50,5