.. _sec-caps-upgrading:

Upgrading
=========

New file format
---------------

Starting with version 2021.048, CAPS introduces a new file storage format.
The files are still chunk based and remain compatible, but two new chunk
types were added. The upgrade itself should run smoothly without interruption,
but because of the new file format all files must be converted before they can
be read. CAPS does that on the fly whenever a file is opened for reading or
writing.

This can reduce performance until all files have been converted, but it
should not cause any outages.

Rationale
---------

The time to store an out-of-order record in CAPS grew with the number of
records already stored. This was caused by a linear search for the insert
position: the more records were stored, the more records had to be checked
and the more file content had to be paged into system memory, which is a slow
operation. In addition, a second index file had to be maintained, which
required an additional open file descriptor per data file. As we were also
looking for a way to reduce disk fragmentation and to allow file size
pre-allocation on any operating system, we decided to redesign how individual
records are stored within a data file. What we wanted was:

* Fast insert operations
* Fast data retrieval
* Portable file size pre-allocations
* Efficient OS memory paging

CAPS now implements a B+tree index per data file. No additional index file is
required. The index is maintained as additional chunks in the data file itself.
Furthermore, CAPS maintains a meta chunk at the end of the file with information
about the logical and physical file size, the index chunks and so on. If that
chunk is missing or invalid, the data file is re-scanned and converted. This is
what happens after an upgrade.

As a consequence, time window requests will be much faster with respect to
CPU time. File accesses are also less frequent, and less file content has to
be read when extracting arbitrary time windows.

As the time range stored in the data file is now part of the meta data, a full
re-scan is not necessary when restarting CAPS without its archive log. When
dealing with many channels, this speeds up re-scanning an archive considerably.

Manual archive conversion
-------------------------

If a controlled conversion of the archive files is desired, the following
procedure can be applied:

1. Stop caps

   .. code-block:: sh

      $ seiscomp stop caps

2. Enter the configured archive directory

   .. code-block:: sh

      $ cd seiscomp/var/lib/caps/archive

3. Check all files and trigger a conversion

   .. code-block:: sh

      $ find . -name '*.data' -exec rifftest {} check \;

4. Start caps

   .. code-block:: sh

      $ seiscomp start caps

Depending on the size of the archive, step 3 can take some time.
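The four steps above can also be combined into a single script. The following
is a minimal sketch and not part of CAPS: the function name
``convert_archive`` and the variables ``ARCHIVE_DIR``, ``SEISCOMP`` and
``RIFFTEST`` are assumptions made for illustration, and the default archive
path must be adjusted to the configured archive directory. Unlike step 2, the
sketch passes the directory to ``find`` instead of changing into it.

.. code-block:: sh

   #!/bin/sh
   # Hypothetical helper: all names below are illustrative assumptions.
   ARCHIVE_DIR="${ARCHIVE_DIR:-seiscomp/var/lib/caps/archive}"
   SEISCOMP="${SEISCOMP:-seiscomp}"
   RIFFTEST="${RIFFTEST:-rifftest}"

   convert_archive() {
       # Refuse to do anything if the archive directory does not exist.
       [ -d "$ARCHIVE_DIR" ] || {
           echo "no such directory: $ARCHIVE_DIR" >&2
           return 1
       }

       # Step 1: stop caps and abort if that fails.
       "$SEISCOMP" stop caps || return 1

       # Steps 2 and 3: check every data file, which triggers the conversion.
       # The exit status of find reports traversal errors only, not the
       # result of the individual checks.
       find "$ARCHIVE_DIR" -name '*.data' -exec "$RIFFTEST" {} check \;
       rc=$?

       # Step 4: start caps again, even if find reported an error.
       "$SEISCOMP" start caps || return 1
       return "$rc"
   }

Calling ``convert_archive`` after these definitions runs the whole procedure.
Setting ``RIFFTEST=echo`` and ``SEISCOMP=echo`` beforehand turns the call into
a dry run that only prints the commands it would execute.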