SFF++ library: reading and writing SFF from C++
Definition of the Stuttgart File Format

The following format definition is taken from sff.doc, which is distributed with libsff or libstuff. The SFF was invented in 1996.

Date
28/09/2007
Contents

The SFF (Stuttgart File Format) is an attempt to reconcile different demands on the way seismic data used at the Institute of Geophysics at Stuttgart University should be archived.

A single data format allows the standardization of software used to perform common tasks on the data, such as reading, writing, processing and plotting. Software has to be written only once, may be used by many people and may be kept at a single place in the computer system.

The general structure of the file format is a header block followed by one or more data blocks. Within the header and the data blocks optional blocks containing additional information are allowed. Each data block is structured as described by the GSE2.0 format. The data are compressed using second differences and are encoded into pure ASCII characters using a six bit encoding scheme (CM6) also described by the GSE2.0 format. The ASCII encoding ensures portability of the data across different operating systems and computer architectures. Moreover, it allows sending data via e-mail.
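
To illustrate the compression step named above, here is a minimal sketch of taking second differences of integer samples and undoing them again. The function names are hypothetical and not part of libsff or SFF++, and the sketch assumes that the sample before the first one is treated as zero. It does not implement the CM6 character encoding itself.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Replace each sample by its difference to the previous sample.
// Assumption: the sample before the first one counts as zero.
void difference(std::vector<std::int64_t>& x) {
    std::int64_t previous = 0;
    for (auto& value : x) {
        const std::int64_t current = value;
        value = current - previous;
        previous = current;
    }
}

// Inverse of difference(): replace each value by the running sum.
void integrate(std::vector<std::int64_t>& x) {
    std::int64_t sum = 0;
    for (auto& value : x) {
        sum += value;
        value = sum;
    }
}

// Second differences: apply the difference operator twice.
std::vector<std::int64_t> second_differences(std::vector<std::int64_t> samples) {
    difference(samples);
    difference(samples);
    return samples;
}

// Recover the original samples from their second differences.
std::vector<std::int64_t> undo_second_differences(std::vector<std::int64_t> diffs) {
    integrate(diffs);
    integrate(diffs);
    return diffs;
}
```

For a smooth series such as 100, 102, 105, 109, 114 the second differences are 100, -98, 1, 1, 1, so most values become small. For a series that alternates in sign, such as 5, -5, 5, they are 5, -15, 20, which shows how the value range can grow by up to a factor of four.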

Overall structure

The whole datafile is readable as ASCII with any text editor and is therefore transferable from any system to any other system via e-mail. You can extract valid GSE2.0 data blocks from the files by simply using a text editor to delete the additional lines.

The whole file consists of one file header block and one or more data blocks:

    File Header
    Data Block
       .
       .
       .

File Header

The File Header consists of a STAT line, which is obligatory. It may be followed by an optional FREE block and/or an optional SRCE line:

    STAT line              obligatory
    FREE block             optional
    SRCE line              optional
See also
sff::FileHeader

Data Block

Each Data Block has to start with an obligatory DAST line and a WID2 line as defined in the GSE2.0 format. These must be followed by the encoded data samples, enclosed between a DAT2 identifier and a CHK2 checksum line. These lines may be followed by an optional FREE block and/or an optional INFO line.

    DAST line              obligatory
    WID2 line              obligatory \
    DAT2 identifier        obligatory  | The GSE2.0 data block consists
    dataset                obligatory  | of these four elements.
    CHK2 line              obligatory /
    FREE block             optional
    INFO line              optional
See also
sff::TraceHeader

Definition of the elements

STAT line

This line provides general information about the data file

    position  format  contents
    1-5       a5      STAT (identifier)
    6-12      f6.2    library version
                      minor versions are counted in 0.01 steps
                      major versions are counted in 1.0 steps
    14-26     a13     timestamp of file creation time: yymmdd.hhmmss
    28-37     a10     code with a combination of two possible characters:
                      F: there follows a FREE block
                      S: there follows a SRCE line
See also
sff::STAT
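
The fixed column layout of the STAT line can be read with plain substring operations. The following is a minimal sketch under that assumption. `StatFields`, `columns` and `parse_stat` are hypothetical names chosen for illustration. They are not the interface of sff::STAT.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical container for the STAT fields (not the library's sff::STAT).
struct StatFields {
    double version;         // library version, positions 6-12
    std::string timestamp;  // yymmdd.hhmmss, positions 14-26
    bool has_free;          // code contains 'F'
    bool has_srce;          // code contains 'S'
};

// Return the 1-based, inclusive column range [first, last] of a line.
// Columns beyond the end of the line yield a shorter or empty string.
std::string columns(const std::string& line, std::size_t first, std::size_t last) {
    assert(first >= 1 && first <= last);
    if (first > line.size()) return "";
    return line.substr(first - 1, last - first + 1);
}

StatFields parse_stat(const std::string& line) {
    if (columns(line, 1, 4) != "STAT")
        throw std::invalid_argument("not a STAT line");
    StatFields fields;
    // std::stod skips leading blanks and throws if no number is present.
    fields.version = std::stod(columns(line, 6, 12));
    fields.timestamp = columns(line, 14, 26);
    const std::string code = columns(line, 28, 37);
    fields.has_free = code.find('F') != std::string::npos;
    fields.has_srce = code.find('S') != std::string::npos;
    return fields;
}
```

A reader would call `parse_stat` on the first line of the file and use `has_free` and `has_srce` to decide whether a FREE block and a SRCE line have to be read before the first data block.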

FREE block

This is a block consisting of any number of lines, each 80 characters wide.
The start of this block is indicated by a single line
containing FREE in the first 5 positions. Another line
with the same content indicates the end of the FREE block.
A FREE block may contain any information useful to
the user and has to follow no other standard than
a line length of 80 characters.
See also
sff::FREE
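
Reading a FREE block amounts to collecting every line between the two FREE marker lines. The sketch below shows one way to do this on a standard input stream. The function names are hypothetical, and treating a line as a marker only if it contains nothing but FREE and trailing blanks is an assumption made here, not a rule stated by the format.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Assumption: a marker line holds "FREE" followed only by blanks.
bool is_free_marker(std::string line) {
    const auto end = line.find_last_not_of(" \t\r");
    line.erase(end == std::string::npos ? 0 : end + 1);
    return line == "FREE";
}

// Read one FREE block. The stream must be positioned at the opening
// FREE line. Returns the lines between the two markers.
std::vector<std::string> read_free_block(std::istream& in) {
    std::string line;
    if (!std::getline(in, line) || !is_free_marker(line))
        throw std::runtime_error("expected opening FREE line");
    std::vector<std::string> content;
    while (std::getline(in, line)) {
        if (is_free_marker(line)) return content;
        content.push_back(line);
    }
    throw std::runtime_error("missing closing FREE line");
}
```

After the call the stream is positioned at the line following the closing marker, so the caller can continue with the SRCE line or the next data block.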

SRCE line

This line holds information on the source that caused the
seismic signal.

    position  format  contents
    1-5       a5      SRCE (identifier)
    6-25      a20     type of source (any string like "earthquake")
    27        a1      type of coordinate system:
                      C: cartesian
                      S: spherical
    29-43     f15.6   c1: x, latitude (see also Coordinate Specification)
    44-58     f15.6   c2: y, longitude (see also Coordinate Specification)
    59-73     f15.6   c3: z, height (see also Coordinate Specification)
    75-80     a6      date of source event: yymmdd
    82-91     a10     time of source event: hhmmss.sss
See also
sff::SRCE

DAST line

This line holds information on the actual dataset

    position  format  contents
    1-5       a5      DAST (identifier)
    7-16      i10     number of characters in encoded dataset
    18-33     e16.6   ampfac
    35-44     a10     code with a combination of three possible characters
                      indicating possible optional blocks and a following
                      further dataset:
                      F: a FREE block follows after dataset
                      I: an INFO line follows after dataset
                      D: there is another Data Block following in this file
                         (this must be the last character in code)

Number of characters: from library version 1.10 on, this field may be -1. In this case the reading program has to determine the number of characters itself by detecting the CHK2 line. This change was necessary to implement the C++ version of libsff, since it starts writing before the whole trace has been encoded.

ampfac: this is a factor used to scale the (floating point) dataset to a desirable dynamic range before converting it to Fortran integer values. After reading the dataset, decoding it and converting it to floating point, you have to multiply each sample by ampfac to get back the original values. As integer values range from -(2.**31) to (2.**31)-1, you might like to adjust the maximum integer value to 0x7FFFFFFF. This may cause problems, as the second-differences compression algorithm may increase the dynamic range of your data by a factor of four in the worst case. It is safe to adjust the largest absolute value in the dataset to (2.**23)-1, which is 0x7FFFFF.
See also
sff::DAST
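
The role of ampfac can be sketched as follows: the writer picks ampfac so that the largest absolute sample maps to (2.**23)-1, and the reader multiplies each decoded integer by ampfac. `ScaledTrace`, `scale_to_integers` and `restore` are hypothetical names. Rounding to the nearest integer and using an ampfac of 1 for an all-zero trace are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical result type: the integer samples plus the factor
// needed to get the original values back.
struct ScaledTrace {
    double ampfac;
    std::vector<std::int32_t> samples;
};

// Scale floating point data so that the largest absolute value
// becomes 0x7FFFFF, then round to integers.
ScaledTrace scale_to_integers(const std::vector<double>& data) {
    constexpr double target = 8388607.0;  // (2.**23)-1 = 0x7FFFFF
    double peak = 0.0;
    for (double value : data) peak = std::max(peak, std::abs(value));
    ScaledTrace out;
    // Assumption: an all-zero trace is written with ampfac 1.
    out.ampfac = (peak > 0.0) ? peak / target : 1.0;
    out.samples.reserve(data.size());
    for (double value : data)
        out.samples.push_back(
            static_cast<std::int32_t>(std::lround(value / out.ampfac)));
    return out;
}

// Reader side: multiply each integer sample by ampfac.
std::vector<double> restore(const ScaledTrace& trace) {
    std::vector<double> data;
    data.reserve(trace.samples.size());
    for (std::int32_t sample : trace.samples)
        data.push_back(sample * trace.ampfac);
    return data;
}
```

The restored values differ from the originals by at most half of ampfac per sample, which is the quantization error introduced by the conversion to integers.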

WID2 line

(is 132 characters wide!)

This waveform identification line holds information on the dataset as defined in GSE2.0 format.

    position  name      format            contents
    1-4       id        a4                WID2 (identifier)
    6-15      date      i4,a1,i2,a1,i2    date of first sample: yyyy/mm/dd
    17-28     time      i2,a1,i2,a1,f6.3  time of first sample: hh:mm:ss.sss
    30-34     station   a5                for a valid GSE2.0 block use ISC station code
    36-38     channel   a3                for a valid GSE2.0 block use FDSN channel designator
    40-43     auxid     a4                auxiliary identification code
    45-47     datatype  a3                must be CM6 in SFF
    49-56     samps     i8                number of samples
    58-68     samprat   f11.6             data sampling rate in Hz
    70-79     calib     e10.2             calibration factor
    81-87     calper    f7.3              calibration period where calib is valid
    89-94     instype   a6                instrument type (as defined in GSE2.0)
    96-100    hang      f5.1              horizontal orientation of sensor, measured in
                                          degrees clockwise from North (-1.0 if vertical)
    102-105   vang      f4.1              vertical orientation of sensor, measured in
                                          degrees from vertical (90.0 if horizontal)
See also
sff::WID2
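
As an illustration of the column layout, the sketch below extracts a few of the WID2 fields by position. `Wid2Fields`, `columns`, `trimmed` and `parse_wid2` are hypothetical names, not the interface of sff::WID2, and only the fields up to the sampling rate are read.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical container for a subset of the WID2 fields.
struct Wid2Fields {
    std::string date;      // yyyy/mm/dd, positions 6-15
    std::string time;      // hh:mm:ss.sss, positions 17-28
    std::string station;   // positions 30-34
    std::string channel;   // positions 36-38
    std::string datatype;  // positions 45-47
    long samples;          // positions 49-56
    double sampling_rate;  // Hz, positions 58-68
};

// Return the 1-based, inclusive column range [first, last] of a line.
std::string columns(const std::string& line, std::size_t first, std::size_t last) {
    assert(first >= 1 && first <= last);
    if (first > line.size()) return "";
    return line.substr(first - 1, last - first + 1);
}

// Remove leading and trailing blanks.
std::string trimmed(const std::string& text) {
    const auto first = text.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const auto last = text.find_last_not_of(' ');
    return text.substr(first, last - first + 1);
}

Wid2Fields parse_wid2(const std::string& line) {
    if (columns(line, 1, 4) != "WID2")
        throw std::invalid_argument("not a WID2 line");
    Wid2Fields fields;
    fields.date = columns(line, 6, 15);
    fields.time = columns(line, 17, 28);
    fields.station = trimmed(columns(line, 30, 34));
    fields.channel = trimmed(columns(line, 36, 38));
    fields.datatype = columns(line, 45, 47);
    if (fields.datatype != "CM6")
        throw std::invalid_argument("SFF requires datatype CM6");
    fields.samples = std::stol(columns(line, 49, 56));
    fields.sampling_rate = std::stod(columns(line, 58, 68));
    return fields;
}
```

The remaining fields (calib, calper, instype, hang, vang) could be read in the same way from their positions.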

DAT2 indicator

This line indicates the beginning of the encoded dataset.
The dataset follows in 80 characters wide lines.

CHK2 line

Provides a checksum for the dataset. The checksum has to be
calculated as defined in GSE2.0:
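
      integer function CHECKSUM(nsamp, idata)
c     GSE2.0 checksum of the nsamp integer samples in idata
      integer nsamp, idata(nsamp)
      integer nchecksum, data, ichecksum, modulo, i
      integer MODULO_VALUE
      parameter(MODULO_VALUE=100 000 000)
      modulo=MODULO_VALUE
c     the running sum has to start at zero
      nchecksum=0
      do 1 i=1,nsamp
        data=idata(i)
c       reduce the sample to a magnitude below modulo
        if (abs(data).ge.modulo) data=data-(data/modulo)*modulo
        nchecksum=nchecksum+data
c       keep the running sum below modulo as well
        if (abs(nchecksum).ge.modulo)
     &    nchecksum=nchecksum-(nchecksum/modulo)*modulo
    1 continue
      ichecksum=abs(nchecksum)
      CHECKSUM=ichecksum
      return
      end
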
    position  format  contents
    1-4       a4      CHK2 (identifier)
    6-13      i8      checksum
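
For readers working in C++, the Fortran function above can be transcribed as in the following sketch. The name `gse2_checksum` is hypothetical and this is not the routine used inside SFF++. It assumes that the running sum starts at zero and relies on the C++ `%` operator truncating toward zero, which matches the Fortran expression `data-(data/modulo)*modulo`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Checksum over the integer samples, following the Fortran function above.
std::int32_t gse2_checksum(const std::vector<std::int32_t>& samples) {
    constexpr std::int32_t modulo = 100000000;
    std::int32_t checksum = 0;  // assumption: the sum starts at zero
    for (std::int32_t sample : samples) {
        // Reduce the sample to a magnitude below modulo. For smaller
        // magnitudes the remainder leaves the value unchanged.
        sample %= modulo;
        // Both terms are below 1e8 in magnitude, so the sum cannot
        // overflow a 32 bit integer.
        checksum += sample;
        checksum %= modulo;
    }
    return std::abs(checksum);
}
```

A reader would compare the result for the decoded integer samples with the value found in positions 6-13 of the CHK2 line.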

INFO line

Holds additional information on the seismometer position.

    position  format  contents
    1-5       a5      INFO (identifier)
    6         a1      type of coordinate system:
                      C: cartesian
                      S: spherical
    8-22      f15.6   c1: x, latitude (see also Coordinate Specification)
    23-37     f15.6   c2: y, longitude (see also Coordinate Specification)
    38-52     f15.6   c3: z, height (see also Coordinate Specification)
    54-57     i4      number of stacks done during acquisition (a value of
                      zero and a value of one both mean a single shot)
See also
sff::INFO
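
The following sketch reads the INFO fields by column and applies the rule that a stack count of zero and a stack count of one both mean a single shot. `InfoFields`, `columns` and `parse_info` are hypothetical names, not the interface of sff::INFO.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical container for the INFO fields.
struct InfoFields {
    char coordinate_system;  // 'C' (cartesian) or 'S' (spherical)
    double c1, c2, c3;       // x/latitude, y/longitude, z/height
    int stacks;              // number of shots stacked, at least 1
};

// Return the 1-based, inclusive column range [first, last] of a line.
std::string columns(const std::string& line, std::size_t first, std::size_t last) {
    assert(first >= 1 && first <= last);
    if (first > line.size()) return "";
    return line.substr(first - 1, last - first + 1);
}

InfoFields parse_info(const std::string& line) {
    if (columns(line, 1, 4) != "INFO")
        throw std::invalid_argument("not an INFO line");
    const std::string system = columns(line, 6, 6);
    if (system != "C" && system != "S")
        throw std::invalid_argument("unknown coordinate system");
    InfoFields fields;
    fields.coordinate_system = system[0];
    fields.c1 = std::stod(columns(line, 8, 22));
    fields.c2 = std::stod(columns(line, 23, 37));
    fields.c3 = std::stod(columns(line, 38, 52));
    // Zero and one both mean a single shot, so report at least 1.
    fields.stacks = std::max(1, std::stoi(columns(line, 54, 57)));
    return fields;
}
```

Normalizing the stack count on input is a design choice of this sketch. It spares later code from handling the two equivalent values separately.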

Coordinate Specification

Notice that the given coordinates imply a spatial relation between the source location and the receiver locations. While spherical coordinates refer to a fixed reference frame on the earth, cartesian coordinates refer to an arbitrary origin. The creator of the datafile is responsible for ensuring that the coordinate information is consistent between the SRCE line and the possibly several INFO lines.

cartesian coordinates

 x, y and z are vector components in a right-handed cartesian
 reference frame. While x and y are arbitrarily oriented in the
 horizontal plane, z is counted positive upwards from an arbitrary
 reference level (preferably the free surface). All three coordinate
 values are measured in meters.

spherical coordinates

 Latitude and longitude are given in the geographical reference frame
 and measured in degrees. The height value gives the height above
 the geoid and is measured in meters.
See also
sff::SRCE, sff::INFO, SRCE line, INFO line
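
Because source and receiver coordinates share one reference frame, the source-receiver offset follows directly from the SRCE and INFO values. The sketch below covers only the cartesian case, where both lines carry the coordinate system code C. `Cartesian`, `horizontal_offset` and `height_difference` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical holder for c1, c2, c3 of a SRCE or INFO line with
// coordinate system C. All values are in meters.
struct Cartesian {
    double x;
    double y;
    double z;  // positive upwards from the reference level
};

// Horizontal distance between source and receiver in meters.
double horizontal_offset(const Cartesian& source, const Cartesian& receiver) {
    return std::hypot(receiver.x - source.x, receiver.y - source.y);
}

// Height of the receiver above the source in meters.
double height_difference(const Cartesian& source, const Cartesian& receiver) {
    return receiver.z - source.z;
}
```

The result is only meaningful if the file's creator used the same origin and axis orientation for the SRCE line and every INFO line, as required above.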