AFF --- A container for numbers (array) by Friederich and Forbriger.
Design decisions

Common interface concepts

Sparse interfaces

The class library is intended to be light-weight. This means it should offer basic functionality in terms of multidimensional containers with counted references (and nothing more in the first place). We do not want to include a tremendous amount of code for specialized concepts (like subranges or expression templates in Blitz++) each time we just need a small array. Thus the header files providing array declarations (aff/array.h and the files included therein) should be as sparse as possible. All extra functionality like iterators (aff::Iterator, presented in aff/iterator.h) or slices (aff::Slice, presented in aff/slice.h) should be external to the aff::Array class. This allows us to load their definitions only where needed. However, this approach requires that the internals of aff::Array are exposed to the outside through appropriate functions (see Accessing internals).

Member typedefs

Class templates like aff::Iterator may be used with any container class that provides an appropriate interface. This interface convention concerns access to the types of related objects. I will explain by example:

Suppose an iterator i was declared for a container of type Cont. It expects to find a corresponding container class that promises constness of the elements through

Cont::Tcontainer_of_const

or short

Cont::Tcoc

For aff::ConstArray the type aff::ConstArray::Tcoc is just the class itself. However aff::Array::Tcoc gives an aff::ConstArray.

See also
aff::SharedHeap::Tcontainer_of_const
aff::ConstSharedHeap::Tcontainer_of_const
aff::Array::Tcontainer_of_const
aff::ConstArray::Tcontainer_of_const
aff::Series::Tcontainer_of_const
aff::ConstSeries::Tcontainer_of_const
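
The convention can be sketched with two toy classes. ToyConstArray, ToyArray and read_only_view are hypothetical stand-ins invented for this illustration; they are not the real aff::ConstArray and aff::Array. Only the typedef names Tcontainer_of_const and Tcoc are taken from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical read-only container. It is its own "container of const".
template<class T>
class ToyConstArray {
  public:
    typedef ToyConstArray<T> Tcontainer_of_const;
    typedef Tcontainer_of_const Tcoc;
    explicit ToyConstArray(std::size_t n):
      Mdata(std::make_shared<std::vector<T> >(n)) { }
    const T& operator()(std::size_t i) const { return (*Mdata)[i]; }
  protected:
    std::shared_ptr<std::vector<T> > Mdata; // shared, reference counted data
};

// Hypothetical writable container. Its container of const is the base class.
template<class T>
class ToyArray: public ToyConstArray<T> {
  public:
    typedef ToyConstArray<T> Tcontainer_of_const;
    typedef Tcontainer_of_const Tcoc;
    explicit ToyArray(std::size_t n): ToyConstArray<T>(n) { }
    T& operator()(std::size_t i) const { return (*this->Mdata)[i]; }
};

// Generic code names the read-only companion type without knowing Cont.
template<class Cont>
typename Cont::Tcoc read_only_view(const Cont& c)
{
  return c; // shallow copy: the view shares the data with c
}
```

A call like read_only_view(a) for a ToyArray<int> a yields a ToyConstArray<int> that refers to the same data but offers no write access.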

In the same way we may access the appropriate element type through

Cont::Tvalue

which is T for aff::Array<T> and const T for aff::ConstArray<T>. However, a

Cont::Tconst_value

will always provide a type with const qualifier.

See also
aff::Array::Tvalue
aff::Array::Tconst_value
aff::ConstArray::Tvalue
aff::ConstArray::Tconst_value
aff::Series::Tvalue
aff::Series::Tconst_value
aff::ConstSeries::Tvalue
aff::ConstSeries::Tconst_value
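
A minimal sketch of generic code that relies on these typedefs. ToyArray, ToyConstArray and sum are hypothetical and hold their data directly for brevity (the real containers share a heap); only the typedef names Tvalue and Tconst_value follow the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical container of const elements: Tvalue is already const.
template<class T>
struct ToyConstArray {
  typedef const T Tvalue;
  typedef const T Tconst_value;
  std::vector<T> data;
  Tvalue& operator()(std::size_t i) const { return data[i]; }
  std::size_t size() const { return data.size(); }
};

// Hypothetical container of mutable elements.
template<class T>
struct ToyArray {
  typedef T Tvalue;
  typedef const T Tconst_value;
  std::vector<T> data;
  Tvalue& operator()(std::size_t i) { return data[i]; }
  Tconst_value& operator()(std::size_t i) const { return data[i]; }
  std::size_t size() const { return data.size(); }
};

// Works for both containers: element types are taken from the container.
template<class Cont>
typename std::remove_const<typename Cont::Tvalue>::type
sum(const Cont& c)
{
  typedef typename std::remove_const<typename Cont::Tvalue>::type Tresult;
  Tresult result = Tresult();
  for (std::size_t i = 0; i < c.size(); ++i)
  {
    typename Cont::Tconst_value& v = c(i); // always const qualified
    result += v;
  }
  return result;
}
```

The function never spells out int or double; if the element type of a container changes, sum follows automatically.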

In the same way we may access the type of the appropriate representation by

Cont::Trepresentation
See also
aff::Array::Trepresentation
aff::ConstArray::Trepresentation
aff::Series::Trepresentation
aff::ConstSeries::Trepresentation

Notice: Using these typedefs (and also the typedefs for the shape class, etc.) improves the maintainability of your code. Think of using the $HOME variable in shell scripts. Once the name of your home directory changes, you need not modify all your shell scripts. Now consider one day your shape class might be renamed...

Member typedef Tcontainer

Design decision:
Every class that can be converted to a container type should provide a member typedef Tcontainer and an appropriate conversion operator.
See also
aff::util::Slice
aff::util::Subarray
aff::deepcopy
Background
aff::deepcopy is a good example of a function designed to deal with any container. There may be others in the future, like global arithmetic operators or sum-reduction. Due to its generality the function template puts no restrictions on its template arguments. You may instantiate that template for any class. In some sense this is bad practice, and we have to resolve ambiguities and support type conversions. In particular, think of feeding a subarray (class aff::util::Subarray) to one of these whole-array functions (this might be one of the most interesting uses). aff::util::Subarray easily matches the template parameter, but does not offer the member functions necessary for element access.
Hence we must ensure conversion of the aff::util::Subarray to its container class. In our concept this is done within aff::deepcopy. It looks for a Tcontainer typedef in the class definitions of its arguments and converts the objects to their corresponding container class before the copy operation.
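
The mechanism may be sketched as follows. ToyArray and ToySubarray are hypothetical classes, and this deepcopy is a simplified illustration, not the implementation of aff::deepcopy. Only the Tcontainer typedef and the conversion operator follow the design decision stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical container with reference semantics.
class ToyArray {
  public:
    typedef ToyArray Tcontainer;
    ToyArray(std::size_t n, int v):
      Mdata(std::make_shared<std::vector<int> >(n, v)), Mfirst(0), Msize(n) { }
    // view on n elements of a, starting at element first
    ToyArray(const ToyArray& a, std::size_t first, std::size_t n):
      Mdata(a.Mdata), Mfirst(a.Mfirst + first), Msize(n) { }
    int& operator()(std::size_t i) const { return (*Mdata)[Mfirst + i]; }
    std::size_t size() const { return Msize; }
  private:
    std::shared_ptr<std::vector<int> > Mdata;
    std::size_t Mfirst, Msize;
};

// Hypothetical proxy: offers no element access, only the conversion.
class ToySubarray {
  public:
    typedef ToyArray Tcontainer;
    ToySubarray(const ToyArray& a, std::size_t first, std::size_t n):
      Marray(a), Mfirst(first), Msize(n) { }
    operator Tcontainer() const { return ToyArray(Marray, Mfirst, Msize); }
  private:
    ToyArray Marray;
    std::size_t Mfirst, Msize;
};

// Unrestricted template: converts both arguments to their container type.
template<class S, class C>
void deepcopy(const S& source, const C& target)
{
  typename S::Tcontainer src(source);
  typename C::Tcontainer tgt(target);
  const std::size_t n = std::min(src.size(), tgt.size());
  for (std::size_t i = 0; i < n; ++i) { tgt(i) = src(i); }
}
```

With these definitions deepcopy(b, ToySubarray(a, 1, 2)) writes into elements 1 and 2 of a, although ToySubarray itself has no access operator.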
Barton and Nackman propose another concept. Using their scheme we would introduce a general Container class,
template<class C>
class Container {
  public:
    typedef C& Tcontainer_reference;
    Container(Tcontainer_reference c): M(c) { }
    operator Tcontainer_reference() { return(M); }
  private:
    Tcontainer_reference M;
};
that takes a special container class as a template argument and initializes a member data reference to an object of this class in its constructors. We would then derive aff::Array from this class by
template<class T>
class Array: public Container<Array<T> > { };
This way any reference to a container (aff::Array, aff::Series, aff::ConstArray, etc.) can be converted to a Container class object, which again offers a conversion operator to a reference to its leaf class. Container-specific functions are then declared
template<class S, class T>
void deepcopy(const Container<S>& source, Container<T>& target);
deepcopy can then only be called for objects that are derived from Container.
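A compilable sketch of this scheme, under a few assumptions: ToySeries is a hypothetical leaf class that owns its data in a std::vector, the conversion operator is declared const so that it can be used on a const Container, and the leaf class defines its own copy constructor so that a copy refers to itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base class as in the text: stores a reference to the leaf class object.
template<class C>
class Container {
  public:
    typedef C& Tcontainer_reference;
    explicit Container(Tcontainer_reference c): M(c) { }
    operator Tcontainer_reference() const { return M; }
  private:
    Tcontainer_reference M;
};

// Hypothetical leaf class; it hands a reference to itself to its base.
class ToySeries: public Container<ToySeries> {
  public:
    explicit ToySeries(std::size_t n):
      Container<ToySeries>(*this), Mdata(n, 0) { }
    // a copy must refer to itself, not to the object it was copied from
    ToySeries(const ToySeries& other):
      Container<ToySeries>(*this), Mdata(other.Mdata) { }
    int& operator()(std::size_t i) { return Mdata[i]; }
    std::size_t size() const { return Mdata.size(); }
  private:
    std::vector<int> Mdata;
};

// Matches only classes derived from Container.
template<class S, class T>
void deepcopy(const Container<S>& source, Container<T>& target)
{
  S& src = source; // the conversion operator returns the leaf object
  T& tgt = target;
  for (std::size_t i = 0; i < src.size() && i < tgt.size(); ++i)
  { tgt(i) = src(i); }
}
```

The extra reference member in every leaf object is visible here as Container<ToySeries>::M.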
Trade-offs
The Barton and Nackman trick involves another member data field in each container class to hold the reference in the Container base class. aff::Array would have two extra member data fields, because aff::Array and aff::ConstArray both must inherit from Container. I regard this as a partial violation of our concept of sparse interfaces and small data types and discard this option.
However, our concept requires creating a full copy of at least the target container in each whole-array operation. In general this would not be necessary, since we could operate directly on the aff::Array reference passed as target of the operation.
With the Barton and Nackman trick this copy operation would only be necessary for class objects that are not directly derived from Container, such as aff::util::Subarray and companions. However, for those we would have to introduce specializations (overloaded functions) of whole-array operations that first perform the conversion (creating an aff::Array or the like) and then call the function that takes Container arguments.
Alternative
The cheapest alternative (with respect to runtime overhead in the whole-array function and in the container classes aff::Array, etc.) is to delegate the problem to aff::util::Subarray and companions. We could introduce a member data field in them of type Tarray. This would allow for a member function returning a reference to this member. There should be no runtime overhead, since every subarray must once be converted to an array to be useful (now this conversion takes place outside aff::util::Subarray). But this would involve the inconvenience to call an extra member function in Subarray, when passing to a whole-array function. The template argument type of the corresponding whole-array function remains unrestricted (totally unchecked).

Accessing internals

Providing extended functionality outside of aff::Array (see Sparse interfaces) requires that aff::Array, aff::ConstArray, aff::Series, and aff::ConstSeries expose some of their internals. This concerns the underlying shape as well as the represented data.

aff::ConstArray and aff::ConstSeries provide a read-only reference to the data (i.e. an aff::ConstSharedHeap object) through their member-functions aff::ConstArray::representation and aff::ConstSeries::representation, respectively. In the same way aff::Array and aff::Series return an aff::SharedHeap through their representation member function.

All of them return a copy of their shape through the member functions aff::Array::shape, aff::ConstArray::shape, aff::Series::shape, and aff::ConstSeries::shape, respectively. The type of the appropriate shape is available through a member typedef (see Member typedefs).

In return all containers provide a constructor that takes a representation and a shape object and checks for their consistency.
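
A minimal sketch of how an external feature can be built on these accessors alone. ToyShape, ToyHeap, ToySeries and every_other are hypothetical names; the member function names shape and representation follow the text, while the constructor signature and the form of the consistency check are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical shape: maps index i to offset + i*stride in memory.
struct ToyShape {
  std::size_t offset, size, stride;
  // largest memory index addressed by this shape (requires size > 0)
  std::size_t last_offset() const { return offset + (size - 1) * stride; }
};

// Hypothetical representation: a reference counted block of memory.
typedef std::shared_ptr<std::vector<double> > ToyHeap;

class ToySeries {
  public:
    explicit ToySeries(std::size_t n):
      Mshape{0, n, 1}, Mrep(std::make_shared<std::vector<double> >(n)) { }
    // construct from internals; checks that the shape fits the memory
    ToySeries(const ToyShape& shape, const ToyHeap& rep):
      Mshape(shape), Mrep(rep)
    {
      if (shape.size == 0 || shape.last_offset() >= rep->size())
      { throw std::out_of_range("shape does not fit representation"); }
    }
    ToyShape shape() const { return Mshape; }        // returns a copy
    ToyHeap representation() const { return Mrep; }  // shares the data
    double& operator()(std::size_t i) const
    { return (*Mrep)[Mshape.offset + i * Mshape.stride]; }
  private:
    ToyShape Mshape;
    ToyHeap Mrep;
};

// External feature: uses only shape(), representation() and the constructor.
inline ToySeries every_other(const ToySeries& s)
{
  ToyShape shape = s.shape();
  shape.size = (shape.size + 1) / 2;
  shape.stride *= 2;
  return ToySeries(shape, s.representation());
}
```

ToySeries knows nothing about every_other, and the result still shares its data with the original series.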

Decision against a base class to express common interface

This library contains different classes that provide common interfaces. For example aff::ConstArray, aff::Array, aff::Series and aff::ConstSeries all provide the necessary interface to be used together with aff::Iterator or aff::Browser. A rather elegant way to express this commonality in a template context is the Barton and Nackman trick. All containers that can work together with aff::Iterator would have to inherit from a class aff::Iteratable. The base class is templated, takes the iteratable class as template parameter and stores a reference to the instance of the iteratable class. This way each iteratable class can be converted to aff::Iteratable, which in turn returns a reference to the class's iteratable features in the appropriate context.

This way of expressing common interfaces makes the classes more complicated than necessary to provide their elementary functionality. We have to store an extra reference to the leaf class object for each feature we express this way. And we have to include a whole bunch of extra code for each feature. Since we prefer Sparse interfaces, this method was rejected.


Three classes for one container

One container class like aff::Array or aff::Series is made up from its class definition together with two other classes like the representation in aff::SharedHeap and a shape like aff::Strided or aff::LinearShape. Why not put all this functionality within one class like aff::Array?

  1. aff::Array and aff::Series are class templates because they shall be provided for any element type the user desires. Consequently a separate instantiation of the class template must be compiled for each element type in use. The code that describes the shape of the memory layout and calculates the offset into raw memory from index values is completely independent of the element type of the container. The shape's code can be compiled once for all template instantiations needed. For this reason it is efficient to provide this code in a separate class.
  2. The containers in this library use reference counted pointers to raw memory. This way they share data in memory. The default way to copy containers is a shallow copy, where only a reference is copied (see The concept of represented memory). Using a separate class aff::SharedHeap to handle these reference counted pointers allows us to share memory between containers of different types, e.g. an aff::Series<T> and an aff::Array<T>.
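
Both points can be sketched together. All names below (ToyLinearShape, ToyStrided2D, ToyHeap, ToySeries, ToyArray) are hypothetical, and std::shared_ptr merely stands in for the reference counting done by aff::SharedHeap.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Shapes know nothing about the element type, so they are ordinary
// (non-template) classes that are compiled only once.
struct ToyLinearShape {
  std::size_t offset, size;
};

struct ToyStrided2D {
  std::size_t offset, rows, cols; // column-major layout
  std::size_t index(std::size_t i, std::size_t j) const
  { return offset + i + j * rows; }
  // shape of column j, expressed as a linear shape
  ToyLinearShape column(std::size_t j) const
  { ToyLinearShape s = { index(0, j), rows }; return s; }
};

// Hypothetical shared representation: reference counted memory.
template<class T>
using ToyHeap = std::shared_ptr<std::vector<T> >;

template<class T>
class ToySeries {
  public:
    ToySeries(const ToyLinearShape& shape, const ToyHeap<T>& heap):
      Mshape(shape), Mheap(heap) { }
    T& operator()(std::size_t i) const { return (*Mheap)[Mshape.offset + i]; }
    std::size_t size() const { return Mshape.size; }
  private:
    ToyLinearShape Mshape;
    ToyHeap<T> Mheap;
};

template<class T>
class ToyArray {
  public:
    ToyArray(std::size_t rows, std::size_t cols):
      Mshape{0, rows, cols},
      Mheap(std::make_shared<std::vector<T> >(rows * cols)) { }
    T& operator()(std::size_t i, std::size_t j) const
    { return (*Mheap)[Mshape.index(i, j)]; }
    ToyStrided2D shape() const { return Mshape; }
    ToyHeap<T> representation() const { return Mheap; }
  private:
    ToyStrided2D Mshape;
    ToyHeap<T> Mheap;
};
```

A ToySeries<double> constructed from a.shape().column(1) and a.representation() views one column of the ToyArray<double> a without copying any element.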

Class hierarchy: member data vs. inheritance

Containers like aff::Array rely on functionality provided by other classes. They are based on shapes like aff::Strided and memory representations like aff::SharedHeap (see The concept of represented memory).

An array isn't a shape.
Thus it would look like better design to use shapes as member data. We prefer, however, to derive privately from the shape classes. This hides them from the outside (apart from explicit access, see Accessing internals). At the same time we make use of using declarations to provide access to member functions like aff::Strided::size() that also make sense as members of aff::Array.
An array is some kind of memory representation.
Thus it would look like proper design to derive an array from a representation class. We prefer, however, to use the memory representation as a private member. We think of the representation as an individual and independent object that can be passed (e.g.) from an aff::Array to an aff::Series. Also, due to the replication of the representation in aff::Array (see Replicated ConstSharedHeap) and the distinction between containers that allow data modification and containers that allow only read access, this leads to a clearer design. This is reflected by the conciseness of the array constructors. Using the representation class as member data should introduce no runtime overhead. The full class specification including member data is available at compile-time. This should enable compilers to do extensive inlining.
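
The two choices can be sketched side by side. ToyShape and ToySeries are hypothetical classes; only the pattern (private shape base plus using declaration, representation as private member) follows the text.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical shape class.
class ToyShape {
  public:
    explicit ToyShape(std::size_t n): Msize(n) { }
    std::size_t size() const { return Msize; }
    void resize(std::size_t n) { Msize = n; } // not re-exported below
  private:
    std::size_t Msize;
};

template<class T>
class ToySeries: private ToyShape { // the shape is hidden from the outside
  public:
    explicit ToySeries(std::size_t n):
      ToyShape(n), Mrep(std::make_shared<std::vector<T> >(n)) { }
    using ToyShape::size;                     // selected members stay visible
    ToyShape shape() const { return *this; }  // explicit access, as a copy
    std::shared_ptr<std::vector<T> > representation() const { return Mrep; }
    T& operator()(std::size_t i) const { return (*Mrep)[i]; }
  private:
    std::shared_ptr<std::vector<T> > Mrep;    // representation as member data
};

// Outside code cannot treat a ToySeries as a ToyShape.
static_assert(!std::is_convertible<ToySeries<int>*, ToyShape*>::value,
              "the shape base is private");
```

Calling size() on a ToySeries works through the using declaration, whereas resize() is not reachable from outside.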

Replicated ConstSharedHeap

Design decision

aff::Array has a member of type aff::SharedHeap (which is exposed to the outside through aff::Array::representation), which itself inherits from aff::ConstSharedHeap. At the same time aff::Array inherits from aff::ConstArray, which in turn contains an aff::ConstSharedHeap (exposed to the outside through aff::ConstArray::representation). Thus the class aff::ConstSharedHeap is replicated in aff::Array, and this replication is not avoided by means of a virtual base class.

The same applies to aff::Series and aff::ConstSeries.

Where is the problem?

Given an array object a of type aff::Array<T>, where T is any type, we want to pass this object to a function that promises constness of the elements (see Notes on the const-correctness of arrays). The function is thus declared

void func(const aff::ConstArray<T>&);

and we want to use it like

func(a)

Consequently we must offer a way to convert an aff::Array<T> to an aff::ConstArray<T> implicitly. This is done by deriving aff::Array<T> publicly from aff::ConstArray<T>.

The memory representation is needed by both, aff::Array<T> and its base class. Hence aff::ConstArray<T> has to inherit from the representation. It would be natural for aff::ConstArray<T> to inherit from aff::ConstSharedHeap only. However, since the derived aff::Array<T> needs full access to an aff::SharedHeap<T> (to expose the representation to the outside), we might tend to derive aff::ConstArray<T> from aff::SharedHeap<T> privately, allowing only read access and conversion to aff::ConstSharedHeap.

Why is this a problem? Consider the inside of the above function. We might know, that the columns of the passed array contain seismogram waveforms. And we might like to access them in an appropriate way (i.e. through an interface that provides waveform operations), though just reading - not modifying - the data. Then we would like to code something like

template<class T>
void func(const aff::ConstArray<T>& a)
{
  // cycle all seismograms
  for (Tsubscript iseis=a.f(1); iseis<=a.l(1); iseis++)
  {
    // extract shape
    aff::Strided shape(a.shape());
    // collapse to waveform iseis
    shape.collapse(1,iseis);
    // create a time series
    // (declaration line inferred from the explanation below)
    aff::ConstSeries<T> waveform(a.representation(),
                                 shape.size(), shape.first_offset());
    // operate on time series (e.g. recursive filter)
    some_waveform_operation(waveform);
  }
}

The above example requires that we can construct an aff::ConstSeries<T> from an aff::ConstSharedHeap<T> (which is returned by aff::ConstArray::representation). The same problem appears together with aff::ConstArray, when creating a subarray or slice from an aff::ConstArray with aff::subarray or aff::slice and aff::ConstArray itself knowing nothing about slices, etc.

Constructing aff::ConstArray from an aff::ConstSharedHeap sounds like a natural operation. However, aff::ConstArray will ask for an aff::SharedHeap, if we derive from aff::SharedHeap (as sketched above). Conclusion: aff::ConstArray must use an aff::ConstSharedHeap only. At the same time we must hold the full aff::SharedHeap together with the aff::Array object, since this must return an aff::SharedHeap through aff::Array::representation to allow the above operation (accessing data through aff::Series or constructing a slice, see Sparse interfaces).

Solution

The most convincing solution (IMHO) to this problem is to use an (additional) member of type aff::SharedHeap<T> in aff::Array<T>, which inherits from aff::ConstArray<T>. In consequence aff::ConstSharedHeap<T> is then replicated within aff::Array<T>. For a proper design we might consider making aff::ConstSharedHeap a virtual base, thus avoiding member data duplication. This would, however, introduce an extra level of indirection (in addition to the indirection when accessing the heap data through the pointer to the aff::util::SHeap struct in aff::ConstSharedHeap). On the other hand, fully replicating the base aff::ConstSharedHeap just adds one member data pointer (the pointer to the aff::util::SHeap struct) to the data block in aff::Array (which already contains many bytes from the aff::Strided base). This overhead is not considered significant.

But notice: We now must take care to synchronize the aff::SharedHeap base of aff::Array and the aff::ConstSharedHeap base of aff::ConstArray during construction. This is no major concern, but it is error-prone to some degree. It is, however, much easier to keep them synchronous when using member data instead of inheritance.
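
The resulting layout may be sketched as follows. The Toy classes are hypothetical and use std::shared_ptr in place of the pointer to aff::util::SHeap; the sketch only illustrates the replication of the read-only handle and its synchronization in the constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical read-only handle to shared memory.
template<class T>
class ToyConstHeap {
  public:
    explicit ToyConstHeap(std::size_t n):
      Mdata(std::make_shared<std::vector<T> >(n)) { }
    const T& operator[](std::size_t i) const { return (*Mdata)[i]; }
    std::size_t size() const { return Mdata->size(); }
  protected:
    std::shared_ptr<std::vector<T> > Mdata;
};

// Hypothetical read-write handle; derived from the read-only handle.
template<class T>
class ToyHeap: public ToyConstHeap<T> {
  public:
    explicit ToyHeap(std::size_t n): ToyConstHeap<T>(n) { }
    T& operator[](std::size_t i) const { return (*this->Mdata)[i]; }
};

template<class T>
class ToyConstArray {
  public:
    explicit ToyConstArray(const ToyConstHeap<T>& rep): Mrep(rep) { }
    const T& operator()(std::size_t i) const { return Mrep[i]; }
    const ToyConstHeap<T>& representation() const { return Mrep; }
  private:
    ToyConstHeap<T> Mrep; // read-only handle
};

template<class T>
class ToyArray: public ToyConstArray<T> {
  public:
    // both handles are initialized from the same heap to keep them in sync
    explicit ToyArray(const ToyHeap<T>& rep):
      ToyConstArray<T>(rep), Mrep(rep) { }
    explicit ToyArray(std::size_t n): ToyArray(ToyHeap<T>(n)) { }
    T& operator()(std::size_t i) const { return Mrep[i]; }
    const ToyHeap<T>& representation() const { return Mrep; }
  private:
    ToyHeap<T> Mrep; // replicates the handle held by the base class
};

// Read-only function; accepts a ToyArray through the public base.
template<class T>
T sum(const ToyConstArray<T>& a)
{
  T result = T();
  for (std::size_t i = 0; i < a.representation().size(); ++i)
  { result += a(i); }
  return result;
}
```

The cost of the replication is one additional handle per ToyArray object; no virtual base and no extra indirection is involved.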


Copy constructor and copy operator

Usually we would expect the copy operator and the copy constructor to have the same semantics. Here the copy constructor of aff::Array must have reference semantics (it does a shallow copy). This is necessary to allow arrays as return values from functions. In this case the copy constructor is automatically invoked. Reference semantics ensure a minimal overhead in terms of memory usage and execution time.

In the case of the copy (assignment) operator things are less clear: If we define the copy operator to have reference semantics, it has the same behaviour as the copy constructor. That is what we usually would expect. An expression like

A=B;

means that array A drops its reference to the memory location it was pointing to and forgets its previous shape. Following this statement array A will refer to the same memory location as array B and will have the same shape. Both are indistinguishable.

However, in many cases (most cases?) we will use the copy (assignment) operator in the sense of a mathematical equation. This may read like

A=B+C;

although expressions like this are not yet supported by the library features. In this case we do not mean that A should drop its reference. A may refer to an array in memory which is also referred to by other array instances. And we want these values to be set to the result of the operation B + C. In that case the copy operator should have deep copy semantics.

Design decision
The classes aff::Array and aff::Series provide copy (assignment) operators with shallow copy semantics. The automatically created copy constructor and copy operator do exactly the right thing for this. This is sensible, because we are not offering mathematical array operations. These operations may be delegated to a wrapper class in the future, which then also may provide expression templates and an appropriate assignment operator.
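
The difference between the two semantics can be sketched with a toy class. ToyArray, shares_with and this deepcopy are hypothetical; std::shared_ptr stands in for the reference counted representation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical array with reference semantics.
template<class T>
class ToyArray {
  public:
    explicit ToyArray(std::size_t n, T v = T()):
      Mdata(std::make_shared<std::vector<T> >(n, v)) { }
    // The compiler-generated copy constructor and copy operator copy the
    // handle only, which is exactly the shallow copy we want.
    T& operator()(std::size_t i) const { return (*Mdata)[i]; }
    std::size_t size() const { return Mdata->size(); }
    bool shares_with(const ToyArray& other) const
    { return Mdata == other.Mdata; }
  private:
    std::shared_ptr<std::vector<T> > Mdata;
};

// Deep copy as a separate function: writes into the memory the target
// refers to, so every other handle to that memory sees the new values.
template<class T>
void deepcopy(const ToyArray<T>& source, const ToyArray<T>& target)
{
  for (std::size_t i = 0; i < source.size() && i < target.size(); ++i)
  { target(i) = source(i); }
}
```

After A=B both arrays refer to the same memory and the data A referred to before is untouched. After deepcopy(B, C) the arrays B and C still refer to different memory, but all arrays that share data with C see the values of B.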

Namespaces

We group all code in two namespaces. Major modules which will be accessed by the user are placed in namespace aff. Modules meant to be used internally are placed in aff::util. Use directives like

using namespace aff;

or

using aff::Array;

for convenient access.


Binary library

Note
The option to provide precompiled templates is finally removed from the library.
Date
10/11/2010

Multidimensional arrays

Todo:
Explain Wolfgang's idea of multidimensional arrays.

Notes on the const-correctness of arrays

Where is the problem?

When passing a container (i.e. an array) to a function, we would like to promise that the values in the container are not modified, in case the function uses only read-access. Consider a declaration

void func(const int& v)

of a function that takes an argument of type int and promises that this will not be modified. Passing by reference is used, because this is more efficient than passing by value (in particular for large objects, which is not the case for int, but for an array). And qualifying the type const promises that the value passed by reference will not be changed.

A declaration

void func(const Array<int>& v)

does not do what we want (see General considerations). It just promises the constness of the container, not of the data. Within the function the passed reference may be assigned to a non-const Array<int>, which allows modification of the data (see The concept of represented memory).
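
The loophole is easy to demonstrate with a toy class. ToyArray is hypothetical; it has reference semantics through std::shared_ptr and const-qualified access functions, and func is the function under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical array with reference semantics and const-aware access.
class ToyArray {
  public:
    explicit ToyArray(std::size_t n):
      Mdata(std::make_shared<std::vector<int> >(n, 0)) { }
    int& operator()(std::size_t i) { return (*Mdata)[i]; }
    const int& operator()(std::size_t i) const { return (*Mdata)[i]; }
  private:
    std::shared_ptr<std::vector<int> > Mdata;
};

// The signature seems to promise read-only access ...
inline void func(const ToyArray& v)
{
  // v(0) = 42;      // would not compile: v is const
  ToyArray copy(v);  // ... but a non-const copy shares the same data
  copy(0) = 42;      // and modifies what the caller sees
}
```

After calling func(a) the first element of a is 42, although a was passed by const reference.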

Thus we must use something like

void func(const ConstArray<int>& v)

where ConstArray<int> does not allow modification of the data (by no means, copying and conversions included) and may be derived from an Array<int> by a trivial conversion (like a conversion to a public base class).

The approach used in this library

We distinguish between the constness of the array and the constness of the elements. A definition

aff::Array<int> A(12,12);
const aff::Array<int> B(A);

means that array B is a constant array initialized to array A. This means, that the container is constant. Its shape and reference may not be changed.

If you want to define constness of the contained values (e.g. when passing an array to a function), you have to use a container of type aff::ConstArray<int>. Let C be such a container, initialized from A. This defines that the contents of C may not be changed (i.e. they are of type const int). They still refer to the same data in memory. If you modify data elements through A, this will be visible through C.

An array for elements of type T is derived from an array for elements of type const T. Functions that only need read access to arrays should be declared like

void func(const aff::ConstArray<int>& array);

and may be called like

aff::Array<int> A(12,12);
func(A);

The type conversion from aff::Array<int> to aff::ConstArray<int> is trivial and has no runtime overhead.

Each container class must deal with this issue on its own. Sorry...
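
A minimal sketch of this approach, using hypothetical classes ToyConstArray and ToyArray (not the real aff classes) and std::shared_ptr for the shared data:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical container of const elements: offers read access only.
class ToyConstArray {
  public:
    explicit ToyConstArray(std::size_t n):
      Mdata(std::make_shared<std::vector<int> >(n, 0)) { }
    const int& operator()(std::size_t i) const { return (*Mdata)[i]; }
    std::size_t size() const { return Mdata->size(); }
  protected:
    std::shared_ptr<std::vector<int> > Mdata;
};

// Hypothetical container of mutable elements, publicly derived from the
// container of const elements.
class ToyArray: public ToyConstArray {
  public:
    explicit ToyArray(std::size_t n): ToyConstArray(n) { }
    int& operator()(std::size_t i) const { return (*Mdata)[i]; }
};

// There is no way back from the read-only container to the writable one.
static_assert(!std::is_constructible<ToyArray, ToyConstArray>::value,
              "a ToyConstArray cannot be turned into a ToyArray");

// Read-only function: even a copy made inside remains read-only.
inline int sum(const ToyConstArray& array)
{
  ToyConstArray copy(array);
  int result = 0;
  for (std::size_t i = 0; i < copy.size(); ++i) { result += copy(i); }
  return result;
}
```

A ToyArray can be passed to sum directly; the conversion is a plain reference conversion to a public base class.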

See also
aff::ConstSharedHeap
aff::ConstArray
aff::ConstSeries
Restrictions for containers with const qualifier
In 7/2005 we changed the design decision of not allowing data modification through containers that are declared const. Strictly distinguishing between constness of the container and constness of the contained data allows modifying data through an object c that was declared
const Array<int> c;
The containers in this library (aff::Array, etc.) allow data modification through instances declared const. This may appear surprising to users of the library. However, since it is possible to create a copy of a const container at any place and modifying the data through this copy, we would regard a different behaviour as a false promise.

To ensure true constness of the data, you have to assign to the base class of the container. Any container class (e.g. Cont) provides the type of container for const elements through a typedef Tcontainer_of_const (i.e. Cont::Tcontainer_of_const) or short Tcoc. Remember that a const aff::Array always may be assigned to a mutable aff::Array, which in turn allows modification of the data!

Alternatives

Three alternatives to this concept have been discussed (and discarded). They have the appealing property of needing only one class definition for each container (in contrast to a class and a base class in our case). Additionally they would offer name commonality for containers of non-const elements and containers of const elements.

Using arrays with element type const T
A rather straight approach is to use the element type const T where an array of elements of type T should be used, that we do not allow to be changed. This design concept can be accomplished with a special traits class that is specialized for const T and allows to derive a mutable or const version of any type. By further providing appropriate conversion operators, an
Array<T>
could be converted to an
Array<const T>,
both sharing the same elements in memory. In this approach, however, both container classes are completely independent (although having the same name) due to their different template arguments. The conversion to the container for const elements is not a trivial conversion (like converting a reference to a reference to a public base class) and must be done explicitly. That's inconvenient for the most common use (i.e. passing a container to a function).
Deriving from a template specialization
The name commonality could still be achieved by deriving the Array<T> from template specialization Array<const T>. In this case the specialization must be used as a base class before it is actually defined. That's improper design.
Ensuring constness of elements through const qualifier of functions
We could strictly follow the concept (as we do anyway to some extent) to couple the constness of the container to the constness of the contained data. This is done by const qualifiers to member functions that allow modification of the data. To avoid pitfalls, we have to consider copy operators and copy constructors then too. Both must not promise const-ness to their arguments. While this works in principle, we would end up with a container class which doesn't allow copies of const instances. Hence we could not return a container from a function, that ensures that the accessed data cannot be modified.

General considerations

Arrays using the shared heap representation have reference semantics. Different instances will access the same data in memory. Copying an array instance just copies a reference to the data. This mechanism is not obvious to the compiler. The array instances are values in the sense of C++ types and not references. Passing a const aff::Array to a function does not prohibit the function from assigning this instance to a non-const aff::Array, which then references the same memory area and allows the modification of the values contained in the array.

Generally it has to be defined what is meant by declaring an array instance to be const. In the first place this means constness of the container to the compiler. The compiler will ensure that the container (array class) is not changed, thus no data member of the array is changed. This means that the array will keep the reference to the same data and that the index-mapping defined by the array shape may not be changed. However, the compiler will not prevent changes to any data the array refers to.

We may define access operators that have a non-const version that returns a reference to the data, allowing the data values to be changed, together with a const version that returns a value or const reference, thus preventing the data from being changed through an instance that is declared const. However, the compiler will always allow the creation of a non-const copy of a const array instance. In the sense of const-ness of C++ data, this does not violate the const-ness of the source of the copy. The shape of the original array may not be changed. Only the shape of the copy may be changed. But the data of the original array may now be changed through the copied instance, since our array classes implicitly have reference semantics. Thus we have to distinguish between const-ness of the container (array class instance) and the contained data (values in memory the array refers to).

In this library we will not provide a const and a non-const version of the array classes. With templated code it is more convenient to use an array with element type const T as the const version of an array with element type T. To allow conversion of an instance with element type T to an instance of type const T, we use the version for elements of type const T as a base class.

  • The need of const-correctness is discussed in "Chapter 1 Introduction, C++ Conventions, Implementation of Vector and Matrix Classes" of "Numerical Recipes in C++". A link to a PDF file of this chapter is provided at "http://www.nr.com/cpp-blurb.html".
  • The "C++ FAQ Lite" discusses many aspects of const-correctness in Chapter 18, which you find at "http://www.inf.uni-konstanz.de/~kuehl/cpp/cppfaq.htm/const-correctness.html".
  • You may find my thoughts about const-correctness with containers that have reference semantics at "http://www.geophysik.uni-frankfurt.de/~forbrig/txt/cxx/tutorial/consthandle.doc/doc/html/index.html".