Field3D: An Open Source File Format for Voxel Data
Magnus Wrenninge
Sony Pictures Imageworks
Contents

Revision history
Introduction
Concepts and conventions
  Extents and data window
  Mappings and coordinate spaces
  Integer versus floating-point coordinates
System components
  The Field class hierarchy
  Concrete Field classes
  Metadata
  Virtual and non-virtual access to voxels
  Iterators
  Interpolation
  Field3DFile
  The use of HDF5
  Structure of a Field3D file
Examples of usage
  Manipulating voxels
  Manipulating mappings
  Mappings and interpolation
  Creating a field, filling it with data and writing it to disk
  Writing multiple fields to one file
  Reading fields from disk
  More examples
Programming guide
  Coding standards
  Typedefs
  Use of namespaces
  When does the namespace change?
  Use of RTTI
  Use of exceptions
Extending the system
  Extending file I/O for new classes
Future work
  Bridge classes
  Reading of partial fields
  Access to partial fields
  Bounds-free fields
Frequently asked questions
Credits
Revision history
Aug 3, 2009 First draft. Still missing documentation of concrete field classes and system extension.
Introduction
Field3D is an open source library for storing voxel data. It provides C++ classes that handle storage in memory, as well as a file format based on HDF5 that allows the C++ objects to easily be written to and read from disk.

The library was initially developed at Sony Pictures Imageworks as a replacement for the three different in-house file formats already used to store voxel data. It is the foundation for Imageworks' in-house simulation framework and volume rendering software and is actively being used in production.

Field3D comes with most of the nuts and bolts needed to write something like a fluid simulator or volume renderer, but it doesn't force the programmer to use those. While the supplied C++ classes map directly to the underlying file format, it is still possible to extend the class hierarchy with new classes if needed. In the future, it will also be possible to read and write arbitrary 3D data structures, unrelated to Field3D, to the same .f3d files as the built-in types.

The library has a number of features that make it a generally usable format for storing voxel-based data:

Multiple types of data structures
The library ships with data structures for dense, sparse (allocate-on-write blocked arrays) and MAC fields, and each is stored natively in the .f3d file format.

Arbitrary number of fields per file
A Field3D file can contain any number of fields. The file I/O interface makes it easy to extract all fields from a file, or individual ones by referencing their name or the attribute they represent. HDF5 ensures that only the bytes needed for a particular layer are read, allowing quick access to small fields even if they are part of larger files.

Support for multiple bit depths
The included C++ data structures are templated, allowing data to be stored using 16, 32 or 64 bits as needed. Bit depths may be mixed arbitrarily when writing fields to disk.

Arbitrary mappings
Field3D ships with a single mapping/transform type: a 4x4 matrix transformation. Other, arbitrary transformations can be supported by extending the Mapping class hierarchy.

Storage of additional data
It's often necessary to store more information about a field than just the voxel data. To address this, each field can store an arbitrary number of key-value pairs as metadata.

Heterogeneous storage
A Field3D file may contain a mix of all of the above features. There is no requirement to use a single data structure, bit depth, resolution or mapping for all fields in a file.

Extendable via plugins
The field types, mappings and their respective I/O routines can all be extended, either by adding more data structures directly to the library or by writing plugins.

Proven back-end technology
A Field3D file is really a convention for storing voxel data using the HDF5 file format. HDF5 handles the reading and writing of actual bytes to disk. It is used extensively by organisations such as NASA for storing simulation data and other gigantic data sets. More information can be found at http://www.hdfgroup.org/HDF5/.

Data compression
Field3D can use any compression algorithm used by HDF5. Currently, Field3D compresses all data using gzip.

Open format
Though a Field3D (.f3d) file is easiest to read using the supplied I/O classes, it is still a standard HDF5 file, and can be read using the HDF5 libraries directly if needed.
[Figure: extents and data window of a field]
Using separate extents and data windows can be helpful for image processing (a blur filter could run on a padded version of the field so that no boundary conditions need to be handled), interpolation (guarantees that a voxel has neighbors to interpolate to, even at the edge of the field) or for optimizing memory use (only allocate memory for the voxels needed).
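The padding case mentioned above can be sketched in a few lines. This is only an illustration in one dimension: Range and padded() are hypothetical names rather than Field3D types, and the sketch assumes that the extents describe the nominal size of the field while the data window describes the voxels that are actually allocated.

```cpp
#include <cassert>

// Hypothetical 1D stand-in for an inclusive integer bounding box.
struct Range
{
  int min;
  int max;
  // Number of voxels covered by the range.
  int size() const { return max - min + 1; }
  // True if voxel index i lies inside the range.
  bool contains(int i) const { return i >= min && i <= max; }
};

// Returns a copy of r grown by pad voxels on each side.
inline Range padded(const Range &r, int pad)
{
  assert(pad >= 0);
  Range result = { r.min - pad, r.max + pad };
  return result;
}
```

With extents of [0, 9] and a data window of padded(extents, 1), a three-tap filter that reads voxels i - 1 and i + 1 for every i in the extents never leaves allocated memory, so no boundary handling is needed.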
A mapping only places the field in world space, and preserves that placement regardless of the resolution of the underlying field. This helps simplify the writing of resolution-independent code, such as defining fields of different resolution that occupy the same space.

There are three main coordinate systems used in Field3D:

World space
Local space
Voxel space

World space is the global coordinate system and exists outside of any Field3D field. Local space is a resolution-independent coordinate system that maps the full extents of the field to a [0,1] space. Voxel space is used for indexing into the underlying voxels of a field. A field with 100 voxels along the x axis maps [0.0,100.0] in voxel space to [0.0,1.0] in local space.

The reason for keeping two object-space coordinate systems (local and voxel) is that local space is resolution independent, which makes it easier to deal with overlapping fields that cover the same space but are of different resolution. The mapping knows its field's resolution in order to provide local-to-voxel space transformations, but its transformation into world space (i.e. the field's position in space) stays the same after the resolution of the field changes. For this reason, the Mapping base class implements a localToWorld transform directly.

The diagram below illustrates the coordinate systems used (in 2D for purposes of clarity).
[Figure: the world, local and voxel coordinate systems in 2D. Labels: local space [0,1]; world origin.]
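The relationship between the three spaces can be sketched in one dimension. Mapping1D below is a hypothetical stand-in, not the Field3D Mapping class, and it assumes that the extents start at voxel 0 and that the placement in world space is a simple offset and scale.

```cpp
#include <cassert>

// Hypothetical 1D mapping; not the Field3D Mapping class.
struct Mapping1D
{
  double resolution;   // number of voxels along the axis
  double worldOrigin;  // world-space position of local coordinate 0
  double worldSize;    // world-space length covered by local [0,1]

  // Local <-> voxel depends on the resolution of the field.
  double localToVoxel(double lsP) const { return lsP * resolution; }
  double voxelToLocal(double vsP) const
  {
    assert(resolution > 0.0);
    return vsP / resolution;
  }
  // Local <-> world does not depend on the resolution.
  double localToWorld(double lsP) const
  {
    return worldOrigin + lsP * worldSize;
  }
  double worldToLocal(double wsP) const
  {
    assert(worldSize != 0.0);
    return (wsP - worldOrigin) / worldSize;
  }
  // World -> voxel goes through local space.
  double worldToVoxel(double wsP) const
  {
    return localToVoxel(worldToLocal(wsP));
  }
};
```

Two mappings that share worldOrigin and worldSize but differ in resolution describe fields that occupy the same region of world space: the same world-space point lands on voxel coordinate 50 in a 100-voxel field and on 100 in a 200-voxel field.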
System components
The Field class hierarchy
Field objects belong to a class hierarchy that is broken down by the major tasks performed. The base classes help simplify generic programming: because the concrete classes are templated, the base classes keep as much of the generic information about a field as possible separate from the parts that depend on the Field's template type.
FieldBase
  FieldRes
    Field<T>
      ProceduralField<T>
      WritableField<T>
        ResizableField<T>
          DenseField<T>
          SparseField<T>
          MACField<T>
FieldBase
This class provides run-time class information to the library, along with metadata and other generic attributes of a field.

Field name and attribute
Each FieldBase object has public string data members for its name (indicating its association or perhaps purpose) and for its attribute name (indicating whether it is used to store velocity, levelset, density, etc.). These can be changed at will without touching the data, mapping or metadata of the field.

ClassFactory registration
className() is used to register new field types with the ClassFactory. Non-static instantiation of fields is not used very often, but it comes into play when reading ProceduralField objects from disk.
Metadata
Each field can store an arbitrary number of key-value pairs in the metadata section. Normally, this is passive information about a field, but ProceduralFields often use it as a way of passing in parameters. metadataChanged() can be implemented by subclasses that need notification of changes.

Reference counting
Field3D uses boost::intrusive_ptr for reference counting. Each Field class normally defines a Ptr typedef that refers to boost::intrusive_ptr<FieldType>. This helps keep syntax brief. boost::mutex is used for thread safety when incrementing and decrementing the reference count.

RTTI
Field3D provides field_dynamic_cast, a version of dynamic_cast that is safe when objects cross a shared library boundary. See the Use of RTTI section in the programming guide below.

FieldRes
Information about the field's dimensions and mapping in space is handled by FieldRes. Refer to the sections above on "Extents and data window" and "Mappings and coordinate spaces" for more detail. It should be noted that although this class holds the resolution of the field, it does not allow resizing. That is handled by ResizableField.

Field<Data_T>
Templating on the data type is introduced by Field<Data_T>. The class is responsible for all const access to the field. The reason for keeping non-const and const access in separate base classes is that certain types of fields (most notably ProceduralFields) don't have non-const access to data.

Field<Data_T> also provides the const_iterator class, which is used to loop over the voxels in the field's data window. These STL-like objects are compatible with some (but not all) standard library algorithms. For example, std::fill works well but std::sort obviously doesn't. Also see the Iterators section below. Iterators are created with cbegin() and cend(), which also exist in the form cbegin(Box3i subset). The bounding box version is used for iterating over only a subset of the field.

WritableField<Data_T>
This class adds non-const access to data in the same fashion as Field<Data_T> does for const access. It provides a virtual lvalue(), an iterator class, begin() and end().

ResizableField<Data_T>
The resizing of fields is handled by ResizableField<Data_T>. It provides a few different versions of setSize(), and each time the field is resized it also updates the Mapping object to reflect the new resolution.
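The split between const and non-const access can be sketched with a much smaller hierarchy. The classes below are hypothetical one-dimensional stand-ins that only mirror the idea described above; they are not the Field3D classes and omit everything except value() and lvalue().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Read-only interface: every field can return the value of a voxel.
template <class Data_T>
class ReadableSketch
{
public:
  virtual ~ReadableSketch() { }
  virtual Data_T value(int i) const = 0;
};

// Adds non-const access for fields that actually store voxels.
template <class Data_T>
class WritableSketch : public ReadableSketch<Data_T>
{
public:
  virtual Data_T& lvalue(int i) = 0;
};

// A stored field implements both value() and lvalue().
template <class Data_T>
class StoredSketch : public WritableSketch<Data_T>
{
public:
  explicit StoredSketch(std::size_t size) : m_data(size) { }
  virtual Data_T value(int i) const { return m_data[checked(i)]; }
  virtual Data_T& lvalue(int i) { return m_data[checked(i)]; }
private:
  // Converts a voxel index to an array index after a bounds check.
  std::size_t checked(int i) const
  {
    assert(i >= 0 && static_cast<std::size_t>(i) < m_data.size());
    return static_cast<std::size_t>(i);
  }
  std::vector<Data_T> m_data;
};

// A procedural field computes values on demand, so it has no lvalue().
class RampSketch : public ReadableSketch<float>
{
public:
  virtual float value(int i) const { return 0.5f * static_cast<float>(i); }
};
```

Code that only reads voxels can accept a const ReadableSketch<float>& and work with both kinds of field, while an attempt to write to a RampSketch fails at compile time because the class has no lvalue().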
ProceduralField<Data_T>
To be written
Metadata
Each layer in a Field3D file can contain its own set of metadata. Metadata is represented as a collection of key-value pairs and is stored in the FieldBase class. There is a fixed set of types supported:

std::string
int
float
Vec3<int>
Vec3<float>

On disk, each metadata item becomes its own Attribute in the HDF5 file.
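A fixed set of value types can be sketched as one string-keyed map per type. The text above does not say how FieldBase stores its metadata, so the layout below is an assumption, and MetadataSketch, Vec3iSketch, Vec3fSketch and intOr() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-ins for Vec3<int> and Vec3<float>.
struct Vec3iSketch { int x, y, z; };
struct Vec3fSketch { float x, y, z; };

// One string-keyed map per supported metadata type.
struct MetadataSketch
{
  std::map<std::string, std::string> strings;
  std::map<std::string, int> ints;
  std::map<std::string, float> floats;
  std::map<std::string, Vec3iSketch> intVecs;
  std::map<std::string, Vec3fSketch> floatVecs;

  // Total number of key-value pairs across all types.
  std::size_t size() const
  {
    return strings.size() + ints.size() + floats.size() +
      intVecs.size() + floatVecs.size();
  }
};

// Returns the int stored under key, or fallback when the key is absent.
inline int intOr(const MetadataSketch &metadata, const std::string &key,
                 int fallback)
{
  std::map<std::string, int>::const_iterator i = metadata.ints.find(key);
  return i == metadata.ints.end() ? fallback : i->second;
}
```

Keeping one map per type means a lookup never needs a type tag, at the cost of one accessor per type.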
accesses voxels using the virtual value() call and thus isn't optimal in terms of speed. Just as with the fastValue() call, concrete classes may also implement their own version of const_iterator which either uses fastValue() or keeps a raw pointer to its data.
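The trade-off between the two kinds of access can be sketched as follows. AccessBase, DenseArray, sumGeneric() and sumFast() are hypothetical names; only value() and fastValue() are taken from the text, and their signatures here are simplified to a single index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Generic interface: voxel access goes through a virtual call.
class AccessBase
{
public:
  virtual ~AccessBase() { }
  virtual float value(int i) const = 0;
  virtual int size() const = 0;
};

// Concrete class: also offers a non-virtual accessor.
class DenseArray : public AccessBase
{
public:
  explicit DenseArray(int size)
    : m_data(static_cast<std::size_t>(size), 1.0f)
  { }
  virtual float value(int i) const { return fastValue(i); }
  virtual int size() const { return static_cast<int>(m_data.size()); }
  // Non-virtual, so the compiler is free to inline it in tight loops.
  float fastValue(int i) const
  {
    assert(i >= 0 && i < size());
    return m_data[static_cast<std::size_t>(i)];
  }
private:
  std::vector<float> m_data;
};

// Works for any field, but pays for one virtual call per voxel.
inline float sumGeneric(const AccessBase &field)
{
  float sum = 0.0f;
  for (int i = 0; i < field.size(); ++i) {
    sum += field.value(i);
  }
  return sum;
}

// Requires the concrete type, but reads voxels without virtual dispatch.
inline float sumFast(const DenseArray &field)
{
  float sum = 0.0f;
  const int size = field.size();
  for (int i = 0; i < size; ++i) {
    sum += field.fastValue(i);
  }
  return sum;
}
```

Both functions return the same result; the choice is between code that works for every field type and code that is tied to one data structure but gives the compiler more room to optimize.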
Iterators
Field3D provides STL-like iterators for looping over the voxels in a field. These are more efficient than nested loops over i, j, k since they don't require calculating which memory location each new voxel index points to. For DenseField, a loop using an iterator is very close to the speed of iterating over a std::vector or even using pointer arithmetic on a float*. Access to the current voxel index is still available through the iterator's x, y and z member variables, if needed.

Some classes provide nonstandard iterator types. For example, SparseField has a block_iterator which returns a Box3i for each block in the field. This bounding box can then be used as a subset when creating iterators from the field itself. MACField has a mac_comp_iterator which simplifies iteration over each MAC component for a given region of voxels.

A simple iterator loop example:
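    DenseFieldf::Ptr field(new DenseFieldf);
    field->setSize(V3i(10, 20, 30));
    // field is a smart pointer, so begin() and end() are reached through ->
    for (DenseFieldf::iterator i = field->begin(); i != field->end(); ++i) {
      // Each iterator exposes the index of its current voxel as x, y, z
      *i = static_cast<float>(i.x + i.y + i.z);
    }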
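Why such an iterator can be cheaper than index-based access is easiest to see in a stripped-down version. GridSketch below is a hypothetical dense grid, not DenseField, and it assumes that x varies fastest in memory; the iterator advances a raw pointer and keeps x, y, z up to date with simple counters instead of recomputing a memory offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense grid with x varying fastest in memory.
class GridSketch
{
public:
  GridSketch(int nx, int ny, int nz)
    : m_nx(nx), m_ny(ny),
      m_data(static_cast<std::size_t>(nx * ny * nz), 0.0f)
  { }

  // Index-based access: recomputes the memory offset on every call.
  float& at(int i, int j, int k)
  {
    return m_data[static_cast<std::size_t>(i + m_nx * (j + m_ny * k))];
  }

  class iterator
  {
  public:
    iterator(float *p, int nx, int ny)
      : x(0), y(0), z(0), m_p(p), m_nx(nx), m_ny(ny)
    { }
    float& operator*() const { return *m_p; }
    bool operator!=(const iterator &rhs) const { return m_p != rhs.m_p; }
    // Advancing is a pointer increment plus cheap index bookkeeping.
    iterator& operator++()
    {
      ++m_p;
      if (++x == m_nx) {
        x = 0;
        if (++y == m_ny) {
          y = 0;
          ++z;
        }
      }
      return *this;
    }
    int x, y, z;
  private:
    float *m_p;
    int m_nx, m_ny;
  };

  iterator begin() { return iterator(m_data.data(), m_nx, m_ny); }
  iterator end()
  {
    return iterator(m_data.data() + m_data.size(), m_nx, m_ny);
  }

private:
  int m_nx, m_ny;
  std::vector<float> m_data;
};
```

A loop written against GridSketch::iterator has the same shape as the DenseFieldf loop above, and at(i, j, k) can be used afterwards to confirm that each voxel received the value computed from its own index.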
Interpolation
Field3D comes with two types of interpolation: linear and monotonic cubic. Interpolation is always performed in voxel space, so in order to sample a field in world space, we need to use the Mapping to transform the point from world to voxel space. A simple interpolation example:
    DenseField<T>::Ptr field = someField;
    DenseField<T>::LinearInterp interp;
    // Prefix indicates coordinate space
    V3d wsP(0.0, 0.0, 0.0);
    V3d vsP;
    // Use mapping to transform between coordinate spaces
    field->mapping()->worldToVoxel(wsP, vsP);
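What sampling in voxel space involves can be sketched in one dimension. This is a self-contained illustration, not the Field3D interpolator: sampleLinear() and worldToVoxel1D() are hypothetical, the placement of voxel centers at i + 0.5 is an assumption, and positions outside the data are handled by clamping to the nearest voxel.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Samples 1D voxel data at a continuous voxel-space position.
// Assumption: voxel i covers [i, i + 1) and its value sits at i + 0.5.
inline float sampleLinear(const std::vector<float> &voxels, double vsP)
{
  assert(!voxels.empty());
  const int last = static_cast<int>(voxels.size()) - 1;
  // Shift so that integer positions coincide with voxel centers.
  const double p = vsP - 0.5;
  const double lower = std::floor(p);
  const float t = static_cast<float>(p - lower);
  // Clamp the two neighbor indices at the edges of the data.
  const int i0 = std::min(std::max(static_cast<int>(lower), 0), last);
  const int i1 = std::min(std::max(static_cast<int>(lower) + 1, 0), last);
  return (1.0f - t) * voxels[static_cast<std::size_t>(i0)] +
    t * voxels[static_cast<std::size_t>(i1)];
}

// Hypothetical 1D world-to-voxel transform for a field that starts at
// origin, covers size units of world space and has res voxels.
inline double worldToVoxel1D(double wsP, double origin, double size, int res)
{
  assert(size != 0.0);
  return (wsP - origin) / size * res;
}
```

Sampling in world space then takes two steps, in the same order as the example above: transform the point to voxel space first, then interpolate there.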
Generic and specific interpolator classes
Just as with iterators, interpolators come in two flavors: generic ones, which access data using the virtual interface (Field::value()), and specific ones, which have knowledge of the data structure they traverse.
Generic
  FieldInterp
  LinearFieldInterp
  CubicFieldInterp

Specific
  LinearDenseFieldInterp
  CubicDenseFieldInterp
  LinearMACFieldInterp
  CubicMACFieldInterp
The generic interpolators all inherit from the FieldInterp base class and have a virtual interface for performing interpolation. The specific classes share no base class; they are only usable when the concrete type is known.

To aid in the writing of generic code, each concrete class (for example DenseField) provides a set of convenience typedefs which bring the appropriate interpolator class into the scope of the class. For example, DenseField<float>::LinearInterp refers to LinearDenseFieldInterp<float>. Some classes do not have specific interpolator classes: SparseField currently uses the generic interpolators. In these cases the typedef simply refers to the generic classes, i.e. SparseField<float>::LinearInterp refers to LinearFieldInterp<float>.
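The convenience typedef pattern can be sketched with empty placeholder classes. All names below are hypothetical stand-ins for the field and interpolator classes; the point is only that generic code can ask a field type for its interpolator without knowing whether a specific one exists.

```cpp
#include <cassert>
#include <type_traits>

// Placeholder interpolators: one generic, one specific to dense storage.
template <class Data_T> struct GenericLinearInterpSketch { };
template <class Data_T> struct DenseLinearInterpSketch { };

template <class Data_T>
struct DenseFieldSketch
{
  // A specific interpolator exists for this data structure.
  typedef DenseLinearInterpSketch<Data_T> LinearInterp;
};

template <class Data_T>
struct SparseFieldSketch
{
  // No specific interpolator, so fall back to the generic one.
  typedef GenericLinearInterpSketch<Data_T> LinearInterp;
};

// Generic code names the interpolator through the field type only.
template <class Field_T>
struct InterpFor
{
  typedef typename Field_T::LinearInterp type;
};

// Compile-time checks that each typedef resolves as described.
static_assert(
  std::is_same<InterpFor<DenseFieldSketch<float> >::type,
               DenseLinearInterpSketch<float> >::value,
  "dense fields use the specific interpolator");
static_assert(
  std::is_same<InterpFor<SparseFieldSketch<float> >::type,
               GenericLinearInterpSketch<float> >::value,
  "sparse fields fall back to the generic interpolator");
```

If a specific sparse interpolator were added later, only the typedef inside SparseFieldSketch would change, and code written against Field_T::LinearInterp would pick it up without modification.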
Field3DFile
Subclasses of Field can be written to and read from disk using the Field3DInputFile and Field3DOutputFile classes. Any number of fields, along with their mappings and metadata, may be stored in one file. There is no need for the data structure, template type, mapping or field resolution to be identical.
[Figure: diagram with the labels "Host application" and "HDF5"]
[Figure: graph of the groups, datasets and attributes in a Field3D file. Nodes shown: Partition, Mapping, Layer, Metadata; mapping_type (Attribute), mapping_data (Dataset), layer_data (Dataset), layer_metadata (Group) with int and float Attributes.]
When storing multiple fields, the graph grows as expected (see below). We would refer to the layers in this file as name1:attribute1, name1:attribute2 and name2:attribute1.
[Figure: graph of a Field3D file that stores multiple fields. Nodes shown: Partition, Mapping, Layer; name1 (Group), attribute1 (Group), layer_type (Attribute), layer_data (Dataset), layer_metadata (Group) with int and float Attributes.]
Examples of usage
Manipulating voxels
    DenseField<float> field;
    field.setSize(V3i(10, 5, 5));
    // Write to voxel
    field.lvalue(0, 0, 0) = 1.0f;
    // Read from voxel
    float value = field.value(0, 0, 0);
Manipulating mappings
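    // Allocate the field before use; a default-constructed Ptr is null
    MACFieldf::Ptr sim(new MACFieldf);
    sim->setSize(V3i(100, 200, 100));
    MatrixFieldMapping::Ptr mapping(new MatrixFieldMapping);
    // Scale local space [0,1] to a 10 x 20 x 10 region of world space
    M44d localToWorldMtx;
    localToWorldMtx.setScale(V3d(10.0, 20.0, 10.0));
    mapping->setLocalToWorld(localToWorldMtx);
    // Attach the mapping to the field
    sim->setMapping(mapping);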
More examples
For further examples, see the sample_code directory in the root of the library source code.
Programming guide
Coding standards
The following is by no means an exhaustive breakdown of the coding standards used in Field3D, but it should at least indicate the preferred format for the most common uses.

1. If nothing is mentioned in these standards regarding a particular style option, find existing cases in the code and match those.
2. Keep line lengths less than or equal to 80 characters.
3. The order of include files (both in .h and .cpp files) should be: system headers; external library headers; project headers.
4. Always put spaces after commas.
   4.1. float value = someFunction(arg, arg2);
   4.2. not: float value = someFunction(arg,arg2);
5. Never put spaces directly inside parentheses.
   5.1. someFunction(arg1);
   5.2. not: someFunction( arg1 );
   5.3. float value = (a + 1) * b + 1;
   5.4. not: float value = ( a + 1 ) * b + 1;
6. Always put spaces around binary operators, never around unary.
   6.1. float value = a * b + 1;
   6.2. not: float value = a*b+1;
   6.3. float value = -a;
   6.4. not: float value = - a;
   6.5. float value = a;
   6.6. not: float value=a;
7. Never make exceptions to the above rules for spaces, even to shorten a line to 80 characters.
8. Always fully scope symbol names in header files. Never put using statements in header files, with the exception of locations where they have localized effect, e.g. inside function implementations.
9. Class data members should be prefixed m_, or in the case of static data members ms_.
10. Braces, {}, are placed on a separate line for classes and functions, and on the same line in conjunction with for, while and if statements.
11. The public section of a class always appears before the private section.
12. One-line functions may be implemented inline in a class header. All other functions should be implemented in the .cpp file, or at the bottom of the .h file (for templated methods and inline methods).
13. In the class definition, maintain the member functions in groups depending on their context, i.e. accessors, virtual functions to be implemented by subclasses, virtual functions previously defined in base classes, etc. The groups are delimited by a comment with dashes extending to the full 80 character width. If desired, a doxygen group may also be created to improve the reading of generated documentation.
14. Separate new classes with a full-width comment line followed by the name of the class and a second full-width comment line.
15. Separate function implementations by a single full-width comment line.
16. Template arguments have a _T suffix.
17. Use //! style (doxygen) comments to document anything in header files. Don't duplicate these comments in the source file.
18. Feel free to use any other doxygen constructs in code documentation, such as \note, \warning, \return, \todo, \param.
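As an illustration, a small header written to follow the rules above might look like this. RunningSum and sumOf() are hypothetical and are not part of Field3D; the sketch is only meant to show spacing, naming, brace placement, member grouping and comment style in one place.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

//----------------------------------------------------------------------------//
// RunningSum
//----------------------------------------------------------------------------//

//! Accumulates values and reports their sum and count.
template <class Data_T>
class RunningSum
{
public:

  // Constructors --------------------------------------------------------------

  RunningSum() : m_sum(Data_T()), m_count(0) { }

  // Main methods --------------------------------------------------------------

  //! Adds a value to the running sum
  void add(const Data_T &value);

  // Accessors -----------------------------------------------------------------

  //! Returns the sum of all values added so far
  Data_T sum() const { return m_sum; }
  //! Returns the number of values added so far
  int count() const { return m_count; }

private:

  // Data members --------------------------------------------------------------

  Data_T m_sum;
  int m_count;
};

//----------------------------------------------------------------------------//

template <class Data_T>
void RunningSum<Data_T>::add(const Data_T &value)
{
  m_sum = m_sum + value;
  ++m_count;
}

//----------------------------------------------------------------------------//

//! Returns the sum of all elements in a vector
template <class Data_T>
Data_T sumOf(const std::vector<Data_T> &values)
{
  RunningSum<Data_T> result;
  for (std::size_t i = 0; i < values.size(); ++i) {
    result.add(values[i]);
  }
  return result.sum();
}
```

The one-line accessors stay inline in the class, while add() is implemented below the class definition because it is a templated method that lives in the header.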
Typedefs
Standard typedefs exist in the following cases:

Concrete field template instances
Interpolators

    DenseField<float> == DenseFieldf
    SparseField<V3f> == SparseField3f
Use of namespaces
The lack of namespaces often causes complications in applications that allow plugins to be written. For example, if the host application dynamically links to a library of version X, and a plugin is compiled against version Y, the symbols will be identical and the plugin will probably call version X, which may or may not be compatible with the headers used at compilation.

In order to prevent this, Field3D not only has its own namespace but also has a versioned namespace internally. Thus, in the first release, Field3D lives in Field3D::v1_0. To keep code that uses the library from having to explicitly scope symbols with Field3D::v1_0, a using statement is included everywhere the Field3D namespace is opened. While this breaks the rule of never putting a using statement in a header file (and this is the only place where that rule is broken), it allows code to reference a class as Field3D::DenseField<float>, while to the compiler the mangled symbol will include its full scope, i.e. Field3D::v1_0::DenseField<float>. This prevents symbol clashes if multiple versions of the library are used in the same application. While this seldom becomes an issue in simple, standalone applications, in a production environment with tens or hundreds of versioned plugins loaded in a host application it often causes problems.

In order to cleanly handle the versioning, and keep the version number in a single place, a #define called FIELD3D_NAMESPACE_OPEN is used instead of explicitly stating the full namespace in each file. This #define lives in ns.h. When closing the namespace, either FIELD3D_NAMESPACE_HEADER_CLOSE or FIELD3D_NAMESPACE_SOURCE_CLOSE is used, depending on whether the file is a header or a source file.
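The mechanism can be sketched with a hypothetical library called MyLib. The macro bodies below are an assumption about how such macros could be written, modeled on the description above; they are not copied from ns.h.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical macros for a library called MyLib, version 1.0.
// The version number appears in exactly one place.
#define MYLIB_VERSION_NS v1_0
#define MYLIB_NAMESPACE_OPEN namespace MyLib { namespace MYLIB_VERSION_NS {
// Header files re-export the versioned names into MyLib.
#define MYLIB_NAMESPACE_HEADER_CLOSE } using namespace MYLIB_VERSION_NS; }
// Source files just close both namespaces.
#define MYLIB_NAMESPACE_SOURCE_CLOSE } }

MYLIB_NAMESPACE_OPEN

//! Minimal class living in the versioned namespace
template <class Data_T>
struct DenseSketch
{
  Data_T value;
};

MYLIB_NAMESPACE_HEADER_CLOSE

// Client code uses the short name, but the type (and therefore the
// mangled symbol) carries the versioned namespace.
static_assert(
  std::is_same<MyLib::DenseSketch<float>,
               MyLib::v1_0::DenseSketch<float> >::value,
  "the short name resolves to the versioned class");
```

Bumping MYLIB_VERSION_NS to v1_1 would change every mangled symbol in the library while client code that spells MyLib::DenseSketch<float> compiles unchanged.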
Use of RTTI
GCC has issues when it comes to maintaining RTTI information once an object crosses a shared library boundary. dynamic_cast<> works well if both the call and the allocated object reside in the same dynamic link unit, but if an object was allocated in a shared library and the dynamic_cast<> happens in the application itself, the result will be a null pointer.

Field3D avoids this by providing its own run-time type info checks. field_dynamic_cast<> works only for the class hierarchy under Field, and performs a full string compare between type names to determine matching classes. It first checks the object itself, and if no match is found, it checks the base classes recursively up the hierarchy. If GCC's behavior changes in the future, the implementation of field_dynamic_cast<> could change internally to call dynamic_cast<> directly without breaking existing code.
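A cast based on class-name comparison can be sketched as follows. This is not the implementation of field_dynamic_cast<>; the classes, staticClassName(), isOfType() and sketchDynamicCast() are hypothetical names that only illustrate checking the object's own class first and then walking up the hierarchy.

```cpp
#include <cassert>
#include <cstring>

class BaseSketch
{
public:
  virtual ~BaseSketch() { }
  static const char* staticClassName() { return "BaseSketch"; }
  // True if name matches this class or any class above it.
  virtual bool isOfType(const char *name) const
  {
    return std::strcmp(name, staticClassName()) == 0;
  }
};

class DerivedSketch : public BaseSketch
{
public:
  static const char* staticClassName() { return "DerivedSketch"; }
  virtual bool isOfType(const char *name) const
  {
    // Check this class first, then defer to the base class.
    if (std::strcmp(name, staticClassName()) == 0) {
      return true;
    }
    return BaseSketch::isOfType(name);
  }
};

class OtherSketch : public BaseSketch
{
public:
  static const char* staticClassName() { return "OtherSketch"; }
  virtual bool isOfType(const char *name) const
  {
    if (std::strcmp(name, staticClassName()) == 0) {
      return true;
    }
    return BaseSketch::isOfType(name);
  }
};

// Casts by comparing class names instead of relying on compiler RTTI.
// Returns a null pointer when the object is not of the target type.
template <class Target_T>
Target_T* sketchDynamicCast(BaseSketch *object)
{
  if (object && object->isOfType(Target_T::staticClassName())) {
    return static_cast<Target_T*>(object);
  }
  return 0;
}
```

Because the decision depends only on strings returned by virtual functions, it gives the same answer no matter which shared library allocated the object.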
Use of exceptions
Field3D uses exceptions internally, but catches everything before it has a chance to cross the shared library boundary. When writing plugins and/or extending the library, it is OK to use exceptions. However, any functions called from outside the library should refrain from throwing exceptions.
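One way to follow this rule is to catch everything in the functions that form the library boundary and report failure through the return value. The sketch below uses hypothetical functions, voxelAt() and readVoxel(), and is not taken from the Field3D source.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

namespace {

// Internal helper: free to throw, since it is never called from outside.
float voxelAt(const std::vector<float> &data, int i)
{
  if (i < 0 || i >= static_cast<int>(data.size())) {
    throw std::out_of_range("voxel index out of range");
  }
  return data[static_cast<std::size_t>(i)];
}

} // namespace

// Boundary function: reports failure through the return value and never
// lets an exception escape to the caller. result is only written on success.
bool readVoxel(const std::vector<float> &data, int i, float &result)
{
  try {
    result = voxelAt(data, i);
    return true;
  }
  catch (const std::exception &) {
    return false;
  }
  catch (...) {
    return false;
  }
}
```

Callers on the other side of the boundary only see a bool, so they do not depend on exception types or unwinding behaving consistently across shared libraries.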
Future work
Bridge classes
In the first release, the file I/O routines require the inputs and outputs to be classes from the supplied class hierarchy. Bridge classes would allow users of the library to write small translation functions that could pass voxel data, mapping definitions and metadata to arbitrary data structures.
Bounds-free fields
To be written
Credits
The Field3D library was originally developed at Sony Pictures Imageworks during 2008 and 2009. The following people were involved in the design and implementation:

Magnus Wrenninge
Chris Allen
Sosh Mirsepassi
Stephen Marshall
Chris Burdorf
Henrik Fält
Scot Shinderman
Doug Bloom
Special thanks go to Barbara Ford for management of the project and to Rob Bredow for spearheading the Open Source effort at Imageworks.