[hdf-forum] setting chunk dimensions

Natalie Happenhofer

Hi!
I'm trying to write my data via hyperslabs, and there is also a nice example of how to do it on the HDF5.org webpage. I just don't understand how to set the chunk_dims, or, more precisely, what do these chunking dimensions do?
Here is the part of the example code using the chunk-dims:

int
main (void)
{
    hid_t       file;                          /* handles */
    hid_t       dataspace, dataset;
    hid_t       filespace;
    hid_t       cparms;
    hsize_t      dims[2]  = { 3, 3};            /*
                         * dataset dimensions
                         * at the creation time
                         */
    hsize_t      dims1[2] = { 3, 3};            /* data1 dimensions */
    hsize_t      dims2[2] = { 7, 1};            /* data2 dimensions */
   

    hsize_t      dims3[2] = { 2, 2};            /* data3 dimensions */

    hsize_t      maxdims[2] = {H5S_UNLIMITED, H5S_UNLIMITED};
    hsize_t      chunk_dims[2] ={2, 5};
    hsize_t      size[2];
    hsize_t      offset[2];

    herr_t      status;

    int         data1[3][3] = { {1, 1, 1},       /* data to write */
                {1, 1, 1},
                {1, 1, 1} };

    int         data2[7]    = { 2, 2, 2, 2, 2, 2, 2};

    int         data3[2][2] = { {3, 3},
                {3, 3} };
    int fillvalue = 0;

    /*
     * Create the data space with unlimited dimensions.
     */
    dataspace = H5Screate_simple(RANK, dims, maxdims);

    /*
     * Create a new file. If file exists its contents will be overwritten.
     */
    file = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /*
     * Modify dataset creation properties, i.e. enable chunking.
     */
    cparms = H5Pcreate(H5P_DATASET_CREATE);
    status = H5Pset_chunk( cparms, RANK, chunk_dims);
    status = H5Pset_fill_value (cparms, H5T_NATIVE_INT, &fillvalue );

   

    /*
     * Create a new dataset within the file using cparms
     * creation properties.
     */
    dataset = H5Dcreate2(file, DATASETNAME, H5T_NATIVE_INT, dataspace, H5P_DEFAULT,
            cparms, H5P_DEFAULT);

chunk_dims is set to {2,5}, which I don't understand, because the initial dataset is 3x3 and is then extended to 10x3 - why the {2,5}?

thx,
NH



Francesc Alted
Hi Natalie,

On Tuesday 21 October 2008, Natalie Happenhofer wrote:
> Hi!
> I'm trying to write my data via hyperslabs, and there is also a nice
> example of how to do it on the HDF5.org webpage. I just don't understand
> how to set the chunk_dims, or, more precisely, what do these chunking
> dimensions do? Here is the part of the example code using the
> chunk-dims:
[clip]
>hsize_t      chunk_dims[2] ={2, 5};
[clip]
> chunk_dims is set to {2,5}, which I don't understand, because the
> initial dataset is 3x3 and is then extended to 10x3 - why the {2,5}?

Because you told HDF5 that your chunk dimensions are {2,5} (see above).
Chunk sizes have nothing to do with the dimensions of your dataset;
they determine the way the I/O is done.
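
To illustrate the point, here is a minimal sketch in plain C (no HDF5 calls) of how a chunk grid of {2,5} maps an element coordinate to a chunk and to a position inside that chunk. The type chunk_pos and the function locate_element are hypothetical names made up for this illustration; they are not part of the HDF5 API, and the library's internal indexing is more involved than this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: where an element lands in a chunked layout. */
typedef struct {
    size_t chunk[2];   /* index of the chunk in the chunk grid */
    size_t local[2];   /* position of the element inside that chunk */
} chunk_pos;

/* Map a 2-D element coordinate to its chunk and its offset within the chunk.
 * Only chunk_dims is needed: the dataset's current extent plays no role. */
chunk_pos locate_element(const size_t coord[2], const size_t chunk_dims[2])
{
    chunk_pos p;
    for (int i = 0; i < 2; i++) {
        assert(chunk_dims[i] > 0);
        p.chunk[i] = coord[i] / chunk_dims[i];
        p.local[i] = coord[i] % chunk_dims[i];
    }
    return p;
}
```

With chunk_dims = {2,5}, the element at (9,2), the last one of the extended 10x3 dataset, falls in chunk (4,0) at local position (1,2). The same mapping holds whether the dataset is currently 3x3 or 10x3, which is why the chunk dimensions need not match either size.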

I'd recommend that you carefully read the "Datasets" section of the User's
Guide at:

http://www.hdfgroup.org/HDF5/doc/UG/UG_frame10Datasets.html

and particularly the section labeled as "Chunked".

--
Francesc Alted
Freelance Developer & Consultant


Ruth Aydt
Administrator
In reply to this post by Natalie Happenhofer
Hi Natalie,

You can think of the hyperslabs as the way you logically access (write  
or read) subsets of a complete dataset from your application's  
perspective.  By specifying different hyperslabs you can access  
different subsets of the dataset.   You can also access the entire  
dataset -- it just depends on what you specify in the write or read.

Chunked storage defines how the dataset is physically written to and
read from disk. The chunk size is set when the dataset is created
and remains constant. Typically you want to choose a chunk layout that
will perform well for the most frequent logical access pattern, or
for the access pattern where you want the best performance.

So hyperslabs are about logical access and chunks are about physical
storage organization on disk. Both hyperslabs and chunks will have
the same number of dimensions as the dataset. But the dimension
*sizes* for both hyperslabs and chunks may be (and usually are)
different from your dataset's dimension sizes.

The interaction of chunk sizes, hyperslab selections, and various  
other factors can dramatically impact performance.
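
One way to see this interaction is to count how many chunks a hyperslab selection touches, since each touched chunk generally has to be read or written as a unit. The sketch below is plain C with no HDF5 calls and assumes a simple contiguous selection (stride 1, a single block) given by an offset and a count. The function names are hypothetical, not part of the HDF5 API.

```c
#include <assert.h>
#include <stddef.h>

/* Chunks touched along one dimension by a contiguous selection that
 * starts at `offset` and spans `count` elements. */
size_t chunks_touched_along(size_t offset, size_t count, size_t chunk)
{
    assert(chunk > 0 && count > 0);
    size_t first = offset / chunk;               /* chunk holding the first element */
    size_t last  = (offset + count - 1) / chunk; /* chunk holding the last element  */
    return last - first + 1;
}

/* Chunks touched by a contiguous 2-D selection (stride 1, one block). */
size_t chunks_touched_2d(const size_t offset[2], const size_t count[2],
                         const size_t chunk_dims[2])
{
    return chunks_touched_along(offset[0], count[0], chunk_dims[0]) *
           chunks_touched_along(offset[1], count[1], chunk_dims[1]);
}
```

As an example, take the 7x1 block (dims2 in the posted code) and assume it is written at offset {3,0}; that offset is an assumption, since it is not shown in the excerpt. With chunk_dims = {2,5} the selection touches 4 chunks, while with a hypothetical chunk_dims = {10,1} it would touch only 1. Whether fewer touched chunks is actually faster depends on chunk size, caching and the other factors mentioned above.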

You may be interested in sections 4.1 and 5 of the NetCDF-4  
Performance Report found at www.hdfgroup.org/pubs/papers.   They give  
some explanation about hyperslabs and chunked storage, and how  
performance may vary, as well as how chunked storage may impact  
filesize.
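
As a rough illustration of the file size effect, the sketch below estimates how many elements of storage the chunks covering a dataset occupy, compared with the dataset's logical size. It is plain C with no HDF5 calls and assumes every covering chunk is allocated at full size with no compression or other filter applied. The function name is hypothetical and the real file also contains metadata that is not counted here.

```c
#include <assert.h>
#include <stddef.h>

/* Elements of storage used by the chunks covering a 2-D dataset,
 * assuming each chunk is stored whole, including partly filled edge chunks. */
size_t chunked_storage_elements(const size_t dims[2], const size_t chunk_dims[2])
{
    size_t total = 1;
    for (int i = 0; i < 2; i++) {
        assert(chunk_dims[i] > 0);
        /* ceiling division: chunks needed along this dimension */
        size_t n_chunks = (dims[i] + chunk_dims[i] - 1) / chunk_dims[i];
        total *= n_chunks * chunk_dims[i];
    }
    return total;
}
```

For the 10x3 dataset from Natalie's example with chunk_dims = {2,5}, this gives 50 elements of chunk storage for 30 elements of data, because every chunk is 5 columns wide while the dataset has only 3 columns.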

-Ruth



On Oct 21, 2008, at 3:10 AM, Natalie Happenhofer wrote:

> [clip]

------------------------------------------------------------
Ruth Aydt
The HDF Group
1901 South First Street,  Suite C-2
Champaign, IL 61820

aydt at hdfgroup.org      (217)265-7837
------------------------------------------------------------


