Reading dataset values with number of dimensions 2.

Deepak 8 Kumar
I have an HDF5-based application that reads an HDF5 file containing a dataset with two dimensions (rank 2). I am using HDF5 1.10.1 on Windows 10 x64.
I allocate an array of pointers to read the data values in the same way as shown in the HDF5 documentation. Here is the code that reads the dataset values.
In this code, the dynamic 2D array of pointers is not initialized in the standard way (allocating each row in a loop); instead, the row pointers are set to offsets within one contiguous block.

// Array of row pointers.
unique_ptr<T*[]> apbuffer = make_unique<T*[]>(size_of_dimensions[0]);
T** buffer = apbuffer.get();
// One contiguous block that holds all rows.
unique_ptr<T[]> apbuffer1 = make_unique<T[]>(size_of_dimensions[0] * size_of_dimensions[1]);
buffer[0] = apbuffer1.get();
// Point each row into the contiguous block.
for (int i = 1; i < size_of_dimensions[0]; i++) {
    buffer[i] = buffer[0] + i * size_of_dimensions[1];
}
// H5Dread is given the start of the contiguous block.
H5Dread(dataset_id, dataset_type_id, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer[0]);

Can we allocate the buffer as in the code below instead?
T** buffer= new T*[dims[0]];
for(int i = 0; i < dims[0]; ++i)
    buffer[i] = new T[dims[1]];

I would like to know what other ways there are to allocate the buffer for reading a two-dimensional dataset.
Any insight is greatly appreciated.

Re: Reading dataset values with number of dimensions 2.

Nelson, Jarom

I think for a variable-size multidimensional array in C++, it's more common to just use a one-dimensional array and do the array-indexing math yourself (or with inline utility methods). Using an array of pointers to arrays just adds extra levels of indirection, more memory lookups, and slower code.

Your second code snippet will not work, because H5Dread writes the values directly into the buffer you pass it, not into the arrays pointed to by that buffer. You'll want to do this instead:

T* buffer = new T[dims[1] * dims[0]];
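With a single flat buffer like that, the read call and the per-element access look like this (dataset_id, dataset_type_id and dims are the identifiers already used in the snippets above; the index arithmetic assumes HDF5's default C-order, row-major layout):

// H5Dread fills the contiguous buffer row by row (C order).
H5Dread(dataset_id, dataset_type_id, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer);

// Element (i, j) of the rank-2 dataset:
T value_ij = buffer[i * dims[1] + j];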

 

Rather than trying to shoehorn the built-in multidimensional-array syntax onto your data with an array of pointers, consider building a wrapper class that owns the one-dimensional array; inline accessor methods can provide the lookup semantics. Internally it can use smart pointers, ensure aligned data, or whatever else you need. Then you can write something like:

 

ArrayWrapper<double> v(128,256);
v.value(123,234) = 1234.5678;
cout << v.value(1,2) << endl;

H5Dread(dataset_id, dataset_type_id, H5S_ALL, H5S_ALL, H5P_DEFAULT, v.data());

 

Your compiler should be able to inline the calls to .value() so that it is equivalent to v.data()[v.xsize()*y + x]. Even if it doesn't, this will still be much faster than loading the address of a sub-array from memory and then reading from that address (as long as your access patterns match the memory layout of the array).
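A minimal sketch of what such a wrapper might look like, matching the usage above (ArrayWrapper and its value()/data()/xsize() methods are just the names used in this example, not an existing library class; element (x, y) is stored at index xsize()*y + x, as in the formula above):

#include <cstddef>
#include <vector>

template <typename T>
class ArrayWrapper {
public:
    ArrayWrapper(std::size_t xsize, std::size_t ysize)
        : xsize_(xsize), ysize_(ysize), data_(xsize * ysize) {}

    // Element (x, y) lives at index xsize_*y + x: x is the fast
    // (contiguous) index, y the slow one.
    T&       value(std::size_t x, std::size_t y)       { return data_[xsize_ * y + x]; }
    const T& value(std::size_t x, std::size_t y) const { return data_[xsize_ * y + x]; }

    // Raw contiguous buffer, suitable to pass to H5Dread/H5Dwrite.
    T*       data()       { return data_.data(); }
    const T* data() const { return data_.data(); }

    std::size_t xsize() const { return xsize_; }
    std::size_t ysize() const { return ysize_; }

private:
    std::size_t xsize_;
    std::size_t ysize_;
    std::vector<T> data_;
};

With HDF5's C-order (row-major) in-memory layout, x would correspond to the last, fastest-varying dataset dimension (dims[1] in the snippets above).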

 

Jarom

 

Re: Reading dataset values with number of dimensions 2.

Adev
If you are interested, the HighFive HDF5 C++ bindings let you read and write any two-dimensional dataset through a boost::ublas::matrix with only a few lines of code:

https://github.com/BlueBrain/HighFive/blob/master/src/examples/boost_ublas_double.cpp
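For example, reading an existing two-dimensional dataset into a ublas matrix looks roughly like this (a sketch of the HighFive usage shown in the linked example; the file name "data.h5" and dataset name "my_dataset" are placeholders, and the H5_USE_BOOST define that enables the boost bindings should be checked against the repository):

#define H5_USE_BOOST
#include <highfive/H5File.hpp>
#include <boost/numeric/ublas/matrix.hpp>

int main() {
    HighFive::File file("data.h5", HighFive::File::ReadOnly);

    // The bindings size the matrix to match the dataset's dimensions.
    boost::numeric::ublas::matrix<double> mat;
    file.getDataSet("my_dataset").read(mat);

    return 0;
}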

PS: I am one of the authors of HighFive.

Regards,
Adrien Devresse
