Parallel file access recommendation

Parallel file access recommendation

Jan Oliver Oelerich
Hello HDF users,

I am using HDF5 through NetCDF, and I recently changed my program so that
each MPI process writes its data directly to the output file, as opposed
to the master process gathering the results and being the only one that
does I/O.

Now I see that my program slows down the file system of the whole HPC
cluster considerably, and I am not sure how to handle the I/O properly.
The file system is a high-throughput BeeGFS system.

My program uses a hybrid parallelization approach, i.e. the work is split
into N MPI processes, each of which spawns M worker threads. Currently,
I write to the output file from each of the M*N threads, but the writing
is guarded by a mutex, so thread safety shouldn't be a problem. Each
write is a complete `open file, write, close file` cycle.

Each write is at a separate region of the HDF5 file, so no chunks are
shared among any two processes. The amount of data to be written per
process is 1/(M*N) times the size of the whole file.
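Sketched in code, each thread's write path looks roughly like this (a
minimal illustration only; `write_my_block` and `g_io_lock` are
placeholder names, not my actual code):

```c
/* Minimal sketch of the per-thread write path described above:
 * every worker serializes its own open-write-close cycle through a
 * shared mutex.  Requires linking against pthreads and HDF5. */
#include <pthread.h>
#include "hdf5.h"

static pthread_mutex_t g_io_lock = PTHREAD_MUTEX_INITIALIZER;

void write_my_block(const char *path, hid_t filespace_sel, const double *buf)
{
    pthread_mutex_lock(&g_io_lock);

    hid_t file = H5Fopen(path, H5F_ACC_RDWR, H5P_DEFAULT);
    hid_t dset = H5Dopen(file, "/data", H5P_DEFAULT);

    /* filespace_sel selects this thread's disjoint region of the dataset */
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, filespace_sel,
             H5P_DEFAULT, buf);

    H5Dclose(dset);
    H5Fclose(file);

    pthread_mutex_unlock(&g_io_lock);
}
```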

Shouldn't this be exactly how HDF5 + MPI is supposed to be used? What is
the `best practice` regarding parallel file access with HDF5?

Thank you and best regards,
Jan Oliver Oelerich



--
Dr. Jan Oliver Oelerich
Faculty of Physics and Material Sciences Center
Philipps-Universität Marburg

Addr.: Room 02D35, Hans-Meerwein-Straße 6, 35032 Marburg, Germany
Phone: +49 6421 2822260
Mail : [hidden email]
Web  : http://academics.oelerich.org

_______________________________________________
Hdf-forum is for HDF software users discussion.
[hidden email]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

Re: Parallel file access recommendation

Quincey Koziol-3
Hi Jan,

> On May 23, 2017, at 2:46 AM, Jan Oliver Oelerich <[hidden email]> wrote:
>
> Hello HDF users,
>
> I am using HDF5 through NetCDF and I recently changed my program so that each MPI process writes its data directly to the output file as opposed to the master process gathering the results and being the only one who does I/O.
>
> Now I see that my program slows down file systems a lot (of the whole HPC cluster) and I don't really know how to handle I/O. The file system is a high throughput Beegfs system.
>
> My program uses a hybrid parallelization approach, i.e. work is split into N MPI processes, each of which spawns M worker threads. Currently, I write to the output file from each of the M*N threads, but the writing is guarded by a mutex, so thread-safety shouldn't be a problem. Each writing process is a complete `open file, write, close file` cycle.
>
> Each write is at a separate region of the HDF5 file, so no chunks are shared among any two processes. The amount of data to be written per process is 1/(M*N) times the size of the whole file.
>
> Shouldn't this be exactly how HDF5 + MPI is supposed to be used? What is the `best practice` regarding parallel file access with HDF5?

        Yes, this is probably the correct way to operate, but things generally go much better in this case when collective I/O operations are used.  Are you using collective or independent I/O?  (Independent is the default.)

        Quincey
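
For reference, requesting collective transfer is a per-write dataset
transfer property (a minimal sketch, assuming `dset`, `memspace`,
`filespace`, and `buf` already exist and the file was opened through an
MPI-enabled file access property list):

```c
/* Sketch: request collective MPI-IO for a single H5Dwrite call.
 * The default transfer mode is H5FD_MPIO_INDEPENDENT. */
hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);
H5Pclose(dxpl);
```

Note that collective calls must be made by all ranks in the
communicator, so they do not mix with the mutex-per-thread pattern
without restructuring the write phase.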



Re: Parallel file access recommendation

Aaron Friesz
A year or so back, we changed to BeeGFS as well.  There were some issues getting parallel I/O set up.  The first thing you want to do is run the parallel MPI-IO tests.  I believe they can be found here: https://support.hdfgroup.org/HDF5/Tutor/pprog.html.

This will help you verify whether your cluster has MPI-IO set up correctly.  If that doesn't work, you'll need to get in touch with the management group to fix it.

Then you need to make sure you are using an HDF5 library that is configured for parallel I/O.
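
Two quick ways to check that (a sketch; the header location varies by
installation):

```shell
# Ask the HDF5 compiler wrapper for its build configuration
h5cc -showconfig | grep -i parallel

# Or look for the feature macro in the installed public headers
grep H5_HAVE_PARALLEL /usr/include/hdf5/H5pubconf.h
```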

I know there aren't a lot of specifics here, but it took me about two weeks of convincing to get my cluster management group to realize that things weren't working quite right.  Once everything was set up, I was able to generate and write about 40 GB of data in around two minutes.

On Tue, May 23, 2017 at 8:18 AM, Quincey Koziol <[hidden email]> wrote:
Hi Jan,


        Yes, this is probably the correct way to operate, but things generally go much better in this case when collective I/O operations are used.  Are you using collective or independent I/O?  (Independent is the default.)

        Quincey



