h5fcreate 1.10 unable to lock

h5fcreate 1.10 unable to lock

Greg Werner

The metadata-related changes in hdf5 1.10 have made it possible for my
(massively parallel) simulation code to restart from a checkpoint.

However, h5fcreate fails to create a file in serial (on BG/Q).  In a
minimal test program (serial, running on 1 processor), a simple h5fcreate
call results in an error; with hdf5 1.8.10, by contrast, the same call
successfully creates the file:

CALL h5fcreate_f(fileName, H5F_ACC_EXCL_F, fileId, h5err)

fails to create a usable file (a zero-size file is left behind), and yields the errors:

HDF5-DIAG: Error detected in HDF5 (1.10.0) thread 0:
   #000: H5F.c line 491 in H5Fcreate(): unable to create file
     major: File accessibilty
     minor: Unable to open file
   #001: H5Fint.c line 1168 in H5F_open(): unable to lock the file or initialize file structure
     major: File accessibilty
     minor: Unable to open file
   #002: H5FD.c line 1821 in H5FD_lock(): driver lock request failed
     major: Virtual File Layer
     minor: Can't update object
   #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to flock file, errno = 38, error message = 'Function not implemented'
     major: File accessibilty
     minor: Bad file ID accessed
  Tried to create file, err =  -1
HDF5-DIAG: Error detected in HDF5 (1.10.0) thread 0:
   #000: H5F.c line 749 in H5Fclose(): not a file ID
     major: Invalid arguments to routine
     minor: Inappropriate type

Creating files in parallel does not seem to be a problem.

Is there a work-around?  Even in the full simulation code, this is a
straightforward serial open/read/write by a single process; there is no
danger of multiple readers, etc.

Might this issue be fixed by the recent patch to 1.10?

Thanks,
Greg.

_______________________________________________
Hdf-forum is for HDF software users discussion.
[hidden email]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

Re: h5fcreate 1.10 unable to lock

Dana Robinson
Hi Greg,

It looks like flock(2) is available as an API call, but is not implemented for that file system, so it returns a failure code. The HDF5 library only inspects the flock() return value and not errno, so we just note the failure and our API call fails in turn.
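
To check whether a particular file system behaves this way, one option is to call flock() directly on a scratch file and look at errno. The following is only a minimal sketch and is not HDF5 code; the helper name probe_flock and the use of a throwaway scratch file are assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper (not part of HDF5): try to flock() a scratch file at
 * `path`. Returns 0 if the lock was taken, otherwise the errno of the call
 * that failed. ENOSYS (38 on Linux) corresponds to the "Function not
 * implemented" message in the error stack quoted in this thread.
 * The scratch file is removed again before returning. */
int probe_flock(const char *path)
{
    int err = 0;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return errno;               /* could not create the scratch file */

    if (flock(fd, LOCK_EX | LOCK_NB) < 0)
        err = errno;                /* save errno before close() can change it */
    else
        flock(fd, LOCK_UN);         /* lock worked; release it again */

    close(fd);
    unlink(path);
    return err;
}
```

Comparing the return value with ENOSYS separates "flock() is not implemented here" from a result such as EWOULDBLOCK, which would mean another process really holds the lock.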

Just out of curiosity, is this a Lustre file system? I've heard that the overhead for locking is high, so admins often disable it.

Unfortunately, there is no work-around for the file-locking calls in either HDF5 1.10.0 or 1.10.0-patch1 aside from modifying the source. Also unfortunately, you are not the only person who is tripping over the file locking issue when it is unnecessary or unwanted.

For the very short term, I'm considering putting a source patch on our website that will disable the file locking. You'll have to apply the patch and build the library yourself, but this would fix your problem. Let me check into how to best accomplish this and I'll shoot for getting this out next week sometime.
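
To make the idea concrete, here is a hedged sketch of a softer policy than removing the locking calls outright: treat "flock() not implemented" as "proceed without a lock" and still fail on real lock conflicts. This is not the content of any HDF5 patch and not how the library is written; lock_or_skip, the lock_fn type, and the two stand-in functions are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <sys/file.h>

/* Signature shared by flock() and by the stand-ins below. */
typedef int (*lock_fn)(int fd, int operation);

/* Hypothetical policy: succeed if the lock is taken OR if the file system
 * simply has no flock() (errno == ENOSYS); fail on any other error. */
int lock_or_skip(lock_fn do_lock, int fd, int operation)
{
    if (do_lock(fd, operation) == 0)
        return 0;                   /* lock acquired */
    if (errno == ENOSYS)
        return 0;                   /* no flock() here: carry on unlocked */
    return -1;                      /* real failure, e.g. EWOULDBLOCK */
}

/* Stand-in mimicking the behavior reported on BG/Q in this thread. */
int fake_unsupported_flock(int fd, int operation)
{
    (void)fd;
    (void)operation;
    errno = ENOSYS;
    return -1;
}

/* Stand-in mimicking a lock already held by another process. */
int fake_busy_flock(int fd, int operation)
{
    (void)fd;
    (void)operation;
    errno = EWOULDBLOCK;
    return -1;
}
```

In real use the first argument would be flock itself, for example lock_or_skip(flock, fd, LOCK_EX | LOCK_NB). The trade-off is that a silently skipped lock gives no protection at all, which is acceptable only when, as in the original question, a single process touches the file.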

Our current plan to really fix the issue is to start by generating an RFC describing the issue and our proposed solutions. After a brief period for comments, we'll implement the changes for HDF5 1.10.1, which should be released in the very near future (mid-summer, I believe). Before the release date you'll be able to use a snapshot to get the functionality. Since this is a problem that affects several users, I'm going to be keen on getting this into a snapshot ASAP so hopefully you won't have to wait long for official functionality that addresses your problem.
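
For reference, the functionality that shipped with HDF5 1.10.1 is, to the best of my knowledge, an environment variable named HDF5_USE_FILE_LOCKING that can be set to FALSE. That detail is not stated in this thread, so please verify it against the release notes of the version you build against. A minimal sketch of setting it from C before any file is opened (the helper name is hypothetical):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: ask HDF5 (1.10.1 or later, per the assumption stated
 * above) to skip file locking for this process. It has to run before the
 * first H5Fcreate()/H5Fopen() call so that the library sees the setting.
 * Returns 0 on success, -1 if setenv() fails. */
int disable_hdf5_file_locking(void)
{
    return setenv("HDF5_USE_FILE_LOCKING", "FALSE", 1);
}

/* Intended call order in an application (HDF5 calls left as comments so the
 * sketch builds without the library):
 *
 *     disable_hdf5_file_locking();
 *     file_id = H5Fcreate(name, H5F_ACC_EXCL, H5P_DEFAULT, H5P_DEFAULT);
 */
```

Exporting the variable in the job script before launching the program should have the same effect and avoids touching the source.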

Dana Robinson
Software Engineer
The HDF Group

-----Original Message-----
From: Hdf-forum [mailto:[hidden email]] On Behalf Of Greg Werner
Sent: Thursday, June 2, 2016 11:29 AM
To: [hidden email]
Subject: [Hdf-forum] h5fcreate 1.10 unable to lock



Re: h5fcreate 1.10 unable to lock

Greg Werner

For the record, this problem (flock not being available on BG/Q with
GPFS, IBM General Parallel File System) is fixed by the "Patch to Disable
File Locking", a patch to 1.10.0-patch1, namely file-lock-removal.diff at

https://www.hdfgroup.org/HDF5/release/obtainsrc5110.html#patch

Greg.

On Thu, 2 Jun 2016, Dana Robinson wrote:
