Building hdf5-1.10.0-patch 1 on Blue Gene Q


Building hdf5-1.10.0-patch 1 on Blue Gene Q

Sjaardema, Gregory D

I have been unable to get a working parallel build of hdf5-1.10.0-patch1 on a Blue Gene/Q system (rzuseq).  I have tried both the yod-configure approach and manually changing every ./conftest to srun -n1 ./conftest.  Both approaches configure and build correctly, but I am unable to run testhdf5 or testphdf5.  testhdf5 gives errors of this sort:

 

Linked with hdf5 version 1.10 release 0
Testing  -- Configure definitions (config)
Testing  -- Encoding/decoding metadata (metadata)
Testing  -- Checksum algorithm (checksum)
Testing  -- Ternary Search Trees (tst)
Testing  -- Memory Heaps (heap)
Testing  -- Skip Lists (skiplist)
Testing  -- Reference Counted Strings (refstr)
Testing  -- Low-Level File I/O (file)
*** UNEXPECTED RETURN from H5Fcreate is -1 at line  187 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5F.c line 491 in H5Fcreate(): unable to create file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 1168 in H5F_open(): unable to lock the file or initialize file structure
    major: File accessibilty
    minor: Unable to open file
  #002: H5FD.c line 1821 in H5FD_lock(): driver lock request failed
    major: Virtual File Layer
    minor: Can't update object
  #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to flock file, errno = 38, error message = 'Function not implemented'
    major: File accessibilty
    minor: Bad file ID accessed
*** UNEXPECTED RETURN from H5Fclose is -1 at line  198 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5F.c line 749 in H5Fclose(): not a file ID
    major: Invalid arguments to routine
    minor: Inappropriate type
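
The last frame of the trace shows flock failing with errno = 38, which on Linux is ENOSYS ('Function not implemented'). A minimal sketch of a standalone probe for this follows. It assumes Python is available on the nodes where the tests run and that the failure comes from the flock(2) system call itself. The function name flock_supported is hypothetical and not part of HDF5.

```python
import errno
import fcntl
import os
import tempfile


def flock_supported(directory):
    """Return True if flock() works on a scratch file created in `directory`.

    Returns False when the call fails with ENOSYS ("Function not
    implemented"); any other error is re-raised unchanged.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        try:
            # Non-blocking exclusive lock, so the probe never hangs.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno == errno.ENOSYS:
                return False
            raise
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    finally:
        # Always clean up the scratch file.
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(flock_supported(tempfile.gettempdir()))
```

To be meaningful, the probe would have to run in the same environment as testhdf5 (the same nodes and the same Lustre or NFS directory), not on a login node.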

 

This happens on both Lustre and NFS filesystems.  When I build hdf5-1.8.16 using the same procedure, everything works correctly.  I have also used the build_hdf5 in the CGNS distribution, with no change in behavior.
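
The manual edit to configure mentioned above (changing every ./conftest to srun -n1 ./conftest) can be sketched as a small text rewrite. This is an illustration only: the helper name wrap_conftest and the exact matching rule are assumptions, and a real configure script may contain invocations that this pattern misses.

```python
import re

# Match "./conftest" only when it is not already prefixed by the launcher
# and is not part of a longer file name such as "./conftest.c".
_RUN_CONFTEST = re.compile(r"(?<!srun -n1 )\./conftest(?![\w.])")


def wrap_conftest(script_text):
    """Prefix each direct ./conftest invocation with 'srun -n1 '."""
    return _RUN_CONFTEST.sub("srun -n1 ./conftest", script_text)


if __name__ == "__main__":
    print(wrap_conftest("if ./conftest$ac_exeext; then"))
```

The negative lookbehind keeps the rewrite idempotent, so running it twice over the same script does not stack launchers.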

 

I need hdf5-1.10.0-patch1 or later to investigate the collective metadata changes.

 

If anyone on the list, or any of the HDF5 developers or support people, has successfully built on a Blue Gene/Q system, your help would be very much appreciated.

..Greg

 

-- 

"A supercomputer is a device for turning compute-bound problems into I/O-bound problems”


_______________________________________________
Hdf-forum is for HDF software users discussion.
[hidden email]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

Re: Building hdf5-1.10.0-patch 1 on Blue Gene Q

Dana Robinson

Hi Greg,

 

It looks like you are bumping into the "file locking not implemented" issue. There is a small source patch here that disables file locking:

 

https://support.hdfgroup.org/HDF5/release/obtainsrc5110.html

 

Note that file locking was implemented solely to help users get concurrent file opening semantics right. There's no actual loss of HDF5 or SWMR functionality.

 

In the upcoming HDF5 1.10.1, file locking can be disabled via an environment variable and we have a more informative error message when we detect that file locking is not implemented on a file system.
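
A minimal sketch of launching a program with that setting follows. It assumes the variable is HDF5_USE_FILE_LOCKING, which is the name used by HDF5 1.10.1 and later but is not given in this message. The helper name run_without_file_locking is hypothetical.

```python
import os
import subprocess
import sys


def run_without_file_locking(cmd):
    """Run `cmd` with HDF5 file locking disabled through the environment."""
    # Copy the current environment and add the variable; the parent
    # process's own environment is left unchanged.
    env = dict(os.environ, HDF5_USE_FILE_LOCKING="FALSE")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in child process that only echoes the variable; in practice the
    # command would be an HDF5 program such as a test binary.
    result = run_without_file_locking(
        [sys.executable, "-c",
         "import os; print(os.environ['HDF5_USE_FILE_LOCKING'])"]
    )
    print(result.stdout.strip())
```

This has no effect on 1.10.0-patch1, which does not read the variable. For that release the source patch is the way to disable locking.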

 

Let me know if that doesn't work for you and we can diagnose further.

 

Cheers,

 

Dana Robinson

Software Engineer

The HDF Group

 

From: Hdf-forum [mailto:[hidden email]] On Behalf Of Sjaardema, Gregory D
Sent: Thursday, December 8, 2016 3:23 PM
To: HDF Users Discussion List <[hidden email]>
Subject: [Hdf-forum] Building hdf5-1.10.0-patch 1 on Blue Gene Q

 


Re: [EXTERNAL] Building hdf5-1.10.0-patch 1 on Blue Gene Q

Sjaardema, Gregory D
In reply to this post by Sjaardema, Gregory D

A little more information on the previous email:

 

The parallel tests that fail are: calloc, fltread, and atomicity.  All others pass.

 

h5dump fails to open every existing HDF5 file that I have tried.

I am using mpicc, which is powerpc64-bgq-linux-gcc 4.7.2.

 

..Greg

 


Re: [EXTERNAL] Re: Building hdf5-1.10.0-patch 1 on Blue Gene Q

Sjaardema, Gregory D
In reply to this post by Dana Robinson

I think this is working.  I am going to run some more tests and build the applications that use the library, but I am able to h5dump files and it looks like most/all of the serial and parallel tests are working.

 

Thanks for the quick response.

..Greg

 


 

From: Hdf-forum <[hidden email]> on behalf of Dana Robinson <[hidden email]>
Reply-To: HDF Users Discussion List <[hidden email]>
Date: Thursday, December 8, 2016 at 1:46 PM
To: HDF Users Discussion List <[hidden email]>
Subject: [EXTERNAL] Re: [Hdf-forum] Building hdf5-1.10.0-patch 1 on Blue Gene Q

 


Re: [EXTERNAL] Re: Building hdf5-1.10.0-patch 1 on Blue Gene Q

Sjaardema, Gregory D
In reply to this post by Dana Robinson

I am getting confusing results with the file-locking patch.  The parallel tests (check-p) seem to run correctly, but the serial tests fail in the low-level file I/O tests (file) with an error that looks related to file locking.  I have verified that the patch applied correctly.  Here is the test output:

 

For help use: /usr/workspace/wsrzc/gdsjaar/seacas/TPL/hdf5/hdf5-1.10.0-patch1/test/./testhdf5 -help
Linked with hdf5 version 1.10 release 0
Testing  -- Configure definitions (config)
Testing  -- Encoding/decoding metadata (metadata)
Testing  -- Checksum algorithm (checksum)
Testing  -- Ternary Search Trees (tst)
Testing  -- Memory Heaps (heap)
Testing  -- Skip Lists (skiplist)
Testing  -- Reference Counted Strings (refstr)
Testing  -- Low-Level File I/O (file)
*** UNEXPECTED RETURN from H5Fcreate is -1 at line 3158 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5F.c line 491 in H5Fcreate(): unable to create file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 1168 in H5F_open(): unable to lock the file or initialize file structure
    major: File accessibilty
    minor: Unable to open file
*** UNEXPECTED RETURN from H5Dcreate2 is -1 at line 3180 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5D.c line 121 in H5Dcreate2(): not a location ID
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value
*** UNEXPECTED RETURN from H5Dclose is -1 at line 3183 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5D.c line 334 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
*** UNEXPECTED RETURN from H5Dcreate2 is -1 at line 3180 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5D.c line 121 in H5Dcreate2(): not a location ID
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value
*** UNEXPECTED RETURN from H5Dclose is -1 at line 3183 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5D.c line 334 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
*** UNEXPECTED RETURN from H5Dcreate2 is -1 at line 3180 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5D.c line 121 in H5Dcreate2(): not a location ID
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value
*** UNEXPECTED RETURN from H5Dclose is -1 at line 3183 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5D.c line 334 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
*** UNEXPECTED RETURN from H5Dcreate2 is -1 at line 3180 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5D.c line 121 in H5Dcreate2(): not a location ID
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value
*** UNEXPECTED RETURN from H5Dclose is -1 at line 3183 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5D.c line 334 in H5Dclose(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type
*** UNEXPECTED RETURN from H5Dcreate2 is -1 at line 3180 in tfile.c
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:
  #000: H5D.c line 121 in H5Dcreate2(): not a location ID

/
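
Most of the output above is a cascade: once H5Fcreate fails, each later call receives an invalid ID and fails in turn. A small sketch that tallies the "UNEXPECTED RETURN" lines of such a log makes the first failure easier to spot. The function name summarize_failures is hypothetical, and the regular expression is written only against the line format shown here.

```python
import re
from collections import Counter

# Captures: function name, return value, line number, source file.
UNEXPECTED = re.compile(
    r"UNEXPECTED RETURN from (\w+) is (-?\d+) at line\s+(\d+) in (\S+)"
)


def summarize_failures(log_text):
    """Return (first_failure, counts) for the UNEXPECTED RETURN lines.

    first_failure is a (function, value, line, file) tuple of strings, or
    None if the log contains no such line; counts maps each function name
    to the number of times it was reported.
    """
    hits = UNEXPECTED.findall(log_text)
    first = hits[0] if hits else None
    counts = Counter(name for name, _value, _line, _file in hits)
    return first, counts


if __name__ == "__main__":
    sample = (
        "*** UNEXPECTED RETURN from H5Fcreate is -1 at line 3158 in tfile.c\n"
        "*** UNEXPECTED RETURN from H5Dcreate2 is -1 at line 3180 in tfile.c\n"
        "*** UNEXPECTED RETURN from H5Dclose is -1 at line 3183 in tfile.c\n"
    )
    first, counts = summarize_failures(sample)
    print(first, dict(counts))
```

For the log in this message, the first entry is the H5Fcreate failure at line 3158 of tfile.c. The H5Dcreate2 and H5Dclose entries all follow from it.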

 


Re: [EXTERNAL] Re: Building hdf5-1.10.0-patch 1 on Blue Gene Q

Dana Robinson

Hi Greg,

 

That is a weird error. Did you build from a clean state after applying the patch?

 

Dana

 

From: Hdf-forum [mailto:[hidden email]] On Behalf Of Sjaardema, Gregory D
Sent: Monday, December 12, 2016 11:37 AM
To: HDF Users Discussion List <[hidden email]>
Subject: Re: [Hdf-forum] [EXTERNAL] Re: Building hdf5-1.10.0-patch 1 on Blue Gene Q

 


Re: [EXTERNAL] Re: Building hdf5-1.10.0-patch 1 on Blue Gene Q

Sjaardema, Gregory D

Yes, I've built it a couple of times from scratch: deleting all files, untarring, patching, and building.

 

..Greg

 


 

From: Hdf-forum <[hidden email]> on behalf of Dana Robinson <[hidden email]>
Reply-To: HDF Users Discussion List <[hidden email]>
Date: Monday, December 12, 2016 at 9:48 AM
To: HDF Users Discussion List <[hidden email]>
Subject: Re: [Hdf-forum] [EXTERNAL] Re: Building hdf5-1.10.0-patch 1 on Blue Gene Q

 


    minor: Inappropriate type

*** UNEXPECTED RETURN from H5Dcreate2 is -1 at line 3180 in tfile.c

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:

  #000: H5D.c line 121 in H5Dcreate2(): not a location ID

    major: Invalid arguments to routine

    minor: Inappropriate type

  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID

    major: Invalid arguments to routine

    minor: Bad value

*** UNEXPECTED RETURN from H5Dclose is -1 at line 3183 in tfile.c

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:

  #000: H5D.c line 334 in H5Dclose(): not a dataset

    major: Invalid arguments to routine

    minor: Inappropriate type

*** UNEXPECTED RETURN from H5Dcreate2 is -1 at line 3180 in tfile.c

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:

  #000: H5D.c line 121 in H5Dcreate2(): not a location ID

    major: Invalid arguments to routine

    minor: Inappropriate type

  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID

    major: Invalid arguments to routine

    minor: Bad value

*** UNEXPECTED RETURN from H5Dclose is -1 at line 3183 in tfile.c

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:

  #000: H5D.c line 334 in H5Dclose(): not a dataset

    major: Invalid arguments to routine

    minor: Inappropriate type

*** UNEXPECTED RETURN from H5Dcreate2 is -1 at line 3180 in tfile.c

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:

  #000: H5D.c line 121 in H5Dcreate2(): not a location ID

/

 

-- 

"A supercomputer is a device for turning compute-bound problems into I/O-bound problems”

 

From: Hdf-forum <[hidden email]> on behalf of Dana Robinson <[hidden email]>
Reply-To: HDF Users Discussion List <[hidden email]>
Date: Thursday, December 8, 2016 at 1:46 PM
To: HDF Users Discussion List <[hidden email]>
Subject: [EXTERNAL] Re: [Hdf-forum] Building hdf5-1.10.0-patch 1 on Blue Gene Q

 

Hi Greg,

 

It looks like you are bumping into the "file locking not implemented" issue. There is a small source patch here that disables file locking:

 

https://support.hdfgroup.org/HDF5/release/obtainsrc5110.html

 

Note that file locking was implemented solely to help users get concurrent file opening semantics right. Disabling it causes no actual loss of HDF5 or SWMR functionality.
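For anyone who wants to check a file system independently of HDF5, here is a minimal sketch of a probe. It assumes the failing call is the POSIX/BSD flock() named in the error trace (errno 38 is ENOSYS on Linux). The helper name probe_flock and the scratch-file path are hypothetical and not part of HDF5.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper (not an HDF5 function): create a scratch file at
 * `path`, try to take and release an exclusive flock() lock on it, then
 * delete the file.
 * Returns 0 if locking works, otherwise the errno value reported by
 * open() or flock(). ENOSYS corresponds to the "Function not
 * implemented" message in the trace. */
int probe_flock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return errno;              /* could not even create the file */

    int err = 0;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0)
        err = errno;               /* e.g. ENOSYS on an unsupported file system */
    else
        flock(fd, LOCK_UN);        /* lock obtained, release it again */

    close(fd);
    unlink(path);                  /* remove the scratch file */
    return err;
}
```

Calling probe_flock() on a path inside the directory where the tests write their files (from a job running on the compute nodes) and comparing the result with ENOSYS should show whether that file system is affected.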

 

In the upcoming HDF5 1.10.1, file locking can be disabled via an environment variable and we have a more informative error message when we detect that file locking is not implemented on a file system.
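As a rough sketch of what that looks like from application code: the variable name HDF5_USE_FILE_LOCKING and the value FALSE are taken from the HDF5 1.10.1 release notes, not from this thread, so treat them as an assumption and check the notes for the version you build. The helper name below is hypothetical. This does not apply to 1.10.0-patch1, which still needs the source patch.

```c
#define _POSIX_C_SOURCE 200112L   /* for setenv() under strict C modes */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: ask HDF5 (1.10.1 or later, per its release
 * notes) to skip file locking by setting the environment variable in
 * the current process. Call it before the first H5Fcreate()/H5Fopen().
 * Returns 0 on success, -1 if setenv() fails. */
int disable_hdf5_file_locking(void)
{
    /* Third argument 1: overwrite any value already in the environment. */
    return setenv("HDF5_USE_FILE_LOCKING", "FALSE", 1);
}
```

Setting the variable in the job script (export HDF5_USE_FILE_LOCKING=FALSE) should have the same effect, provided the launcher passes the environment through to the compute nodes.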

 

Let me know if that doesn't work for you and we can diagnose further.

 

Cheers,

 

Dana Robinson

Software Engineer

The HDF Group

 

From: Hdf-forum [[hidden email]] On Behalf Of Sjaardema, Gregory D
Sent: Thursday, December 8, 2016 3:23 PM
To: HDF Users Discussion List <[hidden email]>
Subject: [Hdf-forum] Building hdf5-1.10.0-patch 1 on Blue Gene Q

 

I have been unsuccessful in building a parallel version of hdf5-1.10.0-patch1 on a Blue Gene Q system (rzuseq).  I have tried both the yod-configure approach and manually changing all ./conftest to srun -n1 ./conftest. Although both approaches configure and build correctly, I am unable to run testhdf5 or testphdf5.  testhdf5 gives errors of the sort:

 

Linked with hdf5 version 1.10 release 0

Testing  -- Configure definitions (config)

Testing  -- Encoding/decoding metadata (metadata)

Testing  -- Checksum algorithm (checksum)

Testing  -- Ternary Search Trees (tst)

Testing  -- Memory Heaps (heap)

Testing  -- Skip Lists (skiplist)

Testing  -- Reference Counted Strings (refstr)

Testing  -- Low-Level File I/O (file)

*** UNEXPECTED RETURN from H5Fcreate is -1 at line  187 in tfile.c

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:

  #000: H5F.c line 491 in H5Fcreate(): unable to create file

    major: File accessibilty

    minor: Unable to open file

  #001: H5Fint.c line 1168 in H5F_open(): unable to lock the file or initialize file structure

    major: File accessibilty

    minor: Unable to open file

  #002: H5FD.c line 1821 in H5FD_lock(): driver lock request failed

    major: Virtual File Layer

    minor: Can't update object

  #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to flock file, errno = 38, error message = 'Function not implemented'

    major: File accessibilty

    minor: Bad file ID accessed

*** UNEXPECTED RETURN from H5Fclose is -1 at line  198 in tfile.c

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 0:

  #000: H5F.c line 749 in H5Fclose(): not a file ID

    major: Invalid arguments to routine

    minor: Inappropriate type

 

This happens on both Lustre and NFS filesystems.  When I build hdf5-1.8.16 using the same procedure, everything works correctly.  I have also used the build_hdf5 in the CGNS distribution with no change in behavior.

 

I need hdf5-1.10.0-patch1 or later to investigate the collective metadata changes.

 

If anyone on the list, or any of the HDF5 developers or support people, has successfully built on a Blue Gene Q system, your help would be very much appreciated.

..Greg

 

-- 

"A supercomputer is a device for turning compute-bound problems into I/O-bound problems”

