Re: Broken file, can not add attribute

Dave Allured - NOAA Affiliate
HDF5 support,

We traced this problem to a known bug, JIRA DB HDFFV-9281, "Attribute creation index can't be incremented", previously reported by Andrey Paramonov:

https://lists.hdfgroup.org/pipermail/hdf-forum_lists.hdfgroup.org/2015-April/008462.html

I believe the root cause is the management of the per-variable maximum attribute creation index within HDF5.  This is a 16-bit counter with maximum value 65535.  Careless usage patterns within client applications can easily saturate the counter.  In particular, repeated delete/recreate cycles on the same attribute name will ramp up the counter.  That is what happened in our own use case below.
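
To make the saturation mechanism concrete, here is a minimal sketch in Python. It is a toy model only and does not call HDF5: the class `ToyObject` and the function `cycles_until_failure` are hypothetical names, and the rule that a delete never gives an index back is an assumption taken from the description above, not from the library source. The limit of 65535 and the error text are the ones quoted in this thread.

```python
# Toy model only: this does not call HDF5. It mimics a per-object creation
# index that grows on every attribute creation and is never reduced on delete.
MAX_CRT_IDX = 65535  # largest value of a 16-bit unsigned counter


class ToyObject:
    """Hypothetical stand-in for a variable with creation-order tracking."""

    def __init__(self):
        self.attrs = {}
        self.max_crt_idx = 0  # next creation index to hand out

    def create_attr(self, name, value):
        if name in self.attrs:
            raise ValueError(f"attribute {name!r} already exists")
        if self.max_crt_idx == MAX_CRT_IDX:
            # Same wording as the error quoted in this thread.
            raise OverflowError("Attribute creation index can't be incremented")
        self.attrs[name] = (self.max_crt_idx, value)
        self.max_crt_idx += 1

    def delete_attr(self, name):
        # Assumption: deleting an attribute does not give its index back.
        del self.attrs[name]


def cycles_until_failure(obj, name="history"):
    """Delete and recreate one attribute until creation fails."""
    cycles = 0
    while True:
        try:
            obj.create_attr(name, cycles)
        except OverflowError:
            return cycles
        obj.delete_attr(name)
        cycles += 1


obj = ToyObject()
n = cycles_until_failure(obj)
print(n, len(obj.attrs), obj.max_crt_idx)  # 65535 0 65535
```

In this model the object ends up holding no attributes at all and still refuses a new one, which matches the symptom: the counter depends on how many creations have ever happened, not on how many attributes exist now.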

Here is a related netcdf-C Github issue with excellent analysis by Constantine Khroulev:
https://github.com/Unidata/netcdf-c/issues/350

I think you can close the newer JIRA ticket HDFFV-10026 and mark it duplicate, or merge it with the previous JIRA DB HDFFV-9281.

We look forward to some kind of fix at the NetCDF or HDF5 level.  Thank you for looking into this.

--Dave A.
NOAA/OAR/ESRL/PSD/CIRES


On Tue, Nov 15, 2016 at 12:10 PM, Dave Allured - NOAA Affiliate <[hidden email]> wrote:
Elena,

Thank you for reproducing the issue, and for showing me how to display the object structure.  h5debug versions 1.8.16 and 1.10.0 both show ridiculous values for the number of attributes, even for known good NetCDF-4 files, including trivial examples.  There might be a problem in h5debug, independent of this broken file problem.
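
A quick arithmetic check on the two values in the h5debug dump quoted further down this thread may be useful here. The sketch below only tests whether the printed numbers are the all-ones values for their widths; the variable names are made up for illustration. Whether h5debug prints an unset or placeholder field this way is a guess, and nothing in this thread confirms it.

```python
# Values copied from the "ainfo" message in the h5debug output in this thread.
count_reported = 18446744073709551615  # "Number of attributes"
max_idx_reported = 65535               # "Max. creation index value"

# Compare against the largest unsigned 64-bit and 16-bit integers.
print(count_reported == 2**64 - 1)    # True
print(max_idx_reported == 2**16 - 1)  # True
print(hex(count_reported))            # 0xffffffffffffffff
```

So the attribute count is every bit set in a 64-bit field, which would be consistent with a display problem rather than a real count, while the creation index has genuinely reached the top of its 16-bit range.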

--Dave


On Mon, Nov 14, 2016 at 10:54 PM, Elena Pourmal <[hidden email]> wrote:
Dave,

I reproduced the issue with h5py and entered HDFFV-10026 (for your reference) into our JIRA database.

h5py reports: Unable to create attribute (Attribute creation index can't be incremented)

The h5debug tool reports a suspicious number of attributes for that dataset ;-) See below. We will need to investigate.

Thank you for your report!

Elena 

[volga:~] epourmal% h5debug precip.V1.0.2016.nc 2482
Reading signature at address 2482 (rel)
Object Header...
Dirty:                                             FALSE
Version:                                           2
Header size (in bytes):                            12
Number of links:                                   1
Attribute creation order tracked:                  Yes
Attribute creation order indexed:                  Yes
Attribute storage phase change values:             Default
Timestamps:                                        Disabled
Number of messages (allocated):                    7 (8)
Number of chunks (allocated):                      1 (2)
Chunk 0...
   Address:                                        2482
   Size in bytes:                                  263
   Gap:                                            0
Message 0...
   Message ID (sequence number):                   0x0001 `dataspace' (0)
   Dirty:                                          FALSE
   Message flags:                                  <none>
   Chunk number:                                   0
   Raw message data (offset, size) in chunk:       (14, 52) bytes
   Message Information:                           
      Rank:                                        3
      Dim Size:                                    {310, 120, 300}
      Dim Max:                                     {UNLIM, 120, 300}
Message 1...
   Message ID (sequence number):                   0x0005 `fill_new' (0)
   Dirty:                                          FALSE
   Message flags:                                  <C>
   Chunk number:                                   0
   Raw message data (offset, size) in chunk:       (72, 10) bytes
   Message Information:                           
      Space Allocation Time:                       Incremental
      Fill Time:                                   If Set
      Fill Value Defined:                          User Defined
      Size:                                        4
      Data type:                                   <dataset type>
Message 2...
   Message ID (sequence number):                   0x000b `filter pipeline' (0)
   Dirty:                                          FALSE
   Message flags:                                  <C>
   Chunk number:                                   0
   Raw message data (offset, size) in chunk:       (88, 22) bytes
   Message Information:                           
      Number of filters:                           2/2
      Filter at position 0                        
         Filter identification:                    0x0002
         Filter name:                              NONE
         Flags:                                    0x0001
         Num CD values:                            1
            CD value 0                             4
      Filter at position 1                        
         Filter identification:                    0x0001
         Filter name:                              NONE
         Flags:                                    0x0001
         Num CD values:                            1
            CD value 0                             2
Message 3...
   Message ID (sequence number):                   0x0008 `layout' (0)
   Dirty:                                          FALSE
   Message flags:                                  <C>
   Chunk number:                                   0
   Raw message data (offset, size) in chunk:       (116, 27) bytes
   Message Information:                           
      Version:                                     3
      Type:                                        Chunked
      Number of dimensions:                        4
      Size:                                        {1, 120, 300, 4}
      Index Type:                                  v1 B-tree
      B-tree address:                              17919
Message 4...
   Message ID (sequence number):                   0x0015 `ainfo' (0)
   Dirty:                                          FALSE
   Message flags:                                  <DS>
   Chunk number:                                   0
   Raw message data (offset, size) in chunk:       (149, 28) bytes
   Message Information:                           
      Number of attributes:                        18446744073709551615
      Track creation order of attributes:          TRUE
      Index creation order of attributes:          TRUE
      Max. creation index value:                   65535
      'Dense' attribute storage fractal heap address: 4671
      'Dense' attribute storage name index v2 B-tree address: 4817
      'Dense' attribute storage creation order index v2 B-tree address: 4855
Message 5...
   Message ID (sequence number):                   0x0003 `datatype' (0)
   Dirty:                                          FALSE
   Message flags:                                  <C>
   Chunk number:                                   0
   Raw message data (offset, size) in chunk:       (183, 20) bytes
   Message Information:                           
      Type class:                                  floating-point
      Size:                                        4 bytes
      Version:                                     1
      Byte order:                                  little endian
      Precision:                                   32 bits
      Offset:                                      0 bits
      Low pad type:                                zero
      High pad type:                               zero
      Internal pad type:                           zero
      Normalization:                               implied
      Sign bit location:                           31
      Exponent location:                           23
      Exponent bias:                               0x0000007f
      Exponent size:                               8
      Mantissa location:                           0
      Mantissa size:                               23
Message 6...
   Message ID (sequence number):                   0x0000 `null' (0)
   Dirty:                                          FALSE
   Message flags:                                  <none>
   Chunk number:                                   0
   Raw message data (offset, size) in chunk:       (209, 62) bytes
   Message Information:                           
      <No info for this message>
On Nov 14, 2016, at 4:10 PM, Dave Allured - NOAA Affiliate <[hidden email]> wrote:

HDF5 support,

My work group encounters rare failures when attempting to update collections of HDF5 files.  The usual symptom is that a previously valid file becomes partially broken, such that it becomes impossible to add new attributes to one particular data variable.  The file otherwise reads and writes normally.  Attributes can be added to other variables, and data can still be written to the suspect variable, including extending the unlimited dimension.

We normally operate on these files with the NetCDF-4 interface to HDF5, but I think I isolated the failure to pure HDF5 functionality.  You can examine one of these broken files here (9 MB):

ftp://ftp.cdc.noaa.gov/Public/dallured/hdf5/precip.V1.0.2016.nc

Diagnostics such as h5check, h5debug, h5dump, and h5stat all report this file as valid.  h5edit demonstrates the problem:

> ls -go precip.V1.0.2016.nc
-rw-r--r-- 1 9024497 Nov  9 15:02 precip.V1.0.2016.nc

> h5edit -c 'CREATE /precip/att12 { DATATYPE H5T_IEEE_F32LE DATASPACE SIMPLE ( 2 )  DATA { 777.0, 778.0 } } ;' precip.V1.0.2016.nc
failed to create attribute (null)
CREATE Attribute command failed

... and the data file is unchanged.  It does not matter whether the added attribute is scalar or array.

We are currently using HDF5 library version 1.8.16 in combination with NetCDF to read and write these data sets.  The version of h5edit above is 1.3.1.  The broken file behaves the same way on both Mac and Linux platforms, and with several NetCDF-based attribute writers in addition to h5edit.

We are investigating, but we have not yet found a good way to isolate the exact program or event that creates this broken file condition.  We suspect that NFS-connected file servers may contribute to the problem, but we have no strong evidence yet to back this up.

Today I have these questions:

1.  Can someone identify exactly what is wrong with the above sample file?

2.  Can h5check or another utility be updated to diagnose this condition?

3.  Has anyone else experienced partially broken files with this symptom?

Thank you for any insights.

--Dave A.
NOAA/OAR/ESRL/PSD/CIRES

_______________________________________________
Hdf-forum is for HDF software users discussion.
[hidden email]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5