Since we now depend on hdf5-json to do this testing, it might be a good idea to include hdf5-json's tests as a step in the CI
mattjala
left a comment
Besides a few minor comments and questions, this is good to go in. I'll try to get the outstanding PRs on hdf5-json reviewed this week so that we can avoid having HSDS depend on a specific branch.
hsds/group_sn.py
Outdated
created = link_item["created"]
# allow "pre-dated" attributes if recent enough
predate_max_time = config.get("predate_max_time", default=10.0)
if now - created > predate_max_time:
This comparison seems backwards. If I understand correctly, the difference between the current time and the creation time should be under the max time, not above it.
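A minimal sketch of the check as this comment reads it, assuming the intent is to accept a client-supplied created time only when it is at most predate_max_time seconds older than the current time. The helper name is_recent_enough and its signature are hypothetical and not taken from the PR; how the surrounding code in hsds/group_sn.py reacts to a failed check is not shown in the diff, so it is left out here.

```python
import time


def is_recent_enough(created, now=None, predate_max_time=10.0):
    """Return True if a client-supplied `created` timestamp is acceptable.

    Hypothetical helper, not part of HSDS: the timestamp passes when it is
    no more than `predate_max_time` seconds older than `now`.
    Timestamps in the future are not treated specially in this sketch.
    """
    if now is None:
        # fall back to the server's wall-clock time
        now = time.time()
    return now - created <= predate_max_time


if __name__ == "__main__":
    now = time.time()
    # 5 seconds old: within the default 10 second window
    print(is_recent_enough(now - 5.0, now=now))
    # 20 seconds old: outside the default 10 second window
    print(is_recent_enough(now - 20.0, now=now))
```

Under this reading, the branch in the diff (now - created > predate_max_time) would be the rejection path rather than the acceptance path, which may be what the code intends if the body of the if statement reports an error.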
@jreadey Linter issues are preventing CI from running on this right now
@jreadey Should we mark this as draft until it's in a final state for review?
Use h5json package for typing and objids
Important
Migrated HSDS to use the h5json library for core utilities, restructured utility modules, added support for client-provided object IDs and timestamps, and updated dependencies to require Python 3.10+ with h5json 1.0.0+.
Library Migration and Utility Restructuring
- Removed hsds/util/idUtil.py, hsds/util/timeUtil.py, hsds/util/hdf5dtype.py, and hsds/util/arrayUtil.py, as their functionality is now provided by h5json.
- Added hsds/util/nodeUtil.py with node ID generation, partitioning, and datanode URL resolution functions.
- Updated imports to the h5json equivalents (util.idUtil → h5json.objid, util.timeUtil → h5json.time_util).
Object ID and Timestamp Handling
- Added support for client-provided object IDs in POST_Dataset, POST_Group, POST_Datatype, and related functions in dset_dn.py, group_dn.py, ctype_dn.py, and dset_sn.py.
- Added the max_timestamp_drift configuration parameter to validate client-provided timestamps in attr_dn.py, link_dn.py, and related modules, with fallback to server-generated timestamps when skew exceeds the threshold.
- Updated handling of the deleted_ids set when creating new objects with the same ID.
Configuration and Dependencies
- Added default_vlen_type_size, predate_maxtime, posix_delay, max_compact_dset_size, and max_timestamp_drift to admin/config/config.yml.
- Updated pyproject.toml to require Python 3.10+, add h5json 1.0.0+, update numpy to 2.0.0+, and constrain numcodecs to ≤0.15.1.
- Updated .github/workflows/python-package.yml.
API and Function Refactoring
- Refactored the POST_Dataset, POST_Group, and POST_Datatype handlers to support batch creation of multiple objects using new helper functions (createDatasets, createGroups, createDatatypeObjs) and DomainCrawler for writing initial data.
- Changed getChunkLayout calls to getChunkDims throughout the codebase; moved layout from the top-level response to nested under creationProperties.
- Renamed h5domain to file in link_dn.py, link_sn.py, and servicenode_lib.py; added per-link timestamp validation in PUT_Links.
- Allowed POST_Dataset, POST_Group, and POST_Datatype to create objects with initial content instead of always creating empty objects.
New Functionality
- Added hsds/post_crawl.py with a PostCrawler class for asynchronously creating multiple HDF5 objects with configurable worker count and error handling.
- Added a getConsolidatedMetaData function in async_lib.py to create consolidated metadata summaries for all objects in a domain.
- Added a put_data method to DomainCrawler for writing one-chunk dataset values; added doPointWrite and doHyperslabWrite functions in dset_lib.py for writing point and hyperslab selections.
- Added a getobjs parameter to the getDomainResponse function to optionally return domain objects from the S3 summary file.
Bug Fixes and Improvements
- Changed HTTPInternalServerError to HTTPBadRequest for duplicate object IDs and invalid configurations in ctype_dn.py, dset_dn.py, and group_dn.py.
- Added posix_delay configuration support to fileClient.py for simulating cloud storage latencies in the get_object, put_object, and list_keys methods.
- Updated HSDS_VERSION from 0.9.2 to 1.0.0 in basenode.py.
Test Updates
- Added tests for client-provided IDs (testPostDatasetWithId, testPostTypeWithId, testPostWithId), attribute initialization (testPostDatasetWithAttributes, testPostWithAttributes), timestamp handling (testUseTimestamp), and batch creation (testPostMulti, testDatasetPostMulti).
- Updated tests to expect layout under creationProperties instead of the top level; removed CHUNK_MIN/CHUNK_MAX constants and moved them to local scope; updated external link tests to use the file field instead of h5domain.
- Removed array_util_test.py, hdf5_dtype_test.py, and id_util_test.py, as their functionality is now tested through the h5json library.
- Updated tests to use h5json functions (createObjId, getFilterItem) instead of local utilities.
This description was created for 2bafb51.