Upgrading to v2.0 API
Here is a list of recommended ways to work around API changes.
To detect whether you're building against hwloc 2.0.0 or later:
#if HWLOC_API_VERSION >= 0x20000
...
#endif
If using official releases, the library soname ensures you're using the right library (12 instead of 5). If using unofficial or nightly tarballs, you may also check the runtime API version:
#include <hwloc.h>
#if HWLOC_API_VERSION >= 0x00020000
/* headers are recent */
if (hwloc_get_api_version() < 0x20000)
... error out, the hwloc runtime library is older than 2.0 ...
#else
/* headers are pre-2.0 */
if (hwloc_get_api_version() >= 0x20000)
... error out, the hwloc runtime library is more recent than 2.0 ...
#endif
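The two branches above perform the same test: the headers and the runtime library must be on the same side of the 2.0 boundary. A minimal sketch of that comparison as a plain function follows. The names `API_VERSION_2_0` and `same_api_generation` are hypothetical and not part of hwloc; the function only compares two version numbers passed in by the caller.

```c
#include <stdbool.h>

/* Hypothetical constant, not part of hwloc: the first API version of the
 * 2.x series, i.e. the same value tested in the preprocessor checks above. */
#define API_VERSION_2_0 0x00020000u

/* Hypothetical helper, not part of hwloc: returns true when the header
 * version and the runtime version are both pre-2.0 or both 2.0-or-later. */
bool same_api_generation(unsigned header_version, unsigned runtime_version)
{
    bool headers_are_v2 = header_version >= API_VERSION_2_0;
    bool runtime_is_v2 = runtime_version >= API_VERSION_2_0;
    return headers_are_v2 == runtime_is_v2;
}
```

In a program that includes hwloc.h, this would presumably be called as same_api_generation(HWLOC_API_VERSION, hwloc_get_api_version()), erroring out when it returns false.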
NUMA nodes are not in the main tree anymore. They are attached under objects as memory children. This list starts at obj->memory_first_child and its size is obj->memory_arity. Hence there can now exist two local NUMA nodes, for instance on KNL.
The normal list of children (starting at obj->first_child, ending at obj->last_child, of size obj->arity, and available as the array obj->children) now only contains CPU-side objects: PUs, Cores, Packages, Caches, Groups, Machine and System. hwloc_get_next_child() may still be used to iterate over all children of all lists.
Hence there is a CPU hierarchy using normal children, while memory is attached to that hierarchy depending on its affinity.
For instance:
- a machine with 2 packages but a single NUMA node is now modeled as a "Machine" object with two "Package" children and one "NUMANode" memory child (displayed first in lstopo below).
Machine (1024MB total)
  NUMANode L#0 (P#0 1024MB)
  Package L#0
    Core L#0 + PU L#0 (P#0)
    Core L#1 + PU L#1 (P#1)
  Package L#1
    Core L#2 + PU L#2 (P#2)
    Core L#3 + PU L#3 (P#3)
- a machine with 2 packages, each with one NUMA node and 2 cores, is now:
Machine (2048MB total)
  Package L#0
    NUMANode L#0 (P#0 1024MB)
    Core L#0 + PU L#0 (P#0)
    Core L#1 + PU L#1 (P#1)
  Package L#1
    NUMANode L#1 (P#1 1024MB)
    Core L#2 + PU L#2 (P#2)
    Core L#3 + PU L#3 (P#3)
- if there are two NUMA nodes per package:
Machine (4096MB total)
  Package L#0
    Group0 L#0
      NUMANode L#0 (P#0 1024MB)
      Core L#0 + PU L#0 (P#0)
      Core L#1 + PU L#1 (P#1)
    Group0 L#1
      NUMANode L#1 (P#1 1024MB)
      Core L#2 + PU L#2 (P#2)
      Core L#3 + PU L#3 (P#3)
  Package L#1
    [...]
- In practice, there's usually an L3 instead of the above Group:
Machine (4096MB total)
  Package L#0
    L3 L#0 (16MB)
      NUMANode L#0 (P#0 1024MB)
      Core L#0 + PU L#0 (P#0)
      Core L#1 + PU L#1 (P#1)
    L3 L#1 (16MB)
      NUMANode L#1 (P#1 1024MB)
      Core L#2 + PU L#2 (P#2)
      Core L#3 + PU L#3 (P#3)
  Package L#1
    [...]
Functions for iterating over the level of NUMA nodes still work fine.
However, applications that ever walked up/down to find NUMANode parent/children must now be updated. For instance, finding a NUMANode parent should be replaced with finding a parent that has a memory child, and using that child.
It is still possible to look at a nodeset and then iterate over NUMA nodes whose nodeset is included.
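The "find a parent that has a memory child, and use that child" rule can be sketched as a short upward walk. To keep the sketch self-contained, it uses a stand-in structure, `struct mock_obj`, which is hypothetical and carries only the parent, memory_first_child and memory_arity fields; with real hwloc, the same loop would presumably operate on hwloc_obj_t. The function name `first_local_numanode` is also an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for hwloc's object structure, reduced to the
 * three fields this walk needs. Real code would use hwloc_obj_t. */
struct mock_obj {
    struct mock_obj *parent;             /* NULL for the root object */
    struct mock_obj *memory_first_child; /* first memory child, or NULL */
    unsigned memory_arity;               /* number of memory children */
};

/* Hypothetical helper: walk up from obj (inclusive) to the closest object
 * that has memory children and return its first memory child.
 * Returns NULL if no object on the way to the root has memory children. */
struct mock_obj *first_local_numanode(struct mock_obj *obj)
{
    while (obj != NULL && obj->memory_arity == 0)
        obj = obj->parent;
    return obj != NULL ? obj->memory_first_child : NULL;
}
```

Starting from a Core in the two-package example above, the walk stops at the enclosing Package and returns its NUMANode. When memory_arity is greater than one, as on KNL, the returned object is only the first of several local NUMA nodes.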
I/O children are not in the main object children list anymore either. They are in the list starting at obj->io_first_child and whose size is obj->io_arity.
Misc children are not in the main object children list anymore. They are in the list starting at obj->misc_first_child and whose size is obj->misc_arity.
hwloc_get_next_child() may still be used to iterate over all children of all lists.
Instead of a single HWLOC_OBJ_CACHE, there are now 8 types: HWLOC_OBJ_L1CACHE, ..., HWLOC_OBJ_L5CACHE, HWLOC_OBJ_L1ICACHE, ..., HWLOC_OBJ_L3ICACHE. Cache object attributes are unchanged.
hwloc_get_cache_type_depth() is not really needed to disambiguate cache types anymore since new types can be passed to hwloc_get_type_depth() without ever getting HWLOC_TYPE_DEPTH_MULTIPLE anymore.
hwloc_obj_type_is_cache(), hwloc_obj_type_is_dcache() and hwloc_obj_type_is_icache() may be used to check whether a given type is a cache, data/unified cache or instruction cache.
Objects do not have allowed_cpuset and allowed_nodeset anymore. They are only available for the entire topology using hwloc_topology_get_allowed_cpuset() and hwloc_topology_get_allowed_nodeset().
As usual, those are only needed when the WHOLE_SYSTEM topology flag is given, which means disallowed objects are kept in the topology. If so, you may find out whether some PUs inside an object are allowed by checking hwloc_bitmap_intersects(obj->cpuset, hwloc_topology_get_allowed_cpuset(topology)). Replace cpusets with nodesets for NUMA nodes. To find out which ones, replace intersects() with and() to get the actual intersection.
obj->depth as well as depth given to functions such as hwloc_get_obj_by_depth() or returned by hwloc_topology_get_depth() are now signed int.
Other depths, such as the cache-specific depth attribute, are still unsigned.
Memory attributes such as obj->memory.local_memory are now only available in NUMANode-specific attributes in obj->attr->numanode.local_memory.
The exception is obj->memory.total_memory, which is still available in all objects as obj->total_memory.
hwloc_topology_ignore_type(), hwloc_topology_ignore_type_keep_structure() and hwloc_topology_ignore_all_keep_structure() replaced
Respectively superseded by
hwloc_topology_set_type_filter(topology, type, HWLOC_TYPE_FILTER_KEEP_NONE);
hwloc_topology_set_type_filter(topology, type, HWLOC_TYPE_FILTER_KEEP_STRUCTURE);
hwloc_topology_set_all_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_STRUCTURE);
Also, the meaning of KEEP_STRUCTURE has changed: only entire levels may be ignored, instead of single objects. The old behavior is not available anymore.
Superseded by
hwloc_topology_set_icache_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_ALL);
HWLOC_TOPOLOGY_FLAG_WHOLE_IO, HWLOC_TOPOLOGY_FLAG_IO_DEVICES and HWLOC_TOPOLOGY_FLAG_IO_BRIDGES replaced
To keep all I/O devices (PCI, Bridges, and OS devices), use:
hwloc_topology_set_io_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_ALL);
To only keep important devices (Bridges with children, common PCI devices and OS devices):
hwloc_topology_set_io_types_filter(topology, HWLOC_TYPE_FILTER_KEEP_IMPORTANT);
2.0 can load 1.x files, with some caveats:
- Only NUMA-distances are imported. Other distance matrices are ignored (they were never used by default anyway).
2.0 can export 1.x-compatible files, with some caveats:
- Only distances attached to the root object are exported (i.e. distances that cover the entire machine). Other distance matrices are dropped (they were never used by default anyway).
Users are advised to negotiate hwloc versions between exporter and importer: if the importer isn't 2.x, the exporter should export to 1.x. Otherwise, things should work by default. See below.
Flags may be used to force a hwloc-1.x-compatible XML export. This should be negotiated/detected between the importer and the exporter processes so that the most recent XML format is used:
- If both always support 2.0, don't pass any flag.
- When the importer uses hwloc 1.x, export with HWLOC_TOPOLOGY_EXPORT_XML_FLAG_V1. Otherwise the importer will fail to import.
- When the exporter uses hwloc 1.x, a 2.0 importer can import without problem.
#if HWLOC_API_VERSION >= 0x20000
if (need 1.x compatible XML export)
hwloc_topology_export_xml(...., HWLOC_TOPOLOGY_EXPORT_XML_FLAG_V1);
else /* need 2.x compatible XML export */
hwloc_topology_export_xml(...., 0);
#else
hwloc_topology_export_xml(....);
#endif
hwloc_topology_diff_load_xml(), hwloc_topology_diff_load_xmlbuffer(), hwloc_topology_diff_export_xml(), hwloc_topology_diff_export_xmlbuffer() and hwloc_topology_diff_destroy() lost the topology argument
The first argument (topology) isn't needed anymore.
Now in hwloc/distances.h
Distances are not available in objects anymore. One should first call hwloc_distances_get() (or a variant) to retrieve distances (possibly with one call to get the number of available distances structures, and another call to actually get them). One may then consult these structures, and finally release them.
The set of objects involved in a distances structure is specified by an array of objects; it may not always cover the entire machine.
Most bitmap functions may have to reallocate the internal bitmap storage. In v1.x, they would silently crash if realloc failed. In v2.0, they now return an int that can be negative on error. However, the preallocated storage is 512 bits, hence realloc will not even be used unless you run hwloc on machines with larger PU or NUMA node indexes.
hwloc_obj_add_info(), hwloc_cpuset_from_nodeset() and hwloc_nodeset_to_cpuset() also return an int, which would be -1 in case of allocation errors.
Some functions moved to hwloc/export.h or hwloc/distances.h, but those files are still auto-included by hwloc.h.
hwloc_obj_snprintf() removed because long-deprecated by hwloc_obj_type_snprintf() and hwloc_obj_attr_snprintf().
hwloc_type_sscanf() extends hwloc_obj_type_sscanf() by passing a union hwloc_obj_attr_u which may receive cache, group, bridge or OS device attributes.
hwloc_type_sscanf_as_depth() is also added to directly return the corresponding level depth within a topology.
Removed, deprecated by hwloc_distrib()
hwloc_topology_insert_misc_object_by_cpuset() and hwloc_topology_insert_misc_object_by_parent() replaced
hwloc_topology_insert_misc_object_by_cpuset() is replaced with hwloc_topology_alloc_group_object() and hwloc_topology_insert_group_object().
hwloc_topology_insert_misc_object_by_parent() is replaced with hwloc_topology_insert_misc_object().
Use the variant without _nodeset suffix and pass the new HWLOC_MEMBIND_BYNODESET flag
Now useless since all topologies are NUMA. Use the variant without the _strict suffix
The root object is always HWLOC_OBJ_MACHINE
Not available anymore (no supported operating system supports it).
hwloc_topology_set_custom(), hwloc_custom_insert_topology() and hwloc_custom_insert_group_object_by_parent() were removed from the API.
The corresponding hwloc-assembler and hwloc-assembler-remote command-line tools were also removed.
The custom interface is not available anymore. Topologies always start with objects that have valid cpusets and nodesets.
The field has been removed from hwloc_obj_t. Offline PUs are simply listed in the complete_cpuset, as before.
The object field has been removed.