add support in init script for using 'archdetect' alternative to archspec (only if $EESSI_USE_ARCHDETECT is set to '1') #187

Merged
merged 57 commits into from
Nov 8, 2022
Changes from 4 commits
Commits
9d06a89
alternative to detect cpu arch for path
hmeiland Sep 15, 2022
898b4d5
adding support for amd detection
hmeiland Sep 15, 2022
46e0c07
document example lscpu outputs which is input to archdetect
hmeiland Sep 15, 2022
eb9f891
add arm ampere altra lscpu example
hmeiland Sep 15, 2022
2c561a4
allow overrides for unittesting and make cpu_vendor more readable
hmeiland Sep 15, 2022
0dd9a2a
adding ref files for lscpu for unit tests
hmeiland Sep 15, 2022
c51613d
correct variable for positive check
hmeiland Sep 15, 2022
d318874
add testcases for intel cpuinfo
hmeiland Sep 15, 2022
978e16a
add testcases for intel and amd cpuinfo
hmeiland Sep 15, 2022
4a9fc68
add power9 detection
hmeiland Sep 15, 2022
a7fc80a
add power9 test
hmeiland Sep 15, 2022
636ef94
update matrix
hmeiland Sep 15, 2022
78f76ff
remove matrix
hmeiland Sep 15, 2022
09d822d
remove matrix
hmeiland Sep 15, 2022
e821c4e
code cleanup
hmeiland Sep 15, 2022
c7d3e2b
code cleanup
hmeiland Sep 15, 2022
36e8fa4
code cleanup
hmeiland Sep 15, 2022
e796e4f
code cleanup
hmeiland Sep 15, 2022
572143e
code cleanup
hmeiland Sep 15, 2022
b3c9b13
add aarch64 cpuinfo test data
hmeiland Sep 15, 2022
e8341a6
add unset for feature vars
hmeiland Sep 15, 2022
4914394
add CI workflow to test cpu_archdetect.yml
boegel Sep 15, 2022
0aab86f
ignore case in cpuinfo to allow CentOS & Ubuntu
hmeiland Sep 15, 2022
3ce8a01
re-order power and add aarch64 detection
hmeiland Sep 16, 2022
30925e6
add aarch64 tests
hmeiland Sep 16, 2022
e42e0a6
move to features for aarch64
hmeiland Sep 16, 2022
51ddaa7
add support for aarch64 neoverse-n1
hmeiland Sep 16, 2022
38194b8
add support for aarch64 neoverse-n1
hmeiland Sep 16, 2022
d891813
add support for aarch64 neoverse-n1
hmeiland Sep 16, 2022
0cd0af4
add architecture-cpu comments
hmeiland Sep 16, 2022
28c7b23
re-order aarch64 output and remove graviton
hmeiland Sep 16, 2022
ed42876
re-order aarch64 output and remove graviton
hmeiland Sep 16, 2022
d0fdf4b
remove graviton from test
hmeiland Sep 16, 2022
2899960
move from /bin/bash to /usr/bin/env bash
hmeiland Sep 16, 2022
bbac252
synch output for archdetect / archspec
hmeiland Sep 16, 2022
a5b4e8e
rewrite by Alex Domingo (lexming)
hmeiland Sep 19, 2022
0d1daf4
move test to new code
hmeiland Sep 19, 2022
ce2daca
redirecting output to err, keeping final echo
hmeiland Sep 19, 2022
1d71095
adding power to arch_detect
hmeiland Sep 19, 2022
70be449
prep arch_detect for other functions, now: cpupath
hmeiland Sep 19, 2022
a797d6d
renew tests schema
hmeiland Sep 20, 2022
8726b40
allow multiple tests scenarios
hmeiland Sep 20, 2022
037abce
allow multiple tests scenarios
hmeiland Sep 20, 2022
a31ce0e
allow multiple tests scenarios
hmeiland Sep 20, 2022
95ae920
renewing tests
hmeiland Sep 20, 2022
c3329d5
renew tests
hmeiland Sep 20, 2022
4e2033e
roll-over to new code for eessi_archdetect.sh
hmeiland Sep 20, 2022
3eddc78
change hardcoded bash
hmeiland Sep 20, 2022
818ecad
move CPU arch specifications to separate spec files
lexming Sep 21, 2022
6726288
fix execution from any location in the file system
lexming Sep 21, 2022
d4d9681
fix log messages being printed in standard output
lexming Sep 21, 2022
24767eb
Merge pull request #2 from lexming/feature-archdetect
hmeiland Sep 21, 2022
50b46ca
standardize log formatting by adding logging function
lexming Sep 26, 2022
c74bb8e
add version tag and option to print it
lexming Sep 26, 2022
025da07
Merge pull request #3 from lexming/feature-archdetect
hmeiland Sep 30, 2022
806e1fb
be more specific on skylake_avx512 to avoid KNL confusion
hmeiland Oct 2, 2022
2d77d74
Update init/eessi_environment_variables
hmeiland Nov 8, 2022
96 changes: 96 additions & 0 deletions init/archdetect.md
@@ -0,0 +1,96 @@
# archdetect

Bash-based script that detects the host CPU architecture and prints a `cpupath` (for example `x86_64/amd/zen2`), which selects the matching optimized software stack.


Example `lscpu` outputs:

# Zen2
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 120
On-line CPU(s) list: 0-119
Thread(s) per core: 1
Core(s) per socket: 60
Socket(s): 2
NUMA node(s): 4
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD EPYC 7V12 64-Core Processor
Stepping: 0
CPU MHz: 2445.424
BogoMIPS: 4890.84
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 16384K
NUMA node0 CPU(s): 0-29
NUMA node1 CPU(s): 30-59
NUMA node2 CPU(s): 60-89
NUMA node3 CPU(s): 90-119
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm art rep_good nopl extd_apicid aperfmperf eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext retpoline_amd ssbd vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat umip

# Zen3
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 120
On-line CPU(s) list: 0-119
Thread(s) per core: 1
Core(s) per socket: 60
Socket(s): 2
NUMA node(s): 4
Vendor ID: AuthenticAMD
CPU family: 25
Model: 1
Model name: AMD EPYC 7V73X 64-Core Processor
Stepping: 2
CPU MHz: 1846.550
BogoMIPS: 3693.10
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 98304K
NUMA node0 CPU(s): 0-29
NUMA node1 CPU(s): 30-59
NUMA node2 CPU(s): 60-89
NUMA node3 CPU(s): 90-119
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm art rep_good nopl extd_apicid aperfmperf eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core invpcid_single retpoline_amd vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr arat umip vaes vpclmulqdq

# Arm Ampere Altra
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 1
Core(s) per socket: 64
Socket(s): 1
NUMA node(s): 1
Vendor ID: ARM
Model: 1
Model name: Neoverse-N1
Stepping: r3p1
BogoMIPS: 50.00
L1d cache: 4 MiB
L1i cache: 4 MiB
L2 cache: 64 MiB
L3 cache: 32 MiB
NUMA node0 CPU(s): 0-63
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Mitigation; CSV2, BHB
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
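The sample outputs above are plain `Key: value` text, so the fields relevant for detection (vendor, model name, flags) can be pulled out with standard tools. The following is a minimal sketch, not the code from this pull request: the helper names `lscpu_field` and `lscpu_has_flag` are hypothetical, and it assumes the text is supplied on standard input in the same layout as the samples.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the value of one "Key: value" field
# from lscpu-style text read on stdin (first match only).
lscpu_field () {
    local key="$1"
    # Tolerate leading indentation, then strip "Key:" and the padding after it.
    sed -n "s/^[[:space:]]*${key}:[[:space:]]*//p" | head -n 1
}

# Hypothetical helper: succeed if the given flag appears as a whole word
# in the "Flags" field of lscpu-style text read on stdin.
lscpu_has_flag () {
    local flag="$1" flags
    flags=$(lscpu_field "Flags")
    # Pad with spaces so that "avx" does not match "avx2".
    [[ " ${flags} " == *" ${flag} "* ]]
}

# Shortened excerpt of the Ampere Altra sample, used as demo input.
sample='Architecture: aarch64
Vendor ID: ARM
Model name: Neoverse-N1
Flags: fp asimd evtstrm aes pmull'

lscpu_field "Vendor ID" <<< "$sample"
lscpu_has_flag asimd <<< "$sample" && echo "asimd present"
```

Reading from stdin rather than calling `lscpu` directly makes it possible to run the helpers against stored reference outputs such as the ones documented above.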
74 changes: 74 additions & 0 deletions init/eessi_archdetect.sh
@@ -0,0 +1,74 @@
#!/bin/bash

# current pathways implemented in EESSI
# x86_64/generic
# x86_64/intel/haswell
# x86_64/intel/skylake_avx512
# x86_64/amd/zen2
# x86_64/amd/zen3
# aarch64/generic
# ppc64le/generic
# ppc64le/power9le

ARGUMENT=${1:-none}

cpupath () {
# let the kernel tell base machine type
MACHINE_TYPE=$(uname -m)

# fallback path
CPU_PATH="${MACHINE_TYPE}/generic"

if [ ${MACHINE_TYPE} == "aarch64" ]; then
echo ${CPU_PATH}
exit
fi

if [ ${MACHINE_TYPE} == "ppc64le" ]; then
# diagnostics go to stderr so that stdout carries only the path
echo "not sure what to do next..." >&2
echo "please mail output of lscpu to me..." >&2
echo ${CPU_PATH}
exit
fi

if [ ${MACHINE_TYPE} == "x86_64" ]; then
# check for vendor info, if available, for x86_64
CPUINFO_VENDOR_FLAG=$(grep -m 1 ^vendor_id /proc/cpuinfo)
[[ $CPUINFO_VENDOR_FLAG =~ GenuineIntel ]] && CPU_VENDOR=intel
[[ $CPUINFO_VENDOR_FLAG =~ AuthenticAMD ]] && CPU_VENDOR=amd

# pad the flags line with spaces so that each flag is matched as a whole word
# (a bare regex such as 'avx2*' would also match plain 'avx')
CPU_FLAGS=" $(grep -m 1 ^flags /proc/cpuinfo) "
[[ $CPU_FLAGS == *" avx2 "* ]] && HAS_AVX2=true
[[ $CPU_FLAGS == *" fma "* ]] && HAS_FMA=true
[[ $CPU_FLAGS == *" avx512f "* ]] && HAS_AVX512F=true
[[ $CPU_FLAGS == *" avx512vl "* ]] && HAS_AVX512VL=true
[[ $CPU_FLAGS == *" avx512ifma "* ]] && HAS_AVX512IFMA=true
[[ $CPU_FLAGS == *" avx512_vbmi2 "* ]] && HAS_AVX512_VBMI2=true
[[ $CPU_FLAGS == *" avx512_vnni "* ]] && HAS_AVX512_VNNI=true
[[ $CPU_FLAGS == *" avx512fp16 "* ]] && HAS_AVX512FP16=true
[[ $CPU_FLAGS == *" vaes "* ]] && HAS_VAES=true

[[ ${CPU_VENDOR} == "intel" ]] && [[ ${HAS_AVX2} ]] && [[ ${HAS_FMA} ]] && CPU_TYPE=haswell
[[ ${CPU_VENDOR} == "intel" ]] && [[ ${HAS_AVX512F} ]] && CPU_TYPE=skylake_avx512
# [[ ${HAS_AVX512IFMA} ]] && [[ ${HAS_AVX512_VBMI2} ]] && CPU_TYPE=icelake_avx512
# [[ ${HAS_AVX512_VNNI} ]] && [[ ${HAS_AVX512VL} ]] && [[ ${HAS_AVX512FP16} ]] && CPU_TYPE=sapphire_rapids_avx512

[[ ${CPU_VENDOR} == "amd" ]] && [[ ${HAS_AVX2} ]] && [[ ${HAS_FMA} ]] && CPU_TYPE=zen2
[[ ${CPU_VENDOR} == "amd" ]] && [[ ${HAS_AVX2} ]] && [[ ${HAS_FMA} ]] && [[ ${HAS_VAES} ]] && CPU_TYPE=zen3

[[ ${CPU_VENDOR} ]] && [[ $CPU_TYPE ]] && CPU_PATH="${MACHINE_TYPE}/${CPU_VENDOR}/${CPU_TYPE}"

echo ${CPU_PATH}
exit
fi

echo "should not see this...something weird going on..." >&2
echo "please mail output of lscpu to me..." >&2
}

if [ ${ARGUMENT} == "none" ]; then
echo usage: $0 cpupath
exit
elif [ ${ARGUMENT} == "cpupath" ]; then
echo $(cpupath)
exit
fi
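The selection rules in `cpupath` can be exercised without touching `/proc/cpuinfo` if the inputs are passed in explicitly. The sketch below restates those rules under that assumption. The function names `has_flag` and `cpupath_for` and the three-argument interface are hypothetical and not part of this pull request. Only the paths listed at the top of the script are produced.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if flag $1 is a whole word in the
# space-separated flag list $2.
has_flag () {
    [[ " $2 " == *" $1 "* ]]
}

# Hypothetical function: map machine type, vendor_id and CPU flags to a cpupath.
# Arguments: $1 = output of `uname -m`, $2 = vendor_id, $3 = flag list.
cpupath_for () {
    local machine="$1" vendor="$2" flags="$3"
    local cpu_vendor="" cpu_type=""

    case "$vendor" in
        GenuineIntel) cpu_vendor=intel ;;
        AuthenticAMD) cpu_vendor=amd ;;
    esac

    if [[ "$machine" == "x86_64" ]]; then
        if [[ "$cpu_vendor" == "intel" ]]; then
            # AVX-512F takes precedence over the AVX2+FMA (haswell) match.
            if has_flag avx512f "$flags"; then
                cpu_type=skylake_avx512
            elif has_flag avx2 "$flags" && has_flag fma "$flags"; then
                cpu_type=haswell
            fi
        elif [[ "$cpu_vendor" == "amd" ]]; then
            if has_flag avx2 "$flags" && has_flag fma "$flags"; then
                cpu_type=zen2
                # VAES on top of AVX2+FMA is treated as zen3.
                if has_flag vaes "$flags"; then
                    cpu_type=zen3
                fi
            fi
        fi
    fi

    # Fall back to the generic path when vendor or type is unknown.
    if [[ -n "$cpu_vendor" && -n "$cpu_type" ]]; then
        echo "${machine}/${cpu_vendor}/${cpu_type}"
    else
        echo "${machine}/generic"
    fi
}

cpupath_for x86_64 AuthenticAMD "fpu sse avx avx2 fma vaes"
```

Keeping detection (reading the host) separate from selection (mapping inputs to a path) is one way to test the mapping against stored reference data for CPUs that are not at hand.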
12 changes: 9 additions & 3 deletions init/eessi_environment_variables
@@ -25,9 +25,15 @@ if [ -d $EESSI_PREFIX ]; then
if [ -d $EESSI_EPREFIX ]; then

# determine subdirectory in software layer
# note: eessi_software_subdir_for_host.py will pick up value from $EESSI_SOFTWARE_SUBDIR_OVERRIDE if it's defined!
export EESSI_EPREFIX_PYTHON=$EESSI_EPREFIX/usr/bin/python3
export EESSI_SOFTWARE_SUBDIR=$($EESSI_EPREFIX_PYTHON ${EESSI_INIT_DIR_PATH}/eessi_software_subdir_for_host.py $EESSI_PREFIX)
if [ "$EESSI_USE_ARCHDETECT" == "1" ]; then
# if archdetect is enabled, use internal code
export EESSI_SOFTWARE_SUBDIR=$(${EESSI_INIT_DIR_PATH}/eessi_archdetect.sh cpupath)
echo "archdetect says ${EESSI_SOFTWARE_SUBDIR}"
else
# note: eessi_software_subdir_for_host.py will pick up value from $EESSI_SOFTWARE_SUBDIR_OVERRIDE if it's defined!
export EESSI_EPREFIX_PYTHON=$EESSI_EPREFIX/usr/bin/python3
export EESSI_SOFTWARE_SUBDIR=$($EESSI_EPREFIX_PYTHON ${EESSI_INIT_DIR_PATH}/eessi_software_subdir_for_host.py $EESSI_PREFIX)
fi
if [ ! -z $EESSI_SOFTWARE_SUBDIR ]; then

echo "Using ${EESSI_SOFTWARE_SUBDIR} as software subdirectory." >> $output
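Per the pull request title, archdetect is meant to be used only when `$EESSI_USE_ARCHDETECT` is set to `1`, with archspec remaining the default. The opt-in check can be sketched in isolation as follows. The function name `eessi_detection_backend` is hypothetical and exists only to make the branch testable.

```shell
#!/usr/bin/env bash

# Hypothetical helper: report which detection backend would be selected.
# Quoting the variable keeps the comparison valid when it is unset or empty.
eessi_detection_backend () {
    if [ "${EESSI_USE_ARCHDETECT:-}" = "1" ]; then
        echo "archdetect"
    else
        echo "archspec"
    fi
}

# Opt in for a single invocation.
EESSI_USE_ARCHDETECT=1 eessi_detection_backend
```

Comparing against the literal `1` instead of testing for emptiness means that values such as `0` or an empty string leave the archspec path in effect.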