A chronological collection of things I’ve found interesting.
2025-11-26
- Why Systolic Architectures? (PDF, 2.4 MB)
- Flang Documentation
- The Practitioner’s Cookbook for Good Parallel Performance on Multi- and Many-Core Systems | RRZE (PDF, 7.1 MB)
- Performance Analysis of the Apple AMX Matrix Accelerator | Jonathan Zhou (PDF, 1777 KB)
- Counting cycles and instructions on ARM-based Apple systems | Daniel Lemire
- Apple Firestorm/Icestorm CPU microarchitecture docs
- Finding and evaluating AMX co-processors in Apple silicon chips
- Apple vs. Oranges: Evaluating the Apple Silicon M-Series SoCs for HPC Performance and Efficiency
- A64 SIMD Instruction List: SVE Instructions
- High Performance Computing Class | FSU Jena
- Designing a SIMD Algorithm from Scratch | mcyoung
2025-11-20
- Comparing OpenBLAS and Accelerate on Apple Silicon for BLAS Routines | Frank Rosner
- Benchmarking and Testing | Tyler Sean Rau
- OpenMP* SIMD for Inclusive/Exclusive Scans
- Effiziente Nutzung von Hochleistungsrechnern in der numerischen Strömungsmechanik [Efficient Use of High-Performance Computers in Numerical Fluid Mechanics] | Dr. Georg Hager (PDF, 599 KB)
- Automatic Translation of FORTRAN Programs to Vector Form | Randy Allen and Ken Kennedy (PDF, 2.8 MB)
- Bik, A. J., Tian, X., & Girkar, M. B. (2006). Multimedia vectorization of floating‐point MIN/MAX reductions. Concurrency and Computation: Practice and Experience, 18(9), 997-1007. https://doi.org/10.1002/cpe.1009
- Vectorization Essentials | Intel Software (PDF, 1913 KB)
- Karp, A. H., & Babb, R. B. (1988). A comparison of 12 parallel Fortran dialects. IEEE Software, 5(5), 52-67. https://doi.org/10.1109/52.7943
- Oraji, Y. M., Hück, A., & Bischof, C. (2025, November). Extending MPI Correctness Benchmarking to the Fortran Language. In Proceedings of the SC'25 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis (pp. 244-248). https://doi.org/10.1145/3731599.3767366
- Investigating the performance of LLVM-based Intel Fortran Compiler (ifx) | Dhani Ruhela (PDF, 596 KB)
- Rouson, D., Dibba, B., Rasmussen, K., Richardson, B., Torres, D., Zhang, Y., … & Shende, S. (2024). Just Write Fortran: Experiences with a Language-Based Alternative to MPI+X. https://doi.org/10.25344/S4H88D