Over the years, the merits of Intel’s 512-bit Advanced Vector eXtensions (AVX-512) have been extensively debated. Although it was introduced in 2014, CPUs took time to offer robust support. In parallel, Arm’s Scalable Vector Extension (SVE), targeting Arm servers, has gained traction only recently. Today, the landscape has shifted significantly, with Intel’s Sapphire Rapids CPUs on one flank and the AWS Graviton 3 and Ampere Altra chips on the other. Here are compelling reasons to opt for these over the traditional AVX2 and NEON extensions:

1. Masked Loads: Efficient data processing by selectively loading data.
2. Half-Precision Floating Point Math: Accelerated computations with reduced memory footprint.
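To make the masked-load point concrete, here is a minimal scalar emulation in plain Python. It is only a sketch of the idea, not SimSIMD's code: `LANES`, `masked_load`, and `dot_masked` are hypothetical names, and a real kernel would use the hardware's masked-load instruction instead of a list comprehension. The point it illustrates is that the last partial chunk goes through the same loop body, so no separate scalar tail loop is needed.

```python
LANES = 16  # assumed register width: 16 lanes of 32-bit floats in a 512-bit register


def masked_load(data, offset, lanes=LANES):
    """Emulate a masked load: lanes that fall past the end of `data` read as 0.0."""
    remaining = len(data) - offset
    mask = [i < remaining for i in range(lanes)]
    return [data[offset + i] if active else 0.0 for i, active in enumerate(mask)]


def dot_masked(a, b, lanes=LANES):
    """Inner product with a single loop; the mask covers the final partial chunk."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    acc = [0.0] * lanes  # one running sum per lane
    for offset in range(0, len(a), lanes):
        va = masked_load(a, offset, lanes)
        vb = masked_load(b, offset, lanes)
        acc = [s + x * y for s, x, y in zip(acc, va, vb)]
    return sum(acc)  # horizontal reduction across lanes
```

With 19-element inputs, the second iteration has only 3 active lanes; the other 13 contribute zeros, so the result matches a plain dot product.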

These features proved invaluable for the latest SimSIMD release. The software now processes vector similarities up to 300x faster using NEON, SVE, AVX2, and AVX-512 extensions across Inner Product, Euclidean, Angular, Hamming, and Jaccard distances. It outstrips commonly used libraries like NumPy and SciPy, famously built on BLAS and LAPACK.
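For reference, the five distances named above can be written as scalar Python baselines. This is a sketch under assumptions, not the library's API: the function names are made up for illustration, "angular" is taken here to mean cosine distance, and the Hamming and Jaccard variants are assumed to operate on bit vectors packed into `bytes`. SIMD kernels compute the same quantities many lanes at a time.

```python
import math


def inner_product(a, b):
    """Sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    """L2 distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angular(a, b):
    """Cosine distance, 1 - cos(angle); assumed meaning of 'angular' here."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


def popcount(byte):
    """Number of set bits in one byte."""
    return bin(byte).count("1")


def hamming(a, b):
    """Number of differing bits between two packed bit vectors."""
    return sum(popcount(x ^ y) for x, y in zip(a, b))


def jaccard(a, b):
    """1 - |intersection| / |union| over the set bits of two packed bit vectors."""
    intersection = sum(popcount(x & y) for x, y in zip(a, b))
    union = sum(popcount(x | y) for x, y in zip(a, b))
    return 0.0 if union == 0 else 1.0 - intersection / union
```

For example, `hamming(b"\x0f", b"\xf0")` counts all 8 bits as different, and `euclidean([0, 0], [3, 4])` is the 3-4-5 triangle.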

Check out the post for some cool tricks, clarifications on AVX-512 and SVE advantages, and benchmark numbers :)

Here is the repo: https://github.com/ashvardanian/simsimd

replies(1): >>westur+Tz

>>ashvar+(OP)
I'd encourage merging these into e.g. {NumPy, SciPy}; are there PRs?

Though sympy.physics.vector so far only supports X, Y, Z vectors and doesn't mention e.g. "jaccard", FWIW: https://docs.sympy.org/latest/modules/physics/vector/vectors... https://docs.sympy.org/latest/modules/physics/vector/fields.... #cfd

include/simsimd/simsimd.h: https://github.com/ashvardanian/SimSIMD/blob/main/include/si...

conda-forge maintainer docs > Switching BLAS implementation: https://conda-forge.org/docs/maintainer/knowledge_base.html#... :

  conda install "libblas=*=*mkl"
  conda install "libblas=*=*openblas"
  conda install "libblas=*=*blis"
  conda install "libblas=*=*accelerate"
  conda install "libblas=*=*netlib"

numpy-feedstock: https://github.com/conda-forge/numpy-feedstock/blob/main/rec...

scipy-feedstock: https://github.com/conda-forge/scipy-feedstock/blob/main/rec...

pysimdjson-feedstock: https://github.com/conda-forge/pysimdjson-feedstock/blob/mai...

simdjson-feedstock: https://github.com/conda-forge/simdjson-feedstock/blob/main/...

mkl_random-feedstock: https://github.com/conda-forge/mkl_random-feedstock https://github.com/google/paranoid_crypto/tree/main/paranoid... :

> NumPy-based implementation of random number generation sampling using Intel (R) Math Kernel Library, mirroring numpy.random, but exposing all choices of sampling algorithms available in MKL

blas: https://github.com/conda-forge/blas-feedstock/blob/main/reci...

xtensor-blas-feedstock: https://github.com/conda-forge/xtensor-blas-feedstock

xtensor-fftw (FFT with xtensor (C++)) could probably be optimized for AVX-512 and SVE as well? https://github.com/xtensor-stack/xtensor-fftw

ggml_cpu_has_avx512() https://github.com/search?q=repo%3Aggerganov%2Fggml%20AVX&ty... https://github.com/search?q=repo%3Aggerganov%2Fllama.cpp%20a...

CuPy would also be an impactful place to merge and defend these optimizations; though no GPUs have AVX-512 or SVE? cupyx.scipy.spatial.distance: https://docs.cupy.dev/en/stable/reference/scipy_spatial_dist... https://docs.cupy.dev/en/stable/reference/comparison.html

replies(1): >>ashvar+wC

>>westur+Tz
Hey, thanks for the recommendations! Yes, I’m definitely open to contributing there.

replies(1): >>westur+mF1

>>ashvar+wC
Np. Thanks for the optimizations.

From "PostgresML is 8-40x faster than Python HTTP microservices" (2023) >>33270638 :

> Apache Ballista and Polars do Apache Arrow and SIMD.

> The Polars homepage links to the "Database-like ops benchmark" of {Polars, data.table, DataFrames.jl, ClickHouse, cuDF, spark, (py)datatable, dplyr, pandas, dask, Arrow, DuckDB, Modin,} but not yet PostgresML? https://h2oai.github.io/db-benchmark/ *

LLM -> Vector database: https://en.wikipedia.org/wiki/Vector_database

/? inurl:awesome site:github.com "vector database" https://www.google.com/search?q=inurl%253Aawesome+site%253Ag... : https://github.com/dangkhoasdc/awesome-vector-database , https://github.com/mileszim/awesome-vector-database , https://github.com/currentslab/awesome-vector-search

/? "vector database" "duckdb" https://www.google.com/search?q=+%22vector+database%22+%22du... ... pgvector

pgvector/pgvector/src/vector.c: vector_spherical_distance https://github.com/pgvector/pgvector/blob/master/src/vector....

postgresml/postgresml: /? distance https://github.com/search?q=repo%3Apostgresml%2Fpostgresml%2...