btrfs-progs: crypto: fix SSE2/SSE4.1 detection of BLAKE2

On recent x86-64 systems with -march=native|<cpu>|<microarchitecture
level>, gcc/clang automatically define all the available vector
extension macros. crypto/blake2-config.h then correctly sets all the
HAVE_<EXTENSION> macros.

crypto/blake2b-round.h then checks the HAVE_<EXTENSION> macros to decide
which further headers to include:

    #if defined(HAVE_SSE41)
    #include "blake2b-load-sse41.h"
    #else
    #include "blake2b-load-sse2.h"
    #endif

which is wrong: on recent systems it always includes
blake2b-load-sse41.h. crypto/blake2b-round.h is itself included by
crypto/blake2b-sse2.c, so SSE2 and SSE4.1 code gets mixed, resulting in
the "incompatible type for argument" build errors described in #589.
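
The selection problem can be modelled in a single file. This is only a
sketch, not code from the tree: the HAVE_* defines stand in for what
the compiler and crypto/blake2-config.h provide, and LOAD_HEADER_NAME
and the sse2_unit_* helpers are hypothetical names used for
illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumption: models -march=native on a recent CPU, where both feature
 * macros end up defined. */
#define HAVE_SSE2
#define HAVE_SSE41

/* Stand-in for the selection block in the shared round header. The
 * choice depends only on the feature macros, not on which file
 * includes the header. */
#if defined(HAVE_SSE41)
#define LOAD_HEADER_NAME "blake2b-load-sse41.h"
#else
#define LOAD_HEADER_NAME "blake2b-load-sse2.h"
#endif

/* Models crypto/blake2b-sse2.c: it wants the SSE2 loaders ... */
static const char *sse2_unit_wanted(void)
{
	return "blake2b-load-sse2.h";
}

/* ... but gets whatever the shared header selected. */
static const char *sse2_unit_got(void)
{
	return LOAD_HEADER_NAME;
}

/* Returns 1 when the SSE2 unit ended up with the wrong loaders. */
static int sse2_unit_is_mixed(void)
{
	return strcmp(sse2_unit_wanted(), sse2_unit_got()) != 0;
}

/* Self-check: under the assumptions above the SSE2 unit is mixed. */
static void check_mixed(void)
{
	assert(sse2_unit_is_mixed());
}
```

With HAVE_SSE41 defined, the SSE2 unit is handed the SSE4.1 loaders
even though it never asked for them.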

The fix is to remove the lines above from crypto/blake2b-round.h and put
the includes directly into crypto/blake2b-sse2.c and
crypto/blake2b-sse41.c, respectively.
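
A minimal single-file sketch of that arrangement follows. ROUND() and
LOAD_MSG() here are simplified, hypothetical stand-ins for the real
macros, and the two functions model the two translation units. It
relies on macros being expanded at the point of use, so each unit sees
the loader it selected itself.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the shared round header after the fix: it no longer
 * picks a loader and uses whatever LOAD_MSG the includer provided. */
#define ROUND() LOAD_MSG()

/* Models crypto/blake2b-sse2.c: select the SSE2 loaders explicitly
 * before using the round macro. */
#define LOAD_MSG() "sse2 loaders"
static const char *sse2_unit_round(void)
{
	return ROUND();
}
#undef LOAD_MSG

/* Models crypto/blake2b-sse41.c: select the SSE4.1 loaders explicitly
 * before using the round macro. */
#define LOAD_MSG() "sse41 loaders"
static const char *sse41_unit_round(void)
{
	return ROUND();
}
#undef LOAD_MSG

/* Self-check: the two units no longer share a loader. */
static void check_separated(void)
{
	assert(strcmp(sse2_unit_round(), sse41_unit_round()) != 0);
}
```

Each unit now gets the loaders it asked for, regardless of which
feature macros the compiler defines.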

Note this slightly diverges from the upstream BLAKE2 sources.

Pull-request: #591
Author: Tino Mai <mai.tino@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Tino Mai 2023-03-05 18:15:52 +01:00 committed by David Sterba
parent 366cd079bc
commit 0da635fd1b
3 changed files with 2 additions and 6 deletions


@@ -136,12 +136,6 @@
 #endif
-#if defined(HAVE_SSE41)
-#include "blake2b-load-sse41.h"
-#else
-#include "blake2b-load-sse2.h"
-#endif
 #define ROUND(r) \
 LOAD_MSG_ ##r ##_1(b0, b1); \
 G1(row1l,row2l,row3l,row4l,row1h,row2h,row3h,row4h,b0,b1); \


@@ -30,6 +30,7 @@
 #include <x86intrin.h>
 #endif
+#include "blake2b-load-sse2.h"
 #include "blake2b-round.h"
 static const uint64_t blake2b_IV[8] =


@@ -34,6 +34,7 @@
 #include <x86intrin.h>
 #endif
+#include "blake2b-load-sse41.h"
 #include "blake2b-round.h"
 static const uint64_t blake2b_IV[8] =