11 Answered Questions

4 Answered Questions

[SOLVED] AVX2 what is the most efficient way to pack left based on a mask?

1 Answered Question

[SOLVED] What are the best instruction sequences to generate vector constants on the fly?

4 Answered Questions

[SOLVED] SIMD prefix sum on Intel cpu

  • 2012-05-14 16:44:36
  • skyde
  • 5978 Views
  • 17 Score
  • 4 Answers
  • Tags:   c++ sse simd mmx

2 Answered Questions

[SOLVED] How to implement atoi using SIMD?

  • 2016-02-01 09:33:51
  • the_drow
  • 2608 Views
  • 22 Score
  • 2 Answers
  • Tags:   c++ x86 sse simd atoi

4 Answered Questions

[SOLVED] What's missing/sub-optimal in this memcpy implementation?

5 Answered Questions

[SOLVED] How to perform the inverse of _mm256_movemask_epi8 (VPMOVMSKB)?

  • 2014-02-07 07:55:30
  • Satya Arjunan
  • 3576 Views
  • 19 Score
  • 5 Answers
  • Tags:   c x86 simd avx avx2

1 Answered Question

[SOLVED] Fastest way to compute absolute value using SSE

1 Answered Question

[SOLVED] Fast vectorized rsqrt and reciprocal with SSE/AVX depending on precision

3 Answered Questions

[SOLVED] Fastest Implementation of Exponential Function Using SSE

4 Answered Questions

[SOLVED] Why vectorizing the loop does not have performance improvement

4 Answered Questions

[SOLVED] print a __m128i variable

5 Answered Questions

[SOLVED] Header files for x86 SIMD intrinsics

3 Answered Questions

[SOLVED] Emulating shifts on 32 bytes with AVX

5 Answered Questions

[SOLVED] SSE intrinsic functions reference

  • 2011-08-23 06:07:31
  • NGaffney
  • 33284 Views
  • 50 Score
  • 5 Answers
  • Tags:   c++ c gcc sse simd

1 Answered Question

[SOLVED] Fastest way to unpack 32 bits to a 32 byte SIMD vector

  • 2014-06-15 01:27:28
  • alecco
  • 1654 Views
  • 6 Score
  • 1 Answer
  • Tags:   x86 simd avx avx2

3 Answered Questions

[SOLVED] Load address calculation when using AVX2 gather instructions

  • 2013-04-24 13:34:42
  • Paul R
  • 3910 Views
  • 13 Score
  • 3 Answers
  • Tags:   x86 sse simd avx2

1 Answered Question

[SOLVED] Loading 8 chars from memory into an __m256 variable as packed single precision floats

  • 2015-12-15 01:22:08
  • pseudomarvin
  • 1296 Views
  • 5 Score
  • 1 Answer
  • Tags:   c++ sse simd avx avx2

3 Answered Questions

[SOLVED] C++ error: ‘_mm_sin_ps’ was not declared in this scope

2 Answered Questions

[SOLVED] Sum reduction of unsigned bytes without overflow, using SSE2 on Intel

3 Answered Questions

[SOLVED] practical BigNum AVX/SSE possible?

7 Answered Questions

[SOLVED] How to determine if memory is aligned?

2 Answered Questions

[SOLVED] SSE multiplication of 4 32-bit integers

3 Answered Questions

[SOLVED] What's the difference between logical SSE intrinsics?

3 Answered Questions

[SOLVED] Fastest way to do horizontal vector sum with AVX instructions

3 Answered Questions

[SOLVED] Fastest Implementation of Exponential Function Using AVX

3 Answered Questions

[SOLVED] Why is this SIMD multiplication not faster than non-SIMD multiplication?

3 Answered Questions

[SOLVED] Parallel for vs omp simd: when to use each?