SIMD/Intel Intrinsic
_mm_srai_epi16(), _mm_srai_epi32(): Shift
김뿡한
2015. 1. 22. 12:14
Intrinsics Guide
__m128i _mm_srai_epi16 (__m128i a, int imm8)
Synopsis
__m128i _mm_srai_epi16 (__m128i a, int imm8)
#include "emmintrin.h"
Instruction: psraw xmm, imm
CPUID Flags: SSE2
Description
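Shift each packed 16-bit integer in a right by imm8 bits while shifting in copies of its sign bit (an arithmetic shift), and store the results in dst. If imm8 is greater than 15, every bit of an element is set to that element's sign bit.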
Operation
FOR j := 0 to 7
    i := j*16
    IF imm8[7:0] > 15
        dst[i+15:i] := SignBit
    ELSE
        dst[i+15:i] := SignExtend(a[i+15:i] >> imm8[7:0])
    FI
ENDFOR
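The operation above can be tried out with a small C sketch. The helper names (srai16_by2, srai16_by16, check_srai16) and the input values are made up for illustration and are not part of the Intrinsics Guide. The shift counts are written as literals because imm8 is meant to be an immediate.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2: _mm_srai_epi16, _mm_loadu_si128, _mm_storeu_si128 */
#include <stdint.h>

/* Hypothetical helper: shift all eight 16-bit lanes right by 2, filling with the sign bit. */
static void srai16_by2(const int16_t in[8], int16_t out[8])
{
    __m128i a = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_srai_epi16(a, 2));
}

/* Hypothetical helper: a count above 15 leaves only the sign in each lane (0 or -1). */
static void srai16_by16(const int16_t in[8], int16_t out[8])
{
    __m128i a = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_srai_epi16(a, 16));
}

/* Compares both helpers against values worked out by hand. */
static void check_srai16(void)
{
    const int16_t in[8] = { -32768, -9, -1, 0, 1, 9, 1000, 32767 };
    int16_t out[8];

    srai16_by2(in, out);
    /* Negative lanes round toward negative infinity: -9 >> 2 is -3, not -2. */
    assert(out[0] == -8192 && out[1] == -3 && out[2] == -1 && out[3] == 0);
    assert(out[4] == 0 && out[5] == 2 && out[6] == 250 && out[7] == 8191);

    srai16_by16(in, out);
    for (int i = 0; i < 8; i++)
        assert(out[i] == (in[i] < 0 ? -1 : 0));
}
```

Calling check_srai16() from a test program should pass silently on an SSE2-capable build (for example with -msse2 on 32-bit x86; x86-64 compilers enable SSE2 by default).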
Performance
Architecture | Latency | Throughput |
---|---|---|
Haswell | 1 | 1 |
Ivy Bridge | 1 | 1 |
Sandy Bridge | 1 | 1 |
Westmere | 1 | 1 |
Nehalem | 1 | 1 |
__m128i _mm_srai_epi32 (__m128i a, int imm8)
Synopsis
__m128i _mm_srai_epi32 (__m128i a, int imm8)
#include "emmintrin.h"
Instruction: psrad xmm, imm
CPUID Flags: SSE2
Description
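Shift each packed 32-bit integer in a right by imm8 bits while shifting in copies of its sign bit (an arithmetic shift), and store the results in dst. If imm8 is greater than 31, every bit of an element is set to that element's sign bit.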
Operation
FOR j := 0 to 3
    i := j*32
    IF imm8[7:0] > 31
        dst[i+31:i] := SignBit
    ELSE
        dst[i+31:i] := SignExtend(a[i+31:i] >> imm8[7:0])
    FI
ENDFOR
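As a rough illustration of the 32-bit variant, the sketch below contrasts the arithmetic shift with the logical shift _mm_srli_epi32 (also SSE2) and uses a shift by 31 as a per-lane sign mask. The helper names (abs_epi32_sse2, check_srai32) and the test values are assumptions chosen for this example. The absolute-value trick is one common use of the sign mask, not something the Intrinsics Guide prescribes.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2: _mm_srai_epi32, _mm_srli_epi32, _mm_xor_si128, _mm_sub_epi32 */
#include <stdint.h>

/* Hypothetical helper: per-lane absolute value built from the sign mask.
   Note: INT32_MIN has no positive counterpart and is returned unchanged. */
static __m128i abs_epi32_sse2(__m128i a)
{
    __m128i sign = _mm_srai_epi32(a, 31);                /* 0 if lane >= 0, -1 if lane < 0 */
    return _mm_sub_epi32(_mm_xor_si128(a, sign), sign);  /* (a ^ sign) - sign */
}

/* Compares arithmetic shift, logical shift, and the abs helper with hand-computed values. */
static void check_srai32(void)
{
    const int32_t in[4] = { -16, -1, 7, 100000 };
    int32_t out[4];
    __m128i a = _mm_loadu_si128((const __m128i *)in);

    /* Arithmetic shift keeps the sign: -16 >> 2 is -4. */
    _mm_storeu_si128((__m128i *)out, _mm_srai_epi32(a, 2));
    assert(out[0] == -4 && out[1] == -1 && out[2] == 1 && out[3] == 25000);

    /* Logical shift fills with zeros: 0xFFFFFFF0 >> 2 is 0x3FFFFFFC. */
    _mm_storeu_si128((__m128i *)out, _mm_srli_epi32(a, 2));
    assert(out[0] == 0x3FFFFFFC && out[2] == 1 && out[3] == 25000);

    _mm_storeu_si128((__m128i *)out, abs_epi32_sse2(a));
    assert(out[0] == 16 && out[1] == 1 && out[2] == 7 && out[3] == 100000);
}
```

For non-negative lanes the two shifts agree, so the difference only matters when the data is signed.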
Performance
Architecture | Latency | Throughput |
---|---|---|
Haswell | 1 | - |
Ivy Bridge | 1 | - |
Sandy Bridge | 1 | - |
Westmere | 1 | - |
Nehalem | 1 | - |