
SIMD (8)

__m128i _mm_blend_epi32 () : Mix data
Intrinsics Guide: __m128i _mm_blend_epi32 (__m128i a, __m128i b, const int imm8)
Synopsis: __m128i _mm_blend_epi32 (__m128i a, __m128i b, const int imm8); #include "immintrin.h"; Instruction: vpblendd xmm, xmm, xmm, imm; CPUID Flags: AVX2
Description: Blend packed 32-bit integers from a and b using control mask imm8, and store the results in dst.
Operation: FOR j := 0 to 3; i := j*32; IF imm8[j%8] dst[i+31:i] ..
2015. 1. 21.
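The blend described above can be sketched in C as follows. This is a minimal illustration, not code from the post: the helper names (blend_epi32_model, blend_epi32_mask5, blend_matches_model) are hypothetical, the mask value 0x5 is an arbitrary choice, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions assumed here so the file builds without -mavx2. The scalar model follows the documented vpblendd behavior: a set bit j in imm8 takes lane j from b, a clear bit takes it from a.

```c
#include <immintrin.h>
#include <stdint.h>

/* Scalar model of the blend: bit j of imm8 set -> lane j comes from b,
 * otherwise from a. */
static void blend_epi32_model(const int32_t a[4], const int32_t b[4],
                              int imm8, int32_t dst[4])
{
    for (int j = 0; j < 4; j++)
        dst[j] = ((imm8 >> j) & 1) ? b[j] : a[j];
}

/* Same blend through the intrinsic. imm8 must be a compile-time constant,
 * so the mask (0x5: lanes 0 and 2 from b) is fixed here.
 * The target attribute is a GCC/Clang extension (assumption). */
__attribute__((target("avx2")))
static void blend_epi32_mask5(const int32_t a[4], const int32_t b[4],
                              int32_t dst[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i r  = _mm_blend_epi32(va, vb, 0x5);
    _mm_storeu_si128((__m128i *)dst, r);
}

/* Returns 1 if the intrinsic agrees with the scalar model,
 * or if the CPU lacks AVX2 and the comparison is skipped. */
static int blend_matches_model(void)
{
    const int32_t a[4] = {10, 11, 12, 13};
    const int32_t b[4] = {20, 21, 22, 23};
    int32_t expect[4], got[4];

    blend_epi32_model(a, b, 0x5, expect);
    if (!__builtin_cpu_supports("avx2"))
        return 1;
    blend_epi32_mask5(a, b, got);
    for (int j = 0; j < 4; j++)
        if (got[j] != expect[j])
            return 0;
    return 1;
}
```

With a = {10, 11, 12, 13}, b = {20, 21, 22, 23} and mask 0x5, the scalar model yields {20, 11, 22, 13}.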
void _mm_prefetch ( )
Intrinsics Guide: void _mm_prefetch (char const* p, int i)
Synopsis: void _mm_prefetch (char const* p, int i); #include "xmmintrin.h"; Instruction: prefetchwt1 mprefetch; CPUID Flags: PREFETCHWT1
Description: Fetch the line of data from memory that contains address p to a location in the cache hierarchy specified by the locality hint i.
2015. 1. 14.
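A rough sketch of how _mm_prefetch might be used while walking an array is shown below. The entry above documents the PREFETCHWT1 form; this example instead assumes the widely available _MM_HINT_T0 hint, and both the function name sum_with_prefetch and the look-ahead distance PREFETCH_AHEAD are hypothetical choices, not values from the post. A prefetch is only a hint, so the result is the same with or without it.

```c
#include <xmmintrin.h>
#include <stddef.h>

/* Hypothetical look-ahead distance: 16 floats = 64 bytes. */
#define PREFETCH_AHEAD 16

/* Sum an array while hinting that a later element will be needed soon. */
static float sum_with_prefetch(const float *data, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        /* Stay inside the array when forming the prefetch address. */
        if (i + PREFETCH_AHEAD < n)
            _mm_prefetch((const char *)&data[i + PREFETCH_AHEAD],
                         _MM_HINT_T0);
        sum += data[i];
    }
    return sum;
}
```

Whether this helps at all depends on the access pattern and the hardware prefetcher; for a simple linear scan like this one the gain may well be zero, so it should be measured rather than assumed.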
_mm_store_......()
Intrinsics Guide: void _mm_storel_epi64 (__m128i* mem_addr, __m128i a)
Synopsis: void _mm_storel_epi64 (__m128i* mem_addr, __m128i a); #include "emmintrin.h"; Instruction: movq m64, xmm; CPUID Flags: SSE2
Description: Store 64-bit integer from the first element of a into memory.
Operation: MEM[mem_addr+63:mem_addr] := a[63:0]
2015. 1. 13.
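As a small illustration of the store described above, the sketch below builds a 128-bit value from two 64-bit halves and writes only the low half to memory. The helper name store_low_qword is hypothetical, and _mm_set_epi64x is used here only as a convenient way to build the test vector.

```c
#include <emmintrin.h>
#include <stdint.h>

/* Build a vector with `lo` in bits 63:0 and `hi` in bits 127:64,
 * then store only the low 64 bits to memory and return them. */
static uint64_t store_low_qword(uint64_t lo, uint64_t hi)
{
    __m128i v = _mm_set_epi64x((long long)hi, (long long)lo);
    uint64_t out = 0;
    /* Only 8 bytes are written, so an 8-byte destination is enough. */
    _mm_storel_epi64((__m128i *)&out, v);
    return out;
}
```

Calling store_low_qword(lo, hi) should return lo for any hi, since the upper element of the vector is never written out.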
_mm_lddqu_si128(), _mm_loadl_epi64()
Intrinsics Guide: __m128i _mm_lddqu_si128 (__m128i const* mem_addr)
Synopsis: __m128i _mm_lddqu_si128 (__m128i const* mem_addr); #include "pmmintrin.h"; Instruction: lddqu xmm, m128; CPUID Flags: SSE3
Description: Load 128-bits of integer data from unaligned memory into dst. This intrinsic may perform better than _mm_loadu_si128 when the data crosses a cache line boundary.
Operation: dst[127:0] := MEM[mem_ad..
2015. 1. 13.
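The two loads named in this entry can be compared with a short sketch. The helper names load_unaligned16 and load_low8 are hypothetical, and the target("sse3") attribute is a GCC/Clang extension assumed here so the file builds without -msse3. The excerpt above covers only _mm_lddqu_si128; the zeroing of the upper 64 bits by _mm_loadl_epi64 is its documented behavior, not something stated in the excerpt.

```c
#include <pmmintrin.h>   /* SSE3: _mm_lddqu_si128 (also pulls in SSE2) */
#include <stdint.h>

/* Load 16 bytes from a possibly unaligned address and copy them out. */
__attribute__((target("sse3")))
static void load_unaligned16(const uint8_t *src, uint8_t out[16])
{
    __m128i v = _mm_lddqu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)out, v);
}

/* Load 8 bytes into the low half of a vector; the upper 8 bytes of the
 * register are zeroed by _mm_loadl_epi64. */
static void load_low8(const uint8_t *src, uint8_t out[16])
{
    __m128i v = _mm_loadl_epi64((const __m128i *)src);
    _mm_storeu_si128((__m128i *)out, v);
}
```

For a buffer holding the bytes 0..31, load_unaligned16(buf + 1, out) should give out[0] = 1 through out[15] = 16, while load_low8(buf + 3, out) should give out[0..7] = 3..10 and out[8..15] = 0.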