Topics
	Private implementation helpers

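Detailed Description

"Universal intrinsics" is a set of types and functions intended to simplify the vectorization of code on different platforms. Several SIMD extensions on different architectures are currently supported. 128-bit registers of various types are implemented for x86 (**SSE/SSE2/SSE4.2**), ARM (**NEON**), PowerPC (**VSX**) and MIPS (**MSA**). 256-bit registers are supported on x86 (**AVX2**) and 512-bit registers on x86 (**AVX512**). If no SIMD extension is available at compile time, a fallback C++ implementation of the intrinsics is selected and the code works as expected, although it may be slower.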

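Types

There are several types representing vector registers of packed values; each type is implemented as a structure based on one SIMD register.

cv::v_uint8 and cv::v_int8: 8-bit integer values (unsigned/signed) - char
cv::v_uint16 and cv::v_int16: 16-bit integer values (unsigned/signed) - short
cv::v_uint32 and cv::v_int32: 32-bit integer values (unsigned/signed) - int
cv::v_uint64 and cv::v_int64: 64-bit integer values (unsigned/signed) - int64
cv::v_float32: 32-bit floating point values (signed) - float
cv::v_float64: 64-bit floating point values (signed) - double

The exact bit length (and value count) of the listed types is deduced at compile time and depends on the architecture SIMD capabilities selected when the library was compiled. All types contain the **nlanes** enumeration, which can be used to check the exact value count of the type.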
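
The relationship between register width and value count can be sketched in plain C++. This is only an illustration, not library code: `lanes_for` and the `kWidth*` constants are hypothetical names, and the sketch assumes that a register of a given byte width holds width / sizeof(T) values, which is consistent with the fixed-size type names and the simd128_width, simd256_width and simd512_width values listed on this page.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an OpenCV API): number of lanes of type T
// that fit in a register that is width_bytes bytes wide.
template <typename T>
constexpr std::size_t lanes_for(std::size_t width_bytes) {
    return width_bytes / sizeof(T);
}

// Register widths in bytes, mirroring the enum values on this page.
constexpr std::size_t kWidth128 = 16;
constexpr std::size_t kWidth256 = 32;
constexpr std::size_t kWidth512 = 64;

// The lane counts match the suffixes of the fixed-size type names.
static_assert(lanes_for<std::uint8_t>(kWidth128) == 16, "v_uint8x16");
static_assert(lanes_for<std::int16_t>(kWidth128) == 8, "v_int16x8");
static_assert(lanes_for<float>(kWidth256) == 8, "v_float32x8");
static_assert(lanes_for<double>(kWidth512) == 8, "v_float64x8");
```

Code that must work for any register width would query the lane count of the type it actually uses instead of hard-coding one of these numbers.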

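If the exact bit length of a type matters, the specific fixed-length register types can be used.

There are several types representing 128-bit registers.

cv::v_uint8x16 and cv::v_int8x16: sixteen 8-bit integer values (unsigned/signed) - char
cv::v_uint16x8 and cv::v_int16x8: eight 16-bit integer values (unsigned/signed) - short
cv::v_uint32x4 and cv::v_int32x4: four 32-bit integer values (unsigned/signed) - int
cv::v_uint64x2 and cv::v_int64x2: two 64-bit integer values (unsigned/signed) - int64
cv::v_float32x4: four 32-bit floating point values (signed) - float
cv::v_float64x2: two 64-bit floating point values (signed) - double

There are several types representing 256-bit registers.

cv::v_uint8x32 and cv::v_int8x32: thirty-two 8-bit integer values (unsigned/signed) - char
cv::v_uint16x16 and cv::v_int16x16: sixteen 16-bit integer values (unsigned/signed) - short
cv::v_uint32x8 and cv::v_int32x8: eight 32-bit integer values (unsigned/signed) - int
cv::v_uint64x4 and cv::v_int64x4: four 64-bit integer values (unsigned/signed) - int64
cv::v_float32x8: eight 32-bit floating point values (signed) - float
cv::v_float64x4: four 64-bit floating point values (signed) - double

Note: 256-bit registers are currently implemented only for the AVX2 SIMD extension; if you want to use these types directly, don't forget to check the CV_SIMD256 preprocessor definition.
#if CV_SIMD256
//...
#endif

There are several types representing 512-bit registers.

cv::v_uint8x64 and cv::v_int8x64: sixty-four 8-bit integer values (unsigned/signed) - char
cv::v_uint16x32 and cv::v_int16x32: thirty-two 16-bit integer values (unsigned/signed) - short
cv::v_uint32x16 and cv::v_int32x16: sixteen 32-bit integer values (unsigned/signed) - int
cv::v_uint64x8 and cv::v_int64x8: eight 64-bit integer values (unsigned/signed) - int64
cv::v_float32x16: sixteen 32-bit floating point values (signed) - float
cv::v_float64x8: eight 64-bit floating point values (signed) - double

Note: 512-bit registers are currently implemented only for the AVX512 SIMD extension; if you want to use these types directly, don't forget to check the CV_SIMD512 preprocessor definition.

Note: cv::v_float64x2 is not implemented in the NEON variant; if you want to use this type, don't forget to check the CV_SIMD128_64F preprocessor definition.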

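Load and store operations

These operations set the contents of a register, either explicitly or by loading it from a memory block, and save the contents of a register to a memory block.

There are variable-size register load operations that provide a result of the maximum available size, depending on the capabilities of the selected platform.

Constructors: from memory
Other create methods: vx_setall_s8, vx_setall_u8, ..., vx_setzero_u8, vx_setzero_s8, ...
Memory load operations: vx_load, vx_load_aligned, vx_load_low, vx_load_halves
Memory operations with expansion of values: vx_load_expand, vx_load_expand_q

There are also fixed-size register load/store operations.

For 128-bit registers

Constructors: from memory, from two values, ...
Other create methods: v_setall_s8, v_setall_u8, ..., v_setzero_u8, v_setzero_s8, ...
Memory load operations: v_load, v_load_aligned, v_load_low, v_load_halves
Memory operations with expansion of values: v_load_expand, v_load_expand_q

For 256-bit registers (check the CV_SIMD256 preprocessor definition)

Constructors: from memory, from four values, ...
Other create methods: v256_setall_s8, v256_setall_u8, ..., v256_setzero_u8, v256_setzero_s8, ...
Memory load operations: v256_load, v256_load_aligned, v256_load_low, v256_load_halves
Memory operations with expansion of values: v256_load_expand, v256_load_expand_q

For 512-bit registers (check the CV_SIMD512 preprocessor definition)

Constructors: from memory, from eight values, ...
Other create methods: v512_setall_s8, v512_setall_u8, ..., v512_setzero_u8, v512_setzero_s8, ...
Memory load operations: v512_load, v512_load_aligned, v512_load_low, v512_load_halves
Memory operations with expansion of values: v512_load_expand, v512_load_expand_q

Store-to-memory operations are similar across the different platform capabilities: v_store, v_store_aligned, v_store_high, v_store_low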

Value reordering

These operations reorder or recombine elements in one or multiple vectors.

Interleave, deinterleave (2, 3 and 4 channels): v_load_deinterleave, v_store_interleave
Expand: v_expand, v_expand_low, v_expand_high
Pack: v_pack, v_pack_u, v_pack_b, v_rshr_pack, v_rshr_pack_u, v_pack_store, v_pack_u_store, v_rshr_pack_store, v_rshr_pack_u_store
Recombine: v_zip, v_recombine, v_combine_low, v_combine_high
Reverse: v_reverse
Extract: v_extract
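
As an illustration of what "recombine" means here, the following is a minimal scalar sketch of interleaving two vectors, the operation v_zip is described as performing further down this page. `model_zip` is a hypothetical name and the vectors are modelled as `std::array`; the sketch assumes that the first output receives the interleaved lower halves and the second output the interleaved upper halves.

```cpp
#include <array>
#include <cstddef>

// Scalar model (not the library implementation) of interleaving two
// N-lane vectors a0 and a1 into b0 (lower halves) and b1 (upper halves).
template <typename T, std::size_t N>
void model_zip(const std::array<T, N>& a0, const std::array<T, N>& a1,
               std::array<T, N>& b0, std::array<T, N>& b1) {
    static_assert(N % 2 == 0, "lane count must be even");
    for (std::size_t i = 0; i < N / 2; ++i) {
        b0[2 * i]     = a0[i];          // lower half of a0
        b0[2 * i + 1] = a1[i];          // lower half of a1
        b1[2 * i]     = a0[i + N / 2];  // upper half of a0
        b1[2 * i + 1] = a1[i + N / 2];  // upper half of a1
    }
}
```

With a0 = {1, 2, 3, 4} and a1 = {5, 6, 7, 8}, this model produces b0 = {1, 5, 2, 6} and b1 = {3, 7, 4, 8}.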

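Arithmetic, bitwise and comparison operations

Element-wise binary and unary operations.

Arithmetic: +, -, *, /, v_mul_expand
Non-saturating arithmetic: v_add_wrap, v_sub_wrap
Bitwise shifts: <<, >>, v_shl, v_shr
Bitwise logic: &, |, ^, ~
Comparison: >, >=, <, <=, ==, !=
min/max: v_min, v_max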
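
The difference between regular and non-saturating ("wrap") arithmetic is easiest to see on a single 8-bit lane. The sketch below is a scalar model only; the function names are hypothetical, and it assumes, as the "without saturation" wording of v_add_wrap implies, that the regular addition clamps 8-bit results to the representable range.

```cpp
#include <algorithm>
#include <cstdint>

// Scalar model of a saturating 8-bit unsigned add: results above 255
// are clamped to 255 instead of overflowing.
inline std::uint8_t model_add_saturate_u8(std::uint8_t a, std::uint8_t b) {
    const int sum = static_cast<int>(a) + static_cast<int>(b);
    return static_cast<std::uint8_t>(std::min(sum, 255));
}

// Scalar model of a wrapping 8-bit unsigned add: the result is taken
// modulo 256, which is what plain unsigned 8-bit arithmetic does.
inline std::uint8_t model_add_wrap_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}
```

For the inputs 200 and 100, the saturating model returns 255 while the wrapping model returns 44 (300 modulo 256).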

Reduce and mask

Most of these operations return only one value.

Reduce: v_reduce_min, v_reduce_max, v_reduce_sum, v_popcount
Mask: v_signmask, v_check_all, v_check_any, v_select
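
The mask operations can be modelled on plain arrays. The sketch below uses hypothetical `model_*` names and 32-bit signed lanes; it assumes that the sign mask has bit i set when lane i is negative, and that the "all" and "any" checks test whether all, or at least one, of the lanes are less than zero, as described in the function list on this page.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a sign mask: bit i of the result is set when lane i < 0.
template <std::size_t N>
int model_signmask(const std::array<std::int32_t, N>& a) {
    static_assert(N < 31, "mask must fit in a non-negative int");
    int mask = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (a[i] < 0) mask |= (1 << i);
    }
    return mask;
}

// True when every lane is negative (all mask bits set).
template <std::size_t N>
bool model_check_all(const std::array<std::int32_t, N>& a) {
    return model_signmask(a) == (1 << static_cast<int>(N)) - 1;
}

// True when at least one lane is negative (any mask bit set).
template <std::size_t N>
bool model_check_any(const std::array<std::int32_t, N>& a) {
    return model_signmask(a) != 0;
}
```

For the lanes {-1, 2, -3, 4} the model mask is 5 (bits 0 and 2), so the "any" check is true and the "all" check is false.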

Other math

Some frequently used operations: v_sqrt, v_invsqrt, v_magnitude, v_sqr_magnitude, v_exp, v_log, v_erf, v_sin, v_cos
Absolute values: v_abs, v_absdiff, v_absdiffs

Conversions

Different type conversions and casts.

Rounding: v_round, v_floor, v_ceil, v_trunc
To float: v_cvt_f32, v_cvt_f64
Reinterpret: v_reinterpret_as_u8, v_reinterpret_as_s8, ...

Matrix operations

In these operations vectors represent matrix rows/columns: v_dotprod, v_dotprod_fast, v_dotprod_expand, v_dotprod_expand_fast, v_matmul, v_transpose4x4
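
Two of these operations are sketched below as scalar models on `std::array`; the `model_*` names are hypothetical and the code is not the library implementation. The sketch assumes that the dot product multiplies corresponding lanes and sums each adjacent pair into one lane of the wider type (which is why the signatures on this page return n/2 lanes), and that the matrix multiplication treats a, b, c and d as the four vectors weighted by the four lanes of v.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a pairwise dot product on 16-bit lanes: each pair of
// adjacent products is summed into one 32-bit lane.
template <std::size_t N>
std::array<std::int32_t, N / 2> model_dotprod(
        const std::array<std::int16_t, N>& a,
        const std::array<std::int16_t, N>& b) {
    static_assert(N % 2 == 0, "lane count must be even");
    std::array<std::int32_t, N / 2> c{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        c[i] = static_cast<std::int32_t>(a[2 * i]) * b[2 * i] +
               static_cast<std::int32_t>(a[2 * i + 1]) * b[2 * i + 1];
    }
    return c;
}

// Scalar model of a 4-lane matrix multiplication:
// r = v[0]*a + v[1]*b + v[2]*c + v[3]*d, computed lane by lane.
inline std::array<float, 4> model_matmul(
        const std::array<float, 4>& v, const std::array<float, 4>& a,
        const std::array<float, 4>& b, const std::array<float, 4>& c,
        const std::array<float, 4>& d) {
    std::array<float, 4> r{};
    for (std::size_t i = 0; i < 4; ++i) {
        r[i] = v[0] * a[i] + v[1] * b[i] + v[2] * c[i] + v[3] * d[i];
    }
    return r;
}
```

Passing the rows of the identity matrix as a, b, c and d to `model_matmul` returns v unchanged, which is a quick sanity check of the lane order.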

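Availability

Most operations are implemented only for some subset of the available types; the following tables show the applicability of different operations to the types.

Regular integers

Operations\Types	uint 8	int 8	uint 16	int 16	uint 32	int 32
load, store	x	x	x	x	x	x
interleave	x	x	x	x	x	x
expand	x	x	x	x	x	x
expand_low	x	x	x	x	x	x
expand_high	x	x	x	x	x	x
expand_q	x	x
add, sub	x	x	x	x	x	x
add_wrap, sub_wrap	x	x	x	x
mul_wrap	x	x	x	x
mul	x	x	x	x	x	x
mul_expand	x	x	x	x	x
compare	x	x	x	x	x	x
shift			x	x	x	x
dotprod				x		x
dotprod_fast				x		x
dotprod_expand	x	x	x	x		x
dotprod_expand_fast	x	x	x	x		x
logical	x	x	x	x	x	x
min, max	x	x	x	x	x	x
absdiff	x	x	x	x	x	x
absdiffs		x		x
reduce	x	x	x	x	x	x
mask	x	x	x	x	x	x
pack	x	x	x	x	x	x
pack_u	x		x
pack_b	x
unpack	x	x	x	x	x	x
extract	x	x	x	x	x	x
rotate (lanes)	x	x	x	x	x	x
cvt_flt32						x
cvt_flt64						x
transpose4x4					x	x
reverse	x	x	x	x	x	x
extract_n	x	x	x	x	x	x
broadcast_element					x	x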

Big integers

Operations\Types	uint 64	int 64
load, store	x	x
add, sub	x	x
shift	x	x
logical	x	x
reverse	x	x
extract	x	x
rotate (lanes)	x	x
cvt_flt64		x
extract_n	x	x

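Floating point

Operations\Types	float 32	float 64
load, store	x	x
interleave	x
add, sub	x	x
mul	x	x
div	x	x
compare	x	x
min, max	x	x
absdiff	x	x
reduce	x
mask	x	x
unpack	x	x
cvt_flt32		x
cvt_flt64	x
sqrt, abs	x	x
float math	x	x
transpose4x4	x
extract	x	x
rotate (lanes)	x	x
reverse	x	x
extract_n	x	x
broadcast_element	x
exp	x	x
log	x	x
sin, cos	x	x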

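Classes
struct	cv::v_reg< _Tp, n >

Macros
#define	OPENCV_HAL_MATH_HAVE_EXP 1

Typedefs
typedef v_float32x16	simd512::v_float32
	Maximum available vector register capacity of 32-bit floating point values (single precision).

typedef v_reg< float, 16 >	cv::v_float32x16
	Sixteen 32-bit floating point values (single precision).

typedef v_reg< float, 4 >	cv::v_float32x4
	Four 32-bit floating point values (single precision).

typedef v_reg< float, 8 >	cv::v_float32x8
	Eight 32-bit floating point values (single precision).

typedef v_float64x8	simd512::v_float64
	Maximum available vector register capacity of 64-bit floating point values (double precision).

typedef v_reg< double, 2 >	cv::v_float64x2
	Two 64-bit floating point values (double precision).

typedef v_reg< double, 4 >	cv::v_float64x4
	Four 64-bit floating point values (double precision).

typedef v_reg< double, 8 >	cv::v_float64x8
	Eight 64-bit floating point values (double precision).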

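typedef v_int16x32	simd512::v_int16
	Maximum available vector register capacity of 16-bit signed integer values.

typedef v_reg< short, 16 >	cv::v_int16x16
	Sixteen 16-bit signed integer values.

typedef v_reg< short, 32 >	cv::v_int16x32
	Thirty-two 16-bit signed integer values.

typedef v_reg< short, 8 >	cv::v_int16x8
	Eight 16-bit signed integer values.

typedef v_int32x16	simd512::v_int32
	Maximum available vector register capacity of 32-bit signed integer values.

typedef v_reg< int, 16 >	cv::v_int32x16
	Sixteen 32-bit signed integer values.

typedef v_reg< int, 4 >	cv::v_int32x4
	Four 32-bit signed integer values.

typedef v_reg< int, 8 >	cv::v_int32x8
	Eight 32-bit signed integer values.

typedef v_int64x8	simd512::v_int64
	Maximum available vector register capacity of 64-bit signed integer values.

typedef v_reg< int64, 2 >	cv::v_int64x2
	Two 64-bit signed integer values.

typedef v_reg< int64, 4 >	cv::v_int64x4
	Four 64-bit signed integer values.

typedef v_reg< int64, 8 >	cv::v_int64x8
	Eight 64-bit signed integer values.

typedef v_int8x64	simd512::v_int8
	Maximum available vector register capacity of 8-bit signed integer values.

typedef v_reg< schar, 16 >	cv::v_int8x16
	Sixteen 8-bit signed integer values.

typedef v_reg< schar, 32 >	cv::v_int8x32
	Thirty-two 8-bit signed integer values.

typedef v_reg< schar, 64 >	cv::v_int8x64
	Sixty-four 8-bit signed integer values.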

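typedef v_uint16x32	simd512::v_uint16
	Maximum available vector register capacity of 16-bit unsigned integer values.

typedef v_reg< ushort, 16 >	cv::v_uint16x16
	Sixteen 16-bit unsigned integer values.

typedef v_reg< ushort, 32 >	cv::v_uint16x32
	Thirty-two 16-bit unsigned integer values.

typedef v_reg< ushort, 8 >	cv::v_uint16x8
	Eight 16-bit unsigned integer values.

typedef v_uint32x16	simd512::v_uint32
	Maximum available vector register capacity of 32-bit unsigned integer values.

typedef v_reg< unsigned, 16 >	cv::v_uint32x16
	Sixteen 32-bit unsigned integer values.

typedef v_reg< unsigned, 4 >	cv::v_uint32x4
	Four 32-bit unsigned integer values.

typedef v_reg< unsigned, 8 >	cv::v_uint32x8
	Eight 32-bit unsigned integer values.

typedef v_uint64x8	simd512::v_uint64
	Maximum available vector register capacity of 64-bit unsigned integer values.

typedef v_reg< uint64, 2 >	cv::v_uint64x2
	Two 64-bit unsigned integer values.

typedef v_reg< uint64, 4 >	cv::v_uint64x4
	Four 64-bit unsigned integer values.

typedef v_reg< uint64, 8 >	cv::v_uint64x8
	Eight 64-bit unsigned integer values.

typedef v_uint8x64	simd512::v_uint8
	Maximum available vector register capacity of 8-bit unsigned integer values.

typedef v_reg< uchar, 16 >	cv::v_uint8x16
	Sixteen 8-bit unsigned integer values.

typedef v_reg< uchar, 32 >	cv::v_uint8x32
	Thirty-two 8-bit unsigned integer values.

typedef v_reg< uchar, 64 >	cv::v_uint8x64
	Sixty-four 8-bit unsigned integer values.

Enumerations
enum	{ cv::simd128_width = 16 , cv::simd256_width = 32 , cv::simd512_width = 64 , cv::simdmax_width = simd512_width }

Functions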
void	cv::v256_cleanup ()

template<typename _Tp >
v_reg< _Tp, simd256_width/sizeof(_Tp)>	cv::v256_load (const _Tp *ptr)
	Load 256-bit length register contents from memory.

template<typename _Tp >
v_reg< _Tp, simd256_width/sizeof(_Tp)>	cv::v256_load_aligned (const _Tp *ptr)
	Load register contents from memory (aligned).

template<typename _Tp >
v_reg< typename V_TypeTraits< _Tp >::w_type, simd256_width/sizeof(typename V_TypeTraits< _Tp >::w_type)>	cv::v256_load_expand (const _Tp *ptr)
	Load register contents from memory with double expand.

v_reg< float, simd256_width/sizeof(float)>	cv::v256_load_expand (const hfloat *ptr)

template<typename _Tp >
v_reg< typename V_TypeTraits< _Tp >::q_type, simd256_width/sizeof(typename V_TypeTraits< _Tp >::q_type)>	cv::v256_load_expand_q (const _Tp *ptr)
	Load register contents from memory with quad expand.

template<typename _Tp >
v_reg< _Tp, simd256_width/sizeof(_Tp)>	cv::v256_load_halves (const _Tp *loptr, const _Tp *hiptr)
	Load register contents from two memory blocks.

template<typename _Tp >
v_reg< _Tp, simd256_width/sizeof(_Tp)>	cv::v256_load_low (const _Tp *ptr)
	Load 128 bits of data to the lower part (the high part is undefined).

void	cv::v512_cleanup ()

template<typename _Tp >
v_reg< _Tp, simd512_width/sizeof(_Tp)>	cv::v512_load (const _Tp *ptr)
	Load 512-bit length register contents from memory.

template<typename _Tp >
v_reg< _Tp, simd512_width/sizeof(_Tp)>	cv::v512_load_aligned (const _Tp *ptr)
	Load register contents from memory (aligned).

template<typename _Tp >
v_reg< typename V_TypeTraits< _Tp >::w_type, simd512_width/sizeof(typename V_TypeTraits< _Tp >::w_type)>	cv::v512_load_expand (const _Tp *ptr)
	Load register contents from memory with double expand.

v_reg< float, simd512_width/sizeof(float)>	cv::v512_load_expand (const hfloat *ptr)

template<typename _Tp >
v_reg< typename V_TypeTraits< _Tp >::q_type, simd512_width/sizeof(typename V_TypeTraits< _Tp >::q_type)>	cv::v512_load_expand_q (const _Tp *ptr)
	Load register contents from memory with quad expand.

template<typename _Tp >
v_reg< _Tp, simd512_width/sizeof(_Tp)>	cv::v512_load_halves (const _Tp *loptr, const _Tp *hiptr)
	Load register contents from two memory blocks.

template<typename _Tp >
v_reg< _Tp, simd512_width/sizeof(_Tp)>	cv::v512_load_low (const _Tp *ptr)
	Load 256 bits of data to the lower part (the high part is undefined).

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::abs_type, n >	cv::v_abs (const v_reg< _Tp, n > &a)
	Absolute value of elements.

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::abs_type, n >	cv::v_absdiff (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Absolute difference.

template<int n>
v_reg< double, n >	cv::v_absdiff (const v_reg< double, n > &a, const v_reg< double, n > &b)

template<int n>
v_reg< float, n >	cv::v_absdiff (const v_reg< float, n > &a, const v_reg< float, n > &b)

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_absdiffs (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Saturating absolute difference.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_add (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Add values.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_add_wrap (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Add values without saturation.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_and (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Bitwise AND.

template<int i, typename _Tp , int n>
v_reg< _Tp, n >	cv::v_broadcast_element (const v_reg< _Tp, n > &a)
	Broadcast the i-th element of a vector.

template<int n>
v_reg< int, n *2 >	cv::v_ceil (const v_reg< double, n > &a)

template<int n>
v_reg< int, n >	cv::v_ceil (const v_reg< float, n > &a)
	Round elements up (ceil).

template<typename _Tp , int n>
bool	cv::v_check_all (const v_reg< _Tp, n > &a)
	Check if all packed values are less than zero.

template<typename _Tp , int n>
bool	cv::v_check_any (const v_reg< _Tp, n > &a)
	Check if any of the packed values is less than zero.

void	cv::v_cleanup ()

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_combine_high (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Combine a vector from the last elements of two vectors.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_combine_low (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Combine a vector from the first elements of two vectors.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_cos (const v_reg< _Tp, n > &a)
	Cosine \( \cos(x) \) of elements.

template<int n>
v_reg< float, n *2 >	cv::v_cvt_f32 (const v_reg< double, n > &a)
	Convert the lower half to float.

template<int n>
v_reg< float, n *2 >	cv::v_cvt_f32 (const v_reg< double, n > &a, const v_reg< double, n > &b)
	Convert to float.

template<int n>
v_reg< float, n >	cv::v_cvt_f32 (const v_reg< int, n > &a)
	Convert to float.

template<int n>
v_reg< double,(n/2)>	cv::v_cvt_f64 (const v_reg< float, n > &a)
	Convert the lower half to double.

template<int n>
v_reg< double, n/2 >	cv::v_cvt_f64 (const v_reg< int, n > &a)
	Convert the lower half to double.

template<int n>
v_reg< double, n >	cv::v_cvt_f64 (const v_reg< int64, n > &a)
	Convert to double.

template<int n>
v_reg< double,(n/2)>	cv::v_cvt_f64_high (const v_reg< float, n > &a)
	Convert the higher half of the vector to double.

template<int n>
v_reg< double,(n/2)>	cv::v_cvt_f64_high (const v_reg< int, n > &a)
	Convert the higher half of the vector to double.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_div (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Divide values.

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 >	cv::v_dotprod (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Dot product of elements.

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 >	cv::v_dotprod (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &c)
	Dot product of elements.

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 >	cv::v_dotprod_expand (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Dot product of elements and expand.

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 >	cv::v_dotprod_expand (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > &c)
	Dot product of elements.

template<int n>
v_reg< double, n/2 >	cv::v_dotprod_expand (const v_reg< int, n > &a, const v_reg< int, n > &b)

template<int n>
v_reg< double, n/2 >	cv::v_dotprod_expand (const v_reg< int, n > &a, const v_reg< int, n > &b, const v_reg< double, n/2 > &c)

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 >	cv::v_dotprod_expand_fast (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Fast dot product of elements and expand.

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 >	cv::v_dotprod_expand_fast (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > &c)
	Fast dot product of elements.

template<int n>
v_reg< double, n/2 >	cv::v_dotprod_expand_fast (const v_reg< int, n > &a, const v_reg< int, n > &b)

template<int n>
v_reg< double, n/2 >	cv::v_dotprod_expand_fast (const v_reg< int, n > &a, const v_reg< int, n > &b, const v_reg< double, n/2 > &c)

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 >	cv::v_dotprod_fast (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Fast dot product of elements.

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 >	cv::v_dotprod_fast (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &c)
	Fast dot product of elements.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_eq (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Equal comparison.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_erf (const v_reg< _Tp, n > &a)
	Error function.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_exp (const v_reg< _Tp, n > &a)
	Exponential \( e^x \) of elements.

template<typename _Tp , int n>
void	cv::v_expand (const v_reg< _Tp, n > &a, v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &b0, v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &b1)
	Expand values to the wider pack type.

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 >	cv::v_expand_high (const v_reg< _Tp, n > &a)
	Expand higher values to the wider pack type.

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 >	cv::v_expand_low (const v_reg< _Tp, n > &a)
	Expand lower values to the wider pack type.

template<int s, typename _Tp , int n>
v_reg< _Tp, n >	cv::v_extract (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Vector extract.

template<int s, typename _Tp , int n>
_Tp	cv::v_extract_n (const v_reg< _Tp, n > &v)
	Vector extract.

template<int n>
v_reg< int, n *2 >	cv::v_floor (const v_reg< double, n > &a)

template<int n>
v_reg< int, n >	cv::v_floor (const v_reg< float, n > &a)
	Round elements down (floor).

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_fma (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< _Tp, n > &c)
	Multiply and add.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_ge (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Greater-than or equal comparison.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_gt (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Greater-than comparison.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_interleave_pairs (const v_reg< _Tp, n > &vec)

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_interleave_quads (const v_reg< _Tp, n > &vec)

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_invsqrt (const v_reg< _Tp, n > &a)
	Inverse square root.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_le (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Less-than or equal comparison.

template<typename _Tp >
v_reg< _Tp, simd128_width/sizeof(_Tp)>	cv::v_load (const _Tp *ptr)
	Load register contents from memory.

template<typename _Tp >
v_reg< _Tp, simd128_width/sizeof(_Tp)>	cv::v_load_aligned (const _Tp *ptr)
	Load register contents from memory (aligned).

template<typename _Tp , int n>
void	cv::v_load_deinterleave (const _Tp *ptr, v_reg< _Tp, n > &a, v_reg< _Tp, n > &b)
	Load and deinterleave (2 channels).

template<typename _Tp , int n>
void	cv::v_load_deinterleave (const _Tp *ptr, v_reg< _Tp, n > &a, v_reg< _Tp, n > &b, v_reg< _Tp, n > &c)
	Load and deinterleave (3 channels).

template<typename _Tp , int n>
void	cv::v_load_deinterleave (const _Tp *ptr, v_reg< _Tp, n > &a, v_reg< _Tp, n > &b, v_reg< _Tp, n > &c, v_reg< _Tp, n > &d)
	Load and deinterleave (4 channels).

template<typename _Tp >
v_reg< typename V_TypeTraits< _Tp >::w_type, simd128_width/sizeof(typename V_TypeTraits< _Tp >::w_type)>	cv::v_load_expand (const _Tp *ptr)
	Load register contents from memory with double expand.

v_reg< float, simd128_width/sizeof(float)>	cv::v_load_expand (const hfloat *ptr)

template<typename _Tp >
v_reg< typename V_TypeTraits< _Tp >::q_type, simd128_width/sizeof(typename V_TypeTraits< _Tp >::q_type)>	cv::v_load_expand_q (const _Tp *ptr)
	Load register contents from memory with quad expand.

template<typename _Tp >
v_reg< _Tp, simd128_width/sizeof(_Tp)>	cv::v_load_halves (const _Tp *loptr, const _Tp *hiptr)
	Load register contents from two memory blocks.

template<typename _Tp >
v_reg< _Tp, simd128_width/sizeof(_Tp)>	cv::v_load_low (const _Tp *ptr)
	Load 64 bits of data to the lower part (the high part is undefined).

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_log (const v_reg< _Tp, n > &a)
	Natural logarithm \( \log(x) \) of elements.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_lt (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Less-than comparison.

template<typename _Tp >
v_reg< _Tp, simd128_width/sizeof(_Tp)>	cv::v_lut (const _Tp *tab, const int *idx)

template<int n>
v_reg< double, n/2 >	cv::v_lut (const double *tab, const v_reg< int, n > &idx)

template<int n>
v_reg< float, n >	cv::v_lut (const float *tab, const v_reg< int, n > &idx)

template<int n>
v_reg< int, n >	cv::v_lut (const int *tab, const v_reg< int, n > &idx)

template<int n>
v_reg< unsigned, n >	cv::v_lut (const unsigned *tab, const v_reg< int, n > &idx)

template<int n>
void	cv::v_lut_deinterleave (const double *tab, const v_reg< int, n *2 > &idx, v_reg< double, n > &x, v_reg< double, n > &y)

template<int n>
void	cv::v_lut_deinterleave (const float *tab, const v_reg< int, n > &idx, v_reg< float, n > &x, v_reg< float, n > &y)

template<typename _Tp >
v_reg< _Tp, simd128_width/sizeof(_Tp)>	cv::v_lut_pairs (const _Tp *tab, const int *idx)

template<typename _Tp >
v_reg< _Tp, simd128_width/sizeof(_Tp)>	cv::v_lut_quads (const _Tp *tab, const int *idx)

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_magnitude (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Magnitude.

template<int n>
v_reg< float, n >	cv::v_matmul (const v_reg< float, n > &v, const v_reg< float, n > &a, const v_reg< float, n > &b, const v_reg< float, n > &c, const v_reg< float, n > &d)
	Matrix multiplication.

template<int n>
v_reg< float, n >	cv::v_matmuladd (const v_reg< float, n > &v, const v_reg< float, n > &a, const v_reg< float, n > &b, const v_reg< float, n > &c, const v_reg< float, n > &d)
	Matrix multiplication and add.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_max (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Choose the max value for each pair.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_min (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Choose the min value for each pair.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_mul (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Multiply values.

template<typename _Tp , int n>
void	cv::v_mul_expand (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &c, v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &d)
	Multiply and expand.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_mul_hi (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Multiply and extract the high part.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_mul_wrap (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Multiply values without saturation.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_muladd (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< _Tp, n > &c)
	A synonym for v_fma.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_ne (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Not-equal comparison.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_not (const v_reg< _Tp, n > &a)
	Bitwise NOT.

template<int n>
v_reg< double, n >	cv::v_not_nan (const v_reg< double, n > &a)

template<int n>
v_reg< float, n >	cv::v_not_nan (const v_reg< float, n > &a)

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_or (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Bitwise OR.

template<int n>
void	cv::v_pack_store (hfloat *ptr, const v_reg< float, n > &v)

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_pack_triplets (const v_reg< _Tp, n > &vec)

template<typename _Tp , int n>
v_reg< typename V_TypeTraits< _Tp >::abs_type, n >	cv::v_popcount (const v_reg< _Tp, n > &a)
	Count the 1 bits in the vector lanes and return the result as the corresponding unsigned type.

template<typename _Tp , int n>
void	cv::v_recombine (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, v_reg< _Tp, n > &low, v_reg< _Tp, n > &high)
	Combine two vectors from the lower and higher parts of two other vectors.

template<typename _Tp , int n>
_Tp	cv::v_reduce_max (const v_reg< _Tp, n > &a)
	Find one max value.

template<typename _Tp , int n>
_Tp	cv::v_reduce_min (const v_reg< _Tp, n > &a)
	Find one min value.

template<typename _Tp , int n>
V_TypeTraits< typename V_TypeTraits< _Tp >::abs_type >::sum_type	cv::v_reduce_sad (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Sum of absolute differences of values.

template<typename _Tp , int n>
V_TypeTraits< _Tp >::sum_type	cv::v_reduce_sum (const v_reg< _Tp, n > &a)
	Sum packed values.

template<int n>
v_reg< float, n >	cv::v_reduce_sum4 (const v_reg< float, n > &a, const v_reg< float, n > &b, const v_reg< float, n > &c, const v_reg< float, n > &d)
	Sum all elements of each input vector and return the vector of sums.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_reverse (const v_reg< _Tp, n > &a)
	Reverse the order of the vector elements.

template<int imm, typename _Tp , int n>
v_reg< _Tp, n >	cv::v_rotate_left (const v_reg< _Tp, n > &a)
	Shift elements left within the vector.

template<int imm, typename _Tp , int n>
v_reg< _Tp, n >	cv::v_rotate_left (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)

template<int imm, typename _Tp , int n>
v_reg< _Tp, n >	cv::v_rotate_right (const v_reg< _Tp, n > &a)
	Shift elements right within the vector.

template<int imm, typename _Tp , int n>
v_reg< _Tp, n >	cv::v_rotate_right (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)

template<int n>
v_reg< int, n *2 >	cv::v_round (const v_reg< double, n > &a)

template<int n>
v_reg< int, n *2 >	cv::v_round (const v_reg< double, n > &a, const v_reg< double, n > &b)

template<int n>
v_reg< int, n >	cv::v_round (const v_reg< float, n > &a)
	Round elements.

template<typename _Tp , int n>
int	cv::v_scan_forward (const v_reg< _Tp, n > &a)
	Get the index of the first negative lane.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_select (const v_reg< _Tp, n > &mask, const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Per-element select (blend operation).

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_shl (const v_reg< _Tp, n > &a, int imm)
	Bitwise shift left.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_shr (const v_reg< _Tp, n > &a, int imm)
	Bitwise shift right.

template<typename _Tp , int n>
int	cv::v_signmask (const v_reg< _Tp, n > &a)
	Get the negative values mask.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_sin (const v_reg< _Tp, n > &a)
	Sine \( \sin(x) \) of elements.

template<typename _Tp , int n>
void	cv::v_sincos (const v_reg< _Tp, n > &x, v_reg< _Tp, n > &s, v_reg< _Tp, n > &c)
	Compute the sine \( \sin(x) \) and cosine \( \cos(x) \) of elements at the same time.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_sqr_magnitude (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Square of the magnitude.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_sqrt (const v_reg< _Tp, n > &a)
	Square root of elements.

template<typename _Tp , int n>
void	cv::v_store (_Tp *ptr, const v_reg< _Tp, n > &a)
	Store data to memory.

template<typename _Tp , int n>
void	cv::v_store (_Tp *ptr, const v_reg< _Tp, n > &a, hal::StoreMode)

template<typename _Tp , int n>
void	cv::v_store_aligned (_Tp *ptr, const v_reg< _Tp, n > &a)
	Store data to memory (aligned).

template<typename _Tp , int n>
void	cv::v_store_aligned (_Tp *ptr, const v_reg< _Tp, n > &a, hal::StoreMode)

template<typename _Tp , int n>
void	cv::v_store_aligned_nocache (_Tp *ptr, const v_reg< _Tp, n > &a)

template<typename _Tp , int n>
void	cv::v_store_high (_Tp *ptr, const v_reg< _Tp, n > &a)
	Store data to memory (higher half).

template<typename _Tp , int n>
void	cv::v_store_interleave (_Tp *ptr, const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< _Tp, n > &c, const v_reg< _Tp, n > &d, hal::StoreMode=hal::STORE_UNALIGNED)
	Interleave and store (4 channels).

template<typename _Tp , int n>
void	cv::v_store_interleave (_Tp *ptr, const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, const v_reg< _Tp, n > &c, hal::StoreMode=hal::STORE_UNALIGNED)
	Interleave and store (3 channels).

template<typename _Tp , int n>
void	cv::v_store_interleave (_Tp *ptr, const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b, hal::StoreMode=hal::STORE_UNALIGNED)
	Interleave and store (2 channels).

template<typename _Tp , int n>
void	cv::v_store_low (_Tp *ptr, const v_reg< _Tp, n > &a)
	Store data to memory (lower half).

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_sub (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Subtract values.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_sub_wrap (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Subtract values without saturation.

template<typename _Tp , int n>
void	cv::v_transpose4x4 (v_reg< _Tp, n > &a0, const v_reg< _Tp, n > &a1, const v_reg< _Tp, n > &a2, const v_reg< _Tp, n > &a3, v_reg< _Tp, n > &b0, v_reg< _Tp, n > &b1, v_reg< _Tp, n > &b2, v_reg< _Tp, n > &b3)
	Transpose a 4x4 matrix.

template<int n>
v_reg< int, n *2 >	cv::v_trunc (const v_reg< double, n > &a)

template<int n>
v_reg< int, n >	cv::v_trunc (const v_reg< float, n > &a)
	Truncate elements.

template<typename _Tp , int n>
v_reg< _Tp, n >	cv::v_xor (const v_reg< _Tp, n > &a, const v_reg< _Tp, n > &b)
	Bitwise XOR.

template<typename _Tp , int n>
void	cv::v_zip (const v_reg< _Tp, n > &a0, const v_reg< _Tp, n > &a1, v_reg< _Tp, n > &b0, v_reg< _Tp, n > &b1)
	Interleave two vectors.

Variables
static const unsigned char	cv::popCountTable []

Init with zero
Create a new vector with zero elements.
v_uint8x16	cv::v_setzero_u8 ()

template<>
v_uint8x16	cv::v_setzero_ ()

v_int8x16	cv::v_setzero_s8 ()

v_uint16x8	cv::v_setzero_u16 ()

v_int16x8	cv::v_setzero_s16 ()

v_uint32x4	cv::v_setzero_u32 ()

v_int32x4	cv::v_setzero_s32 ()

v_float32x4	cv::v_setzero_f32 ()

v_float64x2	cv::v_setzero_f64 ()

v_uint64x2	cv::v_setzero_u64 ()

v_int64x2	cv::v_setzero_s64 ()

v_uint8x32	cv::v256_setzero_u8 ()

v_int8x32	cv::v256_setzero_s8 ()

v_uint16x16	cv::v256_setzero_u16 ()

v_int16x16	cv::v256_setzero_s16 ()

v_uint32x8	cv::v256_setzero_u32 ()

v_int32x8	cv::v256_setzero_s32 ()

v_float32x8	cv::v256_setzero_f32 ()

v_float64x4	cv::v256_setzero_f64 ()

v_uint64x4	cv::v256_setzero_u64 ()

v_int64x4	cv::v256_setzero_s64 ()

v_uint8x64	cv::v512_setzero_u8 ()

v_int8x64	cv::v512_setzero_s8 ()

v_uint16x32	cv::v512_setzero_u16 ()

v_int16x32	cv::v512_setzero_s16 ()

v_uint32x16	cv::v512_setzero_u32 ()

v_int32x16	cv::v512_setzero_s32 ()

v_float32x16	cv::v512_setzero_f32 ()

v_float64x8	cv::v512_setzero_f64 ()

v_uint64x8	cv::v512_setzero_u64 ()

v_int64x8	cv::v512_setzero_s64 ()

Init with value
Create a new vector with all elements set to a specific value.
v_uint8x16	cv::v_setall_u8 (uchar val)

template<>
v_uint8x16	cv::v_setall_ (uchar val)

v_int8x16	cv::v_setall_s8 (schar val)

template<>
v_int8x16	cv::v_setall_ (schar val)

v_uint16x8	cv::v_setall_u16 (ushort val)

template<>
v_uint16x8	cv::v_setall_ (ushort val)

v_int16x8	cv::v_setall_s16 (short val)

template<>
v_int16x8	cv::v_setall_ (short val)

v_uint32x4	cv::v_setall_u32 (unsigned val)

template<>
v_uint32x4	cv::v_setall_ (unsigned val)

v_int32x4	cv::v_setall_s32 (int val)

template<>
v_int32x4	cv::v_setall_ (int val)

v_float32x4	cv::v_setall_f32 (float val)

template<>
v_float32x4	cv::v_setall_ (float val)

v_float64x2	cv::v_setall_f64 (double val)

template<>
v_float64x2	cv::v_setall_ (double val)

v_uint64x2	cv::v_setall_u64 (uint64 val)

template<>
v_uint64x2	cv::v_setall_ (uint64 val)

v_int64x2	cv::v_setall_s64 (int64 val)

template<>
v_int64x2	cv::v_setall_ (int64 val)

v_uint8x32	cv::v256_setall_u8 (uchar val)

v_int8x32	cv::v256_setall_s8 (schar val)

v_uint16x16	cv::v256_setall_u16 (ushort val)

v_int16x16	cv::v256_setall_s16 (short val)

v_uint32x8	cv::v256_setall_u32 (unsigned val)

v_int32x8	cv::v256_setall_s32 (int val)

v_float32x8	cv::v256_setall_f32 (float val)

v_float64x4	cv::v256_setall_f64 (double val)

v_uint64x4	cv::v256_setall_u64 (uint64 val)

v_int64x4	cv::v256_setall_s64 (int64 val)

v_uint8x64	cv::v512_setall_u8 (uchar val)

v_int8x64	cv::v512_setall_s8 (schar val)

v_uint16x32	cv::v512_setall_u16 (ushort val)

v_int16x32	cv::v512_setall_s16 (short val)

v_uint32x16	cv::v512_setall_u32 (unsigned val)

v_int32x16	cv::v512_setall_s32 (int val)

v_float32x16	cv::v512_setall_f32 (float val)

v_float64x8	cv::v512_setall_f64 (double val)

v_uint64x8	cv::v512_setall_u64 (uint64 val)

v_int64x8	cv::v512_setall_s64 (int64 val)

Reinterpret
Convert a vector to a different type without modifying the underlying data.
template<typename _Tp0 , int n0>
v_reg< uchar, n0 *sizeof(_Tp0)/sizeof(uchar)>	cv::v_reinterpret_as_u8 (const v_reg< _Tp0, n0 > &a)

template<typename _Tp0 , int n0>
v_reg< schar, n0 *sizeof(_Tp0)/sizeof(schar)>	cv::v_reinterpret_as_s8 (const v_reg< _Tp0, n0 > &a)

template<typename _Tp0 , int n0>
v_reg< ushort, n0 *sizeof(_Tp0)/sizeof(ushort)>	cv::v_reinterpret_as_u16 (const v_reg< _Tp0, n0 > &a)

template<typename _Tp0 , int n0>
v_reg< short, n0 *sizeof(_Tp0)/sizeof(short)>	cv::v_reinterpret_as_s16 (const v_reg< _Tp0, n0 > &a)

template<typename _Tp0 , int n0>
v_reg< unsigned, n0 *sizeof(_Tp0)/sizeof(unsigned)>	cv::v_reinterpret_as_u32 (const v_reg< _Tp0, n0 > &a)

template<typename _Tp0 , int n0>
v_reg< int, n0 *sizeof(_Tp0)/sizeof(int)>	cv::v_reinterpret_as_s32 (const v_reg< _Tp0, n0 > &a)

template<typename _Tp0 , int n0>
v_reg< float, n0 *sizeof(_Tp0)/sizeof(float)>	cv::v_reinterpret_as_f32 (const v_reg< _Tp0, n0 > &a)

template<typename _Tp0 , int n0>
v_reg< double, n0 *sizeof(_Tp0)/sizeof(double)>	cv::v_reinterpret_as_f64 (const v_reg< _Tp0, n0 > &a)

template<typename _Tp0 , int n0>
v_reg< uint64, n0 *sizeof(_Tp0)/sizeof(uint64)>	cv::v_reinterpret_as_u64 (const v_reg< _Tp0, n0 > &a)

template<typename _Tp0 , int n0>
v_reg< int64, n0 *sizeof(_Tp0)/sizeof(int64)>	cv::v_reinterpret_as_s64 (const v_reg< _Tp0, n0 > &a)
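
The lane-count expression in these signatures, n0 * sizeof(_Tp0) / sizeof(target), can be illustrated with a scalar sketch that copies the bytes of one array into an array of a different element type. `model_reinterpret` is a hypothetical name, not a library function. The sketch assumes only that the register bytes are kept as they are while the element type and lane count change.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scalar model of a reinterpret: the bytes stay the same, only the
// element type and the lane count change.
template <typename To, typename From, std::size_t N>
std::array<To, N * sizeof(From) / sizeof(To)>
model_reinterpret(const std::array<From, N>& a) {
    static_assert((N * sizeof(From)) % sizeof(To) == 0,
                  "register size must be a multiple of the target lane size");
    std::array<To, N * sizeof(From) / sizeof(To)> out{};
    // Byte-wise copy; no value conversion takes place.
    std::memcpy(out.data(), a.data(), N * sizeof(From));
    return out;
}
```

For example, reinterpreting four `float` lanes as `std::uint8_t` yields sixteen lanes, and on a platform with IEEE-754 single-precision floats the value 1.0f reinterpreted as `std::uint32_t` reads 0x3F800000. A numeric conversion such as v_cvt_f32 would instead change the bit pattern.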

Left shift
Shift left.
template<int shift, int n>
v_reg< ushort, n >	cv::v_shl (const v_reg< ushort, n > &a)

template<int shift, int n>
v_reg< short, n >	cv::v_shl (const v_reg< short, n > &a)

template<int shift, int n>
v_reg< unsigned, n >	cv::v_shl (const v_reg< unsigned, n > &a)

template<int shift, int n>
v_reg< int, n >	cv::v_shl (const v_reg< int, n > &a)

template<int shift, int n>
v_reg< uint64, n >	cv::v_shl (const v_reg< uint64, n > &a)

template<int shift, int n>
v_reg< int64, n >	cv::v_shl (const v_reg< int64, n > &a)

Right shift
Shift right.
template<int shift, int n>
v_reg< ushort, n >	cv::v_shr (const v_reg< ushort, n > &a)

template<int shift, int n>
v_reg< short, n >	cv::v_shr (const v_reg< short, n > &a)

template<int shift, int n>
v_reg< unsigned, n >	cv::v_shr (const v_reg< unsigned, n > &a)

template<int shift, int n>
v_reg< int, n >	cv::v_shr (const v_reg< int, n > &a)

template<int shift, int n>
v_reg< uint64, n >	cv::v_shr (const v_reg< uint64, n > &a)

template<int shift, int n>
v_reg< int64, n >	cv::v_shr (const v_reg< int64, n > &a)

舍入移位
舍入右移位
`template<int shift, int n>`
`cv::v_reg< cv::ushort, n >`	`cv::v_rshr` (const `cv::v_reg< cv::ushort, n > &a`)

`template<int shift, int n>`
`cv::v_reg< short, n >`	`cv::v_rshr` (const `cv::v_reg< short, n > &a`)

`template<int shift, int n>`
v_reg< unsigned, n >	`cv::v_rshr` (const `cv::v_reg< unsigned, n > &a`)

`template<int shift, int n>`
v_reg< int, n >	`cv::v_rshr` (const `cv::v_reg< int, n > &a`)

`template<int shift, int n>`
`cv::v_reg< cv::uint64, n >`	`cv::v_rshr` (const `cv::v_reg< cv::uint64, n > &a`)

`template<int shift, int n>`
`cv::v_reg< cv::int64, n >`	`cv::v_rshr` (const `cv::v_reg< cv::int64, n > &a`)
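
The rounding shift can be pictured lane by lane: half of the divisor is added before the plain right shift. The block below is a minimal scalar sketch of that idea and not OpenCV's implementation; `rshr_lane` and `rshr` are hypothetical names, and `std::array` stands in for a vector register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounding right shift of one lane: add half of 2^shift, then shift.
template <int shift, typename T>
T rshr_lane(T a)
{
    static_assert(shift > 0, "shift must be positive");
    return static_cast<T>((a + (static_cast<T>(1) << (shift - 1))) >> shift);
}

// Apply the lane operation to every element of a fixed-size array that
// stands in for a vector register (an assumption of this sketch).
template <int shift, typename T, std::size_t N>
std::array<T, N> rshr(const std::array<T, N>& a)
{
    std::array<T, N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = rshr_lane<shift>(a[i]);
    return r;
}
```

With `shift = 2`, the value 6 becomes 2 in this model, whereas a plain `6 >> 2` gives 1.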

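Pack
Packs values from two vectors into one. The returned vector type has twice as many elements as the input vector type. Variants with the _u suffix also convert to the corresponding unsigned type.
- pack: for 16-, 32- and 64-bit integer input types
- pack_u: for 16- and 32-bit signed integer input types
Note: all variants except the 64-bit ones use saturation.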
template<int n>
`cv::v_reg< cv::uchar, 2 * n >`	`cv::v_pack` (const `cv::v_reg< cv::ushort, n > &a`, const `cv::v_reg< cv::ushort, n > &b`)

template<int n>
`cv::v_reg< cv::schar, 2 * n >`	`cv::v_pack` (const `cv::v_reg< short, n > &a`, const `cv::v_reg< short, n > &b`)

template<int n>
`cv::v_reg< cv::ushort, 2 * n >`	`cv::v_pack` (const `cv::v_reg< unsigned, n > &a`, const `cv::v_reg< unsigned, n > &b`)

template<int n>
`cv::v_reg< short, 2 * n >`	`cv::v_pack` (const `cv::v_reg< int, n > &a`, const `cv::v_reg< int, n > &b`)

template<int n>
`cv::v_reg< unsigned, 2 * n >`	`cv::v_pack` (const `cv::v_reg< cv::uint64, n > &a`, const `cv::v_reg< cv::uint64, n > &b`)

template<int n>
`cv::v_reg< int, 2 * n >`	`cv::v_pack` (const `cv::v_reg< cv::int64, n > &a`, const `cv::v_reg< cv::int64, n > &b`)

template<int n>
`cv::v_reg< cv::uchar, 2 * n >`	`cv::v_pack_u` (const `cv::v_reg< short, n > &a`, const `cv::v_reg< short, n > &b`)

template<int n>
`cv::v_reg< cv::ushort, 2 * n >`	cv::v_pack_u (const v_reg< int, n > &a, const v_reg< int, n > &b)
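
To make the saturation and the lane layout concrete, here is a small scalar sketch, not the library code. The names `saturate` and `pack` are hypothetical, and the layout (lanes of `a` first, then lanes of `b`) is an assumption of this model. Choosing an unsigned narrow type for signed input mimics the _u variants.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Clamp a wide value into the range of the narrow type (saturation).
// Assumes the whole range of Narrow is representable in Wide.
template <typename Narrow, typename Wide>
Narrow saturate(Wide v)
{
    const Wide lo = static_cast<Wide>(std::numeric_limits<Narrow>::min());
    const Wide hi = static_cast<Wide>(std::numeric_limits<Narrow>::max());
    return static_cast<Narrow>(std::min(std::max(v, lo), hi));
}

// Scalar model of a pack: the result has 2 * N narrow lanes.
template <typename Narrow, typename Wide, std::size_t N>
std::array<Narrow, 2 * N> pack(const std::array<Wide, N>& a,
                               const std::array<Wide, N>& b)
{
    static_assert(sizeof(Narrow) < sizeof(Wide), "pack must narrow the lanes");
    std::array<Narrow, 2 * N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i]     = saturate<Narrow>(a[i]);
        r[i + N] = saturate<Narrow>(b[i]);
    }
    return r;
}
```

In this model, packing the 16-bit values -300 and 300 gives -128 and 127 for a signed 8-bit result, and 0 and 255 for an unsigned one.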

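Pack with rounding shift
Packs values from two vectors into one with a rounding shift. Values from the input vectors are shifted right by `shift` bits with rounding, converted to a narrower type and returned in the result vector. Variants with the _u suffix convert to the unsigned type.
- pack: for 16-, 32- and 64-bit integer input types
- pack_u: for 16- and 32-bit signed integer input types
Note: all variants except the 64-bit ones use saturation.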
`template<int shift, int n>`
`cv::v_reg< cv::uchar, 2 * n >`	cv::v_rshr_pack (const v_reg< ushort, n > &a, const v_reg< ushort, n > &b)

`template<int shift, int n>`
`cv::v_reg< cv::schar, 2 * n >`	cv::v_rshr_pack (const v_reg< short, n > &a, const v_reg< short, n > &b)

`template<int shift, int n>`
`cv::v_reg< cv::ushort, 2 * n >`	cv::v_rshr_pack (const v_reg< unsigned, n > &a, const v_reg< unsigned, n > &b)

`template<int shift, int n>`
`cv::v_reg< short, 2 * n >`	cv::v_rshr_pack (const v_reg< int, n > &a, const v_reg< int, n > &b)

`template<int shift, int n>`
`cv::v_reg< unsigned, 2 * n >`	cv::v_rshr_pack (const v_reg< uint64, n > &a, const v_reg< uint64, n > &b)

`template<int shift, int n>`
`cv::v_reg< int, 2 * n >`	cv::v_rshr_pack (const v_reg< int64, n > &a, const v_reg< int64, n > &b)

`template<int shift, int n>`
`cv::v_reg< cv::uchar, 2 * n >`	cv::v_rshr_pack_u (const v_reg< short, n > &a, const v_reg< short, n > &b)

`template<int shift, int n>`
`cv::v_reg< cv::ushort, 2 * n >`	cv::v_rshr_pack_u (const v_reg< int, n > &a, const v_reg< int, n > &b)

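Pack and store
Packs values from the input vector and stores them to memory. Values are converted to a narrower type before being stored. Variants with the _u suffix convert to the corresponding unsigned type.
- pack: for 16-, 32- and 64-bit integer input types
- pack_u: for 16- and 32-bit signed integer input types
Note: all variants except the 64-bit ones use saturation.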
template<int n>
void	cv::v_pack_store (uchar *ptr, const v_reg< ushort, n > &a)

template<int n>
void	cv::v_pack_store (schar *ptr, const v_reg< short, n > &a)

template<int n>
void	cv::v_pack_store (ushort *ptr, const v_reg< unsigned, n > &a)

template<int n>
void	cv::v_pack_store (short *ptr, const v_reg< int, n > &a)

template<int n>
void	cv::v_pack_store (unsigned *ptr, const v_reg< uint64, n > &a)

template<int n>
void	cv::v_pack_store (int *ptr, const v_reg< int64, n > &a)

template<int n>
void	cv::v_pack_u_store (uchar *ptr, const v_reg< short, n > &a)

template<int n>
void	cv::v_pack_u_store (ushort *ptr, const v_reg< int, n > &a)

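Pack and store with rounding shift
Packs values from the input vector and stores them to memory. Values are shifted right by `shift` bits with rounding, converted to a narrower type and stored to memory. Variants with the _u suffix convert to the unsigned type.
- pack: for 16-, 32- and 64-bit integer input types
- pack_u: for 16- and 32-bit signed integer input types
Note: all variants except the 64-bit ones use saturation.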
`template<int shift, int n>`
void	cv::v_rshr_pack_store (uchar *ptr, const v_reg< ushort, n > &a)

`template<int shift, int n>`
void	cv::v_rshr_pack_store (schar *ptr, const v_reg< short, n > &a)

`template<int shift, int n>`
void	cv::v_rshr_pack_store (ushort *ptr, const v_reg< unsigned, n > &a)

`template<int shift, int n>`
void	cv::v_rshr_pack_store (short *ptr, const v_reg< int, n > &a)

`template<int shift, int n>`
void	cv::v_rshr_pack_store (unsigned *ptr, const v_reg< uint64, n > &a)

`template<int shift, int n>`
void	cv::v_rshr_pack_store (int *ptr, const v_reg< int64, n > &a)

`template<int shift, int n>`
void	cv::v_rshr_pack_u_store (uchar *ptr, const v_reg< short, n > &a)

`template<int shift, int n>`
void	cv::v_rshr_pack_u_store (ushort *ptr, const v_reg< int, n > &a)

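Pack boolean values
Packs boolean values from multiple vectors into one unsigned 8-bit integer vector.
Note: valid boolean values must be provided to guarantee the same result on all architectures.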
template<int n>
`cv::v_reg< cv::uchar, 2 * n >`	cv::v_pack_b (const v_reg< ushort, n > &a, const v_reg< ushort, n > &b)
	For 16-bit boolean values

template<int n>
v_reg< uchar, 4 *n >	cv::v_pack_b (const v_reg< unsigned, n > &a, const v_reg< unsigned, n > &b, const v_reg< unsigned, n > &c, const v_reg< unsigned, n > &d)

template<int n>
v_reg< uchar, 8 *n >	cv::v_pack_b (const v_reg< uint64, n > &a, const v_reg< uint64, n > &b, const v_reg< uint64, n > &c, const v_reg< uint64, n > &d, const v_reg< uint64, n > &e, const v_reg< uint64, n > &f, const v_reg< uint64, n > &g, const v_reg< uint64, n > &h)
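
The requirement for valid boolean values is easier to see in a scalar model. The sketch below assumes that a valid 32-bit boolean lane is either all zeros or all ones and that packing keeps the low byte of each lane. Both points are assumptions of this illustration, and `is_valid_mask` and `pack_b` are hypothetical names rather than library functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// A valid 32-bit boolean lane is either all zeros or all ones.
inline bool is_valid_mask(std::uint32_t m)
{
    return m == 0u || m == 0xFFFFFFFFu;
}

// Scalar model: four 32-bit mask vectors become one 8-bit mask vector,
// laid out as a, b, c, d. Each lane keeps only its low byte.
template <std::size_t N>
std::array<std::uint8_t, 4 * N> pack_b(const std::array<std::uint32_t, N>& a,
                                       const std::array<std::uint32_t, N>& b,
                                       const std::array<std::uint32_t, N>& c,
                                       const std::array<std::uint32_t, N>& d)
{
    std::array<std::uint8_t, 4 * N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        // Other inputs could give different results under other packing schemes.
        assert(is_valid_mask(a[i]) && is_valid_mask(b[i]) &&
               is_valid_mask(c[i]) && is_valid_mask(d[i]));
        r[i]         = static_cast<std::uint8_t>(a[i]);
        r[i + N]     = static_cast<std::uint8_t>(b[i]);
        r[i + 2 * N] = static_cast<std::uint8_t>(c[i]);
        r[i + 3 * N] = static_cast<std::uint8_t>(d[i]);
    }
    return r;
}
```

A value such as 0x00000100 would truncate to 0 here but would saturate to 255 under a saturating pack, which is why only proper masks are portable.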

Macro Definition Documentation

◆ OPENCV_HAL_MATH_HAVE_EXP

#define OPENCV_HAL_MATH_HAVE_EXP 1

#include <opencv2/core/hal/intrin_cpp.hpp>

Typedef Documentation

◆ v_float32

typedef v_float32x16 simd512::v_float32

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 32-bit floating-point values (single precision).

◆ v_float32x16

typedef v_reg<float, 16> cv::v_float32x16

#include <opencv2/core/hal/intrin_cpp.hpp>

Sixteen 32-bit floating-point values (single precision).

◆ v_float32x4

typedef v_reg<float, 4> cv::v_float32x4

#include <opencv2/core/hal/intrin_cpp.hpp>

Four 32-bit floating-point values (single precision).

◆ v_float32x8

typedef v_reg<float, 8> cv::v_float32x8

#include <opencv2/core/hal/intrin_cpp.hpp>

Eight 32-bit floating-point values (single precision).

◆ v_float64

typedef v_float64x8 simd512::v_float64

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 64-bit floating-point values (double precision).

◆ v_float64x2

typedef v_reg<double, 2> cv::v_float64x2

#include <opencv2/core/hal/intrin_cpp.hpp>

Two 64-bit floating-point values (double precision).

◆ v_float64x4

typedef v_reg<double, 4> cv::v_float64x4

#include <opencv2/core/hal/intrin_cpp.hpp>

Four 64-bit floating-point values (double precision).

◆ v_float64x8

typedef v_reg<double, 8> cv::v_float64x8

#include <opencv2/core/hal/intrin_cpp.hpp>

Eight 64-bit floating-point values (double precision).

◆ v_int16

typedef v_int16x32 simd512::v_int16

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 16-bit signed integer values.

◆ v_int16x16

typedef v_reg<short, 16> cv::v_int16x16

#include <opencv2/core/hal/intrin_cpp.hpp>

Sixteen 16-bit signed integer values.

◆ v_int16x32

typedef v_reg<short, 32> cv::v_int16x32

#include <opencv2/core/hal/intrin_cpp.hpp>

Thirty-two 16-bit signed integer values.

◆ v_int16x8

typedef v_reg<short, 8> cv::v_int16x8

#include <opencv2/core/hal/intrin_cpp.hpp>

Eight 16-bit signed integer values.

◆ v_int32

typedef v_int32x16 simd512::v_int32

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 32-bit signed integer values.

◆ v_int32x16

typedef v_reg<int, 16> cv::v_int32x16

#include <opencv2/core/hal/intrin_cpp.hpp>

Sixteen 32-bit signed integer values.

◆ v_int32x4

typedef v_reg<int, 4> cv::v_int32x4

#include <opencv2/core/hal/intrin_cpp.hpp>

Four 32-bit signed integer values.

◆ v_int32x8

typedef v_reg<int, 8> cv::v_int32x8

#include <opencv2/core/hal/intrin_cpp.hpp>

Eight 32-bit signed integer values.

◆ v_int64

typedef v_int64x8 simd512::v_int64

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 64-bit signed integer values.

◆ v_int64x2

typedef v_reg<int64, 2> cv::v_int64x2

#include <opencv2/core/hal/intrin_cpp.hpp>

Two 64-bit signed integer values.

◆ v_int64x4

typedef v_reg<int64, 4> cv::v_int64x4

#include <opencv2/core/hal/intrin_cpp.hpp>

Four 64-bit signed integer values.

◆ v_int64x8

typedef v_reg<int64, 8> cv::v_int64x8

#include <opencv2/core/hal/intrin_cpp.hpp>

Eight 64-bit signed integer values.

◆ v_int8

typedef v_int8x64 simd512::v_int8

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 8-bit signed integer values.

◆ v_int8x16

typedef v_reg<schar, 16> cv::v_int8x16

#include <opencv2/core/hal/intrin_cpp.hpp>

Sixteen 8-bit signed integer values.

◆ v_int8x32

typedef v_reg<schar, 32> cv::v_int8x32

#include <opencv2/core/hal/intrin_cpp.hpp>

Thirty-two 8-bit signed integer values.

◆ v_int8x64

typedef v_reg<schar, 64> cv::v_int8x64

#include <opencv2/core/hal/intrin_cpp.hpp>

Sixty-four 8-bit signed integer values.

◆ v_uint16

typedef v_uint16x32 simd512::v_uint16

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 16-bit unsigned integer values.

◆ v_uint16x16

typedef v_reg<ushort, 16> cv::v_uint16x16

#include <opencv2/core/hal/intrin_cpp.hpp>

Sixteen 16-bit unsigned integer values.

◆ v_uint16x32

typedef v_reg<ushort, 32> cv::v_uint16x32

#include <opencv2/core/hal/intrin_cpp.hpp>

Thirty-two 16-bit unsigned integer values.

◆ v_uint16x8

typedef v_reg<ushort, 8> cv::v_uint16x8

#include <opencv2/core/hal/intrin_cpp.hpp>

Eight 16-bit unsigned integer values.

◆ v_uint32

typedef v_uint32x16 simd512::v_uint32

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 32-bit unsigned integer values.

◆ v_uint32x16

typedef v_reg<unsigned, 16> cv::v_uint32x16

#include <opencv2/core/hal/intrin_cpp.hpp>

Sixteen 32-bit unsigned integer values.

◆ v_uint32x4

typedef v_reg<unsigned, 4> cv::v_uint32x4

#include <opencv2/core/hal/intrin_cpp.hpp>

Four 32-bit unsigned integer values.

◆ v_uint32x8

typedef v_reg<unsigned, 8> cv::v_uint32x8

#include <opencv2/core/hal/intrin_cpp.hpp>

Eight 32-bit unsigned integer values.

◆ v_uint64

typedef v_uint64x8 simd512::v_uint64

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 64-bit unsigned integer values.

◆ v_uint64x2

typedef v_reg<uint64, 2> cv::v_uint64x2

#include <opencv2/core/hal/intrin_cpp.hpp>

Two 64-bit unsigned integer values.

◆ v_uint64x4

typedef v_reg<uint64, 4> cv::v_uint64x4

#include <opencv2/core/hal/intrin_cpp.hpp>

Four 64-bit unsigned integer values.

◆ v_uint64x8

typedef v_reg<uint64, 8> cv::v_uint64x8

#include <opencv2/core/hal/intrin_cpp.hpp>

Eight 64-bit unsigned integer values.

◆ v_uint8

typedef v_uint8x64 simd512::v_uint8

#include <opencv2/core/hal/intrin.hpp>

Maximum available vector register capacity: 8-bit unsigned integer values.

◆ v_uint8x16

typedef v_reg<uchar, 16> cv::v_uint8x16

#include <opencv2/core/hal/intrin_cpp.hpp>

Sixteen 8-bit unsigned integer values.

◆ v_uint8x32

typedef v_reg<uchar, 32> cv::v_uint8x32

#include <opencv2/core/hal/intrin_cpp.hpp>

Thirty-two 8-bit unsigned integer values.

◆ v_uint8x64

typedef v_reg<uchar, 64> cv::v_uint8x64

#include <opencv2/core/hal/intrin_cpp.hpp>

Sixty-four 8-bit unsigned integer values.

Enumeration Type Documentation

◆ anonymous enum

anonymous enum

#include <opencv2/core/hal/intrin_cpp.hpp>

Enumerator
simd128_width
simd256_width
simd512_width
simdmax_width

Function Documentation

◆ v256_cleanup()

void cv::v256_cleanup ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_load()

template<typename _Tp >

v_reg< _Tp, simd256_width/sizeof(_Tp)> cv::v256_load ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

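Loads the contents of a 256-bit register from memory.

Parameters

ptr	pointer to the memory block containing the data

Returns: register object

Note: The return type is deduced from the type of the pointer passed in, for example uchar ==> cv::v_uint8x32, int ==> cv::v_int32x8, and so on.
Note: Check the CV_SIMD256 preprocessor definition before use. Use the vx_load version to get a result of the maximum available register length.
Note: Alignment requirement: if CV_STRONG_ALIGNMENT=1, the pointer passed in must be aligned (alignment to `sizeof(lane type)` is sufficient). Do not cast pointer types (for example `uchar*` => `int*`) without a run-time check of the pointer alignment.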

◆ v256_load_aligned()

template<typename _Tp >

v_reg< _Tp, simd256_width/sizeof(_Tp)> cv::v256_load_aligned ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

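Loads register contents from memory (aligned).

Similar to cv::v256_load, but the source memory block must be aligned (to a 32-byte boundary for SIMD256, a 64-byte boundary for SIMD512, and so on).

Note: Check the CV_SIMD256 preprocessor definition before use. Use the vx_load_aligned version to get a result of the maximum available register length.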

◆ v256_load_expand() [1/2]

template<typename _Tp >

v_reg< typename V_TypeTraits< _Tp >::w_type, simd256_width/sizeof(typename V_TypeTraits< _Tp >::w_type)> cv::v256_load_expand ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

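Loads register contents from memory with double expansion.

Same as cv::v256_load, but the lanes of the result are twice as wide as the memory type.

short buf[8] = {1, 2, 3, 4, 5, 6, 7, 8}; // type is int16

v_int32x8 r = v256_load_expand(buf); // r = {1, 2, 3, 4, 5, 6, 7, 8} - type is int32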

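For 8-, 16- and 32-bit integer source types.

Note: Check the CV_SIMD256 preprocessor definition before use. Use the vx_load_expand version to get a result of the maximum available register length.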
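
The effect of an expanding load can be modelled without SIMD at all: each narrow value is read from memory and converted to a wider lane. The block below is a minimal scalar sketch under that assumption. The name `load_expand` and its explicit `Wide`/`N` template parameters are hypothetical and do not match the library signature.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a widening load: read N narrow values from memory and
// convert each to the wider lane type (sign- or zero-extending as the
// source type dictates).
template <typename Wide, std::size_t N, typename Narrow>
std::array<Wide, N> load_expand(const Narrow* ptr)
{
    static_assert(sizeof(Wide) > sizeof(Narrow), "result lanes must be wider");
    assert(ptr != nullptr);
    std::array<Wide, N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = static_cast<Wide>(ptr[i]);
    return r;
}
```

Passing `std::uint8_t` data with a 32-bit `Wide` type gives a model of the four-fold (`_q`) variant in the same way.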

◆ v256_load_expand() [2/2]

v_reg< float, simd256_width/sizeof(float)> cv::v256_load_expand ( const hfloat * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_load_expand_q()

template<typename _Tp >

v_reg< typename V_TypeTraits< _Tp >::q_type, simd256_width/sizeof(typename V_TypeTraits< _Tp >::q_type)> cv::v256_load_expand_q ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Loads register contents from memory with quad expansion.

Same as cv::v256_load_expand, but the result type is four times wider than the source type.

schar buf[8] = {1, 2, 3, 4, 5, 6, 7, 8}; // type is int8

v_int32x8 r = v256_load_expand_q(buf); // r = {1, 2, 3, 4, 5, 6, 7, 8} - type is int32

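For 8-bit integer source types.

Note: Check the CV_SIMD256 preprocessor definition before use. Use the vx_load_expand_q version to get a result of the maximum available register length.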

◆ v256_load_halves()

template<typename _Tp >

v_reg< _Tp, simd256_width/sizeof(_Tp)> cv::v256_load_halves	(	const _Tp *	loptr,
		const _Tp *	hiptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

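Loads register contents from two memory blocks.

Parameters

loptr	memory block containing the data for the first half (0..n/2)
hiptr	memory block containing the data for the second half (n/2..n)

int lo[4] = { 1, 2, 3, 4 }, hi[4] = { 5, 6, 7, 8 };

v_int32x8 r = v256_load_halves(lo, hi);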

Note: Check the CV_SIMD256 preprocessor definition before use. Use the vx_load_halves version to get a result of the maximum available register length.

◆ v256_load_low()

template<typename _Tp >

v_reg< _Tp, simd256_width/sizeof(_Tp)> cv::v256_load_low ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Loads 128 bits of data into the low part (the high part is undefined).

Parameters

ptr	memory block containing the data for the first half (0..n/2)

int lo[4] = { 1, 2, 3, 4 };

v_int32x8 r = v256_load_low(lo);

Note: Check the CV_SIMD256 preprocessor definition before use. Use the vx_load_low version to get a result of the maximum available register length.

◆ v256_setall_f32()

v_float32x8 cv::v256_setall_f32 ( float val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setall_f64()

v_float64x4 cv::v256_setall_f64 ( double val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setall_s16()

v_int16x16 cv::v256_setall_s16 ( short val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setall_s32()

v_int32x8 cv::v256_setall_s32 ( int val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setall_s64()

v_int64x4 cv::v256_setall_s64 ( int64 val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setall_s8()

v_int8x32 cv::v256_setall_s8 ( schar val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setall_u16()

v_uint16x16 cv::v256_setall_u16 ( ushort val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setall_u32()

v_uint32x8 cv::v256_setall_u32 ( unsigned val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setall_u64()

v_uint64x4 cv::v256_setall_u64 ( uint64 val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setall_u8()

v_uint8x32 cv::v256_setall_u8 ( uchar val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_f32()

v_float32x8 cv::v256_setzero_f32 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_f64()

v_float64x4 cv::v256_setzero_f64 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_s16()

v_int16x16 cv::v256_setzero_s16 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_s32()

v_int32x8 cv::v256_setzero_s32 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_s64()

v_int64x4 cv::v256_setzero_s64 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_s8()

v_int8x32 cv::v256_setzero_s8 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_u16()

v_uint16x16 cv::v256_setzero_u16 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_u32()

v_uint32x8 cv::v256_setzero_u32 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_u64()

v_uint64x4 cv::v256_setzero_u64 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v256_setzero_u8()

v_uint8x32 cv::v256_setzero_u8 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_cleanup()

void cv::v512_cleanup ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_load()

template<typename _Tp >

v_reg< _Tp, simd512_width/sizeof(_Tp)> cv::v512_load ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

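Loads the contents of a 512-bit register from memory.

Parameters

ptr	pointer to the memory block containing the data

Returns: register object

Note: The return type is deduced from the type of the pointer passed in, for example uchar ==> cv::v_uint8x64, int ==> cv::v_int32x16, and so on.
Note: Check the CV_SIMD512 preprocessor definition before use. Use the vx_load version to get a result of the maximum available register length.
Note: Alignment requirement: if CV_STRONG_ALIGNMENT=1, the pointer passed in must be aligned (alignment to `sizeof(lane type)` is sufficient). Do not cast pointer types (for example `uchar*` => `int*`) without a run-time check of the pointer alignment.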

◆ v512_load_aligned()

template<typename _Tp >

v_reg< _Tp, simd512_width/sizeof(_Tp)> cv::v512_load_aligned ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

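Loads register contents from memory (aligned).

Similar to cv::v512_load, but the source memory block must be aligned (to a 64-byte boundary for SIMD512).

Note: Check the CV_SIMD512 preprocessor definition before use. Use the vx_load_aligned version to get a result of the maximum available register length.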

◆ v512_load_expand() [1/2]

template<typename _Tp >

v_reg< typename V_TypeTraits< _Tp >::w_type, simd512_width/sizeof(typename V_TypeTraits< _Tp >::w_type)> cv::v512_load_expand ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Loads register contents from memory with double expansion.

Same as cv::v512_load, but the lanes of the result are twice as wide as the memory type.

short buf[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}; // type is int16

v_int32x16 r = v512_load_expand(buf); // r = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16} - type is int32

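For 8-, 16- and 32-bit integer source types.

Note: Check the CV_SIMD512 preprocessor definition before use. Use the vx_load_expand version to get a result of the maximum available register length.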

◆ v512_load_expand() [2/2]

v_reg< float, simd512_width/sizeof(float)> cv::v512_load_expand ( const hfloat * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_load_expand_q()

template<typename _Tp >

v_reg< typename V_TypeTraits< _Tp >::q_type, simd512_width/sizeof(typename V_TypeTraits< _Tp >::q_type)> cv::v512_load_expand_q ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Loads register contents from memory with quad expansion.

Same as cv::v512_load_expand, but the result type is four times wider than the source type.

schar buf[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}; // type is int8

v_int32x16 r = v512_load_expand_q(buf); // r = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16} - type is int32

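For 8-bit integer source types.

Note: Check the CV_SIMD512 preprocessor definition before use. Use the vx_load_expand_q version to get a result of the maximum available register length.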

◆ v512_load_halves()

template<typename _Tp >

v_reg< _Tp, simd512_width/sizeof(_Tp)> cv::v512_load_halves	(	const _Tp *	loptr,
		const _Tp *	hiptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

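Loads register contents from two memory blocks.

Parameters

loptr	memory block containing the data for the first half (0..n/2)
hiptr	memory block containing the data for the second half (n/2..n)

int lo[8] = { 1, 2, 3, 4, 5, 6, 7, 8 }, hi[8] = { 9, 10, 11, 12, 13, 14, 15, 16 };

v_int32x16 r = v512_load_halves(lo, hi);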

Note: Check the CV_SIMD512 preprocessor definition before use. Use the vx_load_halves version to get a result of the maximum available register length.

◆ v512_load_low()

template<typename _Tp >

v_reg< _Tp, simd512_width/sizeof(_Tp)> cv::v512_load_low ( const _Tp * ptr )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Loads 256 bits of data into the low part (the high part is undefined).

Parameters

ptr	memory block containing the data for the first half (0..n/2)

int lo[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };

v_int32x16 r = v512_load_low(lo);

Note: Check the CV_SIMD512 preprocessor definition before use. Use the vx_load_low version to get a result of the maximum available register length.

◆ v512_setall_f32()

v_float32x16 cv::v512_setall_f32 ( float val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setall_f64()

v_float64x8 cv::v512_setall_f64 ( double val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setall_s16()

v_int16x32 cv::v512_setall_s16 ( short val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setall_s32()

v_int32x16 cv::v512_setall_s32 ( int val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setall_s64()

v_int64x8 cv::v512_setall_s64 ( int64 val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setall_s8()

v_int8x64 cv::v512_setall_s8 ( schar val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setall_u16()

v_uint16x32 cv::v512_setall_u16 ( ushort val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setall_u32()

v_uint32x16 cv::v512_setall_u32 ( unsigned val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setall_u64()

v_uint64x8 cv::v512_setall_u64 ( uint64 val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setall_u8()

v_uint8x64 cv::v512_setall_u8 ( uchar val )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_f32()

v_float32x16 cv::v512_setzero_f32 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_f64()

v_float64x8 cv::v512_setzero_f64 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_s16()

v_int16x32 cv::v512_setzero_s16 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_s32()

v_int32x16 cv::v512_setzero_s32 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_s64()

v_int64x8 cv::v512_setzero_s64 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_s8()

v_int8x64 cv::v512_setzero_s8 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_u16()

v_uint16x32 cv::v512_setzero_u16 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_u32()

v_uint32x16 cv::v512_setzero_u32 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_u64()

v_uint64x8 cv::v512_setzero_u64 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v512_setzero_u8()

v_uint8x64 cv::v512_setzero_u8 ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_abs()

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::abs_type, n > cv::v_abs ( const v_reg< _Tp, n > & a )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Absolute value of elements.

Only for floating-point types.

◆ v_absdiff() [1/3]

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::abs_type, n > cv::v_absdiff	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Absolute difference.

Returns \( |a - b| \) converted to the corresponding unsigned type.

Example

v_int32x4 a, b; // {1, 2, 3, 4} and {4, 3, 2, 1}

v_uint32x4 c = v_absdiff(a, b); // result is {3, 1, 1, 3}

For 8-, 16- and 32-bit integer source types.
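
Returning an unsigned type matters because \( |a - b| \) of two signed values may not fit in the signed type. The following scalar sketch shows one way to compute it without signed overflow. `absdiff` is a hypothetical stand-in working on `std::array`, not the library function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of an absolute difference with an unsigned result.
// Subtracting in the unsigned domain avoids signed overflow.
template <std::size_t N>
std::array<std::uint32_t, N> absdiff(const std::array<std::int32_t, N>& a,
                                     const std::array<std::int32_t, N>& b)
{
    std::array<std::uint32_t, N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        const std::uint32_t ua = static_cast<std::uint32_t>(a[i]);
        const std::uint32_t ub = static_cast<std::uint32_t>(b[i]);
        r[i] = (a[i] > b[i]) ? ua - ub : ub - ua;
    }
    return r;
}
```

In this model the difference between `INT32_MIN` and `INT32_MAX` is 4294967295, a value that only the unsigned result type can hold.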

◆ v_absdiff() [2/3]

template<int n>

v_reg< double, n > cv::v_absdiff	(	const v_reg< double, n > &	a,
		const v_reg< double, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

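This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

For 64-bit floating-point values.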

◆ v_absdiff() [3/3]

template<int n>

v_reg< float, n > cv::v_absdiff	(	const v_reg< float, n > &	a,
		const v_reg< float, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

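This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

For 32-bit floating-point values.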

◆ v_absdiffs()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_absdiffs	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Saturating absolute difference.

Returns \( saturate(|a - b|) \). For 8- and 16-bit signed integer source types.
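
Unlike the unsigned-result variant, the saturating form keeps the signed lane type, so large differences are clamped. Below is a minimal scalar sketch of that behaviour. `absdiffs_lane` is a hypothetical name, and the clamp to the maximum of the lane type is how this model interprets `saturate`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <limits>

// Saturating absolute difference of one signed lane. The difference is
// formed in a wider type, then clamped to the maximum of the lane type.
template <typename T>
T absdiffs_lane(T a, T b)
{
    static_assert(std::numeric_limits<T>::is_signed && sizeof(T) <= 2,
                  "modelled for 8- and 16-bit signed lanes only");
    const long d  = std::labs(static_cast<long>(a) - static_cast<long>(b));
    const long hi = std::numeric_limits<T>::max();
    return static_cast<T>(d < hi ? d : hi);
}
```

For 8-bit lanes, the inputs -128 and 127 differ by 255, which this model clamps to 127.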

◆ v_add()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_add	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

#include <opencv2/core/hal/intrin_cpp.hpp>

Adds values.

For all types.

◆ v_add_wrap()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_add_wrap	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Adds values without saturation.

For 8- and 16-bit integer values.

◆ v_and()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_and	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

#include <opencv2/core/hal/intrin_cpp.hpp>

Bitwise AND.

Only for integer types.

◆ v_broadcast_element()

template<int i, typename _Tp , int n>

v_reg< _Tp, n > cv::v_broadcast_element ( const v_reg< _Tp, n > & a )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Broadcasts the i-th element of a vector.

Scheme

{ v[0] v[1] v[2] ... v[SZ] } => { v[i], v[i], v[i] ... v[i] }

Restrictions: 0 <= i < nlanes
Supported types: 32-bit integers and floats (s32/u32/f32)
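
The scheme above can be written as a few lines of scalar code. This is only an illustration with a hypothetical `broadcast_element` over `std::array`. The compile-time index check mirrors the stated restriction on `i`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model: copy lane I of the input into every lane of the result.
template <std::size_t I, typename T, std::size_t N>
std::array<T, N> broadcast_element(const std::array<T, N>& v)
{
    static_assert(I < N, "lane index out of range");
    std::array<T, N> r{};
    r.fill(v[I]);
    return r;
}
```

For the input {10, 20, 30, 40} and index 2, every lane of the result holds 30.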

◆ v_ceil() [1/2]

template<int n>

v_reg< int, n *2 > cv::v_ceil ( const v_reg< double, n > & a )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ v_ceil() [2/2]

template<int n>

v_reg< int, n > cv::v_ceil ( const v_reg< float, n > & a )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Ceils each element.

Rounds each value up. Input type is a floating-point vector ==> output type is an integer vector.

Note: Only for floating-point types.

◆ v_check_all()

template<typename _Tp , int n>

bool cv::v_check_all ( const v_reg< _Tp, n > & a )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Checks whether all packed values are less than zero.

Unsigned values are cast to signed: uchar 254 => char -2.

◆ v_check_any()

template<typename _Tp , int n>

bool cv::v_check_any ( const v_reg< _Tp, n > & a )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Checks whether any packed value is less than zero.

Unsigned values are cast to signed: uchar 254 => char -2.
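
For unsigned lanes, "less than zero" amounts to testing the most significant bit. The scalar sketch below models both checks for 8-bit lanes. `lane_is_negative`, `check_all` and `check_any` are hypothetical names used only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sign-bit test of one 8-bit lane. For uchar this matches reinterpreting
// the value as signed and comparing with zero (254 => -2 => negative).
inline bool lane_is_negative(std::uint8_t x)
{
    return (x & 0x80u) != 0;
}

// True when every lane has its sign bit set.
template <std::size_t N>
bool check_all(const std::array<std::uint8_t, N>& a)
{
    for (std::uint8_t x : a)
        if (!lane_is_negative(x)) return false;
    return true;
}

// True when at least one lane has its sign bit set.
template <std::size_t N>
bool check_any(const std::array<std::uint8_t, N>& a)
{
    for (std::uint8_t x : a)
        if (lane_is_negative(x)) return true;
    return false;
}
```

Because an all-ones comparison mask has its sign bit set, checks of this kind are a natural way to test the result of a lane-wise comparison.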

◆ v_cleanup()

void cv::v_cleanup ( )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_combine_high()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_combine_high	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Combines a vector from the last (high-half) elements of two vectors.

Scheme

{A1 A2 A3 A4}
{B1 B2 B3 B4}
---------------
{A3 A4 B3 B4}

For all types except 64-bit.

◆ v_combine_low()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_combine_low	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Combines a vector from the first (low-half) elements of two vectors.

Scheme

{A1 A2 A3 A4}
{B1 B2 B3 B4}
---------------
{A1 A2 B1 B2}

For all types except 64-bit.
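
Both combine schemes are simple index mappings, which the following scalar sketch spells out. `combine_low` and `combine_high` here are hypothetical stand-ins operating on `std::array`, not the library functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Low halves of a and b: {A1 A2 B1 B2} for four lanes.
template <typename T, std::size_t N>
std::array<T, N> combine_low(const std::array<T, N>& a, const std::array<T, N>& b)
{
    static_assert(N % 2 == 0, "lane count must be even");
    std::array<T, N> r{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        r[i]         = a[i];
        r[i + N / 2] = b[i];
    }
    return r;
}

// High halves of a and b: {A3 A4 B3 B4} for four lanes.
template <typename T, std::size_t N>
std::array<T, N> combine_high(const std::array<T, N>& a, const std::array<T, N>& b)
{
    static_assert(N % 2 == 0, "lane count must be even");
    std::array<T, N> r{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        r[i]         = a[i + N / 2];
        r[i + N / 2] = b[i + N / 2];
    }
    return r;
}
```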

◆ v_cos()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_cos ( const v_reg< _Tp, n > & a )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Cosine \( cos(x) \) of elements.

Only for floating-point types. The core implementation is the same as v_sincos.

◆ v_cvt_f32() [1/3]

template<int n>

v_reg< float, n *2 > cv::v_cvt_f32 ( const v_reg< double, n > & a )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Converts the lower half to float.

Supported input type is cv::v_float64.

◆ v_cvt_f32() [2/3]

template<int n>

v_reg< float, n *2 > cv::v_cvt_f32	(	const v_reg< double, n > &	a,
		const v_reg< double, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Converts to float.

Supported input type is cv::v_float64.

◆ v_cvt_f32() [3/3]

template<int n>

v_reg< float, n > cv::v_cvt_f32 ( const v_reg< int, n > & a )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Converts to float.

Supported input type is cv::v_int32.

◆ v_cvt_f64() [1/3]

template<int n>

v_reg< double,(n/2)> cv::v_cvt_f64 ( const v_reg< float, n > & a )

#include <opencv2/core/hal/intrin_cpp.hpp>

Converts the lower half to double.

Supported input type is cv::v_float32.

◆ v_cvt_f64() [2/3]

template<int n>

v_reg< double, n/2 > cv::v_cvt_f64 ( const v_reg< int, n > & a )

#include <opencv2/core/hal/intrin_cpp.hpp>

Converts the lower half to double.

Supported input type is cv::v_int32.

◆ v_cvt_f64() [3/3]

template<int n>

v_reg< double, n > cv::v_cvt_f64 ( const v_reg< int64, n > & a )

#include <opencv2/core/hal/intrin_cpp.hpp>

Converts to double.

Supported input type is cv::v_int64.

◆ v_cvt_f64_high() [1/2]

template<int n>

v_reg< double,(n/2)> cv::v_cvt_f64_high ( const v_reg< float, n > & a )

#include <opencv2/core/hal/intrin_cpp.hpp>

Converts the upper half of the vector to double.

Supported input type is cv::v_float32.

◆ v_cvt_f64_high() [2/2]

template<int n>

v_reg< double,(n/2)> cv::v_cvt_f64_high ( const v_reg< int, n > & a )

#include <opencv2/core/hal/intrin_cpp.hpp>

Converts the upper half of the vector to double.

Supported input type is cv::v_int32.

◆ v_div()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_div	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

#include <opencv2/core/hal/intrin_cpp.hpp>

Divides values.

Only for floating-point types.

◆ v_dotprod() [1/2]

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > cv::v_dotprod	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Dot product of elements.

Multiplies the values in two registers and sums adjacent pairs of results.

Scheme

  {A1 A2 ...}      // 16-bit
x {B1 B2 ...}      // 16-bit
-------------
{A1B1+A2B2 ...}    // 32-bit

◆ v_dotprod() [2/2]

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > cv::v_dotprod	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		const v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &	c )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

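Dot product of elements.

Same as cv::v_dotprod, but also adds the corresponding element of a third vector to each sum of adjacent pairs.

Scheme

  {A1 A2 ...}         // 16-bit
x {B1 B2 ...}         // 16-bit
-------------
{A1B1+A2B2+C1 ...}    // 32-bit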
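
The two schemes can be checked with a scalar model for 16-bit inputs and 32-bit results. The block below is a sketch under that assumption, with a hypothetical `dotprod` over `std::array`. The overload without `c` simply adds zeros.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model: r[i] = a[2i]*b[2i] + a[2i+1]*b[2i+1] + c[i].
template <std::size_t N>
std::array<std::int32_t, N / 2> dotprod(const std::array<std::int16_t, N>& a,
                                        const std::array<std::int16_t, N>& b,
                                        const std::array<std::int32_t, N / 2>& c)
{
    static_assert(N % 2 == 0, "lane count must be even");
    std::array<std::int32_t, N / 2> r{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        // Accumulate in 64 bits so the intermediate sum cannot overflow.
        const std::int64_t sum =
            static_cast<std::int64_t>(a[2 * i]) * b[2 * i] +
            static_cast<std::int64_t>(a[2 * i + 1]) * b[2 * i + 1] + c[i];
        r[i] = static_cast<std::int32_t>(sum);
    }
    return r;
}

// Two-argument form: the same computation with a zero accumulator.
template <std::size_t N>
std::array<std::int32_t, N / 2> dotprod(const std::array<std::int16_t, N>& a,
                                        const std::array<std::int16_t, N>& b)
{
    return dotprod(a, b, std::array<std::int32_t, N / 2>{});
}
```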

◆ v_dotprod_expand() [1/4]

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > cv::v_dotprod_expand	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

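Dot product of elements with expansion.

Multiplies the values in two registers and expands the sum of adjacent result pairs.

Scheme

  {A1 A2 A3 A4 ...}           // 8-bit
x {B1 B2 B3 B4 ...}           // 8-bit
-------------
{A1B1+A2B2+A3B3+A4B4 ...}     // 32-bit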

◆ v_dotprod_expand() [2/4]

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > cv::v_dotprod_expand	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		const v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > &	c )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

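Dot product of elements.

Same as cv::v_dotprod_expand, but also adds the corresponding element of a third vector to each sum of adjacent pairs.

Scheme

  {A1 A2 A3 A4 ...}              // 8-bit
x {B1 B2 B3 B4 ...}              // 8-bit
-------------
{A1B1+A2B2+A3B3+A4B4+C1 ...}     // 32-bit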

◆ v_dotprod_expand() [3/4]

template<int n>

v_reg< double, n/2 > cv::v_dotprod_expand	(	const v_reg< int, n > &	a,
		const v_reg< int, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_dotprod_expand() [4/4]

template<int n>

v_reg< double, n/2 > cv::v_dotprod_expand	(	const v_reg< int, n > &	a,
		const v_reg< int, n > &	b,
		const v_reg< double, n/2 > &	c )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_dotprod_expand_fast() [1/4]

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > cv::v_dotprod_expand_fast	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

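Fast dot product of elements with expansion.

Multiplies the values in two registers and expands the sum of adjacent result pairs.

Same as cv::v_dotprod_expand, but on some platforms the result pairs may be summed in a different order. Use this intrinsic when only the total sum over all lanes matters; it should give better performance on the affected platforms.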
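
Why the order of summation does not matter for a total can be seen in a scalar model. The sketch below compares an ordered grouping (four adjacent products per lane) with a reordered one. The interleaved grouping is a purely hypothetical example of a different order, not a description of any real platform, and all three function names are invented for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>

// Ordered model: each output lane sums four adjacent products.
template <std::size_t N>
std::array<std::uint32_t, N / 4> dotprod_expand(const std::array<std::uint8_t, N>& a,
                                                const std::array<std::uint8_t, N>& b)
{
    static_assert(N % 4 == 0, "lane count must be a multiple of 4");
    std::array<std::uint32_t, N / 4> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i / 4] += static_cast<std::uint32_t>(a[i]) * b[i];
    return r;
}

// Hypothetical reordered model: the same products, distributed over the
// output lanes in an interleaved order.
template <std::size_t N>
std::array<std::uint32_t, N / 4> dotprod_expand_reordered(const std::array<std::uint8_t, N>& a,
                                                          const std::array<std::uint8_t, N>& b)
{
    static_assert(N % 4 == 0, "lane count must be a multiple of 4");
    std::array<std::uint32_t, N / 4> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i % (N / 4)] += static_cast<std::uint32_t>(a[i]) * b[i];
    return r;
}

// Sum of all lanes; this is the only quantity both models agree on.
template <std::size_t M>
std::uint32_t reduce_sum(const std::array<std::uint32_t, M>& v)
{
    return std::accumulate(v.begin(), v.end(), std::uint32_t{0});
}
```

The individual lanes differ between the two models, but the reduced sums are equal, which is the only property a caller of a reordering variant should rely on.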

◆ v_dotprod_expand_fast() [2/4]

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > cv::v_dotprod_expand_fast	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		const v_reg< typename V_TypeTraits< _Tp >::q_type, n/4 > &	c )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Fast dot product of elements.

Same as cv::v_dotprod_expand_fast, but also adds the corresponding element of a third vector to each sum of adjacent pairs.

◆ v_dotprod_expand_fast() [3/4]

template<int n>

v_reg< double, n/2 > cv::v_dotprod_expand_fast	(	const v_reg< int, n > &	a,
		const v_reg< int, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_dotprod_expand_fast() [4/4]

template<int n>

v_reg< double, n/2 > cv::v_dotprod_expand_fast	(	const v_reg< int, n > &	a,
		const v_reg< int, n > &	b,
		const v_reg< double, n/2 > &	c )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_dotprod_fast() [1/2]

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > cv::v_dotprod_fast	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

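Fast dot product of elements.

Same as cv::v_dotprod, but on some platforms the result pairs may be summed in a different order. Use this intrinsic when only the total sum over all lanes matters; it should give better performance on the affected platforms.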

◆ v_dotprod_fast() [2/2]

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > cv::v_dotprod_fast	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		const v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &	c )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Fast dot product of elements.

Same as cv::v_dotprod_fast, but also adds the corresponding element of a third vector to each sum of adjacent pairs.

◆ v_eq()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_eq	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

内联

#include <opencv2/core/hal/intrin_cpp.hpp>

Equal comparison.

◆ v_erf()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_erf ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Error function.

Note: Only FP32 precision is currently supported.

◆ v_exp()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_exp ( const v_reg< _Tp, n > & a )

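inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Exponential \( e^x \) of elements.

Only for floating point types. Core implementation steps:

Decompose the input: rewrite the input as \( 2^{x \cdot \log_2e} \) and split the exponent into an integer part and a fractional part: \( x \cdot \log_2e = n + f \), where \( n \) is the integer part and \( f \) is the fractional part.
Compute \( 2^n \): calculated by bit shifting.
Adjust the fractional part: compute \( f \cdot \ln2 \) to convert the fractional part to base \( e \).
Polynomial approximation of \( e^{f \cdot \ln2} \): the closer the fractional part is to 0, the more accurate the result.
- For float16 and float32, use a Taylor series with 6 terms.
- For float64, use a Pade polynomial approximation with 4 terms.
Combine the results: multiply the two parts together to get the final result: \( e^x = 2^n \cdot e^{f \cdot \ln2} \).

Note: The precision of the calculation depends on the implementation and the data type of the input vector.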
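
The steps above can be sketched for a single float value. This is a hedged illustration, not the OpenCV code: `exp_sketch` is a hypothetical name, rounding to the nearest integer is an assumption made to keep the fractional part small, and `std::ldexp` stands in for the bit shift into the exponent field.

```cpp
#include <cmath>

// Hypothetical scalar sketch of the range reduction described above.
inline float exp_sketch(float x) {
    const float log2e = 1.44269504f;   // log2(e)
    const float ln2 = 0.69314718f;     // ln(2)
    float t = x * log2e;               // x * log2(e) = n + f
    float n = std::floor(t + 0.5f);    // integer part, chosen so that |f| <= 0.5
    float g = (t - n) * ln2;           // f * ln(2): fractional part moved to base e
    // 6-term Taylor expansion of e^g around 0, evaluated with Horner's scheme.
    float p = 1.0f + g * (1.0f + g * (0.5f + g * (1.0f / 6 + g * (1.0f / 24 + g * (1.0f / 120)))));
    // std::ldexp scales by 2^n; a SIMD version would shift n into the exponent bits instead.
    return std::ldexp(p, static_cast<int>(n));
}
```

Because the reduced argument stays within about ±0.35, six Taylor terms are already close to float precision in this sketch; overflow, underflow and NaN handling are deliberately left out.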

◆ v_expand()

template<typename _Tp , int n>

void cv::v_expand	(	const v_reg< _Tp, n > &	a,
		v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &	b0,
		v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &	b1 )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Expand values to the wider pack type.

Copies the contents of a register into two registers whose pack type is twice as wide. Scheme:

 int32x4     int64x2 int64x2
{A B C D} ==> {A B} , {C D}

◆ v_expand_high()

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > cv::v_expand_high ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Expand higher values to the wider pack type.

Same as cv::v_expand_low, but expands the higher half of the vector.

Scheme:

 int32x4     int64x2
{A B C D} ==> {C D}

◆ v_expand_low()

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > cv::v_expand_low ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Expand lower values to the wider pack type.

Same as cv::v_expand, but returns only the lower half of the vector.

Scheme:

 int32x4     int64x2
{A B C D} ==> {A B}

◆ v_extract()

template<int s, typename _Tp , int n>

v_reg< _Tp, n > cv::v_extract	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Vector extract.

Scheme:

{A1 A2 A3 A4}
{B1 B2 B3 B4}
========================
shift = 1 {A2 A3 A4 B1}
shift = 2 {A3 A4 B1 B2}
shift = 3 {A4 B1 B2 B3}

Restriction: 0 <= shift < nlanes

Usage:

v_int32x4 a, b, c;
c = v_extract<2>(a, b);

For all types.
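
The scheme can be written as a short scalar model. `extract_model` is a hypothetical helper over `std::array`, given here only to make the lane arithmetic explicit; it is not the library function.

```cpp
#include <array>
#include <cstddef>

// Hypothetical scalar model of the scheme above:
// take lanes S..N-1 of a, followed by lanes 0..S-1 of b.
template <std::size_t S, typename T, std::size_t N>
std::array<T, N> extract_model(const std::array<T, N>& a, const std::array<T, N>& b) {
    static_assert(S < N, "shift must satisfy 0 <= shift < nlanes");
    std::array<T, N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i] = (i + S < N) ? a[i + S] : b[i + S - N];
    }
    return r;
}
```

With a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, `extract_model<2>(a, b)` gives {3, 4, 5, 6}, matching the "shift = 2" row.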

◆ v_extract_n()

template<int s, typename _Tp , int n>

_Tp cv::v_extract_n ( const v_reg< _Tp, n > & v )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Vector extract.

Scheme: returns the s-th element of v. Restriction: 0 <= s < nlanes

Usage:

v_int32x4 a;
int r;
r = v_extract_n<2>(a);

For all types.

◆ v_floor() [1/2]

template<int n>

v_reg< int, n *2 > cv::v_floor ( const v_reg< double, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ v_floor() [2/2]

template<int n>

v_reg< int, n > cv::v_floor ( const v_reg< float, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Floor elements.

Floors each value. Input type is a float vector ==> output type is an int vector.

Note: Only for floating point types.

◆ v_fma()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_fma	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		const v_reg< _Tp, n > &	c )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Multiply and add.

Returns \( a*b + c \). For floating point types and signed 32-bit integers only.

◆ v_ge()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_ge	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Greater-than-or-equal comparison.

For all types except 64-bit integer values.

◆ v_gt()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_gt	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Greater-than comparison.

For all types except 64-bit integer values.

◆ v_interleave_pairs()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_interleave_pairs ( const v_reg< _Tp, n > & vec )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_interleave_quads()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_interleave_quads ( const v_reg< _Tp, n > & vec )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_invsqrt()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_invsqrt ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Inverse square root.

Returns \( 1/\sqrt{a} \). For floating point types only.

◆ v_le()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_le	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Less-than-or-equal comparison.

For all types except 64-bit integer values.

◆ v_load()

template<typename _Tp >

v_reg< _Tp, simd128_width/sizeof(_Tp)> cv::v_load ( const _Tp * ptr )

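inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Load register contents from memory.

Parameters

ptr	pointer to the memory block with data

Returns: register object

Note: The return type is detected from the passed pointer type, for example uchar ==> cv::v_uint8x16, int ==> cv::v_int32x4, etc.
Note: Use the vx_load version to get a result of the maximum available register length.
Note: Alignment requirement: if CV_STRONG_ALIGNMENT=1, the passed pointer must be aligned (`sizeof(lane type)` should be enough). Do not cast pointer types without a runtime check of pointer alignment (for example `uchar*` => `int*`).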

◆ v_load_aligned()

template<typename _Tp >

v_reg< _Tp, simd128_width/sizeof(_Tp)> cv::v_load_aligned ( const _Tp * ptr )

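inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Load register contents from memory (aligned).

Similar to cv::v_load, but the source memory block must be aligned (to a 16-byte boundary for SIMD128, a 32-byte boundary for SIMD256, etc.).

Note: Use the vx_load_aligned version to get a result of the maximum available register length.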

◆ v_load_deinterleave() [1/3]

template<typename _Tp , int n>

void cv::v_load_deinterleave	(	const _Tp *	ptr,
		v_reg< _Tp, n > &	a,
		v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Load and deinterleave (2 channels).

Loads data from memory, deinterleaves it and stores the result in 2 registers. Scheme:

{A1 B1 A2 B2 ...} ==> {A1 A2 ...}, {B1 B2 ...}

For all types except 64-bit.

◆ v_load_deinterleave() [2/3]

template<typename _Tp , int n>

void cv::v_load_deinterleave	(	const _Tp *	ptr,
		v_reg< _Tp, n > &	a,
		v_reg< _Tp, n > &	b,
		v_reg< _Tp, n > &	c )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Load and deinterleave (3 channels).

Loads data from memory, deinterleaves it and stores the result in 3 registers. Scheme:

{A1 B1 C1 A2 B2 C2 ...} ==> {A1 A2 ...}, {B1 B2 ...}, {C1 C2 ...}

For all types except 64-bit.
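
A scalar model makes the three-channel scheme concrete. `deinterleave3_model` is a hypothetical helper, with `std::array` standing in for a register; it is a sketch of the data movement only.

```cpp
#include <array>
#include <cstddef>

// Hypothetical scalar model: split N interleaved 3-channel samples into three planes.
template <typename T, std::size_t N>
void deinterleave3_model(const T* ptr, std::array<T, N>& a, std::array<T, N>& b, std::array<T, N>& c) {
    for (std::size_t i = 0; i < N; ++i) {
        a[i] = ptr[3 * i];      // A1 A2 ...
        b[i] = ptr[3 * i + 1];  // B1 B2 ...
        c[i] = ptr[3 * i + 2];  // C1 C2 ...
    }
}
```

The model reads 3 * N consecutive elements starting at `ptr`, which is the typical situation when splitting packed three-channel pixel data into separate planes.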

◆ v_load_deinterleave() [3/3]

template<typename _Tp , int n>

void cv::v_load_deinterleave	(	const _Tp *	ptr,
		v_reg< _Tp, n > &	a,
		v_reg< _Tp, n > &	b,
		v_reg< _Tp, n > &	c,
		v_reg< _Tp, n > &	d )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Load and deinterleave (4 channels).

Loads data from memory, deinterleaves it and stores the result in 4 registers. Scheme:

{A1 B1 C1 D1 A2 B2 C2 D2 ...} ==> {A1 A2 ...}, {B1 B2 ...}, {C1 C2 ...}, {D1 D2 ...}

For all types except 64-bit.

◆ v_load_expand() [1/2]

template<typename _Tp >

v_reg< typename V_TypeTraits< _Tp >::w_type, simd128_width/sizeof(typename V_TypeTraits< _Tp >::w_type)> cv::v_load_expand ( const _Tp * ptr )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Load register contents from memory with double expand.

Same as cv::v_load, but the result pack type is twice as wide as the memory type.

short buf[4] = {1, 2, 3, 4}; // type is int16
v_int32x4 r = v_load_expand(buf); // r = {1, 2, 3, 4} - type is int32

For 8-, 16- and 32-bit integer source types.

Note: Use the vx_load_expand version to get a result of the maximum available register length.

◆ v_load_expand() [2/2]

v_reg< float, simd128_width/sizeof(float)> cv::v_load_expand ( const hfloat * ptr )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_load_expand_q()

template<typename _Tp >

v_reg< typename V_TypeTraits< _Tp >::q_type, simd128_width/sizeof(typename V_TypeTraits< _Tp >::q_type)> cv::v_load_expand_q ( const _Tp * ptr )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Load register contents from memory with quad expand.

Same as cv::v_load_expand, but the result type is four times as wide as the source type.

char buf[4] = {1, 2, 3, 4}; // type is int8
v_int32x4 r = v_load_expand_q(buf); // r = {1, 2, 3, 4} - type is int32

For 8-bit integer source types.

Note: Use the vx_load_expand_q version to get a result of the maximum available register length.

◆ v_load_halves()

template<typename _Tp >

v_reg< _Tp, simd128_width/sizeof(_Tp)> cv::v_load_halves	(	const _Tp *	loptr,
		const _Tp *	hiptr )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Load register contents from two memory blocks.

Parameters

loptr	memory block containing data for the first half (0..n/2)
hiptr	memory block containing data for the second half (n/2..n)

int lo[2] = { 1, 2 }, hi[2] = { 3, 4 };
v_int32x4 r = v_load_halves(lo, hi);

Note: Use the vx_load_halves version to get a result of the maximum available register length.

◆ v_load_low()

template<typename _Tp >

v_reg< _Tp, simd128_width/sizeof(_Tp)> cv::v_load_low ( const _Tp * ptr )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Load 64 bits of data into the lower part (the higher part is undefined).

Parameters

ptr	memory block containing data for the first half (0..n/2)

int lo[2] = { 1, 2 };
v_int32x4 r = v_load_low(lo);

Note: Use the vx_load_low version to get a result of the maximum available register length.

◆ v_log()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_log ( const v_reg< _Tp, n > & a )

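inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Natural logarithm \( \log(x) \) of elements.

Only for floating point types. Core implementation steps:

Decompose the input: use the binary representation to split the input into a mantissa part \( m \) and an exponent part \( e \), so that \( \log(x) = \log(m \cdot 2^e) = \log(m) + e \cdot \ln(2) \).
Adjust the mantissa and exponent parts: if the mantissa is less than \( \sqrt{0.5} \), adjust the exponent and the mantissa so that the mantissa lies in the range \( (\sqrt{0.5}, \sqrt{2}) \) for a better approximation.
Polynomial approximation of \( \log(m) \): the closer \( m \) is to 1, the more accurate the result.
- For float16 and float32, use a Taylor series with 9 terms.
- For float64, use a Pade polynomial approximation with 6 terms.
Combine the results: add the two parts together to get the final result.

Note: The precision of the calculation depends on the implementation and the data type of the input.
Note: Similar to the behavior of std::log(), \( \ln(0) = -\infty \).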
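
The decomposition above can be sketched for a single positive float. This is an illustration under stated assumptions, not the OpenCV code: `log_sketch` is a hypothetical name, `std::frexp` stands in for reading the exponent bits, and zero, negative, infinite and NaN inputs are not handled.

```cpp
#include <cmath>

// Hypothetical scalar sketch of the steps above for x > 0.
inline float log_sketch(float x) {
    int e = 0;
    float m = std::frexp(x, &e);       // x = m * 2^e with m in [0.5, 1)
    if (m < 0.70710678f) {             // m < sqrt(0.5): move a factor of 2 out of the exponent
        m *= 2.0f;                     // now m is in [sqrt(0.5), sqrt(2))
        e -= 1;
    }
    float t = m - 1.0f;                // log(m) = log(1 + t) with t near 0
    // 9-term Taylor series of log(1 + t): t - t^2/2 + t^3/3 - ... + t^9/9 (Horner form).
    float p = 0.0f;
    for (int k = 9; k >= 1; --k) {
        float c = ((k % 2) ? 1.0f : -1.0f) / k;
        p = t * (c + p);
    }
    return p + e * 0.69314718f;        // log(m) + e * ln(2)
}
```

Keeping the mantissa between sqrt(0.5) and sqrt(2) bounds |t| by roughly 0.41, which is what lets a short series converge in this sketch.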

◆ v_lt()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_lt	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Less-than comparison.

For all types except 64-bit integer values.

◆ v_lut() [1/5]

template<typename _Tp >

v_reg< _Tp, simd128_width/sizeof(_Tp)> cv::v_lut	(	const _Tp *	tab,
		const int *	idx )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_lut() [2/5]

template<int n>

v_reg< double, n/2 > cv::v_lut	(	const double *	tab,
		const v_reg< int, n > &	idx )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_lut() [3/5]

template<int n>

v_reg< float, n > cv::v_lut	(	const float *	tab,
		const v_reg< int, n > &	idx )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_lut() [4/5]

template<int n>

v_reg< int, n > cv::v_lut	(	const int *	tab,
		const v_reg< int, n > &	idx )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_lut() [5/5]

template<int n>

v_reg< unsigned, n > cv::v_lut	(	const unsigned *	tab,
		const v_reg< int, n > &	idx )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_lut_deinterleave() [1/2]

template<int n>

void cv::v_lut_deinterleave	(	const double *	tab,
		const v_reg< int, n *2 > &	idx,
		v_reg< double, n > &	x,
		v_reg< double, n > &	y )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_lut_deinterleave() [2/2]

template<int n>

void cv::v_lut_deinterleave	(	const float *	tab,
		const v_reg< int, n > &	idx,
		v_reg< float, n > &	x,
		v_reg< float, n > &	y )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_lut_pairs()

template<typename _Tp >

v_reg< _Tp, simd128_width/sizeof(_Tp)> cv::v_lut_pairs	(	const _Tp *	tab,
		const int *	idx )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_lut_quads()

template<typename _Tp >

v_reg< _Tp, simd128_width/sizeof(_Tp)> cv::v_lut_quads	(	const _Tp *	tab,
		const int *	idx )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_magnitude()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_magnitude	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Magnitude.

Returns \( \sqrt{a^2 + b^2} \). For floating point types only.

◆ v_matmul()

template<int n>

v_reg< float, n > cv::v_matmul	(	const v_reg< float, n > &	v,
		const v_reg< float, n > &	a,
		const v_reg< float, n > &	b,
		const v_reg< float, n > &	c,
		const v_reg< float, n > &	d )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Matrix multiplication.

Scheme:

{A0 A1 A2 A3}   |V0|
{B0 B1 B2 B3}   |V1|
{C0 C1 C2 C3}   |V2|
{D0 D1 D2 D3} x |V3|
====================
{R0 R1 R2 R3}, where:
R0 = A0V0 + B0V1 + C0V2 + D0V3,
R1 = A1V0 + B1V1 + C1V2 + D1V3
...
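
Read lane by lane, the scheme is a linear combination of the four row registers weighted by the lanes of v. The model below is a hypothetical scalar sketch (`Vec4` and `matmul_model` are illustrative names), not the intrinsic itself.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;

// Hypothetical scalar model of the scheme above: R = V0*A + V1*B + V2*C + V3*D.
inline Vec4 matmul_model(const Vec4& v, const Vec4& a, const Vec4& b, const Vec4& c, const Vec4& d) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i) {
        r[i] = a[i] * v[0] + b[i] * v[1] + c[i] * v[2] + d[i] * v[3];
    }
    return r;
}
```

Passing the rows of the identity matrix as a, b, c, d returns v unchanged, which is a quick sanity check of the lane order.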

◆ v_matmuladd()

template<int n>

v_reg< float, n > cv::v_matmuladd	(	const v_reg< float, n > &	v,
		const v_reg< float, n > &	a,
		const v_reg< float, n > &	b,
		const v_reg< float, n > &	c,
		const v_reg< float, n > &	d )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Matrix multiplication and addition.

Scheme:

{A0 A1 A2 A3}   |V0|   |D0|
{B0 B1 B2 B3}   |V1|   |D1|
{C0 C1 C2 C3} x |V2| + |D2|
====================   |D3|
{R0 R1 R2 R3}, where:
R0 = A0V0 + B0V1 + C0V2 + D0,
R1 = A1V0 + B1V1 + C1V2 + D1
...

◆ v_max()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_max	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Choose the maximum value of each pair.

Scheme:

{A1 A2 ...}
{B1 B2 ...}
--------------
{max(A1,B1) max(A2,B2) ...}

For all types except 64-bit integer.

◆ v_min()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_min	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Choose the minimum value of each pair.

Scheme:

{A1 A2 ...}
{B1 B2 ...}
--------------
{min(A1,B1) min(A2,B2) ...}

For all types except 64-bit integer.

◆ v_mul()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_mul	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

#include <opencv2/core/hal/intrin_cpp.hpp>

Multiply values.

For 16- and 32-bit integer types and floating point types.

◆ v_mul_expand()

template<typename _Tp , int n>

void cv::v_mul_expand	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &	c,
		v_reg< typename V_TypeTraits< _Tp >::w_type, n/2 > &	d )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Multiply and expand.

Multiplies the values of two registers and stores the results in two registers with a wider pack type. Scheme:

  {A B C D} // 32-bit
x {E F G H} // 32-bit
---------------
{AE BF}     // 64-bit
{CG DH}     // 64-bit

Example:

v_uint32x4 a, b; // {1,2,3,4} and {2,2,2,2}
v_uint64x2 c, d; // results
v_mul_expand(a, b, c, d); // c, d = {2,4}, {6, 8}

Implemented only for 16-bit and unsigned 32-bit source types (v_int16x8, v_uint16x8, v_uint32x4).
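
The point of the expansion is that each product keeps its full width. The scalar model below is a hypothetical sketch for four unsigned 32-bit lanes (`mul_expand_model` is an illustrative name), not the library code.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model: widen each 32-bit product to 64 bits,
// products of the low two lanes go to c, products of the high two lanes go to d.
inline void mul_expand_model(const std::array<std::uint32_t, 4>& a, const std::array<std::uint32_t, 4>& b,
                             std::array<std::uint64_t, 2>& c, std::array<std::uint64_t, 2>& d) {
    for (int i = 0; i < 2; ++i) {
        c[i] = std::uint64_t(a[i]) * b[i];
        d[i] = std::uint64_t(a[i + 2]) * b[i + 2];
    }
}
```

With a = {1, 2, 3, 4} and b = {2, 2, 2, 2} this reproduces c = {2, 4} and d = {6, 8} from the example; a product such as 0xFFFFFFFF * 2 is kept in full instead of wrapping at 32 bits.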

◆ v_mul_hi()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_mul_hi	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

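inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Multiply and extract the high part.

Multiplies the values of two registers and stores the high part of the results. Implemented only for 16-bit source types (v_int16x8, v_uint16x8). Returns \( a*b >> 16 \).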

◆ v_mul_wrap()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_mul_wrap	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Multiply values without saturation.

For 8- and 16-bit integer values.

◆ v_muladd()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_muladd	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		const v_reg< _Tp, n > &	c )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

A synonym for v_fma.

◆ v_ne()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_ne	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Not-equal comparison.

◆ v_not()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_not ( const v_reg< _Tp, n > & a )

#include <opencv2/core/hal/intrin_cpp.hpp>

Bitwise NOT.

Only for integer types.

◆ v_not_nan() [1/2]

template<int n>

v_reg< double, n > cv::v_not_nan ( const v_reg< double, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_not_nan() [2/2]

template<int n>

v_reg< float, n > cv::v_not_nan ( const v_reg< float, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_or()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_or	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

#include <opencv2/core/hal/intrin_cpp.hpp>

Bitwise OR.

Only for integer types.

◆ v_pack() [1/6]

template<int n>

v_reg< short, 2 *n > cv::v_pack	(	const v_reg< int, n > &	a,
		const v_reg< int, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack() [2/6]

template<int n>

v_reg< int, 2 *n > cv::v_pack	(	const v_reg< int64, n > &	a,
		const v_reg< int64, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack() [3/6]

template<int n>

v_reg< schar, 2 *n > cv::v_pack	(	const v_reg< short, n > &	a,
		const v_reg< short, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack() [4/6]

template<int n>

v_reg< unsigned, 2 *n > cv::v_pack	(	const v_reg< uint64, n > &	a,
		const v_reg< uint64, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack() [5/6]

template<int n>

v_reg< ushort, 2 *n > cv::v_pack	(	const v_reg< unsigned, n > &	a,
		const v_reg< unsigned, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack() [6/6]

template<int n>

v_reg< uchar, 2 *n > cv::v_pack	(	const v_reg< ushort, n > &	a,
		const v_reg< ushort, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_b() [1/3]

template<int n>

v_reg< uchar, 8 *n > cv::v_pack_b	(	const v_reg< uint64, n > &	a,
		const v_reg< uint64, n > &	b,
		const v_reg< uint64, n > &	c,
		const v_reg< uint64, n > &	d,
		const v_reg< uint64, n > &	e,
		const v_reg< uint64, n > &	f,
		const v_reg< uint64, n > &	g,
		const v_reg< uint64, n > &	h )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. For 64-bit boolean values.

Scheme:

a {0xFFFF.. 0}
b {0 0xFFFF..}
c {0xFFFF.. 0}
d {0 0xFFFF..}

e {0xFFFF.. 0}
f {0xFFFF.. 0}
g {0 0xFFFF..}
h {0 0xFFFF..}
===============
{
0xFF 0 0 0xFF 0xFF 0 0 0xFF
0xFF 0 0xFF 0 0 0xFF 0 0xFF
}

◆ v_pack_b() [2/3]

template<int n>

v_reg< uchar, 4 *n > cv::v_pack_b	(	const v_reg< unsigned, n > &	a,
		const v_reg< unsigned, n > &	b,
		const v_reg< unsigned, n > &	c,
		const v_reg< unsigned, n > &	d )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. For 32-bit boolean values.

Scheme:

a {0xFFFF.. 0 0 0xFFFF..}
b {0 0xFFFF.. 0xFFFF.. 0}
c {0xFFFF.. 0 0xFFFF.. 0}
d {0 0xFFFF.. 0 0xFFFF..}
===============
{
0xFF 0 0 0xFF 0 0xFF 0xFF 0
0xFF 0 0xFF 0 0 0xFF 0 0xFF
}

◆ v_pack_b() [3/3]

template<int n>

v_reg< uchar, 2 *n > cv::v_pack_b	(	const v_reg< ushort, n > &	a,
		const v_reg< ushort, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

For 16-bit boolean values.

Scheme:

a {0xFFFF 0 0 0xFFFF 0 0xFFFF 0xFFFF 0}
b {0xFFFF 0 0xFFFF 0 0 0xFFFF 0 0xFFFF}
===============
{
0xFF 0 0 0xFF 0 0xFF 0xFF 0
0xFF 0 0xFF 0 0 0xFF 0 0xFF
}
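
The 16-bit case can be modelled in a few lines of scalar C++. `pack_b_model` is a hypothetical helper used only to show the lane order and the narrowing of all-ones masks; it assumes every input lane is either 0 or 0xFFFF.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model: narrow 16-bit boolean lanes (0 or 0xFFFF) to 8-bit ones (0 or 0xFF),
// lanes of a first, then lanes of b.
template <std::size_t N>
std::array<std::uint8_t, 2 * N> pack_b_model(const std::array<std::uint16_t, N>& a,
                                             const std::array<std::uint16_t, N>& b) {
    std::array<std::uint8_t, 2 * N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i] = static_cast<std::uint8_t>(a[i]);      // 0xFFFF -> 0xFF, 0 -> 0
        r[i + N] = static_cast<std::uint8_t>(b[i]);
    }
    return r;
}
```

Feeding the a and b rows of the scheme into the model yields the 16-byte result shown there.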

◆ v_pack_store() [1/7]

template<int n>

void cv::v_pack_store	(	hfloat *	ptr,
		const v_reg< float, n > &	v )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_store() [2/7]

template<int n>

void cv::v_pack_store	(	int *	ptr,
		const v_reg< int64, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_store() [3/7]

template<int n>

void cv::v_pack_store	(	schar *	ptr,
		const v_reg< short, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_store() [4/7]

template<int n>

void cv::v_pack_store	(	short *	ptr,
		const v_reg< int, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_store() [5/7]

template<int n>

void cv::v_pack_store	(	uchar *	ptr,
		const v_reg< ushort, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_store() [6/7]

template<int n>

void cv::v_pack_store	(	unsigned *	ptr,
		const v_reg< uint64, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_store() [7/7]

template<int n>

void cv::v_pack_store	(	ushort *	ptr,
		const v_reg< unsigned, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_triplets()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_pack_triplets ( const v_reg< _Tp, n > & vec )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_u() [1/2]

template<int n>

v_reg< ushort, 2 *n > cv::v_pack_u	(	const v_reg< int, n > &	a,
		const v_reg< int, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_u() [2/2]

template<int n>

v_reg< uchar, 2 *n > cv::v_pack_u	(	const v_reg< short, n > &	a,
		const v_reg< short, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_u_store() [1/2]

template<int n>

void cv::v_pack_u_store	(	uchar *	ptr,
		const v_reg< short, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_pack_u_store() [2/2]

template<int n>

void cv::v_pack_u_store	(	ushort *	ptr,
		const v_reg< int, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_popcount()

template<typename _Tp , int n>

v_reg< typename V_TypeTraits< _Tp >::abs_type, n > cv::v_popcount ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Counts the 1 bits in each vector lane and returns the result as the corresponding unsigned type.

Scheme:

{A1 A2 A3 ...} => {popcount(A1), popcount(A2), popcount(A3), ...}

For all integer types.

◆ v_recombine()

template<typename _Tp , int n>

void cv::v_recombine	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		v_reg< _Tp, n > &	low,
		v_reg< _Tp, n > &	high )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Combine two vectors from the lower and higher parts of two other vectors.

low = cv::v_combine_low(a, b);
high = cv::v_combine_high(a, b);

◆ v_reduce_max()

template<typename _Tp , int n>

_Tp cv::v_reduce_max ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Find one maximum value.

Scheme:

{A1 A2 A3 ...} => max(A1,A2,A3,...)

For all types except 64-bit integer and 64-bit floating point types.

◆ v_reduce_min()

template<typename _Tp , int n>

_Tp cv::v_reduce_min ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Find one minimum value.

Scheme:

{A1 A2 A3 ...} => min(A1,A2,A3,...)

For all types except 64-bit integer and 64-bit floating point types.

◆ v_reduce_sad()

template<typename _Tp , int n>

V_TypeTraits< typename V_TypeTraits< _Tp >::abs_type >::sum_type cv::v_reduce_sad	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Sum of absolute differences of values.

Scheme:

{A1 A2 A3 ...} {B1 B2 B3 ...} => sum{abs(A1-B1),abs(A2-B2),abs(A3-B3),...}

For all types except 64-bit types.
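
A scalar model of the reduction for 8-bit unsigned lanes is sketched below. `reduce_sad_model` is a hypothetical name, and the `unsigned` accumulator is an assumption chosen so that the sum cannot overflow the lane type.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model for 8-bit unsigned lanes:
// sum of |a[i] - b[i]| accumulated in a wider type.
template <std::size_t N>
unsigned reduce_sad_model(const std::array<std::uint8_t, N>& a, const std::array<std::uint8_t, N>& b) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < N; ++i) {
        sum += (a[i] > b[i]) ? unsigned(a[i] - b[i]) : unsigned(b[i] - a[i]);
    }
    return sum;
}
```

Sums of absolute differences of this kind are a common block-matching cost, where two pixel rows are compared lane by lane and reduced to one score.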

◆ v_reduce_sum()

template<typename _Tp , int n>

V_TypeTraits< _Tp >::sum_type cv::v_reduce_sum ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Sum of packed values.

Scheme:

{A1 A2 A3 ...} => sum{A1,A2,A3,...}

◆ v_reduce_sum4()

template<int n>

v_reg< float, n > cv::v_reduce_sum4	(	const v_reg< float, n > &	a,
		const v_reg< float, n > &	b,
		const v_reg< float, n > &	c,
		const v_reg< float, n > &	d )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Sums all elements of each input vector and returns the vector of sums.

Scheme:

result[0] = a[0] + a[1] + a[2] + a[3]
result[1] = b[0] + b[1] + b[2] + b[3]
result[2] = c[0] + c[1] + c[2] + c[3]
result[3] = d[0] + d[1] + d[2] + d[3]
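
The same scheme as a scalar model, for readers who prefer code to lane diagrams. `Vec4` and `reduce_sum4_model` are hypothetical names; the summation order inside a real SIMD implementation may differ, which can matter for floating point rounding.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;

// Hypothetical scalar model: lane k of the result is the sum of all lanes of the k-th input.
inline Vec4 reduce_sum4_model(const Vec4& a, const Vec4& b, const Vec4& c, const Vec4& d) {
    auto sum = [](const Vec4& v) { return v[0] + v[1] + v[2] + v[3]; };
    return Vec4{{sum(a), sum(b), sum(c), sum(d)}};
}
```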

◆ v_reinterpret_as_f32()

template<typename _Tp0 , int n0>

v_reg< float, n0 *sizeof(_Tp0)/sizeof(float)> cv::v_reinterpret_as_f32 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reinterpret_as_f64()

template<typename _Tp0 , int n0>

v_reg< double, n0 *sizeof(_Tp0)/sizeof(double)> cv::v_reinterpret_as_f64 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reinterpret_as_s16()

template<typename _Tp0 , int n0>

v_reg< short, n0 *sizeof(_Tp0)/sizeof(short)> cv::v_reinterpret_as_s16 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reinterpret_as_s32()

template<typename _Tp0 , int n0>

v_reg< int, n0 *sizeof(_Tp0)/sizeof(int)> cv::v_reinterpret_as_s32 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reinterpret_as_s64()

template<typename _Tp0 , int n0>

v_reg< int64, n0 *sizeof(_Tp0)/sizeof(int64)> cv::v_reinterpret_as_s64 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reinterpret_as_s8()

template<typename _Tp0 , int n0>

v_reg< schar, n0 *sizeof(_Tp0)/sizeof(schar)> cv::v_reinterpret_as_s8 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reinterpret_as_u16()

template<typename _Tp0 , int n0>

v_reg< ushort, n0 *sizeof(_Tp0)/sizeof(ushort)> cv::v_reinterpret_as_u16 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reinterpret_as_u32()

template<typename _Tp0 , int n0>

v_reg< unsigned, n0 *sizeof(_Tp0)/sizeof(unsigned)> cv::v_reinterpret_as_u32 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reinterpret_as_u64()

template<typename _Tp0 , int n0>

v_reg< uint64, n0 *sizeof(_Tp0)/sizeof(uint64)> cv::v_reinterpret_as_u64 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reinterpret_as_u8()

template<typename _Tp0 , int n0>

v_reg< uchar, n0 *sizeof(_Tp0)/sizeof(uchar)> cv::v_reinterpret_as_u8 ( const v_reg< _Tp0, n0 > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_reverse()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_reverse ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Vector reverse order.

Reverses the order of the vector:

REG {A1 ... An} ==> REG {An ... A1}

For all types.

◆ v_rotate_left() [1/2]

template<int imm, typename _Tp , int n>

v_reg< _Tp, n > cv::v_rotate_left ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Shift elements left within the vector.

For all types.

◆ v_rotate_left() [2/2]

template<int imm, typename _Tp , int n>

v_reg< _Tp, n > cv::v_rotate_left	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rotate_right() [1/2]

template<int imm, typename _Tp , int n>

v_reg< _Tp, n > cv::v_rotate_right ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Shift elements right within the vector.

For all types.

◆ v_rotate_right() [2/2]

template<int imm, typename _Tp , int n>

v_reg< _Tp, n > cv::v_rotate_right	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_round() [1/3]

template<int n>

v_reg< int, n *2 > cv::v_round ( const v_reg< double, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ v_round() [2/3]

template<int n>

v_reg< int, n *2 > cv::v_round	(	const v_reg< double, n > &	a,
		const v_reg< double, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ v_round() [3/3]

template<int n>

v_reg< int, n > cv::v_round ( const v_reg< float, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Round elements.

Rounds each value. Input type is a float vector ==> output type is an int vector.

Note: Only for floating point types.

◆ v_rshr() [1/6]

template<int shift, int n>

v_reg< int, n > cv::v_rshr ( const v_reg< int, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr() [2/6]

template<int shift, int n>

v_reg< int64, n > cv::v_rshr ( const v_reg< int64, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr() [3/6]

template<int shift, int n>

v_reg< short, n > cv::v_rshr ( const v_reg< short, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr() [4/6]

template<int shift, int n>

v_reg< uint64, n > cv::v_rshr ( const v_reg< uint64, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr() [5/6]

template<int shift, int n>

v_reg< unsigned, n > cv::v_rshr ( const v_reg< unsigned, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr() [6/6]

template<int shift, int n>

v_reg< ushort, n > cv::v_rshr ( const v_reg< ushort, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack() [1/6]

template<int shift, int n>

v_reg< short, 2 *n > cv::v_rshr_pack	(	const v_reg< int, n > &	a,
		const v_reg< int, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack() [2/6]

template<int shift, int n>

v_reg< int, 2 *n > cv::v_rshr_pack	(	const v_reg< int64, n > &	a,
		const v_reg< int64, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack() [3/6]

template<int shift, int n>

v_reg< schar, 2 *n > cv::v_rshr_pack	(	const v_reg< short, n > &	a,
		const v_reg< short, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack() [4/6]

template<int shift, int n>

v_reg< unsigned, 2 *n > cv::v_rshr_pack	(	const v_reg< uint64, n > &	a,
		const v_reg< uint64, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack() [5/6]

template<int shift, int n>

v_reg< ushort, 2 *n > cv::v_rshr_pack	(	const v_reg< unsigned, n > &	a,
		const v_reg< unsigned, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack() [6/6]

template<int shift, int n>

v_reg< uchar, 2 *n > cv::v_rshr_pack	(	const v_reg< ushort, n > &	a,
		const v_reg< ushort, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_store() [1/6]

template<int shift, int n>

void cv::v_rshr_pack_store	(	int *	ptr,
		const v_reg< int64, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_store() [2/6]

template<int shift, int n>

void cv::v_rshr_pack_store	(	schar *	ptr,
		const v_reg< short, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_store() [3/6]

template<int shift, int n>

void cv::v_rshr_pack_store	(	short *	ptr,
		const v_reg< int, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_store() [4/6]

template<int shift, int n>

void cv::v_rshr_pack_store	(	uchar *	ptr,
		const v_reg< ushort, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_store() [5/6]

template<int shift, int n>

void cv::v_rshr_pack_store	(	unsigned *	ptr,
		const v_reg< uint64, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_store() [6/6]

template<int shift, int n>

void cv::v_rshr_pack_store	(	ushort *	ptr,
		const v_reg< unsigned, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_u() [1/2]

template<int shift, int n>

v_reg< ushort, 2 *n > cv::v_rshr_pack_u	(	const v_reg< int, n > &	a,
		const v_reg< int, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_u() [2/2]

template<int shift, int n>

v_reg< uchar, 2 *n > cv::v_rshr_pack_u	(	const v_reg< short, n > &	a,
		const v_reg< short, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_u_store() [1/2]

template<int shift, int n>

void cv::v_rshr_pack_u_store	(	uchar *	ptr,
		const v_reg< short, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_rshr_pack_u_store() [2/2]

template<int shift, int n>

void cv::v_rshr_pack_u_store	(	ushort *	ptr,
		const v_reg< int, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_scan_forward()

template<typename _Tp , int n>

int cv::v_scan_forward ( const v_reg< _Tp, n > & a )

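inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Get the index of the first negative lane.

The returned value is the index of the first negative lane (undefined for an input with all positive values). Example:

v_int32x4 r; // set to {0, 0, -1, -1}
int idx = v_scan_forward(r); // idx = 2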
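
A scalar model of the lane scan is sketched below. `scan_forward_model` is a hypothetical helper; returning -1 when no lane is negative is a choice made for this sketch only, since the documented behaviour for that input is undefined.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model: index of the first lane whose value is negative,
// or -1 if there is none (the documented intrinsic leaves that case undefined).
template <std::size_t N>
int scan_forward_model(const std::array<std::int32_t, N>& a) {
    for (std::size_t i = 0; i < N; ++i) {
        if (a[i] < 0) return static_cast<int>(i);
    }
    return -1;
}
```

Because comparison results are all-ones lanes, which read as -1 in a signed type, a scan like this can locate the first lane where a comparison succeeded.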

◆ v_select()

template<typename _Tp , int n>

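v_reg< _Tp, n > cv::v_select	(	const v_reg< _Tp, n > &	mask,
		const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Per-element select (blend operation).

The return value is built by combining the values of a and b using the following scheme: result[i] = mask[i] ? a[i] : b[i];

Note: mask element values are restricted to these values:

- 0: select the element from b
- 0xff/0xffff/etc.: select the element from a (fully compatible with bitwise-based operators)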
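
One plausible reading of the mask restriction is that the blend can be carried out with plain bitwise operations. The model below is a hypothetical scalar sketch on 32-bit lanes (`select_model` is an illustrative name), not the library implementation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model on 32-bit lanes, written as the bitwise blend
// (mask & a) | (~mask & b); it only matches "mask ? a : b" when every
// mask lane is all zeros or all ones.
template <std::size_t N>
std::array<std::uint32_t, N> select_model(const std::array<std::uint32_t, N>& mask,
                                          const std::array<std::uint32_t, N>& a,
                                          const std::array<std::uint32_t, N>& b) {
    std::array<std::uint32_t, N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        r[i] = (mask[i] & a[i]) | (~mask[i] & b[i]);
    }
    return r;
}
```

A mask lane such as 1 would mix bits of a and b in this model, which illustrates why masks are normally taken from comparison results.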

◆ v_setall_() [1/10]

template<>

v_float64x2 cv::v_setall_ ( double val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_() [2/10]

template<>

v_float32x4 cv::v_setall_ ( float val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_() [3/10]

template<>

v_int32x4 cv::v_setall_ ( int val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_() [4/10]

template<>

v_int64x2 cv::v_setall_ ( int64 val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_() [5/10]

template<>

v_int8x16 cv::v_setall_ ( schar val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_() [6/10]

template<>

v_int16x8 cv::v_setall_ ( short val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_() [7/10]

template<>

v_uint8x16 cv::v_setall_ ( uchar val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_() [8/10]

template<>

v_uint64x2 cv::v_setall_ ( uint64 val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_() [9/10]

template<>

v_uint32x4 cv::v_setall_ ( unsigned val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_() [10/10]

template<>

v_uint16x8 cv::v_setall_ ( ushort val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_f32()

v_float32x4 cv::v_setall_f32 ( float val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_f64()

v_float64x2 cv::v_setall_f64 ( double val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_s16()

v_int16x8 cv::v_setall_s16 ( short val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_s32()

v_int32x4 cv::v_setall_s32 ( int val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_s64()

v_int64x2 cv::v_setall_s64 ( int64 val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_s8()

v_int8x16 cv::v_setall_s8 ( schar val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_u16()

v_uint16x8 cv::v_setall_u16 ( ushort val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_u32()

v_uint32x4 cv::v_setall_u32 ( unsigned val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_u64()

v_uint64x2 cv::v_setall_u64 ( uint64 val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setall_u8()

v_uint8x16 cv::v_setall_u8 ( uchar val )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_()

template<>

v_uint8x16 cv::v_setzero_ ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_f32()

v_float32x4 cv::v_setzero_f32 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_f64()

v_float64x2 cv::v_setzero_f64 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_s16()

v_int16x8 cv::v_setzero_s16 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_s32()

v_int32x4 cv::v_setzero_s32 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_s64()

v_int64x2 cv::v_setzero_s64 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_s8()

v_int8x16 cv::v_setzero_s8 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_u16()

v_uint16x8 cv::v_setzero_u16 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_u32()

v_uint32x4 cv::v_setzero_u32 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_u64()

v_uint64x2 cv::v_setzero_u64 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_setzero_u8()

v_uint8x16 cv::v_setzero_u8 ( )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shl() [1/7]

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_shl	(	const v_reg< _Tp, n > &	a,
		int	imm )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Bitwise shift left.

For 16-, 32- and 64-bit integer values.
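
The overloads listed for v_shl come in two shapes: one takes the shift amount as a runtime argument (`imm`), the others take it as a template parameter (`shift`). The sketch below models both on 16-bit unsigned lanes. The helper names are hypothetical and the code is a scalar illustration, not the library implementation; bits shifted past the lane width are discarded.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the runtime-shift form: each lane is shifted
// independently and truncated back to 16 bits.
template <std::size_t N>
std::array<std::uint16_t, N> shl_runtime_model(const std::array<std::uint16_t, N>& a, int imm) {
    std::array<std::uint16_t, N> r{};
    for (std::size_t i = 0; i < N; ++i) {
        // Widen to 32 bits so the shift itself cannot overflow, then drop the high bits.
        r[i] = static_cast<std::uint16_t>(static_cast<std::uint32_t>(a[i]) << imm);
    }
    return r;
}

// Hypothetical model of the compile-time form: the shift amount is a template argument.
template <int shift, std::size_t N>
std::array<std::uint16_t, N> shl_const_model(const std::array<std::uint16_t, N>& a) {
    static_assert(shift >= 0 && shift < 16, "shift must fit inside a 16-bit lane");
    return shl_runtime_model(a, shift);
}

inline void shl_example() {
    const std::array<std::uint16_t, 4> a = {1, 2, 0x8001, 0xFFFF};
    const std::array<std::uint16_t, 4> expected = {2, 4, 0x0002, 0xFFFE};
    assert(shl_runtime_model(a, 1) == expected);
    assert(shl_const_model<1>(a) == expected);
}
```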

◆ v_shl() [2/7]

template<int shift, int n>

v_reg< int, n > cv::v_shl ( const v_reg< int, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shl() [3/7]

template<int shift, int n>

v_reg< int64, n > cv::v_shl ( const v_reg< int64, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shl() [4/7]

template<int shift, int n>

v_reg< short, n > cv::v_shl ( const v_reg< short, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shl() [5/7]

template<int shift, int n>

v_reg< uint64, n > cv::v_shl ( const v_reg< uint64, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shl() [6/7]

template<int shift, int n>

v_reg< unsigned, n > cv::v_shl ( const v_reg< unsigned, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shl() [7/7]

template<int shift, int n>

v_reg< ushort, n > cv::v_shl ( const v_reg< ushort, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shr() [1/7]

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_shr	(	const v_reg< _Tp, n > &	a,
		int	imm )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Bitwise shift right.

For 16-, 32- and 64-bit integer values.
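
Right shifts behave differently for unsigned and signed lanes. The sketch below assumes the usual SIMD convention (not stated in the text above) that unsigned lanes shift logically and signed lanes shift arithmetically. Both helpers are hypothetical scalar models; the signed one is written so that it does not depend on how a particular compiler treats `>>` on negative values.

```cpp
#include <cassert>
#include <cstdint>

// Logical shift for an unsigned lane: zeros enter from the left.
inline std::uint16_t shr_unsigned_model(std::uint16_t x, int imm) {
    return static_cast<std::uint16_t>(x >> imm);
}

// Arithmetic shift for a signed lane: the sign bit is replicated.
// For negative v, ~v is non-negative, so the inner shift is well defined.
inline std::int16_t shr_signed_model(std::int16_t x, int imm) {
    const int v = x;
    return static_cast<std::int16_t>(v < 0 ? ~(~v >> imm) : (v >> imm));
}

inline void shr_example() {
    assert(shr_unsigned_model(0x8000, 15) == 1);
    assert(shr_signed_model(-8, 1) == -4);
    assert(shr_signed_model(-1, 3) == -1);
    assert(shr_signed_model(9, 1) == 4);
}
```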

◆ v_shr() [2/7]

template<int shift, int n>

v_reg< int, n > cv::v_shr ( const v_reg< int, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shr() [3/7]

template<int shift, int n>

v_reg< int64, n > cv::v_shr ( const v_reg< int64, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shr() [4/7]

template<int shift, int n>

v_reg< short, n > cv::v_shr ( const v_reg< short, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shr() [5/7]

template<int shift, int n>

v_reg< uint64, n > cv::v_shr ( const v_reg< uint64, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shr() [6/7]

template<int shift, int n>

v_reg< unsigned, n > cv::v_shr ( const v_reg< unsigned, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_shr() [7/7]

template<int shift, int n>

v_reg< ushort, n > cv::v_shr ( const v_reg< ushort, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_signmask()

template<typename _Tp , int n>

int cv::v_signmask ( const v_reg< _Tp, n > & a )

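inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Get negative values mask.

Deprecated: v_signmask depends heavily on the lane count and is therefore not universal enough.

The return value is a bit mask in which the bits at the indexes of negative packed values are set to 1. Example:

v_int32x4 r; // set to {-1, -1, 1, 1}

int mask = v_signmask(r); // mask = 3 <== 00000000 00000000 00000000 00000011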
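
A scalar sketch of the mask construction shows why the result depends on the lane count: bit i of the result corresponds to lane i, so wider registers produce wider masks. `signmask_model` is a hypothetical helper, not an OpenCV function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model: bit i of the result is set when lane i is negative.
template <std::size_t N>
int signmask_model(const std::array<std::int32_t, N>& lanes) {
    static_assert(N <= 31, "the mask must fit into a non-negative int");
    int mask = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (lanes[i] < 0) mask |= (1 << i);
    }
    return mask;
}

// Mirrors the example above: {-1, -1, 1, 1} gives binary 0011.
inline void signmask_example() {
    const std::array<std::int32_t, 4> r = {-1, -1, 1, 1};
    assert(signmask_model(r) == 3);
}
```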

◆ v_sin()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_sin ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Compute the sine \( \sin(x) \) of the elements.

Only for floating point types. The core implementation is the same as v_sincos.

◆ v_sincos()

template<typename _Tp , int n>

void cv::v_sincos	(	const v_reg< _Tp, n > &	x,
		v_reg< _Tp, n > &	s,
		v_reg< _Tp, n > &	c )

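inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Compute the sine \( \sin(x) \) and the cosine \( \cos(x) \) of the elements at the same time.

Only for floating point types. Core implementation steps:

Input normalization: scale the periodicity from 2π to 4 and reduce the angle to the range \( [0, \frac{\pi}{4}] \) using periodicity and trigonometric identities.
Polynomial approximation of \( \sin(x) \) and \( \cos(x) \):
- For float16 and float32, use a 4-term Taylor series for sine and a 5-term Taylor series for cosine.
- For float64, use a 7-term Taylor series for sine and an 8-term Taylor series for cosine.
Select results: select and convert the final sine and cosine values for the original input angle.

Note: The precision of the calculation depends on the implementation and on the data type of the input vector.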
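
The polynomial step can be sketched in scalar code. The example below covers only the approximation on the already reduced range \( [0, \frac{\pi}{4}] \); the range reduction and the final selection step are omitted. It assumes plain Taylor coefficients (the library may use tuned ones), and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// 4-term Taylor series for sine: x - x^3/3! + x^5/5! - x^7/7!,
// evaluated in Horner form on x^2. Intended for x in [0, pi/4].
inline double sin_taylor4(double x) {
    const double x2 = x * x;
    return x * (1.0 + x2 * (-1.0 / 6.0 + x2 * (1.0 / 120.0 + x2 * (-1.0 / 5040.0))));
}

// 5-term Taylor series for cosine: 1 - x^2/2! + x^4/4! - x^6/6! + x^8/8!.
// Intended for x in [0, pi/4].
inline double cos_taylor5(double x) {
    const double x2 = x * x;
    return 1.0 + x2 * (-0.5 + x2 * (1.0 / 24.0 + x2 * (-1.0 / 720.0 + x2 * (1.0 / 40320.0))));
}

// On the reduced range the truncation error stays below the first omitted term.
inline void sincos_taylor_example() {
    const double x = 0.5;
    assert(std::fabs(sin_taylor4(x) - std::sin(x)) < 1e-6);
    assert(std::fabs(cos_taylor5(x) - std::cos(x)) < 1e-7);
}
```

Keeping the argument inside \( [0, \frac{\pi}{4}] \) is what makes so few terms sufficient: the first omitted sine term at \( x = \frac{\pi}{4} \) is \( x^9/9! \), roughly \( 3 \times 10^{-7} \).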

◆ v_sqr_magnitude()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_sqr_magnitude	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Square of the magnitude.

Returns \( a^2 + b^2 \). Only for floating point types.

◆ v_sqrt()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_sqrt ( const v_reg< _Tp, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Compute the square root of the elements.

Only for floating point types.

◆ v_store() [1/2]

template<typename _Tp , int n>

void cv::v_store	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Store data to memory.

Store the register contents to memory. Scheme:

REG {A B C D} ==> MEM {A B C D}

The pointer may be unaligned.

◆ v_store() [2/2]

template<typename _Tp , int n>

void cv::v_store	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a,
		hal::StoreMode	)

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_store_aligned() [1/2]

template<typename _Tp , int n>

void cv::v_store_aligned	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Store data to memory (aligned).

Store the register contents to memory. Scheme:

REG {A B C D} ==> MEM {A B C D}

The pointer **should** be aligned to a 16-byte boundary.
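
A caller may want to verify the alignment requirement before choosing between the aligned and unaligned store. The following is a minimal sketch using only the standard library; `is_aligned_16` is a hypothetical helper and is not part of OpenCV.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when the address is a multiple of 16 bytes.
inline bool is_aligned_16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

inline void alignment_example() {
    // alignas guarantees that the buffer starts on a 16-byte boundary.
    alignas(16) float buf[8] = {};
    assert(is_aligned_16(buf));
    // One float (4 bytes) further on, the address is no longer 16-byte aligned.
    assert(!is_aligned_16(buf + 1));
}
```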

◆ v_store_aligned() [2/2]

template<typename _Tp , int n>

void cv::v_store_aligned	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a,
		hal::StoreMode	)

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_store_aligned_nocache()

template<typename _Tp , int n>

void cv::v_store_aligned_nocache	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

◆ v_store_high()

template<typename _Tp , int n>

void cv::v_store_high	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Store data to memory (higher half).

Store the higher half of the register contents to memory. Scheme:

REG {A B C D} ==> MEM {C D}

◆ v_store_interleave() [1/3]

template<typename _Tp , int n>

void cv::v_store_interleave	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		const v_reg< _Tp, n > &	c,
		const v_reg< _Tp, n > &	d,
		hal::StoreMode	= hal::STORE_UNALIGNED )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Interleave and store (4 channels).

Interleave and store data from 4 registers to memory. Scheme:

{A1 A2 ...}, {B1 B2 ...}, {C1 C2 ...}, {D1 D2 ...} ==> {A1 B1 C1 D1 A2 B2 C2 D2 ...}

For all types except 64-bit.
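
The memory layout produced by the 4-channel scheme can be reproduced with a scalar loop. The sketch below is a hypothetical model on 8-bit lanes, not the library implementation; the 3-channel and 2-channel variants follow the same pattern with a stride of 3 or 2.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model: lane i of each of the four registers lands at
// memory positions 4*i, 4*i+1, 4*i+2 and 4*i+3.
template <std::size_t N>
std::array<std::uint8_t, 4 * N> interleave4_model(const std::array<std::uint8_t, N>& a,
                                                  const std::array<std::uint8_t, N>& b,
                                                  const std::array<std::uint8_t, N>& c,
                                                  const std::array<std::uint8_t, N>& d) {
    std::array<std::uint8_t, 4 * N> mem{};
    for (std::size_t i = 0; i < N; ++i) {
        mem[4 * i + 0] = a[i];
        mem[4 * i + 1] = b[i];
        mem[4 * i + 2] = c[i];
        mem[4 * i + 3] = d[i];
    }
    return mem;
}

inline void interleave4_example() {
    const std::array<std::uint8_t, 2> a = {1, 2};
    const std::array<std::uint8_t, 2> b = {10, 20};
    const std::array<std::uint8_t, 2> c = {100, 101};
    const std::array<std::uint8_t, 2> d = {200, 201};
    const std::array<std::uint8_t, 8> expected = {1, 10, 100, 200, 2, 20, 101, 201};
    assert(interleave4_model(a, b, c, d) == expected);
}
```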

◆ v_store_interleave() [2/3]

template<typename _Tp , int n>

void cv::v_store_interleave	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		const v_reg< _Tp, n > &	c,
		hal::StoreMode	= hal::STORE_UNALIGNED )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Interleave and store (3 channels).

Interleave and store data from 3 registers to memory. Scheme:

{A1 A2 ...}, {B1 B2 ...}, {C1 C2 ...} ==> {A1 B1 C1 A2 B2 C2 ...}

For all types except 64-bit.

◆ v_store_interleave() [3/3]

template<typename _Tp , int n>

void cv::v_store_interleave	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b,
		hal::StoreMode	= hal::STORE_UNALIGNED )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Interleave and store (2 channels).

Interleave and store data from 2 registers to memory. Scheme:

{A1 A2 ...}, {B1 B2 ...} ==> {A1 B1 A2 B2 ...}

For all types except 64-bit.

◆ v_store_low()

template<typename _Tp , int n>

void cv::v_store_low	(	_Tp *	ptr,
		const v_reg< _Tp, n > &	a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Store data to memory (lower half).

Store the lower half of the register contents to memory. Scheme:

REG {A B C D} ==> MEM {A B}
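
The lower-half scheme here and the higher-half scheme of v_store_high can be modeled together in scalar code. The helpers below are hypothetical and only illustrate which lanes reach memory; they are not the library implementation.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model: copy lanes [0, N/2) to memory.
template <typename T, std::size_t N>
void store_low_model(T* ptr, const std::array<T, N>& reg) {
    std::copy(reg.begin(), reg.begin() + N / 2, ptr);
}

// Hypothetical scalar model: copy lanes [N/2, N) to memory.
template <typename T, std::size_t N>
void store_high_model(T* ptr, const std::array<T, N>& reg) {
    std::copy(reg.begin() + N / 2, reg.end(), ptr);
}

inline void store_halves_example() {
    const std::array<int, 4> reg = {1, 2, 3, 4};
    int low[2] = {0, 0};
    int high[2] = {0, 0};
    store_low_model(low, reg);
    store_high_model(high, reg);
    assert(low[0] == 1 && low[1] == 2);
    assert(high[0] == 3 && high[1] == 4);
}
```

Note that the destination only needs room for half of the lanes.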

◆ v_sub()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_sub	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

#include <opencv2/core/hal/intrin_cpp.hpp>

Subtract values.

For all types.

◆ v_sub_wrap()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_sub_wrap	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Subtract values without saturation.

For 8- and 16-bit integer values.
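
The difference between wrapping and saturating subtraction is easiest to see on a single 8-bit unsigned lane. The sketch below uses hypothetical scalar helpers. It assumes that the regular v_sub saturates for 8- and 16-bit integer lanes, which is why a separate non-saturating variant exists.

```cpp
#include <cassert>
#include <cstdint>

// Wrapping subtraction: the result is reduced modulo 256.
inline std::uint8_t sub_wrap_model(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a - b);
}

// Saturating subtraction, shown for contrast: the result is clamped at 0.
inline std::uint8_t sub_saturate_model(std::uint8_t a, std::uint8_t b) {
    return a > b ? static_cast<std::uint8_t>(a - b) : std::uint8_t{0};
}

inline void sub_wrap_example() {
    assert(sub_wrap_model(10, 20) == 246);    // 10 - 20 = -10, and -10 mod 256 = 246
    assert(sub_saturate_model(10, 20) == 0);  // clamped instead of wrapped
    assert(sub_wrap_model(20, 10) == 10);     // no underflow: both forms agree
    assert(sub_saturate_model(20, 10) == 10);
}
```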

◆ v_transpose4x4()

template<typename _Tp , int n>

void cv::v_transpose4x4	(	v_reg< _Tp, n > &	a0,
		const v_reg< _Tp, n > &	a1,
		const v_reg< _Tp, n > &	a2,
		const v_reg< _Tp, n > &	a3,
		v_reg< _Tp, n > &	b0,
		v_reg< _Tp, n > &	b1,
		v_reg< _Tp, n > &	b2,
		v_reg< _Tp, n > &	b3 )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Transpose a 4x4 matrix.

Scheme:

a0 {A1 A2 A3 A4}
a1 {B1 B2 B3 B4}
a2 {C1 C2 C3 C4}
a3 {D1 D2 D3 D4}
===============
b0 {A1 B1 C1 D1}
b1 {A2 B2 C2 D2}
b2 {A3 B3 C3 D3}
b3 {A4 B4 C4 D4}
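
The scheme above is an ordinary matrix transpose in which each register holds one row. A minimal scalar sketch follows; the type aliases and the helper name are hypothetical, and the code is not the library implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Row4 = std::array<float, 4>;
using Mat4 = std::array<Row4, 4>;

// Hypothetical scalar model: output row j collects lane j of every input row.
inline Mat4 transpose4x4_model(const Mat4& a) {
    Mat4 b{};
    for (std::size_t i = 0; i < 4; ++i) {
        for (std::size_t j = 0; j < 4; ++j) {
            b[j][i] = a[i][j];
        }
    }
    return b;
}

inline void transpose4x4_example() {
    const Mat4 a = {{{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}, {13, 14, 15, 16}}};
    const Mat4 b = transpose4x4_model(a);
    const Row4 first = {1, 5, 9, 13};   // b0 = {A1 B1 C1 D1}
    const Row4 last = {4, 8, 12, 16};   // b3 = {A4 B4 C4 D4}
    assert(b[0] == first);
    assert(b[3] == last);
}
```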

◆ v_trunc() [1/2]

template<int n>

v_reg< int, n *2 > cv::v_trunc ( const v_reg< double, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

This is an overloaded function, provided for convenience. It differs from the other overload only in the argument it accepts.

◆ v_trunc() [2/2]

template<int n>

v_reg< int, n > cv::v_trunc ( const v_reg< float, n > & a )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Truncate elements.

Truncate each value. Input type is a float vector ==> output type is an int vector.

Note: Only for floating point types.
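
Truncation discards the fractional part, so values move toward zero rather than toward negative infinity. The scalar sketch below uses a hypothetical helper to illustrate this for the float overload. Inputs outside the range of int are not handled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model: convert each float lane to int by dropping the
// fractional part (rounding toward zero).
template <std::size_t N>
std::array<int, N> trunc_model(const std::array<float, N>& a) {
    std::array<int, N> r{};
    for (std::size_t i = 0; i < N; ++i) r[i] = static_cast<int>(a[i]);
    return r;
}

inline void trunc_example() {
    const std::array<float, 4> a = {1.7f, -1.7f, 0.5f, -0.5f};
    const std::array<int, 4> expected = {1, -1, 0, 0};
    assert(trunc_model(a) == expected);
}
```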

◆ v_xor()

template<typename _Tp , int n>

v_reg< _Tp, n > cv::v_xor	(	const v_reg< _Tp, n > &	a,
		const v_reg< _Tp, n > &	b )

#include <opencv2/core/hal/intrin_cpp.hpp>

Bitwise XOR.

Only for integer types.

◆ v_zip()

template<typename _Tp , int n>

void cv::v_zip	(	const v_reg< _Tp, n > &	a0,
		const v_reg< _Tp, n > &	a1,
		v_reg< _Tp, n > &	b0,
		v_reg< _Tp, n > &	b1 )

inline

#include <opencv2/core/hal/intrin_cpp.hpp>

Interleave two vectors.

Scheme:

{A1 A2 A3 A4}
{B1 B2 B3 B4}
---------------
{A1 B1 A2 B2} and {A3 B3 A4 B4}

For all types except 64-bit.
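
The interleaving pattern can be checked with a scalar loop: the first output takes the lower halves of both inputs, the second output takes the upper halves. `zip_model` is a hypothetical helper, not the library implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model: b0 interleaves the lower halves of a0 and a1,
// b1 interleaves their upper halves.
template <typename T, std::size_t N>
void zip_model(const std::array<T, N>& a0, const std::array<T, N>& a1,
               std::array<T, N>& b0, std::array<T, N>& b1) {
    static_assert(N % 2 == 0, "lane count must be even");
    const std::size_t half = N / 2;
    for (std::size_t i = 0; i < half; ++i) {
        b0[2 * i] = a0[i];
        b0[2 * i + 1] = a1[i];
        b1[2 * i] = a0[half + i];
        b1[2 * i + 1] = a1[half + i];
    }
}

inline void zip_example() {
    const std::array<int, 4> a0 = {1, 2, 3, 4};
    const std::array<int, 4> a1 = {10, 20, 30, 40};
    std::array<int, 4> b0{};
    std::array<int, 4> b1{};
    zip_model(a0, a1, b0, b1);
    const std::array<int, 4> expected0 = {1, 10, 2, 20};
    const std::array<int, 4> expected1 = {3, 30, 4, 40};
    assert(b0 == expected0);
    assert(b1 == expected1);
}
```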

Variable Documentation

◆ popCountTable

const unsigned char cv::popCountTable[]

static

#include <opencv2/core/hal/intrin_cpp.hpp>

Initial value:

=
{
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
    4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
}
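
Entry i of the table is the number of set bits in the byte value i. The sketch below regenerates the 256 entries with a simple recurrence and uses them to count the bits of a 32-bit word one byte at a time. Both helper names are hypothetical, and how the library itself consumes the table is not shown here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: rebuild the table using popcount(i) = popcount(i >> 1) + (i & 1).
inline std::array<unsigned char, 256> make_popcount_table() {
    std::array<unsigned char, 256> t{};
    for (int i = 1; i < 256; ++i) {
        t[i] = static_cast<unsigned char>(t[i >> 1] + (i & 1));
    }
    return t;
}

// Hypothetical helper: count the set bits of a 32-bit value with four table lookups.
inline int popcount32_model(std::uint32_t v, const std::array<unsigned char, 256>& t) {
    return t[v & 0xFF] + t[(v >> 8) & 0xFF] + t[(v >> 16) & 0xFF] + t[(v >> 24) & 0xFF];
}

inline void popcount_example() {
    const std::array<unsigned char, 256> t = make_popcount_table();
    assert(t[0] == 0);     // first entry of the table above
    assert(t[15] == 4);    // last entry of the first row
    assert(t[255] == 8);   // last entry of the table
    assert(popcount32_model(0xF0F0F0F0u, t) == 16);
}
```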
