BytesFunc ¶

Authors:: Michael Griffin
Version:: 3.4.4 for 2025-01-14
Copyright:: 2014 - 2025
License:: This document may be distributed under the Apache License V2.0.
Language:: Python 3.6 or later

Introduction ¶

The BytesFunc module provides high speed array processing functions for use with Python ‘bytes’ and ‘bytearray’ objects. These functions are patterned after the functions in the standard Python “operator” module together with some additional ones from other sources.

The purpose of these functions is to perform mathematical calculations on “bytes” and “bytearray” objects significantly faster than using native Python.

Function Summary ¶

The compare operators used for ‘ball’, ‘bany’, and ‘findindex’ are examples only, and other compare operations are available. Many functions will accept other parameter combinations of sequences and numeric parameters. See the details for each function for what parameter combinations are valid.

Brief Description ¶

Function	Description
and_	Perform a bitwise AND across the sequence.
ball	True if all elements of the sequence meet the match criteria.
bany	True if any elements of the sequence meet the match criteria.
bmax	Return the maximum value in the sequence.
bmin	Return the minimum value in the sequence.
bsum	Return the sum of the sequence.
eq	True if all elements of the sequence equal the compare value.
findindex	Return the index of the first value in the sequence to meet the specified criteria.
ge	True if all elements of the sequence are greater than or equal to the compare value.
gt	True if all elements of the sequence are greater than the compare value.
invert	Perform a bitwise invert across the sequence.
le	True if all elements of the sequence are less than or equal to the compare value.
lshift	Perform a bitwise left shift across the sequence.
lt	True if all elements of the sequence are less than the compare value.
ne	True if all elements of the sequence are not equal to the compare value.
or_	Perform a bitwise OR across the sequence.
rshift	Perform a bitwise right shift across the sequence.
xor	Perform a bitwise XOR across the sequence.

Python Equivalent ¶

Function	Equivalent to
and_	[x & param for x in sequence1]
ball	all([(x > param) for x in array])
bany	any([(x > param) for x in array])
bmax	max(sequence)
bmin	min(sequence)
bsum	sum(sequence)
eq	all([x == param for x in sequence])
findindex	[x for x,y in enumerate(array) if y > param][0]
ge	all([x >= param for x in sequence])
gt	all([x > param for x in sequence])
invert	[~x for x in sequence1]
le	all([x <= param for x in sequence])
lshift	[x << param for x in sequence1]
lt	all([x < param for x in sequence])
ne	all([x != param for x in sequence])
or_	[x | param for x in sequence1]
rshift	[x >> param for x in sequence1]
xor	[x ^ param for x in sequence1]

Description ¶

Parameters ¶

Parameter Formats ¶

Parameters come in several forms.

Sequences. Sequences are either “bytes” or “bytearray” objects. Bytes sequences are immutable and must not be used for output destinations. Bytearray sequences are mutable, and may be used for inputs or outputs.
Numeric parameters. Numeric input parameters are individual integers and must be in the range of 0 to 255.
Comparison operators. Comparison operators are unicode strings in the form used by Python for compare operations. These must be quoted strings, and not bare Python symbols. See the section below for a list of these.
Sequence length control. Sequence length control allows only part of a sequence to be used as an input. See the section below for details.
Overflow detection disable. Overflow detection control is used to disable integer overflow checking. See the section below for details.

Example:

import bytesfunc

sequence = bytes([1, 2, 5, 99, 8])
# Find the maximum value and return it. The answer should be 99.
result = bytesfunc.bmax(sequence)

Example:

sequence1 = bytes([1, 2, 5, 99, 8])
sequence2 = bytearray([0, 0, 0, 0, 0])
# Xor each element in sequence1 with '7', and write the output to
# sequence2. Sequence2 should be bytearray(b'\x06\x05\x02d\x0f').
bytesfunc.xor(sequence1, 7, sequence2)

Example:

sequence1 = bytes([1, 2, 5, 99, 8, 101])
# Find the first index of sequence1 which is greater than or equal to 99.
# The answer should be 3.
result = bytesfunc.findindex('>=', sequence1, 99)

Function Documentation Details ¶

and_¶

Calculate and_ over the values in a bytes or bytearray object.

Equivalent to:	[x & param for x in sequence1]
or	[param & x for x in sequence1]
or	[x & y for x,y in zip(sequence1, sequence2)]

Call formats:

and_(sequence1, param)
and_(sequence1, param, outpsequence)
and_(param, sequence1)
and_(param, sequence1, outpsequence)
and_(sequence1, sequence2)
and_(sequence1, sequence2, outpsequence)
and_(sequence1, param, maxlen=y)
and_(sequence1, param, nosimd=False)

sequence1 - The first input data bytes or bytearray sequence to be examined. If no output sequence is provided the results will overwrite the input data.
param - A non-sequence numeric parameter.
sequence2 - A second input data sequence. Each element in this sequence is applied to the corresponding element in the first sequence.
outpsequence - The output sequence. This parameter is optional.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled. This parameter is optional. The default is False.
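
The call formats above can be modelled in plain Python. The following is a minimal sketch of the documented semantics only, not the library's implementation. The helper name `and_reference` and its `out` keyword are hypothetical, and the sketch assumes that when two sequences are given they have the same length.

```python
def and_reference(first, second, out=None, maxlen=0):
    """Pure-Python model of the and_ call formats (hypothetical helper)."""
    first_is_seq = isinstance(first, (bytes, bytearray))
    second_is_seq = isinstance(second, (bytes, bytearray))
    if not (first_is_seq or second_is_seq):
        raise TypeError("at least one argument must be bytes or bytearray")
    # The sequence whose length drives the loop.
    data = first if first_is_seq else second
    # Without an output sequence the input data is overwritten, so the
    # destination must be a mutable bytearray in either case.
    target = data if out is None else out
    if not isinstance(target, bytearray):
        raise TypeError("the destination must be a bytearray")
    # maxlen is ignored when zero, negative, or longer than the data.
    length = maxlen if 0 < maxlen <= len(data) else len(data)
    for i in range(length):
        x = first[i] if first_is_seq else first
        y = second[i] if second_is_seq else second
        target[i] = x & y


# Sequence and parameter, written to a separate output bytearray.
source = bytes([0x0F, 0xF0, 0xFF, 0x3C])
dest = bytearray(len(source))
and_reference(source, 0x55, dest)
print(list(dest))   # [5, 80, 85, 20]

# Two sequences, in place, limited to the first two elements.
buf = bytearray([0xFF, 0xFF, 0xFF])
and_reference(buf, bytes([1, 2, 4]), maxlen=2)
print(list(buf))    # [1, 2, 255]
```

The same pattern applies to or_, xor, lshift and rshift by substituting the operator.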

ball ¶

Calculate ball over the values in a bytes or bytearray object.

Equivalent to:

all([(x > param) for x in array])

Call formats:

result = ball(opstr, sequence, param)
result = ball(opstr, sequence, param, maxlen=y)
result = ball(opstr, sequence, param, nosimd=False)

opstr - The arithmetic comparison operation as a string.
These are: ‘==’, ‘>’, ‘>=’, ‘<’, ‘<=’, ‘!=’.
sequence - An input bytes or bytearray to be examined.
param - A non-array numeric parameter.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - A boolean value corresponding to the result of all the comparison operations. If all comparison operations result in true, the return value will be true. If any of them result in false, the return value will be false.

bany ¶

Calculate bany over the values in a bytes or bytearray object.

Equivalent to:

any([(x > param) for x in array])

Call formats:

result = bany(opstr, sequence, param)
result = bany(opstr, sequence, param, maxlen=y)
result = bany(opstr, sequence, param, nosimd=False)

opstr - The arithmetic comparison operation as a string.
These are: ‘==’, ‘>’, ‘>=’, ‘<’, ‘<=’, ‘!=’.
sequence - An input bytes or bytearray to be examined.
param - A non-array numeric parameter.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - A boolean value corresponding to the result of all the comparison operations. If any comparison operations result in true, the return value will be true. If all of them result in false, the return value will be false.
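
Following the "Equivalent to" expressions for ball and bany, the two functions differ only in whether every element or at least one element must satisfy the comparison. The sketch below models that in plain Python. The names `ball_reference` and `bany_reference` are hypothetical stand-ins, and the operator table is an illustration of how the documented operator strings could be dispatched, not a description of the library's internals.

```python
import operator

# Map each documented operator string to a comparison function.
_COMPARE = {
    '==': operator.eq, '!=': operator.ne,
    '>': operator.gt, '>=': operator.ge,
    '<': operator.lt, '<=': operator.le,
}


def ball_reference(opstr, sequence, param):
    """True only if every element satisfies the comparison."""
    compare = _COMPARE[opstr]
    return all(compare(x, param) for x in sequence)


def bany_reference(opstr, sequence, param):
    """True if at least one element satisfies the comparison."""
    compare = _COMPARE[opstr]
    return any(compare(x, param) for x in sequence)


data = bytes([1, 2, 5, 99, 8])
print(ball_reference('>', data, 0))    # True: every element is above 0
print(ball_reference('>', data, 50))   # False: only 99 is above 50
print(bany_reference('>', data, 50))   # True: 99 is above 50
print(bany_reference('>', data, 100))  # False: nothing is above 100
```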

bmax ¶

Calculate bmax over the values in a bytes or bytearray object.

Equivalent to:

max(sequence)

Call formats:

result = bmax(sequence)
result = bmax(sequence, maxlen=y)
result = bmax(sequence, nosimd=False)

sequence - The input bytes or bytearray to be examined.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - The maximum of all the values in the sequence.

bmin ¶

Calculate bmin over the values in a bytes or bytearray object.

Equivalent to:

min(sequence)

Call formats:

result = bmin(sequence)
result = bmin(sequence, maxlen=y)
result = bmin(sequence, nosimd=False)

sequence - The input bytes or bytearray to be examined.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - The minimum of all the values in the sequence.

bsum ¶

Calculate the arithmetic sum of a bytes or bytearray sequence.

Equivalent to:

sum(sequence)

Call formats:

result = bsum(sequence)
result = bsum(sequence, maxlen=y)
result = bsum(sequence, matherrors=False)
result = bsum(sequence, nosimd=False)

sequence - An input bytes or bytearray to be examined.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
matherrors - If True, checks for numerical errors including integer overflow are ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - The sum of the sequence.

eq ¶

Calculate eq over the values in a bytes or bytearray object.

Equivalent to:	all([x == param for x in sequence])
or	all([param == x for x in sequence])
or	all([x == y for x,y in zip(sequence1, sequence2)])

Call formats:

result = eq(sequence1, param)
result = eq(param, sequence1)
result = eq(sequence1, sequence2)
result = eq(sequence1, param, maxlen=y)
result = eq(sequence1, param, nosimd=False)

sequence1 - An input bytes or bytearray to be examined.
sequence2 - An input bytes or bytearray to be examined.
param - An integer numeric input parameter in the range 0 - 255.
The first and second parameters are compared to each other. If one parameter is a sequence and the other is an integer, the integer is compared to each element in the sequence. If both parameters are sequences, each element of one sequence is compared to the corresponding element of the other sequence.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - A boolean value corresponding to the result of all the comparison operations. If all comparison operations result in true, the return value will be true. If any of them result in false, the return value will be false.

findindex ¶

Calculate findindex over the values in a bytes or bytearray object.

Equivalent to:

[x for x,y in enumerate(array) if y > param][0]

Call formats:

result = findindex(opstr, sequence, param)
result = findindex(opstr, sequence, param, maxlen=y)
result = findindex(opstr, sequence, param, nosimd=False)

opstr - The arithmetic comparison operation as a string.
These are: ‘==’, ‘>’, ‘>=’, ‘<’, ‘<=’, ‘!=’.
sequence - An input bytes or bytearray to be examined.
param - A non-array numeric parameter.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - The resulting index. This will be negative if no match was found.
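
The list-comprehension equivalent shown above raises IndexError when nothing matches, whereas findindex is documented to return a negative value instead. A minimal pure-Python sketch of that behaviour follows. The name `findindex_reference` is hypothetical, and the use of -1 specifically is an assumption, since the text only states that the result is negative.

```python
import operator

# Map each documented operator string to a comparison function.
_COMPARE = {
    '==': operator.eq, '!=': operator.ne,
    '>': operator.gt, '>=': operator.ge,
    '<': operator.lt, '<=': operator.le,
}


def findindex_reference(opstr, sequence, param):
    """Return the index of the first matching element, or a negative value."""
    compare = _COMPARE[opstr]
    for index, value in enumerate(sequence):
        if compare(value, param):
            return index
    return -1  # assumption: any negative value signals "no match"


data = bytes([1, 2, 5, 99, 8, 101])
print(findindex_reference('>=', data, 99))   # 3
print(findindex_reference('>', data, 200))   # -1
```

Callers should therefore test for `result < 0` rather than catching an exception.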

ge ¶

Calculate ge over the values in a bytes or bytearray object.

Equivalent to:	all([x >= param for x in sequence])
or	all([param >= x for x in sequence])
or	all([x >= y for x,y in zip(sequence1, sequence2)])

Call formats:

result = ge(sequence1, param)
result = ge(param, sequence1)
result = ge(sequence1, sequence2)
result = ge(sequence1, param, maxlen=y)
result = ge(sequence1, param, nosimd=False)

sequence1 - An input bytes or bytearray to be examined.
sequence2 - An input bytes or bytearray to be examined.
param - An integer numeric input parameter in the range 0 - 255.
The first and second parameters are compared to each other. If one parameter is a sequence and the other is an integer, the integer is compared to each element in the sequence. If both parameters are sequences, each element of one sequence is compared to the corresponding element of the other sequence.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - A boolean value corresponding to the result of all the comparison operations. If all comparison operations result in true, the return value will be true. If any of them result in false, the return value will be false.

gt ¶

Calculate gt over the values in a bytes or bytearray object.

Equivalent to:	all([x > param for x in sequence])
or	all([param > x for x in sequence])
or	all([x > y for x,y in zip(sequence1, sequence2)])

Call formats:

result = gt(sequence1, param)
result = gt(param, sequence1)
result = gt(sequence1, sequence2)
result = gt(sequence1, param, maxlen=y)
result = gt(sequence1, param, nosimd=False)

sequence1 - An input bytes or bytearray to be examined.
sequence2 - An input bytes or bytearray to be examined.
param - An integer numeric input parameter in the range 0 - 255.
The first and second parameters are compared to each other. If one parameter is a sequence and the other is an integer, the integer is compared to each element in the sequence. If both parameters are sequences, each element of one sequence is compared to the corresponding element of the other sequence.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - A boolean value corresponding to the result of all the comparison operations. If all comparison operations result in true, the return value will be true. If any of them result in false, the return value will be false.
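
For the non-symmetric comparisons the order of the two parameters matters: gt(sequence1, param) tests whether every element is greater than param, while gt(param, sequence1) tests whether param is greater than every element. The sketch below models the three documented forms in plain Python. The name `gt_reference` is hypothetical, and the sequence-to-sequence form assumes both sequences have the same length.

```python
def gt_reference(first, second):
    """Pure-Python model of the three gt call formats (hypothetical helper)."""
    first_is_seq = isinstance(first, (bytes, bytearray))
    second_is_seq = isinstance(second, (bytes, bytearray))
    if first_is_seq and second_is_seq:
        # Element by element comparison of two sequences.
        return all(x > y for x, y in zip(first, second))
    if first_is_seq:
        # Every element must be greater than the parameter.
        return all(x > second for x in first)
    if second_is_seq:
        # The parameter must be greater than every element.
        return all(first > x for x in second)
    raise TypeError("at least one argument must be bytes or bytearray")


data = bytes([10, 20, 30])
print(gt_reference(data, 5))                    # True
print(gt_reference(5, data))                    # False
print(gt_reference(50, data))                   # True
print(gt_reference(data, bytes([9, 19, 29])))   # True
```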

invert ¶

Calculate invert over the values in a bytes or bytearray object.

Equivalent to:

[~x for x in sequence1]

Call formats:

invert(sequence1)
invert(sequence1, outpseq)
invert(sequence1, maxlen=y)
invert(sequence1, nosimd=False)

sequence1 - The input bytes or bytearray to be examined. If no output bytearray is provided the results will overwrite the input data, in which case it must be a bytearray.
outpseq - The output bytearray. This parameter is optional.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled. This parameter is optional. The default is False.

le ¶

Calculate le over the values in a bytes or bytearray object.

Equivalent to:	all([x <= param for x in sequence])
or	all([param <= x for x in sequence])
or	all([x <= y for x,y in zip(sequence1, sequence2)])

Call formats:

result = le(sequence1, param)
result = le(param, sequence1)
result = le(sequence1, sequence2)
result = le(sequence1, param, maxlen=y)
result = le(sequence1, param, nosimd=False)

sequence1 - An input bytes or bytearray to be examined.
sequence2 - An input bytes or bytearray to be examined.
param - An integer numeric input parameter in the range 0 - 255.
The first and second parameters are compared to each other. If one parameter is a sequence and the other is an integer, the integer is compared to each element in the sequence. If both parameters are sequences, each element of one sequence is compared to the corresponding element of the other sequence.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - A boolean value corresponding to the result of all the comparison operations. If all comparison operations result in true, the return value will be true. If any of them result in false, the return value will be false.

lshift ¶

Calculate lshift over the values in a bytes or bytearray object.

Equivalent to:	[x << param for x in sequence1]
or	[param << x for x in sequence1]
or	[x << y for x,y in zip(sequence1, sequence2)]

Call formats:

lshift(sequence1, param)
lshift(sequence1, param, outpsequence)
lshift(param, sequence1)
lshift(param, sequence1, outpsequence)
lshift(sequence1, sequence2)
lshift(sequence1, sequence2, outpsequence)
lshift(sequence1, param, maxlen=y)
lshift(sequence1, param, nosimd=False)

sequence1 - The first input data bytes or bytearray sequence to be examined. If no output sequence is provided the results will overwrite the input data.
param - A non-sequence numeric parameter.
sequence2 - A second input data sequence. Each element in this sequence is applied to the corresponding element in the first sequence.
outpsequence - The output sequence. This parameter is optional.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled. This parameter is optional. The default is False.

lt ¶

Calculate lt over the values in a bytes or bytearray object.

Equivalent to:	all([x < param for x in sequence])
or	all([param < x for x in sequence])
or	all([x < y for x,y in zip(sequence1, sequence2)])

Call formats:

result = lt(sequence1, param)
result = lt(param, sequence1)
result = lt(sequence1, sequence2)
result = lt(sequence1, param, maxlen=y)
result = lt(sequence1, param, nosimd=False)

sequence1 - An input bytes or bytearray to be examined.
sequence2 - An input bytes or bytearray to be examined.
param - An integer numeric input parameter in the range 0 - 255.
The first and second parameters are compared to each other. If one parameter is a sequence and the other is an integer, the integer is compared to each element in the sequence. If both parameters are sequences, each element of one sequence is compared to the corresponding element of the other sequence.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - A boolean value corresponding to the result of all the comparison operations. If all comparison operations result in true, the return value will be true. If any of them result in false, the return value will be false.

ne ¶

Calculate ne over the values in a bytes or bytearray object.

Equivalent to:	all([x != param for x in sequence])
or	all([param != x for x in sequence])
or	all([x != y for x,y in zip(sequence1, sequence2)])

Call formats:

result = ne(sequence1, param)
result = ne(param, sequence1)
result = ne(sequence1, sequence2)
result = ne(sequence1, param, maxlen=y)
result = ne(sequence1, param, nosimd=False)

sequence1 - An input bytes or bytearray to be examined.
sequence2 - An input bytes or bytearray to be examined.
param - An integer numeric input parameter in the range 0 - 255.
The first and second parameters are compared to each other. If one parameter is a sequence and the other is an integer, the integer is compared to each element in the sequence. If both parameters are sequences, each element of one sequence is compared to the corresponding element of the other sequence.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled if present. The default is False (SIMD acceleration is enabled if present).
result - A boolean value corresponding to the result of all the comparison operations. If all comparison operations result in true, the return value will be true. If any of them result in false, the return value will be false.

or_¶

Calculate or_ over the values in a bytes or bytearray object.

Equivalent to:	[x | param for x in sequence1]
or	[param | x for x in sequence1]
or	[x | y for x,y in zip(sequence1, sequence2)]

Call formats:

or_(sequence1, param)
or_(sequence1, param, outpsequence)
or_(param, sequence1)
or_(param, sequence1, outpsequence)
or_(sequence1, sequence2)
or_(sequence1, sequence2, outpsequence)
or_(sequence1, param, maxlen=y)
or_(sequence1, param, nosimd=False)

sequence1 - The first input data bytes or bytearray sequence to be examined. If no output sequence is provided the results will overwrite the input data.
param - A non-sequence numeric parameter.
sequence2 - A second input data sequence. Each element in this sequence is applied to the corresponding element in the first sequence.
outpsequence - The output sequence. This parameter is optional.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled. This parameter is optional. The default is False.

rshift ¶

Calculate rshift over the values in a bytes or bytearray object.

Equivalent to:	[x >> param for x in sequence1]
or	[param >> x for x in sequence1]
or	[x >> y for x,y in zip(sequence1, sequence2)]

Call formats:

rshift(sequence1, param)
rshift(sequence1, param, outpsequence)
rshift(param, sequence1)
rshift(param, sequence1, outpsequence)
rshift(sequence1, sequence2)
rshift(sequence1, sequence2, outpsequence)
rshift(sequence1, param, maxlen=y)
rshift(sequence1, param, nosimd=False)

sequence1 - The first input data bytes or bytearray sequence to be examined. If no output sequence is provided the results will overwrite the input data.
param - A non-sequence numeric parameter.
sequence2 - A second input data sequence. Each element in this sequence is applied to the corresponding element in the first sequence.
outpsequence - The output sequence. This parameter is optional.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled. This parameter is optional. The default is False.

xor ¶

Calculate xor over the values in a bytes or bytearray object.

Equivalent to:	[x ^ param for x in sequence1]
or	[param ^ x for x in sequence1]
or	[x ^ y for x,y in zip(sequence1, sequence2)]

Call formats:

xor(sequence1, param)
xor(sequence1, param, outpsequence)
xor(param, sequence1)
xor(param, sequence1, outpsequence)
xor(sequence1, sequence2)
xor(sequence1, sequence2, outpsequence)
xor(sequence1, param, maxlen=y)
xor(sequence1, param, nosimd=False)

sequence1 - The first input data bytes or bytearray sequence to be examined. If no output sequence is provided the results will overwrite the input data.
param - A non-sequence numeric parameter.
sequence2 - A second input data sequence. Each element in this sequence is applied to the corresponding element in the first sequence.
outpsequence - The output sequence. This parameter is optional.
maxlen - Limit the length of the sequence used. This must be a valid positive integer. If a zero or negative length, or a value which is greater than the actual length of the sequence is specified, this parameter is ignored.
nosimd - If True, SIMD acceleration is disabled. This parameter is optional. The default is False.

Parameter Details ¶

Comparison Operators ¶

Some functions use comparison operators. These are unicode strings containing the Python compare operators and include the following:

Operator	Description
‘<’	Less than.
‘<=’	Less than or equal to.
‘>’	Greater than.
‘>=’	Greater than or equal to.
‘==’	Equal to.
‘!=’	Not equal to.

All comparison operators must contain only the above characters and may not include any leading or trailing spaces or other characters.

Numeric Parameters ¶

“Bytes” and “bytearray” objects are sequences of 8 bit bytes with each element being in the range of 0 to 255. When a function accepts a non-sequence numeric parameter, this must also be in the range of 0 to 255.

Using Less than the Entire Sequence ¶

If the sequence is longer than the desired length of the calculation, the operation may be limited to the first part of the sequence by using the ‘maxlen’ parameter. In the following example only the first 3 elements are examined, and the remaining elements are ignored:

x = bytes([20,21,22,23,24,25])
result = bytesfunc.bmax(x, maxlen=3)
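
The rule that a zero, negative, or over-long ‘maxlen’ is ignored can be written out as a short pure-Python sketch. The helper name `effective_length` is hypothetical, and the default of 0 is an assumption chosen here to mean "use the whole sequence".

```python
def effective_length(sequence, maxlen=0):
    """Return how many leading elements would be used (hypothetical helper)."""
    if 0 < maxlen <= len(sequence):
        return maxlen
    # Zero, negative, or larger than the sequence: maxlen is ignored.
    return len(sequence)


x = bytes([20, 21, 22, 23, 24, 25])
print(max(x[:effective_length(x, 3)]))     # 22: only 20, 21, 22 are examined
print(max(x[:effective_length(x, 0)]))     # 25: zero is ignored
print(max(x[:effective_length(x, -1)]))    # 25: negative is ignored
print(max(x[:effective_length(x, 100)]))   # 25: too long is ignored
```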

Suppressing or Ignoring Math Errors ¶

Some functions can be made to ignore some mathematical errors (e.g. integer overflow) by setting the ‘matherrors’ keyword parameter to True:

x = bytes([20,21,22,23,24,250,250])
result = bytesfunc.bsum(x, matherrors=True)

Ignoring errors may be desirable if the side effect (e.g. the result of an integer overflow) is the intended effect, or for reasons of a minor performance improvement in some cases. Benchmark your calculation before deciding if this is worthwhile.

Differences with Native Python ¶

In some cases ‘BytesFunc’ will not produce exactly the same result as Python. There are several reasons for this, the primary one being that BytesFunc operates on different underlying data types. Specifically, BytesFunc uses the platform’s native integer types, which overflow the same way as they would in programs written in C. Python integers, by contrast, are of arbitrary size and can never overflow (Python simply expands the word size indefinitely).

Think of BytesFunc as exposing C style semantics in a form convenient to use in Python. Some convenience which Python provides (e.g. no limit to the size of integers) is traded off for large performance increases.
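
The difference can be illustrated without the library at all. The sketch below contrasts native Python integer results with the same operations masked to 8 bits, which is how an unsigned char behaves in C. That BytesFunc truncates in exactly this way for every function is an assumption made for illustration; the behaviour of a particular function should be confirmed by testing.

```python
data = bytes([1, 128, 200, 255])

# Native Python integers grow or become negative as needed.
python_invert = [~x for x in data]
python_lshift = [x << 1 for x in data]
print(python_invert)   # [-2, -129, -201, -256]
print(python_lshift)   # [2, 256, 400, 510]

# Masking to 8 bits models C style unsigned char arithmetic. This is the
# only kind of value a bytes or bytearray element can hold.
c_style_invert = bytearray((~x) & 0xFF for x in data)
c_style_lshift = bytearray((x << 1) & 0xFF for x in data)
print(list(c_style_invert))   # [254, 127, 55, 0]
print(list(c_style_lshift))   # [2, 0, 144, 254]
```

None of the native Python results except the first shift would fit back into a bytearray, which is why the two approaches cannot always agree.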

SIMD Support ¶

General ¶

SIMD (Single Instruction Multiple Data) is a set of CPU features which allow multiple operations to take place in parallel. Some, but not all, functions may make use of these instructions to speed up execution.

Disabling SIMD ¶

Those functions which do support SIMD features will automatically make use of them by default unless this feature is disabled. There is normally no reason to disable SIMD, but should there be hardware related problems the function can be forced to fall back to conventional execution mode.

If the optional parameter “nosimd” is set to true (“nosimd=True”), SIMD execution will be disabled. The default is “False”.

To repeat, there is normally no reason to wish to disable SIMD.

Platform Support ¶

SIMD instructions are presently supported only on the following:

64 bit x86 (i.e. AMD64) using GCC.
32 bit ARMv7 using GCC (tested on Raspberry Pi 3).
64 bit ARMv8 AARCH64 using GCC (tested on Raspberry Pi 4).

Other compilers or platforms will still run the same functions and should produce the same results, but they will not benefit from SIMD acceleration.

However, non-SIMD functions will still be much faster than standard Python code. See the performance benchmarks to see what the relative speed differences are. With wider data types (e.g. double precision floating point) SIMD provides only marginal speed ups anyway.

Raspberry Pi 32 versus 64 bit ¶

The Raspberry Pi uses an ARM CPU. This can operate in 32 or 64 bit mode. When in 32 bit mode, the Raspberry Pi 3 operates in ARMv7 mode. This has 64 bit ARM NEON SIMD vectors.

When in 64 bit mode, it acts as an ARMv8, with AARCH64 128 bit ARM NEON SIMD vectors.

The Raspbian Linux OS is 32 bit mode only. Other distros such as Ubuntu offer 64 bit versions.

The “setup.py” file uses platform detection code to determine which ARM CPU and mode it is running on. Due to the availability of hardware for testing, this code is tailored to the Raspberry Pi 3 and Raspberry Pi 4 and the operating systems listed. This code then selects the appropriate compiler arguments to pass to the setup routines to tell the compiler what mode to compile for.

If other ARM platforms are used which have different platform signatures or which require different compiler arguments, the “setup.py” file may need to be modified in order to use SIMD acceleration.

However, the straight ‘C’ code should still compile and run, and still provide performance many times faster than when using native Python.

SIMD Function Support ¶

The following table shows which functions are supported by SIMD on which CPU architectures.

Function	x86	ARMv7	ARMv8
and_	X	X	X
ball	X	X	X
bany	X	X	X
bmax	X	X	X
bmin	X	X	X
bsum		X	X
eq	X	X	X
findindex	X	X	X
ge	X	X	X
gt	X	X	X
invert	X	X	X
le	X	X	X
lshift	X	X	X
lt	X	X	X
ne	X	X	X
or_	X	X	X
rshift	X	X	X
xor	X	X	X

SIMD Support Attributes ¶

‘simdsupport’ provides information on the SIMD level compiled into this version of the library. There are two attributes, ‘hassimd’ and ‘simdarch’.

‘hassimd’ is True if the CPU supports the required SIMD features.
‘simdarch’ contains a string indicating the CPU architecture the library was compiled for.

Example:

>>> bytesfunc.simdsupport.hassimd
True

Example:

>>> bytesfunc.simdsupport.simdarch
'x86_64'

This was created primarily for unit testing and benchmarking and should not be considered to be a permanent or stable part of the library.

Performance ¶

Variables affecting Performance ¶

The purpose of the BytesFunc module is to execute common operations faster than native Python. The relative speed will depend upon a number of factors:

The function.
Function options. Turning checking off will result in faster performance.
The data in the sequence and the parameters.
The size of the sequence.
The platform, including CPU type (e.g. x86 or ARM), operating system, and compiler.

The speeds listed below should be used as rough guidelines only. More exact results will require application specific testing. The numbers shown are the execution time of each function relative to native Python. For example, a value of ‘50’ means that the corresponding BytesFunc operation ran 50 times faster than the closest native Python equivalent.

Both relative performance (the speed-up as compared to Python) and absolute performance (the actual execution speed of Python and BytesFunc) will vary significantly depending upon the compiler (which is OS platform dependent) and whether compiled to 32 or 64 bit. If your precise actual benchmark performance results matter, be sure to conduct your testing using the actual OS and compiler your final program will be deployed on. The values listed below were measured on x86-64 Linux compiled with GCC.

Note: Some more complex BytesFunc functions do not work exactly the same way as the native Python equivalents. This means that the benchmark results should be taken as general guidelines rather than precise comparisons.
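
Application specific testing of the kind suggested above could be sketched as follows. This is an illustration only, not the benchmark used to produce the figures in this document. The helper `relative_speed` is a hypothetical name, and the call `bytesfunc.bmax(data)` assumes the single-argument form of ‘bmax’; see the function details for the valid parameter combinations.

```python
import timeit


def relative_speed(baseline, candidate, repeat=5, number=10):
    """Return baseline time divided by candidate time (best of several runs)."""
    base_time = min(timeit.repeat(baseline, repeat=repeat, number=number))
    cand_time = min(timeit.repeat(candidate, repeat=repeat, number=number))
    return base_time / cand_time


# Test data: 100000 bytes, cycling through all possible byte values.
data = bytes(i % 256 for i in range(100000))

try:
    import bytesfunc
except ImportError:
    bytesfunc = None  # The comparison is skipped if the module is not installed.

if bytesfunc is not None:
    # Assumed call form: bmax(sequence) returns the maximum value.
    factor = relative_speed(lambda: max(data), lambda: bytesfunc.bmax(data))
    print("max() time / bmax time on this platform: %.1f" % factor)
```

Taking the minimum of several repeats reduces the influence of other activity on the machine, but the result is still only meaningful for the platform it was measured on.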

Typical Performance Readings ¶

In this set of tests, all error checking was turned on and SIMD acceleration was enabled where this did not conflict with the preceding (the defaults in each case).

A Bytesfunc versus Python factor of 100.0 means the bytesfunc version ran 100 times faster than native Python on that platform. Because these are relative, not absolute numbers, benchmarks for different hardware and platforms cannot be compared with each other in terms of absolute performance.

An SIMD versus non-SIMD factor of 10.0 means the SIMD version was 10 times faster than the non-SIMD version. An SIMD versus non-SIMD factor of 0.0 means the function did not support SIMD on the tested platform.

x86-64 Benchmarks ¶

The following tests were conducted on an x86-64 CPU.

Relative Performance - Python Time / Bytesfunc Time.

function	Bytesfunc vs Python	SIMD vs non-SIMD
and_	1664.9	13.3
ball	470.3	7.7
bany	490.6	7.7
bmax	58.8	3.2
bmin	57.3	3.2
bsum	12.6
eq	460.5	7.4
findindex	733.4	7.3
ge	462.0	7.3
gt	320.4	5.3
invert	1604.3	13.5
le	472.9	7.3
lshift	2289.1	25.1
lt	307.4	5.2
ne	459.6	7.3
or_	1556.0	23.7
rshift	1614.4	24.3
xor	1598.9	25.3

Stat	Value
Average:	813
Maximum:	2289
Minimum:	12.6
Array size:	100000
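
The summary figures above appear to be derived from the ‘Bytesfunc vs Python’ column. As a check, the following sketch recomputes them from the x86-64 table values. The variable names are illustrative only.

```python
# 'Bytesfunc vs Python' factors copied from the x86-64 table above.
x86_factors = {
    "and_": 1664.9, "ball": 470.3, "bany": 490.6, "bmax": 58.8,
    "bmin": 57.3, "bsum": 12.6, "eq": 460.5, "findindex": 733.4,
    "ge": 462.0, "gt": 320.4, "invert": 1604.3, "le": 472.9,
    "lshift": 2289.1, "lt": 307.4, "ne": 459.6, "or_": 1556.0,
    "rshift": 1614.4, "xor": 1598.9,
}

average = sum(x86_factors.values()) / len(x86_factors)
maximum = max(x86_factors.values())
minimum = min(x86_factors.values())

print("Average:", round(average))
print("Maximum:", maximum)
print("Minimum:", minimum)
```

For these values the average rounds to 813, with a maximum of 2289.1 and a minimum of 12.6, which agrees with the summary table.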

ARMv7 Benchmarks ¶

The following tests were conducted on an ARM CPU in 32 bit mode (ARMv7) on a Raspberry Pi 3.

Relative Performance - Python Time / Bytesfunc Time.

function	Bytesfunc vs Python	SIMD vs non-SIMD
and_	1103.9	3.7
ball	347.6	2.6
bany	331.7	2.4
bmax	265.9	5.0
bmin	261.4	5.0
bsum	71.5	2.9
eq	349.6	2.6
findindex	508.1	3.2
ge	359.9	2.6
gt	359.2	2.6
invert	914.4	3.7
le	360.0	2.6
lshift	1358.1	4.2
lt	357.0	2.6
ne	328.2	2.4
or_	1121.2	3.7
rshift	960.1	5.1
xor	1170.3	3.7

Stat	Value
Average:	585
Maximum:	1358
Minimum:	71.5
Array size:	100000

ARMv8 Benchmarks ¶

The following tests were conducted on an ARM CPU in 64 bit mode (ARMv8) on a Raspberry Pi 4.

Relative Performance - Python Time / Bytesfunc Time.

function	Bytesfunc vs Python	SIMD vs non-SIMD
and_	998.9	7.5
ball	501.7	6.0
bany	543.4	6.2
bmax	364.9	13.9
bmin	361.5	13.9
bsum	115.8	6.6
eq	506.5	6.0
findindex	687.0	6.1
ge	533.2	6.0
gt	528.5	6.0
invert	698.4	9.2
le	532.1	6.0
lshift	1332.2	7.6
lt	527.2	6.0
ne	554.3	6.2
or_	740.1	6.3
rshift	713.0	6.2
xor	1099.4	8.2

Stat	Value
Average:	630
Maximum:	1332
Minimum:	115.8
Array size:	100000

Platform Effects ¶

The platform, including CPU, OS, compiler, and compiler version can affect performance, and this influence can change significantly for different functions.

If your application requires exact performance data, then benchmark your application in the specific platform (hardware, OS, and compiler) that you will be using.
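
When recording benchmark results, it may help to store a description of the platform alongside them so that readings from different machines are not mixed up. The sketch below uses only the Python standard library. The function name `platform_summary` is hypothetical, and the compiler reported is the one used to build the Python interpreter, which is assumed here (not guaranteed) to match the one used to build the library.

```python
import platform
import sys


def platform_summary():
    """Return a dictionary describing the OS, CPU, and Python build."""
    return {
        "os": platform.system(),
        "release": platform.release(),
        "arch": platform.machine(),
        # sys.maxsize reflects the word size of the running interpreter.
        "bits": 64 if sys.maxsize > 2**32 else 32,
        # Compiler used to build Python itself.
        "compiler": platform.python_compiler(),
        "python": platform.python_version(),
    }


for key, value in platform_summary().items():
    print("%s: %s" % (key, value))
```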

Platform support ¶

List of tested Operating Systems, Compilers, and CPU Architectures ¶

BytesFunc is written in ‘C’ and uses the standard C libraries to implement the underlying math functions. BytesFunc has been tested on the following platforms.

OS	Arch	Bits	Compiler	Python
almalinux 9.5	x86_64	64	GCC	3.9.19
alpine 3.20.3	i686	32	GCC	3.12.7
Debian 12	i686	32	GCC	3.11.2
Debian 12	x86_64	64	GCC	3.11.2
FreeBSD 14.1	amd64	64	Clang	3.11.10
OpenBSD 7.6	amd64	64	Clang	3.11.10
Raspbian 12	armv7l	32	GCC	3.11.2
Ubuntu 24.04	aarch64	64	GCC	3.12.3
Debian 12	aarch64	64	GCC	3.11.2
opensuse-leap 15.6	x86_64	64	GCC	3.6.15
Ubuntu 24.04	x86_64	64	GCC	3.12.3
Ubuntu 24.10	x86_64	64	GCC	3.12.7
MS Windows 11	AMD64	64	MSC	3.13.1

amd64 and x86_64 are two names for the same thing. armv7l is 32 bit ARM. aarch64 is 64 bit ARM. The ARM test hardware consists of Raspberry Pi models 3, 4, and 5.

The Raspberry Pi 3 tests were conducted on a Raspberry Pi 3 ARM CPU running in 32 bit mode.
The Ubuntu ARM tests were conducted on a Raspberry Pi 4 ARM CPU running in 64 bit mode.
All others were conducted using VMs running on x86 hardware.

Platform Oddities ¶

As most operators are implemented using native behaviour, details of some operations may depend on the CPU architecture.

The behaviour of lshift and rshift depends on the CPU type, on whether the system is 32 or 64 bit, and on the array size.

For 32 bit x86 systems, if the array word size is 32 bits or less, the shift is masked to 5 bits. That is, shift amounts greater than 31 will “roll over”, repeating smaller shifts.

On 64 bit systems, this behaviour will vary depending on whether SIMD is used or not. Arrays which are not even multiples of SIMD register sizes may exhibit different behaviour at different array indexes (depending on whether SIMD or non-SIMD instructions were used for those parts of the array).

ARM does not display this roll-over behaviour, and so may give different results than x86. However, negative shift values may result in the shift operation being conducted in the opposite direction (e.g. right shift instead of left shift).

The conclusion is that bit shift operations which use a shift amount outside the range of 0 to the maximum for the element size may produce undefined results. As the elements here are 8 bit bytes, valid bit shift amounts are 0 to 7.
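
The roll-over effect and a simple guard against it can be illustrated in pure Python. This is a model of a shift count masked to 5 bits as described for x86, not the library's own code, and it does not call BytesFunc. Both function names are hypothetical.

```python
def masked_lshift_model(value, amount):
    """Model a left shift whose count is masked to 5 bits.

    The result is truncated to a single byte.
    """
    return (value << (amount & 0x1F)) & 0xFF


def checked_shift(amount):
    """Return amount unchanged, or raise ValueError if it is not 0 to 7."""
    if not 0 <= amount <= 7:
        raise ValueError("shift amount must be in the range 0 to 7")
    return amount


# A shift of 33 "rolls over" and behaves like a shift of 1 in this model.
print(masked_lshift_model(1, 33))  # 2
print(masked_lshift_model(1, 1))   # 2

# Validating the amount first avoids platform dependent results.
print(checked_shift(3))  # 3
```

Checking the shift amount in the calling code keeps results the same across x86 and ARM, whatever the underlying hardware does with out of range values.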

Installing on Linux with PIP and PEP-668 ¶

PEP-668 (PEPs describe changes to Python) introduced a new feature which can affect how packages are installed with PIP. If the Python installation is marked as EXTERNALLY-MANAGED, PIP will refuse to install a package outside of a virtual environment.

The intention of this is to prevent conflicts between packages which are installed using the system package manager, and ones which are installed using PIP.

Linux distros which are affected by this include the latest versions of Debian and Ubuntu.

As this package is a library which is intended to be used by other applications, there is no one right way to install it, whether inside or outside of a virtual environment. Review the options available with PIP to see what is suitable for your application.

For testing purposes this package was installed by setting the environment variable PIP_BREAK_SYSTEM_PACKAGES to “1”, which effectively disables this feature in PIP.

Example:

export PIP_BREAK_SYSTEM_PACKAGES=1
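
Before choosing an installation method, it can help to check whether the current interpreter is affected. The sketch below relies on two details of the standard mechanisms rather than anything specific to this package: PEP-668 marks an environment with a file named EXTERNALLY-MANAGED in the standard library directory, and inside a virtual environment sys.prefix differs from sys.base_prefix. The function names are hypothetical.

```python
import os
import sys
import sysconfig


def in_virtualenv():
    """Return True if the interpreter is running inside a virtual environment."""
    return sys.prefix != sys.base_prefix


def externally_managed():
    """Return True if the PEP-668 marker file is present for this interpreter."""
    stdlib_dir = sysconfig.get_path("stdlib")
    marker = os.path.join(stdlib_dir, "EXTERNALLY-MANAGED")
    return os.path.isfile(marker)


if externally_managed() and not in_virtualenv():
    print("PIP is expected to refuse installs into this environment")
else:
    print("PIP installs should not be blocked by PEP-668 here")
```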