Python NumPy Broadcasting

The term “broadcasting” refers to how NumPy handles arrays of different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that their shapes are compatible. Broadcasting lets you vectorize array operations so that looping happens in C rather than in Python.

For example:

To understand NumPy’s broadcasting mechanism, we add a scalar to a one-dimensional array.

# Import the numpy module as np using the import keyword.
import numpy as np
# Pass the length of the array to the arange() function and store the result in a variable.
gvn_arry = np.arange(4)
# Add a number to the given array and store the result in another variable.
result = gvn_arry + 6
# Print the above result
print(result)

Output:

[6 7 8 9]

In this case, the given array has one dimension (axis) of length 4, whereas 6 is a plain integer with no dimensions. Because their shapes differ, NumPy broadcasts (conceptually stretches) the scalar along the array’s axis so that it behaves like [6 6 6 6], which makes the element-wise addition possible.
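The stretching can be made visible with np.broadcast_to. The following is only an illustrative sketch: the variable names stretched, explicit and implicit are not part of the original example, and NumPy does not actually materialize the stretched array when it evaluates gvn_arry + 6.

```python
import numpy as np

gvn_arry = np.arange(4)

# View the scalar 6 as if it had the same shape as the array.
# broadcast_to returns a read-only view; no data is copied.
stretched = np.broadcast_to(6, gvn_arry.shape)
print(stretched)                            # [6 6 6 6]

# Adding the stretched view gives the same result as adding the scalar.
explicit = gvn_arry + stretched
implicit = gvn_arry + 6
print(np.array_equal(explicit, implicit))   # True
```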

The NumPy Broadcasting Rules

NumPy broadcasting follows a strict set of rules so that array operations are consistent and fail predictably. The following are the two general broadcasting rules in NumPy:

When we perform an operation on two NumPy arrays, NumPy compares their shapes dimension by dimension, starting with the trailing (rightmost) dimension and working left. Two dimensions are compatible only when they are equal or when one of them is 1. If the two dimensions are equal, that dimension is left unchanged. If one of them is 1, that array is broadcast along the dimension to match the other. If neither condition is met, NumPy raises a ValueError indicating that the arrays cannot be broadcast together. The arrays are broadcast if and only if all dimensions are compatible.

The arrays being compared do not have to have the same number of dimensions. The shape of the array with fewer dimensions is treated as if it were padded with 1s on the left, so that array is stretched along the missing dimensions.
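The two rules can be sketched as a small shape checker. The function broadcast_shape below is a hypothetical helper written for illustration, not a NumPy API. It pads the shorter shape with 1s first and then compares the dimensions pairwise, which is equivalent to comparing from right to left.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError (illustrative helper)."""
    # Rule 2: pad the shorter shape with 1s on the left.
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    # Rule 1: each pair of dimensions must be equal, or one of them must be 1.
    for dim_a, dim_b in zip(a, b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not compatible")
    return tuple(result)

print(broadcast_shape((5, 3), (5, 1)))            # (5, 3)
print(broadcast_shape((6, 3), (3,)))              # (6, 3)
print(broadcast_shape((5, 4, 3, 2), (4, 3, 2)))   # (5, 4, 3, 2)

# Cross-check one case against NumPy itself.
numpy_shape = (np.empty((6, 3)) + np.empty((3,))).shape
print(numpy_shape == broadcast_shape((6, 3), (3,)))   # True
```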

Implementation

Let the two arrays have the shapes arr_1 = (5, 3) and arr_2 = (5, 1).

The sum of arrays with compatible dimensions: the shapes (5, 3) and (5, 1) are compatible. To match the shape of arr_1, the array arr_2 is stretched along its second dimension, which has size 1.

# Import numpy module as np using the import keyword.
import numpy as np
# Pass the rowsize*columnsize as argument to the arange() function and
# pass the rowsize, columnsize  as arguments to the reshape() function
# Store it in a variable.
arry1 = np.arange(15).reshape(5, 3)
# Print the shape(rowsize, columnsize) using the shape function
print("First Array shape = ", arry1.shape)
# Similarly, create the other array.
arry2 = np.arange(5).reshape(5, 1)
print("Second Array shape = ", arry2.shape)
# Print the sum of both the arrays by adding the above two variables.
print("Adding both arrays and printing the sum of it:")
print(arry1 + arry2)

Output:

First Array shape = (5, 3) 
Second Array shape = (5, 1) 
Adding both arrays and printing the sum of it: 
[[ 0  1  2]
 [ 4  5  6]
 [ 8  9 10]
 [12 13 14]
 [16 17 18]]
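To see what the expansion of the second array looks like, the stretch can be written out explicitly. This is a minimal sketch; the name expanded is introduced here for illustration, and NumPy performs the same expansion implicitly without copying data.

```python
import numpy as np

arry1 = np.arange(15).reshape(5, 3)
arry2 = np.arange(5).reshape(5, 1)

# Make the stretch explicit: the single column is repeated three times.
expanded = np.broadcast_to(arry2, (5, 3))
print(expanded[1])                                       # [1 1 1]

# The implicit and the explicit versions give the same sum.
print(np.array_equal(arry1 + arry2, arry1 + expanded))   # True
```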

Example2:

# Import numpy module as np using the import keyword.
import numpy as np
# Pass the rowsize*columnsize as argument to the arange() function and
# pass the rowsize, columnsize  as arguments to the reshape() function
# Store it in a variable.
arry1 = np.arange(15).reshape(5, 4)
# Print the shape(rowsize, columnsize) using the shape function
print("First Array shape = ", arry1.shape)
# Similarly, create the other array.
arry2 = np.arange(5).reshape(5, 1)
print("Second Array shape = ", arry2.shape)
# Print the sum of both the arrays by adding the above two variables.
print("Adding both arrays and printing the sum of it:")
print(arry1 + arry2)

Output:

ValueError                                Traceback (most recent call last)
<ipython-input-17-3bb527438935> in <module>()
      4 # pass the rowsize, columnsize  as arguments to the reshape() function
      5 # Store it in a variable.
----> 6 arry1 = np.arange(15).reshape(5, 4)
      7 # Print the shape(rowsize, columnsize) using the shape function
      8 print("First Array shape = ", arry1.shape)

ValueError: cannot reshape array of size 15 into shape (5,4)

Explanation:

Here, the number of rows is 5, while the number of columns is 4.
An array of 15 elements cannot be placed in such a matrix (a matrix of
size 5*4 = 20 is required). The error is raised by reshape(), before any
broadcasting takes place.
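For comparison, the following sketch shows a failure that comes from the broadcasting rules themselves rather than from reshape(). The shapes (5, 4) and (5,) are chosen here for illustration and do not appear in the original example.

```python
import numpy as np

arry1 = np.arange(20).reshape(5, 4)   # shape (5, 4)
arry2 = np.arange(5)                  # shape (5,)

try:
    arry1 + arry2
except ValueError as err:
    # The trailing dimensions are 4 and 5: not equal, and neither is 1.
    print("Broadcasting failed:", err)

# Giving arry2 the shape (5, 1) makes the two shapes compatible.
fixed = arry1 + arry2[:, np.newaxis]
print(fixed.shape)                    # (5, 4)
```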

Example3:

# Import numpy module as np using the import keyword.
import numpy as np
# Pass the rowsize*columnsize as argument to the arange() function and
# pass the rowsize, columnsize  as arguments to the reshape() function
# Store it in a variable.
arry1 = np.arange(18).reshape(6, 3)
# Print the shape(rowsize, columnsize) using the shape function
print("First Array shape = ", arry1.shape)
# Similarly, create the other array.
arry2 = np.arange(3)
print("Second Array shape = ", arry2.shape)
# Print the sum of both the arrays by adding the above two variables.
print("Adding both arrays and printing the sum of it:")
print(arry1 + arry2)

Output:

First Array shape = (6, 3) 
Second Array shape = (3,) 
Adding both arrays and printing the sum of it: 
[[ 0  2  4]
 [ 3  5  7]
 [ 6  8 10]
 [ 9 11 13]
 [12 14 16]
 [15 17 19]]
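In Example3 the second array has fewer dimensions than the first, so its shape (3,) is treated as (1, 3) and the single row is added to every row of the first array. The sketch below makes the extra axis explicit; the name row is introduced only for illustration.

```python
import numpy as np

arry1 = np.arange(18).reshape(6, 3)
arry2 = np.arange(3)

# Add the missing leading axis by hand: shape (3,) becomes (1, 3).
row = arry2[np.newaxis, :]
print(row.shape)                                    # (1, 3)

# Both forms give the same sum.
print(np.array_equal(arry1 + arry2, arry1 + row))   # True

# Each row of the result is the matching row of arry1 plus [0 1 2].
print((arry1 + arry2)[1])                           # [3 5 7]
```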

Example4:

arry1 = np.arange(120).reshape(5, 4, 3, 2)
print("First Array shape = ", arry1.shape)
 
arry2 = np.arange(24).reshape(4, 3, 2)
print("Second Array shape = ", arry2.shape)
 
print("Adding both arrays and printing the sum of it: \n", (arry1 + arry2).shape)

Output:

First Array shape = (5, 4, 3, 2) 
Second Array shape = (4, 3, 2) 
Adding both arrays and printing the sum of it: 
(5, 4, 3, 2)

Explanation:

It is important to realize that arrays can be broadcast even when they have
a different number of dimensions. Array1 has shape (5, 4, 3, 2), while array2
has shape (4, 3, 2). Compared from the right, the dimensions 2, 3 and 4 match,
so array1 is not stretched at all. Array2 is treated as if it had shape
(1, 4, 3, 2) and is stretched along that first dimension, yielding a result of
shape (5, 4, 3, 2).
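A quick way to check how the smaller array is applied in Example4 is to compare the result block by block along the first axis. This is a sketch for illustration only; the names total and same are not part of the original example.

```python
import numpy as np

arry1 = np.arange(120).reshape(5, 4, 3, 2)
arry2 = np.arange(24).reshape(4, 3, 2)

total = arry1 + arry2

# Every block along the first axis has the whole of arry2 added to it.
same = all(np.array_equal(total[i], arry1[i] + arry2) for i in range(5))
print(same)                                                       # True

# Writing the leading axis of size 1 explicitly gives the same result.
print(np.array_equal(total, arry1 + arry2.reshape(1, 4, 3, 2)))   # True
```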

Broadcasting’s Speed Advantages

NumPy broadcasting is more efficient than looping through the array in Python. Let’s return to the first example. Instead of relying on broadcasting, the user could loop through the array and add the same number to each element. This is slower for two reasons. First, an explicit loop runs in the Python interpreter, so every element pays interpreter overhead and the work no longer benefits from NumPy’s C implementation. Second, broadcasting does not copy data: NumPy describes the stretched array with strides, and setting the stride of a broadcast dimension to 0 lets the same elements be reused along that dimension without any memory overhead.
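Both points can be checked with a short experiment. The sketch below is an assumption-laden illustration: the functions add_with_loop and add_with_broadcast, the array size and the repeat count are all chosen here, and the measured times depend on the machine, so no timings are quoted.

```python
import timeit
import numpy as np

data = np.arange(10_000)

def add_with_loop(arr, value):
    # Element-by-element addition driven by a Python loop.
    out = np.empty_like(arr)
    for i in range(arr.size):
        out[i] = arr[i] + value
    return out

def add_with_broadcast(arr, value):
    # The scalar is broadcast and the loop runs inside NumPy.
    return arr + value

# Both approaches produce the same values.
print(np.array_equal(add_with_loop(data, 6), add_with_broadcast(data, 6)))  # True

loop_time = timeit.timeit(lambda: add_with_loop(data, 6), number=3)
bcast_time = timeit.timeit(lambda: add_with_broadcast(data, 6), number=3)
print(f"loop: {loop_time:.5f}s  broadcast: {bcast_time:.5f}s")

# A broadcast view reuses the original data: the stretched axis has stride 0.
row = np.arange(3)
view = np.broadcast_to(row, (1000, 3))
print(view.strides[0])                # 0
print(np.shares_memory(view, row))    # True
```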