CommonLounge Archive

NumPy: Manipulating arrays

April 24, 2019

In the last two tutorials, we learnt about NumPy arrays, indexing and slicing, and some mathematical operations and functions NumPy supports.

In this tutorial, we will learn how to manipulate NumPy arrays in various ways — including reshaping an array, concatenating multiple arrays, appending elements or rows, etc.

You are encouraged to follow along with the tutorial and play around with NumPy, tinkering with the code and making sure you’re getting the hang of it. Let’s get started!

Set-up

Let’s start by defining some arrays:

import numpy as np
# Create a 1-dimensional array
a = np.array([1,4,2,3,5,7,8,6])
print('array "a"')
print(a)
print()
# Create a 2-dimensional array
b = np.array([[1,0,1,0,2,3], [1,3,0,1,2,0], [0,1,0,0,1,3]])
print('array "b"')
print(b)
print()
# Create another 2-dimensional array by adding 5 to every element of "b"
c = b + 5
print('array "c"')
print(c)

Reshape and flatten

Reshape can be used to change the shape of an array.

# Reshape to shape 4 x 2
a = a.reshape(4, 2) 
print('array "a" reshaped to 4 x 2')
print(a)
print()
# Reshape to shape 2 x 4
a = a.reshape(2, 4) 
print('array "a" reshaped to 2 x 4')
print(a)

As you can see above, when reshaping to a 2-D array, values are filled row by row (NumPy's default row-major order).


Note that the number of elements in the original array and the final array must be equal. Otherwise, NumPy raises a ValueError:

a.reshape(3, 5)
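To see the size check without stopping a script, the failing call can be wrapped in try/except. The sketch below uses a throwaway array named v (an illustrative name, not one of the tutorial's arrays) and also shows -1, which asks NumPy to infer one dimension from the total number of elements.

```python
import numpy as np

v = np.arange(8)  # illustrative array with 8 elements: 0..7

# 8 elements cannot fill a 3 x 5 grid (15 slots), so reshape raises ValueError
try:
    v.reshape(3, 5)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print('reshape failed:', err)

# -1 lets NumPy infer the missing dimension: 8 elements / 2 rows = 4 columns
inferred = v.reshape(2, -1)
print(inferred.shape)
```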

flatten is equivalent to reshaping to a 1-D vector of length a.size. It always returns a new copy of the data.

a = a.flatten() 
print('Flattened array "a"')
print(a)
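One detail worth checking for yourself is that flatten returns a copy, so editing the flattened result does not touch the original. A minimal sketch, using an illustrative array named grid:

```python
import numpy as np

grid = np.array([[1, 4, 2, 3], [5, 7, 8, 6]])  # illustrative 2 x 4 array

flat = grid.flatten()  # new 1-D copy, filled row by row
flat[0] = 99           # change the copy only
print(grid[0, 0])      # the original still holds its old value

# flatten gives the same values as reshaping to a vector of length grid.size
same = np.array_equal(grid.flatten(), grid.reshape(grid.size))
print(same)
```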

Resize

If we would like to reshape while changing the total number of elements in the array, we can use np.resize. It repeats the elements in order (if resizing to a larger size), or discards the trailing elements (if resizing to a smaller size).

print(np.resize(a, (3, 5)))
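The repeat-or-discard behaviour is easier to see on a very small input. The following sketch uses an illustrative three-element array named v:

```python
import numpy as np

v = np.array([1, 2, 3])  # illustrative array

# Needs 8 values: cycles through 1, 2, 3 until the new shape is full
bigger = np.resize(v, (2, 4))
print(bigger)

# Needs only 2 values: keeps the first two and drops the rest
smaller = np.resize(v, (2,))
print(smaller)
```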

Broadcast

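Broadcasting repeats an existing array to fill a larger shape. We can do this explicitly with np.broadcast_to, which returns a read-only view of the original data rather than physically copying it.

The first argument is the array itself, and the second argument is the shape of the new array.

Since broadcasting repeats the original array, the trailing dimensions of the new shape must match the dimensions of the original array (a dimension of size 1 in the original can be stretched to any size). For example, we can broadcast an array of shape (8, ) to an array of shape (6, 8). But we cannot broadcast it to shape (6, 4).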

print('Broadcasting array "a" to a new size')
print(np.broadcast_to(a, (6, 8)))
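Two properties of np.broadcast_to are worth verifying: the result is a read-only view onto the original data, and an incompatible target shape raises a ValueError. A small sketch with an illustrative array named v:

```python
import numpy as np

v = np.arange(8)                    # illustrative array of shape (8,)
tiled = np.broadcast_to(v, (6, 8))  # 6 rows, each one showing v

print(tiled.shape)
print(tiled.flags.writeable)  # False: the broadcast result cannot be written to

# Because tiled is a view, a change to v shows up in every row of tiled
v[0] = 100
print(tiled[0, 0], tiled[5, 0])

# The trailing dimension (4) does not match 8, so this raises ValueError
try:
    np.broadcast_to(v, (6, 4))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```

If a writable, independent array is needed, calling .copy() on the broadcast result is one way to get it.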

Let’s try broadcasting the 2D array. The shape of b is (3, 6), and we will broadcast it to the shape (2, 3, 6). The result contains the array twice.

print(b)
print()
print('Broadcasting array "b" to a new size')
z = np.broadcast_to(b, (2, 3, 6))
print('New shape', z.shape)
print(z)

Expanding and squeezing dimensions

expand_dims and squeeze can be used to add or remove dimensions from an array.

# Expand the shape of an array by inserting a new axis of length 1
z = np.expand_dims(b, axis=1) # expand at position 1
print('Expanded array "z"')
print('Shape of z', z.shape)
print(z)
print()
# Remove single-dimensional entries from the shape of an array
print('Squeezed array "z"')
print(np.squeeze(z))
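The effect of the axis argument is clearest when only the shapes are compared. The sketch below uses illustrative zero-filled arrays (m and padded) and also passes axis to squeeze, which removes only the named length-1 dimension:

```python
import numpy as np

m = np.zeros((3, 6))  # illustrative 2-D array

# The new length-1 axis appears at the requested position
print(np.expand_dims(m, axis=0).shape)  # (1, 3, 6)
print(np.expand_dims(m, axis=1).shape)  # (3, 1, 6)
print(np.expand_dims(m, axis=2).shape)  # (3, 6, 1)

padded = np.zeros((1, 3, 1, 6))
print(np.squeeze(padded).shape)          # (3, 6): all length-1 axes removed
print(np.squeeze(padded, axis=0).shape)  # (3, 1, 6): only axis 0 removed
```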

Concatenating and stacking

concatenate and stack can be used to join multiple arrays into one. concatenate joins arrays along an existing axis, so the total number of dimensions stays the same. stack, on the other hand, creates a new dimension.

Let’s start by taking a look at arrays b and c:

print('array "b"')
print(b)
print()
print('array "c"')
print(c)

Now, let’s try concatenating. concatenate joins arrays along an existing axis:

# Join arrays along an existing axis, axis 0
print('Concatenated array "b" and "c", along axis 0')
print(np.concatenate((b, c), axis=0))
print()
# Join arrays along an existing axis, axis 1
print('Concatenated arrays "b" and "c" along axis 1')
print(np.concatenate((b, c), axis=1))

Let’s also try stacking. Unlike concatenation, stacking creates a new dimension:

# Join arrays along a new axis, 0
print('Stack arrays "b" and "c" along a new axis, 0')
z = np.stack((b, c), 0)
print(z.shape)
print(z)
print()
# Join arrays along a new axis, 1
print('Stack arrays "b" and "c" along a new axis, 1')
z = np.stack((b, c), 1)
print(z.shape)
print(z)
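A compact way to compare the two functions is to look only at the resulting shapes. This sketch uses two illustrative (3, 6) arrays named p and q:

```python
import numpy as np

p = np.zeros((3, 6))  # illustrative arrays with the same shape as b and c
q = np.ones((3, 6))

# concatenate: the chosen axis grows, the number of dimensions stays at 2
print(np.concatenate((p, q), axis=0).shape)  # (6, 6)
print(np.concatenate((p, q), axis=1).shape)  # (3, 12)

# stack: a new axis of length 2 is inserted at the chosen position
print(np.stack((p, q), axis=0).shape)  # (2, 3, 6)
print(np.stack((p, q), axis=1).shape)  # (3, 2, 6)
print(np.stack((p, q), axis=2).shape)  # (3, 6, 2)
```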

Splitting

split, hsplit (horizontal split) and vsplit (vertical split) can be used to split an array into equal-sized subarrays.

Each of these 3 functions returns a list of arrays.

Again, let’s start by taking a look at arrays b and c:

print('array "b"')
print(b)
print()
print('array "c"')
print(c)

By default, split splits along axis 0. For 2D arrays, this means splitting vertically (into groups of rows):

print('Split array "b"')
print(np.split(b, 3))
print()
print('Split array "c"') 
print(np.split(c, 3))

Now, let’s look at some ways to split horizontally:

print('Split array "c" into 3 parts along axis 1') 
print(np.split(c, 3, axis=1))
print()
print('Horizontally split array "b" into 2 parts')
print(np.hsplit(b, 2))
print()
print('Horizontally split array "b" into 3 parts')
print(np.hsplit(b, 3))
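Because split requires equal-sized parts, asking for a number of sections that does not divide the axis length raises a ValueError. The sketch below shows this on an illustrative array named m, along with np.array_split, a related NumPy function (not covered in this tutorial) that allows uneven parts:

```python
import numpy as np

m = np.arange(18).reshape(3, 6)  # illustrative 3 x 6 array

# 6 columns cannot be cut into 4 equal parts, so split raises ValueError
try:
    np.split(m, 4, axis=1)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)

# array_split allows uneven parts: here the widths are 2, 2, 1, 1
parts = np.array_split(m, 4, axis=1)
print([part.shape for part in parts])
```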

Append, insert and delete

And finally, append, insert and delete can be used to add and remove elements from an array. Unlike the methods of a Python list, these functions do not modify the array in place; each one returns a new array. append is equivalent to inserting elements at the end of the array.

print('array "a"')
print(a)
print()
# Append values to the end of an array
print('Values appended to array "a"')
print(np.append(a, [1, 2]))
print()
# Insert values (0, 0) along the given axis (0) at the given index (3)
print('Values inserted to array "a"')
print(np.insert(a, 3, [0, 0], axis=0))
print()
# Delete the elements at the given indices (1 and 3) along an axis (0)
print('Values deleted from array "a"')
print(np.delete(a, [1, 3], axis=0))

Let’s see some examples with 2D arrays:

# 2d array "b"
print('array "b"')
print(b)
print()
# Append a row of values to the end of a 2D array
# Notice that the argument itself is a 2D array.
print('Values appended to array "b" along axis 0')
print(np.append(b, [[1, 2, 3, 4, 5, 6]], axis=0))
print()
# Append a column of values to the end of a 2D array
print('Values appended to array "b" along axis 1')
print(np.append(b, [[7], [8], [9]], axis=1))
print()
# Insert values (5, 5, 5) along the given axis (1) at the given index (2)
print('Scalar values inserted to array "b" in column position 2')
print(np.insert(b, 2, 5, axis=1))
print()
# Insert values (1, 2, 3) along the given axis (1) at the given index (1)
print('Column vector inserted to array "b" in column position 1')
print(np.insert(b, [1], [[1],[2],[3]], axis=1))
print()
# Delete the row at the given index (1) along axis 0
print('Row deleted from array "b"')
print(np.delete(b, 1, axis=0))
print()
# Delete the column at the given index (2) along axis 1
print('Column deleted from array "b"')
print(np.delete(b, 2, axis=1))
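Two behaviours are easy to miss in the examples above: none of these functions changes the input array, and append flattens its inputs when no axis is given. A short sketch with illustrative arrays v and m:

```python
import numpy as np

v = np.array([1, 2, 3])  # illustrative 1-D array

appended = np.append(v, [4, 5])  # [1 2 3 4 5]
inserted = np.insert(v, 1, 0)    # [1 0 2 3]
deleted = np.delete(v, 0)        # [2 3]
print(v)                         # v itself is unchanged: [1 2 3]

m = np.array([[1, 2], [3, 4]])   # illustrative 2-D array

# Without axis, both inputs are flattened before joining
flat_append = np.append(m, [[5, 6]])
print(flat_append)               # [1 2 3 4 5 6]

# With axis=0, the new values are added as a row
row_append = np.append(m, [[5, 6]], axis=0)
print(row_append.shape)          # (3, 2)
```

To keep a result, assign it back to a name, for example v = np.append(v, [4, 5]).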

Summary

In this tutorial we learnt how to

  • change the shape of a NumPy array using reshape and flatten
  • change the size of a NumPy array using resize and broadcast_to
  • add or remove dimensions using expand_dims and squeeze
  • join arrays using concatenate and stack
  • split arrays using split, hsplit and vsplit
  • append, insert and delete items / subarrays from an array

This tutorial concludes our section on NumPy. In the upcoming project, you’ll use NumPy to do some practical data exploration.

