Numpy
NumPy:
NumPy stands for 'Numerical Python' or 'Numeric Python'. It is an open-source Python module
that provides fast mathematical computation on arrays and matrices. Since arrays and matrices
are an essential part of the machine learning ecosystem, NumPy, together with modules such as
Scikit-learn, Pandas, Matplotlib, and TensorFlow, completes the Python machine learning
ecosystem.
There are a number of ways to initialize new NumPy arrays, for example from Python lists or
with NumPy's built-in array-generating functions.
From lists
To create new vector and matrix arrays from Python lists, we can use the numpy.array function.
In [3]: import numpy as np
# a matrix: the argument to the array function is a nested Python list
M = np.array([[1, 2], [3, 4]])
M
localhost:8888/notebooks/Numpy.ipynb 1/19
9/25/2019 Numpy
So far the numpy.ndarray looks a lot like a Python list (or nested list). Why not simply use
Python lists for computations instead of creating a new array type? There are several reasons:
Python lists are very general. They can contain any kind of object and are dynamically
typed. They do not support mathematical operations such as matrix and dot multiplication, and
implementing such operations for Python lists would not be very efficient because of the
dynamic typing.
NumPy arrays are statically typed and homogeneous. The type of the elements is
determined when the array is created.
NumPy arrays are memory efficient.
Because of the static typing, mathematical functions such as multiplication and addition of
NumPy arrays can be implemented in a compiled language (C and Fortran are used), which
makes them fast.
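The speed claim above can be checked with a small timing experiment. The sketch below is an assumption about how such a comparison might look, not the notebook's original cell: it uses a smaller size and repeat count than the notebook's `size = 1000000` and `timeits = 1000` so that it finishes quickly, and the names `py_list`, `np_array`, and `repeats` are illustrative.

```python
import timeit

import numpy as np

size = 100_000   # assumed value, smaller than the notebook's, to keep the run short
repeats = 20     # assumed repeat count

py_list = list(range(size))
np_array = np.arange(size)

# Double every element: an explicit loop for the list, one vectorized call for the array
list_time = timeit.timeit(lambda: [v * 2 for v in py_list], number=repeats)
array_time = timeit.timeit(lambda: np_array * 2, number=repeats)

print(type(np_array))
print(type(py_list))
print(f"list:    {list_time:.4f} s")
print(f"ndarray: {array_time:.4f} s")
```

The exact timings depend on the machine, so no expected numbers are given here.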
size = 1000000
timeits = 1000
<class 'numpy.ndarray'>
<class 'list'>
For larger arrays it is impractical to initialize the data manually using explicit Python lists.
Instead, we can use one of the many NumPy functions that generate arrays of different forms.
Most of the time, we use these built-in functions to create arrays, because they are much
simpler and faster:
arange()
linspace()
zeros()
ones()
eye()
diag()
Random
rand()
random()
randn()
randint()
reshape()
a. arange()
arange() is very similar to the Python function range().
Syntax: arange([start,] stop[, step,], dtype=None)
Returns evenly spaced values within a given interval.
Out[12]: array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
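A few illustrative calls show how the arguments of arange() combine. This is a minimal sketch, not the notebook's missing input cell; the variable names are hypothetical, and the last call is just one way to obtain float values like those in the output above.

```python
import numpy as np

stop_only = np.arange(10)                 # stop only: integers 0 through 9
with_step = np.arange(2, 10, 2)           # start, stop, step: 2, 4, 6, 8
as_float = np.arange(10, dtype=float)     # same values as stop_only, stored as floats

print(stop_only)
print(with_step)
print(as_float)
```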
b. linspace()
Return evenly spaced numbers over a specified interval.
Press shift+tab for the documentation.
In [13]: # start from 1 & end at 15 with 15 evenly spaced points between 1 and 15.
print(np.linspace(1, 15, 15))
type(np.linspace(1, 15, 15))
Out[13]: numpy.ndarray
In [17]: # Let's find the step size with "retstep", which returns the array and the step size
my_linspace = np.linspace(5, 15, 9, retstep=True)
my_linspace[1]
Out[17]: 1.25
Out[15]: tuple
In [18]: my_linspace[0]
Out[18]: array([ 5. , 6.25, 7.5 , 8.75, 10. , 11.25, 12.5 , 13.75, 15. ])
In [19]: my_linspace[1]
Out[19]: 1.25
Don't confuse the two!
arange() takes the step size as its 3rd argument.
linspace() takes the number of points as its 3rd argument.
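The difference is easiest to see by passing the same three numbers to both functions. The following is a small sketch with hypothetical variable names.

```python
import numpy as np

by_step = np.arange(0, 10, 5)      # third argument is the step size: values 0 and 5
by_count = np.linspace(0, 10, 5)   # third argument is the number of points: 5 values from 0 to 10

print(by_step)
print(by_count)
```

Note also that arange() excludes the stop value, while linspace() includes it by default.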
c. zeros()
d. ones()
We want to create an array with all ones
In [24]: np.ones(3)
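zeros() and ones() take the same kind of arguments. The sketch below, with illustrative variable names, shows a 1-D call, a 2-D call where the shape is given as a tuple, and the optional dtype argument.

```python
import numpy as np

zeros_1d = np.zeros(3)                  # three zeros, float by default
zeros_2d = np.zeros((2, 3))             # 2 rows x 3 columns: the shape is passed as a tuple
ones_int = np.ones((2, 3), dtype=int)   # integer ones instead of the default float

print(zeros_1d)
print(zeros_2d)
print(ones_int)
```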
e. eye()
Creates an identity matrix, which is always square and is useful in several linear algebra
problems.
Returns a 2-D array with ones on the diagonal and zeros elsewhere.
In [26]: np.eye(5)
f. diag()
f1 full()
In [31]: np.full((3,3),'hello')
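The three functions eye(), diag(), and full() can be compared side by side. This is a minimal sketch with hypothetical variable names; the input values are chosen only for illustration.

```python
import numpy as np

identity = np.eye(3)                 # 3x3 identity matrix
from_diag = np.diag([1, 2, 3])       # 1-D input: builds a 2-D array with it on the diagonal
diagonal = np.diag(from_diag)        # 2-D input: extracts the diagonal as a 1-D array
filled = np.full((2, 2), 7)          # 2x2 array in which every element is 7

print(identity)
print(from_diag)
print(diagonal)
print(filled)
```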
Random
We can also create arrays of random numbers using the built-in functions in NumPy's random
module.
Type np.random. and then press Tab to see the available options.
g. rand()
Create an array of the given shape and populate it with random samples from a uniform
distribution over [0, 1) .
In [35]: np.random.rand(3,2) # row, col; note we are not passing a tuple here, each dimension is a separate argument
h. random()
This will return random floats in the half-open interval [0, 1) following the “continuous uniform”
distribution.
In [ ]: np.random.random((4,3))
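rand() and random() draw from the same distribution and differ only in how the shape is passed. The following sketch, with illustrative names, puts the two call styles next to each other.

```python
import numpy as np

from_rand = np.random.rand(4, 3)         # dimensions as separate arguments
from_random = np.random.random((4, 3))   # shape as a single tuple

print(from_rand.shape, from_random.shape)
```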
i. randn()
Returns a sample (or samples) from the "standard normal" (Gaussian) distribution, unlike rand(),
which samples from a uniform distribution.
Press shift+tab for the documentation.
In [36]: np.random.randn(2)
sample_size = 100000
uniform = np.random.rand(sample_size)
normal = np.random.randn(sample_size)
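One way to see the difference between the two samples without plotting is to compare summary statistics. This sketch reuses the names from the cell above; the choice of statistics is an assumption, not part of the original notebook.

```python
import numpy as np

sample_size = 100000
uniform = np.random.rand(sample_size)
normal = np.random.randn(sample_size)

# Uniform samples stay inside [0, 1) with a mean near 0.5
print("uniform: min, max, mean =", uniform.min(), uniform.max(), uniform.mean())
# Standard normal samples are centred on 0 with a standard deviation near 1
print("normal:  mean, std      =", normal.mean(), normal.std())
```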
i. reshape()
Gives a new shape to an array without changing its data.
array = np.arange(8)
print("Original array : \n", array)
# Constructs 3D array
array = np.arange(8).reshape(2, 2, 2)
print("\nOriginal array reshaped to 3D : \n", array)
Original array : 
 [0 1 2 3 4 5 6 7]

Original array reshaped to 3D : 
 [[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]
Out[40]: 30
Out[41]: array([11, 86, 59, 68, 60, 77, 69, 14, 16, 29])
Attributes of a NumPy array:
ndim: returns the number of dimensions of the array
shape: returns a tuple of integers indicating the size of the array along each dimension
size: returns the total number of elements in the array
dtype: returns the type of the elements in the array, e.g. int64 or a string type
itemsize: returns the size in bytes of each element
nbytes: returns the total size (in bytes) of the array
reshape(): a method rather than an attribute; returns the array with a new shape
x3 ndim: 3
x3 shape: (3, 4, 5)
x3 size: 60
Each array has attributes ndim (the number of dimensions), shape (the size of each dimension),
and size (the total size of the array):
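The attributes can be inspected on any array. The sketch below builds a stand-in `x3` with the same shape as the output above; its contents are an assumption, since the cell that created the original `x3` is not shown.

```python
import numpy as np

x3 = np.arange(60).reshape(3, 4, 5)   # assumed contents; only the shape matches the output above

print("x3 ndim:    ", x3.ndim)
print("x3 shape:   ", x3.shape)
print("x3 size:    ", x3.size)
print("x3 dtype:   ", x3.dtype)
print("x3 itemsize:", x3.itemsize, "bytes")
print("x3 nbytes:  ", x3.nbytes, "bytes")   # equals size * itemsize
```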
In [47]: array_1d
In [48]: # In the simplest case, selecting one or more elements of NumPy array looks very
# Getting value at certain index
array_1d[0]
Out[48]: -10
In [53]: # Getting up-to and from certain index -- remember index starts from '0'
# (no need to give start and stop indexes)
array_1d[:2], array_1d[2:]
In [55]: array_1d
# The first element is changed to -102
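The indexing, slicing, and assignment steps above can be sketched in one place. The contents of `array_1d` are an assumption (integers from -10 to 10, consistent with the value -10 at index 0); the other names are illustrative.

```python
import numpy as np

array_1d = np.arange(-10, 11)     # assumed contents: integers from -10 to 10

first = array_1d[0]               # value at index 0
head = array_1d[:2]               # elements before index 2
tail = array_1d[2:]               # elements from index 2 onwards

array_1d[0] = -102                # assignment changes the array in place

print(first)
print(head, tail)
print(array_1d)
```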
An element of a 2-D array can be accessed as array_2d[row][col] or array_2d[row, col].
[[11 12 13 14]
[21 22 23 24]
[31 32 33 34]]
Use array slicing to get a 2 x 2 subarray consisting of the first 2 rows and the columns at index 1 and 2.
[[12 13]
[22 23]]
When you modify a slice, you actually modify the underlying array.
Before: 12
After: 1000
[[11 12 13 14]
[21 22 23 24]
[31 32 33 34]]
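The following sketch shows that a slice shares memory with the array it was taken from. The slice expression `an_array[:2, 1:3]` is inferred from the subarray printed above, and `independent` is a hypothetical name used to contrast the behaviour with an explicit copy.

```python
import numpy as np

an_array = np.array([[11, 12, 13, 14],
                     [21, 22, 23, 24],
                     [31, 32, 33, 34]])

a_slice = an_array[:2, 1:3]       # inferred slice: rows 0-1, columns 1-2

print("Before:", an_array[0, 1])
a_slice[0, 0] = 1000              # a_slice[0, 0] refers to the same memory as an_array[0, 1]
print("After:", an_array[0, 1])

# .copy() creates an array that does not share memory with an_array
independent = an_array[:2, 1:3].copy()
independent[0, 0] = -1
print(an_array[0, 1])             # unchanged by the write to the copy
```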
In [3]: # Using both integer indexing & slicing generates an array of lower rank
row_rank1 = an_array[1, :] # Rank 1 view
In [4]: # Slicing alone: generates an array of the same rank as the an_array
row_rank2 = an_array[1:2, :] # Rank 2 view
print()
col_rank1 = an_array[:, 1]
col_rank2 = an_array[:, 1:2]
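Printing the shapes makes the difference in rank visible. This sketch assumes `an_array` is the 3 x 4 array shown earlier.

```python
import numpy as np

an_array = np.array([[11, 12, 13, 14],
                     [21, 22, 23, 24],
                     [31, 32, 33, 34]])   # assumed to be the array used above

row_rank1 = an_array[1, :]      # integer index + slice: 1-D result
row_rank2 = an_array[1:2, :]    # slices only: stays 2-D
col_rank1 = an_array[:, 1]
col_rank2 = an_array[:, 1:2]

print(row_rank1, row_rank1.shape)
print(row_rank2, row_rank2.shape)
print(col_rank1, col_rank1.shape)
print(col_rank2, col_rank2.shape)
```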
print('Original Array:')
print(an_array)
row_indices = np.arange(4)
print('\nRows indices picked : ', row_indices)
In [ ]: # Examine the pairings of row_indices and col_indices. These are the elements we
for row,col in zip(row_indices,col_indices):
    print(row, ", ", col)
In [ ]: # Change one element from each row using the indices selected
an_array[row_indices, col_indices] += 100000
print('\nChanged Array:')
print(an_array)
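The whole integer-array-indexing sequence can be sketched end to end. Both the 4 x 3 example data and the values of `col_indices` are assumptions, since the cells that defined them are not shown; `picked` is an illustrative name.

```python
import numpy as np

an_array = np.array([[11, 12, 13],
                     [21, 22, 23],
                     [31, 32, 33],
                     [41, 42, 43]])       # assumed 4 x 3 example data

row_indices = np.arange(4)
col_indices = np.array([0, 1, 2, 0])      # assumed: one column index per row

# Pairs (0,0), (1,1), (2,2), (3,0) select one element from each row
picked = an_array[row_indices, col_indices]
print(picked)

# Change only the selected elements
an_array[row_indices, col_indices] += 100000
print(an_array)
```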
In [3]: # create a filter which will be boolean values for whether each element meets this condition
c=a > 2
print(c)
[[False False]
[ True True]
[ True True]]
Notice that c is an ndarray of the same shape as a. It is filled with True for each element
whose corresponding element in a is greater than 2, and False for each element whose
value is less than or equal to 2.
We can use these comparison expressions directly for indexing. The result contains only those
elements for which the expression evaluates to True.
In [4]: print(a[c])
print(a[c].shape)
[3 4 5 6]
(4,)
Let's see if this works with multiple conditions as well. In the process we'll also see that we
don't have to store the result in a variable before using it for subsetting. Instead, we can write
the conditional expression directly inside the brackets.
In [5]: a>2
In [6]: a<5
In [9]: print(a)
a[(a>2) | (a<5)] , a[(a>2) & (a<5)] # A, B, i.e. multiple operations in one line
[[1 2]
[3 4]
[5 6]]
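The two combined conditions select different elements, which the sketch below makes explicit. The variable names `either` and `both` are illustrative; note that each condition needs its own parentheses, because `|` and `&` bind more tightly than the comparisons.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

either = a[(a > 2) | (a < 5)]   # every element is > 2 or < 5, so all six are kept
both = a[(a > 2) & (a < 5)]     # only 3 and 4 satisfy both conditions

print(either)
print(both)
```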
print(x)
print()
print(y)
[[111 112]
[121 122]]
[[211.1 212.1]
[221.1 222.1]]
In [12]: # add
print(x + y) # The plus sign works
print()
print(np.add(x, y)) # so does the numpy function "add"
[[322.1 324.1]
[342.1 344.1]]
[[322.1 324.1]
[342.1 344.1]]
In [13]: # subtract
print(x - y)
print()
print(np.subtract(x, y))
[[-100.1 -100.1]
[-100.1 -100.1]]
[[-100.1 -100.1]
[-100.1 -100.1]]
In [14]: # multiply
print(x * y)
print()
print(np.multiply(x, y))
[[23432.1 23755.2]
[26753.1 27096.2]]
[[23432.1 23755.2]
[26753.1 27096.2]]
In [15]: # divide
print(x / y)
print()
print(np.divide(x, y))
[[0.52581715 0.52805281]
[0.54726368 0.54930212]]
[[0.52581715 0.52805281]
[0.54726368 0.54930212]]
In [16]: # square root
print(np.sqrt(x))
[[10.53565375 10.58300524]
[11. 11.04536102]]
In [17]: # exponent (e ** x)
print(np.exp(x))
[[1.60948707e+48 4.37503945e+48]
[3.54513118e+52 9.63666567e+52]]
In general, mathematical functions from NumPy (referred to as np here), when applied to an
array, return an array in which the function has been applied to each element. The functions
from the math package, on the other hand, raise an error when applied to arrays. They only
work for scalars.
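The contrast can be reproduced safely by catching the exception. This is a minimal sketch that assumes `x` is the 2 x 2 integer array used in the arithmetic examples; `elementwise`, `scalar_root`, and `raised` are hypothetical names.

```python
import math

import numpy as np

x = np.array([[111, 112], [121, 122]])   # assumed to match the x used above

elementwise = np.sqrt(x)             # NumPy applies the function to every element
scalar_root = math.sqrt(x[0, 0])     # math.sqrt works on a single element

try:
    math.sqrt(x)                     # a whole array cannot be converted to one scalar
    raised = False
except TypeError as err:
    raised = True
    print("TypeError:", err)

print(elementwise)
print(scalar_root)
```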
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-19-f63d9241fcd6> in <module>
1 # square root
2 import math
----> 3 math.sqrt(x)
np.dot() in NumPy
The behavior of np.dot(a, b) depends on the dimensions of its arguments:
If both a and b are 1-D (one-dimensional) arrays, it is the inner product of the two vectors
(without complex conjugation).
If both a and b are 2-D (two-dimensional) arrays, it is matrix multiplication.
A ValueError is raised if the last dimension of a is not the same size as the second-to-last dimension of b.
Out[22]: 219
You can see that the result is a single number rather than the matrix you might expect from
matrix multiplication. This happens because a one-dimensional array is not a matrix.
In [23]: print(v.shape)
print(w.shape)
(2,)
(2,)
In [26]: v=v.reshape((1,2))
w=w.reshape((1,2))
v
Now if you simply try v.dot(w) or np.dot(v, w) (both are the same), you will get an error, because
you cannot multiply a matrix of shape 1x2 with another matrix of shape 1x2.
In [27]: np.dot(v,w)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-27-efb51945670c> in <module>
----> 1 np.dot(v,w)
matrix v : [[ 9 10]]
matrix v Transpose: [[ 9]
[10]]
matrix w: [[11 12]]
matrix w Transpose: [[11]
[12]]
~~~~~~~~~ v multiply with transpose of w
[[219]]
~~~~~~~~~ transpose of v is multiply by w
[[ 99 108]
[110 120]]
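The full sequence, from 1-D vectors to reshaped row vectors and their transposes, can be sketched as follows. The values of `v` and `w` are taken from the output above; the names `v2`, `w2`, and the result variables are illustrative.

```python
import numpy as np

v = np.array([9, 10])
w = np.array([11, 12])

inner = np.dot(v, w)               # both 1-D: inner product, a single number

v2 = v.reshape((1, 2))             # row vectors of shape (1, 2)
w2 = w.reshape((1, 2))

row_times_col = np.dot(v2, w2.T)   # (1, 2) x (2, 1) gives shape (1, 1)
col_times_row = np.dot(v2.T, w2)   # (2, 1) x (1, 2) gives shape (2, 2)

try:
    np.dot(v2, w2)                 # (1, 2) x (1, 2): inner dimensions do not match
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(inner)
print(row_times_col)
print(col_times_row)
```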
If you leave v as a one-dimensional array, np.dot treats it as a vector and returns the
matrix-vector product: a one-dimensional array with one sum of products per row of x.
Here is an example.
In [29]: print(x)
v=np.array([9,10])
print("~~~~~")
print(v)
x.dot(v)
[[111 112]
[121 122]]
~~~~~
[ 9 10]
In [30]: print(x)
print("~~~")
print(y)
x.dot(y)
[[111 112]
[121 122]]
~~~
[[211.1 212.1]
[221.1 222.1]]
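Since the outputs of the last two cells are not shown, the sketch below recomputes both products and contrasts them with element-wise multiplication. It assumes `x`, `y`, and `v` hold the values printed above; the result names are hypothetical.

```python
import numpy as np

x = np.array([[111, 112], [121, 122]])
y = np.array([[211.1, 212.1], [221.1, 222.1]])
v = np.array([9, 10])

matrix_vector = x.dot(v)   # 2-D times 1-D: one sum of products per row of x
matrix_matrix = x.dot(y)   # 2-D times 2-D: matrix multiplication
elementwise = x * y        # for comparison: multiplies matching elements only

print(matrix_vector)
print(matrix_matrix)
print(elementwise)
```

The first row of `matrix_vector` is 111*9 + 112*10, which shows that dot() sums the products rather than returning them individually.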