Chapter 11
Tools for Scientific Computing
The purpose of this book is to prepare you to work in data science, the natural sciences and
engineering. We have addressed programming concepts and the Python language. In this
chapter we will examine some of the tools used in scientific and engineering computations.
11.1 Numpy
Assume that you have collected temperature data for a project and that you have 10,000
data points. It would be absurd to create 10,000 different variables to hold the
values: a, b, c, ..., aa, ab, ac, .... Using just lower case letters, you would need to
go up to three-letter variable names (aaa, aab, aac, ...). The concern is not running out
of variable names. The concern is that there is no good way to access all of those
variables: they cannot easily be accessed in a loop.
What we want is something like the vector idea found in math courses where you have
x_1, x_2, x_3, ..., x_10000. Most programming languages support this through a data structure
called an array. These languages extend the basic datatypes to array versions, meaning
you can have arrays of integers, floats and other more exotic data containers. These
arrays must have a uniform data type for the entire array. Python does not include a basic
array structure. It does have a more flexible structure known as a list, which allows for
elements of different data types. Lists were covered in an earlier chapter.
To get a traditional array in Python, we use the NumPy module, which is a collection of
objects used to create and manipulate numerical data. NumPy provides an extension package
to Python for multi-dimensional arrays. Numpy is written in C and is often optimised
for special hardware, so core Numpy routines are often at least an order of magnitude faster
than the equivalent pure Python loops. The design of the package reflects an array-oriented
computing view and is aimed at scientific computing. Although Python can be rather
slow, Numpy can achieve speeds associated with compiled languages like C.
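As a rough illustration of the speed difference, the sketch below times a sum of squares computed with a plain Python loop and with Numpy. The function names and the problem size are chosen only for this example, and the timings will vary from machine to machine.

```python
import timeit
import numpy as np

n = 100_000
values = list(range(n))                 # ordinary Python list
array = np.arange(n, dtype=np.int64)    # Numpy array holding the same numbers

def python_sum_of_squares():
    # One multiplication per element, executed by the Python interpreter
    return sum(v * v for v in values)

def numpy_sum_of_squares():
    # The loop runs inside Numpy's compiled code
    return int(np.sum(array * array))

t_python = timeit.timeit(python_sum_of_squares, number=10)
t_numpy = timeit.timeit(numpy_sum_of_squares, number=10)
print("pure Python:", t_python, "seconds")
print("Numpy:      ", t_numpy, "seconds")
```

Both functions return the same value; only the time taken differs.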
Information and downloads for Numpy are at https://numpy.org/. Numpy is part of the
Anaconda distribution.
We begin by creating a simple array of 32 bit integers.

    import numpy as np
    a = np.array([0, 1, 2, 3])
    print(a)

    [0 1 2 3]
Note that when printing the array it looks like a list, but without the commas separating
the values.
We can get information about the datatype through

    print(a.dtype)

    int32

The default integer size depends on the platform and the Numpy version, so you may see
int64 instead.
An example of a two dimensional array:

    b = np.array([[1, 2], [3, 4]])
    print(b)

    [[1 2]
     [3 4]]
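A two dimensional array carries its dimensions with it. The short sketch below uses a made-up 2 by 3 array, named m here only for illustration, to show the shape and ndim attributes and row, column indexing.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])   # illustrative values: 2 rows, 3 columns

print(m.shape)    # (rows, columns)
print(m.ndim)     # number of dimensions
print(m[1, 0])    # element in row 1, column 0
print(m[0])       # the whole first row
```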
Numpy arrays are zero indexed. This means that the index for the array starts with 0 and
not 1. Access is via square brackets as shown here:

    a = np.array([17, 39, 23, 20])
    print(a[2])

    23
We typically work with integer or floating point arrays. If all the elements are integers then
that will be the datatype for the whole array, otherwise we get an array of floats:

    a = np.array([17, 39, 23, 20.0])
    a

    array([17., 39., 23., 20.])
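The datatype rule can be checked directly. In this sketch the array names are only illustrative; the last line shows the dtype argument, which requests a specific type such as 32 bit integers instead of relying on the default.

```python
import numpy as np

ints = np.array([1, 2, 3])                    # all integers
mixed = np.array([1, 2, 3.5])                 # one float promotes the whole array
small = np.array([1, 2, 3], dtype=np.int32)   # explicitly request 32 bit integers

print(ints.dtype, mixed.dtype, small.dtype)
```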
In practice we don’t hand enter the data. It is either read in from a file (see chapter 10) or
it is created by an array generation function. Some very useful functions to create Numpy
arrays follow.
* To create a range of numbers starting at a, up to but not including b, with a separation
of step,
    np.arange(a,b,step)
* To create a range of N evenly spaced numbers starting at a and ending at b (inclusive),
    np.linspace(a,b,N)
* To create an array of zeros with shape (n,m),
    np.zeros((n,m))
* To create an array of ones with shape (n,m),
    np.ones((n,m))
* To create an array of random numbers between 0 and 1 with shape (n,m),
    np.random.rand(n,m)
Please experiment to get a feel for Numpy arrays.

    print(np.linspace(0, 5, 500))
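As a starting point for experimenting, the following sketch calls each of the generation functions with small, arbitrarily chosen arguments so the results are easy to inspect by eye.

```python
import numpy as np

r = np.arange(0, 1, 0.25)      # 0, 0.25, 0.5, 0.75 (the end value 1 is excluded)
s = np.linspace(0, 1, 5)       # 0, 0.25, 0.5, 0.75, 1 (the end value 1 is included)
z = np.zeros((2, 3))           # 2 rows, 3 columns of 0.0
o = np.ones((2, 3))            # 2 rows, 3 columns of 1.0
u = np.random.rand(2, 3)       # uniform random values in [0, 1)

for name, arr in [("arange", r), ("linspace", s), ("zeros", z), ("ones", o), ("rand", u)]:
    print(name, arr.shape)
    print(arr)
```

Note the difference between arange and linspace: the first takes a step size and excludes the end point, the second takes a number of points and includes it.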
We saw above that one can access array elements using the square brackets. This can be
used to assign values:

    a = np.zeros(5)
    a[2] = 612
    print(a)

    [  0.   0. 612.   0.   0.]
Numpy has many features for array access and manipulation. Parts of an array
are known as slices. One such slice is the range operation array[start:end:step] (begin at
"start", go up to but not including "end", and use "step" as the stride):

    a = np.arange(10)
    print(a)

    [0 1 2 3 4 5 6 7 8 9]

    print(a[2:5])

    [2 3 4]
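A few more slice forms, sketched on the same kind of array. Omitting start, end or step falls back to the beginning, the end, and a stride of 1, and a negative step walks backwards. The variable names are illustrative.

```python
import numpy as np

a = np.arange(10)

first_three = a[:3]      # start omitted: begin at index 0
evens = a[::2]           # every second element
strided = a[2:8:3]       # indices 2 and 5 (8 is not reached)
backwards = a[::-1]      # negative step reverses the array

print(first_three, evens, strided, backwards)
```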
The slice is a view of the data, not a copy.

    a = np.random.randint(0, 55, 15)   # random array
    print("a = ", a)                   # print this out
    d = a[3:6]                         # grab a slice
    print("d = ", d)                   # print out the slice
    d[0] = 100                         # set one of the elements in the slice
    print("modified d = ", d)          # print d again to see the modification
    print("a = ", a)                   # print a to see if it was changed

    a =  [47 10  2 50 12  6 34 29 43 14 44 48 28 32 40]
    d =  [50 12  6]
    modified d =  [100  12   6]
    a =  [ 47  10   2 100  12   6  34  29  43  14  44  48  28  32  40]
To make a copy you should use the copy method so you don’t modify the original array:

    d = a[3:6].copy()
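The difference between a view and a copy can be checked with np.shares_memory. This is a small sketch with illustrative names, not part of the example above.

```python
import numpy as np

a = np.arange(10)
view = a[3:6]            # a slice: shares data with a
copied = a[3:6].copy()   # an independent copy

view[0] = 100            # changes a[3] as well
copied[1] = -1           # leaves a untouched

print(a)
print(np.shares_memory(a, view))     # True
print(np.shares_memory(a, copied))   # False
```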
Arrays and loops go hand in hand. A for loop is an easy way to access or modify the
elements of an array. For example, fill an array of length 100 with 1/(i + 1), where i is
the index.

    x = np.zeros(100)
    for i in range(100):
        x[i] = 1/(i+1)
11.2 Numpy operations
Python is known to be slow. This is partly due to its dynamic variables and the
challenges of optimization in this environment. To compensate, Numpy has a number of
element-wise operations built in. The basic binary operators are overloaded, which means that
when Python sees x + y for two arrays it knows to call the element-wise addition function
(under the hood). Automatic element-wise operations include
* Addition of arrays: x + y
* Addition with constants: x + 10
* Scalar multiplication: c*x
* Array multiplication: a*b
* Matrix multiplication (dot): np.dot(x,y)
* Functions: x*x + 2*x + 3, np.sin(x), np.exp(x)
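The operations in the list can be tried on two short arrays. The values below are arbitrary and chosen so the results are easy to verify by hand. Note that x*y multiplies element by element, while np.dot(x, y) of two one dimensional arrays returns a single number, the sum of those products.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0, 30.0])

print(x + y)            # element-wise sum: 11, 22, 33
print(x + 10)           # the constant is added to every element: 11, 12, 13
print(2 * x)            # scalar multiplication: 2, 4, 6
print(x * y)            # element-wise product: 10, 40, 90
print(np.dot(x, y))     # 1*10 + 2*20 + 3*30 = 140
print(x*x + 2*x + 3)    # polynomial applied to each element: 6, 11, 18
print(np.sin(x), np.exp(x))
```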
These operations are very fast. This is where the power of Numpy starts to emerge. If you
want to create a collection of points (x_k, y_k) where 0 ≤ x_k ≤ 4π, y_k = x_k + e^(x_k) sin(x_k),
with 0 ≤ k < 200, the traditional element-wise approach would be

    for i in range(N):
        y[i] = x[i] + math.exp(x[i])*math.sin(x[i])
Using the overloaded operators in Numpy, we can rewrite this using the implicit element-wise
operations:

    import numpy as np
    N = 200
    x = np.linspace(0, 4*np.pi, N)
    y = x + np.exp(x)*np.sin(x)
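To confirm that the loop and the array expression compute the same points, both can be run side by side and compared with np.allclose. This is a self-contained sketch; the names y_loop and y_vec are introduced here only for the comparison.

```python
import math
import numpy as np

N = 200
x = np.linspace(0, 4 * np.pi, N)

# Traditional approach: one element at a time
y_loop = np.zeros(N)
for i in range(N):
    y_loop[i] = x[i] + math.exp(x[i]) * math.sin(x[i])

# Numpy approach: the whole array at once
y_vec = x + np.exp(x) * np.sin(x)

print(np.allclose(y_loop, y_vec))
```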
Because these arrays are objects, they have methods associated with them.

    x.sum()
    x.mean()
    x.std()    # standard deviation
    x.max()
    x.min()
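Applied to a small array with arbitrary values, these methods give results that are easy to check by hand. One detail worth knowing is that std() by default divides by N (the population standard deviation), not by N - 1.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # small illustrative array

print(x.sum())     # 10.0
print(x.mean())    # 2.5
print(x.std())     # square root of 1.25, dividing by N
print(x.max())     # 4.0
print(x.min())     # 1.0
```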
Elementwise logical operations and comparisons can be done with Numpy arrays as well.
We will see more Numpy in a later section on plotting.
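A brief sketch of element-wise comparisons: comparing an array with a number produces an array of True/False values, which can be combined with & (and) or | (or) and used to select elements. The temperature values are made up for the example.

```python
import numpy as np

t = np.array([12.5, 18.0, 21.3, 9.8, 25.1])   # hypothetical temperatures

warm = t > 15                  # boolean array, one entry per element
mild = (t > 10) & (t < 22)     # element-wise "and"; the parentheses are required

print(warm)
print(t[warm])                 # only the elements where warm is True
print(np.sum(mild))            # True counts as 1, so this counts the matches
```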
11.3 Random Values
When doing simulations of natural systems, or testing code, or performing numerical
optimization, it is important to have access to random numbers. Python and Numpy provide
values sampled from a variety of distributions. The generation of random values is a whole
separate subject and one needs to keep in mind that the values produced by these routines
are not truly random and are not appropriate for cryptographic level randomness. However,
many applications work very well with pseudo-random sequences. The Random and
Numpy Random libraries both use the Mersenne twister sequence to generate their values
and in most cases only one library is required.
The Random number library is accessed via import random and the Numpy random
routines are brought in with import numpy. The Random library is accessed
via random.function() and the Numpy random sample library is accessed via
numpy.random.function().
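The sketch below shows both access patterns side by side. The seed value 42 is arbitrary; seeding makes a pseudo-random sequence repeatable, which is useful when testing code. One difference to watch for: random.randint includes the upper limit, while numpy.random.randint excludes it.

```python
import random
import numpy as np

# Python's random module
random.seed(42)
die = random.randint(1, 6)       # integer from 1 to 6, both ends included
u = random.random()              # float in [0, 1)

# Numpy's random routines
np.random.seed(42)
r = np.random.rand(3)                     # three floats in [0, 1)
dice = np.random.randint(1, 7, size=5)    # integers from 1 to 6 (7 is excluded)

# Re-seeding reproduces the same sequence
random.seed(42)
same_die = random.randint(1, 6)
np.random.seed(42)
same_r = np.random.rand(3)

print(die, same_die)
print(r, same_r)
```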
A few routines in Random:
* randint(low, high) - Produces a random integer N where low <= N <= high.

R1C1, R1C2, R1C3, R1C4 \n
R1C1 | R1C2 | R1C3 | R1C4
R3C1, R3C2, R3C3, R3C4 \n
Assume the CSV file looks like:
4,6.4,5.5,2.3,9.4
14.6,84.7,17.9,4.12
4,2,4,-4.8,66
0.88
1.2,
44
Recall how to read a text file:

    fin = open('data.csv')
    for line in fin:
        row = line.strip()
        print(row)
    fin.close()
How would we get these values loaded into a Numpy two-dimensional array? The first problem
is that a row is a single string, not a list of values. The second is that the values come in as
text.

    fin = open('data.csv')
    for line in fin:
        row = line.strip()
        items = row.split(',')
        print(items)
    fin.close()
Each line needs to be split and the data converted to floats. Then it can be loaded into a
Numpy array. If we know the size of the matrix, then it is fairly straightforward:
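
    data = np.empty((3, 5), dtype=float)
    fin = open('data.csv')
    rows = 0
    for line in fin:
        row = line.strip()
        items = row.split(',')
        cols = 0
        for item in items:
            data[rows, cols] = float(item)   # convert the text to a float
            cols += 1
        rows += 1
    fin.close()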
If you don’t know the dimensions then a small modification will work:

    data = np.array([], dtype='float')
    fin = open('data.csv')
    for line in fin:
        row = line.strip()
        items = row.split(',')
        ncols = len(items)
        data = np.append(data, np.array(items, dtype='float'))
    fin.close()
    data = data.reshape(-1, ncols)   # one row per line of the file
Or you can just do

    from numpy import genfromtxt
    data = genfromtxt('data.csv', delimiter=',')

    from pandas import read_csv
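To try the manual approach and genfromtxt without creating a file, the sketch below reads from an in-memory string through io.StringIO, which behaves like an open file. The two-row CSV text is made up for the example; with a real file you would pass the file name instead.

```python
import io
import numpy as np

csv_text = "1.0,2.0,3.0\n4.0,5.0,6.0\n"   # hypothetical file contents

# Manual approach: split each line and convert the pieces to floats
rows = []
for line in io.StringIO(csv_text):
    items = line.strip().split(',')
    rows.append([float(item) for item in items])
data = np.array(rows)

# Library approach: genfromtxt does the splitting and conversion
loaded = np.genfromtxt(io.StringIO(csv_text), delimiter=',')

print(data)
print(np.array_equal(data, loaded))
```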
11.5 Matplotlib
There are a number of plotting options for Python. One of the oldest plotting packages,
and part of the SciPy collection, is Matplotlib. You can think of this as an array visualization
tool and not just a tool to produce graphs. This section will focus on traditional 2D
plots with the goal of publication quality plots. Other applications such as 3D plots, animated
plots, etc. will not be addressed here. We have included references to the Matplotlib
documentation at the end.
In order to use the routines you need to import the plotting package:

    import matplotlib.pyplot as plt

For our first example, we will plot a simple data set:

    x = np.array([0,1,2,3,4])
    y = np.array([1,4,3,7,9])
    plt.plot(x,y)
    plt.show()
The resulting plot is shown in Figure 11.1(a). It is that simple. The general idea is that you
create the x and y arrays from your code (data analysis, simulation, etc.). These are passed
into the plot command combined with any decorations you want. Matplotlib then renders
the graphics. Plotting works differently in Jupyter Lab than in the interpreter: Jupyter Lab
attempts to render the output of any plot command, even a title command. So, you will normally
place all of your plot commands related to a given plot in a single cell.
Another example is given in Figure 11.1(b).

    x = np.linspace(0,1,100)
    y = x*(1-x)   # Inverted parabola
    plt.plot(x,y, color = 'green', linestyle = 'dotted')
    plt.show()
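As a sketch of the "decorations" mentioned above, the same curve can be given axis labels, a title and a legend, and written to a file instead of shown on screen. The file name parabola.png and the label text are choices made for this example, and the Agg backend is selected only so the script also runs where no display is available.

```python
import matplotlib
matplotlib.use("Agg")              # non-interactive backend: render to a file
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 100)
y = x * (1 - x)                    # inverted parabola

fig, ax = plt.subplots()
ax.plot(x, y, color="green", linestyle="dotted", label="x(1-x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Inverted parabola")
ax.legend()
fig.savefig("parabola.png", dpi=150)   # hypothetical output file name
```

In an interactive session or a Jupyter cell you would drop the matplotlib.use line and finish with plt.show() instead of savefig.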