Numerical Python Modules
08/09/2000
This month we look under the hood of Numerical Python (NumPy) and explore a more detailed list of features and capabilities. The heart of Numerical Python is the multiarray datatype we have explored so far, but the NumPy distribution is really a package of modules that enable the programmer to make effective use of the multiarray concept. NumPy includes support for linear algebra functions, Fast Fourier Transforms, and random number generation, and provides help for users familiar with the commercial matrix manipulation package MATLAB. Let's look at the basics of NumPy's add-on modules in preparation for getting our hands really dirty next month. It's the accessories that make NumPy fly.
Under the hood
The term broadcasting in NumPy is used to refer to smaller dimensioned multiarrays being either replicated or extended in a natural way to work with higher dimensioned multiarrays. While we'll leave the details of broadcasting to next month, understanding this concept is crucial to pushing NumPy to its full potential. As a preface to next month's discussion of the complex rules of broadcasting, some basics of multiarray manipulation need exploration -- specifically indexing, slicing, and ellipses.
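As a small preview of the idea, here is a minimal sketch of a one-dimensional array being extended to work with a two-dimensional one. It assumes the modern numpy package (imported as np) and the Python 3 print function rather than the Numeric module and print statement used in this article's sessions; the names grid and row are made up for illustration.

```python
import numpy as np

# A 2x3 multiarray and a smaller one-dimensional array of length 3.
grid = np.array([[0, 1, 2],
                 [10, 11, 12]])
row = np.array([100, 200, 300])

# The one-dimensional array is applied to each row of the larger array.
total = grid + row
print(total)
```

The result has the shape of the larger array, with row added to each of its two rows.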
With a multiarray in hand, it turns out there are flexible rules for indexing and slicing.
Given a matrix of integers:
>>> a = array(((0,1,2,3,4,5,6,7,8,9), (9,8,7,6,5,4,3,2,1,0)))
>>> print a
[[0 1 2 3 4 5 6 7 8 9]
 [9 8 7 6 5 4 3 2 1 0]]
let's see what we can do with the array as entered.
There are two styles of indexing, either [][] or [,]. (While we'll limit the discussion to only two dimensions, the concepts readily extend to multiarrays of higher orders.)
>>> print a[1][1]
8
>>> print a[1,1]
8
The difference between the two indexing styles can be characterized as cascaded extraction versus direct. Think of the first method, [][], as (a[1])[1]. The first operation, (a[1]), returns the entire second row of the a multiarray; the second, [1], then indexes into the second element of the extracted row. The extraction "cascades." Think of the second method, [,], as a[(1,1)]. It uses the tuple as an array address and retrieves the element directly. Of the two methods, the better one to use is the second ([,]). It extends to more complex cases as expected, whereas the first method can lead to unexpected results.
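The two styles can be compared side by side. The following is a minimal sketch, assuming the modern numpy package (imported as np) and the Python 3 print function in place of the Numeric module used in this article; the intermediate names are made up for illustration.

```python
import numpy as np

a = np.array(((0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
              (9, 8, 7, 6, 5, 4, 3, 2, 1, 0)))

second_row = a[1]         # cascaded, step 1: extract the whole second row
cascaded = second_row[1]  # cascaded, step 2: index into the extracted row
direct = a[1, 1]          # direct: the pair (1, 1) addresses one element
by_tuple = a[(1, 1)]      # the same address with the tuple written out

print(cascaded, direct, by_tuple)  # prints 8 8 8
```

For a plain integer address like this one, all three forms agree. The difference only shows up once slices are involved.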
Following is an example of slicing. In this case the snippet a[:,2] extracts the third column (index 2, counting from zero) from all rows. Remember, for a two-dimensional matrix the indexes are [rows,cols]. The : operator acts as a "slice" operator. It can be used in several forms: alone, [:], meaning select everything; with end points, [start:stop], meaning select everything from start up to but not including stop; or with a step, [start:stop:step], meaning select everything from start up to but not including stop with a given increment (step) value.
>>> print a[:,2]
[2 7]
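The three slice forms can be tried on the same array. This is a minimal sketch, assuming the modern numpy package (imported as np) and the Python 3 print function rather than the Numeric module used above; the result names are made up for illustration.

```python
import numpy as np

a = np.array(((0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
              (9, 8, 7, 6, 5, 4, 3, 2, 1, 0)))

everything = a[:, :]    # a bare ":" on both axes selects the whole array
column = a[:, 2]        # all rows, column index 2
middle = a[0, 2:5]      # first row, indexes 2, 3, 4 (stop is excluded)
stepped = a[1, 0:10:3]  # second row, every third element from index 0

print(column)   # [2 7]
print(middle)   # [2 3 4]
print(stepped)  # [9 6 3 0]
```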
The next example shows that [][] is not the same as [,] when slicing is involved. This snippet uses the cascaded form, a[:][2], where the previous example used a[:,2].
>>> print a[:][2]
Traceback (innermost last):
  File "<stdin>", line 1, in ?
IndexError: index out of bounds
Why? Well, think of evaluating a[:][2] from left to right. The a[:] part just returns all of a; now doing [2] on that result will actually attempt to return the row at index 2 (the third row, since counting starts from zero), which is nonexistent! Thus the exception.
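The failure is easy to reproduce and to contrast with the direct form. The sketch below assumes the modern numpy package (imported as np) and Python 3 rather than the Numeric module used above; the exact wording of the error message differs between versions, so only the exception type is relied on.

```python
import numpy as np

a = np.array(((0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
              (9, 8, 7, 6, 5, 4, 3, 2, 1, 0)))

whole = a[:]        # a bare slice selects every row
print(whole.shape)  # (2, 10)

try:
    a[:][2]         # asks for row index 2 of a two-row array
    outcome = "no error"
except IndexError as exc:
    outcome = "IndexError"
    print("IndexError:", exc)

column = a[:, 2]    # direct form: all rows, column index 2
print(column)       # [2 7]
```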
Using more complex forms of the slice operator, the following example gives the elements at indexes 1, 3, and 5 of the first row (which here happen to hold the values 1, 3, and 5):
>>> print a[0,1:7:2]
[1 3 5]
When dealing with larger multi-dimensional arrays, ellipses (...) can be used to cover an indeterminate number of slices. Ellipses are essentially a shorthand, filling in for unspecified dimensions. Only the first, left-most ellipsis can be expanded. The ellipses will expand to the number of unspecified dimensions that remain.
As an example, if a is an array of exactly four dimensions, then both sides of the following comparison select the same elements:
>>> a[5,:,:,1] == a[5,...,1]
This shows the use of ... as a substitute for the :,: (select all) slices of the intermediate dimensions.
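To check the equivalence on a concrete case, here is a minimal sketch for a four-dimensional array. It assumes the modern numpy package (imported as np) and Python 3 rather than the Numeric module used above; the test array b and its shape are made up for illustration.

```python
import numpy as np

# A hypothetical four-dimensional test array of shape (6, 2, 3, 4).
b = np.arange(6 * 2 * 3 * 4).reshape(6, 2, 3, 4)

explicit = b[5, :, :, 1]      # both intermediate slices written out
with_ellipsis = b[5, ..., 1]  # the ellipsis fills in the two middle axes

same = np.array_equal(explicit, with_ellipsis)
print(explicit.shape, same)   # (2, 3) True
```

Here np.array_equal reduces the comparison to a single True or False, whereas == on two arrays compares element by element.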
Go ahead and create some test arrays and try your hand at indexing and slicing. More examples can be found in the NumPy documentation if you need further guidance. Getting the hang of indexing and slicing is a good start towards more powerful NumPy usage.