Data Types [02]: Numpy Array Basics

4 minute read

Published:

NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy dimensions are called axes.


Array Creation

  • np.array(tuple/list, dtype = int64/float64/complex/...)
  • np.zeros(tuple/list, dtype = int64/float64/complex/...)
  • np.ones(tuple/list, dtype = int64/float64/complex/...)
  • np.empty(tuple/list, dtype = int64/float64/complex/...)
  • np.arange(start, end, space)
  • np.linspace(start, end, number)

Basic Operations & Functions

  • Arithmetic operators (+, -, *, /, **, etc.) on arrays apply elementwise.
  • Matrix product can be performed using the @ operator or the dot function
  • a.sum(axis=n), a.min(axis=n), a.min(axis=n), a.cumsum(axis=n)
    a = np.arange(12).reshape(3, 2, 2)
      
    a
    Out[124]: 
    array([[[ 0,  1],
            [ 2,  3]],
    
           [[ 4,  5],
            [ 6,  7]],
    
           [[ 8,  9],
            [10, 11]]])
            
    a.sum(axis=0)
    Out[120]: 
    array([[12, 15],
           [18, 21]])
    
    a.sum(axis=1)
    Out[121]: 
    array([[ 2,  4],
           [10, 12],
           [18, 20]])
    
    a.sum(axis=2)
    Out[122]: 
    array([[ 1,  5],
           [ 9, 13],
           [17, 21]])
       
    b.cumsum(axis=0)
    Out[123]: 
    array([[[ 0,  1],
            [ 2,  3]],
    
           [[ 4,  6],
            [ 8, 10]],
    
           [[12, 15],
            [18, 21]]], dtype=int32)      
    
  • prod, cumprod, diff
  • dot, vdot
  • all, any
  • argmax, argmin
  • average, mean, median
  • corrcoef, cov, std, var
  • cross, inner, outer
  • ceil, floor, round
  • transpose, trace
  • sort

Indexing & Slicing

  • One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python sequences.
  • Multidimensional arrays can have one index per axis. These indices are given in a tuple separated by commas.
  • Iterating over multidimensional arrays is done with respect to the first axis. if one wants to perform an operation on each element in the array, one can use the flat attribute which is an iterator over all the elements of the array. ```python for element in b.flat: print(element)

Out[143]:
0 1 2 3 4 5 6 7 8 9 10 11


---
## Shape Manipulation
- `a.ravel()`: return a flattened array by row
- `a.reshape()`
- `a.T`
- `a.resize()`: modifies the array itself
- `np.stack()`
- `np.vstack((a, b))` 
- `np.hstack((a, b))`
- `np.column_stack((a, b))`: stacks 1D arrays as columns into a 2D array, equivalent to hstack only for 2D arrays
- `np.row_stack((a, b))`: equivalent to vstack
- `np.c_`, `np.r_`
  - If the index expression contains comma separated arrays, then stack them along their first axis.
  - If the index expression contains slice notation or scalars then create a 1-D array with a range indicated by the slice notation.
  - Optional character strings placed as the first element of the index expression can be used to change the output.
    - The strings `'r'` or `'c'` result in matrix output.
    - A string integer specifies which axis to stack multiple comma separated arrays along.
    - A string of two comma-separated integers allows indication of the minimum number of dimensions to force each entry into as the second integer.
    
    ```python
    np.r_['0,1', [1,2,3], [4,5,6]]
    Out[164]: array([1, 2, 3, 4, 5, 6])

    np.r_['0,2', [1,2,3], [4,5,6]]
    Out[165]: 
    array([[1, 2, 3],
           [4, 5, 6]])

    np.r_['0,3', [1,2,3], [4,5,6]]
    Out[166]: 
    array([[[1, 2, 3]],
           [[4, 5, 6]]])
    ```
    
    - A string with three comma-separated integers allows specification of the axis to concatenate along, the minimum number of dimensions to force the entries to, and which    axis should contain the start of the arrays which are less than the specified number of dimensions.
    
    ```python
    np.r_['0,2,0', [1,2,3], [4,5,6]]
    Out[193]: 
    array([[1],
           [2],
           [3],
           [4],
           [5],
           [6]])
    ```
    To understand this, first, let put the entries into a two dimensional arrays:
    ```
    [[1],
     [2],
     [3]],
    [[4],  
     [5],
     [6]]
    ```
    The third parameter indicates that the concatenation should start from axis 0, which remains unchanged. Finally, concatenate along axis 1, that results
    ```
    [[1],
     [2],
     [3],
     [4],  
     [5],
     [6]]
    ```
    Let's look at another example:
    ```python
    np.r_['0,2,1', [1,2,3], [4,5,6]]
    Out[198]: 
    array([[1, 2, 3],
           [4, 5, 6]])    
    ```
    The third parameter indicates that the concatenation should start from axis 1, which results:
    ```
    [[1,2,3]],
    [[4,5,6]]
    ```
    Finally, concatenate along axis 1, that results
    ```
    [[1,2,3],
     [4,5,6]]
    ```

---
## Splitting Arrays
- `np.split()`
- `np.hsplit()`
- `np.vsplit()`

---
## Save & Load Numpy Arrays
- `load()` and `save()`: handle NumPy binary files with a `.npy` file extension
```python
a = np.array([1, 2, 3, 4, 5, 6])
np.save('filename', a)
b = np.load('filename.npy')

b
Out[256]: array([1, 2, 3, 4, 5, 6])
  • loadtxt(), savetxt(): handle normal text files ```python np.savetxt(‘filename.csv’, a) c = np.loadtxt(‘filename.csv’)

c Out[258]: array([1., 2., 3., 4., 5., 6.]) ```


Comments