{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Scientific Computing: NumPy\n", "\n", "NumPy (**Num**erical **Py**thon) is one of the most widely used Python packages. In fact, numerous scientific computing packages use NumPy's array objects as a standard interface for data exchange. That's why understanding NumPy arrays and array-based computing principles is crucial. \n", "\n", "NumPy offers a wide range of efficient functions for creating and manipulating numerical arrays. Unlike Python lists, which can hold values of different types within a single list, NumPy arrays require all elements to share one data type, which is what makes efficient mathematical operations possible. As a result, NumPy arrays typically run faster and use less memory than Python lists. Specifying the data type of an array also lets you control how much memory each element occupies.\n", "\n", ":::{.callout-note}\n", "Documentation for this package is available at https://numpy.org/doc/stable/.\n", ":::\n", "\n", "To use NumPy in your code, you typically import it with the alias `np`:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.567834Z", "iopub.status.busy": "2026-01-19T18:20:36.567535Z", "iopub.status.idle": "2026-01-19T18:20:36.640413Z", "shell.execute_reply": "2026-01-19T18:20:36.639730Z" } }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Creating NumPy Arrays\n", "\n", "Arrays are the fundamental data structure in NumPy. An array is a grid of values, together with information about the raw data, how to locate an element, and how to interpret it. All elements share a common data type, known as the array `dtype`.\n", "\n", "One way to initialize a NumPy array is from a Python list, using nested lists for two- or higher-dimensional data." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.643751Z", "iopub.status.busy": "2026-01-19T18:20:36.643429Z", "iopub.status.idle": "2026-01-19T18:20:36.647391Z", "shell.execute_reply": "2026-01-19T18:20:36.646723Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1D array: [1 2 3 4 5 6]\n" ] } ], "source": [ "a = np.array([1, 2, 3, 4, 5, 6])\n", "print(\"1D array:\", a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can access individual elements through indexing, which starts at 0. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.685200Z", "iopub.status.busy": "2026-01-19T18:20:36.684915Z", "iopub.status.idle": "2026-01-19T18:20:36.691991Z", "shell.execute_reply": "2026-01-19T18:20:36.691263Z" } }, "outputs": [ { "data": { "text/plain": [ "np.int64(1)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Arrays are N-dimensional (which is why they are sometimes called `ndarray`). This means that NumPy arrays encompass vectors (1D), matrices (2D) and tensors (3D and higher). We can get information about an array by checking its attributes. 
To create a 2D array, we can use nested lists:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.694408Z", "iopub.status.busy": "2026-01-19T18:20:36.694193Z", "iopub.status.idle": "2026-01-19T18:20:36.697191Z", "shell.execute_reply": "2026-01-19T18:20:36.696558Z" } }, "outputs": [], "source": [ "a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Mathematically, we can think of this as a matrix with 2 rows and 4 columns, i.e.,\n", "\n", "$$a=\\begin{bmatrix}1 & 2 & 3 & 4 \\\\ 5 & 6 & 7 & 8 \\end{bmatrix}$$\n", "\n", "We can check its attributes to get more information about the array:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.700116Z", "iopub.status.busy": "2026-01-19T18:20:36.699913Z", "iopub.status.idle": "2026-01-19T18:20:36.703937Z", "shell.execute_reply": "2026-01-19T18:20:36.703497Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Dimensions/axes: 2\n", "Shape (size of array in each dimension): (2, 4)\n", "Size (total number of elements): 8\n", "Number of bytes: 64\n", "Data type: int64\n", "Item size (in bytes): 8\n" ] } ], "source": [ "print('Dimensions/axes:', a.ndim)\n", "print('Shape (size of array in each dimension):', a.shape)\n", "print('Size (total number of elements):', a.size)\n", "print('Number of bytes:', a.nbytes)\n", "print('Data type:', a.dtype)\n", "print('Item size (in bytes):', a.itemsize)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have already seen how to access elements in a 1D array. For 2D arrays, we can use two indices: the first for the row and the second for the column." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.706047Z", "iopub.status.busy": "2026-01-19T18:20:36.705860Z", "iopub.status.idle": "2026-01-19T18:20:36.708688Z", "shell.execute_reply": "2026-01-19T18:20:36.708282Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Element at (0, 2): 3\n" ] } ], "source": [ "element = a[0, 2] # Access the element in the first row and third column\n", "print(\"Element at (0, 2):\", element)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also use slicing to access subarrays. For example, to get the first two rows and the first three columns:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.710582Z", "iopub.status.busy": "2026-01-19T18:20:36.710413Z", "iopub.status.idle": "2026-01-19T18:20:36.713316Z", "shell.execute_reply": "2026-01-19T18:20:36.712938Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Subarray:\n", " [[1 2 3]\n", " [5 6 7]]\n" ] } ], "source": [ "subarray = a[0:2, 0:3]\n", "print(\"Subarray:\\n\", subarray)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We don't need to specify both indices all the time. 
For example, to get the first row, we can do:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.715439Z", "iopub.status.busy": "2026-01-19T18:20:36.715253Z", "iopub.status.idle": "2026-01-19T18:20:36.717856Z", "shell.execute_reply": "2026-01-19T18:20:36.717395Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "First row: [1 2 3 4]\n" ] } ], "source": [ "first_row = a[0, :]\n", "print(\"First row:\", first_row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or, to get the second column:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.720479Z", "iopub.status.busy": "2026-01-19T18:20:36.720299Z", "iopub.status.idle": "2026-01-19T18:20:36.723131Z", "shell.execute_reply": "2026-01-19T18:20:36.722736Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Second column: [2 6]\n" ] } ], "source": [ "second_column = a[:, 1]\n", "print(\"Second column:\", second_column)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can initialize arrays using different functions depending on our aim. The most straightforward case is to pass a list to `np.array()`: " ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.725161Z", "iopub.status.busy": "2026-01-19T18:20:36.724980Z", "iopub.status.idle": "2026-01-19T18:20:36.728114Z", "shell.execute_reply": "2026-01-19T18:20:36.727745Z" } }, "outputs": [ { "data": { "text/plain": [ "array([5, 6, 7])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr1 = np.array([5,6,7])\n", "arr1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, sometimes we do not know in advance what our array will contain. 
We just need to initialize an array so that our code can update it later on. For this, we typically create arrays of the desired dimensions and fill them with zeros (`np.zeros()`), ones (`np.ones()`), a given value (`np.full()`), or leave them uninitialized (`np.empty()`). \n", "\n", ":::{.callout-tip}\n", "When working with large data, `np.empty()` can be faster than `np.zeros()` because it skips initialization. Also, large arrays can take up most of your memory and, in those cases, carefully choosing the `dtype` can help to manage memory more efficiently (e.g., choosing an 8-bit type over a 64-bit one when the values fit).\n", ":::" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.729933Z", "iopub.status.busy": "2026-01-19T18:20:36.729754Z", "iopub.status.idle": "2026-01-19T18:20:36.733172Z", "shell.execute_reply": "2026-01-19T18:20:36.732721Z" } }, "outputs": [ { "data": { "text/plain": [ "array([0., 0., 0., 0.])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.zeros(4)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.734925Z", "iopub.status.busy": "2026-01-19T18:20:36.734758Z", "iopub.status.idle": "2026-01-19T18:20:36.738679Z", "shell.execute_reply": "2026-01-19T18:20:36.738082Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[1., 1., 1.],\n", " [1., 1., 1.]])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.ones((2,3))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To create higher-dimensional arrays, we can pass a tuple representing the shape of the array:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.741375Z", "iopub.status.busy": "2026-01-19T18:20:36.741127Z", "iopub.status.idle": "2026-01-19T18:20:36.744719Z", "shell.execute_reply": 
"2026-01-19T18:20:36.744292Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[[1.],\n", " [1.]],\n", "\n", " [[1.],\n", " [1.]],\n", "\n", " [[1.],\n", " [1.]]])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.ones((3,2,1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This created a 3D array with 3 layers of matrices with 2 rows and 1 column. \n", "\n", "\n", "We can use `np.full()` to create an array of constant values that we specify in the `fill_value` option. " ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.746519Z", "iopub.status.busy": "2026-01-19T18:20:36.746342Z", "iopub.status.idle": "2026-01-19T18:20:36.749703Z", "shell.execute_reply": "2026-01-19T18:20:36.749282Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[4, 4],\n", " [4, 4]])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.full((2,2) , fill_value= 4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`np.empty()` creates an array without initializing its values. The values in the array will be whatever is already present in the allocated memory, which can be random and unpredictable." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.751522Z", "iopub.status.busy": "2026-01-19T18:20:36.751343Z", "iopub.status.idle": "2026-01-19T18:20:36.754709Z", "shell.execute_reply": "2026-01-19T18:20:36.754310Z" } }, "outputs": [ { "data": { "text/plain": [ "array([-2.0000e+000, 1.6189e-319])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.empty(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With `np.linspace()`, we can create arrays with evenly spaced values over a specified range. 
The syntax is `np.linspace(start, stop, num)`, where `start` is the starting value, `stop` is the ending value, and `num` is the number of evenly spaced values to generate." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.756690Z", "iopub.status.busy": "2026-01-19T18:20:36.756505Z", "iopub.status.idle": "2026-01-19T18:20:36.759907Z", "shell.execute_reply": "2026-01-19T18:20:36.759463Z" } }, "outputs": [ { "data": { "text/plain": [ "array([0. , 0.25, 0.5 , 0.75, 1. ])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.linspace(0, 1, 5) # Generates 5 evenly spaced values between 0 and 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`np.arange()` is another useful function to create arrays with evenly spaced values, similar to the built-in `range()` function but returning a NumPy array. The syntax is `np.arange(start, stop, step)`, where `start` is the starting value, `stop` is the ending value (exclusive), and `step` is the increment between each value." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.761693Z", "iopub.status.busy": "2026-01-19T18:20:36.761516Z", "iopub.status.idle": "2026-01-19T18:20:36.764662Z", "shell.execute_reply": "2026-01-19T18:20:36.764201Z" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 2, 4, 6, 8])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(0, 10, 2) # Generates values from 0 to 8 with a step of 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that both `np.linspace()` and `np.arange()` can be used to create sequences of numbers, but they differ in how you specify the spacing and the number of elements. 
In general, use `np.linspace()` when you want a specific number of evenly spaced values over a range, and use `np.arange()` when you want to specify the step size between values.\n", "\n", "Sometimes, you might also need to create identity matrices, which are square matrices with ones on the diagonal and zeros elsewhere. You can use `np.eye()` to create an identity matrix of a specified size." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.766789Z", "iopub.status.busy": "2026-01-19T18:20:36.766611Z", "iopub.status.idle": "2026-01-19T18:20:36.770017Z", "shell.execute_reply": "2026-01-19T18:20:36.769529Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[1., 0., 0.],\n", " [0., 1., 0.],\n", " [0., 0., 1.]])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.eye(3) # Creates a 3x3 identity matrix" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or you might want to create diagonal matrices with specific values on the diagonal. You can use `np.diag()` for this purpose." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.772229Z", "iopub.status.busy": "2026-01-19T18:20:36.772047Z", "iopub.status.idle": "2026-01-19T18:20:36.775426Z", "shell.execute_reply": "2026-01-19T18:20:36.775009Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 0, 0],\n", " [0, 2, 0],\n", " [0, 0, 3]])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.diag([1, 2, 3]) # Creates a diagonal matrix with 1, 2, 3 on the diagonal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, to create random arrays, NumPy provides several functions in the `np.random` module. 
For example, you can create an array of random floats between 0 and 1 using `np.random.rand()`, or an array of random integers within a specified range using `np.random.randint()`, or a normal distribution using `np.random.randn()`." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.777541Z", "iopub.status.busy": "2026-01-19T18:20:36.777338Z", "iopub.status.idle": "2026-01-19T18:20:36.803525Z", "shell.execute_reply": "2026-01-19T18:20:36.803034Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[0.02894132, 0.48808971, 0.69074487],\n", " [0.33248865, 0.10684061, 0.70547865]])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.random.rand(2, 3) # Creates a 2x3 array of random floats between 0 and 1" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.805808Z", "iopub.status.busy": "2026-01-19T18:20:36.805535Z", "iopub.status.idle": "2026-01-19T18:20:36.809912Z", "shell.execute_reply": "2026-01-19T18:20:36.809484Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[8, 2, 0],\n", " [7, 4, 2]])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.random.randint(0, 10, size=(2, 3)) # Creates a 2x3 array of random integers between 0 and 9" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.812022Z", "iopub.status.busy": "2026-01-19T18:20:36.811812Z", "iopub.status.idle": "2026-01-19T18:20:36.815642Z", "shell.execute_reply": "2026-01-19T18:20:36.815150Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0.20414178, 0.16512519, -0.13471576],\n", " [ 0.51861456, 1.70765362, 0.549939 ]])" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.random.randn(2, 3) # Creates a 2x3 array of random floats from a 
standard normal distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{.callout-tip}\n", "### Random Seed\n", "\n", "When generating random numbers, it's often useful to set a random seed using `np.random.seed()`. This ensures that the sequence of random numbers generated is reproducible, meaning that you will get the same random numbers each time you run your code with the same seed. This is particularly important for debugging and sharing results.\n", "\n", ":::\n", "\n", "\n", "#### Managing Array Data\n", "\n", "Arrays support common operations like sorting, concatenating and finding unique elements. \n", "\n", "For instance, `np.sort()` returns a sorted copy of an array, leaving the original unchanged. " ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.817735Z", "iopub.status.busy": "2026-01-19T18:20:36.817526Z", "iopub.status.idle": "2026-01-19T18:20:36.821754Z", "shell.execute_reply": "2026-01-19T18:20:36.821282Z" } }, "outputs": [ { "data": { "text/plain": [ "array([ 0, 2, 3, 5, 10, 50])" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr1 = np.array((10,2,5,3,50,0))\n", "np.sort(arr1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In multidimensional arrays, we can choose the axis along which to sort. With `axis=0`, the values within each column of a 2D array are sorted (moving down the rows). With `axis=1`, the values within each row are sorted (moving across the columns). The shape of the array does not change. Note that the `sort()` method of an array, used below, sorts the array in place instead of returning a copy." 
] }, { "cell_type": "code", "execution_count": 24, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.823995Z", "iopub.status.busy": "2026-01-19T18:20:36.823771Z", "iopub.status.idle": "2026-01-19T18:20:36.827355Z", "shell.execute_reply": "2026-01-19T18:20:36.826902Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [8, 1, 5]])" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mat1 = np.array([[1,2,3],[8,1,5]])\n", "mat1" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.829353Z", "iopub.status.busy": "2026-01-19T18:20:36.829146Z", "iopub.status.idle": "2026-01-19T18:20:36.832585Z", "shell.execute_reply": "2026-01-19T18:20:36.832192Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [1, 5, 8]])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mat1.sort(axis=1) # Sort the values within each row, in place\n", "mat1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using `np.concatenate()` we can join two arrays along an existing axis. 
" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.834450Z", "iopub.status.busy": "2026-01-19T18:20:36.834266Z", "iopub.status.idle": "2026-01-19T18:20:36.838651Z", "shell.execute_reply": "2026-01-19T18:20:36.837884Z" } }, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 6, 7, 8])" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr1 = np.array((1,2,3))\n", "arr2 = np.array((6,7,8))\n", "np.concatenate((arr1,arr2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To stack arrays vertically or horizontally, we can use `np.vstack()` and `np.hstack()`. For 1D inputs, `np.vstack()` adds a new axis and returns a 2D array, while `np.hstack()` gives the same result as `np.concatenate()`. " ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.841708Z", "iopub.status.busy": "2026-01-19T18:20:36.841493Z", "iopub.status.idle": "2026-01-19T18:20:36.845132Z", "shell.execute_reply": "2026-01-19T18:20:36.844708Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [6, 7, 8]])" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.vstack((arr1,arr2)) # Vertical stack" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.847168Z", "iopub.status.busy": "2026-01-19T18:20:36.846964Z", "iopub.status.idle": "2026-01-19T18:20:36.850362Z", "shell.execute_reply": "2026-01-19T18:20:36.849942Z" } }, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 6, 7, 8])" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.hstack((arr1,arr2)) # Horizontal stack" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also possible to reshape arrays. 
For instance, let's reshape the concatenation of `arr1` and `arr2` to 3 rows and 2 columns" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.852423Z", "iopub.status.busy": "2026-01-19T18:20:36.852219Z", "iopub.status.idle": "2026-01-19T18:20:36.855739Z", "shell.execute_reply": "2026-01-19T18:20:36.855312Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 2],\n", " [3, 6],\n", " [7, 8]])" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr_c = np.concatenate((arr1,arr2))\n", "arr_c.reshape((3,2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also perform aggregation functions over all elements, like finding the minimum, maximum, means, sum of elements and much more. " ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.857898Z", "iopub.status.busy": "2026-01-19T18:20:36.857682Z", "iopub.status.idle": "2026-01-19T18:20:36.860799Z", "shell.execute_reply": "2026-01-19T18:20:36.860378Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "6\n", "3\n", "2.0\n" ] } ], "source": [ "print(arr1.min())\n", "print(arr1.sum())\n", "print(arr1.max())\n", "print(arr1.mean())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This can also be done over a specific axis in multidimensional arrays. 
For example, let's create a 2D array and find the sum across rows and columns" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.862708Z", "iopub.status.busy": "2026-01-19T18:20:36.862516Z", "iopub.status.idle": "2026-01-19T18:20:36.865746Z", "shell.execute_reply": "2026-01-19T18:20:36.865374Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[5 7 9]\n", "[ 6 15]\n" ] } ], "source": [ "mat2 = np.array([[1,2,3],[4,5,6]])\n", "print(mat2.sum(axis=0)) # Sum along rows\n", "print(mat2.sum(axis=1)) # Sum along columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also possible to get only the unique elements of an array or to count how many elements are repeated. " ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.867918Z", "iopub.status.busy": "2026-01-19T18:20:36.867741Z", "iopub.status.idle": "2026-01-19T18:20:36.877241Z", "shell.execute_reply": "2026-01-19T18:20:36.876805Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 1 2 3 5 6 7 8 11]\n", "Unique elements: [ 1 2 3 5 6 7 8 11]\n", "Counts: [3 1 2 1 1 1 1 2]\n" ] } ], "source": [ "arr1 = np.array((1,2,3,3,1,1,5,6,7,8,11,11))\n", "print(np.unique(arr1))\n", "unq, count = np.unique(arr1, return_counts=True)\n", "print(\"Unique elements:\", unq)\n", "print(\"Counts:\", count)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using `where()`, we can find the indices of elements that satisfy a given condition." 
] }, { "cell_type": "code", "execution_count": 33, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.879345Z", "iopub.status.busy": "2026-01-19T18:20:36.879150Z", "iopub.status.idle": "2026-01-19T18:20:36.882385Z", "shell.execute_reply": "2026-01-19T18:20:36.881956Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Indices of elements greater than 25: (array([4, 5, 6]),)\n" ] } ], "source": [ "arr1 = np.array((10,15,20,25,30,35,40))\n", "indices = np.where(arr1 > 25)\n", "print(\"Indices of elements greater than 25:\", indices)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also use boolean indexing to filter elements based on a condition." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.885068Z", "iopub.status.busy": "2026-01-19T18:20:36.884778Z", "iopub.status.idle": "2026-01-19T18:20:36.888367Z", "shell.execute_reply": "2026-01-19T18:20:36.887864Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Elements greater than 25: [30 35 40]\n" ] } ], "source": [ "filtered_elements = arr1[arr1 > 25]\n", "print(\"Elements greater than 25:\", filtered_elements)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we can replace elements that meet a condition using `np.where()`" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.890816Z", "iopub.status.busy": "2026-01-19T18:20:36.890571Z", "iopub.status.idle": "2026-01-19T18:20:36.893937Z", "shell.execute_reply": "2026-01-19T18:20:36.893485Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Array after replacement: [10 15 20 25 -1 -1 -1]\n" ] } ], "source": [ "new_arr = np.where(arr1 > 25, -1, arr1) # Replace elements greater than 25 with -1\n", "print(\"Array after replacement:\", new_arr)" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "#### Array Operations\n", "\n", "NumPy arrays support common operations such as addition, subtraction and multiplication. These operations are performed element-wise, meaning that they are applied to each pair of corresponding elements in the arrays." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.896124Z", "iopub.status.busy": "2026-01-19T18:20:36.895898Z", "iopub.status.idle": "2026-01-19T18:20:36.899017Z", "shell.execute_reply": "2026-01-19T18:20:36.898503Z" } }, "outputs": [], "source": [ "A = np.array(((1,2,3),\n", " (4,5,6)))\n", "B = np.array(((10,20,30),\n", " (40,50,60)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Element-wise addition, subtraction and multiplication can be performed with `+`, `-` and `*`. " ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.901489Z", "iopub.status.busy": "2026-01-19T18:20:36.901256Z", "iopub.status.idle": "2026-01-19T18:20:36.905112Z", "shell.execute_reply": "2026-01-19T18:20:36.904566Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[11, 22, 33],\n", " [44, 55, 66]])" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A + B" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.907768Z", "iopub.status.busy": "2026-01-19T18:20:36.907196Z", "iopub.status.idle": "2026-01-19T18:20:36.911192Z", "shell.execute_reply": "2026-01-19T18:20:36.910701Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 9, 18, 27],\n", " [36, 45, 54]])" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B - A" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.913281Z", "iopub.status.busy": "2026-01-19T18:20:36.913054Z", "iopub.status.idle": 
"2026-01-19T18:20:36.916787Z", "shell.execute_reply": "2026-01-19T18:20:36.916287Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 10, 40, 90],\n", " [160, 250, 360]])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A * B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To multiply (`*`) or divide (`/`) all elements by a scalar, we just specify the scalar. " ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.919163Z", "iopub.status.busy": "2026-01-19T18:20:36.918921Z", "iopub.status.idle": "2026-01-19T18:20:36.922711Z", "shell.execute_reply": "2026-01-19T18:20:36.922204Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[10, 20, 30],\n", " [40, 50, 60]])" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A * 10" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.924986Z", "iopub.status.busy": "2026-01-19T18:20:36.924741Z", "iopub.status.idle": "2026-01-19T18:20:36.928582Z", "shell.execute_reply": "2026-01-19T18:20:36.928088Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[1., 2., 3.],\n", " [4., 5., 6.]])" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B / 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that NumPy automatically **broadcasts** the scalar to all elements of the array.\n", "\n", ":::{.callout-tip}\n", "\n", "### Broadcasting\n", "\n", "Broadcasting is a powerful mechanism in NumPy that allows operations to be performed on arrays of different shapes. When the shapes are compatible, NumPy conceptually stretches the smaller array along the dimensions of the larger array so that the two match, without actually copying the data. 
This process is called broadcasting.\n", "\n", "For example, consider adding a 1D array to a 2D array. NumPy will \"broadcast\" the 1D array across the rows of the 2D array to perform the addition." ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.930961Z", "iopub.status.busy": "2026-01-19T18:20:36.930715Z", "iopub.status.idle": "2026-01-19T18:20:36.934250Z", "shell.execute_reply": "2026-01-19T18:20:36.933666Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[11 22 33]\n", " [14 25 36]]\n" ] } ], "source": [ "A = np.array([[1, 2, 3],\n", " [4, 5, 6]])\n", "B = np.array([10, 20, 30]) # 1D array\n", "C = A + B # B is broadcast across the rows of A\n", "print(C)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::\n", "\n", "\n", "Comparing NumPy arrays is also possible using operators such as `==`, `!=`, and the like. A comparison returns an array of booleans indicating whether the condition is met for each element. " ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.936640Z", "iopub.status.busy": "2026-01-19T18:20:36.936340Z", "iopub.status.idle": "2026-01-19T18:20:36.941495Z", "shell.execute_reply": "2026-01-19T18:20:36.940745Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[ True, False, True],\n", " [False, False, True]])" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr1 = np.array(((1,2,3),(4,5,6)))\n", "arr2 = np.array(((1,5,3),(7,2,6)))\n", "arr1==arr2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recall that we use double equals `==` for comparison, while a single equals `=` is used for assignment.\n", "\n", "\n", "Note that element-wise multiplication is different from matrix multiplication. Matrix multiplication is performed with either the `@` operator or `np.matmul()`." 
] }, { "cell_type": "code", "execution_count": 44, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.943953Z", "iopub.status.busy": "2026-01-19T18:20:36.943727Z", "iopub.status.idle": "2026-01-19T18:20:36.948142Z", "shell.execute_reply": "2026-01-19T18:20:36.947647Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[20, 29],\n", " [47, 74]])" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.matmul(arr1,arr2.T) # Note the transpose of arr2 to match dimensions" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "execution": { "iopub.execute_input": "2026-01-19T18:20:36.950249Z", "iopub.status.busy": "2026-01-19T18:20:36.950041Z", "iopub.status.idle": "2026-01-19T18:20:36.953454Z", "shell.execute_reply": "2026-01-19T18:20:36.952971Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[20, 29],\n", " [47, 74]])" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr1 @ arr2.T # Note the transpose of arr2 to match dimensions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercises\n", "\n", "1. Create a 1D array with all integer elements from 1 to 10 (both included). No hard-coding allowed!\n", "2. From the array you created in 1, create one that contains all odd elements and one with all even elements. \n", "3. Create a new array in which all odd elements of the array from 1 are replaced by -1. \n", "4. Create a 3-by-3 matrix filled with 'True' values (i.e., booleans).\n", "5. Suppose you have array `a=np.array(['a','b','c','d','e','f','g'])` and `b = np.array(['g','h','c','a','e','w','g'])`. Find all elements that are equal. Can you get the position where the elements of both arrays match?\n", "6. Write a function that takes an array and prints the elements that are divisible by a given number. Try it by creating an array from 1 to 20 and printing the elements divisible by 3. \n", "7. 
Consider two matrices, A and B, both of size 100x100, filled with random integer values between 1 and 10. Implement a function to perform element-wise multiplication of these matrices using nested loops. Implement the same operation using NumPy's vectorized multiplication. Repeat with matrices of size 1000x1000 and 10000x10000, and compare the execution times. Which one is faster? " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3", "path": "/usr/local/share/jupyter/kernels/python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.5" } }, "nbformat": 4, "nbformat_minor": 4 }