Linear Algebra
Scalars and Vectors
- Scalar: A single number representing magnitude only.
- Vector: An ordered array of scalars representing both magnitude and direction.
- Row Vector: $\mathbf{x} = [x_1, x_2, \dots, x_n]$
- Column Vector: $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$
Vector Operations
- Transpose: Converts a column vector into a row vector, and vice versa.
- Addition: Element-wise addition of two vectors.
- Scalar Multiplication: Multiplying every element of the vector by a scalar.
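- Inner Product (Dot Product): The sum of the products of corresponding elements; the result is a scalar: $\mathbf{x}^T \mathbf{y} = \sum_{i=1}^{n} x_i y_i$. The angle $\theta$ between two non-zero vectors is expressed as: $\cos\theta = \frac{\mathbf{x}^T \mathbf{y}}{\|\mathbf{x}\|_2 \, \|\mathbf{y}\|_2}$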
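The inner product and the angle between two vectors can be sketched in a few lines of plain Python. The function names `dot` and `angle` are illustrative choices, and vectors are assumed to be lists of floats of equal length.

```python
import math

def dot(x, y):
    """Inner product: sum of the products of corresponding elements."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))

def angle(x, y):
    """Angle in radians between two non-zero vectors."""
    cos_theta = dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

x = [1.0, 0.0]
y = [1.0, 1.0]
print(dot(x, y))                  # 1.0
print(math.degrees(angle(x, y)))  # approximately 45 degrees
```

Orthogonal vectors have a zero inner product, so `angle([1.0, 0.0], [0.0, 1.0])` comes out as a right angle.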
Vector Norms
A norm is a function that represents the “length” of a vector.
- $L_0$ Norm: The number of non-zero elements in a vector. Strictly speaking it is not a true norm, because it does not scale with the vector.
- $L_1$ Norm (Manhattan Norm): Sum of the absolute values of the elements.
- $L_2$ Norm (Euclidean Norm): The square root of the sum of the squares of the elements.
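- $L_p$ Norm: $\|\mathbf{x}\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$ for $p \geq 1$; $p = 1$ and $p = 2$ recover the two norms above.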
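As a rough illustration of these definitions, the following sketch computes the norms of one small vector. The helper names `l0_norm` and `lp_norm` are hypothetical, chosen only for this example.

```python
def l0_norm(x):
    """Count of non-zero elements (the L0 'norm')."""
    return sum(1 for v in x if v != 0)

def lp_norm(x, p):
    """L_p norm for p >= 1: (sum |x_i|^p)^(1/p)."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

x = [3.0, 0.0, -4.0]
print(l0_norm(x))     # 2
print(lp_norm(x, 1))  # 7.0 (Manhattan norm)
print(lp_norm(x, 2))  # 5.0 (Euclidean norm)
```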
Matrices and Tensors
An $m \times n$ matrix is a rectangular array with $m$ rows and $n$ columns. $\mathbb{R}^{m \times n}$ denotes the space of all real-valued $m \times n$ matrices.
- Square Matrix: A matrix where the number of rows equals the number of columns.
- Diagonal Matrix: A square matrix where all elements outside the main diagonal are zero.
- Identity Matrix ($I$): A diagonal matrix where all diagonal elements are 1.
Matrix Multiplication
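Multiplication of $A$ and $B$ is defined only if the number of columns in $A$ equals the number of rows in $B$. If $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{n \times p}$, and $C = AB$, then $C \in \mathbb{R}^{m \times p}$ with $C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$.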
- Properties:
  - Associative: $(AB)C = A(BC)$
  - Left Distributive: $A(B+C) = AB + AC$
  - Right Distributive: $(B+C)A = BA + CA$
  - Non-commutative: in general, $AB \neq BA$.
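A minimal sketch of the row-by-column rule, assuming matrices are stored as lists of rows. The function name `matmul` is an illustrative choice, and the example matrices are picked so that the two product orders visibly differ.

```python
def matmul(A, B):
    """Matrix product C = AB with C[i][j] = sum_k A[i][k] * B[k][j]."""
    n_rows, inner, n_cols = len(A), len(B), len(B[0])
    if any(len(row) != inner for row in A):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(n_cols)]
            for i in range(n_rows)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]]
```

Here right-multiplying by `B` swaps the columns of `A`, while left-multiplying swaps its rows, which is one concrete way to see that the order of the factors matters.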
Matrix Transpose
The transpose $A^T$ of an $m \times n$ matrix $A$ is an $n \times m$ matrix where $(A^T)_{ij} = A_{ji}$.
- Properties:
  - $(A^T)^T = A$
  - $(A+B)^T = A^T + B^T$
  - $(AB)^T = B^T A^T$
  - $(kA)^T = k A^T$
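The order reversal in $(AB)^T = B^T A^T$ is easy to check numerically. The sketch below uses hypothetical helpers `transpose` and `matmul` on lists of rows, with a non-square pair so that the shapes only line up in the reversed order.

```python
def transpose(A):
    """Swap rows and columns: result[j][i] = A[i][j]."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product using rows of A and columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [0, 1], [2, 2]]  # 3 x 2

lhs = transpose(matmul(A, B))             # (AB)^T
rhs = matmul(transpose(B), transpose(A))  # B^T A^T
print(lhs)         # [[7, 16], [8, 17]]
print(lhs == rhs)  # True
```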
Matrix Inverse
For a square matrix $A$, if there exists a matrix $B$ such that $AB = BA = I$, then $B$ is the inverse, denoted as $A^{-1}$.
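For the $2 \times 2$ case the inverse has a closed form, which makes the definition $AB = BA = I$ easy to verify. This is only a sketch for that special case; `inverse_2x2` and `matmul` are illustrative names, and larger matrices would normally be handled by a numerical library.

```python
def matmul(A, B):
    """Matrix product using rows of A and columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def inverse_2x2(A):
    """Closed-form inverse of a 2 x 2 matrix via its determinant."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[4.0, 7.0], [2.0, 6.0]]
A_inv = inverse_2x2(A)

# Both products should equal the identity up to floating-point rounding.
print(matmul(A, A_inv))
print(matmul(A_inv, A))
```

A matrix with zero determinant, such as `[[1.0, 2.0], [2.0, 4.0]]`, has no inverse, and the sketch raises `ValueError` for it.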
Other Matrix Operations
- Vectorization ($\text{vec}$): Rearranging matrix elements column-wise into a single column vector.
- Matrix Inner Product: $\langle A, B \rangle = \text{tr}(A^T B) = \sum_{i,j} A_{ij} B_{ij}$ for two matrices of the same dimension.
- Hadamard Product ($\odot$): Element-wise multiplication of two matrices of the same dimension.
- Kronecker Product ($\otimes$): Each element of $A$ is multiplied by the entire matrix $B$.
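The four operations above can be sketched on small integer matrices. All function names here (`vec`, `hadamard`, `inner`, `kron`) are illustrative, and matrices are assumed to be lists of rows.

```python
def vec(A):
    """Stack the columns of A into one flat list (column-major order)."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]

def hadamard(A, B):
    """Element-wise product of two matrices of the same dimension."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inner(A, B):
    """Matrix inner product: the sum of all element-wise products."""
    return sum(sum(row) for row in hadamard(A, B))

def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
I2 = [[1, 0], [0, 1]]

print(vec(A))          # [1, 3, 2, 4]
print(hadamard(A, B))  # [[0, 10], [18, 28]]
print(inner(A, B))     # 56
print(kron(A, I2))     # [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 4, 0], [0, 3, 0, 4]]
```

Note that the matrix inner product equals the ordinary dot product of the two vectorized matrices.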
Tensors
A tensor is a multi-dimensional array, generalizing scalars (0D), vectors (1D), and matrices (2D) to $n$ dimensions.
Matrix Calculus
Common Derivatives
- $\frac{\partial (\mathbf{a}^T \mathbf{x})}{\partial \mathbf{x}} = \mathbf{a}$
- $\frac{\partial (\mathbf{x}^T A \mathbf{x})}{\partial \mathbf{x}} = (A + A^T)\mathbf{x}$
- $\frac{\partial \text{tr}(AX)}{\partial X} = A^T$
- $\frac{\partial \text{tr}(X^T A X)}{\partial X} = (A + A^T)X$
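One way to gain confidence in such identities is to compare them against a finite-difference approximation. The sketch below does this for $\frac{\partial (\mathbf{x}^T A \mathbf{x})}{\partial \mathbf{x}} = (A + A^T)\mathbf{x}$; the function names and the step size `h` are assumptions made for this example.

```python
def quad_form(A, x):
    """Scalar value x^T A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def analytic_grad(A, x):
    """Closed-form gradient (A + A^T) x."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n))
            for i in range(n)]

def numeric_grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

A = [[1.0, 2.0], [0.0, 3.0]]  # deliberately non-symmetric
x = [1.0, -1.0]

print(analytic_grad(A, x))                         # [0.0, -4.0]
print(numeric_grad(lambda v: quad_form(A, v), x))  # close to the line above
```

Using a non-symmetric $A$ matters here: for symmetric $A$ the result collapses to $2A\mathbf{x}$, which would hide the role of the $A^T$ term.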
Jacobian and Gradient Matrices
- Jacobian Matrix: For a function $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m$, the Jacobian $J$ is an $m \times n$ matrix of first-order partial derivatives, with $J_{ij} = \frac{\partial f_i}{\partial x_j}$.
- Gradient Matrix: For a scalar function $f(X)$, the gradient $\nabla_X f = \frac{\partial f}{\partial X}$ has the same shape as $X$ and is the transpose of the Jacobian matrix $\frac{\partial f}{\partial X^T}$: $\nabla_X f = \left(\frac{\partial f}{\partial X^T}\right)^T$.
- Hessian Matrix: A square matrix of second-order partial derivatives of a scalar-valued function.
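To make the $m \times n$ shape of the Jacobian concrete, the following sketch approximates it by central differences for a hypothetical function $\mathbf{f}: \mathbb{R}^2 \to \mathbb{R}^3$. The function `f`, the helper `jacobian`, and the step size are all assumptions made for illustration.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the m x n Jacobian of f at x by central differences."""
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            # Row i is output component f_i, column j is input x_j.
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def f(x):
    """Example map from R^2 to R^3."""
    return [x[0] * x[1], x[0] + x[1], x[0] ** 2]

J = jacobian(f, [2.0, 3.0])
print(len(J), len(J[0]))  # 3 2
for row in J:
    print(row)  # close to [3, 2], [1, 1], [4, 0]
```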
Matrix Differentials and Trace
- Trace Properties:
  - $\text{tr}(A) = \sum_{i} A_{ii}$
  - $\text{tr}(ABC) = \text{tr}(BCA) = \text{tr}(CAB)$ (Cyclic property)
- Differential Rules:
  - $d(A \pm B) = dA \pm dB$
  - $d(AB) = (dA)B + A(dB)$
  - $d(A^T) = (dA)^T$
  - $d(\text{tr}(X)) = \text{tr}(dX)$
  - $d(X^{-1}) = -X^{-1}(dX)X^{-1}$
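The cyclic property holds even when the three products have different shapes, as long as each product is defined. A small sketch with hypothetical helpers `matmul` and `trace`, using integer matrices so the comparison is exact:

```python
def matmul(A, B):
    """Matrix product using rows of A and columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def trace(A):
    """Sum of the diagonal elements of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [0, 1], [2, 2]]  # 3 x 2
C = [[1, 1], [0, 2]]          # 2 x 2

t_abc = trace(matmul(matmul(A, B), C))  # ABC is 2 x 2
t_bca = trace(matmul(matmul(B, C), A))  # BCA is 3 x 3
t_cab = trace(matmul(matmul(C, A), B))  # CAB is 2 x 2
print(t_abc, t_bca, t_cab)  # 57 57 57
```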
Solving via Differentials
The relationship between the differential and the Jacobian for a scalar function $f(X)$ is: $df = \text{tr}(J \, dX)$, where $J = G^T$ is the Jacobian and $G = \nabla_X f$ is the gradient.
By expanding the differential $df$ and rearranging it into the trace form $\text{tr}(G^T dX)$, the matrix $G$ is identified as the gradient.
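As a worked example of this procedure, take $f(X) = \text{tr}(X^T A X)$ with $A$ constant. The steps use only the differential rules and trace properties listed above, plus the fact that a matrix and its transpose have the same trace:

$$
\begin{aligned}
df &= \text{tr}\big((dX)^T A X\big) + \text{tr}\big(X^T A \, dX\big) \\
   &= \text{tr}\big(X^T A^T dX\big) + \text{tr}\big(X^T A \, dX\big) \\
   &= \text{tr}\big(X^T (A + A^T) \, dX\big) \\
   &= \text{tr}\Big(\big((A + A^T) X\big)^T dX\Big)
\end{aligned}
$$

Matching this against $\text{tr}(G^T dX)$ gives $G = (A + A^T)X$, which agrees with the entry for $\text{tr}(X^T A X)$ in the list of common derivatives.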


