Search
Meaning
Semantics
Text
Cosine Similarity
= measures the cosine of the angle between 2 vectors
= S_{C}(A, B)
= \cos(\theta)
= \frac{\text{adjacent}}{\text{hypotenuse}}
= \frac{A \cdot B}{\|A\| \, \|B\|}
\[ S_C(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|} \]
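Why adjacent/hypotenuse becomes the dot-product formula (a short sketch, assuming A and B are non-zero): drop a perpendicular from the tip of A onto the line through B. In the resulting right triangle the hypotenuse is the length of A, and the adjacent side is the scalar projection of A onto B.
\[ \text{hypotenuse} = \|A\|, \qquad \text{adjacent} = \frac{A \cdot B}{\|B\|} \]
\[ \cos(\theta) = \frac{\text{adjacent}}{\text{hypotenuse}} = \frac{A \cdot B / \|B\|}{\|A\|} = \frac{A \cdot B}{\|A\| \, \|B\|} \]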
Cosine Similarity
+1.0 = same meaning = n-dim parallel, same direction
 0.0 = unrelated = n-dim orthogonal
-1.0 = opposite meaning = n-dim parallel, opposite direction
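A minimal sketch of the formula in plain Python, to make the three landmark values concrete. The function name `cosine_similarity` and the sample vectors are illustrative choices, not part of the original notes.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Illustrative 2D vectors (assumed for this sketch).
same = cosine_similarity([3, 4], [6, 8])        # parallel, same direction
unrelated = cosine_similarity([1, 0], [0, 1])   # orthogonal
opposite = cosine_similarity([3, 4], [-3, -4])  # parallel, opposite direction

print(same, unrelated, opposite)
```

Note that only direction matters: `[3, 4]` and `[6, 8]` differ in length but still score as identical.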
Cosine similarity
Example in 3D
measures how aligned two 3D vectors are
the cosine of the angle between them
Take two 3D vectors:\[ A = (1, 2, 3), \quad B = (4, 5, 6) \]
1. Dot product:\[ A \cdot B = 1\cdot4 + 2\cdot5 + 3\cdot6 = 32 \]
2. Norms (lengths):\[ \|A\| = \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14} \] \[ \|B\| = \sqrt{4^2 + 5^2 + 6^2} = \sqrt{77} \]
3. Cosine similarity:\[ \cos(\theta) = \frac{A \cdot B}{\|A\|\|B\|} = \frac{32}{\sqrt{14}\sqrt{77}} \approx 0.9746 \]
So the cosine similarity between A and B in 3D
is approximately 0.9746
meaning
the angle between them is small and
they point in a very similar direction
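The three steps of the 3D example can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative, and the conversion of the angle to degrees is an extra step not in the notes.

```python
import math

A = (1, 2, 3)
B = (4, 5, 6)

# Step 1: dot product, 1*4 + 2*5 + 3*6
dot = sum(a * b for a, b in zip(A, B))

# Step 2: norms (lengths), sqrt(14) and sqrt(77)
norm_a = math.sqrt(sum(a * a for a in A))
norm_b = math.sqrt(sum(b * b for b in B))

# Step 3: cosine similarity, then the angle itself
cos_theta = dot / (norm_a * norm_b)
theta_deg = math.degrees(math.acos(cos_theta))

print(f"dot = {dot}")
print(f"cos(theta) = {cos_theta:.4f}")
print(f"theta = {theta_deg:.1f} degrees")
```

The angle comes out at roughly 13 degrees, which is what "a very similar direction" means here.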
Meaning
Semantics
Text
RAG = Retrieval Augmented Generation
Data Modality | Converted to text by
Text | (none needed)
Image | OCR
Video | VLM
Audio | ASR
Text formats: .csv .xml .json SQL .py
OCR = Optical Character Recognition
ASR = Automatic Speech Recognition
VLM = Vision Language Model
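One way to picture the modality table is as a lookup from file type to the tool that turns the file into text. The sketch below is hypothetical: the dictionary names, the function `plan_text_extraction`, and the `.sql` extension (the notes only say "SQL") are assumptions, and it only decides which converter to use without calling any real OCR, VLM, or ASR system.

```python
from pathlib import Path

# Assumed mapping from file extension to data modality.
EXTENSION_TO_MODALITY = {
    ".csv": "text", ".xml": "text", ".json": "text",
    ".sql": "text", ".py": "text",
    ".png": "image", ".jpg": "image",
    ".mp4": "video",
    ".wav": "audio",
}

# Converter that turns each modality into text (None = already text).
MODALITY_TO_CONVERTER = {
    "text": None,
    "image": "OCR",
    "video": "VLM",
    "audio": "ASR",
}

def plan_text_extraction(filename):
    """Return (modality, converter) for a file, based on its extension."""
    ext = Path(filename).suffix.lower()
    modality = EXTENSION_TO_MODALITY.get(ext)
    if modality is None:
        raise ValueError(f"unsupported file type: {ext!r}")
    return modality, MODALITY_TO_CONVERTER[modality]

for name in ["report.json", "scan.png", "lecture.mp4", "interview.wav"]:
    print(name, plan_text_extraction(name))
```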
Embeddings = Vector
= List of Numbers
Captures semantic meaning
Dimensions: typically hundreds to thousands
Other formats: .png, .jpg (image) .mp4 (video) .wav (audio)
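Putting embeddings and cosine similarity together gives the retrieval step of RAG. The following is a toy sketch, not a real pipeline: the 3-number vectors are hand-made stand-ins for embeddings (a real embedding model would produce far more dimensions), and `retrieve` and `build_prompt` are hypothetical helper names. The generation step, where a language model answers from the prompt, is left out.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made stand-in embeddings (assumed values, for illustration only).
DOCUMENTS = {
    "Cats are small domesticated felines.": [0.9, 0.1, 0.0],
    "Dogs are loyal companions.": [0.8, 0.3, 0.1],
    "Interest rates rose last quarter.": [0.0, 0.1, 0.9],
}

def retrieve(query_vector, documents, top_k=2):
    """Return the top_k (score, text) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in documents.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

def build_prompt(question, retrieved):
    """Augment the question with the retrieved text as context."""
    context = "\n".join(f"- {text}" for _, text in retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}"

question = "What is a cat?"
query_vector = [0.9, 0.05, 0.0]  # pretend embedding of the question
hits = retrieve(query_vector, DOCUMENTS)
prompt = build_prompt(question, hits)
print(prompt)
```

The finance sentence points in a nearly orthogonal direction to the query, so it scores close to 0 and is not retrieved.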

