Search
   Meaning
      Semantics
         Text
Cosine Similarity
= measures the cosine of the angle between 2 vectors
= S_{C}(A, B)
= \cos(\theta)
= \frac{\text{adjacent}}{\text{hypotenuse}}
= \frac{A \cdot B}{\|A\| \, \|B\|}

\[ S_C(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|} \]

Cosine Similarity
   +1.0 = same meaning     = n-dim-parallel and same dir
   +0.0 = unrelated        = n-dim orthogonal
   -1.0 = opposite meaning = n-dim-parallel and oppo dir
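
A minimal sketch of the three boundary cases (parallel, orthogonal, opposite), using only the Python standard library. The helper name `cosine_similarity` and the 2D sample vectors are illustrative assumptions, not part of the notes; floating-point results may differ from the exact values by a tiny rounding error.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 2D vectors chosen to hit each case.
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])         # parallel, same direction: close to +1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])   # perpendicular: 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])   # parallel, opposite direction: close to -1

print(same, orthogonal, opposite)
```

Note that the vector lengths do not matter: `[1, 2]` and `[2, 4]` differ in magnitude but score as identical in direction.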
Cosine similarity 
   Example in 3D 
      measures how aligned two 3D vectors are 
         the cosine of the angle between them    
Take two 3D vectors:
\[ A = (1, 2, 3), \quad B = (4, 5, 6) \]
1. Dot product:
\[ A \cdot B = 1\cdot4 + 2\cdot5 + 3\cdot6 = 32 \]
2. Norms (lengths):
\[ \|A\| = \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14} \] \[ \|B\| = \sqrt{4^2 + 5^2 + 6^2} = \sqrt{77} \]
3. Cosine similarity:
\[ \cos(\theta) = \frac{A \cdot B}{\|A\|\|B\|} = \frac{32}{\sqrt{14}\sqrt{77}} \approx 0.9746 \]
So the cosine similarity between A and B in 3D 
  is approximately 0.9746
  meaning 
     the angle between them is small and 
     they point in a very similar direction
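
The three steps above can be reproduced in a few lines of Python. This is a sketch using only the standard library; the variable names are illustrative, and the angle in degrees is an extra step not worked out in the notes.

```python
import math

A = (1, 2, 3)
B = (4, 5, 6)

# 1. Dot product: 1*4 + 2*5 + 3*6
dot = sum(a * b for a, b in zip(A, B))

# 2. Norms (lengths): sqrt(14) and sqrt(77)
norm_a = math.sqrt(sum(a * a for a in A))
norm_b = math.sqrt(sum(b * b for b in B))

# 3. Cosine similarity: dot / (|A| * |B|)
cos_theta = dot / (norm_a * norm_b)

# Extra: recover the angle itself, in degrees.
angle_deg = math.degrees(math.acos(cos_theta))

print(dot)                  # 32
print(round(cos_theta, 4))  # 0.9746
print(angle_deg)            # roughly 12.9 degrees
```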
Meaning
   Semantics
      Text
RAG = Retrieval Augmented Generation

Data Modality  |    
   Text        |    
      Image    | OCR
         Video | VLM
      Audio    | ASR
.csv
.xml
.json
SQL
.py
OCR = Optical Character Recognition
ASR = Automatic Speech Recognition
VLM = Vision Language Model
Embeddings = Vector
           = List of Numbers
             Captures semantic meaning
             Dimensions: typically 100s to 1000s
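
A toy sketch of how embeddings and cosine similarity combine for semantic search: rank stored texts by how closely their vectors align with a query vector. Everything here is assumed for illustration. The 3-dimensional vectors are hand-written stand-ins (real embeddings come from a model and have far more dimensions), and the document titles and names `documents`, `query_vector`, and `ranked` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical hand-written "embeddings"; not produced by any real model.
documents = {
    "cat care guide": [0.9, 0.1, 0.0],
    "dog training tips": [0.7, 0.3, 0.1],
    "tax filing deadline": [0.0, 0.1, 0.9],
}

# Stand-in for the embedding of a query about pets.
query_vector = [0.8, 0.2, 0.0]

# Rank document titles from most to least similar to the query.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query_vector, documents[title]),
    reverse=True,
)

for title in ranked:
    print(title, round(cosine_similarity(query_vector, documents[title]), 3))
```

In a RAG setup, the top-ranked texts would then be passed to the generator as context; that step is outside this sketch.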
.png, .jpg
.mp4
.wav