Jaward Sesay

Jaward

https://github.com/Jaykef

Jaykef_

Jaykef

AI & ML interests

I like to train large deep neural nets too 🧠🤖💥 | First Paper (AutoAgents: A Framework for Automatic Agent Generation) Accepted @ IJCAI 2024 | Role Model Karpathy

Articles

On Coding Your First Attention

24 days ago

• 7

Posts 24

Post

Build your own GPT-4 Tokenizer! - @karpathy's minbpe exercise.
Step 1: BasicTokenizer
Got "close" to beating minbpe's train speed :(
Step 2: RegexTokenizer coming soon.

Notes on lessons learned:
- Tokenization is the assembly language of LLMs :)
  It's not a healthy choice to code it lol.
- Encoding can literally drive you mad.
- Merging is where sh*t gets real - moment of truth :)
- Training requires precision.
- Decoding is trivial.
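
To make the notes above concrete, here is a minimal byte-level BPE sketch in the spirit of the BasicTokenizer step. It is not the code from the post or from minbpe; the function names (pair_counts, merge, train, encode, decode) are illustrative assumptions, and it skips regex splitting and special tokens entirely.

```python
from collections import Counter


def pair_counts(ids):
    """Count how often each adjacent pair of token ids occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, vocab_size):
    """Learn merges on raw UTF-8 bytes until the vocab reaches vocab_size."""
    ids = list(text.encode("utf-8"))
    merges = {}                                   # (id, id) -> new id
    vocab = {i: bytes([i]) for i in range(256)}   # id -> bytes
    for new_id in range(256, vocab_size):
        counts = pair_counts(ids)
        if not counts:
            break                                 # nothing left to merge
        pair = max(counts, key=counts.get)        # most frequent pair
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
        vocab[new_id] = vocab[pair[0]] + vocab[pair[1]]
    return merges, vocab


def encode(text, merges):
    """Apply learned merges in the order they were learned (lowest id first)."""
    ids = list(text.encode("utf-8"))
    while len(ids) >= 2:
        counts = pair_counts(ids)
        pair = min(counts, key=lambda p: merges.get(p, float("inf")))
        if pair not in merges:
            break                                 # no learned pair remains
        ids = merge(ids, pair, merges[pair])
    return ids


def decode(ids, vocab):
    """Concatenate the bytes of each token and decode as UTF-8."""
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


text = "aaabdaaabac"
merges, vocab = train(text, 256 + 3)   # learn three merges
ids = encode(text, merges)
assert decode(ids, vocab) == text      # round trip recovers the input
```

The sketch lines up with the notes: decoding is a dictionary lookup and a join, while encoding has to replay merges in training order, which is where most of the subtle bugs tend to hide.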

Post

mlx_micrograd: an MLX port of Karpathy's micrograd, a tiny scalar-valued autograd engine with a small PyTorch-like neural network library on top.

https://github.com/Jaykef/mlx_micrograd

Installation

pip install mlx_micrograd

Example usage

Example showing a number of the supported operations:

from mlx_micrograd.engine import Value

a = Value(-4.0)
b = Value(2.0)
c = a + b
d = a * b + b**3
c += c + 1
c += 1 + c + (-a)
d += d * 2 + (b + a).relu()
d += 3 * d + (b - a).relu()
e = c - d
f = e**2
g = f / 2.0
g += 10.0 / f
print(f'{g.data}') # prints array(24.7041, dtype=float32), the outcome of this forward pass
g.backward()
print(f'{a.grad}') # prints array(138.834, dtype=float32), i.e. the numerical value of dg/da
print(f'{b.grad}') # prints array(645.577, dtype=float32), i.e. the numerical value of dg/db
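
As a sanity check on the gradients above, the same expression can be rewritten with plain Python floats and differentiated numerically. This sketch does not use mlx_micrograd at all; forward, relu, and numeric_grad are hypothetical helper names, and the central-difference step h is an arbitrary choice.

```python
def relu(x):
    """Scalar ReLU on a plain float."""
    return x if x > 0 else 0.0


def forward(a, b):
    """Same arithmetic as the Value example, using plain floats."""
    c = a + b
    d = a * b + b**3
    c += c + 1
    c += 1 + c + (-a)
    d += d * 2 + relu(b + a)
    d += 3 * d + relu(b - a)
    e = c - d
    f = e**2
    g = f / 2.0
    g += 10.0 / f
    return g


def numeric_grad(fn, a, b, h=1e-6):
    """Central finite differences for dg/da and dg/db."""
    dg_da = (fn(a + h, b) - fn(a - h, b)) / (2 * h)
    dg_db = (fn(a, b + h) - fn(a, b - h)) / (2 * h)
    return dg_da, dg_db


g = forward(-4.0, 2.0)
dg_da, dg_db = numeric_grad(forward, -4.0, 2.0)
print(g, dg_da, dg_db)
```

The printed values should agree with g.data, a.grad, and b.grad from the example to roughly three decimal places; small differences are expected because the MLX version runs in float32 and the finite-difference estimate is only approximate.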