Tiny AMP transformer email

Train step

Training window: abc -> d
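
A training window like this one pairs a fixed-length context with the character that follows it. The sketch below shows one way to build such pairs. It assumes a hypothetical training string "abcde" and a context length of 3; the page itself only shows the single window abc -> d, and `make_windows` is an illustrative name.

```python
def make_windows(text, context_len):
    """Return (context, target) pairs from a sliding window over text."""
    return [
        (text[i:i + context_len], text[i + context_len])
        for i in range(len(text) - context_len)
    ]

# Hypothetical training string; only the window abc -> d appears on the page.
windows = make_windows("abcde", 3)
for context, target in windows:
    print(f"{context} -> {target}")
```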

Approx. cross-entropy loss: 1.609438 after 0 updates. Last target: (none yet). Last prediction: (none yet).
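
The starting loss matches what a model scores when it spreads probability evenly over five characters. A quick check, assuming the vocabulary is the five characters a to e listed under Output logits and that the loss uses the natural logarithm:

```python
import math

vocab_size = 5  # assumed: characters a, b, c, d, e
p_target = 1.0 / vocab_size  # uniform probability on the target "d"

# Cross-entropy for a single target is the negative log of its probability.
loss = -math.log(p_target)
print(f"{loss:.6f}")  # prints 1.609438
```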

Loss graph

Attention

pos  token  score  weight
0    a      0      0.333333
1    b      0      0.333333
2    c      0      0.333333

Inference

Current context: abc

Run one full train step before inference.

Generated continuation: (none yet)

Output logits

char  logit  prob
a     0      0.2
b     0      0.2
c     0      0.2
d     0      0.2
e     0      0.2
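
The uniform values in the attention and output tables are what softmax produces when every input is zero. A minimal sketch in plain Python follows. The `softmax` helper is an illustration under that assumption, not necessarily how the page computes its values.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Assumed inputs: the all-zero scores and logits shown in the tables.
attention_weights = softmax([0.0, 0.0, 0.0])       # scores for a, b, c
output_probs = softmax([0.0, 0.0, 0.0, 0.0, 0.0])  # logits for a..e

print([round(w, 6) for w in attention_weights])
print([round(p, 6) for p in output_probs])
```

With equal inputs, every position gets the same share, so three attention positions yield 1/3 each and five output characters yield 0.2 each.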