Aman Sanger
Thoughts
Books
Audio
Lex Fridman Podcast
2024-09
Latent Space Podcast
2023-07
Longer
Llama-2 was expensive
2023-08
Shorter
4-bit weight-quantization is more expensive than 16-bit
2023-08
flash attention isn't helpful when generating tokens
2023-05
llama-1 needs multi-query attention
2023-05
instruction finetuning is underrated
2023-04