Introduction
Background
Model Architecture
Scaled Dot-Product Attention
Multi-Head Attention
Applications of Attention in our Model
Why Self-Attention
Training - TBD
Results
Conclusion
Please write your questions below.