Transformer: Attention is all you need¶
Compared against: recurrent neural networks, long short-term memory, gated recurrent neural networks
attention mechanism
encoder-decoder architectures
Multi-Head Attention (see the sketch after the attention formula below)
stacked self-attention and point-wise, fully connected layers
Q: queries, K: keys, V: values
\[
\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V
\]
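A minimal NumPy sketch of the scaled dot-product attention formula above. The function name and the toy shapes are made up for illustration; only the math (scores, row-wise softmax, weighted sum of values) follows the formula.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v) -> (n_q, d_v)."""
    d_k = Q.shape[-1]
    # Similarity scores between every query and every key, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax: weights in each row sum to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors
    return weights @ V

# Toy example with arbitrary sizes
Q = np.random.randn(2, 4)   # 2 queries, d_k = 4
K = np.random.randn(3, 4)   # 3 keys
V = np.random.randn(3, 8)   # 3 values, d_v = 8
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 8)
```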
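Multi-Head Attention runs several of these attention functions in parallel on projected slices and concatenates the results. A rough self-attention sketch reusing scaled_dot_product_attention from above; the random matrices are only stand-ins for the learned projections W^Q, W^K, W^V, W^O, and all sizes are chosen for illustration.

```python
def multi_head_attention(X, h=2, seed=0):
    """Self-attention over X of shape (n, d_model), split into h heads."""
    n, d_model = X.shape
    d_k = d_model // h
    rng = np.random.default_rng(seed)
    # Random stand-ins for the learned projections W^Q, W^K, W^V, W^O
    W_q, W_k, W_v, W_o = (
        rng.standard_normal((d_model, d_model)) / np.sqrt(d_model) for _ in range(4)
    )
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    heads = []
    for i in range(h):
        sl = slice(i * d_k, (i + 1) * d_k)
        # Each head attends over its own d_k-dimensional slice
        heads.append(scaled_dot_product_attention(Q[:, sl], K[:, sl], V[:, sl]))
    # Concatenate the heads and apply the output projection
    return np.concatenate(heads, axis=-1) @ W_o

X = np.random.randn(5, 8)             # 5 tokens, d_model = 8
print(multi_head_attention(X).shape)  # (5, 8)
```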
QUESTIONS¶
What is sequence transduction?
What are the queries?
Where is the translation result output, and what do the output probabilities represent?