(Transformer: Attention is All You Need)

What is a Transformer?


Positional Encoding

$$ \begin{aligned} \text{PE}_{\text{pos},2i} &= \sin \bigg(\frac{\text{pos}}{10000^{2i/d_{\text{model}}}}\bigg) \\ \text{PE}_{\text{pos},2i+1} &= \cos \bigg(\frac{\text{pos}}{10000^{2i/d_{\text{model}}}}\bigg) \end{aligned} $$

import numpy as np
import tensorflow as tf

def get_angles(pos, i, d_model):
    """
    Computes the angle values that go inside the sin and cos functions.
    """
    angle_rates = 1 / np.power(10000, (2 * (i // 2)) / np.float32(d_model))
    return pos * angle_rates

def positional_encoding(position, d_model):
    """
    Computes the positional encoding.
    """
    angle_rads = get_angles(np.arange(position)[:, np.newaxis],
                            np.arange(d_model)[np.newaxis, :],
                            d_model)

    # apply sin to even indices in the array; 2i
    angle_rads[:, 0::2] = np.sin(angle_rads[:, 0::2])

    # apply cos to odd indices in the array; 2i+1
    angle_rads[:, 1::2] = np.cos(angle_rads[:, 1::2])

    # add a leading batch axis so the encoding broadcasts over a batch of sequences
    pos_encoding = angle_rads[np.newaxis, ...]

    return tf.cast(pos_encoding, dtype=tf.float32)
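As a quick sanity check (a minimal usage sketch, not part of the original code; the sequence length of 50 and d_model of 128 are just example values), calling the function produces a tensor with a leading batch axis:

pos_encoding = positional_encoding(position=50, d_model=128)  # hypothetical example values
print(pos_encoding.shape)  # (1, 50, 128): (batch, positions, d_model)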

Self-Attention