Attention is all you need

A Vaswani, N Shazeer, N Parmar… - … neural information …, 2017 - proceedings.neurips.cc
The dominant sequence transduction models are based on complex recurrent
or convolutional neural networks in an encoder and decoder configuration. The best
performing such models also connect the encoder and decoder through an attention
mechanism. We propose a novel, simple network architecture based solely on an attention
mechanism, dispensing with recurrence and convolutions entirely. Experiments on two
machine translation tasks show these models to be superior in quality while being more …

[CITATION][C] Attention is all you need

V Ashish - Advances in neural information processing systems, 2017 - cir.nii.ac.jp
Attention is All you Need Vaswani Ashish (author) … Advances in Neural Information
Processing Systems 30 I-, 2017