AIFFEL Life

[Day81] Recent Trends in NLP

nevermet 2020. 12. 26. 21:17

Today's material covered recent trends in NLP. It was a chance to study which new models have appeared since the Transformer and in what direction natural language processing is developing.

 

1. Deep contextualized word representations (paper)

arxiv.org/abs/1802.05365
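
This is the ELMo paper; its key point is that a word's vector should depend on the sentence it appears in, so that a polysemous word gets different representations in different contexts. The sketch below is not ELMo itself; it illustrates the same contextual-embedding idea with a BERT model through the Hugging Face transformers library (my own assumption, not something this post uses) by comparing the vector of "bank" in two sentences.

# A minimal sketch (not ELMo): what "contextualized" means in practice.
# Assumes the Hugging Face transformers library and the public
# bert-base-uncased checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the hidden state of `word` (a single-subword token) in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
    word_id = tokenizer.convert_tokens_to_ids(word)
    position = (inputs["input_ids"][0] == word_id).nonzero()[0].item()
    return hidden[position]

v_river = word_vector("She sat on the bank of the river.", "bank")
v_money = word_vector("He deposited the cash at the bank.", "bank")
similarity = torch.cosine_similarity(v_river, v_money, dim=0).item()
print(f"cosine similarity of 'bank' across contexts: {similarity:.3f}")
# A static embedding table (word2vec, GloVe) would give exactly 1.0 here;
# a contextual model gives a noticeably lower value.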

2. Improving Language Understanding by Generative Pre-Training (paper)

s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf

3. Language Models are Unsupervised Multitask Learners (paper)

cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf

4. AllenNLP Demo (a collection of interactive demos of over 20 popular NLP models)

demo.allennlp.org/next-token-lm?text=AllenNLP%20is
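
Items 2 to 4 all revolve around autoregressive language modeling: GPT and GPT-2 are pre-trained simply to predict the next token, and the AllenNLP demo above shows exactly that prediction for the prompt "AllenNLP is". Below is a minimal sketch of the same next-token prediction using the public GPT-2 checkpoint via the Hugging Face transformers library (my assumption; the demo itself serves AllenNLP's own models).

# A minimal next-token-prediction sketch in the spirit of the AllenNLP demo.
# Assumes the Hugging Face transformers library and the public "gpt2" checkpoint.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "AllenNLP is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # (batch, seq_len, vocab_size)

# The logits at the last position score every candidate next token.
next_token_logits = logits[0, -1]
top5 = torch.topk(next_token_logits, k=5)
for score, token_id in zip(top5.values, top5.indices):
    print(f"{tokenizer.decode(int(token_id))!r:>12}  {float(score):.2f}")

GPT-2 is trained on nothing but this objective, which is the sense in which its paper calls language models "unsupervised multitask learners".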

5. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (paper)

arxiv.org/pdf/1810.04805.pdf
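
Where GPT-style models predict only the next token, BERT is pre-trained as a masked language model: some tokens are replaced with [MASK] and predicted from both left and right context. A minimal sketch of that objective at inference time, again assuming the Hugging Face transformers library and the public bert-base-uncased checkpoint:

# Masked-language-model sketch: BERT fills in the [MASK] token from both sides.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill_mask("Pre-training learns general [MASK] representations from raw text."):
    print(f"{prediction['token_str']:>15}  {prediction['score']:.3f}")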

6. Problems with Transformer-based Models (from the blog post "Transformer - Harder, Better, Faster, Stronger")

blog.pingpong.us/transformer-review/#transformer-%EA%B8%B0%EB%B0%98-%EB%AA%A8%EB%8D%B8%EC%9D%98-%EB%AC%B8%EC%A0%9C%EC%A0%90

7. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (paper)

arxiv.org/pdf/1901.02860.pdf

8. XLNet

ratsgo.github.io/natural%20language%20processing/2019/09/11/xlnet/

9. XLNet: Generalized Autoregressive Pretraining for Language Understanding (paper)

arxiv.org/abs/1906.08237
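
XLNet keeps the autoregressive objective but trains over many different factorization orders of the same sequence (permutation language modeling), so every token eventually sees context from both sides without BERT-style [MASK] corruption of the input. The toy snippet below only enumerates a few factorization orders to make that idea concrete; it is not XLNet training code.

# Toy illustration of permutation language modeling: each permutation of the
# positions defines a different autoregressive factorization of one sequence.
from itertools import permutations

tokens = ["New", "York", "is", "a", "city"]
positions = range(len(tokens))

for order in list(permutations(positions))[:3]:  # show 3 of the 5! = 120 orders
    factors = []
    for i, target in enumerate(order):
        context = [tokens[j] for j in order[:i]]
        factors.append(f"P({tokens[target]} | {', '.join(context) or '<empty>'})")
    print(" * ".join(factors))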

10. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations (paper)

arxiv.org/pdf/1909.11942.pdf
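
ALBERT's main contribution is parameter reduction through factorized embedding parameterization and cross-layer parameter sharing, which brings ALBERT-base down to roughly 12M parameters versus about 110M for BERT-base. A minimal sketch that makes the difference visible, assuming the Hugging Face transformers library and the standard public checkpoints:

# Compare parameter counts of BERT-base and ALBERT-base.
from transformers import AlbertModel, BertModel

def count_parameters(model) -> int:
    return sum(p.numel() for p in model.parameters())

bert = BertModel.from_pretrained("bert-base-uncased")
albert = AlbertModel.from_pretrained("albert-base-v2")

print(f"BERT-base parameters  : {count_parameters(bert):,}")    # roughly 110M
print(f"ALBERT-base parameters: {count_parameters(albert):,}")  # roughly 12M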