12/2 Multilingual and Cross-Lingual Analysis of Neural Machine Translation Models (Researcher 김재명/NAVER LABS Europe, France)

Author: kaistsoftware
Posted: 2021-12-01 16:51
  • Speaker: 김재명, Researcher (NAVER LABS Europe, France)
  • Date and time: 2021. 12. 2 (Thu) 17:00~18:30
In this talk, we explore and analyze multilinguality and cross-linguality with respect to neural machine translation (NMT).
Recent analyses of multilingual representations focus on whether language-independent representations emerge, or whether a multilingual model partitions its weights among different languages. While most of this work has been conducted in a "black-box" manner, in this talk we analyze individual components of a multilingual NMT model. In particular, we look for encoder self-attention and encoder-decoder attention heads (in a many-to-one NMT model) that are more specific to the translation of a certain language pair than others by (1) employing metrics that quantify aspects of the attention weights, such as "variance" or "confidence", and (2) systematically ranking the importance of attention heads with respect to translation quality. Surprisingly, we observe that the set of most important attention heads is very similar across language pairs, and that nearly one-third of the less important heads can be removed without greatly hurting translation quality.
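As a rough illustration of head-level attention metrics, the sketch below computes two statistics per head and drops the lowest-scoring heads. The definitions are assumptions, since the abstract does not give them: "confidence" is taken here as the mean of each query's maximum attention weight, and "variance" as the positional variance of each attention distribution. The function names are hypothetical, and ranking by confidence only stands in for the talk's ranking by translation quality, which would require a trained NMT model.

```python
import numpy as np

def head_confidence(attn):
    """attn: array of shape (heads, queries, keys); each row sums to 1.
    Returns, per head, the mean over queries of the maximum weight.
    (Assumed definition of "confidence", not taken from the talk.)"""
    return attn.max(axis=-1).mean(axis=-1)

def head_variance(attn):
    """Per head, the mean over queries of the variance of the attended
    key position under the attention distribution.
    (Assumed definition of "variance", not taken from the talk.)"""
    positions = np.arange(attn.shape[-1])
    mean_pos = (attn * positions).sum(axis=-1, keepdims=True)
    var = (attn * (positions - mean_pos) ** 2).sum(axis=-1)
    return var.mean(axis=-1)

def heads_to_keep(scores, n_prune):
    """Indices of the heads that remain after removing the n_prune
    lowest-scoring heads, returned in ascending index order."""
    order = np.argsort(scores)  # ascending: least important first
    return np.sort(order[n_prune:])

# Toy attention weights: 3 heads, 2 queries, 4 key positions.
attn = np.array([
    [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],          # sharp head
    [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]],  # uniform head
    [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]],          # in between
])
conf = head_confidence(attn)
kept = heads_to_keep(conf, n_prune=len(conf) // 3)  # drop one-third of heads
print(conf, head_variance(attn), kept)
```

In a real model the scores would come from an importance estimate tied to translation quality (for example, the drop in a quality metric when a head is masked), and the kept indices would be used to mask or remove heads.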
Having seen the internals of multilingual NMT models, we now turn to the bilingual (and cross-lingual) data itself. More specifically, we investigate whether discourse relations are preserved across languages, using openly available discourse corpora derived from TED talks. We find that, on average, 68% and 48% of inter-sentential discourse relations are exactly matched across 28 language pairs at the first and second levels of the Penn Discourse Treebank hierarchy, respectively. Motivated by these findings, we perform a preliminary study on the effectiveness of discourse relations when applied to context-aware NMT. Experimental results show that adding discourse information can enhance NMT models' capability to adapt to contextual information and to better handle various discourse phenomena. In addition, we show that constraining different types of discourse relations makes it possible to control the target translation by adding appropriate discourse markers while maintaining translation quality.
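The exact-match rates at the first and second levels of the hierarchy can be pictured with a small sketch. It assumes each relation is labeled with a dot-separated sense string such as "Contingency.Cause", and that relations in the two languages are already aligned one-to-one; the function names and the sample pairs are hypothetical and are not data from the talk.

```python
def sense_prefix(sense, level):
    """Truncate a dot-separated sense label to its first `level` parts,
    e.g. ("Contingency.Cause", 1) -> "Contingency"."""
    return ".".join(sense.split(".")[:level])

def match_rate(pairs, level):
    """Fraction of aligned (source, target) sense pairs whose labels
    agree exactly when truncated to the given hierarchy level."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    matches = sum(
        sense_prefix(src, level) == sense_prefix(tgt, level)
        for src, tgt in pairs
    )
    return matches / len(pairs)

# Hypothetical aligned annotations for one language pair.
pairs = [
    ("Contingency.Cause", "Contingency.Cause"),
    ("Contingency.Cause", "Contingency.Condition"),
    ("Expansion.Conjunction", "Temporal.Asynchronous"),
    ("Comparison.Contrast", "Comparison.Contrast"),
]
print(match_rate(pairs, level=1))  # level-1 agreement
print(match_rate(pairs, level=2))  # level-2 agreement
```

Because a level-2 match implies a level-1 match, the level-2 rate can never exceed the level-1 rate, which is consistent with the 68% and 48% figures reported above.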