12/2 Multilingual and Cross-Lingual Analysis of Neural Machine Translation Models (김재명, Researcher / NAVER LABS Europe, France)

Author: kaistsoftware
Posted: 2021-12-01 16:51
Views: 15245
  • Speaker: 김재명, Researcher (NAVER LABS Europe, France)
  • Date and time: 2021. 12. 2 (Thu) 17:00~18:30
In this talk, we explore and analyze multilinguality and cross-linguality with respect to neural machine translation (NMT).
Recent studies on the analysis of multilingual representations focus on identifying whether language-independent representations emerge, or whether a multilingual model partitions its weights among different languages. While most such work has been conducted in a "black-box" manner, in this talk, we aim to analyze individual components of a multilingual NMT model. In particular, we look for encoder self-attention and encoder-decoder attention heads (in a many-to-one NMT model) that are more specific to the translation of a certain language pair than others by (1) employing metrics that quantify some aspects of the attention weights such as "variance" or "confidence", and (2) systematically ranking the importance of attention heads with respect to translation quality. We observe that, surprisingly, the sets of most important attention heads are very similar across language pairs and that it is possible to remove nearly one-third of the less important heads without greatly hurting translation quality.
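As a rough illustration of the two analysis steps above, the sketch below computes a "confidence" and a "variance" score for a single attention head and then picks the least important heads to remove. The definitions are assumptions made for illustration, not necessarily the ones used in the talk: confidence is taken as the mean of the largest attention weight per query position, variance as the attention-weighted variance of the attended position, and the importance scores in `scores` are invented toy values. The function names are hypothetical.

```python
from typing import Dict, List, Set

# One head's attention: rows are query positions, columns are key positions.
# Each row is assumed to sum to 1.
Matrix = List[List[float]]


def confidence(attn: Matrix) -> float:
    """Mean over query positions of the largest weight in each row (assumed definition)."""
    return sum(max(row) for row in attn) / len(attn)


def positional_variance(attn: Matrix) -> float:
    """Mean over query positions of the attention-weighted variance of the key index
    (assumed definition)."""
    total = 0.0
    for row in attn:
        mean_pos = sum(j * w for j, w in enumerate(row))
        total += sum(w * (j - mean_pos) ** 2 for j, w in enumerate(row))
    return total / len(attn)


def least_important(importance: Dict[str, float], n_drop: int) -> Set[str]:
    """Return the names of the n_drop heads with the lowest importance score."""
    ranked = sorted(importance, key=importance.get)
    return set(ranked[:n_drop])


# Toy data, made up for illustration only.
sharp = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]          # head that focuses on one token
diffuse = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]  # head that spreads its attention
scores = {"h0": 0.8, "h1": 0.1, "h2": 0.5}          # hypothetical importance scores
dropped = least_important(scores, n_drop=1)          # remove one of three heads
```

Under these assumed definitions, a focused head scores high on confidence and low on variance, while a diffuse head does the opposite. In practice the importance scores would come from measuring the change in translation quality when a head is masked.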
Having seen the internals of the multilingual NMT models, we now turn our attention to the bilingual (and cross-lingual) data itself. More specifically, we investigate whether discourse relations are preserved across cross-lingual sentences, using openly available discourse corpora derived from TED talks. We find that, on average, 68% and 48% of inter-sentential discourse relations are exactly matched across 28 language pairs at the first and second levels of the Penn Discourse Treebank hierarchy, respectively. Motivated by these findings, we performed a preliminary study on the effectiveness of discourse relations when applied to context-aware NMT. Experimental results show that adding discourse information can enhance NMT models' capability to adapt to contextual information and better handle various discourse phenomena. In addition, we show that constraining the type of discourse relation makes it possible to control the target translation by adding appropriate discourse markers while maintaining translation quality.
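The exact-match rates at the first and second levels of the hierarchy can be read as follows. This is a minimal sketch that assumes each relation is written as a dot-separated sense label (for example `Comparison.Contrast`) and that aligned relations are given as (source, target) pairs. The pairs below are toy data, not results from the talk, and the function names are hypothetical.

```python
from typing import List, Tuple


def truncate(label: str, level: int) -> str:
    """Keep only the first `level` components of a dot-separated sense label."""
    return ".".join(label.split(".")[:level])


def match_rate(pairs: List[Tuple[str, str]], level: int) -> float:
    """Fraction of aligned relation pairs whose labels agree down to `level`."""
    matches = sum(truncate(src, level) == truncate(tgt, level) for src, tgt in pairs)
    return matches / len(pairs)


# Toy aligned relations (source language, target language), made up for illustration.
pairs = [
    ("Contingency.Cause", "Contingency.Cause"),
    ("Comparison.Contrast", "Comparison.Concession"),
    ("Expansion.Conjunction", "Temporal.Asynchronous"),
    ("Temporal.Synchronous", "Temporal.Synchronous"),
]

level1 = match_rate(pairs, 1)  # compares only the top-level class
level2 = match_rate(pairs, 2)  # compares class and type
```

A pair that matches at the second level always matches at the first, so the second-level rate can never exceed the first-level rate, which is consistent with the 68% and 48% figures above.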