12/2 Multilingual and Cross-Lingual Analysis of Neural Machine Translation Models (김재명, Researcher / NAVER LABS Europe, France)

2021-12-01 16:51
  • Speaker: 김재명, Researcher (NAVER LABS Europe, France)
  • Date and time: Thursday, December 2, 2021, 17:00~18:30
In this talk, we explore and analyze multilinguality and cross-linguality with respect to neural machine translation (NMT).
Recent studies analyzing multilingual representations focus on identifying whether language-independent representations emerge, or whether a multilingual model partitions its weights among different languages. While most such work has been conducted in a "black-box" manner, in this talk we analyze individual components of a multilingual NMT model. In particular, we look for the encoder self-attention and encoder-decoder attention heads (in a many-to-one NMT model) that are more specific to the translation of a certain language pair than others by (1) employing metrics that quantify aspects of the attention weights, such as "variance" or "confidence", and (2) systematically ranking the importance of attention heads with respect to translation quality. Surprisingly, we observe that the set of most important attention heads is very similar across language pairs, and that nearly one-third of the less important heads can be removed without greatly hurting translation quality.
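As a rough illustration of the two steps above, the following Python sketch computes simple attention-head statistics and picks the lowest-ranked heads for removal. The abstract does not define its metrics, so everything here is an assumption: "confidence" is taken to be the average of the largest attention weight per query position, "variance" the spread of the attended source position under each attention distribution, and the per-head importance scores are hypothetical numbers (for example, the drop in translation quality when a head is masked). The function names are illustrative and not the speaker's implementation.

```python
from statistics import mean


def confidence(attn):
    """Assumed definition: mean of the largest attention weight in each query row."""
    return mean(max(row) for row in attn)


def positional_variance(attn):
    """Assumed definition: mean, over query rows, of the variance of the
    attended position under that row's attention distribution."""
    def row_var(row):
        mu = sum(j * w for j, w in enumerate(row))
        return sum(w * (j - mu) ** 2 for j, w in enumerate(row))
    return mean(row_var(row) for row in attn)


def rank_heads(scores):
    """Head indices sorted from most to least important (higher score = more important)."""
    return sorted(range(len(scores)), key=lambda h: scores[h], reverse=True)


def heads_to_prune(scores, n_prune):
    """Indices of the n_prune least important heads, in ascending index order."""
    ranked = rank_heads(scores)
    return sorted(ranked[len(ranked) - n_prune:]) if n_prune > 0 else []


# Toy attention matrices: each row is one query's distribution over source positions.
focused = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
uniform = [[0.5, 0.5], [0.5, 0.5]]

# Hypothetical importance scores for six heads; prune one-third of them.
importance = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2]
pruned = heads_to_prune(importance, len(importance) // 3)
```

Under these assumed definitions, a head that always attends to a single position has maximal confidence and zero positional variance, while a head that spreads its weight evenly scores lower on confidence and higher on variance.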
Having seen the internals of multilingual NMT models, we then turn to the bilingual (and cross-lingual) data itself. More specifically, we investigate whether discourse relations are preserved across languages, using openly available discourse corpora derived from TED talks. We find that, on average, 68% and 48% of inter-sentential discourse relations match exactly across 28 language pairs at the first and second levels of the Penn Discourse Treebank hierarchy, respectively. Motivated by these findings, we performed a preliminary study on the effectiveness of discourse relations in context-aware NMT. Experimental results show that adding discourse information can enhance NMT models' ability to adapt to contextual information and to better handle various discourse phenomena. In addition, we show that constraining the type of discourse relation makes it possible to control the target translation by adding appropriate discourse markers while maintaining translation quality.