NEWS 2016

NEWS 2016: The Sixth Named Entities Workshop

Request Datasets - REGISTRATION IS CLOSED

Institute for Infocomm Research

| Name Origin | Source Script | Target Script | Train Size | Dev Size | Task ID |
|---|---|---|---|---|---|
| Western | English | Chinese | 37K | 2.8K | EnCh |
| Western | Chinese | English | 28K | 2.7K | ChEn |

National Electronics and Computer Technology Center

| Name Origin | Source Script | Target Script | Train Size | Dev Size | Task ID |
|---|---|---|---|---|---|
| Western | English | Thai | 27K | 2.0K | EnTh |
| Western | Thai | English | 25K | 2.0K | ThEn |

Microsoft Research India

| Name Origin | Source Script | Target Script | Train Size | Dev Size | Task ID |
|---|---|---|---|---|---|
| Mixed | English | Hindi | 12K | 1.0K | EnHi |
| Mixed | English | Tamil | 10K | 1.0K | EnTa |
| Mixed | English | Kannada | 10K | 1.0K | EnKa |
| Mixed | English | Bangla | 13K | 1.0K | EnBa |
| Western | English | Hebrew | 9.5K | 1.0K | EnHe |

Sarvnaz Karimi / RMIT

| Name Origin | Source Script | Target Script | Train Size | Dev Size | Task ID |
|---|---|---|---|---|---|
| Western | English | Persian | 10K | 2.0K | EnPe |

The CJK Dictionary Institute

| Name Origin | Source Script | Target Script | Train Size | Dev Size | Task ID |
|---|---|---|---|---|---|
| Western | English | Korean Hangul | 7.0K | 1.0K | EnKo |
| Western | English | Japanese Katakana | 26K | 2.0K | EnJa |
| Japanese | English | Japanese Kanji | 10K | 2.0K | JnJk |
| Arabic | Arabic | English | 27K | 2.5K | ArEn |