A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation

Junyoung Chung, Université de Montréal, junyoung.chung@umontreal.ca
Kyunghyun Cho, New York University
Yoshua Bengio, Université de Montréal, CIFAR Senior Fellow

arXiv:1603.06147v4 [cs.CL] 21 Jun 2016

Abstract

The existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation. In this paper, we ask a fundamental question: can neural machine translation generate a character sequence without any explicit segmentation? To answer this question, we evaluate an attention-based encoder-decoder with a subword-level encoder and a character-level decoder on four language pairs (En-Cs, En-De, En-Ru and En-Fi) using the parallel corpora from WMT'15. Our experiments show that the models with a character-level decoder outperform the ones with a subword-level decoder on all four language pairs. Furthermore, the ensembles of neural models with a character-level decoder outperform the state-of-the-art non-neural machine translation systems on En-Cs, En-De and En-Fi, and perform comparably on En-Ru.

1 Introduction

The existing machine translation systems have relied almost exclusively on word-level modelling with explicit segmentation. This is mainly due to the issue of data sparsity, which becomes much more severe, especially for n-grams, when a sentence is represented as a sequence of characters rather than words, as the length of the sequence grows significantly. In addition to data sparsity, we often have an a priori belief that a word, or its segmented-out lexeme, is a basic unit of meaning, making it natural to approach translation as mapping from a sequence of source-language words to a sequence of target-language words.

This has continued with the more recently proposed paradigm of neural machine translation, although neural networks do not suffer from character-level modelling and rather suffer from issues specific to word-level modelling, such as the increased computational complexity from a very large target vocabulary (Jean et al., 2015; Luong et al., 2015b). Therefore, in this paper, we address the question of whether neural machine translation can be done directly on a sequence of characters without any explicit word segmentation.

To answer this question, we focus on representing the target side as a character sequence. We evaluate neural machine translation models with a character-level decoder on four language pairs from WMT'15 to make our evaluation as convincing as possible. We represent the source side as a sequence of subwords extracted using byte-pair encoding from Sennrich et al. (2015), and vary the target side to be either a sequence of subwords or characters. On the target side, we further design a novel recurrent neural network (RNN), called a bi-scale recurrent network, that better handles multiple timescales in a sequence, and test it in addition to a naive, stacked recurrent neural network.

On all four language pairs (En-Cs, En-De, En-Ru and En-Fi), the models with a character-level decoder outperformed the ones with a subword-level decoder. We observed a similar trend with the ensemble of each of these configurations, outperforming both the previous best neural and non-neural translation systems on En-Cs, En-De and En-Fi, while achieving a comparable result on En-Ru. We find these results to be strong evidence that neural machine translation can indeed learn to translate at the character level and that, in fact, it benefits from doing so.

2 Neural Machine Translation

Neural machine translation refers to a recently proposed approach to machine translation (For-
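The introduction states that the source side is preprocessed into subwords with byte-pair encoding. As a concrete illustration, below is a minimal Python sketch of the BPE merge loop, following the algorithm published by Sennrich et al. (2015): repeatedly find the most frequent adjacent symbol pair and merge it into a single symbol. The toy vocabulary, word frequencies, and merge count are invented for demonstration and are not taken from the paper.

```python
import collections
import re

def get_stats(vocab):
    """Count frequencies of adjacent symbol pairs across the vocabulary."""
    pairs = collections.defaultdict(int)
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[(symbols[i], symbols[i + 1])] += freq
    return pairs

def merge_vocab(pair, vocab):
    """Merge every free-standing occurrence of `pair` into one symbol."""
    bigram = re.escape(' '.join(pair))
    pattern = re.compile(r'(?<!\S)' + bigram + r'(?!\S)')
    return {pattern.sub(''.join(pair), word): freq
            for word, freq in vocab.items()}

# Toy word-frequency table; '</w>' marks the end of a word.
vocab = {'l o w </w>': 5, 'l o w e r </w>': 2,
         'n e w e s t </w>': 6, 'w i d e s t </w>': 3}
num_merges = 10  # illustrative setting, not from the paper
for _ in range(num_merges):
    pairs = get_stats(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_vocab(best, vocab)
    print(best)  # the learned merge, e.g. ('e', 's') then ('es', 't')
```

Each printed pair becomes one entry of the learned merge table; applying the merges in order segments unseen words into subword units, which is how the source-side sequences described above are produced.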
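On the target side, the decoder emits one character per step. The following NumPy sketch shows greedy character-level decoding with a single GRU layer. It is a deliberate simplification of the paper's stacked and bi-scale decoders: the attention context vector that would condition each step on the source is omitted, the weights are random and untrained (so the output is gibberish), and the character inventory and layer sizes are invented for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical character inventory; a real model uses the full target
# character set plus special tokens.
chars = ['<bos>', '<eos>'] + list('abcdefghijklmnopqrstuvwxyz ')
V, E, H = len(chars), 16, 32  # vocab, embedding, hidden sizes (illustrative)

rng = np.random.default_rng(0)
Emb = rng.normal(0, 0.1, (V, E))                                  # embeddings
Wz, Uz = rng.normal(0, 0.1, (H, E)), rng.normal(0, 0.1, (H, H))   # update gate
Wr, Ur = rng.normal(0, 0.1, (H, E)), rng.normal(0, 0.1, (H, H))   # reset gate
Wh, Uh = rng.normal(0, 0.1, (H, E)), rng.normal(0, 0.1, (H, H))   # candidate
Wo = rng.normal(0, 0.1, (V, H))                                   # output layer

def gru_step(x, h):
    """One GRU update; the gates decide how much of the state to overwrite.
    (An attention-based decoder would also feed a source context vector in.)"""
    z = sigmoid(Wz @ x + Uz @ h)
    r = sigmoid(Wr @ x + Ur @ h)
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))
    return (1 - z) * h + z * h_tilde

def greedy_decode(h0, max_len=50):
    """Emit the most probable character at each step until <eos>."""
    h, tok, out = h0, chars.index('<bos>'), []
    for _ in range(max_len):
        h = gru_step(Emb[tok], h)
        tok = int(np.argmax(softmax(Wo @ h)))
        if chars[tok] == '<eos>':
            break
        out.append(chars[tok])
    return ''.join(out)

print(greedy_decode(np.zeros(H)))
```

Because the output unit is a single character, the softmax at each step ranges over only a few dozen symbols rather than a word vocabulary of tens of thousands, which is the computational advantage of character-level decoding discussed in the introduction.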
