{"id":229,"date":"2019-12-19T16:40:58","date_gmt":"2019-12-19T15:40:58","guid":{"rendered":"https:\/\/multi3generation.eu\/?page_id=229"},"modified":"2019-12-19T16:40:58","modified_gmt":"2019-12-19T15:40:58","slug":"wg4","status":"publish","type":"page","link":"https:\/\/multi3generation.inesc-id.pt\/?page_id=229","title":{"rendered":"WG 4 &#8211; Exploiting large knowledge bases and graphs"},"content":{"rendered":"\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-flow wp-block-group-is-layout-flow\">\n<p class=\"wp-block-paragraph\">WG4 focuses on using knowledge bases (KBs) and knowledge graphs in natural language generation, especially for the integration of common sense knowledge and world knowledge.<br>An expected result of WG4 is to increase the variety of knowledge resources and language resources used in NLG.<br>WG4 will analyze how to efficiently integrate multimodal KBs, considering theoretical models of semantics and semantic processing that can accommodate linguistic and perceptual information.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">WG4 members will work on:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>extending existing data-to-text NLG training sets with multilingual and multimodal content<\/li><li>testing the performance of neural NLG models on psycholinguistic datasets<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">If you\u2019d like to join this working group, please get in touch with the WG leader and co-leader: Irene Russo (irene.russo(at)ilc.cnr.it) and Liviu P. Dinu (ldinu(at)fmi.unibuc.ro).<\/p>\n<\/div><\/div>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-flow wp-block-group-is-layout-flow\">\n<h2 class=\"wp-block-heading\">Open source repository<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Data-to-text NLG training datasets<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Data-to-text NLG systems require training data. 
Here we provide a list of freely available datasets that have been created with different methodologies (automatic construction, crowdsourcing, etc.) and for different NLG sub-tasks.<\/p>\n\n\n\n<figure class=\"wp-block-table is-style-stripes\"><table><tbody><tr><td><strong>Name<\/strong><\/td><td><strong>Paper<\/strong><\/td><td><strong>Year<\/strong><\/td><td><strong>Link<\/strong><\/td><\/tr><tr><td>WebNLG 2017<\/td><td>Gardent, C., Shimorina, A., Narayan, S., &amp; Perez-Beltrachini, L. (2017). Creating Training Corpora for NLG Micro-Planners. <em>ACL<\/em>.<\/td><td>2017<\/td><td><a href=\"https:\/\/webnlg-challenge.loria.fr\/challenge_2017\/\">https:\/\/webnlg-challenge.loria.fr\/challenge_2017\/<\/a><\/td><\/tr><tr><td>WebNLG 2020<\/td><td>Gardent, C., Shimorina, A., Narayan, S., &amp; Perez-Beltrachini, L. (2017). Creating Training Corpora for NLG Micro-Planners. <em>ACL<\/em>.<\/td><td>2020<\/td><td><a href=\"https:\/\/webnlg-challenge.loria.fr\/challenge_2020\/\">https:\/\/webnlg-challenge.loria.fr\/challenge_2020\/<\/a><\/td><\/tr><tr><td>KBGen<\/td><td>Banik, E., Gardent, C., &amp; Kow, E. (2013). The KBGen Challenge. <em>ENLG<\/em>.<\/td><td>2013<\/td><td><a href=\"http:\/\/www.kbgen.org\">http:\/\/www.kbgen.org<\/a><\/td><\/tr><tr><td>E2E NLG Challenge<\/td><td>Dusek, O., Novikova, J., &amp; Rieser, V. (2020). Evaluating the State-of-the-Art of End-to-End Natural Language Generation: The E2E NLG Challenge. <em>Comput. Speech Lang., 59<\/em>, 123-156.<\/td><td>2017<\/td><td><a href=\"http:\/\/www.macs.hw.ac.uk\/InteractionLab\/E2E\/\">http:\/\/www.macs.hw.ac.uk\/InteractionLab\/E2E\/<\/a><\/td><\/tr><tr><td>MultiWOZ 2.2<\/td><td>Zang, X., Rastogi, A., Zhang, J., &amp; Chen, J. (2020). MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines. 
<em>ArXiv, abs\/2007.12720<\/em>.<\/td><td>2020<\/td><td><a href=\"https:\/\/github.com\/budzianowski\/multiwoz\">https:\/\/github.com\/budzianowski\/multiwoz<\/a><\/td><\/tr><tr><td>ToTTo<\/td><td>Parikh, Ankur P., et al. &#8220;ToTTo: A Controlled Table-To-Text Generation Dataset.&#8221; arXiv preprint arXiv:2004.14373 (2020).<\/td><td>2020<\/td><td><a href=\"https:\/\/paperswithcode.com\/dataset\/totto\">https:\/\/paperswithcode.com\/dataset\/totto<\/a>&nbsp;<\/td><\/tr><tr><td>RotoWire<\/td><td>Wiseman, Sam, Stuart M. Shieber, and Alexander M. Rush. \u201cChallenges in Data-to-Document Generation.\u201d Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017.<\/td><td>2017<\/td><td><a href=\"https:\/\/github.com\/harvardnlp\/boxscore-data\/blob\/master\/rotowire.tar.bz2\">https:\/\/github.com\/harvardnlp\/boxscore-data\/blob\/master\/rotowire.tar.bz2<\/a><\/td><\/tr><tr><td>WikiBio<\/td><td>Lebret, R\u00e9mi, David Grangier, and Michael Auli. &#8220;Neural text generation from structured data with application to the biography domain.&#8221; arXiv preprint arXiv:1603.07771 (2016).<\/td><td>2016<\/td><td><a href=\"https:\/\/paperswithcode.com\/dataset\/wikibio\">https:\/\/paperswithcode.com\/dataset\/wikibio<\/a>&nbsp;<\/td><\/tr><tr><td>WEATHER GOV<\/td><td><\/td><td><\/td><td><\/td><\/tr><tr><td>ROBOCUP<\/td><td><\/td><td><\/td><td><\/td><\/tr><tr><td>Logic2Text<\/td><td>Chen, Zhiyu, et al. &#8220;Logic2Text: High-Fidelity Natural Language Generation from Logical Forms.&#8221; arXiv preprint arXiv:2004.14579 (2020).<\/td><td>2020<\/td><td><a href=\"https:\/\/paperswithcode.com\/dataset\/logic2text\">https:\/\/paperswithcode.com\/dataset\/logic2text<\/a>&nbsp;<\/td><\/tr><tr><td>DART<\/td><td>Nan, Linyong, et al. 
&#8220;Dart: Open-domain structured data record to text generation.&#8221; arXiv preprint arXiv:2007.02871 (2020).<\/td><td>2020<\/td><td><a href=\"https:\/\/paperswithcode.com\/dataset\/dart\">https:\/\/paperswithcode.com\/dataset\/dart<\/a>&nbsp;<\/td><\/tr><tr><td>ENT-DESC<\/td><td>Cheng, Liying, et al. &#8220;ENT-DESC: Entity Description Generation by Exploring Knowledge Graph.&#8221; arXiv preprint arXiv:2004.14813 (2020).<\/td><td>2020<\/td><td><a href=\"https:\/\/paperswithcode.com\/dataset\/ent-desc\">https:\/\/paperswithcode.com\/dataset\/ent-desc<\/a>&nbsp;<\/td><\/tr><tr><td>GEM (Generation, Evaluation, and Metrics)<\/td><td>Gehrmann, Sebastian, et al. &#8220;The gem benchmark: Natural language generation, its evaluation and metrics.&#8221; arXiv preprint arXiv:2102.01672 (2021).<\/td><td>2021<\/td><td><a href=\"https:\/\/paperswithcode.com\/dataset\/gem\">https:\/\/paperswithcode.com\/dataset\/gem<\/a>&nbsp;<\/td><\/tr><\/tbody><\/table><\/figure>\n<\/div><\/div>\n\n\n\n<h2 class=\"wp-block-heading\"> CA18231 Meeting <\/h2>\n\n\n\n<div class=\"wp-block-file\"><a href=\"https:\/\/multi3generation.inesc-id.pt\/wp-content\/uploads\/2021\/11\/WG4-MC-Meeting_december_2020.pdf\">WG4-MC-Meeting_december_2020<\/a><a href=\"https:\/\/multi3generation.inesc-id.pt\/wp-content\/uploads\/2021\/11\/WG4-MC-Meeting_december_2020.pdf\" class=\"wp-block-file__button\" download>Download<\/a><\/div>\n\n\n\n<div class=\"wp-block-file\"><a href=\"https:\/\/multi3generation.inesc-id.pt\/wp-content\/uploads\/2021\/11\/WG4-MC-Meeting_october_2021.pdf\">WG4-MC-Meeting_october_2021<\/a><a href=\"https:\/\/multi3generation.inesc-id.pt\/wp-content\/uploads\/2021\/11\/WG4-MC-Meeting_october_2021.pdf\" class=\"wp-block-file__button\" download>Download<\/a><\/div>\n","protected":false},"excerpt":{"rendered":"<p>WG4 focuses on using knowledge bases (KBs) and knowledge graphs in natural language generation, especially for the integration of common sense knowledge and world knowledge.An 
&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":175,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-229","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/multi3generation.inesc-id.pt\/index.php?rest_route=\/wp\/v2\/pages\/229","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/multi3generation.inesc-id.pt\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/multi3generation.inesc-id.pt\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/multi3generation.inesc-id.pt\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/multi3generation.inesc-id.pt\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=229"}],"version-history":[{"count":0,"href":"https:\/\/multi3generation.inesc-id.pt\/index.php?rest_route=\/wp\/v2\/pages\/229\/revisions"}],"up":[{"embeddable":true,"href":"https:\/\/multi3generation.inesc-id.pt\/index.php?rest_route=\/wp\/v2\/pages\/175"}],"wp:attachment":[{"href":"https:\/\/multi3generation.inesc-id.pt\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=229"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}