Understanding and Generating Multi-sentence Texts
Rik Koncel-Kedziorski
Published
University of Washington Libraries, 2019
URL
http://books.google.com.hk/books?id=kYwXywEACAAJ&hl=&source=gbs_api
Abstract
English text is typically organized into units of multiple sentences, yet synthesizing information across sentence boundaries, whether for understanding or for generation, remains a difficult challenge for natural language processing algorithms. Techniques for such synthesis, however, are necessary for improved language understanding and have the potential to transform downstream applications including dialog systems, question answering, and educational technologies. In this thesis, I investigate techniques for understanding and generating multi-sentence natural language texts.

As a first step toward general cross-sentence reasoning, I describe a model for solving open-world math word problems. The model treats word problem texts as semantically enhanced equation trees built from a recursive semantic structure over quantities, and it generates candidate solutions with an integer linear programming approach. Local and global information from the text is combined, and the model learns from data how to select the maximum-scoring tree to answer each problem.

Continuing in the math word problem domain, I present an editing method for automatically customizing math word problems to meet thematic constraints. This technique preserves the complex document structure of human-authored text, editing in a globally coherent and syntactically informed way. The model also improves on previous thematic generation approaches by automatically building an understanding of a theme from an arbitrary text. By reusing the existing syntactic and semantic relationships of the human-authored text to preserve its mathematical meaning, my method can produce novel and coherent thematic word problems in English.

Finally, I outline a model for generating multi-sentence texts from knowledge graphs using a novel neural encoding, and I provide evidence that knowledge graphs can help structure the generation of longer English texts. My graph transforming encoder extends the recent transformer model for text encoding to graph-structured inputs. The overall model learns to encode the input graph and produce the output text in an end-to-end fashion. Human and automatic evaluations show that relational knowledge improves the generated text.
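To make the equation-tree idea concrete, here is a minimal illustrative sketch in Python. It is not the thesis's implementation: the `Quantity` and `Op` structures are hypothetical names, standing in for a tree whose leaves pair numbers with textual units from the problem and whose internal nodes are arithmetic operators; a recursive evaluator reduces the tree to an answer.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Quantity:
    value: float
    unit: str          # unit drawn from the problem text, e.g. "apples"

@dataclass
class Op:
    op: str            # one of "+", "-", "*", "/"
    left: "Node"
    right: "Node"

Node = Union[Quantity, Op]

def evaluate(node: Node) -> float:
    """Recursively evaluate an equation tree to a numeric answer."""
    if isinstance(node, Quantity):
        return node.value
    l, r = evaluate(node.left), evaluate(node.right)
    return {"+": l + r, "-": l - r, "*": l * r, "/": l / r}[node.op]

# "Mary has 5 apples and buys 3 more. How many apples does she have?"
tree = Op("+", Quantity(5, "apples"), Quantity(3, "apples"))
print(evaluate(tree))  # 8.0
```

In the thesis, many such candidate trees are scored against local and global textual evidence and the maximum-scoring tree is selected; the sketch above shows only the representation and its evaluation.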
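The core idea behind a graph transforming encoder, self-attention restricted to knowledge-graph neighbors, can likewise be sketched. The following is a simplified single-head illustration assuming PyTorch; the class name, dimensions, and self-loop convention are assumptions for this sketch, and the full model in the thesis is more elaborate (e.g., multi-head attention and decoding to text).

```python
import torch
import torch.nn as nn

class GraphAttention(nn.Module):
    """Single-head graph-masked self-attention: each entity vector
    attends only to its neighbors in the knowledge graph."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (n, dim) entity embeddings
        # adj: (n, n) boolean adjacency, True where an edge exists
        scores = self.q(x) @ self.k(x).T / x.size(-1) ** 0.5
        scores = scores.masked_fill(~adj, float("-inf"))
        return torch.softmax(scores, dim=-1) @ self.v(x)

# Toy graph with 3 entities; self-loops ensure every node attends to itself.
x = torch.randn(3, 16)
adj = torch.tensor([[1, 1, 0], [1, 1, 1], [0, 1, 1]], dtype=torch.bool)
print(GraphAttention(16)(x, adj).shape)  # torch.Size([3, 16])
```

Masking the attention scores with the adjacency matrix is what specializes the transformer to graph-structured input; stacking such layers lets information propagate along multi-hop paths in the graph.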