Our R+D+i in Artificial Intelligence for Writing
To answer our research questions on Artificial Intelligence for Writing, we apply state-of-the-art machine learning techniques, applied linguistic research, and expert knowledge of scientific writing to develop new models, functions, and algorithms in the field of artificial intelligence.
We seek to comprehensively aid writers during the entire writing process. This goal will be achieved through our applied research, development, and innovation (R+D+i), merging the latest technological advances in artificial intelligence for writing with established writing guidelines.
Our R+D+i is embodied in WriteWise, a unique software tool that will modernize writing by reducing the time and effort writers need to finish their texts.
Research Lines in Artificial Intelligence for Academic Writing
Natural Language Processing Applied to Writing
We combine machine learning and computational linguistics within the framework of natural language processing, as applied to modelling and revising the writing process and scientific texts.
This line of research applies the following methodologies:
1. Novel approaches for representing textual data from scientific articles:
- Word embeddings combined with deep/machine learning models for natural language processing tasks.
- Graph-based representations
2. Novel computational approaches for analyzing scientific articles, with specific investigative focus on:
- Discourse Segmentation
- Automatic Punctuation Analysis
- Rule-based Text Mining
- Topic Modelling
- Readability/Coherence Classification
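As an illustration of the graph-based representations listed above, the following minimal Python sketch builds a word co-occurrence graph from a piece of text. It is an assumption-laden example, not the WriteWise implementation: the function name `cooccurrence_graph`, the `window` parameter, and the simple regular-expression tokenizer are all hypothetical choices made for this sketch.

```python
import re
from collections import defaultdict


def cooccurrence_graph(text, window=2):
    """Build an undirected word co-occurrence graph.

    Hypothetical helper for illustration only. Two words are linked when
    they appear within `window` tokens of each other; the edge weight
    counts how often that happens.
    Returns {word: {neighbor: weight}}.
    """
    # Naive tokenizer (an assumption): lowercase alphabetic runs only.
    tokens = re.findall(r"[a-z]+", text.lower())
    graph = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        # Look ahead at most `window` tokens to pair each word once.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word][other] += 1
                graph[other][word] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {w: dict(nbrs) for w, nbrs in graph.items()}


graph = cooccurrence_graph("Cells divide and cells grow", window=1)
print(graph["cells"])
```

A dictionary of dictionaries keeps the sketch free of external dependencies; a real pipeline would more likely use a dedicated graph library and a linguistically informed tokenizer.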
Rhetoric-Discourse and Lexical-Grammar in Artificial Intelligence Applied for Writing
We use functional and applied discursive frameworks, combined with corpus analysis, computational linguistics, and natural language processing approaches, to empirically determine the discursive and linguistic norms and requirements of academic and scientific texts.
This line of research seeks to identify and comprehend the:
1. Communicative purposes and lexical-grammar features that constitute written texts in distinct scientific disciplines.
2. Textual and discursive foundations of academic and scientific texts.
Publications in Artificial Intelligence for Writing
▷ A novel machine learning model that guides graduate students to write more organized and structured texts
Javier Vera, Hector Allende-Cid, René Venegas, Sebastián Rodríguez, Wenceslao Palma, Sofía Zamora, Fernando Lillo, Humberto González, Ashley Van Cott, and Eduardo N. Fuentes. 2018. Molecular Biology of the Cell, 29:26
Academic writing is one of the most valuable skills a scientist can develop. A primary challenge for graduate students is to coherently and concisely organize and present ideas within a manuscript. Writing a quality research manuscript requires transmitting the most relevant information through precise sentences that fulfill diverse communicational roles, ultimately resulting in a coherent, understandable text connected by cohesive mechanisms (e.g. lexical relationships between pairs of terms). Despite technological advances, the execution and teaching of the writing process have not similarly advanced. Therefore, a top priority for graduate programs is to implement new methodologies and technologies that aid students in communicating research advances. Through our investigation, we developed a novel, unsupervised machine-learning model applied to cell biology and biomedical texts that guides students in writing better organized and more structured texts.
▷ Revealing the collaborative dynamics of large-scale arXiv text collection by means of k-shell decomposition
Javier Vera, Wenceslao Palma, Hector Allende, Sebastian Rodriguez, Juan Pavez, and Eduardo Fuentes. 2019. NetSci-X: International Conference on Network Science.
This work showed how k-shell decomposition helps to explain the dynamics behind the formation of the decentralized, collaborative language community defined by the electronic repository arXiv. Our results suggest that several global patterns emerge from the microscopic activity of users sharing content. The growth of the text collection (and therefore of the associated networks) was almost completely governed by the outermost k-shells, which grew exponentially in size over time. Nevertheless, the densest set of nodes, $S_{k_{\max}}$, tended to grow only linearly. This points to an exponential accumulation of words that forces changes in the main discipline (computer science, in our case), represented by $S_{k_{\max}}$. These observations were confirmed by the behavior of the (normalized) critical index $k^{*} = \arg\max_{k} |S_k|$, which shifts exponentially toward the outermost network layers. Further study should describe the relationship between the index $k$ and the number of connected components of the k-shell $S_k$. Moreover, it is plausible that the decentralized features of arXiv appear precisely at those external layers.
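To make the quantities in this abstract concrete, the following minimal Python sketch computes the k-shells of a small undirected graph and the index of the largest shell. It is an illustration under stated assumptions, not the authors' code: the function names (`core_numbers`, `k_shells`, `critical_index`) and the toy graph are hypothetical, the standard "peel the lowest-degree node" procedure is used, and the raw index is returned because the abstract does not specify how the critical index is normalized.

```python
def core_numbers(adj):
    """Return {node: core number} for an undirected graph.

    `adj` maps each node to the set of its neighbors. Nodes are peeled
    in order of lowest remaining degree; a node's core number is the
    largest minimum degree seen up to the moment it is removed.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # lowest remaining degree
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def k_shells(adj):
    """Group nodes by core number: shell S_k holds nodes with core number k."""
    shells = {}
    for v, k in core_numbers(adj).items():
        shells.setdefault(k, set()).add(v)
    return shells


def critical_index(shells):
    """Return the k with the largest shell (unnormalized; ties are not resolved specially)."""
    return max(shells, key=lambda k: len(shells[k]))


# Toy graph (an assumption): a triangle a-b-c with two pendant nodes on a.
toy = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": {"a"},
}
shells = k_shells(toy)
print(shells, critical_index(shells))
```

In this toy graph the pendant nodes form the 1-shell and the triangle forms the 2-shell, which is also the densest set ($k_{\max} = 2$). The quadratic-time peeling loop is chosen for readability; large text networks would call for a linear-time bucket-based implementation.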