
Multitask Prompted Training Enables Zero-Shot Task Generalization

Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language models’ …
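As a rough illustration of the paradigm this abstract studies (casting tasks as natural-language prompts and querying a model zero-shot), here is a minimal sketch using Hugging Face Transformers. The checkpoint name and the prompt template are assumptions chosen for illustration, not details taken from the paper.

```python
# Sketch of zero-shot prompted inference with a text-to-text model.
# Checkpoint and prompt wording are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "bigscience/T0_3B"  # assumed checkpoint; any seq2seq LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Cast an unseen task (here, NLI) as a natural-language prompt.
prompt = (
    "Premise: A man is playing a guitar on stage.\n"
    "Hypothesis: A musician is performing.\n"
    "Does the premise entail the hypothesis? Answer yes or no."
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```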

DRIFT: A Toolkit for Diachronic Analysis of Scientific Literature

In this work, we present to the NLP community, and to the wider research community as a whole, an application for the diachronic analysis of research corpora. We open-source an easy-to-use tool coined DRIFT, which allows researchers to track …

Improving Reading Comprehension with Abstract Words using Augmentation, Linguistic Features and Voting

We present our approaches and methods for SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning. Given a fill-in-the-blank question and a corresponding context, the task is to predict the most suitable word from a list of five options. …
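One common way to realize this task format, sketched below under assumptions not stated in the abstract, is to score each candidate word with a masked language model at the blank position and pick the highest-probability fill. The context, candidates, and bert-base-uncased checkpoint here are all illustrative.

```python
# Score five candidate fills with a masked LM; pick the most probable.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

context = "The results were hard to interpret without a clear [MASK]."
candidates = ["theory", "banana", "framework", "window", "melody"]

inputs = tokenizer(context, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Rank candidates by the model's logit at the masked position.
# (Single-token candidates assumed; multi-token words need extra handling.)
scores = {w: logits[tokenizer.convert_tokens_to_ids(w)].item() for w in candidates}
print(max(scores, key=scores.get))
```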

Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques

Toxicity detection in text has been a popular NLP task in recent years. In SemEval-2021 Task 5: Toxic Spans Detection, the focus is on detecting toxic spans within English passages. Most state-of-the-art span detection approaches employ various …
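A minimal sketch of the token-classification framing the abstract mentions: tag each token as toxic or not, then map predictions back to character offsets. The checkpoint and the two-label head below are assumptions; a real system would be fine-tuned on the task's span annotations before its predictions mean anything.

```python
# Token classification for toxic spans: per-token labels -> char offsets.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=2  # 0 = non-toxic, 1 = toxic (assumed scheme)
)

text = "You are a complete idiot and everyone knows it."
enc = tokenizer(text, return_offsets_mapping=True, return_tensors="pt")
offsets = enc.pop("offset_mapping")[0]

with torch.no_grad():
    preds = model(**enc).logits.argmax(-1)[0]

# Collect character offsets of tokens predicted toxic (meaningful after fine-tuning).
toxic_chars = [
    c for (start, end), p in zip(offsets.tolist(), preds.tolist())
    if p == 1 for c in range(start, end)
]
print(toxic_chars)
```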

Document Ranking with XLNet-Based Models

Establishing a good information retrieval system for popular entertainment media is a quickly growing area of investigation for companies and researchers alike. We delve into the domain of information retrieval for podcasts. In Spotify’s Podcast …
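As a hedged sketch of one standard XLNet-based ranking setup (a cross-encoder that scores query-segment pairs and sorts by relevance), not necessarily this paper's exact method: the checkpoint, label scheme, and example texts below are assumptions, and the classification head would need fine-tuning on relevance labels.

```python
# Cross-encoder ranking with XLNet: score (query, segment) pairs, then sort.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xlnet-base-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlnet-base-cased", num_labels=2  # relevant vs. not relevant (assumed)
)

query = "interviews about the history of jazz"
segments = [
    "In this episode we talk with a historian of early jazz...",
    "Today's show covers the latest smartphone releases...",
]

# Encode the query paired with each candidate segment.
enc = tokenizer([query] * len(segments), segments,
                padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    scores = model(**enc).logits.softmax(-1)[:, 1]  # P(relevant)

ranked = sorted(zip(segments, scores.tolist()), key=lambda x: -x[1])
for seg, s in ranked:
    print(f"{s:.3f}  {seg[:50]}")
```

A cross-encoder like this reads the query and segment jointly, which typically ranks better than scoring them with separate encoders, at the cost of one forward pass per candidate.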