Category: corpus linguistics
-
The Frontiers of Discourse: Media, War, and Identity in Crisis Contexts
The conference Discourse Across Cultures was held at Transilvania University of Brașov on March 21-22 and centered on two key words: discourse and culture. The event focused on cross-cultural discourse patterns as well as on specific communicative practices within cultures, and it offered participation opportunities to researchers interested in diverse research areas: linguistics,…
-
The uses of topic modeling in CORECON: Potentials and challenges
Topic modeling is an umbrella term used to denote a host of semi-automated or fully automated corpus linguistic methods that aim to map the content of texts by identifying dominant themes. When driven by an algorithm that is trained in either a supervised or unsupervised manner on a dataset, the method makes it possible to…
-
Corpus compilation and data annotation protocols in CORECON
Data collection is the basis of any empirical study, and corpus analysis is a well-established method within critical discourse studies. In his blog post, Jędrzej Olejniczak wrote about data collection, management and processing in CORECON. Here I explain the motivations behind our methodological choices and the rationale for the data collection and data annotation protocols used when…
-
How we collect, manage and process our research material for CORECON
The research carried out within CORECON is based on a large database that consists of hundreds of news articles and social media entries in Romanian, Polish and English. These constitute a corpus (plural: corpora): a large set of linguistic data (e.g., collected from news outlets and social media) that can be processed with the use…