Category: corpus linguistics

The Frontiers of Discourse: Media, War, and Identity in Crisis Contexts

Apr 6, 2025

—

by

grant.corecon@ulbsibiu.ro

in conference, corpus linguistics, Iulia-Maria Ticărău, research

The conference Discourse Across Cultures was held at Transilvania University of Brașov on March 21-22 and centered on two key words: discourse and culture. The event focused on cross-cultural discourse patterns and on specific communicative practices within cultures, and it was open to researchers interested in diverse research areas: linguistics,…
The uses of topic modeling in CORECON: Potentials and challenges

Feb 10, 2025

—

by

grant.corecon@ulbsibiu.ro

in corpus linguistics, Katarzyna Molek-Kozakowska, methodology

Topic modeling is an umbrella term used to denote a host of semi-automated or fully automated corpus linguistic methods that aim to map the content of texts by identifying dominant themes. When driven by an algorithm that is trained in either a supervised or unsupervised manner on a dataset, the method makes it possible to…
Corpus compilation and data annotation protocols in CORECON

Jan 24, 2025

—

by

grant.corecon@ulbsibiu.ro

in corpus linguistics, Marcin Deutschmann, methodology, research

Data collection is the basis of any empirical study, and corpus analysis is a well-established method within critical discourse studies. In his blog post, Jędrzej Olejniczak wrote about data collection, management and processing in CORECON. Here I explain the motivations behind our methodological choices and the rationale for the data collection and data annotation protocols used when…
How we collect, manage and process our research material for CORECON

Oct 28, 2024

—

by

grant.corecon@ulbsibiu.ro

in corpus linguistics, Jędrzej Olejniczak

The research carried out within CORECON is based on a large database that consists of hundreds of news articles and social media entries in Romanian, Polish and English. These constitute a corpus (plural: corpora): a large set of linguistic data (e.g., collected from news outlets and social media) that can be processed with the use…
