Browse by author "Petukhova, Alina"
Showing 1 - 3 of 3
Item
MN-DS: A Multilabeled News Dataset for News Articles Hierarchical Classification
(Multidisciplinary Digital Publishing Institute (MDPI), 2023-04-23)
Petukhova, Alina; Fachada, Nuno; FE - Faculty of Engineering; COPELABS - Cognitive and People-centric Computing
This article presents a dataset of 10,917 news articles with hierarchical news categories collected between 1 January 2019 and 31 December 2019. We manually labeled the articles based on a hierarchical taxonomy with 17 first-level and 109 second-level categories. This dataset can be used to train machine learning models for automatically classifying news articles by topic. It can also be helpful for researchers working on news structuring, classification, and predicting future events based on released news.
Keywords: news dataset; text classification; NLP; media topic taxonomy

Item
Retail system scenario modeling using fuzzy cognitive maps
(Multidisciplinary Digital Publishing Institute (MDPI), 2022)
Petukhova, Alina; Fachada, Nuno; FE - Faculty of Engineering
A retail business is a network of similar-format grocery stores with a sole proprietor and a well-established logistical infrastructure. The retail business is a stable market, with low growth, limited customer revenues, and intense competition. On the system level, the retail industry is a dynamic system that is challenging to represent due to uncertainty, nonlinearity, and imprecision. Due to the heterogeneous character of retail systems, direct scenario modeling is arduous. In this article, we propose a framework for retail system scenario planning that allows managers to analyze the effect of different quantitative and qualitative factors using fuzzy cognitive maps. Previously published fuzzy retail models were extended by adding external factors and combining expert knowledge with domain research results. We determined the most suitable composition of fuzzy operators for the retail system, highlighted the system's most influential concepts, and showed how the system responds to changes in external factors. The proposed framework aims to support senior management in conducting flexible long-term planning of a company's strategic development and in reaching its desired business goals.
Keywords: retail; complex systems; fuzzy cognitive maps; scenario planning

Item
TextCL: a Python package for NLP preprocessing tasks
(Elsevier B.V., 2022-07-01)
Petukhova, Alina; Fachada, Nuno; FE - Faculty of Engineering
Preprocessing text data sets for use in Natural Language Processing tasks is usually a time-consuming and expensive effort. Text data, normally obtained from sources such as, but not limited to, web scraping, scanned documents, or PDF files, is typically unstructured and prone to artifacts and other types of noise. The goal of the TextCL package is to simplify this process by providing multiple methods suited for text data preprocessing. It includes functionality for splitting texts into sentences, filtering sentences by language, perplexity filtering, and removing duplicate sentences. The TextCL package also offers an outlier detection module, which identifies and filters out texts that differ from the main topic distribution of the data set. This module lets the user select one of several unsupervised outlier detection algorithms, such as TONMF (block coordinate descent framework), RPCA (robust principal component analysis), or SVD (singular value decomposition), and apply it to the text data.
Keywords: natural language processing; text filtering; outlier detection