MN-DS: A Multilabeled News Dataset for News Articles Hierarchical Classification

dc.contributor.author: Petukhova, Alina
dc.contributor.author: Fachada, Nuno
dc.contributor.institution: FE - Faculty of Engineering
dc.contributor.institution: COPELABS - Cognitive and People-centric Computing
dc.date.issued: 2023-04-23
dc.description: Data
dc.description.abstract: This article presents a dataset of 10,917 news articles with hierarchical news categories collected between 1 January 2019 and 31 December 2019. We manually labeled the articles based on a hierarchical taxonomy with 17 first-level and 109 second-level categories. This dataset can be used to train machine learning models for automatically classifying news articles by topic. This dataset can be helpful for researchers working on news structuring, classification, and predicting future events based on released news. Keywords: news dataset; text classification; NLP; media topic taxonomy
dc.description.sponsorship: This research was funded by Fundação para a Ciência e Tecnologia under Project UIDB/04111/2020 (COPELABS).
dc.format: application/pdf
dc.identifier.citation: Petukhova, A & Fachada, N 2023, 'MN-DS: A Multilabeled News Dataset for News Articles Hierarchical Classification', Data, vol. 8, no. 5, 74. https://doi.org/10.3390/data8050074
dc.identifier.doi: https://doi.org/10.3390/data8050074
dc.identifier.issn: 2306-5729
dc.identifier.url: https://www.scopus.com/pages/publications/85160203624
dc.language.iso: eng
dc.peerreviewed: no
dc.publisher: Multidisciplinary Digital Publishing Institute (MDPI)
dc.relation.ispartof: Data
dc.rights: openAccess
dc.subject: DATA COLLECTION
dc.subject: NEWS
dc.subject: NATURAL LANGUAGE PROCESSING
dc.subject: MEDIA
dc.subject: DATA PROCESSING
dc.subject: TAXONOMY
dc.subject: COMPUTER SCIENCE
dc.title: MN-DS: A Multilabeled News Dataset for News Articles Hierarchical Classification
dc.type: article

Files

Main
Name: 2023_data_mnds.pdf
Size: 1.17 MB
Format: Adobe Portable Document Format

License
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed to upon submission