Trip-MAML: Multi-Aspect Multi-Lingual review dataset

Human Language Technologies (HLT), Istituto di Scienza e Tecnologie dell'Informazione "A. Faedo", Consiglio Nazionale delle Ricerche - Pisa, Italy

Universidad de Jaén - Jaén, Spain


Overview

Trip-MAML is a Multi-Aspect Multi-Lingual dataset for aspect-oriented opinion mining, annotated at the sentence level and consisting of TripAdvisor hotel reviews in English, Italian, and Spanish.

This dataset is an extension of the Trip-MA dataset, which covers the English part of the corpus and consists of 442 hotel reviews. The Trip-MAML extension adds 500 Italian and 500 Spanish hotel reviews, manually annotated at the sentence level with Multi-Aspect sentiment labels by strictly following the same annotation protocol.

A more detailed description of the dataset and its annotation process can be found in [1, 2].

Download

To download the corpus, use the following links:

Publications

If you use the dataset in your research, please cite it as:

  1. Jiménez Zafra, S. M., Berardi, G., Esuli, A., Marcheggiani, D., Martín-Valdivia, M. T., & Moreo Fernández, A. (2015). A multi-lingual annotated dataset for aspect-oriented opinion mining. In EMNLP 2015. [paper] [bib]
  2. Marcheggiani, D., Täckström, O., Esuli, A., & Sebastiani, F. (2014). Hierarchical multi-label conditional random fields for aspect-oriented opinion mining. In Advances in Information Retrieval (pp. 273-285). Springer International Publishing. [paper] [bib]

For any questions, contact S. M. Jiménez (sjzafra@ujaen.es) or A. Moreo (alejandro.moreo@isti.cnr.it).