Modeling Complex Interactions in Long Documents for Aspect-Based Sentiment Analysis

Zehong Yan1, Wynne Hsu1, Mong Li Lee1, David Roy Bartram-Shaw2
1NUS Centre for Trusted Internet & Community, National University of Singapore
2Edelman Data & Intelligence

WASSA Workshop, ACL, 2024 / PDF / Project Page / Code / Data

We introduce DART, a hierarchical Transformer-based framework for aspect-based sentiment analysis in long documents. DART handles the complexities of longer texts through its global context interaction and two-level aspect-specific aggregation blocks. For empirical validation, we curate two datasets for aspect-based sentiment analysis in long documents: SocialNews and TrustData.

Abstract

The growing number of online articles and reviews necessitates innovative techniques for document-level aspect-based sentiment analysis. Capturing the context in which an aspect is mentioned is crucial. Existing models have focused on relatively short reviews and may fail to consider distant contextual information. This is especially so in longer documents, where an aspect may be referred to in multiple ways across dispersed sentences. This work introduces a hierarchical Transformer-based architecture that encodes information at different levels of granularity, with attention aggregation mechanisms to learn local and global aspect-specific document representations. For empirical validation, we curate two datasets of long documents: one on social issues, and another covering various topics involving trust-related issues. Experimental results show that the proposed architecture outperforms state-of-the-art methods for document-level aspect-based sentiment classification. We also demonstrate the potential applicability of our approach to long-document trust prediction.

Introduction

  • Sentence-level aspect-based sentiment analysis fails to consider the aspect context, which can often be inferred from preceding or succeeding sentences or paragraphs.
  • Large language models (LLMs) such as GPT-4, although powerful, still exhibit limited capabilities in modeling unknown aspects and tend to lose track of relevant information in long contexts.
  • We design a Transformer-based architecture, called DART, to capture dependencies among sentences in long documents and learn aspect-specific document representations.

Framework

The proposed DART framework consists of four key modules:

  • Sentence Encoding Block splits the document into individual sentences and generates a representation for each sentence-aspect combination.
  • Global Context Interaction Block models interactions among sentences and generates context-aware representations that capture aspect-specific information across long-range dependencies.
  • Aspect Aggregation Block performs local and global attentive pooling to obtain the document representation.
  • Sentiment Classification Block applies a two-layer Multilayer Perceptron (MLP) to the document representation to predict the sentiment for the aspect (a code sketch follows this list).
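Below is a minimal PyTorch sketch of how these four blocks might fit together. All module and variable names (DARTSketch, AttentivePooling, chunk_size) are hypothetical, the chunk-based local pooling is an assumption, and the Sentence Encoding Block is assumed to be an external BERT-style encoder whose per-sentence vectors are passed in. This is an illustration of the description above, not the authors' implementation.

# Hypothetical sketch of a DART-style pipeline; names and dimensions are
# assumptions, not the authors' code.
import torch
import torch.nn as nn


class AttentivePooling(nn.Module):
    """Attention-weighted pooling over a sequence of vectors."""
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)

    def forward(self, x):                                   # x: (batch, seq, dim)
        weights = torch.softmax(self.scorer(x), dim=1)      # attention over seq
        return (weights * x).sum(dim=1)                      # (batch, dim)


class DARTSketch(nn.Module):
    def __init__(self, dim=768, n_layers=2, n_heads=8, n_classes=3):
        super().__init__()
        # Global Context Interaction Block: Transformer encoder over
        # sentence-level representations to capture long-range dependencies.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads,
                                           batch_first=True)
        self.context_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Aspect Aggregation Block: local and global attentive pooling.
        self.local_pool = AttentivePooling(dim)
        self.global_pool = AttentivePooling(dim)
        # Sentiment Classification Block: two-layer MLP on the document vector.
        self.classifier = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, n_classes))

    def forward(self, sent_reprs, chunk_size=4):
        # sent_reprs: (batch, n_sents, dim) sentence-aspect representations,
        # e.g. [CLS] vectors from a BERT-style encoder over
        # "aspect [SEP] sentence" pairs (Sentence Encoding Block, assumed).
        context = self.context_encoder(sent_reprs)           # (B, S, D)

        # Local aggregation: attentive pooling over fixed-size chunks of
        # neighbouring sentences, then over the chunk vectors.
        b, s, d = context.shape
        pad = (-s) % chunk_size
        padded = nn.functional.pad(context, (0, 0, 0, pad))
        chunks = padded.view(b, -1, chunk_size, d).flatten(0, 1)   # (B*C, k, D)
        local = self.local_pool(chunks).view(b, -1, d)              # (B, C, D)
        local_doc = self.global_pool(local)                         # (B, D)

        # Global aggregation: attentive pooling over all sentence vectors.
        global_doc = self.global_pool(context)                      # (B, D)

        doc = torch.cat([local_doc, global_doc], dim=-1)            # (B, 2D)
        return self.classifier(doc)                                 # logits


if __name__ == "__main__":
    model = DARTSketch()
    fake_sentences = torch.randn(2, 10, 768)   # 2 documents, 10 sentences each
    print(model(fake_sentences).shape)          # torch.Size([2, 3])

Concatenating the locally and globally pooled vectors is one simple way to realise two-level aspect-specific aggregation; the paper's exact aggregation and chunking scheme may differ.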

Dataset

Comparative Study

Performance on TripAdvisor and BeerAdvocate

Performance on SocialNews

Note
  • DART surpasses other methods in overall accuracy and excels on five key aspects, particularly the digital-online and health aspects.
  • DART shows superior performance as document length increases.

Application to Trust and Polarity Prediction

Note
  • DART gives the best performance on three key aspects and is competitive with GPT-4 zero-shot on the Dependability aspect, likely because fewer documents contain this aspect.

Ablation

BibTeX

If you found this work helpful for your research, please cite it as follows:

@inproceedings{yan-etal-2024-modeling,
    title = "Modeling Complex Interactions in Long Documents for Aspect-Based Sentiment Analysis",
    author = "Yan, Zehong and Hsu, Wynne and Lee, Mong-Li and Bartram-Shaw, David",
    booktitle = "Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, {\&} Social Media Analysis",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.wassa-1.3",
    pages = "23--34",
}

This website is adapted from Nerfies and Sniffer. Thanks for the great work.

Written on June 26, 2024