Self-supervised text sentiment transfer with rationale predictions and pretrained transformers

dc.contributor.advisor: Buys, Jan
dc.contributor.author: Sinclair, Neil
dc.date.accessioned: 2023-04-21T12:29:46Z
dc.date.available: 2023-04-21T12:29:46Z
dc.date.issued: 2022
dc.date.updated: 2023-04-21T12:29:30Z
dc.description.abstract: Sentiment transfer involves changing the sentiment of a sentence, such as from positive to negative, whilst maintaining its informational content. Although this challenge in NLP research can be framed as a translation problem, traditional sequence-to-sequence translation methods are inadequate due to the dearth of parallel corpora for sentiment transfer. Sentiment transfer can therefore be posed as an unsupervised learning problem in which a model must learn to transfer from one sentiment to another in the absence of parallel sentences. Given that the sentiment of a sentence is often determined by a limited number of sentiment-specific words, the task can also be posed as one of identifying and altering those words. In this dissertation we use a novel method of sentiment word identification from the interpretability literature called the method of rationales, which identifies the words or phrases in a sentence that explain the 'rationale' for a classifier's class prediction, in this case the sentiment of the sentence. This method is compared against a baseline heuristic method of sentiment word identification. We also experiment with a pretrained encoder-decoder Transformer model, BART, as a means of improving upon previous sentiment transfer results. The pretrained model is first fine-tuned in an unsupervised manner as a denoising autoencoder to reconstruct sentences in which sentiment words have been masked out. This fine-tuned model then generates a parallel corpus, which is used to fine-tune the final stage of the model in a self-supervised manner. Results were compared against a baseline using automatic evaluations of sentiment accuracy and BLEU score, as well as human evaluations of content preservation, sentiment accuracy and sentence fluency. The results show that neural and heuristic methods of sentiment word identification achieve similar results across models at similar levels of sentiment word removal on the Yelp dataset, whilst the heuristic approach leads to improved results with the pretrained model on the Amazon dataset. We also find that the pretrained Transformer model improves upon the baseline LSTM trained from scratch on the Yelp dataset across all automatic metrics. The pretrained BART model scores higher across all human-evaluated outputs for both datasets, which is likely due to its larger size and pretraining corpus. The results also show a trade-off between content preservation and sentiment transfer accuracy similar to that observed in previous research, with more favourable results on the Yelp dataset relative to the baseline.
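The abstract describes a two-part pipeline: identify the sentiment-bearing words in a sentence (via rationales or a heuristic), mask them out, and have a pretrained BART model reconstruct the sentence. The snippet below is a minimal illustrative sketch of that masking-and-reconstruction idea, not the dissertation's implementation: it substitutes a simple occlusion score for the rationale method, and the checkpoints (distilbert-base-uncased-finetuned-sst-2-english, facebook/bart-base), the importance threshold, and the example sentence are all assumptions for demonstration.

```python
import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    BartForConditionalGeneration,
    BartTokenizer,
)

# Off-the-shelf sentiment classifier, used here only to score word importance.
# (Illustrative choice; the thesis does not specify this checkpoint.)
clf_name = "distilbert-base-uncased-finetuned-sst-2-english"
clf_tokenizer = AutoTokenizer.from_pretrained(clf_name)
clf_model = AutoModelForSequenceClassification.from_pretrained(clf_name)

def positive_prob(text: str) -> float:
    """Probability that `text` is positive under the classifier."""
    inputs = clf_tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = clf_model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()  # index 1 = POSITIVE

def mask_sentiment_words(sentence: str, threshold: float = 0.2) -> str:
    """Occlusion scoring: mask each word whose removal shifts the sentiment
    probability by more than `threshold` (a stand-in for rationale extraction)."""
    words = sentence.split()
    base = positive_prob(sentence)
    masked = []
    for i, word in enumerate(words):
        ablated = " ".join(words[:i] + words[i + 1:])
        importance = abs(base - positive_prob(ablated))
        masked.append("<mask>" if importance > threshold else word)
    return " ".join(masked)

# Pretrained BART fills the masked slots, i.e. the denoising reconstruction step.
bart_tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
bart_model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

source = "The food was delicious and the staff were friendly."
masked = mask_sentiment_words(source)
inputs = bart_tokenizer(masked, return_tensors="pt")
output_ids = bart_model.generate(inputs["input_ids"], num_beams=4, max_length=32)
print(masked)
print(bart_tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

In the full pipeline described in the abstract, the reconstruction model would first be fine-tuned as a denoising autoencoder on sentiment-masked sentences, and its outputs would then serve as a synthetic parallel corpus for a final self-supervised fine-tuning stage.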
dc.identifier.apacitation: Sinclair, N. (2022). <i>Self-supervised text sentiment transfer with rationale predictions and pretrained transformers</i> (Master's thesis). University of Cape Town, Faculty of Science, Department of Statistical Sciences. Retrieved from http://hdl.handle.net/11427/37819
dc.identifier.chicagocitation: Sinclair, Neil. "Self-supervised text sentiment transfer with rationale predictions and pretrained transformers." Master's thesis, University of Cape Town, Faculty of Science, Department of Statistical Sciences, 2022. http://hdl.handle.net/11427/37819
dc.identifier.citation: Sinclair, N. 2022. Self-supervised text sentiment transfer with rationale predictions and pretrained transformers. Master's thesis. University of Cape Town, Faculty of Science, Department of Statistical Sciences. http://hdl.handle.net/11427/37819
dc.identifier.ris:
TY  - THES
AU  - Sinclair, Neil
T1  - Self-supervised text sentiment transfer with rationale predictions and pretrained transformers
TI  - Self-supervised text sentiment transfer with rationale predictions and pretrained transformers
PY  - 2022
DA  - 2022
DB  - OpenUCT
DP  - University of Cape Town
KW  - natural language processing
KW  - neural networks
KW  - sentiment transfer
KW  - transformers
LK  - https://open.uct.ac.za
UR  - http://hdl.handle.net/11427/37819
ER  -
dc.identifier.uri: http://hdl.handle.net/11427/37819
dc.identifier.vancouvercitation: Sinclair N. Self-supervised text sentiment transfer with rationale predictions and pretrained transformers [master's thesis]. University of Cape Town, Faculty of Science, Department of Statistical Sciences; 2022 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/37819
dc.language.rfc3066: eng
dc.publisher.department: Department of Statistical Sciences
dc.publisher.faculty: Faculty of Science
dc.subject: natural language processing
dc.subject: neural networks
dc.subject: sentiment transfer
dc.subject: transformers
dc.title: Self-supervised text sentiment transfer with rationale predictions and pretrained transformers
dc.type: Master Thesis
dc.type.qualificationlevel: Masters
dc.type.qualificationlevel: MSc
Files

Original bundle:
thesis_sci_2022_sinclair neil.pdf (4.02 MB, Adobe Portable Document Format)

License bundle:
license.txt (0 B, Item-specific license agreed upon to submission)