Measuring And Improving Persuasiveness Of Large Language Models

Somesh Singh*, Yaman Kumar Singla*, S I Harini*, Balaji Krishnamurthy
* Equal Contribution
Adobe Media and Data Science Research (MDSR), Adobe

ICLR 2025

Get in touch with us at behavior-in-the-wild@googlegroups.com

Abstract

Crafting a message to elicit a desired response can be time-consuming. While prior research has explored content generation and popularity prediction, the impact of wording on behavior change has been underexplored. We introduce the concept of transsuasion (trans = carrying across, suasion = the act of persuading, transsuasion = the act of carrying across text from non-persuasive to persuasive).

  1. Data Generation. We utilize pairs of tweets by the same user with similar meanings but varying wording and likes to study transsuasion.
  2. LLM Persuasion. We find that larger LLMs are more effective at identifying which tweet versions attract more likes and at transforming low-performing versions into high-performing ones.
  3. Model Efficiency. We demonstrate that smaller LLMs can be optimized to surpass larger LLMs in persuasion abilities.
  4. Resources. We introduce PersuasionBench and PersuasionArena, providing a benchmark and a suite of tasks for evaluating and enhancing persuasion in text. Our benchmarks and models are publicly available.

Persuasion Leaderboard

Here are the results of our models on the Persuasion Leaderboard. The leaderboard is based on the paper and the PersuasionArena website.

| Model | Avg. Elo |
|---|---|
| Topline (T2) | 1357 |
| Ours (13B) | 1293 |
| Ours-Instruct (13B) | 1304 |
| Ours (CS+BS) (13B) | 1299 |
| Vicuna-1.5-13B | 1195 |
| LLaMA3-70B | 1099 |
| GPT-3.5 | 877 |
| GPT-4o | 1187 |
| GPT-4 | 1092 |
| Baseline (T1) | 1251 |
| GPT-4 | 1213 |
| Baseline (T1) | 979 |
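To give a feel for what a gap in Elo means, the sketch below applies the standard Elo formulas to two ratings from the leaderboard. This page does not say how the Elo scores were computed, so the 400-point scale, the K-factor of 32, and the function names are assumptions made for illustration only. Because the leaderboard reports averages, the pairwise reading is approximate.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A is preferred over B under the standard Elo model.

    The 400-point scale is the conventional default, assumed here.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both ratings after one comparison.

    score_a is 1.0 if A wins, 0.5 for a tie, 0.0 if A loses.
    k (the K-factor) is an assumed value, not taken from the source.
    """
    e_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - e_a)
    return rating_a + delta, rating_b - delta


# Ours-Instruct (13B) at 1304 versus GPT-4o at 1187, as listed on the leaderboard.
p = expected_score(1304, 1187)
print(f"Expected preference rate for the higher-rated model: {p:.2f}")
```

Under these assumptions, a gap of roughly 117 points corresponds to the higher-rated model being preferred in about two out of three head-to-head comparisons.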

Transsuasion Examples

A few samples showing Transsuasion. While the account, time, and meaning of the samples remain similar, the audience response to each sample (for example, the number of likes) varies significantly.

[Figure: Transsuasion ground truth samples]

A few samples showing Transsuasion using our model. The left side shows the original low-liked tweet, and the right side shows the transsuaded version of the tweet.

[Figure: Transsuasion generated samples]

Transsuasion Data

We are releasing our test dataset in the HuggingFace format.

| Case | Username | Media Filter | Link Match | Text | Edit | Likes % | Input | Output | #Samples |
|---|---|---|---|---|---|---|---|---|---|
| Refine text (Ref) | Same | No Images | No | >0.8 | - | 40 | T1 | T2 | 265k |
| Paraphrase (Parap) | Same | No Images | No | >0.6 | >0.6 | 40 | T1 | T2 | 163k |
| Transsuade and Add Image (AddImg) | Same | Image only on o/p side | No | >0.6 | >0.6 | 40 | T1 | T2, I2 | 48k |
| Free-form refine with text and optionally visual content (FFRef) | Same | Image on either or both sides | No | >0.8 | - | 40 | T1, I1 | T2, I2 | 701k |
| Free-form paraphrase with text and optionally visual content (FFPara) | Same | Image on either or both sides | No | >0.6 | >0.6 | 40 | T1, I1 | T2, I2 | 24k |
| Transsuade Visual Only (VisOnly) | Same | Image similarity > 0.7 | No | - | - | 40 | T1, I1, T2 | I2 | 68k |
| Transsuade Text Only (TextOnly) | Same | Image on o/p side or both sides | No | >0.8 | - | 40 | T1, I1, I2 | T2 | 69k |
| Highlight Different Aspects of Context (Hilight) | Same | Images Ignored | Yes | >0.6 | >0.6 | 40 | T1, Con1, I1 | T2, I2 | 241k |
| Transcreation (Transc) | Brand | Images Ignored | Ignored | >0.8 | - | 20 | T1, U1, I1 | T2, U2, I2 | 131k |
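As a rough illustration of how filters like those in the table could select tweet pairs, the sketch below keeps pairs from the same username whose text similarity exceeds a threshold and whose likes differ by at least a given percentage. It is not the authors' pipeline. The `Tweet` class, the use of `difflib.SequenceMatcher` as the similarity measure, and the reading of "Likes %" as a minimum relative gap in likes are all assumptions, since this page does not define how those columns are computed.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


@dataclass
class Tweet:  # hypothetical record type, for illustration only
    username: str
    text: str
    likes: int


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for the unspecified measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_pairs(tweets, min_text_sim=0.8, min_likes_gap_pct=40.0):
    """Return (low_liked, high_liked) pairs that pass the assumed filters."""
    pairs = []
    for a, b in combinations(tweets, 2):
        if a.username != b.username:
            continue  # both versions must come from the same account
        if text_similarity(a.text, b.text) <= min_text_sim:
            continue  # wording must stay close
        low, high = sorted((a, b), key=lambda t: t.likes)
        if high.likes == 0:
            continue  # no engagement signal to compare
        # Assumed definition: gap relative to the better-performing tweet.
        gap_pct = 100.0 * (high.likes - low.likes) / high.likes
        if gap_pct >= min_likes_gap_pct:
            pairs.append((low, high))
    return pairs


tweets = [
    Tweet("brand_a", "Our new summer collection is out now. Shop today!", 10),
    Tweet("brand_a", "Our new summer collection is out now. Shop it today!", 100),
    Tweet("brand_a", "Happy Friday everyone, see you next week!", 50),
    Tweet("brand_b", "Our new summer collection is out now. Shop it today!", 500),
]
pairs = candidate_pairs(tweets)
for low, high in pairs:
    print(f"{low.likes} likes -> {high.likes} likes: {low.text!r} / {high.text!r}")
```

Each returned pair is ordered so that the lower-liked version plays the role of T1 (input) and the higher-liked version plays the role of T2 (output).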

Human Eval

To participate in the Human Eval, please use the registration form link that will be published here soon.

You need to accept the Terms of Service beforehand.

BibTeX

@article{singh2024measuring,
  title={Measuring and Improving Persuasiveness of Large Language Models},
  author={Somesh Singh and Yaman K Singla and Harini SI and Balaji Krishnamurthy},
  year={2024},
  journal={arXiv preprint arXiv:2410.02653}
}

Terms Of Service

Users are required to agree to the following terms before using the service. The service is a research preview. It provides only limited safety measures and may generate offensive content. It must not be used for any illegal, harmful, violent, racist, or sexual purposes. Please do not upload any private information. The service collects user dialogue data, including both text and images, and reserves the right to distribute it under a Creative Commons Attribution (CC-BY) or a similar license. Users are also restricted to uses that comply with the license agreements of Twitter and LLaMA.

Acknowledgement

We thank Adobe for their generous sponsorship. We also thank the LLaMA team for giving us access to their models, as well as open-source projects such as Vicuna.