Review sites get tough with generative AI – but it’s just the beginning

(Updated to add input from G2) Enterprise review sites are finally taking a firm stance against reviewers using generative AI in review submissions. The ban itself is hardly a surprise; the surprise is that it took them almost eight months. And, of course, an outright ban also raises further questions.

The story so far – it wasn’t long after ChatGPT launched in late November 2022 that people started tapping into its skills to create plausible-sounding reviews for enterprise review sites like G2, TrustRadius and Gartner Peer Insights.

With a few prompts, ChatGPT and other gen AI tools produce detailed, insightful reviews – sometimes surpassing the depth and breadth of human-generated reviews.

That’s partly because many people find it hard to be creative when faced with the big empty space of a question like: “Tell us what you like most about this product/service, and why?”

Gen AI has also proven to be scarily good at writing variations of a review to align with the typical wording for a 5*, 4*, 3*, 2* or 1* rating. This has created new options for reviewers not quite ready to award the full five stars but also keen not to slam a vendor unnecessarily.

Gartner, G2 and TrustRadius: The computer says NO

In late July 2023, Gartner Peer Insights and TrustRadius both introduced anti-gen AI policies. G2 quickly followed suit, while allowing the use of AI writing assistants.

  • TrustRadius says: “We absolutely do not allow the use of generative AI to create product summaries that are not a reflection of an individual’s own experience.”
  • Gartner Peer Insights (GPI) says: “We continue to have an undeterred focus on review originality, integrity, individuality, quality and credibility. GPI advocates against and will not accept any reviews generated, in any capacity, by an AI language model.”
  • G2 says: “Because of our commitment to unbiased buyer feedback, reviews that use generative AI to write a majority of the content do not meet G2’s community guidelines. While we remain firm that reviews should come from real folks, the one exception is using AI writing assistant. So long as the tool only aims to improve readability, grammar, and spelling, we welcome reviewers to utilize these tools.”

Gen AI-created reviews are just a mash-up of reality

One secret behind the runaway success of reviews created by ChatGPT et al is that they sound so plausible. That’s because the output draws on genuine reviews, consumed as part of the training data. Although the short-lived “live browsing” functionality in ChatGPT 4.0 often ended with an error message, ChatGPT has evidently sifted through plenty of GPI and TrustRadius reviews.

Is there a difference between using ChatGPT – or any other Large Language Model (LLM) – as a wordsmith and using it to generate a review from scratch? We think so – and when we tested the wordsmith approach, we consistently achieved superb results.

We guided ChatGPT on the output we were looking for, plus the highlights and lowlights that it should surface … and it delivered. We haven’t published or submitted any of these reviews – they were strictly for educational purposes – but get in touch if you’d like to compare some genuine reviews with those we created with ChatGPT. And see if you can tell the difference.

We agree that the enterprise reviews industry could do without any more middle-of-the-road, bland reviews that offer nothing new, insightful or innovative – whether these are generated by an LLM or a human.

Therefore, it makes sense for the review sites to institute an outright ban on gen AI-created text, although they are coy on how they are detecting this. We believe they’re relying on one of the online tools that purport to identify text generated by an LLM.

But we still have a number of questions. First, what about false positives? Will reviewers be asked to rework their review because it sounds too samey? Or will these reviews be rejected out of hand? (As noted above, we wouldn’t mind the latter.)

Next, what about a reviewer who uses a tool like ChatGPT to give them inspiration, then reworks the machine output in their own voice? And perhaps (maybe even accidentally) adds a typo or grammatical error on the way – because a tell-tale sign of gen AI-created text is flawless spelling and grammar.

And how about a reviewer who enlists the help of tools like Grammarly?

And – is there a limit on the percentage of text that “could” have been created by gen AI? I can vouch that I wrote this blog, then ran it through the usual grammar and spelling checks. Yet you’ll still find that some “detector” tools claim an LLM wrote this. (Is this a glitch in the Matrix?)

Finally, what about people who write their review in their native language, then use a translation tool like DeepL to convert it into English? Is that against the rules? And what if they use ChatGPT to translate instead?

You still need to keep your gen AI in check

One eye-opener for the Destrier team came after feeding a few hundred reviews to ChatGPT, then asking for the most negative of the bunch.

Quick as a flash, ChatGPT responded: “PRODUCT has been a bit of a disappointment for us. We have experienced several unexplained downtimes and the performance has been less than stellar. We’re also disappointed with the cost, which seems to escalate quickly once you start scaling up.”

This looked plausible, but we didn’t recall any reviews mentioning these concerns. Our response was along the lines of “Sorry, but what the …” and we got the reply: “My response was a simulated one based on patterns I learned during training, using a negative sentiment toward PRODUCT.”

To get the “proper” answer, we re-ran the prompt using more specific language.
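One way to catch this kind of invented answer is to check whether the text the model hands back actually appears in the reviews you gave it. The snippet below is a minimal sketch of that idea, not the process we used: the function names (normalize, find_source) and the sample reviews are hypothetical, and it only catches verbatim quotes, not paraphrases.

```python
import re
from typing import List, Optional


def normalize(text: str) -> str:
    """Straighten curly quotes, collapse whitespace and lower-case the text
    so that trivial formatting differences do not block a match."""
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()


def find_source(quote: str, reviews: List[str]) -> Optional[int]:
    """Return the index of the first review that contains the quote,
    or None if no review contains it (a hint that it may be invented)."""
    needle = normalize(quote)
    if not needle:
        return None
    for index, review in enumerate(reviews):
        if needle in normalize(review):
            return index
    return None


# Hypothetical sample data, for illustration only.
reviews = [
    "Setup was quick and support answered within a day.",
    "Reporting is  limited, and the pricing tiers are confusing.",
]
model_answer = "reporting is limited, and the pricing tiers are confusing."
invented = "We have experienced several unexplained downtimes."

print(find_source(model_answer, reviews))  # index of the matching review
print(find_source(invented, reviews))      # None: not found in any review
```

A check like this will not tell you whether a review is good or fair, only whether the quoted text exists in your source set. Anything that comes back as None deserves a second look before you rely on it.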

We need to find the middle ground

We understand the rationale behind the review sites banning gen AI-created content. But the issue is not cut and dried. Gen AI-created reviews can be convincing, compelling and creative. They can also be misleading, misrepresentative and mawkish.

But then again, so can human-generated reviews!

An outright ban on using generative AI is simply the review sites’ response to gen AI’s opening gambit. Stay tuned for the next move.

What’s your view? Do you think the review sites can keep ChatGPT and co at bay, or will gen AI-created reviews start to overwhelm them?

Destrier supports vendors large and small in evolving their strategic approach to enterprise peer reviews. If you’d like to find out how we could help your brand cut through, let’s connect.