Is using ChatGPT cheating in online assessments?

As technology continues to advance, new tools and resources are reshaping how organizations approach the use of AI in talent assessments. ChatGPT and other large language models (LLMs) can often produce highly relevant, human-like responses in seconds, so they appear to ‘think’ about assessment content just as a human would. For some assessment types, LLMs like ChatGPT can answer items in ways that do in fact mimic the responses of a highly qualified candidate. This emerging capability is understandably concerning to organizations that include assessments in their hiring processes.

While groundbreaking innovations like AI are exciting and have proven useful for a great number of work tasks and daily activities, they also raise concerns about the growing risk of AI cheating. This has led many employers to ask, “Are my candidates cheating using ChatGPT?” When it comes to candidates using generative AI in recruitment processes, it is important to understand both its limitations and its potential impact on the assessment industry as a whole. Let’s look at its impact on different assessment types.

AI cheating and assessments: Myths vs reality

It’s no secret that AI usage has grown rapidly since generative AI tools like ChatGPT became common and accessible. Despite this surge, ease of use does not mean organizations need to abandon tried-and-true scientific principles and well-designed assessment processes.

Is it cheating to use ChatGPT to help tailor a resume to the specific skills listed in a job description? I would argue no; it could even be seen as resourceful. Is using tools like Claude and ChatGPT to answer assessment questions cheating? The answer to that is more nuanced and, honestly, less concerning than it may initially seem. Here are a few reasons why you shouldn’t lose sleep over candidates using AI to cheat in online assessments:

1. AI doesn’t ‘know’ the answers, it predicts them

Although it might look like AI-driven tools like ChatGPT are working out the answers to questions, they are not actually able to ‘figure out’ the right answers. Instead, they generate responses by predicting likely text based on patterns in the existing data they were trained on. As a result, these tools are more effective for simpler item types that are conventional, text-based, and multiple-choice, with clear right and wrong answers. For example, technical knowledge tests about a principle or concept, with one correct response option and several unambiguously incorrect options, are the types of assessments where LLMs are really effective.

LLMs keep improving in the range of item types they can answer effectively. Even as so-called AI cheating tools evolve past text-based content to handle more complex item types, LLMs still have a tough time with more advanced formats. For example, assessments that collect trace data and use interactive measures of information processing (e.g., multitasking, working memory) can mitigate the advantage a candidate would gain from using an LLM. Translation? These types of assessments are resistant to AI cheating practices.

2. AI tools often discourage cheating

When it comes to personality assessments, AI tools typically give good advice when participants ask how to respond to items: they emphasize that candidates should answer in a way that genuinely reflects their views. In this way, AI can actually support the intent behind well-designed assessments rather than undermine it.

While the exact wording varies each time, the engine typically responds with something like “As an AI engine, I am not designed to help cheat on assessments” or “You should read the item and answer honestly about what best characterizes who you are.” This is good news because, regardless of whether the platform is Claude, Copilot, Gemini, ChatGPT, or another tool, cheating is discouraged and frankly hard to do, as the tools themselves work to reinforce ethical behavior. AI is providing exactly the type of response that assessment vendors and clients would want in order to help prevent cheating in online assessments.

3. Complex assessment types are difficult for AI to navigate

When it comes to situational judgment tests (SJTs), AI is surprisingly good at creating content, but not as good at responding to it. To be clear, certain types of SJTs are ‘easier’ for LLMs to respond to than others. But well-designed SJTs still show score distributions similar to those they have always had.

The proliferation of tools like ChatGPT has not changed the usefulness of a properly developed SJT to organizations that use them in their hiring processes. SJT formats that ask respondents to pick the best option from a set of several possible responses are more susceptible to generative AI-assisted responding than more complex approaches. For example, in more complex formats where participants are asked to rate every response individually, as is the case with most of Talogy’s SJT content, AI is less likely to improve a candidate’s scores.

AI cheating becomes even more difficult when assessments use advanced scoring models. This is particularly true for SJT content using ideal-point scoring, where the best response might be anywhere on the rating scale continuum.
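To make the idea concrete, here is a minimal Python sketch of one way ideal-point scoring could work. It is an illustration under stated assumptions, not a description of Talogy's actual scoring model: the function name, the 1-to-5 rating scale, and the expert key values are all hypothetical.

```python
from typing import Sequence


def ideal_point_score(
    ratings: Sequence[int],
    key: Sequence[int],
    scale_min: int = 1,
    scale_max: int = 5,
) -> float:
    """Score rated SJT responses by their closeness to a keyed rating.

    Each response option has its own keyed ("ideal") rating, which can sit
    anywhere on the scale. Returns a value between 0 and 1, where 1.0 means
    every rating matched the key exactly.
    """
    if len(ratings) != len(key):
        raise ValueError("ratings and key must have the same length")
    if not ratings:
        raise ValueError("at least one rating is required")

    scale_range = scale_max - scale_min
    total = 0.0
    for rating, ideal in zip(ratings, key):
        if not scale_min <= rating <= scale_max:
            raise ValueError(f"rating {rating} is outside the scale")
        # Credit shrinks linearly as the rating moves away from the ideal point.
        total += 1 - abs(rating - ideal) / scale_range
    return total / len(ratings)


# Hypothetical key: the ideal ratings are spread across the scale,
# so "always rate high" is not a winning strategy.
hypothetical_key = [2, 5, 3, 1]

matches_key = ideal_point_score([2, 5, 3, 1], hypothetical_key)
always_high = ideal_point_score([5, 5, 5, 5], hypothetical_key)
print(matches_key, always_high)
```

Because the ideal rating for an option can fall in the middle or at the low end of the scale, a respondent (or an LLM) that simply rates every plausible-sounding option highly earns far less credit than one whose ratings track the key.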

How to stay ahead of AI cheating in assessments

While it is completely logical to have concerns about cheating using AI technology, it is important to consider a couple of key questions.

First, are candidates actually doing it?

Second, is it really cheating?

Research on these topics suggests that while it is always important to stay on top of emerging issues, AI cheating may not be as dire as it sounds and there may be some relatively easy interventions that can mitigate it.

When assessment score distributions are examined month by month since LLMs became widely available, scoring has not shifted much. In other words, if ChatGPT cheating were destroying the usefulness and value of assessments, average scores would rise noticeably across the board. These score improvements have not been observed. This suggests that even though candidates have new opportunities to cheat, they are not cheating at the rate that many have feared.
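A month-by-month check of this kind can be sketched in a few lines of Python. The example below is only an illustration: the function names, the 0.2 threshold, and the score values are made up for demonstration and are not real assessment data or a method described in the research mentioned above. It compares each month's scores with a baseline period using a standardized mean difference (Cohen's d with a pooled standard deviation).

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence


def standardized_shift(baseline: Sequence[float], current: Sequence[float]) -> float:
    """Return the standardized mean difference (Cohen's d, pooled SD)."""
    n1, n2 = len(baseline), len(current)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two scores")
    s1, s2 = stdev(baseline), stdev(current)
    pooled = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    if pooled == 0:
        raise ValueError("scores have no variance; shift is undefined")
    return (mean(current) - mean(baseline)) / pooled


def flag_shifted_months(
    baseline: Sequence[float],
    monthly_scores: Dict[str, Sequence[float]],
    threshold: float = 0.2,  # assumed cut-off for a "noticeable" increase
) -> List[str]:
    """List the months whose scores rose by at least `threshold` SDs."""
    return [
        month
        for month, scores in monthly_scores.items()
        if standardized_shift(baseline, scores) >= threshold
    ]


# Made-up scores, for illustration only.
pre_llm_baseline = [50, 55, 60, 65, 70]
scores_by_month = {
    "month_1": [50, 55, 60, 65, 70],  # no change from baseline
    "month_2": [60, 65, 70, 75, 80],  # a clear upward shift
}
print(flag_shifted_months(pre_llm_baseline, scores_by_month))
```

If widespread AI-assisted responding were inflating results, months after LLMs became available would be flagged consistently; a stable distribution would produce few or no flags.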

As for the second question, it is worth asking whether candidates who use LLMs to respond to assessments are actually cheating. In other words, is this low-integrity behavior, or are they simply motivated and resourceful candidates using the tools available to them to put ‘their best foot forward’? Research on honesty statements suggests the latter. While some candidates do use these tools to cheat, published research has found that honesty statements at the beginning of an assessment, which tell candidates that using these tools is inappropriate, lead to large reductions in generative AI-assisted responding.

These findings show that maybe the problem is not that candidates are dishonest cheaters. Instead, if you just inform them about the appropriate way to respond to assessment measures, they do, by and large, follow the rules. Most candidates want to be good employees, not rule breakers.

The reality of AI cheating in talent assessments

While cheating on assessments will always be a risk – and generative AI-assisted responding does provide new avenues for dishonest behavior – it is important to understand that these test-taking strategies may not be as common as many think and, with a well-designed assessment program, they are unlikely to provide the participant with as much advantage as is often feared.

Test security and honest responding have always been and will continue to be priorities at Talogy. We are confident that our focus on continuous innovation will allow us to plan ahead and provide the best guidance to our clients to identify, develop, and retain talent in this evolving age of AI.

