Conceptualizing Solutions to the AI Plagiarism Problem
Thinking in general terms about the interface between pedagogical appropriateness and AI-immunity.
[image created with Midjourney]
Welcome to AutomatED: the newsletter on how to teach better with tech.
Each week, I share what I have learned — and am learning — about AI and tech in the university classroom. What works, what doesn't, and why.
Let’s take a look at a framework for thinking about solving the AI plagiarism problem for written assignments.
Last week, I discussed some of the main ways that I have found that professors are in denial about the depth of the AI plagiarism problem.
Although I suggested that many professors should doubt that their written assignments are AI-immune – at least until they have grappled with the problem – I noted that it is possible to design these assignments so that they cannot be satisfactorily completed by AI. Today, we take a broad view of this and other solutions to the AI plagiarism problem.
To effectively deal with students plagiarizing take-home written assignments with AI, we need a good conceptual schema of the solution space.
A lot of discussion of solutions has been one-off and piecemeal. By generalizing our reflection on these issues, we get a better perspective on which attributes of potential solutions matter.
📈 🗺️ Two Dimensions
When we design an assignment for a course, there are a variety of factors that we should take into account that concern the pedagogical appropriateness of the assignment, including – but not limited to – the following:
the role of the assignment in the module in which it is found, as well as the course as a whole;
whether the assignment is appropriate given our students’ abilities and knowledge at the point in the module that it is assigned;
the amount of time that our students can be reasonably expected to have available to complete it;
how we expect our students to complete the assignment, including the software that they need to complete it and the formats that their submissions should take; and
how we plan to grade or assess students’ submissions, as well as how we convey to students our expectations.
In general, assignments are more or less pedagogically appropriate depending on the extent to which they help students learn what they ought to be learning at a given point in time in a given professor’s course.
Crucially, pedagogical appropriateness is not the same as pedagogical effectiveness. There could be an assignment that is highly effective at helping students learn but not appropriate. For instance, they might be required to spend far more time completing it than can be reasonably expected given their other assignments. Such an assignment would be less pedagogically appropriate than others, even if it is more pedagogically effective.
Pedagogical appropriateness is highly complex and context-sensitive. As such, I do not intend to give or take myself to have given an exhaustive characterization of what it amounts to. However, I hope that the concept is sufficiently demarcated by these general terms. My goal is simply to develop a useful conceptual schema for thinking about solving the AI plagiarism problem. I encourage you to combine your own considered views about pedagogical appropriateness with this schema to improve your own outlook on solving the problem.
Now, just as we can think of pedagogical appropriateness as a dimension along which assignments can be ordered or ranked, we can think of another dimension that ranges from being completely immune to AI plagiarism to being capable of being plagiarized easily with AI. I will call this attribute of assignments ‘AI-immunity.’
Assignments are more or less AI-immune depending on the extent to which students cannot easily and reliably use AI to complete them (without being detected as plagiarizing) in a way that satisfies the professor’s most demanding expectations.
For example, if (i) a student can input an assignment’s instructions unchanged (i.e., verbatim as provided by the professor) into ChatGPT’s prompt field; (ii) the raw resultant output is always going to receive a perfect score from the professor; and (iii) the professor cannot detect whether the student plagiarized, then the assignment is not at all AI-immune. The student need not modify the prompt or tinker with ways of formulating it for ChatGPT, and they need not edit or revise ChatGPT’s output. Within mere seconds of receiving the assignment from the professor, they can get a perfect grade on the assignment and be at no risk of detection – all they need to do is be aware of ChatGPT and be willing to use it.
An assignment is more AI-immune to the degree that the student needs to work to modify the prompt before inputting it into the AI, to the degree that the AI’s output is not reliable in its quality relative to the professor’s expectations, and to the degree that the professor has reason to suspect the AI’s output as plagiarized.
We can create a graph of assignments that has these two dimensions as axes:
The Upper-Right Quadrant
Ideally, all of a professor’s most pedagogically appropriate assignments would be maximally AI-immune. But this is rarely the case. Assignments can be pedagogically appropriate but not at all AI-immune, and assignments can be maximally AI-immune but not at all pedagogically appropriate.
There are two broad strategies for a professor to consistently achieve AI-immunity while retaining pedagogical appropriateness, thereby locating more of their assignments in the upper-right quadrant:
AI-immunize each pedagogically appropriate assignment in itself; or
pair each such assignment with another assessment that increases its AI-immunity.
Let's discuss how a professor should go about implementing each of these kinds of strategies.
1. Experiment to Achieve One-Off AI-Immunity
To AI-immunize a written assignment in itself, a professor needs to first decide whether the assignment can be and should be completed in an environment where students lack access to AI tools. (Such an environment is often going to be one where students lack access to any devices, given the multitude of ways in which they can hide their use of AI tools via use of a device.)
Assuming an assignment cannot or should not be so completed, the professor needs to start by developing an understanding of how aspects of the assignment score with respect to AI-immunity.
It is possible to develop a general sense of what various AI tools can and cannot do at a given point in time. At this time, many AI tools are capable of completing aspects of written assignments that do not require highly specialized or idiosyncratic information or jargon; that do not require comprehensive or accurate citations; and that do not require skills that are rarely displayed in writing already available on the internet.
However, these sorts of general rules are almost too abstract to be useful, and AI technologies' rapid evolution requires a professor to constantly update their general understanding of AI's capabilities. Furthermore, specific assignments' instructions interface in surprising ways with specific AI tools, so it can often be challenging to generalize to a particular use case.
So, after a professor decides that a given assignment cannot or should not be completed in a device-free environment, the first step in evaluating the AI-immunity of the assignment is to experiment with trying to complete it with publicly available AI tools.
Professors should try to input an assignment's instructions as stated verbatim; with nearby variations (synonyms, rephrasings, etc.); and with additional context and details that address the shortcomings of initial AI outputs.
After evaluating the outputs of the AI tools in response to these inputs, a professor should, as a second step, alter the instructions and parameters of the assignment in an attempt to make it more AI-immune, if needed.
The professor should then return to the first step to see if the assignment is sufficiently AI-immune. Rinse and repeat.
2. Or Pair with AI-Immune Sister Task
The problem, though, is that some assignments that are highly pedagogically appropriate are far from AI-immune. You might find that one of your best assignments cannot be made AI-immune without significantly undermining its pedagogical appropriateness.
This is a situation where you might be tempted to remove the assignment entirely. However, the other option is to pair the assignment with another that is more AI-immune.
Pairing requires the professor to find a second assignment or task, completed in connection with the first, that effectively reveals whether the student plagiarized the first with the help of AI tools.
For example, you could allow students to complete a written assignment in class on their laptops that is not AI-immune at all, and then immediately afterward query them verbally one-on-one about the content of their writing (or about the subject matter that their writing concerns).
Alternatively, you could have students present their take-home written work to their classmates in a way that requires them to engage at length with their classmates' organic and spontaneous comments and questions. ("Flipping the classroom," as this sort of method is sometimes called, is widely seen as pedagogically effective in many contexts anyway.)
You could have students apply their written submissions' content to a real-world case that is revealed to them only in class, and then create a case study during class time.
After a student completes a pair of assignments, the professor needs to have a rubric or set of criteria by which they judge the student's submissions. This rubric should award grades that are – in some way or another – sensitive to mismatches between the quality of the student's two submissions.
A simple case would be one where it is obvious to you that the student plagiarized the first assignment. However, most cases will not be so simple.
Here are some options:
Weight the more AI-immune assignment more heavily than the other (perhaps in proportion to its relative AI-immunity).
Compare the grades of the two assignments and take only the lower grade.
Calculate a disparity value (i.e., the difference between the higher score and the lower score) that, in effect, signals the likelihood that the student plagiarized the written assignment. Determine student grades based on a combination of the disparity value and the scores of the two assignments.
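To make these three options concrete, here is a minimal sketch of how each could be computed. All function names, weights, and the 0–100 score scale are illustrative assumptions, not prescriptions; any real grading policy would need to be tuned to the course and communicated to students.

```python
def weighted_grade(immune_score, other_score, immune_weight=0.7):
    """Option 1: weight the more AI-immune assignment more heavily.

    immune_weight is an illustrative assumption (70/30 split).
    """
    return immune_weight * immune_score + (1 - immune_weight) * other_score

def lower_grade(immune_score, other_score):
    """Option 2: take only the lower of the two grades."""
    return min(immune_score, other_score)

def disparity_adjusted_grade(immune_score, other_score, penalty=0.5):
    """Option 3: subtract a fraction of the disparity value from the
    average of the two scores; a large disparity (strong written work,
    weak paired performance) pulls the final grade down.

    The penalty fraction is an illustrative assumption.
    """
    disparity = abs(immune_score - other_score)
    average = (immune_score + other_score) / 2
    return max(0.0, average - penalty * disparity)

# An honest student with consistent work is barely affected:
print(disparity_adjusted_grade(85, 80))  # prints 80.0

# A large mismatch (e.g., polished essay, weak oral follow-up)
# is penalized toward the lower score:
print(disparity_adjusted_grade(55, 98))  # prints 55.0
```

Note the design trade-off: Option 2 is the harshest and simplest, while Option 3 lets the professor tune how strongly a mismatch counts as evidence of plagiarism.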
Throughout, the professor’s goal is to incentivize students to complete the course's assignments honestly and earnestly, and especially those that are most pedagogically appropriate.
TLDR (Too Long; Didn’t Read)
The process of addressing AI plagiarism through assignment design.
Dr Philippa Hardman investigates what we can do to make AI good at citing sources. Unfortunately, ChatGPT (and similar models) often just make up citations. This obviously isn’t good for research or teaching. She runs three experiments attempting to address the problem: updating prompt structure, improving source specificity, and then refining the initial prompt.
Two philosophers write a guide “for the ethical use of AI systems such as ChatGPT in academic essay writing.” As students use these tools, it’s important to communicate how to use them skillfully and ethically. As the technology advances, more guides like this are a must.
Economist Bryan Caplan bet that “no AI would be able to get A’s on 5 out of 6 of [his] exams by January of 2029.” Three months later, GPT-4 got an A. At this point his odds are not good.