Introducing: Assessment in an AI World

A robot head, with a hand, possibly human, writing on a clipboard. There's a form on the clipboard labelled "Assessment"
Reading Time: 3 minutes

Since ChatGPT arrived almost a year ago (30th November 2022), the questions we, and others in similar roles across the world, were asked most frequently at the end of 2022 and into 2023 were: “How can I tell if students have been using AI?”, “How can I stop them?”, and “Will [insert assessment type] be one that students couldn’t possibly use AI to answer?”. Here at Dundee, a small group of interested staff had already been discussing the impact of generative AI (GenAI) becoming more readily available for a year or so before 30th November 2022. Of course, artificial intelligence isn’t new: it has been around since the 1950s (though some would argue the concepts go back to antiquity).

The questions we are now asked, and our discussions, have rapidly evolved into “How should we assess students in the 21st century?”. This change reflects the shift we’ve seen over the course of 2023, as Higher Education has moved on from initial reactions of panic and hysteria over GenAI to a recognition that AI is going to change the workplace and wider society as it becomes more ubiquitous and easier for anyone to use.

While we, alongside everyone else, can’t answer the question of how we should assess directly, we can start the discussion and prepare for further rapid technological change.

Logos from several AI tools. The OpenAI/ChatGPT one is largest, as that's generally the best known.
AI Logos. Clockwise from the top left: OpenAI (also used for their ChatGPT tool), PoeAI, Google Bard, Claude AI, Microsoft Bing, Perplexity

Getting ready

In order to get the most from the posts this week, we’d like you to get ready by creating an account with a GenAI tool. If you do not yet have one, there are a number of choices, such as Bing Chat, Google Bard, ChatGPT, Claude, or Perplexity.

If you are a University of Dundee member of staff or student, you should have access to Bing Chat via your University Office 365 account. For staff, make sure that you have the latest version of the Edge browser (it should already be present on a staff-build laptop). If you open Edge, have a look at the icons running down the right-hand side: you’ll need the top icon (which may be the Bing icon, see the image above) or the newer Copilot icon, which has been rolling out since 1st November 2023. Alternatively, in Edge or Chrome, go to Bing.com and make sure you sign in with your UoD account. Staff from other institutions that use Microsoft 365 will probably have access via their institutional account. Equally, if you are at a Google institution, you’ll probably have access to Bard via your own institutional login. If you have a premium version of Grammarly, you’ll also have access to AI text generation, though this week we’ll be looking at image generation as well as text generation.

Alternatively, you may wish to sign up for another service, such as ChatGPT, Claude, or Perplexity (see the starting list above); however, these may require you to share a phone number as well as your email address. If you are unsure, or have concerns about how these tools might use your data, check this Jisc guide on navigating the terms and conditions of using GenAI tools.

What we’ll cover this week:

  1. Academic Integrity – can and should we detect AI usage?
  2. Could GenAI pass your assessment?
  3. How can GenAI design assessments?

Those of you who have used LearningX before may wonder why this season is shorter than our usual five posts. This is such a rapidly changing field that we will be updating it regularly, and future iterations may be longer.

1: Academic Integrity – can, and should, we detect student use of AI?

A robot writing an essay.
Reading Time: 3 minutes

It’s been a year since ChatGPT exploded onto the digital landscape, attracting 100 million users in just two months. We’d been aware of generative AI (GenAI) and its potential to support learning, but also that it could be used to generate answers to assessments. ChatGPT proved a game changer, and GPT-4 was launched quickly on its heels, causing worldwide panic in academia. There was talk of the death of the essay and of the need to ban GenAI tools. Very quickly, academics’ thoughts turned to how we could detect AI-generated writing and whether Turnitin could help with this.

AI detection

It transpired that Turnitin had been developing an AI detector, and they announced their intention to launch and switch on AI detection for all customers in April 2023. The University, along with most UK universities, decided to opt out of this launch. A key factor in this decision was that we were unable to test and evaluate the reliability of the detection tool, and were not provided with independently verified data to give us sufficient confidence to implement it at that time.

Almost six months on, it is still not clear how Turnitin’s detection tool identifies AI-generated writing; no information detailing how it works has been forthcoming. Meanwhile, international research is highlighting a range of issues with AI detection. We previously highlighted the following from Jisc:

Jisc notes: “AI detectors cannot prove conclusively that text was written by AI.”

Michael Webb (17/3/2023), AI writing detectors – concepts and considerations, Jisc National Centre for AI 

In a recent update, Jisc have reiterated their previous position and guidance, highlighting that:

  • No AI detection software can conclusively prove text was written by AI
  • It is easy to defeat AI detection software
  • All AI detection software will give false positives
  • We need to consider what we are actually trying to detect as AI-assisted writing is becoming the norm. 

Michael Webb (18/09/2023), AI Detection – Latest Recommendations, Jisc National Centre for AI 

OpenAI, the company that developed ChatGPT, developed its own AI detector, but subsequently withdrew it due to its low rate of accuracy. Meanwhile, a white paper from Anthology, the parent company of Blackboard, highlights that their research on AI detection tools has led them to conclude that these tools are not currently fit for purpose. You can find some interesting examples of content that has been uploaded to some of the AI detection tools; for example, some tools think that the US Constitution and excerpts from the Bible were generated by AI.

Now that we’re able to create a test environment for the Turnitin AI detection tool, we are evaluating it with both human- and AI-generated work. In the meantime, it is interesting to note that some universities that did not initially opt out of AI detection have now disabled its use. Vanderbilt University posted an announcement to this effect in August 2023, explaining why they had taken this decision.

If you suspect an assessment includes AI-generated writing …

Here at Dundee our Academic Misconduct by Students Code of Practice was already clear that any unauthorised use of AI would be viewed as academic misconduct.

If you have an assessment that you think is not the student’s own work, you should refer it to your School’s Associate Dean for Quality and Academic Standards (AD QAS) for review; where appropriate, it can then be referred to our academic integrity panel for further investigation. Other universities are likely to have their own processes.

It is important that lecturers are aware that they should *NOT* use unauthorised AI detection tools. We have not approved the use of any of these tools at the University of Dundee, and we do not have student consent to upload their work to third-party sites. It is likely that other universities have similar guidance, but you should check locally.

Cues to AI generated writing

If you read student work carefully, you may be able to identify signs that GenAI has been used to help with the writing, or possibly even an essay mill. For example:

  • Check references: GenAI is fallible and may generate inaccurate statements and make up, or hallucinate, references. Similarly, there may be factual inaccuracies in the writing, again reflecting AI hallucinations. (One quick way to screen cited DOIs is sketched after this list.)
  • Consistency between pieces of work: Is there a change in tone or style of writing from previous pieces of work that the student has submitted? Have a look back at the student’s previous submissions.
  • Consistency within a piece of work: Are there variations of style within the piece of work? Is there consistency in spelling, e.g. all UK versions, or a mixture of UK and US in the same sentences or paragraphs?
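
Part of the first check, confirming that cited DOIs actually exist, can be partially automated. The snippet below is a minimal sketch rather than an approved tool: it assumes the references include DOIs, uses the public Crossref REST API and the Python requests package, and only the DOI strings are sent, so no student writing leaves your machine. Note that a DOI which resolves does not prove the citation matches the cited work, and genuine DOIs registered outside Crossref may not appear, so treat the output purely as a prompt for manual checking.

```python
# A minimal sketch: check whether DOIs from a reference list are registered with Crossref.
# Only the DOI strings are sent; no student writing is uploaded anywhere.
# Assumes the `requests` package. A registered DOI does not guarantee the citation
# is accurate, so use this only as a starting point for manual checking.
import requests

dois = [
    "10.1038/nature14539",       # example of a real, registered DOI
    "10.9999/made-up.2023.001",  # example of a fabricated DOI
]

for doi in dois:
    response = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    status = "registered with Crossref" if response.status_code == 200 else "not found"
    print(f"{doi}: {status}")
```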

If you’d like to understand more about how AI detection tools work, and some thoughts on the implications of AI for academic integrity, watch this recording of our webinar with Robin Crockett, Head of Academic Integrity at the University of Northampton.

A time for reflection

We asked at the start whether we should detect student use of AI. What are your thoughts? What are the potential implications? We’d welcome you sharing your ideas here.


2: Could AI do your assessment?

Two students working with an exercise book. There are electronic networks behind them.
Reading Time: 3 minutes

Introduction

We have seen that AI detection is not very reliable; however, we clearly want to ensure that student work is, indeed, student work. While students may be using AI to support the creation of their work, it’s important to ensure that it’s not easy for them to use AI to do all of it. Many staff use the same, or similar, assessments over the years, and while it is not easy to radically change an assessment within the academic year, it is often possible to tweak assessments.

Could AI answer my assessment now?

The first thing to do is to take a current or recent item of coursework and paste it into Bing (or another generative AI tool of your choice). Would the answer pass? Does it meet all of the learning outcomes at an appropriate level? Has it included references? If so, are they all accurate, or are some of them hallucinations (i.e. references that look convincing but don’t actually exist)?

If you think it is a low pass or a minor fail, do you think that, with a few extra prompts, it could generate something that would be a clear pass? Remember, your students may not know the subject as well as you do, so think about how they might adjust the prompts.
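
If you would like to run this check over several assessment briefs at once, the same test can be scripted against a GenAI API rather than pasted into a chat window. The sketch below is only an illustration of the idea, assuming you have the openai Python package and an API key; the model name and the example brief are placeholders, and the output still needs marking against your own rubric.

```python
# A minimal sketch: ask a chat model to attempt an assessment brief, so the output
# can be marked against the learning outcomes and checked for hallucinated references.
# Assumes the `openai` package and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

assessment_brief = (
    "Compare and contrast two approaches to health promotion, "
    "citing relevant literature (1,500 words)."
)  # placeholder: paste your own coursework brief here

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name; use whichever model you have access to
    messages=[
        {"role": "system", "content": "You are an undergraduate student answering coursework."},
        {"role": "user", "content": assessment_brief},
    ],
)

print(response.choices[0].message.content)  # mark this attempt as you would a student submission
```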

How could I change the assessment?

Immediate changes

While a significant change in the assessment type requires QA approval, small changes may be possible. Can you think of any changes you could make now that still meet the existing learning outcomes? If you’d like some ideas, have a look at the suggestions that UCL have made. In a recent AI and Assessment workshop, one idea suggested by staff was:

“Use more evaluative elements like compare and contrast in essays”

We tested that type of question and found that, while Bing Chat was able to evaluate the strengths and weaknesses of two different approaches to public health when asked about each individually, it was not able to compare them with each other. Another suggestion was:

“Set case studies involving watching a video”

What might you add? What does generative AI suggest as options for assessments that can’t easily be answered by GenAI?

In the future

It seems increasingly unlikely that essay-based coursework will continue to be used in the way it currently is. We already have many areas that are using authentic assessment (which takes many forms, as a quick search will show). Students will be expected to be able to use AI tools when they start their careers in almost all fields, so future assessments are likely to use AI in some way or another.

Universities globally are looking at this, so you may well be involved in discussions examining the role of assessment in your subject areas and how assessments should be tweaked. Professional, Statutory and Regulatory Bodies (PSRBs) are also involved in discussions in this area.

Jisc’s National Centre for AI has a range of useful resources, including a set of postcards (PowerPoint) for thinking about different approaches to assessment. This is, however, a rapidly changing field, so what is a suitable assessment approach today might not be next week.

Resources to give you further ideas

As well as Jisc’s postcards, you may find these interesting.

  • 10-minute chats on generative AI. These are hosted by Tim Fawns at Monash. While they are all interesting, Dave Cormier’s (21 Sept 2023) looks at the role of the essay in today’s world.
  • “AI + AI = AI”. Martin Compton (who presented in one of our webinars) wrote a post in May 2023 looking at AI and Academic Integrity, and how they lead to the need for Assessment Innovation.
  • Assessment Makeovers in the AI age, (44 minutes) a podcast from Teaching in Higher Education featuring Derek Bruff.

To think about

What challenges do you foresee when trying to design assessments that are effective in an AI world, yet also allow our students to demonstrate their knowledge and skills? How comfortable do you feel developing assessments that may be very different from those you did as a student (or even got students to do 2 or 3 years ago)?

Do share both your ideas and any other useful resources you have found in the comments.


3: AI to generate tests

A robot interacting with a series of boxes.
Reading Time: 2 minutes

Introduction

So far, we have mainly thought about students using AI; however, we’d now like to look at ways that staff may wish to use AI.

Generating quizzes

GenAI can be directed to create quizzes, and this is a feature that academics have been exploring since the early days of ChatGPT. Originally, staff were predominantly giving GenAI a subject area; however, this can often lead to a range of questions, many of which may not cover the content of your course, and some of which may contain inaccuracies (the hallucinations we’ve seen in relation to references and other content). More recently, tools have allowed staff to point the AI at a website, or to open a document (PDF) in the browser and ask it to generate a quiz based on the document.

Now you can do this as easily as opening the document in Edge and asking Bing to “create a quiz based on this page” or “using only the current webpage context, create a quiz”, as demonstrated in this video aimed at students creating quizzes to support their understanding of content.
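
If you prefer to work outside the browser, the same idea can be scripted: extract the text of a document and ask a chat model to write questions grounded only in that text. The sketch below is a minimal illustration rather than a recommended workflow; it assumes the openai and pypdf Python packages, and the file name, model name, and prompt are all placeholders. As with the browser tools, every generated question needs checking before it goes anywhere near students.

```python
# A minimal sketch: generate quiz questions grounded in a single PDF's text.
# Assumes the `openai` and `pypdf` packages and an OPENAI_API_KEY environment variable.
from openai import OpenAI
from pypdf import PdfReader

reader = PdfReader("lecture_notes.pdf")  # placeholder file name
document_text = "\n".join(page.extract_text() or "" for page in reader.pages)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {
            "role": "system",
            "content": "Write five multiple-choice questions, with answers, "
                       "using only the source text provided. Do not add outside facts.",
        },
        {"role": "user", "content": document_text},
    ],
)

print(response.choices[0].message.content)  # review every question before using it
```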

Blackboard and AI

Blackboard has recently introduced the AI Design Assistant to support staff when generating content for students. While we at Dundee have not yet released it (it is currently being evaluated by a small group of academic staff), the quiz generation feature allows staff to develop quizzes without having to leave Blackboard. If you’re at an institution that has enabled it (or are part of our evaluation group), we’d like to hear your thoughts in the comments.

[Starts at the relevant point in the video]

Create a quiz

Try creating a quiz from a PDF or URL using Bing or another tool of your choice.

How useful would those questions be for your students? When testing, we found that in some cases Bing used a completely different page (one we had never opened) to generate the questions, while at other times it worked as expected.

To think about

In our guidance for staff, you are reminded to acknowledge content that has been generated with the support of AI (just as students are required to). How do you think your students will react to knowing you have used AI in this way?

When using tools other than Bing (while logged into your UoD account), remember to consider whether or not your content should be shared outside of the University. Is the content shared under a Creative Commons / Open Education licence, or is it content that requires an institutional login?
