Introducing group assessment to improve constructive alignment: impact on teacher and student

Daniela Standen, School Director of Teaching and Learning, ISLI, and Alison Nicholson, Honorary Fellow, UoR


In the summer of the 2018-19 academic year, Italian and French modules on the Institution-wide Language Programme (IWLP) piloted paired oral exams. The impact of the change is explored below. Although discussed in the context of language assessment, the drivers for change, challenges and outcomes are relevant to any discipline intending to introduce more authentic and collaborative tasks into its assessment mix. Group assessments constitute around 4% of the University's assessment types (EMA data, academic year 2019-20).


The aims of the pilot were to:

  • improve constructive alignment between the learning outcomes, the teaching methodology and the assessment process
  • enable students to be more relaxed and produce more authentic and spontaneous language
  • make the assessment process more efficient, reducing teacher workload


IWLP provides credit-bearing language learning opportunities for students across the University. Around 1,000 students learn a language with IWLP at Reading.

The learning outcomes of the modules centre on the ability to communicate in the language. The teaching methodology favours student–student interaction and collaboration: in class, students work mostly in pairs or small groups. The exam format, on the other hand, was structured so that a student would interact with the teacher.

The exam was often the first time students had spoken one-to-one with the teacher. The change in interaction pattern could be intimidating and tended to produce stilted Q&A sessions, or interrogations, rather than communication.


Who was affected by the change?

221 Students

8 Teachers

7 Modules

4 Proficiency Levels

2 Languages

What changed?

  • The interlocution pattern changed from teacher-student to student-student, reflecting the normal pattern of in-class interaction
  • The marking criteria changed, so that quality of interaction was better defined and carried higher weight
  • The marking process changed: teachers, as well as students, were paired. Instead of the examiner re-listening to all the oral exams in order to award a mark, each exam was double-staffed. One teacher concentrated on running the exam and marked using holistic marking criteria, while the second teacher listened and marked using analytic rating scales
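As a minimal illustration of how the two paired markers' scores might be reconciled, the sketch below averages one examiner's holistic mark with the mean of the second examiner's analytic ratings. The equal weighting, the 0–100 scale and the criterion names are assumptions for illustration only; the pilot's actual weighting and scales are not described here.

```python
# Hypothetical sketch of double-staffed marking: one examiner gives a
# holistic mark, the other rates analytic criteria; the two are then
# combined. Equal weighting and a 0-100 scale are assumptions.

def combine_marks(holistic: float, analytic: dict[str, float]) -> float:
    """Combine a holistic mark with the mean of analytic ratings."""
    analytic_mean = sum(analytic.values()) / len(analytic)
    return round((holistic + analytic_mean) / 2, 1)

final = combine_marks(
    holistic=68.0,
    analytic={"interaction": 72, "fluency": 66, "accuracy": 70, "range": 64},
)
print(final)  # 68.0
```

Keeping the two marks separate until this final step also leaves an audit trail for moderation: a large gap between the holistic mark and the analytic mean can flag an exam for re-listening.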

Expected outcomes

  • Students to be more relaxed and produce more authentic and spontaneous language
  • Student-to-student interaction creates a more relaxed atmosphere
  • Students take longer speaking turns
  • Students use more features of interaction

(Hardi Prasetyo, 2018)

  • For there to be perceived issues of validity and fairness around ‘interlocutor effects’, i.e. how the competence of the person a student is speaking to affects their outcomes (Galaczi & French, 2011)


The following mitigating actions were taken:

  • Homogeneous pairings, informed by class observation
  • Inclusion of both monologic and dialogic assessment tasks
  • Planned teacher intervention
  • Inclusion of both communicative and linguistic marking criteria
  • Pairing teachers as well as students, for more robust moderation


Methods of evaluation

Questionnaires were sent to the 32 students who had experienced the previous exam format, to enable comparison. The response rate was 30%, with 70% of responses from students of Italian. Responses were consistent across the two languages.

8 Teachers provided verbal or written feedback.

Students’ Questionnaire Results

Overall students’ feedback was positive.  Students recognised closer alignment between teaching and assessment, and that talking to another student was more natural. They also reported increased opportunities to practise and felt well prepared.

However, they did not feel that the new format improved their opportunity to demonstrate their learning, nor that speaking to another student was more relaxing. The qualitative feedback suggests this was due to anxieties around pairings.

Teachers’ Feedback

  • Language production was more spontaneous and authentic. One teacher commented ‘it was a much more authentic situation and students really helped each other to communicate’
  • Marking changed from a focus on listening for errors towards rewarding successful communication
  • Workload decreased by up to 30% for the average student cohort, and peaks and troughs of work were better distributed


Overall, the impact on both teachers and students was positive. Students reported that they were well briefed and had greater opportunities to practise before the exam. Teachers reported a positive impact on workloads and on students’ ability to demonstrate that they could communicate in the language.

However, this was not reflected in the students’ feedback. There is a clear discrepancy between teachers’ and students’ perceptions of how far the new format allows students to showcase their learning.

Despite mitigating action being taken, students also reported anxiety around the ‘interlocutor effect’. Race (2014) tells us that even when universities have put all possible measures in place to make assessment fair, they often fail to communicate this appropriately to students. The next steps should therefore focus on engaging students to bridge this perception gap.


Follow up was planned for the 2019-20 academic cycle but could not take place due to the COVID-19 pandemic.


Galaczi, E. & French, A. (2011). In Taylor, L. (ed.), Examining Speaking: Research and Practice in Assessing Second Language Speaking. Cambridge: Cambridge University Press.

Fulcher, G. (2003). Testing Second Language Speaking. Edinburgh: Pearson.

Hardi Prasetyo, A. (2018). Paired Oral Tests: A literature review. LLT Journal: A Journal on Language and Language Teaching, 21(Suppl.), 105-110.

Race, P. (2014). Making Learning Happen (3rd ed.). Los Angeles; London: Sage.

Race, P. (2015). The Lecturer’s Toolkit: A Practical Guide to Assessment, Learning and Teaching (4th ed.). London; New York: Routledge.


How ISLI’s Assessment Team created an online oral exam for the Test of English for Educational Purposes (TEEP)

Fiona Orel – International Study and Language Institute (ISLI)



ISLI’s Test of English for Educational Purposes (TEEP) is administered at the end of pre-sessional courses as a measure of students’ academic English proficiency. The speaking test has traditionally been an academic discussion between two students that is facilitated by an interlocutor and marked by an observer.

This case study outlines the process of creating a version of the TEEP speaking test for 1-1 online delivery.


  • To create an online TEEP speaking test that could be administered at the beginning of June to 95 students
  • To ensure reliability and security of results
  • To support students and teachers with the transition


The Pre-sessional English course 3 (PSE 3) started in April, during the period of lockdown. At the end of the course all students sit a TEEP test, which includes a test of speaking skills. We realised that we would not be able to administer the usual two-student, two-teacher test, given the constraints of the technology and the changes in teaching and learning, which had reduced to a certain degree the students’ opportunities for oral interaction. We would therefore need to develop a new 1-1 test that maintained the validity and reliability of the original TEEP speaking test.


We had two main objectives: to create a valid online 1-1 speaking test, and to ensure that the technology we used to administer the test was simple and straightforward for both teachers and students and would have reasonably reliable connectivity in the regions where students were based (China, the Middle East and the UK).

The first thing we needed to do was return to our test specifications: what exactly were we hoping to assess through the oral exam? The original face-to-face test had five criteria: overall communication, interaction, fluency, accuracy and range, and intelligibility. We knew that interaction had been impacted by the move online, but decided that responding appropriately to others was a crucial aspect of interaction that needed to remain, and included it in the ‘overall communication’ criterion. Recognising that interlocutors would also need to be examiners, we streamlined the criteria to remove redundancy and repetition, and ensured that each block contained the same type of description in the same order, making it easier for tutors to skim and recall.

We then worked out exactly what speaking functions and skills we wanted to test, and how we could do that while mostly working with existing resources. We aligned with the original test specifications by testing students’ ability to:

  • Provide appropriate responses to questions and prompts
  • Describe experiences and things
  • Give and justify an opinion by, for example, stating an opinion, explaining causes and effects, comparing, evaluating.

The test format that enabled this was:

  • Part one: an interview with the student about their studies and experience of studying online
  • Part two: a problem-solving scenario. Students are introduced to a problem, which the teacher screen-shares with them, and are given three possible solutions to compare, evaluate and rank from most to least effective
  • Part three: abstract discussion building on the talk given in part two

The final stage was trialling a platform on which to conduct the tests. We had considered Zoom for its reliability but discounted it because of security concerns. BB Collaborate had connectivity issues in China, so we decided to use Teams: connectivity was generally better, and students and teachers were familiar with the platform as they had been using it for tutorials. Because students were spread over several time zones, we spread the speaking tests over three mornings, finishing by 11:00 BST each day. We kept the final slot on Friday free for all teachers, to enable rescheduling for any student experiencing connectivity issues on the day.

Finally, we needed to help teachers and students prepare for the tests. For students, learning materials were produced with videos of a sample test, there was a well-attended webinar to introduce the format and requirements, and the recording of this webinar was made available to all students along with a document on their BB course. This instructed them what to do before test day and what to expect on test day.

The test format and procedures were introduced to teachers, with instructions for tasks to do before, during and after the test. An examiner’s script with integrated instructions and speech was also prepared, to standardise how the tests were administered. Each test was recorded to ensure security and to enable moderation, and all students had to verify their identity at the start of the test.

The test recording caused some problems, as the video had to be downloaded and then deleted from Stream before anyone else in the Teams meeting, including the student who had been tested, could access it. For this reason we allowed 40 minutes for each 20-minute interview, as downloading was sometimes a lengthy process depending on internet speeds. We had two or three people available each day to pick up any problems, such as a teacher being unwell or having technical issues, or a student experiencing problems. This worked well; although we did have to reschedule a number of tests on the first two days, fortunately everything ran smoothly on the final day.

The teachers were fully committed and worked hard to put students at ease. Informal feedback from students showed appreciation of the opportunity to talk 1-1 with a tutor, and tutors said that the test format provided plenty of evidence on which to base a decision.
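The slot arithmetic behind this timetable can be sketched as follows: 40-minute slots (a 20-minute interview plus download and buffer time), with every test finishing by the 11:00 BST cut-off. The 08:20 start time is a hypothetical value for illustration; the actual start times are not stated here.

```python
from datetime import datetime, timedelta

# Sketch of the morning timetable: 40-minute slots (20-minute interview
# plus time to download and delete the recording), all finishing by
# 11:00 BST. The 08:20 start time is an assumption for illustration.

SLOT = timedelta(minutes=40)

def morning_slots(start: str, cutoff: str = "11:00") -> list[str]:
    """Return start times of 40-minute slots that finish by the cut-off."""
    fmt = "%H:%M"
    t = datetime.strptime(start, fmt)
    end = datetime.strptime(cutoff, fmt)
    slots = []
    while t + SLOT <= end:
        slots.append(t.strftime(fmt))
        t += SLOT
    return slots

print(morning_slots("08:20"))  # ['08:20', '09:00', '09:40', '10:20']
```

Under these assumptions each examiner could see four students per morning, which is one way to estimate how many examiner-mornings a 95-student cohort requires.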


The test was successful overall and there were fewer technical issues than we had anticipated. Teachers and students were happy with it as an assessment measure and we were able to award valid and reliable grades.

Working together collaboratively with the teachers and the Programme Director was incredibly rewarding and meant that we had a wide resource base of talent and experience when we did run into any problems.


Incredibly detailed planning, the sharing of information across Assessment and Pre-sessional Teams, and much appreciated support from the TEL team helped to make the test a success. Students and teachers had very clear and detailed instructions and knew exactly what was expected and how the tests would be conducted. The sharing of expertise across teams meant that problems were solved quickly and creatively, and it is good to see this practice becoming the norm.

We need to work on the issue of downloading and deleting the video after each test, as this caused some anxiety for teachers with slower internet connections. We also need to have more technical support available, especially on the first day: most students had tested their equipment as instructed, but some who hadn’t did experience issues. It would be even better if a similar activity could be built into the course, so that teachers and students experience the test situation before the final test.

Follow up

ISLI’s Assessment Team is now preparing to administer the same tests to a much larger cohort of students at the end of August. We will apply the lessons learned during this administration and work to make the process easier for teachers.