Involving students in the appraisal of rubrics for performance-based assessment in Foreign Languages

By Dott. Rita Balestrini

Context

In 2016, in the Department of Modern Languages and European Studies (DMLES), it was decided that the marking schemes used to assess writing and speaking skills needed to be revised and standardised in order to ensure transparency and consistency of evaluation across different languages and levels. A number of colleagues teaching language modules had a preliminary meeting to discuss what changes had to be made, what criteria to include in the new rubrics and whether the new marking schemes would apply to all levels. While addressing these questions, I developed a project with the support of the Teaching and Learning Development Fund. The project, now in its final stage, aims to enhance the process of assessing writing and speaking skills across the languages taught in the department. It intends to make assessment more transparent, understandable and useful for students; foster their active participation in the process; and increase their uptake of feedback.

The first stage of the project involved:

a literature review on the use of standards-based assessment, assessment rubrics and exemplars in higher education;
the organisation of three focus groups, one for each year of study;
the development of a questionnaire, in collaboration with three students, based on the initial findings from the focus groups;
the collection of exemplars of written and oral work to be piloted for one Beginners language module.

I had a few opportunities to disseminate some key ideas that emerged from the literature review: at the School of Literature and Languages’ assessment and feedback away day, at the CQSD showcase and at the autumn meeting of the Language Teaching Community of Practice. Having only touched upon the focus groups at the CQSD showcase, I will describe here how they were organised, run and analysed, and will summarise some of the insights gained.

Organising and running the focus groups

Focus groups are a method of qualitative research that has become increasingly popular and is often used to inform policies and improve the provision of services. However, the data generated by a focus group are not generalisable to a population group as a whole (Barbour, 2007; Howitt, 2016).

After attending the People Development session on ‘Conducting Focus Groups’, I realised that the logistics of their organisation, the transcription of the discussion and the analysis of the data they generate require a considerable amount of time and detailed planning. Nonetheless, I decided to use them to gain insights into students’ perspectives on the assessment process and into their understanding of marking criteria.

The recruitment of participants was not a quick task. It involved sending several emails to students studying at least one language in the department and visiting classrooms to advertise the project. In the end, I managed to recruit twenty-two volunteers: eight for Part I, six for Part II and eight for Part III. I obtained their consent to record the discussions and use the data generated by the analysis. As a ‘thank you’ for participating, students received a £10 Amazon voucher.

Each focus group lasted one hour. The discussions were recorded in full and were based on the same topic guide and stimulus material. To open the discussion, I used visual stimuli and asked the following question:

In your opinion, what is the aim of assessment?

In all three groups, this triggered some initial interaction directly with me. I then started picking up on differences between participants’ perspectives, asking for clarification and using their insights. Slowly, a relaxed and non-threatening atmosphere developed and led to more spontaneous and natural group conversation, which followed different dynamics in each group. I then began to draw on some core questions I had prepared to elicit students’ perspectives. During each session, I took notes on turn-taking and some relevant contextual clues.

I ended all three focus group sessions by asking participants to carry out a task in groups of three or four. I gave each group a copy of the marking criteria currently used in the department and one empty grid reproducing the structure of the marking schemes. I asked them the following question:

If you were given the chance to generate your own marking criteria, what aspects of writing/speaking/translating would you add or eliminate?

I then invited them to discuss their views and use the empty grid to write down the main ideas shared by the members of their group. The most desired criteria were effort, commitment, and participation.

Transcribing and analysing the focus groups’ discussions

Focus groups, as a qualitative method, are not tied to any specific analytical framework, but qualitative researchers warn us not to take the discourse data at face value (Barbour, 2007:21). Bearing this in mind, I transcribed the recorded discussions and chose discourse analysis as an analytical framework to identify the discursive patterns emerging from students’ spoken interactions.

The focus of the analysis was more on ‘words’ and ‘ideas’ than on the process of interaction. I read and listened to the discussions many times and, as I identified recurrent themes, I started coding some excerpts. I then moved back and forth between the coding frame and the transcripts, adding or removing themes, renaming them, and reallocating excerpts to different ‘themes’.

Spoken discourse lends itself to multiple levels of analysis, but since my focus was on students’ perspectives on the assessment process and their understanding of marking criteria, I concentrated on those themes that seemed to offer more insights into these specific aspects. Relating one theme to the other helped me to shed new light on some familiar issues and to reflect on them in a new way.

Some insights into students’ perspectives

As language learners, students gain personal experience of the complexity of language and language learning, but the analysis suggests that they draw on the theme of complexity to articulate their unease with the atomistic approach to evaluation of rubrics and, at times, also to contest the descriptors of the standard for a first level class. This made me reflect on whether the achievement of almost native-like abilities is actually the standard on which we want to base our evaluation. Larsen-Freeman’s (2015) and Kramsch’s (2008) approach to language development as a ‘complex system’ helped me to shed light on the ideas of ‘complexity’ and ‘non-linear relations’ in the context of language learning that emerged from the analysis.

The second theme I identified is the ambiguity and vagueness of the standards for each criterion. Students draw on this theme not so much to communicate their lack of understanding of the marking scheme, but to question the reliability of a process of evaluation that matches performances to numerical values by using opaque descriptors.

The third theme that runs through the discussions is the tension between the promise of objectivity of the marking schemes and the fact that their use inevitably implies an element of subjectivity. There is also a tension between the desire for an objective counting of errors and the feeling that ‘errors’ need to be ‘weighted’ in relation to a specific learning context and an individual learning path. On the one hand, there is the unpredictable and infinite variety of complex performances that cannot easily be broken down into parts in order to be evaluated objectively; on the other hand, there is the expectation that the sum of the parts, when adequately mapped to clear marking schemes, results in an objective mark.

Rubrics in general seem to be part of a double discourse. They are described as unreliable, discouraging and disheartening as an instructional tool. The feedback they provide is seen as having no effect on language development, unlike the complex and personalised feedback that teachers provide. Effective and engaging feedback is always associated with the expert knowledge of a teacher, not with rubrics. However, the need for rubrics as a tool of evaluation is not questioned in itself.

The idea of using exemplars to pin down standards and make the process of evaluation more objective emerges from the Part III focus group discussion. Students considered the pros and cons of using exemplars, drawing on the same arguments that are debated in scholarly articles. Listening to, and reading systematically through, students’ discourses was quite revealing and brought to light some questionable views on language and language assessment that most marking schemes measuring achievement in foreign languages help to promote.

Conclusion

The insights into students’ perspectives gained from the analysis of the focus groups suggest that rubrics can easily create false expectations in students and foster an assessment ‘culture’ based on an idea of learning as a steady increase in skills. We need to ask ourselves how we could design marking schemes that communicate a more realistic view of language development. Could we create marking schemes that students do not find disheartening or unhelpful for understanding how to progress? Rather than just evaluation tools, rubrics should be learning tools that describe different levels of performance and avoid evaluative language.

However, the issues of ‘transparency’ and ‘reliability’ cannot be solved by designing clearer, more detailed or student-friendly rubrics. These issues can only be addressed by sharing our expert knowledge of ‘criteria’ and ‘standards’ with students, which can be achieved through dialogue, practice, observation and imitation. Engaging students in marking exercises and involving them in the construction of marking schemes – for example by asking them how they would measure commonly desired criteria like effort and commitment – offers us a way forward.

References:

Barbour, R. 2007. Doing focus groups. London: Sage.

Howitt, D. 2016. Qualitative Research Methods in Psychology. Harlow: Pearson.

Kramsch, C. 2008. Ecological perspectives on foreign language education. Language Teaching 41 (3): 389-408.

Larsen-Freeman, D. 2015. Saying what we mean: Making a case for ‘language acquisition’ to become ‘language development’. Language Teaching 48 (4): 491-505.

Potter, J. and M. Wetherell. 1987. Discourse and social psychology: Beyond attitudes and behaviour. London: Sage.
