Principles of assessment informing quality assurance

Introduction

About this course

This course is the second in a series of online professional learning courses on the quality assurance of assessment in senior general syllabuses in Queensland.

It provides an overview of the key principles of assessment that inform quality assurance processes.

Senior assessment must be valid, administered fairly and provide equality of opportunity. Senior students should also have confidence that they will receive the results they deserve, regardless of who marks their assessment or the assessment technique used.

All assessment-capable teachers should know and understand these assessment principles and use them to communicate assessment information to stakeholders.

Learning goal

Participants will:

  1. Understand the principles that inform the quality assurance of summative assessment in general syllabuses.
  2. Use the language of quality assurance. 

Success criteria

You will know you are successful if you:

  • know where to look for evidence to make judgments about the validity of assessment 
  • understand the skills required to support the reliability of assessment results
  • consider fairness and equity of assessment for all students
  • use the language of quality assurance in professional conversations on assessment    

Length

  • self-paced
  • approximately 1 hour

Assumed knowledge

Participants should already know and understand the roles and responsibilities of the following in quality assuring summative assessment in general syllabuses:

  • QCAA assessors
  • teachers in schools
  • QCAA.



Talking about assessment

The educators in this video acknowledge that quality assuring assessment is informed by three assessment principles:

  • validity
  • reliability
  • equity and fairness 

These principles set the priorities for quality assurance and determine what evidence of assessment practice assessors look for when quality assuring senior assessment.

Overview: the principles of assessment informing quality assurance

Validity

Defining validity

 

Validity is a key assessment principle. 

It is a judgment about whether an assessment is 'fit-for-purpose'.

In other words,

assessment instrument + use of assessment results (purpose) = validity judgment

A judgment about validity must be:

  • reasonable and accurate
  • specific to a particular use and context
  • evidence-based.

 

 

How schools and assessors make judgments about validity

How assessors make judgments about validity

Assessors make judgments about the validity of an assessment instrument before it is used. In this context, the purpose of the assessment is to measure student achievement at a particular point in time.

Assessors make judgments about validity through:

  • endorsing internal summative assessment 
  • developing and writing external assessment
  • participating in scrutiny panels for external assessment.



How schools make judgments about validity

Schools may use an assessment instrument and its results for more than one purpose.

For example:

  • Schools might conclude that students who score highly on a mathematics examination have mastered modelling and problem solving.
  • Schools might also use the same assessment to predict how students will score in future mathematics assessment.

Schools must judge whether the mathematics exam is valid for each of these purposes.

An assessment instrument may be well constructed but invalid if used for the wrong purpose.

Priorities for quality assuring assessment validity


Assessors prioritise the following elements of assessment to gather evidence of validity:

  • alignment to the discipline as described in the syllabus rationale
  • alignment to syllabus subject matter (content validity)
  • alignment to syllabus constructs such as technique, conditions and cognition (construct validity)
  • authenticity of the task
  • scope and scale of the task
  • technical accuracy of item construction
  • use of an authentication strategy.

Collectively this evidence builds a plausible argument to support an assessor's judgment about validity.

Alignment to the syllabus

 

Assessors look for evidence that:

  • an assessment instrument aligns to the technique identified by the syllabus
  • assessment instruments allow students to demonstrate the assessment objectives
  • items or tasks within an instrument align to the subject matter (content) and cognition (construct) identified by the syllabus
  • instrument-specific marking guides (ISMGs) align to the syllabus
  • external assessment marking guides align to the assessment objectives of the syllabus.

Example — General Mathematics

Item: 

Use matrices to solve the following simultaneous equations.

3x - 2y = -2

4x - 5y = 2

In the General Mathematics syllabus, matrices and simultaneous equations are discrete topics.

An assessor would not find evidence of content validity to support the validity argument.

They would advise rewording this item to remove the reference to matrices.

Revised item:

Solve the following equations simultaneously.

3x - 2y = -2

4x - 5y = 2
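As a quick check on the revised item, the system has a unique whole-number solution. A worked sketch, assuming students use elimination (substitution would work equally well):

```latex
% Multiply the first equation by 5 and the second by 2, then subtract:
%   15x - 10y = -10
%    8x - 10y =   4
% Subtracting gives 7x = -14, so x = -2.
% Substituting into 3x - 2y = -2 gives -6 - 2y = -2, so y = -2.
\begin{align*}
3x - 2y &= -2\\
4x - 5y &= 2\\
\Rightarrow\quad x &= -2, \quad y = -2
\end{align*}
```

Both equations are satisfied by (x, y) = (-2, -2).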

 

Authenticity of the task

 

Assessment instruments should include tasks that are appropriately challenging and provide realistic contexts.

Example — Physical Education

Item

Evaluate the impact of one of the following psychological factors on improving your physical performance in preparing for an elite badminton tournament:

  • personality type
  • stress and anxiety
  • motivation
  • team dynamics.

While the item's subject matter and cognition align to the syllabus, it is unrealistic to expect senior students, who are not elite players, to appreciate the complexities of preparing for an elite badminton tournament.

An assessor would advise that the task direct students to focus on their own or other students' psychological preparation to enhance their performance in badminton.

This makes the task more authentic, relating it to students' experiences within the teaching and learning of the unit.

Revised item

Evaluate the impact of one of the following psychological factors on improving your physical performance in badminton:

  • personality type
  • stress and anxiety
  • motivation
  • team dynamics.

 

Scope and scale of the assessment

 

The scope and scale of an assessment refer to the breadth and depth of subject matter and skills students must draw on to respond appropriately within the syllabus conditions for time and length.

When making a judgment about the suitability of the scope and scale of an assessment instrument, assessors consider:

  • the context of the assessment
  • the amount of information required to complete the assessment
  • the amount of stimulus material to be accessed to complete the assessment
  • the amount of data to be collected to complete the assessment
  • the marks allocated to items within the assessment.

 

Example — Geography

Item

Analyse the land management challenges facing North Stradbroke Island.

This item represents a task or component within a geography field report.

The scope and scale of the task are such that students would find it difficult to respond comprehensively (for 7 marks) within the 600-800-word limit suggested by the syllabus for the analysis section of a field report.

An assessor would advise that:

  • the scope of the task focus on one land management challenge rather than a number of challenges
  • the scale of the task focus on a specific location or site on North Stradbroke Island.

Revised item

Analyse the land management challenges posed by tourism at Point Lookout.

The revised item provides students with the opportunity to demonstrate comprehensive analysis within the syllabus-recommended word length.

Reflection


Reflect on past assessment. 

Consider a time when you have designed a task that:

  • set an inauthentic context for the student, or
  • was inappropriate because the scope or scale was not feasible given the conditions for time or word length.

In light of your reflection, how could the task have been made valid?  

Accuracy of item construction


Authentication strategy

 

  • QCAA: provides guidelines for authentication strategies.
  • Schools and teachers: enact authentication strategies to ensure work submitted for internal summative assessment is the student's own.
  • Trained assessors: quality assure internal summative assessment instruments before they are used, to ensure an appropriate authentication strategy has been identified.

 

Check for understanding

The following terms represent aspects of assessment that can be evaluated by assessors to make a judgment about validity.

 

  • Scope and scale
    An assessment design priority that ensures students can respond to a task appropriately within syllabus conditions for time and length.
  • Construct validity
    A judgment of how well an assessment instrument calls upon the cognition identified in syllabus assessment objectives.
  • Authentic task
    Assessment that provides a realistic context for students, drawing on their experience within the teaching and learning of the syllabus unit.
  • Content validity
    A judgment of how well an assessment instrument calls upon the subject matter and skills identified in syllabus assessment objectives.
  • Alignment
    Ensuring an assessment instrument supports the subject matter, cognition and assessment technique identified by the syllabus.
  • Authentication strategy
    An approach to validating student work as their own, which is identified on an assessment instrument and enacted at various stages of assessment implementation.

Reliability

Defining reliability

Where validity is a judgment about the purpose of assessment, reliability is a judgment about the measurement of assessment.

It refers to the extent to which the results of assessment are:

  • consistent
  • replicable  
  • free from error.

While the reliability of results is a priority for senior summative assessment, no assessment instrument will provide results that are perfectly reliable.

There will always be some degree of variation that affects results.

Like validity, a judgment about reliability is based on evidence drawn from a range of sources.

 

Factors that influence reliability

Many factors influence the reliability of assessment results. These arise from three sources.

The students completing the assessment

  • assessment anxiety
  • health
  • fatigue
  • motivation
  • test-taking skills.



The teachers and assessors marking or confirming assessment

  • inter-marker reliability
  • intra-marker reliability
  • bias
  • calibration of markers
  • accuracy of marking.

The administration and implementation of assessment

  • clarity of instructions
  • internal consistency of assessment items
  • physical conditions of the room or place where assessment is conducted
  • distractions during assessment
  • straightforward, unambiguous marking guides that can be interpreted consistently by all teachers.

How assessors quality assure assessment reliability

Assessors are trained and employed to ensure the reliability of student results is as high as possible.

This includes:

  • undertaking regular calibration to syllabus instrument-specific marking guides
  • participating in the annual training for external assessment marking operations, which includes calibration to marking guides and marking of training scripts
  • confirming senior internal summative assessment results
  • having their marking of external assessments monitored through the use of control scripts.

Priorities for quality assuring assessment reliability


Assessors prioritise the following elements of assessment to provide evidence of reliability.

Intra-marker reliability

  • A marker should consistently give the same marks to the same student response on different occasions.

Inter-marker reliability

  • Different markers should give the same marks to the same student response.

Consistent and precise application of marking guides and ISMGs

  • Bias — a marker should adhere to a marking guide and not give undue attention to extraneous factors such as handwriting.
  • Marker drift — a marker should be consistent in their application of a marking guide to student responses as they progress through a marking operation (avoiding drift to becoming more generous or more rigorous).
  • Regression to the mid-point — a marker should utilise the full marking range and not mark to the middle of a range merely to avoid controversy.

The relationship between validity and reliability


'In any assessment there is a limit to the extent to which both reliability and validity can be optimised.'

(Harlen and Johnson, 2014)

Assessment items that require explicit, clear-cut responses, such as multiple choice, will have high reliability because they reduce the chance of marker error.

These item types, however, generally under-represent the full range of assessment objectives identified in syllabuses and limit evidence of construct validity.

While constructed responses such as essays and extended paragraphs provide greater opportunity to demonstrate a range of assessment objectives, they may be less reliable because there will be some degree of marker variance.

These items require detailed marking guides, thorough training of markers and monitoring of marker performance to minimise the threats to reliability.

A quality summative assessment instrument strikes a balance between validity and reliability.

Equity and fairness
