WORKSHOPS
Workshop 1: Understanding and implementing the moderation of school-based assessment for high-stakes examinations
Damian Murchan, Stuart Shaw, Evgenia Likhovtseva
This workshop provides an opportunity to learn about external moderation of school-based assessment (SBA) used in high-stakes secondary school examinations. Presentations and group work help develop participants’ critical understanding of moderation approaches and participants can share their own practices. Many jurisdictions employ SBA within secondary qualifications to increase the validity of inferences about students’ learning of aspects of curricula difficult to assess using traditional examinations. This, however, presents challenges with reliability, prompting moderation of SBA.
Workshop 2: Optimising the construct validity of test items
Ezekiel Sweiry
The purpose of this workshop is to explore key themes and principles, from both research and practice, relating to how the construct validity of test items can be optimised. Construct validity is taken to refer to the degree to which items assess the underlying theoretical constructs they are intended to measure. Session 1 will consider the key threats to validity posed by different selected- and constructed-response item formats, and explore the extent to which different levels of thinking can be elicited through these item formats. Session 2 will explore the key features of test items that impact on validity.
Workshop 3: An Introduction to the Generalized Kernel Equating Framework with Applications in R
Jorge González, Marie Wiberg, Alina von Davier
The aim of equating is to adjust the score scales on different test forms so that scores can be comparable and used interchangeably. This is extremely important to provide fair assessments to all test takers. The goals of the pre-conference workshop are for attendees to be able to understand the principles of equating, to conduct equating, and to interpret the results of equating in reasonable ways. Emphasis will be given to the new Generalized Kernel Equating (GKE) framework as described in the forthcoming book “Generalized Kernel Equating using R” written by the instructors (Wiberg, González, von Davier, 2024).
Workshop 4: Breaking barriers for all test-takers
Caroline Jongkamp, Helen Claydon, Thomais Rousoulioti, Renika-Irini Papakammenou
It is often the case that diversity and inclusion are afterthoughts when an organisation is evolving its summative e-assessment offering. This workshop will provide an engaging opportunity for collaboration with peers, to consider the perspectives of a range of test-takers. Thought-provoking discussions will equip participants with areas to take away and integrate in their future work practices.
Workshop 5: Assess your assessment
Bas Hemker, Cor Sluijter
This workshop will provide participants with all the tools needed to formally assess educational assessments, whether computer-based, paper-based, or delivered through an assessment system of their choice. Assessing the quality of your own assessment serves as an instrument for quality assurance. It also helps to communicate the quality of the assessment and to ensure accountability to end users (students, teachers, schools, policy makers) and to the general public.
Workshop 6: Evaluating Impact in the Context of Educational Assessment
Brigita Seguis, Hanan Khalifa
The workshop aims to offer a comprehensive overview of key concepts, methodologies, and best practices for assessing the impact of assessments, educational programmes, interventions and policies. Targeted at professionals in educational assessment, research, policy, and practice, it will equip participants with the knowledge and skills necessary for conducting rigorous impact evaluations and making evidence-based decisions to enhance educational outcomes.
Workshop 7: Introduction to multilevel modelling using large-scale assessment data
Anastasios Karakolidis, Vasiliki Pitsia
This workshop provides an accessible theoretical and practical introduction to multilevel modelling, a technique that allows for the appropriate analysis of large-scale assessment data and offers significant advantages compared to other single-level techniques (e.g., examination of interactions between student- and school-level factors). Specifically, the workshop presents key concepts and design features of large-scale assessments relevant to multilevel modelling (e.g., cluster sampling, weights), introduces participants to the theory behind multilevel models, considers issues from a practical perspective to support data preparation and the selection of modelling techniques, and engages participants in the application of multilevel modelling (using Mplus) and the interpretation of its results. Upon completion of the workshop, participants are expected to have a thorough understanding of key aspects of large-scale assessments and multilevel modelling, and be able to run their own multilevel models.
Understanding and implementing the moderation of school-based assessment for high-stakes examinations
Damian Murchan, Stuart Shaw and Evgenia Likhovtseva
This workshop draws in part on a study designed to investigate how different jurisdictions implement external moderation of School Based Assessment (SBA). Concerns about high-stakes examinations at upper secondary level prompt some systems to incorporate SBA into their qualifications. While addressing issues of validity and student stress, SBA raises reliability concerns, potentially compromising trust in the qualifications. External moderation is frequently used to allay such concerns. The study identified illustrations of moderation in secondary school exit examinations across 13 jurisdictions and investigated the local contexts in which they occur. Findings, including insights gained from interviews with key officials, revealed how education systems use moderation to ensure consistency of standards within and across schools. A range of approaches was identified, with a marked mixing of models across jurisdictions. The findings provide useful insights for jurisdictions contemplating the introduction of externally moderated SBA or wishing to review their current practice in this area.
This workshop focuses on the policy and practices internationally associated with external moderation of high-stakes examinations at the conclusion of secondary education. The workshop divides into four parts. Sessions within each part culminate in ‘Reflection and Activity’ discussions.
Part 1: The place of external moderation in examinations: using moderation to support school-based assessment
Part One invites participants to consider the complex issues associated with any reform or adjustment in relation to high-stakes assessment (Kellaghan & Greaney, 2020). The instance of high-stakes examinations is addressed in the context of the significant influence of such assessments on pedagogical practice, students’ learning, students’ future educational and employment opportunities, and wellbeing.
Part 2: Overview of moderation approaches
Part Two situates SBA as a salient feature of high-stakes assessment, including in the international examination systems that constitute the backdrop to this workshop. Consequently, moderation is required to ensure comparability, consistency, and accuracy of SBA marks within and across different schools.
Part 3: Managing and implementing systems of externally moderated SBA
Part Three explores the implementation and management of externally moderated SBA. Three main areas are addressed in Session 4:
- Focus on the complete examination cycle.
- Professional development and materials for teachers.
- Challenges to implementation (including recent developments in generative AI, which pose additional challenges for jurisdictions that employ externally moderated SBA).
Part 4: Lessons Learned
Part Four reflects on the key messages and themes emerging across the preceding presentations, identifying lessons of relevance to practice in relation to external moderation. Themes evident in the overall analysis in Part Four include:
- benefits and challenges of using different moderation strategies
- factors influencing decisions by system designers and administrators in relation to moderation
- locus of control of moderation (centralised or decentralised)
- systemic approaches to implementation, including incremental introduction of moderation, capacity, and capacity-building at national/agency and at school levels
- communication, messaging and securing stakeholder support in relation to moderation systems
- current trends and future possibilities, including consideration of the impact of digital technology.
Presenters
Damian Murchan is an Associate Professor in the School of Education at Trinity College Dublin. Formerly a teacher and principal in primary schools, he has held recent leadership positions as Head of the School of Education and Head of the School of Creative Arts in Trinity College. Involved in teacher education programmes for many years, he has wide experience of working with primary and second-level teachers in the area of assessment. His research interests include assessment, e-learning, educational reform and teachers’ professional development. Recent projects have focused on reform of policy and practice in lower and upper secondary education in Ireland. Damian is the Vice-President of the Association for Educational Assessment – Europe, and a Fellow of the Association. He has held a number of advisory roles in relation to the development of assessment policy and practice in Ireland and internationally. He tweets @damianmurchan.
Damian publishes on the topic of curriculum and assessment reform. He co-edited the book Curriculum change within policy and practice: Reforming second-level education in Ireland (Palgrave Macmillan, 2021). This volume explored fundamental restructuring of lower secondary education in Ireland, including a focus on highly contested reforms to assessment and school-based assessment (SBA) in particular. His most recent book was published in May 2024. Titled Understanding and Applying Assessment in Education, Second Edition (Murchan & Shiel, SAGE Publications Ltd), this book includes analysis of a range of issues relating to high-stakes examinations, SBA and moderation in lower and upper secondary education.
Stuart Shaw is an Honorary Professor at University College London in the Institute of Education – Curriculum, Pedagogy & Assessment. He has worked for international awarding organisations for over 20 years and is particularly interested in demonstrating how educational, psychological, and vocational tests seek to meet the demands of validity, reliability, and fairness.
Stuart joined Cambridge Assessment (now Cambridge University Press & Assessment) in January 2001. He was a Senior Validation Officer with Cambridge ESOL for over six years with specific skill responsibilities for assessing second language writing. From 2007 to 2021, he was Head of Research at Cambridge Assessment International Education. He is now an independent educational assessment researcher and consultant. Stuart is Chair of the Board of Trustees of the Chartered Institute of Educational Assessors (CIEA). He is also a Fellow of the CIEA. Stuart is a Fellow of the Association for Educational Assessment in Europe (AEA-Europe), an elected member of the Council of AEA-Europe, and is Chair of its Scientific Programme Committee. He is also an elected member of the Board of Trustees of the International Association for Educational Assessment (IAEA) and Chair of the IAEA Communications Committee.
Stuart has a wide range of publications in English second language assessment and educational/psychological research journals (around 150 publications). His most recent book, co-authored with Isabel Nisbet, ‘Educational Assessment in a Changing World: Lessons Learned and the Path Ahead’, is to be published by Routledge in November 2024.
Evgenia Likhovtseva is a Visiting Researcher at the School of Education, Trinity College Dublin, and holds the position of Research Manager at the Imperial War Museum in London. Her research encompasses a broad spectrum of educational policy areas, with a special focus on assessment and examination practices. Evgenia earned her BA in Philosophy and a Master of Public Policy and Management. She further enhanced her skills at the Executive Public Policy for Internationals (EPPI) program at the Goldman School of Public Policy, University of California, Berkeley.
Evgenia received her PhD from Trinity College Dublin in 2021, focusing on comparative policy practices related to the development of World Class Universities (WCU) in BRICS countries. Her research took her to Brazil, Russia, China, and South Africa. She was a Visiting Fellow at the Fudan Development Institute, Fudan University, where she lectured on and examined the contemporary Chinese higher education policy landscape. During her PhD, she was awarded several scholarships, including the Government of Ireland – International Education Scholarship and the Postgraduate Research Fellowship Award in Arts, Humanities, and Social Sciences at Trinity College Dublin. She has also taught Philosophy and Education at Trinity College Dublin and Public Policy at the School of Advanced Studies, University of Tyumen, Russia.
Evgenia has extensive experience working on large-scale policy projects with both national and local governments. She collaborated with the International Centre for Local Democracy in Sweden (ICLD), focusing on the political participation of people with disabilities in the Baltic region of Russia and Scandinavia. The outcomes of her work have been published in policy briefs, specifically focusing on best practices for the political participation of people with disabilities. She led an international scoping study at the Association of Commonwealth Universities (ACU) in London, providing valuable data for the Foreign, Commonwealth and Development Office (FCDO) on university capacity strengthening in Africa and Southeast Asia.
Assess your Assessment
Bas Hemker, PhD, and Cor Sluijter, PhD
Bas and Cor
- Were born in Amsterdam in the 1960s
- Each wrote a thesis on a psychometric topic in the 1990s
- Have vast experience in evaluating tests
- Have each been head of the psychometric department of Cito
- Are Fellows of AEA Europe
- Are official assessors for the review system for the quality of tests and exams of the Dutch Research Centre for Examinations and Certification (RCEC)
- Love to give workshops together
Bas
- Still works at Cito as an educational measurement researcher, with the quality of school exams as one of his projects
- Has been a member of the Dutch Committee on Test Matters (COTAN) for almost two decades and is now working on the new national quality criteria
- Is a lecturer/researcher in educational measurement at the Open University
- Is supervising a PhD on the quality of teacher-made tests
- Was for many years a member of the Professional Development Committee of AEA Europe
Cor
- Works as an independent consultant on test quality and test use
- Is a lecturer in educational measurement at Fontys University of Applied Sciences
- Is an external member of the exam boards of several universities in the Netherlands
- Is a member of the Dutch Advisory Board on Competencies in Financial Services (CDFD)
- Will become the new Vice-President of AEA Europe in November 2024
Abstract
Educational assessments serve specific purposes, such as evaluation, monitoring, diagnosis, selection, or guidance. Achieving these goals requires assessments of sufficient quality. This workshop aims to provide participants with practical tools to objectively evaluate the quality of an assessment of their choice.
In the theoretical part of the workshop, we provide an overview of evaluation systems, such as the Standards for Educational and Psychological Testing, the EFPA review model, the ETS Standards for Quality and Fairness, and more, highlighting their similarities and differences.
In the applied part of the workshop, participants apply theory to practice by evaluating the quality of an assessment of their own choice, based on relevant criteria. This involves applying the most suitable system for evaluating the assessment’s quality. Relevant materials include research reports on how standards are determined, the reliability and validity of the assessment scores, the assessment manual, and more.
The workshop facilitators guide participants in applying various evaluation criteria to their assessment, such as:
- Theoretical Basis of Assessment Construction: assessing how well the assessment content aligns with its intended purpose, theoretical background, and operationalization.
- Quality of Assessment Materials: evaluating the level of standardization in assessment tasks, scoring, and instructions, as well as the clarity of administration guidelines.
- Quality of the Assessment Manual: focusing on the information provided to support assessment users in administration and interpretation.
- Norms: considering norm-referenced, content-referenced or criterion-referenced interpretation criteria.
- Reliability: evaluating reliability coefficients and the quality of the research supporting assessment score reliability.
- Construct Validity: assessing construct validity outcomes and the quality of the relevant research carried out.
- Criterion Validity: evaluating the relationship between assessment outcomes and related external measures, and the quality of the relevant research carried out.
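To make the reliability criterion above concrete, the sketch below shows one piece of evidence a reviewer might recompute or check: Cronbach's alpha from an item-score matrix. The data are simulated purely for illustration; in practice the figures would come from the assessment provider's own technical documentation.

```r
# Illustrative only: Cronbach's alpha computed from a (simulated) item-score
# matrix, the kind of internal-consistency evidence a reviewer might compare
# against the figures reported in an assessment manual.
set.seed(3)
n_persons <- 200
n_items   <- 10
ability   <- rnorm(n_persons)
items     <- sapply(seq_len(n_items),
                    function(i) ability + rnorm(n_persons))  # item scores

# alpha = k/(k-1) * (1 - sum of item variances / variance of the total score)
k     <- ncol(items)
alpha <- (k / (k - 1)) * (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
alpha
```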
In the final discussion, participants share their findings, and we conclude with practical lessons learned.
Workshop Preparation
Participants should bring all relevant information about their chosen assessment, including the assessment manual, relevant research reports, and ideally the assessment itself.
An Introduction to the Generalized Kernel Equating Framework with Applications in R
Jorge González, Marie Wiberg, Alina A. von Davier
Abstract
The aim of equating is to adjust the score scales on different test forms so that scores can be comparable and used interchangeably. This is extremely important to provide fair assessments to all test takers. The goals of the pre-conference workshop are for attendees to be able to understand the principles of equating, to conduct equating, and to interpret the results of equating in reasonable ways. Emphasis will be given to the new Generalized Kernel Equating (GKE) framework as described in the forthcoming book “Generalized Kernel Equating using R” written by the instructors (Wiberg, González, von Davier, 2024). Different R packages will be used to illustrate how to perform equating when test score data are collected under different data collection designs. Traditional equating methods, as well as kernel equating and item response theory (IRT) equating methods under the GKE framework, will be illustrated. The main part of the training session is devoted to practical exercises in how to prepare and analyze test score data using different data collection designs and different equating methods. The expected audience includes researchers, graduate students, and practitioners. An introductory statistical background and experience in R are recommended but not required.
Keywords: Test equating, practical implementation in R, assessment at different administrations.
Short Bios:
Jorge González is an associate professor at the Faculty of Mathematics, Pontificia Universidad Católica de Chile. He is the author of a book and several publications on test equating. His research focuses on the statistical modeling of data arising from the social sciences, particularly in the fields of test theory, educational measurement, and psychometrics.
Marie Wiberg is a professor of statistics with a specialty in psychometrics at Umeå University in Sweden. She is the author of more than 60 peer-reviewed research papers and has edited nine books. Her research interests include test equating, large-scale assessments, parametric and nonparametric item response theory, and educational measurement and psychometrics in general.
Alina A. von Davier is the Chief of Assessment at Duolingo and the founder of EdAstra Tech. She has received several awards, including the ATP Career Award, the AERA award for significant contribution to educational measurement and research methodology, and the NCME annual award for scientific contributions. Her research is in the fields of computational psychometrics, machine learning, assessment, and education.
Why AEA-E members / conference delegates should attend this workshop:
In the realm of educational measurement, ensuring the comparability of test scores holds significant importance, as these scores influence crucial decisions across diverse contexts. Test scores play an important role in determining academic admissions, awarding scholarships, tracking progress in achievement, assessing competencies in specific tasks, among other applications. Fairness to all test takers stands as a fundamental aspect of educational assessment. When utilizing test scores for decision-making purposes, it becomes imperative to present scores in a manner that is both equitable and precise.
Given concerns regarding test security, it is common for measurement programs to administer different versions of a test, all aimed at evaluating the same attribute. Equating emerges as a fundamental technique used to adjust the score scales across different forms, enabling test scores to be used interchangeably. In this pre-conference workshop, participants will gain insights into the principles of equating, learn how to conduct equating analyses, and effectively interpret the results. A comprehensive understanding of equating and its various methodologies is paramount in ensuring a fair assessment, making it a matter of considerable importance for all members of the AEA.
Who this Workshop is for:
The expected audience comprises researchers, graduate students, and practitioners. While an introductory statistical background and experience in R are recommended, they are not mandatory.
Overview of workshop
Equating is essential for adjusting the score scales across different test forms to ensure comparability and interchangeability of scores (González & Wiberg, 2017). It plays a central role in large-scale testing programs, facilitating the collection, analysis, and reporting of test scores. Equating guarantees a fair assessment irrespective of test takers’ backgrounds, time, or location.
This pre-conference workshop has two primary objectives. Firstly, it aims to introduce participants to equating, providing both conceptual understanding and practical experience through examples and exercises. Utilizing the R software, and particularly the equate, kequate, and SNSequate packages, attendees will learn how to conduct and implement various equating methods applicable across different data collection designs.
Secondly, the workshop seeks to equip attendees with the necessary skills to perform diverse equating methods under the new Generalized Kernel Equating (GKE) framework (Wiberg, González and von Davier, 2024) using available R packages. Drawing from the instructors’ experiences, the objective is to offer an updated perspective on the Kernel Equating process and methodologies while consolidating recent advancements into the structured and all-encompassing GKE framework.
The GKE framework expands upon the foundational principles of Kernel Equating (KE) by offering a comprehensive theory that encompasses all its facets. Specifically, the GKE framework introduces several enhancements:
- i) Diverse models and techniques for presmoothing.
- ii) Expansion of design functions to estimate score probabilities across different equating models.
- iii) Exploring alternative kernels beyond the Gaussian kernel.
- iv) Introducing multiple options for selecting bandwidth parameters.
- v) Incorporating additional types of data beyond binary scoring.
- vi) Offering novel perspectives on equating evaluation.
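For orientation, and following the standard kernel equating literature (von Davier, Holland & Thayer, 2004) rather than the workshop materials themselves, the Gaussian-kernel continuization that enhancements i) to iv) above generalise can be written as

$$
F_{h_X}(x) = \sum_j r_j \,\Phi\!\left(\frac{x - a_X x_j - (1 - a_X)\mu_X}{a_X h_X}\right),
\qquad
a_X = \sqrt{\frac{\sigma_X^2}{\sigma_X^2 + h_X^2}},
$$

where the $r_j$ are the (presmoothed) probabilities of the discrete scores $x_j$, $\mu_X$ and $\sigma_X^2$ are their mean and variance, $h_X$ is the bandwidth, and $\Phi$ is the standard normal distribution function. The equated score is then $\varphi_Y(x) = F_{h_Y}^{-1}\big(F_{h_X}(x)\big)$; GKE generalises the presmoothing model, the kernel, and the bandwidth selection in this formulation.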
Following the chapters of the book “Generalized Kernel Equating: with applications in R”, authored by the instructors and Prof. Alina A. von Davier, which will be published by Chapman and Hall in mid 2024, the training session will cover traditional equating methods, and both kernel equating and item response theory equating under the GKE framework. Using real-world data, practical applications and software code will be provided to enhance accessibility and promote widespread adoption of these methods.
Participants will be guided through the steps of traditional equating methods and kernel equating, utilizing R packages such as equate, SNSequate, and kequate. They will also receive practical guidance on performing item response theory equating under the GKE framework using R. The workshop will conclude with practical recommendations and examples to ensure a fair assessment regardless of test takers’ backgrounds or circumstances.
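As a flavour of what this hands-on work can look like, the sketch below runs an equipercentile equating with the equate package and its bundled ACTmath example data (score distributions of two ACT mathematics forms under an equivalent-groups design). It is an illustration only, not taken from the workshop materials.

```r
# A minimal observed-score equating sketch using the 'equate' package.
library(equate)

# Score frequency tables for forms X and Y
# (column 1 = score scale, columns 2 and 3 = frequencies on each form).
act.x <- as.freqtab(ACTmath[, 1:2])
act.y <- as.freqtab(ACTmath[, c(1, 3)])

# Equipercentile equating of form X onto the form-Y scale,
# with and without log-linear presmoothing of the score distributions.
eq        <- equate(act.x, act.y, type = "equipercentile")
eq.smooth <- equate(act.x, act.y, type = "equipercentile",
                    smoothmethod = "loglinear")

# The concordance table maps each form-X score to its form-Y equivalent.
head(eq$concordance)
plot(eq, eq.smooth)
```

The kernel-based methods covered in the session are implemented in the kequate and SNSequate packages, which follow a comparable overall workflow (presmoothing, continuization, equating, evaluation).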
Overall, this pre-conference workshop aims to provide attendees with comprehensive knowledge of equating developments and practical skills within the R environment. Through examples, exercises, and hands-on activities, participants will have ample opportunities to familiarize themselves with various equating methods and R packages.
Preparation for the workshop:
Attendees should bring their laptops with the R software installed, along with the latest versions of the R packages: equate, kequate, SNSequate, mirt. Electronic training materials will be provided to all participants.
Tentative Schedule
Time | Session | Presenter
9:30–9:45 | Welcome & introductions, outline of the workshop | Jorge, Alina & Marie
9:45–10:45 | Introduction to equating: principles, designs and some methods | Alina
10:45–11:00 | Break |
11:00–12:00 | Kernel equating and introduction to the GKE framework | Jorge
12:00–13:00 | Lunch |
13:00–14:00 | GKE framework continued. Examples in R | Jorge
14:00–14:30 | IRT KE under the GKE framework for binary and polytomous scored items | Marie
14:30–14:45 | Tea/coffee break |
14:45–15:30 | IRT KE under the GKE framework for binary and polytomous scored items. Examples in R | Marie
15:30–16:30 | Examples and practical recommendations to provide a fair assessment. Questions | Jorge, Alina & Marie
Note: This is a tentative schedule. Sessions can be further detailed if required.
Introduction to multilevel modelling using large-scale assessment data
Abstract
This workshop provides an accessible theoretical and practical introduction to multilevel modelling, a technique that allows for the appropriate analysis of large-scale assessment data and offers significant advantages compared to other single-level techniques (e.g., examination of interactions between student- and school-level factors). Specifically, the workshop presents key concepts and design features of large-scale assessments relevant to multilevel modelling (e.g., cluster sampling, weights), introduces participants to the theory behind multilevel models, considers issues from a practical perspective to support data preparation and the selection of modelling techniques, and engages participants in the application of multilevel modelling (using Mplus) and the interpretation of its results. Upon completion of the workshop, participants are expected to have a thorough understanding of key aspects of large-scale assessments and multilevel modelling, and be able to run their own multilevel models.
Overview of workshop
International large-scale assessments, such as the Trends in International Mathematics and Science Study (TIMSS), the Progress in International Reading Literacy Study (PIRLS), and the Programme for International Student Assessment (PISA) and their national equivalents (e.g., National Assessment of Educational Progress [NAEP]), play a crucial role in shaping educational policies and practices globally. Such assessments provide data that are rich yet complex due to their assessment and sampling designs. It is important to be aware of these complexities in order to analyse large-scale assessment data correctly and interpret results appropriately to inform policy and practice. This workshop aims to introduce AEA-E conference delegates to these complexities, the ways assessment designs need to be accounted for in the analysis of assessment data, and the techniques that need to be used for the appropriate analysis and interpretation of large-scale assessment data. Multilevel modelling is a very useful statistical analysis technique for drawing meaningful inferences from complex large-scale assessment data. However, it can be perceived as overly technical and highly complicated, potentially deterring some professionals/researchers from its use.
This workshop provides an accessible introduction to the topic and serves as a starting point for the application of multilevel modelling. Participants will engage in an interactive learning process, during which they will: i) gain an understanding of key concepts and design features of large-scale assessments relevant to multilevel modelling, ii) familiarise themselves with the logic and theory behind multilevel models by considering issues from a practical perspective to support data preparation and the selection of modelling techniques, and iii) apply multilevel modelling using Mplus.
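As a minimal illustration of the kind of model involved: the workshop itself uses Mplus, so the R/lme4 sketch below, with simulated data and hypothetical variable names, is only meant to show the two-level (students within schools) structure.

```r
# Illustrative only: a two-level random-intercept model in R (lme4),
# not the Mplus setup used in the workshop.
library(lme4)

set.seed(1)
n_schools  <- 50
n_students <- 30
school_id     <- rep(seq_len(n_schools), each = n_students)
school_effect <- rnorm(n_schools, sd = 5)[school_id]   # between-school variation
ses           <- rnorm(n_schools * n_students)          # student-level predictor
score <- 500 + 10 * ses + school_effect +
         rnorm(n_schools * n_students, sd = 20)
dat <- data.frame(score, ses, school_id)

# Random intercept for schools (level 2), students at level 1.
m <- lmer(score ~ ses + (1 | school_id), data = dat)
summary(m)

# Intraclass correlation: share of score variance lying between schools.
vc  <- as.data.frame(VarCorr(m))
icc <- vc$vcov[vc$grp == "school_id"] / sum(vc$vcov)
icc
```

A sizeable intraclass correlation indicates that a substantial share of the variance in scores lies between schools, which is precisely the situation in which single-level techniques give misleading standard errors and multilevel models are needed.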
Who this workshop is for
This workshop will be useful to educators, researchers, academics, undergraduate and postgraduate students, policy-makers and other professionals involved in collecting, analysing, and/or using (or planning to use) national and/or international large-scale assessment data to address research questions and inform educational policy and practice. Participants will acquire transferable knowledge and technical skills, which can be applied across different contexts beyond large-scale assessments. Specifically, this workshop is relevant to individuals who work with data stemming from complex designs (e.g., clustered samples and longitudinal data) for which multilevel models are useful.
Presenters
Anastasios Karakolidis is a Research Associate at the Educational Research Centre, Ireland, and he is currently the National Project Manager for the Programme for International Student Assessment (PISA). He also works on the National Assessments and has been involved in several research projects across Europe. Anastasios has given lectures on research methodology and advanced statistical techniques to postgraduate students and academic staff. His research interests include research methodology, statistical analysis, measurement, assessment and testing. Anastasios has published papers in peer-reviewed academic journals, presented his research at various international conferences, and co-authored a book chapter on multilevel modelling of international large-scale assessment data.
Vasiliki Pitsia is a Research Associate at the Educational Research Centre, Ireland, specialising in quantitative research methodology and statistical analysis techniques. She is involved in national and international large-scale assessments, including TIMSS, PIRLS, and the National Assessments of Mathematics and English Reading (NAMER) and serves as an Associate Editor of the Irish Journal of Education. Vasiliki has acquired a broad range of research experience through her roles as a researcher, data analyst, and psychometrician on various projects in Ireland and Greece, and as a consultant at the World Bank Group. She also has extensive teaching experience, delivering lectures on research methodology, statistics, measurement, and assessment to postgraduate students and staff at academic institutions across Europe and workshops on statistics within the ERC. Her research has attracted grants and awards, including the AEA-Europe Kathleen Tattersall New Assessment Researcher Award, and it has been published in peer-reviewed academic journals and presented at national and international conferences. Her research interests and areas of expertise include research methodology, statistical analysis, psychometrics, measurement, and assessment.
Evaluating Impact in the Context of Educational Assessment
Abstract
The workshop aims to offer a comprehensive overview of key concepts, methodologies, and best practices for assessing the impact of assessments, educational programmes, interventions and policies. Targeted at professionals in educational assessment, research, policy, and practice, it will equip participants with the knowledge and skills necessary for conducting rigorous impact evaluations and making evidence-based decisions to enhance educational outcomes.
In the first part of the workshop, we will focus on essential definitions (e.g., washback, impact, consequential validity), concepts (e.g., input, output, outcome), and evaluation frameworks and models (e.g., LogFrame, Theory of Change).
We will then explore various impact methodologies, including experimental, quasi-experimental, and non-experimental approaches, illustrated with practical examples from diverse educational settings. The strengths and limitations of each approach will be discussed, guiding participants on selecting appropriate evaluation methodologies based on their context, objectives and available resources.
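As a purely illustrative example of one quasi-experimental design of the kind mentioned above (not necessarily one used in the workshop), the sketch below estimates a programme's impact on test scores with a difference-in-differences regression on simulated data; all variable names are hypothetical.

```r
# Illustrative only: difference-in-differences estimate of a programme's
# impact on test scores, using simulated before/after data for treated and
# comparison groups.
set.seed(2)
n       <- 400
treated <- rep(c(0, 1), each = n / 2)    # in / not in the programme
post    <- rep(c(0, 1), times = n / 2)   # before / after implementation
effect  <- 4                             # true programme impact built into the data
score   <- 250 + 8 * treated + 3 * post +
           effect * treated * post + rnorm(n, sd = 10)
d <- data.frame(score, treated, post)

# The coefficient on treated:post is the difference-in-differences estimate.
did <- lm(score ~ treated * post, data = d)
summary(did)$coefficients["treated:post", ]
```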
In the second part of the workshop we will cover the essential steps in designing and conducting impact evaluations. Participants will learn practical strategies for formulating evaluation questions and hypotheses, defining indicators, selecting data collection instruments, and engaging stakeholders. A significant portion of the discussion will focus on data collection methods and techniques, such as surveys, interviews, classroom observations, and document analysis. Participants will engage in hands-on exercises, review and critique data collection instruments, and discuss key sampling approaches.
Throughout the workshop, participants will engage in group discussions, share their own experiences and learn from diverse perspectives. The workshop will allow participants to develop a deeper understanding of impact evaluation principles and practices, and how these can be applied in their own contexts.
Presenters’ bios:
Dr Brigita Séguis is an impact evaluation specialist and educational researcher with extensive experience in designing, conducting and overseeing research and evaluation projects related to assessment, education, bilingualism and multilingualism, language learning and digital innovations.
Currently she oversees the delivery and implementation of impact evaluation projects across Cambridge University Press and Assessment (English). This includes collaborating with educational institutions and government organisations on joint evaluation projects, designing and commissioning impact evaluation studies and white papers, situational and policy analysis, fieldwork and data collection, providing mentoring and training, quality assurance, and dissemination of findings across a wide range of audiences and formats. She has conducted impact evaluation projects in Japan, UAE, UK, Vietnam, Spain, Uzbekistan, France and Oman.
Prior to joining the Impact Evaluation team, Brigita worked as a Senior Research Manager and was responsible for conducting research related to assessment development and validation.
Brigita holds a DPhil in Linguistics from Oxford University.
Dr Hanan Khalifa is a leading language assessment expert who has developed national and international examinations and aligned curricula and tests to the CEFR. Working for international development organisations, she has led monitoring and evaluation programmes. For two decades, Hanan led Education Reform & Impact work at Cambridge University Press & Assessment English. Most recently, she has been leading a Pan-Arab initiative on developing a conjoint measurement scale for Arabic under the auspices of MetaMetrics Inc and Alef Education.
As an academic and a Council of Europe expert, she authored and contributed to seminal work, e.g., the socio-cognitive model for Reading (Khalifa & Weir 2009), the New Companion volume of the CEFR (2018, 2020), Qatar Foundation Arabic Framework (2022), Cambridge Partnership for Education Impact Framework (2023) and advised ministries of education globally on language education matters.
Dr Khalifa has won several international awards, has given numerous workshops and masterclasses, and is an accomplished public speaker.
Breaking barriers for all test-takers
Presenters’ Bios
Caroline Jongkamp:
Caroline Jongkamp has worked in assessment development and assessment policy for 24 years. She has led assessment development for Dutch secondary school leaving exams and was also responsible for the transition from paper-based to computer-based assessment systems. She has advised assessment organisations worldwide on the design of assessment systems and on the improvement of current assessment systems.
Her experience covers all aspects of assessment development and administration. Caroline has a special interest in e-assessment, inclusive assessment, item banking, maintaining standards, and review of test and item statistics.
Caroline is currently a manager at the College voor Toetsen en Examens (CvTE), the board of tests and examinations in the Netherlands, where she coordinates diploma programmes for candidates who are not enrolled in a school and for students in special education. She has previously worked at Cito, the Netherlands.
Caroline holds an MSc in Econometrics with a specialization in Operations Research. She has been a Steering Group member for AEA-Europe’s eAssessment Special Interest Group.
Helen Claydon:
Helen Claydon is an experienced assessment developer and leader, having worked in assessment development for almost 30 years. She has led a range of projects developing summative and formative assessments for ages from 6 years old to adult. Most notably, she has led assessment development for national programmes in the United Kingdom, including the Scottish National Standardised Assessments (SNSA), the KS1 and KS2 National Curriculum Tests in mathematics (SATs) and the Professional Skills Tests for Prospective Teachers. She has also advised organisations such as Qualifications Wales, the Association for Project Management (APM) and the Association of Chartered Certified Accountants (ACCA) on the design of their qualifications and/or their transition to e-assessment.
Her experience covers all aspects of the assessment development process, including assessment design, item writing and review, trialling, review of item and test statistics, test construction, standard setting and standards maintenance. She has particular interests in e-assessment and accessibility. Her subject specialism is mathematics and Helen is an experienced mathematics item writer for the primary and early secondary age ranges.
Helen is currently Deputy Head of Admissions Testing at GL Assessment. She has previously worked at the Australian Council for Educational Research (ACER UK); the Standards and Testing Agency (STA); ACCA; Edexcel; the Qualifications and Curriculum Authority (QCA); and the National Foundation for Educational Research (NFER). She has also worked as a freelance assessment consultant.
Helen has a Master’s degree in Education. She is a Steering Committee Member for AEA-Europe’s eAssessment SIG and a Fellow of the Chartered Institute of Educational Assessors. Helen was a board member of the e-Assessment Association between 2015 and 2018, during which time she helped to launch the International e-Assessment Awards. Helen has been a judge for the awards each year since their launch.
Thomais Rousoulioti:
Thomais Rousoulioti, PhD in Applied Linguistics (scholarship from the State Scholarships Foundation), is a member of the special teaching staff at Aristotle University of Thessaloniki, Greece. From 2010 to 2017 she worked at the Department of Support and Promotion of the Greek Language of the Centre for the Greek Language, Greece, where she was involved in the design and implementation of research programmes on the teaching and assessment of Greek as a second/foreign language, as well as in the design and editing of tests for the assessment of participants in the examinations for the Certification of Attainment in Greek.
She has also worked for the Hellenic Open University and the University of Nicosia, teaching in online distance MA programmes. Her research interests include the teaching and assessment of language proficiency in multilingual settings, the design of teaching materials, adult education, distance education and inclusive assessment. She is the coordinator of a postgraduate course on assessment in Greek as a second/foreign language and the coordinator of the module Students’ Assessment in the training programme Routes of the Centre for the Greek Language, which is the sole representative of Greece at the European Federation of National Institutions for Language (http://efnil.org/).
Thomais has worked on assessment issues for almost 15 years and has coordinated a range of field actions developing formative, alternative and summative assessments for ages from 8 years old to adult. Her experience covers assessment development processes such as assessment design, item writing, piloting, review of items, test construction and resource planning for conducting e-assessment. She is a member of the SIG Steering Group on Inclusive Assessment in Education (AEA-Europe), the “Psifis” laboratory of Aristotle University of Thessaloniki, Greece, the Hellenic Society of Applied Linguistics, EALTA, ALTE and OsloMet’s EnA research team.
Renika-Irini Papakammenou:
Irini-Renika Papakammenou holds a BA in English Literature with English Language from the University of North Wales, Bangor, and an MSc in Teaching English to Speakers of Other Languages (TESOL) from the University of Stirling. She holds a PhD in Linguistics with a specialisation in Language Testing and Assessment from the University of Cyprus and received an award for her PhD thesis as the best postgraduate thesis of the year. She has published scientific papers on language testing and assessment and alternative assessment techniques in international books and journals, presented at numerous local and international conferences, and delivered training courses. She is a member of language teaching and assessment societies and research groups, such as the SIG Steering Group on Inclusive Assessment in Education (AEA-Europe). Her research interests include language testing and assessment, curricula, material design (for face-to-face and online classrooms), classroom practices, and teacher education and development.
Irini has been involved in EFL teaching and learning for the past 21 years. During this time she has contributed to the field in a number of different capacities, including English Language Teacher, Exam Preparation Teacher (all exams, including IELTS and TOEFL), Teacher Trainer, Oral Examiner, Speaker and Researcher. She owns a private institute of foreign languages and a distance learning platform, and she leads distance learning EFL examination programmes. She has created innovative classroom and online materials which have been presented at international conferences and published. Irini received the Global Teacher Award in 2020.
Short abstract:
It is often the case that diversity and inclusion are afterthoughts when an organisation is evolving its summative e-assessment offering. This workshop will provide an engaging opportunity for collaboration with peers, to consider the perspectives of a range of test-takers. Thought-provoking discussions will equip participants with areas to take away and integrate in their future work practices.
The premise for the workshop is that participants set new priorities to develop e-assessments and assessment services to support test-takers with a range of different forms of special educational needs and disabilities (SEND) and culturally diverse backgrounds. The workshop will focus on the test taker and consider how all parties in the test process (test developers, test administrators, teachers, school administrators) can support fair testing practices. The participants will work in groups as test-takers with different needs and explore how e-assessment can break barriers for all test-takers.
This workshop is led by members of the AEA-Europe eAssessment and Inclusive Assessment SIGs.
No prior experience of e-assessment or inclusive assessment is needed.