Large-scale student assessments have been administered in Canadian provinces and territories (jurisdictions) for many decades, some dating back almost a century. Most student-assessment-related initiatives have originated at the jurisdictional level, and it was not until 1989 that steps were taken to establish a pan-Canadian assessment program. The purpose of this article is to describe the rationale for, and provide a high-level description of, the evolution of Canada’s pan-Canadian student assessment program, including its transition to e-assessment. First, however, a little context.
Canada is a confederation of ten provinces and three territories. Areas of responsibility are divided between the federal and provincial/territorial governments. The federal level is responsible for portfolios such as national defence, foreign affairs, fisheries, telecommunications, immigration and citizenship, and the banking and monetary systems, while the provinces and territories are responsible for areas such as social services, health, forestry, highways and education. Since there is no national department of education, each province and territory enacts its own policies related to aspects such as curriculum, teacher certification and assessing and reporting on student learning progress. School boards and districts establish their own local policies within the framework of policies enacted by their provincial/territorial ministries of education.[1]
As a result of the federal/jurisdictional distribution of powers in Canada, there was no organized effort to examine educational issues on a pan-Canadian basis until the Council of Ministers of Education, Canada (CMEC) was established. Founded in 1967 by the jurisdictional ministers of education, the CMEC was meant to serve as:
a forum to discuss policy issues;
a mechanism through which to undertake activities, projects and initiatives in areas of mutual interest;
a means by which to consult and cooperate with national education organizations and the federal government; and
an instrument to represent the education interests of the provinces and territories internationally.[2]
In Canada, as in most other countries, the content and approaches used by education systems, as well as their respective performance, are increasingly under the microscope. Education stakeholders (e.g., educators, parents/guardians, the business community and taxpayers) ask questions such as “How well is our education system preparing students to be successful in a rapidly changing world in which global economic competition, swift technological advancements, and the necessity for flexibility and lifelong learning shape the environment?” In response to these kinds of questions, provincial and territorial ministries of education have taken measures to assess students in various subjects and at different stages of their schooling. Additionally, jurisdictions have participated in international assessment initiatives such as the Organisation for Economic Co-operation and Development’s (OECD) Programme for International Student Assessment (PISA) and the International Association for the Evaluation of Educational Achievement’s (IEA) Trends in International Mathematics and Science Study (TIMSS) and Progress in International Reading Literacy Study (PIRLS). Given their common interest in maximizing the effectiveness and quality of education systems in Canada, the ministers of education, through the CMEC, saw the need to establish a program that would assess education system performance in each jurisdiction and also yield information at the country level. The result was the design and development of a pan-Canadian assessment program, the School Achievement Indicators Program (SAIP), to evaluate student performance.
In 1989, the CMEC, in collaboration with all participating Canadian jurisdictions, initiated development of the SAIP, which was a criterion-referenced assessment tool designed to determine student achievement in relation to pan-Canadian standards. “The original intent of SAIP assessments was to describe what students knew and what they were capable of in each of the three designated subject areas: reading and writing, mathematics and science. This information would then be used to compare student performance from one jurisdiction to another, to report to the public about the efficiency and equity of each system on a pan-Canadian basis, and to provide information that would improve instruction in the participating jurisdictions.”[4]
The SAIP was administered annually to random samples of 13- and 16-year-old students in English and in French. (These age groups were selected to include students at the end of elementary/beginning of secondary school and those at the end of compulsory schooling.) The same paper-based assessment instruments were administered to the two age groups in order to study the change in student knowledge and skills following the additional years of instruction. Administered in the spring of each year, the assessments were conducted on a cyclical basis, as shown in the table below.
Mathematics | Reading & Writing | Science |
---|---|---|
1993 | 1994 | 1996 |
1997 | 1998 | 1999 |
2001 | 2002 (Writing) | 2004 |
Each assessment comprised two components: a two-and-one-half-hour knowledge- and skills-based test, as well as questionnaires administered to students, teachers and principals (school questionnaire). The student questionnaire gathered information such as the language spoken at home, parents’ education level, extracurricular activities, attitudes regarding the subject-matter being tested and school in general. The teacher questionnaire provided information about their professional background and instructional practices. The principal (school) questionnaire allowed for the gathering of information about school and community characteristics, staffing, resources and services. The SAIP did not provide performance information for individual students, schools or school boards. Instead, it provided a measure/estimate of how well each jurisdiction’s education system was performing. Information on performance was reported on a five-point scale with Level 1 being the lowest and Level 5 being the highest. In terms of standards, it was expected that most 13-year-olds/Grade 8 students should achieve at least at Level 2, and most 16-year-olds should achieve at least at Level 3. Two kinds of reports were disseminated after each administration: a public report on student performance at the jurisdictional and pan-Canadian levels and a technical report, providing a detailed description of all aspects of the assessment (e.g., test development, administration, scoring, test reliability and validity), meant for researchers and the jurisdictions. Jurisdictions had the option to provide summary reports to their education stakeholders.
The SAIP assessments were conducted nine times; however, over the decade, the CMEC came to recognize that the pan-Canadian assessment program needed to evolve in order to:
reflect changes in jurisdictional curriculums;
integrate the increased jurisdictional interest and emphasis on international assessments; and
allow for the testing of the “core” subjects of mathematics, reading and science at the jurisdictional level.
Consequently, the SAIP was replaced by the Pan-Canadian Assessment Program (PCAP) in 2007.
In 2003, the CMEC approved the development of the PCAP, which was designed to gather data on the achievement of 13-year-old students in both official languages. (Originally, the program focused on 13-year-olds, but this was changed to Grade 8 students in intact classrooms to minimize disruption to school systems.) Administration was to occur every three years and would involve assessing a major domain (academic subject), as well as minor domains, mirroring the approach taken by the OECD PISA project. (Since Canada is an OECD member country, the CMEC and its federal partner, Human Resources Development Canada (HRDC), emphasized participation in the PISA project among international assessments.) The intention was for the PCAP’s major domain to coincide with the major domain of the PISA (an assessment of 15-year-olds) administered two years after the PCAP. So, for example, the first PCAP, scheduled for 2007, involved reading as the major domain and mathematics and science as the minor domains. Since the same cohort of students would be assessed again in 2009, the CMEC and jurisdictions would be able to examine the PCAP reading results of Canadian students and then investigate results/trends from the PISA two years later at age 15. The following table shows the planned cycle for the PCAP.[5]
Date | 2007 | 2010 | 2013 | 2016 | 2019 | 2023[6] |
---|---|---|---|---|---|---|
Major Domain | Reading | Math | Science | Reading | Math | Science |
Minor Domain | Math | Science | Reading | Math | Science | Reading |
Minor Domain | Science | Reading | Math | Science | Reading | Math |
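The rotation in the table above follows a simple pattern: the major domain advances one step through reading, math and science at each administration, and the two minor domains are the next two subjects in the same order. The following is a minimal illustrative sketch of that pattern only. The function name and data layout are assumptions made for this example, and the administration years are copied from the table rather than computed, since the interval between cycles was not always three years.

```python
# Illustrative sketch only: reproduces the planned PCAP domain rotation
# shown in the table. Names and structure are assumptions for this example.

DOMAINS = ["Reading", "Math", "Science"]
YEARS = [2007, 2010, 2013, 2016, 2019, 2023]  # taken directly from the table


def pcap_cycle(years, domains):
    """Map each administration year to its major and minor domains."""
    cycle = {}
    for i, year in enumerate(years):
        major = domains[i % len(domains)]
        # The minor domains are the next two subjects in the rotation.
        minors = [domains[(i + k) % len(domains)] for k in (1, 2)]
        cycle[year] = {"major": major, "minors": minors}
    return cycle


if __name__ == "__main__":
    for year, plan in pcap_cycle(YEARS, DOMAINS).items():
        print(year, plan["major"], plan["minors"])
```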
Like its predecessor, SAIP, the PCAP is administered to random samples of students from across Canada. Within each jurisdiction, a randomly selected sample of publicly funded schools is drawn, followed by randomly selected classrooms within those schools. Test forms (versions of the test) are randomly distributed within the selected intact classrooms. Students participating in PCAP, as well as their teachers and school principals, complete questionnaires designed to provide contextual information to support the interpretation of performance results. Also, like SAIP, PCAP was initially administered in a paper format; however, in recognition that students were using technology extensively in their everyday lives both inside and outside of the classroom, the CMEC decided to begin the transition to digital assessment in 2019. To facilitate the transition from paper-based to online assessment, and to allow for the preservation of comparative trend achievement data over time, in 2019, some schools completed the PCAP on paper (the administration mode for each school was randomly allocated by the sampling contractor), and minimal changes/edits were made to the digital version. Essentially, the paper-based assessment was simply transferred to a computer-based platform. A mode study was conducted to control for any effects of administration mode (paper versus online) on results.
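The sampling steps described above (schools, then classrooms, then test forms) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure used by the PCAP sampling contractor: it assumes equal-probability selection of schools, one classroom per selected school, and independent random assignment of forms to students. An operational design may well stratify schools, weight selection by enrolment or balance forms within a class. The function name `draw_sample` and the input layout are hypothetical.

```python
import random


def draw_sample(schools, n_schools, forms, seed=0):
    """Two-stage sample with random form assignment (illustrative only).

    schools: dict mapping school id -> dict of classroom id -> list of students.
    Returns a dict mapping student -> (school, classroom, form).
    """
    rng = random.Random(seed)  # seeded so the draw is reproducible
    assignments = {}
    # Stage 1: simple random sample of schools (assumed equal probability).
    for school in rng.sample(sorted(schools), n_schools):
        # Stage 2: one intact classroom chosen at random within the school.
        classroom = rng.choice(sorted(schools[school]))
        # Every student in the intact classroom gets a randomly chosen form.
        for student in schools[school][classroom]:
            assignments[student] = (school, classroom, rng.choice(forms))
    return assignments


if __name__ == "__main__":
    demo = {
        f"school{i}": {
            f"class{j}": [f"s{i}-{j}-{k}" for k in range(3)] for j in range(2)
        }
        for i in range(5)
    }
    for student, info in draw_sample(demo, 2, ["A", "B", "C", "D"]).items():
        print(student, info)
```

Because whole classrooms are sampled, every student in a selected class takes part, which is the "intact classroom" idea mentioned in the text.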
For the 2019 PCAP assessment, groups of assessment units were distributed within four booklets in order to minimize the assessment burden on any one student. Booklets were designed such that students required approximately 90 minutes to complete any one booklet (60 minutes allocated to the items of the major domain and 30 minutes to the minor-domain items). A student questionnaire was included at the end of each booklet. Students were given an additional 30 minutes to complete the test if needed. As mentioned previously, the four booklets were randomly distributed to students within selected classes. Regarding the balance of item types within student booklets, approximately 70 to 80 percent were selected-response and 20 to 30 percent were constructed-response items.
PCAP reports student achievement data for all three learning domains, and there are two performance measures used to report assessment results: average/mean scores and proficiency levels. Mean scores allow for the comparison of jurisdictional results to the pan-Canadian mean. In addition, benchmarks or performance levels have been developed to provide more detailed information regarding what students know and can do with regard to the major domain. (Standard-setting activities, involving educators from each jurisdiction, set cut scores from standard/scale scores [derived from the test’s raw scores], which align to the PCAP performance levels.) Three levels have been identified for reading; four have been identified for mathematics and science. The performance-level descriptors provide for a deeper understanding of student performance beyond the average/mean scores. Level 2 has been designated the minimum expected level of performance for 13-year-olds. As with SAIP, results are not reported at the student, school or school board levels. PCAP results are meant to complement classroom-based and jurisdictional assessments, not replace them. “On a program level, jurisdictions can validate the results of their own assessments against PCAP results as well as those of the Programme for International Student Assessment (PISA).”[7] More detailed information on topics such as assessment design; development; blueprint/specifications; sampling strategy; field testing; administration; coding, data capture and analysis; standard setting; performance descriptors; questionnaire foundations; student, teacher, classroom and school characteristics; equating and reporting can be accessed in the 2019 PCAP Assessment Framework, Contextual Report and Technical Report.[8]
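To make the relationship between scale scores, cut scores and performance levels concrete, here is a minimal sketch. The cut scores below are hypothetical values invented for this example; actual PCAP cut scores come from the standard-setting process and are documented in the technical report. The sketch also uses an unweighted mean, whereas operational reporting for a sample-based assessment would normally apply sampling weights.

```python
from bisect import bisect_right
from statistics import mean

# Hypothetical lower bounds of Levels 2, 3 and 4 on the scale-score metric.
# These are NOT actual PCAP cut scores.
HYPOTHETICAL_CUTS = [400, 500, 600]


def performance_level(scale_score, cuts):
    """Return the level (1-based) for a scale score, given ascending cuts."""
    # A score equal to a cut score is placed in the higher level.
    return bisect_right(cuts, scale_score) + 1


def summarize(scores, cuts, expected_level=2):
    """Report the two measures described above: mean score and levels."""
    levels = [performance_level(s, cuts) for s in scores]
    share = sum(level >= expected_level for level in levels) / len(levels)
    return {"mean": mean(scores), "share_at_or_above_expected": share}


if __name__ == "__main__":
    print(summarize([350, 450, 550, 650], HYPOTHETICAL_CUTS))
```

With three cut scores the function yields four levels, matching the number identified for mathematics and science; a two-element list of cuts would yield the three levels used for reading.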
The first step toward modernizing the PCAP was taken in 2019, when it was first administered online. Looking ahead, however, the CMEC conceived of a plan/project to transform the assessment to meet current and future needs and released a Request for Proposals (RFP) for a pan-Canadian assessment program online platform. Regarding project benefits and rationale, the RFP stated that “The project will continue the online delivery of the PCAP assessment, keeping pace with evolving trends in educational assessment. Moving forward, CMEC and the PCAP assessment team look to expand the use of various technology-enhanced items for access to a greater pool of item formats, both to remain current in the types of assessment items used in online assessments and to enhance student engagement during assessments.”[9] The RFP provided specific features required of the platform related to the following topics:[10]
System Capabilities (e.g., operational on a variety of modern operating systems and web browsers; supportive of modern accessibility software; preventive measures against cheating);
Availability in English and French (e.g., fully functional in all aspects of test development; practice and operational tests; on-screen system messages; labelling of report templates);
Assessment Design and Item Bank (e.g., robust item bank to manage cognitive items for reading, math and science, as well as student, teacher and school administrator questionnaires; procedures for distributed, remote item development; access to online expert advice related to aspects such as graphic design and layout; solution that can accommodate selected- and constructed-response items, as well as a variety of alternative item types, including technology-enhanced items);
Field- and Operational-Test Features/Activities (e.g., provision of technical support during administration times; allow students to move backward as well as forward during the assessment; allow students to identify unanswered questions before submission; provide for student accommodations [e.g., extra time]; provide alternate formats [e.g., coloured background, dyslexia font, large print]; enable text/image enlargement; capture all student-response-related data, as well as metadata such as the amount of time spent on each item; provide daily reports, including completion-rate data for each class/school; allow the CMEC to download daily completion reports);
Coding of Constructed-Response Items (e.g., platform to support the selection of student exemplars for coder training and reliability purposes; support any coding method [e.g., single, double or multi-coding] selected by CMEC; allow for the inclusion of any support materials for coders, such as the assessment item, coding rubric and training exemplars; permit coders to send difficult-to-code responses to the coding leader; allow coding leaders to review all coded responses and make changes if necessary; provide reports that capture coder statistics [e.g., number of items coded in an hour], item statistics [all marks given to an item], student statistics [all marks given for a student], reliability results for all coders on reliability items, and multi-coding reports [i.e., coding agreement of all coders across selected multi-coded items]);
Data and Data Transfer (e.g., work with CMEC psychometricians and sampling team to clarify/refine data templates; use CMEC templates (and variable names) for all data exports; allow CMEC to download data files related to selected- and constructed-response cognitive data, as well as student, teacher and school questionnaires).
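The coding requirements listed above call for reliability results and multi-coding reports showing coding agreement across coders. The RFP, as summarized here, does not name a particular agreement statistic, so the following sketch is an assumption: it shows two commonly used measures for a pair of coders, the exact-agreement rate and Cohen's kappa (agreement corrected for chance). The function names and sample codes are hypothetical.

```python
from collections import Counter


def exact_agreement(codes_a, codes_b):
    """Share of responses to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set.")
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders: (p_o - p_e) / (1 - p_e)."""
    n = len(codes_a)
    p_o = exact_agreement(codes_a, codes_b)  # observed agreement
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Expected chance agreement from each coder's marginal code frequencies.
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1:
        return 1.0  # both coders used a single identical code throughout
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    coder_1 = [0, 1, 2, 1]  # hypothetical codes on four double-coded responses
    coder_2 = [0, 1, 1, 1]
    print(exact_agreement(coder_1, coder_2))
    print(round(cohens_kappa(coder_1, coder_2), 3))
```

Reporting both values is a common design choice because raw agreement can look high simply when most responses receive the same code, while kappa discounts agreement expected by chance.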
Following the competitive RFP process, the CMEC selected Vretta to serve as its technology partner for the modernizing project. Work has begun with CMEC to develop online items and passages for field-test purposes; field testing is planned for the spring of 2026; and administration of the next, modernized PCAP online is scheduled for the spring of 2027.
For more than three decades, the CMEC has collaborated with its jurisdictional partners to conceptualize, develop and deliver pan-Canadian student assessment programs. In keeping with best practices, the CMEC has continuously monitored and reviewed its assessment processes and made adjustments as necessary. In addition, with changing times come changing needs, and so the pan-Canadian assessment program has continually evolved over the years. The three-year project to modernize the PCAP has just begun. Future articles will describe the challenges, opportunities and outcomes of this ambitious initiative.
Dr. Richard Jones has extensive experience in the fields of large-scale educational assessment and program evaluation and has worked in the assessment and evaluation field for more than 35 years. Prior to founding RMJ Assessment, he held senior leadership positions with the Education Quality and Accountability Office (EQAO) in Ontario, as well as the Saskatchewan and British Columbia Ministries of Education. In these roles, he was responsible for initiatives related to student, program and curriculum evaluation; education quality indicators; school and school board improvement planning; school accreditation; and provincial, national and international testing.
Richard began his career as an educator at the elementary, secondary and post-secondary levels. Subsequently, he was a researcher and senior manager for an American-based, multi-national corporation delivering consulting services in the Middle East.
Feel free to reach out to Richard “Rick” at [email protected] (or via LinkedIn) to inquire about best practices in large-scale assessment and/or program evaluation.
The author would like to acknowledge and thank Dr. Pierre Brochu, President of Consultation MEP Consulting Inc., a company that provides psychometric and other large-scale assessment services in the public and private sectors, for his valuable input to this article. Pierre served for 15 years as the Director of Learning Assessment Programs with the CMEC during which time he oversaw the final years of the SAIP, as well as the conceptualization, design and implementation of the PCAP program.