During our discussion of curriculum development and general methods in education, we noted the importance of objectives in education. We also distinguished between instructional and behavioural objectives. We observed that curriculum implementation and lesson delivery often culminate in ascertaining whether the objectives we set out to achieve were actually achieved. This is often called evaluation. This unit introduces you to some important concepts associated with ascertaining whether objectives have been achieved or not. Basically, the unit takes you through the meanings of test, measurement, assessment and evaluation in education. Their functions are also discussed. You should understand the fine distinctions between these concepts and the purpose of each, as you will have recourse to them later in this course and as a professional teacher.
OBJECTIVES:
By the end of this unit, you should be able to:
1. distinguish clearly between test, measurement, assessment and evaluation;
2. state the purposes of assessment and evaluation in education; and
3. give the techniques of assessment in education.
THE CONCEPTS OF TEST, MEASUREMENT, ASSESSMENT AND EVALUATION IN EDUCATION
These concepts are often used interchangeably by practitioners as if they have the same meaning. This is not so. As a teacher, you should be able to distinguish one from the other and use any particular one at the appropriate time to discuss issues in the classroom.
Measurement
The process of measurement, as the name implies, involves carrying out actual measurement in order to assign a quantitative meaning to a quality, e.g. what is the length of the chalkboard? Determining this must be done physically. Measurement is therefore a process of assigning numerals to objects, quantities or events in order to give quantitative meaning to such qualities. In the classroom, to determine a child's performance, you need to obtain quantitative measures on the individual scores of the child. If the child scores 80 in Mathematics, there is no other interpretation you should give it: you cannot say he has passed or failed. Measurement stops at ascribing the quantity; it does not make a value judgement on the child's performance.
Assessment
Assessment is a fact-finding activity that describes conditions that exist at a particular time. Assessment often involves measurement to gather data. However, it is the domain of assessment to organise the measurement data into interpretable forms on a number of variables. Assessment in an educational setting may describe the progress students have made towards a given educational goal at a point in time. However, it is not concerned with explaining the underlying reasons and does not proffer recommendations for action, although there may be some implied judgement as to the satisfactoriness or otherwise of the situation. In the classroom, assessment refers to all the processes and products which are used to describe the nature and the extent of pupils' learning. This also takes cognisance of the degree of correspondence of such learning with the objectives of instruction.
Some educationists, in contrasting assessment with evaluation, opine that evaluation is generally used when the subject is not a person or group of persons but the effectiveness or otherwise of a course, programme of teaching or method of teaching, while assessment is used generally for measuring or determining personal attributes (the totality of the student, the environment of learning and the student's accomplishments). A number of instruments are often used to get measurement data from various sources. These include tests, aptitude tests, inventories, questionnaires, observation schedules, etc. All these sources give data which are organised to show evidence of change and the direction of that change. A test is thus one of the assessment instruments; it is used to obtain quantitative data.
Evaluation
Evaluation adds the ingredient of value judgement to assessment. It is concerned with the application of its findings and implies some judgement of the effectiveness, social utility or desirability of a product, process or progress in terms of carefully defined and agreed-upon objectives or values. Evaluation often includes recommendations for constructive action. Thus, evaluation is a qualitative measure of the prevailing situation. It calls for evidence of the effectiveness, suitability or goodness of the programme. It is the estimation of the worth of a thing, process or programme in order to reach meaningful decisions about that thing, process or programme.
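To make these distinctions concrete, here is a minimal sketch in Python; the pupils' scores and the pass mark of 40 are invented for illustration and are not part of this unit:

```python
# A minimal sketch of measurement -> assessment -> evaluation.
# The scores and the 40-mark pass threshold are invented for illustration.

# Measurement: assign numerals only; no judgement is made yet.
scores = {"Ade": 80, "Bola": 35, "Chidi": 62}

# Assessment: organise the measurement data into interpretable forms.
mean_score = sum(scores.values()) / len(scores)
ranking = sorted(scores, key=scores.get, reverse=True)

# Evaluation: add a value judgement against an agreed-upon standard.
PASS_MARK = 40  # hypothetical criterion fixed before the test
judgements = {name: ("pass" if mark >= PASS_MARK else "fail")
              for name, mark in scores.items()}

print(f"Class mean (assessment): {mean_score:.1f}")
print(f"Ranking (assessment): {ranking}")
print(f"Judgements (evaluation): {judgements}")
```

The score of 80 on its own is pure measurement; only the final step, comparing it against the agreed pass mark, turns it into an evaluation.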
Professional Development and Training
In the context of what the Bruce-Grey Catholic District School Board is already offering, professional development and training on "Planning, Assessment & Evaluation", specific to the needs of new teachers, should include the following core content:
• Curriculum-focused long- and short-term planning, keeping the end in mind: what students need to know and will be able to do
• Selecting and using ongoing classroom assessment strategies and data to inform instruction and plan appropriate interventions to improve student achievement
• Using assessment and evaluation strategies that are appropriate to the curriculum and the learning activities, are fair to all students, and accommodate the needs and experiences of all students, including English language learners and students with special education needs
• Providing students with numerous and varied opportunities to demonstrate the full extent of their achievement without overwhelming them
• Collecting multiple samples of student work that provide evidence of their achievement
• Referring to exemplars to assess and evaluate student work
• Using provincial achievement charts to assess and evaluate student work
• Selecting and using effective strategies to support students' self-monitoring, self-assessment, and goal-setting for their own learning
• Informing and helping students and parents to understand the assessment and evaluation strategies to be used and giving them meaningful feedback for improvement
• Applying provincial report card policies and board guidelines for reporting on student achievement
About Core Content
• The Bruce-Grey Catholic District School Board offers professional development and supports to all its teachers in order to ensure quality teaching and improved student achievement.
• Effective professional development must be manageable, relevant, timely, and appropriate to the daily responsibilities of new teachers.
• The above core content has been developed in an effort to support new teachers as they progress along a continuum of professional development through their first year in the profession. As with all areas of learning, proficiency will develop over time.
• The core content is not to be viewed as a checklist of activities to undertake or an assessment tool to gauge the teacher's performance.
• It is intended as a guide for individual choice regarding professional development and training activities for new teachers.
New Teacher Self-Reflection Tool
The following questions are designed for teacher self-reflection. You may wish to use a selection of the questions below as starting points in conversations within your mentoring relationship and when planning and revising your Individual NTIP Strategy.
• How would I describe my long- and short-term planning process?
• During planning, do I keep the end in mind and then give my students a clear sense of where we are going?
• What strategies am I using to identify the learning needs of all students? Which strategies have been most and least successful?
• What different assessment strategies, including observation and performance tasks, am I using? Are there others that I would like to try?
• Are my assessment and evaluation strategies appropriate to the needs of my students, the curriculum expectations being assessed and the learning activities being used? (Do I have too few, enough, or too many assessment activities?) How do I know this?
• What tools (such as rubrics, checklists) am I using to track student progress and inform instruction? Are there other tools that I would like to try?
• Do I share assessment tools with students when they start an assessment task? If not, how could I integrate this into my classroom practice?
• To what extent am I giving students multiple opportunities for practice and feedback?
• In what ways do I give my students feedback for improvement?
• How am I using assessment information to inform my instruction?
• What have I noticed about how my students respond to feedback?
• How do I use the provincial achievement chart(s) to assess and evaluate student work?
• Do my assessments reflect a balance of the achievement chart categories? If not, how can I achieve this balance?
• To what extent have I been using exemplars/anchors in: my lessons? my assessment of student work? my communication with students and parents?
• What strategies, including modeling, am I using to develop and encourage students' self-monitoring, self-assessment, and goal-setting skills? Is there evidence that students are internalizing these skills?
• Do I understand the provincial report card policies and school board guidelines for reporting student achievement? If not, where do I need clarification?
• How am I using assessment data to develop class profiles in order to look for patterns and trends?
• How am I using assessment data to group students according to needs and interests (large and small groups)?
• To whom do I turn when I have a question about planning, assessment, and evaluation?
• What kind of support or new learning do I need in order to plan, assess, and evaluate even more effectively?
Using This Tool
The use of this material is optional, and you are invited to use only the strategies and tools that are specific to your needs and interests.
Principles for Fair Student Assessment Practices for Education
The Principles for Fair Student Assessment Practices for Education in Canada contains a set of principles and related guidelines generally accepted by professional organizations as indicative of fair assessment practice within the Canadian educational context. Assessment depends on professional judgment; the principles and related guidelines presented in this document identify the issues to consider in exercising this professional judgment and in striving for the fair and equitable assessment of all students.
Assessment is broadly defined in the Principles as the process of collecting and interpreting information that can be used (i) to inform students, and their parents/guardians where applicable, about the progress they are making toward attaining the knowledge, skills, attitudes, and behaviors to be learned or acquired, and (ii) to inform the various personnel who make educational decisions (instructional, diagnostic, placement, promotion, graduation, curriculum planning, program development, policy) about students.
Principles and related guidelines are set out for both developers and users of assessments. Developers include people who construct assessment methods and people who set policies for particular assessment programs. Users include people who select and administer assessment methods, commission assessment development services, or make decisions on the basis of assessment results and findings. The roles may overlap, as when a teacher or instructor develops and administers an assessment instrument and then scores and interprets the students' responses, or when a ministry or department of education or local school system commissions the development and implementation of an assessment program and scoring services and makes decisions on the basis of the assessment results.
The Principles for Fair Student Assessment Practices for Education in Canada is the product of a comprehensive effort to reach consensus on what constitutes sound principles to guide the fair assessment of students. The principles and their related guidelines should be considered neither exhaustive nor mandatory; however, organizations, institutions, and individual professionals who endorse them are committing themselves to endeavor to follow their intent and spirit so as to achieve fair and equitable assessments of students.
Organization and Use of the Principles
The principles and their related guidelines are organized in two parts. Part A is directed at assessments carried out by teachers at the elementary and secondary school levels. Part A is also applicable at the post-secondary level with some modifications, particularly with respect to whom assessment results are reported. Part B is directed at standardized assessments developed external to the classroom by commercial test publishers, provincial and territorial ministries and departments of education, and local school jurisdictions (boards, boroughs, counties, and school districts).
Five general principles of fair assessment practice are provided in each Part. Each principle is followed by a series of guidelines for practice. In the case of Part A, where no prior sets of standards for fair practice exist, a brief comment accompanies each guideline to help clarify and illuminate the guideline and its application.
The Joint Advisory Committee recognizes that in the field of assessment some terms are defined or used differently by different groups of people. To maintain as much consistency in terminology as possible, an attempt has been made to employ generic terms in the Principles.
Problems with Student Evaluations: Is Assessment the Remedy?
One of the most encouraging solutions that I see out of this morass . . . the unending tired debate over student evaluations . . . is the assessment movement. Those who object to sophisticated assessments usually ask, "Why can't we just use grades as measures of learning?" Doesn't that just echo "Why can't we just use student ratings of professors as measures of good teaching?" . . . My hope is that, a decade from now, members will look at our discussions about student ratings in the POD archives and realize just how far people can come in ten years if they commit to breaking out of primitive conventions . . . (e.g., using mere grades as measures of learning).
_____________________________________________________
High-quality standardized tests of the cognitive and affective impact of courses are essential for gauging the relative effectiveness of non-traditional educational methods. As far as I know, disciplines other than physics, astronomy (Adams et al. 2000; Zeilik et al. 1997, 1998, 1999), and possibly economics (Saunders 1991; Kennedy & Siegfried 1997; Chizmar & Ostrosky 1998; Allgood & Walstad 1999) have yet to develop any such tests and therefore cannot effectively gauge either the need for or the efficacy of their reform efforts. In my opinion, all disciplines should consider the construction of high-quality standardized tests of essential introductory course concepts.
Because most disciplines have failed to develop definitive tests to measure cognitive and affective course impacts, seemingly simplistic statements from the pro-Student Evaluation of Teaching (SET) camp cannot always be immediately dismissed. For example:
1. Aleamoni (1987) addressed "Myth #5: Student rating forms are both unreliable and invalid" as follows: ". . . Most student forms have been validated by the judgement of experts that the items and subscales measure important aspects of instruction . . . (and also) . . . by statistical tools such as factor analysis . . . further evidence of validity comes from studies in which student ratings are correlated with other indicators of teacher competence, such as peer (colleague) ratings, expert judges' ratings, graduating seniors' and alumni ratings, and student learning."
2. Michael Scriven (1988) [as quoted by D'Apollonia & Abrami (1997)] stated that "student ratings are not only A valid, but often the only valid, way to get much of the information needed for most evaluations." (Emphasis in the original.)
3. Marsh & Dunkin (1992) concluded: "SET's are clearly multidimensional, quite reliable, and reasonably valid."
4. Cashin (1995) stated: "In general, student ratings tend to be statistically reliable, valid, and relatively free from bias or the need for control; probably more so than any other data used for evaluation."
5. Marsh and Roche (1997) claimed that "there is little evidence of the validity of any other sources of data . . . (on teaching effectiveness)."
The question is "VALID FOR WHAT?" I think SETs can be "valid" in the sense that they can be useful for gauging the affective impact of a course and for providing diagnostic feedback to teachers [see, e.g., Hake & Swihart (1979)] to assist them in making mid-course corrections. However, IMHO, SETs are not valid in their widespread use by administrators to gauge the cognitive impact of courses [see, e.g., Williams & Ceci (1997); Hake (2000; 2002a,b); Johnson (2002)]. In fact, the gross misuse of SET's as gauges of student learning is, in my view, one of the institutional factors that thwarts substantive educational reform (Hake 2002c, Lesson #12).
Although there are many SET researchers (see, e.g., Abrami et al. 1990; Aleamoni 1987; d'Apollonia & Cohen 1997; Cohen 1981; Cashin 1995; Marsh & Roche 1997; Marsh & Dunkin 1992) who claim that SETs are valid indicators of students' cognitive condition (for a review see Hake 2000), their conclusions are almost always based on measuring student learning or "achievement" by course grades or exams and not by pre/post testing . . . (even despite the Lordly Cronbachian objections of some education/psychology specialists – see Hake 2001) . . . with valid and reliable instruments such as the Force Concept Inventory of Hestenes et al. (1992) and Halloun et al. (1995) [see, e.g., Hake (2002c)].
With regard to the problem of using course performance as a measure of student achievement or learning, Peter Cohen's (1981) oft-quoted meta-analysis of 41 studies on 68 separate multisection courses, purportedly showing that:
the average correlation between an overall instructor rating and student achievement was +0.43; the average correlation between an overall course rating and student achievement was +0.47 . . . the results . . . provide strong support for the validity of student ratings as measures of teaching effectiveness
was reviewed and reanalyzed by Feldman (1989), who pointed out that McKeachie (1987)
has recently reminded educational researchers and practitioners that the achievement tests assessing student learning in the sorts of studies reviewed here . . . (e.g., those by Cohen 1981, 1986, 1987) . . . typically measure lower-level educational objectives such as memory of facts and definitions rather than higher-level outcomes such as critical thinking and problem solving . . . [he might have added conceptual understanding] . . . that are usually taken as important in higher education.
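To make concrete what a coefficient such as +0.43 represents: in a multisection validity study, each course section contributes one mean instructor rating and one mean achievement score, and the Pearson correlation is taken across sections. A minimal sketch in Python; the section data below are invented, not drawn from Cohen's studies:

```python
# Sketch of a multisection validity coefficient (invented data).
# Each pair is (mean instructor rating, mean exam score) for one section.
import statistics

sections = [(3.2, 61.0), (3.8, 70.0), (4.1, 68.0), (4.5, 74.0), (2.9, 58.0)]
ratings = [rating for rating, _ in sections]
achievement = [score for _, score in sections]

# Pearson r across sections (statistics.correlation needs Python 3.10+).
r = statistics.correlation(ratings, achievement)
print(f"rating-achievement correlation across sections: r = {r:.2f}")
```

Even taken at face value, a correlation of +0.43 means ratings account for only about 18% (0.43 squared) of the variance in section achievement, and, as McKeachie's objection makes clear, the meaning of that figure depends entirely on what the achievement tests actually measure.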
Striking back at SET skeptics, Peter Cohen (1990) opined:
Negative attitudes toward student ratings are especially resistant to change, and it seems that faculty and administrators support their belief in student-rating myths with personal and anecdotal evidence, which (for them) outweighs empirically based research evidence.
However, as far as I know, neither Cohen nor any other SET champion has countered the fatal objection of McKeachie: that the evidence for the validity of SET's as gauges of the cognitive impact of courses rests for the most part on measures of students' lower-level thinking as exhibited in course grades or exams. At least in physics it is well known (see, e.g., Hake 2002c) that students in traditional mechanics courses can achieve A's through rote memorization and algorithmic problem solving, while achieving normalized gains in conceptual understanding of only about 0.2 (i.e., pre-to-post gains that are only about 0.2 of the maximum possible gain).
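The normalized gain referred to here is the ratio of the actual pre-to-post gain to the maximum gain available: g = (post% - pre%) / (100 - pre%). A minimal sketch in Python, with invented pre/post scores:

```python
# Normalized gain: fraction of the maximum possible gain actually achieved.
# The 40% -> 52% scores below are invented for illustration.

def normalized_gain(pre_pct: float, post_pct: float) -> float:
    """g = (post% - pre%) / (100 - pre%)."""
    return (post_pct - pre_pct) / (100.0 - pre_pct)

# A student entering at 40% and leaving at 52% gains only 12 of the
# 60 points available, so g = 0.2, even if the course grade says "A".
print(normalized_gain(40.0, 52.0))  # 0.2
```

This is why a course full of A grades can still show a normalized gain of only 0.2 on a conceptual inventory.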
Williams & Ceci (1997) write:
1. "in searching for better and fairer means of evaluating teaching effectiveness and providing better bases for reappraisal of one's teaching, we need to experiment with alternative methods of soliciting students' opinions," and
2. "teaching faculty should be given the opportunity to train in techniques . . . (of presentation style) . . . that can enhance their student ratings . . . (as shown by Williams & Ceci 1997) . . . , especially if such ratings are to be used by administrators in recommendations for tenure and promotion."
Education research and development (R&D) by disciplinary experts (DE's), and of the same quality and nature as traditional science/engineering R&D, is needed to develop potentially effective educational methods within each discipline. But the DE's should take advantage of the insights of (a) DE's doing education R&D in other disciplines, (b) cognitive scientists, (c) faculty and graduates of education schools, and (d) classroom teachers . . . .
The education of disciplinary experts in education research requires Ph.D. programs at least as rigorous as those for experts in traditional research. The programs should include, in addition to the standard disciplinary graduate courses, some exposure to: the history and philosophy of education, computer science, statistics, political science, social science, economics, engineering (see Lesson 11), and, most importantly, cognitive science (i.e., philosophy, psychology, artificial intelligence, linguistics, anthropology, and neuroscience). . . . In the U.S. there are now about a dozen Ph.D. programs (Physical Science Resource Center 2001, UMd-PERG 2001) in physics education within physics departments and about half that number of interdisciplinary programs between physics and education or cognitive psychology. In my opinion, all scientific disciplines should consider offering Ph.D. programs in education research.