Alternative Assessment as an Appropriate Tool/The Value of Terminology
From LiteracyTentWiki
The following discussion took place on the National Institute for Literacy Assessment Discussion Listserv from January 30 through February 27. It is focused on: “alternative” assessment as an appropriate tool; the value of the term “alternative.”
Read Discussion Summary: Alternative Assessment/Appropriate Tool/Value of Terms
Hello,
Is the TABE appropriate to administer to youth as young as 13 or 14 years old?
Thank you,
Dianna Baycich
Ohio Literacy Resource Center
Hi all,
I also just wanted to note that the TABE was not normed on people the age that Dianna inquires after - 13/14 year olds.
So wouldn't this make the TABE invalid for use with young teens?
For what purpose would the TABE be used with people so young?
marie cora
Assessment Discussion List Moderator
I can give one example of when it was used with folks age 13 to 16. In 1991 and 1992 I worked at a Private Industry Council Summer Youth Employment and Training Program. We gave the TABE to all the kids we accepted into the program. We used the results to determine if they would be placed in tutoring as well as a job or just a job.
The reason I asked the question this time is for a colleague of mine who needs an instrument to determine reading level for a study she is doing. She wanted to use the instrument that was commonly used in Ohio. She was originally going to work with adults but had to change her study to use youth. So... a bit more explanation for why I was asking.
Thank you to everyone who has responded so far. I hope to hear more.
Dianna
Dianna -
Isn't there a testing instrument used in the Ohio public school system for (and with) this age group that would be more appropriate than the Tests of Adult Basic Education (TABE)? If reading level is being measured, how do the schools gain that data? In my opinion, these kids are far from adults - even if they are entering the job market.
I agree with Marie that it should be questioned whether or not the TABE would provide valid scores when it's designed to be used with adults. If your colleague's study is going to be worth anything at all, the tool used to determine reading levels should be one designed to work with that age group.
With kids this age (particularly if they are having difficulties in school), I even wonder whether or not timed testing is the way to *get* accurate data. Does the public school system use a different type of evaluation than the TABE to determine grade level for special education students?
Nancy Hansen
adult literacy administrator
Sioux Falls Area Literacy Council
sfallsliteracy at yahoo.com
Sioux Falls, SD
Hi Dianna and Fellow List Servers,
My two cents, somewhat related to your question....
We are required to use the TABE here in New York State. So it's "appropriate" that we administer it. In my opinion, the TABE does not test reading ability; it tests the ability to take a standardized test. And it's not even a very good standardized test: tricky, culturally biased, etc. There is some connection between taking a standardized multiple-choice reading test and being able to read--I will admit it--but not enough for me.
Maybe you've heard this sort of thing before, but just checking. I would not use the TABE test if I were not required to do so.
From Bruce Carmel
Turning Point
Bruce,
What would you use that would satisfy both you and funders?
Thanks for ANY enlightenment on this problem.
Andrea
Hi Andrea,
What would I use instead of the TABE if I didn't have to use the TABE? I would use different tools for different purposes: placement, measuring individual progress, and reporting program impact.
--For beginning readers, I would use something like the Literacy Volunteers READ Test for all of those purposes. In that test, you ask people to read passages of different levels of difficulty and see whether or not they can actually READ them.
--For people who are closer to GED, I would have them take a GED predictor test. Yes, it's still a standardized test, but it's the one our students need to pass to achieve their goals. The TABE is not a very good predictor of GED success.
--For everyone's individual progress, I would use portfolios. Portfolios are great for noting individual progress, but I don't think they work for program impact. Accomplishments in portfolios are too all-over-the-place to be aggregated.
--Another way I have placed beginner-intermediate people is by having them read a range of texts. For example, after an orientation, students did a one-on-one assessment. They were shown a supermarket circular, a subway map, a job application, a telephone book, and a newspaper. We asked them which ones they could read. Then they were asked to DO things such as tell us how they got to school using the map, try to find their best friend's name in the phone book, and explain how to fill out the application. This took a lot of staff time, but it showed us a lot about students' reading levels.
Assessment is a great challenge in our field. I have never found a tool I like that is easily administered. The only ones I like take lots of time, and they are not perfect. And even if we devote the time to a good tool, we still have to use standardized tests as well to comply with funders. It's a great frustration that there is no useful, valid, easily administered assessment tool for all stakeholders: funders, students, staff and others.
From Bruce Carmel
Bruce,
Let me applaud - loudly and with gusto - everything you wrote in this post. The practice test is indeed a much better tool for predicting success on the GED, and the portfolio assessment approach is really the most fair way to assess the diverse population of ABE/GED learners. I think there is surely a way to use portfolios to somehow demonstrate a program's success as well as an individual student's success, but not with the current NRS.
Thank you so much for sharing your views. It is my continued hope that someday the funders will understand the complexities of "valid" assessment tools. In the meantime, I'm very happy for the students in your program who have so many choices regarding how they may demonstrate their learning.
Patti White, M.Ed.
Disabilities Project Manager
Arkansas Adult Learning Resource Center
prwhite at madisoncounty.net
I'd have to agree with Patti White. TABE is useful for staff who are used to grade levels, but many people who work with adults prefer CASAS, a test normed on adults and much less stressful than a TABE test. When it comes to students, CASAS and portfolios are more user friendly. CASAS has a comparison study on GED test success: the higher the CASAS score, the higher the probability of passing the GED test. CASAS and Official Practice Tests are better predictors than a grade level score, in my opinion. va
Virginia Tardaewether
Chemeketa Community College
4000 Lancaster Drive NE
Salem, OR 97305
503-399-6147
Hi all,
Great conversation going here... looking forward to more.
Bruce, thanks so much for your posts. I'm not picking on you, I promise: I'm picking on EVERYONE!!!
Bruce, you said:
"And even if we devote the time to a good tool, we still have to use standardized tests as well to comply with funders."
Many of you are moaning now, thinking: "here she goes yet again." - but it's my job: good tools and bad tools are standardized; portfolios are standardized; quizzes can be standardized; TABE is standardized but so is the writing rubric REEP; in theory, any assessment can be standardized. What everyone rails against is the particular selection of assessments that are available to us today - and while those are standardized, that's not what makes them good, bad, or ugly.
Standardized tests should be viewed as positive things because their sole mission in life is to attempt a level playing field.
As a field, we object to the paucity of choices, not that those choices are standardized. We also object to the use of materials that are out-dated or do not reflect today's needs. We object to the mis-use of a particular assessment. We should object to incorrect use of data and test results. We should object to mis-alignment between curriculum and assessment - which directly speaks to Bruce's (and many many folks') wish that there be an assessment that can serve the purposes of the classroom and program as well as it can serve the purposes of high stakes issues (like funding or career advancement).
What we really want are in fact standardized assessments, that's not the issue. The issue is that we don't yet have a thorough selection of tools that meet our complex needs - whether those tools are standardized or not.
marie-harp-on-cora
Assessment Discussion List Moderator
Hi Marie (and everyone)
I hear your point about standardizing assessment tools. Thanks for the posting. When I hear people say "standardized" in this context, it is always "standardized tests." (Except in the posting you just wrote:)) I don't think they mean something such as a standardized system for scoring portfolios. I think "standardized" means TABE, CASAS, etc. in our field.
Bruce
Hi Bruce,
Thanks for this - and I completely agree with you: for whatever reasons, 'standardized' does seem to equal TABE, CASAS, and BEST within our field.
I guess you could say it's a piece of my mission - to convince our field not to think this way or use the term this way any longer.
marie cora
Assessment Discussion List Moderator
Hi List (and Marie)
I think Marie's point is very important. It didn't work to talk about assessment tools that weren't TABE-like as "alternative." It didn't work to criticize standardized tests but have no alternative to offer. I think putting forth the strengths and legitimacy of tools such as portfolios, outcome checklists, holistically scored writing samples, etc. is a good way to go.
Bruce
Hi Bruce and everyone,
Bruce, you said:
"I think putting forth the strengths and legitimacy of tools such as portfolios, outcome checklists, holistically scored writing samples, etc is a good way to go."
This sounds like a very good path to go down to me. I think people would have a lot to say and share about alternative tools, their uses, and their strengths. It would be a great exercise to list them all out and discuss the strengths, uses, and limitations of each one.
What questions do folks have about alternative assessments?: using them, seeking them out, developing them, whatever area most intrigues you.
What can folks share with the rest of us in terms of "the strengths and legitimacy" of alternative tools such as portfolios, checklists, analytic/holistic scoring, rubric use, writing samples, in-take/placement processes?
Are any of the tools you use standardized? Not standardized? Do you think that this is important? Why or why not?
Are any of the tools used for both classroom and program purposes?
I have other questions for you, but let's leave it at that for right now. Let us hear what your thoughts are. We're looking forward to it.
Thanks,
marie cora
Assessment Discussion List Moderator
Hi Marie, Bruce and All,
These kinds of constructed response assessments are easier to build than selected-response, but MUCH harder to score. The REEP is one Performance Assessment with which many of us are familiar. It is a standardized, constructed-response tool, and I think we can look at its statewide implementation as a bellwether of using more authentic, alternative assessments.
In Massachusetts, a lot of time and effort goes into standardizing scorers, initially and continually, in order to ensure that the tool is being used according to its design. Despite the institutional commitment of the DOE, it is a great struggle, perhaps even an act of faith, to ensure that all scorers are aligned. We all know of cases where two scorers, reading the same essay and using the same rubric, show a startling disparity in points awarded.
This is not to say that authentic assessment is invalid or undesirable; I feel that they are MORE authentic, valid and desirable... but we need to keep an eye on their reliability. As we put forward the strengths of these tools, we must be ready to acknowledge and pro-actively address their limitations by diligently and thoroughly preparing these tools. This is not as hard as it might sound: we must be sure that the tools we select are designed to actually measure the domain for which we aim, and we must make sure that we use them reliably, i.e., with some standardization (which is NOT a four-letter word). We can't just take them off the shelf and expect one size to fit all - that's what gave "standardized testing" its bad name in the first place.
Every teacher designs assessments for their own class- I have a great presentation rating form, but it only works for the specific curriculum. I'm sure that others have great things as well and I'd like to get ideas from them; what's the best way to get these out in the field, and discuss where they are appropriate?
Kevin O'Connor Framingham Adult ESL Program
I think that alternative assessment tools (portfolios, outcome checklists etc.) are an excellent way to ensure that assessment
- relates to what is being taught in the classroom,
- focuses on tasks that relate to learner goals and objectives
- is supported by teacher/learner conferencing
The challenge is to ensure that teachers have good standards-based tools (e.g., rubrics, outcome checklists) to inform their assessments, along with adequate professional development, training, and support, so that they are confident in using these tools.
I am a particular fan of portfolio assessment such as the European Language Portfolio - which provides a flexible but structured approach based on a common, shared standard. To quote from a report on the European Language Portfolio - portfolios provide "an important interface between language learning, teaching and assessment" and achieve these "invisible learning outcomes ... :
-commitment to and ownership of one's language learning
-tolerance of ambiguity and uncertainty in communicative situations and learning
-willingness to take risks in order to cope with communicative tasks
-learning skills and strategies necessary for continuous, independent language learning
-reflective basic orientation to language learning, with abilities for self-assessment of language competence"[1]
[1] Page 13, A European Language Portfolio From piloting to implementation (2001-2004): Consolidated report - Final Version, Rolf Scharer, General Rapporteur, Language Policy Division, Strasbourg
Pauline McNaughton
Executive Director / Directrice executive
Centre for Canadian Language Benchmarks/Centre des niveaux de competence linguistique canadiens
200 Elgin Street, Suite 803 / 200 rue Elgin, piece 803
Ottawa, ON K2P 1L5
T (613) 230-7729 F (613) 230-9305
pmcnaughton at language.ca
Hello List,
Can we call them AUTHENTIC instead of ALTERNATIVE? I know it's semantics, but let's have the semantics work in our favor. Anyway...
This is how we do assessment at Turning Point:
We use TABE and BEST to report progress to our FUNDERS, and a whole set of assessment tools (including those tests) to report progress to the STUDENTS, the TEACHERS and the PROGRAM including:
--Writing samples
--Portfolios
--Attendance and participation
--GED predictor for higher levels
--Teachers' assessment of the student skill level.
I know that last one is tricky. This is what it means: If a student is breezing through the work in Basic Education 2, but bombs out on the TABE--the teacher can promote him or her to BE 3. There is no Education Gain reported to our funder, but the student moves to the next level class, something she cares about more than her TABE score (usually).
I know it would be great if we could use Portfolios or other authentic tools to report programmatic gain, and maybe this discussion will push me to do more on that. But even if I do, it's not going to be recognized by our major (government) funders.
From Bruce Carmel
Hi,
Well, my opinion is that assessment should pertain to the task at hand and be outlined as such, whether you are using a rubric or a checklist. To standardize is to say that all students are learning at the same rate/pace. If your assessment is based on things like content, effort, and use of certain language (depending on where your students are), then you will be assessing each individual student on what they are capable of. That is what makes a portfolio such an effective tool in evaluating individual students.
Thanks,
Sandra Cook
Northlands College
Technology Enhanced Literacy
Hi Bruce and all,
Actually, a discussion of semantics would be quite welcomed by me. I do believe that part of the difficulty of navigating an already hugely complex system (lack of system?) is that we don't really have a common language together - just look at how often we (I) discuss the term 'standardize'. Some of our terms are clear, but many are not well-defined, or their definitions have shifted over time in response to either politics or research or educational trends. Still other terms have multiple meanings, and folks can interpret those terms within their own contexts - which might be different from the contexts of practitioners in another place.
What do folks think about this? What do folks think about Bruce's suggestion that we use 'authentic' for this discussion instead of 'alternative'? How do you understand these two terms? Do you think this matters?
Also Bruce, thanks for the outline of your assessment structure at Turning Point. Others - please also let us know how you mix and match Commercial assessments with other types of assessments at your programs.
Here's a couple of resources:
There is a pretty good-sized Assessment Glossary that can be accessed from either the LINCS Special Collection in Assessment (http://literacy.kent.edu/Midwest/assessment/) or the ALEWiki Assessment area at http://wiki.literacytent.org/index.php/Assessment_Information
As a professional on-line community, we could build our own set of definitions that speak directly to the issues that we experience. You can add your own definitions or revise ones that are there at the Wiki right now - at this point, "alternative", "authentic" and "performance" assessment all share the same definition there. Do you agree with this?
Here's a good resource that discusses Authentic Assessment in the context of workplace education - it discusses a distinction between alternative and authentic.
Using Authentic Assessment in Vocational Education by Rodney Custer et al. ERIC doc - see the first chapter of the book.
http://www.eric.ed.gov/ERICWebPortal/Home.portal?_nfpb=true&ERICExtSearch_SearchValue_0=Using+Authentic+Assessment+in+Vocational+Education&ERICExtSearch_SearchType_0=kw&_pageLabel=RecordDetails&objectId=0900000b80091a0c
Do folks have other resources to share? Thoughts, ideas? Let's hear them!
marie cora
Assessment Discussion List Moderator
Hi Sandra, thanks so much for your post.
You said: "To standardize is to say that all students are learning at the same rate/pace." This is not correct.
To standardize does not speak to the outcomes of the students' learning. It speaks to the inputs of developing a test that tries to be fair to all students. A standardized test precisely will NOT take into consideration differing rates or pace or anything else - because if it did, then you would start introducing bias.
A correct statement would be: "To standardize is to say that all students are provided an equal opportunity to demonstrate their knowledge, skill, or performance."
marie cora
Assessment Discussion List Moderator
In Manitoba we use a guided portfolio for people to demonstrate progress and skill development in reading text, document use, writing and oral communications. We have three separate levels of portfolio... the highest is transferable to the adult high school diploma. So students with those kinds of goals have something to "work for." The first two levels give more basic literacy students a certificate at the end which is also good for students wanting some demonstration of success when they might take years to get their GED. The website for more information is:
http://www.edu.gov.mb.ca/aet/all/publications/Stages/stages.htm
Robin Millar
marie--
I would go for "assessment" to cover all assessments. Sub-headings could be used for different types of assessments, e.g., standardized tests, portfolios, attendance. Then code each assessment as to who wants it and who gets it. Put in cost, too, if that is a critical variable.
Andrea
I do not really agree with either statement.
I agree that the first statement is not addressing outcomes and that standardization is about input. However, I disagree with the words “equal opportunity” in the second statement.
Educational inequality and the achievement gap are very real things. Regardless of standardization of input, there is still the issue of equal opportunity which translates to access to the same resources, qualified teachers, adequate learning environments, supportive social structures (family, friends, work, etc.).
It is not about the standard, it is about all the other stuff in society that we have not taken care of. Not too long ago all people were given the right to vote, but the trick was that they had to prove they could read and write, and not too long before that they had to be property owners (I hope you know where I am going with this). Well, we have not gotten rid of the vote because this is fundamentally important in a democratic society, but we have fought to equalize and in some cases eliminate some barriers to the right to vote completely. Standards are not the problem; we should not have to get rid of them. It is the inequality and the prejudices motivated by race, economics, and social position that continue to be a problem.
I guess what I am saying is that our fight against the standard is misdirected. We should be fighting to eliminate those things that are keeping many from meeting the standards.
I hope I made some sense, I tend not to many times.
Eu-
Eugenio Longoria Sáenz
ezl109 at psu.edu
Hi Eugenio,
Thanks so much for your post. You are talking about the standard Opportunity to Learn (OTL), and it is a very real and important standard. Perhaps the most important one, but unfortunately in ABE, the one that commands the least resources. It is true that even the mechanisms that strive to be the most fair are always going to be limited by their environment.
What do others have to say about this piece of the equation?
For a good, succinct reading on standards-based reform including a discussion of OTL, see:
A User's Guide to Standards-Based Educational Reform: From Theory to Practice by Regie Stites
http://www.ncsall.net/?id=352
marie
Marie, et al,
By "alternative", I presume you mean that these assessment options are an alternative to multiple-choice assessments. Is that a fair inference? I sometimes refer to alternative assessments as non-multiple choice assessments, just to make clear what I am talking about.
From my perspective, referring to them as authentic seems to muddy this discussion. Webster provides the following two definitions for authentic, which may help to illustrate my thinking:
a) worthy of acceptance or belief as conforming to or based on fact <paints an authentic picture of our society>
b) true to one's own personality, spirit, or character
So for example, a student's CASAS scale score in math (say 212) from a multiple choice test may be worthy of acceptance as an indication of that person's math ability. An analysis of the test item responses may even provide greater information about a person's strengths and weaknesses. However, they cannot say much about how the student perceives the relation of "math" to his/her own personality and life. Two students at entry might both achieve a score of 207 in math for very different reasons. One student might have liked math and viewed herself as being capable of learning math, but just not used it for many years. The other student might have never liked math, generally seen herself as having other strengths, but been forced to use math as part of her job. To ascertain this type of information, the teacher might have to talk to the student and find out the student's past experiences with math, the student's perceptions of its importance in his/her life, etc. Then, a custom assessment/project can be designed that is meaningful and authentic to that particular student.
From my perspective, all standardization (whether multiple-choice or non-multiple choice assessments) will to some extent reduce the authenticity for the student. The CASAS system attempts to address this by providing assessments that are relevant to adults and based in various contexts (life skills, employability skills, workforce learning, citizenship, etc.) so that the student can be assessed in contexts that are somewhat authentic to their experiences and goals.
Therefore, I prefer the term alternative assessments because then we can focus our discussion on the differences between multiple choice assessments and non-multiple choice assessments.
There is no question that non-multiple choice assessments can be legitimate and have many strengths. For example, Connecticut is currently piloting a CASAS workplace speaking assessment. This is a standardized assessment designed for ESL learners who are currently working to demonstrate their listening and speaking abilities in a workplace context. Compared to the CASAS listening multiple-choice assessments which we have used over the years, the speaking assessment has the potential for the instructor to gain a greater understanding of a student's strengths and weaknesses. Students also seem to enjoy taking the assessment. However, it needs to be administered one-on-one unlike the listening which can be group administered. The speaking assessment also places a greater training and certification burden on the test administrator and scorer. We have experienced many of these challenges with our statewide implementation of the CASAS Functional Writing Assessment over the past few years. Kevin alluded to some of those challenges such as maintaining scorer certification and interrater reliability. The scoring rubric used in both the writing and the speaking assessments can be valuable tools for classroom instruction.
In my opinion, at least some non-multiple choice assessments should be standardized so that they can be used to broaden the array of assessments available for state-level reporting/accountability.
Thanks.
Ajit
Ajit Gopalakrishnan
Education Consultant
Connecticut Department of Education
25 Industrial Park Road
Middletown, CT 06457
Tel: (860) 807-2125
Fax: (860) 807-2062
ajit.gopalakrishnan at po.state.ct.us
Hi Ajit and everyone,
Yeah, that's a good question you posed to me Ajit: I guess you are right in saying that I do think about 'alternative' as referring to any assessment that is not multiple choice. Actually, the terms I use in my head to separate this stuff out are selected-response and constructed response.
Selected response describes that situation: the person must choose (select) from a set of answers (responses) which one they think is the right one.
That's pretty tightly wrapped up in terms of what that means: you get the list of answers, you look at the choices, you determine which item on the list you think is right.
Constructed response also describes the situation precisely: the person must recall info or build for themselves (construct) the answer to a particular question. No choices are given for the person to consider - they are not selecting anything. The other thing that is hugely useful about using this term is that it is not prescriptive in how big or small a response must be constructed. So for example, many people think that a 'performance assessment' (which is a constructed response: because you are demonstrating your performance) must necessarily entail something big, lengthy, intense, etc. But in fact, a constructed response might entail just one word (as long as you are not selecting that word from a list). Here's a great example: you know what a 'cloze' exercise is? Those fill-in-the-blank worksheets that can test you on vocab or grammar? Well, that is a performance assessment, even though you are only filling in one word here and there.
I like to think about these notions this way because they are devoid of other distractors - for example, there is no mention of standardization with selected or constructed response, that is a whole other step in the process. And if you continue to think about selected response as 'multiple choice' then I bet you a dime you just fall back on equating multiple choice with TABE - and that is just not correct at all. While the TABE is an EXAMPLE of a multiple choice test - one does not equal the other.
A couple of questions back to you Ajit and to all the subscribers:
- Ajit, you made some really thoughtful comments in your arguments against using authentic assessment - what do others think of Ajit's point of view?
- Ajit, you said: "In my opinion, at least some non-multiple choice assessments should be standardized so that they can be used to broaden the array of assessments available for state-level reporting/accountability."
Folks - can anyone give us any examples of what Ajit describes above? Let's see if we can develop a growing list of the assessments being used that are different - I'll start by adding the REEP Writing Rubric to the list - it is standardized, it is a constructed response test, and at least Massachusetts uses it for reporting writing gains to the feds.
Also, Andrea Wilder (post on 2/3) suggested that we use Assessment for all types of 'tests' but that we divide that into sub-headings that list the various types, and include information on who wants the data from said test and who gets that data. We do have some amount of info listed on types of tests and costs, but we don't have a whole lot of info on who actually gets the test data and what it gets used for. What do folks think about this?...I'm intrigued....
Robin Millar (post on 2/3) describes a guided portfolio in use in Manitoba that sounds interesting: it has several levels to it. Robin - are parts of the portfolio standardized? The whole thing? Does the portfolio include both selected response and constructed response types of assessments and info?
Ok, enough chatter from me for a Sunday morning. Hope everyone is having a lovely weekend, and see you again tomorrow,
Marie cora
Assessment Discussion List Moderator
For definitions see:
http://wiki.literacytent.org/index.php/Assessment_Information#Assessment_Glossary
For a bunch of details and info on Commercial Assessments (which do not discuss the uses of data, and should!), go to:
http://wiki.literacytent.org/index.php/Commercially_Available_Assessment_Tools
To help me develop the Wiki section on Alternative Assessment, go to:
http://wiki.literacytent.org/index.php/Alternative_Assessment
To make informed choices about test selection, go to:
http://wiki.literacytent.org/index.php/Selecting_Assessment_Tools
Hi Marie,
Thanks for your response. I like your labels of selected response and constructed response. I guess by alternative, you were really referring to constructed response. I agree that there are selected response options that are not just multiple choice - I presume you might have been referring to matching, true/false, etc. I wonder what the NAAL used.
With respect to your second question, I thought that I had mentioned in my earlier email two examples of standardized constructed response assessment options that we are using in Connecticut: (i) the CASAS Functional Writing Assessment and (ii) the CASAS Workplace Speaking Assessment. The former is currently reportable to the NRS while the latter is in the midst of being correlated to the NRS levels and will be reportable shortly. I would love to hear what other states are using/considering.
Ajit
Marie, Ajit and others,
Two terms evaluators often use which could be introduced in this discussion are "direct" and "indirect" measures. A "direct" measure is one which measures an actual performance of a specified task. An "indirect" measure (multiple choice paper and pencil tests are the best-known example) is one which "stands for" the performance. When the GED Testing Service, several years ago, changed the GED writing test from a multiple choice assessment _about_ writing to a performance test of essay writing, it moved from an indirect ("mediated") assessment to a direct assessment (of essay writing). Most evaluators would agree that given a world of limitless money and time, direct assessments are better -- that is, they more accurately measure the specified and desired performance of a task -- than indirect ones.
"Authentic" and "performance-based" assessments are synonymous (for me) with "direct" assessments. Paper and pencil multiple choice tests (whether standardized or not) are the best-known and most widely used example of "indirect" or "mediated" assessments.
Two other terms I find useful when categorizing assessments, measures or observations are "obtrusive" and "unobtrusive". The assessments we talk about on this list are most often or always "obtrusive," that is, the student knows s/he is being tested. This can be good ("positively obtrusive") where the assessment itself causes some additional positive learning, or bad ("negatively obtrusive"), where the assessment interferes with or prevents learning. One of the biggest complaints about the obtrusive assessments mandated by NCLB is that they are negatively obtrusive: that they, and some would argue preparation for them, take away a lot of valuable learning time. There are other ways in which assessments can be negatively obtrusive, too, producing false results for some for whom the testing situation creates fear that impedes normal or ordinary performance.
I find "unobtrusive" assessments the most interesting, where the learner is assessed but doesn't know it, and just sees it as part of the learning, indistinguishable from the rest. Portfolio assessment can be unobtrusive, as can journal writing, or a weekly set of problems or exercises which the student regards as regular classwork or homework. Teachers ask students questions all the time, many of which are used as unobtrusive individual or group assessment. And some teachers do this systematically. In theory, unobtrusive assessments could be standardized, although I know of no example of this. (I wonder if large college lecture classes where students sometimes have assessment consoles that enable them to respond immediately and have their responses immediately tabulated and graphed for the instructor might by now have evolved some standard procedures which make them valid and reliable. Anyone know?)
I think "alternative" is a vague word which doesn't help us to think differently about kinds and purposes of assessment, whereas some of these words raise some important and interesting differences in kinds of assessment.
David J. Rosen
djrosen at comcast.net
A couple of thoughts-
First, using terms that are based on comparison is not usually a good idea, since, as the adage says, anything suffers by comparison. So, "alternative" is not the best choice of terms since it suggests different, which leads to the question, "Different than what?" It needs a term that allows it to stand on its own merits.
And I want to comment on your statement, David, that in "'unobtrusive' assessments... the learner is assessed but doesn't know it, and just sees it as part of the learning, indistinguishable from the rest." I would make a fine, but I think important, distinction here. I believe the student should know that he or she is being assessed. But in unobtrusive assessment, the assessment is part of the learning process, consistent with it, grows out of it, and suggests future direction for it.
Rose
"Unobtrusive measures" can be just that--a teacher stows them away as signs that the learner feels better about being in class, is making progress, etc.
Unobtrusive measures could be: a student shows up early to class, looks better dressed, does not sit in the back of the room, jokes with other students...
Andrea
Hi Andrea, David, Rose, and everyone,
I’d also like to pick up on this great discussion of terminology (sorry to be late – last week was school vacation week!).
David, you raised some really good points in your post, and added greatly to our list of terms that can possibly be more useful to us in this work. I have to side with Rose in her point regarding whether or not a student knows they are being assessed. I also feel it’s extremely important for them to know how, when, why, etc., they will be assessed, but if the test is unobtrusive as David suggests, then this variable should actually be a positive one. It would provide them with information on what the expectations are for their performance.
What do others think about this?
David, you also brought up standardization of unobtrusive tests, and Andrea noted some good examples of things teachers could observe of their students. Although I agree that it would be a tough thing to formally standardize such behaviors/acts/skills/interactions/etc., a teacher could take some measures to be as uniform in the process as possible. For example, they could use a standard form for checking on the things that Andrea notes, and provide themselves with parameters for using the form (like use same time every day, always ask same questions in same order, never ask more than 2 students the same question, etc.). There is also what’s known as triangulation – which is used in the classroom: it is a procedure that includes 3 ways of corroborating information (http://wiki.literacytent.org/index.php/Triangulation). This is also a possible way of making the collection of some of this info more valid.
This is making me think of Exhibitions, which is a Coalition of Essential Schools initiative. It’s different because it’s high school, but there are many places where this discussion and that type of assessment overlap. Exhibitions, at the very least, would certainly fall under David’s “Direct” measure category. (http://www.essentialschools.org/pub/ces_docs/resources/cp/assess/assess.html)
So! Here’s a tally of some of the terms that have been put out there during this discussion. Let’s see if we can add to this list. Advocate for your preferred term as well, and let us know why you feel that way. Add to, or revise the definitions below as well.
Constructed response - An exercise for which examinees must create their own responses or products (performance assessment) rather than choose a response from an enumerated set (multiple choice).
Selected response - An exercise for which examinees must choose a response from an enumerated set (multiple choice) rather than create their own responses or products (performance assessment).
A "direct" measure is one which measures an actual performance of a specified task.
An "indirect" or “mediated” measure (multiple choice paper and pencil tests are the best-known example) is one which "stands for" the performance.
"Obtrusive" assessment - that is, the student knows s/he is being tested.
"Unobtrusive" assessment - the learner is assessed but doesn't know it, and just sees it as part of the learning, indistinguishable from the rest.
"Positively obtrusive" - the assessment itself causes some additional positive learning.
"Negatively obtrusive" - the assessment interferes with or prevents learning.
Questions for you all:
Performance assessment – What do you think of this term? Too vague as well? Has it taken on too much to be meaningful to us now?
Authentic assessment – What do you think of this term? Too vague? Encompasses too much so tells us little?
Participatory assessment – What do you think of this term? Is this in the same category as the ones above? Why or why not? Is this term useful for us?
Thanks for this rich discussion,
marie cora
Assessment Discussion List Moderator
