High Stakes Testing and Standardized Assessment

From LiteracyTentWiki



The following discussion took place on the NIFL-Assessment Listserv from July 27 to August 2, 2005. The topic is High Stakes Testing and Standardized Assessments.




Discussion Thread

"Help", he says, not quite desperately. (I have procrastinated, so I am just a "nonce" from desperation.)

As my program (staff and learners) and fellow practitioners move into the 21st century of "no adult left behind", trying to meet the accountability requirements of federal, state, and program parties, trying to be evidence-based, standards-based, and so on in the jargon of the moment, we are, as you are, trying to prepare our learners for post-secondary training/education and for living-wage jobs, and, well, frankly (as St. Paul said) trying to be "all things to all people so that some few can be saved".

In that context, I am interested in hearing and/or discussing with folks the implementation of standardized assessments. Are they always a necessary evil? The devil's due? Have you found ways to make them relevant, engaging?

Perhaps (whisper, wink) you are a true believer? Is the TABE, the BEST, the CASAS, the best thing since sliced bread?

Don't be shy. Blast me. Guide me. Lurkers, come out and play. Theorists, practicivists welcome to proselytize.

Do you reject standardization? Are you a naturalist? Please, let me know how to move down the "path not taken."

If your comments are "not ready for prime-time", you can reply privately to hdooley@riral.org. Thank you.

Howard L. Dooley, Jr.
Director of Accountability, Project RIRAL
Assessment Team, Governor's Taskforce on Adult Literacy


Hi there,

Standardized testing is not a perfect solution - and let's be honest – there are no perfect solutions. No standardized test is ideal for every situation in which it is used - nor are any of them really able to do justice to every learner. They are also expensive and time consuming.

But what standardized testing does provide is a reasonable measure of fairness and some common shared understanding. In Canada, as publicly funded ESL gets into higher levels of language proficiency, particularly targeting highly skilled professionals, standardized testing is the only way to ensure fairness in allocating a limited number of placements with set language entry requirements. Even in more basic settlement language training there can be wait lists and limited enrollments.

Canada is fortunate in having a national language standard upon which to base not only the development of standardized tests but also curriculum development and materials development. The Canadian Language Benchmarks are a descriptive scale of communicative proficiency in ESL, expressed as 12 benchmarks that include detailed statements of communicative competencies and performance tasks. The standard essentially provides a framework of reference for learning, teaching, programming and assessing adult ESL in Canada. It also provides a common professional foundation of shared philosophical and theoretical views on language education. (For a free copy to download, see www.language.ca)

The CLB standard is being increasingly used to conduct occupational language analyses so that we can benchmark specific occupations - as we did with the nursing profession (see www.celban.org). Providing benchmarked "occupational language analysis" for specific occupations provides everyone - ESL professionals, newcomers, employers, licensing bodies – with common shared information about what the language requirements are. This is ultimately fair (at least more fair than the absence of such information, where employers and newcomers are left to determine for themselves what seems to be adequate language proficiency).

Standardized testing may not always be the "sharpest knife" in the drawer someone told me recently, but it is often the only knife in the drawer.

Pauline McNaughton


NIFL-assessment colleagues,

Thanks to Howard for the engaging way in which he opened this topic.

On Jul 28, 2005, at 9:17 AM, Pauline McNaughton wrote:

<Standardized testing may not always be the "sharpest knife" in the drawer someone told me recently, but it is often the only knife in the drawer.>

To continue Pauline's great metaphor:

1) We should use the best knives we have.
2) We may need some new, better knives.
3) We should be careful not to use knives when spoons or forks would be better.
4) We should avoid using knives to hammer nails or fasten buttons (ouch).

David J. Rosen
djrosen@comcast.net


David - You wrote:
<Standardized testing may not always be the "sharpest knife" in the drawer someone told me recently, but it is often the only knife in the drawer.> (from Pauline McNaughton)

And continued Pauline's great metaphor:

1) We should use the best knives we have.
2) We may need some new, better knives.
3) We should be careful not to use knives when spoons or forks would be better.
4) We should avoid using knives to hammer nails or fasten buttons (ouch).

Allow me to add one more:

5) We should be cautious not to use "the only knife in the drawer" to cut the throat of learners in programs with just literacy level 1 learners whom *I* believe it's unrealistic to expect will increase 2 GL in one reporting period.

Nancy Hansen
Executive Director
Sioux Falls Area Literacy Council
Sioux Falls, SD
sfallsliteracy@yahoo.com


Marie has asked for some additional kitchen utensil metaphors for standardized assessment. Here are three more to add to the (edited) list:

1) Standardized testing may not always be the sharpest knife in the drawer.
2) Let's use the best knives we have, but also get some better knives.
3) Let's not use knives when spoons or forks are better.
4) Avoid using knives to hammer nails or fasten buttons (ouch).
5) Let's not use "the only knife in the drawer" to cut the throat of learners in programs with just literacy level one (whom *I* believe it's unrealistic to expect will increase two grade levels in one reporting period.)

6) Let's train those with knives to use them properly.
7) Using a knife to eat peas or mashed potatoes is inefficient and uncouth. If you don't have forks and spoons, don't settle for using knives.
8) Utensils may help in cooking, but only if there's food to cook. (A kitchen version of my favorite farming metaphor for testing, "You don't fatten a calf by weighing it.")

David J. Rosen
djrosen@comcast.net


These (metaphors) are excellent, I will find a way to incorporate them into my presentations...
DEBORAH JILL CHITESTER M.S.,CCC/SLP
Bilingual Speech-Language Pathologist
Second Language, Literacy & Learning Connection, LLC
Attaining Success for Second Language Learners
Web Site: www.SLLLC.org
E-mail: djcslp@slllc.org
732-398-1796(Tel/Fax), 732-642-5118 (cell)


Thanks, David, for the additional metaphors. I think they are a real visual way for us all to "appreciate" the topic.

Nancy Hansen
Sioux Falls Area Literacy Council
sfallsliteracy@yahoo.com


A knife is really not a good metaphor for assessment. You can't measure with a knife, just slice, dice, and poke. You need a container or filter or ruler metaphor.

Here's one of my favorite Chinese sayings: Viewing the heavens through a bamboo tube, measuring the ocean with a spoon. Here we have two metaphoric references to the limits of measurement.

Assessment as a bamboo tube says to me: When you interpret the results of any assessment be aware that you are dealing with a very limited "field of view." Any one test result is an extremely narrow window on ability. Multiple measurements are always better but never the complete picture.

Assessment as a spoon says to me: Human capabilities and qualities are like the ocean. No matter how many times you dip in the spoon you will never appreciate the vastness. Spoons are too small and can only contain the amount of seawater that fits in them. Tests are like that. They measure the small constructs of ability they were designed to measure and that's it. What we call a test of reading is not really a test of everything that reading entails. It's just a test of some small part or parts of reading ability.

Regie Stites


Nancy,

When you begin work with a new student, how do you begin? Does the student go first to a tutor/volunteer? Do you use the Challenger (I think this is what it is) books? What do you do first when you enter "teaching mode?"

Thanks.

Andrea


Andrea, Howard, et al.:

So little time ... too much to say ... but here goes…"one-cannon-shot" for learners.

BUT ... P.S....This post turned out to be far more lengthy than I'd planned. I suppose it's because I view it as not a simple process, even though it is a thoughtful process.

Andrea, you wrote and questioned:
<When you begin work with a new student, how do you begin? Does the student go first to a tutor/volunteer? Do you use the Challenger (I think this is what it is) books? What do you do first when you enter "teaching mode?" >

My first step is to use "a spoon" (from David Rosen's post about a metaphor) to bring the new adult learner into the place where learning can occur.

I have chosen to forsake the "high-stakes funding" because I believe one of the "high stakes" of submitting a learner to standardized timed testing, when they have less than 1 GL, is the self-worth of the learner whom I personally value.

So, using my spoon, I listen as they talk about their background, scooping out their memories of previous educational experiences, the personal goals they wanted and now have, the reading/writing/spelling life experiences they currently want to increase. I keep careful notes of all those shared experiences. Remember now ... I'm using a *spoon* ... not a *shovel*!

I tell them that we have established the first page in their personal file and that included with that page will be an evaluation of their current reading skills and a sample of their writing/spelling skills. I tell them that I will use an assessment tool for the series of books that our Council uses for beginning adult learners.

Yet even at *that*, I see the beads of sweat roll into the furrows of their middle-aged worried forehead. Yes, even though I reassure the learner that they can stop at any time and that I praise them each step of the way, they worry they are failing. And it's not even a *test*!

You mentioned the "Challenger" series, which is published by New Readers Press, the ProLiteracy publishing division. For the learners I serve, that is considered a higher-level study material. The assessment I mentioned above is for the Laubach "Way To Reading" series, a multi-sensory approach to tutoring.

Actually, our Council uses the study material that fits the learner best. I could name them all, but there are something like 6 core series on our Council library shelves that are possible study materials for the learner. Besides, some of them are listed in the New Readers Press link at <http://www.proliteracy.org>.

Each of the series used for 1-to-1 tutoring (with a trained volunteer tutor) has an evaluation tool to determine if the learner has a base of skills that will make their first-return-to-an-educational-environment rewarding for them.

The New Reader is going to be in a peer-to-peer learning environment, so the next step is to show them the program materials they could study. If the learner feels the materials look too easy, we return to the registration table and do a "higher level" skill assessment for another series.

This is the point at which I begin the match process (or the "teaching mode" if you wish) and the tutor volunteer comes into the picture. I give the tutor individualized guidance. The tutor lines up the weekly lessons at a local library or church. They call after the first lesson and, of course, report back each month. I keep in touch with both parties of the pair by phone.

I have strong feelings, and have periodically expressed them in various places, about the fact that standardized testing is *not* a "fair way" to gain a baseline for the start of a Literacy Level 1 learner's experience.

I also feel strongly about using portfolio documentation as a much more accurate picture of the learner's past and current base of knowledge. *They* can even see their own progress.

My annual anniversary meeting with a learner and their tutor is my opportunity to show them the New Reader's personal growth - to review the book check-ups and writing skill changes with them. To give praise and congratulations for their hard work of the past year.

In conclusion, I try to keep in mind something that Archie Willard said so well on another listserv today (and I hope Archie won't mind my quoting him):

"... When you've grown up and have lost your childhood dreams as a child, you become very defensive about yourself, your life and many other things because of your past experiences. The bad memories and the struggles of not learning to read clog your thinking and you can't see all the beautiful little things in life that are all around you..."

My genuine hope is that every learner, who registers into this program and meets with me for that first hour of a fruitful relationship that will last many more hours together, finds one small beautiful "thing" about coming for help - a smile, a positive comment, a concern or a touch of the hand. SOMEthing.

I try not to put in their face (or "at their throat"…as in "one sharp knife") any barrier that will block that from happening. And I believe standardized timed testing is just *that*, Howard. Frankly I prefer a spoon over a knife.

Nancy Hansen
Executive Director
Sioux Falls Area Literacy Council
Sioux Falls, SD


Howard and All, We are required to use CASAS as our assessment tool (Life Skills and Listening Tests.) While I agree that it is a far from perfect "knife," we have reaped benefits from its use. It provided data that proved what we already knew: Our open enrollment system was too much a revolving door and needed to be changed to a managed enrollment calendar. Now our staff would all resign if we tried to go back to open enrollment!

Also, the use of CASAS has trained us to target our instruction to assessed needs. We are much more analytical about what gaps in knowledge we need to address with each student.

Would I do away with the use of standardized tests to measure our success if I could? I would like to have a better tool, but I would not want to be without some sort of "knife."

Pat Handy
410-749-3217
Coordinator, Wicomico County Adult Learning Center
Philmore Commons, Salisbury


Hi there from Yavapai College, AZ,

I do like using the TABE for standardized testing; however, it does have its challenges. Last year we had one student test too low in the "D" book and then test too high in the "M" book, so we had difficulty finding a score for him in range. I did not make him re-TABE a third time for his initial score. Perhaps the TABE ranges need to be considered in range a little further out on the battery. Anyone have any ideas?

I also believe that the TABE is merely a tool to give us an idea of where to start our students. I did find one student who TABEd into 7A who couldn't divide. Marian from our Phoenix office gave me some suggestions about looking at the Locator to determine some of these issues. Also, we have an in-house math pretest that we give all incoming students after taking the TABE to determine where to start them in the Contemporary Satellite books. This in-house test is where we determined the problem. Otherwise, since the student was ASEII (our highest level) in all three areas, I might have encouraged him to go take his GED exam right away.

Anyone else finding students with major areas of weakness, yet testing too high? Or students testing far lower than their actual ability?

Tina

Tina Luffman
Instructional Specialist, ABE-GED
Verde Valley Campus
634-6544
tina_luffman@yc.edu


Dear everyone,

Thanks Howard, for starting us off, and thanks to all who have posted for such a rich discussion.

PLEASE! Carry on!

Here's some stuff I am noting along the way. This is kinda long. I have posted a bunch of different resources - but I would really LOVE to know what resources you use to help you with these issues. So if you have good ones, please let us know what they are.

Perhaps the best way to figure out how to implement your accountability process (i.e., standardized assessment) is by hearing and learning from others' experiences, then adapting what you're finding out to fit your situation. Case studies are great for this - and Nancy has provided us one in her reply to Andrea's question about how she goes about working with learners from the beginning. In our discussion from early July, there was some discussion of when to test and when not to test (see the posts entitled "Literacy Needs" at the NIFL List Archives at http://www.nifl.gov/nifl-assessment/2005/). You can find some 'scenarios' as well as suggestions for implementing some of the standardized assessments at: http://www.sabes.org/assessment/scenarios1.htm

Pauline brings up issues of standardized testing and fairness... and YAY!!! I have to thank you, Pauline, because that is exactly what "standardization" actually means: to provide a level playing field so that you are doing exactly the same thing with each individual, i.e., being fair. (See definitions at the Special Collection - http://literacy.kent.edu/Midwest/assessment/glossary.html, or at the ALEWiki - http://wiki.literacytent.org/index.php/AleAssessment). In my mind, you cannot do away with standardized tests and the accompanying extremely important processes of administration and score interpretation because then you take away the fairness aspect that is the point of standardization.

One big issue with standardized tests is actually what they are used for. I can't stress that enough. All kinds of tests (standardized and not) are sometimes used for the wrong purposes. We need to examine what a test was developed for in the first place, and determine if that then matches the need. Often, this is not done. (And never mind the fact that curriculum and assessments must be aligned, but rarely are.) For some information on making informed choices when selecting assessments, go to the ALEWiki, http://wiki.literacytent.org/index.php/AleAssessment and click on Selecting Assessment Tools.

High stakes DOES NOT EQUAL standardized tests! I've already hounded you about what standardization means; high stakes does in fact refer to WHAT THE TEST IS USED FOR. That is a huge difference. We MUST be careful to understand these nuances, big and little, if we are to effectively utilize testing for useful purposes.

Tina, much has been discussed on TABE here, and I would refer you to the Assessment Archives (address above) to see some of those discussions (use the search tool at the Archives to find the TABE posts). Also, there is a lot written about TABE and how to work with some of its idiosyncrasies at http://www.sabes.org/assessment/tabe.htm. The Mass. Dept. of Education Adult and Community Learning Services has an Assessment Policy Manual that provides in great detail what part of the accountability system of the state looks like (i.e., all the requirements; I say 'part' because there are other pieces of the system that are detailed in other documents. These include goal-setting with students, and collecting what are known in this state as Countable Outcomes, which basically means all the other stuff besides the learning gains measured by the NRS). The Assessment Policy Manual overviews the TABE, BEST Plus, and the REEP Writing Assessment, which presently are the state's 3 high-stakes tests. (Massachusetts is developing its own assessments now, to match the state's curriculum frameworks; I believe we will be piloting a low-level reading test and a math test this year.) To access the Manual, go to http://www.doe.mass.edu/acls/news.html and scroll down to Assessment News.

Pauline also discussed the CLBs (Canadian Language Benchmarks) - and it brought right back to me some of what was included in the recommendations of the NAS report (see that Discussion at: http://wiki.literacytent.org/index.php/AleAssessment) - that many folks must be involved in determining what might constitute a set of standards or the content before any meaningful way to measure the stuff can be developed.

Finally, Nancy notes in her post that the expectations of advancement must also be realistic. (Nancy, I think that's what you are saying, but please correct me if I am not interpreting this as you intended.) So for the people involved (student and teacher) there must be some discussion of reasonable expectations of goals and advancement (the goal-setting process is extremely important for this), but at the same time, the tests we are supposed to use should also be reasonable in how they measure learning gains.

There are some very good resources that address a bunch of the issues raised here at the LINCS Special Collection in Assessment at http://literacy.kent.edu/Midwest/assessment/. Click on Teacher/Tutor and then Selecting Assessments for a Variety of Purposes and Assessment for Instructional Purposes. Also at Teacher/Tutor, check out Volume 16 of Adventures in Assessment (ok, check out EVERY volume of Adventures! But that's another story!). Volume 16 has a couple articles that focus on integrating goal-setting into the curriculum, a basic primer for understanding and using standardized tests, and using data for program improvement. Click on Manager/Administrator and check out the sections labeled Accountability/High Stakes Testing, and also Guidelines for Selecting, Administering, and Taking Tests.

Finally, I love the metaphors. They very definitely conjure up accurate portrayals (in metaphorical ways) of this wild ride we call accountability right now. At the same time, because the metaphors use basic utensils from our daily lives that we probably don't give much thought to, it feels very close to home. Got any more?

Thanks for patiently reading. Please write back.

marie cora


I'm an adult ESL teacher and can give one teacher's perspective on the issues surrounding testing in our program in Virginia. Marie asked if teachers use performance levels in their classrooms. I certainly do, on a weekly basis. I'm required by my program to assess students three times during a 12-week "cycle" -- at the beginning, in mid-cycle and at the end (to determine promotion/retention for the upcoming cycle). I follow an extensive set of performance levels for R/W/S/L geared to our adult ESL learners. (On our website, if anybody's interested.) It's very helpful in determining what kind of progress learners have made in their 12 weeks, and I usually have little difficulty in placing the continuing students in a class that meets their needs.

Overlaid on this classroom assessment is our high-stakes testing program. We test a certain number of federally financed students -- those receiving scholarships, funded through federal dollars. We test them within the first week of class and again before the end of the 12-week cycle. (Having used *other* assessments at intake to determine their initial placement.) Learners are somewhat apprehensive about these tests, but I perceive that they usually feel more comfortable when I, as the teacher they've seen for at least a couple of days, can reassure them that they *don't* need to worry about the test and should just do the best they can. They come back into the classroom saying, "That wasn't so bad."

My other comment on high-stakes testing is that, as an adult ESL teacher, I feel it makes a lot more sense to have assessments that are *performance assessments* rather than a multiple-choice, paper-and-pencil test. In my experience, the BEST Plus accomplishes that requirement of being a performance assessment that measures the language the student *has* rather than seeing if they can answer a standard list of questions right or wrong. In addition, the REEP Writing Assessment also meets that requirement (and, yes, I'm biased in favor of the RWA since I've been using it for going on 10 years now) because it gives learners the chance to show how much language they have and can *use* rather than seeing what's wrong or right. I have limited experience with CASAS and extremely limited experience with TABE and didn't like what I saw with either of them. I'm aware that there are other reasons to use either of them (esp. in terms of cost and ease of administration from a program perspective) rather than BEST Plus or the REEP writing.

One final thing that I have never heard in discussions of high-stakes testing is anything from the USDOE folks or other funders -- do *they* see program quality improving as a result of requiring these tests? Do they feel that the learners are being better served now?

Phil Cackley
REEP
Arlington, VA


Hi Phil, thank you for your post.

I really just wanted to note that this is another "case study" of how Phil's classroom and/or program manages some of its accountability system.

Please! Let's hear from others on how and what you do to meet all the many requirements and carry on effective teaching for your adult students.

Thanks,
marie cora


Dear List members:

At the end of Phil's post, he raised these important questions:


<One final thing that I have never heard in discussions of high-stakes testing is anything from the USDOE folks or other funders -- do *they* see program quality improving as a result of requiring these tests? Do they feel that the learners are being better served now?> (from Phil Cackley)

This is truly information we need and want to know.

If there are folks on the list who fall into the above categories (USDOE/Funders), would you please provide a response?

If there are folks on the list who might not fall into these categories, but who can speak to these questions, would you please provide a response?

Thank you,

marie cora
Moderator, NIFL Assessment Discussion List, and
Coordinator/Developer LINCS Assessment Special Collection at
http://literacy.kent.edu/Midwest/assessment/


I am not a funder or from USDOE, but I am a program coordinator in Canton, Ohio whose program is participating in a pilot project through the USDOE. The project is called STAR (Student Achievement in Reading) and its purpose is to teach teachers strategies to improve reading in adult intermediate (NRS levels 3 and 4) learners. The first training we had focused on diagnostic reading assessment, going beyond the TABE or CASAS to assess alphabetics, fluency, vocabulary, and comprehension in order to determine instructional needs and instructional levels. STAR is based on reading research which states we need to assess struggling readers in alphabetics, fluency, vocabulary, and comprehension. I think this is the position of USDOE. You can get more information on this from the ARCS website at www.nifl.gov/readingprofiles.

Jane Meyer


I guess I'm in that gray area of funders, I don't do it myself, but I am being paid to....another time, folks.

OK, what do funders want? You'd have to make the rounds and ask, I think there are differences.

The place I work for wants adult literacy because:

  • It's the basis for democracy, people should be able to vote and to read. Maybe if people can vote and read poverty will be reduced. Life is especially unfair if you are a minority.

In addition, I want adult literacy because:

  • It's a way to reduce population growth and increase children's health. Causal relationship, the only one there is. The planet is overburdened with Homo sapiens, we are in trouble, what with that and global warming and India and China wanting coal...oil...droughts increasing...storms...fish decline...

Adult literacy = skilled readers. Skilled readers = can read from an adult level book, and/or can learn new literacy skills, like when there is a job to be mastered. The problem is how to make this happen. "Make this happen" = skilled teachers. Skilled teachers = know how people learn to read, comfortable with technology (computers).

Skilled readers = can pass GED or other standardized test; make discernable literacy progress.

Skilled readers should be part of an education network, like what Gail S. talks about. This may help them get jobs, expand opportunities.

That's kind of basic, and I could elaborate ad infinitum but will not.

Believe me, other funders will want different things, like increased global competitiveness, improved state economies, and so on.

Andrea Wilder


I work in one of the agencies that funnel state funds to adult literacy programs. I do not make the rules, but I do administer them and perhaps that gives me the insight on assessment that you are requesting.

Our agency requires standardized testing for two reasons. The first reason is eligibility. Our mandate from our state legislature is to offer adult education services to those who have a significant reading deficiency identified as reading below the ninth grade level for native language speakers or scoring below SPL 6 for English language learners. To qualify for services, a prospective student must be tested and must score in this range. Funds are limited, and in fact have decreased over the past few years. Therefore it is necessary that funds be targeted. The standardized tests identify learners eligible to be served with our particular funds.

When agencies report to me some of the problems with testing that have been discussed here, I remind them that assessment beyond the required standardized test is absolutely allowable and in fact, must be part of the learning and teaching experience. But first, for my agency's purposes, that learner must be eligible.

The other reason we require standardized testing is what Howard Dooley already identified as "comparability." I am required to report on this statewide program. Despite the depth and richness of the case studies that our office collects, often all that I am asked is how many learners we served and how much progress was made. I need to be able to say that across the state, "x number of learners enrolled in our program made x amount of progress." I need the snapshot that a widely used standardized test gives me.

Having said this, I am more than aware that the snapshot is limited. I don't want the "knife" of testing used against learners. I do want to be able to report to people outside our field that there is a drawer full of tools that the adult education and literacy field uses to make sure that we offer the best learning experience possible for the adult learners.

Cyndy Colletti
Literacy Program Manager, Illinois State Library


I've been really quiet on this list for the last several weeks - partly because we just welcomed a brand new baby to our family - now that I've caught up on all the collected emails, I think I'll dive into this discussion. A colleague and I were actually discussing "standardized" testing issues over coffee this past Saturday as it relates to our own program.

To answer the questions posed by Howard:

I don't like standardized tests. I never have - even as a student in school myself. I think they are an excellent gauge of a student's ability to memorize and regurgitate information but not necessarily a good gauge of a student's ability to APPLY the knowledge they have. I also think one of the fatal flaws with standardized tests is that sometimes students learn something simply to pass a test but then forget it as soon as they think they don't need it any longer. Unfortunately, because of reporting and funding, I think standardized tests, regardless of which one a state or school uses, have become a necessary evil. I happen to agree with others who spoke up on the list and stated that they don't really think standardized tests are the best way to go in terms of assessing students. Like others, my own school does intake testing before assigning a student to a class. One of the problems I've found is that some students don't take the test seriously, they get really low scores, are improperly placed, and then they quit coming b/c they get bored. For the record, we use the TABE test. I've seen students test who simply opened their test booklet and just bubbled in answers - yet when doing work in class, it was discovered that they knew way more than the test showed. Likewise, I've had students test really high, and it not be an accurate indication of what they really knew. I've had students, especially in the math portion of the test, score at the 11th and 12th grade level, yet those same students could not work with complicated fraction problems, had trouble with long division, etc., let alone do algebra and geometry. The TABE, along with any standardized test, is going to have inherent flaws - because it uses snippets of data to "test" a student's knowledge base, but it doesn't come close to giving a real and completely accurate picture.

On a side note, I also agree with earlier comments that the TABE is not necessarily an ideal test to "assess" a student's reading ability. In my t levels, as a GED instructor and even as an AHS instructor, reading ability is truly only assessed when an instructor spends some quality one-on-one time with his or her students gauging everything from fluency to comprehension. The TABE, CASAS and even the GED definitely test comprehension skills but give a weak assessment of the students' fluency skills. It can be assumed that if the student has trouble comprehending what they have read, then by default they have trouble with fluency - but it doesn't begin to tell or help an instructor know just where that problem might lie. Is it with word recognition, phonetics, rate, etc.? There are a lot of questions that no standardized test can ever answer and that the instructor is going to have to "assess" on his or her own.

My experience with CASAS is that it too doesn't give a complete picture BUT, I do like the fact that it is "Life Skills/Employability Skills" based. I think it's much easier to explain results to someone in their 50's and 60's in terms of CASAS than it is to have given them the TABE and tell them that they are at a 4th grade level in a given area. I agree that such explanations are a bit demeaning to adults who have life experiences that the TABE does not take into account. There is a huge difference between the 17 year old who completed 10th grade and the 50 year old who held a job for 20 years before the plant closed, and those differences are NOT assessed or accounted for in assessments.

Howard asked if there was one test that was "better than sliced bread". I think the answer to that is "no." No one test will ever give a complete picture. I think that is also the fatal flaw in the NRS. It's data driven only and data is one sided. Data like that can be skewed b/c not everyone tests well; data can be misleading - students test high or low and it may not be the real "indication" of their ability; students deliberately "blow" the tests b/c they don't understand or appreciate the significance of them. There are a lot of factors, it seems to me, that make "standardized" testing flawed but because of funding issues, they are necessary. I think it becomes equally necessary then for instructors to go beyond the "initial" assessment done at an intake session to truly identify the needs and abilities of their students. I think this can be done with one to one interviews, surveys and teacher made materials. I think that as a student enters and learns, portfolios of work highlighting their growth are the best assessment of their ability.

I don't think there is an easy answer or solution.

Regards
Katrina Hinson


Hi Katrina et al.:

A personal and sincere thank you, Katrina, for your thorough, thoughtful reply to Howard's post. And Congratulations! It must be an exciting time for your family with a new baby born into it! I appreciate your taking precious, dear time to write this e-mail while deep in the work of loving and life-giving to your newborn.

You concluded with:
"I don't think there is an easy answer or solution."

I agree. But I feel a compromise should be sought.

Another point I would like to see made is that coordinators, directors, administrators – whatever their title -- should have the latitude to make *choices* about which assessment tools they feel best suit the population they serve without being severely punished for their choice.

I feel that right now this is the case with using the NRS system as a requirement for *all* AELS programs, whether the chosen-testing tool is "fair" in their state or not. A "no choice" circumstance.

Nancy Hansen
Sioux Falls Area Literacy Council


Hi Katrina, thanks for your post.

A couple of things to consider: You noted below in your post that 'standardized tests are necessary because of funding'. Actually, standardized tests are necessary for fairness. The funding part is quite frankly secondary - although no one would argue with your frustrations regarding **that use of them**, myself included. I'm just trying to get you (and all) to see these differences and be careful to understand how the many pieces of accountability work together, or don't work together.

And both theoretically and in reality, any type of test (including surveys, interviews, and portfolios) can be standardized, and in the best of all worlds, should definitely be standardized. (The challenges for these latter assessments are steep: costly, time-consuming, huge amounts of paper/documentation, etc.)

If you check out Phil Cackley's post, he discusses two performance-based assessments (BEST Plus and REEP) that are standardized, that provide much more usable information for the student and teacher, and that are, lo and behold, approved for use with the NRS. Not perfect...nothing is with all this...but moving toward a more effective space.

Perhaps we should shift our questions away from the tests themselves. Perhaps we should discuss what we want to measure, and then make some suggestions and have discussion on how might be best to capture the stuff we want to measure. We keep getting stuck in this quagmire of misinterpretation of terms.

What do others think?

marie cora
Moderator, NIFL Assessment Discussion List, and
Coordinator/Developer LINCS Assessment Special Collection at
http://literacy.kent.edu/Midwest/assessment/


Marie, You asked:

“Perhaps we should shift our questions away from the tests themselves. Perhaps we should discuss what we want to measure, and then make some suggestions and have discussion on how might be best to capture the stuff we want to measure. We keep getting stuck in this quagmire of misinterpretation of terms.”

I really want to see/measure how my students think - the thought process they use to get from one point to another - to achieve the answer they come up with. Ultimately how they think is going to determine how easily they learn and progress. Do they over think? Do they under think? Do they see things through completely or do they stop too soon? (For a lot of multiple choice questions, if you stop too soon, the answer you reach is there among the choices and wrong, vs. if you "complete" the work and get the correct answer.)

I want to know if they really do know the basics - add, subtract, multiply and divide and not just with simple numbers but with LARGE numbers. I usually find when I do my own assessments that students can work with a 2 digit number but struggle when a third is added. I want to know how easily my students recognize patterns, be it shapes, numbers, letters etc.

I want to see how well they communicate, both verbally and in writing.

I want to see/hear how well they read - not just what they can glean/comprehend from a short passage - but the rate at which they read, the fluency they read at and what kind of words they are stumbling over. Are they misunderstanding passages because the material is content related, such as a passage on "mitosis", versus being more generalized or life related, like reading a newspaper article?

Older students can become so focused on the GLE, in my case, with the TABE, especially if it's really low, that they grow frustrated and often comment that they'll "never get it". My younger students often experience the same frustration for different reasons - they come in knowing they completed 10th grade or 11th grade etc., yet test much, much lower than that, and then when their scores are discussed with them during student/teacher interviews they express the disappointment they feel and question if it's even worth the effort to try. Or they test really well, yet when given material that's designed to correlate to their GLE/Intake assessment, they struggle and don't understand the material, grow frustrated, disheartened and, in the worst case scenario, they quit attending class. Sometimes that initial assessment seems to set the stage for failure rather than success, and it would seem to me that the focus should be on succeeding - that there needs to be a better way to promote the positive and not the negative.

On a different note, I'd like more information on BEST Plus and REEP. Where would I find that?

Katrina


Hi Katrina,

Thanks for this...do folks have suggestions or comments for Katrina on the areas that she would like to collect information on? Let's hear what ideas folks have or what tools/processes you might already use to address some of what she outlines below.

The GLE dilemma: Have you looked into using Scale Scores? Some posts discussing this are in the List Archives (use the search button to find those posts at http://www.nifl.gov/lincs/discussions/nifl-assessment/assessment.html)
Also see http://www.sabes.org/assessment/scalescores.htm

BEST Plus: Go to the ALEWiki at http://wiki.literacytent.org/index.php/AleAssessment and click on Commercially Available Assessment Tools - there you will find contact info and overviews of several tests, as well as excerpted discussions from this List on these assessments.

REEP: http://www.arlington.k12.va.us/instruct/ctae/adult_ed/REEP/RWA/index.html
If you go to the Assessment Collection (http://literacy.kent.edu/Midwest/assessment/)
Click teacher/tutor
Click Adventures in Assessment - I believe Volumes 14, 15, and 16 all have articles written by the REEP Development team, and instructors who are using the tool.

marie cora
Moderator, NIFL Assessment Discussion List, and
Coordinator/Developer LINCS Assessment Special Collection at
http://literacy.kent.edu/Midwest/assessment/


I really appreciate the discussion, and the varied experiences and points of view. I hope more of us will join in; I'm certainly learning from your thoughts. Marie's recent comment echoes a discussion going on in RI about this same topic. I was struck by Marie's use of the word "fairness." I'm not sure I agree; I would say "comparability." I think that's what "those people" who want -- or mandate -- we use standardized assessments really want. And, of course, I have faith (faith is belief in things unseen!) that they want those comparisons to be fair. A second point: I'm not sure that in a perfect world every assessment would be standardized. Some assessing is transitory, highly personal, unique to this learner and that instructor; how would that be standardized? Isn't it, by its nature, un-standardizable?

Back to the point Marie is making. I agree that some of the dissatisfaction I have read in the discussion seems to me to stem from wanting or expecting the assessment to do or be things that it's not supposed to do or be. Not all assessment initiates from the learner or the learning situation. Particularly with standardized assessment, the assessment is usually initiated from funders or policy agencies, and it reflects what they want to know, and what they value. They are, as it were, the unseen partner in the room and in the learning situation. It may be that the assessment does not align completely, or it isn't encompassed completely, by the learning that is agreed to between every instructor and his student (or, would be happening in the absence of such an assessment). However, that doesn't mean the assessment is unqualified-ly inappropriate, inaccurate, intrusive, non-relevant, and so on. It is what it is for what it needs to accomplish. And I think that it is valid. Nothing more, but nothing less either. Just as a policy person may look at portfolios, videotapes, or anecdotes and reject them as inappropriate, non-relevant, and so on, for her purposes, instructors often do the same for standardized and even program mandated assessments that aren't generated from within a specific learning situation. The assessment identifies a few items or limited skills of value to that other person. It becomes just a few items or skills to be included in the more comprehensive learning situation.

And so I see the need for us, as professionals, to make changes to our learning situations, and to recognize, value and embed the information which a standardized test provides. We have to value it, or our learners cannot. Are we saying that the limited comprehension skills assessed have no place in the learners' acquisition of higher reading functions? Yes, they are not the totality of reading, but no relevance? It seems to me that we have to embed it, just as we would any other assessment that we do value – de-contextualized workbook, authentic, portfolio, performance. At the program level, one way this can be done is to make the standardized pre-testing part (again, one part; not the whole) of the diagnostic phase of the learner's experience -- using the assessment to set goals, for targeted instruction, or to develop specific items in an IEP. Or, instructors may see how assessment areas are related to a core curriculum, and prepare learners for those areas and in the methods of the assessments to come. In either case, or in other ways, instructors and learners would need to be open to expanding their learning to include ideas, areas, and items that the unseen partner in the learning process values.

Digression: And let me say emphatically, that this is why my program absolutely does not use or discuss GLE's with our learners. I agree that they are meaningless for adults. If there is anyone out there who absolutely disagrees, and finds the GLE's appropriate and practical, I would like to hear the argument and the examples. Seriously. Someone mentioned the STAR project, which is based on the ARC study, and I have heard that GLE's play a significant role in that program. Maybe someone in that project can write in, and offer some insight to the value of GLE's in developing reading skills.

I would also say that I don't see this as only a standardized test issue either. Whenever a policy decision is made, whether at the federal, state, program, or class level, then assessment will be initiated from outside the learner-instructor interaction. For example, at sites where technology is available, RIRAL instructors are required to use that technology as a method with their learners. Learners do not, in general, get to opt out on the basis of not seeing the relevance. So, learners prepare some written work using a word processor, because familiarity with technology has been identified as an important, life-long learning skill. And so, we assess how well learners progress in this area, even though it's not part of the GED test or the ESOL beginning learners' stated goals.

Howard Dooley


Hi Howard and everyone,

Howard, thank you for your thoughtful post.

I would really like to hear from you all out there on your thoughts, suggestions, comments on Howard's post below. It must get you thinking - share your thoughts with us.

I just want to clarify a couple of things:

Howard, you said: I was struck by Marie's use of the word "fairness." I'm not sure I agree; I would say "comparability." I think that's what "those people" who want -- or mandate -- we use standardized assessments really want.

The purpose of a standardized test is in fact to provide a level playing field - to try and be fair to all who take the test (sorry: broken record!). Fairness and comparability are two completely different things: one is about the purpose (to try and be fair); the other is about what the test is being used for (to compare students or scores or programs or whatever). These are fundamentally different notions, Howard, and I believe you are mixing them up. It may be true that "those people want/mandate we use standardized tests for reasons of comparability" - but that is completely different from the fact that a test was developed by psychometric methodology to try and capture a body of knowledge from a bunch of people without bias toward any one of those people. (I'm not saying standardized tests are perfect in their fairness regard either: I'm just trying to impress that this is the point of the standardization process, and really only that. Try to separate that out in your mind.)

And if you do not administer a test exactly as it is prescribed to be administered (which is an **extremely** important part of testing), then you have removed the fairness aspect (the standardization) and hence, any results will not be usable - you will NOT be able to compare students, or scores, or PROGRESS within a student accurately or with any confidence whatsoever. Throw out the standardized administration process and throw out any comparing as well.

Also, you said: "Some assessing is transitory, highly personal, unique to this learner and that instructor; how would that be standardized? Isn't it, by its nature, unstandardizable?"

Perhaps. Perhaps some of that is actually a monitoring of who that person is, what his needs and goals are, how he interacts with certain materials or people, what challenges and successes you as the teacher identify with him as you work with him over time. All extremely important stuff to log and keep track of because it does build a more complete picture of that person. But couldn't you 'standardize' some of the pieces surrounding some of these activities? For example, perhaps the materials or activities used are developed/selected from a set of standards based on your curriculum (or the students' goals); and most important, I would think that you would want to make sure that when interacting with each student, your processes for working on tasks or materials are pretty much the same as for every other student. Not EQUAL, I don't mean equal. A simplistic example: if you want to check on a person's ability to write a note to a child's teacher, do you let one person write that note at home (where they could get help) but another must do it in the confines of the classroom? That's not fair.

I recently saw a very long list of activities that ESOL students in a high school had to do in order to 'graduate' out of that class (there were like 30 choices). It was a required final project. There were no guidelines, timeframes, or performance standards. The list included:

Become an ROTC member
Start a class newsletter
Write a letter to a friend in English
Talk to three strangers on the street and report your experience (didn't say if that report was to be oral or written)

Would you say that any of these activities and/or their results could be compared in any meaningful way? Of course not - but the fundamental problem with this final project rests with the fact that none of this is fair to begin with. The teacher may have tried hard to encompass a wide variety so that all her students had something that they were interested in/could relate to, but because she was using the activity for a high stakes purpose, whatever results come out of it are very unfair.

Ok, I've gone on plenty. Somebody else talk now.

marie cora


I may be mangling people's definitions, but here goes.

Standardized means comparable across cases, and outsiders want to know this--how one group performs relative to another group. Outsiders also want to know that teachers are taking their job seriously and know what they are doing.

It seems unfair NOT to use a standardized test--the same measure for everyone.

Back to Nancy's dilemma--for a CBO following ProLiteracy, judging what a person is doing, how they are doing--there are regular check-ups--standardized for ProLiteracy students.

As to the individual quirks that make up a student's encounters with teaching and literacy-- for the outsider, those belong in the domain of the relationship between teacher and student. They might be useful indicators of ways to enhance program strengths.

Andrea Wilder (an outsider for purposes of this email)


I think the key justification for and importance of standardization is fairness. Being able to provide the means for comparability is an important outcome, but in my mind it is not the primary objective. I have copied some excerpts from various reports and consultations leading up to the development of the Canadian Language Benchmarks that stress the importance of fairness above all.

The lack of consistent definitions and criteria across Canada was one of the issues long raised by learners and specifically identified as a concern by them at TESL Canada’s Learners’ Conference in Vancouver (March 1992). These learners were concerned that differences in language assessment criteria and in the decisions that followed from those assessments meant that some immigrants were being denied opportunities available to others - not as a result of design or plan, but through an absence of design and plan. The National LINC Benchmarks Project: Report on the Consultations, 1993.

The CLB were initially developed by the federal government with the support of provincial governments to assist immigrants to participate more fully in Canadian society. Immigrant advocates, both within and outside the ESL/FSL fields, argued persuasively that Canada needed a common set of descriptors of English and French language ability that could be applied in a number of contexts – language instruction, local community, the workplace and the academic community.

Before the Canadian Language Benchmarks were introduced, the lack of a common, easy to understand, standard, posed a significant barrier for newcomers. Immigrant advocates, both within and outside the ESL/FSL fields, argued persuasively that Canada needed a common set of descriptors of English and French language ability that could be applied in a number of contexts – language instruction, local community, the workplace and the academic community. Educational institutions often supplied descriptions of a client’s language proficiency that were not easy to interpret for employment or placement counselors.

The federal government initiated a national consultation to identify the need for a national language standard, - to find out what was being used across the country and - to make recommendations. The report concluded that, “The criteria that are available to measure student progress are seldom cross-referenced to the ‘real world’.” And that the common use of such global terms as “beginner”, “intermediate” or “advanced” to summarize a learner’s language proficiency in relation to the real world were deemed sorely inadequate. Descriptors were needed that were able to answer the types of questions asked by funding agencies, counselors and employers such as “How much English does this person have? Is this person’s English good enough to …” “Could this person now”. (Source page 10, The National LINC Benchmarks Project: Report on the Consultations)

Pauline McNaughton


Thank you Marie - your email provides considerable information and links to information to allow thoughtful follow up on best practices identified in this discussion.

A considerable challenge that we have found in Canada with the Canadian Language Benchmarks (CLB)(12 benchmark levels) as a national standard (upon which standardized assessments are based as well as curriculum, materials development, portfolio development etc.) - is ensuring that ESL professionals have adequate professional development to use and apply them in a consistent way. If ESL professionals have adequate training in the use of the CLB (or any other standard) and the principles about language and learning that are reflected in the CLB (or any other standard) - then there can be far less dependency on standardized assessment tools - especially for ongoing or EXIT assessment.

Although we have national assessment tools and a national assessment system to ensure standardized PLACEMENT - we do not have national, standardized EXIT tools. The need for ongoing and summative assessments based on the CLB is always a hot topic. As in the US, provincial and federal funders in Canada need to have some concrete data to measure the success of funded programs (and programs need this too) - but ESL professionals realize that standardized tests used to monitor progress and achievement are not likely to accurately measure the learning that has actually taken place in the classroom. In some provinces funders accept the teacher's classroom based (performance based - often using portfolio system) assessment as sufficient for this purpose. Other funders still assert the need for standardized EXIT assessments.

The Centre for Canadian Language Benchmarks was created to support the CLB standard and is governed by a large multi-stakeholder board of directors including both government funders and ESL expert representatives from TESL Canada and provincial TESL groups as well as assessor reps. The board has strongly supported the development of tools to support classroom based assessment. We have just published (in partnership with the Gov't of Alberta and Citizenship and Immigration Canada) a 2 volume "Summative Assessment Manual" which provides classroom teachers with "ESL Learner Benchmark Achievement Report Tools" based on 12 theme areas for CLB levels 1-4. It comes with detailed instructions for the selection and administration of the tasks. We hope to develop other resources like this in the future. We are also soon to publish a kit to help teachers with ongoing or "formative" assessment throughout a program.

But even with resources such as these - there is tremendous need for professional development for ESL professionals in the use of the CLB standard. "Careful interpretation and supports are needed to apply the CLB in the many contexts where Adult ESL learners and teachers are working. It is not enough to simply hand the document to teachers and expect them to apply it. Carefully planned implementation processes and professional development activities will ensure successful use of the information in the CLB 2000 document." (Quote from CLB 2000: A Guide to Implementation). One of our current priorities is to strategically plan for the delivery of national professional development in the use of the CLB that is affordable and accessible. We are looking at online delivery of some introductory modules on the CLB as a starting point. We are also exploring ways we can work more closely with TESL trainers to help them train their students in the use of the CLB.

In the end it is all about the people who use the tools - the teachers and learners - and how they collaborate in the process of determining needs, setting goals and measuring progress and achievement.

Pauline McNaughton
Executive Director / Directrice exécutive
Centre for Canadian Language Benchmarks/Centre des niveaux de compétence linguistique canadiens
200 Elgin Street, Suite 803 / 200 rue Elgin, pièce 803
Ottawa, ON K2P 1L5
T (613) 230-7729 F (613) 230-9305
pmcnaughton@language.ca