I truly believe that it is possible to have a standardized test that does at least a decent job of measuring student achievement. That being said, I have yet to see one that does.
Exhibit A
I snapped a quick pic of a CAPT (Connecticut's standardized test of choice) practice sheet that was left sitting by the copying machine on Friday.¹
Question 1
My favorite part of this? It includes the little bubble-it-in grid. This particular worksheet doesn't get scanned; the grid is just there as practice so students know how to fill in bubbles. As if there's nothing more important in our students' lives than learning these valuable life skills (Objective A.12.34: Students will display proper usage of No. 2 pencils and bubbling technique).
These tests always seem to be trying to trick students. The question asks for the answer to the nearest gallon. Doing the math without rounding gives an answer of 12,990.6542 gallons, and the bubble grid includes space for decimals. How many students put in 12,990.65 and get it marked incorrect? What are they supposed to bubble in? 12,990? Would that get marked wrong because the last two decimal boxes aren't filled in? 12,990.00? That's technically incorrect,² but I can see how a 15-year-old who is really trying to follow directions to a "T" would answer that way.
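For what it's worth, the grid dilemma is easy to sketch in a few lines of Python. The unrounded value 12,990.6542 comes from the worksheet math above; everything else here is just illustration of the different answers a careful student might bubble in:

```python
unrounded = 12990.6542  # the unrounded answer from the worksheet, in gallons

# What "to the nearest gallon" actually gives:
print(round(unrounded))           # 12991

# Other answers a student following the grid's decimal boxes might give:
print(f"{unrounded:.2f}")         # 12990.65
print(f"{round(unrounded):.2f}")  # 12991.00 -- filled-in zeros, false precision
```

All three are defensible readings of the directions, which is exactly the problem.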
Question 2
What knowledge is this question testing? At first it seems to be a question about proportions (40 gal. sap : 1 gal. syrup), but then it throws this whole gallons-into-quarts thing in at the end. As a result, the question only tells us whether students understand both the conversion and the proportion concepts; it can't determine whether they understand one but not the other. Thus, the test doesn't determine what a student actually knows with any degree of accuracy.
Furthermore, how important is it for students to memorize conversion factors, especially imperial volume units? I can barely keep those straight (and have little reason to). Any time I really need to convert these units, I pull up Google and use its handy unit-conversion tool.
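To see how the question bundles two separate skills, here's a hypothetical Python version of it. The 40:1 sap-to-syrup ratio comes from the question; the 4-quarts-per-gallon factor is the standard conversion; the function name and the example input of 200 gallons are my own invention:

```python
SAP_PER_SYRUP_GAL = 40   # 40 gallons of sap per gallon of syrup
QUARTS_PER_GALLON = 4    # 1 gallon = 4 quarts

def syrup_quarts(sap_gallons):
    syrup_gallons = sap_gallons / SAP_PER_SYRUP_GAL  # proportion step
    return syrup_gallons * QUARTS_PER_GALLON         # conversion step

print(syrup_quarts(200))  # 200 gal sap -> 5 gal syrup -> 20.0 qt
```

A wrong final answer tells you nothing about which of the two steps the student fumbled.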
What's the big deal?
This isn't a problem unique to Connecticut. It's a general problem that is pervasive throughout the high-stakes standardized testing world. How can these tests accurately determine what students know if they're poorly written? How can districts be told they're failing their students if the instrument used to determine that students aren't learning has serious validity problems? How can the entire education system in the United States buy into these tests as the best way to measure success?
Who are the people that write these tests? Do they read the questions they've written?
_________________________________________________
¹ Sorry for the poor quality images of the tests. They were taken with my camera phone.
² The reason 12,990.00 is incorrect is that it implies the measurement is accurate to the nearest hundredth of a gallon, a higher degree of accuracy than can be ascertained from the given information. In fact, the correct answer should be 13,000 gallons, because the total dollar value is given as the entirely vague "about $556,000." That implies the final answer can only be accurate to the nearest thousand. Thus ends the quick & dirty lesson on significant figures.
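The significant-figures point can be checked directly in Python, since round() with a negative second argument rounds to tens, hundreds, thousands, and so on (the 12,990.6542 figure is the unrounded value computed above):

```python
unrounded = 12990.6542  # the unrounded worksheet answer, in gallons

# Rounding to the nearest thousand matches the precision of the
# "about $556,000" given in the question:
print(round(unrounded, -3))  # 13000.0
```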
Do these questions assess skills as listed in your state standards? (I know ours contain wording such as "solve multi-step problems"). Is your issue with the questions asked or the standards they are addressing?
I would be interested in what the "right" answer is according to the testers - I know that in some cases, the logic that you used to get the 13000 would not be used by the writers of the exam. And obviously I don't think they include any "throwing out" of questions that seem statistically questionable.
I teach college and I am frustrated by many students' inability to critically think and problem-solve beyond the typical problems seen on these standardized exams. Jackie's comment seems to indicate that these questions are following the standards, my question would be whether the students who can answer these questions can critically think about problem solving on a broader scale or vice-versa. I don't think the design of questions really tests the students' abilities to figure things out effectively. I'm sure they are taught a number of steps to go through on these practice sheets - memorize the steps and you can answer the question. But, what if the question changes to something that can't be figured out by the same steps? Can they figure out the multi-steps to take or can they just memorize the multi-steps provided to them on practice exams?
Bad questions cannot assess any standards, and these are bad questions.
That Ben has to break down the answer for us betrays how bad the questions are.
I bet Ben's answer would be marked less than perfect. The correct answer requires leaving the last two boxes blank--not "0"'s but blank. (I know you said as much, just emphasizing the point.)
If I am ever on the wrong end of CPR, read this post to me again, Ben--it beats a dopamine drip to raise the BP.
I don't know if these questions address the standards (that's part of what I was asking). My point was that it may be the standards that need to be readdressed in order to get better questions.
(and for the record, I don't like the first question. I'd take any of the answers Ben listed (with justification). I'd like the second question more if it gave the students the information needed for the conversion)
I also think that if we want to assess students' ability to solve multi-step problems, it is better done via a free-response format.
Wow. I don't think I've ever received so many comments so quickly. Does that say something about how educators feel about these tests? Perhaps it was just an exceptionally well written post (riiight).
@Jackie: I don't teach math, and thus have very limited knowledge about the math standards. I did a quick lookup on the CT Dept. of Ed site and found these pretty generic standards which seem to almost match:
2.2a. Develop strategies for computation and estimation using properties of number systems to solve problems.
2.2b. Solve proportional reasoning problems.
I'm sure there are more specific standards than this, but that's the best I can do in limited time. Regardless of whether the questions meet the standards, they're tricky, and not in a "they require a lot of solid mathematical reasoning" type of way. The first question should be thrown out. The second question shouldn't require the conversion to quarts, or at the least (as you mentioned) it should give the conversion factor.
@Sue I agree that assessing students' critical thinking/mathematical reasoning would be superior. How do you measure those mathematical reasoning questions in your class? Could that method be scaled up?
@Michael: Perhaps these types of questions do serve a valuable purpose! 😉
I think these types of questions do ease my mind (in a way). When I start to feel myself get stressed about falling behind and not going over all the required content, remembering that the standardized tests will be poorly written and not do a great job of assessing the standards makes me feel better about not covering everything I'm "supposed" to.
I am not a big fan of standardized testing, and I teach this stuff. I've even used this question as a practice example in my classroom. Just so you know, the students are given a formula chart that has the gallons-to-quarts conversion on it (1 gallon = 4 quarts). They are not expected to remember that.
I'm glad they're given the conversion charts. That makes it marginally better (though I still have other issues). 😉
First off, you said "What are they supposed to bubble in? 12,990? Would that get marked wrong because the last two decimals aren't filled in?" Well, if the unrounded answer is 12,990.6542, it would be 12,991... but that's me being anal about that sort of thing.
To answer the question about how strict CAPT is - our algebra team had that exact question so we asked an "official" with CAPT and he said:
...we do accept answers that are more precise than the question asks for. In the example you provide, 79.2 would be accepted as a correct answer in addition to 79. We often accept a variety of answers as correct depending on rounding, truncation, etc.
Presumably that means that your number 12990 would NOT be correct but 12990.65 or 12990.7 or 12991 would be correct.
Also many questions (not necessarily these ones) will have multiple answers due to estimation. It's not always "did you get EXACTLY the right answer? No? Well you're screwed"
Lastly, and I know it's stupid since I have students who can't do this either, but they DO need to practice bubbling in. The fact of the matter is that in order to move on to any college, you will have to take the PSAT and/or SATs. The decimal point always messes them up; they put whole numbers in the decimal portion. We give "benchmarks" every quarter with grid-in and open-ended questions, and you would be surprised at how many can't/won't/don't know how to fill them out.
I agree that these questions are poorly written. I also agree that the CAPT is great at seeing how mathematically flexible students are (especially in open ended questions) but there are certainly a lot of flaws. I am in a particularly poor scoring school and many of our students just can't compete with the sometimes sophisticated language, the "got it or don't" mentality, and the complex, multi-step questions.
@Stellarpuppy: It is good to hear that the official CAPT scoring does allow for some variation in responses. Good catch on my rounding, though I do think 12,990 should be taken as an acceptable answer, since it shows the student fully understands the math concepts behind the problem. Simply because they either forgot to round (as I did) or rounded incorrectly shouldn't mean they get knocked as not understanding the other concepts the question is getting at. I'm fine if they want to test for rounding, but it seems pretty nitpicky to mark the entire question wrong based on a rounding error alone. Maybe the CAPT grading system is more advanced than a simple "right/wrong" and would recognize an answer of 12,990 as showing understanding of the mathematical concepts while noting that the student didn't round correctly. Who knows?
On the second point (re: practicing filling in bubble sheets): I still disagree with having to practice this sort of thing. It has no utility beyond standardized tests. It's a horrible test design if students need to be coached in how to correctly fill in their answer sheet.