12
$\begingroup$

I am currently grading a problem in an exam on analysis for mathematics students. One subtask is to calculate a certain integral (the resulting term has to be as simple as possible). For this part I am able to give up to $n$ points in essentially any way I see fit, provided that I do so consistently (the number $n$ is already fixed though).

Of course I am interested in an answer to this question in order to apply it to my specific case, but I also find the general question interesting: how does one assign points fairly when there are many long paths to a concrete solution?

Some facts which may be significant:

  • I am not allowed to give fractions of points.
  • The exam paper does not contain any information on how the problems will be graded other than how many points are assigned to each subtask of a problem.
  • There are significantly more questions in the exam than can reasonably be expected to be solved in the given time; in particular, one does not have to solve all exercises to obtain full marks.

I have thought a great deal about how to assign points in such a situation, but none of the methods seems to be satisfactory. Here are some of these possibilities:

  1. All or nothing: Assign $n$ points for a perfect answer and none otherwise.
    Assessment: Small mistakes are excessively penalised.
  2. The usual approach at my university to grading such questions is to record all the different routes the students have followed to carry out the calculation, then either

    1. partition every such route into $n$ "even" chunks, and then assign one point for every successfully completed chunk; or
    2. isolate $n$ "significant steps" (e.g. rewriting the integrand in such a way that it may be easily integrated), and award a point for each such step.

    Assessment: This seems rather unfair to me, for the following reason. There are infinitely many sequences of steps that one may perform, only some of which lead to a term simple enough for the calculation to be deemed complete. One only really knows that a sequence leads to the solution once it has been completed, so a student who simply performs the first few steps of such a sequence hasn't done anything qualitatively different from a student who has performed steps in any other sequence.

  3. In view of the assessment of the previous method, I could solely consider calculations which lead to a definite answer, and then penalise every mistake in the calculation by the subtraction of one point, so that one obtains $\max(0,n - m)$ points, where $m$ is the number of mistakes.
    Assessment: If someone writes something trivial like $\int f(x) \, dx = 0$, he will only have made one mistake, and will thus receive $n -1$ points. Also, students aren't aware during the exam that they have to complete their calculations at any cost.
  4. Again in view of the assessment of method 2., I could again solely consider calculations which lead to a definite answer, determine which correct route each one corresponds to best, and then apply method 2.2.
    Assessment: Again, students aren't aware during the exam that they have to complete their calculations at any cost.
  5. Judge how much the final result differs from the correct one (by missing signs, summands etc.), and then assign points for how strongly the calculated result resembles the correct answer.
    Assessment: Judging such a resemblance is highly subjective. Also, it is possible that one arrives at a solution which looks similar to the correct one by completely invalid means.
  6. Again in view of the assessment of method 2., one could award points to any sequence of steps which could be considered to constitute a promising attack on the problem.
    Assessment: I would have to determine what constitutes a promising attack.
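As a minimal sketch of the arithmetic in method 3, the following Python function computes $\max(0, n - m)$. The function name `penalty_score` and the sample values are illustrative assumptions, not part of any actual grading scheme.

```python
def penalty_score(n: int, mistakes: int) -> int:
    """Method 3: start from n points, subtract one per mistake, never go below 0."""
    if n < 0 or mistakes < 0:
        raise ValueError("n and mistakes must be non-negative")
    return max(0, n - mistakes)


# Illustrative values with n = 5 (assumed, not taken from the exam).
print(penalty_score(5, 0))  # flawless calculation
print(penalty_score(5, 2))  # two slips
print(penalty_score(5, 9))  # more mistakes than points: floored at 0
print(penalty_score(5, 1))  # a one-line wrong claim counts as a single mistake
```

The last call shows the weakness noted in the assessment: a trivial wrong answer such as $\int f(x) \, dx = 0$ counts as one mistake and still receives $n - 1$ points.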

One final observation: I have not considered this up until now, but I could create a grading scheme where there are more than $n$ things one may write which give points, and the final number of points awarded is $\min(n,m)$, where $m$ is the number of things one has written that give points. This would allow for further, perhaps more nuanced, methods.
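A rough sketch of this capped scheme, assuming the grader keeps a list of creditable items and ticks off the ones that appear in a student's work. The function name `capped_score` and the item names are hypothetical placeholders.

```python
def capped_score(n: int, creditable: set[str], written: set[str]) -> int:
    """Award one point per creditable item the student wrote, capped at n."""
    m = len(creditable & written)  # items written that give points
    return min(n, m)


# Hypothetical list of creditable items for one integral; there are more than n = 3.
creditable = {
    "substitution chosen",
    "integrand rewritten",
    "antiderivative found",
    "bounds evaluated",
    "result simplified",
}

# Four creditable items plus one that earns nothing: capped at n.
student_a = {
    "substitution chosen",
    "integrand rewritten",
    "antiderivative found",
    "bounds evaluated",
    "unrelated remark",
}
# Only two creditable items: below the cap.
student_b = {"substitution chosen", "integrand rewritten"}

print(capped_score(3, creditable, student_a))
print(capped_score(3, creditable, student_b))
```

Because the list is longer than $n$, different routes can reach full marks without each route needing exactly $n$ chunks.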

(P.S. This is my first question on Stack Exchange, so I am particularly eager to read any criticism of my question.)

$\endgroup$
3
  • 4
    $\begingroup$ If you're allowed to modify the grading scheme, then you could always go to a more standards-based grading model where you list out all the important concepts and then score them for each one on a 4-point scale. Then you'd be able to grade each major concept on its own merits. A 4 would be a perfect, 3 would be minor mistakes (multiply instead of add, missing labels, etc), a 2 would be conceptual error(s) (integrating instead of deriving, etc), a 1 would be a start without a finish, and a 0 would be blank. Or however you want it. After you have all the parts, you can just add up the total. $\endgroup$ Commented Aug 9, 2015 at 18:26
  • 2
    $\begingroup$ If there had been any doubt, I think the way the question is written shows very clearly that you have the mind-set of a mathematician :-) It is not necessary to have a grand unified theory here. My approach would be that if I'm grading question #7, and it's a question I've never assigned before, I would go through and write comments on everybody's answer to #7. Then I would go back through and subjectively sort out the answers in terms of quality, from worst to best (or in stacks that seem similar). Then I would subjectively assign points to the answers from worst to best. $\endgroup$
    – user507
    Commented Aug 9, 2015 at 19:27
  • 1
    $\begingroup$ Just a different POV re length: I tend to skip things marked "tldr," because "tldr," as a sign, connotes anti-intellectualism: "too long, don't read." Does the author not care to think through what he or she writes and make it as succinct as is fitting? Are ideas that require long explanations beyond the capabilities of his or her intended audience? The reader can already see how long the post is, and "don't read" seems antithetical to a forum committed to serious business. Ask yourself, is it succinct? Did you tell us what we need to know? Did you tell us things we don't? $\endgroup$
    – user1815
    Commented Aug 18, 2015 at 18:04

3 Answers

7
$\begingroup$

Why not assign an even integer to the problem, and then assign full credit for all steps being correct, half of the integer value for any mistake (singular), and zero credit for more than one mistake? This is what I do after years of similar debate ...

The advantage is that it reflects the expectations of a work environment. Down the road, if students make a small error but otherwise contribute to a team with a fluid and graceful solution, it would be valuable, and accordingly they earn half the rubric points. If each problem was assigned 6, then they earn 3. If they made so many mistakes that one could not appreciate the grace and fluidity of the solution, then 0 is appropriate because it would not be helpful in the workplace either.
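A minimal sketch of this rule in Python, assuming the grader has already counted the mistakes; the function name `full_half_zero` is made up for illustration.

```python
def full_half_zero(weight: int, mistakes: int) -> int:
    """Full credit for no mistakes, half for exactly one, zero for more than one."""
    if weight < 0 or weight % 2 != 0:
        raise ValueError("weight must be a non-negative even integer")
    if mistakes < 0:
        raise ValueError("mistakes must be non-negative")
    if mistakes == 0:
        return weight
    if mistakes == 1:
        return weight // 2
    return 0


# A problem worth 6 points, as in the example above.
print(full_half_zero(6, 0))  # all steps correct
print(full_half_zero(6, 1))  # a single mistake
print(full_half_zero(6, 3))  # more than one mistake
```

Requiring an even weight keeps the half-credit value an integer, which matters when fractions of points are not allowed.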

Then, on top of the 0, half, full credit rewards, utilize rubrics that borrow from the NCTM, CCSS, Buck or somewhere else, etc., in order to assess the numerical value for each problem, i.e., the weight of the problem, or whether the first problem is worth 8 points but the last problem is worth 2 points, etc.

This is the fairest metric to me since it does not over-emphasize correct reasoning steps at the expense of the end result, but it likewise does not fail to reward correct reasoning despite an incorrect answer. It also permits different difficulty levels of problems according to known rubrics for assessing problems. I have found that students universally consider this the fairest system because, regardless of the type of mistake or mistakes they make, there is no subjective choice involved: they are only permitted one mistake. This makes sense because the significance of a mistake depends on the context in which it is made, but on a test there is generally no real-world context, only abstract practice. Equalizing the penalty across types of mistakes is therefore generally well accepted by my students, since there is usually no way to evaluate the significance of one mistake versus another without workplace context.

$\endgroup$
3
  • $\begingroup$ In principle I agree, but you are mixing together different skills (plan the right strategy to solve the problem, complete individual steps correctly) here. If the question is given, you might not have any options. $\endgroup$
    – vonbrand
    Commented Aug 23, 2015 at 23:55
  • $\begingroup$ Yes, this method equalizes, or mixes as you state, each error as the same value. I rather like that ... after years of subjective opinions and complicated rubrics, I decided that a mistake devoid of context is equal with another mistake devoid of context. I do not, however, use this for "free response" or "performance based" problems which involve multiple steps and types of reasoning. I only use this for algorithmic problems ... @vonbrand $\endgroup$
    – oemb1905
    Commented Aug 26, 2015 at 3:03
  • 1
    $\begingroup$ I ended up with a bit more lenient scheme: 3 points = full correct solution, 2 points = full solution with an insignificant error, 1 point = substantial progress but mission not accomplished, 0 points = everything else, with me, as a teacher, being a sole authority with respect to the exact meaning of all adjectives in this description. It simplifies the grading noticeably (I've seen people spending 5 minutes deciding whether somebody should be awarded 4 or 5 points out of 10 for a single question) and sets the expectations just about right, IMHO. $\endgroup$
    – fedja
    Commented Nov 18, 2022 at 4:12
4
$\begingroup$
  • Break down the "compute the integral" into subtasks, assign points to each step
  • Define some "mistake types" (e.g. wrong sign, wrong sum, ...) and subtract a given number of points for each
  • A combination of the above
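One way the combination could look, as a sketch only: the penalty amounts, the floor at zero, and the name `combined_score` are assumptions, since the list above names mistake types but not their values.

```python
# Hypothetical penalty table: the mistake types come from the list above,
# the amounts are assumed for illustration.
PENALTIES = {"wrong sign": 1, "wrong sum": 1}


def combined_score(step_points: list[int], mistakes: list[str]) -> int:
    """Add up the points earned per subtask, then subtract a fixed penalty per mistake."""
    earned = sum(step_points)
    deducted = sum(PENALTIES[m] for m in mistakes)
    return max(0, earned - deducted)  # assumed floor: never go below zero


# Three of four subtasks completed, with one sign error along the way.
print(combined_score([1, 1, 1, 0], ["wrong sign"]))
```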

The problem with long answers is that if a student makes a mistake, you have to pick up their thread to check if later work is correct (and the mistake might make the problem much easier or next to impossible).

What I do when designing an exam is to review the subjects to be covered, and ask self-contained, limited questions that check each important point on its own (or check a few points only). Sometimes I've broken down a long development by giving intermediate results as starting points for separate questions. You can do the same by dovetailing steps in similar questions to cover a long series of steps, if you don't want to give intermediate results away. To check if they know what the steps are, ask for that (no need to carry them out explicitly).

I give long developments as homework, explicitly allowing the use of external tools (computer algebra system, textbook, whatever). That way mistakes shouldn't have to be carried along, as you can also encourage checking intermediate results.

$\endgroup$
0
$\begingroup$
  1. Figure out $n$ significant steps, as per your idea 2. If there are several different popular ways of responding that have some merit to them, do this for each.
  2. Give one point per step.
  3. In case of mistakes, do not subtract points, but rather set a maximum; $n-1$ for a small mistake, $n-2$ for several small ones or one large one, etc. (If $n$ is large, adjust the penalties to taste.)

The purpose of the exam is to measure student knowledge and skills. Typically the most valuable skills are conceptual understanding and solution methods, so give the points for these. However, should someone be sloppy and not correct their mistakes, they also show weakness in understanding; a good solution should be checked and thus problems noted. Hence, a reduction in points is in order, but I prefer setting a maximum, as that does not destroy the accomplishments of weaker students, while setting high standards for the strong ones.
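The difference between capping and subtracting can be sketched as follows. The function name `ceiling_score` is made up, and treating anything worse than a single small mistake as a cap of n-2 is an assumption, since the list leaves the further cases open ("etc.").

```python
def ceiling_score(n: int, steps_completed: int,
                  small_mistakes: int = 0, large_mistakes: int = 0) -> int:
    """One point per significant step, limited by a ceiling that mistakes lower."""
    if small_mistakes == 0 and large_mistakes == 0:
        cap = n
    elif small_mistakes == 1 and large_mistakes == 0:
        cap = n - 1
    else:
        cap = n - 2  # assumed: several small mistakes or any large one
    return max(0, min(steps_completed, cap))


# Strong student: all 5 steps, one small slip. The ceiling takes one point.
print(ceiling_score(5, 5, small_mistakes=1))
# Weaker student: 2 of 5 steps, one large mistake. The ceiling (3) is not reached,
# so both earned points survive; plain subtraction would have left 2 - 2 = 0.
print(ceiling_score(5, 2, large_mistakes=1))
```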

Also, mistakes that are noticed or even suspected should have less weight than mistakes that are accepted uncritically and at face value. A problem one recognizes can be fixed in the real world, even if the exam situation is ill suited for this.

$\endgroup$
