[Stoves] PCIA den Haag

Crispin Pemberton-Pigott crispinpigott at gmail.com
Thu Mar 1 23:59:25 CST 2012


Dear Tom and Everyone

 

There will no doubt be some immediate reactions to what flows from the
PCIA/EPA/ISO meeting (hereinafter 'The Hague meeting') and then some longer
messages interpreting and discussing the resolved document.

 

So two things right away:

 

Otto, don't worry, emissions are pre-eminent because this is fundamentally
about health and safety, with fuel savings an interesting additional benefit.
If a very clean stove were available that did not save fuel, it would still
get a great deal of attention, OK?

 

Tom, 

"It's starting at the very basics - VITA WBT 4.0 - but they wanted to start
somewhere."

 

I think it will be seen shortly that what the agreement says is far more
than 'what test to use', and in fact it does not say 'use this test', not
even once, though to my mind that had been the intention at some time in the
past. There are a lot of capable people involved in stove development and
testing these days, so there is no need to over-simplify the task at hand. It
is a far better document than 'use this test and rate stoves based on the
outcome'.

 

There are many different types of stove, many foods to be prepared, spaces
to be heated, fuels to be consumed. The IWA (International Workshop
Agreement) is a cast-in-stone temporary measure which will be around for
(probably) about three years while an ISO process is completed.

 

The IWA is framed to ensure that any stove type can be dealt with, any fuel
used, and any appropriate protocol applied, with the over-riding view that
one cannot rate a stove into a performance 'tier' unless the test used to do
so can be shown to have enough skill.

 

The skill definition used is 'one third of the width of the tier'. This means
one might find it easy, using a number of protocols, to differentiate between
a baseline stove and a modestly improved stove with 40% lower emissions. It
is far more difficult to guarantee that a stove performs 90% rather than 80%
better than the baseline (i.e. stating clearly that it is 90, not 80). In
order to place a stove in the 90% category, the test used would have to have
enough precision, and that precision is one third of the 10% tier width,
meaning roughly 3% total error. The WBTs used to date cannot do that. They
are not accurate enough. Getting 'a number' is not good enough when playing
in this area. The answers have to be defended and there is a lot of money
riding on the consequences. It is also worth mentioning that there are many
different WBTs and they do not all have the same accuracy.
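
To make that arithmetic concrete, here is a minimal Python sketch of the 'one
third of the tier width' rule. The tier boundaries in it are invented for
illustration; they are not the IWA values.

    # Sketch: how much precision a test needs to place a stove in a given tier.
    # Tier boundaries below are invented, expressed as percent improvement over
    # a baseline stove; they are NOT the IWA values.
    def required_precision(tier_lower, tier_upper):
        """Maximum allowable total error: one third of the tier width."""
        return (tier_upper - tier_lower) / 3.0

    # A broad tier, 0% to 40% improvement: easy to resolve.
    print(required_precision(0, 40))    # ~13.3 percentage points allowed
    # A narrow top tier, 80% to 90% improvement: much harder.
    print(required_precision(80, 90))   # ~3.3 percentage points allowed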

 

The framing of the wording is such that when rating a stove, the claim must
be scientifically defensible, and there are formal methods for determining
that. It is not a case of making 'competing claims'. From now on, the claims
have to be validated, not just for determining what the rating number is
(that is obvious) but also for showing that, for example, placing a stove in
a 4th (highest) tier position was done using a method that is skillful
enough to do so.

 

There is an important point regarding how the tiers were determined. There
are performance tiers, they have numbers, and they divide stoves into
categories. No problem. What they are (the tier levels or performance
improvements) is not all that important, actually. It is far more important
that they be defensible when sitting in court with a donor saying such-and-such
a stove programme cheated and lied to them about the performance of the
product they paid to roll out. No one wants that. The deception might be
about durability, performance, cost, suitability for purpose - anything one
might have contracted to do. This is not to say that the IWA will be used in
court, only the contract is, but it is a flag waved that it matters what you
claim, and that the methods that underwrite that claim have to be sufficient
for the purpose. Both the stove provider and the buyer have a responsibility
to see that the test method is capable of rating the stove.

 

Back to the tiers: the tiers were established on the basis that we need
them, that we have to have some way of proposing them, and that you have to
start somewhere. So, by looking over years of testing using protocols that
are known not to be all that accurate (WBTs), some numbers were proposed. It
does not matter that the tests used to generate these proposed tiers were not
all that accurate, so don't get excited. It only matters that we have agreed
the tiers are basically logarithmic in distribution, and that they will
prevail for a time.

 

Now, what the IWA does NOT say is that, because the tiers were generated
using the Berkeley WBT3, WBT3.1 and the ETHOS Technical Committee's WBT4 and
WBT4.1 (which for some reason has been re-dubbed the "VITA WBT4.1" - I am
pretty sure Don Johnson is unaware of the use of his organisation's name),
only those tests can be used to place a stove, from this day forth, on a
tier. No, not at all.

 

A lot of care was taken to ensure that the wording does not make such a
statement. The reason is this: it is well known that the accuracy and
precision of these WBTs are not great, and certainly not good enough to
separate the better-performing stoves. The tests give a number, but it is
not a very accurate number.

 

I cannot state with certainty what the accuracy of any of those protocols is,
because none of them has ever been properly rated by an independent
laboratory against a 'real' measurement. So the IWA states clearly that all
the protocols have to be tested themselves to see whether or not they can
actually meet the 'one third of a tier' precision requirement. The
responsibility for this lies with the labs, because the IWA is not able to
create institutions or direct someone else's expenditure.

 

In other words, you cannot use a test with an accuracy of plus-minus 30% to
say that one stove is 15% better than another because the error in making
the measurements is larger than the difference sought. Repeating a test 100
times using a method that has built-in errors of 30% cannot give you a 15%
resolution. 'The signal is lost in the noise', as they say.
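
A minimal sketch of that point, with invented numbers rather than real stove
data:

    # Sketch: why a test with +/-30% error cannot resolve a 15% difference.
    # The values are invented, not real stove data.
    stove_a = 100.0      # measured fuel use, arbitrary units
    stove_b = 85.0       # measured 15% lower
    error = 0.30         # +/-30% built-in measurement error

    a_range = (stove_a * (1 - error), stove_a * (1 + error))   # (70.0, 130.0)
    b_range = (stove_b * (1 - error), stove_b * (1 + error))   # (59.5, 110.5)

    overlap = a_range[0] < b_range[1] and b_range[0] < a_range[1]
    print(a_range, b_range, "overlap:", overlap)               # overlap: True
    # The two ranges overlap heavily, so the 15% 'signal' is lost in the noise.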

 

It was heartening to see the number of delegates who understood this
implicitly and the importance it has for the development of this cooking
stove industry. It means that going forward, facts trump opinion. Claims
have to be justifiable, if not actually justified.

 

To reiterate, the IWA is not an agreement that picks a stove test which will
be used by all to rate stoves. That did not and should not have happened.
What happened is that a framework setting out performance tiers and how
stoves will be rated has been drawn up, based on the outputs from old
methods. The fact that the proposers of the tiers used old methods does not
mean that all future testing must be conducted using those same methods. They
were just used as a way to generate a starting point. The ISO standard will
probably adjust the tiers when data with greater precision are generated.

 

The way forward has also been described. The IWA is not limited to wood
fuels, nor to cooking stoves, actually; it could be used as a guide for any
stove type, any fuel, any pots, any foods, because it enshrines the
principle that whatever test method is used, it has to be rated for its
skill, and it has to be skillful enough to make a determination for that
category.

 

Following the GACC meeting on Thursday, most of the labs had a representative
at a discussion about meeting these new demands. We discussed the relative
merits of a carbon-balance approach to determining PM versus the velocity-and-volume
method. Both are used and they have, in normal circumstances,
similar accuracies; there is not much between them in most situations.
Now we must determine whether they are appropriate for stoves, not just
scientifically valid methods. That conversation will continue, as it is far
too early to make categorical statements about methods of PM measurement for
baseline and highly improved stoves. The challenges are daunting, to say the
least. Stoves are far more difficult to measure than vehicles, for example,
and the PM measured for vehicles is 'dry dust', whereas for stoves (because
of health considerations) it is all particles, including condensed, sticky
balls of tar, creosote, formaldehyde and so on.
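
For anyone not familiar with the carbon-balance idea, here is a much-simplified
Python sketch. The function, the sample masses and the assumed fuel carbon
fraction are invented for illustration; a real protocol also accounts for
methane, the carbon in the PM itself, background air and more.

    # Much-simplified sketch of the carbon-balance idea for PM (illustrative
    # only; real protocols also handle CH4, carbon in the PM, background air).
    def pm_per_kg_fuel(pm_mg, co2_mg, co_mg, fuel_carbon_fraction=0.45):
        """Return mg of PM emitted per kg of dry fuel burned."""
        carbon_sampled_mg = co2_mg * 12.0 / 44.0 + co_mg * 12.0 / 28.0
        # Scale the sampled PM by the fuel carbon the sample represents:
        # 1 kg of fuel contains fuel_carbon_fraction * 1e6 mg of carbon.
        return pm_mg / carbon_sampled_mg * fuel_carbon_fraction * 1e6

    # Invented sample masses from a hypothetical filter-and-gas measurement:
    print(pm_per_kg_fuel(pm_mg=2.0, co2_mg=5000.0, co_mg=250.0))   # ~612 mg/kg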

 

What is clever about the IWA is that, even though we have zero chance to edit
it later (due to the nature of the ISO processes), there was still time
enough and understanding enough to get unanimous agreement on how to go
forward without anointing any one (unrated) test method. It was not possible
to anoint one method because the accuracy of the protocols is unknown. One
can't pick blindly and hope. No. That is not how due diligence is practised.

 

What little is in the literature about the accuracy of stove tests, read
together with the need for tiers of performance, says, basically, that we
need better methods. You cannot set and apply tiers of performance for
stoves that the protocol cannot differentiate. That was stated early on. The
absolute minimum valid resolution needed is one third of a tier width. I am
sure you all understand that. If your ruler only has feet marked on it, you
can't measure inches. Measuring in feet three times does not create inch
marks.

 

One more thing to add is that we should all keep in mind the difference
between a systematic error and an experimental error. This is said to avoid
endless discussion about 'repeat it and the accuracy will improve'. Suppose
you have decided to remove and weigh the remaining charcoal and say it was
not burned, so its heat was not yielded, so the thermal efficiency of the
stove is such-and-such. This is part of the UCB WBTs.

 

You will have a problem determining what is charcoal and what is not. Let's
suppose you make a perfect separation. Suppose the amount of wood burned was
1000 g and the remaining charcoal is 100 g. Now you need to assign a heat
value to this charcoal, but you do not know its real heat content. So you
guess and say it is 29 MJ/kg, or 2,900 kJ total in this example. The heat
content of charcoal might actually be anywhere from 12 to 32 MJ/kg - a very
wide range - but you assign 29 MJ/kg. So an error bar is needed to show that
the value might really be anywhere between 1,200 and 3,200 kJ. You report
that. OK, it is reported.
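
Here is a rough Python sketch of how that one guess moves the final number.
The useful heat delivered and the heat value of the wood are assumptions
added so the arithmetic is visible; they are not data from any test.

    # Sketch using the masses from the example above; the useful heat and the
    # wood heat value are assumed so the arithmetic is visible, not measured.
    wood_burned_kg = 1.0
    wood_lhv_mj_per_kg = 18.0      # assumed heat value of the wood
    charcoal_kg = 0.1
    useful_heat_mj = 4.0           # assumed heat delivered to the pot

    def efficiency(charcoal_lhv_mj_per_kg):
        # Credit back the energy of the unburned charcoal, WBT-style.
        net_mj = (wood_burned_kg * wood_lhv_mj_per_kg
                  - charcoal_kg * charcoal_lhv_mj_per_kg)
        return useful_heat_mj / net_mj

    for lhv in (12.0, 29.0, 32.0):     # possible range vs the guessed value
        print(lhv, "MJ/kg ->", round(efficiency(lhv), 3))
    # 12 -> 0.238, 29 -> 0.265, 32 -> 0.27: the guess alone shifts the
    # reported efficiency by roughly three percentage points, test after test.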

 

This is an example of an error that is not removed by doing the test 3 or
300 times. If the heat content of the charcoal (per test) is a guess, the
error is never removed. The value is always in doubt. That doubt has to be
shown on the performance chart. What has happened to date is that the 'final
number' of each of three tests is shown, perhaps on a graph with a bunch of
other stove performance numbers. There are three dots showing the values
from the three iterations. No problem. But that 'range of performance' or
the 'range of values' is not represented by the three dots. Each has to have
an error bar above and below that captures the sum of all the errors
inherent in the test method. When these are plotted for each of the three
dots, the total range of possible 'truths' is much larger than the range for
the three dots alone. The total range shows the range of values of
performance that could be true about that stove.
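
As a sketch of how such a bar might be built (the numbers are invented, and
combining the two error types in quadrature is my assumption, not a
prescription):

    # Sketch: the bar to draw on each of the 'three dots' (values invented).
    import math

    dots = [0.26, 0.28, 0.27]          # reported efficiencies from three runs
    systematic_error = 0.015           # e.g. from the charcoal heat-value guess
    single_run_random_error = 0.010    # run-to-run experimental uncertainty

    per_dot_bar = math.hypot(systematic_error, single_run_random_error)
    for d in dots:
        print(f"{d:.2f}: could really be "
              f"{d - per_dot_bar:.3f} to {d + per_dot_bar:.3f}")
    # The union of these ranges, not the three dots alone, is the range of
    # performance that could be true of the stove.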

 

If you repeated the test 100 times and got 100 dots, they would (hopefully)
tend to cluster around some point. That is a reduction in the experimental
error, but it has no impact on the systematic errors (for example, the heat
value of the charcoal, which is still a guess). There are many such
systematic errors in a UCB-WBT. I think this is well known. The same is true
for all tests. What we need to do is first quantify what these are so we can
include them on the charts, and second, look for test methods that limit
both sources of error (systematic and experimental).
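
A small Python simulation makes the distinction visible; the bias and scatter
values are invented:

    # Sketch: repeating a test shrinks random scatter, not systematic bias.
    # All numbers are invented.
    import random

    true_efficiency = 0.30     # the 'real' value we would like to know
    systematic_bias = 0.05     # e.g. a wrong guess for the charcoal heat value
    random_scatter = 0.02      # run-to-run experimental variation

    def one_test():
        return true_efficiency + systematic_bias + random.gauss(0, random_scatter)

    for n in (3, 100):
        mean = sum(one_test() for _ in range(n)) / n
        print(n, "tests -> mean of about", round(mean, 3))
    # Both means settle near 0.35, not 0.30: the bias never averages away.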

 

It is quickly obvious that with such a range of products, fuels, pots and
cooking styles, we need some deep reflection. The way engineers describe the
performance of thermal devices is well established, and the metrics and units
are agreed. The protocols for determining them, one by one or all at once
(multivariate), are pretty well established too. We do not need to re-invent
them if they are available. Some things about domestic stoves are unique and
need special consideration. One problem is the accidental measurement of
condensed water vapour in a system that is measuring PM using light
scattering. Fog droplets are highly reflective and are read as 'dust'. If
you are burning damp fuel and have a boiling pot, there is a lot of water
vapour going into the measurement system. The SABS stove test protocol
includes steps to minimise the problem. The hood-based WBTs do not.
Another problem is variation in altitude and air moisture. These have a
large effect on stoves, so the reported results can be quite different from
reality on the ground unless steps are taken to use the right method and
reporting units.

 

What might surprise stovers is the good accuracy that can be obtained with
quite simple methods that determine one thing at a time. It is possible that
we will settle for a series of metrics that are individually known to be
very accurate and which, read together, categorise the predicted performance
in the field very well.

 

I look forward with anticipation to continuing the work on this. It is a
cutting edge subject. The collegiality shown by the labs is admirable. The
conversation is definitely focussed on proper scientific investigation, as
it should be.  We had delegates from 5 African National Standards groups,
plus regional test centres and formal sector standards bodies. They were
instrumental in crafting the wording to be performance-based and
non-prescriptive, yet still provide a clear and well-grounded framework for
the next three or four years.

 

The ISO standards development process is highly structured. This was
probably the last event at which you could just get on a plane and show up.
The national bodies represented on the ISO will take the process forward
(and no one else). There will of course be nominated methods and procedures
with supporting documentation submitted for consideration and debate. It is
possible a future agreement might specify a particular test method or make
reference to one or more methods, or it may be entirely 'performance based',
i.e. if it can be shown to be appropriate, you can use it. No one is
guessing at this point. In general, standards are becoming more
performance-based and less prescriptive, but that is not a guarantee of anything.

 

There is a Safety Rating method included in the IWA. As there is not a lot
of consensus about how to have that conversation, the method was accepted as
proposed, because when it comes to funding, the purchaser may set very
different targets for certain fuels (like kerosene or gas), or for stability
or surface temperature, depending on local circumstances. The IWA method is a
point system. Most national standards use a pass-fail (checkbox) system. It
is likely an ISO standard would be a combination of both approaches. At
least there is something.

 

Although this message is lengthy, it is only an early comment on the
implications of the IWA. You will no doubt hear alternative interpretations
of the Agreement which I look forward to analysing as well.

 

For funders, the core framework for testing has been achieved - creating a
valid due diligence pathway - and in only two days. I congratulate everyone
involved. Round of applause, please.

 

Best regards

Crispin in the Hague
