hypothesis testing – the _

[2922] Malaysia’s Covid-19 type II error crisis

When faced with the unknown, we form assumptions based on what we know from previous experience. In science, the fancier name for those assumptions is hypotheses. And hypotheses are meant to be tested. Those who have studied enough statistics will quickly recognize this as hypothesis testing, and at the most basic level, this is the philosophical foundation of whatever Covid-19 testing exists out there.

The logical set-up is simple. There is a null hypothesis that a test seeks to reject. A failure to reject based on some benchmark means the hypothesis may have some truth to it, while a rejection means the alternative hypothesis is likely true. In the case of a Covid-19 test, the null hypothesis would be “the person is healthy” and the alternative hypothesis would be “the person is unhealthy.”

Notice the use of ‘may’ and ‘likely.’ They express possibility. They reflect an element behind any statistical testing method: confidence. Confidence is an important factor because all testing is prone to error. We try to reduce it, but there is a minimum error level we have to tolerate. The errors come in two forms: it is possible to test a healthy person as unhealthy, and as we have witnessed in the past several weeks in Malaysia, it is also possible to test an unhealthy person as healthy.

When we test a healthy person as unhealthy, that is known as a Type I error. Here, we reject the null hypothesis when we should not. It is a false positive. As far as Covid-19 is concerned, this is an inconvenience to the person falsely tested. There will be a cost involved, but the person will very likely be fine.

When we test an unhealthy person as healthy, that is a Type II error. Here, we fail to reject the null hypothesis when we should. It is a false negative. In our Covid-19 context, this has life-threatening consequences.

Between the two errors, a false negative is clearly the worse mistake to commit.
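To make the two errors concrete, here is a minimal sketch that counts how many of each we would expect from a screening exercise. The population size, prevalence, sensitivity and specificity below are made-up numbers for illustration only, not figures for any actual Covid-19 test, and `expected_errors` is a hypothetical helper name.

```python
def expected_errors(population, prevalence, sensitivity, specificity):
    """Return the expected (false_positives, false_negatives) counts.

    sensitivity: share of truly unhealthy people the test flags as unhealthy.
    specificity: share of truly healthy people the test clears as healthy.
    """
    unhealthy = population * prevalence
    healthy = population - unhealthy
    false_positives = healthy * (1 - specificity)    # Type I errors
    false_negatives = unhealthy * (1 - sensitivity)  # Type II errors
    return false_positives, false_negatives


# Illustrative assumptions only: 10,000 people tested, 2% truly infected,
# a test that catches 80% of the infected and clears 99% of the healthy.
fp, fn = expected_errors(10_000, prevalence=0.02, sensitivity=0.80, specificity=0.99)
print(f"Expected false positives (Type I): {fp:.0f}")
print(f"Expected false negatives (Type II): {fn:.0f}")
```

Under these assumed numbers, the false negatives are the people who walk out believing they are healthy, which is exactly the group a quarantine is meant to catch.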

This is why adhering to a strict, full-time quarantine is important. Based on what we know from public health professionals, 14 days is a reasonable period for a quarantine. The Centers for Disease Control and Prevention in the United States, for instance, stated that symptoms would manifest themselves within 2 to 14 days. If we are truly sick, regardless of test results, there is a very high likelihood the truth will be discovered within that period.

In Malaysia, we have ignored the risk of Type II error so that the ruling class could get their convenience. After violating safety and health procedures regarding social interactions during a time of pandemic, too many Malaysians—the politician class generally, the ruling class particularly—were just too happy to rely on testing to determine whether we are free of Covid-19, without understanding the underlying risk.

Worse, the authorities were just too happy to short-circuit the process, as if there were no error in testing. Whether the local health authorities were strong-armed into it, we do not know. What we know is that the quarantine period for those coming from high-risk areas in Sabah was 3 days, not 14 days. Unlike the 14-day period, there is no scientific explanation for why a 3-day period would be appropriate. In fact, a 3-day quarantine period is inconsistent with what the health authorities have told us about the nature of Covid-19.

Because of the completely ignorant trust in testing methods and the failure to understand the risk of Type II error by a group of people, ministers no less, we Malaysians now have to suffer a pandemic wave bigger than the one we had earlier.

We have all sacrificed to fight Covid-19. We went through a severe lockdown. We worked from home. We stopped going out. We wore masks however uncomfortable the experience was. We were successful in flattening the curve, until the selfish men and women undid our success.

These ignorant, arrogant men and women have triggered a type II error crisis in Malaysia. They all should resign to atone for their sins.

[2619] Why are critical values always at 1%, 5% and 10%?

I was running some regressions at work just now and I realized my overdependence on computers had made me forget how to calculate certain statistics manually. Modern regression software automatically calculates various statistics in less than a second and I hardly think of what happens in that virtual black box.

But just now, I was following up on a technical economic debate which revolved around some statistics, where the report gave its t-stats but not the corresponding probabilities (p-values). I was curious about the probabilities and so, I had to translate the t-stats into probabilities manually by reading the t-distribution table. I struggled at first. I found myself embarrassed at my inability to read the table after 6 years' worth of education in economics, and another 3 or 4 years in econometrics. But I managed. I guess it is like riding a bicycle. Once you have learned it, you know it. It may take some stimulus to remember if you have not been riding, but you can still do it.
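For what it is worth, the table lookup can be sketched in a few lines using only the Python standard library. This is my own illustration, not what any regression package does internally: it integrates the Student's t density numerically with Simpson's rule, and the function names and the default number of steps are assumptions made for the example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Work in logs so that large df does not overflow the gamma function.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=10_000):
    """Two-sided p-value for a t-stat, via Simpson's rule on [0, |t|].

    steps must be even for Simpson's rule.
    """
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # odd interior points get 4, even get 2
        total += weight * t_pdf(i * h, df)
    central_mass = 2 * (h / 3) * total  # P(-|t| < T < |t|), by symmetry
    return max(0.0, 1.0 - central_mass)


# A t-stat of 2.228 with 10 degrees of freedom sits right at the familiar
# 5% two-sided row of a printed t table.
print(round(two_sided_p_value(2.228, 10), 3))
```

With a function like this, there is no need to settle for the nearest column of the table; any t-stat and any degrees of freedom give a probability directly.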

One thought came to my mind after I was done with that.

I know there is a criticism about whether the critical values (the 10%, the 5%, etc., which are strictly speaking significance levels) mean anything. Indeed, the critical values are rules arbitrarily made up out of convenience. It is highly possible that even if the calculated value breaks a particular critical value, a hypothesis can still be true despite rejection. It is all a matter of probability, and probability does not work as discretely as the typical critical value rejection rule suggests. If there is a 99% probability that a hypothesis is untrue, that 1% can still pan out to be true, however unlikely. (Let us not get into the Type I and Type II error debate.)

Too many people like yes-or-no answers. The rejection rule gives them that, rightly or wrongly.

But I am thinking, why, throughout the economics and econometrics world, are the critical values always the same numbers? It is either 1%, 5% or 10% (I have seen 25% but… ahem). Why not 4.7%, or 7.1%?

I think I found an answer to that after looking at the t-stats table for the first time in at least 2 years.

Powerful and cheap computers became available only in the last decade of the 20th century. Because of this, many students in the olden days relied on tables for their rejection rules. Tables being tables on pieces of paper, space was at a premium. So, publishers of tables could only print sexy numbers, and obviously not too many numbers over the natural number space, never mind real numbers. Either you used the tables, or you calculated the critical values yourself, which was a pain.
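These days the pain is gone. As a rough sketch, the snippet below computes two-sided critical values for any level, including the unsexy 4.7% and 7.1%. Note the assumption: it uses the standard normal distribution from Python's `statistics` module as a large-sample stand-in for the t table, so the numbers will be slightly smaller than the t-table values for small samples. The helper name `two_sided_critical_value` is made up for this example.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha):
    """Critical value z such that P(|Z| > z) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# The three conventional levels, plus two that no printed table bothers with.
for alpha in (0.01, 0.047, 0.05, 0.071, 0.10):
    print(f"{alpha:.1%}: reject if |stat| > {two_sided_critical_value(alpha):.3f}")
```

Nothing in the arithmetic privileges 5% over 4.7%; the only difference is that one of them used to fit on a printed page.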

So, that convention stuck after a while. From early econometricians to students of econometrics, the same tables got used over and over again. It became a tradition.

Maybe?