A brief introduction to the key concepts in Hypothesis Testing

Nurul Huda
9 min read · Mar 22, 2021

You may have heard product companies claim that their product is 95% effective in controlling a particular disease or some other unwanted phenomenon. For example, a company claims that its product X kills 99.9% of germs. So how can they say so? There has to be a testing technique to support such a claim, right?

This is where hypothesis testing comes in: it is used to evaluate a claim or an assumption against data. Testing hypotheses is one of the main purposes of statistics. Let me introduce the key concepts of hypothesis testing and explain why it is required.

A hypothesis is an educated guess about something in the world around you. It should be testable, either by experiment or observation. For example:

1. A new medicine you think might work.

2. A way of teaching you think might be better.

3. A possible location of new species.

4. A fairer way to administer standardized tests.

If you are going to propose a hypothesis, it’s customary to write a statement.

A good hypothesis statement should:

  • Include an “if” and “then” statement.
  • Include both the independent and dependent variables.
  • Be testable by experiment, survey or other scientifically sound technique.
  • Be based on information in prior research.
  • Have design criteria.

A statistical hypothesis is a belief about a population parameter. This belief may or may not be right. Hypothesis testing is the formal technique used by scientists to support or reject statistical hypotheses. The ideal way to decide whether a statistical hypothesis is correct would be to examine the whole population.

Since that is frequently impractical, we normally take a random sample from the population and examine it instead. If the sample data are not consistent with the statistical hypothesis, the hypothesis is rejected.

Hypothesis testing was introduced by Ronald Fisher, Jerzy Neyman, Karl Pearson and Pearson’s son, Egon Pearson.

Using hypothesis testing, we try to draw conclusions about the population using sample data. A hypothesis test evaluates two mutually exclusive statements about a population to determine which statement is best supported by the sample data. Whenever we want to make claims about the distribution of data, or about whether one set of results is different from another set of results in applied machine learning, we must rely on statistical hypothesis tests.

Why do we need Hypothesis Testing?

Suppose a company wants to launch a new bicycle in the market. In this situation, it can use hypothesis testing to assess whether the new product is likely to succeed.

Here, the possibility that the product is unsuccessful in the market is taken as the null hypothesis, and the possibility that the product is profitable is taken as the alternative hypothesis. By following the hypothesis testing process, the company can judge whether the evidence supports the product’s success.

Types of hypothesis

There are two types of hypothesis: the null hypothesis (Ho) and the alternative hypothesis (Ha). The two must be mutually exclusive.

  • Null hypothesis is usually the hypothesis that the event won’t happen. It is the hypothesis to be tested for possible rejection under the assumption that it is true. The concept of the null is similar to “innocent until proven guilty”: we assume innocence until we have enough evidence to show that a suspect is guilty. We can think of the null hypothesis as an already accepted statement, for example, “the sky is blue”.
  • Alternative hypothesis is the hypothesis that the event will happen. It complements the null hypothesis: it is the opposite of the null hypothesis, such that the alternative and null hypotheses together cover all the possible values of the population parameter.

Let’s take an example to understand the different types of hypothesis. A soap company claims that its product kills, on average, 99% of germs. To test this claim, we formulate the null and alternative hypotheses.

  • Null Hypothesis (Ho): Average = 99%.
  • Alternative Hypothesis (Ha): Average is not equal to 99%.

When we test a hypothesis, we assume the null hypothesis to be true until there is sufficient evidence in the sample to reject it. In that case, we reject the null hypothesis and support the alternative hypothesis. If the sample fails to provide sufficient evidence for us to reject the null hypothesis, we still cannot say that the null hypothesis is true, because the conclusion is based on sample data only. To say that the null hypothesis is true, we would have to study the whole population.

Terms associated with Hypothesis Testing:

  1. One-Tailed Test: It is a statistical hypothesis test in which the critical area of a distribution is one-sided so that it is either greater than or less than a certain value, but not both. If the sample being tested falls into the one-sided critical area, the alternative hypothesis will be accepted instead of the null hypothesis. A one-tailed test is also known as a directional hypothesis or directional test.
  2. Two-Tailed Test: A two-tailed test is a method in which the critical area of a distribution is two-sided and tests whether a sample is greater than or less than a certain range of values. If the sample being tested falls into either of the critical areas, the alternative hypothesis is accepted instead of the null hypothesis. Usually, two-tailed tests are used to determine significance at the 5% level, meaning each side of the distribution is cut at 2.5%.

3. Critical Region: The critical region is the region of values that corresponds to the rejection of the null hypothesis at some chosen probability level. In other words, if the calculated value of the test statistic falls in this region of the sample space, we reject the null hypothesis.

4. Test Statistic: The test statistic measures how close the sample has come to the null hypothesis. Its observed value changes randomly from one random sample to a different sample. A test statistic contains information about the data that is relevant for deciding whether to reject the null hypothesis or not. Different hypothesis tests use different test statistics based on the probability model assumed in the null hypothesis. Below are some of the most commonly used tests and their test statistics.

5. Type-I Error and Type-II Error: There are two types of errors in hypothesis testing, both of which relate to incorrect conclusions about the null hypothesis.

  1. Type-I Error: It occurs when the sample results lead to the rejection of the null hypothesis when it is in fact true. Type-I errors are equivalent to false positives. They can be controlled: the value of alpha, the level of significance that we select, is the probability of a Type-I error.
  2. Type-II Error: It occurs when, based on the sample results, the null hypothesis is not rejected when it is in fact false. Type-II errors are equivalent to false negatives.

The following chart simplifies these two types of errors.

6. Level of significance: It refers to the threshold at which we reject or fail to reject the null hypothesis. 100% certainty is not possible when accepting or rejecting a hypothesis, so we select a level of significance, usually 5%. It is the probability of making a Type-I error and is denoted by alpha (α). Alpha is the maximum probability of a Type-I error that we are willing to tolerate.

7. P-Value: Given that the null hypothesis is true, the p-value is the probability of getting a result as extreme as or more extreme than the sample result by random chance alone. The p-value is used throughout statistics, from t-tests to simple regression analysis, and it also appears in many machine learning workflows.

Important points for p-value:

  • It measures how compatible your data are with the null hypothesis.
  • The p-value is the smallest level of significance at which a null hypothesis can be rejected.
  • If the p-value is greater than alpha (α), we do not reject the null hypothesis; if it is less than or equal to α, we reject it.
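To make the link between tails, critical values and p-values concrete, here is a minimal sketch that assumes the test statistic is a z-score following a standard normal distribution under the null hypothesis. The function names and the observed value of 2.1 are hypothetical and chosen only for illustration; the code uses Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

# Standard normal distribution, assumed here as the null distribution of z.
STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def p_value_from_z(z, tail="two-sided"):
    """Return the p-value for an observed z statistic.

    tail: "greater"   (Ha: parameter > null value),
          "less"      (Ha: parameter < null value),
          "two-sided" (Ha: parameter != null value).
    """
    if tail == "greater":
        return 1.0 - STD_NORMAL.cdf(z)
    if tail == "less":
        return STD_NORMAL.cdf(z)
    if tail == "two-sided":
        return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
    raise ValueError(f"unknown tail: {tail!r}")


def critical_value(alpha, tail="two-sided"):
    """Return the positive z cut-off bounding the critical region.

    For a two-sided test alpha is split across both tails; for a
    left-tailed test the critical region lies below the negated cut-off.
    """
    if tail == "two-sided":
        return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return STD_NORMAL.inv_cdf(1.0 - alpha)


alpha = 0.05
z_observed = 2.1  # hypothetical test statistic, for illustration only

for tail in ("greater", "two-sided"):
    p = p_value_from_z(z_observed, tail)
    decision = "reject Ho" if p <= alpha else "do not reject Ho"
    print(f"{tail:>9}: critical z = {critical_value(alpha, tail):.3f}, "
          f"p = {p:.4f} -> {decision}")
```

For the same observed statistic, the two-sided p-value is twice the one-sided one, which is why a two-tailed test needs a larger cut-off at the same α.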

Steps involved in Hypothesis testing:

  1. Formulate the hypothesis: define Ho and Ha.
  2. Select an appropriate test: to test the null hypothesis, it is necessary to select an appropriate statistical technique.
  3. Choose the level of significance, α: α is often set at 0.05; sometimes it is 0.01; other values of α are rare.
  4. Collect data and calculate the test statistic: sample size is determined after taking into account the desired α and other considerations, such as the budget available for collecting the sample data.
  5. Determine the p-value (or critical value): the p-value is a probability and is directly comparable to α.
  6. Compare the p-value with α.
  7. Draw a conclusion from the comparison in step 6: decide whether or not to reject Ho.

The above steps are illustrated in the below image.
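The same steps can also be sketched in code. The example below is only an illustration: it assumes a two-sided one-sample z-test with a known population standard deviation, and the numbers for the soap example (50 lab runs, a sample mean of 98.7% and a standard deviation of 0.8 percentage points) are made up for the purpose of the sketch.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Walk through the steps for a two-sided one-sample z-test.

    Step 1: Ho: mu = mu0, Ha: mu != mu0.
    Step 2: z-test (assumes a known population standard deviation sigma).
    Step 3: alpha is the chosen level of significance.
    """
    std_normal = NormalDist()
    # Step 4: compute the test statistic from the sample summary.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 5: p-value = probability of a result at least this extreme under Ho.
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    # Steps 6 and 7: compare the p-value with alpha and conclude.
    reject = p_value <= alpha
    return z, p_value, reject


# Hypothetical data for the soap claim (Ho: average kill rate = 99%).
z, p_value, reject = two_sided_z_test(sample_mean=98.7, mu0=99.0, sigma=0.8, n=50)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject Ho" if reject else "Do not reject Ho")
```

In practice the population standard deviation is rarely known, in which case a t-test would be the more usual choice; the z-test is used here only to keep the sketch short.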

Let’s use hypothesis testing to draw a conclusion about a claim. Suppose a principal at a certain school claims that the students in his school are of above-average intelligence. The IQ scores of a random sample of thirty students have a mean of 112.5. Is there sufficient evidence to support the principal’s claim? The mean population IQ is 100 with a standard deviation of 15.

Let’s find the null hypothesis. The accepted fact is that the population mean is 100, so: H0: μ = 100. Then, state the alternative hypothesis. The claim is that the students have above-average IQ scores, so: Ha: μ > 100. The fact that we are looking for scores “greater than” a certain point means that this is a one-tailed test.

Let’s visualize the above problem statement.

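Let’s take an alpha level of 5% (0.05). For a one-tailed test, an area of 0.05 in the upper tail corresponds to a critical z-score of 1.645. Let’s find the test statistic using z = (x̄ − μ) / (σ/√n), where x̄ is the sample mean, μ is the population mean under Ho, σ is the population standard deviation and n is the sample size.

Here, z = (112.5 − 100) / (15/√30) ≈ 4.56. As z is greater than 1.645, we can reject Ho. Hence we can conclude that the sample supports the principal’s claim that the students in his school are of above-average intelligence.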
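As a check on the arithmetic, the calculation can be reproduced in a few lines using the numbers stated in the problem (sample mean 112.5, population mean 100, standard deviation 15, n = 30). This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the principal's-claim example.
sample_mean = 112.5   # mean IQ of the sampled students
mu0 = 100.0           # population mean under Ho
sigma = 15.0          # population standard deviation
n = 30                # sample size
alpha = 0.05          # level of significance

std_normal = NormalDist()

# Test statistic for a one-sample z-test.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Right-tailed critical value and p-value, since Ha: mu > 100.
z_critical = std_normal.inv_cdf(1.0 - alpha)
p_value = 1.0 - std_normal.cdf(z)

print(f"z = {z:.2f}, critical value = {z_critical:.3f}, p-value = {p_value:.2e}")
print("Reject Ho" if z > z_critical else "Do not reject Ho")
```

Comparing z with the critical value and comparing the p-value with α are two views of the same decision, so both lead to the same conclusion.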

Power of a Hypothesis Test: The power of a hypothesis test is a measure of how effective the test is at identifying (say) a difference in populations if such a difference exists. It is the probability of rejecting the null hypothesis when it is false. In other words, the power of a hypothesis test is the probability of not committing a Type II error, i.e. 1 − β, where β is the probability of a Type II error.

The power of a hypothesis test is affected by three factors.

  • Sample size (n): Other things being equal, the greater the sample size, the greater the power of the test.
  • Significance level (α): The lower the significance level, the lower the power of the test. If you reduce the significance level (e.g., from 0.05 to 0.01), the region of acceptance gets bigger. As a result, you are less likely to reject the null hypothesis. This means you are less likely to reject the null hypothesis when it is false, so you are more likely to make a Type II error. In short, the power of the test is reduced when you reduce the significance level, and vice versa.
  • The “true” value of the parameter being tested. The greater the difference between the “true” value of a parameter and the value specified in the null hypothesis, the greater the power of the test. That is, the greater the effect size, the greater the power of the test.
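The effect of these three factors can be checked numerically. The sketch below assumes a right-tailed one-sample z-test with a known standard deviation, for which the power is 1 − Φ(z_α − (μ_true − μ0)/(σ/√n)), where Φ is the standard normal CDF. The function name and the "true" means of 105 and 110 are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a right-tailed one-sample z-test (Ho: mu = mu0, Ha: mu > mu0).

    Power = P(reject Ho | true mean is mu_true) = 1 - beta.
    """
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1.0 - alpha)
    # Under the true mean, the z statistic is normal with this shifted mean
    # and unit variance.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    return 1.0 - std_normal.cdf(z_critical - shift)


# Baseline: hypothetical true mean of 105 against Ho: mu = 100, sigma = 15.
base = power_one_sided_z(100, 105, 15, n=30, alpha=0.05)
bigger_n = power_one_sided_z(100, 105, 15, n=60, alpha=0.05)
smaller_alpha = power_one_sided_z(100, 105, 15, n=30, alpha=0.01)
bigger_effect = power_one_sided_z(100, 110, 15, n=30, alpha=0.05)

print(f"baseline (n=30, alpha=0.05, true mean 105): {base:.3f}")
print(f"larger sample (n=60):                       {bigger_n:.3f}")
print(f"stricter alpha (0.01):                      {smaller_alpha:.3f}")
print(f"larger effect (true mean 110):              {bigger_effect:.3f}")
```

Relative to the baseline, the larger sample and the larger effect raise the power, while the stricter significance level lowers it, in line with the three points above.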

Relationship between Hypothesis Tests and Confidence Intervals:

Hypothesis tests and confidence intervals are cut from the same cloth. For a two-sided test, a result whose 95% confidence interval excludes the hypothesized value is a result for which p < 0.05 under the corresponding hypothesis test, and the other way around. The p-value tells you the highest confidence level at which the interval still excludes the hypothesized value. For example, if p < 0.03 against the null hypothesis, then a 97% confidence interval excludes the null value.

To conclude this discussion, hypothesis testing helps us draw coherent conclusions, examine relationships among variables, and decide where to investigate further. A hypothesis generally arises from speculation about observed behavior, a natural phenomenon, or an established theory. A good hypothesis ought to be clear, specific, and consistent with the data. Once a hypothesis has been formulated, the next stage is to validate or test it. Hypothesis testing is the process that allows us to accept or reject the stated hypothesis.
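This correspondence can be seen in a small sketch. It assumes a two-sided one-sample z-test with a known standard deviation and the matching symmetric z confidence interval; the sample mean of 106 and the other numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_and_p_value(sample_mean, mu0, sigma, n, confidence=0.95):
    """Return a two-sided z confidence interval and the two-sided p-value."""
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    # Half-width of the interval for the chosen confidence level.
    margin = std_normal.inv_cdf(1.0 - (1.0 - confidence) / 2.0) * standard_error
    interval = (sample_mean - margin, sample_mean + margin)
    z = (sample_mean - mu0) / standard_error
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return interval, p_value


# Hypothetical sample: mean 106 from n = 30, testing Ho: mu = 100, sigma = 15.
mu0 = 100.0
for confidence in (0.95, 0.99):
    (low, high), p_value = z_interval_and_p_value(106.0, mu0, 15.0, 30, confidence)
    alpha = 1.0 - confidence
    excluded = not (low <= mu0 <= high)
    print(f"{confidence:.0%} CI = ({low:.2f}, {high:.2f}), p = {p_value:.4f}, "
          f"CI excludes mu0: {excluded}, p < alpha: {p_value < alpha}")
```

At each confidence level, the interval excludes the hypothesized mean exactly when the p-value falls below the corresponding α, so the two columns of the output always agree.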

Nurul Huda

Data Scientist | Business/Data Analyst | Data Engineer