Data analysis: hypothesis testing

7.1 Practical example

To illustrate the concept of proportions, let us revisit our digital marketing dataset, which contains a sample of 80 customers. In addition to the monthly purchases data, we also have a column labelled "big customer". This column identifies customers who have made over 55 purchases per month. In this sample, 32 customers are identified as big customers.

Using this dataset, we can demonstrate how to calculate and interpret proportions:

  • Identify the specific group: Our group of interest is “big customers”.
  • Count the number of items in the group: We have 32 customers classified as “big customers”.
  • Calculate the proportion: We divide the number of big customers by the total sample size.
  • Proportion = 32 / 80 = 0.4, or 40%

This proportion tells us that 40% of the customers in our sample are considered "big customers" based on our definition.
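The calculation above can be sketched in a few lines of Python. The variable names are illustrative and not part of the course dataset; the counts are the ones given in the text.

```python
# Counts taken from the worked example in the text.
big_customers = 32   # customers flagged as "big customer"
sample_size = 80     # total customers in the sample

# Proportion = number in the group / total sample size
proportion = big_customers / sample_size

print(f"Proportion of big customers: {proportion:.2f} ({proportion:.0%})")
```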

We can use this proportion for various analytical purposes:

  1. Describing the customer base: We can state that two-fifths of our sampled customers are high-volume purchasers, making over 55 purchases per month.
  2. Segmentation analysis: This proportion could inform marketing strategies, helping to tailor approaches for different customer segments. With 40% being big customers, we might consider developing specific retention strategies for this significant group.
  3. Trend analysis: By calculating this proportion over time, we can track changes in the composition of our customer base. An increase in this proportion might indicate successful upselling strategies, while a decrease could signal a need for customer retention efforts.
  4. Hypothesis testing: We might want to test whether the true population proportion of big customers differs from a hypothesised value. In this case, we would use a z-test for proportions.

Here, we are particularly interested in hypothesis testing. For example, a marketing executive reviews this data and formulates a claim that the percentage of big customers in the entire population is more than 30%. We want to test this claim using a hypothesis test.

Question: At a 95% confidence level (significance level α = 0.05), is there enough evidence to support the marketing executive’s claim that the population proportion of big customers is more than 30%?

We will use the following formula to test proportions:

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}$$

where:

p̂ = sample proportion

p₀ = population proportion under the null hypothesis

n = sample size

Step 1: Formulate the hypotheses

  • H₀: p ≤ 0.30 (population proportion is 30% or less)
  • H₁: p > 0.30 (population proportion is more than 30%)

Step 2: Determine the z-statistic

Given information:

  • n = 80 customers
  • p̂ = 0.4
  • p₀ = 0.3

We use the z-test for proportions. The formula is:

$$z = \frac{0.4 - 0.3}{\sqrt{\dfrac{0.3 \times (1 - 0.3)}{80}}} = 1.952$$
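The substitution in Step 2 can be checked numerically. The following is a minimal Python sketch; the function name `z_statistic` is an illustrative choice, not something defined in the course.

```python
import math


def z_statistic(p_hat, p0, n):
    """One-sample z-statistic for a proportion.

    p_hat: sample proportion
    p0:    population proportion under the null hypothesis
    n:     sample size
    """
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# Values from the "Given information" list in Step 2.
z = z_statistic(p_hat=0.4, p0=0.3, n=80)
print(round(z, 3))
```

Note that the standard error uses p₀ rather than p̂, because the test statistic is computed under the assumption that the null hypothesis is true.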

Step 3: Determine the critical value

For an upper-tailed test, at α = 5%, we use the Excel function: =NORM.S.INV(0.95)

  • The z critical value is 1.645.
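For readers working outside Excel, the same critical value can be obtained in Python. This sketch assumes Python 3.8 or later and uses `NormalDist().inv_cdf` from the standard library as a counterpart to NORM.S.INV.

```python
from statistics import NormalDist

alpha = 0.05  # significance level for a 95% confidence level

# Upper-tailed test: all of alpha sits in the right tail,
# so we need the 95th percentile of the standard normal distribution.
z_critical = NormalDist().inv_cdf(1 - alpha)

print(round(z_critical, 3))
```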

Step 4: Compare the z-statistic to the critical value and make a decision.

Our calculated z-statistic (1.952) is greater than the critical value (1.645).

Since the calculated z-statistic exceeds the critical value, we reject the null hypothesis.

Conclusion:

At a 95% confidence level, there is enough evidence to support the marketing executive’s claim that the population proportion of big customers is more than 30%. The sample data provides sufficient evidence to conclude that the true population proportion is significantly higher than 30%.
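Steps 2 to 4 can be put together in one short script. This is a sketch under the values stated in the example, with illustrative variable names; it applies the upper-tailed decision rule used above (reject H₀ when the z-statistic exceeds the critical value).

```python
import math
from statistics import NormalDist

# Values from the worked example.
n = 80        # sample size
p_hat = 0.4   # sample proportion (32 / 80)
p0 = 0.3      # hypothesised population proportion
alpha = 0.05  # significance level

# Step 2: z-statistic
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Step 3: critical value for an upper-tailed test
z_critical = NormalDist().inv_cdf(1 - alpha)

# Step 4: reject H0 only if z lies beyond the critical value
reject_null = z > z_critical

print(f"z = {z:.3f}, critical value = {z_critical:.3f}, reject H0: {reject_null}")
```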

Activity 6: Testing Proportion Hypothesis

Timing: Allow around 40 minutes for this activity.

A cinema chain has records indicating that 70% of customers are willing to spend £15 on a 3D movie ticket. A ticket agent believes that this percentage has changed. To test this belief, the agent surveys 200 individuals and finds that 128 of them are willing to spend £15 on a 3D movie ticket. At the 95% confidence level, is there enough evidence to support the agent’s belief that the proportion has changed?

Answer

Given information:

  • Sample size (n) = 200 individuals
  • Number willing to spend £15 = 128
  • Sample proportion (p̂) = 128/200 = 0.64 or 64%
  • Claimed population proportion (p₀) = 0.70 or 70%
  • Confidence level = 95% (significance level α = 0.05)

Step 1: Formulate the hypotheses

  • H₀: p = 0.70 (population proportion is 70%)
  • H₁: p ≠ 0.70 (population proportion is not 70%)

Step 2: Determine the z-statistic

We use the z-test for proportions. The formula is:

$$z = \frac{0.64 - 0.70}{\sqrt{\dfrac{0.70 \times (1 - 0.70)}{200}}} = -1.852$$

Step 3: Determine the critical value

For a two-tailed test, at α = 5%, use the Excel function:

  • Upper tail = NORM.S.INV(0.975)
  • Lower tail = NORM.S.INV(0.025)

The z critical value is ±1.96.

Step 4: Compare the z-statistic to the critical value and make a decision

Our calculated z-statistic (-1.852) is greater than -1.96 and less than 1.96.

Since the calculated z-statistic falls between the two critical values (±1.96), we fail to reject the null hypothesis.

Conclusion:

At a 95% confidence level, there is not enough evidence to support the ticket agent’s belief that the proportion of customers willing to spend £15 on a 3D movie ticket has changed from 70%. The sample data does not provide sufficient evidence to conclude that the true population proportion is significantly different from 70%.
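The activity's answer can be reproduced with a small two-tailed version of the test. This is a sketch only; the function `two_tailed_proportion_test` and its parameter names are hypothetical and chosen for illustration.

```python
import math
from statistics import NormalDist


def two_tailed_proportion_test(successes, n, p0, alpha=0.05):
    """Two-tailed one-sample z-test for a proportion.

    Returns the sample proportion, the z-statistic, the positive
    critical value, and whether H0 (p = p0) is rejected.
    """
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    # Two-tailed test: alpha is split equally between both tails.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    reject_null = abs(z) > z_critical
    return p_hat, z, z_critical, reject_null


# Cinema survey: 128 of 200 respondents, hypothesised proportion 70%.
p_hat, z, z_critical, reject_null = two_tailed_proportion_test(128, 200, 0.70)

print(f"p_hat = {p_hat:.2f}, z = {z:.3f}, critical values = ±{z_critical:.2f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Comparing the absolute value of z with a single positive critical value is equivalent to checking whether z lies outside the interval from -1.96 to 1.96.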