Abstract

The continued interest of researchers in understanding the home advantage in sports has led to many different attempts to determine the meaningfulness of the effect. Specifically, it has been argued that the typically large sample sizes examined in this field jeopardize the meaningfulness of common statistical tests that are based on proportions. To combat this, scholars have recommended focusing on standardized effect size measures instead. While such unit-less measures allow for comparisons between studies, they do not reflect the actual units of measurement that players and coaches work with. Using the critical values of the associated equations alongside simulated data, this paper shows that hypothesis tests indeed yield significant results when the home teams win only a few additional points. At the same time, effect size measures require the home teams to win a large number of additional points, which rarely occurs in any sport, to yield at least a small effect. Therefore, it is recommended to report results in practically relevant measurement units, such as the average number of points won by the home teams, alongside the common standardized test statistics to make the effect more tangible.

1 Introduction

The home advantage in sports has long been established and has generated a large body of literature across different types of sports, genders, geographic locations, and levels of competition (Jamieson, 2010; Pollard, 1986; Pollard et al., 2017). When researchers and practitioners speak of a home advantage, they refer to the tendency of teams to win over 50% of their games at home under a balanced schedule (Courneya & Carron, 1992). For sports like soccer, which follow a point system determined by wins, draws, and losses, the 50% threshold typically refers to the points accumulated from games played at home. Despite these clear criteria, research often insists on using statistical inference to determine the significance of the home advantage. These approaches are, however, accompanied by some issues. Specifically, the tendency for studies to analyze large numbers of games played across several seasons leads to overpowered hypothesis tests (Dufner et al., 2023; Sors et al., 2022). Consequently, significance tests have been combined with effect size measures (specifically, at least a small effect) as a threshold to provide a meaningful result (e.g., Dufner et al., 2023). While this suggestion helps to minimize the problems arising from large sample sizes, it lacks reference to practically important parameters that relate to the world of sports, such as the number of points won. In other words, using a small effect size as an additional threshold may be seen as another arbitrary choice that is not motivated by real-world parameters. Therefore, the aim of this paper is to provide simple calculation examples alongside simulated data to demonstrate how changes in practically relevant parameters affect common statistical measures of the home advantage.
To do so, this paper will first explain the most common method to assess the home advantage as introduced by Pollard (1986) before demonstrating how changes in the number of points won by the home teams during a soccer season would translate to significant tests and effect size measures.

2 Assessing the Home Advantage

Although multiple new methods like Bayesian statistics (Higgs & Stavness, 2021) or Monte Carlo techniques (Hill & Yperen, 2021) have gained popularity in analyzing the home advantage in recent years, the traditional method introduced by Pollard (1986) is still commonly applied. In line with the 50% threshold that is emphasized in the definition of the home advantage (Courneya & Carron, 1992), Pollard’s (1986) method proposes to convert game outcomes to win percentages or points for sports like soccer where games may result in a draw. In the example of points, the next step would be to calculate the relative number of points won by the home teams given the total number of points won by both home and away teams combined:

\[ RelativeHA = \frac{Points_{Home}}{Points_{Total}} = \frac{(Wins_{Home} \cdot 3 + Draws)}{(Wins_{Total} \cdot 3 + Draws \cdot 2)} \tag{1}\]

where RelativeHA represents the proportion of points won by the home team, PointsHome the points won by the home team, and PointsTotal the number of points won by both home and away teams combined. Under a 3-point system as used in soccer, the last step of the formula describes the specific calculation to convert the game outcomes to points. Thus, this approach initially calculates the relative number of points won by the home team compared to the away team.
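As a minimal sketch of Eq. 1, the conversion from game outcomes to the proportion of points won at home could be written as follows. The function name `relative_home_advantage` and its parameter names are illustrative assumptions and are not taken from Pollard (1986) or from the scripts accompanying this paper; the outcome counts are those of the simulated season reported in Section 3.

```python
def relative_home_advantage(home_wins, away_wins, draws,
                            win_points=3, draw_points=1):
    """Proportion of all points that were won by the home teams (Eq. 1)."""
    # Home teams collect the win points for each home win plus one draw share.
    points_home = home_wins * win_points + draws * draw_points
    # Every decided game hands out win_points; every draw hands out 2 * draw_points.
    points_total = (home_wins + away_wins) * win_points + draws * 2 * draw_points
    return points_home / points_total


# Simulated season from Section 3: 146 home wins, 144 away wins, 90 draws.
relative_ha = relative_home_advantage(146, 144, 90)
print(f"Relative HA: {100 * relative_ha:.2f}%")
```

Keeping the point values as parameters makes it possible to apply the same function to leagues that use a different point system.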

In the next step, a one-sample proportion test is applied to assess whether the observed proportion is statistically larger than the 50% threshold:

\[ z = \frac{p - p_0}{\sqrt{\frac{p_0 (1-p_0)}{n}}} \tag{2}\]

where p equals the observed proportion, p0 represents the expected proportion in the absence of a home advantage (i.e., 50%), and n represents the total number of points for home and away teams combined. Using a z-distribution table, the z-score is converted to the corresponding p-value. Because the home advantage literature typically expects that the observed proportion is larger than 50%, a right-tailed one-sided approach should be adopted. A central issue with these types of significance tests is that studies on the home advantage typically include large samples of games, often spanning the course of multiple seasons and different competitions (Sors et al., 2022). For example, a soccer season with 20 teams in one league, which is commonly found in Europe’s and Japan’s top competitions, will have 38 playing rounds with 10 games each. Thus, the 380 games yield at least 760 points (if all games resulted in a draw). Such large sample sizes render even small deviations from the 50% threshold statistically significant. Therefore, recent papers added the use of effect size measures, which are less vulnerable to sample size, as an additional evaluation criterion to determine whether the observed home advantage can be classified as meaningful (Dufner et al., 2023; Sors et al., 2022). In other words, researchers are striving to find meaningful indicators that signify whether the observed effects are relevant enough for teams and coaches to pay attention to.
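The test in Eq. 2 can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the author's implementation: the function name `proportion_z_test` is hypothetical, and the right-tailed p-value is obtained from the standard normal survival function via `math.erfc` instead of a z-table. The input values (528 of 1,050 points) are those of the simulated season in Section 3.

```python
import math


def proportion_z_test(p_obs, n, p0=0.5):
    """One-sample proportion z-test (Eq. 2) with a right-tailed p-value."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_obs - p0) / standard_error
    # P(Z >= z) for a standard normal variable, i.e. the right-tailed p-value.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


z, p_value = proportion_z_test(528 / 1050, 1050)
print(f"z = {z:.3f}, one-sided p = {p_value:.3f}")
```

Because n appears under the square root in the denominator, the same observed proportion produces a larger z-score, and thus a smaller p-value, as more points enter the analysis.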

Following the one-sample proportion test, the effect size is typically given by Cohen’s h (see Footnote 1):

\[ h = 2 \cdot \arcsin{\sqrt{p}} - 2 \cdot \arcsin{\sqrt{p_0}} \tag{3}\]

where 0.2 \(\le\) |h| < 0.5 is interpreted as a small effect, 0.5 \(\le\) |h| < 0.8 as a medium effect, and |h| \(\ge\) 0.8 as a large effect. Note that the formula does not include the sample size as a parameter, which makes it well-suited for research areas that use very large samples like the home advantage (Dufner et al., 2023; Sors et al., 2022). The point estimate should be accompanied by a confidence interval (CI) to provide an indication of the precision and uncertainty of the estimate. For Cohen’s h, the sample size is again used to estimate the CI:

\[ CI = h \pm z \cdot \left( 2 \cdot \sqrt{\frac{1}{4n}} \right) \tag{4}\]

where z is set to 1.96 to estimate a 95% confidence interval.
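Eq. 3 and Eq. 4 translate directly into code. The following sketch uses the hypothetical helper names `cohens_h` and `cohens_h_ci` and, as an example input, the 528 of 1,050 points from the simulated season in Section 3.

```python
import math


def cohens_h(p_obs, p0=0.5):
    """Cohen's h for an observed proportion against a reference proportion (Eq. 3)."""
    return 2 * math.asin(math.sqrt(p_obs)) - 2 * math.asin(math.sqrt(p0))


def cohens_h_ci(h, n, z=1.96):
    """Confidence interval around h as given in Eq. 4 (z = 1.96 for 95%)."""
    half_width = z * 2 * math.sqrt(1 / (4 * n))
    return h - half_width, h + half_width


h = cohens_h(528 / 1050)
ci_lower, ci_upper = cohens_h_ci(h, 1050)
print(f"h = {h:.3f}, 95% CI [{ci_lower:.3f}, {ci_upper:.3f}]")
```

Note that the sample size only enters through the half-width of the interval, so a larger n narrows the CI without moving the point estimate.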

The use of such standardized effect size measures has been criticized in the literature because (i) their practical implication is not well-aligned with the statistical interpretation thresholds (e.g., a small effect size can have large consequences in practice and vice versa), (ii) they are based on specific statistical assumptions that may not hold, and (iii) the interpretation is challenging for individuals who are not specifically trained in these methods (Cuijpers, 2021). Even standardization practices within a specific research field may not be fruitful given the specific contextual factors for each study (Panzarella et al., 2021). Moreover, Cohen’s h shares a critical disadvantage with p-values, namely that it is a unit-less measure (Kallogjeri & Piccirillo, 2023). While this unit-less standardization of statistical effect size measures allows for universal interpretation thresholds and is critical for meta-analyses, it does not directly inform about the units in which the performance of teams is measured, namely points. That is, meaningful differences in points that could cause important changes in the final table standings of a season may be deemed irrelevant by such unit-less effect size measures (cf. Cuijpers, 2021). Instead, it would be more informative to assess real-world differences to guide the interpretation of any standardized and unit-less statistical measure. By comparison, a swimmer who is introduced to a new training routine would be interested in the exact improvement in their time, rather than in whether a unit-less effect size indicated a small, medium, or large improvement in performance according to some statistical standardization. The next section provides an overview of the relation between the p-values of the significance tests, the corresponding effect size given by Cohen’s h, and the associated average change in points per team for a simulated soccer season.
Aligning these different measures may help researchers to find new and more meaningful ways to present their results.

3 Comparing Statistics with Practically Relevant Measures

To demonstrate the relationship between p-values, effect sizes (i.e., Cohen’s h), and practically relevant measures (i.e., points), the formulas of the one-proportion test and Cohen’s h (i.e., Eq. 2 and Eq. 3) can be solved to yield the critical values at which the p-value drops below .05 and Cohen’s h reaches 0.2 (yielding a small effect), both of which are standardized thresholds based on the guidelines for social sciences. Many top soccer leagues worldwide consist of 20 teams that play each other following a balanced schedule. This means that a total of 380 games is played per season. For the sake of simplicity and to determine the mathematical thresholds, we will assume that every game results in a win and that these wins are equally distributed among home and away teams. Given that soccer uses a 3-point system (3 points awarded to the winning team, 1 point for each team when the game results in a draw, and 0 points for the losing side), these hypothetical 380 games yield a total of 1,140 points, with 570 points won by the home teams and 570 by the away teams in this calculation example. Following Eq. 1, the relative home advantage is thus set to 50%:

\[ 50.00\% = \frac{570}{1140} = \frac{190 \cdot 3 + 0}{(190 + 190) \cdot 3 + 0 \cdot 2} \tag{5}\]

To determine at what percentage of points won by the home team (assuming that the total number of points remains consistent at 1,140) the one-proportion test yields a significant result, we can solve Eq. 2 for p. Given that a one-sided approach is used, z will be equal to 1.64. Thus,

\[ \begin{aligned} 1.64 &= \frac{p - 0.5}{\sqrt{\frac{0.5(1 - 0.5)}{1140}}} \\ p - 0.5 &= 1.64 \cdot \sqrt{\frac{0.5(1 - 0.5)}{1140}}\\ p &= 1.64 \cdot \sqrt{\frac{0.5(1 - 0.5)}{1140}} + 0.5\\ p &= 0.5243 \end{aligned} \tag{6}\]

Accordingly, the home teams need to win at least 52.43% of the points to reach significance. Following Eq. 1, the specific number of points has to reach a minimum of 598:

\[ \begin{aligned} 0.5243 &= \frac{\text{Points}_{\text{Home}}}{1140} \\ \text{Points}_{\text{Home}} &= 0.5243 \cdot 1140 \\ \text{Points}_{\text{Home}} &= 597.70 \end{aligned} \tag{7}\]

This improvement of 28 points by all 20 teams combined implies that an average improvement of 1.4 points won at home per team is statistically significant. To calculate at what percentage of points won at home Cohen’s h reaches the threshold for a small effect (i.e., 0.2), Eq. 3 can be solved for p.

\[ \begin{aligned} 0.2 &= 2 \cdot \arcsin{\sqrt{p}} - 2 \cdot \arcsin{\sqrt{0.5}} \\ 2 \cdot \arcsin{\sqrt{p}} &= 0.2 + 2 \cdot \arcsin{\sqrt{0.5}} \\ 2 \cdot \arcsin{\sqrt{p}} &= 1.7708 \\ \arcsin{\sqrt{p}} &= \frac{1.7708}{2} \\ \sqrt{p} &= \sin{\left(\frac{1.7708}{2}\right)} \\ p &= \sin^2{\left(\frac{1.7708}{2}\right)} \\ p &= 0.5993 \end{aligned} \tag{8}\]

This means that the home teams need to win approximately 60% of the points across the season. To convert this percentage back to points, the same steps outlined in Eq. 7 are repeated. This yields a total of 684 points or an average improvement of 5.7 points per team.
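The two derivations in Eq. 6 to Eq. 8 can be checked numerically. The sketch below assumes the same setting as the calculation example (1,140 total points, 20 teams, z = 1.64, h = 0.2); the function names are hypothetical, and rounding up to whole points with `math.ceil` is an assumption that mirrors the minimum point totals reported in the text.

```python
import math


def critical_proportion_z(n, z_crit=1.64, p0=0.5):
    """Smallest proportion at which the one-sided z-test is significant (Eq. 6)."""
    return p0 + z_crit * math.sqrt(p0 * (1 - p0) / n)


def critical_proportion_h(h_crit=0.2, p0=0.5):
    """Proportion at which Cohen's h equals h_crit (Eq. 8)."""
    return math.sin((h_crit + 2 * math.asin(math.sqrt(p0))) / 2) ** 2


TOTAL_POINTS = 1140
N_TEAMS = 20
baseline_points = TOTAL_POINTS / 2  # 570 points at a relative HA of 50%

thresholds = {
    "p < .05": critical_proportion_z(TOTAL_POINTS),
    "h >= 0.2": critical_proportion_h(),
}

required = {}
for label, p_crit in thresholds.items():
    # Points only come in whole numbers, so round up to the next full point.
    points_needed = math.ceil(p_crit * TOTAL_POINTS)
    per_team = (points_needed - baseline_points) / N_TEAMS
    required[label] = (points_needed, per_team)
    print(f"{label}: p = {p_crit:.4f}, points = {points_needed}, "
          f"average gain per team = {per_team:.1f}")
```

Placing both thresholds side by side in this way makes the gap between the two criteria visible in the unit that matters to teams, namely points per team.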

To underscore the results of the purely mathematically determined values and provide the reader with a more realistic example and a dataset that can be used to replicate the findings using different methods, a season that is based on random game outcomes was generated (for the full documentation of the simulation and statistical tests for Matlab, see https://doi.org/10.34894/0NYQG7). The season again reflects a balanced schedule with 20 teams. Thus, 380 games were simulated based on goals scored for home and away teams. The simulation drew the goals from a normal distribution with a mean of 2.5 and a standard deviation of 1 and rounded the values to whole numbers. Note that this distribution could yield values smaller than 0 or larger than 5. To keep the values within realistic limits, every negative value was converted to 0 and the maximum possible value was set to 5. The upper limit was chosen to yield an equal distribution and does not affect the overall calculation example. Because the same distribution was used to create the number of goals for both home and away teams, the relative home advantage should approximate 50% and therefore not systematically favor any side.
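The simulation procedure described above can be approximated as follows. This is a Python sketch written for illustration only; it is not the Matlab code archived with the paper. The seed, the function names, and the use of `random.gauss` are assumptions, so the resulting counts will differ from the season reported in this article.

```python
import itertools
import random


def simulate_goals(rng, mean=2.5, sd=1.0, lower=0, upper=5):
    """Draw a rounded, normally distributed goal count clipped to [lower, upper]."""
    return min(max(round(rng.gauss(mean, sd)), lower), upper)


def simulate_season(n_teams=20, seed=0):
    """Simulate a balanced schedule: every team hosts every other team once."""
    rng = random.Random(seed)  # hypothetical seed, chosen only for reproducibility
    outcomes = {"home_wins": 0, "away_wins": 0, "draws": 0}
    for _home, _away in itertools.permutations(range(n_teams), 2):
        home_goals = simulate_goals(rng)
        away_goals = simulate_goals(rng)
        if home_goals > away_goals:
            outcomes["home_wins"] += 1
        elif home_goals < away_goals:
            outcomes["away_wins"] += 1
        else:
            outcomes["draws"] += 1
    return outcomes


season = simulate_season()
n_games = sum(season.values())
points_home = 3 * season["home_wins"] + season["draws"]
points_total = 3 * (season["home_wins"] + season["away_wins"]) + 2 * season["draws"]
print(season, f"relative HA = {100 * points_home / points_total:.2f}%")
```

Since home and away goals come from the same distribution, repeated runs with different seeds should scatter around a relative home advantage of 50%.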

In the next step, Eq. 1 was applied to determine the relative home advantage (i.e., the proportion of points won at home). The case that is used as an illustration in this article yielded 146 wins, 144 losses, and 90 draws for the home teams. This converts to the home teams winning 528 of the 1,050 total points, resulting in a relative home advantage of 50.29% (see Eq. 9).

\[ 50.29\% = \frac{528}{1050} = \frac{146 \cdot 3 + 90}{(146 + 144) \cdot 3 + 90 \cdot 2} \tag{9}\]

In line with the expectations, this slight advantage for the home teams does not reach significance (p = .426, h = 0.006, 95% CI [-0.055, 0.066]) when applying Pollard’s (1986) method (see Eq. 2, Eq. 3, and Eq. 4).

Finally, the relationship between the p-value, Cohen’s h, and points won by the home teams was determined by progressively increasing the points won at home by an average of one point per team (i.e., 20 points in total) across the entire season (see Table 1 for a full overview). This steady increase in the number of points won by the home teams continuously decreases the p-value and increases the effect size. This allows us to compare when the threshold of the p-value and the threshold for the effect size are reached in relation to the change in points won by the home teams. For the sake of simplicity, the total point value is kept constant at 1,050 even though this value would fluctuate based on shifts in actual game outcomes.

Table 1. Relationship between p-values, Cohen’s h, and points won by home teams based on a total of 1,050 points.

| Points | Average improvement in points per team | Relative HA (%) | p | h [95% CI] |
|---|---|---|---|---|
| 528 | - | 50.29 | .426 | .006 [-0.055, 0.066] |
| 548 | 1 | 52.19 | .078 | .044 [-0.017, 0.104] |
| 568 | 2 | 54.10 | .004 | .082 [0.022, 0.143] |
| 588 | 3 | 56.00 | < .001 | .120 [0.060, 0.181] |
| 608 | 4 | 57.90 | < .001 | .159 [0.098, 0.219] |
| 628 | 5 | 59.81 | < .001 | .198 [0.137, 0.258] |
| 648 | 6 | 61.71 | < .001 | .236 [0.176, 0.297] |

Note: In the original layout, the row at which the p-value first reaches significance (568 points, an average improvement of 2 points per team) is marked in green, and the row at which h first indicates at least a small effect (648 points, an average improvement of 6 points per team) is marked in orange.
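The values in Table 1 can be recomputed from Eq. 2 to Eq. 4 with a short loop. The sketch below is an illustration with hypothetical names (`summarize`, `rows`); it keeps the total at 1,050 points and adds 20 home points (one per team) per step, as described above. The printed values should agree with Table 1 up to rounding in the last decimal.

```python
import math

TOTAL_POINTS = 1050
N_TEAMS = 20
BASE_HOME_POINTS = 528


def summarize(home_points, total_points=TOTAL_POINTS, p0=0.5):
    """Return proportion, one-sided p-value, Cohen's h, and its 95% CI bounds."""
    p_obs = home_points / total_points
    z = (p_obs - p0) / math.sqrt(p0 * (1 - p0) / total_points)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # right-tailed p-value (Eq. 2)
    h = 2 * math.asin(math.sqrt(p_obs)) - 2 * math.asin(math.sqrt(p0))  # Eq. 3
    half_width = 1.96 * 2 * math.sqrt(1 / (4 * total_points))  # Eq. 4
    return p_obs, p_value, h, h - half_width, h + half_width


rows = []
for gain in range(7):  # average gain of 0 to 6 points per team
    home_points = BASE_HOME_POINTS + gain * N_TEAMS
    rows.append((home_points, gain, *summarize(home_points)))

# Smallest average gain per team at which each threshold is reached.
first_significant = next(row[1] for row in rows if row[3] < 0.05)
first_small_effect = next(row[1] for row in rows if row[4] >= 0.2)

for home_points, gain, p_obs, p_value, h, lower, upper in rows:
    print(f"{home_points}  {gain}  {100 * p_obs:.2f}  {p_value:.3f}  "
          f"{h:.3f} [{lower:.3f}, {upper:.3f}]")
print(f"p < .05 from +{first_significant} points per team; "
      f"h >= 0.2 from +{first_small_effect} points per team")
```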

The results of the simulated season converge with the mathematically determined critical values and yield several findings. First of all, for a single season, a mere increase of 1.4 points per team is sufficient to yield a significant home advantage according to the p-value. This means that we would find support for a significant effect even if not every team turned one draw (out of 19 games played at home) into a win. Note that this calculation is based on a single season, which is often not representative of the current research. If we considered two seasons with a point total of 2,100, even a single point improvement (i.e., a relative home advantage of 52.19%) would yield a statistically significant effect (p = .022, h = 0.044, 95% CI [0.001, 0.087]). Thus, in line with the arguments provided by previous papers, the p-value alone might yield a significant effect for marginal improvements in the teams’ actual home performance (Dufner et al., 2023; Sors et al., 2022).
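The sample size dependence described in this paragraph can be illustrated by evaluating the same observed proportion at two sample sizes. In this sketch, `one_sided_p` is a hypothetical helper implementing Eq. 2 with a right-tailed p-value, and the two-season case simply doubles the point total while keeping the proportion fixed at 548/1,050.

```python
import math


def one_sided_p(p_obs, n, p0=0.5):
    """Right-tailed p-value of the one-sample proportion z-test (Eq. 2)."""
    z = (p_obs - p0) / math.sqrt(p0 * (1 - p0) / n)
    return 0.5 * math.erfc(z / math.sqrt(2))


p_obs = 548 / 1050  # one additional point per team, relative HA of 52.19%

p_one_season = one_sided_p(p_obs, 1050)
p_two_seasons = one_sided_p(p_obs, 2100)

print(f"one season:  p = {p_one_season:.3f}")
print(f"two seasons: p = {p_two_seasons:.3f}")
```

The observed proportion, and therefore Cohen’s h, is identical in both calls; only n changes, which is sufficient to move the p-value across the .05 threshold.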

In contrast to the p-value, the effect size measure provided by Cohen’s h is not affected by the sample size (see Eq. 3). However, a small effect size is not reached until approximately 60% of the points are won by the home teams. In soccer, this is equivalent to 5.7 additional points per team or almost two wins (instead of losses) when playing at home. Critically, not many sports reach such strong effects for the home advantage (Jamieson, 2010). This is especially important when considering women’s sports, where the 60% threshold is rarely reached (Pollard et al., 2017; Pollard & Gómez, 2014). In other words, when considering the statistical effect size measures only, different sports and groups would seemingly not benefit from the home advantage, or the effect would be deemed negligible. Note also that the width of the confidence interval is approximately 0.12, which indicates that a single season does not allow for very precise estimations considering the interpretation thresholds for h. Thus, simply reporting and relying on the point estimate alone may be too imprecise.

4 Discussion

The literature on the home advantage in sports currently uses a variety of methods to assess the meaningfulness of the results. Consequently, many new approaches (e.g., Higgs & Stavness, 2021; Hill & Yperen, 2021) or additions to existing methods (Sors et al., 2021) have been proposed and utilized. However, independent of these approaches, the inferential statistics that indicate the existence of the home advantage are often determined by arbitrary thresholds that are not based on relevant measurement units that are used in the field, such as the number of points won by the home teams for sports like soccer. That is, even the addition of effect size measures to combat the vulnerability of significance tests given the large sample sizes that are common in this field (Sors et al., 2022) resulted in unit-less measures that may not be meaningful. Furthermore, these statistical approaches do not relate to real-world changes and may be hard to interpret for individuals who have not received detailed training (Cuijpers, 2021). The presented simulation of a single soccer season based on a random distribution to create game outcomes confirmed the vulnerability of p-values to large sample sizes, but it also indicated that effect size measures that are independent of the sample size, like Cohen’s h, may impose thresholds that are not met in many sports, particularly women’s sports (Pollard et al., 2017). In other words, even if the magnitude of the effect is statistically not large, it may have practical relevance. Specifically, findings from psychological studies demonstrate that children at young ages already indicate that they prefer to play at home and believe that they perform better on their home turf, an effect that steadily increases with age (Staufenbiel et al., 2016). Moreover, coaches even choose different strategies for home and away games (Staufenbiel et al., 2015).
Apparently, smaller changes in the game outcomes that are statistically negligible yield psychological and behavioral changes in players and coaches when adjusting to playing at home or away.

In light of these findings, it is recommended to report an additional measure that relates to the actual measurement units of the variable of interest that players and coaches focus on when assessing the home advantage (cf. Cuijpers, 2021). Note that this step does not imply developing new standardized thresholds for the home advantage specifically (Panzarella et al., 2021). Instead, the appropriate measurement may refer to changes in points (for a detailed illustration of different regression models based on point distributions, see Higgs & Stavness, 2021) given that most studies use retrospective match outcomes for the analyses. However, studies may also increase their methodological rigor. For example, studies on different aspects of the home advantage like the referee bias (e.g., Boyko et al., 2007) can use indicators like erroneous decisions (Buraimo et al., 2010; Hill & Yperen, 2021). To illustrate, Nevill et al. (2002) conducted a controlled experiment to indicate that video recording without noise yielded 15.5% fewer fouls called against the away team. Similar behavioral indicators may also be derived from players during games using video notational analyses (e.g., Caso et al., 2025). This allows for verification of whether the statistical indicators coincide with behavioral changes. These indicators may be more relevant for players and coaches who are inclined to evaluate effects on measurable scales rather than unit-less statistics (cf. Kallogjeri & Piccirillo, 2023). While this paper uses a single and specific method to illustrate this issue, this recommendation also extends to other methods like Bayesian statistics, which also often make use of unit-less effect size measures with standardized interpretation thresholds. However, the precise limitations and estimation biases should be established for each method specifically.

5 Conclusion

To summarize, relying purely on standardized thresholds of statistical measures to draw conclusions about the existence and relevance of home advantage in sports may be misleading (Dufner et al., 2023; Sors et al., 2022). That is, significance tests are vulnerable to the commonly large sample sizes in this field, while the effect sizes typically indicate that the magnitude of the effects is very small – which is often mistaken for the absence of practical relevance. Thus, this paper advocates for reporting the results in meaningful measurement units that either directly influence the final standings in a table, such as points, or tangible behavioral changes. Put more strongly, we must make use of meaningful measurement units when they are available to us to make meaningful claims about real-world phenomena.

6 Additional Information

6.1 Data Accessibility

The dataset and scripts are openly accessible under https://doi.org/10.34894/0NYQG7.

6.2 Conflict of Interest

The author has no conflicts of interest to declare.

6.3 Funding

No funding was used for this paper.

6.4 Acknowledgments

I would like to thank Jeroen Smeets for inspiring the idea for this article as well as Lisanne Kleygrewe and Anke Quast for their feedback on the manuscript.

7 References

Boyko, R. H., Boyko, A. R., & Boyko, M. G. (2007). Referee bias contributes to home advantage in English Premiership football. Journal of Sports Sciences, 25(11), 1185–1194. https://doi.org/10.1080/02640410601038576
Buraimo, B., Forrest, D., & Simmons, R. (2010). The 12th man?: Refereeing bias in English and German soccer. Journal of the Royal Statistical Society Series A: Statistics in Society, 173(2), 431–449. https://doi.org/10.1111/j.1467-985X.2009.00604.x
Caso, S., Furley, P., & Jordet, G. (2025). Using video-notational analysis to examine soccer players’ behaviours. International Journal of Sport and Exercise Psychology, 1–21. https://doi.org/10.1080/1612197X.2025.2477165
Courneya, K. S., & Carron, A. V. (1992). The home advantage in sport competitions: A literature review. Journal of Sport & Exercise Psychology, 14(1), 13–27. https://doi.org/10.1123/jsep.14.1.13
Cuijpers, P. (2021). Has the time come to stop using the "standardised mean difference"? Clinical Psychology in Europe, 3(3), 6835. https://doi.org/10.32872/cpe.6835
Dufner, A., Schütz, L., & Hill, Y. (2023). The introduction of the video assistant referee supports the fairness of the game – an analysis of the home advantage in the German Bundesliga. Psychology of Sport and Exercise, 66, 102386. https://doi.org/10.1016/j.psychsport.2023.102386
Higgs, N., & Stavness, I. (2021). Bayesian analysis of home advantage in North American professional sports before and during COVID-19. Scientific Reports, 11(1), 14521. https://doi.org/10.1038/s41598-021-93533-w
Hill, Y., & Yperen, N. W. (2021). Losing the home field advantage when playing behind closed doors during COVID-19: Change or chance? Frontiers in Psychology, 12. https://doi.org/10.3389/fpsyg.2021.658452
Jamieson, J. P. (2010). The home field advantage in athletics: A meta-analysis. Journal of Applied Social Psychology, 40(7), 1819–1848. https://doi.org/10.1111/j.1559-1816.2010.00641.x
Kallogjeri, D., & Piccirillo, J. F. (2023). A simple guide to effect size measures. JAMA Otolaryngology–Head & Neck Surgery, 149(5), 447–451. https://doi.org/10.1001/jamaoto.2023.0159
Nevill, A. M., Balmer, N. J., & Williams, A. M. (2002). The influence of crowd noise and experience upon refereeing decisions in football. Psychology of Sport and Exercise, 3(4), 261–272. https://doi.org/10.1016/s1469-0292(01)00033-4
Panzarella, E., Beribisky, N., & Cribbie, R. A. (2021). Denouncing the use of field-specific effect size distributions to inform magnitude. PeerJ, 9, e11383. https://doi.org/10.7717/peerj.11383
Pollard, R. (1986). Home advantage in soccer: A retrospective analysis. Journal of Sports Sciences, 4(3), 237–248. https://doi.org/10.1080/02640418608732122
Pollard, R., Bermejo, J. P., & Gómez, M. (2017). Global differences in home advantage by country, sport and sex. International Journal of Performance Analysis in Sport, 17(4), 586–599. https://doi.org/10.1080/24748668.2017.1372164
Pollard, R., & Gómez, M. (2014). Comparison of home advantage in men’s and women’s football leagues in Europe. European Journal of Sport Science, 14, 77–83. https://doi.org/10.1080/17461391.2011.651490
Sors, F., Grassi, M., Agostini, T., & Murgia, M. (2021). The sound of silence in association football: Home advantage and referee bias decrease in matches played without spectators. European Journal of Sport Science, 21(12), 1597–1605. https://doi.org/10.1080/17461391.2020.1845814
Sors, F., Grassi, M., Agostini, T., & Murgia, M. (2022). A complete season with attendance restrictions confirms the relevant contribution of spectators to home advantage and referee bias in association football. PeerJ, 10, 13681. https://doi.org/10.7717/peerj.13681
Staufenbiel, K., Lobinger, B., & Strauß, B. (2015). Home advantage in soccer – a matter of expectations, goal setting and tactical decisions of coaches? Journal of Sports Sciences, 33(18), 1932–1941. https://doi.org/10.1080/02640414.2015.1018929
Staufenbiel, K., Riedl, D., & Strauß, B. (2016). Learning to be advantaged: The development of home advantage in high-level youth soccer. International Journal of Sport and Exercise Psychology, 16(1), 36–50. https://doi.org/10.1080/1612197x.2016.1142463

  1. Alternatively, the odds ratio may also be used to determine the effect size for proportion tests. Similar to Cohen’s h, this approach is also unit-less and does not include the sample size in the formula.





Communications in Kinesiology