AYSO Soccer's Mistake with Team Balancing:
Why the Fireballs' 36 Points Doesn't Equal the Dragons' 36 Points
or
Why the Dragons beat the Fireballs 9-0

by Steve Hampton
Davis, CA
November, 2003 (revised October, 2009)

One of the main tenets of AYSO soccer is "balanced teams". Despite this, Davis AYSO has had trouble achieving that goal. In the past (pre-2003), the problem was so chronic that the league was notorious as one of the most unbalanced youth leagues in town. Most families I know experienced dismal or dominating seasons, with few competitive games. In contrast, the local baseball little league, where players try out and coaches draft players, seems much more successful. After 2003, AYSO adopted the "sprinkling" approach promoted here, which improved things considerably. However, many still don't understand it, and the annual comparison of teams' "average" rankings suggests Davis may slide back into old problems.
[Photo caption: It's another slow day for the Dragons' keeper. Will he get to touch the ball this quarter?]


Two Main Methodologies for Team Balancing
The current team balancing method is for coaches to rate players 1.0 through 5.0, at 0.5 intervals, with 5.0 being the very best players. Ignoring the obvious problem of variability in coach ratings, let's assume the coaches generally get it right and understand exactly who is a "3", who is a "4", and so on. The real problem comes after the ratings, when the teams are formed. There are two general approaches to using these ratings in team formation:

· the aggregate sum approach
· the sprinkling approach

In the aggregate sum approach, players are assigned to teams (typically by a computer) such that each team ends up with the same total number of points. For example, the Dragons may have players rated 5-5-5-4-3-3-3-2-2-2-1-1 = 36 points, while the Fireballs may end up with players rated 4-4-4-3-3-3-3-3-3-2-2-2 = 36 points. These would be considered balanced.

In the sprinkling approach, players within each rating are sprinkled across teams (often manually) in a snake-like fashion. First the 5's are passed out, then the 4's, and so on. Unrated players are also spread evenly across teams. The net result is similar to the aggregate sum approach, in that all teams end up with similar aggregate point values. The difference here is that the team compositions are also similar.

Two Problems with Adding Up Ratings
The sprinkling approach is vastly superior to the aggregate sum approach, which is deceptive for two main reasons.

1. The Treatment of Ordinal Rankings
In the most general sense, the player ratings are ordinal rankings. That is, the ratings simply put the players into ordered categories that happen to have numbers as labels. They do not necessarily imply that the numbers have meaning as values and are related to each other in a linear fashion. All we know is that the 5's are more valuable than the 4's, the 4's are more valuable than the 3's, and so on. However, we do not necessarily know how much more valuable each rating is than the one below it. For example, it may not be true that a 5 is 20% better than a 4, a 4 is 20% better than a 3, and so on. One could posit that a 5 is a dominant player who can virtually always beat a 4 to get off a shot, or stop a 4 on the defensive end, and is thus more like a 7 or an 8 relative to a 4, in terms of actual value to the team. Figure 1 shows some possible relationships between player rating categories and actual value to a team.

The aggregate sum approach, by treating the ratings as numbers, implicitly imposes a linear relationship on them, thus assuming path A. In fact, we don't know the actual relationship between the player ratings and their actual value to a team. Under the aggregate sum approach, if path B or C is more accurate, serious team imbalances may occur.

As an example, let's return to the two teams described earlier. If path A is true, then the ratings reflect precisely their numerical value to the team (e.g., a 5 contributes 20% more than a 4, etc.; or, put another way, trading a 5 and a 2 for a 4 and a 3 is an equal trade). However, let's also evaluate the situation in which path B is a more accurate representation of reality.

             DRAGONS                            FIREBALLS
  Player      Actual Value           Player      Actual Value
  Rating    Path A    Path B         Rating    Path A    Path B
    5          5         8             4          4         4
    5          5         8             4          4         4
    5          5         8             4          4         4
    4          4         4             3          3         2
    3          3         2             3          3         2
    3          3         2             3          3         2
    3          3         2             3          3         2
    2          2         1             3          3         2
    2          2         1             3          3         2
    2          2         1             2          2         1
    1          1         0.2           2          2         1
    1          1         0.2           2          2         1
  Total:      36        37.4         Total:      36        27

It's not difficult to see the outcome if indeed path B is more realistic. In this case, the three 5's on the Dragons dominate the game and give their team a lopsided advantage over the Fireballs. The Fireballs, lacking any 5's, may have match-up problems as well and be all the more hard-pressed to defeat teams with 5's.
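The table's arithmetic is easy to check directly. The sketch below uses the two rosters from the example; the path A and path B value mappings are the illustrative ones from the table, not measured quantities:

```python
# Illustrative "actual value" mappings from the table (path A is linear;
# path B inflates the 5's and deflates the weaker players).
PATH_A = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
PATH_B = {1: 0.2, 2: 1, 3: 2, 4: 4, 5: 8}

dragons   = [5, 5, 5, 4, 3, 3, 3, 2, 2, 2, 1, 1]
fireballs = [4, 4, 4, 3, 3, 3, 3, 3, 3, 2, 2, 2]

def team_value(roster, mapping):
    """Total 'actual value' of a roster under a given rating-to-value curve."""
    return sum(mapping[r] for r in roster)

# Under path A the teams look identical (36 vs 36); under path B the
# Dragons' three 5's swing the balance to 37.4 vs 27.
for name, mapping in (("Path A", PATH_A), ("Path B", PATH_B)):
    print(name, team_value(dragons, mapping), team_value(fireballs, mapping))
```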

The sprinkling approach inherently avoids this problem of not knowing the relationship between rating and value. With the sprinkling approach, paths A, B, or C may represent reality and the teams will still be balanced. The key advantage here is that we don't have to know which path is correct: the method works for any functional form (any potential path).

2. The Treatment of Unrated Players
Whenever the focus is on the aggregate sum of the ratings, there is a question regarding how to treat players that have no prior rating. Players with no rating are typically assigned a specific point value (both 3 and 0 have been used by Davis AYSO).

There are two significant problems with this:
1) We have no idea what their actual rating should be (my own experience suggests that 3 is too high, and 0 is obviously too low).
2) There is a much larger variance (i.e., degree of uncertainty) associated with this category than with any other (e.g., I've seen unrated players turn out to be everything from a 1 to a 5).
The aggregate sum approach is blind to the number of unrated players assigned to any one team (I was once given 6 unrated players on a team of 12). Obviously, putting a disproportionate number of these players on one team concentrates the uncertainty on that team. The goal should be to spread the uncertainty as thinly as possible; under the sprinkling approach, unrated players are spread evenly across all teams.
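The uncertainty argument can be made concrete with a small simulation. The sketch below assumes, purely for illustration, that an unrated player's true strength is uniform between 1 and 5 (consistent with the observation that they turn out to be everything from a 1 to a 5) and that every rated player is a known 3:

```python
import random
import statistics

random.seed(1)

def simulate_totals(num_unrated, trials=10_000):
    """Total 'true' strength of a 12-player team in which num_unrated
    players are unknowns drawn uniformly from 1..5 and the rest are 3's.
    (The uniform 1-5 assumption is mine, for illustration only.)"""
    totals = []
    for _ in range(trials):
        unknown = sum(random.randint(1, 5) for _ in range(num_unrated))
        totals.append(unknown + 3 * (12 - num_unrated))
    return totals

# Both rosters have the same expected total (36), but the team stuck
# with six unrated players is far less predictable than one with two.
for n in (2, 6):
    t = simulate_totals(n)
    print(n, "unrated -> mean", round(statistics.mean(t), 1),
          "stdev", round(statistics.stdev(t), 2))
```

The aggregate sum approach scores both rosters identically; the spread in possible outcomes is what it cannot see.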

Conclusions and Recommendations
Since 2004, Davis and most other AYSO leagues around the nation have used the sprinkling method. However, most of these leagues still mention re-balancing measures they may take (for example, when players drop or join) and describe the aggregate sum of the player ratings as their primary criterion for deciding whether or not a team is balanced. In using this criterion, they fall into the trap of assuming path A is the correct one.

I offer the following recommendations:
1. Strictly employ the sprinkling approach, taking care that each team has, to the extent possible, equal numbers of 5's, 4's, 3's, 2's, 1's, and unrated players.

2. Do not fall into the trap of summing or averaging the player ratings. To avoid this mistake, rate the players A thru E rather than 1 thru 5. With the sprinkling approach, simply spread each category across the teams.

3. Treat unrated players as unrated players. Do not assign a point value or rating of any kind to them. Perhaps call them U's. Spread them evenly across teams.

4. Because there are usually two age groups in each division, consider creating more rating categories by dividing the A thru E's into A-E "first year in division" (where the rating comes from play in a younger division) and A-E "second year in division". (This is the current practice in Davis.) As long as there are at least as many teams as there are players in each rating category, more categories can be created.



Send correspondence to Steve Hampton at hamptons_at_sbcglobal.net

About the Author:
Steve Hampton grew up playing recreational soccer and has been coaching AYSO soccer since 1997 (U6 through U19). When not thinking about team defensive strategies, he ponders how math affects the lives of children.