In this era of Moneyball, almost every sport is delving into how to coax wisdom from the numbers that playing the game naturally generates. The USA National Team has been active in this discovery process: there are coaches integrated into the coaching staff who are specifically dedicated to the creation, calculation, and analysis of meaningful statistics derived from the basic playing statistics. They engage in descriptive statistics, capturing the details of game action, what happened with each act of playing the ball, by assigning values to each action on the ball. This is particularly important in the fast-paced, continuously changing, and competitive environment of international volleyball. The statistics staff continuously keep the coaches apprised of the game action as seen through the filter of statistics, cutting through the cloud of human biases and perceptions.
Those of us who reside in the less rarefied air of high school and club volleyball are also interested in using statistics for our own purposes. We cannot possibly accrue that level of descriptive statistics in our matches because we lack the resources, both human and technical, but we sometimes take the statistics that we do have and apply inferential statistics to help us decide how to plan our training, as well as to measure our team's progress throughout a season. If we wish to measure improvement, we first need to measure our base level of performance, whether for individual players and individual skills or for team performance during match play. Regardless of the parameters of the performance measures, we need to take those measures before and after making any changes so that we can compare.
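As a minimal sketch of that baseline-and-comparison idea (assuming hand-logged ratings on the common 3-point passing scale, with every number invented purely for illustration), a few lines of Python suffice:

```python
import statistics

# Hypothetical 3-point pass ratings for one passer, logged by hand.
# Every number below is invented purely for illustration.
before = [2, 3, 1, 2, 2, 3, 0, 2, 3, 2]   # baseline week
after  = [3, 2, 2, 3, 3, 2, 1, 3, 3, 2]   # after a serve-receive block

for label, ratings in (("before", before), ("after", after)):
    mu = statistics.mean(ratings)
    sd = statistics.stdev(ratings)
    print(f"{label}: mean pass rating {mu:.2f}, spread {sd:.2f}")

# Comparing the two snapshots is only meaningful if the conditions
# (servers faced, drill vs. match, rating criteria) are held constant.
```

The closing comment is the crux: the comparison only means something if the measurement conditions stay the same, which is exactly where the next distinction comes in.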
What is often overlooked is the vast difference between descriptive statistics and inferential statistics. Inferential statistics rests on assumptions about the processes under measurement: whether they all occur under the same conditions, whether the processes are under statistical control, and whether the measurement process is repeatable and reproducible.
We see the same need for measurement and improvement when we observe other sports, or any human endeavor outside of sports, that wishes to transform observations into corrective action. Statistical Process Control (SPC), and especially Six Sigma processes, have become ubiquitous in our vocabulary. Indeed, using statistical measures is the key to creating consistent manufacturing processes, minimizing process errors, and increasing process throughput. Unfortunately, there are critical differences between the manufacturing environment and the sporting environment. In the manufacturing environment, the variability of the machines is measurably minimal because the machines are inanimate and, by and large, controllable. This is not to say that controlling those variables is easy; the controllability problem in manufacturing can be difficult because the threshold of error is small and the required signal-to-noise ratio is large.
In the sporting environment, human actions and responses can be random in the extreme, which drives the uncertainty in the sporting process. To make matters worse, the uncertainties associated with each individual are coupled, so the impact of one person's randomness is not limited to that person's actions but affects everyone taking part: every player on both teams, the officials, the coaching staff, and so on all contribute to the aggregation of uncertainty in every statistical measure. Any single coupling effect may be minuscule, so much of the coupling can be ignored, but not all of it can be. This is true of the instantaneous descriptive statistics taken during matches as well, but the averaging in descriptive statistics is minimal compared to the accumulation of the larger datasets used to draw inferences. For example, a good server influences not only the passer but also the setter and the hitter; the interaction can have secondary and tertiary effects on how the serving team plays as it reacts to the actions of the passing team. Each action in volleyball, as in most sports, depends on prior actions.
So why talk about this? Because many coaches ask questions of the following form: “I have a [name a level and age] team; what statistical threshold should my team be performing at when we are performing [name a volleyball action]?”
The intent of the question is clear. The coach is trying to establish a reference level of performance against which to compare what they can measure of their own team. But the question is a loaded one. Because sports are dependent upon prior actions, there is no way to separate and isolate a specific game action from everything that led up to it; every statistic we take is conditional on those prior actions. Yet the measures we take are one-dimensional, and they never truly reflect the deep coupling of the actions.
To further compound the uncertainties, many assumptions are tacitly made. The usual practice in statistics is to take many different sets of one kind of data and aggregate them into one representative set by averaging the datasets together. Averaging, as with all things, has its advantages and disadvantages. The advantage is that many datasets can be combined into a uniform, representative set that gives the user a good idea of the general trend for specific variables: how well the team is performing and how well each player is performing in each of the measured skills. Rather than diving through massive amounts of data, the aggregate is used, on the assumption that the aggregation is an accurate representation of your team. Which brings us to the disadvantage of taking averages. When an average is taken, the salient contributing factors are smeared; the highs and lows of all the datasets are nullified in deference to the average. The result is that we have erased the unique contributions and variations of individual opponents as well as of the team of interest. In essence, averaging your own team's statistics creates a fictional “average” representation of your team. More subtly, the statistics generated are presumed to be against yet another fictional “average” team, that of the opponents. This is problematic because the opposing team's actions are what elicit the response from your team, so the weaknesses inherent in your players, and in your team in aggregate, are disguised by the average of the opponents, which negates any insight you might gain about your team and what you need to correct in training. By pitting an “average” representation of your team against an “average” opponent, you also obscure the specifics of how your team plays: problem rotations you may have against a good team or a good player. You erase the problems you may have in certain situations, like passing in the seams or hitting line. You average out your best player's statistics along with your worst player's, so you are unable to identify the problem areas.
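A small sketch makes the smearing concrete. The pass ratings below are invented, and the opponent labels are hypothetical; the point is only that a respectable season-wide average can hide an opponent-specific weakness:

```python
from statistics import mean

# Invented pass ratings (3-point scale) broken out by opponent.
by_opponent = {
    "Team A (weak serving)":  [2.8, 2.7, 2.9],
    "Team B (average)":       [2.3, 2.4, 2.2],
    "Team C (jump servers)":  [1.5, 1.4, 1.7],
}

# The season-wide aggregate: one number against the "average" opponent.
all_sets = [r for sets in by_opponent.values() for r in sets]
print(f"season 'average team' pass rating: {mean(all_sets):.2f}")  # ~2.21

# The aggregate looks respectable, but the breakdown tells the story:
for opponent, sets in by_opponent.items():
    print(f"  {opponent}: {mean(sets):.2f}")
# Against tough servers the team passes ~1.5, the very weakness that the
# season-wide average against an "average" opponent disguises.
```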
Note that all of the aforementioned situational information is readily available from the descriptive statistics taken during the game. It is when we try to infer our team's future performance by comparing our team's general “average” performance against the performance of our general “average” opponent that the inferential value of the exercise disappears.
Another, more subtle, logical fallacy is tacitly committed on top of the averaging problem.
“When a measure becomes a target, it ceases to be a good measure.”
Goodhart's Law, as articulated by the anthropologist Marilyn Strathern
What does the above statement mean? It means that we choose meaningful measures to help us determine the truth of what we experience, observing specific variables that will validate or refute the pictures in our minds. The measurement should be performed unobtrusively so as not to affect the outcome of what we are trying to observe. But if we take a shortcut and make reality conform to what we think we need to observe, that is, if we make the team aim at the expected measures as targets, we skew our players' minds toward performing to the artificial horizons set by the measure-turned-target rather than toward what we actually hope to achieve: maximizing performance across all of the variables and, more importantly, winning. A good statistical lesson to remember is that correlation does not equal causation. Just because two sets of data correlate does not mean that one result follows from the other.
Using fictitious volleyball truisms as targets for a team can actually hurt the team's chances. I was once a firm believer in many truisms: the ace-to-error ratio must be at least one if you want to succeed; teams passing an average of 2.4 on a 3-point scale will always win; set distributions, in order, must go most sets to the left side, then the middle, then the right side, and finally the back row, if you want to win.
Once again, the problem with these truisms is that correlation does not equal causation. When coaches practice and train using artificially determined goals as the target, the measures stop being clues to the secret of team performance; they become the target and the end goal. People will focus on the target, work toward achieving it, and ignore the fact that the purpose of playing the game is to be the winner when the last ball drops. When players become preoccupied with the artificial horizons set by the coaching staff, they make winning and losing, and their overall game performance, secondary. All coaches have stories about how their teams did everything perfectly according to the statistics and still lost, and vice versa. Statistics should not be the goal; they should be a way to augment the picture everyone has of the reality they are experiencing.
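A toy simulation, with entirely made-up parameters, illustrates the trap: a latent team-skill variable drives both the ace-to-error ratio and the win percentage, so the two correlate strongly, yet pushing the ratio up directly moves the target without moving the outcome:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000  # hypothetical team-seasons; all coefficients below are invented

# A latent "team skill" drives BOTH measures.
skill = rng.normal(0.0, 1.0, n)
ace_error_ratio = 1.0 + 0.4 * skill + rng.normal(0.0, 0.2, n)
win_pct = 0.50 + 0.15 * skill + rng.normal(0.0, 0.05, n)

# The truism "ratio >= 1 means success" looks great in the data...
print(np.corrcoef(ace_error_ratio, win_pct)[0, 1])  # strong correlation

# ...but intervening on the measure (serving for aces, accepting the
# risk) shifts the ratio without touching the skill that actually
# produces wins: the target moves, the outcome does not.
targeted_ratio = ace_error_ratio + 0.5
print(targeted_ratio.mean() - ace_error_ratio.mean())  # +0.5 on the target
print(win_pct.mean())  # win percentage unchanged, by construction
```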
A Digression.
The use of averaging is ubiquitous in real life, even in the way we assess players in general. The avcaVPI measure was instituted ostensibly to help the college-bound athlete determine whether they can play in college, and to help college coaches find players to recruit based on physical measures. The idea is to use the avcaVPI score to give players and coaches an idea of how a player would fit into given college divisions and programs by comparing their physical attributes, as measured in non-competitive environments, to those of players already competing in college. When the initial VPI measure came out, I remember that it was simply a single score, an aggregate of the various physical measures taken at the testing sites. The initial criticism was that players were not compared against players playing their positions; to the AVCA's credit, it looks like they have corrected that oversight, although the avcaVPI data does not seem to be further segregated into NCAA, NJCAA, or NAIA divisions (I could be wrong). The avcaVPI scores are now categorized and ranked by position, with each player placed at a percentile in each test category relative to the players already playing in college. That is much more useful than before, but it is still misleading. What is left unsaid, again, is that correlation does not equal causation: a player's physical measure falling within the percentile range of existing college players does not mean that the player is going to get recruited to play in college.
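For concreteness, the percentile mechanics look roughly like the sketch below; the function is a generic percentile rank, and the standing-reach numbers are invented stand-ins, not actual avcaVPI data, categories, or pools:

```python
def percentile_rank(value, population):
    """Percent of the comparison group at or below `value`."""
    at_or_below = sum(1 for v in population if v <= value)
    return 100.0 * at_or_below / len(population)

# Invented standing-reach values (inches) for college middles; the real
# avcaVPI comparison pool and test categories are not reproduced here.
college_middles = [118, 117, 120, 121, 119, 122, 123, 120, 121, 124]
recruit = 121

print(f"recruit sits at the {percentile_rank(recruit, college_middles):.0f}th "
      "percentile of the (hypothetical) college pool")
# A high percentile says the body measures up; it says nothing about
# whether the player will actually be recruited.
```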
Talent evaluation is a very tricky and uncertain process; just ask any NFL team about their ability to identify a quality quarterback, and then point out that Tom Brady was drafted in the sixth round of the 2000 NFL Draft by the New England Patriots, 199th overall, the seventh quarterback taken.
The avcaVPI really does very little to clear up the collegiate volleyball
recruiting picture for all involved.