Defining Skewness
Skewness in statistics is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. A distribution is considered skewed if it is not symmetric, meaning one tail of the distribution is longer or fatter than the other. Understanding skewness helps in interpreting the relationship between the mean, median, and mode, and is crucial for choosing appropriate statistical methods for data analysis.
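One common way to put a number on this asymmetry is the moment-based coefficient: the third central moment divided by the second central moment raised to the power 3/2. The sketch below is a minimal illustration of that formula using the biased (population) form. The function name sample_skewness and the small data set are made up for this example.

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (biased, population form)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Second and third central moments, each divided by n (no bias correction).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


# Hypothetical data: one large value stretches the right tail.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]
print(sample_skewness(right_tailed))      # positive for a long right tail
print(sample_skewness([1, 2, 3, 4, 5]))   # 0.0 for evenly spaced, symmetric data
```

Statistical packages often apply a small-sample correction to this quantity, so their results can differ slightly from this uncorrected version.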
Types of Skewness
Skewness is usually described as one of three types: positive (right-skewed), negative (left-skewed), or zero (no skew). In a positively skewed distribution, the right tail is longer or fatter, and the mean is typically greater than the median. In a negatively skewed distribution, the left tail is longer or fatter, and the mean is usually less than the median. A symmetric distribution, such as the normal distribution, has zero skewness, and when it has a single peak its mean, median, and mode coincide. The converse does not hold in general: an asymmetric distribution can still have zero skewness if its two tails balance out.
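The three cases can be illustrated with a short sketch that classifies a sample by the sign of its skewness and compares the mean with the median. The helper names (skewness, classify), the tolerance, and the three small data sets are assumptions chosen for illustration only.

```python
import statistics


def skewness(values):
    """Biased moment-based skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


def classify(values, tol=1e-9):
    """Label a sample by the sign of its skewness (tol absorbs rounding noise)."""
    g1 = skewness(values)
    if g1 > tol:
        return "positive (right-skewed)"
    if g1 < -tol:
        return "negative (left-skewed)"
    return "zero (no skew)"


# Hypothetical samples; "left" is the mirror image of "right".
samples = {
    "right": [1, 2, 2, 3, 3, 4, 12],
    "left": [1, 9, 10, 10, 11, 11, 12],
    "symmetric": [1, 2, 3, 4, 5, 6, 7],
}

for name, data in samples.items():
    print(name, classify(data),
          "mean:", round(statistics.mean(data), 2),
          "median:", statistics.median(data))
```

In these samples the mean lies above the median for the right-skewed data, below it for the left-skewed data, and equals it for the symmetric data. That ordering is typical but not guaranteed for every skewed distribution.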
A Practical Example: Skewness in Income Distribution
A common example of a positively skewed distribution is household income. Most households earn a moderate income, but a small number earn very high incomes. These few high earners create a longer tail on the right side and pull the mean income above the median income. Conversely, age at death in a developed country might be negatively skewed: most people live to an older age, and the comparatively few early deaths form a longer tail on the left side.
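The effect of the right tail on the mean can be sketched with simulated data. The example below draws synthetic "incomes" from a log-normal distribution, a shape often used as a rough stand-in for income data. The numbers are not real income figures, and the median of 50,000, the spread parameter of 0.8, and the seed are all arbitrary assumptions.

```python
import math
import random
import statistics


def skewness(values):
    """Biased moment-based skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)       # fixed seed so the run is repeatable
MEDIAN_INCOME = 50_000       # hypothetical median of the simulated population
SIGMA = 0.8                  # hypothetical spread on the log scale

# Synthetic incomes: log-normal draws whose theoretical median is MEDIAN_INCOME.
incomes = [rng.lognormvariate(math.log(MEDIAN_INCOME), SIGMA)
           for _ in range(10_000)]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"mean:     {mean_income:,.0f}")
print(f"median:   {median_income:,.0f}")
print(f"skewness: {skewness(incomes):.2f}")
```

In a run like this, the mean should come out noticeably above the median and the skewness should be positive, because a handful of very large simulated incomes dominate the right tail.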
Why Skewness Matters in Data Analysis
Skewness matters because it describes the shape of a data distribution, which affects how descriptive statistics should be interpreted. For skewed data, the median is often a better measure of central tendency than the mean, because the mean is pulled toward the long tail. Additionally, many parametric statistical tests assume normally distributed (and therefore symmetric) data, so checking for skewness helps determine whether a data transformation or a non-parametric test is needed.
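As a rough illustration of a data transformation, the sketch below applies a logarithm to a small right-skewed sample and compares the skewness before and after. The data set is hypothetical and deliberately chosen (successive doublings) so that the log transform makes it evenly spaced. Real data will rarely become this symmetric, and a log transform only applies to strictly positive values.

```python
import math
import statistics


def skewness(values):
    """Biased moment-based skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample: each value is double the previous one.
raw = [1, 2, 4, 8, 16, 32, 64]

# The log transform compresses large values more than small ones,
# which shortens the right tail.
logged = [math.log(x) for x in raw]

raw_skew = skewness(raw)
log_skew = skewness(logged)

print("raw:    mean", round(statistics.mean(raw), 2),
      "median", statistics.median(raw),
      "skewness", round(raw_skew, 2))
print("logged: mean", round(statistics.mean(logged), 2),
      "median", round(statistics.median(logged), 2),
      "skewness", round(log_skew, 2))
```

Before the transform the mean sits well above the median. Afterwards the two agree up to rounding and the skewness is essentially zero, which is the kind of change that can make methods assuming symmetry more appropriate.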