After measures of central tendency such as the mean, median, and mode, the next most useful statistics are variance and standard deviation. This guide walks through both. In data science and machine learning, standard deviation is especially helpful.
What is variance
Variance measures how much the values in a dataset fluctuate around their mean. If the variance is zero, all the values are identical; the higher the variance, the more the data fluctuates.
Data_1 = [20, 30, 80, 40, 70, 110, 60, 90]
Data_2 = [62.48, 62.47, 62.4, 62.43, 62.49, 62.46, 62.39, 62.30]
Here I have created two datasets. Both have an average of roughly 62.5 (exactly 62.5 for Data_1 and about 62.43 for Data_2). In Data_1, however, values such as 20, 30, 80, and 110 lie far from that average, so the average alone does not describe the data well. This is a high-fluctuation dataset, which is usually not what we want for analysis.
In the second dataset, all values are nearly identical and differ only in the decimal places. The average represents this data well, so it is a better basis for analysis.
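To make the comparison concrete, here is a minimal sketch that computes the mean and the population variance (the average squared distance from the mean) of both datasets in plain Python. The helper names mean and population_variance are illustrative, not from any library, and the lowercase names data_1 and data_2 refer to the two datasets above.

```python
# The two datasets from the text
data_1 = [20, 30, 80, 40, 70, 110, 60, 90]
data_2 = [62.48, 62.47, 62.4, 62.43, 62.49, 62.46, 62.39, 62.30]

def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def population_variance(values):
    """Average of the squared deviations from the mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)

mean_1, var_1 = mean(data_1), population_variance(data_1)
mean_2, var_2 = mean(data_2), population_variance(data_2)

print("data_1: mean =", mean_1, "variance =", var_1)  # mean 62.5, variance 843.75
print("data_2: mean =", round(mean_2, 4), "variance =", round(var_2, 6))
```

The means are close to each other, but the variance of data_1 is several orders of magnitude larger than that of data_2, which is exactly the difference in fluctuation described above.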
Variance is essential in research fields such as medicine and vaccine development, as well as in other industries.
For example, when we hear that a Covid vaccine is 95% effective, variance is one way to check how consistent that figure is.
When a vaccine is in a clinical trial, researchers test it on people and observe how well they are protected and how the antigen-antibody reaction develops. If the measured effectiveness is nearly identical across results, like 95, 95.01, 95.02,…95.n, we can say the vaccine performs well.
If the values fluctuate, like 90, 95, 96, 96, 80, 83…n, then the results are inconsistent and we cannot consider the vaccine reliable.
Variance in Python
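One way to calculate variance in Python is NumPy's var() function. The sketch below assumes NumPy is installed and reuses the two datasets from earlier. By default var() returns the population variance (dividing by n); passing ddof=1 gives the sample variance (dividing by n - 1), which is the usual choice when the data is a sample from a larger population.

```python
import numpy as np

# The two datasets from the text
data_1 = np.array([20, 30, 80, 40, 70, 110, 60, 90])
data_2 = np.array([62.48, 62.47, 62.4, 62.43, 62.49, 62.46, 62.39, 62.30])

# Population variance: ddof=0 is the default, so the divisor is n
pop_var_1 = np.var(data_1)
pop_var_2 = np.var(data_2)

# Sample variance: ddof=1 changes the divisor to n - 1
sample_var_1 = np.var(data_1, ddof=1)

print("Population variance of data_1:", pop_var_1)
print("Population variance of data_2:", pop_var_2)
print("Sample variance of data_1:", sample_var_1)
```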
When a dataset is large and its values vary widely, the variance is hard to interpret because it is expressed in squared units. This is where standard deviation comes in: the SD is the square root of the variance, so it is in the same units as the data and describes the spread relative to the mean. If the data points lie far from the mean, the standard deviation is high; if they cluster close to the mean, it is low.
Standard deviation is denoted by SD or by the Greek letter sigma (σ).
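The relationship between the two measures can be checked with Python's built-in statistics module. This is a small sketch using the first dataset from earlier: it takes the square root of the population variance and compares it with the population standard deviation computed directly.

```python
import math
import statistics

# First dataset from the text
data_1 = [20, 30, 80, 40, 70, 110, 60, 90]

# Population variance, then its square root
variance = statistics.pvariance(data_1)
sd_from_variance = math.sqrt(variance)

# Population standard deviation computed directly
sd_direct = statistics.pstdev(data_1)

print("Variance:", variance)
print("Square root of variance:", sd_from_variance)
print("Standard deviation:", sd_direct)
```

Both routes give a standard deviation of about 29.05, in the same units as the original values.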
Standard Deviation in Python
To calculate the standard deviation in Python, import the NumPy library and use its std() function.
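A minimal sketch of that call, assuming NumPy is installed and using the two datasets from earlier. Like var(), std() computes the population value by default; ddof=1 gives the sample standard deviation.

```python
import numpy as np

# The two datasets from the text
data_1 = np.array([20, 30, 80, 40, 70, 110, 60, 90])
data_2 = np.array([62.48, 62.47, 62.4, 62.43, 62.49, 62.46, 62.39, 62.30])

# Population standard deviation (default ddof=0)
sd_1 = np.std(data_1)
sd_2 = np.std(data_2)

# Sample standard deviation (divisor n - 1)
sample_sd_1 = np.std(data_1, ddof=1)

print("SD of data_1:", sd_1)   # about 29.05
print("SD of data_2:", sd_2)   # about 0.059
print("Sample SD of data_1:", sample_sd_1)
```

Although both datasets have a similar mean, the standard deviation of data_1 is roughly 29 while that of data_2 is well below 0.1, which confirms that data_2 is far more consistent.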