What Are Outliers?
An outlier is an extremely high or extremely low value in our data. A value can be identified as an outlier if it is greater than Q3 + 1.5(IQR) or lower than Q1 – 1.5(IQR).
IQR = Q3 – Q1
Note:
- IQR means interquartile range
- Q1 means the first quartile
- Q3 means the third quartile
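```python
import numpy as np

# The data must be sorted for the half-splitting below to give the quartiles
data = [32, 36, 46, 47, 56, 69, 75, 79, 79, 88, 89, 91, 92, 93, 96, 97,
        101, 105, 112, 116]
Q1 = np.median(data[:10])   # median of the lower half: 62.5
Q3 = np.median(data[10:])   # median of the upper half: 96.5
IQR = Q3 - Q1
print(IQR)  # 34.0
```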
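To connect the IQR back to the outlier rule above, here is a minimal sketch that computes both cut-offs and returns the values outside them. The function name `iqr_outliers`, the parameter `k`, and the extra value 250 added to the data are assumptions made for illustration; they are not part of the original example. It uses `np.percentile`, whose default linear interpolation can give slightly different quartiles than the median-of-halves approach.

```python
import numpy as np

def iqr_outliers(values, k=1.5):
    """Return (lower, upper, outliers) using the Q1 - k*IQR and Q3 + k*IQR rule."""
    arr = np.sort(np.asarray(values, dtype=float))
    q1, q3 = np.percentile(arr, [25, 75])
    iqr = q3 - q1
    lower = q1 - k * iqr   # anything below this is an outlier
    upper = q3 + k * iqr   # anything above this is an outlier
    outliers = arr[(arr < lower) | (arr > upper)]
    return lower, upper, outliers

# Same data as before, plus one hypothetical extreme value (250)
data = [32, 36, 46, 47, 56, 69, 75, 79, 79, 88, 89, 91, 92, 93, 96, 97,
        101, 105, 112, 116, 250]
lower, upper, outliers = iqr_outliers(data)
print(lower, upper, outliers)
```

With the extra value included, Q1 is 69 and Q3 is 97, so the cut-offs are 27 and 139 and only 250 is flagged.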
Another example
```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'rating': [90, 85, 82, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 8, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# 75th and 25th percentiles of the 'points' column
q75, q25 = np.percentile(df['points'], [75, 25])
iqr = q75 - q25
print(iqr)  # 5.75
```
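The same quartiles can be used to check a DataFrame column for outliers. This is a rough sketch under the rule stated at the top; the names `lower`, `upper`, `mask`, and `outlier_rows` are illustrative, and only the `points` column is rebuilt here so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19]})

q75, q25 = np.percentile(df['points'], [75, 25])
iqr = q75 - q25

lower = q25 - 1.5 * iqr   # values below this count as outliers
upper = q75 + 1.5 * iqr   # values above this count as outliers

# Boolean mask marking rows whose 'points' value falls outside the cut-offs
mask = (df['points'] < lower) | (df['points'] > upper)
outlier_rows = df[mask]
print(lower, upper)
print(outlier_rows)
```

For this column every value lies between the two cut-offs, so `outlier_rows` comes back empty.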