Identification of Outlier Data in Flow Pattern Experiments in a Bend by Using Statistical Methods

Authors

Abstract

Human or instrument error, measurement conditions, and the nature of the flow under unusual circumstances can generate data that are inconsistent with the normal pattern of the statistical population, which suggests that they may have been generated by a different process. Such data are generally called outliers. Identifying outliers is important in many respects and leads to a better and more precise understanding of the flow pattern. The main purpose of this study was to analyze and identify, by statistical methods, the outliers in flow pattern experiments in a bend channel with a central angle of 180 degrees and a width of 1 meter, both with and without a spur dike in the bend. The channel is located in the Hydraulic Laboratory of Persian Gulf University, and a Vectrino velocimeter was used to collect the 3D flow velocities. The outlier detection methods employed in this study were Median of Absolute Deviations (MAD), K-Means Clustering, Local Density Factor (LDF), and Voting. Applying these methods to the collected experimental data suggested that most of them were efficient and appropriate. Finally, the Voting method was used to obtain the optimum results in this paper: the data identified as outliers by most of the methods are considered the final outlier candidates.
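As an illustration of the MAD and Voting ideas named above, the following is a minimal sketch and not the procedure used in the study. It assumes the common modified z-score form of the MAD rule, with the scaling constant 0.6745 and a cutoff of 3.5, neither of which is stated in the abstract. The function names `mad_outliers` and `majority_vote` and the sample velocities are hypothetical.

```python
import numpy as np


def mad_outliers(values, threshold=3.5):
    """Flag points whose modified z-score exceeds `threshold`.

    Assumption: the 0.6745 constant and the 3.5 cutoff are common
    defaults for MAD-based detection, not values taken from the paper.
    """
    x = np.asarray(values, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:
        # No spread around the median: the score is undefined, flag nothing.
        return np.zeros(x.shape, dtype=bool)
    modified_z = 0.6745 * (x - median) / mad
    return np.abs(modified_z) > threshold


def majority_vote(flags):
    """Combine per-method flags (one row per method, one column per point).

    A point is kept as a final outlier candidate only if more than half
    of the methods flagged it.
    """
    flags = np.asarray(flags, dtype=bool)
    votes = flags.sum(axis=0)
    return votes > flags.shape[0] / 2


# Made-up velocity samples (m/s) with one suspicious reading at the end.
velocities = [0.30, 0.31, 0.29, 0.32, 0.30, 0.95]
mad_flags = mad_outliers(velocities)

# Hypothetical flags from two other detectors (for example a clustering
# method and a density method) for the same six points.
other_a = [False, False, False, True, False, True]
other_b = [False, False, False, False, False, True]

final_flags = majority_vote([mad_flags, other_a, other_b])
print(final_flags.tolist())
```

In this sketch a point flagged by only one of the three detectors is dropped by the vote, which is the behavior the abstract describes for the final candidates.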

Keywords