Authors: WU Wei, CHEN Yongliang
Abstract: Constructing a statistical model that best fits the background is a key step in geochemical anomaly identification. However, such a model is hard to construct when the sample population has an unknown or complex distribution. Isolation forest is an outlier detection approach that explicitly isolates anomalous samples rather than modeling the population distribution. It can extract multivariate anomalies from very large, high-dimensional datasets whose population distribution is unknown. For this reason, we tentatively applied the method to identify multivariate anomalies in the stream sediment survey data of the Lalingzaohuo district, an area with a complex geological setting in Qinghai Province, China. The performance of the isolation forest algorithm in anomaly identification was compared with that of a continuous restricted Boltzmann machine. The results show that the isolation forest model outperforms the continuous restricted Boltzmann machine in multivariate anomaly identification in terms of the receiver operating characteristic curve, the area under the curve, and data-processing efficiency. The anomalies identified by the isolation forest model occupy 19% of the study area and contain 82% of the known mineral deposits, whereas the anomalies identified by the continuous restricted Boltzmann machine occupy 35% of the study area and contain 88% of the known mineral deposits. Processing the dataset takes 4.07 seconds with the isolation forest model and 279.36 seconds with the continuous restricted Boltzmann machine. Therefore, isolation forest is a useful anomaly detection method that can quickly extract multivariate anomalies from geochemical exploration data.
Affiliations: Changchun Institute of Urban Planning and Design; Institute of Mineral Resources Prognosis on Synthetic Information
Keywords: isolation forest; continuous restricted Boltzmann machine; receiver operating characteristic curve; Youden index; geochemical anomaly identification
Classification code: P [Astronomy and Earth Sciences]
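The figures reported in the abstract can be compared side by side with a little arithmetic. The Python sketch below is illustrative only and is not the authors' evaluation procedure: it treats the fraction of known deposits captured as a hit rate and the fraction of the study area flagged as a stand-in for the false-positive rate, which is an assumption made here so that a Youden-style index (hit rate minus false-positive rate) can be formed. The names `REPORTED`, `youden_like_index`, and `capture_efficiency` are hypothetical.

```python
# Figures quoted in the abstract, as fractions and seconds.
REPORTED = {
    "isolation forest": {"area": 0.19, "deposits": 0.82, "seconds": 4.07},
    "continuous RBM": {"area": 0.35, "deposits": 0.88, "seconds": 279.36},
}


def youden_like_index(deposit_fraction, area_fraction):
    """Hit rate minus flagged-area fraction.

    Assumption: the flagged-area fraction is used as a proxy for the
    false-positive rate; the abstract does not report one directly.
    """
    return deposit_fraction - area_fraction


def capture_efficiency(deposit_fraction, area_fraction):
    """Share of known deposits captured per unit share of area flagged."""
    return deposit_fraction / area_fraction


# Derived comparison metrics for each model.
summary = {
    name: {
        "j": youden_like_index(row["deposits"], row["area"]),
        "efficiency": capture_efficiency(row["deposits"], row["area"]),
    }
    for name, row in REPORTED.items()
}

# Ratio of the two reported processing times.
speedup = (
    REPORTED["continuous RBM"]["seconds"]
    / REPORTED["isolation forest"]["seconds"]
)

for name, metrics in summary.items():
    print(f"{name}: J = {metrics['j']:.2f}, "
          f"efficiency = {metrics['efficiency']:.2f}")
print(f"runtime ratio = {speedup:.1f}x")
```

Under these assumptions the index is about 0.63 for the isolation forest and about 0.53 for the continuous restricted Boltzmann machine, and the reported runtimes differ by a factor of roughly 69, which is consistent with the comparison stated in the abstract.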