
Geoscience (现代地质) ›› 2022, Vol. 36 ›› Issue (03): 972-978. DOI: 10.19657/j.geoscience.1000-8527.2022.03.17

• Geochemistry •

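Spatial Prediction of Surface Soil Ore-forming Elements Based on Machine Learning: Taking Rare Metal Rubidium as an Example

DAI Liangliang1, NIE Xiaoli1, GUO Jun1, GONG Hao1, WU Huanhuan2, ZHANG Tao1, TANG Yuanyuan1, MAO Cong1, PENG Zhigang1, HE Can1

  1. Changsha Natural Resources Comprehensive Survey Center of China Geological Survey, Changsha, Hunan 410600, China
  2. Xi’an Mineral Resources Survey Center, China Geological Survey, Xi’an, Shaanxi 710000, China
  • Received: 2021-09-02 Revised: 2021-10-18 Online: 2022-06-10 Published: 2022-07-19
  • Corresponding author: NIE Xiaoli
  • About the authors: NIE Xiaoli, male, born in 1991, assistant engineer, trained in geophysics, works on environmental geochemistry. Email: 404200714@qq.com
    DAI Liangliang, male, born in 1993, assistant engineer, trained in geochemistry, works on environmental geochemistry. Email: 416396230@qq.com
  • Funding:
    China Geological Survey geological survey project "Geochemical Survey of Land Quality in the Western Hunan Area" (DD20211576)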



Abstract:

A large volume of surface-soil geochemical data has been acquired in recent years through land-quality geochemical surveys. These data have an obvious gap, however: the large-scale (1∶50 000) surface-soil datasets often lack ore-forming elements. Because the content of ore-forming elements in soil is an important indicator in mineral exploration, this study attempts to fill in the ore-forming element content of large-scale surface soil from existing data. Taking the rare metal rubidium (Rb) as an example, 2,548 sets of small-scale (1∶250 000) surface-soil data from the same area were randomly split in a ratio of 8∶2, with 80% of the data used to train a random forest model and 20% used to validate it. A combination of variable-importance ranking and learning-curve construction was used to select eight elements (K, B, Ni, V, Zn, As, Co, Cu) as predictors. The goodness of fit (R²) of the model reached 0.983 2 on the training data and 0.895 6 on the test data, indicating that the predictor-selection method is effective. The values of these predictors in the 1∶50 000 surface-soil data were then fed into the model to obtain predicted Rb contents, and the predictions are broadly consistent with actual characteristics. This study indicates that it is feasible to apply the random forest machine-learning algorithm to the quantitative spatial prediction of geochemical element contents in surface soil, which can further broaden the service applications of land-quality geochemical data.

Key words: machine learning, random forest, surface soil, prediction of ore-forming elements
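
The workflow summarized in the abstract (a random 8∶2 split, random forest regression, and R² reported on both subsets) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the three predictor columns and the target are invented stand-ins, every function name is hypothetical, and each tree node tests only a few randomly sampled thresholds instead of searching all split points. Predictor selection by importance ranking and learning curves is not covered.

```python
import random

def mean(vals):
    vals = list(vals)
    return sum(vals) / len(vals)

def r2_score(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def split_8_2(rows, rng):
    """Shuffle a copy of the rows and return (train, test) in an 8:2 ratio."""
    rows = rows[:]
    rng.shuffle(rows)
    cut = int(0.8 * len(rows))
    return rows[:cut], rows[cut:]

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((v - m) ** 2 for v in ys)

def grow_tree(rows, rng, depth=0, max_depth=6, min_leaf=5, n_try=2):
    """Grow one regression tree; rows are (features, target) pairs."""
    ys = [y for _, y in rows]
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return mean(ys)                                   # leaf: mean target
    best = None
    n_feat = len(rows[0][0])
    for f in rng.sample(range(n_feat), n_try):            # random feature subset
        for x, _ in rng.sample(rows, min(12, len(rows))): # sampled thresholds (simplification)
            thr = x[f]
            left = [y for xs, y in rows if xs[f] <= thr]
            right = [y for xs, y in rows if xs[f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, f, thr)
    if best is None:
        return mean(ys)
    _, f, thr = best
    lo = [r for r in rows if r[0][f] <= thr]
    hi = [r for r in rows if r[0][f] > thr]
    kw = dict(depth=depth + 1, max_depth=max_depth, min_leaf=min_leaf, n_try=n_try)
    return (f, thr, grow_tree(lo, rng, **kw), grow_tree(hi, rng, **kw))

def predict_tree(node, x):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are floats."""
    while isinstance(node, tuple):
        f, thr, lo, hi = node
        node = lo if x[f] <= thr else hi
    return node

def fit_forest(rows, rng, n_trees=20):
    """Bagging: each tree is grown on a bootstrap resample of the training rows."""
    return [grow_tree([rng.choice(rows) for _ in rows], rng) for _ in range(n_trees)]

def predict_forest(forest, x):
    """Forest prediction is the average of the individual tree predictions."""
    return mean(predict_tree(tree, x) for tree in forest)

rng = random.Random(0)

# Synthetic stand-in data (NOT the survey data): three invented predictor
# columns and an invented target that depends mainly on the first column.
rows = []
for _ in range(500):
    x = [rng.uniform(0, 5), rng.uniform(0, 100), rng.uniform(0, 60)]
    y = 40.0 * x[0] + 0.3 * x[2] + rng.gauss(0, 5)
    rows.append((x, y))

train, test = split_8_2(rows, rng)
forest = fit_forest(train, rng)

r2_train = r2_score([y for _, y in train], [predict_forest(forest, x) for x, _ in train])
r2_test = r2_score([y for _, y in test], [predict_forest(forest, x) for x, _ in test])
print("train/test sizes:", len(train), len(test))
print("R2 train:", round(r2_train, 3), "R2 test:", round(r2_test, 3))
```

As in the abstract, the training R² is usually higher than the test R², because the trees have already seen the training rows. In practice a library implementation such as scikit-learn's `RandomForestRegressor` would normally replace the hand-written trees; the sketch avoids dependencies so that each step stays visible.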
