
Moscow University Bulletin. Series 4. Geology


Applications of data science methods in petroleum geochemistry: current state

https://doi.org/10.55959/MSU0579-9406-4-2025-64-1-88-96

Abstract

This paper explores the relevance of applying Data Science methods in petroleum geochemistry. To investigate the topic, a methodology for searching, gathering, and analyzing scientific papers published in the last decade was developed and applied. The study reveals a growing interest in integrating Data Science methodology into petroleum geochemistry. The article also presents specific examples of the publications found; identifies key problems hindering widespread adoption of Data Science in geochemistry (the need to verify results, a shortage of qualified specialists, limited access to data, and negative sentiment towards new methods); and proposes promising directions for applying data science methods to the challenges of organic geochemistry (geological assistants, open-access geological and geochemical databases, and specialized digital toolkits and software).

About the Authors

Gleb A. Shevchenko
Lomonosov Moscow State University
Moscow, Russian Federation



Mariya A. Bolshakova
Lomonosov Moscow State University
Moscow, Russian Federation





Review

For citations:


Shevchenko G.A., Bolshakova M.A. Applications of data science methods in petroleum geochemistry: current state. Moscow University Bulletin. Series 4. Geology. 2025;64(1):88-96. (In Russ.) https://doi.org/10.55959/MSU0579-9406-4-2025-64-1-88-96



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 0579-9406 (Print)