For More Posts like this Visit GeoWedge
Reservoir facies determination is the most important job in oil and gas exploration which relies on the major of rock propertied. Fundamental properties of rocks are usually understood by their detailed description in the field (lithofacies analysis) and laboratory (petrofacies analysis). The facies (lithofacies and petrofacies) determination in most subsurface studies is impractical, due to lack of cores and cuttings. In such situations, where the wire line logs are the only data available, the logfacies or electrofacies are determined instead. Electrofacies can be determined by using unsupervised machine learning technique i.e. K-mean cluster analysis.
What is a Cluster?
A cluster is a collection of data objects which are similar (or related) to one another within the same group (i.e., cluster).
What is Cluster Analysis?
The purpose of cluster analysis is to discover important patterns in large datasets (Wagstaff 2012).
Cluster analysis is an unsupervised machine learning algorithm of grouping a large dataset into significant subgroups so that the data points in the same class have the same characteristics and different from another subgroup. It was first introduced by Queen (1966). It has several applications in various fields, for example, data mining (Fayyad et al. 1996), compression of data, and quantization of vector (Gersho and Gray 2012).
Cluster Analysis for rock typing or lithology interpretation?
Similar facies may have different log responses due to diverse factors that affect the logs. Since using statistical methods and procedures are mandatory, in the clustering procedure, data are grouped with a minimum distance and maximum homogeneity.
Distinct geological parameters can be related to a group of data, as log facies, which be used by geologists for further interpretation. For this calculation, all log readings are considered as “observations” and the used logs as the “values of the observations” (Tavakoli and Amini 2006).
K-mean cluster analysis is easy and simple to understand, and it is fast and robust to cluster large dataset. K-mean cluster analysis is a well-known clustering algorithm because of its easy implementation and efficiency (Nazeer and Sabestian 2009). K-mean clustering is an unsupervised learning algorithm, and the main aim of K-mean clustering is to partition n number of observations into K number of clusters. For the numerical dataset, the centre of each cluster is represented by the mean/centroid. In each cluster, every observation belongs to the nearest mean.
Mathematically, K-mean cluster analysis can be written as (Wang et al. 2012);
K-mean cluster analysis works in the following way:
1. We define some clusters (K) randomly.
2. K-mean cluster divides the data points into subsets and allocates the centroids to each subset or cluster.
3. K-mean computes the clusters again and assign the data points to their nearest centroids.
4. The second and third step is iterated again and again until the arithmetic means/centroids do not change any more ( Ali et al., 2020).
The cluster analysis method is used to perform the log facies classification based on the attempts to identify clusters of well logs responses with similar characteristics. This classification does not require any artificial subdivision of the data population but follows naturally based on the unique characteristics of well-log measurements, reflecting minerals and lithofacies within the logged interval (Al-Baldawi 2014).
Example:
Below is an example of crossplots and histograms between gamma ray log, porosity, and water saturation as generated by k-means cluster analysis for a specified reservoir (Image by Al-Baldawi, 2014)