Hierarchical clustering as a data analysis method for tour operators

In recent years, data processing and storage technologies have matured and become cheaper. As a result, many more variables are available for segmenting customers into groups. Whereas common sense was once enough to group customers, for example to distinguish loyal from one-time purchasers, tour operators now face far more options. This can make the segmentation process complicated and difficult to manage. So how can tour operators use data mining techniques to gain deeper insight into their customer base?

One option for segmenting a market into homogeneous groups is cluster analysis. By creating clusters of customers, tour operators can decide which communication channels will resonate with each group. Another benefit is that computer-based data mining techniques tend to be more reliable than segmenting markets by human observation alone.

One technique that can be applied is hierarchical cluster analysis. Here the customer data is displayed in a cluster tree (dendrogram), in which every group is split into two or more subgroups. Each cluster combines records with similar values, and each new ‘branch’ represents a finer level of grouping. The similarity between two branches can be read from the height at which they join: branches that join low in the dendrogram are more similar to each other than branches that only join near the top (StatisticsHowTo, 2016).

Cropped version of Plosone Phylogeny. ‘Figure 4’. May 17, 2014. Online image. Flickr. October 09, 2018. https://www.flickr.com/photos/123621741@N08/14126323921/
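To make the idea of reading similarity from branch heights concrete, here is a minimal sketch in plain Python. The five customers and their two variables are invented for illustration, and single linkage (the distance between the closest members of two clusters) is only one of several possible ways to measure how far clusters are apart. In practice a statistics package or a library such as SciPy's scipy.cluster.hierarchy would normally do this work and draw the dendrogram.

```python
from itertools import combinations
from math import dist

# Hypothetical customers: (bookings per year, average spend in hundreds of EUR)
customers = {
    "A": (1, 5),
    "B": (1, 6),
    "C": (6, 20),
    "D": (7, 21),
    "E": (7, 30),
}

def single_linkage(points):
    """Repeatedly merge the two closest clusters.

    Returns one (members, height) pair per merge, where height is the
    distance at which the two branches join in the dendrogram.
    """
    def gap(pair):
        # Distance between two clusters = distance between their closest members.
        a, b = pair
        return min(dist(points[p], points[q]) for p in a for q in b)

    clusters = [frozenset([name]) for name in points]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=gap)
        merges.append((sorted(a | b), round(gap((a, b)), 2)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

merges = single_linkage(customers)
for members, height in merges:
    print(members, height)
```

In this made-up data, customers A and B join at a very low height because they are almost identical, while the two larger groups only join at the very end, at a much greater height.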

At the top of the dendrogram, all the data is combined in one large cluster, which is then split into two smaller clusters. Each record from the parent cluster is assigned to one of the two new clusters according to its similarity to that group. This process is repeated until the clusters cannot be split into smaller entities. With this approach, tour operators can not only see which customers are likely to behave or respond similarly to their marketing efforts, but also distinguish the differences between the clusters more easily.
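The top-down splitting described here can be sketched in a few lines of plain Python. This is a simplified heuristic, not an established divisive algorithm: the two most dissimilar customers in a cluster become the seeds of the two new clusters, and every other customer joins the seed it is closer to. The customer data and the stopping threshold max_diameter are assumptions made up for the example.

```python
from itertools import combinations
from math import dist

# Hypothetical customers: (bookings per year, average spend in hundreds of EUR)
customers = {
    "A": (1, 5), "B": (1, 6), "C": (6, 20), "D": (7, 21), "E": (7, 30),
}

def split(names, points, max_diameter, depth=0, tree=None):
    """Recursively split a cluster in two until it is compact enough.

    Returns a list of (depth, members) pairs in the order the clusters
    are visited, so the output can be printed as an indented tree.
    """
    tree = [] if tree is None else tree
    tree.append((depth, sorted(names)))
    if len(names) < 2:
        return tree
    # The two most dissimilar members become the seeds of the new clusters.
    seed_a, seed_b = max(
        combinations(sorted(names), 2),
        key=lambda pair: dist(points[pair[0]], points[pair[1]]),
    )
    if dist(points[seed_a], points[seed_b]) <= max_diameter:
        return tree  # members are similar enough, so stop splitting
    # Every member joins the seed it is closer to.
    left = [n for n in names
            if dist(points[n], points[seed_a]) <= dist(points[n], points[seed_b])]
    right = [n for n in names if n not in left]
    split(left, points, max_diameter, depth + 1, tree)
    split(right, points, max_diameter, depth + 1, tree)
    return tree

tree = split(list(customers), customers, max_diameter=2.0)
for depth, members in tree:
    print("  " * depth + ", ".join(members))
```

The indentation of the printed output mirrors the levels of the dendrogram: the full customer base at the top, and smaller, more uniform groups further down.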

In the case of tour operators, the top of the dendrogram would contain all the information available in their customer database. In the first split, this data could be grouped into one-time customers and returning clients. These clusters could then be split further, for example by demographics or by the type of tour booked. This process can be repeated until each cluster is distinct from the others and internally homogeneous.
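As a rough illustration of such a tree for a tour operator, the sketch below groups a handful of invented customer records first by loyalty and then by tour type. Note that it applies two fixed rules rather than a distance-based clustering algorithm, so it only mirrors the shape of the hierarchy. The field names (bookings, tour) and all values are hypothetical.

```python
from collections import defaultdict

# Hypothetical customer records; field names and values are illustrative only.
customers = [
    {"id": 1, "bookings": 1, "tour": "city trip"},
    {"id": 2, "bookings": 4, "tour": "city trip"},
    {"id": 3, "bookings": 1, "tour": "beach"},
    {"id": 4, "bookings": 3, "tour": "beach"},
    {"id": 5, "bookings": 1, "tour": "city trip"},
]

def loyalty(customer):
    """First-level split: one-time customers versus returning clients."""
    return "one-time" if customer["bookings"] == 1 else "returning"

# Two-level tree: loyalty group -> tour type -> customer ids.
segments = defaultdict(lambda: defaultdict(list))
for c in customers:
    segments[loyalty(c)][c["tour"]].append(c["id"])

for group, by_tour in segments.items():
    print(group)
    for tour, ids in by_tour.items():
        print("  ", tour, ids)
```

Each leaf of this small tree is a segment that could receive its own message, for example one-time city trip customers versus returning beach clients.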

To conclude, tour operators should group their customer data using cluster analysis. The hierarchical method with dendrograms arranges the clustered data neatly, so that tourism marketers can easily recognise which customers belong to which segment. With the knowledge gained from the hierarchical cluster analysis, tour operators can then decide how to build a marketing mix that each segment will respond to.

Works cited

Peelen, E., & Beltman, R. (2013). Customer Relationship Management. Pearson.

Plosone Phylogeny. (2014, May 17). Figure 4 [Online image]. Retrieved October 9, 2018, from Flickr: https://www.flickr.com/photos/123621741@N08/14126323921/

StatisticsHowTo. (2016, November 15). Hierarchical Clustering / Dendrogram. Retrieved from Statistics How To: https://www.statisticshowto.datasciencecentral.com/hierarchical-clustering/



3 thoughts on “Hierarchical clustering as a data analysis method for tour operators”

    1. Fiona Froehlich

      Thank you for your feedback! I have now included a picture to show what the clustered data and the degrees of similarity can look like.


      1. I guess Carla meant a picture with examples of clusters in it. It is a theoretically strong article, but it is quite difficult for readers who are not familiar with data mining or tourism to see what kind of clusters a TO could detect.

