I am using hierarchical clustering to analyze time series data. My code is implemented using the Mathematica function DirectAgglomerate[...], which generates hierarchical clusters given the following inputs:
a distance matrix D
the name of the method used to determine inter-cluster linkage.
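I have calculated the distance matrix D using the Manhattan distance between each pair of time series $x$ and $y$:

$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$

where $i = 1, \ldots, n$ and $n$ is the number of data points in my time series.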
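To make the construction of D concrete, here is a minimal sketch in Python (not my actual Mathematica code). The function names and the toy series are hypothetical, chosen only for illustration; it assumes all series have the same length.

```python
def manhattan(x, y):
    """Manhattan (L1) distance between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


def distance_matrix(series):
    """Symmetric matrix of pairwise Manhattan distances."""
    m = len(series)
    D = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            # The distance is symmetric, so fill both triangles at once.
            D[i][j] = D[j][i] = manhattan(series[i], series[j])
    return D


# Hypothetical toy data: three series with n = 4 points each.
series = [[1, 2, 3, 4], [2, 2, 2, 2], [4, 3, 2, 1]]
D = distance_matrix(series)
```

A matrix like `D` is what gets passed to the clustering routine in place of the raw observations.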
My question is: is it OK to use Ward's inter-cluster linkage with a Manhattan distance matrix? Some sources suggest that Ward's linkage should only be used with Euclidean distance.
Note that DirectAgglomerate[...] calculates Ward's linkage using the distance matrix only, not the original observations. Unfortunately, I am unsure how Mathematica modifies Ward's original algorithm, which (from my understanding) worked by minimizing the error sum of squares of the observations, calculated with respect to the cluster mean. For example, for a cluster consisting of a vector of $m$ univariate observations $x_1, \ldots, x_m$, Ward formulated the error sum of squares as:

$$\mathrm{ESS} = \sum_{j=1}^{m} x_j^2 - \frac{1}{m}\left(\sum_{j=1}^{m} x_j\right)^2$$
(Other software tools such as Matlab and R also implement Ward's clustering using just a distance matrix so the question isn't specific to Mathematica.)
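To illustrate why I am worried, here is a small Python sketch. It relies on a standard identity: the error sum of squares about the cluster mean equals the sum of squared pairwise Euclidean distances divided by the cluster size, which is presumably how a distance-matrix-only implementation can recover Ward's criterion. I do not know whether Mathematica, Matlab, or R actually compute it this way; the function names and the toy cluster below are hypothetical.

```python
def ess_from_observations(points):
    """Ward's error sum of squares: squared deviations from the cluster mean."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    return sum((p[k] - mean[k]) ** 2 for p in points for k in range(dim))


def ess_from_distances(points, dist):
    """Sum of squared pairwise distances divided by the cluster size."""
    n = len(points)
    total = sum(dist(points[i], points[j]) ** 2
                for i in range(n) for j in range(i + 1, n))
    return total / n


def euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


# Hypothetical toy cluster of three 2-D observations.
cluster = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]

ess_obs = ess_from_observations(cluster)          # computed from raw data
ess_euc = ess_from_distances(cluster, euclidean)  # matches ess_obs
ess_man = ess_from_distances(cluster, manhattan)  # generally does not match
```

For this toy cluster the Euclidean version reproduces the observation-based value, while the Manhattan version gives a different number. So with Manhattan distances the quantity being minimized is no longer the error sum of squares about the cluster mean, and my question is whether the resulting clustering is still meaningful.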