Meaning of imbalanceTolerance
Posted: Wed Aug 24, 2022 8:42 am
One of my customers asked me about the handling of "imbalanceTolerance". He experimented with different combinations of imbalanceTolerance and ignoreImbalanceTolerance. One combination produced a good solution, in the sense that the total activities of all assigned territories have almost equal values, while the other combinations all seem to produce the same, poor level of balancing.
His question was:
"I am getting the same response every time with the combinations below:
1. imbalanceTolerance=100, 50 and 10 with ignoreImbalanceTolerance=false
2. ignoreImbalanceTolerance=true
But I am getting a different response with the combination imbalanceTolerance=0 and ignoreImbalanceTolerance=false.
Please let me know why the same response is generated for the combinations mentioned above. I am expecting different responses for different tolerance values."
Let me explain this... As you might know, the clustering tries to satisfy two conflicting goals:
- create geographically compact territories
- assign each territory an equal workload
So if you activate a specific tolerance T% (e.g. 5%), we determine the target activity and derive thresholds for a potential solution:
- TargetActivity: the average activity, which satisfies the second goal perfectly. It is simply the sum of all activities divided by the number of output clusters. E.g. if the total activity is 10'000 and you want to create 5 clusters, TargetActivity is 2'000.
- MinActivity := TargetActivity * (1 - T%), e.g. 2'000 * 0.95 = 1'900
- MaxActivity := TargetActivity * (1 + T%), e.g. 2'000 * 1.05 = 2'100
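The threshold arithmetic above can be written as a few lines of Python. This is only a sketch of the formulas as stated in this post; the function name and its parameters are made up for illustration and are not part of any API.

```python
def tolerance_thresholds(total_activity, cluster_count, tolerance_percent):
    """Return (target, min_activity, max_activity) for a tolerance given in percent."""
    # TargetActivity: total activity spread evenly over all output clusters.
    target = total_activity / cluster_count
    fraction = tolerance_percent / 100
    # MinActivity and MaxActivity as defined above.
    return target, target * (1 - fraction), target * (1 + fraction)


# The worked example from above: total activity 10'000, 5 clusters, T = 5%.
# Expect values of about 2'000, 1'900 and 2'100 (up to floating-point rounding).
target, min_activity, max_activity = tolerance_thresholds(10_000, 5, 5)
print(target, min_activity, max_activity)
```

A cluster whose total activity lies between min_activity and max_activity is within the tolerance.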
Now the next question is: why do the various tolerances in the customer's example return the same solution?
For this you need to understand the iteration under the hood. It starts with a trivial assignment: each customer is simply assigned to the closest cluster center. This is how we create a kickoff "condition", which we then try to improve only if it violates the imbalanceTolerance.
- In the customer's example, the trivial assignment satisfies all of the imbalance tolerances [100%, 50%, 10%], so there is no need to improve it and every run returns the same result.
- With imbalanceTolerance set to 0%, the trivial assignment is no longer sufficient, so it has to be improved, and the result changes.
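To make this concrete, here is a small Python sketch of a trivial assignment followed by the tolerance check. It is not the actual clustering implementation: the function names, the straight-line distance and the sample data are all assumptions chosen for illustration.

```python
import math


def trivial_assignment(customers, centers):
    """Assign each customer (x, y, activity) to the index of the closest center (x, y)."""
    assignment = []
    for x, y, _activity in customers:
        distances = [math.hypot(x - cx, y - cy) for cx, cy in centers]
        assignment.append(distances.index(min(distances)))
    return assignment


def satisfies_tolerance(customers, assignment, cluster_count, tolerance_percent):
    """Check whether every cluster total lies within target * (1 -/+ T%)."""
    totals = [0.0] * cluster_count
    for (_x, _y, activity), cluster in zip(customers, assignment):
        totals[cluster] += activity
    target = sum(totals) / cluster_count
    low = target * (1 - tolerance_percent / 100)
    high = target * (1 + tolerance_percent / 100)
    return all(low <= total <= high for total in totals)


# Made-up data: the trivial assignment yields cluster totals of 950 and 1050,
# i.e. a 5% deviation from the target of 1000.
customers = [(0, 0, 500), (1, 0, 450), (9, 0, 550), (10, 0, 500)]
centers = [(0, 0), (10, 0)]
assignment = trivial_assignment(customers, centers)

for tolerance in (100, 50, 10, 0):
    # Only when this check fails would the starting solution need to be improved.
    print(tolerance, satisfies_tolerance(customers, assignment, len(centers), tolerance))
```

In this made-up data set the same starting solution passes the check for 100%, 50% and 10%, and only 0% rejects it, which mirrors the behaviour the customer observed.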
Bernd
Appendix:
There is no single call that does "find me the best tolerance" in one step. If you want this, you can implement client-side logic based on one of the following approaches:
Strategy 1: start with a low imbalance tolerance (e.g. 0). As long as this produces "no valid solution found", increase it step by step.
Strategy 2: start with a trivial (very loose) tolerance and reduce it until you run into "no valid solution found". The last successful solution is the one you are looking for.
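Both strategies could look roughly like this on the client side. This is a sketch under assumptions: `plan` is a hypothetical callable that you would write yourself to send the clustering request with the given imbalanceTolerance and return None when the answer is "no valid solution found". The step size and limits are arbitrary.

```python
def lowest_feasible_tolerance(plan, start=0, step=5, limit=100):
    """Strategy 1: raise the tolerance until `plan` returns a solution."""
    tolerance = start
    while tolerance <= limit:
        solution = plan(tolerance)
        if solution is not None:
            return tolerance, solution
        tolerance += step
    return None  # nothing feasible up to `limit`


def tightest_tolerance_from_above(plan, start=100, step=5):
    """Strategy 2: lower the tolerance until `plan` fails; keep the last success."""
    best = None
    tolerance = start
    while tolerance >= 0:
        solution = plan(tolerance)
        if solution is None:
            break
        best = (tolerance, solution)
        tolerance -= step
    return best


# Stand-in for the real request (hypothetical): feasible only from 15% upwards.
def fake_plan(tolerance):
    return f"solution@{tolerance}" if tolerance >= 15 else None


print(lowest_feasible_tolerance(fake_plan))
print(tightest_tolerance_from_above(fake_plan))
```

Strategy 1 usually needs fewer calls when a tight tolerance is feasible, while Strategy 2 always has a valid solution in hand after the first successful call.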