A mathematical programming approach for resource allocation of data analysis workflows on heterogeneous clusters