Structured Clustering using Integer Programming
Algebra & Discrete Mathematics
Anna Seigal, UC Berkeley
Wed, Apr 20 2016, 5:10PM
For biological applications, it is often important to cluster data based on some observable features while simultaneously imposing structural restrictions on which combinations of data items can be in each cluster. Here we look at the example of breast cancer cell lines exposed to a range of signaling molecules. We show how integer programming, using a tensor format, can be used to transform an initial clustering assignment that violates the structural restrictions into a provably closest clustering that respects them. This enables interpretable groupings to be found from the biological information. The method can also be adapted to cluster the data from the outset via integer programming, finding the optimal clustering with respect to some cost function.
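
To make the idea of a "closest clustering that respects the restrictions" concrete, here is a minimal toy sketch in Python. It is not the speaker's formulation: the 2 x 3 grid of (cell line, signaling molecule) items, the restriction that each cluster must be a full "rectangle" (a set of rows times a set of columns), and the function names are all assumptions made for illustration. Exhaustive enumeration of the 0-1 assignments stands in for an integer programming solver, which is only viable at this tiny size.

```python
from itertools import product

# Hypothetical toy sizes: 2 cell lines (rows), 3 molecules (columns), 2 clusters.
ROWS, COLS, K = 2, 3, 2
ITEMS = [(i, j) for i in range(ROWS) for j in range(COLS)]


def is_feasible(labels):
    """Assumed restriction: every cluster is a product of a row set and a column set."""
    for k in range(K):
        members = [item for item, lab in zip(ITEMS, labels) if lab == k]
        rows = {i for i, _ in members}
        cols = {j for _, j in members}
        # A set of grid cells is a rectangle exactly when it fills rows x cols.
        if len(members) != len(rows) * len(cols):
            return False
    return True


def closest_feasible(initial):
    """Search all labelings; keep a feasible one that relabels the fewest items."""
    best, best_dist = None, None
    for labels in product(range(K), repeat=len(ITEMS)):
        if not is_feasible(labels):
            continue
        dist = sum(a != b for a, b in zip(labels, initial))
        if best_dist is None or dist < best_dist:
            best, best_dist = labels, dist
    return best, best_dist


# Initial clustering, row by row; cluster 0 is L-shaped, so it is infeasible.
initial = (0, 0, 1,
           0, 1, 1)
best, dist = closest_feasible(initial)
print("feasible clustering:", best, "items relabeled:", dist)
```

Because the search covers every assignment, the returned distance is a true minimum for this toy instance, though the minimizer need not be unique. An integer program would encode the same binary assignment variables and feasibility conditions as linear constraints and leave the search to a solver.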