Seminars

CGB Roundtable Details


Geoffrey Fox

Robust High Performance Optimization for Clustering, Multi-Dimensional Scaling and Mixture Models


January 22, 2008 at 12:00 PM
Myers 209

Description:

http://www.infomall.org/salsa

We first review the pros and cons of various approaches to nonlinear optimization in the presence of local minima, ill-conditioned matrices, and an ambiguous choice of the appropriate number of degrees of freedom (over- and under-fitting). We define constraints on these approaches that arise from the need to run well in parallel on systems of multicore CPUs. We present a uniform approach to data clustering and Gaussian mixture modeling that uses deterministic (not Monte Carlo) annealing to mitigate the local minima problem and naturally relates the appropriate number of parameters (clusters or mixture components) to the scale at which the problem is examined. New clusters (mixture components) are introduced at phase transitions, which occur as the annealing temperature is lowered and the second-derivative matrix becomes singular. We contrast three ways of visualizing this structure in low (2) dimensions: Principal Component Analysis (PCA), Generative Topographic Mapping (GTM), and Multi-Dimensional Scaling (MDS), using annealing to regularize GTM and MDS.
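
As a rough illustration of the annealing idea in the abstract, the following is a minimal sketch of deterministic annealing clustering, not the speaker's implementation. It makes several assumptions that are not in the abstract: soft assignments proportional to exp(-distance^2 / T), a geometric cooling schedule, and a fixed number of centers that all start at the data mean and are nudged slightly at each temperature so they can separate once the temperature is low enough. The talk describes introducing new clusters at phase transitions; this sketch only lets pre-allocated centers split. All names (soft_assign, da_cluster, DEMO_POINTS) are hypothetical.

```python
import math
import random


def soft_assign(point, centers, temperature):
    """Return soft membership probabilities p(k | point) at this temperature."""
    d2 = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centers]
    d_min = min(d2)  # subtract the minimum so the exponentials stay in range
    w = [math.exp(-(d - d_min) / temperature) for d in d2]
    total = sum(w)
    return [x / total for x in w]


def da_cluster(points, n_centers, t_start, t_stop,
               cooling=0.9, sweeps=30, seed=0):
    """Anneal from t_start down to t_stop and return the final centers.

    Assumed schedule: multiply the temperature by `cooling` after `sweeps`
    fixed-point updates at each temperature.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    mean = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    centers = [list(mean) for _ in range(n_centers)]  # all start at the mean
    t = t_start
    while t > t_stop:
        # Tiny nudge so coincident centers are able to separate.
        centers = [[c + 1e-3 * rng.uniform(-1.0, 1.0) for c in ctr]
                   for ctr in centers]
        for _ in range(sweeps):
            sums = [[0.0] * dim for _ in range(n_centers)]
            weights = [0.0] * n_centers
            for p in points:
                for k, pk in enumerate(soft_assign(p, centers, t)):
                    weights[k] += pk
                    for i in range(dim):
                        sums[k][i] += pk * p[i]
            # Each center moves to the probability-weighted mean of the data;
            # a center with zero total weight is left where it is.
            centers = [
                [s / weights[k] for s in sums[k]] if weights[k] > 0.0
                else centers[k]
                for k in range(n_centers)
            ]
        t *= cooling
    return centers


# Hypothetical demo data: two well-separated square clusters in 2D.
DEMO_POINTS = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
```

Calling da_cluster(DEMO_POINTS, 2, 400.0, 300.0) stops while the temperature is still high, so both centers should remain together near the overall mean; calling da_cluster(DEMO_POINTS, 2, 400.0, 0.5) cools far enough that the centers should separate, one settling on each group. In this sense the number of distinct clusters depends on the scale (temperature) at which the data are examined.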

We currently have preliminary implementations of deterministic annealing clustering and GTM that run well on multicore systems. We have applied these techniques to Geographical Information Systems (clustering demographic data in 2D) and to Cheminformatics in 1024 and lower dimensions. We would like to learn of other applications that can constrain and test these techniques.


