The Shape of Data


(T) Data has shape. Shape reveals patterns, and those patterns provide insight into the data. That is the premise behind Topological Data Analysis (TDA): it attempts to find the shape of a large, complex, multi-dimensional data set in order to extract valuable insights from it. While traditional machine learning relies on classical algebra to fit data to a limited set of shapes, as in regression or clustering, TDA aims to remove that limitation by modeling the data set as a network. Similar data points are grouped, and the groups become nodes that are connected to other nodes. The data set can be either structured or unstructured, and it can be compressed while still preserving its features, so search and analysis can be performed on the compressed data set. That is how TDA works.
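To make the "groups become nodes" idea concrete, here is a minimal sketch in the spirit of the Mapper construction that is commonly associated with TDA. It is not the method presented in the talk or any product's implementation. The names mapper_graph, single_linkage_clusters, and lens, as well as the interval cover and the distance-threshold grouping, are my own illustrative assumptions.

```python
import math
from itertools import combinations


def single_linkage_clusters(indices, points, eps):
    """Group point indices into connected components, linking any two
    points closer than eps (a simple stand-in for a real clusterer)."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            near = {j for j in unvisited
                    if math.dist(points[current], points[j]) < eps}
            unvisited -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, lens, n_intervals=4, overlap=0.25, eps=0.5):
    """Build a small network: nodes are groups of similar points,
    edges join groups that share at least one point."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    pad = width * overlap  # hypothetical overlap setting between slices
    nodes = []
    for k in range(n_intervals):
        start = lo + k * width - pad
        end = lo + (k + 1) * width + pad
        members = [i for i, v in enumerate(values) if start <= v <= end]
        nodes.extend(single_linkage_clusters(members, points, eps))
    edges = {(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]}
    return nodes, edges


# Toy data: 40 points on a circle, viewed through their x-coordinate.
circle = [(math.cos(2 * math.pi * i / 40), math.sin(2 * math.pi * i / 40))
          for i in range(40)]
nodes, edges = mapper_graph(circle, lens=lambda p: p[0], eps=0.3)
print(len(nodes), "nodes,", len(edges), "edges")
```

With these settings, the 40 points should compress into a handful of nodes connected in a single loop, which is the "shape" of a circle. The compressed network keeps the feature that matters (the loop) while discarding the individual coordinates.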

This is my simplified summary of the talk that Professor Gunnar Carlsson, of Stanford University and founder of Ayasdi, gave this week at an SF Bay ACM Chapter meet-up. Professor Carlsson is one of the leading authorities on TDA, having researched it extensively for DARPA while teaching at Stanford.

For a deeper dive into TDA, I recommend reading the paper “Extracting insights from the shape of complex data using topology”, published in Scientific Reports, a Nature journal.

In addition, Wikipedia has a good article on TDA with an elegant summary of the mathematical background, and Professor Carlsson has a blog at Ayasdi that focuses on the basic concepts of TDA.

Since this week's talk was not recorded, here is a past lecture on TDA by Professor Carlsson:

Note: The picture above is “Sofa Living Sculpture” by Verner Panton, from the Centre Georges Pompidou in Paris.

Copyright © 2005-2016 by Serge-Paul Carrasco. All rights reserved.