Complexity Explorer Santa Fe Institute

Simon DeDeo talks about his "MaxEnt" tutorial

10 Jun 2015

In this post we interview Simon DeDeo, the instructor for our new Mathematics tutorial on “Maximum Entropy Methods”. Simon is an Assistant Professor in Indiana University’s School of Informatics and Computing and an External Professor at the Santa Fe Institute. He is affiliated with the Center for Complex Networks and Systems Research and also with Indiana University’s Cognitive Science Program. We asked Simon to tell us a little bit more about what Maximum Entropy Methods are good for.

Q: Can you give us a short description of what MaxEnt is about?

A: MaxEnt is a set of methods that came out of the physical sciences. Imagine a system with a huge number of particles, interacting with each other in reasonably simple ways and satisfying certain properties -- informally, "sufficiently jumbled-up". We’d like to predict something about such a system. However, tracking all of the particles and their interactions is impossible. It turns out that we can actually predict a great deal of what we’re interested in using the techniques taught in the MaxEnt tutorial.

In MaxEnt, we look for a probability distribution that best describes a set of data, given some constraints related to features of this data. For example, we might want our probability distribution to have the same mean or variance as what we've observed in the past, or perhaps the same pairwise correlations. There are typically an infinite number of possible distributions with the same mean, variance, or correlations, so which one do we choose to model our system? The philosophy of MaxEnt is to choose the distribution with maximum entropy, subject to those desired constraints. Such a distribution makes the fewest assumptions about the data beyond the constraints themselves, and can, among other things, reduce overfitting.
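Editor's illustration (not part of Simon's answer, and not taken from the tutorial): a minimal sketch of the idea above, assuming a six-sided die whose only known property is a long-run average roll of 4.5. Under a mean constraint, the maximum entropy distribution has the form p_i proportional to exp(lam * x_i), so the sketch simply searches for the value of lam that reproduces the target mean. The die, the target value 4.5, and the function names are all hypothetical choices made for this example.

```python
import math


def tilted_distribution(values, lam):
    """Return probabilities p_i proportional to exp(lam * x_i)."""
    exponents = [lam * x for x in values]
    shift = max(exponents)  # subtract the max for numerical stability
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def maxent_distribution(values, target_mean, tol=1e-12):
    """Maximum entropy distribution over `values` with a fixed mean.

    The mean of the tilted distribution increases with lam, so a simple
    bisection finds the lam that matches `target_mean`. Assumes
    min(values) < target_mean < max(values).
    """
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        p = tilted_distribution(values, mid)
        mean = sum(x * q for x, q in zip(values, p))
        if mean < target_mean:
            lo = mid
        else:
            hi = mid
    return tilted_distribution(values, (lo + hi) / 2)


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(q * math.log(q) for q in p if q > 0)


faces = [1, 2, 3, 4, 5, 6]
p = maxent_distribution(faces, 4.5)

# Another distribution with the same mean (half 4s, half 5s), for comparison.
rival = [0.0, 0.0, 0.0, 0.5, 0.5, 0.0]

print("MaxEnt probabilities:", [round(q, 4) for q in p])
print("Entropy of MaxEnt solution:", entropy(p))
print("Entropy of rival with same mean:", entropy(rival))
```

The rival distribution also averages 4.5, but it quietly assumes that the faces 1, 2, 3, and 6 never come up. The maximum entropy solution commits to nothing beyond the mean, which is why its entropy is higher.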

Although it originally came out of physics, MaxEnt is now being used in many other fields. A notable example is the field of ecology. John Harte, one of the driving forces behind this research, noted that several quantities of interest in ecology could be modeled using MaxEnt methods.  These quantities include species diversity, species number, energy flow through the system, total metabolic rate per individual and species, among others.  Looking at the empirical behavior of these quantities, Harte observed that they could be described as a system in a maximum entropy state subject to certain constraints. Those constraints include the total amount of energy in the system, for example. Results like these are exciting because we can describe something that is as complex as an ecological system using only a set of simple principles. 

In the tutorial, we build on insights from Harte to reason about a human social system -- the world of the open source movement, and how programmers and geeks write code, contribute to projects, and pick the languages they write in.

Q: Why is it important for people interested in complex systems to know something about MaxEnt?

A: There’s a distinction between systems that are complex and systems that are complicated.  What MaxEnt does is allow you to disregard what is only “complicated” in the system.  That’s the real secret of MaxEnt. 

If you’re able to come up with a satisfying account of the system using MaxEnt principles, it is implicitly telling you that a bunch of stuff doesn’t matter. It looks like it’s meaningful but it’s not. Bill Bialek at Princeton, for example, showed that the brain, with all its complications and patterns of firing neurons, has emergent motifs that can be explained by a MaxEnt model that constrains only the pairwise correlations of co-firing neurons. It's a suggestive account of where the information processing is taking place: functions that take only two arguments at a time!

MaxEnt helps to focus on the aspects of a system that are complex, separating them from the merely complicated. For example, certain crucial features of an ecosystem, like its robustness, recovery from shock, or response to degradation, may be independent of the color of a rabbit's fur, or the shape of a tree's leaves.

Q: Any advice for Complexity Explorer students taking this tutorial?

A: Try to apply these methods in your research or your own life. I try to give examples in the lectures that are relatable -- taxi cab waiting times, tossing coins, writing code. Apply these things in your life: driving to work, waiting for a friend, sitting in (someone else's) lecture or meeting. It’s hard and you may get it wrong, but it’s a fantastic mental exercise, and the most efficient way to multiply your understanding, memory, and ability to use these techniques.

 

Simon’s tutorial can be found at this link. The prerequisite for this tutorial is calculus.

