MetroMaps: Map-based Representations for Analyzing Optimization Solution Spaces
Problem Statement and Goals
Understanding the solution space plays an important role in a wide range of optimization applications. For example, finding solutions to many urgent problems (e.g., pollution and global warning) requires novel insights into chemical systems and processes. It is possible to gain these insights through computational modeling and analysis of the system. One characteristic, common in the analysis of many existing models, is the energy function of coordinates of components of a chemical system. Analyzing this energy function (a specific instance of a cost function in optimization problems), defined on a high-dimensional domain, may improve our understanding of chemical systems and reactions.
There is a lack of effective visualization techniques for these multi-dimensional functions, and chemists often study only one or two parameters of interest at a time. Developing a comprehensive understanding of these complex systems requires new visualization techniques that show relationships between all parameters describing the system at the same time.
Implementation and Results
Our approach utilizes the fact that the Morse complex contains the information required to understand state transitions in these systems. Each stable state of the chemical system corresponds to a minimum in the energy function. The Morse complex partitions the domain into regions such that following the gradient from any point in the region will end at the minimum. For an energy function these segmented regions correspond to classes of configurations associated with a stable state. Each descending region can adjoin several other regions. The lowest function value along the boundaries between two adjoining regions, also called the saddle, corresponds to the barrier between the states. Since minima and saddles in the Morse complex correspond to energy minima and barriers, chemical transformation paths are given by the edges connecting two minima through a saddle. Assigning minima and saddles to nodes and connecting pathways as edges creates the Morse complex graph.
However, since the graph itself is an abstract structure, we have to find a proper layout to visualize it. Although standard graph-layout algorithms may produce visually pleasing visualizations of the Morse complex graph, this representation does not consider relational information from the original system, which is important for chemical analysis due to the impact of distances on the probability of certain transformations. To preserve this relational context, we project the high-dimensional point locations of the graph onto the plane using principal component analysis (PCA).
Once we have obtained final positions for all nodes of the graph, we add chemical information from the model system. Since the most important information is the energy difference between a minimum and a barrier, we color edges according to the energy difference value. The Boltzmann distribution states that the probability of accessing a state decreases exponentially with its energy. The lower the energy difference is, the higher the probability of a transformation. Hence, we display paths with smaller energy differences more prominently (darker colors and wider edges). Periodic edges, i.e., edges crossing a boundary along which the problem is periodic, are dashed, and the common saddle of the two connected minima is broken into two nodes (see Figure 1. Finally, we add labels to all minima that specify coordinates (first line) and energy values (second line).
The free energy function of CH4 in an LTA zeolite is substantially different from the energy functions of the dimer of formic acid and acetic acid, see Figure 2 (left) and (right). This graph represents only the large cage of LTA zeolite, which is the only fragment of void space in LTA accessible to CH4. The graph representation of the free energy reveals important information about the material. There are 14 important energy minima per periodic unit cell of LTA corresponding to favorable locations of the adsorbing CH4 molecule (adsorption sites). Six of them correspond to lower energy (ca. 3.7 kBT), and are localized near windows connecting two periodic cells (near faces of the unit cell). The remaining eight minima are localized further away from the windows, on the surface of the large cage of LTA. The further analysis of connections between nodes/minima in our representations reveals that all 14 minima localized within the big cage are separated by low barriers, and therefore movements of the CH4 molecule between the adsorption sites are feasible. However, connections between large cages in the extended material lead through high barriers. These high barriers along diffusion paths in every direction are reflected in slower diffusion rates.
The final example is the graph representation of the 4D energy function of DFA depicted in Figure 3, for which plotting the original energy functions in all four dimensions was not feasible. The resulting graph is fairly easy to analyze. It clearly shows energy separation of the graph into two subgraphs: one corresponding to rotamers of DFA (with minima denoted A1–A9) and another with double proton transfer (minima denoted B1–B6). Transitions between minima within each subgraph are at low cost, as shown by heavy edges.
Our approach holds promise in allowing chemists to gain insight into complex chemical systems and understand higher dimensional energy functions. Since topological features of the Morse Complex correspond directly to concepts such as stable states, it provides an intuitive overview of a chemical system. In the long term, it will allow chemists to analyze higher dimensional systems all at once without having to look at pairs of parameters separately (see Figure 3 for a four-dimensional example). We also plan to adapt this representation to other optimization problems, such as providing an overview over the solution space in the auto-tuning of codes for high-performance architectures.
Kenes Beketayev (UC Davis), Peer-Timo Bremer (LLNL), Bernd Hamann (UC Davis), MarioHlawitschka (University of Leipzig).
Kenes Beketayev, Gunther H. Weber, Maciej Haranczyk, Peer-Timo Bremer, Mario Hlawitschka, and Bernd Hamann. Visualization of topology of transformation pathways in complex chemical systems. Computer Graphics Forum (Special Issue, Proceedings Eurographics/IEEE Symposium on Visualization), pages 663–672, May/June 2011. LBNL-5019E.
About Berkeley Lab
Founded in 1931 on the belief that the biggest scientific challenges are best addressed by teams, Lawrence Berkeley National Laboratory and its scientists have been recognized with 14 Nobel Prizes. Today, Berkeley Lab researchers develop sustainable energy and environmental solutions, create useful new materials, advance the frontiers of computing, and probe the mysteries of life, matter, and the universe. Scientists from around the world rely on the Lab’s facilities for their own discovery science. Berkeley Lab is a multiprogram national laboratory, managed by the University of California for the U.S. Department of Energy’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit energy.gov/science.