Algorithms for the Phase-Mapping Problem in Materials Science

Sebastian Ament (Computer Science, Cornell University)

Knowledge of materials has driven technological progress for millennia. This motivates the field of high-throughput materials discovery, which accelerates the discovery process by creating many compounds in a single experiment. The experiments are frequently analyzed with X-ray spectroscopy. The challenge in analyzing the resulting spectrograms is that multiple, potentially unknown, compounds can contribute to a single spectrogram. This creates a source separation problem, since we want to recover the spectrograms of the individual compounds. This talk will focus on two aspects of this problem: subtracting systematic background signals from the data, and handling non-linear phenomena in the data so that the source separation problem becomes amenable to non-negative matrix factorization. The background subtraction algorithm is based on a probabilistic matrix factorization with a non-Gaussian error model. Experimental results show that this approach separates out complex background signals while preserving the meaningful part of the data. The non-linear physical effects in the data can be handled by modeling the diffraction peaks explicitly. I will show that this provides several advantages over previous approaches. Most importantly, if every compound has a unique diffraction peak, the problem can be converted to a separable non-negative matrix factorization, which is solvable in polynomial time.
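
To illustrate why the unique-peak condition helps, here is a minimal sketch on synthetic data. It is not the algorithm presented in the talk: it uses the successive projection algorithm (SPA), a standard polynomial-time method for separable non-negative matrix factorization, as a stand-in. The matrix layout (rows are diffraction angles, columns are samples), the toy sizes, the peak positions, and the function name spa are all assumptions made for the illustration.

```python
import numpy as np

def spa(M, r):
    """Successive projection algorithm: pick r columns of M whose
    convex hull contains all other columns (assumes separability)."""
    R = M.astype(float).copy()
    picked = []
    for _ in range(r):
        norms = np.sum(R * R, axis=0)      # squared column norms
        j = int(np.argmax(norms))          # most extreme remaining column
        picked.append(j)
        u = R[:, j] / np.sqrt(norms[j])    # unit vector along that column
        R -= np.outer(u, u @ R)            # project it out of every column
    return picked

# Synthetic data (hypothetical sizes): X = W @ H, where W holds one pattern
# per compound and H holds each compound's activation in each sample.
rng = np.random.default_rng(0)
n_angles, n_samples, r = 40, 30, 3
W = rng.random((n_angles, r))
unique_peaks = [5, 17, 29]                 # assumed angle index of each unique peak
W[unique_peaks, :] = 0.0                   # at these angles only one compound...
W[unique_peaks, np.arange(r)] = 2.0        # ...contributes (compound k at peak k)
H = rng.random((r, n_samples))
X = W @ H

# Each row of X is a non-negative mix of the rows of H, and the unique-peak
# rows are scaled copies of single rows of H. Normalizing rows to sum to one
# turns the mixes into convex combinations, which is what SPA expects.
rows = X / X.sum(axis=1, keepdims=True)
found = sorted(spa(rows.T, r))
print(found)
```

On noiseless data like this, the selected row indices should coincide with the unique-peak angles. The activations H are then given by those rows of X up to scaling, and W follows from a non-negative least-squares fit. Real measurements are noisy, so a robust variant would be needed in practice.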