Filename: b315893d

Using real-valued multi-objective genetic algorithms to model molecular absorption spectra and Raman excitation profiles in solution

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa1

The empirical modeling of the absorption spectra and resonance Raman excitation profiles of a large molecule in solution requires adjustment of a minimum of dozens of parameters to fit several hundred data points.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac1

This is a difficult optimization problem because all of the observables depend on all of the parameters in a highly coupled and nonlinear manner.

Type: Motivation | Advantage: None | Novelty: None | ConceptID: Mot1

Standard nonlinear least-squares fitting methods are highly susceptible to becoming trapped in local minima in the error function unless very good initial guesses for the molecular parameters are made.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met1

Here, we demonstrate a method that employs a real-valued genetic algorithm to force a broad search through parameter space to determine the best-fit parameters.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa1

The multiobjective genetic algorithm is successful at inverting absorption spectra and Raman excitation profiles to determine molecular parameters.

Type: Conclusion | Advantage: None | Novelty: None | ConceptID: Con1

When vibronic structure is evident in the absorption profile, the algorithm returns nearly quantitative results.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res1

For broad, featureless profiles, the algorithm returns the correct slope of the excited state surface but cannot independently determine the excited-state frequency and the equilibrium geometry change.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res2

Compared with manual adjustment of parameters to obtain a best fit, the genetic algorithm is computationally less efficient but requires less human time.

Type: Conclusion | Advantage: None | Novelty: None | ConceptID: Con2

Introduction

Optical absorption spectra and resonance Raman excitation profiles contain information about a variety of important molecular parameters including the excitation energy, transition moment, excited state vibrational frequencies, and geometry changes along each coupled normal mode.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac2

We^1–4 and others^5–8 have utilized Raman excitation profiles to quantify excited state parameters and deepen our understanding of electronic processes.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac3

Fig. 1 illustrates a related pair of optical spectra and molecular parameters.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac4

Chemical physicists have developed powerful theoretical and computational techniques that allow spectra to be calculated given knowledge of the potential energy surfaces, dynamics, and radiation-matter couplings of a particular system.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac4

Spectroscopists address the related problem of measuring the spectra and inverting the spectral observables to extract the physical parameters.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac4

Within a given theoretical framework, it can be quite easy to calculate the optical spectra of even very large polyatomic molecules in the condensed phase.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac5

However, it is difficult to objectively find the set of parameters that best fits the data, and it is particularly hard to justify those instances when it is necessary to go beyond the simplest assumptions and consider Duschinsky rotation or vibrational coordinate dependence of the electronic transition moment, for example.

Type: Motivation | Advantage: None | Novelty: None | ConceptID: Mot1

Traditionally the “intelligently guided random search” has been employed by our group^2,9–12 and others.^13–19

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met1

This method involves guessing a set of parameters, calculating the spectra, comparing with experiment, modifying the parameters, recalculating the spectra, and continuing iteratively until a best fit between simulation and experiment is found.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met1

Since the spectra depend on the molecular parameters in a highly nonlinear fashion and the Raman data usually carry significant uncertainties, converging on a best fit set of parameters is slow, and it is rarely possible to be certain that the best fit has been found or that the best-fit parameters are unique.

Type: Method | Advantage: No | Novelty: Old | ConceptID: Met1

In this paper, we use real-valued genetic algorithms to automate this iterative process of parameter optimization and simultaneously determine the uniqueness of the best-fit parameters.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met2

Genetic algorithms have been applied to a variety of problems in chemistry and physics that involve optimization in a large parameter space.

Type: Method | Advantage: Yes | Novelty: Old | ConceptID: Met2

These include determination of mechanisms and rate coefficients for complex reactions,²⁰ determination of structures from NMR spectra,^21,22 molecular and cluster geometry optimization,^23–26 parameter optimization in semiempirical electronic structure methods,²⁷ numerical solution of the Schrödinger equation,²⁸ optimization of vibrational force fields,²⁹ and modeling of vibronic^30,31 and rovibronic³² molecular spectra.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac6

Previously in our group, the absorption spectra of molecules with one vibrational mode were used to quantitatively determine molecular parameters in a three-step process.³¹

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac6

First, a neural network which had been trained to associate spectral patterns with their underlying molecular parameters was used to obtain an estimate of the relevant parameters.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met3

This initial guess was used to select the range of parameter space to be searched by a micro-genetic algorithm.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met3

The range was discretized into 2¹⁰ bits in each dimension, and a random population was allowed to evolve toward the solution.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met3

Since convergence of the population slowed as it neared the extremum, a traditional Levenberg–Marquardt nonlinear least squares search was grafted on to locate the true minimum.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met3

The method worked well, but training the neural network is a slow process that does not scale well to larger systems, and the Levenberg–Marquardt steps add an extra layer of computational complexity and time.

Type: Method | Advantage: No | Novelty: Old | ConceptID: Met3

Our goal here is to develop a stand-alone genetic algorithm which can efficiently solve the spectral inversion problem for sets of absorption spectra and resonance Raman excitation profiles.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa1

In Section 2 we describe a theoretical framework for calculating absorption spectra and Raman excitation profiles from fundamental molecular parameters, provide a physical interpretation of the parameters and define the objective functions which we seek to minimize in determining the best fit.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa1

In Section 3 we develop the simplest implementation of a real-valued genetic algorithm, designed to meet a single objective (e.g., fitting the optical absorption spectrum).

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa1

The algorithm is successful at quantitatively fitting absorption spectra for one-mode systems.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res1

As the number of modes increases, the “uniqueness” of the fit is sacrificed but a range of parameters which provide quantitative fits can still be determined.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res2

In Section 4, we extend our treatment to multiobjective problems in which there are several properties of interest (e.g., the absorption spectrum and Raman excitation profiles for several fundamental, overtone and combination bands) which depend on the same set of parameters in different ways.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa2

The inclusion of Raman data enables reliable parameter optimization for multimode systems.

Type: Method | Advantage: Yes | Novelty: Old | ConceptID: Met2

The largest calculations attempted describe a simplified p-nitroaniline molecule with five modes coupled to the electronic transition.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa2

Finally, in Section 5 we comment on the strengths and weaknesses of genetic algorithms for inverting spectroscopic data.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa3

Electronic spectroscopy

The traditional approach to calculating Raman excitation profiles involves approximations to the energy-frame Kramers-Heisenberg-Dirac sum-over-states formula.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met4

These calculations quickly become intractable with even a few degrees of freedom when there are significant Franck–Condon displacements.

Type: Method | Advantage: No | Novelty: Old | ConceptID: Met4

Instead, working directly in the time domain exploits the short-time nature of Raman scattering and leads to efficient calculations: The propagation time required to determine the Raman cross sections stays constant, or even decreases, as the number of degrees of freedom and displacements grows.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met5

The computational method is based on the time-dependent formulation of Raman scattering³³ which expresses the exact Raman amplitude in terms of a half-Fourier transform of the overlap between the initial ground state wavepacket propagating on the excited state surface |ϕ_i(t)〉 and the final stationary state |ϕ_f〉.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod1

The scattering amplitude is given by where the frequency Ω = ω_i + ω_L − ω_0–0 is the vibrational frequency of the initial state plus the incident laser frequency minus the purely electronic transition frequency.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod2

The time correlation function is modulated by a solvent-induced spectral broadening g(t).

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod2
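
For orientation, the amplitude described above is commonly written in the following form in the time-dependent picture. This is only a sketch using the symbols defined in the text; the proportionality constant (which carries the transition length and factors of ħ) is deliberately left unspecified, since conventions differ and the exact prefactor used here is an assumption.

```latex
\alpha_{i \to f}(\omega_L) \;\propto\; \int_0^{\infty}
  \langle \phi_f | \phi_i(t) \rangle \,
  \exp\!\left[\, i\Omega t - g(t) \,\right] dt ,
\qquad
\Omega = \omega_i + \omega_L - \omega_{0\text{-}0}
```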

In the most naïve approximation, this damping can be modeled as a simple exponential decay, which gives a Lorentzian lineshape, governed by an electronic dephasing rate Γ: g(t) = Γt.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod3

We employ an overdamped Brownian oscillator in the high temperature limit^34,35 to more carefully describe solvent interaction.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod4

Following refs. 36–38, we assume that the initial population is in the vibrational ground state and for clarity consider a single coupled vibrational mode, where q is the ground-state dimensionless normal coordinate with conjugate momentum p. When the molecule is excited, the wavefunction propagates along the upper surface according to the excited state Hamiltonian, |ϕ_i(t)〉 = exp(−iH_e t/ħ)|ϕ_i〉 with where q′ is the excited-state normal coordinate.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod5

The excited state surface is displaced Δ′ along the normal coordinate q′.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod5

The general expression for the final state is where H_n is the nth Hermite polynomial with n = 0 for absorption, n = 1 for fundamental Raman scattering and n = 2 for overtone Raman scattering.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod5

The explicit form of the overlap of the propagated initial wavepacket and the final stationary state is where ω_± = ω_e ± ω_g. Here Δ is the displacement in ground-state dimensionless normal coordinates (Δ = (ω_g/ω_e)^{1/2} Δ′).

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod6

For molecules with more than one coupled vibrational mode, the total overlap function is a simple product of the single-mode overlaps as long as the ground- and excited-state normal coordinates are parallel (no Duschinsky rotation).

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod7

For comparison to experiment, we calculate the Raman cross section according to where ω_S is the frequency of the scattered photon and M is the electronic transition length integral, assumed independent of vibrational coordinate.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod8

A similar expression can be derived for the absorption cross section: where n is the solution refractive index.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod9

Absorption is governed by the real part of the Fourier transform of the time autocorrelation function damped by solvent interaction.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod9
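
The statement that absorption follows the real part of a damped Fourier transform can be illustrated numerically. The sketch below is not the model used in this work: it assumes a single mode with equal ground- and excited-state frequencies (for which the autocorrelation has the well-known closed form exp[−S(1 − e^{−iωt})] with S = Δ²/2), the simple damping g(t) = Γt rather than the Brownian oscillator, dimensionless units, and no prefactors. The function names are hypothetical.

```python
import cmath


def autocorrelation(t, delta, omega):
    """<phi_i|phi_i(t)> for one displaced mode, assuming omega_e == omega_g.

    The zero-point phase is taken to be absorbed into the 0-0 frequency.
    """
    s = 0.5 * delta ** 2  # Huang-Rhys factor
    return cmath.exp(-s * (1.0 - cmath.exp(-1j * omega * t)))


def absorption_lineshape(big_omega, delta, omega, gamma, dt=0.01, t_max=200.0):
    """Real part of the half-Fourier transform, evaluated by the trapezoid rule.

    big_omega is the detuning from the 0-0 transition; gamma is the
    dephasing rate in the assumed damping g(t) = gamma * t.
    """
    n_steps = int(round(t_max / dt))
    total = 0.0
    for k in range(n_steps + 1):
        t = k * dt
        integrand = autocorrelation(t, delta, omega) * cmath.exp(
            1j * big_omega * t - gamma * t
        )
        weight = 0.5 if k in (0, n_steps) else 1.0  # trapezoid end points
        total += weight * integrand.real
    return total * dt
```

With these assumptions the lineshape is a Poisson progression of Lorentzians, so the ratio of the peak at one vibrational quantum to the 0-0 peak should come out close to S when Γ is much smaller than ω.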

Many generalizations of these expressions have been developed^37–39 which allow for greater complexity in the system including thermal population of initial vibrational levels, Duschinsky rotation, and coordinate dependent transition moments.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac7

In this work, the condensed-phase optical absorption spectrum and resonance Raman excitation profiles of a molecule having a single, isolated electronic transition between two harmonic potential energy surfaces are considered.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod10

The calculations are performed within the Condon approximation (no vibrational coordinate dependence of the electronic transition moment), and it is assumed that the ground and excited state normal modes are the same (no Duschinsky rotation).

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod11

Different frequencies in the ground and excited electronic states are allowed.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod11

The ground state vibrational frequencies are assumed to be known from IR or Raman spectroscopy.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod11

The solvent-induced vibronic line broadening function is modeled by a single Brownian oscillator in the overdamped, high temperature limit³⁵ with the lineshape parameter fixed at κ = 0.05.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod4

Within these approximations, the absorption spectrum and Raman excitation profiles can be exactly determined given the following parameters.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp1

(i) Δ, the displacement for each normal mode between ground and excited state potential minima in ground state dimensionless normal coordinates which is needed to calculate the correlation function.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp1

(ii) ω_e/ω_g, the ratio of the excited and ground state vibrational frequencies for each normal mode which is needed to calculate the correlation function.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp1

(iii) Γ, the vibronic linewidth which is incorporated in the solvent-induced broadening governed by the Brownian oscillator damping function g(t).

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp1

(iv) ω_0–0, the purely electronic transition frequency which governs the position along the frequency axis.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp1

(v) M, the electronic transition length which determines the integrated intensity (oscillator strength).

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp1

The goal of this work is to invert experimental spectra to determine these parameters for systems with several coupled vibrational modes.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa1

While the spectroscopic model employed to generate the calculated spectra involves common approximations suited to our molecular systems, the time-dependent approach to determine the absorption and Raman excitation profiles does not generally require them.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp2

The algorithm presented here could easily be extended beyond these approximations to allow, for example, vibrational coordinate dependence of the transition moment, inhomogeneous broadening, or more than one contributing resonant excited state.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp2

The concomitant increase in the number of molecular parameters would, however, place even more stringent requirements on the quality and quantity of the Raman data.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp2

To monitor the progress of our fit, we compare the target spectra (analogous to experimental data) to calculated spectra which correspond to a particular set of parameters {a_i}.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met6

The error associated with a calculated spectrum is governed by the mean absolute difference between calculated and target spectra where σ^c_{A,R}(ω_j) and σ^t_{A,R}(ω_j) are the calculated and target absorption or Raman cross sections at frequency ω_j, and N is the number of points in the target spectrum.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod12
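
As a concrete reading of the error measure just described, the following sketch computes a mean absolute difference over N sampled frequencies. The function name and the use of plain lists are assumptions for illustration; any additional normalization in the original objective function is not reproduced here.

```python
def mean_absolute_error(calculated, target):
    """Mean absolute difference between calculated and target cross sections.

    Both sequences must be sampled at the same N frequencies.
    """
    if len(calculated) != len(target) or len(target) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    total = sum(abs(c - t) for c, t in zip(calculated, target))
    return total / len(target)
```

Because each term is an absolute rather than a relative difference, points with small target cross sections contribute little to the total.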

We use the absolute error in eqn. 12 rather than the relative error (where each term in the sum is divided by σ^t_{A,R}(ω_j), as used in ref. 31) so as not to overemphasize the lowest-intensity points, where experimental error may be more significant in the Raman excitation profiles.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod12

The goal of the optimization is to find the {a_i} which minimize the error functions.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met6

Genetic algorithms

Genetic algorithms^40–43 are global optimization methods that mimic the mechanisms of evolution: reproduction, natural selection, and diversification.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met2

In biological systems, genetic information is encoded in the quaternary alphabet of DNA base pairs.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac8

Sequences of base pairs comprise individual genes which correspond to traits in the developed organism; the genes are grouped into chromosomes.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac8

Through meiosis, the parental chromosomes cross over and recombine to form genetically unique children with qualities inherited from both parents.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac8

Random mutations further diversify the next generation, though these events are infrequent and rarely beneficial.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac8

Children develop and compete for resources and the opportunity to mate.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac8

Only the most fit phenotypes are expected to survive and propagate; they represent better solutions to adaptive problems.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac8

For the current problem of fitting molecular spectra, we consider the molecular parameters described in Section 2 to be analogous to the genes which determine each individual's characteristics.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod13

The phenotypes are the calculated spectra, analogous to the physical and behavioral characteristics of biological organisms.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod13

The fitness of each individual is a measure of its similarity to the target spectrum.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod13

Only the best solutions are allowed to reproduce to drive the population toward the optimum solution.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod13

In this section, we detail our application of genetic algorithms to invert a target absorption spectrum, that is, to derive the molecular parameters that produce it.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa1

The extension to include resonance Raman data is described in Section 4.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa2

Encoding

In recasting the optimization problem in terms of an evolutionary system, we kept the problem conceptually close to the real parameter space by constructing each chromosome from N floating-point numbers which directly correspond to the N parameters.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod13

Traditional genetic algorithms encode each parameter in an n-bit binary gene which has the unphysical property of assigning equal importance to (for example) the most and least significant bits.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod13

Real-valued genetic algorithms have been shown to be robust, accurate, and efficient.^44–47

Type: Method | Advantage: Yes | Novelty: Old | ConceptID: Met2

A further advantage for this application is the relative compactness of N floating point numbers as compared to N × n zeros and ones.

Type: Method | Advantage: Yes | Novelty: Old | ConceptID: Met2

For complex systems with several parameters, the arrangement of genes within the chromosomes is important.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met2

Genetic algorithms are believed to rely on the conservation of successful patterns to guide the evolutionary process.⁴⁰

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met2

Successive evolutionary steps juxtapose these independent “building blocks” to construct the optimal solution.⁴¹

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met2

The variables of our system are highly correlated and do not act independently in determining the phenotype.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac9

When epistasis, or interaction among design parameters, occurs, convergence of the genetic algorithm is much more difficult.⁴⁸

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac9

Grouping entangled genes in the chromosome promotes the formation of robust, quasi-independent building blocks which have a high probability of being propagated to the next generation.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac9

For the conjugated organic molecular systems of interest to our group, we chose the parameter ranges shown in Table 1 for the normal-mode displacement, the relative excited-state frequency and the vibronic linewidth.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod14

Systems with resolved vibronic progressions and relatively narrow linewidths are best described in terms of the displacements and frequencies of each electronically coupled normal mode individually {Δ₁,ω₁,Δ₂,ω₂…}.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod14

When the inner-sphere reorganization energy or vibronic linewidth increases, the vibronic progression becomes difficult to resolve.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod15

Physically, the spectrum becomes more appropriately described in terms of the overall reorganization energy (width) and shape (asymmetry) of the curve.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod15

To speed convergence, we recast the variables Δ and ω_e/ω_g into the mode-specific inner-sphere reorganization energy λ = ω_e²Δ²/(2ω_g) and a shape factor ρ, the ratio of Δ² to ω_e/ω_g.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod15
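
The change of variables can be written out explicitly. The sketch below assumes λ = ω_e²Δ²/(2ω_g) and ρ = Δ²/(ω_e/ω_g) as defined in the text, with a known ground-state frequency ω_g; the function names are hypothetical, and the inverse map requires a nonzero displacement.

```python
import math


def to_collective(delta, freq_ratio, omega_g):
    """Map (Delta, omega_e/omega_g) to (lambda, rho) for one mode."""
    omega_e = freq_ratio * omega_g
    lam = omega_e ** 2 * delta ** 2 / (2.0 * omega_g)  # reorganization energy
    rho = delta ** 2 / freq_ratio                      # shape factor
    return lam, rho


def from_collective(lam, rho, omega_g):
    """Invert the map; valid only for lam > 0 and rho > 0."""
    # lam / rho = freq_ratio**3 * omega_g / 2, so solve for the ratio first.
    freq_ratio = (2.0 * lam / (omega_g * rho)) ** (1.0 / 3.0)
    delta = math.sqrt(rho * freq_ratio)
    return delta, freq_ratio
```

For example, Δ = 1.2 and ω_e/ω_g = 0.9 with ω_g = 1000 cm⁻¹ give λ = 583.2 cm⁻¹ and ρ = 1.6, and the inverse map recovers the original pair.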

The overall width is a collective function of the reorganization energies and the linewidth.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod15

Convergence is accelerated by forming parameter blocks of the collective shape and width {ρ₁,ρ₂,…,λ₁,λ₂,…}.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod15

The comparison of the target and calculated spectra is conducted in an energy window which extends from 2500 cm⁻¹ below to 5000 cm⁻¹ above the target spectrum maximum at 50 cm⁻¹ intervals.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met7

To guarantee that the absorption features of the calculated spectrum occur within the observation window we adjust the pure electronic excitation frequency, ω_0–0.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met7

For systems with well-resolved vibronic structure and a dominant peak, the maxima of the target and calculated spectra are aligned.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met7

All other spectra are adjusted such that the first moments of the calculated and target spectra coincide.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met8

In both cases, we equate the integrated intensity of the absorption curves within the window to determine the transition moment.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met8
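
The first-moment alignment and intensity matching can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes both spectra are sampled on the same uniform frequency grid, that the returned shift is applied to ω_0–0 before the spectrum is recalculated, and that the absorption cross section scales with the square of the transition length so that the area ratio fixes M². All names are hypothetical.

```python
def first_moment(freqs, sigma):
    """Intensity-weighted mean frequency of a sampled spectrum."""
    return sum(w * s for w, s in zip(freqs, sigma)) / sum(sigma)


def alignment_shift_and_scale(freqs, calculated, target):
    """Return (frequency shift, intensity scale) that align two spectra.

    shift: amount to add to the 0-0 frequency so the first moments coincide.
    scale: ratio of integrated intensities (uniform grid, so sums suffice).
    """
    shift = first_moment(freqs, target) - first_moment(freqs, calculated)
    scale = sum(target) / sum(calculated)
    return shift, scale
```

For spectra with a dominant resolved peak, the text aligns the maxima instead; only the shift computation would change in that case.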

Evolution

Upon randomly generating an initial population, we assume that the genetic pool contains the solution, or a better solution, to the adaptive problem of fitting the spectrum.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod16

However, the solution is not “active” because the optimal genetic combination is split among several members.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod16

Just as in nature where individuals more successful in competing for resources are more likely to survive and propagate their genetic material, the algorithm's fitness function transforms the performance of each member into an allocation of reproductive opportunities.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod16

The random crossover of these “most fit” partial solutions enables the algorithm to approach, and eventually find, the optimum.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod16

Mutations allow the emergence of new configurations which widen the pool and improve the chances of finding the optimal solution.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod16

This process of evaluation, selection, recombination and mutation forms one generation in the execution of a genetic algorithm.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod16

The fitness of each member of the population is inversely proportional to the objective function defined in eqn. 12.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod12

We take The simplest algorithm for selection maps the population onto a roulette wheel where each individual is represented by a space proportional to its fitness as in Fig. 2.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod16

Individuals are entered into the mating pool by repeatedly spinning the roulette wheel.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod16
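
Roulette-wheel selection as described above can be sketched in a few lines. The fitness is taken here as the simple reciprocal of the error, which is one assumed form of "inversely proportional"; the function names and the optional `rng` argument are hypothetical.

```python
import random


def fitness_from_error(errors):
    """Assumed fitness: reciprocal of each (strictly positive) error."""
    return [1.0 / e for e in errors]


def roulette_select(fitness, n_picks, rng=None):
    """Pick indices with probability proportional to fitness, one per spin."""
    rng = rng or random.Random()
    total = sum(fitness)
    picks = []
    for _ in range(n_picks):
        spin = rng.random() * total
        running = 0.0
        chosen = len(fitness) - 1  # fallback guards against rounding at the edge
        for index, f in enumerate(fitness):
            running += f
            if spin < running:
                chosen = index
                break
        picks.append(chosen)
    return picks
```

Because every spin is independent, a very fit individual can be drawn many times in a row, which is the source of the diversity loss discussed next.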

This method leads to genetic drift, or a loss of population diversity, which is overcome using stochastic universal sampling.⁴¹

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod17

To prevent domination by a single individual, after each spin of the wheel several (in our case, four) parents are selected and copied into the mating pool.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod17
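
Selecting several parents per spin is the idea behind stochastic universal sampling, in which equally spaced pointers are read off a single spin of the wheel. A minimal sketch follows, assuming equal pointer spacing and non-negative fitness values; the function name and `rng` argument are hypothetical, and the number of pointers would be four in the setting described above.

```python
import random


def sus_select(fitness, n_pointers, rng=None):
    """One spin of the wheel read by n_pointers equally spaced pointers."""
    rng = rng or random.Random()
    total = sum(fitness)
    spacing = total / n_pointers
    start = rng.random() * spacing  # single random offset for all pointers
    picks = []
    running = 0.0
    index = -1
    for k in range(n_pointers):
        pointer = start + k * spacing
        # Advance along the wheel until the cumulative fitness passes the pointer.
        while running <= pointer and index < len(fitness) - 1:
            index += 1
            running += fitness[index]
        picks.append(index)
    return picks
```

With fitness values `[3.0, 1.0]` and four pointers, the first individual is always picked three times and the second once, regardless of the random offset, so no individual can receive more than its proportional share.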

Parents are randomly chosen from this intermediate population to reproduce and are then discarded.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod17

In nature, crossover occurs when two parents exchange parts of their corresponding chromosomes.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac10

The most common approach for recombination of two parents represented by a vector of real numbers is flat crossover, a gene-by-gene weighted average, to give two children.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod18

Each gene in the children c_i is expressed as a linear combination of their parents, where γ is a random number on the interval [0,1].

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod18
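
The gene-by-gene weighted average can be sketched as below. Two details are assumptions rather than statements from the text: γ is drawn independently for each gene, and the two children use complementary weights so that γ = 0 reproduces the parents unchanged. The function name and arguments are hypothetical.

```python
import random


def flat_crossover(parent_a, parent_b, rng=None, gamma=None):
    """Return two children as gene-by-gene weighted averages of the parents.

    If gamma is given it is used for every gene; otherwise a fresh
    random weight on [0, 1) is drawn for each gene.
    """
    rng = rng or random.Random()
    child_1, child_2 = [], []
    for a, b in zip(parent_a, parent_b):
        g = rng.random() if gamma is None else gamma
        child_1.append((1.0 - g) * a + g * b)
        child_2.append(g * a + (1.0 - g) * b)
    return child_1, child_2
```

Each child gene lies between the two parental values, and calling the function with `gamma=0.0` returns copies of the parents.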

Flat crossover (see Fig. 2) explores more of the parameter space than the more traditional single-point crossover.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod18

To maintain some consistency between successive generations, 20% of the population is generated by asexual reproduction, that is, with γ = 0.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod18

Since the selection and crossover processes are stochastic, there is no guarantee that the algorithm will monotonically approach the optimum.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp3

To prevent backward steps, the best 10% of the solutions (the elite members⁴⁹) are allowed to survive into the next generation.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod18

The difference between asexual reproduction and survival of elite members is that elite chromosomes do not undergo mutation.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod18

Successful convergence depends on a delicate balance between exploration of the parameter space and exploitation of good solutions.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp4

This balance is governed by the mutation operator.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod19

Fitness proportionate selection exploits fit members to generate fit children.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod19

To avoid premature convergence to a non-optimal solution and to maintain population diversity, we use a dynamically updated mutation operator.⁵⁰

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod19

Each gene has an adaptive mutation rate δ normally distributed about zero according to ±δ = 1 − exp{−(0.5 − x)²/2.373}, which takes a random x on the interval [0,1] and maps it to a percent deviation between 0 and 10% of the allowed parameter range.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod19

We choose δ to be negative if x < 0.5.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod19
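
The mutation rule can be checked numerically: at x = 0 or x = 1 the expression 1 − exp(−0.25/2.373) evaluates to very nearly 0.10, which matches the stated 10% ceiling. The sketch below applies the signed fraction to a gene; clamping the mutated value to the allowed range is an assumption, and the function names are hypothetical.

```python
import math
import random


def mutation_fraction(x):
    """Signed fractional step for a uniform random x on [0, 1]."""
    magnitude = 1.0 - math.exp(-((0.5 - x) ** 2) / 2.373)
    return -magnitude if x < 0.5 else magnitude


def mutate_gene(value, low, high, rng=None):
    """Shift a gene by a fraction of its allowed range [low, high]."""
    rng = rng or random.Random()
    step = mutation_fraction(rng.random()) * (high - low)
    # Clamping to the allowed range is an assumption of this sketch.
    return min(high, max(low, value + step))
```

Small steps are far more likely than large ones because the fraction grows quadratically with the distance of x from 0.5.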

Every ten generations, adaptation occurs by narrowing the allowed parameter range to the parameter average plus or minus two standard deviations.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod19
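
The range-narrowing step might look like the sketch below, applied to one parameter across the current population. Whether the population or sample standard deviation is used, and whether the new range is clipped to the previous one, are not stated in the text; both choices here are assumptions, as is the function name.

```python
import statistics


def narrowed_range(values, low, high):
    """New allowed range: population mean plus or minus two standard deviations.

    The result is clipped to the previous range [low, high].
    """
    mean = statistics.mean(values)
    spread = 2.0 * statistics.pstdev(values)  # population standard deviation
    return max(low, mean - spread), min(high, mean + spread)
```

In a full run this would be called once per parameter every ten generations, so the mutation step, which is a fraction of the allowed range, shrinks as the population converges.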

Generally, the initial population samples a search region that may at first sight seem needlessly large, in order to avoid missing the global minimum.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod19

During the optimization, we narrow the search toward the most promising region of the parameter space to increase the role of exploitation as the optimum is approached.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod19

Results

138

A simple test of the efficacy of our algorithm is to construct target spectra with known parameters within the ranges given in Table 1 and then to invert the spectra.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met9

139

Comparing the actual parameters to those returned by the algorithm affords a measure of our success.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met9

140

As expected, systems with few parameters and substantial vibronic structure are easiest to invert.

Type: Method | Advantage: Yes | Novelty: Old | ConceptID: Met9

141

For systems with a single mode coupled to the electronic transition, we always achieve a quantitative fit to the absorption spectrum for the parameter ranges in Table 1.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res1

142

Six randomly generated target spectra are shown in Fig. 3; the corresponding parameters are the bold entries in Table 2.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod20

143

To invert these spectra, the population size was fixed at 30 chromosomes, which were allowed to evolve for 30 generations.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod20

144

The calculated spectra are superimposed on the target spectra in Fig. 3 and no differences are visible.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res1

145

Comparing the parameters (plain-type entries in Table 2) demonstrates that inversion of the spectra yields ∼1% error in the parameter values.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res1

146

Comparable results were obtained in our previous work³¹ by combining an integer-valued genetic algorithm with Levenberg–Marquardt refinement.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac11

147

As we move to larger systems, the inversion becomes more difficult.

Type: Method | Advantage: No | Novelty: Old | ConceptID: Met2

148

For example, the width of the absorption spectrum is governed by both the inner-sphere reorganization energy, a function of Δ and ω_e/ω_g, and the outer-sphere reorganization energy which depends on Γ.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod21

149

The non-separability of the parameters in the model plagues optimization techniques which typically rely on parameter additivity.

Type: Method | Advantage: No | Novelty: Old | ConceptID: Met2

150

In the one-mode case this “epistasis” does not hinder inversion of the spectrum.

Type: Method | Advantage: Yes | Novelty: Old | ConceptID: Met2

151

For multi-mode systems, the increase in both the number of parameters and the complexity of the relationship between them places much more stringent demands on the optimization technique.

Type: Motivation | Advantage: None | Novelty: None | ConceptID: Mot2

152

It is also in this regime that an automated, objective fitting algorithm is most necessary.

Type: Method | Advantage: None | Novelty: New | ConceptID: Met10

153

A “difficult” target absorption spectrum of a two-mode system was synthesized (closed circles in Fig. 4) from the parameters shown in the bottom, bold line of Table 3.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod22

154

This absorption spectrum is relatively broad and the vibronic patterns of the individual normal modes cannot be distinguished.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod22

155

Four attempts to invert this spectrum were made.

Type: Method | Advantage: None | Novelty: New | ConceptID: Met10

156

In each, the 60-member population was allowed to evolve for 60 generations.

Type: Method | Advantage: None | Novelty: New | ConceptID: Met10

157

The results are shown as the solid lines in Fig. 4 with the corresponding parameters entered in Table 3.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res3

158

While it is only in the inset that the small errors in the calculated spectra are visible, the calculated parameters do not correspond to the target parameters.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res3

159

Generally, the algorithm fails to correctly invert spectra which are broad or lack clear vibronic progressions because the fits are not unique.

Type: Conclusion | Advantage: None | Novelty: None | ConceptID: Con3

160

As the number of parameters increases and the resolution of spectral features decreases, several parameter sets produce nearly identical, good fits to the absorption spectrum.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res3

161

More data are required to find a single best solution.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res3

Multiobjective genetic algorithm

162

More insight into the molecular parameters may be achieved with resonance Raman intensities.^1–8

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac2

163

For each mode coupled to the electronic transition, an experimental profile of the Raman cross section vs. laser frequency may be obtained.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac2

164

Each resonance Raman profile isolates the contribution of a single vibrational mode to the often unresolved vibronic structure of the absorption.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac12

165

Incorporating these data into the calculation of a fitness function is not straightforward, as typically there are many fewer data points with much higher uncertainties than in the linear absorption spectrum.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac12

166

A direct statistical combination of the absorption and Raman data would fail to assign enough importance to the Raman experiments.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac12

167

These data are incommensurate (cannot be measured on the same scale) and should not be combined into a single objective function.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac12

168

In this section, we describe an algorithm which simultaneously optimizes several objectives.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa2

169

We begin by discussing multiobjective problems and defining Pareto-optimality.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa2

170

Next, we present a modified version of the strength Pareto evolutionary algorithm.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa2

171

Finally, we show results for a series of randomly generated target spectra which typify the spectroscopy of conjugated organic molecular materials.

Type: Goal | Advantage: None | Novelty: None | ConceptID: Goa2

Background

172

Real-world design problems require the simultaneous optimization of multiple objectives which are often incommensurate, and sometimes competing.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac13

173

These multiobjective problems typically do not have a single utopian solution.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac13

174

Instead, they are solved by determining the tradeoff, or Pareto-optimal surface.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met11

175

Solutions along this surface are optimal in the sense that no other solutions in the search space are superior to them with respect to every objective.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met11

176

To make the conditions of Pareto-optimality mathematically rigorous, we state that chromosome x dominates chromosome y if the fitness of x is greater than or equal to the fitness of y with respect to every objective, and the fitness of x is greater than the fitness of y for at least one objective.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod23
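
The dominance condition can be written as a short test on fitness vectors. The sketch below follows the convention of the definition above, in which larger fitness is better; if the objectives are instead errors to be minimized, the comparisons would be reversed. The function names are illustrative only.

```python
def dominates(fx, fy):
    """True if fitness vector fx Pareto-dominates fy (larger is better)."""
    return (all(a >= b for a, b in zip(fx, fy))
            and any(a > b for a, b in zip(fx, fy)))


def nondominated(fitness_vectors):
    """Return the vectors that no other vector in the list dominates."""
    return [f for f in fitness_vectors
            if not any(dominates(g, f) for g in fitness_vectors)]
```

For example, among the fitness pairs (1, 3), (3, 1), (2, 2) and (1, 1), the first three are mutually nondominated, while (1, 1) is dominated by each of them.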

177

Fig. 5 shows the approximate Pareto-optimal surface of a two-dimensional optimization problem as a solid line at the lower left of each panel.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res4

178

Nondominated solutions are indicated with solid circles, while dominated chromosomes are shown as open circles.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res4

179

As the system evolves, the nondominated front approaches the Pareto-optimal surface.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res4

180

For an arbitrary dominated chromosome, in the upper-right-hand panel, the dominated solutions are shown in light gray while the dominant solutions are shown in a darker gray.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res4

181

In general, the goal of multiobjective problems is to find as many nondominated solutions as possible along the Pareto-optimal surface and then to allow a higher-level decision maker, who may possess additional information, to choose among them.

Type: Method | Advantage: None | Novelty: Old | ConceptID: Met11

182

Since the absorption spectrum and Raman excitation profiles are experimental data from the same molecule, they must be produced by a common set of molecular parameters; there should be a single best solution.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp5

183

However, there need not be a utopian solution if approximations such as separable harmonic vibrations are included in the model, or when spectral noise is considered.

Type: Hypothesis | Advantage: None | Novelty: None | ConceptID: Hyp5

184

The range of good solutions will provide information about the error tolerance for each molecular parameter and the appropriateness of the model.

Type: Method | Advantage: Yes | Novelty: Old | ConceptID: Met11

185

The absence of an adequate fit indicates either an insufficient model or experimental noise.

Type: Method | Advantage: Yes | Novelty: Old | ConceptID: Met11

186

A problem which illustrates the Pareto-optimal surface can be artificially constructed by adding 25% rms noise to the Raman excitation profiles of a hypothetical two-mode system.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod24

187

Since there is no longer an optimal solution which would generate the exact absorption and Raman spectra, we may examine the algorithm's ability to approach and sample the Pareto-optimal front.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod24

188

(The algorithm is presented in detail in Section 4.2.)

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod24

189

For this example, we fixed the population size at sixty chromosomes, each composed of seven floating-point numbers {Δ₁, ω_1e/ω_1g, Δ₂, ω_2e/ω_2g, Γ, ω_0–0, M}, and allowed the population to evolve for sixty generations.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod25

190

The two, simultaneous objectives are (1) to fit the absorption spectrum and (2) to fit the “noisy” Raman excitation profiles of the two coupled modes.

Type: Model | Advantage: None | Novelty: None | ConceptID: Mod25

191

The evolution of the population toward the Pareto-optimal surface is shown in Fig. 5 where each chromosome is represented by a point in fitness space (O_A,O_R).

Type: Result | Advantage: None | Novelty: None | ConceptID: Res5

192

Noise added to the Raman excitation profiles precludes a perfect solution, which would be located at the origin.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res5

193

Initially, the nondominated chromosomes (filled circles) approach and spread out along the Pareto-optimal surface.

Type: Observation | Advantage: None | Novelty: None | ConceptID: Obs1

194

As evolution continues, the dominated chromosomes (open circles) approach the nondominated front.

Type: Observation | Advantage: None | Novelty: None | ConceptID: Obs1

195

Even with fitness sharing^51–55 (see Section 4.2) to minimize genetic drift toward a specific region of the tradeoff surface, the converged solution has reduced diversity.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res6

196

The two extreme nondominated solutions are designated by a star and asterisk with their corresponding spectra shown in Fig. 6 and parameter values in Table 4.

Type: Observation | Advantage: None | Novelty: None | ConceptID: Obs2

197

The best solution with regard to absorption is shown in the bottom panel while the best solution with regard to the Raman excitation profile is shown in the top panel.

Type: Observation | Advantage: None | Novelty: None | ConceptID: Obs2

198

There is a clear trade-off in optimizing the two objectives.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res7

199

Instead of allowing the inversion algorithm to choose between these extreme solutions, we allow the decision maker to examine all of the nondominated solutions within the parameter space.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res7

200

We find that neither solution shown in Fig. 6, and none of the other solutions along the Pareto-optimal surface, fit both objectives well.

Type: Result | Advantage: None | Novelty: None | ConceptID: Res7

201

Generally, we can use this negative result to assert that either the data are noisy or the model is inadequate!

Type: Conclusion | Advantage: None | Novelty: None | ConceptID: Con4

Algorithm

202

Genetic algorithms are particularly well suited to multiobjective optimization.^56–58

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac14

203

Multiple individuals can search for multiple solutions in parallel.

Type: Background | Advantage: None | Novelty: None | ConceptID: Bac14

204

By incorporating the concept of Pareto-domination into the selection procedure, and applying a niching pressure to spread the population out along the tradeoff surface, the genetic algorithm described in Section 3 can be modified to handle multiple objective functions.