Tests of second-generation and third-generation density functionals for thermochemical kinetics

We report tests of second- and third-generation density functionals, for pure density functional theory (DFT) and hybrid DFT, against the BH6 representative barrier height database and the AE6 representative atomization energy database, with augmented, polarized double and triple zeta basis sets.

The pure DFT methods tested are G96LYP, BB95, PBE, mPWPW91, VSXC, HCTH, OLYP, and OPW91 and the hybrid DFT methods tested are B1B95, PBE0, mPW1PW91, B97-1, B98, MPW1K, B97-2, and O3LYP.

The performance of these methods is tested against each other as well as against first-generation methods (BP86, BLYP, PW91, B3PW91, and B3LYP).

We conclude that the overall performance of the second-generation DFT methods is considerably better than the first-generation methods.

The MPW1K method is very good for barrier height calculations, and none of the pure DFT methods outperforms any of the hybrid DFT methods for kinetics.

The B1B95, VSXC, B98, OLYP and O3LYP methods perform best for atomization energies.

Using a mean mean unsigned error criterion (MMUE) that involves two sizes of basis sets (both with polarization and diffuse functions) and averages mean unsigned errors in barrier heights and in atomization energy per bond, we find that VSXC has the best performance among pure functionals, and B97-2, MPW1K, and B1B95 have the best performance of all hybrid functionals tested.

Hybrid density functional theory1 (mixing Hartree–Fock exchange with pure DFT) has become generally recognized as the electronic structure method of choice for calculations on large systems.

Methods of this type can be justified theoretically by the adiabatic connection theory,2 and hence they are sometimes called adiabatic connection methods.

The adiabatic connection theory indicates that more accurate results can be obtained by replacing some DFT exchange by Hartree–Fock exchange.

Hartree–Fock exchange does not suffer from the self-interaction error of DFT, which can be very important when hydrogen atoms are present, and this is one way to understand why mixing in exact exchange can reduce the error; however, DFT exchange is inseparable from DFT dynamical correlation, and it introduces some static correlation,3 so the optimum fraction of HF exchange is not 100%.

The performance of hybrid DFT for thermochemistry is well documented.1,2,4–21

At present, despite the successes of hybrid DFT, one is not completely satisfied for several reasons.

First, the method is not systematically improvable.

Second, the mixture of Hartree–Fock theory into DFT restricts the choice of algorithms such that the most efficient computational strategies used for pure DFT are inapplicable.

Third, the most generally successful hybrid DFT methods are less accurate for kinetics than for thermochemistry.16,20,22,23

Since Hartree–Fock theory usually overestimates barrier heights for chemical reactions, and pure DFT usually underestimates them,6 many workers noticed that more accurate results can be obtained for kinetics by increasing the fraction of Hartree–Fock exchange;24 however, it was questionable whether this approach was physically meaningful.

It was soon learned that the popular B3LYP hybrid method,25–27 with 20% Hartree–Fock exchange, still systematically underestimates barrier heights,22,23 but raising the fraction of Hartree–Fock exchange usually deteriorated the quality of the prediction of the theory for other quantities more rapidly than it increased the quality of barrier height predictions.

However the hybrid mPW1PW91 method9 is more stable than the popular B3LYP hybrid method when the fraction of Hartree–Fock exchange is increased,23 and this observation was used to optimize the MPW1K method for kinetics.23

MPW1K gives remarkably accurate barrier heights with only slight deterioration of reaction energies.20,23,28

The number of available density functionals is increasing rapidly.

Boese et al21. have pointed out that “Many DFT users are overwhelmed by the sheer number of functionals and possibilities….

Very often because of sheer user inertia, first-generation functionals are applied rather than more accurate second-generation functionals….

Meanwhile, systematic studies on the dependency of the basis set and functionals remain sparse.” Furthermore, with a few exceptions,12,16,18,20,23,28–30 those systematic tests that are available are dominated by thermochemistry, and much less attention has been paid to quantities like barrier heights that are important for kinetics.

Recently, a representative database of six barrier heights was developed31 such that the errors calculated for this database correlate extremely well with errors calculated for a much larger database20,32 of 44 barrier heights.

The small database is called BH6.

The small size of this database makes it more straightforward to test a wide assortment of second-generation density functionals for kinetics, and the representative character of the database makes us expect that the conclusions are consistent with what would be concluded from tests against a much larger database.

The same paper31 developed a representative database of atomization energies, containing six molecules and called AE6, such that performance on this database is indicative of performance on a much larger 109 molecule database.19,20

The average number of bonds per molecule in AE6 is 4.833.

In the present paper we report errors in the AE6 data base as mean signed error (MSE) and mean unsigned error (MUE) in kcal mol–1 per bond by dividing the total MSE and MUE by 4.833.

Small representative databases can play an important role in allowing a wide variety of methods to be tested on the same data.

Small databases will not necessarily uncover the interesting difficult cases, but it is reasonable to ask first how a method performs for typical cases.

One advantage of the AE6 and BH6 databases is that they correspond to electronic energy contributions (including nuclear repulsion), exclusive of vibrational zero point energy and vibrational–rotational thermal energies.

Thus they can be used to test electronic energy calculations without the complication of vibrational energy considerations.

All calculations in this paper are single-point calculations at QCISD/MG3 geometries, where QCISD is quadratic configuration interaction with single and double excitations,33 and MG3 is the modified34,35 G3Large36 basis set.

The MG3 basis set,34 also called G3LargeMP2,35 is the same as 6-311++G(3d2f,2df,2p)37 for H–Si, but improved36 for atoms heavier than Si.

The QCISD/MG3 greometries for all molecules in BH6 and AE6 can be obtained from the database website of our group.38

The effect of spin–orbit coupling is also added to open shell systems from a compendium given else where.39

We used the GAUSSIAN0340 program to test all pure and hybrid DFT methods except OLYP and O3LYP41 (acronyms for density functionals are explained below).

We found that O3LYP in GAUSSIAN03 cannot reproduce the atomization energies published in the original41,42 O3LYP paper, and therefore O3LYP and OLYP calculations were carried out with the PQS ab initio program developed by Parallel Quantum Solutions.43

Baker and Pulay18 used PQS to assess O3LYP and OLYP for organic reactions.

Note that the local correlation functional in O3LYP and OLYP is Vosko, Wilk, and Nusair's correlation functional V (VWN5),44 while in B3LYP the local correlation functional is the Vosko, Wilk, and Nusair's correlation functional III (VWN3).44

In the printed part of this article we will give MSE and MUE for BH6 and MSE per bond and MUE per bond for AE6 with two highly recommended basis sets, namely a recommended19,28 augmented polarized double zeta set, 6-31+G(d,p),45,46 and a recommended augmented polarized triple zeta set, MG3S.

In tables 6-31+G(d,p) is abbreviated DIDZ (desert-island double zeta).

The MG3S basis19 is the same as MG3 except it omits diffuse functions on hydrogens.

The supporting information gives the following additional information that may be of interest to specialists: root-mean square errors, results for four more basis sets (6-31G(d), 6-31+G(2d,p), 6-311G(3d,2pd), and 6-311+G(2df,2p)), and results for the subsets of the databases that contain only H–F, where the latter values exclude molecules and barrier heights for systems that include Si and S. We simply comment that the trends in relative performance on the full AE6 and BH6 are mirrored in these subsets, and the conclusions about relative performance drawn from the two recommended basis sets are similar to those for the other four basis sets.

With regard to comparison of basis sets, the results in supporting information support our previous conclusion19 that inclusion of diffuse functions on atoms heavier than H make the results more generally reliable.

These tables are also consistent with our previous conclusion, based on previous results,19,20,23 that if one were confined to a desert island with only one valence double zeta basis set and one valence triple zeta basis set, 6-31+G(d,p) and MG3S would be two very excellent choices (kudos to the Pople group36,37,45,46 for their work in optimizing the basis functions in these basis sets).

Table 1 gives results for first-generation functionals, both pure DFT (X = 0) and hybrid DFT (X > 0).

In particular, the first-generation functionals we tested are BP86, BLYP, PW91, B3LYP and B3PW91.

BP86 is a pure DFT method using Becke's 1988 gradient corrected exchange functional (B)25 and Perdew's 1986 gradient corrected correlation functional (P86).47

When one combines Becke's 1988 gradient corrected exchange functional (B) with Lee, Yang, and Parr's gradient corrected correlation functional (LYP),26 one obtains the BLYP method.

PW91 is a pure DFT method that incorporates Perdew and Wang's 1991 gradient corrected exchange and correlation functional.48,49

B3PW91 is Becke's three parameter hybrid DFT method which includes 20% Hartree–Fock exchange.1

B3LYP is the most popular hybrid DFT method; it was developed by Stephens et al27. by following Becke's three-parameter hybrid DFT strategy.

Table 2 gives some results for DFT methods involving second-generation functionals (mPW1PW91 and MPW1K) that have already been widely tested against our databases.

Barone and Adamo9 developed mPW1PW91 as a Becke-style one-parameter hybrid functional using their modified Perdew–Wang (mPW or MPW) exchange functional, Perdew and Wang's 1991 (PW91) correlation functional, and 25% of Hartree–Fock exchange.

MPW1K is a method optimized against a kinetics database by our group;23 it uses the same functionals as mPW1PW91, but the fraction of Hartree–Fock exchange is 42.8%.

Table 3 and Table 4 give results for second-generation and third-generation pure and hybrid DFT functionals that have not previously been tested against our databases.

The tests in Table 3 and Table 4 are the main new results in the present paper.

By testing so many methods against the same representative databases, including barrier heights, we can put the new functionals in better perspective and thereby provide guidance to the “many DFT users” mentioned in the third paragraph as well as to developers of new density functionals and computational strategists.

In particular, Tables 3 and 4 include the following second-generation methods (in alphabetical order):

(i) B97-1: Hamprecht et al.'s modification to Becke's 1997 hybrid functional (B97)8,12

(ii) B98: Becke's 1998 revisions to B978,10

(iii) B97-2: Wilson, Bradley and Tozer's modification to B978,16

(iv) G96LYP: Gill's 1996 exchange50 and LYP correlation26

(v) HCTH: Hamprecht, Cohen, Tozer, and Handy's exchange–correlation12 (this is also sometimes called HCTH/407)

(vi) OLYP: Handy and Cohen's OPTX41 exchange and LYP26 correlation

(vii) O3LYP: Handy and Cohen's three parameter hybrid functional42 using OPTX41 exchange and LYP26 correlation

(viii) OPW91: OPTX41 exchange and PW9148 correlation

(ix) PBE: Perdew, Burke, and Ernzerhof's exchange and correlation51

(x) PBE0: One-parameter hybrid functional using Perdew, Burke and Ernzerhof's 1996 exchange and correlation51,52 and three third-generation methods:

(i) B1B95: One-parameter hybrid functional using Becke's 1988 exchange25 and 1995 kinetic-energy-dependent correlation5

(ii) BB95: same as B1B95 but with no Hartree–Fock exchange5

(iii) VSXC: van Voorhis and Scuseria's kinetic-energy-dependent exchange–correlation53

The second-generation functionals, like the first-generation functionals of Table 1 that brought DFT and hybrid DFT to its present state of popularity, are generalized gradient approximation (GGAs) and their hybrid generalizations.

The third generation functionals are sometimes called meta-GGAs or MGGAs; they incorporate kinetic energy density and are a possible area of future systematic improvement.5,17,53–56

Previous work56 has shown that the VSXC functional is more accurate than the more recent Perdew–Kurth–Zupan–Blaha57 (PKZB) MGGA, and so we did not include the PKZB functional here (very recently an improved MGGA, called TPSS, has been presented, and it would be interesting to study that in future work).

Because the MGGAs can do a much better job of eliminating the self-interaction energy than GGAs, one expects less improvement upon making hybrid versions.54,55

Nevertheless we test one MGGA in both pure and hybrid form, namely BB955 and B1B.955

(Boese and Handy17 found only a small improvement in the HCTH functional upon adding kinetic energy density, and so we do not test the functional that they obtained by that addition.) Dependence of the density functional on the kinetic energy density is a special case of functional dependence on the Kohn–Sham orbitals, which are in turn functionals of the density; more general orbital dependencies (fourth-generation functionals) have also been studied,58 but are not considered here.

From Tables 1–4 we can see that new generations of DFT methods outperform the first-generation methods.

Among the pure DFT methods, VSXC and OLYP give the best performance for atomization energies, and VSXC also gives the best performance for barrier heights.

In comparison to the best pure DFT first-generation functional (BLYP) in Table 1, we see that VSXC improves performance by factor of 1.6–2.6, which is a dramatic advance.

Among the hybrid DFT methods, MPW1K gives the best results for BH6, while B1B95, O3LYP, and B98 give the best results for AE6.

As compared to the best first-generation hybrid functional (B3PW91) in Table 1, MPW1K improves the barrier heights by a factor of 3.1, but increases the mean unsigned error in the atomization energies by factor of 2.6–3.8.

In contrast, the third-generation B1B95 functional improves the barrier heights by a factor of only 1.2 but also improves the atomization energies by factors of 1.1 (computed from unrounded MUEs) to 2.0.

None of the seven second-generation hybrid functionals in Table 2 and 4 improve the MUE for both BH6 and AE6 for both basis sets.

The good results obtained in this paper with VSXC and B1B95 provide good support for the value of including kinetic energy density in the functionals.

A desirable quality that one seeks for density functionals is that one can obtain good results with small basis sets as well as large basis sets.

Also, it is important for density functionals to perform well for both barrier heights and bond energies.

In order to make it a little easier to put the large number of results in this paper in perspective, we calculated a mean MUE, by the formula MMUE = 1/4[MUE(BH6,DIDZ) + MUE(BH6,MG3S) + MUE(AE6,DIDZ) + MUE(AE6,MG3S)] where the MUEs for AE6 are on per bond basis as in Tables 1–4.

Table 5 lists all the methods in this paper in order of increasing MMUE.

Clearly eqn. (1) is not a unique measure of quality (for example, if one is not interested in kinetics and potential energy surfaces, one might prefer a criterion based only on atomization energies, or if one is interested in eliminating the largest errors (“difficult cases”), one needs a larger database), but nevertheless it is one reasonable measure of representative performance, and so we can draw some conclusions.

From the MMUE in Table 5, we can see that most of the hybrid DFT methods outperform pure DFT.

One exception is VSXC, which is the best pure DFT method, and it outperforms PBE0 and B3LYP.

The large margin by which VSXC outperforms the second-best pure functional is quite impressive.

The good performance of HCTH, OLYP, and OPW91 is also noteworthy and provides a significant testimonial to the work of the Handy group.

Table 5 also shows that most second-generation DFT methods outperform the first-generation DFT methods.

B97-2 is the very best method in that it has the lowest MMUE.

We previously concluded that mPW91PW91 was systematically better than B3LYP,20,23,28 in agreement with the original conclusion of Adamo and Barone9 (who, however, did not systematically test barrier heights).

Table 5 shows that, at least using eqn. (1), B97-2, MPW1K, and B1B95 are significantly better, on average, than mPW1PW91.

Of these methods, MPW1K achieve its place mainly by superior performance on barrier heights, whereas B97-2 and B1B95 outperform mPW91PW91 for all four MUEs in Tables 2 and 4.

Therefore, these two methods are good candidates for general-purpose hybrid DFTs, while MPW1K remains the best hybrid DFT for kinetics.

None of the pure DFT methods in Tables 1, 2, or 3 outperforms any of the hybrid DFT methods in Tables 1, 2, or 4 for barrier heights.