φ (phi)-value analysis



φ-value analysis is a method to understand the effects of a mutation in a protein upon the individual free energies of the native state, the denatured state and the transition state.




I.  The free energy diagram for two-state protein folding


A simple two-state, reversible protein folding process can be represented as:

N ⇌ D

where N is the native (folded) state and D is the denatured (unfolded) state.


The following diagram represents the free energy of the native and denatured forms of a protein under conditions where the native state is favored (e.g. 0M denaturant, physiological pH, room temperature, etc.):

From the above diagram we can conclude the following:



II.  Rates of folding and unfolding and the free energy diagram


The folding transition state, ‡ (which most likely represents an ensemble of structures), defines the energy barrier between the N and D states. It determines the rate at which the N state converts to D (the unfolding process) and the rate at which the D state converts to N (the folding process).



In the above diagram, the N→‡ energy barrier (ΔG‡-N) is larger than the D→‡ energy barrier (ΔG‡-D), thus unfolding (N→D) is slower than folding (D→N).


The rates of folding and unfolding are a function of the rate constants for folding (kf) and unfolding (ku), thus, we have an overall diagram that looks like this:




III. Equilibrium denaturation methods


Equilibrium denaturation experiments report the extent of denaturation as a function of added denaturant (isothermal equilibrium denaturation by guanidine or urea), or added heat (differential scanning calorimetry).

ΔG° = -RT*ln(Keq)
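As a minimal sketch of the relationship between the free energy and the equilibrium constant, the conversion can be written in Python. The function names, the choice of J/mol units and the default temperature are assumptions made for this illustration only.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1 (unit choice is an assumption of this sketch)

def delta_g_from_keq(keq, temperature_k=298.15):
    """Return the free energy (J/mol) from an equilibrium constant: dG = -RT*ln(Keq)."""
    return -R * temperature_k * math.log(keq)

def keq_from_delta_g(delta_g, temperature_k=298.15):
    """Invert the relationship: Keq = exp(-dG/RT)."""
    return math.exp(-delta_g / (R * temperature_k))

# Hypothetical example: an unfolding equilibrium constant of 1e-4 ([D]/[N])
example_dg = delta_g_from_keq(1e-4)
```

With Keq defined for unfolding, a Keq below 1 gives a positive free energy of unfolding, i.e. the native state is favored.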


Example I:


The free energy diagram of the mutant protein is shown as the green broken line, and that of the wild-type reference protein as the black line. The free energies of the native (N), denatured (D) and transition (‡) states are shown, with the mutant indicated by an asterisk.



φ-value analysis compares the free energy data from equilibrium denaturation methods with free energy values derived from kinetic studies. This comparison shows how a mutation has affected the free energy of the native, denatured or transition state of the protein.



IV. Probing the structure of the transition state


Example II: 

A mutation (indicated by an asterisk *) that does not affect the denatured state, or the transition state, but destabilizes the native state:




In this example:

·       The values of the folding rate constants, kf and k*f, for wild-type and mutant are observed to be equal; therefore ΔG‡-D and ΔG*‡-D are identical in value, and ΔΔG‡-D = 0 (i.e. ΔG‡-D - ΔG*‡-D = 0)


Note: it may seem that when you make a mutation the mutant should be considered the "new state" (i.e. state 2) in comparison to the wild-type "original state" (i.e. state 1), so that any delta value relating a mutant to wild-type should be of the form (mutant value - wild-type value). However, there is no strict adherence to this frame of reference (even though it is a "delta" value), and effects of mutations are commonly calculated by subtracting mutant values from the wild-type values, as is done here. The key thing is to state explicitly how the mutant values are calculated relative to the wild-type protein when you report the relevant delta values (and therefore what negative and positive values mean).


·       The unfolding rate constants are different between mutant and wild-type (faster for the mutant).  Thus, the ΔG‡-N values are different and the value of ΔΔG‡-N (i.e. ΔG‡-N - ΔG*‡-N) is non-zero (positive in this case).

·       The value of ΔΔG‡-N can be determined from the wild-type and mutant unfolding rate constants:

ΔΔG‡-N = -RT*ln(ku/k*u)

(Note: ΔΔG‡-N is also referred to as ΔΔGunfolding or ΔΔGu)


·       The ΔΔGD-N value for the mutant (i.e. the effect of the mutation upon stability) is determined experimentally from isothermal equilibrium denaturation data collected at the same temperature as the kinetic studies, or from DSC data, with the ΔΔG value determined by extrapolating the individual ΔG values to the temperature used for the kinetic experiments.



If ΔΔG‡-N = ΔΔGD-N, then the energetic change between the wild-type and mutant native states accounts for the entire energetic difference observed in the equilibrium stability study (and we conclude that the mutation has affected the native state exclusively).
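The comparison made in this example can be sketched numerically. The snippet below assumes the wild-type minus mutant convention used in these notes, so that ΔΔG‡-N = -RT*ln(ku/k*u); the function names, the rate constants, the J/mol units and the tolerance are all hypothetical values chosen for illustration, not measured data.

```python
import math

R = 8.314  # J mol^-1 K^-1 (unit choice is an assumption of this sketch)

def ddg_ts_minus_n(ku_wt, ku_mut, temperature_k=298.15):
    """ddG(ts-N) = dG(ts-N, wild-type) - dG(ts-N, mutant) = -RT*ln(ku_wt/ku_mut), in J/mol."""
    return -R * temperature_k * math.log(ku_wt / ku_mut)

def native_state_only(ddg_ts_n, ddg_d_n, tolerance=500.0):
    """True if the kinetic ddG(ts-N) accounts for the whole equilibrium ddG(D-N).

    The tolerance (J/mol) stands in for experimental error and is hypothetical.
    """
    return abs(ddg_ts_n - ddg_d_n) <= tolerance

# Hypothetical case: the mutant unfolds ten times faster than wild-type
ddg_unfolding = ddg_ts_minus_n(ku_wt=1e-4, ku_mut=1e-3)
```

A positive result means the unfolding barrier is lower for the mutant, which is the situation drawn in this example.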




NOTE: there is potential for ambiguity in the energy diagram above. For example, the following two energy diagrams would yield exactly the same kinetics and equilibrium thermodynamics:

In the first diagram the D and D* states are assumed to be energetically equivalent; whereas in the second diagram the N and N* states are assumed to be energetically equivalent. Note however that in both diagrams the various thermodynamic parameters are identical. Thus, we cannot state with confidence the absolute energy levels; but what we can say with confidence is whether the ‡ state energy is moving coordinately with either the N or D state. In the above case, the ‡ state energy is moving coordinately with the D state energy (and the site of mutation is considered to be as unfolded in the transition state as it is in the D state).


Example III:


In this example:

·       The values of the unfolding rate constants, ku and k*u, for wild-type and mutant are observed to be equal; therefore the values of ΔG‡-N and ΔG*‡-N are identical and ΔΔG‡-N = 0

·       The folding rate constants are different between mutant and wild-type (faster for the wild-type).  Thus, the ΔG‡-D values are different and the value of ΔΔG‡-D is negative.

(Note: ΔΔG‡-D is also referred to as ΔΔGfolding or ΔΔGf)



If ΔΔG‡-D is equal in magnitude to ΔΔGD-N (and opposite in sign under the wild-type minus mutant convention used here), then the energetic change between the wild-type and mutant denatured states accounts for the entire energetic difference observed in the equilibrium stability study (and we conclude that the mutation has affected the denatured state exclusively).



NOTE: the same relative energy ambiguity exists with this example also. The following two energy diagrams are indistinguishable in terms of thermodynamics and folding/unfolding kinetics:

In the first image the N and N* states are assumed to be energetically equivalent. In the second image the D and D* states are assumed to be energetically equivalent. However, notice again that all thermodynamic and kinetic parameters are unchanged. Thus, the only conclusion that can be stated with confidence as regards energy levels is that the transition state and N state move coordinately. The site of mutation is therefore as folded in the transition state as it is in the N state (and forms part of the critical folding nucleus).


V. Folding and unfolding kinetic data and the "chevron plot" model


The folding and unfolding kinetic constants are determined experimentally by either stopped-flow or manual mixing techniques. To determine folding kinetic constants, the protein sample is first denatured by dilution (or dialysis) into a high concentration of denaturant (e.g. 7.0 M GuHCl). This sample is then rapidly mixed with buffer containing no denaturant, whereupon the protein begins to refold. Refolding is typically rapid and so is followed in a stopped-flow instrument, monitoring some spectroscopic probe of folding such as fluorescence or circular dichroism. To determine unfolding kinetic constants, the protein is diluted or dialyzed into native buffer (i.e. buffer containing no denaturant). It is then mixed with buffer containing a high concentration of denaturant, and the protein begins to unfold. Unfolding is typically slower than folding, so manual mixing methods usually suffice. Folding and unfolding data are typically (but not always) fit to a single exponential function:

The above image is an example of a stopped-flow refolding study at a particular final concentration of denaturant. The folding rate constant (kf) under this condition is determined by a fit to the single exponential equation shown. The half-life for a given rate constant is ln(2)/kf. It is important to collect data that cover a significant portion of the maximum amplitude. One half-life covers 50%, two half-lives cover 75%, and three half-lives cover 87.5%. Most experiments collect 5-10 half-lives' worth of data; excess data can always be truncated afterwards.
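The half-life arithmetic above can be checked with a short sketch. The single exponential form used here (offset + amplitude*exp(-k*t)) and all of the function and parameter names are assumptions for illustration; the fitted equation from the original figure is not reproduced in this text.

```python
import math

def half_life(k):
    """Half-life of a first-order process with rate constant k: ln(2)/k."""
    return math.log(2) / k

def fraction_complete(n_half_lives):
    """Fraction of the total amplitude covered after n half-lives."""
    return 1.0 - 0.5 ** n_half_lives

def single_exponential(t, amplitude, k, offset=0.0):
    """Assumed single exponential signal: offset + amplitude*exp(-k*t)."""
    return offset + amplitude * math.exp(-k * t)

# Coverage after 1, 2, 3, 5 and 10 half-lives
coverage = {n: fraction_complete(n) for n in (1, 2, 3, 5, 10)}
```

Five half-lives already cover about 97% of the amplitude, which is consistent with the 5-10 half-life guideline given above.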


If folding and unfolding kinetic data are plotted as ln(kf) and ln(ku) vs [Denaturant], an idealized example will show two linear arms, the "folding" arm and the "unfolding" arm (the plot is called a "chevron" plot because of its shape):

At the point indicated by "Cm" the folding and unfolding rate constants are equal, so Keq = 1; Cm is the midpoint of denaturation (where the N and D states are each half-populated at equilibrium). It should agree with the Cm value determined from isothermal equilibrium denaturation studies (at the same temperature as the folding kinetic studies). In practical terms, kinetic data for the folding arm extend up to Cm, and kinetic data for the unfolding arm extend down to Cm. The chevron plot is defined by two linear functions, where kf0 is the folding rate constant at 0 M denaturant (and ln(kf0) is the Y-intercept of the folding arm) and mkf is the slope of the ln(kf) function. Similarly, ku0 is the unfolding rate constant at 0 M denaturant (and ln(ku0) is the Y-intercept of the unfolding arm) and mku is the slope of the ln(ku) function:

The equation that defines the simple chevron plot is the combination of the folding and unfolding arms. The linear function of ln(kf) as a function of denaturant concentration (i.e. the folding arm) is:


ln(kf) = mkf*X + ln(kf0)


Similarly, the linear function of ln(ku) as a function of denaturant concentration (i.e. the unfolding arm) is:


ln(ku) = mku*X + ln(ku0)

The two are combined as:

ln(kobs) = ln(exp(folding arm) + exp(unfolding arm))


= ln(exp(mkf*X + ln(kf0)) + exp(mku*X + ln(ku0)))


= ln(kf0*exp(mkf*X) + ku0*exp(mku*X))


When the rate data is plotted as ln(kobs) values the actual fit to the above equation will look something like this:

Since the rates of folding and unfolding depend upon denaturant concentration, 0 M denaturant is the typical reference condition for quoting the intrinsic folding and unfolding rate constants (i.e. kf0 and ku0).
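The chevron function described in this section can be sketched in Python as follows. The parameter values are hypothetical and the function names are chosen only for this illustration; the midpoint expression follows from setting the two linear arms equal.

```python
import math

def ln_kobs(x, kf0, mkf, ku0, mku):
    """Two-state chevron: ln(kf0*exp(mkf*x) + ku0*exp(mku*x)) at denaturant concentration x."""
    return math.log(kf0 * math.exp(mkf * x) + ku0 * math.exp(mku * x))

def chevron_cm(kf0, mkf, ku0, mku):
    """Denaturant concentration where kf = ku, from ln(kf0) + mkf*Cm = ln(ku0) + mku*Cm."""
    return math.log(kf0 / ku0) / (mku - mkf)

# Hypothetical parameters: rate constants in sec^-1, slopes in M^-1
params = dict(kf0=100.0, mkf=-2.0, ku0=1e-3, mku=1.5)
cm = chevron_cm(**params)
curve = [(x / 2, ln_kobs(x / 2, **params)) for x in range(0, 15)]  # 0 to 7 M in 0.5 M steps
```

At Cm the two arms contribute equally, so ln(kobs) there is ln(2) above the value of either arm, which is why the fitted curve is rounded at the vertex rather than forming a sharp point.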


VI. φ-value analysis


The basis of φ-value analysis is to compare the overall free energy change for a mutation to the individual contributions of the folding and unfolding free energy changes.  The analysis is usually focused upon understanding whether a particular mutation site is folded or unfolded in the transition state (and in this way probes the "structure" of the transition state).



Usually the choice of either φf or φu is based upon whether folding kinetic data or unfolding kinetic data can be more accurately determined.
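As a rough sketch of how φ values might be computed, the snippet below assumes (since the defining equations are not reproduced in this text) that φu is the ratio ΔΔG‡-N/ΔΔGD-N and that φf = 1 - φu, which follows if ΔΔGD-N = ΔΔG‡-N - ΔΔG‡-D under the wild-type minus mutant convention. Other sign conventions are in use, so these definitions, the function name and the numbers are illustrative assumptions only.

```python
def phi_values(ddg_ts_n, ddg_d_n):
    """Return (phi_f, phi_u) under the assumed definitions.

    phi_u = ddG(ts-N) / ddG(D-N)   (assumed definition)
    phi_f = 1 - phi_u              (assumed; from ddG(D-N) = ddG(ts-N) - ddG(ts-D))
    Both inputs must share the same energy units.
    """
    if ddg_d_n == 0:
        raise ValueError("ddG(D-N) is zero: the phi value is undefined for this mutation")
    phi_u = ddg_ts_n / ddg_d_n
    phi_f = 1.0 - phi_u
    return phi_f, phi_u

# Hypothetical inputs mirroring Example II (native state affected exclusively)
example_ii = phi_values(ddg_ts_n=5.0, ddg_d_n=5.0)
# Hypothetical inputs mirroring Example III (transition state moves with N)
example_iii = phi_values(ddg_ts_n=0.0, ddg_d_n=5.0)
```

Under these assumptions a φf near 0 corresponds to a site that is as unfolded in the transition state as in D (Example II), and a φf near 1 to a site that is as folded in the transition state as in N (Example III).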


VII. Cross-validation


ΔΔGD-N values can also be determined from the kinetic values:

ΔΔGD-N = ΔΔG‡-N - ΔΔG‡-D

The above equation also suggests that if unfolding kinetic data for a mutant can be obtained but folding kinetic data cannot, the folding term can be predicted from the equilibrium ΔG data and the known kinetic data:

ΔΔG‡-D = ΔΔG‡-N - ΔΔGD-N

Isothermal equilibrium denaturation (IED) data provides information on the thermodynamics of unfolding, but not the kinetics. However, if the thermodynamic and kinetic analyses have shared assumptions (i.e. two-state, reversible unfolding) then the thermodynamic and kinetic data should cross-validate (i.e. be in agreement where applicable).
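The kinetic route to ΔΔGD-N mentioned at the start of this section can be sketched as follows, assuming two-state behavior so that Keq = ku/kf for unfolding, and using the wild-type minus mutant convention. The rate constants, units (J/mol) and function names are hypothetical.

```python
import math

R = 8.314  # J mol^-1 K^-1 (unit choice is an assumption of this sketch)

def dg_d_minus_n(kf, ku, temperature_k=298.15):
    """Free energy of unfolding from rate constants: -RT*ln(ku/kf)."""
    return -R * temperature_k * math.log(ku / kf)

def ddg_from_rates(k_wt, k_mut, temperature_k=298.15):
    """Barrier change (wild-type minus mutant) from one pair of rate constants."""
    return -R * temperature_k * math.log(k_wt / k_mut)

# Hypothetical rate constants (sec^-1) at 0 M denaturant
kf_wt, ku_wt = 100.0, 1e-3
kf_mut, ku_mut = 10.0, 1e-2

ddg_d_n = dg_d_minus_n(kf_wt, ku_wt) - dg_d_minus_n(kf_mut, ku_mut)
ddg_ts_n = ddg_from_rates(ku_wt, ku_mut)   # unfolding barrier change
ddg_ts_d = ddg_from_rates(kf_wt, kf_mut)   # folding barrier change
```

If the two-state assumption holds, the value of ddg_d_n obtained this way should agree with the ΔΔGD-N measured by equilibrium denaturation; a large discrepancy suggests that the shared assumptions do not apply.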


The IED data provides information on ΔGunfolding (ΔGu) as a function of denaturant:


ΔGu = (m-value*X)+ΔG0


ΔGu is related to the equilibrium constant for unfolding (Keq):

ΔGu = -RT*ln(Keq)


exp(ΔGu /-RT) = Keq


Expanding this equation by the definition of ΔGu = (m-value*X)+ΔG0:

exp(((m-value*X)+ ΔG0)/-RT) = Keq


The definition of Keq for protein unfolding in terms of folding and unfolding rate constants kf and ku:

Keq = ku/kf

Equating the two expressions for Keq:

exp(((m-value*X)+ ΔG0)/-RT) = ku/kf


ku = kf * exp(((m-value*X)+ ΔG0)/-RT)




kf = ku / exp(((m-value*X)+ ΔG0)/-RT)


In other words, if the folding/unfolding is two-state you can predict the ku function from the kf function plus IED data; or you can predict the kf function from the ku function and IED data. This information allows you to fit the chevron plot data using knowledge of both folding/unfolding constants and IED data.


For example:

ln(kf) = mkf*X + ln(kf0)


kf = kf0*exp(mkf*X)


exp(((m-value*X)+ ΔG0)/-RT) = Keq = ku/kf


Now we can define ku in terms of kf and the IED m-value and ΔG0:


(kf0*exp(mkf*X)) * exp(((m-value*X)+ ΔG0)/-RT) = ku


The standard chevron plot equation is:

Y = ln(kf0*exp(mkf*X) + ku0*exp(mku*X))


Substituting the ku term yields:

Y = ln(kf0*exp(mkf*X) + (kf0*exp(mkf*X)) * exp(((m-value*X)+ ΔG0)/-RT))


This will cause the chevron plot to be fit with linear functions for both the folding and unfolding arms, and the resultant Keq (i.e. the ΔG function) to be equal to that derived from the IED data (note: this requires the IED m-value and ΔG0 terms to be held constant during the fit).
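A minimal sketch of the constrained chevron function derived above is given below. Only kf0 and mkf would be adjustable in a fit; the IED m-value and ΔG0 are passed in as fixed constants. The units (kJ/mol for ΔG0, kJ mol^-1 M^-1 for the m-value), parameter values and function names are assumptions made for illustration.

```python
import math

R = 8.314e-3  # kJ mol^-1 K^-1 (unit choice is an assumption of this sketch)

def keq_unfolding(x, m_value, dg0, temperature_k=298.15):
    """Keq = exp(((m-value*X) + dG0)/-RT) from the IED parameters."""
    return math.exp((m_value * x + dg0) / (-R * temperature_k))

def ln_kobs_constrained(x, kf0, mkf, m_value, dg0, temperature_k=298.15):
    """Chevron function with ku replaced by kf*Keq, so that the IED data fix the unfolding arm."""
    kf = kf0 * math.exp(mkf * x)
    ku = kf * keq_unfolding(x, m_value, dg0, temperature_k)
    return math.log(kf + ku)

# Hypothetical values: dG0 = 20 kJ/mol, m-value = -5 kJ/mol/M, so Cm = 4 M
y_at_cm = ln_kobs_constrained(4.0, kf0=100.0, mkf=-2.0, m_value=-5.0, dg0=20.0)
```

This function could be handed to a non-linear least squares routine with kf0 and mkf as the only free parameters; which fitting routine to use is left open here.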


VIII. Hammond behavior


The chevron plot folding and unfolding arms often exhibit non-linear behavior. This is typically a "roll-over" at either low or high denaturant concentrations. This can be due to the structure of the transition state changing to a neighboring intermediate on the reaction profile. Thus, the folding/unfolding arms may be better modeled by a polynomial that includes a second order term for the curvature. Note that the ΔGu(denaturant) function determined from IED is still a linear function. Modification of the above model is as follows:


kf = kf0*exp(mkf*X+bkf*X^2)


ln(kf) = ln(kf0)+ln(exp(mkf*X+bkf*X^2))

ln(kf) = ln(kf0)+(mkf*X+bkf*X^2)



exp(((m-value*X)+ ΔG0)/-RT) = Keq = ku/kf


ku = (kf0*exp(mkf*X+bkf*X^2)) * (exp(((m-value*X)+ ΔG0)/-RT))


ln(ku) = ln((kf0*exp(mkf*X+bkf*X^2)) * (exp(((m-value*X)+ ΔG0)/-RT)))

ln(ku) = ln(kf0*exp(mkf*X+bkf*X^2)) + ln(exp(((m-value*X)+ ΔG0)/-RT))

ln(ku) = ln(kf0) + ln(exp(mkf*X+bkf*X^2)) + ((m-value*X)+ ΔG0)/-RT

ln(ku) = ln(kf0) + (mkf*X+bkf*X^2) + ((m-value*X)+ ΔG0)/-RT


With ln(kf) and ln(ku) defined, we can now state the chevron plot function:


Y = ln(exp(ln(kf))+exp(ln(ku)))

Y = ln( exp(ln(kf0)+(mkf*X+bkf*X^2)) + exp(ln(kf0) + (mkf*X+bkf*X^2) + ((m-value*X)+ ΔG0)/-RT))

Y = ln( exp(ln(kf0))*exp(mkf*X+bkf*X^2) + exp(ln(kf0))*exp((mkf*X+bkf*X^2) + ((m-value*X)+ ΔG0)/-RT))

Y = ln( kf0*exp(mkf*X+bkf*X^2) + kf0*exp((mkf*X+bkf*X^2) + ((m-value*X)+ ΔG0)/-RT))


This will cause the chevron plot to be fit with a second order polynomial (the 2nd order term of which is identical for both arms) and as a two-state model whose ΔG(denaturant) agrees with the IED data (the terms for IED m-value and ΔG0 are defined as constants during the fit).

Once the fitted parameters are determined, the following relationships hold:


ku0 = kf0 * exp(ΔG0/-RT)


bkf is the 2nd order coefficient for the unfolding arm also


Cm = -ΔG0/m-value


mku = ln(kf0/ku0)/Cm + mkf
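The relationships above can be checked numerically with a short sketch. The units (kJ/mol), parameter values and function names are assumptions chosen for illustration; the curved chevron function follows the final Y expression derived in this section.

```python
import math

R = 8.314e-3  # kJ mol^-1 K^-1 (unit choice is an assumption of this sketch)

def derived_parameters(kf0, mkf, m_value, dg0, temperature_k=298.15):
    """Return (ku0, Cm, mku) from the fitted kf0 and mkf and the fixed IED m-value and dG0."""
    rt = R * temperature_k
    ku0 = kf0 * math.exp(dg0 / -rt)
    cm = -dg0 / m_value
    mku = math.log(kf0 / ku0) / cm + mkf
    return ku0, cm, mku

def ln_kobs_curved(x, kf0, mkf, bkf, m_value, dg0, temperature_k=298.15):
    """Chevron function with a shared second-order term bkf on both arms."""
    rt = R * temperature_k
    ln_kf = math.log(kf0) + mkf * x + bkf * x ** 2
    ln_ku = ln_kf + (m_value * x + dg0) / -rt
    return math.log(math.exp(ln_kf) + math.exp(ln_ku))

# Hypothetical values: kf0 = 100 sec^-1, mkf = -2 M^-1, m-value = -5 kJ/mol/M, dG0 = 20 kJ/mol
ku0, cm, mku = derived_parameters(100.0, -2.0, -5.0, 20.0)
```

Working through the algebra, the expression for mku reduces to mkf - (m-value)/RT, so the difference between the two linear slopes is set entirely by the IED m-value.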


The independent ln(kf) and ln(ku) baselines from the above dataset:

IX. Some derivations


The rate constant for folding depends exponentially on the free energy difference between the denatured state and the transition state:

kf = κ*ν*exp(-ΔG‡-D/RT)


·       κ (the transmission coefficient) = 1.0

·       ν = kb*T/h


Boltzmann's constant, kb = 1.380 x 10^-23 J K^-1

Temperature, T, in K

Planck's constant, h = 6.626 x 10^-34 J sec


·       ν has units of (J K^-1)*(K)/(J sec) = sec^-1 (appropriate for a rate constant)


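Calculation of ΔΔG‡-N from experimental rate constants of unfolding:

ku = κ*ν*exp(-ΔG‡-N/RT)

ΔG‡-N = -RT*ln(ku/(κ*ν))

ΔΔG‡-N = ΔG‡-N - ΔG*‡-N = -RT*ln(ku/(κ*ν)) + RT*ln(k*u/(κ*ν))


Assuming ν and κ are the same in both cases:

ΔΔG‡-N = -RT*ln(ku/k*u)


ΔΔG‡-D follows a similar derivation.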
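As a numerical sketch of the derivation in this section, the snippet below computes the frequency factor from the constants listed above and shows that it cancels when the barrier difference is taken. The rate constants are hypothetical, and the function names and J/mol units are assumptions for illustration.

```python
import math

KB = 1.380e-23  # Boltzmann's constant, J K^-1 (value as quoted in these notes)
H = 6.626e-34   # Planck's constant, J sec (value as quoted in these notes)
R = 8.314       # gas constant, J mol^-1 K^-1

def frequency_factor(temperature_k):
    """kb*T/h, in sec^-1."""
    return KB * temperature_k / H

def barrier_height(k, temperature_k=298.15, kappa=1.0):
    """Barrier height (J/mol) from a rate constant: -RT*ln(k/(kappa*kb*T/h))."""
    return -R * temperature_k * math.log(k / (kappa * frequency_factor(temperature_k)))

# Hypothetical unfolding rate constants (sec^-1) for wild-type and mutant
ku_wt, ku_mut = 1e-4, 1e-3
ddg_ts_n = barrier_height(ku_wt) - barrier_height(ku_mut)
shortcut = -R * 298.15 * math.log(ku_wt / ku_mut)
```

The two routes give the same ΔΔG‡-N, which illustrates why the absolute value of the prefactor does not need to be known as long as it is the same for wild-type and mutant.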