**Φ (phi)-value analysis**


*Φ-value analysis is a method for understanding the effects of a mutation in a protein upon the free energies of the native state, denatured state and transition state.*


**I. The free energy diagram for two-state protein folding**

A simple two-state, reversible, protein folding process can be represented as:

**N ⇌ D**

Where,
**N is the native** (folded) state, and **D is the denatured** (unfolded)
state.

The following diagram represents the free energy of the native and denatured forms of a protein under conditions where the native state is favored (e.g. 0M denaturant, physiological pH, room temperature, etc.):

From the above diagram we can conclude the following:

- The native state (N) has a lower free energy than the denatured state (D) (in fact, the native state appears to be the global energy minimum). The system will **spontaneously adopt an equilibrium that favors the native state**.
- The **free energy difference** between the N and D states (ΔG_{D - N}) is a **measure of the stability** of the protein.

- In this case, **ΔG_{N→D}** is defined as the free energy change in going from state 1 (N) to state 2 (D).
- "Delta" values are always defined as (value of state 2 - value of state 1), thus ΔG_{N→D} = (G_{D} - G_{N}), or ΔG_{D - N}.
- From the above diagram this value should be **positive**.
- A positive value indicates a **non-spontaneous** process as written (i.e. going to the right). Thus, N ⇌ D is non-spontaneous in the right-hand direction, but spontaneous in the reverse direction (thus, the above diagram reflects a situation where the equilibrium favors the native state of the protein).

**II. Rates of folding and unfolding and the free energy diagram**

The
folding transition state, **‡**, (which is actually likely to represent an ensemble of
structures) describes the energy barrier between the N and D states, and **determines
the rate** at which the N state converts to D (**unfolding
process**), and the rate at which the D state converts to N (**folding process**)

- The **energy barrier for unfolding** is proportional to the height between the **N → ‡** states.
- The **energy barrier for folding** is proportional to the height between the **D → ‡** states.
- The rate at which the system can change states is **inversely related** to the **height of the energy barrier** (i.e. the __larger__ the energy barrier, the fewer molecules in the sample that will have the necessary energy to overcome the barrier, and therefore, the __slower__ the overall rate of change of state).

In the above diagram, the **N → ‡** energy barrier (ΔG_{‡ - N}) is larger than the **D → ‡** energy barrier (ΔG_{‡ - D}), thus:

- The rate of unfolding should be slower than the rate of folding
- This situation results in an equilibrium condition that will favor the N state (i.e. the above diagram is representative of a protein that is stably folded)

The rates of folding and unfolding are a function of the **rate constants** for folding (**k_{f}**) and unfolding (**k_{u}**):

- As stated above, the rate at which the system can change states is **inversely related** to the **height of the energy barrier**. In other words, if ΔG_{‡ - N} (i.e. the energy barrier to unfolding) is large, then the corresponding unfolding rate constant *k_{u}* is small (ln(*k_{u}*) decreases linearly as ΔG_{‡ - N} increases); likewise, if ΔG_{‡ - D} (i.e. the energy barrier to folding) is large, then the corresponding folding rate constant *k_{f}* is small (ln(*k_{f}*) decreases linearly as ΔG_{‡ - D} increases).

**III. Equilibrium denaturation methods**

Equilibrium denaturation experiments report the **extent of denaturation** as a function of added **denaturant** (isothermal equilibrium denaturation by guanidine or urea), or added **heat** (differential scanning calorimetry).

- Such experiments provide information on the **equilibrium constant** for denaturation, and therefore, **ΔG_{D - N}**:

**ΔG° = -RT ln(K_{eq})**

- These experiments assume the system is always at **equilibrium** (e.g. samples are allowed to equilibrate prior to measurement); thus, they provide **no information** on the folding or unfolding **kinetics** (i.e. folding or unfolding rate constants).
- Thus, using equilibrium methods alone, we cannot say what the effects of mutations are upon the folding or unfolding kinetic properties.
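As a rough numerical illustration of ΔG° = -RT ln(K_{eq}), the sketch below converts an equilibrium constant into a free energy of unfolding. The function names, the choice K_{eq} = [D]/[N], and the default temperature of 298.15 K are assumptions made for this example only.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def delta_g_unfolding(k_eq, temp_k=298.15):
    """Return dG = -RT ln(K_eq) in J/mol, taking K_eq = [D]/[N] (assumed)."""
    return -R * temp_k * math.log(k_eq)

def fraction_denatured(k_eq):
    """Fraction of molecules in the D state for a two-state system."""
    return k_eq / (1.0 + k_eq)

# Hypothetical case: K_eq = 0.001, i.e. the native state is strongly favored.
dg = delta_g_unfolding(0.001)
print(f"dG(D-N) = {dg / 1000:.1f} kJ/mol, fraction D = {fraction_denatured(0.001):.4f}")
```

A K_{eq} below 1 gives a positive ΔG_{D - N}, matching the statement in section I that unfolding is non-spontaneous when the native state is favored.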

**Example I:**

The mutant protein free energy diagram is shown in the green broken line, and the wild-type reference protein energy diagram is shown in the black line. The free energy of the native (N), denatured (D) and transition (‡) states are shown, with the mutant indicated by an asterisk.

- The mutation has affected the transition state (‡). Specifically, it has **stabilized the transition state** and has had no effect upon either the native or denatured states.
- In stabilizing (i.e. lowering the free energy of) the transition state, the mutation will result in an **increase in both the rate of unfolding and folding**.
- However, since the free energy levels of the native and denatured states are unchanged, the overall **ΔG_{D - N}** value is unchanged.
- Thus, equilibrium denaturation methodologies would report **no difference** between the mutant and wild type proteins, but **kinetic experiments** would clearly indicate that the **mutation has altered the rates of folding and unfolding**.
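The argument of Example I can be checked numerically. The sketch below assumes an exponential dependence of each rate constant on its barrier height (rate ∝ exp(-barrier/RT)); the free energy values, the function name and the arbitrary prefactor are all hypothetical and chosen only for illustration.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def two_state_rates(g_n, g_d, g_ts, temp_k=298.15, prefactor=1.0):
    """Return (k_f, k_u) for state free energies given in kJ/mol.

    prefactor is an arbitrary placeholder; it cancels in every ratio below.
    """
    rt = R * temp_k
    k_f = prefactor * math.exp(-(g_ts - g_d) / rt)  # barrier D -> transition state
    k_u = prefactor * math.exp(-(g_ts - g_n) / rt)  # barrier N -> transition state
    return k_f, k_u

# Hypothetical energies: the mutant lowers only the transition state by 5 kJ/mol.
kf_wt, ku_wt = two_state_rates(g_n=0.0, g_d=20.0, g_ts=60.0)
kf_mut, ku_mut = two_state_rates(g_n=0.0, g_d=20.0, g_ts=55.0)

print(kf_mut / kf_wt, ku_mut / ku_wt)  # both rates rise by the same factor
print(ku_wt / kf_wt, ku_mut / kf_mut)  # K_eq = k_u/k_f is the same for both
```

Both rate constants increase by the same factor, so their ratio (and therefore the equilibrium stability) is unchanged, which is why an equilibrium experiment alone would miss this mutation's effect.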


*Φ-value analysis compares the free energy data from equilibrium denaturation methodologies to free energy values derived from kinetic studies, and this comparison allows a determination of how a mutation has affected the free energy of the native, denatured or transition states of the protein.*

**IV. Probing the structure of the transition state**

**Example II:**

A
mutation (indicated by an asterisk *) that does not affect the denatured state,
or the transition state, but **destabilizes the native state**:

In this example:

- The **values of the folding rate constants**, k_{f} and k*_{f}, for wild-type and mutant are **observed to be equal**; therefore, ΔG_{‡ - D} and ΔG*_{‡ - D} are identical in value, and the value of ΔΔG_{‡ - D} = 0 (i.e. ΔG_{‡ - D} - ΔG*_{‡ - D} = 0)

*Note: it
may seem that when you make a mutation that the mutant should be considered the
"new state" (i.e. state 2) in comparison to the wild type "original
state" (i.e. state 1). Thus, any delta values relating a mutant to wild
type should be of the form: (mutant value - wild type value). However, there
is no strict adherence to this frame of reference (even though it’s a “delta”
value), and effects of mutations are commonly calculated by subtracting mutant
values from the wild type. The key thing is that you explicitly state
how you are calculating the values for the mutant in terms of the wild type
protein when you report the relevant delta values (and the resultant meaning of
negative vs positive values).*

- The **unfolding rate constants are different** between mutant and wild-type (faster for the mutant). Thus, the ΔG_{‡ - N} values are different and the value of ΔΔG_{‡ - N} (i.e. ΔG_{‡ - N} - ΔG*_{‡ - N}) is non-zero (positive in this case).

- The value of ΔΔG_{‡ - N} can be determined from the wild type and mutant unfolding rate constants:

ΔΔG_{‡ - N} = RT ln(k*_{u}/k_{u})

**(Note: ΔΔG_{‡ - N} is also referred to as ΔΔG_{unfolding})**

- The ΔΔG_{D - N} value for the mutant (i.e. the effect of the mutation upon stability) is determined experimentally using **isothermal equilibrium denaturation** data (at the __same temperature as the kinetic studies__), or using DSC data, with the ΔΔG value determined by extrapolation of the individual ΔG values to the temperature used for the kinetic experiments.

- Note that in the above case, the ΔΔG_{D - N} value is equal to the value of ΔΔG_{‡ - N}. In other words, if you make a mutation that affects the stability of the protein, and **if this effect is characterized by changes** __exclusively upon the unfolding rate constants__, then the mutation has affected exclusively the __native state__.

*If ΔΔG_{‡ - N} = ΔΔG_{D - N} then it means that the energetic change between the wild type and mutant native states accounts for the entire energetic difference observed in the equilibrium stability study (and we conclude that the mutation has affected the native state exclusively).*

- If a mutation affects the **native state and transition states equally**, then it is assumed that the mutation site *is as folded in the transition state as it is in the native state* (i.e. the mutation site adopts the native configuration in the transition state).
- If a mutation affects the **denatured state and transition states equally**, then it is assumed that the mutation site *is as unfolded in the transition state as it is in the denatured state* (i.e. the mutation site adopts the denatured configuration in the transition state).
- In the above example, the __transition state and the denatured state__ are unaffected by the mutation; in other words, the mutation has affected these states equally. Thus, we would conclude that __the site of mutation is as unfolded in the transition state as it is in the denatured state__.
- In this case, the perturbation of the mutation upon the denatured state is equivalent to the perturbation of the transition state. __Thus the site of mutation is unfolded in the transition state; it does not form part of the critical folding nucleus (i.e. folding transition state)__.


**NOTE:** There is potential for **ambiguity** in the energy diagram above. For example, the following two energy diagrams would yield __exactly the same kinetics and equilibrium thermodynamics__:

In the first diagram the D and D* states are assumed to be energetically equivalent; whereas in the second diagram the N and N* states are assumed to be energetically equivalent. Note however that in both diagrams __the various thermodynamic parameters are identical__. Thus, we **cannot state with confidence the absolute energy levels**; but **what we can say with confidence is whether the ‡ state energy is moving coordinately with either the N or D state**. In the above case, __the ‡ state energy is moving coordinately with the D state energy__ (and the **site of mutation is considered to be as unfolded in the transition state as it is in the D state**).

**Example III:**

In this example:

- The values of the unfolding rate constants, k_{u} and k*_{u}, for wild-type and mutant are observed to be equal; therefore, ΔΔG_{‡ - N} = 0.
- The folding rate constants are different between mutant and wild-type (faster for the wild-type). Thus, the ΔG_{‡ - D} values are different and the value of ΔΔG_{‡ - D} is negative:

ΔΔG_{‡ - D} = RT ln(k*_{f}/k_{f})

**(Note: ΔΔG_{‡ - D} is also referred to as ΔΔG_{folding})**

- Note that in the above case, the ΔΔG_{D - N} value is equal in magnitude to the value of ΔΔG_{‡ - D} (with every ΔΔG taken as wild type minus mutant the two have opposite signs, since ΔG_{D - N} = ΔG_{‡ - N} - ΔG_{‡ - D}). In other words, if you make a mutation that affects the stability of the protein, and **if this effect is characterized by changes only upon the folding rate constants, then the mutation has affected exclusively the denatured state.**

*If |ΔΔG_{‡ - D}| = |ΔΔG_{D - N}| then it means that the energetic change between the wild type and mutant denatured states accounts for the entire energetic difference observed in the equilibrium stability study (and we conclude that the mutation has affected the denatured state exclusively).*

- For this example, the perturbation of the mutation on the transition state is equivalent to the perturbation upon the native state. __Therefore, the site of mutation is as folded in the transition state as it is in the native state; and this position forms part of the critical folding nucleus__.

**NOTE:** The same relative energy **ambiguity** exists with this example also. The following two energy diagrams are indistinguishable in terms of thermodynamics and folding/unfolding kinetics:

In the first image the N and N* states are assumed to be energetically equivalent. In the second image the D and D* states are assumed to be energetically equivalent. However, notice again that **all thermodynamic and kinetic parameters are unchanged**. Thus, the only conclusion that can be stated with confidence as regards energy levels is that **the transition state and N states move coordinately**. Thus, __the site of mutation is as folded in the N state as it is in the transition state__ (and forms part of the critical folding nucleus).

**V. Folding and unfolding kinetic data and the "chevron plot" model**

The folding and unfolding kinetic constants are determined experimentally by either stopped-flow or manual mixing techniques. To determine folding kinetic constants, the protein sample is initially denatured by dilution (or dialysis) into a high concentration of denaturant (e.g. 7.0M GuHCl). This sample is then rapidly mixed with buffer containing no denaturant, upon which the protein begins to refold. Refolding is typically rapid and so is followed in a stopped-flow instrument (monitoring some spectroscopic probe of folding, such as fluorescence or circular dichroism). To determine unfolding kinetic constants, the protein is diluted or dialyzed into native buffer (i.e. buffer containing no denaturant). It is then mixed with a buffer containing a high concentration of denaturant, and the protein begins to unfold. Unfolding is typically slower than folding, and so manual mixing methods typically suffice. Folding and unfolding data are typically (but not always) fit to a single exponential function:

The above image is an example of a stopped-flow refolding study at a particular final concentration of denaturant. The folding rate constant (*k_{f}*) under this condition is determined by a fit to the single exponential equation shown. The half-life for a given rate constant is ln(2)/*k_{f}*.
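A minimal sketch of a single exponential kinetic trace and its half-life follows. The functional form (amplitude × exp(-k·t) + offset) is a common choice and is assumed here, since the fitted equation itself appears only in the figure; the parameter values are hypothetical.

```python
import math

def single_exponential(t, amplitude, k, offset):
    """Signal at time t for a first-order process with rate constant k (s^-1)."""
    return amplitude * math.exp(-k * t) + offset

def half_life(k):
    """Time for a first-order process to reach half completion: ln(2)/k."""
    return math.log(2.0) / k

k_f = 25.0                 # hypothetical folding rate constant, s^-1
t_half = half_life(k_f)
print(f"half-life = {t_half * 1000:.1f} ms")
# At t = t_half exactly half of the amplitude remains above the offset.
print(single_exponential(t_half, amplitude=2.0, k=k_f, offset=0.5))
```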

If folding and unfolding kinetic data are plotted as ln(*k_{f}*) and ln(*k_{u}*) against denaturant concentration, the result is a V-shaped "chevron plot":

At the point indicated by "Cm" the folding and unfolding rates are equal, which corresponds to K_{eq} = 1; Cm is the midpoint of denaturation (where the N and D states are equally populated at equilibrium). It should agree with the Cm value determined from isothermal equilibrium denaturation studies (at the same temperature as the folding kinetic studies). In practical terms, kinetic data for the folding arm extend up to Cm, and the kinetic data for the unfolding arm extend down to Cm. The chevron plot is defined by two linear functions, where *k_{f}0* is the folding rate constant at 0M denaturant (and ln(*k_{f}0*) is the intercept of the folding arm), and *k_{u}0* is the unfolding rate constant at 0M denaturant (and ln(*k_{u}0*) is the intercept of the unfolding arm).

The equation that defines the simple chevron plot is the combination of the folding and unfolding arms. The linear function of ln(*k_{f}*) as a function of denaturant concentration X (i.e. the folding arm) is:

ln(k_{f}) = mk_{f} * X + ln(k_{f}0)

Similarly, the linear function of ln(*k_{u}*) as a function of denaturant concentration (i.e. the unfolding arm) is:

ln(k_{u}) = mk_{u} * X + ln(k_{u}0)

The two are combined as:

ln(exp(folding arm) + exp(unfolding arm))

= ln(exp(mk_{f} * X + ln(k_{f}0)) + exp(mk_{u} * X + ln(k_{u}0)))

= ln((k_{f}0 * exp(mk_{f} * X)) + (k_{u}0 * exp(mk_{u} * X)))

When the rate data are plotted as ln(*k_{obs}*) values the actual fit to the above equation will look something like this:

Since the rates of folding and unfolding depend upon denaturant concentration, **the condition of 0M denaturant is the typical reference for quoting the intrinsic folding and unfolding rates** (i.e. **k_{f}0** and **k_{u}0**).
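The simple chevron model can be sketched in a few lines, assuming linear arms ln(k_{f}) = mk_{f}·X + ln(k_{f}0) and ln(k_{u}) = mk_{u}·X + ln(k_{u}0). The function names and all parameter values below are hypothetical, chosen only to produce a typical V-shaped curve.

```python
import math

def ln_k_obs(x, kf0, mkf, ku0, mku):
    """ln(k_obs) at denaturant concentration x (M) for linear chevron arms."""
    k_f = kf0 * math.exp(mkf * x)  # folding arm (slope mkf is negative)
    k_u = ku0 * math.exp(mku * x)  # unfolding arm (slope mku is positive)
    return math.log(k_f + k_u)

def kinetic_cm(kf0, mkf, ku0, mku):
    """Denaturant concentration at which k_f equals k_u (the arms cross)."""
    return math.log(kf0 / ku0) / (mku - mkf)

# Hypothetical parameters: rate constants in s^-1, slopes in M^-1.
params = dict(kf0=100.0, mkf=-2.0, ku0=0.001, mku=1.5)
cm = kinetic_cm(**params)
for x in (0.0, cm, 6.0):
    print(f"{x:.2f} M  ln(k_obs) = {ln_k_obs(x, **params):.3f}")
```

At 0M denaturant ln(k_obs) is dominated by ln(k_{f}0), and at high denaturant by the unfolding arm, which is why the intrinsic rates are quoted as the 0M intercepts of the two arms.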

**VI. Φ value analysis**

__The basis of Φ value analysis is to compare the overall free energy change for a mutation to the individual contributions of the folding and unfolding free energy change__. The analysis is usually focused upon understanding whether a particular mutation site is folded or unfolded in the transition state (and in this way probes the "structure" of the transition state).

- The ΔΔG_{D - N} value is determined by denaturation equilibrium methods
- The ΔΔG_{‡ - N} value is determined from the unfolding kinetic constants of the mutant and wild type
- The ΔΔG_{‡ - D} value is determined from the folding kinetic constants of the mutant and wild type
- Φ_{f} = -ΔΔG_{‡ - D}/ΔΔG_{D - N} is called the "folding Φ value" (the minus sign arises because, with every ΔΔG taken as wild type minus mutant, ΔG_{D - N} = ΔG_{‡ - N} - ΔG_{‡ - D}). If it equals 1.0 it means that the site of the mutation is native-like in the transition state (a value of 0 means the opposite)
- Φ_{u} = ΔΔG_{‡ - N}/ΔΔG_{D - N} is called the "unfolding Φ value". If it equals 1.0 it means that the site of the mutation is denatured in the transition state (a value of 0 means the opposite)
- Fractional, or negative, values for Φ values are more difficult to interpret

Usually the choice of either Φ_{f} or Φ_{u} is based upon whether folding kinetic data or unfolding kinetic data can be more accurately determined.
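The Φ value calculation can be sketched as follows. The sketch assumes that every ΔΔG is taken as wild type minus mutant, that barrier changes follow ΔΔG = RT ln(k_mutant/k_wild-type), and that the folding Φ value carries a minus sign so that Φ_{f} = 1 for a native-like site; these conventions, the function names and the rate constants are assumptions for illustration, not a definitive implementation.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def ddg_barrier(k_wt, k_mut, temp_k=298.15):
    """Barrier change (wild type minus mutant) in kJ/mol: RT ln(k_mut/k_wt)."""
    return R * temp_k * math.log(k_mut / k_wt)

def phi_unfolding(ku_wt, ku_mut, ddg_d_n, temp_k=298.15):
    """Unfolding phi value: ddG(ts-N) / ddG(D-N)."""
    return ddg_barrier(ku_wt, ku_mut, temp_k) / ddg_d_n

def phi_folding(kf_wt, kf_mut, ddg_d_n, temp_k=298.15):
    """Folding phi value: -ddG(ts-D) / ddG(D-N) (sign convention assumed)."""
    return -ddg_barrier(kf_wt, kf_mut, temp_k) / ddg_d_n

# Hypothetical mutant that is destabilized by RT ln(10), about 5.7 kJ/mol.
ddg_d_n = R * 298.15 * math.log(10.0)

# Like Example II: only unfolding speeds up (10-fold) -> site unfolded in ts.
print(phi_unfolding(0.001, 0.01, ddg_d_n), phi_folding(100.0, 100.0, ddg_d_n))
# Like Example III: only folding slows down (10-fold) -> site native-like in ts.
print(phi_unfolding(0.001, 0.001, ddg_d_n), phi_folding(100.0, 10.0, ddg_d_n))
```

With these conventions the two Φ values sum to 1 whenever the kinetic and equilibrium data are consistent with a two-state model.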

**VII. Cross-validation**

ΔΔG_{D - N} values can also be determined from the kinetic values:

ΔΔG_{D - N} = ΔΔG_{‡ - N} - ΔΔG_{‡ - D}

- If the two-state model correctly describes the protein denaturation, then the above value should agree with the value from equilibrium denaturation studies

The above equation also suggests that if unfolding kinetic data for a mutant can be obtained, but folding kinetic data cannot, the folding value can be predicted by comparing the equilibrium ΔG data and the known kinetic data:

ΔΔG_{‡ - D} = ΔΔG_{‡ - N} - ΔΔG_{D - N}
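The cross-check between kinetic and equilibrium data can be sketched numerically. The sketch relies on the identity ΔG_{D - N} = ΔG_{‡ - N} - ΔG_{‡ - D} from the free energy diagram and on barrier changes computed as RT ln(k_mutant/k_wild-type); the function names and rate constants are hypothetical.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def ddg_barrier(k_wt, k_mut, temp_k=298.15):
    """Barrier change (wild type minus mutant) in kJ/mol: RT ln(k_mut/k_wt)."""
    return R * temp_k * math.log(k_mut / k_wt)

def ddg_d_n_from_kinetics(ku_wt, ku_mut, kf_wt, kf_mut, temp_k=298.15):
    """Stability change predicted from kinetics: ddG(ts-N) - ddG(ts-D)."""
    return ddg_barrier(ku_wt, ku_mut, temp_k) - ddg_barrier(kf_wt, kf_mut, temp_k)

def predicted_ddg_folding(ddg_unfolding, ddg_d_n):
    """Folding barrier change inferred when only unfolding kinetics are known."""
    return ddg_unfolding - ddg_d_n

# Hypothetical mutant: unfolds 5x faster and folds 2x slower than wild type.
ddg_kin = ddg_d_n_from_kinetics(0.001, 0.005, 100.0, 50.0)
print(f"ddG(D-N) from kinetics = {ddg_kin:.2f} kJ/mol")
print(predicted_ddg_folding(ddg_barrier(0.001, 0.005), ddg_kin))
```

If the value printed on the first line disagrees with the equilibrium measurement by more than experimental error, the two-state assumption is suspect.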

Isothermal equilibrium denaturation (IED) data provides information on the thermodynamics of unfolding, but not the kinetics. However, if the thermodynamic and kinetic analyses have shared assumptions (i.e. two-state, reversible unfolding) then the thermodynamic and kinetic data should cross-validate (i.e. be in agreement where applicable).

The
IED data provides information on ΔG_{unfolding} (ΔG_{u}) as a
function of denaturant:

ΔG_{u} =
(m-value*X)+ΔG_{0}

ΔG_{u} is
related to the equilibrium constant for unfolding (K_{eq}):

ΔG_{u} =
-RT*ln(K_{eq})

exp(ΔG_{u} /-RT) =
K_{eq}

Expanding
this equation by the definition of ΔG_{u} = (m-value*X)+ΔG_{0}:

exp(((m-value*X)+
ΔG_{0})/-RT) = K_{eq}

The definition of K_{eq} for protein unfolding in terms of the folding and unfolding rate constants k_{f} and k_{u} is:

K_{eq} = k_{u}/k_{f}

Setting the two terms as equalities:

exp(((m-value * X) + ΔG_{0})/-RT) = k_{u}/k_{f}

Thus:

k_{u} = k_{f} * exp(((m-value * X) + ΔG_{0})/-RT)

and

k_{f} = k_{u} / exp(((m-value * X) + ΔG_{0})/-RT)

In other words, if the folding/unfolding is two-state you can predict the k_{u} function from the k_{f} function and the IED data (and vice versa).

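For example:

ln(k_{f}) = mk_{f} * X + ln(k_{f}0)

k_{f} = k_{f}0 * exp(mk_{f} * X)

exp(((m-value * X) + ΔG_{0})/-RT) = K_{eq} = k_{u}/k_{f}

now we can define k_{u} in terms of k_{f} and the IED parameters:

k_{u} = (k_{f}0 * exp(mk_{f} * X)) * exp(((m-value * X) + ΔG_{0})/-RT)

The standard chevron plot equation is:

Y = LN(k_{f}0 * exp(mk_{f} * X) + k_{u}0 * exp(mk_{u} * X))

Substituting the k_{u} term yields:

Y = LN(k_{f}0 * exp(mk_{f} * X) + (k_{f}0 * exp(mk_{f} * X)) * exp(((m-value * X) + ΔG_{0})/-RT))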

This
will cause the chevron plot to be fit with linear functions for both
folding/unfolding arms, and for the resultant K_{eq} (i.e. ΔG function) to be
equal to that derived from the IED data (note: this would require these terms
to be constant values during the fit).
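A minimal sketch of this constrained chevron model follows: the unfolding arm is not fitted independently but is generated from the folding arm and the IED parameters through K_{eq} = k_{u}/k_{f}. The function name, the units (kJ/mol for ΔG_{0}, kJ mol^-1 M^-1 for the m-value) and the numbers are assumptions for illustration.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def ln_k_obs_constrained(x, kf0, mkf, m_value, dg0, temp_k=298.15):
    """ln(k_obs) with k_u derived from k_f and the IED parameters.

    m_value and dg0 come from equilibrium denaturation and would be held
    constant while kf0 and mkf are fitted.
    """
    k_f = kf0 * math.exp(mkf * x)
    k_eq = math.exp((m_value * x + dg0) / (-R * temp_k))  # K_eq = k_u/k_f
    k_u = k_f * k_eq
    return math.log(k_f + k_u)

# Hypothetical IED result: dG0 = 20 kJ/mol, m-value = -8 kJ/mol/M, so Cm = 2.5 M.
ied = dict(m_value=-8.0, dg0=20.0)
for x in (0.0, 2.5, 5.0):
    print(f"{x:.1f} M  ln(k_obs) = {ln_k_obs_constrained(x, 100.0, -1.5, **ied):.3f}")
```

Because K_{eq} = 1 at Cm = -ΔG_{0}/m-value, the two arms are forced to cross at the equilibrium midpoint.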

**VIII. Hammond behavior**

The
chevron plot folding and unfolding arms often exhibit non-linear behavior. This
is typically a "roll-over" at either low or high denaturant
concentrations. This can be due to the structure of the transition state
changing to a neighboring intermediate on the reaction profile. Thus, the
folding/unfolding arms may be better modeled by a polynomial that includes a
second order term for the curvature. Note that the ΔG_{u}(denaturant)
function determined from IED is still a linear function. Modification of the
above model is as follows:

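k_{f} = k_{f}0 * exp(mk_{f} * X + bk_{f} * X^{2})

and

ln(k_{f}) = ln(k_{f}0 * exp(mk_{f} * X + bk_{f} * X^{2}))

**ln(k_{f}) = ln(k_{f}0) + (mk_{f} * X + bk_{f} * X^{2})**

exp(((m-value * X) + ΔG_{0})/-RT) = K_{eq} = k_{u}/k_{f}

k_{u} = k_{f} * exp(((m-value * X) + ΔG_{0})/-RT)

and

ln(k_{u}) = ln((k_{f}) * exp(((m-value * X) + ΔG_{0})/-RT))

ln(k_{u}) = ln(k_{f}) + ((m-value * X) + ΔG_{0})/-RT

ln(k_{u}) = ln(k_{f}0 * exp(mk_{f} * X + bk_{f} * X^{2})) + ((m-value * X) + ΔG_{0})/-RT

**ln(k_{u}) = ln(k_{f}0) + (mk_{f} * X + bk_{f} * X^{2}) + ((m-value * X) + ΔG_{0})/-RT**

with ln(k_{f}) and ln(k_{u}) defined as above, the chevron plot equation becomes:

Y = ln(exp(ln(k_{f})) + exp(ln(k_{u})))

Y = ln( exp(ln(k_{f}0) + (mk_{f} * X + bk_{f} * X^{2})) + exp(ln(k_{f}0) + (mk_{f} * X + bk_{f} * X^{2}) + ((m-value * X) + ΔG_{0})/-RT) )

Y = ln( exp(ln(k_{f}0)) * exp(mk_{f} * X + bk_{f} * X^{2}) + exp(ln(k_{f}0)) * exp((mk_{f} * X + bk_{f} * X^{2}) + ((m-value * X) + ΔG_{0})/-RT) )

**Y = ln( k_{f}0 * exp(mk_{f} * X + bk_{f} * X^{2}) + k_{f}0 * exp((mk_{f} * X + bk_{f} * X^{2}) + ((m-value * X) + ΔG_{0})/-RT) )**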

This
will cause the chevron plot to be fit with a second order polynomial (the 2^{nd}
order term of which is identical for both arms) and as a two-state model whose ΔG(denaturant) agrees
with the IED data (the terms for IED *m-value* and ΔG_{0} are
defined as constants during the fit).
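The curved (second order) chevron model can be sketched in the same way. The only change from the linear case is the bk_{f}·X² term, which enters both arms identically because k_{u} is generated from k_{f}; the function name, units (kJ/mol) and parameter values are hypothetical.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def ln_k_obs_curved(x, kf0, mkf, bkf, m_value, dg0, temp_k=298.15):
    """ln(k_obs) for a chevron with a shared quadratic term in both arms."""
    ln_kf = math.log(kf0) + (mkf * x + bkf * x ** 2)
    # IED parameters (m_value, dg0) stay fixed; they set ln(K_eq) = ln(k_u/k_f).
    ln_ku = ln_kf + (m_value * x + dg0) / (-R * temp_k)
    return math.log(math.exp(ln_kf) + math.exp(ln_ku))

# Hypothetical parameters; Cm = -dg0/m_value = 2.5 M.
fit = dict(kf0=100.0, mkf=-1.5, bkf=-0.05, m_value=-8.0, dg0=20.0)
for x in (0.0, 2.5, 5.0):
    print(f"{x:.1f} M  ln(k_obs) = {ln_k_obs_curved(x, **fit):.3f}")
```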

Once the fitted parameters are determined, the following relationships hold:

k_{u}0 = k_{f}0 * exp(ΔG_{0}/-RT)

**bk_{f} is the 2^{nd} order polynomial term for both the folding and unfolding arms**

**Cm = -ΔG_{0}/m-value**

mk_{u} = mk_{f} + (m-value/-RT)
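These relationships follow from collecting terms in ln(k_{u}) = ln(k_{f}0) + (mk_{f}·X + bk_{f}·X²) + ((m-value·X) + ΔG_{0})/-RT by power of X. The sketch below computes the derived unfolding parameters and checks them against the direct expression; the function name and values are hypothetical.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def derived_unfolding_parameters(kf0, mkf, m_value, dg0, temp_k=298.15):
    """Return (ku0, mku, cm) implied by the fitted folding arm and IED data."""
    rt = R * temp_k
    ku0 = kf0 * math.exp(-dg0 / rt)  # constant term of ln(k_u)
    mku = mkf - m_value / rt         # coefficient of X in ln(k_u)
    cm = -dg0 / m_value              # where dG_u = 0, i.e. K_eq = 1
    return ku0, mku, cm

ku0, mku, cm = derived_unfolding_parameters(kf0=100.0, mkf=-1.5, m_value=-8.0, dg0=20.0)
print(f"ku0 = {ku0:.3e} s^-1, mku = {mku:.3f} M^-1, Cm = {cm:.2f} M")
```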

The
independent ln(*k _{f}*) and ln(

**IX. Some derivations**

The rate of folding is related to the free energy difference between the denatured state and the transition state:

k_{f} = κ ν exp(-ΔG_{‡ - D}/RT)

Assumptions:

- κ = 1.0
- ν = k_{b}T/h, where:

Boltzmann's constant, *k_{b}* = 1.380 x 10^{-23} J K^{-1}

Temperature, T, in K

Planck's constant, *h* = 6.626 x 10^{-34} J sec

- ν has units of (J K^{-1}) * (K)/(J sec) = sec^{-1} (appropriate for a rate constant)

Rearranging for the free energy barrier:

ΔG_{‡ - D} = -RT ln(k_{f}/(κ ν))

Calculation of ΔΔG_{‡ - N} from experimental rate constants of unfolding:

Assuming ν and κ are the same in both cases:

ΔΔG_{‡ - N} = ΔG_{‡ - N} - ΔG*_{‡ - N} = -RT ln(k_{u}/(κ ν)) + RT ln(k*_{u}/(κ ν)) = RT ln(k*_{u}/k_{u})

ΔΔG_{‡ - D} follows a similar derivation.
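The derivation can be checked numerically with a short sketch based on the transition state theory expression k = κ(k_{b}T/h)exp(-ΔG‡/RT), with κ = 1.0 as assumed above. The function names and the example rate constants are hypothetical; the physical constants are standard values.

```python
import math

K_B = 1.380649e-23   # Boltzmann's constant, J K^-1
H = 6.62607015e-34   # Planck's constant, J s
R = 8.314            # gas constant, J mol^-1 K^-1

def prefactor(temp_k, kappa=1.0):
    """kappa * k_b * T / h, in s^-1."""
    return kappa * K_B * temp_k / H

def barrier_from_rate(k, temp_k, kappa=1.0):
    """Barrier height in J/mol from a rate constant k (s^-1)."""
    return -R * temp_k * math.log(k / prefactor(temp_k, kappa))

def rate_from_barrier(dg, temp_k, kappa=1.0):
    """Rate constant (s^-1) from a barrier height dg in J/mol."""
    return prefactor(temp_k, kappa) * math.exp(-dg / (R * temp_k))

T = 298.0
ku_wt, ku_mut = 0.001, 0.01  # hypothetical unfolding rate constants, s^-1
ddg = barrier_from_rate(ku_wt, T) - barrier_from_rate(ku_mut, T)
print(f"prefactor = {prefactor(T):.2e} s^-1")
print(f"ddG(ts-N) = {ddg / 1000:.2f} kJ/mol")  # equals RT ln(ku_mut/ku_wt)
```

The prefactor cancels in the difference, so the ΔΔG value does not depend on the assumed values of κ and ν as long as they are the same for wild type and mutant.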