Modern molecular dynamics tends to be so vastly complicated that we often get lost in the myriad of details. The goal of this tutorial is to show the readers how to build a generalized MD simulation from scratch, and in the process, distill the fundamentals behind MD. Note that we don’t describe the actual physical system, as we want to be generic. Code snippets are written for C++.
Step 1. Given a physical system, describe your coordinate space and prescribe a potential
on this coordinate space.
For purposes of illustration. Let the domain of my potential be
. Thus,
is the flat 2-torus. Define
, where explicitly,
(1) 

The potential code is very simple:
double potential(double x, double y) {
//replace later with any potential of our choosing
double V = cos(x) + cos(y);
return V;
}
Step 2. Find the partial derivatives of
.
Analytically, the solutions are:
(2) 
In practice, it is usually impractical to derive this analytically. It is usually calculated numerically using a central difference scheme:
double p_deriv_CD(double x, double y, int dim) {
double delt = 1e-6; // difference step
double val = 0;
if(dim == 0) {
double v_prev = potential(x-delt,y);
double v_forw = potential(x+delt,y);
return (v_forw - v_prev)/(2*delt);
}
else {
double v_prev = potential(x,y-delt);
double v_forw = potential(x,y+delt);
return (v_forw - v_prev)/(2*delt);
}
}
Step 3. Set up your ordinary differential equation (ODE) to create the equations of motion.
This is the heart of molecular dynamics. The form of your differential equation depends upon your coordinate system and choice of ensemble. However, only in exceptional circumstances can the ODE be solved analytically. Realistically, molecules have constraints that fixate bond lengths of angles. There are two ways to deal with constraints.
The popular approach is that for each time step, assume constraints do not exist and therefore simply integrate the equations using Newton’s equations of motion. Afterwards, we get some new position
. Then we do an update step using a constraint algorithm to correct
to the proper position
. The other approach, and often hailed as esoteric and overly complicated, is to use Lagrangian mechanics, which in brief, integrates number of equations of motions and do not need an update, but each equation in the system of ODEs is extremely complicated – because the form of the kinetic energy is no longer as trivial as
.
In our example, we assume that our equations of motion are constraint-free. In particular, we use the NVE ensemble (constant number of atoms, volume, and energy), note that most simulations nowadays utilizes the NVT ensemble (constant number of atoms, volume, and temperature). We use the following set of second order non-linear differential equations, assuming masses of unity:
(3) 
Step 4. Integrate the equations of motion.
In general, the ODEs tend to be a system of ordinary differential equations that must be solved numerically. We typically call these solvers integrators. If you have taken a course on numerical computing, you’ll probably have learnt the Euler method, the midpoint method, Heun’s method, Runge-Kutta, etc. However, not all integrators are suitable as they must conserve energy (and preserve “area”), and these collectively, are called symplectic integrators. The most commonly used sympletic integrators are Basic Verlet, Velocity Verlet, and Leapfrog.
This tutorial implements velocity verlet, a second-order integrator. The input into velocity verlet is a initial position vector and a initial velocity vector. During each time-step, that is, to integrate the equations of motion, we first calculate the current force. Then we calculate the position at time t+1 using the current position, current velocity, and current force. Now we have the position at time t+1. Then we calculate the new forces at this new position, giving us the force at time t+1. Using the current force and force at time t+1, we can calculate the velocity at the next time-step. In detail,
As a matter of notational convenience:
(4) 
The velocity verlet scheme then proceeds as follows:
(5) 
The code to implement the whole thing is as follows:
void velocity_verlet(FLOAT_TYPE x0, FLOAT_TYPE y0,
FLOAT_TYPE vx0, FLOAT_TYPE vy0) {
double delt = 0.0005;
int MAX_STEPS = 500000;
double x = x0;
double y = y0;
double vx = vx0;
double vy = vy0;
// xx - next step (t+dt)
// x - current step (t)
// xv - current velocity (t)
for(double t = 0; t<delt*MAX_STEPS; t+=delt) {
if(x > 2*PI)
x -= 2*PI;
if(x < 0)
x += 2*PI;
if(y > 2*PI)
y -= 2*PI;
if(y < 0)
y += 2*PI;
// calculate potential energy
float e_tot = 0.5*vx*vx + 0.5*vy*vy + potential(x,y);
fx = -1*p_deriv_CD(x,y,0);
fy = -1*p_deriv_CD(x,y,1);
x += vx*delt + delt*delt*fx/2;
y += vy*delt + delt*delt*fy/2;
fx_delt = -1*p_deriv_CD(x,y,0);
fy_delt = -1*p_deriv_CD(x,y,1);
vx += delt*(fx + fx_delt)/2;
vy += delt*(fy + fy_delt)/2;
}
}
Step 5. Plot your data and check your trajectories.
To make sure that you do not go insane while writing MD code – it is important that we check for at least two things: 1, the trajectories has to look reasonable. 2, the energy must be conserved.
The following figure shows the trajectory with symmetrical starting points:



Well, this is not very interesting. Now suppose instead, we started with a non-symmetric initial position:



Now things are getting interesting. The first behavior we immediately notice is that there is no nice oscillation pattern (as was the case in the symmetric case). Let’s increase the length of the simulation:



Figure 1.
Lo and be hold, the second behavior we notice is that the set of positions seem to be bounded by the rectangle
Imagine plotting out the level curve (an implicit function):
(6) 

All points with this region can visited and not violate the conservation of energy. Yet, our system only chooses to visit a subset of the possible states. But this makes sense for the following reason. To calculate
, we need information of
,
, and
. The
term depends on
, and
. However,
in general depends only
, and not on
. Thus, the calculation of
is also independent of
. In other words, the trajectory along the x-dimension is independent of the trajectory along the y-dimension because the partial derivatives are independent.
A third behavior that becomes immediately clear is that the system spends most of its time on the outer ring. The density of points in the minima region is in fact, quite small. Again, by conservation of total energy, this makes sense because at regions of high potential energy, the kinetic energy is low; and at regions of low potential energy, the kinetic energy is high. A higher kinetic energy implies a higher velocity at that point, which implies that the system is moving very fast through that point.
And there we go! Your own little MD engine in less than 300 lines of code.
Addendum
A holonomic constraint can be written in the closed form:
(7) 
For example, bond length constraints and bond angle constraints:
(8) 
Some other cool plots:





Top down and side view



Convergent Potentials:


Figure 2.
In the future:
-Parallelizing Velocity Verlet
-Derivation of Lagrangian Mechanics
-More Code!