Divergence & CurlDivergence and curl are two generalizations of the idea of derivative to higher dimensions: but instead of measuring changes of functions they measure changes in vector fields. The curl of a vector field measures the "infintesimal circulation" and the divergence the "infintesimal expansion": these ideas will be made precise here.

DivergenceThe divergence operator in vector calculus is often defined by the following formula in introductory multivariate calculus courses: $\nabla\cdot \vec{F}=\frac{\partial F_1}{\partial x}+\frac{\partial F_2}{\partial y}+\frac{\partial F_3}{\partial z}$, but little is done to explain why we would care about a quantity defined in this way, or even how the sum of a bunch of partial derivatives somehow measures the way in which a vector field spreads out. My goal here is first off to give some motivation for why we would want to define a quantity like divergence, and how to go about it. Later then, I will show the definition we came up for divergence is actually equal to the one above, although they at first look very different.
The divergence is a scalar field that we associate with a vector field, which aims to give us more information about the vector field itself. Much like the gradient of a function provides us with the direction and magnitude of the greatest increase at each point, the divergence provides us with a measure of how much the vector field is "spreading out" at each point.
Intuitively, we want the divergence to give us the following assessments of vector fields: if the vectors here all point away from the black dot, causing a positive outward flow, we want to say that the field has positive divergence there. If the vectors all point inwards, so there is a "negative outward flow," we want to say that the field has negative divergence there. And if the vectors seem to be flowing right through the black dot, neither spreading out or converging on it, we'd like to say that the divergence is zero. 
Note that if we want to define divergence so that it agrees with these above intuitions, the definition can't depend on the value of the vector field at the point we are interested in. In the first two examples (where we want to assign nonzero divergence), the vector field is actually zero at the point of interest, whereas in the last example (where we want to assign zero divergence), the vector field is nonzero. Instead, we want to know what the vector field is doing "around" the point of interest, and more specifically, we want to know how much the vector field is "flowing towards or away" from that point.
One way we have at our disposal to measure the "amount of flow" is the surface integral: given a surface S in a vector field F, the integral $$\int_S \vec{F}\cdot d\vec{S}$$ Gives the amount of flow (known mathematically as flux) through the surface. So, if we want to know how much "stuff" is flowing towards/away from a point p, we could encase p in a surface and look at the flux through that surface. Although the actual visual should be a 3dimensional vector field with a surface around p, like on the right below. We will instead keep opting for looking at 2dimensional pictures (you can think of them as cross sections of these 3d pictures if you'd like) so that it's easier to see whats going on. 
The amount of the vector field "moving outwards through the surface" is just the surface integral about this surface. There is one slight technicality here: as with all surface integrals, we need to choose an orientation for our surface, which pretty much amounts to a choice of which side of the surface is "outside" and which is "inside". Since we are forming closed surfaces around a point, there's a pretty obvious option here, we are going to count the region with the point as the inside and the rest the outside. Under this choice, it turns out that our surface integrals are positive when the vector field flows from the point to the outside, and negative when it flows from the outside towards the point, just like we want divergence to do.
Because the surface integral around a point seems to do exactly what we want divergence to do, we would like to make some sort of definition along the lines 
$$\mathrm{div}{\vec{F}}\stackrel{?}{=}\int_S \vec{F}\cdot d\vec{S}$$
Where our surface $S$ surrounds the point of interest, $p$. Trying to do this directly leads to some problems however, the most notable being the choice of surface $S$. Let's say we are interested in a point where the vector field is flowing outwards in all directions from it, and so we expect a positive divergence at that point. 
Now, if we want to measure the outflow from $p$ using a surface integral, we can see that the amount of flow we register will definitely depend on the shape of the surface we choose. If the surface is rather small about $p$ we will get a positive value, as expected, but if our surface is a bit larger and includes inside one of the "sinks" as well as the source at $p$, the net flow through the surface will be very different as at some locations the flow is outwards from $p$, and at other locations it is inwards towards the sink. The two pictures below illustrate what can happen with such different sized loops.
Now obviously the first one of these cases is a better measure of what's happening at p, and the reason for this is that the chosen surface is small enough that it doesn't include any of the "weird" points of the vector field that are far away. So, it seems to get a better idea of what is going on at p, its useful to choose a smaller surface.
To express this more precisely, let $V$ be a small 3d region about our point $p$, and $\partial V$ be the surface of that region. To say that our surface shrinks and converges to $p$, we can just as well say that the region $V$ itself shrinks to a point (carrying the surface with it). We will express this (still admittedly imprecise) idea in symbols as $V\to \{p\}$. Recalling that given a region $V$ we can write the surface integral about its boundary as $\int_{\partial V} \vec{F}\cdot d\vec{S}$ , as we let our region shrink down to the point of interest, we can write the limiting value of the surface integral as
$$\mathrm{div}\vec{F}(p)\stackrel{?}{=}\lim_{V\to\{p\}}\int_{\partial V} \vec{F}\cdot d\vec{S}$$ However, in trying this we run into another problem: this limit is always going to be zero! To see why, let's think what happens to the flux of a vector field through smaller and smaller surfaces. In the pictures below, we are looking at a sequence of surfaces converging to a point which should intuitively have positive divergence. We can picture the surface integral as "adding up" all of the contributions of all the vectors which lie on the surface, but as the surface gets smaller, less and less vectors lie on the surface, and so the value will get smaller as the surface shrinks. As the size of the surface goes to zero then, the number of vectors lying on it will also head towards zero. ("Number" here is being used very loosely, to mimic the fact that in our drawings there are less vectors on smaller curves; but the mathematics carries through and shows that our intuition that the value is zero is in fact correct). One way around this is to consider the "flux per unit volume," instead of just the straightup flux, and effectively normalize our integral by the size of the volume enclosed by the surface. This way, even though bigger surfaces have more flux through them, they also enclose more volume, and so the two effects would cancel each other out. This will leave us with a normalizing factor out in front of the integral, where we write $V$ for the volume of $V$. $$\frac{1}{V}\int_{\partial V} \vec{F}\cdot d\vec{S}$$ Now that our integral no longer tends to zero as the surface shrinks, we can again take the limit as the chosen region converges to $p$. As this encapsulates what we want a quantity like divergence to measure: the flux per unit volume around a point, we can define this as THE divergence of F at p. $$\mathrm{div} \vec{F}(p)=\lim_{V\to \{p\}}\frac{1}{V}\int_{\partial V}\vec{F}\cdot d\vec{S}$$ To quickly recap what we've done so far, starting with a point in a vector field, we have found a way to attach a number, called the divergence, which measures how much the vector field is expanding or contracting at that point. The way to arrive at such a quantity is to look at the flow through small surfaces around the point, and then take the limit as these surfaces shrink away. This is all good and dandy, and sure provides a better intuitive picture than the standard definition as to why divergence measures flow, but it has a serious drawback: it looks god awful to try and compute. First off, we are taking the limit over ALL regions converging on p, and secondly for each one we have to compute a surface integral! This is a situation we run into a lot with coordinate free definitions like this, they get to the heart of the matter intuitively, but without a handy coordinate system they seem hopeless to use. To make this definition useful (and to bring us full circle in explaining why the standard formula for divergence looks the way it does) we will evaluate the divergence of a vector field in Cartesian (xyz) coordinates. The main advantage that choosing a coordinate system gives us is that instead of considering all possible regions about p, we can handpick "nice" regions which have simple descriptions in our coordinate system. In Cartesian coordinates, cubes are particularly easy to work with, so we will stick to those. (UNDER CONSTRUCTION) 