In 1946, geophysicist and Bayesian statistician Harold Jeffreys introduced what we today call the Kullback-Leibler divergence, and discovered that for two distributions that are "infinitely close" (let's hope that the Math SE guys don't see this ;-) we can write their Kullback-Leibler divergence as a quadratic form whose coefficients are given by the elements of the Fisher information matrix. He interpreted this quadratic form as the element of length of a Riemannian manifold, with the Fisher information playing the role of the Riemannian metric. From this geometrization of the statistical model, he derived what we now call the Jeffreys prior as the measure naturally induced by the Riemannian metric. This measure can be interpreted as an intrinsically uniform distribution on the manifold, although, in general, it is not a finite measure.
To write a rigorous proof, you'll need to spell out all the regularity conditions and take care of the order of the error terms in the Taylor expansions. Here is a brief sketch of the argument.
The symmetrized Kullback-Leibler divergence between two densities f and g is defined as
$$D[f,g]=\int \big(f(x)-g(x)\big)\,\log\!\left(\frac{f(x)}{g(x)}\right)dx.$$
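If you want to play with this definition numerically, here is a minimal sketch (the two normal densities and the grid-based integration are my own arbitrary choices, not anything from Jeffreys):

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import norm

def symmetrized_kl(f, g, x):
    """Numerically approximate D[f, g] = ∫ (f(x) - g(x)) log(f(x)/g(x)) dx on a grid."""
    fx, gx = f(x), g(x)
    return trapezoid((fx - gx) * np.log(fx / gx), x)

# Arbitrary example: two normal densities that are not "infinitely close".
x = np.linspace(-12.0, 12.0, 20001)
f = norm(loc=0.0, scale=1.0).pdf
g = norm(loc=0.5, scale=1.2).pdf

print(symmetrized_kl(f, g, x))  # strictly positive; it is 0 only when f = g a.e.
```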
If we have a family of densities parameterized by $\theta=(\theta_1,\dots,\theta_k)$, then
$$D[p(\cdot\mid\theta),p(\cdot\mid\theta+\Delta\theta)]=\int \big(p(x\mid\theta)-p(x\mid\theta+\Delta\theta)\big)\,\log\!\left(\frac{p(x\mid\theta)}{p(x\mid\theta+\Delta\theta)}\right)dx,$$
in which $\Delta\theta=(\Delta\theta_1,\dots,\Delta\theta_k)$. Introducing the notation
$$\Delta p(x\mid\theta)=p(x\mid\theta+\Delta\theta)-p(x\mid\theta),$$
some simple algebra gives
$$D[p(\cdot\mid\theta),p(\cdot\mid\theta+\Delta\theta)]=\int \frac{\Delta p(x\mid\theta)}{p(x\mid\theta)}\,\log\!\left(1+\frac{\Delta p(x\mid\theta)}{p(x\mid\theta)}\right)p(x\mid\theta)\,dx.$$
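Spelled out, the "simple algebra" is just the observation that, with this notation, $p(x\mid\theta)-p(x\mid\theta+\Delta\theta)=-\Delta p(x\mid\theta)$ and
$$\log\!\left(\frac{p(x\mid\theta)}{p(x\mid\theta+\Delta\theta)}\right)=\log\!\left(\frac{p(x\mid\theta)}{p(x\mid\theta)+\Delta p(x\mid\theta)}\right)=-\log\!\left(1+\frac{\Delta p(x\mid\theta)}{p(x\mid\theta)}\right),$$
so the two minus signs cancel, and multiplying and dividing the integrand by $p(x\mid\theta)$ gives the expression above.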
Using the Taylor expansion for the natural logarithm, we have
$$\log\!\left(1+\frac{\Delta p(x\mid\theta)}{p(x\mid\theta)}\right)\approx\frac{\Delta p(x\mid\theta)}{p(x\mid\theta)},$$
and therefore
$$D[p(\cdot\mid\theta),p(\cdot\mid\theta+\Delta\theta)]\approx\int\left(\frac{\Delta p(x\mid\theta)}{p(x\mid\theta)}\right)^{\!2} p(x\mid\theta)\,dx.$$
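Regarding the order of the error terms mentioned above: writing $u=\Delta p(x\mid\theta)/p(x\mid\theta)$, which is of order $\|\Delta\theta\|$, the full expansion is
$$\log(1+u)=u-\frac{u^2}{2}+\frac{u^3}{3}-\cdots,$$
so the terms dropped from the integrand $u\,\log(1+u)\,p(x\mid\theta)$ are of third and higher order in $\Delta\theta$.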
But
$$\frac{\Delta p(x\mid\theta)}{p(x\mid\theta)}\approx\frac{1}{p(x\mid\theta)}\sum_{i=1}^k \frac{\partial p(x\mid\theta)}{\partial\theta_i}\,\Delta\theta_i=\sum_{i=1}^k \frac{\partial \log p(x\mid\theta)}{\partial\theta_i}\,\Delta\theta_i.$$
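Squaring this sum and collecting terms,
$$\left(\sum_{i=1}^k \frac{\partial \log p(x\mid\theta)}{\partial\theta_i}\,\Delta\theta_i\right)^{\!2}=\sum_{i,j=1}^k \frac{\partial \log p(x\mid\theta)}{\partial\theta_i}\,\frac{\partial \log p(x\mid\theta)}{\partial\theta_j}\,\Delta\theta_i\,\Delta\theta_j,$$
and we can integrate this against $p(x\mid\theta)$ term by term.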
Hence
$$D[p(\cdot\mid\theta),p(\cdot\mid\theta+\Delta\theta)]\approx \sum_{i,j=1}^k g_{ij}\,\Delta\theta_i\,\Delta\theta_j,$$
in which
$$g_{ij}=\int \frac{\partial \log p(x\mid\theta)}{\partial\theta_i}\,\frac{\partial \log p(x\mid\theta)}{\partial\theta_j}\,p(x\mid\theta)\,dx.$$
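As a numerical sanity check (my addition, not part of Jeffreys' argument), one can verify the approximation for a concrete family. The sketch below assumes $p(x\mid\theta)$ is the normal density with $\theta=(\mu,\sigma)$, for which the Fisher information matrix is $\operatorname{diag}(1/\sigma^2,\,2/\sigma^2)$:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import norm

# Compare D[p(.|theta), p(.|theta + dtheta)] with the quadratic form
# sum_ij g_ij dtheta_i dtheta_j for a normal family with theta = (mu, sigma).
mu, sigma = 0.0, 1.0
dmu, dsigma = 0.01, -0.02

x = np.linspace(-15.0, 15.0, 40001)
p = norm(loc=mu, scale=sigma).pdf(x)
q = norm(loc=mu + dmu, scale=sigma + dsigma).pdf(x)

divergence = trapezoid((p - q) * np.log(p / q), x)

g = np.diag([1.0 / sigma**2, 2.0 / sigma**2])  # Fisher information of N(mu, sigma^2) in (mu, sigma)
dtheta = np.array([dmu, dsigma])
quadratic_form = dtheta @ g @ dtheta

print(divergence, quadratic_form)  # the two values nearly agree; they differ at higher order in dtheta
```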
This is the original paper:
Jeffreys, H. (1946). An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London, Series A, 186, 453–461.