## calculation of distance using Kullback–Leibler divergence

Hi,

I come from a biology background, so this may be a basic question for you. I have two plots, for example:

Plot 1:

X-axis values: 535.255111, 536.258228, 537.26097, 538.26361, 539.266194, 540.268735

Y-axis values: 0.7, 0.23474151, 1, 0.00980891, 0.00116291, 0.0001162

Plot 2:

X-axis values: 535.255111, 536.258228, 537.26097, 538.26361, 539.266194, 540.268735

Y-axis values: 1, 0.33474151, 0.06663174, 0.00980891, 0.00116291, 0.0001162

Now I need to design a scoring function that is symmetric and 2D (the score should depend on both the x-axis and the y-axis values) and that shows how well these two plots fit each other.

I thought of using the Kullback–Leibler divergence for Gaussians.
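For reference, this is the standard closed form of the Kullback–Leibler divergence between two one-dimensional Gaussians, together with a symmetrised version. The symbols $\mu_1, \sigma_1$ and $\mu_2, \sigma_2$ (mean and width of the Gaussian placed on a point of plot 1 and of plot 2) are assumptions of this sketch, not something fixed by the data:

$$
D_{\mathrm{KL}}(p \,\|\, q) = \ln\frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2\sigma_2^2} - \frac{1}{2}
$$

$$
J(p, q) = D_{\mathrm{KL}}(p \,\|\, q) + D_{\mathrm{KL}}(q \,\|\, p) = \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2\sigma_2^2} + \frac{\sigma_2^2 + (\mu_1 - \mu_2)^2}{2\sigma_1^2} - 1
$$

With equal widths $\sigma_1 = \sigma_2 = \sigma$ this reduces to $J = (\mu_1 - \mu_2)^2 / \sigma^2$. Note that $D_{\mathrm{KL}}$ itself is not symmetric, and both quantities are 0 for identical Gaussians and grow without bound as they move apart, so they would still need to be mapped to a 0 to 1 scale to act as a fit score.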

What I intend to do is draw a Gaussian curve for each point individually in the two plots (for example, the 1st value of plot 1 and the 1st value of plot 2), calculate the overlap of each pair, and finally sum the overlaps obtained from all the points.

If the overlap is perfect on both the x-axis and the y-axis, the fit should be 1 (100%); if there are some changes in the x-axis or y-axis values, the fit should be lower. If the y-axis values are the same but there is a significant change in the x-axis values, the score should be close to zero, as the overlap will be much smaller.
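One possible way to express this idea in code is a normalised Gaussian overlap rather than the Kullback–Leibler divergence itself. The sketch below places a Gaussian of height y at position x for every point, uses the closed-form overlap integral of each pair, and divides by the self-overlaps so that identical plots score 1. The function name `gaussian_overlap_score` and the width `sigma=0.05` are assumptions made for illustration; the width in particular has to be chosen to match how much x-axis shift should be tolerated.

```python
import math


def gaussian_overlap_score(x1, y1, x2, y2, sigma=0.05):
    """Symmetric fit score in [0, 1] for two plots with paired points.

    Each point (x, y) is treated as a Gaussian centred at x with height y
    and a common width sigma (an assumed, tunable parameter).
    """
    if not (len(x1) == len(y1) == len(x2) == len(y2)):
        raise ValueError("all four sequences must have the same length")

    # Overlap integral of two equal-width Gaussians is proportional to
    # ya * yb * exp(-(xa - xb)^2 / (4 sigma^2)); the constant factor
    # cancels in the normalisation below.
    overlap = 0.0
    for xa, ya, xb, yb in zip(x1, y1, x2, y2):
        overlap += ya * yb * math.exp(-((xa - xb) ** 2) / (4.0 * sigma ** 2))

    # Normalise by the self-overlaps so that identical plots give 1.
    norm = math.sqrt(sum(v * v for v in y1) * sum(v * v for v in y2))
    return overlap / norm if norm > 0 else 0.0


# The two example plots from this post.
x = [535.255111, 536.258228, 537.26097, 538.26361, 539.266194, 540.268735]
plot1_y = [0.7, 0.23474151, 1, 0.00980891, 0.00116291, 0.0001162]
plot2_y = [1, 0.33474151, 0.06663174, 0.00980891, 0.00116291, 0.0001162]

print(gaussian_overlap_score(x, plot1_y, x, plot2_y))
```

Under these assumptions the score is symmetric in the two plots, drops below 1 when the y-axis values differ, and goes towards zero when the x-axis values are shifted by much more than `sigma`.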

I hope I have presented the idea clearly. It would be helpful if anyone could provide an implementation of the formula for this problem, as I'm not good at maths.

Thanks in advance.
