## Plz help - question on probability of a nucleotide sequence

### Plz help - question on probability of a nucleotide sequence

**First off, the question. Please let me know if I am on the right path or missing something.**

A molecule of double-stranded DNA that is 5 million base pairs long has a base composition that is 62% G + C. How many times, on average, are the following restriction sites likely to be present in this DNA molecule? (a) BamHI (recognition sequence = GGATCC)

**My attempt at solving it:**

(.19) chance of getting an A, and likewise a T (38% A + T, split evenly)

(.31) chance of getting a C, and likewise a G (62% G + C, split evenly)

To get AAGCTT I first calculated the probability of that sequence occurring in a row:

(.31)(.31)(.19)(.19)(.31)(.31) = 0.00033339

I then multiplied this probability by the total number of nucleotides in 5 million base pairs, which is 10 million.

Therefore (0.00033339)(10,000,000) = 3333.9 would be my answer.

However, I was thinking about dividing this answer by 6, because the sequence cannot possibly occur more than once every 6 nucleotides. I.e., the sequence AAGCTTAAGCTT is 12 nucleotides long, but the site could never occur in it more than twice.

So my final answer to the original question above would be 3333.9/6, which gives 555.7 restriction sites likely to be present in this DNA molecule of 5,000,000 base pairs.
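For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. It assumes the bases are independent, with G and C each at half the G + C fraction and A and T each at half the remainder. The function names (`base_probs`, `site_probability`, `expected_sites`) are made up for illustration. How many positions to scan is left as a parameter, since that is the part under discussion.

```python
import math

def base_probs(gc_fraction):
    """Per-base probabilities, assuming G = C and A = T (an assumption)."""
    at = (1.0 - gc_fraction) / 2.0
    gc = gc_fraction / 2.0
    return {"A": at, "T": at, "G": gc, "C": gc}

def site_probability(site, gc_fraction):
    """Probability that a given window matches `site`, bases independent."""
    probs = base_probs(gc_fraction)
    return math.prod(probs[base] for base in site)

def expected_sites(site, gc_fraction, n_positions):
    """Expected matches when scanning n_positions bases of one sequence.

    Each possible start position contributes the same probability
    (linearity of expectation), so this model applies no division by
    the site length.
    """
    starts = n_positions - len(site) + 1
    return starts * site_probability(site, gc_fraction)

p = site_probability("GGATCC", 0.62)
print(f"per-window probability: {p:.8f}")
print(f"scanning 10,000,000 positions: {expected_sites('GGATCC', 0.62, 10_000_000):.1f}")
print(f"scanning  5,000,000 positions: {expected_sites('GGATCC', 0.62, 5_000_000):.1f}")
```

Note that GGATCC is its own reverse complement, so a site found on one strand is the same site on the other strand. Under that reading, scanning the 5 million positions of a single strand may already count every site once; treat that as an assumption about how the question is meant, not as something settled in this thread.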

### Re: Plz help - question on probability of a nucleotide sequence

james0222 wrote: To get AAGCTT I first calculated the probability of that sequence occurring in a row:

(.31)(.31)(.19)(.19)(.31)(.31) = 0.00033339

But I think with this formula, the order doesn't matter, right?

### Re: Plz help - question on probability of a nucleotide sequence

JackBean wrote: james0222 wrote: To get AAGCTT I first calculated the probability of that sequence occurring in a row:

(.31)(.31)(.19)(.19)(.31)(.31) = 0.00033339

But I think with this formula, the order doesn't matter, right?

So, the number of permutations of a sequence of 6 elements with two repeated pairs (the As and the Ts) is:

6!/(2! × 2!) = 180

This means there are 180 permutations of a 6 bp sequence consisting of 2 As, 2 Ts, 1 G and 1 C. So you should divide your answer by 180:

555.7/180 = 3.1
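The count of 180 can be double-checked with a short Python sketch. It computes the multiset permutation formula and compares it against brute-force enumeration; the helper name `distinct_arrangements` is hypothetical. This only verifies the count itself, not whether dividing by it is the right step.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def distinct_arrangements(seq):
    """Number of distinct orderings of seq: n! / (c1! * c2! * ...)."""
    total = factorial(len(seq))
    for count in Counter(seq).values():
        total //= factorial(count)
    return total

for site in ("AAGCTT", "GGATCC"):
    by_formula = distinct_arrangements(site)
    by_enumeration = len(set(permutations(site)))  # brute-force check
    print(site, by_formula, by_enumeration)
```

Both sequences have two repeated pairs and two single bases, so the formula gives the same count for each.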

I think the rest of the calculation seems correct. But I don't understand why you did the calculation for the AAGCTT sequence when your sequence is GGATCC.

### Re: Plz help - question on probability of a nucleotide sequence

But the formula

(.31)(.31)(.19)(.19)(.31)(.31) = 0.00033339

is for a sequence of 2 A/T and 4 C/G.

### Re: Plz help - question on probability of a nucleotide sequence

Yes, I see now that the calculation is for the sequence given in the question. So the answer should be 3.

This sentence is what made me ask; I guess james0222 mistyped it: "To get AAGCTT I first calculated the probability of that sequence occurring in a row:"
