## Plz help - question on probability of a nucleotide sequence

james0222
Garter
Posts: 1
Joined: Fri Mar 25, 2011 3:23 am

### Plz help - question on probability of a nucleotide sequence

First off, the question (please let me know if I am on the right path or missing something):

A molecule of double-stranded DNA that is 5 million base pairs long has a
base composition that is 62% G + C. How many times, on average, are the
following restriction sites likely to be present in this DNA molecule? (a)
BamHI (recognition sequence = GGATCC)

My attempt at solving it:

(.19) chance of getting an A, and the same for a T
(.31) chance of getting a C, and the same for a G

To get AAGCTT I first calculated the probability of that sequence occurring in a row:

(.31)(.31)(.19)(.19)(.31)(.31) = 0.00033339

I then multiplied this probability by the total number of nucleotides in 5 million base pairs, which is 10 million.

Therefore (0.00033339)(10,000,000) = 3333.9 would be my answer.

However, I was thinking about dividing this answer by 6, because the sequence cannot possibly occur more than once every 6 nucleotides. I.e., the sequence AAGCTTAAGCTT is 12 nucleotides long, but the site could never occur in it more than twice.

So my final answer to the original question above would be 3333.9/6 = 555.7 restriction sites likely to be present in this DNA molecule of 5,000,000 base pairs.
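
As a cross-check of the arithmetic in this post, here is a minimal Python sketch. It assumes the bases are independent, with P(G) = P(C) = 0.31 and P(A) = P(T) = 0.19 as stated above; the names `BASE_PROB`, `site_probability`, and `expected_sites` are made up for illustration. It prints the expected count for 10 million start positions (the figure used in this post) and for 5 million (one per base pair). Since GGATCC is its own reverse complement, a match on one strand is the same site on the other strand, so the 5 million convention may be the one the question intends.

```python
from math import prod

# Per-base probabilities, assuming 62% G + C is split evenly between G and C
# (and the remaining 38% evenly between A and T).
BASE_PROB = {"G": 0.31, "C": 0.31, "A": 0.19, "T": 0.19}

def site_probability(site):
    """Probability that a given position starts this exact sequence."""
    return prod(BASE_PROB[base] for base in site)

def expected_sites(site, positions):
    """Expected number of matches over `positions` possible start positions."""
    return site_probability(site) * positions

p = site_probability("GGATCC")
print(f"P(GGATCC) = {p:.8f}")
print(f"10 million positions: {expected_sites('GGATCC', 10_000_000):.1f}")
print(f"5 million positions:  {expected_sites('GGATCC', 5_000_000):.1f}")
```

The sketch ignores edge effects (the last five positions cannot start a 6-base site), which are negligible at this length.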

JackBean
Inland Taipan
Posts: 5694
Joined: Mon Sep 14, 2009 7:12 pm

### Re: Plz help - question on probability of a nucleotide sequence

james0222 wrote: To get AAGCTT I first calculated the probability of that sequence occurring in a row:

(.31)(.31)(.19)(.19)(.31)(.31) = 0.00033339

But I think with this formula, the order doesn't matter, right?
http://www.biolib.cz/en/main/

Cis or trans? That's what matters.

DRT23
Garter
Posts: 34
Joined: Sat Feb 26, 2011 5:29 pm
Location: Istanbul, Turkey

### Re: Plz help - question on probability of a nucleotide sequence

JackBean wrote:
james0222 wrote: To get AAGCTT I first calculated the probability of that sequence occurring in a row:

(.31)(.31)(.19)(.19)(.31)(.31) = 0.00033339

But I think with this formula, the order doesn't matter, right?

So, the number of permutations of a sequence of 6 elements with two repeated pairs (the As and the Ts) is:
6!/(2! × 2!) = 180

Which means there are 180 permutations of a 6 bp sequence consisting of 2 As, 2 Ts, 1 G and 1 C. So, you should divide your answer by 180:
555.7/180 = 3.1

I think the rest of the calculation seems correct. But I didn't understand why you did the calculation for the AAGCTT sequence when your sequence is GGATCC?
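
The permutation count itself can be checked with a short Python sketch (the helper name `distinct_arrangements` is made up for illustration). It only verifies the count of 180. Under the assumption of independent bases, every one of those arrangements has the same probability as any other, because the product of the per-base probabilities does not depend on the order of the factors.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def distinct_arrangements(seq):
    """Number of distinct orderings of the letters in seq (multinomial coefficient)."""
    total = factorial(len(seq))
    for count in Counter(seq).values():
        # Divide out the orderings of each repeated letter among itself.
        total //= factorial(count)
    return total

print(distinct_arrangements("AAGCTT"))  # 2 As, 2 Ts, 1 G, 1 C
print(distinct_arrangements("GGATCC"))  # 2 Gs, 2 Cs, 1 A, 1 T

# Brute-force check: enumerate all 6! orderings and keep only the unique ones.
assert len(set(permutations("AAGCTT"))) == distinct_arrangements("AAGCTT")
```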

JackBean
Inland Taipan
Posts: 5694
Joined: Mon Sep 14, 2009 7:12 pm

### Re: Plz help - question on probability of a nucleotide sequence

But the formula
(.31)(.31)(.19)(.19)(.31)(.31) = 0.00033339

is for a sequence of 2 A/T and 4 C/G.

DRT23
Garter
Posts: 34
Joined: Sat Feb 26, 2011 5:29 pm
Location: Istanbul, Turkey

### Re: Plz help - question on probability of a nucleotide sequence

Yes, I see now that the calculation is for the sequence given in the question. So, the answer should be 3.
