SNPs and promoter polymorphisms

Postby ermis » Mon Aug 21, 2006 3:08 pm

Here is the thing:

I am searching here
http://www.ncbi.nlm.nih.gov/entrez/quer ... rch&DB=snp
for SNPs located in the promoters of several genes.

My questions are:

1) Am I searching in the right place, or should I try here:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed ?

2) If this is the right place
(http://www.ncbi.nlm.nih.gov/entrez/quer ... rch&DB=snp),
which option in the "Limits" section should I use so that the results include only promoter SNPs?

3) If this is not the right place
(http://www.ncbi.nlm.nih.gov/entrez/quer ... rch&DB=snp),
where and under which keywords or limits should I search?


Thanks in advance
ermis

Postby weesper » Mon Aug 21, 2006 3:39 pm

Hmm, this is interesting. I guess the problem is that for most genes the actual promoter sequence, let alone the TF binding sites, has not been well defined, if at all. That means there is no database where you can just look up the promoter sequence of any given gene (as far as I know; if you can show me otherwise I'd be really interested). I think you just have to work your way through this database.

Postby ermis » Tue Aug 22, 2006 12:00 pm

weesper wrote:there is no database where you can just look up the promoter sequence of any random gene (as far as I know, if you can show me otherwise I'd be really interested). I think you just have to work yourself through this database


1) So I guess I should first look at the references for each gene and then try to infer whether there is a promoter polymorphism?

2) Our university has no sequencer, so we use RFLP. Is there an easier way, I mean using software such as Sequencher 1.4.1, that would let us avoid RFLP?

Postby weesper » Wed Aug 23, 2006 6:07 am

1) Yes, try to see whether a promoter sequence has been defined (usually through a process called promoter bashing) and then search the database within that stretch of the genome. I too wish all this were much easier, and that you could just type in exactly what you are interested in and get all your sites highlighted with SNP frequencies, but it doesn't work that way. Remember that the database is relatively new and everyone is still submitting data; the process of categorizing all the data has not yet begun.

2) I'm not sure. I just sequence, and would only do RFLPs once I've identified something I can reliably work with.

Promoter SNP searching

Postby Jimbob » Tue Sep 19, 2006 2:44 pm

There is a way!
Try using CHIP Bioinformatics software (free). You can look specifically for promoter polymorphisms. I find it works quite well.

Good luck
Jimbob

Use Genome Browser

Postby G-Do » Mon Oct 02, 2006 2:21 pm

If you're trying to find all SNPs which impact the promoter regions of all human genes, I have a technique, though it is imperfect and roundabout.

I have attached a file called customtrack.png. This is a UCSC Genome Browser custom track which annotates promoter data. It is derived from the existing Genome Browser annotation - each "promoter" is a window of 2000bp upstream of the transcription start site (TSS) of each RefSeq gene. This definition is not perfect, as it will miss some of the promoters, but it's a good start; it covers a little more than 23,000 reference sequences for known genes.

Go to the UCSC Genome Browser (http://genome.ucsc.edu) and click on "Genomes" in the left-hand navigation bar. Set the clade to "Vertebrate" and the genome to "Human" and the build to "Mar. 2006," then click "add custom tracks." You will be taken to a new page. Upload customtrack.txt; it will take a while to be fully loaded. When the browser window opens up, zoom out 10x and you'll see a big red block just upstream of the HIC2 gene. This is a 2000bp element which roughly corresponds to the promoter for that gene.

Now, to get all SNPs that fall into these regions, we'll have to query the UCSC database directly by using the "Tables" feature. Click on "Tables" in the upper navigation bar. Set "group" to "Variation and Repeats," set "track" to "SNPs," and set "table" to "snp126." Set "region" to "genome." In the middle of the form you should see a label "intersection" and a button next to it called "create." Click that button. On the new page, set "group" to "Custom Tracks." The track and table should be set automatically to the correct values ("promoters"). Scroll to the bottom and click "submit." This will take you back to the original tables page. Go to the bottom of the form and set "output format" to "sequence," then click "get output." On the next page, click "get sequence."

It will take a long time to get all the results - in fact, it might be quicker if you go back to the main tables page and have this all written to an output file. At any rate, the output format is a FASTA file which tells you the SNP ID ("rs" and some digits, usually), the locus where it's found, and the reference allele (the FASTA sequence).
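For anyone who would rather build the promoter track and do the intersection offline, here is a minimal Python sketch of the same idea. It assumes 0-based, half-open coordinates (as in BED files), and the gene records, SNP IDs, and function names are all made up for illustration; none of this is the actual custom track file. The "promoter" is simply the 2000bp window upstream of the TSS, so it has the same limitations as described above.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    strand: str    # "+" or "-"
    tx_start: int  # 0-based, inclusive (lower coordinate)
    tx_end: int    # 0-based, exclusive (higher coordinate)


def promoter_window(gene, size=2000):
    """Return (start, end) of the window `size` bp upstream of the TSS."""
    if gene.strand == "+":
        # TSS is the lower coordinate; upstream means smaller positions.
        return max(0, gene.tx_start - size), gene.tx_start
    # On the (-) strand the TSS is the higher coordinate; upstream means larger positions.
    return gene.tx_end, gene.tx_end + size


def custom_track(genes, size=2000):
    """Build BED-style custom track text, one promoter window per gene."""
    lines = [f'track name=promoters description="{size} bp upstream of TSS"']
    for g in genes:
        start, end = promoter_window(g, size)
        lines.append(f"{g.chrom}\t{start}\t{end}\t{g.name}_promoter\t0\t{g.strand}")
    return "\n".join(lines)


def snps_in_promoters(genes, snps, size=2000):
    """Return (snp_id, gene_name) pairs for SNPs inside a promoter window.

    `snps` is an iterable of (snp_id, chrom, position) with 0-based positions.
    """
    hits = []
    for snp_id, chrom, pos in snps:
        for g in genes:
            start, end = promoter_window(g, size)
            if g.chrom == chrom and start <= pos < end:
                hits.append((snp_id, g.name))
    return hits


if __name__ == "__main__":
    # Hypothetical genes and SNPs, not real annotation.
    genes = [
        Gene("geneA", "chr1", "+", 10000, 20000),
        Gene("geneB", "chr1", "-", 50000, 60000),
    ]
    snps = [("rs_demo1", "chr1", 9500), ("rs_demo2", "chr1", 60500), ("rs_demo3", "chr1", 15000)]
    print(custom_track(genes))
    print(snps_in_promoters(genes, snps))
```

The nested loop is fine for a handful of genes; for a genome-wide run you would want to sort the windows or use an interval index instead.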

Hope this helped!
Attachments
customtrack.png
A UCSC Genome Browser custom track which annotates the 2kb upstream regions for each RefSeq gene. To be used in query intersections.

Well, that didn't work.

Postby G-Do » Mon Oct 02, 2006 2:30 pm

Ah, crud. That didn't work. It's kind of stupid that the forum won't let you upload a standard text file. I don't want to clutter up the doc/pdf forum with data files, either. If you're still interested, why don't you send me a PM with your email address and I'll get the custom track to you that way?

Re: Well, that didn't work.

Postby ermis » Wed Oct 04, 2006 2:41 pm

1) How about this method:

I went here: http://www.pubmed.com and chose the Gene database.
Let's say I am interested in promoter polymorphisms of the VWF gene.
I typed VWF into the search box and it gave me the VWF gene information for several organisms. I chose Homo sapiens, but instead of clicking the gene itself I clicked the word "Links" at the top right of the gene name.
Then I chose "GeneView in dbSNP" and it gave me all the SNPs for the VWF gene, here: http://www.ncbi.nlm.nih.gov/SNP/snp_ref ... cusId=7450.
At the bottom of this page there are 11 SNPs listed in the genomic region near the VWF gene. My question is:

ARE THESE SNPs PROMOTER SNPs?

2) I have also found this: http://www.pubmedcentral.nih.gov/articl ... id=1361283

Is it worth trying? I tried it and it gave me a dead link.

3) Another question: If you have the RS number for a given SNP, is there an easy way to find if this is a promoter SNP?


Postby G-Do » Thu Oct 05, 2006 12:30 pm

Hi ermis,

1) No. This list gives you all SNPs within 2kb of the VWF gene, but that includes downstream, too. So not all of these SNPs are promoter SNPs.

2) I don't know if that tool is worth trying. According to the paper it is here:

http://primer.duhs.duke.edu/

But when you follow that link and try to access the tool you have to input a username and password. You could try to get in touch with the corresponding author on the paper, Hong Xu, to get access.

3) There's no guaranteed way to find a promoter SNP. The best you can do is look to see whether it falls within some distance of a gene's TSS. If you have the dbSNP ID (the rs-something-something ID), go to UCSC Genome Browser, click on "Genomes" on the left-hand navigation bar, and put the dbSNP ID into the "position or search term" box, then click the "submit" button. This will take you to a browser window on that SNP. Zoom out. Are you upstream of a gene? How far apart are the SNP and the gene's TSS? Use the distance scale at the top of the window to figure this out. Usually 1kb upstream of the TSS (or less) is a good indicator that you're in the promoter, though people often go to 2kb (and sometimes as far away as 5kb).
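If you have already read the SNP position, the TSS position, and the strand off the browser, the distance check in 3) can be written down as a small Python sketch. The function names and the example coordinates are hypothetical; the 1kb, 2kb, and 5kb cut-offs are just the rules of thumb mentioned above, not a formal definition of a promoter.

```python
def upstream_distance(snp_pos, tss, strand):
    """Distance in bp from the SNP to the TSS.

    Positive means the SNP is upstream of the TSS, negative means downstream.
    Both positions must be on the same (+) strand coordinate system.
    """
    if strand == "+":
        return tss - snp_pos
    # For a (-) strand gene, upstream is toward larger coordinates.
    return snp_pos - tss


def promoter_call(snp_pos, tss, strand):
    """Label a SNP using the rule-of-thumb distance cut-offs."""
    d = upstream_distance(snp_pos, tss, strand)
    if d <= 0:
        return "not upstream"
    if d <= 1000:
        return "within 1 kb upstream"
    if d <= 2000:
        return "within 2 kb upstream"
    if d <= 5000:
        return "within 5 kb upstream"
    return "more than 5 kb upstream"


if __name__ == "__main__":
    # Made-up positions for illustration only.
    print(promoter_call(9500, 10000, "+"))
    print(promoter_call(61500, 60000, "-"))
    print(promoter_call(10500, 10000, "+"))
```

A label like "within 1 kb upstream" is only a hint that the SNP may sit in the promoter; it says nothing about whether it actually affects regulation.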

Postby ermis » Thu Oct 05, 2006 8:21 pm

G-Do wrote:Hi ermis,

1) No. This list gives you all SNPs within 2kb of the VWF gene, but that includes downstream, too. So not all of these SNPs are promoter SNPs.

The gene in this example, VWF, starts at 5928301 and ends at 6104097 on the minus strand. Does this mean that the promoter SNPs are located at 5928301 and below? If we look at the 11 SNPs near the gene, only 3 of them fulfill this rule according to their contig positions. I am a bit confused. Does the minus strand mean that transcription runs from 3' to 5'? According to this link, http://www.ncbi.nlm.nih.gov/entrez/quer ... _uids=7450, the red arrows near the start and end coordinates of the gene point in reverse, from 3' to 5'. Is this the direction of transcription, and is every SNP located before the transcription start a promoter SNP?
If this is the case, then why can't the same rule be applied to these SNPs (http://www.ncbi.nlm.nih.gov/SNP/snp_ref ... cusId=3383)? [The contig positions of the 6 SNPs are too far from the range of the gene, 10242779 > 10258291.] Is there a way to guess the promoter SNPs in this case?

3) I will check that and give feedback. Thanks for the help.

Postby G-Do » Fri Oct 06, 2006 1:53 pm

VWF is on the (-) strand and the numbers UCSC gives are for the (+) strand. You can tell that this is the case because when you look at VWF in the UCSC Genome Browser, the arrow notches point from right to left and not left to right. Therefore, you have to think about the whole thing backwards. Transcription always runs 5' to 3' on the strand the gene is on, so in this case it runs from right to left. For that reason, the upstream "promoter" region for this gene is not "5928301 and less" but rather the sequence just greater than 6104097 (6104097-6106097 if you want to do a 2kb window). SNPs in this area may have an effect on the promoter. If any of those SNPs are in this window (and if they overlap a conserved region), chances are they are important.

Anyway, here is how I would find promoter SNPs for this gene:

1) Go to UCSC Genome Browser.

2) Search for VWF as described above.

3) Determine the strand the gene is on using the arrow notches. If it's left-to-right, it's on the (+) strand. If it's right-to-left, it's on the (-) strand.

4) Find the promoter region (for the sake of argument, say it's the 2kb upstream of the gene's 5' boundary). For a (+) strand gene that means the 2kb just below the lower coordinate, and for a (-) strand gene the 2kb just above the higher coordinate.

5) Turn on the SNPs track (scroll all the way to the bottom of the screen) by setting "SNPs" to "pack." A new track should appear detailing the SNPs in the region you're looking at.
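To make the strand logic in steps 3 and 4 concrete, here is a small Python sketch using the VWF coordinates quoted in this thread (5928301 to 6104097, minus strand). It treats both gene boundaries as inclusive, which is an assumption, and the function names and test positions are invented for illustration rather than taken from dbSNP.

```python
VWF_LOW, VWF_HIGH, VWF_STRAND = 5928301, 6104097, "-"


def relative_location(pos, low, high, strand):
    """Say whether a position is upstream of, within, or downstream of a gene."""
    if low <= pos <= high:
        return "within gene"
    below_gene = pos < low
    if strand == "+":
        return "upstream" if below_gene else "downstream"
    # (-) strand: everything is read backwards, so smaller coordinates are downstream.
    return "downstream" if below_gene else "upstream"


def in_upstream_window(pos, low, high, strand, size=2000):
    """True if the position lies in the `size` bp window just upstream of the gene."""
    if strand == "+":
        return low - size <= pos < low
    return high < pos <= high + size


if __name__ == "__main__":
    # Hypothetical SNP positions chosen to show each case.
    for pos in (5928000, 6000000, 6105000, 6110000):
        print(
            pos,
            relative_location(pos, VWF_LOW, VWF_HIGH, VWF_STRAND),
            in_upstream_window(pos, VWF_LOW, VWF_HIGH, VWF_STRAND),
        )
```

With these numbers, a position just below 5928301 comes out as downstream, and only positions from just above 6104097 up to 6106097 fall in the 2kb upstream window, which matches the window given above.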

Postby ermis » Sun Oct 08, 2006 11:28 am

Thanks for the help. I did everything you said but I have one more question. What do the colours of the RS numbers mean? So far I have noticed blue, red, green, black and grey RS numbers.

For example, I compared the results of the 2 databases (NCBI, UCSC) for the VWF gene and noticed that the grey UCSC RS numbers (35679344, 35501232, 35230054) are not in the NCBI results. Is there anything I should be careful about with these colours?

In the TGFB1 gene the UCSC database gives me, among others, one red and one green RS number (see link: http://genome.ucsc.edu/cgi-bin/hgTracks). What is the meaning of their colours? Should I include them as promoter SNPs?