Who could tell me how to do?

Post by cnstr14 » Thu Jun 24, 2010 8:13 am

I have downloaded 120 sequences from GenBank for designing primers. Is there a simple way to import the GenBank IDs of these sequences (not the sequences themselves) into the references section of my article? Thanks.

Post by JackBean » Thu Jun 24, 2010 12:25 pm

You need to parse only the lines that begin with ">".
http://www.biolib.cz/en/main/

Cis or trans? That's what matters.

Post by cnstr14 » Thu Jun 24, 2010 1:50 pm

JackBean wrote: you need to parse only the lines with > in the beginning.


Do you mean I should add ">" at the beginning of each sequence and then import the FASTA-format sequences into some software for further analysis? Actually, I only need to import the sequences' GenBank IDs, not the sequences themselves, into my MS Word article.

Post by JackBean » Mon Jul 05, 2010 10:54 am

Well, the FASTA format looks like this:

>ID_you're_looking_for|stuff you probably don't need|more stuff|and yet other
HEREISYOURSEQUENCEYOU'RENOTINTERESTEDINMOSTLY60OR100LETTE
RSPERLINE;)

So you need to pick only the lines starting with ">", like this one:
>ID_you're_looking_for|stuff you probably don't need|more stuff|and yet other
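
Picking out the header lines can be sketched in Python roughly as follows. This is only a minimal sketch: the function name `fasta_ids` and the sample records are made up for illustration, and it assumes the ID is the first "|"-separated field, as in the example header above. If your downloaded headers put the accession in a different field, change the index accordingly.

```python
def fasta_ids(lines):
    """Yield the first '|'-separated field of each FASTA header line."""
    for line in lines:
        if line.startswith(">"):
            # Drop the leading ">" and keep the text before the first "|".
            yield line[1:].strip().split("|")[0]


# Hypothetical records, shaped like the header sketched above.
example = [
    ">ID_one|stuff|more stuff",
    "ACGTACGTACGT",
    "ACGT",
    ">ID_two|other stuff",
    "TTGACA",
]

ids = list(fasta_ids(example))
# A comma-separated list is easy to paste into a word processor.
print(", ".join(ids))
```

To run it on a real file, pass an open file object instead of the list, for example `fasta_ids(open("sequences.fasta"))`, where the file name is again just a placeholder.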

Post by fcs » Wed Aug 11, 2010 3:49 am

To strip out the sequence information, you can run a "find and replace" in an editor that supports regular expressions, replacing every match of this pattern with nothing:

^\w+$

Make sure you work on a copy of the original file, because this removes all the sequence lines and leaves you with only the headers.
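
If your editor has no regex search, the same replacement can be tried in Python. This is a rough sketch under a few assumptions: the helper name `strip_sequence_lines` and the sample text are invented for illustration, and sequence lines are assumed to contain only letters, digits, or underscores (a line containing "*" or "-" would not match the pattern and would be left in place).

```python
import re


def strip_sequence_lines(text):
    """Remove lines made up only of word characters, keeping the headers."""
    # re.MULTILINE lets ^ and $ match at every line, not just the whole text.
    without_seq = re.sub(r"^\w+$", "", text, flags=re.MULTILINE)
    # The substitution leaves empty lines behind, so drop them.
    return "\n".join(line for line in without_seq.splitlines() if line)


# Hypothetical FASTA text used only to demonstrate the pattern.
sample = ">ID_one|stuff\nACGTACGT\nACGT\n>ID_two|other\nTTGACA\n"
headers = strip_sequence_lines(sample)
print(headers)
```

Header lines survive because they start with ">", which `\w` does not match.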
