Usage

Input Sequences and/or PDB IDs (plus chain IDs)
You must provide a list of sequences and/or PDB and chain identifiers. The list may be pasted into the text area or uploaded from a file. In either case, sequences must be in FASTA format, and each PDB and chain identifier must be joined into a single string of length 5 (e.g. 1nagA). Each PDB and chain identifier line must be preceded by a line containing the string '>PDBID' and nothing else.

Here are some examples of VALID inputs:
>PDBID
3ygsC
>Q6Q899_DDX58_MOUSE_1-91
MTAAQRQNLQAFRDYIKKILDPTYILSYMSSWLEDEEVQYIQAEKNNKGPMEAASLFLQYLLKLQSEGWFQAFLDALYHAGYCGLCEAIES
>Q6Q899_DDX58_MOUSE_101-176
EEHRLLLRRLEPEFKATVDPNDILSELSECLINQECEEIRQIRDTKGRMAGAEKMAECLIRSDKENWPKVLQLALE
>PDBID
2p1hA
>Q6Q899_DDX58_MOUSE_1-91
MTAAQRQNLQAFRDYIKKILDPTYILSYMSSWLEDEEVQYIQAEKNNKGPMEAASLFLQYLLKLQSEGWFQAFLDALYHAGYCGLCEAIES
>Q6Q899_DDX58_MOUSE_101-176
EEHRLLLRRLEPEFKATVDPNDILSELSECLINQECEEIRQIRDTKGRMAGAEKMAECLIRSDKENWPKVLQLALE
>Q6Q899_DDX58_MOUSE_1-91
MTAAQRQNLQAFRDYIKKILDPTYILSYMSSWLEDEEVQYIQAEKNNKGPMEAASLFLQY
LLKLQSEGWFQAFLDALYHAGYCGLCEAIES
>Q6Q899_DDX58_MOUSE_101-176
EEHRLLLRRLEPEFKATVDPNDILSELSECLINQECEEIRQIRDTKGRMAGAEKMAECLI
RSDKENWPKVLQLALE
MTAAQRQNLQAFRDYIKKILDPTYILSYMSSWLEDEEVQYIQAEKNNKGPMEAASLFLQYLLKLQSEGWFQAFLDALYHAGYCGLCEAIES
MTAAQRQNLQAFRDYIKKILDPTYILSYMSSWLEDEEVQYIQAEKNNKGPMEAASLFLQY
LLKLQSEGWFQAFLDALYHAGYCGLCEAIES
>PDBID
3ygsC
>PDBID
2p1hA


And some examples of INVALID inputs:
>this_is_a_pdbid
3ygsC
>PDB_ID
2p1hA
3ygsC
2p1hA
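The rule for PDB and chain identifiers can be checked before submitting. The following is a minimal sketch, not Seekquencer's own validator: the function name find_pdb_errors and the identifier pattern (a classic 4-character PDB ID starting with a digit, followed by a 1-character chain ID) are assumptions made for illustration.

```python
import re

# Assumption: a PDB-and-chain identifier is a classic 4-character PDB ID
# (a digit followed by three alphanumerics) plus a 1-character chain ID.
PDB_CHAIN = re.compile(r"^[0-9][A-Za-z0-9]{3}[A-Za-z0-9]$")


def find_pdb_errors(text):
    """Return (line_number, message) pairs for misplaced PDB identifiers.

    Hypothetical helper: it only checks the '>PDBID' rule and does not
    validate the FASTA sequences themselves.
    """
    errors = []
    previous = None  # last non-blank line seen
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if previous == ">PDBID" and not PDB_CHAIN.match(line):
            errors.append((number, "expected a 5-character PDB and chain identifier"))
        elif PDB_CHAIN.match(line) and previous != ">PDBID":
            errors.append((number, "identifier must follow a line containing only '>PDBID'"))
        previous = line
    return errors
```

For example, find_pdb_errors(">PDB_ID\n2p1hA") reports a problem on line 2, while find_pdb_errors(">PDBID\n2p1hA") returns an empty list.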
Add Structure Homologs
This feature will use BLAST to search the PDB using your Input sequences and PDBIDs as a query. There are four parameters that control what Seekquencer retrieves:
Minimum Identity. This parameter controls what BLAST considers a sequence homolog. Increasing this parameter will reduce the number of PDB entries retrieved; decreasing it will increase the number retrieved. However, an internal parameter prevents PDB entries with e-values > 0.01 from being included. (Default: 20%)

Minimum Coverage. This parameter determines how much of the query sequence a particular PDB entry must cover. Ideally, the structure would cover all or most of the query; if it does not, you might consider breaking your query sequences into domains. (Default: 50%)

Clustering Threshold. This parameter prevents many instances of a particular structure from being retrieved. If you want fewer structures, lower the value; if you want more, increase it; using 100 will add all PDB entries that are homologous to your input. The pruning of sequences is performed using the program cd-hit. (Default: 90%)

Database. Select the database to use as reference. (Default: pdb)
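How the identity, coverage and e-value limits combine can be sketched as a simple filter. This is an illustration under stated assumptions, not Seekquencer's implementation: the Hit record, its field names and the definition of coverage as the fraction of the query spanned by the alignment are all hypothetical. The clustering step is not shown.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    # Hypothetical record; field names are illustrative only.
    hit_id: str
    percent_identity: float  # identity over the aligned region, 0-100
    query_start: int         # 1-based, inclusive
    query_end: int           # 1-based, inclusive
    evalue: float


MAX_EVALUE = 0.01  # the fixed internal e-value cutoff described above


def query_coverage(hit, query_length):
    """Percentage of the query spanned by the alignment (assumed definition)."""
    return 100.0 * (hit.query_end - hit.query_start + 1) / query_length


def filter_hits(hits, query_length, min_identity=20.0, min_coverage=50.0):
    """Keep hits that pass the e-value, identity and coverage thresholds."""
    kept = []
    for hit in hits:
        if hit.evalue > MAX_EVALUE:
            continue  # excluded regardless of the user-set parameters
        if hit.percent_identity < min_identity:
            continue
        if query_coverage(hit, query_length) < min_coverage:
            continue
        kept.append(hit)
    return kept
```

Raising min_identity or min_coverage can only shrink the returned list, which matches the behaviour described for the parameters above.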
Add Sequence Homologs
This feature allows you to pull in sequences from the UniRef database. There are six parameters that control what Seekquencer retrieves:
Minimum Identity. This parameter controls what BLAST considers a sequence homolog. Increasing this parameter will reduce the number of UNIREF entries retrieved; decreasing it will increase the number retrieved. However, an internal parameter prevents UNIREF entries with e-values > 0.01 from being included. (Default: 20%)

Minimum Coverage. This parameter determines how much of the query sequence a particular UNIREF entry must cover. Ideally, the hit would cover all or most of the query; if it does not, you might consider breaking your query sequences into domains. (Default: 50%)

Clustering Threshold. This parameter prevents many instances of a particular sequence from being retrieved. If you want fewer sequences, lower the value; if you want more, increase it; using 100 will add all UNIREF entries that are homologous to your input. The pruning of sequences is performed using the program cd-hit. (Default: 90%)

Database. Select the database to use as reference. (Default: uniref90)

Search Algorithm. This parameter controls which BLAST algorithm to use. (Default: Blast)

Trim Hit Sequence. This parameter determines how much of each hit sequence is returned. Selecting 'Yes' returns only the region of the hit that aligns to your input, while selecting 'No' returns the full sequence. (Default: Yes)
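The effect of Trim Hit Sequence can be illustrated with a short sketch. The function select_hit_sequence and its 1-based, inclusive coordinates are assumptions chosen for the example; they do not describe how Seekquencer stores alignments internally.

```python
def select_hit_sequence(full_sequence, hit_start, hit_end, trim=True):
    """Return the aligned region of a hit, or the full hit sequence.

    hit_start and hit_end are assumed to be 1-based, inclusive positions
    of the aligned region within the hit sequence.
    """
    if not trim:
        return full_sequence
    return full_sequence[hit_start - 1:hit_end]
```

With a toy hit sequence "MTAAQRQNLQ" aligned over positions 3 to 6, trimming returns "AAQR", while trim=False returns all ten residues.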
Add ASH Structural Neighbors
This feature allows you to pull in structural homologs of your query sequence(s). We maintain a database of ASH structural alignments. If one or more of your queries can be matched to one or more of the structures for which pre-computed alignments are available, the list of structural neighbors can be added subject to the following constraints:
Minimum Identity. This parameter controls what BLAST considers a sequence homolog. Increasing this parameter will reduce the number of ASH structural neighbors retrieved; decreasing it will increase the number retrieved. However, an internal parameter prevents ASH structural neighbors with e-values > 0.01 from being included. (Default: 20%)

Minimum Coverage. This parameter determines how much of the query sequence a particular ASH structural neighbor must cover. Ideally, the structure would cover all or most of the query; if it does not, you might consider breaking your query sequences into domains. (Default: 50%)

Clustering Threshold. This parameter prevents many instances of a particular structure from being retrieved. If you want fewer structures, lower the value; if you want more, increase it; using 100 will add all ASH structural neighbors that are homologous to your input. The pruning of sequences is performed using the program cd-hit. (Default: 90%)
Cluster Final Results
This feature allows you to perform a final clustering of all the results.
Threshold. This parameter prevents many instances of a particular structure from being retrieved. If you want fewer structures, lower the value; if you want more, increase it. The pruning of sequences is performed using the program cd-hit. (Default: 70%)
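To run a comparable clustering locally, a percentage threshold can be translated into a cd-hit command line. This is a sketch only: how Seekquencer itself invokes cd-hit is not documented here, and the helper names and file names are hypothetical. The -i, -o, -c and -n options and the word-size ranges follow the cd-hit user guide.

```python
def cd_hit_word_size(identity):
    """Word size (-n) suggested by the cd-hit user guide for a threshold."""
    if identity >= 0.7:
        return 5
    if identity >= 0.6:
        return 4
    if identity >= 0.5:
        return 3
    if identity >= 0.4:
        return 2
    raise ValueError("cd-hit does not support protein thresholds below 0.4")


def cd_hit_command(fasta_in, fasta_out, threshold_percent):
    """Build (but do not run) a cd-hit argument list for a percent threshold."""
    identity = threshold_percent / 100.0  # cd-hit expects a fraction for -c
    return [
        "cd-hit",
        "-i", fasta_in,
        "-o", fasta_out,
        "-c", f"{identity:.2f}",
        "-n", str(cd_hit_word_size(identity)),
    ]
```

The resulting list can be passed to subprocess.run if cd-hit is installed; for the default threshold of 70 it contains "-c", "0.70", "-n", "5".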