Webscipio RPC Web Service

You can access the full functionality of Webscipio from within any software program by using the web service API.
To illustrate the process, we show a transcript of a session in Ruby, which can easily be adapted to other programming languages.
Lines starting with a number in parentheses are user input. Lines starting with "=>" are responses from the server.

The WSDL file is here.

(0)require "xmlrpc/client"
(1)client = XMLRPC::Client.new("www.webscipio.org", "/webscipioapi/api", 80)
(2)client.timeout=10000
(3)species = client.call("SearchSpeciesAdvanced", "Primate")
=>["Gorilla_gorilla", "Homo_sapiens", "Homo_sapiens_str__JCVenter", "Macaca_mulatta", "Microcebus_murinus", "Otolemur_garnettii", "Pan_troglodytes", "Pongo_pygmaeus"]
(4)species = client.call("SearchSpecies", "mel")
=>["Apis_mellifera_str__DH4", "Drosophila_melanogaster"]
(5)genomes = client.call("SearchGenomes",species[1])
=>[{"reference"=>"ncbi", "size"=>"116.45MB", "type"=>"chromosome", "version"=>4.1, "species"=>"Drosophila_melanogaster"}, {"reference"=>"ncbi", "size"=>"133.04MB", "type"=>"contigs", "version"=>4.1, "species"=>"Drosophila_melanogaster"}, {"reference"=>"ncbi", "size"=>"8.98MB", "type"=>"heterochromatin", "version"=>4.1, "species"=>"Drosophila_melanogaster"}, {"reference"=>"ncbi", "size"=>"9.75MB", "type"=>"uchromosome", "version"=>4.1, "species"=>"Drosophila_melanogaster"}]
(6)response = client.call("Query",
"MTRNAAVSGDEPDEEQASAVVAEGEDQETQNQAMQDDGQPGGEGHHLSSEKQQSTEQCRQ
RPRKKLTDKQPQRSAANSHPEQADADTDSSSCEGIKVGQRRGTRVSYAQRNLHGAELEDY
DYDGEGEGEREEDGEEEEEEEESYDDYEQSSKPEGRRPSARTALSVRSRRKTKTRQICYA
SSDLELGIGDGPNLIDGETLHKRRCISKGQMREFREAFRLFDKDGDGCITKEELGTVMRS
LGQFARVEELQEMLQEIDVDGDGNVSFEEFVDILSNMTYEDKSGLSSADQEERELRDAFR
VFDKHNRGYITASDLRAVLQCLGEDLDEEDIEDMIKEVDVDGDGRIDFYEFVHALGEPED
SQENDDEDVDTTSPLPTPKSAISLSYD",
genomes[0], "yaml")
=>[["ScipioQuery", "--- \n- matchings: \n - mismatchlist: []\n\n type: exon\n nucl_end: 3\n translation: M\n prot_end: 1\n dna_end: -910553\n dna_start: -910556\n nucl_start: 0\n seq: atg\n seqshifts: []\n\n prot_start: 0\n - type: intron\n dna_end: -910279\n dna_start: -910553\n nucl_start: 3\n seq: gtgagtgcagttgagataaaaattctacttctacttattaagcaaacttggaacgagttttgtttaattatttagta [...]
(7)response = client.call("QueryWithOptions",
"MTRNAAVSGDEPDEEQASAVVAEGEDQETQNQAMQDDGQPGGEGHHLSSEKQQSTEQCRQ
RPRKKLTDKQPQRSAANSHPEQADADTDSSSCEGIKVGQRRGTRVSYAQRNLHGAELEDY
DYDGEGEGEREEDGEEEEEEEESYDDYEQSSKPEGRRPSARTALSVRSRRKTKTRQICYA
SSDLELGIGDGPNLIDGETLHKRRCISKGQMREFREAFRLFDKDGDGCITKEELGTVMRS
LGQFARVEELQEMLQEIDVDGDGNVSFEEFVDILSNMTYEDKSGLSSADQEERELRDAFR
VFDKHNRGYITASDLRAVLQCLGEDLDEEDIEDMIKEVDVDGDGRIDFYEFVHALGEPED
SQENDDEDVDTTSPLPTPKSAISLSYD",
genomes[0], "yaml", 1, 5, 100, 0.8, 100, 0.1)
=>[["ScipioQuery", "--- \n- matchings: \n - mismatchlist: []\n\n type: exon\n nucl_end: 3\n translation: M\n prot_end: 1\n dna_end: -910553\n dna_start: -910556\n nucl_start: 0\n seq: atg\n seqshifts: []\n\n prot_start: 0\n - type: intron\n dna_end: -910279\n dna_start: -910553\n nucl_start: 3\n seq: gtgagtgcagttgagataaaaattctacttctacttattaagcaaacttggaacgagttttgtttaattatttagta [...]
(8)require 'yaml'
(9)gene = YAML.load(response[0][1])
(10)gene[0]["target"]
=>"gi|116010291|ref|NC_004354.3| Drosophila melanogaster chromosome X, complete sequence"
(11)gene[0]["dna_start"]
=>-910556
(12)gene[0]["matchings"].length
=>7
(13)gene[0]["matchings"].select {|m| m["type"] == "exon"}.length
=>4
(14)gene[0]["matchings"].select {|m| m["type"] == "exon"}.collect {|e| [e["dna_start"], e["dna_end"]]}
=>[[-910556, -910553], [-910279, -909501], [-909382, -909172], [-909112, -908942]]
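
Steps (9) to (14) can also be run offline against a saved result. The following is a minimal sketch, not part of the original session: SAMPLE_RESULT is a hand-written, abbreviated stand-in for the YAML string in response[0][1] (field names and coordinates are copied from the transcript, everything else is omitted), and spans is a hypothetical helper. The length calculation assumes that dna_end minus dna_start gives the number of nucleotides, which is what the first exon ("atg", 3 nucleotides) suggests.

```ruby
require "yaml"

# Hand-written, abbreviated stand-in for response[0][1]; the real result
# contains more fields per matching and more matchings per hit.
SAMPLE_RESULT = <<~DOC
  - target: "gi|116010291|ref|NC_004354.3| Drosophila melanogaster chromosome X, complete sequence"
    dna_start: -910556
    matchings:
      - type: exon
        dna_start: -910556
        dna_end: -910553
      - type: intron
        dna_start: -910553
        dna_end: -910279
      - type: exon
        dna_start: -910279
        dna_end: -909501
DOC

# Hypothetical helper: returns [dna_start, dna_end, length] for every
# matching of the given type ("exon" or "intron") in one hit.
def spans(hit, type)
  hit["matchings"]
    .select { |m| m["type"] == type }
    .map { |m| [m["dna_start"], m["dna_end"], m["dna_end"] - m["dna_start"]] }
end

hit = YAML.load(SAMPLE_RESULT).first
exons = spans(hit, "exon")
introns = spans(hit, "intron")

puts "target: #{hit['target']}"
exons.each { |from, to, len| puts "exon   #{from} to #{to} (#{len} nt)" }
introns.each { |from, to, len| puts "intron #{from} to #{to} (#{len} nt)" }
```

With a live session, replace SAMPLE_RESULT by response[0][1] from step (6) or (7).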

Explanation

(0)Load the XML-RPC client library.
(1)Instantiate the client with the host name, path and port number of the service.
(2)Set the timeout to 10000 seconds. Searches may take a while and we don't want the client to cancel.
(3)SearchSpeciesAdvanced searches for species in the same way as the auto-completion field on the web site (see Help-FAQ for more information). We search for "Primate" and get a list of species from the Primate taxon.
(4)SearchSpecies searches for matching species names. We search for the string "mel", and Webscipio returns Apis_mellifera_str__DH4 and Drosophila_melanogaster.
(5)We want to search against Drosophila (the species with index 1 in our species array), so we ask for the available genomes using SearchGenomes. This returns an array of hashes specifying the reference, the size, the type and the version of each genome.
(6)We decide to search the file of type chromosome and start a search with standard parameters using the Query command. The query could also be in FASTA format but here we are using a plain sequence. We want to retrieve a YAML file for further processing.
(7)If we want more control over how the search is performed, we can use QueryWithOptions. The additional arguments are, in this order: run_times, blattile, reg_size, minid, maxmis, bestsize, as described in Help-FAQ.
(8)In order to process the result we load YAML.
(9)We deserialize the second element of the first result, which is the YAML string (the first element being the name of the query).
(10)On which target was the first hit found?
(11)Where does it start on the contig?
(12)How many matchings are there (matchings are either introns or exons)?
(13)How many of them are exons?
(14)Where are the borders of these exons?
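
The six positional values in step (7) are easy to mix up. As a sketch only, the argument list for QueryWithOptions can be built from named options. Everything here except the parameter names and the transcript values is an assumption: query_with_options_args, OPTION_ORDER and TRANSCRIPT_OPTIONS are hypothetical names, the positional order is assumed to match the order given in (7), and the values are those used in the transcript, not necessarily the server defaults. The query sequence is shortened for illustration.

```ruby
# Assumed positional order of the tuning values, as listed in step (7).
OPTION_ORDER = %i[run_times blattile reg_size minid maxmis bestsize].freeze

# Values used in step (7) of the transcript (not necessarily server defaults).
TRANSCRIPT_OPTIONS = {
  run_times: 1, blattile: 5, reg_size: 100,
  minid: 0.8, maxmis: 100, bestsize: 0.1
}.freeze

# Hypothetical helper: returns the arguments that follow the method name in
# client.call("QueryWithOptions", ...). Unknown option names are rejected so
# that a typo cannot silently shift the positional values.
def query_with_options_args(sequence, genome, format, **overrides)
  unknown = overrides.keys - OPTION_ORDER
  raise ArgumentError, "unknown option(s): #{unknown.join(', ')}" unless unknown.empty?

  options = TRANSCRIPT_OPTIONS.merge(overrides)
  [sequence, genome, format, *OPTION_ORDER.map { |name| options.fetch(name) }]
end

# First genome entry returned by SearchGenomes in step (5).
genome = { "reference" => "ncbi", "size" => "116.45MB", "type" => "chromosome",
           "version" => 4.1, "species" => "Drosophila_melanogaster" }

# Shortened query sequence; only minid differs from the transcript values.
args = query_with_options_args("MTRNAAVSGDEPDEEQ", genome, "yaml", minid: 0.9)
p args.drop(3) # => [1, 5, 100, 0.9, 100, 0.1]
```

With the client from step (1), the call would then read response = client.call("QueryWithOptions", *args).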