WebScipio RESTful Web Service
You can access the functionality of WebScipio from within any software program by using the web service API. To illustrate the process, we show a transcript of a session in Ruby, which can easily be adapted to your programming language:
Lines starting with a number in parentheses are user input. Lines starting with "=>" show the value returned, which for the requests is the response from the server.
(0) |
require 'net/http'
require 'yaml' |
(1) |
url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'search_species' => 'true', 'query' => 'drosophila'}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body |
=> | "109342452031" |
(2) |
url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
response = Net::HTTP.get_response(url)
yaml_string = response.body
species = YAML::load(yaml_string) |
=> | ["Drosophila_ananassae_TSC_14024_0371_13", "Drosophila_elegans", "Drosophila_erecta_TSC_14021_0224_01", "Drosophila_ficusphila", "Drosophila_grimshawi_TSC_15287_2541_00", "Drosophila_kikkawai", "Drosophila_melanogaster", "Drosophila_mojavensis_TSC_15081_1352_22", "Drosophila_persimilis_MSH_3", "Drosophila_pseudoobscura_MV2_25", "Drosophila_sechellia_Rob3c", "Drosophila_simulans_str__Mosaic", "Drosophila_simulans_str__c1674", "Drosophila_simulans_str__md106", "Drosophila_simulans_str__md199", "Drosophila_simulans_str__nc48", "Drosophila_simulans_str__sim4", "Drosophila_simulans_str__sim6", "Drosophila_simulans_str__white501", "Drosophila_takahashii", "Drosophila_virilis_TSC_15010_1051_87", "Drosophila_willistoni_TSC_14030_0811_24", "Drosophila_yakuba_Tai18E2", "Kluyveromyces_lactis_NRRL_Y_1140"] |
(3) |
url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'search_species' => 'true', 'query' => 'primate'}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body |
=> | "791866516211" |
(4) |
url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
response = Net::HTTP.get_response(url)
yaml_string = response.body
species = YAML::load(yaml_string) |
=> | ["Callithrix_jacchus", "Gorilla_gorilla", "Gorilla_gorilla_gorilla", "Homo_sapiens", "Homo_sapiens_African_Individual_NA18507", "Homo_sapiens_JCVenter", "Homo_sapiens_JDWatson", "Macaca_fascicularis", "Macaca_mulatta", "Microcebus_murinus", "Nomascus_leucogenys", "Otolemur_garnettii", "Pan_troglodytes", "Papio_hamadryas", "Pongo_abelii", "Tarsius_syrichta"] |
(5) |
url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'search_genomes' => 'true', 'query' => 'Daphnia_pulex'}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body |
=> | "218502844915" |
(6) |
url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
response = Net::HTTP.get_response(url)
yaml_string = response.body
genomes = YAML::load(yaml_string) |
=> | [{:minor_version=>0, :type=>"supercontigs", :species=>"Daphnia_pulex", :mini_version=>0, :size=>"219.77MB", :reference=>"jgi", :major_version=>1, :path=>"genomes_jgi/Daphnia_pulex_v1_supercontigs.fasta"}, {:minor_version=>0, :type=>"supercontigs", :species=>"Daphnia_pulex", :mini_version=>0, :size=>"191.35MB", :reference=>"ncbi", :major_version=>1, :path=>"genomes_ncbi/Daphnia_pulex_v1_supercontigs.fasta"}, {:minor_version=>0, :type=>"contigs", :species=>"Daphnia_pulex", :mini_version=>0, :size=>"155.07MB", :reference=>"ncbi", :major_version=>1, :path=>"genomes_ncbi/Daphnia_pulex_v1_contigs.fasta"}] |
(7) |
query_fasta = '>DapMhc
MPPKKDMGPDPDPAQYLFVSLEMKRADQTKPYDGKKATWVPCEKDSYQLGEITGTKGDLV
VVKVADGNEKMVKKDQCFPVNPPKFEKVEDMADLTYLNDAAVLHNLRQRYYHKLIYTYSG
LFCVAINPYKRFPIYTQRVIKMYIGKRRNEVPPHIFCISDGAYMDMLTNHENQSMLITGE
SGAGKTENTKKVIQYFAQIAKDTKGSKHTFSSGGNLEDQIVQTNPVLEAFGNAKTTRNDN
SSRFGKFIRIHFGNSGKLAGADIETYLLEKARVISQQALERSYHIFYQIMSGKLPTLKAD
CCLVDDIYQYNFVSQGKITIPSMDDSEEMALTDEAFEILGMGEQRPEIWKITAAVMHFGT
MKFKQRGREEQADPDGTQEGENVAKMMGVDGPQLYMNFLKPRIKVGNEFVTQGRNVNQVV
YSIGAMAKAIFDRLFKWLVKRVNETLETGQKRVTFIGVLDIAGFEIFDYNGFEQLCINFT
NEKLQQFFNHHMFVLEQEEYKKEGIDWVFMDFGMDLQACIELMEKPMGVLSILEEESMFP
KATDQTFAEKLNNNHLGKSASFVKPKPAKAGCKEAHFAIAHYAGTVPYNITGWLEKNKDP
LNDTVVDQFKKGSSKLVQEIFADHPGQSGGKEEAKGGKRTKGSGFQTVSALYREQLNGLM
KTLNATSPHFIRCIIPNETKSPGVIDSHLVMHQLTCNGVLEGIRICRKGFPNRMVYPDFK
HRYMILAPNEMKAEPDERKAAKICLEKIALDPEWYRIGHTKVFFKAGVLGQLEEMRDDKL
AKIITWMQSFIRGYHTRKQYKQLQDQRVALCVVQRNLRSYLQMRTWAWYRLWQKVKPLLN
VTRVEDEIKALEDKAAAAQANFEKEEKLRKELETNLAKLTKEKEDLLNRLQAESGTVADF
HDKQNKLMSQKADLESQLSDTQERLQQEEDARNQLFQNKKKLEQEASGLKKDIEDLELAL
QKTETDKATKDHQIRNLNDEIAHQDELINKLNKEKKHMQEVNQKTAEDLQASEDKVNHLN
KVKAKLEQTLDELEDSLEREKKLRADIEKNKRKTEGDLKLTQEAVADLERNKKELEQTIQ
RKDKEIASLNAKLEDEQSLVGKLQKQIKELQSRIEELEEEVEAERQARAKAEKQRADLAR
ELEELGERLEEAGGATAAQIELNKKREAELSKLRRDLEESNIQHESVLSNLRKKHNDAVS
EMSEQIDQLNKMKAKAEKDRSQFAGENNDLRAAMDHVSSDKAAAEKMTKMLQQQLNEIQS
KLDEANRSLNDFDVQKKKLTIENSDYLRQLEDAESQVSQLQKLKISLTTQLEDSKRMADE
EGRERATLLGKFRNLEHDIDNIREQLDEESEAKADLQRQLSKSNADCQMWRHKYESEGVA
KAEELEDAKRKLQARLGEAEEAIESLNQKNVALEKIKMRLSGELDDMHVEVERATVLANQ
MEKRGKNFDKVVSEWKAKVDDLAAELDASQKECRNYSTELFRLKAGYDESQEHLEAVRRE
NKNLADEIKDLMDQIGEGGRNVHEIDKQRKRLEVEKEELQAALEEAESALEQEENKVLRA
QLELSQVRQEIDRRIQEKEEEFENTRKNHQRAIDSMQASLEAEAKGKAEALRMKKKLESD
INELEIALDHANKANAEAQKSIKRYQQSIKETQSALEEEQRNRDDLREQYGIAERRANAL
QGELEESRTLLEQADRARRQAETELADAHEQLHDLTAQAASSSAAKRKMESELQTLHADL
DDMINETKNSEEKAKKAMVDAARLADELRAEQEHAQAQEKQRKALELQVKELQVRLDESE
NNALKGGKKAIQKLEERVRGLETELDGEQRRHADAQKNLRKSERRIKELTFQSDEDRKNH
ERMQDLVDKLQQKIKTYKRQIEEAEEIAALNLAKFRKAQQELEEADERAELADQAVSKLR
AKGRGGSASRLSPPPQMKPRSKRDFE'
url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'scipio_run' => 'true', 'target_file_path' => 'genomes_jgi/Daphnia_pulex_v1_supercontigs.fasta', 'query' => query_fasta}
#optional_scipio_parameters = {
#  :min_score => 0.3, # best_size
#  :minid => 90, # min_identity
#  :maxmis => 7, # max_mismatches
#  :min_coverage => 60,
#  :reg_size => 2000,
#  :multiple_results => false,
#  :single_target_hits => false,
#  :transtable => 1,
#  :max_assemble_size => 75000,
#  :max_move_exon => 6,
#  :gap_to_close => 6,
#  :min_intron_len => 22,
#  :accepted_intron_penalty => "1.0",
#  :blattile => 7, # tile_size
#  :blatoneoff => false,
#  :blatscore => 15,
#  :blatidentity => 81,
#  :exhaust_align_size => 15000,
#  :exhaust_gap_size => 21
#}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body |
=> | "276393024066" |
(8) |
url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
run_result = ["running", ""]
while(run_result[0] == "running") do
  response = Net::HTTP.get_response(url)
  yaml_string = response.body
  run_result = YAML::load(yaml_string)
  sleep(10)
end
scipio_results = YAML::load(run_result[1]) |
=> |
{"DapMhc"=>[{"number"=>1, "matchings"=>[{"number"=>1, "overlap"=>nil, "mismatchlist"=>[], "prot_start"=>0, "nucl_end"=>201, "prot_end"=>67, "translation"=>"MPPKKDMGPDPDPAQYLFVSLEMKRADQTKPYDGKKATWVPCEKDSYQLGEITGTKGDLVVVKVADG", "dna_start"=>2048124, "type"=>"exon", "contig"=>1, "seqshifts"=>[], "undeterminedlist"=>[], "nucl_start"=>0, "seq"=>"atgcctcccaagaaggatatgggacccgatcccgacccagcccaatacctcttcgtttccctggaaatgaaacgtgc ... QIEEAEEIAALNLAKFRKAQQELEEADERAELADQAVSKLRAKGRGGSASRLSPPPQMKPRSKRDFE", "prot_len"=>1946, "status"=>"auto", "undetermined"=>0, "mismatches"=>0, "matches"=>1946, "dna_end"=>2070830}]} |
(9) | scipio_results["DapMhc"].size |
=> | 1 |
(10) | scipio_results["DapMhc"][0]["target"] |
=> | "scaffold_6" |
(11) | scipio_results["DapMhc"][0]["dna_start"] |
=> | 2048124 |
(12) | scipio_results["DapMhc"][0]["matchings"].size |
=> | 57 |
(13) | scipio_results["DapMhc"][0]["matchings"].select{|m| m["type"] == "exon"}.size |
=> | 29 |
(14) | scipio_results["DapMhc"][0]["matchings"].select{|m| m["type"] == "exon"}.map{|e| [e["dna_start"], e["dna_end"]]} |
=> | [[2048124, 2048325], [2048507, 2048654], [2049778, 2049935], [2050465, 2050493], [2050633, 2050739], [2052035, 2052128], [2052199, 2052263], [2052336, 2052435], [2052631, 2052735], [2055462, 2055597], [2056634, 2056754], [2056821, 2056971], [2057249, 2057420], [2059163, 2059343], [2059422, 2059573], [2059936, 2059989], [2061071, 2061159], [2061759, 2061877], [2062068, 2062186], [2064404, 2064541], [2064625, 2064962], [2066715, 2066844], [2066930, 2067175], [2067247, 2067506], [2067578, 2067832], [2068003, 2068082], [2068560, 2069505], [2069584, 2070718], [2070794, 2070830]] |
(15) |
yaml_string = YAML::dump(scipio_results)
url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'mutu_exon_run' => 'true', 'query' => yaml_string}
#optional_mutu_exon_parameters = {
#  :length_difference => "20",
#  :min_score => "15",
#  :min_exon_length_aa => "15",
#  :search_up_down_stream => true,
#  :all => false,
#  :use_start_codon => "auto",
#  :use_stop_codon => "auto",
#  :max_recursion_depth => "0"
#}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body |
=> | "401632858060" |
(16) |
url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
run_result = ["running", ""]
while(run_result[0] == "running") do
  response = Net::HTTP.get_response(url)
  yaml_string = response.body
  run_result = YAML::load(yaml_string)
  sleep(10)
end
scipio_results_with_mutu_exons = YAML::load(run_result[1]) |
=> |
{"DapMhc"=>[{"unmatched"=>0, "matchings"=>[{"overlap"=>nil, "number"=>1, "prot_start"=>0, "mismatchlist"=>[], "nucl_end"=>201, "first_exon"=>true, "prot_end"=>67, "type"=>"exon", "dna_start"=>2048124, "translation"=>"MPPKKDMGPDPDPAQYLFVSLEM ... TCCGAATCCCACACGAATTTCAACCGTCTCTCTCTCCTCTCTCTCATCAACTTGTTTGATTTTTTGTCGTCGTCGTCGTTGATCAACAACTCGAAACAACAAG", "dna_end"=>2070830, "matches"=>1946, "mismatches"=>0, "undetermined"=>0}]} |
(17) |
yaml_string = YAML::dump(scipio_results)
url = URI.parse("http://www.webscipio.org/api_searches")
post_parameters = {'tandem_genes_run' => 'true', 'query' => yaml_string}
#optional_tandem_genes_parameters = {
#  :length_difference => "10",
#  :min_score => "15",
#  :min_exon_length_aa => "10",
#  :min_tandem_gene_score => "30",
#  :search_for_concatenated_exons => false,
#  :search_for_splitted_exons => false,
#  :use_start_codon => "auto",
#  :use_stop_codon => "auto",
#  :generate_tandem_gene_results => true
#}
response = Net::HTTP.post_form(url, post_parameters)
id = response.body |
=> | "219595899748" |
(18) |
url = URI.parse("http://www.webscipio.org/api_searches/#{id}.yaml")
run_result = ["running", ""]
while(run_result[0] == "running") do
  response = Net::HTTP.get_response(url)
  yaml_string = response.body
  run_result = YAML::load(yaml_string)
  sleep(10)
end
scipio_results_with_tandem_genes = YAML::load(run_result[1]) |
=> |
{"DapMhc"=>[{"unmatched"=>0, "matchings"=>[{"overlap"=>nil, "number"=>1, "prot_start"=>0, "mismatchlist"=>[], "nucl_end"=>201, "first_exon"=>true, "prot_end"=>67, "type"=>"exon", "dna_start"=>2048124, "translation"=>"MPPKKDMGPDPDPAQYLFVSLEM ... TCCGAATCCCACACGAATTTCAACCGTCTCTCTCTCCTCTCTCTCATCAACTTGTTTGATTTTTTGTCGTCGTCGTCGTTGATCAACAACTCGAAACAACAAG", "dna_end"=>2070830, "matches"=>1946, "mismatches"=>0, "undetermined"=>0}]} |
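Steps (9) to (14) inspect the result hash one expression at a time. As a minimal sketch, the same lookups can be collected in one helper. The name exon_summary and the keys of the hash it returns are hypothetical and not part of the WebScipio API. The exon length is taken as dna_end minus dna_start, which is an assumption based on the first exon of the forward-strand hit in the transcript (2048325 - 2048124 = 201, equal to its "nucl_end"); it may not hold for other hits.

```ruby
# Hypothetical helper, not part of WebScipio: summarises the exons of one hit,
# i.e. one element of scipio_results["DapMhc"].
def exon_summary(hit)
  # Keep only matchings of type "exon", as in step (13).
  exons = hit["matchings"].select { |m| m["type"] == "exon" }
  # Map each exon onto its DNA start and end position, as in step (14).
  coords = exons.map { |e| [e["dna_start"], e["dna_end"]] }
  {
    "exon_count"   => exons.size,
    "exon_coords"  => coords,
    # Assumption: length in nucleotides is dna_end - dna_start.
    "exon_lengths" => coords.map { |first, last| last - first }
  }
end
```

With the result of step (8) this could be called as exon_summary(scipio_results["DapMhc"][0]).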
Explanation
(0) | Load the http and yaml libraries. |
(1) | Send a POST to url "http://www.webscipio.org/api_searches" with parameters "search_species" set to "true" and "query" set to "drosophila". This will generate a search for all species with "drosophila" in their names. The response will be an ID to get the result. |
(2) | Send a GET to url "http://www.webscipio.org/api_searches/109342452031.yaml", where "109342452031" is the ID returned in step (1). You can use .xml, .json and .html to get responses in formats other than YAML. You will get an array of all species found. |
(3) | Send a POST to url "http://www.webscipio.org/api_searches" with parameters "search_species" set to "true" and "query" set to "primate". This will generate a search for all primates. The response will be an ID to get the result. |
(4) | Send a GET to url "http://www.webscipio.org/api_searches/791866516211.yaml", where "791866516211" is the ID returned in step (3). You will get an array of all species found. |
(5) | Send a POST to url "http://www.webscipio.org/api_searches" with parameters "search_genomes" set to "true" and "query" set to "Daphnia_pulex". This will generate a search for all genome files of the specified organism. The response will be an ID to get the result. |
(6) | Send a GET to url "http://www.webscipio.org/api_searches/218502844915.yaml", where "218502844915" is the ID returned in step (5). You will get an array of all genome files found. Each genome file is represented by a hash containing the version, type, size and path of the genome file. |
(7) | Send a POST to url "http://www.webscipio.org/api_searches" with parameters "scipio_run" set to "true", "target_file_path" set to the chosen Daphnia genome "genomes_jgi/Daphnia_pulex_v1_supercontigs.fasta" and "query" set to a myosin heavy chain protein sequence. This will start a Scipio run with default parameters. You could change all parameters by setting them in the POST call. The response will be an ID to get the result. |
(8) | Send a GET to url "http://www.webscipio.org/api_searches/276393024066.yaml", where "276393024066" is the ID returned in step (7). Because the Scipio run can take some time, you have to repeat the GET until the run is finished. As a result you get an array with two elements. The first element is the status ("running", "finished", "nothing_found" or "error") and the second element is the Scipio result in YAML format. |
(9) | The Scipio result is a hash with a key/value pair for each protein queried. To get the number of contigs on which the gene is distributed, ask for the size of the array (in this case only one contig). |
(10) | Choose the first contig, which contains a hash with key/value pairs describing the gene. Use the "target" key to get the name of the contig. |
(11) | Choose the first contig and ask for the starting position of the gene in the DNA sequence. |
(12) | To get the number of introns, exons and gaps, ask for the size of "matchings". |
(13) | To get the number of exons select the matchings of type "exon". |
(14) | To get the start and end positions of the exons of the gene, select the exons and map them onto their start and end positions in the DNA sequence. |
(15) | Send a POST to url "http://www.webscipio.org/api_searches" with parameters "mutu_exon_run" set to "true" and "query" set to the Scipio result in YAML format. This will start a search for mutually exclusive exons with default parameters. You can change all parameters by setting them in the POST call. The response will be an ID to get the result. |
(16) | Send a GET to url "http://www.webscipio.org/api_searches/401632858060.yaml", where "401632858060" is the ID returned in step (15). Because the search can take some time, you have to repeat the GET until the run is finished. As a result you get an array with two elements. The first element is the status ("running", "finished", "nothing_found" or "error") and the second element is the Scipio result including the mutually exclusive exons found. |
(17) | Send a POST to url "http://www.webscipio.org/api_searches" with parameters "tandem_genes_run" set to "true" and "query" set to the Scipio result in YAML format. This will start a search for tandem genes with default parameters. You can change all parameters by setting them in the POST call. The response will be an ID to get the result. |
(18) | Send a GET to url "http://www.webscipio.org/api_searches/219595899748.yaml", where "219595899748" is the ID returned in step (17). Because the search can take some time, you have to repeat the GET until the run is finished. As a result you get an array with two elements. The first element is the status ("running", "finished", "nothing_found" or "error") and the second element is the Scipio result including the tandem genes found. |
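The polling loop in steps (8), (16) and (18) is the same each time, so it can be factored out. The following is only a sketch under stated assumptions: wait_for_result, its keyword arguments and the timeout are hypothetical and not part of the WebScipio API. Returning nil for "nothing_found" and raising on "error" are design choices made here, since the text above does not say what the second array element contains in those cases. The fetch argument exists only so that the loop can be tried without a network connection.

```ruby
require 'net/http'
require 'yaml'

# Hypothetical helper, not part of WebScipio: repeats the GET for a run ID
# until the status is no longer "running" or a timeout is reached.
def wait_for_result(id, interval: 10, timeout: 3600,
                    base: "http://www.webscipio.org/api_searches", fetch: nil)
  url = URI.parse("#{base}/#{id}.yaml")
  # Default fetcher performs the same GET as the transcript.
  fetch ||= lambda { |u| Net::HTTP.get_response(u).body }
  deadline = Time.now + timeout
  loop do
    # The server answers with a two-element array: [status, result].
    status, payload = YAML::load(fetch.call(url))
    case status
    when "finished"      then return YAML::load(payload)
    when "nothing_found" then return nil # assumption: no usable result
    when "error"         then raise "run #{id} reported an error"
    end
    raise "timed out waiting for run #{id}" if Time.now > deadline
    sleep(interval)
  end
end
```

After the POST in step (7), (15) or (17), the result could then be obtained with a single call such as scipio_results = wait_for_result(id).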