#author("2024-07-02T03:22:09+09:00","default:mbgdadm","mbgdadm")
#author("2024-07-02T03:23:44+09:00","default:mbgdadm","mbgdadm")
[[MyMBGD]]

#contents
//-------------------
// How to register a genome
//-------------------
* How to add your own genome data to MBGD [#a4402bc9]
+ Click "Enter Your Data" on the MyMBGD top page.
+ Click "Create New Genome".
#ref(create_new_genome.png,left,wrap,create new genome,75%)
+ Enter the following genome information:
-- Species name: full species name (e.g. Carsonella ruddii)
-- Abbreviation Name: abbreviated species name (e.g. C. ruddii). Press the "Set" button to derive the abbreviation from the species name automatically.
-- Species code: short species code of 3-6 characters (e.g. u_cru).~
Tip: A species code already used for an existing genome in MBGD is not allowed. We recommend including digits or underscores in the species code to avoid duplicate names.
-- Strain: strain name to distinguish different strains.
-- Taxonomy ID: NCBI Taxonomy ID. To find the Taxonomy ID of the specified organism, press the "Search the NCBI Taxonomy DB" button.
#ref(search_taxid.gif,left,wrap,search_taxid,100%)
Now you can find the Taxonomy ID of the strain Candidatus Carsonella PV and enter it:~
Taxonomy ID: 387662 ~
Press "Save", then "Ok", to save the genome record. ~
Tip: If you cannot find an appropriate taxonomy record, you can specify any taxonomy node near your organism. If you specify no taxonomy ID or a non-existent ID, your data will be treated as "Unclassified".
+ Next, enter the chromosome information. First, specify the type of your input file. For complete genome data, you can choose either a GenBank format file (GenBank) or a tab-delimited gene file plus a FASTA-formatted protein sequence file (GeneTab+ProteinSeq). A GenBank file containing multiple sequence entries can be used to enter multiple sequences simultaneously. In the following, we choose the GenBank format and press the "Create New Chromosome" button.
#ref(create_new_chromosome.png,left,wrap,create new genome,75%)
Then, enter the chromosome information. In this case, the chromosome name is automatically assigned as "chromosome 1", but you can change it if you want. If your data is in a GenBank format file that contains all required information, you can just enter the file name into the "Data in GenBank Format" field. Otherwise, see Help. After you have entered the information, press the "Save" button below, then press "Ok".~
~
Check that the Status fields in the "User Genome List" and "User Chromosome List" tables in the right side panel have changed from "Incomplete" to "Ok". You can also check whether the gene information in your input data has been read correctly by pressing the "View Saved Data" button. If everything is OK, press the "Save" button to save the data. If your genome has multiple chromosomes, repeat this process until all data have been uploaded. ~
~
* Data Type: Incomplete genome + GeneTab,ProteinSeq [#ff2b0894]
** GeneTab format [#c55b30e1]
- Header ~
-- The header line begins with #.
-- Header items must be separated by a comma (,) or a single-byte space.
-- Acceptable items (&color(red){*}; = required field)
LEFT:|Header name|Required|Description|Example|h
|id/locus_tag|&color(red){*};|Gene ID|1_1|
|name||Gene name|geneA|
|chrid/chr_acc|&color(red){*};|Chromosome ID|chr1|
|from|&color(red){*};|Gene start location||
|to|&color(red){*};|Gene end location||
|dir||Gene direction|1/-1/F/R/DIR/INV|
|type||Gene type|CDS|
|descr||Gene description||
- Data~
-- The data delimiter is a tab.
-- The required fields are as follows:
--- id/locus_tag &color(red){*};
--- chrid &color(red){*};
--- from &color(red){*};
--- to &color(red){*};
-- If there is no header, the following column order is assumed:~
--- For 7 columns~
id, name, chrid, from, to, dir, desc
-- If chrid is present, the entry is treated as having chromosome information.
-- Each id must be unique.
-- If only an amino acid sequence is available, specify from=-1, to=-1.
-- Each id must match the ID of the corresponding entry in the gene FASTA file.
-- Each chrid must match the ID of the corresponding entry in the chromosome FASTA file.
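
The rules above can be checked locally before uploading. The following is a minimal Python sketch, not part of MBGD: the function names (parse_genetab, fasta_ids, unmatched_ids) are hypothetical, and it assumes that the ID of a FASTA entry is the first whitespace-delimited token after ">". It reads an optional # header (items separated by commas or spaces), falls back to the 7-column default order, and reports missing required fields, duplicate ids, and non-integer from/to values.

```python
import re

# Default column order used when the file has no header line (7 columns).
DEFAULT_COLUMNS = ["id", "name", "chrid", "from", "to", "dir", "desc"]
# Alternative header names listed in the table of acceptable items.
ALIASES = {"locus_tag": "id", "chr_acc": "chrid"}
REQUIRED = ("id", "chrid", "from", "to")


def parse_genetab(text):
    """Return (rows, errors); rows are dicts keyed by column name."""
    columns = DEFAULT_COLUMNS
    rows, errors, seen = [], [], set()
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        if line.startswith("#"):
            # Header items are separated by commas or single-byte spaces.
            names = [n for n in re.split(r"[, ]+", line[1:].strip()) if n]
            columns = [ALIASES.get(n, n) for n in names]
            continue
        # Data fields are tab-delimited.
        row = dict(zip(columns, line.split("\t")))
        missing = [k for k in REQUIRED if not row.get(k, "").strip()]
        if missing:
            errors.append(f"line {lineno}: missing {', '.join(missing)}")
            continue
        if row["id"] in seen:
            errors.append(f"line {lineno}: duplicate id {row['id']}")
            continue
        try:
            # from=-1, to=-1 is allowed for protein-only entries.
            row["from"], row["to"] = int(row["from"]), int(row["to"])
        except ValueError:
            errors.append(f"line {lineno}: from/to must be integers")
            continue
        seen.add(row["id"])
        rows.append(row)
    return rows, errors


def fasta_ids(text):
    """Collect the first token of each FASTA header line (assumed to be the ID)."""
    return {
        line[1:].split()[0]
        for line in text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }


def unmatched_ids(rows, protein_fasta):
    """Gene ids in the table that have no entry in the protein FASTA text."""
    return sorted({r["id"] for r in rows} - fasta_ids(protein_fasta))
```

For example, calling parse_genetab on the contents of a GeneTab file returns the accepted rows together with a list of error messages, and unmatched_ids(rows, fasta_text) lists gene ids that have no matching entry in the protein sequence file. The same comparison can be made between chrid values and a chromosome FASTA file using fasta_ids.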