The Pan-genome interface provides an in-depth analysis of Solanaceae genomes. The “Genome data” section of this interface displays comprehensive pan-genome analysis results for the various species. Users can enter data in the submission box to run specific queries.
Analysis methods:
Part | Methods |
---|---|
Gene cluster | Protein sequences from 81 species were used as input to OrthoFinder to construct gene families. Core clusters are those shared by all 81 genomes. Clusters present in 79 or 80 species were classified as soft-core clusters, clusters found in 2 to 78 species as dispensable clusters, and clusters present in only one species as specific clusters. To simulate the number of protein-coding genes in the pan-genome and core genome, we ran PanGP (v1.0.1) on the OrthoFinder results using the completely random algorithm, with a sample size of 2000 and 80 sample replicates. |
Structural variation | Pairwise genome alignments between potato and tomato were generated with the nucmer program in MUMmer (v.4.0.0beta2)[1]. SV detection was then performed with SVMU (v.0.4-alpha)[2] to call CNVs, insertions, and deletions. For SyRI-based SV calling, minimap2 (v.2.21-r1071)[3] was used to generate pairwise genome alignments, which were passed to SyRI (v.1.2)[4] to identify and retain SVs, including insertions, deletions, SNPs, inversions, and translocations. We compared chromosome-level assemblies of different Solanum_lycopersicum and Solanum_tuberosum varieties to detect structural variations between them, and the SyRI results are displayed dynamically on the website. |
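
The cluster categories in the "Gene cluster" row can be illustrated with a minimal Python sketch. It is not the pipeline used to build the database: the function name `classify_cluster`, the cluster IDs, and the species counts below are hypothetical, and the only values taken from the text are the thresholds (81 species in total; 79 or 80 for soft-core; 2 to 78 for dispensable; 1 for specific).

```python
from collections import Counter


def classify_cluster(n_species_present: int, n_total: int = 81) -> str:
    """Return the pan-genome category of a cluster present in n_species_present species.

    Thresholds follow the definitions in the table; n_total is assumed to be
    large enough (as with 81 species) that the categories do not overlap.
    """
    if not 1 <= n_species_present <= n_total:
        raise ValueError("species count must be between 1 and n_total")
    if n_species_present == n_total:
        return "core"          # shared by all genomes
    if n_species_present == 1:
        return "specific"      # present in only one species
    if n_species_present >= n_total - 2:
        return "soft-core"     # present in 79 or 80 of 81 species
    return "dispensable"       # present in 2 to 78 of 81 species


# Hypothetical example: cluster ID -> number of species in which it occurs.
species_counts = {
    "cluster_A": 81,
    "cluster_B": 80,
    "cluster_C": 79,
    "cluster_D": 40,
    "cluster_E": 1,
}

categories = {cid: classify_cluster(n) for cid, n in species_counts.items()}
summary = Counter(categories.values())

for cid, category in categories.items():
    print(cid, category)
```

In practice the per-cluster species counts would be derived from the OrthoFinder output, for example by counting the species with at least one gene in each cluster.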