Software Name: AncestrySever
Version: 2.0
Date: 2/22/2024

--------------------
Copyright
--------------------
All Rights Reserved. No part of this software or any of its contents may be reproduced, copied, modified, or adapted without the prior written consent of the author, unless otherwise indicated for stand-alone materials.

Basic Information
--------------------
AncestryHub is a web version of the software aMAP and AncestryView. It is designed to help users without programming experience or bioinformatics skills perform local-ancestry analysis. AncestryHub also provides the following unique features:
1. SNP matching;
2. Reference-ready;
3. A customizable option for users to choose either a whole-genome analysis or a target-region analysis.

Inputs
--------------------
AncestryHub requires the following two documents as input:
1. A data file in VCF format that meets the following requirements:
   a. It contains haplotype or genotype data;
   b. Variants must be sorted by genomic position along each chromosome;
   c. GRCh38 (hg38) coordinates are required;
   d. The file must be compressed with gzip (*.gz), and the compressed file should be smaller than 1 GB;
   e. Each file may contain at most 500 samples.
2. A Parameters Form - the information on a user's customized choices.
   a. Name of Input File - the name of the data file;
   b. Human Assembly - must be GRCh38 (hg38);
   c. Input File Format - vcf.gz;
   d. Email Address - the user's email address for receiving responses from the AncestryHub server;
   e. Target Region (optional) - specified only for a target-region analysis (SCA mode). The default is the whole-genome analysis (WGS mode);
   f. Recruitment Site (optional) - the cities/countries where the samples were recruited.

The VCF Format Specification
--------------------
https://samtools.github.io/hts-specs/VCFv4.2.pdf

Outputs
--------------------
1. A picture of the local ancestry of an individual.
   WGS mode outputs an ancestry picture of the entire genome of one individual, e.g. 1-NA20845-All.png
   SCA mode outputs an ancestry picture of one chromosome for a number of individuals, e.g. 1-8.png
2. A txt file (*_phased_aMAP.txt), which provides the ancestry information for each homologous chromosome (A and B represent the two homologous chromosomes) at each chromosome locus. It contains the chromosome ID, the homologous chromosome ID, the chromosome positions (the start and end positions of each region), and the ancestry call for each region along each homologous chromosome. For example, 1-NA20845_phased_aMAP.txt
3. A csv file (*_sub-population_detail.csv), which provides the ancestry percentages for each homologous chromosome at the sub-population level.
4. A summary file in CSV format (can be opened in Excel), which reports the proportions of ancestral origins for all populations together. All samples are included in this file, e.g. summary.csv
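
Several of the data-file requirements listed under Inputs (gzip compression, size limit, sample limit, sorted positions) can be checked locally before uploading. The following Python sketch is not part of AncestryHub; the function name check_vcf and its messages are hypothetical, and reading "1 GB" as 2**30 bytes is an assumption. It does not verify that the coordinates are GRCh38 (hg38), since that cannot be determined reliably from the file alone.

```python
import gzip
import os

MAX_BYTES = 1024 ** 3   # assumption: the 1 GB limit is read as 2**30 bytes
MAX_SAMPLES = 500       # sample limit stated under Inputs


def check_vcf(path, max_bytes=MAX_BYTES, max_samples=MAX_SAMPLES):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not path.endswith(".gz"):
        problems.append("file name does not end with .gz")
    if os.path.getsize(path) >= max_bytes:
        problems.append("compressed file is not smaller than the size limit")

    # gzip files start with the two magic bytes 0x1f 0x8b.
    with open(path, "rb") as raw:
        if raw.read(2) != b"\x1f\x8b":
            problems.append("file is not gzip-compressed")
            return problems

    last_pos = {}       # chromosome -> last position seen
    unsorted = set()    # chromosomes with out-of-order positions
    saw_header = False
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##") or not line.strip():
                continue  # meta-information or blank line
            fields = line.rstrip("\n").split("\t")
            if line.startswith("#CHROM"):
                saw_header = True
                # Columns after the 9 fixed VCF columns are sample names.
                n_samples = max(len(fields) - 9, 0)
                if n_samples > max_samples:
                    problems.append(
                        f"{n_samples} samples exceed the limit of {max_samples}"
                    )
                continue
            chrom, pos = fields[0], int(fields[1])
            if pos < last_pos.get(chrom, 0):
                unsorted.add(chrom)
            last_pos[chrom] = pos

    if not saw_header:
        problems.append("no #CHROM header line found")
    for chrom in sorted(unsorted):
        problems.append(f"positions on {chrom} are not sorted")
    return problems
```

Calling check_vcf on the path of a local file (for example a hypothetical mydata.vcf.gz) returns the list of problems, so an empty list suggests the file passes these particular checks.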