What is a HAP file?

The HAP format
HAP is a text file format. A HAP file contains several meta-information lines, each prefixed with ##, then one line containing only the # sign, one header line, and the data lines. Each data line describes one SNP: the chromosome ID, the position of the SNP on the chromosome, the rs ID, and the haplotypes of each individual, written as two alleles separated by a colon (for example, G:T).
An example of the HAP format:
##fileformat=HAPv1.0
CHROM POS ID NA00001 NA00002
1 126113 . A:A A:A
1 535131 . G:T G:T
1 567239 . C:C C:C
1 570254 . G:A G:A
1 592368 . G:A G:A
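
To make the layout described above concrete, here is a minimal Python sketch of a reader for this format. It is an illustration only, not the parser the service uses. The function name `parse_hap` and the record layout are hypothetical, and the sketch assumes that columns are separated by whitespace and that each haplotype pair is written as two alleles joined by a colon, as in the example.

```python
from typing import Dict, List, Tuple


def parse_hap(text: str) -> Tuple[List[str], List[str], List[dict]]:
    """Parse HAP text into (meta-information lines, sample IDs, SNP records)."""
    meta: List[str] = []
    samples: List[str] = []
    records: List[dict] = []
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "#":
            continue  # skip blank lines and the line holding only "#"
        if line.startswith("##"):
            meta.append(line[2:])  # meta-information line, "##" removed
            continue
        fields = line.split()  # assumption: whitespace-separated columns
        if header is None:
            header = fields  # first non-meta line is the header
            samples = fields[3:]  # columns after CHROM, POS, ID
            continue
        chrom, pos, snp_id = fields[:3]
        haplotypes: Dict[str, Tuple[str, ...]] = {
            sample: tuple(pair.split(":"))
            for sample, pair in zip(samples, fields[3:])
        }
        records.append(
            {"chrom": chrom, "pos": int(pos), "id": snp_id, "haplotypes": haplotypes}
        )
    return meta, samples, records


example = """##fileformat=HAPv1.0
CHROM POS ID NA00001 NA00002
1 126113 . A:A A:A
1 535131 . G:T G:T
"""
meta, samples, records = parse_hap(example)
print(samples)                               # ['NA00001', 'NA00002']
print(records[1]["haplotypes"]["NA00001"])   # ('G', 'T')
```

The sketch skips a line that contains only the # sign, so it accepts input with or without that line.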

How to create a bgzip-compressed file?

To create a *.vcf.gz file, use VCFtools and tabix (which includes bgzip):
vcf-sort mystudy_chr1.vcf | bgzip -c > mystudy_chr1.vcf.gz
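
A file compressed with plain gzip also ends in .gz, so it can be useful to confirm that the output really is bgzip (BGZF) data before uploading. The Python sketch below inspects the first block header. The function names `looks_like_bgzf` and `is_bgzip_file` are hypothetical, and the check relies on the BGZF layout: a gzip header with the extra-field flag set and an extra subfield tagged "BC" that carries a 2-byte payload. It reads only the first 64 bytes, so treat it as a quick sanity check rather than a full validation.

```python
import struct


def looks_like_bgzf(header: bytes) -> bool:
    """Return True if the bytes start with a BGZF (bgzip) block header."""
    if len(header) < 18:
        return False
    # gzip magic bytes (1f 8b) followed by the deflate method (08)
    if header[:3] != b"\x1f\x8b\x08":
        return False
    # FLG byte must have the FEXTRA bit (0x04) set
    if not header[3] & 0x04:
        return False
    # XLEN: little-endian length of the extra field, at offset 10
    xlen = struct.unpack("<H", header[10:12])[0]
    extra = header[12:12 + xlen]
    # Walk the extra subfields looking for 'B','C' with a 2-byte payload
    i = 0
    while i + 4 <= len(extra):
        si1, si2 = extra[i], extra[i + 1]
        slen = struct.unpack("<H", extra[i + 2:i + 4])[0]
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        i += 4 + slen
    return False


def is_bgzip_file(path: str) -> bool:
    """Check the first bytes of the file at `path` for a BGZF header."""
    with open(path, "rb") as fh:
        return looks_like_bgzf(fh.read(64))
```

For example, `is_bgzip_file("mystudy_chr1.vcf.gz")` should return True for the file produced by the command above and False for a file written by ordinary gzip.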

What are the common issues of the input files?

1) File size. Each file must be no larger than 1 GB.
2) File format. We check the following:

  • VCF check: validity, plus statistics such as the number of samples, chromosomes, SNPs, and chunks, phased or unphased status, and reference build.
  • Quality control statistics: duplicate sites, SNPs removed, non-SNP sites, monomorphic sites, MAF check.
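
Some of these checks can be approximated locally before uploading. The Python sketch below is not the server's implementation. The names `file_size_ok` and `qc_summary` are hypothetical, the size limit is read as 1 GiB, the MAF threshold of 0.01 is a placeholder because the actual cutoff is not stated here, and MAF is taken as the frequency of the second most common allele at a site.

```python
import os
from collections import Counter

MAX_BYTES = 1024 ** 3          # assumption: the 1 G limit read as 1 GiB
VALID_ALLELES = {"A", "C", "G", "T"}


def file_size_ok(path, limit=MAX_BYTES):
    """Return True if the file at `path` is within the size limit."""
    return os.path.getsize(path) <= limit


def qc_summary(sites, maf_threshold=0.01):
    """Count QC categories for sites given as (chrom, pos, genotypes).

    `genotypes` is a list of colon-separated allele pairs such as "G:T".
    `maf_threshold` is a hypothetical cutoff chosen for illustration.
    """
    seen = set()
    summary = {"total": 0, "duplicate": 0, "non_snp": 0,
               "monomorphic": 0, "low_maf": 0}
    for chrom, pos, genotypes in sites:
        summary["total"] += 1
        if (chrom, pos) in seen:
            summary["duplicate"] += 1      # same chromosome and position seen before
            continue
        seen.add((chrom, pos))
        alleles = [a for g in genotypes for a in g.split(":")]
        if any(a not in VALID_ALLELES for a in alleles):
            summary["non_snp"] += 1        # indel or other non-single-base allele
            continue
        counts = Counter(alleles)
        if len(counts) == 1:
            summary["monomorphic"] += 1    # only one allele observed
            continue
        minor_count = counts.most_common()[1][1]
        if minor_count / len(alleles) < maf_threshold:
            summary["low_maf"] += 1
    return summary


sites = [
    ("1", 126113, ["A:A", "A:A"]),   # monomorphic
    ("1", 535131, ["G:T", "G:T"]),   # passes
    ("1", 535131, ["G:T", "G:T"]),   # duplicate position
    ("1", 600000, ["AT:A", "A:A"]),  # non-SNP allele (made-up site)
]
print(qc_summary(sites))
# {'total': 4, 'duplicate': 1, 'non_snp': 1, 'monomorphic': 1, 'low_maf': 0}
```

Each site is counted in at most one category, in the order duplicate, non-SNP, monomorphic, low MAF. The server may categorize sites differently.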

How is the data secured?

The data are transferred to our server, where a wide array of security measures is enforced:

  • The complete interaction with the server is secured with HTTPS.
  • All results are encrypted with a one-time password, so only you can access them.

How to cite?

Please cite: Ma Y, Zhao J, Wong JS, Ma L, Li W, Fu G, Xu W, Zhang K, Kittles RA, Li Y, Song Q. Accurate inference of local phased ancestry of modern admixed populations. Scientific Reports. 2014 Jul 23;4:5800. PMID: 25052506