|
Genotype files must be tabular, with the samples as columns and the SNPs as rows. They may also be zipped or gzipped.
Because genotype files come in a seemingly unlimited variety of formats, we specify here those that can be imported into HomozygosityMapper without any further manipulation.
In every file, lines starting with the number sign (#) are ignored. In each line, the SNP ID (Affymetrix ID or, for Illumina files, dbSNP ID) must be directly followed by the genotypes. The genotypes must be written in one of the following ways:
| SNP ID | Sample01 | Sample02 | Sample03 | Sample04 | Sample08 | Sample09 |
| SNP_A-1513509 | BB | BB | AB | BB | AB | BB |
| SNP_A-1518411 | BB | BB | BB | BB | BB | BB |
| SNP_A-1511066 | AB | NoCall | AA | AA | AA | AA |
| SNP_A-1517367 | AA | AB | AB | AA | AA | AB |
Instead of AA/AB/BB/NoCall, the 'number format' (0, 1, 2, -1) can also be used.
The following columns will be ignored and do not have to be removed from the file:
- dbsnp rs id
- tsc id
- chromosome
- physical position
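
The rules above can be sketched as a small pre-upload check. This is only an illustration, not HomozygosityMapper's own import code: it assumes tab-delimited text with a header line naming the columns, and `open_text`, `read_genotypes`, `IGNORED_COLUMNS` and `VALID_CALLS` are hypothetical names introduced here.

```python
import csv
import gzip

# Columns the text says are ignored on import (matched case-insensitively here).
IGNORED_COLUMNS = {"dbsnp rs id", "tsc id", "chromosome", "physical position"}

# Genotype calls accepted in this sketch: character format or number format.
VALID_CALLS = {"AA", "AB", "BB", "NoCall", "0", "1", "2", "-1"}


def open_text(path):
    """Open a plain or gzipped file as text (gzip assumed from a .gz suffix)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def read_genotypes(lines, delimiter="\t"):
    """Return (sample_names, {snp_id: [call, ...]}) from an iterable of lines."""
    # Lines starting with '#' are skipped, as described above.
    data_lines = (ln for ln in lines if not ln.startswith("#"))
    rows = csv.reader(data_lines, delimiter=delimiter)
    header = next(rows)
    # Column 0 holds the SNP ID; drop the columns that would be ignored anyway.
    keep = [
        i for i, name in enumerate(header)
        if i > 0 and name.strip().lower() not in IGNORED_COLUMNS
    ]
    samples = [header[i].strip() for i in keep]
    genotypes = {}
    for row in rows:
        if not row:
            continue  # tolerate blank lines
        if len(row) != len(header):
            raise ValueError(f"wrong number of columns for {row[0]!r}")
        calls = [row[i].strip() for i in keep]
        bad = [c for c in calls if c not in VALID_CALLS]
        if bad:
            raise ValueError(f"unexpected genotype(s) {bad} for {row[0]!r}")
        genotypes[row[0].strip()] = calls
    return samples, genotypes
```

A file on disk could then be checked with `with open_text("genotypes.txt.gz") as fh: samples, calls = read_genotypes(fh)`, where the file name is a placeholder.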
Illumina
| DBSNP | Sample01 | Sample02 | Sample03 | Sample05 | Sample06 |
| rs10000010 | 3 | 0 | 3 | 2 | 1 |
| rs10000023 | 3 | 3 | 2 | 1 | 2 |
| rs10000030 | 3 | 3 | 0 | 2 | 3 |
| rs1000007 | 0 | 3 | 1 | 0 | 0 |
| rs10000092 | 3 | 0 | 1 | 3 | 0 |
| rs10000121 | 1 | 1 | 1 | 2 | 2 |
Instead of 1/2/3/0, the character format (AA, AB, BB, --) can also be used. |
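
As an illustration of how the two Illumina encodings relate, the sketch below rewrites a row of calls in the character format. The correspondence 1 = AA, 2 = AB, 3 = BB, 0 = -- is an assumption read off the order in which the two formats are listed above, since it is not stated explicitly. `ILLUMINA_NUM_TO_CHAR` and `to_character_format` are hypothetical names.

```python
# Assumed positional correspondence between the two Illumina encodings
# (1/2/3/0 listed alongside AA/AB/BB/--); not confirmed by the text itself.
ILLUMINA_NUM_TO_CHAR = {"1": "AA", "2": "AB", "3": "BB", "0": "--"}


def to_character_format(calls):
    """Convert Illumina calls to character format, leaving character calls as-is."""
    converted = []
    for call in calls:
        call = call.strip()
        if call in ILLUMINA_NUM_TO_CHAR:
            converted.append(ILLUMINA_NUM_TO_CHAR[call])
        elif call in ILLUMINA_NUM_TO_CHAR.values():
            converted.append(call)  # already in character format
        else:
            raise ValueError(f"unexpected Illumina genotype: {call!r}")
    return converted


# Row rs10000010 from the table above.
print(to_character_format(["3", "0", "3", "2", "1"]))
# ['BB', '--', 'BB', 'AB', 'AA']
```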