Single-Nucleotide Polymorphisms (SNPs)

Nitrogenous Bases (ATCG)
DNA chromosomes look like long, twisted ladders. The longest chromosome (1) has over 249 million rungs and the smallest (21) has over 48 million. In total there are over 3 billion of these rungs in human DNA. Each rung in the ladder contains a pair of nitrogenous bases drawn from four possibilities: Adenine (A), Thymine (T), Cytosine (C), and Guanine (G). A is always paired with T, and C is always paired with G. Although two bases always sit together at each spot, they carry the same information, because knowing the base on one side tells you the base on the other. By convention, one side of the ladder is called the + (forward) strand and the other side is called the - (reverse) strand. In an A T pair, sometimes the A is on the + strand and the T is on the - strand, and other times it is the reverse; the same is true for C G pairs. For simplicity, DNA companies therefore just record the value of a person's + strand at each spot they test.
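The pairing rule above can be sketched in a few lines of Python. This is only an illustration: the names `COMPLEMENT` and `opposite_strand` are made up for this example and do not come from any testing company's software.

```python
# Pairing rule from the text: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def opposite_strand(bases: str) -> str:
    """Return the base sitting across from each base, position by position."""
    return "".join(COMPLEMENT[base] for base in bases.upper())


# Knowing one side of the ladder is enough to reconstruct the other side.
print(opposite_strand("ATCG"))  # TAGC
```

Applying the function twice returns the original letters, which is why recording only one strand loses no information.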

Single-Nucleotide Polymorphisms (SNPs)
All human beings are about 99.9% identical in their genetic makeup, meaning that out of the roughly 3 billion base pairs we all have, about 99.9% are the same in every human. The places where a difference can occur are called SNPs, which stands for single-nucleotide polymorphisms. SNPs are the main force behind DNA testing and are what give it its genealogical value. When two individuals have enough matching SNPs in a row, this becomes a matching segment. The more matching SNPs there are, the bigger the segment is. If a segment is big enough (bigger than 15 centimorgans, or cM), then the segment is almost certainly identical by descent (IBD), which means the two individuals share that segment because they both descend from a common ancestor who passed on that segment of DNA to both of them. The more matching segments there are and the bigger they are, the more closely two test takers are probably related. By testing a sample of a person's SNPs and then comparing them to everyone else in the database, it is possible to identify a person's genetic relatives. Most major companies test 500,000 to 600,000 SNPs.
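As a rough sketch of the "enough matching SNPs in a row" idea, the Python below scans a list of per-SNP match flags and reports the runs that are long enough. The function name `matching_runs` and the `min_snps` threshold are assumptions made for illustration; real matching tools measure segments in cM and apply their own thresholds.

```python
def matching_runs(matches, min_snps):
    """Return (start, end) index pairs for runs of consecutive matches.

    `matches` holds one True/False flag per tested SNP, in chromosome order.
    `min_snps` is a hypothetical cutoff expressed as a SNP count.
    """
    runs = []
    start = None
    for i, is_match in enumerate(matches):
        if is_match and start is None:
            start = i  # a new run begins here
        elif not is_match and start is not None:
            if i - start >= min_snps:
                runs.append((start, i - 1))
            start = None  # the run was broken by a mismatch
    # Close a run that extends to the last SNP.
    if start is not None and len(matches) - start >= min_snps:
        runs.append((start, len(matches) - 1))
    return runs


flags = [True, True, False, True, True, True, True, False, True]
print(matching_runs(flags, min_snps=3))  # [(3, 6)]
```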

In theory each SNP can be any one of the four nitrogenous bases (A, C, G, or T), but in practice only two are found at each specific spot the vast majority of the time. There is usually a major allele and a minor allele that is present in at least 5% of test takers. In autosomal DNA, each person has two nitrogenous bases at each spot, one inherited from their mother and one inherited from their father. This means that at each SNP tested a person can have one of three combinations: two copies of the major allele (called a homozygous SNP), one copy of the major allele and one copy of the minor allele (called a heterozygous SNP), or two copies of the minor allele (also called a homozygous SNP). When two people have at least one matching allele at a spot, it is a half match; if both alleles match, it is a full match; and if neither allele matches, it is no match. Since there are only three possible combinations at each spot, many people will be either a full or half match at any given SNP by coincidence even though they are not related. In fact, somebody who is heterozygous will match everybody on earth at that spot. This is why it is important that hundreds to thousands of SNPs in a row match before concluding that a matching segment is identical by descent and not just a coincidence.
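The full, half, and no match rules can be written out as a short Python sketch. The function name `classify_match` and the two-letter genotype strings are assumptions chosen for this example.

```python
def classify_match(genotype_a: str, genotype_b: str) -> str:
    """Compare two unordered two-letter genotypes such as "AG" and "GG"."""
    a, b = sorted(genotype_a), sorted(genotype_b)
    if a == b:
        return "full"  # both alleles match
    if set(a) & set(b):
        return "half"  # at least one allele in common
    return "none"  # opposite homozygotes, for example AA and GG


# A heterozygous person matches every possible genotype at a two-allele SNP.
for other in ("AA", "AG", "GG"):
    print("AG vs", other, "->", classify_match("AG", other))
# AG vs AA -> half
# AG vs AG -> full
# AG vs GG -> half
```

Only two opposite homozygotes (such as AA and GG) produce no match, which is why a single matching SNP says almost nothing about relatedness.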

If you download your DNA raw data file and open it, you can see exactly which bases you have at each of the tested SNPs. The order in which the two bases are listed is arbitrary, and it is impossible to know from the file alone which base came from which parent. If both bases are the same (A A for example), then one A came from mom and one from dad. If your DNA is heterozygous at a certain SNP (C G for example), the only way to know which parent gave you the C and which gave you the G is by comparing against other relatives. Sorting out the paternal and maternal SNPs is called phasing. In this situation, if you compared your DNA against your mom's and, at the spot where you have C G, your mom has G G, then you must have inherited the G from your mom and the C from your dad. If you are C G and your mom is also C G, then it is still unclear which base came from which parent, and comparing against your dad or another relative would be necessary to figure it out.

Programs such as GedMatch.com offer the ability to phase your DNA by comparing it against one or both parents. Using phased kits reduces the number of false segments identified between you and a match and is a valuable tool for people interested in small DNA segments. However, in cases where you and the parent being compared against are both heterozygous (like C G), the value becomes a no call and is discarded from the comparison. For this reason, comparing your DNA against both parents gives better results than comparing against just one. Perhaps in the spot where you and your mom are both C G, your father is C C; now it can be concluded that you inherited the C from your father and the G from your mother.
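The phasing logic described above can be sketched for a single SNP as follows. This is a minimal illustration under the stated rules, not how GedMatch.com or any other tool actually implements phasing; the function names `phase_with_parent` and `phase_child` are hypothetical.

```python
def phase_with_parent(child: str, parent: str):
    """Return (from_this_parent, from_other_parent), or None if undetermined."""
    c1, c2 = child
    if c1 == c2:
        return (c1, c2)  # homozygous child: one copy came from each parent
    # Which of the child's two alleles could this parent have passed on?
    candidates = [allele for allele in (c1, c2) if allele in parent]
    if len(candidates) != 1:
        return None  # both (or neither) are possible: a no call
    from_parent = candidates[0]
    from_other = c1 if from_parent == c2 else c2
    return (from_parent, from_other)


def phase_child(child: str, mother: str = None, father: str = None):
    """Return (maternal, paternal) alleles at one SNP, or None for a no call."""
    if mother is not None:
        result = phase_with_parent(child, mother)
        if result is not None:
            return result
    if father is not None:
        result = phase_with_parent(child, father)
        if result is not None:
            paternal, maternal = result
            return (maternal, paternal)
    return None


print(phase_child("CG", mother="GG"))               # ('G', 'C')
print(phase_child("CG", mother="CG"))               # None
print(phase_child("CG", mother="CG", father="CC"))  # ('G', 'C')
```

The third call shows why adding the second parent helps: the spot that was a no call with the mother alone becomes resolvable once the father's C C is known.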