Glossary
There are an awful lot of new terms in DNA research. You won't run across all these in this report, but over the years, as you do your own research, you'll have this handy glossary to use.
Adenine: One of the four bases in DNA that make up the letters ATGC; adenine is the "A". The others are “G” for guanine, “C” for cytosine, and “T” for thymine. Adenine always pairs with thymine. Cytosine always pairs with guanine. These letters are used as shorthand for the sequences of fragments of DNA, e.g. CCAAGTAC. These sequences are the code for genetic information.
Allele: Alternative form of a genetic locus; a single allele for each locus is inherited separately from each parent (e.g., at a locus for eye color the allele might result in blue or brown eyes).
Allele Frequency: The proportion of a particular allele among the chromosomes carried by individuals in a population.
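To make the idea of a proportion concrete, here is a minimal sketch of how an allele frequency could be tallied. The function name and the sample values are made up for illustration; they are not taken from any study.

```python
from collections import Counter

def allele_frequencies(alleles):
    """Return {allele: proportion} for a list of observed alleles.

    Each item is the allele carried on one chromosome, so a diploid
    individual would contribute two items for an autosomal locus.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical sample: repeat counts at one marker on 10 chromosomes.
sample = [14, 14, 15, 14, 13, 14, 15, 14, 14, 13]
freqs = allele_frequencies(sample)
print(freqs[14])  # 6 of the 10 chromosomes carry allele 14 -> 0.6
```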
AMH: See Atlantic Modal Haplotype.
Atlantic Modal Haplotype (AMH): The descriptive term used by James F. Wilson to characterize the most common haplotype in parts of Europe. The markers and most common repeat values for the AMH are:
DYS19 = 14
DYS388 = 12
DYS390 = 24
DYS391 = 11
DYS392 = 13
DYS393 = 13
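As a rough sketch of how a test result might be compared against the AMH values listed above, the following checks a haplotype marker by marker. The function name and the "tested" values are hypothetical, chosen only to show one mismatch.

```python
# AMH repeat values exactly as listed in the entry above.
AMH = {"DYS19": 14, "DYS388": 12, "DYS390": 24,
       "DYS391": 11, "DYS392": 13, "DYS393": 13}

def mismatches_from_amh(haplotype):
    """Return {marker: (tested_value, amh_value)} for markers that differ.

    Markers absent from `haplotype` are skipped rather than counted
    as mismatches (an assumption made for this sketch).
    """
    return {marker: (haplotype[marker], value)
            for marker, value in AMH.items()
            if marker in haplotype and haplotype[marker] != value}

# Hypothetical test result that differs from the AMH at one marker.
tested = {"DYS19": 14, "DYS388": 12, "DYS390": 23,
          "DYS391": 11, "DYS392": 13, "DYS393": 13}
diff = mismatches_from_amh(tested)
print(diff)  # {'DYS390': (23, 24)}
```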
Autosome: A chromosome not involved in sex determination. The diploid human genome consists of 46 chromosomes: 22 pairs of autosomes and 1 pair of sex chromosomes (the X and Y chromosomes).
Base pair (bp): Two nitrogenous bases (adenine and thymine, or guanine and cytosine) held together by weak bonds; a set of two bonded nucleotides on opposite strands of DNA. Two strands of DNA are held together in the shape of a double helix by the bonds between base pairs. There are two possible base pairs: C-G and A-T. These letters are used as shorthand for the sequences of fragments of DNA, e.g. CCAAGTAC. These sequences are the code for genetic information. Strung together in chains, each base reaches across and forms a pair with its complementary base on the opposite strand, like the rungs of a ladder. Base pairing ensures that the genetic information, the sequence of bases in the DNA, is passed securely from generation to generation in a process called DNA replication.
Bp: See base pair.
Chromosome: The self-replicating genetic structure of cells containing the cellular DNA that bears in its nucleotide sequence the linear array of genes. In prokaryotes, chromosomal DNA is circular, and the entire genome is carried on one chromosome. Eukaryotic genomes consist of a number of chromosomes whose DNA is associated with different kinds of proteins. In plants and animals, a chromosome is a rod-like structure of tightly coiled DNA found in the cell nucleus. Chromosomes are normally found in pairs; human beings typically have 23 pairs of chromosomes.
Clade: From the Greek word klados, meaning branch. A branch of biological taxa or species that share features inherited from a common ancestor. A single phyletic group or line. Also cladus. A monophyletic group of taxa.
Cladistics: School of phylogenetic analysis emphasizing the branching patterns of monophyletic taxa, relying on synapomorphies (vs. symplesiomorphies) to unite sister taxa. [See Avise, pp. 34-39, 121-122].
Cladogram: A diagram, in the form of a stylized tree, showing inferred historical branching patterns among taxa.
Cline: Continuous change in a trait or trait frequency over space or time.
Cytosine: One of the four bases in DNA that make up the letters ATGC; cytosine is the "C". The others are “A” for adenine, “G” for guanine, and “T” for thymine. Cytosine always pairs with guanine. Adenine always pairs with thymine. These letters are used as shorthand for the sequences of fragments of DNA, e.g. CCAAGTAC. These sequences are the code for genetic information.
Diploid: A full set of genetic material, consisting of paired chromosomes, one chromosome from each parental set. Most animal cells except the gametes have a diploid set of chromosomes. The diploid human genome has 46 chromosomes.
DNA (deoxyribonucleic acid): The molecule that encodes genetic information. DNA is a double-stranded molecule held together by weak bonds between base pairs of nucleotides. The four nucleotides in DNA contain the bases adenine (A), guanine (G), cytosine (C), and thymine (T). In nature, base pairs form only between A and T and between G and C; thus the base sequence of each single strand can be deduced from that of its partner.
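Because A pairs only with T and G only with C, working out the partner strand is a simple letter-for-letter substitution. The sketch below illustrates that with the example fragment used throughout this glossary; the function names are made up for illustration. The second function reflects the general convention (not stated in this glossary) that the two strands run in opposite directions, so the partner is usually written reversed.

```python
# Pairing rules from the entry above: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def partner_strand(strand):
    """Return the partner strand, base by base, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())

def reverse_complement(strand):
    """Return the partner strand read in its own direction (reversed)."""
    return partner_strand(strand)[::-1]

print(partner_strand("CCAAGTAC"))      # GGTTCATG
print(reverse_complement("CCAAGTAC"))  # GTACTTGG
```

Applying `partner_strand` twice returns the original fragment, which is the sense in which each strand can be deduced from the other.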
DNA fingerprinting: A term for DNA typing. The chemical structure of everyone's DNA is the same. The only difference between people (or any animals) is the order of the base pairs. There are so many millions of base pairs in each person's DNA that every person has a different sequence. Using these sequences, every person could be identified solely by the sequence of their base pairs. However, because there are so many millions of base pairs, the task would be very time-consuming. Instead, scientists are able to use a shorter method, because of repeating patterns in DNA. These patterns do not, however, give an individual "fingerprint," but they are able to determine whether two DNA samples are from the same person, related people, or non-related people. Scientists use a small number of sequences of DNA that are known to vary among individuals a great deal, and analyze those to get a certain probability of a match.
DNA marker: A gene or other fragment of DNA whose location in the genome is known.
DNA sequence: The relative order of base pairs, whether in a fragment of DNA, a gene, a chromosome, or an entire genome.
DNA typing: The analysis of sections of DNA for purposes of identification.
Double helix: The shape that two linear strands of DNA assume when bonded together.
DYS: D = DNA, Y = Y chromosome, S = a unique DNA segment. This nomenclature is controlled by the HUGO Gene Nomenclature Committee, which assigns new DYS numbers. Its guideline determines each part of the symbol for naming arbitrary DNA fragments and loci. See section Appendix 1.1, DNA Segments, located at http://www.gene.ucl.ac.uk/nomenclature/guidelines.html#1.4
EST: Expressed Sequence Tag.
Flanking Region: For microsatellites, the flanking regions are the stretches of DNA outside the simple sequence tandem repeat (STR). These sequences are used as primer pairs. The flanking regions are usually invariant across a population or species, but mutations in the flanking region can be a cause of null alleles as well as a potentially serious source of homoplasy (see Pemberton et al. 1995).
Forensic: Of or relating to courts or legal matters. Molecular markers are increasingly common in the context of forensics, both in wildlife and human cases involving identity or relatedness.
Gene: The fundamental physical and functional unit of heredity. A gene is an ordered sequence of nucleotides located in a particular position on a particular chromosome that encodes a specific functional product (i.e., a protein or RNA molecule).
Gene expression: The process by which a gene's coded information is converted into the structures present and operating in the cell. Expressed genes include those that are transcribed into mRNA and then translated into protein and those that are transcribed into RNA but not translated into protein.
Gene mapping: Determination of the relative positions of genes on a DNA molecule (chromosome or plasmid) and of the distance, in linkage units or physical units, between them.
Genetic Distance: Various statistics for measuring the 'genetic distance' between subgroups or populations. Major distance measures include Nei's distance (1972, 1978), Reynolds' distance (Reynolds et al. 1983) and newer distance measures that incorporate the stepwise mutation process in microsatellites (RST of Slatkin 1995a, b; D of Shriver et al.; delta mu of Goldstein et al. 1995).
Genetic markers: Alleles of genes, or DNA polymorphisms, used as experimental probes to keep track of an individual, a tissue, a cell, a nucleus, a chromosome, or a gene. Stated another way, any character that acts as a signpost or signal of the presence or location of a gene or hereditary characteristic in an individual in a population. There are 4 kinds of chromosome changes that do occur from generation to generation, and these are known as markers:
a. indels: these are insertions or deletions of the DNA at particular locations on the chromosome. An example is the YAP (Y chromosome Alu polymorphism).
b. SNPs: these are single nucleotide polymorphisms in which a particular nucleotide is changed (for example, A is changed to G). Since SNPs (snips) and indels (stable Alus) are very rare, they also are known as unique event polymorphisms (UEPs).
c. microsatellites: these are short sequences of nucleotides (typically 2 to 5 core base pairs, example: ATCG) which are repeated multiple times in tandem. Over time changes sometimes do occur, thus the number of repeats may increase or decrease.
d. minisatellites: these are longer sequences of nucleotides (typically 9 to 80 core base pairs, example: TAAGGGCCA) which are repeated multiple times in tandem. Over time changes sometimes do occur and the number of repeats may increase or decrease.
Genetic profile: A collection of information about a person's genes.
Genetics: The study of the patterns of inheritance of specific traits.
Genome: All the genetic material in the chromosomes of a particular organism; its size is generally given as its total number of base pairs.
Genome project: Research and technology development effort aimed at mapping and sequencing some or all of the genome of human beings and other organisms.
Genotype: The genetic makeup of an organism, or the set of DNA variants found at one or more loci in an individual, as distinguished from its physical appearance or phenotype. Our external features, what scientists call our phenotypes, are different. We have a wide array of skin color, eye shape and color, and hair texture. However, our interior profile, or genotype (the organization of our genes on our chromosomes), identifies us all as Homo sapiens.
Guanine: One of the four bases in DNA that make up the letters ATGC; guanine is the "G". The others are “A” for adenine, “C” for cytosine, and “T” for thymine. Guanine always pairs with cytosine. Adenine always pairs with thymine. These letters are used as shorthand for the sequences of fragments of DNA, e.g. CCAAGTAC. These sequences are the code for genetic information.
Haplogroup (Hg): A collection of closely related haplotypes.
Haploid: A single set of chromosomes (half the full set of genetic material), present in the egg and sperm cells of animals and in the egg and pollen cells of plants. Human beings have 23 chromosomes in their reproductive cells. Compare diploid.
Haplotype (Ht): A set of closely linked alleles (genes or DNA polymorphisms) inherited as a unit. A contraction of the phrase "haploid genotype". Different combinations of polymorphisms are known as haplotypes. Collectively the results from several loci could be referred to as a haplotype. "Haplo" comes from the Greek word for "single".
Heredity: The handing down of certain traits from parents to their offspring. The process of heredity occurs through the genes.
Homology: Having the same origin (used for genes or characters deriving from a common ancestor).
Homoplasy: Similarity of traits or genes for reasons other than coancestry (e.g., convergent evolution, parallelism, evolutionary reversals, horizontal gene transfer, gene duplications). Homoplasy violates a basic assumption of the analysis of genetic markers: variants of similar phenotype (e.g., base pair size) are assumed to derive from a common ancestor. [See Sanderson, M., and Hufford. 1996. Homoplasy: The Recurrence of Similarity in Evolution. Academic Press, NY ISBN 618030-X].
HUGO: See Human Genome Organization.
Human Genome Initiative: Collective name for several projects begun in 1986 by DOE to (1) create an ordered set of DNA segments from known chromosomal locations, (2) develop new computational methods for analyzing genetic map and DNA sequence data, and (3) develop new techniques and instruments for detecting and analyzing DNA. This DOE initiative is now known as the Human Genome Program. The national effort, led by DOE and NIH, is known as the Human Genome Project.
Human Genome Organization (HUGO): The international organization of scientists involved in the Human Genome Project (HGP), the global initiative to map and sequence the human genome. HUGO was established in 1989 by a group of the world's leading genome scientists to promote international collaboration within the project. HUGO currently has over 1000 members representing over 50 countries. HUGO maintains three regional offices, HUGO Americas, HUGO Europe and HUGO Pacific, which carry out the administrative duties of the organization. HUGO carries out a complex coordinating role within the Human Genome Project. HUGO activities range from support of data collation for constructing genetic and physical maps of the human genome to the organization of workshops to promote the consideration of a wide range of ethical, legal, social and intellectual property issues.
Human Genome Project (HGP): The national effort, initially led by DOE and NIH, is known as the Human Genome Project. It is now an international initiative to map and sequence the human genome.
Human Genome Program: The DOE program previously known as the Human Genome Initiative.
Hypervariability: High degree of variation among individuals within local populations at a given genetic marker. Examples of hypervariable markers include minisatellites and microsatellites.
Informatics: The study of the application of computer and statistical techniques to the management of information. In genome projects, informatics includes the development of methods to search databases quickly, to analyze DNA sequence information, and to predict protein sequence and structure from DNA sequence data.
In vitro: Outside a living organism.
ISOGG: The International Society of Genetic Genealogy (ISOGG) was founded in 2005 by DNA project administrators who share the common vision of the promotion and education of genetic genealogy.
Karyotype: A picture of the chromosomes in a cell that is used to check for abnormalities. A karyotype is created by staining the chromosomes with dye and photographing them through a microscope. The photograph is then cut up and rearranged so that the chromosomes are lined up into corresponding pairs.
Linkage map: A map of the relative positions of genetic loci on a chromosome, determined on the basis of how often the loci are inherited together. Distance is measured in centimorgans (cM).
Localize: Determination of the original position (locus) of a gene or other marker on a chromosome.
Loci: See Locus.
Locus (pl. loci): The position on a chromosome of a gene or other chromosome marker; also, the DNA at that position. The use of locus is sometimes restricted to mean regions of DNA that are expressed. The specific physical location of a gene on a chromosome. From the Latin for 'place'. A stretch of DNA at a particular place on a particular chromosome, often used for a 'gene' in the broad sense, meaning a stretch of DNA being analyzed for variability (e.g., a microsatellite locus).
Marker: An identifiable physical location on a chromosome (e.g., restriction enzyme cutting site, gene) whose inheritance can be monitored. Markers can be expressed regions of DNA (genes) or some segment of DNA with no known coding function but whose pattern of inheritance can be determined. A gene of known location on a chromosome and phenotype that is used as a point of reference in the mapping of other loci.
Meiosis: The process of two consecutive cell divisions in the diploid progenitors of sex cells. Meiosis results in four rather than two daughter cells, each with a haploid set of chromosomes.
Microsatellite: Repetitive stretches of short sequences of DNA used as genetic markers to track inheritance in families. They are short sequences of nucleotides (example: ATCG) which are repeated over and over again a number of times in tandem. Changes sometimes do occur, however, and the number of repeats may increase or decrease. See also Genetic Markers.
Minisatellites: Segments of repeated DNA often used as genetic markers for individual identification (forensic DNA 'fingerprinting') or analyses of relatedness. Can be either single- or multi-locus. Minisatellite technology relies on probe-based hybridization. Advantages include lack of need for specific primers and hypervariability. Disadvantages include inability to use PCR amplification, the need for Southern blotting, and, for multi-locus minisatellites, the lack of locus-specificity (making population genetic analyses difficult). [See Avise, Fig. 3.16, p. 80].
Mitochondrial DNA: See mtDNA
Modal Haplogroup: All haplogroups are in a sense made up based on similarities. A modal haplogroup is one in which scientists have noticed similarities within a certain set of markers amongst a group of people. The goal is to tie any haplogroup, even a modal one, to a specific point in time and a precise place in geography.
Monophyletic group (clade): Evolutionary assemblage of taxa that includes a common ancestor and all of its descendants. [See Avise, p. 36].
MRCA: Most recent common ancestor.
mtDNA: Mitochondrial DNA, which is passed down from the mother to all her children, males and females. It is the genetic material of the mitochondria, the organelles that generate energy for the cell.
Mutation: A permanent structural alteration in DNA. In most cases, DNA changes either have no effect or cause harm, but occasionally a mutation can improve an organism's chance of surviving and passing the beneficial change on to its descendants.
MWTBD: More Work To Be Done. I use this almost jokingly throughout the report because it's always true of every part of this work.
Nucleotide: A subunit of DNA or RNA consisting of a nitrogenous base (adenine, guanine, thymine, or cytosine in DNA; adenine, guanine, uracil, or cytosine in RNA), a phosphate molecule, and a sugar molecule (deoxyribose in DNA and ribose in RNA). Thousands of nucleotides are linked to form a DNA or RNA molecule.
Nucleus: The cellular organelle in eukaryotes that contains the genetic material. The center of a cell, where all of the DNA, packaged in chromosomes, is contained.
PCR: See Polymerase Chain Reaction.
Pedigree: A simplified diagram of a family's genealogy that shows family members' relationships to each other and how a particular trait or disease has been inherited.
Phenotype: Our external features are called our phenotypes and are very different. We have a wide array of skin color, eye shape and color, and hair texture. However, our interior profile, or genotype (the organization of our genes on our chromosomes), identifies us all as Homo sapiens.
Phylogeny: The evolutionary history of a species.
Polymerase Chain Reaction (PCR): An in vitro process that yields millions of copies of desired DNA through repeated cycling of a reaction involving the DNA polymerase enzyme. Technique for amplifying nucleic acids in a thermal cycler. Involves use of forward and reverse primer pairs that start off the reaction. End yield is many orders of magnitude more DNA of the target sequence than one started with. The resulting amplified DNA can then be visualized with stains or radioactive labeling, or sized with fluorescent markers in a sequencer. [See Avise, p. 84, Fig. 3.18, p. 85].
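To give a feel for "many orders of magnitude", here is a back-of-the-envelope sketch. It assumes an idealized reaction in which every cycle multiplies the template by (1 + efficiency), with perfect doubling when efficiency is 1.0. The function name, the `efficiency` parameter, and the cycle counts are illustrative assumptions, not figures from this glossary, and real reactions fall short of this.

```python
def ideal_pcr_copies(start_copies, cycles, efficiency=1.0):
    """Estimate copies after `cycles` rounds of amplification.

    Assumes each cycle multiplies the template by (1 + efficiency);
    efficiency=1.0 means perfect doubling every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must not be negative")
    return start_copies * (1 + efficiency) ** cycles

# Ten starting copies, thirty perfect cycles: 10 * 2**30, roughly 1e10.
print(ideal_pcr_copies(10, 30))
```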
Polymorphism: A naturally occurring or induced variation in the sequence of genetic information on a segment of DNA. The term is used here to show that mutations do occur in the Y chromosome, as can happen with other chromosomes.
Primer: Short, preexisting single-stranded polynucleotide chain to which new deoxyribonucleotides can be added by DNA polymerase (to 'prime' PCR amplification). The primer anneals to a nucleic acid template (DNA of the organism of interest) and promotes copying of the template, starting from the primer site. To amplify microsatellites one uses a forward and reverse primer pair:
[agctcagtccctagtcagtact]acacacacacacacacacacac[ggtacttcggagctatccgaattccct]
In this example the bracketed bp are the forward and reverse primers (should not differ among individuals), whereas the unbracketed 'ac' repeat is the microsatellite. By running back and forth across the repeat one can amplify a few copies of the microsatellite region by orders of magnitude, yielding sufficient DNA to allow visualization of the amplified product on an acrylamide gel by staining with ethidium bromide. Some primer sequences may be conserved across wide taxonomic gaps (e.g., across families), while others may differ even among congeners.
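The primer example above can be turned into a small sketch that finds the stretch between the two flanking sequences and counts the repeat units. It is a simplification: it treats both flanks as plain text on a single strand, exactly as the example is printed, and the function name is made up for illustration.

```python
# The three parts of the example above.
FORWARD = "agctcagtccctagtcagtact"
REPEAT_REGION = "acacacacacacacacacacac"
REVERSE_SITE = "ggtacttcggagctatccgaattccct"
template = FORWARD + REPEAT_REGION + REVERSE_SITE

def microsatellite_repeats(sequence, forward, reverse_site, unit):
    """Count tandem copies of `unit` lying between the two flanking sites.

    Returns None if either flank is missing or the stretch between them
    is not made up purely of whole copies of `unit`.
    """
    start = sequence.find(forward)
    if start == -1 or not unit:
        return None
    region_start = start + len(forward)
    region_end = sequence.find(reverse_site, region_start)
    if region_end == -1:
        return None
    region = sequence[region_start:region_end]
    copies, leftover = divmod(len(region), len(unit))
    if copies == 0 or leftover != 0 or region != unit * copies:
        return None
    return copies

repeats = microsatellite_repeats(template, FORWARD, REVERSE_SITE, "ac")
print(repeats)
```

Two people with the same flanking sequences but different repeat counts would give different numbers here, which is what makes the repeat useful as a marker.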
Protein: A large molecule composed of one or more chains of amino acids in a specific order; the order is determined by the base sequence of nucleotides in the gene coding for the protein. Proteins are required for the structure, function, and regulation of the body’s cells, tissues, and organs, and each protein has unique functions. Examples are hormones, enzymes, and antibodies.
Recombination: Exchange of gene segments by crossing over at chiasmata (exchange of material between non-sister chromatids). The exchanged sections are usually homologous. The likelihood of recombination between two loci increases with increasing physical distance between them.
Sequence Tagged Site (STS): Short (200 to 500 base pairs) DNA sequence that has a single occurrence in the human genome and whose location and base sequence are known. Detectable by polymerase chain reaction, STSs are useful for localizing and orienting the mapping and sequence data reported from many different laboratories and serve as landmarks on the developing physical map of the human genome. Expressed sequence tags (ESTs) are STSs derived from cDNAs.
Sequencing: Determination of the order of nucleotides (base sequences) in a DNA or RNA molecule or the order of amino acids in a protein.
Sex Chromosome: The X or Y chromosome in human beings that determines the sex of an individual. Females have two X chromosomes in diploid cells; males have an X and a Y chromosome. The sex chromosomes comprise the 23rd chromosome pair in a karyotype. Compare autosome.
Short Tandem Repeats (STR): Multiple copies of an identical DNA sequence arranged in direct succession in a particular region of a chromosome.
Single Nucleotide Polymorphism (SNP): A variation in the genetic code at a specific point on the DNA. In principle, SNPs could be bi-, tri-, or tetra-allelic polymorphisms. However, in humans, tri-allelic and tetra-allelic SNPs are rare almost to the point of non-existence, and so SNPs are sometimes simply referred to as bi-allelic markers (or di-allelic to be etymologically correct). This is somewhat misleading because SNPs are only a subset of all possible bi-allelic polymorphisms (e.g., multiple base variations). About 30 million SNPs are thought to exist, making them much better markers than alternative markers, such as microsatellite repeats or short tandem repeats. But it has been the discovery that some SNPs are linked to particular diseases that has fueled the rising interest in this field. See also The SNP Consortium.
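As a small illustration of "a variation at a specific point", the sketch below lists the positions where two sequences differ by a single base. It assumes the two sequences are already lined up and the same length; real comparisons need an alignment step first. The function name and the variant sequence are hypothetical.

```python
def snp_positions(seq_a, seq_b):
    """Return (position, base_a, base_b) for each single-base difference.

    Positions count from 0. Assumes the sequences are already aligned
    and of equal length (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]

reference = "CCAAGTAC"
variant = "CCAGGTAC"   # hypothetical: the second A has changed to G
print(snp_positions(reference, variant))  # [(3, 'A', 'G')]
```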
Slippage Replication: A mutation process whereby a simple sequence tandem (microsatellite) repeat grows or shrinks by addition or subtraction of the "beads" of simple units that make up the "necklace". A dinucleotide AC repeat would change by addition or subtraction of AC units.
SNP: See Single Nucleotide Polymorphism.
Species: A single, distinct class of living creature with features that distinguish it from others.
Stepwise mutation: Microsatellite variation appears to result from slippage in replication, which is most likely to add or delete a single repeat unit (steps of one). As a result, alleles more similar in size will presumably be more closely related. This additional 'phylogenetic' information can be used in assessing genetic differentiation or genetic distance.
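One simple way to put the stepwise idea to work is to add up, marker by marker, how many repeat steps separate two haplotypes. The sketch below does only that; it is an illustration of the principle, not one of the published distance measures such as RST or delta mu, and the function name and repeat values are made up.

```python
def stepwise_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences over shared markers.

    Under the stepwise assumption each unit of difference stands for
    one single-repeat mutation step.
    """
    shared = hap_a.keys() & hap_b.keys()
    if not shared:
        raise ValueError("the haplotypes share no markers")
    return sum(abs(hap_a[m] - hap_b[m]) for m in shared)

# Hypothetical repeat counts for two men at three markers.
man_a = {"DYS19": 14, "DYS390": 24, "DYS391": 11}
man_b = {"DYS19": 15, "DYS390": 22, "DYS391": 11}
print(stepwise_distance(man_a, man_b))  # 1 + 2 + 0 = 3
```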
STR: See Short Tandem Repeats.
Tandem Repeat Sequences: Multiple copies of the same base sequence on a chromosome; used as a marker in physical mapping.
Taxon (plural taxa): Group of organisms linked by common ancestry. Taxa can range in scale from populations to kingdoms.
The SNP Consortium (TSC): Established (April 1999) to identify SNPs and add them to the public domain rather than patenting them for commercial use. A joint enterprise of pharmaceutical companies (AstraZeneca, Bayer, Bristol-Myers Squibb, F. Hoffmann-La Roche, Glaxo Wellcome, Aventis/Hoechst Marion Roussel, Novartis, Pfizer, Searle, and SmithKline Beecham) and the Wellcome Trust, with sequencing carried out by three public genomics institutes (Sanger Centre, Washington University School of Medicine, and the Whitehead Institute), with Stanford University Human Genome Center contributing mapping and Cold Spring Harbor Laboratory coordinating bioinformatics activities. [CHI SNPs report] Motorola joined in October 1999 and IBM in March 2000.
Traits: Ways of looking, thinking, or being. Traits that are genetic are passed down through the genes from parents to offspring.
TSC: See The SNP Consortium.
X-Chromosome: A chromosome that is different in the two sexes and involved in sex determination. The female in our species has two X chromosomes.
Y-Chromosome: A chromosome that is different in the two sexes and involved in sex determination. The male in our species has one Y and one X chromosome.