Sequence Algebraic Expression For The Genetic Map Of D. melanogaster
by
Huen Y.K.
CAHRC, P.O.Box 1003, Singapore 911101
http://web.singnet.com.sg/~huens/
email: huens@mbox3.singnet.com.sg
(A short communication - 1st released: 22/1/98. Revised 22/1,23/1.)
Abstract
D. melanogaster is the scientific name of the fruit fly. Using crosses and related
techniques, researchers have constructed a genetic map of Drosophila
melanogaster that is one of the most complete for any eukaryote [1]. A genetic
map shows the positions of well-known genes on the chromosomes as distances
in map units. The author shows how the entire genetic (linkage) map can be
represented algebraically in sequence algebra. This is not done merely for novelty:
sequence algebra has already been found useful in genetics [2,3]. Fully descriptive
names are used instead of abbreviated symbols for ease of reference.
1. Introduction
This paper follows up, with some refinements, on material from previous
papers [2,3], where the author suggested that a diploid chromosome
should be represented in sequence algebraic format as shown in equation (1):
\[
\mathrm{Chromo\_diploid} := \frac{A\,a}{z^{p}} + \frac{B\,b}{z^{q}} + \frac{C\,c}{z^{r}} \tag{1}
\]
Here the alleles belonging to one copy are written in capitals and those of the other copy
in lower case. This is necessary so that when the chromosome
splits into two halves, the alleles belonging to each copy stay together. On
occasion alleles cross over between the two copies during egg formation, but this
operation should be handled separately. The previous paper also suggested
a new operator called Slice( ), which reduces the above diploid into two haploid
copies, as shown in equations (2a) and (2b) [3]. No confusion can arise between
haploid and diploid sequence expressions, since the former has only one variable in each
numerator whereas the latter has two. For a homozygous diploid, one would
expect two copies of the same allele in the numerator. A haploid sequence examined
in isolation, however, cannot reveal whether it came
from a homozygous or a heterozygous parent unless additional tags are included
in the sequence algebraic expression. Such tags are entirely possible: in a previous
paper the author used M and F in the denominator variables to represent male or
female parentage [3]. Sequence algebra is flexible enough to accept additional tag
variables if these are found necessary.
\[
\mathrm{chromo\_haploid\_I} := \frac{A}{z^{p}} + \frac{B}{z^{q}} + \frac{C}{z^{r}} \tag{2a}
\]
\[
\mathrm{chromo\_haploid\_II} := \frac{a}{z^{p}} + \frac{b}{z^{q}} + \frac{c}{z^{r}} \tag{2b}
\]
Note that in the above equations the power indices of the denominator variables remain
unchanged, since these represent distances in map units. Following the convention in
genetics, one could write A+ or a+ for wild-type alleles, which are usually dominant, and the
same symbols without the superscript, such as A or a, for the mutant, usually recessive,
alleles. When the Slice function operates on equation (1), the output is a list of two
sequences, as shown in equation (3), where chromo_haploid_I and chromo_haploid_II are
given by equations (2a) and (2b) respectively:
\[
\mathrm{Slice}(\mathrm{Chromo\_diploid}) := [\,\mathrm{chromo\_haploid\_I},\ \mathrm{chromo\_haploid\_II}\,] \tag{3}
\]
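The Slice operation of equation (3) can be illustrated in code. What follows is only a minimal Python sketch under stated assumptions, not the author's implementation: the names DiploidTerm, HaploidTerm and slice_diploid are hypothetical, and the numeric positions are placeholders standing in for the symbolic power indices p, q and r.

```python
from typing import List, NamedTuple, Tuple


class DiploidTerm(NamedTuple):
    """One term of a diploid sequence: two alleles over z**position."""
    alleles: Tuple[str, str]  # (allele on copy I, allele on copy II)
    position: float           # power index of z, in map units


class HaploidTerm(NamedTuple):
    """One term of a haploid sequence: one allele over z**position."""
    allele: str
    position: float


def slice_diploid(diploid: List[DiploidTerm]) -> List[List[HaploidTerm]]:
    """Split a diploid sequence into its two haploid copies.

    Alleles of the same copy stay together and the power indices
    (map positions) are carried over unchanged.
    """
    copy_one = [HaploidTerm(t.alleles[0], t.position) for t in diploid]
    copy_two = [HaploidTerm(t.alleles[1], t.position) for t in diploid]
    return [copy_one, copy_two]


# Placeholder positions 1.0, 2.5, 4.0 stand in for p, q, r (assumed values).
chromo_diploid = [
    DiploidTerm(("A", "a"), 1.0),
    DiploidTerm(("B", "b"), 2.5),
    DiploidTerm(("C", "c"), 4.0),
]
haploid_one, haploid_two = slice_diploid(chromo_diploid)
print([t.allele for t in haploid_one])  # alleles of copy I
print([t.allele for t in haploid_two])  # alleles of copy II
```

A homozygous locus would simply carry the same allele twice in its term, for example DiploidTerm(("A", "A"), 1.0), and both haploid copies would then receive A at that position.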
2. Equation For D. melanogaster
Using the conventions developed above, the haploid representation of D. melanogaster is
given by equation (4). The genetic map is taken from a standard textbook [1]; it may
not be the most up-to-date map available, but it serves here only as a
demonstration example. Abbreviated symbols are avoided for ease of reference, and
standard genetics notation is used for familiarity. Since there are four
chromosomes in D. melanogaster, a different denominator variable is chosen for each
of the four sequences. The power indices of the denominator variables
represent positions in map units. A diploid representation would require further information
on whether the individual is homozygous or heterozygous. As a haploid,
this sequence is an algebraic representation of a gamete. One could therefore take the
outer-product of these sequences to predict the variation in the offspring. With so
many alleles taken into account at the same time, real experiments would be a
daunting task. Realistic computer simulation of genetic assortment may be where
mathematics shows its strength. Those interested are invited to read the two
previous papers which initiated this line of thought [2,3].
\[
\begin{aligned}
\text{Chromo\_X} :={} & \text{yellow\_body}
 + \frac{\text{white\_eyes}}{x^{1.5}}
 + \frac{\text{facet\_eyes}}{x^{3}}
 + \frac{\text{echinus\_eyes}}{x^{5.5}}
 + \frac{\text{ruby\_eyes}}{x^{7.5}}
 + \frac{\text{crossveinless\_wings}}{x^{13.7}} \\
& + \frac{\text{cut\_wings}}{x^{20}}
 + \frac{\text{singed\_bristles}}{x^{21}}
 + \frac{\text{tan}}{x^{27.5}}
 + \frac{\text{lozenge\_eyes}}{x^{27.7}}
 + \frac{\text{vermilion\_eyes}}{x^{33}}
 + \frac{\text{miniature\_wings}}{x^{36.1}} \\
& + \frac{\text{sable\_body}}{x^{43}}
 + \frac{\text{garnet\_eyes}}{x^{44}}
 + \frac{\text{rudimentary\_wings}}{x^{54.5}}
 + \frac{\text{forked\_bristles}}{x^{56.7}}
 + \frac{\text{bar\_eyes}}{x^{57}}
 + \frac{\text{fused\_veins}}{x^{59.5}} \\
& + \frac{\text{carnation\_eyes}}{x^{62.5}}
 + \frac{\text{bobbed\_hairs}}{x^{66}} \\[1ex]
\text{Autosom\_II} :={} & \text{aristaless\_antenna}
 + \frac{\text{star\_eyes}}{y^{1.3}}
 + \frac{\text{dumpy\_wings}}{y^{13}}
 + \frac{\text{clot\_eyes}}{y^{16.5}}
 + \frac{\text{black\_body}}{y^{48.5}} \\
& + \frac{\text{reduced\_bristles}}{y^{51}}
 + \frac{\text{purple\_eyes}}{y^{54.5}}
 + \frac{\text{cinnabar\_eyes}}{y^{57.5}}
 + \frac{\text{vestigial\_wings}}{y^{67}}
 + \frac{\text{lobe\_eyes}}{y^{72}}
 + \frac{\text{curved\_wings}}{y^{75.5}} \\
& + \frac{\text{plexus\_wings}}{y^{100.5}}
 + \frac{\text{brown\_eyes}}{y^{104.5}}
 + \frac{\text{speck\_body}}{y^{107}} \\[1ex]
\text{Autosom\_III} :={} & \text{roughoid\_eyes}
 + \frac{\text{veinlet\_veins}}{z^{2}}
 + \frac{\text{javelin\_bristle}}{z^{19.2}}
 + \frac{\text{sepia\_eyes}}{z^{26}}
 + \frac{\text{hairy\_body}}{z^{26.5}} \\
& + \frac{\text{dichaete\_bristles}}{z^{41}}
 + \frac{\text{thread\_arista}}{z^{43.2}}
 + \frac{\text{scarlet\_eyes}}{z^{44}}
 + \frac{\text{curled\_wings}}{z^{50}}
 + \frac{\text{stubble\_bristles}}{z^{58.2}} \\
& + \frac{\text{spineless\_bristles}}{z^{58.5}}
 + \frac{\text{striped\_body}}{z^{62}}
 + \frac{\text{delta\_veins}}{z^{66.2}}
 + \frac{\text{hairless\_bristles}}{z^{69.5}}
 + \frac{\text{ebony\_body}}{z^{70.7}} \\
& + \frac{\text{cardinal\_eyes}}{z^{74.4}}
 + \frac{\text{rough\_eyes}}{z^{91.1}}
 + \frac{\text{claret\_eyes}}{z^{100.7}} \\[1ex]
\text{Autosom\_IV} :={} & \text{sparkling}
 + \frac{\text{eyeless}}{\mathit{zz}^{2}}
\end{aligned}
\tag{4}
\]
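Because the power indices in equation (4) are map positions, the distance between two genes on the same chromosome is the difference of their indices. The following minimal Python sketch illustrates this with the first few terms of Chromo_X. The dictionary and function names are hypothetical, and placing yellow_body, which has no denominator, at position 0 is an assumption about how the leading term should be read.

```python
# First few terms of Chromo_X from equation (4): gene name -> power index.
# yellow_body has no denominator in the equation; position 0.0 is assumed.
CHROMO_X = {
    "yellow_body": 0.0,
    "white_eyes": 1.5,
    "facet_eyes": 3.0,
    "echinus_eyes": 5.5,
    "ruby_eyes": 7.5,
    "crossveinless_wings": 13.7,
    "cut_wings": 20.0,
    "singed_bristles": 21.0,
}


def map_distance(chromosome, gene_a, gene_b):
    """Distance in map units between two genes on the same chromosome."""
    return abs(chromosome[gene_a] - chromosome[gene_b])


def genes_between(chromosome, low, high):
    """Names of genes whose position lies in [low, high], in map order."""
    ordered = sorted(chromosome.items(), key=lambda item: item[1])
    return [name for name, position in ordered if low <= position <= high]


print(map_distance(CHROMO_X, "white_eyes", "cut_wings"))  # 18.5
print(genes_between(CHROMO_X, 3.0, 8.0))
# ['facet_eyes', 'echinus_eyes', 'ruby_eyes']
```

A mapping from name to index is used here only because lookups by gene name are convenient; an ordered list of terms would mirror the algebraic sequence more closely.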
3. New Mathematical Operators
The author has already remarked that Nature's mathematics in biology is quite
different from that of the human kind [2]. This is applied mathematics, so one is
licensed to define new mathematical operations where none exists, in order
that work can proceed. Already the algebraic representations
and manipulations of diploid and haploid sequences differ from those
of conventional sequence algebra. The convention for the outer-product, however,
remains unchanged and is found useful in studies of genetic assortment, segregation,
and crossover. There is no doubt that as we venture deeper into this domain, additional
new operators will be found necessary. The whole domain is uncharted territory
where one can give free rein to one's imagination.
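The use of the outer-product for genetic assortment can be sketched in code. The paper does not spell out the operation, so this minimal Python sketch assumes that it amounts to pairing every gamete of one parent with every gamete of the other, as in a Punnett square. The function name outer_product and the encoding of a gamete as a tuple of alleles, one per locus, are hypothetical.

```python
from collections import Counter
from itertools import product


def outer_product(gametes_one, gametes_two):
    """Pair every gamete of parent one with every gamete of parent two.

    Each gamete is a tuple of alleles, one per locus. The result counts
    how often each offspring genotype occurs; within a locus the two
    alleles are sorted so that "Aa" and "aA" are treated as the same.
    """
    offspring = Counter()
    for g1, g2 in product(gametes_one, gametes_two):
        genotype = tuple("".join(sorted(pair)) for pair in zip(g1, g2))
        offspring[genotype] += 1
    return offspring


# Single locus, both parents heterozygous (Aa x Aa).
parent_gametes = [("A",), ("a",)]
cross = outer_product(parent_gametes, parent_gametes)
print(cross[("AA",)], cross[("Aa",)], cross[("aa",)])  # 1 2 1

# Two independently assorting loci (AaBb x AaBb).
dihybrid_gametes = list(product("Aa", "Bb"))
dihybrid = outer_product(dihybrid_gametes, dihybrid_gametes)
print(sum(dihybrid.values()), len(dihybrid))  # 16 9
```

The sketch treats all gametes as equally likely and ignores linkage; weighting gametes by the map distances of equation (4) would be a further step not attempted here.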
4. Summary
It is too early to draw any conclusions. Thus far, mathematical biology has been mostly of
the statistical kind. Now we have a new tool in sequence algebra.
Someone might ask: "So what if you can represent D. melanogaster algebraically ...".
The answer is that this is only the very beginning of a journey, and
one cannot predict what will be encountered along the way. But the author feels
that biology needs a heavy dose of mathematics beyond just statistics. The
discipline is so complex that any algorithm that leads to the use of computers
for analysis would be welcome. Experiments in genetics are laborious and
time-consuming. If one could simulate the outcome before launching a full-scale
investigation, this would surely save money.
Three-dimensional manipulation of proteins is already feasible and is used
profitably by drug research companies. More could be done in this field if one
is not over-conservative.
5. References
Comments: Not all references in this list are cited directly in the main paper.
Most are provided for readers not familiar with sequence algebra. The papers released
on the author's web site can be followed by hyperlink while browsing there. Most are
short html files that download quickly without unzipping.
1. Weaver R.F. and Hedrick P.W.: Basic Genetics, 2nd Edition, WCB Publishers,
Dubuque, 1995, pp 154-159.
2. Huen Y.K.: A Sequence Algebraist's Attempts To Learn From Life Sciences
(date released: 14/1/98, 38 KBytes).
3. Huen Y.K.: Interval Modulated Frequency Distribution Problems
(date released: 19/1/98, 31 KBytes).
4. Huen Y.K.: A Matrix Map for Prime and Non-prime Numbers, Int. J. Math. Educ. Sci.
Technol., 1994, Vol. 25, No. 6, pp 913-920.
5. Huen Y.K.: Some Interesting Properties Of The Natural Number System, Int. J. Math. Educ.
Sci. Technol., 1996, Vol. 27, No. 5, pp 685-691.
6. Huen Y.K. et al: Visual algebra and its applications, Int. J. Math. Educ. Sci.
Technol., 1997, Vol. 28, No. 3, pp 333-344.
7. Huen Y.K.: A Simple Introduction To Sequence Algebra
(date released: 15.3.97, 38 KBytes, 11 A4 pages).
8. Huen Y.K.: The Canonical Generating Function or CGF(z) ...
(date released: 27.5.97, 24 KBytes, 7 A4 pages).
9. Huen Y.K.: Visual Solutions Of Number Theoretic Problems .....
(date released: 3.6.97, 38.3 KBytes, 10 A4 pages).
=====================END OF PAPER ======================