Is A Periodic Language A Pipe Dream?
by
Huen Y.K.
CAHRC, P.O.Box 1003, Singapore 911101
http://web.singnet.com.sg/~huens/
email: huens@mbox3.singnet.com.sg
(A short communication - 1st released: 1/1/98)
Abstract
Languages used by higher animals, including Man, can be classified into two pure
types, viz., periodic languages and spatial languages. In reality, no language is
pure; each falls somewhere between these two extremes and should be labelled as
predominantly periodic or predominantly spatial. Human languages are predominantly
spatial, but some birds use periodic languages. The information content conveyed
by a pure periodic language is quite limited, but it can be increased by turning it into
a composite periodic language. A question is posed: "Could a human language
be made periodic?" The author postulates that a periodic language is highly
ordered, i.e., it has zero entropy, whereas spatial languages, especially human
languages, have high entropy. The chance of converting a human language
into a periodic language is remote unless the entropy of the former can be reduced to zero.
The conclusion is that a human language is much more complex than the natural
number system and that it is impossible to design a periodic language expressive
enough to convey the whole spectrum of human thoughts. However,
most human languages can be rendered partially periodic by the theorem of
pseudo-periodicities.
1. Introduction
Have you ever wondered why a sunbird in the bush emits a continuous stream
of monotonous "chip, chip, ......."? This is done by a male sunbird claiming its
territory. To other male sunbirds, the message is clear. It says: "Within hearing
range, this is my territory. Do not trespass." To a female sunbird, it carries
other meanings, such as the health of the owner. The author labels this
a periodic language. Like music, it is a time-based, repetitive language which has meaning
only if listened to along a time axis. Humans use periodic languages rather infrequently.
The signature tune of a radio station is periodic and is used with an intent somewhat
analogous to that of the male sunbird. It seems that periodic languages are not
very expressive, as they can hold only limited information content. On the other
hand, a pure spatial language can be understood with or without the timebase. One
can read a bedtime story to a child. This is the time-based mode. Written words are
concatenated in space, where the spacing between words can be varied quite a bit
without affecting the meaning of the message. You cannot do that with a periodic
language, since the alphabets of such a language are defined by time intervals.
Written languages must be
spatial as humans are the only animals which use writing. But when we send
messages across the ether, we have to convert the language into a time-based one
even though it is not periodic.
One distinguishing feature of a periodic language is that alphabets are encoded
by time intervals and not by amplitudes. The second distinguishing feature is that
all alphabets are time holistic, i.e., each occupies the whole time axis from zero time
to infinity. This, however, does not mean that one has to wait forever to interpret a
message, because every holistic periodic sequence has a closed-form algebraic
formulation which can be transmitted through the ether. This is the way
periodic languages are encoded. One must realise that a pure periodic sequence
carries very limited information. In fact the entropy of such a sequence is zero, i.e.,
there is no uncertainty in a periodic language. Because
each periodic sequence can encode only one meaning, you will need an infinite number
of periodic sequences with different time intervals to encode a human language.
The natural number system has only one meaning and is purely periodic, whilst
the composite number system,
having much more information content, must be described by an infinity of
periodic sequences. If each periodic sequence is considered as one alphabet,
this means that some alphabets or keys will have very long time intervals, making
them impractical for transmission through the ether. However, since all periodic
sequences have closed forms, the real problem does not lie here. The real difficulty
is the use of an infinite alphabet system. A message encoded by an infinite
alphabet system is probably as difficult to crack as a one-time pad. But it is
impractical to implement such a language: it is like sending an encyclopaedia
to your listeners.
2. Shannon's Information Theory
Claude Elwood Shannon pioneered modern information theory in 1948 [1]. He
never mentioned the possibility of information being carried by a periodic language
with zero uncertainty. Does Shannon's information theory apply to such a language?
It may seem an irrelevant question, since no such language has been
encountered to date. It seems like a hypothetical language concocted
by the present author, but is it? Any information-carrying system is a language.
The natural number system carries information. It is a language. Sequence algebra
is the algebra of holistic number sequences. Information in number sequences is
encoded in closed-form generating functions before transmission. Shannon defined
the rate of a language as:
$$ r = \frac{H(M)}{N}, \tag{1} $$
in which N is the length of the message and H(M) is its entropy. In sequence
algebra the natural number system Nat(z) is given by the following closed form:
$$ \mathrm{Nat}(z) = \frac{z}{z - 1}. \tag{2} $$
The length N of the natural number set is infinite, and if its entropy H(M) is finite,
the rate of the language r is zero by equation (1), which implies that one needs 0 bits per letter
to encode the natural number system. This interpretation is correct provided
we take r as almost equal to zero and not absolutely zero. It implies that the information
content of the natural number system is minimal. In sequence algebra
the expression z/(z-1) can have only one meaning, i.e., n = 1. Therefore the entropy
H(M) of the message, measured in bits, is log2(n) = log2(1) = 0. It seems that we cannot
escape the conclusion that a periodic language has zero entropy, whether we
are comfortable with it or not. Someone might counter that Nat(z) has an infinite number
of meanings from the integer set {0,1,2,...,infinity}. That might be true in conventional
number theory but not in sequence algebra. This is because Nat(z) is handled in
sequence algebra as a single entity, a holistic number sequence. It says Nat(z)
is the natural number system and nothing more. Sequence algebra has different closed
forms for Even(z), Odd(z), Comp(z), Prime(z) and so on. The meanings are very clear.
The conclusion is that Shannon's information theory also applies to a periodic
language, as exemplified by the natural number system.
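The arithmetic of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `entropy_bits` and `language_rate` are hypothetical, the n meanings are taken as equally likely so that H(M) = log2(n), and a finite N stands in for the infinite length of the natural number set.

```python
import math


def entropy_bits(num_meanings: int) -> float:
    """Entropy H(M) in bits, assuming `num_meanings` equally likely meanings."""
    return math.log2(num_meanings)


def language_rate(entropy: float, length: int) -> float:
    """Rate r = H(M) / N from equation (1), for a message of N letters."""
    return entropy / length


# Nat(z) is treated as a single holistic entity, so n = 1 and H(M) = 0.
h_nat = entropy_bits(1)
rate_nat = language_rate(h_nat, 1000)

# With any fixed finite entropy, the rate shrinks towards zero as N grows.
rates = [language_rate(1.0, n) for n in (10, 1_000, 100_000)]

print(h_nat, rate_nat, rates)
```

The second calculation mirrors the remark that r should be read as "almost equal to zero": for finite H(M) the rate only approaches zero as N becomes large.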
As an alternative to Shannon's information theory, the author has previously
suggested that the number of alphanumeric characters in a closed form can be used
to measure the information content [2]. This idea originated from Chaitin [3]. The author's
investigations on the information content of various number sequences did
show proportionality between the number of alphanumeric characters in the closed
forms and the information content of such sequences. For example, the
sequence algebraic generating function for the Fermat numbers is definitely longer than
that for Nat(z).
3. Characteristics Of A Periodic Language
The information content of a periodic language can be increased by increasing
the number of periodic sequences with distinctive time intervals. The best
example is provided by the equation of divisibles given by equation (3) [4]:
$$ \mathrm{Comp}(z) := \sum_{i=2}^{\mathit{ub}} \frac{1}{z^{i}\,(z^{i} - 1)}. \tag{3} $$
Equation (3) contains an infinite summation of expressions of the type 1/(z^i*(z^i-1))
with i ranging from 2 to infinity (the upper bound ub is taken to infinity). Each of
these terms can be viewed as an alphabet. So
this is an infinite alphabet system. To describe the composite number system, you
need to use all the alphabets provided, and it is obvious that one will never be
able to describe the whole composite number system by going through the composites
one by one. Equation (3) is an extremely compact description of the composite number
sequence, and that is only possible because the language is periodic.
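The periodic structure of equation (3) can be checked numerically. The sketch below assumes the expansion in inverse powers of z, under which each term 1/(z^i*(z^i-1)) becomes 1/z^(2i) + 1/z^(3i) + ..., a pulse train of interval i starting at 2i. The function names and the truncation at `MAX_POWER` are illustrative choices, not part of the original paper.

```python
def comp_coefficients(ub: int, max_power: int) -> list[int]:
    """Coefficient of 1/z^n for n = 0..max_power in sum_{i=2}^{ub} 1/(z^i (z^i - 1)).

    Term i contributes 1 at the exponents 2i, 3i, 4i, ... (a periodic pulse train).
    """
    coeffs = [0] * (max_power + 1)
    for i in range(2, ub + 1):
        for n in range(2 * i, max_power + 1, i):
            coeffs[n] += 1
    return coeffs


def is_composite(n: int) -> bool:
    """Trial-division check, used only to cross-check the expansion."""
    return n > 3 and any(n % d == 0 for d in range(2, int(n ** 0.5) + 1))


MAX_POWER = 60
# ub = MAX_POWER // 2 is enough: larger i cannot reach exponents <= MAX_POWER.
coeffs = comp_coefficients(ub=MAX_POWER // 2, max_power=MAX_POWER)
marked = [n for n, c in enumerate(coeffs) if c > 0]
composites = [n for n in range(MAX_POWER + 1) if is_composite(n)]

print(marked == composites)
```

Under these assumptions the exponents with a non-zero coefficient are exactly the composite numbers in the truncated range, while every prime exponent keeps a coefficient of zero.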
The periodicities of equation (3) as a global generating function for the composite
number sequence drew the author's attention to the possibility of modelling the
DNA-sequence, which after all has only four nucleotide alphabets. It is suspected that the
DNA-language is not periodic, since it cannot be completely modelled by periodic sequences.
The author suggests that the DNA-sequence
can be modelled as a pseudo-periodic language by including a fudging expression, which
takes the general form shown in equation (4):
$$ \mathrm{Dna}(z) := \frac{1}{A^{p} - 1} + \frac{1}{C^{q} - 1} + \frac{1}{G^{r} - 1} + \frac{1}{T^{s} - 1} + \mathrm{Fudging}(z), \tag{4} $$
where A, C, G, and T represent the names of the four nucleotides, each modelled by its
own periodic sequence, and p, q, r, and s represent the periodic intervals of these
four sequences. It was also suggested that p, q, r, and s should be chosen as prime
integers so that the points at which these sequences overlap can be predicted. For
example, if p=7, q=11, r=13, and s=17, then the first overlap
will occur at the 7*11 = 77th term. Sequences shorter than this can be
modelled without overlapping of terms. For long DNA sequences, one
should choose prime integers large enough to accommodate the length.
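The overlap arithmetic can be confirmed with a short sketch. It assumes that each nucleotide's periodic sequence places a pulse at every multiple of its interval, and it takes the four intervals of the example as 7, 11, 13 and 17; the helper names `first_overlap` and `occupancy` are hypothetical.

```python
from itertools import combinations
from math import lcm


def first_overlap(intervals: dict[str, int]) -> tuple[int, tuple[str, str]]:
    """Earliest term at which two of the periodic sequences coincide."""
    return min(
        (lcm(intervals[a], intervals[b]), (a, b))
        for a, b in combinations(intervals, 2)
    )


def occupancy(intervals: dict[str, int], length: int) -> dict[int, list[str]]:
    """For each term 1..length, the nucleotides whose pulse falls on it."""
    return {
        n: [name for name, period in intervals.items() if n % period == 0]
        for n in range(1, length + 1)
    }


intervals = {"A": 7, "C": 11, "G": 13, "T": 17}
term, pair = first_overlap(intervals)
slots = occupancy(intervals, term)

print(term, pair, slots[term])
```

Because the intervals are distinct primes, the first collision is simply the product of the two smallest, which is why larger primes are needed for longer sequences.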
4. Theorem On Pseudo-Periodicities
The findings in the previous section led to an important theorem on
the relations between order and entropy in any spatial language, which is stated as follows:
Theorem Of Pseudo-Periodicity: It is never possible to convert a spatial
language into a periodic language.
Here we offer two proofs:
Proof 1: A periodic language has zero entropy. A spatial language such as a
human language has finite entropy. One can never violate the Second Law of
thermodynamics by increasing the order of one part of a system without at the same
time increasing the entropy of the remaining part. Q.E.D.
Proof 2: The second proof is more quantitative and intuitive.
Consider an English sentence binary encoded as 101100101010111010110B.
There are twelve 1s and nine 0s. Since this is a spatial language, the space between
successive 1s can be stretched and the intervals made uniform, as shown in the
first line below. Thus the 1s can be modelled by a periodic expression given by
1/(x^3-1). To preserve the original message, the 0s are inserted between the 1s
where they belong, as given by line 2. Obviously this line is not periodic, since the
intervals between 0s are not uniform. We introduce a third line, called a fudging line,
where 0s are added in positions where these are missing in the second line, so that
when combined with the second line, it is a periodic sequence. Then we can model the combined
second and third lines by a periodic expression plus or minus some corrective
or fudging terms, as shown in equation (5).
line 1:xxx1xx1xx1xx1xx1xx1xx1xx1xx1xx1xx1xx1x = 1/(x^3-1)
line 2:xxxx0xxxxx00x0xx0xx0xxxxxxxx0xx0xxxxx0
line 3:xxxxxxx0xxxxxxxxxxxxxx0xx0xxxxxxxx0xxx
$$ \text{line2+line3} = \frac{1}{z\,(z^{3} - 1)} + \frac{1}{z^{11}} - \frac{1}{z^{7}} - \frac{1}{z^{22}} - \frac{1}{z^{25}} - \frac{1}{z^{34}} \tag{5} $$
Using sequence algebraic analysis, we can show that line1 combined with line2+line3
generates a binary sequence identical to the original string, as shown in equations (6) to (8).
$$ \text{line1} := \frac{1}{x^{3}} + \frac{1}{x^{6}} + \frac{1}{x^{9}} + \frac{1}{x^{12}} + \frac{1}{x^{15}} + \frac{1}{x^{18}} + \frac{1}{x^{21}} + \frac{1}{x^{24}} + \frac{1}{x^{27}} + \frac{1}{x^{30}} + \frac{1}{x^{33}} + \frac{1}{x^{36}} + O\!\left(\frac{1}{x^{39}}\right). \tag{6} $$

$$ \text{line2+line3} := \frac{1}{z^{4}} + \frac{1}{z^{10}} + \frac{1}{z^{11}} + \frac{1}{z^{13}} + \frac{1}{z^{16}} + \frac{1}{z^{19}} + \frac{1}{z^{28}} + \frac{1}{z^{31}} + \frac{1}{z^{37}} + O\!\left(\frac{1}{z^{40}}\right). \tag{7} $$

$$ \begin{aligned}
& O\!\left(\frac{1}{x^{39}}\right) + O\!\left(\frac{1}{z^{40}}\right) + \frac{1}{x^{3}} + \frac{1}{z^{4}} + \frac{1}{x^{6}} + \frac{1}{x^{9}} + \frac{1}{z^{10}} + \frac{1}{z^{11}} + \frac{1}{x^{12}} + \frac{1}{z^{13}} + \frac{1}{x^{15}} + \frac{1}{z^{16}} + \frac{1}{x^{18}} + \frac{1}{z^{19}} + \frac{1}{x^{21}} \\
& \quad + \frac{1}{x^{24}} + \frac{1}{x^{27}} + \frac{1}{z^{28}} + \frac{1}{x^{30}} + \frac{1}{z^{31}} + \frac{1}{x^{33}} + \frac{1}{x^{36}} + \frac{1}{z^{37}}.
\end{aligned} \tag{8} $$
In the output sequence in equation (8), if we take the x's as 1s and the z's as 0s, then we
can read the binary line as 101100101010111010110, which is in the same order as
the original string. This is possible because the original string is a spatial language,
where the 1s can be made periodic at the expense of the 0s, which become more
chaotic. It does look like the Second Law of thermodynamics is at work, where increased
order in one part of the system is obtained at the expense of increased entropy
in the remaining part of the system. One cannot have one's cake and eat it.
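The bookkeeping in equations (5) to (8) can be replayed mechanically. The sketch below is a check under the assumption that each term 1/x^n or 1/z^n marks position n of the stretched line; the variable names are illustrative only.

```python
ORIGINAL = "101100101010111010110"

# Line 1: the twelve 1s sit at exponents 3, 6, ..., 36 (expansion of 1/(x^3 - 1)).
ones = set(range(3, 37, 3))

# Equation (5): the periodic part 1/(z (z^3 - 1)) gives exponents 4, 7, ..., 37;
# the fudging terms remove 7, 22, 25, 34 and add the off-grid exponent 11.
periodic_zeros = set(range(4, 38, 3))
zeros = (periodic_zeros - {7, 22, 25, 34}) | {11}

# Equation (8): merge both expansions by exponent, reading x as 1 and z as 0.
merged = sorted([(n, "1") for n in ones] + [(n, "0") for n in zeros])
decoded = "".join(bit for _, bit in merged)

print(sorted(zeros))
print(decoded == ORIGINAL)
```

Five corrective terms are needed to place nine 0s, which illustrates the point of the proof: the order imposed on the 1s is paid for by the irregular corrections required for the 0s.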
What if all alphabets are encoded periodically, such as A=10, B=1010, C=101010, ... ?
The answer is that now your alphabets are not of fixed bit size and you will need to
introduce a space to mark the end of each alphabet. The spaces have to be encoded
too, and here is where entropy will arise, as the spaces have no order. You cannot beat the
Second Law of thermodynamics.
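The need for a separator can be seen by enumerating every way an unseparated bit string splits into the periodic codewords. This is a minimal sketch: the three-letter `CODEBOOK` simply truncates the scheme A=10, B=1010, C=101010, and the function `parses` is a hypothetical helper.

```python
CODEBOOK = {"A": "10", "B": "1010", "C": "101010"}


def parses(bits: str) -> list[str]:
    """All ways to split `bits` into codewords when no separator is sent."""
    if not bits:
        return [""]
    results = []
    for letter, code in CODEBOOK.items():
        if bits.startswith(code):
            results.extend(letter + rest for rest in parses(bits[len(code):]))
    return results


# Without spaces, the same bits admit several readings.
ambiguous = parses("101010")

# With an explicit space between codewords, each piece decodes uniquely.
decode = {code: letter for letter, code in CODEBOOK.items()}
separated = "".join(decode[piece] for piece in "1010 10".split(" "))

print(ambiguous, separated)
```

The unseparated string has four readings, while the separated one has a single reading, at the cost of transmitting the irregularly placed spaces.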
5. Conclusions
Periodic languages seem very attractive for transmission through the ether, since
the messages can be expressed in closed forms which can be highly compressed. However,
the Theorem of Pseudo-Periodicity shows that it is impossible to convert a human
language, which is a spatial language, into a periodic one in view of the presence of
entropy in the former. Any attempt to put order in one part of the language will cause
an increase in entropy in the other part. There is no way a human language can be
made periodic like the natural number system. If it could be done, the language would
be very deterministic, but it would be so inflexible that most human thoughts could not be
communicated with it. This remark also applies to the DNA sequence, which can be
regarded as a biochemical language of Life. Most likely DNA-sequences are more
like human languages and suffer from the same entropy problem as the latter. So
the dream of a periodic human language which is as expressive as the spatial
human language remains a dream. All attempts will be stonewalled by the Second
Law of thermodynamics.
6. References
Comments: Not all references in this list are referred to directly in the main paper.
They are provided to readers as background papers on sequence algebra. These papers
can be easily reached by hyperlink whilst you are on the web.
1. Schneier Bruce: Applied Cryptography, Protocols, Algorithms, and Source Code in C, chapter 11,
pp 233 to 234, Wiley (2nd ed.), 1996
2. Huen Y.K.: Information Contents Of Number Theoretic Functions (date released: 29.5.97)
(21.5 KBytes, 7*A4s).
3. Chaitin G.J.: Godel's Theorem and Information, International Journal of Theoretical Physics
21 (1982), pp 941-954.
4. Huen Y.K.: A Matrix Map for Prime and Non-prime Numbers, INT. J. Math. Educ. Sci.
Technol., 1994, VOL. 25, NO.6, pp 913-920.
5. Huen Y.K.: Some Interesting Properties Of The Natural Number System, Int. J. Math. Educ.
Sci. Technol., 1996, VOL.27, NO. 5, 685-691.
6. Huen Y.K.: Visual algebra and its applications, INT. J. Math. Educ. Sci. Technol.,1996,
VOL.??, NO.?, ???-??? (In the press as proof paper mes 100421).
7. The twin prime problem revisited, INT.J.MATH.EDUC.SCI.TECHNOL.,199?,VOL.??, NO.?,???-???, proof paper
mes-0488 (10 pages).
8. Is Pie Periodic?, INT.J.MATH.EDUC.SCI.TECHNOL.,199?,VOL.??,NO.?,???-???, (in the press).
9. Huen Y.K.: Evaluations Of Normc( ) Function In Macsyma 2.2 (date released: 17/12/97)
(14 KBytes).
10. Huen Y.K.: List Processing In Sequence Algebra (date released: 23/12/97) (20 KBytes).
11. Huen Y.K.: A Simple Introduction To Sequence Algebra (date released: 15.3.97)
(38 KBytes, 11*A4s).
12. Huen Y.K.: The Canonical Generating Function or CGF(z) ... (date released: 27.5.97)
(24 KBytes, 7*A4s).
13. Huen Y.K.: Visual Solutions Of Number Theoretic Problems ..... (date released: 3.6.97)
(38.3 KBytes, 10*A4s).
14. Huen Y.K.: Final Value Theorem Applied To Number Sequences... (date released: 5.6.97)
(29.4 KBytes, 9*A4s).
15. Huen Y.K.: Unsolved Problems In Sequence Algebra (date released: 6.6.97)
(29.4 KBytes, 9*A4s).
16. Huen Y.K.: Methods Of Developing Sequence Algebraic Formulations For Comp(z) and Prime(z)
(date released: 20.6.97) (36.8 KBytes, 10*A4s).
17. Huen Y.K.: Composite Number Sequence Challenge 1/97 (date released: 28.6.97)
(24.8 KBytes, 7*A4s).
18. Huen Y.K.: Lemmata, Corollaries, And Theorems In Sequence Order Analysis
(date released: 6.7.97) (38.3 KBytes, 12*A4s).
19. Huen Y.K.: Improved Formulations For Comp(z) and Prime(z) (date released: 16.9.97)
(17 KBytes).
20. Huen Y.K.: Detecting False Reports in Primality Tests By The Oddcomp(z) Method
(date released: 18.9.97, revised 20/9) (26 KBytes).
21. Huen Y.K.: The Throwing Power Of Oddcomp(z) (date released: 24.9.97) (15 KBytes).
22. Huen Y.K.: Sequence Algebraic Approach To Prime Number Theorem (date released: 28.9.97)
(21 KBytes).
23. Huen Y.K.: Generating Functions - Closed Forms vs Open Forms (date released: 1.10.97)
(21 KBytes).
24. Huen Y.K.: Generating Large Odd Composite With Two Prime Factors (date released: 3.10.97)
(13.5 KBytes).
25. Huen Y.K.: In Search Of Counter-Examples In Maple's Isprime Function
(date released: 4.10.97) (18 KBytes).
26. Huen Y.K.: A Sequence Algebraist's View Of Lehmann's Primality Test
(date released: 6.10.97) (26 KBytes).
27. Huen Y.K.: On Odd(z), Oddcomp(z), Seq1(z) and Seq2(z) (date released: 10.10.97)
(17 KBytes).
28. Huen Y.K.: How To Generate A Short And Contiguous Oddcomp(z) Sequence?
(date released: 15.10.97) (13 KBytes).
29. Huen Y.K.: A Sketch Of Test-Tube Evolution In A Primeval Number Soup
(date released: 25.11.97) (paper35.htm, 1 K).
30. Huen Y.K.: In Search Of Primes From The XGS Sequence (date released: 8.12.97)
(paper38.htm, 23 K).