Is A Periodic Language A Pipe Dream?

by

Huen Y.K.

CAHRC, P.O.Box 1003, Singapore 911101
http://web.singnet.com.sg/~huens/
email: huens@mbox3.singnet.com.sg

(A short communication - 1st released: 1/1/98)


Abstract

Languages used by higher animals, including Man, can be classified into two pure types, viz., periodic language and spatial language. In reality no language is pure; each falls somewhere between these two extremes and should be labelled as predominantly periodic or predominantly spatial. Human languages are predominantly spatial, but some birds use periodic languages. The information content conveyed by a pure periodic language is quite limited, but it can be increased by turning it into a composite periodic language. A question is posed: "Could a human language be made periodic?" The author postulates that a periodic language is highly ordered, i.e., it has zero entropy, whereas spatial languages, especially human languages, have high entropy. The chance of converting a human language into a periodic language is remote unless the entropy of the former can be reduced to zero. The conclusion is that a human language is much more complex than the natural number system and that it is impossible to design a periodic language expressive enough to convey the whole spectrum of human thoughts. However, most human languages can be rendered partially periodic by the theorem of pseudo-periodicities.


1. Introduction

Have you ever wondered why a sunbird in the bush emits a continuous stream of monotonous "chip, chip, ..."? This is done by a male sunbird claiming its territory. To other male sunbirds the message is clear. It says: "Within hearing range, this is my territory. Do not trespass." To a female sunbird it carries other meanings, such as the health of the owner. The author labels this a periodic language. Like music, it is a time-based, repetitious language which has meaning only if listened to along a time axis. Humans use periodic languages rather infrequently. The signature tune of a radio station is periodic and is used with an intent somewhat analogous to that of the male sunbird. It seems that periodic languages are not very expressive, as they can hold only limited information content. On the other hand, a pure spatial language can be understood with or without the timebase. One can read a bedtime story to a child; this is the time-based mode. Written words are concatenated in space, where the spacing between words can be varied quite a bit without affecting the meaning of the message. You cannot do that with a periodic language, since the letters of such a language are defined by time intervals. Written languages must be spatial as humans are the only animals which use writing. But when we send messages across the ether, we have to convert the language into a time-based one even though it is not periodic.

One distinguishing feature of a periodic language is that its letters are encoded by time intervals and not by amplitudes. The second distinguishing feature is that all letters are time holistic, i.e., each occupies the whole time axis from zero time to infinity. This, however, does not mean that one has to wait forever to interpret a message, because every holistic periodic sequence has an algebraic closed formulation which can be transmitted through the ether. This is the way periodic languages are encoded. One must realise that a pure periodic sequence carries very limited information. In fact the entropy of such a sequence is zero, i.e., there is no uncertainty in a periodic language. Because each periodic sequence can encode only one meaning, you will need an infinite number of periodic sequences with different time intervals to encode a human language. The natural number system has only one meaning and is purely periodic, whilst the composite number system, having much more information content, must be described by an infinity of periodic sequences. If each periodic sequence is considered as one letter, this means that some letters or keys will have very long time intervals, making them impractical for transmission through the ether. However, since all periodic sequences have closed forms, the real problem does not lie here. The real difficulty is the use of an infinite alphabet. A message encoded with an infinite alphabet is probably as difficult to crack as a one-time pad. But it is impractical to implement such a language: it is like sending an encyclopaedia to your listeners.

2. Shannon's Information Theory

Claude Elwood Shannon pioneered modern information theory in 1948 [1]. He never mentioned the possibility of information carried by a periodic language with zero uncertainty. Does Shannon's information theory apply to such a language? It does seem like an irrelevant question, since no such language has ever been encountered even as of today. It seems like a hypothetical language concocted by the present author, but is it? Any information-carrying system is a language. The natural number system carries information. It is a language. Sequence algebra is the algebra of holistic number sequences. Information in number sequences is encoded in a closed-form generating function before transmission. Shannon defined the rate of a language as:

$$ r = \frac{H(M)}{N} \qquad (1), $$

in which N is the length of the message and H(M) is its entropy. In sequence algebra the natural number system Nat(z) is given by the following closed form:

$$ \mathrm{Nat}(z) = \frac{z}{z - 1} \qquad (2). $$

The length N of the natural number set is infinite, and if its entropy H(M) is finite, the rate of the language r is zero by equation (1), which implies that one needs 0 bits per letter to encode the natural number system. This interpretation is correct provided we take r as almost equal to zero and not absolutely zero. It implies that the information content of the natural number system is minimal. In sequence algebra the expression z/(z-1) can have only one meaning, i.e., n = 1. Therefore the entropy H(M) of a message measured in bits is log2(n) = log2(1) = 0. It seems that we cannot escape from the conclusion that a periodic language has zero entropy, whether we are comfortable with it or not. Someone might counter that Nat(z) has an infinite number of meanings drawn from the integer set {0,1,2,...,infinity}. That might be true in conventional number theory but not in sequence algebra, because Nat(z) is handled in sequence algebra as a single entity, a holistic number sequence. It says Nat(z) is a natural number system and nothing more. Sequence algebra has different closed forms for Even(z), Odd(z), Comp(z), Prime(z) and so on. The meanings are very clear. The conclusion is that Shannon's information theory applies also to a periodic language exemplified by the natural number system.
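The two quantities used in this argument can be checked numerically. The following is only a minimal sketch: it assumes n equally likely meanings, so that H(M) = log2(n), and the function names and the 8-bit entropy value are hypothetical, chosen purely for illustration.

```python
import math


def entropy_bits(n_meanings: int) -> float:
    """Entropy in bits of a message with n equally likely meanings."""
    return math.log2(n_meanings)


def rate(entropy: float, length: int) -> float:
    """Rate of a language, r = H(M) / N, as in equation (1)."""
    return entropy / length


# A closed form with a single meaning (n = 1) has zero entropy.
h_nat = entropy_bits(1)

# For any fixed finite entropy (8 bits is an arbitrary assumption),
# the rate shrinks towards zero as the message length N grows.
rates = [rate(8.0, n) for n in (10, 1_000, 100_000)]
print(h_nat, rates)
```

The rate never reaches zero for finite N, which is the sense in which r should be read as almost equal to zero.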

As an alternative to Shannon's information theory, the author has previously suggested that the number of alphanumeric characters in a closed form can be used to measure the information content [2]. This idea originated from Chaitin [3]. The author's investigations on the information content of various number sequences did show proportionality between the number of alphanumeric characters in the closed forms and the information content of such sequences. For example, the sequence algebraic generating function for the Fermat numbers is definitely longer than that for Nat(z).


3. Characteristics Of A Periodic Language

The information content of a periodic language can be increased by increasing the number of periodic sequences with distinctive time intervals. The best example is provided by the equation of divisibles given by equation (3) [4]:

$$ \mathrm{Comp}(z) := \sum_{i=2}^{ub} \frac{1}{z^{i}\,(z^{i} - 1)} \qquad (3). $$

Equation (3) contains an infinite summation of expressions of the type 1/(z^i*(z^i-1)) with i ranging from 2 to infinity. Each of these terms can be viewed as a letter, so this is an infinite alphabet. To describe the composite number system you need to use all the letters provided, and it is obvious that one will never be able to describe the whole composite number system by going through the composites one by one. Equation (3) is an extremely compact description of the composite number sequence, and that is only possible because the language is periodic.
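A truncated version of equation (3) can be checked numerically. The sketch below assumes the expansion in negative powers of z, in which each term 1/(z^i*(z^i-1)) contributes z^(-2i) + z^(-3i) + ..., i.e. the multiples of i starting at 2i. The function names and the cut-off `limit` are illustrative assumptions, not part of sequence algebra.

```python
def comp_exponents(limit: int) -> set[int]:
    """Exponents n <= limit that appear when equation (3) is expanded.

    The i-th term contributes the multiples of i from 2i upwards,
    so each i acts as one periodic "letter" with interval i.
    """
    found = set()
    for i in range(2, limit // 2 + 1):
        for n in range(2 * i, limit + 1, i):
            found.add(n)
    return found


def is_composite(n: int) -> bool:
    """Trial-division check, used only to verify the expansion."""
    return n > 3 and any(n % d == 0 for d in range(2, int(n ** 0.5) + 1))


composites_to_30 = sorted(comp_exponents(30))
print(composites_to_30)
```

In this truncation the exponents that appear are exactly the composite numbers up to the cut-off. A composite with several divisors is hit by more than one periodic term; the set above records only which exponents occur, not how often.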

The periodicities of equation (3) as a global generating function for the composite number sequence did draw the author's attention to the possibility of modelling the DNA-sequence which after all only has four nucleotide alphabets. It is suspected that the DNA-language is not periodic since it cannot be completely modelled by periodic sequences. The author suggests that the DNA-sequence can be modelled as a pseudo-periodic language by including a fudging expression which takes the general form as shown in equation (4).

$$ \mathrm{Dna}(z) := \frac{1}{A^{p} - 1} + \frac{1}{C^{q} - 1} + \frac{1}{G^{r} - 1} + \frac{1}{T^{s} - 1} + \mathrm{Fudging}(z) \qquad (4), $$

where A, C, G, and T represent the names of the four nucleotides, each modelled by its own periodic sequence, and p, q, r, and s represent the periodic intervals of these four sequences. It was also suggested that p, q, r, and s should be chosen as prime integers so that the points at which these sequences overlap can be predicted. For example, if p=7, q=11, r=13, and s=17, then the first overlap will occur at the 7*11 = 77th term. This bounds the longest sequence one could use to model the above DNA sequence without overlapping of terms. For a long DNA sequence, one should choose prime integers large enough to accommodate its length.
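The overlap point in the example can be computed directly. This is a minimal sketch, assuming the four periodic intervals are 7, 11, 13 and 17 and that each sequence marks every multiple of its interval; the function name is hypothetical.

```python
from itertools import combinations
from math import lcm


def first_overlap(intervals):
    """Smallest term index marked by two or more periodic sequences.

    Two sequences with intervals a and b first coincide at lcm(a, b),
    so the earliest collision is the smallest pairwise lcm.
    """
    return min(lcm(a, b) for a, b in combinations(intervals, 2))


overlap = first_overlap([7, 11, 13, 17])
print(overlap)  # 77, the product of the two smallest primes
```

Choosing primes makes every pairwise lcm a plain product, which is why the overlap points are easy to predict.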


4. Theorem On Pseudo-Periodicities

The findings in the previous section lead to an important theorem on the relation between order and entropy in any spatial language, which is stated as follows:

Theorem Of Pseudo-Periodicity: It is never possible to convert a spatial language into a periodic language.

Here we offer two proofs:

Proof 1: A periodic language has zero entropy. A spatial language such as a human language has finite entropy. One can never violate the Second Law of thermodynamics by increasing the order of one part of a system without at the same time increasing the entropy of the remaining part. Q.E.D.

Proof 2: The second proof is more quantitative and intuitive.

Consider an English sentence binary encoded as 101100101010111010110B. There are twelve 1s and nine 0s. Since this is a spatial language, the space between successive 1s can be stretched and the intervals made uniform, as shown in the first line below. Thus the 1s can be modelled by the periodic expression 1/(x^3-1). To preserve the original message, the 0s are inserted between the 1s where they belong, as given by line 2. Obviously this line is not periodic since the intervals between 0s are not uniform. We introduce a third line, called a fudging line, where 0s are added in positions where these are missing in the second line so that, when combined with the second line, it is a periodic sequence. Then we can model the combined second and third lines by a periodic expression plus or minus some corrective or fudging terms, as shown in equation (5).

line 1:xxx1xx1xx1xx1xx1xx1xx1xx1xx1xx1xx1xx1x = 1/(x^3-1)
line 2:xxxx0xxxxx00x0xx0xx0xxxxxxxx0xx0xxxxx0
line 3:xxxxxxx0xxxxxxxxxxxxxx0xx0xxxxxxxx0xxx

$$ \text{line 2} + \text{line 3} = \frac{1}{z\,(z^{3} - 1)} + \frac{1}{z^{11}} - \frac{1}{z^{7}} - \frac{1}{z^{22}} - \frac{1}{z^{25}} - \frac{1}{z^{34}} \qquad (5) $$

Using sequence algebraic analysis, we can show that line 1 and line 2 + line 3 together generate a binary sequence identical to the original string, as shown in equations (6) to (8).

$$ \text{line1} := \frac{1}{x^{3}} + \frac{1}{x^{6}} + \frac{1}{x^{9}} + \frac{1}{x^{12}} + \frac{1}{x^{15}} + \frac{1}{x^{18}} + \frac{1}{x^{21}} + \frac{1}{x^{24}} + \frac{1}{x^{27}} + \frac{1}{x^{30}} + \frac{1}{x^{33}} + \frac{1}{x^{36}} + O\!\left(\frac{1}{x^{39}}\right) \qquad (6). $$

$$ \text{line2} + \text{line3} := \frac{1}{z^{4}} + \frac{1}{z^{10}} + \frac{1}{z^{11}} + \frac{1}{z^{13}} + \frac{1}{z^{16}} + \frac{1}{z^{19}} + \frac{1}{z^{28}} + \frac{1}{z^{31}} + \frac{1}{z^{37}} + O\!\left(\frac{1}{z^{40}}\right) \qquad (7). $$

$$
\begin{aligned}
O\!\left(\frac{1}{x^{39}}\right) + O\!\left(\frac{1}{z^{40}}\right)
&+ \frac{1}{x^{3}} + \frac{1}{z^{4}} + \frac{1}{x^{6}} + \frac{1}{x^{9}} + \frac{1}{z^{10}} + \frac{1}{z^{11}} + \frac{1}{x^{12}} + \frac{1}{z^{13}} + \frac{1}{x^{15}} + \frac{1}{z^{16}} + \frac{1}{x^{18}} + \frac{1}{z^{19}} + \frac{1}{x^{21}} \\
&+ \frac{1}{x^{24}} + \frac{1}{x^{27}} + \frac{1}{z^{28}} + \frac{1}{x^{30}} + \frac{1}{z^{31}} + \frac{1}{x^{33}} + \frac{1}{x^{36}} + \frac{1}{z^{37}} \qquad (8).
\end{aligned}
$$

In the output sequence in equation (8), if we take the x's as 1s and the z's as 0s, then we can read the binary line as 101100101010111010110, which is in the same order as the original string. This is possible because the original string is a spatial language where the 1s can be made periodic at the expense of the 0s, which become more chaotic. It does look like the Second Law of thermodynamics is at work, where increased order in one part of the system is obtained at the expense of increased entropy in the remaining part of the system. One cannot have one's cake and eat it.
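The merge in equations (6) to (8) can be replayed mechanically. The sketch below is an illustration under stated assumptions rather than a sequence algebra implementation: it tracks only the exponents of each expression, takes 1/(x^3-1) as x^-3 + x^-6 + ... and 1/(z*(z^3-1)) as z^-4 + z^-7 + ..., and uses hypothetical function names and a cut-off of 38.

```python
def line1_exponents(limit):
    """Exponents of 1/(x^3 - 1): 3, 6, 9, ... (positions of the 1s)."""
    return list(range(3, limit + 1, 3))


def line23_exponents(limit):
    """Exponents left by equation (5) (positions of the 0s).

    Start from the periodic part 4, 7, 10, ..., add the corrective
    term at 11 and remove the fudging terms at 7, 22, 25 and 34.
    """
    exponents = set(range(4, limit + 1, 3))
    exponents.add(11)
    exponents -= {7, 22, 25, 34}
    return sorted(n for n in exponents if n <= limit)


def merge(limit):
    """Interleave both lines by exponent and read off the bit string."""
    ones = [(n, "1") for n in line1_exponents(limit)]
    zeros = [(n, "0") for n in line23_exponents(limit)]
    return "".join(bit for _, bit in sorted(ones + zeros))


decoded = merge(38)
print(decoded)
```

Sorting by exponent recovers the twelve 1s and nine 0s in their original order, while the list of zero positions shows how irregular the 0s have become once the 1s are forced onto a uniform interval.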

What if all alphabets are encoded periodically such as A=10, B=1010, C=101010, ... ? The answer is that now your alphabets are not of fixed bit size and you will need to introduce a space to mark the end of each alphabet. The spaces have to be encoded too and here is where entropy will arise as it has no order. You cannot beat the Second Law of thermodynamics.
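The need for a separator can be seen by counting how many ways a bit string decodes. This is a minimal sketch using only the three codes named above (A=10, B=1010, C=101010); the dictionary `CODE` and the function name are hypothetical.

```python
from functools import lru_cache

# The periodic codes from the text; any further letters are omitted here.
CODE = {"A": "10", "B": "1010", "C": "101010"}


def count_decodings(bits: str) -> int:
    """Number of distinct letter sequences that spell out `bits`."""

    @lru_cache(maxsize=None)
    def count(pos: int) -> int:
        if pos == len(bits):
            return 1  # reached the end: one complete decoding
        return sum(
            count(pos + len(word))
            for word in CODE.values()
            if bits.startswith(word, pos)
        )

    return count(0)


print(count_decodings("1010"))    # "AA" or "B"
print(count_decodings("101010"))  # "AAA", "AB", "BA" or "C"
```

Without an end-of-letter marker the receiver cannot tell B from AA, so the marker positions carry information that the periodic codes themselves do not.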

5. Conclusions

Periodic languages seem very attractive for transmission through the ether since the messages can be expressed in closed forms which can be highly compressed. However, the Theorem of Pseudo-Periodicities shows that it is impossible to convert a human language, which is a spatial language, to a periodic one in view of the presence of entropy in the former. Any attempt to put order in one part of the language will cause an increase in entropy in the other part. There is no way a human language can be made periodic like the natural number system. If it could be done, the language would be very deterministic, but it would be so inflexible that most human thoughts could not be communicated with it. This remark also applies to the DNA sequence, which can be regarded as a biochemical language of Life. Most likely DNA sequences are more like human languages and suffer from the same entropy problem as the latter. So the dream of a periodic human language which is as expressive as the spatial human language remains a dream. All attempts will be stonewalled by the Second Law of thermodynamics.


6. References

Comments: Not all references in this list are cited directly in the main paper. The others are provided to readers as background papers in sequence algebra. These papers can be easily reached by hyperlink whilst you are on the web.

1. Schneier Bruce: Applied Cryptography, Protocols, Algorithms, and Source Code in C, chapter 11, pp 233 to 234, Wiley (2nd ed.), 1996.

2. Huen Y.K.: Information Contents Of Number Theoretic Functions (date released: 29.5.97) (21.5 KBytes, 7*A4s).

3. Chaitin G.J.: Godel's Theorem and Information, International Journal of Theoretical Physics 22 (1982), pp 941-954.

4. Huen Y.K.: A Matrix Map for Prime and Non-prime Numbers, Int. J. Math. Educ. Sci. Technol., 1994, Vol. 25, No. 6, pp 913-920.

5. Huen Y.K.: Some Interesting Properties Of The Natural Number System, Int. J. Math. Educ. Sci. Technol., 1996, Vol. 27, No. 5, pp 685-691.

6. Huen Y.K.: Visual algebra and its applications, Int. J. Math. Educ. Sci. Technol., 1996, Vol. ??, No. ?, ???-??? (in the press as proof paper mes 100421).

7. The twin prime problem revisited, Int. J. Math. Educ. Sci. Technol., 199?, Vol. ??, No. ?, ???-???, proof paper mes-0488 (10 pages).

8. Is Pie Periodic?, Int. J. Math. Educ. Sci. Technol., 199?, Vol. ??, No. ?, ???-??? (in the press).

9. Huen Y.K.: Evaluations Of Normc( ) Function In Macsyma 2.2 (date released: 17/12/97) (14 KBytes).

10. Huen Y.K.: List Processing In Sequence Algebra (date released: 23/12/97) (20 KBytes).

11. Huen Y.K.: A Simple Introduction To Sequence Algebra (date released: 15.3.97) (38 KBytes, 11*A4s).

12. Huen Y.K.: The Canonical Generating Function or CGF(z) ... (date released: 27.5.97) (24 KBytes, 7*A4s).

13. Huen Y.K.: Visual Solutions Of Number Theoretic Problems ..... (date released: 3.6.97) (38.3 KBytes, 10*A4s).

14. Huen Y.K.: Final Value Theorem Applied To Number Sequences... (date released: 5.6.97) (29.4 KBytes, 9*A4s).

15. Huen Y.K.: Unsolved Problems In Sequence Algebra (date released: 6.6.97) (29.4 KBytes, 9*A4s).

16. Huen Y.K.: Methods Of Developing Sequence Algebraic Formulations For Comp(z) and Prime(z) (date released: 20.6.97) (36.8 KBytes, 10*A4s).

17. Huen Y.K.: Composite Number Sequence Challenge 1/97 (date released: 28.6.97) (24.8 KBytes, 7*A4s).

18. Huen Y.K.: Lemmata, Corollaries, And Theorems In Sequence Order Analysis (date released: 6.7.97) (38.3 KBytes, 12*A4s).

19. Huen Y.K.: Improved Formulations For Comp(z) and Prime(z) (date released: 16.9.97) (17 KBytes).

20. Huen Y.K.: Detecting False Reports in Primality Tests By The Oddcomp(z) Method (date released: 18.9.97, revised 20/9) (26 KBytes).

21. Huen Y.K.: The Throwing Power Of Oddcomp(z) (date released: 24.9.97) (15 KBytes).

22. Huen Y.K.: Sequence Algebraic Approach To Prime Number Theorem (date released: 28.9.97) (21 KBytes).

23. Huen Y.K.: Generating Functions - Closed Forms vs Open Forms (date released: 1.10.97) (21 KBytes).

24. Huen Y.K.: Generating Large Odd Composite With Two Prime Factors (date released: 3.10.97) (13.5 KBytes).

25. Huen Y.K.: In Search Of Counter-Examples In Maple's Isprime Function (date released: 4.10.97) (18 KBytes).

26. Huen Y.K.: A Sequence Algebraist's View Of Lehmann's Primality Test (date released: 6.10.97) (26 KBytes).

27. Huen Y.K.: On Odd(z), Oddcomp(z), Seq1(z) and Seq2(z) (date released: 10.10.97) (17 KBytes).

28. Huen Y.K.: How To Generate A Short And Contiguous Oddcomp(z) Sequence? (date released: 15.10.97) (13 KBytes).

29. Huen Y.K.: A Sketch Of Test-Tube Evolution In A Primeval Number Soup (date released: 25.11.97) (paper35.htm 1 K).

30. Huen Y.K.: In Search Of Primes From The XGS Sequence (date released: 8.12.97) (paper38.htm 23 K).

=====================END OF PAPER ======================