Sunday, May 31, 2009

Greatest prime factor sequences


The greatest prime factor function (gpf) can be used to generate a class of prime sequences as follows: x(1) = p (an arbitrary prime) and x(n+1) = gpf (a*x(n) + b) for n ≥ 1 (a, b are fixed positive integers). Computer evidence suggests that all "linear gpf sequences" are ultimately periodic, regardless the values chosen for a and b. For example, the sequences corresponding to a = 2, b = 1 and p arbitrary appear to enter the period 3, 7, 5, 11, 23, 47, 19, 13. For a = 6, b = 5, x(1) = 2, the period is pretty long: 23, 13, 83, 503, 3023, 18143, 108863, 9749, 137, 827, 4967, 727, 397, 31, 191, 1151, 6911, 367, 2207, 1019, 211, 41, 251, 1511, 193, 1163, 69. Multiple limit cycles are possible - for example in the case a = 16 and b = 1, the choice x(1)=2 leads to the period 37, 593, 3163, 229, 733, 317, 89, 19, 61, 977, 193, 3089, 659, while the choice x(1) = 47 leads to the period 251, 103, 97, 1553. So far we proved the "gpf conjecture" in the special case in which a is a divisor of b.
REFERENCES:
1. Caragiu, M. and Scheckelhoff, L. The Greatest Prime Factor and Related Sequences., JP J. of Algebra, Number Theory and Appl. 6, 403-409 (2006).
2. Mihai Caragiu. Recurrent sequences based on the greatest prime factor function. In Abstracts of the Papers Presented to the American Mathematical Society, Meeting #1020, University of Cincinnati, 1020-11-247 (2006).

Monday, May 4, 2009

Balancing the frequencies

Some might find useful to assign +1 or –1 weights to the letters in the English alphabet so that the weighted sum of the frequencies of the letters in the "generic" plain English text is, in absolute value, as small as possible. A possible weight assignment is given in the list below. The number listed at the beginning of every row is 100,000 times the proportion of the corresponding letter (based on the letter frequencies list displayed in Wikipedia, in reference to the "Relative Frequencies of Letters in General English Plain text", from Cryptographical Mathematics, by Robert Edward Lewand). In this weight assignment, the weighted sum is not zero - indeed it is 1 (this is because the original list of frequencies is not perfectly normalized - they add up to 99.999%) and thus optimal, since the sum of the numbers initializing the rows below is odd (which rules out a partition in two subsets with equal sums) . This corresponds to a frequency "defect" of 0.001%. This simple example may be seen in relation to the general "subset sum problem". We can use this "weight assignment" to generate +1/–1 "random walks" out of string characters.

12702(e)–1
9056(t)–1
8167(a)+1
7507(o)–1
6966(i)+1
6749(n)+1
6327(s)–1
6094(h)–1
5987(r)+1
4253(d)+1
4025(l)+1
2782(c)–1
2758(u)+1
2406(m)+1
2360(w)+1
2228(f)+1
2015(g)–1
1974(y)+1
1929(p)–1
1492(b)–1
978(v)+1
772(k)+1
153(j)+1
150(x)+1
95(q)–1
74(z)+1