Problem #342 says:
Consider the number 50.
\(50^{2} = 2500 = 2^{2} × 5^{4}\), so \(φ(2500) = 2 × 4 × 5^{3} = 8 × 5^{3} = 2^{3} × 5^{3}.\)
So 2500 is a square and φ(2500) is a cube.
Find the sum of all numbers n, 1 < n < \(10^{10}\) such that \(φ(n^{2})\) is a cube.
This solution makes much use of the following property of the totient function \(φ(p^{a}) = p^{a-1} (p-1)\). With a little effort one can see that if you have the prime factorization of a number N you can calculate the prime factorization of the totient of \(n^{2}\) and if one has that factorization the to make sure that it is a cube is as simple as making sure that the power of each exponent in the factorization is 0 mod 3.
Thus for the example problem above we have the number n = 50. Which factors to \(2 * 5^{2}\). Using the above identity on the 5 factor we get \(φ(5^{2}) = 5^{(2*2)-1} (5-1)\) which when we factor any nonfactored terms comes out to be \(2^{2} * 5^{3}\) which is then multiplied by the result of the same property applied to the factor 2 gets us the result \( 2^{3} * 5^{3}\). We can see that this is a cube because all factors are raised to a multiple of three.
With that basic set my solution is relying on factoring numbers and as most people know that’s rather difficult to do fast. Especially when memorizing previously factored numbers becomes impossible due to the size of the problem set. However this is a massively parallel operation for each value of n. So I took this as an opportunity to learn some OpenMPI and see what I could do with at bunch of old computers. However this means that solution only runs in a reasonable time if you have a lot of hardware backing it and a network for the various computers to talk over. It’s actually quite easy to setup a cluster from old computers but maybe not for the faint of heart or those unused to Linux.
Solution written in c++ using g++ 4.5.1 and OpenMPI 1.4.1. Runtime depends on how many processors you give the program; but I will say that it’s far from the fastest solution unless you have an account on a TOP500 supercomputer.
P.S. This program divides work with a fairly simple algorithm and works best with a multiple of 10 for the number of processes created by mpiexec. If you don’t use a multiple of 10 then it may skip numbers or come up short of the actual answer.
#include <iostream>
#include <vector>
#include <map>
#include <string.h>
#include <time.h>
#include <algorithm>
#include <mpi.h>
#include <math.h>
using namespace std;
vector<unsigned long> get_primes(unsigned long max){
vector<unsigned long> primes;
char *sieve;
sieve = new char[max/8+1];
// Fill sieve with 1
memset(sieve, 0xFF, (max/8+1) * sizeof(char));
for(unsigned long x = 2; x <= max; x++)
if(sieve[x/8] & (0x01 << (x % 8))){
primes.push_back(x);
// Is prime. Mark multiplicates.
for(unsigned long j = 2*x; j <= max; j += x)
sieve[j/8] &= ~(0x01 << (j % 8));
}
delete[] sieve;
return primes;
}
vector<unsigned long> primes = get_primes(100000);
map<unsigned long, unsigned long> Factor (unsigned long n)
{
map<unsigned long, unsigned long> Factors;
unsigned long sqrtn = sqrt(n);
for (unsigned int i = 0; i < primes.size(); i++)
{
if (primes[i] > sqrtn)
{
break;
}
else
{
if (n % primes[i] == 0)
{
Factors[primes[i]] = 0ul;
while(n % primes[i] == 0)
{
Factors[primes[i]] = Factors[primes[i]] + 1ul;
n = n/primes[i];
}
}
}
}
if (n != 1)
Factors[n]=1;
if (Factors.size() == 0)
{
Factors[n] = (unsigned long)1;
}
return Factors;
}
bool IsTotientCube(unsigned long n)
{
map<unsigned long, unsigned long> factors = Factor(n);
map<unsigned long, unsigned long>::iterator pos;
map<unsigned long, unsigned long> FactorsOfSquaredTotient;
for (pos = factors.begin(); pos != factors.end(); ++pos)
{
map<unsigned long, unsigned long> pm1factors = Factor((pos->first)-1);
map<unsigned long, unsigned long>::iterator pm1pos;
for (pm1pos = pm1factors.begin(); pm1pos != pm1factors.end(); ++pm1pos)
{
if (pm1pos->first != 1ul)
FactorsOfSquaredTotient[pm1pos->first] += pm1pos->second;
}
FactorsOfSquaredTotient[pos->first] += (2* (pos->second)) -1;
}
map<unsigned long, unsigned long>::iterator possquaretotient;
for (possquaretotient = FactorsOfSquaredTotient.begin();
possquaretotient != FactorsOfSquaredTotient.end();
++possquaretotient)
{
if (possquaretotient->second % 3 != 0)
{
return false;
}
}
return true;
}
int main(int argc, char *argv[])
{
int numprocs, rank, namelen;
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(processor_name, &namelen);
unsigned long count = 0ul;
unsigned long max = 10000000000;
unsigned long workload = max / numprocs;
unsigned long responses[numprocs];
for (unsigned long i = (workload * rank); i < ((workload * rank) + workload);++i)
{
if (IsTotientCube(i) == true)
{
count += i;
cout << i << endl;
}
}
MPI_Barrier(MPI_COMM_WORLD);
MPI_Gather(&count,1,MPI_UNSIGNED_LONG,&responses,1,MPI_UNSIGNED_LONG,0,MPI_COMM_WORLD);
unsigned long answer = 0ul;
for (int i = 0; i < numprocs;++i)
{
answer += responses[i];
}
cout << "Rank "<< rank << " Found #"<< count << endl;
if (rank == 0)
cout << "Answer "<< answer << endl;
MPI_Finalize();
return 0;
}