Tuesday, September 29, 2015

Modular Inverse from 1 to N

We already learned how to find the Modular Inverse of a particular number in a previous post, "Modular Multiplicative Inverse". Today we will look into finding Modular Inverses in bulk.


Problem

Given $N$ and $M$ ( $N < M$ and $M$ is prime ), find modular inverse of all numbers between $1$ to $N$ with respect to $M$.

Since $M$ is prime and $N$ is less than $M$, we can be sure that a Modular Inverse exists for every number. Why? Because a prime is coprime to every number less than itself.

We will look into two methods. The latter is better than the former.

$O(N \log M)$ Solution

Using Fermat's little theorem, we can easily find Modular Inverse for a particular number.

$A^{-1} \:\%\: M = bigmod(A,M-2,M)$, where $bigmod()$ is a function from the post "Repeated Squaring Method for Modular Exponentiation". The function has complexity $O(\log M)$. Since we are trying to find the inverse for every number from $1$ to $N$, we can find them all in $O(N \log M)$ by running a loop.
int inv[SIZE]; ///inv[x] contains value of (x^-1 % m)
for ( int i = 1; i <= n; i++ ) {
    inv[i] = bigmod ( i, m - 2, m );
}
But it's possible to do better. 

$O(N)$ Solution

This solution is derived using some clever manipulation of Modular Arithmetic.

Suppose we are trying to find the modular inverse for a number $a$, $a < M$, with respect to $M$. Now divide $M$ by $a$. This will be the starting point. 

$M = Q \times a + r$, (where $Q$ is the quotient and $r$ is the remainder) 
$M = \lfloor \frac{M}{a} \rfloor \times a + (M \:\%\: a )$

Now take modulo $M$ on both sides.

$0 \equiv \lfloor \frac{M}{a} \rfloor \times a + (M \:\%\: a ) \:\:\:\text{(mod M )}$
$  (M \:\%\: a ) \equiv -\lfloor \frac{M}{a} \rfloor \times a \:\:\:\text{(mod M )}$

Now divide both sides by $a \times ( M \:\%\: a )$.

$$\frac{M \:\%\: a}{a \times ( M \:\%\: a )} \equiv \frac{-  \lfloor \frac{M}{a} \rfloor \times a } { a \times ( M \:\%\: a ) } \:\:\:\text{(mod M)}$$
$$\therefore a^{-1} \equiv - \lfloor \frac{M}{a} \rfloor \times ( M \:\%\: a )^{-1} \:\:\:\text{(mod M)}$$

The formula establishes a recurrence relation. The formula says that, in order to find the modular inverse of $a$, we need to find the modular inverse of $b = M \:\%\: a$ first. 

Since $b = M \:\%\: a$, its value lies between $0$ and $a-1$. But $a$ and $M$ are coprime, so $a$ will never fully divide $M$. Hence we can rule out the possibility that $b$ is $0$. So the possible values of $b$ are between $1$ and $a-1$.

Therefore, if we have all modular inverse from $1$ to $a-1$ already calculated, then we can find the modular inverse of $a$ in $O(1)$.

Code

We can now formulate our code.
int inv[SIZE];
inv[1] = 1;
for ( int i = 2; i <= n; i++ ) {
    inv[i] = ( -1LL * ( m / i ) * inv[m % i] ) % m; ///1LL forces the product into long long, avoiding int overflow
    inv[i] = inv[i] + m;
}
In line $2$, we set the base case: the modular inverse of $1$ is always $1$. Then we start calculating inverses from $2$ to $N$. When $i=2$, all modular inverses from $1$ to $i-1=1$ are already calculated in array $inv[]$. So we can calculate $inv[i]$ in $O(1)$ using the formula above at line $4$.

At line $5$, we make sure the modular inverse is non-negative.

Next, when $i=3$, all modular inverses from $1$ to $i-1=2$ are already calculated. This process is repeated till we reach $N$.

Since we calculated each inverse in $O(1)$, the complexity of this code is $O(N)$.

Conclusion

I saw this code first time on CodeChef forum. I didn't know how it worked back then. I added it to my notebook and have been using it since then. Recently, while searching over the net for resources on Pollard Rho's algorithm, I stumbled on an article from Come On Code On which had the explanation. Thanks, fR0DDY, I have been looking for the proof.

Reference

  1. forthright48 - Modular Multiplicative Inverse
  2. forthright48 - Repeated Squaring Method for Modular Exponentiation
  3. Come On Code On - Modular Multiplicative Inverse

Saturday, September 26, 2015

Euler Phi Extension and Divisor Sum Theorem

Previously we learned about Euler Phi Function. Today we are going to look at two theorems related to Euler Phi that frequently appear in CPPS. I am not sure whether these theorems have any official names, so I just made them up. The names allow easy reference, so I will be using them from now on.


Euler Phi Extension Theorem

Theorem: Given a number $N$, let $d$ be a divisor of $N$. Then the number of pairs $\{a,N\}$, where $1 \leq a \leq N$ and $gcd(a,N) = d$, is $\phi(\frac{N}{d})$.

Proof

We will prove the theorem using Euler Phi Function and Arithmetic notion.

We need to find the number of pairs $\{a,N\}$ such that $gcd(a,N) = d$, where $1 \leq a \leq N$.

Both $a$ and $N$ are divisible by $d$ and $d$ is the GCD. So, if we divide both $a$ and $N$ by $d$, then they will no longer have any common divisor.

$gcd(\frac{a}{d},\frac{N}{d}) = 1$,  where $1 \leq a \leq N$.

We know that the possible values of $a$ lie in range $1 \leq a \leq N$. What about the possible values of $\frac{a}{d}$? $\frac{a}{d}$ must lie between $1  \leq \frac{a}{d} \leq \frac{N}{d}$ otherwise $a$ will cross its limit.

Therefore, $gcd(a,N) = d$, where $1 \leq a \leq N$ is same as, $gcd(\frac{a}{d},\frac{N}{d}) = 1$, where $1  \leq \frac{a}{d} \leq \frac{N}{d}$.

So all we need to do is count the values of $\frac{a}{d}$ for which $gcd(\frac{a}{d},\frac{N}{d}) = 1$, where $1 \leq \frac{a}{d} \leq \frac{N}{d}$.

Let $N' = \frac{N}{d}$ and $a' = \frac{a}{d}$. How many pairs of $\{a',N'\}$ are there such that $gcd(a',N') = 1$ and $1 \leq a' \leq N'$? Isn't this what Euler Phi Function finds? The answer is $\phi(N') = \phi(\frac{N}{d})$.

Euler Phi Divisor Sum Theorem

Theorem: For a given integer $N$, the sum of Euler Phi of each of the divisors of $N$ equals to $N$, i.e, $$N = \sum_{d|N}\phi(d)$$

Proof

The proof is simple. I have broken down the proof in the following chunks for the ease of understanding.

Forming Array $A$

Imagine we have the following fractions in a list: $$\frac{1}{N}, \frac{2}{N}, \frac{3}{N}...\frac{N}{N}$$

Not very hard to imagine right? Let us convert this into an array of pairs. So now, we have the following array $A$: 

$$A = [ \{1,N\},\{2,N\},\{3,N\}...\{N,N\} ]$$

So we have an array of form $\{a,N\}$, where $a$ is between $1$ and $N$. There are exactly $N$ elements in the array.

Finding GCD of Pairs

Next, we find the GCD of each pair, $g$. What are the possible values of $g$? Since $g$ must divide both $a$ and $N$, $g$ must be a divisor of $N$. Therefore, we can conclude that, GCD of pair $\{a,N\}$ will be one of the divisors of $N$.

Let the divisors of $N$ be the following: $d_1, d_2, d_3...d_R$. So these are the only possible GCDs.

Forming Partitions

Next, we form partitions $P_i$. Let us put all pairs which have $gcd(a,N) = d_i$ into partition $P_i$. Therefore, we will have $R$ partitions, where $R$ is the number of divisors of $N$. Note that each pair belongs to exactly one partition since a pair has a unique GCD. Therefore, $$N = \sum_{i=1}^{R}P_i$$

Size of Each Partition

How many elements does partition $P_i$ contain? $P_i$ has all the pairs $\{a,N\}$ such that $gcd(a,N) = d_i$, $1 \leq a \leq N$. Using Euler Phi Extension Theorem from above, this value is $\phi(\frac{N}{d_i})$.

Wrapping it Up

We are almost done with the proof. Since $N = \sum_{i=1}^{R}P_i$, we can now write: $$N = \sum_{i=1}^{R}P_i = \sum_{i=1}^{R}\phi(\frac{N}{d_i})$$

But $d_i$ is just a divisor of $N$. So we can simplify and write:

$$N = \sum_{d|N}\phi(\frac{N}{d}) = \sum_{d|N}\phi(d)$$

Conclusion

These theorems may look so simple that you might think they are useless. Especially Euler Phi Divisor Sum Theorem, $N = \sum_{d|N} \phi(d)$. How is this useful at all? Hopefully, we will see one of its applications in the next post.


Wednesday, September 23, 2015

Modular Multiplicative Inverse

Problem

Given value of $A$ and $M$, find the value of $X$ such that $AX \equiv 1\:\text{(mod M)}$.

For example, if $A = 2$ and $M = 3$, then $X = 2$, since $2\times2 = 4 \equiv 1\:\text{(mod 3)}$.

We can rewrite the above equation to this:

$AX \equiv 1\:\text{(mod M)}$
$X \equiv \frac{1}{A}\:\text{(mod M)}$
$X \equiv A^{-1}\:\text{(mod M)}$

Hence, the value $X$ is known as Modular Multiplicative Inverse of $A$ with respect to $M$.

How to Find Modular Inverse?

First we have to determine whether Modular Inverse even exists for given $A$ and $M$ before we jump to finding the solution. Modular Inverse doesn't exist for every pair of given value.

Existence of Modular Inverse

Modular Inverse of $A$ with respect to $M$, that is, $X = A^{-1} \text{(mod M)}$ exists, if and only if $A$ and $M$ are coprime.

Why is that?

$AX \equiv 1 \:\text{(mod M)}$
$AX - 1 \equiv 0 \:\text{(mod M)}$

Therefore, $M$ divides $AX-1$. And since $M$ divides $AX-1$, any divisor of $M$ also divides $AX-1$. Now suppose $A$ and $M$ are not coprime. Let $D$ be a number greater than $1$ which divides both $A$ and $M$. Then $D$ divides $AX-1$, and $D$ also divides $AX$. So $D$ must divide their difference, $1$. But this is not possible since $D > 1$. Therefore, the equation is unsolvable when $A$ and $M$ are not coprime.

From here on, we will assume that $A$ and $M$ are coprime unless stated otherwise.

Using Fermat's Little Theorem

Recall Fermat's Little Theorem from a previous post, "Euler's Theorem and Fermat's Little Theorem". It stated that, if $A$ and $M$ are coprime and $M$ is a prime, then, $A^{M-1} \equiv 1 \text{(mod M)}$. We can use this equation to find the modular inverse.

$A^{M-1} \equiv 1 \:\text{(mod M)}$ (Divide both sides by $A$)
$A^{M-2} \equiv \frac{1}{A}\:\text{(mod M)}$
$A^{M-2} \equiv A^{-1}\:\text{(mod M)}$

Therefore, when $M$ is prime, we can find modular inverse by calculating the value of $A^{M-2}$. How do we calculate this? Using Modular Exponentiation.

This is the easiest method, but it doesn't work for non-prime $M$. But no worries since we have other ways to find the inverse.

Using Euler's Theorem

It is possible to use Euler's Theorem to find the modular inverse. We know that:

$A^{\phi(M)} \equiv 1 \text{(mod M)}$
$\therefore A^{\phi(M)-1} \equiv A^{-1} \text{(mod M)}$

This process works for any $M$ as long as it's coprime to $A$, but it is rarely used since we have to calculate Euler Phi value of $M$ which requires more processing. There is an easier way.

Using Extended Euclidean Algorithm

We are trying to solve the congruence, $AX \equiv 1 \text{(mod M)}$. We can convert this to an equation.

$AX \equiv 1 \text{(mod M)}$
$AX + MY = 1$

Here, both $X$ and $Y$ are unknown. This is a linear equation and we want to find integer solution for it. Which means, this is a Linear Diophantine Equation.

Linear Diophantine Equation can be solved using Extended Euclidean Algorithm. Just pass $\text{ext_gcd()}$ the value of $A$ and $M$ and it will provide you with values of $X$ and $Y$. We don't need $Y$ so we can discard it. Then we simply take the mod value of $X$ as the inverse value of $A$.

Code

$A$ and $M$ need to be coprime. Otherwise, no solution exists. The following codes do not check whether $A$ and $M$ are coprime. That check is left to the reader to implement.

When $M$ is Prime

We will use Fermat's Little Theorem here. Just call the $bigmod()$ function from where you need the value. 
int x = bigmod( a, m - 2, m ); ///(ax)%m = 1
Here $x$ is the modular inverse of $a$ which is passed to $bigmod()$ function.

When $M$ is not Prime

For this, we have to use a new function. 
int modInv ( int a, int m ) {
    int x, y;
    ext_gcd( a, m, &x, &y );

    ///Process x so that it is between 0 and m-1
    x %= m;
    if ( x < 0 ) x += m;
    return x;
}
I wrote this function since after using $\text{ext_gcd()}$ we need to process $x$ so that its value is between $0$ and $M-1$. Instead of doing that manually, I decided to write a function.

So, if we want to find the modular inverse of $A$ with respect to $M$, then the result will be $X = modInv ( A, M )$.

Complexity

Repeated Squaring method has a complexity of $O(\log P)$, so the first code has complexity $O(\log M)$, whereas Extended Euclidean has complexity $O(\log_{10}A+\log_{10}B)$, so the second code has complexity $O(\log_{10}A + \log_{10}M)$.

Why Do We Need Modular Inverse?

We need Modular Inverse to handle division during Modular Arithmetic. Suppose we are trying to find the value of the following equations:

$\frac{4}{2} \:\%\: 3$ - This is simple. We just simplify the equation and apply normal modular operation. That is, it becomes $\frac{4}{2} \:\%\: 3 = 2 \:\%\: 3 = 2$.

Then what happens when we try to do the same with $\frac{12}{9}\:\%\:5$? First we simplify: $\frac{12}{9}\:\%\:5 = \frac{4}{3}\:\%\:5$. Now we are facing an irreducible fraction. Should we simply perform the modular operation on the numerator and denominator? That doesn't help since both of them are smaller than $5$.

This is where Modular Inverse comes to the rescue. Let us solve the equation $X \equiv 3^{-1}\:\text{(mod 5)}$. How do we find the value of $X$? We will see that on the later part of the post. For now, just assume that we know the value of $X$.

Now, we can rewrite the above equation in the following manner:

$\frac{12}{9}\:\%\:5$
$\frac{4}{3}\:\%\:5$
$(4 \times 3^{-1})\:\%\:5$
$( (4\:\%\:5) \times (3^{-1}\:\%\:5) ) \:\%\:5$
$\therefore 4X \:\%\:5$

So, now we can easily find the value of $\frac{A}{B} \:\%\: M$ by simply calculating the value of $(A \times B^{-1}) \:\%\: M$.

Conclusion

Modular Inverse is a small topic but look at the amount of background knowledge it requires to understand it! Euler's Theorem, Euler Phi, Modular Exponentiation, Linear Diophantine Equation, Extended Euclidian Algorithm and other small bits of information. We covered them all before, so we can proceed without any hitch.

Hopefully, you understood how Modular Inverse works. If not, make sure to revise the articles in the "Reference" section below.

Reference

  1. Wiki - Modular Multiplicative Inverse
  2. forthright48 - Euler's Theorem and Fermat's Little Theorem
  3. forthright48 - Modular Exponentiation
  4. forthright48 - Euler Phi
  5. forthright48 - Linear Diophantine Equation
  6. forthright48 - Extended Euclidean Algorithm

Monday, September 21, 2015

Repeated Squaring Method for Modular Exponentiation

Previously on Modular Exponentiation we learned about the Divide and Conquer approach to finding the value of $B^P \:\%\: M$. The approach in that article is recursive. I also mentioned an iterative algorithm that finds the same value with the same complexity, only faster due to the absence of recursion overhead. We will be looking into that faster algorithm in this post.

Make sure you know about Bit Manipulation before proceeding.


Problem

Given three positive integers $B, P$ and $M$, find the value of $B^P \:\%\: M$.

For example, $B=2$, $P=5$ and $M=7$, then $B^P \:\%\: M = 2^5 \: \% \: 7 = 32 \: \% \: 7 = 4$.

Repeated Squaring Method

Repeated Squaring Method (RSM) takes advantage of the fact that $A^x \times A^y = A^{x+y}$.

Now, we know that any number can be written as the sum of powers of $2$. Just convert the number to Binary Number System. Now for each position $i$ for which binary number has $1$ in it, add $2^i$ to the sum.

For example, $(19)_{10} = (10011)_2 = ( 2^4 + 2^1 + 2^0)_{10} = (16 + 2 + 1 )_{10}$

Therefore, in equation $B^P$, we can decompose $P$ to the sum of powers of $2$.

So let's say, $P = 19$, then $B^{19} = B^{2^4 + 2^1 + 2^0} = B^{16+2+1} = B^{16} \times B^2 \times B^1$.

This is the main concept of the repeated squaring method. We decompose the value $P$ to binary, and then for each position $i$ (we start from $0$ and loop till the highest position of the binary form of $P$) for which the binary of $P$ has $1$ in the $i_{th}$ position, we multiply $B^{2^i}$ into the result.

Code

Here is the code for RSM. The Explanation is below the code.
int bigmod ( int b, int p, int m ) {
    int res = 1 % m, x = b % m;
    while ( p ) {
        if ( p & 1 ) res = ( res * x ) % m;
        x = ( x * x ) % m;
        p >>= 1;
    }
    return res;
}

Explanation

At line $1$, we have the parameters. We simply send the value of $B$, $P$ and $M$ to this function, and it will return the required result.

At line $2$, we initiate some variables. $res$ is the variable that will hold our result. It contains the value $1$ initially. We will multiply $b^{2^i}$ with $res$ to find $b^p$. $x$ is a temporary variable that initially contains the value $b^{2^0} = b^1 = b$.

Now, from line $3$ the loop starts. This loop runs until $p$ becomes $0$. Huh? Why is that? Keep reading.

At line $4$ we first check whether the first bit of $p$ is on or not. If it is on, then it means that we have to multiply $b^{2^i}$ to our result. $x$ contains that value, so we multiply $x$ to $res$.

Now line $5$ and $6$ are crucial to the algorithm. Right now, $x$ contains the value of $b^{2^0}$ and we are just checking the $0_{th}$ position of $p$ at each step. We need to update our variables such that they keep working for positions other than $0$.

First, we update the value of $x$. $x$ contains the value of $b^{2^i}$. On next iteration, we will be working with position $i+1$. So we need to update $x$ to hold $b^{2^{i+1}}$.

$b^{2^{i+1}} = b^{2^i \times 2^1} = b ^ {2^i \times 2} = b^{2^i + 2^i} = b^{2^i} \times b^{2^i} = x \times x$.

Hence, new value of $x$ is $x \times x$. We make this update at line $5$.

Now, at each step we are checking the $0_{th}$ position of $p$. But next we need to check the $1_{st}$ position of $p$ in binary. Instead of checking the $1_{st}$ position of $p$, what we can do is shift $p$ one step to the right. Now, checking the $0_{th}$ position of $p$ is the same as checking the $1_{st}$ position. We do this update at line $6$.

These two lines ensure that our algorithm works on each iteration. When $p$ becomes $0$, it means there are no more bits to check, so the loop ends.

Complexity

Since there cannot be more than $\log_{2}(P)$ bits in $P$, the loop at line $3$ runs at most $\log_{2}(P)$ times. So the complexity is $O(\log_{2}P)$.

Conclusion

RSM is significantly faster than D&C approach due to lack of recursion overhead. Hence, I always use this method when I have to find Modular Exponentiation.

The code may seem a little confusing, so feel free to ask questions.

When I first got my hands on this code, I had no idea how it worked. I found it in a forum with a title, "Faster Approach to Modular Exponentiation". Since then I have been using this code.

Resources

  1. forthright48 - Modular Exponentiation
  2. forthright48 - Bit Manipulation
  3. algorithmist - Repeated Squaring
  4. Wiki - Exponentiation by Squaring

Thursday, September 17, 2015

Euler's Theorem and Fermat's Little Theorem

We will be looking into two theorems at the same time today, Fermat's Little Theorem and Euler's Theorem. Euler's Theorem is just a generalized version of Fermat's Little Theorem, so they are quite similar to each other. We will focus on Euler's Theorem and its proof. Later we will use Euler's Theorem to prove Fermat's Little Theorem.

Euler's Theorem

Theorem - Euler's Theorem states that, if $a$ and $n$ are coprime, then $a^{\phi(n)} \equiv 1 (\text{mod n})$ - Wikipedia
Here $\phi(n)$ is Euler Phi Function. Read more about Phi Function on this post - Euler Totient or Phi Function.

Proof

Let us consider a set $A = \{b_1, b_2, b_3...,b_{\phi(n)} \}\:(\text{mod n})$, where $b_i$ is coprime to $n$ and distinct. Since there are $\phi(n)$ elements which are coprime to $n$, $A$ contains $\phi(n)$ integers.

Now, consider the set $B = \{ ab_1, ab_2, ab_3....ab_{\phi(n)} \} \:(\text{mod n})$, where $a$ is coprime to $n$. That is, $B$ is simply set $A$ with each element multiplied by $a$. 
Lemma - Set $A$ and set $B$ contains the same integers.
We can prove the above lemma in three steps.
  1. $A$ and $B$ have the same number of elements
    Since $B$ is simply every element of $A$ multiplied with $a$, it contains the same number of elements as $A$. This is obvious.
  2. Every integer in $B$ is coprime to $n$
    An integer in $B$ is of form $a \times b_i$. We know that both $b_i$ and $a$ are coprime to $n$, so $ab_i$ is also coprime to $n$.
  3. $B$ contains distinct integers only
    Suppose $B$ does not contain distinct integers, then it would mean that there is such a $b_i$ and $b_j$ such that:

    $ab_i \equiv ab_j\:(\text{mod n})$
    $b_i \equiv b_j\:(\text{mod n})$ (we can cancel $a$ from both sides since $a$ is coprime to $n$)

    But this is not possible since all elements of $A$ are distinct, that is, $b_i$ is never equal to $b_j$. Hence, $B$ contains distinct elements.
With these three steps, we claim that, since $B$ has the same number of elements as $A$, all distinct and coprime to $n$, it has the same elements as $A$.

Now, we can easily prove Euler's Theorem.

$ab_1 \times ab_2 \times ab_3...\times ab_{\phi(n)} \equiv b_1 \times b_2 \times b_3...\times b_{\phi(n)} \:(\text{mod n})$
$a^{\phi(n)} \times b_1 \times b_2 \times b_3...\times b_{\phi(n)} \equiv b_1 \times b_2 \times b_3...\times b_{\phi(n)}  \:(\text{mod n}) $
$$\therefore a^{\phi(n)} \equiv 1  \:(\text{mod n})$$

Fermat's Little Theorem

Fermat's Little Theorem is just a special case of Euler's Theorem.
Theorem - Fermat's Little Theorem states that, if $a$ and $p$ are coprime and $p$ is a prime, then $a^{p-1} \equiv 1 \:(\text{mod p})$ - Wikipedia
In Euler's Theorem, we worked with any pair of values $a$ and $n$ as long as they were coprime. Fermat's Little Theorem additionally requires $n$ to be prime.

We can use Euler's Theorem to prove Fermat's Little Theorem.

Let $a$ and $p$ be coprime and $p$ be prime, then using Euler's Theorem we can say that:

$a^{\phi(p)} \equiv 1\:(\text{mod p})$  (But we know that for any prime $p$, $\phi(p) = p-1$)
$a^{p-1} \equiv 1\:(\text{mod p})$

Conclusion

Both theorems have various applications. Finding Modular Inverse is a popular application of Euler's Theorem. It can also be used to reduce the cost of modular exponentiation. Fermat's Little Theorem is used in Fermat's Primality Test.

There are more applications but I think it's better to learn them as we go. Hopefully, I will be able to cover separate posts for each of the applications.

Reference

  1. Wiki - Euler's Theorem
  2. forthright48 - Euler Totient or Phi Function
  3. Wiki - Fermat's Little Theorem

Monday, September 7, 2015

Segmented Sieve of Eratosthenes

Problem

Given two integers $A$ and $B$, find number of primes inside the range of $A$ and $B$ inclusive. Here, $1 \leq A \leq B \leq 10^{12}$ and $B - A \leq 10^5$.

For example, $A = 11$ and $B = 19$, then answer is $4$ since there are $4$ primes within that range ($11$,$13$,$17$,$19$).

If limits of $A$ and $B$ were small enough ( $ \leq 10^8$ ), then we could solve this problem using the ordinary sieve. But here limits are huge, so we don't have enough memory or time to run normal sieve. But note that, $B - A \leq 10^5$. So even though we don't have memory/time to run sieve from $1$ to $N$, we have enough memory/time to cover $A$ to $B$.

$A$ to $B$ is a segment, and we are going to modify our algorithm for Sieve of Eratosthenes to cover this segment. Hence, the modified algorithm is called Segmented Sieve of Eratosthenes. 

Make sure you fully understand how "Normal" Sieve of Eratosthenes works.

How Normal Sieve Works

First, let us see the steps for "Normal" sieve. In order to make things simpler, we will be looking into a somewhat unoptimized sieve. You can implement your own optimizations later.

Suppose we want to find all primes between $1$ to $N$.
  1. First we define a new variable $sqrtn = \sqrt{N}$.
  2. We take all primes less than or equal to $sqrtn$. 
  3. For each prime $p$, we repeat the following steps.
    1. We start from $j = p \times p$.
    2. If $j \leq N$, we mark sieve at position $j$ to be not prime.
      Else, we break out of the loop.
    3. We increase the value of $j$ by $p$. And go back to step $2$ of this loop.
  4. All positions in the sieve that are not marked are prime.
This is how the basic sieve works. We will now modify it to work on segments.

How Segmented Sieve Works

We will perform the same steps as normal sieve but just slightly modified.

Generate Primes Less Than $\sqrt{N}$

In the segmented sieve, what is the largest limit possible? $10^{12}$. So let $N = 10^{12}$.

First of all, in the normal sieve we worked with primes less than $\sqrt{N}$ only. So, if we had to run sieve from $1$ to $N$, we would have required only primes less than $\sqrt{N} = 10^6$. So in order to run sieve on a segment between $1$ to $N$, we won't require primes greater than $\sqrt{N}$.

So, using normal sieve we will first generate all primes less than $\sqrt{N} = 10^6$.

Run on Segment

Okay, now we can start our "Segmented" Sieve. We want to find primes between $A$ and $B$. 
  1. If $A$ is equal to $1$, then increase $A$ by $1$. That is, make $A = 2$. Since $1$ is not a prime, this does not change our answer.
  2. Define a new variable $sqrtn = \sqrt{B}$.
  3. Declare a new array of size $dif = \text{maximum difference of }(B - A) + 1$. Since it is given in our problem that $B-A \leq 10^5$, $dif = 10^5 + 1$ for this problem.

    Let the array be called $arr$. This array has index from $0$ to $dif-1$. Here $arr[0]$ represents the number $A$, $arr[1]$ represents $A+1$ and so on.
  4. Now, we will be working with all primes less than $sqrtn$. These primes are already generated using the normal sieve.
  5. For each prime $p$, we will repeat the following steps:
    1. We start from $j = p \times p$.
    2. But initial value of $j = p^2$ might be less than $A$. We want to mark positions between $A$ and $B$ only. So we will need to shift $j$ inside the segment.

      So, if $j < A$, then $j = ceil ( \frac{A}{p} ) \times p = \frac{A+p-1}{p} \times p$. This line makes $j$ the smallest multiple of $p$ which is not less than $A$.
    3. If $j \leq B$, we mark sieve at position $j$ to be not prime.
      Else, we break out of the loop.

      But when marking, remember that our array $arr$ is shifted by $A$ positions. $arr[0]$ indicates position $A$ of normal sieve. So, we will mark position $j - A$ of $arr$ as not prime.
    4. Increase the value of $j$ by $p$. Repeat loop from step $3$.
  6. All positions in $arr$ which has not been marked are prime.
Step $1$ is important. Since we only mark multiples of primes as not prime in the pseudocode above, $1$, which has no prime factor, never gets marked. So we handle it by increasing the value of $A$ by $1$ when $A = 1$.

Code

If we convert the above pseudocode into C++, then it becomes something like this:
int arr[SIZE];

///Returns number of primes between segment [a,b]
long long segmentedSieve ( long long a, long long b ) {
    if ( a == 1 ) a++;

    long long sqrtn = sqrt ( (double) b );

    memset ( arr, 0, sizeof arr ); ///Make all indexes of arr 0.

    for ( int i = 0; i < prime.size() && prime[i] <= sqrtn; i++ ) {
        long long p = prime[i];
        long long j = p * p; ///long long, since p*p can exceed int range

        ///If j is smaller than a, then shift it inside of segment [a,b]
        if ( j < a ) j = ( ( a + p - 1 ) / p ) * p;

        for ( ; j <= b; j += p ) {
            arr[j-a] = 1; ///mark them as not prime
        }
    }

    long long res = 0;
    for ( long long i = a; i <= b; i++ ) {
        ///If it is not marked, then it is a prime
        if ( arr[i-a] == 0 ) res++;
    }
    return res;
}
In line $1$ we declare an array of size $SIZE$. This array needs to be as large as the maximum possible value of $B-A+1$. Next, in line $4$, we declare a function that finds the number of primes between $a$ and $b$. The rest of the code is the same as the pseudocode above.

It is possible to optimize it further ( both for speed and memory ) but in expense of clarity. I am sure if readers understand the core concept behind this algorithm, then they will have no problem tweaking the code to suit their needs.

Conclusion

I first learned about Segmented Sieve from the blog of +Zobayer Hasan. You can have a look at that post here. I wasn't really good at bit manipulation back then, so it looked really scary. Later I realized it's not as hard as it looks. Hopefully, you guys feel the same.

Leave a comment if you face any difficulty in understanding the post.

Reference

  1. forthright48 - Sieve of Eratosthenes
  2. zobayer - Segmented Sieve



Friday, September 4, 2015

Euler Totient or Phi Function

I have been meaning to write a post on Euler Phi for a while now, but I have been struggling with its proof. I heard it required Chinese Remainder Theorem, so I have been pushing this until I covered CRT. But recently, I found that CRT is not required and it can be proved much more easily. In fact, the proof is so simple and elegant that after reading it I went ahead and played MineCraft for 5 hours to celebrate.


Problem

Given an integer $N$, how many numbers less than or equal to $N$ are there such that they are coprime to $N$? A number $X$ is coprime to $N$ if $gcd(X,N)=1$.

For example, if $N = 10$, then there are $4$ numbers, namely $\{1,3,7,9\}$, which are coprime to $10$.

This problem can be solved using Euler Phi Function, $\phi()$. Here is the definition from Wiki:
In number theory, Euler's totient function (or Euler's phi function), denoted as φ(n) or ϕ(n), is an arithmetic function that counts the positive integers less than or equal to n that are relatively prime to n. - Wiki
That's exactly what we need to find in order to solve the problem above. So, how does Euler Phi work?

Euler Phi Function

Before we go into its proof, let us first see the end result. Here is the formula for the $\phi()$ function. If $n = p_1^{a_1}p_2^{a_2}...p_k^{a_k}$, then:
$$\phi(n) = n \times \frac{p_1-1}{p_1}\times\frac{p_2-1}{p_2}...\times\frac{p_k-1}{p_k}$$
If you want you can skip the proof and just use the formula above to solve problems. That's what I have been doing all these years. But I highly recommend that you read and try to understand the proof. It's simple and I am sure someday the proof will help you out in an unexpected way.

Proof of Euler Phi Function

Even though the proof is simple, it has many steps. We will go step by step, and slowly you will find that the proof is unfolding in front of your eyes.

Base Case - $\phi(1)$

First, the base case. The Phi function counts the positive numbers less than or equal to $N$ that are coprime to it. The keyword here is positive. Since the smallest positive number is $1$, we will start with this.

$\phi(1) = 1$, since $1$ itself is the only number which is coprime to it.

When $n$ is a Prime - $\phi(p)$

Next, we will consider the case when $n = p$. Here $p$ is any prime number. When $n$ is prime, it is coprime to all numbers less than $n$. Therefore, $\phi(n) = \phi(p) = p - 1$.

When $n$ is Power of Prime - $\phi(p^a)$

Next, we will consider $n$ where $n$ is a power of a single prime. In this case, how many numbers less than $n$ are coprime to it? Instead of counting that, we will count the inverse: how many numbers are there which are not coprime?

Since $n = p^a$, we can be sure that $gcd(p,n) \neq 1$, as both $n$ and $p$ are divisible by $p$. Therefore, the numbers not coprime to $n$ are exactly the multiples of $p$: $\{p, 2p, 3p, \ldots, p^{a-1} \times p\}$. There are exactly $\frac{p^a}{p} = p^{a-1}$ numbers which are divisible by $p$. So, there are $n - p^{a-1}$ numbers which are coprime to $n$.

Hence, $\phi(n) = \phi(p^a) $ $ = n - \frac{n}{p} = p^a - \frac{p^a}{p} $ $= p^a ( 1 - \frac{1}{p} ) = p^a \times ( \frac{p - 1}{p} )$

It's starting to look like the equation above, right?

Assuming $\phi()$ is Multiplicative - $\phi( m \times n )$

This step is the most important step in the proof. This step claims that Euler Phi function is a multiplicative function. What does this mean? It means, if $m$ and $n$ are coprime, then $\phi( m \times n ) = \phi(m) \times \phi(n) $. Functions that satisfy this condition are called Multiplicative Functions.

So how do we prove that Euler Phi is multiplicative, and how does Euler Phi being multiplicative help us?

We will prove the multiplicativity of the Euler Phi Function in the next section. In this section, we will assume it is multiplicative and see how it helps us calculate Euler Phi.

Let the prime factorization of $n$ be $p_1^{a_1}p_2^{a_2}...p_k^{a_k}$. Now, obviously $p_i$ and $p_j$ are coprime to each other. Since the $\phi$ function is multiplicative, we can simply rewrite the function as:

$\phi(n) = \phi(p_1^{a_1}p_2^{a_2}...p_k^{a_k})$
$\phi(n) = \phi(p_1^{a_1}) \times  \phi(p_2^{a_2}) ... \times  \phi(p_k^{a_k})$.

We can already calculate $\phi(p^a) = p^a \times \frac{p-1}{p}$. So our equation becomes:

$\phi(n) = \phi(p_1^{a_1}) \times  \phi(p_2^{a_2}) ... \times  \phi(p_k^{a_k})$
$\phi(n) = p_1^{a_1} \times \frac{p_1 - 1}{p_1} \times  p_2^{a_2} \times \frac{p_2 - 1}{p_2}...\times  p_k^{a_k} \times \frac{p_k - 1}{p_k}$
$\phi(n) = ( p_1^{a_1} \times p_2^{a_2}... \times p_k^{a_k} ) \times \frac{p_1 - 1}{p_1} \times \frac{p_2 - 1}{p_2}... \times \frac{p_k - 1}{p_k}$
$$\therefore \phi(n) = n \times \frac{p_1 - 1}{p_1} \times \frac{p_2 - 1}{p_2}... \times \frac{p_k - 1}{p_k}$$
This is what we have been trying to prove. This equation was derived by assuming that Euler Phi Function is multiplicative. So all we need to do now is prove Euler Phi Function is multiplicative and we are done.
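As a quick sanity check of the formula, take $n = 12 = 2^2 \times 3$:

$$\phi(12) = 12 \times \frac{2-1}{2} \times \frac{3-1}{3} = 12 \times \frac{1}{2} \times \frac{2}{3} = 4$$

And indeed, exactly four numbers between $1$ and $12$ are coprime to $12$: $1, 5, 7$ and $11$.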

Proof for Multiplicativity of Euler Phi Function

We are trying to prove the following theorem:
Theorem $1$: If $m$ and $n$ are coprime, then $\phi(m \times n ) = \phi ( m ) \times \phi(n)$
But in order to prove Theorem $1$, we will need to prove a few other theorems first.

Theorem Related to Arithmetic Progression

Theorem $2$: In an arithmetic progression with a common difference of $m$, if we take $n$ consecutive terms and find their remainders modulo $n$, and if $n$ and $m$ are coprime, then we will get the numbers from $0$ to $n-1$ in some order.
Umm, looks like Theorem $2$ is packed with too much information. Let me break it down.

Suppose you have an arithmetic progression (AP). Now, every arithmetic progression has two things. A starting element and a common difference. That is, arithmetic progressions are of the form $a + kb$ where $a$ is the starting element, $b$ is the common difference and $k$ is any number.

So take any arithmetic progression that has a common difference of $m$. Then take $n$ consecutive terms of that progression. So which terms will they be?

$\{ a + 0m, a + 1m, a + 2m, a + 3m, \dots, a + (n-1)m\}$. There are exactly $n$ terms in this list.

Next, find their remainders modulo $n$. That is, find the remainder of each term after dividing by $n$.

Now, Theorem $2$ claims that, if $m$ and $n$ are coprime, then the list will contain a permutation of $0$ to $n-1$.

For example, let us try with $a = 1$, $m = 7$ and $n = 3$. The $3$ terms in the list will be $(1, 8, 15)$. Now, if we take each element modulo $3$, we get $(1,2,0)$.

I hope that now it's clear what the Theorem claims. Now we will look into the proof.

Proof of Theorem 2

What we need to prove is that the list after the modulo operation is a permutation of the numbers from $0$ to $n-1$. That means all the numbers from $0$ to $n-1$ occur in the list exactly once. There are three steps to this proof. Also, remember that $m$ and $n$ are coprime.
  1. There are exactly $n$ elements in the list.
    Well, since we took $n$ terms from the AP, this is obvious.
  2. Each element of the list has a value between $0$ and $n-1$
    We performed a modulo operation by $n$ on each element. So this is also obvious.
  3. No remainder has the same value as another.
    Since there are $n$ values, and each value is between $0$ and $n-1$, if we can prove that each element is unique in the list, then our work is done.

    Suppose there are two numbers which have the same remainder. That means $a + pm$ has the same remainder as $a + qm$, where $p$ and $q$ are two integers such that $0 \leq p < q \leq n - 1$.

    Therefore, $( a + qm ) - ( a + pm ) \equiv 0\: ( mod\: n )$
    $(a+qm -a - pm) \equiv 0\: ( mod\: n )$
    $m ( q - p ) \equiv 0\: ( mod\: n )$

    That means $n$ divides $m(q-p)$. But this is impossible: $n$ and $m$ are coprime and $0 < q-p < n$. So it is not possible for two numbers to have the same remainder.

Theorem Related to Remainder

Theorem $3$: If a number $x$ is coprime to $y$, then $(x \:\%\: y)$ will also be coprime to $y$.
The proof for this theorem is simple. Suppose $x$ and $y$ are coprime. Now, we can rewrite $x$ as $x = ky + r$, where $k$ is the quotient and $r$ is the remainder.

Theorem $3$ claims that $y$ and $r$ are coprime. What happens if this claim is false? Suppose they are not coprime. That means there is a number $d > 1$ which divides both $y$ and $r$. But then $d$ also divides $ky + r = x$. So $d > 1$ divides $r, y$ and $x$, which is impossible because $y$ and $x$ are coprime; there is no number greater than $1$ that can divide both of them. Hence, there is a contradiction, and the theorem is proved.

Proof for Multiplicativity of Euler Phi Function Continued

We are now ready to tackle proving Theorem $1$.

Suppose you have two numbers $m$ and $n$, and they are coprime. We want to show that $\phi(m\times n) = \phi (m ) \times \phi (n )$. 

What does $\phi (m\times n)$ give us? It gives us the count of numbers which are coprime to $mn$. If a number $x$ is coprime to $mn$, then it is also coprime to $m$ and $n$ separately. So basically, we need to count the positive numbers less than or equal to $mn$ which are coprime to both $m$ and $n$.

Now, let us build a table with $n$ rows and $m$ columns. The table will look like the following:

$$\begin{array}{ccccc}
1 & 2 & 3 & \dots & m \\
1+m & 2+m & 3+m & \dots & 2m \\
1+2m & 2+2m & 3+2m & \dots & 3m \\
\dots & \dots & \dots & \dots & \dots \\
1+(n-1)m & 2+(n-1)m & 3+(n-1)m & \dots & mn
\end{array}$$
Now, notice that each column is an arithmetic progression with $n$ terms and has common difference of $m$. Also, $m$ and $n$ are coprime. This is exactly the same situation as Theorem $2$.

Now, how many numbers in each column are coprime to $n$? To figure this out, consider what happens if we take every table value modulo $n$. Using Theorem $2$, we know that each column will then contain a permutation of the numbers from $0$ to $n-1$. Using Theorem $3$, we know that if the remainder of a number is coprime to $n$, then the number itself is also coprime to $n$. So, how many numbers between $0$ and $n-1$ are coprime to $n$? We can consider $0$ to be the same as $n$ (because this is modular arithmetic), so it boils down to: how many numbers between $1$ and $n$ are coprime to $n$? The Euler Phi Function calculates exactly this value.

So, there are exactly $\phi(n)$ numbers which are coprime to $n$ in each column.

We need to find numbers that are coprime to both $n$ and $m$. So, we cannot take $\phi(n)$ elements from every column, because those elements may not be coprime to $m$. How do we decide which columns we should take?

Notice that if we take each element of the table modulo $m$, then each row contains every remainder from $0$ to $m-1$ exactly once. If we consider $0$ to be $m$, then each row has the values from $1$ to $m$. That is, the table becomes something like this:

$$\begin{array}{ccccc}
1 & 2 & 3 & \dots & m \\
1 & 2 & 3 & \dots & m \\
1 & 2 & 3 & \dots & m \\
\dots & \dots & \dots & \dots & \dots \\
1 & 2 & 3 & \dots & m
\end{array}$$
So, how many columns are there which are coprime to $m$? There are $\phi(m)$ columns which are coprime to $m$.

Now we just need to combine the two results from above. There are exactly $\phi(m)$ columns which are coprime to $m$ and in each column there are $\phi(n)$ values which are coprime to $n$. Therefore, there are $\phi(m) \times \phi(n)$ elements which are coprime to both $m$ and $n$.
$$\therefore \phi(m) \times \phi(n) = \phi(m \times n)$$

Code

Since we have to factorize $n$ in order to calculate $\phi(n)$, we can modify our $factorize()$ function from post "Prime Factorization of Integer Number" to handle Euler Phi.
int eulerPhi ( int n ) {
    int res = n;
    int sqrtn = sqrt ( n );
    ///prime[] is a sorted list of primes, generated beforehand with a sieve
    for ( int i = 0; i < prime.size() && prime[i] <= sqrtn; i++ ) {
        if ( n % prime[i] == 0 ) {
            while ( n % prime[i] == 0 ) {
                n /= prime[i];
            }
            sqrtn = sqrt ( n );
            res /= prime[i]; ///divide first to lower the risk of overflow
            res *= prime[i] - 1;
        }
    }
    if ( n != 1 ) { ///a single prime factor greater than sqrt(n) remains
        res /= n;
        res *= n - 1;
    }
    return res;
}
The lines that differ from the $factorize()$ function are the ones that update $res$. Notice that we divide $res$ by the prime before multiplying it by $prime[i] - 1$. This is an optimization that lowers the risk of overflow.

Conclusion

That was a long post with lots of theorems and equations, but hopefully they were easy to understand. Even though Theorems $2$ and $3$ were used as lemmas to prove Theorem $1$, they are both important by themselves.

Leave a comment if you face any difficulty in understanding the post.

Reference

  1. Wiki - Euler Totient Function

Monday, August 31, 2015

Bit Manipulation

In this post, we will look into how to use "Bitwise Operators" to manipulate the bits of a number. Bit manipulation is a useful tool that can sometimes perform complicated tasks in a simple manner.

We will work on integer values and assume each number has $32$ bits. If a number has only $0$ or $1$ as its digits, then assume that it is in binary form. All bit positions are $0$-indexed.

Since binary values can only be $0$ or $1$, we sometimes refer to them like light bulbs. A bit being "Off" means it has value $0$ and being "On" means it has value $1$. A $1$ bit is also referred to as "Set" and a $0$ bit as "Reset".

We need to have a strong grasp of how AND, OR, Negation, XOR and Shift Operators work to understand how bit manipulation works. If you have forgotten about them, then make sure you revise.

Checking Bit at Position X

Given a number $N$, find the value of its bit at position $X$. For example, if $N=12=(1100)_2$ and $X=2$, then value of bit is $1$. 

So how do we find it? We can find this value in three steps:
  1. Let, $mask = 1 << X$
  2. Let, $res = N \: \& \: mask$
  3. If $res$ is $0$, then bit is $0$ at that position, else bit is $1$.
Let us see it in action with the example above, $N = 12$ and $X = 2$.

$1100$
$\underline{0100} \:\&$ ( $1 << 2$ is just $1$ shifted to left twice )
$0100$

Now, what happened here? In step $1$, when we left shifted $1$ $X$ times, we created a number $mask$ which has bit $1$ only at position $X$. This is a useful application of left shift operator. Using this, we can move $1$ bit anywhere we want. 

Next, at step $2$, we find the result of performing the AND operator between $N$ and $mask$. Since $mask$ has $0$ in all positions except $X$, the result will have $0$ in all places other than $X$. That's because performing AND with $0$ always gives $0$. Now, what about position $X$? Will the result at position $X$ have $0$ too? That depends on the bit of $N$ at position $X$. Since the bit of the mask at position $X$ is $1$, the only way the result will have $0$ at that position is if $N$ has $0$ in that position.

So, when every position of $res$ is $0$, that is, $res$ has value $0$, we can say that $N$ has a $0$ bit in position $X$; otherwise it doesn't.

Set Bit at Position $X$

Given a value $N$, turn the bit at position $X$ of $N$ to $1$. For example, $N=12=1100$ and $X=0$, then $N$ will become $13=1101$. 

This can be done in two steps:
  1. Let, $mask = 1 << X$
  2. $N = N \:|\: mask$
$1100$
$\underline{0001}\:|$
$1101$

In step $1$, we shift a $1$ bit to position $X$. Then we perform OR between $N$ and $mask$. Since $mask$ has 0 in all places except $X$, in result all those places remain unchanged. But $mask$ has $1$ in position $X$, so in result position $X$ will be $1$. 

Reset Bit at Position $X$

Given a value $N$, turn the bit at position $X$ of $N$ to $0$. For example, $N=12=1100$ and $X=3$, then $N$ will become $4=0100$. 

This can be done in three steps:
  1. Let, $mask = 1 << X$
  2. $mask = \sim mask$
  3. $N = N \:\&\: mask$
$1100$
$\underline{0111}\:\&$ ( We first got $1000$ from step $1$. Then used negation from step 2.)
$0100$

In step $1$ we move $1$ bit to position $X$. This gives us a number with $1$ in position $X$ and $0$ in all other places. Next we negate the $mask$. This flips all bits of $mask$. So now we have $0$ in position $X$ and $1$ in all other places. Now when we perform AND between $mask$ and $N$, it forces the bit at the $X$ position to $0$ and all other bits stay intact.

Toggle Bit at Position X

Given a value $N$, toggle the bit at position $X$, i.e, if bit at $X$ is $0$ then make it $1$ and if it is already $1$, make it $0$. 

This can be done in two steps:
  1. Let, $mask = 1 << X$
  2. $N = N \:\wedge\: mask$
First we shift bit $1$ to our desired position $X$ in step $1$. When XOR is performed between a bit and $0$, the bit remains unchanged. But when XOR is performed between a bit and $1$, it toggles. So in step $2$ every position except $X$ remains unchanged, since every position in $mask$ except $X$ is $0$; only the bit at position $X$ toggles.

Coding Tips

When coding these in C++, make sure you use lots of brackets. Bitwise operators have lower precedence than the $==$ and $!=$ operators, which often causes trouble.

What would be the output of this code:
if ( 0 & 6 != 7 ) printf ( "True\n" );
else printf ( "False\n" );
You might expect things to follow this order: $0 \:\&\: 6 = 0$, and $0 \neq 7$, so "True" will be the output. But due to operator precedence, the output comes out "False". First $6\: != 7$ is checked. This is true, so $1$ is returned. Next $0 \:\&\:1$ is performed, which is $0$.

The best way to avoid this kind of mishap is to use brackets whenever you are using bitwise operators. The correct way to write the code above is:
if ( (0 & 6) != 7 ) printf ( "True\n" );
else printf ( "False\n" );
This prints "True" which is our desired result.

Conclusion

These are the basics of bit manipulation and by no means the end of it. There are lots of other tricks that are yet to be seen, such as "Check Odd or Even", "Check Power of Two", "Right Most Bit". We will look into them some other day.

Reference

  1. forthright48 - Bitwise Operators
  2. CPPReference - C++ Operator Precedence

Thursday, August 27, 2015

Bitwise Operators

We know about arithmetic operators. The operators $+,-,/$ and $\times$ add, subtract, divide and multiply respectively. We also have another operator, $\%$, which finds the modulus.

Today, we are going to look at $6$ more operators called "Bitwise Operators". Why are they called "Bitwise Operators"? Because they work on the binary numerals (bits, the individual digits of a binary number) of their operands. Why do we have such operators? Because in computers all information is stored as strings of bits, that is, binary numbers. Having operators that work directly on them is pretty useful.

We need to have a good idea of how the Binary Number System works in order to understand how these operators work. Read more on number systems in [1] Introduction to Number Systems. Use the $decimalToBase()$ function from [1] to convert decimal numbers to binary and see how they are affected.

Bitwise AND ($\&$) Operator

The $\&$ operator is a binary operator. It takes two operands and returns a single integer as the result. Here is how it affects individual bits.

$0 \: \& \: 0 = 0$
$0 \: \& \:  1 = 0$
$1 \: \& \:  0 = 0$
$1 \: \& \:  1 = 1$

It takes two bits and returns another bit. The $\&$ operator will take two bits $a, b$ and return $1$ only if both $a$ AND $b$ are $1$. Otherwise, it will return $0$.

But that's only for bits. What happens when we perform the $\&$ operation on two integer numbers?

For example, what is the result of  $A \: \& \:  B$ when $A = 12$ and  $ B =10$?

Since $\&$ operator works on bits of binary numbers we have to convert $A$ and $B$ to binary numbers.

$A = (12)_{10} = (1100)_2$
$B = (10)_{10} = (1010)_2$

We know how to perform $\&$ on individual bits, but how do we perform the $\&$ operation on strings of bits? Simple: take each position of the strings and perform the $\&$ operation on the bits at that position.

$1100$
$\underline{1010}\: \&$
$1000$

Therefore, $12 \: \& \:  10 = 8$.

In C++, the equivalent code is:
printf ("%d\n", 12 & 10 );

Bitwise OR ($|$) Operator

The $|$ operator is also a binary operator. It takes two operands and returns a single integer as the result. Here is how it affects the bits:

$0 \: | \: 0 = 0$
$0 \: | \: 1 = 1$
$1 \: | \: 0 = 1$
$1 \: | \: 1 = 1$

The $|$ operator takes two bits $a, b$ and returns $1$ if $a$ OR $b$ is $1$. Therefore, it returns $0$ only when both $a$ and $b$ are $0$.

What is the value of $A\:|\:B$ if $A=12$ and $B=10$? Same as before, convert them into binary numbers and apply $|$ operator on both bits of each position.

$1100$
$\underline{1010}\: |$
$1110$

Therefore $12\:|\:10 = 14$.
printf ( "%d\n", 12 | 10 );

Bitwise XOR ($\wedge$) Operator

Another binary operator that takes two integers as operands and returns an integer. Here is how it affects two bits:

$0 \: \wedge \: 0 = 0$
$0 \: \wedge \: 1 = 1$
$1 \: \wedge \: 0 = 1$
$1 \: \wedge \: 1 = 0$

XOR stands for Exclusive-OR. This operator returns $1$ only when the two operand bits are different. Otherwise, it returns $0$.

What is the value of $A\:\wedge\:B$ if $A=12$ and $B=10$?

$1100$
$\underline{1010}\: \wedge$
$0110$

Therefore, $12\:\wedge\:10 = 6$.

In mathematics, XOR is represented with $\oplus$, but I used $\wedge$ because in C++ XOR is performed with the ^ operator.
printf ( "%d\n", 12 ^ 10 );

Bitwise Negation ($\sim$) Operator

This is a unary operator. It works on a single integer and flips all its bits. Here is how it affects individual bits:

$\sim\:0 = 1$
$\sim\: 1 = 0$

What is the value of $\sim \: A$ if $A = 12$?

$\sim \: (1100)_2 = (0011)_2 = (3)_{10}$

But this will not work in code, because $12$ in C++ is not $1100$, it is $0000...1100$. Each integer is $32$ bits long. So when each of the bits of the integer is flipped, it becomes $1111...0011$. If you don't use an unsigned int, the value will even come out negative.
printf ( "%d\n", ~12 );

Bitwise Left Shift ($<<$) Operator

The left shift operator is a binary operator. It takes two integers $a$ and $b$ and shifts the bits of $a$ to the LEFT $b$ times, appending $b$ zeroes to the end of $a$'s binary representation.

For example, $(13)_{10} << 3 = (1101)_2 << 3 = (110100)_2$.

Shifting the bits of a number $A$ left once is same as multiplying it by $2$. Shifting it left three times is same as multiplying the number by $2^3$.

Therefore, the value of $A << B = A \times 2^B$.
printf ( "%d\n", 1 << 3 );

Bitwise Right Shift ($>>$) Operator

The $>>$ operator does the opposite of the $<<$ operator. It takes two integers $a$ and $b$ and shifts the bits of $a$ to the RIGHT $b$ times. The rightmost $b$ bits are lost and $b$ zeroes are added to the left end (for non-negative values).

For example, $(13)_{10} >> 3 = (1101)_2 >> 3 = (1)_2$.

Shifting the bits of a number $A$ right once is same as dividing it by $2$. Shifting it right three times is same as dividing the number by $2^3$.

Therefore, the value of $A >> B = \lfloor \frac{A}{2^B} \rfloor$.
printf ( "%d\n", 31 >> 3 );

Tips and Tricks

  1. When using the $<<$ operator, be careful about overflow. If $A << B$ does not fit into $int$, make sure you typecast $A$ into $long\:long$. Typecasting $B$ into $long\:long$ does not work.
  2. $A \:\&\:B \leq MIN(A,B)$ 
  3. $A\:|\:B \geq MAX(A,B)$



That's all about bitwise operations. These operators will come in useful during "Bit Manipulation". We will look into that in the next post.

SPOJ LCMSUM - LCM Sum

Problem

Problem Link - SPOJ LCMSUM

Given $n$, calculate the sum $LCM(1,n) + LCM(2,n) + ... + LCM(n,n)$, where $LCM(i,n)$ denotes the Least Common Multiple of the integers $i$ and $n$.

Solution

I recently solved this problem and found the solution really interesting. You can find the formula that solves this problem on OEIS. I will show you the derivation here.

In order to solve this problem, you need to know about Euler Phi Function, finding Divisor using Sieve and some properties of LCM and GCD.

Define SUM

Let us define a variable SUM which we need to find.

$SUM = LCM(1,n) + LCM(2,n) + ... + LCM(n,n)$ (take $LCM(n,n)$ to the other side for now )
$SUM - LCM(n,n) = LCM(1,n) + LCM(2,n) + ... + LCM(n-1,n)$

We know that $LCM(n,n) = n$.

$SUM - n = LCM(1,n) + LCM(2,n) + ... + LCM(n-1,n)$ ($eq 1$)

Reverse and Add

$SUM - n = LCM(1,n) + LCM(2,n) + ... + LCM(n-1,n)$ (Reverse $eq 1$ to get $eq 2$)
$SUM - n = LCM(n-1,n) + LCM(n-2,n) + ... + LCM(1,n)$ ( $eq 2$ )

Now, what will we get if we add $eq 1$ to $eq 2$? Well, we need to do a bit more work to find that out.

Sum of $LCM(a,n) + LCM(n-a,n)$

$x = LCM(a,n) + LCM(n-a,n)$
$x = \frac{an}{gcd(a,n)} + \frac{(n-a)n}{gcd(n-a,n)}$ ($eq 3$)

Arghh. Now we need to prove that $gcd(a,n)$ is equal to $gcd(n-a,n)$.

If $c$ divides both $a$ and $b$, then $c$ divides $a+b$ and $a-b$. This is a common property of divisibility. So, if $g = gcd(a,n)$ divides $a$ and $n$, then it also divides $n-a$. Hence, $gcd(a,n)=gcd(n-a,n)$.

So $eq 3$ becomes:

$x = \frac{an}{gcd(a,n)} + \frac{(n-a)n}{gcd(a,n)}$
$x = \frac{an + n^2 -an}{gcd(a,n)}$.
$x = \frac{n^2}{gcd(a,n)}$.

Now, we can continue adding $eq 1$ with $eq 2$.

Reverse and Add Continued

$SUM - n = LCM(1,n) + LCM(2,n) + ... + LCM(n-1,n)$ ($eq 1$)
$SUM - n = LCM(n-1,n) + LCM(n-2,n) + ... + LCM(1,n)$ ( $eq 2$ Add them )
$2(SUM-n)= \frac{n^2}{gcd(1,n)} +  \frac{n^2}{gcd(2,n)} + ... + \frac{n^2}{gcd(n-1,n)}$
$2(SUM-n ) = \sum_{i=1}^{n-1}\frac{n^2}{gcd(i,n)}$
$2(SUM-n ) = n\sum_{i=1}^{n-1}\frac{n}{gcd(i,n)}$  ( take $n$ common )

Group Similar GCD

What are the possible values of $g = gcd(i,n)$? Since $g$ must divide $n$, $g$ needs to be a divisor of $n$. So we can list the possible values of $gcd(i,n)$ by finding the divisors of $n$.

Let $Z = n\sum_{i=1}^{n-1}\frac{n}{gcd(i,n)}$. Now, every time $gcd(i,n) = d$, where $d$ is a divisor of $n$, the term $n \times \frac{n}{d}$ is added to $Z$. Therefore, we just need to find, for each divisor $d$, how many times $n \times \frac{n}{d}$ is added to $Z$.

For how many values of $i$ do we get $gcd(i,n) = d$? There are $\phi(\frac{n}{d})$ possible values of $i$ for which $gcd(i,n)=d$. Therefore:

$2(SUM-n ) =  n\sum_{i=1}^{n-1}\frac{n}{gcd(i,n)}$
$2(SUM-n ) =  n\sum_{d\:|\:n, d \neq n } \phi(\frac{n}{d}) \times \frac{n}{d}$ (but, $\frac{n}{d}$ is also a divisor of $n$ )
$2(SUM-n ) =  n\sum_{d\:|\:n, d \neq 1 } \phi(d) \times d$ ( when $d = 1$, we get $\phi(1)\times 1$ ) 
$2(SUM-n ) =  n( \sum_{d\:|\:n } ( \phi(d) \times d ) -1 )$ 
$2(SUM-n ) =   n \sum_{d\:|\:n } (\phi(d) \times d ) - n$
$2SUM - 2n + n  = n\sum_{d\:|\:n } \phi(d) \times d $
$2SUM - n = n\sum_{d\:|\:n } \phi(d) \times d $
$$2SUM = n( \sum_{d\:|\:n } \phi(d) \times d )+ n \\2SUM = n ( \sum_{d\:|\:n } ( \phi(d) \times d ) + 1 )\\ \therefore SUM = \frac{n}{2} ( \sum_{d\:|\:n } ( \phi(d) \times d ) + 1 )$$
Using this formula we can solve the problem.
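As a quick sanity check of the formula, take $n = 6$. The divisors of $6$ are $1, 2, 3, 6$, so:

$$\sum_{d\:|\:6} \phi(d) \times d = 1 \times 1 + 1 \times 2 + 2 \times 3 + 2 \times 6 = 21$$
$$SUM = \frac{6}{2} ( 21 + 1 ) = 66$$

And computing directly, $LCM(1,6) + LCM(2,6) + ... + LCM(6,6) = 6 + 6 + 6 + 12 + 30 + 6 = 66$, which matches.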

Code

#include <bits/stdc++.h>

#define FOR(i,x,y) for(vlong i = (x) ; i <= (y) ; ++i)

using namespace std;
typedef long long vlong;

vlong res[1000010];
vlong phi[1000010];

void precal( int n ) {
    ///Calculate phi from 1 to n using sieve
    FOR(i,1,n) phi[i] = i;
    FOR(i,2,n) {
        if ( phi[i] == i ) {
            for ( int j = i; j <= n; j += i ) {
                phi[j] /= i;
                phi[j] *= i - 1;
            }
        }
    }

    ///Calculate partial result using sieve
    ///For each divisor d of n, add phi(d)*d to result array
    FOR(i,1,n){
        for ( int j = i; j <= n; j += i ) {
            res[j] += ( i * phi[i] );
        }
    }
}

int main () {
    precal( 1000000 );

    int kase;
    scanf ( "%d", &kase );

    while ( kase-- ) {
        vlong n;
        scanf ( "%lld", &n );

        ///We already have partial result in res[n]
        vlong ans = res[n] + 1;
        ans *= n;
        ans /= 2;

        printf ( "%lld\n", ans );
    }

    return 0;
}
We need to precalculate the partial results using a sieve for all values of $N$. The precalculation has a complexity of $O(N\: \ln N)$. After precalculation, we can answer each query in $O(1)$.

Reference

  1. OEIS - A051193

Tuesday, August 25, 2015

Contest Analysis: BUET Inter-University Programming Contest - 2011, UVa 12424 - 12432

Last week (2015-08-21) we practiced with this set. Some of the problems didn't have proper constraints, which caused some headaches. This contest originally followed Google Code Jam format with two subproblems ( small and large ) for each problem. Each of the sub-problems had a different constraint on them. But when they uploaded it online, they removed both constraints from some problems, making things confusing. We ended up guessing the constraints when solving them.

During contest time, we managed to solve $B, E$, and $H$. I was stuck on $F$. I should have moved on; $C$ and $D$ were far easier than $F$. Bad decision on my end.

UVa 12424 - Answering Queries on a Tree

Complexity: $O(logN)$
Category: Segment Tree, DFS, LCA

At first it seemed like a problem on Heavy Light Decomposition. But after a moment I realized it can be done without HLD.

Run a dfs from any node and make the tree rooted. Now consider the color $X$. For every node that has color $X$, the edge from its parent will have cost $1$. Since the root can also have color $X$, we need to add a dummy node $0$ and make that the root. Now, run another dfs and calculate the distance from the root to each node. We repeat this for all $10$ colors and build $10$ trees. Let us call the tree built for color $X$, $T_X$.

Now, whenever we have an update to change node $u$ from color $X$ to color $Y$, the edge entering $u$ in tree $T_X$ will now have cost $0$. Since its cost decreased, the distances from the root to all nodes in the subtree of $u$ will be lowered by $1$. We can apply this change using a preorder-traversal dfs + segment tree. Do the opposite for $T_Y$: increase the value of all nodes in the subtree of $u$ in $T_Y$.

And for query $u$ and $v$, find the distance between these two nodes in each of the $10$ trees. This can be done in $O(logN)$ using LCA + Segment Tree.

Code: http://ideone.com/UEU9J4

UVa 12425 - Best Friend

Complexity: $O(\sqrt{N} + \sqrt[3]{N} \times \text{Complexity of } \phi (N) )$
Category: Number Theory, Euler Phi, GCD

So for given integers $N$ and $X$, we have to find out how many numbers $I$ there are such that $gcd(N, I) \leq X$.

In order to calculate $gcd(N,I) \leq X$, I first need to be able to calculate how many numbers there are such that $gcd(N,I) = Y$. I started with a small value of $Y$ first. So the first thing I asked myself was: how many numbers are there such that $gcd(N, I) = 1$? The answer is $\phi (N)$. Then I asked: how many are there such that $gcd(N, I) = 2$?

This took a moment, but I soon realized that if $gcd(N,I)$ equals $2$, then dividing both $N$ and $I$ by $2$ gives $gcd(\frac{N}{2},\frac{I}{2}) = 1$. Now, all I had to calculate was how many $I$ there are such that $gcd(\frac{N}{2},\frac{I}{2}) = 1$. The answer is $\phi (\frac{N}{2})$.

Therefore, there are $\phi (\frac{N}{Y})$ numbers $I$ such that $gcd(N,I) = Y$.

Great. Now we can calculate the number of $I$ such that $gcd(N,I) = Y$. All we need to do is find the sum, over all $Y \leq X$, of the number of $I$ for which $gcd(N, I) = Y$. The GCD of $N$ and $I$ can be as high as $10^{12}$, so running a loop won't do.

Next I thought, what are the possible values of $g = gcd(N,I)$? Since $g$ divides both $N$ and $I$, $g$ must be a divisor of $N$. So I simply calculated all the divisors of $N$ using a $\sqrt{N}$ loop. Next, for each divisor $d$, I calculated $\phi ( \frac{N}{d})$, which is the number of ways to choose $I$ such that $gcd(N,I) = d$.

Now, for each query $X$, all I needed to do was find the sum of $\phi ( \frac{N}{d} )$, where $d$ is a divisor of $N$ and $d \leq X$. Since all values of $\phi ( \frac{N}{d} )$ were precalculated, using cumulative sums I answered each query in $O(1)$.


Complexity: $O(N^2\:logN)$
Category: Two Pointer, Geo, Binary Search

We need to choose three points of a convex hull such that the area does not exceed $K$. 

First, we need to know, for two points $i$ and $j$, $j>i$, what is the point $m$ such that $(i,j,m)$ forms the largest triangle possible. This can be done by selecting two points with two loops and then using ternary search. But I used two pointer technique to find $m$ for every pair of $(i,j)$ in $O(N)$.

After that, for every pair $(i,j)$ I used binary search twice: once on points $[j+1,m]$ and once on points $[m+1,n]$. Using binary search I found the point $x$ which gives the largest area that does not exceed $K$. Then I added $x-j-1$ in the first binary search and $n-1-m$ in the second case.

Be careful about not adding degenerate triangles. Handle the edge cases properly.


Complexity: $O(1)$
Category: Game Theory, Combinatorics, Nim Game, Bogus Nim

This game can be converted into a game of piles. Let the distance between the left boundary and the gold coin be $x$ and the distance between the diamond and the silver coin be $y$. Now, instead of playing on the board, we can play with two piles of height $x$ (first pile) and $y$ (second pile).

Moving the gold coin left is the same as removing one coin from the first pile. Moving the diamond left is the same as adding one coin to the second pile. Moving the silver coin left is the same as removing one coin from the second pile. Moving the gold and silver coins left together is the same as removing one coin from each pile.

Now, note that if a state $(x,y)$ is losing and I try to increase the second pile by one, my opponent can simply take one from the second pile and put me back in the same situation. That is, increasing the second pile has no effect on this game. This is a "Bogus" move; we will ignore it.

Next, I drew a $5 \times 5$ table and calculated the Win or Lose state of each possible $(x,y)$. I found that the first player loses only when both $x$ and $y$ are even. That is, the second player, me, wins the game when $x$ and $y$ are both even.

What are the possible values of $x$? $x$ can only be between $a_1$ and $a_2$. Find the number of even numbers in this range; let this be $r_1$. Next, $y$ can be between $c_1$ and $c_2$; let the number of even numbers in this range be $r_2$. The distance between the gold coin and the diamond has no effect on this game, so we can put any amount of gap there.

Therefore, number of possible combination is $r_1 \times r_2 \times (b_2 - b_1 + 1 )$.

Complexity: $O(\sqrt{M})$
Category: Adhoc

This was the easiest problem in the set. In this problem, we are given $N$ nodes and $M$ edges, and we have to find the maximum number of leaves this graph can have.

The problem says the graph will be connected. So I created a star-shaped graph using $N-1$ edges. Let the center node be $0$ and the remaining nodes be numbered from $1$ to $N-1$. This gives me $N-1$ critical edges. Then I checked whether I still had more edges left. If I did, then I needed to sacrifice a critical edge.

The first edge that gets sacrificed is a special case. This is because, no matter which two nodes you connect, it will reduce the number of critical edges by $2$. There is no helping it. So let's connect $1$ and $2$, reducing our remaining edges by $1$ and critical edges by $2$.

Then, if we still have more edges left, we need to sacrifice another critical edge. From now on, at each step only one critical edge will be removed. Also, this time we can add $2$ edges to the graph by removing one critical edge: connect an edge between $(3,1)$ and $(3,2)$. Reduce $M$ by $2$ and the critical edge count by $1$.

Next is node $4$. We can add $3$ edges by removing $1$ critical edge now. We need to keep on doing this until all edges finish.

I guess it is possible to solve the problem in $O(1)$ but I didn't bother with it. Number of test case was small anyway.

Complexity: $O(N\:logN)$
Category: Binary Indexed Tree, Number Theory

We have to find the number of triplets such that $a + b^2 \equiv c^3 \:(\text{mod } k)$ and $a \leq b \leq c$. Here is what I did.

Iterate over $c$ from $N$ to $1$ using a loop. Let the loop iterator be $i$. Now for each $i$, we do the following:
  1. Find $x = ( i \times i \times i ) \:\%\: k$ and store $x$ in a BIT. The BIT stores all occurrences of $c$, and by iterating in reverse order we make sure that every $c$ in the BIT comes from an index greater than or equal to $i$.

  2. Let $b = ( i \times i ) \: \% \: k$. This is the current $b$ we are working with. The BIT holds the $c$ values whose indices are $\geq i$, so the constraint $b \leq c$ on indices is already satisfied. Now all we need is to find the possible values of $a$.

  3. Now, take any value of $c$ from the BIT. For this $c$ and the current $b$, in how many ways can we choose $a$ so that $a + b \equiv c \pmod k$? Notice that $a$ can be anything from $1$ to $k$, which covers the same residues as $0$ to $k-1$. Therefore, no matter what the value of $c$ is, we can choose exactly $1$ number from each full block of $k$ consecutive values so that $a + b \equiv c \pmod k$.

    Let $y = \lfloor \frac{i}{k} \rfloor$. This means we have $y$ full segments, in each of which the residue of $a$ spans $0$ to $k-1$. So we add $y$ for each possible $c$ we take from the BIT. Let $total$ be the number of $c$ values we have inserted so far; we add $total \times y$ to $result$.

  4. But it's not over yet. What about the values of $a$ that we did not use? To be more exact, let $r = i \: \% \: k$. If $r > 0$, then we have unused values of $a$, and we need to use them.

    But we have to be careful. We have used up the values of $a$ from $1$ to $y \times k$. So the remaining values of $a$ are $y \times k + 1$ to $i$, whose residues are $1$ to $r$; $0$ is not included.

    Since we don't have all the residues from $0$ to $k-1$, we can no longer handle every possible value of $c$. How many can we handle exactly?

    $a + b \equiv c \pmod k$
    $a \equiv c - b \pmod k$

    But $a$ can only be between $1$ and $r$.

    $1 \leq c - b \leq r$
    $\therefore 1+b \leq c \leq r + b$

    Find out from the BIT how many $c$ there are with values between $1+b$ and $r+b$. Let this value be $z$. For each of those $c$ we can use one value of $a$ between $1$ and $r$. So add $z$ to $result$.
And that's it. Be careful when handling edge cases in BIT.
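For reference, the BIT itself is standard; a minimal sketch with point update and range query (array size and names are mine):

```cpp
#include <cassert>

const int SIZE = 1 << 17;
int bit[SIZE];

// Standard Fenwick tree: point update, prefix sum query (1-based positions).
void update(int pos, int val) {
    for (; pos < SIZE; pos += pos & -pos) bit[pos] += val;
}
int query(int pos) {            // sum of positions 1..pos
    int s = 0;
    for (; pos > 0; pos -= pos & -pos) s += bit[pos];
    return s;
}
int queryRange(int l, int r) {  // sum of positions l..r
    if (l > r) return 0;
    return query(r) - query(l - 1);
}
```

One of the edge cases the paragraph above warns about: since residues live in $0$ to $k-1$, the query range $[1+b,\: r+b]$ can run past $k$ and wrap around, in which case it must be split into two range queries.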

Code: http://ideone.com/iMBUmU

UVa 12430 - Grand Wedding

Complexity: $O(N \:logN)$
Category: DFS, Bicoloring, Binary Search

The constraint was missing. I assumed $N \leq 10^6$.

We have to remove some edges and then assign guards such that no edge is left unguarded and no edge has guards at both ends. We have to maximize the value of the edges that we remove.

Binary search over the value of the edges we need to remove. Now, let's say we removed all edges with value greater than or equal to $x$. How do we decide whether a valid assignment of guards is possible?

Suppose all the nodes in the graph are colored white. If we assign a guard to a node, we color it black. Now, two white nodes cannot be adjacent, because that would mean the edge between them is left unguarded. Two black nodes cannot be adjacent either, otherwise the guards would chat with each other. That is, both ends of an edge cannot have the same color. This is possible exactly when the graph has no cycle of odd length: a graph with no odd cycle is bicolorable. So all we need to do is check whether the graph is bicolorable.
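The bicolorability check is a standard BFS coloring, run once per connected component after the edges with value $\geq x$ are dropped; a minimal sketch (names are mine):

```cpp
#include <cassert>
#include <queue>
#include <vector>
using namespace std;

// Returns true if the graph is bipartite (2-colorable).
bool bicolorable(int n, const vector<vector<int>>& adj) {
    vector<int> color(n, -1);
    for (int s = 0; s < n; s++) {
        if (color[s] != -1) continue;
        color[s] = 0;
        queue<int> q;
        q.push(s);
        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (int v : adj[u]) {
                if (color[v] == -1) { color[v] = color[u] ^ 1; q.push(v); }
                else if (color[v] == color[u]) return false;  // odd cycle found
            }
        }
    }
    return true;
}
```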

Now, two special cases. If we can bicolor the graph without removing any edge, then the answer is $0$. If we would have to remove all the edges, then the answer is $-1$. Why? Because the problem says we can remove a proper subset of the edges, and the full set is not a proper subset.


UVa 12431 - Happy 10/9 Day

Complexity: $O(\:(logN)^2\:)$
Category: Divide and Conquer, Modular Exponentiation, Repeated Squaring

We have to find the value of:

$ ( d\times b^0 + d\times b^1 + d \times b^2 + ... + d \times b^{n-1} ) \: \%\: M$
$ ( d ( b^0 + b^1 + b^2 + ... + b^{n-1} ) ) \: \%\: M$.

Therefore, the problem boils down to finding the value of $( b^0 + b^1 + b^2 + ... + b^{n-1} ) \: \%\: M$. We cannot use the geometric progression formula, since $M$ and $b-1$ may not be coprime. So what do we do? We can use divide and conquer. How? Let me show you an example.

Suppose $f(n) = (b^0 + b^1 + b^2 + ... + b^n)$. Let us find $f(5)$.
$f(5) = b^0 + b^1 + b^2 + b^3 + b^4 + b^5$
$f(5) = (b^0 + b^1 + b^2) + b^3 ( b^0 + b^1 + b^2)$
$f(5) = f(2) + b^3 \times f(2)$
$f(5) = f(2) \times (1 + b^3)$.

From this we can design a D&C in following way:

$\text{n is odd: }f(n) = f(\lfloor \tfrac{n}{2} \rfloor) \times ( 1 + b^{\lfloor n/2 \rfloor + 1 })$
$\text{n is even: }f(n) = f(n-1) + b^n$
$f(0) = 1$

Just apply modular arithmetic throughout. Also note that $M$ can be as large as $10^{12}$, so $(a \times b)$ will overflow if multiplied directly, even after reducing both modulo $M$. So use the repeated squaring method (repeated doubling for multiplication) to multiply them.
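Putting the recurrence and the overflow-safe multiplication together (function names are mine; `mulmod` is the repeated-doubling trick just mentioned):

```cpp
#include <cassert>
typedef long long ll;

// (a * b) % m without overflow, by repeated doubling
ll mulmod(ll a, ll b, ll m) {
    ll res = 0;
    a %= m;
    for (; b > 0; b >>= 1) {
        if (b & 1) res = (res + a) % m;
        a = (a + a) % m;
    }
    return res;
}

// b^e % m by repeated squaring, using mulmod to stay overflow-safe
ll bigmod(ll b, ll e, ll m) {
    ll res = 1 % m;
    b %= m;
    for (; e > 0; e >>= 1) {
        if (e & 1) res = mulmod(res, b, m);
        b = mulmod(b, b, m);
    }
    return res;
}

// f(n) = (b^0 + b^1 + ... + b^n) % m, via the divide and conquer above
ll f(ll n, ll b, ll m) {
    if (n == 0) return 1 % m;
    if (n & 1) return mulmod(f(n / 2, b, m), (1 + bigmod(b, n / 2 + 1, m)) % m, m);
    return (f(n - 1, b, m) + bigmod(b, n, m)) % m;
}
```

For example, $f(5)$ with $b=2$ is $1+2+4+8+16+32 = 63$.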


UVa 12432 - Inked Carpets

Complexity:
Category:

I didn't try this problem yet. I guess this was the stopper.


It was a good set. I especially enjoyed solving $B$, $D$ and $F$.

Thursday, August 20, 2015

Contest Analysis: IUT 6th National ICT Fest 2014, UVa 12830 - 12839

I participated in this contest last year. Our team "NSU Shinobis" managed to solve 5 problems and placed in the top 10 (I failed to find the rank list). The IUT 7th National ICT Fest 2015 is going to be held next month, so I thought I would try to upsolve the remaining problems. I have compiled detailed hints for the problems here.

UVa 12830 - A Football Stadium

Complexity: $O(N^3)$
Category: Loop, DP, Kadane's Algorithm

Given $N$ points, we have to find the largest-area rectangle that can fit such that there is no point strictly inside it. Points on the boundary of the rectangle are allowed.

Now imagine a rectangle which has no point inside it and also no point on its boundary. Now focus on the top boundary of that rectangle. Since there is no point on that boundary, we can extend it up until it hits a point or limit. Same can be said for the bottom, left and right boundary. That is, the largest rectangle will have its boundary aligned with the limits or points.

Now let us fix the top and lower boundary of the rectangle. How many possible values can they take? Each boundary can take values of $y$-coordinate from one of the $N$ points or the limits $0$ and $W$. Using two nested loops, we can fix them.

Next we need to fix the left and right boundaries. Let us sort the points first by $x$ and then by $y$. Then we iterate over the points. Initially, set the left boundary of the rectangle aligned with the $y$-axis, that is, with the left boundary of the Sand Kingdom.

Now, for each point: if the $y$-coordinate of the point is above, below, or on the boundaries we fixed, we ignore it, because it doesn't cause any problem. But if it falls inside, we need to process it. We put the right boundary at the $x$-coordinate of this point and calculate the area of this rectangle.

But it's not over. We shift the left boundary to the current right boundary and continue processing the remaining points.

This is a classical application of Kadane's Algorithm. With two nested loops fixing the upper and lower boundaries, and another loop inside iterating over the $N$ points, the complexity is $O(N^3)$.

An $O(N^2)$ solution is possible by constructing a histogram for each row ($O(N)$ rows) and calculating the largest area under each histogram in $O(N)$. But that's not necessary here.
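A sketch of the $O(N^3)$ sweep described above (all names are mine; `L` and `W` are the field's dimensions, and the candidate horizontal boundaries are the point $y$'s plus the limits):

```cpp
#include <cassert>
#include <algorithm>
#include <vector>
using namespace std;

struct Pt { long long x, y; };

// Largest empty rectangle: fix bottom/top boundaries, sweep points left to right.
long long largestRect(long long L, long long W, vector<Pt> pts) {
    sort(pts.begin(), pts.end(), [](const Pt& a, const Pt& b) {
        return a.x != b.x ? a.x < b.x : a.y < b.y;
    });
    vector<long long> ys = {0, W};              // candidate boundaries
    for (auto& p : pts) ys.push_back(p.y);
    long long best = 0;
    for (size_t i = 0; i < ys.size(); i++)
        for (size_t j = 0; j < ys.size(); j++) {
            long long y1 = ys[i], y2 = ys[j];
            if (y1 >= y2) continue;
            long long left = 0;                 // start at the left wall
            for (auto& p : pts) {
                if (p.y <= y1 || p.y >= y2) continue;  // on or outside boundary
                best = max(best, (p.x - left) * (y2 - y1));
                left = p.x;                     // shift left boundary here
            }
            best = max(best, (L - left) * (y2 - y1));  // close at the right wall
        }
    return best;
}
```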

Code: http://ideone.com/FjvTN2

UVa 12831 - Bob the Builder

Complexity: $O(\sqrt{V}E)$
Category: Max Flow, Bipartite Matching, Dinic's Algorithm, Path Cover

Another classical problem, though we had a hard time with this one, mainly because we didn't realize it was a path cover problem, which can be solved using max flow.

Take each number provided and generate all its children. Consider each number a node, and if it is possible to generate $B$ from $A$, add a directed edge from $A$ to $B$. When you are done, you will have a DAG.

Now split every node in two. We are going to create a bipartite graph. On the left side we will have all the original nodes, and on the right side the copies we got by splitting. Now, for each edge between $A$ and $B$ in the DAG, add an edge in the bipartite graph from $A$ on the left side to $B$ on the right side.

Find the matching (maximum flow) of this graph by running Dinic's Algorithm. The answer will be $\text{Total Nodes} - \text{Matching}$.
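Dinic's is what we used, but any bipartite matching works here; as a simpler stand-in, a sketch of Kuhn's augmenting-path algorithm ($O(VE)$, names are mine), with the minimum path cover being nodes minus matching:

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Kuhn's algorithm: maximum bipartite matching by augmenting paths.
struct Matcher {
    vector<vector<int>> adj;  // adj[u] = right-side nodes reachable from left u
    vector<int> matchR;       // matchR[v] = left node matched to right v, or -1
    vector<bool> used;

    Matcher(int left, int right) : adj(left), matchR(right, -1) {}

    bool tryAugment(int u) {
        for (int v : adj[u]) {
            if (used[v]) continue;
            used[v] = true;
            if (matchR[v] == -1 || tryAugment(matchR[v])) {
                matchR[v] = u;
                return true;
            }
        }
        return false;
    }

    int maxMatching() {
        int match = 0;
        for (int u = 0; u < (int)adj.size(); u++) {
            used.assign(matchR.size(), false);
            if (tryAugment(u)) match++;
        }
        return match;
    }
};
```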

Code: http://ideone.com/U1VPNB

UVa 12832 - Chicken Lover

Complexity: $O(M)$
Category: Expected Value, Probability, DP,  Combination

If a shop can make $N$ different items, and in a single day prepares $K$ items from those $N$ items, then how many different sets of menus ($total$) can they make? $total = C^N_K$. Now, if they decide to make chicken that day for sure, how many sets of menus ( $chicken$ ) can they make now? $chicken = C^{N-1}_{K-1}$. So what is the probability $P$ that if I visit a shop I will get to eat chicken? $P = \frac{chicken}{total}$. And what is the probability that I will not eat chicken? $Q = 1- P$.

So now I know the probability of eating chicken for each shop. How do we find the expected value? We will find it using dynamic programming.

The only state of the dp is the shop number. $dp(x)$ gives the expected number of chickens I can eat if I visit all the shops starting from $x$. Our result will be $dp(1)$.

At each state, I have probability $P$ of eating chicken and probability $Q$ of not. So the result for each state will be:

$dp ( pos ) = P \times ( 1 + dp ( pos + 1 ) ) + Q \times dp ( pos + 1 )$
$dp ( pos ) = P + P \times dp ( pos + 1 ) + Q \times dp ( pos + 1 ) $
$dp ( pos ) = P + dp ( pos + 1 ) \times ( P + Q )$
$dp ( pos ) = P + dp ( pos + 1 )$.

In order to print the result in $\frac{A}{B}$ form, we need to avoid $double$ and use integer arithmetic everywhere. I implemented my own fraction class for that purpose.
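Note that $P = C^{N-1}_{K-1} / C^N_K$ simplifies to $K/N$, so by the dp above the answer is just $\sum K_i / N_i$ kept as an exact fraction. A minimal sketch of the fraction bookkeeping (struct and names are mine):

```cpp
#include <cassert>
typedef long long ll;

ll gcd(ll a, ll b) { return b == 0 ? a : gcd(b, a % b); }

struct Frac { ll num, den; };

// Add two fractions and reduce to lowest terms.
Frac add(Frac a, Frac b) {
    Frac r = { a.num * b.den + b.num * a.den, a.den * b.den };
    ll g = gcd(r.num, r.den);
    if (g > 0) { r.num /= g; r.den /= g; }
    return r;
}
```

For example, two shops with $(N,K) = (2,1)$ and $(3,1)$ give $\frac{1}{2} + \frac{1}{3} = \frac{5}{6}$.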

Code: http://ideone.com/mGKwuL

UVa 12833 - Daily Potato

Complexity: $O(26 \times N)$
Category: String, Manacher's Algorithm

For each query, we have to count the number of palindromes which are substrings of $S$, start and end with the given $C$, and have exactly $X$ occurrences of $C$ in them. Since it deals with palindromes, perhaps it has something to do with Manacher's Algorithm?

With Manacher's Algorithm, we can find the maximum length of a palindrome in $O(N)$. But what's more, we can actually generate an array $M$ which gives us, for each center in the extended string $E$ ($aba$ when extended becomes $\text{^#a#b#a#\$}$), the maximum length of a palindrome with that particular center. How can we use this knowledge to solve our problem?

Let us consider the character $'a'$ only. We can easily extend it for other characters by repeating the whole process $26$ times.

Suppose we are working with center $x$, which has a palindrome of length $y$ around it. Therefore, in the extended string of Manacher, the palindrome starts at $x-y$ and ends at $x+y$. Now, how many times does $'a'$ occur in this palindrome? Using a cumulative sum it is possible to answer that in $O(1)$. Let that count be $f = \text{# of times 'a' occurs in palindrome with center x}$. We mark this in another array $freq$, as $\text{freq[f]++}$, meaning we have $freq[f]$ palindromes in which $'a'$ occurs $f$ times. But wait, what if the palindrome does not start and end with $'a'$? Simple: we keep throwing away the leading and trailing characters until it starts and ends with $'a'$, and it will still have $f$ occurrences of $'a'$ in it.

So we repeat this for every possible center. Now, if the query is to find the number of palindromes that start and end with $'a'$ in which $'a'$ occurs exactly $X$ times, how do we solve it?

First of all, our result will contain $res = freq[X]$. What's more, it will also contain $res \text{+=} freq[X+2] + freq[X+4] + freq[X+6] + ...$. Why is that? Take any palindrome that contains more than $X$ occurrences of $'a'$. Since it starts and ends with $'a'$, we can throw those two characters out of the palindrome and reduce the occurrences of $'a'$ in it by $2$. After that, we keep trimming the head and tail of the palindrome until we reach $'a'$ again. That is, a palindrome with $Y$ occurrences of $'a'$ can be trimmed down to palindromes with $Y-2$, $Y-4$, $Y-6$, $...$ occurrences of $'a'$.

Instead of computing $res = freq[X] + freq[X+2] + freq[X+4] + ...$ term by term, we can use a cumulative sum again to find it in $O(1)$: just take the cumulative sum of alternate terms.
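The alternate-term cumulative sum is filled from the back in one pass (array names are mine):

```cpp
#include <cassert>
#include <vector>
using namespace std;

// cum[x] = freq[x] + freq[x+2] + freq[x+4] + ...
vector<long long> alternateSum(const vector<long long>& freq) {
    int n = freq.size();
    vector<long long> cum(n, 0);
    for (int x = n - 1; x >= 0; x--)
        cum[x] = freq[x] + (x + 2 < n ? cum[x + 2] : 0);
    return cum;
}
```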

Code: http://ideone.com/brZKXL

UVa 12834 - Extreme Terror

Complexity: $O(N\: logN )$
Category: Adhoc

This was the easiest problem. Each shop gives me some money ($income$), and for that shop I have to give the Godfather some cut ($expense$). So from each shop I get $profit = income - expense$. I calculate the profit for each shop and sort them. I can skip up to $K$ shops, and of course I skip the shops whose profit is negative, as long as I am allowed to skip.

Code: http://ideone.com/s3iHGz

UVa 12835 - Fitting Pipes Again

Complexity: $O(N!\: N^2)$
Category: Geometry, Trigonometry, Permutation, Packing Problem

It's a packing problem. At first glance it seems tough, but it's not. Let us first define a few variables.


This is the polygon. Each polygon has two horizontal lines through it; we will call them the bottom and top lines. Each side of the polygon has length $x$. The radius of the polygon is $r = \frac{x}{2} + y$.

We are given the height of each polygon, but first we need to calculate the values of $x$ and $y$. We can find them using trigonometry.

$y^2 + y^2 = x^2$
$2y^2 = x^2$
$x = y\sqrt{2}$

We also know that $h = y + x + y$. From that we can derive:

$y + x + y = h$
$2y+x = h$
$2y + y\sqrt{2} = h$
$y( 2 + \sqrt{2} ) = h$
$y = \frac{h}{2+\sqrt{2}}$

With the above, we can find the value of $y = \frac{h}{2+\sqrt{2}}$ and $x = y\sqrt{2}$. But why did we find their values?
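Putting the derivation into code (struct and variable names are mine):

```cpp
#include <cassert>
#include <cmath>

// From the height h of the polygon, recover y, x and the radius r.
struct Dims { double x, y, r; };

Dims fromHeight(double h) {
    Dims d;
    d.y = h / (2.0 + sqrt(2.0));  // y = h / (2 + sqrt(2))
    d.x = d.y * sqrt(2.0);        // x = y * sqrt(2)
    d.r = d.x / 2.0 + d.y;        // r = x/2 + y
    return d;
}
```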

Okay, what happens when we put two polygons side by side? In order to solve this problem we need to be able to find the distance between the centers of two polygons $A$ and $B$.

Now, if two polygons have the same height and are placed side by side, the distance between their centers will be $d = r_a + r_b$. What happens when two polygons of arbitrary heights come beside each other?

$3$ things can happen when two polygons $A$ and $B$ are placed side by side, with $A$ on the left of $B$:

  1. The bottom line of $A$ is higher than the top line of $B$. In this case, $A$ is so big that $B$ slides inside the radius of $A$.
  2. The bottom line of $B$ is higher than the top line of $A$. In this case, $A$ is so small that it slides inside the radius of $B$.
  3. Neither can slide inside the other's radius, so $d = r_a + r_b$.
We need to calculate the value of $d$ for cases $1$ and $2$. That can also be done easily with trigonometry; try drawing some diagrams yourself.

So we used $x$ and $y$ to find the value of $d$ between two polygons. How do we use this to find the minimum width of the box?

Suppose we are trying to pack the polygons in some order. Which order? We don't know which order gives the minimum width, so we try them all. There are $N!$ orders.

So for each order, what is the minimum width? First take the first polygon and put it inside the empty box so that it touches the left side of the box. We will calculate the center of each polygon relative to the left side of the box. The first polygon will have its center at $r_0$.

Now take the second polygon. First imagine there is no other polygon in the box; then its center would be at $r_1$. But since there is a polygon, we also try to put it beside that one, which places the center at $r_0 + d$. We take the maximum possible center.
Repeat this for the third polygon: empty box, beside the first polygon, beside the second polygon. Take the maximum again. Repeat this for all polygons.

From all the polygons, find the one with the maximum center position. Add that polygon's radius to its center to find the width.

Take the minimum width over all permutations.


UVa 12836 - Gain Battle Power

Complexity: $O(N^2)$
Category: Interval DP, LIS, Knuth Optimization

First we need to calculate the power of each Death Eater. We can do that by calculating the LIS from both directions and adding them.

Once we have the powers, we run an interval dp between $1$ and $N$. The dp has states $start$ and $end$. Inside the dp, a loop between $start$ and $end$ runs, choosing different cut points; we take the one with minimum value.

But this gives an $O(N^3)$ solution. Using Knuth's optimization we can reduce it to $O(N^2)$.
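A generic sketch of Knuth's optimization on an interval dp. To keep it concrete and runnable I use the classic stone-merging cost (sum of the interval); the actual cost function of this problem differs, but the optimization pattern is the same:

```cpp
#include <cassert>
#include <algorithm>
#include <vector>
using namespace std;
typedef long long ll;
const ll INF = 1e18;

// Interval dp with Knuth's optimization: the optimal split point opt[i][j]
// is monotone, so the inner loop only scans [opt[i][j-1], opt[i+1][j]],
// giving O(N^2) overall instead of O(N^3).
ll mergeCost(const vector<ll>& a) {
    int n = a.size();
    vector<ll> pre(n + 1, 0);
    for (int i = 0; i < n; i++) pre[i + 1] = pre[i] + a[i];
    vector<vector<ll>> dp(n, vector<ll>(n, 0));
    vector<vector<int>> opt(n, vector<int>(n, 0));
    for (int i = 0; i < n; i++) opt[i][i] = i;
    for (int len = 2; len <= n; len++)
        for (int i = 0; i + len - 1 < n; i++) {
            int j = i + len - 1;
            dp[i][j] = INF;
            for (int k = opt[i][j - 1]; k <= min(j - 1, opt[i + 1][j]); k++) {
                ll val = dp[i][k] + dp[k + 1][j] + pre[j + 1] - pre[i];
                if (val < dp[i][j]) { dp[i][j] = val; opt[i][j] = k; }
            }
        }
    return dp[0][n - 1];
}
```

The optimization is valid only when the cost function satisfies the quadrangle inequality, which the sum cost does.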


UVa 12837 - Hasmot Ali Professor

Complexity: $O(100 \times |S|^2 )$
Category: String, Trie, Data Structure

We will create two tries. The first one will contain all the queries. Take a query and concatenate its strings in a special way: take the first string of the query, add a $\text{'#'}$, then reverse the second string of the query and append it. That is, if we have $abc$ and $pqr$ as a query, the special string will be $\text{abc#rqp}$. Insert the special string of every query into the first trie.

Now let us process the main string $S$. We take each of its suffixes and insert it into the second trie. When inserting the suffixes, each time a new node is created, we know we have found a unique substring.

Each time we find a unique substring, we process it further. Take the unique substring and, using brute force, generate all possible special strings (the kind we made from the query strings) from its first and last characters. We don't need to take more than $10$ characters from each end.

For each of those special strings made from the unique substring, we query the first trie and find the node where it ends. We add one to that node's counter in a global array.

Once we finish processing all the nodes in the second trie, we simply traverse the first trie according to each query and print the count stored at that node.


UVa 12838 - Identity Redemption

Complexity: Unknown
Category: Matching on General Graph

I haven't managed to solve this one yet. It looks like matching on a general graph.

UVa 12839 - Judge in Queue

Complexity: $O(N\:logN)$
Category: Data Structure, Heap

We want to minimize the maximum waiting time over all people. So first we sort the people in descending order of how long they have already waited. Next we create a priority queue containing every service center along with the time at which it becomes free. Initially all service centers are free.

Now we take each person and pop the service center that becomes free earliest. The person had already waited $x$ minutes, and it takes $y$ more minutes for the service center to become free, so the person waits $x+y$ minutes in total. We push the service center back into the heap, with its free time increased by the time it takes to serve one person.

Repeat, and keep track of the highest time a person had to wait.
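The simulation above in a few lines. The signature is my guess at the problem's parameters: `waited[i]` is how long person $i$ has already waited, with `S` identical judges each serving one person in `T` minutes:

```cpp
#include <cassert>
#include <algorithm>
#include <queue>
#include <vector>
using namespace std;
typedef long long ll;

// Greedy: serve the longest-waiting person with the earliest-free judge.
ll maxWait(vector<ll> waited, int S, ll T) {
    sort(waited.rbegin(), waited.rend());               // descending wait times
    priority_queue<ll, vector<ll>, greater<ll>> freeAt; // min-heap of free times
    for (int i = 0; i < S; i++) freeAt.push(0);         // all judges free at t=0
    ll worst = 0;
    for (ll x : waited) {
        ll y = freeAt.top(); freeAt.pop();              // earliest-free judge
        worst = max(worst, x + y);                      // this person waits x + y
        freeAt.push(y + T);                             // judge busy for T more minutes
    }
    return worst;
}
```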




I hope the details are clear enough for everyone to understand. Let me know if you find any mistakes.