Counting Inversions

First week of algorithm course programming assignment is to compute the number of inversion count of given array. We have a given array A[0…n – 1] of n distinct positive integers and we have to count all the pairs ( i , j ) such that i < j and A[i] > A[j].
1. Brute Force
Brute force idea is very simple. Run two loops and keep counting the respective pairs for which i < j and A[i] > A[j]. The complexity of brute force is O( n² ).

bruteForce :: [ Int ] -> Int
bruteForce [] = 0
bruteForce [_] = 0
bruteForce ( x : xs ) = ( length . filter ( x > ) $ xs ) + bruteForce xs

*Main> bruteForce [ 3 , 1 , 2]
Loading package array-0.4.0.1 ... linking ... done.
Loading package deepseq-1.3.0.1 ... linking ... done.
Loading package bytestring-0.10.0.0 ... linking ... done.
2
*Main> bruteForce [ 2,3,8,6,1]
5

2. Divide and Conquer
We can modify the merge sort to compute the inversion count. The idea is to compute the inversion count of during the merge phase. Inversion count of array C resulting from merging two sorted array A and B is sum of inversion count of A, inversion count of B and their cross inversion. The inversion count of single element is 0 ( base case ). The only trick is compute the cross inversion. What will be the inversion count of following array if we merge them ?
A = [ 1 , 2 , 3 ]
B = [ 4 , 5 , 6 ]
Clearly 0 because all element of A is less than the first element of B. What is the inversion count of C = [ 4 , 5 , 6 ] and D = [ 1 , 2 , 3 ] ? First element of D ( 1 ) is less than first element of C ( 4 ) so we have total three pairs ( 4 , 1 ) ( 5 , 1 ) ( 6 , 1). Similarly for 2 and 3 ( total 9 inversion count ). It reveals that if we have any element y in D is less than certain element x in C then all the element of A after x will also be greater than y and we will have those many cross inversion counts. More clear explanation. Wrote a quick Haskell code and got the answer.

import Data.List
import Data.Maybe ( fromJust )
import qualified Data.ByteString.Lazy.Char8 as BS

inversionCnt :: [ Int ] -> [ Int ] -> [ Int ] -> Int -> ( Int , [ Int ] )
inversionCnt [] ys  ret  cnt = ( cnt , reverse ret ++ ys ) 
inversionCnt xs []  ret  cnt = ( cnt , reverse ret ++ xs )
inversionCnt u@( x : xs ) v@( y : ys )  ret  cnt
      | x <= y = inversionCnt xs v  (  x : ret   )   cnt
      | otherwise = inversionCnt u ys  ( y : ret )  ( cnt + length u )  

merge ::  ( Int , [ Int ] ) -> ( Int , [ Int ] ) -> ( Int , [ Int ] )
merge ( cnt_1 , xs ) ( cnt_2 , ys ) = ( cnt_1 + cnt_2 + cnt , ret ) where
        ( cnt , ret ) = inversionCnt xs ys []   0

mergeSort ::  [ Int ] -> ( Int , [ Int ] )
mergeSort [ x ] = ( 0 , [ x ] )
mergeSort  xs   = merge ( mergeSort xs' ) ( mergeSort ys' ) where
           ( xs' , ys' ) = splitAt ( div ( length xs ) 2 ) xs
          
main =  BS.interact $ ( \x -> BS.pack $ show x ++ "\n" ) . fst 
        .  mergeSort . map ( fst . fromJust .  BS.readInt :: BS.ByteString -> Int  ) 
        . BS.lines 


Mukeshs-MacBook-Pro:Puzzles mukeshtiwari$ ghc  -fforce-recomp -O2  Inversion.hs 
[1 of 1] Compiling Main             ( Inversion.hs, Inversion.o )
Linking Inversion ...
Mukeshs-MacBook-Pro:Puzzles mukeshtiwari$ time ./Inversion <  IntegerArray.txt 
2407905288

real	0m8.002s
user	0m7.881s
sys	0m0.038s

I tried to compare this solution with GeeksforGeeks‘s solution written in C and it was a complete surprise to me.

Mukeshs-MacBook-Pro:Puzzles mukeshtiwari$ time ./a.out <  IntegerArray.txt 
 Number of inversions are 2407905288 

real	0m0.039s
user	0m0.036s
sys	0m0.002s

Almost 200 times faster! My Haskell intuition told me that I am doing certainly some wrong. I started analyzing the code and got the point. In inversionCnt function, I am doing extra work for computing the length of list u ( O ( n ) ) so rather than doing this extra work, I can pass the length.

import Data.List
import Data.Maybe ( fromJust )
import qualified Data.ByteString.Lazy.Char8 as BS


inversionCnt :: [ Int ] -> [ Int ] -> [ Int ] -> Int -> Int -> ( Int , [ Int ] )
inversionCnt [] ys  ret _ cnt = ( cnt , reverse ret ++ ys ) 
inversionCnt xs []  ret _ cnt = ( cnt , reverse ret ++ xs )
inversionCnt u@( x : xs ) v@( y : ys )  ret n cnt
      | x <= y = inversionCnt xs v  (  x : ret   )  ( pred n ) cnt
      | otherwise = inversionCnt u ys  ( y : ret ) n ( cnt + n )  

merge ::  ( Int , [ Int ] ) -> ( Int , [ Int ] ) -> ( Int , [ Int ] )
merge ( cnt_1 , xs ) ( cnt_2 , ys ) = ( cnt_1 + cnt_2 + cnt , ret ) where
        ( cnt , ret ) = inversionCnt xs ys [] ( length xs )  0

mergeSort ::  [ Int ] -> ( Int , [ Int ] )
mergeSort [ x ] = ( 0 , [ x ] )
mergeSort  xs   = merge ( mergeSort xs' ) ( mergeSort ys' ) where
           ( xs' , ys' ) = splitAt ( div ( length xs ) 2 ) xs
          
main =  BS.interact $ ( \x -> BS.pack $ show x ++ "\n" ) . fst 
        .  mergeSort . map ( fst . fromJust .  BS.readInt :: BS.ByteString -> Int  ) 
        . BS.lines 

Mukeshs-MacBook-Pro:Puzzles mukeshtiwari$ time ./Inversion <  IntegerArray.txt 
2407905288

real	0m0.539s
user	0m0.428s
sys	0m0.031s

Now 10 times slower and still lot of room for improvement but for now I am happy with this :). You can also try to solve INVCNT. Run this code on Ideone.

July 4, 2013 - Posted by tiwari_mukesh | Haskell, Programming | Algorithm, Coursera, Haskell, Inversion Count, SPOJ

2 Comments »

I was wondering whether there is way to structure this code so that we don’t have to do that many pattern matching. mergesort is function from [Int] -> [Int], extra inversion count which we need to carry can thought of logging. This is what I come up with

import Control.Monad
import Control.Monad.Writer
import Data.Monoid

merge :: [Int] -> [Int] -> Writer (Sum Int) [Int]
merge [] ys = return ys
merge xs [] = return xs
merge (x:xs) (y:ys)
| x <= y = do { temp <- merge xs (y:ys); return (x:temp) }
| otherwise = do
temp Writer (Sum Int) [Int]
mergeSort [] = return []
mergeSort [x] = return [x]
mergeSort xs = do
let (lx, rx) = splitAt (div (length xs) 2) xs
slx <- mergeSort lx
srx <- mergeSort rx
merge slx srx

Comment by divyanshu ranjan (@rdivyanshu) | July 29, 2013 | Reply
bad formatting : http://ideone.com/YfP9cI

Comment by divyanshu ranjan (@rdivyanshu) | July 29, 2013 | Reply

About

Hello All, My name is Mukesh Tiwari and I graduated from IIITM Gwalior in 2009. I am passionate about programming and mathematics. I love solving problems on SPOJ , UVA , Topcoder and Project Euler. I am curious about functional languages specially Haskell and it’s one the excellent language I encountered after python. I am also looking for some challenging functional programming job specially related to Haskell. You can reach me mukeshtiwariDOTiiitmATgmailDOTcom. Replace DOT with . and AT with @

July 2013

M T W T F S S

1 2 3 4 5 6 7

8 9 10 11 12 13 14

15 16 17 18 19 20 21

22 23 24 25 26 27 28

29 30 31

« Jun Sep »
Recent Comments

carl johnson on SPOJ 9126. Time to live
gratitude on SPOJ problem INTEGMAX
stan on SPOJ 9126. Time to live
Roni on SPOJ DIE HARD
avinish on SPOJ 8756. Shake Shake Sh…

July 2013
M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

Twitter Updates
Tweets by mukesh_tiwari
Blog Stats
- 139,047 hits
Follow Blog via Email

Enter your email address to follow this blog and receive notifications of new posts by email.

Email Address:

Join 364 other subscribers
Search Articles

Search for:
Archives
Archives

My Weblog

Blog about programming and math