My Weblog

Blog about programming and math

Threading in C and Haskell

As soon as main starts executing ( it’s process ), it sees that we are creating threads and operating system creates them for us. When we run this code, it produces the different results every time ( Make the task bit bigger to feel the inconsistency. Probably executing more than processor switching time I guess ). Its because lets say first thread starts incrementing the count variable and before finishing the whole loop, processor switch to second or third thread and they start modifying the inconsistent count variable. In the mean time processor keeps switching to all threads. How to avoid this ? If we are executing some critical code ( some data manipulation ) then we have to put lock on that critical code so if processor switch to other thread and they try to access it then denied because the critical code is locked by some other thread and he is not finished yet. We get the lock by pthread_mutex_t , initialise it and destroy it when we are done. Lock the critical code pthread_mutex_lock and unlock it using pthread_mutex_unlock passing the address of lock. A excellent tutorial on C thread programming.

#include<stdio.h>
#include<pthread.h>

pthread_t tid[2];
pthread_mutex_t lock;

//comment all the functions related to lock to see the race condition.

void* coun_incr(void* arg )
 {
       int i;
       //printf("Thread identifier = %d\n",  pthread_self() );
       //going to lock this critical code 
       pthread_mutex_lock ( &lock );
       
       int* count = ( int* ) arg;
       for( i = 0 ; i < 1000000 ; i++ )  
       *count += 10;

       //release the lock because we have finished the whole iteration
       pthread_mutex_unlock ( &lock );
       return NULL;
 }

int main()
 {
       //initialize the lock
       pthread_mutex_init ( &lock , NULL );
       int count = 0 ;
       pthread_create( &tid[0] , NULL , coun_incr , &count ) ;
       pthread_create( &tid[1] , NULL , coun_incr , &count ) ;
       pthread_create( &tid[2] , NULL , coun_incr , &count ) ;
       //wait for thread to finish otherwise main will finish and terminate all the three threads. 
       pthread_join ( tid[0] , NULL );
       pthread_join ( tid[1] , NULL );
       pthread_join ( tid[2] , NULL );
       printf("%d\n", count );

       //destroy the lock
       pthread_mutex_destroy ( &lock );
 }   

Some results in inconsistent state. 
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ gcc -lpthread Thread_1.c 
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./a.out 
13482740
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./a.out 
21974930
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./a.out 
24638680
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./a.out 
17794120

Using locks on critical code. 
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ gcc -lpthread Thread_1.c 
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./a.out 
30000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./a.out 
30000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./a.out 
30000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./a.out 
30000000

The almost same code in Haskell using mvar.

import Control.Concurrent
import Control.Monad
import GHC.IORef

incr_count :: MVar () -> MVar Int -> IO ()
incr_count m n = ( forM_ [ 1..10000 ] $ \_ -> modifyMVar_ n ( return . ( + 10 ) ) ) >> putMVar m ()

main :: IO()
main = do 
      count <- newMVar 0 
      list <- forM [1..10] $ \_ -> newEmptyMVar
      forM_ list $ \var -> forkIO . incr_count var $ count
      forM_ list $ \var ->  takeMVar var
      val <- takeMVar count
      print val

Ok, modules loaded: Main.
ghci>main
1000000
ghci>main
1000000
ghci>main
1000000

We can create race condition using IORef. We are creating a mutable variable count of type IORef and 10 parallel threads are modifying it.

import Control.Concurrent
import Control.Monad
import Data.IORef

incr_count :: MVar () -> IORef Int  -> IO ()
incr_count m n = ( forM_ [ 1 .. 10000 ] $ \_ -> modifyIORef' n ( + 10   ))  >> putMVar m ()

main :: IO ()
main = do
      count <- newIORef 0
      list <- forM [1..10] $ \_ -> newEmptyMVar
      forM_ list $ \var -> forkIO . incr_count var $ count
      forM_ list $ \var ->  takeMVar var
      val <- readIORef count
      print val

Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1 
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1 
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1  +RTS -N2
619360
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1  +RTS -N2
614090
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1  +RTS -N2
521930
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1
500010
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1  +RTS -N2
597710

We can use software transactional memory ( STM ) to run whole update as single unit without interference of other threads. The only use of MVar is to finish all the threads before main. We can remove MVar and use threadDelay function but I believe that MVar solution is more reliable than using random thread delay. From wikipedia “Unlike the locking techniques used in most modern multithreaded applications, STM is very optimistic: a thread completes modifications to shared memory without regard for what other threads might be doing, recording every read and write that it is performing in a log. Instead of placing the onus on the writer to make sure it does not adversely affect other operations in progress, it is placed on the reader, who after completing an entire transaction verifies that other threads have not concurrently made changes to memory that it accessed in the past. This final operation, in which the changes of a transaction are validated and, if validation is successful, made permanent, is called a commit. A transaction may also abort at any time, causing all of its prior changes to be rolled back or undone. If a transaction cannot be committed due to conflicting changes, it is typically aborted and re-executed from the beginning until it succeeds.”

import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad
import Data.IORef


incr_count :: TVar Int -> STM ()
incr_count m = forM_ [ 1 .. 10000 ] $ \_ -> modifyTVar' m ( + 10 ) 

main = do
   count <- newTVarIO  0
   list <- forM [1..10] $ \_ -> newEmptyMVar
   forM_ list $ \var -> forkIO (  ( atomically . incr_count $ count ) >> putMVar var () )
   forM_ list $ \var ->  takeMVar var
   --threadDelay 1000000
   val <- readTVarIO count
   print val
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1 
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1 +RTS -N2
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1 +RTS -N2
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1 +RTS -N4
1000000
Mukeshs-MacBook-Pro:GitRepo mukeshtiwari$ ./Thread_1 +RTS -N4
1000000

I am no expert in C or Haskell and still learning. If you have feedback or comment then please let me know.

Advertisements

December 18, 2012 - Posted by | Haskell, Programming | , , ,

1 Comment »

  1. I have ran the same code that you have above posted on my systems. The results are as expected as the codes seems fine.

    Comment by Amit | April 28, 2013 | Reply


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: