The unexpected side effects of converting a single-threaded service into a multi-threaded, multi-instance service.
We’re in the middle of one of the most critical migrations: moving to the cloud. One of the terms used most often about this shift is scale: the ability to run multiple instances of something without worrying about the operational overhead.
During this migration, we are looking at ways of parallelizing pretty much every background service. One such service is our External Clicks worker. Since we were in a hurry and needed to migrate ~500GB of data to the new servers, we decided to run multiple instances of this worker.
All was well. Well, almost.
Somehow, this worker ended up duplicating data. On digging through the app logic, we saw that this service used “Get Or Create” logic; think of it like an upsert. Since multiple workers were running in parallel and hitting this logic at almost the same time, they duplicated the data.
The method definition for GetOrCreateRow looks like this:
Since there is no lock on this shared resource, each thread checks for the Row with Id 3, finds nothing, and creates it, causing the data to be duplicated (well, triplicated in this case).
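To make the race concrete, here is a minimal Python sketch of an unsynchronized get-or-create (the original demo is in C#). Everything in it is an assumption for illustration: the in-memory `rows` list stands in for the real table, and `get_or_create_row_unsafe`, `run_unsafe_workers`, and the barrier are hypothetical names. The barrier is a deliberate device that forces every worker to finish its check before any of them inserts, so the duplication shows up on every run instead of only occasionally.

```python
import threading

NUM_WORKERS = 3
rows = []  # hypothetical shared store standing in for the real table

# Demo-only device: no worker proceeds to its insert until all workers
# have completed their check, which reproduces the unlucky interleaving.
all_checked = threading.Barrier(NUM_WORKERS)


def get_or_create_row_unsafe(row_id):
    # Step 1: check whether the row already exists (no lock held).
    existing = next((r for r in rows if r["id"] == row_id), None)
    all_checked.wait()
    if existing is not None:
        return existing
    # Step 2: create the row. Every worker reaches this point because
    # every check ran before any insert happened.
    row = {"id": row_id}
    rows.append(row)
    return row


def run_unsafe_workers():
    rows.clear()
    threads = [
        threading.Thread(target=get_or_create_row_unsafe, args=(3,))
        for _ in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(rows)


if __name__ == "__main__":
    print(run_unsafe_workers())  # 3: one copy of row 3 per worker
```

Without the barrier the same bug is still there, it just depends on timing, which is why it can slip through a single-instance test run.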
Not to worry though; this is not an uncommon situation.
Let’s take a cue from this Stack Overflow answer and change the definition of GetOrCreateRow:
Well yeah, it’s that simple!
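For readers following along without the original listing, the idea can be sketched in Python as below. This is not the post's C# code; `rows`, `rows_lock`, `get_or_create_row`, and `run_workers` are hypothetical names, and the in-memory list again stands in for the real table. The point it shows is that the check and the create sit inside one critical section, so only one worker at a time can run them.

```python
import threading

rows = []                     # hypothetical shared store
rows_lock = threading.Lock()  # guards every read-then-write on rows


def get_or_create_row(row_id):
    # The lookup and the insert happen under the same lock, so a second
    # worker cannot check for the row while the first is still creating it.
    with rows_lock:
        for row in rows:
            if row["id"] == row_id:
                return row
        row = {"id": row_id}
        rows.append(row)
        return row


def run_workers(num_workers=3, row_id=3):
    rows.clear()
    threads = [
        threading.Thread(target=get_or_create_row, args=(row_id,))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(rows)


if __name__ == "__main__":
    print(run_workers())  # 1: the row is created exactly once
```

Note that an in-process lock like this only coordinates threads inside a single process; separate instances of a worker would need a mechanism they can all see.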
The output of the program is shown below:
There are more ways to synchronize a multithreaded application and avoid these race conditions. Take a look at MutexOperation.cs to see the Mutex variation of this demo.
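MutexOperation.cs itself is not reproduced here. As a rough Python analogue of the wait-then-release pattern a mutex uses (in .NET, `WaitOne` followed by `ReleaseMutex`), the sketch below acquires the lock explicitly with a timeout and releases it in a `finally` block. The names `row_mutex`, `mutex_rows`, and `get_or_create_row_with_mutex`, and the five-second default timeout, are assumptions made for illustration.

```python
import threading

mutex_rows = []               # hypothetical shared store
row_mutex = threading.Lock()  # plays the role of the mutex


def get_or_create_row_with_mutex(row_id, timeout_seconds=5.0):
    # Wait for the mutex, but give up instead of blocking forever.
    if not row_mutex.acquire(timeout=timeout_seconds):
        raise TimeoutError(f"could not acquire the mutex for row {row_id}")
    try:
        for row in mutex_rows:
            if row["id"] == row_id:
                return row
        row = {"id": row_id}
        mutex_rows.append(row)
        return row
    finally:
        # Always release, even if the lookup or insert raises.
        row_mutex.release()
```

The explicit timeout is the main practical difference from the plain lock version: a worker that cannot get the mutex fails loudly rather than hanging.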
I also wanted to shed some light on how we leverage multithreading and mutex synchronization in our background services. We have the classic increment-decrement counter situation, and we use this counter to scale out our workers.
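A synchronized increment-decrement counter can be sketched in Python as follows. How the counter feeds into scaling decisions is not described at this point, so the sketch covers only the counter itself; `WorkerCounter` and `hammer_counter` are hypothetical names, not the service's actual types.

```python
import threading


class WorkerCounter:
    """A counter whose increments and decrements are mutually exclusive."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Read-modify-write under the lock so no update is lost.
        with self._lock:
            self._value += 1
            return self._value

    def decrement(self):
        with self._lock:
            self._value -= 1
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer_counter(counter, num_threads=8, rounds=1000):
    # Each thread increments and then decrements `rounds` times,
    # so a correctly synchronized counter ends where it started.
    def work():
        for _ in range(rounds):
            counter.increment()
            counter.decrement()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Returning the new value from `increment` and `decrement` lets a caller act on the count it just produced, rather than reading it in a second step that another thread could interleave with.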