Back to basics – Part 1..of..n-1 – Algorithm to Merge Sorted Sequences using C#

This is another episode of the “so called my sporadic” series, mentioned long long time ago. This time we’ll be discussing the biggest merger ever in the United States History, Merger of two Sorted Sequences :)  like arrays, lists etc. I’ll try to go from more concrete to generic one, like taking ‘int’ sequences as concrete type and end up with generic<t> that caters all types. [digg]

The algorithm is pretty simple, take a look by yourself:

Lets say we have two sorted ‘int’ arrays, shown below:

image

 

 

 

 

Here i & j represents pointers/indices to Left and Right arrays respectively and initially are at 0. Next we check that Left[i] >= Right[j], if yes we’ll take the right one and increment the i (i++), else if Left[i] < Right[i], we’ll take the left item and increment j (j++).  Until we reach to a point where one of the sequence is processed fully and there are some elements left in the other array. That is in our case, Left one will be processed earlier and there still have some elements left in the right array. This is what is called the skip-state that is pointed by ‘k’ index/pointer, where we take the rest of the elements from the right array and place them in the temporary array/container merged_array. The skip state is shown below with k pointer/index:

image

 

 

 

 

 

 

 

 

Here is the flow chart for the algorithm.

 

image

 

Now lets go into the implementation of it. I have implemented the algorithm in four ways:

  1. Concrete implementation, for ints without the Extension methods (greedy approach).
  2. Concrete implementation, for ints, using Extension methods (greedy approach).
  3. Generic implementation using Extension methods, greedy approach.
  4. Generic implementation using Extension methods, lazy approach (using yields).

Let’s take the top down approach and start with the usage of the above mentioned implementations. Take a closer look @ (1), (2), (3) & (4).

static void Main(string[] args)
{
    int[] left_array =  { 1, 3, 5, 7, 40 };
    int[] right_array = { 4, 6, 30, 30, 40, 50 };

    var results = MergeSortedArrays(left_array, right_array);  // (1)
   
    var results2 = left_array.MergeSortedArrays(right_array);  // (2) 
   
    // using generics, greedy approach.
    var results3 = left_array.MergeSortedSequences(right_array, (a1, a2) => (a1 >= a2) );  // (3) 
   
    // this one calls the Lazy one, with yield.
    var result4 = left_array
        .MergeSorted(right_array, (a1, a2) => (a1 >= a2))
        .ToArray();  // (4) 
}

1) Here is the Concrete implementation of the first one, for int  types without the Extension methods (greedy approach).

public static int[] MergeSortedArrays(int[] left_array, int[] right_array)
{
    // bounds checking.
    if (left_array == null ||
        right_array == null ||
        left_array.Count() == 0 ||
        right_array.Count() == 0)
        return null;

    // created an array to hold the results
    int count = left_array.Count() + right_array.Count();
    int[] merged_array = new int[count];

    // temp array points to the array that has to be processed during
    // skip state
    int[] temp_array = null;

    // at start the skip state is false, no skipping of data.
    bool skip_state = false;

    // i and j are pointers for both the arrays, while k is the pointer where skip
    // of data has to begin i.e. skipstate
    int i = 0;
    int j = 0;
    int k = 0;

    // initially the counter is zero, and
    // represents start of the traversal
    int counter = 0;

    // lets begon the traversal.
    while (counter < count)
    {
        if (left_array[i] < right_array[j])
        {
            merged_array[counter++] = left_array[i++];

            if (i == left_array.Count()) { skip_state = true; temp_array = right_array; k = j; }
        }
        else
        {
            merged_array[counter++] = right_array[j++];

            if (j == right_array.Count()) { skip_state = true; ; temp_array = left_array; k = i; }
        }

        // At skip state traverse the rest.
        if (skip_state)
        {
            while (counter < count)
            {
                merged_array[counter++] = temp_array[k++];
            }
        }
    }

    return merged_array;
}

2) Do we need the second while loop, I don’t think so, sounds redundant. Refined it, this one is with the Extension methods, and the arguments are now of type IEnumerable<>, instead of arrays[], we are getting close to generic solutions, correct! Since all collections implements IEnumerable<>, after little refactoring, it is now LINQ-able (that requires composable or fluent API which is based on chain of responsibility pattern). I’ll discuss this in more details, in coming sessions, on LINQ, that why we need IEnumerable<T> and IQueryable<T> etc. This one also is a greedy approach.

public static IEnumerable<int> MergeSortedArrays(this IEnumerable<int> left,
    IEnumerable<int> right)
{
    // bounds checking, we can consider throwing exception as well.
    if (left == null ||
        right == null ||
        left.Count() == 0 ||
        right.Count() == 0) return null;

    // created an array to hold the results
    int count = left.Count() + right.Count();
    int[] merged_array = new int[count];

    // temp array points to the array that has to be processed during
    // skip state
    IEnumerable<int> temp_array = null;

    // at start the skip state is false, no skipping of data.
    bool skip_state = false;

    // i and j are pointers for both the arrays, while k is the pointer where skip
    // of data has to begin i.e. skipstate
    int i = 0;
    int j = 0;
    int k = 0;

    // initially the counter is zero, and
    // represents start of the traversal
    int counter = 0;

    // lets begon the traversal.
    while (counter < count)
    {
        // At skip_state traverse the rest and loop back.
        if (skip_state)
        {
            merged_array[counter++] = temp_array.ElementAt(k++);
        }
        else if (left.ElementAt(i) >= right.ElementAt(j))
        {
            merged_array[counter++] = right.ElementAt(j++);

            if (j == right.Count()) { skip_state = true; ; temp_array = left; k = i; }
        }
        else
        {
            merged_array[counter++] = left.ElementAt(i++);

            if (i == left.Count()) { skip_state = true; temp_array = right; k = j; }
        }
    }

    return merged_array;
}

So all we need is the introduction of another ‘ if ‘ statement and no need of another nested while loop (always avoid redundancy). Seems like we are getting the Algorithm’s Complexity of O(n), very nice.

3) The generic approach, with extension method. Also, introducing, the comparer, a callback, or delegate of type Func<T,T,bool>, greedy as well:

public static IEnumerable<T> MergeSortedSequences<T>(this IEnumerable<T> left,
    IEnumerable<T> right,
    Func<T, T, bool> Comparer)
{
    // bounds checking, we can consider throwing exception as well.
    if (left == null ||
        right == null ||
        left.Count() == 0 ||
        right.Count() == 0) return null;

    // created an array/list to hold the results
    int count = left.Count() + right.Count();
    List<T> merged_array = new List<T>(count);

    // temp array points to the array that has to be processed during
    // skip state
    IEnumerable<T> temp_array = null;

    // at start the skip state is false, no skipping of data.
    bool skip_state = false;

    // i and j are pointers for both the arrays, while k is the pointer where skip
    // of data has to begin i.e. skipstate
    int i = 0;
    int j = 0;
    int k = 0;

    // initially the counter is zero, and
    // represents start of the traversal
    int counter = 0;

    // lets begon the traversal.
    while (counter < count)
    {
        // At skip_state traverse the rest and loop back.
        if (skip_state)
        {
            counter++;
            merged_array.Add(temp_array.ElementAt(k++));
        }
        else // first >= second, take second.
        if (Comparer(left.ElementAt(i), right.ElementAt(j)))
        {
            counter++;
            merged_array.Add(right.ElementAt(j++));

            if (j == right.Count()) { skip_state = true; ; temp_array = left; k = i; }

        }
        else // first < second, take first.
        {
            counter++;
            merged_array.Add(left.ElementAt(i++));

            if (i == left.Count()) { skip_state = true; temp_array = right; k = j; }
        }
    }

    return merged_array;
}

4) Here comes the final one, my favorite, using Extension method, Generic and Lazy (or Deferred Execution), I am using yield and the good part is no intermediary container is required in this case, as we saw in the previous implementations. This is now fully LINQ ready, and we can apply as many LINQ operations, as needed.

/// <summary>
/// Merges the sorted sequence using yields(lazy),
/// no temporary container required in this version.
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="left">The first/left sequence.</param>
/// <param name="right">The second/right sequence.</param>
/// <param name="Comparer">The comparer.</param>
/// <returns>IEnumerable<T></returns>

public static IEnumerable<T> MergeSorted<T>(this IEnumerable<T> left,
    IEnumerable<T> right,
    Func<T, T, bool> Comparer)
{
    // nulls/bounds checking.
    if (left == null ||
        right == null ||
        left.Count() == 0 ||
        right.Count() == 0)
    {
        yield break; // no need to proceed.
    }

    // created an array/list to hold the results
    int count = left.Count() + right.Count();

    // temporary container not required, in the 'yield' scenario, so commented out.
    //List<T> merged_array = new List<T>(count);

    // temp array points to the array that has to be processed during
    // skip state
    IEnumerable<T> temp_array = null;

    // at start the skip state is false, no skipping of data.
    bool skip_state = false;

    // i and j are pointers for both the arrays, while k is the pointer where skip
    // of data has to begin i.e. skipstate
    int i = 0;
    int j = 0;
    int k = 0;

    // initially the counter is zero, and
    // represents start of the traversal
    int counter = 0;

    // lets begon the traversal.
    while (counter < count)
    {
        // At skip_state traverse the rest and loop back.
        if (skip_state)
        {
            counter++;
            yield return temp_array.ElementAt(k++);
        }
        else // first >= second, take second.
        if (Comparer(left.ElementAt(i), right.ElementAt(j)))
        {
            counter++;
            yield return right.ElementAt(j++);

            if (j == right.Count()) { skip_state = true; ; temp_array = left; k = i; }
        }
        else // first < second, take first.
        {
            counter++;
            yield return left.ElementAt(i++);

            if (i == left.Count()) { skip_state = true; temp_array = right; k = j; }
        }
    }
}

The code is self explanatory. Although this is a simple algorithm, but main idea was to present it in different ways so the viewers will get different flavors of implementing it in .NET using C#.

That's all for now folks, I hope it was helpful. I appreciate you please leave your valuable feedback. Enjoy!

If you enjoyed reading this blog, leave your valuable feedback and consider subscribing to the RSS feed. You can also subscribe to it by email. Also, you can follow me on Twitter. Thank you!

Technorati Tags: ,,,

Comments (3) -

Hi friends, how is everything, and what you would like to say on the topic of this paragraph, in my view its in fact awesome in favor of me.

Tremendous issues here. I'm very glad to peer your post. Thanks a lot and I'm having a look ahead to contact you. Will you please drop me a mail?

google map การจราจร
10/18/2017 10:36:57 AM #

It is in point of fact a nice and helpful piece of info. I am glad that you shared this helpful info with us. Please keep us up to date like this. Thanks for sharing.

Greetings! I've been following your site for a while now and finally got the bravery to go ahead and give you a shout out from  Humble Tx! Just wanted to tell you keep up the great job!  My website ...  Аренда квартир в Аликанте - www.heydaylights.com/component/k2/author/13664

Wow, marvelous blog layout! How long have you ever been running a blog for? you make blogging look easy. The total look of your site is wonderful, let alone the content!

Add comment