Tuesday Feb 14, 2012

Funny world of Improv

Completed my first ever improv class yesterday at Westside Comedy and wanted to share my experience. First off, I had general idea about improv, but had no clue about the dynamics and other details. Improv is more than just telling jokes on the fly. Wikipedia defines Improvisation as “the practice of acting, singing, talking and reacting, of making and creating, in the moment and in response to the stimulus of one's immediate environment and inner feelings. This can result in the invention of new thought patterns, new practices, new structures or symbols, and/or new ways to act.” My primary motivation in joining this class was to practice improv speaking on any topic and do it in a manner that is funny and engaging. The skills learned can be applied in any field ranging from public speaking to general every day conversation in formal or informal setting.

It was fun and enriching experience. Through the course of 6 weeks, we went though many activities and games to learn the tools and structure of improv. It is not just about cracking jokes; to be successful in this, you need to be constantly aware of what is going on - be active listener, and work as team and support each other. It was fascinating to see how playing a mundane act with varied emotions can bring so much laughs. What you learn here can be so contrary to how we normally think, such as - if you feel weird, do it more - if you are stuck, say the truth. You learn to avoid arguments and go with the flow. It’s all about relationships filled with emotions. Sounds like valuable life lessons!!



All in all, while this class might not have made me any funnier, it definitely taught me bits and pieces about how to keep a conversation engaging and entertaining. Also, confidence to face audience and not be conscious whether I might be sounding puerile or preposterous - as it might not be a bad thing after all. Finally, confidence to crack jokes, even if no one is laughing!

Wednesday May 25, 2011

Finding Missing Value

Often times it is required to find a missing number in large set, say of 32-bit integers (n), in random order. For example, integer numbers used as keys for some records and the missing key(s) could be reused for new allocations or to find missing data/holes in the data-set. 

So, what is optimal way to find the missing value. Well, it depends. We have to further define the constraints. What is the size of data-set, how much virtual memory do we have - can the numbers be loaded into memory? What if we are constrained by the amount of memory but we have plenty of disk space. How about the case where we have some (limited) memory to partially fit in the input in memory. The optimal solution, as you might have guessed, depends on these constraints and this blog present various techniques that can be used to find the missing value for each of these cases. Mainly, following solutions for following three cases will be discussed. Also, we will note the running time for each.

  1. Plenty of virtual memory
  2. Very little virtual memory but plenty of disk space
  3. Some memory

Plenty of Virtual Memory

Solution 1 

We can define a bit vector of size equal to n + 1 all initialized to 0. Then by looping through the input numbers, for a given number set the corresponding position in the bit set to 1. Finally, scanning the bit set for a value that is not set to 1 will give us the missing number. This solution of course assume that plenty of virtual memory available to represent the input set in bit array. The running time of this algorithm is O(n), not counting the time it takes to build the bit set.

Solution 2

Sort the input array and then using binary search find the missing value.

Solution 3

Another possible solution for the case where only one number is missing in a range of numbers would be to take sum of numbers (can be obtained using arithmetic series - S = n/2(a1 + an)) and then subtract each number from the sum of the range. The result would be missing number. We of course need to be careful that the sum does not overflow the maximum allowed for the data-type.

Very Little Virtual Memory

When we are limited by virtual memory and cannot load the input array into memory, we cannot do either bit check or sort, hence these solutions will not work. Assuming we have plenty of disk space, a solution similar to binary search (without the sort step) can be used to find missing number(s). 

We will successively divide the input into two files half the size of input and write each to the disk. The trick is to divide the input into two halves based on most significant bit (MSB). One file will have all numbers with MSB value of 1 and other file with numbers starting with 0 as MSB. Each successive iteration will reduce the input array into half and the MSB for dividing the file will also move one step to right (towards least significant bit or LSB).

Since we know the maximum capacity for a given number of bits, the max elements for each half is known. Finding out the size of each file and knowing the total capacity, we can pick the half which has number of items less than the capacity and discard the other file. Note that it is possible that the other half also have missing numbers (the size less than the capacity), but since we are interested in finding a missing value (not all missing values), it does not matter which file (if the size of both is less than the capacity) we choose. Repeat the process over the selected file (now approx half the size) now considering the next MSB. Continue doing this till we have reached all but the last bit (LSB). At this point the input will have only one (or none) element remaining. Knowing the the possible values that can be represented for already found MSB, and the original item(s) remaining (for this range), we can easily find the missing number(s).

Also, note that in case there are range of missing values adjacent to each other, the solution may converge earlier and there is no need to traverse from MSB all the way to LSB. 

Lets understand the above solution using simple example. Assume 4-bit integer (range 0 - 16) with missing value of 12 (1100). Here is the input set

input (integer) {5   7   15   8   1   9   4   0   3   10   13   6   14   2   11} 
input (binary)  {0101  0111 1111 1000 0001 1001 0100 0000 0011 1010 1101 0110 1110 0010 1011}

In the first pass, we serially read the input and divide it into two arrays based on most significant bit and write them to the disk

first file      (bit 1xxx) {1111 1000 1001 1010 1101 1110 1011}
second file( bit 0xxx) {0101 0111 0001 0100 0000 0011 0110 0010}

Thus the two files sizes obtained are 7 (with MSB as 1) and 8 (MSB of 0). The max capacity of each being 8 (the total capacity ignoring the MSB). 

Thus looking at the size of two files, it is clear that the first file contains the missing number. At this point we can discard the second file and repeat the above process for the first file using the second most significant bit to obtain two files:

first file       (bit 11xx) {1111 1101 1110}
second file  (bit 10xx) {1000 1001 1010 1011}

Above clearly suggests that the missing number must be in the range 1100 - 1111. Discarding the second file and repeating the same using the third most significant bit, we obtain following:

first file       (bit 111x) {1111 1110}
second file  (bit 110x) {1101}

This we conclude that the missing value is 1100 (the only remaining value for second file above).

In conclusion, the using the above mechanism we successively divided the input array into half. Thus for array of size n, we can find the missing value using log(n) passes, with each pass reducing the input size to n, n/2, n/4.....(geometric series sum = n/(1-1/2)= 2n) producing running time proportional to n.  

Refer code at the end for implementation of above idea. 

Some Memory 

Where we do have some memory to work with, the solution could be hybrid of above. Check first few significant bit to reduce the problem size (input array) so that it is able to fit in the memory. Then you can employ any of the solution listed in Case 1 above. For example, for an input size of 232 (approx 4 billion values), checking 10 most significant bit can reduce the input to the order of 1000 (approx 4 million values) which might fit into memory.

Code

(click here to download

/*
 * Copyright (c) 2010-2020 Malkit S. Bhasin. All rights reserved.
 * 
 * All source code and material on this Blog site is the copyright of Malkit S.
 * Bhasin, 2010 and is protected under copyright laws of the United States. This
 * source code may not be hosted on any other site without my express, prior,
 * written permission. Application to host any of the material elsewhere can be
 * made by contacting me at mbhasin at gmail dot com
 * 
 * I have made every effort and taken great care in making sure that the source
 * code and other content included on my web site is technically accurate, but I
 * disclaim any and all responsibility for any loss, damage or destruction of
 * data or any other property which may arise from relying on it. I will in no
 * case be liable for any monetary damages arising from such loss, damage or
 * destruction.
 * 
 * I further grant you ("Licensee") a non-exclusive, royalty free, license to
 * use, modify and redistribute this software in source and binary code form,
 * provided that i) this copyright notice and license appear on all copies of
 * the software;
 * 
 * As with any code, ensure to test this code in a development environment
 * before attempting to run it in production.
 */

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * The following program finds missing a value in input array. It takes an input
 * array and the bit size of the maximum value (the number of bits needed to
 * represent the represent max value) and returns missing number, if any. The
 * algorithm fully loads the input in memory and require additional memory of
 * size equal to input, however can be modified to work in limited memory
 * environment (refer comments in the end).
 * 
 * Also, note the following: 1. In certain cases, the implementation might
 * return two contiguous missing values (as the case might be). 2. The utility
 * does not return all the missing values, but at least one value (assuming list
 * has missing values). 3. The bitSize parameter can be derived from input, but
 * that would require scanning the input to find the maximum value.
 * 
 * The implementation works by inspecting the Most Significant bit and work like
 * binary search in that the input arrays is successively reduced to two halves
 * during iteration and then selects the one that is guaranteed to have at least
 * one missing value.
 * 
 * The running time (asymptotic) of this algorithm is O(log(n)) which is much
 * better than sort followed by binary search (total of O(nlog(n) + log(n)).
 * 
 * Another value of this algorithm is that for the case where the virtual memory
 * is limited and the input array cannot be fit into memory, modification of
 * this algorithm can be used to read the input stream and write the reduced
 * files back to file system with each step reducing the input size by half.
 * 
 * @author Malkit S. Bhasin
 */
public class MissingNumber {

	/**
	 * Returns a missing value (if any) in the input array. If none
	 * is found, empty set is returned.
	 * 
	 * @param input array of integers
	 * @param bitSize the size of maximum value in input array
	 * @return
	 */
	static Set findMissing(int[] input, int bitSize) {
		if (bitSize < 0) {
			throw new IllegalArgumentException("bitSize value cannot be nagative");
		}
		
		// convert the input array into list
		List list = asList(input);
		
		StringBuilder result = new StringBuilder(bitSize);
		
		// this represents the maximum capacity of half array 
		// during successive iteration
		int maxCapacity = -1;

		// binary representation of int value
		String binary = null;

		// traversing from most significant bit (msb) to least significant bit (lsb)  
		for (int i = 0; i < bitSize; i++) {
			// temp arrays to store values beginning with 1 bit and 0 bit
			// (for current bit index -> i)
			List ones = new ArrayList();
			List zeros = new ArrayList();
			
			// Calculate max number of values that can fit in 'ones' and 'zeros' array
			maxCapacity = (int) Math.pow(2, bitSize - (i + 1));

			for (int j = 0; j < list.size(); j++) {
				binary = toBinaryString(list.get(j), bitSize);
				if (binary.charAt(i) == '1') {
					ones.add(list.get(j));
				} else {
					zeros.add(list.get(j));
				}
			}
			
			if (ones.size() < maxCapacity) {
				// this indicate that there is definitely at least one missing value
				// in this array (ones array). Note that this does not mean 
				// that there is no missing value in zeros array. Since we just 
				// to find a missing value, we dont have to check other array..
				// As we progress through the msb to lsb, we collect the bits
				// in result which will be used later to find the missing value
				result.append("1");
				list = ones;
			} else if (zeros.size() < maxCapacity) {
				// indicate that the missing value in zeros array
				result.append("0");
				list = zeros;
			} else {
				// no missing element.. return empty set
				return new HashSet();
			}

			// this means that successive reduction of input array
			// has resulted in array of size 1 or even 0 (the case
			// where both the elements are missing - see test3 below..)
			if (list.size() <= 1) {
				return getMissingInt(list, result.toString(), bitSize);
			}
		}
		return new HashSet();
	}

	/**
	 * Here the passed list is the values for a narrow range which has
	 * other value(s) missing. The logic works by find all the values
	 * that will fit in this range then striking off the values that
	 * are already present in this range (as identified by list).
	 * 
	 * @param list
	 * @param result
	 * @param bitSize
	 * @return
	 */
	private static Set getMissingInt(List list,
			String result, int bitSize) {
		Set resultSet = getValuesForRange(result, bitSize);
		for (Integer i : list) {
			resultSet.remove(i);
		}
		return resultSet;
	}

	/**
	 * Returns all possible values for a given partially filled most significant bits. 
	 * The method is passed binary string containing most significant bits for total bitSize. 
	 * For example, if the result value is "110" and bitSize is 4, this means
	 * that the possible values that can fit in are "1100" and "1101" and these
	 * will be returned as result.
	 * 
	 * @param result
	 * @param bitSize
	 * @return
	 */
	private static Set getValuesForRange(String result, int bitSize) {
		String start = padRight(result, bitSize).replace(' ', '0');
		String end = padRight(result, bitSize).replace(' ', '1');
		int startInt = Integer.valueOf(start, 2);
		int endInt = Integer.valueOf(end, 2);
		Set resultSet = new HashSet();
		for (int i = startInt; i <= endInt; i++) {
			resultSet.add(i);
		}
		return resultSet;
	}

	private static List asList(int[] input) {
		List list = new ArrayList();
		for (int index = 0; index < input.length; index++) {
			list.add(input[index]);
		}
		return list;
	}

	private static String padRight(String s, int n) {
		return String.format("%-" + n + "s", s);
	}

	/**
	 * Converts the input integer to equivalent binary string representation.
	 * Also, the returned value is left padded with 0's.
	 * @param value
	 * @param bitSize
	 * @return
	 */
	private static String toBinaryString(int value, int bitSize) {
		String binary = Integer.toBinaryString(value);
		binary = (String.format("%" + bitSize + "s", binary)).replace(' ', '0');
		return binary;
	}

	private static void printResult(String testName, Set result) {
		System.out.printf("%6s: %-10s %n", testName, result);
	}

	/**
	 * @param args
	 */
	public static void main(String[] args) {
		// the following test has one element (2) missing, 2 bits are sufficient to
		// represent largest value (3)
		int[] input1 = {0, 1, 3};
		printResult("test1", findMissing(input1, 2));
		
		// the following test has one element - 12 missing
		int[] input2 = {5, 7, 15, 8, 1, 9, 4, 0, 3, 10, 13, 6, 14, 2, 11};
		printResult("test2", findMissing(input2, 4));
		
		// the list has two elements - 12, 13 missing
		int[] input3 = {5, 7, 15, 8, 1, 9, 4, 0, 3, 10, 6, 14, 2, 11};
		printResult("test3", findMissing(input3, 4));
		
		// the list has maximum value of 31, which means it needs at least 5 bits. The missing value is 15
		int[] input4 = {5, 7, 8, 1, 9, 4, 0, 3, 10, 13, 6, 14, 2, 11, 12, 16, 17, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 29, 28, 31, 30};
		printResult("test4", findMissing(input4, 5));

		// here all the values from 0 - 28 are present, but the input array needs 5 bits (because
		// maximum value is 28). The utility returns missing value(s) that can fit in this 
		// bit range (29, 30, 31). This is by design. If strictly values between the 
		// lower and upper values of input are needed, the implementation can scan through
		// the input to find the min and max and return missing values between this range.
		int[] input5 = {5, 7, 8, 1, 9, 4, 0, 3, 10, 13, 6, 14, 2, 11, 12, 16, 17, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 28};
		printResult("test5", findMissing(input5, 5));
	}
}


Thursday Jan 13, 2011

Kth Order Statistics/Partial ordering – Using Tournament Algorithm (optimized)

This blog entry is in continuation to my previous entries about finding 2nd Minimum and more generic Kth Order statistic in linear time (asymptotic) using dynamic programming principles and tournament algorithm in particular. Here I talk about slight variation to the tournament algorithm implementation that results in better runtime performance. Using modification to our earlier logic we can find Kth min without any need for back-tracking. We still need to build the tournament tree, but the new mechanism results in better runtime efficiency and is also simpler to implement (compared to what we did earlier).

The Idea

The idea is that if keep track of nodes a given node is compared at each level as the node progress in tournament method, we have sufficient information to deduct Kth minimum without ever back tracking the tournament tree. To find second minimum, we inspect the root node’s defeated-node-list for the minimum value. Once we have second minimum; for finding third min we combine second min node’s defeated-node-list to minimum (root) node's defeated node-list and find the third minimum from the resulting set. This method may need some additional processing while constructing tournament tree (and perhaps little more memory), but it is worth the effort as it saves us from backtracking tournament tree to find Kth min.

Details

Referring to the same sample list used previously. If we were to keep track of defeated nodes, we can obtain something like the following:

We wrap the input data into a node structure which can encapsulate the key and also the value (in typical scenario there will be some data value that these keys would be pointing to). In addition to key and value, each node contains a list (defeated-nodes linked-list) of nodes. As a node progress to next level, it appends the defeated node to its defeated-node-list.

After constructing the tournament tree and generating node-list of defeated nodes, for each node, can easily find the Kth minimum by traversing the graph from root-node.

For example, here's how we can find 2nd, 3rd and 4th min for the example above:

2nd Minimum: Looking at the node-list for root node, we can find the second minimum without any back-tracking. Since the root node must have defeated second min node on its way to last level, second min must be the minimum node in the defeated-node-list for the root node. For the case above from the node-list we can find the second minimum value as 2.

3rd Minimum: To find third minimum, we need to go to second minimum node, get its node-list and then find the third minimum from among the combined list for the minimum and second minimum. Since we already have node with value 2 in node-list of root node, we obtain its node list and find the third least from the resulting set. For the above case it will be 3 (the node with key 2 returns 14 and 5 which is larger than 3 which root node holds).

4th Minimum: Continuing the same idea, we obtain node-list for node with key 3 (found in node-list of root); combine it with node-list of min and second min, to find the 4th min from the resulting set.

In general, the root node has node-list which is log(n) size which contains the second min. From second min we can obtain its node-list and combined with node-list of min, we can obtain third min and so on. This mechanism does not require any backtracking, once we build node-list for each progressing node. 

Code

Can also be downloaded from here -  KThMinimum2.java

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/\*\*
 \* Copyright (c) 2010-2020 Malkit S. Bhasin. All rights reserved.
 \* 
 \* All source code and material on this Blog site is the copyright of Malkit S.
 \* Bhasin, 2010 and is protected under copyright laws of the United States. This
 \* source code may not be hosted on any other site without my express, prior,
 \* written permission. Application to host any of the material elsewhere can be
 \* made by contacting me at mbhasin at gmail dot com
 \* 
 \* I have made every effort and taken great care in making sure that the source
 \* code and other content included on my web site is technically accurate, but I
 \* disclaim any and all responsibility for any loss, damage or destruction of
 \* data or any other property which may arise from relying on it. I will in no
 \* case be liable for any monetary damages arising from such loss, damage or
 \* destruction.
 \* 
 \* I further grant you ("Licensee") a non-exclusive, royalty free, license to
 \* use, modify and redistribute this software in source and binary code form,
 \* provided that i) this copyright notice and license appear on all copies of
 \* the software;
 \* 
 \* As with any code, ensure to test this code in a development environment
 \* before attempting to run it in production.
 \* 
 \* @author Malkit S. Bhasin
 \* 
 \*/

public class KThMinimum2 {

	/\*\*
	 \* @param inputList
	 \*            unordered linked list of Node<T,X>
	 \* @param k
	 \*            order of minimum value desired
	 \* @return kth minimum Node<T,X>
	 \* @throws Exception
	 \*/
	public static Node<Integer, Object> getKthMinimum(
			List<Node<Integer, Object>> inputList, int k) throws Exception {
		return findKthMinimum(inputList, k).get(k - 1);
	}

	/\*\*
	 \* @param inputList
	 \*            unordered linked list of non-negative integers
	 \* @param k
	 \*            ordered number of minimum values
	 \* @return k ordered minimum values
	 \* @throws Exception
	 \*/
	public static List<Node<Integer, Object>> getMinimumKSortedElements(
			List<Node<Integer, Object>> inputList, int k) throws Exception {
		return findKthMinimum(inputList, k);
	}

	/\*\*
	 \* First output tree will be obtained using tournament method. During each
	 \* round, adjacent nodes are compared against, the lower of two moves to
	 \* next round. The defeated node is added to list of defeated nodes
	 \* maintained at each node.
	 \* 
	 \* Kth minimum is obtained by going through list of defeated nodes, starting
	 \* from the root node (minimum node).
	 \* 
	 \* @param inputArray
	 \* @param k
	 \* @return ordered array of k minimum elements
	 \* @throws Exception
	 \*/
	private static List<Node<Integer, Object>> findKthMinimum(
			List<Node<Integer, Object>> inputList, int k) throws Exception {
		List<List<Node<Integer, Object>>> outputTree = getOutputTree(inputList);
		// printList(outputTree);
		return getKthMin(outputTree, k);
	}

	private static List<Node<Integer, Object>> getKthMin(
			List<List<Node<Integer, Object>>> outputTree, int k)
			throws Exception {
		int size = outputTree.size();
		if (k <= 0) {
			throw new IllegalArgumentException(
					"Please provide valid integer argument k [0," + size + ")");
		}

		List<Node<Integer, Object>> orderedList = new LinkedList<Node<Integer, Object>>();
		Node<Integer, Object> rootNode = outputTree.get(size - 1).get(0);
		orderedList.add(rootNode);
		if (k == 1) {
			return orderedList;
		}

		List<Node<Integer, Object>> defeatedNodes = new LinkedList<Node<Integer, Object>>();
		defeatedNodes.addAll(rootNode.getDefeatedNodesList());
		Node<Integer, Object> ithMinNode = rootNode;

		for (int i = 1; i < k; i++) {
			ithMinNode = getIthMinNode(defeatedNodes, ithMinNode);
			defeatedNodes.addAll(ithMinNode.getDefeatedNodesList());
		}

		Node<Integer, Object> kthMinNode = rootNode;
		for (int i = 1; i < k; i++) {
			kthMinNode = getIthMinNode(defeatedNodes, kthMinNode);
			orderedList.add(kthMinNode);
		}
		return orderedList;
	}

	private static Node<Integer, Object> getIthMinNode(
			List<Node<Integer, Object>> defeatedNodes,
			Node<Integer, Object> iMinusOneTh) {
		Node<Integer, Object> ithMin = new KThMinimum2().new Node(
				Integer.MAX_VALUE, Integer.MAX_VALUE);
		for (Node<Integer, Object> n : defeatedNodes) {
			if (n.compareTo(iMinusOneTh) < 0) {
				continue;
			}
			if ((n.compareTo(iMinusOneTh) > 0) && (n.compareTo(ithMin) < 0)) {
				ithMin = n;
			}
		}
		return ithMin;
	}

	/\*\*
	 \* Takes an input array and generated a two-dimensional array whose rows are
	 \* generated by comparing adjacent elements and selecting minimum of two
	 \* elements. Thus the output is inverse triangle (root at bottom)
	 \* 
	 \* @param values
	 \* @return
	 \*/
	public static List<List<Node<Integer, Object>>> getOutputTree(
			List<Node<Integer, Object>> inputList) {
		Integer size = new Integer(inputList.size());
		double treeDepth = Math.log(size.doubleValue()) / Math.log(2);
		int intTreeDepth = (int) (Math.ceil(treeDepth)) + 1;
		List<List<Node<Integer, Object>>> outputTreeList = new ArrayList<List<Node<Integer, Object>>>();

		// first row is the input
		outputTreeList.add(inputList);
		// printRow(inputList);

		List<Node<Integer, Object>> currentRow = inputList;
		List<Node<Integer, Object>> nextRow = null;
		for (int i = 1; i < intTreeDepth; i++) {
			nextRow = getNextRow(currentRow);
			// printRow(nextRow);
			outputTreeList.add(nextRow);
			currentRow = nextRow;
		}
		return outputTreeList;
	}

	/\*\*
	 \* Compares adjacent elements (starting from index 0), and construct a new
	 \* array with elements that are smaller of the adjacent elements.
	 \* 
	 \* For even sized input, the resulting array is half the size, for odd size
	 \* array, it is half + 1.
	 \* 
	 \* @param values
	 \* @return
	 \*/
	private static List<Node<Integer, Object>> getNextRow(
			List<Node<Integer, Object>> values) {
		int rowSize = getNextRowSize(values);
		List<Node<Integer, Object>> row = new ArrayList<Node<Integer, Object>>(
				rowSize);
		int i = 0;
		for (int j = 0; j < values.size(); j++) {
			if (j == (values.size() - 1)) {
				// this is the case where there are odd number of elements
				// in the array. Hence the last loop will have only one element.
				// row[i++] = values.get(j);
				row.add(values.get(j));
			} else {
				row.add(getMin(values.get(j), values.get(++j)));
			}
		}
		return row;
	}

	/\*\*
	 \* Returns minimum of two passed in values.
	 \* 
	 \* @param num1
	 \* @param num2
	 \* @return
	 \*/
	private static Node<Integer, Object> getMin(Node<Integer, Object> node1,
			Node<Integer, Object> node2) {
		if (node1.compareTo(node2) < 0) {
			node1.addDefeatedNode(node2);
			return node1;
		} else {
			node2.addDefeatedNode(node1);
			return node2;
		}
	}

	/\*\*
	 \* following uses Math.ceil(double) to round to upper integer value..since
	 \* this function takes double value, diving an int by double results in
	 \* double.
	 \* 
	 \* Another way of achieving this is for number x divided by n would be -
	 \* (x+n-1)/n
	 \* 
	 \* @param values
	 \* @return
	 \*/
	private static int getNextRowSize(List<Node<Integer, Object>> values) {
		return (int) Math.ceil(values.size() / 2.0);
	}

	private static void printRow(List<Node<Integer, Object>> inputList) {
		for (Node<Integer, Object> n : inputList) {
			System.out.print(n.getKey() + " ");
		}
		System.out.println(" ");
	}

	private static void printList(List<List<Node<Integer, Object>>> outputTree) {
		for (List<Node<Integer, Object>> list : outputTree) {
			for (Node<Integer, Object> n : list) {
				System.out.print(n.getKey() + ": ");
				for (Node<Integer, Object> defeatedNode : n
						.getDefeatedNodesList()) {
					System.out.print(defeatedNode.getKey() + " ");
				}
				System.out.println("");
			}
		}
	}

	private static List<Node<Integer, Object>> getNodeList(int[] input) {
		List<Node<Integer, Object>> list = new ArrayList<Node<Integer, Object>>();
		Node<Integer, Object> node = null;
		for (int i : input) {
			node = new KThMinimum2().new Node<Integer, Object>(i, i);
			list.add(node);
		}
		return list;
	}

	public static void main(String args[]) throws Exception {
		int[] input = { 2, 14, 5, 13, 1, 8, 17, 10, 6, 12, 9, 4, 11, 15, 3, 16 };
		List<Node<Integer, Object>> inputList = getNodeList(input);
		System.out.println("Fifth Minimum: " + getKthMinimum(inputList, 5));
		int minimumSortedElementSize = 10;
		List<Node<Integer, Object>> tenMinimum = getMinimumKSortedElements(
				inputList, minimumSortedElementSize);
		System.out.print("Minimum " + minimumSortedElementSize + " Sorted: ");
		for (int i = 0; i < minimumSortedElementSize; i++) {
			System.out.print(tenMinimum.get(i) + " ");
		}
	}

	class Node<T extends Comparable<T>, X> implements Comparable {
		private T key;
		private X value;
		private List<Node<T, X>> list = new LinkedList<Node<T, X>>();

		public Node(T key, X value) {
			super();
			this.key = key;
			this.value = value;
		}

		public void addDefeatedNode(Node<T, X> node) {
			list.add(node);
		}

		public List<Node<T, X>> getDefeatedNodesList() {
			return list;
		}

		@Override
		public int compareTo(Object o) {
			Node<T, X> other = (Node<T, X>) o;
			return key.compareTo(other.getKey());
		}

		public T getKey() {
			return key;
		}

		public X getValue() {
			return value;
		}

		@Override
		public String toString() {
			return String.valueOf(key);
		}
	}
}

Running Time

The asymptotic running time is similar to one obtained earlier O(n + klog(n)), but this implementation require fewer comparisons compared to one done while doing back-tracking of tournament tree. The gain of course is at the cost of maintaining node-list of defeated nodes at node that progresses to next level, which is not significant.

Final Thoughts

One of the drawback of Tournament Algorithm is that it is not in place algorithm - requires additional memory (of the order of original array) to store the tournament tree. The above algorithm further require maintaining lists to nodes (for the nodes progressing to next levels), hence put some additional burden on memory requirement (the burden is minimal as only references are kept, not additional node construction), and also some additional processing. But the advantage of the ‘tweaking’ above results in improved performance and much simpler implementation.

Can we use some in-place algorithms such as Quicksort or Heapsort to find Kth min key in efficient manner? What if we are constrained by the virtual memory and the input array cannot fit into memory. Other algorithms exists which can be employed to find Kth order key in linear time such as Counting Sort and Median Order Statistics. I will talk about some of these in my upcoming entries. Stay tuned.

Tuesday Dec 28, 2010

Finding Kth Minimum (partial ordering) – Using Tournament Algorithm

Ok, so far in my previous blog entries about finding 2nd minimum (and for repeating elements) I wrote about efficient algorithm for finding second minimum element of an array. This optimized algorithm takes O(n + log(n)) time (worst case) to find the second minimum element.
So, the next obvious question is - how can we find Kth minimum element efficiently in an unsorted array where k ranges from 1 – n. How about partial sorting - efficiently returning first K minimum values from unsorted array. As you might have guessed it, we can achieve this by extending the logic used before and tweaking our implementation little bit. We can find Kth min (or return partially sorted list of K min elements) in O(n + klog(n)) time.

Design Details

 For simplicity we will assume that the input unsorted array consists of non-repeating, non-negative integer numbers only. Solution can be extended for arrays with repeating numbers using similar considerations as outlined in my blog for finding second min for repeating values.

As noted in the previous blog, using tournament method we can build output tree by comparing the adjacent elements and moving the lower of the two to next level. Repeating this process yields a tree with log(n) depth and the root element of this tree represent the minimum value. Also, the second minimum can then easily be obtained by finding the minimum of all the values that the root was compared against at each level.  Since the depth of tree is log(n), we have to do at most O(log(n)) comparisons. For example, Figure 1 below uses tournament method for finding the root element. For this particular case, the minimum value is 1 and second minimum 2.

Let’s say we want to find the 3rd minimum value. Extending the logic used to find second minimum, the third minimum value must have met the root (minimum) or the second minimum before ending its progression (just like winning a tournament round) to next level of the tree (as smaller of the two compared values move to next level). Thus 3rd minimum can be obtained by finding the least of all the values compared against the minimum and the second minimum, i.e. backtracking the root (minimum) and the second minimum up to the top of tree (original array) and noting all values each (root and second minimum) was compared against.

 Take the sample in Figure 1. Here is the list of values obtained by back-tracking the root (call it Set A).

Set A (for root) - [3, 2, 10, 8]

From this list we obtain the second minimum as 2. Back-tracking from 2, we obtain following list (Set B):

Set B (for Second minimum) [5, 14]

Thus the third minimum is the least value in union of set A and B, which is 3 (ignoring value 2, which was already established as second minimum). Note that the 3rd minimum must be in either Set A or Set B. In above case it was found from Set A. 

How about if we have to find the 4th minimum? Well, we have to back track the 3rd least value and collect adjacent values a (one level up). Then add this set to the values for root and second minimum. Then this set will contain the 2nd, 3rd and 4th minimum. Continuing our sample, let’s say Set C consists of such elements obtained by backtracking 3rd min – [4, 11, 16]. Thus the minimum value in union of Set A, B, C is 4 (ignoring already established 2 and 3 as second min and third min resp.), is fourth least value. Note that for calculating kth minimum we don’t backtrack and find the adjacency list for minimum through k -1. We can use already calculated results from any previous such computation. This is the reason why this technique falls under Dynamic Programming – we have optimal substructure and avoid repeated computations. 

Now it is time to formalize. To find Kth minimum, back track root, second min, third min up to k-1 min and collect the adjacent values for each - moving from the root (of sub-tree) all the way to the top. The resulting collection will contain the 2nd, 3rd … Kth minimum.


Implementation Details

To obtain Kth minimum, we need collection of all the comparisons from minimum (root) to K-1. Thus we need to make following changes to our previous code for finding second minimum.

  1. Return array of values (as indicated in rectangle boxes in Figure 2) while back-tracking root (could also be sub-tree root). Also, in addition to the values, we need the level and also index information for of each of these values. The reason why we need the level and index is because once we determine that a particular value from this list is minimum, we need to back-track the sub-tree from that value (hence we need to know the location of the element at that level) on order to obtain new collection to determine next lower value. Thus we will return a two-dimensional array with first index denoting the value and the other one the index. We can infer the level from the index of the pair (in 2-d array).
    For example, for the case above, for the first pass (to find the minimum value), the following will be returned (let’s call it adjacency list):


    Thus from above array we can conclude that the second minimum value is 2 (least of first value of all pairs), and this value appears two levels above the root (level 5), at 0th index (at this level).  We will use this info to identify the sub-tree for out next round of back-tracking.

  2. Change the api to take back-track a portion of tree. In order to achieve that we need to pass level and index to define the root of sub-tree to back track from.

  3. The 2-dimensional list obtained from each run will be added to a 3-dimensional list – full adjacency list for min to K-1 min elements. 3-D list is used rather than merging a bigger 2-D list to preserve and identify the results for a particular run. The reason being once we identify the minimum element for all k-1 runs, the level of that particular element is obtained from particular list where the minimum was found (alternatively our backtracking api could have returned a 3-d array with level info as one more dimension, either way we need this info).

  4. Api that take full adjacency list and min value and return the next min element info. The api should identify for which run it found the min and the index of element within that run. Refer Figure 4 below. When this list is passed to this api with value of second min (2), it returns (0,3), which is interpreted as the next min (third min) was found in first array (obtained from first backtrack) and 4th element within that array. Once we have this info, we can look into first array and locate the third min value as 3, at index 1 at level 3 (remember original array is level 1 – Figure 2). Note that for m elements in full adjacency list we are making m comparisons which seem not efficient at first glance, but since the size of the adjacency list is small O(log(n)), hence it is not significant. 

  5. To find Kth min, we need to find min through k. Hence this algorithm can return partially sorted array up-to k elements. 

Code 

Can be downloaded from here - KthMinimum.java (remove the .txt extension)

/\*\*
 \* Copyright (c) 2010-2020 Malkit S. Bhasin. All rights reserved.
 \* 
 \* All source code and material on this Blog site is the copyright of Malkit S.
 \* Bhasin, 2010 and is protected under copyright laws of the United States. This
 \* source code may not be hosted on any other site without my express, prior,
 \* written permission. Application to host any of the material elsewhere can be
 \* made by contacting me at mbhasin at gmail dot com
 \* 
 \* I have made every effort and taken great care in making sure that the source
 \* code and other content included on my web site is technically accurate, but I
 \* disclaim any and all responsibility for any loss, damage or destruction of
 \* data or any other property which may arise from relying on it. I will in no
 \* case be liable for any monetary damages arising from such loss, damage or
 \* destruction.
 \* 
 \* I further grant you ("Licensee") a non-exclusive, royalty free, license to
 \* use, modify and redistribute this software in source and binary code form,
 \* provided that i) this copyright notice and license appear on all copies of
 \* the software;
 \* 
 \* As with any code, ensure to test this code in a development environment
 \* before attempting to run it in production.
 \* 
 \* @author Malkit S. Bhasin
 \* 
 \*/

public class KThMinimum {

	/\*\*
	 \* @param inputArray
	 \*            unordered array of non-negative integers
	 \* @param k
	 \*            order of minimum value desired
	 \* @return kth minimum value
	 \*/
	public static int getKthMinimum(int[] inputArray, int k) {
		return findKthMinimum(inputArray, k)[k - 1];
	}

	/\*\*
	 \* @param inputArray
	 \*            unordered array of non-negative integers
	 \* @param k
	 \*            ordered number of minimum values
	 \* @return k ordered minimum values
	 \*/
	public static int[] getMinimumKSortedElements(int[] inputArray, int k) {
		return findKthMinimum(inputArray, k);
	}

	/\*\*
	 \* First output tree will be obtained using tournament method. For k
	 \* minimum, the output tree will be backtracked k-1 times for each sub tree
	 \* identified by the minimum value in the aggregate adjacency list obtained
	 \* from each run. The minimum value after each run will be recorded and
	 \* successive runs will produce next minimum value.
	 \* 
	 \* @param inputArray
	 \* @param k
	 \* @return ordered array of k minimum elements
	 \*/
	private static int[] findKthMinimum(int[] inputArray, int k) {
		int[] partiallySorted = new int[k];
		int[][] outputTree = getOutputTree(inputArray);
		int root = getRootElement(outputTree);
		partiallySorted[0] = root;
		int rootIndex = 0;
		int level = outputTree.length;
		int[][][] fullAdjacencyList = new int[k - 1][][];
		int[] kthMinIdx = null;
		for (int i = 1; i < k; i++) {
			fullAdjacencyList[i - 1] = getAdjacencyList(outputTree, root,
					level, rootIndex);
			kthMinIdx = getKthMinimum(fullAdjacencyList, i, root);
			int row = kthMinIdx[0];
			int column = kthMinIdx[1];
			root = fullAdjacencyList[row][column][0];
			partiallySorted[i] = root;
			level = column + 1;
			rootIndex = fullAdjacencyList[row][column][1];
		}

		return partiallySorted;
	}

	/\*\*
	 \* Takes an input array and generated a two-dimensional array whose rows are
	 \* generated by comparing adjacent elements and selecting minimum of two
	 \* elements. Thus the output is inverse triangle (root at bottom)
	 \* 
	 \* @param values
	 \* @return
	 \*/
	public static int[][] getOutputTree(int[] values) {
		Integer size = new Integer(values.length);
		double treeDepth = Math.log(size.doubleValue()) / Math.log(2);
		// int intTreeDepth = getIntValue(Math.ceil(treeDepth)) + 1;
		int intTreeDepth = (int) (Math.ceil(treeDepth)) + 1;
		int[][] outputTree = new int[intTreeDepth][];

		// first row is the input
		outputTree[0] = values;
		printRow(outputTree[0]);

		int[] currentRow = values;
		int[] nextRow = null;
		for (int i = 1; i < intTreeDepth; i++) {
			nextRow = getNextRow(currentRow);
			outputTree[i] = nextRow;
			currentRow = nextRow;
			printRow(outputTree[i]);
		}
		return outputTree;
	}

	/\*\*
	 \* Compares adjacent elements (starting from index 0), and construct a new
	 \* array with elements that are smaller of the adjacent elements.
	 \* 
	 \* For even sized input, the resulting array is half the size, for odd size
	 \* array, it is half + 1.
	 \* 
	 \* @param values
	 \* @return
	 \*/
	private static int[] getNextRow(int[] values) {
		int rowSize = getNextRowSize(values);
		int[] row = new int[rowSize];
		int i = 0;
		for (int j = 0; j < values.length; j++) {
			if (j == (values.length - 1)) {
				// this is the case where there are odd number of elements
				// in the array. Hence the last loop will have only one element.
				row[i++] = values[j];
			} else {
				row[i++] = getMin(values[j], values[++j]);
			}
		}
		return row;
	}

	/\*\*
	 \* From the passed full adjacency list and min value scans the list and
	 \* returns the information about next minimum value. It returns int array
	 \* with two values:
	 \* first value: index of the back-track (the min value was found in the
	 \* Adjacency list for min value, second min etc.)
	 \* second value: index within the identified run.
	 \* 
	 \* @param fullAdjacencyList
	 \*            Adjacency list obtained after k-1 backtracks
	 \* @param kth
	 \*            Order of minimum value desired
	 \* @param kMinusOneMin
	 \*            value of k-1 min element
	 \* @return
	 \*/
	private static int[] getKthMinimum(int[][][] fullAdjacencyList, int kth,
			int kMinusOneMin) {
		int kThMin = Integer.MAX_VALUE;
		int[] minIndex = new int[2];
		int j = 0, k = 0;
		int temp = -1;

		for (int i = 0; i < kth; i++) {
			for (j = 0; j < fullAdjacencyList.length; j++) {
				int[][] row = fullAdjacencyList[j];
				if (row != null) {
					for (k = 0; k < fullAdjacencyList[j].length; k++) {
						temp = fullAdjacencyList[j][k][0];
						if (temp <= kMinusOneMin) {
							continue;
						}
						if ((temp > kMinusOneMin) && (temp < kThMin)) {
							kThMin = temp;
							minIndex[0] = j;
							minIndex[1] = k;
						}
					}
				}
			}
		}
		return minIndex;
	}

	/\*\*
	 \* Back-tracks a sub-tree (specified by the level and index) parameter and
	 \* returns array of elements (during back-track path) along with their index
	 \* information. The order elements of output array indicate the level at
	 \* which these elements were found (with elements closest to the root at the
	 \* end of the list)
	 \* 
	 \* Starting from root element (which is minimum element), find the lower of
	 \* two adjacent element one row above. One of the two element must be root
	 \* element. If the root element is left adjacent, the root index (for one
	 \* row above) is two times the root index of any row. For right-adjacent, it
	 \* is two times plus one. Select the other element (of two adjacent
	 \* elements) as second minimum.
	 \* 
	 \* Then move to one row further up and find elements adjacent to lowest
	 \* element, again, one of the element must be root element (again, depending
	 \* upon the fact that it is left or right adjacent, you can derive the root
	 \* index for this row). Compare the other element with the second least
	 \* selected in previous step, select the lower of the two and update the
	 \* second lowest with this value.
	 \* 
	 \* Continue this till you exhaust all the rows of the tree.
	 \* 
	 \* @param tree
	 \*            output tree
	 \* @param rootElement
	 \*            root element (could be of sub-tree or outputtree)
	 \* @param level
	 \*            the level to find the root element. For the output tree the
	 \*            level is depth of the tree.
	 \* @param rootIndex
	 \*            index for the root element. For output tree it is 0
	 \* @return
	 \*/
	public static int[][] getAdjacencyList(int[][] tree, int rootElement,
			int level, int rootIndex) {
		int[][] adjacencyList = new int[level - 1][2];
		int adjacentleftElement = -1, adjacentRightElement = -1;
		int adjacentleftIndex = -1, adjacentRightIndex = -1;
		int[] rowAbove = null;

		// we have to scan in reverse order
		for (int i = level - 1; i > 0; i--) {
			// one row above
			rowAbove = tree[i - 1];
			adjacentleftIndex = rootIndex \* 2;
			adjacentleftElement = rowAbove[adjacentleftIndex];

			// the root element could be the last element carried from row above
			// because of odd number of elements in array, you need to do
			// following
			// check. if you don't, this case will blow {8, 4, 5, 6, 1, 2}
			if (rowAbove.length >= ((adjacentleftIndex + 1) + 1)) {
				adjacentRightIndex = adjacentleftIndex + 1;
				adjacentRightElement = rowAbove[adjacentRightIndex];
			} else {
				adjacentRightElement = -1;
			}

			// if there is no right adjacent value, then adjacent left must be
			// root continue the loop.
			if (adjacentRightElement == -1) {
				// just checking for error condition
				if (adjacentleftElement != rootElement) {
					throw new RuntimeException(
							"This is error condition. Since there "
									+ " is only one adjacent element (last element), "
									+ " it must be root element");
				} else {
					rootIndex = rootIndex \* 2;
					adjacencyList[level - 1][0] = -1;
					adjacencyList[level - 1][1] = -1;
					continue;
				}
			}

			// one of the adjacent number must be root (min value).
			// Get the other number and compared with second min so far
			if (adjacentleftElement == rootElement
					&& adjacentRightElement != rootElement) {
				rootIndex = rootIndex \* 2;
				adjacencyList[i - 1][0] = adjacentRightElement;
				adjacencyList[i - 1][1] = rootIndex + 1;
			} else if (adjacentleftElement != rootElement
					&& adjacentRightElement == rootElement) {
				rootIndex = rootIndex \* 2 + 1;
				adjacencyList[i - 1][0] = adjacentleftElement;
				adjacencyList[i - 1][1] = rootIndex - 1;
			} else if (adjacentleftElement == rootElement
					&& adjacentRightElement == rootElement) {
				// This is case where the root element is repeating, we are not
				// handling this case.
				throw new RuntimeException(
						"Duplicate Elements. This code assumes no repeating elements in the input array");
			} else {
				throw new RuntimeException(
						"This is error condition. One of the adjacent "
								+ "elements must be root element");
			}
		}

		return adjacencyList;
	}

	/\*\*
	 \* Returns minimum of two passed in values.
	 \* 
	 \* @param num1
	 \* @param num2
	 \* @return
	 \*/
	private static int getMin(int num1, int num2) {
		return Math.min(num1, num2);
	}

	/\*\*
	 \* following uses Math.ceil(double) to round to upper integer value..since
	 \* this function takes double value, diving an int by double results in
	 \* double.
	 \* 
	 \* Another way of achieving this is for number x divided by n would be -
	 \* (x+n-1)/n
	 \* 
	 \* @param values
	 \* @return
	 \*/
	private static int getNextRowSize(int[] values) {
		return (int) Math.ceil(values.length / 2.0);
	}

	/\*\*
	 \* Returns the root element of the two-dimensional array.
	 \* 
	 \* @param tree
	 \* @return
	 \*/
	public static int getRootElement(int[][] tree) {
		int depth = tree.length;
		return tree[depth - 1][0];
	}

	private static void printRow(int[] values) {
		for (int i : values) {
			// System.out.print(i + " ");
		}
		// System.out.println(" ");
	}

	public static void main(String args[]) {
		int[] input = { 2, 14, 5, 13, 1, 8, 17, 10, 6, 12, 9, 4, 11, 15, 3, 16 };
		System.out.println("Fifth Minimum: " + getKthMinimum(input, 5));

		int minimumSortedElementSize = 10;
		int[] tenMinimum = getMinimumKSortedElements(input,
				minimumSortedElementSize);
		System.out.print("Minimum " + minimumSortedElementSize + " Sorted: ");
		for (int i = 0; i < minimumSortedElementSize; i++) {
			System.out.print(tenMinimum[i] + " ");
		}
	}
}

Running Time

Avoiding pure mathematics and using approximations, the worst case running time of this algorithm can be calculated as follows:

For finding second min –>  O(n + log(n))
For Kth min (or partial k sort) –>  O(n + klog(n)), ignoring constants (actually, the algorithm should perform much better than this as not all the k elements can be at log(n) depth).

Final Thoughts

  • The above algorithm does provide optimal mechanism to find Kth minimum. This can also be used for partially sorting unsorted array. But, note that this does require additional memory (or the order of original array size). Also, the calculations does not account for processing required for creating tree or the copy operations. What if we are limited by memory and the whole cannot be fit into memory. Obviously this solution will not work.
  • Is the implementation too complex? In addition to Tournament method there are other mechanisms to find Kth minimum efficiently. This wiki page enumerates some of those methods - Selection Algorithm. I will try to solve using some of the methods listed on wiki resource. As always, feedback welcome. 

Thursday Dec 16, 2010

Finding Second Minimum with REPEATING elements in array

I posed few questions at the end of my earlier blog entry about finding 2nd minimum element. Let’s tackle the first question – how to efficiently find second minimum when we have repeating elements in the array. 

Let’s say we have following scenario. Notice that the minimum element, 1 appears twice in the list.

As before, we will use tournament method to first find the minimum value. Also note that while progressing through successive levels, the root elements, if repeating, will be compared against each other at some point. 

Using tournament method we obtain the minimum as below. 


Design Details

We can use similar logic as to find the second minimum, but we need to make slight changes to our logic. Our earlier code allowed only one of the adjacent elements of the row above to be root but now it is a valid condition and the of course the result of comparison will be a root element for such cases.

As noted above, if the root element is repeating, it must meet other root element(s) in the back path. So at the point where the two root elements meet, the tree would be divided into two sub-trees and we need to backtrack along the root element in each sub-tree till we reach the top level and select the minimum for each sub-tree. As before, since the second minimum must meet the root element at some point; by comparing the minimum for each path, we can obtain second minimum for repeating root.

Implementation Details

We can use similar logic as used for finding second minimum, but we will have to make few changes:

1. The api getSecondMinimum will be extended to take additional parameters.  In addition to the tree, it will also take the level to backtrack from and the root index. Same api can then be recursively used for back traversal of the tree from any point – root or from any level above root.

2. Whenever we hit a level where both the adjacent elements of row above are root elements, obtain the minimum from each sub-tree for both the root elements (left adjacent and right adjacent). Comparing the result of two would give us the second minimum. 

3. To back track from root, just pass the total depth of the tree for the level argument and root index as value of 0.

Code 

Here is updated code for the method getSecondMinimum (no change in rest of methods in previous post). Click here to download complete class - SecondMinRepeatingElements.java (remove the .txt extension).

	/\*\*
	 \* The logic for finding second minimum is as follows:
	 \* 
	 \* Starting from root element (which is minimum element), find the lower of
	 \* two adjacent element one row above. One of the two element must be root
	 \* element. If the root element is left adjacent, the root index (for one
	 \* row above) is two times the root index of any row. For right-adjacent, it
	 \* is two times plus one. Select the other element (of two adjacent
	 \* elements) as second minimum.
	 \* 
	 \* Then move to one row further up and find elements adjacent to lowest
	 \* element, again, one of the element must be root element (again, depending
	 \* upon the fact that it is left or right adjacent, you can derive the root
	 \* index for this row). Compare the other element with the second least
	 \* selected in previous step, select the lower of the two and update the
	 \* second lowest with this value.
	 \* 
	 \* Continue this till you exhaust all the rows of the tree.
	 \* 
	 \* @param tree
	 \* @param rootElement
	 \* @param level
	 \*            The depth from where to start back-tracking
	 \* @param rootIndex
	 \*            Index of the root element at that depth level
	 \* @return
	 \*/
	public static int getSecondMinimum(int[][] tree, int rootElement,
			int level, int rootIndex) {
		int adjacentleftElement = -1, adjacentRightElement = -1;
		int adjacentleftIndex = -1, adjacentRightIndex = -1;
		int secondLeast = Integer.MAX_VALUE;
		int[] rowAbove = null;
		// we have to scan in reverse order
		for (int i = level - 1; i > 0; i--) {
			// one row above
			rowAbove = tree[i - 1];
			adjacentleftIndex = rootIndex \* 2;
			adjacentleftElement = rowAbove[adjacentleftIndex];

			// the root element could be the last element carried from row above
			// because of odd number of elements in array, you need to do
			// following
			// check. if you don't, this case will blow {8, 4, 5, 6, 1, 2}
			if (rowAbove.length >= ((adjacentleftIndex + 1) + 1)) {
				adjacentRightIndex = adjacentleftIndex + 1;
				adjacentRightElement = rowAbove[adjacentRightIndex];
			} else {
				adjacentRightElement = -1;
			}

			// if there is no right adjacent value, then adjacent left must be
			// root continue the loop.
			if (adjacentRightElement == -1) {
				// just checking for error condition
				if (adjacentleftElement != rootElement) {
					throw new RuntimeException(
							"This is error condition. Since there "
									+ " is only one adjacent element (last element), "
									+ " it must be root element");
				} else {
					rootIndex = rootIndex \* 2;
					continue;
				}
			}

			// one of the adjacent number must be root (min value).
			// Get the other number and compared with second min so far
			if (adjacentleftElement == rootElement
					&& adjacentRightElement != rootElement) {
				secondLeast = getMin(secondLeast, adjacentRightElement);
				rootIndex = rootIndex \* 2;
			} else if (adjacentleftElement != rootElement
					&& adjacentRightElement == rootElement) {
				secondLeast = getMin(secondLeast, adjacentleftElement);
				rootIndex = rootIndex \* 2 + 1;
			} else if (adjacentleftElement == rootElement
					&& adjacentRightElement == rootElement) {
				// This is case where the root element is repeating
				// The tree is now divided into two sub-trees and we need to
				// back-track both the sub-trees and select the minimum for
				// each. Then lower of the two is second minimum
				int minLeftSubTree = getSecondMinimum(tree, rootElement, i,
						adjacentleftIndex);
				int minRightSubTree = getSecondMinimum(tree, rootElement, i,
						adjacentRightIndex);
				return getMin(minLeftSubTree, minRightSubTree);

			} else {
				throw new RuntimeException(
						"This is error condition. One of the adjacent "
								+ "elements must be root element");
			}
		}

		return secondLeast;
	}

Running Time

For single root, the depth of tree will be log(n), for second repeating element (for worst case) will be log(n) – 1 and so on. For m repeating root elements, there will be m backtrackings, one for each repeating root. Thus for m repeating roots, the running time: 

n + log(n) + (log(n) -1) +  … + (log(n) – m) 

Or,  O(n + mlog(n)), ignoring constants.

Also, note that the for repeating elements other than root element, the running time remains same as previously calculated (for non-repeating elements - O(n + log(n)); can also be obtained by substituting m=1 above) and that the previous code will work fine without any changes. 

Thursday May 27, 2010

My First Scratch (Golf)

Today, I went out for quick round of golf  with one of my colleague (after work) at nearby 9-hole Rancho Duarte Golf Course. It is nice little executive course (par 31), with par-3 and par-4 holes and fast greens. It was bit overcast, but nice breeze made the playing all the more enjoyable.

I played at this and other courses many times before and came close to shooting par, but could never actually do it. Since I started paying golf around 3 and half years back, it was my dream to do a scratch round (for those not familiar with golf, scratch is term used to define a round under par; for more info refer this page). Well, today was that day. I shot total of 30 for one under par (-1). I hit 8 of 9 greens in regulation and finished the round with back to back birdies. Here is my detailed scorecard.

BTW, I was lucky to score hole-in-one once at Los Feliz Golf Course around 2 years back.  

My next goal - Scratch for a regulation course. It is not going to be easy..

Sunday Dec 07, 2008

Bear Canyon Hike (California)

This saturday, I participated in a very scenic hike in a group of 16 hikers. This was about 9 mile and mostly downhill hike. Using car pooling and splitting cars at start and end point, this hike was done as one-way. After parking our cars at Switzer’s Falls Starting Point (exit on 210 and drive approx 9 miles on Angeles Crest Hwy), we packed into few cars and moved up to the start point of the hike at Eton Saddle fire road gate (approx 6.5 Miles - keep going north on Angles Crest and take Mt. Wilson Red Box road at approx 4.1 miles and drive 2.4 miles on Mt Wilson Red Box road) to hike back down to Switzer’s Falls Starting Point car parking.

The hike started by going around south side of San Gabriel Peak (6,161') and entering Mueller Tunnel. Surrounded by number of nearby Peaks – Mt. Lowe (5,603'), Mt. Wilson (5,710') and Mt. Disappointment (5,960'), you keep moving on to the fire road to reach the Bear Canyon trail head. At the trail head you find a halipad which offer breathtaking view of greater Los Angeles. It was a clear day and we could see ocean and as far as Catalina Island (approx 60+ miles away!). Hiking through bear canyon you pass though number of water streams and pools with plenty of places to sit and relax. At this time of the year, the water level was low and you could do stream crossing without getting your feet wet. But, during rainy season, the hike is closed as the water level would make it impossible to cross the streams. After approx 3.2 miles, you reach Bear Canyon Campgound, where we stopped for lunch. Further around 2 miles of hiking along streamside, you reach 50 ft high Switzer’s falls. From Switzer Falls, around 2.2 miles and approx 600 ft of elevation you reach the Switzer lower parking area.

Don’t forget to take plenty of water and food. Good hiking shoes are must due to loose gravel, as there is real risk of slip and fall. We had one little accident wherein one hiker fell and got some bruises. But, the injuries were not bad, and he was able to finish the hike. Overall, the hike was easy as there are not much elevation gains, but it is still strenuous as it was long. I really enjoyed the spectacular canyon and plenty of scenic points all along.

Here is hike log (Dec 6, 08)

9:30 am       Start (Eaton Saddle – 5,120’)
10:30 am     Bear Canyon Trail Head
12:00 pm     Bear Canyon Camp (stopped for lunch – approx 40 mins)
1:45 pm       Switzer’s Falls
3:00 pm       Return – Switzer Falls Start Point – Lower parking lot

Mt. Wlison Hike (California)

Though famous for its observatory, Mt. Wilson houses number of radio and television antennas and is famous landmark and conspicuously visible to anyone driving in San Gabriel valley and beyond. At around 5,700 ft, this is one of the highest peaks in San Gabriel Mountains. You can in-fact drive up to Mt. Wilson following Angeles Crest Highway (around one hour drive from LA downtown).

Day after thanksgiving, along with 4 other very experienced hikers; we started this hike from City of Sierra Madre at intersection of E. Mira Monte and Mt. Wilson Trail. It is long and strenuous hike through wilderness with steep elevations and passing through some shade and lot of sun. Make sure that you carry plenty of water (at lest 2 liters) and food as there is no source of drinking water on this long hike. We were maintaining quite brisk pace. Having never experienced a long hike this one, it was really tough for me and was getting slower and slower progressing through the hike and had virtually no energy left at around 3/4 of the onward hike. I had to stop and break away from the group. Fortunately, making slow and continued progress, I was able to join the group at Manzanita Ridge point. It took us approx 4 1/2 hours to reach the top. After resting for approx 30 - 40 mins at Mt. Wilson we made our way back and with few short stops made down in approx 3 hours. The total hike was around 15 miles (round trip) with approx 4500 ft. of total elevation gain.

Here is the hike log:

8:25 am      Start (89 E. Mira Monte Ave., Sierra Madre, CA)
9 am          First Water - 1.5 Mile (break 10 mins)
10 am        Orchard Camp (2,960') - 3.5 Mile (stopped for lunch approx 15 mins)
11:30 am    Manzanita Ridge - 5 miles (break 10 mins)
                  Mt. Wilson Toll Road (5.5 Miles)
1 pm          Mt. Wilson Summit (5,710') - 7.5 Miles
4:30 pm     Return -  15 Miles

Wednesday Oct 08, 2008

3 New Date/Time Comparison Extension Funtions for Netbeans BPEL 2.0 Editor

Checkout 3 new BPEL Extension functions for date/time comparisons. These functions are implemented as extension functions as BPEL 2.0 (which uses XPath 1.0) standard function list does not include these. The syntax and definition for these is derived from XPath 2.0 spec (http://www.w3.org/TR/xquery-operators/). These new functions being -

sxxf:dateTime-less-than  (Less-than comparison on xs:dateTime values)
sxxf:date-less-than      (Less-than comparison on xs:date values)
sxxf:time-less-than      (Less-than comparison on xs:time values)

Where the namespace prefix sxxf stands for "http://www.sun.com/wsbpel/2.0/process/executable/SUNExtension/XPathFunctions" which need to be defined in the process definition.

Note that these functions can either take literal value (corresponding to  xs:dateTime, xs:date or xs:time representation) or BPEL Variable of appropriate time.

A test case to showcase example usage of these function is checked in driver tests. Check out BPEL Project and Composite Application from https://open-jbi-components.dev.java.net/source/browse/open-jbi-components/driver-tests/bpelse/xpathfunctions/DateTimeComparison/.

Copying from the test case-

sxxf:dateTime-less-than($NewWSDLOperationIn.dateTime1, $NewWSDLOperationIn.dateTime2)
sxxf:date-less-than($NewWSDLOperationIn.date1, $NewWSDLOperationIn.date2)
sxxf:time-less-than($NewWSDLOperationIn.time1, $NewWSDLOperationIn.time2)

sxxf:dateTime-less-than('2008-09-29T17:15:43.68-08:00', '2008-09-29T17:15:43.67-08:00')
sxxf:date-less-than('2008-09-28', '2008-09-29')
sxxf:time-less-than('17:15:43.68-08:00', '17:15:43.67-08:00')

Monday Sep 29, 2008

Two New BPEL Extension funtions - getBPId() and getGUID()

BPEL Service Engine, part of Open ESB Project recently added two new Extension functions to return the Process Instance Id and a New GUID respectively. These functions are available in BPEL Mapper Pallet in Netbean's BPEL Editor.  So, what might be good use of these functions. Of top of my head, I can think of the following:
- For creating unique Correlation Keys: Often times state-ful interactions are required between interacting partners exchange asynchronous messages. BPEL describes correlation mechanism to make such interaction possible. These correlation tokens are embedded in the messages that are passed around in such interactions. Any of these newly added getGUID() or getBPID() extension functions can easily be used to create such correlation tokens.
- For Auditing and Debugging purposes: getBPID() that returns the process instance id can be used for these purposes.

Following is the usage of these funtions, as shown in the example below:

      <assign name="Assign1">
         <copy>
            <from>concat('BPID = [', sxxf:getBPId(), '] GUID = [', sxxf:getGUID(), ']')</from>
            <to variable="NewWSDLOperationOut" part="part1"/>
         </copy>
      </assign>

where the namespace prefix sxxf stands for xmlns:sxxf="http://www.sun.com/wsbpel/2.0/process/executable/SUNExtension/XPathFunctions" which need to be imported into your BPEL Definition file. Of course, Netbean's BPEL Editor comes with powerful mapper (screenshot below), that makes usage of these funtions like a breeze.

Two New BPEL Extension Funtions

Thursday Aug 21, 2008

Santa Anita Loop Hike

This Tuesday, I did another scenic hike through small portion of Santa Anita Canyon in San Gabriel Mountains. This is excellent hike for summer with plenty of forest cover all through the trail. We were group of 9 and started off around 6 pm and finished the loop in less than two hours with few minutes stop at Hodgees camp site. It was pretty dark when we reached the tail end of the hike.

With easy access from San Gabriel valley, this trail is a loop starting from Chantry Flats (exit Santa Anita on I-210 and drive around 6 miles towards mountain to parking structure), with plenty of shade, passing through several creek crossings and few campsites. The hike actually starts up the hill with steady gain and then goes down (around 400' feet lower then tail-head) before finally going up. About two miles into this trail you pass through Upper Winter Creek Junction Signpost for trails to Mt. Wilson (another 4 1/2 miles) and Mt Zion (1/8 Miles). Further down around 1/3 of a mile you reach Hodgees Campground. This limited facility small campground under dense forest cover has restroom and water. Interestingly, I was told that this campsite has highest recorded rainfall (26 inches in one day) in whole of California (could not confirm this fact though). Another two miles you reach Roberts Campground and the lowest point in the hike. Overall, the hike has moderate elevation gain rate throughout except at the end when you hit paved road of about half a mile with steep elevation of around 400 ft.

Here are quick facts about this trail.

Distance: 5 Miles
Altitude: Trailhead 2200', Minimum-1800 , Maximum 2900', Total Gain/Loss -1100'
Difficulty: Beginner
Trail Condition: Good
What should you carry (at minimum): One bottle of water

Monday Jul 07, 2008

How to Enable Java Debugger for Glassfish Cluster Profile

Often times things work fine for Single Instance App Server, but when the same application is put on cluster mode, the results are not as expected. For glassfish, the steps I use for enabling the debugger for cluster mode is slightly from non-cluster mode. Also, there is one extra manual step involved to make the debugger work.

First for Clustered Glassfish, you can easily enable the debugger using the Admin Web GUI. Here are the steps involved. (for the purposes of illustration cluster name and instance name  is assumed to be cluster1 and instance-ONE)

  1. Log on to Admin GUI (http://<server>:4848/index.jsf). 
  2. Navigate to Configurations -> cluster1-config -> JVM Settings
  3. On the General Tab, Enable the Debug Check Box and for the Debug Options Text Box remove the port number. Leave it blank (Figure 1).
  4. Shutdown node agent (asadmin stop-node-agent)
  5. Go to domain.xml for the cluster instance (\\openesb\\glassfish\\nodeagents\\cluster1-nodeagent\\instance-ONE\\config) and enable the debug-enabled flag to true for the cluster1-config (Figure 2).
  6. Start the node agent (asadmin start-node-agent) and look for the debug port in the server log (Figure 3). The server log will print the port to use.

Debug Configuration for Clustured Glassfish

Figure 1 Figure 2

Server Log

Figure 3

Sunday Jul 06, 2008

Echo Mountain Hike

Living in the San Gabriel Valley (foothills of San Gabriel Mountain Ranges) for over three years now, I almost had feeling of guilt for not exploring the nearby mountains. San Gabriel mountains form barrier between Greater Los Angeles Area and Mohave Desert. At 3,067 m (10,064 ft), the highest peak in the range is Mount San Antonio, aka Mt. Baldy.

While searching for outdoor activities around this area I came across excellent online resource – outdoorsclub.com. Here you can find and active community who love outdoor activities. A quick glace at the event calender reveals multiple outdoor events happening every week! The events are not just limited to hiking but other activities such as biking, camping, backpack, river-rafting, cannoeing. Through the website you can sign for an activity and also organize one. You need to be member though to sign up (first 3 free, $25 per year thereafter) for an event.

I saw one hike happening close to where I live. Following Sam Merrill Trail, at top of Lake Avenue in Altedena, this was hike to Echo Mountain Peak. With total round-trip distance of around 6 miles, the hike involves total gain of around 1400'. In a group of 6 (3 ladies and 3 men), we started off at 6:30 pm. Although warned fairly by the leader regarding the difficulty of hike, I was determined to try it out. The other hikers were quite experienced and maintained brisk pace all throughout the hike. After around one mile of hike, I was totally exhausted and was not sure if would be able to complete the hike. Though the ladies were leading the pack, the group leader stayed back to watch me. With lot of encouragement and some teasing by our group leader, I dragged along and finally made it to the top in one hour and two minutes, lagging by approx 5 minutes from the leading group. I was told that for first timer, I did good. Overall, it was great experience and I would recommend it to anyone who would like a decent hike.

Echo Mountain, also known as White City has rich past. Once a place with a resort with two hotels - Echo Mountain House and The Chalet. It also had an observatory and a small zoo. Professor Thaddeus S. C. Lowe and engineer David J. Macpherson built a railroad connection to Echo Mountain, which eventually be visited by over 3 million people from the year 1896 - 1936. Through series of fires and windstorms and waning public interest, the city was destroyed. The remains of the resort and the railroad can still be seen, reminiscent of the past glory of the place.

Sam Merill Trail Sam Merill Trail
Sam Merill Trail
Sam Merill Trail
Sam Merill Trail
Sam Merill Trail



Friday May 30, 2008

Logging In..

This is my first real foray into world of blogging. So why blog? Well many reasons but mostly to share insights about products and technologies I am working on. Also a place for me to think, reflect and connect.

I came to Sun through Seebeyond Acquisition. Part of Web Services division, I have been working on BPEL Engine for over four years now – earlier part of eInsight Business Process Manager of ICAN Suite (while in Seebeyond) and now part of BPEL Service Engine part of Project Open-ESB based on JBI Specification.

About

Malkit works in software development at Oracle, Inc. working in Oracle Business Process Manager, part of Oracle Fusion Middleware. Malkit came to Oracle through Sun Microsystems acquisition, living and working in Los Angeles, California.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today