Monday, September 21, 2015

H-Index

Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.

According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."

For example, given citations = [3, 0, 6, 1, 5], which means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, his h-index is 3.


Note: If there are several possible values for h, the maximum one is taken as the h-index.


This is a very interesting problem. 
1. Sort the array in descending order.
2. Scan the array from the biggest element. At each position, there are two values.
    n - Number of elements scanned so far
    k - The value of the last scanned element
n starts from 1 and k starts from the largest value. So in the beginning, k should always be greater than n. We can try to take the p value now, which equals Math.min(n, k). Since the p actually takes the value of n, which is smaller, we have to check the following element's value. If its value is greater than p, this p is invalid. 
3. While the scan keeps going, we might hit a point where n==k. That means we have n elements, there values are all >= k. For the remaining elements, their values are smaller than k. So p=n at this moment is the answer we are looking for.
4. Sometimes, if there is no point where n==k, we might be in this situation. At n', array value is k'. Math.min(n', k')=n' and the value of the next element k >= n'. So we need to keep going. For the next element, there are n elements in total, the value at n is k. Math.min(n, k)=k. Because we already know k >= n'. Here we know k <= n'+1. k can have only two possible values, n' or n1+1. So we can safely say there are k elements with values >= k. For the remaining element, their value <= k.

To summarize, starting from the biggest element, I'm looking for a point where n==k. If this position doesn't exist, then the transition point between k and k' is what I'm looking for.

 public class Solution {  
   public int hIndex(int[] citations) {  
     Arrays.sort(citations);  
     for(int i = citations.length-1; i >= 0; i--) {  
       int n = citations.length-i;  
       int k = citations[i];  
       int h = Math.min(n, k);  
       if(i==0 || citations[i-1]<h)  
         return h;  
     }  
     return 0;  
   }  
 }  

Sunday, September 20, 2015

Shortest Palindrome

Given a string S, you are allowed to convert it to a palindrome by adding characters in front of it. Find and return the shortest palindrome you can find by performing this transformation.
For example:
Given "aacecaaa", return "aaacecaaa".

Given "abcd", return "dcbabcd".

Naive solution is remove the right-most letters of string one by one and check whether the remaining part is palindrome. It exceeds time limit. The fancy solution is to use KMP.



1:  public class Solution {  
2:    public String shortestPalindrome(String s) {  
3:      String ss = s + "#" + new StringBuilder(s).reverse().toString();  
4:      int[] pie = new int[ss.length()];  
5:      int k = 0;  
6:      pie[0] = 0;  
7:      for(int i = 1; i < ss.length(); i++) {  
8:        while(k>0 && ss.charAt(k) != ss.charAt(i))  
9:          k = pie[k-1];  
10:        if(ss.charAt(k) == ss.charAt(i))  
11:          k++;  
12:        pie[i] = k;  
13:      }  
14:      return (new StringBuilder(s.substring(pie[ss.length()-1])).reverse().toString()+s);  
15:    }  
16:  }  

Implement strStr() using KMP

I'm reading "Introduction to Algorithms" to review the KMP algorithm. I feel the most important concept in the algorithm is k=pi[k]. It's very tricky to understand. The key to understand this is to remember for the matched part between target string and pattern string, checking target string is the same as checking pattern string. When constructing pi[], pattern string is both target string and pattern string. So apply the above rule again. Some notes per my understanding.



Code for strstr, which is also a leetcode question.

1:    public int strStr(String haystack, String needle) {  
2:      int l = needle.length();  
3:      if(l == 0)  
4:        return 0;  
5:      int[] pie = new int[l];  
6:      pie[0] = 0;  
7:      int k = 0;  
8:      for(int q = 1; q < l; q++) {  
9:        while(k>0 && needle.charAt(k)!=needle.charAt(q))  
10:          k = pie[k-1];  
11:        if(needle.charAt(k) == needle.charAt(q))  
12:          k++;  
13:        pie[q] = k;  
14:      }  
15:      int i = 0;  
16:      int j = 0;  
17:      while(i < haystack.length()) {  
18:        while(j>0 && needle.charAt(j)!=haystack.charAt(i))  
19:          j = pie[j-1];  
20:        if(needle.charAt(j) == haystack.charAt(i))  
21:          j++;  
22:        if(j == needle.length())  
23:          return i-j+1;  
24:        i++;  
25:      }  
26:      return -1;  
27:    }  

The Skyline Problem



When I first started working on the problem, my idea is to insert and merge new intervals into the result list while loop through buildings. For the result list, I only need to grab the last two elements and compare with the new interval and do all the merging and insertion stuff. The solution can only pass part of the test cases. The reason is the new segment might have a start time which is earlier than the last two points. It's not enough to only compare it against the last two elements of the result list.


My solution fails in the above cases. When calculating the skyline of 1, 2 and 3, 3's starting point is before p3 and p4. The correct result after inserting 3 is create a new p3 and remove the original p3 and p4. It tells me that for each new interval, I need to go through all the existing points in result list, find the correct position to insert the new interval and than possibly modify many subsequent points in the result list.

I started my search of correct answer and this one looks very interesting. https://leetcode.com/discuss/37630/my-c-code-using-one-priority-queue-812-ms

Copy the explanation from the link above here,
"The idea is to do line sweep and just process the buildings only at the start and end points. The key is to use a priority queue to save all the buildings that are still "alive". The queue is sorted by its height and end time (the larger height first and if equal height, the one with a bigger end time first). For each iteration, we first find the current process time, which is either the next new building start time or the end time of the top entry of the live queue. If the new building start time is larger than the top one end time, then process the one in the queue first (pop them until it is empty or find the first one that ends after the new building); otherwise, if the new building starts before the top one ends, then process the new building (just put them in the queue). After processing, output it to the resulting vector if the height changes."

The explanation is very useful, but you may need to read it a couple of times before it makes sense to you. The fundamental principle is that at any time, the last element in result list is the starting point of the last segment. This point extends until the top point of the priority queue.

Try the code with the example below is the best way to understand the algorithm.


1:  public class Solution {  
2:    public class Pos {  
3:      int h;  
4:      int x;  
5:      public Pos(int hh, int xx) {  
6:        h = hh;  
7:        x = xx;  
8:      }  
9:    }  
10:    public List<int[]> getSkyline(int[][] buildings) {  
11:      LinkedList<int[]> res = new LinkedList<int[]> ();  
12:      if(buildings.length == 0)  
13:        return res;  
14:      PriorityQueue<Pos> queue = new PriorityQueue<Pos> (1, new Comparator<Pos>() {  
15:        @Override  
16:        public int compare(Pos p1, Pos p2) {  
17:          if(p1.h != p2.h)  
18:            return p2.h-p1.h;  
19:          else  
20:            return p1.x-p2.x;  
21:        }  
22:      });  
23:      int curr = 0;  
24:      int currX = 0;  
25:      int currH = 0;  
26:      while(curr<buildings.length || !queue.isEmpty()) {  
27:        //when buildings[curr] overlaps with segments already inserted  
28:        if(queue.isEmpty() || (curr<buildings.length && buildings[curr][0] <= queue.peek().x)) {  
29:          currX = buildings[curr][0];  
30:          //Insert the end point of this segment and the following segments' end point having the same start point into the queue  
31:          while(curr<buildings.length && buildings[curr][0]==currX) {  
32:            queue.add(new Pos(buildings[curr][2], buildings[curr][1]));  
33:            curr++;  
34:          }  
35:        }  
36:        //buildings[curr] has no overlap with already inserted segments  
37:        //Finish all the inserted segments before dealing with buildings[curr]  
38:        else {  
39:          currX = queue.peek().x;  
40:          //For end points which terminate earlier than currX and not has high as currX, they will be invisible in the skyline  
41:          //So we should directly pop them away  
42:          while(!queue.isEmpty() && queue.peek().x <= currX)  
43:            queue.poll();  
44:        }  
45:        //If queue is empty, it's the last point. The height should be zero  
46:        currH = queue.isEmpty() ? 0 : queue.peek().h;  
47:        //If the newly inserted point has the same height as the last element of result list, there is no need to insert it. Because the last element of result list will extend until the top of priority list, which must have a larger end point  
48:        if(res.isEmpty() || res.getLast()[1]!=currH) {  
49:          res.add(new int[]{currX, currH});  
50:        }  
51:      }  
52:      return res;  
53:    }  
54:  }  

Sunday, September 13, 2015

Best Time to Buy and Sell Stock IV



Best Time to Buy and Sell Stock IV

Say you have an array for which the ith element is the price of a given stock on day i.

Design an algorithm to find the maximum profit. You may complete at most k transactions.

Note:
You may not engage in multiple transactions at the same time (ie, you must sell the stock before you buy again).

刚拿到这道题目的时候完全没有思路,想借鉴 Best Time to Buy and Sell Stock III 的思路,但是复杂度太高,过不了大的test。看了许多大牛的文章,大家的思路基本上都来源于这篇博客,http://blog.csdn.net/linhuanmars/article/details/23236995。 文章的思路很好,用local和global两个数组,基于DP来解决这个问题,但是我一直看不懂关于local的递推关系式,
local[i][j]=max(global[i-1][j-1]+max(diff,0),  local[i-1][j]+diff) where diff=prices[i]-prices[i-1]
我觉得这个关系大致正确,但仔细推敲却不make sense。

local[i-1][j]+diff这一部分可以理解,local[i-1][j]就是进行了j次transaction,其中最后一次是在(i-1)的位置卖出。local[i-1][j]+diff就是把最后一次卖出的位置从i-1移到了i,因为相当于合并了两次交易,所以总共的交易次数还是j。

global[i-1][j-1]+max(diff,0)的含义就很confusing了,global[i-1][j-1]意思是到(i-1)为止,在交易了最多(j-1)次的情况下的最大收益,要从这个得到local[i][j],就要保证在i的位置卖一次,但是global[i-1][j-1]有可能会包含在(i-1)的位置卖股票的情况,这种情况下,在i的位置再卖一次显然没有任何意义。

基于这些大牛的思路,我自己总结的更加容易理解的关系是这样的。
在计算local[i][j]的时候,这里的关键因素是多了一个元素,prices[i]。local[i][j]的意思是交易了j次,并且最后一次的卖发生在i。这个元素究竟会怎样影响最大收益呢?它有两种方式影响最大收益。
1. 把(i-1)处的卖移到i处,由于这相当于合并了两次交易,所以并不会影响总的交易次数。这种情况下的收益是local[i-1][j]+diff,diff=prices[i]-prices[i-1]。
2. 在(i-1)处买,在i处卖,得到的收益加上global[i-2][j-1],就是local[i][j]可能的最大值。至于为什么不用再考虑其他的情况,比如说在(i-2)处买,在i处卖,这种情况相当于在(i-2)处买,在(i-1)处卖,在i处再卖。这种情况已经被case 1 cover了,所以无需再考虑。总之,在case 2,收益等于global[i-2][j-1]+diff。

所以,综合两种情况,递推关系应该如下,
local[i][j] = local[i-1][j] + diff
if (i-2 >= 0) local[i]][j] = max(local[i][j], global[i-2][j-1]+diff)

global[i][j] = max(global[i-1][j], local[i][j])

这道题的思路其实还是主要源于各位大牛的博客,只是我实在理解不了其中的一个递推关系,所以在这里写下了自己的思路。


1:  public class Solution {  
2:    public int maxProfit(int k, int[] prices) {  
3:      if (k < 1 || prices.length <= 1)  
4:        return 0;  
5:      if (k >= prices.length/2)  
6:        return helper(prices);  
7:      int[][] local = new int[prices.length][k+1];  
8:      int[][] global = new int[prices.length][k+1];  
9:      for(int i = 1; i < prices.length; i++) {  
10:        for(int j = 1; j <= k; j++) {  
11:          int diff = prices[i] - prices[i-1];  
12:          local[i][j] = local[i-1][j]+diff;  
13:          if(i-2 >= 0)  
14:            local[i][j] = Math.max(local[i][j], global[i-2][j-1]+diff);  
15:          global[i][j] = Math.max(global[i-1][j], local[i][j]);  
16:        }  
17:      }  
18:      return global[prices.length-1][k];  
19:    }  
20:    public int helper(int[] prices) {  
21:      int profit = 0;  
22:      for(int i = 1; i < prices.length; i++) {  
23:        profit += Math.max(prices[i]-prices[i-1], 0);  
24:      }  
25:      return profit;  
26:    }  
27:  }