Today we will work on hard task Word Break II. Why this task is hard? I can say, here, during resolving this problem, you will and you should use a few techniques.
First one traversing method called DFS (or BFS), second is Dynamic programming, third is some combining and tricks 🙂 to pass tests on leetcode.
OK, try to read task description :
Given a string s and a dictionary of words dict, add spaces in s to construct a sentence where each word is a valid dictionary word.
Return all such possible sentences.
For example, given
s ="catsanddog"
,
dict =["cat", "cats", "and", "sand", "dog"]
.A solution is
["cats and dog", "cat sand dog"]
.
We can see, that we have some dictionary and first idea will be about traversing through dictionary, but it will be mistake. For example you have simple input string like catsanddog and dictionary with 1 billion of words, what is the complexity will be ? Yes, O(N^2) where is N equals to 1 billion. And generally I do not think that complexity of our solution should be dependent from dictionary size!
What we can todo, try to analyse … We can run through all letters in input word and try to find all words from our dic, if found put to storage (DP) end index of word (or start index) depend to the direction of string traversing. I am storing not just indexes but word too, to speed up our solution.
OK, at the end you will have something like that:
import java.util.*; /** * catsanddog * * @author */ public class WordBreakII { public List<String> wordBreak(String s, Set<String> dict) { List<String> result = new ArrayList<String>(); int length = s.length() + 1; List<List<Integer>> mask = new ArrayList<List<Integer>>(length); for (int i = 0; i < length; i++) { mask.add(new ArrayList<Integer>()); } for (int i = 0; i < length; i++) { for (int j = i; j < length; j++) { String word = s.substring(i, j); if (dict.contains(word)) { mask.get(i).add(j); } } } buildResult(mask, 0, result, "", s); return result; } private void buildResult(List<List<Integer>> mask, int index, List<String> result, String val, String src) { if (index == src.length()) { result.add(val.trim()); return; } for (Integer end : mask.get(index)) { String word = src.substring(index, end); if (index == 0) { val = word+ " "; } else { val += word + " "; } buildResult(mask, end, result, val, src); } } }
But you will get Time Limit Exceeded exception. Generally this solution is good, but we have problems in analysing words like that :
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab", ["a","aa","aaa","aaaa","aaaaa","aaaaaa","aaaaaaa","aaaaaaaa","aaaaaaaaa","aaaaaaaaaa"]
So, how to fix that problem, you can see our algorithm run from start to end and check if the counts of words equals to input word size, it means we have used all letters from input word:
if (index == src.length()) { result.add(val.trim()); return; }
But, what will be if we run from other side, I mean we are storing in start bucket end indexes
mask.get(i).add(j);
But we can store in the end buckets start indexes and run loop from end to start, Hmm … good idea try to implement:
import java.util.ArrayList; import java.util.List; import java.util.Set; /** * catsanddog * * @author */ public class WordBreakII { public List<String> wordBreak(String s, Set<String> dict) { List<String> result = new ArrayList<String>(); int length = s.length() + 1; List<List<Pair>> mask = new ArrayList<List<Pair>>(length); for (int i = 0; i < length; i++) { mask.add(new ArrayList<Pair>()); } for (int i = 0; i < length; i++) { for (int j = i; j < length; j++) { String word = s.substring(i, j); if (dict.contains(word)) { mask.get(j).add(new Pair(i, word)); } } } if (mask.get(mask.size() - 1).size() > 0) buildResult(mask, mask.size() - 1, result, ""); return result; } private void buildResult(List<List<Pair>> mask, int index, List<String> result, String val) { if (index == 0) { result.add(val.trim()); return; } for (Pair pair : mask.get(index)) { String word = pair.val; val = word + " " + val; buildResult(mask, pair.key, result, val); val = removeCurrentWord(val); } } private String removeCurrentWord(String val) { int i = val.indexOf(" "); if (i >= 0) val = val.substring(i + 1); return val; } private class Pair { public int key; public String val; public Pair(int key, String val) { this.key = key; this.val = val; } } }
I’ve added new class for storing start index and word to improve speed of execution
private class Pair { public int key; public String val; public Pair(int key, String val) { this.key = key; this.val = val; } } and storing end – start indexes, it means storing in end bucket start index (and word) mask.get(j).add(new Pair(i, word)); so, before build result I am checking if there are some solutions, I am mean if some DFS traversing way ended in last bucket. if (mask.get(mask.size() - 1).size() > 0)
Trying to run on leetcode and yeah-a, it’s accepted :))!!!
It means we solved problem.
We have used combination of DP and DFS for solve that problem, but after got Time Limit exception we have changed way of DFS running and tests passed.
That is all, Thanks.