Monday, 22 December 2014
Sunday, 7 December 2014
Quora haqathon 2014
The Quora Haqathon is today, from 11am to 7pm Pacific Standard Time! It features 9 problems, a mix of traditional algorithm, machine learning and systems programming tasks. Link to site.
Ontology
Linearize the tree so that every subtree becomes a contiguous range; each query then reduces to "among questions q[x...y], how many start with prefix p?". Offline queries + partial sums + trie. Linear time.
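A minimal sketch of this idea, assuming the input is a child-list tree, a list of (topic, text) questions and a list of (topic, prefix) queries. The names `euler_tour`, `Trie` and `count_prefix_in_subtrees` are made up for illustration, not taken from the contest. Each query over positions x..y is answered as count(y) - count(x-1), evaluated while questions are inserted into a trie in linearized order.

```python
from collections import defaultdict

def euler_tour(children, root):
    """Return (tin, tout): the subtree of v covers positions tin[v]..tout[v]."""
    tin, tout, timer = {}, {}, 0
    stack = [(root, False)]
    while stack:
        node, done = stack.pop()
        if done:
            tout[node] = timer - 1  # last position handed out inside the subtree
            continue
        tin[node] = timer
        timer += 1
        stack.append((node, True))
        for child in children.get(node, []):
            stack.append((child, False))
    return tin, tout

class Trie:
    """count[n] = number of inserted strings whose path passes through node n."""
    def __init__(self):
        self.children = [{}]
        self.count = [0]

    def insert(self, word):
        node = 0
        self.count[0] += 1
        for ch in word:
            nxt = self.children[node].get(ch)
            if nxt is None:
                nxt = len(self.count)
                self.children[node][ch] = nxt
                self.children.append({})
                self.count.append(0)
            node = nxt
            self.count[node] += 1

    def count_prefix(self, prefix):
        node = 0
        for ch in prefix:
            node = self.children[node].get(ch)
            if node is None:
                return 0
        return self.count[node]

def count_prefix_in_subtrees(children, root, questions, queries):
    tin, tout = euler_tour(children, root)
    by_pos = defaultdict(list)
    for topic, text in questions:
        by_pos[tin[topic]].append(text)
    # Offline partial sums: answer = count(tout) - count(tin - 1).
    events = defaultdict(list)
    for qi, (topic, prefix) in enumerate(queries):
        events[tout[topic]].append((qi, 1, prefix))
        events[tin[topic] - 1].append((qi, -1, prefix))  # position -1 contributes 0
    answers = [0] * len(queries)
    trie = Trie()
    for pos in range(len(tin)):
        for text in by_pos.get(pos, []):
            trie.insert(text)
        for qi, sign, prefix in events.get(pos, []):
            answers[qi] += sign * trie.count_prefix(prefix)
    return answers
```

Total work is proportional to the number of nodes plus the total length of all question texts and query prefixes, which matches the "linear time" claim under these assumptions.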
Wombats
Maximum closure.
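For reference, a generic maximum closure sketch using the standard reduction to minimum cut: source to positive nodes, negative nodes to sink, and infinite-capacity dependency edges. How the Wombats input maps to `weights` and `requires` is not covered here; both names, and the choice of Edmonds-Karp for the max-flow step, are assumptions made for illustration.

```python
from collections import deque

def max_closure(weights, requires):
    """weights[v]: value of node v. requires: (u, v) pairs, picking u forces picking v.
    Returns the best total weight of a closed set (the empty set counts as 0)."""
    n = len(weights)
    source, sink = n, n + 1
    INF = float("inf")
    graph = [[] for _ in range(n + 2)]  # node -> indices into edges
    edges = []                          # [to, residual capacity]; reverse edge at index ^ 1

    def add_edge(u, v, cap):
        graph[u].append(len(edges))
        edges.append([v, cap])
        graph[v].append(len(edges))
        edges.append([u, 0])

    total_positive = 0
    for v, w in enumerate(weights):
        if w > 0:
            add_edge(source, v, w)
            total_positive += w
        elif w < 0:
            add_edge(v, sink, -w)
    for u, v in requires:
        add_edge(u, v, INF)

    flow = 0
    while True:
        # BFS for a shortest augmenting path (Edmonds-Karp).
        parent_edge = [-1] * (n + 2)
        seen = [False] * (n + 2)
        seen[source] = True
        queue = deque([source])
        while queue and not seen[sink]:
            u = queue.popleft()
            for ei in graph[u]:
                v, cap = edges[ei]
                if cap > 0 and not seen[v]:
                    seen[v] = True
                    parent_edge[v] = ei
                    queue.append(v)
        if not seen[sink]:
            break
        # Walk back from the sink to find the bottleneck, then push flow.
        bottleneck = INF
        v = sink
        while v != source:
            ei = parent_edge[v]
            bottleneck = min(bottleneck, edges[ei][1])
            v = edges[ei ^ 1][0]
        v = sink
        while v != source:
            ei = parent_edge[v]
            edges[ei][1] -= bottleneck
            edges[ei ^ 1][1] += bottleneck
            v = edges[ei ^ 1][0]
        flow += bottleneck
    # Maximum closure weight = sum of positive weights - minimum cut.
    return total_positive - flow
```

Edmonds-Karp keeps the sketch short; a larger instance would probably want Dinic or push-relabel for the flow step.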
Labeler
Use the training set to estimate \text{Pr}[q_i \in t_k | w_j \in q_i ] for all questions q_i, topics t_k and words w_j. Improved using bigrams.
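A rough sketch of that estimate, assuming training examples arrive as (question text, set of topics) pairs. The tokenization (lowercase, whitespace split), the way per-word probabilities are combined (a plain sum) and the names `tokens`, `train` and `predict` are all assumptions for illustration; the actual submission may have combined them differently.

```python
from collections import Counter, defaultdict

def tokens(text, use_bigrams=True):
    """Set of unigram features, plus adjacent word pairs when use_bigrams is set."""
    words = text.lower().split()
    feats = set(words)
    if use_bigrams:
        feats.update(zip(words, words[1:]))
    return feats

def train(examples):
    """examples: list of (question_text, set_of_topics).
    Returns model[w][t] = estimated Pr[question is in topic t | w occurs in question]."""
    word_count = Counter()
    word_topic_count = defaultdict(Counter)
    for text, topics in examples:
        for w in tokens(text):
            word_count[w] += 1
            for t in topics:
                word_topic_count[w][t] += 1
    return {
        w: {t: c / word_count[w] for t, c in topic_counts.items()}
        for w, topic_counts in word_topic_count.items()
    }

def predict(model, text, top_k=1):
    """Score each topic by summing the conditional probabilities of the features seen."""
    scores = Counter()
    for w in tokens(text):
        for t, p in model.get(w, {}).items():
            scores[t] += p
    return [t for t, _ in scores.most_common(top_k)]
```

Usage would look like `model = train(examples)` followed by `predict(model, "do birds fly high", top_k=3)`.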
Duplicate
Use \text{Pr}[w \in \text{question_text}_i \text{ and } w \notin \text{question_text}_j ] as the classification criterion: 60% accuracy. Also considering \frac{\text{view_count}_i }{ \text{view_count}_j } improved it to 70%.
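The two features could be computed roughly as below. This is only one reading of the formula: the probability is taken empirically over the combined vocabulary of the pair, the tokenizer is a plain whitespace split, and the +1 smoothing on view counts is added to avoid dividing by zero. The function name `duplicate_features` is hypothetical, and the classifier that consumes these features is not shown.

```python
def duplicate_features(q_i, q_j):
    """q_i, q_j: dicts with 'question_text' and 'view_count' keys.
    Returns (only_i, only_j, view_ratio) for the pair."""
    words_i = set(q_i["question_text"].lower().split())
    words_j = set(q_j["question_text"].lower().split())
    union = words_i | words_j
    # Empirical Pr[w in text_i and w not in text_j] over the pair's vocabulary,
    # and the same quantity with the roles of i and j swapped.
    only_i = len(words_i - words_j) / len(union) if union else 0.0
    only_j = len(words_j - words_i) / len(union) if union else 0.0
    # +1 smoothing is an assumption, not something stated in the notes above.
    view_ratio = (q_i["view_count"] + 1) / (q_j["view_count"] + 1)
    return only_i, only_j, view_ratio
```

Low values of the first two features mean the questions share most of their words, which is the signal a duplicate classifier would threshold or learn from.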