
Sunday 7 December 2014

Quora haqathon 2014

The Quora haqathon is today, from 11am to 7pm Pacific Standard Time! It features 9 problems, mixing traditional algorithm tasks with machine learning and systems programming tasks.  Link to site.

Ontology
Linearize the tree so that each topic's subtree becomes a contiguous range of questions - each query then reduces to "among questions q[x...y], how many start with prefix p?". Answer the queries offline with a partial-sum trie. Linear time.
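A minimal Python sketch of this idea, under assumptions that are not in the original: the input shapes (`children`, `questions`, `queries`) and the name `count_prefix_queries` are hypothetical, and the "partial sum trie" is read here as a counting trie swept over the linearized question list, with each range query split into two prefix counts.

```python
from collections import defaultdict

def count_prefix_queries(children, root, questions, queries):
    """children: topic -> list of child topics (hypothetical input shape)
    questions: list of (topic, text); queries: list of (topic, prefix).
    Returns, per query, how many questions in the topic's subtree start with prefix."""
    by_topic = defaultdict(list)
    for topic, text in questions:
        by_topic[topic].append(text)

    # 1. Linearize: an iterative DFS lays questions out so that every
    #    subtree occupies one half-open range [start, end) of `order`.
    order, start, span = [], {}, {}
    stack = [(root, False)]
    while stack:
        node, done = stack.pop()
        if done:
            span[node] = (start[node], len(order))
            continue
        start[node] = len(order)
        order.extend(by_topic[node])
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))

    # 2. Offline: count over [lo, hi) = count over [0, hi) - count over [0, lo).
    events = defaultdict(list)
    for qi, (topic, prefix) in enumerate(queries):
        lo, hi = span[topic]
        events[hi].append((qi, +1, prefix))
        events[lo].append((qi, -1, prefix))

    # 3. Sweep left to right, inserting questions into a counting trie.
    trie, count = [{}], [0]  # trie[n]: char -> child index; count[n]: strings through n
    answers = [0] * len(queries)

    def lookup(prefix):
        node = 0
        for ch in prefix:
            node = trie[node].get(ch)
            if node is None:
                return 0
        return count[node]

    for pos in range(len(order) + 1):
        # Events at `pos` see exactly the questions order[0:pos].
        for qi, sign, prefix in events.get(pos, []):
            answers[qi] += sign * lookup(prefix)
        if pos < len(order):
            node = 0
            count[0] += 1
            for ch in order[pos]:
                nxt = trie[node].get(ch)
                if nxt is None:
                    nxt = len(trie)
                    trie.append({})
                    count.append(0)
                    trie[node][ch] = nxt
                node = nxt
                count[node] += 1
    return answers
```

Every character of every question and every prefix is touched a constant number of times, which is where the linear bound comes from (treating dictionary lookups as constant time).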

Wombats
Maximum closure, which reduces to a minimum cut.
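For reference, a small Python sketch of the standard maximum-closure-to-min-cut reduction. It is generic and does not model the actual Wombats input: `max_closure`, `weights`, and `requires` are hypothetical names, and a plain BFS augmenting-path max-flow is used only to keep the example short.

```python
from collections import deque

def max_closure(weights, requires):
    """weights[u]: value of node u (may be negative).
    requires: list of (u, v) meaning "taking u forces taking v".
    Returns the maximum total weight of a closed set (the empty set is allowed)."""
    n = len(weights)
    source, sink = n, n + 1
    inf = sum(abs(w) for w in weights) + 1  # larger than any possible cut
    cap = [dict() for _ in range(n + 2)]    # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)             # reverse residual edge

    for u, w in enumerate(weights):
        if w > 0:
            add_edge(source, u, w)          # profit lost if u is left out
        elif w < 0:
            add_edge(u, sink, -w)           # cost paid if u is taken
    for u, v in requires:
        add_edge(u, v, inf)                 # dependency can never be cut

    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        bottleneck, v = inf, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Max closure weight = sum of positive weights - min cut.
    return sum(w for w in weights if w > 0) - flow
```

With weights `[5, -3, -4, 2]` and dependencies `[(0, 1), (3, 2)]`, taking nodes 0 and 1 is the best closed set; a faster max-flow such as Dinic's would be the natural swap for contest-sized inputs.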

Labeler
Use the training set to calculate \(\text{Pr}[q_i \in t_k | w_j \in q_i ] \) for every question \(q_i\), topic \(t_k\) and word \(w_j\). Improve using bigrams.
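A rough Python sketch of how such conditional probabilities could be estimated and used. The estimate is a plain count ratio over the training questions. Whitespace tokenization, scoring a topic by summing the probabilities over the question's words, and the names `features`, `train`, and `predict` are all assumptions made for illustration, not details from the contest solution.

```python
from collections import Counter, defaultdict

def features(text, bigrams=False):
    """Set of lowercase words in the text, optionally plus adjacent word pairs."""
    words = text.lower().split()
    feats = set(words)
    if bigrams:
        feats.update(" ".join(pair) for pair in zip(words, words[1:]))
    return feats

def train(examples, bigrams=False):
    """examples: iterable of (question_text, set_of_topics).
    Returns model[w][t] = (#questions containing w with topic t) / (#questions containing w)."""
    word_count = Counter()
    word_topic_count = defaultdict(Counter)
    for text, topics in examples:
        for w in features(text, bigrams):
            word_count[w] += 1
            for t in topics:
                word_topic_count[w][t] += 1
    return {
        w: {t: c / word_count[w] for t, c in topic_counts.items()}
        for w, topic_counts in word_topic_count.items()
    }

def predict(model, text, top_k=3, bigrams=False):
    """Rank topics by the summed estimates over the question's features (assumed rule)."""
    scores = Counter()
    for w in features(text, bigrams):
        for t, p in model.get(w, {}).items():
            scores[t] += p
    return [t for t, _ in scores.most_common(top_k)]
```

Passing `bigrams=True` to both `train` and `predict` adds adjacent word pairs as extra features, which is one simple way to read the bigram improvement.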

Duplicate
Use \( \text{Pr}[w \in \text{question_text}_i \text{ and } w \notin \text{question_text}_j ]\) as the classification criterion - 60% accuracy. Also considering \( \frac{\text{view_count}_i }{ \text{view_count}_j } \) improved it to 70%.
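One possible reading of these two signals, as a Python feature-extraction sketch. The probability is interpreted here as the fraction of distinct words (over both questions) that appear in one text but not the other; that interpretation, the whitespace tokenization, and the name `duplicate_features` are assumptions, and no classifier or threshold is implied.

```python
def duplicate_features(text_i, text_j, views_i, views_j):
    """Return (share of words only in i, share of words only in j, view-count ratio)."""
    words_i = set(text_i.lower().split())
    words_j = set(text_j.lower().split())
    union = words_i | words_j
    if union:
        only_i = len(words_i - words_j) / len(union)
        only_j = len(words_j - words_i) / len(union)
    else:
        only_i = only_j = 0.0
    # Guard against division by zero when question j has no views.
    view_ratio = views_i / views_j if views_j else float("inf")
    return only_i, only_j, view_ratio
```

Low word-difference shares suggest near-identical wording, and the view ratio can then be fed alongside them into whatever classifier is being tuned.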

