Estimating the selectivity of tf-idf based cosine similarity predicates S Tata, JM Patel ACM SIGMOD Record 36 (2), 7-12, 2007 | 302 | 2007 |
Using paxos to build a scalable, consistent, and highly available datastore J Rao, EJ Shekita, S Tata arXiv preprint arXiv:1103.2408, 2011 | 228 | 2011 |
Column-oriented storage techniques for MapReduce A Floratou, J Patel, E Shekita, S Tata arXiv preprint arXiv:1105.4252, 2011 | 202 | 2011 |
SQAK: doing more with keywords S Tata, GM Lohman Proceedings of the 2008 ACM SIGMOD international conference on Management of …, 2008 | 165 | 2008 |
Representation learning for information extraction from form-like documents BP Majumder, N Potti, S Tata, JB Wendt, Q Zhao, M Najork Proceedings of the 58th annual meeting of the Association for Computational …, 2020 | 133 | 2020 |
Practical suffix tree construction S Tata, RA Hankins, JM Patel VLDB 4, 36-47, 2004 | 125 | 2004 |
Practical methods for constructing suffix trees Y Tian, S Tata, RA Hankins, JM Patel The VLDB Journal 14, 281-299, 2005 | 124 | 2005 |
Efficient and accurate discovery of patterns in sequence data sets A Floratou, S Tata, JM Patel IEEE Transactions on Knowledge and Data Engineering 23 (8), 1154-1168, 2011 | 88 | 2011 |
Clydesdale: structured data processing on MapReduce T Kaldewey, EJ Shekita, S Tata Proceedings of the 15th international conference on extending database …, 2012 | 70 | 2012 |
FreeDOM: A transferable neural architecture for structured information extraction on web documents BY Lin, Y Sheng, N Vo, S Tata Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020 | 59 | 2020 |
Differentiated secondary index maintenance in log structured NoSQL data stores W Tan, S Tata US Patent 9,218,383, 2015 | 58 | 2015 |
Scalable row-store with consensus-based replication J Rao, EJ Shekita, S Tata US Patent 9,047,331, 2015 | 55 | 2015 |
Sparkler: Supporting large-scale matrix factorization B Li, S Tata, Y Sismanis Proceedings of the 16th international conference on extending database …, 2013 | 51 | 2013 |
Quick access: building a smart experience for Google Drive S Tata, A Popescul, M Najork, M Colagrosso, J Gibbons, A Green, A Mah, ... Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge …, 2017 | 49 | 2017 |
PiQA: An algebra for querying protein data sets S Tata, JM Patel 15th International Conference on Scientific and Statistical Database …, 2003 | 49 | 2003 |
Diff-Index: Differentiated Index in Distributed Log-Structured Data Stores. W Tan, S Tata, YR Tang, LL Fong EDBT, 700-711, 2014 | 48 | 2014 |
Declarative querying for biological sequences S Tata, JS Friedman, A Swaroop 22nd International Conference on Data Engineering (ICDE'06), 87-87, 2006 | 46 | 2006 |
BlueSNP: R package for highly scalable genome-wide association studies using Hadoop clusters H Huang, S Tata, RJ Prill Bioinformatics 29 (1), 135-136, 2013 | 45 | 2013 |
Differentiated secondary index maintenance in log structured NoSQL data stores W Tan, S Tata US Patent 9,218,385, 2015 | 43 | 2015 |
Simplified DOM trees for transferable attribute extraction from the web Y Zhou, Y Sheng, N Vo, N Edmonds, S Tata arXiv preprint arXiv:2101.02415, 2021 | 39 | 2021 |