Selected publications
-
TxtAlign: Efficient Near-Duplicate Text Alignment Search via Bottom-k Sketches for Plagiarism Detection.
Zhizhi Wang, Chaoji Zuo, Dong Deng*
SIGMOD 2022 -
SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation.
Chaoji Zuo, Sepehr Assadi, Dong Deng*
SIGMOD 2022 -
Allign: Aligning All-Pair Near-Duplicate Passages in Long Texts.
Weiqi Feng, Dong Deng*
SIGMOD 2021 -
DeltaPQ: Lossless Product Quantization Code Compression for High Dimensional Similarity Search.
Runhui Wang, Dong Deng*
PVLDB 2020 -
JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes.
Erkang Zhu, Dong Deng, Fatemeh Nargesian, Renée J. Miller
SIGMOD 2019 -
Balance-Aware Distributed String Similarity-Based Query Processing System.
Ji Sun, Zeyuan Shang, Guoliang Li, Zhifeng Bao, Dong Deng
PVLDB 2019 -
Overlap Set Similarity Joins with Theoretical Guarantees.
Dong Deng, Yufei Tao, Guoliang Li
SIGMOD 2018 slides -
Approximate String Joins with Abbreviations.
Wenbo Tao, Dong Deng, Michael Stonebraker
PVLDB 2017 -
The Data Civilizer System.
Dong Deng, Raul Castro Fernandez, Ziawasch Abedjan, Sibo Wang, Michael Stonebraker, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Nan Tang
CIDR 2017 slides -
SilkMoth: An Efficient Method for Finding Related Sets with Maximum Matching Constraints.
Dong Deng, Albert Kim, Samuel Madden, Michael Stonebraker
PVLDB 2017 slides -
META: An Efficient Matching-Based Method for Error-Tolerant Autocompletion.
Dong Deng, Guoliang Li, He Wen, H. V. Jagadish, Jianhua Feng
PVLDB 2016 slides -
Detecting Data Errors: Where Are We and What Needs To Be Done?
Ziawasch Abedjan, Xu Chu, Dong Deng, Raul Castro Fernandez, Ihab F. Ilyas, Mourad Ouzzani, Paolo Papotti, Michael Stonebraker, Nan Tang
PVLDB 2016 slides -
Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach.
Chengliang Chai, Guoliang Li, Jian Li, Dong Deng, Jianhua Feng
SIGMOD 2016 -
An Efficient Partition Based Method for Exact Set Similarity Joins.
Dong Deng, Guoliang Li, He Wen, Jianhua Feng
PVLDB 2015 slides -
Efficient Similarity Join and Search on Multi-Attribute Data.
Guoliang Li, Jian He, Dong Deng, Jian Li
SIGMOD 2015 -
A Pivotal Prefix Based Filtering Algorithm for String Similarity Search.
Dong Deng, Guoliang Li, Jianhua Feng
SIGMOD 2014 slides -
Distributed Graph Simulation: Impossibility and Possibility.
Wenfei Fan, Xin Wang, Yinghui Wu, Dong Deng
PVLDB 2014 -
Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases.
Dong Deng, Yu Jiang, Guoliang Li, Jian Li, Cong Yu
PVLDB 2013 slides -
PassJoin: A Partition-based Method for Similarity Joins.
Guoliang Li, Dong Deng, Jiannan Wang, Jianhua Feng
PVLDB 2011 slides -
Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction.
Guoliang Li, Dong Deng, Jianhua Feng
SIGMOD 2011 slides
Recent Publications
-
TxtAlign: Efficient Near-Duplicate Text Alignment Search via Bottom-k Sketches for Plagiarism Detection.
Zhizhi Wang, Chaoji Zuo, Dong Deng*
SIGMOD 2022 -
SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation.
Chaoji Zuo, Sepehr Assadi, Dong Deng*
SIGMOD 2022 -
Allign: Aligning All-Pair Near-Duplicate Passages in Long Texts.
Weiqi Feng, Dong Deng*
SIGMOD 2021 -
DeltaPQ: Lossless Product Quantization Code Compression for High Dimensional Similarity Search.
Runhui Wang, Dong Deng*
PVLDB 2020
All publications by type
-
C28: Zhizhi Wang, Chaoji Zuo, Dong Deng*
TxtAlign: Efficient Near-Duplicate Text Alignment Search via Bottom-k Sketches for Plagiarism Detection.
SIGMOD 2022 -
C27: Chaoji Zuo, Sepehr Assadi, Dong Deng*
SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation.
SIGMOD 2022 -
C26: Weiqi Feng, Dong Deng*
Allign: Aligning All-Pair Near-Duplicate Passages in Long Texts.
SIGMOD 2021 -
C25: Runhui Wang, Dong Deng*
DeltaPQ: Lossless Product Quantization Code Compression for High Dimensional Similarity Search.
PVLDB 2020 -
C24: Erkang Zhu, Dong Deng, Fatemeh Nargesian, Renée J. Miller
JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes.
SIGMOD 2019 -
C23: Ji Sun, Zeyuan Shang, Guoliang Li, Zhifeng Bao, Dong Deng
Balance-Aware Distributed String Similarity-Based Query Processing System.
PVLDB 2019 -
C22: Dong Deng, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Guoliang Li, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang
Unsupervised String Transformation Learning for Entity Consolidation.
ICDE 2019 -
C21: Dong Deng, Chengcheng Yang, Shuo Shang, Fan Zhu, Li Liu, Ling Shao
LCJoin: Set Containment Join via List Crosscutting.
ICDE 2019 -
C20: Zeyi Wen, Dong Deng, Rui Zhang, Ramamohanarao Kotagiri
2ED: An Efficient Entity Extraction Algorithm Using Two-Level Edit-Distance.
ICDE 2019 -
C19: Dong Deng, Yufei Tao, Guoliang Li
Overlap Set Similarity Joins with Theoretical Guarantees.
SIGMOD 2018 slides -
C18: Wenbo Tao, Dong Deng, Michael Stonebraker
Approximate String Joins with Abbreviations.
PVLDB 2017 -
C17: Dong Deng, Raul Castro Fernandez, Ziawasch Abedjan, Sibo Wang, Michael Stonebraker, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Nan Tang
The Data Civilizer System.
CIDR 2017 slides -
C16: Dong Deng, Albert Kim, Samuel Madden, Michael Stonebraker
SilkMoth: An Efficient Method for Finding Related Sets with Maximum Matching Constraints.
PVLDB 2017 slides -
C15: Dong Deng, Guoliang Li, He Wen, H. V. Jagadish, Jianhua Feng
META: An Efficient Matching-Based Method for Error-Tolerant Autocompletion.
PVLDB 2016 slides -
C14: Michael Stonebraker, Dong Deng, Michael L. Brodie
Database Decay and How To Avoid It.
BigData 2016 -
C13: Ziawasch Abedjan, Xu Chu, Dong Deng, Raul Castro Fernandez, Ihab F. Ilyas, Mourad Ouzzani, Paolo Papotti, Michael Stonebraker, Nan Tang
Detecting Data Errors: Where Are We and What Needs To Be Done?
PVLDB 2016 slides -
C12: Chengliang Chai, Guoliang Li, Jian Li, Dong Deng, Jianhua Feng
Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach.
SIGMOD 2016 -
C11: Dong Deng, Guoliang Li, He Wen, Jianhua Feng
An Efficient Partition Based Method for Exact Set Similarity Joins.
PVLDB 2015 slides -
C10: Jin Wang, Guoliang Li, Dong Deng, Yong Zhang, Jianhua Feng
Two Birds with One Stone: An Efficient Hierarchical Framework for Top-k and Threshold-based String Similarity Search.
ICDE 2015 -
C9: Guoliang Li, Jian He, Dong Deng, Jian Li
Efficient Similarity Join and Search on Multi-Attribute Data.
SIGMOD 2015 -
C8: Dong Deng, Guoliang Li, Shuang Hao, Jiannan Wang, Jianhua Feng
MassJoin: A MapReduce-based Algorithm for String Similarity Joins.
ICDE 2014 slides -
C7: Dong Deng, Guoliang Li, Jianhua Feng
A Pivotal Prefix Based Filtering Algorithm for String Similarity Search.
SIGMOD 2014 slides -
C6: Wenfei Fan, Xin Wang, Yinghui Wu, Dong Deng
Distributed Graph Simulation: Impossibility and Possibility.
PVLDB 2014 -
C5: Dong Deng, Yu Jiang, Guoliang Li, Jian Li, Cong Yu
Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases.
PVLDB 2013 slides -
C4: Dong Deng, Guoliang Li, Jianhua Feng, Wen-Syan Li
Top-k String Similarity Search with Edit-Distance Constraints.
ICDE 2013 slides -
C3: Dong Deng, Guoliang Li, Jianhua Feng
An Efficient Trie-based Method for Approximate Entity Extraction with Edit-Distance Constraints.
ICDE 2012 slides -
C2: Guoliang Li, Dong Deng, Jiannan Wang, Jianhua Feng
PassJoin: A Partition-based Method for Similarity Joins.
PVLDB 2011 slides -
C1: Guoliang Li, Dong Deng, Jianhua Feng
Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction.
SIGMOD 2011 slides
-
J8: Zaifeng Pan, Feng Zhang, Hourun Li, Chenyang Zhang, Xiaoyong Du, Dong Deng
G-SLIDE: A GPU-Based Sub-Linear Deep Learning Engine via LSH Sparsification.
TPDS 2021 -
J7: Chengcheng Yang, Dong Deng, Shuo Shang, Fan Zhu, Li Liu, Ling Shao
Internal and External Memory Set Containment Join.
VLDB Journal 2021 -
J6: Chengliang Chai, Guoliang Li, Jian Li, Dong Deng, Jianhua Feng
A Partial-Order-based Framework for Cost-Effective Crowdsourced Entity Resolution.
VLDB Journal 2018 -
J5: Minghe Yu, Jin Wang, Guoliang Li, Yong Zhang, Dong Deng, Jianhua Feng
A Unified Framework for String Similarity Search with Edit-Distance Constraint.
VLDB Journal 2017 -
J4: Minghe Yu, Guoliang Li, Dong Deng, Jianhua Feng
String Similarity Search and Join: A Survey.
FCS 2016 -
J3: Dong Deng, Guoliang Li, Jianhua Feng, Yi Duan, Zhiguo Gong
A Unified Framework for Approximate Dictionary-based Entity Extraction.
VLDB Journal 2015 -
J2: Sebastian Wandelt, Dong Deng, Stefan Gerdjikov, Shashwat Mishra, Petar Mitankin, Manish Patil, Enrico Siragusa, Alexander Tiskin, Wei Wang, Jiaying Wang, Ulf Leser
State-of-the-art in String Similarity Search and Join.
SIGMOD Record 2014 -
J1: Guoliang Li, Dong Deng, Jianhua Feng
A Partition-based Method for String Similarity Joins with Edit-Distance Constraints.
TODS 2013
-
D3: Essam Mansour, Dong Deng, Raul Castro Fernandez, Abdulhakim Ali Qahtan, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang
Building Data Civilizer Pipelines with an Advanced Workflow Engine.
ICDE (demo) 2018 -
D2: Ji Sun, Zeyuan Shang, Guoliang Li, Dong Deng, Zhifeng Bao
DIMA: A Distributed In-Memory Similarity-Based Query Processing System.
PVLDB (demo) 2017 -
D1: Raul Castro Fernandez, Dong Deng, Essam Mansour, Abdulhakim Ali Qahtan, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang
A Demo of the Data Civilizer System.
SIGMOD (demo) 2017
-
S2: Chengcheng Yang, Dong Deng, Shuo Shang, Ling Shao
Efficient Locality-Sensitive Hashing Over High-Dimensional Data Streams.
ICDE (short paper) 2020 -
T1: Dong Deng
Error-Tolerant Big Data Processing.
Thesis 2016 -
W1: Yu Jiang, Dong Deng, Jiannan Wang, Guoliang Li, Jianhua Feng
Efficient Parallel Partition-based Algorithms for Similarity Search and Join with Edit Distance Constraints.
EDBT (workshop) 2013 slides -
S1: Guoliang Li, Dong Deng, Jianhua Feng
Extending Dictionary-based Entity Extraction to Tolerate Errors.
CIKM (short paper) 2010
C: conference research paper
J: journal paper
D: conference demo paper
S: conference short paper
W: workshop paper
T: thesis
All publications in chronological order
-
Extending Dictionary-based Entity Extraction to Tolerate Errors.
Guoliang Li, Dong Deng, Jianhua Feng CIKM (short paper) 2010 -
Faerie: Efficient Filtering Algorithms for Approximate Dictionary-based Entity Extraction.
Guoliang Li, Dong Deng, Jianhua Feng SIGMOD 2011 -
PassJoin: A Partition-based Method for Similarity Joins.
Guoliang Li, Dong Deng, Jiannan Wang, Jianhua Feng PVLDB 2011 -
An Efficient Trie-based Method for Approximate Entity Extraction with Edit-Distance Constraints.
Dong Deng, Guoliang Li, Jianhua Feng ICDE 2012 -
A Partition-based Method for String Similarity Joins with Edit-Distance Constraints.
Guoliang Li, Dong Deng, Jianhua Feng TODS 2013 -
Top-k String Similarity Search with Edit-Distance Constraints.
Dong Deng, Guoliang Li, Jianhua Feng, Wen-Syan Li ICDE 2013 -
Efficient Parallel Partition-based Algorithms for Similarity Search and Join with Edit Distance Constraints.
Yu Jiang, Dong Deng, Jiannan Wang, Guoliang Li, Jianhua Feng EDBT (workshop) 2013 -
Scalable Column Concept Determination for Web Tables Using Large Knowledge Bases.
Dong Deng, Yu Jiang, Guoliang Li, Jian Li, Cong Yu PVLDB 2013 -
MassJoin: A MapReduce-based Algorithm for String Similarity Joins.
Dong Deng, Guoliang Li, Shuang Hao, Jiannan Wang, Jianhua Feng ICDE 2014 -
Distributed Graph Simulation: Impossibility and Possibility.
Wenfei Fan, Xin Wang, Yinghui Wu, Dong Deng PVLDB 2014 -
A Pivotal Prefix Based Filtering Algorithm for String Similarity Search.
Dong Deng, Guoliang Li, Jianhua Feng SIGMOD 2014 -
State-of-the-art in String Similarity Search and Join.
Sebastian Wandelt, Dong Deng, Stefan Gerdjikov, Shashwat Mishra, Petar Mitankin, Manish Patil, Enrico Siragusa, Alexander Tiskin, Wei Wang, Jiaying Wang, Ulf Leser SIGMOD Record 2014 -
A Unified Framework for Approximate Dictionary-based Entity Extraction.
Dong Deng, Guoliang Li, Jianhua Feng, Yi Duan, Zhiguo Gong VLDB Journal 2015 -
Efficient Similarity Join and Search on Multi-Attribute Data.
Guoliang Li, Jian He, Dong Deng, Jian Li SIGMOD 2015 -
Two Birds with One Stone: An Efficient Hierarchical Framework for Top-k and Threshold-based String Similarity Search.
Jin Wang, Guoliang Li, Dong Deng, Yong Zhang, Jianhua Feng ICDE 2015 -
An Efficient Partition Based Method for Exact Set Similarity Joins.
Dong Deng, Guoliang Li, He Wen, Jianhua Feng PVLDB 2015 -
Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach.
Chengliang Chai, Guoliang Li, Jian Li, Dong Deng, Jianhua Feng SIGMOD 2016 -
Error-Tolerant Big Data Processing.
Dong Deng Thesis 2016 -
String Similarity Search and Join: A Survey.
Minghe Yu, Guoliang Li, Dong Deng, Jianhua Feng FCS 2016 -
META: An Efficient Matching-Based Method for Error-Tolerant Autocompletion.
Dong Deng, Guoliang Li, He Wen, H. V. Jagadish, Jianhua Feng PVLDB 2016 -
Database Decay and How To Avoid It.
Michael Stonebraker, Dong Deng, Michael L. Brodie BigData 2016 -
Detecting Data Errors: Where Are We and What Needs To Be Done?
Ziawasch Abedjan, Xu Chu, Dong Deng, Raul Castro Fernandez, Ihab F. Ilyas, Mourad Ouzzani, Paolo Papotti, Michael Stonebraker, Nan Tang PVLDB 2016 -
DIMA: A Distributed In-Memory Similarity-Based Query Processing System.
Ji Sun, Zeyuan Shang, Guoliang Li, Dong Deng, Zhifeng Bao PVLDB (demo) 2017 -
SilkMoth: An Efficient Method for Finding Related Sets with Maximum Matching Constraints.
Dong Deng, Albert Kim, Samuel Madden, Michael Stonebraker PVLDB 2017 -
A Demo of the Data Civilizer System.
Raul Castro Fernandez, Dong Deng, Essam Mansour, Abdulhakim Ali Qahtan, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang SIGMOD (demo) 2017 -
The Data Civilizer System.
Dong Deng, Raul Castro Fernandez, Ziawasch Abedjan, Sibo Wang, Michael Stonebraker, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Nan Tang CIDR 2017 -
A Unified Framework for String Similarity Search with Edit-Distance Constraint.
Minghe Yu, Jin Wang, Guoliang Li, Yong Zhang, Dong Deng, Jianhua Feng VLDB Journal 2017 -
Approximate String Joins with Abbreviations.
Wenbo Tao, Dong Deng, Michael Stonebraker PVLDB 2017 -
Building Data Civilizer Pipelines with an Advanced Workflow Engine.
Essam Mansour, Dong Deng, Raul Castro Fernandez, Abdulhakim Ali Qahtan, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang ICDE (demo) 2018 -
Overlap Set Similarity Joins with Theoretical Guarantees.
Dong Deng, Yufei Tao, Guoliang Li SIGMOD 2018 -
A Partial-Order-based Framework for Cost-Effective Crowdsourced Entity Resolution.
Chengliang Chai, Guoliang Li, Jian Li, Dong Deng, Jianhua Feng VLDB Journal 2018 -
Balance-Aware Distributed String Similarity-Based Query Processing System.
Ji Sun, Zeyuan Shang, Guoliang Li, Zhifeng Bao, Dong Deng PVLDB 2019 -
2ED: An Efficient Entity Extraction Algorithm Using Two-Level Edit-Distance.
Zeyi Wen, Dong Deng, Rui Zhang, Ramamohanarao Kotagiri ICDE 2019 -
LCJoin: Set Containment Join via List Crosscutting.
Dong Deng, Chengcheng Yang, Shuo Shang, Fan Zhu, Li Liu, Ling Shao ICDE 2019 -
Unsupervised String Transformation Learning for Entity Consolidation.
Dong Deng, Wenbo Tao, Ziawasch Abedjan, Ahmed K. Elmagarmid, Ihab F. Ilyas, Guoliang Li, Samuel Madden, Mourad Ouzzani, Michael Stonebraker, Nan Tang ICDE 2019 -
JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes.
Erkang Zhu, Dong Deng, Fatemeh Nargesian, Renée J. Miller SIGMOD 2019 -
DeltaPQ: Lossless Product Quantization Code Compression for High Dimensional Similarity Search.
Runhui Wang, Dong Deng* PVLDB 2020 -
Efficient Locality-Sensitive Hashing Over High-Dimensional Data Streams.
Chengcheng Yang, Dong Deng, Shuo Shang, Ling Shao ICDE (short paper) 2020 -
G-SLIDE: A GPU-Based Sub-Linear Deep Learning Engine via LSH Sparsification.
Zaifeng Pan, Feng Zhang, Hourun Li, Chenyang Zhang, Xiaoyong Du, Dong Deng TPDS 2021 -
Internal and External Memory Set Containment Join.
Chengcheng Yang, Dong Deng, Shuo Shang, Fan Zhu, Li Liu, Ling Shao VLDB Journal 2021 -
Allign: Aligning All-Pair Near-Duplicate Passages in Long Texts.
Weiqi Feng, Dong Deng* SIGMOD 2021 -
TxtAlign: Efficient Near-Duplicate Text Alignment Search via Bottom-k Sketches for Plagiarism Detection.
Zhizhi Wang, Chaoji Zuo, Dong Deng* SIGMOD 2022 -
SPINE: Scaling up Programming-by-Negative-Example for String Filtering and Transformation.
Chaoji Zuo, Sepehr Assadi, Dong Deng* SIGMOD 2022