Publications on PubMed and Google Scholar
Current and former members of the JSB Group are in bold
author* indicates equal contribution
author indicates corresponding author(s)
Statistical rigor in omics data analysis
64. Zhou, H.J., Li, L., Li, Y., Li, W., and Li, J.J. (2022). PCA outperforms popular hidden variable inference methods for QTL mapping. Genome Biology 23:210. [ Highlight talk at RECOMB 2023 ] [ SOFTWARE ] [ PDF ]
59. Li, Y.*, Ge, X.*, Peng, F., Li, W., and Li, J.J. (2022). Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome Biology 23:79. [ UCLA NEWS ] [ CODE ] [ PDF ]
38. Li, J.J. and Tong, X. (2020). Statistical hypothesis testing versus machine-learning binary classification: distinctions and guidelines. Patterns 1(7):100115. [ UCLA NEWS ] [ PODCAST ]
Single-cell RNA-seq
73. Song, D., Wang, Q., Yan, G., Liu, T., and Li, J.J. (2024). scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics. Nature Biotechnology 42:247-252. [ SOFTWARE ] [ PDF ]
58. Jiang, R., Sun, T., Song, D., and Li, J.J. (2022). Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biology 23:31. [ CODE ] [ PDF ]
53. Song, D.*, Li, K.*, Hemminger, Z., Wollman, R., and Li, J.J. (2021). scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling. Bioinformatics 37(Supplement_1):i358-i366. [ ISMB/ECCB 2021 ] [ SOFTWARE ]
50. Sun, T., Song, D., Li, W.V., and Li, J.J. (2021). scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured. Genome Biology 22:163. [ RECOMB 2021 ] [ UCLA NEWS ] [ SOFTWARE ] [ CODE ] [ PDF ]
46. Song, D. and Li, J.J. (2021). PseudotimeDE: inference of differential gene expression along cell pseudotime with well-calibrated p-values from single-cell RNA sequencing data. Genome Biology 22:124. [ UCLA NEWS ] [ SOFTWARE ] [ CODE ]
45. Xi, N.M. and Li, J.J. (2021). Benchmarking computational doublet-detection methods for single-cell RNA sequencing data. Cell Systems 12(2):176-194. [ CODE ] [ DATA ] [ SSRN’s Top Downloaded Paper of Apr 9 – Jun 7, 2021 in Computational Biology eJournal ]
34. Li, W.V. and Li, J.J. (2019). A statistical simulator scDesign for rational scRNA-seq experimental design. Bioinformatics 35(14):i41-i50. [ ISMB/ECCB 2019 ] [ SOFTWARE ]
26. Li, W.V. and Li, J.J. (2018). An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nature Communications 9:997. [ UCLA NEWS ] [ SOFTWARE ]
Bulk RNA-seq isoform discovery and quantification
36. Li, W.V.*, Li, S.*, Tong, X., Deng, L., Shi, H., and Li, J.J. (2019). AIDE: annotation-assisted isoform discovery with high precision. Genome Research 29:2056-2072. [ UCLA NEWS ] [ SOFTWARE ] [ DATA ] [ COVER ART ]
29. Li, W.V. and Li, J.J. (2018). Modeling and analysis of RNA-seq data: a review from a statistical perspective. Quantitative Biology 6(3):195-209.
27. Li, W.V.*, Zhao, A., Zhang, S., and Li, J.J.* (2018). MSIQ: joint modeling of multiple RNA-seq samples for accurate isoform quantification. Annals of Applied Statistics 12(1):510-539. [ SOFTWARE ] [ COLOR PDF ]
15. Ye, Y. and Li, J.J. (2016). NMFP: a non-negative matrix factorization based preselection method to increase accuracy of identifying mRNA isoforms from RNA-seq data. BMC Genomics 17(Supp 1):11. [ SOFTWARE ]
2. Li, J.J., Jiang, C.-R., Brown, B.J., Huang, H., and Bickel, P.J. (2011). Sparse linear modeling of RNA-seq data for isoform discovery and abundance estimation. Proc. Natl. Acad. Sci. USA 108(50):19867-19872. [ SOFTWARE ]
Central dogma and translational control
35. Li, J.J., Chew, G.-L., and Biggin, M.D. (2019). Quantitative principles of cis-translational control by general mRNA sequence features in eukaryotes. Genome Biology 20:162. [ CODE ]
22. Li, J.J., Chew, G.-L., and Biggin, M.D. (2017). Quantitating translational control: mRNA abundance-dependent and independent contributions and the mRNA sequences that specify them. Nucleic Acids Research 45(20):11821-11836. [ Highlight talk at RECOMB 2018 ]
11. Li, J.J. and Biggin, M.D. (2015). Statistics requantitates the central dogma. Science 347(6226):1066-1067. [ UCLA NEWS ] [ Interview at Significance 12(3):8 ]
7. Li, J.J., Bickel, P.J., and Biggin, M.D. (2014). System wide analyses have underestimated protein abundances and transcriptional importance in animals. PeerJ 2:e270. [ Press release ] [ Guest post on “Bits of DNA” blog ] [ “PeerJ Picks 2015” Collection ] [ “Top Bioinformatics Papers – June 2015” Collection ] [ Top 5 most cited PeerJ articles ]
Classification methodologies and applications
65. Zhang, C., Chen, Y.E., Zhang, S., and Li, J.J. (2022). Information-theoretic classification accuracy: a criterion that guides data-driven combination of ambiguous outcome labels in multi-class classification. Journal of Machine Learning Research 23(341):1-65. [ RECOMB 2023 ] [ SOFTWARE ] [ PDF ]
49. Li, J.J., Chen, Y.E., and Tong, X. (2021). A flexible model-free prediction-based framework for feature ranking. Journal of Machine Learning Research 22(124):1-54. [ SOFTWARE ]
40. Lyu, J.*, Li, J.J.*, Su, J., Peng, F., Chen, Y.E., Ge, X., and Li, W. (2020). DORGE: Discovery of Oncogenes and tumor suppressoR genes using Genetic and Epigenetic features. Science Advances 6(46):eaba6784. [ VIDEO ]
25. Tong, X.*, Feng, Y.*, and Li, J.J. (2018). Neyman-Pearson classification algorithms and NP receiver operating characteristics. Science Advances 4(2):eaao1659. [ SOFTWARE ] [ VIDEO ] [ Francis X. Diebold’s Blog on NP Classification ]
Microbiome sequencing data imputation
52. Jiang, R., Li, W.V., and Li, J.J. (2021). mbImpute: an accurate and robust imputation method for microbiome data. Genome Biology 22:192. [ UCLA NEWS ] [ SOFTWARE ] [ PDF ]
Networks
48. Sun, Y.E., Zhou, H.J., and Li, J.J. (2021). Bipartite tight spectral clustering (BiTSC) algorithm for identifying conserved gene co-clusters in two species. Bioinformatics 37(9):1225-1233. [ SOFTWARE ]
42. Wang, Y.X.R., Li, L., Li, J.J., and Huang, H. (2021). Network modeling in biology: statistical methods for gene and brain networks. Statistical Science 36(1):89-108.
32. Razaee, Z.S., Amini, A.A., and Li, J.J. (2019). Matched bipartite block model with covariates. Journal of Machine Learning Research 20(34):1-44.
High-dimensional model inference
37. Liu, H., Xu, X., and Li, J.J. (2020). A bootstrap lasso + partial ridge method to construct confidence intervals for parameters in high-dimensional sparse linear models. Statistica Sinica 30:1333-1355. [ SOFTWARE ]
Comparative genomics
33. Ge, X.*, Zhang, H.*, Xie, L., Li, W.V., Kwon, S.B., and Li, J.J. (2019). EpiAlign: an alignment-based bioinformatic tool for comparing chromatin state sequences. Nucleic Acids Research 47(13):e77. [ SOFTWARE ] [ WEBSITE ]
30. Duong, D., Ahmad, W.U., Eskin, E., Chang, K.-W., and Li, J.J. (2019). Word and sentence embedding tools to measure semantic similarity of Gene Ontology terms by their definitions. Journal of Computational Biology 26(1):38-52. [ SOFTWARE ]
19. Li, W.V., Chen, Y., and Li, J.J. (2017). TROM: a testing-based method for finding transcriptomic similarity of biological samples. Statistics in Biosciences 9(1):105-136. [ SOFTWARE ]
18. Gao, R. and Li, J.J. (2017). Correspondence of D. melanogaster and C. elegans developmental stages revealed by alternative splicing characteristics of conserved exons. BMC Genomics 18:234.
17. Yang, Y.*, Yang, Y.T.*, Yuan, J., Lu, Z.J., and Li, J.J. (2017). Large-scale mapping of mammalian transcriptomes identifies conserved genes associated with different cell states. Nucleic Acids Research 45(4):1657-1672. [ DATA ]
14. Li, W.V., Razaee, Z.S., and Li, J.J. (2016). Epigenome overlap measure (EPOM) for comparing tissue/cell types based on chromatin states. BMC Genomics 17(Supp 1):10. [ SOFTWARE ]
10. Gerstein, M.B.*, Rozowsky, J.*, Yan, K.K.*, Wang, D.*, Cheng, C.*, Brown, J.B.*, Davis, C.A.*, Hillier, L.*, Sisu, C.*, Li, J.J.*, Pei, B.*, Harmanci, A.O.*, Duff, M.O.*, Djebali, S.*, and 82 other authors from the modENCODE consortium (2014). Comparative analysis of the transcriptome across distant species. Nature 512(7515):445-448. [ NIH NEWS ]
8. Li, J.J., Huang, H., Bickel, P.J., and Brenner, S.E. (2014). Comparison of D. melanogaster and C. elegans developmental stages, tissues, and cells by modENCODE RNA-seq data. Genome Research 24(7):1086-1101. [ Press release ] [ Top 10 papers selected at the 2014 RECOMB/ISCB Conference on Regulatory & Systems Genomics ] [ DATA ] [ SOFTWARE ]
Gene regulation
1. MacArthur, S.*, Li, X.Y.*, Li, J.*, Brown, J.B., Chu, H.C., Zeng, L., Grondona, B.P., Hechmer, A., Simirenko, L., Keranen, S.V., Knowles, D.W., Stapleton, M., Bickel, P., Biggin, M.D., and Eisen, M.B. (2009). Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions. Genome Biology 10:R80. [ Faculty of 1000 recommendation ]