Bibliography

Abdelaal et al., 2019
Abdelaal, T., Michielsen, L., Cats, D., Hoogduin, D., Mei, H., Reinders, M. J., and Mahfouz, A. (2019).
A comparison of automatic cell identification methods for single-cell RNA sequencing data.
Genome biology, 20(1):194.

Amemiya et al., 2019
Amemiya, H. M., Kundaje, A., and Boyle, A. P. (2019).
The ENCODE blacklist: identification of problematic regions of the genome.
Scientific reports, 9(1):1-5.

Bakken et al., 2018
Bakken, T. E., Hodge, R. D., Miller, J. A., Yao, Z., Nguyen, T. N., Aevermann, B., Barkan, E., Bertagnolli, D., Casper, T., Dee, N., et al. (2018).
Single-nucleus and single-cell transcriptomes compared in matched cortical cell types.
PloS one, 13(12):e0209648.

Bastidas-Ponce et al., 2019
Bastidas-Ponce, A., Tritschler, S., Dony, L., Scheibner, K., Tarquis-Medina, M., Salinno, C., Schirge, S., Burtscher, I., Böttcher, A., Theis, F. J., et al. (2019).
Comprehensive single cell mRNA profiling reveals a detailed roadmap for pancreatic endocrinogenesis.
Development, 146(12):dev173849.

Bentsen et al., 2020
Bentsen, M., Goymann, P., Schultheis, H., Klee, K., Petrova, A., Wiegandt, R., Fust, A., Preussner, J., Kuenne, C., Braun, T., et al. (2020).
ATAC-seq footprinting unravels kinetics of transcription factor binding during zygotic genome activation.
Nature communications, 11(1):1-11.

Bergen et al., 2020
Bergen, V., Lange, M., Peidli, S., Wolf, F. A., and Theis, F. J. (2020).
Generalizing RNA velocity to transient cell states through dynamical modeling.
Nature biotechnology, 38(12):1408-1414.

Bergen et al., 2021
Bergen, V., Soldatov, R. A., Kharchenko, P. V., and Theis, F. J. (2021).
RNA velocity - current challenges and future perspectives.
Molecular systems biology, 17(8):e10282.

Buenrostro et al., 2013
Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y., and Greenleaf, W. J. (2013).
Transposition of native chromatin for multimodal regulatory analysis and personal epigenomics.
Nature methods, 10(12):1213.

Chao, 1987
Chao, A. (1987).
Estimating the population size for capture-recapture data with unequal catchability.
Biometrics, pages 783-791.

Chao et al., 2014
Chao, A., Gotelli, N. J., Hsieh, T., Sander, E. L., Ma, K., Colwell, R. K., and Ellison, A. M. (2014).
Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies.
Ecological monographs, 84(1):45-67.

Chao et al., 2013
Chao, A., Wang, Y., and Jost, L. (2013).
Entropy and the species accumulation curve: a novel entropy estimator via discovery rates of new species.
Methods in Ecology and Evolution, 4(11):1091-1100.

Gale and Sampson, 1995
Gale, W. A. and Sampson, G. (1995).
Good-Turing frequency estimation without tears.
Journal of quantitative linguistics, 2(3):217-237.

Germain et al., 2020
Germain, P., Sonrel, A., and Robinson, M. (2020).
pipeComp, a general framework for the evaluation of computational pipelines, reveals performant single cell RNA-seq preprocessing tools.
Genome biology, 21(1):227.

Hafemeister and Satija, 2019
Hafemeister, C. and Satija, R. (2019).
Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression.
Genome biology, 20(1):296.

Ilicic et al., 2016
Ilicic, T., Kim, J. K., Kolodziejczyk, A. A., Bagger, F. O., McCarthy, D. J., Marioni, J. C., and Teichmann, S. A. (2016).
Classification of low quality cells from single-cell RNA-seq data.
Genome biology, 17(1):1-15.

Islam et al., 2014
Islam, S., Zeisel, A., Joost, S., La Manno, G., Zajac, P., Kasper, M., Lönnerberg, P., and Linnarsson, S. (2014).
Quantitative single-cell RNA-seq with unique molecular identifiers.
Nature methods, 11(2):163.

Kang et al., 2018
Kang, H. M., Subramaniam, M., Targ, S., Nguyen, M., Maliskova, L., McCarthy, E., Wan, E., Wong, S., Byrnes, L., Lanata, C. M., et al. (2018).
Multiplexed droplet single-cell RNA-sequencing using natural genetic variation.
Nature biotechnology, 36(1):89.

Kobak and Berens, 2019
Kobak, D. and Berens, P. (2019).
The art of using t-SNE for single-cell transcriptomics.
Nature communications, 10(1):1-14.

Kuchenbecker et al., 2015
Kuchenbecker, L., Nienen, M., Hecht, J., Neumann, A. U., Babel, N., Reinert, K., and Robinson, P. N. (2015).
IMSEQ - a fast and error aware approach to immunogenetic sequence analysis.
Bioinformatics, 31(18):2963-2971.

Kulakovskiy et al., 2018
Kulakovskiy, I. V., Vorontsov, I. E., Yevshin, I. S., Sharipov, R. N., Fedorova, A. D., Rumynskiy, E. I., Medvedeva, Y. A., Magana-Mora, A., Bajic, V. B., Papatsenko, D. A., et al. (2018).
HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-seq analysis.
Nucleic acids research, 46(D1):D252-D259.

La Manno et al., 2018
La Manno, G., Soldatov, R., Zeisel, A., Braun, E., Hochgerner, H., Petukhov, V., Lidschreiber, K., Kastriti, M. E., Lönnerberg, P., Furlan, A., et al. (2018).
RNA velocity of single cells.
Nature, 560(7719):494-498.

Lefranc et al., 2009
Lefranc, M.-P., Giudicelli, V., Ginestoux, C., Jabado-Michaloud, J., Folch, G., Bellahcene, F., Wu, Y., Gemrot, E., Brochet, X., Lane, J., et al. (2009).
IMGT®, the international ImMunoGeneTics information system®.
Nucleic acids research, 37(suppl_1):D1006-D1012.

Li et al., 2017
Li, H., Linderman, G. C., Szlam, A., Stanton, K. P., Kluger, Y., and Tygert, M. (2017).
Algorithm 971: An implementation of a randomized algorithm for principal component analysis.
ACM Transactions on Mathematical Software (TOMS), 43(3):1-14.

Lun et al., 2019
Lun, A. T., Riesenfeld, S., Andrews, T., Gomes, T., Marioni, J. C., et al. (2019).
EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data.
Genome Biology, pages 1-9.

Maaten and Hinton, 2008
Maaten, L. v. d. and Hinton, G. (2008).
Visualizing data using t-SNE.
Journal of machine learning research, 9(Nov):2579-2605.

MacParland et al., 2018
MacParland, S. A., Liu, J. C., Ma, X.-Z., Innes, B. T., Bartczak, A. M., Gage, B. K., Manuel, J., Khuu, N., Echeverri, J., Linares, I., et al. (2018).
Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations.
Nature communications, 9(1):1-21.

McInnes et al., 2018
McInnes, L., Healy, J., and Melville, J. (2018).
UMAP: Uniform manifold approximation and projection for dimension reduction.
arXiv preprint arXiv:1802.03426.

Otsu, 1979
Otsu, N. (1979).
A threshold selection method from gray-level histograms.
IEEE transactions on systems, man, and cybernetics, 9(1):62-66.

Parkhomchuk et al., 2009
Parkhomchuk, D., Borodina, T., Amstislavskiy, V., Banaru, M., Hallen, L., Krobitsch, S., Lehrach, H., and Soldatov, A. (2009).
Transcriptome analysis by strand-specific sequencing of complementary DNA.
Nucleic acids research, 37(18):e123.

Platt et al., 1999
Platt, J. et al. (1999).
Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods.
Advances in large margin classifiers, 10(3):61-74.

Robinson et al., 2010
Robinson, M. D., McCarthy, D. J., and Smyth, G. K. (2010).
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.
Bioinformatics, 26(1):139-140.

Satopaa et al., 2011
Satopaa, V., Albrecht, J., Irwin, D., and Raghavan, B. (2011).
Finding a "kneedle" in a haystack: Detecting knee points in system behavior.
In 2011 31st international conference on distributed computing systems workshops, pages 166-171. IEEE.

Stoeckius et al., 2018
Stoeckius, M., Zheng, S., Houck-Loomis, B., Hao, S., Yeung, B. Z., Mauck, W. M., Smibert, P., and Satija, R. (2018).
Cell hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics.
Genome biology, 19(1):1-12.

Strino and Lappe, 2016
Strino, F. and Lappe, M. (2016).
Identifying peaks in *-seq data using shape information.
BMC bioinformatics, 17(5):343-361.

Taavitsainen et al., 2021
Taavitsainen, S., Engedal, N., Cao, S., Handle, F., Erickson, A., Prekovic, S., Wetterskog, D., Tolonen, T., Vuorinen, E., Kiviaho, A., et al. (2021).
Single-cell ATAC and RNA sequencing reveal pre-existing and persistent cells associated with prostate cancer relapse.
Nature communications, 12(1):1-16.

Traag et al., 2019
Traag, V. A., Waltman, L., and van Eck, N. J. (2019).
From Louvain to Leiden: guaranteeing well-connected communities.
Scientific reports, 9(1):1-12.

Van Der Maaten, 2014
Van Der Maaten, L. (2014).
Accelerating t-SNE using tree-based algorithms.
The Journal of Machine Learning Research, 15(1):3221-3245.

Wattenberg et al., 2016
Wattenberg, M., Viégas, F., and Johnson, I. (2016).
How to use t-SNE effectively.
Distill.

Xu and Su, 2015
Xu, C. and Su, Z. (2015).
Identification of cell types from single-cell transcriptomes using a novel clustering method.
Bioinformatics, 31(12):1974-1980.