Abstract: The present disclosure relates to a computer-implemented method of extracting a subsample of cells from a plurality of cells in a single-cell genomics dataset, the method comprising the steps of: obtaining a single-cell genomics dataset represented in at least two dimensions, wherein information about each cell is represented in a first dimension and information about genomic features is represented in a second dimension; generating a cell-cell neighborhood graph from the single-cell genomics dataset, the cell-cell neighborhood graph providing information about similarities of the genomic features of the cells, wherein the cells are represented as vertices in the cell-cell neighborhood graph; dividing the cells in the cell-cell neighborhood graph into seed cells and non-seed cells; assigning at least one first prize to the seed cells and at least one second prize to the non-seed cells in the cell-cell neighborhood graph; and traversing the cell-cell neighborhood graph using a prize-collecting Steiner tree alg
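For illustration only, the steps recited above (generating a cell-cell neighborhood graph, assigning prizes to seed and non-seed cells, and traversing the graph) can be sketched in Python. This is not the disclosed implementation. It assumes a toy matrix with two features per cell, Euclidean distance as the similarity measure, and a k-nearest-neighbor graph. In place of a true prize-collecting Steiner tree solver it substitutes a simple greedy growth heuristic. All function names, the prize values, and the parameter `k` are hypothetical.

```python
import heapq
import math


def knn_graph(cells, k):
    """Undirected k-nearest-neighbour graph; edge cost = Euclidean distance (assumed)."""
    edges = {}
    for i, a in enumerate(cells):
        nearest = sorted((math.dist(a, b), j) for j, b in enumerate(cells) if j != i)
        for d, j in nearest[:k]:
            edges[(min(i, j), max(i, j))] = d
    return edges


def assign_prizes(n_cells, seeds, seed_prize, non_seed_prize):
    """First prize for seed cells, second prize for non-seed cells."""
    return [seed_prize if i in seeds else non_seed_prize for i in range(n_cells)]


def greedy_prize_tree(edges, prizes, root):
    """Greedy stand-in for a prize-collecting Steiner tree solver (NOT exact).

    Grows a tree from `root`, repeatedly attaching the frontier vertex whose
    edge cost minus prize is lowest, and stops once no attachment is profitable.
    """
    adj = {}
    for (u, v), cost in edges.items():
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))

    in_tree = {root}
    tree_edges = []
    heap = [(cost - prizes[v], root, v) for v, cost in adj.get(root, [])]
    heapq.heapify(heap)

    while heap:
        net_cost, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # stale entry: vertex was attached via another edge
        if net_cost >= 0:
            break  # the cheapest remaining attachment costs more than its prize
        in_tree.add(v)
        tree_edges.append((u, v))
        for w, cost in adj[v]:
            if w not in in_tree:
                heapq.heappush(heap, (cost - prizes[w], v, w))
    return in_tree, tree_edges


# Toy data: rows are cells (first dimension), columns are features (second dimension).
cells = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0), (5.1, 5.0)]
graph = knn_graph(cells, k=2)

# One seed in the dense cluster: the distant pair is not worth the edge cost.
prizes_one_seed = assign_prizes(len(cells), seeds={0}, seed_prize=10.0, non_seed_prize=0.5)
subsample_one_seed, _ = greedy_prize_tree(graph, prizes_one_seed, root=0)

# A second seed in the distant pair: its prize outweighs the long connecting edge.
prizes_two_seeds = assign_prizes(len(cells), seeds={0, 4}, seed_prize=10.0, non_seed_prize=0.5)
subsample_two_seeds, _ = greedy_prize_tree(graph, prizes_two_seeds, root=0)
```

In this sketch the prize values control the trade-off: a higher prize on seed cells pulls distant seeds into the subsample, while the smaller prize on non-seed cells only admits those that are cheap to reach. The greedy heuristic is myopic, since it will not cross an unprofitable vertex to reach a profitable one, so a dedicated prize-collecting Steiner tree solver would be needed for anything beyond illustration.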