Abstract: The present invention provides a method for identifying peptides that contain features positively associated with natural endogenous or exogenous cellular processing, transportation and major histocompatibility complex (MHC) presentation. In particular, the method controls for the influence of protein abundance, stability and HLA/MHC binding on processing and presentation, so that a machine-learning algorithm or statistical inference model trained using the method can be applied to any test peptide regardless of its HLA/MHC restriction; that is, the algorithm operates in an HLA/MHC-agnostic manner. This is achieved by building positive and negative data sets of peptide sequences: the positive set comprises peptides identified or inferred in the literature from surface-bound or secreted MHC/peptide complexes, and the negative set comprises peptides that are not.
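
The construction of positive and negative data sets might be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the function name `build_datasets`, the fixed window length, and the choice to draw negatives only from proteins that also yield a presented peptide (one conceivable way to hold protein abundance and stability constant) are all hypothetical. Any control for HLA/MHC binding is omitted here.

```python
from typing import Dict, List, Set, Tuple


def build_datasets(
    proteins: Dict[str, str],
    presented: Set[str],
    length: int = 9,
) -> Tuple[List[str], List[str]]:
    """Split fixed-length peptide windows into positive and negative sets.

    proteins  -- hypothetical mapping of protein ID to amino-acid sequence
    presented -- peptides reported as MHC-presented (e.g. from the literature)
    length    -- window length; 9 is an assumption chosen for illustration
    """
    positives: List[str] = []
    negatives: List[str] = []
    for protein_id in sorted(proteins):
        seq = proteins[protein_id]
        # Enumerate every contiguous window of the chosen length.
        windows = [seq[i:i + length] for i in range(len(seq) - length + 1)]
        # Assumption: keep only proteins with at least one presented peptide,
        # so that negatives share a source protein with some positive.
        if not any(w in presented for w in windows):
            continue
        for w in windows:
            if w in presented:
                positives.append(w)
            else:
                negatives.append(w)
    return positives, negatives


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    toy_proteins = {"P1": "ACDEFGHIKLMN", "P2": "QRSTVWYACDEF"}
    toy_presented = {"CDEFGHIKL"}
    pos, neg = build_datasets(toy_proteins, toy_presented)
    print(pos)
    print(neg)
```

In this sketch the second toy protein contributes nothing, because none of its windows appears in the presented set. A classifier trained on the resulting sets would see sequence features only, with no HLA/MHC allele as input.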