Abstract: Method and system for processing digital works, the method comprising the steps of identifying terms within each digital work of a plurality of digital works, wherein the terms are words and/or phrases. Determining a number of times that the identified terms occur within each digital work of the plurality of digital works. Generating a fingerprint for each digital work of the plurality of digital works, the generated fingerprint based on the identified terms and the number of times that the identified terms occur within each digital work. Using a neural network to find an encoding function, g, that encodes a higher dimensionality space, x, of each fingerprint into a lower dimensionality space, y. Applying the encoding function to each fingerprint of the plurality of digital works to reduce their dimensionality. Determining a similarity between a first fingerprint and one or more dimensionality reduced fingerprints.