Abstract: An apparatus for diagnosing a medical condition includes at least a processor and a memory. The memory contains instructions configuring the at least a processor to receive a medical image that is the result of a medical imaging procedure and to train a medical image classifier and a medical report classifier. Training the classifiers includes receiving training data that includes a plurality of prior medical images and a plurality of prior medical reports, training the medical image classifier and the medical report classifier, and optimizing both classifiers using a common loss function. The memory further contains instructions configuring the at least a processor to generate a label for the medical image by inputting the medical image into the medical image classifier and receiving the label as output from the medical image classifier.
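
The abstract does not specify the classifier architectures or the form of the common loss function, so the following is only an illustrative sketch under stated assumptions. It assumes that images and reports have already been reduced to small numeric feature vectors, that both classifiers are logistic regressions, and that the common loss is the sum of the two binary cross-entropy terms against a shared label. All names (`train`, `common_loss`, `generate_label`) and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, features):
    # weights[0] is the bias; the remaining weights pair with the features.
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)


def bce(p, y, eps=1e-12):
    # Binary cross-entropy for one prediction, clamped away from log(0).
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def common_loss(img_w, rep_w, images, reports, labels):
    # ASSUMPTION: the "common loss" is the mean of the summed per-sample
    # losses of the image classifier and the report classifier.
    total = 0.0
    for img, rep, y in zip(images, reports, labels):
        total += bce(predict(img_w, img), y) + bce(predict(rep_w, rep), y)
    return total / len(labels)


def train(images, reports, labels, epochs=500, lr=0.5):
    # Full-batch gradient descent on the common loss; both weight vectors
    # are updated in the same step.
    img_w = [0.0] * (len(images[0]) + 1)
    rep_w = [0.0] * (len(reports[0]) + 1)
    n = len(labels)
    for _ in range(epochs):
        img_g = [0.0] * len(img_w)
        rep_g = [0.0] * len(rep_w)
        for img, rep, y in zip(images, reports, labels):
            for w, g, x in ((img_w, img_g, img), (rep_w, rep_g, rep)):
                err = predict(w, x) - y  # gradient of BCE w.r.t. the logit
                g[0] += err
                for j, xj in enumerate(x):
                    g[j + 1] += err * xj
        for w, g in ((img_w, img_g), (rep_w, rep_g)):
            for j in range(len(w)):
                w[j] -= lr * g[j] / n
    return img_w, rep_w


def generate_label(img_w, image, threshold=0.5):
    # Label generation uses only the image classifier.
    return 1 if predict(img_w, image) >= threshold else 0


# Hypothetical toy data: 2 image features, 3 report features, binary label.
images = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
reports = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
labels = [1, 1, 0, 0]

img_w, rep_w = train(images, reports, labels)
print(generate_label(img_w, [0.85, 0.85]))
```

In this simplified form the two models share no parameters, so the summed loss decouples into two independent problems. A real system in the spirit of the abstract would more plausibly couple the classifiers, for example through a shared embedding or an agreement term, but the abstract does not say which.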