Abstract
Automatic Radiology Report Generation (RRG) is an important topic for alleviating the substantial workload of radiologists. Existing RRG approaches rely on supervised regression based on different architectures or additional knowledge injection, but the generated reports may not align optimally with radiologists' preferences. This is especially problematic because the preferences of radiologists are inherently heterogeneous and multidimensional, e.g., some may prioritize report fluency, while others emphasize clinical accuracy. To address this problem, we propose a new RRG method via Multi-objective Preference Optimization (MPO) to align the pre-trained RRG model with multiple human preferences, which can be formulated by multi-dimensional reward functions and optimized by multi-objective reinforcement learning (RL). Specifically, we use a preference vector to represent the weight of preferences and use it as a condition for the RRG model. Then, a linearly weighted reward is obtained via a dot product between the preference vector and the multi-dimensional reward. Next, the RRG model is optimized to align with the preference vector by optimizing this reward via RL. In the training stage, we randomly sample diverse preference vectors from the preference space and align the model by optimizing the weighted multi-objective rewards, which leads to an optimal policy on the entire preference space. At inference, our model can generate reports aligned with specific preferences without further fine-tuning. Extensive experiments on two public datasets show the proposed method can generate reports that cater to different preferences in a single model and achieve state-of-the-art performance.
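The reward scalarization described above can be illustrated with a minimal sketch. It covers only two steps: sampling a preference vector and taking its dot product with a multi-dimensional reward. The abstract does not specify the sampling distribution, the number of objectives, or the reward functions, so the following are assumptions made purely for illustration: preference vectors lie on the probability simplex (non-negative, summing to 1) and are drawn uniformly, there are two objectives (fluency and clinical accuracy), and the reward values are made-up numbers. The function names `sample_preference` and `weighted_reward` are hypothetical. Conditioning the RRG model on the preference vector and the RL update itself are not shown.

```python
import random


def sample_preference(num_objectives, rng):
    """Draw a preference vector from the probability simplex.

    Assumption: preferences are non-negative and sum to 1. Normalizing
    i.i.d. exponential draws yields a uniform (flat Dirichlet) sample.
    """
    draws = [rng.expovariate(1.0) for _ in range(num_objectives)]
    total = sum(draws)
    return [d / total for d in draws]


def weighted_reward(preference, rewards):
    """Scalarize a multi-dimensional reward via a dot product."""
    if len(preference) != len(rewards):
        raise ValueError("preference and rewards must have the same length")
    return sum(p * r for p, r in zip(preference, rewards))


# Hypothetical reward values for one generated report (illustrative only):
# index 0 = fluency score, index 1 = clinical-accuracy score.
rewards = [0.40, 0.70]

rng = random.Random(0)  # seeded so the sketch is reproducible
for _ in range(3):
    pref = sample_preference(len(rewards), rng)
    # In training, this scalar would be the reward signal for the RL step
    # applied to a model conditioned on `pref`.
    print(pref, weighted_reward(pref, rewards))
```

Under this sketch, a preference of `[1.0, 0.0]` reduces the scalar reward to the fluency score alone, while `[0.0, 1.0]` reduces it to the clinical-accuracy score; intermediate vectors trade the two off linearly.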