Message ID | 20241105130902.4603-1-chenqiuji666@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | scsi: lpfc: Fix improper handling of refcount in lpfc_bsg_hba_get_event() | expand |
Hi Qiu-ji, Similar to the other suggested patch, this does not look logically correct. if (evt_dat == NULL) evaluates to true, then that means the list_for_each_entry_safe(evt, evt_next, &phba->ct_ev_waiters, node) loop did not find an evt lpfc_bsg_event object of interest or that the phba->ct_ev_waiters list is empty. Why would this patch want to call lpfc_bsg_event_unref on an evt object that is not of specified interest indicated by the bsg event_req object? Even worse, as mentioned in the other email, this patch could kref_put on the phba->ct_ev_waiters head which is not a preallocated lpfc_bsg_event object leading to references on an uninitialized memory region. Sorry, but I cannot acknowledge this patch as well. Regards, Justin
Hi Qiu-ji, kernel test robot noticed the following build errors: [auto build test ERROR on jejb-scsi/for-next] [also build test ERROR on mkp-scsi/for-next linus/master v6.12-rc6 next-20241105] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Qiu-ji-Chen/scsi-lpfc-Fix-improper-handling-of-refcount-in-lpfc_bsg_hba_get_event/20241105-211110 base: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next patch link: https://lore.kernel.org/r/20241105130902.4603-1-chenqiuji666%40gmail.com patch subject: [PATCH] scsi: lpfc: Fix improper handling of refcount in lpfc_bsg_hba_get_event() config: i386-allmodconfig (https://download.01.org/0day-ci/archive/20241106/202411060332.fmmxsBzv-lkp@intel.com/config) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241106/202411060332.fmmxsBzv-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202411060332.fmmxsBzv-lkp@intel.com/ All error/warnings (new ones prefixed by >>): drivers/scsi/lpfc/lpfc_bsg.c: In function 'lpfc_bsg_hba_get_event': >> drivers/scsi/lpfc/lpfc_bsg.c:1332:1: warning: label 'job_err_unref' defined but not used [-Wunused-label] 1332 | job_err_unref: | ^~~~~~~~~~~~~ >> drivers/scsi/lpfc/lpfc_bsg.c:1297:17: error: label 'job_error_unref' used but not defined 1297 | goto job_error_unref; | ^~~~ Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for MODVERSIONS Depends on [n]: MODULES [=y] && !COMPILE_TEST [=y] Selected by [y]: - RANDSTRUCT_FULL [=y] && (CC_HAS_RANDSTRUCT [=n] || GCC_PLUGINS [=y]) && MODULES [=y] WARNING: unmet direct dependencies detected for GET_FREE_REGION Depends on [n]: SPARSEMEM [=n] Selected by [m]: - RESOURCE_KUNIT_TEST [=m] && RUNTIME_TESTING_MENU [=y] && KUNIT [=m] vim +/job_error_unref +1297 drivers/scsi/lpfc/lpfc_bsg.c 1243 1244 /** 1245 * lpfc_bsg_hba_get_event - process a GET_EVENT bsg vendor command 1246 * @job: GET_EVENT fc_bsg_job 1247 **/ 1248 static int 1249 lpfc_bsg_hba_get_event(struct bsg_job *job) 1250 { 1251 struct lpfc_vport *vport = shost_priv(fc_bsg_to_shost(job)); 1252 struct lpfc_hba *phba = vport->phba; 1253 struct fc_bsg_request *bsg_request = job->request; 1254 struct fc_bsg_reply *bsg_reply = job->reply; 1255 struct get_ct_event *event_req; 1256 struct get_ct_event_reply *event_reply; 1257 struct lpfc_bsg_event *evt, *evt_next; 1258 struct event_data *evt_dat = NULL; 1259 unsigned long flags; 1260 uint32_t rc = 0; 1261 1262 if (job->request_len < 1263 sizeof(struct fc_bsg_request) + sizeof(struct get_ct_event)) { 1264 lpfc_printf_log(phba, KERN_WARNING, LOG_LIBDFC, 1265 "2613 Received GET_CT_EVENT request below " 1266 "minimum size\n"); 1267 rc = -EINVAL; 1268 goto job_error; 1269 } 1270 1271 event_req = (struct get_ct_event *) 1272 bsg_request->rqst_data.h_vendor.vendor_cmd; 1273 1274 event_reply = (struct get_ct_event_reply *) 1275 bsg_reply->reply_data.vendor_reply.vendor_rsp; 1276 spin_lock_irqsave(&phba->ct_ev_lock, flags); 1277 list_for_each_entry_safe(evt, evt_next, &phba->ct_ev_waiters, node) { 1278 if (evt->reg_id == event_req->ev_reg_id) { 1279 if (list_empty(&evt->events_to_get)) 1280 break; 1281 lpfc_bsg_event_ref(evt); 1282 evt->wait_time_stamp = jiffies; 1283 evt_dat = list_entry(evt->events_to_get.prev, 1284 struct event_data, node); 1285 list_del(&evt_dat->node); 1286 break; 1287 } 1288 } 1289 spin_unlock_irqrestore(&phba->ct_ev_lock, flags); 1290 1291 /* The app may continue to ask for event data until it gets 1292 * an error indicating that there isn't anymore 1293 */ 1294 if (evt_dat == NULL) { 1295 bsg_reply->reply_payload_rcv_len = 0; 1296 rc = -ENOENT; > 1297 goto job_error_unref; 1298 } 1299 1300 if (evt_dat->len > job->request_payload.payload_len) { 1301 evt_dat->len = job->request_payload.payload_len; 1302 lpfc_printf_log(phba, KERN_WARNING, LOG_LIBDFC, 1303 "2618 Truncated event data at %d " 1304 "bytes\n", 1305 job->request_payload.payload_len); 1306 } 1307 1308 event_reply->type = evt_dat->type; 1309 event_reply->immed_data = evt_dat->immed_dat; 1310 if (evt_dat->len > 0) 1311 bsg_reply->reply_payload_rcv_len = 1312 sg_copy_from_buffer(job->request_payload.sg_list, 1313 job->request_payload.sg_cnt, 1314 evt_dat->data, evt_dat->len); 1315 else 1316 bsg_reply->reply_payload_rcv_len = 0; 1317 1318 if (evt_dat) { 1319 kfree(evt_dat->data); 1320 kfree(evt_dat); 1321 } 1322 1323 spin_lock_irqsave(&phba->ct_ev_lock, flags); 1324 lpfc_bsg_event_unref(evt); 1325 spin_unlock_irqrestore(&phba->ct_ev_lock, flags); 1326 job->dd_data = NULL; 1327 bsg_reply->result = 0; 1328 bsg_job_done(job, bsg_reply->result, 1329 bsg_reply->reply_payload_rcv_len); 1330 return 0; 1331 > 1332 job_err_unref: 1333 spin_lock_irqsave(&phba->ct_ev_lock, flags); 1334 lpfc_bsg_event_unref(evt); 1335 spin_unlock_irqrestore(&phba->ct_ev_lock, flags); 1336 job_error: 1337 job->dd_data = NULL; 1338 bsg_reply->result = rc; 1339 return rc; 1340 } 1341
Hi Qiu-ji, kernel test robot noticed the following build errors: [auto build test ERROR on jejb-scsi/for-next] [also build test ERROR on mkp-scsi/for-next linus/master v6.12-rc6 next-20241105] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Qiu-ji-Chen/scsi-lpfc-Fix-improper-handling-of-refcount-in-lpfc_bsg_hba_get_event/20241105-211110 base: https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next patch link: https://lore.kernel.org/r/20241105130902.4603-1-chenqiuji666%40gmail.com patch subject: [PATCH] scsi: lpfc: Fix improper handling of refcount in lpfc_bsg_hba_get_event() config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20241106/202411060527.qPI24Q8a-lkp@intel.com/config) compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241106/202411060527.qPI24Q8a-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202411060527.qPI24Q8a-lkp@intel.com/ All error/warnings (new ones prefixed by >>): In file included from drivers/scsi/lpfc/lpfc_bsg.c:25: In file included from include/linux/pci.h:1650: In file included from include/linux/dmapool.h:14: In file included from include/linux/scatterlist.h:8: In file included from include/linux/mm.h:2213: include/linux/vmstat.h:504:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion] 504 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~ ^ 505 | item]; | ~~~~ include/linux/vmstat.h:511:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion] 511 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~ ^ 512 | NR_VM_NUMA_EVENT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~~ include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion] 518 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" | ~~~~~~~~~~~ ^ ~~~ include/linux/vmstat.h:524:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion] 524 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~ ^ 525 | NR_VM_NUMA_EVENT_ITEMS + | ~~~~~~~~~~~~~~~~~~~~~~ >> drivers/scsi/lpfc/lpfc_bsg.c:1297:8: error: use of undeclared label 'job_error_unref' 1297 | goto job_error_unref; | ^ >> drivers/scsi/lpfc/lpfc_bsg.c:1332:1: warning: unused label 'job_err_unref' [-Wunused-label] 1332 | job_err_unref: | ^~~~~~~~~~~~~~ drivers/scsi/lpfc/lpfc_bsg.c:2810:11: warning: variable 'offset' set but not used [-Wunused-but-set-variable] 2810 | int cnt, offset = 0, i = 0; | ^ 6 warnings and 1 error generated. Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for MODVERSIONS Depends on [n]: MODULES [=y] && !COMPILE_TEST [=y] Selected by [y]: - RANDSTRUCT_FULL [=y] && (CC_HAS_RANDSTRUCT [=y] || GCC_PLUGINS [=n]) && MODULES [=y] vim +/job_error_unref +1297 drivers/scsi/lpfc/lpfc_bsg.c 1243 1244 /** 1245 * lpfc_bsg_hba_get_event - process a GET_EVENT bsg vendor command 1246 * @job: GET_EVENT fc_bsg_job 1247 **/ 1248 static int 1249 lpfc_bsg_hba_get_event(struct bsg_job *job) 1250 { 1251 struct lpfc_vport *vport = shost_priv(fc_bsg_to_shost(job)); 1252 struct lpfc_hba *phba = vport->phba; 1253 struct fc_bsg_request *bsg_request = job->request; 1254 struct fc_bsg_reply *bsg_reply = job->reply; 1255 struct get_ct_event *event_req; 1256 struct get_ct_event_reply *event_reply; 1257 struct lpfc_bsg_event *evt, *evt_next; 1258 struct event_data *evt_dat = NULL; 1259 unsigned long flags; 1260 uint32_t rc = 0; 1261 1262 if (job->request_len < 1263 sizeof(struct fc_bsg_request) + sizeof(struct get_ct_event)) { 1264 lpfc_printf_log(phba, KERN_WARNING, LOG_LIBDFC, 1265 "2613 Received GET_CT_EVENT request below " 1266 "minimum size\n"); 1267 rc = -EINVAL; 1268 goto job_error; 1269 } 1270 1271 event_req = (struct get_ct_event *) 1272 bsg_request->rqst_data.h_vendor.vendor_cmd; 1273 1274 event_reply = (struct get_ct_event_reply *) 1275 bsg_reply->reply_data.vendor_reply.vendor_rsp; 1276 spin_lock_irqsave(&phba->ct_ev_lock, flags); 1277 list_for_each_entry_safe(evt, evt_next, &phba->ct_ev_waiters, node) { 1278 if (evt->reg_id == event_req->ev_reg_id) { 1279 if (list_empty(&evt->events_to_get)) 1280 break; 1281 lpfc_bsg_event_ref(evt); 1282 evt->wait_time_stamp = jiffies; 1283 evt_dat = list_entry(evt->events_to_get.prev, 1284 struct event_data, node); 1285 list_del(&evt_dat->node); 1286 break; 1287 } 1288 } 1289 spin_unlock_irqrestore(&phba->ct_ev_lock, flags); 1290 1291 /* The app may continue to ask for event data until it gets 1292 * an error indicating that there isn't anymore 1293 */ 1294 if (evt_dat == NULL) { 1295 bsg_reply->reply_payload_rcv_len = 0; 1296 rc = -ENOENT; > 1297 goto job_error_unref; 1298 } 1299 1300 if (evt_dat->len > job->request_payload.payload_len) { 1301 evt_dat->len = job->request_payload.payload_len; 1302 lpfc_printf_log(phba, KERN_WARNING, LOG_LIBDFC, 1303 "2618 Truncated event data at %d " 1304 "bytes\n", 1305 job->request_payload.payload_len); 1306 } 1307 1308 event_reply->type = evt_dat->type; 1309 event_reply->immed_data = evt_dat->immed_dat; 1310 if (evt_dat->len > 0) 1311 bsg_reply->reply_payload_rcv_len = 1312 sg_copy_from_buffer(job->request_payload.sg_list, 1313 job->request_payload.sg_cnt, 1314 evt_dat->data, evt_dat->len); 1315 else 1316 bsg_reply->reply_payload_rcv_len = 0; 1317 1318 if (evt_dat) { 1319 kfree(evt_dat->data); 1320 kfree(evt_dat); 1321 } 1322 1323 spin_lock_irqsave(&phba->ct_ev_lock, flags); 1324 lpfc_bsg_event_unref(evt); 1325 spin_unlock_irqrestore(&phba->ct_ev_lock, flags); 1326 job->dd_data = NULL; 1327 bsg_reply->result = 0; 1328 bsg_job_done(job, bsg_reply->result, 1329 bsg_reply->reply_payload_rcv_len); 1330 return 0; 1331 > 1332 job_err_unref: 1333 spin_lock_irqsave(&phba->ct_ev_lock, flags); 1334 lpfc_bsg_event_unref(evt); 1335 spin_unlock_irqrestore(&phba->ct_ev_lock, flags); 1336 job_error: 1337 job->dd_data = NULL; 1338 bsg_reply->result = rc; 1339 return rc; 1340 } 1341
diff --git a/drivers/scsi/lpfc/lpfc_bsg.c b/drivers/scsi/lpfc/lpfc_bsg.c index 85059b83ea6b..832a5a6dd85f 100644 --- a/drivers/scsi/lpfc/lpfc_bsg.c +++ b/drivers/scsi/lpfc/lpfc_bsg.c @@ -1294,7 +1294,7 @@ lpfc_bsg_hba_get_event(struct bsg_job *job) if (evt_dat == NULL) { bsg_reply->reply_payload_rcv_len = 0; rc = -ENOENT; - goto job_error; + goto job_error_unref; } if (evt_dat->len > job->request_payload.payload_len) { @@ -1329,6 +1329,10 @@ lpfc_bsg_hba_get_event(struct bsg_job *job) bsg_reply->reply_payload_rcv_len); return 0; +job_err_unref: + spin_lock_irqsave(&phba->ct_ev_lock, flags); + lpfc_bsg_event_unref(evt); + spin_unlock_irqrestore(&phba->ct_ev_lock, flags); job_error: job->dd_data = NULL; bsg_reply->result = rc;
This patch addresses a reference count handling issue in the lpfc_bsg_hba_get_event() function. In the branch if (evt->reg_id == event_req->ev_reg_id), the function calls lpfc_bsg_event_ref(), which increments the reference count of the relevant resources. However, in the branch if (evt_dat == NULL), a goto statement directly jumps to the function’s final goto block, skipping the release operations at the end of the function. This means that, if the condition if (evt_dat == NULL) is met, the function fails to correctly release the resources acquired by lpfc_bsg_event_ref(), leading to a reference count leak. To fix this issue, we added a new block job_error_unref before the job_error block. When the condition if (evt_dat == NULL) is met, the function will enter the job_error_unref block, ensuring that the previously allocated resources are properly released, thereby preventing the reference count leak. This bug was identified by an experimental static analysis tool developed by our team. The tool specializes in analyzing reference count operations and detecting potential issues where resources are not properly managed. In this case, the tool flagged the missing release operation as a potential problem, which led to the development of this patch. Fixes: 4cc0e56e977f ("[SCSI] lpfc 8.3.8: (BSG3) Modify BSG commands to operate asynchronously") Cc: stable@vger.kernel.org Signed-off-by: Qiu-ji Chen <chenqiuji666@gmail.com> --- drivers/scsi/lpfc/lpfc_bsg.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)