Message ID | 20210121072202.120810-1-bianpan2016@163.com (mailing list archive)
State | New, archived
Series | lightnvm: fix memory leak when submit fails
On 21.01.2021 05:47, Jens Axboe wrote:
> On 1/21/21 12:22 AM, Pan Bian wrote:
>> The allocated page is not released if an error occurs in
>> nvm_submit_io_sync_raw(). __free_page() is moved earlier to avoid
>> a possible memory leak.
>
> Applied, thanks.
>
> General question for Matias - is lightnvm maintained anymore at all, or
> should we remove it? The project seems dead from my pov, and I don't
> even remember anyone even reviewing fixes from other people.
>

At least from the pblk side, I have no objections to removing it. I
briefly test that pblk runs on new releases, but there are no new
features or known bug fixes coming in. Current deployments - to the best
of my knowledge - are forks, which are not being retrofitted upstream.

For completeness, I do get a number of questions and requests, primarily
from academia. These people will probably miss LightNVM. I understand,
though, that this is not an argument to keep it.

Javier
On 21/01/2021 13.47, Jens Axboe wrote:
> On 1/21/21 12:22 AM, Pan Bian wrote:
>> The allocated page is not released if an error occurs in
>> nvm_submit_io_sync_raw(). __free_page() is moved earlier to avoid
>> a possible memory leak.
> Applied, thanks.
>
> General question for Matias - is lightnvm maintained anymore at all, or
> should we remove it? The project seems dead from my pov, and I don't
> even remember anyone even reviewing fixes from other people.
>

Hi Jens,

ZNS has superseded OCSSD/lightnvm. As a result, the hardware and
software development around OCSSD has also moved on to ZNS. To my
knowledge, no one is implementing OCSSD 1.2/2.0 commercially at this
point, and what has been deployed in production does not use the Linux
kernel stack.

I do not mind continuing to keep an eye on it, but on the other hand, it
has served its purpose. It enabled the "Open-Channel SSD architectures"
of the world to take hold in the market, and thereby gained enough
momentum to be standardized in NVMe as ZNS.

Would you like me to send a PR to remove lightnvm immediately, or should
we mark it as deprecated for a while before pulling it?

Best, Matias
I don't think that ZNS supersedes OCSSD. OCSSDs provide much more
flexibility and device control, and remain valuable for academia. For
us, pblk is the most accurate "SSD emulator" out there and, as another
benefit, enables real-time performance measurements.

That being said, I understand that this may not be a good enough reason
to keep it around, but I wouldn't mind if it stayed for another while.

On Thu, Jan 21, 2021 at 5:57 AM Matias Bjørling <mb@lightnvm.io> wrote:
>
> On 21/01/2021 13.47, Jens Axboe wrote:
> > On 1/21/21 12:22 AM, Pan Bian wrote:
> >> The allocated page is not released if an error occurs in
> >> nvm_submit_io_sync_raw(). __free_page() is moved earlier to avoid
> >> a possible memory leak.
> > Applied, thanks.
> >
> > General question for Matias - is lightnvm maintained anymore at all, or
> > should we remove it? The project seems dead from my pov, and I don't
> > even remember anyone even reviewing fixes from other people.
> >
> Hi Jens,
>
> ZNS has superseded OCSSD/lightnvm. As a result, the hardware and
> software development around OCSSD has also moved on to ZNS. To my
> knowledge, no one is implementing OCSSD 1.2/2.0 commercially at this
> point, and what has been deployed in production does not use the Linux
> kernel stack.
>
> I do not mind continuing to keep an eye on it, but on the other hand, it
> has served its purpose. It enabled the "Open-Channel SSD architectures"
> of the world to take hold in the market, and thereby gained enough
> momentum to be standardized in NVMe as ZNS.
>
> Would you like me to send a PR to remove lightnvm immediately, or should
> we mark it as deprecated for a while before pulling it?
>
> Best, Matias
On 21/01/2021 17.58, Heiner Litz wrote:
> I don't think that ZNS supersedes OCSSD. OCSSDs provide much more
> flexibility and device control, and remain valuable for academia. For
> us, pblk is the most accurate "SSD emulator" out there and, as another
> benefit, enables real-time performance measurements.
> That being said, I understand that this may not be a good enough
> reason to keep it around, but I wouldn't mind if it stayed for another
> while.

The key difference between ZNS SSDs and OCSSDs is that wear-leveling is
done on the SSD, whereas with OCSSD it is done on the host.

While that is interesting in itself, the bulk of the research based on
OCSSD is about controlling which dies are accessed. As that is already
possible with NVMe Endurance Groups/NVM Sets, there is really no reason
to keep OCSSD around for that flexibility.

If we take it out of the kernel, it would still be maintained in the
GitHub repository and available for researchers. Given the few changes
that have happened over the past year, it should be relatively easy to
rebase it for each kernel release for quite a while.

Best, Matias

> On Thu, Jan 21, 2021 at 5:57 AM Matias Bjørling <mb@lightnvm.io> wrote:
>> On 21/01/2021 13.47, Jens Axboe wrote:
>>> On 1/21/21 12:22 AM, Pan Bian wrote:
>>>> The allocated page is not released if an error occurs in
>>>> nvm_submit_io_sync_raw(). __free_page() is moved earlier to avoid
>>>> a possible memory leak.
>>> Applied, thanks.
>>>
>>> General question for Matias - is lightnvm maintained anymore at all, or
>>> should we remove it? The project seems dead from my pov, and I don't
>>> even remember anyone even reviewing fixes from other people.
>>>
>> Hi Jens,
>>
>> ZNS has superseded OCSSD/lightnvm. As a result, the hardware and
>> software development around OCSSD has also moved on to ZNS. To my
>> knowledge, no one is implementing OCSSD 1.2/2.0 commercially at this
>> point, and what has been deployed in production does not use the Linux
>> kernel stack.
>>
>> I do not mind continuing to keep an eye on it, but on the other hand, it
>> has served its purpose. It enabled the "Open-Channel SSD architectures"
>> of the world to take hold in the market, and thereby gained enough
>> momentum to be standardized in NVMe as ZNS.
>>
>> Would you like me to send a PR to remove lightnvm immediately, or should
>> we mark it as deprecated for a while before pulling it?
>>
>> Best, Matias
there are a couple more, but again I would understand if those are
deemed not important enough to keep it:

device emulation of a (non-ZNS) SSD block device
die control: yes, endurance groups would help, but I am not aware of
any vendor supporting them
finer-grained control: 1000's of open blocks vs. a handful of
concurrently open zones
OOB area: helpful for L2P recovery

On Thu, Jan 21, 2021 at 10:25 AM Matias Bjørling <mb@lightnvm.io> wrote:
>
> On 21/01/2021 17.58, Heiner Litz wrote:
> > I don't think that ZNS supersedes OCSSD. OCSSDs provide much more
> > flexibility and device control, and remain valuable for academia. For
> > us, pblk is the most accurate "SSD emulator" out there and, as another
> > benefit, enables real-time performance measurements.
> > That being said, I understand that this may not be a good enough
> > reason to keep it around, but I wouldn't mind if it stayed for another
> > while.
>
> The key difference between ZNS SSDs and OCSSDs is that wear-leveling is
> done on the SSD, whereas with OCSSD it is done on the host.
>
> While that is interesting in itself, the bulk of the research based on
> OCSSD is about controlling which dies are accessed. As that is already
> possible with NVMe Endurance Groups/NVM Sets, there is really no reason
> to keep OCSSD around for that flexibility.
>
> If we take it out of the kernel, it would still be maintained in the
> GitHub repository and available for researchers. Given the few changes
> that have happened over the past year, it should be relatively easy to
> rebase it for each kernel release for quite a while.
>
> Best, Matias
>
> > On Thu, Jan 21, 2021 at 5:57 AM Matias Bjørling <mb@lightnvm.io> wrote:
> >> On 21/01/2021 13.47, Jens Axboe wrote:
> >>> On 1/21/21 12:22 AM, Pan Bian wrote:
> >>>> The allocated page is not released if an error occurs in
> >>>> nvm_submit_io_sync_raw(). __free_page() is moved earlier to avoid
> >>>> a possible memory leak.
> >>> Applied, thanks.
> >>>
> >>> General question for Matias - is lightnvm maintained anymore at all, or
> >>> should we remove it? The project seems dead from my pov, and I don't
> >>> even remember anyone even reviewing fixes from other people.
> >>>
> >> Hi Jens,
> >>
> >> ZNS has superseded OCSSD/lightnvm. As a result, the hardware and
> >> software development around OCSSD has also moved on to ZNS. To my
> >> knowledge, no one is implementing OCSSD 1.2/2.0 commercially at this
> >> point, and what has been deployed in production does not use the Linux
> >> kernel stack.
> >>
> >> I do not mind continuing to keep an eye on it, but on the other hand, it
> >> has served its purpose. It enabled the "Open-Channel SSD architectures"
> >> of the world to take hold in the market, and thereby gained enough
> >> momentum to be standardized in NVMe as ZNS.
> >>
> >> Would you like me to send a PR to remove lightnvm immediately, or should
> >> we mark it as deprecated for a while before pulling it?
> >>
> >> Best, Matias
On 21/01/2021 20.49, Heiner Litz wrote:
> there are a couple more, but again I would understand if those are
> deemed not important enough to keep it.
>
> device emulation of a (non-ZNS) SSD block device

That'll soon be available. We will be open-sourcing a new device mapper
(dm-zap), which implements an indirection layer that enables ZNS SSDs to
be exposed as a conventional block device.

> die control: yes, endurance groups would help, but I am not aware of
> any vendor supporting them

It is out there. Although, is this still important in 2021? OCSSD was
made back in the days when media program/erase suspend wasn't commonly
available and SSD controllers were simpler. With today's media and SSD
controllers, it is hard to compete without leaving media throughput on
the table. If needed, splitting a drive into a few partitions should be
sufficient for many types of workloads.

> finer-grained control: 1000's of open blocks vs. a handful of
> concurrently open zones

It depends on the implementation - ZNS SSDs can also support 1000's of
open zones.

With respect to available OCSSD hardware - there aren't, to my
knowledge, proper implementations available where media reliability is
taken into account.

Generally, the OCSSD hardware implementations have extremely poor UBER,
and as such RAID or similar schemes must be implemented on the host.
pblk does not implement this, so one should not store data on it if one
wants to get that data back at some point. It also makes for an unfair
SSD comparison, as there is much more to an SSD than what OCSSD + pblk
implements. At worst, it'll lead to a false understanding of the
challenges of making SSDs; at best, the work can be used as the
foundation for an actual SSD implementation.

> OOB area: helpful for L2P recovery

It is known as LBA metadata in NVMe. It is commonly available in many of
today's SSDs.

I understand your point that there is a lot of flexibility, but my
counterpoint is that there isn't anything in OCSSD that is not
implementable or commonly available using today's NVMe concepts.
Furthermore, the known OCSSD research platforms can easily be updated to
expose the OCSSD characteristics through standardized NVMe concepts.
That would probably make for a good research paper.
thanks, Matias, I am going to look out for dm-zap!

On Thu, Jan 21, 2021 at 12:14 PM Matias Bjørling <mb@lightnvm.io> wrote:
>
> On 21/01/2021 20.49, Heiner Litz wrote:
> > there are a couple more, but again I would understand if those are
> > deemed not important enough to keep it.
> >
> > device emulation of a (non-ZNS) SSD block device
>
> That'll soon be available. We will be open-sourcing a new device mapper
> (dm-zap), which implements an indirection layer that enables ZNS SSDs to
> be exposed as a conventional block device.
>
> > die control: yes, endurance groups would help, but I am not aware of
> > any vendor supporting them
>
> It is out there. Although, is this still important in 2021? OCSSD was
> made back in the days when media program/erase suspend wasn't commonly
> available and SSD controllers were simpler. With today's media and SSD
> controllers, it is hard to compete without leaving media throughput on
> the table. If needed, splitting a drive into a few partitions should be
> sufficient for many types of workloads.
>
> > finer-grained control: 1000's of open blocks vs. a handful of
> > concurrently open zones
>
> It depends on the implementation - ZNS SSDs can also support 1000's of
> open zones.
>
> With respect to available OCSSD hardware - there aren't, to my
> knowledge, proper implementations available where media reliability is
> taken into account.
>
> Generally, the OCSSD hardware implementations have extremely poor UBER,
> and as such RAID or similar schemes must be implemented on the host.
> pblk does not implement this, so one should not store data on it if one
> wants to get that data back at some point. It also makes for an unfair
> SSD comparison, as there is much more to an SSD than what OCSSD + pblk
> implements. At worst, it'll lead to a false understanding of the
> challenges of making SSDs; at best, the work can be used as the
> foundation for an actual SSD implementation.
>
> > OOB area: helpful for L2P recovery
>
> It is known as LBA metadata in NVMe. It is commonly available in many of
> today's SSDs.
>
> I understand your point that there is a lot of flexibility, but my
> counterpoint is that there isn't anything in OCSSD that is not
> implementable or commonly available using today's NVMe concepts.
> Furthermore, the known OCSSD research platforms can easily be updated to
> expose the OCSSD characteristics through standardized NVMe concepts.
> That would probably make for a good research paper.
diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
index c1bcac71008c..28ddcaa5358b 100644
--- a/drivers/lightnvm/core.c
+++ b/drivers/lightnvm/core.c
@@ -844,11 +844,10 @@ static int nvm_bb_chunk_sense(struct nvm_dev *dev, struct ppa_addr ppa)
 
 	rqd.ppa_addr = generic_to_dev_addr(dev, ppa);
 
 	ret = nvm_submit_io_sync_raw(dev, &rqd);
+	__free_page(page);
 	if (ret)
 		return ret;
 
-	__free_page(page);
-
 	return rqd.error;
 }
The allocated page is not released if an error occurs in
nvm_submit_io_sync_raw(). __free_page() is moved earlier to avoid a
possible memory leak.

Fixes: aff3fb18f957 ("lightnvm: move bad block and chunk state logic to core")
Signed-off-by: Pan Bian <bianpan2016@163.com>
---
 drivers/lightnvm/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)