mirror of https://git.missingno.dev/kolibrios-nvme-driver/ synced 2024-09-19 01:51:03 +02:00

Shutdown Freezes System #5

Open
opened 2024-08-09 16:08:55 +02:00 by ramenu · 32 comments
Member

This is a follow-up to #1 (comment): https://git.kolibrios.org/GSoC/kolibrios-nvme-driver/issues/1#issuecomment-472

I was able to replicate this problem on QEMU as well, so there's definitely something suspicious going on within the driver. I'll have to investigate this later.

Update: with commit 0f24000b9a, shutdown still freezes the system on VirtualBox when the NVMe driver is loaded.

Author
Member

@Burer, since this is happening at the OS level, I will need to set up a debugger to inspect what's going on. If possible, could you ask @dunkaist which listing utility is used to generate the virtual addresses (how to get it, compile it, or find documentation on it)? I tried asking on IRC already but haven't received a response yet. That would be much appreciated, thanks. :)

Owner

@ramenu,

The listing utility is shipped with the fasm package. First compile its source to an object file, then link that object file with your system's linker, not with fasm.

I suspect the issue is in the nvme_cleanup procedure. Try commenting out its call in your START procedure and see if this helps. If it does, the issue is localized.

Author
Member

@dunkaist,

Thank you for the insight. Unfortunately, the issue still occurs even with the nvme_cleanup call removed. Removing the call to RegService does fix the shutdown problem, but I'm not sure that's a viable solution, since the controller then cannot shut down properly.

Author
Member

Okay, a clue at least. I tried moving the RegService call to different places:

; No crash
invoke  RegService, my_service, service_proc
ret
DEBUGF  DBG_INFO, "Detecting NVMe device...\n"
call    detect_nvme
test    eax, eax

However, invoking RegService right after detect_nvme is called results in a page fault when trying to shut down:

DEBUGF  DBG_INFO, "Detecting NVMe device...\n"
call    detect_nvme
test    eax, eax
jz      .err
; crash
invoke  RegService, my_service, service_proc
ret
Author
Member

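Preserving the esi and edi registers seems to fix this issue:

push    esi edi
call    detect_nvme
pop     edi esi         ; restore before branching so the .err path stays balanced too
test    eax, eax
jz      .err

That makes sense: the driver entry point follows the stdcall convention, under which esi and edi (along with ebx and ebp) are callee-saved. Now, let's see if I can fix this properly.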

Author
Member

YUP, that fixes it! NICE! :) (9ace6a4b52)

I've omitted the nvme_cleanup call for now, though; I'll test it tomorrow. For now, this should suffice.

Confirmed: commit 9ace6a4b52 also fixed this issue in VirtualBox for me, both when autoloading the driver from the IMG image and when loading it manually from the ISO image.

Author
Member

Could you test this once more on VirtualBox and VMware, with commit 19f9eb5ae6? Thanks.

EDIT: Use c02d64b35a

Update with commit c02d64b35a.
The system with the NVMe driver loaded shuts down okay in VirtualBox but freezes in VMware, both with the IMG image (driver autoload) and with the ISO image (manual driver loading).

Author
Member

> Update with commit c02d64b35a.
> The system with the NVMe driver loaded shuts down okay in VirtualBox but freezes in VMware, both with the IMG image (driver autoload) and with the ISO image (manual driver loading).

Thank you. Could you try again but this time print debug output to the screen? I'm guessing the driver is stuck waiting for the controller to shut down. But I just want to be sure.

> Thank you. Could you try again but this time print debug output to the screen? I'm guessing the driver is stuck waiting for the controller to shut down. But I just want to be sure.

Of course, here it is. Same result and output, both with IMG autoload and ISO manual load.

Author
Member

@Burer Thanks. I'll check this issue out tomorrow. Seems like the controller isn't shutting down. I'm probably doing it wrong.
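
For reference, here is a minimal sketch of the shutdown handshake described in the NVMe base specification: the host writes 01b (normal shutdown) to CC.SHN and then polls CSTS.SHST until it reads 10b (shutdown complete). It is written in C rather than the driver's fasm so it can be run on its own, and it is not the driver's actual code. The `nvme_regs` struct, `nvme_shutdown`, the `poll_hook` callback, and `sim_completes` are hypothetical names introduced for illustration; only the register field layout comes from the specification. The point is that the poll loop is bounded, so a controller that never reports completion cannot hang the OS shutdown path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register fields as laid out in the NVMe base specification. */
#define NVME_CC_SHN_MASK     (3u << 14)  /* CC.SHN: shutdown notification      */
#define NVME_CC_SHN_NORMAL   (1u << 14)  /* 01b: normal shutdown               */
#define NVME_CSTS_SHST_MASK  (3u << 2)   /* CSTS.SHST: shutdown status         */
#define NVME_CSTS_SHST_BUSY  (1u << 2)   /* 01b: shutdown processing occurring */
#define NVME_CSTS_SHST_DONE  (2u << 2)   /* 10b: shutdown processing complete  */

/* Hypothetical stand-in for the memory-mapped controller registers. */
struct nvme_regs {
    volatile uint32_t cc;    /* Controller Configuration (offset 0x14) */
    volatile uint32_t csts;  /* Controller Status (offset 0x1C)        */
};

/* Hypothetical hook run after each unsuccessful poll. A real driver would
   sleep here; in a test it lets a simulated controller change state. */
typedef void (*poll_hook)(struct nvme_regs *regs, unsigned iteration);

/* Request a normal shutdown and wait for completion, giving up after
   max_polls checks. Returns true only if the controller reported
   "shutdown complete". */
bool nvme_shutdown(struct nvme_regs *regs, unsigned max_polls, poll_hook hook)
{
    /* Read-modify-write so the other CC fields (e.g. CC.EN) are preserved. */
    uint32_t cc = regs->cc;
    cc &= ~NVME_CC_SHN_MASK;
    cc |= NVME_CC_SHN_NORMAL;
    regs->cc = cc;

    for (unsigned i = 0; i < max_polls; i++) {
        if ((regs->csts & NVME_CSTS_SHST_MASK) == NVME_CSTS_SHST_DONE)
            return true;
        if (hook != NULL)
            hook(regs, i);
    }
    return false;  /* timed out: report failure instead of spinning forever */
}

/* Simulated controller: flips to "shutdown complete" on the hook's third call. */
void sim_completes(struct nvme_regs *regs, unsigned iteration)
{
    if (iteration == 2)
        regs->csts = (regs->csts & ~NVME_CSTS_SHST_MASK) | NVME_CSTS_SHST_DONE;
}
```

Calling `nvme_shutdown(&regs, 10, sim_completes)` on a simulated controller returns true, while a controller whose CSTS.SHST stays at 01b makes the call return false once the poll budget is spent, which a caller could log and then continue shutting down.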

Author
Member

Since I don't have VMware, this might unfortunately get annoying for you to test. I'm not smart enough to know what's going on here, so could you try again with commit 5f727368e8 and screenshot the output? That would tell me whether shutdown processing is occurring. Thanks.

No problem at all!
Here are the results with commit 5f727368e8:

  • In VMware, the OS freezes when loading: with "Show disks visible by BIOS" turned on at launch it hangs on a black screen, while with it turned off it manages to load up to the desktop.
  • In VirtualBox, everything runs okay, with this setting both on and off.

Author
Member

@Burer, does this freeze always occur on VMWare? Or does it just happen occasionally? Has this happened on previous builds?

@ramenu, it happened every time on commit 5f727368e8, but was okay on previous commits.
That is, on c02d64b35a it loads okay but fails to shut down, and before that it just failed to load, but without freezes.

Author
Member

@Burer: Thank you. That is a very useful hint. :)

Author
Member

@Burer, I've reverted that commit, but I don't think it will help with the VMware shutdown case. Not having VMware makes it significantly more challenging to test and debug, as I have to rely on guessing what may be wrong, so sorry if I'm not active on this issue.

Author
Member

Also, I'm going to remove the image from the repository, as it's using quite a bit of storage space and git isn't really suitable for this kind of thing. I'll edit the Makefile later to automatically update the image. :)

@ramenu, no problem at all. The first goal of the driver is to work on real hardware, so if it can't run properly on one particular VM, that is sad but not crucial, I suppose.
And okay, got it, thank you. But what do you mean by automatically updating the image, if you are going to remove it from the repo?

Author
Member

@Burer, don't get me wrong. I'm still interested in making it work on VMware. It's just difficult to work on until I install it.

And by automatically updating the image I mean having the makefile install the driver directly into the KolibriOS image with something like 'make install'.

@ramenu, got it. I just want to say that there is nothing to be sorry about if you are not active on it. For some reason, the VMware website has been throwing server errors on download attempts for the last few weeks, so there is possibly nothing we can do here right now. And if you make some progress, I will always be glad to help test it.

And got it, thank you, really nice idea. It would definitely help other contributors build it.

Author
Member

@Burer, I'll try to download VMware today and see what I can do. :) That being said, were there any other issues you discovered in your testing? Does read/write work okay? No crashes or freezing? Even though the driver is beta quality, @punk_joker and I still want to merge the driver upstream.

Author
Member

> @ramenu, got it. I just want to say that there is nothing to be sorry about if you are not active on it. For some reason, the VMware website has been throwing server errors on download attempts for the last few weeks, so there is possibly nothing we can do here right now. And if you make some progress, I will always be glad to help test it.
>
> And got it, thank you, really nice idea. It would definitely help other contributors build it.

RIP. You're right. I'm getting a Cloudflare host error.

Author
Member

@Burer, you mentioned that it freezes on startup in VMware, right? For some reason, I'm also getting this issue when disabling debug logs on screen. It's weird.

@ramenu, first of all, sorry for the silence; I was busy the last few days.
I will test the newest version of the driver in the next few days and write about the results here.

Meanwhile, I sent an e-mail to your posteo.net address, also regarding GSoC. I'm mentioning it here just in case, so that it doesn't get lost.

Author
Member

@Burer, no worries. I knew you were busy, take your time. :)

And I received your email. I'll reply in a short bit, you'll get it from a different email though so check if it goes to spam.

Hello again, @ramenu!
It's been a long time, but I am finally here.
If it is still relevant, here are the results of testing the latest driver version (commit c2ee9ee16d).

  • In VirtualBox, all is okay.
  • In VMware, I got an instant freeze with "Add disk visible by BIOS" turned on in BIOS (even when the driver is not added to autoload), and a freeze on the desktop with it turned off (and the driver added to autoload).

I hope this helps if you are planning to continue driver development after the end of GSoC.

Off-topic: I haven't received your email yet. Did I miss it, or have you just not sent it yet?

Author
Member

> Hello again, @ramenu!
> It's been a long time, but I am finally here.
> If it is still relevant, here are the results of testing the latest driver version (commit c2ee9ee16d).
>
>   • In VirtualBox, all is okay.
>   • In VMware, I got an instant freeze with "Add disk visible by BIOS" turned on in BIOS (even when the driver is not added to autoload), and a freeze on the desktop with it turned off (and the driver added to autoload).
>
> I hope this helps if you are planning to continue driver development after the end of GSoC.
>
> Off-topic: I haven't received your email yet. Did I miss it, or have you just not sent it yet?

Hey! I'll check this out sometime this weekend or next week. Thanks once again!

As for the email, I sent it again. My domain was on Spamhaus's DBL so it's possible that is why it didn't get sent. That should be resolved now, but if not then I'll just send it from another email.

> Hey! I'll check this out sometime this weekend or next week. Thanks once again!

Great, take your time and you are always welcome!

> As for the email, I sent it again. My domain was on Spamhaus's DBL so it's possible that is why it didn't get sent. That should be resolved now, but if not then I'll just send it from another email.

And now I got it, thank you very much.

Update with commit 53c04b912c.

The driver loads and works okay on VMware, but the system freezes on shutdown.
