Eolite: Assign correct label and icon to NVMe device #81

Closed
opened 2024-07-22 23:01:20 +02:00 by ramenu · 18 comments
Member

Currently, Eolite detects it as an unknown device, as you can see from the image attached below. I tried patching it in myself, but the build failed with a segfault.

Hello!
Have you discussed it with your mentor, punk_joker? He took part in the development of Eolite a few years ago, so maybe he can help you with that.
P.S. Whoever takes this on: note that the problem also needs to be fixed in KFM2, or in the components it shares with Eolite.

Sweetbread added the Kind/Bug, Priority/Low, Eolite, Category/Applications labels 2024-07-23 20:29:57 +02:00
Doczom added Kind/Feature and removed Kind/Bug labels 2024-07-23 20:38:33 +02:00
Author
Member

Yes, we've gone over it. But I thought I'd submit an issue here anyway, just in case someone wants to do it. Thanks.

Could you please share the latest version of nvme.sys or a full kolibri.img?
I tried to use your kolibri.img from the forum, but it seems to be bugged: the NVMe driver gets stuck loading forever with heavy CPU usage.

Author
Member

First of all, thank you for testing out the driver and providing detailed screenshots! :) It's much appreciated.

In your case, it seems like a command didn't complete and the driver is polling, so it's stuck in an infinite loop. But I don't know if that's the exact cause for this. It doesn't even seem like an IRQ fired, which is strange. Did the screen freeze as well?

Currently the polling code is being removed, and I'm trying to find a suitable interprocess communication API to replace it (if anyone reading this knows of any source code that uses one, please let me know).

The current image is buggy and susceptible to frequent crashes when reading and writing to disk, but it should work fine when handling initialization... though your screenshot doesn't give me high hopes. What is your NVMe controller version? Are you testing this on bare metal?

(EDIT: Gitea doesn't allow me to upload the image or the .sys file, so you'll have to clone the Git repository and build it manually. There shouldn't be any problems, I hope. But if there are, please let me know.)

Author
Member

Yup. It's definitely the polling code. I'm working on a fix for this (at least for initialization).

Owner

> interprocess communication API

For example, this API: https://git.kolibrios.org/KolibriOS/kolibrios/src/branch/main/kernel/trunk/docs/events_subsystem.txt

> First of all, thank you for testing out the driver and providing detailed screenshots! :) It's much appreciated.

You are always welcome!

> Did the screen freeze as well?

No, the screen is fine and I can use the system freely, but it freezes on an empty desktop when I try to shut the OS down.

> What is your NVMe controller version? Are you testing this on bare metal?

No, I am using the default VirtualBox NVMe controller with a virtual drive image formatted as FAT32 under Linux. Unfortunately, I don't have any NVMe drives on hand right now.

> Gitea doesn't allow me to upload the image or the .sys file, so you'll have to clone the Git repository and build it manually.

Yes, I finally managed to pick the proper macro files and compiled the latest version from your repository (commit [269d675](https://git.kolibrios.org/GSoC/kolibrios-nvme-driver/commit/269d675697d6370491b1ca3163c2217fd91bef87)), but it seems to have the same problems. At least it now shows the controller version in the debug output, so maybe that will make the problem easier for you to track down.

EDIT:

The hang occurs in the entry function; the fact that the disk is never added points to a fault before or during disk detection.

Presumably the interrupt handler is catching the wrong thing or not catching anything at all. Try initializing ACPI, maybe that's the problem (one of our developers has observed this on his laptop with SDHCI).

Most likely there is a construct somewhere like:

@@:
    test dword[base + mmio_reg], bit_mask
    jnz @b
Author
Member

@Burer I can confirm I've been able to reproduce your issue on VirtualBox. I had forgotten to add a timeout jump in the polling code; with that added, the infinite loop is gone. However, I'm still figuring out why the IRQ doesn't fire at all. Thank you for the hint. I'll keep you updated on this.
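
For readers following along, the difference a timeout makes in a polling loop can be sketched roughly as follows. This is a C illustration rather than the driver's FASM code, and `wait_bit_clear` and its poll budget are hypothetical names chosen for the example, not anything from the NVMe driver repository.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the driver): poll a status word until the
 * bits in `mask` are all clear, giving up after `max_polls` reads.
 * Returns true if the bits cleared in time, false on timeout.
 * Without the counter, a bit that never clears would spin forever.
 */
static bool wait_bit_clear(volatile uint32_t *reg, uint32_t mask,
                           uint32_t max_polls)
{
    while (max_polls-- > 0) {
        if ((*reg & mask) == 0)
            return true;   /* condition met, stop polling */
    }
    return false;          /* budget exhausted: report a timeout */
}
```

A caller would treat a `false` return as a failed command and bail out of initialization instead of hanging. A real driver would more likely bound the wait by elapsed time than by a raw iteration count.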

@ramenu Glad to hear it, I'm looking forward to it!
Also, all the thanks should really go to Doczom: he is the one who suggested what could be causing the driver to freeze. But of course, thank you!

EDIT: Yes, the latest version now finishes properly (it fails to install) after a minute or so, but the OS itself still freezes when I try to shut it down, at least for me. But maybe it is just some VirtualBox-related bug.

Author
Member

Okay, I've figured out the problem (kind of, not really..)

The controller encounters a serious error condition of some sort. It isn't specific about what the error is; all I know is that the fatal status bit is set in one of the registers. So it's not related to the IRQ, because the controller does not signal this error with an interrupt.
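
As a rough illustration of that check: the comment does not name the register, so the sketch below assumes it is the NVMe Controller Status register (CSTS, offset 0x1C), where the specification defines bit 0 as RDY and bit 1 as CFS (Controller Fatal Status). It is written in C rather than the driver's FASM, and `classify_csts` and the `wait_result` enum are hypothetical names for the example.

```c
#include <stdint.h>

/* NVMe Controller Status register (CSTS), per the NVMe base specification. */
#define NVME_REG_CSTS 0x1Cu        /* offset from the controller's MMIO base */
#define CSTS_RDY      (1u << 0)    /* controller is ready */
#define CSTS_CFS      (1u << 1)    /* controller fatal status */

enum wait_result { WAIT_AGAIN, WAIT_READY, WAIT_FATAL };

/*
 * Hypothetical helper: decide what to do with one CSTS value read while
 * waiting for the controller to become ready. A fatal status takes
 * priority, since no interrupt announces it and further polling for RDY
 * would never succeed.
 */
static enum wait_result classify_csts(uint32_t csts)
{
    if (csts & CSTS_CFS)
        return WAIT_FATAL;
    if (csts & CSTS_RDY)
        return WAIT_READY;
    return WAIT_AGAIN;
}
```

Checking CFS inside the wait loop lets initialization fail fast with a clear message instead of running into the timeout.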

So I'm assuming the problem lies in my configuration somewhere, but it's weird because I checked the 1.4 and 1.2b specifications and couldn't distinguish much of a difference (if any?) when checking the initialization part. It's possible that QEMU just comes with better defaults, or I'm missing something somewhere. I'll have to investigate some more.

And yes, @Doczom has been tremendously helpful ever since the start, so I consider him a third mentor alongside @punk_joker and @dunkaist. :)

Author
Member

@Burer, unfortunately I haven't been able to figure out what the problem is yet. I'm talking to @punk_joker about it though, and I'll keep you updated if I make progress on this issue. For now I'm going to fix the bugs on QEMU, and once that's done I will work on troubleshooting VirtualBox. :)

@ramenu, got it, thank you!
I'll be waiting for progress and would be glad to help test it once it's done.

@ramenu, by the way, have you already checked KolibriOS with commit 975284f5f3 with your driver?
Does it display labels for NVMe drives properly, both in Eolite and KFM2?
Just curious, as that was the original topic of this discussion, after all.

Author
Member

Not sure how KFM2 is supposed to look, but Eolite does display the label correctly now.

I should probably close this issue since it's been resolved; however, there's still the VirtualBox problem which needs to be resolved (and which I'm now working on again).

I feel like opening a separate issue for that would be better, but the question is where.. perhaps on the NVMe driver mirror hosted here? But I don't believe anyone who doesn't have an account here can view that repository, last I checked. What do you suggest?

> Not sure how KFM2 is supposed to look, but Eolite does display the label correctly now.

Looks totally right to me, thank you for sharing the screenshot. Just one small remark: is there any strict reason to name drives like nvme0n1 instead of just nvme0? It is totally up to you, of course, but a shorter version would match the pattern of other disk names better and fit better into the disk selection window in KFM2.

> I should probably close this issue since it's been resolved; however, there's still the VirtualBox problem which needs to be resolved (and which I'm now working on again).
>
> I feel like opening a separate issue for that would be better, but the question is where.. perhaps on the NVMe driver mirror hosted here? But I don't believe anyone who doesn't have an account here can view that repository, last I checked. What do you suggest?

I asked the admins, and the GSoC repos should now be publicly visible, but you still can't open issues in them, or at least I can't. So I will ask for this as well.

Author
Member

> I asked the admins, and the GSoC repos should now be publicly visible, but you still can't open issues in them, or at least I can't. So I will ask for this as well.

Thank you! It's much appreciated. :)

> is there any strict reason to name drives like nvme0n1 instead of just nvme0? It is totally up to you, of course, but a shorter version would match the pattern of other disk names better and fit better into the disk selection window in KFM2.

I agree it would be more aesthetically pleasing to look at, but the reason for that naming is that NVMe operates with namespaces, which are kind of like virtual drives. The `nX` is a namespace identifier. In practice though, I have never really seen this value be anything other than 1.

We could perhaps just change the name to something shorter, but I wanted to follow the Linux naming convention, which seems to be fairly well understood by everyone.
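
To make the naming scheme concrete, here is a small C sketch that builds a label of the form `nvme<controller>n<namespace>`. The helper `nvme_disk_name` is hypothetical and only illustrates the pattern described above; it is not how the driver itself (written in FASM) composes the name.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a Linux-style NVMe disk label.
 *   ctrl - index of the controller (the "0" in nvme0n1)
 *   nsid - namespace identifier    (the "1" in nvme0n1)
 * Returns what snprintf returns: the length the full name needs,
 * excluding the terminating NUL.
 */
static int nvme_disk_name(char *buf, size_t len, unsigned ctrl, uint32_t nsid)
{
    return snprintf(buf, len, "nvme%un%" PRIu32, ctrl, nsid);
}
```

Calling it with controller 0 and namespace 1 yields `nvme0n1`; a second namespace on the same controller would come out as `nvme0n2`, which is the case a shorter `nvme0` label could not distinguish.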

> Thank you! It's much appreciated. :)

Done! Issues are now open and, moreover, you are the repo owner with admin rights! Thanks to Sweetbread.

> I agree it would be more aesthetically pleasing to look at, but the reason for that naming is that NVMe operates with namespaces, which are kind of like virtual drives. The `nX` is a namespace identifier. In practice though, I have never really seen this value be anything other than 1.
>
> We could perhaps just change the name to something shorter, but I wanted to follow the Linux naming convention, which seems to be fairly well understood by everyone.

Okay, got it, thank you, sounds pretty reasonable. Maybe I'll just update the KFM2 UI a little bit.

Author
Member

I've opened the issue here: https://git.kolibrios.org/GSoC/kolibrios-nvme-driver/issues/1

Thank you @Burer and @SweetBread :)

Reference: KolibriOS/kolibrios#81