Looking Glass - Triage

@mbz I will correct the error; it was an oversight with the latest updates. It is also likely the cause of your segfault, as you must run the matching version of the host with the client.

I thought it could be that. I will let you know if my segfaults are gone when you upload the host. Thanks.

I do not upload the host; if you are using master, you need to fetch it from the continuous integration build server.

See: https://ci.appveyor.com/project/gnif/lookingglass/builds/20917625/artifacts

It's building now and should be ready in a few minutes.

That's what I meant, wrong terminology xD

@gnif Still having the same segfault. The version I had from early November still works without problems using EGL (with the proper client and host).

Let me know if you would like me to run some traces (I may need small directions, it has been a while).

A stack trace would be extremely handy. If you are not sure how to produce one, just say so and I will give you some instructions.

Hope this is what you are looking for:

The output is right after "Host ready, starting session" (nothing different before it besides new threads). The program was run with the same parameters:

(gdb) r -L 32 -k -K 144 -Q -j -s -d -S -F -M -g egl

Thread 7 "frameThread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffde6f0700 (LWP 4508)]
0x00007ffff73f20eb in glGetUniformLocation () from /usr/lib64/opengl/nvidia/lib/libGL.so.1
(gdb) bt
#0  0x00007ffff73f20eb in glGetUniformLocation () from /usr/lib64/opengl/nvidia/lib/libGL.so.1
#1  0x0000555555564b75 in egl_on_frame_event (opaque=0x5555557f0210, format=...,
    data=0x7fffe7080000 "\363\334\362\377\363\334\362\377\363\334\362\377\364\335\363\377\364\335\363\377\364\335\363\377\365\337\362\377\365\337\362\377\365\342\363\377\365\342\363\377\366\343\364\377\366\343\364\377\367\344\365\377\367\344\365\377\370\345\364\377\370\346\363\377\370\347\364\377\370\350\363\377\370\350\363\377\371\351\364\377\371\351\364\377\371\351\364\377\371\351\364\377\371\351\364\377\372\352\365\377\372\352\365\377\373\353\366\377\373\353\366\377\373\353\366\377\374\354\367\377\374\354\367\377\372\355\365\377\372\355\365\377\371\357\365\377\371\357\365\377\371\357\365\377\371\357\365\377\372\360\366\377\372\360\366\377\372\360\366\377\372\361\364\377\372\361\364\377\372\361\364\377\373\362\365\377\374\363\366\377\374\363\366\377\375\364\367\377\375\364\367\377\377\363\371\377\377\363\371\377"...)
    at /var/tmp/portage/app-emulation/looking-glass-9999-r1/work/looking-glass-9999/client/renderers/egl.c:310
#2  0x000055555555bd76 in frameThread (unused=<optimized out>)
    at /var/tmp/portage/app-emulation/looking-glass-9999-r1/work/looking-glass-9999/client/main.c:424
#3  0x00007ffff791b6dc in ?? () from /usr/lib64/libSDL2-2.0.so.0
#4  0x00007ffff79863b9 in ?? () from /usr/lib64/libSDL2-2.0.so.0
#5  0x00007ffff4156a33 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff57f6bdf in clone () from /lib64/libc.so.6

Excellent, but can you please compile with debug symbols?

cmake -DCMAKE_BUILD_TYPE=Debug  ...

I might be doing something wrong, but that output is with debug symbols. With the other version I had, gdb complained about "no symbols on file" and only showed the first line, glGetUniformLocation. After I added everything and removed my automatic strip, it gave this more "detailed" output.

My CMakeCache.txt has CMAKE_BUILD_TYPE=Debug

and file ./looking-glass-client gives this:

looking-glass-client: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, with debug_info, not stripped

Sorry for the confusion. I was literally running out the door and missed the details of the debug info; it was indeed built with debug symbols enabled. I will have a poke around and see if I can identify the issue. Thanks for your patience.

Can you please verify where you are getting the looking-glass client source from? You are using a modified build, which I cannot debug or provide support for.

The commit ga1b1ed0060 does not exist in the official tree, and your tree is marked as dirty which tells me that you have made local changes to your source.

The current master head should be showing a11-115-gd2b83027b4 at the time of posting this.
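
For anyone comparing their own banner against the expected one, here is a minimal sketch in Python of how a version string such as a11-115-gd2b83027b4 breaks down. It assumes the banner follows the long output format of `git describe` (tag, number of commits since the tag, `g` plus the abbreviated commit hash, and an optional `-dirty` suffix for uncommitted local changes). The names `Describe` and `parse_describe` are hypothetical and are not part of Looking Glass.

```python
import re
from typing import NamedTuple, Optional


class Describe(NamedTuple):
    tag: str                # nearest tag, e.g. "a11"
    commits_since_tag: int  # commits on top of that tag
    commit: str             # abbreviated hash, without the leading "g"
    dirty: bool             # True if the tree had uncommitted changes


# <tag>-<count>-g<hash>[-dirty], as produced by `git describe --long --dirty`
_PATTERN = re.compile(
    r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<hash>[0-9a-f]+)(?P<dirty>-dirty)?$"
)


def parse_describe(version: str) -> Optional[Describe]:
    """Split a git-describe style version string; return None if it does not match."""
    match = _PATTERN.match(version)
    if match is None:
        return None
    return Describe(
        tag=match.group("tag"),
        commits_since_tag=int(match.group("count")),
        commit=match.group("hash"),
        dirty=match.group("dirty") is not None,
    )


# The master head quoted above.
print(parse_describe("a11-115-gd2b83027b4"))

# Illustrative only: the commit count "1" is made up, since the thread
# does not quote the full string of the modified build.
print(parse_describe("a11-1-ga1b1ed0060-dirty"))
```

The leading `g` is only a marker for "git" and is not part of the hash, so the commit to look up in the official tree is the part after it.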

Here is the output from the client. This output appears to be constant regardless of the resulting client state (black/flashing/working as detailed below):

steve@Threadripper:~/Downloads/LookingGlass-a11/client/bin$ ./looking-glass-client
[I]               main.c:692  | run                            | Looking Glass ()
[I]               main.c:693  | run                            | Locking Method: Atomic
[I]               main.c:686  | try_renderer                   | Using Renderer: OpenGL
[I]               main.c:775  | run                            | Using: OpenGL
[I]              spice.c:159  | spice_connect                  | Remote: 127.0.0.1:5900
[I]               main.c:901  | run                            | Waiting for host to signal it's ready...
[I]              spice.c:367  | spice_on_common_read           | notify message: keyboard channel is insecure
[I]             opengl.c:552  | pre_configure                  | Vendor  : X.Org
[I]             opengl.c:553  | pre_configure                  | Renderer: AMD OLAND (DRM 2.50.0 / 4.15.0-42-generic, LLVM 6.0.0)
[I]             opengl.c:554  | pre_configure                  | Version : 3.0 Mesa 18.0.5
[I]             opengl.c:566  | pre_configure                  | Using GL_AMD_pinned_memory
[I]               main.c:921  | run                            | Host ready, starting session
[I]               main.c:177  | updatePositionInfo             | client 1024x768, guest 1920x1080, target 1024x576, scaleX: 1.88, scaleY 1.88
[W]               main.c:180  | updatePositionInfo             | Window size doesn't match guest resolution, cursor alignment may not be reliable
[I]             opengl.c:602  | configure                      | Using decoder: NULL
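
The numbers in the updatePositionInfo line can be reproduced with a short calculation. The sketch below is in Python and assumes the client fits the guest frame into the window while preserving the aspect ratio, and that the reported scale is the guest size divided by the target size. `fit_target` is a hypothetical helper written for this illustration, not the client's actual code.

```python
def fit_target(client_w, client_h, guest_w, guest_h):
    """Fit the guest frame inside the client window, keeping the aspect ratio."""
    # The more restrictive axis decides how far the guest frame is shrunk.
    shrink = min(client_w / guest_w, client_h / guest_h)
    target_w = round(guest_w * shrink)
    target_h = round(guest_h * shrink)
    # Factor for mapping window (target) coordinates back to guest coordinates.
    scale_x = guest_w / target_w
    scale_y = guest_h / target_h
    return target_w, target_h, scale_x, scale_y


# Values taken from the log line above.
tw, th, sx, sy = fit_target(1024, 768, 1920, 1080)
print(f"target {tw}x{th}, scaleX: {sx:.2f}, scaleY: {sy:.2f}")
```

With a 1024x768 window and a 1920x1080 guest, the width is the limiting axis, so the target is 1024x576 and both scale factors are 1.875, which the log shows rounded to 1.88. Cursor positions have to be multiplied by that factor, which is why the warning about cursor alignment appears when the window does not match the guest resolution.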

In troubleshooting, I've tried KDE now since it allows compositing to be disabled, but it originally had the same flashing issue. However, here is something strange to note: the problem appears to be unpredictable. Sometimes I get the flashing issue, sometimes I get nothing but a black screen, and sometimes it actually works and I get a stable screen display (OMG the response, incredible work). This now holds true for both KDE and Gnome 3. Given this behavior, I tend to think that perhaps the compositing is not the only issue? I can achieve a stable display in Gnome 3 even with compositing defaulted to ON. I just wish it was more predictable! I'd say on average I have to launch and kill the client about 6-7 times before I get a stable display. Once stable, it stays that way until I kill the client and start it again. At that point, I just keep killing and launching until it comes up stable again. Interesting.

Please try without AMD pinned memory:

-o opengl:amdPinnedMem=0

Bravo, sir. Consistently performing perfectly now. Thank you!!!

An amazing and frankly unreal level of performance and quality. Windows will never boot outside of a VM here again. You, sir, deserve a special place in the Linux Hall of Fame for this work! :sunglasses:

I think it might be something that Gentoo Portage does when it runs the configure step.

Anyway, for the last report I sent you, I compiled it by hand to make sure I only had the Debug flag in there. Both versions give me the same output (posting it in full now):

(gdb) run -L 32 -k -K 144 -Q -j -s -d -S -F -M -g egl
Starting program: /home/mb/src/LookingGlass/client/looking-glass-client -L 32 -k -K 144 -Q -j -s -d -S -F -M -g egl
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[I]               main.c:752  | run                            | Looking Glass (a11-115-gd2b83027b4)
[I]               main.c:753  | run                            | Locking Method: Atomic
[New Thread 0x7fffedab9700 (LWP 13077)]
[Thread 0x7fffedab9700 (LWP 13077) exited]
[New Thread 0x7fffedab9700 (LWP 13078)]
[I]               main.c:793  | run                            | Trying forced renderer
[I]               main.c:746  | try_renderer                   | Using Renderer: EGL
[New Thread 0x7fffec948700 (LWP 13081)]
[I]               main.c:948  | run                            | Waiting for host to signal it's ready...
[New Thread 0x7fffe5db7700 (LWP 13082)]
[I]                egl.c:387  | egl_render_startup             | Vendor  : NVIDIA Corporation
[I]                egl.c:388  | egl_render_startup             | Renderer: GeForce GTX 1080/PCIe/SSE2
[I]                egl.c:389  | egl_render_startup             | Version : OpenGL ES 3.2 NVIDIA 396.54.09
[I]               main.c:957  | run                            | Host ready, starting session
[New Thread 0x7fffe4c56700 (LWP 13083)]
[New Thread 0x7fffde6f0700 (LWP 13084)]

Thread 7 "frameThread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffde6f0700 (LWP 13084)]
0x00007ffff73f20eb in glGetUniformLocation () from /usr/lib64/opengl/nvidia/lib/libGL.so.1
(gdb) bt
#0  0x00007ffff73f20eb in glGetUniformLocation () from /usr/lib64/opengl/nvidia/lib/libGL.so.1
#1  0x0000555555564be5 in egl_on_frame_event (opaque=0x555555853c00, format=...,
    data=0x7fffe7080000 "\363\334\362\377\363\334\362\377\363\334\362\377\364\335\363\377\364\335\363\377\364\335\363\377\365\337\362\377\365\337\362\377\365\342\363\377\365\342\363\377\366\343\364\377\366\343\364\377\367\344\365\377\367\344\365\377\370\345\364\377\370\346\363\377\370\347\364\377\370\350\363\377\370\350\363\377\371\351\364\377\371\351\364\377\371\351\364\377\371\351\364\377\371\351\364\377\372\352\365\377\372\352\365\377\373\353\366\377\373\353\366\377\373\353\366\377\374\354\367\377\374\354\367\377\372\355\365\377\372\355\365\377\371\357\365\377\371\357\365\377\371\357\365\377\371\357\365\377\372\360\366\377\372\360\366\377\372\360\366\377\372\361\364\377\372\361\364\377\372\361\364\377\373\362\365\377\374\363\366\377\374\363\366\377\375\364\367\377\375\364\367\377\377\363\371\377\377\363\371\377"...) at /home/mb/src/LookingGlass/client/renderers/egl.c:310
#2  0x000055555555bdc6 in frameThread (unused=<optimized out>) at /home/mb/src/LookingGlass/client/main.c:424
#3  0x00007ffff791b6dc in ?? () from /usr/lib64/libSDL2-2.0.so.0
#4  0x00007ffff79863b9 in ?? () from /usr/lib64/libSDL2-2.0.so.0
#5  0x00007ffff4156a33 in start_thread () from /lib64/libpthread.so.0
#6  0x00007ffff57f6bdf in clone () from /lib64/libc.so.6
(gdb)

I got the git version built from the AUR successfully. Everything seems to be mostly fine. I think there may be some optimizations to do now that you've switched to the new DXGI, because I'm getting lower UPS now. I also note that games that use DX11 invert the red and blue color channels. OpenGL, Vulkan, DX12 and DX9 all seem to work fine.

AUR builds are not supported.

Switch to the EGL renderer with -g egl; this is known and will be fixed soon. Please see below, where this problem has been covered.

That was it. I was using -o egl:vsync=0 but I wasn't using -g egl, and didn't even think about checking the reported renderer in the console till just then.

No worries. Also, there is no need to specify vsync=0 for EGL, as it is the default for that renderer.

I just pushed in a fix for this that also adds 10-bit RGBA support to the OpenGL renderer.