Need help with gpu folding, Folding@home, Fedora 31, Nvidia, Nvidia rtx 2070 [solved]

How do I get that? I am on Fedora Linux.

It looks like you have POCL installed?

Someone on another thread pointed out that F@H only works when there’s one, and only one entry in clinfo. Try uninstalling your other opencl provider.
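A quick way to check the "one and only one entry" condition is to read the platform count that clinfo prints on its first line. This is only a sketch: count_opencl_platforms is a made-up helper name, and it assumes the "Number of platforms" header that appears in the clinfo output later in this thread.

```shell
# Count OpenCL platforms in clinfo output read from stdin.
# Assumes the "Number of platforms  N" header line that clinfo prints.
# Prints 0 if no such line is found.
count_opencl_platforms() {
    awk '/^Number of platforms/ { print $NF; found = 1; exit }
         END { if (!found) print 0 }'
}
```

Typical use would be clinfo | count_opencl_platforms. Anything other than 1 means more than one provider (or none) is registered.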

How do I uninstall it?

I don’t even know how to list all OpenCL installations.

Oh yeah Fedora. Try running it explicitly with “python2 /usr/bin/FAHControl”

python2 /usr/bin/FAHControl

Traceback (most recent call last):
  File "/usr/bin/FAHControl", line 25, in <module>
    from fah import FAHControl, load_fahcontrol_db
ImportError: No module named fah

Ugh. I feel your pain. I had a similar problem, even though I am not on Fedora. They haven’t updated the fahcontrol in quite a while. That’s why I gave up and went to a remote machine to control it.

I suppose an alternative is to edit the config.xml file. Mine is in /etc/fahclient/config.xml although yours may be different.

Do you have a line like: <gpu v='true'/> in your config file? If not, try adding it and restart the fahclient service or just rebooting. It might be worth a try.
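If you would rather check than eyeball the file, here is a minimal sketch. has_gpu_enabled is a hypothetical helper, and the path is whatever your client actually uses (the log further down in this thread shows a config.xml in the home directory, not /etc/fahclient).

```shell
# Succeed (exit status 0) if the given FAHClient config file contains
# an enabled <gpu v='true'/> option, with either quote style.
# Usage: has_gpu_enabled /etc/fahclient/config.xml
has_gpu_enabled() {
    grep -Eq "<gpu[[:space:]]+v=['\"]true['\"][[:space:]]*/>" "$1"
}
```

For example: has_gpu_enabled /etc/fahclient/config.xml && echo "gpu option present". If it is missing, add the line and restart the client however it is managed on your system.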

clinfo will list all the OpenCL Info.

I think if you’re on Fedora, dnf remove pocl will remove the Portable OpenCL package. Then you should be left with only the Nvidia one.

Again, not much help, but on Manjaro I had to download not only the drivers “opencl-nvidia-440x” but also the “opencl-headers”.

Here is a link to the ArchWiki entry on F@H:
https://wiki.archlinux.org/index.php/folding@home

You might spot something there that rings a bell with Fedora.

Good luck!

<config>
  <!-- Slot Control -->
  <power v='FULL'/>
  <gpu v='true'/> 
  <!-- User Information -->
  <passkey v='mypascode'/>
  <team v='level1'/>
  <user v='myname'/>
  <!-- Folding Slots -->
  <slot id='0' type='CPU'/>
  <slot id='1' type='GPU'/>
</config>
clinfo 
Number of platforms                               2
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 1.2 CUDA 10.2.159
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics
  Platform Extensions function suffix             NV

  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 19.2.8
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   NVIDIA CUDA
Number of devices                                 1
  Device Name                                     GeForce RTX 2070
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 1.2 CUDA
  Driver Version                                  440.82
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     GPU
  Device Topology (NV)                            PCI-E, 26:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               36
  Max clock frequency                             1620MHz
  Compute Capability (NV)                         7.5
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple              32
  Warp size (NV)                                  32
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              8366784512 (7.792GiB)
  Error Correction support                        No
  Max memory allocation                           2091696128 (1.948GiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        1179648 (1.125MiB)
  Global Memory cache line size                   128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            268435456 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             32768x32768 pixels
    Max 3D image size                             16384x16384x16384 pixels
    Max number of read image args                 256
    Max number of write image args                32
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max number of constant args                     9
  Max constant buffer size                        65536 (64KiB)
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Kernel execution timeout (NV)                 Yes
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  3
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics

  Platform Name                                   Clover
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  NVIDIA CUDA
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [NV]
  clCreateContext(NULL, ...) [default]            Success [NV]
  clCreateContext(NULL, ...) [other]              [unreadable in original output]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Invalid device type for platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.12
  ICD loader Profile                              OpenCL 2.2

running in terminal

FAHClient
04:07:05:INFO(1):Read GPUs.txt
04:07:06:Removing old file 'logs/log-20200412-020427.txt'
04:07:06:************************* Folding@home Client *************************
04:07:06:      Website: https://foldingathome.org/
04:07:06:    Copyright: (c) 2009-2018 foldingathome.org
04:07:06:       Author: Joseph Coffland <[email protected]>
04:07:06:         Args: 
04:07:06:       Config: /home/bedhedd/config.xml
04:07:06:******************************** Build ********************************
04:07:06:      Version: 7.5.1
04:07:06:         Date: May 12 2018
04:07:06:         Time: 22:51:07
04:07:06:   Repository: Git
04:07:06:     Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
04:07:06:       Branch: master
04:07:06:     Compiler: GNU 4.4.7 20120313 (Red Hat 4.4.7-18)
04:07:06:      Options: -std=gnu++98 -O3 -funroll-loops
04:07:06:     Platform: linux2 4.14.0-3-amd64
04:07:06:         Bits: 64
04:07:06:         Mode: Release
04:07:06:******************************* System ********************************
04:07:06:          CPU: AMD Ryzen 7 2700X Eight-Core Processor
04:07:06:       CPU ID: AuthenticAMD Family 23 Model 8 Stepping 2
04:07:06:         CPUs: 16
04:07:06:       Memory: 15.64GiB
04:07:06:  Free Memory: 9.84GiB
04:07:06:      Threads: POSIX_THREADS
04:07:06:   OS Version: 5.6
04:07:06:  Has Battery: false
04:07:06:   On Battery: false
04:07:06:   UTC Offset: -5
04:07:06:          PID: 5452
04:07:06:          CWD: /home/bedhedd
04:07:06:           OS: Linux 5.6.13-200.fc31.x86_64 x86_64
04:07:06:      OS Arch: AMD64
04:07:06:         GPUs: 1
04:07:06:        GPU 0: Bus:38 Slot:0 Func:0 NVIDIA:7 TU106 [GeForce RTX 2070] M 6497
04:07:06:CUDA Device 0: Platform:0 Device:0 Bus:38 Slot:0 Compute:7.5 Driver:10.2
04:07:06:       OpenCL: Not detected: clGetDeviceIDs() returned -1
04:07:06:***********************************************************************
04:07:06:<config>
04:07:06:  <!-- Slot Control -->
04:07:06:  <power v='FULL'/>
04:07:06:
04:07:06:  <!-- User Information -->
04:07:06:  <passkey v='********************************'/>
04:07:06:  <team v='232084'/>
04:07:06:  <user v='BedHedd'/>
04:07:06:
04:07:06:  <!-- Folding Slots -->
04:07:06:  <slot id='0' type='CPU'/>
04:07:06:  <slot id='1' type='GPU'/>
04:07:06:</config>
04:07:06:Trying to access database...
04:07:06:Successfully acquired database lock
04:07:06:Enabled folding slot 00: READY cpu:15
04:07:06:Enabled folding slot 01: READY gpu:0:TU106 [GeForce RTX 2070] M 6497
04:07:06:ERROR:Exception: Could not bind socket to 0.0.0.0:7396: Address already in use
04:07:06:ERROR:Exception: Could not bind socket to 0.0.0.0:36330: Address already in use
04:07:06:WU00:FS00:Starting
04:07:06:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /home/bedhedd/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 5452 -checkpoint 15 -np 15
04:07:06:WU00:FS00:Started FahCore on PID 5478
04:07:06:WU00:FS00:Core PID:5482
04:07:06:WU00:FS00:FahCore 0xa7 started
04:07:06:WU01:FS01:Connecting to 65.254.110.245:8080
04:07:06:WARNING:WU01:FS01:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
04:07:06:WU01:FS01:Connecting to 18.218.241.186:80
04:07:06:WU01:FS01:Assigned to work server 3.133.76.19
04:07:06:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:TU106 [GeForce RTX 2070] M 6497 from 3.133.76.19
04:07:06:WU01:FS01:Connecting to 3.133.76.19:8080
04:07:06:WU00:FS00:0xa7:*********************** Log Started 2020-05-28T04:07:06Z ***********************
04:07:06:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
04:07:06:WU00:FS00:0xa7:       Type: 0xa7
04:07:06:WU00:FS00:0xa7:       Core: Gromacs
04:07:06:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 5478 -checkpoint 15 -np
04:07:06:WU00:FS00:0xa7:             15
04:07:06:WU00:FS00:0xa7:************************************ CBang *************************************
04:07:06:WU00:FS00:0xa7:       Date: Nov 5 2019
04:07:06:WU00:FS00:0xa7:       Time: 06:06:57
04:07:06:WU00:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
04:07:06:WU00:FS00:0xa7:     Branch: master
04:07:06:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
04:07:06:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
04:07:06:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
04:07:06:WU00:FS00:0xa7:       Bits: 64
04:07:06:WU00:FS00:0xa7:       Mode: Release
04:07:06:WU00:FS00:0xa7:************************************ System ************************************
04:07:06:WU00:FS00:0xa7:        CPU: AMD Ryzen 7 2700X Eight-Core Processor
04:07:06:WU00:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 8 Stepping 2
04:07:06:WU00:FS00:0xa7:       CPUs: 16
04:07:06:WU00:FS00:0xa7:     Memory: 15.64GiB
04:07:06:WU00:FS00:0xa7:Free Memory: 9.82GiB
04:07:06:WU00:FS00:0xa7:    Threads: POSIX_THREADS
04:07:06:WU00:FS00:0xa7: OS Version: 5.6
04:07:06:WU00:FS00:0xa7:Has Battery: false
04:07:06:WU00:FS00:0xa7: On Battery: false
04:07:06:WU00:FS00:0xa7: UTC Offset: -5
04:07:06:WU00:FS00:0xa7:        PID: 5482
04:07:06:WU00:FS00:0xa7:        CWD: /home/bedhedd/work
04:07:06:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
04:07:06:WU00:FS00:0xa7:    Version: 0.0.18
04:07:06:WU00:FS00:0xa7:     Author: Joseph Coffland <[email protected]>
04:07:06:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
04:07:06:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
04:07:06:WU00:FS00:0xa7:       Date: Nov 5 2019
04:07:06:WU00:FS00:0xa7:       Time: 06:13:26
04:07:06:WU00:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
04:07:06:WU00:FS00:0xa7:     Branch: master
04:07:06:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
04:07:06:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
04:07:06:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
04:07:06:WU00:FS00:0xa7:       Bits: 64
04:07:06:WU00:FS00:0xa7:       Mode: Release
04:07:06:WU00:FS00:0xa7:************************************ Build *************************************
04:07:06:WU00:FS00:0xa7:       SIMD: avx_256
04:07:06:WU00:FS00:0xa7:********************************************************************************
04:07:06:WU00:FS00:0xa7:Project: 16806 (Run 15, Clone 500, Gen 8)
04:07:06:WU00:FS00:0xa7:Unit: 0x0000000a82ed0b915eb41a34da56d2e0
04:07:06:WU00:FS00:0xa7:Digital signatures verified
04:07:06:WU00:FS00:0xa7:Calling: mdrun -s frame8.tpr -o frame8.trr -cpi state.cpt -cpt 15 -nt 15
04:07:06:WU00:FS00:0xa7:Steps: first=4000000 total=500000
04:07:08:WU00:FS00:0xa7:Completed 315367 out of 500000 steps (63%)

I dunno what this bind socket means
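On the bind socket lines: "Address already in use" is the operating system's standard error for a port that some other process is already listening on. Ports 7396 and 36330 are the two the client tried to open in the log above, so a likely cause (an assumption, not confirmed in this thread) is a second FAHClient already running, for example one started as a service. A small sketch for checking, where fah_port_listeners is a made-up helper name:

```shell
# Filter a socket listing (e.g. from `ss -ltnp`) read on stdin down to
# listeners on the two ports named in the log: 7396 and 36330.
# Exit status is non-zero when nothing matches, as with plain grep.
fah_port_listeners() {
    grep -E ':(7396|36330)[[:space:]]'
}
```

Run it as sudo ss -ltnp | fah_port_listeners. If a process shows up, that is what is holding the ports.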

I wonder if it is OpenCL again, @zlynx, because it keeps spitting out OpenCL device issues.

04:08:20:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:14440 run:0 clone:741 gen:35 core:0x22 unit:0x0000003503854c135ea0a30138b78104
04:08:20:WU01:FS01:Starting
04:08:20:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
04:08:20:WU01:FS01:Starting
04:08:20:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
04:09:20:WU01:FS01:Starting
04:09:20:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
04:10:57:WU01:FS01:Starting
04:10:57:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
04:11:46:WU00:FS00:0xa7:Completed 320000 out of 500000 steps (64%)
04:13:34:WU01:FS01:Starting
04:13:34:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
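Since the same message repeats with a new timestamp every retry, it can help to collapse the log to its distinct OpenCL lines. A rough sketch, assuming the HH:MM:SS: timestamp prefix seen in the output above; opencl_log_summary is a hypothetical helper name.

```shell
# Print each distinct OpenCL-related message from FAHClient log text on
# stdin, with the leading HH:MM:SS: timestamp stripped and a repeat
# count in front of each message.
opencl_log_summary() {
    grep -i 'opencl' \
        | sed -E 's/^[0-9]{2}:[0-9]{2}:[0-9]{2}://' \
        | sort \
        | uniq -c
}
```

Feed it a saved copy of the terminal output, for example opencl_log_summary < fah-output.txt (the file name is just a placeholder).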

https://wiki.archlinux.org/index.php/GPGPU#NVIDIA
Looking over the OpenCL page, there are two kinds of packages on Arch: the OpenCL runtime and the OpenCL ICD loader (libOpenCL.so).

In my config it looks like I have the ICD loader, when I really need the Fedora equivalent of opencl-nvidia from the Arch wiki.

I think I need help figuring out which OpenCL package I need. @zlynx, what OpenCL package did you use for your Fedora configuration?

I’m not using any Nvidia / Linux combinations. All Linux machines are running AMD Vegas.

I had to use the AMD Pro drivers and the ROCm OpenCL from those.

The ICD loader didn’t cause a problem; the only problem was having more than one provider installed.

So with my OpenCL configuration, how many providers do I have?

FAH keeps saying my OpenCL is not detected. What do I search for to find the right packages? I am hitting a wall finding people with setups similar to mine.

It looked like two.

Try ls -l /etc/OpenCL/vendors/

That’s the configuration directory the OpenCL ICD loader reads to find OpenCL providers. Then maybe you can do rpm -qf /etc/OpenCL/vendors/mesa.icd or rpm -qf /etc/OpenCL/vendors/pocl.icd, for example, to find what RPM installed what provider.
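To see every registered provider at once, something like the following could work. It is a sketch only: list_icd_providers is a made-up name, and it assumes each .icd file holds the provider's library name on its first line.

```shell
# List each OpenCL ICD registration file in a vendors directory together
# with the first line stored inside it (normally the library name).
# Usage: list_icd_providers [dir]   (default: /etc/OpenCL/vendors)
list_icd_providers() {
    dir="${1:-/etc/OpenCL/vendors}"
    for icd in "$dir"/*.icd; do
        [ -e "$icd" ] || continue    # no .icd files: glob stayed literal
        printf '%s -> %s\n' "$(basename "$icd")" "$(head -n 1 "$icd")"
    done
}
```

Each file it lists can then be passed to rpm -qf, as suggested above, to find the package that owns it.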

Aha! Because it was an acronym, I missed that I also had to install “ocl-icd” on Manjaro.

It’s on Github:

Good luck

ls -l /etc/OpenCL/vendors/

total 8
-rw-r--r--. 1 root root 19 Dec 18 15:58 mesa.icd
-rw-r--r--. 1 root root 22 Mar  8 16:09 nvidia.icd

rpm -qf /etc/OpenCL/vendors/mesa.icd

mesa-libOpenCL-19.2.8-1.fc31.x86_64

rpm -qf /etc/OpenCL/vendors/nvidia.icd

xorg-x11-drv-nvidia-cuda-440.82-1.fc31.x86_64

Should I remove the Mesa OpenCL package?

Give it a try at least.

Figured it out, going to mark this post as the solution!
The issue was the conflicting OpenCL libraries I had installed. The toughest part of setting this up was figuring out what was broken. If you have similar issues, viewing the output of FAHClient run from the terminal will help. These are the kinds of OpenCL errors I saw:

OpenCL: Not detected: clGetDeviceIDs() returned -1

I think this error, combined with the others, helped people determine that my OpenCL setup was the issue.

Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually

Looking online for this error gave either no results or results for different issues.

Once we figured out that OpenCL was causing the issues, it was a matter of figuring out which specific packages were responsible.

Turns out it was an issue with mesa-libOpenCL, which was installed on Dec 18, when I installed the Nvidia drivers from someone else’s packages instead of Nvidia’s official packages.

Running sudo dnf remove mesa-libOpenCL in the terminal uninstalled the Mesa OpenCL library. Once it was gone, FAH automatically started folding on my Nvidia RTX 2070.

If you have issues installing Nvidia graphics drivers on Fedora, try the installation guide from Fedora Magazine.

If you have issues installing CUDA libraries on Fedora, try the installation guides from the wikis:
https://fedoraproject.org/wiki/Cuda
https://rpmfusion.org/Howto/CUDA
I used the wiki guide when I installed the OpenCL packages.

Massive thanks to @zlynx for helping me figure out how to interpret the output of clinfo and which OpenCL library was causing the errors.
Hopefully someone will find this post when searching for similar output.

This will be very relevant considering that modern distros are far enough ahead of the legacy codebase that new users wanting to contribute will fall victim to this gotcha. Thanks for your tenacity in the wake of this issue!

Bravo, BedHedd! Good on ya sticking with things and figuring it out.

I’m impressed! :)