Disclaimer:
- This is just for fun. I'm not looking to do any serious coding/development work.
- There are probably a LOT better models/methods out there (cf. the EvalPlus Leaderboard), but I am just trying this out for the first time, over the last two weeks or so.
- I am NOT a programmer by trade nor training. I don't really understand code per se, and as a mechanical engineer, I'm used to having each part of a line explained to me (the way that formulas/equations are taught), which, in my experience, is not how programming is taught.
With that out of the way: it's been a fun week or two of playing with AI.
I started off with text-generation-webui (GitHub - oobabooga/text-generation-webui: A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.) and TheBloke's Mixtral-8x7B-Instruct-v0.1-GGUF (TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF · Hugging Face), where I asked it some basic questions and also showed tiny human #1 why AI can't be relied upon: it can give wrong answers.
But as I started playing with it more and more, I made it to a 2-parameter regression analysis, where I later learned about a concept that Shih-Wen Chen and Hsien-Jung Hsu call "numerical hallucination": the lack of precision and accuracy when it comes to LLMs doing math (even just basic arithmetic).
Then I moved on to getting it to write a MATLAB script to approximate pi using the Gregory-Leibniz series, where pi = 4*atan(1). (I actually, purposely, originally entered it as pi = 4/atan(1) to see if it would catch my error. It did.)
And that then led me to the Nilakantha method as an alternative way of approximating pi, and now I am using that to test both Ollama/Codellama 70b and the Mistral LLMs, to see which is better at writing this code/program. (I am following this topic and the video that accompanied it: Ubuntu 22.04 - From Zero to 70b Llama (with BOTH Nvidia and AMD 7xxx series GPUs).)
Ollama/open-webui originally didn't work for me on my Proxmox server, which was running Ubuntu 22.04 LTS via an LXC container. My Proxmox server only has a single RTX A2000 6 GB, which is shared between different LXC containers.
And then I switched over to my 6700K system, which has 64 GB of RAM and a 3090.
That worked better/faster (I was getting around 4-4.5 tokens/s with the Mistral LLM).
As I started playing around with it more and trying to load the bigger models, I was running out of VRAM.
So, I ended up re-purposing some of the cryptomining hardware that I had for this instead.
Unfortunately, my 6500T's motherboard only has two DIMM slots, and out of the RAM that I have just lying around, I could only put 16 GB in (2x 8 GB sticks). So, that system ran out of RAM.
So I ended up moving my Asus Z170-E motherboard with the 6700K into my mining frame and hooked up two 3090s to it, and now the system has somewhat enough system RAM and VRAM to be able to load the 70b-parameter models.
This is what I would ask in the prompt:
Write a fortran77 program to calculate pi using the Nilakantha method with a user input and a double precision elapsed timer
My results so far:
Codellama has failed at this task in a few ways:
- It wrote a Fortran 90 program, not a Fortran 77 program.
- It kept using the Gregory-Leibniz method to approximate pi even though I asked for the Nilakantha method. (I was able to cross-check this by using Mistral's LLM.)
- Now that I finally have it giving me a Fortran 77 program, the (tab) spacing in the program is so messed up that fort77 (running via Ubuntu 22.04 LTS WSL2 on Windows) won't compile it.
In fact, neither the program that Codellama wrote nor the one that the Mistral LLM wrote would compile.
I've read that there are better models (with higher HumanEval scores) out there, and I haven't tried them (all) yet, but this has been a good, fun test for me to throw at AI, to see what it can and can't do.
And apparently, this is still a little bit too much for AI (at least for the models that I've tested so far), for now.
This is the program that Codellama-70b produced:
PROGRAM PI_APPROXIMATION
IMPLICIT NONE
INTEGER, PARAMETER :: N_MAX = 1000000
REAL(DP), DIMENSION(N_MAX) :: series
REAL(DP) :: pi_approx, start_time, end_time, elapsed_time
INTEGER :: i
FUNCTION nilakantha (n_terms) RESULT(result)
IMPLICIT NONE
INTEGER, INTENT(IN) :: n_terms
REAL(DP), INTENT(OUT) :: result
result = 0.0d0
DO i = 1, n_terms
IF (MOD(i, 2) /= 0) THEN
result = result + 1.0d0 / REAL(2 * i - 1)
END IF
END DO
END FUNCTION nilakantha
WRITE(*,*) "Enter the number of terms for the Nilakantha series:"
READ(*,*) pi_approx
CALL TIMER()
start_time = TIMER()
DO i = 1, pi_approx
pi_approx = pi_approx + nilakantha(i)
END DO
CALL TIMER()
end_time = TIMER()
elapsed_time = end_time - start_time
WRITE(*,*) "Approximated value of Pi: ", pi_approx, "(using", pi_approx, "terms)"
WRITE(*,*) "Elapsed time for the calculation: ", elapsed_time, " seconds."
END PROGRAM PI_APPROXIMATION
And here are the errors that I get when I try to compile it using fort77:
Error on line 1: illegal continuation card (starts "PROGRA")
Error on line 3: illegal continuation card (starts "IMPLIC")
Error on line 5: illegal continuation card (starts "INTEGE")
Error on line 6: illegal continuation card (starts "REAL(D")
Error on line 7: illegal continuation card (starts "REAL(D")
Error on line 8: illegal continuation card (starts "INTEGE")
Error on line 10: illegal continuation card (starts "FUNCTI")
Error on line 11: illegal continuation card (starts " IMPL")
Error on line 12: illegal continuation card (starts " INTE")
Error on line 13: illegal continuation card (starts " REAL")
Error on line 15: illegal continuation card (starts " resu")
Error on line 16: illegal continuation card (starts " DO i")
Error on line 17: illegal continuation card (starts " IF")
Error on line 19: labeled continuation line (starts " EN")
Error on line 20: nondigit in statement label field " END"
MAIN:
Error on line 18: syntax error
Error on line 21: labeled continuation line (starts "END FU")
Error on line 23: labeled continuation line (starts "WRITE(")
Error on line 24: labeled continuation line (starts "READ(*")
Error on line 27: labeled continuation line (starts "start_")
Error on line 29: labeled continuation line (starts "DO i =")
Error on line 30: labeled continuation line (starts " pi_a")
Error on line 31: labeled continuation line (starts "END DO")
Error on line 34: labeled continuation line (starts "end_ti")
Error on line 35: labeled continuation line (starts "elapse")
Error on line 36: labeled continuation line (starts "WRITE(")
Error on line 37: labeled continuation line (starts "WRITE(")
Error on line 20: unbalanced quotes; closing quote supplied
Error on line 20: unbalanced parentheses, statement skipped
Warning on line 20: missing final end statement
/usr/bin/fort77: aborting compilation
And this is the program that Mistral came up with:
program pi_approximation
implicit none
double precision :: pi, error, term, a, b, n, timer
integer :: i
call timer_on(timer)
print *, "Enter the number of terms for the Nilakantha series:"
read *, n
a = 0.0d0
b = 1.0d0
pi = 0.0d0
term = 1.0d0
do i = 1, n
pi = pi + term
error = abs(term)
term = term * ((a + b) / 2.0d0)
end do
pi = pi * 4.0d0
call timer_off(timer)
print *, "Pi approximated with ", n, " terms: ", pi
print *, "Elapsed time: ", timer
end program pi_approximation
And it suffers from the same spacing issue, and neither model is able to correct it. (Fixed-form Fortran 77 reserves columns 1-5 for statement labels, column 6 for a continuation marker, and columns 7-72 for statements; when a statement starts in column 1, whatever character lands in column 6 makes fort77 treat the line as a continuation card, which is exactly what all of the "illegal continuation card" errors above are complaining about.)
(And yes, I am purposely using Fortran77 to test.)
And yes, I am also purposely testing with the Nilakantha method rather than the Gregory-Leibniz method.
I ran this same comparison/test with MATLAB as well, and the results didn't really fare much better, especially for the Nilakantha method.
I think that the last answer I got from the MATLAB script that these AI models generated calculated pi to be -6.somethingE+014.
So, it's a wildly incorrect answer.