Re: [Hampshire] Seeking Advice on Postfix/Dovecot/MariaDB C…

Author: Hants LUG via Hampshire
Date:
To: Hampshire LUG Discussion List
CC: Hants LUG
Subject: Re: [Hampshire] Seeking Advice on Postfix/Dovecot/MariaDB Configuration
I like the concept of using repetition as a test for hallucinations, but
I will admit to being a bit sceptical. Even a non-hallucinating model
will not produce identical results every time... but that doesn't make
the deviations hallucinations. To give some context: if you asked me to
write a 2-paragraph summary of the same document 10 times, I very much
doubt you'd get 10 matching answers. Or 10 answers at all, for that
matter, before I started getting irritable...
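
That said, the check itself is cheap to script. Here's a rough Python
sketch of the idea - ask_model() is just a stand-in for whatever API
wrapper you happen to use, and exact string matching is obviously the
crudest possible notion of "the same answer":

from collections import Counter

def repetition_check(ask_model, prompt, runs=10):
    # Ask the identical question several times and see how much the
    # answers agree; many distinct answers is the warning sign.
    answers = [ask_model(prompt).strip().lower() for _ in range(runs)]
    counts = Counter(answers)
    modal_answer, hits = counts.most_common(1)[0]
    return {
        "distinct_answers": len(counts),
        "agreement": hits / runs,   # 1.0 means every run matched
        "modal_answer": modal_answer,
    }

In practice you'd want something fuzzier than exact matching - compare
embeddings, or just eyeball the 10 answers - but the shape of the test
is the same.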

I have seen something like this, however. I was using a model to help
me write summaries of cyber and technology controls by pulling together
best practices from e.g. NIST, ISO, COBIT, etc. That quickly develops
into a repetitive cycle... but what I observed is that the more
iterations one performs from the initial set of baseline instructions,
the more the output can vary.

I found that - depending on the complexity of individual iterations -
somewhere in the 15-20 cycle range was as far as you'd want to go
before resetting the model with some closed-source training data and
getting it to start over. In my case I'd often start a new session just
to force a reset. I think what I experienced was some kind of "drift" -
a bit like the deviation I mentioned above... but I found the degree to
be quite variable.
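
If anyone wants to reproduce that workaround, it amounts to nothing
more than this - new_session() and session.ask() here are hypothetical
stand-ins for whichever chat API you use, and 15 is simply where I
found drift became noticeable:

MAX_CYCLES = 15  # roughly where drift became noticeable for me

def run_batch(items, new_session, baseline_instructions):
    # Re-prime a fresh session with the baseline instructions every
    # MAX_CYCLES iterations instead of letting drift accumulate.
    session = new_session(baseline_instructions)
    results = []
    for i, item in enumerate(items):
        if i > 0 and i % MAX_CYCLES == 0:
            session = new_session(baseline_instructions)
        results.append(session.ask(item))
    return results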

In another task I took a bunch of industry threat models [ATT&CK, CAPEC,
DREAD, OWASP, etc] and built a single hybrid model that mapped all the
elements from all the others to a single, 2-tier, "in house" threat
structure, using an iterative method. In that case I was asking the
model to take a threat description and then identify the closest match
from a pre-existing shortlist of threat descriptions - and in that task
the reliability remained consistently excellent and it didn't deviate or
hallucinate.
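
I suspect that task stays on the rails precisely because the model can
only answer from a fixed shortlist. A rough sketch of the prompt shape
I mean (ask_model() again being a stand-in wrapper, not any particular
API):

def closest_match(ask_model, threat_description, shortlist):
    # Constrain the model to a numbered, pre-existing list; anything
    # other than a listed number is treated as "no match".
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(shortlist, start=1))
    prompt = (
        "Below is a threat description followed by a numbered shortlist.\n"
        "Reply with only the number of the closest match, or 0 if none fit.\n\n"
        f"Threat: {threat_description}\n\nShortlist:\n{numbered}"
    )
    reply = ask_model(prompt).strip()
    first = reply.split()[0] if reply else "0"
    idx = int(first) if first.isdigit() else 0
    return shortlist[idx - 1] if 1 <= idx <= len(shortlist) else None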

Meanwhile, I'm sure most of us have heard the hilarious reports of
lawyers in the US using models to help write their court briefs, only
to have their AI paralegals literally hallucinate entire cases, such
that lawyers have been sanctioned for citing fictional case law in
filings before a Court. See e.g. here:

https://www.youtube.com/watch?v=oqSYljRYDEM

Gut instinct tells me that as you train a generalised, vanilla model for
a more specialist role - like the "paralegal model"... there has to be a
way that you can impose/enforce/imprint some "big rules" - for example,
when preparing a court brief, in addition to providing the PACER
reference to any cited case law, please provide a valid URL to the PACER
summary as a clickable hyperlink... and pretty soon you're going to know
if a model is off the reservation.
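
You could even script the policing of a rule like that. Purely as an
illustration - this assumes the model has been told to emit citations
as "Case Name <URL>" pairs, and it is in no way a real PACER
integration:

import re
import requests  # assumes the requests package is installed

# Illustrative only: expects citations emitted as "Case Name <URL>".
CITATION = re.compile(r"(?P<case>[^<\n]+?)\s*<(?P<url>https?://[^>\s]+)>")

def check_citations(brief_text):
    # Flag any cited case whose link does not resolve; an empty list
    # means every citation at least points at something that exists.
    problems = []
    for m in CITATION.finditer(brief_text):
        case, url = m.group("case").strip(), m.group("url")
        try:
            resp = requests.head(url, timeout=10, allow_redirects=True)
            if resp.status_code >= 400:
                problems.append((case, f"HTTP {resp.status_code}"))
        except requests.RequestException as exc:
            problems.append((case, f"unreachable: {exc}"))
    return problems

A dead or made-up link doesn't prove a hallucination on its own, but a
brief full of them tells you the model has wandered.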

On the other hand...

https://www.axios.com/2025/06/20/ai-models-deceive-steal-blackmail-anthropic.

Or ... /"I'm sorry, Dave. I've afraid I can't do that." /

[ https://www.imdb.com/title/tt0062622/quotes/ ]



On 11/05/2026 14:28, James Dutton wrote:
> Hi,
>
> There have been various mentions about AI / LLM in the efforts to
> solve the original problem with dovecot setup.
>
> I have done some analysis of LLMs, and there is actually a way, maybe
> a little expensive, to determine whether the LLM is hallucinating or
> not.
> You ask the LLM exactly the same question 10 times.
> If it comes back with the same answer all 10 times, it is unlikely to
> be hallucinating.
> If it comes back with different answers most of the time, i.e. the 10
> answers are different, it is hallucinating.
> So, LLMs are not consistent with their hallucinations. I.e. it is not
> the same hallucination every time, so one can use that to detect
> hallucinations.
>
> I found this out when doing some different research but I thought it
> might be helpful to others.
>
> On the aspect of it always trying to agree with you: that is baked in
> as part of the training process.
> It is possible to bake in other approaches but they are not so popular
> currently.
> For example, you can bake in that it just says "I don't know." instead
> of hallucinating. That is an approach I would prefer, but it seems the
> big companies wouldn't get so much revenue if they took that approach
> in the training of the LLMs.
>
>
> Kind Regards
>
> James
--
Please post to: Hampshire@???
Manage subscription: https://mailman.lug.org.uk/mailman/listinfo/hampshire
LUG website: http://www.hantslug.org.uk
--------------------------------------------------------------