I have been thinking a lot about digital sovereignty lately and how quickly the internet is turning into a weird blend of surreal slop and centralized control. It feels like we are losing the ability to tell what is real because of how easy it is for trillionaire tech companies to flood our feeds with whatever they want.
Specifically I am curious about what I call “kirkification” which is the way these tools make it trivial to warp a person’s digital identity into a caricature. It starts with a joke or a face swap but it ends with people losing control over how they are perceived online.
If we want to protect ourselves and our local communities from being manipulated by these black box models how do we actually do it?
I want to know if anyone here has tried moving away from the cloud toward sovereign compute. Is hosting our own communication and media solutions actually a viable way to starve these massive models of our data? Can a small town actually manage its own digital utility instead of just being a data farm for big tech?
Also how do we even explain this to normal people who are not extremely online? How can we help neighbors or the elderly recognize when they are being nudged by an algorithm or seeing a digital caricature?
It seems like we should be aiming for a world of a million millionaires rather than just a room full of trillionaires, but the technical hurdles like ISP throttling and protocol issues make that bridge hard to build.
Has anyone here successfully implemented local-first solutions that reduced their reliance on big tech AI? I am looking for ways to foster cognitive immunity and keep our data grounded in meatspace.
I'm doing exactly this atm. I'm running a homelab on a $200 USD Lenovo P330 Tiny with a Tesla P4 GPU, via Proxmox, CasaOS and various containers. I'm about 80% finished with what I want it to do.
Uses 40W at the wall (peak around 100W). IOW about the cost of a light bulb. Here’s what I run -
LXC 1: Media stack
Radarr, Sonarr, SABnzbd, Jellyfin. Bye bye Netflix, D+ etc
LXC 2: Gaming stack
Emulation and PC gaming I like. Lots of fun indie titles, older games (GameCube, Wii, PS2). Stream from homelab to any TV in the house via Sunshine / Moonlight. Bye bye GeForce Now.
LXC 3: AI stack
- Llama.cpp + llama-swap (AI back ends)
- Qdrant server (document server)
- Open WebUI (front end)
- Bespoke MoA system I designed (which I affectionately call my Mixture of Assholes, not agents), using a Python router and some clever tricks to make a self-hosted AI that doesn't scrape my shit and is fully auditable and non-hallucinatory…which would otherwise be impossible with typical cloud "black box" approaches. I don't want a black box; I want a glass box.
Bye bye ChatGPT.
LXC 4: Telecom stack
Vocechat (self-hosted family chat replacement for WhatsApp / Messenger), Lemmy node (TBC).
Bye bye WhatsApp and Reddit
LXC 5: Security stack
WireGuard (own VPN). NPM (reverse proxy). Fail2Ban. Pi-hole (block ads).
LXC 6: Document stack
Immich (Google Photos replacement), Joplin (Google Keep), Snapdrop (AirDrop), Filedrop (Dropbox), SearXNG (search engine).
Once I have everything tuned perfectly, I'm going to share everything on GitHub / Codeberg. I think the LLM stack alone is interesting enough to merit attention. Everyone makes big claims, but I've got the data and method to prove it. I welcome others poking at it.
Ultimately, people need to know how to do this, and I’m doing my best to document what I did so that someone could replicate and improve it. Make it easier for the next person. That’s the only way forward - together. Faster alone, further together and all that.
PS: It’s funny how far spite will take someone. I got into media servers after YouTube premium, Netflix etc jacked their prices up and baked in ads.
I got into low-end gaming when some PCMR midwit said "you can't play that on your p.o.s. rig". Wrong - I can and I did. It just needed know-how, not "throw money at the problem till it goes away".
I got into self hosting LLM when ChatGPT kept being…ChatGPT. Wasting my time and money with its confident, smooth lies. No, unacceptable.
The final straw was when Reddit locked my account and shadow banned me for using different IP addresses while travelling / staying at different AirBNBs during holiday “for my safety”.
I had all the pieces there…but that was the final “fine…I’ll do it myself” Thanos moment.
Wait, how did you set it up to avoid hallucinations? Is there a guide you followed that you can point me to?
I’ll try explaining using an analogy (though I can go nerd mode if that’s better? Let me know; I’m assuming an intelligent lay audience for this but if you want nerd-core, my body is ready lol).
PS: Sorry if scattered - am dictating using my phone (on holiday / laptop broke).
Hallucinations get minimized the same way a teacher might stop a student from confidently bullshitting on their book reports: you control context (what they're allowed to talk about) and when they're allowed to improvise, and you make them show their work when it matters, like in a class presentation.
Broadly speaking, that involves using RAG and GAG (over your own documents) as "ground truth", setting temperature low (so the LLM has no flights of fancy) and adding verifier passes / critic assessment by a second model.
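The retrieval side of that is just "embed the question, pull the nearest chunks, staple them into the prompt as ground truth". A minimal sketch of the pattern (the endpoints and collection name are illustrative; assumes a local Qdrant and a llama-server started with --embedding for an OpenAI-style embeddings API):

```python
import requests
from qdrant_client import QdrantClient

# Illustrative endpoints: local Qdrant, plus llama-server in --embedding mode.
qdrant = QdrantClient(url="http://localhost:6333")
EMBED_API = "http://localhost:8080/v1/embeddings"

def embed(text: str) -> list[float]:
    r = requests.post(EMBED_API, json={"model": "embedder", "input": text})
    r.raise_for_status()
    return r.json()["data"][0]["embedding"]

def retrieve_context(question: str, k: int = 4) -> str:
    """Pull the k nearest document chunks to use as ground truth in the prompt."""
    hits = qdrant.search(
        collection_name="my_docs",      # illustrative collection name
        query_vector=embed(question),
        limit=k,
    )
    return "\n\n".join(hit.payload["text"] for hit in hits)
```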
Additionally, a lot of hallucinations come from the model half-remembering something that isn’t in front of it and then “improvising”.
To minimise that, I coded a little Python tool that forces the LLM to store facts verbatim (triggered by using !!) into a JSON (text) file, so that when you ask it something it recalls them exactly, as a sort of rolling memory. The basis of that is from something I made earlier for OWUI:
https://openwebui.com/posts/total_recall_4a918b04
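To give a feel for the mechanic, here's a simplified sketch (not the actual Total Recall code; the file name and prompt wording are illustrative):

```python
import json
import os

MEMORY_FILE = "facts.json"  # illustrative path for the rolling plain-text fact store

def load_facts() -> list[str]:
    if os.path.exists(MEMORY_FILE):
        with open(MEMORY_FILE) as f:
            return json.load(f)
    return []

def maybe_store_fact(user_message: str) -> bool:
    """If the message starts with '!!', store the rest verbatim. No LLM involved,
    so nothing gets paraphrased or 'improved' on the way in."""
    if user_message.startswith("!!"):
        facts = load_facts()
        facts.append(user_message[2:].strip())  # stored exactly as typed
        with open(MEMORY_FILE, "w") as f:
            json.dump(facts, f, indent=2)
        return True
    return False

def build_prompt(question: str) -> str:
    """Prepend stored facts verbatim so the model quotes instead of half-remembering."""
    facts = "\n".join(f"- {fact}" for fact in load_facts()) or "(no stored facts)"
    return (
        "Known facts (quote verbatim, do not paraphrase):\n"
        f"{facts}\n\nQuestion: {question}"
    )
```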
So what I have in place is this -
I use / orchestrate a couple of different models, each one tuned for a specific behaviour. They work together to produce an answer.
My Python router then invokes the correct model for the task at hand based on simple rules (is the question over 300 words? Does it have images? Does it involve facts and figures, or is it brainstorming/venting/shooting the shit?). There's a sketch of the idea after the model list below.
The models I use are:
- Qwen 3-4B 2507 Instruct (usual main brain)
- Phi-4-mini (critic)
- Nanbeige 3B (2nd main brain when invoked / shit shooter)
- You-tu LLM (coding stuff)
- Qwen3-VL-4b (visual processing)
- Qwen3-8b (document summariser)
- Qwen3-1.7b (court jester that when invoked rewrites “main brain” output with contextually appropriate Futurama, Simpsons, Firefly etc quotes. With blackjack. And hookers!).
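As promised, a stripped-down sketch of the routing rules (the keyword heuristics, thresholds and model keys are illustrative; the real rules are a bit more involved):

```python
def looks_like_code(q: str) -> bool:
    # Crude illustrative check for coding questions.
    return any(tok in q for tok in ("def ", "import ", "traceback", "error:"))

def looks_factual(q: str) -> bool:
    # Crude illustrative check for facts-and-figures look-ups.
    ql = q.lower()
    return any(w in ql for w in ("when ", "how many", "what is", "according to"))

def pick_model(question: str, has_images: bool = False) -> str:
    """Map a request to one of the models listed above, using simple rules."""
    if has_images:
        return "qwen3-vl-4b"         # visual processing
    if looks_like_code(question):
        return "you-tu-llm"          # coding stuff
    if len(question.split()) > 300:
        return "qwen3-8b"            # long input -> document summariser
    if looks_factual(question):
        return "qwen3-4b-instruct"   # main brain, grounded look-ups
    return "nanbeige-3b"             # brainstorming / shooting the shit
```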
To give a workflow example - you ask a question.
The Python router decides where it needs to go. Let's suppose it's a technical look-up / thinking about something in my documents.
The "main brain" generates an answer using whatever grounded stuff you've given it access to (in the Qdrant database and JSON text file). If there's no stored info, it notes that explicitly and proceeds to the next step (I always want to know where it's pulling its info from, so I make it cite its references).
That draft gets handed to a separate "critic" whose entire job is to poke holes in it. (I use a very specific system prompt for both models so they stay on track.)
Then the main brain comes back for a final pass where it fixes the mistakes, reconciles the critique, and gives you the cleaned‑up answer.
It’s also allowed to say “I’m not sure; I need XYZ for extra context. Please provide”.
It’s basically: propose → attack → improve.
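In code, that loop looks roughly like this (a minimal sketch against the OpenAI-style API that llama.cpp / llama-swap expose; the port, model aliases and prompts are illustrative):

```python
import requests

API = "http://localhost:8080/v1/chat/completions"  # illustrative port

def chat(model: str, system: str, user: str, temperature: float = 0.1) -> str:
    # Low temperature keeps responses close to deterministic.
    r = requests.post(API, json={
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
    })
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

def answer(question: str, grounded_context: str) -> str:
    # 1. Propose: main brain drafts from the grounded context only, citing it.
    draft = chat(
        "main-brain",
        "Answer ONLY from the provided context and cite it. If the context "
        "doesn't cover the question, say so explicitly.",
        f"{grounded_context}\n\nQuestion: {question}",
    )
    # 2. Attack: the critic's entire job is to poke holes.
    critique = chat(
        "critic",
        "You are a hostile reviewer. List unsupported claims, missing "
        "citations and contradictions. Nothing else.",
        f"Question: {question}\n\nDraft: {draft}",
    )
    # 3. Improve: main brain reconciles the critique into the final pass.
    return chat(
        "main-brain",
        "Revise your draft to address every issue raised. You may answer "
        "'I'm not sure; I need more context.'",
        f"Question: {question}\n\nDraft: {draft}\n\nCritique: {critique}",
    )
```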
Additionally, I use a deterministic memory system (basically just a Python script that writes to a JSON / text file, which the LLM writes into exactly and then retrieves from exactly), without editorialising the facts of a conversation in progress.
Facts stored get recalled exactly, without LLM massage or rewrite.
Urgh, I hope that came out OK. I’ve never had to verbally rubber-duck (explain) it to my phone before :)
TL;DR
Hallucinations minimised by:
- Careful fact scraping and curation (using Qdrant database, markdown text summaries and a rolling JSON plain-text facts file)
- Python router that decides which LLM (or more accurately, SLM, given I only have 8GB VRAM) answers what, based on simple rules (eg: coding questions go to the coder, science questions go to science, etc)
- Keeping important facts outside of the LLM, which it needs to reference directly (RAG, GAG, JSON rolling summary)
- Setting model temperatures so that responses are as deterministic as possible (no flowery language or fancy reinterpretations; just the facts, ma'am)
- Letting the model say "I don't know, based on context. Here's my best guess. Give me XYZ if you want a better answer"
Basic flow:
ask question --> router calls model/s --> “main brain” polls stored info, thinks and writes draft --> get criticized by separate “critic” --> “main brain” gets critic output, responds to that, and produces final version.
That reduces “sounds right” answers that are actually wrong. All the seams are exposed for inspection.
That's awesome! I was going to add some sort of AI to my Proxmox homelab for research, but I figured the risk of hallucination was too high, and I thought that the only way to fix this was getting a bigger model. But this seems like a really good setup (if I can actually figure out how to implement it). And I won't need to upgrade my GPU!
Although I only have one AI-suitable GPU (a GTX 1660 6GB in my homelab, which is really only suitable for movie transcoding). I have a 3060 12GB that I use in my gaming PC, so I was thinking I could set up some kind of WOL (wake-on-LAN) system that boots the PC and brings up the AI software on it. Maybe my homelab hosts Open WebUI, and when I send a query it prompts my gaming PC to wake up and do the AI crunching.
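The magic-packet side of that is simple enough to send from the homelab; a minimal sketch (the MAC and broadcast address are placeholders, and WoL has to be enabled in the PC's BIOS/NIC first):

```python
import socket

def wake(mac: str, broadcast: str = "192.168.1.255") -> None:
    """Send a Wake-on-LAN magic packet: 6 bytes of 0xFF, then the MAC 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    packet = b"\xff" * 6 + mac_bytes * 16
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(packet, (broadcast, 9))  # UDP port 9 is the usual choice

wake("AA:BB:CC:DD:EE:FF")  # placeholder MAC for the gaming PC
```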
Well, technically, you don’t need any GPU for the system I’ve set up, because only 2-3 models are “hot” in memory (so about…10GB?) and the rest are cold / invoked as needed. My own GPU is only 8GB (and my prior one was 4GB!). I designed this with low end rigs in mind.
The minimum requirement is probably a CPU equal to or better than mine (i7-8700; not hard to match), 8-10GB RAM and maybe 20GB disk space. Bottom of the barrel would be 4GB, but you'll have to deal with SSD thrashing.
Anything above that is a bonus / tps multiplier.
FYI: CPU-only (my CPU at least) + 32GB system RAM, this entire thing runs at about 10-11 tps, which is an interactive enough speed / faster than reading speed. Any decent GPU should get you 3-10x that. I designed this for peasant-level hardware / to punch GPTs in the dick thru clever engineering, not sheer grunt. Fuck OpenAI. Fuck Nvidia. Fuck DDR6. Spite + ASD > "you can't do that" :). Yes I fucking can - watch me.
If you want my design philosophy, here is one of my (now shadowbanned) posts from r/lowendgaming. Seeing as you're a gamer, this might make sense to you! The MoA design I have is pure "level 8 spite, zip-tie Noctua fan to server-grade GPU and stick it in a 1L shoebox" YOLOing :).
It works, but it’s ugly, in a beautiful way.
Lowend gaming iceberg
Level 1
- Drop resolution to 720p
- Turn off AA, AF, Shadows etc
- Vsync OFF
- Windowed mode? OK.
- Pray for decent FPS
Level 2
- Use Nvidia/Intel/AMD control panel for custom tweaks
- Create custom low end resolutions (540p, 480p) so GPU can enumerate them to games
- Pray for decent FPS
Level 3
- Start tweaking .cfg and .ini files like you’re a caveman from the ancient year of 1998
- FPS capping? Sure.
- FOV size of a keyhole? Do it
- Texture filtering hacks / replacements? Rock on.
- Pray for decent FPS
Level 4
- Time to get serious. Crack open the box - repaste, clean, try to add more RAM from anything that even remotely fits. We can hack the timings to match, no problem!
- BIOS tweaking time! Let's see what breaks! Oh…everything.
- May as well undervolt and overclock, seeing we're in here already. Where's my paperclip…
- EDID hacks to make TV / monitor do dumb shit, like run at resolutions it shouldn't or Hz it pretends it can't? Why not.
- Pray for decent FPS
Level 5
- Software time again! Lossless scaling? Sure!
- ReShade post-processing to sharpen ultra-low mush? OK!
- Integer scaling? Scanlines? Why not
- Special K swap chain injection to force low res where no low res exists? Right on.
- DXVK? Yolo.
- Pray for decent FPS
Level 6
- Fuck it; time for real black magic
- Hack registry keys in Windows settings.
- Hex edit settings directly.
- Make Windows believe impossible things, like imaginary VRAM.
- Sacrifice boxed copy of Win98 to Linus Torvalds for absolution.
- Pray for decent FPS
Level 7
- Fine…I’ll do it myself then.
- Strip out the game assets and rewrite shaders
- No fancy lighting, kill the fill rate, post processing gone.
- At this point, you may as well just recode the fucking game from scratch.
- Pray for decent FPS
Level 8
- Purely driven by spite now.
- Franken-mod a $15 eGPU and run it via PCIe adaptor. Flash the vBIOS to do unnatural things.
- Everything is overheating. Drill holes in the case to improve airflow.
- Still too hot; drag in a desk fan. Point directly at case. Your PC now sounds like Darth Vader. Neat.
- Decompile the game's DLLs just to prove you can. Sneer at them.
- No longer praying for FPS; now praying for no magic blue smoke.
Level 9
Is there somewhere I can follow to see this if you end up open sourcing it? Sounds pretty interesting (personally I’m looking into a k3s-based setup but it’s always interesting to see how others do things)
Yep! I will mirror it here -
(It's empty rn / placeholder only.)
I had a bunch of prelim write-ups on r/LocalLLM and r/LocalLlama and r/homelab but they’re in the shadowrealm now due to reddit ban (fuck reddit)
I will also post it on @homelabs and @privacy here; I think my MoA design is worthwhile enough to maybe even merit a post on HackerNews…but I want to cross all T’s and dot all I’s before I get into that bar fight lol.
Take it one step further and host your repo somewhere other than github. Codeberg, perhaps?
Agreed. I have concerns with how Microsoft is handling GitHub, but organic discovery sure seems to favour GitHub / Reddit / YouTube.
Unsurprisingly, YouTube (Google) really doesn't trust accounts without phone numbers attached. (I set mine up before that was a requirement, using a @skiff address, so my ability to upload long-form videos is curtailed. I think it was shadow banned from day 1, irrespective of how much we watch YT.)
Probably the smart thing to do is to set up on Codeberg, maybe upload some "how to" videos to the Internet Archive, and have GitHub mirroring / forwarding.
That way whoever wants to find it can find it, somehow.
You can’t turn back the clock. Meaningful changes require a different social relationship between people and production.
Sure, rent a cloud server for $10/month, install Docker/Podman, then all the self-hosted services you need. Invite people onto your Jitsi Meet server, publish your videos on PeerTube, work via Nextcloud, etc. It's not easy the first time, but with each (well documented) step it becomes easier. Most important: back up your data.
The cloud? You mean someone else’s computer? 🤣
That’s actually my recommendation yes.
If somehow after a month you feel like you do want this "lifestyle", and are comfortable with setting up a VPN (if you need external access), THEN spend more and get yourself an SBC like an RPi and have it at home. If that's still not enough, then go up to a proper server you host, use a non-commercial ISP, etc … but IMHO don't start with a server at home if you are not familiar with all this; it's counter-intuitively harder and definitely more expensive.
Also FWIW you should still have an offsite backup regardless of how you do it.
I get where you’re coming from (and why) but am of the “rip the bandaid off clean in one go” school of thought.
A smaller start (like using that RPi to self-host a Jellyfin server for your home) puts you on the road to sovereignty straight away. A Pi 4 costs what…$60 (plus $30 for power supply and SD card)? Hell, use an old laptop.
Once you have one thing running, you're on the right path for the next and the next.
Doing it on the cloud I think is paradoxically harder and ultimately self defeating.
Don't get me wrong - if you need the cloud (say, you need to rent an H100 for a few hours to fine-tune your LLM), I'm all for it. But if sovereignty is the goal - and the gateway drug is an SBC and a few days / weeks of self-learning…you may as well start eating the elephant. IMHO and YMMV of course.
IMHO the key aspect isn’t where you host things but rather understanding how hosting itself works.
To me the most challenging aspects are how to:
- route traffic
- start a service
- back up your data (see the sketch below)
and also ideally
- have more than one service on a single machine
- restore your data
- restore your entire setup
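For the backup / restore pair, even a tiny wrapper covers the essential motion (the paths and host below are placeholders; rsync is just one option among many):

```python
import subprocess

SRC = "/srv/"                                   # data + configs; placeholder path
DEST = "backup@offsite.example:/backups/home/"  # placeholder offsite host

def backup() -> None:
    # Push everything offsite; --delete keeps the mirror exact.
    subprocess.run(["rsync", "-az", "--delete", SRC, DEST], check=True)

def restore() -> None:
    # Restoring "your entire setup" is the same transfer in reverse,
    # then restarting services on whichever machine you restored to.
    subprocess.run(["rsync", "-az", DEST, SRC], check=True)
```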
For that very first step I would say having a machine directly exposed to the Internet makes it easier. I don't know what ISP you use, but at least in Belgium, where I'm currently located, all ports are closed and IPs are dynamic. That means if you want to show your freshly started Apache web server to your mother-in-law, it will be challenging.
Meanwhile, if you do manage to get to the last step, namely restoring your entire setup, then restoring to a cloud service or an RPi is the same: you transfer your data, start your services and voilà, you are back, either LAN-only or on the entire Internet via a cloud provider.
So autonomy isn't so much about where things are physically hosted and by whom as about the actual capacity to host there or elsewhere.
Finally if you are using a commercial ISP, as opposed to having your own AS, are you really self-hosting?
For sure. I can dig where you’re coming from.
For me, the primary motivation was to replace cloud-based services for my personal / in-home use; it's only very recently that I am considering things like setting up out-of-LAN access for broader family.
(I do have a minimal off-site backup, to a Raspberry Pi stored at my parents' home, but obviously this is not enterprise-level infra.)
My personal quirk is power management. Yes, my rig only uses about 80-100 W…but I can't stop daydreaming about creating a failover system / bespoke UPS. Back-of-napkin calcs suggest that a single marine / car battery should be able to store enough juice to run it (and my router) for 24 hrs. Clunky as it is…the DIY nature of that really appeals to me.
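(Checking the napkin with a typical 100 Ah, 12 V deep-cycle battery: 100 Ah × 12 V = 1.2 kWh. At a ~40 W idle draw that's about 30 hours on paper, so 24 hrs is plausible; at a sustained 80-100 W it's more like 12-15 hours, and either figure roughly halves if you only discharge to ~50% to preserve the battery, before inverter losses.)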
We need to cut out or block big tech so people wake up.
Where is this AI slop people are talking about? I see it rarely. Do you see it in private chats forwarded by friends? Tell them to stop. It's also on X and Instagram, isn't it? Which I rarely browse. Sometimes I see it in the popular posts on Reddit, when I'm bored and look there with redlib. I search with Kagi or Whoogle and spend time in the fediverse, on HackerNews and on some of my favourite websites directly. I'm probably in a bubble with less AI slop, or maybe I also can't recognize it anymore. I also hate it, and we should have platform/server/community/… rules that forbid it.
You're right - you've successfully built an infrastructure that keeps you outside the slop machine. Kagi, Whoogle, fediverse, HackerNews - that's strategic refusal working as intended. The slop is concentrated on mainstream platforms where people haven't opted out. Instagram, TikTok, Facebook, YouTube - my friends still using those are drowning in AI-generated engagement bait, fake historical photos, GPT-written content. It's not subtle anymore for people still plugged in.

The kirkification angle is trickier though - it's not just what you see, it's how you're represented in spaces you're not in. Someone can generate deepfakes of you and you'd never know. Your digital body gets remixed without consent.

Your "maybe I can't recognize it anymore" point is real. The aesthetic tells are getting harder to spot. Five years ago it was obvious; now it takes active effort. Platform rules banning it would help, but verification at scale is nearly impossible. The only reliable defense is what you're doing - removing yourself from spaces where slop is profitable.

But that's also a technical barrier. I can set up Whoogle and fediverse accounts, but my friends on Instagram? That's where their community actually is. Opting out means losing access for most people. This is why municipal-scale infrastructure matters - if a town runs its own services, suddenly opting out isn't a technical hurdle, it's just where the community is.

You asking "where is the slop?" while others drown in it proves we're already living in parallel internets. The bifurcation is real.
You're absolutely right about the ageism - that was lazy framing on my part. The vulnerability is psychological and universal, not demographic. I've watched my technically-savvy friends fall for the same engagement manipulation as anyone else.

I respect the hell out of the radical position you're taking, and you're correct that it solves the problem for you personally. But for a lot of us here, the threat model isn't "can I individually opt out" - it's "how do I minimize harm while participating in systems I can't fully escape."

I'm 24, unemployed, job searching in tech. Most employers require LinkedIn, GitHub, email. My actual community - the people I game with, the friends who get me - are scattered across the continent. The meatspace-only option isn't realistic for someone in my position. Alberta doesn't exactly have the densest scene for the communities I'm part of.

So I'm attempting harm reduction: self-hosted Matrix instead of Discord. Jellyfin instead of Spotify. Soju IRC bouncer instead of Slack. My own Proxmox homelab instead of cloud services. It's not as pure as full disconnection, but it means I'm not feeding OpenAI's training datasets or Meta's engagement algorithms with every interaction.

Your point about treating followers as "avatars of the same algorithm" is exactly what I'm trying to escape by moving communication to federated and self-hosted protocols. When I'm on my own IRC server or Matrix instance, I'm talking to people, not to a feed curated by an engagement-maximizing black box.

The municipal infrastructure angle matters because it scales the individual solution. I worked at a municipal fiber network - we have the infrastructure to host community services. If a small municipality can run Mastodon, Matrix, and Nextcloud for residents, that's hundreds of people removed from surveillance capitalism. It's not everyone going full hermit, it's building parallel infrastructure that respects privacy by default.

Your cross-referencing and source verification advice is solid, but it requires people to first recognize they're in an algorithmic environment. That's why I think local-first infrastructure matters - it makes the choice explicit rather than defaulted.

I hear you on offline community being the real answer. But for those of us who can't or won't fully disconnect, reducing the attack surface and building privacy-respecting alternatives feels like the next best thing.
It's the way forward, and a somewhat comfortable one at that for people who would rather start a homelab than talk to random humans (including myself). The Internet is bound to be corrupt because of its inherent lawlessness and the political power it offers through mass propaganda. I would advocate for a ban on centralised social media, but that would only be a temporary solution, since bots and trolls creep everywhere and online communities might still have a hard time surviving.
But to fight against the shit flooding, it's hard to see how you'd do without the meatspace option, and evidently (as dumb as it may sound) you might want to get actively involved in associations or political activities around you. The high individualism (by that I mean the social atomisation) of the US is why it's been so susceptible to false information and far-right online propaganda. Real-life social fabric is what builds resilience against trolls and AI, and ultimately you'll only be able to fight the root cause when you're free of that dictator of yours.
So I am with you. It's hard to see at first, but you're not alone in thinking like you do, and finding groups around where you live to talk and think together is the best thing that can be recommended to anyone.
Teaching, like another comment says, would be one such option to consider.



