@notabot

notabot@lemm.ee · 5 months ago

Could you let me know what sort of models you’re using? Everything I’ve tried has basically been so bad it was quicker and more reliable to to the job myself. Most of the models can barely write boilerplate code accurately and securely, let alone anything even moderately complex.

I’ve tried to get them to analyse code too, and that’s hit and miss at best, even with small programs. I’d have no faith at all that they could handle anything larger; the answers they give would be confident and wrong, which is easy to spot with something small, but much harder to catch with a large, multi process system spread over a network. It’s hard enough for humans, who have actual context, understanding and domain knowledge, to do it well, and I’ve, personally, not seen any evidence that an LLM (which is what I’m assuming you’re referring to) could do anywhere near as well. I don’t doubt that they flag some issues, but without a comprehensive, human, review of the system architecture, implementation and code, you can’t be sure what they’ve missed, and if you’re going to do that anyway, you’ve done the job yourself!

Having said that, I’ve no doubt that things will improve, programming languages have well defined syntaxes and so they should be some of the easiest types of text for an LLM to parse and build a context from. If that can be combined with enough domain knowledge, a description of the deployment environment and a model that’s actually trained for and tuned for code analysis and security auditing, it might be possible to get similar results to humans.

notabot@lemm.ee · 5 months ago

I’m unlikely to do a full code audit, unless something about it doesn’t pass the ‘sniff test’. I will often go over the main code flows, the issue tracker, mailing lists and comments, positive or negative, from users on other forums.

I mean, if you’re not doing that, what are you doing, just installing it and using it??!? Where’s the fun in that? (I mean this at least semi seriously, you learn a lot about the software you’re running if you put in some effort to learn about it)

notabot@lemm.ee · 5 months ago

‘AI’ as we currently know it, is terrible at this sort of task. It’s not capable of understanding the flow of the code in any meaningful way, and tends to raise entirely spurious issues (see the problems the curl author has with being overwhealmed for example). It also wont spot actually malicious code that’s been included with any sort of care, nor would it find intentional behaviour that would be harmful or counterproductive in the particular scenario you want to use the program.

notabot@lemm.ee · 6 months ago

Clearly fake, it says there’s a tty number in the top left corner and there isn’t.

notabot@lemm.ee · 6 months ago

I’m sure you’ve already considered it, but from that description it sounds very much like make. That compares the input files’ timestamps to the output files’ timestamps, so it might be different to your plan though.

notabot@lemm.ee · 7 months ago

Before you can decide on how to do this, you’re going to have to make a few choices:

Authentication and Access

Theres two main ways to expose a git repo, HTTPS or SSH, and they both have pros and cons here:

HTTPS A standard sort of protocol to proxy, but you’ll need to make sure you set up authentication on the proxy properly so that only only thise who should have access can get it. The git client will need to store a username and password to talk to the server or you’ll have to enter them on every request. gitweb is a CGI that provides a basic, but useful, web interface.
SSH Simpler to set up, and authentication is a solved problem. Proxying it isn’t hard, just forward the port to any of the backend servers, which avoids decrypting on the proxy. You will want to use the same hostkey on all the servers though, or SSH will refuse to connect. Doesn’t require any special setup.

Replication

Git is a distributed version control system, so you could replicate it at that level, alternatively you could use a replicated file system, or a simple file based replication. Each has it’s own trade-offs.

Git replication Using git pull to replicate between repositories is probably going to be your most reliable option, as it’s the job git was built for, and doesn’t rely on messing with it’s underlying files directly. The one caveat is that, if you push to different servers in quick suscession you may cause a merge confict, which would break your replication. The cleanest way to deal with that is to have the load balancer send all requests to server1 if it’s up, and only switch to the next server if all the prior ones are down. That way writes will alk be going to the same place. Then set up replication in loop, with server2 pulling from server1, server3 pulling from server2, and so on up to server1 pulling from server5. With frequent pulls changes that are commited to server1 will quickly replicate to all the other servers. This would effectively be a shared nothing solution as none of the servers are sharing resources, which would make it easier to geigraphically separate them. The load balancer could be replaced by a CNAME record in DNS, with a daemon that updates it to point to the correct server.
Replicated filesystem Git stores its data in a fairly simple file structure, so placing that on a replicated filesystem such as GlusterFS or Ceph would mean multiple servers could use the same data. From experience, this sort of thing is great when it’s working, but can be fragile and break in unexpected ways. You don’t want to be up at 2am trying to fix a file replication issue if you can avoid it.
File replication. This is similar to the git replication option, in that you have to be very aware of the risk of conflicts. A similar strategy would probably work, but I’m not sure it brings you any advantages.

I think my prefered solution would be to have SSH access to the git servers and to set up pull based replication on a fairly fast schedule (where fast is relative to how frequently you push changes). You mention having a VPS as obe of the servers, so you might want to push changes to that rather than have be able to connect to your internal network.

A useful property of git is that, if the server is missing changesets you can just push them again. So if a server goes down before your last push gets replicated, you can just push again once the system has switched to the new server. Once the first server comes back online it’ll naturally get any changesets it’s missing and effectively ‘heal’.

notabot@lemm.ee · 7 months ago

I manage all my homelab infra stuff via ansible and run services via kubenetes. All the ansible playbooks are in git, so I can roll back if I screw something up, and I test it on a sacrificial VM first when I can. Running services in kubenetes means I can spin up new instances and test them before putting them live.

Working like that makes it all a lot more relaxing as I can be confident in my changes, and back them out if I still get it wrong.

notabot@lemm.ee · 9 months ago

The problem is that those issues have, and continue to, cause damage to the Linux project. Good maintainers have been hounded out, or simply given up, and bad blood exists where it absolutely shouldn’t. You’re right that much of it is political, although that usually stems from deep technical differences backed up by corporate encouragement. Political turmoil can be as damaging, if not moreso, than technical differences. At least technical differences can usually be resolved technically, politics is infinitely more nuanced.

From Marcan’s description, the way certain people treated him was absolutely unacceptable, although I’ve no doubt they’d describe things very differently. I hope the whole kernel team, maintainers and contributers, can find a way to work through these differences and work more harmoniously before more members end up burnt out, frustrated and bitter.

notabot@lemm.ee · 9 months ago

It just occured to me that if you want to use Ubuntu without snap, you could uninstall the snap package itself (I’m not on Ubuntu, so you might need to find it), then put a ‘hold’ on the package to prevent it being reinstalled. That should, in turn, prevent any package versions that use snap from being installed.

Initially uninstalling snap might require removing any packages that use it, but that’ll tell you what you need non-snap versions of.

notabot@lemm.ee · 9 months ago

I suspect that what’s happened is you installed the apt version, then at some point upgraded it and there was a version in the main repo that had a higher version number and installed the snap version. If two repositories both have a package with the same name, and no other rules in place, the higher version number wins.

If that is the case, you need to pin the firefox package to the mozilla repository. You can find more details here: https://wiki.debian.org/AptConfiguration

notabot@lemm.ee · 9 months ago

Nagios. It does depend on what you mean by monitor though. Nagios is good at telling you that “service A on host B” is down" but less useful for looking at things like performance trends. I particularly like being able to setup dependencies between services, so I get the alert for the root cause, and not all of the services that have gone down because of it.

notabot@lemm.ee · 9 months ago

You dismiss the data you recorded because it doesn’t seem to support your hypothysis tgat there is greater lag in wayland, but that’s not really the right approach, and I think it points to a different conclusion.

You recorded a lag of 5 or 6 frames at 90 frames per second in both Xorg and wayland, which suggests that the lag is the same to within 0.011 seconds, and I don’t think that you can say that’s a huge difference. However, what you didn’t test is the acceleration curve on mouse movement. If that curve is different under wayland it could easily feel infuriatingly laggy without actually showing any extra delay on the movement starting or ending.

I’m not sure how you’d accurately test that, a HID device just sending mouse move events wouldn’t do it as wouldn’t mimic you accelerating the mouse from stationary, so wouldn’t exercise the acceleration curve in wayland. You might need a physical device that moves your actual mouse a fixed dustance and then measure the distance the cursor moves on screen. Repeat for different movement speeds and you might have sone useful data.

notabot@lemm.ee · 10 months ago

Ha! Faking key presses, truly an elegant weapon for a more civilized age. If it works, it works.

notabot@lemm.ee · 10 months ago

The thing that got me to switch was being able to maintain my pane layout between connections. The various window and pane management niceties (naming, swapping, listing and the like) got me to stay. Now you can keep your screen, but you’d have to pry tmux from my cold, dead, tty.

notabot@lemm.ee · 10 months ago

It’s likely that you’re using a systemd based system and the admin hasn’t enabled linger for your user.

notabot@lemm.ee · 11 months ago

I agree with the sentiment regarding being woken up, but I used to look forward to being on call. I could go to bed happily, knowing I was earning a significant premium and I’d still get a good night’s sleep because the systems just didn’t go down. I had the advantage that most of the customers I supported had similar requirements, so I had their systems locked up pretty well. Minor problems (disk space. Why is it always disk space?) would self heal, catastrophic failures (hardware failures or the engineer who supposed to replace a component unplugging the wrong server) would fail over to the rest of the cluster. I never had much trouble with logging either, it was typically one of the first things set up, and I had most of the setup automated to avoid missing anything. I suppose the thing was I was supporting systems I’d built, and I’d built them to ensure I didn’t have to be woken up.

I do a lot more troubleshooting and rescue type work nowadays, and the number of times I run into systemd components just not doing what they should is frustrating to say the least. Being able to pull the logs by knowing the service name would be nice, but a) you could already do that because you set up different services to log to different places and b) you don’t always know the service name in question. Being able to just grep the log directory is a lifesaver. You can still do that, but only because distros set systemd up to log to file as well as it’s binary format. I loathe the way systemd ends up spreading it’s unit files over about a dozen different directories, with overrides increasing that even further. I just want to know what services I’ve got and what will start up, in exactly what order, on the next reboot, dammit! The last one is particularly tricky as, due to services being started in parallel, you can’t predict exactly what order things will actually start between targets. That shouldn’t matter, units should have all their dependencies properly listed, but it’s no fun tracking down a race condition that only happens once every x reboots when a particular network service takes a few hundred milliseconds longer to come up. Give me sequential boot any day. It might take a few tens of seconds longer, but it happens the same way each time, and I only need to look in one place to know what that is.

As to systemd’s dominance, once Redhat, where Mr Pottering worked, chose it, it became hard for other distros not to. Derivative distros obviously went with it, and if you look back through the various email discussions, it was far from a unanimous choice for distros like Debian to choose it. They did so eventually mostly, as far as I can see, because it would theoretically make packaging easier. Fortunately they still support sysvinit, so all is not lost for those of us who want a mainstream distro without systemd bloat.

Shifting stuff to kube is definitely goot for making things more robust, so long as you’ve got the underlying clustering working, and I quite like working with it too. Once you realise it’s basically just a database and message queue with a bunch of controllers for managing storage, networking, containers and the like, and the ability to extend that, you can do all sorts of fun things with it.

Anyway, I’ve gone on for long enough. If you’re a sysadmin and the number of trouble calls is going down, then you probably don’t hear this often enough: well done, you’re doing a great job.

notabot@lemm.ee · 11 months ago

Ok, fair point on the capital D, I must have read it like that years ago and it stuck. I shall have to make an effort to unlearn it.

As to the rest, systemd has been a constant thorn in my side ever since L. Pottering published “Rethinking PID 1” back in 2010 or so. I found, and still find, that most of the assertions and actions in that document either don’t really hold, or just aren’t really relevant. Basically it’s trying to solve a problem that really wasn’t an issue in the real world, and does so in such a massively overbearing way that everything actually becomes more laborious than it otherwise would be. From my perspective it’s an unnecessarily complex and poorly architected attempt to answer a need that was better served in different ways. That it’s become a near mono-culture is deeply concerning.

I’ve also run into all sorts of awkward edge cases and misfeatures over the years, from the automounter that occasionally didn’t to race conditions that only manifest at the worst moments, none of which would have occured had the basic tenet of “do one thing and do it well” been followed. The extreme verbosity of the configuration, and unnecessarily large number of places it can be spread just serve to make it even more unpleasant to deal with compared to the simplicity of init scripts, crontabs and the like.

The sad thing is, there’s undoubtedly some good ideas buried in it, but they could all have been implemented much more lightly and in a way that worked with the rest of the ecosystem rather than fighting it. Things like starting daemons in what is essentially a repeatable sandbox, or being able to isolate logging per service. They could, and had both been implemented already, but systemd has a real “not invented here” problem, so everything was built again, with all the attendant bugs, and design issues that inevitably brings.

Ultimately clients pay good money for me to look after their systems, systemd or not, so I probably shouldn’t grumble, but I miss the days when Linux was a clean and elegant system, without this multi-tentacled thing sitting on top of it.

notabot@lemm.ee · 11 months ago

SystemD is far too much of a poorly thought through mess to have anything like a sane GUI configuration, it doesn’t even have a sane textfile based configuration. We’re going to have to wait fir SystemD to crumble under it’s own weight and be replaced with multiple, simple, cleanly designed components before we have any hope of a sane config again. Sort of like we used to have before a certain someone/some company (depending on how conspiratorial you’re feeling) decided to come along and muck it all up.

/rant

Thank you for coming to my Ted ~~Talk~~ Rant. You may gather I dislike SystemD quite a lot.

notabot@lemm.ee · 11 months ago

It’s an old phrase, “lions lead by donkeys”, originally used to describe British soldiers being lead by incompetent generals, possibly during the Crimean war. The concept goes way further back though; apparently an Athenian general in the 4th century said (translated) “an army of deer commanded by a lion is more to be feared than an army of lions commanded by a deer”.

notabot@lemm.ee · 11 months ago

Vim is running as you, rather than root, so you wont be able to edit other files as root, and any rogue plugins wont be able to either, which is good.

Sudoedit has various guards around what it’ll let you edit, in particular, you can’t edit a file in a directory you already have write permission on as doing so allows the user to bypass restrictions in the sudoers setup (there’s more detail in their issue tracker. If the directory is already writable though, you don’t need sudoedit anyway.