As always, I use the term “AI” loosely. I’m referring to these scary LLMs coming for our jobs.
It’s important to state that I find LLMs to be helpful in very specific use cases, but overall, this is clearly a bubble, and the promises of advance have not appeared despite hundreds of billions of VC thrown at the industry.
So as not to go full-on polemic, we’ll skip the knock-on effects in terms of power-grid and water stresses.
No, what I want to talk about is the idea of software in its current form needing to be as competent as the user.
Simply put: How many of your coworkers have been right 100% of the time over the course of your career? If N>0, say “Hi” to Jesus for me.
I started working in high school, as most of us do, and a 60% success rate was considered fine. At the professional level, I’ve seen even lower with tenure, given how much things turn to internal politics past a certain level.
So what these companies are offering is not parity with senior staff (Ph.D.-level, my ass), but rather the new blood who hasn’t had that one fuckup that doesn’t leave their mind for weeks.
That crucible is important.
These tools are meant to replace inexperience with incompetence, and the beancounters at some clients are likely satisfied those words look similar enough to pass muster.
We are, after all, at this point, the “good enough” country. LLM marketing is on brand.
Normally I’d agree, and we used some of that in the original form (like maximum hours, checking for negative submissions, etc.) but requests don’t always follow simple logic and more complex logic just led to failures every time a user did something other than take a standard full day off.
Some employees work 7 hours, while others are 7.5, some have flex days and hours that change that, sometimes requests are only for part days, sometimes they may use multiple leave types to cover one off period.
I spent a few hours writing and testing the prompt against previous submissions to fine tune it.
So far it’s detected every submission error in the two weeks it’s been running, with only one false positive.
If it helps to accurately fill in the details correctly in the backend system, which are then properly validated or escalated for human review/intervention (and let the human requester choose the escalation path too, as opposed to blindly submitting), then it sounds great.
Guided experiences, leading to the desired outcome, with less need for confused humans to talk to confused humans.
I want the same for most financial approvals in my company. Finance folks speak a different language to most employees, but they have an oversized impact on defining business processes, slowing down innovation, frustrating employees, and often driving costs UP.