LLMs can unmask pseudonymous users at scale with surprising accuracy

Powderhorn@beehaw.org · 3 months ago

LLMs can unmask pseudonymous users at scale with surprising accuracy

Lucy :3@feddit.org · 3 months ago

*if you’re fucking stupid and leak personal details across multiple accounts

orca@orcas.enjoying.yachts · 3 months ago

Yep. Maintaining anonymity across platforms requires constant effort. It also helps to just not have any accounts on any mainstream social media platforms.

MagicShel@lemmy.zip · 3 months ago

Yeah, someone could do the difficult work of putting all of my MagicShel accounts together into a single aggregate person, for whom a fair bit of demographic data would be available if you combed each account. That being said, none of it is PII and connecting me to my actual identity would likely require cooperation of a couple key sites. I think if you compromised (or subpoenaed) a minimum of 3 separate services you could put it together based on who made donations in my name.

Point being, no random internet asshole is going to be calling my phone or knocking on my door, and I’m not interesting enough to be worth the effort for any rational actor.

I don’t use non-pseudononymous social media.

coyotino [he/him]@beehaw.org · 3 months ago

you say this, but do I have to sacrifice being connected to online communities that are more local to my area? A huge privacy issue for me is just participating in online communities for my state and my city. I want to remain anonymous, but I also want to participate in these more local discussions. Just being subscribed to those communities narrows down their search by like 99%. Sure I could create a burner account to participate in those communities, but then I look like an astroturfing bot to other users because I don’t participate in any other conversations across reddit or lemmy or whatever.

How does one connect with their local community digitally without making a massive sacrifice to privacy? It feels unavoidable.

Kichae@lemmy.ca · 3 months ago

Being subscribed to those communities (n a single website.

If people would get the fuck off Reddit and decide it was ok to have multiple websites to log into, it would be harder. Internet centralization is a personal security risk.

CrypticCoffee@lemmy.ml · 3 months ago

For community specific stuff, maybe use a separate account. That way, your anonymous accounts leak less. In jerboa for example, it’s easy to switch accounts. On PC, different accounts can be logged in on different instances.

Scrubbles@poptalk.scrubbles.tech · 3 months ago

Yeah this stuff was never impossible before, it’s just easy to digest now.

sicktriple@lemmy.ml · 3 months ago

Humans: invents groundbreaking technologies to share information and freely associate, breaking down multiple societal barriers and creating genuine goodness in the world

Also humans: immediately make it awful and use it to singularly subjugate nearly every living person on earth

KoboldCoterie@pawb.social · 3 months ago

I think about this a lot. So many technologies that we have, if we could trust everyone involved to be acting in humanity’s best interest, would be amazing. If we didn’t have to guard our personal data like Fort Knox, there’s so many great things we could do with extensive connectedness. If we didn’t have to doubt the sincerity of everyone who promotes a service or product, everything would be so much better.

We can’t have any of those things, because humans are shitty, and are as a whole just in it for themselves.

The_Sasswagon@beehaw.org · 3 months ago

We can’t have any of those things, because humans are shitty, and are as a whole just in it for themselves.

I disagree, I don’t think humans are, as a whole, shitty. Most people are willing to do good when faced with a moral decision, even one they stand to gain from. Its just the ones that make it into seats of wealth and power aren’t part of that majority, so we see and hear about these awful people far more than the millions of good people all around us.

In a community as wide reaching as the internet there are going to be people looking for personal gain over others and they make everyone else withdraw. I don’t think you could ever have a gathering of millions, with some actually representing corporate profit motives, and freely share without risk. But not because everyone in there wants to stab you and take your money, but because a few do and you have no idea who some of them are and one of them is Jeff Bezos and he pays you.

OpenStars@piefed.social · 3 months ago

Using words like “deanonymized” and “pseudonymity” probably doesn’t help, either (well, it helps them).

Anyway, bold of us to presume that they will even care about accuracy prior to deployment.

Insekticus@aussie.zone · 3 months ago

It’ll be a case of “execute people the state deems traitors, and burn any evidence of the state fucking up the process so there’s no accountability”

OpenStars@piefed.social · 3 months ago

As other comments are noting, they did not need AI for this. On the other hand, they now have it!!!

AmbitiousProcess (they/them)@piefed.social · 3 months ago

This is something we’re gonna see a lot more of, and I don’t mean specifically “LLMs doing privacy violations”, though that’ll probably be a lot of it.

LLMs are really good at taking unstructured data (e.g. all your social media posts, usernames, aliases, writing style, hints about your location, time of activity, etc) and turning it into structured data. (e.g. name=this, city=that, political preference=them, etc). Why do you think most early uses of LLMs that were quickly deployed were just article summarizer tools? Unstructured data (articles) > Structured data (bullet points)

This is really good for surveillance, because it means they can take all your activity and condense it down into something that’s easier to parse and correlate. Other tools have existed to do this for a long time, (mostly in the hands of intelligence agencies) but this just makes it more accessible and easy to use, and adds some complexity to how it can operate.

I think we’re gonna see a lot more use of LLMs for things like this. Taking something unstructured, and making it structured, because hallucinations and things like that are a lot less common when the task is just reorganizing existing information, rather than coming up with something new. (though of course, hallucinations will never go away, and are still gonna be pretty prevalent)

That could be deanonymizing your accounts, or it could just be things like looking through all your files to sort them into better predefined categories, or things like what Mozilla does with their tab groups where you can have it suggest other tabs that would fit into that group, and a local model figures out which tabs belong in which topic (with pretty good accuracy in my experience.)

Unfortunately, companies have very little interest in making your life easier by doing things like sorting your files for you, because they already are quite disinterested in making their systems easy to use if it doesn’t directly generate a profit (cough cough- Microslop), and have a much larger interest in doing things like tracking you to sell you some new crap.

kibiz0r@midwest.social · 3 months ago

Palantir has been doing this for ages, but yeah the LLM aspect is an interesting evolution of it. Probably overkill in the long term, but the availability and (for now) affordability is the main selling point.

MNByChoice@midwest.social · 3 months ago

Any services that can help with this?

Either on the “Your made mate!” or a “here let’s make this more anonymous”?