wordfreq is not just concerned with formal printed words. It collected more conversational language usage from two sources in particular: Twitter and Reddit.
Now Twitter is gone anyway, and its public APIs have shut down.
Reddit has also stopped providing public data archives, and now sells its archives at a price that only OpenAI will pay.
There’s still the Fediverse.
I mean, that doesn’t solve the LLM pollution problem, but…
The ~/.ssh/known_hosts file only contains public keys. I mean, maybe someone doesn’t want to hand out the list of hosts that they talk to, but exposing it doesn’t expose the private keys, which are what you really need to keep secret. Those are in ~/.ssh/id_rsa or the like, depending upon key type.
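If you want to see for yourself what a known_hosts entry holds, here is a rough Python sketch that splits one line into its fields. It assumes the usual layout of an optional marker, a comma-separated host list, a key type, and a base64 public key blob. The function names and the sample line are made up for illustration, and the sample key is just zero bytes, not a real host key.

```python
import base64
import struct


def parse_known_hosts_line(line):
    """Split one known_hosts entry into its fields, or return None to skip it."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    fields = line.split()
    marker = None
    if fields[0].startswith("@"):
        # Optional marker such as @cert-authority or @revoked
        marker = fields.pop(0)
    if len(fields) < 3:
        return None  # not a complete entry
    hosts, key_type, b64_key = fields[0], fields[1], fields[2]
    return {
        "marker": marker,
        "hosts": hosts.split(","),
        "key_type": key_type,
        "key_blob": base64.b64decode(b64_key),
    }


def blob_key_type(blob):
    """Read the algorithm name stored at the start of a public key blob."""
    name_len = int.from_bytes(blob[:4], "big")
    return blob[4:4 + name_len].decode("ascii")


# Hypothetical entry built from dummy bytes (32 zero bytes stand in for a key).
_dummy_blob = (
    struct.pack(">I", 11) + b"ssh-ed25519" + struct.pack(">I", 32) + bytes(32)
)
SAMPLE_LINE = "example.com,192.0.2.10 ssh-ed25519 " + base64.b64encode(
    _dummy_blob
).decode("ascii")

entry = parse_known_hosts_line(SAMPLE_LINE)
print(entry["hosts"], entry["key_type"], blob_key_type(entry["key_blob"]))
```

Every field in there is either a host name or public key material, so the worst a leaked file does is reveal which machines you connect to.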