abliterate
verb
Etymology
Blend of ablate + obliterate. Coined by Redditor /u/FailSpy in early 2024, as the idea is to ablate refusal features to the point of obliteration.
- learned borrowing from obliterātus
Definitions
To uncensor a large language model by modifying specific model internals to remove refusal behaviours or unwanted traits, while aiming to preserve the model's other capabilities.
- Now that we have our datasets, we can load the model we want to abliterate. […] I evaluated the abliterated and source models from the previous section on the Open LLM Leaderboard and on Nous' benchmark suite.
sense glosses and etymology drawn from English Wiktionary · source · CC-BY-SA