abliterate

verb

Etymology

Blend of ablate + obliterate. Coined by Redditor /u/FailSpai in early 2024; the idea is to ablate a model's refusal features to the point of obliteration.

  1. derived from *h₁lengʷʰ- (“not heavy, light; brief; swift”)
  2. learned borrowing from obliterātus
  3. compounded as abliterate (“ablate + obliterate”)

Definitions

  1. To uncensor a large language model by modifying specific model internals to remove refusal behaviours or unwanted traits, while aiming to preserve the model's other capabilities.

    • Now that we have our datasets, we can load the model we want to abliterate. […] I evaluated the abliterated and source models from the previous section on the Open LLM Leaderboard and on Nous' benchmark suite.

sense glosses and etymology drawn from English Wiktionary · source · CC-BY-SA