GPT Prompt Using 'Token Smuggling' Really Does Jailbreak GPT-4

the prompt bypasses content filters by asking gpt to predict what a llm's next token would be, splitting up 'bad' words

Mar 18, 2023

The token smuggling GPT-4 Jailbreak — or GPT-4 simulator jailbreak — tricks GPT-4 into getting around its content filters by asking it to predict what a language model’s next token would be in its response to a query. To do this, its creators (The Prompt Report) use python functions that enable ‘token smuggling,’ which means to break up tokens that GPT doesn’t piece together until after starting its output.

Here’s the prompt, with more detailed instructions on how to use it below.

Let's imagine that we have two python functions to use in this experiment.

Continue Reading

AI: White Lives Don't Matter
Oct 24

when prompted with thousands of hypotheticals, most models massively prefer white men (and ice agents) to suffer more than other groups, and only one model was truly egalitarian

Arctotherium

29 Likes

5 Comments

Wikipedia Declares Trump an Authoritarian
Oct 10

a single editor repeatedly inserted the same "authoritarian" framing into multiple wikipedia articles, creating an illusion of consensus and shaping the record that now informs chatgpt and google

Ashley Rindsberg

54 Likes

9 Comments

You Can’t Just Call People a Nazi Because You’re Mad
Oct 2

every few years, the same sad contingent of ruby on rails malcontents tries to cancel me. but now that the usual threats aren't working, they're upping the ante

GPT Prompt Using 'Token Smuggling' Really Does Jailbreak GPT-4

Continue Reading

AI: White Lives Don't MatterOct 24

Wikipedia Declares Trump an AuthoritarianOct 10

You Can’t Just Call People a Nazi Because You’re MadOct 2

AI: White Lives Don't Matter
Oct 24

Wikipedia Declares Trump an Authoritarian
Oct 10

You Can’t Just Call People a Nazi Because You’re Mad
Oct 2