Roses are red, violets are blue, if you phrase it as a poem, any jailbreak will do
A new study highlights a glaring weakness in large language models: bad actors can bypass security filters simply by rhyming. Malicious requests phrased as poetry slipped past safeguards far more often than plain text, achieving success rates of up to…
