The Watcher

	
# I am the Watcher. I am your guide through this vast new twtiverse.
# 
# Usage:
#     https://watcher.sour.is/api/plain/users              View list of users and latest twt date.
#     https://watcher.sour.is/api/plain/twt                View all twts.
#     https://watcher.sour.is/api/plain/mentions?uri=:uri  View all mentions for uri.
#     https://watcher.sour.is/api/plain/conv/:hash         View all twts for a conversation subject.
# 
# Options:
#     uri     Filter to show a specific user's twts.
#     offset  Start index for query.
#     limit   Count of items to return (going back in time).
# 
# twt range = 1 1
# self = https://watcher.sour.is/conv/3zwgh7q
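
The usage notes above can be sketched as a small URL builder. This is a minimal illustration, not the Watcher's own client: the helper names `build_twt_url` and `build_conv_url` are hypothetical, and it assumes that the `uri`, `offset`, and `limit` options are passed as query parameters to the `twt` endpoint, which the header implies but does not state outright. No network request is made.

```python
from urllib.parse import urlencode

# Base path taken from the usage header above.
BASE = "https://watcher.sour.is/api/plain"


def build_twt_url(uri=None, offset=None, limit=None):
    """Build a URL for the 'view all twts' endpoint (hypothetical helper).

    Assumption: the options listed in the header are plain query parameters.
    """
    params = {}
    if uri is not None:
        params["uri"] = uri        # filter to one user's feed
    if offset is not None:
        params["offset"] = offset  # start index for the query
    if limit is not None:
        params["limit"] = limit    # number of items to return
    query = urlencode(params)      # percent-encodes values such as feed URLs
    return f"{BASE}/twt" + (f"?{query}" if query else "")


def build_conv_url(conv_hash):
    """Build a URL for a conversation subject hash (hypothetical helper)."""
    return f"{BASE}/conv/{conv_hash}"


if __name__ == "__main__":
    print(build_twt_url(offset=0, limit=20))
    print(build_conv_url("3zwgh7q"))
```

Building the query with `urlencode` keeps a feed address passed as `uri` correctly escaped, which matters because that value is itself a URL.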

yue-fang-readfog

feeds.twtxt.net

23 Jul 24 03:21 UTC

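Red Teaming LLMs: A Complete Step-by-Step Guide
Author: Kritin Vongthongsri. Compiled by: ronghuaiyang. Introduction: LLM red teaming is a method of testing and evaluating LLMs with deliberately adversarial prompts, intended to help uncover any potentially undesirable or harmful model vulnerabilities. Just two months ago, Gemini tried too hard to be politically correct in its generated images and rendered every face as a person of color. Although this may have been amusing to some (if not many), it is clear that as large language models (LLMs) can ⌘ Read more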