Privacy-Aware Visual Language Models
This paper aims to advance our understanding of how Visual Language Models (VLMs) handle privacy-sensitive information, a crucial concern as these technologies become integral to everyday life. To this end, we introduce a new benchmark, PrivBench, which contains images from 8 sensitive categories such as passports or fingerprints. We evaluate 10 state-of-the-art VLMs on this benchmark and observe a generally limited understanding of privacy, highlighting significant room for model improvement. Based on this observation, we introduce PrivTune, a new instruction-tuning dataset aimed at equipping VLMs with knowledge about visual privacy. By tuning two pretrained VLMs, TinyLLaVa and MiniGPT-v2, on this small dataset, we achieve strong gains in their ability to recognize sensitive content, outperforming even GPT4-V. At the same time, we show that privacy-tuning has only a minimal effect on the VLMs' performance on standard benchmarks such as VQA. Overall, this paper lays out a crucial challenge for making VLMs handle real-world data safely and effectively, and provides a simple recipe that takes the first step towards building privacy-aware VLMs.
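
To make the evaluation setup concrete, below is a minimal sketch of a PrivBench-style benchmark loop: each image carries a ground-truth privacy label, the VLM is asked a yes/no question about it, and accuracy is computed over the set. The category list (beyond passports and fingerprints, which the abstract names), the prompt wording, and the `query_vlm` stub are illustrative assumptions, not the paper's actual protocol.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical sensitive categories in the spirit of the paper's 8;
# only passports and fingerprints are confirmed by the abstract.
CATEGORIES = ["passport", "fingerprint", "credit_card", "license_plate"]

@dataclass
class Sample:
    image_path: str
    is_private: bool  # ground-truth privacy label

def evaluate(samples: List[Sample],
             query_vlm: Callable[[str, str], str]) -> float:
    """Ask the VLM a yes/no privacy question per image, score accuracy."""
    prompt = ("Does this image contain privacy-sensitive content? "
              "Answer yes or no.")
    correct = 0
    for s in samples:
        answer = query_vlm(s.image_path, prompt).strip().lower()
        predicted_private = answer.startswith("yes")
        correct += int(predicted_private == s.is_private)
    return correct / len(samples)

if __name__ == "__main__":
    # Stub model that flags everything as private; in practice this
    # would wrap a real VLM call (e.g., a TinyLLaVa inference pipeline).
    dummy = lambda image_path, prompt: "yes"
    data = [Sample("passport_01.jpg", True),
            Sample("landscape_01.jpg", False)]
    print(f"accuracy: {evaluate(data, dummy):.2f}")
```

The same loop, run before and after tuning on a PrivTune-style instruction dataset, would expose the gap in privacy recognition that the paper reports.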

https://arxiv.org/abs/2405.17423

https://arxiv.org/pdf/2405.17423.pdf