Responsible Generative AI

Period of duration of course
‌‌
Course info
Number of course hours
20
Number of hours of lecturers of reference
20
CFU 3
‌‌

Modalità esame

Seminar

Note modalità di esame

The exam will consist of a project and its presentation with a discussion

Prerequisiti

No prerequisites

Programma

1) Module 1 (4 hours): Introductory Module

a. Introduction to NLP

b. Introduction to Generative AI


2) Module 2 (12 hours): Advanced Concepts on Generative AI

a. On the way towards LLMs, e.g. word2vec, seq-to-seq models

b. The Building Blocks of Foundation Models

  • Transformer Architecture
  • Self-Supervised Learning
  • Transfer Learning, e.g. pretraining, in-context learning (prompting), supervised fine-tuning (SFT), reinforcement learning with human feedback (RLHF)
  • Computing on Scale

c. Designing a Chatbot, i.e., different prompting strategies, and retrieval augmented generation (RAG)

d. Image Generation Models, e.g. Variational AutoEncoders, Generative Adversarial Networks, CLIP and Stable Diffusion

e. Responsible Generative AI

  • Risks of Generative AI, e.g. biases, hallucinations, privacy leaks, environment impact
  • Technical considerations on establishing responsible generative AI models


3) Module 3 (4 hours): Hands-on Sessions

a. Interactive session on LLMs

b. Red Teaming on LLMs

Obiettivi formativi

The rapid development and deployment of generative AI models and applications has the potential to revolutionise various domains which brings about the urgency to use these models in a responsible manner.

Generative AI refers to creating new content in different modalities of digital text, images, audio, code, and other artifacts based on already existing content. Text generator models such as GPT-4, and its chat version,

ChatGPT as well as text-to-image models such as DALL-E 3 and Stable Diffusion are popular generative AI models. Although these models have significant implications for a wide spectrum of domains, there are several

ethical and social considerations associated with generative AI models and applications. These concerns include the existence of bias, lack of interpretability, privacy, fake and misleading content such as hallucinations,

and lack of accountability. Thus, it is very crucial to discuss these risks with their corresponding potential safeguards (if any) in addition to the technical details of these powerful models.


Target Audience: This course is addressed to the students of PhD in AI (uniPI) and Computational methods (SNS). The students of other PhD programs in SNS are welcome to enroll in the course if they are interested.

Riferimenti bibliografici

  1. Paaß, G., & Giesselbach, S. (2023). Foundation models for natural language processing: Pre-trained language models integrating media (p. 436). Springer Nature.
  2. Vaswani, A., 2017. Attention is all you need. Advances in Neural Information Processing Systems.
  3. Sutskever, I., 2014. Sequence to Sequence Learning with Neural Networks. arXiv preprint arXiv:1409.3215.
  4. Radford, A., 2018. Improving language understanding by generative pre-training.
  5. Devlin, J., 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  6. Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., ... & Liang, P. (2021). On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258.
  7. Brown, Tom B. "Language models are few-shot learners." arXiv preprint arXiv:2005.14165 (2020).
  8. Wei, J., Bosma, M., Zhao, V.Y., Guu, K., Yu, A.W., Lester, B., Du, N., Dai, A.M. and Le, Q.V., 2021. Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652.
  9. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q.V. and Zhou, D., 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35, pp.24824-24837.
  10. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A. and Schulman, J., 2022. Training language models to follow instructions with human feedback. Advances in neural information processing systems, 35, pp.27730-27744.
  11. Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., Dai, Y., Sun, J., Wang, M. and Wang, H., 2023. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997.
  12. Dettmers, T., Pagnoni, A., Holtzman, A. and Zettlemoyer, L., 2024. Qlora: Efficient finetuning of quantized llms. Advances in Neural Information Processing Systems, 36.
  13. Reed, S., Zolna, K., Parisotto, E., Colmenarejo, S.G., Novikov, A., Barth-Maron, G., Gimenez, M., Sulsky, Y., Kay, J., Springenberg, J.T. and Eccles, T., 2022. A generalist agent. arXiv preprint arXiv:2205.06175.
  14. Liu, H., Li, C., Wu, Q. and Lee, Y.J., 2024. Visual instruction tuning. Advances in neural information processing systems, 36.
  15. Alayrac, J.B., Donahue, J., Luc, P., Miech, A., Barr, I., Hasson, Y., Lenc, K., Mensch, A., Millican, K., Reynolds, M. and Ring, R., 2022. Flamingo: a visual language model for few-shot learning. Advances in neural information processing systems, 35, pp.23716-23736.
  16. Shuster, K., Poff, S., Chen, M., Kiela, D. and Weston, J., 2021. Retrieval augmentation reduces hallucination in conversation. arXiv preprint arXiv:2104.07567.
  17. Berglund, L., Tong, M., Kaufmann, M., Balesni, M., Stickland, A.C., Korbak, T. and Evans, O., 2023. The reversal curse: Llms trained on" a is b" fail to learn" b is a". arXiv preprint arXiv:2309.12288.
  18. Yan, B., Li, K., Xu, M., Dong, Y., Zhang, Y., Ren, Z. and Cheng, X., On protecting the data privacy of large language models (llms): a survey (2024). arXiv preprint arXiv:2403.05156.
  19. Zou, A., Wang, Z., Carlini, N., Nasr, M., Kolter, J.Z. and Fredrikson, M., 2023. Universal and transferable adversarial attacks on aligned language models. arXiv preprint arXiv:2307.15043.
  20. Wei, A., Haghtalab, N. and Steinhardt, J., 2024. Jailbroken: How does llm safety training fail?. Advances in Neural Information Processing Systems, 36.
  21. Luccioni, A.S., Viguier, S. and Ligozat, A.L., 2023. Estimating the carbon footprint of bloom, a 176b parameter language model. Journal of Machine Learning Research, 24(253), pp.1-15.