Exploring the Landscape of Large Language Models!
We are organizing a deep-dive workshop on Large Language Models (LLMs) and their multifaceted applications! Together, we will explore the advancements, applications, and far-reaching implications of LLMs, covering their technical, ethical, and societal aspects. This engaging and insightful event will be moderated by Rohit Agarwal, Abhinanda R. Punnakkal and Ayush Somani, who bring their expertise to the table.
Save the Date!
27th-28th Oct. 2023 at Teknologibygget Tek 1.023, UiT Tromsø
Put aside your lunch on October 28th for a pizza party. 🎉
🎙️ What to Expect:
In-Depth Discussions: Engage in rich dialogues and discussions about the complexities and nuances of LLMs.
Expert Insights: Gain valuable insights from experts in the field, fostering a deeper understanding of LLMs.
Interactive Q&A Sessions: Have your pressing questions answered in lively Q&A sessions.
Networking Opportunities: Connect with peers, experts, and enthusiasts, building meaningful relationships in the community.
📝 Preparation for Speakers
Exploring the Dimensions: Large Language Models and Society
Each speaker is expected to rigorously prepare for their talk, drawing information from the "Summary plan" file. This file contains an extensive outline and links to pertinent papers, ensuring each presentation is well-rounded, informative, and grounded in current research.
🎤 Speakers Line-up
Day 1 (27th October 2023, Friday)
Arne O. Smalås
Philosophical view on LLMs
Who are we? What are we? What are large language models? Can such models turn a machine into an intelligent, conscious being? These are some fundamental and philosophical questions about the nature of large language models (LLMs) and Artificial Intelligence. This talk highlights the complexity of concepts such as consciousness, intelligence, justice, language, creativity, humanity, understanding, and knowledge, which are central to any discussion about AI and LLMs. In 2020, nine philosophers, including Amanda Askell, Regina Rini, and Annette Zimmermann, shared their thoughts on LLMs in an article published on dailynous.com, a platform presenting “News for & about the philosophy profession”. The article acknowledges potential biases but emphasizes the importance of these philosophical discussions. It underscores that the dialogue is not solely about technology but also about profound issues like consciousness, identity, historical bias, justice, and the digital zeitgeist. The aim is to encourage inspiring discussions regarding the philosophical perspective on AI and LLMs, recognizing that these questions go beyond mere technological considerations.
The world will not be the same after LLMs. Every day, we learn something new about this mind-blowing technology. The field is evolving so rapidly that it may feel overwhelming to comprehend the workings of these giants. Before it's too late, let's dive into the journey of text through LLMs. We will explore how text passes through different stages inside an LLM to provide a response magically. We will grasp the core of LLMs - Transformers - while going through different ways of processing text through LLMs to find explanations for questions such as how a single model can perform tasks like summarization, translation, and classification. What is the role of pretraining, and how is it beneficial for the health of our planet? We will also explore some emergent properties of LLMs.
Evolution of foundational LLMs
The impact of Large Language Models (LLMs) on natural language processing and various other domains has been profound and transformative. Here, we provide an overview of foundational LLMs, offering insights into their inception, evolution, and categorizations. We commence with an introduction to foundational models and the reasons for their success. Tracing the evolutionary path, we journey from early embedding-based systems to the contemporary era of transformers and self-supervised training techniques, showcasing the remarkable strides made in language understanding. We then categorize the models, a delineation that is vital for selecting the appropriate model for specific natural language tasks. Scaling LLMs forms a critical focal point, exploring the challenges and solutions associated with training and deploying models on extensive datasets. Moreover, we touch upon the current degree of openness in LLM development, emphasizing accessibility, transparency, and ethical considerations. Lastly, we cast a forward-looking vision on the future of foundational LLMs.
Parameter efficient finetuning in LLMs
This talk provides a thorough examination of fine-tuning large language models (LLMs). It commences by elucidating the core concept of fine-tuning, offering a broad overview of its functionality, and highlighting various practical applications. Following this introduction, it delves into the methodologies for adjusting model parameters. There are currently three methods in practice. The first, retraining, involves a comprehensive reconfiguration of all model parameters, though it often requires substantial computational resources and time. The second method, transfer learning, leverages pre-existing knowledge, focusing on adapting the model from a general to a specific task by freezing all layers except the last one. The third method, parameter-efficient fine-tuning (PEFT), is emphasized in this talk for its versatility and efficiency. PEFT trains only a small number of parameters: all existing parameters of the pre-trained LLM are frozen, a few new ones are introduced, and only these are fine-tuned on a small new dataset. Notably, Low-Rank Adaptation (LoRA) is a prominent PEFT method. The talk culminates in a live practical demonstration of fine-tuning an LLM with LoRA, followed by a comparison between results obtained from the base model and the fine-tuned model.
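To make the "freeze everything, add a few new parameters" idea concrete, here is a minimal NumPy sketch of a single LoRA-adapted linear layer. It is not the code from the talk's demonstration; the layer sizes, the rank, the scaling value `alpha`, and the function name `lora_forward` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 64, 64, 4   # illustrative sizes (assumed, not from the talk)
alpha = 8.0                     # LoRA scaling hyperparameter (assumed value)

# Frozen pretrained weight: never updated during fine-tuning.
W = rng.normal(size=(d_out, d_in))

# New trainable low-rank factors. B starts at zero, so the adapted
# layer initially behaves exactly like the base layer.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))


def lora_forward(x, W, A, B, alpha, rank):
    """Base output x W^T plus the scaled low-rank update x (B A)^T."""
    return x @ W.T + (alpha / rank) * (x @ A.T @ B.T)


x = rng.normal(size=(2, d_in))
base_out = x @ W.T
lora_out = lora_forward(x, W, A, B, alpha, rank)

full_params = W.size
lora_params = A.size + B.size
print(f"trainable: {lora_params} of {full_params} "
      f"({100 * lora_params / full_params:.1f}%)")
```

In this toy layer only `A` and `B` would receive gradient updates, which is why the trainable share stays small even though the full weight matrix is still used in the forward pass.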
Aaron Vaughn Celeste
Application development with LLMs
The talk discusses the transformative potential of Large Language Models (LLMs) when integrated with programs. LLMs can convert unrefined input from a non-expert into properly structured API requests, executable code, and coherent step-by-step action plans. When paired with a program capable of executing code, this combination becomes highly useful. Demonstrations illustrate how LLM-powered tools empower individuals without technical expertise to automate tasks like cleaning spreadsheet data, executing functional code, and analyzing extensive datasets. LangChain serves as a framework for constructing LLM-powered tools, offering features such as vector storage, chat memory, and agents. Vector storage includes similarity search capabilities, while chat memory can be implemented in various ways. Agents can be customized or used as prebuilt options. Vector databases, originally designed for aiding search engines in categorizing webpages, operate on the same data format as LLMs. They facilitate the creation of LLM-powered tools that interact with larger texts. Processing extensive text directly with LLMs can be costly, but vector stores help filter out irrelevant sections, enhancing efficiency. This combination of elements is driving the emergence of a new category of apps that are easy to develop, scalable, and deliver rapid responses.
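The filtering role of a vector store can be sketched in a few lines. The example below does not use LangChain or any vector database; it is a plain NumPy illustration of cosine-similarity search, and the text chunks, the 3-dimensional "embeddings", and the helper `top_k` are invented for this purpose. A real tool would obtain the vectors from an embedding model.

```python
import numpy as np

# Toy 3-dimensional "embeddings"; texts and vectors are made up for illustration.
chunks = [
    "Invoices are due within 30 days.",
    "The office is closed on public holidays.",
    "Late payments incur a 2% fee.",
]
chunk_vectors = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.8, 0.3, 0.1],
])


def top_k(query_vector, vectors, k=2):
    """Return indices of the k vectors most cosine-similar to the query."""
    q = query_vector / np.linalg.norm(query_vector)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q
    return np.argsort(-scores)[:k].tolist()


# Stands in for the embedding of a question about payments.
query_vector = np.array([1.0, 0.2, 0.0])
selected = top_k(query_vector, chunk_vectors)

# Only the selected chunks would be passed to the LLM as context.
context = "\n".join(chunks[i] for i in selected)
print(context)
```

Only the two payment-related chunks end up in `context`, so the LLM never has to process the unrelated text.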
Day 2 (28th October 2023, Saturday)
Finetuning and ICL in LLMs
In a brief period, LLMs have rapidly advanced and can easily perform tasks such as sentiment classification and summarization. LLMs take a query input, known as a prompt, and generate an output called a completion. LLMs perform well when examples of input and output are passed in the prompt. This is known as in-context learning (ICL), where the model produces better completions based on the examples in the prompt. However, in-context learning may not work for smaller LLMs or for specialized tasks with a different data distribution. In such cases, LLMs need to be finetuned in a supervised manner with a small number of examples. Finetuning LLMs sometimes leads to catastrophic forgetting, which is tackled using parameter-efficient finetuning. LLMs also face challenges such as toxicity, aggressive responses, and providing dangerous information. Reinforcement learning from human feedback (RLHF) finetunes LLMs towards human-aligned behavior, minimizing harm and avoiding dangerous topics.
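As a small illustration of in-context learning, the sketch below assembles a few-shot prompt for sentiment classification. The helper `build_few_shot_prompt`, the "Review:" / "Sentiment:" layout, and the example reviews are assumptions made for this illustration; the resulting string would be sent to whichever LLM is in use, and the model call itself is left out.

```python
def build_few_shot_prompt(examples, query, instruction):
    """Assemble an instruction, labelled examples, and the new input into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final entry is left unlabelled so the model completes it.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


examples = [
    ("The pizza was fantastic.", "positive"),
    ("The talk started an hour late.", "negative"),
]
prompt = build_few_shot_prompt(
    examples,
    query="The demo was clear and useful.",
    instruction="Classify the sentiment of each review.",
)
print(prompt)
```

No model weights change here: the examples live only in the prompt, which is what separates in-context learning from finetuning.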
Prompting in LLMs
Prompt engineering involves crafting the input text that guides and influences the behavior of a large language model (LLM), in effect programming the model in English. The talk is divided into three segments: (1) key insights about prompts, (2) emerging prompting techniques, and (3) a demonstration. Segment 1 explains how LLMs assign probabilities to each part of the input, i.e., how LLMs start with random weights and learn to assign higher probabilities to text resembling real-world samples. In segment 2, several emerging prompting strategies will be discussed, including Least-to-Most prompting, Self-Ask, Meta-Prompt, Chain-of-Thought, ReAct, Symbolic Reasoning PAL, Iterative Prompting, Sequential Prompting, Self-Consistency, and Automatic Reasoning & Tool Use (ART). The segment concludes with four key factors for enhancing prompts: structured text, decomposition and reasoning, self-criticism, and ensembling. Finally, a demonstration focused on Chain-of-Thought prompting will be performed.
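Of the strategies listed, Self-Consistency is easy to sketch because it is an ensembling step on top of Chain-of-Thought: sample several reasoning chains and keep the most common final answer. The code below is a minimal illustration, not the talk's demonstration. The sampled completions are hard-coded stand-ins for real model outputs, and the "Answer:" marker is an assumed output format.

```python
from collections import Counter


def extract_answer(completion):
    """Take the text after the final 'Answer:' marker (an assumed output format)."""
    return completion.rsplit("Answer:", 1)[-1].strip()


def self_consistent_answer(completions):
    """Return the most common final answer across sampled reasoning chains."""
    votes = Counter(extract_answer(c) for c in completions)
    return votes.most_common(1)[0][0]


# Hard-coded stand-ins for several sampled chain-of-thought completions.
sampled = [
    "There are 3 boxes with 4 pens each, so 3 * 4 = 12. Answer: 12",
    "4 + 4 + 4 = 12 pens in total. Answer: 12",
    "3 boxes plus 4 pens gives 7. Answer: 7",
]
final = self_consistent_answer(sampled)
print(final)
```

The one faulty chain is outvoted, which is the intuition behind ensembling as a way to enhance prompting.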
Alignment, Interpretability and Robustness in LLMs
The rise of large language models (LLMs) in natural language processing tasks has sparked interest in their alignment, interpretability, and robustness. Alignment addresses the issue of model toxicity, ensuring outputs are safe and accurate. Approaches like LLM reasoning and strategic prompting are being explored. As LLMs become more integrated into society, it's crucial they meet human expectations consistently. The 'Chain of Thought' method aids in understanding LLM reasoning. However, LLMs face challenges like hallucinations and adversarial prompting, where they may generate incorrect or unsafe content. Addressing these requires a comprehensive approach involving ethics, technology, and ongoing evaluation. Striving for these goals aims to transform LLMs from powerful tools into trusted companions in the digital era.
High speed attention in LLMs
Transformers revolutionized Natural Language Processing (NLP) and underpin recent Large Language Models (LLMs). Their key innovation, "self-attention," overcomes the inherent sequential nature of language, enabling parallel data processing and faster model training. This breakthrough, introduced in 2017 by Vaswani et al., paved the way for mainstream LLMs. Ongoing research has sought to enhance the self-attention mechanism. The talk will delve into attention's history, fundamentals, and recent advances, including software-level concepts like FlashAttention and hardware-level improvements. Additionally, the presentation will feature a demonstration of privateGPT++, a secure web application hosted on UiT's private server. This tool allows users to upload documents and pose questions to LLMs within the document's context, showcasing the practical applications of this transformative technology.
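For readers who want to see the mechanism before the talk, here is a minimal NumPy sketch of single-head scaled dot-product self-attention. The dimensions and the random projection matrices are assumptions for illustration; this is the textbook formulation, not FlashAttention and not the code behind privateGPT++.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a sequence X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Every token scores every other token in one matrix product,
    # which is what allows the sequence to be processed in parallel.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ V, weights


rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 8, 4   # illustrative sizes (assumed)
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))

output, weights = self_attention(X, Wq, Wk, Wv)
print(output.shape, weights.shape)
```

The `scores` matrix grows with the square of the sequence length, which is the cost that speed-focused attention variants try to reduce.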
Suyog S. Jadhav
Large scale training of LLMs
Equipped with the knowledge about LLMs from all the previous talks, this session will give you some practical tools and resources to help you train your own LLMs! We will discuss some of the key challenges faced while training LLMs on a large scale, and see how the community has managed to mitigate some of these challenges. In the latter half of the session, we will work our way through a detailed technical demo. This demo will center mainly on the Hugging Face community packages and how you can use them to quickly spin up a usable application. Finally, we will end with a scalable model training example and provide you with the starter code so that you can train and build your own applications!
Impact, challenges and limitations of LLMs
Large Language Models (LLMs) have greatly advanced AI and Natural Language Processing, yet they face critical limitations. These include biases in data, a lack of common sense, poor generalizability, interpretability issues, struggles with rare words, and limited grasp of grammar and syntax. They also lack domain-specific knowledge, have difficulty with context-dependent language, and lack emotion analysis. These challenges hinder their responsible use in sensitive contexts. Despite this, models like ChatGPT, LLaMA, and Bard are popular for their human-like responses. However, their training demands extensive data and computational resources, impacting society. LLMs are also vulnerable to adversarial attacks, posing ethical concerns. This talk will delve into these challenges and their societal and human implications, seeking potential solutions.
Dilip K. Prasad
Headway for LLMs
📚 Workshop Topics:
1. Evolution of Foundational LLMs
Transition from closed to open-source, exploration of size, performance, and scale. Delve into the interesting characteristics, pros, and cons of foundational LLMs.
2. Understanding Fine-tuning, RLHF, and In-context Learning
Unravel different types of fine-tuning mechanisms and examine their real-world examples in LLMs.
3. Walkthrough Prompting Techniques
Discover various prompting techniques, their efficacies, and cover the most crucial ones within the allotted time.
4. Alignment, Interpretability, and Robustness in LLMs
Discuss the alignment problem, delve into the ethics and toxicity in LLMs, and explore the significant role of prompting
5. Self-Attention and Improvements in Terms of Speed
Understand attention from basic self-attention and multi-head self-attention through to its hardware-level improvements.
6. Distributed Large-Scale Training of LLMs and Associated Challenges
Discriminate between models based on their training strategies and discuss the differences, training spikes, and
7. Concept of Vector Database and LLM Application Development Tools
Dive into the world of vector databases, focusing on Pinecone, and discuss the performance, scalability, and flexibility of vector databases. Explore application development using LLMs.
8. Parameter Efficient Fine-tuning and Its Application to LLMs
Compare with Prompting / In-context learning and explore efficient fine-tuning parameters.
🎉 Join Us!
Embark on this enlightening journey and delve into the intricate world of Large Language Models with us. We look forward to seeing you at the workshop, where together, we will explore, learn, and innovate!