Development
Length: 45 Minutes
Speaker(s):
Kyle Strand, Lead Knowledge Management Specialist and Head of Library, Inter-American Development Bank (IDB)
Katrina B Pugh, Lecturer & President, Columbia University & AlignConsulting
Title: From Characters to Strings: Using Vectorization to Discover Knowledge in Your Organization
Time: 11:15 AM - 11:35 AM
Description: In today's fast-paced digital world, knowledge is critical to success. But with the overwhelming amount of text data generated every day, it has become increasingly difficult to find precisely what you are looking for. This is where vectorization comes in, enabling us to represent unstructured data in a structured, analyzable form. This presentation shares our experiences using character-level and transformer-based vectorization approaches to map knowledge in our organization as part of our knowledge management strategy. We discuss how we have leveraged these techniques to represent both the formal knowledge products that we produce and share with the public and the experiential knowledge of our personnel. We also highlight the benefits and challenges of using different vectorization techniques in different contexts. For instance, we showcase how character-level vectorization approaches have been particularly useful in piecing together scattered data about our personnel's experience for expertise location, while transformer-based approaches have excelled at increasing the relevance of search results when dealing with larger bodies of text such as publications. By the end of this presentation, attendees will have gained a deeper understanding of how vectorization can be used to map knowledge in an organization, along with a few approaches for representing both formal and experiential knowledge.
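As a rough illustration of the character-level idea mentioned in this abstract, the sketch below counts overlapping character trigrams and ranks short experience profiles by cosine similarity to a query. It is not the speakers' implementation: the function names, the sample profiles, and the choice of trigrams are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a lowercased, space-padded string."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_profiles(query, profiles, n=3):
    """Return profile names ordered from most to least similar to the query."""
    q = char_ngrams(query, n)
    scored = [(cosine(q, char_ngrams(text, n)), name) for name, text in profiles.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical experience snippets, invented purely for illustration.
PROFILES = {
    "analyst_a": "Water & sanitation infrastructure; hydrology modelling",
    "analyst_b": "Fiscal policy, tax administration",
    "analyst_c": "Early childhood education programmes",
}

# A truncated query still matches because most of its trigrams are shared.
print(rank_profiles("sanitation infrastructur", PROFILES))
```

Because matching happens on character fragments rather than whole words, this kind of representation tolerates typos, truncation, and inconsistent spelling, which is one plausible reason it suits short, messy personnel records better than long publications.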
Title: Improve Sustainability Conversations With Neural Nets
Time: 11:35 AM - 11:55 AM
Description: In today’s polarized world, sustainability solutions require skillful conversation. Text analytics can help. At the University of Maine, we asked, “What are the features of sustainability conversations that lead to innovation, cohesion, and intent?” First, we captured transcripts from aquaculture town-hall conversations, which can be fraught with conflict and debate. Next, we coded the transcripts for rhetorical intent, or what we call “discussion disciplines”: statements, questions, positivity, acknowledgments, synthesis, and even snarkiness. In addition, we developed a set of terms from the transcripts for identifying outcomes. Then, we evaluated several machine learning approaches, such as TF*IDF, Google’s (open) BERT, and a combination of BERT and ResNet. We found that large language models such as BERT recognized the discussion disciplines with the greatest accuracy when measured against the human-coded data. We used the “winning” model to ingest more than 21,000 open-source utterances, labeled each for the discussion disciplines, and labeled each transcript for its likely outcomes. With this large dataset, we found that acknowledgment and positivity have a large, positive, statistically significant impact on intent. Now, using similar models (even hand-coded) and careful observation, sustainability leaders can be better equipped to change the tone, innovate, and get diverse collaborators focused on positive environmental and societal impact.
Applications
Length: 45 Minutes
Speaker(s):
Mary Osborne, Senior Product Manager - Natural Language Processing, SAS
Tom Sabo, Advisory Solutions Architect, SAS
Kirk Swilley, Systems Engineer, SAS
Title: Understanding the Evolution of NLP From Pattern Matching to ChatGPT
Time: 11:15 AM - 11:35 AM
Description: Large language models, generative AI, and NLP have entered the domain of everyday conversation thanks to the disruption of ChatGPT. What was once a niche area is becoming more widespread. Have you ever been curious about the genesis of NLP? In this talk, Osborne introduces attendees to an array of characters, from obscure names in linguistics like Ferdinand de Saussure to well-known mathematicians like Alan Turing and the pioneers of the transformer neural network architecture, Vaswani et al. Discover how this field has evolved and advanced to arrive at mind-blowing technologies like ChatGPT. Throughout, attendees learn about both tried-and-true techniques and where the domain is headed.
Title: A Few of Our Favorite Things: Public Sector Data Ripe for Text Analytics
Time: 11:35 AM - 11:55 AM
Description: Join Sabo and Swilley for a session where they highlight datasets that are both full of content and readily accessible for text analytics exploration. They also explore the different text analytics methods that can be applied to go from questions to decisions for each of these datasets. The academically minded learn about useful datasets to leverage in class or for that next NLP class project. The industry-minded learn about public sector data that can provide intelligence and marketing signals. And each of these datasets has some relevance to government and NGO workers, as they have largely come from that sector. The pair highlights how text analytics and NLP, applied properly, can save lives and improve quality of life.