Now in our fifth year, Text Analytics Forum is a place for sharing ideas and experiences in text analytics, from beginners to advanced developers. We cover all aspects of and approaches to text analytics, including machine learning and AI, semantic categorization rules, platforms ranging from build-your-own to advanced development and testing environments, and human-machine hybrid applications.
The last few years have seen some fantastic advances in AI in both theory and practice. However, the vast majority of those advances have been in data- and pattern-based AI, not in text. Even the much-hyped GPT-3 from OpenAI largely treats text as sets of complex patterns without any real understanding of the meaning of the words in its truly gigantic “training set”. Three-year-old children continue to outperform the best AI programs in learning language at the meaning level, learning new words from a single exposure rather than billions of examples.
Text analytics to the rescue! AI is only as smart as the content on which it is trained and text analytics is the best tool to create smarter training sets. It is also the best tool to create smart and useful applications of all kinds—search, customer and business intelligence, sentiment and social media analysis, and new applications no one has thought of—yet.
Text Analytics Forum 2022 will showcase how enterprises are using text analytics and AI (or other techniques) to create really useful applications, enhance taxonomies and ontologies, and make existing applications smarter.
Programming includes practical how-to’s, fascinating use cases that showcase the power of text analytics, new techniques and technologies, and new theoretical ideas that drive text analytics to the next level.
Monday, November 7: 9:00 a.m. - 4:30 p.m.
Upgrade to a Platinum Pass for your choice of two preconference workshops or access to Taxonomy Boot Camp, a co-located event with Text Analytics Forum 2022. Workshops are also separately priced.
Monday, November 7: 5:00 p.m. - 6:30 p.m.
Join us for the Enterprise Solutions Showcase Grand Opening reception. Explore the latest products and services from the top companies in the marketplace while enjoying drinks and light bites. Open to all conference attendees, speakers, and sponsors.
Tuesday, November 8: 8:30 a.m. - 5:00 p.m.
Upgrade to a Platinum or Gold Pass for extended access to KMWorld 2022, Enterprise Search & Discovery, and Taxonomy Boot Camp, a series of co-located events happening alongside Text Analytics Forum 2022. See the registration page for details.
Wednesday, November 9: 8:30 a.m. - 9:15 a.m.
Located in Capitol Ballroom
This keynote panel looks at the connections between the conference streams of knowledge management, taxonomy work, text analytics, and search. When it comes to implementation, there needs to be an understanding of what each discipline contributes. What is their common ground? How can and should they be orchestrated? Taxonomy work is founded on a suite of methodologies, frameworks, standards, and technologies to organize information for general access and use. Search tools need to reflect how taxonomies and controlled vocabularies can make search smarter. There has been an increasing convergence between taxonomy, search, and data science, expressed in the subdisciplines of data analytics, text analytics, machine learning, and AI. KM sets the strategic purpose as well as technology implementations using taxonomies, search, and text analytics tools. It identifies and characterizes the contexts of information use so that KM initiatives and technologies can serve practical needs. These three strands do not always interact well. Knowledge organization systems can be too complex and impractical and are often implemented in ignorance of basic information science principles and technology. AI/machine learning applications are often implemented without rigorous conceptual underpinnings and may not easily scale across multiple working contexts. Hear how information science can combine insights from KM, where that understanding is missing, and from data science to develop more effective, more sustainable knowledge organization architectures.
Patrick Lambe, Principal Consultant, Straits Knowledge and Author, Principles of Knowledge Auditing
Susann Roth, Chief of Knowledge Management, Asian Development Bank (ADB)
Irena Zadonsky, Director, Data & Analytics Architecture, Amtrak and former Data Strategy and Policy Manager, Federal Reserve Board
Dave Clarke, EVP, Semantic Graph Technology, Synaptica, part of Squirro AG, UK
Wednesday, November 9: 9:15 a.m. - 9:30 a.m.
Located in Capitol Ballroom
Nearly 80% of enterprise data is unstructured and language-based, making it much less accessible. Join this session to discover the three keys to successfully turn your language assets into data to enhance analytics and empower your team to make better decisions. Start learning how to analyze your complex documents, extract language data to accelerate intelligent process automation, and find the “signal through the noise” to understand market and customer insights. Get three keys to natural language processing (NLP) that create business value and a head start on projects with customizable, pre-built knowledge models. Learn how to simplify, accelerate, and improve your natural language projects and hear about some successful NLP use cases that deliver quick value for any business.
Christophe Aubry, Global Head of Value Creation, expert.ai
Wednesday, November 9: 9:30 a.m. - 9:45 a.m.
Located in Capitol Ballroom
Increasing content volume makes it hard for knowledge consumers to find all the information they need to make a decision. Whether for prospective customers or employee and partner enablement, poor information findability results in inefficiencies and lost opportunities. Intelligent content that is structured and enriched with metadata within a taxonomy framework can transform the user experience. Taxonomy features in content management have long been siloed and limited in scope. By adopting global standards, organizations can break away from old approaches and enable knowledge models to be truly integrated across the whole enterprise, and even beyond, smoothing the reuse of industry-standard taxonomies for efficient data sharing and governance. Learn how structured content, together with the power of modern taxonomy and dynamic content delivery, can bring enormous benefits to end users, drive improved customer experience, and manage information more efficiently.
Chip Gettinger, VP Global Solutions Consulting, Structured Content Technologies, RWS
Wednesday, November 9: 9:45 a.m. - 10:00 a.m.
Located in Capitol Ballroom
According to our survey of over 450 agents, 63% of contact center agents say that customer queries are increasing in complexity. As self-service gets smarter, it is leaving only the complex questions for agents to handle. This means all contact center agents need to be able to handle routine informational and transactional queries as well as situational queries that only SMEs (subject matter experts) used to handle. In other words, all agents need to become super-agents. How can a contact center make it happen? The answer lies in modernizing knowledge with a Knowledge Hub. Learn what it is and how forward-looking contact centers are leveraging it to create “wow” in agent and customer experiences.
Ashu Roy, Chairman & Chief Executive Officer, eGain Corporation
Wednesday, November 9: 10:45 a.m. - 11:30 a.m.
Located in Grand Ballroom, Salon 1
What are the current and future trends for the field of text analytics? Join program chair Tom Reamy for an overview of the conference themes and highlights and a look at what is driving the field forward. The theme this year is Text Analytics as a Foundation Platform for Multiple Applications. What kinds of text analytics platforms are being built and what is the range of the new and exciting applications being built on those platforms? We also continue the exploration of machine learning and rules-based approaches and how people are combining them to get the best of both worlds. The talk ends with a look at current and future trends that promise to dramatically enhance our ability to utilize text with new techniques and applications.
Tom Reamy, Chief Knowledge Architect & Founder, KAPS Group and Author, Deep Text
Join this session to learn how to combine a machine learning-driven search approach with the power of graphs to form what some refer to as composite AI. As organizations struggle to make the most of their data, the combination of both methods in a single unified approach yields tangible results quickly. The presentation focuses on how the combination works and showcases practical examples of the outcome of a combined approach. A customer use case highlights how the creation of a richer, more insightful dataset, intent detection, and contextualization of information works. The result is a competitive edge for anyone working with (unstructured) data at scale.
Dorian Selz, CEO & Co-Founder, Squirro
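To make the combination concrete, here is a minimal sketch of one way a graph signal can be blended with a vector-similarity score; the documents, entities, and weighting below are invented for illustration and are not Squirro's implementation.

```python
# Toy sketch of a "composite AI" ranker: a vector-similarity score
# combined with a boost from a knowledge-graph neighborhood.
# All data below is illustrative, not from the talk.
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "d1": "acme acquires beta corp in cash deal",
    "d2": "beta corp launches new analytics platform",
    "d3": "weather outlook for the weekend",
}
# Hypothetical graph layer: documents linked to knowledge-graph entities.
graph = {"d1": {"acme", "beta corp"}, "d2": {"beta corp"}, "d3": set()}

def rank(query: str, entity: str, graph_weight: float = 0.3):
    qv = Counter(query.split())
    scored = []
    for doc_id, text in docs.items():
        score = cosine(qv, Counter(text.split()))
        if entity in graph[doc_id]:   # graph boost: doc mentions the entity
            score += graph_weight
        scored.append((score, doc_id))
    return sorted(scored, reverse=True)

print(rank("beta corp analytics", entity="beta corp"))
```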
Traditionally, text analytics addressed entity extraction for text strings that are names, places, and locations. More recently, text analytics evolved to interpret the text with sentiment analysis. One area that remains unexploited by text analytics solutions is strategic business analysis. In this session, we explore the concepts of “context-specific entities” and “meaning-loaded entities” which can be used to generate automated analysis of business strategies of companies for competitive intelligence. Real-world examples are shown.
David Seuss, CEO, Northern Light
Wednesday, November 9: 11:45 a.m. - 12:30 p.m.
Located in Grand Ballroom, Salon 1
With the growing popularity of AutoML platforms, it’s a good time to take a step back and evaluate what is and isn't currently possible with automated machine learning. Can the machines clean data for me? Choose an appropriate algorithm? Detect changes in the world that invalidate my deployed model? We look at the full lifecycle of a machine learning project and discuss whether the task is best solved by human, machine, or some combination of the two. We also look at how different techniques, from traditional supervised learning to approaches like reinforcement learning, can introduce their own unique challenges and opportunities. Finally, when a human is called for, we discuss the skill sets and perspectives that provide the most value at each stage in the machine learning process.
Paul Barba, Chief Scientist, Lexalytics, an InMoment Company
Wednesday, November 9: 1:30 p.m. - 2:15 p.m.
Located in Grand Ballroom, Salon 1
Enterprise data, and therefore its value, is often siloed, unstructured, and language-based, making it hard to turn into the kind of data that enhances analytics and supports better decisions. Teams that stitch together DIY and open-source approaches tend to find themselves spending more time managing their tech stack than generating insights and taxonomies. As teams look to unlock the value of their language data to build more effective search and intelligent applications, combining symbolic and machine learning approaches can provide the highest degree of accuracy, explainability, and flexibility. Join us to hear several real-world natural language examples where a hybrid AI approach proved to be the key to success.
Ramona Aubry, Head of Customer Success, North America, Expert.ai
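As one illustration of the hybrid idea, here is a small sketch in which hand-written rules fire first and a statistical model handles the rest; the rules, training data, and use of scikit-learn are assumptions for the example, not expert.ai's implementation.

```python
# Minimal sketch of a hybrid (symbolic + machine learning) classifier:
# high-precision rules fire first; an ML model handles everything else.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

RULES = [  # (compiled pattern, label) -- the symbolic layer
    (re.compile(r"\brefund|charge ?back\b", re.I), "billing"),
    (re.compile(r"\bpassword|locked out\b", re.I), "account_access"),
]

train_texts = ["how do I reset my password", "refund my last charge",
               "the app crashes on startup", "screen freezes when I log in"]
train_labels = ["account_access", "billing", "bug", "bug"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

def classify(text: str) -> str:
    for pattern, label in RULES:     # symbolic pass: explainable, precise
        if pattern.search(text):
            return label
    return model.predict([text])[0]  # statistical fallback: broad coverage

print(classify("I was double charged, please refund me"))
print(classify("it crashes whenever I open a file"))
```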
Wednesday, November 9: 2:30 p.m. - 3:15 p.m.
Located in Grand Ballroom, Salon 1
Auto-categorization is at the heart of good text analytics even when your focus is on extracting data from your unstructured text. Auto-categorization can be used to disambiguate, cluster, clean, pre-process content for training AI, reveal gaps and overlaps in your taxonomy, and more. However, achieving great results requires much more than just turning on the software and sitting back to watch it work.
This talk, based on years of experience and two recent large enterprise auto-categorization projects, covers what it takes to achieve 80%-90% accuracy. Some of the techniques we demonstrate include content structure models that read a document like a human, merging machine learning and symbolic rules, creating dynamic clusters of terms, and more. We also look at the human side: how to train people to build good categorization rules, what makes a good categorization term (not too general, not too specific, but just right), what kind of user research works best, and more.
Tom Reamy, Chief Knowledge Architect & Founder, KAPS Group and Author, Deep Text
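Here is a toy sketch of the kind of categorization rule the talk describes, combining a crude content structure model (section weighting) with a requirement for corroborating terms; the categories, terms, and thresholds are invented.

```python
# Toy auto-categorization rule: weight where a term appears and demand
# evidence from more than one term before assigning a category.
SECTION_WEIGHTS = {"title": 3.0, "summary": 2.0, "body": 1.0}
CATEGORY_TERMS = {"data privacy": {"gdpr", "consent", "personal data", "opt-out"}}

def categorize(doc: dict, threshold: float = 4.0) -> list[str]:
    assigned = []
    for category, terms in CATEGORY_TERMS.items():
        score, hits = 0.0, set()
        for section, weight in SECTION_WEIGHTS.items():
            text = doc.get(section, "").lower()
            for term in terms:
                if term in text:
                    score += weight
                    hits.add(term)
        if score >= threshold and len(hits) >= 2:  # corroborating terms
            assigned.append(category)
    return assigned

doc = {"title": "GDPR consent requirements",
       "summary": "Handling personal data.",
       "body": "Users may opt-out at any time."}
print(categorize(doc))  # ['data privacy']
```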
Wednesday, November 9: 11:45 a.m. - 12:30 p.m.
Located in Hart, Meeting Room Level
Sentiment analysis is typically represented using a single dimension, polarity, which measures the relative positivity, negativity, or neutrality of the language of a text. A multi-vector approach to sentiment analysis provides a more enriched, nuanced evaluation, quantifying additional factors that engage the writer and the individuals for whom the text was intended. Mood is a dimension that assesses the emotional response elicited in the reader or listener, while aspect gauges the level of control the reader or listener is likely to experience. An intensity vector estimates the relative level of activation associated with a text in terms of the degree of arousal it is expected to elicit in the reader or listener. These four vectors often vary independently, as, for example, in the description of a judicial acquittal, typically a positive event, to which there may nonetheless be a negative reaction. In many instances, however, the vectors act in conjunction to provide a consistent view of a text or text collection. This case study illustrates such a situation, using these four sentiment vectors to examine the approximately 56,000 tweets issued by President Trump during his time in office. The analysis demonstrates a contrast between the president's original tweets and those written by others that the president retweeted. While any one of the individual vectors would be useful in evaluating differences between the two sets of tweets, the four in combination show a consistent distinction between the tweets the president authored and those of others he chose to share.
Kemp Williams, Senior Computational Linguist, i2 Group
Gregory Roberts, CEO, Rosoka Software, Inc.
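The four dimensions lend themselves to a simple data model. Here is a hedged sketch using the vector names from the abstract; the lexicon values and averaging are toy stand-ins for a real sentiment engine.

```python
# Sketch of a multi-vector sentiment record using the four dimensions
# named in the talk (polarity, mood, aspect, intensity).
from dataclasses import dataclass

@dataclass
class SentimentVectors:
    polarity: float   # negative .. positive
    mood: float       # emotional response elicited in the reader
    aspect: float     # sense of control the reader is likely to feel
    intensity: float  # expected level of arousal/activation

LEXICON = {  # word -> (polarity, mood, aspect, intensity), invented values
    "acquitted": ( 0.6, -0.4,  0.2, 0.7),
    "outraged":  (-0.8, -0.9, -0.5, 0.9),
    "calm":      ( 0.3,  0.4,  0.6, 0.1),
}

def score(text: str) -> SentimentVectors:
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    if not hits:
        return SentimentVectors(0.0, 0.0, 0.0, 0.0)
    n = len(hits)
    return SentimentVectors(*(sum(v[i] for v in hits) / n for i in range(4)))

# The acquittal example from the abstract: positive polarity, negative mood.
print(score("the defendant was acquitted but the crowd was outraged"))
```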
Wednesday, November 9: 1:30 p.m. - 2:15 p.m.
Located in Hart, Meeting Room Level
Text analytics has broad and deep applications. One of the most successful applications is for deeply understanding human experience … the analysis of unstructured data about many human roles: customer, prospect, patient, and employee experiences. Part of that focus is the sentiment analysis of feedback and unsolicited opinions. To help you prevent missteps and build your analytics programs credibly, we share what you should know when applying sentiment to experience or observational data. Join us to get a handle on types of sentiment engines, overall vs. aspect sentiment, challenges and limitations, why tuning may be necessary, how to audit sentiment, how solicited question-wording impacts sentiment, using sentiment as a predictor, and more. Bonus: Get information about the two types of sentiment applied for speech analytics!
Alison Bushell, Head of Global Analytics, Forsta
Fang Chen, Principal Analytics Consultant, Forsta
Wednesday, November 9: 2:30 p.m. - 3:15 p.m.
Located in Hart, Meeting Room Level
The traditional CX space is undergoing a period of upheaval, with many clients of CX vendors questioning the value of purely survey-focused CX programs and trying to measure the effect on their bottom line. This is where the concept of omnichannel text analytics and NLP comes into play. By measuring every part of a business and how the different parts affect each other, a business can get a holistic view of its winning and losing parts. It's not CX, EX, and PX, which suggest a siloing of the parts of a business; it's BX (business experience) and how all of the different parts of your business affect each other. Understanding that you have an issue in your call center isn't enough; you need to be able to drill into the call center issue and see, for example, that most of the problems are related to a new product feature that is being poorly received by customers, then feed that back into product management to adjust the product to meet customer needs and help the bottom line. The vendors best able to work with all of a company's feedback data, both structured and unstructured, regardless of where it came from, and to feed it back into operational business systems such as product management, order processing, and the call center are going to become the winners in the BX space.
Jeff Catlin, CEO, Lexalytics, an InMoment Company
Wednesday, November 9: 4:00 p.m. - 5:00 p.m.
Located in Grand Ballroom, Salon 1
A panel of four text analytics experts answer questions that have been gathered before the conference, during the conference, and some additional questions from the program chair. This is one of our most popular features, so come prepared with your favorite questions and be ready to learn.
Bryan Bell, CEO, Snowball Software Corporation
Jeff Catlin, CEO, Lexalytics, an InMoment Company
Antonio Linari, Head of Innovation, Expert.ai
Helmut Nagy, CPO, Semantic Web Company GmbH
Thursday, November 10: 8:30 a.m. - 9:15 a.m.
Located in Grand Ballroom 3/4
Artificial intelligence and machine learning (AI/ML) are becoming increasingly influential in mainstream KM, whether it’s in powering digital transformation; automatically connecting us to relevant people and content based on similarities of context and activity; extracting concepts from documents, images, and audio-visual files; or finding meaningful patterns in very large datasets. The very complexity of these tools renders them opaque to business users. Too often, people are given AI capabilities and are expected to use them without the necessary safeguards. How are we to know whether the tools we are deploying are fit for our particular purpose and that they do not carry hidden biases or errors? Klein, a research psychologist famous for pioneering the field of naturalistic decision making, shares five accessible and practical tools from his new book for exploring, understanding, and explaining the boundaries and constraints of AI/ML applications so that we can direct their uses more effectively, safely, and securely.
Gary Klein, CEO, Author, Snapshots of the Mind
Thursday, November 10: 9:15 a.m. - 9:30 a.m.
Located in Grand Ballroom 3/4
It’s important to put employees at the center of work and create a culture where they can thrive through knowledge and expertise. Our lively and knowledgeable speaker explores how customers and Microsoft itself have implemented Viva Topics. She discusses best practices for getting deployed, garnering adoption, and scaling across organizations. The focus is on the practical application of how People + AI work together to generate a useful, engaging, up-to-date knowledge base. Get lots of tips and ideas to ramp up knowledge sharing in your organization.
Naomi Moneypenny, Director, Product Development, Microsoft Viva, Microsoft
Thursday, November 10: 9:30 a.m. - 9:45 a.m.
Located in Grand Ballroom 3/4
How many times have you heard that phrase? Join our popular and experienced thought leader as he discusses new Google-like, AI-driven experiences that users expect these days in search applications. These applications are driven by technologies like knowledge graphs, vector search, and advanced natural language processing, many of which are available as open source. Hear how these technologies are delivering what users want.
Kamran Khan, President & CEO, Pureinsights Technology Corp.
Thursday, November 10: 9:45 a.m. - 10:00 a.m.
Located in Grand Ballroom 3/4
With new machine learning methodologies and innovations taking the search industry by storm, designing a user experience (UX) that aligns not just to relevant results but to predictive and personal experiences is the way of the future. Think about the users: they don’t want to spend their time searching; they want to find their files and get back to work. Relevant search results in an intuitive, easy-to-navigate UI are what users care about. Visualize what you want the user to experience, then wrap data and services around that, and you’ll get closer to an ideal transformational outcome for your users. Join our experienced technologist and leader to get tactics and techniques that create a knowledge management experience based on user goals, not system capabilities. The time to transform the way your users discover knowledge is right now!
Patrick Hoeffel, Head, Partner Success, Lucidworks
Thursday, November 10: 10:15 a.m. - 11:00 a.m.
Located in Capitol Ballroom, Salon F
For decades, the "state of the art" for finding relevant documents from mountains of data in e-discovery has been keyword search, which is frustratingly limited in scope and imprecise. Keyword search catches too little: It focuses on a specific spelling, thus missing alternatives, different word forms, and various ways of saying the same thing. At the same time, it is imprecise because it does not understand context: “interest” could mean fascination, or payment on money lent, or property rights. A keyword search in Japanese documents could return mentions of shoes when “resume” was intended. Searching for sentences would supply needed context but would rarely find an exact match. Traditional cross-lingual search on machine translations of documents (instead of operating on the native language) is harder, as meanings are often lost in translation. In response, our teams built a tool for augmenting e-discovery. This tool uses cross-lingual semantics to overcome the challenge of finding relevant documents in multiple languages, while also bootstrapping more accurate machine translation. Learn how we leverage text analytics to enable searching for relevant documents through finding the most case-relevant phrases, within and across languages, containing key concepts and meanings. And further, learn how we leverage those phrases to create starter translation assets for more accurate machine translation from the very start of e-discovery.
Eugene S. Reyes, Federal Solutions Engineer, Basis Technology
Jason E. Boro, Strategic Partnership Development Manager, Linguistic Systems, Inc.
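For readers who want the general flavor of cross-lingual semantic search, here is a minimal sketch using the open-source sentence-transformers library and one of its multilingual models; it illustrates the technique in general, not the presenters' tool.

```python
# Cross-lingual phrase search with multilingual sentence embeddings.
# Assumes the sentence-transformers library and a multilingual model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

passages = [
    "Please find my resume attached to this email.",   # English
    "Adjunto mi curriculum vitae a este correo.",       # Spanish
    "The shoes were shipped on Tuesday.",               # English
]
query = "job application resume"

query_emb = model.encode(query, convert_to_tensor=True)
passage_embs = model.encode(passages, convert_to_tensor=True)
scores = util.cos_sim(query_emb, passage_embs)[0]

# Both resume sentences should outrank the one about shoes, even though
# the Spanish passage shares no English keywords with the query.
for passage, s in sorted(zip(passages, scores.tolist()), key=lambda x: -x[1]):
    print(f"{s:.3f}  {passage}")
```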
Thursday, November 10: 11:15 a.m. - 12:00 p.m.
Located in Capitol Ballroom, Salon F
Why do we do text analysis? It's mostly to get better search. By search, we mean our enterprise search, where we try to extract entities from text based on machine learning and/or taxonomies. Well, that's not bad. In fact, it's the perfect start to move to the next level. Incorporating ontologies enables named entity recognition and opens the door to fact extraction from text, which can eventually lead to automated sense extraction. In all of this, we are always working with the whole document, but is that what the people using these systems really need? If I work in a contract department, does it make sense to search for all documents that talk about penalties? It helps a little bit, but I want to know which paragraph is about penalties, and in that paragraph, I want to know if it meets certain criteria. If I'm looking in technical documents for the specification on a sensor, maybe I don't want to find the document where this is on page 32. I want to see the specification. Maybe the information I'm looking for is even spread across multiple documents and databases. To achieve this, we need to semantically understand not only the content but also the structure of the document. There are standards such as DITA that focus on generic structures. Combined with knowledge graphs, this can be turned into a methodology for semantically understanding the structure of documents. This allows for more precise text analysis, as we can focus on using the right methods for the different contexts of a document. This presentation illustrates this principle with use cases.
Helmut Nagy, CPO, Semantic Web Company GmbH
Michael Iantosca, Senior Director of Content Platforms, Avalara
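A minimal sketch of the core idea, retrieval at the section level rather than the document level, appears below; the contract text and heading convention are invented, and a real system would derive structure from DITA or a knowledge graph rather than a regex.

```python
# Structure-aware retrieval sketch: index a document at the section
# level so a query returns the matching passage (e.g., the "Penalties"
# clause) rather than the whole file.
import re

document = """\
1. Scope
This agreement covers delivery of analytics services.

2. Penalties
Late delivery incurs a penalty of 2% per week, capped at 10%.

3. Termination
Either party may terminate with 30 days notice."""

# Split on blank lines that precede a numbered heading, keeping each
# heading together with its body text.
sections = re.split(r"\n\n(?=\d+\.\s)", document)

def find_sections(query: str):
    terms = query.lower().split()
    return [s for s in sections if all(t in s.lower() for t in terms)]

for hit in find_sections("penalty"):
    print(hit)   # prints only the Penalties section, not the whole contract
```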
Thursday, November 10: 10:15 a.m. - 11:00 a.m.
Located in Capitol Ballroom, Salon E
Indeed is a job search engine that serves more than 60 countries and indexes millions of documents. Indeed constantly strives to provide the best search experience and present users with relevant results that match their job interests. This presentation focuses on query understanding and how ontological metadata can be used to turn keyword search into semantic search for web-scale data. There are two main challenges when matching jobs to queries: defining and detecting user intent and identifying relevant jobs for a given query intent. Indeed addresses these challenges through a combination of machine learning and human curation methods to improve relevance at scale while also fine-tuning the results to incorporate user and stakeholder feedback. Indeed developed a multilingual ontology of custom concepts that describe the hiring domain (such as occupations, skills, benefits) and a system to extract them from jobs in more than 11 languages and 20 countries. It builds on this work and defines query intent in terms of these concepts. Once Indeed identifies the intent of a query, it automatically selects a set of jobs that match that intent based on the concepts they’re tagged with. Query variants with the same intent are grouped together and reused to interpret more complex queries. The success of this semantically informed approach is also a story of close collaboration across five teams. The insight from human curation directly influences how Indeed designs its machine learning models and the need for query classification contributes to expanding its ontology.
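A toy sketch of concept-based query matching in this spirit appears below; the mini ontology, query variants, and jobs are invented and are not Indeed's system.

```python
# Queries are mapped to ontology concepts, and jobs tagged with those
# concepts match regardless of surface wording or language.
QUERY_TO_CONCEPT = {            # query variants -> canonical concept
    "rn": "occupation/registered_nurse",
    "registered nurse": "occupation/registered_nurse",
    "enfermera": "occupation/registered_nurse",   # cross-lingual variant
    "python developer": "occupation/software_engineer",
}

jobs = [
    {"title": "Registered Nurse, ICU",
     "concepts": {"occupation/registered_nurse"}},
    {"title": "Backend Engineer (Python)",
     "concepts": {"occupation/software_engineer"}},
]

def search(query: str):
    concept = QUERY_TO_CONCEPT.get(query.lower().strip())
    if concept is None:
        return []               # real systems fall back to keyword search here
    return [j["title"] for j in jobs if concept in j["concepts"]]

print(search("RN"))             # ['Registered Nurse, ICU']
print(search("enfermera"))      # same intent, different language
```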
When was the last time a chatbot solved a challenging problem for you? Or was engaging? Sure, chatbots and other digital self-service apps are excellent at resetting passwords and surfacing popular knowledge articles. However, they still struggle with complex tasks and often need to offload work to a human employee. They struggle simply because they are built with insufficient training data or, worst case, based on guesswork. To make digital self-service truly intelligent, let it learn from your organization's best-performing employees or service agents. Join this session to receive a blueprint on harnessing conversational data to build smart self-service fast. Identify the top consumer intents ripe for automation, the ideal tasks to resolve issues, and the optimal conversational flow from the gold mine of data every organization already has. Learn how this intelligence not only applies to chatbots and virtual assistants but can make the entire enterprise smarter, including knowledge management and proactive outreach. Drive knowledge sharing with a data-driven approach from the best source, the voice of your customer.
Andy Traba, Head, Product Marketing for AI & Analytics, NICE
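As a rough illustration of mining conversation logs for automation candidates, the sketch below ranks intents by volume and scriptability; the log format and scoring are invented.

```python
# Count intent volume and how often agents resolved an intent with a
# standard flow, then rank automation candidates.
from collections import Counter

logs = [  # (detected intent, resolved with a standard flow?)
    ("reset_password", True), ("reset_password", True),
    ("billing_dispute", False), ("reset_password", True),
    ("order_status", True), ("order_status", True),
]

volume = Counter(intent for intent, _ in logs)
scripted = Counter(intent for intent, ok in logs if ok)

# High volume AND highly scriptable => ripe for self-service.
candidates = sorted(volume,
                    key=lambda i: (scripted[i] / volume[i], volume[i]),
                    reverse=True)
for intent in candidates:
    print(f"{intent}: volume={volume[intent]}, "
          f"scriptable={scripted[intent] / volume[intent]:.0%}")
```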
Thursday, November 10: 11:15 a.m. - 12:00 p.m.
Located in Capitol Ballroom, Salon E
Narrative data from police agencies on arrest or offense incidents, as well as tips to police departments, is both rich in information and largely unavailable to the public for analysis. The city of Dallas has published over 45,000 deidentified incidents containing narrative data from 2013 and 2014. Assessing large quantities of narrative data for patterns using manual analysis alone can be time-consuming and produces limited qualitative results. How can modern methods in text analytics assist? Our speaker proposes the dataset available from https://www.dallasopendata.com/ as a model for how text analytics can assist with assessing police narrative data for patterns, enhancing both time to value and quality of analysis. While this presentation assesses the data as a whole for crime-related patterns, it is specifically interested in methods to identify risk indicators for human trafficking. To do this, it showcases AI capabilities for developing relevant concepts for extraction from the narratives and shows how to subsequently visualize these AI-supported extraction results in a user-friendly environment. Such methods can be replicated across any police agency to comb narrative event data for human trafficking indicators or to surface other crime-related patterns of interest.
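As a generic illustration (not the presenter's SAS workflow), the sketch below flags narratives containing indicator terms from a curated concept list so analysts can triage a large corpus; the indicator terms and narratives are invented.

```python
# Flag incident narratives that match curated risk-indicator concepts.
INDICATORS = {
    "movement_control": {"confiscated id", "not allowed to leave"},
    "debt_bondage": {"owed money", "working off a debt"},
}

narratives = [
    "Subject stated she owed money to her employer and could not quit.",
    "Routine traffic stop, no further action taken.",
]

for i, text in enumerate(narratives):
    lowered = text.lower()
    flags = [concept for concept, terms in INDICATORS.items()
             if any(t in lowered for t in terms)]
    if flags:
        print(f"narrative {i}: flagged for {flags}")
```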
Chime is the largest of a rising type of financial technology business known as “neobanks,” which cater to low-to-moderate-income customers who are underserved by traditional banks. In 2021, ProPublica published an article explaining that Chime was receiving complaints from customers who were unable to access their funds. According to the business, there had been an “exceptional spike” in fraudulent deposits, but that was little consolation for customers caught in the middle of the chaos. Organizations such as the Consumer Financial Protection Bureau (CFPB) are interested in tracking such financial instabilities, defined largely by narrative accounts, and how they impact customers. How can text analytics assist? This presentation showcases how to assess complaints from 2018–2021 from the CFPB website. SAS Viya Data Studio and SAS Model Studio were leveraged for visual text analytics on this data, uncovering patterns of how this event impacted consumers and informing future regulatory actions for neobanks in general. These capabilities could help regulators investigate consumer complaints faster and more effectively, strengthening consumer protection and clamping down on unfair, misleading, or abusive financial market activities.
Fernanda Molina Galindo, Software Engineer Intern, Microsoft, SAS and Carnegie Mellon University
Thursday, November 10: 12:15 p.m. - 12:30 p.m.
Located in Grand Ballroom 3/4
Do you need to deliver enterprise search across information in SharePoint sites along with other content repositories and possibly even some of your structured data? Do you need your enterprise search to understand your company’s jargon? Do you want search results to return organized facts as opposed to only lists of hopefully relevant links? Knowledge graphs can help. The term “knowledge graph” was coined by Google when it recognized the need to enrich search algorithms by leveraging a curated knowledge base of facts. In the enterprise, knowledge graphs can include controlled vocabularies, reference data, and other commonly used terms and entities, as well as connections between them. Enriching and categorizing content using terms from curated vocabularies makes it more findable. An enterprise knowledge graph can create a foundation for linking structured and unstructured sources based on common terms. It can capture commonly asked questions, drive recommendations, make natural language processing more powerful, and more. Join us to learn how knowledge graphs can help in automatically enriching your unstructured data and improving users’ ability to find relevant content and facts.
Irene Polikoff, Chief Evangelist, TopQuadrant
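Here is a minimal sketch of vocabulary-driven query expansion, one of the enrichment patterns described; the vocabulary and documents are toy stand-ins for an enterprise knowledge graph.

```python
# The query is expanded with synonyms and narrower terms from a
# controlled vocabulary before matching documents.
VOCAB = {
    "laptop": {"synonyms": {"notebook"}, "narrower": {"ultrabook"}},
    "ultrabook": {"synonyms": set(), "narrower": set()},
}

docs = {"d1": "ultrabook battery replacement guide",
        "d2": "notebook docking station setup",
        "d3": "printer toner order form"}

def expand(term: str) -> set[str]:
    entry = VOCAB.get(term, {})
    return {term} | entry.get("synonyms", set()) | entry.get("narrower", set())

def search(term: str):
    terms = expand(term.lower())
    return [d for d, text in docs.items() if any(t in text for t in terms)]

print(search("laptop"))   # matches d1 and d2 via the vocabulary; d3 excluded
```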
Thursday, November 10: 12:30 p.m. - 12:45 p.m.
Located in Grand Ballroom 3/4
KMWorld magazine is proud to sponsor the 2022 KMWorld Awards, KMPromise & KMReality, which are designed to celebrate the success stories of knowledge management. The awards will be presented along with Step Two’s Digital Awards, where you get a sneak peek behind the firewall of these organizations.
Thursday, November 10: 1:00 p.m. - 1:45 p.m.
Located in Capitol Ballroom, Salon F
Information security continues to grow in importance. We hear stories all the time about hackers accessing private information from companies and government agencies. Every organization struggles with employees who store confidential information on insecure networked drives or cloud drives. Our speakers recently did a project with a federal research organization that used auto-tagging and text analytics to identify confidential information that needed to be moved to a secure location. As part of this session, they share the approach taken to identify this information and how they made sure that the tagging and text analytics were accurate. Attendees learn best practices for tuning auto-tagging as well as ways to identify confidential information across the enterprise.
Joseph Hilger, COO, Enterprise Knowledge, LLC
Sara Duane, Consultant, Strategic & Business Consulting, Enterprise Knowledge LLC
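A toy sketch of the kind of scan described, rule-based tagging of files whose text matches confidential-data patterns, appears below; the patterns, paths, and file contents are invented and far simpler than a production system.

```python
# Tag files whose text matches confidential-data patterns so they can
# be queued for review and relocation to a secure store.
import re

PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){15,16}\b"),
    "marked_confidential": re.compile(r"\bconfidential\b", re.I),
}

files = {  # hypothetical paths and contents
    "shared/notes.txt": "Meeting notes. Nothing sensitive.",
    "shared/hr_export.txt": "Employee SSN: 123-45-6789. CONFIDENTIAL.",
}

for path, text in files.items():
    tags = [name for name, rx in PATTERNS.items() if rx.search(text)]
    if tags:
        print(f"{path}: move to secure store, tags={tags}")
```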
Thursday, November 10: 2:00 p.m. - 2:45 p.m.
Located in Capitol Ballroom, Salon F
There’s lots of software available that does a good job of identifying named entities—the names of people, organizations, events, places, etc.—that occur in text. Too often, these are just used as keywords to find digital assets without any further differentiation. For example, an organization could be a public company, U.S. government agency, an institution of higher education, or something else. Being able to identify that a named entity is a specific type of organization can be important in determining whether a digital asset is relevant to a particular search. It can also be useful in adding further context to the digital asset. If we know that a named entity is an institution of higher education, we could further differentiate it by size and location, and even link to a short profile of the organization in Wikipedia or to the institution’s website itself. This talk explains how to easily build up metadata related to named entities to improve search to accurately find and use relevant digital assets and provides real-world examples from client projects.
Joseph Busch, Principal, Taxonomy Strategies
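To make entity-type enrichment concrete, here is a small sketch that attaches type metadata to extracted names via a lookup table; the gazetteer rows and URL are hypothetical, and a real system might pull them from Wikidata or an internal authority file.

```python
# Attach richer type metadata to a bare named entity so search can
# distinguish, say, a university from a company.
GAZETTEER = {
    "Acme Corp": {"type": "public_company", "ticker": "ACME"},
    "State University": {"type": "higher_education", "size": "large",
                         "url": "https://example.edu"},  # hypothetical URL
}

def enrich(entity: str) -> dict:
    record = GAZETTEER.get(entity)
    if record is None:
        return {"name": entity, "type": "organization"}  # fallback: untyped
    return {"name": entity, **record}

for name in ["Acme Corp", "State University", "Unknown Org"]:
    print(enrich(name))
```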
Extracting and structuring content from text- or image-based tables has long been a challenge. Tabular content is particularly important in regulatory, financial, and scientific documents, where complex alphanumeric content is often presented in tabular format. Tables are tough to structure due to inconsistencies in tabular content, high diversity of layouts, complicated elements such as straddle headings, various alignments of contents, the presence of empty cells, and other intricacies. Transforming tabular content into a structured model such as XML or HTML is nearly always a manual or semi-manual process. This presentation explores methods used to perfect a model to solve the challenges around automating table structure extraction from text. Data Conversion Laboratory and Fusemachines created an AI model that finds and extracts information from all tables in a document using a combination of computer vision (CV) and natural language processing (NLP). Our speakers review how they developed and managed a hybrid approach of rules-based processes and machine learning to identify and extract tabular data and augmented training data to develop an AI model that automates table-to-XML conversion.
Mark Gross, President, Data Conversion Laboratory
Isu Shrestha, Senior Machine Learning Engineer, Technology, Fusemachines
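The detection stage (CV plus NLP) is the hard part and beyond a short example, but the final serialization step such a pipeline might end with can be sketched; the table content below is invented.

```python
# Serialize recovered table cells, including an empty one, into simple
# XML using only the standard library.
import xml.etree.ElementTree as ET

rows = [["Parameter", "Min", "Max"],
        ["Voltage (V)", "3.0", "3.6"],
        ["Current (mA)", "", "120"]]   # empty cell survives the round trip

table = ET.Element("table")
header, *body = rows
for row_data, tag in [(header, "th")] + [(r, "td") for r in body]:
    tr = ET.SubElement(table, "tr")
    for cell in row_data:
        ET.SubElement(tr, tag).text = cell

ET.indent(table)                       # pretty-print (Python 3.9+)
print(ET.tostring(table, encoding="unicode"))
```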
Thursday, November 10: 3:00 p.m. - 3:45 p.m.
Located in Capitol Ballroom, Salon F
This talk takes a close look at knowledge hubs and semantic data fabrics and shows how they can dynamically aggregate information from different data and content systems. By applying graph-based text mining methods and principles to multiple sources, they can provide users with personalized and contextualized views. Knowledge hubs and data fabrics are based on multiple repositories with different content/data types and metadata systems mapped to a coherent semantic knowledge model. As a result, users benefit from topical landing pages that provide a comprehensive overview and entry points to other nodes or aspects of a larger knowledge graph. Healthdirect Australia, for example, an Australian government healthcare knowledge hub, links and integrates numerous aspects of the entire healthcare system. In doing so, Healthdirect links content from more than 100 topic-specific sources and connects them to structured data such as the Australian drug database or a service registry. The hub offers convenient search, help, and navigation tools: content filtering by user groups, a symptom checker, related search queries, and numerous links to related topics, therapies, or drugs. The second case study looks at the details of a knowledge hub implemented for a software company. Embedded in a broader KM strategy, it provides entity-centric and linked views across multiple content streams, including technical product information, marketing material, and analyst-provided market trends. This content is in turn linked to employee skills and competency profiles, enabling the delivery of intelligent recommendation systems.
Andreas Blumauer, Founder & CEO, Semantic Web Company Inc.
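A toy sketch of the aggregation idea behind such a hub, mapping records from two sources onto shared concept IDs to assemble a topical landing-page view, appears below; the sources, mappings, and concept IDs are invented for illustration.

```python
# Records from different repositories are mapped to shared concept IDs
# so a single "landing page" view can be assembled per topic.
content_store = [
    {"title": "Managing hay fever", "tags": ["hayfever"]},
    {"title": "Antihistamine overview", "tags": ["antihistamines"]},
]
drug_db = [{"drug": "Loratadine", "treats": "hayfever"}]

# Mapping layer: source-specific labels -> shared concept IDs.
CONCEPT_MAP = {"hayfever": "cond:allergic_rhinitis",
               "antihistamines": "drugclass:antihistamine"}

def landing_page(concept_id: str) -> dict:
    articles = [c["title"] for c in content_store
                if any(CONCEPT_MAP.get(t) == concept_id for t in c["tags"])]
    drugs = [d["drug"] for d in drug_db
             if CONCEPT_MAP.get(d["treats"]) == concept_id]
    return {"concept": concept_id, "articles": articles, "drugs": drugs}

print(landing_page("cond:allergic_rhinitis"))
```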
Thursday, November 10: 4:00 p.m. - 4:15 p.m.
Located in Grand Ballroom 3/4
As the information age rapidly accelerates, organizations are challenged with decreasing employee productivity, high employee churn and changing market conditions. We will highlight the importance of integrating data and knowledge to deploy an effective AI strategy as a competitive advantage. Investing in the correct knowledge infrastructure to support AI initiatives is crucial to maximize existing technology investments, improve employee productivity and increase the engagement of your customers. Modern knowledge infrastructure transforms knowledge into the answers that your customers need and that both humans and machines need to operate effectively in the modern workplace. Learn why investing in the correct knowledge infrastructure is the most valuable investment that you can make to prepare your organization for the future of work.
Colin Kennedy, COO and Co-Founder, Shelf
Thursday, November 10: 4:00 p.m. - 5:00 p.m.
Located in Grand Ballroom 3/4
Join the members of Knowledge Cast, the number-one ranked podcast on KM. Wahl and his guests provide insights and inspiration via a live-stream episode of the podcast. Hear some of the key ideas and innovations shared at this year’s KMWorld as well as what our panelists are seeing within the rapidly changing field of KM.
Zach Wahl, CEO, Enterprise Knowledge
Phaedra Boinodiris, Principal Consultant Trustworthy AI, IBM
Jean Claude Monney, Digital Workplace & KM Advisor, The Monney Group, LLC
Larry Prusak, Author, The Smart Mission - NASA's Lessons for Managing Knowledge, People, and Projects
Gloria Burke, Senior Knowledge Management Strategist and formerly of Slalom