Thursday, November 7: 8:30 a.m. - 9:30 a.m.
Even as new and useful technologies rush to take their place in our organizations, 80% of respondents to a recent global culture survey say their organization’s culture must evolve in the next 5 years for their company to succeed, grow, and retain the best people. Yes, it’s all about the people in any organization. Technology can certainly support, speed, and spark knowledge sharing, innovation, and success, but culture is implicit rather than explicit, emotional rather than rational—that’s what makes it so hard to work with, but that’s also what makes it so powerful! Anderson shares secrets from her recent book, a practical guide to working with culture and tapping into a source of catalytic change within your organization. Since every organization’s culture is intimate and personal, aligning culture always involves getting to the heart of difficult matters, unearthing the “family secrets” of a company—the emotional histories that lie under the surface of the story the company tells about itself to the outside world. Get lots of insights and tips to use in your organization to build a knowledge-sharing culture that supports success!
Gretchen Anderson, Director, Katzenbach Center, PwC Strategy& and Co-Author, The Critical Few: Energize Your Company’s Culture by Choosing What Really Matters
Thursday, November 7: 9:30 a.m. - 9:45 a.m.
Creating a consumer-like, personalized experience for the daily work of finding information people need to do their jobs is the crux of efficient knowledge sharing. These uniquely personal experiences facilitate the flow of organizational operations and more intelligent decision-making. Professionals at companies like Red Hat, Reddit, and PwC rely on this type of platform to quickly find answers and proactively suggest insights. Learn how they approached their challenges and the solutions they found to empower employees to be more productive and help their organizations to attract and retain the best talent.
Diane Burley, VP Content, Lucidworks
Thursday, November 7: 9:45 a.m. - 10:00 a.m.
Our speaker shares best practices for and lessons learned from implementing and enriching enterprise search solutions.
Megan DeSomery, Director, Product Management, EPAM Systems
Thursday, November 7: 10:15 a.m. - 11:00 a.m.
Government influence operations (IO) have been conducted throughout recorded history. In recent times, they have commonly been referred to as propaganda, active measures, or psychological operations (PSYOPS). More than a century of Russian “Chekist” tradition has culminated in a force that can mobilize thousands of humans augmented by unlimited numbers of bots. As documented in congressional testimony, this force has repeatedly seized control of foreign news cycles, inserting sentiments or wholly fictional stories. A simple positive/neutral/negative axis is not as applicable to the IO mission as one specific to the operation in question, such as entity stability/instability, trustworthiness, or advocacy of violence. Given an IO action, such as promotion of an embarrassing story, the operator wants to measure the effect as change in sentiment, such as distrust of the now-discredited entity.
Christopher Biow, SVP, Global Public Sector, Basis Technology
Mike Harris, Director, Field Operations, Basis Technology Corp
Regulations.gov was launched in 2003 to provide the public with access to federal regulatory content and the ability to submit comments on federal regulations. Manually reading thousands of comments is time-consuming and labor-intensive. It is also difficult for multiple reviewers to accurately and consistently assess content, themes, stakeholder identity, and sentiment. In response to this issue, text analytics can be used to develop transparent and accurate text models, and visual analytics can quantify, summarize, and present the results of that analysis. This talk addresses public commentary submitted in response to new product regulations by the U.S. Food and Drug Administration.
Emily McRae, Systems Engineer, SAS
Thursday, November 7: 11:15 a.m. - 12:00 p.m.
Due to subjective content, an absence of labels, and a lack of dimensions, analyzing unstructured data can be a challenging task. In this session we’ll discuss improving unstructured data analysis through automation (including a human-in-the-loop), pre-processing capabilities for reducing noise, and options for feature engineering and extraction. You will also get an overview of our hybrid analytical platform, which combines natural language processing with machine learning and statistical techniques to deliver rich insights. Our hope is that, regardless of your platform of choice, you take away ideas that make your own analysis easier and more effective.
Sundaresh Sankaran, Solutions Architect, Global Technology Practice, SAS Institute
One fundamental obstacle to using machine learning (ML) to accurately extract facts from free-text documents is that it requires huge amounts of pre-categorized data for training. Manual annotation is not a viable option, as it would entail enormous amounts of human analyst time. In this presentation we outline an innovative rule-based approach for automated generation of pre-categorized data that can then be used to train ML models. This approach relies on writing queries expressed in a powerful pattern definition language that fully exploits the results of the underlying natural language processing (NLP): deep linguistic, semantic, and statistical analysis of documents. The sequential application of rule-based and ML techniques facilitates highly accurate results. An example project illustrating this technology focuses on the automated extraction of clinical information from patient medical records.
Sergei Ananyan, CEO, Megaputer Intelligence
Elli Bourlai, Senior Computational Linguist, Megaputer Intelligence
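As a rough, hypothetical illustration of the rules-then-ML idea this session describes (simple regexes standing in for a real pattern definition language; the rules, categories, and sentences are invented, not Megaputer's), a minimal Python sketch might look like this:

```python
# Minimal sketch: rules auto-label sentences, and the rule-generated labels
# then train an ML model that can generalize beyond the rules.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical rule set: regex pattern -> category label.
RULES = {
    r"\b\d+\s*mg\b|\bdose\b": "medication",
    r"\bdiagnos(ed|is)\b|\bhistory of\b": "diagnosis",
}

def rule_label(sentence):
    """Return the label of the first rule that matches, or None."""
    for pattern, label in RULES.items():
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            return label
    return None

sentences = [
    "Patient was diagnosed with hypertension in 2015.",
    "Metformin 500 mg twice daily.",
    "History of type 2 diabetes.",
    "Aspirin dose reduced to 81 mg.",
]

# 1) Rules generate the pre-categorized training data automatically.
labeled = [(s, rule_label(s)) for s in sentences]
train = [(s, y) for s, y in labeled if y is not None]
texts, labels = zip(*train)

# 2) The ML model is trained on rule-generated labels and can then handle
#    sentences the rules never anticipated.
vec = TfidfVectorizer()
model = LogisticRegression().fit(vec.fit_transform(texts), labels)
print(model.predict(vec.transform(["Started lisinopril 10 mg daily."])))
```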
Thursday, November 7: 10:15 a.m. - 11:00 a.m.
Efforts to counter human trafficking internationally must assess data from a variety of sources to determine where best to devote limited resources. How can analysts effectively tap all the relevant data to best inform decisions to counter human trafficking? This presentation showcases a framework supporting AI for exploring all data related to counter human trafficking initiatives internationally. The framework incorporates rule-based and machine learning (ML) text analytics results not available in the original datasets. As a focal point, we demonstrate how to apply rule-based text extraction of trafficking victims to generate training data for subsequent ML and deep learning models. We ultimately show how this framework provides decision makers with capabilities for countering human trafficking internationally, and how it is extensible as new AI techniques and sources of information become available.
Tom Sabo, Advisory Solutions Architect, SAS
As organizations shift focus from data-generating to data-powered, the ability to incorporate all information—structured and unstructured—is key to delivering insight that is trusted by the business. What does an organization need to bring together all information and successfully manage its data quality? Semantic AI provides the context and meaning that transforms textual information into trusted data. It uncovers the insights and relationships using NLP, machine learning, and AI strategies so you can expose new information to the business and answer questions you couldn’t answer before. Using the real-life success story of one of the world’s largest auto manufacturers that uses Semantic AI to harmonize, extract, and enrich data for vehicle safety analysis, this session covers how the use of Semantic AI cleans and calibrates IoT, text-based data, and unstructured content to improve data quality, analytics, and the fidelity of business decisions.
Jeremy Bentley, Head, Strategy, MarkLogic
Thursday, November 7: 11:15 a.m. - 12:00 p.m.
How can we classify audio files of music with very sparsely available text? A large commercial music publisher with a faceted classification system (for 650,000 tracks!) realized that its search problems were caused by incomplete, missing, or misapplied metadata. Additionally, the text associated with each track (or album) was spotty at best. In this talk, Kasenchak describes the variety of text analytical (and other) approaches used to try to solve the problem: adding or correcting metadata to improve search.
James Schumann, Director of Corporate Relations, Access Innovations, Inc.
Vocabularies and natural language processing (NLP) often work hand-in-hand to provide text analytics solutions. This talk explores this partnership in detail in the context of a specific knowledge domain: biomedicine. Standard assumptions made by NLP engines about how words are stemmed, tokenized, and assembled to form phrases and sentences can be challenged by a specialized domain such as biomedicine, which has its own terminology and knowledge models. Biomedical vocabularies and ontologies play an essential role in ensuring that biomedical text is analyzed appropriately. This talk goes into more detail about how the two capabilities—domain-specific vocabularies and ontologies, and NLP engines—can learn from each other to deliver better biomedical text analytics solutions.
James Morris, Solution Architect, Semaphore by MarkLogic and MarkLogic Corporation
Jon Stevens, NLP Software Developer, AbbVie
Thursday, November 7: 12:15 p.m. - 12:30 p.m.
The Internet of Things, computer vision, and document understanding are all becoming critical drivers of enterprise evolution. These are the “3 Pillars of AI,” and they are having a real, practical impact on the world of KM. Khan, who leads Accenture's Search & Content Analytics Group, briefly explains these pillars, delving into document understanding and explaining how it is drastically changing the search and KM landscape. Citing a real-world example, he discusses how search and analytics are being combined with AI technologies like machine learning and natural language processing to enable a global enterprise to extract valuable insights from its untapped, unstructured data sources, improving operations and maintaining a competitive advantage.
Kamran Khan, Managing Director, Accenture
Thursday, November 7: 12:30 p.m. - 12:45 p.m.
KMWorld magazine is proud to sponsor the 2019 KMWorld Awards, KM Promise & KM Reality, which are designed to celebrate the success stories of knowledge management. The awards will be presented along with Step Two’s Intranet & Digital Awards, where you get a sneak peek behind the firewall of these organizations.
Rebecca Rodgers, Principal Consultant Digital Workplace & Community Manager, Step Two
Thursday, November 7: 1:00 p.m. - 1:45 p.m.
AI promises to categorize all types of content with reliable results, but the reality is much more complex. Most applications won’t work with a meat grinder approach, where you pour a huge amount of content in one end and a perfectly organized collection comes out the other end. Effective automated categorization depends on defining a process workflow and assembling a stack of methods to process different types of content in different ways. Designing and validating a content processing workflow requires human judgments, so good-quality categorization applications often depend on making the best use of people. This presentation provides a reality check on unsupervised automated categorization and discusses a case study in which the performance was suitable for editorial review and approval, but not for unsupervised processing of a large collection.
Joseph Busch, Principal, Taxonomy Strategies
There is no such thing as unstructured text—even tweets have some structure: words, clauses, phrases, even the occasional paragraph. Techniques that treat documents as undifferentiated bags of words have never achieved high enough accuracy to build good auto-categorization, whether using machine learning (ML) or rules. However, by going beyond bags of words and utilizing the structures found in “unstructured” text, it is possible to achieve dramatically improved accuracy. This talk, using multiple examples from recent projects, presents how to build content structure models and content structure rules that can be used for both rules-based and ML categorization. We conclude with a method for combining rules and ML in a variety of ways for the best of both worlds.
Tom Reamy, Chief Knowledge Architect & Founder, KAPS Group and Author, Deep Text
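As a hypothetical sketch of the general idea of combining structure-aware rules with an ML fallback (invented rules and training examples, not the speaker's models), one simple hybrid categorizer could be:

```python
# Sketch: apply hand-written rules to a structured part of the document (the
# title) first; fall back to a bag-of-words ML model when no rule fires.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

def rule_categorize(doc):
    """Structure-aware rules: the title is far more indicative than the body."""
    title = doc["title"].lower()
    if "earnings" in title or "quarterly results" in title:
        return "finance"
    if "recall notice" in title:
        return "safety"
    return None  # no rule fired; defer to the ML model

# Hypothetical labeled examples for the ML fallback.
train_docs = [
    ("shares rose after the company reported profit", "finance"),
    ("the agency issued new crash-test requirements", "safety"),
    ("the board approved the dividend increase", "finance"),
    ("defective airbags prompted an investigation", "safety"),
]
texts, labels = zip(*train_docs)
vec = TfidfVectorizer()
model = MultinomialNB().fit(vec.fit_transform(texts), labels)

def categorize(doc):
    rule_hit = rule_categorize(doc)
    if rule_hit is not None:
        return rule_hit, "rule"
    return model.predict(vec.transform([doc["body"]]))[0], "ml"

print(categorize({"title": "Q3 earnings call", "body": "revenue grew 8%"}))
print(categorize({"title": "Weekly update", "body": "defective airbags found"}))
```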
Thursday, November 7: 2:00 p.m. - 2:45 p.m.
The American Psychological Association’s PsycINFO databases release around 3,000 records per month. In June 2017, a plan was created to bring machine-aided indexing (MAI) back to the APA’s PsycINFO databases. Since then, MAI has been implemented across three of the databases, including PsycARTICLES. Pearson discusses the strategy used to build the rule base and integrate the software into the production system. He also takes a look at some of the challenges faced along the way and explores future goals and further deployment plans.
Christopher Pearson, Machine-Aided Indexing Specialist, Content Management, American Psychological Association
Thursday, November 7: 3:00 p.m. - 3:45 p.m.
Products like Amazon Alexa and Google Home are changing expectations of how search should work. Searchers now expect voice-driven search solutions that provide answers, not just a list of links. This talk shares how knowledge graphs enable natural language search and how text analytics, along with machine learning, can be used to populate these powerful constructs. We explain how to architect these solutions and provide real-world examples of how many of our clients have taken advantage of these powerful tools.
Joseph Hilger, COO, Enterprise Knowledge, LLC
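To give a concrete, hypothetical flavor of populating a knowledge graph from text (toy documents, regex-based extraction standing in for real NLP/ML, and invented relation names, not Enterprise Knowledge's implementation), a minimal sketch might be:

```python
# Sketch: extract (subject, predicate, object) triples from text and store
# them in a tiny in-memory graph that can answer questions directly.
import re
from collections import defaultdict

documents = [
    "Acme Corp acquired Widget Labs in 2018.",
    "Widget Labs develops RoboSort.",
]

# Hypothetical extraction rules; a production system would use NLP/ML models.
PATTERNS = [
    (r"(?P<s>[A-Z][\w ]+?) acquired (?P<o>[A-Z][\w ]+?)(?= in |\.)", "acquired"),
    (r"(?P<s>[A-Z][\w ]+?) develops (?P<o>[A-Z][\w ]+?)(?= in |\.)", "develops"),
]

graph = defaultdict(list)  # subject -> list of (predicate, object)
for doc in documents:
    for pattern, predicate in PATTERNS:
        for m in re.finditer(pattern, doc):
            graph[m.group("s").strip()].append((predicate, m.group("o").strip()))

# A question like "What did Acme Corp acquire?" is answered by walking the
# graph rather than returning a list of links.
for subject, edges in graph.items():
    for predicate, obj in edges:
        print(subject, predicate, obj)
```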
Similar to many enterprises, the Inter-American Development Bank (IDB) has multiple information sources that are isolated in different systems. There is no link between these information resources that makes them accessible outside of their native systems, and it is not possible to relate distinct kinds of resources that share some characteristics—for example, to find a course that is about the same topic as a publication. To address this, IDB implemented a system that can automatically extract entities and concepts from its systems, including structured and unstructured data, semantically enhance the data, and make it accessible in a knowledge graph. Hernandez and Marino share lessons learned from this project that can help interested attendees start with a baseline of best practices for their own projects, saving valuable time and money.
Chris Marino, Senior Consultant, Enterprise Knowledge
Monica Hernandez, Senior Project Manager, Inter-American Development Bank
Thursday, November 7: 1:00 p.m. - 1:45 p.m.
In some contexts, such as e-discovery, achieving high recall when retrieving documents is critical. Over the past year, Dimm has challenged audiences at several conferences to construct keyword searches that perform better than supervised machine learning. This talk summarizes the results and explains why it is so hard for humans to beat machine learning when seeking high recall.
Bill Dimm, Founder & CEO, Hot Neuron LLC
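For readers who want a concrete sense of the metric at stake, here is a minimal, hypothetical sketch (invented documents and keywords, not the speaker's benchmark) of how recall is measured for a keyword search against a reviewed sample:

```python
# Sketch: compute recall of a keyword query over a labeled sample, the basic
# yardstick in a high-recall e-discovery comparison.
docs = [
    ("Please shred the contract before the audit", True),
    ("Destroy all copies of the agreement", True),
    ("Lunch is at noon on Friday", False),
    ("Get rid of those files quietly", True),
    ("The audit schedule is attached", False),
]

keywords = {"shred", "destroy", "delete"}

def keyword_hit(text):
    return any(k in text.lower() for k in keywords)

relevant = [d for d, rel in docs if rel]
retrieved_relevant = [d for d in relevant if keyword_hit(d)]
recall = len(retrieved_relevant) / len(relevant)
# "Get rid of those files quietly" is relevant but contains no keyword,
# illustrating the vocabulary-mismatch problem that hurts keyword recall.
print(f"Keyword recall: {recall:.2f}")  # 0.67
```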
It’s “How the Future Was” in 2004: Microsoft and enterprise search vendors showed semantic search demos that were “so close” to being better than keywords. Fifteen years on, we’re still “so close,” and most information retrieval searches are still keyword lookups in hash tables. What has happened instead is that semantic search has matured in other directions. We explore multiple use cases and specific applications to government missions and commercial business problems where semantic search has established a few niches. We give special emphasis to the “analytic refutation problem,” in which both keyword search and much of current AI serve only to help people find even more content that reinforces their biases and mistaken conceptions. Here semantic search has found its deepest niche: helping human analysts triage otherwise intractable quantities of textual information while maintaining a healthy bias against their working hypotheses.
Christopher Biow, SVP, Global Public Sector, Basis Technology
Eugene S. Reyes, Federal Solutions Engineer, Basis Technology
Thursday, November 7: 2:00 p.m. - 2:45 p.m.
SAS Cognitive Search combines search with text analytics to elevate the intelligence of information retrieval. Search features a flexible query syntax to fit various business needs and help uncover insights hidden in data. Text analytics can extract entities from user text data and enrich the raw data with category and sentiment information. This session presents an easy-to-use interface that leverages SAS Cognitive Search to perform search on temporal and spatial data enriched with NLP features. With this interface, the user can analyze customer reviews for a product category to create a timeline and deduce trends. Other features include an interactive map facilitating geographic data search and filtering, and a facet-based view for aggregating query results. Understanding your customers and what they think of your products has never been easier!
Feng Ye, Principal Software Developer, SAS Institute
Keywords and hundreds, if not thousands, of rules are no longer enough to keep up with Amazon and recapture lost market share. Amazon has set the new standard that organizations must meet to effectively compete for their customers’ attention. Join this presentation and learn how to capture and aggregate valuable customer interactions like queries, clicks, and cart behavior in real time so every customer gets a customized experience that is continuously being refined; easily incorporate regional trends and seasonality to deliver relevant results; make every customer experience personal; and run A/B tests and experiments so the shopping and purchase flow is constantly fine-tuned and optimized. Go from running a few experiments that take months to get out the door to dozens running live in production, without having to bother your data scientists or engineers.
Simon Taylor, VP, Partners & Alliances, Lucidworks
Thursday, November 7: 4:00 p.m. - 4:15 p.m.
The possibilities are endless, but what can we really expect in 2020? Our experienced speaker, who wrote the SharePoint 2013 Consultant’s Handbook and has coached people on metadata, taxonomies, apps for Office 365, and more, shares the most recent developments and promises for the coming years.
Chris McNulty, Senior Product Manager, Microsoft
Thursday, November 7: 4:15 p.m. - 5:00 p.m.
We live in a world which promises infinite choice, but are we more trapped in the patterns of past practice than we care to think? Is the hierarchical or matrixed organization fit for purpose in a world of increased uncertainty and volatility? Governments face increasing legitimate demands on their resources from citizens and the wider needs of the planet, but have few resources to deal with them. Ideology and belief seem at times to triumph over fact, evidence, and reason. Have we gone beyond even post-modernism into a new world with constantly shifting paradigms and increasingly less time to adjust to them? Our panel looks at these questions from the perspectives of knowledge and complexity. They discuss transforming and revolutionizing the way we do business as we move into an uncertain future, how we satisfy our clients in an ever-changing technological age, and how, in our complex societies, we provide value, exchange knowledge, innovate, grow, and support our world. Our panel of experienced thinkers and doers shares their insights about what we should be doing to further develop a sustainable ecosystem in our organizations, communities, and world.
Patrick Lambe, Principal Consultant, Straits Knowledge and Author, Principles of Knowledge Auditing
Dave Snowden, Founder & Chief Scientist, The Cynefin Company
Tom Stewart, Executive Director, National Center for the Middle Market, Fisher College of Business, The Ohio State University
Alicia Juarrero, Founder and President, VectorAnalytica, Inc. and Author, Dynamics in Action: Intentional Behavior as a Complex System