Our popular writers, speakers, and authors of Wow, Woo, Win: Service Design, Strategy & the Art of Customer Delight look at how customer experience and service design can enhance knowledge sharing and success in organizations. They discuss the importance of designing your organization around service and offer clear, practical strategies based on the idea that designing services is markedly different from manufacturing. With customers having more choices than ever before, study after study reveals that it’s the experience that makes the difference. To provide great experiences that keep customers coming back, organizations and KM programs must design services with as much care as they design products. Service design is proactive: it is about delivering on your promise to customers in accordance with your strategy. Our speakers share how to create “Aha” moments, when the customer makes a positive judgment, and how to avoid “Ow” moments. They provide tips on how you and your customers can build a bank of trust, fueled by knowledge of each other’s skills and preferences.
Everyone who engages with your organization is in search of something, whether it’s products, services, people, or support. Too much of their time is spent sifting through useless information. New advances in machine learning and AI technology, combined with contextual search, are finally bringing relevance to every interaction and are making knowledge management a key driver of real business results. See real-world examples of the impact that increased maturity has made on innovative companies. Learn actionable steps to increase the relevance of your organization and start positively impacting your bottom line.
Communities of practice are a great way to develop expertise and innovation around specific interests. Through demonstrations of recent advances in Office 365 that infuse intelligence into everyday experiences, you’ll see how to leverage tacit and explicit knowledge in different ways and how to reuse and build upon the work of others. Our speaker has extensive experience in enterprise collaboration systems and currently leads intelligent search and discovery for Microsoft 365. Expect lots of tips and examples for improving your KM initiatives.
With the recently published book Deep Text: Using Text Analytics to Overcome Information Overload, Get Real Value From Social Media, and Add Big(ger) Text to Big Data as a guide, author Tom Reamy provides an extensive overview of the whole field of text analytics: what text analytics is, how to get started, development best practices, the latest applications, and how to build an enterprise text analytics platform. The talk ends with a look at current and future trends that promise to dramatically enhance our ability to utilize text with new techniques and applications.
Text analytics emerged in the mid-2000s as a collection of technologies, solutions, and practices aimed at meeting a diversity of business challenges. A decade in, what’s new and promising, what’s tried-and-true, and what’s on the horizon? Sentiment, identity, personality, and intent, extracted from text: all are now part of the data science mix. How has the market evolved—both demand and supply—and how should practitioners, solution providers, business analysts, and investors stay on top of developments? Text Analytics Market Insights helps all interested parties understand what’s working and what’s next, so they can extract the greatest value from text analytics.
Artificial intelligence is transforming text analytics. However, most AI algorithms still lack the ability to understand text data that differs from what they encountered during training. This becomes a critical issue when algorithms encounter unfamiliar words, misspellings, acronyms, or words in a different language. The solution? Transfer learning: the ability of an AI system to take what it has learned in one situation and apply it to new and different situations. But is transfer learning ready for business deployment, or is it still an emerging technology? How can it be used in text analytics today? Havasi discusses how transfer learning can be applied to text analytics across multiple industries.
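To make the idea concrete, here is a minimal, hypothetical sketch of transfer-style adaptation for text classification, assuming scikit-learn; it illustrates the general concept only, not the speaker's system, and the example texts are invented.

```python
# Minimal sketch of transfer-style adaptation (illustrative only, not the
# speaker's system): a classifier trained on a label-rich source domain is
# further trained on a small labeled sample from the target domain.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**16)

# Source domain: plenty of labeled data (e.g., product reviews).
source_texts = ["great product, works perfectly", "terrible, broke after a day"]
source_labels = ["pos", "neg"]

# Target domain: only a handful of labeled examples (e.g., support tickets).
target_texts = ["agent resolved my issue quickly", "still waiting, very frustrated"]
target_labels = ["pos", "neg"]

clf = SGDClassifier(random_state=0)

# Step 1: learn from the source domain.
clf.partial_fit(vectorizer.transform(source_texts), source_labels,
                classes=["pos", "neg"])

# Step 2: transfer - continue training on the small target-domain sample
# instead of starting from scratch.
clf.partial_fit(vectorizer.transform(target_texts), target_labels)

print(clf.predict(vectorizer.transform(["the agent was very helpful"])))
```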
Some technologies such as IBM Watson are being touted as AI. In response, there are new AI offerings from the large enterprise software companies as well as many startup companies. But is this AI or automation? This talk discusses the difference and argues that these offerings use entity extraction and business rules rather than AI. However, there are real opportunities to use this new technology to automate content tagging.
Using case studies of real-world client projects, Smartlogic’s CEO presents, discusses, and demonstrates how post-relational databases, text analytics, AI, semantics, and linked data are delivering rapid returns on investment in data-intensive industries. Cases range from predictive analytics and financial risk assessment to compliance, superior customer service, and unified enterprise intelligence within industries including banking, life sciences, media, and healthcare. The talk looks at the technology, the opportunity, lessons learned, and the keys to project success.
Text analytics can discover and add underlying structure to content, providing some remarkable new capabilities for the enterprise. This session focuses on the discovery of relationships between data and the population of graph databases and graph search. There are now more than 30 graph databases on the market, ranging from Neo4j to SQL Server 2017, and graph search has become mainstream (including Lucene 6, the Microsoft Graph, and many more). Text analytics is essential for these technologies to deliver value to organizations.
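As a rough illustration of the pattern this session describes, the hedged sketch below loads relationships extracted from text into a graph and queries it; the entities and relations are invented, and networkx stands in for whichever graph database an organization actually uses.

```python
# Illustrative sketch: load relationships extracted by text analytics into a
# graph and query it. Entities and relations below are invented examples.
import networkx as nx

# (subject, relation, object) tuples as a text-analytics step might emit them.
extracted = [
    ("Acme Corp", "acquired", "Widget Inc"),
    ("Widget Inc", "manufactures", "Widget X"),
    ("Acme Corp", "headquartered_in", "Boston"),
]

g = nx.DiGraph()
for subj, rel, obj in extracted:
    g.add_edge(subj, obj, relation=rel)

# Simple "graph search": what is connected to Acme Corp, and how?
for _, neighbor, data in g.edges("Acme Corp", data=True):
    print(f"Acme Corp --{data['relation']}--> {neighbor}")

# Multi-hop question: is there a path from Acme Corp to Widget X?
print(nx.has_path(g, "Acme Corp", "Widget X"))
```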
NLP and entity extraction play an important role in our cognitive computing use cases. We discuss how terminology systems and knowledge bases are used in combination with NLP and entity extractors to greatly enrich the contents of our data infrastructures.
To improve the effectiveness of information findability and usability, we are developing a new mechanism to understand users’ interests and predict the information that will be most relevant to their needs. We analyze the technical documents published by members of the workforce and build models that can be used to match users’ requests with the best available content. We utilize an existing hierarchical taxonomy as part of the clustering effort in order to provide preliminary labels for the clusters. The information retrieval environment we are building will not only support retrieval of relevant corporate information upon request; it is also designed to proactively notify targeted members of the workforce when relevant information becomes available.
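A simplified, hypothetical sketch of this kind of matching step, assuming scikit-learn: an interest profile is built from documents a person has authored, and newly published documents are scored against it. The documents and notification threshold are invented.

```python
# Illustrative sketch of profile-to-document matching (invented examples).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

authored_by_user = [
    "corrosion resistance of aluminum alloys in marine environments",
    "fatigue testing of welded aluminum joints",
]
new_documents = [
    "guidelines for marine-grade aluminum alloy selection",
    "quarterly cafeteria menu and facilities update",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(authored_by_user + new_documents)

# Average the user's document vectors into a single interest profile.
profile = np.asarray(matrix[: len(authored_by_user)].mean(axis=0))
scores = cosine_similarity(profile, matrix[len(authored_by_user):])[0]

THRESHOLD = 0.2  # invented cutoff for proactive notification
for doc, score in zip(new_documents, scores):
    if score >= THRESHOLD:
        print(f"notify: '{doc}' (similarity {score:.2f})")
```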
In this talk, Patrick Lambe takes an unconventional look at how text analytics, taxonomies and search can be used in concert to probe areas of ignorance, not just uncover and organize what is already known, via three problem cases from the areas of public health and public transport. We demonstrate how elements of the search and discovery technology stack can be used to detect patterns in the environment to address or mitigate these types of problems.
A panel of four text analytics experts answers questions gathered before and during the conference, along with additional questions from the program chair and sponsors.
For a KM initiative to be successful, knowledge managers must secure the support of senior leaders before implementation. Early top management buy-in results in funding, resources, advocacy, usage, broad organizational support, and success: the program yields its expected benefits, and KM is spoken of and written about positively by leaders, stakeholders, and users. Hear from our long-time KM practitioner about proven practices, illustrated by real-world examples, for securing resources, active participation, and ongoing advocacy from top leadership. Get lots of tips for leading an effective, sustainable KM program that is seen as essential to the success of companies in different industries, of different sizes, and with different cultures.
For more than a decade, search technology has been used as the primary access point to the mountains of knowledge and data sitting behind an organization’s firewall. As environments evolve to account for private and public clouds, search is evolving beyond just the box to an API for human information. Will Hayes explores that evolution and talks about how search technologies and professionals play a key role in the enterprise cloud migration strategy.
This talk describes work that the IBM Taxonomy Squad has done to develop an enterprise-scale service that automates the extraction of entities and the generation of meaningful metadata. We cover the approach that was taken to design a solution architecture that leverages a corporate knowledgebase and integrates best-of-breed services in taxonomy and ontology management, NLP, machine learning, text annotation, and entity extraction.
The promise of machine learning has become a practical reality in today’s enterprise, but companies often struggle with implementation or reliable results. One fundamental issue is the common “garbage in, garbage out” problem. Poor input stems from the lack of clean data or unclear results from unstructured data analysis feeding machine learning models. Well-built taxonomies powering clear text analytics rules are an important infrastructure need often overlooked in data science activities. Come learn more about the role of taxonomy and text analytics as sources of clean data for machine learning.
For modern digital enterprises, the key to survival is real-time predictive analytics performed on heterogeneous data gathered from multiple sources and layered with contextual intelligence. This data is a mix of structured and unstructured content. Establishing contextual relevance requires systems imbued with deep reasoning capabilities that can link relevant pieces of information from within and outside the organization. This talk presents the outline of a framework that can gather news events in real time, classify them, reason with them, and finally link them to an enterprise information repository, thereby generating alerts or early warnings for subscribed users. The framework is presented through a number of case studies.
The global rise of political populism and media criticism has made brands increasingly sensitive to avoiding placement of their campaigns in negative and compromising contexts (bad ads). However, ad targeting is still predominantly based on behavioral targeting techniques that rely heavily on (cookie-based) user profiling. The talk showcases a solution for real-time contextual targeting that exploits the full power of cognitive computing to match campaigns to online users’ real interests. The approach abandons tracking of user data of any kind while increasing the precision of ad targeting at a truly semantic level, beyond what can be achieved with keyword-based methods.
Machine learning techniques can be used effectively for a wide variety of text analysis scenarios, such as reputation monitoring on social media, fraud detection, patent analysis, and e-Discovery. But to apply them well, you need to understand where the limits and pitfalls are in the technology, and you need to understand your data and the problem you are trying to solve. This session outlines an approach that uses text analytics to help understand the characteristics of your data, followed by selection and tuning of linguistic and statistical processing and machine learning parameters to address the application at hand. We highlight three real-world projects that used this approach and show how they worked, what went right and wrong, and how they evolved over time.
Government agencies face tremendous challenges daily. These include providing services to ensure a safe, livable environment; making informed spending decisions; and regulating a healthy economy. The data that supports these missions is exploding and is increasingly unstructured. This presentation discusses the application of text analytics and visualizations across a number of these datasets and their respective initiatives to provide an actionable view into the data. This involves demystifying techniques such as predictive modeling and machine learning in this domain. We show how these techniques can be applied to research analytics, government spending, situational awareness, and assessing consumer financial complaints.
The Inter-American Development Bank (IDB) is the main source of multilateral financing for Latin America and the Caribbean, and in addition to financing, it provides knowledge responses to the Region’s development challenges. In this context, the IDB is constantly working to leverage new technology to improve knowledge management, support efficiency in its operations, and disseminate valuable knowledge and insights for the Region. To make all of this information more accessible and to solidify its value to the Bank’s business, we developed a series of proofs of concept (POCs) that use NLP and ML technologies. This presentation shares reflections gathered during the development of these POCs and the application of these types of approaches within the organization.
In the last 10 years, most of the academic research on entity extraction and content classification has focused on machine learning and complete automation. The latest tools are very precise, but in academic publishing, the use of automatic classification tools is still controversial. Publishers and information managers want the best of both worlds: a clear list of defined, managed keywords for their content and a cost-effective way of implementing the subject tagging. This presentation reviews the current use of machine-learning tools in publishing, both with and without the use of manually curated taxonomies.
If you are a believer in the data-driven organization (or even just curious) and have ever wondered what could happen if you cleverly combined the power of data collection, indexing, text mining, search, and machine learning into a unified platform and applied it within the enterprise, this talk is for you! Come learn about the state of cognitive search and analytics technology and how it is enabling great companies across a wide swath of industries to amplify mission-critical expertise within their business in a surprisingly short amount of time. Our speaker illustrates the technology in action with real-world examples.
The terminologies that form taxonomies, thesauri, classification schemes, and name authorities aim to define all concepts unambiguously. These conceptual definitions are, however, primarily written for a human audience and are only partially meaningful to automated categorization processes. This talk explores how automated categorization rules can be synthetically generated by mining the terminology and semantic relationships found in traditional knowledge organization systems. We examine the pros, cons, and limitations of using categorization rules derived from KOS and discuss how they can then be refined and extended using human-curated categorization rules.
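A minimal sketch of the rule-generation idea in Python: keyword rules are derived from a concept's preferred and alternative labels (and those of its narrower concepts). The tiny taxonomy is invented; a real knowledge organization system would typically be expressed in SKOS.

```python
# Minimal sketch: derive simple keyword rules from a knowledge organization
# system's preferred and alternative labels. The taxonomy snippet is invented.
import re

taxonomy = {
    "Renewable Energy": {
        "altLabels": ["solar power", "wind power", "green energy"],
        "narrower": ["Photovoltaics"],
    },
    "Photovoltaics": {"altLabels": ["solar cells", "PV panels"], "narrower": []},
}

def build_rules(kos):
    """Turn each concept's labels (and its narrower concepts' labels) into
    a case-insensitive pattern that acts as an auto-generated rule."""
    rules = {}
    for concept, data in kos.items():
        terms = [concept] + data["altLabels"]
        for child in data["narrower"]:
            terms += [child] + kos[child]["altLabels"]
        pattern = "|".join(re.escape(t) for t in terms)
        rules[concept] = re.compile(pattern, re.IGNORECASE)
    return rules

rules = build_rules(taxonomy)
doc = "The report compares PV panels with offshore wind power installations."
print([concept for concept, rule in rules.items() if rule.search(doc)])
```

Rules generated this way are a starting point; as the session notes, they still need to be refined and extended by human curators.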
This talk presents an original approach to processing search results. Rather than showing the usual 10 blue links to webpages, the software creates a text summary of those webpages—a narrative on the topic of the user’s query. The narrative gives the user a quick way to understand the key information on their query. This approach is best suited to queries that are informational in nature, i.e., those where the user wants to understand a particular subject and get a quick grasp of a concept, an event, a product, or a public figure. The talk focuses on the merits and drawbacks of the approach and compares it with other techniques for presenting the answer to the user’s query.
Organizations are always looking for better ways to integrate their structured (databases and reports) and unstructured (documents and webpages) information. This concept is not new; in fact, it has been the primary information management goal for many years. The difference is that the technology to make this happen has now matured to the point where it is practical. This talk shares real-life examples of how this is done in large repositories using text analytics and ontologies. Session attendees will learn what an ontology is and how it can be merged with text analytics tools to provide better analytics for their data scientists.
This presentation discusses two recent enterprise projects that have benefited from direct interactions between taxonomies/ontologies and text analytics. While these are often seen as competing work streams, our recent work continues to build on the idea that complex, information-rich projects require both, and that pursuing one while abandoning the other often leads to poor results or project failure.
There is much talk about building triple stores from source content, but most of the models are just that: models without content to back them up. This session covers a case study of building a triple store to support search and other use cases from nearly 6 million documents. It also looks at the extraction, or mining, process for pulling 22 types of triple sets from full text and redeploying them for search queries. Lessons learned are also covered.
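For readers unfamiliar with triple stores, the hedged sketch below shows the general shape of the approach: triples produced by a mining step are loaded into a store and queried to support search. The triples are invented and the rdflib library is an assumption; the case study's own pipeline and its 22 triple types are not reproduced here.

```python
# Illustrative sketch: load extracted triples into a triple store and query it.
# The triples are invented; in practice they would come from a text-mining
# pipeline run over the document collection. Assumes the rdflib library.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")
g = Graph()

# A few triples of the kind an extraction process might emit.
g.add((EX.Document42, EX.mentionsDrug, EX.Aspirin))
g.add((EX.Document42, EX.mentionsCondition, EX.Headache))
g.add((EX.Aspirin, EX.label, Literal("aspirin")))

# Query the store to support a search use case: which documents mention aspirin?
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?doc WHERE { ?doc ex:mentionsDrug ex:Aspirin . }
""")
for row in results:
    print(row.doc)
```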
This session addresses the main principles of extracting entities and relationships from unstructured content against ontologies and semantic datasets. We give industry examples of business cases and key components of semantic technology architectures, including text analytics and supporting data and metadata governance workflows. Finally, we demonstrate semantic annotation, discussing the challenges organizations face in this regard and some of the important lessons learned in more than 15 years of industry experience.
Addressing the complexity of language ambiguity requires technology that can read and understand text the way people do. This session explains the concepts behind the linguistic analysis, word disambiguation, and semantic reasoning that make this possible. It explains the concepts that support a semantic platform, demonstrates a semantic engine, describes how one mobile phone carrier deployed a self-help solution that automatically answers 24,000,000 customer questions annually with 94% precision, and shows a knowledge platform that automatically organizes hundreds of data sources and millions of unstructured documents around multiple corporate taxonomies and entity clusters using dynamically generated metadata in a precise and complete way.
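As a tiny, generic illustration of word disambiguation (not the vendor's semantic engine), the sketch below uses the classic Lesk algorithm from NLTK to pick a WordNet sense for an ambiguous word from its sentence context; NLTK and its wordnet/punkt data packages are assumed.

```python
# Generic word-sense disambiguation illustration (not the vendor's engine).
import nltk
from nltk.tokenize import word_tokenize
from nltk.wsd import lesk

nltk.download("wordnet", quiet=True)  # WordNet senses and definitions
nltk.download("punkt", quiet=True)    # tokenizer models

sentence = "I deposited the check at the bank before it closed"
sense = lesk(word_tokenize(sentence), "bank")
print(sense, "-", sense.definition())
```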
The next phase of how we communicate has already started. Popularized by Siri, Alexa, and the like, natural language interaction (NLI) has achieved commercial Q&A success. For organizations looking to adopt new experiences with their customers, NLI holds promise. But there is a big difference between AI applications—the distinction is the degree to which they are intelligent. This talk examines the considerations for enterprise application of NLI and how to avoid applications that just drive more white noise.
Auto-categorization is “auto” only in part—there is much in the process that still requires old-fashioned human judgment. One critical step on the human side of the fence is to evaluate the quality of results so refinements can be developed and fed back into the process. But how do you measure quality when human indexers themselves apply topics inconsistently and often differ over applicability of topics? This case study explains how one publisher approached quality assessment in light of human variability and details how classic recall and precision measures were adjusted to provide a user-focused sense of auto-categorization quality.
In search, there is often a trade-off between recall and precision, and this impacts any evaluation of approaches: If one system achieves higher recall but lower precision, is it better? Traditionally, this situation has been addressed by using a measure that combines precision and recall into a single number, such as the F1 score. F1 makes strong assumptions about the amount of precision you can trade for a little more recall, and those assumptions are not always appropriate. In some contexts, recall and precision have very different significance. This talk presents a novel performance measure called the extrapolated precision, which avoids making such strong assumptions about allowed trade-offs between precision and recall.
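For reference, the standard combined measures mentioned above are shown in LaTeX below, with P for precision and R for recall; the speaker's extrapolated precision is a different measure and is not reproduced here.

```latex
% F1 is the harmonic mean of precision (P) and recall (R); the general
% F-beta form makes the assumed trade-off between them explicit.
F_1 = \frac{2PR}{P + R},
\qquad
F_\beta = \frac{(1+\beta^2)\,P R}{\beta^2 P + R}
```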
In the world of scholarly publishing (as well as many other industries—such as KM/information conferences!), meeting organizers are inundated with submissions for inclusion in conference programs. Given a large set of submissions, how can we develop tools to cluster submitted manuscripts into tracks based on topical similarity? This talk describes a project that used a subject taxonomy, NLP, and other text analytics tools, along with a large corpus of documents, to construct an application that clusters submitted manuscripts based on topical similarity, including a GUI for interacting with and analyzing the results. This is not intended as a detailed technical talk (no slides of code!), nor is it intended as a product spotlight; the focus is on using known, existing text analytics tools to construct purpose-built applications that solve specific document-centric problems.
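As a generic illustration of the clustering task (explicitly not material from the talk, which promises no code), here is a short sketch assuming scikit-learn: invented submission titles are vectorized with TF-IDF and grouped into candidate tracks with k-means.

```python
# Illustrative sketch of clustering submissions by topical similarity
# (invented titles; a subject taxonomy could seed or label the clusters).
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

submissions = [
    "open access policies and institutional repositories",
    "machine learning for peer review triage",
    "repository metadata quality and open access compliance",
    "neural models for reviewer recommendation",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(submissions)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for label, title in zip(kmeans.labels_, submissions):
    print(f"track {label}: {title}")
```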
The universe of text analytics is largely constrained to the output of the entire human race. This can and does result in huge, petabyte-scale problems. Technologies for this scalability, computational distribution, deep learning, resolution, and semantic expression are all new within the last 10 years, and their combination is revolutionary. Key to putting all of this together is that the text analytics are performed in the native language of the original text, prior to the inevitable loss of fidelity in machine or human translation. This talk covers a number of use cases including counterterrorism, knowing your customer, border security, disease tracking and detection, and countering fake news and conspiracy theories.
In recent years, document-centric search over information has been extended with the use of graph-based content and data models. The implementation of semantic knowledge graphs in enterprises not only improves search in the traditional sense but also opens a path to integrating all types of data sources in an agile way. Linked data technologies have matured in recent years and can now be used as the basis for numerous critical tasks in enterprise information management. Hilger discusses how standards-based graph databases can be used for information integration, document classification, data analytics, and information visualization tasks. He shares how a semantic knowledge graph can be used to develop analytics applications on top of enterprise data lakes and illustrates how a large pharmaceutical company makes use of graph-based technologies to gain new insights into its research work from unified views and semantic search over heterogeneous data sources.
At the intersection of innovation, open data, and education, our speaker, a former government KM practitioner, shares her thoughts about the challenges and opportunities for organizations and communities in the coming years. She discusses empowering members of our communities and improving services using new tech like AI, machine learning, virtual and augmented reality, Internet of Things, predictive analytics, gamification, and more. Are we moving toward anticipatory knowledge delivery (just enough, just in time, just for me), being in the flow of work at the teachable moment, establishing trust in a virtual environment, and learning from peer-to-peer marketplaces like Airbnb and Uber? Our longtime KM practitioner shares her insights about the evolving digital transformation of every part of our world and hints at the magic sauce we need for a successful future!